piper.verbs.transform

piper.verbs.transform(df: pandas.core.frame.DataFrame, index: Optional[Union[str, List[str]]] = None, **kwargs) → pandas.core.frame.DataFrame

Add a group calculation to a grouped DataFrame.

transform() is based on the pandas pd.DataFrame.transform() function. For details of args and kwargs, see help(pd.DataFrame.transform).

Based on the given dataframe and grouping index, creates new aggregated column values using the list of keyword/value arguments (kwargs) supplied.

By default, it calculates each row's percentage of its group total for the first numeric column, with groups defined by the first 'index' (grouping) value.

Examples

Calculate group percentage value

%%piper
sample_data() >>
group_by(['countries', 'regions']) >>
summarise(TotSales1=('values_1', 'sum'))

|                      |   TotSales1 |
|:---------------------|------------:|
| ('France', 'East')   |        2170 |
| ('France', 'North')  |        2275 |
| ('France', 'South')  |        2118 |
| ('France', 'West')   |        4861 |
| ('Germany', 'East')  |        1764 |
| ('Germany', 'North') |        2239 |
| ('Germany', 'South') |        1753 |
| ('Germany', 'West')  |        1575 |

To add a group percentage based on the countries:

%%piper
sample_data() >>
group_by(['countries', 'regions']) >>
summarise(TotSales1=('values_1', 'sum')) >>
transform(index='countries', g_percent=('TotSales1', 'percent')) >>
head(8)

|                      |   TotSales1 |   g_percent |
|:---------------------|------------:|------------:|
| ('France', 'East')   |        2170 |       19    |
| ('France', 'North')  |        2275 |       19.91 |
| ('France', 'South')  |        2118 |       18.54 |
| ('France', 'West')   |        4861 |       42.55 |
| ('Germany', 'East')  |        1764 |       24.06 |
| ('Germany', 'North') |        2239 |       30.54 |
| ('Germany', 'South') |        1753 |       23.91 |
| ('Germany', 'West')  |        1575 |       21.48 |
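For readers who want to see the arithmetic behind g_percent, the following is a minimal sketch in plain pandas, not piper's own implementation. It assumes the summarised frame carries a MultiIndex named countries and regions, rebuilds the totals from the table above by hand, and assumes the percentage is rounded to two decimals.

```python
import pandas as pd

# Totals copied from the summarised table above; the MultiIndex level
# names are an assumption based on the group_by() columns.
index = pd.MultiIndex.from_tuples(
    [
        ("France", "East"), ("France", "North"),
        ("France", "South"), ("France", "West"),
        ("Germany", "East"), ("Germany", "North"),
        ("Germany", "South"), ("Germany", "West"),
    ],
    names=["countries", "regions"],
)
df = pd.DataFrame(
    {"TotSales1": [2170, 2275, 2118, 4861, 1764, 2239, 1753, 1575]},
    index=index,
)

# Sum TotSales1 within each country and broadcast the total back to
# every row of that country (this is what groupby().transform does).
country_total = df.groupby(level="countries")["TotSales1"].transform("sum")

# Each row's share of its country total, as a percentage.
df["g_percent"] = (df["TotSales1"] * 100 / country_total).round(2)

print(df)
```

Under these assumptions the computed column matches the g_percent values in the table above, and the percentages within each country sum to roughly 100.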
Parameters
  • df – dataframe on which to calculate the grouped value

  • index – column name(s) (str or list of str) to use as the grouping index.

  • kwargs

    Similar to 'assign': keyword arguments to be assigned as dataframe columns. Each value is a tuple of column name and function, e.g. new_column=('existing_col', 'sum')

    If no kwargs are supplied, calculates the group percentage ('g%') using the first index column as the grouping key and the values of the first column.

    Note

    transform() has built-in functions:
    • percent: calculate group % associated with index group

    • rank: dense rank (ascending order)

    • rank_desc: dense rank (descending order)
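The rank and rank_desc built-ins listed above can be approximated in plain pandas as follows. This is a sketch, not piper's implementation: it assumes that "ascending" means the smallest value in a group receives rank 1 and "descending" means the largest does, and it reuses the country totals from the Examples section with assumed MultiIndex level names.

```python
import pandas as pd

# Same totals as in the Examples section; level names are assumed.
index = pd.MultiIndex.from_product(
    [["France", "Germany"], ["East", "North", "South", "West"]],
    names=["countries", "regions"],
)
df = pd.DataFrame(
    {"TotSales1": [2170, 2275, 2118, 4861, 1764, 2239, 1753, 1575]},
    index=index,
)

grouped = df.groupby(level="countries")["TotSales1"]

# Dense rank: ties share a rank and no rank numbers are skipped.
# ascending=True gives rank 1 to the smallest value in each country.
df["rank"] = grouped.rank(method="dense", ascending=True).astype(int)

# ascending=False gives rank 1 to the largest value in each country.
df["rank_desc"] = grouped.rank(method="dense", ascending=False).astype(int)

print(df)
```

Ranks restart at 1 within each country, so France's West region (4861) gets rank 4 ascending and rank 1 descending.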

Returns

original dataframe with additional grouped calculation column

Return type

pandas.core.frame.DataFrame