piper.verbs.count
piper.verbs.count(df: pandas.core.frame.DataFrame, columns: Optional[Union[str, list]] = None, totals_name: str = 'n', percent: bool = True, cum_percent: bool = True, threshold: int = 100, round: int = 2, totals: bool = False, sort_values: bool = False, reset_index: bool = False, shape: bool = True)
Show column/category/factor frequency.
For the selected column(s) or multi-index, show the frequency count, the frequency %, and the cumulative frequency %. Totals can optionally be added.
Examples
import numpy as np
import pandas as pd
from piper.defaults import *

np.random.seed(42)

id_list = ['A', 'B', 'C', 'D', 'E']
s1 = pd.Series(np.random.choice(id_list, size=5), name='ids')
s2 = pd.Series(np.random.randint(1, 10, s1.shape[0]), name='values')
df = pd.concat([s1, s2], axis=1)

%%piper
count(df.ids, totals=True) >> head(tablefmt='plain')

       n    %  cum %
E      3   60   60.0
C      1   20   80.0
D      1   20  100.0
Total  5  100

%piper df >> count('ids', totals=False, cum_percent=False) >> head(tablefmt='plain')

ids  n   %
E    3  60
C    1  20
D    1  20

%%piper
df >> count(['ids'], sort_values=None, totals=True) >> head(tablefmt='plain')

       n    %  cum %
C      1   20   20.0
D      1   20   40.0
E      3   60  100.0
Total  5  100
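For readers without piper installed, the table in the first example can be approximated with plain pandas. This is a minimal sketch and not piper's implementation: the helper name `frequency_table` is hypothetical, and the input is typed in directly with the same frequencies as the example (E three times, C and D once each) rather than drawn with `np.random`.

```python
import pandas as pd


def frequency_table(series: pd.Series, decimals: int = 2) -> pd.DataFrame:
    """Hypothetical helper: count, % and cumulative % per distinct value."""
    # value_counts() orders the distinct values by descending frequency.
    counts = series.value_counts()
    table = pd.DataFrame({"n": counts})
    # Share of each value in the total number of observations.
    table["%"] = (table["n"] / table["n"].sum() * 100).round(decimals)
    # Running total of the percentages, top to bottom.
    table["cum %"] = table["%"].cumsum().round(decimals)
    return table


ids = pd.Series(["D", "E", "C", "E", "E"], name="ids")
print(frequency_table(ids))
```

The order of the tied values C and D is left to pandas here, so it may differ from the order piper prints.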
- Parameters
df – dataframe reference
columns – dataframe columns/index to be used in groupby function
totals_name – name of the column holding the counts, default 'n'
percent – add a % of total column, default True
cum_percent – add a cumulative % column, default True
threshold – filter rows by cumulative %, using this value as the cut-off, default 100
round – number of decimal places to round to, default 2
totals – add a 'Total' row, as in the examples above, default False
sort_values – default False, which orders by descending count as in the first example; None means sort by index
reset_index – reset the index of the result, default False
shape – log the shape of the result as a logger.info() message, default True
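How `sort_values`, `threshold` and `totals` interact can be illustrated with a plain-pandas stand-in. This is a sketch under stated assumptions, not piper's code: `sketch_count` is a hypothetical name, `threshold` is assumed to keep rows whose cumulative % does not exceed it, and `totals` is assumed to append a 'Total' row over the rows that remain.

```python
import pandas as pd


def sketch_count(series, threshold=100, totals=False, sort_values=False):
    """Hypothetical stand-in for illustration only; not piper's code."""
    counts = series.value_counts()        # descending frequency
    if sort_values is None:
        counts = counts.sort_index()      # None: order by index instead
    table = pd.DataFrame({"n": counts})
    table["%"] = (table["n"] / table["n"].sum() * 100).round(2)
    table["cum %"] = table["%"].cumsum().round(2)
    # ASSUMPTION: keep rows whose cumulative % is at or below the threshold.
    table = table[table["cum %"] <= threshold]
    if totals:
        # ASSUMPTION: the 'Total' row sums n and % and leaves cum % empty.
        total_row = pd.DataFrame(
            {"n": [table["n"].sum()], "%": [table["%"].sum()]},
            index=["Total"],
        )
        table = pd.concat([table, total_row])
    return table


ids = pd.Series(["D", "E", "C", "E", "E"], name="ids")
print(sketch_count(ids, sort_values=None, totals=True))
print(sketch_count(ids, threshold=60))
```

With `sort_values=None` and `totals=True`, the first call should list C, D and E in index order followed by a 'Total' row, mirroring the third example. The second call keeps only the rows up to a cumulative 60%.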
- Returns
frequency table with the count and, where requested, the % and cum % columns
- Return type
A pandas dataframe