piper.verbs.count

piper.verbs.count(df: pandas.core.frame.DataFrame, columns: Optional[Union[str, list]] = None, totals_name: str = 'n', percent: bool = True, cum_percent: bool = True, threshold: int = 100, round: int = 2, totals: bool = False, sort_values: bool = False, reset_index: bool = False, shape: bool = True)

Show column/category/factor frequency.

For the selected column(s) or multi-index, show the frequency count, the frequency % and the cumulative frequency %. Totals can optionally be added.
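As a rough illustration of the three figures described above, the following sketch builds a comparable table with plain pandas. It is not piper's implementation: `frequency_table` is a hypothetical helper, and the column labels `n`, `%` and `cum %` are simply copied from the example output on this page.

```python
import pandas as pd


def frequency_table(series: pd.Series, decimals: int = 2) -> pd.DataFrame:
    """Hypothetical helper (not part of piper): count, % and cumulative %."""
    # value_counts() returns counts sorted from most to least frequent
    table = series.value_counts().to_frame(name="n")
    total = table["n"].sum()
    # Share of each category in the total number of rows
    table["%"] = (table["n"] / total * 100).round(decimals)
    # Running share, accumulated in the order the rows are listed
    table["cum %"] = (table["n"].cumsum() / total * 100).round(decimals)
    return table


# Assumed sample data: 'E' three times, 'C' and 'D' once each
ids = pd.Series(["D", "E", "C", "E", "E"], name="ids")
table = frequency_table(ids)
print(table)
```

The most frequent category comes first, so the cumulative column shows how much of the data the top categories cover.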

Examples

import numpy as np
import pandas as pd
from piper.defaults import *

np.random.seed(42)

id_list = ['A', 'B', 'C', 'D', 'E']
s1 = pd.Series(np.random.choice(id_list, size=5), name='ids')
s2 = pd.Series(np.random.randint(1, 10, s1.shape[0]), name='values')
df = pd.concat([s1, s2], axis=1)

%%piper
count(df.ids, totals=True)
>> head(tablefmt='plain')

         n    %  cum %
E        3   60  60.0
C        1   20  80.0
D        1   20  100.0
Total    5  100

%piper df >> count('ids', totals=False, cum_percent=False) >> head(tablefmt='plain')

ids      n    %
E        3   60
C        1   20
D        1   20

%%piper
df
>> count(['ids'], sort_values=None, totals=True)
>> head(tablefmt='plain')

         n    %  cum %
C        1   20  20.0
D        1   20  40.0
E        3   60  100.0
Total    5  100

Parameters

  • df – dataframe reference

  • columns – dataframe columns/index to be used in groupby function

  • totals_name – name of the count column, default 'n'

  • percent – provide % total, default True

  • cum_percent – provide cum % total, default True

  • threshold – filter cum_percent by this value, default 100

  • round – number of decimal places to round to, default 2

  • totals – add a 'Total' row (see the examples above), default False

  • sort_values – default False, None means use index sort

  • reset_index – default False

  • shape – default True. Show shape information as a logger.info() message
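
The threshold parameter is only described briefly above. One plausible reading, which is an assumption and has not been checked against piper's source, is that rows are kept while their cumulative % does not exceed the threshold. The sketch below shows that reading with plain pandas; `apply_threshold` is a hypothetical helper, not a piper function.

```python
import pandas as pd

# Frequency table shaped like the first example's output (without the Total row)
table = pd.DataFrame(
    {"n": [3, 1, 1], "%": [60.0, 20.0, 20.0], "cum %": [60.0, 80.0, 100.0]},
    index=["E", "C", "D"],
)


def apply_threshold(freq: pd.DataFrame, threshold: float = 100) -> pd.DataFrame:
    """Hypothetical helper: keep rows whose cumulative % is <= threshold.

    The '<=' comparison is an assumption about how the filter behaves.
    """
    return freq[freq["cum %"] <= threshold]


top = apply_threshold(table, threshold=80)  # keeps rows 'E' and 'C'
everything = apply_threshold(table)         # default of 100 keeps every row
print(top)
```

Under this reading, the default of 100 filters nothing, which would be consistent with the examples above showing every category.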

Returns

A pandas dataframe