piper.verbs.distinct

piper.verbs.distinct(df: pandas.core.frame.DataFrame, *args, shape: bool = False, **kwargs) → pandas.core.frame.DataFrame

Select distinct/unique rows.

This is a wrapper function, intended to be used in place of calling e.g. df.drop_duplicates() directly. For details of args and kwargs, see help(pd.DataFrame.drop_duplicates).
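As a rough illustration of what the forwarded arguments mean, the sketch below calls pandas' drop_duplicates() directly on a small made-up frame. The column values are hypothetical (they are not the output of piper's sample_data()), and the sketch assumes that distinct() passes *args and **kwargs through unchanged, as the description above suggests.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    'countries': ['Italy', 'Italy', 'France', 'France'],
    'regions': ['South', 'South', 'West', 'East'],
    'ids': ['A', 'A', 'B', 'B'],
})

# No arguments: rows are compared across every column.
all_columns = df.drop_duplicates()

# subset: only the 'ids' column decides whether a row is a duplicate.
by_ids = df.drop_duplicates(subset='ids')

# keep='last': retain the last occurrence of each duplicate, not the first.
by_ids_last = df.drop_duplicates(subset='ids', keep='last')
```

Here all_columns drops only the one fully repeated row, while by_ids and by_ids_last each keep a single row per value of 'ids'.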

Examples

from piper.verbs import select, distinct
from piper.factory import sample_data

df = sample_data()
df = select(df, ['countries', 'regions', 'ids'])
df = distinct(df, 'ids', shape=False)

expected = (5, 3)
actual = df.shape
assert expected == actual

Parameters

  • df – dataframe

  • shape – default False. If True, show shape information as a logger.info() message

  • *args – positional arguments passed to the wrapped function

  • **kwargs – keyword arguments passed to the wrapped function

Returns

A pandas DataFrame containing only the distinct rows

Return type

pandas.core.frame.DataFrame
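
The behaviour described above (forward the arguments to drop_duplicates() and optionally log the resulting shape) could be sketched roughly as follows. This is not piper's actual source: the name distinct_sketch, the logger setup, and the log message format are assumptions made for illustration.

```python
import logging

import pandas as pd

logger = logging.getLogger(__name__)


def distinct_sketch(df: pd.DataFrame, *args, shape: bool = False, **kwargs) -> pd.DataFrame:
    """Hypothetical stand-in for distinct(); not piper's implementation."""
    # Everything except `shape` is forwarded to pandas unchanged.
    result = df.drop_duplicates(*args, **kwargs)

    if shape:
        # Assumed message format: report rows and columns via logger.info().
        logger.info('%s rows, %s columns', *result.shape)

    return result


# Hypothetical data: two rows share the id 'A'.
df = pd.DataFrame({'ids': ['A', 'A', 'B'], 'values': [1, 2, 3]})
unique_ids = distinct_sketch(df, 'ids', shape=True)
```

Keeping shape as a keyword-only parameter, as in the documented signature, means positional arguments such as 'ids' reach drop_duplicates() as its subset argument without being mistaken for the flag.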