piper.verbs.duplicate_names

piper.verbs.duplicate_names(columns: List, sep: str = '_', info: bool = True) → List

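Identify and de-duplicate dataframe column names. The first occurrence of a name is left unchanged; each later occurrence gets the separator and a running number appended (for example, last_name becomes last_name_1).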

Examples

import pandas as pd
from piper.verbs import duplicate_names

cols = [
    'duplicate', 'allocated', 'first_name', 'last_name', 'employee_status',
    'last_name', 'subject', 'hire_date', 'allocated', 'full_time?',
    'certification', 'certification', 'certification', 'certification'
]

expected = [
    'duplicate', 'allocated', 'first_name', 'last_name', 'employee_status',
    'last_name_1', 'subject', 'hire_date', 'allocated_1', 'full_time?',
    'certification', 'certification_1', 'certification_2', 'certification_3'
]

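# Build an empty dataframe whose column labels contain the duplicates
df = pd.DataFrame(None, columns=cols)

# Repeated names come back with numeric suffixes
assert expected == duplicate_names(df.columns.tolist())
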
Parameters

columns – list of column names to check for duplicates.

sep – default '_'. Separator placed between a duplicated column name and its numeric suffix.

Returns

de-duplicated column names

Return type

List
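
The suffixing behaviour shown in the example can be approximated with a few lines of plain Python. This is a minimal sketch, not piper's actual implementation: the helper name dedupe_names_sketch is hypothetical, and the sketch ignores the info parameter, whose effect is not described here.

```python
from collections import Counter
from typing import List


def dedupe_names_sketch(columns: List[str], sep: str = '_') -> List[str]:
    """Hypothetical helper: suffix repeated names with sep + running count."""
    seen = Counter()   # how many times each name has appeared so far
    result = []
    for name in columns:
        count = seen[name]
        # First occurrence keeps its name; later ones get sep + count
        result.append(name if count == 0 else f'{name}{sep}{count}')
        seen[name] += 1
    return result


print(dedupe_names_sketch(['certification', 'certification', 'certification']))
# ['certification', 'certification_1', 'certification_2']
```

The sketch does not guard against a generated name colliding with a name already in the list (for example, an input that already contains 'certification_1'). Whether duplicate_names handles that case is not stated here.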