piper.verbs.str_clean_number

piper.verbs.str_clean_number(series: pandas.core.series.Series, decimal: str = '.', dtype: str = 'float64')

Clean numeric values stored as strings (e.g. currency, price).

Series-based conversion of string values that are meant to be numeric (e.g. prices, currency, float values).

Returns ‘cleaned’ values, i.e. digits, the decimal symbol and the negative sign ‘-’ are the only characters retained.
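The rule above can be illustrated for a single value with a regular expression. This is a minimal sketch, not piper's actual implementation: the helper name `clean_number_text` is hypothetical, and treating a trailing ‘-’ as a negative sign is an assumption inferred from the '354.00-' case in the Examples.

```python
import re

def clean_number_text(text: str, decimal: str = '.') -> str:
    """Hypothetical helper: keep only digits, the decimal symbol and a sign."""
    # Drop every character that is not a digit, the decimal symbol or '-'.
    kept = re.sub(rf'[^0-9{re.escape(decimal)}\-]', '', text)
    # Assumption: a '-' anywhere (leading or trailing) marks a negative value.
    negative = '-' in kept
    digits = kept.replace('-', '')
    return f'-{digits}' if negative else digits

samples = ['$ 1000.48', '-23,500.54', '354.00-', '4q5056 GBP']
cleaned = [float(clean_number_text(s)) for s in samples]
print(cleaned)  # [1000.48, -23500.54, -354.0, 45056.0]
```

Thousands separators, currency symbols and stray letters are all removed by the same character filter, which is why the decimal symbol must be stated explicitly.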

Note

If a decimal symbol other than the decimal point is supplied, the function issues a warning that no data type conversion to numeric values can be performed.
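In that case the conversion has to be done by hand. A minimal sketch using plain pandas, assuming the cleaned series holds strings with a decimal comma; the sample data is illustrative and not output captured from piper.

```python
import pandas as pd

# Illustrative cleaned strings that use a decimal comma (assumed, not piper output).
cleaned = pd.Series(['1234,50', '-43000,00', '301'])

# Swap the decimal comma for a point, then let pandas parse the numbers.
numeric = pd.to_numeric(cleaned.str.replace(',', '.', regex=False))

print(numeric.tolist())  # [1234.5, -43000.0, 301.0]
```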

Examples

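import pandas as pd
from piper.verbs import str_clean_number

# Raw strings mixing currency symbols, separators, stray characters and spaces
values = ['$ 1000.48', '-23,500.54', '1004,0 00 .22', '-£43,000',
          'EUR 304s0,00.00', '354.00-', '301    ', '4q5056 GBP',
          'USD 20621.54973']

# Values expected after cleaning and conversion to float64
expected = [1000.48, -23500.54, 1004000.22, -43000.0, 304000.0,
            -354.0, 301.0, 45056.0, 20621.54973]

df = pd.DataFrame(values, columns=['values'])
df['values'] = str_clean_number(df['values'])

assert expected == df['values'].values.tolist()

Parameters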
  • series – a pandas series

  • decimal – the decimal symbol, e.g. decimal point or decimal comma (as used in much of Europe). Default is ‘.’.

  • dtype – the data type used to convert the column/series. Default is ‘float64’. Set to None to skip automatic data type conversion.

Returns

a pandas series

Return type

pandas.core.series.Series