piper.custom.duration
piper.custom.duration(s1: pandas.core.series.Series, s2: Optional[pandas.core.series.Series] = None, unit: Optional[str] = None, round: Union[bool, int] = 2, freq: str = 'd') → pandas.core.series.Series
Calculate the duration between two datetime columns (series).
- Parameters
s1 – ‘from’ datetime series
s2 – ‘to’ datetime series. Default None. If None, defaults to today.
unit –
- Default None: returns a timedelta in days.
- ‘d’: days as an integer; ‘years’ (based on 365.25 days per year); ‘months’ (based on a 30-day month).
- Other possible options are:
‘W’, ‘D’, ‘T’, ‘S’, ‘L’, ‘U’, or ‘N’
‘days’ or ‘day’
‘hours’, ‘hour’, ‘hr’, or ‘h’
‘minutes’, ‘minute’, ‘min’, or ‘m’
‘seconds’, ‘second’, or ‘sec’
‘milliseconds’, ‘millisecond’, ‘millis’, or ‘milli’
‘microseconds’, ‘microsecond’, ‘micros’, or ‘micro’
‘nanoseconds’, ‘nanosecond’, ‘nanos’, ‘nano’, or ‘ns’
See the pandas Timedelta documentation for details.
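The unit conversions described above can be sketched in plain pandas. This is an illustration under stated assumptions, not piper's implementation: the sample dates and the variable names are hypothetical, and the divisors (365.25 days per year, 30 days per month) are taken from the parameter description.

```python
import pandas as pd

# Hypothetical sample data, not taken from piper's sample_data().
start = pd.Series(pd.to_datetime(["2018-01-01", "2018-01-01"]))
end = pd.Series(pd.to_datetime(["2020-01-07", "2019-01-01"]))

# Subtracting two datetime series gives a timedelta64 series.
delta = end - start

# Whole days as integers.
days = delta.dt.days

# 'years' and 'months' as documented: 365.25-day years, 30-day months.
years = (days / 365.25).round(2)
months = (days / 30).round(2)

# Finer units can be obtained by dividing by a fixed Timedelta.
hours = delta / pd.Timedelta(hours=1)

print(days.tolist())
print(years.tolist())
print(months.tolist())
```

Dividing by a fixed number of days is only an approximation for months and years, which is why the description states the basis used for each.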
round – Default 2 (per the signature above). If the duration result is an integer and this parameter contains a positive integer, the result is rounded to that decimal precision.
freq –
Default is ‘d’ (days). If the duration result has a pd.Timedelta dtype, the value can be rounded using this frequency parameter.
Must be a fixed frequency such as ‘S’ (second), not ‘ME’ (month end). For a list of valid values, see the pandas offset aliases.
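As a rough illustration of rounding a timedelta result to a fixed frequency, the sketch below uses the pandas accessor Series.dt.round directly. Whether piper calls this accessor internally is an assumption, and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical timedelta values that are not whole days.
delta = pd.Series(pd.to_timedelta(["1 days 13:00:00", "2 days 06:00:00"]))

# Round to the nearest whole day; "D" is a fixed frequency.
rounded = delta.dt.round("D")

print(rounded.dt.days.tolist())
```

A frequency such as month end has no fixed length, so it cannot serve as a rounding step for timedeltas.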
- Returns
If unit is None, a series of data type timedelta64[ns]; otherwise a series of type int.
- Return type
series
Examples
%%piper
sample_data()
>> select(['-countries', '-regions', '-ids', '-values_1', '-values_2'])
>> assign(new_date_col=pd.to_datetime('2018-01-01'))
>> assign(duration = lambda x: duration(x.new_date_col, x.order_dates, unit='months'))
>> assign(duration_dates_age = lambda x: duration(x['dates']))
>> head(tablefmt='plain')

    dates       order_dates  new_date_col  duration  duration_dates_age
 0  2020-01-01  2020-01-07   2018-01-01          25  452 days
 1  2020-01-02  2020-01-08   2018-01-01          25  451 days
 2  2020-01-03  2020-01-09   2018-01-01          25  450 days
 3  2020-01-04  2020-01-10   2018-01-01          25  449 days