pipeline.src.helpers.dates

Classes

Period

Functions

make_periods(→ List[Period])

Returns a list of Period of duration period_duration covering the time range from start_datetime_utc to end_datetime_utc.

get_datetime_intervals(→ pandas.Series)

Takes a pandas Series with datetime dtype. Returns a pandas Series with the same index, whose values are the time intervals between successive values of the input Series.

is_in_validity_period(→ bool)

Check if a sample_date falls within a validity period.

Module Contents

class pipeline.src.helpers.dates.Period
start: datetime.datetime
end: datetime.datetime
pipeline.src.helpers.dates.make_periods(start_datetime_utc: datetime.datetime, end_datetime_utc: datetime.datetime, period_duration: datetime.timedelta, overlap: None | datetime.timedelta = None) → List[Period]

Returns a list of Period of duration period_duration covering the time range from start_datetime_utc to end_datetime_utc.

If overlap is specified, successive Periods overlap by that amount; otherwise the end of one Period coincides with the start of the next one.

If period_duration is longer than the time between start_datetime_utc and end_datetime_utc, returns a list with a single Period starting on start_datetime_utc and ending on end_datetime_utc.

This is useful to break a long time range into smaller periods for processing time series data that would take up too much memory to handle in one piece.

Parameters:
  • start_datetime_utc (datetime) – start of the period to cover

  • end_datetime_utc (datetime) – end of the period to cover

  • period_duration (timedelta) – duration of the individual periods returned

  • overlap (Union[None, timedelta]) – overlap between successive periods, if specified. Defaults to None.
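
The behaviour described above can be illustrated with a minimal sketch. It is not the module's implementation: make_periods_sketch and the local Period dataclass are hypothetical stand-ins, and two details are assumptions that the description does not settle, namely that the last Period is truncated at the end of the range and that overlap is applied by starting each Period earlier than the previous one ends.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Period:
    # Hypothetical stand-in for pipeline.src.helpers.dates.Period
    start: datetime
    end: datetime


def make_periods_sketch(
    start_datetime_utc: datetime,
    end_datetime_utc: datetime,
    period_duration: timedelta,
    overlap: Optional[timedelta] = None,
) -> List[Period]:
    """Illustrative re-implementation; the real function may differ."""
    overlap = overlap or timedelta(0)
    if period_duration <= overlap:
        # Guard added for the sketch so that the loop always advances.
        raise ValueError("period_duration must be longer than overlap")

    periods = []
    period_start = start_datetime_utc
    while True:
        # Assumption: the last Period is cut at the end of the range.
        period_end = min(period_start + period_duration, end_datetime_utc)
        periods.append(Period(period_start, period_end))
        if period_end >= end_datetime_utc:
            break
        # Assumption: the next Period starts `overlap` before this one ends.
        period_start = period_end - overlap
    return periods


start = datetime(2024, 1, 1, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 6, tzinfo=timezone.utc)

# Contiguous two-hour Periods covering six hours.
contiguous = make_periods_sketch(start, end, timedelta(hours=2))

# Same range with a 30-minute overlap between successive Periods.
overlapping = make_periods_sketch(
    start, end, timedelta(hours=2), overlap=timedelta(minutes=30)
)

# period_duration longer than the whole range: a single Period is returned.
single = make_periods_sketch(start, end, timedelta(hours=10))
```

Each Period can then be processed in turn, so that only one slice of the time series needs to be held in memory at a time.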

pipeline.src.helpers.dates.get_datetime_intervals(s: pandas.Series, unit: str = None, how: str = 'backward') → pandas.Series

Takes a pandas Series with datetime dtype. Returns a pandas Series with the same index, whose values are the time intervals between successive values of the input Series.

Parameters:
  • s (Series) – pandas Series with datetime dtype

  • unit (Union[str, None]) –

    • if None, returns values as pandas Timedelta

    • if provided, must be one of ‘s’, ‘min’ or ‘h’, in which case values are returned as a float.

    Defaults to None.

  • how (str) – if ‘forward’, computes the interval between each position and the next one; if ‘backward’, computes the interval between each position and the previous one. Defaults to ‘backward’.

Returns:

Series of time intervals between the values of the input Series

Return type:

pd.Series
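
As a rough illustration of the unit and how parameters, the following sketch reproduces the documented behaviour with plain pandas operations. datetime_intervals_sketch is a hypothetical name, not the module's function, and the value at the position that has no neighbour (the first one for ‘backward’, the last one for ‘forward’) is assumed here to be missing (NaT or NaN); the real function may treat it differently.

```python
import pandas as pd

# Divisors used to convert seconds into the supported units.
_SECONDS_PER_UNIT = {"s": 1, "min": 60, "h": 3600}


def datetime_intervals_sketch(s, unit=None, how="backward"):
    """Illustrative re-implementation; the real function may differ."""
    if how == "backward":
        # Interval between each position and the previous one.
        intervals = s - s.shift(1)
    elif how == "forward":
        # Interval between each position and the next one.
        intervals = s.shift(-1) - s
    else:
        raise ValueError("how must be 'backward' or 'forward'")

    if unit is None:
        # Values stay as pandas Timedelta.
        return intervals
    if unit not in _SECONDS_PER_UNIT:
        raise ValueError("unit must be one of 's', 'min' or 'h'")
    # Values are converted to floats expressed in the requested unit.
    return intervals.dt.total_seconds() / _SECONDS_PER_UNIT[unit]


timestamps = pd.Series(
    pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 00:30", "2024-01-01 02:00"]
    ),
    index=["a", "b", "c"],
)

backward_minutes = datetime_intervals_sketch(timestamps, unit="min")
forward_hours = datetime_intervals_sketch(timestamps, unit="h", how="forward")
as_timedelta = datetime_intervals_sketch(timestamps)
```

In this sketch the result keeps the index of the input Series, so it can be assigned back as a new column next to the original timestamps.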

pipeline.src.helpers.dates.is_in_validity_period(validity_start_date: datetime.datetime, validity_end_date: datetime.datetime, repeat_each_year: bool, sample_date: datetime.datetime) → bool

Check if a sample_date falls within a validity period.

Parameters:
  • validity_start_date – Start of validity period (None means no start constraint)

  • validity_end_date – End of validity period (None means no end constraint)

  • repeat_each_year – If True, the validity period repeats annually

  • sample_date – Date to check against the validity period

Returns:

True if sample_date is within the validity period, False otherwise
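
One possible reading of these rules is sketched below. in_validity_period_sketch is a hypothetical function, not the module's implementation, and it rests on several assumptions that the description leaves open: bounds are inclusive, an annual repeat compares only month and day, a repeating period whose start falls later in the year than its end wraps around the new year, and a repeating period with a missing bound falls back to the plain comparison.

```python
from datetime import datetime
from typing import Optional


def in_validity_period_sketch(
    validity_start_date: Optional[datetime],
    validity_end_date: Optional[datetime],
    repeat_each_year: bool,
    sample_date: datetime,
) -> bool:
    """Illustrative re-implementation; the real function may differ."""
    if (
        repeat_each_year
        and validity_start_date is not None
        and validity_end_date is not None
    ):
        # Assumption: an annual repeat ignores the year and compares
        # (month, day) pairs only.
        start = (validity_start_date.month, validity_start_date.day)
        end = (validity_end_date.month, validity_end_date.day)
        sample = (sample_date.month, sample_date.day)
        if start <= end:
            return start <= sample <= end
        # Assumption: the period wraps around the new year (e.g. Nov to Feb).
        return sample >= start or sample <= end

    # Non-repeating case: None means the bound is not constrained.
    if validity_start_date is not None and sample_date < validity_start_date:
        return False
    if validity_end_date is not None and sample_date > validity_end_date:
        return False
    return True


winter_start = datetime(2020, 11, 1)
winter_end = datetime(2021, 2, 28)

# Annual repeat: a January sample in a later year is still in the period.
in_winter = in_validity_period_sketch(
    winter_start, winter_end, True, datetime(2024, 1, 15)
)

# Without the annual repeat, the same sample is past the end date.
after_end = in_validity_period_sketch(
    winter_start, winter_end, False, datetime(2024, 1, 15)
)

# No start constraint: any sample up to the end date is accepted.
open_start = in_validity_period_sketch(
    None, winter_end, False, datetime(2019, 6, 1)
)
```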