pipeline.src.flows.missing_far_alerts
=====================================

.. py:module:: pipeline.src.flows.missing_far_alerts


Functions
---------

.. autoapisummary::

   pipeline.src.flows.missing_far_alerts.get_dates
   pipeline.src.flows.missing_far_alerts.make_positions_at_sea_query
   pipeline.src.flows.missing_far_alerts.extract_vessels_that_emitted_fars
   pipeline.src.flows.missing_far_alerts.concat
   pipeline.src.flows.missing_far_alerts.get_vessels_at_sea
   pipeline.src.flows.missing_far_alerts.get_vessels_with_missing_fars
   pipeline.src.flows.missing_far_alerts.merge_risk_factor
   pipeline.src.flows.missing_far_alerts.missing_far_alerts_flow


Module Contents
---------------

.. py:function:: get_dates(days_without_far: int) -> Tuple[datetime.datetime, datetime.datetime, datetime.datetime, datetime.datetime, float]

   Returns the dates used in the flow as a 5-tuple:

   - `days_without_far` days ago at 00:00 (beginning of the day) in UTC (1)
   - Yesterday at 8pm in UTC
   - Today at 00:00 (beginning of the day) in UTC
   - Current datetime in UTC (2)
   - The number of hours that separate 1 and 2

   :returns: Tuple[datetime, datetime, datetime, datetime, float]


.. py:function:: make_positions_at_sea_query(positions_table: sqlalchemy.Table, facade_areas_table: sqlalchemy.Table, from_date: datetime.datetime, to_date: datetime.datetime, states_to_monitor_iso2: list = None, vessels_table: sqlalchemy.Table = None, minimum_length: float = None, eez_areas_table: sqlalchemy.Table = None, eez_to_monitor_iso3: list = None, only_fishing_positions: bool = False, exclude_vessels_with_logbook_exemptions: bool = False) -> sqlalchemy.sql.Select

   Generates the `sqlalchemy.Select` statement to run in order to get the
   positions of vessels that were at sea (i.e. those that emitted at least one
   VMS position outside of a port) between the designated dates and matching
   the designated flag states.

   :param positions_table: `sqlalchemy.Table` representing `positions`
   :type positions_table: Table
   :param facade_areas_table: `sqlalchemy.Table` representing `facade_areas`
   :type facade_areas_table: Table
   :param from_date: Start of the time interval to query, in UTC
   :type from_date: datetime
   :param to_date: End of the time interval to query, in UTC
   :type to_date: datetime
   :param states_to_monitor_iso2: If provided, only vessels of the given
                                  flag states will be queried. Defaults to None.
   :type states_to_monitor_iso2: list, optional
   :param vessels_table: `sqlalchemy.Table` representing `vessels`. Must be
                         provided if `minimum_length` is not `None`. Defaults
                         to None.
   :type vessels_table: Table, optional
   :param minimum_length: If provided, only vessels longer than the given value
                          will be queried (only applies to French vessels).
                          Defaults to None.
   :type minimum_length: float, optional
   :param eez_areas_table: `sqlalchemy.Table` representing `eez_areas`. Must be
                           provided if `eez_to_monitor_iso3` is not `None`.
                           Defaults to None.
   :type eez_areas_table: Table, optional
   :param eez_to_monitor_iso3: If provided, only VMS emissions in the designated
                               EEZ areas will be considered. Defaults to None.
   :type eez_to_monitor_iso3: list, optional
   :param only_fishing_positions: If `True`, only positions which were detected
                                  as being in fishing operation will be
                                  considered. Defaults to `False`.
   :type only_fishing_positions: bool, optional
   :param exclude_vessels_with_logbook_exemptions: If `True`, vessels with an
                                                   'Exempté' logbook_equipment_status
                                                   will be excluded. Defaults to
                                                   `False`.
   :type exclude_vessels_with_logbook_exemptions: bool, optional

   :raises ValueError: If `minimum_length` is not `None` and the
                       `vessels_table` is not provided.
   :raises ValueError: If `eez_to_monitor_iso3` is not `None` and the
                       `eez_areas_table` is not provided.

   :returns: `Select` statement representing a SQL query
   :rtype: Select


.. py:function:: extract_vessels_that_emitted_fars(declaration_min_datetime_utc: datetime.datetime, declaration_max_datetime_utc: datetime.datetime, fishing_operation_min_datetime_utc: datetime.datetime, fishing_operation_max_datetime_utc: datetime.datetime) -> set

   Extracts the vessels that emitted at least one `FAR` logbook report between
   the designated dates and returns the result as a `set` of their CFR numbers.

   Conditions must be applied to 3 dates:

   - `operation_datetime_utc`: for performance reasons (the table is chunked on
     this column)
   - `report_datetime_utc`: to get only reports that were filled in between the
     given dates
   - `farDatetimeUtc`: in certain cases (in particular VisioCapture), reports
     can be filled in weeks or months after the actual fishing operation. In the
     context of this flow, we are not interested in these reports and want to
     keep only reports that were filled in directly on the boat, in real time.

   :param declaration_min_datetime_utc: Minimum `operation_datetime_utc` and
                                        `report_datetime_utc`
   :type declaration_min_datetime_utc: datetime
   :param declaration_max_datetime_utc: Maximum `operation_datetime_utc` and
                                        `report_datetime_utc`
   :type declaration_max_datetime_utc: datetime
   :param fishing_operation_min_datetime_utc: Minimum `farDatetimeUtc`
   :type fishing_operation_min_datetime_utc: datetime
   :param fishing_operation_max_datetime_utc: Maximum `farDatetimeUtc`
   :type fishing_operation_max_datetime_utc: datetime

   :returns: Set of `cfr` numbers of the vessels that emitted at least one
             `FAR` report between the given dates.
   :rtype: set


.. py:function:: concat(positions_at_sea_yesterday_everywhere: pandas.DataFrame, positions_at_sea_yesterday_in_french_eez: pandas.DataFrame) -> pandas.DataFrame

   Concatenates the two input `DataFrame` objects.

   :param positions_at_sea_yesterday_everywhere:
   :type positions_at_sea_yesterday_everywhere: pd.DataFrame
   :param positions_at_sea_yesterday_in_french_eez:
   :type positions_at_sea_yesterday_in_french_eez: pd.DataFrame

   :returns: pd.DataFrame


.. py:function:: get_vessels_at_sea(positions_at_sea: pandas.DataFrame, min_days: int) -> pandas.DataFrame

   Returns a DataFrame with the vessels present in the input `positions_at_sea`
   DataFrame which were at sea on at least `min_days` days.

   `positions_at_sea` must have columns:

   - `cfr`
   - `external_immatriculation`
   - `ircs`
   - `vessel_name`
   - `facade`
   - `flag_state`
   - `date_time`
   - `latitude`
   - `longitude`

   :param positions_at_sea: DataFrame of positions of vessels at sea
   :type positions_at_sea: pd.DataFrame
   :param min_days: Minimum number of days at sea. Vessels at sea on fewer than
                    `min_days` days are excluded from the result.
   :type min_days: int

   :returns: Vessels of the input that were at sea on at least `min_days`
             different days.
   :rtype: pd.DataFrame


.. py:function:: get_vessels_with_missing_fars(vessels_at_sea: pandas.DataFrame, vessels_that_emitted_fars: set, max_share_of_vessels_with_missing_fars: float = 0.5) -> pandas.DataFrame

   Filters `vessels_at_sea` to keep only rows whose `cfr` is NOT in
   `vessels_that_emitted_fars`.

   :param vessels_at_sea: `DataFrame` of vessels at sea
   :type vessels_at_sea: pd.DataFrame
   :param vessels_that_emitted_fars: `set` of cfrs of vessels that emitted
                                     `FAR` reports
   :type vessels_that_emitted_fars: set
   :param max_share_of_vessels_with_missing_fars: If the share of
                                                  `vessels_at_sea` that are not in
                                                  `vessels_that_emitted_fars` is
                                                  greater than this value, it is
                                                  assumed that there is a breakdown
                                                  in the data pipeline and a
                                                  `MonitorfishHealthError` is
                                                  raised. Defaults to 0.5.
   :type max_share_of_vessels_with_missing_fars: float, optional

   :raises MonitorfishHealthError: Raised if the share of vessels with missing
                                   FARs is greater than
                                   `max_share_of_vessels_with_missing_fars`.

   :returns: Filtered version of `vessels_at_sea` with only those that are not
             in `vessels_that_emitted_fars`
   :rtype: pd.DataFrame


.. py:function:: merge_risk_factor(vessels_with_missing_fars: pandas.DataFrame, current_risk_factors: pandas.DataFrame) -> pandas.DataFrame

   Merges the input DataFrames on ["cfr", "external_immatriculation", "ircs"].

   :param vessels_with_missing_fars:
   :type vessels_with_missing_fars: pd.DataFrame
   :param current_risk_factors:
   :type current_risk_factors: pd.DataFrame

   :returns: pd.DataFrame


.. py:function:: missing_far_alerts_flow(alert_type: str, name: str, states_iso2_to_monitor_everywhere: list, states_iso2_to_monitor_in_french_eez: list, max_share_of_vessels_with_missing_fars: float, minimum_length: float, only_raise_if_route_shows_fishing: bool, days_without_far: int)
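
The five values that `get_dates` is documented to return can be illustrated with a minimal sketch. This is not the pipeline's implementation: the name `get_dates_sketch` and its `now` parameter are hypothetical (the latter only makes the example reproducible), and returning timezone-aware datetimes is an assumption, since the reference above does not say whether the real values are aware or naive.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def get_dates_sketch(
    days_without_far: int, now: Optional[datetime] = None
) -> Tuple[datetime, datetime, datetime, datetime, float]:
    """Illustrative stand-in for `get_dates` (hypothetical name)."""
    # Current datetime in UTC (2). `now` is a hypothetical override, not part
    # of the documented signature.
    utcnow = now if now is not None else datetime.now(timezone.utc)

    # Today at 00:00 (beginning of the day) in UTC.
    today_at_midnight = utcnow.replace(hour=0, minute=0, second=0, microsecond=0)

    # `days_without_far` days ago at 00:00 in UTC (1).
    period_start = today_at_midnight - timedelta(days=days_without_far)

    # Yesterday at 8pm in UTC, i.e. 4 hours before today's midnight.
    yesterday_at_8pm = today_at_midnight - timedelta(hours=4)

    # Number of hours that separate (1) and (2).
    hours_from_start = (utcnow - period_start).total_seconds() / 3600

    return period_start, yesterday_at_8pm, today_at_midnight, utcnow, hours_from_start


# Usage with a fixed "now" of 2024-01-10 06:30 UTC and days_without_far=2:
fixed_now = datetime(2024, 1, 10, 6, 30, tzinfo=timezone.utc)
start, yesterday_8pm, today, current, hours = get_dates_sketch(2, now=fixed_now)
print(start.isoformat())  # 2024-01-08T00:00:00+00:00
print(hours)  # 54.5
```

In this sketch the fifth value is derived from the first and fourth, so the tuple stays internally consistent even when the function is called close to midnight.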