pipeline.src.helpers.spatial
============================

.. py:module:: pipeline.src.helpers.spatial


Classes
-------

.. autoapisummary::

   pipeline.src.helpers.spatial.Position
   pipeline.src.helpers.spatial.PositionRepresentation


Functions
---------

.. autoapisummary::

   pipeline.src.helpers.spatial.coordinate_to_dms
   pipeline.src.helpers.spatial.position_to_position_representation
   pipeline.src.helpers.spatial.to_multipolygon
   pipeline.src.helpers.spatial.estimate_current_position
   pipeline.src.helpers.spatial.get_h3_indices
   pipeline.src.helpers.spatial.get_k_ring_of_h3_cells
   pipeline.src.helpers.spatial.point_dist
   pipeline.src.helpers.spatial.get_step_distances
   pipeline.src.helpers.spatial.compute_movement_metrics
   pipeline.src.helpers.spatial.detect_fishing_activity
   pipeline.src.helpers.spatial.enrich_positions
   pipeline.src.helpers.spatial.geocode
   pipeline.src.helpers.spatial.geocode_google


Module Contents
---------------

.. py:class:: Position

   .. py:attribute:: latitude
      :type: float


   .. py:attribute:: longitude
      :type: float


.. py:class:: PositionRepresentation

   Representation of a position with latitude and longitude in human-readable
   format.


   .. py:attribute:: latitude
      :type: str


   .. py:attribute:: longitude
      :type: str


.. py:function:: coordinate_to_dms(coord: float) -> Tuple[int, float, int, int]

   Takes a coordinate and returns the corresponding degrees, minutes_decimal,
   minutes and seconds. The sign is not taken into account - only positive
   values are returned.

   :param coord: latitude or longitude coordinate value
   :type coord: float

   :returns: degrees, minutes_decimal, minutes, seconds
   :rtype: Tuple[int, float, int, int]

   .. rubric:: Examples

   >>> coordinate_to_dms(45.123)
   (45, 7.379999999999853, 7, 23)
   >>> coordinate_to_dms(-45.123)
   (45, 7.379999999999853, 7, 23)


.. py:function:: position_to_position_representation(p: Position, representation_type: str = 'DMS') -> PositionRepresentation

   Converts a `Position` to a `PositionRepresentation` in the designated
   `representation_type`.


   :param p: input `Position`
   :type p: Position
   :param representation_type: "DMS" or "DMD". Defaults to "DMS".
   :type representation_type: str

   :returns: PositionRepresentation

   :raises ValueError: if:

      - `lat` is greater than 90.0 or less than -90.0
      - `lon` is greater than 180.0 or less than -180.0
      - `representation_type` is not 'DMD' or 'DMS'.


.. py:function:: to_multipolygon(p: Union[shapely.geometry.Polygon, shapely.geometry.MultiPolygon]) -> shapely.geometry.MultiPolygon

   Returns a MultiPolygon of the input Polygon or MultiPolygon geometry.


.. py:function:: estimate_current_position(last_latitude: float, last_longitude: float, course: float, speed: float, hours_since_last_position: float, max_hours_since_last_position: float = 2.0, on_error: str = 'ignore') -> Tuple[float, float]

   Estimate the current position of a vessel based on its last position, course
   and speed. If the last position is older than
   `max_hours_since_last_position`, or is in the future (i.e.
   `hours_since_last_position` is negative), returns None.

   :param last_latitude: last known latitude of vessel
   :type last_latitude: float
   :param last_longitude: last known longitude of vessel
   :type last_longitude: float
   :param course: last known course of vessel in degrees
   :type course: float
   :param speed: last known speed of vessel in knots
   :type speed: float
   :param hours_since_last_position: time since last known position of vessel,
      in hours
   :type hours_since_last_position: float
   :param max_hours_since_last_position: maximum time in hours since last
      position, after which the estimation is not performed (returns None
      instead). Defaults to 2.0.
   :type max_hours_since_last_position: float
   :param on_error: 'ignore' or 'raise'
   :type on_error: str

   :returns: estimated current latitude and estimated current longitude
   :rtype: Tuple[float, float]


.. py:function:: get_h3_indices(df: pandas.DataFrame, lat: str = 'latitude', lon: str = 'longitude', resolution: int = 12) -> pandas.Series

   Returns a Series with the same index as the input DataFrame and values equal
   to the h3 index corresponding to the latitude and longitude of the indicated
   columns of the DataFrame.

   :param df: DataFrame with latitude and longitude coordinates in 2 of its
      columns
   :type df: pd.DataFrame
   :param lat: name of the column containing latitudes. Defaults to "latitude".
   :type lat: str
   :param lon: name of the column containing longitudes. Defaults to
      "longitude".
   :type lon: str
   :param resolution: h3 resolution of the h3 cells to output.
   :type resolution: int

   :returns: h3 cells indices
   :rtype: pd.Series


.. py:function:: get_k_ring_of_h3_cells(h3_sequence: Iterable[str], k: int) -> Set[str]

   Takes a list-like sequence of h3 cells and an integer k, returns the set of
   h3 cells that belong to the k-ring of at least one of the h3 cells in the
   input sequence.

   :param h3_sequence: sequence of h3 cells
   :type h3_sequence: Iterable[str]
   :param k: number of rings to add around the input cells
   :type k: int

   :returns: set of h3 cells belonging to the k-ring of at least one of the h3
             cells in the input sequence
   :rtype: Set[str]


.. py:function:: point_dist(position1: Position, position2: Position) -> float

   Computes the spherical distance between two Position objects in meters.

   :param position1: first `Position`
   :type position1: Position
   :param position2: second `Position`
   :type position2: Position

   :returns: distance in meters between the two input Positions
   :rtype: float


.. py:function:: get_step_distances(df: pandas.DataFrame, lat: str = 'latitude', lon: str = 'longitude', how: str = 'backward', unit: str = 'm') -> numpy.array

   Compute the distance between successive positions (rows). The DataFrame must
   have latitude and longitude columns.

   Returns a numpy array with the same length as the input DataFrame and
   distances as values.


   :param df: DataFrame of positions, with latitude and longitude columns
   :type df: pd.DataFrame
   :param lat: column name containing latitudes
   :type lat: str
   :param lon: column name containing longitudes
   :type lon: str
   :param how: if 'forward', computes the interval between each position and
      the next one. If 'backward', computes the interval between each position
      and the previous one.
   :type how: str
   :param unit: the distance unit (passed to h3.great_circle_distance).
      Defaults to 'm'.
   :type unit: str

   :returns: array of distances between the successive positions.
   :rtype: np.array


.. py:function:: compute_movement_metrics(positions: pandas.DataFrame, lat: str = 'latitude', lon: str = 'longitude', datetime_column: str = 'datetime_utc', is_at_port_column: str = 'is_at_port', time_emitting_at_sea_column: str = 'time_emitting_at_sea') -> pandas.DataFrame

   Takes a pandas DataFrame with:

   - latitude and longitude columns (float dtypes)
   - a column indicating the date and time of the position (datetime dtype)
   - a column indicating whether the vessel is at port (boolean dtype)
   - a column indicating how long the vessel has been continuously emitting at
     sea in hours (float dtype)

   whose rows represent successive positions of a vessel, assumed to be sorted
   chronologically by ascending order.

   Returns a pandas DataFrame with the same index and columns, with:

   - speed, distance and time between successive positions as additional
     computed features in new columns
   - values for `time_emitting_at_sea_column` computed and updated - so if the
     input contained any NULL values, they will be computed and filled in.

   :param positions: DataFrame representing a vessel route
   :type positions: pd.DataFrame
   :param lat: column name of latitude values. May not contain null values.
   :type lat: str
   :param lon: column name of longitude values. May not contain null values.
   :type lon: str
   :param datetime_column: column name of datetime values. May not contain null
      values.
   :type datetime_column: str
   :param is_at_port_column: column indicating whether the vessel is at port.
      May not contain null values.
   :type is_at_port_column: str
   :param time_emitting_at_sea_column: column indicating how long the vessel
      has been continuously emitting at sea, in hours. May contain null values.
   :type time_emitting_at_sea_column: str

   :returns: the same DataFrame, plus added columns with the computed features
   :rtype: pd.DataFrame


.. py:function:: detect_fishing_activity(positions: pandas.DataFrame, minimum_minutes_of_emission_at_sea: int, is_at_port_column: str = 'is_at_port', average_speed_column: str = 'average_speed', time_emitting_at_sea_column: str = 'time_emitting_at_sea', minimum_consecutive_positions: int = 3, min_fishing_speed_threshold: float = 0.1, max_fishing_speed_threshold: float = 4.5, return_floats: bool = False) -> pandas.DataFrame

   Detects fishing activity from positions of a vessel.

   Rows of the input DataFrame represent successive positions of the analyzed
   vessel, assumed to be sorted chronologically by ascending order.

   The DataFrame must have columns indicating:

   1) whether the position is at port
   2) the average speed between each position and the previous one, in knots

   A vessel will be considered to be fishing if its average speed remains above
   the `min_fishing_speed_threshold` and below the
   `max_fishing_speed_threshold` for a minimum of
   `minimum_consecutive_positions` positions outside a port and after at least
   `minimum_minutes_of_emission_at_sea` minutes of uninterrupted VMS emission
   outside of a port.

   :param positions: DataFrame representing successive positions of a vessel,
      assumed to be sorted by ascending datetime
   :type positions: pd.DataFrame
   :param minimum_minutes_of_emission_at_sea: the minimum time a vessel is
      required to emit continuously at sea in order to be considered as in
      fishing activity, in minutes. This avoids detecting fishing activity when
      vessels leave ports.
   :type minimum_minutes_of_emission_at_sea: int
   :param is_at_port_column: name of the column containing boolean values for
      whether a position is at port or not
   :type is_at_port_column: str
   :param average_speed_column: name of the column containing average speed
      values (distance from previous position divided by time since the last
      position), in knots
   :type average_speed_column: str
   :param time_emitting_at_sea_column: name of the column containing the
      duration (in hours) for which the vessel has been continuously emitting
      at sea outside ports.
   :type time_emitting_at_sea_column: str
   :param minimum_consecutive_positions: minimum number of consecutive
      positions within the fishing speed thresholds to consider that a vessel
      is fishing
   :type minimum_consecutive_positions: int
   :param min_fishing_speed_threshold: speed below which a vessel is considered
      to be stopped
   :type min_fishing_speed_threshold: float
   :param max_fishing_speed_threshold: speed above which a vessel is considered
      to be in transit
   :type max_fishing_speed_threshold: float
   :param return_floats: if `True`, return `float` dtypes with 1.0 representing
      `True`, 0.0 representing `False` and `np.nan` for null values. If `False`
      (the default), the return dtype is `object` and values are `True`,
      `False` and `np.nan`, which is more explicit and natural but slower.
   :type return_floats: bool

   :returns: copy of the input DataFrame with the added boolean column
             "is_fishing"
   :rtype: pd.DataFrame


.. py:function:: enrich_positions(positions: pandas.DataFrame, minimum_minutes_of_emission_at_sea: int, lat: str = 'latitude', lon: str = 'longitude', datetime_column: str = 'datetime_utc', is_at_port_column: str = 'is_at_port', time_emitting_at_sea_column: str = 'time_emitting_at_sea', minimum_consecutive_positions: int = 3, min_fishing_speed_threshold: float = 0.1, max_fishing_speed_threshold: float = 4.5, return_floats: bool = False) -> pandas.DataFrame

   Applies `compute_movement_metrics` and `detect_fishing_activity` successively.
   See these two functions for help.


.. py:function:: geocode(query_string=None, country_code_iso2=None, backend: str = 'Nominatim', **kwargs)

   Return latitude, longitude for input location from a query string or from
   one or more of the following keyword arguments:

   - street
   - city
   - county
   - state
   - country
   - postalcode


.. py:function:: geocode_google(address=None, **kwargs)

   Return latitude, longitude for input location from a query string, with
   optional filtering on one or more of the following keyword arguments:

   - postal_code
   - country (country name or country code ISO2)
   - route
   - locality
   - administrative_area

   If address is not given, at least one kwarg must be given.
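
The dead-reckoning idea described for ``estimate_current_position`` can be
illustrated with a minimal, self-contained sketch. This is not the module's
implementation: the function name ``dead_reckoning_sketch``, the spherical
Earth model, the mean radius constant and the omission of ``on_error`` are all
assumptions made for illustration. It only mirrors the documented behaviour of
returning None when the last position is too old or lies in the future.

.. code-block:: python

   import math
   from typing import Optional, Tuple

   # Assumptions for this sketch: spherical Earth with a mean radius.
   EARTH_RADIUS_M = 6_371_000.0
   METERS_PER_NAUTICAL_MILE = 1852.0


   def dead_reckoning_sketch(
       last_latitude: float,
       last_longitude: float,
       course: float,
       speed: float,
       hours_since_last_position: float,
       max_hours_since_last_position: float = 2.0,
   ) -> Optional[Tuple[float, float]]:
       """Hypothetical stand-in for estimate_current_position."""
       # No estimate for positions that are too old or dated in the future.
       if not 0 <= hours_since_last_position <= max_hours_since_last_position:
           return None

       # knots * hours = nautical miles; convert to an angle in radians.
       distance_m = speed * hours_since_last_position * METERS_PER_NAUTICAL_MILE
       delta = distance_m / EARTH_RADIUS_M

       lat1 = math.radians(last_latitude)
       lon1 = math.radians(last_longitude)
       theta = math.radians(course)

       # Great-circle destination from start point, bearing and distance.
       lat2 = math.asin(
           math.sin(lat1) * math.cos(delta)
           + math.cos(lat1) * math.sin(delta) * math.cos(theta)
       )
       lon2 = lon1 + math.atan2(
           math.sin(theta) * math.sin(delta) * math.cos(lat1),
           math.cos(delta) - math.sin(lat1) * math.sin(lat2),
       )

       # Normalize the longitude to the [-180, 180) range.
       lon2_deg = (math.degrees(lon2) + 180.0) % 360.0 - 180.0
       return math.degrees(lat2), lon2_deg

For instance, a vessel at (0.0, 0.0) heading due east (course 90) at 60 knots
for one hour covers 60 nautical miles, so the sketch returns a position close
to one degree of longitude east on the equator. A call with
``hours_since_last_position`` of 3.0 and the default maximum of 2.0 returns
None.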