=========================== Deployment & Administration =========================== Prerequisites ^^^^^^^^^^^^^ Dependencies ------------ The following dependencies must be installed on the production machine : * `git `__ * `docker `__ * `make `__ Configuration ------------- Cloning the repository """""""""""""""""""""" Clone the repo with : .. code-block:: bash git clone https://github.com/MTES-MCT/monitorfish.git .. _environment_variables: Environment variables """"""""""""""""""""" * A ``.env`` file must be created in the ``pipeline`` folder, with all the variables listed in ``.env.template`` filled in. * Set the ``MONITORFISH_VERSION`` environment variable. This will determine which docker images to pull when running ```make`` commands. ERS files """"""""" ERS raw xml files are ingested by the ERS flow from the configured ``ERS_FILES_LOCATION`` in ``pipeline/config.py``. In order to make ERS data available to Monitorfish, ERS files should therefore be deposited in this directory. Running the database service ---------------------------- The Monitorfish database must be running for data processing operations to be carried out. For this, run the backend service first. ---- Running the orchestration service ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Starting the Prefect server orchestrator ---------------------------------------- The orchestration service can be started with : .. code-block:: bash make run-pipeline-server-prod Automating log cleaning ----------------------- Logs of past flow runs are stored in a Postgres database that is part of the prefect server architecture. In order to keep the size of this database low, it is necessary to set up a cron job to delete old flow runs. The Prefect server database runs in a Docker container. The script ``infra/remote/data-pipeline/truncate-old-prefect-logs.sh`` goes into that container with ``docker exec`` and runs a ``DELETE`` query to delete old flow_runs. This query can be run daily by setting up a cron job, for instance by adding a line to the crontab file : .. code-block:: bash crontab -e then add the line in ``infra/remote/data-pipeline/crontab.txt`` (after updating the scripts and logs locations as needed) in the crontab file. ---- Running the execution service ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The execution service can be started with : .. code-block:: bash make run-pipeline-flows-prod ---- Database backup & restore ^^^^^^^^^^^^^^^^^^^^^^^^^ This section explains how to perform and automate full database backups. Configuration ------------- * Create a backups folder on the host machine. * Create ``MONITORFISH_BACKUPS_FOLDER`` entry with the full path to the backups folder in ~/.monitorfish - e.g.g. ``export MONITORFISH_BACKUPS_FOLDER="/backups/"``. * Create ``MONITORFISH_LOGS_AND_BACKUPS_GID`` entry in ~/.monitorfish with the group that owns the backups folder (the database container with be run with this group so it can write to the backups folder on the host) - e.g.g. ``export MONITORFISH_LOGS_AND_BACKUPS_GID="125"``. * Make a copy of ``infra/remote/backup/pg_backup.config.template`` and rename it ``pg_backup.config``. * Optionnally, change the backup parameters in ``pg_backup.config``. Backup ------ Running the backup script """"""""""""""""""""""""" Once the configuration step is done, a backup can be made by running the script at ``infra/remote/backup/pg_backup_rotated.sh``. This script : * ``docker execs`` into the database container and makes a full database backup using ``pg_dump`` * outputs : * a single ``globals.sql.gz`` file that contains database globals (roles, tablespaces) * a ``*.custom`` file (full database dump in compressed `custom` postgres format) for each database on the postgres cluster * stores these files on the host machine, in a subfolder of the backups folder, named with the date of the backup * deletes old backups in rotation, keeping daily and weekly backups for as long as specified in the ``pg_backup.config`` file Automating backups """""""""""""""""" To automate backups, add the line ``infra/remote/backup/crontab.txt`` to the crontab file : .. code-block:: bash crontab -e We recommend running the backup script daily. Restore ------- To restore from a backup, see `TimescaleDB documentation `_.