pipeline.src.utils
Functions
|
Performs reflection to get a sqlalchemy Table object with metadata reflecting |
|
Delete tables. |
|
Deletes all rows of a table whose id is in |
|
Execute SQL statement inserting data |
|
Moves a file to another directory. If the destination directory |
|
Module Contents
- pipeline.src.utils.get_table(table_name: str, schema: str, conn: sqlalchemy.engine.Connectable, logger: logging.Logger) sqlalchemy.Table[source]
Performs reflection to get a sqlalchemy Table object with metadata reflecting the table found in the databse. Returns resulting Table object.
If the table is not found in the database, raises an error.
- pipeline.src.utils.delete(tables: List[sqlalchemy.Table], connection: sqlalchemy.engine.base.Connection, logger: logging.Logger, truncate: bool = False)[source]
Delete tables. Useful to wipe tables before re-inserting fresh data in ETL jobs.
- pipeline.src.utils.delete_rows(table: sqlalchemy.Table, id_column: str, ids_to_delete: Sequence, connection: sqlalchemy.engine.base.Connection, logger: logging.Logger)[source]
Deletes all rows of a table whose id is in
ids_to_delete.- Parameters:
table (sqlalchemy.Table) – table to remove rows from
id_column (str) – name of the column in the table that contains ids to delete
ids (Sequence) – list-like sequence of ids to look for in the table and delete
connection (sqlalchemy.engine.base.Connection) – database connection
logger (logging.Logger) – logger
- pipeline.src.utils.psql_insert_copy(table, conn, keys, data_iter)[source]
Execute SQL statement inserting data
- Parameters:
table (pandas.io.sql.SQLTable)
conn (sqlalchemy.engine.Engine or sqlalchemy.engine.Connection)
keys (list of str) – Column names
data_iter (Iterable that iterates the values to be inserted)