INFO: Column-level Lineage is supported for AWS Aurora and RDS Postgres and requires CloudWatch to be configured.
Steps to complete:
- Run SQL script and create schema for Datafold
- Configure your data connection in Datafold
Run SQL script and create schema for Datafold
To connect to Postgres, create a user with read-only access to all tables in all schemas and write access to a Datafold-specific schema for temporary tables:
/* Datafold uses a temporary schema to materialize scratch work and keep data processing in your warehouse. */
CREATE SCHEMA datafold_tmp;
/* Create a datafold user */
CREATE ROLE datafold WITH LOGIN ENCRYPTED PASSWORD 'SOMESECUREPASSWORD';
/* Give the datafold role write access to the temporary schema */
GRANT ALL ON SCHEMA datafold_tmp TO datafold;
/* Give the datafold role read access to the tables; repeat for each schema Datafold should read */
GRANT USAGE ON SCHEMA <myschema> TO datafold;
GRANT SELECT ON ALL TABLES IN SCHEMA <myschema> TO datafold;
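The script above grants read access on a single schema, <myschema>. If Datafold should read from several schemas, the two GRANT statements have to be repeated for each one. The following is a minimal sketch of doing that with a PL/pgSQL loop. It assumes that every non-system schema should be readable and that you run it as a role permitted to grant on those schemas; adjust the filter to your environment.

```sql
/* Sketch: grant read access on every non-system schema to the datafold role.
   Assumption: all schemas except the system ones and datafold_tmp should be readable. */
DO $$
DECLARE
    s text;
BEGIN
    FOR s IN
        SELECT nspname
        FROM pg_namespace
        WHERE nspname !~ '^pg_'  /* skips pg_catalog, pg_toast, pg_temp_* */
          AND nspname NOT IN ('information_schema', 'datafold_tmp')
    LOOP
        /* %I quotes the schema name as an identifier */
        EXECUTE format('GRANT USAGE ON SCHEMA %I TO datafold', s);
        EXECUTE format('GRANT SELECT ON ALL TABLES IN SCHEMA %I TO datafold', s);
    END LOOP;
END
$$;
```

GRANT SELECT ON ALL TABLES only covers tables that exist when the statement runs, so the loop would need to be run again after new tables are created.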
Datafold uses a temporary schema, named datafold_tmp in the script above, to materialize scratch work and keep data processing in your warehouse.
Configure your data connection in Datafold
| Field Name | Description |
|---|---|
| Name | A name given to the data connection within Datafold |
| Host | The hostname or IP address of your database; default value is 127.0.0.1 |
| Port | Postgres connection port; default value is 5432 |
| User | The user role created in our SQL script, named datafold |
| Password | The password created in our SQL script |
| Database Name | The name of the Postgres database you want to connect to |
| Schema for temporary tables | The schema (datafold_tmp) created in our SQL script |
Click Create. Your data connection is ready!
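If the connection does not behave as expected, one way to check the grants from the SQL script is to query PostgreSQL's built-in privilege functions. The sketch below assumes the role and schema names from the script above; <mytable> is a hypothetical placeholder for any table Datafold should be able to read.

```sql
/* Each column should return true if the grants from the setup script are in place. */
SELECT
    has_schema_privilege('datafold', 'datafold_tmp', 'CREATE')        AS can_write_tmp,
    has_schema_privilege('datafold', '<myschema>', 'USAGE')           AS can_use_schema,
    has_table_privilege('datafold', '<myschema>.<mytable>', 'SELECT') AS can_read_table;
```

A false value in any column points to the GRANT statement that needs to be rerun.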
Column-level Lineage with Aurora & RDS
This section guides you through setting up Column-level Lineage with AWS Aurora and RDS using CloudWatch.
Steps to complete:
- Set up Postgres with permissions
- Increase the logging verbosity of Postgres so Datafold can parse lineage
- Set up an account for fetching the logs from CloudWatch
- Configure your data connection in Datafold
Run SQL Script
To connect to Postgres, create a user with read-only access to all tables in all schemas and write access to a Datafold-specific schema for temporary tables:
/* Datafold uses a temporary schema to materialize scratch work and keep data processing in your warehouse. */
CREATE SCHEMA datafold_tmp;
/* Create a datafold user */
CREATE ROLE datafold WITH LOGIN ENCRYPTED PASSWORD 'SOMESECUREPASSWORD';
/* Give the datafold role write access to the temporary schema */
GRANT ALL ON SCHEMA datafold_tmp TO datafold;
/* Give the datafold role read access to the tables; repeat for each schema Datafold should read */
GRANT USAGE ON SCHEMA <myschema> TO datafold;
GRANT SELECT ON ALL TABLES IN SCHEMA <myschema> TO datafold;
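The GRANT SELECT statement above applies only to tables that already exist. If new tables are added to the schema regularly, PostgreSQL's ALTER DEFAULT PRIVILEGES can grant read access to future tables as well. This is an optional sketch, not part of the required setup. It assumes that the role running the statement is the same role that will create the new tables.

```sql
/* Applies only to tables created later by the role that runs this statement
   (or by the role named in an optional FOR ROLE clause).
   Existing tables still need the GRANT SELECT from the setup script. */
ALTER DEFAULT PRIVILEGES IN SCHEMA <myschema>
    GRANT SELECT ON TABLES TO datafold;
```

If tables are created by a different role, for example a loader or migration user, add FOR ROLE with that role's name after ALTER DEFAULT PRIVILEGES.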
Increase logging verbosity