INFO: Column-level Lineage is supported for AWS Aurora and RDS Postgres and requires CloudWatch to be configured.
Steps to complete:
  1. Run SQL script and create schema for Datafold
  2. Configure your data connection in Datafold

Run SQL script and create schema for Datafold

To connect to Postgres, you need to create a user with read-only access to all tables in all schemas and write access to a Datafold-specific schema for temporary tables:
/* Datafold utilizes a temporary schema to materialize scratch work and keep data processing in your warehouse. */

CREATE SCHEMA datafold_tmp;

/* Create a datafold user */

CREATE ROLE datafold WITH LOGIN ENCRYPTED PASSWORD 'SOMESECUREPASSWORD';

/* Give the datafold role write access to the temporary schema */

GRANT ALL ON SCHEMA datafold_tmp TO datafold;

/* Make sure that the datafold role has read permissions on the tables */

GRANT USAGE ON SCHEMA <myschema> TO datafold;
GRANT SELECT ON ALL TABLES IN SCHEMA <myschema> TO datafold;

Datafold utilizes a temporary schema, named datafold_tmp in the above script, to materialize scratch work and keep data processing in your warehouse.
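
Note that GRANT SELECT ON ALL TABLES in the script above covers only tables that exist when it runs. A minimal sketch, assuming you also want tables created later in <myschema> to be readable by the datafold role, uses standard Postgres default privileges. This statement is an optional addition, not part of the original script:

```sql
/* Assumption: future tables in <myschema> should also be readable by datafold.
   This applies only to tables later created by the role that runs this statement;
   add FOR ROLE <owner> to cover tables created by a different role. */

ALTER DEFAULT PRIVILEGES IN SCHEMA <myschema>
    GRANT SELECT ON TABLES TO datafold;
```

Repeat the statement for each schema that Datafold should read.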

Configure in Datafold

Field Name                    Description
Name                          A name given to the data connection within Datafold
Host                          The hostname address for your database; default value 127.0.0.1
Port                          Postgres connection port; default value is 5432
User                          The user role created in our SQL script, named datafold
Password                      The password created in our SQL script
Database Name                 The name of the Postgres database you want to connect to
Schema for temporary tables   The schema (datafold_tmp) created in our SQL script
Click Create. Your data connection is ready!
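
If the connection fails or tables are missing, one way to check the grants from the SQL script is with the built-in Postgres privilege functions. This is a sketch, not an official Datafold check; <mytable> is a hypothetical placeholder for any table in <myschema>:

```sql
/* Each query returns true when the datafold role holds the privilege. */

/* Write access to the temporary schema */
SELECT has_schema_privilege('datafold', 'datafold_tmp', 'CREATE') AS can_write_tmp;

/* Access to the schema being read */
SELECT has_schema_privilege('datafold', '<myschema>', 'USAGE') AS can_use_schema;

/* Read access to a specific table (<mytable> is a placeholder) */
SELECT has_table_privilege('datafold', '<myschema>.<mytable>', 'SELECT') AS can_read_table;
```

Run these as a superuser or as the datafold role itself, connected to the database named in the Database Name field.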

Column-level Lineage with Aurora & RDS

This guide walks you through setting up Column-level Lineage with AWS Aurora & RDS using CloudWatch. Steps to complete:
  1. Set up Postgres with permissions
  2. Increase the logging verbosity of Postgres so Datafold can parse lineage
  3. Set up an account for fetching the logs from CloudWatch
  4. Configure your data connection in Datafold

Run SQL Script

To connect to Postgres, you need to create a user with read-only access to all tables in all schemas and write access to a Datafold-specific schema for temporary tables:
/* Datafold utilizes a temporary schema to materialize scratch work and keep data processing in your warehouse. */

CREATE SCHEMA datafold_tmp;

/* Create a datafold user */

CREATE ROLE datafold WITH LOGIN ENCRYPTED PASSWORD 'SOMESECUREPASSWORD';

/* Give the datafold role write access to the temporary schema */

GRANT ALL ON SCHEMA datafold_tmp TO datafold;

/* Make sure that the datafold role has read permissions on the tables */

GRANT USAGE ON SCHEMA <myschema> TO datafold;
GRANT SELECT ON ALL TABLES IN SCHEMA <myschema> TO datafold;
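
The two GRANT statements above apply to a single schema, while the requirement is read access across all schemas. A possible sketch, assuming every non-system schema in the current database should be readable, loops over the catalog in a PL/pgSQL block. The filter on schema names is an assumption; adjust it to your environment:

```sql
/* Grant read access on every non-system schema to datafold.
   Assumption: system schemas are those prefixed pg_ plus information_schema. */

DO $$
DECLARE
    s text;
BEGIN
    FOR s IN
        SELECT nspname
        FROM pg_namespace
        WHERE left(nspname, 3) <> 'pg_'
          AND nspname NOT IN ('information_schema', 'datafold_tmp')
    LOOP
        /* %I quotes the schema name safely as an identifier */
        EXECUTE format('GRANT USAGE ON SCHEMA %I TO datafold', s);
        EXECUTE format('GRANT SELECT ON ALL TABLES IN SCHEMA %I TO datafold', s);
    END LOOP;
END
$$;
```

The block must be run by a role that is allowed to grant privileges on each of those schemas and their tables, such as the owner or a superuser.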

Increase logging verbosity