Data contracts: related terms and a Vatico example
A data contract is an agreement in which a data producer (such as an ELT data pipeline) and a data consumer (such as a CRM system) define how data is exchanged and managed.
Publish: The data producer (the ELT data pipeline) shares the data interface, which describes the data format and structure in the form of downstream report tables, and also publishes a set of Service Level Agreements (SLAs). These SLAs define specific guarantees, for example that data fields are not null and that data is available when consumers need it.
Subscribe: The data consumer (a CRM system) registers its needs in a data contract store by choosing the required data interface (for example, the table report.crm_new_shipment_fulfillment_by_seller) and agreeing to the specified SLAs, which could include a guarantee that the data provided is non-null.
Example: Consider a scenario at Vatico where an ELT data pipeline serves as the producer. It publishes report tables whose SLAs guarantee that they are always complete (no null fields) and available. A CRM system, acting as the consumer, subscribes to report.crm_new_shipment_fulfillment_by_seller, knowing from the SLAs that the data will be non-null and will correctly reflect all shipments that have not been fulfilled by us.
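The publish and subscribe steps can be sketched in a few lines of Python. This is only an illustration: ContractStore, DataContract, sla_violations and the column names seller_id and shipment_id are hypothetical and do not describe an actual Vatico component.

```python
from dataclasses import dataclass, field


@dataclass
class DataContract:
    """A published data interface plus the SLAs the producer guarantees."""
    interface: str                  # e.g. a report table name
    columns: list[str]
    non_null_columns: list[str]     # SLA: these fields are never null
    subscribers: list[str] = field(default_factory=list)


class ContractStore:
    """Hypothetical in-memory contract store, for illustration only."""

    def __init__(self) -> None:
        self._contracts: dict[str, DataContract] = {}

    def publish(self, contract: DataContract) -> None:
        # The producer registers the interface and its SLAs.
        self._contracts[contract.interface] = contract

    def subscribe(self, consumer: str, interface: str) -> DataContract:
        # The consumer picks an interface and accepts its SLAs.
        contract = self._contracts[interface]  # KeyError if never published
        contract.subscribers.append(consumer)
        return contract


def sla_violations(contract: DataContract, rows: list[dict]) -> list[str]:
    """Return one message per null found in a column covered by the non-null SLA."""
    problems = []
    for i, row in enumerate(rows):
        for col in contract.non_null_columns:
            if row.get(col) is None:
                problems.append(f"row {i}: {col} is null")
    return problems


store = ContractStore()
store.publish(DataContract(
    interface="report.crm_new_shipment_fulfillment_by_seller",
    columns=["seller_id", "shipment_id"],          # assumed column names
    non_null_columns=["seller_id", "shipment_id"],
))
contract = store.subscribe(
    "crm_system", "report.crm_new_shipment_fulfillment_by_seller"
)

rows = [
    {"seller_id": 1, "shipment_id": "S-100"},
    {"seller_id": None, "shipment_id": "S-101"},   # breaks the non-null SLA
]
violations = sla_violations(contract, rows)
```

Here violations holds a single entry for the second row, which is the kind of signal the producer would want to catch before the CRM system reads the table.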
Data contract in dbt
Based on: Model contracts | dbt Developer Hub
Contracts are particularly recommended for “public” models that are used extensively inside and outside dbt, such as those feeding reports or dashboards. dbt checks that the model adheres to its contract during the build, which makes data handling and usage more stable and predictable.
To enforce a model’s contract, set enforced: true under the contract configuration.
When enforced, your contract must include every column’s name and data_type (where data_type matches one that your data platform understands).
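As a rough illustration of that requirement, the sketch below checks a model definition written as a Python dict that mirrors the layout of a dbt model YAML entry (config, contract, enforced, columns). The function contract_gaps and the column names are hypothetical; dbt performs its own check at build time, and this is not dbt's implementation.

```python
def contract_gaps(model: dict) -> list[str]:
    """List columns that lack a name or data_type when the contract is enforced."""
    enforced = model.get("config", {}).get("contract", {}).get("enforced", False)
    if not enforced:
        return []  # nothing to verify when the contract is not enforced
    gaps = []
    for idx, column in enumerate(model.get("columns", [])):
        for key in ("name", "data_type"):
            if not column.get(key):
                gaps.append(f"column {idx}: missing {key}")
    return gaps


# Dict mirroring a model entry in a dbt YAML file (column names are assumed).
model = {
    "name": "crm_new_shipment_fulfillment_by_seller",
    "config": {"contract": {"enforced": True}},
    "columns": [
        {"name": "seller_id", "data_type": "int"},
        {"name": "shipment_id"},  # data_type is missing
    ],
}
gaps = contract_gaps(model)
```

With the second column left incomplete, gaps reports the missing data_type; adding it makes the list empty.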
Breaking changes to a contract include:
- Removing an existing column
- Changing the data_type of an existing column
- Removing or modifying one of the constraints on an existing column (dbt v1.6 or higher)
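The three kinds of breaking change listed above can be detected by comparing two versions of a contract. The sketch below assumes each version is a dict mapping column name to its data_type and an optional list of constraints; breaking_changes is a hypothetical helper, not part of dbt, and it checks only these three cases.

```python
def breaking_changes(old: dict[str, dict], new: dict[str, dict]) -> list[str]:
    """Compare two contract versions and describe each breaking change found."""
    changes = []
    for name, old_col in old.items():
        if name not in new:
            changes.append(f"{name}: column removed")
            continue
        new_col = new[name]
        if old_col["data_type"] != new_col["data_type"]:
            changes.append(
                f"{name}: data_type changed from "
                f"{old_col['data_type']} to {new_col['data_type']}"
            )
        # A constraint present before but absent now was removed or modified.
        new_constraints = new_col.get("constraints", [])
        for constraint in old_col.get("constraints", []):
            if constraint not in new_constraints:
                changes.append(f"{name}: constraint {constraint} removed or modified")
    return changes


# Column names and types are assumed for illustration.
old_contract = {
    "seller_id": {"data_type": "int", "constraints": ["not_null"]},
    "shipment_id": {"data_type": "text", "constraints": ["not_null"]},
    "fulfilled_at": {"data_type": "timestamp"},
}
new_contract = {
    "seller_id": {"data_type": "bigint", "constraints": ["not_null"]},
    "shipment_id": {"data_type": "text", "constraints": []},
}
changes = breaking_changes(old_contract, new_contract)
```

A check like this could run in review before a contracted model is changed, so that consumers such as the CRM system are warned ahead of the build.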