Lineage
Entropy Data ingests OpenLineage events and renders an interactive lineage graph on data product pages, showing how datasets flow through your pipelines.
The Lineage API is experimental. The endpoints and event schema may change.
Overview
When OpenLineage events are submitted for a data product, a Lineage section appears on the data product detail page. It shows an interactive graph of pipeline jobs, input datasets, and output datasets. Click Expand to open the full-screen lineage visualizer.
OpenLineage is an open standard for pipeline metadata. Many tools emit OpenLineage events natively, including dbt, Airflow, Spark, Flink, and Dagster.
Linking events to data products
Events are linked to a data product and output port so that the lineage graph appears on the correct data product page. There are two options:
Option 1 — Query parameters
Configure urlParams in your OpenLineage transport config:
openlineage.yml
transport:
  type: http
  url: https://api.entropy-data.com
  endpoint: api/v1/lineage
  auth:
    type: api_key
    apiKey: <entropy-data-api-key>
  urlParams:
    dataProductId: orders
    outputPortName: snowflake_orders_v2
Option 2 — entropy_data run facet
Embed the link in the event JSON itself:
entropy_data run facet
{
  "run": {
    "runId": "...",
    "facets": {
      "entropy_data": {
        "_producer": "https://entropy-data.com",
        "_schemaURL": "https://entropy-data.com/spec/facets/1-0-0/EntropyDataRunFacet.json",
        "dataProductId": "orders",
        "outputPortName": "snowflake_orders_v2"
      }
    }
  }
}
If both are provided, the query parameters take precedence over the facet.
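The two linking options can be sketched in Python as follows. This is a minimal illustration, not the server's implementation: with_entropy_data_facet and resolve_link are hypothetical helper names, and applying the precedence rule field by field is an assumption, since this section only states that query parameters win over the facet.

```python
import copy

# Facet key and metadata values copied from the facet example in this section.
FACET_KEY = "entropy_data"
PRODUCER = "https://entropy-data.com"
SCHEMA_URL = "https://entropy-data.com/spec/facets/1-0-0/EntropyDataRunFacet.json"


def with_entropy_data_facet(event, data_product_id, output_port_name):
    """Return a copy of an OpenLineage event dict with the entropy_data run facet set."""
    linked = copy.deepcopy(event)  # leave the caller's event untouched
    facets = linked.setdefault("run", {}).setdefault("facets", {})
    facets[FACET_KEY] = {
        "_producer": PRODUCER,
        "_schemaURL": SCHEMA_URL,
        "dataProductId": data_product_id,
        "outputPortName": output_port_name,
    }
    return linked


def resolve_link(query_params, event):
    """Model the documented precedence: query parameters win over the facet.

    Hypothetical helper; resolving each field independently is an assumption.
    """
    facet = event.get("run", {}).get("facets", {}).get(FACET_KEY, {})
    return {
        key: query_params.get(key) or facet.get(key)
        for key in ("dataProductId", "outputPortName")
    }
```

Producers that cannot set URL parameters can attach the facet instead; the resolved link is the same as long as only one of the two options is used.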
dbt integration
To send lineage events from dbt, install openlineage-dbt and configure the transport:
Install
pip install openlineage-dbt
Create an openlineage.yml in your dbt project root:
openlineage.yml
transport:
  type: http
  url: https://api.entropy-data.com
  endpoint: api/v1/lineage
  auth:
    type: api_key
    apiKey: <entropy-data-api-key>
  urlParams:
    dataProductId: orders
    outputPortName: snowflake_orders_v2
Then run dbt through the dbt-ol wrapper that openlineage-dbt installs, for example dbt-ol run in place of dbt run. The integration emits START, COMPLETE, and FAIL events automatically.
Retention
Events older than 90 days are automatically deleted when new events are submitted.
OpenAPI Specification
Refer to the OpenAPI Specification for the full formal API documentation.
Submit an OpenLineage event
Submit an OpenLineage RunEvent. Compatible with all OpenLineage producers (dbt, Airflow, Spark, Flink, etc.).
Optional parameters
- dataProductId (string): Data product external ID to associate the event with.
- outputPortName (string): Output port name within the data product. Used to auto-resolve the data contract.
Request
curl --request POST https://api.entropy-data.com/api/v1/lineage \
  --header "x-api-key: $DMM_API_KEY" \
  --header "content-type: application/json" \
  --data @- << EOF
{
  "eventType": "COMPLETE",
  "eventTime": "2024-01-15T10:00:00.000Z",
  "run": {
    "runId": "d46e465b-d358-4d32-83d4-df660ff614dd"
  },
  "job": {
    "namespace": "dbt",
    "name": "analytics.dp_orders.stg_orders"
  },
  "inputs": [
    {
      "namespace": "snowflake://account.snowflakecomputing.com",
      "name": "raw_db.public.raw_orders"
    }
  ],
  "outputs": [
    {
      "namespace": "snowflake://account.snowflakecomputing.com",
      "name": "analytics_db.public.orders"
    }
  ],
  "producer": "https://github.com/OpenLineage/OpenLineage/tree/0.18.0/integration/dbt"
}
EOF
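For producers that do not use an OpenLineage client, the same request can be assembled with the Python standard library. The sketch below is illustrative only: build_submit_request is a hypothetical helper, and it assumes the x-api-key header and the dataProductId and outputPortName query parameters behave as in the curl example and parameter list above. It builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.entropy-data.com/api/v1/lineage"


def build_submit_request(event, api_key, data_product_id=None, output_port_name=None):
    """Build (but do not send) a POST request carrying one OpenLineage RunEvent."""
    params = {}
    if data_product_id:
        params["dataProductId"] = data_product_id
    if output_port_name:
        params["outputPortName"] = output_port_name
    # Only append a query string when at least one linking parameter is set.
    url = BASE_URL + ("?" + urllib.parse.urlencode(params) if params else "")
    return urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the event; keeping construction separate from sending makes the URL and headers easy to inspect first.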
Get OpenLineage events
Retrieve stored OpenLineage events. All filters are optional; omit all to get every event.
Optional parameters
- jobNamespace (string): Filter by job namespace.
- jobName (string): Filter by job name.
- runId (string): Filter by run ID.
- eventType (string): Filter by event type: START, RUNNING, COMPLETE, ABORT, FAIL.
- dataProductId (string): Filter by data product external ID.
Request
curl --get https://api.entropy-data.com/api/v1/lineage \
  --header "x-api-key: $DMM_API_KEY" \
  --data-urlencode "dataProductId=orders"
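The filters combine into an ordinary query string. As a rough sketch, the hypothetical helper build_events_url below assembles the URL from the filter names listed above and rejects unknown filters or event types before any request is made; this client-side validation is an addition for illustration, not documented API behaviour.

```python
import urllib.parse

BASE_URL = "https://api.entropy-data.com/api/v1/lineage"
ALLOWED_FILTERS = ("jobNamespace", "jobName", "runId", "eventType", "dataProductId")
EVENT_TYPES = ("START", "RUNNING", "COMPLETE", "ABORT", "FAIL")


def build_events_url(**filters):
    """Return the GET URL for the given filters; no filters means every event."""
    unknown = sorted(set(filters) - set(ALLOWED_FILTERS))
    if unknown:
        raise ValueError(f"unknown filter(s): {', '.join(unknown)}")
    event_type = filters.get("eventType")
    if event_type is not None and event_type not in EVENT_TYPES:
        raise ValueError(f"unsupported eventType: {event_type}")
    # Drop filters explicitly passed as None, then URL-encode the rest.
    query = urllib.parse.urlencode({k: v for k, v in filters.items() if v is not None})
    return BASE_URL + ("?" + query if query else "")
```

For example, build_events_url(dataProductId="orders", eventType="FAIL") yields a URL that returns only failed runs linked to the orders data product.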
Delete OpenLineage events
Delete events by run ID or by job namespace and name. If no filters are provided, all events are deleted.
Optional parameters
- runId (string): Delete events for this run ID.
- jobNamespace (string): Delete events for this job namespace (requires jobName).
- jobName (string): Delete events for this job name (requires jobNamespace).
Request
curl --get --request DELETE https://api.entropy-data.com/api/v1/lineage \
  --header "x-api-key: $DMM_API_KEY" \
  --data-urlencode "runId=d46e465b-d358-4d32-83d4-df660ff614dd"
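Because a request without filters deletes every event, a client-side guard can be worthwhile. The sketch below assumes the filters are passed as query parameters, as in the GET endpoint; build_delete_request and its allow_delete_all flag are hypothetical names introduced for illustration, not part of the API.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.entropy-data.com/api/v1/lineage"


def build_delete_request(api_key, run_id=None, job_namespace=None, job_name=None,
                         allow_delete_all=False):
    """Build (but do not send) a DELETE request for stored lineage events."""
    # jobNamespace and jobName are documented as requiring each other.
    if (job_namespace is None) != (job_name is None):
        raise ValueError("jobNamespace and jobName must be provided together")
    params = {}
    if run_id is not None:
        params["runId"] = run_id
    if job_namespace is not None:
        params["jobNamespace"] = job_namespace
        params["jobName"] = job_name
    # No filters means "delete everything", so require an explicit opt-in.
    if not params and not allow_delete_all:
        raise ValueError("no filters given; pass allow_delete_all=True to delete all events")
    url = BASE_URL + ("?" + urllib.parse.urlencode(params) if params else "")
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="DELETE")
```

The explicit opt-in keeps an accidentally empty filter set, for instance from an unset variable in a cleanup script, from wiping all lineage data.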