A Next.js-based OpenLineage proxy server that receives and stores OpenLineage events as individual JSON files for debugging and analysis purposes.
Send a test OpenLineage event to verify that the proxy is working:

```
POST /api/v1/lineage
```

This project serves as a debugging proxy for OpenLineage events emitted by various data integration and transformation tools:
- dbt (Data Build Tool)
- Big data processing engines such as Apache Spark
- Workflow management tools
- Receives OpenLineage events via HTTP POST requests
- Saves each event as a separate JSON file with a unique name
- Uses file locking for safe concurrent request handling
- Counter + UUID filename format for easy tracking
- Pretty-printed JSON output for easy reading and debugging
- Comprehensive error handling and logging
```bash
git clone https://github.com/senthilsweb/open-lineage-proxy.git
cd open-lineage-proxy
npm install
npm run dev
```

Configure your data integration tools to send OpenLineage events to this proxy:
```bash
OPENLINEAGE_URL=http://localhost:3000
OPENLINEAGE_NAMESPACE=dev
```

```bash
# Install dbt with OpenLineage support
pip install openlineage-dbt

# Set environment variables
export OPENLINEAGE_URL=http://localhost:3000
export OPENLINEAGE_NAMESPACE=dev

# Run dbt with OpenLineage
dbt-ol run
```

```bash
spark-submit \
  --packages io.openlineage:openlineage-spark:0.21.1 \
  --conf spark.extraListeners=io.openlineage.spark.agent.OpenLineageSparkListener \
  --conf spark.openlineage.transport.type=http \
  --conf spark.openlineage.transport.url=http://localhost:3000 \
  --conf spark.openlineage.namespace=dev \
  your_spark_job.py
```

```bash
curl -X POST http://localhost:3000/api/v1/lineage \
  -H "Content-Type: application/json" \
  -d '{
    "eventType": "START",
    "eventTime": "2024-01-20T10:00:00.000Z",
    "run": {"runId": "12345678-1234-1234-1234-123456789012"},
    "job": {"namespace": "dev", "name": "test_job"},
    "inputs": [], "outputs": []
  }'
```

`POST /api/v1/lineage` — receives OpenLineage events and saves them as JSON files.
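The curl example above carries the minimal fields of an OpenLineage run event. A small client-side sanity check (a hypothetical helper, not part of the proxy or of any OpenLineage client library) might verify those fields before posting:

```javascript
// Hypothetical helper: checks that an object carries the minimal fields used
// in the test event above. Not part of the proxy itself.
function isValidLineageEvent(event) {
  if (!event || typeof event !== "object") return false;
  const hasType = typeof event.eventType === "string";
  const hasTime =
    typeof event.eventTime === "string" && !isNaN(Date.parse(event.eventTime));
  const hasRun = !!event.run && typeof event.run.runId === "string";
  const hasJob =
    !!event.job &&
    typeof event.job.namespace === "string" &&
    typeof event.job.name === "string";
  return hasType && hasTime && hasRun && hasJob;
}

const testEvent = {
  eventType: "START",
  eventTime: "2024-01-20T10:00:00.000Z",
  run: { runId: "12345678-1234-1234-1234-123456789012" },
  job: { namespace: "dev", name: "test_job" },
  inputs: [],
  outputs: [],
};

console.log(isValidLineageEvent(testEvent)); // → true
```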
Returns current status and statistics of the API server.
Each OpenLineage event is saved as a JSON file with the following naming convention:

```
{counter}_lineage_data_{uuid}.json
```

The API route handler lives under `/pages/api/v1/lineage/`.