Write lineage metadata to centralized file on hive #856

@zacaydcloudera

Hi @wajda,
I used this config with HDFS on Cloudera:
spark.jars=hdfs:///tmp/spark-2.4-spline-agent-bundle_2.11-2.2.1.jar
spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener
spark.spline.mode=ENABLED
spark.spline.lineageDispatcher=hdfs
spark.spline.lineageDispatcher.hdfs.outputDir=hdfs:///tmp/spline/lineage/
spark.spline.lineageDispatcher.hdfs.fileNamePrefix=lineage_
spark.spline.lineageDispatcher.hdfs.fileBufferSize=4096
spark.spline.lineageDispatcher.hdfs.filePermissions=777
spark.driver.memory=4g

But the lineage was written to the location of the script's target file, so the agent seems to ignore spark.spline.lineageDispatcher.hdfs.outputDir=hdfs:///tmp/spline/lineage/
Is there a way to write the lineage to a centralized location, and also to name each JSON file after its execution_plan_id?
Thanks
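
For anyone trying to reproduce this, the properties above can also be passed as `spark-submit --conf` flags instead of a defaults file. The following is only a minimal sketch under that assumption: `build_submit_args` and `my_job.py` are hypothetical names, and the snippet just assembles the command line. It does not change how the agent interprets `outputDir`.

```python
import shlex

# Properties copied verbatim from the configuration in this issue.
SPLINE_CONF = {
    "spark.jars": "hdfs:///tmp/spark-2.4-spline-agent-bundle_2.11-2.2.1.jar",
    "spark.sql.queryExecutionListeners":
        "za.co.absa.spline.harvester.listener.SplineQueryExecutionListener",
    "spark.spline.mode": "ENABLED",
    "spark.spline.lineageDispatcher": "hdfs",
    "spark.spline.lineageDispatcher.hdfs.outputDir": "hdfs:///tmp/spline/lineage/",
    "spark.spline.lineageDispatcher.hdfs.fileNamePrefix": "lineage_",
    "spark.spline.lineageDispatcher.hdfs.fileBufferSize": "4096",
    "spark.spline.lineageDispatcher.hdfs.filePermissions": "777",
    "spark.driver.memory": "4g",
}


def build_submit_args(conf, app):
    """Return a spark-submit argument list with one --conf flag per property.

    Hypothetical helper for illustration; `app` is the job script to submit.
    """
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # Print a shell-safe command line ("my_job.py" is a placeholder script name).
    print(shlex.join(build_submit_args(SPLINE_CONF, "my_job.py")))
```

Printing the command rather than running it makes it easy to check that every property reaches the driver exactly as written here.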
