This project aims to enhance Spark lineage generation in OpenMetadata by creating table-to-table lineage edges with column-level lineage, as well as container-to-table lineage edges.

darshanik/Spark-Lineage-Plugin


Overview

This plugin collects the facets emitted by the OpenLineage Spark listener and transforms them into OpenMetadata-compatible payloads that create or update lineage.
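To make the transformation concrete, here is a minimal sketch of how one OpenLineage run event could be mapped to lineage edges. It is not the plugin's actual implementation. It assumes the event carries the standard `inputs`, `outputs`, and `columnLineage` facet fields, and that the target payload follows the shape of OpenMetadata's add-lineage request (`edge`, `fromEntity`, `toEntity`, `lineageDetails.columnsLineage`). The function name `build_lineage_edges` and the `entity_ids` lookup are hypothetical. A real integration would resolve entity IDs and fully qualified column names through the OpenMetadata API.

```python
from typing import Any


def build_lineage_edges(
    event: dict[str, Any],
    entity_ids: dict[str, tuple[str, str]],
) -> list[dict[str, Any]]:
    """Map one OpenLineage run event to add-lineage style request bodies.

    entity_ids is a hypothetical lookup: dataset name -> (entity id, entity type),
    where the type is "table" or "container".
    """
    requests = []
    for output in event.get("outputs", []):
        to_id, to_type = entity_ids[output["name"]]
        # Column-level detail lives in the output dataset's columnLineage facet.
        fields = output.get("facets", {}).get("columnLineage", {}).get("fields", {})
        for source in event.get("inputs", []):
            from_id, from_type = entity_ids[source["name"]]
            columns = []
            for to_column, detail in fields.items():
                # Keep only the input fields that come from this source dataset.
                from_columns = [
                    f'{field["name"]}.{field["field"]}'
                    for field in detail.get("inputFields", [])
                    if field["name"] == source["name"]
                ]
                if from_columns:
                    columns.append({
                        "fromColumns": from_columns,
                        "toColumn": f'{output["name"]}.{to_column}',
                    })
            edge = {
                "fromEntity": {"id": from_id, "type": from_type},
                "toEntity": {"id": to_id, "type": to_type},
            }
            if columns:
                edge["lineageDetails"] = {"columnsLineage": columns}
            requests.append({"edge": edge})
    return requests
```

In this sketch a container input (for example, files in object storage) produces an edge without column detail, while a table input produces an edge with a `columnsLineage` entry for each mapped output column.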

Features

  • Container-to-table lineage
  • Table-to-table lineage
  • Column-level lineage

Pre-Requisites

  • A Spark cluster or Spark Operator with the OpenLineage integration (i.e., the OpenLineage JAR must be installed on your Spark job or cluster)
  • OpenLineage Spark listener version 0.20.6 or later, enabled
  • OpenLineage Spark listener configured to emit event messages to a Kafka topic or to the API of this plugin
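The prerequisites above can be sketched as a set of Spark properties. This is an illustration, not configuration taken from this repository. The property keys follow the OpenLineage Spark integration's `spark.openlineage.transport.*` convention and the Maven coordinates follow the pattern used around the 0.20.x releases, so check both against the OpenLineage documentation for the version you install. The helper name `openlineage_spark_conf` and the example topic, broker, and URL values are hypothetical.

```python
from typing import Optional


def openlineage_spark_conf(
    version: str = "0.20.6",
    kafka_topic: Optional[str] = None,
    kafka_bootstrap_servers: Optional[str] = None,
    http_url: Optional[str] = None,
) -> dict[str, str]:
    """Build Spark properties that enable the OpenLineage listener.

    Exactly one transport must be chosen: a Kafka topic or an HTTP URL.
    """
    if (kafka_topic is None) == (http_url is None):
        raise ValueError("choose exactly one of kafka_topic or http_url")

    conf = {
        # Assumed coordinates; later releases may use a different artifact name.
        "spark.jars.packages": f"io.openlineage:openlineage-spark:{version}",
        "spark.extraListeners": "io.openlineage.spark.agent.OpenLineageSparkListener",
    }
    if kafka_topic is not None:
        if kafka_bootstrap_servers is None:
            raise ValueError("kafka_bootstrap_servers is required for Kafka")
        conf["spark.openlineage.transport.type"] = "kafka"
        conf["spark.openlineage.transport.topicName"] = kafka_topic
        conf["spark.openlineage.transport.properties.bootstrap.servers"] = (
            kafka_bootstrap_servers
        )
    else:
        conf["spark.openlineage.transport.type"] = "http"
        conf["spark.openlineage.transport.url"] = http_url
    return conf


def to_submit_args(conf: dict[str, str]) -> list[str]:
    """Render the properties as spark-submit arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args
```

For example, `to_submit_args(openlineage_spark_conf(kafka_topic="lineage-events", kafka_bootstrap_servers="broker:9092"))` yields `--conf` pairs that can be appended to a `spark-submit` command. The same dictionary can also be applied key by key when building a Spark session.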

Limitations

  • Generates lineage in OpenMetadata only for Spark jobs.

Note: This document is subject to change as more features are released.
