Apache NiFi - Data Provenance


Advertisements

Apache NiFi logs and store every information about the events occur on the ingested data in the flow. Data provenance repository stores this information and provides UI to search this event information. Data provenance can be accessed for full NiFi level and processor level also.

Data Provenance

The following table lists down the different fields in the NiFi Data Provenance event list have following fields −

S.No. Field Name Description
1 Date/Time Date and time of event.
2 Type Type of Event like ‘CREATE’.
3 FlowFileUuid UUID of the flowfile on which the event is performed.
4 Size Size of the flowfile.
5 Component Name Name of the component which  performed the event.
6 Component Type Type of the component.
7 Show lineage Last column has the show lineage icon, which is used to see the flowfile lineage as shown in the below image.
Lineage Icon

To get more information about the event, a user can click on the information icon present in the first column of the NiFi Data Provenance UI.

There are some properties in nifi.properties file, which are used to manage NiFi Data Provenance repository.

S.No. Property Name Default Value Description
1 nifi.provenance.repository.directory.default ./provenance_repository To specify the default path of NiFi data provenance .
2 nifi.provenance.repository.max.storage.time 24 hours To specify the maximum retention time of NiFi data provenance.
3 nifi.provenance.repository.max.storage.size 1 GB To specify the maximum storage of NiFi data provenance.
4 nifi.provenance.repository.rollover.time 30 secs To specify the rollover time of NiFi data provenance.
5 nifi.provenance.repository.rollover.size 100 MB To specify the rollover size of NiFi data provenance.
6 nifi.provenance.repository.indexed.fields EventType, FlowFileUUID, Filename, ProcessorID, Relationship To specify the fields used to search and index NiFi data provenance.

Useful Video Courses


Video

Apache Spark Online Training

46 Lectures 3.5 hours

Arnab Chakraborty

Video

Apache Spark with Scala - Hands On with Big Data

23 Lectures 1.5 hours

Mukund Kumar Mishra

Video

Learn Apache Cordova using Visual Studio 2015 & Command line

16 Lectures 1 hours

Nilay Mehta

Video

Delta Lake with Apache Spark using Scala

52 Lectures 1.5 hours

Bigdata Engineer

Video

Apache Zeppelin - Big Data Visualization Tool

14 Lectures 1 hours

Bigdata Engineer

Video

Olympic Games Analytics Project in Apache Spark for Beginner

23 Lectures 1 hours

Bigdata Engineer

Advertisements