Apache Pig - Union Operator


Advertisements

The UNION operator of Pig Latin is used to merge the content of two relations. To perform UNION operation on two relations, their columns and domains must be identical.

Syntax

Given below is the syntax of the UNION operator.

grunt> Relation_name3 = UNION Relation_name1, Relation_name2;

Example

Assume that we have two files namely student_data1.txt and student_data2.txt in the /pig_data/ directory of HDFS as shown below.

Student_data1.txt

001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai.

Student_data2.txt

7,Komal,Nayak,9848022334,trivendram.
8,Bharathi,Nambiayar,9848022333,Chennai.

And we have loaded these two files into Pig with the relations student1 and student2 as shown below.

grunt> student1 = LOAD 'hdfs://localhost:9000/pig_data/student_data1.txt' USING PigStorage(',') 
   as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray); 
 
grunt> student2 = LOAD 'hdfs://localhost:9000/pig_data/student_data2.txt' USING PigStorage(',') 
   as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

Let us now merge the contents of these two relations using the UNION operator as shown below.

grunt> student = UNION student1, student2;

Verification

Verify the relation student using the DUMP operator as shown below.

grunt> Dump student; 

Output

It will display the following output, displaying the contents of the relation student.

(1,Rajiv,Reddy,9848022337,Hyderabad) (2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune) 
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai) 
(7,Komal,Nayak,9848022334,trivendram) 
(8,Bharathi,Nambiayar,9848022333,Chennai)

Useful Video Courses


Video

Apache Spark Online Training

46 Lectures 3.5 hours

Arnab Chakraborty

Video

Apache Spark with Scala - Hands On with Big Data

23 Lectures 1.5 hours

Mukund Kumar Mishra

Video

Learn Apache Cordova using Visual Studio 2015 & Command line

16 Lectures 1 hours

Nilay Mehta

Video

Delta Lake with Apache Spark using Scala

52 Lectures 1.5 hours

Bigdata Engineer

Video

Apache Zeppelin - Big Data Visualization Tool

14 Lectures 1 hours

Bigdata Engineer

Video

Olympic Games Analytics Project in Apache Spark for Beginner

23 Lectures 1 hours

Bigdata Engineer

Advertisements