Apache Pig - TOBAG()



The TOBAG() function of Pig Latin converts one or more expressions into individual tuples, each holding a single value, and places these tuples in a bag.

Syntax

Given below is the syntax of the TOBAG() function.

TOBAG(expression [, expression ...])

Example

Assume we have a file named employee_details.txt in the HDFS directory /pig_data/, with the following content.

employee_details.txt

001,Robin,22,newyork
002,BOB,23,Kolkata
003,Maya,23,Tokyo 
004,Sara,25,London 
005,David,23,Bhuwaneshwar 
006,Maggy,22,Chennai

We have loaded this file into Pig with the relation name emp_data as shown below.

grunt> emp_data = LOAD 'hdfs://localhost:9000/pig_data/employee_details.txt' USING PigStorage(',')
   as (id:int, name:chararray, age:int, city:chararray);

Let us now convert the id, name, age, and city of each employee (record) into single-field tuples and place them together in one bag per record, as shown below.

grunt> tobag = FOREACH emp_data GENERATE TOBAG(id, name, age, city);

Verification

You can verify the contents of the tobag relation using the Dump operator as shown below.

grunt> DUMP tobag;
  
({(1),(Robin),(22),(newyork)}) 
({(2),(BOB),(23),(Kolkata)}) 
({(3),(Maya),(23),(Tokyo)}) 
({(4),(Sara),(25),(London)}) 
({(5),(David),(23),(Bhuwaneshwar)}) 
({(6),(Maggy),(22),(Chennai)})
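
TOBAG() is often paired with the FLATTEN operator to turn several columns of one record into several rows. The following is a minimal sketch of that pattern, not part of the original example. It assumes the emp_data relation loaded above, and the names name_city, name_city_rows, info, and val are hypothetical, chosen only for illustration.

```pig
-- Sketch only: name_city, name_city_rows, info, and val are hypothetical names.
-- name and city are both chararray, so every tuple in the bag has the same type.
name_city = FOREACH emp_data GENERATE id, TOBAG(name, city) AS info;

-- FLATTEN un-nests the bag and emits one output record per tuple it contains.
name_city_rows = FOREACH emp_data GENERATE id, FLATTEN(TOBAG(name, city)) AS val;

DUMP name_city_rows;
```

Under these assumptions, each input record should produce two output records, for example (1,Robin) and (1,newyork) for the first employee. Because a bag is an unordered collection, the order of those records should not be relied on.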