Apache Pig - BinStorage()



The BinStorage() function loads and stores data in Pig using a machine-readable binary format. BinStorage() is generally used to store temporary data generated between MapReduce jobs. It supports multiple locations as input.

Syntax

Given below is the syntax of the BinStorage() function. It is not a statement on its own; it is passed in the USING clause of a LOAD or STORE statement.

BinStorage()
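
BinStorage() accepts more than one input location in a single LOAD statement. As a minimal sketch, assuming two BinStorage outputs already exist at the hypothetical paths /pig_Output/run1 and /pig_Output/run2, the locations can be passed as one comma-separated string:

grunt> combined = LOAD 'hdfs://localhost:9000/pig_Output/run1,hdfs://localhost:9000/pig_Output/run2' USING BinStorage();

The relation name combined and both paths are placeholders for illustration. The inputs should hold tuples with the same layout, since they are read into a single relation.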

Example

Assume that we have a file named stu_data.txt in the HDFS directory /pig_data/ as shown below.

stu_data.txt

001,Rajiv_Reddy,21,Hyderabad 
002,siddarth_Battacharya,22,Kolkata 
003,Rajesh_Khanna,22,Delhi 
004,Preethi_Agarwal,21,Pune 
005,Trupthi_Mohanthy,23,Bhuwaneshwar 
006,Archana_Mishra,23,Chennai 
007,Komal_Nayak,24,trivendram 
008,Bharathi_Nambiayar,24,Chennai 

Let us load this data into a Pig relation as shown below.

grunt> student_details = LOAD 'hdfs://localhost:9000/pig_data/stu_data.txt' USING PigStorage(',')
   as (id:int, firstname:chararray, age:int, city:chararray);

Now, we can store this relation in the HDFS directory /pig_Output/mydata using the BinStorage() function.

grunt> STORE student_details INTO 'hdfs://localhost:9000/pig_Output/mydata' USING BinStorage();

After executing the above statement, the relation is stored in the given HDFS directory. You can see it using the HDFS ls command as shown below.

$ hdfs dfs -ls hdfs://localhost:9000/pig_Output/mydata/
  
Found 2 items
-rw-r--r--   1 Hadoop supergroup          0 2015-10-26 16:58 hdfs://localhost:9000/pig_Output/mydata/_SUCCESS
-rw-r--r--   1 Hadoop supergroup        372 2015-10-26 16:58 hdfs://localhost:9000/pig_Output/mydata/part-m-00000

Now, load the data from the file part-m-00000.

grunt> result = LOAD 'hdfs://localhost:9000/pig_Output/mydata/part-m-00000' USING BinStorage();

Verify the contents of the relation as shown below.

grunt> Dump result; 

(1,Rajiv_Reddy,21,Hyderabad) 
(2,siddarth_Battacharya,22,Kolkata) 
(3,Rajesh_Khanna,22,Delhi) 
(4,Preethi_Agarwal,21,Pune) 
(5,Trupthi_Mohanthy,23,Bhuwaneshwar) 
(6,Archana_Mishra,23,Chennai) 
(7,Komal_Nayak,24,trivendram) 
(8,Bharathi_Nambiayar,24,Chennai)
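
The LOAD statement used for result declares no schema, so its fields have no names and are addressed by position ($0, $1, and so on). As a sketch, assuming the output path from the STORE statement and the field names of the original student_details relation, an AS clause can be added when loading so that later statements can refer to fields by name:

grunt> named_result = LOAD 'hdfs://localhost:9000/pig_Output/mydata/part-m-00000' USING BinStorage()
   as (id:int, firstname:chararray, age:int, city:chararray);
grunt> names_only = FOREACH named_result GENERATE firstname;

The relation names named_result and names_only are illustrative. The declared types are assumed to match the types that were stored.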