- Apache Pig Tutorial
- Apache Pig - Home
- Apache Pig Introduction
- Apache Pig - Overview
- Apache Pig - Architecture
- Apache Pig Environment
- Apache Pig - Installation
- Apache Pig - Execution
- Apache Pig - Grunt Shell
- Pig Latin
- Pig Latin - Basics
- Load & Store Operators
- Apache Pig - Reading Data
- Apache Pig - Storing Data
- Diagnostic Operators
- Apache Pig - Diagnostic Operator
- Apache Pig - Describe Operator
- Apache Pig - Explain Operator
- Apache Pig - Illustrate Operator
- Grouping & Joining
- Apache Pig - Group Operator
- Apache Pig - Cogroup Operator
- Apache Pig - Join Operator
- Apache Pig - Cross Operator
- Combining & Splitting
- Apache Pig - Union Operator
- Apache Pig - Split Operator
- Pig Latin Built-In Functions
- Apache Pig - Eval Functions
- Load & Store Functions
- Apache Pig - Bag & Tuple Functions
- Apache Pig - String Functions
- Apache Pig - date-time Functions
- Apache Pig - Math Functions
- Other Modes Of Execution
- Apache Pig - User-Defined Functions
- Apache Pig - Running Scripts
- Apache Pig Useful Resources
- Apache Pig - Quick Guide
- Apache Pig - Useful Resources
- Apache Pig - Discussion
Apache Pig - REPLACE()
This function is used to replace all the characters in a given string with the new characters.
Syntax
Given below is the syntax of the REPLACE() function. This function accepts three parameters, namely,
string − The string that is to be replaced. If we want to replace the string within a relation, we have to pass the column name the string belongs to.
regEXP − Here we have to pass the string/regular expression we want to replace.
newChar − Here we have to pass the new value of the string.
grunt> REPLACE(string, 'regExp', 'newChar');
Example
Assume that there is a file named emp.txt in the HDFS directory /pig_data/ as shown below. This file contains the employee details such as id, name, age, and city.
emp.txt
001,Robin,22,newyork 002,BOB,23,Kolkata 003,Maya,23,Tokyo 004,Sara,25,London 005,David,23,Bhuwaneshwar 006,Maggy,22,Chennai 007,Robert,22,newyork 008,Syam,23,Kolkata 009,Mary,25,Tokyo 010,Saran,25,London 011,Stacy,25,Bhuwaneshwar 012,Kelly,22,Chennai
And, we have loaded this file into Pig with a relation named emp_data as shown below.
grunt> emp_data = LOAD 'hdfs://localhost:9000/pig_data/emp1.txt' USING PigStorage(',') as (id:int, name:chararray, age:int, city:chararray);
Following is an example of the REPLACE() function. In this example, we have replaced the name of the city Bhuwaneshwar with a shorter form Bhuw.
grunt> replace_data = FOREACH emp_data GENERATE (id,city),REPLACE(city,'Bhuwaneshwar','Bhuw');
The above statement replaces the string 'Bhuwaneshwar' with 'Bhuw' in the column named city in the emp_data relation and returns the result. This result is stored in the relation named replace_data. Verify the content of the relation replace_data using the Dump operator as shown below.
grunt> Dump replace_data; ((1,newyork),newyork) ((2,Kolkata),Kolkata) ((3,Tokyo),Tokyo) ((4,London),London) ((5,Bhuwaneshwar),Bhuw) ((6,Chennai),Chennai) ((7,newyork),newyork) ((8,Kolkata),Kolkata) ((9,Tokyo),Tokyo) ((10,London),London) ((11,Bhuwaneshwar),Bhuw) ((12,Chennai),Chennai)