Apache Pig provides complete support for writing UDFs in Java and limited support in the other supported languages. Using Java, you can write UDFs that cover all parts of the processing, such as data load/store, column transformation, and aggregation. Since Apache Pig itself is written in Java, UDFs written in Java generally run more efficiently than those written in other languages.
Apache Pig also has a Java repository for UDFs named Piggybank. Using Piggybank, we can access Java UDFs written by other users and contribute our own UDFs.
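When writing UDFs in Java, we can create and use the following three types of functions: filter functions (used as conditions in FILTER statements; they accept a Pig value and return a Boolean), eval functions (used in FOREACH ... GENERATE statements; they accept a Pig value and return a Pig result), and algebraic functions (which act on inner bags in a FOREACH ... GENERATE statement).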
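As a rough analogy in plain Java (this is not the Pig API, and every name below is hypothetical), the three kinds of functions can be pictured as a predicate, a value transformer, and an aggregate that can be computed from partial results. In Pig itself these roles are played by classes built on the Pig library, which this sketch deliberately avoids so that it compiles without the Pig jar.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Plain-Java analogy only; none of these names come from Apache Pig.
public class UdfKindsSketch {

    // Filter-style: takes a value and returns a boolean usable as a condition.
    public static final Predicate<String> IS_NON_EMPTY =
            s -> s != null && !s.isEmpty();

    // Eval-style: takes a value and returns a transformed value.
    public static final Function<String, Integer> LENGTH =
            s -> s == null ? 0 : s.length();

    // Algebraic-style: the result over a whole bag is built by combining
    // partial results computed on chunks of that bag (here, a count).
    public static long countInStages(List<List<String>> chunks) {
        long total = 0;
        for (List<String> chunk : chunks) {
            long partial = chunk.size(); // partial result for one chunk
            total += partial;            // combine the partial results
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(IS_NON_EMPTY.test("Pig"));   // prints true
        System.out.println(LENGTH.apply("Pig"));        // prints 3
        System.out.println(countInStages(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"))));                  // prints 3
    }
}
```

The staged count is the part worth noticing: a function is a good candidate for the algebraic style when its final answer can be assembled from answers on subsets of the data.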
To write a UDF using Java, we have to add the jar file Pig-0.15.0.jar to the project. In this section, we discuss how to write a sample UDF using Eclipse. Before proceeding further, make sure you have installed Eclipse and Maven on your system.
Follow the steps given below to write a UDF function −
When writing an eval UDF, it is mandatory to inherit from the EvalFunc class and provide an implementation of the exec() function. The code required for the UDF is written within this function. In this example, exec() contains the code that converts the contents of the given column to uppercase.
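The uppercase conversion described above can be sketched in stand-alone Java as follows. This is a minimal illustration, not the tutorial's original listing: the class name UpperCaseSketch and the method toUpper are hypothetical, and the code intentionally avoids the Pig classes so that it compiles without the Pig jar. In an actual UDF, the same logic would sit inside the exec() function of a class that inherits EvalFunc.

```java
import java.util.Locale;

// Stand-alone sketch; the class and method names are hypothetical.
// In a real UDF this logic would live in exec() of a class that
// inherits EvalFunc, which requires the Pig jar on the classpath.
public class UpperCaseSketch {

    // Returns null for a missing field, otherwise the upper-cased text.
    public static String toUpper(Object field) {
        if (field == null) {
            return null; // nothing to convert
        }
        return field.toString().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toUpper("Robin")); // prints ROBIN
    }
}
```

Passing Locale.ROOT keeps the result independent of the default locale of the machine running the job, which is a design choice of this sketch rather than something the tutorial specifies.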
After writing the UDF and generating the Jar file, follow the steps given below −
After writing the UDF (in Java), we have to register the jar file that contains the UDF using the Register operator. Registering the jar file tells Apache Pig where the UDF is located.
Given below is the syntax of the Register operator.
As an example, let us register the sample_udf.jar created earlier in this chapter.
Start Apache Pig in local mode and register the jar file sample_udf.jar as shown below.
Note − Assume the jar file is located at the path /$PIG_HOME/sample_udf.jar
After registering the UDF, we can define an alias for it using the Define operator.
Given below is the syntax of the Define operator.
Define the alias for sample_eval as shown below.
After defining the alias, you can use the UDF in the same way as the built-in functions. Suppose there is a file named emp_data in the HDFS directory /Pig_Data/ with the following content.
And assume we have loaded this file into Pig as shown below.
Let us now convert the names of the employees to uppercase using the UDF sample_eval.
Verify the contents of the relation Upper_case as shown below.