Use of Piggybank and Registration in Pig

Sachin Patil
16/09/2018

What is a Piggybank?

Piggybank is a JAR containing a collection of user-contributed UDFs (user-defined functions) that is released along with Pig. These UDFs are not included in the Pig JAR itself, so we have to register piggybank.jar manually in our script before using them.

1. Download piggybank.jar

2. Copy this jar to /usr/lib/pig/lib
Terminal > sudo cp /home/cloudera/Desktop/piggybank.jar /usr/lib/pig/lib/

3. Start the Grunt shell and register the jar with Pig, using the full path it was copied to:
Terminal > pig
Grunt > REGISTER /usr/lib/pig/lib/piggybank.jar;
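
Once the jar is registered, a Piggybank UDF can be called by its fully qualified class name, or given a short alias with DEFINE. The lines below are a minimal sketch of that pattern, not part of the original lesson. They assume the jar sits at the path from step 2, and they use a hypothetical input file /user/cloudera/names.txt with one name per line. The alias PbUpper is also just an illustrative name.

```pig
-- Make the Piggybank classes available to this session or script
REGISTER /usr/lib/pig/lib/piggybank.jar;

-- Give the Piggybank UPPER function a short alias (PbUpper is a made-up name)
DEFINE PbUpper org.apache.pig.piggybank.evaluation.string.UPPER();

-- Hypothetical input: one name per line
names = LOAD '/user/cloudera/names.txt' AS (name:chararray);

-- Apply the UDF to every record
upper_names = FOREACH names GENERATE PbUpper(name);

DUMP upper_names;
```

An alias keeps long package names out of the rest of the script, which helps when the same UDF is used in several statements.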

4. Now we are set to use Piggybank UDFs. For example, the CSVExcelStorage loader can be used to process a CSV file in Pig:

Grunt > tweets = load '/user/cloudera/tweets.csv'
        using org.apache.pig.piggybank.storage.CSVExcelStorage()
        as (date:chararray, timing:chararray, Tweet_Text:chararray, Type:chararray,
            Media_Type:chararray, Hashtags:chararray, Tweet_Id:long, Tweet_Url:chararray,
            twt_favourites:long, Retweets:long, col1:chararray, col2:chararray);
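
If the CSV file starts with a header row, the no-argument loader will read that row as data. More recent Pig releases also provide a four-argument CSVExcelStorage constructor (delimiter, multiline handling, end-of-line style, header handling). The following is a sketch under the assumptions that tweets.csv has a header row and that your Pig version ships this constructor, so check the Piggybank documentation for your release before relying on it.

```pig
-- Assumes piggybank.jar is at the path used in step 2
REGISTER /usr/lib/pig/lib/piggybank.jar;

-- ','                 field delimiter
-- 'NO_MULTILINE'      fields do not contain embedded line breaks
-- 'UNIX'              line endings are \n
-- 'SKIP_INPUT_HEADER' drop the first (header) row while loading
tweets = LOAD '/user/cloudera/tweets.csv'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'NO_MULTILINE', 'UNIX', 'SKIP_INPUT_HEADER')
    AS (date:chararray, timing:chararray, Tweet_Text:chararray, Type:chararray,
        Media_Type:chararray, Hashtags:chararray, Tweet_Id:long, Tweet_Url:chararray,
        twt_favourites:long, Retweets:long, col1:chararray, col2:chararray);
```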

5. Dump the result to the console:

Grunt> Dump tweets;
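
DUMP prints every record of the relation, which can be a lot of output for a large file. A common way to check the load first is to inspect the schema and print only a few rows. This is a small sketch that assumes the tweets relation from step 4 already exists in the current Grunt session. The name sample_tweets is illustrative.

```pig
-- Show the field names and types Pig has assigned to the relation
DESCRIBE tweets;

-- Keep only 10 records so the console output stays readable
sample_tweets = LIMIT tweets 10;

DUMP sample_tweets;
```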



