Pig from Alan Gates’ book (In preparation for exam2)

Upload: gavivi

Post on 19-Feb-2016




Introduction

• Download Pig from pig.apache.org (onto timberlake or your local computer/laptop).
• Unzip and untar it. You are set to go.
• You can execute in local mode for learning purposes. Later on you can test it on your Hadoop installation.
• Navigate to the directory where Pig is installed.
• ./bin/pig -x local
• This puts you in the Grunt shell, running in local mode.


Data and Pig script

• Create a data directory (called data) in the directory where bin is located.
• Download from GitHub all the data files related to the Pig book and store them in the data directory:
– NYSE_dividends
– NYSE_daily
– Etc.
• Now go through the examples in chapters 1-4, either by typing them in line by line or by creating script files.
• Mystockanalysis.pig can be executed by
• ./bin/pig -x local Mystockanalysis.pig
• or line by line in Grunt.


Chapter 1

• Hello world of Pig.
• "Mary had a little lamb" example.
• Go through the example on p. 3.
• Create a "mary" file in your data directory.
• Type in the commands line by line as on p. 3.
• Now create a ch1.pig file out of the commands.
• Run the script file using the pig command.
• Try some other commands not listed there.
• Understand the examples discussed on p. 5, 6.
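The word count that the Mary example leads up to can be sketched roughly as follows. This is a minimal sketch, not necessarily the book's exact listing: it assumes the file was saved as data/mary and that Pig was started from the installation directory, and the alias names (lines, words, grpd, cntd) are chosen here for illustration.

```pig
-- Load each line of the file as a single field named 'line'.
-- The path 'data/mary' is an assumption; adjust it to where the file lives.
lines = load 'data/mary' as (line:chararray);

-- TOKENIZE splits a line into a bag of words; flatten turns that bag
-- into one record per word.
words = foreach lines generate flatten(TOKENIZE(line)) as word;

-- Collect identical words into one group per word.
grpd = group words by word;

-- Count the records in each group.
cntd = foreach grpd generate group, COUNT(words);

-- Print the (word, count) pairs to the screen.
dump cntd;
```

Typed at the grunt> prompt, each statement is only checked until dump triggers execution. Saved as ch1.pig, the same statements should run with ./bin/pig -x local ch1.pig.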


Chapter 2

• Discusses installing and running Pig.
• Go through the example on p. 14.
• That's all.


Chapter 3

• Discusses the Grunt shell, which is the interactive prompt you get in local mode.
• pig -x local
• Results in the Grunt prompt: grunt>
• See the example on page 20.


Chapter 4

• Pig data model
• Scalars such as int, long, float, double, etc.
• Complex types:
– Map: a chararray-to-element mapping, sort of like a key-value pair
– Tuple: an ordered collection of Pig elements, e.g. ('bob', 55)
– Bag: an unordered collection of tuples
• Nulls
• Schemas: Pig has a lax attitude towards schemas.
• Explicit:
• dividends = load 'NYSE_dividends' as (exchange:chararray, symbol:chararray, date:chararray, dividend:float);
• Or you could say
• divs = load 'NYSE_dividends' as (exchange, symbol, date, dividend);
• See the table on page 28.
• See the examples on p. 28, 29, 30.
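To see what the different schema styles do, the loads can be followed by describe in Grunt. The sketch below assumes the data file sits at data/NYSE_dividends relative to where Pig was started (the slides write just 'NYSE_dividends'); the aliases raw and symbols are illustrative names, not from the slides.

```pig
-- Explicit schema: every field gets a name and a type.
dividends = load 'data/NYSE_dividends' as
    (exchange:chararray, symbol:chararray, date:chararray, dividend:float);
describe dividends;

-- Names only: the fields are named, but each one defaults to bytearray.
divs = load 'data/NYSE_dividends' as (exchange, symbol, date, dividend);
describe divs;

-- No schema at all: fields can still be referenced by position.
raw = load 'data/NYSE_dividends';
symbols = foreach raw generate $1;   -- $1 is the second field (the symbol)

-- For reference, constants of the complex types are written like this:
--   map:   ['name'#'bob', 'age'#55]
--   tuple: ('bob', 55)
--   bag:   {('bob', 55), ('sally', 52)}
```

Comparing the two describe outputs shows the difference between declared types and the bytearray default, which is what the "lax attitude" amounts to in practice.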


Chapter 5

• Pig Latin
• Look at the examples on p. 33-50.
• Commands discussed are:
• Load, store, dump
• Relational operations: foreach, filter, group, order by, distinct, join
• Data operations: limit, sample, parallel.
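Several of these operators can be chained in one short script. The following is a rough sketch rather than an example from the book: it reuses the NYSE_dividends schema shown for Chapter 4, assumes the file is at data/NYSE_dividends, and uses a hypothetical output path output/symbols, which must not already exist when store runs.

```pig
-- load: read the dividends file with an explicit schema.
divs = load 'data/NYSE_dividends' as
    (exchange:chararray, symbol:chararray, date:chararray, dividend:float);

-- filter: keep only the rows that satisfy a condition.
paid = filter divs by dividend > 0.0;

-- group + foreach: total dividend per stock symbol.
grpd = group paid by symbol;
totals = foreach grpd generate group as symbol, SUM(paid.dividend) as total;

-- order by + limit: the five symbols with the largest totals.
srtd = order totals by total desc;
top5 = limit srtd 5;

-- dump: print the result to the screen.
dump top5;

-- distinct works on whole records, so project the field first.
syms = foreach divs generate symbol;
uniq = distinct syms;

-- store: write the result to an output directory instead of the screen.
store uniq into 'output/symbols';
```

Nothing runs until dump or store is reached, so a typo in an early statement may only show up as an error at that point.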