Installing Pig

Deploying the Pig package

$ tar zxvf pig-0.8.1.tar.gz
$ sudo mv pig-0.8.1 /usr/local/
$ sudo chown -R hadoop:hadoop /usr/local/pig-0.8.1
$ sudo ln -s /usr/local/pig-0.8.1 /usr/local/pig
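The symlink gives the rest of the setup a version-independent path (/usr/local/pig). Here is a minimal sketch of the same layout, assuming a scratch directory instead of /usr/local so it can be tried without sudo. PREFIX is a hypothetical variable that does not appear in the steps above, and the empty pig-0.8.1 directory only stands in for the extracted tarball.

```shell
#!/bin/sh
# PREFIX is a hypothetical stand-in for /usr/local (assumption, not in the original steps).
PREFIX=${PREFIX:-$(mktemp -d)}

# Stand-in for the directory extracted from pig-0.8.1.tar.gz.
mkdir -p "$PREFIX/pig-0.8.1/bin"

# -s: symbolic link, -f: replace an existing link, -n: do not follow an existing link.
ln -sfn "$PREFIX/pig-0.8.1" "$PREFIX/pig"

# Print where the version-independent path currently points.
readlink "$PREFIX/pig"
```

Because PIG_HOME later refers only to the symlink, moving to another Pig version should only need the link to be repointed.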

Pig environment settings

$ export PIG_HOME=/usr/local/pig
$ sudo -e /etc/bashrc

...
export PIG_HOME=/usr/local/pig # added

$ export PATH=$PATH:$PIG_HOME/bin
$ sudo -e /etc/bashrc

...
export PATH=$PATH:$PIG_HOME/bin # added

$ sudo -e /usr/local/pig/conf/pig-env.sh
$ cat /usr/local/pig/conf/pig-env.sh

PIG_CLASSPATH=$HADOOP_HOME/conf

$ sudo chown hadoop:hadoop /usr/local/pig/conf/pig-env.sh
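The two /etc/bashrc edits above were made by hand in an editor. As a rough sketch, the same lines could be appended by a script that skips a line when it is already present. The helper name append_once and the RC_FILE variable are my own assumptions, and the sketch writes to a local sample file rather than /etc/bashrc, which needs root.

```shell
#!/bin/sh
# append_once is a hypothetical helper: add a line to a file only if
# that exact line is not already there.
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Hypothetical target file (assumption); the article edits /etc/bashrc.
RC_FILE=${RC_FILE:-./bashrc.sample}

# Single quotes keep $PATH and $PIG_HOME unexpanded, so they are written literally.
append_once 'export PIG_HOME=/usr/local/pig' "$RC_FILE"
append_once 'export PATH=$PATH:$PIG_HOME/bin' "$RC_FILE"
```

Running the script a second time leaves the file unchanged, which avoids stacking duplicate PATH entries.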

Checking the connection to the Hadoop cluster from the Pig shell

$ sudo su hadoop
$ pig
2011-04-26 10:54:25,548 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/hadoop/pig_1303782865545.log
2011-04-26 10:54:25,934 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:54310
2011-04-26 10:54:26,124 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:54311
grunt> ls /
hdfs://localhost:54310/hadoop   <dir>
hdfs://localhost:54310/tmp      <dir>
hdfs://localhost:54310/user     <dir>

Pig sample

I tried running the sample from "Pigのインストール" on osacaz4の日記.

grunt> ls input
hdfs://localhost:54310/user/hadoop/input/capacity-scheduler.xml<r 3>    3936
hdfs://localhost:54310/user/hadoop/input/core-site.xml<r 3>     390
hdfs://localhost:54310/user/hadoop/input/hadoop-policy.xml<r 3> 4190
hdfs://localhost:54310/user/hadoop/input/hdfs-site.xml<r 3>     407
hdfs://localhost:54310/user/hadoop/input/mapred-site.xml<r 3>   404

grunt> A = LOAD 'input';
grunt> B = FILTER A BY $0 MATCHES '.*dfs[a-z.]+.*';
grunt> DUMP B;

...

(    dfsadmin and mradmin commands to refresh the security policy in-effect. )
(        <name>dfs.name.dir</name>)
(        <name>dfs.data.dir</name>)
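To see what the FILTER pattern does without a cluster, here is an approximate local equivalent using grep. Pig's MATCHES has to match the whole field, which is why the Pig pattern is wrapped in `.*` on both sides. grep matches substrings, so the wrapping is dropped. The function name filter_dfs and the mapred line in the sample input are my own assumptions. grep also looks at the whole line, whereas $0 in Pig is the first tab-delimited field.

```shell
#!/bin/sh
# filter_dfs is a hypothetical helper approximating:
#   FILTER A BY $0 MATCHES '.*dfs[a-z.]+.*'
# LC_ALL=C keeps [a-z] limited to ASCII lowercase letters.
filter_dfs() {
    LC_ALL=C grep -E 'dfs[a-z.]+'
}

# Two lines taken from the DUMP output above, plus one made-up
# non-matching line (assumption) to show that it is dropped.
printf '%s\n' \
    '        <name>dfs.name.dir</name>' \
    '        <name>mapred.example.key</name>' \
    '    dfsadmin and mradmin commands to refresh the security policy in-effect. ' \
    | filter_dfs
```

Only lines containing "dfs" followed by at least one lowercase letter or dot pass through, so the mapred line is filtered out.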