Installing LZO

Environment

Linux version: CentOS 6.5

JDK version: JDK 1.8

Hadoop version: 2.6.0-cdh5.7.0

Reference: the LZO project on GitHub

Installing the libraries

Install the required dependency libraries

# yum -y install lzo-devel zlib-devel gcc autoconf automake libtool
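Before moving on, it can help to confirm that the build tools are actually on the PATH. The following is a minimal sketch, not part of the original procedure: the helper name `missing_tools` is hypothetical, and it only checks executables, not the `-devel` header packages.

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Report which of the build tools installed above are still missing.
missing_tools gcc autoconf automake libtool
```

Empty output means all four tools were found.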

Download and extract LZO

$ cd ~/software
$ wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.10.tar.gz
$ tar -zxvf lzo-2.10.tar.gz -C ../source

Build LZO

$ cd ~/source/lzo-2.10
$ ./configure --enable-shared --prefix /usr/local/lzo-2.10
$ make && sudo make install
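A quick sanity check after `make install` is to look for the headers and the shared library under the install prefix. This is only a sketch: the helper name `lzo_installed` is hypothetical, and it assumes the default layout of `include/lzo/` and `lib/` under the prefix passed to `configure`.

```shell
# Hypothetical helper: succeed only if the given prefix looks like a
# finished LZO install (a header plus a shared library).
lzo_installed() {
    prefix=$1
    [ -f "$prefix/include/lzo/lzo1x.h" ] || return 1
    # Any liblzo2.so* file counts (for example liblzo2.so.2).
    for lib in "$prefix"/lib/liblzo2.so*; do
        [ -e "$lib" ] && return 0
    done
    return 1
}

if lzo_installed /usr/local/lzo-2.10; then
    echo "LZO headers and shared library found"
else
    echo "LZO install looks incomplete"
fi
```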

Download Hadoop-LZO

$ cd ~/software
$ wget https://github.com/twitter/hadoop-lzo/archive/master.zip
$ unzip master.zip -d ../source

Edit the Hadoop-LZO pom

$ cd ~/source/hadoop-lzo-master
$ vi pom.xml

Add the Cloudera repository

<repositories>
  <repository>
    <id>cloudera-repo</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
  </repository>
  <repository>
    <id>sonatype-nexus-snapshots</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>

Change the Hadoop version

<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <!-- <hadoop.current.version>2.6.4</hadoop.current.version> -->
  <hadoop.current.version>2.6.0-cdh5.7.0</hadoop.current.version>
  <hadoop.old.version>1.0.4</hadoop.old.version>
</properties>
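To confirm which version the pom will actually build against, the active property can be read back from the file. The sketch below is a plain-text approach using `grep` and `sed`, not a real XML parser; the helper name `pom_hadoop_version` is hypothetical, and it assumes each property sits on its own line as shown above.

```shell
# Hypothetical helper: print the value of <hadoop.current.version>
# from a pom.xml, skipping lines that are commented out.
pom_hadoop_version() {
    grep '<hadoop.current.version>' "$1" \
        | grep -v '<!--' \
        | sed -e 's/.*<hadoop.current.version>//' \
              -e 's/<\/hadoop.current.version>.*//'
}
```

Running `pom_hadoop_version ~/source/hadoop-lzo-master/pom.xml` after the edit should print the CDH version rather than the commented-out default.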

Build Hadoop-LZO

$ C_INCLUDE_PATH=/usr/local/lzo-2.10/include \
    LIBRARY_PATH=/usr/local/lzo-2.10/lib \
    mvn clean package

$ cd target/native/Linux-amd64-64
$ tar -cBf - -C lib . | tar -xBvf - -C ~
$ mv ~/libgplcompression* $HADOOP_HOME/lib/native/
$ cp ~/source/hadoop-lzo-master/target/hadoop-lzo-0.4.18-SNAPSHOT.jar \
    $HADOOP_HOME/share/hadoop/common/
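Once the files are copied, a short check can confirm that both artifacts landed where Hadoop expects them. This is a sketch under the directory layout used above; the helper name `check_hadoop_lzo` is hypothetical.

```shell
# Hypothetical helper: check a Hadoop home directory for the native
# library and the jar copied above, and report anything missing.
check_hadoop_lzo() {
    home=$1
    status=0
    ls "$home"/lib/native/libgplcompression* >/dev/null 2>&1 \
        || { echo "missing: libgplcompression in lib/native"; status=1; }
    ls "$home"/share/hadoop/common/hadoop-lzo-*.jar >/dev/null 2>&1 \
        || { echo "missing: hadoop-lzo jar in share/hadoop/common"; status=1; }
    return $status
}
```

Calling `check_hadoop_lzo "$HADOOP_HOME"` prints nothing and returns 0 when both files are in place.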

Configure Hadoop environment variables

Edit hadoop-env.sh

$ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Add the following setting

export LD_LIBRARY_PATH=/usr/local/lzo-2.10/lib
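A plain assignment replaces whatever `LD_LIBRARY_PATH` already held. If other native libraries rely on the existing value, a variant that appends instead may be safer. The sketch below assumes LZO was installed under the `/usr/local/lzo-2.10` prefix from the build step; `LZO_LIB_DIR` is a placeholder name introduced here for illustration.

```shell
# LZO_LIB_DIR is a placeholder: point it at the directory holding liblzo2.so.
LZO_LIB_DIR=/usr/local/lzo-2.10/lib

# Append to any existing value instead of overwriting it; the
# ${VAR:+...} expansion avoids a stray leading colon when it is empty.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$LZO_LIB_DIR"
```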

Edit core-site.xml

$ vi $HADOOP_HOME/etc/hadoop/core-site.xml

Add the following properties

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    com.hadoop.compression.lzo.LzoCodec,
    com.hadoop.compression.lzo.LzopCodec,
    org.apache.hadoop.io.compress.BZip2Codec
  </value>
</property>

<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
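A typo in a codec class name is easy to miss in a long comma-separated list. As a rough check, the file can be searched for both LZO codec classes. This sketch only does a text match and does not validate the XML; the helper name `has_lzo_codecs` is hypothetical.

```shell
# Hypothetical helper: succeed if the given config file mentions
# both LZO codec classes registered above.
has_lzo_codecs() {
    grep -q 'com\.hadoop\.compression\.lzo\.LzoCodec' "$1" &&
    grep -q 'com\.hadoop\.compression\.lzo\.LzopCodec' "$1"
}
```

For example, `has_lzo_codecs "$HADOOP_HOME/etc/hadoop/core-site.xml" && echo ok` prints `ok` only when both class names are present.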

Edit mapred-site.xml

<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
  <name>mapred.child.env</name>
  <value>LD_LIBRARY_PATH=/usr/local/lzo-2.10/lib</value>
</property>

That completes the LZO installation. If you want to see how to use and test LZO's index feature