Monday, August 28, 2017

Integrate LZO with Hadoop2

Apache Hadoop 


Apache Hadoop is an open-source framework written in Java that is designed to store large schemaless or schema-based datasets in a distributed manner (HDFS) and to compute useful insights from the stored data using a programming model (MapReduce).

In HDFS, data can be compressed before it is stored to reduce the space it occupies, shorten disk read time, and speed up data transfer across the network.
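
For example, compressing MapReduce job output usually takes only a couple of standard Hadoop 2 properties. A minimal sketch (the jar, driver class, and input/output paths are placeholders, and the -D options are only honored if the driver uses ToolRunner):

hadoop jar my-job.jar MyDriver \
  -Dmapreduce.output.fileoutputformat.compress=true \
  -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
  /input /output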

Compression and Decompression

There are many compression/decompression algorithms available, such as GZIP, bzip2, LZO, LZ4, and Snappy, which can be used on their own or inside container file formats such as Avro, SequenceFile, or Parquet. Each algorithm has its own space-time trade-off: faster compression and decompression usually saves less space, and vice versa.

LZO is distributed under the terms of the GNU General Public License (GPL). Therefore, it is not included in Apache Hadoop and needs to be installed separately.

In this tutorial, I will write down the steps to configure LZO with Hadoop.

LZO Format

The LZO format compresses and decompresses files quite efficiently. Although its compression ratio is not as good as GZIP or bzip2, decompression is very fast. LZO files can also be indexed, which makes them splittable.
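
For example, once the hadoop-lzo library built later in this post is on the Hadoop classpath, an .lzo file can be indexed with the bundled LzoIndexer. A rough sketch (the jar path and file path are placeholders):

hadoop jar /path/to/hadoop-lzo-0.4.21-SNAPSHOT.jar com.hadoop.compression.lzo.LzoIndexer /data/big_file.lzo

This should write a big_file.lzo.index file next to the original; com.hadoop.compression.lzo.DistributedLzoIndexer does the same work as a MapReduce job when there are many files to index.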

Configure LZO with Hadoop

The steps below demonstrate how to configure LZO.

Get LZO native library

pooja@pooja:~$ sudo apt-get install liblzo2-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  liblzo2-dev
0 upgraded, 1 newly installed, 0 to remove and 377 not upgraded.
Need to get 93.1 kB of archives.
After this operation, 690 kB of additional disk space will be used.
Get:1 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main liblzo2-dev amd64 2.06-1.2ubuntu1.1 [93.1 kB]
Fetched 93.1 kB in 0s (244 kB/s)     
Selecting previously unselected package liblzo2-dev:amd64.
(Reading database ... 184739 files and directories currently installed.)
Preparing to unpack .../liblzo2-dev_2.06-1.2ubuntu1.1_amd64.deb ...
Unpacking liblzo2-dev:amd64 (2.06-1.2ubuntu1.1) ...
Setting up liblzo2-dev:amd64 (2.06-1.2ubuntu1.1) ...
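
To double-check where the package put its files, dpkg can list them (purely a sanity check; the headers typically land under /usr/include/lzo):

dpkg -L liblzo2-dev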


Install LZO

Now, download LZO from the URL, or use the command below to download it.

pooja@pooja:~/dev$ wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.10.tar.gz
--2017-08-28 16:03:11--  http://www.oberhumer.com/opensource/lzo/download/lzo-2.10.tar.gz
Resolving www.oberhumer.com (www.oberhumer.com)... 193.170.194.40
Connecting to www.oberhumer.com (www.oberhumer.com)|193.170.194.40|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 600622 (587K) [application/x-gzip]
Saving to: ‘lzo-2.10.tar.gz’

100%[=====================================================================================================>] 600,622      601KB/s   in 1.0s   

2017-08-28 16:03:13 (601 KB/s) - ‘lzo-2.10.tar.gz’ saved [600622/600622]

Then, untar the file using the command below.
pooja@pooja:~/dev$ tar xvzf lzo-2.10.tar.gz

Then, configure LZO.
pooja@pooja:~/dev/lzo-2.10$ ./configure --enable-shared --prefix /usr/local/lzo-2.10
configure: Configuring LZO 2.10
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking target system type... x86_64-pc-linux-gnu
checking whether to enable maintainer-specific portions of Makefiles... no
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for ar... ar
checking the archiver (ar) interface... ar
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking how to run the C preprocessor... gcc -E
checking whether the C preprocessor needs special flags... none needed
checking for an ANSI C-conforming const... yes
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking whether byte ordering is bigendian... no
checking for special C compiler options needed for large files... no
checking for _FILE_OFFSET_BITS value needed for large files... no
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking dependency style of gcc... gcc3
checking whether make supports nested variables... (cached) yes
checking how to print strings... printf
checking for a sed that does not truncate output... /bin/sed
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking how to convert x86_64-pc-linux-gnu file names to x86_64-pc-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-pc-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... mt
checking if mt is a manifest tool... no
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking whether time.h and sys/time.h may both be included... yes
checking assert.h usability... yes
checking assert.h presence... yes
checking for assert.h... yes
checking ctype.h usability... yes
checking ctype.h presence... yes
checking for ctype.h... yes
checking dirent.h usability... yes
checking dirent.h presence... yes
checking for dirent.h... yes
checking errno.h usability... yes
checking errno.h presence... yes
checking for errno.h... yes
checking fcntl.h usability... yes
checking fcntl.h presence... yes
checking for fcntl.h... yes
checking float.h usability... yes
checking float.h presence... yes
checking for float.h... yes
checking limits.h usability... yes
checking limits.h presence... yes
checking for limits.h... yes
checking malloc.h usability... yes
checking malloc.h presence... yes
checking for malloc.h... yes
checking for memory.h... (cached) yes
checking setjmp.h usability... yes
checking setjmp.h presence... yes
checking for setjmp.h... yes
checking signal.h usability... yes
checking signal.h presence... yes
checking for signal.h... yes
checking stdarg.h usability... yes
checking stdarg.h presence... yes
checking for stdarg.h... yes
checking stddef.h usability... yes
checking stddef.h presence... yes
checking for stddef.h... yes
checking for stdint.h... (cached) yes
checking stdio.h usability... yes
checking stdio.h presence... yes
checking for stdio.h... yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for strings.h... (cached) yes
checking time.h usability... yes
checking time.h presence... yes
checking for time.h... yes
checking for unistd.h... (cached) yes
checking utime.h usability... yes
checking utime.h presence... yes
checking for utime.h... yes
checking sys/mman.h usability... yes
checking sys/mman.h presence... yes
checking for sys/mman.h... yes
checking sys/resource.h usability... yes
checking sys/resource.h presence... yes
checking for sys/resource.h... yes
checking for sys/stat.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for sys/types.h... (cached) yes
checking sys/wait.h usability... yes
checking sys/wait.h presence... yes
checking for sys/wait.h... yes
checking whether limits.h is sane... yes
checking for off_t... yes
checking for ptrdiff_t... yes
checking for size_t... yes
checking return type of signal handlers... void
checking size of short... 2
checking size of int... 4
checking size of long... 8
checking size of long long... 8
checking size of __int16... 0
checking size of __int32... 0
checking size of __int64... 0
checking size of void *... 8
checking size of size_t... 8
checking size of ptrdiff_t... 8
checking size of __int32... (cached) 0
checking size of intmax_t... 8
checking size of uintmax_t... 8
checking size of intptr_t... 8
checking size of uintptr_t... 8
checking size of float... 4
checking size of double... 8
checking size of long double... 16
checking size of dev_t... 8
checking size of fpos_t... 16
checking size of mode_t... 4
checking size of off_t... 8
checking size of ssize_t... 8
checking size of time_t... 8
checking for access... yes
checking for alloca... no
checking for atexit... yes
checking for atoi... yes
checking for atol... yes
checking for chmod... yes
checking for chown... yes
checking for clock_getcpuclockid... yes
checking for clock_getres... yes
checking for clock_gettime... yes
checking for ctime... yes
checking for difftime... yes
checking for fstat... yes
checking for getenv... yes
checking for getpagesize... yes
checking for getrusage... yes
checking for gettimeofday... yes
checking for gmtime... yes
checking for isatty... yes
checking for localtime... yes
checking for longjmp... yes
checking for lstat... yes
checking for memcmp... yes
checking for memcpy... yes
checking for memmove... yes
checking for memset... yes
checking for mkdir... yes
checking for mktime... yes
checking for mmap... yes
checking for mprotect... yes
checking for munmap... yes
checking for qsort... yes
checking for raise... yes
checking for rmdir... yes
checking for setjmp... yes
checking for signal... yes
checking for snprintf... yes
checking for strcasecmp... yes
checking for strchr... yes
checking for strdup... yes
checking for strerror... yes
checking for strftime... yes
checking for stricmp... no
checking for strncasecmp... yes
checking for strnicmp... no
checking for strrchr... yes
checking for strstr... yes
checking for time... yes
checking for umask... yes
checking for utime... yes
checking for vsnprintf... yes
checking whether to build assembly versions... no
checking whether your compiler passes the LZO conformance test... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating lzo2.pc
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands

   LZO configuration summary
   -------------------------
   LZO version                : 2.10
   configured for host        : x86_64-pc-linux-gnu
   source code location       : .
   compiler                   : gcc
   preprocessor definitions   : -DLZO_HAVE_CONFIG_H=1
   preprocessor flags         : 
   compiler flags             : -g -O2
   build static library       : yes
   build shared library       : yes
   enable i386 assembly code  : no


   LZO 2.10 configured.

   Copyright (C) 1996-2017 Markus Franz Xaver Johannes Oberhumer
   All Rights Reserved.

   The LZO library is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License as
   published by the Free Software Foundation; either version 2 of
   the License, or (at your option) any later version.

   The LZO library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   Markus F.X.J. Oberhumer
   <markus@oberhumer.com>
   http://www.oberhumer.com/opensource/lzo/


Type 'make' to build LZO.
Type 'make check' and 'make test' to test LZO.
Type 'make install' to install LZO.
After installing LZO, please have a look at 'examples/simple.c'.

Finally, make and install LZO.
pooja@pooja:~/dev/lzo-2.10$ make && sudo make install
make  all-am
make[1]: Entering directory `/home/pooja/dev/lzo-2.10'
  CC       src/lzo1.lo
  CC       src/lzo1_99.lo
  CC       src/lzo1a.lo
  CC       src/lzo1a_99.lo
  CC       src/lzo1b_1.lo
  CC       src/lzo1b_2.lo
  CC       src/lzo1b_3.lo
  CC       src/lzo1b_4.lo
  CC       src/lzo1b_5.lo
  CC       src/lzo1b_6.lo
  CC       src/lzo1b_7.lo
  CC       src/lzo1b_8.lo
  CC       src/lzo1b_9.lo
  CC       src/lzo1b_99.lo
  CC       src/lzo1b_9x.lo
  CC       src/lzo1b_cc.lo
  CC       src/lzo1b_d1.lo
  CC       src/lzo1b_d2.lo
  CC       src/lzo1b_rr.lo
  CC       src/lzo1b_xx.lo
  CC       src/lzo1c_1.lo
  CC       src/lzo1c_2.lo
  CC       src/lzo1c_3.lo
  CC       src/lzo1c_4.lo
  CC       src/lzo1c_5.lo
  CC       src/lzo1c_6.lo
  CC       src/lzo1c_7.lo
  CC       src/lzo1c_8.lo
  CC       src/lzo1c_9.lo
  CC       src/lzo1c_99.lo
  CC       src/lzo1c_9x.lo
  CC       src/lzo1c_cc.lo
  CC       src/lzo1c_d1.lo
  CC       src/lzo1c_d2.lo
  CC       src/lzo1c_rr.lo
  CC       src/lzo1c_xx.lo
  CC       src/lzo1f_1.lo
  CC       src/lzo1f_9x.lo
  CC       src/lzo1f_d1.lo
  CC       src/lzo1f_d2.lo
  CC       src/lzo1x_1.lo
  CC       src/lzo1x_1k.lo
  CC       src/lzo1x_1l.lo
  CC       src/lzo1x_1o.lo
  CC       src/lzo1x_9x.lo
  CC       src/lzo1x_d1.lo
  CC       src/lzo1x_d2.lo
  CC       src/lzo1x_d3.lo
  CC       src/lzo1x_o.lo
  CC       src/lzo1y_1.lo
  CC       src/lzo1y_9x.lo
  CC       src/lzo1y_d1.lo
  CC       src/lzo1y_d2.lo
  CC       src/lzo1y_d3.lo
  CC       src/lzo1y_o.lo
  CC       src/lzo1z_9x.lo
  CC       src/lzo1z_d1.lo
  CC       src/lzo1z_d2.lo
  CC       src/lzo1z_d3.lo
  CC       src/lzo2a_9x.lo
  CC       src/lzo2a_d1.lo
  CC       src/lzo2a_d2.lo
  CC       src/lzo_crc.lo
  CC       src/lzo_init.lo
  CC       src/lzo_ptr.lo
  CC       src/lzo_str.lo
  CC       src/lzo_util.lo
  CCLD     src/liblzo2.la
  CC       examples/dict.o
  CCLD     examples/dict
  CC       examples/lzopack.o
  CCLD     examples/lzopack
  CC       examples/overlap.o
  CCLD     examples/overlap
  CC       examples/precomp.o
  CCLD     examples/precomp
  CC       examples/precomp2.o
  CCLD     examples/precomp2
  CC       examples/simple.o
  CCLD     examples/simple
  CC       lzotest/lzotest.o
  CCLD     lzotest/lzotest
  CC       tests/align.o
  CCLD     tests/align
  CC       tests/chksum.o
  CCLD     tests/chksum
  CC       tests/promote.o
  CCLD     tests/promote
  CC       tests/sizes.o
  CCLD     tests/sizes
  CC       minilzo/t-testmini.o
  CC       minilzo/t-minilzo.o
  CCLD     minilzo/testmini
make[1]: Leaving directory `/home/pooja/dev/lzo-2.10'
make[1]: Entering directory `/home/pooja/dev/lzo-2.10'
 /bin/mkdir -p '/usr/local/lzo-2.10/lib'
 /bin/bash ./libtool   --mode=install /usr/bin/install -c   src/liblzo2.la '/usr/local/lzo-2.10/lib'
libtool: install: /usr/bin/install -c src/.libs/liblzo2.so.2.0.0 /usr/local/lzo-2.10/lib/liblzo2.so.2.0.0
libtool: install: (cd /usr/local/lzo-2.10/lib && { ln -s -f liblzo2.so.2.0.0 liblzo2.so.2 || { rm -f liblzo2.so.2 && ln -s liblzo2.so.2.0.0 liblzo2.so.2; }; })
libtool: install: (cd /usr/local/lzo-2.10/lib && { ln -s -f liblzo2.so.2.0.0 liblzo2.so || { rm -f liblzo2.so && ln -s liblzo2.so.2.0.0 liblzo2.so; }; })
libtool: install: /usr/bin/install -c src/.libs/liblzo2.lai /usr/local/lzo-2.10/lib/liblzo2.la
libtool: install: /usr/bin/install -c src/.libs/liblzo2.a /usr/local/lzo-2.10/lib/liblzo2.a
libtool: install: chmod 644 /usr/local/lzo-2.10/lib/liblzo2.a
libtool: install: ranlib /usr/local/lzo-2.10/lib/liblzo2.a
libtool: finish: PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin" ldconfig -n /usr/local/lzo-2.10/lib
----------------------------------------------------------------------
Libraries have been installed in:
   /usr/local/lzo-2.10/lib

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the '-LLIBDIR'
flag during linking and do at least one of the following:
   - add LIBDIR to the 'LD_LIBRARY_PATH' environment variable
     during execution
   - add LIBDIR to the 'LD_RUN_PATH' environment variable
     during linking
   - use the '-Wl,-rpath -Wl,LIBDIR' linker flag
   - have your system administrator add LIBDIR to '/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------
 /bin/mkdir -p '/usr/local/lzo-2.10/share/doc/lzo'
 /usr/bin/install -c -m 644 AUTHORS COPYING NEWS THANKS doc/LZO.FAQ doc/LZO.TXT doc/LZOAPI.TXT '/usr/local/lzo-2.10/share/doc/lzo'
 /bin/mkdir -p '/usr/local/lzo-2.10/lib/pkgconfig'
 /usr/bin/install -c -m 644 lzo2.pc '/usr/local/lzo-2.10/lib/pkgconfig'
 /bin/mkdir -p '/usr/local/lzo-2.10/include/lzo'
 /usr/bin/install -c -m 644 include/lzo/lzo1.h include/lzo/lzo1a.h include/lzo/lzo1b.h include/lzo/lzo1c.h include/lzo/lzo1f.h include/lzo/lzo1x.h include/lzo/lzo1y.h include/lzo/lzo1z.h include/lzo/lzo2a.h include/lzo/lzo_asm.h include/lzo/lzoconf.h include/lzo/lzodefs.h include/lzo/lzoutil.h '/usr/local/lzo-2.10/include/lzo'
make[1]: Leaving directory `/home/pooja/dev/lzo-2.10'
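
With the install finished, the shared library, static library, headers, and pkg-config file should all be under /usr/local/lzo-2.10 (the prefix passed to ./configure). A quick sanity check:

ls /usr/local/lzo-2.10/lib /usr/local/lzo-2.10/include/lzo

This should list liblzo2.so (with its versioned symlinks), liblzo2.a, and the lzo*.h headers installed above.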

Clone and build hadoop-lzo

Now, clone the hadoop-lzo project, then build it with Maven, setting C_INCLUDE_PATH and LIBRARY_PATH to point at the LZO headers and libraries installed above.

pooja@pooja:~/dev$ git clone https://github.com/twitter/hadoop-lzo.git
Cloning into 'hadoop-lzo'...
remote: Counting objects: 1889, done.
remote: Total 1889 (delta 0), reused 0 (delta 0), pack-reused 1889
Receiving objects: 100% (1889/1889), 16.45 MiB | 4.56 MiB/s, done.
Resolving deltas: 100% (745/745), done.
Checking connectivity... done.

pooja@pooja:~/dev/hadoop-lzo$ C_INCLUDE_PATH=/usr/local/lzo-2.10/include \
> LIBRARY_PATH=/usr/local/lzo-2.10/lib \
>   mvn clean package
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building hadoop-lzo 0.4.21-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-lzo ---
[INFO] Deleting /home/pooja/dev/hadoop-lzo/target
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (check-platform) @ hadoop-lzo ---
[INFO] Executing tasks

check-platform:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (set-props-non-win) @ hadoop-lzo ---
[INFO] Executing tasks

set-props-non-win:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (set-props-win) @ hadoop-lzo ---
[INFO] Executing tasks

set-props-win:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-resources-plugin:2.3:resources (default-resources) @ hadoop-lzo ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/pooja/dev/hadoop-lzo/src/main/resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ hadoop-lzo ---
[INFO] Compiling 25 source files to /home/pooja/dev/hadoop-lzo/target/classes
[WARNING] bootstrap class path not set in conjunction with -source 1.6
/home/pooja/dev/hadoop-lzo/src/main/java/com/hadoop/compression/lzo/DistributedLzoIndexer.java:[52,20] [deprecation] isDir() in FileStatus has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/main/java/com/hadoop/compression/lzo/DistributedLzoIndexer.java:[112,14] [deprecation] Job(Configuration) in Job has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/main/java/com/hadoop/compression/lzo/LzoIndexer.java:[82,18] [deprecation] isDir() in FileStatus has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/main/java/com/hadoop/mapreduce/LzoIndexOutputFormat.java:[31,28] [deprecation] cleanupJob(JobContext) in OutputCommitter has been deprecated
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (build-info-non-win) @ hadoop-lzo ---
[INFO] Executing tasks

build-info-non-win:
[propertyfile] Creating new property file: /home/pooja/dev/hadoop-lzo/target/classes/hadoop-lzo-build.properties
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (build-info-win) @ hadoop-lzo ---
[INFO] Executing tasks

build-info-win:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (check-native-uptodate-non-win) @ hadoop-lzo ---
[INFO] Executing tasks

check-native-uptodate-non-win:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (check-native-uptodate-win) @ hadoop-lzo ---
[INFO] Executing tasks

check-native-uptodate-win:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (build-native-non-win) @ hadoop-lzo ---
[INFO] Executing tasks

build-native-non-win:
    [mkdir] Created dir: /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/lib
    [mkdir] Created dir: /home/pooja/dev/hadoop-lzo/target/classes/native/Linux-amd64-64/lib
    [mkdir] Created dir: /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/src/com/hadoop/compression/lzo
    [javah] [Forcefully writing file RegularFileObject[/home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/src/com/hadoop/compression/lzo/com_hadoop_compression_lzo_LzoCompressor.h]]
    [javah] [Forcefully writing file RegularFileObject[/home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/src/com/hadoop/compression/lzo/com_hadoop_compression_lzo_LzoCompressor_CompressionStrategy.h]]
    [javah] [Forcefully writing file RegularFileObject[/home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/src/com/hadoop/compression/lzo/com_hadoop_compression_lzo_LzoDecompressor.h]]
    [javah] [Forcefully writing file RegularFileObject[/home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/src/com/hadoop/compression/lzo/com_hadoop_compression_lzo_LzoDecompressor_CompressionStrategy.h]]
     [exec] checking for a BSD-compatible install... /usr/bin/install -c
     [exec] checking whether build environment is sane... yes
     [exec] checking for a thread-safe mkdir -p... /bin/mkdir -p
     [exec] checking for gawk... no
     [exec] checking for mawk... mawk
     [exec] checking whether make sets $(MAKE)... yes
     [exec] checking whether to enable maintainer-specific portions of Makefiles... no
     [exec] checking for style of include used by make... GNU
     [exec] checking for gcc... gcc
     [exec] checking whether the C compiler works... yes
     [exec] checking for C compiler default output file name... a.out
     [exec] checking for suffix of executables... 
     [exec] checking whether we are cross compiling... no
     [exec] checking for suffix of object files... o
     [exec] checking whether we are using the GNU C compiler... yes
     [exec] checking whether gcc accepts -g... yes
     [exec] checking for gcc option to accept ISO C89... none needed
     [exec] checking dependency style of gcc... gcc3
     [exec] checking how to run the C preprocessor... gcc -E
     [exec] checking for grep that handles long lines and -e... /bin/grep
     [exec] checking for egrep... /bin/grep -E
     [exec] checking for ANSI C header files... yes
     [exec] checking for sys/types.h... yes
     [exec] checking for sys/stat.h... yes
     [exec] checking for stdlib.h... yes
     [exec] checking for string.h... yes
     [exec] checking for memory.h... yes
     [exec] checking for strings.h... yes
     [exec] checking for inttypes.h... yes
     [exec] checking for stdint.h... yes
     [exec] checking for unistd.h... yes
     [exec] checking minix/config.h usability... no
     [exec] checking minix/config.h presence... no
     [exec] checking for minix/config.h... no
     [exec] checking whether it is safe to define __EXTENSIONS__... yes
     [exec] checking for gcc... (cached) gcc
     [exec] checking whether we are using the GNU C compiler... (cached) yes
     [exec] checking whether gcc accepts -g... (cached) yes
     [exec] checking for gcc option to accept ISO C89... (cached) none needed
     [exec] checking dependency style of gcc... (cached) gcc3
     [exec] checking build system type... x86_64-unknown-linux-gnu
     [exec] checking host system type... x86_64-unknown-linux-gnu
     [exec] checking for a sed that does not truncate output... /bin/sed
     [exec] checking for fgrep... /bin/grep -F
     [exec] checking for ld used by gcc... /usr/bin/ld
     [exec] checking if the linker (/usr/bin/ld) is GNU ld... yes
     [exec] checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
     [exec] checking the name lister (/usr/bin/nm -B) interface... BSD nm
     [exec] checking whether ln -s works... yes
     [exec] checking the maximum length of command line arguments... 1572864
     [exec] checking whether the shell understands some XSI constructs... yes
     [exec] checking whether the shell understands "+="... yes
     [exec] checking for /usr/bin/ld option to reload object files... -r
     [exec] checking for objdump... objdump
     [exec] checking how to recognize dependent libraries... pass_all
     [exec] checking for ar... ar
     [exec] checking for strip... strip
     [exec] checking for ranlib... ranlib
     [exec] checking command to parse /usr/bin/nm -B output from gcc object... ok
     [exec] checking for dlfcn.h... yes
     [exec] checking for objdir... .libs
     [exec] checking if gcc supports -fno-rtti -fno-exceptions... no
     [exec] checking for gcc option to produce PIC... -fPIC -DPIC
     [exec] checking if gcc PIC flag -fPIC -DPIC works... yes
     [exec] checking if gcc static flag -static works... yes
     [exec] checking if gcc supports -c -o file.o... yes
     [exec] checking if gcc supports -c -o file.o... (cached) yes
     [exec] checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
     [exec] checking whether -lc should be explicitly linked in... no
     [exec] checking dynamic linker characteristics... GNU/Linux ld.so
     [exec] checking how to hardcode library paths into programs... immediate
     [exec] checking whether stripping libraries is possible... yes
     [exec] checking if libtool supports shared libraries... yes
     [exec] checking whether to build shared libraries... yes
     [exec] checking whether to build static libraries... yes
     [exec] checking for dlopen in -ldl... yes
     [exec] checking for unistd.h... (cached) yes
     [exec] checking stdio.h usability... yes
     [exec] checking stdio.h presence... yes
     [exec] checking for stdio.h... yes
     [exec] checking stddef.h usability... yes
     [exec] checking stddef.h presence... yes
     [exec] checking for stddef.h... yes
     [exec] checking lzo/lzo2a.h usability... yes
     [exec] checking lzo/lzo2a.h presence... yes
     [exec] checking for lzo/lzo2a.h... yes
     [exec] checking Checking for the 'actual' dynamic-library for '-llzo2'... "liblzo2.so.2"
     [exec] checking for special C compiler options needed for large files... no
     [exec] checking for _FILE_OFFSET_BITS value needed for large files... no
     [exec] checking for stdbool.h that conforms to C99... yes
     [exec] checking for _Bool... yes
     [exec] checking for an ANSI C-conforming const... yes
     [exec] checking for off_t... yes
     [exec] checking for size_t... yes
     [exec] checking whether strerror_r is declared... yes
     [exec] checking for strerror_r... yes
     [exec] checking whether strerror_r returns char *... yes
     [exec] checking for mkdir... yes
     [exec] checking for uname... yes
     [exec] checking for memset... yes
     [exec] checking for JNI_GetCreatedJavaVMs in -ljvm... yes
     [exec] checking jni.h usability... yes
     [exec] checking jni.h presence... yes
     [exec] checking for jni.h... yes
     [exec] configure: creating ./config.status
     [exec] config.status: creating Makefile
     [exec] config.status: creating impl/config.h
     [exec] config.status: executing depfiles commands
     [exec] config.status: executing libtool commands
     [exec] depbase=`echo impl/lzo/LzoCompressor.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
     [exec] /bin/bash ./libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I/home/pooja/dev/hadoop-lzo/src/main/native -I./impl  -I/usr/lib/jvm/java-8-oracle/include -I/usr/lib/jvm/java-8-oracle/include/linux -I/home/pooja/dev/hadoop-lzo/src/main/native/impl -Isrc/com/hadoop/compression/lzo  -g -Wall -fPIC -O2 -m64 -g -O2 -MT impl/lzo/LzoCompressor.lo -MD -MP -MF $depbase.Tpo -c -o impl/lzo/LzoCompressor.lo /home/pooja/dev/hadoop-lzo/src/main/native/impl/lzo/LzoCompressor.c &&\
     [exec] mv -f $depbase.Tpo $depbase.Plo
     [exec] libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I/home/pooja/dev/hadoop-lzo/src/main/native -I./impl -I/usr/lib/jvm/java-8-oracle/include -I/usr/lib/jvm/java-8-oracle/include/linux -I/home/pooja/dev/hadoop-lzo/src/main/native/impl -Isrc/com/hadoop/compression/lzo -g -Wall -fPIC -O2 -m64 -g -O2 -MT impl/lzo/LzoCompressor.lo -MD -MP -MF impl/lzo/.deps/LzoCompressor.Tpo -c /home/pooja/dev/hadoop-lzo/src/main/native/impl/lzo/LzoCompressor.c  -fPIC -DPIC -o impl/lzo/.libs/LzoCompressor.o
     [exec] libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I/home/pooja/dev/hadoop-lzo/src/main/native -I./impl -I/usr/lib/jvm/java-8-oracle/include -I/usr/lib/jvm/java-8-oracle/include/linux -I/home/pooja/dev/hadoop-lzo/src/main/native/impl -Isrc/com/hadoop/compression/lzo -g -Wall -fPIC -O2 -m64 -g -O2 -MT impl/lzo/LzoCompressor.lo -MD -MP -MF impl/lzo/.deps/LzoCompressor.Tpo -c /home/pooja/dev/hadoop-lzo/src/main/native/impl/lzo/LzoCompressor.c -o impl/lzo/LzoCompressor.o >/dev/null 2>&1
     [exec] depbase=`echo impl/lzo/LzoDecompressor.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
     [exec] /bin/bash ./libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I/home/pooja/dev/hadoop-lzo/src/main/native -I./impl  -I/usr/lib/jvm/java-8-oracle/include -I/usr/lib/jvm/java-8-oracle/include/linux -I/home/pooja/dev/hadoop-lzo/src/main/native/impl -Isrc/com/hadoop/compression/lzo  -g -Wall -fPIC -O2 -m64 -g -O2 -MT impl/lzo/LzoDecompressor.lo -MD -MP -MF $depbase.Tpo -c -o impl/lzo/LzoDecompressor.lo /home/pooja/dev/hadoop-lzo/src/main/native/impl/lzo/LzoDecompressor.c &&\
     [exec] mv -f $depbase.Tpo $depbase.Plo
     [exec] libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I/home/pooja/dev/hadoop-lzo/src/main/native -I./impl -I/usr/lib/jvm/java-8-oracle/include -I/usr/lib/jvm/java-8-oracle/include/linux -I/home/pooja/dev/hadoop-lzo/src/main/native/impl -Isrc/com/hadoop/compression/lzo -g -Wall -fPIC -O2 -m64 -g -O2 -MT impl/lzo/LzoDecompressor.lo -MD -MP -MF impl/lzo/.deps/LzoDecompressor.Tpo -c /home/pooja/dev/hadoop-lzo/src/main/native/impl/lzo/LzoDecompressor.c  -fPIC -DPIC -o impl/lzo/.libs/LzoDecompressor.o
     [exec] libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I/home/pooja/dev/hadoop-lzo/src/main/native -I./impl -I/usr/lib/jvm/java-8-oracle/include -I/usr/lib/jvm/java-8-oracle/include/linux -I/home/pooja/dev/hadoop-lzo/src/main/native/impl -Isrc/com/hadoop/compression/lzo -g -Wall -fPIC -O2 -m64 -g -O2 -MT impl/lzo/LzoDecompressor.lo -MD -MP -MF impl/lzo/.deps/LzoDecompressor.Tpo -c /home/pooja/dev/hadoop-lzo/src/main/native/impl/lzo/LzoDecompressor.c -o impl/lzo/LzoDecompressor.o >/dev/null 2>&1
     [exec] /bin/bash ./libtool --tag=CC   --mode=link gcc -g -Wall -fPIC -O2 -m64 -g -O2 -L/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server -Wl,--no-as-needed -o libgplcompression.la -rpath /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/../install/lib impl/lzo/LzoCompressor.lo impl/lzo/LzoDecompressor.lo  -ljvm -ldl 
     [exec] libtool: link: gcc -shared  impl/lzo/.libs/LzoCompressor.o impl/lzo/.libs/LzoDecompressor.o   -L/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server -ljvm -ldl  -m64 -Wl,--no-as-needed   -Wl,-soname -Wl,libgplcompression.so.0 -o .libs/libgplcompression.so.0.0.0
     [exec] libtool: link: (cd ".libs" && rm -f "libgplcompression.so.0" && ln -s "libgplcompression.so.0.0.0" "libgplcompression.so.0")
     [exec] libtool: link: (cd ".libs" && rm -f "libgplcompression.so" && ln -s "libgplcompression.so.0.0.0" "libgplcompression.so")
     [exec] libtool: link: ar cru .libs/libgplcompression.a  impl/lzo/LzoCompressor.o impl/lzo/LzoDecompressor.o
     [exec] libtool: link: ranlib .libs/libgplcompression.a
     [exec] libtool: link: ( cd ".libs" && rm -f "libgplcompression.la" && ln -s "../libgplcompression.la" "libgplcompression.la" )
     [exec] libtool: install: cp /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/.libs/libgplcompression.so.0.0.0 /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/lib/libgplcompression.so.0.0.0
     [exec] libtool: install: warning: remember to run `libtool --finish /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/../install/lib'
     [exec] libtool: install: (cd /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/lib && { ln -s -f libgplcompression.so.0.0.0 libgplcompression.so.0 || { rm -f libgplcompression.so.0 && ln -s libgplcompression.so.0.0.0 libgplcompression.so.0; }; })
     [exec] libtool: install: (cd /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/lib && { ln -s -f libgplcompression.so.0.0.0 libgplcompression.so || { rm -f libgplcompression.so && ln -s libgplcompression.so.0.0.0 libgplcompression.so; }; })
     [exec] libtool: install: cp /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/.libs/libgplcompression.lai /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/lib/libgplcompression.la
     [exec] libtool: install: cp /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/.libs/libgplcompression.a /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/lib/libgplcompression.a
     [exec] libtool: install: chmod 644 /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/lib/libgplcompression.a
     [exec] libtool: install: ranlib /home/pooja/dev/hadoop-lzo/target/native/Linux-amd64-64/lib/libgplcompression.a
     [copy] Copying 5 files to /home/pooja/dev/hadoop-lzo/target/classes/native/Linux-amd64-64/lib
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (build-native-win) @ hadoop-lzo ---
[INFO] Executing tasks

build-native-win:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-resources-plugin:2.3:testResources (default-testResources) @ hadoop-lzo ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 12 resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.5.1:testCompile (default-testCompile) @ hadoop-lzo ---
[INFO] Compiling 6 source files to /home/pooja/dev/hadoop-lzo/target/test-classes
[WARNING] bootstrap class path not set in conjunction with -source 1.6
/home/pooja/dev/hadoop-lzo/src/test/java/com/hadoop/compression/lzo/TestDistLzoIndexerJobName.java:[14,14] [deprecation] Job(Configuration) in Job has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/test/java/com/hadoop/compression/lzo/TestDistLzoIndexerJobName.java:[30,14] [deprecation] Job(Configuration) in Job has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/test/java/com/hadoop/compression/lzo/TestDistLzoIndexerJobName.java:[44,14] [deprecation] Job(Configuration) in Job has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/test/java/com/hadoop/compression/lzo/TestDistLzoIndexerJobName.java:[61,14] [deprecation] Job(Configuration) in Job has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/test/java/com/hadoop/compression/lzo/TestDistLzoIndexerJobName.java:[82,14] [deprecation] Job(Configuration) in Job has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/test/java/com/hadoop/compression/lzo/TestDistLzoIndexerJobName.java:[96,16] [deprecation] Job(Configuration) in Job has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/test/java/com/hadoop/mapreduce/TestLzoTextInputFormat.java:[165,14] [deprecation] Job(Configuration) in Job has been deprecated
[WARNING] /home/pooja/dev/hadoop-lzo/src/test/java/com/hadoop/mapreduce/TestLzoTextInputFormat.java:[314,14] [deprecation] Job(Configuration) in Job has been deprecated
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (prep-test) @ hadoop-lzo ---
[INFO] Executing tasks

prep-test:
    [mkdir] Created dir: /home/pooja/dev/hadoop-lzo/target/test-classes/logs
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-surefire-plugin:2.14.1:test (default-test) @ hadoop-lzo ---
[INFO] Surefire report directory: /home/pooja/dev/hadoop-lzo/target/surefire-reports

-------------------------------------------------------
 T E S T S
-------------------------------------------------------

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running com.hadoop.mapreduce.TestLzoTextInputFormat
2017-08-28 15:14:19,876 INFO  lzo.GPLNativeCodeLoader (GPLNativeCodeLoader.java:<clinit>(52)) - Loaded native gpl library from the embedded binaries
2017-08-28 15:14:19,895 INFO  lzo.LzoCodec (LzoCodec.java:<clinit>(76)) - Successfully loaded & initialized native-lzo library [hadoop-lzo rev f12b7f24913ffbde938b8d140e8a7b22183221a0]
2017-08-28 15:14:20,226 WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-08-28 15:14:20,551 INFO  Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1129)) - hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2017-08-28 15:14:20,553 INFO  compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.lzo]
2017-08-28 15:14:21,210 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:21,266 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 3
2017-08-28 15:14:21,273 INFO  compress.CodecPool (CodecPool.java:getDecompressor(179)) - Got brand-new decompressor [.lzo]
2017-08-28 15:14:21,503 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:21,541 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 3
2017-08-28 15:14:22,116 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:22,129 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 2
2017-08-28 15:14:22,274 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:22,289 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 2
2017-08-28 15:14:22,820 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:22,844 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 3
2017-08-28 15:14:23,023 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:23,040 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 3
2017-08-28 15:14:23,573 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:23,588 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 2
2017-08-28 15:14:23,709 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:23,726 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 2
2017-08-28 15:14:24,187 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:24,217 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 2
2017-08-28 15:14:24,358 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:24,375 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 2
2017-08-28 15:14:24,906 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:24,917 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 1
2017-08-28 15:14:25,032 INFO  output.FileOutputCommitter (FileOutputCommitter.java:commitTask(439)) - Saved output of task 'attempt_123_0001_r_000001_2' to file:/home/pooja/dev/hadoop-lzo/target/test-classes/data/outputDir/_temporary/0/task_123_0001_r_000001
2017-08-28 15:14:25,045 INFO  input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 1
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.396 sec
Running com.hadoop.compression.lzo.TestLzopOutputStream
2017-08-28 15:14:25,706 WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-08-28 15:14:25,995 INFO  lzo.GPLNativeCodeLoader (GPLNativeCodeLoader.java:<clinit>(52)) - Loaded native gpl library from the embedded binaries
2017-08-28 15:14:26,002 INFO  lzo.LzoCodec (LzoCodec.java:<clinit>(76)) - Successfully loaded & initialized native-lzo library [hadoop-lzo rev f12b7f24913ffbde938b8d140e8a7b22183221a0]
2017-08-28 15:14:26,018 INFO  lzo.TestLzopOutputStream (TestLzopOutputStream.java:runTest(134)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/100000.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/100000.txt.lzo
2017-08-28 15:14:26,217 INFO  Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1129)) - hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2017-08-28 15:14:26,263 INFO  lzo.TestLzopOutputStream (TestLzopOutputStream.java:runTest(134)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/1000.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/1000.txt.lzo
2017-08-28 15:14:26,299 INFO  lzo.TestLzopOutputStream (TestLzopOutputStream.java:runTest(134)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/100.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/100.txt.lzo
2017-08-28 15:14:26,328 INFO  lzo.TestLzopOutputStream (TestLzopOutputStream.java:runTest(134)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/issue20-lzop.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/issue20-lzop.txt.lzo
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.06 sec
Running com.hadoop.compression.lzo.TestLzoCodec
2017-08-28 15:14:26,732 INFO  lzo.GPLNativeCodeLoader (GPLNativeCodeLoader.java:<clinit>(52)) - Loaded native gpl library from the embedded binaries
2017-08-28 15:14:26,740 INFO  lzo.LzoCodec (LzoCodec.java:<clinit>(76)) - Successfully loaded & initialized native-lzo library [hadoop-lzo rev f12b7f24913ffbde938b8d140e8a7b22183221a0]
2017-08-28 15:14:26,824 INFO  Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1129)) - hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2017-08-28 15:14:26,994 INFO  compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.lzo_deflate]
2017-08-28 15:14:27,058 INFO  compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.lzo_deflate]
2017-08-28 15:14:27,138 INFO  compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.lzo_deflate]
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.572 sec
Running com.hadoop.compression.lzo.TestLzopInputStream
2017-08-28 15:14:27,544 INFO  lzo.GPLNativeCodeLoader (GPLNativeCodeLoader.java:<clinit>(52)) - Loaded native gpl library from the embedded binaries
2017-08-28 15:14:27,573 INFO  lzo.LzoCodec (LzoCodec.java:<clinit>(76)) - Successfully loaded & initialized native-lzo library [hadoop-lzo rev f12b7f24913ffbde938b8d140e8a7b22183221a0]
2017-08-28 15:14:27,575 INFO  lzo.TestLzopInputStream (TestLzopInputStream.java:runTest(119)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/100000.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/100000.txt.lzo
2017-08-28 15:14:27,650 INFO  lzo.TestLzopInputStream (TestLzopInputStream.java:runTest(119)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/1000.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/1000.txt.lzo
2017-08-28 15:14:27,653 INFO  lzo.TestLzopInputStream (TestLzopInputStream.java:runTest(119)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/100.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/100.txt.lzo
2017-08-28 15:14:27,655 INFO  lzo.TestLzopInputStream (TestLzopInputStream.java:runTest(119)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/0.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/0.txt.lzo
2017-08-28 15:14:27,657 INFO  lzo.TestLzopInputStream (TestLzopInputStream.java:runTest(119)) - Comparing files /home/pooja/dev/hadoop-lzo/target/test-classes/data/100000-truncated.txt and /home/pooja/dev/hadoop-lzo/target/test-classes/data/100000-truncated.txt.lzo
2017-08-28 15:14:27,691 WARN  lzo.LzopInputStream (LzopInputStream.java:close(347)) - Incorrect LZO file format: file did not end with four trailing zeroes.
java.io.IOException: Corrupted uncompressed block
at com.hadoop.compression.lzo.LzopInputStream.verifyChecksums(LzopInputStream.java:220)
at com.hadoop.compression.lzo.LzopInputStream.close(LzopInputStream.java:343)
at sun.nio.cs.StreamDecoder.implClose(StreamDecoder.java:378)
at sun.nio.cs.StreamDecoder.close(StreamDecoder.java:193)
at java.io.InputStreamReader.close(InputStreamReader.java:199)
at java.io.BufferedReader.close(BufferedReader.java:525)
at com.hadoop.compression.lzo.TestLzopInputStream.runTest(TestLzopInputStream.java:147)
at com.hadoop.compression.lzo.TestLzopInputStream.testTruncatedFile(TestLzopInputStream.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:208)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:159)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:87)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:95)
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.324 sec
Running com.hadoop.compression.lzo.TestDistLzoIndexerJobName
2017-08-28 15:14:28,435 WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-08-28 15:14:28,628 INFO  Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1129)) - mapred.job.queue.name is deprecated. Instead, use mapreduce.job.queuename
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.707 sec
Running com.hadoop.compression.lzo.TestLzoRandData
2017-08-28 15:14:29,130 INFO  lzo.GPLNativeCodeLoader (GPLNativeCodeLoader.java:<clinit>(52)) - Loaded native gpl library from the embedded binaries
2017-08-28 15:14:29,136 INFO  lzo.LzoCodec (LzoCodec.java:<clinit>(76)) - Successfully loaded & initialized native-lzo library [hadoop-lzo rev f12b7f24913ffbde938b8d140e8a7b22183221a0]
2017-08-28 15:14:29,200 INFO  Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1129)) - hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2017-08-28 15:14:29,277 INFO  compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.lzo]
Start to write to file...
Closed file.
2017-08-28 15:14:29,391 INFO  compress.CodecPool (CodecPool.java:getDecompressor(179)) - Got brand-new decompressor [.lzo]
Start to write to file...
Closed file.
Start to write to file...
Closed file.
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.325 sec

Results :

Tests run: 27, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hadoop-lzo ---
[INFO] Building jar: /home/pooja/dev/hadoop-lzo/target/hadoop-lzo-0.4.21-SNAPSHOT.jar
[INFO] 
[INFO] >>> maven-source-plugin:2.2.1:jar (attach-sources) @ hadoop-lzo >>>
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (check-platform) @ hadoop-lzo ---
[INFO] Executing tasks

check-platform:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (set-props-non-win) @ hadoop-lzo ---
[INFO] Executing tasks

set-props-non-win:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (set-props-win) @ hadoop-lzo ---
[INFO] Executing tasks

set-props-win:
[INFO] Executed tasks
[INFO] 
[INFO] <<< maven-source-plugin:2.2.1:jar (attach-sources) @ hadoop-lzo <<<
[INFO] 
[INFO] --- maven-source-plugin:2.2.1:jar (attach-sources) @ hadoop-lzo ---
[INFO] Building jar: /home/pooja/dev/hadoop-lzo/target/hadoop-lzo-0.4.21-SNAPSHOT-sources.jar
[INFO] 
[INFO] --- maven-javadoc-plugin:2.9:jar (attach-javadocs) @ hadoop-lzo ---
[INFO] 
Loading source files for package org.apache.hadoop.io.compress...
Loading source files for package com.hadoop.mapred...
Loading source files for package com.hadoop.mapreduce...
Loading source files for package com.hadoop.compression.lzo...
Loading source files for package com.hadoop.compression.lzo.util...
Loading source files for package com.quicklz...
Constructing Javadoc information...
Standard Doclet version 1.8.0_131
Building tree for all the packages and classes...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/org/apache/hadoop/io/compress/LzoCodec.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapred/DeprecatedLzoLineRecordReader.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapred/DeprecatedLzoTextInputFormat.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/LzoIndexOutputFormat.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/LzoIndexRecordWriter.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/LzoLineRecordReader.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/LzoSplitInputFormat.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/LzoSplitRecordReader.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/LzoSplitRecordReader.Counters.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/LzoTextInputFormat.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/CChecksum.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/DChecksum.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/DistributedLzoIndexer.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/GPLNativeCodeLoader.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/LzoCodec.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/LzoIndex.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/LzoIndexer.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/LzoInputFormatCommon.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/LzopCodec.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/LzopDecompressor.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/LzopInputStream.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/LzopOutputStream.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/util/CompatibilityUtil.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/quicklz/QuickLZ.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/overview-frame.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/package-frame.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/package-summary.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/package-tree.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/util/package-frame.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/util/package-summary.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/util/package-tree.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapred/package-frame.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapred/package-summary.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapred/package-tree.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/package-frame.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/package-summary.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/package-tree.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/quicklz/package-frame.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/quicklz/package-summary.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/quicklz/package-tree.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/org/apache/hadoop/io/compress/package-frame.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/org/apache/hadoop/io/compress/package-summary.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/org/apache/hadoop/io/compress/package-tree.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/constant-values.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/org/apache/hadoop/io/compress/class-use/LzoCodec.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapred/class-use/DeprecatedLzoLineRecordReader.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapred/class-use/DeprecatedLzoTextInputFormat.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/class-use/LzoIndexRecordWriter.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/class-use/LzoIndexOutputFormat.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/class-use/LzoTextInputFormat.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/class-use/LzoSplitInputFormat.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/class-use/LzoSplitRecordReader.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/class-use/LzoSplitRecordReader.Counters.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/class-use/LzoLineRecordReader.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/LzoIndexer.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/LzopOutputStream.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/LzopInputStream.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/GPLNativeCodeLoader.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/LzoCodec.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/LzoInputFormatCommon.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/CChecksum.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/DChecksum.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/LzoIndex.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/DistributedLzoIndexer.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/LzopCodec.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/class-use/LzopDecompressor.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/util/class-use/CompatibilityUtil.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/quicklz/class-use/QuickLZ.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/package-use.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/compression/lzo/util/package-use.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapred/package-use.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/hadoop/mapreduce/package-use.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/com/quicklz/package-use.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/org/apache/hadoop/io/compress/package-use.html...
Building index for all the packages and classes...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/overview-tree.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/index-all.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/deprecated-list.html...
Building index for all classes...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/allclasses-frame.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/allclasses-noframe.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/index.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/overview-summary.html...
Generating /home/pooja/dev/hadoop-lzo/target/apidocs/help-doc.html...
[INFO] Building jar: /home/pooja/dev/hadoop-lzo/target/hadoop-lzo-0.4.21-SNAPSHOT-javadoc.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 30.462s
[INFO] Finished at: Mon Aug 28 15:14:35 PDT 2017
[INFO] Final Memory: 33M/253M
[INFO] ------------------------------------------------------------------------


Copy the hadoop-lzo files to Hadoop 

We need to copy the hadoop-lzo jar and the native libraries to every machine in the Hadoop cluster. The local copy on this machine is shown below; a sketch for pushing the same files to slave nodes follows.

pooja@pooja:~/dev/hadoop-lzo$ sudo cp  target/hadoop-lzo-0.4.21-SNAPSHOT.jar /home/hduser/lzo/
pooja@pooja:~/dev/hadoop-lzo$ sudo cp -R target/native /home/hduser/lzo/native
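
If you are running a multi-node cluster, the same files also need to reach every slave. A minimal sketch, assuming a slave host named slave1 (hypothetical here) with the same /home/hduser/lzo directory on it:

pooja@pooja:~/dev/hadoop-lzo$ scp target/hadoop-lzo-0.4.21-SNAPSHOT.jar hduser@slave1:/home/hduser/lzo/
pooja@pooja:~/dev/hadoop-lzo$ scp -r target/native hduser@slave1:/home/hduser/lzo/native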


Change configuration file

We need to change the Hadoop configuration files to add the hadoop-lzo-*.jar to the classpath and the native libraries to the library path.

Edit hadoop-env.sh 

Add the hadoop-lzo-*.jar to the classpath and the native libraries to the library path as shown below:

#Set LZO compression path
export HADOOP_CLASSPATH=/home/hduser/lzo/hadoop-lzo-0.4.21-SNAPSHOT.jar:$HADOOP_CLASSPATH

export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native:/home/hduser/lzo/native/Linux-amd64-64/lib" 

                   
                                                                                                         


Edit core-site.xml

Add the LZO compression codecs as shown below:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec, org.apache.hadoop.io.compress.DefaultCodec, org.apache.hadoop.io.compress.BZip2Codec, com.hadoop.compression.lzo.LzoCodec, com.hadoop.compression.lzo.LzopCodec
  </value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
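
Registering these codecs in io.compression.codecs is what lets Hadoop map a file extension to the right decompressor. As a rough, optional illustration (not part of the setup; it assumes the hadoop-lzo jar and this core-site.xml are on the client classpath, and the file path is hypothetical), CompressionCodecFactory resolves a .lzo file to LzopCodec:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class CodecCheck {
  public static void main(String[] args) {
    // Loads core-site.xml from the classpath, including io.compression.codecs
    Configuration conf = new Configuration();
    CompressionCodecFactory factory = new CompressionCodecFactory(conf);
    // The factory matches the .lzo extension against the registered codecs
    CompressionCodec codec = factory.getCodec(new Path("/user/hduser/sample.lzo"));
    System.out.println(codec == null ? "no codec found" : codec.getClass().getName());
  }
}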



Edit yarn-site.xml

Here, set the properties below:

<property>
   <name>mapreduce.map.output.compress</name>
   <value>true</value>
</property>
<property>
   <name>mapreduce.map.output.compress.codec</name>
   <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
     <name>yarn.application.classpath</name>
     <value>$HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/share/hadoop/common/*, $HADOOP_COMMON_HOME/share/hadoop/common/lib/*, $HADOOP_HDFS_HOME/share/hadoop/hdfs/*, $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*, $HADOOP_YARN_HOME/share/hadoop/yarn/*, $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*,/home/hduser/lzo/hadoop-lzo-0.4.21-SNAPSHOT.jar</value>
</property>
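
If you prefer not to enable map output compression cluster-wide, the same two properties can also be set per job in the driver code. A small sketch, assuming a standard org.apache.hadoop.mapreduce.Job driver:

// Compress intermediate map output with LZO for this job only
Configuration conf = new Configuration();
conf.setBoolean("mapreduce.map.output.compress", true);
conf.setClass("mapreduce.map.output.compress.codec",
    com.hadoop.compression.lzo.LzoCodec.class,
    org.apache.hadoop.io.compress.CompressionCodec.class);
Job job = Job.getInstance(conf, "job with lzo map output");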


Finally, restart the Hadoop cluster. You can now write Map Reduce programs that use LZO compression.

Verify Hadoop Classpath

Verify that LZO is picked up by the running Hadoop daemons using the command below; the hadoop-lzo jar should appear on the -classpath and the native directory on java.library.path of each process:

ps -eaf | grep lzo
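
As an additional check (assuming the hadoop command is on your PATH), you can print the client classpath directly and look for the hadoop-lzo jar in the output:

hduser@pooja:~$ hadoop classpath | tr ':' '\n' | grep -i lzo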

Verify LZO setup

Write a Map Reduce program that uses the LZO codec for its output:

 FileOutputFormat.setCompressOutput(job, true);
 FileOutputFormat.setOutputCompressorClass(job,LzoCodec.class);
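
For context, here is a minimal, self-contained driver sketch showing where those two calls fit. It is only an illustration (an identity map/reduce job over text input; the class and job names are made up), and only the last two FileOutputFormat lines are LZO-specific:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import com.hadoop.compression.lzo.LzoCodec;

public class LzoOutputJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "lzo output example");
    job.setJarByClass(LzoOutputJob.class);

    // No mapper/reducer set: the default identity classes simply copy the
    // text input records to the output, which keeps the sketch minimal.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // Compress the final job output with LZO; the part files will carry
    // the .lzo_deflate extension, as seen in the run below.
    FileOutputFormat.setCompressOutput(job, true);
    FileOutputFormat.setOutputCompressorClass(job, LzoCodec.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}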

Now, run the job as shown below
hduser@pooja:~/hadoop-data-files$ yarn jar weatherHadoop.jar com.jbksoft.WeatherJob /usr/WeatherData output cache
17/08/29 07:38:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/29 07:38:55 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.101:8032
17/08/29 07:38:56 INFO input.FileInputFormat: Total input files to process : 1
17/08/29 07:38:56 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library from the embedded binaries
17/08/29 07:38:56 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev f12b7f24913ffbde938b8d140e8a7b22183221a0]
17/08/29 07:38:57 INFO mapreduce.JobSubmitter: number of splits:1
17/08/29 07:38:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1504017422827_0001
17/08/29 07:38:58 INFO impl.YarnClientImpl: Submitted application application_1504017422827_0001
17/08/29 07:38:58 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1504017422827_0001/
17/08/29 07:38:58 INFO mapreduce.Job: Running job: job_1504017422827_0001
17/08/29 07:39:07 INFO mapreduce.Job: Job job_1504017422827_0001 running in uber mode : false
17/08/29 07:39:07 INFO mapreduce.Job:  map 0% reduce 0%
17/08/29 07:39:12 INFO mapreduce.Job:  map 100% reduce 0%
17/08/29 07:39:18 INFO mapreduce.Job:  map 100% reduce 100%
17/08/29 07:39:19 INFO mapreduce.Job: Job job_1504017422827_0001 completed successfully
17/08/29 07:39:19 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=2555
FILE: Number of bytes written=282357
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=52455
HDFS: Number of bytes written=2430
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters 
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=3010
Total time spent by all reduces in occupied slots (ms)=3286
Total time spent by all map tasks (ms)=3010
Total time spent by all reduce tasks (ms)=3286
Total vcore-milliseconds taken by all map tasks=3010
Total vcore-milliseconds taken by all reduce tasks=3286
Total megabyte-milliseconds taken by all map tasks=3082240
Total megabyte-milliseconds taken by all reduce tasks=3364864
Map-Reduce Framework
Map input records=366
Map output records=358
Map output bytes=4646
Map output materialized bytes=2551
Input split bytes=117
Combine input records=0
Combine output records=0
Reduce input groups=358
Reduce shuffle bytes=2551
Reduce input records=358
Reduce output records=358
Spilled Records=716
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=131
CPU time spent (ms)=1810
Physical memory (bytes) snapshot=454684672
Virtual memory (bytes) snapshot=3855101952
Total committed heap usage (bytes)=309854208
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters 
Bytes Read=52338
File Output Format Counters 
Bytes Written=2430


Finally, verify the output as shown below

hduser@pooja:~/hadoop-data-files$ hdfs dfs -text /user/hduser/output/part-r-00000.lzo_deflate
17/08/29 08:23:14 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library from the embedded binaries
17/08/29 08:23:14 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev f12b7f24913ffbde938b8d140e8a7b22183221a0]
17/08/29 08:23:14 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
0.0
20000109 42.5
20000110 52.8
20000111 46.5
20000112 44.7
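
Note: LzoCodec writes .lzo_deflate files, which cannot be indexed for splitting. If you need splittable output, the usual approach is to use com.hadoop.compression.lzo.LzopCodec instead (which produces .lzo files) and then build an index with the indexer class shipped in the hadoop-lzo jar, roughly like this (the output path is illustrative):

hduser@pooja:~$ hadoop jar /home/hduser/lzo/hadoop-lzo-0.4.21-SNAPSHOT.jar com.hadoop.compression.lzo.LzoIndexer /user/hduser/output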


I hope you were able to follow this tutorial. If you still face any issues, please mail me; I will be happy to help you.

Happy Coding !!!



Thursday, August 24, 2017

Setup Multi node Apache Hadoop 2 Cluster

Apache Hadoop 


Hadoop is an open source framework for writing and running distributed applications. It consists of a scale-out, fault-tolerant distributed file system (HDFS) and a data processing system (Map Reduce).

Today, I will walk through the steps to set up a Hadoop cluster, which involves 2 or more commodity machines. I will be configuring the setup using 2 machines.

Prerequisites:



Network accessibility: Machines should be connected through a network, via Ethernet hubs, switches, or routers. Therefore, cluster machines should have IP addresses in the same subnet, such as 192.168.1.x.


Multi Node Hadoop Cluster Setup


1. Set up Hadoop on each machine


Please follow the steps provided in the tutorial and complete the single node setup on each machine. Then stop the processes as shown in Step 8 of that tutorial.

2. Change each node's hosts file to include all machines in the cluster.


In my case, I have just 2 machines connected through the network, with IP addresses 192.168.1.1 and 192.168.1.2. Therefore, I have added the 2 lines below to the file:

hduser@pooja:~$ sudo vi /etc/hosts

192.168.1.1 master
192.168.1.2 slave1

3. Set up password less SSH

We will be creating passwordless SSH between the master and all slave machines in the network.


3.1 Master machine ssh set up with itself


We already set up passwordless SSH to localhost/itself when configuring Hadoop on each machine. Here, we will just verify that the setup is correct.

hduser@pooja:~$ ssh master

The authenticity of host 'master (192.168.1.101)' can't be established.
ECDSA key fingerprint is ad:3c:12:c3:b1:d2:60:a4:8f:76:00:1d:15:b7:f5:41.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.1.101' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-27-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

385 packages can be updated.
268 updates are security updates.

New release '16.04.3 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Thu Aug 24 13:51:11 2017 from localhost
$


3.2 Master machine ssh set up with slave nodes


3.2.1 Copy the master's SSH public key to all slave nodes.

          hduser@pooja:~$ ssh-copy-id -i /home/hduser/.ssh/id_rsa.pub hduser@slave1

            The authenticity of host 'slaves (192.168.1.2)' can't be established.
            The ECDSA key fingerprint is: b3: 7d: 41: 89: 03: 15: 04: 1c: 84: e3: d1: 69: 1f: c8: 5d.
            Are you sure you want to continue connecting (yes/no)? yes
            /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
            /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
            hduser@slave1's password:
            Number of key(s) added: 1
            Now try logging into the machine, with:   "ssh 'hduser@slave1'"
            and check to make sure that only the key(s) you wanted were added.
           
            Note: At the password prompt above, enter the password of the hduser account on the slave1 machine. 

3.2.2 Verify the authorized_keys file on the slave1 machine
         
          Make sure a key entry from the master node is present, as shown below.

          hduser@prod01:~$ cat .ssh/authorized_keys 

            ssh-rsa AAAAB3NzaC1yc....LJ/67N+v7g8S0/U44Mhjf7dviODw5tY9cs5XXsb1FMVQL... hduser@prod01
            ssh-rsa fffA3zwdi0eWSkJvDWzD9du...kSRTRLZbzVY9ahLZNLFz+p1QU3HXuY3tLr hduser@pooja

3.2.3 Confirm passwordless ssh from master machine
     
           hduser@pooja:~$ ssh slave1
             
             Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-27-generic x86_64)
             * Documentation:  https://help.ubuntu.com/
             334 packages can be updated.
             224 updates are security updates.

             New release '16.04.3 LTS' available.
             Run 'do-release-upgrade' to upgrade to it.

            Last login: Thu Aug 24 13:50:50 2017 from localhost
            $ 


4. Hadoop Configuration changes


4.1 Changes to masters files

This file specifies the machines that run the name node and secondary name node (the name node will always start on the master node, but the secondary name node can run on any slave node if the cluster is started using start-dfs.sh from that particular slave node). Basically, the secondary namenode periodically merges the fsimage and the edit log to keep the edit log from growing too large.

In our case we will specify the master machine only.

hduser@pooja:~$ vi $HADOOP_HOME/etc/hadoop/masters
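
In my setup, the masters file contains a single line:

master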


4.2 Changes to slave files

This file specifies the list of machines that run the datanodes and node managers.
In our case we will specify master and slave1; if you have more slaves, you can list them here and remove the master node. The resulting file contents are shown after the command below.
hduser@pooja:~$ vi $HADOOP_HOME/etc/hadoop/slaves
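
In my setup, the slaves file lists both machines:

master
slave1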



4.3 Changes in core-site.xml for all machine in cluster.

Now, the namenode process will be running on master and not on localhost.
Therefore, we need to change the value of the fs.default.name property to hdfs://master:9000 as shown below.

hduser@pooja:~$ vi $HADOOP_HOME/etc/hadoop/core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>

Note: Make sure you make this change to core-site.xml on the slave nodes as well.

4.4 Changes in hdfs-site.xml of all slave nodes (this step is optional)

Remove the property "dfs.name.dir", as the namenode will no longer run on the slave machines.


5. Starting hadoop cluster


From the master machine run the below commands

5.1 Start HDFS  

hduser@pooja:~$ start-dfs.sh

5.2 Start Yarn

hduser@pooja:~$ start-yarn.sh

5.3 Verify the running process on master

5.3.1 Processes running on the master machine.

hduser@pooja:~$ jps
6821 SecondaryNameNode
7126 NodeManager
6615 DataNode
7628 Jps
6444 NameNode
6990 ResourceManager
  
5.3.2 Processes running on the slave node
hduser@prod01:~$ jps
9749 NodeManager
9613 DataNode
9902 Jps

5.3.3 Run the PI Mapreduce job from the hadoop-examples jar.
hduser@pooja:~$ yarn jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 4 1000 


6. Stop the Cluster


In the master node, stop the processes.

6.1 Stop yarn

hduser@pooja:~$ stop-yarn.sh 
stopping yarn daemons
stopping resourcemanager
master: stopping nodemanager
slave1: stopping nodemanager
master: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
slave1: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop

6.2 Stop HDFS

hduser@pooja:~$ stop-dfs.sh
17/08/24 18:42:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [master]
master: stopping namenode
master: stopping datanode
slave1: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
17/08/24 18:42:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

I hope you were able to follow my instructions to set up a Hadoop cluster. If you are still facing issues, I would love to address them, so please do write to me about your problems!

Happy Coding !!!

Installing Hadoop2.8.1 on Ubuntu (Single Node Cluster)

 Overview

Hadoop is an open source framework for running and writing distributed computing programs. The framework comprises HDFS (Hadoop Distributed File System) and Map Reduce (a programming framework written in Java).

In Hadoop 1, only Map Reduce programs (written in Java or Python) can be run on the data stored in HDFS. Therefore, it is only fit for batch processing computations.

In Hadoop 2, YARN (Yet Another Resource Negotiator) was introduced, which provides APIs for requesting and allocating resources in the cluster. These APIs let applications such as Spark, Tez, and Storm process large-scale, fault-tolerant data stored in HDFS. Thus, the Hadoop ecosystem now fits batch, near-real-time, and real-time processing computations.


Today, I will be discussing the steps to set up Hadoop 2 in pseudo-distributed mode on an Ubuntu machine.

Prerequisites


  • Hardware requirement
          The machine on which Hadoop is installed should have 64-128 MB of RAM and at least 1-4 GB of hard disk for better performance. This requirement is optional.
  • Check Java version
         The machine's Java version should be 7 or higher. If you have a version lower than 7, or no Java installed, install it by following the steps provided in the article.
   
        You can check the Java version with the command below.
       
        $ java -version
             java version "1.8.0_131"
             Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
            Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

Steps for Hadoop Set up on Single Machine.

Step 1 : Create a dedicated hadoop user.

1.1 Create a group hadoop

pooja@prod01:~$ sudo groupadd hadoop

1.2 Create a user hduser in the group hadoop.

pooja@prod01:~$ sudo useradd -G hadoop -m  hduser

Note: -m will create the home directory

1.3 Make sure the home directory for hduser was created.

pooja@prod01:~$ sudo ls -ltr /home/

total 8
drwxr-xr-x 28 pooja  pooja  4096 Aug 24 09:23 pooja
drwxr-xr-x  2 hduser hduser 4096 Aug 24 13:34 hduser

1.4 Define password for hduser.
pooja@prod01:~$ sudo passwd hduser

Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully

1.5 Log in as hduser
pooja@prod01:~$ su - hduser
Password: 

hduser@prod01:~$ pwd
/home/hduser

Step 2: Set up Passwordless SSH

2.1 Generate the ssh-keygen without password

hduser@prod01:~$ ssh-keygen -t rsa -P ""

Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa): 
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
6c:c0:f4:c2:d1:d8:40:41:2b:e8:7b:8d:d4:c7:2c:62 hduser@prod01
The key's randomart image is:
+--[ RSA 2048]----+
|     oB*         |
|   . +.+o        |
|  . . * .        |
| .   o *         |
|  . E o S        |
|   + ++         |
|  . o .          |
|   .             |
|                 |
+-----------------+

2.2 Add the public ssh-key generated to authorized keys

hduser@prod01:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

2.3 Provide read and write permission to authorized keys.

hduser@prod01:~$  chmod 0600 ~/.ssh/authorized_keys

2.4 Verify if password less ssh is working.

Note: When asked whether to continue connecting, type yes as shown below.
hduser@prod01:~$ ssh localhost

The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is ad:3c:12:c3:b1:d2:60:a4:8f:76:00:1e:15:b3:f4:41.
Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-27-generic x86_64)
...Snippet
$

Step 3: Download Hadoop  2.8.1

3.1 Download the Hadoop 2.8.1 tar file from the Apache download mirrors or using the command below.

hduser@prod01:~$ wget http://apache.claz.org/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz

--2017-08-24 14:01:31--  http://apache.claz.org/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz
Resolving apache.claz.org (apache.claz.org)... 74.63.227.45
Connecting to apache.claz.org (apache.claz.org)|74.63.227.45|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 424555111 (405M) [application/x-gzip]
Saving to: ‘hadoop-2.8.1.tar.gz’
100%[=====================================================================================================>] 424,555,111 1.51MB/s   in 2m 48s
2017-08-24 14:04:19 (2.41 MB/s) - ‘hadoop-2.8.1.tar.gz’ saved [424555111/424555111]

3.2 Untar the downloaded tar file.

hduser@prod01:~$ tar -xvf hadoop-2.8.1.tar.gz

...Snippet
hadoop-2.8.1/share/doc/hadoop/images/external.png
hadoop-2.8.1/share/doc/hadoop/images/h5.jpg
hadoop-2.8.1/share/doc/hadoop/index.html
hadoop-2.8.1/share/doc/hadoop/project-reports.html
hadoop-2.8.1/include/
hadoop-2.8.1/include/hdfs.h
hadoop-2.8.1/include/Pipes.hh
hadoop-2.8.1/include/TemplateFactory.hh
hadoop-2.8.1/include/StringUtils.hh
hadoop-2.8.1/include/SerialUtils.hh
hadoop-2.8.1/LICENSE.txt
hadoop-2.8.1/NOTICE.txt
hadoop-2.8.1/README.txt

3.3 Create the soft link.

hduser@prod01:~$ ln -s hadoop-2.8.1 hadoop

Step 4: Configure Hadoop Pseudo Distributed mode.

In the Hadoop configuration, we only add the minimum required properties; you can add more properties as well.

4.1 Set up the environment variable.

   4.1.1 Edit .bashrc and add Hadoop to the path as shown below:

            hduser@pooja:~$ vi .bashrc

               #Add below lines to .bashrc
                export HADOOP_HOME=/home/hduser/hadoop
                export HADOOP_INSTALL=$HADOOP_HOME
                export HADOOP_MAPRED_HOME=$HADOOP_HOME
                export HADOOP_COMMON_HOME=$HADOOP_HOME
                export HADOOP_HDFS_HOME=$HADOOP_HOME
                export YARN_HOME=$HADOOP_HOME
                export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
               export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

  4.1.2 Source .bashrc in current login session

          hduser@pooja:~$ source ~/.bashrc
          
4.2  Hadoop configuration file changes

   4.2.1 Changes to hadoop-env.sh (set $JAVA_HOME to installation directory)
         
           4.2.1.1 Find JAVA_HOME on machine.
                      
                        hduser@pooja:~$ which java
                         /usr/bin/java
                        
                         hduser@pooja:~$ readlink -f /usr/bin/java
                         /usr/lib/jvm/java-8-oracle/jre/bin/java

                         Note: /usr/lib/jvm/java-8-oracle is the JAVA_HOME directory
          4.2.1.2  Edit hadoop-env.sh and set $JAVA_HOME.
         
                       hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh  
                      
                       Edit the file and change

                       JAVA_HOME=${JAVA_HOME}
                                         to
                       JAVA_HOME=/usr/lib/jvm/java-8-oracle
                            Note: JAVA_HOME is the path fetched in step 4.2.1.1
                    
4.2.2  Changes to core-site.xml 
hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/core-site.xml

Add the configuration property (NameNode property: fs.default.name).

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

4.2.3 Changes to hdfs-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Add the configuration property (NameNode property: dfs.name.dir, DataNode property: dfs.data.dir).

<configuration>
<property>
     <name>dfs.replication</name>
       <value>1</value>
</property>
<property>
       <name>dfs.name.dir</name>
       <value>file:///home/hduser/hadoopdata/hdfs/namenode</value>
</property>
<property>
     <name>dfs.data.dir</name>
     <value>file:///home/hduser/hadoopdata/hdfs/datanode</value>
</property>
</configuration>


4.2.3 Changes to mapred-site.xml

Here, we first copy mapred-site.xml.template to mapred-site.xml and then add the property to it.

hduser@prod01:~$ cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the configuration property (mapreduce.framework.name)
<configuration>
     <property>
         <name>mapreduce.framework.name</name>
          <value>yarn</value>
       </property>


</configuration>

Note: If you do not specify this, the Resource Manager UI (http://localhost:8088) will not show any jobs.

4.2.4 Changes to yarn-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the configuration property

<configuration>
     <property>
         <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
       </property>
</configuration>

Step 5: Verify and format HDFS File system

5.1 Format HDFS file system

       hduser@pooja:~$ hdfs namenode -format

       ...Snippet
           17/08/24 16:08:36 INFO util.GSet: capacity      = 2^15 = 32768 entries
           17/08/24 16:08:36 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1601791069-127.0.1.1-1503616
           17/08/24 16:08:37 INFO common.Storage: Storage directory /home/hduser/hadoopdata/hdfs/namenode has been successfully formatted.
            17/08/24 16:08:37 INFO namenode.FSImageFormatProtobuf: Saving image file       /home/hduser/hadoopdata/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
           17/08/24 16:08:37 INFO namenode.FSImageFormatProtobuf: Image file /home/hduser/hadoopdata/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds.
           17/08/24 16:08:37 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
           17/08/24 16:08:37 INFO util.ExitUtil: Exiting with status 0
           17/08/24 16:08:37 INFO namenode.NameNode: SHUTDOWN_MSG: 
          /************************************************************
            SHUTDOWN_MSG: Shutting down NameNode at pooja/127.0.1.1
          ************************************************************/

5.2 Verify the format (make sure the hadoopdata/hdfs/* folders were created)
        
       hduser@prod01:~$ ls -ltr hadoopdata/hdfs/
       
         total 4
         drwxrwxr-x 3 hduser hduser 4096 Aug 24 16:09 namenode

Note: This is the same path as specified in the hdfs-site.xml property dfs.name.dir

Step 6: Start single node cluster

We will start the hadoop cluster using the hadoop start-up script.

6.1 Start HDFS
     
hduser@prod01:~$ start-dfs.sh 
17/08/24 16:38:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-namenode-prod01.out
localhost: starting datanode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-datanode-prod01.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is be:b3:7d:41:89:03:15:04:1c:84:e3:d9:69:1f:c8:5d.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-secondarynamenode-prod01.out
17/08/24 16:39:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

6.2 Start yarn

hduser@prod01:~$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/hduser/hadoop-2.8.1/logs/yarn-hduser-resourcemanager-prod01.out
localhost: starting nodemanager, logging to /home/hduser/hadoop-2.8.1/logs/yarn-hduser-nodemanager-prod01.out

6.3 Verify if all process started

hduser@prod01:~$ jps
6775 DataNode
7209 ResourceManager
7017 SecondaryNameNode
6651 NameNode
7339 NodeManager
7663 Jps

6.4 Run the PI Mapreduce job from the hadoop-examples jar.

hduser@prod1:~$ yarn jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 4 1000 




Step 7: Hadoop Web Interface

Web UI of the NameNode (http://localhost:50070)


Resource Manager UI  (http://localhost:8088).
It shows all running jobs and cluster resource information. This will help you monitor running jobs and their progress.

Step 8: Stopping Hadoop

8.1 Stop Yarn processes

hduser@prod01:~$ stop-yarn.sh

stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
localhost: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop

8.2 Stop HDFS processes

hduser@prod01:~$ stop-dfs.sh
17/08/24 17:11:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
17/08/24 17:12:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


I hope you are able to follow my instructions on the Hadoop pseudo-mode setup. Please write to me if any of you are still facing problems.

Happy Coding!!!!

Wednesday, January 25, 2017

Configure IntelliJ for Android Development on CentOS

Mobile Application

In today's world, mobile application development has grown tremendously. Operations from online payment to e-shopping to digital assistants to interactive messaging and many more are now just a click away on a mobile device.
Mobile application user interfaces can be developed using an array of technologies such as HTML5, CSS, JavaScript, Java, Android, or iOS.

In this post, I will be discussing how to set up an Android environment in an existing IntelliJ installation.

IntelliJ set up for Android Development

Perform the steps below for the setup.

Step 1. Install Java 8 or Java 7 JDK

$ java -version
java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)

Step 2. Install Android SDK

[user@localhost ~]$ cd /opt
[user@localhost opt]$ sudo wget http://dl.google.com/android/android-sdk_r24.4.1-linux.tgz
[sudo] password for pooja: 
--2017-01-24 22:25:23--  http://dl.google.com/android/android-sdk_r24.4.1-linux.tgz
Resolving dl.google.com (dl.google.com)... 172.217.6.46, 2607:f8b0:4005:805::200e
Connecting to dl.google.com (dl.google.com)|172.217.6.46|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 326412652 (311M) [application/x-tar]
Saving to: ‘android-sdk_r24.4.1-linux.tgz’

100%[============================================================================================================>] 326,412,652  148KB/s   in 29m 58s

2017-01-24 22:55:21 (177 KB/s) - ‘android-sdk_r24.4.1-linux.tgz’ saved [326412652/326412652]

[user@localhost opt]$ sudo tar zxvf android-sdk_r24.4.1-linux.tgz
[user@localhost opt]$ sudo chown -R root:root android-sdk_r24.4.1-linux 
[user@localhost opt]$ sudo ln -s android-sdk_r24.4.1-linux android-sdk-linux 

# If you do not change the ownership, you will get the error "selected directory is not a valid home for android SDK" while setting the Android SDK path in IntelliJ



[user@localhost opt]$ sudo chown -R user:group /opt/android-sdk-linux/

#sudo vim /etc/profile.d/android-sdk-env.sh

export ANDROID_HOME=/opt/android-sdk-linux
export PATH=$ANDROID_HOME/tools:$ANDROID_HOME/platform-tools:$PATH
# source /etc/profile.d/android-sdk-env.sh

Step 3: Open SDK Manager under SDK Android Tool

[user@localhost opt]sudo android-sdk-linux/tools/android


Now, select the All Tools option and press "Install 23 packages". The license screen then opens as shown below.

Finally, select the 'Install' button, which will start the download of the packages.


Step 4: Install IntelliJ (if not exists)

IntelliJ Community Edition is free; download it and untar the file.

Step 5: Open IntelliJ (or close the current project); this will bring up the screen below.
Now, select 'Create New Project' and then select the project type "Android" as shown below.


Now, Select option "Application Module" and select 'Next'.


Now, click the 'New' button. 

Then a file browser window will open up; select /opt/android-sdk-linux and press 'OK'

Lastly, the Android version popup window will be shown as below



This way, we have configured the existing IntelliJ for an Android development project. Now press the 'Finish' button to create the project.

I hope you are also able to configure your existing IntelliJ for Android development. If you run into any problems, please write back; I would love to hear from you.

Tuesday, January 17, 2017

Debugging Apache Hadoop (NameNode,DataNode,SNN,ResourceManager,NodeManager) using IntelliJ

In the previous blogs, I discussed setting up the environment, downloading the Apache Hadoop code, building it, and setting it up in the IDE (IntelliJ).

In this blog, I will focus on debugging Apache Hadoop code for understanding. 

I used remote debugging to connect and debug any of the Hadoop processes (NameNode,DataNode, SecondaryNameNode,ResourceManager,NodeManager).

Prerequisites
1. Apache Hadoop code on local machine.
2. Code is built (look for hadoop/hadoop-dist being created)
3. Set up of the code in IntelliJ.

Let's dive into the steps to understand the debug process.

Step 1: Look for the hadoop-dist directory in the main Hadoop directory.
Once the Hadoop code is built, the hadoop-dist directory is created in the main Hadoop directory as shown below.

 Step 2: Move into the target directory. 
 [pooja@localhost hadoop]$ cd hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT

The directory structure looks as below (it is the same as the Apache download tar)

Step 3: Now, setup Hadoop configuration.
a. Change the hadoop-env.sh file to add the JAVA_HOME path 

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hadoop-env.sh 
Add the below line. 
JAVA_HOME=$JAVA_HOME

b. Add configuration parameters (Note: I am doing the minimum setup needed to run the Hadoop processes)

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/core-site.xml
<configuration>
<property>
   <name>fs.default.name</name>
   <value>hdfs://localhost:9000</value>
 </property>
</configuration>

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hdfs-site.xml 
<configuration>
 <property>
 <name>dfs.replication</name>
  <value>1</value>
</property>
  <property>
   <name>dfs.name.dir</name>
   <value>file:///home/pooja/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
   <name>dfs.data.dir</name>
     <value>file:///home/pooja/hadoopdata/hdfs/datanode</value>
 </property>
</configuration>

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/yarn-site.xml
<configuration>
<property>
    <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
 </property>
</configuration>

Place the environment properties in ~/.bashrc
export HADOOP_HOME=<hadoop source code directory>/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

  Step 4: Run all Hadoop processes

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-dfs.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [localhost.localdomain]
2017-01-17 20:27:44,335 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ jps
25232 SecondaryNameNode
26337 Jps
24839 DataNode
24489 NameNode
25914 NodeManager
25597 ResourceManager

Step 5: Now stop all the processes.
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/stop-yarn.sh
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/stop-dfs.sh

Step 6: Debug a Hadoop process (e.g. NameNode) by making the change below in hadoop-env.sh or hdfs-env.sh.

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hadoop-env.sh  
Add below line.
export HDFS_NAMENODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"

Similarly, we can debug the processes below by setting the corresponding options; an example follows the list:
YARN_RESOURCEMANAGER_OPTS
YARN_NODEMANAGER_OPTS
HDFS_NAMENODE_OPTS
HDFS_DATANODE_OPTS
HDFS_SECONDARYNAMENODE_OPTS 
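
Each daemon needs its own debug port if you want to attach to more than one at a time. For example (the port numbers here are arbitrary choices):

export HDFS_DATANODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5001,server=y,suspend=n"
export YARN_RESOURCEMANAGER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5002,server=y,suspend=n"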

Step 7: Enable remote debugging in IDE (IntelliJ) as shown below.
Note: Identify the main class for the NameNode process by looking in the startup script.

Open NameNode.java class ->Run/Debug Configuration (+)->Remote-> Change 'port' to 5000 (textbox) ->Apply button


Step 8: Now start the namenode process and set a breakpoint in the NameNode.java class as shown below.

Start the process:
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-dfs.sh

Start the debugger(Shift+9):


And now can debug the code as shown below.

I hope everyone is able to set up the code; if you run into any problem, please do write, and I will be happy to help you.
In the next blog, I will write about the steps for creating a patch for an Apache Hadoop contribution. 

Happy Coding and Keep Learning !!!!

Importing Apache Hadoop (HDFS,Yarn) module to IntelliJ

In the previous blog, I wrote about the steps to set up the environment and download the Apache Hadoop code to your machine for understanding and contributing. In this blog, I will walk through setting up the code in an IDE (IntelliJ here).

By now, I presume the Apache Hadoop code is on your machine and has been compiled. If not, follow the blog.

Please follow the steps below for importing the HDFS module into IntelliJ

Step 1: Open IntelliJ (either using the shortcut or idea.sh) and then close the project if one is already open, as shown below


Step 2: On the screen below, choose Import Project as shown.

Step 3: Now, browse to the folder you want to import. Select the Hadoop/hadoop-hdfs-project/hadoop-hdfs directory and press 'OK'.


Step 4: The screen below will be shown. Please select the option "Import project from external model" and click 'Next'.



Step 5: Now, keep pressing Next -> Next and then Finish. The project will be imported into IntelliJ as shown below.


Now the Apache Hadoop HDFS module is imported into IntelliJ. You can import the other modules (YARN, Common) similarly.

I hope all readers are able to import the Apache Hadoop project successfully into IntelliJ. If you face any issues, please reach out, as I will be happy to assist you.

In the next tutorial, I will discuss the steps of debugging Hadoop.