Thursday, August 24, 2017

Installing Hadoop 2.8.1 on Ubuntu (Single Node Cluster)

 Overview

Hadoop is an open-source framework for writing and running distributed computing programs. The framework comprises HDFS (Hadoop Distributed File System) and MapReduce (a programming framework written in Java).

In Hadoop 1, only MapReduce programs (written in Java or Python) can be run on the data stored in HDFS. Therefore, it is only a fit for batch-processing computations.

In Hadoop 2, YARN (Yet Another Resource Negotiator) was introduced, which provides APIs for requesting and allocating resources in the cluster. These APIs let applications such as Spark, Tez, and Storm process large-scale, fault-tolerant data stored in HDFS. Thus, the Hadoop ecosystem now fits batch, near-real-time, and real-time processing computations.


Today, I will be discussing the steps to set up Hadoop 2 in pseudo-distributed mode on an Ubuntu machine.

Prerequisites


  • Hardware requirement
          The machine on which Hadoop is installed should have 64-128 MB of RAM and at least 1-4 GB of hard disk space for better performance. This requirement is optional.
  • Check the Java version
         The Java version on the machine should be 7 or greater. If you have a version lower than 7, or no Java installed at all, install it by following the steps in the article (see the install sketch after the version output below).

        You can check the Java version with the command below.
       
        $ java -version
             java version "1.8.0_131"
             Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
            Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
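
        If Java is missing, below is a minimal install sketch using the Ubuntu repositories (this assumes the openjdk-8-jdk package is available for your release; on older releases such as 14.04 you may need a PPA or the Oracle JDK instead).

        $ sudo apt-get update
        $ sudo apt-get install openjdk-8-jdk   # assumes openjdk-8-jdk exists in your release's repositories
        $ java -version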

Steps for Hadoop Setup on a Single Machine

Step 1: Create a dedicated hadoop user.

1.1 Create a group hadoop.
            pooja@prod01:~$ sudo groupadd hadoop

1.2 Create a user hduser in the group hadoop.

pooja@prod01:~$ sudo useradd -G hadoop -m  hduser

Note: -m will create the home directory.

1.3 Make sure the home directory for hduser was created.

pooja@prod01:~$ sudo ls -ltr /home/

total 8
drwxr-xr-x 28 pooja  pooja  4096 Aug 24 09:23 pooja
drwxr-xr-x  2 hduser hduser 4096 Aug 24 13:34 hduser

1.4 Define a password for hduser.
pooja@prod01:~$ sudo passwd hduser

Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully

1.5 Log in as hduser.
pooja@prod01:~$ su - hduser
Password: 

hduser@prod01:~$ pwd
/home/hduser

Step 2: Set up Passwordless SSH

2.1 Generate an SSH key pair without a passphrase.

hduser@prod01:~$ ssh-keygen -t rsa -P ""

Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa): 
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
6c:c0:f4:c2:d1:d8:40:41:2b:e8:7b:8d:d4:c7:2c:62 hduser@prod01
The key's randomart image is:
+--[ RSA 2048]----+
|     oB*         |
|   . +.+o        |
|  . . * .        |
| .   o *         |
|  . E o S        |
|   + ++         |
|  . o .          |
|   .             |
|                 |
+-----------------+

2.2 Add the generated public SSH key to the authorized keys.

hduser@prod01:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

2.3 Give the authorized_keys file owner-only read and write permissions.

hduser@prod01:~$  chmod 0600 ~/.ssh/authorized_keys

2.4 Verify that passwordless SSH is working.

Note: When asked whether to continue connecting, answer yes as shown below.
hduser@prod01:~$ ssh localhost

The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is ad:3c:12:c3:b1:d2:60:a4:8f:76:00:1e:15:b3:f4:41.
Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-27-generic x86_64)
...Snippet
$
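
If ssh localhost still prompts for a password, the usual cause is the permissions on the .ssh directory itself; a quick fix (a sketch, assuming the default layout) is:

hduser@prod01:~$ chmod 700 ~/.ssh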

Step 3: Download Hadoop  2.8.1

3.1 Download the Hadoop 2.8.1 tar file from an Apache download mirror or using the command below.

hduser@prod01:~$ wget http://apache.claz.org/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz

--2017-08-24 14:01:31--  http://apache.claz.org/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz
Resolving apache.claz.org (apache.claz.org)... 74.63.227.45
Connecting to apache.claz.org (apache.claz.org)|74.63.227.45|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 424555111 (405M) [application/x-gzip]
Saving to: ‘hadoop-2.8.1.tar.gz’
100%[=====================================================================================================>] 424,555,111 1.51MB/s   in 2m 48s
2017-08-24 14:04:19 (2.41 MB/s) - ‘hadoop-2.8.1.tar.gz’ saved [424555111/424555111]

3.2 Untar the downloaded tar file.

hduser@prod01:~$ tar -xvf hadoop-2.8.1.tar.gz

...Snippet
hadoop-2.8.1/share/doc/hadoop/images/external.png
hadoop-2.8.1/share/doc/hadoop/images/h5.jpg
hadoop-2.8.1/share/doc/hadoop/index.html
hadoop-2.8.1/share/doc/hadoop/project-reports.html
hadoop-2.8.1/include/
hadoop-2.8.1/include/hdfs.h
hadoop-2.8.1/include/Pipes.hh
hadoop-2.8.1/include/TemplateFactory.hh
hadoop-2.8.1/include/StringUtils.hh
hadoop-2.8.1/include/SerialUtils.hh
hadoop-2.8.1/LICENSE.txt
hadoop-2.8.1/NOTICE.txt
hadoop-2.8.1/README.txt

3.3 Create the soft link.

hduser@prod01:~$ ln -s hadoop-2.8.1 hadoop

Step 4: Configure Hadoop Pseudo Distributed mode.

In the Hadoop configuration, we add only the minimum required properties; you can add more properties as needed.

4.1 Set up the environment variables.

   4.1.1 Edit .bashrc and add Hadoop to the PATH as shown below:

            hduser@pooja:~$ vi .bashrc

               #Add below lines to .bashrc
                export HADOOP_HOME=/home/hduser/hadoop
                export HADOOP_INSTALL=$HADOOP_HOME
                export HADOOP_MAPRED_HOME=$HADOOP_HOME
                export HADOOP_COMMON_HOME=$HADOOP_HOME
                export HADOOP_HDFS_HOME=$HADOOP_HOME
                export YARN_HOME=$HADOOP_HOME
                export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
               export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

  4.1.2 Source .bashrc in the current login session

          hduser@pooja:~$ source ~/.bashrc
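
          To confirm the variables took effect, a quick check (assuming the paths above) is shown below; hadoop version should report Hadoop 2.8.1.

          hduser@pooja:~$ echo $HADOOP_HOME
          /home/hduser/hadoop
          hduser@pooja:~$ hadoop version
          ...Snippet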
          
4.2  Hadoop configuration file changes

   4.2.1 Changes to hadoop-env.sh (set JAVA_HOME to the Java installation directory)
         
           4.2.1.1 Find JAVA_HOME on the machine.
                      
                        hduser@pooja:~$ which java
                         /usr/bin/java
                        
                         hduser@pooja:~$ readlink -f /usr/bin/java
                         /usr/lib/jvm/java-8-oracle/jre/bin/java

                         Note: /usr/lib/jvm/java-8-oracle is the JAVA_HOME directory
          4.2.1.2  Edit hadoop-env.sh and set $JAVA_HOME.
         
                       hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh  
                      
                       Edit the file and change

                       export JAVA_HOME=${JAVA_HOME}
                                         to
                       export JAVA_HOME=/usr/lib/jvm/java-8-oracle
                            Note: JAVA_HOME is the path fetched in step 4.2.1.1
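
                       If you prefer a non-interactive edit, a small sed sketch (assuming the template line still starts with export JAVA_HOME=) is:

                       hduser@prod01:~$ sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-8-oracle|' $HADOOP_HOME/etc/hadoop/hadoop-env.sh
                       hduser@prod01:~$ grep '^export JAVA_HOME' $HADOOP_HOME/etc/hadoop/hadoop-env.sh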
                    
4.2.2  Changes to core-site.xml 
hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/core-site.xml

Add the configuration property (NameNode property: fs.default.name).

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

4.2.3 Changes to hdfs-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Add the configuration properties (NameNode property: dfs.name.dir; DataNode property: dfs.data.dir).

<configuration>
<property>
     <name>dfs.replication</name>
       <value>1</value>
</property>
<property>
       <name>dfs.name.dir</name>
       <value>file:///home/hduser/hadoopdata/hdfs/namenode</value>
</property>
<property>
     <name>dfs.data.dir</name>
     <value>file:///home/hduser/hadoopdata/hdfs/datanode</value>
</property>
</configuration>
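
The NameNode format in Step 5 will create these directories, but you can also pre-create them so that the dfs.name.dir and dfs.data.dir paths above exist (a small sketch using the same paths):

hduser@prod01:~$ mkdir -p /home/hduser/hadoopdata/hdfs/namenode /home/hduser/hadoopdata/hdfs/datanode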


4.2.4 Changes to mapred-site.xml

Here, we first copy mapred-site.xml.template to mapred-site.xml and then add the property to it.

hduser@prod01:~$ cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the configuration property (mapreduce.framework.name)
<configuration>
     <property>
         <name>mapreduce.framework.name</name>
          <value>yarn</value>
       </property>


</configuration>

Note: If you do not specify this, the Resource Manager UI (http://localhost:8088) will not show any jobs.

4.2.5 Changes to yarn-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the configuration property

<configuration>
     <property>
         <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
       </property>
</configuration>
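
As a quick sanity check that Hadoop picks up the files we just edited, hdfs getconf can print individual keys (this assumes the PATH from step 4.1 is in effect). The first command should print hdfs://localhost:9000 (possibly after a deprecation warning, since fs.default.name is the older name of fs.defaultFS) and the second should print 1.

hduser@prod01:~$ hdfs getconf -confKey fs.default.name
hduser@prod01:~$ hdfs getconf -confKey dfs.replication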

Step 5: Format and verify the HDFS file system

5.1 Format HDFS file system

       hduser@pooja:~$ hdfs namenode -format

       ...Snippet
           17/08/24 16:08:36 INFO util.GSet: capacity      = 2^15 = 32768 entries
           17/08/24 16:08:36 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1601791069-127.0.1.1-1503616
           17/08/24 16:08:37 INFO common.Storage: Storage directory /home/hduser/hadoopdata/hdfs/namenode has been successfully formatted.
            17/08/24 16:08:37 INFO namenode.FSImageFormatProtobuf: Saving image file       /home/hduser/hadoopdata/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
           17/08/24 16:08:37 INFO namenode.FSImageFormatProtobuf: Image file /home/hduser/hadoopdata/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds.
           17/08/24 16:08:37 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
           17/08/24 16:08:37 INFO util.ExitUtil: Exiting with status 0
           17/08/24 16:08:37 INFO namenode.NameNode: SHUTDOWN_MSG: 
          /************************************************************
            SHUTDOWN_MSG: Shutting down NameNode at pooja/127.0.1.1
          ************************************************************/

5.2 Verify the format (make sure the hadoopdata/hdfs/* folders were created)
        
       hduser@prod01:~$ ls -ltr hadoopdata/hdfs/
       
         total 4
         drwxrwxr-x 3 hduser hduser 4096 Aug 24 16:09 namenode

Note: This is the same path as specified in the hdfs-site.xml property dfs.name.dir.

Step 6: Start the single-node cluster

We will start the Hadoop cluster using the Hadoop start-up scripts.

6.1 Start HDFS
     
hduser@prod01:~$ start-dfs.sh 
17/08/24 16:38:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-namenode-prod01.out
localhost: starting datanode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-datanode-prod01.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is be:b3:7d:41:89:03:15:04:1c:84:e3:d9:69:1f:c8:5d.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-secondarynamenode-prod01.out
17/08/24 16:39:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

6.2 Start YARN

hduser@prod01:~$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/hduser/hadoop-2.8.1/logs/yarn-hduser-resourcemanager-prod01.out
localhost: starting nodemanager, logging to /home/hduser/hadoop-2.8.1/logs/yarn-hduser-nodemanager-prod01.out

6.3 Verify that all processes started

hduser@prod01:~$ jps
6775 DataNode
7209 ResourceManager
7017 SecondaryNameNode
6651 NameNode
7339 NodeManager
7663 Jps

6.4 Run the Pi MapReduce job from the hadoop-mapreduce-examples jar.

hduser@prod1:~$ yarn jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 4 1000 
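
Besides the Pi job, a quick HDFS smoke test (a sketch; the /user/hduser path is only an example) is to create a directory, upload a file, and list it back:

hduser@prod01:~$ hdfs dfs -mkdir -p /user/hduser
hduser@prod01:~$ hdfs dfs -put hadoop/etc/hadoop/core-site.xml /user/hduser/
hduser@prod01:~$ hdfs dfs -ls /user/hduser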




Step 7: Hadoop Web Interface

Web UI of the NameNode (http://localhost:50070)


Resource Manager UI (http://localhost:8088)
It shows all running jobs and cluster resource information, which helps in monitoring jobs and their progress.

Step 8: Stopping Hadoop

8.1 Stop YARN processes

hduser@prod01:~$ stop-yarn.sh

stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
localhost: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop

8.2 Stop HDFS processes

hduser@prod01:~$ stop-dfs.sh
17/08/24 17:11:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
17/08/24 17:12:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


Hope you are able to follow my instructions on the Hadoop pseudo-distributed mode setup. Please write to me if any of you are still facing problems.

Happy Coding!!!!

Wednesday, January 25, 2017

Configure IntelliJ for Android Development on CentOS

Mobile Application

In today's world, mobile application development has grown tremendously. Operations from online payments to e-shopping, digital assistance, and interactive messaging are now just a click away on a mobile device.
A mobile application user interface can be developed using an array of technologies such as HTML5, CSS, JavaScript, Java, Android, or iOS.

In this post, I will be discussing how to set up the Android environment on an existing IntelliJ installation.

IntelliJ set up for Android Development

Perform the steps below for the setup.

Step 1. Install Java 8 or Java 7 JDK

$ java -version
java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)

Step 2. Install Android SDK

[user@localhost ~]$ cd /opt
[user@localhost opt]$ sudo wget http://dl.google.com/android/android-sdk_r24.4.1-linux.tgz
[sudo] password for pooja: 
--2017-01-24 22:25:23--  http://dl.google.com/android/android-sdk_r24.4.1-linux.tgz
Resolving dl.google.com (dl.google.com)... 172.217.6.46, 2607:f8b0:4005:805::200e
Connecting to dl.google.com (dl.google.com)|172.217.6.46|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 326412652 (311M) [application/x-tar]
Saving to: ‘android-sdk_r24.4.1-linux.tgz’

100%[============================================================================================================>] 326,412,652  148KB/s   in 29m 58s

2017-01-24 22:55:21 (177 KB/s) - ‘android-sdk_r24.4.1-linux.tgz’ saved [326412652/326412652]

[user@localhost opt]$ sudo tar zxvf android-sdk_r24.4.1-linux.tgz
[user@localhost opt]$ sudo chown -R root:root android-sdk_r24.4.1-linux 
[user@localhost opt]$ sudo ln -s android-sdk_r24.4.1-linux android-sdk-linux 

#If you do not change the ownership, you will get the error "selected directory is not a valid home for android SDK" while setting the Android SDK path in IntelliJ



[user@localhost opt]$ sudo chown -R user:group /opt/android-sdk-linux/

#sudo vim /etc/profile.d/android-sdk-env.sh

export ANDROID_HOME=/opt/android-sdk-linux
export PATH=$ANDROID_HOME/tools:$ANDROID_HOME/platform-tools:$PATH
# source /etc/profile.d/android-sdk-env.sh
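
To confirm the environment script works, a quick check (assuming the paths above) is:

[user@localhost opt]$ source /etc/profile.d/android-sdk-env.sh
[user@localhost opt]$ echo $ANDROID_HOME
/opt/android-sdk-linux
[user@localhost opt]$ which android
/opt/android-sdk-linux/tools/android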

Step 3: Open the SDK Manager from the Android SDK tools

[user@localhost opt]$ sudo android-sdk-linux/tools/android


Now, select the All Tools option and press "Install 23 packages". The license screen then opens as shown below.

Finally, select the 'Install' button, which will start the download of the packages.


Step 4: Install IntelliJ (if not already installed)

IntelliJ Community Edition is free; download it and untar the file.

Step 5: Opening IntelliJ (or closing the current project) will bring up the screen below.
Now, select 'Create New Project' and then select "Android" as the project type, as shown below.


Now, select the "Application Module" option and click 'Next'.


Now, select the 'New' button.

Then a file browser window will open up; select /opt/android-sdk-linux and press 'OK'.

Lastly, the Android version popup window will be shown as below.



This way, we have configured the existing IntelliJ for an Android development project. Now press the 'Finish' button to create the project.

I hope you are also able to configure your existing IntelliJ for Android development. If you hit any problems, please write back; I would love to hear from you.

Tuesday, January 17, 2017

Debugging Apache Hadoop (NameNode,DataNode,SNN,ResourceManager,NodeManager) using IntelliJ

In the previous blogs, I discussed setting up the environment, downloading the Apache Hadoop code, building it, and setting it up in an IDE (IntelliJ).

In this blog, I will focus on debugging the Apache Hadoop code for understanding.

I used remote debugging to connect to and debug any of the Hadoop processes (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager).

Prerequisites
1. Apache Hadoop code on the local machine.
2. Code is built (look for hadoop/hadoop-dist being created).
3. The code is set up in IntelliJ.

Let's dive into the steps to understand the debug process.

Step 1: Look for the hadoop-dist directory in the Hadoop main directory.
Once the Hadoop code is built, the directory hadoop-dist is created in the Hadoop main directory as shown below.

 Step 2: Move into the target directory.
 [pooja@localhost hadoop]$ cd hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT

The directory structure looks as below (it is the same as the Apache download tar).

Step 3: Now, set up the Hadoop configuration.
a. Change the hadoop-env.sh file to add the JAVA_HOME path

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hadoop-env.sh 
Add the below line. 
JAVA_HOME=$JAVA_HOME

b. Add configuration parameters (Note: I am doing the minimum setup for running the Hadoop processes)

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/core-site.xml
<configuration>
<property>
   <name>fs.default.name</name>
   <value>hdfs://localhost:9000</value>
 </property>
</configuration>

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hdfs-site.xml 
<configuration>
 <property>
 <name>dfs.replication</name>
  <value>1</value>
</property>
  <property>
   <name>dfs.name.dir</name>
   <value>file:///home/pooja/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
   <name>dfs.data.dir</name>
     <value>file:///home/pooja/hadoopdata/hdfs/datanode</value>
 </property>
</configuration>

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/yarn-site.xml
<configuration>
<property>
    <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
 </property>
</configuration>

Place the environment properties in ~/.bashrc
export HADOOP_HOME=<hadoop source code directory>/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

Step 4: Run all Hadoop processes

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-dfs.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [localhost.localdomain]
2017-01-17 20:27:44,335 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ jps
25232 SecondaryNameNode
26337 Jps
24839 DataNode
24489 NameNode
25914 NodeManager
25597 ResourceManager

Step 5: Now stop all the processes.
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/stop-yarn.sh
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/stop-dfs.sh

Step 6: Debug a Hadoop process (e.g., NameNode) by making the change below in hadoop-env.sh or hdfs-env.sh.

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hadoop-env.sh  
Add below line.
export HDFS_NAMENODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"

Similarly, we can debug the processes below by setting the corresponding variables (a sketch with example ports follows this list):
YARN_RESOURCEMANAGER_OPTS
YARN_NODEMANAGER_OPTS
HDFS_NAMENODE_OPTS
HDFS_DATANODE_OPTS
HDFS_SECONDARYNAMENODE_OPTS 
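
For reference, here is a sketch of hadoop-env.sh entries that give every daemon its own debug port (ports 5000-5004 are arbitrary assumptions; pick any free ports). Use suspend=y instead of suspend=n if you want a daemon to wait for the debugger before starting up.

# Sketch: one debug port per daemon (the port numbers are assumptions)
export HDFS_NAMENODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"
export HDFS_DATANODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5001,server=y,suspend=n"
export HDFS_SECONDARYNAMENODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5002,server=y,suspend=n"
export YARN_RESOURCEMANAGER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5003,server=y,suspend=n"
export YARN_NODEMANAGER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5004,server=y,suspend=n"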

Step 7: Enable remote debugging in the IDE (IntelliJ) as shown below.
Note: Identify the main class for the NameNode process by looking in the startup script.

Open the NameNode.java class -> Run/Debug Configurations (+) -> Remote -> change 'Port' to 5000 (textbox) -> Apply button
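
If you prefer attaching from a terminal instead of IntelliJ, the JDK's jdb tool can connect to the same debug socket (a sketch, assuming the port 5000 configured above):

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=5000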


Step 8: Now start the NameNode process and put a break point in the NameNode.java class as shown below.

Start the process:
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-dfs.sh

Start the debugger (Shift+F9):


And now you can debug the code as shown below.

I hope everyone is able to set up the code; if you face any problem, please do write and I will be happy to help you.
In the next blog, I will be writing about the steps for creating a patch for an Apache Hadoop contribution.

Happy Coding and Keep Learning !!!!

Importing Apache Hadoop (HDFS, YARN) modules into IntelliJ

In the previous blog, I wrote about the steps to set up the environment and download the Apache Hadoop code onto our machine for understanding and contributing. In this blog, I will walk through the code setup in an IDE (IntelliJ here).

By now, I presume the Apache Hadoop code is on your machine and the code is compiled. If not, follow the previous blog.

Please follow the steps below for importing the HDFS module into IntelliJ.

Step 1: Open IntelliJ (either using the shortcut or idea.sh) and then close the project if one is already open, as shown below.


Step 2: In the screen below, choose Import Project as shown.

Step 3: Now, browse to the folder you want to import. Select the hadoop/hadoop-hdfs-project/hadoop-hdfs directory and press 'OK'.


Step 4: The screen below will be shown. Please select the option "Import project from external model" and click 'Next'.



Step 5: Now, keep pressing Next -> Next and then Finish. The project will be imported into IntelliJ as shown below.


Now, the Apache Hadoop HDFS module is imported into IntelliJ. You can import the other modules (YARN, Common) similarly.

I hope all viewers are able to import the Apache Hadoop project successfully into IntelliJ. If you are facing any issues, please discuss them; I will be happy to assist you.

In the next tutorial, I will discuss the steps for debugging Hadoop.
   

Thursday, January 12, 2017

Contribute to Apache Hadoop

For a long time, I have had the desire to contribute to open-source Apache Hadoop. Today, I was free, so I worked on setting up the Hadoop code on my local machine for development. I am documenting the steps as they may be useful for newcomers.

Below are the steps to set up the Hadoop code for development.

Step 1: Install Java JDK 8 or above

$ java -version
java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)

Step 2: Install Apache Maven version 3 or later

mvn -version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T08:41:47-08:00)
Maven home: /usr/local/apache-maven
Java version: 1.8.0_72, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_72/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-514.2.2.el7.x86_64", arch: "amd64", family: "unix"

Step 3: Install Google Protocol Buffers (version 2.5.0)
Make sure the Protocol Buffers version is 2.5.0.
I had installed the higher Google Protocol Buffers version 3.1.0, but when compiling the code I got the error below.
[ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.0.0-alpha2-SNAPSHOT:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: protoc version is 'libprotoc 3.1.0', expected version is '2.5.0' -> [Help 1]
[ERROR] 

Step 4:  Download the hadoop source code

We can either clone the repository directly or create a fork of the repository and then clone the fork.

a) Directly clone the repository.
 git clone git://git.apache.org/hadoop.git
b) Create a fork as shown below:


And then download the code as shown below:

 $ git clone https://github.com/poojagpta/hadoop

Sync the forked project with the upstream project's changes.
Add the remote link:
 $git remote add upstream https://github.com/apache/hadoop

 $ git remote -v
origin https://github.com/poojagpta/hadoop (fetch)
origin https://github.com/poojagpta/hadoop (push)
upstream https://github.com/apache/hadoop (fetch)
upstream https://github.com/apache/hadoop (push)

Now, if you want to fetch the latest code:
$ git fetch upstream
$ git checkout trunk
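
To actually bring those upstream changes into your fork, a typical follow-up sketch (assuming your fork is the origin remote, as configured above) is:

$ git merge upstream/trunk
$ git push origin trunk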

Step 5: Compile the downloaded code
$ cd hadoop
$ mvn clean install -Pdist -Dtar -Ptest-patch -DskipTests -Denforcer.skip=true

Snippet Output:
[INFO] --- maven-install-plugin:2.5.1:install (default-install) @ hadoop-client-modules ---
[INFO] Installing /home/pooja/dev/hadoop/hadoop-client-modules/pom.xml to /home/pooja/.m2/repository/org/apache/hadoop/hadoop-client-modules/3.0.0-alpha2-SNAPSHOT/hadoop-client-modules-3.0.0-alpha2-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  1.780 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  2.560 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.236 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  4.824 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.314 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.834 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  9.167 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  5.918 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 20.083 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  7.650 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [02:03 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 12.138 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 13.088 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.138 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [ 54.973 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [01:51 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [  1.323 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 41.081 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 12.680 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.070 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [  0.073 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 35.955 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [01:38 min]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [  0.089 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 22.489 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 32.492 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [  8.606 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 20.153 s]
[INFO] Apache Hadoop YARN Timeline Service ................ SUCCESS [02:26 min]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 55.442 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [  5.479 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 17.122 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [  8.654 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [  8.234 s]
[INFO] Apache Hadoop YARN Timeline Service HBase tests .... SUCCESS [02:51 min]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [  0.044 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [  8.076 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [  5.937 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [  0.077 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [ 11.366 s]
[INFO] Apache Hadoop YARN UI .............................. SUCCESS [  1.832 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [  8.590 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [  0.225 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 43.115 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 27.865 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [  9.009 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 24.415 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 14.692 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 29.361 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [  4.828 s]
[INFO] Apache Hadoop MapReduce NativeTask ................. SUCCESS [ 10.299 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 12.238 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [  4.336 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 17.591 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 13.083 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  6.314 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [  6.982 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 12.048 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 12.327 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  5.819 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  5.794 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  0.036 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  8.138 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [ 53.458 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 20.452 s]
[INFO] Apache Hadoop Aliyun OSS support ................... SUCCESS [ 11.273 s]
[INFO] Apache Hadoop Client Aggregator .................... SUCCESS [  3.698 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  1.618 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 12.085 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [ 27.289 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [  5.002 s]
[INFO] Apache Hadoop Kafka Library support ................ SUCCESS [  7.041 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.052 s]
[INFO] Apache Hadoop Client API ........................... SUCCESS [02:09 min]
[INFO] Apache Hadoop Client Runtime ....................... SUCCESS [01:21 min]
[INFO] Apache Hadoop Client Packaging Invariants .......... SUCCESS [  3.431 s]
[INFO] Apache Hadoop Client Test Minicluster .............. SUCCESS [03:13 min]
[INFO] Apache Hadoop Client Packaging Invariants for Test . SUCCESS [  0.329 s]
[INFO] Apache Hadoop Client Packaging Integration Tests ... SUCCESS [  1.542 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 42.013 s]
[INFO] Apache Hadoop Client Modules ....................... SUCCESS [  0.105 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 32:42 min
[INFO] Finished at: 2017-01-12T11:30:58-08:00
[INFO] Final Memory: 131M/819M
[INFO] ------------------------------------------------------------------------


I hope you are also able to set up the Hadoop project and are ready to contribute like me. Please let me know if you are still facing issues; I would love to help you.
In the next tutorial, I will set up the code in IntelliJ and go over the steps to debug it.

Thanks and happy coding !!!

Problems encountered while compiling the code:

1. Some of the JUnit tests are failing.

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.minikdc.TestMiniKdc
Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 3.451 sec <<< FAILURE! - in org.apache.hadoop.minikdc.TestMiniKdc
testKeytabGen(org.apache.hadoop.minikdc.TestMiniKdc)  Time elapsed: 1.314 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testMiniKdcStart(org.apache.hadoop.minikdc.TestMiniKdc)  Time elapsed: 1.002 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testKerberosLogin(org.apache.hadoop.minikdc.TestMiniKdc)  Time elapsed: 1.008 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

Running org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain
Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 3.289 sec <<< FAILURE! - in org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain
testKeytabGen(org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain)  Time elapsed: 1.18 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testMiniKdcStart(org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain)  Time elapsed: 1.014 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testKerberosLogin(org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain)  Time elapsed: 1.009 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)


Results :

Tests in error: 
  TestMiniKdc>KerberosSecurityTestcase.startMiniKdc:49 » Runtime Unable to parse...
  TestMiniKdc>KerberosSecurityTestcase.startMiniKdc:49 » Runtime Unable to parse...
  TestMiniKdc>KerberosSecurityTestcase.startMiniKdc:49 » Runtime Unable to parse...
  TestChangeOrgNameAndDomain>KerberosSecurityTestcase.startMiniKdc:49 » Runtime ...
  TestChangeOrgNameAndDomain>KerberosSecurityTestcase.startMiniKdc:49 » Runtime ...
  TestChangeOrgNameAndDomain>KerberosSecurityTestcase.startMiniKdc:49 » Runtime ...

Tests run: 6, Failures: 0, Errors: 6, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  2.060 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  1.584 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.018 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  4.161 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.253 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.803 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  9.047 s]
[INFO] Apache Hadoop MiniKDC .............................. FAILURE [ 11.461 s]
[INFO] Apache Hadoop Auth ................................. SKIPPED
[INFO] Apache Hadoop Auth Examples ........................ SKIPPED
......
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on project hadoop-minikdc: There are test failures.
[ERROR] 
[ERROR] Please refer to /home/pooja/dev/hadoop/hadoop-common-project/hadoop-minikdc/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-minikdc

Solution:
I fixed the problem by skipping the JUnit tests (-DskipTests) for the entire build and running the tests only for the module whose code you want to start fixing (see the sketch below).
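
For example, a sketch of running only the hadoop-minikdc tests, using Maven's -pl flag and the module path from the error report above:

$ cd hadoop
$ mvn test -pl hadoop-common-project/hadoop-minikdc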

2. Got the error below for the module 'Hadoop HDFS'.

[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  1.612 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  1.507 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.033 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  4.937 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.312 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.670 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  8.432 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  5.359 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 14.786 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  5.682 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [02:04 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 12.310 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 14.775 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.074 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [01:19 min]
[INFO] Apache Hadoop HDFS ................................. FAILURE [  5.697 s]
[INFO] Apache Hadoop HDFS Native Client ................... SKIPPED
.........................................
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 04:48 min
[INFO] Finished at: 2017-01-11T15:43:27-08:00
[INFO] Final Memory: 90M/533M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project hadoop-hdfs: Could not resolve dependencies for project org.apache.hadoop:hadoop-hdfs:jar:3.0.0-alpha2-SNAPSHOT: The following artifacts could not be resolved: org.apache.hadoop:hadoop-kms:jar:classes:3.0.0-alpha2-SNAPSHOT, org.apache.hadoop:hadoop-kms:jar:tests:3.0.0-alpha2-SNAPSHOT: Could not find artifact org.apache.hadoop:hadoop-kms:jar:classes:3.0.0-alpha2-SNAPSHOT in apache.snapshots.https (https://repository.apache.org/content/repositories/snapshots) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-hdfs

Solution:

The error suggests a problem with the installation of OpenSSL.
I wanted to install OpenSSL to be able to use the HTTPS protocol in HDFS, curl, or other applications.
The openssl binary is installed, but the OpenSSL development libraries (which are required for the HTTPS protocol) are not installed.

You can install openssl using the command below.
$sudo yum install openssl openssl-devel

$ which openssl
/usr/bin/openssl

$ openssl version
OpenSSL 1.0.1e-fips 11 Feb 2013

We can solve the problem using two approaches:

   a. Create a link to the openssl path as shown below

      ln -s /usr/bin/openssl /usr/local/openssl

or
    b. Download OpenSSL and compile it as shown below

$wget https://www.openssl.org/source/openssl-1.0.1e.tar.gz
$tar -xvf openssl-1.0.1e.tar.gz
$cd openssl-1.0.1e
$./config --prefix=/usr/local/openssl --openssldir=/usr/local/openssl
$ make
$ sudo make install

3. Error with the Maven enforcer plugin
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce (depcheck) on project hadoop-hdfs-client: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

Solution:
I skipped the enforcer (-Denforcer.skip=true), as this constraint allows only Unix and Mac machines.