Thursday, August 24, 2017

Set Up a Multi-Node Apache Hadoop 2 Cluster

Apache Hadoop 


Hadoop is an open source framework for writing and running distributed applications. It consists of a scale-out, fault-tolerant distributed file system (HDFS) and a data processing system (MapReduce).

Today, I will walk through the steps to set up a Hadoop cluster involving 2 or more commodity machines. I will be configuring the setup using 2 machines.

Prerequisites:



Network accessibility: The machines should be connected to each other through Ethernet hubs, switches, or routers. The cluster machines should therefore be on the same subnet, with IP addresses like 192.168.1.x.
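For example, before proceeding you can confirm the machines can reach each other over the network (a quick sanity check, assuming the IP addresses 192.168.1.1 and 192.168.1.2 used in this post):

# From the master (192.168.1.1), expect replies from the slave
hduser@pooja:~$ ping -c 2 192.168.1.2

# From the slave (192.168.1.2), expect replies from the master
hduser@prod01:~$ ping -c 2 192.168.1.1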


Multi Node Hadoop Cluster Setup


1. Set up Hadoop on each machine


Please follow the steps provided in the tutorial to complete a single-node setup on each machine. Then stop the processes as shown in Step 8 of the tutorial.
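For reference, stopping the single-node daemons on each machine looks like the following (the same scripts used in Step 8 of the single-node tutorial); afterwards jps should show no Hadoop daemons:

hduser@pooja:~$ stop-yarn.sh
hduser@pooja:~$ stop-dfs.sh
hduser@pooja:~$ jps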

2. Change each node's hosts file to include all machines in the cluster.


In my case, I have just 2 machines connected over the network with IP addresses 192.168.1.1 and 192.168.1.2. Therefore, I added the 2 lines below to the file:

hduser@pooja:~$ sudo vi /etc/hosts

192.168.1.1 master
192.168.1.2 slave1

3. Set up passwordless SSH

We will be creating passwordless SSH from the master to all slave machines in the network.


3.1 Master machine ssh set up with itself


We already set up passwordless SSH to localhost/itself when configuring Hadoop on each machine. Here, we will just verify that the setup is correct.

hduser@pooja:~$ ssh master

The authenticity of host 'master (192.168.1.101)' can't be established.
ECDSA key fingerprint is ad:3c:12:c3:b1:d2:60:a4:8f:76:00:1d:15:b7:f5:41.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.1.101' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-27-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

385 packages can be updated.
268 updates are security updates.

New release '16.04.3 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Thu Aug 24 13:51:11 2017 from localhost
$


3.2 Master machine ssh set up with slave nodes


3.2.1 Copy the master's SSH public key to all slave nodes.

          hduser@pooja:~$ ssh-copy-id -i /home/hduser/.ssh/id_rsa.pub hduser@slave1

            The authenticity of host 'slaves (192.168.1.2)' can't be established.
            The ECDSA key fingerprint is: b3:7d:41:89:03:15:04:1c:84:e3:d1:69:1f:c8:5d.
            Are you sure you want to continue connecting (yes/no)? yes
            /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
            /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
            hduser@slave1's password:
            Number of key(s) added: 1
            Now try logging into the machine, with:   "ssh 'hduser@slave1'"
            and check to make sure that only the key(s) you wanted were added.
           
            Note: At the hduser@slave1's password prompt above, enter the password of the hduser account on the slave1 machine. 

3.2.2 Verify the authorized_keys file on the slave1 machine
         
          Make sure it contains a key entry from the master node, as shown below.

          hduser@prod01:~$ cat .ssh/authorized_keys 

            ssh-rsa AAAAB3NzaC1yc....LJ/67N+v7g8S0/U44Mhjf7dviODw5tY9cs5XXsb1FMVQL... hduser@prod01
            ssh-rsa fffA3zwdi0eWSkJvDWzD9du...kSRTRLZbzVY9ahLZNLFz+p1QU3HXuY3tLr hduser@pooja

3.2.3 Confirm passwordless ssh from master machine
     
           hduser@pooja:~$ ssh slave1
             
             Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-27-generic x86_64)
             * Documentation:  https://help.ubuntu.com/
             334 packages can be updated.
             224 updates are security updates.

             New release '16.04.3 LTS' available.
             Run 'do-release-upgrade' to upgrade to it.

            Last login: Thu Aug 24 13:50:50 2017 from localhost
            $ 


4. Hadoop Configuration changes


4.1 Changes to the masters file

This file lists the machine(s) on which the SecondaryNameNode should run (the NameNode itself runs on the host named in fs.default.name, i.e. the master). The SecondaryNameNode periodically merges the fsimage with the edit log to keep the edit log from growing too large.

In our case, we will specify the master machine only.

hduser@pooja:~$ vi $HADOOP_HOME/etc/hadoop/masters
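In my setup, the masters file contains just the single line below:

master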


4.2 Changes to the slaves file

This file specifies the list of machines that run the DataNode and NodeManager daemons.
In our case, we will specify both master and slave1. If you have more slaves, you can list them here and remove the master node.
hduser@pooja:~$ vi $HADOOP_HOME/etc/hadoop/slaves
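In my setup, the slaves file lists both machines, so DataNode and NodeManager daemons start on each of them:

master
slave1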



4.3 Changes to core-site.xml on all machines in the cluster.

Now, the NameNode process will be running on the master and not on localhost.
Therefore, we need to change the value of the fs.default.name property to hdfs://master:9000 as shown below.

hduser@pooja:~$ vi $HADOOP_HOME/etc/hadoop/core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>

Note: Make sure you make this change to core-site.xml on the slave nodes as well.
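One convenient way to keep the file identical everywhere is to copy it from the master to each slave over the passwordless SSH set up earlier (a minimal sketch, assuming $HADOOP_HOME points to the same path on both machines):

hduser@pooja:~$ scp $HADOOP_HOME/etc/hadoop/core-site.xml hduser@slave1:$HADOOP_HOME/etc/hadoop/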

4.4 Changes to hdfs-site.xml on all slave nodes (this is an optional step)

Remove the dfs.name.dir property, since the NameNode will not be running on the slave machines.
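After removing it, a slave's hdfs-site.xml would keep only the replication and data-directory properties, roughly as sketched below (based on the single-node configuration later in this post; adjust the values to your setup):

<configuration>
<property>
     <name>dfs.replication</name>
     <value>1</value>
</property>
<property>
     <name>dfs.data.dir</name>
     <value>file:///home/hduser/hadoopdata/hdfs/datanode</value>
</property>
</configuration>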


5. Starting the Hadoop cluster


From the master machine, run the commands below.

5.1 Start HDFS  

hduser@pooja:~$ start-dfs.sh

5.2 Start Yarn

hduser@pooja:~$ start-yarn.sh

5.3 Verify the running processes

5.3.1 Processes running on the master machine.

hduser@pooja:~$ jps
6821 SecondaryNameNode
7126 NodeManager
6615 DataNode
7628 Jps
6444 NameNode
6990 ResourceManager
  
5.3.2 Processes running on the slave node
hduser@prod01:~$ jps
9749 NodeManager
9613 DataNode
9902 Jps

5.3.3 Run the pi MapReduce job from the hadoop-examples jar.
hduser@pooja:~$ yarn jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 4 1000 


6. Stop the Cluster


On the master node, stop the processes.

6.1 Stop yarn

hduser@pooja:~$ stop-yarn.sh 
stopping yarn daemons
stopping resourcemanager
master: stopping nodemanager
slave1: stopping nodemanager
master: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
slave1: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop

6.2 Stop HDFS

hduser@pooja:~$ stop-dfs.sh
17/08/24 18:42:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [master]
master: stopping namenode
master: stopping datanode
slave1: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
17/08/24 18:42:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

I hope you are able to follow my instructions to set up the Hadoop cluster. If you are still facing issues, I would love to address them, so please do write in with your problems!

Happy Coding !!!

Installing Hadoop 2.8.1 on Ubuntu (Single Node Cluster)

 Overview

Hadoop is an open source framework for writing and running distributed computing programs. The framework comprises HDFS (Hadoop Distributed File System) and MapReduce (a programming framework written in Java).

In Hadoop 1, only MapReduce programs (written in Java or Python) can be run on the data stored in HDFS. Therefore, it only fits batch-processing computations.

In Hadoop 2, YARN (Yet Another Resource Negotiator) was introduced, which provides APIs for requesting and allocating resources in the cluster. These APIs allow applications such as Spark, Tez, and Storm to process large-scale, fault-tolerant data stored in HDFS. Thus, the Hadoop ecosystem now fits batch, near-real-time, and real-time processing computations.


Today, I will be discussing the steps to set up Hadoop 2 in pseudo-distributed mode on an Ubuntu machine.

Prerequisites


  • Hardware requirement
          The machine on which Hadoop is installed should have sufficient RAM and at least 1-4 GB of free disk space for better performance. This is an optional requirement.
  • Check Java version
         The machine's Java version should be 7 or higher. If you have a version lower than 7, or no Java installed, install it using the steps provided in the article.

        You can check the Java version with the command below.
       
        $ java -version
             java version "1.8.0_131"
             Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
            Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

Steps for Hadoop Setup on a Single Machine.

Step 1: Create a dedicated hadoop user.

1.1 Create a group hadoop

pooja@prod01:~$ sudo groupadd hadoop

1.2 Create a user hduser in the group hadoop.

pooja@prod01:~$ sudo useradd -G hadoop -m  hduser

Note: -m creates the home directory.

1.3 Make sure the home directory for hduser was created.

pooja@prod01:~$ sudo ls -ltr /home/

total 8
drwxr-xr-x 28 pooja  pooja  4096 Aug 24 09:23 pooja
drwxr-xr-x  2 hduser hduser 4096 Aug 24 13:34 hduser

1.4 Define password for hduser.
pooja@prod01:~$ sudo passwd hduser

Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully

1.5 Log in as hduser 
pooja@prod01:~$ su - hduser
Password: 

hduser@prod01:~$ pwd
/home/hduser

Step 2: Set up Passwordless SSH

2.1 Generate an SSH key pair without a passphrase

hduser@prod01:~$ ssh-keygen -t rsa -P ""

Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa): 
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
6c:c0:f4:c2:d1:d8:40:41:2b:e8:7b:8d:d4:c7:2c:62 hduser@prod01
The key's randomart image is:
+--[ RSA 2048]----+
|     oB*         |
|   . +.+o        |
|  . . * .        |
| .   o *         |
|  . E o S        |
|   + ++         |
|  . o .          |
|   .             |
|                 |
+-----------------+

2.2 Add the generated public key to authorized_keys

hduser@prod01:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

2.3 Give only the owner read and write permission on authorized_keys.

hduser@prod01:~$  chmod 0600 ~/.ssh/authorized_keys

2.4 Verify that passwordless SSH is working.

Note: When asked whether to continue connecting, type yes as shown below.
hduser@prod01:~$ ssh localhost

The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is ad:3c:12:c3:b1:d2:60:a4:8f:76:00:1e:15:b3:f4:41.
Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-27-generic x86_64)
...Snippet
$

Step 3: Download Hadoop  2.8.1

3.1 Download the Hadoop 2.8.1 tar file from an Apache download mirror, or use the command below.

hduser@prod01:~$ wget http://apache.claz.org/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz

--2017-08-24 14:01:31--  http://apache.claz.org/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz
Resolving apache.claz.org (apache.claz.org)... 74.63.227.45
Connecting to apache.claz.org (apache.claz.org)|74.63.227.45|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 424555111 (405M) [application/x-gzip]
Saving to: ‘hadoop-2.8.1.tar.gz’
100%[=====================================================================================================>] 424,555,111 1.51MB/s   in 2m 48s
2017-08-24 14:04:19 (2.41 MB/s) - ‘hadoop-2.8.1.tar.gz’ saved [424555111/424555111]

3.2 Untar the downloaded tar file.

hduser@prod01:~$ tar -xvf hadoop-2.8.1.tar.gz

...Snippet
hadoop-2.8.1/share/doc/hadoop/images/external.png
hadoop-2.8.1/share/doc/hadoop/images/h5.jpg
hadoop-2.8.1/share/doc/hadoop/index.html
hadoop-2.8.1/share/doc/hadoop/project-reports.html
hadoop-2.8.1/include/
hadoop-2.8.1/include/hdfs.h
hadoop-2.8.1/include/Pipes.hh
hadoop-2.8.1/include/TemplateFactory.hh
hadoop-2.8.1/include/StringUtils.hh
hadoop-2.8.1/include/SerialUtils.hh
hadoop-2.8.1/LICENSE.txt
hadoop-2.8.1/NOTICE.txt
hadoop-2.8.1/README.txt

3.3 Create the soft link.

hduser@prod01:~$ ln -s hadoop-2.8.1 hadoop

Step 4: Configure Hadoop in pseudo-distributed mode.

In the Hadoop configuration, we only add the minimum required properties; you can add more properties as needed.

4.1 Set up the environment variable.

   4.1.1 Edit bashrc and add hadoop in path as shown below:

            hduser@pooja:~$ vi .bashrc

               #Add below lines to .bashrc
                export HADOOP_HOME=/home/hduser/hadoop
                export HADOOP_INSTALL=$HADOOP_HOME
                export HADOOP_MAPRED_HOME=$HADOOP_HOME
                export HADOOP_COMMON_HOME=$HADOOP_HOME
                export HADOOP_HDFS_HOME=$HADOOP_HOME
                export YARN_HOME=$HADOOP_HOME
                export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
               export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

  4.1.2 Source .bashrc in current login session

          hduser@pooja:~$ source ~/.bashrc
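You can quickly confirm that the variables took effect in the current shell (a simple check, assuming the paths above):

          hduser@pooja:~$ echo $HADOOP_HOME
          /home/hduser/hadoop
          hduser@pooja:~$ which hadoop
          /home/hduser/hadoop/bin/hadoop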
          
4.2  Hadoop configuration file changes

   4.2.1 Changes to hadoop-env.sh (set $JAVA_HOME to installation directory)
         
           4.2.1.1 Find JAVA_HOME on machine.
                      
                        hduser@pooja:~$ which java
                         /usr/bin/java
                        
                         hduser@pooja:~$ readlink -f /usr/bin/java
                         /usr/lib/jvm/java-8-oracle/jre/bin/java

                         Note: /usr/lib/jvm/java-8-oracle is the JAVA_HOME directory
          4.2.1.2  Edit hadoop-env.sh and set $JAVA_HOME.
         
                       hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh  
                      
                       Edit the file and change

                       export JAVA_HOME=${JAVA_HOME}
                                         to
                       export JAVA_HOME=/usr/lib/jvm/java-8-oracle
                            Note: JAVA_HOME is the path fetched in step 4.2.1.1
                    
4.2.2  Changes to core-site.xml 
hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/core-site.xml

Add the configuration property (NameNode property: fs.default.name).

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

4.2.3 Changes to hdfs-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Add the configuration property (NameNode property: dfs.name.dir, DataNode property: dfs.data.dir).

<configuration>
<property>
     <name>dfs.replication</name>
       <value>1</value>
</property>
<property>
       <name>dfs.name.dir</name>
       <value>file:///home/hduser/hadoopdata/hdfs/namenode</value>
</property>
<property>
     <name>dfs.data.dir</name>
     <value>file:///home/hduser/hadoopdata/hdfs/datanode</value>
</property>
</configuration>


4.2.4 Changes to mapred-site.xml

Here, we first copy mapred-site.xml.template to mapred-site.xml and then add a property to it.

hduser@prod01:~$ cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the configuration property (mapreduce.framework.name)
<configuration>
     <property>
         <name>mapreduce.framework.name</name>
          <value>yarn</value>
       </property>


</configuration>

Note: If you don't specify this, the ResourceManager UI (http://localhost:8088) will not show any jobs.

4.2.5 Changes to yarn-site.xml

hduser@prod01:~$ vi $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the configuration property

<configuration>
     <property>
         <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
       </property>
</configuration>

Step 5: Verify and format HDFS File system

5.1 Format HDFS file system

       hduser@pooja:~$ hdfs namenode -format

       ...Snippet
           17/08/24 16:08:36 INFO util.GSet: capacity      = 2^15 = 32768 entries
           17/08/24 16:08:36 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1601791069-127.0.1.1-1503616
           17/08/24 16:08:37 INFO common.Storage: Storage directory /home/hduser/hadoopdata/hdfs/namenode has been successfully formatted.
            17/08/24 16:08:37 INFO namenode.FSImageFormatProtobuf: Saving image file       /home/hduser/hadoopdata/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
           17/08/24 16:08:37 INFO namenode.FSImageFormatProtobuf: Image file /home/hduser/hadoopdata/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds.
           17/08/24 16:08:37 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
           17/08/24 16:08:37 INFO util.ExitUtil: Exiting with status 0
           17/08/24 16:08:37 INFO namenode.NameNode: SHUTDOWN_MSG: 
          /************************************************************
            SHUTDOWN_MSG: Shutting down NameNode at pooja/127.0.1.1
          ************************************************************/

5.2 Verify the format (make sure the hadoopdata/hdfs/* folders were created)
        
       hduser@prod01:~$ ls -ltr hadoopdata/hdfs/
       
         total 4
         drwxrwxr-x 3 hduser hduser 4096 Aug 24 16:09 namenode

Note: This is the same path as specified in the hdfs-site.xml property dfs.name.dir.

Step 6: Start the single-node cluster

We will start the Hadoop cluster using the Hadoop startup scripts.

6.1 Start HDFS
     
hduser@prod01:~$ start-dfs.sh 
17/08/24 16:38:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-namenode-prod01.out
localhost: starting datanode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-datanode-prod01.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is be:b3:7d:41:89:03:15:04:1c:84:e3:d9:69:1f:c8:5d.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/hduser/hadoop-2.8.1/logs/hadoop-hduser-secondarynamenode-prod01.out
17/08/24 16:39:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

6.2 Start yarn

hduser@prod01:~$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /home/hduser/hadoop-2.8.1/logs/yarn-hduser-resourcemanager-prod01.out
localhost: starting nodemanager, logging to /home/hduser/hadoop-2.8.1/logs/yarn-hduser-nodemanager-prod01.out

6.3 Verify that all processes started

hduser@prod01:~$ jps
6775 DataNode
7209 ResourceManager
7017 SecondaryNameNode
6651 NameNode
7339 NodeManager
7663 Jps

6.4 Run the pi MapReduce job from the hadoop-examples jar.

hduser@prod1:~$ yarn jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 4 1000 




Step 7: Hadoop Web Interface

Web UI of the NameNode (http://localhost:50070)


ResourceManager UI (http://localhost:8088).
It shows all running jobs and cluster resource information, which helps you monitor jobs and track their progress.
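If you prefer the command line, you can also check that the web interfaces are responding (a quick sanity check; assumes curl is installed and uses the default ports above):

hduser@prod01:~$ curl -s -I http://localhost:50070 | head -n 1
hduser@prod01:~$ curl -s -I http://localhost:8088 | head -n 1

Both commands should print an HTTP status line when the daemons are up.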

Step 8: Stopping Hadoop

8.1 Stop Yarn processes

hduser@prod01:~$ stop-yarn.sh

stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
localhost: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop

8.2 Stop HDFS processes

hduser@prod01:~$ stop-dfs.sh
17/08/24 17:11:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
17/08/24 17:12:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


I hope you are able to follow my instructions on the Hadoop pseudo-distributed mode setup. Please write to me if you are still facing problems.

Happy Coding!!!!

Wednesday, January 25, 2017

Configure IntelliJ for Android Development on CentOS

Mobile Application

In today's world, mobile application development has grown tremendously. Operations from online payments to e-shopping, digital assistants, and interactive messaging are now just a click away on a mobile device.
Mobile application user interfaces can be developed using an array of technologies such as HTML 5, CSS, JavaScript, Java, Android, or iOS.

In this post, I will be discussing how to set up an Android environment in an existing IntelliJ installation.

IntelliJ set up for Android Development

Perform below steps for setup.

Step 1. Install Java 8 or Java 7 JDK

$ java -version
java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)

Step 2. Install Android SDK

[user@localhost ~]$ cd /opt
[user@localhost opt]$ sudo wget http://dl.google.com/android/android-sdk_r24.4.1-linux.tgz
[sudo] password for pooja: 
--2017-01-24 22:25:23--  http://dl.google.com/android/android-sdk_r24.4.1-linux.tgz
Resolving dl.google.com (dl.google.com)... 172.217.6.46, 2607:f8b0:4005:805::200e
Connecting to dl.google.com (dl.google.com)|172.217.6.46|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 326412652 (311M) [application/x-tar]
Saving to: ‘android-sdk_r24.4.1-linux.tgz’

100%[============================================================================================================>] 326,412,652  148KB/s   in 29m 58s

2017-01-24 22:55:21 (177 KB/s) - ‘android-sdk_r24.4.1-linux.tgz’ saved [326412652/326412652]

[user@localhost opt]$ sudo tar zxvf android-sdk_r24.4.1-linux.tgz
[user@localhost opt]$ sudo chown -R root:root android-sdk_r24.4.1-linux 
[user@localhost opt]$ sudo ln -s android-sdk_r24.4.1-linux android-sdk-linux 

# If you do not change the ownership, you will get the error "selected directory is not a valid home for android SDK" while setting the Android SDK path in IntelliJ



[user@localhost opt]$ sudo chown -R user:group /opt/android-sdk-linux/

[user@localhost opt]$ sudo vim /etc/profile.d/android-sdk-env.sh

# Add the following lines to the file:
export ANDROID_HOME=/opt/android-sdk-linux
export PATH=$ANDROID_HOME/tools:$ANDROID_HOME/platform-tools:$PATH

[user@localhost opt]$ source /etc/profile.d/android-sdk-env.sh
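After sourcing the profile script, a quick check that the variables and the SDK tools are visible on the PATH (assuming the paths used above):

$ echo $ANDROID_HOME
/opt/android-sdk-linux
$ which android
/opt/android-sdk-linux/tools/android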

Step 3: Open the SDK Manager using the Android SDK tool

[user@localhost opt]$ sudo android-sdk-linux/tools/android


Now, select the 'All Tools' option and press "Install 23 packages". The license screen then opens as shown below.

Finally, select the 'Install' button, which will start the download of the packages.


Step 4: Install IntelliJ (if not exists)

IntelliJ Community Edition is free; download it and untar the file.

Step 5: Open IntelliJ (or close the current project), which will bring up the screen below.
Now, select 'Create New Project' and then select the project type "Android" as shown below.


Now, select the option "Application Module" and click 'Next'.


Now, select the 'New' button. 

Then a browser window will open up. Select /opt/android-sdk-linux and press 'OK'.

Lastly, the Android version popup window will be shown as below.



With this, we have configured the existing IntelliJ installation for an Android development project. Now press the 'Finish' button to create the project.

I hope you are also able to configure your existing IntelliJ for Android development. If you run into any problems, please write back; I would love to hear from you.

Tuesday, January 17, 2017

Debugging Apache Hadoop (NameNode,DataNode,SNN,ResourceManager,NodeManager) using IntelliJ

In the previous blogs, I discussed setting up the environment, downloading the Apache Hadoop code, building it, and setting it up in the IDE (IntelliJ).

In this blog, I will focus on debugging the Apache Hadoop code to understand it. 

I use remote debugging to connect to and debug any of the Hadoop processes (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager).

Prerequisites
1. Apache Hadoop code on the local machine.
2. Code is built (look for hadoop/hadoop-dist being created).
3. The code is set up in IntelliJ.

Let's dive into the steps to understand the debugging process.

Step 1: Look for the hadoop-dist directory in the main Hadoop directory.
Once the Hadoop code is built, the hadoop-dist directory is created in the main Hadoop directory as shown below.

Step 2: Move into the target directory. 
 [pooja@localhost hadoop]$ cd hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT

The directory structure looks as below (it is the same as the Apache download tar).

Step 3: Now, set up the Hadoop configuration.
a. Change the hadoop-env.sh file to add the JAVA_HOME path

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hadoop-env.sh 
Add the below line. 
JAVA_HOME=$JAVA_HOME

b. Add the configuration parameters (Note: I am doing the minimum setup needed to run the Hadoop processes)

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/core-site.xml
<configuration>
<property>
   <name>fs.default.name</name>
   <value>hdfs://localhost:9000</value>
 </property>
</configuration>

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hdfs-site.xml 
<configuration>
 <property>
 <name>dfs.replication</name>
  <value>1</value>
</property>
  <property>
   <name>dfs.name.dir</name>
   <value>file:///home/pooja/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
   <name>dfs.data.dir</name>
     <value>file:///home/pooja/hadoopdata/hdfs/datanode</value>
 </property>
</configuration>

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/yarn-site.xml
<configuration>
<property>
    <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
 </property>
</configuration>

Place the environment properties in ~/.bashrc
export HADOOP_HOME=<hadoop source code directory>/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
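After adding these lines, reload the environment in the current shell and confirm the freshly built distribution is the one being picked up (a small sanity check):

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ source ~/.bashrc
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ which hadoop
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ hadoop version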

Step 4: Run all Hadoop processes

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-dfs.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [localhost.localdomain]
2017-01-17 20:27:44,335 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ jps
25232 SecondaryNameNode
26337 Jps
24839 DataNode
24489 NameNode
25914 NodeManager
25597 ResourceManager

Step 5: Now stop all the processes.
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/stop-yarn.sh
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/stop-dfs.sh

Step 6: Debug a Hadoop process (e.g. the NameNode) by making the change below in hadoop-env.sh (or hdfs-env.sh).

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ vi etc/hadoop/hadoop-env.sh  
Add below line.
export HDFS_NAMENODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"

Similarly, we can debug the other daemons by setting the corresponding options:
YARN_RESOURCEMANAGER_OPTS
YARN_NODEMANAGER_OPTS
HDFS_NAMENODE_OPTS
HDFS_DATANODE_OPTS
HDFS_SECONDARYNAMENODE_OPTS 
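Once the NameNode is started again (Step 8 below), you can confirm that the JVM opened the JDWP debug port (a quick check, assuming port 5000 as configured above):

[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ jps -v | grep address=5000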

Step 7: Enable remote debugging in IDE (IntelliJ) as shown below.
Note: Identify the main class for NameNode process by looking in startup script.

Open NameNode.java class ->Run/Debug Configuration (+)->Remote-> Change 'port' to 5000 (textbox) ->Apply button


Step 8: Now start the NameNode process and put a breakpoint in the NameNode.java class as shown below.

Start the process:
[pooja@localhost hadoop-3.0.0-alpha2-SNAPSHOT]$ sbin/start-dfs.sh

Start the debugger(Shift+9):


And now you can debug the code as shown below.

I hope everyone is able to set up the code; if you hit any problems, please do write and I will be happy to help you.
In the next blog, I will be writing about the steps for creating a patch for an Apache Hadoop contribution. 

Happy Coding and Keep Learning !!!!

Importing Apache Hadoop (HDFS, YARN) modules into IntelliJ

In the previous blog, I wrote about the steps to set up the environment and download the Apache Hadoop code to our machine for understanding and contributing. In this blog, I will walk through setting up the code in the IDE (IntelliJ here).

By now, I presume the Apache Hadoop code is on your machine and that the code is compiled. If not, follow the previous blog.

Please follow the steps below to import the HDFS module into IntelliJ.

Step 1: Open IntelliJ (either using the shortcut or idea.sh) and close the project if one is already open, as shown below.


Step 2: In the screen below, choose Import Project as shown.

Step 3: Now browse to the folder you want to import. Select the Hadoop/hadoop-hdfs-project/hadoop-hdfs directory and press 'OK'.


Step 4: The screen below will be shown. Select the option "Import project from external model" and click 'Next'.



Step 5: Now keep pressing Next and then Finish. The project will be imported into IntelliJ as shown below.


Now, the Apache Hadoop HDFS module is imported into IntelliJ. You can import the other modules (YARN, Common) similarly.

I hope all viewers are able to import the Apache Hadoop project successfully into IntelliJ. If you are facing any issues, please discuss them here; I will be happy to assist you all.

In the next tutorial, I will discuss the steps for debugging Hadoop.
   

Thursday, January 12, 2017

Contribute to Apache Hadoop

For a long time, I have wanted to contribute to open source Apache Hadoop. Today I was free, so I worked on setting up the Hadoop code on my local machine for development. I am documenting the steps as they may be useful for newcomers.

Below are the steps to set up the Hadoop code for development

Step 1: Install Java JDK 8 or above

$ java -version
java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)

Step 2: Install Apache Maven version 3 or later

mvn -version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T08:41:47-08:00)
Maven home: /usr/local/apache-maven
Java version: 1.8.0_72, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_72/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-514.2.2.el7.x86_64", arch: "amd64", family: "unix"

Step 3: Install Google Protocol Buffers (version 2.5.0)
Make sure the Protocol Buffers version is 2.5.0.
I had installed the higher Protocol Buffers version 3.1.0, but when compiling the code I got the error below.
[ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.0.0-alpha2-SNAPSHOT:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: protoc version is 'libprotoc 3.1.0', expected version is '2.5.0' -> [Help 1]
[ERROR] 

Step 4:  Download the hadoop source code

We can either clone the repository directly or create a fork of the repository and then clone that.

a) Directly cloning the repository.
 git clone git://git.apache.org/hadoop.git
b) Create a fork as shown below:


And then download the code as shown below:

 $ git clone https://github.com/poojagpta/hadoop

Sync the forked project with the upstream project's changes:
Add the remote link:
 $git remote add upstream https://github.com/apache/hadoop

 $ git remote -v
origin https://github.com/poojagpta/hadoop (fetch)
origin https://github.com/poojagpta/hadoop (push)
upstream https://github.com/apache/hadoop (fetch)
upstream https://github.com/apache/hadoop (push)

Now, if you want to fetch the latest code:
$ git fetch upstream
$ git checkout trunk

Step 5: Compile the downloaded code
$ cd hadoop
$ mvn clean install -Pdist -Dtar -Ptest-patch -DskipTests -Denforcer.skip=true

Snippet Output:
[INFO] --- maven-install-plugin:2.5.1:install (default-install) @ hadoop-client-modules ---
[INFO] Installing /home/pooja/dev/hadoop/hadoop-client-modules/pom.xml to /home/pooja/.m2/repository/org/apache/hadoop/hadoop-client-modules/3.0.0-alpha2-SNAPSHOT/hadoop-client-modules-3.0.0-alpha2-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  1.780 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  2.560 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.236 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  4.824 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.314 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.834 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  9.167 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  5.918 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 20.083 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  7.650 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [02:03 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 12.138 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 13.088 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.138 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [ 54.973 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [01:51 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [  1.323 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 41.081 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 12.680 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.070 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [  0.073 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 35.955 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [01:38 min]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [  0.089 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 22.489 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 32.492 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [  8.606 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 20.153 s]
[INFO] Apache Hadoop YARN Timeline Service ................ SUCCESS [02:26 min]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 55.442 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [  5.479 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 17.122 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [  8.654 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [  8.234 s]
[INFO] Apache Hadoop YARN Timeline Service HBase tests .... SUCCESS [02:51 min]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [  0.044 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [  8.076 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [  5.937 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [  0.077 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [ 11.366 s]
[INFO] Apache Hadoop YARN UI .............................. SUCCESS [  1.832 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [  8.590 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [  0.225 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 43.115 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 27.865 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [  9.009 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 24.415 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 14.692 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 29.361 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [  4.828 s]
[INFO] Apache Hadoop MapReduce NativeTask ................. SUCCESS [ 10.299 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 12.238 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [  4.336 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 17.591 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 13.083 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  6.314 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [  6.982 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 12.048 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 12.327 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  5.819 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  5.794 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  0.036 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  8.138 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [ 53.458 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 20.452 s]
[INFO] Apache Hadoop Aliyun OSS support ................... SUCCESS [ 11.273 s]
[INFO] Apache Hadoop Client Aggregator .................... SUCCESS [  3.698 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  1.618 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 12.085 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [ 27.289 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [  5.002 s]
[INFO] Apache Hadoop Kafka Library support ................ SUCCESS [  7.041 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.052 s]
[INFO] Apache Hadoop Client API ........................... SUCCESS [02:09 min]
[INFO] Apache Hadoop Client Runtime ....................... SUCCESS [01:21 min]
[INFO] Apache Hadoop Client Packaging Invariants .......... SUCCESS [  3.431 s]
[INFO] Apache Hadoop Client Test Minicluster .............. SUCCESS [03:13 min]
[INFO] Apache Hadoop Client Packaging Invariants for Test . SUCCESS [  0.329 s]
[INFO] Apache Hadoop Client Packaging Integration Tests ... SUCCESS [  1.542 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 42.013 s]
[INFO] Apache Hadoop Client Modules ....................... SUCCESS [  0.105 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 32:42 min
[INFO] Finished at: 2017-01-12T11:30:58-08:00
[INFO] Final Memory: 131M/819M
[INFO] ------------------------------------------------------------------------


I hope you are also able to set up the Hadoop project and are ready to contribute like me. Please let me know if you are still facing issues; I would love to help you.
In the next tutorial, I will set up the code in IntelliJ and go through the steps to debug the code.

Thanks and happy coding !!!

Problems encountered while compiling the code:

1. Some of the JUnit tests are failing.

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.minikdc.TestMiniKdc
Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 3.451 sec <<< FAILURE! - in org.apache.hadoop.minikdc.TestMiniKdc
testKeytabGen(org.apache.hadoop.minikdc.TestMiniKdc)  Time elapsed: 1.314 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testMiniKdcStart(org.apache.hadoop.minikdc.TestMiniKdc)  Time elapsed: 1.002 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testKerberosLogin(org.apache.hadoop.minikdc.TestMiniKdc)  Time elapsed: 1.008 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

Running org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain
Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 3.289 sec <<< FAILURE! - in org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain
testKeytabGen(org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain)  Time elapsed: 1.18 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testMiniKdcStart(org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain)  Time elapsed: 1.014 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testKerberosLogin(org.apache.hadoop.minikdc.TestChangeOrgNameAndDomain)  Time elapsed: 1.009 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:280)
at org.apache.hadoop.minikdc.KerberosSecurityTestcase.startMiniKdc(KerberosSecurityTestcase.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)


Results :

Tests in error: 
  TestMiniKdc>KerberosSecurityTestcase.startMiniKdc:49 » Runtime Unable to parse...
  TestMiniKdc>KerberosSecurityTestcase.startMiniKdc:49 » Runtime Unable to parse...
  TestMiniKdc>KerberosSecurityTestcase.startMiniKdc:49 » Runtime Unable to parse...
  TestChangeOrgNameAndDomain>KerberosSecurityTestcase.startMiniKdc:49 » Runtime ...
  TestChangeOrgNameAndDomain>KerberosSecurityTestcase.startMiniKdc:49 » Runtime ...
  TestChangeOrgNameAndDomain>KerberosSecurityTestcase.startMiniKdc:49 » Runtime ...

Tests run: 6, Failures: 0, Errors: 6, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  2.060 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  1.584 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.018 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  4.161 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.253 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.803 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  9.047 s]
[INFO] Apache Hadoop MiniKDC .............................. FAILURE [ 11.461 s]
[INFO] Apache Hadoop Auth ................................. SKIPPED
[INFO] Apache Hadoop Auth Examples ........................ SKIPPED
......
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on project hadoop-minikdc: There are test failures.
[ERROR] 
[ERROR] Please refer to /home/pooja/dev/hadoop/hadoop-common-project/hadoop-minikdc/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-minikdc

Solution:
I fixed the problem by skipping the JUnit tests (-DskipTests) for the entire build and running JUnit only for the module whose code I want to start fixing.

2. Got the below error for the module 'Hadoop HDFS'. 

[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  1.612 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  1.507 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.033 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  4.937 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.312 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.670 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  8.432 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  5.359 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 14.786 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  5.682 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [02:04 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 12.310 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 14.775 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.074 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [01:19 min]
[INFO] Apache Hadoop HDFS ................................. FAILURE [  5.697 s]
[INFO] Apache Hadoop HDFS Native Client ................... SKIPPED
.........................................
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 04:48 min
[INFO] Finished at: 2017-01-11T15:43:27-08:00
[INFO] Final Memory: 90M/533M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project hadoop-hdfs: Could not resolve dependencies for project org.apache.hadoop:hadoop-hdfs:jar:3.0.0-alpha2-SNAPSHOT: The following artifacts could not be resolved: org.apache.hadoop:hadoop-kms:jar:classes:3.0.0-alpha2-SNAPSHOT, org.apache.hadoop:hadoop-kms:jar:tests:3.0.0-alpha2-SNAPSHOT: Could not find artifact org.apache.hadoop:hadoop-kms:jar:classes:3.0.0-alpha2-SNAPSHOT in apache.snapshots.https (https://repository.apache.org/content/repositories/snapshots) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-hdfs

Solution:

The problem suggests an issue with the installation of OpenSSL.
I want to install OpenSSL to be able to use the HTTPS protocol in HDFS, curl, or other applications.
openssl (the binary) is installed, but OpenSSL (the development library required for the HTTPS protocol) is not installed.

You can install openssl using the command below:
$sudo yum install openssl openssl-devel

$ which openssl
/usr/bin/openssl

$ openssl version
OpenSSL 1.0.1e-fips 11 Feb 2013
We can solve the problem using 2 approaches

   a. Create a link to openssl path as shown below

      ln -s /usr/bin/openssl /usr/local/openssl

or
    b. Download OpenSSL and compile it as shown below

$wget https://www.openssl.org/source/openssl-1.0.1e.tar.gz
$tar -xvf openssl-1.0.1e.tar.gz
$cd openssl-1.0.1e
$./config --prefix=/usr/local/openssl --openssldir=/usr/local/openssl
$ make
$ sudo make install

3. Error with enforcer
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce (depcheck) on project hadoop-hdfs-client: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

Solution:
I skipped the enforcer plugin (-Denforcer.skip=true), as its constraints allow only Unix and Mac machines.