Tuesday, August 29, 2017

Commission Data Nodes in Hadoop 2.8.1

Apache Hadoop Cluster


Hadoop is an open-source framework for writing and running distributed applications. It comprises a distributed file system (HDFS) and a programming model (MapReduce).

It is designed to store large volumes of varied data on a cluster of commodity servers, and these commodity servers can be added and removed easily.

Need for Commissioning Nodes

We commission a node when we need to add back a node that was removed earlier, when other nodes are down for maintenance, or when the cluster load has increased.

Steps for Commissioning

Before commissioning a data node into the cluster, we need to make sure the node being added is network accessible from the master (ideally on the same subnet), for example by checking connectivity as shown below.
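
A quick way to confirm reachability from the master is a ping and an SSH login (here slave3 is a hypothetical hostname; use your new node's name):

ping -c 3 slave3                  # replace slave3 with your new node's hostname
ssh hduser@slave3 hostname        # verifies passwordless SSH works for the hadoop user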

Edit hdfs-site.xml


1. Adjust the property dfs.replication, if needed.

For example, if we have a 2-node cluster and commission one more node, we can also increase the replication factor to 3, as shown below.
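
A sample dfs.replication entry in hdfs-site.xml would look like this (the value 3 is only an illustration for the 3-node example above; choose a value that fits your cluster):

<property>
     <name>dfs.replication</name>
     <value>3</value>
</property>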

2. Add the property dfs.hosts to the file as shown below.

<property>
     <name>dfs.hosts</name>
     <value>/home/hduser/hadoop/etc/hadoop/include</value>
</property>






Edit yarn-site.xml


Add the property yarn.resourcemanager.nodes.include-path as shown below.

<property>
     <name>yarn.resourcemanager.nodes.include-path</name>
     <value>/home/hduser/hadoop/etc/hadoop/include</value>
</property>






Add include file

Create the include file in the path /home/hduser/hadoop/etc/hadoop and write the hostname of the slave we need to commission, as shown below.
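
For example, assuming the new slave's hostname is slave3 (a hypothetical name), the include file would contain one hostname per line:

slave3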



Update Hadoop Processes 



Update the Name Node with the set of permitted data nodes.

hadoop dfsadmin -refreshNodes
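
Note that in Hadoop 2.x the hadoop dfsadmin form still works but is deprecated in favor of the hdfs command; the equivalent is:

hdfs dfsadmin -refreshNodes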






Update the Resource Manager with the set of permitted node managers.

yarn rmadmin -refreshNodes




Start the Data Node and Node Manager on the slave using the commands below:

hadoop-daemon.sh start datanode
yarn-daemon.sh start nodemanager




Verify that the node is commissioned.

hadoop dfsadmin -report

Here, make sure the list of live data nodes includes the one you added to the include file.
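
If the report is long, you can filter it for the summary line (in Hadoop 2.x it reads "Live datanodes (N):"):

hdfs dfsadmin -report | grep "Live datanodes"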



Edit the $HADOOP_HOME/etc/hadoop/slaves file on the master node and add the new node so that Hadoop starts the commissioned node automatically on the next cluster restart, for example as shown below.
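
With hypothetical hostnames slave1, slave2, and the newly added slave3, the slaves file would read:

slave1
slave2
slave3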


Balance Hadoop Cluster
Finally, we need to balance the load across all data nodes so that the newly added data node also gets some data blocks.

hduser@pooja:~$ hdfs balancer
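
The balancer also accepts a threshold argument (the allowed deviation of each data node's disk usage from the cluster average, in percent). For example, to balance until every node is within 5% of the average:

hduser@pooja:~$ hdfs balancer -threshold 5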


Hope you are now able to add a new data node to your Hadoop cluster. Please feel free to write to me if you run into any problems.

Happy Coding!!! 
