Decommissioning Data Nodes in an Apache Hadoop Cluster
Hadoop is an open-source framework for writing and running distributed applications. It comprises a distributed file system (HDFS) and a programming model (MapReduce).
It is designed to store large volumes of varied data on a cluster of commodity servers, and these commodity servers can be added and removed easily.
Need for Decommissioning Nodes
A node may need to be decommissioned for planned maintenance, because of a fault on the node, or because the cluster load has reduced and fewer nodes are needed.
Steps for Decommissioning
To decommission a data node from the cluster, we need to make sure its data is copied from the outgoing slave node to the other nodes.
1. Reduce the property dfs.replication (if needed).
For example, if we have a 2-node cluster and decommission one node, we should also reduce the replication factor to 1, since only one data node will remain to hold each block (see the sketch below).
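A minimal sketch of the hdfs-site.xml entry, with the value 1 matching the 2-node example above; use the factor that fits your cluster:
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
Note that dfs.replication only applies to files written after the change; the replication of existing files can be lowered with the setrep shell command, for example:
# Recursively set replication factor 1 on existing files and wait until done
hadoop fs -setrep -w 1 /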
2. Add the property dfs.hosts.exclude to hdfs-site.xml as shown below. It points to a file listing the hosts that are not permitted to connect to the NameNode.
<property>
  <name>dfs.hosts.exclude</name>
  <value>/home/hduser/hadoop/etc/hadoop/excludes</value>
</property>
Edit yarn-site.xml
Add the property yarn.resourcemanager.nodes.exclude-path as shown below.
<property>
  <name>yarn.resourcemanager.nodes.exclude-path</name>
  <value>/home/hduser/hadoop/etc/hadoop/excludes</value>
</property>
Create the excludes file
Create the excludes file at /home/hduser/hadoop/etc/hadoop/excludes and write the hostname of the slave node we need to decommission, one hostname per line, as in the sketch below.
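For example, assuming the outgoing slave is named slave2 (a hypothetical hostname; substitute your own):
# Write the hostname of the node to decommission into the excludes file
echo "slave2" > /home/hduser/hadoop/etc/hadoop/excludes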
Update Hadoop Processes
Update the NameNode with the new set of permitted data nodes:
hadoop dfsadmin -refreshNodes
Update the ResourceManager with the new set of permitted node managers:
yarn rmadmin -refreshNodes
Verify that the node is decommissioned.
The NameNode report shows a Decommission Status for each data node; the outgoing slave should move from "Decommission In Progress" to "Decommissioned" once its blocks have been re-replicated.
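One way to watch this (a sketch; the report layout varies slightly across Hadoop versions) is to filter the dfsadmin report:
# Print each data node along with its current decommission state
hadoop dfsadmin -report | grep -E "Name:|Decommission Status"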
Once the status shows Decommissioned, stop the NodeManager and DataNode daemons on the slave using the commands below:
yarn-daemon.sh stop nodemanager
hadoop-daemon.sh stop datanode
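As a quick sanity check, jps (shipped with the JDK) on the slave should no longer list either daemon:
# Run on the slave; prints nothing once both daemons have stopped
jps | grep -E "DataNode|NodeManager"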
Finally, edit the slaves file in the Hadoop configuration directory (on the node from which the cluster start scripts are run, typically the master) and remove the decommissioned node's hostname, so that Hadoop does not try to start daemons on it again.
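A one-line way to do this, again assuming the hypothetical hostname slave2:
# Remove the decommissioned host from the slaves file
sed -i '/^slave2$/d' /home/hduser/hadoop/etc/hadoop/slaves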
Hope you are able to decommission your data node on your Hadoop cluster.
Happy Coding!!!