When you manage a cluster, you really don't want your workload to grow in proportion to the number of machines. For example, when you need to install a piece of software, never ssh into every box and repeat the same steps by hand. Several tools exist for this; Puppet and Chef are the two best known. Another option is Ansible, an agent-less management tool implemented in Python. It is the tool I use, because it is (1) agent-less, (2) lightweight, and (3) written in Python.
yum install -y ansible
ansible cluster -m command -a "yum install -y python-setuptools"
ansible cluster -m command -a "yum install -y python-pip"
ansible cluster -m command -a "easy_install beautifulsoup4"
ansible cluster -m command -a "yum install -y java-1.7.0-openjdk"
These are the commands that I ran. Here "cluster" is the group name, which covers all the data nodes.
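A group like this is defined in Ansible's inventory file. As a rough sketch only, assuming a hypothetical inventory file named hosts.example and placeholder host names (neither comes from my actual setup), it could look something like this:

```shell
# Write a hypothetical inventory file; the file name and the host names
# below are placeholders, not the real data nodes.
cat > ./hosts.example <<'EOF'
[cluster]
datanode1.example.com
datanode2.example.com
EOF

# If ansible is installed, list the hosts that the "cluster" group resolves to.
# --list-hosts only prints the matching hosts; it does not connect to them.
if command -v ansible >/dev/null 2>&1; then
    ansible -i ./hosts.example cluster --list-hosts
fi
```

With an inventory like this, passing -i ./hosts.example to the ansible commands would run them against every host listed under [cluster].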