Building Pyramids

Matt Polnik's blog

Scalable and Cost Effective Linux Administration


Managing multiple machines without automation and tool support can be an annoying duty for a creative person. There are better ways to spend time than executing the same command on each machine. Nowadays, a professional system administrator should be able to build and manage networks of thousands of computers. Even if system administration is not your full-time job and the cluster you manage is two orders of magnitude smaller, it is still worth using your time effectively. Why not invest some time upfront and master a few key skills to better manage Linux machines in the future?

This article aims to provide a collection of practices for cost effective and scalable administration of remote machines. Cost effective means that an operation will not require more human effort than absolutely necessary. For example, an administrator will still have to execute an apt-get install command to install a software package. However, this activity will be performed once regardless of the number of machines to manage, which moves us to the scalable part. Adding a new machine to a system will not increase administration effort if the system is working under normal conditions.


Running the examples presented in this article requires access to multiple machines with a custom network configuration. Fortunately, to accurately imitate the conditions of managing a lab it is enough to set up a simple cluster of three virtual machines: a master and two slaves. The master will be used to issue administration commands, which will then be replicated automatically to the slave nodes. There is no limit on the number of slave nodes, but two are enough to demonstrate the approach with no loss of generality.

The cluster of virtual machines will be set up using Vagrant, a tool for configuring virtual machine environments in an automated fashion. Vagrant is independent of the virtual machine provider, such as VirtualBox, and of the actual tool used to configure the machine, which is referred to as the provisioner. To learn more about Vagrant I recommend starting with the introduction on the official project site. The examples below use VirtualBox and the shell provisioner, which is built into Vagrant.

  1. Install VirtualBox.

    Linux distributions often offer VirtualBox packages in their package repositories. This is probably the easiest option if you would like to get future updates automatically. However, before registering a new package source, ensure it is intended for your version of the operating system. For example, if you are using Debian Jessie, the backport for Debian Wheezy may not work on your system.

    For popular Linux distributions the safest solution is installing the VirtualBox package from the official downloads page. After a successful download, install the package using the command below, substituting the correct package name.

    sudo dpkg -i virtualbox-5.1_5.1.22-115126-Debian-jessie_amd64.deb 

    Do not be discouraged by possible error messages printed by the package manager.

    dpkg: error processing package virtualbox-5.1 (--install):
     dependency problems - leaving unconfigured

    Dependency problems can be easily resolved by running the command:

    sudo apt-get install -f

    If you would like to try VirtualBox from a package source, consult the official documentation of your Linux distribution on this subject or follow the direct links for Debian and Ubuntu users.

  2. Install Vagrant.

    The commands below install Vagrant on Debian, using the latest version at the time of writing. Vagrant has a very short release cycle, so it is strongly recommended to download the latest version from the official project downloads page.

    sudo dpkg -i vagrant_1.9.4_x86_64.deb
    sudo apt-get install -f
  3. Download Vagrant files for this article: master, slave1 and slave2. The virtual machines are provisioned to test examples with no extra configuration. Administration commands should be issued from the master node.
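
Once the three Vagrant files are downloaded, bringing the cluster up might look like the sketch below. It assumes each file was saved as Vagrantfile in a directory named after its node (master, slave1, slave2) under the current directory. That layout and the two helper functions are my assumptions, not something the downloads prescribe.

```shell
#!/bin/sh
# Sketch: start every node of the test cluster.
# Assumption: ./master, ./slave1 and ./slave2 each contain a Vagrantfile.

# Print the node directories that are missing a Vagrantfile.
missing_nodes() {
    for node in "$@"; do
        [ -f "$node/Vagrantfile" ] || printf '%s\n' "$node"
    done
}

# Run "vagrant up" in each node directory, stopping at the first failure.
start_cluster() {
    for node in "$@"; do
        (cd "$node" && vagrant up) || return 1
    done
}

# Usage:
#   missing_nodes master slave1 slave2   # prints nothing when all files are in place
#   start_cluster master slave1 slave2
#   (cd master && vagrant ssh)           # log in to the master node
```

Running vagrant up inside a subshell keeps the working directory of the calling shell unchanged between nodes.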

Use friendly aliases for machines

Being easy for humans to remember was certainly not a priority for the Internet Protocol design committee. A group of more than a handful of machines deserves its own meaningful naming scheme. Furthermore, if hosts are distributed across different locations, it may also be useful to mention the physical location in the domain name.

Come up with a naming scheme and register the host names in the /etc/hosts file. This change should be applied at least on the master node, from which you issue administration commands.

sudo vim /etc/hosts
# /etc/hosts
<ip-address>    localhost
<ip-address>    debian
<ip-address>    master.lab1 master
<ip-address>    slave1.lab1 slave1
<ip-address>    slave2.lab1 slave2

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Now, you can use a hostname alias to connect to the selected machine.
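
Before relying on the new names, it may be worth checking that each one actually resolves. The function below is a small sketch built on getent, which consults /etc/hosts as well as DNS. The function name is hypothetical, and the aliases in the usage line are the ones registered above.

```shell
# Sketch: report which of the given names resolve through /etc/hosts or DNS.
check_aliases() {
    status=0
    for name in "$@"; do
        if getent hosts "$name" >/dev/null 2>&1; then
            printf '%s: ok\n' "$name"
        else
            printf '%s: not resolvable\n' "$name"
            status=1
        fi
    done
    return "$status"
}

# Usage: connect only if every alias resolves.
#   check_aliases master slave1 slave2 && ssh slave1
```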

For consistency, you may decide to update the hostname on the target machine as well.

Edit the /etc/hostname file to update the hostname.

sudo sh -c 'echo new-hostname > /etc/hostname'

Then execute the hostnamectl command so that the change becomes visible to system services right away. Rebooting the system has a similar effect.

sudo hostnamectl set-hostname new-hostname

Deploy administrator SSH key on the remote machine

If you often work remotely on a specific host, it may be useful to add your public SSH key to the ~/.ssh/authorized_keys file on the remote machine. This way you will avoid being asked for a password each time you establish a new SSH connection.

Instead of editing the file directly, you could use the ssh-copy-id command as shown in the example below.

ssh-copy-id username@remote-host

If you do not have an SSH key yet, you can generate one using the ssh-keygen command, which is part of OpenSSH (the openssh-client package on Debian and Ubuntu).

ssh-keygen -t rsa -b 4096 -C ""
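
With more than one machine to prepare, the copy step can be wrapped in a loop. This is only a sketch: the function name, the admin account, and the slave1/slave2 aliases are placeholders for whatever applies to your lab.

```shell
# Sketch: install the local public key on several hosts in one go.
# Assumptions: "admin" and the host aliases in the usage line are placeholders.
deploy_key() {
    user="$1"
    shift
    failed=0
    for host in "$@"; do
        # ssh-copy-id asks for the account password once per host.
        ssh-copy-id "$user@$host" || failed=1
    done
    return "$failed"
}

# Usage: deploy_key admin slave1 slave2
```

The function keeps going after a failure and reports it through the exit status, so one unreachable host does not block the rest.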

Finally, if you have not cut your teeth on SSH yet, I cannot stress enough how much I recommend reading An Illustrated Guide to SSH Agent Forwarding. SSH is ubiquitous nowadays and understanding the protocol internals will help you avoid security pitfalls in the future.

Use parallel SSH

PSSH is a collection of parallel versions of the console utilities originally developed by the OpenSSH project. The tools are distributed in the pssh package, available from the official Debian and Ubuntu repositories. The names of the parallel versions were created by adding the parallel- prefix. For example, parallel-ssh is the parallel version of the ssh client.

The examples below are limited to the parallel-ssh client. Unfortunately, comprehensive documentation is not a strength of this package. To learn more about the pssh project, install the package and read its man pages, or check out the official project sources.

If you feel that the program name parallel-ssh is too long to type conveniently in a terminal, consider registering a shorter alias, such as pssh, in your bash profile.

First, check the absolute path to the program and add the alias to the ~/.bashrc file.

whereis parallel-ssh
parallel-ssh: /usr/bin/parallel-ssh /usr/share/man/man1/parallel-ssh.1.gz

Debian and Ubuntu users may end up with the following line in the ~/.bashrc file.

alias pssh='/usr/bin/parallel-ssh'
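
If you prefer not to edit the file by hand, the alias can be appended from the command line. The helper below is a sketch with a hypothetical name. It adds the line only when it is not already present, so running it twice does no harm.

```shell
# Sketch: append the alias to a bash startup file unless the exact line is already there.
# The file defaults to ~/.bashrc; pass another path as the first argument to override it.
add_pssh_alias() {
    rc_file="${1:-$HOME/.bashrc}"
    line="alias pssh='/usr/bin/parallel-ssh'"
    grep -qxF -- "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Usage: add the alias, then reload the file in the current shell.
#   add_pssh_alias && . ~/.bashrc
```

Keep in mind that bash expands aliases in interactive shells by default, so scripts should still call parallel-ssh by its full name.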

Before letting parallel-ssh shine, create a text file listing the hosts your commands should target. The following examples refer to it as hosts.txt, but any name will do.

# hosts.txt

For example, let’s check the machine uptime.

pssh -h hosts.txt -i uptime

The -i switch instructs pssh to display the output of the processes executed on the target machines. If an exit code is sufficient, feel free to skip the switch.
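
For completeness, here is a sketch of how a host list could be produced and used with a few more switches. The write_hosts helper, the admin account, and the out and err directory names are assumptions made for illustration. The host names are the aliases of the two slave nodes.

```shell
# Sketch: write a host list with one "[user@]host[:port]" entry per line.
write_hosts() {
    out_file="$1"
    shift
    printf '%s\n' "$@" > "$out_file"
}

# Usage (the alias is not needed here, the full program name works everywhere):
#   write_hosts hosts.txt slave1 slave2
#   parallel-ssh -h hosts.txt -l admin -t 30 -i uptime    # log in as admin, 30 second timeout
#   parallel-ssh -h hosts.txt -o out -e err 'df -h /'     # save stdout and stderr per host
```

Saving the output to directories with -o and -e tends to be easier to review than -i once the number of hosts grows.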

Finally, one may wonder how to execute commands that require user input. If a workflow on a target machine is fully interactive and cannot be handled by piping an input stream, it may not be the best candidate for parallel execution to start with. If the only obstacle to running the workflow in parallel is providing the sudo password, it may be useful to waive this security mechanism for certain user accounts or commands. How to waive the sudo password prompt was explained in the post on productivity in Linux.

Set up a local package repository

One practice that greatly reduces the complexity of managing networked hosts, and prevents many software-related problems, is keeping every machine updated with consistent versions of installed software. The Debian community takes responsibility for preparing and testing the software packages hosted in public repositories. However, your organization may not have moved to the most recent release of the distribution, or may require software that lacks a volunteer package maintainer. A robust solution is to set up a local package repository containing the packages that address the special requirements of your organization.

This section will present how to set up a local Advanced Packaging Tool (APT) repository hosted using the Apache HTTP server.

First, install the apache2 and dpkg-dev packages on the machine that will host the package repository.

sudo apt-get install apache2 dpkg-dev

Create the /var/www/deb/amd64 directory and copy into it all the packages that should make up the repository.

mkdir -p /var/www/deb/amd64

Build the package index of the repository and save it compressed with gzip.

cd /var/www/deb/
dpkg-scanpackages amd64 | gzip -9c > amd64/Packages.gz

The dpkg-scanpackages program may print errors or warnings, such as package duplication. Make sure you review the index files before release.
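
One quick way to review the index is to compare the number of package files with the number of entries recorded in Packages.gz. The two helpers below are a sketch with hypothetical names. They assume the index sits next to the .deb files, as in the layout above.

```shell
# Sketch: compare the package files on disk with the entries in the index.

# Count the .deb files directly inside a repository directory.
count_debs() {
    find "$1" -maxdepth 1 -name '*.deb' | wc -l | tr -d ' '
}

# Count the package stanzas recorded in the compressed index.
count_indexed() {
    gzip -dc "$1/Packages.gz" | grep -c '^Package: '
}

# Usage:
#   cd /var/www/deb
#   echo "files: $(count_debs amd64), indexed: $(count_indexed amd64)"
```

A mismatch is not necessarily an error. By default dpkg-scanpackages keeps only one version of each package unless it is run with the -m switch, so several versions of the same package in the directory would explain a lower indexed count.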

Finally, update the list of package sources on the client machines.

Open the sources.list file for editing.

sudo vim /etc/apt/sources.list

Register the package repository using the correct hostname.

# /etc/apt/sources.list
deb http://server_hostname/deb/ amd64/

Packages available in the local repository should become visible after refreshing the package lists.

sudo apt-get update
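
In the spirit of doing things once for all machines, the registration could also be pushed to every client with parallel-ssh instead of editing each sources.list by hand. The sketch below assumes passwordless sudo on the clients, a hosts.txt file listing them, and a file name of my choosing under /etc/apt/sources.list.d. The repo_line helper is hypothetical.

```shell
# Sketch: build the APT source line for a repository served by the given host.
repo_line() {
    printf 'deb http://%s/deb/ amd64/\n' "$1"
}

# Usage (assumes passwordless sudo on the clients and a hosts.txt host list):
#   parallel-ssh -h hosts.txt -i \
#     "echo '$(repo_line server_hostname)' | sudo tee /etc/apt/sources.list.d/local.list >/dev/null && sudo apt-get update"
```

Newer APT releases may refuse a repository that is not signed. Signing the index, or marking the entry with the [trusted=yes] option, are the usual ways around that.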

Configure a cluster management system

If the main purpose of the machines is to serve as a computing cluster, you may consider installing a dedicated cluster management system. For example, Torque is a robust, open source cluster management system used in many supercomputing centers around the world. Learning and configuring Torque for the first time requires a significant time commitment, but the effort should pay off in the long term. For more information on Torque you may refer to previous posts on this blog covering Torque installation, job submission and management.