Designing a high-performance computing cluster using low-cost hardware and open-source software.
In my previous article on Opensource.com[1], I introduced the OpenHPC[2] project, which aims to accelerate innovation in high-performance computing (HPC). This article goes a step further and uses OpenHPC's building blocks to put together a small HPC system. Calling it an HPC system might be a bit of an exaggeration; more accurately, it is a system built by following the cluster building recipes[3] published by the OpenHPC project.
This cluster consists of two Raspberry Pi 3 systems as compute nodes and a virtual machine as the master node, as illustrated below:
Map of HPC cluster
My master node runs CentOS on x86_64, while the compute nodes run a slightly modified CentOS for aarch64.
The following photo shows the hardware in action:
HPC hardware setup
To configure my systems to resemble the HPC setup shown above, I followed some of the steps in the OpenHPC cluster building recipes' installation guide for CentOS 7.4/aarch64 + Warewulf + Slurm[4] (PDF). The recipe includes provisioning instructions based on Warewulf[5]; because I installed my three systems manually, I skipped the Warewulf parts and created an Ansible playbook[6] for the steps I took.
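At its core, the playbook automates installing the OpenHPC packages on each system. The following is only a rough sketch, assuming the OpenHPC 1.3 repository has already been configured as described in the install guide[4] (the guide lists the exact repository RPM and package set):
# On the master node (CentOS, x86_64): base tools plus the Slurm server
yum -y install ohpc-base ohpc-slurm-server
systemctl enable munge slurmctld
systemctl start munge slurmctld
# On each Raspberry Pi compute node (CentOS, aarch64): the Slurm client side
yum -y install ohpc-base-compute ohpc-slurm-client
systemctl enable munge slurmd
systemctl start munge slurmd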
Once my cluster was set up by the Ansible[7] playbooks, I could start submitting jobs to the resource manager. In my case that is Slurm[8], the instance in the cluster that decides when and where my jobs run. One way to start a simple job on the cluster is:
[ohpc@centos01 ~]$ srun hostname
calvin
If I need more resources, I can tell Slurm that I want to run my command on 8 CPUs:
[ohpc@centos01 ~]$ srun -n 8 hostname
hobbes
hobbes
hobbes
hobbes
calvin
calvin
calvin
calvin
In the first example, Slurm ran the specified command (hostname) on a single CPU, and in the second example it ran the command on eight CPUs. One of my compute nodes is named calvin and the other is named hobbes; their names appear in the output of the commands above. Each compute node is a Raspberry Pi 3 with four CPU cores.
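A quick way to verify that Slurm actually sees both nodes and all eight cores is to query the resource manager directly, for example:
# list every node with its state and CPU count
[ohpc@centos01 ~]$ sinfo -N -l
# show the full configuration Slurm recorded for a single node
[ohpc@centos01 ~]$ scontrol show node calvin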
Another way to submit a job to my cluster is the sbatch command, which runs a script and writes the output to a file instead of to my terminal:
[ohpc@centos01 ~]$ cat script1.sh
#!/bin/sh
date
hostname
sleep 10
date
[ohpc@centos01 ~]$ sbatch script1.sh
Submitted batch job 101
This creates an output file named slurm-101.out with the following content:
Mon 11 Dec 16:42:31 UTC 2017
calvin
Mon 11 Dec 16:42:41 UTC 2017
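sbatch accepts the same resource options as srun, and the name of the output file can be chosen freely. As a small illustration (the file name pattern here is just an example), the following submits the script with eight tasks and an output file named after the job ID, and squeue then lists the job while it is queued or running:
[ohpc@centos01 ~]$ sbatch -n 8 -o job-%j.out script1.sh
[ohpc@centos01 ~]$ squeue -u ohpc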
A simple serial command-line tool is enough to demonstrate the basic functionality of the resource manager, but it is a bit boring after doing all the work to set up an HPC-like system.
A more interesting application is running an Open MPI[9] parallel job on all the available CPUs in the cluster. I used an application based on Conway's Game of Life[10], which was featured in a video called "Running Conway's Game of Life Across Architectures with Red Hat Enterprise Linux"[11]. Unlike earlier MPI-based implementations of Game of Life, the version running on my cluster colors the cells of each participating host differently. The following script starts the application interactively with graphical output:
$ cat life.mpi
#!/bin/bash
# load the GNU compiler and Open MPI environment modules provided by OpenHPC
module load gnu6 openmpi3
# srun starts this script once per allocated task; only task 0 launches mpirun
if [[ "$SLURM_PROCID" != "0" ]]; then
    exit
fi
mpirun ./mpi_life -a -p -b
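For this to work, the mpi_life binary has to be built with the same toolchain and MPI stack that the script loads. A minimal sketch, assuming a hypothetical source file name of mpi_life.c:
# load the GNU 6 compiler and Open MPI 3 modules, then build with the MPI compiler wrapper
$ module load gnu6 openmpi3
$ mpicc -o mpi_life mpi_life.c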
I used the following command to start the job, telling Slurm to allocate 8 CPUs for this job:
$ srun -n 8 --x11 life.mpi
For demonstration purposes, this job has a graphical interface that displays the current computation results:
The red cells were computed on one of the compute nodes, while the green cells were computed on the other. I can also tell the Game of Life application to use a different color for each CPU core involved (each compute node has four cores), which results in output like this:
Thanks to the software packages and installation recipes provided by OpenHPC, I was able to set up an HPC-style system with two compute nodes and one master node. I can submit jobs to the resource manager, and I can use the software provided by OpenHPC to start MPI applications that use all the CPUs of my Raspberry Pis.
To learn more about building Raspberry Pi clusters with OpenHPC, attend Adrian Reber's talks at DevConf.cz 2018[12], January 26-28 in Brno, Czech Republic, and at CentOS Dojo 2018[13], February 2 in Brussels.
About the Author
Adrian Reber — Adrian is a Senior Software Engineer at Red Hat who started migrating processes into high-performance computing environments back in 2010. Since then he has migrated many processes, earned a PhD for it, and after joining Red Hat moved on to migrating containers. Occasionally, he still migrates individual processes and remains very interested in high-performance computing. More information about Adrian can be found here[14].
via: https://opensource.com/article/18/1/how-build-hpc-system-raspberry-pi-and-openhpc
Author: Adrian Reber[14] Translator: qhwdw Proofreader: wxy
This article was originally translated by LCTT and is proudly presented by Linux China.