Running Powerflow on Ubuntu with SLURM and Infiniband

This is a walkthrough of my work on running a proprietary computational fluid dynamics code on the snap version of SLURM over Infiniband. This time, I’ll take you through what it takes to get Powerflow to run on Ubuntu 18.04. If you’d like to try out the same thing with STARCCM+, here is a link to a post that takes you through that.

You can use this to perform scaling studies, track down issues, optimize performance, or whatever else you like. Much of this will work on other OSes too.

This is the workbench used here:

Hardware: 2 hosts, each with 2×20 cores and 187 GB RAM
Infiniband: Mellanox MT28908 Family [ConnectX-6]
OS: Linux 4.15.0-109-generic (x86_64), Ubuntu 18.04.4
SLURM: 20.04 (https://snapcraft.io/slurm)
OpenMPI: 4.0.4 (ucx, openib)
Powerflow: 6.2019
A reference model that is small enough for your computers yet large enough to run across 2 nodes on your available cores.

I use Juju to deploy my SLURM clusters in any cloud to get up and running. In this case, I use MAAS as the underlying cloud, but this would work on other clouds as well.

Let’s get started.

Modify ulimits on all nodes.

This is done by editing /etc/security/limits.d/30-slurm.conf

* soft nofile  65000
* hard nofile  65000
* soft memlock unlimited
* hard memlock unlimited
* soft stack unlimited
* hard stack unlimited
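
After logging out and back in on a node, you can quickly confirm that the new limits took effect:

$ ulimit -n -l -s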

Modify the slurm systemd unit files to make the ulimits permanent for the slurmd processes:

$ sudo systemctl edit snap.slurm.slurmd.service

[Service]
LimitNOFILE=131072
LimitMEMLOCK=infinity
LimitSTACK=infinity

* Restart slurm on all nodes.

$ sudo systemctl restart snap.slurm.slurmd.service
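
You can verify that the systemd overrides took effect with:

$ systemctl show -p LimitNOFILE -p LimitMEMLOCK -p LimitSTACK snap.slurm.slurmd.service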

* Make sure login nodes have the correct ulimits after a login.

* Validate that all worker nodes also have the correct ulimit values when going through slurm. For example:

$ srun -N 1 --pty bash
$ ulimit -a
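
To check all worker nodes in one go without an interactive shell, something like this also works on my 2-node cluster:

$ srun -N 2 --ntasks-per-node=1 bash -c 'hostname; ulimit -n -l -s'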

All ulimit settings must be consistent across the cluster or things will go sideways. Remember that slurm propagates ulimits from the submitting node, so make sure those are consistent there too.
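
For reference, this propagation is controlled by the PropagateResourceLimits parameter in slurm.conf, and ALL is the default. If you would rather have jobs pick up the limits configured on the compute nodes themselves, you can turn propagation off (where slurm.conf lives depends on your snap setup):

PropagateResourceLimits=NONE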

I’m going to assume you have installed Powerflow on all your nodes at /software/powerflow/6.2019, but you can have it wherever you like.

Powerflow also needs csh, so install it:

sudo apt install csh

Modify the Powerflow installation

Since Powerflow doesn’t yet support Ubuntu (which is a shame), we need to work around a few small bugs to get our simulation running the way we want.

Workaround #1 – Incorrect awk path

Powerflow assumes that a few OS commands are located at fixed paths on all OSes, which is of course a bug. It lives in the file “/software/powerflow/6.2019/dist/generic/scripts/exawatcher” and causes problems on Ubuntu, where awk is in /usr/bin rather than /bin.

To fix this, you can edit the exawatcher script and comment out the hard-coded paths:

#set awk=/bin/awk
#set cp=/bin/cp
#set date=/bin/date
#set rm=/bin/rm
#set sleep=/bin/sleep
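
If you would rather script that edit than make it by hand, a sed one-liner along these lines should do it (double-check that the lines in your Powerflow version match first):

sudo sed -i -E 's/^set (awk|cp|date|rm|sleep)=/#&/' /software/powerflow/6.2019/dist/generic/scripts/exawatcher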

As an ugly alternative, you can instead create a symlink for “awk”, which is enough to work around the bug since awk is the only one of those commands missing from /bin on Ubuntu. Hopefully this will be fixed in future versions of Powerflow.

sudo ln -s /usr/bin/awk /bin/awk

This is not needed on OSes such as CentOS 6 and CentOS 7, which already have those symlinks in place.

Workaround #2 – bash is not sh

Powerflow has an incorrect script header, referencing “#!/bin/sh” for code that is in fact bash. On Ubuntu, where /bin/sh is dash, this results in a syntax error.

Replace the #!/bin/sh header with #!/bin/bash in the file: /software/powerflow/6.2019/dist/generic/server/pf_sim_cp.select
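
This too can be scripted, assuming the install path used throughout this post:

sudo sed -i '1s|^#!/bin/sh|#!/bin/bash|' /software/powerflow/6.2019/dist/generic/server/pf_sim_cp.select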

That is really all there is to it. It’s time to run Powerflow through SLURM.

Time to write the job-script

#!/bin/bash
#SBATCH -J powerflow_ubuntu
#SBATCH -A erik_lonroth
#SBATCH -e slurm_errors.%J.log
#SBATCH -o slurm_output.%J.log
#SBATCH -N 2
#SBATCH --ntasks-per-node=40
#SBATCH --exclusive
#SBATCH --partition debug

export LC_ALL="en_US.utf8"
RELEASE="6.2019"
hosttype="x86_64-unknown-linux"
INSTALLPATH="/software/powerflow/${RELEASE}"

export PATH="$INSTALLPATH/bin:$PATH"
export LD_LIBRARY_PATH="$INSTALLPATH/dist/x86_linux/lib:$INSTALLPATH/dist/x86_linux/lib64"

# Set a low number of timesteps since we are only here to test
NUM_TIMESTEPS=100

export EXACORP_LICENSE=27007@license.server.com


exaqsub \
-decompose \
-infiniband \
-num_timesteps $NUM_TIMESTEPS \
-foreground \
--slurm \
-simulate \
-nprocs $(expr $SLURM_NPROCS - 1) \
--mme_checkpoint_at_end \
*.cdi

You will probably need to modify the script above for your own environment, but the general pieces are all there. An important note: Powerflow normally needs a separate node with more memory to run its “Control Process” (CP) on, compared to the “Simulation Processor” (SP) nodes. I’m not taking that into account here, since my example job is small and fits in RAM. This is also why I get away with setting:

-nprocs $(expr $SLURM_NPROCS - 1) \

Powerflow will decompose the simulation into N-1 partitions, which leaves 1 CPU free for running the “CP” process when the simulation starts. This is suboptimal, but unless we do this, slurm will complain with:

srun: error: Unable to create step for job 197: More processors requested than permitted

There is probably a smart way of telling slurm about a master process, which I hope to learn about soon and use to run Powerflow properly with a separate “CP” node.
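
One candidate might be SLURM’s heterogeneous job support, where a single submission requests differently-shaped components. Here is a minimal sketch of the idea; it is untested with Powerflow’s exaqsub wrapper, so treat it as a starting point rather than a recipe (SLURM 20.02+ uses the hetjob directive, older releases called it packjob):

#!/bin/bash
# Component 0: a single task with lots of memory for the Control Process (CP)
#SBATCH -N 1 --ntasks=1
#SBATCH hetjob
# Component 1: the Simulation Processes (SP)
#SBATCH -N 1 --ntasks=40

Inside the script, srun can then target each component with --het-group=0 and --het-group=1 respectively.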

Submit to slurm

Submitting the script is simply:

$ sbatch ./powerflow-on-ubuntu.sh

You can then follow it in the queue with squeue -p debug.

You can watch your Infiniband counters to see that a significant amount of traffic is being sent over the wire, which indicates that you have succeeded:

watch -d cat /sys/class/infiniband/mlx5_0/ports/1/counters/port_rcv_packets
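
If you want an actual rate rather than a raw counter, a small shell loop like this one (a quick sketch, nothing Powerflow-specific) prints received packets per second:

C=/sys/class/infiniband/mlx5_0/ports/1/counters/port_rcv_packets
prev=$(cat "$C")
while sleep 1; do
  cur=$(cat "$C")
  echo "$((cur - prev)) pkts/s"
  prev=$cur
done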

You can also inspect a status file that Powerflow continuously writes to the working directory of the simulation:

# Lets look at the status of the simulation
$ cat .exa_jobctl_status
Decomposer: Decomposing scale 6

# ... again
$ cat .exa_jobctl_status
Simulator: Initializing voxels [43% complete [21947662 of 51040996]

# ... and once the simulation is complete.
$ cat .exa_jobctl_status
Simulator: Finished 100 Timesteps
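
Since the file is rewritten as the job progresses, watch saves you the repeated cat:

$ watch -n 10 cat .exa_jobctl_status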

I have presented at Ubuntu Masters about the setup I use to work with my systems, which allows me to do things like this easily. Here is a link to that material: https://youtu.be/SGrRqCuiT90
