I was using the OpenFOAM MPI modules today and hit Red Hat Bug 1408316, except that the 15s hang caused OpenFOAM to die hard, taking my simulation with it. Fortunately the fix is quite easy and is fully described in comment five of the bug report: compile and install libfabric 1.4.0 or newer.
I grabbed the SRPM from Fedora 25 and recompiled it via Mock for CentOS7:
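For anyone repeating this, a Mock rebuild looks roughly like the sketch below. The SRPM file name and the `epel-7-x86_64` Mock config are assumptions for illustration, not necessarily what I used, so substitute the file you actually downloaded and a config that exists under /etc/mock on your build host. The helper function `mock_rebuild_cmd` is hypothetical and only prints the command so you can check it before running it.

```shell
# Hypothetical SRPM file name; use the one actually downloaded from Fedora 25.
SRPM="libfabric-1.4.0-1.fc25.src.rpm"
# Assumed Mock chroot config for CentOS 7; check /etc/mock for what is available.
MOCK_CFG="epel-7-x86_64"

# Print the rebuild command instead of running it (a dry run).
mock_rebuild_cmd() {
    printf 'mock -r %s --rebuild %s\n' "$1" "$2"
}

mock_rebuild_cmd "$MOCK_CFG" "$SRPM"
# To build for real, run the printed command as a user in the "mock" group.
# By default Mock writes the resulting RPMs to /var/lib/mock/<config>/result/.
```

The binary RPMs from the result directory can then be installed on the compute nodes with yum or rpm.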
According to the bug report, this is targeted to be fixed in RHEL 7.4, but in the meantime the above RPMs fixed the issue on my CentOS 7 compute nodes.
I’ll get around to packaging these into a proper repo in a later post.