`ucx` not necessarily occupy all the cores at all times even bind by core when using openmpi

root@epyc:~# uname -a
Linux epyc.node2 4.19.0-18-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64 GNU/Linux
root@epyc:~# ldd --version
ldd (Debian GLIBC 2.28-10) 2.28
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

In cases of quantum espresso

mpirun -hostfile ../AUSURF112/host --mca pml ucx --mca btl sm,rc,ud,self --mca btl_tcp_if_include 192.168.10.0/24 --bind-to core -x PATH -x LD_LIBRARY_PATH -x OMP_NUM_THREADS=1 -np 256 /home/qe/sb/bin/pw.x -nk 4 -nd 64 -i ./grir443.in > 11_7_out256 2> 11_7_out256err

It would dead for a while

The bug is reported by pthread_rwlock.
78F81AA7630E54DC102C3E66FCF43E97

diff --git 2.28/nptl/pthread_rwlock_common.c 2.29/nptl/pthread_rwlock_common.c
index a290d08332..81b162bbee 100644
--- 2.28/nptl/pthread_rwlock_common.c
+++ 2.29/nptl/pthread_rwlock_common.c
@@ -310,6 +310,7 @@ __pthread_rwlock_rdlock_full (pthread_rwlock_t *rwlock,
          if (atomic_compare_exchange_weak_relaxed
              (&rwlock->__data.__readers, &r, r | PTHREAD_RWLOCK_RWAITING))
            {
+              r |= PTHREAD_RWLOCK_RWAITING;
              /* Wait for as long as the flag is set.  An ABA situation is
                 harmless because the flag is just about the state of
                 __readers, and all threads set the flag under the same