Fixing spack install gromacs+cuda+mpi when it is not compatible with CUDA 11

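As a temporary workaround, you can download the upstream release that fixes the CUDA 11 incompatibility from here and place it at /path/to/spack/var/spack/cache/_source-cache/archive/cd/cd12bd4977e19533c1ea189f1c069913c86238a8a227c5a458140b865e6e3dc5.tar.gz, so that Spack finds the tarball in its source cache. The file name is the SHA-256 checksum of the tarball, and the directory name (cd) is its first two characters.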

You can get the checksum with sha256sum gromacs-prepare-2020.4.tar.gz. You also need to update the 2020.4 checksum in package.py under /path/to/spack/var/spack/repos/builtin/packages/gromacs/, so that the entry reads version('2020.4', sha256='cd12bd4977e19533c1ea189f1c069913c86238a8a227c5a458140b865e6e3dc5').
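The checksum, cache path, and version line can also be derived in one go. This is only a minimal sketch: the helper names (sha256_of, cache_path, version_directive) are made up for illustration, and the cache layout is assumed to be exactly the one shown in the path above (first two hex characters of the checksum as the directory name).

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_path(spack_root, digest, suffix=".tar.gz"):
    """Build the source-cache path (assumed layout: archive/<first 2 hex>/<digest><suffix>)."""
    archive = Path(spack_root) / "var" / "spack" / "cache" / "_source-cache" / "archive"
    return archive / digest[:2] / (digest + suffix)


def version_directive(version, digest):
    """Format the line to put into the gromacs package.py."""
    return f"version('{version}', sha256='{digest}')"


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python checksum.py /path/to/spack gromacs-prepare-2020.4.tar.gz
    spack_root, tarball = sys.argv[1], sys.argv[2]
    digest = sha256_of(tarball)
    print("copy tarball to:", cache_path(spack_root, digest))
    print("package.py line:", version_directive("2020.4", digest))
```

It only prints the target path and the directive; copying the tarball and editing package.py are still done by hand.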

Then you can simply run spack install gromacs@2020.4+mpi+cuda

How to measure PCIe latency

The kernel module is provided as a .ko file (pcie-lat.ko).

The script is

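# Clone the pcie-lat repository and enter it
git clone
cd pcie-lat/
# Find the Ethernet device to measure
lspci|grep Eth
# First attempt: load the module with the vendor ID only
insmod ./pcie-lat.ko ids=8086
# Look up the vendor:device ID of the device at 02:00.0
lspci -nn -s 2:00.0
ruby -v
# Unbind the device from its current driver
echo 0000:02:00.0 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind
# Reload the module with the full vendor:device ID
rmmod pcie-lat
insmod ./pcie-lat.ko ids=8086:1533
# Measure BAR 0 at offset 0x0 with increasing loop counts
ruby measure.rb -p 02:00.0 -l 10000 -b 0 -o 0x0
ruby measure.rb -p 02:00.0 -l 100000 -b 0 -o 0x0
ruby measure.rb -p 02:00.0 -l 1000000 -b 0 -o 0x0
# Unload the module when done
rmmod pcie-lat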
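The commands above write the same device address in three forms (2:00.0, 02:00.0 and 0000:02:00.0), while the sysfs unbind path needs the full domain:bus:device.function form. A small sketch of that normalization, assuming domain 0000 when none is given; normalize_bdf, unbind_path and unbind are hypothetical helpers and not part of pcie-lat:

```python
import re

_BDF = re.compile(
    r"(?:([0-9a-fA-F]{1,4}):)?([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2})\.([0-7])"
)


def normalize_bdf(addr):
    """Expand '2:00.0' or '02:00.0' to the full form '0000:02:00.0'."""
    match = _BDF.fullmatch(addr.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {addr!r}")
    domain, bus, dev, func = match.groups()
    # Assumption: a missing domain means domain 0000.
    return f"{int(domain or '0', 16):04x}:{int(bus, 16):02x}:{int(dev, 16):02x}.{func}"


def unbind_path(addr):
    """sysfs file that detaches the device from its current driver."""
    return f"/sys/bus/pci/devices/{normalize_bdf(addr)}/driver/unbind"


def unbind(addr):
    """Same effect as the echo command above; needs root and a bound driver."""
    with open(unbind_path(addr), "w") as fh:
        fh.write(normalize_bdf(addr))


if __name__ == "__main__":
    # Only prints the path; call unbind() yourself to actually detach the device.
    print(unbind_path("2:00.0"))
```

The unbind file only exists while a driver is bound to the device, so unbind() raises an error once the device is already detached.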

The result is

writing 3σ values (in ns) to file...
root@s3l-thinkstation ~/pcie-lat (master)# ruby measure.rb -p 02:00.0 -l 10000 -b 0 -o 0x0
TSC freq:     2294609000.0 Hz
TSC overhead: 28 cycles
Device:       02:00.0
BAR:          0
Offset:       0x0
Loops:        10000

       | Results (10000 samples)
Mean   |   3764.24 cycles |   1640.47 ns
Stdd   |    314.56 cycles |    137.08 ns

       | 3σ Results (9995 samples, 0.001% discarded)
Mean   |   3759.46 cycles |   1638.39 ns
Stdd   |     64.10 cycles |     27.93 ns
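To read the table: the ns column is the cycle count divided by the TSC frequency, and the 3σ rows are the statistics after dropping outliers. A rough sketch of both steps follows. The conversion is plain arithmetic, but the filter is only an assumption about what measure.rb does (samples further than 3 standard deviations from the mean are discarded, using the population standard deviation); the function names are illustrative.

```python
import statistics


def cycles_to_ns(cycles, tsc_hz):
    """Convert a TSC cycle count to nanoseconds."""
    return cycles / tsc_hz * 1e9


def three_sigma_filter(samples):
    """Keep samples within mean +/- 3 standard deviations (assumed outlier rule)."""
    mean = statistics.mean(samples)
    sigma = statistics.pstdev(samples)
    return [s for s in samples if abs(s - mean) <= 3 * sigma]


if __name__ == "__main__":
    tsc_hz = 2294609000.0  # "TSC freq" from the output above
    print(f"{cycles_to_ns(3764.24, tsc_hz):.2f} ns")  # prints 1640.47 ns

    # Toy data: 99 ordinary samples and one large outlier.
    samples = [100] * 99 + [10000]
    kept = three_sigma_filter(samples)
    print(len(samples) - len(kept), "sample(s) discarded")
```

In the run above the filter barely moves the mean (1640.47 ns to 1638.39 ns) but cuts the standard deviation from 137.08 ns to 27.93 ns, so the few discarded samples account for most of the spread.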