MOLECULAR DYNAMICS PERFORMANCE GUIDE - Digital Research Alliance of Canada

BENCHMARK DETAILS

ID=135

  • Dataset: 6n4o
  • Software: NAMD2.cuda.ucx (namd-ucx-smp/2.14-iccifortcuda-2020.1.114-avx2)
  • Resource: 2 tasks, 12 cores per task, 1 node, 2 GPUs, with NVLink
  • CPU: EPYC 7413 (Milan), 2.65 GHz
  • GPU: NVIDIA-A100-SXM4-40GB, 12 cores/GPU
  • Simulation speed: 6.77558 ns/day
  • Efficiency: 39.1 %
  • Site: Narval
  • Date: Jan. 28, 2022, 1:41 a.m.
  • Submission script:

    #!/bin/bash
    #SBATCH --ntasks=2 --nodes=1 -A def-svassili
    #SBATCH --cpus-per-task=12
    #SBATCH --gpus-per-node=2
    #SBATCH --mem-per-cpu=2000 --time=3:0:0
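    # Usage: sbatch $0 number_of_steps
    INPFILE=namd.in                 # NAMD input without a numsteps line

    STEPS=$1                        # number of MD steps to run
    TMPFILE=tf_${SLURM_JOBID}       # NAMD log for this job
    RUN_IN=run_${SLURM_JOBID}.in    # per-job copy of the input file
    cp "${INPFILE}" "${RUN_IN}"
    echo "numsteps ${STEPS}" >> "${RUN_IN}"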
    ml StdEnv/2020 cuda/11.4 namd-ucx-smp/2.14

    echo ${SLURM_NODELIST} running on ${SLURM_CPUS_PER_TASK} cores
    grep "model name" /proc/cpuinfo | uniq

    # Leave one core per task for the communication thread
    NUM_PES=$(( SLURM_CPUS_PER_TASK - 1 ))
    srun --mpi=pmi2 namd2 ++ppn $NUM_PES +idlepoll ${RUN_IN} > ${TMPFILE}
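
The `++ppn` value in the script above is one less than `--cpus-per-task`, so each task keeps one core free for the SMP communication thread. A minimal sketch of that arithmetic with an added sanity check follows; the function name `worker_threads` and the check itself are illustrative assumptions and were not part of the benchmarked script.

```shell
#!/bin/bash
# Hypothetical helper (not in the benchmarked script): number of worker
# threads per task when one core is left for the communication thread.
worker_threads() {
    local cpus_per_task=$1
    # Reject missing, non-numeric, or too-small values before doing arithmetic.
    if ! [[ $cpus_per_task =~ ^[0-9]+$ ]] || (( cpus_per_task < 2 )); then
        echo "cpus-per-task must be an integer >= 2" >&2
        return 1
    fi
    echo $(( cpus_per_task - 1 ))
}

# Falls back to the 12 cores per task requested above when run outside Slurm.
worker_threads "${SLURM_CPUS_PER_TASK:-12}"
```

With 12 cores per task this yields 11 worker threads per task, which is the value passed to `++ppn` in this benchmark.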

  • Notes:

    *** INCONSISTENT PERFORMANCE ***

    GPU utilization sampled every second with:

    nvidia-smi --query-gpu=index,utilization.gpu --format=csv -l 1

    Jobs: 2061493, 2061495
    index, utilization.gpu [%]
    0, 1 %
    1, 2 %
    0, 2 %
    1, 1 %
    0, 2 %
    1, 2 %
    0, 3 %
    1, 3 %
    0, 2 %
    1, 1 %
    0, 1 %
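
A log like the one above is easier to compare between jobs when reduced to one number per GPU. The following is a sketch only: it assumes the CSV output of the `nvidia-smi` command shown in the notes is piped in or saved to a file, and the function name `mean_gpu_util` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: mean utilization per GPU index, computed from the
# output of `nvidia-smi --query-gpu=index,utilization.gpu --format=csv`.
# Reads the files given as arguments, or standard input if there are none.
mean_gpu_util() {
    awk -F', ' '
        $1 ~ /^[0-9]+$/ {            # skip the header and any non-sample lines
            sub(/ %$/, "", $2)       # "2 %" -> "2"
            sum[$1] += $2; n[$1]++
        }
        END {
            for (i in sum)
                printf "GPU %s: %.2f %% over %d samples\n", i, sum[i] / n[i], n[i]
        }' "$@" | sort
}

# Samples copied from the notes above.
mean_gpu_util <<'EOF'
index, utilization.gpu [%]
0, 1 %
1, 2 %
0, 2 %
1, 1 %
0, 2 %
1, 2 %
0, 3 %
1, 3 %
0, 2 %
1, 1 %
0, 1 %
EOF
```

For the samples listed in the notes, both GPUs average about 1.8 % utilization, which matches the low efficiency reported for this run.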

  • Simulation input file:

    # Amber/(t,s,x)leap generated parm and crd file
    parmfile prmtop.parm7
    ambercoor inpcrd.rst7
    # Input
    bincoordinates equilibration.coor
    binvelocities equilibration.vel
    extendedsystem equilibration.xsc
    # Output
    restartfreq 10000
    dcdfreq 10000
    outputEnergies 10000
    outputPressure 10000
    outputname equilibration
    # Number of steps
    # numsteps 10000
    # AMBER Force Field settings
    amber on
    rigidBonds all
    useSettle on
    rigidTolerance 1.0e-8
    cutoff 9.0
    pairlistdist 11.0
    switching off
    exclude scaled1-4
    readexclusions yes
    1-4scaling 0.83333333
    scnb 2.0
    zeromomentum on
    ljcorrection on
    # Integrator Parameters
    timestep 1.0
    nonbondedFreq 1
    fullElectFrequency 1
    stepspercycle 10
    wrapAll on
    # Temperature control
    langevin on
    langevinTemp 300
    langevinDamping 1.0
    # Pressure control
    useGroupPressure yes
    LangevinPiston on
    LangevinPistonTarget 1.0
    LangevinPistonPeriod 200
    LangevinPistonDecay 100
    LangevinPistonTemp 300
    # PME settings
    PME on
    PMEGridSizeX 144
    PMEGridSizeY 144
    PMEGridSizeZ 144
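
With `timestep 1.0` (in fs), the reported 6.77558 ns/day corresponds to roughly 78 steps per second. The sketch below estimates how many steps fit into a given wall-time limit, which can help when choosing the `number_of_steps` argument of the submission script. The function name `steps_in_walltime` is hypothetical, and startup and output overhead are ignored.

```shell
#!/bin/bash
# Hypothetical helper: number of MD steps that fit into a wall-time limit.
#   $1 = simulation speed in ns/day
#   $2 = timestep in fs
#   $3 = wall time in seconds
steps_in_walltime() {
    local ns_per_day=$1 timestep_fs=$2 seconds=$3
    # ns/day -> fs/day (x 1e6) -> steps/day (/ timestep) -> steps in the limit
    awk -v s="$ns_per_day" -v dt="$timestep_fs" -v t="$seconds" \
        'BEGIN { printf "%d\n", s * 1e6 / dt * t / 86400 }'
}

# Speed and timestep from this benchmark, 3-hour limit from the script.
steps_in_walltime 6.77558 1.0 10800
```

At the measured speed, the 3-hour limit covers about 847,000 steps (about 0.85 ns). Because performance for this benchmark was inconsistent, the figure should be treated as a rough guide rather than a guarantee.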