GROMACS example for Salk
This example was derived from the gmxdemo file given at www.gromacs.org, and simulates Molecular Dynamics (MD) using the GROMACS software package.
The demo performs a complete MD simulation of a small peptide in water. The only input files needed are a pdb file of a small peptide and three mdp files containing run parameters for different steps of the example.
If you have any questions about this demonstration, please email PSC User Services, or check the resources on the website: http://www.gromacs.org.
The mdrun executables are compiled specifically for the compute nodes. mdrun must be launched with a mpirun command.. All other GROMACS executables are compiled to run on both the compute and front-end nodes.
This document is organized as follows. First a summary of the commands needed to perform the MD are given. Links from each command lead to an explanation of that step. The first three steps set up the run time environment, and the remainder perform the simulation.
Summary of commands (4 processor run)
1. qsub -I -q debug -l ncpus=4 -l walltime=30:00 2. set GROMACS_BIN=/usr/local/packages/gromacs/bin 3. cd working-directory 4. echo 0 | pdb2gmx -f cpeptide.pdb -o cpeptide.gro -p cpeptide.top >& ! output.pdb2gmx 5. editconf -f cpeptide.gro -o cpeptide.gro -d 0.5 >& ! output.genbox 6. genbox -cp cpeptide.gro -cs -o cpeptide_b4em.gro -p cpeptide.top >>& ! output.genbox 7. grompp -np $PBS_NCPUS -f em -c cpeptide_b4em -p cpeptide -o cpeptide_em >& ! output.grompp_em 8. mpirun $GROMACS_BIN/mdrun -np $PBS_NCPUS -s cpeptide_em -o cpeptide_em -c cpeptide_b4pr -v >& ! output.mdrun_em 9. grompp -np $PBS_NCPUS -f pr -c cpeptide_b4pr -r cpeptide_b4pr -p cpeptide -o cpeptide_pr >& ! output.grompp_pr 10. mpirun $GROMACS_BIN/mdrun -np $PBS_NCPUS -s cpeptide_pr -o cpeptide_pr -c cpeptide_b4md -v >& ! output.mdrun_pr 11. grompp -np $PBS_NCPUS -f md -c cpeptide_b4md -p cpeptide -o cpeptide_md >& ! output.grompp_md 12. mpirun $GROMACS_BIN/mdrun -np $PBS_NCPUS -s cpeptide_md -o cpeptide_md -c cpeptide_after_md -v >& ! output.mdrun_md
Please note, as stated above, steps 8, 10, and 12 need to be launched with the mpirun command. The mdrun executables run on only Salk's compute nodes.
Explanation of Demo
Setting up the run environment
1. Obtain an interactive session.
To begin the demo, obtain an interactive debugging session. Ask for 4 processors and a thirty minute wall time.
qsub -I -q debug -l ncpus=4 -l walltime=30:00
2. Set up the gromacs run time environment
Define the directory where the GROMACS binaries live. This will just make subsequent commands a little shorter to type.
set GROMACS_BIN=/usr/local/packages/gromacs/bin
3. Move to the submission directory.
Move to the directory where the qsub command was excecuted from. Otherwise, you are in your home directory by default.
Running the simulation
4. Generate a topology file
Before we can start any simulation, we need a molecular toplogy file. This topology file (.top extension) is generated by the program pdb2gmx. The peptide pdb file (.pdb extension) is the input for pdb2gmx.
Because most pdb files do not contain all hydrogen atoms, the pdb2gmx program will also add them to our peptide. The output file containing the structure of the peptide when hydrogen atoms are added is a gromos structure file (.gro extension).
echo 0 | pdb2gmx -f cpeptide.pdb -o cpeptide.gro -p cpeptide.top >& ! output.pdb2gmx
Input files: cpeptide.pdb
Output files: cpeptide.gro, cpeptide.top, output.pdb2gmx
5. Define the water box size
Because a simulation of a peptide in vacua is a bit unrealistic, we have to solvate our peptide in a box of water. Use genbox to do this.
First we will use the program editconf to define the right box size for our system.
editconf -f cpeptide.gro -o cpeptide.gro -d 0.5 >& ! output.genbox
Input files: cpeptide.gro from step 4
Output files: cpeptide.gro, output.genbox
6. Solvate the peptide
The genbox program reads the peptide structure file and the input file containing the size of the desired water box. The output of genbox is a gromos structure file of a peptide solvated in a box of water. genbox also changes the topology file (.top extension) to include water.
genbox -cp cpeptide.gro -cs -o cpeptide_b4em.gro -p cpeptide.top >>& ! output.genbox
Input files: cpeptide.gro from step 5
Output files: cpeptide_b4em.gro, cpeptide.top, output.genbox
7. Energy minimization, step 1: preprocess the input files.
In principle, we can start a Molecular Dynamics simulation now. However, it is not very wise to do so, because our system is full of close contacts. These close contacts are mainly a result of the genbox program. The added solvent might have some close contacts with the peptide resulting in very high repulsive energies. If we start a Molecular Dynamics (MD) simulation without energy minimization, the system would be unstable because of these high energies.
The standard procedure to remove these close contacts is Energy Minimization (EM). Energy minimization slightly changes the coordinates of our system to remove high energies from our system.
Before we can start the Energy Minimization, we have to preprocess all the input files with the GROMACS preprocessor grompp. grompp preprocesses the topology file (.top), the structure file (.gro) and a parameter file (.mdp) resulting in a binary topology file (.tpr). This binary topology file contains all information for a simulation (in this case, an energy minimization).
NOTE: the variable -np must specify the same number of processors with which you intend to run mdrun. Here it is set to the variable $PBS_NCPUS so this command does not change even when a different number of processors is used.
grompp -np $PBS_NCPUS -f em -c cpeptide_b4em -p cpeptide -o cpeptide_em >& ! output.grompp_em
Input files: cpeptide.top from step 6, cpeptide_b4em.gro from step 6, and em.mdp.
Output files: cpeptide_em.tpr, mdout.mdp, and output.grompp_em8. Energy minimization, step 2
Now that the binary topology file is generated, we can start the energy minimization (EM). The program which performs the EM is called mdrun. In fact, all simulations are performed by the same mdrun program.
As the Energy Minimization is running, watch the output of the program. The first number (from left to right) is the number of the iteration step. The second number is the step size, which gives an indication of the change in the system. The third number is the potential energy of the system. This number starts at a high value, rapidly drops and converges to a large negative value.
mpirun $GROMACS_BIN/mdrun -np $PBS_NCPUS -s cpeptide_em -o cpeptide_em -c cpeptide_b4pr -v >& ! output.mdrun_em
Input files: cpeptide_em.tpr from step 7
Output files: cpeptide_em.trr, cpeptide_b4pr.gro, output.mdrun_em, ener.edr, md0.log, md1.log, md2.log, md3.log.
9. Position Restrained MD, step 1: generate the binary topology file
Once all close contacts are removed from the system, we want to do molecular dynamics of the water molecules, and keep the peptide fixed. This is called position restrained (PR) MD.
Position Restrained MD keeps the peptide fixed and lets all water molecules equilibrate around the peptide in order to fill holes, etc., which were not filled by the genbox program.
First we must preprocess the input files to generate the binary topology. The input files are: the topology file, the structure file (output of the EM), a parameter file, and an index file.
By default, our system is split into several groups. In this case, we use two of those groups, Protein and SOL(vent), to put position restraints on all the atoms of the peptide.
The parameter file (.mdp extension) contains information about the PR-MD such as step size, number of steps, temperature, etc. This parameter file also tells GROMACS what kind of simulation should be performed (e.g., EM, PR-MD and MD etc.)
grompp -np $PBS_NCPUS -f pr -c cpeptide_b4pr -r cpeptide_b4pr -p cpeptide -o cpeptide_pr >& ! output.grompp_pr
Input files: pr.mdp, cpeptide_b4pr.gro from step 8, and cpeptide.top from step 6.
Output files: cpeptide_pr.tpr, mdout.mdp, and output.grompp_pr
10. Position Restrained MD, step 2: run the simulation
Now we start the PR-MD simulation. It is important to note that in this example the simulated time is too short (1 ps) to equilibrate our system completely, but that would simply take too much time (about one day).
mpirun $GROMACS_BIN/mdrun -np $PBS_NCPUS -s cpeptide_pr -o cpeptide_pr -c cpeptide_b4md -v >& ! output.mdrun_pr
Input files: cpeptide_pr.tpr from step 9
Output files: cpeptide_pr.trr, cpeptide_b4md.gro, output.mdrun_pr, ener.edr, md0.log, md1.log, md2.log, md3.log.
11. MD Simulation, step 1: generate the binary topolgy file
Now our complete system is finally ready for the actual Molecular Dynamics simulation. We start again by preprocessing the input files with the grompp program to generate the binary topology file (.tpb/.tpr extension).
grompp -np $PBS_NCPUS -f md -c cpeptide_b4md -p cpeptide -o cpeptide_md >& ! output.grompp_md
Input files: md.mdp, cpeptide_b4md.gro from step 10, cpeptide.top from step 6
Output files: cpeptide_md.tpr, output.grompp_md and mdout.mdp
12. MD simulation, step 2: run the simulation
Now we can start the MD simulation. Watch the number of steps increasing (the total number of steps is 2500, for 5 ps).
mpirun $GROMACS_BIN/mdrun -np $PBS_NCPUS -s cpeptide_md -o cpeptide_md -c cpeptide_after_md -v >& ! output.mdrun_md
Input files: cpeptide_md.tpr from step 11
Output files: cpeptide_after_md.gro, output.mdrun_md, ener.edr, md0.log, md1.log, md2.log, md3.log.