DCDFiles

Overview

KPrepFunctions.py includes a few functions for working with DCD files. These functions use the MDAnalysis Python package, which can be difficult to install; refer to this page for instructions on installing it on the supercomputer. The functions either combine several DCD files into one (useful if you have run a single simulation as multiple jobs on the supercomputer) or shift the protein within a DCD file when it crosses box boundaries (important if you want to perform RMSD calculations or produce figures or videos). They do not depend on the input or output features of KPrep, so they can easily be copied into a separate script and modified as needed. Doing so may be necessary, because the functions were written for very specific purposes and may not work well in slightly different situations.

Combining DCD Files

Combining is done with the catDCD function in KPrepFunctions.py. The function expects four things: a topology file (PDB, CRD, etc.) of the system; the basename of the files to be combined (the part of the DCD file name before .dcd, excluding the number at the end that indicates the order of the files); the number of files to be combined; and the desired name of the combined DCD file. The simplest way to use the function is to change the variables in the script s-catDCD.py. The function expects a specific naming scheme for the DCD files, but it is very simple, so you can easily copy it into a separate Python script and modify it to allow for different naming schemes.
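As a rough illustration of what such a standalone script might look like, here is a minimal sketch. It is not the catDCD code from KPrepFunctions.py. The function names numbered_dcd_names and cat_dcd_sketch are hypothetical, and the naming scheme (basename followed directly by a number starting at 1, e.g. run1.dcd, run2.dcd) is an assumption that you should adjust to match your own files. The MDAnalysis calls rely on its ability to read a list of trajectories back to back.

```python
def numbered_dcd_names(basename, n_files, start=1):
    """Build the ordered list of DCD names: basename + index + '.dcd'.

    Assumption: numbering starts at `start` and has no separator or padding.
    """
    return [f"{basename}{i}.dcd" for i in range(start, start + n_files)]


def cat_dcd_sketch(top_file, basename, n_files, outname, start=1):
    """Concatenate numbered DCD files into one trajectory (illustrative only)."""
    # Imported here so the name helper above can be used without MDAnalysis.
    import MDAnalysis as mda

    traj_files = numbered_dcd_names(basename, n_files, start)
    # Passing a list of trajectories makes MDAnalysis read them in sequence.
    universe = mda.Universe(top_file, traj_files)
    with mda.Writer(outname, n_atoms=universe.atoms.n_atoms) as writer:
        for _ in universe.trajectory:
            writer.write(universe.atoms)
```

A different naming scheme (zero-padded numbers, an underscore before the index, and so on) would only require changing numbered_dcd_names.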

Shifting DCD Files

The function shiftDCD in KPrepFunctions.py is primarily used to undo a protein crossing box boundaries and appearing visually split. Undoing the split allows for RMSD calculations and better images and videos. The function can also be used simply to move a protein to a different location in the box. It is very slow, so I recommend using a light spring constant to keep the protein centered during the simulation, and using shiftDCD only if that does not keep the protein in the box for all frames. The parameters needed for the function are somewhat complicated, but the script s-shiftDCD.py makes them a little simpler to set. Here is a list of the variables needed:

  • top_file: the PDB or CRD file of the system simulated
  • traj_file: the DCD file to be shifted
  • outname: the name of the output DCD file
  • box_dims: an array of 6 numbers giving, in order, the lower and upper x-boundaries of the box, the lower and upper y-boundaries, and the lower and upper z-boundaries. You can have LAMMPS print these values so that you know them.
  • x_atom: information about an atom whose x-coordinate you would like to keep constant across all frames of the DCD file. Typically, the best choice is an atom near the center of the protein that does not move around much (e.g. a backbone atom rather than a sidechain atom). It is often simplest to use the same atom for x_atom, y_atom, and z_atom, but you may choose different atoms if you wish. x_atom is an array of 5 values. The information needed for the atom is usually easiest to find by combining a visualization program like VMD with a look at the PDB file itself. The values needed (in order) are:
    • move_x: a Boolean saying whether or not you actually want to shift in the x-direction using this atom as a basis. Usually you will choose "True", but there could be situations where you want to shift in some dimensions but not others
    • x_segname: The segname associated with the atom
    • x_resid: The residue ID associated with the atom
    • x_atom_name: The name of the atom
    • x_orig: The x-coordinate where you would like x_atom to stay
  • y_atom: same as x_atom, but for an atom you want to keep at the same y-coordinate in all trajectory frames. You will likely choose the same atom as for x_atom. If you do, all of the other values will be the same, but be sure to change the value of "y_orig", since it will most likely differ from x_orig.
  • z_atom: same as x_atom and y_atom. Refer to information on them for more details.
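To make the layout of these variables concrete, here is a sketch of how they might be filled in. All file names, the segname, the residue ID, and the coordinates are made-up placeholder values, and the exact types that s-shiftDCD.py expects (for example, whether the residue ID is an integer or a string) are assumptions. The helper check_shift_inputs is hypothetical and not part of KPrep; it only catches obvious layout mistakes, and its requirement that each target coordinate lie inside the box is also an assumption.

```python
# Placeholder values; replace them with those of your own system.
top_file = "system.pdb"
traj_file = "run.dcd"
outname = "run_shifted.dcd"

# [x_lower, x_upper, y_lower, y_upper, z_lower, z_upper]
box_dims = [0.0, 100.0, 0.0, 100.0, 0.0, 100.0]

# [move, segname, resid, atom_name, orig]
# The same atom is used for all three axes; only the target coordinate differs.
x_atom = [True, "PROA", 50, "CA", 48.2]
y_atom = [True, "PROA", 50, "CA", 51.7]
z_atom = [False, "PROA", 50, "CA", 50.0]  # no shifting along z


def check_shift_inputs(box_dims, x_atom, y_atom, z_atom):
    """Hypothetical sanity check on the layout of the shift parameters."""
    if len(box_dims) != 6:
        raise ValueError("box_dims needs 6 values")
    for axis, atom in enumerate((x_atom, y_atom, z_atom)):
        lower, upper = box_dims[2 * axis], box_dims[2 * axis + 1]
        if lower >= upper:
            raise ValueError("each lower boundary must be below its upper one")
        if len(atom) != 5:
            raise ValueError("each atom entry needs 5 values")
        move, _segname, _resid, _atom_name, orig = atom
        if not isinstance(move, bool):
            raise ValueError("the first value must be True or False")
        if move and not lower <= orig <= upper:
            raise ValueError("the target coordinate should lie inside the box")
    return True


check_shift_inputs(box_dims, x_atom, y_atom, z_atom)
```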

This function is not as simple or as easy to modify as catDCD, but it can still be adapted by copying the entire function into a new Python file and modifying it there. It does not rely on any of the input/output functions of KPrep.
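For anyone adapting the function, it may help to see the general idea along a single axis. The sketch below is an assumption about how this kind of shifting is commonly done, not the actual shiftDCD implementation: every coordinate in a frame is translated so that the reference atom lands on its target value, and any atom pushed outside the box is wrapped back in periodically. The function name shift_and_wrap and the example numbers are hypothetical.

```python
def shift_and_wrap(coords, ref_index, target, lower, upper):
    """Shift 1-D coordinates so coords[ref_index] equals target, then wrap.

    coords:    coordinates of all atoms along one axis for one frame
    ref_index: position in coords of the reference atom
    target:    coordinate where the reference atom should stay
    lower, upper: box boundaries along this axis
    """
    length = upper - lower
    delta = target - coords[ref_index]
    # Translate every atom by delta, then fold it back into [lower, upper).
    return [((c + delta - lower) % length) + lower for c in coords]


# A protein split across the boundary of a box spanning 0 to 10:
# the atoms at 9.5 and 0.5 are really neighbors through the periodic wall.
frame = [9.5, 0.5, 1.0]
shifted = shift_and_wrap(frame, ref_index=1, target=5.0, lower=0.0, upper=10.0)
print(shifted)  # [4.0, 5.0, 5.5]
```

This only keeps the protein whole if it is smaller than the box and the target coordinate is far enough from the box edges, which is why an atom near the center of the protein is a good reference.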