All-Atom PEGylation

Overview

Creating the files needed for all-atom PEGylation is a somewhat complex process that requires multiple scripts found in KPrep. This page explains how to create a PEGylated protein for simulation from an initial PDB file. It also explains how to create a PEG that is not conjugated to anything. The PDB files created by following these steps should always be minimized before you start your simulation. In particular, mutating a new amino acid into the protein will most likely leave that amino acid in an unrealistic orientation. Minimization will help address this and other issues.

PEGylating a Protein

Before you begin, follow the instructions on the main KPrep page here in the Lab Notebook to make sure KPrep is on your PYTHONPATH. Once that is working, you can proceed with the steps below.
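A quick way to confirm the PYTHONPATH setup is to ask Python whether it can locate the KPrep module. This is a minimal sketch; the module name "KPrepFunctions" is an assumption taken from the file name KPrepFunctions.py mentioned on this page, so adjust it if your copy of KPrep is organized differently.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(name) is not None


# Assumption: KPrepFunctions.py is importable as "KPrepFunctions".
if module_available("KPrepFunctions"):
    print("KPrep found on the module search path.")
else:
    print("KPrep not found. Directories currently searched:")
    for path in sys.path:
        print("  ", path)
```

If the module is not found, check that the directory containing KPrepFunctions.py appears in the printed list.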

Obtain PDB file of your protein.
Copy your protein PDB file, the PEG PDB file (from GitHub or the supercomputer), the PDB file for whatever amino acid you are mutating to (usually PDC.pdb, again from GitHub or the supercomputer), s-AllAtom.mutateResidue.py, and s-AllAtom.genPEG.py to your working directory. These files will be needed in the following steps.
- Note: In your protein PDB, remove all HETATM and other non-protein atoms. These are usually not needed for simulations and can make setting up your inputs harder if they are left in. To remove them, delete the corresponding lines from your PDB file. You should only need the lines that start with "ATOM", all the lines that start with "SEQRES", and the very last line, which says "END". If you still have trouble with the PDB, try deleting everything other than those lines. (You can swap the order of this step and the following one; do whichever is easiest for you.)
- Note: Many PDB files do not start at residue 1, skip residue numbers in the primary sequence, or use odd numberings like "51A". The first two may cause issues; the last will cause numbering issues because the "A" is truncated. Any of these can cause problems in the code or in postprocessing. Make sure you know where these problem areas are and pay attention to the flags KPrep issues. If this does become an issue, scripts like s-AllAtom.mutateResidue.py will fix the PDB file so that it begins at 1 and increments the numbering by 1 for each following amino acid.
  - Be sure to change "protName" in the renumbering script to the name of the protein you are working with. Afterwards, add the SEQRES section back to the new PDB. Running s-AllAtom.renumberPDB.py removes all the SEQRES lines from your PDB file, but the KPrep scripts you will use in the next few steps need this section in order to run. You can simply copy all the lines that start with "SEQRES" in your original PDB to the top of the PDB created by s-AllAtom.renumberPDB.py (just below the line that says "REMARK Created by KPrepWrite").
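The manual clean-up described in these notes can also be sketched in Python. The functions below are illustrative helpers, not part of KPrep: they keep only SEQRES, ATOM, and END lines, report the numbering problems mentioned above, and copy SEQRES lines back under the "REMARK Created by KPrepWrite" line. The column positions follow the standard PDB ATOM record layout (chain ID in column 22, residue number in columns 23-26, insertion code in column 27); the example lines are made up.

```python
def keep_protein_records(lines):
    """Keep only SEQRES, ATOM, and END lines; drop HETATM and everything else."""
    return [ln for ln in lines if ln[:6].strip() in ("SEQRES", "ATOM", "END")]


def numbering_warnings(lines):
    """Report chains that do not start at 1, skipped numbers, and insertion codes."""
    warnings = []
    last_seen = {}  # chain ID -> (residue number, insertion code) last read
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        chain, res_seq, icode = line[21], int(line[22:26]), line[26].strip()
        current = (res_seq, icode)
        if last_seen.get(chain) == current:
            continue  # another atom of the residue we already checked
        if chain not in last_seen:
            if res_seq != 1:
                warnings.append(f"chain {chain!r} starts at residue {res_seq}, not 1")
        elif res_seq > last_seen[chain][0] + 1:
            warnings.append(
                f"chain {chain!r} skips from {last_seen[chain][0]} to {res_seq}")
        if icode:
            warnings.append(
                f"chain {chain!r} residue {res_seq}{icode} has an insertion code")
        last_seen[chain] = current
    return warnings


def restore_seqres(original_lines, renumbered_lines):
    """Copy SEQRES lines from the original PDB to just below the KPrepWrite remark."""
    seqres = [ln for ln in original_lines if ln.startswith("SEQRES")]
    out, inserted = [], False
    for line in renumbered_lines:
        out.append(line)
        if not inserted and line.startswith("REMARK Created by KPrepWrite"):
            out.extend(seqres)
            inserted = True
    return out if inserted else seqres + out


# Made-up example lines, for illustration only.
example = [
    "HEADER    EXAMPLE",
    "SEQRES   1 A    3  MET ALA GLY",
    "ATOM      1  N   MET A   5      11.000  12.000  13.000  1.00  0.00           N",
    "ATOM      2  CA  MET A   5      12.000  12.000  13.000  1.00  0.00           C",
    "ATOM      3  N   ALA A   7      13.000  12.000  13.000  1.00  0.00           N",
    "ATOM      4  N   GLY A  51A     14.000  12.000  13.000  1.00  0.00           N",
    "HETATM    5  O   HOH A 101      15.000  12.000  13.000  1.00  0.00           O",
    "END",
]

for warning in numbering_warnings(example):
    print(warning)
```

To use this on a real file, read it with `open(path).read().splitlines()`, pass the list to these functions, and write the result back out with one line per entry.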
Run s-AllAtom.mutateResidue.py. This script mutates a specified amino acid into a new amino acid. If you want to understand how it does this, refer to the README or the comments of the "mutateResidue" function in "KPrepFunctions.py". The variables you need to change in s-AllAtom.mutateResidue.py are:
- protName: this is the name of the PDB file that you want to mutate (don't include ".pdb" at the end)
- resName: this is the name of the residue you want to mutate into the protein (don't include ".pdb" at the end)
- resSeq: this is the residue number of the residue you want to replace with a new amino acid
- secondResAtomName: this is the name of the atom attached to the alpha carbon of the residue you are mutating in. It will usually be ' CB ' (with the spaces before and after the letters). This is needed because the residue is rotated around the bond between the alpha carbon and this atom when it is fit into your PDB file.
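As an illustration, the variable block near the top of s-AllAtom.mutateResidue.py might look like the sketch below once edited. The protein name and residue number are hypothetical placeholders, and the exact layout of the real script may differ. The small check at the end reflects the fact that PDB atom names occupy a fixed four-character field (columns 13-16), which is why the spaces in ' CB ' matter.

```python
# Hypothetical example values; substitute your own.
protName = "myProtein"      # reads myProtein.pdb (placeholder name)
resName = "PDC"             # reads PDC.pdb, the residue being mutated in
resSeq = 51                 # placeholder: residue number to replace
secondResAtomName = " CB "  # atom bonded to the alpha carbon, padded to 4 characters


def is_padded_atom_name(name):
    """Return True if the name fills the four-character PDB atom-name field."""
    return isinstance(name, str) and len(name) == 4


if not is_padded_atom_name(secondResAtomName):
    print("secondResAtomName should be 4 characters, e.g. ' CB '")
```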
Run s-AllAtom.genPEG.py. This script will create a PEG at the residue you mutated. Refer to the README or the comments of the "pegGenAllAtom" function in "KPrepFunctions.py" if you want to understand how the function works. There are some variables that you will always need to specify, and some that you can usually leave as the default values in the script.
- Variables you always need to specify:
  - protName: this is the name of the PDB file that you want to add the PEG to, which will be the PDB output in the previous step (don't include ".pdb" at the end)
  - pegRes: this is the residue number of the residue you want to attach the PEG to
  - numPEG: this is the number of PEG monomers in the PEG chain you want to create
- If you use a different PDB as the monomer for polymer creation, or an amino acid other than PDC for polymer attachment, you will need to change the following variables. Otherwise, you can leave the default values:
  - protRemoveAtom: Atom in protein that is removed when first PEG is bonded to it. This will usually be a hydrogen atom, e.g. ' H* '
  - protBondAtom: Atom in protein that first PEG is bonded to. This will be the atom that protRemoveAtom was bonded to, e.g. ' C* '
  - pegRemoveAtomFront: Atom in PEG that is removed when PEG is bonded to a previous atom, usually ' H1C'
  - pegRemoveAtomBack: Atom in PEG that is removed when PEG is bonded to a following atom, usually ' H2C'
  - pegBondAtomFront: Atom in PEG that is bonded to when PEG is bonded to a previous atom, usually ' C1 '
  - pegBondAtomBack: Atom in PEG that is bonded to when PEG is bonded to a following atom, usually ' C2 '
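If you do change the monomer PDB, it helps to confirm that the atom names you give the script actually exist in that file before running s-AllAtom.genPEG.py. The sketch below is an illustrative helper, not part of KPrep; it reads atom names from columns 13-16 of ATOM and HETATM lines. The monomer lines shown are made up and only stand in for the real PEG PDB file.

```python
def atom_names(lines):
    """Collect the four-character atom names from ATOM and HETATM lines."""
    return {ln[12:16] for ln in lines if ln.startswith(("ATOM", "HETATM"))}


def missing_atoms(lines, wanted):
    """Return the requested atom names that do not appear in the PDB lines."""
    present = atom_names(lines)
    return [name for name in wanted if name not in present]


# The usual PEG atom names listed above.
peg_settings = {
    "pegRemoveAtomFront": " H1C",
    "pegRemoveAtomBack": " H2C",
    "pegBondAtomFront": " C1 ",
    "pegBondAtomBack": " C2 ",
}

# Made-up monomer lines for illustration; use the lines of your own PEG PDB.
monomer = [
    "HETATM    1  C1  PEG A   1       0.000   0.000   0.000",
    "HETATM    2  H1C PEG A   1       0.000   1.000   0.000",
    "HETATM    3  C2  PEG A   1       1.500   0.000   0.000",
    "HETATM    4  H2C PEG A   1       1.500   1.000   0.000",
    "END",
]

problems = missing_atoms(monomer, peg_settings.values())
if problems:
    print("Atom names not found in the monomer PDB:", problems)
```

The same check can be run against the mutated protein PDB for protRemoveAtom and protBondAtom.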
If you don't like the PEG's shape, run s-AllAtom.genPEG.py again. The script randomly orients the monomers, so you may get a PEG that is close to the protein, very spread out, etc. Just run it as many times as you need to get the PEG shape you want. You can do this without changing anything in the script.

Creating a protein with a uAA mutation and no PEG

This is the same as the steps under "PEGylating a Protein", but stop after you have mutated the residue in with s-AllAtom.mutateResidue.py (Step 3).

Creating a Free PEG

This is a similar process to the steps under "PEGylating a Protein", but quite a bit simpler. It uses only the last two steps of that section.

The only files you need in your working directory are the PEG PDB file (found in the KPrep/PDB folder) and s-AllAtom.genPEG.py.
Run s-AllAtom.genPEG.py. Unlike in "PEGylating a Protein," you will need to change some of the variables that you can usually leave alone in the script. In the descriptions below, "protein" actually refers to the first PEG monomer.
- protName: this is the name of the PDB file that you want to add the PEG to, which in this case will just be "PEG"
- pegRes: this is the residue number of the residue you want to attach the PEG to, which in this case is just 1
- numPEG: this is the number of PEG monomers in the PEG chain you want to create
- protRemoveAtom: Atom in protein that is removed when first PEG is bonded to it. In this case, this will be ' H2C'
- protBondAtom: Atom in protein that first PEG is bonded to. This will be the atom that protRemoveAtom was bonded to, in this case this will be ' C2 '
- You can leave pegRemoveAtomFront, pegRemoveAtomBack, pegBondAtomFront, and pegBondAtomBack at their usual values.
If you don't like the PEG's shape, run s-AllAtom.genPEG.py again.
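Collected in one place, the free-PEG settings listed above would look roughly like this in s-AllAtom.genPEG.py. This is a sketch of the variable values only; the chain length is a hypothetical placeholder, and the layout of the real script may differ.

```python
# Settings for a free PEG, taken from the list above.
protName = "PEG"              # the first monomer plays the role of the "protein"
pegRes = 1                    # the first monomer is residue 1
numPEG = 10                   # placeholder: number of monomers you want
protRemoveAtom = " H2C"       # hydrogen removed from the first monomer
protBondAtom = " C2 "         # carbon that the next monomer bonds to

# Usual values, left unchanged.
pegRemoveAtomFront = " H1C"
pegRemoveAtomBack = " H2C"
pegBondAtomFront = " C1 "
pegBondAtomBack = " C2 "
```

Note that protRemoveAtom and protBondAtom match pegRemoveAtomBack and pegBondAtomBack here, because the first monomer is extended from its back end just like every later monomer.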

Future

Following these steps will create a PDB that has all of the atoms needed and looks good visually. However, we have not yet produced a PSF file, which is needed to use charmm2lammps to create LAMMPS input files. The biggest challenge will be connecting the PEG to the protein: they are close spatially, but the PDB does not indicate that they are bonded. A possible solution may be to add a LINK record (look into the PDB format to learn more about these) and then use a tool like CHARMM-GUI to produce a PSF. We may also need to create a record in a topology (.top) file to make this work with charmm2lammps. We also need to create parameters (both .top and .param files) for the DBCO linker used to attach the protein to the PEG.
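For anyone exploring the LINK record idea, the sketch below formats one such line using the fixed-column layout from the PDB format specification (atom 1 in columns 13-27, atom 2 in columns 43-57, bond length in columns 74-78). It is untested against CHARMM-GUI or charmm2lammps, and the atom name ' CX ', the residue name "PEG", the residue numbers, and the bond length are hypothetical placeholders rather than values from our files.

```python
def format_link_record(name1, res1, chain1, seq1,
                       name2, res2, chain2, seq2, length):
    """Format a PDB LINK record connecting two atoms (identity symmetry assumed)."""
    for name in (name1, name2):
        if len(name) != 4:
            raise ValueError(f"atom name {name!r} must be padded to 4 characters")
    # Each side: atom name, blank altLoc, residue name, chain, residue number,
    # blank insertion code (15 characters in total).
    side1 = f"{name1} {res1:>3} {chain1}{seq1:>4} "
    side2 = f"{name2} {res2:>3} {chain2}{seq2:>4} "
    return (f"LINK  {'':6}{side1}{'':15}{side2}{'':2}"
            f"{'1555':>6} {'1555':>6} {length:5.2f}")


# Hypothetical example: bond atom ' CX ' on PDC residue 51 to ' C1 ' on the
# first PEG monomer (placeholder residue name "PEG", residue number 200).
link = format_link_record(" CX ", "PDC", "A", 51, " C1 ", "PEG", "A", 200, 1.50)
print(link)
```

The line would go in the header portion of the PDB, before the ATOM records. Whether downstream tools honor it still needs to be checked.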