* This is a general template script to calculate the pKa values for * HIS, ASP GLU and CTER residues in a model protein using the isolated * site model. The procedure requires the following steps: * 1) generate a pH=8 model of the protein (HSD and HSE) * 2) generate a fully charged model of the protein * 3) identify titratable groups * 4) calculate pKa using continuum models. *** * Required files: top_all22_prot.inp, par_all22_prot.inp * !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !* IMPORTANT USAGE NOTE: ONLY SINGLE CHAIN PROTEINS CAN BE PROCESSED !* WITH GENERATION/MINIMAZATION STAGE BECAUSE OF PSF GENERATION PROCEDURE !* USED, I.E. ASSUMPTION THAT PSF CAN BE GENERATED BY READING SEQUENCE OF !* PROTEIN FROM PDB FILE. !**_______________________________________________________________________** !** WRITTEN BY C.L. BROOKS, III ** !** MAY, 2000 ** !** ** !** COPYRIGHT C.L. BROOKS, III, THE SCRIPPS RESEARCH INSTITUTE ** !**_______________________________________________________________________** !******** !* The usage requires that several variables either be set in the !* input script or on the CHARMM command line at execution. !* The needed variables are: !** Variable Default Function !*__________________________________________________________________________ !* toppar ($toppar) ==> path to parameter and topology files !** !* pdbfile (...) ==> name of pdb coordinate file containing !** coordinates and sequence information !** !* minimize (false) ==> logical determining whether to condition !** pdb structure by minimization or not !** !* epsinterior (10) ==> value to sue for protein interior dielectric !** constant !* pb (false) ==> logical indicating whether to use Poisson-Boltzmann !** continuum model to do pKa calculation !* size (0.75) ==> grid spacing to use in finite difference PB !** calculation !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! if @?pdbfile eq 0 goto iamdead if @?toppar eq 0 set toppar = "$toppar" if @?minimize eq 0 set minimize = false if @?epsinterior eq 0 set epsinterior = 10 if @?pb eq 0 set pb = false ! Default don't do PB calculation if @?size eq 0 set size = 0.75 ! Default grid size for PB ! Read the parameter and topology files: goto readparams label fromreadparams ! Set-up protein, charge arrays and pointers to ionizable groups goto setupprotein label fromsetupprotein ! At this point the vector arrays 1-3 contain: ! (1) HSP, and neutral ASP, GLU and CTER ! (2) HSD, and negatively charged ASP, GLU and CTER ! (3) HSE, and negatively charged ASP, GLU and CTER ! ! The vector array 4 contains the pointers to the ionizable groups ! The variable nionizable denotes the number of ionizable groups !!!!!!!!!!!!!!!!!!!! ! Set charges to pH 8 state w/ HSD scalar charge recall 2 !!!!!!!!!!!!!!!!!!!! ! Set-up energy calculations update cutnb 999 switch vswitch skipe all excl elec gben !***Optimized GB parameters for version 22 protein parameters !***w/ polar hydrogen radii set to 1.0 A define polarH select property radii .le. 1.0 end ! Select all hydrogen atoms w/ ! vdW radii .le. 1.0 A scalar wmain = radii scalar wmain set 1.0 select polarH end GBorn P1 0.4380 P2 0.1664 P3 0.0138 P4 8.84109 P5 1.0 Lambda 0.7160 Epsilon 80.0 Weight ! Scale De by correct ratio to get eps_interior Calc fctr = ( 1 / @epsinterior - 1/ 80 ) / ( 1 - 1 / 80 ) ! Begin to loop over ionizable groups, compute Delta_pKa ! First calculate reference energies w/ GB and PB !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! Open file to unit 2 to write results open unit 2 write form name @PDBFILE.pka write title unit 2 * Res name Res # dpka (GB) dpka (PB) define all select all end energy eps @epsinterior set Deref = ?elec + @fctr * ?gben ! Set-up for Poisson-Boltzmann calculation set return = fromref set select = all set scale = 1.2 set DePbref = 0 set Epb = 0 set pass = first if PB eq true goto dopbenr label fromref set DePbref = ?elec + @Epb !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! set whichgroup = 1 label nextgroup scalar wmain recall 4 define thisgroup select property wmain .eq. @whichgroup end set name = ?selresn set resnum = ?selresi ! Now calculate energies of the protonated state for the entire protein !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! scalar charge stat select thisgroup end scalar charge recall 2 scalar charge recall 1 select thisgroup end scalar charge stat select thisgroup end energy eps @epsinterior Calc De = @Deref - ( ?elec + @fctr * ?gben ) ! Set-up for PB set return = fromprot set select = all if PB eq true goto dopbenr label fromprot Calc DePb = @DepBref - ( ?elec + @Epb ) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! Now we need to subtract the corresponding values for the isolated amino acid chains ! For GB we use the interaction energy command (inte) and select just the amino acid ! of interest. !! First residue in charged state scalar charge recall 2 scalar charge recall 1 select thisgroup end scalar charge stat select thisgroup end inte select thisgroup end select thisgroup end Calc De = @De + ( ?elec + @fctr * ?gben ) ! For PB we use a "trick" an set the radii and charges of atoms not involved in ! the calculation to zero. set return = fromres1 set select = thisgroup if PB eq true goto dopbenr label fromres1 Calc DePb = @DepB + ( ?elec + @Epb ) !! Second we complete calculation by consideration of residue in pH 8 state scalar charge recall 2 scalar charge stat select thisgroup end inte select thisgroup end select thisgroup end Calc De = @De - ( ?elec + @fctr * ?gben ) ! Set-up for PB set return = fromres2 set select = thisgroup if PB eq true goto dopbenr label fromres2 Calc DePb = @DepB - ( ?elec + @Epb ) Calc DpK = @De / ( ?kblz * 298 * 2.303 ) Calc DpKpB = @DepB / ( ?kblz * 298 * 2.303 ) if PB ne true set DpKpB = 0 ! Write out information for this site and move onto next one. write title unit 2 * @name @resnum @DpK @DpKPb * increment whichgroup by 1 if whichgroup le @nionizable goto nextgroup gborn clear stop !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! label dopbenr !!!!!!!!!!!!!!!!!!! ! Determine size of grid for whole protein calculation if pass ne first goto notfirst coor stats Calc idim = int ( @scale * ( ?xmax - ?xmin ) / @size ) + 1 Calc jdim = int ( @scale * ( ?ymax - ?ymin ) / @size ) + 1 Calc kdim = int ( @scale * ( ?zmax - ?zmin ) / @size ) + 1 set ncell = 0 Let ncell = maxi @idim @jdim Let ncell = maxi @ncell @kdim set pass = 0 label notfirst coor copy compare scalar x set 0 select .not. @select end scalar y set 0 select .not. @select end scalar z set 0 select .not. @select end pbeq scalar wmain set 0 scalar charge set 0 select .not. @select end scalar wmain = radius select @select end solve epsw 80 epsp @epsinterior ncell @ncell dcel @size conc 0 watr 0 set epb = ?enpb solve epsw @epsinterior epsp @epsinterior ncell @ncell dcel @size conc 0 watr 0 Calc epb = @epb - ?enpb reset end coor swap !!!!!!!!!!!!!!!!!! goto @return !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! label readparams prnlev 0 node 0 wrnlev 0 node 0 open unit 1 read form name @TOPPAR/top_all22_prot.inp read rtf card unit 1 close unit 1 read rtf card unit 5 append * This is an rtf patch to make a neutral C-terminus * appropriate for pH < ~3. 22 1 PRES CTRN 0.00 ! neutral C-terminus GROUP ! use in generate statement ATOM C CD 0.55 ATOM OT1 OB -0.50 ATOM OT2 OH1 -0.60 ATOM HT2 H 0.55 DELETE ATOM O BOND C OT2 OT2 HT2 DOUBLE C OT1 IMPR OT1 CA OT2 C DONOR HT2 OT2 ACCEPTOR OT1 C ACCEPTOR OT2 C IC N CA C OT2 0.0000 0.0000 180.0000 0.0000 0.0000 IC OT2 CA *C OT1 0.0000 0.0000 180.0000 0.0000 0.0000 IC HT2 OT2 C OT1 0.0000 0.0000 0.0000 0.0000 0.0000 END open unit 1 read form name @TOPPAR/par_all22_prot.inp read param card unit 1 close unit 1 read param card unit 5 append ANGLES OH1 CD CT1 55.000 110.5000 ! ALLOW ALI PEP POL ARO ALC END prnlev 5 node 0 wrnlev 2 node 0 goto fromreadparams !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! label setupprotein !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! SOLUTE GENERATION COMPONENT: ASSUMES SINGLE CHAIN SOLUTE IN PDB ! ! FILE GIVEN BY KEY PDBFILE. IF MORE THAN ONE CHAIN IS PRESENT ! ! THE GENERATION ASPECTS WILL BE ERRONEOUS. ! ! THIS COMPONENT OF THE SCRIPT: ! ! 1) GENERATES PROTEIN SEQUENCE ! ! 2) BUILDS ALL MISSING ATOMS ! ! 3) WRITES THE PDB FILE ! ! CHECKS ARE DONE ON NUMBER OF MISSING ATOMS TO GUESS WHETHER FILE WAS ! ! PROCESSED CORRECTLY. EXAMINE OUTPUT FILE FOR POTENTIAL PROBLEMS! ! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! Construct the pH 8 state, HSD, ASP, GLU and CTER set HIS = HSD set cter = CTER set return = frompH8 goto dogenerate label frompH8 scalar wmain = charge open unit 1 write form name @pdbfile_pH8-hsd.crd write coor card unit 1 * Coordinates for pH 8 (all HIS = HSD, ASP, GLU and CTER charged) * delete atom select all end ! Construct the pH 8 state, HSE, ASP, GLU and CTER set HIS = HSE set cter = CTER set return = frompH8E goto dogenerate label frompH8E scalar wmain = charge open unit 1 write form name @pdbfile_pH8-hse.crd write coor card unit 1 * Coordinates for pH 8 (all HIS = HSE, ASP, GLU and CTER charged) * delete atom select all end ! Construct the low pH state, HSP, ASPP, GLUP and CTRN set HIS = HSP set cter = CTRN set return = fromlowpH goto dogenerate label fromlowpH scalar wmain = charge scalar wmain store 1 open unit 1 write form name coor/@pdbfile_pH0.crd write coor card unit 1 * Coordinates for low pH (all HIS charged, ASP, GLU and CTER neutral) * Structure conditioned/regularized with minimization * ! Get charge arrays for HSD and HSE pH 8 varients ! First HSD pH 8 varient scalar wcomp set 0 rename resname hsd select resname hsp end open unit 1 read form name @pdbfile_ph8-hsd.crd read coor card unit 1 comparison resid close unit 1 rename resname hsp select resname hsd end scalar wcomp store 2 ! This intermediate file isn't needed anymore, delete it. system "rm *_ph8-hsd.crd" ! Now HSE pH 8 varient scalar wcomp set 0 rename resname hse select resname hsp end open unit 1 read form name @pdbfile_ph8-hse.crd read coor card unit 1 comparison resid close unit 1 rename resname hsp select resname hse end scalar wcomp store 3 ! This intermediate file isn't needed anymore, delete it. system "rm *_ph8-hse.crd" ! The vector arrays 1, 2 and 3 hold the charges for: ! (1) HSP, and neutral ASP, GLU and CTER ! (2) HSD, and negatively charged ASP, GLU and CTER ! (3) HSE, and negatively charged ASP, GLU and CTER scalar xcomp recall 1 scalar ycomp recall 2 scalar zcomp recall 3 ! Define all possibly ionizable residues for pH < 8 define ionizable select ( resname hsp .or. resname asp .or. resname glu - .or. ires ?nres ) end ! Do "while" loop over list of ionizable residues and identify residue numbers prnlev 0 node 0 ! Set-up for first ionizable residue scalar wmain set 0 select all end define next select ionizable end set number ?selresi !residue number for first residue in list define done select resid @number end !make "done" list contain this residue set nionizable 1 scalar wmain set @nionizable select resid @number end label reslabels define next select ionizable .and. .not. done end set n ?nsel if n eq 0 goto finished set number ?selresi !residue number for next residue in list define done select resid @number .or. done end !add this residue to done incr nionizable by 1 scalar wmain set @nionizable select resid @number end goto reslabels label finished scalar wmain store 4 ! labels for residues with possible ionization state changes prnlev 5 node 0 !scalar wmain show select property wmain .gt. 0 end goto fromsetupprotein stop !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ! SUBROUTINE DOGENERATE label dogenerate prnlev 0 node 0 ! Copy original file for safe keeping then "fix" pdb file system "echo @PDBFILE.pdb > filename" system "cp coor/`awk '{print tolower($0)}' filename` pdb_orig" ! Make "standard" changes to accomodate pdb -> top_all22/28 atom/residue renames. system "echo 'BEGIN {write = 0}' > awk.file" system "echo '$1 == "TER" {write += 1;}'>> awk.file" system "echo '{' >> awk.file" system "echo 'if ($4 == "HIS") sub("HIS","@HIS");' >> awk.file" system "echo 'if ( write == 0 ) {print $0 "@PDBFILE" ;}' >> awk.file" system "echo '}'>> awk.file" system "echo '{if ( write == 1 ) {' >> awk.file" system "echo 'if ($1 == "TER") {print "END" ;}'>> awk.file" system "echo '}}'>> awk.file" ! Beyond column 72 we should have the segid, some NMR files don't do this. Add ! segid after column 72 using cut and awk. system "cut -b 1-72 pdb_orig | awk -f awk.file > pdb_new" ! Clean-up temporary files system "rm awk.file" system "mv pdb_new `(awk '{print tolower($0)}' filename)`" open unit 1 read form name @PDBFILE.pdb read sequ pdb unit 1 resid close unit 1 generate @pdbfile setup first nter last @cter ! Check for disulphide bonds in SSBOND records system "echo "* Disuplhides for @pdbfile" > disu.strm" system "echo "* " >> disu.strm" system "awk '$1 == "SSBOND" {print "patch disu @pdbfile",$4," @pdbfile",$6}' pdb_orig >> disu.strm" stream "disu.strm" system "rm disu.strm" if his ne hsp goto dontprotonate ! Generate patches to turn ASP and GLU into neutrals ! First ASP define asps select resname asp end set thisasp ?selresi scalar wmain set 0 scalar wmain set 1 select asps .and. type ca end label doasps define asps select property wmain .eq. 1 .and. type ca end set n ?nsel if n eq 0 goto doneasps set thisasp ?selresi patch aspp @pdbfile @thisasp scalar wmain set 0 select resid @thisasp end goto doasps label doneasps ! Next GLU define glus select resname glu end set thisglu ?selresi scalar wmain set 0 scalar wmain set 1 select glus .and. type ca end label doglus define glus select property wmain .eq. 1 .and. type ca end set n ?nsel if n eq 0 goto doneglus set thisglu ?selresi patch glup @pdbfile @thisglu scalar wmain set 0 select resid @thisglu end goto doglus label doneglus label dontprotonate rename atom cd1 select resname ile .and. type cd end rename atom oxt select type ot1 end rename atom o select ires ?nres .and. type ot2 end open unit 1 read form name @PDBFILE.pdb read coor pdb unit 1 resid close unit 1 rename atom cd select resname ile .and. type cd1 end rename atom ot1 select type oxt end rename atom ot2 select ires ?nres .and. type o end ! Now move original pdb file back and finish clean-up system "rm pdb_orig `awk '{print tolower($0)}' filename` filename" define nonH select .not. hydrogen end set nonHatoms = ?nsel define missing select .not. hydrogen .and. .not. initialized end set missingatoms = ?nsel Calc fraction = (@nonHatoms - @missingatoms) / @nonHatoms if fraction le 0.8 goto toomanywrong prnlev 5 node 0 print coor select missing end ! Build missing atoms if just a few ic param ic build hbuild define missing select .not. initialized end if nsel ne 0 goto skipthis system "banner Some atoms not built" print coor select missing end stop label skipthis coor copy compare if his ne hsp goto fromminimizemodel if minimize eq true goto minimizemodel label fromminimizemodel coor orie coor stat goto @return !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! label iamdead system "banner Must specify pdb filename" stop label toomanywrong system "banner Too many missing atoms" print coor select missing end stop !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! label minimizemodel faster on set k 30.0 label minimize cons harm force @k select .not. hydrogen end minimize conj nstep 500 tolenr 0.001 - cutnb 14.0 rdie eps 2.0 vswitch shift coor orie rms select type ca end coor orie rms select .not. hydrogen end decr k by 10.0 if k ge 10.0 goto minimize goto fromminimizemodel !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!