DelPhi

 

A Macromolecular Electrostatics Modeling Package

 

DelPhi is a software package, which calculates electrostatic potentials in and around macromolecules. It incorporates the effects of ionic strength through the (non) linear Poisson-Boltzmann equation. Any value for the dielectric constant of the molecule and solvent may be specified. Periodic boundary conditions can be used to model long periodic molecules such as DNA. The output from the programs can be used to calculate interactions, changes in pKa, solvation energies and many other properties of interest. Potentials may be displayed as 3-Dimensional isopotential contours, 2-Dimensional contour slices or as color coded molecular surfaces.

References *

Introduction *

DIRECTORIES *

Installation *

Files *

The parameter file (file.prm) *

Statements and Functions *

Formats *

Shorthand and Longhand Statements *

Functions: In Detail *

Energy *

Read/In *

Write/Out *

Qinclude *

Programs *

Mkcontr *

Program DelPhi *

Brief description *

Algorithm *

Surface generation: *

Input files *

Output files *

Overall program flow: *

Unformatted input/output option *

Energies calculated in delphi *

Worked examples *

THE POTENTIAL AROUND CU,ZN SUPEROXIDE DISMUTASE *

TEST CHARGES IN A SPHERE- Comparison with analytical solutions *

SOLVATION ENERGY OF THE ACETATE ION *

PK SHIFTS IN SUBTILISIN *

HINTS *

Appendix A: An index of Statements and their shorthands *

Appendix B: Glossary of Input Parameters: *

Appendix C: File Formats *

Appendix D: Notes on Input Parameters *

 

Version:

DelPhi, Version II, May 1998

Authors:

Kim A. Sharp, Anthony Nicholls

&

Sundaram Sridharan

Dept. of Biochemistry and Molecular Biophysics,

Columbia University,

630 W168th St.

New York, NY 10032

delphi@trantor.bioc.columbia.edu

Copyright:

This software forms version II of a package called DELPHI, which is copyrighted by COLUMBIA UNIVERSITY. INSIGHT is a trade name of Biosym Corporation.

Permission to use this software is given on the understanding that it will not be distributed without consulting Professor Barry Honig, in the Dept. of Biochemistry at Columbia, (212)-305-7970. Other copyright laws may also apply. For further information contact Barry Honig.

References

General:

DNA, nonlinear PB:

pKa's:

Charge & Radius parameters:

The original reference to the use of the finite difference method for macromolecular electrostatics is:

Introduction

This is a collection of programs to take a Brookhaven data base format coordinate file of a molecule and calculate the potential in and around the molecule, using a finite difference solution to the non-linear POISSON-BOLTZMANN equation for any given ionic strength, solvent and molecule dielectrics.

Briefly, it consists of the following steps, explained in more detail below:

  1. Prepare three files containing a) run parameters b) atomic radii c) atomic charges.
  2. Modify the Brookhaven format atomic coordinate file as necessary: e.g. if a full partial charge set on all atoms is required it will be necessary to build in hydrogens using a suitable modeling program. Also if the output of potentials at various specified atoms are required, a second atomic coordinate file may be needed.
  3. Run the program delphi (finite difference algorithm), in batch or interactively directing the output into a log file if required.
  4. Analyze the results. Two things that are commonly required are to read out and display the potentials. Two options exist:

DIRECTORIES

Upon installation, a directory structure should be created in order to group files in some kind of order. The directories should be arranged as:

parent_directory/

src/ source directory

bin/ all executables and script files

data/ atomic radii and charge data files

docs/ documentation

examples/

 

Installation

After downloading the encrypted tar archive decrypt it by doing the following in a shell window.

1: Decode the file delphi_en.tar.Z with the command:

crypt < delphi_en.tar.Z > delphi.tar.Z

(enter de-cryption key at this point)

2: Uncompress the file delphi.tar.Z with the command:

uncompress delphi.tar.Z

3: This last command should result in a file delphi.tar. Extract the files from this with the command:

tar -xvf delphi.tar

4: This should give a directory called delphi containing delphi source and data files.

COMPILING THE EXECUTABLE

There are a number of makefiles in the main directory of the form make.XXX where XXX indicates the hardware platform. Link the appropriate make.XXX to makefile and issue make.

ln -s make.XXX makefile

make

If the makefile for your platform is not found make one up and after successful compilation, please deposit the compiler flags and any necessary modifications to source code with the address shown below.

On newer Silicon Graphics machines like the INDIGO2-impact, Octane and ORIGIN servers which use an R10000 and a 64 bit OS, use make.IRIS64 for Makefile.

EXAMPLES

The best way to check the installation is by working through the examples. In particular, it is recommended that example 2, on charges in a sphere be worked through carefully, and the results compared to those in section on worked examples – charges in a sphere. The logfiles and *.frc output files should be compared with those provided from the test (*cvx.log, *cvx.frc). This will provide a good check that the program is working, and also is a tutorial on how delphi can be used to calculate the various components of an electrostatic system. It is no exaggeration to say that if you understand the case of two charges in a sphere then that is 90% of continuum electrostatics! The input files, and the output files that are not in binary form (i.e. not *.phi, *.eps) are provided in examples. Explanation and analysis is also provided in Section that section. These examples are taken from published papers that used this or similar forms of DelPhi, and these should be referred to for more details. The calculations are typical of those that can be made with this package, although for the purpose of illustration, several shortcuts were taken. Also read section 10, Hints.

Files

FILE NAMING

Most file types have default extensions to simplify things:

file.pdb: Atomic coordinates, input to DELPHI.

file.siz: Atomic radii, input to DELPHI.

file.crg: Atomic charges, input to DELPHI.

file.prm: Parameters for DELPHI.

file.log: Logfile from DELPHI.

file.phi: Binary potential map, output of DELPHI, input to GRASP and BIOSYM InsightII

file.frc: Potentials and fields, output from DELPHI.

 

FILE FORMATS

FILE.pdb

Standard format Brookhaven protein data bank file containing atom labels and coordinates. Only records starting with ATOM or HETATM are used. Default extension *.pdb. Format (as assumed by the program: beware many minor and not so minor differences abound in the literature and on disk!): (A6,5X,A5,X,A3,X,A1,A4,4X,3F8.3) header, atom name, residue name, chain name, residue number, x,y,z. coordinates. Note that the program treats the residue number as an ascii string, not an integer.

The parameter file (file.prm)

(Description written by Anthony Nicholls)

DelPhi uses a command interpreter allowing english-like commands to be used in the parameter file.

Statements and Functions

The concept of the command in DelPhi comes in two forms, statements and functions.
Statements have the form:

	Variable=value

e.g.,

	Scale=2.0 
	grid size=33
	periodic boundary=x
	frc write=true

etc.

Commands have the form:

	operation(specifier,file="xxx.yyy",format=abc")

e.g.,

	in(pdb,file="lys.pdb")
	out(phi,unit=20,format=2)
	center(file="test.pdb")

etc.

(Note the (intentional) similarity to FORTRAN open statements) The difference is one of complexity. Statements merely set values or flags, functions cause operations to occur within the function and include moderators to influence these operations.

Formats

It should also be noted as to how DelPhi tells them apart. It does so by noticing that every statement has an equals sign that this is not inside of any brackets. I mention this because inevitably typos will occur and the interpreter will get confused occasionally. I have tried to anticipate some input errors and to inform the user of them, but the hardest part of every complex program is error handling, and at the moment DelPhi will only pipe back to you what it doesnt understand, and continue on with the program. So, be careful with equal signs. The other thing to use with care are commas. The interpreter uses (non-bracketed) commas to decide how many commands it is being faced with. For instance,

	Scale=2.0, gridsize=65,center(file="mid.pdb")

is fine, but

	frcwrite=on, out,(eps),

will get you a warning, and quite rightly so. One does not have to use commas to separate commands, DelPhi will also tolerate "|" and ":", which are less ambiguous, but I find less natural. Do not use periods, since these can be confused with numerical values.

The commands are not case sensitive, although file names will be, i.e.

	in(pdb,file="lys.pdb")
and
	in(pdb,file="Lys.pdb")

are distinguished. Spaces are fine, the interpreter discards them anyway. Comments can be added via exclamation marks. Anything to the right of an ititial "!" is ignored until another "!" or the end of the line is reached. For instance:

	!set scale! scale=1.5 ! probe radius=1.4

will cause the scale to be set but not the probe radius, whereas,

	scale=1.5! now set probe radius! probe radius=1.4

causes both to be set.

You can put as many commands as you like on a line, suitable seperated by commas of course, or put one per line. Also, if you reach the end of the line and want to continue on to the next you can use a continuation slash as in FORTRAN. For instance:

	membrane=true,ion radius=\
	2.0, gridsize=33

is read as,

	membrane=true,ion radius=2.0, gridsize=33

(I doubt this will get used much, but one never can tell.)

You might have noticed that I said that if DelPhi fails to interpret something it will go ahead and run anyway. This might seem odd since older versions have tended to crash if one does not give them all the parameters they need. DelPhi II is a step away from this in that it contains a complete set of parameters at run time. These default parameters will execute a DelPhi run (80% box fill) without any externally supplied parameters, i.e. the parameter file can be completely absent. This can, of course lead to inadvertent runs being done, but that is part of the price of the increase easy of use, i.e. one only needs to set those parameters which are different from the default. I, at least have noticed that relatively few parameters change from run to run which is the reason behind this "prepackaged" DelPhi. Finally, as we shall get into later, one can devise ones own set of default parameter if the system defaults are not to ones liking.

Shorthand and Longhand Statements

There is a set of abbreviations for each statement type. This comes in two forms, two letter codes and six letter codes. Actually six letter codes are six to three letter codes. The reason for including these is that they are more compact (though less readable) than long descriptions, and as such they are less prone to typing error (once one knows them!). It is a matter of taste. A complete listing of abbreviations and the allowable long versions appears in Appendix A

at the end of this document.

Yes, No, Maybe

When setting logical values the following are case insensitive and equivalent:

	yes, on, true, t
	no, off, false, f

Functions: In Detail

The present set of allowed functions are:

	CENTER
	ACENTER
	READ/IN (Equivalent)
	WRITE/OUT (Equivalent)
	ENERGY
	QINCLUDE

We shall cover these one by one since they vary somewhat more in format than statements. But first some of the common features

	Function(file="test.file")

will open the file test.file, whether for centering, output or input.

	Function(unit=14)

will do the same but with fort.14 or whatever is linked to it.

	Function(format=abc)

will perform operations on files with a particular format, or in a specified way. The default format is always zero (i.e. "0"). The format can be a number or a string. For instance, if one wanted to write a formatted phimap one would write,

	out(phi, format=1)

Note that "format", "frm" and "form" are all equivalent. As one can begin to see this makes functions very flexible. There are no shorthands for functions.

Center

	Center(0.2,3,2)

will offset the molecule by 0.2 grids in the x direction, 3 in the y and 2 in the z, just as in the standard parameter file. So why make it a function? Because of the possibilities of needing to open a file to get the center

	Center(unit=15)

will open fort.15, read all the atoms in it and make their center the center for the run.

There is a possibility of adding a format quantifier to enable this to read, say unformatted file, but this hasn’t been done yet.

To read just the first atom of a file and use its coordinates use the following,

	Center(file="whatever",an=1)

Why the "an=1"? Why not just use a format specifier or something? The reason is that "an=1" is GRASP language for atom number one. At sometime in the future I may want to include the ability to add GRASP qualifiers to select parts of files for a function, e.g. in an frc file to only write those atoms which are charged etc. So this is just a bridgehead into the future so to speak.

Acenter

This hardly deserves to be a function, but I couldnt decide how to fit it in any other way. Acenter takes three absolute coordinates, i.e. in Å and uses those as the center, so

	Acenter(1.0,5.6,7.0)

centers the molecule at x=1.0Å, y=5.6Å, z=7.0Å.

Energy

This replaces the energy line in the old style parameter file. At present it takes as its argument any of the following:

	G or GRID for the grid energy,
	S or SOL or SOLVATION for the corrected reaction field energy
	C or COULOMBIC or COU for the coulombic energy
	AS or ANASURF or ANALYTICALSURFACE for the analytical surface energy
	AG or ANAGRID or ANALYTICALGRID for the analytical grid energy

separated by commas. (As always there is no case sensitivity here.)

So, for example,

	energy(s,g,Cou)

gives the solvation, coulombic and grid energies.

Here also it would be nice to be able to select out part of the molecule, i.e. to find the energies associated with a portion of the molecule. All good things will come to those who wait.

Read/In

Obviously this function deals with file input. It comes with several specifiers, namely:

	SIZ: for radius files
	CRG: for charge file
	PDB: for the pdb file
	FRC: for the file use to determine site potentials
	PHI: for the phimap used in focussing

The main use, at present will be to give the user flexibity to specify the file name or unit number of any of these files. At a later date functionality will be added so, for instance, a GRASP command file could set the sizes or charges, different format of pdb file could be read in etc. Note that the default files for all read (and write) operations are the standard DelPhi ones.

Write/Out

Equally obviously this deals with output. The specifiers are:

	PHI: for phimaps
	FRC: for site potentials
	EPS: for epsmaps
	MODPDB: for modified pdb files
	UNPDB: for unformatted pdb file
	UNFRC: for unformatted frc files

Clearly there is some redundancy here, for instance the unformatted pdb and frc files could fall under the PDB and FRC rubric with different format types. But for now we shall hone close to what we are used to. As an example of use,

	write(eps)

writes an epsmap, standard format.

	out(modpdb, file="test.out")

writes a modified pdb file called "test.out". Note that all writes are turned OFF initially, including eps and phi.

The Parameter File

One will have noticed that using the above commands one can specify the unit number or file name of every file DelPhi normally uses, except the parameter file. Here we run into a catch-22 situation, there is no point being able to change the input file name from within the input file! Instead we have opted to allow the user to pass the name of the input file to DelPhi. For instance if one wants to use the parameter file "test.prm" as the parameter file one types:

	delphi test.prm 

Typing

	 delphi 

defaults to fort.10 as usual. Any additional parameters are ignored, i.e. only the first is used as the input file. So,

	delphi test.prm test2.prm

only uses test.prm

Qinclude

The qinclude function works in the same way as an "include" statement works in FORTRAN or C, i.e., it inserts lines from another file into the current one. For instance, suppose we have the following files:

test.prm:

	scale=3.0, write(frc),write(modpdb,file="test.out")
	acenter(0.123,4.55,2.34)

test2.prm:

	boundary type=2, read(pdb,file="test.pdb")

then the file:

	scale=3.0, write(frc),write(modpdb,file="test.out")
	qinclude(test2.prm)
	acenter(0.123,4.55,2.34)

is equivalent to:

	scale=3.0, write(frc),write(modpdb,file="test.out")
	boundary type=2, read(pdb,file="test.pdb")
	acenter(0.123,4.55,2.34)

or one could even write:

	qinclude(test1.prm)
	qinclude(test2.prm)

Clearly the motivation behind this form is to allow the user to build up his/her own default file and qinclude this file at the begining of any subaequent parameter file. Hence one then needs only a qinclude statement plus a line or lines indicating those parameters we want to change from the default file.

Note that qinclude is immediate, i.e. it includes the lines from the indicated file at the position of the qinclude command. This is important to remember since in DelPhi if you multiply define quantities, then the last instance is the current one, i.e.

	scale=2.0
	scale=3.0

leaves the scale set to 3 grids/Å. This then is the reason we include a write(specifier,off) command, so that if we have a default file which enables a write, we can still turn it off without modifying the default file. Can a qinclude file contain a qinclude file? But of course. At present you can nest qinclude files upto ten deep, and I dare anyone to require more than that EVER!

If a qinclude file does not exist, DelPhi will tell you so and move on to the next command. If there is no file passed to qinclude, i.e.

	qinclude()

then the default include file ~/qpref.prm is passed, if it exists. Qinclude is a special command and as such always requires its own line, i.e. do NOT add more commands to a line which (must) start with a qinclude command (not even comments).

FILE.siz:

Van der Waals radii to be assigned to each atom/residue type.

1: any number of lines with ! in the first position are treated as comments.

2: a single line of ascii information that DOESNÕT have ! in the first column, which marks the positions of the various fields for human readability and indicates the start of the radius records, but is otherwise not used.

3: an indefinite number of lines each containing an atom label record a residue label record and a radius in angstroms in the format (A6,A3,F8.3). For example:

aaaaaannnrrrrrrrr

c 1.8

ca 1.7

ca pro 1.6

Using this file, all proline ca atoms will have radius 1.6A, all other ca atoms will have radius 1.7, and all other types of carbon atoms will have radius 1.8.

Note the atom and residue fields ignore case and leading blanks. The residue field may be left blank (wild card), causing a match with the given atom type of any residue. ONLY if the residue field is left blank, the LAST 5 characters of the atom record maybe left blank. In this case all atom types beginning with the letter in column 1 will be matched. Records of greater specificity overide those of less specificity. Beware of ambiguities like calcium (ca) and alpha carbon! All atoms of an input *.pdb file must be assigned a radius through the *.siz file, even if it is 0, or an error will be flagged.

FILE.crg:

Charges to be assigned to each atom/residue/#/chain type.

1: any number of lines with ! are ignored as comments.

2: a single line of ascii information that DOESN’T have ! in the first column which marks the positions of the various fields for human readability and indicates the start of the charge records, but is otherwise not used.

3: an indefinite number of lines each containing an atom label record, a residue label record, a residue number record, a chain label record and a charge in the format: (A6,A3,A4,A1,F8.3). For example:

!example of charge file

aaaaaarrrnnnncqqqqqqqq

ca -0.2

ca pro -0.23

ca pro 1 -0.25

ca pro 1y -0.20

In this fictitious example, all alpha carbons will have a charge of -0.2, except for proline C alpha’s, which will have a charge of -0.2 on the y chain PRO 1, a charge of -0.25 on all other chain PRO 1 atoms, and a charge of -0.25 on all other PRO’s. Atoms that do not find a match in the *.crg file will be neutral.

The ascii fields for atom, residue, number and chain ignore case and leading blanks. Any field except the atom name may be left blank and will be treated as a wild card. Records of greater specificity overide those of less specificity as for the *.siz file above.

search order:

atom_res_num_chain

atom_res_num______

atom_res_____chain

atom_res__________

atom_____num_chain

atom_____num______

atom_________chain

atom______________

Atoms that do not find a match in the *.crg file will be neutral (q=0.0)

 

FILE.frc:

List of potentials and fields at coordinates in *.pdb file 12 lines of ascii header information, followed by a variable number of records written as:

230 format(8G10.3)

write(16,230)xo,chrgv,phiv,fx,fy,fz

where xo(3) are x,y,z coordinates of charge, chrgv is the charge value, phiv is the potential in kT/e at that point, and fx,fy,fz are the field components in kT/a/A. (1 kT/e = 25.6mV = 0.593 kcal/mole/e).

FILE.phi:

Unformatted file

character*20 toplabel

character*10 head,character*60 title

real*4 phi(igrid,igrid,igrid)

character*16 botlabel

real*4 scale, oldmid(3)

where phi is potential map; igrid is number of grids used in calculation. Scale and oldmid relate grid coordinates back to real space coords. Other variables are text information.

FILE.eps:

Unformatted file

real*4 oldmid(3)

integer*4 imap

integer*2 eps(igrid/16,igrid,igrid,3)

write (10) imap, scale, oldmid

write (10) eps

where eps is the dielectric map, igrid the number of grids used in the calculation. Scale and oldmid relate grid coordinates to real space coordinates. imap is unused flag

 

 

Programs

Mkcontr

When the BIOSYM option is chosen for the potential output (see section on parameter file and APPENDIX) one can use mkcontr to make a command file for displaying DelPhi potential contours in'

insightII a display program from MSI-BIOSYM.

Program DelPhi

Latest revision: May 1998

Brief description

DELPHI is a program to solve the (non-linear) Poisson-Boltzmann equation by finite difference methods on a NSZxNSZxNSZ cubical lattice. It can handle both the linear and non-linear forms of the equation. The program can use: periodic or focussing boundary conditions on the lattice edge; a variable ion-exclusion (or Stern) layer around the molecule; a variable probe radius to define the solvent accessible surface; a variable bulk solvent ionic strength; any required dielectric constant values for the molecule and solvent.

For input the program requires four files, containing i) Run parameters, ii) atom coordinates iii) Rules for assigning atomic radii, and iv) Rules for assigning atomic charges. For focussing boundary conditions a fifth file containing potentials calculated from a previous run of the program is also needed.

The output is a 3D array giving the potential (or net solvent ion concentration) at each lattice point and (optional) a list of potentials at charge locations contained in another atom coordinate file,

Algorithm

DELPHI relaxes the difference equation form of the nonlinear Poisson-Boltzmann equation (Equation 1)

where w is a relaxation parameter between 0 and 2 that controls the rate of iteration.

f 0 is the current estimate of the potential at the 0th lattice point.

f n is next estimate of the potential at the 0th lattice point.

f i is potential at 6 lattice points surrounding the 0th.

e i is the dielectric constant assigned to the lattice lines connecting the grid point to its 6 neighbours.

q0 is charge at the 0th lattice point.

k 0 is Debye Huckel term at the 0th lattice point.

Terms in powers of f 0 in the RHS denominator represent successive non-linear corrections to the linear equation. The current version of the program includes two such correction terms, which appears to be sufficient for the types of potentials encountered in aqueous macromolecular systems.

The largest permissible grid size is NGRID (ngrid=257) CUBED. The program usually calculates an appropriate value for igrid depending on molecular size and resolution desired. The difference equation is iterated until the change in potential at the grid ponts is smaller than some required value. Grid points on the boundary of the lattice do not have a full set of six neighbors unless periodic boundary conditions are applied to them, hence they are not relaxed, and initial values of the potential must be provided for them through some kind of boundary condition.

Surface generation:

Unlike the earliest versions of DelPhi, this version uses a grid independent method to produce the solvent accessible surface - it calculates the accessible surface arcs as a set of more or less evenly spaced points. In the second step it uses this data to grow the re-entrant surface in an iterative procedure. The contact part of the surface is calculated by an analytic procedure. What results is a mapping of Richard's Molecular Surface on the cubic grid. This results in improved accuracy and precision on calculated quantities (such as solvation energies, pKs etc). Unlike the earlier method, which had a fifth power dependence of CPU time (to render surface) on grid-resolution, this method has a quadratic dependence. However the time to create surface arcs scales as the number of atoms and can be quite significant for large molecules. Because of this the program calculates this data only once and stores it in a file called ARCDAT; it looks for this for all subsequent calculations on the same system.

Input files

The following are the default fortran file unit numbers for the files used in input.

UNIT

NAME

Description

10

*.prm

input parameters

11

*.siz

atomic Van der Waals radii

12

*.crg

atomic charges

13

*.pdb

Brookhaven format atom coordinates

(15)

*.pdb

list of atom coordinates if site potential output is required

(18)

*.phi

binary potential needed if focussing boundary conditions are selected

Units 15 and 18 are optional depending on values for the site potential output flag, and boundary condition option respectively, given in the parameter file.

Output files

UNIT

NAME

Description

6

*.log

logfile when run in batch

(14)

*.phi

output binary potential

(16)

*.frc

list of potentials at coordinates contained in the unit 15 pdb file

(17)

*.eps

binary dielectric map output

(19)

*.pdb

output atom coordinate file with radii and charge records added

Units 16 and 19 are not always written, depending on values assigned in the parameter file for the site potential output flag, and modified pdb file output flag options respectively. Potential map is not written out by default as it can be a large file. It can be output by a flag in parameter file if needed, e.g., in a focussing calculation. Likewise the dielectric map file is not written out by default.

Overall program flow:

In general outline, the program flow is as follows:

Header with time, date written.

Parameters are read from unit 10 and echoed to output.

Radius data read from unit 11 and stored in hash table for efficient look up.

Charge data read from unit 12 and stored in hash table for efficient look up.

Atomic coordinates are read from unit 13 and scaling is computed.

Arrays that describe 3D distribution of dielectric and ion accessibility in space are initialized.

Atomic coordinate and label data read from unit 13 again, and radii and charge assigned to each atom. Distribution of dielectric values, ionic strength parameters and charge values over the lattice are determined from the coordinate/charge/radius data

Atom file with charge and radii records outputted on unit 19 if requested.

Centres of + and - charge distributions, and net charge calculated for check on charge distribution.

The Molecular Surface (MS) is calculated and mapped onto the cubic grid. The output is an array that specifies whether each grid branch mid point is inside or outside the solvent excluded surface.

Arrays are set up for the difference equation iteration.

Boundary values are set- either through analytical expressions or from interpolation into the potential map read on unit 18.

Linear then non-linear iterative relaxations are done and convergence histories are displayed as simple log/lin line plots.

Potentials are converted to concentrations if requested.

Potentials and fields are calculated at the coordinates of the atoms read on unit 15, and outputted to unit 16 if requested.

Grid of potentials is increased in fineness to 65x65x65 if necessary and outputted on unit 14.

The dielectric map is outputted on unit 17.

 

Unformatted input/output option

Since charges and radii have to be assigned to a molecules atoms before the Poisson Boltzmann equation can be solved, the program has assignment routines. Since many runs are often done on the same protein/dna/whatever, it seemed sensible to be able to avoid the repetition of this by writing out a file with the coordinates, radii and charges. Secondly, since unformatted read/writes are much quicker, we store the information in an unformatted file, typically with extension

Further, since assignments of charge are made if a site potential file is to be written, one might want to avoid this too, so there is an option to store the frc-type pdb file in an unformatted file, so as to be read in rapidly on successive runs.

Finally, the site potential file itself can be written in unformatted form.

Unformatted versions of the files required for input do not initially exist- only formatted *.pdb files exist. The unformatted versions are generated by DELPHI itself, by setting certain output flags in the parameter file (See section on parameter file format). Old parameter files that do not contain these lines will by default not write unformatted files. Once the unformatted files exist, they can be linked to the appropriate input units. On subsequent runs of DELPHI, the program will automatically read the files as formatted/unformatted, depending on what type of file is linked.

 

UNFORMATTED VERSIONS OF INPUT FILES FOR DELPHI

UNIT

NAME

13

*.bb1 (unformatted input atom coordinates: generated from a previous run writing to unit 20)

15

*.bb2 (unformatted list of atom coordinates if site potential output is required: generated from a previous run writing to unit 21)

 

 

UNFORMATTED OUTPUT FILES for DELPHI

UNIT

NAME

20

*.bb1 unformatted output atom coordinate file with radii and charge records added for faster, unformatted input on unit 13 in subsequent DELPHI runs

21

*.bb2 unformatted list of coordinates for site potentials1 generated from the formatted pdb file attached to 15

22

*.bb2 unformatted list of potentials at coordinates read from the unformatted coordinate file attached to unit 15

 

FORM OF UNFORMATTED OUTPUT FILES

The format of files written on units 21 and 22 are both:

inum,xo,chrgv,pot,ex,ey,ez

where inum= atom number

xo = three entry array with atom coordinates

chrgv=any charge associated with the atom

pot=the potential

ex,ey,ez = the field at the atom.

For the file on unit 21, pot, Ex, Ey and Ez are all set at zero.

The format for the unformatted pdb file is:

xo,rad,chrgv

where rad is the atom radius, other quantities as before.

Energies calculated in delphi

The COULOMBIC ENERGY is calculated via coulombs law, and is the energy to bring the charges present from infinity to their final resting place, in a dielectric of that specified as being internal. This includes all charges, not just those in the grid.

The TOTAL ENERGY (formally called the grid energy) is that obtained from the potential at each charge WITHIN the grid, multiplying by that charge, and summing over all such sites. It contains all the real electrostatic energy terms, plus the self energy of the grid. This latter is not a meaningful number in itself, but can be subtracted out to yield meaningful quantities such as solvation energies etc.

The REACTION FIELD ENERGY term (formally called the solvation energy term) is obtained by calculating the induced surface charge at each surface point within the box, then using these charges to calculate the potential at every charge, not just those in the box. If the molecule lies entirely within the box, and there is no salt present, this corresponds to the energy of taking the molecule from a solvent of dielectric equal to that of the interior, to that of the exterior. Depending on the physical process, this may be the required solvation energy, but in general the solvation energy is obtained by taking the difference in reaction field energies between suitable final and initial reference states that define the required process- hence the name change. In the earliest versions of DELPHI, the reaction field energy was not accurate if the scale was less than 2 grids/angstrom. The current version uses a better representation of the dielectric surface, and by repositioning the induced surface charge at its correct position before calculating the reaction potential back at the charge, gives much more accurate estimates, even at 0.5 to 1 grids/angstrom. Thus solvation energy calculations should always be done via differences in reaction field energy, rather than via differences in total energy as was done before. NOTE however that the best way to assess the inaccuracies arising from the lattice representation is by repeating calculations at different scales or box fills, and observing the change in the calculated quantity.

 

 

Worked examples

THE POTENTIAL AROUND CU,ZN SUPEROXIDE DISMUTASE

OBJECTIVE

produce a potential map showing the potential distribution in and around CU,ZN Superoxide Dismutase

REFERENCES

Klapper et al, Proteins, 1:47 (1986) for parameters of the run and pictures of potentials in a slice through the two active site coppers.

Getzoff, Tainer et al., Nature 306:284,287 (1983) for structural details.

FILES

contour.kt: list of contour values in kT, 1 per line, input to CONTOUR2D.

sod.siz atomic radii, input to delphi.

sod.crg atomic charges, input to delphi.

sod.eps dielectric map=protein ÕshapeÕ.

sod.log log file of delphi run (finite difference calculation).

sod.pdb atomic coordinates, input to delphi.

sod.phi output potential map, input to contour2d.

sod.prm input parameters for delphi.

sodout.pdb output atomic coordinates with assigned radii and charges, input for contour2d.

sod.plt output plot file from contour2d for HP75 plotter showing potentials in slice thru 2 coppers.

NOTES

The grid size in this example is set to 67. The protein fills 80% of box. For further accuracy, focussing can be used . 400 iterations were perform, using the linear PB equation, since no salt is present (even with salt, SOD is not highly charged, so we could probably still use the linear equation). For large grids or better convergence, iterations should be increased. A max change in potential of <0.001kT/e is reasonable. Looking at the charge file sod.crg, note that only ‘full’ charges have been assigned-for potentials far out in solution, dipole charges (or partial charges) make little difference. A complete set of partial charges (including polar H’s) will require building of H atoms into sod.pdb using some modelling package. The program mkcontr was used to create sod.inp, a command file that can be sourced in InsightII (version 95 and higher) to display potentials as contour lines.

 

 

TEST CHARGES IN A SPHERE- Comparison with analytical solutions

OBJECTIVE

Determine the potential and field at each of 3 points in a 10A sphere of dielectric 2, surrounded by solvent with a dielectric of 80, containing 1-1 electrolyte , concentration 0.145M (physiological). Find the total electrostatic energy, and contributions from self energy (interaction of each charge with the potential it induces by polarizing solvent, a.k.a the image or solvation terms) and the interaction between charges. Results from the finite difference method (FDPB) are compared with analytical solutions from the Tanford-Kirkwood equations (TK). Potentials are in kT/e (== 25.6mV, or 0.593 kcal/mole at 25oC). Fields are in kT/e/A, and energies are in kT.

REFERENCES

Gilson Sharp and Honig, J. Comp. Chem 9:327 (1987), for method and error assessments in the finite difference procedure for solving the Poisson-Boltzmann eqn. (FDPB method).

Gilson, M., Sharp, K., Honig, B. Calculating electrostatic interactions in bio-molecules: Method and error assessment. J. Computational Chem. 9, pp327-335. For a definition of the various electrostatic terms calculated here.

Tanford and Kirkwood, JACS 79:533 (1957), T.L. Hill, J.Phys.Chem. 60:253 (1956) for analytical series solutions to the Poisson-Boltzmann equation for charges in a sphere.

OUTPUT FILES

Files with extension *.frc contain potentials, fields and sum of 1/2.(potential*charge) from runs as follows: For prefix sph#, # refers to which of the three charges was present. No number means all three charges were present. All calculations were done with dielectric 2 for the sphere, 80 for the solvent, and ionic strength of 0.145M. Analytical solutions from the Tanford-Kirkwood equations for the 2/80 case are in files with suffix _anal:

sph1.frc

sph1_anal.frc

sph2.frc

sph2_anal.frc

sph3.frc

sph3_anal.frc

sph.frc

sph_anal.frc

sph_anal_slf.frc

NOTES

A grid resolution of 1.4 grids/Å was used along with a boxfil of 75% More accuracy in solvation and interaction energies can be achieved by using finer grids.

BACKGROUND TO ANALYSIS OF RESULTS

The image or reaction field: Consider a single charge in a low dielectric (Di) spherical cavity surrounded by high dielectric (Do) solvent. Its field polarizes the material in the cavity less than the surrounding solvent. The mathematical statement of this is that there is a discontinuity in the normal component of the field at the sphere boundary, such that Di.Eo = Do.Eo. Alternatively, a surface charge density is induced at the sphere boundary. This surface charge produces a field (the reaction field), which when added to the field from the charge yields the total field. The potential at the charge due to the reaction field is the self, or image potential, Ps, and the change in "solvation energy" of the charge is q.Ps/2.

Calculation of the reaction potential via the Finite Difference Procedure: When a FDPB calculation is performed, the potential at the site of a charge is not singular, since mapping onto a lattice essentially distributes the point charge over some volume element of order of one little grid box. The potential at the charge is then Pd = Pg + Ps, where Pg is some grid potential term, whose values has no-intrinsic meaning, but which is constant for a given charge location, grid size and boundary condition. The term Ps can be evaluated by subtracting the potential, Pu, at the charge with Do=Di (when Ps==0 by definition, and P = Pg), from the potential, Pd, with Do not = Di. Thus:

Ps = Pd - Pu,

Also the solvation energy change is:

dGs = q.(Pd - Pu)/2.

Similarly, the field at the charge is Ed = Eg + Es, and since (ignoring the error due to boundary conditions, Eg ==0), a good approx. to the field at the charge can be obtained directly from grad(Pd). This is provided in the *.frc file.

Many Charges in a Sphere: When there is more one charge, the potential at charge qi is

Pid = Piis + Pig + sum(Pjis + Pjic), i.ne.j

where Piis is the potential at i due to qiÕs polarization of solvent, Pig is grid potential at qi, Pjis is the potential at qi due to qjÕs polarization of the solvent, Pjic is the direct Coulombic potential due to qj at qi [ie Pjic = qj/(Di.rji)]. For a uniform dielectric case (Di=Do), Pijs==0 for all i,j, and:

Piu = Pig + sum(Pjic), i.ne.j

Similarly for the fields:

Eid = Eis + sum(Ejis + Ejic), i.ne.j

where E’s are vectorial quantities, and Eig==0. The total solvation energy is then

dGs = sum[qi.(Pid - Piu)/2.]

= sumi[qi.(Piis + sumj{Pjis},i.ne.j)]/2.

ANALYSIS OF RESULTS

In the first run, dielectrics of 2 and 80 are used and all three charges are present. The total field at each charge can be obtained directly with all 3 charges present (sph_2_80.frc), giving:

inner,outer dielectric: 2.0 80.0

ionic strength (M): 0.145

 

 

Coordinates

Charge

Electric Field (kT/e/ )

Number

X

Y

Z

(e)

Ex

Ey

Ez

1

-3

0

0

1

6.07

-4.37

0

2

3

0

0

2

-7.63

-67.43

0

3

3

3

0

-2

-4.424

-69.69

0

while the TK analytical results (in sph_anal.frc) are:

Number

X

Y

Z

(e)

Ex

Ey

Ez

1

-3

0

0

1

5.84

-4.2

0

2

3

0

0

2

-7.55

-68.0

0

3

3

3

0

-2

-4.48

-70.2

0

with around 2% error or less. The total solvation energy is obtained from the three charge run as -22.4 kT. The analytic value for D=2/80 from sph_anal.frc is -386.6543, while from Coulombic potentials for D=2/2 it is -364.11, yielding a difference of -22.5 kT, for less than 1% error.

Thus fields at all the charges and total solvation energy can be calculated simultaneously with all charges present, with one run.

To calculate the individual pairwise interactions, three sets of runs must be done, each with only one charge present in turn (file prefixes sph1, sph2, sph3), with D=2/80. Analytical results are from *_anal.frc files.

For charge 1, FDPB calculated fields (sph1.frc) and analytical fields are:

FDPB

TK analytical

Ex

Ey

Ez

Ex

Ey

Ez

1.01

0

0

0.988

0

0

-7.4

0

0

-7.32

0

0

-4.93

-2.77

0

-4.95

-2.67

0

for charge 2:

FDPB

TK analytical

Ex

Ey

Ez

Ex

Ey

Ez

14.8

0

0

14.6

0

0

-2.03

0

0

-1.98

0

0

-2.0

-69.43

0

-1.95

-69.9

0

for charge 3:

FDPB

TK analytical

Ex

Ey

Ez

Ex

Ey

Ez

-9.75

-4.37

0

-9.8

-4.2

0

1.79

-67.43

0

1.75

-68.0

0

2.51

2.51

0

2.42

2.42

0

 

Interaction potentials between different charges are also obtained from these runs:

Charge pair

Potential (kT/e)

FDPB

TK

1-3

16.35

16.4

2-1

42.92

42.5

2-3

-128.6

-126

 

SOLVATION ENERGY OF THE ACETATE ION

OBJECTIVE

To calculate the electrostatic contribution to the solvation energy of the acetate ion in water. The acetate molecule is assumed to have a dielectric of 2, accounting for electronic polarizability, and the water to have dielectric 80 (actually 78.6).

REFERENCES

Gilson, Sharp and Honig, J. Comp. Chem. 9:327 (1988). For general method.

Rashin and Namboodri, J. Phys. Chem., 91:6000 (1987), and Gilson and Honig, Proteins, 3:32 (1988) for solvation energy calculations- the values for acetate are taken from the last reference.

Kang, Nemethy and Scheraga, J.Phys.Chem. 91:4118 (1987). Wolfenden, Anderson and Cullis, Biochem. 20:849 (1981), for experimental solvation energies.

FILES

ace.crg charge file with CHARMM partial charges for acetate.

ace.pdb input atom coordinates for acetate built by CHARMM.

ace.prm parameter file for delphi.

ace.siz radius file, with charmm radii.

ace_2_1.frc reference run with vacuum dielectric outside (D=2/1).

ace_2_80.frc solvation energy run, with solvent (dielectric 80 present). (D=2/80).

ace_2_80.log log file.

ace_2_1.log log file

ace_delphi.pdb output atom coordinate file including charges and radii.

 

NOTES

The basic idea here is to calculate the sum of (charge*potential)/2. at all the charged atoms in acetate. In the case with solvent present (two dielectric, D=2/80), this energy contains four terms: the energy of interaction of each charge with all the others, the interaction of each charge with the reaction field induced by its own field, the interaction with the reaction field of other charges, and the self energy of the grid. For the vacuum dielectric case D=2/1, only the grid self energy and the direct charge charge interaction are present. Subtracting the energy in the D=2/1 run from the energy in the D=2/80 run, we are left with only the charge/solvent interaction energy, or solvent energy.

RESULTS

Run

Reaction field energy (kT) from .log files

D=2/80

-61.1

D=2/1

60.4

Difference = 121.5 == -72 Kcal/mole

Experimental values range from -79 to -87 kcal/mole. Previously calculated value from Gilson and Honig was -70 kcal/mole.

 

 

PK SHIFTS IN SUBTILISIN

OBJECTIVE

Calculate the expected change in pK of the active site histidine in subtilisin due to site directed mutagenesis of asp99->ser and glu156->ser at two different ionic strengths.

REFERENCES

Gilson and Honig, Nature 330:84 (1987)

Sternberg et al, Fersht, Nature 330:86 (1987) for experimental data and similar theoretical calculations (but not including ionic strength).

FILES

sbt.crg atom charges

sbt.pdb atom coordinates from brookhaven data bank

sbt.siz atom radii

sbt1.prm run at 0.1M ionic strength

sbt1.log log file

sbt1.frc frc file

sbt01.prm run at 0.01M ionic strength

sbt01.log "

sbt01.frc "

sbt_99_156.pdb atomic coordinates where potentials outputted

NOTES

The change in pK of a group due to to the electrostatic potential of another group is the change in work necessary to bring a proton from solution to that group, ie dpK = phi/2.303, where phi is the potential at one group due to the other, in units of kT/e. Since reciprocity holds, the charge can be put on either group, and the potential calculated at the other group. Thus for this case, it is more efficient to put the charge on the histidine 64, and get the potential at all other charged groups simultaneously. Note that where site-directed mutagenesis being performed, the change in bulk of the side chain could also have an effect on the potential at the titrating group, and so runs should be done with both side chains present, and the difference in potential taken. Also this procedure only yields CHANGES in pK, since the intrinsic pK of the group is not known (unless measured or calculated independently). The intrinsic pK being the pK of the group with all the other titratable groups neutral, ie the difference in intrinsic pK of an isolated side chain and one in a folded protein depends on the difference in solvation energy of the charged specie when incorporated into the protein.

The brookhaven data bank file for subtilisin contains oxygen coordinates for several water oxygens. In this example these waters are included as part of the molecule (ie with dielectric 2). What the dielectric behavior of crystallographically bound waters is, and whether they shold be included as part of the molecule is an open question.

In this example the centre of the protein is not set at the grid centre, but is offset 0,5,0.5,0.5, angstroms. This is merely to bring attention to the fact that when doing quantitative work, it is advisable to check that the results are not sensitive to the precise position of the molecule with respect to the grid. This applies particularly when conclusions are based on calculations from a few potential values at selected coordinates, such as for the pK changes in this example, and with the solvation eenrgies in example 3. If sensitivity to grid position is found results from runs with different positions of the molecule upon the grid should be averaged (eg see discussion of rotational averaging in Gilson,Sharp and Honig, (1988) J.Comp Chem, 9:327).

RESULTS

The data is taken from sbt1.frc (0.1M) and sbt01.frc (0.01M) and average potentials at the asp 99 and glu 156 carboxyl oxygens computed. This gives the histidine pK changes when these groups are mutated to serines:

Case

Ionic Strength

pK change

Experimental

(Fersht et al)

Gilson & Honig

Calculated

Asp 99

0.1M

0.23-0.29

0.18

0.17

0.01M

0.42

0.29

0.25

Glu156

0.1M

0.25

0.2

0.25

0.01M

0.42

0.37

0.34

Differences from calculations of Gilson & Honig arise since focussing boundary conditions and rotational averaging were not used here.

 

 

HINTS

Caveat Emptor...

When starting off, or running a new molecule, always use the option to output a PDB file with the assigned radii and charges in it. This will help check that the assignments were made as you intended them.

Always check the number of charges assigned, net positive, negative and total charge, as printed in the log file. Especially check problem residues such as histidines. Several charge states are included in some of the *.crg files, but these are by no means gospel. DelPhi will place warnings in the log file if residue charges don’t add up to a whole number. If charges are found outside the grid box, and a solvation energy is requested, the program will abort.

Check the alignment of fields in the PDB file against those used by DelPhi. Pay attention especially to: the atom name, special characters (such as ‘ or * in nucleic acid naming conventions), hydrogen names (e.g. H12 v. H2 v. HC1 etc). The program is case insensitive. As long as the name lies totally within the field, right or left justification is irrelevant. However included blanks will cause problems.

Check the dependence of your results on all numerical parameters, especially the grid scale, or box fill, and the number of non-linear iterations, the orientation and position of the molecule on the grid. Variation of more than a few percent indicates poor choice of parameters inappropriate application of the method, or limitations of the method due to the finite size and resolution of the lattice. Try using averaging and focussing to extend the latter.

DelPhi will output warnings if parts of molecule lie outside the grid box and when heavy atoms are assigned a zero radius.

While doing focussing calculations, as in pK calculations e.g., it is essential one keeps the grid size the same throughout. Since this version of delphi calculates the grid size based on grid resolution and percent box fill, it is essential to note the grid size from a test run and input this quantity in all focussing calculations on that system. Since grid size, scale and percent fill are all related, only one additional quantity need to be specified along with grid size. If this were percent fill, then this could change as in 50%, 100%, 200% etc. Alternatively the some multiple of scale could be used. E.g., 0.5,1.0,2.0,4.0 grids/ .

Remember that the units are proton charge (e), for charge, angstroms ( ) for distance, moles/litre for ionic strength. The dielectric is usually defined as a ratio relative to the permittivity of vacuum, and so is unitless. However an inspection of Coulomb’s law indicates that it really has units of energy x distance / charge squared. The numerical value is about 1/332 Angstroms*kcal/mole per proton charge squared. Ie. working electrostatic calculations in these units, and multiplying the answer by 332 will give potentials in kcal/mole/charge. Similarly, for units of kT/e at 25 Celsius, the ‘natural’ units for chemistry in the laboratory, the factor is 561. DelPhi and most of the examples use units of kT/e. Note however this only works at 25 Celsius. To convert to other temperature, use 561x(300/Temp), and remember that the dielectric constant is a function of temperature.

Electrostatic energies are FREE ENERGIES, since the dielectric response includes the entropy of organization in response to a field, ie. the product of charge and potential is the isothermal work to bring that charge from infinite to that point.

 

Appendix A: An index of Statements and their shorthands

Statement Long Form

Short

2L abr

Default Value

       

AUTOCON
AUTOCONVERGENCE
AUTOMATICCONVERGENCE

AUTOC

AC

TRUE

BOUNDARYCONDITION
BOUNDARYCONDITIONS

BNDCON

BC

2(=DIPOLAR)

BOXFILL
PERCENTFILL
PERCENTBOXFILL

PERFIL

PF

80 %

CLCSRF

CLCSRF

CS

FALSE

CONVERGENCEINTERVAL

CONINT

CI

10

CONVERGENCEFRACTION

CONFRA

CF

1

EXTERIORDIELECTRIC
EXTERNALDIELECTRIC

EXDI

ED

80

FANCYCHARGE
SPHERICALCHARGEDISTRIBUTION

FCRG

FC

FALSE

GRIDSIZE

GSIZE

GS

AUTOMATIC

GRIDCONVERGENCE

GRDCON

GC

0.0

IONICSTRENGTH
SALTCONC
SALTCONCENTRATION

SALT

IS

0.0

IONRADIUS

IONRAD

IR

0.0/2.0

ITERATION
ITERATIONS
LINEARITERATION

LINIT

LI

AUTOMATIC

INTERIORDIELECTRIC

INDI

ID

2.0

LOGFILECONVERGENCE

LOGGRP

LG

FALSE

LOGFILEPOTENTIALS

LOGPOT

LP

FALSE

MEMBRANEDATA

MEMDAT

MD

FALSE

NONLINEARITERATION
NONLINEARITERATIONS

NONIT

NI

0

PERIODICBOUNDARYX

PBX

PX

FALSE

PERIODICBOUNDARYY

PBY

PY

FALSE

PERIODICBOUNDARYZ

PBZ

PZ

FALSE

PROBERADIUS

PRBRAD

PR

1.4

SCALE

SCALE

SC

1.2

SOLVPB

SOLVPB

SP

TRUE

 

Appendix B: Glossary of Input Parameters:

AUTOCONVERGENCE

The program by default will automatically calculate the number of iterations needed to attain convergence. See also LINIT and GC options

BOUNDARY CONDITION

integer flag specifying the type of boundary condition imposed on the edge of the lattice
allowed options:

1== potential is zero.
2== Approximated by the Debye-Huckel potential of the equivalent dipole to the molecular charge distribution,
 
where f is the potential estimate at a given lattice boundary point, q+ (q-) is the sum of all positive (negative) charges, and r+ (r-) is distance from the point to the center of positive (negative) charge, d is the Debye length, and e is the solvent dielectric constant.
3== focussing, where a potential map from a previous calculation is read in on unit 18, and values for the potential at the lattice edge are interpolated from this map - clearly the first map should have been generated with a coarser grid (greater distance between lattice points) and positioned such that current lattice lies completely within old lattice or program will protest. (See note 5)
4== Approximated by the sum of Debye-Huckel potentials of all the charges.
 
where now qi is the i'th charge, and ri is distance from the lattice boundary point to the charge.

BOXFILL/PERFILL

percentage ( 0) of the lattice that the largest of the x,y or z linear dimensions of the molecule will fill. This along with grid resolution will determine the number of grids needed for the solver. The percentage fill of the lattice will depend on the application (see note 2).

CALCULATE SURFACE (CLCSRF)

when set to true, outputs a GRASP viewable surface file in the name grasp.srf

DIELECTRIC CONSTANTS

dielectric constants (= 1.0) for the molecule(INDI) and the surrounding solvent (EXDI). See note 4.

FANCY CHARGE DISTRIBUTION(FCRG)

normally set to false indicating a linear cubical interpolation of charges to grid points; set to true this turns on a spherical charge distribution see note 8 for more details.

GRIDSIZE

(odd) integer number of points per side of the cubic lattice, normally set by the program depending on size of molecule, grid resolution and percentage fill. (see note 1)

GRID RESOLUTION (SCALE)

expressed in no. grids/Angstrom this indicates the lattice spacing; this is used along with the PERFIL parameter to determine the number of grids in each dimension required to enclose the molecule.

GRIDCONVERGENCE

when set to a fractional value, the iteration continues till two successive 10 iterations result in grid energies that are within GC

IONICSTRENGTH

ionic strenght of solvent in moles/litre (=0.0)

IONRADIUS

thickness of the ion exclusion layer around molecule (in Angstroms); default values are 0.0 if ionic strength is not mentioned and 2.0 if ionic strength is greater than zero. see note 4

LINEAR ITERATIONS

integer number (3) of iterations with linear equation. (normally not set. see note 7).

MODIFIED PDB FILE

This option produces a modified PDB file written on named file (or unit 19 by default), containing the radius and charge assigned to each atom written after the coordinates, in the fields used for occupancy and B factor. It is recommended that this option be set initially so that you can check that all the radius and charge assignments are correct. An additional check on the charge assignment can be made by looking at the total charge written to the log file. Also this modified PDB file is required for displaying cross-sections of the molecule surface in the 2D potential contour program CONTOUR2D.

NON-LINEAR ITERATIONS

integer number (= 0) of non-linear iterations. If linear PB equation only is required set NI = 0.

PERIODIC BOUNDARY CONDITION

three logical flags (t/f) for periodic boundary conditions for the x,y,z edges of the lattice respectively. Note that periodic boundary conditions will overide other boundary conditions on edges to which they are applied. See note 6.

PROBERADIUS

radius of the solvent probe molecule that will define solvent excluded surface in Lee and Richard's sense (in Angstroms 0.0). See note 4.

SOLVER ON/OFF OPTION (SOLVPB)

normally DelPhi will invoke the Poisson-Boltzman solver but if you are interested in using DelPhi for other things such as calculating surface area or producing a GRASP viewable surface file, you can turn off the solver using this option.

 

Appendix C: File Formats

A description of the input required by DELPHI is given below, with some suggested parameter values. However no hard and fast rules can be given since in many cases a tradeoff is involved. More details are given in the notes, Appendix D.

INPUT FORMATS

UNIT 10

Default extension .prm. Contains input parameters read in free format. Values are real unless stated otherwise. See section under Files: Input Parameter File

UNIT 11

Default extension *.siz. List of rules describing the van der Waals radii to be assigned to each atom/residue pdb record type. The format is described in Section 4.2.3 on files. Note the atom and residue fields ignore case and leading blanks. The residue field may be left blank (wild card), causing a match with the given atom type of any residue. ONLY if the residue field is left blank, the LAST 5 characters of the atom record maybe left blank. In this case all atom types beginning with the letter in column 1 will be matched. Records of greater specificity overide those of less specificity. Beware of ambiguities like calcium (ca) and alpha carbon! All atoms of an input *.pdb file must be assigned a radius through the *.siz file, even if it is 0, or an error will be flagged.

UNIT 12

Default extension *.crg. List of rules describing the atomic charges to be assigned to each atom/residue/number/chain pdb record type. The format is described in Section 4.2.4 on files. The ascii fields for atom, residue, number and chain ignore case and leading blanks. Any field except the atom name may be left blank and will be treated as a wild card. Records of greater specificity overide those of less specificity as for the *.siz file above.

search order:

atom_res_num_chain

atom_res_num______

atom_res_____chain

atom_res__________

atom_____num_chain

atom_____num______

atom_________chain

atom______________

 

 

Atoms that do not find a match in the *.crg file will be neutral (q=0.0)

UNIT 13

Standard format Brookhaven protein data bank file containing atom labels and coordinates. Only records starting with ATOM or HETATM are used. Default extension *.pdb. Format (as assumed by the program: beware many minor and not so minor differences abound in the literature and on disk!): (A6,5X,A5,X,A3,X,A1,A4,4X,3F8.3) header, atom name, residue name, chain name, residue number, x,y,z. coordinates. Note that the program treats the residue number as an ascii string, not an integer.

UNIT 15

Default extension: *.pdb. list of coordinates where site potentials are output, format as for unit 13

UNIT 18

default extension *.phi potential map for focussing boundary conditions. Potentials are in kT/e (25.6mV, 0.593 kcal/mole at 25oC). Format:

unformatted (binary file)

character*20 uplbl

character*10 nxtlbl,character*60 toplbl

real*4 phi(65,65,65)

character*16 botlbl

real*4 scale,oldmid(3), integer*4 igrid

uplbl,nxtlbl,toplbl,botlbl are ascii information. Phi is the 3D array conatining values of potential for all the lattice points. Index order is x,y,z. Scale is lattice scale in grid squares/angstrom. Oldmid is the x,y,z coordinates in real space (angstroms) of the centre of the lattice: thus the real space coordinates x,y,z of the lattice point for phi(IX,IY,IZ), for the case where IGRID = 65, are:

x = (IX - 33)/scale + oldmid(1)

y = (IY - 33)/scale + oldmid(2)

z = (IZ - 33)/scale + oldmid(3)

where 33 = (65+1)/2 is the middle point of the grid.

OUTPUT FORMATS

UNIT 6

Output from program, including error messages and convergence history. When run interactively, appears on standard output. Default extension *.log when run in batch

UNIT 14

If flag IBIOS is false, then output is in DELPHI format, default extension *.phi. Output potential map or concentration map, with format same as for unit 18 above. When GRASP option is chosen for potential map then potentials are interpolated to produce 65X65X65 density lattice for compatibility with display/analysis programs (GRASP).

If flag IBIOS is true, then output is in INSIGHT format, default extension *.ins. This is an unformatted (binary) file

character*132 toplbl !ascii header

integer*4 ivary !0 => x index varys most rapidly

integer*4 nbyte !=4, # of bytes in data

integer*4 inddat !=0, floating point data

real*4 xang,yang,zang !=90,90,90 unit cell angles

integer*4 intx,inty,intz !=igrid-1, # of intervals/grid side

real*4 extent !maximum extent of grid

real*4 xstart,xend !beginning, end of grid sides

real*4 ystart,yend !in fractional

real*4 zstart,zend !units of extent

write(14)toplbl

write(14)ivary, nbyte, intdat, extent, extent, extent,

xang, yang, zang, xstart, xend, ystart, yend, zstart, zend, intx, inty, intz

do k = 1,igrid

do j = 1,igrid

write(14)(phimap(i,j,k),i=1,igrid)

end do

end do

Note that for grid sizes less than 65, INSIGHT format files will occupy less disk space than the corresponding DELPHI files. *.ins files are designed as input to a Biosym Corp. stand alone utility called CONTOUR, supplied with INSIGHT Version 2.4. This program will produce contour files for display with INSIGHT.

UNIT 16

Default extension *.frc. A list of potentials and fields at coordinates in *.pdb file read on unit 15. Format: 12 lines of ascii header information, followed by a variable number of records written as:

230 format(8G10.3)

write(16,230)xo,chrgv,phiv,fx,fy,fz

where xo(3) are x,y,z coordinates of charge, chrgv is the charge value, phiv is the potential (in kT/e) at that point, and fx,fy,fz are the field components (in kT/e/Angstrom). The last line of the file is the sum of chrgv*phiv/2 over all the charges in the file. This quantity can be used for calculating solvation and interation energies- see examples 2 and 3 in Section 7.

UNIT 17

Dielectric bit map, default extension: *.eps. For a igrid^3 grid, There are 3*igrid*igrid*igrid lines joining neighbouring grid points, igrid*igrid*igrid each in of the x,y,z directions. The midpoint of each line is given a value of 1 if it lies within the solvent accessible volume of the molecule, 0 if outside. By this means the shape of the molecule is defined, and space is separated into two regions with different dielectrics. For compact output purposes the array of REAL*4, epsmap(igrid*igrid*igrid,3), is compressed into an INTEGER*2 array, neps(5, igrid*igrid), by bitmapping: the first index of epsmap, range 1-igrid is compressed into the first index of neps, range 1-igrid/16, where the indices 1-16 go into bits 0-15 of the word with index 1, indices 17-32 -> bits 0-15 of word with index 2 etc. The array neps is then written to an unformatted binary file:

write (17) imap, scale, oldmid

write (17) neps

where imap is an unused integer*4 flag and scale, oldmid(3) are real*4 scaling information as above.

UNIT 19

List of input coordinates read from unit 13, default extension *.pdb, with the assigned atomic radius and charge of each atom placed in columns 55-60 and 61-67 in formats F6.2,F7.3 respectively. Other formats are same as for unit 13

 

 

Appendix D: Notes on Input Parameters

1. GRID SIZE

The number of iterations required to reach a certain convergence will increase approximately linearly with parameter GS. Since the time per iteration will go up as the cube of this parameter the amount of calculation will thus increase at about the fourth power of GS. In this version of DelPhi, this parameter is automatically set within the program and directly affects the amount of CPU memory required to run the program. This parameter has an upper limit of 257 in the current version.

2. PERFIL

A large % fill will provide a more detailed mapping of the molecular shape onto the lattice. On the other hand it will bring the dielectric boundary of the molecule closer to the lattice edge. This will cause larger errors arising from the boundary potential estimates, which are set to zero or approximated by coulombic/debye-huckel type functions using a uniform solvent dielectric. At higher salt concentrations, and for weakly charged molecules the potentials at the boundary, and consequently, the error, will be smaller. Smaller percentages will increase the accuracy of the boundary conditions, but result in a coarser representation of the molecule. For a single run, a reasonable compromise seems to be about 60% for looking at potentials outside a protein, and 90% for solvation energy calculations. The tradoff involved in scaling can at the cost of extra computing be avoided by FOCUSSING. Start with a small percentage, say 10-20%, using zero or Debye-Huckel boundary conditions, and then focus in to say 90%, in one (or two) stages, using focussing boundary conditions for the second (and third) runs. It is not necessary for the molecule to lie completely within the grid although then the potential boundary conditions must be generated by focussing. However when calcu- lating solvation energies with box fills of > 100% remember that unexpected results may be obtained since parts of the surface, (and perhaps some charges) are not included in the grid (see notes on QDIFFXS regarding energy calculations). In any quantitative calculation, it is advised that the largest scale possible be used, preferably above 2 grids/angstrom (for larger molecules this may not be possible without prohibitive increase in memory requirements). Whatever the grid scale, calculations should be repeated at different scales to assess the size of lattice resolution errors.

3. OFFSET

This parameter is used to specify the grid point at which the centre of the molecule [pmid(3)] is placed, or conversely, what point of the molecule [oldmid(3)] is placed at the centre or the grid. The relationship between real space r(i) and grid g(i) coordinates for a grid size of igrid, with a scale of gpa grids/angstrom is as follows:

The centre of the grid is

midg = (igrid+1)/2

oldmid(i) = pmid(i) - OFFSET(i)/gpa

g(i) = (r(i) - pmid(i))*gpa + midg + OFFSET(i)

r(i) = (g(i) - midg)/gpa + oldmid(3)

The scale, gpa and oldmid are printed in the logfile.

Note that a certain error inevitably results from the mapping of the molecule onto the grid. By moving the molecule slightly (changing OFFSET between 0,0,0 and 1,1,1) and repeating the calculations, it is possible to see whether the results are sensitive to the particular position on the grid, and if so, to improve the accuracy by averaging (This is related to rotational averaging, discussed in the J. Comp Chem paper of Gilson et al.). However using a larger scale is a more effective way of improving accuracy than averag- ing. To check on the positioning of the molecule within the grid as determined by PERFIL and OFFSET, the output dielectric bit map, FILE.eps, can be examined by taking slices using the utility program LOOKEPS.

4.DIELECTRIC CONSTANTS

A value of INDI=1 corresponds to the molecule in vacuum, EXDI=80 to the molecule in water, and INDI=EXDI to a reference state where there is no dielectric boundary. Depending on the application runs with EXDI equal to either of these values may be used to represent different states in a thermodynamic cycle. A value of INDI=1 corresponds to a molecule with no polarizability- a state of affairs assumed in most molecular mechanics applications. Epsin=2 represents a molecule with only electronic polarizability (ie assuming no re-orientation of fixed dipoles- peptide bonds etc). A value of 2 is based on the experimentally observed high frequency dielectric behavior of essentially all organic materials. INDI=4-6 represents a process where some small reorganization of molecular dipoles occurs which is not represented explicitly (for example in modeling the effects of site directed mutageneis experiments, when the structure of the wild type, but not mutant protein is known). This value is based on a number of experimental and theoretical papers (eg M.K. Gilson and B. Honig, Biopolymers, 25:2097 (1986)) which indicate that a macroscopic material which consisted of similar dipole density, moment and flexibility as globular proteins would have a dielectric of 4-6. In modelling any process where a large reorientation of dipoles, or large conformational change occurs, ie upon folding or denaturation, then using a dielectric constant for the molecule would be inappropriate, and the change in conformation should be modelled explicitly.

IONRAD,PRBRAD

The first parameter, IONRAD, in combination with the atomic van der Waals radii in the *.siz file, determines the regions of space, and hence the lattice points, which are inaccessible to solvent ions (ie where Do=0 in equation 1).

The second parameter, PRBRAD, in combination with the atomic van der Waals radii in the *.siz file, determines the regions of space, and hence the lattice points, which are inaccessible to solvent molecules (water) (ie where Ei=EPSIN in equation 1).

Suggested values are IONRAD = 2.0 for sodium chloride, and PRBRAD = 1.4-1.8 for water. To understand how these parameters work, you should be familiar with the concepts of contact and solvent accessible surface, as discussed by Lee and Richards, and by Mike Connolly.

For the purpose of DelPhi, a solvent ion is considered as a point charge, which can approach no closer than its ionic radius, IONRAD, to any atoms Van der Waals surface. The ion excluded volume is thus bounded by the contact surface, which is the locus of the ion centre when in Van der Waals contact with any accessible atom of the molecule. A zero value for IONRAD will just yield the Van der Waals volume. A non zero value of IONRAD will thus introduce a Stern, or ion exclusion layer around the molecule where the solvent ion concentration will be zero, and whose dielectric constant is that of the solvent, EXDI.

For the purpose of DelPhi, any region of space that is accessible to any part of a solvent (water) molecule is considered as having a dielectric of EXDI.

A value of zero for PRBRAD used with a *.siz file containing the standard van der Waals radii values will assign any region of space not inside any atom's van der Waals sphere to the solvent. Note that this can include regions of the molecule not actually accessible to the solvent, such as the interstices where the spherical atoms pack together in the molecule's interior, which would erroneously be assigned the solvent dielectric. Thus this combination of parameters should generally not be used.

If a constant distance (say equal to the solvent molecule radius) is added to each radius in the *.siz file, and PRBRAD is set to 0, then any point lying within the contact surface (as defined by by Lee and Richards) will be assigned the molecule dielectric. Note that on the surface a shell of space that is in fact accessible to the solvent will be included as part of the molecule dielectric- This could be used for example to model a shell of immobilized water around a molecule that had a low dielectric constant.

If PRBRAD is not zero, and standard van der Waals radii are used (default situation), then the solvent accessible volume will be generated. This is the region inaccessible to any part of a solvent probe sphere of radius PRBRAD. Its surface as defined by Lee and Richards, consists (Connolly) of convex portions composed of the atomic van der Waals surfaces when the probe sphere is in contact with one atom, and concave portions composed of the probe sphere surface, when the latter is in contact with two or more atoms.

5. BOUNDARY CONDITIONS

Unless focussing is being used, it is recommended that the coulombic boundary condition (ibctyp=4) be used. For focussing boundary conditions, the program reads in a potential map from a previous run, and compares the scale of the focussing map with that for the current run. If they are the same, it assume that this is a continuation of a previous run, and iteration of the potentials contained in the previous potential map is continued. If the scales are not the same, it checks to ensure that the new lattice lies completely within the old lattice before interpolating the boundary conditions.

6. PERIODIC BOUNDARY CONDITIONS

Periodic boundary conditions can be applied in one or more of the x,y or z directions. When applied, the potential at each periodic lattice boundary point is iterated by equation 1 by supplying its missing neighbor(s) from the corresponding point on the opposite edge of the lattice.This can be used for example to model an infinite length of DNA. Assume that the helical axis of the DNA in the *.pdb file is aligned along the Z axis. The periodic boundary flags are set to false,false,true, and the percent fill of the box, PERFIL, is adjusted so that an integral number of turns just fill the box in the Z direction. Normal boundary conditions are applied to the X,Y boundaries. By setting two, or three of the boundary flags to true, one can simulate 2 dimensional or 3 dimensional cubic lattices of molecules.

7. LINIT

The convergence behaviour of the finite difference procedure is reported in the log file as both the mean and maximum absolute change in potential at the grid points between successive iterations. The latter is probably more important since it puts an upper bound on how much the potential is changing at the grid points. It is suggested that sufficient iterations be performed to give a final maximum change of less than 0.001 kT/e. The number of iterations per se is not important, as long as its sufficient to give the required convergence. The convergence behavior can also be judged from the slope of the semi-log plot of the mean and max changes given in the log file. LINIT is best determined by experience, since the convergence rate depends on several factors. Start with say 100 iterations, and then increase the number of iterations until sufficient. Note that a run can be restarted by using focussing boundary conditions with exactly the same PERFIL and OFFSET values (see note 5). Some guidelines are: The number of iterations needed will increase with grid size. It will decrease with decreasing PERFIL, since the potentials converge more rapidly in the solvent. It will decrease with increasing ionic strength. The number is fairly insensitive to the size and number of charges on the molecule. Convergence will slow with use of the non-linear equation (NONIT not zero). It is suggested that for more rapid convergence and greater numerical stability do 30% linear iterations, then refine the solution with 70% non-linear iterations.

8. SPHERICAL CHARGE DISTRIBUTION

If an atomic charge does not lie exactly on a grid point, then it must somehow be distributed onto the grid points. If this flag is set false, the standard algorithm is used which distributes charge to nearest 8 grid points (quick and simple, see the Proteins paper of Klapper et al.). If this flag is set true, then an algorithm is used which gives a more spherically symmetric charge distribution, although the charge is now spread over a wider region of space. For certain cases this gives higher accuracy for potentials less than 3 grid units from a charge (see J.Comp. Chem paper), although this point has not been exhaustively explored.