Constructing a molecular system

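The construction of a complete system for simulation or analysis involves some or all of the operations covered in this chapter: creating chemical objects, grouping them into collections, placing them in a universe, and choosing a force field.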

MMTK offers a large range of functions to deal with these tasks.


Creating chemical objects

Chemical objects (atoms, molecules, complexes) are created from definitions in the database. Since these definitions contain most of the necessary information, the subsequent creation of the objects is a simple procedure.

All objects are created by their class name (MMTK.Atom, MMTK.Molecule, and MMTK.Complex) with the name of the definition file as first parameter. Additional optional parameters can be specified to modify the object being created. The following optional parameters can be used for all object types:

Some examples with additional explanations for specific types:

Proteins, peptide chains, and nucleotide chains

MMTK contains special support for working with proteins, peptide chains, and nucleotide chains. As described in the chapter on the database, proteins can be described by a special database definition file. However, it is often simpler to create protein objects directly in an application program. The classes are MMTK.Proteins.PeptideChain, MMTK.Proteins.Protein, and MMTK.NucleicAcids.NucleotideChain.

Proteins can be created from definition files in the database, from previously constructed peptide chain objects, or directly from PDB files if no special manipulations are necessary.

Examples: Protein('insulin') creates a protein object for insulin from a database file. Protein('1mbd.pdb') creates a protein object for myoglobin directly from a PDB file but leaves out the heme group, which is not a peptide chain.

Peptide chains are created from a sequence of residues, which can be a MMTK.PDB.PDBPeptideChain object, a list of three-letter residue codes, or a string containing one-letter residue codes. In the last two cases the atomic positions are not defined. MMTK provides several residue models with different levels of detail: an all-atom model, a model without hydrogen atoms, two models containing only polar hydrogens (using different definitions of polar hydrogens), and a model containing only the C-alpha atoms, with each C-alpha atom having the mass of the entire residue. The last model is useful for conformational analyses in which only the backbone conformation matters.
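
The two textual sequence formats carry the same information. The following is a minimal sketch in plain Python, not MMTK code: the table ONE_TO_THREE and the function three_letter_codes are illustrative names, and the upper-case spelling of the codes is an assumption of this sketch rather than a statement about what MMTK expects.

```python
# Standard one-letter to three-letter amino acid codes (20 common residues).
ONE_TO_THREE = {
    'A': 'ALA', 'R': 'ARG', 'N': 'ASN', 'D': 'ASP', 'C': 'CYS',
    'Q': 'GLN', 'E': 'GLU', 'G': 'GLY', 'H': 'HIS', 'I': 'ILE',
    'L': 'LEU', 'K': 'LYS', 'M': 'MET', 'F': 'PHE', 'P': 'PRO',
    'S': 'SER', 'T': 'THR', 'W': 'TRP', 'Y': 'TYR', 'V': 'VAL',
}


def three_letter_codes(sequence):
    """Expand a string of one-letter codes into a list of three-letter codes."""
    return [ONE_TO_THREE[letter] for letter in sequence.upper()]


residues = three_letter_codes('GAVL')
print(residues)
```

Either form only names the residues; as stated above, a chain built from such a sequence has no atomic positions until a configuration is assigned.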

The construction of nucleotide chains is very similar. The residue list can be either a MMTK.PDB.PDBNucleotideChain object or a list of two-letter residue names. The first letter of a residue name indicates the sugar type ('R' for ribose and 'D' for deoxyribose), and the second letter defines the base ('A', 'C', and 'G', plus 'T' for DNA and 'U' for RNA). The models are the same as for peptide chains, except that the C-alpha model does not exist.
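
The naming scheme for nucleotide residues can be sketched as a small decoder. This is plain Python written for illustration; parse_nucleotide is a hypothetical helper, not an MMTK function, and the strict rejection of 'T' with ribose and 'U' with deoxyribose is an assumption drawn from the description above.

```python
SUGARS = {'R': 'ribose', 'D': 'deoxyribose'}
BASES = {'A': 'adenine', 'C': 'cytosine', 'G': 'guanine',
         'T': 'thymine', 'U': 'uracil'}


def parse_nucleotide(name):
    """Split a two-letter residue name into a (sugar, base) pair."""
    if len(name) != 2:
        raise ValueError("expected a two-letter name, got %r" % (name,))
    sugar_letter, base_letter = name.upper()
    if sugar_letter not in SUGARS or base_letter not in BASES:
        raise ValueError("unknown residue name %r" % (name,))
    # Assumption of this sketch: 'T' occurs only in DNA, 'U' only in RNA.
    if (sugar_letter, base_letter) in (('R', 'T'), ('D', 'U')):
        raise ValueError("base %r does not match sugar %r"
                         % (base_letter, sugar_letter))
    return SUGARS[sugar_letter], BASES[base_letter]


for residue in ['DA', 'DT', 'RU']:
    print(residue, parse_nucleotide(residue))
```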

Most frequently proteins and nucleotide chains are created from a PDB file. The PDB files often contain solvent (water) as well, and perhaps some other molecules. MMTK provides convenient functions for extracting information from PDB files and for building molecules from them in the module MMTK.PDB. The first step is the creation of a MMTK.PDB.PDBConfiguration object from the PDB file:

from MMTK.PDB import PDBConfiguration
configuration = PDBConfiguration('some_file.pdb')

The easiest way to generate MMTK objects for all molecules in the PDB file is then:

molecules = configuration.createAll()

The result is a collection of molecules, peptide chains, and nucleotide chains, depending on the contents of the PDB file. There are also methods for modifying the PDBConfiguration before creating MMTK objects from it, and for creating objects selectively. See the documentation for the modules MMTK.PDB and Scientific.IO.PDB for details, as well as the protein and DNA examples.

Lattices

Sometimes it is necessary to generate objects (atoms or molecules) positioned on a lattice. To facilitate this task, MMTK defines lattice objects which are essentially sequence objects containing points or objects at points. Lattices can therefore be used like lists with indexing and for-loops. The lattice classes are MMTK.Geometry.RhombicLattice, MMTK.Geometry.BravaisLattice, and MMTK.Geometry.SCLattice.
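
To illustrate what "usable like a list" means for a lattice, here is a minimal stand-alone sketch of a simple cubic lattice of points. ToyCubicLattice and its parameters are hypothetical names chosen for this example; the constructor arguments of the MMTK lattice classes are not shown here and should be taken from their reference documentation.

```python
import itertools


class ToyCubicLattice(object):
    """Points of a simple cubic lattice, usable like a read-only list."""

    def __init__(self, cell_size, cells_per_edge):
        # One point per combination of integer indices (i, j, k).
        self.points = [
            (i * cell_size, j * cell_size, k * cell_size)
            for i, j, k in itertools.product(range(cells_per_edge), repeat=3)
        ]

    def __len__(self):
        return len(self.points)

    def __getitem__(self, index):
        # Supports indexing, slicing, and (through IndexError) for-loops.
        return self.points[index]


lattice = ToyCubicLattice(cell_size=0.5, cells_per_edge=3)
print(len(lattice))
print(lattice[0])
for point in lattice:
    pass  # place an atom or molecule at each point here
```

Defining only __len__ and __getitem__ is enough for Python to treat the object as a sequence, which is the behaviour the text describes.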

Random numbers

The Python standard library and the Numerical Python package provide random number generators, and more are available in separate packages. MMTK provides some convenience functions that return more specialized random quantities: random points in a universe, random velocities, random particle displacement vectors, random orientations. These functions are defined in module MMTK.Random.
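
Two of these quantities can be sketched with the standard library alone. The functions below are illustrative stand-ins, not the MMTK.Random API; in particular, the box being centred on the origin is an assumption of this sketch.

```python
import math
import random


def random_point_in_box(edge_lengths, rng=random):
    """Uniform random point in a box centred on the origin (assumed layout)."""
    return tuple(rng.uniform(-0.5 * a, 0.5 * a) for a in edge_lengths)


def random_direction(rng=random):
    """Unit vector distributed uniformly over the sphere.

    Normalising three independent Gaussian numbers gives a direction
    without any bias towards the corners of a cube.
    """
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:
            return (x / norm, y / norm, z / norm)


rng = random.Random(42)  # fixed seed so that runs are reproducible
print(random_point_in_box((2.0, 4.0, 6.0), rng))
print(random_direction(rng))
```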

Collections

Often it is useful to treat a collection of several objects as a single entity. Examples are a large number of solvent molecules surrounding a solute, or all sidechains of a protein. MMTK has special collection objects for this purpose, defined as class MMTK.Collection. Most of the methods available for molecules can also be used on collections.

A variant of a collection is the partitioned collection, implemented in class MMTK.PartitionedCollection. This class acts much like a standard collection, but groups its elements by geometrical position in small sub-boxes. As a consequence, some geometrical algorithms (e.g. pair search within a cutoff) are much faster, but other operations become somewhat slower.
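
The speed-up from partitioning can be illustrated with a stand-alone sketch of a pair search. It is plain Python and does not reproduce the internals of MMTK.PartitionedCollection; the function names and the choice of a sub-box edge equal to the cutoff are assumptions made for this example.

```python
import itertools
import math
from collections import defaultdict


def distance(a, b):
    """Euclidean distance between two points given as 3-tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairs_brute_force(points, cutoff):
    """All index pairs (i, j), i < j, closer than cutoff; tests every pair."""
    return set(
        (i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
        if distance(points[i], points[j]) < cutoff
    )


def pairs_partitioned(points, cutoff):
    """Same result, comparing only points in the same or adjacent sub-boxes."""
    boxes = defaultdict(list)
    for index, point in enumerate(points):
        key = tuple(int(math.floor(c / cutoff)) for c in point)
        boxes[key].append(index)
    pairs = set()
    for key, members in boxes.items():
        for offset in itertools.product((-1, 0, 1), repeat=3):
            neighbour = tuple(k + o for k, o in zip(key, offset))
            for i in members:
                for j in boxes.get(neighbour, ()):
                    if i < j and distance(points[i], points[j]) < cutoff:
                        pairs.add((i, j))
    return pairs


points = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (2.0, 2.0, 2.0), (2.1, 2.0, 2.2)]
print(sorted(pairs_partitioned(points, cutoff=0.5)))
```

Because the sub-box edge equals the cutoff, two points closer than the cutoff always lie in the same or in adjacent sub-boxes, so no pair is missed. The price is the bookkeeping needed to build and maintain the sub-box table, which is why other operations become slower.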

Creating universes

A universe describes a complete molecular system consisting of any number of chemical objects and a specification of their interactions (i.e. a force field) and surroundings: boundary conditions, external fields, thermostats, etc. The universe classes are defined in module MMTK:

Universes are created empty; the contents are then added to them. Three types of objects can be added to a universe: chemical objects (atoms, molecules, etc.), collections, and environment objects (thermostats etc.). It is also possible to remove objects from a universe.

Force fields

MMTK comes with several force fields, and permits the definition of additional force fields. Force fields are defined in module MMTK.ForceFields. The most important built-in force field is the Amber 94 force field, represented by the class MMTK.ForceFields.Amber94ForceField. It offers several strategies for electrostatic interactions, including Ewald summation, a fast multipole method [DPMTA], and cutoff with charge neutralization and optional screening [Wolf1999].

In addition to the Amber 94 force field, there is a Lennard-Jones force field for noble gases (class MMTK.ForceFields.LennardJonesForceField) and a deformation force field for protein normal mode calculations (class MMTK.ForceFields.DeformationForceField).
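
For readers unfamiliar with the Lennard-Jones model, the pair energy in its standard 12-6 form is sketched below. The function is a stand-alone illustration in reduced units, not the MMTK implementation, and it is assumed here that the MMTK force field uses this standard form.

```python
def lennard_jones_energy(r, epsilon, sigma):
    """Pair energy 4*epsilon*((sigma/r)**12 - (sigma/r)**6) at distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Reduced units: epsilon = sigma = 1.
r_min = 2.0 ** (1.0 / 6.0)  # position of the energy minimum
print(lennard_jones_energy(1.0, 1.0, 1.0))    # zero crossing at r = sigma
print(lennard_jones_energy(r_min, 1.0, 1.0))  # well depth, close to -epsilon
```

The two parameters have a direct interpretation: sigma is the distance at which the energy changes sign, and epsilon is the depth of the attractive well.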


Referring to objects and parts of objects

Most MMTK objects (in fact, all except atoms) consist of parts that are arranged in a hierarchy. For many operations it is necessary to access specific parts in this hierarchy.

In most cases, parts are attributes with a specific name. For example, the oxygen atom in every water molecule is an attribute with the name "O". Therefore if w refers to a water molecule, then w.O refers to its oxygen atom. For a more complicated example, if m refers to a molecule that has a methyl group called "M1", then m.M1.C refers to the carbon atom of that methyl group. The names of attributes are defined in the database.

Some objects consist of parts that need not have unique names, for example the elements of a collection, the residues in a peptide chain, or the chains in a protein. Such parts are accessed by indices; the objects that contain them are Python sequence types. Some examples:

Peptide and nucleotide chains also allow the operation of slicing: if p refers to a peptide chain, then p[1:-1] is a subchain extending from the second to the next-to-last residue.
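
Index and slice access follow the usual Python rules. In the sketch below a plain list of residue names stands in for a peptide chain p; this is an illustration only, since a real chain contains residue objects and slicing it yields a subchain rather than a list.

```python
# A plain list standing in for a peptide chain p (illustrative only).
p = ['GLY', 'ALA', 'VAL', 'LEU', 'SER']

first = p[0]     # first residue (indices start at zero)
last = p[-1]     # last residue
inner = p[1:-1]  # from the second to the next-to-last residue

print(first, last, inner)
```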

The structure of peptide and nucleotide chains

Since peptide and nucleotide chains are not constructed from an explicit definition file in the database, it is not evident where their hierarchical structure comes from. But it is only the top-level structure that is treated in a special way. The constituents of peptide and nucleotide chains, residues, are normal group objects. The definition files for these group objects are in the MMTK standard database and can be freely inspected and even modified or overridden by an entry in a database that is listed earlier in MMTKDATABASE.

Peptide chains are made up of amino acid residues, each of which is a group consisting of two other groups, one being called "peptide" and the other "sidechain". The first group contains the peptide group and the C-alpha and H-alpha atoms; everything else is contained in the sidechain. The C-alpha atom of the fifth residue of peptide chain p is therefore referred to as p[4].peptide.C_alpha.
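
The way an index and two attribute names combine into one path can be mimicked with a toy hierarchy. The sketch below uses types.SimpleNamespace (Python 3) as a stand-in for MMTK groups; make_residue is a hypothetical helper, the atoms are represented by strings, and every atom other than C_alpha is left out.

```python
from types import SimpleNamespace


def make_residue(name, number):
    """Toy residue made of a 'peptide' and a 'sidechain' group."""
    peptide = SimpleNamespace(C_alpha='C_alpha of %s %d' % (name, number))
    sidechain = SimpleNamespace()  # sidechain atoms omitted in this sketch
    return SimpleNamespace(name=name, peptide=peptide, sidechain=sidechain)


# A list of toy residues standing in for a peptide chain p.
names = ['GLY', 'ALA', 'VAL', 'LEU', 'SER']
p = [make_residue(name, i + 1) for i, name in enumerate(names)]

# Index 4 selects the fifth residue; the attributes walk down the hierarchy.
print(p[4].peptide.C_alpha)
```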

Nucleotide chains are made up of nucleotide residues, each of which is a group consisting of two or three other groups. One group is called "sugar" and is either a ribose or a deoxyribose group; the second one is called "base" and is one of the five standard bases. All but the first residue in a nucleotide chain also have a subgroup called "phosphate" describing the phosphate group that links neighbouring residues.


Analyzing and modifying atom properties

General operations

Many inquiry and modification operations act at the atom level and can equally well be applied to any object that is made up of atoms, i.e. atoms, molecules, collections, universes, etc. These operations are defined once in a mix-in class called MMTK.Collection.GroupOfAtoms, but are available for all objects for which they make sense. They include inquiry-type functions (total mass, center of mass, moment of inertia, bounding box, total kinetic energy etc.), coordinate modifications (translation, rotation, application of transformation objects) and coordinate comparisons (RMS difference, optimal fits).
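
As an illustration of why such operations can be written once for anything made of atoms, the sketch below computes two of the inquiry quantities from a flat list of (mass, position) pairs. The function names and the data layout are assumptions of this example, not the method names of MMTK.Collection.GroupOfAtoms.

```python
def total_mass(atoms):
    """Sum of the masses of all atoms; atoms is a list of (mass, position)."""
    return sum(mass for mass, _ in atoms)


def center_of_mass(atoms):
    """Mass-weighted average position of all atoms."""
    m = total_mass(atoms)
    return tuple(
        sum(mass * position[axis] for mass, position in atoms) / m
        for axis in range(3)
    )


def translated(atoms, displacement):
    """Copy of the atom list with every position shifted by displacement."""
    return [
        (mass, tuple(x + d for x, d in zip(position, displacement)))
        for mass, position in atoms
    ]


atoms = [(1.0, (0.0, 0.0, 0.0)), (3.0, (4.0, 0.0, 0.0))]
print(total_mass(atoms))
print(center_of_mass(atoms))
print(center_of_mass(translated(atoms, (0.0, 1.0, 0.0))))
```

Any object that can list its atoms can reuse these functions unchanged, which is the idea behind providing them through a single mix-in class.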

Coordinate transformations

The most common coordinate manipulations involve translations and rotations of specific parts of a system. It is often useful to represent such an operation by a special kind of object, which permits the combination and analysis of transformations as well as their application to atomic positions.

Transformation objects specify a general displacement consisting of a rotation around the origin of the coordinate system followed by a translation. They are defined in the module Scientific.Geometry, but for convenience the module MMTK contains a reference to them as well. Transformation objects corresponding to pure translations can be created with Translation(displacement); transformation objects describing pure rotations with Rotation(axis, angle) or Rotation(rotation_matrix). Multiplication of transformation objects returns a composite transformation.

The translational component of any transformation can be obtained by calling the method translation(); the rotational component is obtained analogously with rotation(). The displacement vector for a pure translation can be extracted with the method displacement(); a tuple of axis and angle can be extracted from a pure rotation by calling axisAndAngle().
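
The arithmetic behind such objects can be sketched in a few lines of plain Python. ToyTransformation below is a stand-alone illustration, not the Scientific.Geometry implementation; the helper names are hypothetical, and the convention that the right-hand factor of a product is applied first is an assumption of this sketch.

```python
import math

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


class ToyTransformation(object):
    """x -> R x + t: a rotation about the origin followed by a translation."""

    def __init__(self, rotation, translation):
        self.rotation = rotation        # 3x3 matrix as nested sequences
        self.translation = translation  # vector of length 3

    def __call__(self, point):
        rotated = [sum(row[k] * point[k] for k in range(3))
                   for row in self.rotation]
        return tuple(rotated[i] + self.translation[i] for i in range(3))

    def __mul__(self, other):
        # Assumed convention: (a * b)(x) == a(b(x)), i.e. b is applied first.
        rotation = tuple(
            tuple(sum(self.rotation[i][k] * other.rotation[k][j]
                      for k in range(3)) for j in range(3))
            for i in range(3)
        )
        return ToyTransformation(rotation, self(other.translation))


def toy_translation(displacement):
    return ToyTransformation(IDENTITY, tuple(displacement))


def toy_rotation_about_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    matrix = ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))
    return ToyTransformation(matrix, (0.0, 0.0, 0.0))


# Rotate by 90 degrees about z, then shift by one unit along x.
combined = toy_translation((1.0, 0.0, 0.0)) * toy_rotation_about_z(math.pi / 2)
print(combined((1.0, 0.0, 0.0)))  # approximately (1, 1, 0)
```

The product stores a single rotation matrix and a single translation vector, so a long chain of operations costs no more to apply than a single one.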

Atomic property objects

Many properties in a molecular system are defined for each individual atom: position, velocity, mass, etc. Such properties are represented in special objects, defined in module MMTK: MMTK.ParticleScalar for scalar quantities, MMTK.ParticleVector for vector quantities, and MMTK.ParticleTensor for rank-2 tensors. All these objects can be indexed with an atom object to retrieve or change the corresponding value. Standard arithmetic operations are also defined, as well as some useful methods.
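
The idea of a quantity indexed by atom objects can be mimicked with a small dictionary-backed class. ToyAtom and ToyParticleScalar are hypothetical stand-ins written for this example, the masses are rounded illustrative values, and the sketch says nothing about how MMTK.ParticleScalar stores its data.

```python
class ToyAtom(object):
    """Minimal atom stand-in; instances are compared by identity."""

    def __init__(self, name):
        self.name = name


class ToyParticleScalar(object):
    """One number per atom, read and written by indexing with the atom."""

    def __init__(self, values):
        self.values = dict(values)

    def __getitem__(self, atom):
        return self.values[atom]

    def __setitem__(self, atom, value):
        self.values[atom] = value

    def __add__(self, other):
        return ToyParticleScalar(
            (atom, value + other[atom]) for atom, value in self.values.items())

    def __mul__(self, factor):
        return ToyParticleScalar(
            (atom, value * factor) for atom, value in self.values.items())

    def total(self):
        return sum(self.values.values())


oxygen, h1, h2 = ToyAtom('O'), ToyAtom('H1'), ToyAtom('H2')
masses = ToyParticleScalar({oxygen: 16.0, h1: 1.0, h2: 1.0})
masses[h1] = 2.0           # change the value for a single atom
doubled = masses + masses  # arithmetic acts atom by atom
print(masses[oxygen], masses.total(), doubled[h1])
```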

Configurations

A configuration object, represented by the class MMTK.Configuration, is a special variant of a MMTK.ParticleVector object. In addition to the atomic coordinates of a universe, it stores geometric parameters of a universe that are subject to change, e.g. the edge lengths of the elementary cell of a periodic universe. Every universe has a current configuration, which is what all operations act on by default. It is also the configuration that is updated by minimizations, molecular dynamics, etc. The current configuration can be obtained by calling the method configuration().

There are two ways to create configuration objects: by making a copy of the current configuration (with copy(universe.configuration())), or by reading a configuration from a trajectory file.