parmed.formats package

Module contents

A package dealing with different file formats and automatic detection of those formats

class parmed.formats.CIFFile[source]

Bases: object

Standard PDBx/mmCIF file format parser and writer

Methods

download(pdb_id[, timeout, saveto])

Goes to the wwPDB website and downloads the requested PDBx/mmCIF, loading it as a Structure instance

id_format(filename)

Identifies the file type as a PDBx/mmCIF file

parse(filename[, skip_bonds])

Read a PDBx/mmCIF file and return a populated Structure instance

write(struct, dest[, renumber, coordinates, …])

Write a PDBx/mmCIF file from a Structure instance

static download(pdb_id, timeout=10, saveto=None)[source]

Goes to the wwPDB website and downloads the requested PDBx/mmCIF, loading it as a Structure instance

Parameters
pdb_id : str

The 4-character PDB ID to download from the RCSB PDB database

timeout : float, optional

The number of seconds to wait before raising a timeout error. Default is 10 seconds

saveto : str, optional

If provided, this will be treated as a file name to which the CIF file will be saved. If None (default), no CIF file will be written. The saved file is a verbatim copy of the downloaded CIF file, unlike the somewhat-stripped version you would get by using Structure.write_cif

Returns
struct : Structure

Structure instance populated by the requested PDBx/mmCIF

Raises
socket.timeout

If the connection times out while trying to contact the FTP server

IOError

If there is a problem retrieving the requested PDBx/mmCIF file

ImportError

If the gzip module is not available

TypeError

If pdb_id is not a 4-character string

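
The TypeError condition above can be pictured with a small stand-alone check. This is only a sketch: validate_pdb_id is a hypothetical helper, not part of ParmEd, and upper-casing the ID is an assumption made for illustration.

```python
def validate_pdb_id(pdb_id):
    """Hypothetical helper mirroring the documented rule:
    anything other than a 4-character string is a TypeError."""
    if not isinstance(pdb_id, str) or len(pdb_id) != 4:
        raise TypeError("pdb_id must be a 4-character string")
    # Assumption for this sketch only: normalize the ID to upper case.
    return pdb_id.upper()
```

Running a check like this before any network access lets a malformed ID fail immediately instead of after the timeout expires.
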
static id_format(filename)[source]

Identifies the file type as a PDBx/mmCIF file

Parameters
filename : str

Name of the file to check format for

Returns
is_fmt : bool

True if it is a PDBx/mmCIF file

static parse(filename, skip_bonds=False)[source]

Read a PDBx/mmCIF file and return a populated Structure instance

Parameters
filename : str or file-like

Name of the PDBx/mmCIF file to read, or a file-like object that can iterate over the lines of a PDBx/mmCIF file. Compressed file names can be specified and are determined by file-name extension (e.g., file.cif.gz, file.cif.bz2)

skip_bonds : bool, optional

If True, skip trying to assign bonds. This can save substantial time when parsing large files with non-standard residue names. However, no bonds are assigned. This is OK if, for instance, the CIF file is being parsed simply for its coordinates. Default is False.

Returns
structure1 [, structure2 [, structure3 [, …] ] ]
structure# : Structure

The Structure object initialized with all of the information from the PDBx/mmCIF file. No bonds or other topological features are added by default. If multiple structures are defined in the CIF file, multiple Structure instances will be returned as a tuple.

Raises
ValueError

If the file severely violates the PDB format specification. If this occurs, check the formatting on each line and make sure it matches the others.

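
Choosing a decompressor from the file-name extension, as described for the filename parameter, can be sketched with the standard library alone. open_by_extension is a hypothetical name and this is not ParmEd's own file-opening code; it only illustrates the .gz/.bz2 rule.

```python
import bz2
import gzip


def open_by_extension(filename, mode="rt"):
    """Open a text file, picking the opener from the file-name extension."""
    if filename.endswith(".gz"):
        return gzip.open(filename, mode)   # gzip-compressed
    if filename.endswith(".bz2"):
        return bz2.open(filename, mode)    # bzip2-compressed
    return open(filename, mode)            # plain text
```

The returned handle can be iterated line by line, which is the only thing the "file-like" form of the argument is documented to require.
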
static write(struct, dest, renumber=True, coordinates=None, altlocs='all', write_anisou=False, standard_resnames=False)[source]

Write a PDBx/mmCIF file from a Structure instance

Parameters
struct : Structure

The structure from which to write the PDBx/mmCIF file

dest : str or file-like

Either a file name or a file-like object containing a write method to which to write the PDBx/mmCIF file. If it is a filename that ends with .gz or .bz2, a compressed version will be written using either gzip or bzip2, respectively.

renumber : bool

If True, renumber the atoms and residues sequentially as they are stored in the structure. If False, use the original numbering if it was assigned previously

coordinates : array-like of float

If provided, these coordinates will be written to the PDBx/mmCIF file instead of the coordinates stored in the structure. These coordinates should line up with the atom order in the structure (not necessarily the order of the “original” file if they differ)

altlocs : str

Keyword controlling which alternate locations are written to the resulting PDBx/mmCIF file. Allowable options are:

  • ‘all’ : (default) print all alternate locations

  • ‘first’ : print only the first alternate location

  • ‘occupancy’ : print the one with the largest occupancy. If two conformers have the same occupancy, the first one to occur is printed

Input is case-insensitive, and partial strings are permitted as long as it is a substring of one of the above options that uniquely identifies the choice.

write_anisou : bool

If True, an ANISOU record is written for every atom that has one. If False, ANISOU records are not written

standard_resnames : bool, optional

If True, common aliases for various amino and nucleic acid residues will be converted into the PDB-standard values. Default is False

Notes

If multiple coordinate frames are present, these will be written as separate models (but only the unit cell from the first model will be written, as the PDBx standard dictates that only one set of unit cells shall be present).
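
The altlocs rule (case-insensitive, any substring that uniquely identifies one option) and the tie-breaking rule for ‘occupancy’ can be sketched as follows. This is a literal reading of the description above, not ParmEd's code; the actual matching may differ in detail (for example, it may be prefix-based). The function names and the (label, occupancy) tuples are hypothetical.

```python
ALTLOC_OPTIONS = ("all", "first", "occupancy")


def resolve_altlocs(keyword):
    """Map a case-insensitive partial keyword onto exactly one option."""
    key = keyword.lower()
    matches = [opt for opt in ALTLOC_OPTIONS if key and key in opt]
    if len(matches) != 1:
        raise ValueError("altlocs %r does not identify a unique option" % keyword)
    return matches[0]


def select_conformers(conformers, altlocs="all"):
    """conformers: non-empty list of (label, occupancy) for one atom, in file order."""
    choice = resolve_altlocs(altlocs)
    if choice == "all":
        return list(conformers)
    if choice == "first":
        return conformers[:1]
    # max() returns the first maximal item, so ties go to the earliest conformer.
    return [max(conformers, key=lambda conf: conf[1])]
```

Under this reading, "occ" and "FIRST" are accepted, while "a" is rejected because it occurs in both ‘all’ and ‘occupancy’.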

class parmed.formats.Mol2File[source]

Bases: object

Class to read and write TRIPOS Mol2 files

Methods

id_format(filename)

Identifies whether the file is in Mol2 (or Mol3) format

parse(filename[, structure])

Parses a mol2 (or mol3) file

write(struct, dest[, mol3, split, …])

Writes a mol2 file from a structure or residue template

BOND_ORDER_MAP = {'am': 1.25, 'ar': 1.5}
REVERSE_BOND_ORDER_MAP = {1.25: 'am', 1.5: 'ar'}
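
The two class attributes are inverses of each other. A minimal sketch of how such tables could translate between mol2 bond-type tokens and numeric bond orders is shown below; the two helper functions and the numeric fallback are assumptions for illustration, not ParmEd's implementation.

```python
BOND_ORDER_MAP = {"am": 1.25, "ar": 1.5}
# Inverting the forward table reproduces REVERSE_BOND_ORDER_MAP.
REVERSE_BOND_ORDER_MAP = {order: name for name, order in BOND_ORDER_MAP.items()}


def bond_order_from_token(token):
    """Hypothetical: named types use the table, anything else is read as a number."""
    try:
        return BOND_ORDER_MAP[token]
    except KeyError:
        return float(token)  # raises ValueError for tokens this sketch does not cover


def token_from_bond_order(order):
    """Hypothetical inverse: fractional orders get their name, others an integer."""
    try:
        return REVERSE_BOND_ORDER_MAP[order]
    except KeyError:
        return str(int(order))
```
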
static id_format(filename)[source]

Identifies whether the file is in Mol2 (or Mol3) format

Parameters
filename : str

Name of the file to test whether or not it is a mol2 file

Returns
is_fmt : bool

True if it is a mol2 (or mol3) file, False otherwise

static parse(filename, structure=False)[source]

Parses a mol2 (or mol3) file

Parameters
filename : str or file-like

Name of the file to parse or file-like object to parse from

structure : bool, optional

If True, the return value is a Structure instance. If False, it is either a ResidueTemplate or ResidueTemplateContainer instance, depending on whether there is one or more than one residue defined in it. Default is False

Returns
molecule : Structure, ResidueTemplate, or ResidueTemplateContainer

The molecule defined by this mol2 file

Raises
Mol2Error

If the file format is not recognized or non-numeric values are present where integers or floating point numbers are expected. Mol2Error is also raised if you try to parse a mol2 file that has multiple @<TRIPOS>MOLECULE entries with structure=True.
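
To see ahead of time whether a file would trip the multiple-molecule restriction, the molecule records can be counted directly. In Tripos mol2 files each molecule starts with the record type indicator @<TRIPOS>MOLECULE. The helpers below are a hypothetical sketch, and ValueError stands in for Mol2Error so the example needs no ParmEd import.

```python
MOLECULE_TAG = "@<TRIPOS>MOLECULE"


def count_molecules(lines):
    """Count molecule records in an iterable of mol2 lines."""
    return sum(1 for line in lines if line.strip() == MOLECULE_TAG)


def check_structure_request(lines, structure=False):
    """Reject structure=True when more than one molecule is present."""
    n_molecules = count_molecules(lines)
    if structure and n_molecules > 1:
        # Stand-in for the Mol2Error documented above.
        raise ValueError("%d molecule records found; structure=True needs one"
                         % n_molecules)
    return n_molecules
```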

static write(struct, dest, mol3=False, split=False, compress_whitespace=False)[source]

Writes a mol2 file from a structure or residue template

Parameters
struct : Structure, ResidueTemplate, or ResidueTemplateContainer

The input structure to write the mol2 file from

dest : str or file-like

Name of the file to write or open file handle to write to

mol3 : bool, optional

If True and struct is a ResidueTemplate or container, write HEAD/TAIL sections. Default is False

split : bool, optional

If True and struct is a ResidueTemplateContainer or a Structure with multiple residues, each residue is printed in a separate @<TRIPOS>MOLECULE section; the sections appear sequentially in the output file. Default is False

compress_whitespace : bool, optional

If True, separate fields on one line with a single space instead of aligning them with whitespace. This is useful for parsers that truncate lines at 80 characters (e.g., some versions of OpenEye). However, it will not look as “neat” upon visual inspection in a text editor. Default is False.
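
The trade-off behind compress_whitespace can be illustrated with a toy formatter. The function name, the field values, and the column width of 10 are all hypothetical; the real writer's column layout is not documented here.

```python
def format_fields(fields, compress_whitespace=False, width=10):
    """Join fields either with single spaces or right-aligned in fixed columns."""
    text = [str(field) for field in fields]
    if compress_whitespace:
        return " ".join(text)                      # shortest possible line
    return "".join(item.rjust(width) for item in text)  # aligned columns


aligned = format_fields([1, "CA", "1.2500"])
compact = format_fields([1, "CA", "1.2500"], compress_whitespace=True)
```

Both lines split into the same tokens, so a whitespace-delimited parser reads them identically; only the line length differs.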

class parmed.formats.PDBFile(fileobj)[source]

Bases: object

Standard PDB file format parser and writer

Methods

AtomLookupKey(name, number, residue_name, …)

Named tuple of the fields that identify an atom

download(pdb_id[, timeout, saveto])

Goes to the wwPDB website and downloads the requested PDB, loading it as a Structure instance

id_format(filename)

Identifies the file type as a PDB file

parse(filename[, skip_bonds])

Read a PDB file and return a populated Structure instance

write(struct, dest[, renumber, coordinates, …])

Write a PDB file from a Structure instance

class AtomLookupKey(name, number, residue_name, residue_number, chain, insertion_code, segment_id, alternate_location)

Bases: tuple

Attributes
alternate_location

Alias for field number 7

chain

Alias for field number 4

insertion_code

Alias for field number 5

name

Alias for field number 0

number

Alias for field number 1

residue_name

Alias for field number 2

residue_number

Alias for field number 3

segment_id

Alias for field number 6

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property alternate_location

Alias for field number 7

property chain

Alias for field number 4

property insertion_code

Alias for field number 5

property name

Alias for field number 0

property number

Alias for field number 1

property residue_name

Alias for field number 2

property residue_number

Alias for field number 3

property segment_id

Alias for field number 6
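
The field numbers listed above follow directly from the order of the constructor arguments. A stand-alone equivalent built with collections.namedtuple shows the correspondence; the sample values are hypothetical, and this re-creation is for illustration rather than a copy of ParmEd's definition.

```python
from collections import namedtuple

AtomLookupKey = namedtuple(
    "AtomLookupKey",
    ["name", "number", "residue_name", "residue_number",
     "chain", "insertion_code", "segment_id", "alternate_location"],
)

# Hypothetical sample values for one atom.
key = AtomLookupKey("CA", 2, "ALA", 1, "A", "", "", "")

# Named access and positional access refer to the same slots.
same_chain = key.chain == key[4]
# Tuples are hashable, so a key can index a dictionary of atoms.
lookup = {key: "atom object goes here"}
```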

static download(pdb_id, timeout=10, saveto=None)[source]

Goes to the wwPDB website and downloads the requested PDB, loading it as a Structure instance

Parameters
pdb_id : str

The 4-character PDB ID to download from the RCSB PDB database

timeout : float, optional

The number of seconds to wait before raising a timeout error. Default is 10 seconds

saveto : str, optional

If provided, this will be treated as a file name to which the PDB file will be saved. If None (default), no PDB file will be written. This will be a verbatim copy of the downloaded PDB file, unlike the somewhat-stripped version you would get by using Structure.write_pdb

Returns
struct : Structure

Structure instance populated by the requested PDB

Raises
socket.timeout

If the connection times out while trying to contact the FTP server

IOError

If there is a problem retrieving the requested PDB or writing a requested saveto file

ImportError

If the gzip module is not available

TypeError

If pdb_id is not a 4-character string

static id_format(filename)[source]

Identifies the file type as a PDB file

Parameters
filename : str or file object

Name of the file to check format for

Returns
is_fmt : bool

True if it is a PDB file

classmethod parse(filename, skip_bonds=False)[source]

Read a PDB file and return a populated Structure instance

Parameters
filename : str or file-like

Name of the PDB file to read, or a file-like object that can iterate over the lines of a PDB. Compressed file names can be specified and are determined by file-name extension (e.g., file.pdb.gz, file.pdb.bz2)

skip_bonds : bool, optional

If True, skip trying to assign bonds. This can save substantial time when parsing large files with non-standard residue names. However, no bonds are assigned. This is OK if, for instance, the PDB file is being parsed simply for its coordinates. This may also reduce element assignment if element information is not present in the PDB file already. Default is False.

Returns
structure : Structure

The Structure object initialized with all of the information from the PDB file. No bonds or other topological features are added by default.

static write(struct, dest, renumber=True, coordinates=None, altlocs='all', write_anisou=False, charmm=False, use_hetatoms=True, standard_resnames=False, increase_tercount=True, write_links=False)[source]

Write a PDB file from a Structure instance

Parameters
struct : Structure

The structure from which to write the PDB file

dest : str or file-like

Either a file name or a file-like object containing a write method to which to write the PDB file. If it is a filename that ends with .gz or .bz2, a compressed version will be written using either gzip or bzip2, respectively.

renumber : bool, optional, default True

If True, renumber the atoms and residues sequentially as they are stored in the structure. If False, use the original numbering if it was assigned previously.

coordinates : array-like of float, optional

If provided, these coordinates will be written to the PDB file instead of the coordinates stored in the structure. These coordinates should line up with the atom order in the structure (not necessarily the order of the “original” PDB file if they differ)

altlocs : str, optional, default ‘all’

Keyword controlling which alternate locations are printed to the resulting PDB file. Allowable options are:

  • ‘all’ : print all alternate locations

  • ‘first’ : print only the first alternate location

  • ‘occupancy’ : print the one with the largest occupancy. If two conformers have the same occupancy, the first one to occur is printed

Input is case-insensitive, and partial strings are permitted as long as it is a substring of one of the above options that uniquely identifies the choice.

write_anisou : bool, optional, default False

If True, an ANISOU record is written for every atom that has one. If False, ANISOU records are not written.

charmm : bool, optional, default False

If True, SEGID will be written in columns 73 to 76 of the PDB file in the typical CHARMM-style PDB output. This will be omitted for any atom that does not contain a SEGID identifier.

use_hetatoms : bool, optional, default True

If True, certain atoms will have the HETATM tag instead of ATOM, as per the PDB standard.

standard_resnames : bool, optional, default False

If True, common aliases for various amino and nucleic acid residues will be converted into the PDB-standard values.

increase_tercount : bool, optional, default True

If True, the atom number field of each TER record is increased by one relative to the atom record preceding it; this conforms to the PDB standard.

write_links : bool, optional, default False

If True, any LINK records stored in the Structure will be written to the LINK records near the top of the PDB file. If this is True, then renumber must be False or a ValueError will be raised.

Notes

If multiple coordinate frames are present, these will be written as separate models (but only the unit cell from the first model will be written, as the PDB standard dictates that only one set of unit cells shall be present).
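
The column placement described for the charmm option can be sketched with plain string slicing. PDB columns are 1-based, so columns 73 to 76 correspond to the Python slice [72:76]. with_segid is a hypothetical helper, not ParmEd's writer.

```python
def with_segid(record, segid):
    """Return the PDB record with segid placed in columns 73-76 (1-based)."""
    if not segid:
        return record                     # atoms without a SEGID are left alone
    head = record[:72].ljust(72)          # columns 1-72, padded if the line is short
    tail = record[76:]                    # columns 77 onward (element, charge)
    return head + segid[:4].ljust(4) + tail
```

Anything already present in columns 77 onward, such as the element symbol, keeps its position.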

class parmed.formats.PQRFile[source]

Bases: object

Standard PQR file format parser and writer

Methods

id_format(filename)

Identifies the file type as a PQR file

parse(filename[, skip_bonds])

Read a PQR file and return a populated Structure class

write(struct, dest[, renumber, coordinates, …])

Write a PQR file from a Structure instance

static id_format(filename)[source]

Identifies the file type as a PQR file

Parameters
filename : str

Name of the file to check format for

Returns
is_fmt : bool

True if it is a PQR file

static parse(filename, skip_bonds=True)[source]

Read a PQR file and return a populated Structure class

Parameters
filename : str or file-like

Name of the PQR file to read, or a file-like object that can iterate over the lines of a PQR. Compressed file names can be specified and are determined by file-name extension (e.g., file.pqr.gz, file.pqr.bz2)

skip_bonds : bool, optional

If True, skip trying to assign bonds. This can save substantial time when parsing large files with non-standard residue names. However, no bonds are assigned. This is OK if, for instance, the PQR file is being parsed simply for its coordinates. Default is True.

Returns
structure : Structure

The Structure object initialized with all of the information from the PQR file. No bonds or other topological features are added by default.

static write(struct, dest, renumber=True, coordinates=None, standard_resnames=False)[source]

Write a PQR file from a Structure instance

Parameters
struct : Structure

The structure from which to write the PQR file

dest : str or file-like

Either a file name or a file-like object containing a write method to which to write the PQR file. If it is a filename that ends with .gz or .bz2, a compressed version will be written using either gzip or bzip2, respectively.

renumber : bool, optional

If True, renumber the atoms and residues sequentially as they are stored in the structure. If False, use the original numbering if it was assigned previously. Default is True

coordinates : array-like of float, optional

If provided, these coordinates will be written to the PQR file instead of the coordinates stored in the structure. These coordinates should line up with the atom order in the structure (not necessarily the order of the “original” file if they differ)

standard_resnames : bool, optional

If True, common aliases for various amino and nucleic acid residues will be converted into the PDB-standard values. Default is False

class parmed.formats.PSFFile[source]

Bases: object

CHARMM- or XPLOR-style PSF file parser and writer. This class is specifically a holder for the writing functionality and a vessel for automatic file type detection. If you wish to instantiate a PSF file directly, use parmed.charmm.CharmmPsfFile or the parmed.formats.load_file() function instead.

Methods

id_format(filename)

Identifies the file type as a CHARMM PSF file

parse(filename)

Read a CHARMM- or XPLOR-style PSF file

write(struct, dest[, vmd])

Writes a PSF file from a Structure instance

static id_format(filename)[source]

Identifies the file type as a CHARMM PSF file

Parameters
filename : str

Name of the file to check format for

Returns
is_fmt : bool

True if it is a CHARMM or Xplor-style PSF file

static parse(filename)[source]

Read a CHARMM- or XPLOR-style PSF file

Parameters
filename : str

Name of the file to parse

Returns
psf_file : CharmmPsfFile

The PSF file instance with all information loaded

static write(struct, dest, vmd=False)[source]

Writes a PSF file from a Structure instance

Parameters
struct : Structure

The Structure instance from which the PSF should be written

dest : str or file-like

The place to write the output PSF file. If it has a “write” attribute, it will be used to print the PSF file. Otherwise, it will be treated like a string and a file will be opened, printed, then closed

vmd : bool

If True, write the PSF in the format that VMD prints it in (i.e., no NUMLP/NUMLPH or MOLNT sections). Default is False

Examples

>>> from parmed.charmm import CharmmPsfFile
>>> cs = CharmmPsfFile('testfiles/test.psf')
>>> cs.write_psf('testfiles/test2.psf')

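
The handling of dest described above (use its write attribute if it has one, otherwise treat it as a file name) is ordinary duck typing. A minimal sketch, with write_text as a hypothetical helper rather than ParmEd's code:

```python
def write_text(dest, text):
    """Write text to dest, which is either a writable object or a file name."""
    if hasattr(dest, "write"):
        dest.write(text)        # the caller owns the handle, so leave it open
        return
    with open(dest, "w") as handle:   # a file name: open, write, then close
        handle.write(text)
```

Because only the write attribute is checked, in-memory buffers such as io.StringIO work as destinations as well as open files.
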
class parmed.formats.SDFFile[source]

Bases: object

Class to read SDF files

Methods

id_format(filename)

Identifies whether the file is in SDF format

parse

static id_format(filename)[source]

Identifies whether the file is in SDF format

Parameters
filename : str

Name of the file to test whether or not it is an SDF file

Returns
is_fmt : bool

True if it is an SDF file, False otherwise

static parse(filename, structure=False)[source]

parmed.formats.load_file(filename, *args, **kwargs)[source]

Identifies the file format of the specified file and returns its parsed contents.

Parameters
filename : str

The name of the file to try to parse. If the filename starts with http:// or https:// or ftp://, it is treated like a URL and the file will be loaded directly from its remote location on the web

structure : object, optional

For some classes, such as the Mol2 file class, the default return object is not a Structure, but can be made to return a Structure if the structure=True keyword argument is passed. To facilitate writing easy code, the structure keyword is always processed and only passed on to the correct file parser if that parser accepts the structure keyword. There is no default, as each parser has its own default.

natom : int, optional

This is needed for some coordinate file classes, but not others. This is treated the same as structure, above. It is the number of atoms expected

hasbox : bool, optional

Same as structure, but indicates whether the coordinate file has unit cell dimensions

skip_bonds : bool, optional

Same as structure, but indicates whether or not bond searching will be skipped if the topology file format does not contain bond information (like PDB, GRO, and PQR files).

*args : other positional arguments

Some formats accept positional arguments. These will be passed along

**kwargs : other options

Some formats can only be instantiated with other options besides just a file name.

Returns
object

The returned object is the result of the parsing function of the class associated with the file format being parsed

Raises
IOError

If filename does not exist

parmed.exceptions.FormatNotFound

If no suitable file format can be identified

TypeError

If the identified format requires additional arguments that are not provided as keyword arguments in addition to the file name
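
The behaviour described above (try each format's identifier, call the matching parser, and forward structure, natom, hasbox, and skip_bonds only to parsers that accept them) can be sketched generically. Everything here is hypothetical: the registry layout, the function names, and FormatNotFoundSketch, which merely stands in for parmed.exceptions.FormatNotFound. It is not ParmEd's implementation.

```python
import inspect

# Keywords that are silently dropped when the chosen parser does not take them.
OPTIONAL_KEYS = ("structure", "natom", "hasbox", "skip_bonds")


class FormatNotFoundSketch(Exception):
    """Stand-in for parmed.exceptions.FormatNotFound."""


def filter_optional(parse, kwargs):
    """Keep every keyword except optional ones the parser does not accept."""
    params = inspect.signature(parse).parameters
    return {key: value for key, value in kwargs.items()
            if key not in OPTIONAL_KEYS or key in params}


def load_with_registry(filename, registry, *args, **kwargs):
    """registry: iterable of (identify, parse) pairs; the first match wins."""
    for identify, parse in registry:
        if identify(filename):
            return parse(filename, *args, **filter_optional(parse, kwargs))
    raise FormatNotFoundSketch("no known format matches %r" % filename)
```

With this arrangement a caller can always pass structure=True without knowing which format the file turns out to be, which is the convenience the structure parameter describes.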

Notes

Compressed files are supported and detected by filename extension. This applies both to local and remote files. The following extensions are supported:

  • .gz : gzip compressed file

  • .bz2 : bzip2 compressed file

SDF files are loaded via the rdkit package.
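
The two filename-based decisions described for load_file (remote versus local, and which compression applies) depend only on the start and end of the name. A stand-alone sketch, in which describe_source and the example.org URL are hypothetical:

```python
URL_PREFIXES = ("http://", "https://", "ftp://")
COMPRESSION_BY_EXTENSION = {".gz": "gzip", ".bz2": "bzip2"}


def describe_source(filename):
    """Return (is_remote, compression) judged from the file name alone."""
    is_remote = filename.startswith(URL_PREFIXES)   # str.startswith accepts a tuple
    compression = None
    for extension, name in COMPRESSION_BY_EXTENSION.items():
        if filename.endswith(extension):
            compression = name
    return is_remote, compression


remote_gz = describe_source("https://example.org/4lzt.pdb.gz")
local_plain = describe_source("trx.prmtop")
```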

Examples

Load a Mol2 file

>>> from parmed.formats import load_file
>>> load_file('tripos1.mol2')
<ResidueTemplate DAN: 31 atoms; 33 bonds; head=None; tail=None>

Load a Mol2 file as a Structure

>>> load_file('tripos1.mol2', structure=True)
<Structure 31 atoms; 1 residues; 33 bonds; NOT parametrized>

Load an Amber topology file

>>> load_file('trx.prmtop', xyz='trx.inpcrd')
<AmberParm 1654 atoms; 108 residues; 1670 bonds; parametrized>

Load a CHARMM PSF file

>>> load_file('ala_ala_ala.psf')
<CharmmPsfFile 33 atoms; 3 residues; 32 bonds; NOT parametrized>

Load a PDB file and a CIF file

>>> load_file('4lzt.pdb')
<Structure 1164 atoms; 274 residues; 0 bonds; PBC (triclinic); NOT parametrized>
>>> load_file('4LZT.cif')
<Structure 1164 atoms; 274 residues; 0 bonds; PBC (triclinic); NOT parametrized>

Load a Gromacs topology file – only works with Gromacs installed

>>> load_file('1aki.ff99sbildn.top')
<GromacsTopologyFile 40560 atoms [9650 EPs]; 9779 residues; 30934 bonds; parametrized>

Load an SDF file – only works with rdkit installed

>>> load_file('mol.sdf', structure=True)
<Structure 34 atoms; 1 residues; 66 bonds; NOT parametrized>