parmed.formats package
Subpackages
Submodules
Module contents
A package dealing with different file formats and automatic detection of those formats
-
class parmed.formats.CIFFile
Bases: object
Standard PDBx/mmCIF file format parser and writer
Methods
download(pdb_id[, timeout, saveto]) : Goes to the wwPDB website and downloads the requested PDBx/mmCIF, loading it as a Structure instance
id_format(filename) : Identifies the file type as a PDBx/mmCIF file
parse(filename[, skip_bonds]) : Read a PDBx or mmCIF file and return a populated Structure class
write(struct, dest[, renumber, coordinates, …]) : Write a PDBx/mmCIF file from the current Structure instance
-
static download(pdb_id, timeout=10, saveto=None)
Goes to the wwPDB website and downloads the requested PDBx/mmCIF, loading it as a Structure instance
- Parameters
- pdb_id : str
The 4-character PDB ID to try and download from the RCSB PDB database
- timeout : float, optional
The number of seconds to wait before raising a timeout error. Default is 10 seconds
- saveto : str, optional
If provided, this will be treated as a file name to which the CIF file will be saved. If None (default), no CIF file will be written. This will be a verbatim copy of the downloaded CIF file, unlike the somewhat-stripped version you would get by using Structure.write_cif
- Returns
- struct : Structure
Structure instance populated by the requested PDBx/mmCIF
- Raises
- socket.timeout if the connection times out while trying to contact the FTP server
- IOError if there is a problem retrieving the requested PDB
- ImportError if the gzip module is not available
- TypeError if pdb_id is not a 4-character string
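The TypeError condition above can be pictured with a small check. This is a minimal sketch written from the description alone; check_pdb_id is a hypothetical helper, not a ParmEd function, and the library may validate IDs differently.

```python
def check_pdb_id(pdb_id):
    """Return the ID unchanged if it is a 4-character string, else raise TypeError."""
    # Hypothetical helper: mirrors the documented rule, not ParmEd's own code
    if not isinstance(pdb_id, str) or len(pdb_id) != 4:
        raise TypeError("pdb_id must be a 4-character string")
    return pdb_id


print(check_pdb_id("4lzt"))
```

An ID such as "lysozyme" or the integer 1234 would be rejected by this check before any network request is attempted.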
-
static id_format(filename)
Identifies the file type as a PDBx/mmCIF file
- Parameters
- filename : str
Name of the file to check format for
- Returns
- is_fmt : bool
True if it is a PDBx/mmCIF file
-
static parse(filename, skip_bonds=False)
Read a PDBx or mmCIF file and return a populated Structure class
- Parameters
- filename : str or file-like
Name of the PDBx/mmCIF file to read, or a file-like object that can iterate over the lines of the file. Compressed file names can be specified and are determined by file-name extension (e.g., file.cif.gz, file.cif.bz2)
- skip_bonds : bool, optional
If True, skip trying to assign bonds. This can save substantial time when parsing large files with non-standard residue names. However, no bonds are assigned. This is OK if, for instance, the CIF file is being parsed simply for its coordinates. Default is False.
- Returns
- structure1 [, structure2 [, structure3 [, …] ] ]
- structure# : Structure
The Structure object initialized with all of the information from the PDBx/mmCIF file. No bonds or other topological features are added by default. If multiple structures are defined in the CIF file, multiple Structure instances will be returned as a tuple.
- Raises
- ValueError if the file severely violates the PDBx/mmCIF format specification. If this occurs, check the formatting on each line and make sure it matches the others.
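Choosing a decompressor from the file-name extension, as described for the filename parameter, can be sketched with the standard library alone. The helper names open_by_extension and first_line are hypothetical and this is not ParmEd's implementation; it only illustrates the extension-based rule.

```python
import bz2
import gzip
import os
import tempfile


def open_by_extension(filename):
    """Open a text file for reading, choosing a decompressor from its suffix."""
    if filename.endswith(".gz"):
        return gzip.open(filename, "rt")
    if filename.endswith(".bz2"):
        return bz2.open(filename, "rt")
    return open(filename, "r")


def first_line(filename):
    """Read the first line of a possibly compressed text file."""
    with open_by_extension(filename) as handle:
        return handle.readline().rstrip("\n")


# Demonstration on a throwaway gzip-compressed file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.cif.gz")
    with gzip.open(path, "wt") as out:
        out.write("data_EXAMPLE\n")
    header = first_line(path)

print(header)
```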
-
static write(struct, dest, renumber=True, coordinates=None, altlocs='all', write_anisou=False, standard_resnames=False)
Write a PDBx/mmCIF file from the current Structure instance
- Parameters
- struct : Structure
The structure from which to write the PDBx/mmCIF file
- dest : str or file-like
Either a file name or a file-like object containing a write method to which to write the PDBx/mmCIF file. If it is a filename that ends with .gz or .bz2, a compressed version will be written using either gzip or bzip2, respectively.
- renumber : bool
If True, renumber the atoms and residues sequentially as they are stored in the structure. If False, use the original numbering if it was assigned previously
- coordinates : array-like of float
If provided, these coordinates will be written to the PDBx/mmCIF file instead of the coordinates stored in the structure. These coordinates should line up with the atom order in the structure (not necessarily the order of the “original” file if they differ)
- altlocs : str
Keyword controlling which alternate locations are printed to the resulting PDBx/mmCIF file. Allowable options are:
‘all’ : (default) print all alternate locations
‘first’ : print only the first alternate locations
‘occupancy’ : print the one with the largest occupancy. If two conformers have the same occupancy, the first one to occur is printed
Input is case-insensitive, and partial strings are permitted as long as it is a substring of one of the above options that uniquely identifies the choice.
- write_anisou : bool
If True, an ANISOU record is written for every atom that has one. If False, ANISOU records are not written
- standard_resnames : bool, optional
If True, common aliases for various amino and nucleic acid residues will be converted into the PDB-standard values. Default is False
Notes
If multiple coordinate frames are present, these will be written as separate models (but only the unit cell from the first model will be written, as the PDBx standard dictates that only one set of unit cells shall be present).
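The altlocs keyword matching described above (case-insensitive, partial strings allowed when they identify one option uniquely) might look like the following. This sketch reads the description literally and matches on substrings; resolve_altlocs is a hypothetical name, and the library itself may match prefixes or report errors differently.

```python
ALTLOC_OPTIONS = ("all", "first", "occupancy")


def resolve_altlocs(keyword):
    """Map a case-insensitive partial keyword onto exactly one option."""
    needle = keyword.lower()
    if not needle:
        raise ValueError("empty altlocs keyword")
    # Assumption: any substring counts, as the parameter description states
    matches = [option for option in ALTLOC_OPTIONS if needle in option]
    if len(matches) != 1:
        raise ValueError(f"{keyword!r} does not uniquely identify an option")
    return matches[0]


print(resolve_altlocs("OCC"))
print(resolve_altlocs("fir"))
```

Under this reading, "a" is rejected because it occurs in both "all" and "occupancy".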
-
class parmed.formats.Mol2File
Bases: object
Class to read and write TRIPOS Mol2 files
Methods
id_format(filename) : Identify the file as a Mol2 (or Mol3) file format or not
parse(filename[, structure]) : Parses a mol2 (or mol3) file
write(struct, dest[, mol3, split, …]) : Writes a mol2 file from a structure or residue template
-
BOND_ORDER_MAP = {'am': 1.25, 'ar': 1.5}
-
REVERSE_BOND_ORDER_MAP = {1.25: 'am', 1.5: 'ar'}
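The two class attributes are inverses of each other. The sketch below copies their values from this page and shows one way such maps could be used to convert between mol2 bond-type tokens and numeric bond orders. The helper functions and the fallback to plain numeric tokens are assumptions for illustration, not ParmEd code.

```python
# Values copied from the class attributes documented above
BOND_ORDER_MAP = {"am": 1.25, "ar": 1.5}
REVERSE_BOND_ORDER_MAP = {order: name for name, order in BOND_ORDER_MAP.items()}


def order_from_token(token):
    """Convert a bond-type token to a numeric order (assumed convention)."""
    if token in BOND_ORDER_MAP:
        return BOND_ORDER_MAP[token]
    # Assumption: any other token handled here is a plain number such as "1" or "2"
    return float(token)


def token_from_order(order):
    """Convert a numeric order back to a bond-type token (assumed convention)."""
    if order in REVERSE_BOND_ORDER_MAP:
        return REVERSE_BOND_ORDER_MAP[order]
    return str(int(order))


print(order_from_token("ar"), token_from_order(1.25))
```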
-
static id_format(filename)
Identify the file as a Mol2 (or Mol3) file format or not
- Parameters
- filename : str
Name of the file to test whether or not it is a mol2 file
- Returns
- is_fmt : bool
True if it is a mol2 (or mol3) file, False otherwise
-
static parse(filename, structure=False)
Parses a mol2 (or mol3) file
- Parameters
- filename : str or file-like
Name of the file to parse or file-like object to parse from
- structure : bool, optional
If True, the return value is a Structure instance. If False, it is either a ResidueTemplate or ResidueTemplateContainer instance, depending on whether there is one or more than one residue defined in it. Default is False
- Returns
- molecule : Structure, ResidueTemplate, or ResidueTemplateContainer
The molecule defined by this mol2 file
- Raises
- Mol2Error
If the file format is not recognized or non-numeric values are present where integers or floating point numbers are expected. Also raises Mol2Error if you try to parse a mol2 file that has multiple @<MOLECULE> entries with structure=True.
-
static write(struct, dest, mol3=False, split=False, compress_whitespace=False)
Writes a mol2 file from a structure or residue template
- Parameters
- struct : Structure or ResidueTemplate or ResidueTemplateContainer
The input structure to write the mol2 file from
- dest : str or file-like obj
Name of the file to write or open file handle to write to
- mol3 : bool, optional
If True and struct is a ResidueTemplate or container, write HEAD/TAIL sections. Default is False
- split : bool, optional
If True and struct is a ResidueTemplateContainer or a Structure with multiple residues, each residue is printed in a separate @<MOLECULE> section, and the sections appear sequentially in the output file
- compress_whitespace : bool, optional
If True, separate fields on one line with a single space instead of aligning them with whitespace. This is useful for parsers that truncate lines at 80 characters (e.g., some versions of OpenEye). However, it will not look as “neat” upon visual inspection in a text editor. Default is False.
-
-
class parmed.formats.PDBFile(fileobj)
Bases: object
Standard PDB file format parser and writer
Methods
AtomLookupKey(name, number, residue_name, …)
download(pdb_id[, timeout, saveto]) : Goes to the wwPDB website and downloads the requested PDB, loading it as a Structure instance
id_format(filename) : Identifies the file type as a PDB file
parse(filename[, skip_bonds]) : Read a PDB file and return a populated Structure class
write(struct, dest[, renumber, coordinates, …]) : Write a PDB file from a Structure instance
-
class AtomLookupKey(name, number, residue_name, residue_number, chain, insertion_code, segment_id, alternate_location)
Bases: tuple
- Attributes
alternate_location : Alias for field number 7
chain : Alias for field number 4
insertion_code : Alias for field number 5
name : Alias for field number 0
number : Alias for field number 1
residue_name : Alias for field number 2
residue_number : Alias for field number 3
segment_id : Alias for field number 6
Methods
count(value, /) : Return number of occurrences of value.
index(value[, start, stop]) : Return first index of value.
-
property alternate_location
Alias for field number 7
-
property chain
Alias for field number 4
-
property insertion_code
Alias for field number 5
-
property name
Alias for field number 0
-
property number
Alias for field number 1
-
property residue_name
Alias for field number 2
-
property residue_number
Alias for field number 3
-
property segment_id
Alias for field number 6
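The "Alias for field number N" entries are the standard signature of a named tuple. The sketch below builds a stand-in with the same field names using collections.namedtuple rather than importing the real class, so the example values and the dictionary use are illustrative assumptions only.

```python
from collections import namedtuple

# Stand-in with the field order documented above (not imported from ParmEd)
AtomLookupKey = namedtuple(
    "AtomLookupKey",
    ["name", "number", "residue_name", "residue_number",
     "chain", "insertion_code", "segment_id", "alternate_location"],
)

key = AtomLookupKey("CA", 2, "ALA", 1, "A", "", "", "")

# Fields are reachable by name or by position
print(key.chain, key[4])

# Being a tuple, the key is hashable and can index a dictionary
lookup = {key: "atom record placeholder"}
print(lookup[AtomLookupKey("CA", 2, "ALA", 1, "A", "", "", "")])
```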
-
static download(pdb_id, timeout=10, saveto=None)
Goes to the wwPDB website and downloads the requested PDB, loading it as a Structure instance
- Parameters
- pdb_id : str
The 4-character PDB ID to try and download from the RCSB PDB database
- timeout : float, optional
The number of seconds to wait before raising a timeout error. Default is 10 seconds
- saveto : str, optional
If provided, this will be treated as a file name to which the PDB file will be saved. If None (default), no PDB file will be written. This will be a verbatim copy of the downloaded PDB file, unlike the somewhat-stripped version you would get by using Structure.write_pdb
- Returns
- struct : Structure
Structure instance populated by the requested PDB
- Raises
- socket.timeout if the connection times out while trying to contact the FTP server
- IOError if there is a problem retrieving the requested PDB or writing a requested saveto file
- ImportError if the gzip module is not available
- TypeError if pdb_id is not a 4-character string
-
static id_format(filename)
Identifies the file type as a PDB file
- Parameters
- filename : str or file object
Name of the file to check format for
- Returns
- is_fmt : bool
True if it is a PDB file
-
classmethod parse(filename, skip_bonds=False)
Read a PDB file and return a populated Structure class
- Parameters
- filename : str or file-like
Name of the PDB file to read, or a file-like object that can iterate over the lines of a PDB. Compressed file names can be specified and are determined by file-name extension (e.g., file.pdb.gz, file.pdb.bz2)
- skip_bonds : bool, optional
If True, skip trying to assign bonds. This can save substantial time when parsing large files with non-standard residue names. However, no bonds are assigned. This is OK if, for instance, the PDB file is being parsed simply for its coordinates. This may also reduce element assignment if element information is not present in the PDB file already. Default is False.
- Returns
- structure : Structure
The Structure object initialized with all of the information from the PDB file. No bonds or other topological features are added by default.
-
static write(struct, dest, renumber=True, coordinates=None, altlocs='all', write_anisou=False, charmm=False, use_hetatoms=True, standard_resnames=False, increase_tercount=True, write_links=False)
Write a PDB file from a Structure instance
- Parameters
- struct : Structure
The structure from which to write the PDB file
- dest : str or file-like
Either a file name or a file-like object containing a write method to which to write the PDB file. If it is a filename that ends with .gz or .bz2, a compressed version will be written using either gzip or bzip2, respectively.
- renumber : bool, optional, default True
If True, renumber the atoms and residues sequentially as they are stored in the structure. If False, use the original numbering if it was assigned previously.
- coordinates : array-like of float, optional
If provided, these coordinates will be written to the PDB file instead of the coordinates stored in the structure. These coordinates should line up with the atom order in the structure (not necessarily the order of the “original” PDB file if they differ)
- altlocs : str, optional, default ‘all’
Keyword controlling which alternate locations are printed to the resulting PDB file. Allowable options are:
‘all’ : print all alternate locations
‘first’ : print only the first alternate locations
‘occupancy’ : print the one with the largest occupancy. If two conformers have the same occupancy, the first one to occur is printed
Input is case-insensitive, and partial strings are permitted as long as it is a substring of one of the above options that uniquely identifies the choice.
- write_anisou : bool, optional, default False
If True, an ANISOU record is written for every atom that has one. If False, ANISOU records are not written.
- charmm : bool, optional, default False
If True, SEGID will be written in columns 73 to 76 of the PDB file in the typical CHARMM-style PDB output. This will be omitted for any atom that does not contain a SEGID identifier.
- use_hetatoms : bool, optional, default True
If True, certain atoms will have the HETATM tag instead of ATOM as per the PDB standard.
- standard_resnames : bool, optional, default False
If True, common aliases for various amino and nucleic acid residues will be converted into the PDB-standard values.
- increase_tercount : bool, optional, default True
If True, the TER atom number field is increased by one compared to the atom card preceding it; this conforms to the PDB standard.
- write_links : bool, optional, default False
If True, any LINK records stored in the Structure will be written to the LINK records near the top of the PDB file. If this is True, then renumber must be False or a ValueError will be raised
Notes
If multiple coordinate frames are present, these will be written as separate models (but only the unit cell from the first model will be written, as the PDB standard dictates that only one set of unit cells shall be present).
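The dest parameter accepts either a file name or an object with a write method, with .gz and .bz2 names triggering compression. A minimal standard-library sketch of that dispatch follows; open_destination is a hypothetical helper and is not how ParmEd is necessarily implemented.

```python
import bz2
import gzip
import io
from contextlib import contextmanager


@contextmanager
def open_destination(dest):
    """Yield a writable text handle for a file name or a file-like object."""
    if hasattr(dest, "write"):
        # The caller owns this handle, so it is left open afterwards
        yield dest
        return
    if dest.endswith(".gz"):
        handle = gzip.open(dest, "wt")
    elif dest.endswith(".bz2"):
        handle = bz2.open(dest, "wt")
    else:
        handle = open(dest, "w")
    try:
        yield handle
    finally:
        handle.close()


# Writing to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
with open_destination(buffer) as out:
    out.write("END\n")

print(repr(buffer.getvalue()))
```

Passing a name such as "model.pdb.gz" instead of the buffer would route the same write calls through gzip.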
-
class parmed.formats.PQRFile
Bases: object
Standard PQR file format parser and writer
Methods
id_format(filename) : Identifies the file type as a PQR file
parse(filename[, skip_bonds]) : Read a PQR file and return a populated Structure class
write(struct, dest[, renumber, coordinates, …]) : Write a PQR file from a Structure instance
-
static id_format(filename)
Identifies the file type as a PQR file
- Parameters
- filename : str
Name of the file to check format for
- Returns
- is_fmt : bool
True if it is a PQR file
-
static parse(filename, skip_bonds=True)
Read a PQR file and return a populated Structure class
- Parameters
- filename : str or file-like
Name of the PQR file to read, or a file-like object that can iterate over the lines of a PQR. Compressed file names can be specified and are determined by file-name extension (e.g., file.pqr.gz, file.pqr.bz2)
- skip_bonds : bool, optional
If True, skip trying to assign bonds. This can save substantial time when parsing large files with non-standard residue names. However, no bonds are assigned. This is OK if, for instance, the PQR file is being parsed simply for its coordinates. Default is True.
- Returns
- structure : Structure
The Structure object initialized with all of the information from the PQR file. No bonds or other topological features are added by default.
-
static write(struct, dest, renumber=True, coordinates=None, standard_resnames=False)
Write a PQR file from a Structure instance
- Parameters
- struct : Structure
The structure from which to write the PQR file
- dest : str or file-like
Either a file name or a file-like object containing a write method to which to write the PQR file. If it is a filename that ends with .gz or .bz2, a compressed version will be written using either gzip or bzip2, respectively.
- renumber : bool, optional
If True, renumber the atoms and residues sequentially as they are stored in the structure. If False, use the original numbering if it was assigned previously. Default is True
- coordinates : array-like of float, optional
If provided, these coordinates will be written to the PQR file instead of the coordinates stored in the structure. These coordinates should line up with the atom order in the structure (not necessarily the order of the “original” file if they differ)
- standard_resnames : bool, optional
If True, common aliases for various amino and nucleic acid residues will be converted into the PDB-standard values. Default is False
-
class parmed.formats.PSFFile
Bases: object
CHARMM- or XPLOR-style PSF file parser and writer. This class is specifically a holder for the writing functionality and a vessel for automatic file type detection. If you wish to instantiate a PSF file directly, use parmed.charmm.CharmmPsfFile or the parmed.formats.load_file() function instead.
Methods
id_format(filename) : Identifies the file type as a CHARMM PSF file
parse(filename) : Read a CHARMM- or XPLOR-style PSF file
write(struct, dest[, vmd]) : Writes a PSF file from the stored molecule
-
static id_format(filename)
Identifies the file type as a CHARMM PSF file
- Parameters
- filename : str
Name of the file to check format for
- Returns
- is_fmt : bool
True if it is a CHARMM or Xplor-style PSF file
-
static parse(filename)
Read a CHARMM- or XPLOR-style PSF file
- Parameters
- filename : str
Name of the file to parse
- Returns
- psf_file : CharmmPsfFile
The PSF file instance with all information loaded
-
static write(struct, dest, vmd=False)
Writes a PSF file from the stored molecule
- Parameters
- struct : Structure
The Structure instance from which the PSF should be written
- dest : str or file-like
The place to write the output PSF file. If it has a “write” attribute, it will be used to print the PSF file. Otherwise, it will be treated like a string and a file will be opened, printed, then closed
- vmd : bool
If True, it will write out a PSF in the format that VMD prints it in (i.e., no NUMLP/NUMLPH or MOLNT sections)
Examples
>>> from parmed.charmm import CharmmPsfFile
>>> cs = CharmmPsfFile('testfiles/test.psf')
>>> cs.write_psf('testfiles/test2.psf')
-
class parmed.formats.SDFFile
Bases: object
Class to read SDF files
Methods
id_format(filename) : Identify the file as an SDF file or not
parse
-
parmed.formats.load_file(filename, *args, **kwargs)
Identifies the file format of the specified file and returns its parsed contents.
- Parameters
- filename : str
The name of the file to try to parse. If the filename starts with http:// or https:// or ftp://, it is treated like a URL and the file will be loaded directly from its remote location on the web
- structure : object, optional
For some classes, such as the Mol2 file class, the default return object is not a Structure, but can be made to return a Structure if the structure=True keyword argument is passed. To facilitate writing easy code, the structure keyword is always processed and only passed on to the correct file parser if that parser accepts the structure keyword. There is no default, as each parser has its own default.
- natom : int, optional
This is needed for some coordinate file classes, but not others. This is treated the same as structure, above. It is the number of atoms expected
- hasbox : bool, optional
Same as structure, but indicates whether the coordinate file has unit cell dimensions
- skip_bonds : bool, optional
Same as structure, but indicates whether or not bond searching will be skipped if the topology file format does not contain bond information (like PDB, GRO, and PQR files).
- *args : other positional arguments
Some formats accept positional arguments. These will be passed along
- **kwargs : other options
Some formats can only be instantiated with other options besides just a file name.
- Returns
- object
The returned object is the result of the parsing function of the class associated with the file format being parsed
- Raises
- IOError
If filename does not exist
- parmed.exceptions.FormatNotFound
If no suitable file format can be identified
- TypeError
If the identified format requires additional arguments that are not provided as keyword arguments in addition to the file name
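The rule that keywords such as structure are "only passed on to the correct file parser if that parser accepts" them can be sketched with inspect.signature. The helper accepted_kwargs and both parser functions are hypothetical stand-ins, so treat this as one possible approach rather than the library's actual mechanism.

```python
import inspect


def accepted_kwargs(func, **candidates):
    """Return only those candidates that func names as parameters."""
    params = inspect.signature(func).parameters
    return {key: val for key, val in candidates.items() if key in params}


# Hypothetical parsers standing in for real format classes
def parse_mol2(filename, structure=False):
    return ("mol2", filename, structure)


def parse_psf(filename):
    return ("psf", filename)


options = {"structure": True, "natom": 31}
mol2_result = parse_mol2("a.mol2", **accepted_kwargs(parse_mol2, **options))
psf_result = parse_psf("a.psf", **accepted_kwargs(parse_psf, **options))
print(mol2_result)
print(psf_result)
```

Here the caller can always pass structure=True without knowing in advance which parser will handle the file.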
Notes
Compressed files are supported and detected by filename extension. This applies both to local and remote files. The following names are supported:
.gz : gzip compressed file
.bz2 : bzip2 compressed file
SDF files are loaded via the rdkit package.
Examples
>>> from parmed.formats import load_file
Load a Mol2 file
>>> load_file('tripos1.mol2')
<ResidueTemplate DAN: 31 atoms; 33 bonds; head=None; tail=None>
Load a Mol2 file as a Structure
>>> load_file('tripos1.mol2', structure=True)
<Structure 31 atoms; 1 residues; 33 bonds; NOT parametrized>
Load an Amber topology file
>>> load_file('trx.prmtop', xyz='trx.inpcrd')
<AmberParm 1654 atoms; 108 residues; 1670 bonds; parametrized>
Load a CHARMM PSF file
>>> load_file('ala_ala_ala.psf')
<CharmmPsfFile 33 atoms; 3 residues; 32 bonds; NOT parametrized>
Load a PDB and CIF file
>>> load_file('4lzt.pdb')
<Structure 1164 atoms; 274 residues; 0 bonds; PBC (triclinic); NOT parametrized>
>>> load_file('4LZT.cif')
<Structure 1164 atoms; 274 residues; 0 bonds; PBC (triclinic); NOT parametrized>
Load a Gromacs topology file – only works with Gromacs installed
>>> load_file('1aki.ff99sbildn.top')
<GromacsTopologyFile 40560 atoms [9650 EPs]; 9779 residues; 30934 bonds; parametrized>
Load a SDF file – only works with rdkit installed
>>> load_file('mol.sdf', structure=True)
<Structure 34 atoms; 1 residues; 66 bonds; NOT parametrized>
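Every format class on this page exposes id_format and parse, which suggests a simple dispatch loop behind automatic detection. The sketch below uses toy classes, a toy registry, and a stand-in exception, all of them hypothetical; the detection rules are deliberately simplistic and the real library may register and probe formats quite differently.

```python
import os
import tempfile


class FormatNotFound(Exception):
    """Stand-in for parmed.exceptions.FormatNotFound."""


class ToyPSF:
    @staticmethod
    def id_format(filename):
        # Toy rule: the first line starts with "PSF"
        with open(filename) as handle:
            return handle.readline().startswith("PSF")

    @staticmethod
    def parse(filename):
        return "parsed as PSF"


class ToyMol2:
    @staticmethod
    def id_format(filename):
        # Toy rule: some line starts with a TRIPOS record marker
        with open(filename) as handle:
            return any(line.startswith("@<TRIPOS>") for line in handle)

    @staticmethod
    def parse(filename):
        return "parsed as Mol2"


REGISTRY = (ToyPSF, ToyMol2)


def load_any(filename):
    """Try each registered format in turn and parse with the first match."""
    if not os.path.exists(filename):
        raise IOError(f"{filename} does not exist")
    for cls in REGISTRY:
        if cls.id_format(filename):
            return cls.parse(filename)
    raise FormatNotFound(f"could not identify the format of {filename}")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.psf")
    with open(path, "w") as out:
        out.write("PSF EXT\n")
    result = load_any(path)

print(result)
```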