parmed.formats.pdb module

This module contains classes for reading and writing both PDB and PDBx/mmCIF files.

class parmed.formats.pdb.CIFFile[source]

Bases: object

Standard PDBx/mmCIF file format parser and writer

Methods

download(pdb_id[, timeout, saveto])

Goes to the wwPDB website and downloads the requested PDBx/mmCIF, loading it as a Structure instance

id_format(filename)

Identifies the file type as a PDBx/mmCIF file

parse(filename[, skip_bonds])

Read a PDBx or mmCIF file and return a populated Structure class

write(struct, dest[, renumber, coordinates, …])

Write a PDBx/mmCIF file from the current Structure instance

static download(pdb_id, timeout=10, saveto=None)[source]

Goes to the wwPDB website and downloads the requested PDBx/mmCIF, loading it as a Structure instance

Parameters
pdb_id : str

The 4-character PDB ID to try and download from the RCSB PDB database

timeout : float, optional

The number of seconds to wait before raising a timeout error. Default is 10 seconds

saveto : str, optional

If provided, this will be treated as a file name to which the CIF file will be saved. If None (default), no CIF file will be written. The saved file is a verbatim copy of the downloaded CIF file, unlike the somewhat-stripped version you would get by using Structure.write_cif

Returns
struct : Structure

Structure instance populated by the requested PDBx/mmCIF

Raises
socket.timeout if the connection times out while trying to contact the FTP server
IOError if there is a problem retrieving the requested PDBx/mmCIF file
ImportError if the gzip module is not available
TypeError if pdb_id is not a 4-character string

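
The exceptions listed above can be handled with a small wrapper. The following is only a sketch: try_download and the stand-in downloader functions are hypothetical names used for illustration, not part of ParmEd. It relies on the fact that in Python 3 socket.timeout is a subclass of OSError (of which IOError is an alias), so it has to be caught before IOError. ImportError is deliberately left to propagate.

```python
import socket


def try_download(download, pdb_id, timeout=10):
    """Call download(pdb_id, timeout=timeout); return (result, error_message)."""
    try:
        return download(pdb_id, timeout=timeout), None
    except socket.timeout:
        # Must precede IOError: socket.timeout is an OSError subclass in Python 3
        return None, "timed out"
    except TypeError:
        return None, "invalid PDB ID"
    except IOError as err:
        return None, "retrieval failed: %s" % err


# Hypothetical stand-ins for a real downloader, used only to exercise the wrapper
def fake_success(pdb_id, timeout=10):
    return "structure for " + pdb_id


def fake_timeout(pdb_id, timeout=10):
    raise socket.timeout("no response")


def fake_missing(pdb_id, timeout=10):
    raise IOError("entry not found")
```

In real use, the download argument would be CIFFile.download or PDBFile.download, both of which accept pdb_id and timeout as documented here.
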
static id_format(filename)[source]

Identifies the file type as a PDBx/mmCIF file

Parameters
filename : str

Name of the file to check format for

Returns
is_fmt : bool

True if it is a PDBx/mmCIF file

static parse(filename, skip_bonds=False)[source]

Read a PDBx or mmCIF file and return a populated Structure class

Parameters
filename : str or file-like

Name of the PDBx/mmCIF file to read, or a file-like object that can iterate over the lines of a PDBx/mmCIF file. Compressed file names can be specified and are determined by file-name extension (e.g., file.cif.gz, file.cif.bz2)

skip_bonds : bool, optional

If True, skip trying to assign bonds. This can save substantial time when parsing large files with non-standard residue names. However, no bonds are assigned. This is OK if, for instance, the CIF file is being parsed simply for its coordinates. Default is False.

Returns
structure1 [, structure2 [, structure3 [, …] ] ]
structure# : Structure

The Structure object initialized with all of the information from the PDBx/mmCIF file. No bonds or other topological features are added by default. If multiple structures are defined in the CIF file, multiple Structure instances will be returned as a tuple.

Raises
ValueError if the file severely violates the PDBx/mmCIF format specification. If this occurs, check the formatting on each line and make sure it matches the others.

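
Because parse returns either a single Structure or a tuple of them, calling code may want to normalize the result before iterating. A minimal sketch follows; as_structure_tuple is a hypothetical helper, not a ParmEd function, and it assumes that a single Structure is not itself a tuple.

```python
def as_structure_tuple(result):
    """Wrap a single parsed structure in a 1-tuple; pass tuples through unchanged."""
    if isinstance(result, tuple):
        return result
    return (result,)


# Usage with placeholder objects standing in for Structure instances
single = object()
pair = (object(), object())
structures = as_structure_tuple(single)
```
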
static write(struct, dest, renumber=True, coordinates=None, altlocs='all', write_anisou=False, standard_resnames=False)[source]

Write a PDBx/mmCIF file from the current Structure instance

Parameters
struct : Structure

The structure from which to write the PDBx/mmCIF file

dest : str or file-like

Either a file name or a file-like object containing a write method to which to write the PDBx/mmCIF file. If it is a filename that ends with .gz or .bz2, a compressed version will be written using either gzip or bzip2, respectively.

renumber : bool

If True, renumber the atoms and residues sequentially as they are stored in the structure. If False, use the original numbering if it was assigned previously

coordinates : array-like of float

If provided, these coordinates will be written to the PDBx/mmCIF file instead of the coordinates stored in the structure. These coordinates should line up with the atom order in the structure (not necessarily the order of the “original” file if they differ)

altlocs : str

Keyword controlling which alternate locations are printed to the resulting PDBx/mmCIF file. Allowable options are:

  • ‘all’ : (default) print all alternate locations

  • ‘first’ : print only the first alternate location

  • ‘occupancy’ : print the one with the largest occupancy. If two conformers have the same occupancy, the first one to occur is printed

Input is case-insensitive, and a partial string is permitted as long as it is a substring of exactly one of the above options, so that it uniquely identifies the choice.

write_anisou : bool

If True, an ANISOU record is written for every atom that has one. If False, ANISOU records are not written

standard_resnames : bool, optional

If True, common aliases for various amino and nucleic acid residues will be converted into the PDB-standard values. Default is False

Notes

If multiple coordinate frames are present, these will be written as separate models (but only the unit cell from the first model will be written, as the PDBx standard dictates that only one set of unit cells shall be present).
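
The altlocs rules described above can be sketched as follows. This is an illustration of the documented behavior read literally (case-insensitive, unique substring, first conformer wins ties), not ParmEd's own code; resolve_altlocs, pick_by_occupancy, and the (label, occupancy) pairs are hypothetical names and data shapes chosen for the example.

```python
ALTLOC_OPTIONS = ("all", "first", "occupancy")


def resolve_altlocs(keyword):
    """Map a possibly partial, case-insensitive keyword to one full option."""
    key = keyword.lower()
    if key in ALTLOC_OPTIONS:
        return key
    # Keep every option that contains the (non-empty) partial string
    matches = [opt for opt in ALTLOC_OPTIONS if key and key in opt]
    if len(matches) != 1:
        raise ValueError("altlocs keyword %r is not unique or not recognized" % keyword)
    return matches[0]


def pick_by_occupancy(conformers):
    """Return the (label, occupancy) pair with the largest occupancy.

    max() returns the first maximal item, so ties go to the earliest conformer.
    """
    return max(conformers, key=lambda pair: pair[1])
```

Under this reading, "OCC" resolves to ‘occupancy’, while "a" is rejected because it occurs in both ‘all’ and ‘occupancy’.
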

class parmed.formats.pdb.PDBFile(fileobj)[source]

Bases: object

Standard PDB file format parser and writer

Methods

AtomLookupKey(name, number, residue_name, …)

download(pdb_id[, timeout, saveto])

Goes to the wwPDB website and downloads the requested PDB, loading it as a Structure instance

id_format(filename)

Identifies the file type as a PDB file

parse(filename[, skip_bonds])

Read a PDB file and return a populated Structure class

write(struct, dest[, renumber, coordinates, …])

Write a PDB file from a Structure instance

class AtomLookupKey(name, number, residue_name, residue_number, chain, insertion_code, segment_id, alternate_location)

Bases: tuple

Attributes
alternate_location

Alias for field number 7

chain

Alias for field number 4

insertion_code

Alias for field number 5

name

Alias for field number 0

number

Alias for field number 1

residue_name

Alias for field number 2

residue_number

Alias for field number 3

segment_id

Alias for field number 6

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property alternate_location

Alias for field number 7

property chain

Alias for field number 4

property insertion_code

Alias for field number 5

property name

Alias for field number 0

property number

Alias for field number 1

property residue_name

Alias for field number 2

property residue_number

Alias for field number 3

property segment_id

Alias for field number 6
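
The field aliases above follow from AtomLookupKey being a named tuple. The sketch below builds a local stand-in with the same field order using collections.namedtuple, rather than importing the class from ParmEd; the example values are made up.

```python
from collections import namedtuple

# Local stand-in with the field order documented above (not imported from ParmEd)
AtomLookupKey = namedtuple(
    "AtomLookupKey",
    [
        "name", "number", "residue_name", "residue_number",
        "chain", "insertion_code", "segment_id", "alternate_location",
    ],
)

# Hypothetical key for the alpha carbon of residue ALA 1 in chain A
key = AtomLookupKey("CA", 2, "ALA", 1, "A", "", "", "")

# Each attribute is an alias for a positional field
same_chain = key.chain == key[4]
same_altloc = key.alternate_location == key[7]
```

Because the key is a plain tuple underneath, it is hashable and can be used directly as a dictionary key.
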

static download(pdb_id, timeout=10, saveto=None)[source]

Goes to the wwPDB website and downloads the requested PDB, loading it as a Structure instance

Parameters
pdb_id : str

The 4-character PDB ID to try and download from the RCSB PDB database

timeout : float, optional

The number of seconds to wait before raising a timeout error. Default is 10 seconds

saveto : str, optional

If provided, this will be treated as a file name to which the PDB file will be saved. If None (default), no PDB file will be written. This will be a verbatim copy of the downloaded PDB file, unlike the somewhat-stripped version you would get by using Structure.write_pdb

Returns
struct : Structure

Structure instance populated by the requested PDB

Raises
socket.timeout if the connection times out while trying to contact the FTP server
IOError if there is a problem retrieving the requested PDB or writing a requested saveto file
ImportError if the gzip module is not available
TypeError if pdb_id is not a 4-character string

static id_format(filename)[source]

Identifies the file type as a PDB file

Parameters
filename : str or file object

Name of the file to check format for

Returns
is_fmt : bool

True if it is a PDB file

classmethod parse(filename, skip_bonds=False)[source]

Read a PDB file and return a populated Structure class

Parameters
filename : str or file-like

Name of the PDB file to read, or a file-like object that can iterate over the lines of a PDB. Compressed file names can be specified and are determined by file-name extension (e.g., file.pdb.gz, file.pdb.bz2)

skip_bonds : bool, optional

If True, skip trying to assign bonds. This can save substantial time when parsing large files with non-standard residue names. However, no bonds are assigned. This is OK if, for instance, the PDB file is being parsed simply for its coordinates. It may also make element assignment less complete when element information is not already present in the PDB file. Default is False.

Returns
structure : Structure

The Structure object initialized with all of the information from the PDB file. No bonds or other topological features are added by default.
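
Choosing a decompressor from the file-name extension, as described for filename above, can be sketched with the standard library. This is an assumption about the general technique, not ParmEd's internal code; open_by_extension is a hypothetical helper name.

```python
import bz2
import gzip


def open_by_extension(filename, mode="rt"):
    """Open a plain, gzip, or bzip2 file in text mode based on its extension."""
    if filename.endswith(".gz"):
        return gzip.open(filename, mode)
    if filename.endswith(".bz2"):
        return bz2.open(filename, mode)
    return open(filename, mode)
```

The returned object can be iterated line by line regardless of compression, which matches the "file-like object that can iterate over the lines" wording above.
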

static write(struct, dest, renumber=True, coordinates=None, altlocs='all', write_anisou=False, charmm=False, use_hetatoms=True, standard_resnames=False, increase_tercount=True, write_links=False)[source]

Write a PDB file from a Structure instance

Parameters
struct : Structure

The structure from which to write the PDB file

dest : str or file-like

Either a file name or a file-like object containing a write method to which to write the PDB file. If it is a filename that ends with .gz or .bz2, a compressed version will be written using either gzip or bzip2, respectively.

renumber : bool, optional, default True

If True, renumber the atoms and residues sequentially as they are stored in the structure. If False, use the original numbering if it was assigned previously.

coordinates : array-like of float, optional

If provided, these coordinates will be written to the PDB file instead of the coordinates stored in the structure. These coordinates should line up with the atom order in the structure (not necessarily the order of the “original” PDB file if they differ)

altlocs : str, optional, default ‘all’

Keyword controlling which alternate locations are printed to the resulting PDB file. Allowable options are:

  • ‘all’ : print all alternate locations

  • ‘first’ : print only the first alternate location

  • ‘occupancy’ : print the one with the largest occupancy. If two conformers have the same occupancy, the first one to occur is printed

Input is case-insensitive, and a partial string is permitted as long as it is a substring of exactly one of the above options, so that it uniquely identifies the choice.

write_anisou : bool, optional, default False

If True, an ANISOU record is written for every atom that has one. If False, ANISOU records are not written.

charmm : bool, optional, default False

If True, SEGID will be written in columns 73 to 76 of the PDB file in the typical CHARMM-style PDB output. This will be omitted for any atom that does not contain a SEGID identifier.

use_hetatoms : bool, optional, default True

If True, certain atoms will have the HETATM tag instead of ATOM, as per the PDB standard.

standard_resnames : bool, optional, default False

If True, common aliases for various amino and nucleic acid residues will be converted into the PDB-standard values.

increase_tercount : bool, optional, default True

If True, the atom number field of each TER record is increased by one relative to the atom record preceding it; this conforms to the PDB standard.

write_links : bool, optional, default False

If True, any LINK records stored in the Structure will be written to the LINK records near the top of the PDB file. If this is True, then renumber must be False or a ValueError will be raised.

Notes

If multiple coordinate frames are present, these will be written as separate models (but only the unit cell from the first model will be written, as the PDB standard dictates that only one set of unit cells shall be present).
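
The column placement described for the charmm option can be illustrated with plain string handling. The sketch below is not ParmEd's writer: with_segid is a hypothetical helper, and left-justifying the identifier and truncating it to four characters are assumptions made for the example.

```python
def with_segid(record, segid):
    """Return a PDB record with segid placed in columns 73 to 76 (1-based)."""
    if not segid:
        # Atoms without a SEGID are left untouched, as described above
        return record
    # Pad so that columns 73-76 exist, then splice the identifier in
    padded = record.rstrip("\n").ljust(76)
    return padded[:72] + segid[:4].ljust(4) + padded[76:]


# A placeholder record with an element symbol in columns 77-78
record = "ATOM".ljust(76) + " N"
tagged = with_segid(record, "PROA")
```

Columns 77 and beyond are preserved, so an element symbol already present in the record is not overwritten.
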

parmed.formats.pdb.try_convert(value, cast_type, default=None)[source]