Protein Data Bank (PDB)

The PDB (Protein Data Bank) file format is a standard text-based format used to store information
about the three-dimensional structures of biological macromolecules, including proteins, nucleic acids,

and large molecular complexes. While the PDB format is commonly associated with biological macromolecules, it can also be used to represent crystal structures of small molecules. The file includes details such as atomic coordinates, bond information, and metadata about the structure.

Below is a simplified example of a PDB file representing a crystal structure of a small molecule:

HEADER    SMALL MOLECULE CRYSTAL STRUCTURE
CRYST1    5.0   5.0   5.0  90.00  90.00  90.00 P 1           1
ATOM      1  O   HOH A   1       0.000   0.000   0.000  1.00  0.00
ATOM      2  H1  HOH A   1       0.500   0.500   0.000  1.00  0.00
ATOM      3  H2  HOH A   1       0.500   0.000   0.500  1.00  0.00
ATOM      4  H3  HOH A   1       0.000   0.500   0.500  1.00  0.00
END

In this example:

  • The HEADER line provides a brief description of the structure.

  • The CRYST1 line specifies the unit cell parameters (a, b, c, alpha, beta, gamma) and space group information.

  • The ATOM lines list the atoms present in the structure, including their atomic coordinates, element type, and connectivity.

  • The END keyword indicates the end of the file.

Please note that while the PDB format was initially designed for biological macromolecules, it has been adapted and extended to accommodate small-molecule crystal structures as well. For more complex structures or those with additional features, the PDB format may include other sections, such as connectivity information, crystallographic details, and experimental methods.