Table: identifiers

Description: Table of identifiers of chemical substances. These can be formal specifications or primary keys of online databases. Some identifiers in this table were derived from the ThermoML files; others were retrieved from other sites using the InChIKey as the search term. Examples are:

  • ThermoML file: InChI and InChIKey
  • PubChem: PubChem Compound ID (CID), IUPAC name, CAS Registry Number, canonical SMILES string and isomeric SMILES string
  • CommonChemistry: CAS Registry Number
  • Wikidata: Wikidata ID, ChemSpider ID, CAS Registry Number, DSSTox Compound ID, EC Number and ChEMBL ID
  • OPSIN: IUPAC Name
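As a rough illustration of using the InChIKey as the search term, the sketch below builds a PubChem PUG REST URL that looks up the Compound ID (CID) for a given InChIKey. This is not necessarily how the identifiers in this table were retrieved; the function name `pubchem_cid_url` and the validation pattern are assumptions made for the example, and the script only constructs the URL without sending a request.

```python
import re
from urllib.parse import quote

# Base URL of the PubChem PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# A standard InChIKey has the shape: 14 letters, 10 letters, 1 letter,
# separated by hyphens (27 characters in total).
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def pubchem_cid_url(inchikey: str) -> str:
    """Return the PUG REST URL that lists the CIDs matching an InChIKey.

    Hypothetical helper for illustration; it performs no network access.
    """
    key = inchikey.strip().upper()
    if not INCHIKEY_PATTERN.fullmatch(key):
        raise ValueError(f"not a well-formed InChIKey: {inchikey!r}")
    return f"{PUG_REST_BASE}/compound/inchikey/{quote(key)}/cids/JSON"


# InChIKey of water, used here only as a sample search term.
url = pubchem_cid_url("XLYOFNOQVPJJNP-UHFFFAOYSA-N")
print(url)
```

Fetching the printed URL (for example with `urllib.request.urlopen`) should return a JSON document containing the matching CIDs, which could then be stored with PubChem recorded as the source.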

Statistics on the sources of identifiers in the database (covering 8284 unique chemical substances) are shown below.

Identifier Stats

Fields from the 'Compound' section in the ThermoML Schema

ThermoML Schema

Example data from a 'Compound' section of a ThermoML file

ThermoML Example

MySQL 'identifiers' table structure

MySQL Structure

MySQL Fields

  • id: primary key of the identifier record (auto-generated and unique)
  • substance_id: foreign key to the substances table, pointing to the substance that the identifier represents
  • type: enumerated list of types of identifier stored in the table
  • value: text value of an identifier
  • source: the source (ThermoML file or online database) of the identifier
  • updated: date and time the record was last updated
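The field list above can be sketched as a small, self-contained example. It uses an in-memory SQLite database as a stand-in for MySQL, so the column types are approximations: the real `type` column is an enumerated list whose values are not reproduced here, and the `substances` table with its `name` column is a hypothetical minimal parent table. The sample rows are well-known identifiers for water, included only to show how one substance maps to several identifier records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript("""
-- Hypothetical minimal parent table; the real substances table has more fields.
CREATE TABLE substances (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL
);

-- Approximation of the 'identifiers' table described above.
CREATE TABLE identifiers (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    substance_id INTEGER NOT NULL REFERENCES substances(id),
    type         TEXT NOT NULL,   -- an ENUM in MySQL; plain text in this sketch
    value        TEXT NOT NULL,
    source       TEXT NOT NULL,
    updated      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
""")

# Insert one substance and several identifiers that point to it.
substance_id = conn.execute(
    "INSERT INTO substances (name) VALUES (?)", ("water",)
).lastrowid

rows = [
    # (substance_id, type, value, source); the type labels are assumed names
    (substance_id, "inchikey", "XLYOFNOQVPJJNP-UHFFFAOYSA-N", "ThermoML file"),
    (substance_id, "casrn", "7732-18-5", "CommonChemistry"),
    (substance_id, "pubchemId", "962", "PubChem"),
]
conn.executemany(
    "INSERT INTO identifiers (substance_id, type, value, source) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# Retrieve every identifier recorded for the substance.
found = conn.execute(
    "SELECT type, value, source FROM identifiers "
    "WHERE substance_id = ? ORDER BY id",
    (substance_id,),
).fetchall()

for id_type, value, source in found:
    print(f"{id_type:10s} {value:30s} ({source})")
```

Keeping the `source` alongside each value makes it possible to store the same kind of identifier from several databases and compare them, which matches the per-source breakdown listed in the description.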