Proof of concept to generate InChIs that distinguish structures with different Square Planar, Trigonal Bipyramidal, and Octahedral configurations.
It adds a '/ma' (metal architecture) layer as proposed by Jonathan Goodman to the end of an InChI. Currently the following geometries are supported:
- Square Planar:
/ma<atom>s<order>
where order is 1,2,3
- Trigonal Bipyramidal:
/ma<atom>tb<order>
where order is 1-20
- Octahedral:
/ma<atom>o<order>
where order is 1-30
Daylight used the term 'chiral order'; I prefer 'order' as these geometries are not necessarily chiral.
The order specifies a permutation index and uses the same coding scheme as SMILES (see the relevant blog post). It works because we need to be able to specify any order of the neighbors around the central (or 'focus') atom. For octahedral we have 6 neighbors, so there are 720 (6 factorial) possible ways to order them. However, there are 24 symmetries, so we only need 720/24 = 30 possible orders. For trigonal bipyramidal there are 120 (i.e. 5 factorial) ways to order the neighbors but 6 symmetries, so 120/6 = 20 possible orders.
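The arithmetic above can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and the square planar symmetry count of 8 is not stated above but is assumed here because 4 factorial divided by 8 gives the 3 orders listed for that geometry.

```java
class OrderCount {
    // n! computed iteratively; n is at most 6 here, so a long is more than enough
    static long factorial(int n) {
        long f = 1;
        for (int i = 2; i <= n; i++) {
            f *= i;
        }
        return f;
    }

    // distinct orders = (ways to list the neighbours) / (symmetry operations of the geometry)
    static long distinctOrders(int neighbours, int symmetries) {
        return factorial(neighbours) / symmetries;
    }

    public static void main(String[] args) {
        // ASSUMPTION: 8 symmetries for square planar (not given in the text above)
        System.out.println("square planar:        " + distinctOrders(4, 8));
        System.out.println("trigonal bipyramidal: " + distinctOrders(5, 6));
        System.out.println("octahedral:           " + distinctOrders(6, 24));
    }
}
```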
For each of these geometries we use a table to look up the ordering we have ended up with from the InChI atom numbers (parsed from the AuxInfo). When there are symmetries within the neighbors we choose the lowest possible ordering. Currently such symmetries are broken by re-enumeration but in practice a backtracking canonical labelling algorithm (such as that used by the InChI) can take care of this step.
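The "choose the lowest possible ordering" step can be sketched for the square planar case. The code below is not the project's lookup table or its re-enumeration code. It assumes that neighbors are given going round the square, that equivalent neighbors carry the same label, that labels compare in the same order the canonical numbering would rank them, and that orders 1, 2, 3 follow the SMILES U, 4, and Z shapes (first neighbor trans to the third, second, and fourth respectively). All names are hypothetical.

```java
class LowestOrder {
    // Order for fully distinct ranks (1..4), listed by position going round the square.
    // ASSUMPTION: 1 = rank 1 trans to rank 3 (U), 2 = trans to rank 2 (4), 3 = trans to rank 4 (Z).
    // The array must contain each of 1..4 exactly once.
    static int order(int[] ranks) {
        int p = 0;
        while (ranks[p] != 1) {
            p++;
        }
        int trans = ranks[(p + 2) % 4]; // the opposite corner of the square
        switch (trans) {
            case 3:  return 1;
            case 2:  return 2;
            default: return 3;
        }
    }

    // Try every way of breaking ties between equivalent neighbours and keep the lowest order.
    static int lowestOrder(String[] ligands) {
        return search(ligands, new int[4], new boolean[5], 0);
    }

    private static int search(String[] ligands, int[] ranks, boolean[] used, int pos) {
        if (pos == 4) {
            return consistent(ligands, ranks) ? order(ranks) : Integer.MAX_VALUE;
        }
        int best = Integer.MAX_VALUE;
        for (int r = 1; r <= 4; r++) {
            if (used[r]) {
                continue;
            }
            used[r] = true;
            ranks[pos] = r;
            best = Math.min(best, search(ligands, ranks, used, pos + 1));
            used[r] = false;
        }
        return best;
    }

    // A rank assignment is valid if different ligands keep their relative order;
    // only ligands with the same label may swap ranks.
    private static boolean consistent(String[] ligands, int[] ranks) {
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                if (ligands[i].compareTo(ligands[j]) < 0 && ranks[i] > ranks[j]) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(lowestOrder(new String[]{"Cl", "Cl", "N", "N"})); // like neighbours adjacent
        System.out.println(lowestOrder(new String[]{"Cl", "N", "Cl", "N"})); // like neighbours opposite
    }
}
```

Under these assumptions the adjacent (cis-like) arrangement resolves to order 1 and the opposite (trans-like) arrangement to order 2. A canonical labelling algorithm would avoid the brute-force enumeration of tie-breaks used here.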
There are 24 (4 factorial) possible orderings of Square Planar neighbors. For platin, where the neighbors form two symmetric pairs, there are two possibilities: cis-platin and trans-platin. As shown below, the /ma layer distinguishes them, with /ma5sp1 corresponding to cis- and /ma5sp2 to trans-.
Note the numbers here refer to the input atom order in the SMILES (see examples/platin.smi) and not the InChI canonical numbers.
Note: The SMILES depiction will make some bonds longer to try to fit in the expanded NO2 group.
cis/trans- examples:
fac/mer- examples:
When a TBPY center has two neighbors of one group and three of another, there are 3 possibilities for the pair: both axial, both equatorial, or one axial and one equatorial.
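The three TBPY possibilities can be reproduced by brute force. The sketch below places the two like neighbors on every pair of the five positions and classifies each placement by how many of the two sit on the axis. The position numbering (0 and 1 axial, 2 to 4 equatorial) and the class name are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;

class TbpyPlacements {
    // ASSUMPTION: positions 0 and 1 are the axis, positions 2, 3, 4 are equatorial
    static boolean isAxial(int position) {
        return position < 2;
    }

    // Count placements of the two like neighbours, grouped by arrangement
    static Map<String, Integer> classify() {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < 5; i++) {
            for (int j = i + 1; j < 5; j++) {
                int axial = (isAxial(i) ? 1 : 0) + (isAxial(j) ? 1 : 0);
                String label;
                if (axial == 2) {
                    label = "both axial";
                } else if (axial == 1) {
                    label = "one axial, one equatorial";
                } else {
                    label = "both equatorial";
                }
                counts.merge(label, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // 10 placements in total fall into exactly 3 arrangements
        classify().forEach((label, n) -> System.out.println(label + ": " + n));
    }
}
```

Placements within one arrangement differ only by a rotation of the bipyramid, which is why they collapse to a single possibility each.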
A command line application is provided (available: inchi-ma.jar) that can provide InChIs with an /ma layer for a SMILES or 3D SDfile:
$ java -jar inchi-ma.jar input.smi
$ java -jar inchi-ma.jar input.sdf
Some example inputs are provided in the examples/ directory.
Currently only constitutionally different neighbors are handled. The system used here could also encode geometries such as lambda/delta Fe(ox)3 if there were tighter integration with the canonical labelling procedure.
The project is built with Apache Maven.
$ mvn install
will generate the file target/inchi-ma.jar.