Every time I reinstall torchani I get this error message about atoms being bytes instead of strings; this easily fixes it. #639

Open
wants to merge 42 commits into
base: master


avanteijlingen

Error:

File ~\anaconda3\lib\site-packages\torchani\data\__init__.py:164 in reenterable_iterable_factory
    d['species'] = numpy.array([idx[s] for s in d['species']], dtype='i8')

File ~\anaconda3\lib\site-packages\torchani\data\__init__.py:164 in <listcomp>
    d['species'] = numpy.array([idx[s] for s in d['species']], dtype='i8')

KeyError: b'C'

The proposed fix lets the program work whether the atomic labels are parsed as bytes or as strings, by calling .decode() inside a try/except block.
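A minimal sketch of that idea, not the actual diff in this PR: the helper names `to_str` and `species_to_indices` and the `species_order` argument are hypothetical, and the `idx` lookup only mirrors the line shown in the traceback.

```python
def to_str(label):
    """Return label as str, decoding it first if it is a bytes object."""
    try:
        return label.decode("ascii")
    except AttributeError:
        # str has no .decode() in Python 3, so the label is already text
        return label


def species_to_indices(species, species_order):
    """Map element symbols (bytes or str) to their position in species_order."""
    idx = {s: i for i, s in enumerate(species_order)}
    return [idx[to_str(s)] for s in species]


# Mixed bytes/str input no longer raises KeyError: b'C'
print(species_to_indices([b"C", "H", b"H"], ["H", "C"]))  # [1, 0, 0]
```

Catching only AttributeError keeps other failures visible, for example a label that is not valid ASCII.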

fixes
  File ~\anaconda3\lib\site-packages\torchani\data\__init__.py:164 in reenterable_iterable_factory
    d['species'] = numpy.array([idx[s] for s in d['species']], dtype='i8')

  File ~\anaconda3\lib\site-packages\torchani\data\__init__.py:164 in <listcomp>
    d['species'] = numpy.array([idx[s] for s in d['species']], dtype='i8')

KeyError: b'C'
backwards compatibility
sigmoid scaling between min and max values
@yueyericardo
Contributor

Hi, thanks for contributing to TorchANI!
Could you tell us how you got the KeyError: b'C' error you mentioned?

@avanteijlingen
Author

SAMPLE.zip

When I create an HDF5 dataset and then load it into ANI, the species table always contains the atoms as b'C', b'H', etc., which it then does not recognise without calling .decode().

I always create the HDF5 datasets roughly like this:

import h5py
import numpy as np

# HDF5_Dataset, groupname, energies, Conformers, species and atom_types are defined elsewhere
mol, E, C, S, F = [], [], [], [], []

mol.append(HDF5_Dataset.create_group(groupname))

# One float64 entry per energy value
E.append(mol[-1].create_dataset("energies", (energies.shape[0],), dtype='float64'))
E[-1][()] = energies
C.append(mol[-1].create_dataset("coordinates", Conformers.shape, dtype='float64'))
C[-1][()] = Conformers

# Split the whitespace-separated symbols, then store them as variable-length strings
species = np.array(species.split(), dtype="<U2")
species = np.array(species, dtype=h5py.special_dtype(vlen=str))

S.append(mol[-1].create_dataset("species", data=species))
atom_types = np.unique(np.hstack((atom_types, species)))
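
One way to check where the bytes come from is to round-trip a species array through h5py on its own. This sketch assumes h5py 3.0 or later, where, as far as I know, variable-length strings are read back as bytes by default and `Dataset.asstr()` returns them as str. The function name `roundtrip_species` and the file name are made up for illustration, and the file lives only in memory.

```python
import h5py
import numpy as np


def roundtrip_species(symbols):
    """Write symbols as variable-length strings, then read them back two ways."""
    # driver="core" with backing_store=False keeps the file in memory only
    with h5py.File("species_demo.h5", "w", driver="core", backing_store=False) as f:
        data = np.array(symbols, dtype=h5py.special_dtype(vlen=str))
        ds = f.create_dataset("species", data=data)
        raw = ds[()]              # default read
        decoded = ds.asstr()[()]  # read with decoding to str
    return raw, decoded


raw, decoded = roundtrip_species(["C", "H", "H"])
print(type(raw[0]), type(decoded[0]))
```

If the default read prints a bytes type here, the dataset itself is fine and the labels only need decoding at load time.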
