Refactor ZairaChem modules to Ersilia Model Hub components #30

miquelduranfrigola · 2023-12-05T15:47:02Z

Motivation

At the moment, ZairaChem has many dependencies, as is apparent from the install.sh file. In particular, the installation creates several Conda environments, which makes ZairaChem difficult, if not impossible, to maintain. To make ZairaChem more sustainable, we need to migrate most of its code to Ersilia Model Hub artefacts.

Types of ZairaChem elements

There are 2 types of modules we want to migrate:

Static: These are the easy ones. For example, ZairaChem uses MELLODDY-Tuner to convert a list of SMILES into a normalized form, with some extra columns. In principle, we could create a new Ersilia model (called, for example, melloddy-tuner) where this is done in an isolated container/environment. This would prevent us from having to install MELLODDY and its dependencies.
Trainable: ZairaChem uses AutoML frameworks such as FLAML, AutoGluon, Keras Tuner and others to automatically train models based on descriptors. Ideally, we want to migrate these trainers into Ersilia Model Hub artefacts. The main challenge 😟 is that, at the moment, Ersilia does not accept fit instructions, so we would need to figure this out first. At a high level, we'd like to have fitting capabilities at training time, accompanied by some persistence of the trained models so that they can be used at prediction time.
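To make the trainable case more concrete, here is a minimal sketch of the fit / persist / predict cycle described above. It does not use any Ersilia or AutoML API: `MeanRegressor`, `fit_and_persist` and `predict_from_disk` are hypothetical names invented for illustration, and the toy model simply stands in for a real trainer such as FLAML. The only point is the separation between a training-time step that writes state to disk and a prediction-time step that reads it back.

```python
import pickle
from pathlib import Path
from statistics import mean


class MeanRegressor:
    """Toy stand-in for an AutoML trainer (hypothetical, for illustration only).

    It predicts the mean of the training targets for every input.
    """

    def __init__(self):
        self.mean_ = None

    def fit(self, X, y):
        # Training time: learn state from the data.
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        # Prediction time: only the stored state is needed.
        if self.mean_ is None:
            raise RuntimeError("Model has not been fitted")
        return [self.mean_ for _ in X]

    def save(self, path):
        # Persist only the learned state, not the whole object.
        with open(path, "wb") as f:
            pickle.dump({"mean": self.mean_}, f)

    @classmethod
    def load(cls, path):
        with open(path, "rb") as f:
            state = pickle.load(f)
        model = cls()
        model.mean_ = state["mean"]
        return model


def fit_and_persist(X, y, model_dir):
    """Training-time entry point: fit a model and write it to model_dir."""
    path = Path(model_dir) / "model.pkl"
    MeanRegressor().fit(X, y).save(path)
    return path


def predict_from_disk(X, path):
    """Prediction-time entry point: reload the persisted model and predict."""
    return MeanRegressor.load(path).predict(X)
```

In a real component, `X` would be the descriptors computed upstream and the persisted file would live wherever the fitted artefact is stored; how Ersilia should expose and store this is exactly the open question.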

Roadmap

We should start with static migration, while we figure out the approach for trainable models. I suggest the following order (subject to change):

1. Ersilia Compound Embeddings (eos2gw4)
2. MELLODDY-Tuner
3. Modify Ersilia codebase to enable fit commands. Probably, we should work on this in a separate issue.
4. Create a fittable FLAML Ersilia model.
5. Create a fittable Keras Tuner Ersilia model.
6. Create a fittable AutoGluon Ersilia model.
7. Create a fittable MolMap Ersilia model.
8. Replace existing code in ZairaChem with ErsiliaModel Python API calls.
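As a rough illustration of the last step, the sketch below shows how a ZairaChem descriptor call could be routed through a thin adapter instead of in-process code. It assumes that the model object exposes `serve()`, `run(input=...)` and `close()`; these method names and the shape of the returned records are assumptions and should be checked against the Ersilia documentation. `served`, `compute_embeddings` and `FakeModel` are hypothetical helpers, and the model class is injected so that the sketch runs without Ersilia installed.

```python
from contextlib import contextmanager


@contextmanager
def served(model):
    """Serve a model for the duration of a block, then always close it."""
    model.serve()
    try:
        yield model
    finally:
        model.close()


def compute_embeddings(smiles_list, model_factory, model_id="eos2gw4"):
    """Run a hub model on a list of SMILES and return the results as a list.

    model_factory is injected: in ZairaChem it would be the ErsiliaModel
    class (assumed interface), in tests it can be a lightweight double.
    """
    with served(model_factory(model_id)) as model:
        return list(model.run(input=smiles_list))


class FakeModel:
    """Test double with the same assumed interface.

    It returns the length of each SMILES string instead of a real embedding.
    """

    def __init__(self, model_id):
        self.model_id = model_id

    def serve(self):
        pass

    def run(self, input):
        for smi in input:
            yield {"input": smi, "output": [float(len(smi))]}

    def close(self):
        pass
```

With Ersilia installed, the call would look something like `compute_embeddings(smiles, ErsiliaModel)`, again under the interface assumptions stated above. Keeping the adapter this thin means ZairaChem itself no longer needs the descriptor's dependencies in its own environment.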


GemmaTuron · 2024-05-08T09:47:38Z

We will need to complete the refactoring before fully incorporating #31 @miquelduranfrigola

miquelduranfrigola · 2024-05-09T11:49:17Z

Yes, all clear. Tagging @DhanshreeA since we will have to further improve the input/output adapters and, most importantly, figure out a way to make Ersilia models trainable and fine-tunable.

GemmaTuron mentioned this issue Apr 8, 2024

Umap Learn dependency issue #37

Open

GemmaTuron assigned miquelduranfrigola May 8, 2024

DhanshreeA self-assigned this May 16, 2024
