Usage of the data4cat module


For convenience, for example for use in lectures, datasets from the BasCat repository (Dataverse) were wrapped into modules. The convenience functions are intended to give a smooth start to working with published remote data. The datasets included so far are:

The BasCat DinoRun dataset on synthesis to ethanol

Installation of the data4cat module

For the installation you can clone or download the repository:

git clone https://github.com/nfdi4cat/data4cat.git

Change into the directory and install data4cat:

pip install .

Or you can directly install the module from the remote source:

python -m pip install git+https://github.com/nfdi4cat/data4cat.git@main

To uninstall, run:

pip uninstall data4cat
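
To check whether the package is importable in the active environment (for example after installing or uninstalling), a small standard-library sketch like the following may help. The helper name `is_installed` is an assumption for illustration and is not part of data4cat.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the top-level package can be located for import."""
    return importlib.util.find_spec(package_name) is not None


# Reports whether data4cat can be found in the current environment.
print("data4cat importable:", is_installed("data4cat"))
```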

With the package installed you first need to import the module:

from data4cat import dino_run

And create an instance:

dinodat = dino_run.dino_offline()

These two steps are always required before using any of the functions described below.

The dino_run dataset from the NFDI4Cat Dataverse instance

One dataset is the BasCat performance dataset on the syngas to ethanol reaction.

Download the dino_run dataset

If no offline version of the dataset is available (e.g. after a fresh install), a copy of the dataset can be downloaded like this:

dinodat.one_shot_dumb()
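
To avoid downloading again when a local copy already exists, the call can be wrapped in a small guard. The following is only a sketch: the helper `ensure_local_copy` is hypothetical, and the location of the offline copy is not documented here, so the path in the usage comment is a placeholder that you would need to replace.

```python
from pathlib import Path
from typing import Callable


def ensure_local_copy(local_file: Path, download: Callable[[], None]) -> bool:
    """Call `download` only if `local_file` does not exist yet.

    Returns True if a download was triggered, False otherwise.
    """
    if local_file.exists():
        return False
    download()
    return True


# Hypothetical usage (the path is a placeholder, not the real storage location):
# ensure_local_copy(Path("path/to/offline/copy"), dinodat.one_shot_dumb)
```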

Create a dataset from the offline data

The data is available either as a pandas DataFrame or as a Bunch object in the style of scikit-learn datasets. The original data can be retrieved as follows:

original = dinodat.original_data()

original.head()

Create a subset of the offline data for the startup phase

A subset for the startup phase, with a time on stream (TOS) < 85, is available, again both as a pandas DataFrame and as a Bunch object.

startup = dinodat.startup_data()

startup.head()
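
The kind of filter behind such a startup subset can be sketched with plain pandas. The data below is synthetic, and the column names `TOS` and `value` are assumptions for illustration; the real dataset may name and structure its columns differently.

```python
import pandas as pd

# Synthetic stand-in data; the real dataset has different columns and values.
runs = pd.DataFrame(
    {
        "TOS": [10.0, 50.0, 84.9, 85.0, 120.0],  # assumed column name
        "value": [0.1, 0.2, 0.3, 0.4, 0.5],  # placeholder measurement
    }
)

# Keep only rows from the startup phase (TOS strictly below 85).
startup_like = runs[runs["TOS"] < 85]

print(startup_like)
```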

Create a subset of the offline data for the selectivity

A subset containing only the selectivity data is prepared especially for unsupervised learning tasks. Requesting this subset also returns the reactor assignments, which are placed in the clusters object here.

selectivity, clusters = dinodat.selectivity()

selectivity.head()

clusters.head()

Create a subset of the offline data for the selectivity without reactor 5

If needed, setting the r5 argument to False excludes the empty reactor 5.

selectivity_wo5, clusters = dinodat.selectivity(r5=False)

selectivity_wo5.head()

clusters.head()
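
Conceptually, excluding a reactor means applying the same row mask to both the selectivity table and the reactor labels so that the two stay aligned. The sketch below uses synthetic data; the column names `S_a` and `S_b` and the label name `Reactor` are assumptions, and this is not necessarily how data4cat implements the r5 argument.

```python
import pandas as pd

# Synthetic stand-in: selectivity values plus one reactor label per row.
selectivity_like = pd.DataFrame(
    {"S_a": [0.2, 0.3, 0.0, 0.4], "S_b": [0.8, 0.7, 0.0, 0.6]}
)
reactors = pd.Series([1, 2, 5, 3], name="Reactor")

# Build one boolean mask and apply it to both objects to keep them aligned.
keep = reactors != 5
selectivity_no5 = selectivity_like[keep]
reactors_no5 = reactors[keep]

print(selectivity_no5)
print(reactors_no5)
```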

Create a subset of the offline data for the reaction conditions

For supervised tasks a subset of the data is provided that contains the reaction conditions as features and the selectivity to ethanol as target.

react_cond, selectivity_EtOH = dinodat.react_cond()

react_cond.head()

selectivity_EtOH.head()
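
A features/target pair like this can be passed directly to a regression method. As a minimal sketch on synthetic data, the following fits an ordinary least-squares model with NumPy. The feature names `T` and `p`, the target name `S_EtOH`, and all values are made up for illustration and do not come from the dataset.

```python
import numpy as np
import pandas as pd

# Synthetic features and an exactly linear synthetic target.
features = pd.DataFrame({"T": [1.0, 0.0, 1.0, 2.0], "p": [0.0, 1.0, 1.0, 1.0]})
target = 0.1 * features["T"] + 0.2 * features["p"] + 0.05
target.name = "S_EtOH"

# Append a column of ones so the fit includes an intercept.
X = np.column_stack([features.to_numpy(), np.ones(len(features))])
coef, *_ = np.linalg.lstsq(X, target.to_numpy(), rcond=None)

# Coefficients are ordered like the feature columns, followed by the intercept.
print(dict(zip([*features.columns, "intercept"], coef)))
```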

Create a subset of the offline data for the reaction conditions without reactor 5

As before, the empty reactor 5 can be excluded by setting the r5 argument to False.

react_cond, selectivity_EtOH = dinodat.react_cond(r5=False)

react_cond.tail()

selectivity_EtOH.tail()
