Add rough mkdocs markdown
johnbradley committed Nov 22, 2024
1 parent 17722d0 commit 5c15113
Showing 8 changed files with 472 additions and 3 deletions.
70 changes: 70 additions & 0 deletions docs/command-line-api.md
@@ -0,0 +1,70 @@
# Command Line API

## bioclip predict
```
usage: bioclip predict [-h] [--format {table,csv}] [--output OUTPUT]
                       [--rank {kingdom,phylum,class,order,family,genus,species} |
                       --cls CLS | --bins BINS] [--k K] [--device DEVICE]
                       [--model MODEL] [--pretrained PRETRAINED]
                       image_file [image_file ...]

positional arguments:
  image_file            input image file(s)

options:
  -h, --help            show this help message and exit
  --format {table,csv}  format of the output, default: csv
  --output OUTPUT       print output to file, default: stdout
  --rank {kingdom,phylum,class,order,family,genus,species}
                        rank of the classification, default: species (when)
  --cls CLS             classes to predict: either a comma separated list or a
                        path to a text file of classes (one per line), when
                        specified the --rank and --bins arguments are not allowed.
  --bins BINS           path to CSV file with two columns with the first being
                        classes and second being bin names, when specified the
                        --cls argument is not allowed.
  --k K                 number of top predictions to show, default: 5
  --device DEVICE       device to use (cpu or cuda or mps), default: cpu
  --model MODEL         model identifier (see command list-models);
                        default: hf-hub:imageomics/bioclip
  --pretrained PRETRAINED
                        pretrained model checkpoint as tag or file, depends on
                        model; needed only if more than one is available
                        (see command list-models)
```


## bioclip embed
```
usage: bioclip embed [-h] [--output OUTPUT] [--device DEVICE] [--model MODEL]
                     [--pretrained PRETRAINED] image_file [image_file ...]

positional arguments:
  image_file            input image file(s)

options:
  -h, --help            show this help message and exit
  --output OUTPUT       print output to file, default: stdout
  --device DEVICE       device to use (cpu or cuda or mps), default: cpu
  --model MODEL         model identifier (see command list-models);
                        default: hf-hub:imageomics/bioclip
  --pretrained PRETRAINED
                        pretrained model checkpoint as tag or file, depends
                        on model; needed only if more than one is available
                        (see command list-models)
```

## bioclip list-models
```
usage: bioclip list-models [-h] [--model MODEL]

Note that this will only list models known to open_clip; any model identifier
loadable by open_clip, such as from hf-hub, file, etc should also be usable for
--model in the embed and predict commands.
(The default model hf-hub:imageomics/bioclip is one example.)

options:
  -h, --help     show this help message and exit
  --model MODEL  list available tags for pretrained model checkpoint(s) for
                 specified model
```
126 changes: 126 additions & 0 deletions docs/command-line-tutorial.md
@@ -0,0 +1,126 @@
# Command Line Tutorial

Before beginning this tutorial you need to [install pybioclip](/#installation) and download two example images: [`Ursus-arctos.jpeg`](https://huggingface.co/spaces/imageomics/bioclip-demo/blob/ef075807a55687b320427196ac1662b9383f988f/examples/Ursus-arctos.jpeg)
and [`Felis-catus.jpeg`](https://huggingface.co/spaces/imageomics/bioclip-demo/blob/ef075807a55687b320427196ac1662b9383f988f/examples/Felis-catus.jpeg).

## Tree Of Life Predictions

### Predict species for an image

Predict the species shown in the `Ursus-arctos.jpeg` image:
```console
bioclip predict Ursus-arctos.jpeg
```
Output:
```
file_name,kingdom,phylum,class,order,family,genus,species_epithet,species,common_name,score
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos,Ursus arctos,Kodiak bear,0.9356034994125366
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos syriacus,Ursus arctos syriacus,syrian brown bear,0.05616999790072441
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos bruinosus,Ursus arctos bruinosus,,0.004126196261495352
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctus,Ursus arctus,,0.0024959812872111797
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,americanus,Ursus americanus,Louisiana black bear,0.0005009894957765937
```

### Predict species for multiple images saving to a file

To make predictions for `Ursus-arctos.jpeg` and `Felis-catus.jpeg` and save the output to a file named `predictions.csv`:
```console
bioclip predict --output predictions.csv Ursus-arctos.jpeg Felis-catus.jpeg
```
The contents of `predictions.csv` will look like this:
```
file_name,kingdom,phylum,class,order,family,genus,species_epithet,species,common_name,score
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos,Ursus arctos,Kodiak bear,0.9356034994125366
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos syriacus,Ursus arctos syriacus,syrian brown bear,0.05616999790072441
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos bruinosus,Ursus arctos bruinosus,,0.004126196261495352
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctus,Ursus arctus,,0.0024959812872111797
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,americanus,Ursus americanus,Louisiana black bear,0.0005009894957765937
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,silvestris,Felis silvestris,European Wildcat,0.7221033573150635
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,catus,Felis catus,Domestic Cat,0.19810837507247925
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,margarita,Felis margarita,Sand Cat,0.02798456884920597
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Lynx,felis,Lynx felis,,0.021829601377248764
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,bieti,Felis bieti,Chinese desert cat,0.010979168117046356
```
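
For downstream processing, a CSV like this can be read with Python's standard `csv` module. The following is a minimal sketch, not part of pybioclip: it embeds a few rows copied from the sample output above as a string (in practice you would open `predictions.csv` instead), and the helper name `top_prediction_per_file` is hypothetical.

```python
import csv
import io

# Rows copied from the sample output above (two per image). In practice,
# replace io.StringIO(SAMPLE) with open("predictions.csv", newline="").
SAMPLE = """\
file_name,kingdom,phylum,class,order,family,genus,species_epithet,species,common_name,score
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos,Ursus arctos,Kodiak bear,0.9356034994125366
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos syriacus,Ursus arctos syriacus,syrian brown bear,0.05616999790072441
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,silvestris,Felis silvestris,European Wildcat,0.7221033573150635
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,catus,Felis catus,Domestic Cat,0.19810837507247925
"""


def top_prediction_per_file(handle):
    """Return {file_name: (species, score)} for the highest-scoring row of each file."""
    best = {}
    for row in csv.DictReader(handle):
        name = row["file_name"]
        score = float(row["score"])  # csv yields strings, so convert before comparing
        if name not in best or score > best[name][1]:
            best[name] = (row["species"], score)
    return best


top = top_prediction_per_file(io.StringIO(SAMPLE))
for file_name, (species, score) in top.items():
    print(f"{file_name}: {species} ({score:.3f})")
```

Comparing scores explicitly, rather than taking the first row per file, keeps the sketch independent of the row order in the file.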

### Predict top 3 genera for an image and display output as a table
```console
bioclip predict --format table --k 3 --rank=genus Ursus-arctos.jpeg
```

Output:
```
+-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
| file_name | kingdom | phylum | class | order | family | genus | score |
+-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Carnivora | Ursidae | Ursus | 0.9994320273399353 |
| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Artiodactyla | Cervidae | Cervus | 0.00032594642834737897 |
| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Artiodactyla | Cervidae | Alces | 7.803700282238424e-05 |
+-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
```

## Custom Label Predictions
### Predict from a list of classes
Create predictions for 3 classes (cat, bird, and bear) for image `Ursus-arctos.jpeg`:
```console
bioclip predict --cls cat,bird,bear Ursus-arctos.jpeg
```
Output:
```
file_name,classification,score
Ursus-arctos.jpeg,cat,4.581644930112816e-08
Ursus-arctos.jpeg,bird,3.051998476166773e-08
Ursus-arctos.jpeg,bear,0.9999998807907104
```

### Predict from a binning CSV
Create predictions for 3 classes (cat, bird, and bear) grouped into 2 bins (one, two) for image `Ursus-arctos.jpeg`.

Create a CSV file named `bins.csv` with the following contents:
```
cls,bin
cat,one
bird,one
bear,two
```
The column names do not matter. Values in the first column are used as the classes, and values in the second column are used as the bin names.
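
If the class-to-bin mapping already lives in Python, the file can be generated rather than typed by hand. This is an illustrative sketch using only the standard library; `CLASS_TO_BIN` and `write_bins_csv` are hypothetical names, and the header labels `cls` and `bin` are arbitrary, as noted above.

```python
import csv
import io

# Hypothetical mapping of class name -> bin name, matching the example above.
CLASS_TO_BIN = {"cat": "one", "bird": "one", "bear": "two"}


def write_bins_csv(handle, class_to_bin):
    """Write a two-column CSV: classes in the first column, bin names in the second."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["cls", "bin"])  # header row; the names themselves are arbitrary
    writer.writerows(class_to_bin.items())


# Written to an in-memory buffer here so the result can be inspected.
buffer = io.StringIO()
write_bins_csv(buffer, CLASS_TO_BIN)
print(buffer.getvalue(), end="")
```

To write the file to disk instead, pass a handle opened with `open("bins.csv", "w", newline="")`.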

Run the `predict` command:
```console
bioclip predict --bins bins.csv Ursus-arctos.jpeg
```

Output:
```
Ursus-arctos.jpeg,two,0.9999998807907104
Ursus-arctos.jpeg,one,7.633736487377973e-08
```
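
The bin scores above are close to the sums of the per-class scores shown in the "Predict from a list of classes" output (cat + bird for `one`, bear for `two`). The sketch below reproduces that arithmetic. Treating a bin's score as the sum of its class scores is an inference from these sample numbers, not a documented guarantee, and `sum_scores_by_bin` is a hypothetical helper.

```python
from collections import defaultdict

# Per-class scores copied from the "Predict from a list of classes" output.
CLASS_SCORES = {
    "cat": 4.581644930112816e-08,
    "bird": 3.051998476166773e-08,
    "bear": 0.9999998807907104,
}
# Same mapping as bins.csv.
CLASS_TO_BIN = {"cat": "one", "bird": "one", "bear": "two"}


def sum_scores_by_bin(class_scores, class_to_bin):
    """Add up class scores per bin (an assumed model of how binning combines scores)."""
    totals = defaultdict(float)
    for cls, score in class_scores.items():
        totals[class_to_bin[cls]] += score
    return dict(totals)


bin_scores = sum_scores_by_bin(CLASS_SCORES, CLASS_TO_BIN)
for bin_name, score in sorted(bin_scores.items(), key=lambda item: item[1], reverse=True):
    print(bin_name, score)
```

The sum for `one` matches the reported value only approximately, which would be consistent with the tool computing scores at lower floating-point precision.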

## Create embeddings

### Create embedding for an image

```console
bioclip embed Ursus-arctos.jpeg
```
Output:
```
{
    "model": "hf-hub:imageomics/bioclip",
    "embeddings": {
        "Ursus-arctos.jpeg": [
            -0.23633578419685364,
            -0.28467196226119995,
            -0.4394485652446747,
            ...
        ]
    }
}
```
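
The JSON output can be loaded with Python's standard `json` module. The sketch below assumes the structure shown above. For brevity it parses a string containing only the first three values of the vector (the real embedding is longer and is elided above), and `summarize_embeddings` is a hypothetical helper name.

```python
import json
import math

# Truncated copy of the sample output: only the first three vector values are included.
SAMPLE_JSON = """
{
  "model": "hf-hub:imageomics/bioclip",
  "embeddings": {
    "Ursus-arctos.jpeg": [-0.23633578419685364, -0.28467196226119995, -0.4394485652446747]
  }
}
"""


def summarize_embeddings(document):
    """Return {file_name: (vector_length, l2_norm)} for each embedding in the document."""
    summary = {}
    for file_name, vector in document["embeddings"].items():
        norm = math.sqrt(sum(value * value for value in vector))
        summary[file_name] = (len(vector), norm)
    return summary


document = json.loads(SAMPLE_JSON)
summary = summarize_embeddings(document)
print("model:", document["model"])
for file_name, (length, norm) in summary.items():
    print(f"{file_name}: {length} values, L2 norm {norm:.4f}")
```

To work with real output, save it with `--output` and read the file with `json.load()` instead of parsing a string.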

## View command line help
```console
bioclip -h
bioclip <command> -h
```

7 changes: 7 additions & 0 deletions docs/css/mkdocstrings.css
@@ -0,0 +1,7 @@
div.doc-contents {
  padding-left: 25px;
}

.doc-heading {
  padding-top: 10px;
}
22 changes: 21 additions & 1 deletion docs/index.md
@@ -1 +1,21 @@
# TODO
# pybioclip

Command line tool and Python package that simplify using [BioCLIP](https://imageomics.github.io/bioclip/) for taxonomic or other label prediction on images (and thus for annotating or labeling them), as well as for generating semantic embeddings for images. No particular understanding of ML or computer vision is required to use it. It also implements a number of performance optimizations for batches of images and custom class lists, which should be particularly useful when integrating it into computational workflows.

## Installation
Requires a Python version that is compatible with [PyTorch](https://pytorch.org/get-started/locally/#linux-python).

```console
pip install pybioclip
```
If you have any issues with installation, please first upgrade pip by running `pip install --upgrade pip`.


## Usage
- __See the [Command Line Tutorial](command-line-tutorial.md) for how to use the `bioclip` command line tool.__
- __See the [Python Tutorial](python-tutorial.md) for how to use the `bioclip` Python package.__





27 changes: 27 additions & 0 deletions docs/python-api.md
@@ -0,0 +1,27 @@
# Python API

::: bioclip.TreeOfLifeClassifier
    options:
      members:
        - predict
        - get_label_data
        - create_taxa_filter
        - apply_filter
      show_root_heading: true
      show_source: true

::: bioclip.Rank
    options:
      show_root_heading: true
      show_source: true

::: bioclip.CustomLabelsClassifier
    options:
      members:
        - predict
      show_root_heading: true
      show_source: true

::: bioclip.predict.BaseClassifier
    options:
      show_root_heading: true
92 changes: 92 additions & 0 deletions docs/python-tutorial.md
@@ -0,0 +1,92 @@
# Python Tutorial

Before beginning this tutorial you need to [install pybioclip](/#installation) and download two example images: [`Ursus-arctos.jpeg`](https://huggingface.co/spaces/imageomics/bioclip-demo/blob/ef075807a55687b320427196ac1662b9383f988f/examples/Ursus-arctos.jpeg)
and [`Felis-catus.jpeg`](https://huggingface.co/spaces/imageomics/bioclip-demo/blob/ef075807a55687b320427196ac1662b9383f988f/examples/Felis-catus.jpeg).

### Predict species classification

```python
from bioclip import TreeOfLifeClassifier, Rank

classifier = TreeOfLifeClassifier()
predictions = classifier.predict("Ursus-arctos.jpeg", Rank.SPECIES)

for prediction in predictions:
    print(prediction["species"], "-", prediction["score"])
```

Output:
```console
Ursus arctos - 0.9356034994125366
Ursus arctos syriacus - 0.05616999790072441
Ursus arctos bruinosus - 0.004126196261495352
Ursus arctus - 0.0024959812872111797
Ursus americanus - 0.0005009894957765937
```

Output from the `predict()` method showing the dictionary structure:
```
[{
    'kingdom': 'Animalia',
    'phylum': 'Chordata',
    'class': 'Mammalia',
    'order': 'Carnivora',
    'family': 'Ursidae',
    'genus': 'Ursus',
    'species_epithet': 'arctos',
    'species': 'Ursus arctos',
    'common_name': 'Kodiak bear',
    'score': 0.9356034994125366
}]
```

The output from the predict function can be converted into a [pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) like so:
```python
import pandas as pd
from bioclip import TreeOfLifeClassifier, Rank

classifier = TreeOfLifeClassifier()
predictions = classifier.predict("Ursus-arctos.jpeg", Rank.SPECIES)
df = pd.DataFrame(predictions)
```

The first argument of the `predict()` method accepts either a single path or a list of paths.
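
When several paths are passed, the returned list mixes predictions for all images, so it can help to group them per file. The sketch below is illustrative only. It assumes each prediction dictionary carries a `file_name` key (as the PIL Images section of this tutorial implies), uses sample rows taken from the command line tutorial output instead of calling the classifier, and `group_by_file` is a hypothetical helper.

```python
from collections import defaultdict

# Sample prediction dictionaries, reduced to three keys; values are taken from
# the command line tutorial output. A real run would produce them with predict().
SAMPLE_PREDICTIONS = [
    {"file_name": "Ursus-arctos.jpeg", "species": "Ursus arctos", "score": 0.9356034994125366},
    {"file_name": "Ursus-arctos.jpeg", "species": "Ursus arctos syriacus", "score": 0.05616999790072441},
    {"file_name": "Felis-catus.jpeg", "species": "Felis silvestris", "score": 0.7221033573150635},
    {"file_name": "Felis-catus.jpeg", "species": "Felis catus", "score": 0.19810837507247925},
]


def group_by_file(predictions):
    """Return {file_name: [prediction, ...]}, preserving the original order."""
    grouped = defaultdict(list)
    for prediction in predictions:
        grouped[prediction["file_name"]].append(prediction)
    return dict(grouped)


grouped = group_by_file(SAMPLE_PREDICTIONS)
for file_name, rows in grouped.items():
    print(file_name, [row["species"] for row in rows])
```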

### Predict from a list of classes
```python
from bioclip import CustomLabelsClassifier

classifier = CustomLabelsClassifier(["duck","fish","bear"])
predictions = classifier.predict("Ursus-arctos.jpeg")
for prediction in predictions:
    print(prediction["classification"], prediction["score"])
```
Output:
```console
duck 1.0306726583309e-09
fish 2.932403668845507e-12
bear 1.0
```

### Predict from a list of classes with binning
```python
from bioclip import CustomLabelsBinningClassifier
classifier = CustomLabelsBinningClassifier(cls_to_bin={
    'dog': 'small',
    'fish': 'small',
    'bear': 'big',
})
predictions = classifier.predict("Ursus-arctos.jpeg")
for prediction in predictions:
    print(prediction["classification"], prediction["score"])
```
Output:
```console
big 0.99992835521698
small 7.165559509303421e-05
```

### PIL Images
The `predict()` methods used in all the examples above accept either a list of paths or a list of [PIL Images](https://pillow.readthedocs.io/en/stable/reference/Image.html).
When a list of PIL images is passed, the index of each image is filled in for `file_name`, because PIL images may not have an associated file name.

44 changes: 44 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,44 @@
site_name: pybioclip
repo_url: https://github.com/Imageomics/pybioclip
nav:
  - Home: index.md
  - Command Line Usage:
    - Tutorial: command-line-tutorial.md
    - API: command-line-api.md
  - Python Usage:
    - Tutorial: python-tutorial.md
    - API: python-api.md

theme:
  name: material
  features:
    - navigation.tabs
    - navigation.tabs.sticky
    - content.code.copy
plugins:
  - search
  - mkdocstrings:
      handlers:
        python:
          paths: [src] # search packages in the src folder
          options:
            docstring_style: google
            merge_init_into_class: true
markdown_extensions:
  - admonition
  - attr_list
  - md_in_html
  - pymdownx.betterem
  - pymdownx.blocks.caption
  - pymdownx.details
  - pymdownx.inlinehilite
  - pymdownx.snippets
  - pymdownx.superfences
  - pymdownx.tasklist
  - pymdownx.tilde
  - pymdownx.highlight:
      anchor_linenums: true
      line_spans: __span
      pygments_lang_class: true
extra_css:
  - css/mkdocstrings.css