From 4e894ab27c8a046a3ee6d3e025dc81767af08c15 Mon Sep 17 00:00:00 2001
From: Tom Denton
Date: Fri, 12 Apr 2024 14:30:43 -0700
Subject: [PATCH] Add notes on installation platforms to the `inference` README.

PiperOrigin-RevId: 624293943
---
 README.md                 |  8 ++++++++
 chirp/inference/README.md | 35 ++++++++++++++++++++++++-----------
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index a9759c55..420d55e1 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,14 @@ A bioacoustics research project.
 
 ## Installation
 
+We support installation on a generic Linux workstation.
+A GPU is recommended, especially when working with large datasets.
+The recipe below is the same one used by our continuous integration testing.
+
+Some users have successfully used our repository with the Windows Subsystem
+for Linux (WSL), or with Docker in a cloud-based virtual machine. Anecdotally,
+installation on OS X is difficult.
+
 You might need the following dependencies.
 
 ```bash
diff --git a/chirp/inference/README.md b/chirp/inference/README.md
index b6f1b019..be9576b9 100644
--- a/chirp/inference/README.md
+++ b/chirp/inference/README.md
@@ -7,19 +7,26 @@ This library is for applying trained models to data.
 We provide a few Python notebooks for efficient transfer learning, as suggested
 in [Feature Embeddings from Large-Scale Acoustic Bird Classifiers Enable Few-Shot Transfer Learning](https://arxiv.org/abs/2307.06292).
 
+The full workflow is illustrated in a
+[Colab tutorial](https://colab.research.google.com/drive/1gPBu2fyw6aoT-zxXFk15I2GObfMRNHUq).
+This tutorial can be used with Google Colab's free tier, requiring no software
+installation, though a (free) Google account is required. This notebook can
+be copied and adapted to work with your own data, stored in Drive.
+
+For local installation and use of the base Python notebooks, we recommend using
+a Linux machine (e.g., Ubuntu) with a moderate GPU. Our continuous integration
+tests install and run on Linux, so that is your best bet for compatibility.
+Some users have had success using the Windows Subsystem for Linux (WSL), or
+with Docker and virtual machines hosted in the cloud.
+Anecdotally, installation on OS X is difficult.
+
 ### Workflow Overview
 
 The classifier workflow has two-or-three steps. We work with an
 /embedding model/, a large quantity of /unlabeled data/ and a usually-smaller
 set of /labeled data/.
 
-Before attempting to run code from this workflow, first download an embedding model. You can use the [Google Bird
-Vocalization Classifier](https://tfhub.dev/google/bird-vocalization-classifier/)
-(aka, Perch). Download the model and unzip it, keeping track of where you've
-placed it. (It is also possible to use [BirdNET](https://github.com/kahst/BirdNET-Analyzer) or, with a bit more effort, any
-model which turns audio into embeddings, such as [YAMNet](https://github.com/tensorflow/models/tree/master/research/audioset/yamnet).)
-
-Then we need to compute /embeddings/ of the target unlabeled audio. The
+We first need to compute /embeddings/ of the target unlabeled audio. The
 unlabeled audio is specified by one or more 'globs' of files like:
 `/my_home/audio/*/*.flac`. Any audio formats readable by Librosa should be
 fine. We provide `embed_audio.ipynb` to do so. This creates a dataset of embeddings
@@ -30,9 +37,10 @@ or more), we provide a Beam pipeline via `embed.py` which can run on a cluster.
 Setting this up may be challenging, however; feel free to get in touch if you
 have questions.
 
-Once we have embedded the unlabeled audio, you can use `search_embeddings.ipynb`
-to search for interesting audio. Starting from a clip (or Xeno-Canto id, or
-URL for an audio file), you can search for similar audio in the unlabeled data.
+Once we have embedded the unlabeled audio, you can use `agile_modeling.ipynb`
+to search for interesting audio and create a classifier. Starting from a clip
+(or Xeno-Canto id, or URL for an audio file), you can search for similar audio
+in the unlabeled data.
 By providing a label and clicking on relevant results, you will start amassing
 a set of `labeled data`.
 
@@ -43,11 +51,16 @@ easy to add additional examples. It is recommended to add examples with length
 matching the /window size/ of the embedding model (5 seconds for Perch, or 3
 seconds for BirdNET).
 
-From there, `active_learning.ipynb` will help build a small classifier using
+From there, the notebook will build a small classifier using
 the embeddings of the labeled audio. The classifier can then be run on the
 unlabeled data. Hand-labeling results will allow you to feed new data into the
 labeled dataset, and iterate quickly.
 
+The `analysis.ipynb` notebook provides additional tools for analyzing data with
+a pre-trained classifier, as developed in `agile_modeling.ipynb`. It can be
+used to run detections over new data, estimate total call density, and
+evaluate the real-world model quality.
+
 ### Installation (Linux)
 
 Install the main repository, following the [instructions](https://github.com/google-research/perch) in the main README.md, and check that the tests pass.