Merge pull request #2367 from cgarciae:move-example-to-rtd

PiperOrigin-RevId: 472265759
google · Sep 5, 2022 · e0baf0d
2 parents 6784fbd + cb9715d

Showing 2 changed files with 227 additions and 162 deletions.
diff --git a/docs/examples.rst b/docs/examples.rst
@@ -1,17 +1,222 @@
+*********
 Examples
-========
-
-.. toctree::
-   :maxdepth: 1
-
-   Launching jobs on Google Cloud <https://github.com/google/flax/tree/main/examples/cloud>
-   ImageNet Classification <https://github.com/google/flax/tree/main/examples/imagenet>
-   Language Modeling (lm1b) <https://github.com/google/flax/tree/main/examples/lm1b>
-   MNIST Classification <https://github.com/google/flax/tree/main/examples/mnist>
-   Part-of-Speech Tagging <https://github.com/google/flax/tree/main/examples/nlp_seq>
-   Predicting Biological Activities of Molecules with Graph Neural Networks <https://github.com/google/flax/tree/main/examples/ogbg_molpcba>
-   Proximal Policy Optimization <https://github.com/google/flax/tree/main/examples/ppo>
-   Seq2Seq: Addition <https://github.com/google/flax/tree/main/examples/seq2seq>
-   SST-2 Classification <https://github.com/google/flax/tree/main/examples/sst2>
-   Basic VAE <https://github.com/google/flax/tree/main/examples/vae>
-   Machine Translation <https://github.com/google/flax/tree/main/examples/wmt>
+*********
+
+Core examples
+##############
+
+
+Core examples are hosted on the Flax repo in the `examples <https://github.com/google/flax/tree/main/examples>`__ 
+directory.
+
+Each example is designed to be **self-contained and easily forkable**, while
+reproducing relevant results in different areas of machine learning.
+
+As discussed in `#231 <https://github.com/google/flax/issues/231>`__, we decided 
+to go for a standard pattern for all examples including the simplest ones (like MNIST). 
+This makes every example a bit more verbose, but once you know one example, you 
+know the structure of all of them. Having unit tests and integration tests is also 
+very useful when you fork these examples.
+
+Some of the examples below have a link "Interactive🕹" that lets you run them
+directly in Colab.
+
+Image classification
+********************
+
+- :octicon:`mark-github;0.9em` `MNIST <https://github.com/google/flax/tree/main/examples/mnist/>`__ -
+  `Interactive🕹 <https://colab.research.google.com/github/google/flax/blob/main/examples/mnist/mnist.ipynb>`__:
+  Convolutional neural network for MNIST classification (featuring simple
+  code).
+
+- :octicon:`mark-github;0.9em` `ImageNet <https://github.com/google/flax/tree/main/examples/imagenet/>`__ -
+  `Interactive🕹 <https://colab.research.google.com/github/google/flax/blob/main/examples/imagenet/imagenet.ipynb>`__:
+  ResNet-50 on ImageNet with weight decay (featuring multi-host SPMD, custom
+  preprocessing, checkpointing, dynamic scaling, mixed precision).
+
+Reinforcement learning
+**********************
+
+- :octicon:`mark-github;0.9em` `Proximal Policy Optimization <https://github.com/google/flax/tree/main/examples/ppo/>`__:
+  Learning to play Atari games (featuring single host SPMD, RL setup).
+
+Natural language processing
+***************************
+
+-  :octicon:`mark-github;0.9em` `Sequence to sequence for number
+   addition <https://github.com/google/flax/tree/main/examples/seq2seq/>`__:
+   (featuring simple code, LSTM state handling, on the fly data generation).
+-  :octicon:`mark-github;0.9em` `Parts-of-speech
+   tagging <https://github.com/google/flax/tree/main/examples/nlp_seq/>`__: Simple
+   transformer encoder model using the Universal Dependencies dataset.
+-  :octicon:`mark-github;0.9em` `Sentiment
+   classification <https://github.com/google/flax/tree/main/examples/sst2/>`__:
+   with an LSTM model.
+-  :octicon:`mark-github;0.9em` `Transformer encoder/decoder model trained on
+   WMT <https://github.com/google/flax/tree/main/examples/wmt/>`__:
+   Translating English/German (featuring multi-host SPMD, dynamic bucketing,
+   attention cache, packed sequences, recipe for TPU training on GCP).
+-  :octicon:`mark-github;0.9em` `Transformer encoder trained on one billion word
+   benchmark <https://github.com/google/flax/tree/main/examples/lm1b/>`__:
+   for autoregressive language modeling, based on the WMT example above.
+
+Generative models
+*****************
+
+-  :octicon:`mark-github;0.9em` `Variational
+   auto-encoder <https://github.com/google/flax/tree/main/examples/vae/>`__:
+   Trained on binarized MNIST (featuring simple code, vmap).
+
+Graph modeling
+**************
+
+- :octicon:`mark-github;0.9em` `Graph Neural Networks <https://github.com/google/flax/tree/main/examples/ogbg_molpcba/>`__:
+  Molecular predictions on ogbg-molpcba from the Open Graph Benchmark.
+
+Contributing Examples
+*********************
+
+Most of the core examples follow a structure that we found to work
+well with Flax projects, and we strive to make the examples easy to explore and
+easy to fork. In particular (taken from `#231 <https://github.com/google/flax/issues/231>`__):
+
+- README: contains links to paper, command line, `TensorBoard <https://tensorboard.dev/>`__ metrics
+- Focus: an example is about a single model/dataset
+- Configs: we use ``ml_collections.ConfigDict`` stored under ``configs/``
+- Tests: executable ``main.py`` loads ``train.py`` which has ``train_test.py``
+- Data: is read from `TensorFlow Datasets <https://www.tensorflow.org/datasets>`__
+- Standalone: every directory is self-contained
+- Requirements: versions are pinned in ``requirements.txt``
+- Boilerplate: is reduced by using `clu <https://pypi.org/project/clu/>`__
+- Interactive: the example can be explored with a `Colab <https://colab.research.google.com/>`__
+
+Repositories Using Flax
+#######################
+
+The following code bases use Flax and provide training frameworks and a wealth
+of examples, in many cases with pre-trained weights:
+
+- `🤗 Hugging Face <https://huggingface.co/flax-community>`__ is a
+  very popular library for building, training, and deploying state of the art
+  machine learning models.
+  These models can be applied on text, images, and audio. After organizing the
+  `JAX/Flax community week <https://github.com/huggingface/transformers/blob/master/examples/research_projects/jax-projects/README.md>`__,
+  they now have over 5,000
+  `Flax/JAX models <https://huggingface.co/models?library=jax&sort=downloads>`__ in
+  their repository.
+
+- `🥑 DALLE Mini <https://huggingface.co/dalle-mini>`__ is a Transformer-based 
+  text-to-image model implemented in JAX/Flax that follows the ideas from the
+  original `DALLE <https://openai.com/blog/dall-e/>`__ paper by OpenAI.
+
+- `Scenic <https://github.com/google-research/scenic>`__ is a codebase/library
+  for computer vision research and beyond. Scenic's main focus is around
+  attention-based models. Scenic has been successfully used to develop
+  classification, segmentation, and detection models for multiple modalities
+  including images, video, audio, and multimodal combinations of them.
+
+- `Big Vision <https://github.com/google-research/big_vision/>`__ is a codebase
+  designed for training large-scale vision models using Cloud TPU VMs or GPU
+  machines. It is based on JAX/Flax libraries, and uses tf.data and TensorFlow
+  Datasets for scalable and reproducible input pipelines. This is the original
+  codebase of ViT, MLP-Mixer, LiT, UViM, and many more models.
+
+- `T5X <https://github.com/google-research/t5x>`__  is a modular, composable,
+  research-friendly framework for high-performance, configurable, self-service
+  training, evaluation, and inference of sequence models (starting with
+  language) at many scales.
+
+Community Examples
+###################
+
+In addition to the curated list of official Flax examples, there is a growing
+community of people using Flax to build new types of machine learning models. We
+are happy to showcase any example built by the community here! If you want to
+submit your own example, we suggest that you start by forking one of the
+official Flax examples and working from there.
+
+Models
+******
+.. list-table::
+    :header-rows: 1
+
+    * - Link
+      - Author
+      - Task type
+      - Reference
+    * - `matthias-wright/flaxmodels <https://github.com/matthias-wright/flaxmodels>`__
+      - `@matthias-wright <https://github.com/matthias-wright>`__
+      - Various
+      - GPT-2, ResNet, StyleGAN-2, VGG, ...
+    * - `DarshanDeshpande/jax-models <https://github.com/DarshanDeshpande/jax-models>`__
+      - `@DarshanDeshpande <https://github.com/DarshanDeshpande>`__
+      - Various
+      - Segformer, Swin Transformer, ... also some stand-alone layers
+    * - `google/vision_transformer <https://github.com/google-research/vision_transformer>`__
+      - `@andsteing <https://github.com/andsteing>`__
+      - Image classification, image/text
+      - https://arxiv.org/abs/2010.11929, https://arxiv.org/abs/2105.01601, https://arxiv.org/abs/2111.07991, ...
+    * - `jax-resnet <https://github.com/n2cholas/jax-resnet>`__
+      - `@n2cholas <https://github.com/n2cholas>`__
+      - Various resnet implementations
+      - ``torch.hub``
+
+Examples
+********
+
+.. list-table::
+    :header-rows: 1
+
+    * - Link
+      - Author
+      - Task type
+      - Reference
+    * - `JAX-RL <https://github.com/henry-prior/jax-rl>`__
+      - `@henry-prior <https://github.com/henry-prior>`__
+      - Reinforcement learning
+      - N/A
+    * - `BigBird Fine-tuning <https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects/big_bird>`__
+      - `@vasudevgupta7 <https://github.com/vasudevgupta7>`__
+      - Question-Answering
+      - https://arxiv.org/abs/2007.14062
+    * - `DCGAN <https://github.com/bkkaggle/jax-dcgan>`__
+      - `@bkkaggle <https://github.com/bkkaggle>`__
+      - Image Synthesis
+      - https://arxiv.org/abs/1511.06434
+
+Tutorials
+*********
+
+.. currently left empty as a placeholder for tutorials
+.. list-table::
+    :header-rows: 1
+
+    * - Link
+      - Author
+      - Task type
+      - Reference
+    * -
+      -
+      -
+      -
+
+Contributing Policy
+********************
+
+If you are interested in adding a project to the Community Examples section, take the following 
+into consideration:
+
+* **Examples**: examples should contain a README that is helpful, clear, and makes it easy to run 
+  the code. The code itself should be easy to follow.
+* **Tutorials**: tutorials should preferably be runnable notebooks, be well written, and discuss 
+  an interesting topic. Also, the tutorial's content must be different from the existing
+  guides in the Flax documentation and other community examples to be considered for inclusion.
+* **Models**: repositories with models ported to Flax must provide at least one of the following:
+
+  * Metrics that are comparable to the original work when the model is trained to completion. Providing 
+    plots of the metric history during training is highly encouraged.
+  * Tests to verify numerical equivalence against a well known implementation (same inputs 
+    + weights = same outputs) preferably using pretrained weights.
+
+In all cases above, code should work with the latest stable version of packages like ``jax``, 
+``flax``, and ``optax``, and make substantial use of Flax.
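
Reviewer note (not part of the commit): the "same inputs + weights = same outputs" requirement in the Contributing Policy above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `reference_dense` and `ported_dense` are hypothetical stand-ins for a well-known implementation and its Flax port, and plain Python lists are used so the sketch runs with the standard library only. A real test would more likely load pretrained weights into both models and compare arrays.

```python
import math


def reference_dense(x, kernel, bias):
    """Hypothetical stand-in for the well-known implementation: y = x @ kernel + bias."""
    return [
        sum(x_i * kernel[i][j] for i, x_i in enumerate(x)) + bias[j]
        for j in range(len(bias))
    ]


def ported_dense(x, kernel, bias):
    """Hypothetical stand-in for the ported model: same math, different loop order."""
    out = list(bias)
    for i, x_i in enumerate(x):
        for j in range(len(out)):
            out[j] += x_i * kernel[i][j]
    return out


def outputs_match(expected, actual, tol=1e-6):
    """Return True when both outputs have the same length and agree within tol."""
    return len(expected) == len(actual) and all(
        math.isclose(e, a, rel_tol=tol, abs_tol=tol)
        for e, a in zip(expected, actual)
    )


# Same inputs and same weights are fed to both implementations.
x = [0.5, -1.0, 2.0]
kernel = [[0.1, 0.2], [0.3, -0.4], [-0.5, 0.6]]
bias = [0.01, -0.02]
assert outputs_match(reference_dense(x, kernel, bias), ported_dense(x, kernel, bias))
```

The comparison uses a tolerance rather than exact equality because two implementations rarely accumulate floating-point sums in the same order.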
diff --git a/examples/README.md b/examples/README.md
@@ -1,153 +1,13 @@
 # Flax Examples
 
-## Core examples
-
-The examples from this directory.
-
 Each example is designed to be **self-contained and easily forkable**, while
 reproducing relevant results in different areas of machine learning.
 
-As discussed in [#231], we decided to go for a standard pattern for all examples
-including the simplest ones (like MNIST). This makes every example a bit more
-verbose, but once you know one example, you know the structure of all of them.
-Having unit tests and integration tests is also very useful when you fork these
-examples.
-
-Some of the examples below have a link "🕹Interactive🕹" that lets you run them
-directly in Colab.
-
-Image classification
-
-- [MNIST](https://github.com/google/flax/tree/main/examples/mnist/) -
-  [🕹Interactive🕹](https://colab.research.google.com/github/google/flax/blob/main/examples/mnist/mnist.ipynb):
-  Convolutional neural network for MNIST classification (featuring simple
-  code).
-
-- [ImageNet](https://github.com/google/flax/tree/main/examples/imagenet/) -
-  [🕹Interactive🕹](https://colab.research.google.com/github/google/flax/blob/main/examples/imagenet/imagenet.ipynb):
-  Resnet-50 on ImageNet with weight decay (featuring multi host SPMD, custom
-  preprocessing, checkpointing, dynamic scaling, mixed precision).
-
-Reinforcement learning
-
-- [Proximal Policy Optimization](https://github.com/google/flax/tree/main/examples/ppo/):
-  Learning to play Atari games (featuring single host SPMD, RL setup).
-
-Natural language processing
-
--  [Sequence to sequence for number
-   addition](https://github.com/google/flax/tree/main/examples/seq2seq/):
-   (featuring simple code, LSTM state handling, on the fly data generation).
--  [Parts-of-speech
-   tagging](https://github.com/google/flax/tree/main/examples/nlp_seq/): Simple
-   transformer encoder model using the universal dependency dataset.
--  [Sentiment
-   classification](https://github.com/google/flax/tree/main/examples/sst2/):
-   with a LSTM model.
--  [Transformer encoder/decoder model trained on
-   WMT](https://github.com/google/flax/tree/main/examples/wmt/):
-   Translating English/German (featuring multihost SPMD, dynamic bucketing,
-   attention cache, packed sequences, recipe for TPU training on GCP).
--  [Transformer encoder trained on one billion word
-   benchmark](https://github.com/google/flax/tree/main/examples/lm1b/):
-   for autoregressive language modeling, based on the WMT example above.
-
-Generative models
-
--  [Variational
-   auto-encoder](https://github.com/google/flax/tree/main/examples/vae/):
-   Trained on binarized MNIST (featuring simple code, vmap).
-
-Graph modeling
-
-- [Graph Neural Networks](https://github.com/google/flax/tree/main/examples/ogbg_molpcba/):
-  Molecular predictions on ogbg-molpcba from the Open Graph Benchmark.
-
-[#231]: https://github.com/google/flax/issues/231
-
-## Repositories Using Flax
-
-The following code bases use Flax and provide training frameworks and a wealth
-of examples, in many cases with pre-trained weights:
-
-- [HuggingFace Transformers](https://github.com/huggingface/transformers) is a
-  very popular library for building, training, and deploying state of the art
-  machine learning models.
-  These models can be applied on text, images, and audio. After organizing the
-  [JAX/Flax community week](https://github.com/huggingface/transformers/blob/master/examples/research_projects/jax-projects/README.md),
-  they have now over 5,000
-  [Flax/JAX models](https://huggingface.co/models?library=jax&sort=downloads) in
-  their repository.
-
-- [Scenic](https://github.com/google-research/scenic) is a codebase/library
-  for computer vision research and beyond. Scenic's main focus is around
-  attention-based models. Scenic has been successfully used to develop
-  classification, segmentation, and detection models for multiple modalities
-  including images, video, audio, and multimodal combinations of them.
-
-- [Big Vision](https://github.com/google-research/big_vision/) is a codebase
-  designed for training large-scale vision models using Cloud TPU VMs or GPU
-  machines. It is based on Jax/Flax libraries, and uses tf.data and TensorFlow
-  Datasets for scalable and reproducible input pipelines. This is the original
-  codebase of ViT, MLP-Mixer, LiT, UViM, and many more models.
-
-- [T5X](https://github.com/google-research/t5x)  is a modular, composable,
-  research-friendly framework for high-performance, configurable, self-service
-  training, evaluation, and inference of sequence models (starting with
-  language) at many scales.
-
-
-
-## Community Examples
-
-In addition to the curated list of official Flax examples, there is a growing
-community of people using Flax to build new types of machine learning models. We
-are happy to showcase any example built by the community here! If you want to
-submit your own example, we suggest that you start by forking one of the
-official Flax example, and start from there.
-
-|             Link              |       Author        |             Task type             |                               Reference                               |
-| ----------------------------- | ------------------- | --------------------------------- | --------------------------------------------------------------------- |
-| [matthias-wright/flaxmodels]  | [@matthias-wright]  | Various                           | GPT-2, ResNet, StyleGAN-2, VGG, ...                                   |
-| [DarshanDeshpande/jax-models] | [@DarshanDeshpande] | Various                           | Segformer, Swin Transformer, ... also some stand-alone layers         |
-| [google/vision_transformer]   | [@andsteing]        | Image classification, image/text  | https://arxiv.org/abs/2010.11929, https://arxiv.org/abs/2105.01601, https://arxiv.org/abs/2111.07991, ... |
-| [JAX-RL]                      | [@henry-prior]      | Reinforcement learning            | N/A                                                                   |
-| [DCGAN] Colab                 | [@bkkaggle]         | Image Synthesis                   | https://arxiv.org/abs/1511.06434                                      |
-| [BigBird Fine-tuning]         | [@vasudevgupta7]    | Question-Answering                | https://arxiv.org/abs/2007.14062                                      |
-| [jax-resnet]                  | [@n2cholas]         | Various resnet implementations    | `torch.hub`                                                           |
-
-[matthias-wright/flaxmodels]: https://github.com/matthias-wright/flaxmodels
-[DarshanDeshpande/jax-models]: https://github.com/DarshanDeshpande/jax-models
-[google/vision_transformer]: https://github.com/google-research/vision_transformer
-[JAX-RL]: https://github.com/henry-prior/jax-rl
-[DCGAN]: https://github.com/bkkaggle/jax-dcgan
-[BigBird Fine-tuning]: https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects/big_bird
-[jax-resnet]: https://github.com/n2cholas/jax-resnet
-[@matthias-wright]: https://github.com/matthias-wright
-[@DarshanDeshpande]: https://github.com/DarshanDeshpande
-[@andsteing]: https://github.com/andsteing
-[@henry-prior]: https://github.com/henry-prior
-[@bkkaggle]: https://github.com/bkkaggle
-[@vasudevgupta7]: https://github.com/vasudevgupta7
-[@n2cholas]: https://github.com/n2cholas
-
-## Anatomy of a Flax Example
-
-Most of our examples in this directory follow a structure that we found to work
-well with Flax projects, and we strive to make the examples easy to explore and
-easy to fork. In particular (taken from [#231])
+As discussed in [#231](https://github.com/google/flax/issues/231), we decided 
+to go for a standard pattern for all examples including the simplest ones 
+(like MNIST). This makes every example a bit more verbose, but once you know 
+one example, you know the structure of all of them. Having unit tests and 
+integration tests is also very useful when you fork these examples.
 
-- README: contains links to paper, command line, [TensorBoard] metrics
-- Focus: an example is about a single model/dataset
-- Configs: we use `ml_collections.ConfigDict` stored under `configs/`
-- Tests: executable `main.py` loads `train.py` which has `train_test.py`
-- Data: is read from [TensorFlow Datasets]
-- Standalone: every directory is self-conained
-- Requirements: are pinned in `requirements.txt`
-- Boilerplate: is reduced by using [`clu`]
-- Interactive: the example can be explored with a [Colab]
+For more examples, including contributions from the community and other projects currently using Flax, see the **[Examples](https://flax.readthedocs.io/en/latest/examples/index.html)** section in the documentation.
 
-[#231]: https://github.com/google/flax/issues/231
-[TensorBoard]: https://tensorboard.dev/
-[`clu`]: https://pypi.org/project/clu/
-[Colab]: https://colab.research.google.com/
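
Reviewer note (not part of the commit): the example layout that this change moves into `docs/examples.rst` (a config under `configs/`, a `train.py` loaded by `main.py`, and a `train_test.py` next to it) could be sketched roughly as follows. This is a single-file illustration under stated assumptions: the file names in the comments, the `Config` fields, and the function names are hypothetical, and a standard-library dataclass stands in for `ml_collections.ConfigDict` so the sketch needs no extra packages.

```python
import dataclasses


# --- configs/default.py (hypothetical file name) ---
@dataclasses.dataclass
class Config:
    """Stand-in for ml_collections.ConfigDict (assumption, stdlib only)."""
    learning_rate: float = 0.1
    num_steps: int = 50


def get_config():
    """Return the default hyperparameters for this example."""
    return Config()


# --- train.py (hypothetical contents) ---
def train_and_evaluate(config):
    """Toy training loop: gradient descent on f(w) = (w - 3)^2.

    Returns the final loss so that callers and tests can check convergence.
    """
    w = 0.0
    for _ in range(config.num_steps):
        grad = 2.0 * (w - 3.0)
        w -= config.learning_rate * grad
    return (w - 3.0) ** 2


# --- train_test.py (hypothetical contents) ---
def test_train_and_evaluate():
    """Smoke test: a short run with an overridden config should converge."""
    config = get_config()
    config.num_steps = 100
    assert train_and_evaluate(config) < 1e-6
```

Keeping the training logic in an importable function, separate from the executable entry point, is what lets the test call it directly with a small config.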