Merge pull request adaptive-intelligent-robotics#126 from adaptive-intelligent-robotics/develop

Version 0.2.0 of QDax
Hasasasaki · Nov 30, 2022 · fc0ad78
2 parents 2fb5619 + cfaf3c8
commit fc0ad78
Showing 106 changed files with 10,125 additions and 778 deletions.
diff --git a/.dockerignore b/.dockerignore
@@ -168,5 +168,4 @@ cython_debug/
 .gitignore
 Makefile
 mypy.ini
-README.md
 dev.Dockerfile
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -0,0 +1,21 @@
+Related issues: [refer to issues]
+
+[Introduce the overall change made in the PR and list the associated modifications below]
+
+This PR introduces:
+- [modification 1]
+- [modification 2]
+
+
+## Checks
+
+- [ ] a clear description of the PR has been added
+- [ ] sufficient tests have been written
+- [ ] relevant section added to the documentation
+- [ ] example notebook added to the repo
+- [ ] clean docstrings and comments have been written
+- [ ] if any issue/observation has been discovered, a new issue has been opened
+
+## Future improvements
+
+[List here potential observations made and/or improvements that could be made in the future. If relevant, open issues for those.]
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -10,7 +10,7 @@ repos:
     - id: black
       language_version: python3.8
       args: ["--target-version", "py38"]
--   repo: https://gitlab.com/pycqa/flake8
+-   repo: https://github.com/PyCQA/flake8
     rev: 3.8.4
     hooks:
     - id: flake8
@@ -24,7 +24,7 @@ repos:
     rev: 0.3.9
     hooks:
       - id: nbstripout
-        args: ["notebooks/"]
+        args: ["examples/"]
 -   repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v4.0.1
     hooks:

diff --git a/README.md b/README.md
@@ -1,13 +1,18 @@
+<div align="center">
+<img src="docs/img/qdax_logo.png" alt="qdax_logo" width="140"></img>
+</div>
+
+# QDax: Accelerated Quality-Diversity
+
 [![Documentation Status](https://readthedocs.org/projects/qdax/badge/?version=latest)](https://qdax.readthedocs.io/en/latest/?badge=latest)
 ![pytest](https://github.com/adaptive-intelligent-robotics/QDax/actions/workflows/ci.yaml/badge.svg?branch=main)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/adaptive-intelligent-robotics/QDax/blob/main/LICENSE)
 [![codecov](https://codecov.io/gh/adaptive-intelligent-robotics/QDax/branch/feat/add-codecov/graph/badge.svg)](https://codecov.io/gh/adaptive-intelligent-robotics/QDax)
 
 
-# QDax: Accelerated Quality-Diversity
 QDax is a tool to accelerate Quality-Diversity (QD) and neuro-evolution algorithms through hardware accelerators and massive parallelization. QD algorithms usually take days/weeks to run on large CPU clusters. With QDax, QD algorithms can now be run in minutes! ⏩ ⏩ 🕛
 
-QDax has been developed as a research framework: it is flexible and easy to extend and build on and can be used for any problem setting. Get started with simple example and run a QD algorithm in minutes here! [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/mapelites_example.ipynb)
+QDax has been developed as a research framework: it is flexible and easy to extend and build on and can be used for any problem setting. Get started with a simple example and run a QD algorithm in minutes here! [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/mapelites.ipynb)
 
 - QDax [paper](https://arxiv.org/abs/2202.01258)
 - QDax [documentation](https://qdax.readthedocs.io/en/latest/)
@@ -27,18 +32,75 @@ Installing QDax via ```pip``` installs a CPU-only version of JAX by default. To
 However, we also provide and recommend using either Docker, Singularity or conda environments to use the repository which by default provides GPU support. Detailed steps to do so are available in the [documentation](https://qdax.readthedocs.io/en/latest/installation/).
 
 ## Basic API Usage
-For a full and interactive example to see how QDax works, we recommend starting with the tutorial-style [Colab notebook](./notebooks/mapelites_example.ipynb). It is an example of the MAP-Elites algorithm used to evolve a population of controllers on a chosen Brax environment (Walker by default).
+For a full and interactive example to see how QDax works, we recommend starting with the tutorial-style [Colab notebook](./examples/mapelites.ipynb). It is an example of the MAP-Elites algorithm used to evolve a population of controllers on a chosen Brax environment (Walker by default).
 
 However, a summary of the main API usage is provided below:
 ```python
-import qdax
+import jax
+import functools
 from qdax.core.map_elites import MAPElites
+from qdax.core.containers.mapelites_repertoire import compute_euclidean_centroids
+from qdax.tasks.arm import arm_scoring_function
+from qdax.core.emitters.mutation_operators import isoline_variation
+from qdax.core.emitters.standard_emitters import MixingEmitter
+from qdax.utils.metrics import default_qd_metrics
+
+seed = 42
+num_param_dimensions = 100  # num DoF arm
+init_batch_size = 100
+batch_size = 1024
+num_iterations = 50
+grid_shape = (100, 100)
+min_param = 0.0
+max_param = 1.0
+min_bd = 0.0
+max_bd = 1.0
+
+# Init a random key
+random_key = jax.random.PRNGKey(seed)
+
+# Init population of controllers
+random_key, subkey = jax.random.split(random_key)
+init_variables = jax.random.uniform(
+    subkey,
+    shape=(init_batch_size, num_param_dimensions),
+    minval=min_param,
+    maxval=max_param,
+)
+
+# Define emitter
+variation_fn = functools.partial(
+    isoline_variation,
+    iso_sigma=0.05,
+    line_sigma=0.1,
+    minval=min_param,
+    maxval=max_param,
+)
+mixing_emitter = MixingEmitter(
+    mutation_fn=lambda x, y: (x, y),
+    variation_fn=variation_fn,
+    variation_percentage=1.0,
+    batch_size=batch_size,
+)
+
+# Define a metrics function
+metrics_fn = functools.partial(
+    default_qd_metrics,
+    qd_offset=0.0,
+)
 
 # Instantiate MAP-Elites
 map_elites = MAPElites(
-    scoring_function=scoring_fn,
+    scoring_function=arm_scoring_function,
     emitter=mixing_emitter,
-    metrics_function=metrics_function,
+    metrics_function=metrics_fn,
+)
+
+# Compute the centroids
+centroids = compute_euclidean_centroids(
+    grid_shape=grid_shape,
+    minval=min_bd,
+    maxval=max_bd,
 )
 
 # Initializes repertoire and emitter state
@@ -62,24 +124,32 @@ QDax currently supports the following algorithms:
 
 | Algorithm  | Example |
 | --- | --- |
-| [MAP-Elites](https://arxiv.org/abs/1504.04909) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/mapelites_example.ipynb) |
-| [CVT MAP-Elites](https://arxiv.org/abs/1610.05729) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/mapelites_example.ipynb) |
-| [Policy Gradient Assisted MAP-Elites (PGA-ME)](https://hal.archives-ouvertes.fr/hal-03135723v2/file/PGA_MAP_Elites_GECCO.pdf) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/pgame_example.ipynb) |
-| [OMG-MEGA](https://arxiv.org/abs/2106.03894) |  [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/omgmega_example.ipynb) |
-| [CMA-MEGA](https://arxiv.org/abs/2106.03894) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/cmamega_example.ipynb) |
-| [Multi-Objective Quality-Diversity (MOME)](https://arxiv.org/abs/2202.03057) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/mome_example.ipynb) |
+| [MAP-Elites](https://arxiv.org/abs/1504.04909) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/mapelites.ipynb) |
+| [CVT MAP-Elites](https://arxiv.org/abs/1610.05729) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/mapelites.ipynb) |
+| [Policy Gradient Assisted MAP-Elites (PGA-ME)](https://hal.archives-ouvertes.fr/hal-03135723v2/file/PGA_MAP_Elites_GECCO.pdf) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/pgame.ipynb) |
+| [QDPG](https://arxiv.org/abs/2006.08505) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/qdpg.ipynb) |
+| [CMA-ME](https://arxiv.org/pdf/1912.02400.pdf) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/cmame.ipynb) |
+| [OMG-MEGA](https://arxiv.org/abs/2106.03894) |  [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/omgmega.ipynb) |
+| [CMA-MEGA](https://arxiv.org/abs/2106.03894) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/cmamega.ipynb) |
+| [Multi-Objective MAP-Elites (MOME)](https://arxiv.org/abs/2202.03057) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/mome.ipynb) |
+| [MAP-Elites Evolution Strategies (MEES)](https://dl.acm.org/doi/pdf/10.1145/3377930.3390217) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/mees.ipynb) |
 
 
 ## QDax baseline algorithms
 The QDax library also provides implementations for some useful baseline algorithms:
 
 | Algorithm  | Example |
 | --- | --- |
-| [DIAYN](https://arxiv.org/abs/1802.06070) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/diayn_example.ipynb) |
-| [DADS](https://arxiv.org/abs/1907.01657) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/dads_example.ipynb) |
-| [SMERL](https://arxiv.org/abs/2010.14484) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/smerl_example.ipynb) |
-| [NSGA2](https://ieeexplore.ieee.org/document/996017) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/nsga2_spea2_example.ipynb) |
-| [SPEA2](https://www.semanticscholar.org/paper/SPEA2%3A-Improving-the-strength-pareto-evolutionary-Zitzler-Laumanns/b13724cb54ae4171916f3f969d304b9e9752a57f) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/notebooks/nsga2_spea2_example.ipynb) |
+| [DIAYN](https://arxiv.org/abs/1802.06070) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/diayn.ipynb) |
+| [DADS](https://arxiv.org/abs/1907.01657) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/dads.ipynb) |
+| [SMERL](https://arxiv.org/abs/2010.14484) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/smerl.ipynb) |
+| [NSGA2](https://ieeexplore.ieee.org/document/996017) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/nsga2_spea2.ipynb) |
+| [SPEA2](https://www.semanticscholar.org/paper/SPEA2%3A-Improving-the-strength-pareto-evolutionary-Zitzler-Laumanns/b13724cb54ae4171916f3f969d304b9e9752a57f) | [![Open All Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adaptive-intelligent-robotics/QDax/blob/main/examples/nsga2_spea2.ipynb) |
+
+## QDax Tasks
+The QDax library also provides implementations of several standard Quality-Diversity tasks.
+
+All these implementations and their descriptions are provided in the [tasks directory](./qdax/tasks).
 
 ## Contributing
 Issues and contributions are welcome. Please refer to the [contribution guide](https://qdax.readthedocs.io/en/latest/guides/CONTRIBUTING/) in the documentation for more details.
@@ -103,8 +173,11 @@ If you use QDax in your research and want to cite it in your work, please use:
 
 QDax was developed and is maintained by the [Adaptive & Intelligent Robotics Lab (AIRL)](https://www.imperial.ac.uk/adaptive-intelligent-robotics/) and [InstaDeep](https://www.instadeep.com/).
 
-<img align="center" src="docs/images/AIRL_logo.png" alt="AIRL_Logo" width="220"/> <img align="center" src="docs/images/instadeep_logo.png" alt="InstaDeep_Logo" width="220"/>
+<div align="center">
+<img align="center" src="docs/img/AIRL_logo.png" alt="AIRL_Logo" width="220"/> <img align="center" src="docs/img/instadeep_logo.png" alt="InstaDeep_Logo" width="220"/>
+</div>
 
+<div align="center">
 <a href="https://github.com/limbryan" title="Bryan Lim"><img src="https://github.com/limbryan.png" height="auto" width="50" style="border-radius:50%"></a>
 <a href="https://github.com/maxiallard" title="Maxime Allard"><img src="https://github.com/maxiallard.png" height="auto" width="50" style="border-radius:50%"></a>
 <a href="https://github.com/Lookatator" title="Luca Grilloti"><img src="https://github.com/Lookatator.png" height="auto" width="50" style="border-radius:50%"></a>
@@ -117,3 +190,4 @@ QDax was developed and is maintained by the [Adaptive & Intelligent Robotics Lab
 <a href="https://github.com/GRichard513" title="Guillaume Richard"><img src="https://github.com/GRichard513.png" height="auto" width="50" style="border-radius:50%"></a>
 <a href="https://github.com/flajolet" title="Arthur Flajolet"><img src="https://github.com/flajolet.png" height="auto" width="50" style="border-radius:50%"></a>
 <a href="https://github.com/remidebette" title="Rémi Debette"><img src="https://github.com/remidebette.png" height="auto" width="50" style="border-radius:50%"></a>
+</div>
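
Review note on the README example above: the new snippet calls `compute_euclidean_centroids(grid_shape=..., minval=..., maxval=...)` without saying what the centroids look like. The sketch below is a plain-Python illustration, assuming the centroids are the cell centres of a regular grid over `[minval, maxval]` in each dimension (an assumption this diff does not confirm). `grid_centroids` is a hypothetical helper written for this note, not part of the QDax API.

```python
from itertools import product
from typing import List, Sequence, Tuple


def grid_centroids(
    grid_shape: Sequence[int], minval: float, maxval: float
) -> List[Tuple[float, ...]]:
    """Return the centre of every cell of a uniform grid (illustrative only)."""
    axes = []
    for num_cells in grid_shape:
        # Width of one cell along this dimension.
        width = (maxval - minval) / num_cells
        # Cell centres sit half a cell width inside each cell.
        axes.append([minval + (i + 0.5) * width for i in range(num_cells)])
    # Cartesian product of the per-dimension centres gives one centroid per cell.
    return list(product(*axes))


centroids = grid_centroids(grid_shape=(2, 2), minval=0.0, maxval=1.0)
print(centroids)
# [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]
```

Under the same assumption, the README's `grid_shape = (100, 100)` would give 10,000 centroids, one per cell of the repertoire.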
diff --git a/dev.Dockerfile b/dev.Dockerfile
@@ -93,6 +93,7 @@ FROM cuda-image as run-image
 
 COPY qdax qdax
 COPY setup.py ./
+COPY README.md ./
 
 RUN pip install .
 

diff --git a/docs/api_documentation/core/cmame.md b/docs/api_documentation/core/cmame.md
@@ -0,0 +1,13 @@
+# Covariance Matrix Adaptation MAP Elites (CMAME)
+
+To create an instance of CMAME, one needs to use an instance of [MAP-Elites](map_elites.md) with the desired CMA emitter (optimizing, random direction, or improvement), detailed below. To use a pool of emitters, use the CMAPoolEmitter.
+
+Three emitter types:
+
+::: qdax.core.emitters.cma_emitter.CMAEmitter
+::: qdax.core.emitters.cma_rnd_emitter.CMARndEmitter
+::: qdax.core.emitters.cma_opt_emitter.CMAOptimizingEmitter
+
+Pool of homogeneous emitters:
+
+::: qdax.core.emitters.cma_pool_emitter.CMAPoolEmitter
diff --git a/docs/api_documentation/core/map_elites.md b/docs/api_documentation/core/map_elites.md
@@ -5,3 +5,7 @@ This class implement the base mechanism of MAP-Elites. It must be used with an e
 The MAP-Elites class can be used with other emitters to create variants, like [PGAME](pgame.md), [CMA-MEGA](cma_mega.md) and [OMG-MEGA](omg_mega.md).
 
 ::: qdax.core.map_elites.MAPElites
+
+We also provide a class to have MAP-Elites efficiently distributed over several devices.
+
+::: qdax.core.distributed_map_elites.DistributedMAPElites
diff --git a/docs/api_documentation/core/mees.md b/docs/api_documentation/core/mees.md
@@ -0,0 +1,5 @@
+# MAP Elites with Evolution Strategies (ME-ES)
+
+To create an instance of ME-ES, one needs to use an instance of [MAP-Elites](map_elites.md) with the MEESEmitter, detailed below.
+
+::: qdax.core.emitters.mees_emitter.MEESEmitter
diff --git a/docs/api_documentation/core/qdpg.md b/docs/api_documentation/core/qdpg.md
@@ -0,0 +1,5 @@
+# Quality Diversity Policy Gradient (QDPG)
+
+To create an instance of QDPG, one needs to use an instance of [MAP-Elites](map_elites.md) with the QDPGEmitter, detailed below.
+
+::: qdax.core.emitters.qdpg_emitter.QDPGEmitter
diff --git a/docs/caveats.md b/docs/caveats.md
@@ -0,0 +1,11 @@
+# QDax Caveats
+
+Here are a few caveats one should be aware of when using QDax.
+
+## Use of auto reset for Brax environments
+The use of `auto_reset` can be tricky and lead to problems or unwanted behaviors. By default, our examples set `auto_reset` to True, so that the samples collected in the replay buffer are of good quality and stay within the distribution of interest. This is particularly important for PGAME, where setting `auto_reset` to False could lead to a significant decrease in data efficiency and final performance.
+
+## In-place replacement of state descriptors in QDTransition
+The state descriptor from Brax environments is stored in a dictionary. Retrieving this data when building the QDTransition in a step is hence tricky. The state descriptor must be stored in a variable before applying an environment step, because an in-place replacement occurs during the Brax step function.
+
+One should take inspiration from the `play_step` function from our examples.
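
Review note on the second caveat: the in-place replacement is easy to miss, so here is a minimal sketch of the failure mode and the fix. It uses a toy stand-in for an environment, not Brax itself. `ToyState`, `toy_step`, `play_step_wrong` and `play_step_right` are hypothetical names invented for this note, and the only assumption carried over from the caveat is that the step function overwrites the descriptor inside a dictionary shared with the previous state.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ToyState:
    position: float
    info: Dict[str, float] = field(default_factory=dict)


def toy_step(state: ToyState, action: float) -> ToyState:
    """Toy step: overwrites the descriptor in the shared info dict, in place."""
    new_position = state.position + action
    state.info["state_descriptor"] = new_position  # in-place replacement
    return ToyState(position=new_position, info=state.info)


def play_step_wrong(state: ToyState, action: float) -> Tuple[float, float]:
    # Reads the "current" descriptor only after stepping: it is already overwritten.
    next_state = toy_step(state, action)
    return state.info["state_descriptor"], next_state.info["state_descriptor"]


def play_step_right(state: ToyState, action: float) -> Tuple[float, float]:
    # Store the descriptor in a variable before the step, as the caveat advises.
    state_desc = state.info["state_descriptor"]
    next_state = toy_step(state, action)
    return state_desc, next_state.info["state_descriptor"]


print(play_step_wrong(ToyState(0.0, {"state_descriptor": 0.0}), 1.0))  # (1.0, 1.0)
print(play_step_right(ToyState(0.0, {"state_descriptor": 0.0}), 1.0))  # (0.0, 1.0)
```

In the wrong version the transition would record the same descriptor for the current and next state, which is the symptom to look for when a custom `play_step` departs from the one in the examples.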
diff --git a/docs/examples b/docs/examples
@@ -0,0 +1 @@
+../examples/
diff --git a/docs/images/AIRL_logo.png → docs/img/AIRL_logo.png b/docs/images/AIRL_logo.png → docs/img/AIRL_logo.png
diff --git a/docs/img/favicon.ico b/docs/img/favicon.ico
diff --git a/docs/images/instadeep_logo.png → docs/img/instadeep_logo.png b/docs/images/instadeep_logo.png → docs/img/instadeep_logo.png
diff --git a/docs/img/qdax_logo.png b/docs/img/qdax_logo.png
diff --git a/docs/notebooks b/docs/notebooks
diff --git a/docs/requirements.txt b/docs/requirements.txt
@@ -4,5 +4,8 @@ mkdocs==1.2.3
 mkdocs-autorefs==0.3.1
 mkdocs-git-revision-date-plugin==0.3.1
 mkdocs-material==8.2.3
-mkdocstrings==0.18.1
+mkdocs-material-extensions==1.1.1
+mkdocstrings==0.19
+mkdocstrings-python-legacy==0.2.3
 mknotebooks==0.7.1
+pytkdocs==0.16.1