
Merge over from main to release, to create v2.0.1 #113


Merged: 135 commits, Jun 4, 2025
1dee123
generalized rate-cell a bit
ago109 Jul 22, 2024
e72d75e
Merge branch 'main' of github.com:NACLab/ngc-learn
ago109 Jul 22, 2024
2189f51
touched up rate-cell further
ago109 Jul 23, 2024
23eb7ed
minor mod to lif
ago109 Jul 23, 2024
1ddd86d
updated lif-cell to use units/tags and minor cleanup and edits
ago109 Jul 23, 2024
e240644
Monitor plot (#66)
willgebhardt Jul 23, 2024
086bd4d
added meta-data to rate-cell, input encoders, adex
ago109 Jul 23, 2024
19bf502
fixed minor saving/loading in rate-cell w/ vectorized compartments
ago109 Jul 23, 2024
4e298db
Added auto resolving for monitors (#67)
willgebhardt Jul 23, 2024
22e0d3a
fixed surr arg in lif-cell
ago109 Jul 24, 2024
42167cb
Merge branch 'main' of github.com:NACLab/ngc-learn
ago109 Jul 24, 2024
05ea912
modded bernoulli-cell to include max-frequency constraint
ago109 Jul 24, 2024
c19d15e
added warning check to bernoulli, some cleanup
ago109 Jul 24, 2024
23a54f6
integrated if-cell, cleaned up lif and inits
ago109 Jul 24, 2024
27a61ef
mod to latency-cell
ago109 Jul 24, 2024
05a97f0
updated the poissonCell to be a true poisson
willgebhardt Jul 25, 2024
cdea291
Merge branch 'dynamics' of github.com:NACLab/ngc-learn into dynamics
willgebhardt Jul 25, 2024
efa61a5
fixed minor bug in deprecation for poiss/bern
ago109 Jul 25, 2024
223d3c0
fixed minor bug in deprecation for poiss/bern
ago109 Jul 25, 2024
9afaadf
fixed validation fun in bern/poiss
ago109 Jul 25, 2024
bf72094
moved back and cleaned up bernoulli and poisson cells
ago109 Jul 25, 2024
c894b8a
added threshold-clipping to latency cell
ago109 Jul 25, 2024
ba08453
updates to if/lif
ago109 Jul 26, 2024
9c932b1
added batch-size arg to slif
ago109 Jul 26, 2024
03940e9
fixed minor load bug in lif-cell
ago109 Jul 27, 2024
6bc5cd8
fixed a blocking jit-partial call in lif update_theta method; when lo…
Jul 27, 2024
f4c03a1
minor edit to dim-reduce
Jul 28, 2024
8d74157
Patched synapses added (#68)
Faezehabibi Aug 1, 2024
8d5bbd1
updated monitor plot code
willgebhardt Aug 6, 2024
97c4d92
update to dim-reduce
ago109 Aug 6, 2024
bf06510
update to dim-reduce with merge
ago109 Aug 6, 2024
77f347f
integrated phasor-cell, minor cleanup of latency
ago109 Aug 6, 2024
714a58c
tweak to adex thr arg
ago109 Aug 7, 2024
6ec2e7a
tweak to adex thr arg
ago109 Aug 7, 2024
fb8524a
integrated resonate-and-fire neuronal cell
ago109 Aug 8, 2024
dd49e5f
mod to raf-cell
ago109 Aug 8, 2024
8882208
cleaned up raf
ago109 Aug 8, 2024
ee50f33
cleaned up raf
ago109 Aug 8, 2024
611e5b3
cleaned up raf-cell
ago109 Aug 9, 2024
94f37f7
cleaned up raf-cell
ago109 Aug 9, 2024
73e5aa1
cleaned up raf-cell
ago109 Aug 9, 2024
6408ee0
minor tweak to dim-reduce in utils
Aug 11, 2024
da439bd
Fix typo in pcn_discrim.md (#69)
sonnygeorge Oct 7, 2024
7510bb3
model_utils and rate cell (#70)
Faezehabibi Oct 16, 2024
889d230
Fix/reorganize feature library (#74)
Faezehabibi Oct 25, 2024
7338c34
Update model_utils.py (#78)
Faezehabibi Nov 19, 2024
35eae76
Additions for inhibition stuff
willgebhardt Nov 19, 2024
94f1697
add sindy documentation for exhibits (#81)
Faezehabibi Dec 2, 2024
de53d20
Update ode_utils.py (#79)
Faezehabibi Dec 2, 2024
23473ab
Add patched synapse (#80)
Faezehabibi Dec 2, 2024
2295ba5
Update __init__.py (#83)
Faezehabibi Dec 6, 2024
eeb057a
Add l1 decay term to update calculation (#84)
Faezehabibi Dec 9, 2024
cf53968
feat NGC module regression (#86)
Faezehabibi Dec 9, 2024
c49daea
Update odes.py
Faezehabibi Dec 10, 2024
d5def75
Update odes.py (#87)
Faezehabibi Dec 10, 2024
21a8af0
Update odes.py
Faezehabibi Dec 10, 2024
537c29d
Update __init__.py
Faezehabibi Dec 10, 2024
34278ed
Update __init__.py
Faezehabibi Dec 10, 2024
0782fa6
Merge pull request #89 from Faezehabibi/fix-typo
rxng8 Dec 10, 2024
8b2730a
Merge branch 'NACLab:main' into refactor-odes
Faezehabibi Dec 10, 2024
9a3ce0e
Merge pull request #88 from Faezehabibi/refactor-odes
rxng8 Dec 10, 2024
da2f24e
Add attribute 'lr' (#90)
Faezehabibi Dec 16, 2024
796178d
commit probes/mods to utils to analysis_tools branch
Mar 1, 2025
84237ff
commit probes/mods to utils to analysis_tools branch
Mar 1, 2025
9d7acbb
update documentation
rxng8 Mar 1, 2025
247de74
cleaned up probes/docs for probes
Mar 1, 2025
d0df86e
change heads_dim to attn_dim, and modify the mlp to be as similar as …
rxng8 Mar 1, 2025
8a36e40
in layer normalization or any other Gaussian, standardeviation can ne…
rxng8 Mar 1, 2025
f402d98
update attentive probe code
rxng8 Mar 1, 2025
2a71b7f
minor tweak to attentive prob code comments
Mar 3, 2025
b688c6c
cleaned up probe parent fit routine
Mar 3, 2025
9ad4ae2
cleaned up probe parent fit routine
Mar 3, 2025
3a2de99
cleaned up probe parent fit routine
Mar 3, 2025
155d830
cleaned up probe parent fit routine
Mar 3, 2025
099c588
minor edits to attn probe
Mar 5, 2025
aeabf61
update attentive probe with input layer norm
rxng8 Mar 5, 2025
8682954
update input layer normalization
rxng8 Mar 6, 2025
dc8c127
update code to fix nan bug
rxng8 Mar 6, 2025
27fd9bf
minor tweak to attn probe
Mar 6, 2025
84005b5
cleaned up probes
Mar 6, 2025
2feeced
cleaned up probes
Mar 6, 2025
56f006c
cleaned up probes
Mar 6, 2025
1b7bff8
cleaned up probes
Mar 6, 2025
f38373f
generalized dropout in terms of shape
Mar 6, 2025
012395b
tweak to atten probe
Mar 6, 2025
53ed773
tweak to atten probe
Mar 6, 2025
1fbbf93
added silu/swish/elu to model_utils
Mar 6, 2025
23e8c84
cleaned up model_utils
Mar 6, 2025
695e9d8
fix bug in attention probe dropout, fix bug in None noise_key passed …
rxng8 Mar 7, 2025
04e1343
hyperparameter tunning arguments added
rxng8 Mar 10, 2025
b3418df
Merge branch 'main' into analysis_tools
rxng8 Mar 11, 2025
ffd8f0e
Merging over Dynamics feature branch to main (#92)
ago109 Mar 12, 2025
2d0452a
Merge branch 'main' into analysis_tools
ago109 Mar 12, 2025
7bfd8ac
remove unused local variables
rxng8 Mar 12, 2025
27ae7e2
update note
rxng8 Mar 12, 2025
92633f9
update model utils
rxng8 Mar 13, 2025
08b4d12
remove notes
rxng8 Mar 13, 2025
8f75b0d
Merge pull request #93 from NACLab/analysis_tools
rxng8 Mar 13, 2025
5664c64
Update ode utils (#94)
Faezehabibi Mar 13, 2025
36e8152
minor fix to header in diffeq
Mar 13, 2025
534ab67
Update files with ode_solver (#95)
Faezehabibi Mar 13, 2025
6e8261e
revised/cleaned up sindy tutorial doc/imgs
Mar 13, 2025
1d15f1f
add prior for hebbian patched synapse (#96)
Faezehabibi Mar 14, 2025
9de3c98
cleaned up doc-strings in odes.py to comply w/ ngc-learn format
Mar 17, 2025
0d720e1
minor tweak to sig-figs printing in probe utils
Mar 17, 2025
b9227f0
add-sigma-to-gaussianErrorCell (#97)
Faezehabibi Mar 20, 2025
4af85dc
cleaned up ode_utils, cleaned up gaussian/laplacian cell
Mar 20, 2025
7f3e7c8
Update gaussianErrorCell.py (#98)
Faezehabibi Mar 21, 2025
e055d95
cleaned up gauss/laplace error cells
Mar 21, 2025
b0b496a
integrated bernoulli err-cell
Mar 21, 2025
bbea397
Major release update merge to main (in prep for 2.0.0 release on rele…
ago109 Apr 12, 2025
54ec2dd
Major release update (to 2.0.0) (#100)
ago109 Apr 12, 2025
86b7189
Major release update merge to main (sync up) (#101)
ago109 Apr 12, 2025
20c81d2
update test cases
rxng8 Apr 12, 2025
bbfe622
added hh-plot for hh tutorial
Apr 12, 2025
214a6d3
tweak to img folder for sindy
Apr 12, 2025
0b07fff
Merge branch 'release' into main
ago109 Apr 12, 2025
fb5ddc4
update to sindy tutorial to adhere to readthedocs formatting
Apr 13, 2025
d5104ec
Merge branch 'main' of github.com:NACLab/ngc-learn
Apr 13, 2025
c3570de
Minor mods in release sync'd up back to main (#106)
ago109 Apr 13, 2025
061e713
minor edit to h-h text in modeling api doc
Apr 13, 2025
9fd17f1
Update jaxProcess.py
willgebhardt Apr 18, 2025
49bccb5
update to mstdp-et and var-trace
Apr 23, 2025
5396eb3
Merge branch 'main' of github.com:NACLab/ngc-learn
Apr 23, 2025
0182340
minor tweaks + init of rl-snn exhibit lesson
Apr 25, 2025
8f5a650
Dynamic synapses and updates to lessons (including operant conditioni…
ago109 Apr 29, 2025
1fe0d55
fixed exp-syn pytest
Apr 29, 2025
087d32a
Add tutorial diagram for GaussianErrorCells (#107)
Faezehabibi Apr 29, 2025
4d0cdff
Refactor patched synapse (#110)
Faezehabibi May 2, 2025
0419997
Add vis_mode to generate_patch_set (#111)
Faezehabibi May 6, 2025
276fbbb
cleaned up trace
May 22, 2025
a09a3c4
edit to requirements
May 24, 2025
91789a5
rename variables for masking (#112)
Faezehabibi May 29, 2025
cf7ee73
minor cleanup/patches to rate-cell/lif/hebb-syn/trace-stdp and dim_re…
Jun 4, 2025
2d8c6e4
Merge branch 'main' of github.com:NACLab/ngc-learn
Jun 4, 2025
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -122,7 +122,7 @@ $ python install -e .
</pre>

**Version:**<br>
2.0.0 <!--1.2.3-Beta--> <!-- -Alpha -->
2.0.1 <!--1.2.3-Beta--> <!-- -Alpha -->

Author:
Alexander G. Ororbia II<br>
Binary file added docs/images/museum/rat_accuracy.jpg
Binary file added docs/images/museum/rat_rewards.jpg
Binary file added docs/images/museum/ratmaze.png
Binary file added docs/images/museum/real_ratmaze.jpg
Binary file added docs/images/tutorials/neurocog/GEC.png
Binary file added docs/images/tutorials/neurocog/SingleGEC.png
Binary file added docs/images/tutorials/neurocog/alphasyn.jpg
Binary file added docs/images/tutorials/neurocog/exp2syn.jpg
Binary file added docs/images/tutorials/neurocog/expsyn.jpg
8 changes: 4 additions & 4 deletions docs/installation.md
@@ -6,13 +6,13 @@ without a GPU.
<i>Setup:</i> <a href="https://github.com/NACLab/ngc-learn">ngc-learn</a>,
in its entirety (including its supporting utilities),
requires that you ensure that you have installed the following base dependencies in
your system. Note that this library was developed and tested on Ubuntu 22.04 (and 18.04).
your system. Note that this library was developed and tested on Ubuntu 22.04 (and earlier versions on 18.04/20.04).
Specifically, ngc-learn requires:
* Python (>=3.10)
* ngcsimlib (>=0.3.b4), (<a href="https://github.com/NACLab/ngc-sim-lib">official page</a>)
* ngcsimlib (>=1.0.0), (<a href="https://github.com/NACLab/ngc-sim-lib">official page</a>)
* NumPy (>=1.26.0)
* SciPy (>=1.7.0)
* JAX (>= 0.4.18; and jaxlib>=0.4.18) <!--(tested for cuda 11.8)-->
* JAX (>= 0.4.28; and jaxlib>=0.4.28) <!--(tested for cuda 11.8)-->
* Matplotlib (>=3.4.2), (for `ngclearn.utils.viz`)
* Scikit-learn (>=1.3.1), (for `ngclearn.utils.patch_utils` and `ngclearn.utils.density`)

@@ -45,7 +45,7 @@ $ git clone https://github.com/NACLab/ngc-learn.git
$ cd ngc-learn
```

2. (<i>Optional</i>; only for GPU version) Install JAX for either CUDA 11 or 12 , depending
2. (<i>Optional</i>; only for GPU version) Install JAX for CUDA 12, depending
on your system setup. Follow the
<a href="https://jax.readthedocs.io/en/latest/installation.html">installation instructions</a>
on the official JAX page to properly install the CUDA 11 or 12 version.
2 changes: 1 addition & 1 deletion docs/modeling/neurons.md
@@ -234,7 +234,7 @@ fast spiking (FS), low-threshold spiking (LTS), and resonator (RZ) neurons.
This cell models dynamics over voltage `v` and three channels/gates (related to
potassium and sodium activation/inactivation). This sophisticated cell system is,
as a result, a set of four coupled differential equations and is driven by an appropriately configured set of biophysical constants/coefficients (default values of which have been set according to relevant source work).
(Note that this cell supports either Euler or midpoint method / RK-2 integration.)
(Note that this cell supports Euler, midpoint / RK-2 integration, or RK-4 integration.)

```{eval-rst}
.. autoclass:: ngclearn.components.HodgkinHuxleyCell
28 changes: 28 additions & 0 deletions docs/modeling/synapses.md
@@ -60,6 +60,34 @@ This synapse performs a deconvolutional transform of its input signals. Note tha

## Dynamic Synapse Types

### Exponential Synapse

This (chemical) synapse performs a linear transform of its input signals. Note that this synapse is "dynamic" in the sense that its efficacies are a function of their pre-synaptic inputs; there is no inherent form of long-term plasticity in this base implementation. Synaptic strength values can be viewed as being filtered/smoothened through an exponential kernel.

```{eval-rst}
.. autoclass:: ngclearn.components.ExponentialSynapse
:noindex:

.. automethod:: advance_state
:noindex:
.. automethod:: reset
:noindex:
```

### Alpha Synapse

This (chemical) synapse performs a linear transform of its input signals. Note that this synapse is "dynamic" in the sense that its efficacies are a function of their pre-synaptic inputs; there is no inherent form of long-term plasticity in this base implementation. Synaptic strength values can be viewed as being filtered/smoothened through a kernel that models more realistic rise and fall times of synaptic conductance.

```{eval-rst}
.. autoclass:: ngclearn.components.AlphaSynapse
:noindex:

.. automethod:: advance_state
:noindex:
.. automethod:: reset
:noindex:
```

### Short-Term Plasticity (Dense) Synapse

This synapse performs a linear transform of its input signals. Note that this synapse is "dynamic" in the sense that it engages in short-term plasticity (STP), meaning that its efficacy values change as a function of its inputs/time (and simulated consumed resources), but it does not provide any long-term form of plasticity/adjustment.
1 change: 1 addition & 0 deletions docs/museum/index.rst
@@ -18,3 +18,4 @@ relevant, referenced publicly available ngc-learn simulation code.
snn_dc
snn_bfa
sindy
rl_snn
132 changes: 132 additions & 0 deletions docs/museum/rl_snn.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,132 @@
# Reinforcement Learning through a Spiking Controller

In this exhibit, we will see how to construct a simple biophysical model for
reinforcement learning with a spiking neural network and modulated
spike-timing-dependent plasticity.
This model incorporates mechanisms from several different models, including
the constrained RL-centric SNN of <b>[1]</b> as well as simplifications
made with respect to the model of <b>[2]</b>. The model code for this
exhibit can be found
[here](https://github.com/NACLab/ngc-museum/tree/main/exhibits/rl_snn).

## Modeling Operant Conditioning through Modulation

Operant conditioning refers to the idea that environmental stimuli can either increase or decrease the occurrence of (voluntary) behaviors: positive stimuli can lead to future repeats of a certain behavior, whereas negative stimuli can punish, i.e., decrease, its future occurrences. Ultimately, operant conditioning shapes voluntary behavior through consequences: actions followed by rewards are repeated, while actions followed by negative outcomes diminish.

In this lesson, we will model a very simple case of operant conditioning for a neuronal motor circuit used to engage in the navigation of a simple maze.
The maze's design will be the rat T-maze and the "rat" will be allowed to move, at a particular point in the maze, in one of four directions (up/North, down/South, left/West, and right/East). A positive reward will be supplied to our rat neuronal circuit if it makes progress towards the direction of the food (placed in the upper right corner of the T-maze) and a negative reward will be provided if it fails to make progress/gets stuck, i.e., a dense reward functional will be employed. For the exhibit code that goes with this lesson, an implementation of this T-maze environment is provided, modeled in the same style/with the same agent API as the OpenAI Gymnasium.

### Reward-Modulated Spike-Timing-Dependent Plasticity (R-STDP)

Although [spike-timing-dependent plasticity](../tutorials/neurocog/stdp.md) (STDP) and [reward-modulated STDP](../tutorials/neurocog/mod_stdp.md) (MSTDP) are covered and analyzed in detail in the ngc-learn set of tutorials, we will briefly review the evolution
of synaptic strengths as prescribed by modulated STDP with eligibility traces here. In effect, STDP prescribes changes
in synaptic strength according to the idea that <i>neurons that fire together, wire together, except that timing matters</i>
(a temporal interpretation of basic Hebbian learning). This means that, assuming we are able to record the times of
pre-synaptic and post-synaptic neurons (that a synaptic cable connects), we can, at any time-step $t$, produce an
adjustment $\Delta W_{ij}(t)$ to a synapse via the following pair of correlational rules:

$$
\Delta W_{ij}(t) = A^+ \big(x_i s_j \big) - A^- \big(s_i x_j \big)
$$

where $s_j$ is the spike recorded at time $t$ of the post-synaptic neuron $j$ (and $x_j$ is an exponentially-decaying trace that tracks its spiking history) and $s_i$ is the spike recorded at time $t$ of the pre-synaptic neuron $i$ (and $x_i$ is an exponentially-decaying trace that tracks its pulse history). STDP, as shown in a very simple format above, effectively can be described as balancing two types of alterations to a synaptic efficacy -- long-term potentiation (the first term, which increases synaptic strength) and long-term depression (the second term, which decreases synaptic strength).
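As an illustrative sketch of the pair of correlational rules above (hypothetical helper names, not ngc-learn's actual API), the trace-based STDP update can be written with NumPy outer products, where `x_pre`/`x_post` are the exponentially-decaying traces and `s_pre`/`s_post` the binary spike vectors:

```python
import numpy as np

def stdp_update(x_pre, s_pre, x_post, s_post, A_plus=1.0, A_minus=1.0):
    """Trace-based STDP: Delta W_ij = A+ (x_i s_j) - A- (s_i x_j).

    Rows index pre-synaptic neurons i; columns index post-synaptic neurons j.
    (Hypothetical sketch for illustration only.)
    """
    ltp = A_plus * np.outer(x_pre, s_post)   # potentiation: pre trace * post spike
    ltd = A_minus * np.outer(s_pre, x_post)  # depression: pre spike * post trace
    return ltp - ltd
```

A synapse whose pre-synaptic trace is high when the post-synaptic neuron spikes is potentiated; one whose pre-synaptic neuron spikes while the post-synaptic trace is high is depressed.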

Modulated STDP is a three-factor variant of STDP that multiplies the final synaptic update by a third signal, e.g., the modulatory signal is often a reward (dopamine) intensity value (resulting in reward-modulated STDP). However, given that reward signals might be delayed or not arrive/be available at every single time-step, it is common practice to extend a synapse to maintain a second value called an "eligibility trace", which is effectively another exponentially-decaying trace/filter (instantiated as an ODE that can be integrated via the Euler method or related tools) that is constructed to track a sequence of STDP updates applied across a window of time. Once a reward/modulator signal becomes available, the current trace is multiplied by the modulator to produce a change in synaptic efficacy.
In essence, this update becomes:

$$
\Delta W_{ij} = \nu E_{ij}(t) r(t), \; \text{where } \; \tau_e \frac{\partial E_{ij}(t)}{\partial t} = -E_{ij}(t) + \Delta W_{ij}(t)
$$

where $r(t)$ is the dopamine supplied at some time $t$ and $\nu$ is some non-negative global learning rate. Note that MSTDP with eligibility traces (MSTDP-ET) is agnostic to the choice of local STDP/Hebbian update used to produce $\Delta W_{ij}(t)$ (for example, one could replace the trace-based STDP rule we presented above with BCM or a variant of weight-dependent STDP).
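To make the eligibility-trace dynamics concrete, here is a minimal Euler-integration sketch of the MSTDP-ET update above (again with hypothetical function names, not the library's implementation):

```python
import numpy as np

def step_eligibility(E, dW, dt=1.0, tau_e=50.0):
    """One Euler step of: tau_e dE/dt = -E + Delta W(t)."""
    return E + (dt / tau_e) * (-E + dW)

def mstdp_et_update(E, r, nu=0.01):
    """Weight change: Delta W = nu * E(t) * r(t), gated by modulator r."""
    return nu * E * r
```

Note that when $r(t) = 0$ no learning occurs, yet the trace `E` keeps accumulating and decaying recent STDP updates, which is how a delayed reward can credit synaptic events that occurred earlier in the window.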

## The Spiking Neural Circuit Model

In this exhibit, we build one of the simplest possible spiking neural networks (SNNs) one could design to tackle a simple maze navigation problem such as the rat T-maze: specifically, a three-layer SNN where the first layer is a Poisson encoder and the second and third layers contain sets of recurrent leaky integrate-and-fire (LIF) neurons. The recurrence in our model is non-plastic and constructed such that a form of lateral competition is induced among the LIF units, i.e., the LIF neurons will be driven by a scaled hollow-matrix-initialized recurrent weight matrix (which will multiply spikes encountered at time $t - \Delta t$ by negative values), which will (quickly yet roughly) approximate the effect of inhibitory neurons. For the synapses that transmit pulses from the sensory/input layer to the second layer, we will opt for a non-plastic sparse mixture of excitatory and inhibitory strength values (much as in the model of <b>[1]</b>) to produce a reasonable encoding of the input Poisson spike trains. For the synapses that transmit pulses from the second layer to the third (control/action) layer, we will employ MSTDP-ET (as shown in the previous section) to adjust the non-negative efficacies in order to learn a basic reactive policy. We will call this very simple neuronal model the "reinforcement learning SNN" (RL-SNN).
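One way to sketch the non-plastic lateral competition described above is a scaled hollow matrix (zero diagonal, negative off-diagonal entries) applied to the previous step's spike vector; this is an assumption-laden illustration of the idea, not the exhibit's exact initialization:

```python
import numpy as np

def hollow_inhibition(n, scale=0.5):
    """Zero-diagonal matrix of negative weights: every neuron inhibits all
    of its neighbors but not itself (rough lateral competition).
    (Illustrative sketch; the exhibit's actual scaling may differ.)"""
    return -scale * (np.ones((n, n)) - np.eye(n))

# Inhibitory current driven by the previous step's spike vector s:
# I_inh = s @ hollow_inhibition(n)
```

Neurons that spiked at the previous step thereby suppress all other units at the current step, quickly (if roughly) approximating the effect of dedicated inhibitory neurons.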

The SNN circuit will be provided raw pixels of the T-maze environment (however, this view is a global view of the
entire maze, as opposed to something more realistic such as an egocentric view of the sensory space), where a cross
"+" marks its current location and an "X" marks the location of the food substance/goal state. Shown below, on the
left, is an image depicting a real-world rat T-maze, while on the right is our implementation/simulation of the
T-maze problem (and what our SNN circuit sees at the very start of an episode of the navigation problem).

```{eval-rst}
.. table::
:align: center

+-------------------------------------------------+------------------------------------------------+
| .. image:: ../images/museum/real_ratmaze.jpg | .. image:: ../images/museum/ratmaze.png |
| :width: 250px | :width: 200px |
| :align: center | :align: center |
+-------------------------------------------------+------------------------------------------------+
```

## Running the RL-SNN Model

To fit the RL-SNN model described above, go to the `exhibits/rl_snn`
sub-folder (this step assumes that you have git cloned the model museum repo
code), and execute the RL-SNN's simulation script from the command line as follows:

```console
$ ./sim.sh
```
which will execute a simulation of the MSTDP-adapted SNN on the T-maze problem, specifically executing four uniquely-seeded trial runs (i.e., four different "rat agents") and produce two plots, one containing a smoothened curve of episodic rewards over time and another containing a smoothened task accuracy curve (as in, did the rat reach the goal-state and obtain the food substance or not). You should obtain plots that look roughly like the two below.

```{eval-rst}
.. table::
:align: center

+-----------------------------------------------+-----------------------------------------------+
| .. image:: ../images/museum/rat_rewards.jpg | .. image:: ../images/museum/rat_accuracy.jpg |
| :width: 400px | :width: 400px |
| :align: center | :align: center |
+-----------------------------------------------+-----------------------------------------------+
```

Notice that we have provided a random agent baseline (i.e., uniform random selection of one of the four possible
directions to move at each step in an episode) to contrast the SNN rat motor circuit's performance with. As you may
observe, the SNN circuit ultimately becomes conditioned to taking actions akin to the optimal policy: go North/up
if it perceives itself (marked as a cross "+") at the bottom of the T-maze, then go East/right once it has reached
the top of the T, continuing right upon perceiving the food item (goal state marked as an "X").

The code has been configured to also produce a small video/GIF of the final episode `episode200.gif`, where the MSTDP
weight changes have been disabled and the agent must solely rely on its memory of the uncovered policy to get to the
goal state.

### Some Important Limitations

While the above MSTDP-ET-driven motor circuit model is useful and provides a simple model of operant conditioning in
the context of a very simple maze navigation task, it is important to identify the assumptions/limitations of the
above setup. Some important limitations or simplifications made to obtain a consistently working
RL-SNN model include:
1. As mentioned earlier, the sensory input contains a global view of the maze navigation problem, i.e., a 2D birds-eye
view of the agent, its goal (the food substance), and its environment. More realistic, but far more difficult
versions of this problem would need to consider an ego-centric view (making the problem a partially observable
Markov decision process), a more realistic 3D representation of the environment, as well as more complex maze
sizes and shapes for the agent/rat model to navigate.
2. The reward is only delayed with respect to the agent's stimulus processing window, meaning that the agent essentially
receives a dopamine signal after an action is taken. If we ignore the SNN's stimulus processing time between video
frames of the actual navigation problem, we can view our agent above as tackling what is known in reinforcement
learning as the dense reward problem. A far more complex, yet more cognitively realistic, version of the problem
is to administer a sparse reward, i.e., the rat motor circuit only receives a useful dopamine/reward stimulus at the
end of an episode as opposed to after each action. The above MSTDP-ET model would struggle to solve the sparse
reward problem and more sophisticated models would be required in order to achieve successful outcomes, i.e.,
appealing to models of memory/cognitive maps, more intelligent forms of exploration, etc.
3. The SNN circuit itself only permits plastic synapses in its control layer (i.e., the synaptic connections between
the second layer and third output/control layer). The bottom layer is non-plastic and fixed, meaning that the
agent model is dependent on the quality of the random initialization of the input-to-hidden encoding layer. The
input-to-hidden synapses could be adapted with STDP (or MSTDP); however, the agent will not always successfully
and stably converge to a consistent policy as the encoding layer's effectiveness is highly dependent on how much
of the environment the agent initially sees/explores (if the agent gets "stuck" at any point, STDP will tend to
fill up the bottom layer receptive fields with redundant information and make it more difficult for the control
layer to learn the consequences of taking different actions).

<!-- References/Citations -->
## References
<b>[1]</b> Chevtchenko, Sérgio F., and Teresa B. Ludermir. "Learning from sparse
and delayed rewards with a multilayer spiking neural network." 2020 International
Joint Conference on Neural Networks (IJCNN). IEEE, 2020. <br>
<b>[2]</b> Diehl, Peter U., and Matthew Cook. "Unsupervised learning of digit
recognition using spike-timing-dependent plasticity." Frontiers in computational
neuroscience 9 (2015): 99.
