
Extra Encoders (Image, Video, sound, ...) #259

Open
4 of 15 tasks
breznak opened this issue Feb 9, 2019 · 6 comments
Labels
community encoder question Further information is requested

Comments

@breznak
Member

breznak commented Feb 9, 2019

For some experiments, I'd like to set up encoders/extra/{vision,audio,...}/
with specialized encoders for multiple modalities.

There existed special repos for this, but basically each of these combined:

  • an encoder
  • a normal HTM (plus the hassle of setting up two repos)
  • the experiment

I think it'd help our community if we provided an all-around baseline.
What do you think?

EDIT:

Domains:

@breznak breznak added question Further information is requested community encoder labels Feb 9, 2019
@breznak breznak mentioned this issue Feb 9, 2019
@ctrl-z-9000-times
Collaborator

ctrl-z-9000-times commented Feb 16, 2019

Maybe also a Grid Cell encoder, which converts a Cartesian coordinate into a grid-cell like encoding.
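A grid-cell-like encoder could look roughly like the sketch below (this is an illustration of the idea, not htm.core's actual `GridCellEncoder` API; the function name and parameters are made up). Each module tiles the plane periodically at its own spatial scale, and the input point activates exactly one cell per module:

```python
import numpy as np

def grid_cell_encode(x, y, n_modules=4, cells_per_axis=8,
                     base_scale=10.0, scale_ratio=1.5):
    """Encode a 2D Cartesian point as a binary vector using grid-cell-like
    modules. Each module wraps the plane with its own period (scale), so
    nearby points share bits in the coarse modules while distant points
    differ; sparsity is fixed at one active cell per module."""
    bits = []
    for m in range(n_modules):
        scale = base_scale * scale_ratio ** m          # spatial period of this module
        # Position within the module's tile, mapped onto a small cell grid.
        i = int((x % scale) / scale * cells_per_axis)
        j = int((y % scale) / scale * cells_per_axis)
        module = np.zeros((cells_per_axis, cells_per_axis), dtype=np.uint8)
        module[i, j] = 1
        bits.append(module.ravel())
    return np.concatenate(bits)

sdr = grid_cell_encode(3.2, 7.9)
print(sdr.size, int(sdr.sum()))   # 256 4  (4 modules of 64 cells, one active each)
```

The multiple scales are what make the code unique over a large area even though each module by itself is periodic.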

@ctrl-z-9000-times
Collaborator

The original NuPIC had a "time & date" encoder. It's not critical, but it might be nice to have. I think the motivation for this encoder is that many anomalies correlate with the time of day or the day of the week.

HTM school video for date/time encoder:
https://discourse.numenta.org/t/htm-school-episode-6-datetime-encoding/892
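The core trick of a date/time encoder is that time attributes are cyclic: 23:00 should overlap with 00:00. A minimal sketch of one such cyclic field (not NuPIC's actual `DateEncoder` API; sizes and names here are illustrative):

```python
import numpy as np
from datetime import datetime

def encode_hour_of_day(t: datetime, size=24, active=5):
    """Cyclically encode the hour of day as a contiguous, wrapping run of
    `active` bits in a `size`-bit ring, so hours near midnight overlap
    heavily instead of being maximally distant."""
    sdr = np.zeros(size, dtype=np.uint8)
    start = int(t.hour / 24 * size)
    for k in range(active):
        sdr[(start + k) % size] = 1   # wrap around the ring
    return sdr

a = encode_hour_of_day(datetime(2019, 2, 9, 23))
b = encode_hour_of_day(datetime(2019, 2, 10, 0))
print(int((a & b).sum()))   # 4 of 5 active bits shared between 23:00 and 00:00
```

A full date/time encoder would concatenate several such fields (hour, weekday, weekend flag, season), each sized by how strongly it should influence similarity.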

@ctrl-z-9000-times
Collaborator

I wrote a vision encoder which could be added to this repo. It needs some cleanup & modification before it will be ready, and I don't know when I will get around to it. Currently it is a research prototype.

It is written in Python and uses OpenCV. The OpenCV library has functions which do log-polar transforms and parvo-/magno-cellular transforms. OpenCV works well and looks pretty well researched. I wrote an encoder which converts the processed images into sparse distributed representations (SDRs). My encoders are less well researched.
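For readers unfamiliar with the log-polar step: it resamples the image so the centre of gaze is covered densely and the periphery coarsely, roughly like retinal sampling. A slow but self-contained numpy sketch of the idea (OpenCV's `cv2.warpPolar` with the log flag does this efficiently; the function below is illustrative only):

```python
import numpy as np

def log_polar(img, out_rho=64, out_theta=64):
    """Resample a grayscale image onto a log-polar grid centred on the
    image middle: the sampling radius grows exponentially with the row
    index, so resolution is highest at the centre of gaze."""
    h, w = img.shape
    cy, cx = h / 2, w / 2
    max_r = min(cy, cx)
    out = np.zeros((out_rho, out_theta), dtype=img.dtype)
    for i in range(out_rho):
        r = max_r ** (i / (out_rho - 1))          # exponential radius: 1 .. max_r
        for j in range(out_theta):
            a = 2 * np.pi * j / out_theta
            y, x = int(cy + r * np.sin(a)), int(cx + r * np.cos(a))
            if 0 <= y < h and 0 <= x < w:
                out[i, j] = img[y, x]
    return out

img = np.arange(128 * 128, dtype=np.float32).reshape(128, 128)
print(log_polar(img).shape)   # (64, 64)
```

A nice side effect of the representation: rotation and scaling of the input become translations of the output, which a translation-tolerant encoder can exploit.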

Here are some images showing the log-polar & parvo/magno-cellular transforms.

[Images: the same scene shown as the raw field of view, its parvo-cellular transform, and its magno-cellular transform.]

Here are some statistics about the SDR-encoded output of the eye, after viewing the dataset which contains the above image.

Parvo SDR( 250 250 1 )
    Sparsity Min/Mean/Std/Max 0.197344 / 0.200787 / 0.00118844 / 0.205056
    Activation Frequency Min/Mean/Std/Max 0 / 0.20079 / 0.29597 / 1
    Entropy 0.458866
    Overlap Min/Mean/Std/Max 0.355327 / 0.833063 / 0.0761267 / 0.956605
Magno SDR( 250 250 1 )
    Sparsity Min/Mean/Std/Max 0.19864 / 0.203853 / 0.00151922 / 0.208624
    Activation Frequency Min/Mean/Std/Max 0 / 0.203854 / 0.182725 / 0.595165
    Entropy 0.76574
    Overlap Min/Mean/Std/Max 0.00212632 / 0.848003 / 0.133877 / 0.981023
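The summaries above (sparsity, activation frequency, overlap) can be reproduced for any batch of binary SDRs with a few lines of numpy; this sketch mirrors their definitions rather than htm.core's actual `Metrics` class:

```python
import numpy as np

def sdr_stats(sdrs):
    """For a batch of flat binary SDRs, return per-SDR sparsity, per-bit
    activation frequency, and pairwise overlap (shared active bits as a
    fraction of the larger SDR's active count)."""
    sdrs = np.asarray(sdrs, dtype=np.float64)
    sparsity = sdrs.mean(axis=1)      # fraction of active bits in each SDR
    act_freq = sdrs.mean(axis=0)      # how often each bit fires across the batch
    overlaps = []
    for i in range(len(sdrs)):
        for j in range(i + 1, len(sdrs)):
            shared = (sdrs[i] * sdrs[j]).sum()
            overlaps.append(shared / max(sdrs[i].sum(), sdrs[j].sum()))
    return sparsity, act_freq, np.array(overlaps)

rng = np.random.default_rng(0)
batch = (rng.random((10, 1000)) < 0.2).astype(np.uint8)  # ~20% dense, like the Parvo SDR
sparsity, freq, ov = sdr_stats(batch)
print(round(float(sparsity.mean()), 2))
```

Note that for random 20%-dense codes the expected pairwise overlap is about 0.2; the much higher mean overlaps reported above (~0.83–0.85) say that consecutive views of the dataset produce strongly correlated encodings.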

@breznak
Member Author

breznak commented Sep 19, 2019

This is amazing!!

parvo/magno-cellular transforms.

I'll need to refresh my knowledge, this is a good starter:
https://foundationsofvision.stanford.edu/chapter-5-the-retinal-representation/#visualinformation

I quote:

When cells in the parvocellular layers of a monkey’s lateral geniculate nucleus are destroyed, performance deteriorates on a variety of tasks, such as color discrimination and pattern detection. Since the parvocellular pathway includes more than seventy percent of the retinal ganglion cells, perhaps this result is not terribly surprising. When cell bodies in the magnocellular layers are destroyed, many visual performances are unaffected.

What conclusion can we draw from these lesion studies? The information carried by the neurons in the parvocellular pathway provides the best information in the low temporal and high spatial frequency components of the image.

= image classification type of tasks

Performance on motion tasks and other tasks that require this information is better when the magnocellular pathway signal is available.

= video processing in vision tasks. motion detection/tracking.

Here are some statistics [...] after viewing the dataset

So you needed to produce several SDRs from the image.

  • do you use some saccadic movements (to generate a few centers of focus)?
  • or is this used on an "animation" (a sequence of related images)?

I am wondering how this kind of biologically plausible encoding would fare on "stupid" classification datasets like MNIST, CIFAR10, etc.
Or we'd have to extend to real-world tasks: object recognition in video...

I'll be reading more chapters on vision, please share your retina code when you have time, even if it's not ready yet. Thank you

@ctrl-z-9000-times
Collaborator

https://foundationsofvision.stanford.edu/chapter-5-the-retinal-representation/#visualinformation

That looks like a good source. My model is wrong with regard to many details, including the relative densities of magnocellular and parvocellular neurons.

do you use some saccadic movements (to generate a few centers of focus)

No, this is TODO. Currently I move the eye by a small random amount between each compute cycle. Controlling where the eye looks is an open issue, and it involves action selection and motor control.

please share your retina code

I'm keeping my latest work on the eye encoder here: https://github.com/ctrl-z-9000-times/sdr_algorithms/blob/master/eye.py I don't know when I will have time to work on it further.

@breznak
Member Author

breznak commented Sep 20, 2019

My model is wrong with regards to many details, including

yes, according to the literature, these modifications could apply:

  • parvo-/magno-cellular neurons in roughly a 70:30 proportion ( = this is the "retina")
  • force abstraction by ~1:10 compression (retina-to-optic-nerve neuron counts); the SP could do that
  • then the (much larger) visual cortex starts (HTM: SP+TM)
  • Q:
    • for image classification, use only parvo cells?
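One hedged way the "SP could do that" compression step might look, stripped of learning and boosting: a fixed sparse random projection followed by k-winners-take-all. All names and parameters below are illustrative, not htm.core's `SpatialPooler` API:

```python
import numpy as np

def kwinners_compress(sdr, out_size, sparsity=0.02, seed=42):
    """Compress a binary SDR to `out_size` bits (e.g. a ~10:1 reduction,
    like the retina-to-optic-nerve bottleneck) with a fixed sparse random
    projection plus a k-winners-take-all step: the core of a Spatial
    Pooler, minus learning and boosting."""
    rng = np.random.default_rng(seed)
    weights = rng.random((out_size, sdr.size)) < 0.1    # sparse random connections
    overlaps = weights.astype(np.float64) @ sdr          # feed-forward overlap scores
    k = max(1, int(out_size * sparsity))
    out = np.zeros(out_size, dtype=np.uint8)
    out[np.argsort(overlaps)[-k:]] = 1                   # keep the k best-driven cells
    return out

retina = (np.random.default_rng(0).random(10000) < 0.2).astype(np.uint8)
nerve = kwinners_compress(retina, out_size=1000)
print(nerve.size, int(nerve.sum()))   # 1000 20
```

A real SP would additionally learn its connections so that the surviving bits capture recurring input structure rather than arbitrary random projections.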

move the eye by a small random amount between each compute cycle. [...] involves action selection and motor control.

would the saccadic moves wrongly trigger the movement/magno cells? Or would that help to encode "I moved the eye/focus, so the 'move' is caused by the movement of the sensor, not by movement of the objects in the scene"?

  • I'm considering an initial, trivial use case: 28x28 MNIST images.
    • use just 1 focus point? (problem with bounding/location)
    • use 5 fixed saccades (always visiting the same coordinates, so this would be consistent across different images)
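The fixed-saccade idea above could be sketched like this; the schedule (centre plus the four quadrant centres) and names are hypothetical, chosen only so every image yields the same, comparable patch sequence:

```python
import numpy as np

# Hypothetical fixed saccade schedule for 28x28 MNIST: the image centre
# plus the four quadrant centres, visited in the same order for every
# image so the resulting SDR sequence is comparable across digits.
SACCADES = [(14, 14), (7, 7), (7, 21), (21, 7), (21, 21)]

def fixed_saccade_patches(img, radius=6):
    """Return the (2*radius) x (2*radius) patch around each fixed focus
    point; each patch would then be fed to the eye encoder in turn."""
    patches = []
    for (y, x) in SACCADES:
        patches.append(img[y - radius:y + radius, x - radius:x + radius])
    return patches

img = np.zeros((28, 28), dtype=np.uint8)
patches = fixed_saccade_patches(img)
print(len(patches), patches[0].shape)   # 5 (12, 12)
```

With radius 6 all five patches stay inside the 28x28 frame, which sidesteps the bounding/location problem mentioned for the single-focus-point variant.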

eye.py I don't know when I will have time to work on it further.

cool! Would you please make just an initial PR with eye.py (or other necessities) when you have time? I'd like to play with it next week, and I'll try to adapt it to the current state of htm.core. I ask for this so that you author the file and get the (c) for those lines :) I'll then continue making modifications to it.

Btw, reviewing that repo of yours, we're pretty much synced, aren't we? ae, CP, SDR... are more or less here 👍. eye.py is the only missing piece. Or is there something else significant?
