Saving Spatial Pooler #641

Closed
Sedom9 opened this issue Aug 19, 2019 · 23 comments
Labels
Python Binding question Further information is requested serializable SP

Comments

@Sedom9

Sedom9 commented Aug 19, 2019

Hello!

I want to migrate from Python 2 to Python 3, which is why I am using htm.core (the community fork). I don't understand how serialization works in htm.core. I want to save a Spatial Pooler, but I get this error: AttributeError: 'htm.bindings.algorithms.SpatialPooler' object has no attribute 'save'

@dkeeney

dkeeney commented Aug 19, 2019

Thanks for the post @Sedom9

I am also repeating the forum post from @1111 because it also has some hints.

I’m facing a similar problem.

[bindings/py/tests/algorithms/spatial_pooler_test.py]

    def testNupicSpatialPoolerPickling(self):
        sp = SP()
        pickledSp = pickle.dumps(sp)
        sp2 = pickle.loads(pickledSp)

Following the example above, I ran the following tests and got the following results:

Python 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0] on linux

    import pickle
    from htm.bindings.algorithms import SpatialPooler
    sp = SpatialPooler()
    print(str(sp))

Spatial Pooler Connections:
Inputs (1024) ~> Outputs (4096) via Segments (4096)
Segments on Cell Min/Mean/Max 1 / 1 / 1
Potential Synapses on Segment Min/Mean/Max 512 / 512 / 512
Connected Synapses on Segment Min/Mean/Max 218 / 256.154 / 302
Synapses Dead (0%) Saturated (0%)

    with open('saved_model/sp.pickle', 'wb') as f:
        pickle.dump(sp, f)

    with open('saved_model/sp.pickle', 'rb') as f:
        saved_sp = pickle.load(f)

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)


@Sedom9
Author

Sedom9 commented Aug 19, 2019

Thanks. I hope you can solve this problem, because without saving I can't use this fork.

@dkeeney

dkeeney commented Aug 19, 2019

In looking at this I see that the save() and load() functions were not implemented so I will have to do some work on this.

pickle was implemented on both the SP and TM, but there seems to be a problem with it, as indicated by @1111.

There are also functions loadFromString(string) and writeToString() that should have used a JSON representation of the SP, but I think the output is binary, so that needs to be changed as well.

I will try to have something for you shortly.

@Sedom9
Author

Sedom9 commented Aug 19, 2019

Thanks a lot. I will wait

@breznak
Member

breznak commented Aug 19, 2019

@Sedom9 have a look at the test in bindings/py/tests/algorithms/spatial_pooler_test.py.
Hint: use Python >= 3.6. It would help if you could provide a test case that crashes for you, ideally as a PR.

@breznak breznak added Python Binding question Further information is requested serializable SP labels Aug 19, 2019
@Sedom9
Author

Sedom9 commented Aug 19, 2019

(venv) sergey@sergey-HP-ProBook-440-G5:/var/project/htm.core-master/bindings/py/tests/algorithms$ python spatial_pooler_test.py
..Successfully caught incorrect uint numpy data length
..Successfully caught incorrect uint numpy data length
..Successfully caught incorrect float numpy data length
...
----------------------------------------------------------------------
Ran 9 tests in 2.650s

OK

@breznak
Member

breznak commented Aug 19, 2019

If you look at the tests, it uses pickle to serialize a (basic) SP. What is your scenario that is failing?

@dkeeney

dkeeney commented Aug 19, 2019

@breznak The test is not very extensive. It does not actually check if the restored SP is the same as the original. The original SP contained no data and has all default parameters (other than the dimensions). We should be able to do a little better.

@dkeeney

dkeeney commented Aug 19, 2019

Also, save(filename) and load(filename) were not implemented, as far as I can tell. Should we implement them?

@breznak
Member

breznak commented Aug 20, 2019

It does not actually check if the restored SP is the same as the original.

Yeah, I know. A good test would be, e.g., running an SP on 1000 random inputs, serializing it, computing the next iteration with both the original and the restored SP, and checking that the results are the same. That's why I'm asking the OP for a use case where this fails, ideally a PR with a test case.
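
The test idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ToyModel` and `roundtrip_matches` are hypothetical names that do not exist in htm.core, and the toy's `compute(bits, learn)` signature is made up for the sketch. The toy serializes just its weights, while the helper defaults to pickle, which is what the thread uses for the SP.

```python
import pickle
import random


class ToyModel:
    """Hypothetical stand-in for a stateful, learning model; NOT the htm.core SP."""

    def __init__(self, weights=None, size=16):
        self.weights = list(weights) if weights is not None else [0.0] * size

    def compute(self, bits, learn=True):
        # Score each unit by its learned weight plus a bonus if its input bit is on.
        scores = [w + (1.0 if i in bits else 0.0) for i, w in enumerate(self.weights)]
        # The 4 highest-scoring units are "active"; ties are broken by index.
        ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
        active = sorted(ranked[:4])
        if learn:
            for i in active:
                self.weights[i] += 0.1
        return active


def roundtrip_matches(model, make_input, save=pickle.dumps, load=pickle.loads, warmup=1000):
    """Train, serialize, restore, then compare the next step of both copies."""
    for _ in range(warmup):
        model.compute(make_input(), learn=True)
    restored = load(save(model))
    probe = make_input()
    # Both copies see the same input; their outputs must be identical.
    return model.compute(probe, learn=True) == restored.compute(probe, learn=True)


rng = random.Random(42)
ok = roundtrip_matches(
    ToyModel(),
    make_input=lambda: rng.sample(range(16), 4),
    save=lambda m: pickle.dumps(m.weights),    # serialize the toy's state only
    load=lambda b: ToyModel(pickle.loads(b)),  # rebuild a toy from that state
)
print("restored model matches original:", ok)
```

For a real SpatialPooler the helper would need a thin wrapper, since its `compute` call is not expected to match the toy signature used here.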

save(filename) and load(filename) was not implemented as far as I can tell. Should we

Depends, we should be compatible with Serializable.hpp, I'm not sure if we pulled through and switched to save/load, or kept using save_ar/load_ar. I think we're not 100% on Cereal, so we still have the _ar versions.

@dkeeney

dkeeney commented Aug 20, 2019

I think we're not 100% on Cereal, so we still have the _ar versions.

We are 100% on Cereal. The save/load calls are in Serializable and get converted into Cereal calls save_ar and load_ar which pass the ar argument.

For C++ users we do have

  • save(stream) and load(stream) and
  • saveToFile(filename) and loadFromFile(filename) on the Serializable base class.

All of these can have the optional flag to set the format (Binary, JSON, etc).

For Python we have

  • pickle (or will have, once I figure out how to fix it). This will use Binary format.
  • SP has saveToString() and loadFromString(), which use JSON format. Perhaps all bindings should implement this.
  • I would like to add saveToFile(filename) and loadFromFile(filename), which use Binary format but do not require pickle.

We cannot pass streams across language barriers (at least not very easily), so maybe save() and load() do not need to be exposed to Python.

Note that the original Nupic API http://nupic.docs.numenta.org/stable/guides/serialization.html uses read(), write(), readFromFile(), and writeToFile(), which were Cap'n Proto calls. It deprecated save() and load().

@dkeeney

dkeeney commented Aug 20, 2019

@breznak, the problem I am trying to fix with pickle:

I have found the problem, although I am not sure how to fix it yet.
pickle.load() is implemented in the bindings as a call to load(), where we do this:

    std::stringstream ss( s.cast<std::string>() );
    SpatialPooler sp;
    sp.load(ss);
    std::cout << "SP=" << sp << std::endl;
    return sp;

I added the cout << to see if the SP is restored correctly and it is. So the passing of the byte array did work. However, when we return the sp object, we are returning it by value. pybind11 is doing some sort of magic on it to make it into a python object but in Python, print(str(sp)) is getting nothing but garbage.

Still studying pybind11 to see what it really wants to have returned. Does SP have a problem with a copy constructor perhaps?

@dkeeney

dkeeney commented Aug 20, 2019

Found the problem with pickling. The bindings for pickling the SP were broken because they could not handle returning the newly created SP object by value during deserialization, probably due to a problem with the copy constructor. We don't really want it to be copied anyway, so it works when we return it in a unique_ptr instead.

@breznak
Member

breznak commented Aug 21, 2019

Resolved in #644, thank you David!

@breznak breznak closed this as completed Aug 21, 2019
@dkeeney

dkeeney commented Aug 21, 2019

@Sedom9 If you pull a fresh repository from htm.core you should find that SP can now be saved and restored with pickle. It can also be saved and restored using saveToFile(filename) and loadFromFile(filename ).

The save( ) and load( ) functions work with C++ but because the argument is a stream they cannot be used directly from python.

I hope this works for you. Thank you for submitting the issue because it helped us identify and fix this problem.

@Sedom9
Author

Sedom9 commented Aug 21, 2019

Thanks a lot for your help! I will pull a fresh copy of the htm.core repository and try to save my Spatial Pooler. I will post here about my testing.

@dkeeney dkeeney reopened this Aug 21, 2019
@hirarinNakalab

@dkeeney Thank you for your quick response!
I am @1111 on the HTM forum.
I was planning to save both the SP and the TM, so this was very helpful.

@breznak
Member

breznak commented Sep 18, 2019

@hirarinNakalab @Sedom9 can you confirm the issue was fixed by the merged PR, so we can close this?

@hirarinNakalab

I cloned the updated repository and verified that the SP and TM can be saved and loaded successfully. I think there is no problem with closing this issue.

@breznak
Member

breznak commented Sep 18, 2019

Thanks for testing! Issue successfully resolved.

@breznak breznak closed this as completed Sep 18, 2019
@Sedom9
Author

Sedom9 commented Sep 26, 2019

Sorry, I forgot to confirm my testing. Thanks a lot. After updating the repository, everything is OK.

@Sedom9
Author

Sedom9 commented Feb 11, 2020

@breznak @dkeeney Last time, when I tested save and load, I didn't check loading carefully. Now I have come back to loading my Spatial Pooler, and I have some questions. When I tried to load it like this:

    self.sp = SpatialPooler.loadFromFile(self.sp_file_path)

I got this error:

    TypeError: loadFromFile(): incompatible function arguments. The following argument types are supported:
        1. (self: htm.bindings.algorithms.SpatialPooler, arg0: str) -> None

    Invoked with: '/var/main_project/union-classifier1/data/sp.tmp'

How can I use loadFromFile()? What is the first parameter of this function? An empty Spatial Pooler, or something else?

Thanx a lot for your helping!

@breznak
Member

breznak commented Feb 11, 2020

How can I use loadFromFile()? What is the first parameter of this function? An empty Spatial Pooler, or something else?

yes, it's not a static function, you need a dummy SP first.

    sp = SpatialPooler()
    sp.loadFromFile("path.out")
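
To make the instance-method point concrete, here is a minimal sketch using a hypothetical stand-in class. `FakePooler` and the helper `load_into_new` are assumptions for illustration and are not part of htm.core. The stand-in only mimics the `(self, filename) -> None` shape shown in the error message above. It shows why calling the method on the class fails and how the "dummy object first" pattern works.

```python
import json
import os
import tempfile


class FakePooler:
    """Hypothetical stand-in; NOT the htm.core SpatialPooler."""

    def __init__(self):
        self.state = {}

    def saveToFile(self, filename):
        with open(filename, "w") as f:
            json.dump(self.state, f)

    def loadFromFile(self, filename):
        # Instance method: overwrites the state of an existing object, returns None.
        with open(filename) as f:
            self.state = json.load(f)


def load_into_new(cls, filename):
    """Create a default ("dummy") instance, then load the saved state into it."""
    obj = cls()
    obj.loadFromFile(filename)
    return obj


original = FakePooler()
original.state = {"columns": 4096}
path = os.path.join(tempfile.mkdtemp(), "sp.tmp")
original.saveToFile(path)

# Calling through the class passes the path where `self` is expected,
# so Python raises TypeError, much like the binding did in the report above.
try:
    FakePooler.loadFromFile(path)
    called_on_class_failed = False
except TypeError:
    called_on_class_failed = True

restored = load_into_new(FakePooler, path)
print(called_on_class_failed, restored.state)
```

Because the method returns None, assigning its result (as in `self.sp = ...loadFromFile(...)`) would also lose the object; keep a reference to the instance itself.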
