-
Notifications
You must be signed in to change notification settings - Fork 29
Add methods to form thermometer codes #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR, it's looking good! I just have a couple comments about some details.
torchhd/functional.py
Outdated
device=rand_hv.device, | ||
) | ||
|
||
for i in range(num_vectors): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you perform a quick benchmark (timing) of your implementation against something like torch.triu_indices
? This would let you set all elements in parallel so I suspect it would be quite a bit faster especially as the number of dimensions increases but I haven't tried it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point! I agree it is no-brainer that the usage of indexing will make the implementation extremely fast. So the code was modified accordingly
torchhd/functional.py
Outdated
) | ||
|
||
for i in range(num_vectors): | ||
if (model == BSC) | (model == FHRR): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering shouldn't the FHRR also use bipolar elements? Since it's a generalization of the bipolar hypervectors. This would also be inline with the model agnostic version I mentioned in another comment using the identity and negative identity.
requires_grad=False, | ||
**kwargs, | ||
) -> VSA_Model: | ||
"""Creates a thermometer code for given dimensionality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe another way to look at the thermometer codes is to take the first hypervector as the identity hypervector of the model and the last hypervector as the negative of the identity and then perform the interpolation in between. This could make it such that the function is agnostic to the model used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to save this point somewhere! I think the current implementation with the use of torch.tril_indices is pretty compact but as we expand the number of supported HD/VSA models we might want to switch to this idea of interpolation.
torchhd/functional.py
Outdated
) -> VSA_Model: | ||
"""Creates a thermometer code for given dimensionality. | ||
|
||
Implements similarity-preserving hypervectors as described in "Sparse Binary Distributed Encoding of Scalars" <https://doi.org/10.1615/J Automat Inf Scien.v37.i6.20>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add the thermometer_hv and Thermometer to the documentation under docs/torchhd.rst and docs/embeddings.rst? Please double check using the sphinx documentation generation command in the README whether the link appears correctly, there might be an error in your syntax for the reference in the function description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I have not thought about the documentation in the first place. Fixed it now by updating the corresponding .rst files. There was indeed an error in the link formatting. Fixed it as well so it works in the generated .html
input, self.low_value, self.high_value, self.num_embeddings | ||
).clamp(0, self.num_embeddings - 1) | ||
|
||
return super(Thermometer, self).forward(indices).as_subclass(MAP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikeheddes a question here. I followed the structure as in Level(). It, however, looks weird that MAP is used at the end when doing .as_subclass(). It seems that we rather need to specify a "model" parameter at the input and use the corresponding model with return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. It is currently this way to be compatible with the way we implemented it before v4. But we should allow the user to specify the model of the various embedding classes. I will make a new issue for this.
I'm wondering whether it would be better to keep the same API as for the other hypervector creation functions where you can specify the number of vectors and their dimensions. Because for 10000 dimension this will create a very large matrix while you might only need to use 6 hypervectors or something. |
Will work now on revising the code to address this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@denkle thank you for revising the code and congrats with merging your first PR!
I removed the VSA model argument from the embedding class so that all the embeddings have a consistent API. I think it's better if we change all the embedding classes to accept a VSA model argument in a separate PR.
Thanks, @mikeheddes! Agree with the point on separate PR for changing embeddings' API for all classes. Just let me know if you would like to start it or whether I should give the first try. It might at least be worth checking if the code for Thermometer was reasonable. |
@denkle you can start working on that feature if you want. What you had before was in the right direction I think, it's a similar API to the functional ones. The only thing that was still missing is making sure that the dtype of the created tensors matches (or is compatible with) the provided VSA_Model. |
No description provided.