Random Projection Encoding #87

Closed
@thomas9t

Description

This is a cool project, and thanks for referencing our paper for random-projection encoding methods! Just FYI - the random-projection encoding method you have implemented is a bit different from the ones discussed in the referenced paper. Let $x$ be an $n$-dimensional input point to encode. I think the particular embedding you are referring to is the following:

$z = \mathrm{sign}(Mx)$

where $M$ is a $d \times n$ matrix whose rows are sampled from the uniform distribution over the unit sphere. A simple way to generate such samples is to draw from the $n$-dimensional standard normal distribution and normalize (see here for more info). Note that I'm not sure that sampling from the uniform distribution over $[-1,1]^n$ and normalizing produces a uniform distribution over the sphere. In particular, I believe that approach concentrates too much mass in the directions of the cube's corners and too little along the coordinate axes. The following code should do the trick:

```python
# Generate the embedding matrix:
import numpy as np

d, n = 10000, 100
M = np.random.normal(size=(d, n))              # rows are standard-normal samples...
M /= np.linalg.norm(M, axis=1, keepdims=True)  # ...normalized onto the unit sphere

# Encode a point:
x = np.random.rand(n)
z = np.sign(M.dot(x))
```
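As a quick sanity check (my own illustration, not from the paper): for rows drawn uniformly from the sphere, the probability that a single sign bit differs between two encodings equals $\theta/\pi$, where $\theta$ is the angle between the inputs. So the normalized Hamming distance between encodings should approximate the angular distance:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10000, 100

# Embedding matrix with rows uniform on the unit sphere
M = rng.normal(size=(d, n))
M /= np.linalg.norm(M, axis=1, keepdims=True)

# Encode two random points
x = rng.normal(size=n)
y = rng.normal(size=n)
zx, zy = np.sign(M @ x), np.sign(M @ y)

# Fraction of differing bits vs. true angle between x and y
hamming = np.mean(zx != zy)
theta = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
# hamming should be close to theta / pi
```

With $d = 10000$ the estimate is typically within a couple of percentage points of $\theta/\pi$, which is the sense in which this embedding preserves (angular) distances.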

The sign function is important because, without it, the sense in which the encoding preserves distances between points is different (and I'm not sure it is what one would want). That said, you may not want to use the sign function because it interferes with gradients (its derivative is zero everywhere except at zero, where it is undefined). If you want to omit the sign function and use a purely linear projection, I would recommend looking into the "Johnson-Lindenstrauss Transform" (see here).
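For reference, a minimal sketch of the linear (no-sign) alternative: a Gaussian matrix scaled by $1/\sqrt{d}$ is a standard Johnson-Lindenstrauss construction that approximately preserves Euclidean distances. (This is an illustration of the general technique, not code from this project.)

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 1000, 100

# JL-style linear map: scaled Gaussian matrix, no sign nonlinearity.
# Scaling by 1/sqrt(d) makes E[||P v||^2] = ||v||^2.
P = rng.normal(size=(d, n)) / np.sqrt(d)

x = rng.normal(size=n)
y = rng.normal(size=n)

# Distance between projections vs. original distance
ratio = np.linalg.norm(P @ x - P @ y) / np.linalg.norm(x - y)
# ratio should be close to 1
```

Because the map is linear, gradients pass through it cleanly, which addresses the differentiability concern above.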
