[sml] add kmeans++ and support executing with multiple initial centers in Kmeans #546
When running
|
Hello, and first of all, thanks for your excellent contribution! The reason for this error is that you initialize the emulator in every You can refer to this. (BTW, I may give a more detailed review after Spring Festival -.-) |
Thank you for your help. The problem has been solved! |
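The overall implementation is very nice, no real problems~ |
Thanks for the suggestion! I ran into a small problem while implementing it: I tried to generate init_params inside __init__, but the generated numbers fall completely outside the range [0, 1]. Taking the sample in the unittest as an example, the generated self.init_params is: |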
Sorry, I had not noticed this problem. In fact, choice cannot be used either (or rather, it gives incorrect results), for example:

x = np.array([1, 0, 2, 3, 1, 1, 1, 1, 1, 1])
fn = lambda x: jax.random.choice(jax.random.PRNGKey(1), x)
spu_fn = ppsim.sim_jax(sim, fn, copts=copts)
z = spu_fn(x)
print(f"spu out = {z}")  # -2147483648
print(f"cpu out = {fn(x)}")  # 1

However, because an out-of-range index does not raise an error, I expect the program still runs to completion. The root cause is that SPU has not yet hacked the API of JAX's random number module. One way to mitigate this is to restrict how it is used:

def emul_kmeans_kmeans_plus_plus(mode: emulation.Mode.MULTIPROCESS):
    X = jnp.array([[-4, -3, -2, -1], [-4, -3, -2, -1]]).T
    # define model in outer scope
    # then __init__ will be computed in plaintext
    model = KMEANS(
        n_clusters=4,
        n_samples=X.shape[0],
        init="k-means++",
        init_params=None,
        n_init=1,
        max_iter=10,
    )

    def proc(x):
        # only run fit in crypto
        model.fit(x)
        return model._centers.sort(axis=0)

    X = emulator.seal(X)
    result = emulator.run(proc)(X)
    print("result\n", result)
|
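In fact, your original code defined the kmeans object inside the SPU runtime, so even though the random numbers are generated in __init__, they cannot be computed correctly~ |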
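The suggestion above (build the model in the outer scope so that all randomness runs in plaintext, and hand only deterministic arithmetic to the runtime) can be illustrated without SPU at all. The following is a minimal, library-free sketch under stated assumptions, not the PR's code: `TinyKMeans` and this `proc` are hypothetical names, the data is one-dimensional for brevity, and Python's `random` module stands in for the JAX random API.

```python
import random


class TinyKMeans:
    """Hypothetical 1-D k-means used only for illustration (not sml's KMEANS)."""

    def __init__(self, n_clusters, n_samples, max_iter=10, seed=0):
        # All randomness happens here, at construction time, i.e. outside
        # whatever function is later handed to a secure runtime.
        rng = random.Random(seed)
        self.init_indices = rng.sample(range(n_samples), n_clusters)
        self.max_iter = max_iter
        self.centers = None

    def fit(self, xs):
        # Deterministic from here on: only the precomputed indices are used.
        centers = [xs[i] for i in self.init_indices]
        for _ in range(self.max_iter):
            groups = [[] for _ in centers]
            for x in xs:
                nearest = min(
                    range(len(centers)), key=lambda j: abs(x - centers[j])
                )
                groups[nearest].append(x)
            # Keep the old center if a cluster ends up empty.
            centers = [
                sum(g) / len(g) if g else c for g, c in zip(groups, centers)
            ]
        self.centers = centers
        return self


# The model is built in the outer scope, so __init__ (and its randomness)
# runs once, before the function below is ever called.
model = TinyKMeans(n_clusters=2, n_samples=6, seed=1)


def proc(xs):
    # Stand-in for the function passed to the runtime: no random calls inside.
    return sorted(model.fit(xs).centers)


data = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
first = proc(data)
second = proc(data)
```

Because `proc` contains no random calls, calling it twice on the same data gives the same centers; only the constructor depends on the seed.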
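Thank you very much for the advice; it corrected my earlier misunderstanding (I had assumed SPU would automatically move the work in __init__ so that it runs before the SPU runtime 😓) |
All random number generation has now been moved into the __init__ function, and the init_params parameter has been removed, to stay consistent with sklearn. In the tests and the emulation, both the random and the kmeans++ cases now initialize the model before the SPU runtime. |
The problem reported by black does not seem to be caused by my part of the code? |
Please merge main :P |
Thanks, the problem is solved |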
LGTM
What problem does this PR solve?
Implement some functions of Kmeans.
Add kmeans++ for center initialization.
Support executing Kmeans with multiple initial centers and using the best result.
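As a rough illustration of what these two features mean, here is a library-free sketch; it is not the code in this PR, which is written in JAX and runs under SPU. The function names `kmeans_plus_plus`, `lloyd`, and `fit_best_of` are hypothetical, "best" is assumed to mean lowest inertia (sum of squared distances to the nearest center), and the sample points are the ones from the emulation snippet in this thread. The seeding step assumes the data has at least k distinct points.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_plus_plus(points, k, rng):
    """k-means++ seeding: first center uniform, each further center drawn
    with probability proportional to squared distance to the nearest
    already-chosen center. Assumes at least k distinct points."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def lloyd(points, centers, max_iter):
    """Plain Lloyd iterations; returns (centers, inertia)."""
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # Mean of each group; an empty group keeps its previous center.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    inertia = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, inertia


def fit_best_of(points, k, n_init=3, max_iter=10, seed=0):
    """Run n_init independently seeded fits and keep the lowest inertia."""
    rng = random.Random(seed)
    runs = [
        lloyd(points, kmeans_plus_plus(points, k, rng), max_iter)
        for _ in range(n_init)
    ]
    return min(runs, key=lambda run: run[1])


points = [(-4.0, -4.0), (-3.0, -3.0), (-2.0, -2.0), (-1.0, -1.0)]
centers, inertia = fit_best_of(points, k=2)
```

Running several initializations matters because a single Lloyd run can stop in a poorer local optimum; keeping the run with the lowest inertia is the same idea as the `n_init` parameter in sklearn's KMeans.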