
Commit b699796

committed
submit implementation
1 parent b264ca3 commit b699796

File tree

2 files changed

+130
-2
lines changed


README.md

Lines changed: 43 additions & 2 deletions
-# cd_algorithm
-Code repository for under review paper.

This repository hosts the official implementation of our ICONIP 2020 paper:

Learning Discrete Sentence Representations via Construction & Decomposition [Springer Link](https://link.springer.com/chapter/10.1007/978-3-030-63830-6_66).
## Abstract
In this paper, we address the problem of learning low-dimensional, discrete representations of real-valued vectors. We propose a new algorithm called similarity matrix construction and decomposition (C&D). In a preparation phase, we constructively generate a set of consistent, unbiased and comprehensive anchor vectors and obtain their low-dimensional forms with PCA. The C&D algorithm then learns discrete representations of vectors in batches. For a batch of input vectors, we first construct a similarity matrix between them and the anchor vectors, and then learn their discrete representations by decomposing this similarity matrix, where the low-dimensional forms of the anchor vectors are treated as a fixed factor of the decomposition. The decomposition is a mixed-integer optimization problem: we derive the optimal solution for each bit analytically and solve the problem with the discrete coordinate descent method. The C&D algorithm does not learn discrete representations directly from the input vectors, which distinguishes it from other discrete learning algorithms. We evaluate the C&D algorithm on sentence embedding compression tasks. Extensive experimental results show that the C&D algorithm outperforms four recent methods and achieves state-of-the-art performance. Detailed analysis and ablation studies further validate the rationality of the C&D algorithm.
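Read together with the `dcd` routine in `discrete_encoders.py` below, the decomposition step can be summarized as follows (a rough sketch in our own notation, inferred from the code rather than taken verbatim from the paper). Let $S \in \mathbb{R}^{n \times m}$ be the similarity matrix between the $n$ batch vectors and the $m$ anchor vectors, $V_p \in \mathbb{R}^{r \times m}$ the fixed low-dimensional anchor matrix, and $B \in \{-1,+1\}^{r \times n}$ the binary codes being learned:

$$\min_{B \in \{-1,+1\}^{r \times n}} \left\lVert S - B^{\top} V_p \right\rVert_F^2, \qquad b_i \leftarrow \operatorname{sign}\!\left( S v_i - B_{\setminus i}^{\top} V_{\setminus i}\, v_i \right),$$

where $b_i$ and $v_i$ denote the $i$-th rows of $B$ and $V_p$, $B_{\setminus i}$ and $V_{\setminus i}$ drop that row, and sign ties are broken towards $+1$. Discrete coordinate descent cycles this per-bit update over the $r$ rows until a full sweep leaves $B$ unchanged.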
## Usage
The experimental environment of the C&D algorithm is the same as that of [Shen et al.](https://github.com/Linear95/BinarySentEmb), so using this repository takes three simple steps (a minimal usage sketch follows the list):

1. Set up the experimental environment according to the [Shen et al.](https://github.com/Linear95/BinarySentEmb) instructions.
2. Add the `CDBinEncoder` class from `discrete_encoders.py` to the file of the same name in that [repository](https://github.com/Linear95/BinarySentEmb).
3. Modify `evaluate.py` in the [repository](https://github.com/Linear95/BinarySentEmb) to evaluate the C&D algorithm.
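Once the `CDBinEncoder` class is in place, a minimal call sequence looks roughly like the sketch below. It uses only the class added in this commit; the batch size and the dimensions `g` and `r` are placeholder values rather than the paper's settings, and a CUDA-capable GPU is required because the encoder keeps all tensors on the GPU.

```python
# Hypothetical usage sketch: encode a batch of sentence embeddings with CDBinEncoder.
# The sizes below are illustrative placeholders, not the paper's settings.
import numpy as np
from discrete_encoders import CDBinEncoder

embeddings = np.random.randn(128, 300).astype(np.float32)  # stand-in for real sentence embeddings
encoder = CDBinEncoder(g=300, r=64)                         # g: input dimension, r: code dimension
codes = encoder.encode(embeddings)                          # shape (128, 64), entries in {-1.0, +1.0}
print(codes.shape, np.unique(codes))
```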
## Citation
If you find our work or code useful in your research, please consider citing:
```bibtex
@inproceedings{DBLP:conf/iconip/SongZL20,
  author    = {Haohao Song and
               Dongsheng Zou and
               Weijia Li},
  editor    = {Haiqin Yang and
               Kitsuchart Pasupa and
               Andrew Chi{-}Sing Leung and
               James T. Kwok and
               Jonathan H. Chan and
               Irwin King},
  title     = {Learning Discrete Sentence Representations via Construction {\&} Decomposition},
  booktitle = {Neural Information Processing - 27th International Conference, {ICONIP} 2020, Bangkok, Thailand, November 23-27, 2020, Proceedings, Part {I}},
  series    = {Lecture Notes in Computer Science},
  volume    = {12532},
  pages     = {786--798},
  publisher = {Springer},
  year      = {2020},
  url       = {https://doi.org/10.1007/978-3-030-63830-6\_66},
  doi       = {10.1007/978-3-030-63830-6\_66},
  timestamp = {Fri, 20 Nov 2020 12:41:31 +0100}
}
```
If you have any questions, please contact me via an issue or by [email](mailto:songhaohao2018@cqu.edu.cn).

discrete_encoders.py

Lines changed: 87 additions & 0 deletions
import numpy as np
import torch


class CDBinEncoder():
    def __init__(self, g, r):  # g is the original input dimension, and r is the target code dimension
        super().__init__()

        self.fix_seed(37)

        print('initializing parameters ...')
        self.g = g
        self.r = r

        # Anchor vectors: a fixed set of 5*g random, zero-mean anchors in the input space.
        self.V = torch.from_numpy(self.generate_V(g, g * 5)).float().cuda()
        self.normed_V = (self.V / torch.norm(self.V, dim=0).unsqueeze(0)).cuda()

        # Projection obtained from the SVD of the anchor matrix (PCA-style reduction to r dimensions).
        self.P = self.generate_P_svd(self.V, r).float().cuda()

        # Low-dimensional anchor matrix V_p (r x 5g) and its pseudo-inverse,
        # used to initialize the codes before discrete coordinate descent.
        self.V_p = (self.P @ self.V * np.sqrt(r)).float().cuda()
        self.inverse_V_p = torch.pinverse(self.V_p).float().cuda()
    def fix_seed(self, seed):
        # Fix all random seeds for reproducibility.
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed(seed)
    def generate_V(self, num_rows, num_cols):
        # Draw anchor vectors from a zero-mean Gaussian (Glorot-style scale),
        # then subtract each column's mean so every anchor (column) is zero-mean.
        limit = np.sqrt(2. / (num_rows + num_cols))
        random_matrix = np.random.normal(loc=0.0, scale=limit, size=(num_rows, num_cols))

        emb_mean = np.mean(random_matrix, axis=0)[None, :]
        random_matrix -= emb_mean

        return random_matrix
    def generate_P_svd(self, V, r):
        # Projection from the SVD of the anchor matrix: the first r rows of U
        # are used as the r x g projection matrix.
        u, sigma, v = torch.svd(V)
        return u[:r, :]

    def generate_P(self, g, r):
        # Alternative (unused here) random projection of size r x g with orthonormal rows.
        limit = np.sqrt(6. / (g + r))
        random_matrix = np.random.uniform(low=-limit, high=limit, size=(g, r))

        u, sigma, v = np.linalg.svd(random_matrix)

        return u[:r, :]
    def dcd(self, S, U, V):
        # Discrete coordinate descent: given the similarity matrix S (batch x anchors),
        # the code matrix U (r x batch) and the low-dimensional anchors V (r x anchors),
        # update one code bit (row of U) at a time while the other rows stay fixed,
        # until a full sweep makes no change.
        L = U.shape[0]
        Q = (V @ S.t()).cuda()

        while True:
            is_update = False
            for i in range(L):
                # All code rows except the i-th one.
                U_b_prime = torch.cat((U[:i, :], U[i + 1:, :]))

                # The i-th low-dimensional anchor row and the remaining rows.
                v_p = V[i, :]
                V_p_prime = torch.cat((V[:i, :], V[i + 1:, :]))

                q = Q[i, :]

                # Closed-form optimum of the i-th bit row with the other rows fixed.
                bracket_result = (q - U_b_prime.t() @ V_p_prime @ v_p).cuda()

                new_u = bracket_result.sign().cuda()
                new_u[torch.eq(new_u, 0.)] = 1.  # break sign ties towards +1

                if torch.all(torch.eq(new_u, U[i, :])):
                    continue
                U[i, :] = new_u
                is_update = True

            if not is_update:
                break

        # Return the codes as a (batch x r) numpy array with entries in {-1, +1}.
        return U.t().cpu().numpy()
    def encode(self, X):
        # X: (batch x g) numpy array of real-valued input vectors (e.g. sentence embeddings).
        X = torch.from_numpy(X).float().cuda()  # cast to float32 to match the anchor tensors

        normed_X = (X / torch.norm(X, dim=1).unsqueeze(1)).cuda()

        # Similarity matrix between the batch and the anchor vectors, scaled by r.
        S = (normed_X @ self.normed_V * self.r).cuda()

        # Continuous initialization of the codes via the pseudo-inverse of V_p.
        X_small_code = (S @ self.inverse_V_p).cuda()

        # Refine to discrete {-1, +1} codes with discrete coordinate descent.
        return self.dcd(S, X_small_code.t(), self.V_p)
