# Manipulating embeddings with obfuscations

## Description

This codebase provides techniques for creating robust embeddings under
various obfuscations. The code provided here robustifies the embeddings
themselves, without fine-tuning the rest of the model. The intent is to
train models that are robust to obfuscations, without the need to retrain
a very large architecture from scratch.

The approach taken in this repository is to generate extra obfuscated
embeddings. These embeddings are trained to mimic the real obfuscated
embeddings of each image, for any given obfuscation type. The generated
embeddings are then used as additional training data for a downstream
classifier; modeling them is intended to help the classifier later handle
images under unseen obfuscations.
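The augmentation step can be sketched as follows. This is a minimal NumPy illustration with hypothetical sizes and a noisy stand-in for the trained generator; the repository's actual code is TensorFlow-based and the generator is a trained model, not additive noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 100 images, 8-dim embeddings, 3 obfuscation types.
n_images, embed_dim, n_obf = 100, 8, 3

clean_embeds = rng.normal(size=(n_images, embed_dim))
labels = rng.integers(0, 10, size=n_images)

def generate_obfuscated(embeds):
    """Stand-in for a trained generator: one synthetic obfuscated
    embedding per obfuscation type for every clean embedding."""
    noise = rng.normal(scale=0.1, size=(embeds.shape[0], n_obf, embeds.shape[1]))
    return embeds[:, None, :] + noise  # (n_images, n_obf, embed_dim)

obf_embeds = generate_obfuscated(clean_embeds)

# Use the generated embeddings as extra rows in the classifier's training
# set; each synthetic embedding keeps the label of its source image.
train_x = np.concatenate([clean_embeds, obf_embeds.reshape(-1, embed_dim)])
train_y = np.concatenate([labels, np.repeat(labels, n_obf)])
```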

## Methods

The files in this repository cover two basic methods:

- ```multiple_decoders.py```: Trains a model with an autoencoder-style
architecture, with one decoder per obfuscation type. The model receives a
clean embedding as input and generates a corresponding obfuscated embedding
for each obfuscation type. This makes the model more flexible, since a
separate portion of it is dedicated to each obfuscation type.
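A minimal sketch of the shared-encoder, per-obfuscation-decoder layout, using plain NumPy linear maps with hypothetical dimensions; the real model is more elaborate than single linear layers:

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim, latent_dim, n_obf = 8, 4, 3

# Shared encoder plus one decoder weight matrix per obfuscation type.
encoder_w = rng.normal(size=(embed_dim, latent_dim))
decoder_ws = [rng.normal(size=(latent_dim, embed_dim)) for _ in range(n_obf)]

def forward(clean_embed):
    latent = clean_embed @ encoder_w  # shared encoding of the clean embedding
    # One predicted obfuscated embedding per decoder / obfuscation type.
    return [latent @ w for w in decoder_ws]

preds = forward(rng.normal(size=(embed_dim,)))
```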

- ```parameter_generator.py```: Trains a model with an autoencoder-style
architecture in which the decoder itself is not trained; instead, its
parameters are produced by a separate architecture, which is trained. This
parameter generator receives the obfuscation type as input and outputs the
parameters of the decoder corresponding to each seen obfuscation.
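The same idea can be sketched with a linear parameter generator. All names and dimensions here are hypothetical, and the real architecture is richer than single linear maps:

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim, latent_dim, n_obf = 8, 4, 3

encoder_w = rng.normal(size=(embed_dim, latent_dim))
# Parameter generator: maps a one-hot obfuscation id to the flattened
# weights of a linear decoder. Only this matrix (and the encoder) would
# be trained; the decoder holds no trained parameters of its own.
param_gen_w = rng.normal(size=(n_obf, latent_dim * embed_dim))

def forward(clean_embed, obf_id):
    latent = clean_embed @ encoder_w
    one_hot = np.eye(n_obf)[obf_id]
    decoder_w = (one_hot @ param_gen_w).reshape(latent_dim, embed_dim)
    return latent @ decoder_w  # predicted embedding under obfuscation obf_id

pred = forward(rng.normal(size=(embed_dim,)), obf_id=1)
```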

Finally, ```linear_finetuning.py``` is provided, which trains only a linear
classifier on top of frozen embeddings.
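A minimal sketch of linear fine-tuning on frozen embeddings, assuming a plain softmax classifier trained by gradient descent in NumPy; sizes are hypothetical and this is not the repository's actual training loop:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_classes, lr = 200, 8, 5, 0.5

embeds = rng.normal(size=(n, d))            # frozen embeddings: never updated
labels = rng.integers(0, n_classes, size=n)
one_hot = np.eye(n_classes)[labels]

# Only the linear classifier's weights and bias are trained.
w = np.zeros((d, n_classes))
b = np.zeros(n_classes)

for _ in range(100):
    logits = embeds @ w + b
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    grad = probs - one_hot                  # softmax cross-entropy gradient
    w -= lr * embeds.T @ grad / n
    b -= lr * grad.mean(axis=0)
```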

## Auxiliary files

- ```configs.py```: File containing metadata for the dataset and the models
used.

- ```extended_model.py```: File containing architecture definitions for our
models.

- ```losses.py```: File containing the losses for our models.

- ```obfuscations.py```: File containing definitions for the datasets that
we use.

## Data required

The provided code can receive data in two formats for the parameter
```data_dir_train``` (directory of data to be used during training):

- In the case of ```input_feature_name==pixel```, the data is assumed to be
in the format of ```tf.train.Example``` protos, where each example has a key
named ```label``` and one key of the form ```image_{obf}``` for each
obfuscation ```obf``` in the set of valid obfuscations.

- In the case of ```input_feature_name==embed```, the data is assumed to be
in the format of ```tf.train.Example``` protos, with a key named ```label```
containing the label of the image and a key named ```embed``` containing a
matrix of size $N \times d$, where $N$ is the number of obfuscations and
$d$ is the dimension of the embedding.
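For illustration, a record in the ```embed``` format can be pictured as follows; this is a hypothetical in-memory stand-in, not actual ```tf.train.Example``` parsing code:

```python
import numpy as np

n_obf, embed_dim = 4, 16  # hypothetical N and d

# One record: a scalar label plus an N x d matrix with one embedding
# row per obfuscation type.
record = {
    "label": 7,
    "embed": np.zeros((n_obf, embed_dim), dtype=np.float32),
}

def embedding_for(record, obf_index):
    # Row i of the matrix is the embedding of this image under obfuscation i.
    return record["embed"][obf_index]
```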

Contributor: Georgios Smyrnis