
Commit c70578c

Updated configs and README
1 parent: febe9c8 · commit: c70578c

3 files changed: +66 −56 lines

README.md

Lines changed: 27 additions & 55 deletions
````diff
@@ -7,82 +7,54 @@ Based on ["FixMatch: Simplifying Semi-Supervised Learning with Consistency and Co
 ## Requirements
 
 ```bash
-pip install --upgrade --pre hydra-core
+pip install --upgrade --pre hydra-core tensorboardX
 pip install --upgrade --pre pytorch-ignite
 ```
 
 ## Training
 
 ```bash
-python -u main_fixmatch.py
-# or python -u main_fixmatch.py --params "data_path=/path/to/cifar10"
+python -u main_fixmatch.py model=WRN-28-2
 ```
 
 This script automatically trains on multiple GPUs (`torch.nn.DistributedParallel`).
 
-### Distributed Data Parallel (DDP) on multiple GPUs (Experimental)
+If you need to specify the input/output folders:
+```
+python -u main_fixmatch.py dataflow.data_path=/data/cifar10/ hydra.run.dir=/output-fixmatch model=WRN-28-2
+```
 
-For example, training on 2 GPUs
+To use the wandb logger, log in first and run with `online_exp_tracking.wandb=true`:
 ```bash
-python -u -m torch.distributed.launch --nproc_per_node=2 main_fixmatch.py --params="distributed=True"
+wandb login <token>
+python -u main_fixmatch.py model=WRN-28-2 online_exp_tracking.wandb=true
 ```
 
-### TPU(s) on Colab (Experimental)
-
-#### Installation
+To see other options:
 ```bash
-VERSION = "1.5"
-!curl https://raw.githubusercontent.com/pytorch/xla/master/contrib/scripts/env-setup.py -o pytorch-xla-env-setup.py
-!python pytorch-xla-env-setup.py --version $VERSION
+python -u main_fixmatch.py --help
 ```
 
-#### Single TPU
+### Training curves visualization
+
+By default, we use TensorBoard to log training curves:
+
 ```bash
-python -u main_fixmatch.py --params="device='xla'"
+tensorboard --logdir=/tmp/output-fixmatch-cifar10-hydra/
 ```
 
-#### 8 TPUs on Colab
 
+### Distributed Data Parallel (DDP) on multiple GPUs (Experimental)
+
+For example, training on 2 GPUs:
 ```bash
-python -u main_fixmatch.py --params="device='xla';distributed=True"
+python -u -m torch.distributed.launch --nproc_per_node=2 main_fixmatch.py model=WRN-28-2 distributed.backend=nccl
 ```
 
-## TODO
-
-* [x] Resume training from existing checkpoint:
-  * [x] save/load CTA
-  * [x] save ema model
-
-* [ ] DDP:
-  * [x] Synchronize CTA across processes
-  * [x] Unified GPU and TPU approach
-  * [ ] Bug: DDP performances are worse than DP on the first epochs
-
-* [ ] Logging to an online platform: NeptuneML or Trains or W&B
-
-* [ ] Replace PIL augmentations with Albumentations
-
-```python
-class BlurLimitSampler:
-    def __init__(self, blur, weights):
-        self.blur = blur  # [3, 5, 7]
-        self.weights = weights  # [0.1, 0.5, 0.4]
-    def get_params(self):
-        return {"ksize": int(random.choice(self.blur, p=self.weights))}
-
-class Blur(ImageOnlyTransform):
-    def __init__(self, blur_limit, always_apply=False, p=0.5):
-        super(Blur, self).__init__(always_apply, p)
-        self.blur_limit = blur_limit
-
-    def apply(self, image, ksize=3, **params):
-        return F.blur(image, ksize)
-
-    def get_params(self):
-        if isinstance(self.blur_limit, BlurLimitSampler):
-            return self.blur_limit.get_params()
-        return {"ksize": int(random.choice(np.arange(self.blur_limit[0], self.blur_limit[1] + 1, 2)))}
-
-    def get_transform_init_args_names(self):
-        return ("blur_limit",)
-```
+### TPU(s) on Colab (Experimental)
+
+#### 8 TPUs on Colab
+
+```bash
+python -u main_fixmatch.py model=WRN-28-2 distributed.backend=xla-tpu
+```
````
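The new command line replaces the old `--params` string with Hydra overrides (`model=WRN-28-2`, `dataflow.data_path=...`, `distributed.backend=...`). As a rough sketch, such overrides assume a Hydra 1.0-style entry point along these lines; the `config_path`/`config_name` values and config group names are illustrative assumptions, not taken from the repo:

```python
# Hypothetical sketch of a Hydra entry point consistent with the CLI shown above.
# Config group names ("model", "dataflow", "distributed") are assumptions.
import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(config_path="config", config_name="config")
def main(cfg: DictConfig) -> None:
    # Any field of the composed config can be overridden from the command line,
    # e.g. `python -u main_fixmatch.py model=WRN-28-2 dataflow.data_path=/data/cifar10/`.
    print(OmegaConf.to_yaml(cfg))
    # train(cfg)  # the training loop would consume cfg.model, cfg.dataflow, cfg.solver, ...


if __name__ == "__main__":
    main()
```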

TODO

Lines changed: 38 additions & 0 deletions
## TODO

* [x] Resume training from existing checkpoint:
  * [x] save/load CTA
  * [x] save ema model

* [ ] DDP:
  * [x] Synchronize CTA across processes
  * [ ] Bug: DDP performances are worse than DP on the first epochs

* [x] Logging to an online platform: W&B

* [ ] Replace PIL augmentations with Albumentations

```python
# Sketch of an Albumentations-style blur transform with a weighted kernel-size sampler.
# Imports added for completeness; the functional blur helper's location may differ
# across albumentations versions (assumption).
import numpy as np
from albumentations.core.transforms_interface import ImageOnlyTransform
from albumentations.augmentations import functional as F


class BlurLimitSampler:
    def __init__(self, blur, weights):
        self.blur = blur        # e.g. [3, 5, 7]
        self.weights = weights  # e.g. [0.1, 0.5, 0.4]

    def get_params(self):
        # np.random.choice (not random.choice) is needed to honor the weights `p`.
        return {"ksize": int(np.random.choice(self.blur, p=self.weights))}


class Blur(ImageOnlyTransform):
    def __init__(self, blur_limit, always_apply=False, p=0.5):
        super(Blur, self).__init__(always_apply, p)
        self.blur_limit = blur_limit

    def apply(self, image, ksize=3, **params):
        return F.blur(image, ksize)

    def get_params(self):
        if isinstance(self.blur_limit, BlurLimitSampler):
            return self.blur_limit.get_params()
        # Otherwise treat blur_limit as an inclusive (min, max) range of odd kernel sizes.
        return {"ksize": int(np.random.choice(np.arange(self.blur_limit[0], self.blur_limit[1] + 1, 2)))}

    def get_transform_init_args_names(self):
        return ("blur_limit",)
```
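For context, a minimal usage sketch of how the proposed transform could slot into an albumentations pipeline; the pipeline composition and dummy image are illustrative and not part of the commit:

```python
# Illustrative usage of the Blur / BlurLimitSampler classes defined above.
import numpy as np
import albumentations as A

# Sample a kernel size of 3, 5 or 7 with probabilities 0.1 / 0.5 / 0.4.
sampler = BlurLimitSampler(blur=[3, 5, 7], weights=[0.1, 0.5, 0.4])

transform = A.Compose([
    Blur(blur_limit=sampler, p=0.5),  # the custom Blur above, not albumentations' built-in A.Blur
    A.HorizontalFlip(p=0.5),
])

image = np.random.randint(0, 255, size=(32, 32, 3), dtype=np.uint8)  # dummy CIFAR-sized image
augmented = transform(image=image)["image"]
```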

config/solver/default.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -13,7 +13,7 @@ resume_from: null
 optimizer:
   cls: torch.optim.SGD
   params:
-    lr: 0.01
+    lr: 0.03
     momentum: 0.9
     weight_decay: 0.0001
     nesterov: false
```
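The `optimizer` block follows a `cls`-plus-`params` convention. A rough sketch of how such a block could be resolved into an optimizer instance; the helper below is illustrative, not the repo's actual code:

```python
# Hypothetical helper: build an optimizer from a {"cls": "...", "params": {...}} config block.
import importlib

import torch.nn as nn


def build_optimizer(cfg: dict, parameters):
    # Resolve the dotted path "torch.optim.SGD" into the SGD class, then
    # instantiate it with the model parameters and the configured keyword arguments.
    module_name, class_name = cfg["cls"].rsplit(".", 1)
    optim_cls = getattr(importlib.import_module(module_name), class_name)
    return optim_cls(parameters, **cfg["params"])


model = nn.Linear(10, 10)
optimizer = build_optimizer(
    {"cls": "torch.optim.SGD",
     "params": {"lr": 0.03, "momentum": 0.9, "weight_decay": 0.0001, "nesterov": False}},
    model.parameters(),
)
print(optimizer)
```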
