Some FL experiment (learning) parameters not propagated from config file #38

Closed
AbeleMM opened this issue May 21, 2022 · 4 comments · Fixed by #39

AbeleMM commented May 21, 2022

Bug Report

Current Behavior
The values of some learning parameters (e.g., clients per round and epochs) provided in an experiment's config are seemingly not propagated correctly to the orchestrator (and, subsequently, to the federator). Instead, the defaults from FedLearningConfig in fltk/util/learning_config.py (e.g., clients_per_round: int = 2 and epochs: int = 1) always appear to be used. The issue might also affect other parameters, although I have not tested all of them.
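
For illustration, here is a minimal sketch of the suspected failure mode. The two field defaults are taken from fltk/util/learning_config.py, but the loader function is hypothetical and not the actual fltk parsing code:

```python
from dataclasses import dataclass, fields


@dataclass
class FedLearningConfig:
    # Defaults as in fltk/util/learning_config.py
    clients_per_round: int = 2
    epochs: int = 1


def load_learning_params(raw: dict) -> FedLearningConfig:
    # Hypothetical loader: if the camelCase keys from the JSON config are never
    # mapped onto the snake_case dataclass fields, nothing is passed to the
    # constructor and the defaults above are silently used.
    known = {f.name for f in fields(FedLearningConfig)}
    return FedLearningConfig(**{k: v for k, v in raw.items() if k in known})


raw = {"clientsPerRound": 1, "epochsPerRound": 3}  # keys as in example_arrival_config.json
cfg = load_learning_params(raw)
print(cfg.clients_per_round, cfg.epochs)  # prints "2 1": the defaults, not the configured values
```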

Input Code
Given configs/federated_tasks/example_arrival_config.json:

[
  {
    "type": "federated",
    "jobClassParameters": {
      "networkConfiguration": {
        "network": "FashionMNISTCNN",
        "lossFunction": "CrossEntropyLoss",
        "dataset": "mnist"
      },
      "systemParameters": {
        "dataParallelism": null,
        "configurations": {
          "Master": {
            "cores": "1000m",
            "memory": "1Gi"
          },
          "Worker": {
            "cores": "750m",
            "memory": "1Gi"
          }
        }
      },
      "hyperParameters": {
        "default": {
          "batchSize": 128,
          "testBatchSize": 128,
          "learningRateDecay": 0.0002,
          "optimizerConfig": {
            "type": "SGD",
            "learningRate": 0.01,
            "momentum": 0.1
          },
          "schedulerConfig": {
            "schedulerStepSize": 50,
            "schedulerGamma": 0.5,
            "minimumLearningRate": 1e-10
          }
        },
        "configurations": {
          "Master": null,
          "Worker": {
            "batchSize": 500,
            "optimizerConfig": {
              "learningRate": 0.05
            },
            "schedulerConfig": {
              "schedulerStepSize": 2000
            }
          }
        }
      },
      "learningParameters": {
        "totalEpochs": 5,
        "rounds": 1,
        "epochsPerRound": 3,
        "cuda": false,
        "clientsPerRound": 1,
        "dataSampler": {
          "type": "uniform",
          "qValue": 0.07,
          "seed": 42,
          "shuffle": true
        },
        "aggregation": "FedAvg"
      },
      "experimentConfiguration": {
        "randomSeed": [
          89
        ],
        "workerReplication": {
          "Master": 1,
          "Worker": 1
        }
      }
    }
  }
]

Run helm install flearner charts/orchestrator --namespace test -f charts/fltk-values-abel.yaml --set-file orchestrator.experiment=./configs/federated_tasks/example_arrival_config.json,orchestrator.configuration=./configs/example_cloud_experiment.json

Expected behavior/code
The values from the given config should be correctly reflected in the config_dict of fltk/core/distributed/orchestrator.py and in self.config of fltk/core/federator.py after their initialization.
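
As a quick sanity check, the learning parameters below (read straight from the JSON file, not through fltk's parsing code) are what the orchestrator and federator should end up holding:

```python
import json

# Print the learning parameters from the example config for manual comparison;
# this only inspects the JSON file, not fltk's internal configuration objects.
with open("configs/federated_tasks/example_arrival_config.json") as fp:
    job = json.load(fp)[0]

# These values (e.g. clientsPerRound=1, epochsPerRound=3, rounds=1) should show
# up in the orchestrator's config_dict and in the federator's self.config.
print(job["jobClassParameters"]["learningParameters"])
```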

JMGaljaard (Owner) commented

Hi @AbeleMM, thank you for the report; indeed, this should not be the case. I have created a branch, 38-loading-configuration-parameters, which you can pull and use to resolve the issue.

Note, however, that there may still be some issues, as I am busy writing a test suite for configuration object parsing.

AbeleMM (Author) commented May 21, 2022

Thanks for looking into it!

JMGaljaard (Owner) commented

@AbeleMM It should be fully resolved now. In addition, losses are now parsed properly, and a typo that broke instantiation has been fixed.

I have also added an (admittedly somewhat hacky) test case for both data_parallel and federated learning experiments; a rough sketch of the federated check is included below.

Note that the Jinja templates require some changes as well.
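
Roughly, the federated part of that check looks like this (path and key names follow the example config from this issue; the real test uses the project's own configuration loaders, so treat this as an approximation):

```python
import json

import pytest

CONFIGS = [
    "configs/federated_tasks/example_arrival_config.json",
    # a data_parallel example config would be listed here as well
]


@pytest.mark.parametrize("path", CONFIGS)
def test_learning_parameters_come_from_file(path):
    # The values asserted here come from the config file and must be carried
    # through to the orchestrator/federator instead of falling back to the
    # FedLearningConfig defaults.
    with open(path) as fp:
        learning = json.load(fp)[0]["jobClassParameters"]["learningParameters"]
    assert learning["clientsPerRound"] == 1
    assert learning["epochsPerRound"] == 3
```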

JMGaljaard linked a pull request on May 24, 2022 that will close this issue.
AbeleMM (Author) commented May 24, 2022

Got it. Thanks for the update!

AbeleMM closed this as completed on May 24, 2022.