
Commit 9261e7d

yiheng-wang-nv and wyli authored
Add FL example with nvflare (#189)
* Add flare based trainers
  Signed-off-by: Yiheng Wang <vennw@nvidia.com>
* Refactor the whole pipeline
* Rename fl example
* Update nvflare example
  Signed-off-by: Yiheng Wang <vennw@nvidia.com>
* Fix pep8 errors
  Signed-off-by: Yiheng Wang <vennw@nvidia.com>
* Specify nvflare version
  Signed-off-by: Yiheng Wang <vennw@nvidia.com>
* Update copyrights and docker info
  Signed-off-by: Yiheng Wang <vennw@nvidia.com>
* Use fixed monai docker version and add new lines
  Signed-off-by: Yiheng Wang <vennw@nvidia.com>
* fixes readme
  Signed-off-by: Wenqi Li <wenqil@nvidia.com>
* update readme
  Signed-off-by: Wenqi Li <wenqil@nvidia.com>
* by default exclude the federated learning related (often requires multiple nodes)
  Signed-off-by: Wenqi Li <wenqil@nvidia.com>
* fixes a link
  Signed-off-by: Wenqi Li <wenqil@nvidia.com>

Co-authored-by: Wenqi Li <wenqil@nvidia.com>
1 parent 316e530 commit 9261e7d

25 files changed, with 2391 additions and 4 deletions

README.md

Lines changed: 3 additions & 0 deletions
@@ -128,6 +128,9 @@ This is a simple example of training and deploying a MONAI network with [BentoML
 This uses the previous notebook's trained network to demonstrate deployment with a web server using [Ray](https://docs.ray.io/en/master/serve/index.html#rayserve).

 **federated learning**
+#### [NVFlare](./federated_learning/nvflare)
+This example shows how to train a federated learning model with [NVFlare](https://pypi.org/project/nvflare/) and the MONAI trainers.
+
 #### [Substra](./federated_learning/substra)
 This example shows how to execute the 3d segmentation torch tutorial on a federated learning platform, Substra.

Lines changed: 255 additions & 0 deletions
@@ -0,0 +1,255 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Introduction"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the `Provision Package Preparation` step of the README, we created `audit.pkl` and zip files for all the provisioned parties (server, clients, and admins) in `expr_files/`. The zip files are encrypted, and their passwords are saved in `audit.pkl`.\n",
+    "\n",
+    "In an experiment, you need to send the decrypted folders to each site so that each one can run its part on its own system. In this notebook, we therefore decrypt the folders for all the provisioned parties."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import shutil\n",
+    "from zipfile import ZipFile\n",
+    "import pickle\n",
+    "import os"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['project.yml',\n",
+       " 'prerpare_expr_files.sh',\n",
+       " 'researcher@nvidia.com.zip',\n",
+       " 'download_dataset.py',\n",
+       " 'authz_config.json',\n",
+       " 'org1-b.zip',\n",
+       " 'researcher@org2.com.zip',\n",
+       " 'admin@nvidia.com.zip',\n",
+       " 'org1-a.zip',\n",
+       " 'audit.pkl',\n",
+       " 'server.zip',\n",
+       " 'researcher@org1.com.zip',\n",
+       " 'org2.zip',\n",
+       " 'it@org2.com.zip']"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "os.listdir(\"expr_files/\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this example, `server.zip` will be used to create the server, `org1-a.zip` and `org1-b.zip` will be used to create two clients, and `admin@nvidia.com.zip` will be used to create an admin client to operate the FL experiment.\n",
+    "\n",
+    "First, unzip all the packages with the following code:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "lines_to_next_cell": 2
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "demo_workspace created!\n",
+      "unzip: server finished.\n",
+      "unzip: admin@nvidia.com finished.\n",
+      "unzip: researcher@nvidia.com finished.\n",
+      "unzip: researcher@org1.com finished.\n",
+      "unzip: researcher@org2.com finished.\n",
+      "unzip: it@org2.com finished.\n",
+      "unzip: org1-a finished.\n",
+      "unzip: org1-b finished.\n",
+      "unzip: org2 finished.\n"
+     ]
+    }
+   ],
+   "source": [
+    "startup_path = \"expr_files\"  # this is the path that contains `audit.pkl` and zip files\n",
+    "workspace = \"demo_workspace\"  # this is the folder that will be created to contain all experiment related files\n",
+    "\n",
+    "if not os.path.exists(workspace):\n",
+    "    os.makedirs(workspace)\n",
+    "    print(workspace, \"created!\")\n",
+    "\n",
+    "admin_name = \"admin@nvidia.com\"\n",
+    "client_name_1 = \"org1-a\"\n",
+    "client_name_2 = \"org1-b\"\n",
+    "server_name = \"server\"\n",
+    "\n",
+    "# access the audit file to get the passwords for unzipping packages\n",
+    "with open(os.path.join(startup_path, \"audit.pkl\"), 'rb') as handle:\n",
+    "    audit_file = pickle.load(handle)\n",
+    "\n",
+    "proj_name = list(audit_file.keys())[0]\n",
+    "pw_key = \"zip_pw\"\n",
+    "server_folder_list = [\"server\"]\n",
+    "client_folder_list = [\"admin_clients\", \"fl_clients\"]\n",
+    "\n",
+    "folder_pwd_dict = {}\n",
+    "for obj in server_folder_list:\n",
+    "    unzip_pw = audit_file[proj_name][obj][pw_key]\n",
+    "    folder_pwd_dict[obj] = unzip_pw\n",
+    "\n",
+    "for obj in client_folder_list:\n",
+    "    obj_sub_dict = audit_file[proj_name][obj]\n",
+    "    for client in obj_sub_dict.keys():\n",
+    "        unzip_pw = obj_sub_dict[client][pw_key]\n",
+    "        folder_pwd_dict[client] = unzip_pw\n",
+    "\n",
+    "# unzip all folders into workspace\n",
+    "for name, pwd in folder_pwd_dict.items():\n",
+    "    zip_file_path = os.path.join(startup_path, name + \".zip\")\n",
+    "    dst_file_path = os.path.join(workspace, name)\n",
+    "    if not os.path.exists(dst_file_path):\n",
+    "        os.makedirs(dst_file_path)\n",
+    "    with ZipFile(zip_file_path, 'r') as zip_ref:\n",
+    "        zip_ref.extractall(path=dst_file_path, pwd=bytes(pwd, 'utf-8'))\n",
+    "    # make the startup scripts executable\n",
+    "    if \".com\" in name:\n",
+    "        sub_file_list = [\"docker.sh\", \"fl_admin.sh\"]\n",
+    "    else:\n",
+    "        sub_file_list = [\"start.sh\", \"sub_start.sh\", \"docker.sh\"]\n",
+    "    for file in sub_file_list:\n",
+    "        os.chmod(os.path.join(dst_file_path, \"startup\", file), 0o755)\n",
+    "    print(\"unzip: {} finished.\".format(name))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['org1-b',\n",
+       " 'researcher@nvidia.com',\n",
+       " 'server',\n",
+       " 'admin@nvidia.com',\n",
+       " 'researcher@org2.com',\n",
+       " 'org1-a',\n",
+       " 'org2',\n",
+       " 'researcher@org1.com',\n",
+       " 'it@org2.com']"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# check the created workspace\n",
+    "os.listdir(workspace)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "With the default settings, the experiment-related config folder `spleen_example` should be copied into the `transfer` folder within the admin package:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "copied spleen_example into demo_workspace/admin@nvidia.com/transfer/.\n"
+     ]
+    }
+   ],
+   "source": [
+    "config_folder = \"spleen_example\"\n",
+    "\n",
+    "transfer_path = os.path.join(workspace, admin_name, \"transfer/\")\n",
+    "if not os.path.exists(transfer_path):\n",
+    "    os.makedirs(transfer_path)\n",
+    "shutil.copytree(config_folder, os.path.join(transfer_path, config_folder))\n",
+    "print(\"copied {} into {}.\".format(config_folder, transfer_path))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "All required files have now been created in the workspace. Before starting the Docker containers, we can update the ownership of these files:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!chown -R 1000:1000 demo_workspace/*"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Next Steps\n",
+    "\n",
+    "You have now finished unzipping the provisioning files and copying the experiment folder to the admin's transfer folder.\n",
+    "In the next notebook, [Server Startup Notebook](2-Server.ipynb), you will start the server container."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.5"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
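
Note on the unzip cell in the notebook added by this commit: its logic (read the passwords from the audit file, then extract each package) could also be written as two small helpers. The following is only a minimal sketch and is not part of the commit. The function names `collect_zip_passwords` and `extract_package`, the project name, and the password values in `example_audit` are hypothetical; the dictionary layout (`server`, `admin_clients`, `fl_clients`, `zip_pw`) is assumed from the keys the notebook reads.

```python
from pathlib import Path
from zipfile import ZipFile


def collect_zip_passwords(audit, pw_key="zip_pw"):
    """Flatten the audit dictionary into {package_name: password}."""
    # Like the notebook, use the first project key found in the audit file.
    project = audit[next(iter(audit))]
    passwords = {"server": project["server"][pw_key]}
    for group in ("admin_clients", "fl_clients"):
        for name, info in project[group].items():
            passwords[name] = info[pw_key]
    return passwords


def extract_package(zip_path, dest_dir, password):
    """Extract one password-protected package into dest_dir, creating it if needed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with ZipFile(zip_path) as archive:
        # ZipFile expects the password as bytes.
        archive.extractall(path=dest, pwd=password.encode("utf-8"))
    return dest


# Hypothetical audit content that mirrors the keys used in the notebook.
example_audit = {
    "example_project": {
        "server": {"zip_pw": "s3"},
        "admin_clients": {"admin@nvidia.com": {"zip_pw": "a1"}},
        "fl_clients": {"org1-a": {"zip_pw": "c1"}, "org1-b": {"zip_pw": "c2"}},
    }
}
passwords = collect_zip_passwords(example_audit)
print(sorted(passwords))
```

With the real `audit.pkl` loaded via `pickle.load`, each entry of the returned dictionary could be passed to `extract_package` together with the matching `<name>.zip` path from `expr_files/`. Using `mkdir(parents=True, exist_ok=True)` makes the extraction step safe to rerun without the separate `os.path.exists` check.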
