cczoo/vertical_fl/README.md
Lines changed: 90 additions & 62 deletions
@@ -3,35 +3,41 @@
 
 - Ubuntu 18.04. This solution should work on other Linux distributions as well, but for simplicity we provide the steps for Ubuntu 18.04 only.
 
-- Docker Engine. Docker Engine is an open source containerization technology for building and containerizing your applications. In this solution, Gramine, Fedlearner, gRPC will be built in Docker images. Please follow [this guide](https://docs.docker.com/engine/install/ubuntu/#install-using-the-convenience-script) to install Docker engine.
+- Docker Engine. Docker Engine is an open source containerization technology for building and containerizing your applications. In this solution, Gramine, Fedlearner, gRPC will be built in a Docker image. Please follow [this guide](https://docs.docker.com/engine/install/ubuntu/#install-using-the-convenience-script) to install Docker Engine. The Docker daemon's storage location (/var/lib/docker for example) should have at least 32GB available.
 
 - SGX capable platform. Intel SGX Driver and SDK/PSW. You need a machine that supports Intel SGX and FLC/DCAP. Please follow [this guide](https://download.01.org/intel-sgx/latest/linux-latest/docs/) to install the Intel SGX driver and SDK/PSW. One way to verify SGX enabling status in your machine is to run [QuoteGeneration](https://github.com/intel/SGXDataCenterAttestationPrimitives/blob/master/QuoteGeneration) and [QuoteVerification](https://github.com/intel/SGXDataCenterAttestationPrimitives/blob/master/QuoteVerification) successfully.
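
The 32GB free-space requirement for the Docker daemon's storage location can be checked before building the image. The following is a minimal shell sketch and is not part of the original solution: the helper names `free_gib` and `has_min_free_gib` and the `DOCKER_ROOT` variable are illustrative assumptions, and `/var/lib/docker` is only the example location named in the Docker Engine requirement. The check works in GiB, so it is slightly stricter than 32GB.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and DOCKER_ROOT are assumptions, not part of the solution.

# Print the free space, in whole GiB, on the filesystem that holds a directory.
free_gib() {
    # df -P prints one POSIX-format data line; column 4 is the available 1K blocks.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed only if the directory's filesystem has at least the requested GiB free.
has_min_free_gib() {
    local dir="$1" min_gib="$2"
    [ "$(free_gib "$dir")" -ge "$min_gib" ]
}

# /var/lib/docker is the example location; override DOCKER_ROOT if the daemon
# is configured with a different data directory.
docker_root="${DOCKER_ROOT:-/var/lib/docker}"
if [ -d "$docker_root" ]; then
    if has_min_free_gib "$docker_root" 32; then
        echo "OK: at least 32 GiB free under $docker_root"
    else
        echo "WARNING: less than 32 GiB free under $docker_root"
    fi
fi
```

If the directory does not exist on the machine, the script prints nothing, which usually means the daemon stores its data elsewhere.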
 
-Here, we will demonstrate how to run leader and follower from two containers.
-
+Here, we will demonstrate vertical federated learning using a leader container and a follower container.
 
 
 ## Executing Fedlearner in SGX
 
 ### 1. Download source code
 
+Download the [Fedlearner source code](https://github.com/bytedance/fedlearner/tree/fix_dev_sgx) which is a git submodule of CCZoo.
`build_dev_docker_image.sh` provides the parameter `proxy_server` to specify the network proxy. `build_dev_docker_image.sh` also accepts an optional argument to specify the docker image tag.
+
+For deployments on Microsoft Azure:
 ```
-img_tag=Your_defined_tag
-./sgx/build_dev_docker_image.sh ${img_tag}
+AZURE=1 ./sgx/build_dev_docker_image.sh
+```
+For other cloud deployments:
+```
+./sgx/build_dev_docker_image.sh
 ```
 
-*Note:*`build_dev_docker_image.sh` provides parameter `proxy_server` to help you set your network proxy. It can be removed from this script if it is not needed.
In terminal 2, enter the follower container shell:
+
+```
+docker exec -it fedlearner_follower bash
 ```
 
 #### 3.1 Configure PCCS
 
+- For deployments on Microsoft Azure, skip this section, as configuring the PCCS is not necessary on Azure.
+
 - If you are using public cloud instance, please replace the PCCS url in `/etc/sgx_default_qcnl.conf` with the new pccs url provided by the cloud.
 
 ```
@@ -84,15 +95,15 @@ docker run -it \
 
 #### 3.2 Start aesm service
 
-Execute below script in both leader and follower container:
+Start the aesm service in both the leader and follower containers:
 
 ```
 /root/start_aesm_service.sh
 ```
 
 #### 4. Prepare data
 
-Generate data in both leader and follower container:
+Generate data in both the leader and follower containers:
 
 ```
 cd /gramine/CI-Examples/wide_n_deep
@@ -101,14 +112,15 @@ cd /gramine/CI-Examples/wide_n_deep
 
 #### 5. Compile applications
 
-Compile applications in both leader and follower container:
+Compile applications in both the leader and follower containers:
 
 ```
 cd /gramine/CI-Examples/wide_n_deep
 ./test-ps-sgx.sh make
 ```
 
-Please find `mr_enclave`,`mr_signer` from the print log as below:
+Take note of the `mr_enclave` and `mr_signer` values from the resulting log from the leader container.
+The following is an example log:
 
 ```
 + make
@@ -121,7 +133,7 @@ Please find `mr_enclave`,`mr_signer` from the print log as below:
 isv_svn: 0
 ```
 
-Then, update the leader's `dynamic_config.json` under current folder with follower's `mr_enclave`,`mr_signer`. Also, update follower's `dynamic_config.json` with leader's `mr_enclave`,`mr_signer`.
+In both the leader and follower containers, in `dynamic_config.json`, confirm that `mr_enclave` and `mr_signer` are set to the values from the leader container's log. Use the actual values from the leader container's log, not the values from the example log above.
 
 ```
 dynamic_config.json:
@@ -140,60 +152,76 @@ dynamic_config.json:
 
 ```
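
The `mr_enclave` and `mr_signer` values can be copied by hand from the compile log, or pulled out with a small helper. The sketch below assumes the log has been saved to a file and that each value appears on its own `key: value` line, as the `isv_svn: 0` line in the example log suggests. The function name `extract_measurement` is hypothetical and is not part of Fedlearner or Gramine.

```shell
#!/usr/bin/env bash
# Sketch only: assumes "key: value" lines in a saved build log.

# Print the value of a "key: value" line (for example "mr_enclave: <hex>").
extract_measurement() {
    local key="$1" log_file="$2"
    # Use the last match in case the log contains more than one build.
    grep -E "^[[:space:]]*${key}:" "$log_file" | tail -n 1 | awk '{ print $2 }'
}

# Hypothetical usage: pass the saved build log as the first argument.
if [ "$#" -ge 1 ]; then
    echo "mr_enclave=$(extract_measurement mr_enclave "$1")"
    echo "mr_signer=$(extract_measurement mr_signer "$1")"
fi
```

For example, the compile step's output could be saved with `./test-ps-sgx.sh make 2>&1 | tee build.log` (the file name `build.log` is arbitrary) and the script run with `build.log` as its argument. The printed values still need to be compared against `dynamic_config.json` by hand.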
 
-#### 6. Config leader and follower's IP
+#### 6. Run the distributing training
 
-In leader's `test-ps-sgx.sh`, for `--peer-addr` , please replace `localhost` with `follower_contianer_ip`
+Start the training process in the follower container: