3. Convert to requested form (data type, format, order, etc.)

Existing data sources:
- Synthetic data from sklearn
- OpenML datasets
- Custom loaders for named datasets
- User-provided datasets in a compatible format
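
Whatever the source, the conversion step (step 3) could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `convert_to_requested_form` is a hypothetical helper, and a random NumPy array stands in for data returned by any of the sources listed above.

```python
import numpy as np


def convert_to_requested_form(x, dtype=np.float32, order="C"):
    """Hypothetical helper: cast to the requested dtype and memory order."""
    # np.asarray copies only when the dtype or layout actually has to change.
    return np.asarray(x, dtype=dtype, order=order)


# Stand-in for data returned by a source (float64, C-ordered by default).
rng = np.random.default_rng(0)
raw = rng.normal(size=(100, 4))

# Request single precision in column-major (Fortran) order.
x = convert_to_requested_form(raw, dtype=np.float32, order="F")
```

Keeping the conversion in one place means each data source only has to return arrays, while the requested type and layout are applied uniformly afterwards.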
## Data Caching
There are two levels of caching, each with its own directory: the `raw cache` for files downloaded from external sources, and the regular `cache` for files prepared for fast loading in benchmarks.

Each dataset has a few associated files in the regular `cache`: data component files (`x`, `y`, `weights`, etc.) and a JSON file with dataset properties (number of classes, clusters, default split arguments).
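
The fast-loading `cache` level described above might be sketched like this. It is only an illustration under stated assumptions: `save_to_cache` and `load_from_cache` are hypothetical names, the `<name>_<component>.npy` naming scheme and the `.npy` format are choices made for this sketch, and the `raw cache` level is not covered.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def save_to_cache(cache_dir, name, components, properties):
    """Write each data component and a JSON properties file for one dataset."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    for key, array in components.items():
        # Assumed naming scheme: <name>_<component>.npy
        np.save(cache_dir / f"{name}_{key}.npy", array)
    with open(cache_dir / f"{name}.json", "w") as f:
        json.dump(properties, f)


def load_from_cache(cache_dir, name):
    """Return (components, properties), or None on a cache miss."""
    cache_dir = Path(cache_dir)
    props_path = cache_dir / f"{name}.json"
    if not props_path.exists():
        return None
    with open(props_path) as f:
        properties = json.load(f)
    prefix = f"{name}_"
    components = {
        path.stem[len(prefix):]: np.load(path)
        for path in sorted(cache_dir.glob(f"{prefix}*.npy"))
    }
    return components, properties


# Round trip through a temporary directory standing in for the cache.
with tempfile.TemporaryDirectory() as tmp:
    x = np.arange(12, dtype=np.float32).reshape(6, 2)
    y = np.array([0, 1, 0, 1, 0, 1])
    save_to_cache(tmp, "demo", {"x": x, "y": y}, {"n_classes": 2})
    components, properties = load_from_cache(tmp, "demo")
```

Treating the JSON file as the marker of a complete cache entry lets a loader decide between a cache hit and a fresh download with a single existence check.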
@@ -21,16 +29,39 @@ data_cache/
```
Cached file formats:
| Format | File extension | Associated Python types |