update Danbooru2017 for pre-train models.
Former-commit-id: 94b8d1e Former-commit-id: ce546e2fb7a762c38d724f37b2b315fe25bbb3ca
Showing 4 changed files with 5 additions and 1 deletion.
Binary file not shown.
@@ -1,3 +1,7 @@
 [Danbooru 2017 database](https://www.gwern.net/Danbooru2017)
 
-The txt file contains 113k 512x512 image file names for training a CNN-LSTM classifier. You may download them by `rsync` with `-- files-from`
+The ```[Danbooru2017] training image list_``` file contains 113k 512x512 image file names for training a CNN-LSTM classifier. You may download them by `rsync` with `-- files-from`
+
+The ```113k_imgs_512tags_encoded.7z``` is a json file containing tags for 113k images. The ```sk-LabelEncoder_512tags.pk``` is used for one hot encoding from scikit-learn.
+
+If you need a complete list of tags, please open an issue. I might be able to provide it. For a reference, if you process the large meta file from Danbooru2017, you may need around 8G to unzip the file and around 40 minutes with 8 cores to run through all json files. Good luck.
Binary file not shown.
Binary file not shown.
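The README text in this commit suggests fetching the listed images with `rsync` and its files-from option. The flag is spelled `--files-from=FILE`, with no space after the dashes. A minimal sketch of how such a command could be assembled follows. The list file name, the remote URL, and the destination directory are hypothetical placeholders, not values taken from this repository, and the helper `build_rsync_command` is an illustrative name only.

```python
import shlex


def build_rsync_command(list_path, remote, dest):
    """Return an rsync argument list that transfers only the listed files.

    list_path: text file with one path per line, relative to `remote`.
    remote:    rsync source (placeholder here, not the real mirror address).
    dest:      local directory that receives the files.
    """
    return [
        "rsync",
        "--archive",                   # preserve times and permissions
        "--verbose",
        f"--files-from={list_path}",   # restrict the transfer to the listed paths
        remote,
        dest,
    ]


# Hypothetical values for illustration only.
cmd = build_rsync_command(
    "training_image_list.txt",
    "rsync://example.invalid/danbooru2017/",
    "./danbooru2017/",
)

# Print the command for inspection instead of executing it.
print(shlex.join(cmd))
```

Building the command as a list keeps file names with spaces or brackets intact if it is later passed to `subprocess.run`. Note that with `--files-from`, rsync does not apply the recursion normally implied by `--archive`, which suits a list that names individual image files.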