This repository was archived by the owner on Nov 22, 2022. It is now read-only.
Add additional fields in batch reader, run_model output file & knowledge distillation data files #783
Closed
Conversation
haowu666 force-pushed from e8175f1 to b6d54cf and pushed a commit to haowu666/pytext that referenced this pull request on Jul 16, 2019:
Summary: Pull Request resolved: facebookresearch#783. Goal: read the post_id field and parse it along with the output file, so that each gen_kd_data[i], where i in {0, 1, 2}, includes post_id as well. With post_id in the output file, PyText users can look into the details of the model output and explore additional metrics associated with the team's target. Differential Revision: D16271134 fbshipit-source-id: 0438830dba8f366627048338b1a84357f3fa3f93
haowu666 subsequently force-pushed the branch several more times between Jul 16 and Aug 6, 2019 (b6d54cf → 525ba0c → d34c8e5 → 13007bc → 7eedd1e → e8c3f68 → e35213e → 9aa09a3 → db454ba → 2833609 → 74caf0e → 173da9a → 0b6dc59 → 1fbd3cc), each time pushing a commit to haowu666/pytext that referenced this pull request with a summary matching the pull request description below.
This pull request has been merged in e14ea15.
[The updated version] generalizes the previous version (which only added post_id). The main workflow can now read a list of additional fields, such as post_id, page_id, and page_url, into the context and parse them along with the classification metric reporter. PyText users can include these fields in the results file, which makes it easier to look into the details of the model output, explore other metrics associated with the team's target, or find related information in the Hive table by searching on these fields. Also, in knowledge distillation, each gen_kd_data[i], where i in {0, 1, 2} (training, validation, test), can handle a label list for multi-label tasks and includes the additional fields as a dictionary in the generated data files, which is helpful for building teacher and student networks for multi-label experiments.
In the debug file for the test dataset, one header example is:
#predicted, actual, scores_str, text, post_id, post_url
In the generated knowledge distillation data for the training dataset, one header example is:
#label_list, score, logit, label_names (in order), text, {"post_id": 123456, "post_url": "http://post_url.com"}
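To make the two layouts above concrete, here is a minimal sketch of how such rows could be serialized. This is not the code from this pull request: the helper names format_debug_row and format_kd_row, the tab-separated column layout, and the choice to emit the additional context fields as a single JSON dictionary column are assumptions for illustration only.

```python
import json
from typing import Dict, List

# Hypothetical helpers sketching the row layouts described above; not the
# actual PyText implementation from this pull request.

def format_debug_row(
    predicted: str,
    actual: str,
    scores: List[float],
    text: str,
    extra_fields: Dict[str, str],
) -> str:
    """One debug-file row: predicted, actual, scores_str, text, then one
    column per additional context field (e.g. post_id, post_url)."""
    scores_str = ",".join(f"{s:.4f}" for s in scores)
    columns = [predicted, actual, scores_str, text] + list(extra_fields.values())
    return "\t".join(columns)

def format_kd_row(
    label_list: List[str],
    scores: List[float],
    logits: List[float],
    label_names: List[str],
    text: str,
    extra_fields: Dict[str, str],
) -> str:
    """One knowledge-distillation row: label_list, score, logit,
    label_names (in order), text, and the additional fields as a JSON dict."""
    columns = [
        json.dumps(label_list),
        json.dumps(scores),
        json.dumps(logits),
        json.dumps(label_names),
        text,
        json.dumps(extra_fields),
    ]
    return "\t".join(columns)

if __name__ == "__main__":
    extra = {"post_id": "123456", "post_url": "http://post_url.com"}
    print(format_debug_row("spam", "ham", [0.91, 0.09], "example text", extra))
    print(format_kd_row(["spam"], [0.91, 0.09], [2.3, -2.3],
                        ["spam", "ham"], "example text", extra))
```

Keeping the extra fields in one JSON column (rather than one column per field) lets downstream multi-label teacher/student jobs carry an arbitrary set of identifiers without changing the file schema.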
[The previous version] aimed at reading post_id and parsing it along with the output file, so that the file could be used for our metric calculation and for building the KD teacher/student models in the subsequent steps of our workflow. Other users with the same goal can use this diff as a potential alternative solution.
Differential Revision: D16271134