Structure of /Limitation-statistics folder:
Limitation-statistics
┝━━ CAT-google-FN.csv
┝━━ CAT-google-FP.csv
┝━━ CIT-google-FN.csv
┝━━ CIT-google-FP.csv
┝━━ PatInv-google-FN.csv
┝━━ PatInv-google-FP.csv
┝━━ Purity-google-FN.csv
┝━━ Purity-google-FP.csv
┝━━ SIT-google-FN.csv
┕━━ SIT-google-FP.csv
In each of the above "[IT]-[SUT]-[FP/FN].csv" files, each row records a False Positive (FP) or False Negative (FN) of the baseline method and contains the following items (see the sketch after this list):
- S_s: the source input sentence.
- S_f: the follow-up input sentence.
- T_s: the source output translation.
- T_f: the follow-up output translation.
- Limitation-A: whether this FP or FN is due to Limitation-A, i.e., the baseline cannot compare the two output translations at a fine granularity for precise comparison.
- Limitation-B: whether this FP or FN is due to Limitation-B, i.e., the baseline lacks the linkage between a fragment in the source output and its counterpart in the follow-up output that is needed for rigorous comparison.
- Limitation-C: whether this FP or FN is due to Limitation-C, i.e., the baseline fails to detect incorrect translations of the same input words.
- Limitation-text: whether this FN is due to the limitation of text-based comparison methods, i.e., they fail to detect incorrect translations that have the same structure as their counterparts.
- Limitation-structure: whether this FP is due to the limitation of structure-based comparison methods, i.e., they cannot recognize synonyms between the two translations.
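The sketch below (Python, not part of the dataset) shows one way to tally how often each limitation is marked in a single FP/FN file. It assumes the CSV header names match the field names listed above, that the Limitation-* columns hold 0/1 flags, and that the files are UTF-8 encoded; none of these details are guaranteed by this README.

    import csv

    def count_limitations(path):
        """Count how many rows in one FP/FN file are attributed to each limitation."""
        columns = ["Limitation-A", "Limitation-B", "Limitation-C",
                   "Limitation-text", "Limitation-structure"]
        counts = {c: 0 for c in columns}
        total = 0
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                total += 1
                for c in columns:
                    # Some columns may be absent (e.g. Limitation-structure in FN
                    # files); the 0/1 encoding of the flags is an assumption.
                    if row.get(c, "").strip() == "1":
                        counts[c] += 1
        return total, counts

    total, counts = count_limitations("Limitation-statistics/SIT-google-FN.csv")
    print(total, counts)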
Structure of /Motivation-examples folder:
Motivation-examples
┝━━ CAT-en2zh-motivation.csv
┝━━ CIT-en2zh-motivation.csv
┝━━ PatInv-en2zh-motivation.csv
┝━━ Purity-en2zh-motivation.csv
┕━━ SIT-en2zh-motivation.csv
Each of the above "[IT]-[Language]-motivation.csv" file contains a motivation example for the corresponding Input Transformation (IT), which contains the following items:
- S_s: the source input sentence.
- S_f: the follow-up input sentence.
- T_s: the source output translation.
- T_f: the follow-up output translation.
- Violation: whether this pair of test cases violates the output relation. 1 for violation, 0 for non-violation.
- Fine-grained Violations in T_s: the tokens in T_s that lead to the violation.
- Fine-grained Violations in T_f: the tokens in T_f that lead to the violation.
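The following sketch prints the fields of a motivation example. The file path, the assumption that the CSV header names match the field names above, and the UTF-8 encoding are assumptions made for illustration.

    import csv

    path = "Motivation-examples/SIT-en2zh-motivation.csv"
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            print("Source input (S_s):         ", row["S_s"])
            print("Follow-up input (S_f):      ", row["S_f"])
            print("Source translation (T_s):   ", row["T_s"])
            print("Follow-up translation (T_f):", row["T_f"])
            print("Violation:                  ", row["Violation"])
            print("Violating tokens in T_s:    ", row["Fine-grained Violations in T_s"])
            print("Violating tokens in T_f:    ", row["Fine-grained Violations in T_f"])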
Structure of /RQ1 folder:
RQ1
┝━━ CAT-en2zh-google.csv
┝━━ CAT-en2zh-bing.csv
┝━━ CAT-en2zh-youdao.csv
┝━━ CIT-en2zh-google.csv
┝━━ CIT-en2zh-bing.csv
┝━━ CIT-en2zh-youdao.csv
┝━━ PatInv-en2zh-google.csv
┝━━ PatInv-en2zh-bing.csv
┝━━ PatInv-en2zh-youdao.csv
┝━━ Purity-en2zh-google.csv
┝━━ Purity-en2zh-bing.csv
┝━━ Purity-en2zh-youdao.csv
┝━━ SIT-en2zh-google.csv
┝━━ SIT-en2zh-bing.csv
┝━━ SIT-en2zh-youdao.csv
┝━━ CAT-zh2en-google.csv
┝━━ CAT-zh2en-bing.csv
┝━━ CAT-zh2en-youdao.csv
┝━━ CIT-zh2en-google.csv
┝━━ CIT-zh2en-bing.csv
┝━━ CIT-zh2en-youdao.csv
┝━━ PatInv-zh2en-google.csv
┝━━ PatInv-zh2en-bing.csv
┝━━ PatInv-zh2en-youdao.csv
┝━━ Purity-zh2en-google.csv
┝━━ Purity-zh2en-bing.csv
┝━━ Purity-zh2en-youdao.csv
┝━━ SIT-zh2en-google.csv
┝━━ SIT-zh2en-bing.csv
┕━━ SIT-zh2en-youdao.csv
Each of the above "[IT]-[Language]-[SUT].csv" file contains all the test case pairs of the Language setting (en2zh means English-to-Chiese, zh2en means Chinese-to-English) generated by IT for SUT, each of which contains the following items:
- S_s: the source input sentence.
- S_f: the follow-up input sentence.
- T_s: the source output translation.
- T_f: the follow-up output translation.
- Violation: whether this pair of test cases violates the output relation. 1 for violation, 0 for non-violation.
- Fine-grained Violations in T_s: the tokens in T_s that lead to the violation.
- Fine-grained Violations in T_f: the tokens in T_f that lead to the violation.
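As a usage illustration, the sketch below computes a simple per-file violation rate for one "[IT]-[Language]-[SUT].csv" file. It assumes the Violation column holds 0/1 values and that the header names match the fields listed above.

    import csv

    def violation_rate(path):
        """Fraction of test case pairs flagged as violations in one RQ1 file."""
        with open(path, newline="", encoding="utf-8") as f:
            flags = [int(row["Violation"]) for row in csv.DictReader(f)]
        return sum(flags) / len(flags) if flags else 0.0

    print(violation_rate("RQ1/SIT-en2zh-google.csv"))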
Structure of /RQ2&5
folder:
RQ2&5
┝━━ CAT-en2zh-merge.csv
┝━━ CAT-zh2en-merge.csv
┝━━ CIT-en2zh-merge.csv
┝━━ CIT-zh2en-merge.csv
┝━━ PatInv-en2zh-merge.csv
┝━━ PatInv-zh2en-merge.csv
┝━━ Purity-en2zh-merge.csv
┝━━ Purity-zh2en-merge.csv
┝━━ SIT-en2zh-merge.csv
┕━━ SIT-zh2en-merge.csv
Each of the above "[IT]-[Language]-merge.csv" file contains all the test case pairs of the Language setting (en2zh means English-to-Chiese, zh2en means Chinese-to-English) generated by IT for the three SUTs (Google, Bing, and Youdao), each of which contains the following items:
- S_s: the source input sentence.
- S_f: the follow-up input sentence.
- T_s: the source output translation.
- T_f: the follow-up output translation.
- Violation: whether this pair of test cases violates the output relation. 1 for violation, 0 for non-violation.
- Fine-grained Violations in T_s: the tokens in T_s that lead to the violation.
- Fine-grained Violations in T_f: the tokens in T_f that lead to the violation.
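The sketch below tallies the number of violating test case pairs per IT and language setting across all merge files. The folder path, the file-name pattern, and the 0/1 Violation column are assumptions based on the description above.

    import csv
    import glob
    import os

    summary = {}
    for path in glob.glob("RQ2&5/*-merge.csv"):
        # File names follow "[IT]-[Language]-merge.csv" per the description above.
        it, language, _ = os.path.basename(path).split("-", 2)
        with open(path, newline="", encoding="utf-8") as f:
            flags = [int(row["Violation"]) for row in csv.DictReader(f)]
        summary[(it, language)] = (sum(flags), len(flags))

    for (it, language), (violations, total) in sorted(summary.items()):
        print(f"{it:8} {language}: {violations}/{total} violating pairs")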
Structure of /RQ3 folder:
RQ3
┝━━ CAT-en2zh-google-LABEL.txt
┝━━ CAT-zh2en-google-LABEL.txt
┝━━ CIT-en2zh-google-LABEL.txt
┝━━ CIT-zh2en-google-LABEL.txt
┝━━ Purity-en2zh-google-LABEL.txt
┕━━ Purity-zh2en-google-LABEL.txt
Each of the above "[IT]-[Language]-google-LABEL.txt" file contains the True Positives (TPs) of the Language setting (en2zh means English-to-Chiese, zh2en means Chinese-to-English) identified by the baselines and our method. In these files, each test is decomposed into 13 lines:
- Line 1: the id of this test case pair.
- Line 2: the source input sentence.
- Line 3: the follow-up input sentence.
- Line 4: the source output translation.
- Line 5: the follow-up output translation.
- Line 6: the tokens with token indexes of the source output translation.
- Line 7: the tokens with token indexes of the follow-up output translation.
- Line 8: the token indexes of the fine-grained violations in T_s located by the baseline methods.
- Line 9: the token indexes of the fine-grained violations in T_f located by the baseline methods.
- Line 10: the token indexes of the fine-grained violations in T_s located by our method.
- Line 11: the token indexes of the fine-grained violations in T_f located by our method.
- Line 12: the manually labeled token indexes of the true fine-grained violations in T_s.
- Line 13: the manually labeled token indexes of the true fine-grained violations in T_f.
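The sketch below shows one way to read these 13-line records. It assumes the records are stored back to back with no blank separator lines and that the files are UTF-8 encoded; the exact serialization of the token-index lines is not specified here, so they are returned as raw strings.

    def read_label_records(path):
        """Yield one dict per 13-line record in a *-LABEL.txt file."""
        with open(path, encoding="utf-8") as f:
            lines = [line.rstrip("\n") for line in f]
        for i in range(0, len(lines) - 12, 13):
            block = lines[i:i + 13]
            yield {
                "id": block[0],
                "S_s": block[1], "S_f": block[2],
                "T_s": block[3], "T_f": block[4],
                "T_s_tokens": block[5], "T_f_tokens": block[6],
                "baseline_T_s": block[7], "baseline_T_f": block[8],
                "ours_T_s": block[9], "ours_T_f": block[10],
                "label_T_s": block[11], "label_T_f": block[12],
            }

    for record in read_label_records("RQ3/CAT-en2zh-google-LABEL.txt"):
        print(record["id"], record["label_T_s"], record["label_T_f"])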