code_summarization.json
9851 lines (9851 loc) · 275 KB
[
{
"id": "http://zotero.org/users/15747588/items/34AZ344I",
"type": "paper-conference",
"event-title": "Deep Learning for Code Workshop",
"title": "Code summarization: Do transformers really understand code?",
"author": [
{
"family": "Sontakke",
"given": "Ankita Nandkishor"
},
{
"family": "Patwardhan",
"given": "Manasi"
},
{
"family": "Vig",
"given": "Lovekesh"
},
{
"family": "Medicherla",
"given": "Raveendra Kumar"
},
{
"family": "Naik",
"given": "Ravindra"
},
{
"family": "Shroff",
"given": "Gautam"
}
],
"issued": {
"date-parts": [
[
"2022"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/6ITLAHGD",
"type": "paper-conference",
"abstract": "Source code summarization aims to generate natural language descriptions of code snippets. Many existing studies learn the syntactic and semantic knowledge of code snippets from their token sequences and Abstract Syntax Trees (ASTs). They use the learned code representations as input to code summarization models, which can accordingly generate summaries describing source code. Traditional models traverse ASTs as sequences or split ASTs into paths as input. However, the former loses the structural properties of ASTs, and the latter destroys the overall structure of ASTs. Therefore, comprehensively capturing the structural features of ASTs in learning code representations for source code summarization remains a challenging problem to be solved. In this paper, we propose M2TS, a Multi-scale Multi-modal approach based on Transformer for source code Summarization. M2TS uses a multi-scale AST feature extraction method, which can extract the structures of ASTs more completely and accurately at multiple local and global levels. To complement missing semantic information in ASTs, we also obtain code token features, and further combine them with the extracted AST features using a cross modality fusion method that not only fuses the syntactic and contextual semantic information of source code, but also highlights the key features of each modality. We conduct experiments on two Java and one Python datasets, and the experimental results demonstrate that M2TS outperforms current state-of-the-art methods. We release our code at https://github.com/TranSMS/M2TS.",
"collection-title": "ICPC '22",
"container-title": "Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension",
"DOI": "10.1145/3524610.3527907",
"event-place": "New York, NY, USA",
"ISBN": "978-1-4503-9298-3",
"note": "event-place: Virtual Event",
"page": "24–35",
"publisher": "Association for Computing Machinery",
"publisher-place": "New York, NY, USA",
"title": "M2TS: multi-scale multi-modal approach based on transformer for source code summarization",
"URL": "https://doi.org/10.1145/3524610.3527907",
"author": [
{
"family": "Gao",
"given": "Yuexiu"
},
{
"family": "Lyu",
"given": "Chen"
}
],
"issued": {
"date-parts": [
[
"2022"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/FIT53TIB",
"type": "paper-conference",
"container-title": "2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)",
"DOI": "10.1109/SANER53432.2022.00112",
"page": "935-946",
"title": "Assemble Foundation Models for Automatic Code Summarization",
"author": [
{
"family": "Gu",
"given": "Jian"
},
{
"family": "Salza",
"given": "Pasquale"
},
{
"family": "Gall",
"given": "Harald C."
}
],
"issued": {
"date-parts": [
[
"2022"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/NA3XXDY9",
"type": "paper-conference",
"abstract": "Code summarization, the task of generating useful comments given the code, has long been of interest. Most of the existing code summarization models are trained and validated on widely-used code comment benchmark datasets. However, little is known about the quality of the benchmark datasets built from real-world projects. Are the benchmark datasets as good as expected? To bridge the gap, we conduct a systematic research to assess and improve the quality of four benchmark datasets widely used for code summarization tasks. First, we propose an automated code-comment cleaning tool that can accurately detect noisy data caused by inappropriate data preprocessing operations from existing benchmark datasets. Then, we apply the tool to further assess the data quality of the four benchmark datasets, based on the detected noises. Finally, we conduct comparative experiments to investigate the impact of noisy data on the performance of code summarization models. The results show that these data preprocessing noises widely exist in all four benchmark datasets, and removing these noisy data leads to a significant improvement on the performance of code summarization. We believe that the findings and insights will enable a better understanding of data quality in code summarization tasks, and pave the way for relevant research and practice.",
"collection-title": "ESEC/FSE 2022",
"container-title": "Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering",
"DOI": "10.1145/3540250.3549145",
"event-place": "New York, NY, USA",
"ISBN": "978-1-4503-9413-0",
"note": "event-place: Singapore, Singapore",
"page": "107–119",
"publisher": "Association for Computing Machinery",
"publisher-place": "New York, NY, USA",
"title": "Are we building on the rock? on the importance of data preprocessing for code summarization",
"URL": "https://doi.org/10.1145/3540250.3549145",
"author": [
{
"family": "Shi",
"given": "Lin"
},
{
"family": "Mu",
"given": "Fangwen"
},
{
"family": "Chen",
"given": "Xiao"
},
{
"family": "Wang",
"given": "Song"
},
{
"family": "Wang",
"given": "Junjie"
},
{
"family": "Yang",
"given": "Ye"
},
{
"family": "Li",
"given": "Ge"
},
{
"family": "Xia",
"given": "Xin"
},
{
"family": "Wang",
"given": "Qing"
}
],
"issued": {
"date-parts": [
[
"2022"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/RDDZXTR2",
"type": "article-journal",
"container-title": "IEEE Transactions on Software Engineering",
"DOI": "10.1109/TSE.2024.3422274",
"issue": "8",
"page": "2077-2095",
"title": "Esale: Enhancing Code-Summary Alignment Learning for Source Code Summarization",
"volume": "50",
"author": [
{
"family": "Fang",
"given": "Chunrong"
},
{
"family": "Sun",
"given": "Weisong"
},
{
"family": "Chen",
"given": "Yuchen"
},
{
"family": "Chen",
"given": "Xiao"
},
{
"family": "Wei",
"given": "Zhao"
},
{
"family": "Zhang",
"given": "Quanjun"
},
{
"family": "You",
"given": "Yudu"
},
{
"family": "Luo",
"given": "Bin"
},
{
"family": "Liu",
"given": "Yang"
},
{
"family": "Chen",
"given": "Zhenyu"
}
],
"issued": {
"date-parts": [
[
"2024"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/3G5MAFRS",
"type": "paper-conference",
"container-title": "2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)",
"DOI": "10.1109/ASE51524.2021.9678724",
"page": "155-166",
"title": "EditSum: A Retrieve-and-Edit Framework for Source Code Summarization",
"author": [
{
"family": "Li",
"given": "Jia Allen"
},
{
"family": "Li",
"given": "Yongmin"
},
{
"family": "Li",
"given": "Ge"
},
{
"family": "Hu",
"given": "Xing"
},
{
"family": "Xia",
"given": "Xin"
},
{
"family": "Jin",
"given": "Zhi"
}
],
"issued": {
"date-parts": [
[
"2021"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/YFQRNVAB",
"type": "article-journal",
"abstract": "With the fast development of large software projects, automatic code summarization techniques, which summarize the main functionalities of a piece of code using natural languages as comments, play essential roles in helping developers understand and maintain large software projects. Many research efforts have been devoted to building automatic code summarization approaches. Typical code summarization approaches are based on deep learning models. They transform the task into a sequence-to-sequence task, which inputs source code and outputs summarizations in natural languages. All code summarization models impose different input size limits, such as 50 to 10,000, for the input source code. However, how the input size limit affects the performance of code summarization models still remains under-explored. In this article, we first conduct an empirical study to investigate the impacts of different input size limits on the quality of generated code comments. To our surprise, experiments on multiple models and datasets reveal that setting a low input size limit, such as 20, does not necessarily reduce the quality of generated comments. Based on this finding, we further propose to use function signatures instead of full source code to summarize the main functionalities first and then input the function signatures into code summarization models. Experiments and statistical results show that inputs with signatures are, on average, more than 2 percentage points better than inputs without signatures and thus demonstrate the effectiveness of involving function signatures in code summarization. We also invite programmers to do a questionnaire to evaluate the quality of code summaries generated by two inputs with different truncation levels. The results show that function signatures generate, on average, 9.2% more high-quality comments than full code.",
"container-title": "ACM Trans. Softw. Eng. Methodol.",
"DOI": "10.1145/3652156",
"ISSN": "1049-331X",
"issue": "6",
"title": "Do Code Summarization Models Process Too Much Information? Function Signature May Be All That Is Needed",
"URL": "https://doi.org/10.1145/3652156",
"volume": "33",
"author": [
{
"family": "Ding",
"given": "Xi"
},
{
"family": "Peng",
"given": "Rui"
},
{
"family": "Chen",
"given": "Xiangping"
},
{
"family": "Huang",
"given": "Yuan"
},
{
"family": "Bian",
"given": "Jing"
},
{
"family": "Zheng",
"given": "Zibin"
}
],
"issued": {
"date-parts": [
[
"2024",
6
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/P6T8C24C",
"type": "paper-conference",
"abstract": "In recent years, research in the domain of source code summarization has adopted data-driven techniques pioneered in machine translation (MT). Automatic evaluation metrics such as BLEU, METEOR, and ROUGE, are fundamental to the evaluation of MT systems and have been adopted as proxies of human evaluation in the code summarization domain. However, the extent to which automatic metrics agree with the gold standard of human evaluation has not been evaluated on code summarization tasks. Despite this, marginal improvements in metric scores are often used to discriminate between the performance of competing summarization models. In this paper, we present a critical exploration of the applicability and interpretation of automatic metrics as evaluation techniques for code summarization tasks. We conduct an empirical study with 226 human annotators to assess the degree to which automatic metrics reflect human evaluation. Results indicate that metric improvements of less than 2 points do not guarantee systematic improvements in summarization quality, and are unreliable as proxies of human evaluation. When the difference between metric scores for two summarization approaches increases but remains within 5 points, some metrics such as METEOR and chrF become highly reliable proxies, whereas others, such as corpus BLEU, remain unreliable. Based on these findings, we make several recommendations for the use of automatic metrics to discriminate model performance in code summarization.",
"collection-title": "ESEC/FSE 2021",
"container-title": "Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering",
"DOI": "10.1145/3468264.3468588",
"event-place": "New York, NY, USA",
"ISBN": "978-1-4503-8562-6",
"note": "event-place: Athens, Greece",
"page": "1105–1116",
"publisher": "Association for Computing Machinery",
"publisher-place": "New York, NY, USA",
"title": "Reassessing automatic evaluation metrics for code summarization tasks",
"URL": "https://doi.org/10.1145/3468264.3468588",
"author": [
{
"family": "Roy",
"given": "Devjeet"
},
{
"family": "Fakhoury",
"given": "Sarah"
},
{
"family": "Arnaoudova",
"given": "Venera"
}
],
"issued": {
"date-parts": [
[
"2021"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/CDKHJ69A",
"type": "paper-conference",
"abstract": "Source code summarization involves creating brief descriptions of source code in natural language. These descriptions are a key component of software documentation such as JavaDocs. Automatic code summarization is a prized target of software engineering research, due to the high value summaries have to programmers and the simultaneously high cost of writing and maintaining documentation by hand. Current work is almost all based on machine models trained via big data input. Large datasets of examples of code and summaries of that code are used to train an e.g. encoder-decoder neural model. Then the output predictions of the model are evaluated against a set of reference summaries. The input is code not seen by the model, and the prediction is compared to a reference. The means by which a prediction is compared to a reference is essentially word overlap, calculated via a metric such as BLEU or ROUGE. The problem with using word overlap is that not all words in a sentence have the same importance, and many words have synonyms. The result is that calculated similarity may not match the perceived similarity by human readers. In this paper, we conduct an experiment to measure the degree to which various word overlap metrics correlate to human-rated similarity of predicted and reference summaries. We evaluate alternatives based on current work in semantic similarity metrics and propose recommendations for evaluation of source code summarization.",
"collection-title": "ICPC '22",
"container-title": "Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension",
"DOI": "10.1145/3524610.3527909",
"event-place": "New York, NY, USA",
"ISBN": "978-1-4503-9298-3",
"note": "event-place: Virtual Event",
"page": "36–47",
"publisher": "Association for Computing Machinery",
"publisher-place": "New York, NY, USA",
"title": "Semantic similarity metrics for evaluating source code summarization",
"URL": "https://doi.org/10.1145/3524610.3527909",
"author": [
{
"family": "Haque",
"given": "Sakib"
},
{
"family": "Eberhart",
"given": "Zachary"
},
{
"family": "Bansal",
"given": "Aakash"
},
{
"family": "McMillan",
"given": "Collin"
}
],
"issued": {
"date-parts": [
[
"2022"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/7EJSWKIE",
"type": "article-journal",
"abstract": "Code summaries help developers comprehend programs and reduce their time to infer the program functionalities during software maintenance. Recent efforts resort to deep learning techniques such as sequence-to-sequence models for generating accurate code summaries, among which Transformer-based approaches have achieved promising performance. However, effectively integrating the code structure information into the Transformer is under-explored in this task domain. In this article, we propose a novel approach named SG-Trans to incorporate code structural properties into Transformer. Specifically, we inject the local symbolic information (e.g., code tokens and statements) and global syntactic structure (e.g., dataflow graph) into the self-attention module of Transformer as inductive bias. To further capture the hierarchical characteristics of code, the local information and global structure are designed to distribute in the attention heads of lower layers and high layers of Transformer. Extensive evaluation shows the superior performance of SG-Trans over the state-of-the-art approaches. Compared with the best-performing baseline, SG-Trans still improves 1.4% and 2.0% on two benchmark datasets, respectively, in terms of METEOR score, a metric widely used for measuring generation quality.",
"container-title": "ACM Trans. Softw. Eng. Methodol.",
"DOI": "10.1145/3522674",
"ISSN": "1049-331X",
"issue": "1",
"title": "Code Structure–Guided Transformer for Source Code Summarization",
"URL": "https://doi.org/10.1145/3522674",
"volume": "32",
"author": [
{
"family": "Gao",
"given": "Shuzheng"
},
{
"family": "Gao",
"given": "Cuiyun"
},
{
"family": "He",
"given": "Yulan"
},
{
"family": "Zeng",
"given": "Jichuan"
},
{
"family": "Nie",
"given": "Lunyiu"
},
{
"family": "Xia",
"given": "Xin"
},
{
"family": "Lyu",
"given": "Michael"
}
],
"issued": {
"date-parts": [
[
"2023",
2
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/GA4T8Y9Y",
"type": "paper-conference",
"container-title": "2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)",
"DOI": "10.1109/ICPC58990.2023.00026",
"page": "113-124",
"title": "Interpretation-based Code Summarization",
"author": [
{
"family": "Geng",
"given": "Mingyang"
},
{
"family": "Wang",
"given": "Shangwen"
},
{
"family": "Dong",
"given": "Dezun"
},
{
"family": "Wang",
"given": "Haotian"
},
{
"family": "Cao",
"given": "Shaomeng"
},
{
"family": "Zhang",
"given": "Kechi"
},
{
"family": "Jin",
"given": "Zhi"
}
],
"issued": {
"date-parts": [
[
"2023"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/G2HXM7LK",
"type": "article-journal",
"container-title": "arXiv preprint arXiv:2012.14710",
"journalAbbreviation": "arXiv preprint arXiv:2012.14710",
"title": "Code summarization with structure-induced transformer",
"author": [
{
"family": "Wu",
"given": "Hongqiu"
},
{
"family": "Zhao",
"given": "Hai"
},
{
"family": "Zhang",
"given": "Min"
}
],
"issued": {
"date-parts": [
[
"2020"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/TM7PU9WQ",
"type": "article-journal",
"container-title": "arXiv preprint arXiv:2407.07959",
"journalAbbreviation": "arXiv preprint arXiv:2407.07959",
"title": "Source code summarization in the era of large language models",
"author": [
{
"family": "Sun",
"given": "Weisong"
},
{
"family": "Miao",
"given": "Yun"
},
{
"family": "Li",
"given": "Yuekang"
},
{
"family": "Zhang",
"given": "Hongyu"
},
{
"family": "Fang",
"given": "Chunrong"
},
{
"family": "Liu",
"given": "Yi"
},
{
"family": "Deng",
"given": "Gelei"
},
{
"family": "Liu",
"given": "Yang"
},
{
"family": "Chen",
"given": "Zhenyu"
}
],
"issued": {
"date-parts": [
[
"2024"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/LMY6QWDR",
"type": "paper-conference",
"abstract": "Software developers spend a great deal of time reading and understanding code that is poorly-documented, written by other developers, or developed using differing styles. During the past decade, researchers have investigated techniques for automatically documenting code to improve comprehensibility. In particular, recent advances in deep learning have led to sophisticated summary generation techniques that convert functions or methods to simple English strings that succinctly describe that code's behavior. However, automatic summarization techniques are assessed using internal metrics such as BLEU scores, which measure natural language properties in translational models, or ROUGE scores, which measure overlap with human-written text. Unfortunately, these metrics do not necessarily capture how machine-generated code summaries actually affect human comprehension or developer productivity. We conducted a human study involving both university students and professional developers (n = 45). Participants reviewed Java methods and summaries and answered established program comprehension questions. In addition, participants completed coding tasks given summaries as specifications. Critically, the experiment controlled the source of the summaries: for a given method, some participants were shown human-written text and some were shown machine-generated text. We found that participants performed significantly better (p = 0.029) using human-written summaries versus machine-generated summaries. However, we found no evidence to support that participants perceive human- and machine-generated summaries to have different qualities. In addition, participants' performance showed no correlation with the BLEU and ROUGE scores often used to assess the quality of machine-generated summaries. These results suggest a need for revised metrics to assess and guide automatic summarization techniques.",
"collection-title": "ICPC '20",
"container-title": "Proceedings of the 28th International Conference on Program Comprehension",
"DOI": "10.1145/3387904.3389258",
"event-place": "New York, NY, USA",
"ISBN": "978-1-4503-7958-8",
"note": "event-place: Seoul, Republic of Korea",
"page": "2–13",
"publisher": "Association for Computing Machinery",
"publisher-place": "New York, NY, USA",
"title": "A Human Study of Comprehension and Code Summarization",
"URL": "https://doi.org/10.1145/3387904.3389258",
"author": [
{
"family": "Stapleton",
"given": "Sean"
},
{
"family": "Gambhir",
"given": "Yashmeet"
},
{
"family": "LeClair",
"given": "Alexander"
},
{
"family": "Eberhart",
"given": "Zachary"
},
{
"family": "Weimer",
"given": "Westley"
},
{
"family": "Leach",
"given": "Kevin"
},
{
"family": "Huang",
"given": "Yu"
}
],
"issued": {
"date-parts": [
[
"2020"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/8IQHYIM9",
"type": "article-journal",
"container-title": "IEEE Transactions on Software Engineering",
"DOI": "10.1109/TSE.2017.2664836",
"issue": "12",
"page": "1095-1109",
"title": "Autofolding for Source Code Summarization",
"volume": "43",
"author": [
{
"family": "Fowkes",
"given": "Jaroslav"
},
{
"family": "Chanthirasegaran",
"given": "Pankajan"
},
{
"family": "Ranca",
"given": "Razvan"
},
{
"family": "Allamanis",
"given": "Miltiadis"
},
{
"family": "Lapata",
"given": "Mirella"
},
{
"family": "Sutton",
"given": "Charles"
}
],
"issued": {
"date-parts": [
[
"2017"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/T86JKVQP",
"type": "paper-conference",
"container-title": "Advances in Neural Information Processing Systems",
"publisher": "Curran Associates, Inc.",
"title": "Code Generation as a Dual Task of Code Summarization",
"URL": "https://proceedings.neurips.cc/paper_files/paper/2019/file/e52ad5c9f751f599492b4f087ed7ecfc-Paper.pdf",
"volume": "32",
"author": [
{
"family": "Wei",
"given": "Bolin"
},
{
"family": "Li",
"given": "Ge"
},
{
"family": "Xia",
"given": "Xin"
},
{
"family": "Fu",
"given": "Zhiyi"
},
{
"family": "Jin",
"given": "Zhi"
}
],
"editor": [
{
"family": "Wallach",
"given": "H."
},
{
"family": "Larochelle",
"given": "H."
},
{
"family": "Beygelzimer",
"given": "A."
},
{
"family": "Alché-Buc",
"given": "F.",
"dropping-particle": "d'"
},
{
"family": "Fox",
"given": "E."
},
{
"family": "Garnett",
"given": "R."
}
],
"issued": {
"date-parts": [
[
"2019"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/LBQWHURN",
"type": "article-journal",
"abstract": "Code summarization plays a pivotal role in the field of software engineering by offering developers a concise natural language comprehension of source code semantics. As software complexity continues to escalate, code summarization confronts various challenges, including discrepancies between source code and summarization, the absence of crucial or up-to-date information, and the inefficiency and resource demands of manual summarization. To address these challenges, Automatic Source Code Summarization (ASCS) has garnered widespread attention. This paper presents a comprehensive review and synthesis of ASCS research. It aims to provide an in-depth understanding of the core issues and challenges inherent in each phase of ASCS, illustrated with specific examples and application scenarios. Around of the core phases of ASCS including data collection, source code modeling, the generation of code summaries, and the assessment of their quality, the paper thoroughly compiles and assesses existing datasets, categorizes and examines prevalent source code modeling techniques, and delves into the methods for generating and evaluating the quality of code summaries. Concluding with an exploration of future research avenues and emerging trends, this paper serves as a guide for readers to grasp the cutting-edge developments in this field, enriched by the analysis of pivotal research contributions.",
"container-title": "Empirical Software Engineering",
"DOI": "10.1007/s10664-024-10553-6",
"ISSN": "1573-7616",
"issue": "6",
"journalAbbreviation": "Empirical Software Engineering",
"page": "162",
"title": "A review of automatic source code summarization",
"URL": "https://doi.org/10.1007/s10664-024-10553-6",
"volume": "29",
"author": [
{
"family": "Zhang",
"given": "Xuejun"
},
{
"family": "Hou",
"given": "Xia"
},
{
"family": "Qiao",
"given": "Xiuming"
},
{
"family": "Song",
"given": "Wenfeng"
}
],
"issued": {
"date-parts": [
[
"2024",
10,
7
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/IFSUUVMM",
"type": "paper-conference",
"abstract": "Source code summaries are important for program comprehension and maintenance. However, there are plenty of programs with missing, outdated, or mismatched summaries. Recently, deep learning techniques have been exploited to automatically generate summaries for given code snippets. To achieve a profound understanding of how far we are from solving this problem and provide suggestions to future research, in this paper, we conduct a systematic and in-depth analysis of 5 state-of-the-art neural code summarization models on 6 widely used BLEU variants, 4 pre-processing operations and their combinations, and 3 widely used datasets. The evaluation results show that some important factors have a great influence on the model evaluation, especially on the performance of models and the ranking among the models. However, these factors might be easily overlooked. Specifically, (1) the BLEU metric widely used in existing work of evaluating code summarization models has many variants. Ignoring the differences among these variants could greatly affect the validity of the claimed results. Besides, we discover and resolve an important and previously unknown bug in BLEU calculation in a commonly-used software package. Furthermore, we conduct human evaluations and find that the metric BLEU-DC is most correlated to human perception; (2) code preprocessing choices can have a large (from -18% to +25%) impact on the summarization performance and should not be neglected. We also explore the aggregation of pre-processing combinations and boost the performance of models; (3) some important characteristics of datasets (corpus sizes, data splitting methods, and duplication ratios) have a significant impact on model evaluation. Based on the experimental results, we give actionable suggestions for evaluating code summarization and choosing the best method in different scenarios. We also build a shared code summarization toolbox to facilitate future research.",
"collection-title": "ICSE '22",
"container-title": "Proceedings of the 44th International Conference on Software Engineering",
"DOI": "10.1145/3510003.3510060",
"event-place": "New York, NY, USA",
"ISBN": "978-1-4503-9221-1",
"note": "event-place: Pittsburgh, Pennsylvania",
"page": "1597–1608",
"publisher": "Association for Computing Machinery",
"publisher-place": "New York, NY, USA",
"title": "On the evaluation of neural code summarization",
"URL": "https://doi.org/10.1145/3510003.3510060",
"author": [
{
"family": "Shi",
"given": "Ensheng"
},
{
"family": "Wang",
"given": "Yanlin"
},
{
"family": "Du",
"given": "Lun"
},
{
"family": "Chen",
"given": "Junjie"
},
{
"family": "Han",
"given": "Shi"
},
{
"family": "Zhang",
"given": "Hongyu"
},
{
"family": "Zhang",
"given": "Dongmei"
},
{
"family": "Sun",
"given": "Hongbin"
}
],
"issued": {
"date-parts": [
[
"2022"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/PUBIEAHQ",
"type": "article-journal",
"abstract": "Source code summarization aims to generate concise descriptions for code snippets in a natural language, thereby facilitates program comprehension and software maintenance. In this paper, we propose a novel approach–GSCS–to automatically generate summaries for Java methods, which leverages both semantic and structural information of the code snippets. To this end, GSCS utilizes Graph Attention Networks to process the tokenized abstract syntax tree of the program, which employ a multi-head attention mechanism to learn node features in diverse representation sub-spaces, and aggregate features by assigning different weights to its neighbor nodes. GSCS further harnesses an additional RNN-based sequence model to obtain the semantic features and optimizes the structure by combining its output with a transformed embedding layer. We evaluate our approach on two widely-adopted Java datasets; the experiment results confirm that GSCS outperforms the state-of-the-art baselines.",
"container-title": "Journal of Systems and Software",
"DOI": "https://doi.org/10.1016/j.jss.2022.111257",
"ISSN": "0164-1212",
"page": "111257",
"title": "Automatic source code summarization with graph attention networks",
"URL": "https://www.sciencedirect.com/science/article/pii/S0164121222000279",
"volume": "188",
"author": [
{
"family": "Zhou",
"given": "Yu"
},
{
"family": "Shen",
"given": "Juanjuan"
},
{
"family": "Zhang",
"given": "Xiaoqing"
},
{
"family": "Yang",
"given": "Wenhua"
},
{
"family": "Han",
"given": "Tingting"
},
{
"family": "Chen",
"given": "Taolue"
}
],
"issued": {
"date-parts": [
[
"2022"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/3SYTRAHA",
"type": "paper-conference",
"container-title": "Advances in Neural Information Processing Systems",
"page": "56660–56672",
"publisher": "Curran Associates, Inc.",
"title": "On-the-Fly Adapting Code Summarization on Trainable Cost-Effective Language Models",
"URL": "https://proceedings.neurips.cc/paper_files/paper/2023/file/b16e6de5fbbdcb2df237aa66b302bc17-Paper-Conference.pdf",
"volume": "36",
"author": [
{
"family": "Cai",
"given": "Yufan"
},
{
"family": "Lin",
"given": "Yun"
},
{
"family": "Liu",
"given": "Chenyan"
},
{
"family": "Wu",
"given": "Jinglian"
},
{
"family": "Zhang",
"given": "Yifan"
},
{
"family": "Liu",
"given": "Yiming"
},
{
"family": "Gong",
"given": "Yeyun"
},
{
"family": "Dong",
"given": "Jin Song"
}
],
"editor": [
{
"family": "Oh",
"given": "A."
},
{
"family": "Naumann",
"given": "T."
},
{
"family": "Globerson",
"given": "A."
},
{
"family": "Saenko",
"given": "K."
},
{
"family": "Hardt",
"given": "M."
},
{
"family": "Levine",
"given": "S."
}
],
"issued": {
"date-parts": [
[
"2023"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/LXXRQX9E",
"type": "article-journal",
"container-title": "arXiv preprint arXiv:2206.00804",
"journalAbbreviation": "arXiv preprint arXiv:2206.00804",
"title": "Learning code summarization from a small and local dataset",
"author": [
{
"family": "Ahmed",
"given": "Toufique"
},
{
"family": "Devanbu",
"given": "Premkumar"
}
],
"issued": {
"date-parts": [
[
"2022"
]
]
}
},
{
"id": "http://zotero.org/users/15747588/items/UNC2DABD",
"type": "paper-conference",
"abstract": "We present a novel tool, TASSAL, that automatically creates a summary of each source file in a project by folding its least salient code regions. The intended use-case for our tool is the first-look problem: to help developers who are unfamiliar with a new codebase and are attempting to understand it. TASSAL is intended to aid developers in this task by folding away less informative regions of code and allowing them to focus their efforts on the most informative ones. While modern code editors do provide code folding to selectively hide blocks of code, it is impractical to use as folding decisions must be made manually or based on simple rules. We find through a case study that TASSAL is strongly preferred by experienced developers over simple folding baselines, demonstrating its usefulness. In short, we strongly believe TASSAL can aid program comprehension by turning code folding into a usable and valuable tool. A video highlighting the main features of TASSAL can be found at https://youtu.be/_yu7JZgiBA4.",
"collection-title": "ICSE '16",
"container-title": "Proceedings of the 38th International Conference on Software Engineering Companion",
"DOI": "10.1145/2889160.2889171",
"event-place": "New York, NY, USA",
"ISBN": "978-1-4503-4205-6",
"note": "event-place: Austin, Texas",
"page": "649–652",
"publisher": "Association for Computing Machinery",
"publisher-place": "New York, NY, USA",
"title": "TASSAL: autofolding for source code summarization",
"URL": "https://doi.org/10.1145/2889160.2889171",
"author": [
{
"family": "Fowkes",
"given": "Jaroslav"
},
{
"family": "Chanthirasegaran",
"given": "Pankajan"
},
{
"family": "Ranca",
"given": "Razvan"
},
{
"family": "Allamanis",
"given": "Miltiadis"
},
{