Code plagiarism seriously threatens the healthy and orderly development of the software industry, and researchers have proposed a variety of detection techniques in response. This paper proposes a code plagiarism detection method based on a graph density clustering algorithm, aimed at plagiarism in students' programming assignments. The method first uses the program dependency graph to represent the source code. It then applies one-hot encoding to generate a feature vector from the program dependency graph. Finally, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) serves as the clustering algorithm that performs the plagiarism detection. To verify the feasibility and effectiveness of the approach, experiments are designed on datasets of real programming assignment code. Compared with some existing detection methods, the experimental results show that the proposed algorithm improves accuracy by almost 10% and has better time efficiency.
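
The three stages named in the abstract (program dependency graph, one-hot encoding, density-based clustering) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy `submissions` data, the choice of labelled dependency edges as features, the Hamming distance, and the `eps` and `min_pts` values are all assumptions made for illustration, and the small `dbscan` function is a simplified pure-Python version of Density-Based Spatial Clustering of Applications with Noise (DBSCAN).

```python
# Hypothetical input: each submission is reduced to the set of labelled
# dependency edges (source node type, dependency kind, target node type)
# found in its program dependency graph. Building the graph is out of scope.
submissions = {
    "alice": {("assign", "data", "loop"), ("loop", "control", "call"),
              ("assign", "data", "return")},
    "bob":   {("assign", "data", "loop"), ("loop", "control", "call"),
              ("assign", "data", "return")},
    "carol": {("assign", "data", "loop"), ("loop", "control", "call"),
              ("call", "data", "return")},
    "dave":  {("branch", "control", "assign"), ("assign", "data", "call"),
              ("call", "data", "print")},
}


def one_hot(feature_sets):
    """Encode each feature set as a 0/1 vector over the shared vocabulary."""
    vocab = sorted(set().union(*feature_sets.values()))
    return {name: [1 if f in feats else 0 for f in vocab]
            for name, feats in feature_sets.items()}


def hamming(u, v):
    """Number of positions where two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def dbscan(vectors, eps, min_pts):
    """Simplified DBSCAN: returns a cluster id per item, -1 meaning noise."""
    names = list(vectors)
    # A point's neighbourhood includes the point itself.
    neighbours = {n: [m for m in names
                      if hamming(vectors[n], vectors[m]) <= eps]
                  for n in names}
    labels = {}
    cluster = 0
    for n in names:
        if n in labels:
            continue
        if len(neighbours[n]) < min_pts:
            labels[n] = -1  # noise for now; may become a border point later
            continue
        labels[n] = cluster
        queue = list(neighbours[n])
        while queue:
            m = queue.pop()
            if labels.get(m, -1) != -1:
                continue  # already assigned to a cluster
            labels[m] = cluster
            if len(neighbours[m]) >= min_pts:
                queue.extend(neighbours[m])  # m is a core point: expand
        cluster += 1
    return labels


vectors = one_hot(submissions)
labels = dbscan(vectors, eps=2, min_pts=2)  # assumed thresholds
suspects = sorted(name for name, label in labels.items() if label != -1)
print(labels)
print(suspects)
```

In this sketch, submissions that land in the same cluster are treated as plagiarism candidates, while items labelled -1 are treated as independent work. In practice the thresholds would need tuning against labelled assignments.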
