SOFTWARE PLAGIARISM DETECTION
Keywords:
Plagiarism, Birth-marking, Jaccard, Cosine, Dice, Reflect API, feature set
Abstract
With the development of the internet and electronic devices, software plagiarism
has become prevalent in the software industry as well as in educational institutes, violating one’s intellectual integrity. Certain
techniques, such as watermarking and semantics-preserving code obfuscation, were introduced to tackle this issue.
However, watermarking requires inserting additional data into the original program, and code obfuscation can often
destroy watermarks. It has also been found that a sufficiently determined attacker may be able to destroy any watermark. To
overcome these issues, the birth-marking technique has been proposed, which extracts a set of characteristics that uniquely identify
the original program. Our work focuses on extracting birthmarks from source code, implementing algorithms to
measure the similarity between them, and displaying the results in a dedicated user interface with respect to a preset
threshold.
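
As a rough illustration of the similarity step, the sketch below compares two birthmarks represented as feature sets using the Jaccard, Dice, and cosine measures named in the keywords. It is a minimal sketch under stated assumptions, not the implementation described in this work: the feature strings, the function names, and the threshold value of 0.7 are all hypothetical, and Python is used only for brevity.

```python
from math import sqrt

def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def dice(a: set, b: set) -> float:
    """2|A ∩ B| / (|A| + |B|); two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def cosine(a: set, b: set) -> float:
    """Cosine similarity of the binary feature vectors implied by the sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / (sqrt(len(a)) * sqrt(len(b)))

# Hypothetical birthmarks: one feature string per extracted characteristic.
original = {"class:Stack", "field:int:count", "method:push", "method:pop"}
suspect = {"class:Pile", "field:int:count", "method:push", "method:pop"}

THRESHOLD = 0.7  # assumed preset threshold, chosen only for this example

for name, measure in (("Jaccard", jaccard), ("Dice", dice), ("Cosine", cosine)):
    score = measure(original, suspect)
    verdict = "flagged" if score >= THRESHOLD else "not flagged"
    print(f"{name}: {score:.2f} ({verdict})")
```

With three of four features shared, the Jaccard score is 3/5 = 0.60, while Dice and cosine both give 0.75, so the same pair can fall on either side of a threshold depending on the measure chosen.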