The Encoplot similarity measure for automatic detection of plagiarism
Notebook for PAN at CLEF 2011
This paper describes the evolution of our method, Encoplot, for automatic plagiarism detection and the results of our participation in the PAN'11 competition. The main novelties are a new similarity measure and a new ranking method, which together produce a much better ranking of the source-suspicious document pairs when selecting candidates for the detailed analysis phase. We obtained excellent results in the competition, ranking 1st on the manually paraphrased cases and 2nd overall in the external plagiarism detection task, and achieving the best recall on the non-translated corpus.