A real-life study in phishing detection
Presentation at the seventh Annual Collaboration, Electronic Messaging, Anti-Abuse and Spam Conference (CEAS 2010), Redmond, Washington, USA, July 13-14, 2010
Phishing is a serious threat to global security and economy. Previously we have developed a phishing filtering system based on automatic classification. We perform statistical filtering of emails, where a classifier is trained on characteristic features of existing emails and subsequently is able to identify new phishing emails with different contents. In this work we test our developed system in a real-life environment at a commercial ISP. The system is applied to an unskewed real-life stream consisting of thousands of emails every day. We use active learning to keep the system's model up-to-date. The experiments show that the system performs very well as a filter even in the presence of many spam emails. We furthermore demonstrate that active learning is indeed useful and leads to better results than using a fixed model. Last, we integrate the output of another spam filter into the system and show that this combined filter leads to better results than either filter by itself.
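To make the active-learning step concrete, the following is a minimal sketch of one common selection strategy, uncertainty sampling: from a day's stream, the emails whose classifier scores lie closest to the decision threshold are sent for manual labeling and later used for retraining. The abstract does not state which selection strategy the system uses, so the strategy itself, the function name `select_for_labeling`, and the parameters `budget` and `threshold` are assumptions made purely for illustration.

```python
from typing import List


def select_for_labeling(scores: List[float], budget: int,
                        threshold: float = 0.5) -> List[int]:
    """Pick the emails the classifier is least sure about.

    scores    -- hypothetical phishing scores in [0, 1], one per email
    budget    -- how many emails a human annotator can label (assumed)
    threshold -- decision boundary of the classifier (assumed to be 0.5)

    Returns the indices of the selected emails in ascending order.
    """
    # Rank emails by distance to the threshold: small distance = uncertain.
    ranked = sorted(range(len(scores)),
                    key=lambda i: abs(scores[i] - threshold))
    return sorted(ranked[:budget])


# Example: five emails from the stream, two labels available per day.
daily_scores = [0.02, 0.48, 0.97, 0.55, 0.30]
to_label = select_for_labeling(daily_scores, budget=2)
```

In this sketch the emails scored 0.48 and 0.55 would be selected, since confidently classified emails (0.02, 0.97) add little information when the model is retrained.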