Using Artificial Intelligence to Fight Phishing

Phishing is easy and profitable, which is why countless new phishing websites pop up every day. Experts try to mitigate phishing before users fall into the trap. Now, artificial intelligence helps them get the job done.

Phishing, the malicious act of impersonating a trustworthy entity in order to obtain sensitive information, is an ongoing and widespread threat. Creating fake websites or hijacking existing ones, registering domains, and spreading the corresponding links by e-mail or social media is simple, cheap, and apparently quite profitable. Hence, hundreds of new phishing websites are created every day, and many Internet users are affected. The Swisscom security team is well aware of this and tries to mitigate phishing before users are tricked into visiting such a website.

However, this is a very time-consuming, repetitive, and monotonous task. From the vast number of e-mails the security agents receive, they need to pick out the ones that report phishing, confirm that the referenced website actually is a phishing site, notify the web hosting provider, and block the site so that Swisscom customers do not risk accidentally exposing their personal information.

Phishing Detection

To assist in this process, we at Swisscom Innovations developed and trained an AI-based phishing detection system. It reliably predicts whether a previously unseen website is a phishing site or not. To achieve that, we first had to figure out how we as humans actually work: how can we tell a phishing website apart from a legitimate one? It turned out that, as in other areas, our brains excel at recognizing familiar visual patterns.

An example: if we navigate to a website that is mainly white and gray with blue accent colors and clearly shows a logo consisting of a double P next to the word PayPal, it feels familiar. A quick glance at the address bar, which shows not paypal.com but www-paypal.com-somethingelse.info, and we are certain that this is phishing. However, if the targeted company is unfamiliar, for instance a small insurance company from a foreign country, it quickly becomes difficult to tell the true intent of a website.
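
The address-bar check from this example can be written down in a few lines. The following is only a minimal sketch of that single check, not the detection system described in this article; the function names and the default brand and domain values are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit


def hostname_of(url: str) -> str:
    """Return the lower-cased host name of a URL, or '' if there is none."""
    # urlsplit only recognizes the host when a scheme or a leading "//" is present.
    if "//" not in url:
        url = "//" + url
    return (urlsplit(url).hostname or "").lower()


def belongs_to(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def looks_like_impersonation(url: str,
                             brand: str = "paypal",
                             legit_domain: str = "paypal.com") -> bool:
    """Flag URLs whose host mentions the brand but lies outside the brand's domain.

    Hypothetical helper: brand and legit_domain are illustrative defaults.
    """
    host = hostname_of(url)
    return brand in host and not belongs_to(host, legit_domain)


if __name__ == "__main__":
    for candidate in ("https://www.paypal.com/signin",
                      "http://www-paypal.com-somethingelse.info/login"):
        print(candidate, looks_like_impersonation(candidate))
```

A check like this only works when the brand and its real domain are known in advance, which is exactly why unfamiliar targets are harder to judge.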

Knowing this, we started to identify descriptive features that indicate whether a website is phishing or legitimate, such as how the URL is structured and what the website body contains or lacks. To date, we have identified over 130 distinctive features. This information, together with a collection of thousands of sample websites, allowed us to train a machine learning algorithm. The results are astonishing: the trained model classifies websites with over 98% accuracy. In fact, filtering the results by the prediction's confidence yields near 100% accuracy on the predictions that remain.
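
To make the two ideas in this paragraph more concrete, here is a small sketch of what URL-structure features and confidence-based filtering could look like. It is an illustration under stated assumptions, not the actual feature set or model: the six features, the `triage` function, and the 0.95 threshold are all hypothetical, and the classifier that would produce the phishing probability is left out.

```python
from urllib.parse import urlsplit


def url_features(url: str) -> dict:
    """Compute a few illustrative URL-structure features (hypothetical selection)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return {
        "url_length": len(url),
        "host_dots": host.count("."),        # many dots can hint at nested fake domains
        "host_hyphens": host.count("-"),     # hyphens are common in look-alike hosts
        "host_has_digits": any(c.isdigit() for c in host),
        "uses_https": parts.scheme == "https",
        "has_at_sign": "@" in url,           # "@" can hide the real host from the reader
    }


def triage(p_phishing: float, threshold: float = 0.95) -> str:
    """Act automatically only on confident predictions; send the rest to a human.

    p_phishing is assumed to be the probability reported by some trained
    classifier; the threshold value is an assumption for this sketch.
    """
    if p_phishing >= threshold:
        return "phishing"
    if p_phishing <= 1.0 - threshold:
        return "legitimate"
    return "needs_review"


if __name__ == "__main__":
    print(url_features("http://www-paypal.com-somethingelse.info/login"))
    for p in (0.99, 0.60, 0.02):
        print(p, triage(p))
```

The design choice in `triage` mirrors the trade-off mentioned above: raising the threshold increases accuracy on the automatically handled cases at the cost of leaving more websites for manual review.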

This system is now integrated into a tool for managing domain barrings. It makes the previously described process of handling phishing reports more efficient and more reliable, giving the security agents time to focus on other important work. What used to take several minutes per report has become, with the help of AI and classical automation, a one-click action.