DeltaPhish - Detecting phishing webpages in compromised websites

Cybersecurity 02 February 2019

By Igino Corona (1,2), Battista Biggio (1,2), Matteo Contini (2), Luca Piras (1,2), Roberto Corda (2), Mauro Mereu (2), Guido Mureddu (1), Davide Ariu (1,2) and Fabio Roli (1,2)

1 Pluribus One, Cagliari, Italy

2 Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy

This approach is based on our ESORICS 2017 paper: DeltaPhish: Detecting Phishing Webpages in Compromised Websites.

To view the classification reports produced by DeltaPhish on the data used in our ESORICS 2017 paper, please visit this page.

Fig. 1. DeltaPhish Logo

The large-scale deployment of modern phishing attacks relies on the automatic exploitation of vulnerable websites in the wild. To appreciate the importance of this phenomenon, note that, according to the most recent Global Phishing Survey by APWG, published in 2014, 59,485 of the 87,901 domains linked to phishing scams (71.4%) actually pointed to legitimate, though compromised, websites.

To counter this threat, we have developed DeltaPhish, a tool capable of detecting phishing webpages hosted in compromised websites through the analysis of the differences between the visited webpages and a predetermined reference page (e.g., the website homepage).
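
As a rough illustration of what comparing a page against a reference page can look like, the sketch below computes two similarity scores between a visited page and the homepage. It is a minimal example under stated assumptions, not the feature set used by DeltaPhish: the names `TagAndLinkCollector`, `jaccard` and `delta_features`, and the choice of tag names and link targets as the compared elements, are hypothetical.

```python
from html.parser import HTMLParser


class TagAndLinkCollector(HTMLParser):
    """Collects the set of tag names and the set of link targets in a page."""

    def __init__(self):
        super().__init__()
        self.tags = set()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.add(value)


def jaccard(a, b):
    """Jaccard index of two sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def delta_features(page_html, home_html):
    """Hypothetical difference features between a page and the homepage."""
    page, home = TagAndLinkCollector(), TagAndLinkCollector()
    page.feed(page_html)
    home.feed(home_html)
    return {
        "tag_similarity": jaccard(page.tags, home.tags),
        "link_similarity": jaccard(page.links, home.links),
    }
```

A page that shares few tags and no links with the homepage yields low similarity values, which a downstream classifier could treat as evidence that the page does not belong to the site.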

Fig. 2. Homepage (left), legitimate (middle) and phishing (right) pages hosted in a compromised website.

DeltaPhish analyzes both the HTML code and the visual appearance of each page, using state-of-the-art computer-vision techniques.

Fig. 3. General architecture of DeltaPhish.

Fig. 4. Computation of the visual features. Color histograms and HOG features are extracted from each image tile and concatenated to form a complete feature vector.
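
The tiling idea in Fig. 4 can be sketched in a few lines of plain Python. This is an illustrative approximation only: the image is assumed to be a nested list of `(r, g, b)` tuples, the grid size and bin counts are arbitrary, and `orientation_histogram` is a simplified HOG-like descriptor (no cells or block normalisation), not the full HOG used in practice.

```python
import math


def color_histogram(tile, bins=4):
    """Per-channel histogram of an RGB tile, normalised by pixel count."""
    hist = [0.0] * (3 * bins)
    n_pixels = 0
    for row in tile:
        for pixel in row:
            n_pixels += 1
            for channel, value in enumerate(pixel):
                hist[channel * bins + min(value * bins // 256, bins - 1)] += 1.0
    return [h / n_pixels for h in hist]


def orientation_histogram(tile, bins=9):
    """Simplified HOG-like descriptor: magnitude-weighted histogram of
    unsigned gradient orientations (0 to 180 degrees), L2-normalised."""
    gray = [[sum(p) / 3.0 for p in row] for row in tile]
    h, w = len(gray), len(gray[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / (180.0 / bins)), bins - 1)] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def tile_features(image, grid=(2, 2)):
    """Split the image into grid tiles and concatenate per-tile features."""
    rows, cols = grid
    h, w = len(image), len(image[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            tile = [row[c * w // cols:(c + 1) * w // cols]
                    for row in image[r * h // rows:(r + 1) * h // rows]]
            features += color_histogram(tile) + orientation_histogram(tile)
    return features
```

With the default settings each tile contributes 12 colour values and 9 orientation values, so a 2x2 grid yields an 84-dimensional vector.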

Furthermore, by exploiting the latest research findings in the area of adversarial machine learning, we have designed DeltaPhish to be robust to adversarial, worst-case manipulations of the HTML code of the phishing webpages made with the specific intent of evading our detection algorithm.

Fig. 5. Adversarial fusion schemes of HTML and Image classifiers to improve resilience against attacks targeting the HTML component.
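
To see why combining the two classifiers can help against HTML-only manipulations, consider the toy score fusion below. It is a sketch under assumptions rather than the fusion scheme of DeltaPhish: scores are assumed to lie in [0, 1], the fusion is a plain weighted sum, and the functions `fuse` and `worst_case_html_attack`, along with all weights and thresholds, are hypothetical.

```python
def fuse(html_score, image_score, w_html=0.5, w_image=0.5, threshold=0.5):
    """Flag a page as phishing when the weighted score reaches the threshold.

    Scores are assumed to lie in [0, 1]; weights and threshold are illustrative.
    """
    combined = w_html * html_score + w_image * image_score
    return combined >= threshold


def worst_case_html_attack(image_score, **fusion_params):
    """Simulate an attacker who drives the HTML score to 0 but cannot
    change how the rendered page looks."""
    return fuse(0.0, image_score, **fusion_params)


# A phishing page that looks suspicious both in its HTML and in its rendering.
detected_clean = fuse(0.9, 0.8)
# Equal weights: the HTML-only attack is enough to evade detection.
detected_equal = worst_case_html_attack(0.8)
# Image-heavy weights: the same attack no longer succeeds.
detected_robust = worst_case_html_attack(0.8, w_html=0.2, w_image=0.8)
```

In this toy setting, limiting how much the final decision depends on the HTML score bounds what an attacker can gain by manipulating the HTML alone.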

DeltaPhish can be deployed as an additional component of a web application firewall to protect a website from this kind of automated phishing attack and, in turn, to reveal signs of website compromise. In this setting, we have shown that DeltaPhish detects more than 99% of phishing webpages while misclassifying less than 1% of legitimate pages, and that its detection rate remains above 70% even under very sophisticated attacks carefully designed to evade the system.
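
Figures of this kind are typically obtained by fixing the tolerated false-positive rate on legitimate pages and then measuring how many phishing pages exceed the resulting threshold. The sketch below shows that computation on classifier scores; the function names and the convention that higher scores mean "more likely phishing" are assumptions for illustration, not details taken from the paper.

```python
def threshold_at_fp_rate(legit_scores, max_fp_rate=0.01):
    """Return a threshold such that the fraction of legitimate scores
    strictly above it does not exceed max_fp_rate."""
    ranked = sorted(legit_scores, reverse=True)
    # Number of legitimate pages we tolerate being flagged.
    allowed = min(int(max_fp_rate * len(ranked)), len(ranked) - 1)
    return ranked[allowed]


def detection_rate(phish_scores, threshold):
    """Fraction of phishing pages whose score is strictly above the threshold."""
    return sum(s > threshold for s in phish_scores) / len(phish_scores)
```

For example, with 100 legitimate scores and a 1% budget, the threshold is set at the second-highest legitimate score, so at most one legitimate page is flagged.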

Read the paper.

Access the Dataset.
