By Igino Corona (1,2), Battista Biggio (1,2), Matteo Contini (2), Luca Piras (1,2), Roberto Corda (2), Mauro Mereu (2), Guido Mureddu (1), Davide Ariu (1,2) and Fabio Roli (1,2)
Fig. 1. DeltaPhish Logo
Fig. 2. Homepage (left), legitimate (middle) and phishing (right) pages hosted in a compromised website.
DeltaPhish analyzes both the HTML code and the visual appearance of each page, using state-of-the-art computer-vision techniques.
Fig. 3. General architecture of DeltaPhish.
Fig. 4. Computation of the visual features. Color histograms and HOG features are extracted from each image tile and concatenated to form a complete feature vector.
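The tile-based computation described in Fig. 4 can be sketched as follows. This is a minimal, pure-Python illustration and not the DeltaPhish implementation: the function names, the 2x2 tile grid, the bin counts, and the simplified orientation histogram (a stand-in for full HOG, without cells or block normalization) are all assumptions made for the example.

```python
import math

def color_histogram(tile, bins=4):
    """Normalized per-channel histogram of an RGB tile (channel values 0-255)."""
    hist = [0.0] * (3 * bins)
    n = 0
    for row in tile:
        for pixel in row:
            for c, value in enumerate(pixel):
                hist[c * bins + min(value * bins // 256, bins - 1)] += 1
            n += 1
    return [h / n for h in hist] if n else hist

def orientation_histogram(tile, bins=9):
    """Simplified HOG stand-in: magnitude-weighted histogram of unsigned gradient orientations."""
    gray = [[sum(p) / 3.0 for p in row] for row in tile]
    h, w = len(gray), len(gray[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]   # central differences
            gy = gray[y + 1][x] - gray[y - 1][x]
            angle = math.atan2(gy, gx) % math.pi   # unsigned orientation
            hist[min(int(angle / math.pi * bins), bins - 1)] += math.hypot(gx, gy)
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0  # L2 normalization
    return [v / norm for v in hist]

def visual_features(image, tiles_per_side=2):
    """Split the image into a grid of tiles and concatenate the per-tile features."""
    h, w = len(image), len(image[0])
    th, tw = h // tiles_per_side, w // tiles_per_side
    features = []
    for ty in range(tiles_per_side):
        for tx in range(tiles_per_side):
            tile = [row[tx * tw:(tx + 1) * tw]
                    for row in image[ty * th:(ty + 1) * th]]
            features += color_histogram(tile) + orientation_histogram(tile)
    return features

# Usage: an 8x8 uniformly red "screenshot" split into 2x2 tiles.
image = [[(255, 0, 0)] * 8 for _ in range(8)]
feats = visual_features(image)
print(len(feats))  # 4 tiles x (12 color bins + 9 orientation bins)
```

Because each tile contributes a fixed-length block, the final vector keeps coarse spatial information: two pages with the same overall colors but a different layout produce different vectors.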
Furthermore, drawing on recent research findings in adversarial machine learning, we have designed DeltaPhish to be robust to worst-case manipulations of the HTML code of phishing webpages, that is, changes made with the specific intent of evading our detection algorithm.
Fig. 5. Adversarial fusion schemes of HTML and Image classifiers to improve resilience against attacks targeting the HTML component.
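To illustrate why the way the two classifiers are fused matters under attack, the sketch below compares two linear fusion rules when the attacker fully controls the HTML score. It is a toy example under stated assumptions, not the fusion schemes shown in Fig. 5: the score range of [-1, 1], the zero decision threshold, the weights, and all function names are hypothetical.

```python
def fuse(html_score, image_score, w_html, w_image):
    """Linear fusion of two classifier scores in [-1, 1] (assumed range)."""
    return w_html * html_score + w_image * image_score

def worst_case(image_score, w_html, w_image, html_floor=-1.0):
    """Fused score when the attacker drives the HTML score to its minimum.

    Assumption: the attacker can manipulate the HTML arbitrarily but
    cannot change the rendered appearance without spoiling the scam.
    """
    return fuse(html_floor, image_score, w_html, w_image)

def is_phishing(score, threshold=0.0):
    """Flag the page when the fused score exceeds the (assumed) threshold."""
    return score > threshold

# A phishing page that both classifiers recognize when it is not manipulated.
html_score, image_score = 0.8, 0.6

# Without an attack, equal weights detect the page.
print(is_phishing(fuse(html_score, image_score, 0.5, 0.5)))

# Under a worst-case HTML attack, equal weights are evaded,
# while a fusion that leans on the image classifier still detects it.
print(is_phishing(worst_case(image_score, 0.5, 0.5)))
print(is_phishing(worst_case(image_score, 0.25, 0.75)))
```

The design point is that a fusion rule chosen with the attack in mind limits how much a fully manipulated HTML score can pull the final decision, at the cost of relying less on the HTML classifier when no attack is present.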
DeltaPhish can be used as an additional component within a web application firewall to protect a website from these kinds of automated phishing attacks and, hence, to reveal signs of website compromise. In this application setting, we have shown that DeltaPhish detects more than 99% of phishing webpages while misclassifying less than 1% of legitimate pages, and that the detection rate remains above 70% even under very sophisticated attacks carefully designed to evade our system.
Please contact the authors for further information.
Read the paper.
Access the dataset.