Hello!

I am a postdoctoral research assistant at the Dipartimento di Elettronica e Informazione, Politecnico di Milano, Italy, working at the NECSTLab with Prof. Stefano Zanero. My research interests revolve around web security and anomaly detection.

See my CV (PDF)

Contact information

Email is the quickest way to contact me. Should you need to communicate with me privately, please use my GPG/PGP public key.

Snail Mail

Selected Publications

Two Years of Short URLs Internet Measurement: Security Threats and Countermeasures
F. Maggi, A. Frossi, S. Zanero, G. Stringhini, B. Stone-Gross, C. Kruegel, G. Vigna. Proceedings of the 22nd International World Wide Web Conference (WWW), 18/06/2013, Rio de Janeiro, Brazil.

BibTeX

@inproceedings{
    2013_maggi_frossi_zanero_stringhini_stone-gross_kruegel_vigna_shorturls,
    author = "Maggi, Federico and Frossi, Alessandro and Zanero, Stefano and Stringhini, Gianluca and Stone-Gross, Brett and Kruegel, Christopher and Vigna, Giovanni",
    publisher = "ACM",
    title = "Two Years of Short URLs Internet Measurement: Security Threats and Countermeasures",
    booktitle = "Proceedings of the 22nd International World Wide Web Conference (WWW)",
    venue = "Rio de Janeiro, Brazil",
    year = "2013",
    date = "2013-06-18",
    keywords = "selected",
    abstract = "URL shortening services have become extremely popular. However, it is still unclear whether they are an effective and reliable tool that can be leveraged to hide malicious URLs, and to what extent these abuses can impact the end users. With these questions in mind, we first analyzed existing countermeasures adopted by popular shortening services. Surprisingly, we found such countermeasures to be ineffective and trivial to bypass. This first measurement motivated us to proceed further with a large-scale collection of the HTTP interactions that originate when web users access live pages that contain short URLs. To this end, we monitored 622 distinct URL shortening services between March 2010 and April 2012, and collected 24,953,881 distinct short URLs. With this large dataset, we studied the abuse of short URLs. Despite short URLs are a significant, new security risk, in accordance with the reports resulting from the observation of the overall phishing and spamming activity, we found that only a relatively small fraction of users ever encountered malicious short URLs. Interestingly, during the second year of measurement, we noticed an increased percentage of short URLs being abused for drive-by download campaigns and a decreased percentage of short URLs being abused for spam campaigns. In addition to these security-related findings, our unique monitoring infrastructure and large dataset allowed us to complement previous research on short URLs and analyze these web services from the user’s perspective."
}


Effective Anomaly Detection with Scarce Training Data
W. Robertson, F. Maggi, C. Kruegel, G. Vigna. Proceedings of the Network and Distributed System Security Symposium (NDSS), 28/02/2010, San Diego, California, United States.

BibTeX

@inproceedings{
    2010_robertson_maggi_kruegel_vigna_long_tail,
    author = "Robertson, William and Maggi, Federico and Kruegel, Christopher and Vigna, Giovanni",
    publisher = "The Internet Society",
    date-modified = "2012-01-17 13:40:58 +0000",
    title = "Effective Anomaly Detection with Scarce Training Data",
    abstract = "Learning-based anomaly detection has proven to be an effective black-box technique for detecting unknown attacks. However, the effectiveness of this technique crucially depends upon both the quality and the completeness of the training data. Unfortunately, in most cases, the traffic to the system (e.g., a web application or daemon process) protected by an anomaly detector is not uniformly distributed. Therefore, some components (e.g., authentication, payments, or content publishing) might not be exercised enough to train an anomaly detection system in a reasonable time frame. This is of particular importance in real-world settings, where anomaly detection systems are deployed with little or no manual configuration, and they are expected to automatically learn the normal behavior of a system to detect or block attacks. In this work, we first demonstrate that the features utilized to train a learning-based detector can be semantically grouped, and that features of the same group tend to induce similar models. Therefore, we propose addressing local training data deficiencies by exploiting clustering techniques to construct a knowledge base of well-trained models that can be utilized in case of undertraining. Our approach, which is independent of the particular type of anomaly detector employed, is validated using the realistic case of a learning-based system protecting a pool of web servers running several web applications such as blogs, forums, or Web services. We run our experiments on a real-world data set containing over 58 million HTTP requests to more than 36,000 distinct web application components. The results show that by using the proposed solution, it is possible to achieve effective attack detection even with scarce training data.",
    venue = "San Diego, California, United States",
    keywords = "selected",
    year = "2010",
    date = "2010-02-28",
    booktitle = "Proceedings of the Network and Distributed System Security Symposium (NDSS)"
}


Protecting a Moving Target: Addressing Web Application Concept Drift
F. Maggi, W. Robertson, C. Kruegel, G. Vigna. Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID), 23/09/2009, St Malo, Brittany, France.

BibTeX

@inproceedings{
    2009_maggi_robertson_kruegel_vigna_concept_drift,
    author = "Maggi, Federico and Robertson, William and Kruegel, Christopher and Vigna, Giovanni",
    bdsk-url-1 = "http://dx.doi.org/10.1007/978-3-642-04342-0_2",
    doi = "10.1007/978-3-642-04342-0_2",
    date-modified = "2012-01-16 12:19:26 +0000",
    title = "Protecting a Moving Target: Addressing Web Application Concept Drift",
    abstract = "Because of the ad hoc nature of web applications, intrusion detection systems that leverage machine learning techniques are particularly well-suited for protecting websites. The reason is that these systems are able to characterize the applications' normal behavior in an automated fashion. However, anomaly-based detectors for web applications suffer from false positives that are generated whenever the applications being protected change. These false positives need to be analyzed by the security officer who then has to interact with the web application developers to confirm that the reported alerts were indeed erroneous detections. In this paper, we propose a novel technique for the automatic detection of changes in web applications, which allows for the selective retraining of the affected anomaly detection models. We demonstrate that, by correctly identifying legitimate changes in web applications, we can reduce false positives and allow for the automated retraining of the anomaly models. We have evaluated our approach by analyzing a number of real-world applications. Our analysis shows that web applications indeed change substantially over time, and that our technique is able to effectively detect changes and automatically adapt the anomaly detection models to the new structure of the changed web applications.",
    venue = "St Malo, Brittany, France",
    keywords = "selected",
    year = "2009",
    date = "2009-09-23",
    booktitle = "Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID)"
}

