APT CLASS
To date, most research focuses on source code authorship attribution and on applying similar techniques to benign and malicious binaries. However, source code differs starkly from binaries, and benign authors differ starkly from malicious ones, which gives malicious authors an opportunity to attack authorship attribution models.
Our survey (joint work with S3Lab) explores the style of threat actors and the adversarial techniques they use to remain anonymous. We examine the adversarial impact on state-of-the-art methods for binary authorship attribution. Through this approach, we identify key findings and explore the open research challenges in identifying authorship style within malicious binaries.
One major challenge is the lack of a ground truth dataset of malware and authors. To mitigate this issue for the community, we publish alongside this survey a meta-information dataset of 15,660 malware samples labeled to 164 threat actor groups. This is the largest and most diverse dataset to date. Additionally, we identify a further 7,485 malicious samples currently linked to unknown groups.
Access
To request access to the dataset, please complete the following form:
We have already granted access to people from the following institutions (alphabetical order):
- Amadeus IT Group, Spain
- Beijing University of Posts and Telecommunications, China
- Ben Gurion University, Israel
- Bern University of Applied Sciences, Switzerland
- BlackTruffle Security
- Cybergeeks[.]tech
- Delhi Technological University, India
- Fraunhofer FKIE, Germany
- Georgia Tech Research Institute, USA
- Global Infotek, Inc, USA
- Grammatech, USA
- Harfanglab, France
- HRL Laboratories, USA
- International Business Machines (IBM), USA
- Indian Institute of Technology Kanpur, India
- Information Sciences Institute, University of Southern California, USA
- Jinan University, China
- Kennesaw State University, USA
- Kudu Dynamics, USA
- Lancaster University, UK
- Nanyang Technological University (NTU), Singapore
- National University of Sciences and Technology, Islamabad, Pakistan
- National University of Singapore
- NATO
- Naval Research Laboratory, USA
- OpenAnalysis Inc
- Osaka Electro-Communication University, Japan
- Recorded Future, USA
- Rice University, USA
- Royal Holloway, University of London, UK
- Ruhr-Universität Bochum, Germany
- Sabancı University, Turkey
- Shahid Beheshti University, Iran
- TU Wien, Austria
- UC Berkeley, USA
- University Institute of Information Technology, PMAS, Pakistan
- University of Chinese Academy of Sciences, China
- University of Illinois, USA
- University of Kent, UK
- University of New Brunswick, Canada
- University of Saskatchewan, Canada
- Westphalian University, Germany
- Wuhan University, China
- Zeropoint Dynamics, USA
Papers
Identifying Authorship Style in Malicious Binaries: Techniques, Challenges & Datasets
CoRR · arXiv CoRR, 2021
@article{gray2021aptclass,
author = {Jason Gray and Daniele Sgandurra and Lorenzo Cavallaro},
title = {Identifying Authorship Style in Malicious Binaries: Techniques, Challenges \& Datasets},
journal = {CoRR},
volume = {abs/2101.06124},
year = {2021},
url = {http://arxiv.org/abs/2101.06124},
eprint = {2101.06124},
archivePrefix = {arXiv}
}
People
- Jason Gray, Ph.D. Student, King's College London & Royal Holloway, University of London
- Daniele Sgandurra, Senior Lecturer (Associate Professor), Royal Holloway, University of London
- Lorenzo Cavallaro, Full Professor of Computer Science, Chair in Cybersecurity (Systems Security), King's College London