APT CLASS

Attributing a piece of malware to its creator typically requires threat intelligence to attain a sufficient confidence level. Binary attribution increases the level of difficulty as it mostly relies upon the ability to disassemble binaries to gather relevant features and build a fingerprint to identify the author.

To date, most research has focused on source code authorship attribution and on applying similar techniques to benign and malicious binaries. However, this approach gives malicious authors an opportunity to attack authorship attribution models, because of the stark differences between source code and binaries, and between benign and malicious authors.

Our survey (joint work with S3Lab) explores the style of threat actors and the adversarial techniques they use to remain anonymous. We examine the adversarial impact on state-of-the-art methods for binary authorship attribution. Through this approach, we identify key findings and explore the open research challenges in identifying authorship style within malicious binaries.

One major challenge is the lack of a ground truth dataset of malware and authors. To mitigate this issue for the community, we publish alongside this survey a meta-information dataset of 15,660 malware samples labeled with 164 threat actor groups. This is the largest and most diverse dataset of its kind to date. Additionally, we identify a further 7,485 malicious samples currently linked to unknown groups.

Access

To request access to the dataset, please complete the following form:

We have already granted access to people from the following institutions (alphabetical order):
  1. Amadeus IT Group, Spain
  2. Beijing University of Posts and Telecommunications, China
  3. Ben Gurion University, Israel
  4. Bern University of Applied Sciences, Switzerland
  5. BlackTruffle Security
  6. Cybergeeks[.]tech
  7. Delhi Technological University, India
  8. Fraunhofer FKIE, Germany
  9. Georgia Tech Research Institute, USA
  10. Global Infotek, Inc, USA
  11. Grammatech, USA
  12. Harfanglab, France
  13. HRL Laboratories, USA
  14. International Business Machines (IBM), USA
  15. Indian Institute of Technology Kanpur, India
  16. Information Sciences Institute, University of Southern California, USA
  17. Jinan University, China
  18. Kennesaw State University, USA
  19. Kudu Dynamics, USA
  20. Lancaster University, UK
  21. Nanyang Technological University - NTU Singapore
  22. National University of Sciences and Technology, Islamabad, Pakistan
  23. National University of Singapore
  24. NATO
  25. Naval Research Laboratory, USA
  26. OpenAnalysis Inc
  27. Osaka Electro-Communication University, Japan
  28. Recorded Future, USA
  29. Rice University, USA
  30. Royal Holloway, University of London, UK
  31. Ruhr-Universität Bochum, Germany
  32. Sabancı University, Turkey
  33. Shahid Beheshti University, Iran
  34. TU Wien, Austria
  35. UC Berkeley, USA
  36. University Institute of Information Technology, PMAS, Pakistan
  37. University of Chinese Academy of Sciences, China
  38. University of Illinois, USA
  39. University of Kent, UK
  40. University of New Brunswick, Canada
  41. University of Saskatchewan, Canada
  42. Westphalian University, Germany
  43. Wuhan University, China
  44. Zeropoint Dynamics, USA

Papers

Identifying Authorship Style in Malicious Binaries: Techniques, Challenges & Datasets
Jason Gray, Daniele Sgandurra, Lorenzo Cavallaro
arXiv CoRR, 2021
@article{gray2021aptclass,
  author        = {Jason Gray and Daniele Sgandurra and Lorenzo Cavallaro},
  title         = {Identifying Authorship Style in Malicious Binaries: Techniques, Challenges \& Datasets},
  journal       = {CoRR},
  volume        = {abs/2101.06124},
  year          = {2021},
  url           = {http://arxiv.org/abs/2101.06124},
  eprint        = {2101.06124},
  archivePrefix = {arXiv}
}

People

  • Jason Gray, Ph.D. Student, King's College London & Royal Holloway, University of London
  • Daniele Sgandurra, Senior Lecturer (Associate Professor), Royal Holloway, University of London
  • Lorenzo Cavallaro, Full Professor of Computer Science, Chair in Cybersecurity (Systems Security), King's College London