Advancing threat-hunting and threat knowledge acquisition capabilities.
Our research currently aims at achieving two specific goals:
Efficient search of provenance graphs for known threat behaviors: A provenance graph represents system audit logs as a heterogeneous, typed, directed, and dynamic graph that captures information-flow and control-flow relations among high-level system entities such as processes, network sockets, and files. Because they preserve the context of system events, provenance graphs are well suited to detecting system compromises and performing forensic analysis. Our research uses provenance graphs to answer the question of how to efficiently search the audit logs of enterprise-grade systems for known attack behaviors. We formulate threat hunting as an approximate subgraph matching problem and address it with graph neural networks. The complex nature of provenance graphs, however, poses significant challenges to graph learning methods. Our work addresses these challenges and proposes the use of geometric embeddings to impose an order on the space of subgraph-level representations, so that these representations can be rapidly matched against those of queried behaviors.
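To make the matching step concrete, the following is a minimal sketch of one common geometric construction, an order embedding, in which a query behavior is treated as contained in a candidate subgraph when the query vector is elementwise no larger than the candidate vector. The text above does not state which geometric embedding is used, so this choice is an assumption. The function names, the threshold, and the toy vectors are hypothetical, and in practice the vectors would come from a trained graph neural network rather than being written by hand.

```python
from typing import Dict, List, Sequence, Tuple


def order_violation(query: Sequence[float], target: Sequence[float]) -> float:
    """Squared amount by which `query` exceeds `target` in any dimension.

    Under the assumed order-embedding convention, a value of 0 means the
    query behavior is predicted to be a subgraph of the target.
    """
    if len(query) != len(target):
        raise ValueError("embeddings must have the same dimension")
    return sum(max(0.0, q - t) ** 2 for q, t in zip(query, target))


def match_candidates(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    threshold: float = 0.01,
) -> List[Tuple[str, float]]:
    """Return candidates whose violation is within `threshold`, best first."""
    scored = [(name, order_violation(query, emb)) for name, emb in candidates.items()]
    hits = [(name, score) for name, score in scored if score <= threshold]
    return sorted(hits, key=lambda item: item[1])


# Hypothetical precomputed embeddings of provenance subgraphs.
candidate_embeddings = {
    "host_a_window_1": [0.4, 0.9, 0.3],
    "host_b_window_7": [0.1, 0.2, 0.6],
}
# Hypothetical embedding of a queried attack behavior.
query_embedding = [0.2, 0.5, 0.1]

for name, score in match_candidates(query_embedding, candidate_embeddings):
    print(name, score)
```

Because the comparison is a simple elementwise test over precomputed vectors, candidate subgraphs can be embedded once and then checked against many queries without re-running graph matching, which is the efficiency argument behind this kind of approach.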
Advancing natural language understanding for CTI text: This thread of work aims to improve language understanding for cybersecurity text in view of recent advances in NLP and machine learning. As a first step towards this goal, we investigate the automatic annotation of incident response reports to identify trends and patterns in the way APT attacks are carried out. This task is currently performed manually by analysts. Our approach maps the findings of a security report to the relevant tactics and techniques in the MITRE ATT&CK framework. We formulate the problem as a semantic retrieval task and build on existing solutions from that domain to obtain a working model. This work will be extended to extract more actionable intelligence from security reports.
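The retrieval formulation can be illustrated with a small sketch: each ATT&CK technique is represented by a short description, a report finding is embedded in the same space, and techniques are ranked by similarity. The example below substitutes a bag-of-words vector and cosine similarity for a learned text encoder, so it shows only the shape of the task, not the model used in this project. The technique IDs and names are real ATT&CK entries, but the descriptions are abbreviated paraphrases written for illustration, and all function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

# Illustrative paraphrases, not the official ATT&CK description text.
TECHNIQUES: Dict[str, str] = {
    "T1566 Phishing": "adversary sends phishing email with malicious attachment or link to gain access",
    "T1059 Command and Scripting Interpreter": "adversary executes commands or scripts using powershell or a shell interpreter",
    "T1003 OS Credential Dumping": "adversary dumps credentials and password hashes from operating system memory",
}


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence encoder: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_techniques(finding: str, top_k: int = 2) -> List[Tuple[str, float]]:
    """Rank techniques by similarity to a report finding, best first."""
    query = embed(finding)
    scored = [(name, cosine(query, embed(desc))) for name, desc in TECHNIQUES.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


finding = "The attacker ran an encoded PowerShell script to execute commands on the host"
for name, score in rank_techniques(finding):
    print(f"{score:.3f}  {name}")
```

A token-overlap encoder like this one fails on paraphrases (for example, "ran" versus "executes"), which is the kind of gap a learned semantic encoder is meant to close.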
The project is still at an early stage, but encouraging initial results indicate the viability of the initially proposed solutions. The research team is expected to grow with the addition of a new postdoctoral member.
Our project considers the operational setting of an enterprise security team and aims to improve the team's ability to respond to incidents and to better understand threat behaviors. Since incident response is the most human-centered security task, there is a need for support systems that improve analyst efficiency. Similarly, both the generation and the consumption of threat intelligence largely require humans in the loop. Our project aims to reduce the analyst effort that these tasks demand. To this end, the first part of our work strives to build a system that lets threat hunters efficiently query attack behaviors. This will allow them to proactively search their historical system log data for recent attack behaviors, which are typically described in open-source threat intelligence sources. The second part, on threat report annotation, aims to extract structured threat information from unstructured data such as incident response reports.
To develop systems that are truly impactful, we understand the importance of aligning our research goals with the needs of our stakeholders and the industry. Therefore, we will prioritize exploring collaborations and partnerships that can guide us in this process and help us achieve our commercialization goals. Ultimately, our plan is to build systems that are not only cutting-edge but also directly address the real-world challenges faced by our stakeholders and industry partners.