Trellix Excels at Protection, Visibility & Detections in the Wizard Spider & Sandworm ATT&CK Evaluation

By Natalia Ciapponi, Raul Collantes, Maristela Ames, Ismael Valenzuela · March 31, 2022

For the last four years, MITRE Engenuity has been evaluating cybersecurity products using an open methodology based on the ATT&CK® knowledge base. The main goal of these evaluations is to improve organizations’ defenses against known adversary behaviors.

To accomplish this, MITRE prepares real-world attack emulations that cover the behavior of relevant advanced persistent threats (APTs) and asks vendor participants to demonstrate their ability to see, identify, and detect those activities.

Since 2018, these evaluations have become more sophisticated and more closely aligned with the reality of the cybersecurity industry:

Round 1 (2018): APT3 emulation, focused on defense evasion using LOLBAS
Round 2 (2020): APT29 emulation, focused on defense evasion using PowerShell and WMI (Windows Management Instrumentation)
Round 3 (2021): FIN7/Carbanak emulation, focused on stealth operations using obfuscation and advanced malware
Round 4 (2022): Wizard Spider/Sandworm emulation, focused on ransomware behavior and a broad range of advanced post-exploitation tradecraft

In the first rounds, most of the visibility and detections offered by evaluated products were rather simple and focused on process execution and command lines. Over the years, MITRE Engenuity has been pushing all security vendors to the next level to obtain better visibility and more contextual, behavior-based detections.

With the 2022 Enterprise Evaluation on Wizard Spider and Sandworm, the MITRE ATT&CK team challenged all security vendors to highlight their latest technologies, integrations, and sensors and to demonstrate their ability to see and detect the activity emulated by these ransomware groups.

The victims were in South Asia, in the telecommunications and defense sectors, and align with China’s geopolitical interests. One such initiative is the Belt and Road Initiative, through which China aims to establish strong socioeconomic relationships across Europe, Asia, and Africa via trade.

Figure 1: Our improvement journey throughout the MITRE ATT&CK evaluation program

We have been participating in ATT&CK evaluations since the very first one in 2018, using these opportunities to test and improve our technologies, to learn from them, and to adapt to what organizations need to get the best visibility and protection against advanced persistent threats (APTs) and other malicious activities.

Trellix Results in Round 4

Visibility

An alert-driven solution is not enough to support the mission of a SOC (Security Operations Center). Investigations and threat hunting workflows are a critical part of security operations, and high-quality visibility (or telemetry, using MITRE’s definition) is key to eliminating blind spots and obtaining a full picture of what is happening during an attack. Visibility is also needed to determine the most appropriate countermeasures to protect organizations from being attacked or exposed.

Since the very first evaluation, we have continued to review and improve our visibility capabilities, constantly testing, analyzing, and tuning them to ensure high-quality telemetry, with context that is readily available to support our customers’ security outcomes.

Figure 2: Our visibility results over the last four MITRE evaluations

Detection

We have been working on maximizing detection rates, as well as the precision and specificness of our detections, to provide world-class detections along with high-quality visibility. With both elements in place, we can provide maximum context and understanding of activity and behaviors, while eliminating blind spots. The aggregation of low-level detections into higher-context ones also allows SOC teams to reduce alert fatigue and increases alert actionability.
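As a rough illustration of what aggregating low-level detections can look like, here is a minimal Python sketch. It is not how our products implement correlation: the alert fields (`host`, `ts`, `name`), the `aggregate` function, and the 300-second window are all assumptions made for this example. It simply groups alerts from the same host into one incident when they occur close together in time.

```python
from collections import defaultdict


def aggregate(alerts, window=300):
    """Group alerts per host into incidents.

    A new incident starts whenever the gap between two consecutive
    alerts on the same host exceeds `window` seconds (hypothetical rule).
    """
    by_host = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        by_host[alert["host"]].append(alert)

    incidents = []
    for host, host_alerts in by_host.items():
        current = [host_alerts[0]]
        for alert in host_alerts[1:]:
            if alert["ts"] - current[-1]["ts"] <= window:
                current.append(alert)
            else:
                incidents.append({"host": host, "alerts": current})
                current = [alert]
        incidents.append({"host": host, "alerts": current})
    return incidents


# Toy alerts (invented for illustration, not evaluation data).
alerts = [
    {"host": "ws-01", "ts": 0, "name": "Suspicious PowerShell"},
    {"host": "srv-02", "ts": 50, "name": "Unusual service creation"},
    {"host": "ws-01", "ts": 100, "name": "Process discovery"},
    {"host": "ws-01", "ts": 250, "name": "Credential access"},
    {"host": "ws-01", "ts": 2000, "name": "Archive created"},
]

incidents = aggregate(alerts)
for incident in incidents:
    print(incident["host"], [a["name"] for a in incident["alerts"]])
```

In this toy run, five low-level alerts collapse into three incidents, which is the kind of reduction that makes a queue easier for an analyst to work through.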

Over the past four years we have constantly improved the quality and precision of our detections. In the 2018 APT3 evaluation, technique detections covered 14% of the total number of steps tested. In the 2022 evaluation, technique detections covered 66% of the steps in the Wizard Spider and Sandworm emulation. This is important because a ‘technique detection’ represents the most contextual, highest-quality type of detection in MITRE’s detection rating system.

Time is also a critical factor when evaluating the quality of a detection system. In this latest MITRE evaluation, 19 attack objectives or phases were exercised, and in 100% of the cases the blue team received early and very precise indications of an attack, with multiple indications arriving before the breakout point or the detonation of the ransomware payload.

Figure 3: Our detection improvements over the last four MITRE evaluations

Let us now review the analytical coverage of our detections to understand the effectiveness and precision achieved in the latest MITRE evaluation.

Alert-Actionability & Time-Based Security

One of the key aspects of an effective detection solution is to detect and react early, raising an alarm as early as possible in the attack chain, while correlating, enriching, and summarizing all subsequent activity to preserve actionability.

We demonstrated superior detection efficacy by successfully identifying all 19 attack phases in their preliminary stages, with highly contextual detections and telemetry that help analysts understand the situation they are facing.

Figure 4: Our early detection capabilities on every attack phase exercised (Day 1)

Figure 5: Our early detection capabilities on every attack phase exercised (Day 2)
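For readers who want to measure something similar in their own exercises, the following Python sketch shows one possible way to compute the delay to the first detection per attack phase and the share of phases detected. The `Phase` class, the function names, and the timings are hypothetical and are not taken from the MITRE results.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Phase:
    name: str
    started_at: float  # seconds since the emulation began
    detections_at: List[float] = field(default_factory=list)


def first_detection_delay(phase: Phase) -> Optional[float]:
    """Seconds between the start of a phase and its first detection, or None."""
    hits = [t for t in phase.detections_at if t >= phase.started_at]
    return min(hits) - phase.started_at if hits else None


def detection_coverage(phases: List[Phase]) -> float:
    """Percentage of phases with at least one detection."""
    if not phases:
        return 0.0
    detected = sum(1 for p in phases if first_detection_delay(p) is not None)
    return 100.0 * detected / len(phases)


# Toy phases with invented timings (not evaluation data).
phases = [
    Phase("initial-access", 0.0, [4.0, 9.0]),
    Phase("discovery", 60.0, [61.5]),
    Phase("lateral-movement", 120.0, []),
    Phase("impact", 300.0, [310.0]),
]

delays = [first_detection_delay(p) for p in phases]
print(delays)
print(detection_coverage(phases))
```

A phase with no detection shows up as `None`, so gaps in coverage are easy to spot alongside slow detections.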

Alert-Specificness

When talking about detection quality and specificness, MITRE Engenuity defines three simple categories:

General: Limited or no details about the specific action performed (Examples: Abnormal activity, Suspicious file detected, Unusual network activity)
Tactic: Gives more specific context to the actions performed, using an ATT&CK tactic (Examples: Discovery activity detected, Exfiltration detected, Privilege Escalation attempt)
Technique: A detection using ATT&CK techniques and sub-techniques, giving the most precise level of detection and enrichment and explaining what is happening in relation to the activity detected (Examples: Credential Dumping via LSASS memory, Lateral Movement using Service Execution, Process Discovery using PowerShell)

As we can see in the graph below, of the total number of detections generated for the Wizard Spider and Sandworm evaluation, McAfee/Trellix provided precise and specific details on 89% of them. As a result, MITRE rated those detections as Technique (86%) or Tactic (3%).

Figure 6: Our Detection Specificness on Wizard Spider and Sandworm Evaluation
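To make the three categories concrete, the short Python sketch below shows one way such a breakdown could be computed from a list of rated detections. It is illustrative only: the `Rating` enum, the `specificness_breakdown` function, and the sample list are assumptions made for this example, not MITRE's scoring code or the evaluation's data.

```python
from collections import Counter
from enum import Enum


class Rating(Enum):
    GENERAL = "general"
    TACTIC = "tactic"
    TECHNIQUE = "technique"


def specificness_breakdown(ratings):
    """Return each rating's share as a percentage of all detections."""
    if not ratings:
        return {rating: 0.0 for rating in Rating}
    counts = Counter(ratings)
    total = len(ratings)
    return {rating: 100.0 * counts[rating] / total for rating in Rating}


# Toy input: 10 hypothetical detections (not the evaluation's data).
sample = [Rating.TECHNIQUE] * 7 + [Rating.TACTIC] * 2 + [Rating.GENERAL]

breakdown = specificness_breakdown(sample)
# "Specific" detections are those rated Technique or Tactic.
specific_share = breakdown[Rating.TECHNIQUE] + breakdown[Rating.TACTIC]
print(breakdown[Rating.TECHNIQUE], breakdown[Rating.TACTIC], specific_share)
```

The same addition applies to the figures quoted above: the Technique share and the Tactic share together give the share of detections with precise and specific details.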

Detection In-Depth

In 77% of our accurate detections, we leveraged the multiple data sources available across the different sensors, using at least two of them to maximize context and correlation. This approach gives the analyst investigating an alert different perspectives and additional enrichment.

Figure 7: Data sources used to produce detections on Wizard Spider and Sandworm Evaluation
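A metric like this can be sketched in a few lines of Python. The example below assumes each detection is recorded together with the data sources that contributed to it. The detection IDs, source names, and the `multi_source_share` function are hypothetical and serve only to show the counting logic.

```python
def multi_source_share(detections, min_sources=2):
    """Percentage of detections backed by at least `min_sources` distinct data sources."""
    if not detections:
        return 0.0
    corroborated = sum(
        1 for sources in detections.values() if len(set(sources)) >= min_sources
    )
    return 100.0 * corroborated / len(detections)


# Toy mapping of detection ID -> contributing data sources (invented values).
detections = {
    "det-1": ["process", "file"],
    "det-2": ["network"],
    "det-3": ["process", "registry", "script"],
    "det-4": ["process", "process"],  # the same source twice counts once
}

share = multi_source_share(detections)
print(share)
```

Counting distinct sources, rather than raw events, keeps a noisy sensor from inflating the figure.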

How Did We Get Here? Think Red, Act Blue

In preparation for each year's evaluation, the Applied Countermeasures (AC3) team leads a coordinated effort with all product engineering and other research teams to understand which technologies and sensor capabilities are best positioned to detect behaviors associated with the adversary that MITRE will emulate.

The AC3 team guides this combined effort through a detailed purple team research methodology to have an in-depth view of how different MITRE Techniques can be executed. We use this opportunity to review each technique in a holistic way to ensure we have a full 360-degree view of each tactic, technique, and procedure.

With all the information documented and available, researchers work closely with the different product engineering teams to ensure all required sensors and subsequent data sources are available for visibility purposes. If the data is not available, the work is focused on implementing the necessary features and improvements in the product sensors, to provide what is needed in the most effective and efficient way.

At the same time, our internal “Red Team” develops the operational flow and TTPs (Tactics, Techniques, and Procedures) associated with the specific adversary to be emulated, using the documented research but also adding some of our ‘own flavor’ to introduce less predictable variations to the attacks.

Everything is planned and tested in monthly cycles, each with a specific focus on a particular topic. For example, one testing cycle can focus on all the ‘Discovery’ behaviors while the next can focus on ‘Active Directory’ attacks. At the end of the process, the complete operational flow is tested a few times with both internal and external red teams, to ensure the visibility and coverage targets are met. Organization and communication are key to achieving these results.

Figure 8: McAfee MITRE Readiness Schema

As mentioned before, the key to this process is the way we measure and check our progress, following what we call the ‘Think Red, Act Blue’ methodology, through a regular monthly Purple Team Program. This has proven to be a good strategy to show progress through meaningful metrics, keep the team engaged and, of course, have some fun along the way.

In these Purple Team exercises we:

Tested the different TTPs associated with the adversary to be evaluated
Tested new product implementations (features, content, and sensors)
Learned where we can improve (tune detections, fix bugs)

As the famous quote goes: “accomplishments will prove to be a journey, not a destination”. This journey started for us with the first ATT&CK evaluation in 2018, and it has certainly led us to great results that we are proud of. But these results are more than a mere statistic. They have translated into real, actionable defensive capabilities in our products that allow our customers to achieve powerful security outcomes, to protect their assets at risk, and to mitigate the impact of damaging attacks.

With this blog, we want to publicly thank the MITRE ATT&CK evaluations team for their continued support throughout these years, and for continuously pushing us to move forward and to achieve excellence in our pursuit to learn, adapt and improve.
