The Sound of Malware
By Trellix · June 23, 2022
Do, a debugger, you often use
Re, a reverse engineer
Mi, a name, I call myself
By now, you must be very thankful I reminded you of this famous song; I am sure it will be stuck in your head the rest of the day. You’re welcome!
Confused about how this relates to malware analysis?
In the world of malware and reversing, there are tools, scripts, and methods we use to investigate the relationship between malware families, detect new versions, and understand differences across malware samples. A great example of why we do this is to check whether detection logic we have already written still works on a newer variant of the malware. What did the adversary change in the code, and will that impact the protection we provide to our customers? One approach is to compare older and newer samples to figure out which components are always present in both. In some cases, we get ambitious and take a large number of samples that are, for example, classified as ransomware, and search for common denominators in the code to help identify new samples that fit the ransomware classification.
Another classic method of code comparison is binary diffing. Using BinDiff in conjunction with IDA Pro, we create two databases of the malware samples and compare them. An overall comparison score is generated, showing the similarities that exist between the samples, as shown in Figure 1.
In Figure 1, the BinDiff dashboard shows that the two samples have a similarity of 52 percent. Drilling further down into this dashboard, the differences and similarities are split into several categories. It is also possible to visualize the functions, for example to compare them and spot the exact difference. Adversaries tend to reuse code and optimize certain parts in newer releases. The important question is where these similarities exist: in the ‘generic’ code elements, or in the components that were used to create this malware?
There are many other methods to compare malware, such as fuzzy hashing or extracting code blocks and then comparing them. Recently, in our blog on DPRK ransomware families, we used graph technology and Hilbert curve mapping to discover the similarities shown in Figure 2.
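To make the idea of comparing extracted byte sequences concrete, here is a minimal sketch that scores two byte strings by the overlap of their byte n-grams (Jaccard similarity). This is a simplified stand-in and not a fuzzy-hashing algorithm; the function names and the n-gram length of 4 are assumptions chosen for illustration.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set[bytes]:
    """Return the set of all overlapping n-byte sequences in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Share of n-grams that both inputs have in common (0.0 to 1.0)."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two inputs too short to yield n-grams count as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

A score close to 1.0 means the two inputs share most of their byte sequences; it says nothing about where in the file the shared parts sit.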
We have frequently used code comparisons and visualizations, but would it be possible to compare malware samples using a more abstract technique? What about sound?
A novel technique
Music, sound, and especially analog synths have always been a fascinating topic for me. As a veteran of the Dutch Navy, I had the pleasure of spending a week on a submarine listening to sonar sounds, seeing frequency waterfalls, and filtering out ships versus animals; it was a phenomenal experience. Combining all of this with the desire to investigate and invent new comparison methods, I wondered: would it be possible to compare malware samples based on their sound?
To start, we took two samples of the Conti ransomware for Linux, one released in May, the other in June. From a BinDiff and code comparison perspective, there were minimal differences and an overall similarity score of 99.8 percent. Would our experiment with sound show the same?
First, we had to convert the sample into an audio file so that it could be played and used for frequency analysis. We used the ‘cat’ command on the Linux command line and piped its output into an audio player to generate the sound:
cat malwarefile.bin | mplayer -cache 1024 -quiet -rawaudio samplesize=1:channels=1:rate=8000 -demuxer rawaudio -
This produces noisy audio, similar to a dial-up modem trying to connect to the Internet over a telephone line. We rerouted the audio from the headset jack to a mixer, recorded the sound, and exported it as both .WAV (waveform) and .MP3 audio files.
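For readers who want to skip the recording step, the same interpretation of the file (one channel, one byte per sample, 8000 samples per second) can be written straight to a .WAV file. The sketch below is not the procedure used for the recordings in this post; it assumes each byte is treated as one unsigned 8-bit sample, and the function names are hypothetical. It uses only Python's standard `wave` module.

```python
import wave


def bytes_to_wav(data: bytes, wav_path: str, rate: int = 8000) -> None:
    """Write raw bytes as 8-bit mono PCM, one byte per audio sample."""
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(1)     # mirrors channels=1
        wav.setsampwidth(1)     # mirrors samplesize=1 (WAV stores 8-bit PCM as unsigned)
        wav.setframerate(rate)  # mirrors rate=8000
        wav.writeframes(data)


def file_to_wav(src_path: str, wav_path: str, rate: int = 8000) -> None:
    """Read any file as raw bytes and store it as a playable .WAV file."""
    with open(src_path, "rb") as src:
        bytes_to_wav(src.read(), wav_path, rate)
```

For example, `file_to_wav("malwarefile.bin", "malwarefile.wav")` would produce a file that can be opened directly in an audio editor. Because nothing passes through analog hardware, the result will not be identical to a recording made through a mixer.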
Loading the two .WAV samples generated from the Conti ransomware Linux variants into our audio-analysis tools (Audacity and Sonic LineUp), we saw the spectrogram in Figure 3:
Studying the above picture, one can spot that while they are almost identical, there is a minor difference at the end of the sound-profile of the June sample. This matches what was discovered during binary code analysis.
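The visual comparison can be approximated in code. The sketch below is a simplified stand-in for what a spectrogram view does, not how Audacity or Sonic LineUp work internally: it cuts each byte stream into fixed-size frames, computes a magnitude spectrum per frame with a naive DFT, and lists the frames where the two spectra differ. The frame size, tolerance, and function names are assumptions, and it works on the raw bytes rather than on recorded audio.

```python
import cmath
import math

FRAME = 64  # samples per frame; kept small because the naive DFT is O(n^2)


def frame_spectrum(frame: bytes) -> list[float]:
    """Magnitudes of DFT bins 0..n/2 for one frame of unsigned 8-bit samples."""
    n = len(frame)
    centered = [b - 128 for b in frame]  # remove the offset of unsigned PCM
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(centered))
        mags.append(abs(acc))
    return mags


def spectrogram(data: bytes, frame: int = FRAME) -> list[list[float]]:
    """One spectrum per non-overlapping, full-length frame."""
    return [frame_spectrum(data[i:i + frame])
            for i in range(0, len(data) - frame + 1, frame)]


def differing_frames(a: bytes, b: bytes, frame: int = FRAME,
                     tol: float = 1e-6) -> list[int]:
    """Indices of frames whose spectra differ by more than tol.

    Only the length both inputs have in common is compared.
    """
    diffs = []
    for i, (sa, sb) in enumerate(zip(spectrogram(a, frame), spectrogram(b, frame))):
        if max(abs(x - y) for x, y in zip(sa, sb)) > tol:
            diffs.append(i)
    return diffs
```

If two inputs differ only near the end, as the two Conti recordings do in Figure 3, only the last frame indices would be reported. A real analysis would use an FFT library and overlapping windows, which scale far better than this pure-Python loop.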
Now, onto the next experiment. From our investigation on DPRK ransomware families, we discovered that the VHD ransomware sample and the BEAF sample had some code similarities but also a lot of differences. Would that also be possible to prove with sound?
Again, we created the required audio files for both samples. Since the binary malware samples were different in file size, the length of recording would be different for each sample. This was reflected when we loaded our audio files into Audacity.
The first thing we see is the difference in length, but the similarities in the waveforms also stand out. Using Sonic LineUp, we aligned the waveforms and frequency spectra as closely as possible.
The differences are not only visual: when we conduct a frequency plot-spectrum analysis of the two audio samples, they present themselves clearly, as seen in Figure 6.
As one example of an identifiable difference, the VHD plot spectrum displays a lot more activity in the range above 7000 Hz than the BEAF one.
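An observation like "more activity above 7000 Hz" can be turned into a number by measuring what share of the spectral energy lies above a cutoff frequency. The sketch below is one hedged way to do that with a naive DFT averaged over frames; it is not the Audacity plot-spectrum algorithm, and the function names and frame size are assumptions. The sample rate of the recordings is not stated in this post, so it must be supplied by the reader; a 7000 Hz cutoff only makes sense for audio sampled above 14 kHz.

```python
import cmath
import math


def average_power_spectrum(samples: list[float], frame: int = 64) -> list[float]:
    """Mean power per DFT bin (0..frame/2) over non-overlapping frames."""
    bins = frame // 2 + 1
    totals = [0.0] * bins
    count = 0
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        for k in range(bins):
            acc = sum(x * cmath.exp(-2j * math.pi * k * t / frame)
                      for t, x in enumerate(chunk))
            totals[k] += abs(acc) ** 2
        count += 1
    if count == 0:
        raise ValueError("need at least one full frame of samples")
    return [t / count for t in totals]


def high_band_fraction(samples: list[float], rate: float, cutoff_hz: float,
                       frame: int = 64) -> float:
    """Fraction of spectral energy at or above cutoff_hz (frame assumed even)."""
    power = average_power_spectrum(samples, frame)
    bin_hz = rate / frame  # frequency spacing between neighbouring bins
    total = high = 0.0
    for k, p in enumerate(power):
        # Interior bins also stand for their negative-frequency mirror image.
        weight = 1.0 if k in (0, frame // 2) else 2.0
        total += weight * p
        if k * bin_hz >= cutoff_hz:
            high += weight * p
    return high / total if total else 0.0
```

Calling `high_band_fraction(samples, rate, 7000)` on the decoded samples of each recording gives two values that can be compared directly; the sample with more high-frequency activity yields the larger fraction.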
Converting malware samples to sound in order to compare them was an interesting and worthwhile learning experience. It also showed that what we found with traditional code analysis or visual analysis could be seen in audio analysis as well. Honestly, we did not expect this fun experiment to confirm what we observed with traditional code similarity and comparison methods. It was remarkable to discover that in the Conti case, where the traditional approach showed minimal changes, the sound conversion and frequency analysis demonstrated the same findings.
Wouldn’t it be nice if, instead of asking for ransom, the ransomware gangs started making music from their code for us to enjoy on Spotify or Apple Music? We gave it a try and created the first version of Conti ransomware gang Techno, combining the generated audio of the Conti code with some tracks in GarageBand. Check it out and tell us what you think!
I’m curious if there’s more creative talent out there that can convert code to audio and make some nice music with it. Upload your mix and share it in our SoundCloud channel so we can add it to the playlist.