Malware Detection
Files are converted into images and then classified by a CNN to identify malware.
When it comes to malware, there are two ways to play detective: dynamic and static analysis. One involves running suspicious software in a virtual playground to see how it behaves, and the other looks at the malware’s characteristics without letting it run wild. Traditional methods, like checking hash values in a database, have their limits as new malware pops up all the time.
But here’s where it gets interesting: back in high school, I stumbled upon a cool idea from Nataraj et al. Instead of sifting through lines of code, what if we turned it into a picture? We’re talking about representing a file’s binary code as a grayscale image. Interestingly, it transforms malware analysis into an image recognition game.
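To make the idea concrete, here is a minimal sketch of the conversion, assuming one byte per pixel and a fixed row width. The function name `bytes_to_grayscale`, the default width and the zero padding of the last row are illustrative choices, not necessarily what the original paper or my project used.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte (0-255) to one grayscale pixel, filling the image row by row.

    Assumption: the last row is zero-padded so that every row has `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    padded = data.ljust(height * width, b"\x00")
    # Iterating over a bytes slice yields integers in the range 0..255.
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Stand-in for a file's contents; a real file would be read with open(path, "rb").
sample = bytes(range(10))
image = bytes_to_grayscale(sample, width=4)
# 10 bytes at width 4 give 3 rows; the last row is padded with two zeros.
```

Because the height depends on the file length, two programs of different sizes produce images of different shapes, which is exactly the problem the classifier has to deal with later.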

These representations are created by converting every 8 bits (one byte) of a file into the grayscale value of one pixel. That idea led to a hobby project in my high school days. The original paper classified these images with GIST feature extraction and k-nearest-neighbour. I decided to use CNNs instead, aiming to outperform the previously reported accuracies. However, the fully connected (FC) layers of a CNN require inputs of a fixed size. Because programs differ in length, I not only scaled the images to a common size but also compared that approach against spatial pyramid pooling and adaptive max pooling, which turn feature maps of any size into a fixed-length output.
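The pooling idea can be sketched in plain Python on a single 2D feature map. This is a simplified illustration under my own assumptions (pyramid levels of 1, 2 and 4 bins per side, max pooling, one channel), not the exact layer from the project; in a real CNN the same operation runs per channel on the last convolutional feature maps.

```python
import math


def adaptive_max_pool(fmap, bins):
    """Max-pool a 2D feature map (list of rows) into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Window bounds: floor for the start, ceil for the end, so no window is empty.
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        for j in range(bins):
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            pooled.append(
                max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1))
            )
    return pooled


def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Concatenate the pooled grids of every pyramid level into one flat vector."""
    out = []
    for bins in levels:
        out.extend(adaptive_max_pool(fmap, bins))
    return out


# Two feature maps of different sizes yield vectors of the same length.
small = [[r * 5 + c for c in range(5)] for r in range(3)]          # 3 x 5
large = [[(r * 7 + c) % 11 for c in range(9)] for r in range(8)]   # 8 x 9
small_vec = spatial_pyramid_pool(small)
large_vec = spatial_pyramid_pool(large)
# Both have 1*1 + 2*2 + 4*4 = 21 entries, regardless of the input size.
```

With a single level, the pyramid reduces to adaptive max pooling onto one fixed grid, so the FC layers see a constant input size in both cases.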
Dataset | Resizing | Spatial Pyramid Pooling | Adaptive Max Pooling |
---|---|---|---|
Malimg (25 virus families) | 98.07% | 96.79% | 94.97% |
MS Malware (9 virus families) | 97.33% | 90.71% | 88.78% |
Raw PE (binary: malware vs. goodware) | 82.54% | 83.53% | 78.17% |

Thus, resizing all images to the same size (256x256) appears to perform best for classifying virus families, while spatial pyramid pooling led to the best accuracy for distinguishing malware from goodware. Regardless of the domain, exploring different ways to handle classifier inputs of varying size was quite thrilling. Shoutout to my schoolmate Jan P. Große, who worked on this with me.