Rich Headers: leveraging this mysterious artifact of the PE format for threat hunting

Michal Poslušný, Peter Kálnai / ESET

Since the release of Visual Studio 97 SP3, Microsoft has placed an undocumented chunk of data between the DOS and PE headers of every native Portable Executable (PE) binary produced by its linker, with no way to opt out. The data describes the build environment and the scale of the project, stored in a simple yet effective way as blocks of three values: a product identifier, its build number, and the number of times it was used during the build process. Several research papers on this topic have been released over the years, coining the name "Rich Header" and shedding some light on its purpose and structure, but we feel that the security industry has never used it to its full potential.
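To make the block structure concrete, here is a minimal Python sketch of how such a header can be decoded. It relies on the layout described in publicly available research rather than on any official specification: the data ends with the ASCII marker "Rich" followed by a 32-bit XOR key, and starts with the XOR-masked marker "DanS" followed by three padding values. The names `RichEntry` and `parse_rich_header` are our own illustrative choices, not part of any existing tool.

```python
import struct
from typing import List, NamedTuple, Optional

DANS = 0x536E6144  # the bytes "DanS" read as a little-endian dword


class RichEntry(NamedTuple):
    product_id: int  # high 16 bits of the first dword
    build: int       # low 16 bits of the first dword
    count: int       # number of times the tool was used in the build


def parse_rich_header(data: bytes) -> Optional[List[RichEntry]]:
    """Return the decoded entries, or None if no Rich Header is found."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew (offset 0x3C) points at the PE header; the Rich Header sits before it.
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]
    rich_pos = data.find(b"Rich", 0x40, pe_offset)
    if rich_pos < 0 or rich_pos + 8 > len(data):
        return None
    key = struct.unpack_from("<I", data, rich_pos + 4)[0]

    # Walk backwards, dword by dword, until the masked "DanS" marker appears.
    start = None
    for pos in range(rich_pos - 4, 0x3F, -4):
        if struct.unpack_from("<I", data, pos)[0] ^ key == DANS:
            start = pos
            break
    if start is None:
        return None

    # Skip "DanS" and three padding dwords, then read 8-byte entries.
    entries = []
    for pos in range(start + 16, rich_pos - 7, 8):
        comp_id, count = struct.unpack_from("<II", data, pos)
        comp_id ^= key
        count ^= key
        entries.append(RichEntry(comp_id >> 16, comp_id & 0xFFFF, count))
    return entries
```

Calling `parse_rich_header(open(path, "rb").read())` on a binary yields one `RichEntry` per tool recorded by the linker, or `None` when the structure is absent or malformed.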

When an analyst encounters a rare custom malware sample involved in, say, an APT, and is grasping at straws to draw conclusions about the case, this mysterious structure can provide some helpful clues. It reveals the types of components involved in the project behind the malware and the build tools used; and because it comes in an abundant set of variations, it also helps with locating similar samples. We introduce a hierarchy of similarity levels, together with real-world examples where they have been applied successfully.

For various crimeware kits, which are (re)distributed on a daily basis, the header can suggest whether their source code is widely available or under the control of a single actor. Moreover, the headers from their encapsulating malware packers often manifest their own anomalies, which can be used to cluster a larger set of samples of the same nature. Such inconsistencies include an unusual offset or size of the header, an invalid identifier or an invalid combination of identifier and build version, and an image size that does not correspond to the magnitude of the project. They are easy to identify and can be turned into heuristics suited to the situation.
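As a rough illustration of how the inconsistencies listed above might be turned into heuristics, consider the following Python sketch. It is not the rule set used by the authors: the function name `find_anomalies` and every threshold (`usual_offset`, `max_header_size`, `max_product_id`, `min_bytes_per_use`) are hypothetical placeholders that would need tuning against a real corpus. Entries are passed as plain `(product_id, build, count)` tuples.

```python
from typing import List, Sequence, Tuple

Entry = Tuple[int, int, int]  # (product_id, build, count)


def find_anomalies(
    offset: int,
    size: int,
    entries: Sequence[Entry],
    image_size: int,
    usual_offset: int = 0x80,      # assumption: where the header typically starts
    max_header_size: int = 0x400,  # hypothetical upper bound on header size
    max_product_id: int = 0x150,   # hypothetical highest known product identifier
    min_bytes_per_use: int = 16,   # hypothetical image-size-to-scale ratio
) -> List[str]:
    """Return human-readable findings for one Rich Header (empty if none)."""
    findings = []
    if offset != usual_offset:
        findings.append("unusual offset")
    # The header is built from 8-byte units, so other sizes look suspicious.
    if size > max_header_size or size % 8 != 0:
        findings.append("unusual size")
    if any(product_id > max_product_id for product_id, _, _ in entries):
        findings.append("unknown product id")
    # A large number of recorded tool uses inside a tiny image is implausible.
    total_uses = sum(count for _, _, count in entries)
    if total_uses and image_size / total_uses < min_bytes_per_use:
        findings.append("image too small for project scale")
    return findings
```

Returning a list of findings instead of a single boolean keeps each check independent, so individual heuristics can be enabled, weighted or dropped depending on the situation.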

We will also showcase our in-house-designed database infrastructure and the tooling we've built around it: similarity lookup, a rule-based notification system for malware hunting, and the detection of anomalies. Moreover, we will share our modified version of the YARA scanner with extended Rich Header functionality, which we also hope to contribute to the original project. Finally, we will show several Rich Header-based rules for the detection of high-profile malware families and attribution to well-known malicious actors.

You can find the complete paper here:

November 7 at 15:40 - 16:10, Stage A

Michal Poslušný is a malware researcher working at ESET, where he is mainly responsible for reverse engineering of complex malware threats. He also works on developing various internal projects and tools and has actively participated in research presented at AVAR, CARO, OFFZONE and Virus Bulletin international conferences in the past. In his free time he likes to play online games, develop fun projects and spend time with his family.

Peter Kálnai is a malware researcher at ESET. As a speaker, he has represented ESET at various international conferences including Virus Bulletin, AVAR, OFFZONE, cyberCentral and CARO Workshop. He particularly dislikes malware such as crypto-ransomware, because it displays hardly any inventiveness yet has a very destructive impact on the victim. His golden rule for cyberspace is always to prioritise security measures over user comfort. In his free time he enjoys foosball and travelling.