This review covers No Starch Press’ Ghidra Book, written by Chris Eagle and Kara Nance. The book provides an extensive overview of Ghidra’s capabilities, including screenshots and examples. The review covers the whole book: I summarise each chapter and add my thoughts and experiences on the covered content. The conclusion recaps my opinion on the book as a whole.
Table of contents
- Review structure
- Chapter 1: Introduction to Disassembly
- Chapter 2: Reversing and Disassembly Tools
- Chapter 3: Meet Ghidra
- Chapter 4: Getting Started with Ghidra
- Chapter 5: Ghidra Data Displays
- Chapter 6: Making Sense of a Ghidra Disassembly
- Chapter 7: Disassembly Manipulation
- Chapter 8: Data Types and Data Structures
- Chapter 9: Cross-References
- Chapter 10: Graphs
- Chapter 11: Collaborative SRE
- Chapter 12: Customizing Ghidra
- Chapter 13: Extending Ghidra’s Worldview
- Chapter 14: Basic Ghidra Scripting
- Chapter 15: Eclipse and GhidraDev
- Chapter 16: Ghidra in Headless Mode
- Chapter 17: Ghidra Loaders
- Chapter 18: Ghidra Processors
- Chapter 19: The Ghidra Decompiler
- Chapter 20: Compiler Variations
- Chapter 21: Obfuscated Code Analysis
- Chapter 22: Patching Binaries
- Chapter 23: Binary Differencing and Version Tracking
The table of contents of the book is used as the structure for this review, excluding the appendix. For each chapter, I’ll give a short summary of the content, after which I will share my thoughts and experiences based on the given information. As I try to be impartial when it comes to subjective parts in the book, I’ll clearly state what part of the review is my personal opinion, and what part is based upon a more rational approach.
The book starts with an introduction, stating it’s not meant as a user manual, but rather as a guide on how to use Ghidra when reversing. This chapter covers disassembly theory, where different generations of languages are discussed. It also goes into the what, why, and how of disassembly. Overall, it provides an overview of the struggles that tooling has when disassembling, with a lot of background information as to why. The theory is given to help analysts spot disassembly mistakes, and to rectify them.
Unless you’re really new to reverse engineering, most of the information in this chapter isn’t too relevant. The different methods of disassembling were new to me, as I had never looked into them. As I’m generally rather fond of knowing all the ins and outs of a subject, I enjoyed this chapter thoroughly.
The goal of this chapter is to show tools that are commonly used when reverse engineering. Aside from giving the reader background on these tools, the chapter provides insight into Ghidra’s capabilities, which match those of quite a few of the listed tools.
Malware analysis is a knowledge-based profession, not a tool-based one. This chapter clearly explains the concepts behind the tools, and how these concepts are used in multiple tools, including Ghidra. This makes the chapter useful for any reverse engineer that reads it, regardless of whether Ghidra is part of their (commonly) used toolset.
In this chapter, the installation of Ghidra is documented, excluding the build process. Additionally, information about the folder structure within the installation folder is given, providing a clear overview of the different components that are present within Ghidra.
Even though the installation chapter might not be interesting to all, as the FlareVM has a pre-configured version of Ghidra installed, this part of the book cannot be left out due to its specific focus on Ghidra itself.
The process flow of creating a project, loading a binary, selecting and running the analysers, and viewing the analysis benchmarks, is covered within this chapter. Whereas some analysts might find Ghidra’s user interface intuitive, others might feel the opposite. A simple guide through the basics is therefore a help for anybody who is new to the tool, or to those who do not want to explore the basics on their own.
The default interface of Ghidra provides the user with an array of options. To get the most out of the offered features, one has to know what is configurable. Some settings depend on specific use cases, whereas others are more related to the personal preference of the analyst. All main components of Ghidra are discussed within this chapter.
Some analysts prefer to work with a more standard edition of the tool they’re using, whereas others like to fully customise it to fit their way of working. Knowing what is possible within Ghidra is the first step to making an informed decision in the customisation process.
Navigating through the disassembly can be a tedious task. Knowing how to jump to an address of choice, understanding the names and labels, as well as understanding several known calling conventions, are useful skills to quickly move through the disassembly.
To clarify the disassembly, one has to modify Ghidra’s output. This can be done by renaming, editing, creating or deleting functions and variables, by adding comments, or by changing types and values. Changing values can be done by converting them, or by using the equate functionality. The conversion simply displays a value differently, meaning a signed integer can be displayed as an unsigned integer. The equate functionality replaces a constant with a known or custom name, which is useful to recover enumerations. As such, the output becomes much more readable.
Effectively manipulating the disassembly paves the road to a swift analysis, and reduces the chance of making a mistake. Doing this correctly takes quite some time, but will greatly increase the effectiveness of the analysis of the rest of the binary.
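To make the difference between converting and equating a value concrete, here is a minimal sketch in plain Python. It is my own illustration and does not use Ghidra’s API: the function names and the `EQUATES` table are hypothetical, and the two constants are the Windows memory protection values.

```python
import struct


def as_signed32(value):
    """Show a 32-bit pattern as a signed integer."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


def as_unsigned32(value):
    """Show the same 32-bit pattern as an unsigned integer."""
    return value & 0xFFFFFFFF


# Hypothetical equate table: constant -> symbolic name.
EQUATES = {
    0x04: "PAGE_READWRITE",
    0x40: "PAGE_EXECUTE_READWRITE",
}


def apply_equate(value):
    """Return the symbolic name for a constant, or its hex form if unknown."""
    return EQUATES.get(value, hex(value))


raw = 0xFFFFFFFF
# Conversion: the bits stay the same, only the display changes.
print(as_signed32(raw), as_unsigned32(raw))
# Equate: the constant is replaced by a meaningful name.
print(apply_equate(0x40))
```

A conversion never adds information, whereas an equate records what the analyst learned about the meaning of a constant.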
This chapter provides information on data types and their concepts. Structs, global and stack based, are addressed, as well as the custom struct creation and generation options in Ghidra. Lastly, information about C++ reversing is given. The C++ related information relates to virtual functions, virtual function tables, and name mangling.
Using and recovering the correct data types is essential during the analysis. Recreating a missing struct can instantly clarify a whole function, thus reducing the time it takes to reach the goal the analyst has set.
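As a small illustration of why a recovered struct helps, the sketch below overlays a structure definition on raw bytes using Python’s `ctypes`. The `Header` layout and its field names are invented for this example and do not come from the book; the point is only that named fields replace bare offsets.

```python
import ctypes


class Header(ctypes.LittleEndianStructure):
    """Hypothetical 12-byte header, invented for this illustration."""
    _fields_ = [
        ("magic", ctypes.c_uint32),    # offset 0x0
        ("version", ctypes.c_uint16),  # offset 0x4
        ("flags", ctypes.c_uint16),    # offset 0x6
        ("length", ctypes.c_uint32),   # offset 0x8
    ]


raw = bytes.fromhex("474844520100030010000000")

# Without a struct, the analyst reads "4 bytes at offset 8".
untyped_length = int.from_bytes(raw[8:12], "little")

# With the struct applied, the same access has a name.
header = Header.from_buffer_copy(raw)
print(untyped_length, header.length)
```

Both reads return the same value, but only the second one documents itself, which is the effect that applying a struct has on a decompiled function.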
Cross references are, regardless of the analysis type, extremely helpful to find clues and navigate to crucial parts of the program. Understanding how they are listed and used within Ghidra is therefore essential. The usage is straightforward, but some more obscure cases are also explained in this chapter.
A picture is worth a thousand words in the same way that a graph is worth a thousand instructions. As such, using graphs to get an overview of the layout of a function, or the relationship between functions, allows the analyst to quickly understand what is going on and what part of the code should be looked at in detail.
Ghidra allows multiple analysts to work on the same binary in a shared project. This chapter covers the set-up of the Ghidra server, the creation of a shared project, and the version control commands. Noteworthy is the comparison of Ghidra’s version control commands to the commonly known Git terminology, which helps those who have experience with Git to quickly adopt Ghidra’s way of working.
Ghidra’s modules work together in a coordinated fashion. Upon selecting a function in the disassembly view, the decompiler shows the same function. It is possible to add more windows that are not connected in such a fashion, which can later be reconnected. Additionally, further customisation regarding the look and feel of Ghidra is discussed, including the (in)famous dark theme.
Recognising known functions is done based upon the function signature database. In Ghidra, function signatures are called FunctionIDs. Adding new function signatures can be done based upon the data within a project, by parsing C headers, or by manually adding them. Several methods on how to do this are given in this chapter.
When looking at more complex samples, or performing repetitive tasks, one can use Ghidra’s API to speed up the analysis process by automating such tasks. These scripts can be written in Java and Python 2, the latter of which is interpreted via Jython. Since Ghidra is open-source, documentation of the API is available in the form of code and JavaDoc, and the scripts that are embedded within Ghidra by default can be used as guidance. This chapter provides a clearer overview of the API that Ghidra has to offer, and guides the user through several detailed examples.
The examples in the book are all written in Java, as opposed to the more popular and commonly used Python. My personal preference goes out to Java, which matches very well with Ghidra’s API. I do realise that I’m the odd one out based on this preference, so not every reader might share my excitement and sentiment related to this. Regardless of your own preference, the outline of this chapter provides a very well documented introduction to Ghidra scripts.
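To give an impression of what a Ghidra script looks like, here is a minimal sketch in Python, whereas the book’s own examples are in Java. It is my own illustration and not taken from the book. `describe_functions` is a hypothetical helper name; `currentProgram`, `getFunctionManager()`, `getFunctions(True)`, `getName()`, and `getEntryPoint()` are part of Ghidra’s scripting API. The script is written so that it also loads outside Ghidra, where it simply does nothing.

```python
# Lists every function in the current program together with its entry point.
# Assumption: when run from Ghidra's Script Manager, Ghidra injects the
# variable `currentProgram` into the script.


def describe_functions(functions):
    """Return one 'name @ address' line per function object."""
    lines = []
    for function in functions:
        lines.append("%s @ %s" % (function.getName(), function.getEntryPoint()))
    return lines


try:
    program = currentProgram  # provided by Ghidra when run as a script
except NameError:
    program = None  # running outside Ghidra, nothing to analyse

if program is not None:
    # getFunctions(True) iterates over the functions in ascending address order.
    manager = program.getFunctionManager()
    for line in describe_functions(manager.getFunctions(True)):
        print(line)
```

Keeping the formatting logic in a separate function means it can be tested without Ghidra, by passing in any objects that offer the same two methods.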
In the previous chapter, an introduction into Ghidra’s API functions is given. This chapter shows the reader how to create and edit scripts and modules in Eclipse, making full use of the IDE’s capabilities. This is, similar to the previous chapter, done with the help of example scripts that are created step-by-step.
When making a simple script, one does not need the more extensive editor that Eclipse offers. When creating a complete module, or a more complex script, it is strongly advised to use the editor, even if one’s preference is not Eclipse.
In some cases, it’s preferable to run Ghidra without the graphical interface. This way, one can harness the power of Ghidra from the command line interface, allowing analysts to automate tasks. This chapter details how to use Ghidra’s common command line interface parameters, and provides information on writing Ghidra scripts that can be used when running headless.
Using Eclipse, three different shellcode loaders are created in a detailed and step-by-step manner. This topic might be out of scope for many analysts, which also means that fewer resources are available that document the process. As such, this chapter will help out those who are interested in making their own loader.
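As a rough sketch of what automating a headless run could look like, the helper below builds the argument list for Ghidra’s `analyzeHeadless` launcher. The launcher location (`support/analyzeHeadless`) and the `-import` and `-postScript` options come from Ghidra itself; the function name and all paths and file names in the usage are hypothetical. The command is only built, not executed.

```python
import os


def build_headless_command(ghidra_home, project_dir, project_name, binary,
                           post_script=None):
    """Build the argument list for a headless import-and-analyse run."""
    command = [
        # On Windows the launcher is analyzeHeadless.bat instead.
        os.path.join(ghidra_home, "support", "analyzeHeadless"),
        project_dir,    # folder in which the project is stored
        project_name,   # name of the project
        "-import", binary,
    ]
    if post_script is not None:
        # The post script runs after the analysis has finished.
        command += ["-postScript", post_script]
    return command


# Hypothetical paths and names, purely for illustration.
print(build_headless_command("/opt/ghidra", "/tmp/projects", "demo",
                             "sample.bin", "ListFunctions.py"))
```

The resulting list can be handed to `subprocess.run` to start the analysis, which makes it straightforward to loop over a folder of samples.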
When disassembling a binary, some instructions might be disassembled incorrectly as they are interpreted wrongly, or if they are missing from the instruction set within Ghidra. This chapter provides detailed guidance on how to edit a processor module to improve Ghidra’s disassembly capability.
The decompiler is one of Ghidra’s most used components. The default variable names are derived from the types of the variables, as well as the types that are present in the used function signatures. This chapter covers the renaming and retyping of variables, creating and editing structs, handling non-returning functions, overriding function signatures, and highlighting slices. Slices are used to quickly view the usage of a variable or its value in a function.
The decompiler provides a quick overview of the code, making the analysis faster and easier. Tweaking the decompiler to better handle the code will result in fewer mistakes and a faster analysis.
This chapter deals with the different outputs that compilers produce, and how to recognise several of those differences. A switch, compiled with different compilers on different platforms, is used in these examples. Additionally, there is a focus on finding the main function, as well as on C++ related code such as function overloading and Runtime Type Identification (RTTI).
Obfuscated code is commonly seen in malware, and is an important topic for any analyst to be comfortable with. For those who are not used to handling obfuscated code, the chapter provides an overview of the definition, as well as multiple known techniques. One will learn (with examples in code) about desynchronised disassembly, dynamically computed target addresses, control flow obfuscation, opcode obfuscation, imported function obfuscation, and more.
Additionally, one will learn about anti-virtualisation techniques, which can detect specific software, hardware, or CPU features, all of which can be used to discover more about the environment that the sample is executed in.
Lastly, static deobfuscation using Ghidra is discussed. Static deobfuscation can be done using scripts, debuggers, or emulation, the latter being the most flexible. Ghidra provides an emulator class to use within scripts, further helping the analyst. A detailed example on how to write and use the emulator is discussed within this chapter.
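To illustrate the script-based route, which is the simplest of the three, here is a minimal sketch of a decoder for data hidden behind a repeating XOR key. This is my own example of a common, simple scheme and not the emulator-based example from the book; the function name, key, and sample string are hypothetical.

```python
from itertools import cycle


def xor_decode(blob, key):
    """Decode a blob that was obfuscated with a repeating XOR key."""
    if not key:
        raise ValueError("the key must contain at least one byte")
    return bytes(b ^ k for b, k in zip(blob, cycle(key)))


# XOR is its own inverse, so encoding and decoding use the same function.
key = b"\x5a\x13"
encoded = xor_decode(b"cmd.exe /c whoami", key)
print(xor_decode(encoded, key))
```

A script like this only works once the analyst has fully understood the decoding routine. Emulation is more flexible precisely because it runs the sample’s own routine instead of a reimplementation.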
Some malware might contain anti-debugging or anti-virtualisation techniques that one would rather remove. This can help to more easily debug the sample using a debugger of choice. This chapter provides step-by-step guides on the patching process. It teaches what to take into account when patching an under or oversized chunk of data, and teaches how to avoid creating errors in the binary. It also shows how to export the binary from Ghidra once the patches have been applied.
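The size concerns can be sketched in a few lines of plain Python. The example below is my own and not from the book: it assumes x86 code, where a patch that is shorter than the bytes it replaces can be padded with single-byte NOP instructions (`0x90`), and it refuses a patch that is too large rather than overwriting the bytes that follow. The function name and the sample bytes are hypothetical.

```python
NOP = 0x90  # single-byte x86 NOP instruction


def patch(image, offset, old_length, new_bytes):
    """Overwrite old_length bytes at offset, padding a shorter patch with NOPs."""
    if offset < 0 or offset + old_length > len(image):
        raise ValueError("patch range lies outside the image")
    if len(new_bytes) > old_length:
        raise ValueError("patch is larger than the bytes it replaces")
    padding = bytes([NOP]) * (old_length - len(new_bytes))
    patched = bytearray(image)
    patched[offset:offset + old_length] = bytes(new_bytes) + padding
    return bytes(patched)


# push ebp; call <rel32>; pop ebp
image = bytes.fromhex("55e8112233445d")
# Replace the 5-byte call with the 2-byte `xor eax, eax`, padded with 3 NOPs.
print(patch(image, 1, 5, bytes.fromhex("31c0")).hex())
```

Because the total size stays the same, every offset after the patch remains valid. An oversized patch needs a different approach, such as redirecting execution to unused space, which is why the sketch rejects it.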
Reusing (a part of) an analysis is helpful to the analyst, as it saves time and avoids errors during the new investigation. Malware often receives additional features when updated, whilst the rest remains the same. Using the analysis database of the prior version can help to significantly reduce the amount of time spent to discover a new feature.
The diff function of Ghidra is also extensively discussed, where one can merge two binaries (or functions, depending on the analyst’s choice) from two different Ghidra projects. This way, an analyst does not have to open two instances of Ghidra to continuously compare the output of the two binaries.
The Ghidra Book provides a thorough introduction for new users, using clear examples with plenty of background information. For more experienced users, the second half of the book is the most relevant, as it provides numerous in-depth examples of more complex topics, through which the reader is guided step by step.
All in all, I thoroughly enjoyed reading the book. The second part, especially related to writing and editing scripts, contains information that is harder to come across in such a clearly explained manner. The added background information helps to understand why Ghidra works the way it does, and also helps the analyst to use this knowledge when using another tool. Overall, I would advise reading this book, as it is a valuable addition to the skill set of a malware analyst.