How to Use Fuzzing in Security Research

Introduction

Fuzzing is one of the most widely used methods for automated software testing. Through fuzzing, one can generate a large number of possible inputs for an application according to a set of rules, then inject them into the program to observe how it behaves. In the security realm, fuzzing is regarded as an effective way to identify corner-case bugs and vulnerabilities.

There is a plethora of fuzzing frameworks, both open-source and commercial. They fall into two major classes of fuzzing techniques:

  - Mutation-based fuzzing, which derives new inputs by altering existing sample inputs
  - Generation-based fuzzing, which builds inputs from scratch, based on a model or grammar of the expected input format

This classification is not mutually exclusive, but more of a general design distinction. There are tools that include both techniques, such as PeachFuzzer.
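To make the mutation side concrete, here is a minimal sketch of the kind of operator a mutation engine applies to an existing sample; flip_bit is our own illustrative helper, not part of any particular framework:

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal mutation step: flip one bit of an existing sample.
   Real engines chain many such operators (bit flips, byte swaps,
   arithmetic increments, block duplication, ...). */
void flip_bit(uint8_t *buf, size_t len, size_t bit_index) {
  if (bit_index / 8 < len)
    buf[bit_index / 8] ^= (uint8_t)(1u << (bit_index % 8));
}
```

A generation-based engine would instead emit inputs directly from a grammar of the target format.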

Here at the Application and Threat Intelligence (ATI) Research Center, one of our objectives is to identify vulnerabilities in applications and help developers fix them before they are exploited. This is done by connecting different applications and libraries to our fuzzing framework. This article will show how we use fuzzing in our security research by highlighting some of our findings while investigating an open-source library.

Fuzzing the SDL Library

The Simple DirectMedia Layer (SDL) is a cross-platform library that provides an API for implementing multimedia software, such as games and emulators. Written in C, it is actively maintained and widely used by the community.

Choosing a Fuzzing Framework

We are going to fuzz SDL using the well-known AFL. Written by lcamtuf, AFL uses runtime-guided techniques, compile-time instrumentation, and genetic algorithms to create mutated inputs for the tested application. It has an impressive trophy case of identified vulnerabilities, which is why it is considered one of the best fuzzing frameworks out there. Some researchers have studied AFL in detail and come up with extensions that modify the behavior of certain components, for example the mutation strategy or the importance attributed to different code branches. Such projects gave rise to FairFuzz, AFLGo, afl-unicorn, AFLSmart, and python-afl.

We are going to use AFLFast, a project that implemented some fuzzing strategies to target not only high-frequency code paths, but also low-frequency paths, “to stress significantly more program behavior in the same amount of time.” In short, during our research, we observed that for certain fuzzing campaigns, this optimization produces an approximate 2x speedup improvement and a better overall code coverage compared to vanilla AFL.

Fuzzing Preparation

To use AFL, you must compile the library’s sources with AFL’s compiler wrappers.

$ ./configure CC=afl-clang-fast \
    CFLAGS="-O2 -D_FORTIFY_SOURCE=0 -fsanitize=address" \
    LDFLAGS="-O2 -D_FORTIFY_SOURCE=0 -fsanitize=address"
$ make; sudo make install

As observed, we will use both the AFL instrumentation and ASAN (AddressSanitizer), a compiler tool used to identify memory-related errors. According to its documentation, ASAN adds about a 2x slowdown to the instrumented program, but the gain is much higher, allowing us to possibly detect memory-related issues such as:

  - Heap, stack, and global buffer overflows
  - Use-after-free and use-after-return bugs
  - Double-free and invalid-free errors
  - Memory leaks
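As a hypothetical illustration (not SDL code), the following pattern shows the kind of bug ASAN catches: the function below is memory-safe for short inputs, but if len ever exceeds the 16-byte allocation, an ASAN-instrumented build aborts with a detailed heap-buffer-overflow report instead of silently corrupting the heap.

```c
#include <stdlib.h>
#include <string.h>

/* Intentionally fragile copy: writes len bytes into a 16-byte heap
   buffer. Compiled with -fsanitize=address, any call with len > 16
   triggers an ASAN heap-buffer-overflow report at the memcpy. */
int copy_into_small_buffer(const char *src, size_t len) {
  char *dst = malloc(16);
  if (!dst) return -1;
  memcpy(dst, src, len);   /* overflow when len > 16 */
  int first = dst[0];
  free(dst);
  return first;
}
```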

Furthermore, to optimize the fuzzing process, we compile the sources with:

  - -O2, so the code under test is optimized the same way a release build would be
  - -D_FORTIFY_SOURCE=0, to disable glibc's fortified function checks, which would otherwise abort the program before ASAN can produce its detailed report

Let’s check if the settings were applied successfully:

$ checksec /usr/local/lib/libSDL-1.2.so.0
[*] '/usr/local/lib/libSDL-1.2.so.0'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
    ASAN:     Enabled

Checksec is a handy tool that lets users inspect binaries for security options, such as whether the binary is built with a non-executable stack (NX) or with a read-only relocation table (RELRO). It also checks whether the binary is built with ASAN instrumentation, which is what we need; it is part of the pwntools Python package. As observed, the binaries were compiled with ASAN instrumentation enabled, as we wanted. Now let's proceed to fuzzing!

Writing a Test Harness

An AFL fuzzing operation consists of three primary steps:

  1. Fork a new process
  2. Feed it an input modified by the mutation engine
  3. Monitor the code coverage by keeping track of which paths are reached with this input, and report any crashes or hangs that occur

This is done automatically by AFL, which makes it ideal for fuzzing binaries that accept their input as an argument and then parse it. But to fuzz a library, we must first write a test harness and compile it. In our case, a harness is simply a C program that exercises certain methods of the library, allowing us to fuzz it indirectly.

#include <stdio.h>
#include <stdlib.h>
#include "SDL_config.h"
#include "SDL.h"

struct {
  SDL_AudioSpec spec;
  Uint8   *sound;       /* Pointer to wave data */
  Uint32   soundlen;    /* Length of wave data */
  int      soundpos;    /* Current play position */
} wave;

/* Call this instead of exit(), to clean up SDL. */
static void quit(int rc){
  SDL_Quit();
  exit(rc);
}

int main(int argc, char *argv[]){
  /* Load the SDL library */
  if ( SDL_Init(SDL_INIT_AUDIO) < 0 ) {
    fprintf(stderr, "[-] Couldn't initialize SDL: %s\n", SDL_GetError());
    return(1);
  }
  if ( argc < 2 ) {
    fprintf(stderr, "[-] No input supplied.\n");
    quit(1);
  }
  /* Load the wave file */
  if ( SDL_LoadWAV(argv[1], &wave.spec, &wave.sound, &wave.soundlen) == NULL ) {
    fprintf(stderr, "[-] Couldn't load %s: %s\n", argv[1], SDL_GetError());
    quit(1);
  }
  /* Free up the memory */
  SDL_FreeWAV(wave.sound);
  SDL_Quit();
  return(0);
}

Our intention here is to initialize the SDL environment, then fuzz the SDL_LoadWAV method from the SDL audio module. To do that, we supply a sample WAV file, which AFL will tamper with using its mutation engine in order to reach as deep into the library code as possible. Introducing some new fuzzing terminology: this file represents our initial seed, and it is placed in the corpus_wave folder.
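A well-formed seed matters: AFL mutates what we give it, so starting from a valid WAV lets the fuzzer reach the parsing logic immediately instead of dying at the header check. A minimal seed can even be generated programmatically; this sketch (the file name and sample bytes are arbitrary choices of ours) writes a tiny 8-bit mono PCM WAV:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static void put_u32(uint8_t *p, uint32_t v) {  /* little-endian */
  p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
  p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}
static void put_u16(uint8_t *p, uint16_t v) {
  p[0] = (uint8_t)v; p[1] = (uint8_t)(v >> 8);
}

/* Write a minimal 8-bit mono PCM WAV; returns total bytes written. */
long write_wav_seed(const char *path) {
  const uint8_t samples[8] = {128, 200, 128, 56, 128, 200, 128, 56};
  uint8_t h[44];
  memcpy(h, "RIFF", 4);      put_u32(h + 4, (uint32_t)(36 + sizeof(samples)));
  memcpy(h + 8, "WAVE", 4);
  memcpy(h + 12, "fmt ", 4); put_u32(h + 16, 16);    /* PCM fmt chunk size */
  put_u16(h + 20, 1);        put_u16(h + 22, 1);     /* PCM, 1 channel */
  put_u32(h + 24, 8000);     put_u32(h + 28, 8000);  /* sample/byte rate */
  put_u16(h + 32, 1);        put_u16(h + 34, 8);     /* align, bits/sample */
  memcpy(h + 36, "data", 4); put_u32(h + 40, (uint32_t)sizeof(samples));

  FILE *f = fopen(path, "wb");
  if (!f) return -1;
  long n = (long)(fwrite(h, 1, sizeof(h), f) +
                  fwrite(samples, 1, sizeof(samples), f));
  fclose(f);
  return n;
}
```

Dropping the resulting file into corpus_wave/ gives AFL a valid starting point; any small, well-formed WAV works just as well.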

Let’s compile the harness:

$ afl-clang-fast -o harness_sdl harness_sdl.c -g -O2 \
  -D_FORTIFY_SOURCE=0 -fsanitize=address \
  -I/usr/local/include/SDL -D_GNU_SOURCE=1 -D_REENTRANT \
  -L/usr/local/lib -Wl,-rpath,/usr/local/lib -lSDL -lX11  -lpthread 

And start the fuzzing process:

$ afl-fuzz -i corpus_wave/ -o output_wave -m none -M fuzzer_1_SDL_sound \
    -- /home/radu/apps/sdl_player_lib/harness_sdl @@

As you can see, starting a fuzzing job is easy: we just execute afl-fuzz with the following parameters:

  - -i corpus_wave/: the directory holding our initial seed files
  - -o output_wave: the directory where AFL stores its queue, crashes, and hangs
  - -m none: disables the child memory limit, which is required for ASAN-instrumented binaries
  - -M fuzzer_1_SDL_sound: names this instance as a master for parallel fuzzing
  - @@: a placeholder that AFL replaces with the path to the mutated input file

There are other useful parameters that you can use, such as specifying a dictionary containing strings related to a certain file format, which would theoretically help the mutation engine reach certain paths quicker. But for now, let’s see how this goes.
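As a side note, such a dictionary is passed to afl-fuzz with the -x option. A small WAV-format dictionary might look like this (the entry names are our own; the quoted tokens are standard RIFF/WAVE chunk identifiers):

```
chunk_riff="RIFF"
chunk_wave="WAVE"
chunk_fmt="fmt "
chunk_data="data"
```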

[Figure 1: AFL status screen. "My display is ok, that is just a mountain in the back."]

We are conducting this investigation on a machine with 32 GB of RAM and two AMD Opteron 6328 CPUs, each with 4 cores per socket and 2 threads per core, giving us a total of 16 hardware threads. As we can observe, the fuzzer evaluates about 170 samples per second. Can we do better than that?

Optimizing for Better Fuzzing Speed

Some of the things we can tweak are:

  - Running multiple AFL instances in parallel, one master (-M) and several slaves (-S), ideally one per available core
  - Keeping the corpus and output directories on a ramdisk (tmpfs) to cut disk I/O overhead
  - Trimming the corpus with afl-cmin and the individual seeds with afl-tmin, so inputs stay as small as possible
  - Setting the CPU frequency governor to performance, as afl-fuzz itself recommends
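Another common speed tweak is AFL's persistent mode: when the harness is built with afl-clang-fast, the __AFL_LOOP macro lets one forked process handle many inputs, avoiding a fork per test case. A minimal sketch follows; parse_input is a hypothetical stand-in for the library call under test, and the fallback macro only exists so the sketch also compiles outside AFL:

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Fallback so this sketch compiles without afl-clang-fast; under AFL
   the real __AFL_LOOP is provided by the compiler runtime. */
#ifndef __AFL_LOOP
static int afl_loop_fallback = 1;
#define __AFL_LOOP(x) (afl_loop_fallback-- > 0)
#endif

/* Hypothetical stand-in for the library function being fuzzed. */
static int parse_input(const uint8_t *buf, size_t len) {
  return len >= 4 && memcmp(buf, "RIFF", 4) == 0;
}

/* In a real harness this loop lives in main(). */
int run_persistent(void) {
  static uint8_t buf[4096];
  int processed = 0;
  while (__AFL_LOOP(1000)) {  /* up to 1000 inputs per forked process */
    ssize_t n = read(0, buf, sizeof(buf));
    if (n > 0)
      parse_input(buf, (size_t)n);
    processed++;
  }
  return processed;
}
```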

Let’s see how our fuzzer performs with all these changes.

[Figure 2: AFL status screen after the tweaks. "Run, Fuzzer, run!"]

For one instance, we get a 2.4x speed improvement and already a crash! Running one master instance and four slave instances, we get the following stats:

$ afl-whatsup -s output_wave/
status check tool for afl-fuzz by <[email protected]>

Summary stats
=============

       Fuzzers alive : 5
      Total run time : 0 days, 0 hours
         Total execs : 0 million
    Cumulative speed : 1587 execs/sec
       Pending paths : 6 faves, 35 total
  Pending per fuzzer : 1 faves, 7 total (on average)
       Crashes found : 22 locally unique

With 5 parallel fuzzers, we get more than 1500 executions per second, which is a decent speed. Let’s see them working!

[Figure 3: the parallel fuzzer instances at work]

Results

After one day of fuzzing, we got a total of 60 unique crashes. After triaging them, we obtained 12 notable ones, which were reported to the SDL community and MITRE. As a result, CVE-2019-7572, CVE-2019-7573, CVE-2019-7574, CVE-2019-7575, CVE-2019-7576, CVE-2019-7577, CVE-2019-7578, CVE-2019-7635, CVE-2019-7636, CVE-2019-7637, and CVE-2019-7638 were assigned. The maintainers of the library acknowledged that the vulnerabilities are present in the latest version (2.0.9) of the library as well. To emphasize how long bugs can stay well-hidden: some of the vulnerabilities were introduced by a commit dating from 2006 and had never been discovered until now.

Leverage Subscription Service to Stay Ahead of Attacks

Keysight's Application and Threat Intelligence (ATI) Subscription provides bi-weekly updates of the latest application protocols and attacks for use with Keysight test platforms. The ATI Research Center continuously monitors threats as they appear in the wild. Customers of our BreakingPoint product have access to strikes for different attacks, allowing them to test their currently deployed security controls’ ability to detect or block such attacks; this capability can afford you time to patch your deployed web applications. Our monitoring of in-the-wild attackers ensures that such attacks are also covered for customers of Keysight Threat Simulator.

Originally written by Radu-Emanuel Chiscariu
