Skip to content

Commit

Permalink
Merge pull request #100 from S2E/issue/503-cgc
Browse files Browse the repository at this point in the history
Run CGC binaries on vanilla Linux with user mode syscall emulation
  • Loading branch information
vitalych authored Jun 2, 2024
2 parents 3063162 + 88b8446 commit 4caf969
Show file tree
Hide file tree
Showing 49 changed files with 2,787 additions and 933 deletions.
1 change: 0 additions & 1 deletion docs/src/Tutorials/MSOffice/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,6 @@ this case Windows and Microsoft Office. S2E supports various combinations of Win
* all - Build all images
Available images:
* cgc_debian-9.2.1-i386 - Debian i386 image with CGC kernel and user-space packages
* debian-12.5-i386 - Debian i386 image
* debian-12.5-x86_64 - Debian x86_64 image
* windows-10pro1909-x86_64 - Windows 10 Pro 1909 x86_64
Expand Down
72 changes: 43 additions & 29 deletions docs/src/Tutorials/PoV/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -36,12 +36,12 @@ You can find the corresponding source file in
.. code-block:: console
$ cd ~/s2e/env
$ s2e new_project -i debian-11.3-i386 -n vuln-lin32-32 --tools=pov \
$ s2e new_project -i debian-12.5-i386 -n vuln-lin32-32 --tools=pov \
build/s2e/guest-tools32/common/demos/vulnerabilities @@
Here is a breakdown of the ``s2e new_project`` command above:

* It creates a new project called ``vuln-lin32-32`` that will run on the ``debian-11.3-i386`` image.
* It creates a new project called ``vuln-lin32-32`` that will run on the ``debian-12.5-i386`` image.

* The ``--tools=pov`` option is important in order to generate PoVs. If you omit it, you will just get random test
cases, which will at best crash the binary depending on what the constraint solver generates. Vulnerability generation
Expand Down Expand Up @@ -222,51 +222,57 @@ find vulnerabilities and exploit them. This demo walks you through the process o
The CGC Final Event (CFE) ran on the Decree operating system. Decree is a modified Linux OS with a reduced number of
`system calls <https://github.com/CyberGrandChallenge/libcgc>`__. In addition to this, the Decree OS has been modified
to add "hook points" for S2E (e.g. to signal process creation, termination, etc.) and to allow S2E to inject symbolic
values. The source code for the Decree OS is available at https://github.com/S2E/s2e-linux-kernel. A Decree
virtual machine image can be built by running the following command:
values. The source code for the Decree OS is available at https://github.com/S2E/s2e-linux-kernel.

The current version of S2E does not use the old Decree kernel anymore. It was based on Linux 3.13, which does not build
on modern Linux images that currently ship with S2E. Instead, S2E comes with ``cgcload``, a tool that emulates the Decree system calls
in user-mode using the syscall `user dispatch mechanism <https://docs.kernel.org/5.19/admin-guide/syscall-user-dispatch.html>`__
in recent Linux kernels. This tool is a loader that maps the native CGC challenge binary into its address space, sets up syscall emulation
that translates Decree syscalls into native Linux syscalls, then jumps to the binary's entry point.

To run CGC binaries, build the 32-bit Linux image. This is the same image that is used for running normal Linux binaries.

.. code-block:: console
$ cd ~/s2e/env
$ s2e image_build cgc_debian-9.2.1-i386
$ s2e image_build debian-12.5-i386
Next, create an analysis project for a challenge binary. Sample CBs are available `here
<https://github.com/CyberGrandChallenge/samples>`__ and can be built using the instructions `here
<https://github.com/CyberGrandChallenge/cgc-release-documentation/blob/master/walk-throughs/building-a-cb.md>`__ . The
remainder of this tutorial will focus on the CADET_00001 program (a pre-compiled version of which is available `here
<https://github.com/S2E/Decree/blob/master/samples/CADET_00001>`__), but the ideas and techniques should be applicable
to all of the CBs.
to all of the CBs. Note that the S2E tooling does not support multi-binary CBs.

The following command creates a ``projects/CADET_00001`` directory with various scripts and configuration files needed
by S2E, as described `here <../../s2e-env.rst>`__.

.. code-block:: console
$ s2e new_project --image cgc_debian-9.2.1-i386 ./source/s2e/decree/samples/CADET_00001
$ s2e new_project ./source/s2e/decree/samples/CADET_00001
Finally, to start S2E, run the following command:
Then run S2E:

.. code-block:: console
$ s2e run CADET_00001
$ cd projects/CADET_00001
$ ./launch-s2e.sh
This will display a TUI-based dashboard, similar to that used by the American Fuzzy Lop (AFL) fuzzer. As S2E finds
vulnerabilities, it generates PoV files in the ``s2e-last`` directory. These files have either ``.xml`` or ``.c``
file extensions. Once some PoV files have been generated you can press ``q`` to stop S2E.
Wait for PoVs to be generated (``s2e-last/*.c`` files) then interrupt S2E with Ctrl+C.
Finally, verify that the generated PoVs are valid, i.e., they can reproduce the correct crashes when run,
using the following command:

.. image:: cadet_00001_tui.png
.. code-block:: console
Alternatively, you can run S2E without the TUI by using the ``-n`` option in ``s2e run``. Instead of the TUI you will
see the standard S2E output. Once some POVs have been generated you can stop S2E by killing the process with
``Ctrl+C`` or ``killall -9 qemu-system-i386``.
$ ./verify-pov.sh
Understanding CGC-style PoVs
============================

If you followed the tutorial on PoV generation on Linux, you will notice that the PoV format for CGC binaries is
different. Instead of being a concrete input file, CGC binaries produce PoVs in ``.xml`` or ``.c`` format. The reason
different. Instead of being a concrete input file, S2E plugins produce PoVs in ``.c`` format. The reason
for this is that CGC binaries read their input from ``stdin`` and write results to ``stdout``. So in order to exercise
the vulnerability, the PoV must implement a two-way communication with the program, by reading the program's output and
writing an appropriate input. This is different from file-based PoVs, where all the input is sent to the program at
Expand All @@ -280,21 +286,29 @@ once, and the program's output is ignored.


For this reason, replaying a CGC-style PoV is more complex. It requires a special setup so that the PoV can communicate
with the CB. For more details, see `here
<https://github.com/CyberGrandChallenge/cgc-release-documentation/blob/master/walk-throughs/understanding-cfe-povs.md>`__.
The following outlines the steps required to replay a PoV:
with the CB. The ``verify-pov.sh`` script takes care of all the steps. First, the PoV file must be compiled to a binary.
Then the ``povtest`` tool takes this binary and connects its standard input/output to the challenge binary's output/input.
It negotiates the PoV parameters with the PoV binary and monitors the challenge binary for crashes. When the PoV terminates,
``povtest`` verifies that the challenge binary crashed at the correct location (type 1 PoV) or the PoV binary exfiltrated
the correct data from the challenge binary's address space (type 2 PoV).

1. Follow the instructions `here
<https://github.com/CyberGrandChallenge/cgc-release-documentation/blob/master/walk-throughs/running-the-vm.md>`__ to
setup and run the CGC testing VM
.. code-block:: console
# Build the PoV source code. The C file is generated by the DecreePoVGenerator plugin.
$ gcc -O0 -g -m32 -o s2e-last/recipe-type2_i386_decree_shellcode_1.rcp-pov-type2-1 \
-L$S2EDIR/install/bin/guest-tools32/lib -I$S2EDIR/source/s2e/guest/linux/include \
s2e-last/recipe-type2_i386_decree_shellcode_1.rcp-pov-type2-1.c -lpov -lcgc
2. As discussed in the instructions in the previous step, files can be shared between the host and CGC testing VM via
the ``/vagrant`` directory. Copy the CADET_00001 binary, the PoV XML files generated by S2E and `this
<https://github.com/S2E/decree/blob/master/scripts/test_pov.sh>`__ script (located in your S2E environment in
``bin/cgc-tools/test_pov.sh``) to the CGC testing VM.
# Verify the PoV.
$ $S2EDIR/install/bin/guest-tools32/povtest \
--pov=s2e-last/recipe-type2_i386_decree_shellcode_1.rcp-pov-type2-1 \
-- $S2EDIR/bin/guest-tools32/cgcload $S2EDIR/source/s2e/testsuite/pov-cgc-cadet0/CADET_00001/bin/CADET_00001
.. note::

3. Run ``vagrant ssh`` to access the VM and copy the files from ``/vagrant`` into ``/home/vagrant``. Then run the
``test_pov.sh`` script to check the PoV's correctness.
S2E does not use the original DARPA CGC tooling for PoV verification. The tooling is not maintained
and does not run on modern Linux distributions, mostly because it uses an old Python 2 version.
The Decree kernel does not build anymore either so it is not possible to run the CGC binaries natively.


Plugin architecture overview
Expand Down
15 changes: 7 additions & 8 deletions docs/src/Tutorials/PoV/pov.rst
Original file line number Diff line number Diff line change
Expand Up @@ -394,9 +394,8 @@ the binary, S2E disassembles it, extracts all function addresses, then invokes e
the function produces the expected output, identification succeeded.



S2E uses `RevGen <https://github.com/S2E/s2e/tree/master/tools/tools/revgen32>`__, an x86-to-LLVM translator,
in order to extract function types from the binary before analyzing it.
S2E used `RevGen <../Revgen/Revgen.rst>`__, an x86-to-LLVM translator, in order to extract function types from
the binary before analyzing it.


Generating Replayable PoVs
Expand Down Expand Up @@ -536,10 +535,10 @@ A correct PoV would look like this:
To generate a correct PoV, S2E looks at all branch conditions and looks for cases where the content of a receive buffer
is compared with a symbolic value derived from the random number generator. Once it found such a comparison, it can
easily generate the correct PoV code by mapping the symbolic value created in the receive call to the symbolic value
generate the correct PoV code by mapping the symbolic value created in the receive call to the symbolic value
written by the transmit function.

A much harder case happens when the PoV needs to perform computations. Consider the slightly modified above example:
A harder case happens when the PoV needs to perform computations. Consider the slightly modified above example:

.. code-block:: c
:number-lines:
Expand Down Expand Up @@ -589,9 +588,9 @@ solver. The following snippet shows how would an automatically generated PoV wit
transmit(&data1, sizeof(data1));
}
Unfortunately, we ran out of time and didn't have time to implement this solution. The main challenge was to fit an
entire solver within the size and memory limits of a PoV, as well as modifying the solver to accommodate a very
restricted runtime environment, that has primitive memory allocation, no standard library, etc.
Unfortunately, we ran out of time and did not have time to implement this solution. The main challenge was to fit an
entire solver within the size and memory limits of a PoV, as well as modifying the solver to accommodate
a restricted runtime environment, that has primitive memory allocation, no standard library, etc.

Conclusion
==========
Expand Down
17 changes: 0 additions & 17 deletions guest/common/include/s2e/monitors/commands/decree.h
Original file line number Diff line number Diff line change
Expand Up @@ -40,8 +40,6 @@ extern "C" {
#define S2E_DECREEMON_COMMAND_VERSION 0x202301082207ULL // date +%Y%m%d%H%M

enum S2E_DECREEMON_COMMANDS {
DECREE_SEGFAULT,
DECREE_PROCESS_LOAD,
DECREE_READ_DATA,
DECREE_WRITE_DATA,
DECREE_FD_WAIT,
Expand All @@ -54,13 +52,7 @@ enum S2E_DECREEMON_COMMANDS {
DECREE_HANDLE_SYMBOLIC_TRANSMIT_BUFFER,
DECREE_HANDLE_SYMBOLIC_RECEIVE_BUFFER,
DECREE_HANDLE_SYMBOLIC_RANDOM_BUFFER,
DECREE_COPY_TO_USER,
DECREE_UPDATE_MEMORY_MAP,
DECREE_SET_CB_PARAMS,
DECREE_INIT,
DECREE_KERNEL_PANIC,
DECREE_MODULE_LOAD,
DECREE_TASK_SWITCH
};

struct S2E_DECREEMON_COMMAND_READ_DATA {
Expand Down Expand Up @@ -187,27 +179,18 @@ struct S2E_DECREEMON_COMMAND_KERNEL_PANIC {
struct S2E_DECREEMON_COMMAND {
uint64_t version;
enum S2E_DECREEMON_COMMANDS Command;
struct S2E_LINUXMON_TASK CurrentTask;
union {
struct S2E_LINUXMON_COMMAND_PROCESS_LOAD ProcessLoad;
struct S2E_LINUXMON_COMMAND_MODULE_LOAD ModuleLoad;
struct S2E_DECREEMON_COMMAND_READ_DATA Data;
struct S2E_DECREEMON_COMMAND_WRITE_DATA WriteData;
struct S2E_DECREEMON_COMMAND_FD_WAIT FDWait;
struct S2E_DECREEMON_COMMAND_SEG_FAULT SegFault;
struct S2E_DECREEMON_COMMAND_RANDOM Random;
struct S2E_DECREEMON_COMMAND_READ_DATA_POST DataPost;
struct S2E_DECREEMON_COMMAND_GET_CFG_BOOL GetCfgBool;
struct S2E_DECREEMON_COMMAND_HANDLE_SYMBOLIC_SIZE SymbolicSize;
struct S2E_DECREEMON_COMMAND_HANDLE_SYMBOLIC_BUFFER SymbolicBuffer;
struct S2E_DECREEMON_COMMAND_COPY_TO_USER CopyToUser;
struct S2E_DECREEMON_COMMAND_UPDATE_MEMORY_MAP UpdateMemoryMap;
struct S2E_DECREEMON_COMMAND_SET_CB_PARAMS CbParams;
struct S2E_DECREEMON_COMMAND_INIT Init;
struct S2E_DECREEMON_COMMAND_KERNEL_PANIC Panic;
struct S2E_LINUXMON_COMMAND_TASK_SWITCH TaskSwitch;
};
char currentName[32]; // not NULL terminated
} __attribute__((packed));

#ifdef __cplusplus
Expand Down
8 changes: 7 additions & 1 deletion guest/linux/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,13 @@ set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O0")
add_subdirectory(function_models)

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O3")
add_subdirectory(cgccmd)
add_subdirectory(s2e.so)

if(${BITS} EQUAL 32)
add_subdirectory(libpov)
add_subdirectory(libcgc)
add_subdirectory(cgcload)
add_subdirectory(povtest)
endif()

install(DIRECTORY scripts/ DESTINATION .)
139 changes: 0 additions & 139 deletions guest/linux/cgccmd/cgccmd.c

This file was deleted.

Loading

0 comments on commit 4caf969

Please sign in to comment.