Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run CGC binaries on vanilla Linux with user mode syscall emulation #100

Merged
merged 20 commits into from
Jun 2, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
5c7a241
guest/linux/libcgc: decree syscall emulation for linux
vitalych May 25, 2024
5d05ce7
guest/linux/libpov: added pov negotiation library
vitalych May 25, 2024
241326b
guest/linux/cgcload: added tool to run cgc binaries on linux
vitalych Apr 27, 2024
2952122
guest/linux/povtest: added tool to test generated POVs
vitalych May 25, 2024
b1130f4
guest/linux: removed cgccmd
vitalych May 5, 2024
0f8acf3
plugins: pass full guest data to onSegFault signal
vitalych May 18, 2024
0449e20
plugins/memutils: fixed build error
vitalych May 18, 2024
327d957
MemoryMap: removed decree support
vitalych May 18, 2024
c5f61e5
DecreeMonitor: removed generic OS monitoring support
vitalych May 18, 2024
9ed30d6
LinuxMonitor: moved kill process group option from DecreeMonitor
vitalych May 18, 2024
535733a
DecreePovGenerator: removed xml format from decree pov generator
vitalych May 18, 2024
f9ce4ce
DecreePovGenerator: fixed C template for new libpov
vitalych May 25, 2024
1a41f26
DecreePovGenerator: treat crashes without POV as Type1 POV with null …
vitalych May 26, 2024
400dbe2
RegionMap: added getter for interval map
vitalych May 26, 2024
01c0d9e
MemUtils: optimized page finder
vitalych May 26, 2024
42e614e
MemUtils: support read-only pages in page finder
vitalych May 26, 2024
eac745c
Recipe: optimized page lookup
vitalych May 26, 2024
7e978ad
RecipeDescriptor: fixed plugin lookup
vitalych May 26, 2024
ce29b6c
testsuite/pov-cgc-cadet0: fixed build and validate pov
vitalych May 26, 2024
88b8446
docs: update pov tutorial
vitalych Jun 1, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion docs/src/Tutorials/MSOffice/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,6 @@ this case Windows and Microsoft Office. S2E supports various combinations of Win
* all - Build all images

Available images:
* cgc_debian-9.2.1-i386 - Debian i386 image with CGC kernel and user-space packages
* debian-12.5-i386 - Debian i386 image
* debian-12.5-x86_64 - Debian x86_64 image
* windows-10pro1909-x86_64 - Windows 10 Pro 1909 x86_64
Expand Down
72 changes: 43 additions & 29 deletions docs/src/Tutorials/PoV/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -36,12 +36,12 @@ You can find the corresponding source file in
.. code-block:: console

$ cd ~/s2e/env
$ s2e new_project -i debian-11.3-i386 -n vuln-lin32-32 --tools=pov \
$ s2e new_project -i debian-12.5-i386 -n vuln-lin32-32 --tools=pov \
build/s2e/guest-tools32/common/demos/vulnerabilities @@

Here is a breakdown of the ``s2e new_project`` command above:

* It creates a new project called ``vuln-lin32-32`` that will run on the ``debian-11.3-i386`` image.
* It creates a new project called ``vuln-lin32-32`` that will run on the ``debian-12.5-i386`` image.

* The ``--tools=pov`` option is important in order to generate PoVs. If you omit it, you will just get random test
cases, which will at best crash the binary depending on what the constraint solver generates. Vulnerability generation
Expand Down Expand Up @@ -222,51 +222,57 @@ find vulnerabilities and exploit them. This demo walks you through the process o
The CGC Final Event (CFE) ran on the Decree operating system. Decree is a modified Linux OS with a reduced number of
`system calls <https://github.com/CyberGrandChallenge/libcgc>`__. In addition to this, the Decree OS has been modified
to add "hook points" for S2E (e.g. to signal process creation, termination, etc.) and to allow S2E to inject symbolic
values. The source code for the Decree OS is available at https://github.com/S2E/s2e-linux-kernel. A Decree
virtual machine image can be built by running the following command:
values. The source code for the Decree OS is available at https://github.com/S2E/s2e-linux-kernel.

The current version of S2E does not use the old Decree kernel anymore. It was based on Linux 3.13, which does not build
on modern Linux images that currently ship with S2E. Instead, S2E comes with ``cgcload``, a tool that emulates the Decree system calls
in user-mode using the syscall `user dispatch mechanism <https://docs.kernel.org/5.19/admin-guide/syscall-user-dispatch.html>`__
in recent Linux kernels. This tool is a loader that maps the native CGC challenge binary into its address space, sets up syscall emulation
that translates Decree syscalls into native Linux syscalls, then jumps to the binary's entry point.

To run CGC binaries, build the 32-bit Linux image. This is the same image that is used for running normal Linux binaries.

.. code-block:: console

$ cd ~/s2e/env
$ s2e image_build cgc_debian-9.2.1-i386
$ s2e image_build debian-12.5-i386


Next, create an analysis project for a challenge binary. Sample CBs are available `here
<https://github.com/CyberGrandChallenge/samples>`__ and can be built using the instructions `here
<https://github.com/CyberGrandChallenge/cgc-release-documentation/blob/master/walk-throughs/building-a-cb.md>`__ . The
remainder of this tutorial will focus on the CADET_00001 program (a pre-compiled version of which is available `here
<https://github.com/S2E/Decree/blob/master/samples/CADET_00001>`__), but the ideas and techniques should be applicable
to all of the CBs.
to all of the CBs. Note that the S2E tooling does not support multi-binary CBs.

The following command creates a ``projects/CADET_00001`` directory with various scripts and configuration files needed
by S2E, as described `here <../../s2e-env.rst>`__.

.. code-block:: console

$ s2e new_project --image cgc_debian-9.2.1-i386 ./source/s2e/decree/samples/CADET_00001
$ s2e new_project ./source/s2e/decree/samples/CADET_00001

Finally, to start S2E, run the following command:
Then run S2E:

.. code-block:: console

$ s2e run CADET_00001
$ cd projects/CADET_00001
$ ./launch-s2e.sh

This will display a TUI-based dashboard, similar to that used by the American Fuzzy Lop (AFL) fuzzer. As S2E finds
vulnerabilities, it generates PoV files in the ``s2e-last`` directory. These files have either ``.xml`` or ``.c``
file extensions. Once some PoV files have been generated you can press ``q`` to stop S2E.
Wait for PoVs to be generated (``s2e-last/*.c`` files) then interrupt S2E with Ctrl+C.
Finally, verify that the generated PoVs are valid, i.e., they can reproduce the correct crashes when run,
using the following command:

.. image:: cadet_00001_tui.png
.. code-block:: console

Alternatively, you can run S2E without the TUI by using the ``-n`` option in ``s2e run``. Instead of the TUI you will
see the standard S2E output. Once some POVs have been generated you can stop S2E by killing the process with
``Ctrl+C`` or ``killall -9 qemu-system-i386``.
$ ./verify-pov.sh


Understanding CGC-style PoVs
============================

If you followed the tutorial on PoV generation on Linux, you will notice that the PoV format for CGC binaries is
different. Instead of being a concrete input file, CGC binaries produce PoVs in ``.xml`` or ``.c`` format. The reason
different. Instead of being a concrete input file, S2E plugins produce PoVs in ``.c`` format. The reason
for this is that CGC binaries read their input from ``stdin`` and write results to ``stdout``. So in order to exercise
the vulnerability, the PoV must implement a two-way communication with the program, by reading the program's output and
writing an appropriate input. This is different from file-based PoVs, where all the input is sent to the program at
Expand All @@ -280,21 +286,29 @@ once, and the program's output is ignored.


For this reason, replaying a CGC-style PoV is more complex. It requires a special setup so that the PoV can communicate
with the CB. For more details, see `here
<https://github.com/CyberGrandChallenge/cgc-release-documentation/blob/master/walk-throughs/understanding-cfe-povs.md>`__.
The following outlines the steps required to replay a PoV:
with the CB. The ``verify-pov.sh`` script takes care of all the steps. First, the PoV file must be compiled to a binary.
Then the ``povtest`` tool takes this binary and connects its standard input/output to the challenge binary's output/input.
It negotiates the PoV parameters with the PoV binary and monitors the challenge binary for crashes. When the PoV terminates,
``povtest`` verifies that the challenge binary crashed at the correct location (type 1 PoV) or the PoV binary exfiltrated
the correct data from the challenge binary's address space (type 2 PoV).

1. Follow the instructions `here
<https://github.com/CyberGrandChallenge/cgc-release-documentation/blob/master/walk-throughs/running-the-vm.md>`__ to
setup and run the CGC testing VM
.. code-block:: console

# Build the PoV source code. The C file is generated by the DecreePoVGenerator plugin.
$ gcc -O0 -g -m32 -o s2e-last/recipe-type2_i386_decree_shellcode_1.rcp-pov-type2-1 \
-L$S2EDIR/install/bin/guest-tools32/lib -I$S2EDIR/source/s2e/guest/linux/include \
s2e-last/recipe-type2_i386_decree_shellcode_1.rcp-pov-type2-1.c -lpov -lcgc

2. As discussed in the instructions in the previous step, files can be shared between the host and CGC testing VM via
the ``/vagrant`` directory. Copy the CADET_00001 binary, the PoV XML files generated by S2E and `this
<https://github.com/S2E/decree/blob/master/scripts/test_pov.sh>`__ script (located in your S2E environment in
``bin/cgc-tools/test_pov.sh``) to the CGC testing VM.
# Verify the PoV.
$ $S2EDIR/install/bin/guest-tools32/povtest \
--pov=s2e-last/recipe-type2_i386_decree_shellcode_1.rcp-pov-type2-1 \
-- $S2EDIR/bin/guest-tools32/cgcload $S2EDIR/source/s2e/testsuite/pov-cgc-cadet0/CADET_00001/bin/CADET_00001

.. note::

3. Run ``vagrant ssh`` to access the VM and copy the files from ``/vagrant`` into ``/home/vagrant``. Then run the
``test_pov.sh`` script to check the PoV's correctness.
S2E does not use the original DARPA CGC tooling for PoV verification. The tooling is not maintained
and does not run on modern Linux distributions, mostly because it uses an old Python 2 version.
The Decree kernel does not build anymore either so it is not possible to run the CGC binaries natively.


Plugin architecture overview
Expand Down
15 changes: 7 additions & 8 deletions docs/src/Tutorials/PoV/pov.rst
Original file line number Diff line number Diff line change
Expand Up @@ -394,9 +394,8 @@ the binary, S2E disassembles it, extracts all function addresses, then invokes e
the function produces the expected output, identification succeeded.



S2E uses `RevGen <https://github.com/S2E/s2e/tree/master/tools/tools/revgen32>`__, an x86-to-LLVM translator,
in order to extract function types from the binary before analyzing it.
S2E used `RevGen <../Revgen/Revgen.rst>`__, an x86-to-LLVM translator, in order to extract function types from
the binary before analyzing it.


Generating Replayable PoVs
Expand Down Expand Up @@ -536,10 +535,10 @@ A correct PoV would look like this:

To generate a correct PoV, S2E looks at all branch conditions and looks for cases where the content of a receive buffer
is compared with a symbolic value derived from the random number generator. Once it found such a comparison, it can
easily generate the correct PoV code by mapping the symbolic value created in the receive call to the symbolic value
generate the correct PoV code by mapping the symbolic value created in the receive call to the symbolic value
written by the transmit function.

A much harder case happens when the PoV needs to perform computations. Consider the slightly modified above example:
A harder case happens when the PoV needs to perform computations. Consider the slightly modified above example:

.. code-block:: c
:number-lines:
Expand Down Expand Up @@ -589,9 +588,9 @@ solver. The following snippet shows how would an automatically generated PoV wit
transmit(&data1, sizeof(data1));
}

Unfortunately, we ran out of time and didn't have time to implement this solution. The main challenge was to fit an
entire solver within the size and memory limits of a PoV, as well as modifying the solver to accommodate a very
restricted runtime environment, that has primitive memory allocation, no standard library, etc.
Unfortunately, we ran out of time and did not have time to implement this solution. The main challenge was to fit an
entire solver within the size and memory limits of a PoV, as well as modifying the solver to accommodate
a restricted runtime environment, that has primitive memory allocation, no standard library, etc.

Conclusion
==========
Expand Down
17 changes: 0 additions & 17 deletions guest/common/include/s2e/monitors/commands/decree.h
Original file line number Diff line number Diff line change
Expand Up @@ -40,8 +40,6 @@ extern "C" {
#define S2E_DECREEMON_COMMAND_VERSION 0x202301082207ULL // date +%Y%m%d%H%M

enum S2E_DECREEMON_COMMANDS {
DECREE_SEGFAULT,
DECREE_PROCESS_LOAD,
DECREE_READ_DATA,
DECREE_WRITE_DATA,
DECREE_FD_WAIT,
Expand All @@ -54,13 +52,7 @@ enum S2E_DECREEMON_COMMANDS {
DECREE_HANDLE_SYMBOLIC_TRANSMIT_BUFFER,
DECREE_HANDLE_SYMBOLIC_RECEIVE_BUFFER,
DECREE_HANDLE_SYMBOLIC_RANDOM_BUFFER,
DECREE_COPY_TO_USER,
DECREE_UPDATE_MEMORY_MAP,
DECREE_SET_CB_PARAMS,
DECREE_INIT,
DECREE_KERNEL_PANIC,
DECREE_MODULE_LOAD,
DECREE_TASK_SWITCH
};

struct S2E_DECREEMON_COMMAND_READ_DATA {
Expand Down Expand Up @@ -187,27 +179,18 @@ struct S2E_DECREEMON_COMMAND_KERNEL_PANIC {
struct S2E_DECREEMON_COMMAND {
uint64_t version;
enum S2E_DECREEMON_COMMANDS Command;
struct S2E_LINUXMON_TASK CurrentTask;
union {
struct S2E_LINUXMON_COMMAND_PROCESS_LOAD ProcessLoad;
struct S2E_LINUXMON_COMMAND_MODULE_LOAD ModuleLoad;
struct S2E_DECREEMON_COMMAND_READ_DATA Data;
struct S2E_DECREEMON_COMMAND_WRITE_DATA WriteData;
struct S2E_DECREEMON_COMMAND_FD_WAIT FDWait;
struct S2E_DECREEMON_COMMAND_SEG_FAULT SegFault;
struct S2E_DECREEMON_COMMAND_RANDOM Random;
struct S2E_DECREEMON_COMMAND_READ_DATA_POST DataPost;
struct S2E_DECREEMON_COMMAND_GET_CFG_BOOL GetCfgBool;
struct S2E_DECREEMON_COMMAND_HANDLE_SYMBOLIC_SIZE SymbolicSize;
struct S2E_DECREEMON_COMMAND_HANDLE_SYMBOLIC_BUFFER SymbolicBuffer;
struct S2E_DECREEMON_COMMAND_COPY_TO_USER CopyToUser;
struct S2E_DECREEMON_COMMAND_UPDATE_MEMORY_MAP UpdateMemoryMap;
struct S2E_DECREEMON_COMMAND_SET_CB_PARAMS CbParams;
struct S2E_DECREEMON_COMMAND_INIT Init;
struct S2E_DECREEMON_COMMAND_KERNEL_PANIC Panic;
struct S2E_LINUXMON_COMMAND_TASK_SWITCH TaskSwitch;
};
char currentName[32]; // not NULL terminated
} __attribute__((packed));

#ifdef __cplusplus
Expand Down
8 changes: 7 additions & 1 deletion guest/linux/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,13 @@ set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O0")
add_subdirectory(function_models)

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O3")
add_subdirectory(cgccmd)
add_subdirectory(s2e.so)

if(${BITS} EQUAL 32)
add_subdirectory(libpov)
add_subdirectory(libcgc)
add_subdirectory(cgcload)
add_subdirectory(povtest)
endif()

install(DIRECTORY scripts/ DESTINATION .)
139 changes: 0 additions & 139 deletions guest/linux/cgccmd/cgccmd.c

This file was deleted.

Loading
Loading