debugging-benchmark

Welcome to the debugging benchmark toolkit! This guide will walk you through using our benchmarks to test and evaluate your research prototypes efficiently.

Quickstart

Initializing the Calculator Benchmark

Let's start by initializing the CalculatorBenchmarkRepository from our benchmark collection. This repository contains different subjects for the calculator benchmark, each designed to test various aspects of calculator implementations.

from debugging_benchmark.calculator.calculator import CalculatorBenchmarkRepository

calculator_repo = CalculatorBenchmarkRepository()
calculator_subjects = calculator_repo.build()

print(f"Initialized Calculator Benchmark with {len(calculator_subjects)} subjects.")

Fuzzing the Calculator Benchmark

Next, we'll fuzz each calculator subject to generate passing and failing inputs. The GrammarBasedEvaluationFuzzer creates these inputs from the grammar and rules defined in the calculator benchmark.

from debugging_framework.tools import GrammarBasedEvaluationFuzzer

print("Fuzzing the calculator repository...")

for calculator_subject in calculator_subjects:
    print(f"Fuzzing the calculator subject ({calculator_subject})...")
    param = calculator_subject.to_dict()
    
    fuzzer = GrammarBasedEvaluationFuzzer(**param)
    failing_inputs = fuzzer.run().get_all_failing_inputs()

    if failing_inputs:
        print("Found the following failing inputs:")
        for failing_input in failing_inputs:
            print(failing_input)
    else:
        print("No failing inputs found.")
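
To make the idea of separating passing from failing inputs concrete, here is a minimal, self-contained sketch of grammar-based fuzzing against an oracle. It is a toy and not the toolkit's implementation: GRAMMAR, expand, toy_calculator, is_failing, and fuzz are all hypothetical names introduced for illustration, and the assumption that a failure means "raises ValueError" is ours.

```python
import math
import random
import re

# Hypothetical toy grammar: each nonterminal maps to its expansion alternatives.
GRAMMAR = {
    "<start>": ["<func>(<number>)"],
    "<func>": ["sqrt", "sin", "cos"],
    "<number>": ["<sign><digit>"],
    "<sign>": ["", "-"],
    "<digit>": ["0", "1", "4", "9"],
}

NONTERMINAL = re.compile(r"<[^<>]+>")


def expand(grammar, rng, symbol="<start>"):
    """Replace the leftmost nonterminal with a random alternative until none remain."""
    text = symbol
    while True:
        match = NONTERMINAL.search(text)
        if match is None:
            return text
        replacement = rng.choice(grammar[match.group(0)])
        text = text[:match.start()] + replacement + text[match.end():]


def toy_calculator(expression):
    """Hypothetical program under test: evaluates inputs such as 'sqrt(4)'."""
    name, argument = expression.rstrip(")").split("(")
    return getattr(math, name)(float(argument))


def is_failing(expression):
    """Hypothetical oracle: an input fails if the calculator raises ValueError."""
    try:
        toy_calculator(expression)
    except ValueError:
        return True
    return False


def fuzz(count, seed=0):
    """Generate `count` inputs and split them into passing and failing lists."""
    rng = random.Random(seed)
    passing, failing = [], []
    for _ in range(count):
        candidate = expand(GRAMMAR, rng)
        (failing if is_failing(candidate) else passing).append(candidate)
    return passing, failing


passing, failing = fuzz(50)
print(f"{len(passing)} passing, {len(failing)} failing inputs")
```

In this toy, only the square root of a negative nonzero number is classified as failing, so an input such as sqrt(-4) ends up in the failing list while sin(-1) passes.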

Deeper Look into the Class Structure

Check out the Class Diagram for a first overview. Further down in this section we take a look at some key functions of interest.

Class Diagram

build()

Returns a list of BenchmarkPrograms. Internally, it calls _construct_test_program(). build() is the interface you call to obtain the benchmark subjects.
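
The relationship between build() and _construct_test_program() can be pictured as a template method: the public method iterates over the subjects and delegates the construction of each one to a hook that concrete repositories implement. The sketch below only illustrates that pattern. ToyProgram, ToyRepository, subject_names, and the signature given to _construct_test_program are assumptions made for this example and may differ from the real classes.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyProgram:
    """Hypothetical stand-in for a benchmark program."""
    name: str
    run: Callable[[float], float]


class ToyRepository(ABC):
    """Hypothetical repository base class; build() is its public interface."""

    @abstractmethod
    def subject_names(self) -> List[str]:
        """Names of the subjects this repository provides (assumed helper)."""

    @abstractmethod
    def _construct_test_program(self, name: str) -> ToyProgram:
        """Hook that creates a single program (assumed signature)."""

    def build(self) -> List[ToyProgram]:
        # Delegate the construction of every subject to the hook.
        return [self._construct_test_program(name) for name in self.subject_names()]


class ToyCalculatorRepository(ToyRepository):
    def subject_names(self) -> List[str]:
        return ["sqrt", "cos"]

    def _construct_test_program(self, name: str) -> ToyProgram:
        return ToyProgram(name=name, run=getattr(math, name))


programs = ToyCalculatorRepository().build()
print([program.name for program in programs])
```

Callers only ever use build(), so a new benchmark needs to supply just the two hooks.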

to_dict()

Returns a dict with the keys grammar, oracle, and initial_inputs. These parameters can be used to fuzz new inputs.
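
Because the dict uses exactly these three keys, it can be unpacked with ** into any callable that accepts them as keyword arguments, which is what the Quickstart does when it creates the fuzzer. The following sketch shows the shape of such a dict with toy values. ToySubject and summarize are hypothetical, and the value types (a dict-based grammar, a boolean oracle) are assumptions rather than the toolkit's actual types.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToySubject:
    """Hypothetical subject holding the three pieces a tool needs."""
    grammar: Dict[str, List[str]]
    oracle: Callable[[str], bool]
    initial_inputs: List[str]

    def to_dict(self) -> dict:
        # Expose the fields under the keys grammar, oracle and initial_inputs.
        return {
            "grammar": self.grammar,
            "oracle": self.oracle,
            "initial_inputs": self.initial_inputs,
        }


def summarize(grammar, oracle, initial_inputs):
    """Stand-in for a tool constructor that takes the three keys as keywords."""
    verdicts = {text: oracle(text) for text in initial_inputs}
    return {"rules": len(grammar), "verdicts": verdicts}


subject = ToySubject(
    grammar={"<start>": ["<digit>"], "<digit>": ["0", "1"]},
    oracle=lambda text: text == "0",  # assumption: True marks a failing input
    initial_inputs=["0", "1"],
)

param = subject.to_dict()
summary = summarize(**param)
print(sorted(param))
```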

Example Use of the Abstract Classes

The implementation of these classes can be found in debugging_benchmark/student_assignments.py.

The faulty programs can be found at debugging_benchmark/student_assignments/problem_1_GCD, and the correct implementation at debugging_benchmark/student_assignments/reference1.py.

Install, Development, Testing

Install

If all external dependencies are available, a simple pip install PLACEHOLDER suffices. We recommend installing PLACEHOLDER inside a virtual environment (virtualenv):

python3.10 -m venv venv
source venv/bin/activate

pip install --upgrade pip
pip install PLACEHOLDER

Development and Testing

For development and testing, we recommend using PLACEHOLDER inside a virtual environment (virtualenv). By doing the following steps in a standard shell (bash), one can run the PLACEHOLDER tests:

git clone https://github.com/martineberlein/debugging-benchmark
cd debugging-benchmark

python3.10 -m venv venv
source venv/bin/activate

pip install --upgrade pip

# Install the package with its development dependencies
pip install -e ".[dev]"

# Run tests
python3 -m pytest
