Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gltfpack: Implement end-to-end fuzzing #625

Merged
merged 6 commits into from
Oct 26, 2023
Merged

gltfpack: Implement end-to-end fuzzing #625

merged 6 commits into from
Oct 26, 2023

Conversation

zeux
Copy link
Owner

@zeux zeux commented Oct 26, 2023

While individual components of gltfpack like cgltf or mesh encoders have
their own dedicated fuzzers, gltfpack has a lot of code that can be
difficult to validate. We would expect that gltfpack is resistant to
invalid inputs (with the exception, perhaps, of .obj inputs where
fast_obj, the library we use, is currently not fuzz safe).

This change implements a in-memory GLB loading flow (parseGlb) and an
in-memory fuzzer; we disable all file I/O by using NULL paths, and
otherwise share all the same processing code with the exception of final
buffer writing.

This is work in progress as fuzzing issues are being discovered and fixed.

zeux added 6 commits October 26, 2023 16:13
While individual components of gltfpack like cgltf or mesh encoders have
their own dedicated fuzzers, gltfpack has a lot of code that can be
difficult to validate. We would expect that gltfpack is resistant to
invalid inputs (with the exception, perhaps, of .obj inputs where
fast_obj, the library we use, is currently not fuzz safe).

This change implements a in-memory GLB loading flow (parseGlb) and an
in-memory fuzzer; we disable all file I/O by using NULL paths, and
otherwise share all the same processing code with the exception of final
buffer writing.
Note that the dictionary doesn't contain a lot of glTF extensions
but it's not clear if we need this, because LLVM libFuzzer uses a
mutator that uses strings from the binary in addition to supplied
dictionary data.

The GLB seed is taken from glTF-Sample-Assets BoxTextured, and is
simultaneously reasonably small and covers all the basic features so
it's a good candidate to use for a seed.
We technically do this validation during mesh processing, but it's
better to flag any out of bounds indices earlier.
We currently have a limit of 16 on some stream based processing
algorithms, and it's possible with custom attributes to reach a limit of
16, so for now we just drop extra streams.
To make sure that we don't spend too much time fuzzing the cgltf parser,
we only add the files to the corpus if the parsing succeeded.

As the comment explains, this is a tradeoff and it's not fully clear
what is optimal here, but for now this seems to work okay.
This doesn't touch the top-level gltfpack executable which makes it
easier to work on gltfpack while the fuzzer is working, and launches the
fuzzer automatically.
@zeux
Copy link
Owner Author

zeux commented Oct 26, 2023

This PR originally contained a few fixes to cgltf validation but I've extracted them into upstream PRs: jkuhlmann/cgltf#233 jkuhlmann/cgltf#234 jkuhlmann/cgltf#235. Once they get merged this PR should result in a fairly clean fuzzing run: a couple minor problems still remain, and there's some alignment validation missing that results in UBSAN failures (which this change doesn't enable for fuzzing config) but things seem to generally be in a good shape, at least with the default setting set.

@zeux zeux merged commit 0415662 into master Oct 26, 2023
12 checks passed
@zeux zeux deleted the gltffuzz branch October 26, 2023 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant