Follow up on fuzzing reports #3800
Comments
I'll try this one. |
Thank you @Dev-Voldemort for taking this up. It would help a lot in identifying the faults in the daily and weekly CI. |
I can help with this one, since no one is assigned. |
Glad to be of help :) |
I'm new to fuzzers, so I don't know how a working fuzzer should behave. After reviewing some errors and applying some nits, I'm getting this: Does this mean it's working? |
Yes, this is how fuzzing reports come in. |
Making the responsible functions asynchronous seems to resolve the errors, but I'm not sure if that's the correct way to go. |
Are you sure? Making all those functions asynchronous would require all of their caller functions to be asynchronous as well, or they'd have to be called in an asyncio loop (basically we'd have to change a lot of code :).. ) |
We generally do fuzz testing to check whether our functionality is robust against different inputs (that's why we feed it random data/unexpected inputs, in the hope of finding something that makes it crash), and then we use that information to improve it. So I'm not sure how making the functions async would help; can you elaborate further? Also, I think async has a totally different use case. |
As others have said: yes, that looks like it's working! Fuzzing basically means "throw garbage data at this and see if it crashes." Sometimes your data has an expected format and sufficient error checking that it'll reject anything that doesn't fit that format (e.g. reject all things that aren't json, reject all things that don't include a required field).

When you (or our CI system) run one of those fuzzers, it's going to keep making superficially correct data filled with trash until "something interesting" happens. If it gets a crash or a slowdown, it should dump some information about what input it used to make that happen, so that you can try it again and figure out what went wrong.

Most of the time it's finding places where we didn't do perfect input validation, so it'll be stuff like "this field had a % in it and we should only allow letters and numbers" that should be pretty easy to fix. Every once in a while it'll find something that might be a viable security exploit ("if you put html here it'll show up in the final report!"), but a lot of it should be fairly simple and mundane.

When run in CI we're currently running for an hour and then timing out, so the jobs that run for around an hour will likely only show "we're making garbage!" lines like you screenshotted, and then spit out a line about a keyboard interrupt or something (the CI timeout seems to do the equivalent of hitting ^C). The interesting jobs will be the ones where it says something about a crash.

Note that an hour is really short for a fuzzing campaign -- it'd be normal to run a fuzzer for days or weeks. But I didn't want to abuse the GitHub Actions resources, so our CI job right now is meant mostly to make sure that the fuzzers work (even in the face of other code changes going on) and help us get started on any easy findings before we start looking at longer runs.
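To make the "keep making garbage until something interesting happens" idea concrete, here's a toy sketch of the core fuzzing loop. It uses `json.loads` as a stand-in target; the real cve-bin-tool fuzzers target its parsers and use a proper fuzzing engine rather than this hand-rolled loop:

```python
import json
import random
import string

# Toy fuzz loop: feed random printable garbage to a target function.
# json.loads stands in for a real parser under test; expected rejections
# (JSONDecodeError) are fine, anything else is a "finding" worth reporting.
def fuzz_once(rng):
    garbage = "".join(rng.choice(string.printable)
                      for _ in range(rng.randint(0, 40)))
    try:
        json.loads(garbage)
    except json.JSONDecodeError:
        return None              # well-behaved rejection of malformed input
    except Exception as exc:     # unexpected crash: record input + error
        return (garbage, exc)
    return None                  # garbage happened to be valid JSON

rng = random.Random(0)           # seeded so the run is reproducible
findings = [f for f in (fuzz_once(rng) for _ in range(1000)) if f]
```

A real fuzzing engine improves on this by mutating inputs that reach new code paths instead of generating blindly, and by minimizing any crashing input before reporting it.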
BTW, I think the tooling we're using is smart enough to "start where it left off" in garbage generation but we might need to make sure we're caching whatever it needs correctly so it can do that. I don't know offhand if that's set up yet so it might be a thing to look into. We can also run the fuzzing job more often if at least one person actively wants to work on it, or you can run it yourself on a fork to play around. |
Well, that's the interesting part: async has a totally unrelated use case, yet just marking a function with async (I'm not even changing its call) suppresses the errors and the fuzzer behaves as if in working condition. At first glance, it seems like some kind of blocking issue. I'll dig more when I find time. |
Exactly. IMO making them asynchronous shouldn't resolve/suppress the errors, yet it does. The issue at hand is related to some validation nits; I'll make PRs soon, but I'm confused about this async behavior. |
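One possible explanation for the "errors disappear" effect (an assumption, not verified against the actual cve-bin-tool code): calling an `async def` function without awaiting it only creates a coroutine object; the function body never executes, so any exception it would have raised silently vanishes. A minimal sketch with a hypothetical `validate` function:

```python
import asyncio

# Hypothetical validation function with a deliberate bug a fuzzer would catch.
def validate(field):
    if "%" in field:
        raise ValueError("bad character")

# Same body, but marked async.
async def validate_async(field):
    if "%" in field:
        raise ValueError("bad character")

# The plain function raises immediately:
try:
    validate("100%")
    sync_raised = False
except ValueError:
    sync_raised = True

# Calling the async version WITHOUT awaiting never runs the body,
# so no exception is raised and the fuzzer sees no crash:
coro = validate_async("100%")   # no error here, just a coroutine object
coro.close()                    # discard the un-awaited coroutine

# Only awaiting (here via asyncio.run) executes the body and surfaces the error:
try:
    asyncio.run(validate_async("100%"))
    async_raised = False
except ValueError:
    async_raised = True
```

If the call sites weren't updated to await, this would make the fuzzer's errors "disappear" without fixing anything, which would match the behavior described above.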
As a completely uninformed guess: the fuzzer does look for slowdowns and treats them as errors, so the async may be a valid fix here in that it unblocks a potential choke point, or it may be disguising places where we should be improving regexes or data validation to avoid slowdowns in the first place. Can't know without looking at exactly what's going on! |
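For the "improving regexes to avoid slowdowns" point, here's a textbook illustration (not taken from cve-bin-tool) of the kind of catastrophic-backtracking pattern a fuzzer's slowdown report can point at:

```python
import re
import time

# Classic catastrophic-backtracking pattern: the nested quantifiers force
# the engine to try every way of splitting the run of "a"s before failing.
SLOW = re.compile(r"(a+)+$")
SAFE = re.compile(r"a+$")    # rejects/accepts the same strings, linear time

payload = "a" * 20 + "!"     # almost matches, so backtracking explodes

t0 = time.perf_counter()
slow_result = SLOW.match(payload)
slow_time = time.perf_counter() - t0

t0 = time.perf_counter()
safe_result = SAFE.match(payload)
safe_time = time.perf_counter() - t0

# Both patterns reject the input, but the nested-quantifier one takes
# exponentially longer as the run of "a"s grows.
```

This is the "unblock a choke point vs. fix the regex" distinction: wrapping the slow call in async might hide the stall from the fuzzer, but rewriting the pattern removes it.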
The initial fuzzing reports here have been examined and issues fixed. I'm going to close this, but we likely will have new issues to examine especially since we've added so many new fuzzing options. |
@crazytrain328 has been working on a bunch of new fuzzers for cve-bin-tool's language parsers, and @mastersans set up a regular fuzzing job that runs them and sees what it finds.
And it's definitely finding some things:
https://github.com/intel/cve-bin-tool/actions/workflows/fuzzing.yml
It's a bit hard to tell because the fuzzing jobs always fail -- if they don't find something then the timeout eventually marks the job as a failure. But if you look, the ones that didn't find anything all quit after an hour, and the ones that did find something tend to have much shorter runtimes. Here's one:
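The runtime heuristic described above (timeout at an hour means "found nothing", a much shorter run probably means a crash) can be sketched like this; the job names and durations are made up for illustration:

```python
# Triage heuristic for fuzzing CI runs: jobs that hit the one-hour
# timeout probably found nothing, while jobs that stopped well short
# of it likely crashed on an interesting input.
TIMEOUT_SECONDS = 60 * 60

def likely_found_something(duration_seconds, margin=300):
    """A run ending well before the timeout probably stopped on a crash."""
    return duration_seconds < TIMEOUT_SECONDS - margin

# Hypothetical job durations in seconds (not real CI data).
runs = {
    "fuzz_json_parser": 3612,   # ran the full hour: nothing found
    "fuzz_rust_parser": 847,    # stopped early: worth a look
    "fuzz_go_parser": 3598,     # effectively the full hour
}

interesting = [name for name, secs in runs.items()
               if likely_found_something(secs)]
```

Something like this could eventually feed a dashboard or a summary comment, so nobody has to eyeball every red job.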
https://github.com/intel/cve-bin-tool/actions/runs/7780736331/job/21213930431
What I'd like:
I don't know that this is a good first issue but it could be a good follow-up issue for those of you who've already got a few PRs done and want something more complicated to work on that'll give you an excuse to look at different pieces of the code.