
cover_or_fail! macro #3424

Closed
wants to merge 3 commits

Conversation

carolynzech
Contributor

@carolynzech commented Aug 7, 2024

This PR adds a cover_or_fail! macro to Kani. This macro has the same functionality as the existing cover! macro, except it causes verification to fail when the outcome is unsatisfiable or unreachable.
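
For illustration, here's a minimal harness sketch of the intended difference in behavior, assuming the new macro mirrors cover!'s signature as described above:

```rust
#[kani::proof]
fn check_cover_behavior() {
    let x: u8 = kani::any();
    kani::assume(x < 50);

    // Existing behavior: an unsatisfiable cover is reported as such,
    // but the harness still passes overall.
    kani::cover!(x > 100, "unsatisfiable under the assumption");

    // Proposed behavior: the same unsatisfiable condition causes
    // verification to fail.
    kani::cover_or_fail!(x > 100, "unsatisfiable under the assumption");
}
```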

Currently, a caller can invoke this macro from anywhere in their code (not just harnesses), which is also true for cover. However, we only analyze reachability starting from a given proof harness (the entry point), so if the cover_or_fail call is contained inside a function that the harness does not call, verification will succeed even though the call is not actually covered. I'm concerned that this behavior will produce confusing results for users. Specifically, see the test cases inside cover-unreachable-outside-harness-pass (which pass verification) and cover-unreachable-outside-harness-fail (which fail verification). See also the discussion of this issue here. Is this the behavior that we want?
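
To make the reachability concern concrete, here's a hypothetical sketch (function names invented for illustration) of the situation those tests exercise:

```rust
// Never called from any harness, so reachability analysis slices it away,
// CBMC never sees the macro call, and verification still succeeds.
#[allow(dead_code)]
fn helper_not_called_by_harness() {
    kani::cover_or_fail!(true, "silently ignored: unreachable from the entry point");
}

#[kani::proof]
fn harness() {
    let x: u8 = kani::any();
    // Analyzed, because it is reachable from this harness (the entry point).
    kani::cover_or_fail!(x == 0, "reachable and satisfiable");
}
```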

Our options are:

  1. Leave the implementation like this, with clear documentation of the reachability behavior. The benefit of this design is that users can more easily use cover_or_fail throughout their code for debugging.
  2. Add a pass which scans the entire crate for calls to cover_or_fail and makes sure that CBMC marked them all as reachable. This allows using cover_or_fail throughout the codebase and avoids any confusing reachability behavior, but may be slow.
  3. Restrict calls to cover_or_fail to proof harnesses only. This approach solves the reachability issue, since CBMC will see every call to cover_or_fail, but does restrict users--the author of the issue that prompted this PR, for example, wanted to use it outside of a proof harness.

I'm leaning toward option 1 or option 3, but curious to hear others' thoughts.

Resolves #2792 (fail harness if cover is not satisfied)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@github-actions bot added the Z-BenchCI (Tag a PR to run benchmark CI) label Aug 7, 2024
@adpaco-aws
Contributor

Thanks @carolynzech for this! It's nice to finally see a concrete implementation for this feature.

I do have some thoughts about it too. I suppose you know about the closely related option --fail-uncoverable which essentially converts all cover! macros into something like this. The proposal was added to the RFC: Cover statement in #2967, but we didn't get to finish the implementation in #2532 (#2967 in particular had a nice discussion based on theoretical implementations of the cover_or_fail! macro 😄). This is mostly to say that the cover_or_fail! macro can be implemented in a number of ways, and we won't know which option is best without user feedback. We can try to guess which one is best, though.

In my opinion, option 3 would restrict the macro's usage too much for it to be useful. Option 1 seems fine to me, and it's the same as what we were doing for --fail-uncoverable. But I'm wondering now: can we do better? Maybe we could do something like a mix of options 1 and 2, where (1) we scan the codebase and emit a warning for each cover_or_fail! macro that is going to get sliced away, and (2) we perform the analysis as proposed in option 1. However, I'm not sure this option would be acceptable if we need to add another compiler pass just for it - it would be a noticeable overhead even when users don't have any cover_or_fail! macros in their code. What do you think?

@carolynzech
Contributor Author

> I suppose you know about the closely related option --fail-uncoverable which essentially converts all cover! macros into something like this.

Yes, I did see that. This implementation is subtly different in that it allows users to mix and match cover! with cover_or_fail!, rather than using a flag to force all covers to become failures. That might be nice for debugging if a user just wants to focus on a few failing cover statements at a time, rather than getting a wall of output for all of the cover statements. In practice, it's probably not a huge difference, and the command-line flag would have the benefit of reducing some of the code duplication between cover/cover_or_fail that we see here.
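
As a purely illustrative sketch of what that mix-and-match debugging flow might look like:

```rust
#[kani::proof]
fn mixed_covers() {
    let x: i32 = kani::any();
    kani::assume(x > 0);

    // Informational only: reported, but never fails the harness.
    kani::cover!(x % 2 == 0, "even positive values are reachable");

    // The one condition currently under investigation, escalated so the
    // harness fails if it ever becomes uncoverable.
    kani::cover_or_fail!(x == 1, "smallest positive value is reachable");
}
```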

> In my opinion, option 3 would restrict the macro's usage too much for it to be useful.

I would tend to agree--it's a less messy solution for sure, but that's moot if it doesn't help users in the first place.

> Maybe we could do something like a mix of options 1 and 2, where (1) we scan the codebase and emit a warning for each cover_or_fail! macro that is going to get sliced away, and (2) we perform the analysis as proposed in option 1. However, I'm not sure this option would be acceptable if we need to add another compiler pass just for it - it would be a noticeable overhead even when users don't have any cover_or_fail! macros in their code. What do you think?

If I imagine myself as a user of Kani unfamiliar with the implementation, I would expect that if I put cover_or_fail throughout my code and verification succeeded, Kani found all of those statements to be reachable from the harness. My concern is that users won't read the fine print re: slicing and will take the successful verification to mean something it doesn't.

However, your performance point is well-taken--we don't expect people to use this macro that often, so doing a whole extra pass just to check for it seems like a waste. Perhaps we flip the order of your proposed steps--by default, we do (2) and do the analysis as proposed in option 1, but offer a --comprehensive flag that will do (1). Additionally, for each cover_or_fail that we do see, if we detect that --comprehensive was not passed, we can emit a warning suggesting it. This of course does not help in the case that no cover statements were detected in the first place, but (hopefully) the user is more likely to notice that anyway. I think this warning would cover the more common human error case where a few cover statements of many were not analyzed and the user doesn't notice.

@adpaco-aws
Contributor

> Perhaps we flip the order of your proposed steps--by default, we do (2) and do the analysis as proposed in option 1, but offer a --comprehensive flag that will do (1). Additionally, for each cover_or_fail that we do see, if we detect that --comprehensive was not passed, we can emit a warning suggesting it. This of course does not help in the case that no cover statements were detected in the first place, but (hopefully) the user is more likely to notice that anyway. I think this warning would cover the more common human error case where a few cover statements of many were not analyzed and the user doesn't notice.

If we are going to have an option that warns about cover_or_fail!s being left out, we might as well error out instead. That will make sure the user notices 😆 In fact, I believe that's the main thing users want out of this feature: if some line of code stops being verified for whatever reason, they want to know. Warnings will likely go unnoticed, so they would not find out.

I'd be happy with a feature like that. We need to make sure the reachability caveat is well documented, and I'd probably use a more detailed name for the flag, but the rest sounds good to me.

@carolynzech
Contributor Author

Closing for now; this feature requires further design discussion.

@carolynzech closed this Sep 6, 2024