Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Line Coverage #2612

Merged
merged 13 commits into from
Jul 26, 2023
Merged

RFC: Line Coverage #2612

merged 13 commits into from
Jul 26, 2023

Conversation

adpaco-aws
Copy link
Contributor

@adpaco-aws adpaco-aws commented Jul 20, 2023

Description of changes:

An RFC to discuss an option in Kani to generate verification-based line coverage reports.

Rendered version is available here.

Proof-of-concept is available in #2609

Resolved issues:

Resolves (partially) #2610

Call-outs:

Testing:

  • How is this change tested? N/A

  • Is this a refactor change? N/A

Checklist

  • Each commit message has a non-empty body, explaining why the change was made
  • Methods or procedures are documented
  • Regression or unit tests are included, or existing tests cover the modified code
  • My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@adpaco-aws adpaco-aws added the T-RFC Label RFC PRs and Issues label Jul 20, 2023
@adpaco-aws adpaco-aws requested a review from a team as a code owner July 20, 2023 17:18
@adpaco-aws adpaco-aws mentioned this pull request Jul 20, 2023
4 tasks
Copy link
Contributor

@zhassan-aws zhassan-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the proposal and think that implementing coverage in Kani is the way to go. However, I think we should rely on existing compiler APIs (e.g. CodeRegion) to compute coverage that is consistent with compiler-instrumented coverage.

rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
@zhassan-aws
Copy link
Contributor

In particular, can we implement a MIR pass similar to rustc's InstrumentCoverage? This is described in some details in the coverage instrumentation dev guide:
https://github.com/rust-lang/rustc-dev-guide/blob/master/src/llvm-coverage-instrumentation.md#mir-pass-instrumentcoverage

@adpaco-aws
Copy link
Contributor Author

However, I think we should rely on existing compiler APIs (e.g. CodeRegion) to compute coverage that is consistent with compiler-instrumented coverage.

We both agree on this, and the RFC should discuss these references in more detail.

That said, my main concern as of now with the compiler APIs for coverage is that they heavily rely on LLVM intrinsics and tools to do most of the profiling work. We can definitely plug in the InstrumentCoverage MIR pass to our compiler, but our backend is not LLVM-based so it's not clear to me what to do then (note that LLVM intrinsics are used to increment the region-based counters, but then you also need the LLVM runtime to extract their values). So I don't think the compiler APIs are directly usable for us unless some work happens in our backend first.

In summary, I don't think we're capable of adopting these APIs at the moment, but I'm looking forward to learning more about them and eventually replace the proposed mechanism. But that's out of scope for this version, which aims at line-level coverage information.

@zhassan-aws
Copy link
Contributor

We can definitely plug in the InstrumentCoverage MIR pass to our compiler, but our backend is not LLVM-based so it's not clear to me what to do then (note that LLVM intrinsics are used to increment the region-based counters

True: the InstrumentCoverage pass cannot be used as is. But what I meant is that we can implement our own version of it that does not rely on any LLVM intrinsics. In particular, from the description of the pass , it performs multiple steps, one of which is to determine a minimum set of spans for computing coverage. Next, for each span, it injects a counter. Our version can use the step that computes the minimum set of spans, but replace the counter injection step with the injection of a cover statement (assert(false)). Does this make sense?

@adpaco-aws
Copy link
Contributor Author

Yes, that makes sense. Thank you!

Do you have any feedback on the UX side?

@zhassan-aws
Copy link
Contributor

Do you have any feedback on the UX side?

One question is: how does this work with multiple harnesses? Ideally, one would want a single coverage report where a line is considered covered if it is reachable from one or more harnesses. That way, one can focus on writing new harnesses that cover lines that are currently uncovered by any of the existing harnesses.

@zhassan-aws
Copy link
Contributor

Another question: how would we report coverage for functions in user code that are not reachable from a harness? As a user, I would expect the lines in those functions to be reported as uncovered, but the pruning done by the MIR linker may remove those functions altogether, and thus no checks would be generated for them at all.

@adpaco-aws
Copy link
Contributor Author

We can implement our own version of it that does not rely on any LLVM intrinsics. In particular, from the description of the pass , it performs multiple steps, one of which is to determine a minimum set of spans for computing coverage. Next, for each span, it injects a counter. Our version can use the step that computes the minimum set of spans, but replace the counter injection step with the injection of a cover statement (assert(false)).

Today I tried implementing a MIR pass similar to InstrumentCoverage but I haven't been successful. The main problem is that the coverage module in rustc_mir_transform is private, so all types (BasicCoverageBlocks, CoverageSpan, etc.) and APIs (inject_counters, generate_coverage_spans, etc.) used for the transformation pass need to copied into kani-compiler. I tried bringing them into kani-compiler so I could experiment with this approach but at some point the number of compiler errors was too high for me to resolve in a single day.

But I learned something important along the way: This instrumentation is quite similar to the one proposed in the RFC, but with many steps dedicated to optimizing runtime. In particular, I'd like to point you to these comments in CoverageSpan and CoverageSpans:

Each Span maps to a CoverageSpan that references the originating BCB and one or more MIR Statements and/or Terminators. Initially, the Spans come from the Statements and Terminators, but subsequent transforms can combine adjacent Spans and CoverageSpan from the same BCB [...]

Converts the initial set of CoverageSpans (one per MIR Statement or Terminator) into a minimal set of CoverageSpans [...]

Because of that, I'm confident that our proposed implementation, which injects coverage statements for each statement and terminator, is essentially a naive version of the InstrumentCoverage algorithm. Steps like make_bcb_counters wouldn't add much in our case since we don't need to distinguish between different kinds of counters. I'm hoping this makes clearer the relation between the proposed approach and Rust's coverage instrumentation.

Another thing that I've tried is passing -C instrument-coverage when calling the Rust compiler from Kani. This resulted in the error:

error[E0463]: can't find crate for `profiler_builtins`
  |
  = note: the compiler may have been built without the profiler runtime

I was able to work around this by also adding the -Z no-profiler-runtime, but then all I got was a mysterious assertion:

Check 1: wrong_coverage_2.missing_definition.1
         - Status: FAILURE
         - Description: "assertion"
         - Location: assume_false.rs:3:17 in function wrong_coverage_2

Check 2: wrong_coverage_2.assertion.1
         - Status: UNREACHABLE
         - Description: "assertion failed: a < 5"
         - Location: assume_false.rs:5:5 in function wrong_coverage_2

Debug logs showed a single counter being injected for the whole block:

2023-07-21T18:22:33.702508Z DEBUG CodegenFunction{name="wrong_coverage_2"}: kani_compiler::codegen_cprover_gotoc::utils::debug: Coverage::Counter(1) for assume_false.rs:2:1 - 6:2
[...]
2023-07-21T18:22:33.703152Z DEBUG CodegenFunction{name="wrong_coverage_2"}:CodegenStatement{statement=Coverage::Counter(1) for assume_false.rs:2:1 - 6:2}: kani_compiler::codegen_cprover_gotoc::codegen::statement: handling_statement stmt=Coverage::Counter(1) for assume_false.rs:2:1 - 6:2 kind=Coverage(Coverage { kind: Counter(1), code_region: Some(assume_false.rs:2:1 - 6:2) })

At that point, Kani APIs may not have been taken into account in the computation for coverage basic blocks, resulting in a single CoverageStatement for the whole function (this is the example with assume(false) followed by assert!(...)). But note that the check introduced here isn't the coverage statement we discussed earlier. It's possible that this missing_definition check is related to the profile missing from the compilation process in Kani.

In summary, some of these APIs aren't public which makes it difficult to use them, and it isn't clear to me what code we'd need to trim even if we do. The basic idea seems to be the same, but many steps seem focused on optimizing runtime. Another attempt to use the coverage instrumentation has failed. Do you still think it's worth exploring this for the first version?

@zhassan-aws
Copy link
Contributor

I see. Thanks for exploring this option @adpaco-aws. If we have no reason to believe that the proposed approach is different from the way rustc instruments coverage, then I'm OK with proceeding with the current plan. Once we implement it, we can conduct experiments that check for discrepancies between rustc's coverage and ours (e.g. by checking that lines that are reported as covered through rustc's instrumentation are reported as covered by Kani).

@qinheping
Copy link
Contributor

How do we report if a line is partially covered? That is, there are multiple instructions in the same line and only some of them are reachable.

Copy link
Contributor

@celinval celinval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the goal of this RFC is not quite clear to me (neither, the path to stabilization).

BTW, I think it's OK for us to have a lot of open questions, it's just not clear to me what is an open question and what is a design decision at this point. For example, what is the coverage granularity that we are planning to provide? The design talks about statements, but the output is line based.

rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-viewerless-coverage.md Outdated Show resolved Hide resolved
@adpaco-aws
Copy link
Contributor Author

adpaco-aws commented Jul 24, 2023

One question is: how does this work with multiple harnesses? Ideally, one would want a single coverage report where a line is considered covered if it is reachable from one or more harnesses. That way, one can focus on writing new harnesses that cover lines that are currently uncovered by any of the existing harnesses.

Good point! Yes, we should merge coverage results from all harnesses to produce a combined report. I've discussed it with @jaisnan and I think we're going to go with the following approach:

  1. New section for coverage results for each individual harness.
  2. Final section that combines the previous coverage results into one if we're running more than one harness.

Let me add that to the RFC.

@adpaco-aws
Copy link
Contributor Author

Update: I've tried to address most comments in the latest changes, but tomorrow I'll post individual replies to each of them.

Thank you all for the feedback!

@adpaco-aws
Copy link
Contributor Author

Do you have any feedback on the UX side?

One question is: how does this work with multiple harnesses? Ideally, one would want a single coverage report where a line is considered covered if it is reachable from one or more harnesses. That way, one can focus on writing new harnesses that cover lines that are currently uncovered by any of the existing harnesses.

This is now considered in the future possibilities section. Some coverage frameworks have a dedicated tool to combine coverage results from individual harnesses and I'd like Kani to do the same. However, I'm not sure how to distribute and in any case it's a nice-to-have which is why I'm adding it to that section (actually, @jaisnan has a prototype if I'm not mistaken).

@adpaco-aws
Copy link
Contributor Author

Another question: how would we report coverage for functions in user code that are not reachable from a harness? As a user, I would expect the lines in those functions to be reported as uncovered, but the pruning done by the MIR linker may remove those functions altogether, and thus no checks would be generated for them at all.

We don't provide coverage results for unreachable code. In other words, we only report coverage for lines which contain at least one coverage check. The rest are essentially considered not covered.

@adpaco-aws
Copy link
Contributor Author

How do we report if a line is partially covered? That is, there are multiple instructions in the same line and only some of them are reachable.

This is now addressed in the design section (postprocessing of coverage checks). We'll report three different types of coverage: FULL , PARTIAL and NONE.

@adpaco-aws adpaco-aws changed the title RFC: Viewerless coverage RFC: Line Coverage Jul 25, 2023
Copy link
Contributor

@celinval celinval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! Thanks for all the changes. My main request is to make it clear that we still need to define a final UX as part of this RFC. Thanks!

rfc/src/rfcs/0008-line-coverage.md Show resolved Hide resolved
rfc/src/rfcs/0008-line-coverage.md Show resolved Hide resolved
rfc/src/rfcs/0008-line-coverage.md Show resolved Hide resolved
rfc/src/rfcs/0008-line-coverage.md Outdated Show resolved Hide resolved
rfc/src/rfcs/0008-line-coverage.md Show resolved Hide resolved
rfc/src/rfcs/0008-line-coverage.md Show resolved Hide resolved
rfc/src/rfcs/0008-line-coverage.md Show resolved Hide resolved
rfc/src/rfcs/0008-line-coverage.md Show resolved Hide resolved
rfc/src/rfcs/0008-line-coverage.md Show resolved Hide resolved
Copy link
Contributor

@qinheping qinheping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we report if a line is partially covered? That is, there are multiple instructions in the same line and only some of them are reachable.

This is now addressed in the design section (postprocessing of coverage checks). We'll report three different types of coverage: FULL , PARTIAL and NONE.

LGTM. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-RFC Label RFC PRs and Issues
Projects
No open projects
Status: No status
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants