model-checking · carolynzech · Sep 3, 2024 · Aug 23, 2024 · Aug 27, 2024 · Aug 27, 2024
@@ -16,3 +16,4 @@
 - [0008-line-coverage](rfcs/0008-line-coverage.md)
 - [0009-function-contracts](rfcs/0009-function-contracts.md)
 - [0010-quantifiers](rfcs/0010-quantifiers.md)
+- [0011-list](rfcs/0011-list.md)
diff --git a/rfc/src/rfcs/0011-list.md b/rfc/src/rfcs/0011-list.md
@@ -0,0 +1,157 @@
+- **Feature Name:** List
+- **Feature Request Issue:** [#2573](https://github.com/model-checking/kani/issues/2573), [#1612](https://github.com/model-checking/kani/issues/1612)
+- **RFC PR:** *Link to original PR*
+- **Status:** Under Review
+- **Version:** 0
+
+-------------------
+
+## Summary
+
+Add a subcommand `kani list` that lists the locations of Kani attributes in a project.
+
+## User Impact
+
+Currently, there is no automated way for a user to gather metadata about Kani's integration with their project. If, for example, a user wants a list of contracts for their project, they must search for all the relevant contract attributes (currently `#[requires]`, `#[ensures]`, or `#[modifies]`) themselves. If done manually, this process is tedious, especially for large projects. Even with a short shell script, it is error-prone--if, for example, we introduce a new type of contract attribute, users would have to account for it when searching their project.
+
+Internally, this feature will be useful for tracking our customers' use of Kani and our progress with standard library verification. Externally, users can leverage this feature to get a high-level view of which areas of their projects have Kani attributes (and, by extension, which areas are still in need of verification). This feature can act as a complement to our existing line-by-line coverage feature--one can imagine, for example, using this feature to *informally* identify which areas seem to have few to no harnesses, then using the coverage feature to *formally* identify which areas remain unverified.
+
+This feature will not cause any regressions for exisiting users.
+
+## User Experience
+
+Users run a `kani list` subcommand, which prints metadata about the harnesses, contracts, and stubs in each module.
+For `cargo kani`, this will be every module in the package; for `kani`, it is every module in the crate.
+This subcommand will not run verification; it will exit after printing the information.
+
+There will be two options:
+- `--filter=[harnesses|contracts]`: Only output the information specified. In the example above, `--filter harnesses` would output the `mod verify` entry and `--filter contracts` would output the `mod implementation` entry. In the case where harnesses and contracts are in the same module, Kani will output only the functions that are harnesses or have contracts, depending on the filter.
+- `--message-format=[human|json]`: Specify the output format. The default is `human`, which prints to the terminal. The `json` option creates and writes to a JSON file instead.
+
+This subcommand will not fail. In the case that it does not find any Kani attributes, it will print a message informing the user of that fact.
+
+### Output Format
+
+The default format, `human`, will print the package's modules as a tree. The leaves of the tree are functions with Kani attributes. If a module has no Kani attributes, Kani doesn't print it. After printing the tree, Kani will print the total count of harnesses, stubs, and contracts.
+For example:
+
+```
+crate example
+├── mod implementation
+│ └── fn foo 
+| - requires-contracts:
+| - ub_checks::can_dereference(x)
+| - ub_checks::can_write(x)
+| - ensures-contracts: 
+| - |result| *result == *x + 1
+| - modifies-contracts: 
+| - x
+| - proofs-for-contracts: 
+| - verify::check_foo_u32, 
+| - verify::check_foo_u64
+├── mod verify
+| └── fn check_foo_u32
+| - kani::proof_for_contract(foo)
+| └── fn check_foo_u64
+| - kani::proof_for_contract(foo)
+│ └── fn check_func
+| - kani::proof
+| - kani::should_panic
+| - kani::stub(rand::random, mock_random)
+| └── fn check_bar
+| - kani::proof
+| - kani::unwind(5)
+
+Totals:
+ - Harnesses: 4
+ - Requires Contracts: 2
+ - Ensures Contracts: 1
+ - Modifies Contracts: 1
+ - Stubs: 1
+```
+
+As the name implies, the goal of the `human` output is to be friendly for human readers. If the user wants an output format that's more easily parsed by a script, they can opt for the `json` option, which will output the same information in a more verbose, machine-readable format. By "more verbose" we mean that it will parse the attributes, so that instead of outputting `kani::stub(rand::random, mock_random)`, it would output:
+
+```json
+stub : {
+ original: rand::random,
+ target: mock_random
+}
+```
+and so on for other attributes.
+
+## Software Design
+
+We will add a new subcommand to `kani-driver` which follows the normal verification workflow of invoking `kani-compiler` and constructing a `Project` from the result. Then, instead of verifying the project, Kani will iterate over the `KaniMetadata` in the `Project` to construct the output.
+
+### Contracts
+
+The `KaniMetadata` struct has all of the information that `list` requires, except for contracts information--it has the CBMC representation of the contracts, while this feature requires the unexpanded, raw text.
+
+To store the contracts information, we will change the contracts code generation library (in `kani_macros::sysroot::contracts`). Contracts are defined as attribute macros, e.g.:
+```rust
+#[proc_macro_attribute]
+pub fn requires(attr: TokenStream, item: TokenStream) -> TokenStream {
+ attr_impl::requires(attr, item)
+}
+```
+
+The `item` `TokenStream` gives us the function declaration to which the attribute is applied, e.g. `fn foo() {}`, but that is insufficient to uniquely identify the function or construct the module tree. Instead, we will change the macro expansion handler (`ContractConditionsHandler`) to insert the contracts text as a separate attribute, e.g.:
+
+```rust
+// these are wrappers we already generate for contracts
+#[kanitool::recursion_check = #recursion_name]
+#[kanitool::checked_with = #check_name]
+...
+#[kanitool::contract_text(requires(x == 1))] // this is new
+```
+
+Then, in `kani-compiler`, we add a new branch of the code generation process for the `list` subcommand. When `codegen_crate` encounters this reachability type, it will:
+- Construct a `CodegenUnits` object.
+- Call a new `CodegenUnits` method that filters local crate instances to those with the `contract_text` attribute, then stores the following in a new `contracted_functions` field:
+ - the type of contract (`Requires`, `Ensures`, or `Modifies`), 
+ - the text of the contract, and 
+ - the fully qualified path to it.
+- Call `write_metadata` on the `CodegenUnits` object, which will construct a `KaniMetadata` object with a new field containing the contracted function information, then write it to a file.
+
+### Constructing the Tree
+
+Once we've made the above updates to `KaniMetadata`, that struct will have all of the information we need for `kani list`. 
+
+To print the tree, Kani will read the metadata file and construct an intermediate data structure that maps each module to a list of its metadata. (This list will be sorted so that the output is stable across runs). Then, we'll iterate through that structure to print the tree. If the user passes a `--filter` option, Kani will only construct the map for the relevant metadata--e.g., if the user passes `--filter harnesses`, we will only map each module to its harnesses and ignore the collected contracts.
+
+Kani will parse the module paths to print the tree, inserting a new level for each delimiter `::`. For example, `example::implementation::verify` would parse to:
+
+```
+crate example
+├── mod implementation
+│ └── mod verify
+```
+
+### Calculating Totals
+Since we already have the information about each harness, contract, etc., calculating totals is as simple as taking the length of those data structures.
+
+## Rationale and alternatives
+
+We organize the output by module so that users can compare Kani integration across their package. The output can be quite long, since Kani prints the whole module tree. As an alternative, we could support a more compact output format, e.g. one that just maps the Rust path (`example::verify::foo`) to the metadata. This format would have the advantage of being more concise, but the disadvantage of making it harder for the user to visualize how Kani fits into their package structure.
+
+We do not print modules without Kani attributes, since for large projects, printing even the empty module name would make the output much longer. Also, it may make the tree harder to read, since the user would have to scroll around many empty modules to find the entries which actually have data. However, this design leaves it up to the user to manually identify which modules are missing verification. As an alternative, we could print the module names that do not have Kani attributes, or introduce an option so the user can toggle this feature on and off.
+
+If we do not implement this feature, users will have to obtain this metadata through manual searching, or by writing a script to do it themselves. This feature will improve our internal productivity by automating the process.
+
+## Open questions
+
+## Out of scope / Future Improvements
+
+It would be nice to differentiate between regular Kani harnesses and Bolero harnesses. Bolero harnesses invoke Kani using conditional compilation, e.g.:
+
+```rust
+#[cfg_attr(kani, kani::proof)]
+fn check() {
+ bolero::check!()...
+}
+```
+
+See [this blog post](https://model-checking.github.io/kani-verifier-blog/2022/10/27/using-kani-with-the-bolero-property-testing-framework.html) for more information.
+
+There's no easy way for us to know whether a harness comes from Bolero, since Bolero takes care of rewriting the test to use Kani syntax and invoking the Kani engine. By the time the harness gets to Kani, there's no way for us to tell it apart from a regular harness. Fixing this would require some changes to our Bolero integration.