From 359fe20d6317f9e6238272cc27688052360e77ca Mon Sep 17 00:00:00 2001 From: Zyad Hassan <88045115+zhassan-aws@users.noreply.github.com> Date: Tue, 20 Aug 2024 11:49:48 -0700 Subject: [PATCH] Add challenge on memory safety of String (#55) By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses. --------- Co-authored-by: Michael Tautschnig --- doc/src/SUMMARY.md | 3 +- doc/src/challenges/0010-string.md | 75 +++++++++++++++++++++++++++++++ 2 files changed, 77 insertions(+), 1 deletion(-) create mode 100644 doc/src/challenges/0010-string.md diff --git a/doc/src/SUMMARY.md b/doc/src/SUMMARY.md index bb064448efe87..d5885d4ec7bb7 100644 --- a/doc/src/SUMMARY.md +++ b/doc/src/SUMMARY.md @@ -18,4 +18,5 @@ - [Pointer Arithmetic](./challenges/0003-pointer-arithmentic.md) - [Memory safety of BTreeMap's `btree::node` module](./challenges/0004-btree-node.md) - [Inductive data type](./challenges/0005-linked-list.md) - - [Contracts for SmallSort](./challenges/0008-smallsort.md) \ No newline at end of file + - [Contracts for SmallSort](./challenges/0008-smallsort.md) + - [Memory safety of String](./challenges/0010-string.md) diff --git a/doc/src/challenges/0010-string.md b/doc/src/challenges/0010-string.md new file mode 100644 index 0000000000000..1e7020c52cf53 --- /dev/null +++ b/doc/src/challenges/0010-string.md @@ -0,0 +1,75 @@ +# Challenge X: Memory safety of String + +- **Status:** Open +- **Tracking Issue:** [Link to issue](https://github.com/model-checking/verify-rust-std/issues/61) +- **Start date:** *2024-08-19* +- **End date:** *2024-12-10* + +------------------- + +## Goal + +In this challenge, the goal is to verify the memory safety of `std::string::String`. +Even though the majority of `String` methods are safe, many of them are safe abstractions over unsafe code. + +For instance, the `insert` method is implemented as follows in v1.80.1: +```rust +pub fn insert(&mut self, idx: usize, ch: char) { + assert!(self.is_char_boundary(idx)); + let mut bits = [0; 4]; + let bits = ch.encode_utf8(&mut bits).as_bytes(); + + unsafe { + self.insert_bytes(idx, bits); + } +} +``` +where `insert_bytes` has the following implementation: +```rust +unsafe fn insert_bytes(&mut self, idx: usize, bytes: &[u8]) { + let len = self.len(); + let amt = bytes.len(); + self.vec.reserve(amt); + + unsafe { + ptr::copy(self.vec.as_ptr().add(idx), self.vec.as_mut_ptr().add(idx + amt), len - idx); + ptr::copy_nonoverlapping(bytes.as_ptr(), self.vec.as_mut_ptr().add(idx), amt); + self.vec.set_len(len + amt); + } +} +``` +The call to the unsafe `insert_bytes` method (which itself contains unsafe code) makes `insert` susceptible to undefined behavior. + +### Success Criteria + +Verify the memory safety of all public functions that are safe abstractions over unsafe code: + +1. `from_utf16le` (unbounded) +1. `from_utf16le_lossy`(unbounded) +1. `from_utf16be` (unbounded) +1. `from_utf16be_lossy` (unbounded) +1. `pop` +1. `remove` +1. `remove_matches` (unbounded) +1. `retain` (unbounded) +1. `insert` +1. `insert_str` (unbounded) +1. `split_off` (unbounded) +1. `drain` +1. `replace_range` (unbounded) +1. `into_boxed_str` +1. `leak` + +Ones marked as unbounded must be verified for any string/slice length. + +### List of UBs + +All proofs must automatically ensure the absence of the following [undefined behaviors](https://github.com/rust-lang/reference/blob/142b2ed77d33f37a9973772bd95e6144ed9dce43/src/behavior-considered-undefined.md): + +* Accessing (loading from or storing to) a place that is dangling or based on a misaligned pointer. +* Reading from uninitialized memory. +* Mutating immutable bytes. +* Producing an invalid value + +Note: All solutions to verification challenges need to satisfy the criteria established in the [challenge book](../general-rules.md) +in addition to the ones listed above.