Document when tags are preserved when copying memory #12

kevin-brodsky-arm · 2021-02-09T11:13:33Z

By and large, the current approach used by CHERI LLVM is to preserve capability tags when copying memory if it is valid for the source to contain capabilities. While it is clearly the case in some situations (e.g. during a struct assignment if the struct contains capability types), this is not obvious in general.

It would be very helpful for the guide to document:

1. What C/C++ objects may hold capabilities from a compiler point of view (considering types, alignment, etc.).
2. Consequently, when capability tags are preserved during any form of copy (explicit memcpy(), struct assignment, implicit copy constructor, etc.).
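To make the copy forms in question concrete, here is a minimal sketch that copies a struct containing a pointer in three ways. The names struct holder, copy_by_assignment(), copy_by_memcpy() and copy_bytewise() are hypothetical and exist only for this illustration. On a conventional target all three produce a usable pointer. On a CHERI target the first two would be expected to keep the tag under the approach described above, while the byte-wise loop performs only non-capability stores and so should leave the destination untagged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example type: one integer member and one pointer member. */
struct holder {
    long id;
    int *ptr;
};

/* Struct assignment: the compiler can see that the type contains a pointer. */
static void copy_by_assignment(struct holder *dst, const struct holder *src)
{
    *dst = *src;
}

/* Explicit memcpy() of the whole, suitably aligned object. */
static void copy_by_memcpy(struct holder *dst, const struct holder *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/*
 * Byte-by-byte copy. The volatile accesses are there to stop the compiler
 * from recognising the loop and replacing it with a memcpy() call, which
 * would change the tag behaviour on a CHERI target.
 */
static void copy_bytewise(struct holder *dst, const struct holder *src)
{
    volatile unsigned char *d = (volatile unsigned char *)dst;
    const volatile unsigned char *s = (const volatile unsigned char *)src;

    for (size_t i = 0; i < sizeof(*dst); i++)
        d[i] = s[i];
}
```

The third function is the kind of case the guide would need to call out: the bytes are identical after the copy, but on CHERI the copied pointer could no longer be dereferenced.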


rwatson · 2021-02-09T12:37:32Z

This sounds good; three further thoughts:

1. We should also define an API for non-capability-preserving memcpy(), possibly memcpy_nocap(), and its expected behaviour. We should explicitly identify some use cases where we intentionally don't want to preserve tags, such as copyin()/copyout()-style copies.
2. We should iterate through the standard C library APIs, and likely also POSIX APIs, and identify memcpy() synonyms/wrappers, and indicate for each whether they are expected to preserve tags. For example, we might define that strcpy() doesn't preserve tags, but that memmove() and sort() do (subject to suitable alignment/etc).
3. Where there is some ambiguity, or where the compiler may have to do different things to get useful access to optimisations, etc., we should also identify those cases. It's not clear to me what the scope of these cases is, but the impact may be clearer: the surprising stripping of tags, the surprising preservation of tags, and worse, behaviour that depends on optimisation level. I think it is fine (necessary?) for such cases to exist, but we should constrain them as much as makes sense.
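As a rough sketch of what the proposed memcpy_nocap() might look like, the following copies one byte at a time. Only the name comes from the suggestion above; the signature (mirroring memcpy()) and the implementation are assumptions, not an agreed API. It relies on the CHERI property that a non-capability store clears the tag of the destination location, so no tag can survive the copy.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical memcpy_nocap(): copies len bytes like memcpy() but never
 * preserves capability tags, because every store is a single byte.
 * The volatile qualifiers keep the compiler from merging the accesses or
 * replacing the loop with a (possibly tag-preserving) memcpy() call.
 * A real implementation would want wider non-capability loads and stores.
 */
void *memcpy_nocap(void *dst, const void *src, size_t len)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    for (size_t i = 0; i < len; i++)
        d[i] = s[i];
    return dst;
}
```

A copyin()/copyout()-style routine could call this instead of memcpy() so that capabilities never cross the boundary, whatever the alignment of the buffers.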

rwatson · 2021-02-10T01:30:00Z

A further note from the meeting earlier today: we should also document whether memory-mapping APIs produce tag-enabled mappings, whether by default or as a result of additional flags/arguments/etc. For example, we probably want tags enabled by default for MAP_ANON mappings created with mmap(2) (as is the case today), whereas System V shared memory mappings should not have tags enabled by default (but we probably want an option/flag to enable them).
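To illustrate why this matters to application code, here is a small sketch that stores a pointer in an anonymous mapping and loads it back. The function name is hypothetical. On a conventional target this always works; on a CHERI target the reloaded pointer is only usable if the mapping can hold tags, which is the default behaviour for MAP_ANON described above.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANON on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Stores target in a fresh anonymous mapping, reloads it, and dereferences
 * the reloaded pointer. Returns the int read through it, or -1 if mmap()
 * fails (so a stored value of -1 is indistinguishable from failure here).
 */
int roundtrip_pointer_through_anon_mapping(int *target)
{
    size_t len = sizeof(int *);
    int **slot = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_ANON | MAP_PRIVATE, -1, 0);

    if (slot == MAP_FAILED)
        return -1;

    *slot = target;          /* pointer (capability) store */
    int *reloaded = *slot;   /* pointer (capability) load */
    int result = *reloaded;  /* needs a valid tag on CHERI */

    munmap(slot, len);
    return result;
}
```

The same test against a mapping type that does not carry tags would be expected to fault at the dereference on CHERI, which is why the guide should state the default for each API.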

rwatson · 2021-02-10T22:45:36Z

Tagging @bsdjhb @brooksdavis @arichardson @jrtc27.

brooksdavis · 2021-02-12T17:33:32Z

Attempting to answer one part of the question: memcpy and memmove must preserve tags any time they copy sizeof(void * __capability) bytes where the source and destination are aligned. E.g., this needs to work:

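#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint64_t */
#include <string.h>   /* memcpy */

struct s {
    uint64_t a;
    uint64_t b;
    void * __capability c;
};

/* Copies b and c; the copy starts at b, before the capability-aligned c. */
void init_from_other(struct s *dst, struct s *src)
{
    memcpy(&dst->b, &src->b, sizeof(struct s) - offsetof(struct s, b));
}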

One could imagine a more restricted C implementation (e.g. with strict sub-object bounds) that didn't preserve tags with unaligned starts, but for existing systems code this probably must work.
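One way to read the rule above is as a count of capability-sized, capability-aligned slots that lie fully inside the copied range and are aligned in both source and destination. The helper below is a sketch of that reading only: the name tag_preserving_slots() is hypothetical, and CAP_SIZE is assumed to be 16 bytes (128-bit capabilities), which is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed capability size and alignment in bytes (128-bit capabilities). */
#define CAP_SIZE 16u

/*
 * Returns how many capability-sized slots in a copy of len bytes from
 * address src to address dst are aligned on both sides, i.e. how many
 * slots a tag-preserving memcpy() could copy with their tags intact.
 */
size_t tag_preserving_slots(uintptr_t dst, uintptr_t src, size_t len)
{
    /* A slot aligned in the source must also be aligned in the destination. */
    if (dst % CAP_SIZE != src % CAP_SIZE)
        return 0;

    /* Bytes before the first capability-aligned address in the range. */
    size_t lead = (size_t)((CAP_SIZE - src % CAP_SIZE) % CAP_SIZE);

    if (len < lead)
        return 0;
    return (len - lead) / CAP_SIZE;
}
```

For the struct s example, with b at offset 8 and c at offset 16 under these assumptions, the 24-byte copy starts 8 bytes past an aligned address and contains exactly one such slot, the one holding c.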

I think I've convinced myself that *sort need only preserve tags for objects aligned to sizeof(void * __capability) and which are a multiple of sizeof(void * __capability) bytes, but IIRC we preserve as with memcpy today so you can do absurd things if you really want to.
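A typical case that falls inside that condition is sorting an array of structs that carry pointers. The sketch below uses hypothetical names (struct entry, compare_entries(), sort_entries()) and plain pointers so that it builds on any target; the static assertions spell out the size and alignment assumptions under which the sort would be expected to keep the pointers usable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type: a sort key plus a pointer payload. */
struct entry {
    long key;
    const char *name;
};

/* Elements are pointer-aligned and a multiple of the pointer size. */
_Static_assert(sizeof(struct entry) % sizeof(void *) == 0,
               "element size must be a multiple of the pointer size");
_Static_assert(_Alignof(struct entry) % _Alignof(void *) == 0,
               "elements must be pointer-aligned");

/* Orders entries by ascending key without risking integer overflow. */
static int compare_entries(const void *a, const void *b)
{
    const struct entry *ea = a;
    const struct entry *eb = b;

    return (ea->key > eb->key) - (ea->key < eb->key);
}

/* Sorts in place; the name pointers must still be usable afterwards. */
void sort_entries(struct entry *entries, size_t count)
{
    qsort(entries, count, sizeof(entries[0]), compare_entries);
}
```

After sorting, each name pointer is dereferenced by callers, so on CHERI the swaps inside qsort() have to move whole tagged elements.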

sbaranga-arm · 2021-02-15T12:26:32Z

A lot of memcpy calls are emitted by the compiler (e.g. for assignments) and those would copy the entire object. For these cases it would make sense to emit a call to a memcpy variant that doesn't preserve tags on unaligned starts.

brooksdavis · 2021-02-15T16:46:56Z

@arichardson has done some work looking at compiler generated copies in the context of improved inlining (CTSRD-CHERI/llvm-project#506). We probably do want many of them to be tag-clearing, but de-facto C requires copying in all sorts of awkward places. For example:

#include <stdint.h>   /* uint64_t */

struct s {
    uint64_t a;
    uint64_t b;
    char c[16];       /* string or opaque bytes: the type cannot say which */
} __attribute__((aligned(16)));

requires a tag-preserving memcpy because we can't know what's actually being stored in c since the C language can't differentiate between a string and a bag of bytes. (We likely want an annotation to say a string is actually a string or to push for a byte type as I believe is being discussed). One could implement a C dialect that restricted tag preservation further, but the cost of adaptation would start to climb so I believe it would need to be optional.
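The "bag of bytes" case can be sketched as follows: a pointer is stashed in the char array, the struct is copied by assignment, and the pointer is recovered from the copy. The helper names stash_pointer() and recover_pointer() are hypothetical. This is the pattern that would break on CHERI if the compiler-generated copy for this struct cleared tags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the struct above; c is used here as opaque storage. */
struct s {
    uint64_t a;
    uint64_t b;
    char c[16];
} __attribute__((aligned(16)));

_Static_assert(sizeof(int *) <= 16, "pointer must fit in c");

/* Stores the pointer's object representation in the char array. */
void stash_pointer(struct s *obj, int *p)
{
    memcpy(obj->c, &p, sizeof(p));
}

/* Reads the pointer back out of the char array. */
int *recover_pointer(const struct s *obj)
{
    int *p;

    memcpy(&p, obj->c, sizeof(p));
    return p;
}
```

With `second = first;` between the two calls, the recovered pointer is only dereferenceable on CHERI if the struct assignment preserved the tag stored in c.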

arichardson · 2021-02-15T18:31:14Z

We could try to make this distinction for C++17 and later code (std::byte was introduced in C++17) by only treating std::byte as potentially tag-bearing and assuming that char is actually a string. However, I feel this could be rather risky and it's safer to assume that all of signed char/unsigned char/char/std::byte can potentially hold tags.

kevin-brodsky-arm · 2021-02-16T10:26:48Z

As long as a char* is allowed to alias any other pointer (which I presume is still the case in C++17/20), I think we should preserve the assumption that an array of char may hold capabilities, because otherwise it feels like the departure from C/C++ is too great and it could break quite a lot of software. Of course having an optional compiler flag to remove that assumption wouldn't hurt either.

rwatson assigned brooksdavis and arichardson and unassigned brooksdavis and arichardson Feb 10, 2021
