Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::ssize_t audit #2972

Open
4 tasks
Lestropie opened this issue Aug 20, 2024 · 9 comments
Open
4 tasks

std::ssize_t audit #2972

Lestropie opened this issue Aug 20, 2024 · 9 comments

Comments

@Lestropie
Copy link
Member

Problem identified by @daljit46.

In various parts of the C++ code base, we have erroneously used std::ssize_t as "a signed integer of the same bit width as std::size_t, when in fact by the letter of the specification it is only guaranteed to support the value -1.

  • Identify all uses of std::ssize_t where negative values smaller than -1 need to be supported.
  • Determine an appropriate substitute type to use (typedef one if necessary)
  • Do a bulk substitution
  • (optional) Identify uses of std::size_t that involve casting between signed and unsigned integer types to silence compiler warnings, and reconsider types used.
    (Eg. Pretty sure on at least one occasion I removed a non-functional assertion that a size_t be non-negative; may be other areas where catching negative values as errors may be better than using usigned)
@daljit46
Copy link
Member

daljit46 commented Aug 20, 2024

Determine an appropriate substitute type to use (typedef one if necessary)

Eigen uses std::ptrdiff_t for indices so I guess we could use that or perhaps int64_t for non-indices.

@Lestropie
Copy link
Member Author

std::ptrdiff_t feels wrong from its name but from my first search for a definition is actually right:

std::ptrdiff_t is used for pointer arithmetic and array indexing, if negative values are possible.

@daljit46
Copy link
Member

In the meeting yesterday, we discussed a few possible typedefs for std::ptrdiff_t:

  • Index_t
  • Offset_t
  • Stride_t

@Lestropie
Copy link
Member Author

Sorry I missed. We've generally stuck to lowercase for typedefs internally. I'm also wondering if typedefs that are intended for more broad use tend to be *_type rather than *_t (eg. default_type). I like "index" and "stride"; is "offset" intended to be just a difference of "index"es, or more / something else?

@jdtournier
Copy link
Member

My personal preference is also for all lowercase. Leaning towards index_t, but would be happy enough with index_type. I feel offset & stride both have slightly narrower scopes than what we need here.

@daljit46
Copy link
Member

Another point I would like to add is that adding multiple typedefs for indices, offset (for file offsets) and strides with the same underlying idea is not a bad idea. There are very few downsides and it would make it easier if at some point we want to change the types.

@jdtournier
Copy link
Member

Yes, I think that was suggested the last time we discussed it, and I agree there's no real reason to stick with a single typedef. But on the other hand, there may be a few edge cases where it's not clear which one is most appropriate (though it admittedly wouldn't make any difference...). Furthermore, we are very unlikely to ever define these differently, given how often we'll be mixing them up. For example, computing the offset to the current voxel data in an image involves multiplying the strides by the indices and summing them to produce an offset - all 3 types in one expression...

So my preference would be to stick with a single typedef, and try to find one that is sufficiently broad to encompass all 3 uses. Thought I'll freely admit I'm not sure index_type really nails it...

@daljit46
Copy link
Member

Another name that came to my mind was to use isize_t as in Rust where signed integers are named as i32,i64,etc.

@Lestropie
Copy link
Member Author

adding multiple typedefs for indices, offset (for file offsets) and strides with the same underlying idea is not a bad idea.

I'm quite happy to have multiple typedefs that resolve to the same underlying type. I've used that in some places. Often it helps to communicate the intent / appropriate usage of the data at hand.

What it doesn't catch is instances of implicit conversion between them, as there's no corresponding compiler warning for eg. loss of precision. Though perhaps once they are typedef'd we could (optionally temporarily) replace those typedefs with classes with explicit casts?

If something is an "index", I'd expect that to be a ssize_t rather than a signed integer; just an array index but with the scope for catching -1 errors. In the same vein, I suppose that for something like image strides, std::ptrdiff_t is used as it can act as a "difference of indices": idiff_t? Explicitly using std::ptrdiff_t everywhere feels strange because it's typically not pointers on which we are computing differences; this would substitute out just that component.

Other factor here is that STL uses size_t for indexing: https://eigen.tuxfamily.narkive.com/ZhXOa82S/signed-or-unsigned-indexing. There's likely different places in the code that use size_t vs. ssize_t depending on whether they are indexing into an STL container or an Eigen class.

Another name that came to my mind was to use isize_t

I'd be concerned about the subtlety of the distinction between isize_t and ssize_t: too easy to infer from the names alone that ssize_t is "signed" and therefore isize_t must be the opposite of that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants