Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle 32-bit overflows in disk utilization stats #1641

Merged
merged 2 commits into from
Oct 6, 2023

Conversation

stilor
Copy link
Contributor

@stilor stilor commented Oct 3, 2023

Linux prints [1] some of the values reported in block device's stat sysfs file as 32-bit unsigned integers. fio interprets them as 64-bit integers when reading that sysfs file and performs further arithmetics on them in 64-bits. If the reported value overflows during fio run, a huge bogus value is reported in the "disk utilization" block instead.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/genhd.c#n962

diskutil.c Outdated
{
/* Linux kernel prints some of the stat fields as 32-bit integers. It is
* possible that the value overflows, but since fio uses unsigned 64-bit
* arithmetics in update_io_tick_disk(), it instead results in a huge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest changing from arithmetics to arithmetic, sorry not grammar police. :)

Copy link
Contributor

@ankit-sam ankit-sam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stilor please fix the build failure

diskutil.c Outdated
* unsigned boundary; assume overflow only happens once between
* successive polls.
*/
if (oval <= nval || oval >= (1ull << 32)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for curly braces for this if else condition

@vincentkfu
Copy link
Collaborator

Please change the link in the commit message to one that doesn't require login credentials.

@stilor
Copy link
Contributor Author

stilor commented Oct 4, 2023

@stilor please fix the build failure

This seems to be a GCC bug. I minimized the testcase to https://godbolt.org/z/nMEGzfW7Y and it still produces this error. Reported https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111696 to GCC maintainers with a minimal testcase derived from diskutil.c.

For now, disabled the warning using GCC diagnostic pragmas around the code triggering the spurious warning. Alternatively, if you prefer I could add a test for affected compilers in configure and if the compiler fails the test, add -Wno-stringop-overflow to prevent such spurious warnings in all files.

@stilor stilor force-pushed the fix-stat-overflow branch 3 times, most recently from d43fbc8 to 36c1447 Compare October 5, 2023 03:38
diskutil.c Outdated

fio_gettime(&t, NULL);
dus->s.msec += mtime_since(&du->time, &t);
memcpy(&du->time, &t, sizeof(t));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this #pragma and change the memcpy() calls in this function into assignments. There is no reason to use memcpy() here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


fio_gettime(&t, NULL);
dus->s.msec += mtime_since(&du->time, &t);
memcpy(&du->time, &t, sizeof(t));
Copy link
Contributor

@bvanassche bvanassche Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this change is unrelated to the 32-bit overflow fix, please consider moving this change into a separate patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

This is to avoid triggering a spurious warning caused by [1], which
is triggered by the next commit in chain (unrelated change in
update_io_tick_disk()).

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111696

Signed-off-by: Alexey Neyman <aneyman@google.com>
Linux prints [1] some of the values reported in block device's `stat`
sysfs file as 32-bit unsigned integers. fio interprets them as 64-bit
integers when reading that sysfs file and performs further arithmetics
on them in 64-bits. If the reported value overflows during fio run,
a huge bogus value is reported in the "disk utilization" block instead.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/genhd.c#n962

Signed-off-by: Alexey Neyman <aneyman@google.com>
@axboe axboe merged commit 16b9f29 into axboe:master Oct 6, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants