Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Out of bounds read in AVX512VBMI version of fdr_exec_fat_teddy #322

Open
teqwve opened this issue Dec 19, 2024 · 2 comments
Open

Out of bounds read in AVX512VBMI version of fdr_exec_fat_teddy #322

teqwve opened this issue Dec 19, 2024 · 2 comments
Labels
bug Something isn't working
Milestone

Comments

@teqwve
Copy link

teqwve commented Dec 19, 2024

Hi! It looks like that there is an out of bounds read in AVX512VBMI version of fdr_exec_fat_teddy. It can be triggered with pattern "bat|cat|mat|rat|fat|sat|pat|hat|vat" and corpus "VAt hat pat sat fat rat mat ca". If corpus is located just before end of mapped region it causes a segfault (this is how it was noticed). I tested it using current develop and vectorscan/5.4.11 branches on AMD EPYC CPU.

It probably happens somewhere here. With this input loopBytes is 30 and loadu256 is executed on ptr. It reads 32 bytes, but ptr points to a memory when only 31 bytes are valid (30 bytes + 1 null byte). Since ptr doesn't have to be aligned (it's aligned in non-avx512vbmi version, but in this version it's not) it can cross page boundary and read a 1 byte after mapped memory region:

    if (likely(ptr + loopBytes <= buf_end)) {
        u64a k0 = FAT_TEDDY_VBMI_CONF_MASK_HEAD;
        m512 p_mask0 = set_mask_m512(~((k0 << 32) | k0));
        m512 r_0 = prep_conf_fat_teddy_512vbmi_templ<NMSK>(&lo_mask, dup_mask, sl_msk, set2x256(loadu256(ptr)));

It can be detected with ASAN:

TEST(OutOfBoundRead, data) {
    const char* pattern = "bat|cat|mat|rat|fat|sat|pat|hat|vat";
    const char* corpus = "VAt hat pat sat fat rat mat ca";

    hs_error_t err;
    hs_scratch_t *scratch = nullptr;
    hs_database_t *db = buildDBAndScratch(pattern, HS_FLAG_CASELESS, 0, HS_MODE_BLOCK, &scratch);

    err = hs_scan(db, corpus, strlen(corpus), 0, scratch, countHandler, NULL);
    ASSERT_EQ(HS_SUCCESS, err) << "hs_scan didn't return HS_SCAN_TERMINATED";

    err = hs_free_scratch(scratch);
    ASSERT_EQ(HS_SUCCESS, err);
    hs_free_database(db);
}
$ cmake -DSANITIZE=address -DFAT_RUNTIME=no -DBUILD_AVX512VBMI=yes ..
$ make unit-hyperscan
$ ./bin/unit-hyperscan --gtest_filter="OutOfBoundRead.data"
Note: Google Test filter = OutOfBoundRead.data
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from OutOfBoundRead
[ RUN      ] OutOfBoundRead.data
=================================================================
==11039==ERROR: AddressSanitizer: unknown-crash on address 0x000002481ea0 at pc 0x0000018423ed bp 0x7ffc75ba2530 sp 0x7ffc75ba2528
READ of size 32 at 0x000002481ea0 thread T0
    #0 0x18423ec in _mm256_loadu_si256(long long __vector(4) const*) /usr/local/lib/gcc/x86_64-linux-gnu/12.4.0/include/avxintrin.h:929
    #1 0x18423ec in loadu256 /home/teqwve/vectorscan/src/util/arch/x86/simd_utils.h:651
    #2 0x18423ec in int fdr_exec_fat_teddy_512vbmi_templ<3>(FDR const*, FDR_Runtime_Args const*, unsigned long long) /home/teqwve/vectorscan/src/fdr/teddy_fat.cpp:286
    #3 0x1677d15 in fdrExec /home/teqwve/vectorscan/src/fdr/fdr.c:823
    #4 0x9c986f in pureLiteralBlockExec /home/teqwve/vectorscan/src/runtime.c:218
    #5 0x9c986f in hs_scan /home/teqwve/vectorscan/src/runtime.c:422
    #6 0x76db43 in TestBody /home/teqwve/vectorscan/unit/hyperscan/behaviour.cpp:1660
    #7 0x5dba63 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3562
    #8 0x5dba63 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3598
    #9 0x5baf4d in testing::Test::Run() /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3634
    #10 0x5bbebf in testing::Test::Run() /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3626
    #11 0x5bbebf in testing::TestInfo::Run() /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3810
    #12 0x5bc59c in testing::TestInfo::Run() /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3785
    #13 0x5bc59c in testing::TestCase::Run() /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3928
    #14 0x5caa39 in testing::TestCase::Run() /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3914
    #15 0x5caa39 in testing::internal::UnitTestImpl::RunAllTests() /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:5799
    #16 0x5b9aba in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3562
    #17 0x5b9aba in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:3598
    #18 0x5b9aba in testing::UnitTest::Run() /home/teqwve/vectorscan/unit/gtest/gtest-all.cc:5410
    #19 0x55f052 in RUN_ALL_TESTS() /home/teqwve/vectorscan/unit/gtest/gtest.h:20058
    #20 0x55f052 in main /home/teqwve/vectorscan/unit/hyperscan/main.cpp:35
    #21 0x789f75d15249  (/lib/x86_64-linux-gnu/libc.so.6+0x27249)
    #22 0x789f75d15304 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x27304)
    #23 0x56ef30 in _start (/home/teqwve/vectorscan/build/bin/unit-hyperscan+0x56ef30)

0x000002481ebf is located 0 bytes to the right of global variable '*.LC304' defined in '/home/teqwve/vectorscan/unit/hyperscan/behaviour.cpp' (0x2481ea0) of size 31
  '*.LC304' is ascii string 'VAt hat pat sat fat rat mat ca'
SUMMARY: AddressSanitizer: unknown-crash /usr/local/lib/gcc/x86_64-linux-gnu/12.4.0/include/avxintrin.h:929 in _mm256_loadu_si256(long long __vector(4) const*)
Shadow bytes around the buggy address:
  0x000080488380: 00 00 00 00 00 07 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x000080488390: 00 00 00 00 00 07 f9 f9 f9 f9 f9 f9 07 f9 f9 f9
  0x0000804883a0: f9 f9 f9 f9 00 04 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x0000804883b0: 00 00 07 f9 f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9
  0x0000804883c0: 02 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 04 f9 f9 f9
=>0x0000804883d0: f9 f9 f9 f9[00]00 00 07 f9 f9 f9 f9 00 00 00 00
  0x0000804883e0: 00 01 f9 f9 f9 f9 f9 f9 00 00 00 06 f9 f9 f9 f9
  0x0000804883f0: 00 00 00 04 f9 f9 f9 f9 00 00 07 f9 f9 f9 f9 f9
  0x000080488400: 00 07 f9 f9 f9 f9 f9 f9 00 03 f9 f9 f9 f9 f9 f9
  0x000080488410: 00 00 00 03 f9 f9 f9 f9 00 00 07 f9 f9 f9 f9 f9
  0x000080488420: 00 06 f9 f9 f9 f9 f9 f9 00 06 f9 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==11039==ABORTING

To reliably have corpus located just before end of mapped memory region I have to use mmap (with malloc it would require many trials with some random allocations in between):

TEST(OutOfBoundRead, mmap) {
    const char* pattern = "bat|cat|mat|rat|fat|sat|pat|hat|vat";
    const char* corpus = "VAt hat pat sat fat rat mat ca";

    // Use mmap to reliably get corpus at the and of mapped memory region
    size_t buffer_len = (128<<20);
    char* buffer = (char*) mmap(NULL, buffer_len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    char* mmaped_corpus = strcpy(buffer + buffer_len - strlen(corpus) - 1, corpus);

    hs_error_t err;
    hs_scratch_t *scratch = nullptr;
    hs_database_t *db = buildDBAndScratch(pattern, HS_FLAG_CASELESS, 0, HS_MODE_BLOCK, &scratch);

    err = hs_scan(db, mmaped_corpus, strlen(mmaped_corpus), 0, scratch, countHandler, NULL);
    ASSERT_EQ(HS_SUCCESS, err) << "hs_scan didn't return HS_SCAN_TERMINATED";

    err = hs_free_scratch(scratch);
    ASSERT_EQ(HS_SUCCESS, err);
    hs_free_database(db);
    munmap(buffer, buffer_len);
}
$ cmake -DBUILD_AVX512VBMI=yes ..
$ make unit-hyperscan
$ ./bin/unit-hyperscan --gtest_filter="OutOfBoundRead.mmap"
Note: Google Test filter = OutOfBoundRead.mmap
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from OutOfBoundRead
[ RUN      ] OutOfBoundRead.mmap
Segmentation fault (core dumped)
$ gdb --args ./bin/unit-hyperscan --gtest_filter="OutOfBoundRead.mmap"
(...)
Note: Google Test filter = OutOfBoundRead.mmap
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from OutOfBoundRead
[ RUN      ] OutOfBoundRead.mmap

Program received signal SIGSEGV, Segmentation fault.
fdr_exec_fat_teddy_512vbmi_templ<3> (fdr=0x25d0680, a=0x7ffefd31cf10, control=1) at /home/teqwve/vectorscan/src/fdr/teddy_fat.cpp:286
286	        m512 r_0 = prep_conf_fat_teddy_512vbmi_templ<NMSK>(&lo_mask, dup_mask, sl_msk, set2x256(loadu256(ptr)));
(gdb) bt
#0  fdr_exec_fat_teddy_512vbmi_templ<3> (fdr=0x25d0680, a=0x7ffefd31cf10, control=1) at /home/teqwve/vectorscan/src/fdr/teddy_fat.cpp:286
#1  0x0000000000c73373 in fdrExec (fdr=<optimized out>, buf=<optimized out>, len=<optimized out>, start=<optimized out>, cb=<optimized out>, scratch=<optimized out>, groups=1)
    at /home/teqwve/vectorscan/src/fdr/fdr.c:823
#2  0x00000000008d90f3 in pureLiteralBlockExec (scratch=0x25cf6c0, rose=0x25d0340) at /home/teqwve/vectorscan/src/runtime.c:218
#3  hs_scan (db=<optimized out>, data=<optimized out>, length=30, flags=<optimized out>, scratch=0x25cf6c0, onEvent=<optimized out>, userCtx=<optimized out>)
    at /home/teqwve/vectorscan/src/runtime.c:422
@markos markos added the bug Something isn't working label Dec 19, 2024
@markos markos added this to the 5.4.12 milestone Dec 19, 2024
@markos
Copy link

markos commented Dec 19, 2024

hi @teqwve and thank you for your bug report. VBMI code paths have not been tested as much due to lack of hardware, so they were not part of the CI pipeline. This will change from next month though and we will be able to test VBMI properly.
Until this is fixed, for now I would suggest you disable AVX512VBMI during CMake. Apologies and thank you for the understanding.

dowgird pushed a commit to dowgird/vectorscan that referenced this issue Dec 23, 2024
…ectorCamp#322)

  * Replaced the 32 byte read with a properly truncated mapped read
  * Added a unit test
@dowgird
Copy link

dowgird commented Dec 23, 2024

I have created a PR with a fix for this issue: #323

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants