Completely broken, will not compile on Clear Linux or other distros I've tried #177

Open
GabeAl opened this issue Oct 10, 2023 · 22 comments

Comments

@GabeAl

GabeAl commented Oct 10, 2023

This code is messy. Very, unbelievably messy.

I don't understand why GCC references it in its documentation, when GCC itself compiles brilliantly across dozens of very different environments I've tried it on, and this doesn't compile anywhere for me...

GCC mentions the need to use one little utility function:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fauto-profile
("Then use the create_gcov tool to convert the raw profile data to a format that can be used by GCC.")

I don't know why they ever merged support of this less-than-research grade codebase, and why it persists into GCC13.

Why did you merge this into GCC in a partial Frankensteinian way? Why not merge the ability to support the perf output directly? Nobody I've ever talked to who's heard of AutoFDO has gotten this hot mess to compile.

GCC 13.1. No LLVM. I am not using the LLVM compiler, and am only interested in using auto-profile with GCC. I just need that one utility, "create_gcov", and nothing else in this 1.7GB code morass that gets pulled in from git clone. (Why so huge and messy?).

garbage.log

Here's the log showing the extent of the failure. It's not pretty.

@erozenfeld
Contributor

@rlavaee @shenhanc78 The sync to the internal version last week broke the create_gcov build. It also overrode the changes that were made in this repo, e.g., #172, #156 and probably others. Can this be fixed? Why wasn't a proper merge done?

@erozenfeld
Contributor

@gabai If you check out the commit before the last one (e.g., 1f9133a) you will be able to build create_gcov.

@GabeAl
Author

GabeAl commented Oct 11, 2023

No gravy:

git reset --hard 1f9133a
rm -r build; mkdir build && cd build
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=. ../  
-- The C compiler identification is GNU 13.2.1
-- The CXX compiler identification is GNU 13.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX17
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX17 - Success
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
CMake Error at CMakeLists.txt:9 (add_subdirectory):
  The source directory

    /home/gabe/build/autofdo/third_party/glog

  does not contain a CMakeLists.txt file.
Call Stack (most recent call first):
  CMakeLists.txt:596 (config_without_llvm)


-- Configuring incomplete, errors occurred!

@GabeAl
Author

GabeAl commented Oct 11, 2023

(Yes I also tried with git checkout --recurse-submodules 1f9133a) -- no dice.

@erozenfeld
Contributor

Ok, please use the previous commit (-2 from HEAD). That one definitely builds.

@GabeAl
Author

GabeAl commented Oct 11, 2023

Welp, same issue so I figured out it was a problem with glog being removed and git not wanting to recurse if it was removed since HEAD.

Fixed with git submodule init explicitly at the old branch followed by git submodule update --recursive
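
The fix described above can be sketched as a small shell helper. This is only a sketch under assumptions: the function name `checkout_with_submodules` is hypothetical (it is not part of autofdo), and it is meant to be run from the top of an existing clone.

```shell
# Hypothetical helper (not part of autofdo): check out an older commit and
# bring the submodules to the revisions recorded at that commit.
# "git submodule init" re-registers the submodules listed in .gitmodules at
# that commit, which covers a submodule that was removed in a later commit.
checkout_with_submodules() {
    commit="$1"
    git checkout "$commit" &&
        git submodule init &&
        git submodule update --recursive
}

# Usage, from the top of the clone (replace <commit> with a known-good revision):
#   checkout_with_submodules <commit>
```

The three steps are chained with `&&` so that a failed checkout does not go on to update submodules against the wrong revision.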

Have you considered submitting just the converter to gcc themselves to integrate into gcc to "natively" read the prof file? Would eliminate a lot of this extraneous maintenance and compilation and dependencies etc.

@GabeAl
Author

GabeAl commented Oct 11, 2023

Not sure if due to failed build, but that one is crashing with segfault on a simple profile from a simple program:

~/build/autofdo/build/create_gcov --binary myProg --profile perf.data --gcov profile.afdo
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1320] Skipping 1676 bytes of metadata: HEADER_CPU_TOPOLOGY
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event PERF_RECORD_ID_INDEX
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event PERF_RECORD_CPU_MAP
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_17
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_18
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_17
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_18
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_17
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_18
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_17
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_18
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_17
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_18
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_17
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_18
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_17
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_18
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_17
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_18
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1067] Skipping unsupported event UNKNOWN_EVENT_82
[INFO:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1058] Number of events stored: 89489
[INFO:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_parser.cc:274] Parser processed: 47 MMAP/MMAP2 events, 2 COMM events, 0 FORK events, 1 EXIT events, 89408 SAMPLE events, 89408 of these were mapped, 0 SAMPLE events with a data address, 0 of these were mapped
WARNING: Logging before InitGoogleLogging() is written to STDERR
F20231010 20:54:03.864609 193833 dwarf2reader.cc:835] Unhandled form type
*** Check failure stack trace: ***
Aborted (core dumped)

Should I open a new ticket or is this just a build failure?

@erozenfeld
Contributor

@GabeAl If you can share myProg and perf.data, I can debug the failure you are seeing.

@erozenfeld
Contributor

> Have you considered submitting just the converter to gcc themselves to integrate into gcc to "natively" read the prof file? Would eliminate a lot of this extraneous maintenance and compilation and dependencies etc.

Yes, separating the stuff that's needed for create_gcov and moving it to gcc repo would be good and it has been suggested in the past. It's a matter of finding a contributor who has the time to do the work for that.

@GabeAl
Author

GabeAl commented Oct 11, 2023

Sure I can share -- but first, anything amiss with what I'm using to invoke things?

AMD Zen 4 processor (so no Intel-locked "br_inst_retired:near_taken" gobbledygook). I used a bunch of the standard options instead.

# compile a program to test
CFLAGS="-g -O3 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -Wformat -Wformat-security -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z,now -Wl,-z,relro -fno-semantic-interposition -ffat-lto-objects -fno-trapping-math -Wl,-sort-common -Wl,--enable-new-dtags -mrelax-cmpxchg-loop -fopenmp -funroll-loops -march=znver4 -flto" ./configure && make -j 24

# profile the program (params stored in env variable, they provide inputs and outputs)
rm perf.data; perf record -e branches,branch-misses,cycles,instructions -b -o perf.data -- src/prog ${params}

create_gcov --binary=src/prog --profile=perf.data \
    --gcov=profile.afdo
[WARNING:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_serializer.cc:604] Ignoring branch stack entry reserved bits: 32
# … repeat the above a million times or so... can we shut this off? Doesn’t make sense to report the same warning every time it loops.
[INFO:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_reader.cc:1058] Number of events stored: 346316
[INFO:/home/gabe/build/autofdo/third_party/perf_data_converter/src/quipper/perf_parser.cc:274] Parser processed: 7 MMAP/MMAP2 events, 2 COMM events, 0 FORK events, 1 EXIT events, 345116 SAMPLE events, 342373 of these were mapped, 0 SAMPLE events with a data address, 0 of these were mapped
WARNING: Logging before InitGoogleLogging() is written to STDERR
F20231011 17:29:01.407420 65767 dwarf2reader.cc:835] Unhandled form type
*** Check failure stack trace: ***
Aborted (core dumped)

If everything looks good, I have a large ~200MB perf.data to upload.
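
Whether create_gcov has an option that silences this particular warning is not established in this thread. As a workaround sketch, the repeated lines can be collapsed after the fact by filtering the combined output. The function name `dedupe_log` is hypothetical, and the only assumption is a POSIX `awk`.

```shell
# Hypothetical filter: print each distinct line once, in first-seen order.
# awk keeps a per-line counter; a line is printed only when its counter is
# still zero, i.e. the first time it appears.
dedupe_log() {
    awk '!seen[$0]++'
}

# Usage (same arguments as the invocation above; 2>&1 routes the warnings,
# which go to stderr, through the filter):
#   create_gcov --binary=src/prog --profile=perf.data --gcov=profile.afdo 2>&1 | dedupe_log
```

Note that the exit status of such a pipeline is that of the filter, not of create_gcov, unless the shell is bash with `set -o pipefail`.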

> It's a matter of finding a contributor who has the time to do the work for that.

I see -- it would probably save time in the long term with all the debugging and code cleaning and separate tree maintenance, so perhaps the argument could be made re: amortized time savings.

@erozenfeld
Contributor

Your steps look fine. You are probably running into a bug in DWARF5 support. Try using -gdwarf-4 instead of -g. Even if that works, I'd like to take a look at the repro.
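
One way to confirm which DWARF version actually ended up in the binary is to read the unit headers that `readelf --debug-dump=info` prints. The sketch below assumes binutils `readelf` and its `Version:` header lines; the function name `dwarf_versions` is hypothetical.

```shell
# Hypothetical helper: read `readelf --debug-dump=info` output on stdin and
# print the distinct DWARF versions found in the compilation unit headers.
dwarf_versions() {
    awk '$1 == "Version:" { print $2 }' | sort -un
}

# Usage:
#   readelf --debug-dump=info src/prog | dwarf_versions
```

If the output still lists 5 after rebuilding with -gdwarf-4, some objects were probably not recompiled with the new flag.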

@erozenfeld
Contributor

One more thing: please add -gcov_version=2 to your create_gcov invocation.

@algr

algr commented Oct 12, 2023

I'm building from 61b25e4 (HEAD -2 at the time of writing) but am running into compile failures. E.g. quipper/huge_page_deducer.cc is missing an include for <unordered_map>. Are there compatibility issues with the third-party libraries? Will reverting to a given autofdo commit (like 61b25e4) also pick up matching commits of the third-party libraries?

I wanted to revert to a stable release or a "last known good" release, but it looks like the latest release (and tag) was 0.19 in 2019.

@shenhanc78
Collaborator

shenhanc78 commented Oct 12, 2023

Hi, sorry for breaking create_gcov.

To prevent such incidents, we are thinking of forking off create_gcov or making it into another create_llvm_prof branch, because create_gcov is only in maintenance mode and does not depend on LLVM, whereas the main create_llvm_prof part is actively developed inside Google and will be synced here from time to time.

I'll talk to @rlavaee who is syncing some internal code here.

I'll also look into @algr's build issue during the weekend.

@GabeAl
Author

GabeAl commented Oct 13, 2023

-gdwarf-4 and --gcov_version=2 still crash.
F20231012 20:29:38.413218 88587 dwarf2reader.cc:835] Unhandled form type
*** Check failure stack trace: ***
Aborted (core dumped)

prog_and_perfdata.zip

@GabeAl
Author

GabeAl commented Oct 13, 2023

BTW why all the LLVM business? GCC is generally a more performant compiler for most vectorized HPC code with intrinsics etc (as long as you avoid corner cases or coding specifically to LLVM's idiosyncrasies). Seems like some redundant effort in the open source compiler community.

@rlavaee
Collaborator

rlavaee commented Oct 13, 2023 via email

@rlavaee
Collaborator

rlavaee commented Oct 13, 2023 via email

@GabeAl
Author

GabeAl commented Oct 13, 2023

Thanks @rlavaee -- would this change be expected to fix compilation, or to fix the bug I posted previously in the thread (repro requested by @erozenfeld )?

Happy to give it a try but would like to align expectations first.

@rlavaee
Collaborator

rlavaee commented Oct 13, 2023 via email

@erozenfeld
Contributor

@rlavaee Thank you for fixing the break!

@erozenfeld
Contributor

@GabeAl I reproduced your failure and, as I suspected, you ran into a missing DWARF5 feature: DW_FORM_line_strp. Luckily there is a PR that fixes it: #166. I asked the maintainers to review and merge it. I verified that after applying that PR create_gcov succeeds on your repro.

BTW, double check your steps when you compile prog with -gdwarf-4. If you really get dwarf2reader.cc:835] Unhandled form type with that, please send me your prog compiled with -gdwarf-4
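
To double check whether a given build still uses the form in question, the abbreviation table can be searched for it. This is a sketch that assumes binutils `readelf`, whose `--debug-dump=abbrev` output names the form of each attribute; the function name `has_line_strp` is hypothetical.

```shell
# Hypothetical check: read `readelf --debug-dump=abbrev` output on stdin and
# succeed (exit status 0) if any attribute is encoded with DW_FORM_line_strp.
has_line_strp() {
    grep -q 'DW_FORM_line_strp'
}

# Usage:
#   readelf --debug-dump=abbrev prog | has_line_strp \
#       && echo "prog uses DW_FORM_line_strp" \
#       || echo "prog does not use DW_FORM_line_strp"
```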
