(Investigation) Multi-process support in tracy #822

Arpafaucon · 2024-07-02T18:21:40Z

This is not a bug report, but rather an attempt at documenting my work on #471.

I am still not experienced with the tracy code base, so I'd like to document fairly closely what I try and what results it gives.

- Worst case: it disturbs no one, and it forces me to document clearly where I'm at.
- Best case: more experienced people have the time to read this and can give me a few pointers to help me progress or explain what's unclear.

What happened today

Investigative work to better understand the structure. Key takeaways:
- Worker is a very central class that manages all the relevant data about a profiling run
- to save memory space, tracy does a lot of internal data "compression"
  - thread IDs are reduced to uint16_t
  - strings are interned and retrieved by a unique ID
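To make the two "compression" ideas above concrete, here is a minimal sketch. `StringInterner` and `ThreadCompressor` are hypothetical names invented for this illustration; they are not tracy's actual classes, and the real implementation may differ in layout and limits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not a tracy class): stores each distinct string once
// and hands out a small integer ID that events can carry instead.
class StringInterner {
public:
    // Returns the existing ID for `s`, or stores it and returns a new ID.
    uint32_t Intern(const std::string& s) {
        auto it = m_index.find(s);
        if (it != m_index.end()) return it->second;
        const auto id = static_cast<uint32_t>(m_strings.size());
        m_strings.push_back(s);
        m_index.emplace(s, id);
        return id;
    }

    // Retrieves the string behind an ID; throws std::out_of_range if unknown.
    const std::string& Get(uint32_t id) const { return m_strings.at(id); }

    std::size_t Size() const { return m_strings.size(); }

private:
    std::vector<std::string> m_strings;
    std::unordered_map<std::string, uint32_t> m_index;
};

// Hypothetical helper (not a tracy class): maps full-width OS thread IDs
// to compact 16-bit indices, in order of first appearance.
class ThreadCompressor {
public:
    uint16_t Compress(uint64_t tid) {
        auto it = m_index.find(tid);
        if (it != m_index.end()) return it->second;
        // A 16-bit index can only address 65536 distinct threads.
        assert(m_threads.size() <= UINT16_MAX);
        const auto id = static_cast<uint16_t>(m_threads.size());
        m_threads.push_back(tid);
        m_index.emplace(tid, id);
        return id;
    }

    // Recovers the original thread ID from its compact index.
    uint64_t Decompress(uint16_t id) const { return m_threads.at(id); }

private:
    std::vector<uint64_t> m_threads;
    std::unordered_map<uint64_t, uint16_t> m_index;
};
```

With this scheme an event only needs to carry a 16-bit thread index and a 32-bit string ID, and the full values are looked up on demand.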
Some interesting reading:
- export-csv gave me good insight into the important internal data structures of Worker
- import-chrome, on the other side, shows how to feed data into a Worker

I tried testing #766 but have had no success so far: I get the error reported in #766 (comment).

To get my hands dirty, I explored an alternative approach that only supports offline files. Let's call it tracy-merge.

- We generate multiple profiler runs separately (they talk to different tracy-capture processes).
- We run tracy-merge on the set of generated *.tracy files to produce one file that contains all the events.
- To limit complexity, I use the importing constructor of Worker. Basically, I fill one big vector with the events of all the profiling files in turn, taking care to remap thread IDs to avoid overlaps.
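The merge step can be sketched roughly as follows. This is a simplified illustration, not the actual tracy-merge code: `MergedEvent`, `Capture`, and `MergeCaptures` are hypothetical names, and the sketch assumes each input file has already been decoded into plain events rather than going through Worker.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified event record; not a tracy type.
struct MergedEvent {
    uint64_t tid;       // thread ID (remapped in the merged output)
    int64_t timestamp;  // event time, in nanoseconds
    std::string name;   // zone name
    bool isEnd;         // false = zone begin, true = zone end
};

// One input capture, assumed to be already decoded into plain events.
using Capture = std::vector<MergedEvent>;

// Concatenates all captures into one vector. Every (capture index, thread ID)
// pair gets a fresh ID, so identical thread IDs coming from different
// captures no longer collide.
std::vector<MergedEvent> MergeCaptures(const std::vector<Capture>& captures) {
    std::vector<MergedEvent> merged;
    std::map<std::pair<std::size_t, uint64_t>, uint64_t> remap;

    for (std::size_t i = 0; i < captures.size(); ++i) {
        for (const MergedEvent& ev : captures[i]) {
            const auto key = std::make_pair(i, ev.tid);
            auto it = remap.find(key);
            if (it == remap.end()) {
                // Next unused ID: the number of pairs seen so far.
                const uint64_t next = remap.size();
                it = remap.emplace(key, next).first;
            }
            MergedEvent out = ev;
            out.tid = it->second;
            merged.push_back(std::move(out));
        }
    }

    // Keep the merged stream ordered by time; stable_sort preserves the
    // relative order of events that share a timestamp.
    std::stable_sort(merged.begin(), merged.end(),
                     [](const MergedEvent& a, const MergedEvent& b) {
                         return a.timestamp < b.timestamp;
                     });
    return merged;
}
```

A remapping like this is exactly why the original thread IDs disappear from the merged output. Returning the `remap` table alongside the events would be one way to keep them recoverable.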

Current result, generated by merging two copies of the same profiling run from test/test_cpp:

- I lose the real thread IDs in the merge process.
- Some zone names are still missing (although others are present). I can't explain this at the moment.
- I only converted timeline events for now, so plots and frames are lost.


Arpafaucon · 2024-07-03T22:26:27Z

Report

I got back to the multiplexer work, trying to figure out the reason for the failure. What is still confusing is that the experimental profiler (as I read it) seems to treat all client-issued QueueItems as answers to server queries.

- Adding some logic to skip handle_client_response for the non-response frames partly fixes the issue, and I can now spawn the multiplexer with multiple clients. However, I have recurrent crashes to investigate.
- I was kind of scared by the huge switch case in get_time_and_field that is needed to remap the time correctly (I still have issues there), but I agree with cipharius: there is not a lot we can do without refactoring the profiler and/or the worker.
- I was pleased to discover TracyEventDebug, which is very nice for understanding the protocol by example.
- Off topic: merged [IDE] migrate test folder to CMake configuration #824

Useful resources:
- the discord channel
- https://wolf.nereid.pl/posts/tracy-internals (and the other blog post about tracy as well)
- the source code (spent hours in it)

Arpafaucon mentioned this issue Jul 2, 2024

Better support for cluster environments #471

Open
