Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: identify internal function invocations in traces #8222

Open
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

klkvr
Copy link
Member

@klkvr klkvr commented Jun 21, 2024

Motivation

Introduces --decode-internal flag for forge test, cast run and cast call --trace which enables decoding of internal functions in traces

Example

Example trace of random Uniswap V3 swap:
image

Solution

To determine when we are jumping in/out of functions we are using source map Jump key. However, it is not really reliable, especially after optimizations. Almost in all cases there are mismatches between number of "in"s and "out"s, so we need additional processing to correctly display subset of functions which are correctly reported.

Main implementation of this tracing is in DebugTraceIdentifier: https://github.com/foundry-rs/foundry/blob/216e9da8a28fcc57bcba1c6c4986aa5353472cc5/crates/evm/traces/src/debug/mod.rs

The only issue with this approach is that we are losing data about entire stack of internal functions which were joined before revert

I've used default tracer from revm-inspectors instead of traces collected by Debugger to allow easier integration into printing logic. Using it required a small patch to inspectors: paradigmxyz/revm-inspectors#150

This approach is enough to implement flamegraphs in a similar way, and can probably be extended to smarter tracking of stack/memory/calldata to also resolve input and output parameters of internal functions

Printing logic is a bit ugly at the moment

Closes: #3999 + Closes: #4351

@klkvr klkvr force-pushed the klkvr/internal-fns-in-traces branch from 3b2b1fe to 9fd779a Compare June 21, 2024 01:24
mattsse pushed a commit to paradigmxyz/revm-inspectors that referenced this pull request Jun 21, 2024
Adds `Step` variant for `LogCallOrder` enum and renames it to
`TraceMemberOrder`.

This is useful for printing logic which relies on execution steps as
well, e.g. foundry-rs/foundry#8222
@klkvr klkvr force-pushed the klkvr/internal-fns-in-traces branch from 5f643a4 to 216e9da Compare June 23, 2024 05:17
@klkvr klkvr mentioned this pull request Jun 23, 2024
@klkvr
Copy link
Member Author

klkvr commented Jun 24, 2024

Added tracking of inputs and ouputs as well. It is currently not able to decode user-defined types such as structs and enums, tracking those would probably require smarter AST analysis.
image

Currently --decode-internal is pretty expensive in terms of memory usage because each step is being tracked for each test. We are only interested in JUMPs and JUMPDESTs, so it might make sense to add a configuration option for TracingInspector to only collect steps with specific opcodes.

Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only briefly skimmed parts of it.

I think this makes sense, I'd appreciate a few more docs, and I'll take a closer look

crates/cli/src/utils/cmd.rs Outdated Show resolved Hide resolved
Comment on lines 400 to 401
if self.tracer.is_none() && yes ||
!self.tracer.as_ref().map_or(false, |t| t.config().record_steps) && debug
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bit hard to follow,
I wonder if we can encapsulate these two bools into an enum TracingKid or smth, because debug also implies tracing, right?

mattsse pushed a commit to paradigmxyz/revm-inspectors that referenced this pull request Jun 25, 2024
- Adds `record_returndata_snapshots` flag to config which enables
snapshots of `interpreter.return_data_buffer`
- Adds `record_opcodes_filter` parameter which allows to only record
specific opcodes. ref
foundry-rs/foundry#8222 (comment)
- Adds `gas_used` field for `CallTraceStep`

This should be enough to migrate foundry's debugger to using
`TracingInspector` from here, I will open PR for this later today.
@klkvr klkvr marked this pull request as ready for review June 27, 2024 08:13
@klkvr klkvr changed the title [wip] feat: identify internal function invocations in traces feat: identify internal function invocations in traces Jun 27, 2024
@klkvr
Copy link
Member Author

klkvr commented Jun 27, 2024

This is pretty much ready. Memory usage is still very high when running a complex test suite with --decode-internal, but with filters it is not more than for debugger. cast run and cast call are also working fine.

Not sure what to do with printing logic. It is not great and I think it would be better to integrate this directly into TraceWriter once we have #8224. wdyt @mattsse @DaniPopes

Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, only have nits

this is now blocked by the tracewriter pr?
cc @zerosnacks

crates/cli/src/utils/cmd.rs Show resolved Hide resolved
crates/evm/evm/src/executors/invariant/replay.rs Outdated Show resolved Hide resolved
crates/evm/traces/src/debug/sources.rs Show resolved Hide resolved
match ty {
// For `string` and `bytes` layout is a word with length followed by the data
DynSolType::String | DynSolType::Bytes => {
let length = U256::from_be_bytes::<32>(first_word.try_into().unwrap()).to::<usize>();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will the usize cause overflow problem, usize is 64 bits (8 bytes), where the length can be 32 bytes long

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am trying this new feature, seeing below error in my end

Message:  Uint conversion error: Overflow(256, 10488988339550576416, 18446744073709551615)
Location: crates/evm/traces/src/debug/mod.rs:305

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                                ⋮ 14 frames hidden ⋮
  15: core::result::Result<T,E>::expect::h1e62bb9f5f4f9d0f
      at /rustc/d124af4d4a707212dfe7c081/library/core/src/result.rs:1034
  16: ruint::from::<impl ruint::Uint<_,_>>::to::h7b50fa5729cd9809
      at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruint-1.12.3/src/from.rs:236
       234 │         T: Debug,
       235 │     {
       236 >         self.uint_try_to().expect("Uint conversion error")
       237 │     }
       238 │
  17: foundry_evm_traces::debug::decode_from_memory::hb99e9e33dfb84c71
      at /Users/x/foundry/crates/evm/traces/src/debug/mod.rs:305
       303 │         // For `string` and `bytes` layout is a word with length followed by the data
       304 │         DynSolType::String | DynSolType::Bytes => {
       305 >             let length = U256::from_be_bytes::<32>(first_word.try_into().unwrap()).to::<usize>();
       306 │             let data = memory.get(location + 32..location + 32 + length)?;
       307 │
  18: foundry_evm_traces::debug::try_decode_args_from_step::{{closure}}::{{closure}}::h0912b74d67673459
      at /Users/x/foundry/crates/evm/traces/src/debug/mod.rs:283
       281 │                         (DynSolType::Uint(8), Some(Storage::Memory | Storage::Storage)) => None,

Copy link
Member Author

@klkvr klkvr Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @ChinW thanks for trying this out and reporting!

I think it's safe to assume that memory wouldn't be expanded for more than usize, though this analysis is not perfect and we might expect to find the length while there's actually some other value (i.e. number > usize::max), which I believe has happened in your case

any chance you could provide a repro for this? also, are you using via-ir here?

@Philogy
Copy link

Philogy commented Jun 29, 2024

Been waiting for this feature forever!

What do reverts look like? Is it able to give you the line you reverted on?

What about solidity native functions like abi.encode, require? I think it'd be really useful if the internal calls were shown with a line number

@klkvr
Copy link
Member Author

klkvr commented Jun 29, 2024

What do reverts look like? Is it able to give you the line you reverted on?

Current approach relies on solc source map keys of jump type. For JUMPs we are sometimes provided with info on whether this is a jump in or out of the function, and by reading source code you can determine the name, location, input and output types of the function.

However, with optimizations those source maps are getting messed up and you are getting a lot of mismatched ins and outs.

Currently I've been mostly focusing on correctness of the identification, thus we currently only identify a match if we see an explicit JUMP in and JUMP out of the same function. This currently doesn't really work for REVERTs and RETURNs done in low-level assembly, because there are no JUMPs out, just a frame execution end.

We're still ending up with a stack of potentially correct internal fns in those cases, but when I tested this for some random cases this stack usually contained some invalid data which I wouldn't want to display.

So currently there is definitely space for improvement of identification, likely through more "guessing"-approach relying on multiple factors and guided by AST, source maps and bytecode analysis.

What about solidity native functions like abi.encode, require? I think it'd be really useful if the internal calls were shown with a line number

While I was reading source maps I've seen that solc marks JUMPs into abi.encode/abi.decode as jumps in, so it should be possible to identify those in the future. Though this is mostly suited for user-defined internal fns at the moment.

require is not treated by solc as a function, though REVERT instruction source mapping is usually pointing to the require source code, so those should be possible to identify as well (we are already doing this in coverage iirc)

IMO all of this is basically a better/more readable UX for the debugger, which can already be used to check the exact line of code where the revert occured

mattsse pushed a commit to paradigmxyz/revm-inspectors that referenced this pull request Jul 2, 2024
ref foundry-rs/foundry#8222
ref foundry-rs/foundry#8198

Adds structs and extends `TraceWriter` to support formatting of decoded
trace steps. Currently two decoding formats are supported:
- Internal calls. Similar to a decoded call trace, decoded internal
function invocation which spans over multiple steps. Kept as decoded
function name, inputs, outputs and index of the last step.
- Arbitrary strings. This might be useful for formatting decoded opcodes
(e.g. adding `├─ [sload] <slot>` to trace. It might make sense to extend
it to something more configurable once we start implementing this
Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

pending @DaniPopes

crates/forge/src/multi_runner.rs Outdated Show resolved Hide resolved
crates/evm/traces/src/debug/sources.rs Outdated Show resolved Hide resolved
crates/evm/traces/src/debug/sources.rs Show resolved Hide resolved
crates/evm/evm/src/executors/trace.rs Outdated Show resolved Hide resolved
crates/evm/traces/src/debug/sources.rs Outdated Show resolved Hide resolved
@klkvr
Copy link
Member Author

klkvr commented Jul 9, 2024

Updated --decode-internal flag to accept a regex similar to --debug. On large suites --decode-internal easily results in OOM, so I think it's better to restrict its usage in such way

@DaniPopes
Copy link
Member

Yeah same problem as in the debugger caused by memory snapshots

@DaniPopes
Copy link
Member

Maybe we can disable memory decoding by default to avoid the memory consumption issue?

@klkvr
Copy link
Member Author

klkvr commented Jul 9, 2024

Maybe we can disable memory decoding by default to avoid the memory consumption issue?

Yeah, I though about disabling memory tracking if more than one test matched filters. Though not sure how to make this intuitive

Should it be two separate flags, one of which does not require the test function filter?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for internal function jump trace Forge get geth style trace of test case
5 participants