*: add invocations to applicationlog #3569

ixje · 2024-09-03T14:48:53Z

Problem

Solution

Implement as extension. Moved the discussion from Dora's backend PR to here

To do

copy arguments to avoid modifications
limit the total number of argument stack items in a single transaction (for safety)
make this a configurable feature
include native contract calls

Do we want to limit the stack item depth (think MaxJSONDepth) or are we content with just limiting the total stack arguments?

AnnaShaleva

A good prototype, but I have several design questions that we should solve before review.

pkg/core/interop/contract/call.go

AnnaShaleva · 2024-09-04T05:07:33Z

pkg/core/interop/contract/call.go

+	ic.InvocationCalls = append(ic.InvocationCalls, state.ContractInvocation{
+		Hash:   u,
+		Method: method,
+		Params: stackitem.NewArray(args),


Regarding size restrictions: let's summarize what has been discussed in https://github.com/CityOfZion/dora-server-NeoN3/pull/220#issuecomment-2325214725. A restriction on the number of recorded invocations or on the number of arguments per recorded invocation (or across all recorded invocations) is needed. I'd suggest restricting both:

The overall number of invocations is restricted in depth by the invocation stack size, but it's not restricted in length (only by the execution GAS cost limit). This problem should be solved the same way as the restriction on the number of Notifications (Investigate System.Runtime.Notify refcounting #3490). Currently it's not restricted in the NeoGo node, but it may be restricted in the C# node (which means we have a bug in NeoGo) or it may not be restricted at all (which means we have a cross-implementation bug). So we first need to solve Investigate System.Runtime.Notify refcounting #3490 and then port that solution to the invocations restriction.

The number of arguments per recorded invocation may be restricted using the same approach as for Notifications: serialize it and check the resulting size:

neo-go/pkg/core/interop/runtime/engine.go

Line 110 in d47fe39

bytes, err := ic.DAO.GetItemCtx().Serialize(elem.Item(), false)

If serialization fails (due to the presence of recursive structures) or the arguments' size exceeds the limit, then this invocation won't be stored in the database.

These size restrictions should be well documented, and it should be noted that the invocation tracking system may miss some invocations, so users can't rely on the recorded invocations completely.
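The serialize-and-check idea from this comment can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the node's code: `Item`, `SerializeArgs`, the byte encoding and both limits are hypothetical stand-ins for neo-go's stack items and its serialization context.

```go
package main

import (
	"errors"
	"fmt"
)

// Item is a hypothetical stand-in for a VM stack item: a leaf carries Bytes,
// an array carries Children. It is not neo-go's stackitem.Item.
type Item struct {
	Bytes    []byte
	Children []*Item // non-nil marks the item as an array
}

var (
	errRecursive = errors.New("recursive structure")
	errTooBig    = errors.New("serialized size exceeds the limit")
	errTooMany   = errors.New("too many items")
)

// serializer accumulates the encoding and enforces both limits while writing.
type serializer struct {
	buf               []byte
	maxSize, maxItems int
	items             int
	onPath            map[*Item]bool // items currently being written
}

// appendLen writes n as a 4-byte big-endian length prefix.
func appendLen(b []byte, n int) []byte {
	return append(b, byte(n>>24), byte(n>>16), byte(n>>8), byte(n))
}

func (s *serializer) write(it *Item) error {
	if s.onPath[it] {
		return errRecursive // the item contains itself
	}
	s.items++
	if s.items > s.maxItems {
		return errTooMany
	}
	if it.Children == nil {
		s.buf = append(appendLen(append(s.buf, 0x01), len(it.Bytes)), it.Bytes...)
	} else {
		s.buf = appendLen(append(s.buf, 0x02), len(it.Children))
	}
	if len(s.buf) > s.maxSize {
		return errTooBig
	}
	s.onPath[it] = true
	for _, c := range it.Children {
		if err := s.write(c); err != nil {
			return err
		}
	}
	delete(s.onPath, it)
	return nil
}

// SerializeArgs encodes the call arguments as one array. An error means the
// arguments of this invocation should not be stored.
func SerializeArgs(args []*Item, maxSize, maxItems int) ([]byte, error) {
	s := &serializer{maxSize: maxSize, maxItems: maxItems, onPath: map[*Item]bool{}}
	root := &Item{Children: append([]*Item{}, args...)}
	if err := s.write(root); err != nil {
		return nil, err
	}
	return s.buf, nil
}

func main() {
	args := []*Item{{Bytes: []byte("ab")}, {Bytes: []byte("c")}}
	data, err := SerializeArgs(args, 1024, 16)
	fmt.Println(len(data), err)

	loop := &Item{Children: []*Item{}}
	loop.Children = append(loop.Children, loop) // an array that contains itself
	_, err = SerializeArgs([]*Item{loop}, 1024, 16)
	fmt.Println(err)
}
```

A single pass covers all three failure modes mentioned above (recursion, byte size, element count), which is why one serialization call can double as the limit check.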

ixje · 2024-09-04

While I agree it should be restricted, I don't think this point should be blocking (just like it is not for notifications).

This seems to count the total amount of bytes of all stack items. I don't understand how that translates to parameter count of the method called. I can see it used to ensure it doesn't exceed a maximum total item count in the Array though.

I don't think we should throw away the complete invocation record because of a parameter violation. If it's recorded on chain then it was apparently still a valid invocation, regardless of whether we want to store all of the invocation details. Let's say it exceeds one of the limits: we could still store the contract hash, the method, maybe the parameter count, and skip the actual parameters. Perhaps we could also add an is_valid field to the JSON output that can be used as an indicator of how much of the information can be trusted.
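A rough sketch of that fallback, assuming a simple byte budget: the record always keeps the contract hash, method and parameter count, and only drops the parameters themselves when the budget is exceeded. The `Invocation` type, `NewInvocation`, the `parameter_count` field name and the use of plain strings for parameters are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Invocation follows the JSON shape proposed in this comment. The Go type and
// the parameter_count field name are hypothetical.
type Invocation struct {
	ContractHash   string   `json:"contract_hash"`
	Method         string   `json:"method"`
	IsValid        bool     `json:"is_valid"`
	ParameterCount int      `json:"parameter_count"`
	Parameters     []string `json:"parameters"` // nil is marshalled as null
}

// NewInvocation keeps the parameters only when their total size fits maxBytes.
// The rest of the record is stored either way.
func NewInvocation(hash, method string, params []string, maxBytes int) Invocation {
	inv := Invocation{ContractHash: hash, Method: method, ParameterCount: len(params)}
	size := 0
	for _, p := range params {
		size += len(p)
	}
	if size <= maxBytes {
		inv.IsValid = true
		inv.Parameters = append([]string{}, params...) // copy, never nil
	}
	return inv
}

// JSON renders the record the way it would appear in the log output.
func (inv Invocation) JSON() string {
	b, err := json.Marshal(inv)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	hash := "0x49cf4e5378ffcd4dec034fd98a174c5491e395e2"
	fmt.Println(NewInvocation(hash, "designateAsRole", []string{"4"}, 16).JSON())
	fmt.Println(NewInvocation(hash, "designateAsRole", []string{"a very long argument"}, 16).JSON())
}
```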

For example, a valid record:

"invocations": [
    {
        "contract_hash": "0x49cf4e5378ffcd4dec034fd98a174c5491e395e2",
        "method": "designateAsRole",
        "is_valid": true,
        "parameters": {
            "type": "Array",
            "value": [
                {
                    "type": "Integer",
                    "value": "4"
                }
            ]
        }
    }
]

An invalid record:

"invocations": [
    {
        "contract_hash": "0x49cf4e5378ffcd4dec034fd98a174c5491e395e2",
        "method": "designateAsRole",
        "is_valid": false,
        "parameters": null
    }
]

AnnaShaleva · 2024-09-05

> I don't think this point should be blocking

Agree, let's then leave this restriction to #3490 and finalize Invocation logs without it.

> This seems to count the total amount of bytes of all stack items. I don't understand how that translates to parameter count of the method called.

I don't think that we need to stick to restricting the number of parameters, because the only thing that bothers us during Invocations collection is the resulting size of the serialized Invocations structure. We don't want it to be large and take up a lot of disk space. Serialize lets us ensure that serialized parameters do not occupy a lot of space, and it also takes care of the overall number of serialized elements (SerializationContext.limit is responsible for that). Thus I consider Serialize to be a perfect candidate for the arguments size restriction.

Also, we may cache the result of Serialize once, and then reuse it while storing Invocations on disk.

ixje · 2024-09-05

I believe Roman was also concerned about the processing time, but I'll let him comment to see if he agrees that using serialize is sufficient.

AnnaShaleva · 2024-09-05

Processing time is important, but we need to serialize arguments anyway to store Invocation logs, and if we reuse the result of Serialize, then processing time doesn't increase a lot.

And of course, the node should have a setting to disable Invocation logs, because it's a resource-demanding feature. Most of the public nodes likely won't have this feature enabled.

ixje · 2024-09-05

> And of course, the node should have a setting to disable Invocation logs, because it's a resource-demanding feature. Most of the public nodes likely won't have this feature enabled.

Yeah that's fine. Making it configurable is on my to-do list. In what section should I put the option? ApplicationConfiguration.RPC or ApplicationConfiguration (because it also affects the indexing behaviour)?

AnnaShaleva · 2024-09-05

To me it's more an ApplicationConfiguration-level setting; in particular, I'd place it into the config.Ledger structure. You're right that it affects the database behaviour, and it's possible that some other node services should be able to reach this setting in the future.
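As a sketch of what a ledger-level switch could look like: the flag name `SaveInvocations` is the one mentioned later in this thread, while the `Ledger` struct below is a trimmed-down hypothetical stand-in for the real configuration section, and `Context` and `RecordCall` are invented for illustration.

```go
package main

import "fmt"

// Ledger is a hypothetical, trimmed-down stand-in for the node's ledger
// configuration section. Only the flag discussed here is shown.
type Ledger struct {
	SaveInvocations bool
}

// Context is a hypothetical execution context that collects invocations.
type Context struct {
	cfg         Ledger
	Invocations []string
}

// NewContext binds the context to the ledger settings it should obey.
func NewContext(cfg Ledger) *Context { return &Context{cfg: cfg} }

// RecordCall stores the call only when the feature is enabled, so a node
// running with the default (false) collects nothing.
func (c *Context) RecordCall(contract, method string) {
	if !c.cfg.SaveInvocations {
		return
	}
	c.Invocations = append(c.Invocations, contract+"."+method)
}

func main() {
	off := NewContext(Ledger{})
	off.RecordCall("0xd2a4", "transfer")

	on := NewContext(Ledger{SaveInvocations: true})
	on.RecordCall("0xd2a4", "transfer")

	fmt.Println(len(off.Invocations), len(on.Invocations))
}
```

Keeping the check at the recording site means other services that read the same ledger settings can make the same decision without touching the RPC configuration.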

roman-khimov · 2024-09-05

ApplicationConfiguration is OK; ideally we should make DBs compatible with/without this option.

Side note: serialization is also kinda a snapshot of items, so DeepCopy can even be avoided if we're checking the size via serialization (for notifications copying is important since this data can be reused in the same context for System.Runtime.GetNotifications).
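The snapshot property in this side note can be illustrated with a small sketch: the arguments are serialized once when the call is recorded, and later mutation of the live arguments does not affect the stored bytes, so no separate deep copy is needed and the same bytes can be written to storage. The `serialize` function and the `Invocation` type here are hypothetical stand-ins, not the node's serializer.

```go
package main

import "fmt"

// serialize is a hypothetical stand-in for the node's stack item serializer.
// It writes the argument count and a one-byte length before each argument,
// so each argument must be shorter than 256 bytes.
func serialize(args [][]byte) []byte {
	out := []byte{byte(len(args))}
	for _, a := range args {
		out = append(out, byte(len(a)))
		out = append(out, a...)
	}
	return out
}

// Invocation keeps the serialized arguments so they can be written to the
// database as-is, without serializing a second time.
type Invocation struct {
	Method   string
	ArgBytes []byte
}

// Record serializes once, at call time. The result shares no memory with args.
func Record(method string, args [][]byte) Invocation {
	return Invocation{Method: method, ArgBytes: serialize(args)}
}

func main() {
	args := [][]byte{[]byte("ab")}
	inv := Record("transfer", args)
	args[0][0] = 'z' // the contract mutates its argument after the call was recorded
	fmt.Printf("%q %q\n", args[0], inv.ArgBytes)
}
```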

AnnaShaleva · 2024-09-04T05:19:41Z

pkg/core/interop/contract/call.go

+	ic.InvocationCalls = append(ic.InvocationCalls, state.ContractInvocation{
+		Hash:   u,
+		Method: method,
+		Params: stackitem.NewArray(args),


This "flattened" way of tracking invocations is missing depth, so that given [ContractACall, ContractBCall, ContractCCall] it's impossible to say whether contract B calls contract C internally or contract A calls both B and C sequentially. By comparison, the VM-level InvocationsTree gives a clear understanding of call depth and nesting relationships, which is good for the user:

neo-go/pkg/vm/vm.go

Lines 395 to 397 in d47fe39

newTree := &invocations.Tree{Current: ctx.ScriptHash()}

curTree.Calls = append(curTree.Calls, newTree)

ctx.sc.invTree = newTree

However, using VM InvocationsTree in the current state is impossible, because it does not track call arguments. And it's a problem to make it track call arguments because it only has access to loading context with contract scripthash, and arguments are loaded by interop handlers. This problem may be solved with some additional VM callback.

So the question is: do we need nested relationship to be present in the resulting invocations log? It's important to solve this design question before the implementation.

ixje · 2024-09-04

For the Dora use-case we don't need this information. Keeping it flat would be similar to notifications, where we also can't tell who triggered them (i.e. was it the user calling ContractA which calls ContractB, or was it the user calling ContractA and the user calling ContractB using 2 System.Contract.Calls in a tx.script).

However, it does seem like this information can be useful to somebody somewhere down the road and changing it later on is going to be a hassle. What would it look like? An option could be

type ContractInvocation struct {
	Hash        util.Uint160         `json:"contract_hash"`
	Method      string               `json:"method"`
	Arguments   *stackitem.Array     `json:"arguments"`
	IsValid     bool                 `json:"is_valid"`
	Invocations []ContractInvocation `json:"invocations"`
}

AnnaShaleva · 2024-09-05

> An option could be

Agree, that's in fact how the VM InvocationsTree works.

But regarding 1D (flattened) / 2D (nested) structure of Invocations: I think we need some third opinion on this topic. Personally, I vote for the nested structure because it contains more information which may be useful in some cases, and especially for contract calls debugging.

roman-khimov · 2024-09-05

A proper call tree would have a higher price. I'm OK with keeping it this way if it doesn't have a significant performance penalty.
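To make the flat-versus-nested trade-off concrete, here is a small sketch of a recorder that builds a call tree with an explicit stack of open calls. All names (`Recorder`, `Enter`, `Exit`, `Flat`) are hypothetical; the point is only that the two scenarios discussed in this thread produce the same flat list but different trees.

```go
package main

import (
	"fmt"
	"strings"
)

// Invocation is one node of the call tree.
type Invocation struct {
	Contract string
	Calls    []*Invocation
}

// Recorder tracks the currently open calls to know where a new call belongs.
type Recorder struct {
	Roots []*Invocation
	stack []*Invocation
}

// Enter registers a call made by whichever call is currently on top of the stack.
func (r *Recorder) Enter(contract string) {
	inv := &Invocation{Contract: contract}
	if n := len(r.stack); n > 0 {
		r.stack[n-1].Calls = append(r.stack[n-1].Calls, inv)
	} else {
		r.Roots = append(r.Roots, inv)
	}
	r.stack = append(r.stack, inv)
}

// Exit closes the most recently entered call.
func (r *Recorder) Exit() {
	if n := len(r.stack); n > 0 {
		r.stack = r.stack[:n-1]
	}
}

// flatten lists contracts in call order, which is what a flat log would store.
func flatten(invs []*Invocation) []string {
	var out []string
	for _, inv := range invs {
		out = append(out, inv.Contract)
		out = append(out, flatten(inv.Calls)...)
	}
	return out
}

// Flat renders the flat view as a comma-separated string.
func (r *Recorder) Flat() string { return strings.Join(flatten(r.Roots), ",") }

func main() {
	var nested, sequential Recorder

	// A calls B, and B calls C.
	nested.Enter("A")
	nested.Enter("B")
	nested.Enter("C")
	nested.Exit()
	nested.Exit()
	nested.Exit()

	// A calls B, then A calls C.
	sequential.Enter("A")
	sequential.Enter("B")
	sequential.Exit()
	sequential.Enter("C")
	sequential.Exit()
	sequential.Exit()

	fmt.Println(nested.Flat() == sequential.Flat())
	fmt.Println(len(nested.Roots[0].Calls), len(sequential.Roots[0].Calls))
}
```

The extra cost of the nested form in this sketch is one pointer per call plus the stack bookkeeping; whether that is significant for a real node is exactly the open question raised above.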

AnnaShaleva · 2024-09-04T05:25:52Z

pkg/core/interop/contract/call.go

+	ic.InvocationCalls = append(ic.InvocationCalls, state.ContractInvocation{
+		Hash:   u,
+		Method: method,
+		Params: stackitem.NewArray(args),


One more question is: how to handle contract exceptions given this way of counting Invocations? E.g. for Notifications we revert the whole set of notifications on context unloading if an exception was raised and no catch block is present in this context:

neo-go/pkg/core/interop/contract/call.go

Line 142 in d47fe39

ic.Notifications = ic.Notifications[:baseNtfCount] // Rollback all notification changes made by current context.

Should we do the same thing for Invocations? On the one hand, these invocations were handled by the VM and we can't just throw them out of the Invocations list; the VM InvocationsTree records all invocations irrespective of exceptions. On the other hand, all side effects of these invocations (notifications, contract storage changes) were reverted, so why should we include them in the list if the engine state remains as it was before these invocations?

ixje · 2024-09-04

This is a very good (but hard) question. At first sight I'm leaning towards rolling back, because the invocations list in the applicationlog is not intended as a debugging tool like the invocations tree but as a means to track successful contract invocations. Dora specifically will not process transactions that did not end in a HALT state. Arguably this skews the smart contract invocation statistics, but that's what it is.

Looking at what the invocations entry means in the applicationlog in general, I think it should work like notifications. This immediately reminds me of #3189: how would you like to see this for transactions ending in a FAULT state? Include them to match the current notifications behaviour, or exclude them?

AnnaShaleva · 2024-09-05

> is not intended as a debugging tool

My first thought was that we're developing a debugging tool :D But if not, then it would probably be better to follow the Notifications behaviour and revert the invocations tree. To me, one of the reasons a context's Notifications are reverted in case of an uncaught exception is the way the system tracks NEP17/NEP11 token transfers: for HALTed transactions, a Transfer notification with particular arguments is filtered out of the list of notifications and stored as a transfer record in the DB. That's why it's important to roll back unsuccessful notifications exactly like contract storage changes. And so, to me, Invocations are expected to behave exactly like Notifications.

Regarding application logs, we'll have to fix the current behaviour of the NeoGo node for FAULT transactions. The current behaviour is helpful and I'd love to keep it, but it's just not correct.
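A minimal sketch of "Invocations behave like Notifications", modelled on the `baseNtfCount` truncation quoted earlier in this thread. The `Context` type and its methods are hypothetical, and one detail is an assumption rather than something settled here: the failed call itself is dropped along with everything it triggered.

```go
package main

import (
	"errors"
	"fmt"
)

// errFault stands for an exception the callee did not catch itself.
var errFault = errors.New("uncaught exception")

// Context is a hypothetical interop context holding both lists.
type Context struct {
	Notifications []string
	Invocations   []string
}

// Notify appends a notification emitted by the running contract.
func (c *Context) Notify(name string) { c.Notifications = append(c.Notifications, name) }

// Call records the invocation, runs the callee and, if the callee ends with
// an uncaught exception, truncates both lists back to their pre-call length.
func (c *Context) Call(contract string, callee func() error) error {
	baseNtfCount := len(c.Notifications)
	baseInvCount := len(c.Invocations)
	c.Invocations = append(c.Invocations, contract)
	err := callee()
	if err != nil {
		c.Notifications = c.Notifications[:baseNtfCount]
		c.Invocations = c.Invocations[:baseInvCount]
	}
	return err
}

func main() {
	ctx := &Context{}
	_ = ctx.Call("A", func() error {
		_ = ctx.Call("B", func() error { ctx.Notify("fromB"); return nil })
		// C throws; A catches the exception and carries on.
		_ = ctx.Call("C", func() error { ctx.Notify("fromC"); return errFault })
		return nil
	})
	fmt.Println(ctx.Invocations, ctx.Notifications)
}
```

With this rule the recorded invocations stay consistent with the surviving notifications and storage changes, at the cost of hiding calls that did happen but were reverted.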

ixje · 2024-09-05

> My first thought was that we're developing a debugging tool :D

My original motivation comes from the problems described in https://github.com/CityOfZion/dora-server-NeoN3/issues/219 but it definitely has the potential to be used for debugging as well. Perhaps the rollback can be disabled when used with diagnostics. If we choose the format to be nested then I think it becomes a more powerful version of the current invocationtree.

AnnaShaleva · 2024-09-06

@roman-khimov, what do you think about reverting the invocations tree in case of exceptions, do we need it?

roman-khimov · 2024-09-06

Theoretically I'd love some "called, but reverted" status for them. But if a practically consistent (reverted) result is sufficient, then OK.
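The "called, but reverted" alternative could look roughly like this: instead of truncating the list on an uncaught exception, every invocation recorded since the failing call started is flagged. The `Reverted` field and the `Context` type are hypothetical and only illustrate the idea.

```go
package main

import (
	"errors"
	"fmt"
)

// errFault stands for an exception the callee did not catch itself.
var errFault = errors.New("uncaught exception")

// Invocation keeps a flag instead of being removed on failure.
type Invocation struct {
	Contract string
	Reverted bool
}

// Context is a hypothetical interop context holding the invocation list.
type Context struct{ Invocations []Invocation }

// Call records the invocation and, if the callee fails, marks it and all
// invocations made underneath it as reverted.
func (c *Context) Call(contract string, callee func() error) error {
	base := len(c.Invocations)
	c.Invocations = append(c.Invocations, Invocation{Contract: contract})
	err := callee()
	if err != nil {
		for i := base; i < len(c.Invocations); i++ {
			c.Invocations[i].Reverted = true
		}
	}
	return err
}

func main() {
	ctx := &Context{}
	_ = ctx.Call("A", func() error {
		_ = ctx.Call("B", func() error { return nil })
		// C calls D and then throws; A catches the exception.
		_ = ctx.Call("C", func() error {
			_ = ctx.Call("D", func() error { return nil })
			return errFault
		})
		return nil
	})
	for _, inv := range ctx.Invocations {
		fmt.Println(inv.Contract, inv.Reverted)
	}
}
```

A consumer that only cares about effective calls can filter on the flag, while a debugging tool still sees the full sequence.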


AnnaShaleva · 2024-09-05T05:12:18Z

@roman-khimov, I think we need some third opinion on these topics.

roman-khimov · 2024-09-05T20:14:36Z

How about System.Runtime.LoadScript calls, btw?

AnnaShaleva · 2024-09-06T05:04:45Z

How about System.Runtime.LoadScript calls

It leads to the creation of a new execution context, thus it's a valid part of the invocation tree. But is this information useful in practice? Dynamic invocations are identified by the hash160 of the loaded script, so the user can't retrieve the script itself, knowing only its hash. But we may still include dynamic invocations in the resulting Invocations tree with some special field like isContractCall: false.

ixje · 2024-10-31T09:42:52Z

Picking this up again. I rebased the branch onto the latest master and processed some of the feedback. In particular:

use stackitem.Serialize instead of deepcopy and re-use the results when storing the data
make the behaviour configurable through a SaveInvocations config option

Note: it was unclear to me based on #3569 (comment) whether I should make it a tree or keep it flat. I kept it flat for now.

If the feature is enabled, the applicationlog output looks as follows:

"invocations": [
                    {
                        "contract_hash": "0xd2a4cff31913016155e38e474a2c06d08be276cf",
                        "method": "transfer",
                        "arguments": {
                            "type": "Array",
                            "value": [
                                {
                                    "type": "ByteString",
                                    "value": "krOcd6pg8ptXwXPO2Rfxf9Mhpus="
                                },
                                {
                                    "type": "ByteString",
                                    "value": "AZelPVEEY0csq+FRLl/HJ9cW+Qs="
                                },
                                {
                                    "type": "Integer",
                                    "value": "1000000000000"
                                },
                                {
                                    "type": "Any"
                                }
                            ]
                        },
                        "arguments_count": 4,
                        "is_valid": true
                    }
                ]

and in disabled state it returns

"invocations": []

I'm looking for feedback on the above before taking care of covering System.Runtime.LoadScript calls

ixje · 2024-11-14T14:45:35Z

@AnnaShaleva can this PR also get some review love please

AnnaShaleva reviewed Sep 4, 2024


ixje added 3 commits October 31, 2024 08:15

Store invocations in application log

d8a0d13

rename params to arguments

04c71f5

make invocation saving configurable, safely handle args

dfd5c9b

ixje force-pushed the applog-invocations branch from f4e91f5 to dfd5c9b Compare October 31, 2024 09:03

re-use serialized arguments for storing

cd7caf5

ixje requested a review from AnnaShaleva October 31, 2024 09:43
