Summary
Leader log queries impede critical validator processing and cause extreme numbers of missed slot leader checks.
Steps to reproduce
Watch the frequency of missed slot leader checks over time.
Run a demanding cardano-cli query in a loop against the validator (example below).
Watch the disaster unfold: in my case, a repeated query caused 7% of slot leader checks to be missed.
Repeat the test with a regular relay node. Tip differences will run sky high (>100) when queries are processed.
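One way to put a number on the first and third steps is to sample the node's Prometheus counters before and during the query loop and compare the deltas. The following is a minimal sketch, not the reporter's tooling: the two metric names are assumptions (check them against what your node actually exports), and `parse_metrics` and `missed_percent` are hypothetical helpers.

```python
# Assumed metric names; verify them against your node's Prometheus endpoint.
MISSED = "cardano_node_metrics_slotsMissedNum_int"
CHECKED = "cardano_node_metrics_Forge_forge_about_to_lead_int"


def parse_metrics(text):
    """Parse Prometheus text exposition lines of the form 'name value'."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2:
            try:
                values[parts[0]] = float(parts[1])
            except ValueError:
                pass  # ignore lines whose value is not numeric
    return values


def missed_percent(before, after):
    """Share of slot leader checks missed between two samples, in percent."""
    missed = after[MISSED] - before[MISSED]
    checked = after[CHECKED] - before[CHECKED]
    total = missed + checked
    return 100.0 * missed / total if total > 0 else 0.0
```

Take one sample before starting the query loop and one some time into it, then pass both parsed dictionaries to `missed_percent`. Comparing deltas rather than absolute counter values keeps misses from earlier in the node's uptime out of the result.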
Expected behavior
Proper resource isolation.
Ongoing query processing never delays inward tip propagation (“height”).
Ongoing query processing on a validator never ever causes missed slot leader checks!
The fact that one should not run queries against a validator is orthogonal; a validator should either process such queries gracefully, without impediment to critical operations, or outright reject them.
Timing-critical tasks must take precedence. (They should not be timing-critical, but sadly are.)
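To illustrate what "timing-critical tasks take precedence" could mean in practice, here is a toy cooperative scheduler. It is only a sketch of the general idea and says nothing about how cardano-node or the GHC runtime actually schedule work; `PriorityScheduler`, `query` and `leader_check` are hypothetical names. Long-running query work is split into chunks, and a pending critical task always runs before the next chunk.

```python
import heapq
import itertools

CRITICAL, BACKGROUND = 0, 1  # lower number means higher priority


class PriorityScheduler:
    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker, keeps FIFO order per priority

    def submit(self, priority, name, steps):
        """Queue a task; `steps` is an iterator that yields after each unit of work."""
        heapq.heappush(self._heap, (priority, next(self._order), name, iter(steps)))

    def run(self):
        """Run one unit of the most urgent task at a time; return the execution log."""
        log = []
        while self._heap:
            priority, _, name, steps = heapq.heappop(self._heap)
            try:
                next(steps)  # perform one chunk of work
            except StopIteration:
                continue  # task finished, do not requeue
            log.append(name)
            heapq.heappush(self._heap, (priority, next(self._order), name, steps))
        return log


def leader_check():
    yield  # a single short unit of work


def query(scheduler):
    for chunk in range(3):
        if chunk == 1:
            # Simulate a slot boundary arriving while the query is in progress.
            scheduler.submit(CRITICAL, "leader-check", leader_check())
        yield


scheduler = PriorityScheduler()
scheduler.submit(BACKGROUND, "query", query(scheduler))
log = scheduler.run()
```

In this run the leader check is submitted while the query's second chunk is executing, and it runs before the query's third chunk. The key design choice is that background work yields at chunk boundaries, so a critical task never waits for longer than one chunk.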
System info (please complete the following information):
Additional context
This case could be dismissed with “use a workaround”, i.e. “have a separate relay node for slot leader queries only”, one that is not used for routing to a validator. However, such a workaround is suboptimal: it increases the amount of resources a pool operator must set aside by up to 50%, compared to the simplest relay + validator setup.
The lack of proper resource isolation may have been a contributing factor to my problem of never successfully validating a block, described in this post and above.
@andrejpodzimek I've been working on something that may alleviate your problem. It was done for relays serving hundreds of clients but perhaps it could work here too.
Internal/External
External
Area
Other
System info (please complete the following information):
cardano-node --version:
cardano-cli --version:
Screenshots and attachments
An example query to expose resource isolation problems:
RTS options: