Marlowe Runtime Chain Index Architecture #180

jhbertra · 2022-07-07T10:02:19Z

jhbertra
Jul 7, 2022

I've started this thread as a place to discuss the design for the chain index component of the runtime. I'm tentatively using the term "chain index" as opposed to "chain follower" or "chain sync client" because we may be using more than just the chain sync mini protocol, so "chain index" is more general.

Before we dive into the internals of the component, we should settle on an initial set of requirements that we feel comfortable with and describe the interface.

Here is a first draft of the specification for discussion:

High-level and Rationale

I'm beginning to think that we want the chain index to actually be a separate process. The reason is the potentially large volume of data it will store. It may make sense in some cases to deploy it on a different physical computer, or to provide hosted instances for users who don't want to host either the Cardano Node or the database themselves. For the same reason, it certainly makes sense to segregate the chain index's database from other databases in the runtime.

The reason for introducing the intermediate contract roster structure is to support use cases in which only a subset of contracts are required with minimal data transfer, particularly in the streaming API. Sending only transaction IDs makes the message payload smaller, and enables the streaming API to send everything, simplifying its semantics. This also allows consumers to maintain a lightweight index that they can cheaply update and hold in memory, while actual transactions can be loaded lazily and optionally kept in a transient caching layer.

Definitions

Marlowe epoch: the initial chain point from which Marlowe transactions are expected to be found.
```
newtype MarloweEpoch = MarloweEpoch ChainPoint
```
Contract roster: a map that enumerates the relevant transaction IDs for each Marlowe contract.
```
newtype ContractRoster = ContractRoster (MonoidalMap ContractId (Set TransactionId))
```
Contract filter: a filter used to specify a set of contracts
```
data ContractFilter = ???
```

Synchronization status: A data structure that describes how far into the synchronization process the chain index is.

data SynchronizationStatus
  = Disconnected
  | Synchronizing ChainPoint ChainTip
  | Synchronized ChainTip

Administrator: An actor who configures and manages the deployment of the runtime components
Consumer: An actor who queries the chain index.
Transaction ID set: A set of transaction IDs
Transaction set: A map of transaction data or error message keyed by transaction ID

Chain event: A data structure that describes an incremental update to the chain.

data ChainEvent
  = SynchronizationComplete
  | RollForward ChainTip ContractRoster
  | RollBackward ChainTip

Node IPC thread: A dedicated thread for running the chain sync client.

Functional requirements

1. As an administrator, I want to specify a Marlowe epoch chain point, so that I can save storage space.
   - Given a valid Marlowe epoch, the chain index should download all blocks from that point onward.
   - Given a valid Marlowe epoch, the chain index should not download any blocks before that point.
   - Given an undefined Marlowe epoch, the chain index should warn that a Marlowe epoch is not defined and that synchronization will start from genesis.
   - Given an invalid Marlowe epoch, the chain index should report that the Marlowe epoch is invalid and exit.
   - Given a Marlowe epoch that occurs after the one set in a previous run, the chain index should warn that the Marlowe epoch has changed, and that a range of blocks will be lost.
   - Given a Marlowe epoch that occurs before the one set in a previous run, the chain index should warn that the Marlowe epoch has changed, and that a new batch of blocks will be downloaded. The chain index should only fetch the range of blocks it does not already have.
2. As a consumer, I want to request a contract roster for a contract filter, so that I know which transactions to fetch for a given contract.
   - Given a contract filter, the contract roster should include all contracts included by the contract filter.
   - Given a contract filter, the contract roster should not include any contracts not included by the contract filter.
   - Given the synchronization status of the chain index is not Synchronized, the response should indicate that the roster is possibly incomplete.
3. As a consumer, I want to request the synchronization status, so that I know when I can start requesting the contract roster.
   - Given the latest saved block is behind the tip, the status should be Synchronizing.
   - Given the latest saved block is at the tip, the status should be Synchronized.
   - Given the node is disconnected, the status should be Disconnected.
4. As a consumer, I want to request a transaction set for a transaction ID set, so that I can obtain detailed transaction information.
   - Given a valid transaction ID in the transaction ID set, the transaction set should include the corresponding transaction data.
   - Given an invalid transaction ID in the transaction ID set, the transaction set should include an error message for that transaction ID.
   - Given a transaction ID set, the transaction set should contain the same set of IDs as keys.
   - Given the synchronization status of the chain index is not Synchronized, the response should indicate that the transaction set is possibly incomplete.
5. As a consumer, I want to subscribe to a stream of chain events, so that I can update my copy of the roster and propagate change events downstream.
   - Given the synchronization status of the chain index is Synchronizing, the stream should not send any events.
   - Given the synchronization status of the chain index is Synchronized, the first event the stream should send should be a SynchronizationComplete.
   - Given the synchronization status of the chain index is Disconnected, the connection request should be rejected.
   - Given the synchronization status of the chain index becomes Disconnected, the connection should be severed.
   - Given the synchronization status of the chain index becomes Synchronized, the first event the stream should send should be a SynchronizationComplete.
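
To make the synchronization status requirement above concrete, here is a minimal sketch of how the status could be derived. It is only an illustration: `ChainPoint` and `ChainTip` are simplified stand-ins (a slot number and a hash), and `deriveStatus` with its `nodeConnected` flag is a hypothetical helper, not part of the proposed interface.

```haskell
import Data.Word (Word64)

-- Simplified stand-ins for the real chain types (assumption for this sketch).
data ChainPoint = ChainPoint { pointSlot :: Word64, pointHash :: String }
  deriving (Eq, Show)

newtype ChainTip = ChainTip ChainPoint
  deriving (Eq, Show)

data SynchronizationStatus
  = Disconnected
  | Synchronizing ChainPoint ChainTip
  | Synchronized ChainTip
  deriving (Eq, Show)

-- | Hypothetical helper: derive the status from the connection state,
-- the latest saved block, and the tip reported by the node.
deriveStatus :: Bool -> ChainPoint -> ChainTip -> SynchronizationStatus
deriveStatus nodeConnected latestSaved tip@(ChainTip tipPoint)
  | not nodeConnected = Disconnected
  | latestSaved == tipPoint = Synchronized tip
  | otherwise = Synchronizing latestSaved tip
```

The `Synchronizing` case carries both the saved point and the tip, so a consumer could estimate how far behind the index is.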

Non-functional requirements

Node communication should happen on the node IPC thread to minimize latency
Block deserialization should not happen on the node IPC thread
Persistence should happen in bulk when synchronizing
Persistence should happen on demand when synchronized
Contract roster and Transaction set queries should be fast
Bulk block deletion by slot number should be fast
Block deletion should cascade to transaction deletion
Rollbacks should be handled by deleting blocks
Synchronization should start from the most recent intersection point
On Node disconnection, the process should not crash
On Node disconnection, the process should automatically try to reconnect with retry back-off
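
For the last two points, a rough sketch of reconnecting with retry back-off might look like the following. This assumes exponential back-off with a cap, which the requirements do not specify, and `connectOnce` is a hypothetical action that returns `True` once the node connection is re-established.

```haskell
import Control.Concurrent (threadDelay)

-- | Delays in microseconds: start at `base`, double each time, never exceed `cap`.
backoffDelays :: Int -> Int -> [Int]
backoffDelays base cap = iterate (\d -> min cap (d * 2)) (min cap base)

-- | Keep trying `connectOnce` (hypothetical; returns True on success),
-- sleeping between attempts. A failed attempt is never fatal: the loop
-- waits and tries again, so the process itself does not crash.
reconnect :: [Int] -> IO Bool -> IO ()
reconnect [] _ = pure () -- delays exhausted; the caller decides what to do next
reconnect (delay : rest) connectOnce = do
  connected <- connectOnce
  if connected
    then pure ()
    else threadDelay delay >> reconnect rest connectOnce
```

Passing the infinite list from `backoffDelays` retries forever; passing `take n` of it bounds the number of attempts.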

Usage notes

The consumer should establish a streaming connection.
When the consumer receives a SynchronizationComplete on the streaming connection, it should request the contract roster.
Upon receiving a contract roster via the streaming connection, it should merge the new roster with the one it currently maintains.
Upon receiving a contract roster via a request, it should replace the one it currently maintains with the new roster.
If the consumer wishes to filter its roster, it should do so in the following ways:
1. Provide the filter when requesting the roster.
2. Apply the filter to the roster received via the streaming connection before merging it with the current one.
If the streaming connection is severed by the chain index, the consumer may try to reconnect.
Upon reconnection, the consumer should act as if it is connecting for the first time and follow the same procedure.
The consumer may persist the contract roster, and load it upon start, but this is not necessary because it will always start by requesting a new roster.
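
The merge-versus-replace rule from these notes can be sketched as two small functions. This is an illustration only: the identifier types are stand-ins, a plain `Map` with an explicit `unionWith` plays the role of `MonoidalMap`, and `mergeStreamed` / `replaceRequested` are hypothetical names.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Set (Set)
import qualified Data.Set as Set

-- Stand-ins for the real identifiers (assumption for this sketch).
newtype ContractId = ContractId String deriving (Eq, Ord, Show)
newtype TransactionId = TransactionId String deriving (Eq, Ord, Show)

newtype ContractRoster = ContractRoster (Map ContractId (Set TransactionId))
  deriving (Eq, Show)

-- | Roster pushed over the streaming connection: merge it into the current one,
-- taking the union of transaction IDs for contracts present in both.
mergeStreamed :: ContractRoster -> ContractRoster -> ContractRoster
mergeStreamed (ContractRoster new) (ContractRoster current) =
  ContractRoster (Map.unionWith Set.union new current)

-- | Roster returned by an explicit request: it replaces the current one.
replaceRequested :: ContractRoster -> ContractRoster -> ContractRoster
replaceRequested new _current = new
```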

jhbertra · 2022-07-07T12:04:28Z

jhbertra
Jul 7, 2022
Author

Interface summary:

querySynchronizationStatus :: ClientM SynchronizationStatus
queryContractRoster :: ContractFilter -> ClientM ContractRoster
queryTransactions :: MonadIO m => Set TransactionId -> ClientM (Producer TransactionResult m ())
streamChainEvents :: MonadIO m => ClientM (Producer ChainEvent m ())

-- pure helpers
applyFilter :: ContractFilter -> ContractRoster -> ContractRoster

Note that queryTransactions and streamChainEvents both return a Producer. This is because queryTransactions streams its response to avoid loading arbitrarily large collections of transactions into memory, while streamChainEvents is a streaming data source by its nature.

In this context, Producer is from the pipes package.

jhbertra · 2022-07-07T12:46:13Z

jhbertra
Jul 7, 2022
Author

Architecture diagram:

graph
  node((node socket))
  consumer(("consumer(s)"))
  db[(Database)]
  config[(Config)]
  qport((Query Port))
  sport((Streaming Port))
  subgraph Chain Index
    epoch[Marlowe Epoch]
    intersect[Intersection Points]
    env[Environment]
    sync[Chain Sync Client]
    dbcon[DB connection]
    status[Synchronization Status]
    persist[Chain Persister]
    sapi[Streaming API]
    qapi[Query API]
    qsock[Query Socket]
    ssock[Streaming Socket]
  end
  sync-->node
  config-->db
  sync-->epoch
  epoch-->env
  qapi-->qsock
  qapi-->status
  qapi-->dbcon
  qsock-->env
  qsock--->qport
  sapi-->ssock
  sapi-->status
  sapi-->sync
  ssock-->env
  ssock--->sport
  status-->sync
  persist-->status
  sync-->intersect
  intersect-->dbcon
  persist-->dbcon
  env-->config
  dbcon-->env
  dbcon-->db
  persist-->sync
  consumer-->qport
  consumer-->sport

jhbertra · 2022-07-07T13:21:15Z

jhbertra
Jul 7, 2022
Author

I'm realizing that the consumer has no way to apply a contract filter to a contract roster with the information it contains. I think that streamChainEvents needs to accept a ContractFilter instead so that all filtering can be done on the chain index side.

jhbertra · 2022-07-07T13:23:17Z

jhbertra
Jul 7, 2022
Author

Interface draft v2:

querySynchronizationStatus :: ClientM SynchronizationStatus
queryContractRoster :: ContractFilter -> ClientM ContractRoster
queryTransactions :: MonadIO m => Set TransactionId -> ClientM (Producer TransactionResult m ())
streamChainEvents :: MonadIO m => ContractFilter -> ClientM (Producer ChainEvent m ())

The filter passed to streamChainEvents would not change the number of events transmitted - it would simply be applied to the ContractRoster supplied by a RollForward event.
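
A sketch of what that would mean on the chain index side. Since `ContractFilter` is still unspecified, it is modelled here as a plain predicate on `ContractId`; that representation, the stand-in types, and the name `filterEvent` are all assumptions for illustration.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Set (Set)

-- Stand-ins for the real types (assumption for this sketch).
newtype ContractId = ContractId String deriving (Eq, Ord, Show)
newtype TransactionId = TransactionId String deriving (Eq, Ord, Show)
newtype ChainTip = ChainTip Int deriving (Eq, Show)

newtype ContractRoster = ContractRoster (Map ContractId (Set TransactionId))
  deriving (Eq, Show)

-- Assumption: a filter is just a predicate on contract IDs.
type ContractFilter = ContractId -> Bool

data ChainEvent
  = SynchronizationComplete
  | RollForward ChainTip ContractRoster
  | RollBackward ChainTip
  deriving (Eq, Show)

applyFilter :: ContractFilter -> ContractRoster -> ContractRoster
applyFilter keep (ContractRoster roster) =
  ContractRoster (Map.filterWithKey (\contractId _ -> keep contractId) roster)

-- | Every event is still emitted; only the roster inside RollForward shrinks.
filterEvent :: ContractFilter -> ChainEvent -> ChainEvent
filterEvent keep (RollForward tip roster) = RollForward tip (applyFilter keep roster)
filterEvent _ event = event
```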

jhbertra · 2022-07-07T13:32:37Z

jhbertra
Jul 7, 2022
Author

I'm also realizing that the ContractRoster structure is actually unnecessary. It can be fused away like this:

querySynchronizationStatus :: ClientM SynchronizationStatus
queryTransactions :: ContractFilter -> ClientM (Producer Transaction m ())
streamChainEvents :: MonadIO m => ContractFilter -> ClientM (Producer ChainEvent m ())

data ChainEvent
  = SynchronizationComplete
  | RollForward ChainTip [Transaction]
  | RollBackward ChainTip

This should dramatically simplify the API, and consequently, the consumer.

1 reply

jhbertra Jul 7, 2022
Author

Actually, it might not be a good idea to discard the ContractRoster after all. One benefit of the ContractRoster is that it keeps the size of the structure that must be initially requested much smaller. This means that the consumer can request the whole roster as part of its initialization procedure and just discard its current one. Then it only needs to fetch transactions that it doesn't already have, since they will never change and can be cached aggressively. If this version of queryTransactions needed to be called as part of consumer initialization, it would be far more intensive, and we'd likely need to add the ability to exclude certain transactions from the response.
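
As a rough illustration of the "only fetch what you don't already have" step, assuming stand-in identifier types and a hypothetical helper name `missingTransactions`:

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Set (Set)
import qualified Data.Set as Set

-- Stand-ins for the real identifiers (assumption for this sketch).
newtype ContractId = ContractId String deriving (Eq, Ord, Show)
newtype TransactionId = TransactionId String deriving (Eq, Ord, Show)

newtype ContractRoster = ContractRoster (Map ContractId (Set TransactionId))

-- | Transaction IDs named by a freshly requested roster that are not yet in
-- the consumer's local cache. Only these need to be passed to queryTransactions.
missingTransactions :: Set TransactionId -> ContractRoster -> Set TransactionId
missingTransactions cached (ContractRoster roster) =
  Set.unions (Map.elems roster) `Set.difference` cached
```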

palas · 2022-07-07T14:30:49Z

palas
Jul 7, 2022
Maintainer

@jhbertra, is the "Roster" mechanism something standard? Do you have a reference to an explanation of it? Maybe it would be good to add it to the discussion. I saw you wrote a rationale for using it, but it would be useful to have some more information about what a Roster is, how it works, etc.

21 replies

palas Jul 8, 2022
Maintainer

So, the problem I keep encountering and the reason I suggested going for the ContractRoster to start with is this: we could instead pass a list of intersection points to the chain index and let it pick the best one and only send transactions starting from that point. Then the consumer would only have to request transactions it doesn't already know about. But what happens when the consumer changes its filter? It now needs a way to express "give me all transactions that match the new filter, starting from the latest of these intersection points, union all transactions that match the difference of the new filter and the old filter." This is what I meant when I said it starts getting much more complicated when you combine (A) Saving previously fetched transactions and not re-requesting them on startup, (B) Filtering transaction information by contract, and (C) Dynamically switching the filter.

I'm not saying my solution is the best either. I completely agree that it doesn't scale to arbitrary volumes of transactions, and I also do not like the fact that you need to do two round trip queries to get the data you want. But I don't think simply adding a chain point window will solve the problem in a satisfactory or future-proof way.

One solution could be that the client could, in principle, create a filter that just contains the difference and make a separate query for the delta. So for example:

The client wants all transactions that satisfy filter A
Later the client wants to extend the filter A, so that it is looser, and makes a filter C that includes A, (so "A or B = C")
The client could get up to date by querying the whole chain using B. You can even calculate B as "B = C and (not A)", but most of the time it is probably quite straightforward what B is, or at least a good approximation
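
A small sketch of that calculation, treating a filter as a plain predicate. The predicate representation and the names `Filter`, `orFilter` and `deltaFilter` are assumptions for illustration, since the real filter type is not defined yet.

```haskell
-- Assumption: a filter is modelled as a predicate over whatever is being filtered.
type Filter a = a -> Bool

-- | "A or B": matches anything that either filter matches.
orFilter :: Filter a -> Filter a -> Filter a
orFilter f g x = f x || g x

-- | "B = C and (not A)": what the new filter C matches that the old filter A did not.
deltaFilter :: Filter a -> Filter a -> Filter a
deltaFilter newFilter oldFilter x = newFilter x && not (oldFilter x)
```

When C includes A, what A already fetched together with the result of querying with `deltaFilter c a` covers exactly what C matches.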

palas Jul 8, 2022
Maintainer

That wouldn't be an efficient way to communicate the desired information. The point of the contract roster is to communicate an enumerable set of transaction IDs downstream. Bloom filters only provide a probabilistic membership test, you can't enumerate the members from it. So to make use of it, you would need to first start with every single transaction ID and individually test them for membership in the bloom filter to get the set of transactions you want to work with.

I am not sure; I think you can do intersection and union of bloom filters too. You cannot get each transaction in the bloom filter from the bloom filter, but then again, the client doesn't need to (I think). The only problems I see are that I am not sure how efficient it is to fetch everything that matches a bloom filter from a DB, and that it is probabilistic, so it may get you transactions you don't want. But it is not obvious to me that it wouldn't do the trick.

I think using ranges would be better, in principle.

jhbertra Jul 8, 2022
Author

Regarding the discussion on bloom filters, I don't think it's a good use of focus at the current moment to be thinking about them, and I would caution about getting too fixated on them. We don't have more than a vague intuition they might be handy at the moment, so it doesn't seem like a productive topic to consider right now. Personally, I would start by just using raw SQL queries to filter data, database engines are extremely efficient for this.

I also think there may be some confusion here as to terminology - you keep using the term "client" but I'm not sure if we're on the same page as to what that is referring to. When you say "client," what do you mean?

jhbertra Jul 8, 2022
Author

One solution could be that the client could, in principle, create a filter that just contains the difference and make a separate query for the delta.

I think utilizing deltas in some way might be a good approach indeed. This relates closely to what I was mentioning the other day about incremental computing. This may be a good application for that, as we are talking about constructing a large data structure that changes in small ways over time. That's exactly what incremental computing is designed to deal with. Is there a way we could express this overall problem in terms of deltas, beyond just the filters? I guess delta filters are like program deltas, so that's a pretty important component to consider if we go down this route.

jhbertra Jul 8, 2022
Author

The main problem I see with using an incremental approach is that it would by necessity make the chain index stateful (in the sense that it would need to manage session state for each downstream consumer), which makes things again more complicated.

bwbush · 2022-07-07T17:44:03Z

bwbush
Jul 7, 2022
Collaborator

@jhbertra, is it the case that the chain index has no Marlowe-specific knowledge aside from (1) the roster knows the opaque ContractId and (2) there is an initial epoch of interest for Marlowe? In particular, Marlowe versioning would not impact the chain index?

7 replies

jhbertra Jul 8, 2022
Author

is it the case that the chain index has no Marlowe-specific knowledge aside from (1) the roster knows the opaque ContractId and (2) there is an initial epoch of interest for Marlowe? In particular, Marlowe versioning would not impact the chain index?

@bwbush yes, that's the idea. It would get (1) from the transaction metadata and (2) from configuration

jhbertra Jul 8, 2022
Author

@bwbush I think option (1) is closest to what I'm proposing here and I think it is the best option for our purposes. Here are my reasons for not preferring the other two:

Option (2): This would result in unnecessary data storage and data transfer from the chain index. I think we can afford to build some basic awareness of Marlowe into the chain index (to the degree stated in option (1)) because our goal is not to build a general-purpose chain-index. This gives us more license to embed some domain-specific concerns in it.

Option (3): The fact that the PAB chain index was designed this way created problems with data consistency. If components that are downstream from the chain index are notified directly by the node and try to pull from the chain index, they may pull before the chain index has updated, and receive inconsistent results. This is why I think the index should be the source of this push information to the downstream components.

I would actually classify the approach as suggested as "pull-init, push-update" - the initial data is pulled and then updates are received via push. This simply makes the push logic a lot easier to implement, as it doesn't need to "replay" or "catch up" the subscriber when it connects.

bwbush Jul 8, 2022
Collaborator

@jhbertra, thanks for clarifying. I think this approach is fine.

For security, we have to be careful not to trust the metadata. The best the metadata can do is to provide assertions or hints that would be cryptographically verified. Given that situation, I propose that we define ContractId as the TxOutRef of the creation transaction for the contract because it is globally unique and cryptographically derived. (Recall that data TxOutRef = TxOutRef TxId Integer.) We cannot just use the TxId for the creation transaction because, in principle, multiple Marlowe contracts could be created in the same transaction.
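
A minimal sketch of that definition. `TxId` is a stand-in here, and `contractIdsFor` and `creationTxId` are hypothetical helpers added only to illustrate the point.

```haskell
-- Stand-in for the real transaction ID type (assumption for this sketch).
newtype TxId = TxId String deriving (Eq, Ord, Show)

data TxOutRef = TxOutRef TxId Integer deriving (Eq, Ord, Show)

-- | A contract is identified by the output that created it.
newtype ContractId = ContractId TxOutRef deriving (Eq, Ord, Show)

-- | Hypothetical helper: one creation transaction with several Marlowe
-- outputs yields several distinct contract IDs, one per output index.
contractIdsFor :: TxId -> [Integer] -> [ContractId]
contractIdsFor txId = map (ContractId . TxOutRef txId)

-- | The creation transaction can be read straight off the identifier.
creationTxId :: ContractId -> TxId
creationTxId (ContractId (TxOutRef txId _)) = txId
```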

jhbertra Jul 8, 2022
Author

Ah, yeah that would be a good choice of identifier for contracts. I had in mind that we could use the MarloweParams (in my mind, ContractId was a synonym). But this might be better, because it becomes trivial to find the creation TxOut, and thus the start of the contract's history this way.

bwbush Jul 8, 2022
Collaborator

I can create two contracts with the same MarloweParams and run them simultaneously or at different times. To use MarloweParams to derive ContractId, we'd have to enforce on-chain that MarloweParams is globally unique to each contract. Enforcing that would likely hamper some Marlowe use cases that employ novel uses of role tokens, where role tokens are pre-minted or reused across contracts.

bwbush · 2022-07-07T19:18:33Z

bwbush
Jul 7, 2022
Collaborator

Query/Filtering Requirements

Transaction Information

The Marlowe-relevant information in TxBodyContent build era is

| `Cardano.Api` | Rationale |
| --- | --- |
| `getTxId` | unique identifier for transaction |
| `txIns`, including the `PlutusScriptWitness` with the `Datum` and `Redeemer` | reconstruction of `InputContent` |
| `txOuts`, in sequence (for determining `TxOutRef`), including view pattern on `TxOutDatum` | output address, value, and Marlowe contract with its state |
| `txValidityRange` | Marlowe semantics via `TransactionInput` |
| key `1564` of `txMetadata` | Marlowe versioning and other hints, and user-facing contract metadata |
| `txMintValue` | detection of minting of roles currency |

This information is essential for reconstructing the TransactionInput of the Marlowe history.

Query Patterns

| Name | Pattern | Rationale |
| --- | --- | --- |
| Historical Marlowe Txs | provide the Marlowe `Tx` information of any transactions with `TxOut` to a given `ScriptHash`, or consuming a `TxIn` from such a transaction | reconstruction of the graph of contract history |
| Unspent Marlowe UTxOs | same as above, but only unspent `TxOut`s | determination of current contract state |
| Unspent UTxOs | unspent `TxOut`s for a given `AddressAny`, `PaymentPubKeyHash`, or `StakePubKeyHash` | balancing transactions |
| Marlowe Lookup | provide the Marlowe `Tx` information for a given `TxId` | for dereferencing `TxOutRef`s |
| Spending Lookup | provide the `TxOut` for a given `TxOutRef` | tracing non-Marlowe history (role tokens, etc.) |
| Currency Lookup | provide the `TxOut` information for any transactions involving a given `PolicyId` | discovery of contract ownership |

References

Language.Marlowe.Client.History illustrates the PAB-based discovery of contract history.
Language.Marlowe.CLI.Sync.Types provides MarloweEvent, MarloweIn, and MarloweOut types that package information from TxBodyContent.
Language.Marlowe.Runtime.History extracts Marlowe-related events from the ledger.

2 replies

bwbush Jul 7, 2022
Collaborator

@jhbertra, we'll need to verify that the above information is efficiently discoverable and extractable from the chain index.

jhbertra Jul 8, 2022
Author

Historical Marlowe Txs: This is what is indexed by the ContractRoster, though it is based on the transaction metadata instead.
Unspent Marlowe UTxOs: This can be extracted from the set of transactions for a given contract as a side-effect of reconstructing history.
Unspent UTxOs: This will need to be captured in the chain index requirements
Marlowe Lookup: This is captured by requirement 4
Spending Lookup: This will need to be captured in the chain index requirements
Currency Lookup: This will need to be captured in the chain index requirements

ghost · 2022-07-07T19:48:11Z

ghost
Jul 7, 2022

In Functional Requirements 2.: "Given the synchronization status of the chain index is not Synchronized, the response should indicate that the roster is possibly incomplete."

This could be expressed with Phantom Types if that makes sense with how the roster is to be used.

data Synchronized
data NotSynchronized

newtype ContractRoster syncStatus = ContractRoster (MonoidalMap ContractId (Set TransactionId))

Then the compiler can keep us from mixing synchronized and unsynchronized rosters accidentally. And we can define operations that will only compile with rosters in one of those states.

4 replies

jhbertra Jul 8, 2022
Author

That's a possibility; however, I don't know to what extent these two cases are actually mutually exclusive. And there is the added fact that we would need some kind of value-level witness to the phantom type in order to perform control flow and serialize it.
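
To illustrate what such a witness might look like, here is a sketch using a GADT singleton plus an existential wrapper. The names `SyncWitness`, `SomeRoster`, `witnessTag`, `fromTag` and `onlyComplete` are hypothetical, the string tags are an arbitrary choice, and a plain `Map` stands in for `MonoidalMap`.

```haskell
{-# LANGUAGE GADTs #-}

import Data.Map.Strict (Map)
import Data.Set (Set)

-- Stand-ins for the real identifiers (assumption for this sketch).
newtype ContractId = ContractId String deriving (Eq, Ord, Show)
newtype TransactionId = TransactionId String deriving (Eq, Ord, Show)

data Synchronized
data NotSynchronized

newtype ContractRoster syncStatus = ContractRoster (Map ContractId (Set TransactionId))

-- | Value-level witness for the phantom parameter.
data SyncWitness s where
  IsSynchronized :: SyncWitness Synchronized
  IsNotSynchronized :: SyncWitness NotSynchronized

-- | A roster whose status is only known at run time, e.g. after deserialization.
data SomeRoster where
  SomeRoster :: SyncWitness s -> ContractRoster s -> SomeRoster

-- | Serialization side: the witness becomes a tag on the wire.
witnessTag :: SyncWitness s -> String
witnessTag IsSynchronized = "synchronized"
witnessTag IsNotSynchronized = "not-synchronized"

-- | Deserialization side: the tag decides which phantom type is recovered.
fromTag :: String -> Map ContractId (Set TransactionId) -> Maybe SomeRoster
fromTag "synchronized" m = Just (SomeRoster IsSynchronized (ContractRoster m))
fromTag "not-synchronized" m = Just (SomeRoster IsNotSynchronized (ContractRoster m))
fromTag _ _ = Nothing

-- | Control flow: matching on the witness refines the phantom type.
onlyComplete :: SomeRoster -> Maybe (ContractRoster Synchronized)
onlyComplete (SomeRoster IsSynchronized roster) = Just roster
onlyComplete (SomeRoster IsNotSynchronized _) = Nothing
```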

bwbush Jul 8, 2022
Collaborator

Although the synchronization state does alert clients that a roster might be incomplete, it doesn't provide any guarantee of completeness because there is always a possibility that contract transactions have occurred after the node's tip. Clients might decide to hold off on processing transactions until the chain index reports synchronization, but they always need to be ready to handle lack of synchronization with the block-producing nodes.

jhbertra Jul 8, 2022
Author

True. Maybe it would be better to design this under the assumption that we are always out of sync with the blockchain. That would also get rid of the distinction between the two states. The main benefit of the distinction is that we can do certain things more optimally when we are catching up to the tip, as the system is subjected to a different stress profile, but it shouldn't change the core business logic.

bwbush Jul 8, 2022
Collaborator

Yes, clients need to defensively assume that the information they act on is stale or rolled back. There's no avoiding that, but synchronization information gives the client a hint of how up-to-date the roster information is.

jhbertra · 2022-07-08T12:58:18Z

jhbertra
Jul 8, 2022
Author

@palas @bwbush @dino-iog Can we explore a different idea for a minute? I would like to break out of the weeds of this one proposed solution and consider a different approach to clear our thinking a bit.

What if our chain index consisted of:

A set of queries that we need to perform on the chain.
A set of subscriptions that tell us when events we care about occur.

In this sense, it would be quite similar to what we already have (the PAB implementation) with a couple of crucial differences:

They come from the same source of truth, so we wouldn't have subscriptions that fire before queryable sources have updated, like we did with the PAB
It's just an API that we can consume from an IO context, not an inflexible Contract framework monad.

13 replies

jhbertra Jul 8, 2022
Author

We also would need to consider what combinators the callbacks would support. We actually need to consider simultaneous occurrence (because we work with a discrete representation of time), so there is really only one associative operator that covers combining them:

alignCallbacks :: Callback a -> Callback b -> Callback (These a b)
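
To illustrate the idea, here is a sketch under a made-up representation: `Callback` is not defined anywhere in this thread, so modelling it as "maybe fires with a value for a given block" is purely an assumption, as is the `Block` stand-in. `These` normally comes from the `these` package and is defined locally to keep the sketch self-contained.

```haskell
-- Locally defined copy of the usual These type.
data These a b = This a | That b | These a b
  deriving (Eq, Show)

-- Stand-in for a block (assumption for this sketch).
newtype Block = Block Int

-- Hypothetical representation: a callback may or may not fire for a block.
newtype Callback a = Callback { fire :: Block -> Maybe a }

-- | Combine two callbacks, keeping track of simultaneous occurrence:
-- if both fire for the same block, the result carries both values.
alignCallbacks :: Callback a -> Callback b -> Callback (These a b)
alignCallbacks (Callback f) (Callback g) = Callback $ \block ->
  case (f block, g block) of
    (Just a, Just b) -> Just (These a b)
    (Just a, Nothing) -> Just (This a)
    (Nothing, Just b) -> Just (That b)
    (Nothing, Nothing) -> Nothing
```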

jhbertra Jul 8, 2022
Author

The more I think about it, the more this seems like the best approach to take with the chain index. I don't yet have a very clear picture of how it will be consumed by downstream components, but this sort of machinery is flexible enough that it can be integrated relatively easily into various configurations.

jhbertra Jul 8, 2022
Author

That being said, I try to always assume I'm wrong, so maybe someone has a better suggestion? 😄

palas Jul 11, 2022
Maintainer

Queries sound nice in principle, but we would still need to address the synchronisation problem, and I am not sure callbacks will do the trick.

Making queries when out of sync would give us inconsistent state (and we can be out of sync because of rollbacks and because of rollforwards that may happen between queries or before you subscribe, for example). It is the whole race condition nightmare, when we make the first query the state is one, when we make the second query the state is different... Suddenly the thing we are querying about doesn't exist anymore because there was a rollback.

So I think we need a holistic approach to ensuring synchronisation at all times.

We can still use queries, we would just need to include synchronisation info, and queries should fail if synchronisation info doesn't match.

If synchronisation fails, then we need a procedure to get in sync.

jhbertra Jul 11, 2022
Author

@palas that is a good point. Let's look at a few options for dealing with this.

1. We include, as you say, a synchronization point, in all response payloads (from both queries and callbacks). This could simply be the current point that the index is synchronized up to (slot no, block no, block hash). The consumer of the index API would then include the most recent synchronization point in its requests (for queries or callbacks). This would be checked by the index and used to potentially invalidate the request. The invalidation info would include updated synchronization info, and possibly info about the rollback that invalidated the request, if applicable. It is important to note that in many (probably most) cases, a roll forward would not have to invalidate a request. The chain index could use the synchronization point provided in the query to ignore blocks beyond the point in the request, so the query would return deterministic results as long as the index was synchronized up to the synchronization point.
2. We design the chain index to be push-only, and to maintain a persistent connection per consumer which progresses linearly through the chain, including handling rollbacks. This would put more burden on the consumers to build their own indexing structures, but it simplifies the job of the chain index somewhat. In fact, if we did this, I wouldn't call it a chain index, but a chain sync client.
3. We can produce atomic queries that allow the consumer to use a DSL to specify everything they need and it is executed in a single read. This would eliminate the problem of multiple queries being out-of-sync with one another, but it would make the indexing and design of the chain index a bit more complicated, as it would need to be more flexible than if it were to offer a static set of queries.
4. We can add support for transactions to the chain index API. This would allow the consumer to start a transaction, perform queries, and complete the transaction. This has a number of downsides though - what if the consumer forgets to complete the transaction? What if the consumer crashes? It also means that the job of the chain index becomes far more complicated as it would need to manage a view of the chain per transaction.
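
To make the first option (a synchronization point carried in every request) a bit more concrete, here is a rough sketch. Everything in it is an assumption for illustration: the index's view is modelled as a list of blocks from oldest to newest, block numbers are omitted, and `visibleAt` is a hypothetical helper.

```haskell
import Data.Word (Word64)

-- Simplified synchronization point (assumption: slot number and block hash only).
data SyncPoint = SyncPoint { slotNo :: Word64, blockHash :: String }
  deriving (Eq, Show)

-- A block as seen by the index, carrying some payload to query over.
data Block a = Block { blockPoint :: SyncPoint, blockData :: a }

-- | Data visible to a request made at `requested`.
-- Right: the point is on the index's chain, so blocks beyond it are ignored
--        and later roll forwards do not change the answer.
-- Left:  the point is unknown (for example rolled back); the request is
--        invalidated and the index's current point is returned instead.
visibleAt :: SyncPoint -> [Block a] -> Either (Maybe SyncPoint) [a]
visibleAt requested chain
  | requested `elem` map blockPoint chain =
      Right (map blockData (takeWhile ((<= slotNo requested) . slotNo . blockPoint) chain))
  | otherwise = Left (currentTip chain)
  where
    currentTip [] = Nothing
    currentTip blocks = Just (blockPoint (last blocks))
```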

In terms of a procedure to get in sync, this would be outside the scope of the chain index component - the downstream components would need to handle this appropriately. But mostly, this would come down to reverting state in response to rollbacks.

I'll start a new thread and work up something in a bit more detail for option 3, as I think it is an intriguing one that could solve this problem quite neatly.

jhbertra · 2022-07-11T15:49:17Z

jhbertra
Jul 11, 2022
Author

This is a modification to the previous proposal which was motivated by @palas raising concerns about consistency / transactionality. The idea is that the downstream consumers of the chain index may need to aggregate data across multiple queries that should all come from a consistent view of the ledger. If the API of the chain index requires consumers to perform one call per query, we would need to do extra work to ensure the results are consistent with one another. Instead, what we could do is only expose a single API which accepts a composite query and runs them all against a consistent view of the data, returning the results in bulk. Here is an example of the proposed API:

-- Previous proposal
queryUtxosAtAddress :: MonadChainIndex m => AddressAny -> m [TxOut]
queryTxOut :: MonadChainIndex m => TxOutRef -> m (Maybe TxOut)
...

-- New proposal
data ChainIndexQuery a where
  GetUtxosAtAddress :: AddressAny -> ChainIndexQuery [TxOut]
  GetTxOut :: TxOutRef -> ChainIndexQuery (Maybe TxOut)
  ...
  GetProduct :: ChainIndexQuery a -> ChainIndexQuery b -> ChainIndexQuery (a, b)

queryChainIndex :: MonadChainIndex m => ChainIndexQuery a -> m a

This would take care of consistent queries, as it transforms multiple calls into a single call. It is also more efficient as it avoids multiple network round-trips.
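
A minimal sketch of how such a composite query could be interpreted against a single view of the data. The `Snapshot` type, the simplified `AddressAny` / `TxOutRef` / `TxOut` stand-ins and the pure `runQuery` are assumptions for illustration; a real implementation would run inside `MonadChainIndex` against the database.

```haskell
{-# LANGUAGE GADTs #-}

import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Simplified stand-ins for the Cardano types (assumption for this sketch).
newtype AddressAny = AddressAny String deriving (Eq, Show)
data TxOutRef = TxOutRef String Integer deriving (Eq, Ord, Show)
data TxOut = TxOut { txOutAddress :: AddressAny, txOutLovelace :: Integer }
  deriving (Eq, Show)

-- | Hypothetical immutable snapshot of the UTxO set at one chain point.
newtype Snapshot = Snapshot (Map TxOutRef TxOut)

data ChainIndexQuery a where
  GetUtxosAtAddress :: AddressAny -> ChainIndexQuery [TxOut]
  GetTxOut :: TxOutRef -> ChainIndexQuery (Maybe TxOut)
  GetProduct :: ChainIndexQuery a -> ChainIndexQuery b -> ChainIndexQuery (a, b)

-- | Both halves of a GetProduct are evaluated against the same snapshot,
-- which is where the consistency guarantee comes from.
runQuery :: Snapshot -> ChainIndexQuery a -> a
runQuery (Snapshot utxos) (GetUtxosAtAddress address) =
  filter ((== address) . txOutAddress) (Map.elems utxos)
runQuery (Snapshot utxos) (GetTxOut ref) = Map.lookup ref utxos
runQuery snapshot (GetProduct left right) =
  (runQuery snapshot left, runQuery snapshot right)
```

Because `ChainIndexQuery` is plain data with no embedded functions, it stays serializable, which is what lets the whole composite query travel to the index in one call.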

We would also need to consider synchronization between receiving a callback and performing a query as a follow-up. When awaiting an on-chain event, we will often want to perform one or more queries after we receive the callback. In the time between when the chain index calls us back and when it receives our follow-up query, there is a chance that the chain state has changed. This is especially true when the chain index is synchronizing. We can employ a similar technique to the previous one to bring the two together into the same call, but it requires a bit more work, because we need to consider how to use the data associated with event results as parameters for the queries. This will likely involve some complex type-level machinery and I want to get a sense of what people think of this approach before diving too deep into design here.

9 replies

jhbertra Jul 11, 2022
Author

There isn't an Applicative instance in here - nor even a Functor instance. These would require the serialization of functions, which is not a can of worms I want to open :)

paluh Jul 11, 2022
Maintainer

I used "Applicative" in a more abstract sense - sorry about that - I should have been more precise.

jhbertra Jul 11, 2022
Author

As in the product constructor?

paluh Jul 11, 2022
Maintainer

Yes.

jhbertra Jul 11, 2022
Author

OK, that makes sense :)

jhbertra · 2022-07-11T18:02:25Z

jhbertra
Jul 11, 2022
Author

One observation that I've made is that different components of the runtime require radically different query patterns. The history processor is predominantly push-driven, in that it needs to traverse transactions and receive notifications when the chain is updated. Transaction balancing, on the other hand, is entirely pull-driven, in that it needs to query existing UTxOs, but is unconcerned about updates to the chain.

This is part of what makes designing this component so challenging - the design needs to satisfy both sets of requirements.

0 replies

jhbertra · 2022-07-11T19:59:33Z

jhbertra
Jul 11, 2022
Author

I'd like to consider yet another approach for this component that side-steps some of the complexity we've been considering by pushing it downstream. The approach would be to make the chain index basically act like a smart proxy for the chain sync protocol.

NOTE If you are not already familiar with the chain sync protocol, I highly recommend you familiarize yourself with it first before reading this proposal. A good example of its use can be seen here

The responsibilities of the chain index would be:

Establish and maintain an upstream connection with a local Cardano node using the Chain Sync protocol to synchronize chain events.
Persist information received from the node in a local database so that information can be queried and retrieved at a later point.
Accept and maintain connections to downstream components using a filtered version of the chain sync protocol (detailed below).

This would allow downstream components to be built as state machines that pull the next information in the chain they need and do with it as they see fit.

Benefits of this approach include:

Extension of the chain sync protocol - a method of extracting chain info that has been proven and battle tested to be quite effective and flexible, working well in the presence of rollbacks.
A single, unified protocol instead of several ad-hoc APIs
Flexible design that does not encode any awareness of the requirements of downstream components.

Drawbacks of the approach include:

No straightforward query access. Downstream components that need to query data must maintain their own databases to do so. This is arguably an upside though, as the resulting databases would be ultra purpose-built, and therefore have a lot of flexibility to optimize and evolve as needed.
Chain index becomes stateful, as it must manage state for the downstream connections. The state would be transient though (i.e. not persistent), so this makes it much less complicated to manage.

Here is a draft of the protocol types, adapted from Ouroboros.Network.Protocol.ChainSync.Client:

newtype FilteredChainSyncClient (query :: Type -> Type) point tip m a = FilteredChainSyncClient
  { runFilteredChainSyncClient :: m (ClientStIdle query point tip m a) }

data ClientStIdle query point tip m a where
  SendMsgRequestNext
    :: query result -- ^ a query that extracts a result from a block
    -> ClientStNext result query point tip m a -- handler to run if the server responds immediately
    -> m (ClientStNext result query point tip m a) -- handler to run if the server tells us to wait
    -> ClientStIdle query point tip m a

  SendMsgFindIntersect
    :: [point]
    -> ClientStIntersect query point tip m a
    -> ClientStIdle query point tip m a

  SendMsgDone
    :: a
    -> ClientStIdle query point tip m a

data ClientStNext result query point tip m a = ClientStNext
  { recvMsgRollForward :: result -- the result of the query
                      -> tip -- information about the tip of the chain
                      -> FilteredChainSyncClient query point tip m a -- continuation client
  , recvMsgRollBackward :: point -- rollback point
                       -> tip -- information about the tip of the chain
                       -> FilteredChainSyncClient query point tip m a -- continuation client
  }

data ClientStIntersect query point tip m a = ClientStIntersect
  { recvMsgIntersectFound :: point -- found intersection point
                          -> tip -- information about the tip of the chain
                          -> FilteredChainSyncClient query point tip m a -- continuation client
  , recvMsgIntersectNotFound :: tip -- information about the tip of the chain
                             -> FilteredChainSyncClient query point tip m a -- continuation client
  }

The main difference between this and the regular ChainSyncClient is the query type parameter replacing the header type parameter. This query would be a GADT that allows you to request only what you are interested in receiving next. The handlers in ClientStNext would also be called under different circumstances by the server:

recvMsgRollForward would only be called if and when the query can be fulfilled (this may be many blocks after the query was sent).
recvMsgRollBackward would only be called if and when a rollback occurred such that the query was invalidated by the rollback.

The concrete argument for the query parameter, in short, allows us to specify the query DSL supported by the server.

To summarize, whereas the regular chain sync protocol allows you to define a state machine that handles each block in the chain sequentially, the filtered chain sync protocol allows you to define a similar state machine that handles a sequence of chain events which are at the discretion of the client to specify.

5 replies

jhbertra Jul 11, 2022
Author

A limited example of a type that could be used for query:

data Query a where
  -- Get the next block header
  GetBlockHeader :: Query BlockHeader
  -- Get the transaction that consumes a UTxO (warning: will never be fulfilled if the TxOutRef was previously consumed)
  GetConsumingTx :: TxOutRef -> Query Tx
  -- A query that is fulfilled when the specified slot number has been reached or passed
  AwaitSlotNoElapsed :: SlotNo -> Query ()
  ...
  -- A query that is fulfilled when both constituent queries are fulfilled
  GetBoth :: Query a -> Query b -> Query (a, b)
  -- A query that is fulfilled when either constituent query is fulfilled
  GetEither :: Query a -> Query b -> Query (These a b)

And example usage:

getFirstBlockAfterSlot slot = client
  where
    client = FilteredChainSyncClient $ pure $
      SendMsgRequestNext
        (GetBoth (AwaitSlotNoElapsed slot) GetBlockHeader)
        next
        (pure next)
    next = ClientStNext
      { recvMsgRollForward = \(_, header) _ -> FilteredChainSyncClient $ pure $ SendMsgDone header
      , recvMsgRollBackward = \_ _ -> client -- ignore the rollback and try again. Should never be called anyway because `AwaitSlotNoElapsed` would never be invalidated. Maybe there is a way to capture this in the types with Void or something similar.
      }
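For illustration, here is a rough sketch of how a server might decide when a query is fulfilled. It assumes that a query is matched against one block at a time, so `GetBoth` requires both parts to match in the same block. The `Block` record, the simplified type synonyms, the local `These` type, and the `matchBlock` and `requestNext` helpers are all hypothetical and only serve this example.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Simplified stand-ins for the real ledger types (assumptions of this sketch).
type SlotNo = Int
type TxOutRef = String
type Tx = String
type BlockHeader = String

data These a b = This a | That b | These a b
  deriving (Eq, Show)

data Block = Block
  { blockSlot :: SlotNo
  , blockHeader :: BlockHeader
  , blockSpends :: [(TxOutRef, Tx)] -- consumed output paired with the consuming transaction
  }

data Query a where
  GetBlockHeader :: Query BlockHeader
  GetConsumingTx :: TxOutRef -> Query Tx
  AwaitSlotNoElapsed :: SlotNo -> Query ()
  GetBoth :: Query a -> Query b -> Query (a, b)
  GetEither :: Query a -> Query b -> Query (These a b)

-- Try to fulfil a query from a single block.
matchBlock :: Query a -> Block -> Maybe a
matchBlock GetBlockHeader b = Just (blockHeader b)
matchBlock (GetConsumingTx ref) b = lookup ref (blockSpends b)
matchBlock (AwaitSlotNoElapsed s) b = if blockSlot b >= s then Just () else Nothing
matchBlock (GetBoth q1 q2) b = (,) <$> matchBlock q1 b <*> matchBlock q2 b
matchBlock (GetEither q1 q2) b = case (matchBlock q1 b, matchBlock q2 b) of
  (Just x, Just y) -> Just (These x y)
  (Just x, Nothing) -> Just (This x)
  (Nothing, Just y) -> Just (That y)
  (Nothing, Nothing) -> Nothing

-- Skip blocks (oldest first) until the query is fulfilled; return the result
-- together with the blocks that follow the matching one.
requestNext :: Query a -> [Block] -> Maybe (a, [Block])
requestNext _ [] = Nothing
requestNext q (b : bs) = case matchBlock q b of
  Just r -> Just (r, bs)
  Nothing -> requestNext q bs

exampleChain :: [Block]
exampleChain =
  [ Block 10 "h10" []
  , Block 20 "h20" [("utxo#0", "tx1")]
  , Block 30 "h30" []
  ]

main :: IO ()
main = do
  print (fst <$> requestNext (GetBoth (AwaitSlotNoElapsed 15) GetBlockHeader) exampleChain)
  print (fst <$> requestNext (GetConsumingTx "utxo#0") exampleChain)
```

A `Nothing` from `requestNext` corresponds to the case where the server would tell the client to wait for more blocks.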

jhbertra Jul 11, 2022
Author

Another consideration is that we may be able to replace the FindIntersect message with a query constructor. Instead of finding an intersection, and starting from there, all clients could start from genesis, but simply request the point where it wants to start from via a query.

Another way to think of queries is that they combine navigation with observation - they jump you to a different point in the chain and allow you to extract information from that point.

In this light, the ClientStNext constructor may need to have a handler for a failed query - for when you request something invalid. For example, calling GetConsumingTx with a spent TxO should be an error, because it's an invalid request. This mechanism could also be used for handling invalid intersection requests.

jhbertra Jul 11, 2022
Author

Another possible semantics for the rollBackward message is that it could be called when a rollback occurs that rolls back the point the client was at when the query was made. This would probably be simpler to implement and more consistent than working with a notion of "invalidating the query".

palas Jul 12, 2022
Maintainer

I don't 100% understand this proposal. But am I right in thinking that this approach could be done without having a database for the chain-index at all? That would be a good reason for doing it this way.

If the chain-index requires a database then we should be able to request information retroactively and efficiently from the chain-index, like the typical token you just discovered but already has history that you filtered out before. Can we do that efficiently with this approach? I understand we could with a query system.

On the other hand, we can make the query system work in a consistent way without needing to make all the queries at the same time or ensuring that queries and the waiting happen all transactionally. We just need to have every call require the client to be synchronised, and having a way to catch-up and a way to "wait for something new to happen". Then we can just retry until we are caught up and then we block while "waiting for something to happen" (a bit like STM).

And, as part of catching-up, clients would need to keep track of what information belongs to what block (in order to invalidate it if there is a rollback). But I understand clients would need to do that with this approach too.

And I am guessing the only thing we need to "wait to happen" would be for UTXOs to get spent and for particular addresses to receive new UTXOs. Am I missing anything? So a query system shouldn't be complicated. And it would be stateless, which would just discard a whole type of errors and make debugging really easy.

jhbertra Jul 12, 2022
Author

> I don't 100% understand this proposal. But am I right in thinking that this approach could be done without having a database for the chain-index at all? That would be a good reason for doing it this way.

No, the chain index would still require a database for this to work. The idea is that new clients can connect at any time and start traversing the blockchain, so the index would need to remember what it had received previously. There is no getting around this. When we receive data from the node, we don't know if we will need it in the future, so we have to store it for later use.

> If the chain-index requires a database then we should be able to request information retroactively and efficiently from the chain-index, like the typical token you just discovered but already has history that you filtered out before. Can we do that efficiently with this approach? I understand we could with a query system.

There is certainly no reason a call-and-response query protocol couldn't coexist with a synchronization protocol. Another option is to have the downstream components maintain private caches / data stores to optimize performance if and when they are needed.

In the example you gave, the approach I would take is have a FilteredChainSyncClient(A) which is watching for role tokens in a wallet (or collection of wallets perhaps, or using some other filtering strategy). When it discovers a new role token, it looks up the contract(s) associated with that role token and spawns a FilteredChainSyncClient (B) for that contract which traverses the history of the new contract. This corresponds to the roles of the wallet companion (A) and follower (B) apps in the current PAB design.

This could be made more efficient by adding bulk Query constructors like GetTxOutsToAddress which would load all transaction outputs to a given address. Then the "catch up" or "initial load" phase consists of a single network round trip rather than multiple. However, I would consider this an optimization. I think a better approach for a first implementation would be to prioritize stability, fault tolerance, and correctness over performance, provided we plan a path towards improved performance and keep those options open for the future.

> On the other hand, we can make the query system work in a consistent way without needing to make all the queries at the same time or ensuring that queries and the waiting happen all transactionally. We just need to have every call require the client to be synchronised, and having a way to catch-up and a way to "wait for something new to happen". Then we can just retry until we are caught up and then we block while "waiting for something to happen" (a bit like STM).

Yup, that's another option and one which we've discussed as well. However, I think "requiring the client to be synchronized", "having a way to catch up" and a way to "wait for something new to happen" will not be as straightforward as it sounds. This approach deals with all three of those concerns in a systematic way. Again, I think the main benefit of an approach like this is that it doesn't require us to be very inventive. These issues are ones that have been solved already by the Node API - the chain sync protocol was designed to handle these problems, and it does a very good job of doing so. By modelling our solution after it, we benefit from that existing knowledge and avoid having to come up with a novel solution to these problems.

> And, as part of catching-up, clients would need to keep track of what information belongs to what block (in order to invalidate it if there is a rollback). But I understand clients would need to do that with this approach too.

Yes, that is true. They would need to do that no matter what approach we took. But this isn't complicated to do, as I discovered when implementing the initial POC.

> And I am guessing the only thing we need to "wait to happen" would be for UTXOs to get spent and for particular addresses to receive new UTXOs. Am I missing anything? So a query system shouldn't be complicated. And it would be stateless, which would just discard a whole type of errors and make debugging really easy.

I can't think of anything else personally, though there may be subtle variations of these events that it would be helpful to distinguish between. As for plain queries being stateless, they're not really stateless - they just delegate state management downstream. I can see cases where that makes the architecture less complex (when the state management is specific to the given downstream component), as well as cases where it makes it more complex (when the state management is common to all clients of the chain index). What I'm attempting to do here is pull only the common state management tasks (keeping track of the client's "current position" in the chain and detecting rollbacks) into the chain index.

bwbush · 2022-07-12T16:07:25Z

bwbush
Jul 12, 2022
Collaborator

jhbertra Jul 12, 2022
Author

@bwbush I recommend you read my summary post below, it answers a number of these questions.

jhbertra Jul 12, 2022
Author

Checked items in the checklist as best as I could, some are only under consideration in a subset of the proposed options.

jhbertra · 2022-07-12T16:16:25Z

jhbertra
Jul 12, 2022
Author

At this point, we've discussed four separate proposals, which I'd like to summarize here:

1. Roster Sync

The Chain Index pushes roster changes to clients. The Roster is a map that shows A) which Marlowe contracts exist and B) what transactions are associated with each Marlowe contract. The Roster only includes identifiers, both for contracts and transactions. The client can control which contracts the Roster contains via a ContractFilter DSL.

The Chain Index also provides a bulk download API for transactions. When the client sees new transaction IDs in its Roster, it can use this API to download the transactions.

Benefits

Efficient pre-processing. The set of contracts and transactions for each contract is known early in the pipeline, making the job of downstream components simpler and more efficient.
Synchronization is trivial. The client always has a complete, compact skeleton of the data it needs.
Initialization is straightforward. The client starts by downloading the full Roster, followed by a stream of updates.

Drawbacks

Inflexible. The Chain Index is aware of basic Marlowe concepts and uses this knowledge to bundle information in a way that assumes the client will use it in one particular way. As a result, there is a high degree of coupling between the clients and the Chain Index.
Multiple network round-trips. The client must perform at least two network round trips to download any transactions: one to download the Roster and another to download the transactions. Even worse, the transaction IDs will be transferred three times over the network: first in the Roster, second in the transaction download request, and third in the body of the transactions. Transferring the IDs multiple times is potentially wasteful if the number of transactions is large.
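As a rough sketch of the client side of option 1, the following computes which transaction IDs appear in an updated Roster but not in the one the client already holds. These are the IDs it would then pass to the bulk download API. The plain `Map`-based `Roster` synonym and the `newTransactions` helper are simplifications assumed for this example.

```haskell
module Main where

import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Simplified stand-ins for the roster types (assumptions of this sketch).
type ContractId = String
type TransactionId = String
type Roster = Map.Map ContractId (Set.Set TransactionId)

-- Transaction IDs present in the updated roster but missing from the old one.
newTransactions :: Roster -> Roster -> Set.Set TransactionId
newTransactions old new =
  Set.unions (Map.elems (Map.differenceWith minus new old))
  where
    -- For contracts known to both rosters, keep only the IDs that were added.
    minus newTxs oldTxs =
      let added = newTxs `Set.difference` oldTxs
       in if Set.null added then Nothing else Just added

exampleOld, exampleNew :: Roster
exampleOld = Map.fromList [("c1", Set.fromList ["t1"])]
exampleNew = Map.fromList [("c1", Set.fromList ["t1", "t2"]), ("c2", Set.fromList ["t3"])]

main :: IO ()
main = print (newTransactions exampleOld exampleNew)
```

Contracts that appear only in the new roster contribute all of their transaction IDs, which matches the "download the full Roster, then follow updates" flow described above.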

2. Queries and Callbacks

The Chain Index exposes a set of queries and a set of event callbacks via its API. Clients use queries to load data from the blockchain and event callbacks to wait for new transactions. Clients receive a synchronization token from event callbacks. This token describes the client's "position" in the blockchain as of the event occurrence. The client then provides this token to subsequent queries and event requests. The Chain Index uses the token to give the client a consistent view of the blockchain. It also detects when the client needs to be informed of a rollback.

Benefits

Statelessness. The Chain Index does not need to maintain any state related to each client. Instead, the client threads its state (its position in the chain) through the queries. The only state the Chain Index needs to maintain relates to pending event callbacks.
Simplicity. Queries and event callbacks are well-understood, familiar patterns.
Flexibility. The Chain Index does not enforce a specific usage pattern. Clients are free to call queries or event callbacks purely at their discretion.

Drawbacks

Client handles state. Clients will replicate standard state management logic. They will need to maintain a synchronization token, thread it through API calls, and manage potential rollbacks after each query or event callback request.
More validation. The Chain Index will need to validate each request and token it receives and respond with appropriate errors if they are invalid. Error handling will propagate to the clients and add even more boilerplate.
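The token threading in option 2 could look roughly like the sketch below. It assumes that a token is simply a slot number plus a block hash and that the index holds its chain as a list with the newest block first. `SyncToken`, `QueryError`, and `queryAt` are hypothetical names, and the `RolledBack` payload is only a resume candidate (the newest block the index knows that precedes the token), not a proven intersection.

```haskell
module Main where

type SlotNo = Int

-- Hypothetical synchronization token: the client's position in the chain.
data SyncToken = SyncToken { tokenSlot :: SlotNo, tokenHash :: String }
  deriving (Eq, Show)

data QueryError
  = RolledBack SyncToken -- token is not on the chain; newest known block before it
  | UnknownToken         -- nothing on the chain precedes the token
  deriving (Eq, Show)

-- The index's view of the chain, newest block first.
type Chain = [SyncToken]

-- Run a query against the chain as it stood at the client's token.
queryAt :: Chain -> SyncToken -> (Chain -> a) -> Either QueryError a
queryAt chain token query
  | token `elem` chain = Right (query (dropWhile (/= token) chain))
  | otherwise = case dropWhile ((>= tokenSlot token) . tokenSlot) chain of
      (p : _) -> Left (RolledBack p)
      [] -> Left UnknownToken

exampleChain :: Chain
exampleChain = [SyncToken 30 "c", SyncToken 20 "b", SyncToken 10 "a"]

main :: IO ()
main = do
  print (queryAt exampleChain (SyncToken 20 "b") length)
  print (queryAt exampleChain (SyncToken 30 "x") length)
  print (queryAt exampleChain (SyncToken 5 "z") length)
```

Every call site has to handle the `Left` cases, which is the boilerplate the drawbacks above refer to.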

3. Aggregated Queries and Callbacks

This option is a variation of the second one. Clients send batch queries to the Chain Index and receive batch results instead of threading a synchronization token through multiple queries. Continuation queries accompany event callbacks so the Chain Index can ensure consistency between event results and query results.

Benefits

Transactionality. Each request is executed in a single batch, allowing the Chain Index to guarantee transactionality.
Statelessness. The protocol has no state on the client or the server.
Network efficiency. The client only performs one network round-trip instead of multiple.

Drawbacks

Complexity. The API becomes more complex due to the need to express composite queries. The addition of continuation queries to events significantly exacerbates this increase in complexity.

4. Extended Chain Sync

Downstream clients communicate with the Chain Index via a modified version of the Chain Sync protocol. The Filtered Chain Sync protocol extends the base Chain Sync protocol by adding a query parameter to the RequestNext message. The addition of the query offers two benefits compared with the base Chain Sync protocol:

Clients can skip blocks that are of no interest to them. While the base chain sync protocol allows this, clients are limited to specifying known intersection points to which to skip. Clients of the Filtered Chain Sync protocol can ask to skip to a point in the chain at which a query is satisfied.
Clients only receive what they need. The base Chain Sync protocol sends an entire serialized block in response to each RequestNext message. The Filtered Chain Sync protocol only sends the information requested by the query, reducing data transmission.

Benefits

Harmony. This design fits in well with the rest of the Cardano ecosystem because it is similar to the existing chain sync protocol.
Robustness. The protocol design promotes writing clients as state machines that explicitly handle common patterns such as synchronization, rollbacks, and graph traversal.
Unified API. All clients communicate with the chain index over a single general-purpose API, leading to extremely low coupling and high cohesion.
Battle-tested. Chain Sync is a proven approach to synchronizing downstream peers in the presence of rollbacks.

Drawbacks

Query support. Because clients are state machines, the Chain Index does not directly support call-and-response query usage. Such usages must be encoded as trivial state machines. Note that this could be addressed by simply adding a direct query API if it is needed.
Statefulness. Because each client is a state machine that tracks its position in the chain, the protocol is inherently stateful. Whether the Chain Index maintains this state or passes it through the protocol messages, the presence of state increases complexity and reduces flexibility.

4 replies

bwbush Jul 12, 2022
Collaborator

Re 3, can you clarify "continuation query"?

palas Jul 12, 2022
Maintainer

Here is the framework I am using for evaluating these too:

Features
- Tracking of changes to UTXOs and addresses (updates feed / rollforward)
- Querying of info about transactions that happened in the past (historical info)
- Rollback handling information (how the client can find out what information to throw out because it was rolled back)
Costs
- Storage and memory demands
- Efficiency on answering each of the requirements
- Simplicity of implementation

jhbertra Jul 12, 2022
Author

I'm more or less using those criteria too. In addition, I'm carefully considering flexibility of design.

jhbertra Jul 12, 2022
Author

Some notes:

Complexity - don't push too much downstream or eat too much
Minimal - don't give it too much responsibility
Adequate - it should do enough to be worth it
Flexibility - changes to the clients should not be coupled with changes to the index

jhbertra · 2022-07-12T17:47:47Z

jhbertra
Jul 12, 2022
Author

Recap of discussion from video call:

We unanimously rejected options one and three
We were in agreement that options two and four were the best candidates
We had a preference for option four because:
- It offers a more structured and systematic approach compared to option two
- It lends itself well to an incremental approach that will allow us to progress to other components more quickly
@jhbertra will proceed with fleshing out the architecture and design of option four
The team will continue to scrutinize and examine the idea as it is being designed
- Suggestion by @jhbertra: ask a lot of "what if" questions.
We will want to estimate both the design work and the implementation work using planning poker.

0 replies

palas · 2022-07-12T18:12:11Z

palas
Jul 12, 2022
Maintainer

I am going to write a couple of what-if questions here, like we talked about, to better understand how option 4 works and to test it. I will write them as separate replies to make them easier to discuss:

What if a client connects to the chain-index with a tip that has been rolled back, and that branch has never been explored by the current chain-index? Maybe it was explored by another chain-index that the client connected to before, but for some reason the current chain-index never went through that branch. Does this approach include a way of finding common ground, that is, a common root of both branches?

2 replies

jhbertra Jul 12, 2022
Author

Great question! This is also something that consumers of the ChainSync protocol have to consider. This is why the mechanism for finding an intersection point accepts a list of points instead of a single point. The node will respond with the best candidate it is able to find. Another option is to handle the intersectionNotFound message by trying a different intersection. You can also handle the intersectionFound message by trying to find a better intersection iteratively.

The client is generally expected to start from the latest common intersection point (if the client has points that are beyond this point, they should be discarded as they are out of date anyway). The degenerate case of this is that the latest common intersection point is the genesis block, in which case the client would have to throw everything away and start from genesis. However, this rarely happens in practice thanks to the security parameter - which is the number of blocks past which you are guaranteed by the consensus protocol not to encounter a rollback.

Blocks older than the security parameter can safely be assumed to be permanent - they are guaranteed to exist in the chain databases of all up-to-date and well-behaved nodes forevermore. If a client were to connect to a node, the security parameter defines the maximum reasonable distance before the previous tip to find a common intersection. In fact, if the node still says it can't find an intersection beyond there, the best thing the client can do is abort the connection and exit with an error, because the node is either not up-to-date, or it is playing against the rules, and should not be interacted with.
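As an illustration of offering a list of points, here is one possible way for a client to pick intersection candidates from its local chain without looking further back than the security parameter. The doubling offsets and the `candidateOffsets` and `candidatePoints` helpers are assumptions of this sketch, not something the protocol prescribes.

```haskell
module Main where

-- Offsets from the client's tip: 0, 1, 2, 4, 8, ... capped at the security parameter k.
candidateOffsets :: Int -> [Int]
candidateOffsets k = takeWhile (<= k) (0 : iterate (* 2) 1)

-- Select the points at those offsets from a local chain given newest first.
candidatePoints :: Int -> [point] -> [point]
candidatePoints k chain =
  [p | (i, p) <- zip [0 ..] chain, i `elem` candidateOffsets k]

main :: IO ()
main = print (candidatePoints 10 [100, 99 .. 0 :: Int])
```

Dense sampling near the tip finds shallow rollbacks precisely, while the sparse older points keep the request small. If none of the offered points is found, the client is in the error case described above.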

palas Jul 12, 2022
Maintainer

Thanks, that is a relief. I hadn't thought of the security parameter, which ensures we don't traverse the whole chain.

palas · 2022-07-12T18:12:13Z

palas
Jul 12, 2022
Maintainer

What if a client is waiting for the next event from the chain (e.g. a UTXO to be spent) and the user of the client decides to follow a new UTXO, or to stop following the UTXO? Is there a nice way for the client to cancel the wait and modify the query?

3 replies

jhbertra Jul 12, 2022
Author

Good question. At least in the base chain sync protocol, the client does not have agency when awaiting a response to a RequestNext message. It only regains agency when it receives the response.

There is probably an argument for interrupting or cancelling requests, though none come to mind immediately. One of the problems with this that I can see is that it involves forking the client's thread. In your question, you say the client is "waiting for the next event from the chain." You also say it concurrently "decides to follow a new UTxO or stop following the UTxO," implying that the client is running more than one thread. This could be supported by having the constructor for SendMsgRequestNext accept an m (async (ClientStNext m a)) instead, or by using continuation passing instead, but this would significantly increase the complexity of the protocol.

So my response to this question would be: is there a clear case where you anticipate this is necessary? If not, I would say it's probably not worth adding to the protocol, as it would incur a complexity penalty that wouldn't pay off in most cases.

palas Jul 12, 2022
Maintainer

I think probably every client will have at least something else that it wants to communicate with, often asynchronously. If the communication is bidirectional then messages can arrive at any time and they typically have to be processed when they arrive, which requires another thread or some kind of select (it is the same logic as the When in Marlowe). Typically there will be commands from the user that come through another channel, like the user interface, for example for shutting the system down gracefully. There is always the option of killing the thread, but it is hard to know the side effects of that. Not the nicest mechanism.

jhbertra Jul 13, 2022
Author

I see two separate concerns here:

Gracefully terminating a running FilteredChainSyncClient
Asynchronous communication to and from a running FilteredChainSyncClient

Regarding graceful termination, we care about gracefully terminating the transport channel that we have opened and releasing resources. The monad used to run the client is the appropriate place to deal with this concern, not the protocol. The way to run a typed protocol is to provide a Channel which provides implementations for sending and receiving messages. We can define a channel that can be closed by the driver code that runs the protocol. The m parameter in the types represents the monad used to run the client.

We can use STM primitives like TQueue for async communication. For example, a FilteredChainSyncClient will typically send a message on an outbound channel every time it gets a roll forward or backward message. It can do so by writing to a TQueue to which it holds a reference. For another example, a FilteredChainSyncClient may need to receive messages on an inbound channel before sending a QueryNext message or when the server tells it that it must wait for a result. Again, it could query a TQueue or other STM construct to check for any inbound messages.
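A minimal sketch of the TQueue idea, using `Control.Concurrent.STM` from the `stm` package. The `ChainEvent` and `Command` types and the `emit`, `nextCommand`, and `drain` helpers are made up for this example. In a real client, `emit` would be called from the roll forward and roll backward handlers, and `nextCommand` before sending the next request.

```haskell
module Main where

import Control.Concurrent.STM
  (TQueue, atomically, newTQueueIO, tryReadTQueue, writeTQueue)

-- Hypothetical message types for the outbound and inbound channels.
data ChainEvent = RolledForward Int | RolledBackward Int
  deriving (Eq, Show)

data Command = Shutdown | Follow String
  deriving (Eq, Show)

-- Outbound: publish an event without blocking the protocol.
emit :: TQueue ChainEvent -> ChainEvent -> IO ()
emit q = atomically . writeTQueue q

-- Inbound: non-blocking check for a pending command.
nextCommand :: TQueue Command -> IO (Maybe Command)
nextCommand = atomically . tryReadTQueue

-- Read everything currently in a queue (used here only to inspect the result).
drain :: TQueue a -> IO [a]
drain q = atomically go
  where
    go = do
      mx <- tryReadTQueue q
      case mx of
        Nothing -> pure []
        Just x -> (x :) <$> go

main :: IO ()
main = do
  events <- newTQueueIO
  commands <- newTQueueIO
  atomically (writeTQueue commands (Follow "utxo#0"))
  emit events (RolledForward 1)
  emit events (RolledBackward 0)
  nextCommand commands >>= print
  nextCommand commands >>= print
  drain events >>= print
```

Since `tryReadTQueue` never blocks, the client can check for a `Shutdown` or `Follow` command at each point where it has agency without stalling the protocol.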

Regarding cancelling an in-flight protocol message, this is explicitly impossible when using typed-protocols (the library used to implement Chain Sync and all other Ouroboros protocols). To explain why, it is necessary to understand how typed-protocols works. A protocol definition consists of the following:

The set of states in the protocol
The set of state transitions in the protocol
The subset of states in which the client has agency
The subset of states in which the server has agency
The subset of states in which no one has agency (i.e. terminal states)
An exclusion lemma which proves that there are no states in which both client and server have agency
An exclusion lemma which proves that there are no states in which both nobody and client have agency
An exclusion lemma which proves that there are no states in which both nobody and server have agency

The concept of agency is explicitly declared and enforced by the library. When the client sends a message (e.g. QueryNext) to the server, it releases agency to the server. The server gives agency back to the client by sending it a message. When the server has agency, the client cannot send another message saying, "never mind, I've changed my mind, do this instead."

You could, however, design the protocol so that the client polls the server for a response. Then the client could maintain agency while awaiting the response and decide to cancel it. Something like this:

newtype FilteredChainSyncClient (query :: Type -> Type) point tip m a = FilteredChainSyncClient
  { runFilteredChainSyncClient :: m (ClientStIdle query point tip m a) }

data ClientStIdle query point tip m a where
  -- Send a query to the server
  SendMsgQueryNext
    :: query result -- ^ a query that extracts a result from a block
    -> ClientStNext result query point tip m a -- handler to run if the server responds immediately
    -> m (ClientStWait result query point tip m a) -- handler to run if the server tells us to wait
    -> ClientStIdle query point tip m a

  -- Terminate
  SendMsgDone
    :: a
    -> ClientStIdle query point tip m a

data ClientStNext result query point tip m a = ClientStNext
  { recvMsgRollForward :: result -- the result of the query
                      -> tip -- information about the tip of the chain
                      -> FilteredChainSyncClient query point tip m a -- continuation client
  , recvMsgRollBackward :: point -- rollback point
                       -> tip -- information about the tip of the chain
                       -> FilteredChainSyncClient query point tip m a -- continuation client
  }


data ClientStWait result query point tip m a where
  -- Poll for the query result
  SendMsgPollResult
    :: ClientStNext result query point tip m a -- handler to run if a result is available
    -> m (ClientStWait result query point tip m a) -- handler to run if the server tells us to wait
    -> ClientStWait result query point tip m a

  -- Cancel the query and resume with a different client
  SendMsgCancel
    :: FilteredChainSyncClient query point tip m a -- continuation client
    -> ClientStWait result query point tip m a

This could be a viable option to consider. Polling lets the client keep agency and cancel a pending query, but it costs extra round trips while the client waits for a result.
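As a rough illustration of how a client might use the wait state, the sketch below restates the client types compactly (so that it compiles on its own) and adds two hypothetical helpers that are not part of the proposal: `pollAtMost`, which polls a bounded number of times and then cancels, and `pollsBeforeCancel`, which counts the polls a client would send if the server never had a result.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Functor.Identity (Identity, runIdentity)
import Data.Kind (Type)

-- Compact restatement of the polling client types.
newtype FilteredChainSyncClient (query :: Type -> Type) point tip m a = FilteredChainSyncClient
  { runFilteredChainSyncClient :: m (ClientStIdle query point tip m a) }

data ClientStIdle query point tip m a where
  SendMsgQueryNext
    :: query result
    -> ClientStNext result query point tip m a
    -> m (ClientStWait result query point tip m a)
    -> ClientStIdle query point tip m a
  SendMsgDone :: a -> ClientStIdle query point tip m a

data ClientStNext result query point tip m a = ClientStNext
  { recvMsgRollForward  :: result -> tip -> FilteredChainSyncClient query point tip m a
  , recvMsgRollBackward :: point -> tip -> FilteredChainSyncClient query point tip m a
  }

data ClientStWait result query point tip m a where
  SendMsgPollResult
    :: ClientStNext result query point tip m a
    -> m (ClientStWait result query point tip m a)
    -> ClientStWait result query point tip m a
  SendMsgCancel
    :: FilteredChainSyncClient query point tip m a
    -> ClientStWait result query point tip m a

-- Hypothetical helper: poll up to n times, then cancel and resume with onCancel.
pollAtMost
  :: Applicative m
  => Int
  -> ClientStNext result query point tip m a
  -> FilteredChainSyncClient query point tip m a
  -> ClientStWait result query point tip m a
pollAtMost n next onCancel
  | n <= 0    = SendMsgCancel onCancel
  | otherwise = SendMsgPollResult next (pure (pollAtMost (n - 1) next onCancel))

-- Hypothetical helper: number of polls sent if the server always says "wait".
pollsBeforeCancel :: ClientStWait result query point tip Identity a -> Int
pollsBeforeCancel (SendMsgCancel _)          = 0
pollsBeforeCancel (SendMsgPollResult _ wait) = 1 + pollsBeforeCancel (runIdentity wait)
```

A real client would more likely base the decision to cancel on elapsed time or on an external event than on a fixed poll count; the counter only keeps the sketch pure.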

jhbertra · 2022-07-12T20:21:30Z

jhbertra
Jul 12, 2022
Author

Marlowe Runtime Chain Index Entity-Relationship Diagram v1

erDiagram
  Block ||--o{ Transaction : contains
  Transaction ||--|{ TransactionInput : receives
  Transaction ||--|{ TransactionOutput : produces
  TransactionOutput ||--o{ AssetTransfer : transfers
  TransactionOutput ||--o| TransactionInput : feeds
  TransactionInput ||--o| PlutusScriptWitness : provides
  AssetClass ||--o{ AssetTransfer : classifies
  AssetClass ||--o{ AssetMint : classifies
  Transaction ||--o{ AssetMint : mints
  Block {
    bytea headerHash PK
    bigint slotNo "INDEX"
    bigint blockNo
  }
  Transaction {
    bytea id PK
    bytea headerHash FK
    bigint validityLowerBound "NULL"
    bigint validityUpperBound "NULL"
    bytea metadataKey1564 "NULL"
  }
  TransactionOutput {
    uuid id PK
    bytea transactionId FK
    int index "INDEX"
    bytea datum "NULL"
    bytea address "INDEX"
    bigint lovelace
  }
  TransactionInput {
    uuid id PK
    uuid transactionOutputId FK
    bytea transactionId FK
  }
  PlutusScriptWitness {
    uuid transactionInputId
    bytea datum
    bytea redeemer
  }
  AssetClass {
    uuid id PK
    bytea policyId FK
    text name "INDEX"
  }
  AssetTransfer {
    uuid transactionOutputId FK
    uuid assetClassId FK
    bigint quantity
  }
  AssetMint {
    bytea transactionId FK
    uuid assetClassId FK
    bigint quantity
  }
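As a reading aid for the `feeds` relationship: an output that no TransactionInput row points at is unspent, so the current UTxO set falls out as an anti-join between the two tables. The sketch below shows that idea in Haskell under stated assumptions: the records are cut down to the fields needed, `Int` stands in for the `uuid` keys, and `unspentOutputs` is a hypothetical name rather than part of the design.

```haskell
import qualified Data.Set as Set

-- Cut-down rows; Int stands in for the uuid primary keys (assumption).
data TransactionOutput = TransactionOutput
  { outputId :: Int
  , lovelace :: Integer
  } deriving (Eq, Show)

data TransactionInput = TransactionInput
  { inputId             :: Int
  , transactionOutputId :: Int -- the output this input spends
  } deriving (Eq, Show)

-- Outputs that feed no input, i.e. the rows an anti-join would keep.
unspentOutputs :: [TransactionOutput] -> [TransactionInput] -> [TransactionOutput]
unspentOutputs outputs inputs =
  filter (\o -> outputId o `Set.notMember` spent) outputs
  where
    spent = Set.fromList (map transactionOutputId inputs)
```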

6 replies

bwbush Jul 14, 2022
Collaborator

Also, I don't think we need to precisely define secondary indices at this point. They can be added to support the query patterns and query optimization.

jhbertra Jul 14, 2022
Author

> 1. In TransactionOutput the Datum might just be a DatumHash. Technically, the UTxO per se only ever includes a DatumHash, but Marlowe (always) and other dApps (sometimes) include the full Datum in the transaction body. In Babbage, there will be the third option of including a reference to a datum.

What would you suggest as a schema change for this then?

> 2. Similarly for PlutusScriptWitness, a reference datum might be possible in Babbage instead of a datum. For this schema, I'd recommend dereferencing that pointer and just storing the full datum that it references. Anyway, we need to study the relevant CIPs and Plutus.V2.

Confirming no action is needed on this point?

> 3. In this schema, is it the case that a minting transaction will be represented with the quantity minted in both AssetTransfer and AssetMint? This will indicate that the token was minted and who received the minted token(s).

Correct, that is the idea.

> 4. Couldn't we just collapse TransactionInput and PlutusScriptWitness into a single table, just using nullable/Maybe? Why create a table for this 1-{0,1} relationship?

We could, but can a PlutusScriptWitness contain a datum without a redeemer and vice-versa? My point is that (a, Maybe (b, c)) is not isomorphic to (a, Maybe b, Maybe c).

> 5. There are two different (Cardano vs Plutus) naming conventions for native tokens, AssetClass vs AssetId and PolicyId vs CurrencySymbol, both with TokenName. I'm not sure which is best, but I lean towards the Cardano convention.

Is this correct? Cardano: AssetId, PolicyId; Plutus: AssetClass, CurrencySymbol. I agree on sticking with Cardano conventions.

> 6. We'll need some composite secondary indices like (policyid, name), (transactionid, index).

As mentioned, this doesn't need to be captured here.

> 7. In terms of validation, token quantity minted can be positive (minting) or negative (burning), but never zero.

I can include this as a note. Though this is presumably enforced by the ledger rules already - so we shouldn't have to validate these invariants ourselves. We just accept whatever the node tells us.

> 8. I strongly recommend comparing this schema to the Babbage schema for cardano-db-sync: https://github.com/input-output-hk/cardano-db-sync/blob/master/cardano-db/src/Cardano/Db/Schema.hs.

> 9. Similarly, here is the schema for cardano-graphql: https://input-output-hk.github.io/cardano-graphql/.

I'll take a look, thanks.

> 10. Is the plan to store all of this information from genesis onward, so that when a new client asks to start from a particular point, the chain index will perform the appropriate queries and stream the result (assuming that point is in the past and not at the tip)? I guess I'm asking if the chain index would only walk the chain once, extracting results into these tables as it goes, versus walk the chain for each client.

Yup, that's the plan 🙂
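To spell out the point about (a, Maybe (b, c)) versus (a, Maybe b, Maybe c): the sketch below, with hypothetical names `collapse` and `expand`, shows that every separate-table row maps onto a collapsed row, but the collapsed shape also admits half-filled rows (a datum without a redeemer, or the reverse) that have no counterpart in the other direction.

```haskell
-- Separate-table shape: the witness is either absent or complete.
type Joined a b c = (a, Maybe (b, c))

-- Single-table shape with two independent nullable columns.
type Collapsed a b c = (a, Maybe b, Maybe c)

-- Every joined row has a collapsed representation.
collapse :: Joined a b c -> Collapsed a b c
collapse (a, Nothing)     = (a, Nothing, Nothing)
collapse (a, Just (b, c)) = (a, Just b, Just c)

-- The inverse is only partial: half-filled rows have no joined counterpart.
expand :: Collapsed a b c -> Maybe (Joined a b c)
expand (a, Nothing, Nothing) = Just (a, Nothing)
expand (a, Just b, Just c)   = Just (a, Just (b, c))
expand _                     = Nothing
```

If the two columns were collapsed into one table anyway, a table-level check constraint requiring both to be null or both to be non-null would rule out the rows on which `expand` returns `Nothing`.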

jhbertra Jul 14, 2022
Author

Marlowe Runtime Chain Index Entity-Relationship Diagram v2

erDiagram
  Block ||--o{ Tx : contains
  Tx ||--|{ TxIn : consumes
  Tx ||--|{ TxOut : produces
  TxOut ||--o{ AssetOut : transfers
  TxOut ||--o| TxIn : feeds
  Asset ||--o{ AssetOut : classifies
  Asset ||--o{ AssetMint : classifies
  Tx ||--o{ AssetMint : mints
  Block {
    serial id PK
    bytea hash
    bigint slotNo
    bigint blockNo
  }
  Tx {
    serial id PK
    int blockId FK
    bytea hash
    bigint validityLowerBound "NULL"
    bigint validityUpperBound "NULL"
    bytea metadataKey1564 "NULL"
  }
  TxOut {
    serial id PK
    int txId FK
    bigint index
    bytea address
    bigint lovelace
    bytea datumHash "NULL"
    bytea datumBytes "NULL"
  }
  TxIn {
    serial id PK
    int txId FK
    int txOutId FK
    bytea redeemerDatumBytes "NULL"
  }
  Asset {
    serial id PK
    bytea policyId
    text name
  }
  AssetOut {
    serial id PK
    int txOutId FK
    int assetId FK
    bigint quantity
  }
  AssetMint {
    serial id PK
    int txId FK
    int assetId FK
    bigint quantity
  }

@bwbush does this address your comments?

bwbush Jul 15, 2022
Collaborator

@jhbertra Yes!

Re 1, the latest diagram, with nullable datumHash and datumBytes, works fine for this. In principle, it is Maybe (Either Datum DatumHash), but I don't think we should complexify the table design; we could always add a "check" constraint to the table if we really cared. Our slack discussion supersedes this comment.

Re 2, 3, 6, 7, 8, 9, 10, no changes advised.

Re 4, as we discussed on slack, datum+redeemer in txin.

Re 5, correct.

jhbertra Jul 15, 2022
Author

Marlowe Runtime Chain Index Entity-Relationship Diagram v3 (extracts Datum table)

erDiagram
  Block ||--o{ Tx : contains
  Tx ||--|{ TxIn : consumes
  Tx ||--|{ TxOut : produces
  TxOut ||--o| Datum : contains
  TxOut ||--o{ AssetOut : transfers
  TxOut ||--o| TxIn : feeds
  Asset ||--o{ AssetOut : classifies
  Asset ||--o{ AssetMint : classifies
  Tx ||--o{ AssetMint : mints
  Block {
    serial id PK
    bytea hash
    bigint slotNo
    bigint blockNo
  }
  Tx {
    serial id PK
    int blockId FK
    bytea hash
    bigint validityLowerBound "NULL"
    bigint validityUpperBound "NULL"
    bytea metadataKey1564 "NULL"
  }
  TxOut {
    serial id PK
    int txId FK
    int datumId FK "NULL"
    bigint index
    bytea address
    bigint lovelace
  }
  Datum {
    serial id PK
    bytea hash
    bytea bytes "NULL"
  }
  TxIn {
    serial id PK
    int txId FK
    int txOutId FK
    bytea redeemerDatumBytes "NULL"
  }
  Asset {
    serial id PK
    bytea policyId
    text name
  }
  AssetOut {
    serial id PK
    int txOutId FK
    int assetId FK
    bigint quantity
  }
  AssetMint {
    serial id PK
    int txId FK
    int assetId FK
    bigint quantity
  }
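One possible reading of the nullable `bytes` column, sketched in Haskell: a datum can first be seen by hash only and have its bytes filled in when they later appear in a transaction. This is an assumption about how the table would be populated, not something settled above; `recordDatum` is a hypothetical name, `String` stands in for `bytea`, and an in-memory `Map` stands in for the table.

```haskell
import Control.Applicative ((<|>))
import qualified Data.Map.Strict as Map

-- Stand-ins for the bytea columns (assumption: String keeps the sketch simple).
type DatumHash  = String
type DatumBytes = String

-- In-memory model of the Datum table; Nothing mirrors a NULL bytes column.
type DatumTable = Map.Map DatumHash (Maybe DatumBytes)

-- Record a sighting of a datum. Newly seen bytes fill in a NULL, and a
-- hash-only sighting never erases bytes that are already known.
recordDatum :: DatumHash -> Maybe DatumBytes -> DatumTable -> DatumTable
recordDatum = Map.insertWith (<|>)
```

Because rows are keyed by hash, many TxOut rows can share one Datum row through `datumId`, which is the main saving of the extracted table over inline `datumBytes`.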

jhbertra · 2022-07-12T20:43:53Z

jhbertra
Jul 12, 2022
Author

Filtered Chain Sync Protocol State Diagram v1

Note the omission of the Intersect state. It is missing because the query mechanism makes it redundant (one could easily add support for finding intersections by defining a query that does the same thing).

stateDiagram-v2
  [*] --> Idle
  Idle --> Next : QueryNext
  Idle --> [*] : Done
  state Next {
    [*] --> CanAwait
    CanAwait --> MustAwait : AwaitReply
    CanAwait --> [*] : RollForward
    CanAwait --> [*] : RollBackward
    MustAwait --> [*] : RollForward
    MustAwait --> [*] : RollBackward
  }
  Next --> Idle : RollForward
  Next --> Idle : RollBackward
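To illustrate the note about the missing Intersect state, here is a minimal sketch of how intersection finding could be expressed as an ordinary query. Everything in it is hypothetical: `Point` is reduced to a slot number, `FindIntersection` is an invented constructor, and `runQuery` evaluates against a plain list standing in for the server's chain. It assumes the client lists its candidate points in order of preference (typically newest first).

```haskell
{-# LANGUAGE GADTs #-}

import Data.List (find)

-- Simplified stand-in for a chain point: just a slot number (assumption).
newtype Point = Point Int
  deriving (Eq, Show)

-- Hypothetical query language, indexed by the type of result it produces.
data Query result where
  -- The first of the client's candidate points that the server also has.
  FindIntersection :: [Point] -> Query (Maybe Point)

-- Evaluate a query against the server's view of the chain.
runQuery :: [Point] -> Query result -> result
runQuery chain (FindIntersection candidates) = find (`elem` chain) candidates
```

For example, with a server chain of points 1, 2 and 3, a client offering 5, 3, 1 would get point 3 back, and a client offering only unknown points would get `Nothing`.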

1 reply

jhbertra Jul 14, 2022
Author

v2

stateDiagram-v2
  [*] --> Init
  Init --> Handshake : RequestHandshake
  Handshake --> Idle : ConfirmHandshake
  Handshake --> Fault : RejectHandshake
  Idle --> Next : QueryNext
  Idle --> Done : Done
  Fault --> [*]
  Done --> [*]
  state Next {
    [*] --> CanAwait
    CanAwait --> MustAwait : Wait
    CanAwait --> [*] : QueryRejected
    CanAwait --> [*] : RollForward
    CanAwait --> [*] : RollBackward
    MustAwait --> [*] : RollForward
    MustAwait --> [*] : RollBackward
  }
  Next --> Idle : QueryRejected
  Next --> Idle : RollForward
  Next --> Idle : RollBackward
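The v2 diagram can also be read as a transition function. The sketch below is a value-level restatement for checking message sequences, not a proposed implementation; the composite Next state is flattened into `CanAwait` and `MustAwait`, the Done message is named `MsgDone` to avoid clashing with the Done state, and `step` and `runMessages` are hypothetical names.

```haskell
import Control.Monad (foldM)

-- States of the v2 diagram, with Next flattened into its two substates.
data State = Init | Handshake | Idle | CanAwait | MustAwait | Fault | Done
  deriving (Eq, Show)

-- Messages of the v2 diagram.
data Message
  = RequestHandshake | ConfirmHandshake | RejectHandshake
  | QueryNext | MsgDone
  | Wait | QueryRejected | RollForward | RollBackward
  deriving (Eq, Show)

-- One transition; Nothing means the message is not allowed in that state.
step :: State -> Message -> Maybe State
step Init      RequestHandshake = Just Handshake
step Handshake ConfirmHandshake = Just Idle
step Handshake RejectHandshake  = Just Fault
step Idle      QueryNext        = Just CanAwait
step Idle      MsgDone          = Just Done
step CanAwait  Wait             = Just MustAwait
step CanAwait  QueryRejected    = Just Idle
step CanAwait  RollForward      = Just Idle
step CanAwait  RollBackward     = Just Idle
step MustAwait RollForward      = Just Idle
step MustAwait RollBackward     = Just Idle
step _         _                = Nothing

-- Run a whole conversation from Init, failing on the first illegal message.
runMessages :: [Message] -> Maybe State
runMessages = foldM step Init
```

Note that `QueryRejected` is only accepted in `CanAwait`: as drawn, the server must reject a query before telling the client to wait.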

jhbertra · 2022-07-14T17:46:44Z

jhbertra
Jul 14, 2022
Author

I propose that from now on, we refer to this component as the "Marlowe Chain Sync Engine" instead of the "Marlowe Chain Index." It is a more descriptive and accurate name for what we plan to build.

0 replies
