Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CIP-0030 | Dapp-Connector #88

Merged
merged 15 commits into from
Nov 17, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
210 changes: 210 additions & 0 deletions CIP-DappConnector/CIP-dapp-connector.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,210 @@
---
CIP: ?
crptmppt marked this conversation as resolved.
Show resolved Hide resolved
Title: Cardano dApp-Wallet Web Bridge
Authors: rooooooooob
Copy link
Member

@KtorZ KtorZ Nov 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, add more authors here to include people who have actively contributed to the writing of this specification?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would we use for the threshold? Like @alessandrokonrad who has contributed significantly to discussion?

Status: Draft
Type: Standards
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved
Created: 2021-04-29
License: CC-BY-4.0
---

# Abstract

This documents describes a webpage-based communication bridge allowing webpages (i.e. dApps) to interface with Cardano wallets. This is done via injected javascript code into webpages. This specification defines the manner that such code is to be accessed by the webpage/dApp, as well as defining the API for dApps to communicate with the user's wallet. This document currently concerns the Shelley-Mary era but will have a second version once Plutus is supported. This specification is intended to cover similar use cases as web3 for Ethereum or [EIP-0012](https://github.com/ergoplatform/eips/pull/23) for Ergo. The design of this spec was based on the latter.
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved


# Motivation

In order to facilitate future dApp development, we will need a way for dApps to communicate with the user's wallet. While Cardano does not yet support smart contracts, there are still various use cases for this, such as NFT management. This will also lay the groundwork for an updated version of the spec once the Alonzo hardfork is released which can extend it to allow for Plutus support.



# Specification

## Data Types

### Bytes

A hex-encoded string of the corresponding bytes.

### cbor\<T>

A hex-encoded string representing [CBOR](https://tools.ietf.org/html/rfc7049) corresponding to `T` defined via [CDDL](https://tools.ietf.org/html/rfc8610) either inside of the [Shelley Mary binary spec](https://github.com/input-output-hk/cardano-ledger-specs/blob/0738804155245062f05e2f355fadd1d16f04cd56/shelley-ma/shelley-ma-test/cddl-files/shelley-ma.cddl) or, if not present there, from the [CIP-0008 signing spec](https://github.com/cardano-foundation/CIPs/blob/master/CIP-0008/CIP-0008.md).
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved
This representation was chosen when possible as it is consistent across the Cardano ecosystem and widely used by other tools, such as [cardano-serialization-lib](https://github.com/Emurgo/cardano-serialization-lib), which has support to encode every type in the binary spec as CBOR bytes.

### Paginate

```
type Paginate = {|
page: number,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't sufficient information because wallet state could change in-between pages (either new entries added or a rollback causing entries to be removed)
For the backend, instead of using arbitrary pages we instead give a concrete value like "all values after this address"

Due to this fact, I don't think it necessarily makes sense to have a unified paging system since the anchor to use depends on the data type you're looking for (txs, addresses, etc.)

If you feel this problem isn't really solvable, probably we should at least mention the pagination system has this possibility for error

limit: number,
|};
```
Used to specify optional pagination for some API calls. Limits results to {limit} each page, and uses a 0-indexing {page} to refer to which of those pages of {limit} items each.


## Error Types

### APIError

```
APIErrorCode {
InvalidRequest: -1,
InternalError: -2,
Refused: -3,
}
APIError {
code: APIErrorCode,
info: string
}
```

### DataSignError

```
DataSignErrorCode {
ProofGeneration: 1,
AddressNotPK: 2,
UserDeclined: 3,
InvalidFormat: 4,
}
type DataSignError = {
code: DataSignErrorCode,
info: String
}
```

* ProofGeneration - Wallet could not sign the data (e.g. does not have the secret key associated with the address)
* AddressNotPK - Address was not a P2PK address and thus had no SK associated with it.
* UserDeclined - User declined to sign the data
* InvalidFormat - If a wallet enforces data format requirements, this error signifies that the data did not conform to valid formats.

* InvalidRequest - Inputs do not conform to this spec or are otherwise invalid.
* InternalError - An error occurred during execution of this API call.
* Refused - The request was refused due to lack of access - e.g. wallet disconnects.
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved

### PaginateError

```
type PaginateError = {|
maxSize: number,
|};
```
{maxSize} is the maximum size for pagination and if the dApp tries to request pages outside of this boundary this error is thrown.

### TxSendError

```
TxSendErrorCode = {
Refused: 1,
Failure: 2,
}
type TxSendError = {
code: TxSendErrorCode,
info: String
}
```

* Refused - Wallet refuses to send the tx (could be rate limiting)
* Failure - Wallet could not send the tx

### TxSignError

```
TxSignErrorCode = {
ProofGeneration: 1,
UserDeclined: 2,
}
type TxSignError = {
code: TxSignErrorCode,
info: String
}
```

* ProofGeneration - User has accepted the transaction sign, but the wallet was unable to sign the transaction (e.g. not having some of the private keys)
* UserDeclined - User declined to sign the transaction



## Initial API

In order to initiate communication from webpages to a user's Cardano wallet, the wallet must provide the following javascript functions to the webpage. These would likely be done via code injection into the webpage. The API is split into two stages to maintain the user's privacy, as the user will have to consent to `cardano_request_read_access()` in order for the dApp to read any information pertaining to the user's wallet.
Copy link
Contributor

@refi93 refi93 Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me this paragraph assumes the Cardano wallet is a browser add-on that can indeed inject an object with methods into the currently open web page, but for ordinary web/desktop wallets this is AFAIK not as straightforward. Do you see a way to make it work even for these?

I understand this CIP is to specify the general interface, not the exact technical solution, I'm rather curious about your thoughts on how other than add-on based wallets could implement this protocol

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way it is defined expects the ability to inject code into pages for the dApp to interface with, but not necessarily how that part is done. It's how MetaMask does it for web3. It would be possible to make a bridge extension to desktop wallets using chrome's native messaging for example, but I'm open to ideas for other ways that it could possibly be exposed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this kind of situation where you can't inject code directly, people in Ethereum usually use another tool called Wallet Connect which has a different flow for setting up the connection


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using a dedicated namespace on the global object instead of prefixed functions. That is, have Window.Cardano.request_read_access etc ... ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. A cardano object is mentioned below; does it mean that we expect these two next functions to be available globally and to create the global cardano reference as a result 🤔 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. The cardano object is first of all injected if the cardano_requests_read_access function returns true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did this two-phase so that the cardano object was separate and if it exists then it meant that the rest of the API was injected. It was originally all in there e.g. cardano.request_read_access() but we changed the decision later. However, if the user disconnects the wallet then dApps still need to handle cardano.*() returning an error that the wallet is not connected, so it's possible we could revert to that behavior.

At page load those 2 functions are injected, and if cardano_request_read_access() is successful then the cardano object is injected with the rest of the API as alessandro mentioned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea I had thought of previously was if cardano_request_read_access() returns a Cardano instance instead of injecting it as global.cardano. We would still need to inject the type definition if it's not injected at wallet start, and if it is at wallet start we would want to control in the implementation to not allow dApps creating their own object from that type and trying to use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be curious to know why, to me it seems more consistent to keep everything under the same namespace.

Well at first I had wanted to simplify things so that dApps wouldn't have access to those APIs until they had been granted access without having to check for API access denied errors in every endpoint, but it's not really a valid point once you realize that you'll need to handle those anyway if a wallet cancels access later on after injection. It does let us potentially go with that returning the cardano object idea instead of performing a 2nd injection so that was a consideration as well.

Copy link
Contributor

@alessandrokonrad alessandrokonrad Jun 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think just hiding the endpoints, doesn't fully prevent someone from accessing the wallet actually, since you communicate with window.postMessage. so the prevention needs to be in the extension itself. I would also prefer to move it under the same name space. and maybe make it shorter like:
cardano.enable and cardano.is_enabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alessandrokonrad It absolutely doesn't on its own. That is up to the wallet's implementation. When we implemented the ergo dApp connector spec in Yoroi we handle the connections from inside the Yoroi extension so Yoroi will only only respond to requests from pages that it considers connected, which can only happen if the connect request was accepted by the user (either by whitelisting the site or by the popup). You can't assume much at all in the way of security guarantees whatsoever in the injected code. That's why we treat the yoroi-ergo-connector (especially the injected code) as a trustless relay, and inside of yoroi-frontend we verify the inputs and keep track of connections.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning this, we could potentially expand on this and other implementation-related security issues in the spec, but it might be outside of the scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed how these things work while I was reworking the wallet connection UX for when you have multiple wallets (e.g. Yoroi + AdaLite) that need to inject their API.

### cardano_request_read_access(): Promise\<bool>

Errors: APIError

This is the entrypoint to start communication with the user's wallet. The wallet should request the user's permission to connect the web page to the user's wallet, and if permission has been granted, the full API (`cardano` object) must be exposed to the webpage. The wallet can choose to maintain a whitelist to not necessarily ask the user's permission every time access is requested, but this behavior is up to the wallet and should be transparent to web pages using this API. If a wallet is already connected this function should do nothing and simply return true.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit doubtful about this all-or-nothing read access. As a user, I would prefer granting permissions on a more granular level. For example:

  • Grant access to an CIP-1852' account public key
  • Grand access to a public stake key
  • Grand access to one or more CIP-1852' index public keys
  • Grant access to wallet's balance
  • Grant access to reward's balance

etc...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I was originally drafting the ergo spec I was thinking about this, but we had decided that it would add a lot of friction for users if they have to grant permissions for every kind of access for every new dApp they try and use. Perhaps we could leave it up to the wallet? The wallet could perhaps have a list of permissions with them all pre-selected in the connect screen, with users free to deselect things if they want to have fine-grain control over their privacy.

@SebastienGllmt @robkorn what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in general a query like "access to wallet's balance" ends up being too narrow in scope to really be useful. I think permission to view public keys is the right level of abstraction for permissions at this point, and in the future if we have Atala Prism or some other system, there could be a more compelling case for permission management of something other than just public keys.

For Ergo, there was really only one public key to deal with per wallet, but in Cardano there are multiple public keys you might care about per wallet (Catalyst voting key, CIP-1852 accouting key, pool creation public key, etc.) so it may make sense for dApps to get access to these separate keys as separate permission prompts

Grand access to one or more CIP-1852' index public keys

I think this is more analogous to connecting multiple wallets to the same dApp then giving access to multiple public keys. I'm not sure if any wallets out there support this since the UX for connecting multiple wallets is kind of tricky.


### cardano_check_read_access(): Promise\<bool>

Errors: APIError

Returns true if the full API (`cardano` object) was injected and the dApp has access to the user's wallet, and false otherwise.




## Full API

Upon successful connection via `cardano_request_read_access()`, a javascript object named `cardano` is injected with the following methods. All read-only methods (all but the signing functionality) should not require any as the user has already consented to the dApp reading information about the wallet's state when they agreed to `cardano_request_read_access()`. The remaining methods `cardano.sign_tx()`, `cardano.sign_tx_input()`, `cardano.sign_data()` must request the user's consent in an informative way for each and every API call in order to maintain security.
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved

The API chosen here is for the minimum API necessary for dApp <-> Wallet interactions without convenience functions that don't strictly need the wallet's state to work. The API here is for now also only designed for Shelley's Mary hardfork and thus has NFT support. When Alonzo is released with Plutus support this API will have to be extended.

### cardano.get_utxos(amount: cbor\<value> = undefined, paginate: Paginate = undefined): Promise\<cbor\<transaction_output> | undefined>
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved

Errors: `APIError`, `PaginateError`

If `amount` is `undefined`, this shall return a list of all UTXOs (unspent transaction outputs) controlled by the wallet. If `amount` is not `undefined`, this request shall be limited to just the UTXOs that are required to reach the combined ADA/multiasset value target specified in `amount`, and if this cannot be attained, `undefined` shall be returned. The results can be further paginated by `paginate` if it is not `undefined`.

### cardano.get_balance(): Promise\<cbor\<value>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest get_available_balance, or take an extra argument for specifying which balance for there are in principle multiple concepts of balances in a wallet:

  • total balance (sum of all UTxO of the wallet)
  • available balance (total balance, minus those in pending transactions)
  • pending balance (available balance, plus the predictive balance coming from pending change outputs)
  • reward balance (balance of the reward account)

Copy link
Contributor Author

@rooooooooob rooooooooob Jun 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we want to apply such a filter to get_utxos() too for consistency? At least for the pending part. The way we implemented the ergo spec in Yoroi was to just always filter the utxos returned to not include ones that were pending as part of the connector's sent txs. The notion of pending isn't always verifiable though, depending on how the tx was sent and to where.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we want to apply such a filter to get_utxos() too for consistency?

Yes. Incidentally, get_XXX_balance would just be the sum of all utxos returned by get_XXX_utxos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. We just included both endpoints since wallets would likely have the balance already calculated in the common case, and it would be such a common use-case that it would be a lot friendlier towards dApp devs to just have an endpoint for it instead of summing over the utxos every time they need the balance. Granted, initially it wasn't planned to have multiple versions of balances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KtorZ

I'd suggest get_available_balance, or take an extra argument for specifying which balance for there are in principle multiple concepts of balances in a wallet:

  • total balance (sum of all UTxO of the wallet)
  • available balance (total balance, minus those in pending transactions)
  • pending balance (available balance, plus the predictive balance coming from pending change outputs)
  • reward balance (balance of the reward account)

Does anyone have any preference on multiple endpoints vs a parameter? The parameter could have a default value to available balance which could make it simpler.

For the corresponding UTXO calls though we wouldn't have reward as an option, just for balance?


Errors: `APIError`

Returns the total balance available of the wallet. This is the same as summing the results of `cardano.get_utxos()`, but it is both useful to dApps and likely already maintained by the implementing wallet in a more efficient manner so it has been included in the API as well.

### cardano.get_used_addresses(paginate: Paginate = undefined): Promise\<\<cbor<address>[]>
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved

Errors: `APIError`

Returns a list of all used (included in some on-chain transaction) addresses controlled by the wallet. The results can be further paginated by `paginate` if it is not `undefined`.

### cardano.get_unused_addresses(): Promise\<\<cbor<address>[]>
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved

Errors: `APIError`

Returns a list of unused addresses controlled by the wallet.

### cardano.get_change_address(): Promise\<\<cbor<address>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for all three methods above: (Shelley) addresses are actually not CBOR-encoded. Rather, they follow their own grammar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. They are defined in the same document though as address = bytes, even if the inner details of what those bytes are are not just arbitrary bytes. I think I chose this for consistency with the other types as our cbor type specified in the document includes how to represent them as a string. If we think it would be better to use the bech32 or some other encoding we could use that instead.

rooooooooob marked this conversation as resolved.
Show resolved Hide resolved

Errors: `APIError`

Returns an address owned by the wallet that should be used as a change address to return leftover assets during transaction creation back to the connected wallet. This can be used as a generic receive address as well.

Copy link
Contributor

@refi93 refi93 Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What call would the dApp need to invoke to fetch the staking key of the wallet, e.g. to perform a stake delegation? It could infer it from the addresses returned via any of the already existing get_*_address() call but that seems hacky, so I'd consider adding a get_stake_address() call too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added in a api.getRewardAddresses() endpoint.

### cardano.sign_tx(tx: cbor\<transaction_body>, metadata: cbor\<auxiliary_data> = undefined): Promise\<cbor\<transaction>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The auxiliary data are already part of the transaction body so the second argument is redundant.

  2. On a second note, I'd suggest going for witness_tx instead, for pub-key signatures is only one kind of witnessing method. Transactions containing scripts for instance will require full serialized scripts as witness. Alonzo also brings its own new witnesses with datums and redeemers.

  3. We should also consider reducing the output to the set of witnesses for the whole transaction can then be re-constructed by the DApp. Note that, this is not for "saving bytes", but rather a security argument. There's nothing preventing the wallet on the other end to send back a completely different signed transaction. Returning the full serialized signed transaction encourage developers to forward and submit that blob without any further verification. However, by only returning witnesses and letting the caller assemble the final transaction makes sure that they submit what they intended to get signed.

Copy link
Contributor

@alessandrokonrad alessandrokonrad Jun 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I was suggesting to pass to the wallet the whole transaction object, so that the user could also verify what exactly happens in the transaction and as you said to check for all witnesses since scripts can be involved as well.
Probably a good idea to just return the witness set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning the full serialized signed transaction encourage developers to forward and submit that blob without any further verification.

Good point. I was looking at it more from the perspective of dApps being the potentially-untrustworthy actor, rather than the wallets as well.

So would we have it something like cardano.sign_tx(tx: cbor<transaction>, partial_sign: bool = false): Promise<cbor<transaction_witness_set>>? Or something else? Or have we thought of any other ideas for controlling partial signing?

Copy link
Contributor

@refi93 refi93 Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one challenge I can think of with this call is how it would work for cases when the exact metadata is not known in advance - I have in mind the Catalyst registration use case when the wallet is producing the signed metadata at the very moment of signing the transaction (supplying the staking key signature into the metadata object). The sign_data call could work for mnemonic-based wallets but hardware wallets don't expose any similar call

We indeed faced a problem of a similar nature in cardano-hw-cli and the recommended flow ended up being letting the user sign a dummy transaction first just to obtain the signed metadata - I guess that's pretty much the only option here as well if we don't want to have dedicated calls for the Catalyst registration and other kinds of "authenticated metadata" that may emerge in the future, unless they adhere to some unified standard like https://github.com/cardano-foundation/CIPs/blob/master/CIP-0008/CIP-0008.md

Copy link
Contributor

@refi93 refi93 Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that the approach with the dummy transaction to obtain the signed Catalyst registration metadata would not be viable if the wallet returned via the connector just the transaction witnesses (as the dApp would not have the signature/staking public key to include in the metadata)

edit: I realized the workaround via signing a dummy transaction is not possible with the currently available connector API because there isn't a dedicated call to obtain the catalyst registration signature/metadata. So if we want the connector to support Catalyst registration, probably a dedicated call/sign_tx function parameter for that would need to be added


Errors: `APIError`, `TxSignError`

Requests that a user sign the supplied transaction body. The wallet should ask the user for permission, and if given, try to sign the supplied body and return a signed transaction. If the wallet could not sign the transaction, `TxSignError` shall be returned with the `ProofGeneration` code. Likewise if the user declined it shall return the `UserDeclined` code.

### cardano.sign_tx_input(tx: cbor\<transaction_body>, index: number): Promise\<cbor\<transaction_witness_set>>
rooooooooob marked this conversation as resolved.
Show resolved Hide resolved

Errors: `APIError`, `TxSignError`

Provides lower-level ability for signing that produces the witnesses for just a single input in a transaction. This exists in case dApps need to construct a transaction to satisfy certain properties and the user might only own some of the inputs. The wallet should ask user permission for signing similar to `cardano.sign_tx_input()` and errors are handled in the same way.
Copy link
Contributor

@alessandrokonrad alessandrokonrad Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion for the sign_tx function: Instead of using the tx_body and metadata, what about just using transaction as argument with an empty witness set. Or if scripts are invovled they are added to the witness set. So the user could see exactly what is in the transaction. And of course it makes it easier to check what the wallet has to sign actually.

Why does the sign_tx function need to sign the the tx fully and gives otherwise an error if it can't. Imo the wallet should just sign whatever it can and then return the new tx object with the new witness set.

I would merge sign_tx and sign_tx_input together and just as described above let the wallet try to sign whatever is possible without giving any error if it can't sign everything.
Let's say you need to provide just one staking key witness because of a withdrawal, you can't use sign_tx, because it throws an error if you don't have all required keys and sign_tx_input just works for inputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alessandrokonrad

Suggestion for the sign_tx function: Instead of using the tx_body and metadata, what about just using transaction as argument with an empty witness set.

If you mean that we use this as a way to implement partial signing so that the wallet tries to create witnesses for everything not provided, I am okay with this. We could also have it just specify somehow which parts of the tx it wants to sign in another way, however I guess that passing them enables the wallet to return a completed transaction instead of returning the witness sets and making the dApp construct the tx from parts and checking it has enough, which would be unnecessary boilerplate, so I think for that reason I'm on board with the idea. Thanks for the suggestion!

Why does the sign_tx function need to sign the the tx fully and gives otherwise an error if it can't. Imo the wallet should just sign whatever it can and then return the new tx object with the new witness set.

We could provide a different endpoint for partial tx signing (possibly deprecating sign_tx_input()) that simply signs what it can and returns that. Unless anyone can think of a use-case for this though I'd rather have the tx signing be all-or-nothing to reduce checking boilerplate/room for error in the dApp. I am unsure what utility this provides if we instead do your suggestion for passing in witnesses, and have the dApp finish off the rest of the witnesses, and fail if it can't. Is there a use case where we would need multi-phase witness creation that isn't just the dApp providing its wintesses, and the wallet finishing the rest? Like more than 2 phases.

I might be overlooking something here as we haven't implementing the spec yet in Yoroi, and there are always things caught when it comes to actually using it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean that we use this as a way to implement partial signing so that the wallet tries to create witnesses for everything not provided, I am okay with this.

Yes for partial signing. Simply put in a transaction object and you get back a new transaction object, with maybe some new witnesses included.

I was thinking about multi-signature stuff. For example you have an utxo coming from a multi sig address, first of all the wallet wouldn't know that if needs to provide a witness for that unless the script is appened and the wallet finds a familiar keyhash in it. And multi-signature is also a use case of multi-phase witnesses, where only partial sigining works.
So the sign_tx function going from transaction -> transaction and signs whatever it can is very flexible.

Unless anyone can think of a use-case for this though I'd rather have the tx signing be all-or-nothing to reduce checking boilerplate/room for error in the dApp

Maybe we should have an endpoint that signs all or nothing, but then the dApp needs to know which one to use. On the other hand it's also easy to check if the transaction is fully signed or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should have an endpoint that signs all or nothing, but then the dApp needs to know which one to use. On the other hand it's also easy to check if the transaction is fully signed or not.

Fair point about the multi-phase signing for multisig. I still think to make the experience better (less boilerplate/room for error) for the common use-case that we should provide that all-or-nothing sign tx, then a second more general case for people who know they need more advanced functionality. This could also take the form of a parameter I guess too that defaults to sign-all.

Like cardano.sign_tx(tx: cbor<transaction>, partial_sign: bool = false): Promise<cbor<transaction>>?

That would probably reduce confusion for people who just need the sign-all behavior instead of having two endpoints, but then you might have them wondering why they need to provide a transaction with empty witness sets instead of just the body. I guess if it's just the body we're providing before, you don't need to provide the metadata unless you're checking if it matches the hash in the body.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other alternative would be

cardano.sign_tx(tx_body: cbor<transaction_body>): Promise<cbor<transaction>>

cardano.partial_sign_tx(tx: cbor<transaction>): Promise<cbor<transaction>>

I don't think it would be that confusing that way either if we name the endpoints well and document it properly in the spec.

Copy link
Contributor

@alessandrokonrad alessandrokonrad Jun 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like cardano.sign_tx(tx: cbor, partial_sign: bool = false): Promise<cbor>?
That would probably reduce confusion for people who just need the sign-all behavior instead of having two endpoints, but then you might have them wondering why they need to provide a transaction with empty witness sets instead of just the body. I guess if it's just the body we're providing before, you don't need to provide the metadata unless you're checking if it matches the hash in the body.

Yes I like that idea with that endpoint. Way less confusing and for most cases people don't even need to care about the second parameter.
Well if you mint a token for example you still need the witness set, since there is not other way otherwise to prove if the wallet was able to fully sign the transaction. I think having transaction as argument is still the best solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KtorZ @alessandrokonrad I made an update to the CIP which removed the single input one and made the remaining one have a partial signing mode and return only the witness set of new witnesses that were just created during signing. Do we think this is a good solution now? Should partialSign default to true instead of false? Or is merely having it as a parameter good enough? Or should it be the only one and make all dApps check that enough signatures were provided in all cases (and remove the ProofGeneration error code entirely)?

We talked (1st CIP review) about having a way to point to which parts to sign but doing that in a general way for all signable things could be quite hard. On the other hand, I'm not sure if wallets would be able to easily figure out on their own which things once we move to Alonzo are signable by them without more context. We also have to think about what the wallets would want to show UI-wise for the signing request too.

Excerpt from review:

M - The idea is to point that to a particular input although in practice not only inputs require signatures: That's fine for payments but if we go a bit deeper down the Alonzo path, you can sign pretty much anything. Consider 'certificates' even - that's not even pointing at an input, that's pointing to a stakekey, or at a certificate. The signature should have a way to point to anything in the tx as "I need a signature for that particular certificate" -> the wallet should know which key corresponds to that and produce the appropriate signatures (after approval by the user) OR you request a public key (and the wallet can product a signature matched to that key). Might be public key hash rather than public key. I don't need a way to point to what element needs signed really, it could be many things. Or you give out the keyhash you expect a signature from.

Copy link
Contributor

@alessandrokonrad alessandrokonrad Aug 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should partialSign default to true instead of false

I think false makes sense as default. Most of the signing request will just require witnesses from a single wallet.
If we go with true I think we could actually get rid of the partialSign parameter completely and just have partial signing as the normal.

We also have to think about what the wallets would want to show UI-wise for the signing request too.

You could check out namiwallet.io. I fully implemented this CIP already.
But basically what I do is I check the difference from own outputs amount - own inputs amount:

  • balance for specific asset > 0 => asset that will be added to your wallet (show in green, e.g. asset from withdrawal or other inputs or minting)
  • balance for specific asset < 0 => asset that will be subtracted from your wallet (show in red)
  • balance for specific asset = 0 => asset that was part of the utxo, but simply goes back to the wallet. (no displaying necessary)

Other outputs can be considered as recipients of the the amount the wallet loses. Show the addresses with the amount.
Then we could show if the transaction includes Certificate, Metadata, Minting or Withdrawal. Nami for example just shows these tags if included but doesn't go into more details yet.

the wallet should know which key corresponds to that and produce the appropriate signatures

I think this is important, because it makes the developer and user experience easier. The dApp just provides the cbor tx and the wallet scans through the whole tx and picks out the keyHashes it is familiar with and can provide the signatures for it. And before the user is shown which keys are required for the tx.


### cardano.sign_data(addr: cbor\<address>, sig_structure: cbor\<Sig_structure>): Promise\<Bytes>
Copy link
Contributor

@alessandrokonrad alessandrokonrad Jun 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to return the vkey/public key together with the signed bytes structure. It makes it easy for verification.
Maybe like this:

data_witness = [
    vkey: vkey
    signed_structure: Bytes
]

Not sure if as Json or Cbor.
and the return type is then Promise<data_witness>.

EDIT:
vkeywitness is defined in the specification and I just realized that would match perfectly since the signed_structure is also a Ed25519Signature. In the serialization-lib Vkeywitness is already defined, so the new return type could be Promise<cbor<vkeywitness>>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. It would be pretty easy to add on the wallet's side and would be useful so I'm all for the change.

However, before I change the spec, there is some ambiguity that I noticed with this endpoint. As we pass in an address, there exists the problem for some address types where there can be 2 keys (spending key and staking key) with no way to know which one it is. This isn't the case for Ergo which this spec was initially based on so it made sense to use an address there, but not here.

We could change it to take in a cbor<stake_credential> and require the dApps to know how to decompose that from the addresses. This is possible as that data is embedded as a part of the address binary structure and is accessible via cardano-serialization-lib as well. At least at first thought this seems better than either

  1. defaulting to one or the other (potentially confusing to dApp devs?)
  2. having a toggle (some address types have no staking keys - do you error? or default to spending?)

Or do we only want to allow signing by staking keys? I don't think there's a security issue with payment keys as we are signing a very specific CBOR structure that was designed to not have conflicts with signing a tx hash or anything like that. What happens in that case if someone provides an address with no staking component?

This is the one part of the EIP-0012 Ergo spec that wasn't implemented in Yoroi at all and hadn't been finalized as the CIP-0008 message signing spec hadn't been finalized at the time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I guess I had forgotten that in the CIP-0008 spec we can provide a mapping from address <-> keys so we need the full address type too because (quoted from CIP-0008):

To be able to effectively verify we need two things:

  1. P1 - (optional) knowledge of the relation of a public key and a Cardano address
  2. P2 - Knowledge of which algorithm was used to sign

For P1, the mapping of public keys to addresses has two main cases:

  1. for Shelley addresses, one payment key maps to many addresses (base, pointer, enterprise)
  2. for Byron addresses, chaincode and address attributes need to be combined with the public key to generate an address

To resolve this, one SHOULD add the full address to the protected header when using a v2 address. The v2 addresses contain the chaincode + attributes and we can verify the address matches combining it with the verification public key.
? address: bstr,
For P2, we use the alg header and to specify which public key was used to sign, use the cwt protected header.

and this can't be done entirely on the wallet side as a stake_credential could map to multiple addresses, unless we want the wallet to just insert a random/arbitrary one, but that seems like confusing behavior coming from the dApp if they take a stake_credential from an address then use this endpoint and it inserts a different address into the headers.

Maybe we could have it pass in both the cbor<stake_credential> and cbor<address>? And then add an error code to DataSignError if they don't match, or even just re-use DataSignErrorCode.InvalidFormat with info about the mismatch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still like the idea of passing in an adress as argument. Yeah right base addresses have 2 credentials. But in my opinion if you want to prove ownership over a base address you care actually only about the payment credential part. So I would standardize that:
Base Adress -> Take Payment Credential
And if you want to sign with the Stake Credential you pass in the Reward Address

Copy link
Contributor

@alessandrokonrad alessandrokonrad Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm trying to understand CIP-0008 better and I think we should change the sign_data function to:
sign_data(addr: cbor<address>, payload<hex>) : Promise<Bytes>
The whole sig struc with the protected/unprotected headers I would build on the wallet side and just pass the payload there and then return the CoseSign1 or CoseSign object from the wallet, which includes the public key, algorithm id and the address in the protected headers. The payload is an hex encoded utf8 string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we just have the payload, how would the wallet know how to construct the rest of the sig structure? Are we assuming only the CoseSign1 case? What if the dApp wants to put additional fields into the header besides alg/key? I guess one worry is to construct it you might need the pubkey but unless that other endpoint for getting verification keys is implemented (or an alternative) it might be hard to know that as the address only contains the hashes of them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably more suitable for the CoseSign1 case right. And yes additional fields would not work, but I think in most cases you just want to make simple data signatures in order to prove ownership or login into a service. (maybe have two endpoints, one for easy signing and one for advanced things? Or rather an external library, that handles the complicated logic and exposes a simple sign function (like web3.js))

It's actually easy for the wallet to identify the public key belonging to an address/key hash. (in the signTx function you have to do similar things).
Even if there is a function in the API to get a public key, you still need to dermine first which one you need, and all information you have are addresses from a dApp side, so the wallet needs to know anyway which address/key hash maps to what public key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to be more general and accept any sig structure from CIP-0008 and like you said we can use external libraries to simplify the construction. The message-signing library for example could have simplified code-paths for constructing things in the most common cases which dApps could then use to pass to the connector. Due to other higher priorities we didn't end up using that library ourselves so there's likely some functionality lacking in it that could help.

It's actually easy for the wallet to identify the public key belonging to an address/key hash. (in the signTx function you have to do similar things).
Even if there is a function in the API to get a public key, you still need to dermine first which one you need, and all information you have are addresses from a dApp side, so the wallet needs to know anyway which address/key hash maps to what public key.

In the 2nd CIP review meeting this Tuesday it was brought up that we either need to change the API from addresses to pubkeys (although IMO this would make it clunkier to use to convert all the keys to addresses for most uses) or provide an API endpoint to get the pubkeys. I proposed one for Address -> PubKey (for addresses the wallet owns only so it could return undefined if it doesn't know it). This could be used for this. @KtorZ also proposed the following:

Provide a way to retrieve verification keys at a given derivation path. For example getVerificationKey(path : string) : Promise. The path could be either a list of levels or a slash-separated string (e.g. "1852'/1815'/0'/0/0)`. That is probably the most flexible option.

in another comment below in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed one for Address -> PubKey (for addresses the wallet owns only so it could return undefined if it doesn't know it)

Yes I am for this idea. Keep the address enpoint and have this public key endpoint extra.

For example getVerificationKey(path : string) : Promise. The path could be either a list of levels or a slash-separated string (e.g. "1852'/1815'/0'/0/0)`

This may be a little bit too flexible, because this would mean the dApp could derive keys/addresses the wallet is not even familiar with or there is a higher chance that the dApp could make something wrong.


Errors: `APIError`, `DataSignError`

This endpoint utilizes the [CIP-0008 signing spec](https://github.com/cardano-foundation/CIPs/blob/master/CIP-0008/CIP-0008.md) for standardization/safety reasons. It allows the dApp to request the user to sign data conforming to said spec. The user's consent should be requested and the details of `sig_structure` shown to them in an informative way. The Please refer to the CIP-0008 spec for details on how to construct the sig structure.

### cardano.submit_tx(tx: cbor\<transaction>): Promise\<hash32>

Errors: `APIError`, `TxSendError`

As wallets should already have this ability, we allow dApps to request that a transaction be sent through it. If the wallet accepts the transaction and tries to send it, it shall return the transaction id for the dApp to track. The wallet is free to return the `TxSendError` with code `Refused` if they do not wish to send it, or `Failure` if there was an error in sending it (e.g. preliminary checks failed on signatures).

rooooooooob marked this conversation as resolved.
Show resolved Hide resolved
# Implementations

TODO: link to Yoroi's implementation once available