Links, paths, and anchors #453

Open
wants to merge 12 commits into
base: main

pdaoust
Collaborator

@pdaoust pdaoust commented Jun 6, 2024

Resolves #467

Seeking review for:

  • Succinctness of prose -- do you get lost? Is there too much information? Is the complex stuff (e.g., introducing how to get a hash for linking) introduced too early?
  • Correctness of code at first glance -- I have to confess I haven't tried compiling any of it yet.

@mjbrisebois mjbrisebois left a comment

Looks great. I didn't know how to use the anchor functions until reading this.

!!! info Action hashes aren't certain until zome function lifecycle completes
When you get an action hash back from the host function that creates it, it doesn't mean the action is available on the DHT yet, or will ever be available. The action isn't written until the function that writes it completes, then passes the action to validation. If the function or the validation fails, the action will be discarded. And if it is successful, the action won't become fully available on the DHT until it's been published to a sufficient number of peers.

It's safer to share action hashes with other peers or cells in a callback called `post_commit()`. If your coordinator zome defines this callback, it'll be called after every successful function call within that zome.

This warning confused me. What is the less-safe way to share hashes? Also, when and why would I be sharing the hashes?

Is this about sending signals to other peers so they get the updates quicker?

Member

Yeah, this confused me too. When you link to or from an action within the same zome call that created it, there's no issue, since all the actions will either be committed to the source chain or discarded atomically.
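
To make the atomicity point concrete, here is a toy model in plain Rust. It is a sketch, not HDK code: `Chain`, `run_call`, and the string "actions" are hypothetical names invented for illustration. Actions written during a call sit in a scratch buffer and reach the chain only if the whole call succeeds.

```rust
/// Toy stand-in for a source chain; real actions are not strings.
#[derive(Default)]
pub struct Chain {
    pub committed: Vec<String>,
}

impl Chain {
    /// Runs `call` against a fresh scratch buffer. Everything the call wrote
    /// is appended to the chain only when it returns `Ok`; on `Err` the
    /// scratch buffer is dropped and the chain is left untouched.
    pub fn run_call<F>(&mut self, call: F) -> Result<(), String>
    where
        F: FnOnce(&mut Vec<String>) -> Result<(), String>,
    {
        let mut scratch = Vec::new();
        call(&mut scratch)?;
        self.committed.append(&mut scratch);
        Ok(())
    }
}
```

In this model a link written in the same call as the entry it points to can never outlive that entry: either both land on the chain or neither does. Sharing a hash outside the call (for example in a signal) is the case the original warning seems to be about, because the receiver may see the hash before, or without, the commit.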

Comment on lines 419 to 435
Retrieve the _addresses_ of all the second-level anchors for a top-level anchor by calling [`hdk::prelude::list_anchor_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_addresses.html):

```rust
use hdk::prelude::*;
use movie_integrity::*;

let hashes_of_all_first_letters = list_anchor_addresses(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter");
```

Retrieve the _addresses_ of all the top-level anchors by calling [`hdk::prelude::list_anchor_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_addresses.html):

```rust
use hdk::prelude::*;
use movie_integrity::*;

let hashes_of_all_first_letters = list_anchor_addresses(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter");
```

Are these supposed to have the same example code?

* A **base**, which is the address the link is attached to and _points from_
* A **target**, which is the address the link _points to_
* A **type**
* An optional **tag** that can hold a small amount of arbitrary bytes, up to 4 kb

Member

hdk docs say the max is 1 kb (https://docs.rs/hdk/latest/hdk/link/fn.create_link.html) -- I can't actually find where this limit is applied in the codebase so not sure which is accurate

A link is considered dead once its creation action has one or more delete-link actions associated with it.

Member

I might also copy the same disclaimer from the entries page: "deleting" a link just creates another action marking the link as deleted and does not actually delete any data. It could also note that a potential future purge feature will not apply to links, as they are only actions.


### Define a link type

Every link has a type that you define in an integrity zome, just like [an entry](/build/entries/#define-an-entry-type). Links are simple enough that they have no entry content. Instead, their data is completely contained in the actions that write them. Here's what a link creation action contains, in addition to the [common action fields](/build/working-with-data/#entries-actions-and-records-primary-data):

Member

The phrasing of sentences 2-4 is a bit confusing to me. I might just say explicitly: "a link is committed to the source chain as an action with no associated entry".

Links can't be updated; they can only be created or deleted.

Member

@mattyg Jun 7, 2024

Should we give some clarifying rationale for why they can't be updated?

Collaborator Author

sure... if only I knew why 🤣

A link is considered dead once its creation action has one or more delete-link actions associated with it.

Member

I'm not sure about using the term "dead" here -- in the web 2 context it usually means that a link's target is no longer hosted anywhere. I haven't seen the term used elsewhere in holochain, maybe we should just stick with "deleted"?

To get all live _and dead_ links, along with any deletion actions, use [`hdk::prelude::get_link_details`](https://docs.rs/hdk/latest/hdk/link/fn.get_link_details.html). This function has the same options as `get_links`

Member

same note as above about the terms "dead" and "live"
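
Whatever terms the page settles on, the rule itself is small enough to sketch. The following is a plain-Rust toy model, not HDK code: `CreateLink`, `DeleteLink`, `live_targets`, and the integer ids are hypothetical stand-ins for real actions and hashes. It shows that a link counts as deleted once at least one delete action refers to its create action, and that nothing is erased in the process.

```rust
use std::collections::HashSet;

/// Toy create-link record; real actions carry hashes, not small integers.
pub struct CreateLink {
    pub id: u32,
    pub target: &'static str,
}

/// Toy delete-link record.
pub struct DeleteLink {
    /// The `id` of the create-link record this delete refers to.
    pub deletes: u32,
}

/// Returns the targets of links that no delete record refers to.
/// `creates` is only read, never modified: deleting a link adds a record,
/// it does not remove one.
pub fn live_targets(creates: &[CreateLink], deletes: &[DeleteLink]) -> Vec<&'static str> {
    let deleted: HashSet<u32> = deletes.iter().map(|d| d.deletes).collect();
    creates
        .iter()
        .filter(|c| !deleted.contains(&c.id))
        .map(|c| c.target)
        .collect()
}
```

Two delete records for the same create record behave the same as one, which matches the "one or more delete-link actions" wording in the quoted line.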


Because linking is all about connecting hashes to other hashes, here's how you get a hash for a piece of content.

!!! info A note on the existence of data

Member

The section title could be more descriptive. Maybe "Link Addresses May Not Have Content"?


A link is considered dead once its creation action has one or more delete-link actions associated with it.

## Getting hashes for use in linking

Member

@mattyg Jun 7, 2024

I wonder if it would be clearer and better mental separation to move this section onto its own page "Hashes", and then keep this page only about links / anchors / paths.

Collaborator Author

yup, great thing to do in a follow-up -- or put all the content on 'Working with Data'

```rust
let author_id = action.author();
```

### External reference

Member

May be worth noting that the external hash still needs to fit in 32 bytes.
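
A note like that could come with a small guard. The sketch below is plain Rust, not an HDK API: `fit_into_32` is a hypothetical helper, and zero-padding short identifiers is only one assumed convention (an app could equally hash long identifiers down to 32 bytes). It rejects anything longer than 32 bytes and pads anything shorter.

```rust
/// Copies `bytes` into a 32-byte array, zero-padding on the right.
/// Returns `None` when the input is longer than 32 bytes.
pub fn fit_into_32(bytes: &[u8]) -> Option<[u8; 32]> {
    if bytes.len() > 32 {
        return None;
    }
    let mut out = [0u8; 32];
    out[..bytes.len()].copy_from_slice(bytes);
    Some(out)
}
```

A 20-byte identifier from another system would fit with 12 trailing zero bytes, while a 33-byte one would be refused rather than silently truncated.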

```rust
let dna_hash = dna_info()?.hash;
```

!!! info Linking from a DNA hash is not recommended

Member

@mattyg Jun 7, 2024

I think the conceptual explanation of "hot spots" deserves its own section on this page, rather than hiding it in the DNA hash section. And in general I think it's okay to create links from a known base as long as there is some reasonable expected limitation to the number of links that will be created off it (that's what each segment of a path does anyway).
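
One way to picture the "reasonable expected limitation" is sharding. This is a plain-Rust sketch, not HDK code: `shard_for` is a hypothetical helper, and the dot-separated name format is an assumption of the sketch. Instead of hanging every movie link off one base, each title is routed to a base chosen by its first letter, so no single base collects all the links.

```rust
/// Picks the name of the base a title's link would hang from, using the
/// first alphanumeric character of the title (ASCII letters lowercased).
/// Titles with no alphanumeric character fall into a shared `_` bucket.
pub fn shard_for(title: &str) -> String {
    let first = title
        .chars()
        .find(|c| c.is_alphanumeric())
        .map(|c| c.to_ascii_lowercase())
        .unwrap_or('_');
    format!("movies_by_first_letter.{}", first)
}
```

The number of links per base is then bounded by how many titles share a first character, which is the same job each segment of a path does.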


### Count links

If all you need is a count of matching links, such as for an unread messages badge, use [`hdk::prelude::count_links`](https://docs.rs/hdk/latest/hdk/prelude/fn.count_links.html). It has a different input with more options for querying (we'll likely update the input of `get_links` to match `count_links` in the future).

Member

Might be worth noting some gotchas with `count_links`: it goes to the network first, so if you end up asking two different peers whose DHT sync states differ, you may get two different answers (which may look odd if your unread messages badge shows a smaller number than before without you reading any messages). For the same reason, the count may also differ from the number of links returned by `get_links`.

Collaborator Author

Good point, although I feel like there's an entire story about that re: eventual consistency everywhere.

Member

Yeah, it's more than just that, though: since it doesn't go to your cache, you may have more links locally than the count you receive.


## Paths and anchors

Sometimes the easiest way to discover a link base is to embed it into the application's code. You can create an **anchor**, an entry whose content is a well-known blob, and hash that blob any time you need to retrieve links. This can be used to simulate collections or tables in your graph database. As [mentioned](#getting-hashes-for-use-in-linking), the entry does not even need to be stored; you can simply create it, hash it, and use the hash in your link.

Member

@mattyg Jun 8, 2024

I think calling it an entry or blob that doesn't need to exist just adds confusion. An anchor is just a hash, already known to every user, that can be used as a link base. One way to make it known to every user is to hard-code it. It doesn't need to be generated by hashing some human-meaningful data; it could just be 32 arbitrary bytes.

Member

It's also a little confusing that we have this concept of an anchor, and then we also have an opinionated implementation in the HDK - might be helpful to distinguish them explicitly

Collaborator Author

@pdaoust Sep 30, 2024

Thoroughly agree re: concept vs named implementation. I don't know that anyone actually uses two-level anchors, and I don't think we have to teach them to newcomers as a less capable and more confusing version of paths.
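
The concept, separated from the HDK's named implementation, fits in a short plain-Rust sketch. Nothing here is HDK code: `ALL_MOVIES`, `Links`, `create`, and `get` are hypothetical names, and the in-memory map stands in for the DHT. The anchor is simply 32 hard-coded bytes that every peer already knows, used as a link base.

```rust
use std::collections::HashMap;

/// Hypothetical hard-coded anchor: 32 arbitrary bytes shipped with the app.
pub const ALL_MOVIES: [u8; 32] = [0x2a; 32];

/// Toy link store keyed by base address; the values are link targets.
#[derive(Default)]
pub struct Links {
    by_base: HashMap<[u8; 32], Vec<String>>,
}

impl Links {
    /// Records a link from `base` to `target`.
    pub fn create(&mut self, base: [u8; 32], target: &str) {
        self.by_base.entry(base).or_default().push(target.to_string());
    }

    /// Returns every target linked from `base`, or an empty slice if none.
    pub fn get(&self, base: &[u8; 32]) -> &[String] {
        self.by_base.get(base).map(|v| v.as_slice()).unwrap_or(&[])
    }
}
```

No entry is ever stored at `ALL_MOVIES` in this model; the bytes only need to be the same for everyone who creates or looks up links.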


While you can build this yourself, this is such a common pattern that the HDK implements it for you in the [`hdk::hash_path`](https://docs.rs/hdk/latest/hdk/hash_path/index.html) module. The implementation supports both anchors and **paths**, which are hierarchies of anchors.

!!! info Avoiding DHT hot spots

Member

same note as above -- I think hot spots could be expanded on in its own section


### Anchors

In the HDK, an 'anchor' is just a path with two levels of hierarchy. The examples below show how to implement the path-based examples above, but as anchors. Generally the implementation is simpler.

Member

Maybe note that this is the HDK's opinionated implementation of the anchor concept.

Member

@mattyg left a comment

awesome stuff thanks for taking this on!

```rust
let hashes_of_all_first_letters = list_anchor_addresses(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter");
```

## Reference

```diff
@@ -88,7 +96,7 @@ use movie_integrity::*;

 let movie = Movie {
     title: "The Good, the Bad, and the Ugly",
-    director: "Sergio Leone"
+    director_hash: EntryHash::from_raw_36(vec![ /* hash of 'Sergio Leone' entry */ ]),
```

Wouldn't it be useful to also show how to get /* hash of 'Sergio Leone' entry */, or is this covered elsewhere in the guide?

Collaborator Author

yup, there's a section on it elsewhere, although it's a bit long-winded. I might make it into its own article.

Successfully merging this pull request may close these issues.

Practical docs for using Links
4 participants