From d1bbb0a767eb2d4fb47b610b53fd301da0c0c4a4 Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Fri, 24 May 2024 14:52:55 -0700 Subject: [PATCH 01/12] add glossary entry for addressable content --- src/pages/resources/glossary.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/pages/resources/glossary.md b/src/pages/resources/glossary.md index 0188d8f1d..e5a68a680 100644 --- a/src/pages/resources/glossary.md +++ b/src/pages/resources/glossary.md @@ -13,6 +13,10 @@ A piece of data that represents a [record](#record) on an [agent's](#agent) [sou 1. [DHT address](#dht-address), synonymous with [base](#base) 2. [Transport address](#transport-address) +#### Addressable content + +An individual piece of data in a [content-addressable store](#content-addressable-storage-cas) that can be stored or retrieved by its identifier, usually the hash of the data. + #### Address space The entire range of possible [DHT addresses](#dht-address). This space is circular, meaning the last address is adjacent to the first address. From a8870fdbbad12318f091102ec4978d66730f5322 Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Thu, 6 Jun 2024 14:04:27 -0700 Subject: [PATCH 02/12] new links content --- src/pages/build/links-paths-and-anchors.md | 456 +++++++++++++++++++++ 1 file changed, 456 insertions(+) create mode 100644 src/pages/build/links-paths-and-anchors.md diff --git a/src/pages/build/links-paths-and-anchors.md b/src/pages/build/links-paths-and-anchors.md new file mode 100644 index 000000000..d3f25bf0a --- /dev/null +++ b/src/pages/build/links-paths-and-anchors.md @@ -0,0 +1,456 @@ +--- +title: "Links, Paths, and Anchors" +--- + +::: intro +A **link** connects two addresses in an application's shared database, forming a graph database on top of the underlying hash table. Links can be used to relate two pieces of [addressable content](/resources/glossary/#addressable-content) in the database to each other or point to references to addressable content outside the database. 
+
+**Paths** and **anchors** build on the concept of links, allowing you to create collections, pagination, indexes, and hierarchical structures.
+:::
+
+## Turning a hash table into a graph database
+
+A Holochain application's database is, at heart, just a big key-value store --- or more specifically, a hash table. You can store content at its hash, and you can retrieve content by its hash. This is useful if you already know the hash of the content you want to retrieve, but it isn't helpful if you don't know the hash of the content you're looking for.
+
+A piece of content itself can contain a hash in one of its fields, and that's great for modelling a _many-to-one relationship_. For instance, an [**action**](/build/working-with-data/#entries-actions-and-records-primary-data), one of the primary types of data in the database, points to the action that precedes it and to the entry it creates, updates, or deletes.
+
+And if the number of things the content points to is small and doesn't change often, you can model a _many-to-many relationship_ using a field that contains an array of hashes. But at a certain point this becomes hard to manage, especially if that list grows or regularly changes.
+
+Holochain's hash table stores _metadata_ in addition to primary addressable content. This lets you attach **links** to an address in the database. You can then retrieve a full or filtered list of links from that address in order to discover more primary content. In this way you can build up a fully traversable graph database.
+
+### Define a link type
+
+Every link has a type that you define in an integrity zome, just like [an entry](/build/entries/#define-an-entry-type). Links are simple enough that they have no entry content. Instead, their data is completely contained in the actions that write them.
Here's what a link creation action contains, in addition to the [common action fields](/build/working-with-data/#entries-actions-and-records-primary-data):
+
+* A **base**, which is the address the link is attached to and _points from_
+* A **target**, which is the address the link _points to_
+* A **type**
+* An optional **tag** that can hold a small amount of arbitrary bytes, up to 4 kb
+
+The tag could be considered link 'content', but unlike with an entry, the HDK doesn't provide a macro that automatically deserializes the tag into a Rust struct or enum. You can use the tag to further qualify the link, to carry data about the target so that readers can skip a DHT query, or to filter on in link queries.
+
+[Just as with entries](/build/entries/#define-an-entry-type), Holochain needs to know about your link types in order to dispatch validation to the right integrity zome. You can do this by implementing a `link_types` callback function, and the easiest way to do this is to add the [`hdi::prelude::hdk_link_types`](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_link_types.html) macro to an enum that defines all your link types:
+
+```rust
+use hdi::prelude::*;
+
+#[hdk_link_types]
+enum LinkTypes {
+    DirectorToMovie,
+    GenreToMovie,
+    IpfsMoviePoster,
+    MovieReview,
+    // Note: the following types will become useful when we talk about
+    // paths and anchors later.
+    MovieByFirstLetterAnchor,
+    MovieByFirstLetter,
+    // other types...
+}
+```
+
+## Create a link
+
+As with entries, you'll normally want to store your link CRUD code in a [**coordinator zome**](/resources/glossary/#coordinator-zome). You can read about why in the page on [entries](/build/entries/#create-an-entry).
+
+Create a link by calling [`hdk::prelude::create_link`](https://docs.rs/hdk/latest/hdk/link/fn.create_link.html).
If you used the `hdk_link_types` macro in your integrity zome (see [Define a link type](#define-a-link-type)), you can use the link types enum you defined, and the link will have the correct integrity zome and link type indexes added to it.
+
+```rust
+use hdk::prelude::*;
+// Import the link types enum defined in the integrity zome.
+use movie_integrity::*;
+
+let director_entry_hash = EntryHash::from_raw_36(vec![ /* bytes of the hash of the Sergio Leone entry */ ]);
+let movie_entry_hash = EntryHash::from_raw_36(vec![ /* bytes of the hash of the Good, Bad, and Ugly entry */ ]);
+
+// `create_link` returns a result, so unwrap it with `?` to get the action hash.
+let create_link_action_hash = create_link(
+    director_entry_hash,
+    movie_entry_hash,
+    LinkTypes::DirectorToMovie,
+    // Create an optional search index value for fast lookup.
+    LinkTag::new("year:1966")
+)?;
+```
+
+Links can't be updated; they can only be created or deleted.
+
+## Delete a link
+
+Delete a link by calling [`hdk::prelude::delete_link`](https://docs.rs/hdk/latest/hdk/link/fn.delete_link.html) with the create-link action's hash.
+
+```rust
+use hdk::prelude::*;
+
+let delete_link_action_hash = delete_link(
+    create_link_action_hash
+)?;
+```
+
+A link is considered dead once its creation action has one or more delete-link actions associated with it.
+
+## Getting hashes for use in linking
+
+Because linking is all about connecting hashes to other hashes, here's how you get a hash for a piece of content.
+
+!!! info A note on the existence of data
+An address doesn't have to have content stored at it in order for you to link to or from it. (In the case of external references, it's certain that data won't exist at the address.) If you want to add this extra constraint, you'll need to check for the presence of data at the base and/or target address in your link validation code.
+
+!!!
+ +### Actions + +Any host function that records an action on an agent's source chain, such as `create`, `update`, `delete`, `create_link`, and `delete_link`, returns the hash of the action. You can use this for further writes in the same function call, or you can return it to the client that called the function so that it can use it later. You can also [pass it to another function](/concepts/8_calls_capabilities/) --- either one in the same cell, another cell in the agent's hApp, or another cell in the same network. + + +!!! info Action hashes aren't certain until zome function lifecycle completes +When you get an action hash back from the host function that creates it, it doesn't mean the action is available on the DHT yet, or will ever be available. The action isn't written until the function that writes it completes, then passes the action to validation. If the function or the validation fail, the action will be discarded. And if it is successful, the action won't become fully available on the DHT until it's been published to a sufficient number of peers. + +It's safer to share action hashes with other peers or cells in a callback called `post_commit()`. If your coordinator zome defines this callback, it'll be called after every successful function call within that zome. +!!! + +If you have a variable that contains a [`hdk::prelude::Action`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html) or [`hdk::prelude::Record`](https://docs.rs/hdk/latest/hdk/prelude/struct.Record.html), you can also get its hash using the following methods: + +```rust +let action_hash_from_record = record.action_address(); +let action = record.signed_action; +let action_hash_from_action = action.as_hash(); +assert_eq!(action_hash_from_record, action_hash_from_action); +``` + +(But it's worth pointing out that if you have this value, it's probably because you just retrieved the action by hash, which means you probably already know the hash.) 
+ +To get the hash of an action from an action that deletes or updates it, match on the [`Action::Update`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Update) or [`Action::Delete`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Delete) action variants and access the appropriate field: + +```rust +if let Action::Update(action_data) = action { + let replaced_action_hash = action_data.original_action_address; + // Do some things with the original action. +} else if let Action::Delete(action_data) = action { + let deleted_action_hash = action_data.deletes_address; + // Do some things with the deleted action. +} +``` + +### Entry + +To get the hash of an entry, first construct the entry struct or enum that you [defined in the integrity zome](/build/entries/#define-an-entry-type), then pass it through the [`hdk::hash::hash_entry`](https://docs.rs/hdk/latest/hdk/hash/fn.hash_entry.html) function. + +```rust +use hdk::hash::*; +use movie_integrity::*; + +let movie = Movie { + title: "The Good, the Bad, and the Ugly", + director_entry_hash: EntryHash::from_raw_36(vec![ /* hash of 'Sergio Leone' entry */ ]), + imdb_id: Some("tt0060196"), + release_date: Timestamp::from(Date::Utc("1966-12-23")), + box_office_revenue: 389_000_000, +}; +let movie_entry_hash = hash_entry(movie); +``` + +To get the hash of an entry from the action that created it, call the action's [`entry_hash`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#method.entry_hash) method. It returns an optional value, because not all actions have associated entries. 
+
+```rust
+let entry_hash = action.entry_hash()?;
+```
+
+If you know that your action is an entry creation action, you can convert it to an `EntryCreationAction` and get the entry hash from that:
+
+```rust
+// The conversion is fallible, because not every action creates an entry.
+let entry_creation_action: EntryCreationAction = action.try_into()?;
+let entry_hash = entry_creation_action.entry_hash();
+```
+
+To get the hash of an entry from a record, you can either get it from the record itself or the contained action:
+
+```rust
+let entry_hash_from_record = record.entry().as_option()?.hash();
+let entry_hash_from_action = record.action().entry_hash()?;
+assert_eq!(entry_hash_from_record, entry_hash_from_action);
+```
+
+Finally, to get the hash of an entry from an action that updates or deletes it, match the action to the appropriate variant and access the corresponding field:
+
+```rust
+if let Action::Update(action_data) = action {
+    let replaced_entry_hash = action_data.original_entry_address;
+} else if let Action::Delete(action_data) = action {
+    let deleted_entry_hash = action_data.deletes_entry_address;
+}
+```
+
+### Agent
+
+An agent's ID is just their public key, and an entry for their ID is stored on the DHT. The hashing function for an agent ID entry is just the literal value of the entry. This is a roundabout way of saying that you link to or from an agent using their public key as a hash.
+
+An agent can get their own ID by calling [`hdk::prelude::agent_info`](https://docs.rs/hdk/latest/hdk/info/fn.agent_info.html). Note that agents may change their ID if their public key has been lost or stolen, so they may have more than one ID over the course of their participation in a network.
+
+```rust
+use hdk::prelude::*;
+
+let my_first_id = agent_info()?.agent_initial_pubkey;
+let my_current_id = agent_info()?.agent_latest_pubkey;
+```
+
+All actions have their author's ID as a field.
You can get this field by calling the action's `author()` method:
+
+```rust
+let author_id = action.author();
+```
+
+### External reference
+
+Because an external reference comes from outside of a DHT, it's up to you to decide how to get it into the application. Typically, an external client such as a UI or bridging service would pass this value into your app.
+
+```rust
+use hdk::prelude::*;
+use movie_integrity::*;
+
+#[hdk_extern]
+fn add_movie_poster_from_ipfs(movie_entry_hash: EntryHash, ipfs_hash_bytes: Vec<u8>) {
+    let ipfs_hash = ExternalHash::from_raw_32(ipfs_hash_bytes);
+    create_link(
+        movie_entry_hash,
+        ipfs_hash,
+        LinkTypes::IpfsMoviePoster,
+        ()
+    );
+}
+```
+
+### DNA
+
+There is one global hash that everyone knows, and that's the hash of the DNA itself. You can get it by calling [`hdk::prelude::dna_info`](https://docs.rs/hdk/latest/hdk/info/fn.dna_info.html).
+
+```rust
+use hdk::prelude::*;
+
+let dna_hash = dna_info()?.hash;
+```
+
+!!! info Linking from a DNA hash is not recommended
+Because every participant in an application's network takes responsibility for storing a portion of the DHT's address space, attaching many links to a well-known hash can create 'hot spots' and place an undue CPU, storage, and network burden on the peers in the neighborhood of that hash. Instead, we recommend you use [paths and anchors](#paths-and-anchors) to 'shard' responsibility throughout the DHT.
+!!!
+
+## Retrieve links
+
+Get all the live links attached to a hash with the [`hdk::prelude::get_links`](https://docs.rs/hdk/latest/hdk/link/fn.get_links.html) function.
+
+```rust
+use hdk::prelude::*;
+use movie_integrity::*;
+
+let director_entry_hash = EntryHash::from_raw_36(vec![/* Sergio Leone's hash */]);
+let movies_by_director = get_links(
+    director_entry_hash,
+    LinkTypes::DirectorToMovie,
+    None
+)?;
+let movie_entry_hashes = movies_by_director
+    .iter()
+    .map(|link| link.target.into_entry_hash()?);
+```
+
+If you want to filter the returned links by tag, pass some bytes as the third parameter. The peer holding the links for the requested base address will return a list of links whose tag starts with those bytes.
+
+```rust
+use hdk::prelude::*;
+use movie_integrity::*;
+
+let movies_in_1960s_by_director = get_links(
+    director_entry_hash,
+    LinkTypes::DirectorToMovie,
+    // Matches any tag that starts with these bytes, e.g. "year:1966".
+    Some(LinkTag::new("year:196"))
+);
+```
+
+To get all live _and dead_ links, along with any deletion actions, use [`hdk::prelude::get_link_details`](https://docs.rs/hdk/latest/hdk/link/fn.get_link_details.html). This function has the same options as `get_links`.
+
+```rust
+use hdk::prelude::*;
+use movie_integrity::*;
+
+let movies_plus_deleted = get_link_details(
+    director_entry_hash,
+    LinkTypes::DirectorToMovie,
+    None
+);
+```
+
+### Count links
+
+If all you need is a count of matching links, such as for an unread messages badge, use [`hdk::prelude::count_links`](https://docs.rs/hdk/latest/hdk/prelude/fn.count_links.html). It has a different input with more options for querying (we'll likely update the inputs of `get_links` and `get_link_details` to match `count_links` in the future).
+ +```rust +use hdk::prelude::*; +use movie_integrity::*; + +let my_current_id = agent_info()?.agent_latest_pubkey; +let today = sys_time()?; +let number_of_reviews_written_by_me_in_last_month = count_links( + LinkQuery::new(movie_entry_hash, LinkTypes::MovieReview) + .after(Some(today - 1000 * 1000 * 60 * 60 * 24 * 30)) + .before(Some(today)) + .author(my_current_id) +); +``` + +## Paths and anchors + +Sometimes the easiest way to discover a link base is to embed it into the application's code. You can create an **anchor**, an entry whose content is a well-known blob, and hash that blob any time you need to retrieve links. This can be used to simulate collections or tables in your graph database. As [mentioned](#getting-hashes-for-use-in-linking), the entry does not even need to be stored; you can simply create it, hash it, and use the hash in your link. + +While you can build this yourself, this is such a common pattern that the HDK implements it for you in the [`hdk::hash_path`](https://docs.rs/hdk/latest/hdk/hash_path/index.html) module. The implementation supports both anchors and **paths**, which are hierarchies of anchors. + +!!! info Avoiding DHT hot spots +It's recommended to not attach a large number of links to a single anchor, as that creates extra work for the peers responsible for that anchor's hash. Instead, use paths to split the links into appropriate 'buckets' and spread the work around. We'll give an example of that below. +!!! + +### Scaffold a simple collection + +If you've been using the scaffolding tool, you can scaffold a simple collection for an entry type with the command `hc scaffold collection`. Behind the scenes, it uses the anchor pattern. 
+ +Follow the prompts to choose the entry type, names for the link types and anchor, and the scope of the collection, which can be either: + +* all entries of type, or +* entries of type by author + +It'll scaffold all the code needed to create a path anchor, create links, and delete links in the already scaffolded entry CRUD functions. + +### Paths + +Create a path by constructing a [`hdk::hash_path::path::Path`](https://docs.rs/hdk/latest/hdk/hash_path/path/struct.Path.html) struct, hashing it, and using the hash in `create_link`. The string of the path is a simple [domain-specific language](https://docs.rs/hdk/latest/hdk/hash_path/path/struct.Path.html#impl-From%3C%26str%3E-for-Path), in which dots denote sections of the path. + +```rust +use hdk::hash_path::path::*; +use movie_integrity::*; + +let path_to_movies_starting_with_g = Path::from("movies_by_first_letter.g") + // A path requires a link type from the integrity zome. Here, we're using the + // `MovieByFirstLetterAnchor` type that we created. + .typed(LinkTypes::MovieByFirstLetterAnchor); + +// Make sure it exists before attaching links to it -- if it already exists, +// ensure() will have no effect. +path_to_movies_starting_with_g.ensure()?; + +let create_link_hash = create_link( + path_to_movies_starting_with_g.path_entry_hash()?, + movie_entry_hash, + LinkTypes::MovieByFirstLetter, + () +)?; +``` + +Retrieve all the links on a path by constructing the path, then getting its hash: + +```rust +use hdk::hash_path::path::*; +use movie_integrity::*; + +// Note that a path doesn't need to be typed in order to compute its hash. 
+let path_to_movies_starting_with_g = Path::from("movies_by_first_letter.g"); +let links_to_movies_starting_with_g = get_links( + path_to_movies_starting_with_g.path_entry_hash()?, + LinkTypes::MovieByFirstLetter, + None +)?; +``` + +Retrieve all child paths of a path by constructing the parent path, typing it, and calling its `children_paths()` method: + +```rust +use hdk::hash_path::path::*; +use movie_integrity::*; + +let parent_path = Path::from("movies_by_first_letter") + .typed(LinkTypes::MovieByFirstLetterAnchor); +let all_first_letter_paths = parent_path.children_paths()?; +// Do something with the paths. Note: this would be expensive to do in practice. +let links_to_all_movies = all_first_letter_paths + .iter() + .map(|path| get_links(path.path_entry_hash()?, LinkTypes::MovieByFirstLetter, None)?) + .flatten() + .collect(); +``` + +### Anchors + +In the HDK, an 'anchor' is just a path with two levels of hierarchy. The examples below show how to implement the path-based examples above, but as anchors. Generally the implementation is simpler. + +Create an anchor by calling the [`hdk::prelude::anchor`](https://docs.rs/hdk/latest/hdk/prelude/fn.anchor.html) host function. An anchor can have two levels of hierarchy, which you give as the second and third arguments of the function. + +```rust +use hdk::prelude::*; +use movie_integrity::*; + +// This function requires a special link type to be created in the integrity +// zome. Here, we're using the `MovieByFirstLetterAnchor` type that we created. +let movies_starting_with_g_anchor_hash = anchor(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter", "g"); +let create_link_hash = create_link( + movies_starting_with_g_anchor_hash, + movie_entry_hash, + LinkTypes::MovieByFirstLetter, + () +); +``` + +The `anchor` function creates no entries, just links, and will only create links that don't currently exist. 
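The path and anchor examples on this page hard-code the `g` segment. As a rough illustration of how a zome might derive that segment from a title, here is a minimal sketch that uses only the standard library. The function names `first_letter_bucket` and `first_letter_path`, the choice to skip a leading English article, and the `_` fallback bucket are all assumptions made for this example; none of them come from the HDK.

```rust
/// Pick a one-character bucket for a title (hypothetical helper).
/// A leading "the", "an", or "a" is skipped so that
/// "The Good, the Bad, and the Ugly" lands under "g".
fn first_letter_bucket(title: &str) -> String {
    let lower = title.trim().to_lowercase();
    let without_article = ["the ", "an ", "a "]
        .iter()
        .find_map(|article| lower.strip_prefix(*article))
        .unwrap_or(lower.as_str());
    without_article
        .chars()
        // Skip punctuation and whitespace; this also guarantees the
        // bucket never contains the dot used as a path separator.
        .find(|c| c.is_alphanumeric())
        .map(|c| c.to_string())
        // Assumed fallback bucket for titles with no letters or digits.
        .unwrap_or_else(|| "_".to_string())
}

/// Build the dotted path string used in the examples on this page.
fn first_letter_path(title: &str) -> String {
    format!("movies_by_first_letter.{}", first_letter_bucket(title))
}
```

The resulting string could then be passed to `Path::from`, or the bucket alone could serve as the second-level text of an anchor. Using the same function on the write side and the read side keeps each movie's link in a predictable bucket.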
+
+Retrieve all the linked items from an anchor just as you would any link base:
+
+```rust
+use hdk::prelude::*;
+use movie_integrity::*;
+
+let anchor_hash_for_g = anchor(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter", "g");
+let links_to_movies_starting_with_g = get_links(anchor_hash_for_g, LinkTypes::MovieByFirstLetter, None);
+```
+
+Retrieve the _names_ of all the second-level anchors for a top-level anchor by calling [`hdk::prelude::list_anchor_tags`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_tags.html):
+
+```rust
+use hdk::prelude::*;
+use movie_integrity::*;
+
+let all_first_letters = list_anchor_tags(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter");
+```
+
+Retrieve the _addresses_ of all the second-level anchors for a top-level anchor by calling [`hdk::prelude::list_anchor_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_addresses.html):
+
+```rust
+use hdk::prelude::*;
+use movie_integrity::*;
+
+let hashes_of_all_first_letters = list_anchor_addresses(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter");
+```
+
+Retrieve the _addresses_ of all the top-level anchors by calling [`hdk::prelude::list_anchor_type_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_type_addresses.html):
+
+```rust
+use hdk::prelude::*;
+use movie_integrity::*;
+
+let hashes_of_all_top_level_anchors = list_anchor_type_addresses(LinkTypes::MovieByFirstLetterAnchor);
+```
+
+## Reference
+
+* [`hdi::prelude::hdk_link_types`](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_link_types.html)
+* [`hdk::prelude::create_link`](https://docs.rs/hdk/latest/hdk/link/fn.create_link.html)
+* [`hdk::prelude::delete_link`](https://docs.rs/hdk/latest/hdk/link/fn.delete_link.html)
+* Getting hashes from data
+    * [`hdk::prelude::Record#action_address`](https://docs.rs/hdk/latest/hdk/prelude/struct.Record.html#method.action_address)
+    * 
[`hdk::prelude::HasHash`](https://docs.rs/hdk/latest/hdk/prelude/trait.HasHash.html) + * [`hdk::prelude::Action`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html) (contains fields with hashes of referenced data in them) + * [`hdk::hash::hash_entry`](https://docs.rs/hdk/latest/hdk/hash/fn.hash_entry.html) + * [`hdk::prelude::agent_info`](https://docs.rs/hdk/latest/hdk/info/fn.agent_info.html) + * [`hdk::prelude::dna_info`](https://docs.rs/hdk/latest/hdk/info/fn.dna_info.html) +* [`hdk::prelude::get_links`](https://docs.rs/hdk/latest/hdk/link/fn.get_links.html) +* [`hdk::prelude::get_link_details`](https://docs.rs/hdk/latest/hdk/link/fn.get_link_details.html) +* [`hdk::prelude::count_links`](https://docs.rs/hdk/latest/hdk/prelude/fn.count_links.html) +* [`hdk::hash_path`](https://docs.rs/hdk/latest/hdk/hash_path/index.html) +* [`hdk::prelude::anchor`](https://docs.rs/hdk/latest/hdk/prelude/fn.anchor.html) + +## Further reading + +* [Core Concepts: Links and Anchors](https://developer.holochain.org/concepts/5_links_anchors/) \ No newline at end of file From 34847bc4a3df9734c2392672408c74ee2a9a74c5 Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Thu, 6 Jun 2024 14:05:39 -0700 Subject: [PATCH 03/12] add links page to nav --- src/pages/_data/navigation/mainNav.json5 | 1 + src/pages/build/index.md | 1 + src/pages/build/working-with-data.md | 3 ++- 3 files changed, 4 insertions(+), 1 deletion(-) diff --git a/src/pages/_data/navigation/mainNav.json5 b/src/pages/_data/navigation/mainNav.json5 index e52789649..86031fb1c 100644 --- a/src/pages/_data/navigation/mainNav.json5 +++ b/src/pages/_data/navigation/mainNav.json5 @@ -28,6 +28,7 @@ { title: "Build", url: "/build/", children: [ { title: "Working with Data", url: "/build/working-with-data/", children: [ { title: "Entries", url: "/build/entries/" }, + { title: "Links, Paths, and Anchors", url: "/build/links-paths-and-anchors/" }, ]}, ] }, diff --git a/src/pages/build/index.md b/src/pages/build/index.md index 
97e890f8e..b01cdf065 100644 --- a/src/pages/build/index.md +++ b/src/pages/build/index.md @@ -17,4 +17,5 @@ This Build Guide organizes everything you need to know about developing Holochai * [Overview](/build/working-with-data/) --- general concepts related to working with data in Holochain * [Entries](/build/entries/) --- creating, reading, updating, and deleting +* [Links, Paths, and Anchors](/build/links-paths-and-anchors/) --- creating and deleting ::: \ No newline at end of file diff --git a/src/pages/build/working-with-data.md b/src/pages/build/working-with-data.md index 58442b28a..093d4b71d 100644 --- a/src/pages/build/working-with-data.md +++ b/src/pages/build/working-with-data.md @@ -52,7 +52,7 @@ This database is stored in a [distributed hash table (DHT)](/resources/glossary/ ### Links -A link is a piece of metadata attached to an address, the **base**, and points to another address, the **target**. It has a **link type** that gives it meaning in the application just like an entry type, as well as an optional **tag** that can store arbitrary application data. +A [link](/build/links-paths-and-anchors/) is a piece of metadata attached to an address, the **base**, and points to another address, the **target**. It has a **link type** that gives it meaning in the application just like an entry type, as well as an optional **tag** that can store arbitrary application data.

 [Diagram: a "Simon & Garfunkel" node linked to "Sounds of Silence" and "Bridge over Troubled Water". Each album is reached by one link of type `artist_album` and one link of type `artist_album_by_release_date`, tagged `1966-01-17` and `1970-01-26` respectively.]
@@ -160,4 +160,5 @@ The shared DHT and the individual source chains are involved in multiple interre ### In this section {data-no-toc} * [Entries](/build/entries/) --- creating, reading, updating, and deleting +* [Links, Paths, and Anchors](/build/links-paths-and-anchors/) --- creating and deleting ::: \ No newline at end of file From c4047060e11256c7dd5bc7fe7b9b02455b548c33 Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Thu, 6 Jun 2024 14:05:53 -0700 Subject: [PATCH 04/12] modifications to entries page to make the format fit better with links --- src/pages/build/entries.md | 40 +++++++++++++++++++++----------------- 1 file changed, 22 insertions(+), 18 deletions(-) diff --git a/src/pages/build/entries.md b/src/pages/build/entries.md index aac91a8f7..595aefdeb 100644 --- a/src/pages/build/entries.md +++ b/src/pages/build/entries.md @@ -15,9 +15,13 @@ An entry is always paired with an **entry creation action** that tells you who a The pairing of an entry and the action that created it is called a **record**, which is the basic unit of data in a Holochain application. +## Scaffold an entry type and CRUD API + +The Holochain dev tool command `hc scaffold entry-type ` generates the code for a simple entry type and a CRUD API. It presents an interface that lets you define a struct and its fields, then asks you to choose whether to implement update and delete functions for it along with the default create and read functions. + ## Define an entry type -Each entry has a **type**, which your application code uses to make sense of the entry's bytes. Our [HDI library](https://docs.rs/hdi/latest/hdi/) gives you macros to automatically define, serialize, and deserialize entry types to and from any Rust struct or enum that [`serde`](https://docs.rs/serde/latest/serde/) can handle. +Each entry has a **type**. This lets your application make sense of what would otherwise be a blob of arbitrary bytes. 
Our [HDI library](https://docs.rs/hdi/latest/hdi/) gives you macros to automatically define, serialize, and deserialize typed entries to and from any Rust struct or enum that [`serde`](https://docs.rs/serde/latest/serde/) can handle. Entry types are defined in an [**integrity zome**](/resources/glossary/#integrity-zome). To define an [`EntryType`](https://docs.rs/hdi/latest/hdi/prelude/enum.EntryType.html), use the [`hdi::prelude::hdk_entry_helper`](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_entry_helper.html) macro on your Rust type: @@ -27,7 +31,7 @@ use hdi::prelude::*; #[hdk_entry_helper] pub struct Movie { title: String, - director: String, + director_hash: EntryHash, imdb_id: Option, release_date: Timestamp, box_office_revenue: u128, @@ -42,13 +46,18 @@ In order to dispatch validation to the proper integrity zome, Holochain needs to use hdi::prelude::*; #[hdk_entry_defs] +#[unit_enum(UnitEntryTypes)] enum EntryTypes { Movie(Movie), // other types... } ``` -### Configuring an entry type +This also gives you an enum that you can use later when you're storing app data. This is important because, under the hood, an entry type consists of two bytes --- an integrity zome index and an entry def index. These are required whenever you want to write an entry. Instead of having to remember those values every time you store something, your coordinator zome can just import and use this enum, which already knows how to convert each entry type to the right IDs. + +The code sample above also uses a macro called `unit_enum`, which will generate an enum of all the entry types' names _without_ places to store values. 
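To make the difference between the two enums concrete, here is a hand-written sketch of the same shape in plain Rust. It is not the output of `unit_enum`; the `Director` struct, the simplified `Movie` struct, and the `to_unit` method are assumptions added for illustration, and the real macro may generate different items.

```rust
// Hypothetical stand-ins for entry structs defined in an integrity zome.
#[derive(Debug, Clone, PartialEq)]
struct Movie {
    title: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Director {
    name: String,
}

// A data-carrying enum, similar in shape to the one that
// `hdk_entry_defs` is applied to: each variant holds a value.
#[derive(Debug, Clone, PartialEq)]
enum EntryTypes {
    Movie(Movie),
    Director(Director),
}

// A hand-written unit enum: the same variant names, but with
// no place to store a value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnitEntryTypes {
    Movie,
    Director,
}

impl EntryTypes {
    // Map a value-holding variant to its name-only counterpart
    // (assumed helper, written by hand for this sketch).
    fn to_unit(&self) -> UnitEntryTypes {
        match self {
            EntryTypes::Movie(_) => UnitEntryTypes::Movie,
            EntryTypes::Director(_) => UnitEntryTypes::Director,
        }
    }
}
```

A unit enum like this is handy when code needs to name an entry type without having an entry value in hand, for instance when filtering by type.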
+ +### Configure an entry type Each variant in the enum should hold the Rust type that corresponds to it, and is implicitly marked with an `entry_def` proc macro which, if you specify it explicitly, lets you configure the given entry type further: @@ -59,6 +68,7 @@ Each variant in the enum should hold the Rust type that corresponds to it, and i use hdi::prelude::*; #[hdk_entry_defs] +#[unit_enum(UnitEntryTypes)] enum EntryTypes { #[entry_def(required_validations = 7, )] Movie(Movie), @@ -72,13 +82,11 @@ enum EntryTypes { } ``` -This also gives you an enum that you can use later when you're storing app data. This is important because, under the hood, an entry type consists of two bytes -- an integrity zome index and an entry def index. These are required whenever you want to write an entry. Instead of having to remember those values every time you store something, your coordinator zome can just import and use this enum, which already knows how to convert each entry type to the right IDs. - ## Create an entry Most of the time you'll want to define your create, read, update, and delete (CRUD) functions in a [**coordinator zome**](/resources/glossary/#coordinator-zome) rather than the integrity zome that defines it. This is because a coordinator zome is easier to update in the wild than an integrity zome. -Create an entry by calling [`hdk::prelude::create_entry`](https://docs.rs/hdk/latest/hdk/prelude/fn.create_entry.html). If you used `hdk_entry_helper` and `hdk_entry_defs` macro in your integrity zome (see [Define an entry type](#define-an-entry-type)), you can use the entry types enum you defined, and the entry will be serialized and have the correct integrity zome and entry type indexes added to it. +Create an entry by calling [`hdk::prelude::create_entry`](https://docs.rs/hdk/latest/hdk/entry/fn.create_entry.html). 
If you used `hdk_entry_helper` and `hdk_entry_defs` macro in your integrity zome (see [Define an entry type](#define-an-entry-type)), you can use the entry types enum you defined, and the entry will be serialized and have the correct integrity zome and entry type indexes added to it. ```rust use hdk::prelude::*; @@ -88,7 +96,7 @@ use movie_integrity::*; let movie = Movie { title: "The Good, the Bad, and the Ugly", - director: "Sergio Leone" + director_hash: EntryHash::from_raw_36(vec![ /* hash of 'Sergio Leone' entry */ ]), imdb_id: Some("tt0060196"), release_date: Timestamp::from(Date::Utc("1966-12-23")), box_office_revenue: 389_000_000, @@ -374,13 +382,9 @@ match maybe_details { } ``` -## Scaffolding an entry type and CRUD API - -The Holochain dev tool command `hc scaffold entry-type ` generates the code for a simple entry type and a CRUD API. It presents an interface that lets you define a struct and its fields, then asks you to choose whether to implement update and delete functions for it along with the default create and read functions. - ## Community CRUD libraries -If the scaffolder doesn't support your desired functionality, or is too low-level, there are some community-maintained libraries that offer opinionated and high-level ways to work with entries. Some of them also offer permissions management. +There are some community-maintained libraries that offer opinionated and high-level ways to work with entries. Some of them also offer permissions management. 
* [rust-hc-crud-caps](https://github.com/spartan-holochain-counsel/rust-hc-crud-caps) * [hdk_crud](https://github.com/lightningrodlabs/hdk_crud) @@ -388,12 +392,12 @@ If the scaffolder doesn't support your desired functionality, or is too low-leve ## Reference -* [hdi::prelude::hdk_entry_helper](https://docs.rs/hdi/latest/hdi/attr.hdk_entry_helper.html) -* [hdi::prelude::hdk_entry_defs](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_entry_defs.html) -* [hdi::prelude::entry_def](https://docs.rs/hdi/latest/hdi/prelude/entry_def/index.html) -* [hdk::prelude::create_entry](https://docs.rs/hdk/latest/hdk/entry/fn.create_entry.html) -* [hdk::prelude::update_entry](https://docs.rs/hdk/latest/hdk/entry/fn.update_entry.html) -* [hdi::prelude::delete_entry](https://docs.rs/hdk/latest/hdk/entry/fn.delete_entry.html) +* [`hdi::prelude::hdk_entry_helper`](https://docs.rs/hdi/latest/hdi/attr.hdk_entry_helper.html) +* [`hdi::prelude::hdk_entry_defs`](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_entry_defs.html) +* [`hdi::prelude::entry_def`](https://docs.rs/hdi/latest/hdi/prelude/entry_def/index.html) +* [`hdk::prelude::create_entry`](https://docs.rs/hdk/latest/hdk/entry/fn.create_entry.html) +* [`hdk::prelude::update_entry`](https://docs.rs/hdk/latest/hdk/entry/fn.update_entry.html) +* [`hdk::prelude::delete_entry`](https://docs.rs/hdk/latest/hdk/entry/fn.delete_entry.html) ## Further reading From 047e9ac1d4c1058fb69e934feff62d463953e4f0 Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Thu, 6 Jun 2024 
15:11:57 -0700 Subject: [PATCH 06/12] fix broken links --- src/pages/build/links-paths-and-anchors.md | 2 +- src/pages/build/working-with-data.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/pages/build/links-paths-and-anchors.md b/src/pages/build/links-paths-and-anchors.md index d3f25bf0a..019887da2 100644 --- a/src/pages/build/links-paths-and-anchors.md +++ b/src/pages/build/links-paths-and-anchors.md @@ -229,7 +229,7 @@ let dna_hash = dna_info()?.hash; ``` !!! info Linking from a DNA hash is not recommended -Because every participant in an application's network takes responsibility for storing a portion of the DHT's address space, attaching many links to a well-known hash can create 'hot spots' and cause an undue CPU, storage, and network burden the peers in the neighborhood of that hash. Instead, we recommend you use [anchors and paths](#anchors-and-paths) to 'shard' responsibility throughout the DHT. +Because every participant in an application's network takes responsibility for storing a portion of the DHT's address space, attaching many links to a well-known hash can create 'hot spots' and cause an undue CPU, storage, and network burden the peers in the neighborhood of that hash. Instead, we recommend you use [paths or anchors](#paths-and-anchors) to 'shard' responsibility throughout the DHT. !!! ## Retrieve links diff --git a/src/pages/build/working-with-data.md b/src/pages/build/working-with-data.md index 093d4b71d..13af0fa7d 100644 --- a/src/pages/build/working-with-data.md +++ b/src/pages/build/working-with-data.md @@ -42,7 +42,7 @@ Addressable content can either be: * **Private**, stored on the author's device in their [source chain](#individual-state-histories-as-public-records), or * **Public**, stored in the application's shared graph database and accessible to all participants. -All actions are public, while entries can be either public or [private](/build/entries/#configuring-an-entry-type). 
External references hold neither public nor private content, but merely point to content outside the database. +All actions are public, while entries can be either public or [private](/build/entries/#configure-an-entry-type). External references hold neither public nor private content, but merely point to content outside the database. ## Shared graph database @@ -70,7 +70,7 @@ Holochain has a built-in [**create, read, update, and delete (CRUD)** model](/co All data in an application's database ultimately comes from the peers who participate in storing and serving it. Each piece of data originates in a participant's source chain, which is an [event journal](https://martinfowler.com/eaaDev/EventSourcing.html) that contains all the actions they've authored. These actions describe intentions to add to either the DHT's state or their own state. -Every action becomes part of the shared DHT, but not every entry needs to. The entry content of most system-level actions is private. You can also [mark an application entry type as private](/build/entries/#configuring-an-entry-type), and its content will stay on the participant's device and not get published to the graph. +Every action becomes part of the shared DHT, but not every entry needs to. The entry content of most system-level actions is private. You can also [mark an application entry type as private](/build/entries/#configure-an-entry-type), and its content will stay on the participant's device and not get published to the graph. Because every action has a reference to both its author and its previous action in the author's source chain, each participant's source chain can be considered a linear graph of their authoring history. 
From 7a81f50ba8b2556289870fc388ff5a7f96a65b96 Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Fri, 7 Jun 2024 10:09:48 -0700 Subject: [PATCH 07/12] small text edit on links page --- src/pages/build/links-paths-and-anchors.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/pages/build/links-paths-and-anchors.md b/src/pages/build/links-paths-and-anchors.md index 019887da2..8fe93b3fa 100644 --- a/src/pages/build/links-paths-and-anchors.md +++ b/src/pages/build/links-paths-and-anchors.md @@ -27,13 +27,14 @@ Every link has a type that you define in an integrity zome, just like [an entry] * A **type** * An optional **tag** that can hold a small amount of arbitrary bytes, up to 4 kb -The tag could be considered link 'content', but unlike an entry type, the HDK doesn't provide a macro that automatically deserializes the tag into a Rust struct or enum. It can be used to further qualify the link, provide data about the target that saves DHT queries, or be matched against in link queries. +The tag could be considered link 'content' that can be used to further qualify the link, provide data about the target that saves DHT queries, or be matched against in link queries. But unlike an entry's content, the HDK doesn't provide a macro that automatically deserializes the link tag's content into a Rust type. -[Just as with entries](/build/entries/#define-an-entry-type), Holochain needs to know about your link tags in order to dispatch validation to the right integrity zome. You can do this by implementing a `link_types` callback function, and the easiest way to do this is to add the [`hdi::prelude::hdk_link_types`](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_link_types.html) macro to an enum that defines all your link types: +[Just as with entries](/build/entries/#define-an-entry-type), Holochain needs to know about your link types in order to dispatch validation to the right integrity zome. 
You can do this by implementing a `link_types` callback function, and the easiest way to do this is to add the [`hdi::prelude::hdk_link_types`](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_link_types.html) macro to an enum that defines all your link types: ```rust use hdi::prelude::*; +// Generate a `link_types` function that returns a list of definitions. #[hdk_link_types] enum LinkTypes { DirectorToMovie, From 7123b93e97c81c8f4cfa56f4a0bff628b1918be1 Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Fri, 27 Sep 2024 13:49:22 -0700 Subject: [PATCH 08/12] small content edits --- src/pages/build/links-paths-and-anchors.md | 40 +++++++++++----------- src/pages/build/working-with-data.md | 24 +++++++++---- 2 files changed, 37 insertions(+), 27 deletions(-) diff --git a/src/pages/build/links-paths-and-anchors.md b/src/pages/build/links-paths-and-anchors.md index 8fe93b3fa..67b9b95f8 100644 --- a/src/pages/build/links-paths-and-anchors.md +++ b/src/pages/build/links-paths-and-anchors.md @@ -3,20 +3,18 @@ title: "Links, Paths, and Anchors" --- ::: intro -A **link** connects two addresses in an application's shared database, forming a graph database on top of the underlying hash table. Links can be used to relate two pieces of [addressable content](/resources/glossary/#addressable-content) in the database to each other or point to references to addressable content outside the database. +A **link** connects two addresses in an application's shared database, forming a graph database on top of the underlying hash table. Links can be used to connect pieces of [addressable content](/resources/glossary/#addressable-content) in the database or references to addressable content outside the database. **Paths** and **anchors** build on the concept of links, allowing you to create collections, pagination, indexes, and hierarchical structures. 
::: ## Turning a hash table into a graph database -A Holochain application's database is, at heart, just a big key-value store --- or more specifically, a hash table. You can store content at its hash, and you can retrieve content by its hash. This is useful if you already know the hash of the content you want to retrieve, but it isn't helpful if you don't know the hash of the content you're looking for. +A Holochain application's database is, at heart, just a big key-value store --- or more specifically, a hash table. You can store and retrieve content by hash. This is useful if you already know the hash of the content you want to retrieve, but it isn't helpful if you don't know the hash of the content you're looking for. -A piece of content itself can contain a hash in one of its fields, and that's great for modelling a _many-to-one relationship_. For instance, an [**action**](/build/#entries-actions-and-records-primary-data), one of the primary types of data in the database, points to the action that precedes it and to the entry it creates, updates, or delete. +A piece of content itself can contain a hash as part of its data structure, and that's great for modelling a _many-to-one relationship_. And if the number of things the content points to is small and doesn't change often, you can model a _many-to-many relationship_ using a field that contains an array of hashes. But at a certain point this becomes hard to manage, especially if that list regularly changes or gets really large. -And if the number of things the content points to is small and doesn't change often, you can model a _many-to-many relationship_ using a field that contains an array of hashes. But at a certain point this becomes hard to manage, especially if that list grows or regularly changes. - -Holochain's hash table stores _metadata_ in addition to primary addressable content. This lets you attach **links** to an address in the database. 
You can then retrieve a full or filtered list of links from that address in order to discover more primary content. In this way you can build up a fully traversable graph database. +But Holochain also lets you attach **links** as metadata on an address in the database. You can then retrieve a full or filtered list of links from that address in order to discover more addressable content. In this way you can build up a traversable graph database. ### Define a link type @@ -27,7 +25,7 @@ Every link has a type that you define in an integrity zome, just like [an entry] * A **type** * An optional **tag** that can hold a small amount of arbitrary bytes, up to 4 kb -The tag could be considered link 'content' that can be used to further qualify the link, provide data about the target that saves DHT queries, or be matched against in link queries. But unlike an entry's content, the HDK doesn't provide a macro that automatically deserializes the link tag's content into a Rust type. +The tag could be considered link 'content' that can be used to further qualify the link, provide data about the target that saves on DHT queries, or be queried with a starts-with search. But unlike an entry's content, the HDK doesn't provide a macro that automatically deserializes the link tag's content into a Rust type. [Just as with entries](/build/entries/#define-an-entry-type), Holochain needs to know about your link types in order to dispatch validation to the right integrity zome. You can do this by implementing a `link_types` callback function, and the easiest way to do this is to add the [`hdi::prelude::hdk_link_types`](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_link_types.html) macro to an enum that defines all your link types: @@ -51,7 +49,7 @@ enum LinkTypes { ## Create a link -As with entries, you'll normally want to store your link CRUD code in a [**coordinator zome**](/resources/glossary/#coordinator-zome). 
You can read about why in the page on [entries](/build/entries/#create-an-entry). +As with entries, you'll normally want to store your link CRUD code in a [**coordinator zome**](/resources/glossary/#coordinator-zome), not an integrity zome. You can read about why in the page on [entries](/build/entries/#create-an-entry). Create a link by calling [`hdk::prelude::create_link`](https://docs.rs/hdk/latest/hdk/link/fn.create_link.html). If you used the `hdk_link_types` macro in your integrity zome (see [Define a link type](#define-a-link-type)), you can use the link types enum you defined, and the link will have the correct integrity zome and link type indexes added to it. @@ -72,7 +70,7 @@ let create_link_action_hash = create_link( ); ``` -Links can't be updated; they can only be created or deleted. +Links can't be updated; they can only be created or deleted. Multiple links with the same base, target, type, and tag can be created, and they'll be considered separate links for retrieval and deletion purposes. ## Delete a link @@ -93,19 +91,19 @@ A link is considered dead once its creation action has one or more delete-link a Because linking is all about connecting hashes to other hashes, here's how you get a hash for a piece of content. !!! info A note on the existence of data -An address doesn't have to have content stored at it in order for you to link to or from it. (In the case of external references, it's certain that data won't exist at the address.) If you want to add this extra constraint, you'll need to check for the presence of data at the base and/or target address in your link validation code. +An address doesn't have to have content stored at it in order for you to link to or from it. (In the case of external references, it's certain that data won't exist at the address.) If you want to require data to exist at the base or target, and if the data needs to be of a certain type, you'll need to check for this in your link validation code. !!! 
-### Actions +### Action -Any host function that records an action on an agent's source chain, such as `create`, `update`, `delete`, `create_link`, and `delete_link`, returns the hash of the action. You can use this for further writes in the same function call, or you can return it to the client that called the function so that it can use it later. You can also [pass it to another function](/concepts/8_calls_capabilities/) --- either one in the same cell, another cell in the agent's hApp, or another cell in the same network. +Any CRUD host function that records an action on an agent's source chain, such as `create`, `update`, `delete`, `create_link`, and `delete_link`, returns the hash of the action. You can use this in links, either for further writes in the same function call or elsewhere. !!! info Action hashes aren't certain until zome function lifecycle completes -When you get an action hash back from the host function that creates it, it doesn't mean the action is available on the DHT yet, or will ever be available. The action isn't written until the function that writes it completes, then passes the action to validation. If the function or the validation fail, the action will be discarded. And if it is successful, the action won't become fully available on the DHT until it's been published to a sufficient number of peers. +Like we mentioned in [Working with Data](/guide/working-with-data/#content-addresses), actions aren't actually there until the zome function that writes them completes successfully. And if you use 'relaxed' chain top ordering, your zome function can't depend on the action hash it gets back from the CRUD host function, because the final value might change before it's written. -It's safer to share action hashes with other peers or cells in a callback called `post_commit()`. If your coordinator zome defines this callback, it'll be called after every successful function call within that zome. 
+It's safer to share action hashes with other peers or cells in a callback called `post_commit()`. If your coordinator zome defines this callback, it'll be called after every successful function call within that zome, with the actual final action hashes. !!! If you have a variable that contains a [`hdk::prelude::Action`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html) or [`hdk::prelude::Record`](https://docs.rs/hdk/latest/hdk/prelude/struct.Record.html), you can also get its hash using the following methods: @@ -119,7 +117,7 @@ assert_eq!(action_hash_from_record, action_hash_from_action); (But it's worth pointing out that if you have this value, it's probably because you just retrieved the action by hash, which means you probably already know the hash.) -To get the hash of an action from an action that deletes or updates it, match on the [`Action::Update`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Update) or [`Action::Delete`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Delete) action variants and access the appropriate field: +To get the hash of an entry creation action from an action that deletes or updates it, match on the [`Action::Update`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Update) or [`Action::Delete`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Delete) action variants and access the appropriate field: ```rust if let Action::Update(action_data) = action { @@ -133,7 +131,7 @@ if let Action::Update(action_data) = action { ### Entry -To get the hash of an entry, first construct the entry struct or enum that you [defined in the integrity zome](/build/entries/#define-an-entry-type), then pass it through the [`hdk::hash::hash_entry`](https://docs.rs/hdk/latest/hdk/hash/fn.hash_entry.html) function. 
+To get the hash of an entry, first construct the entry struct or enum that you [defined in the integrity zome](/build/entries/#define-an-entry-type), then pass it through the [`hdk::hash::hash_entry`](https://docs.rs/hdk/latest/hdk/hash/fn.hash_entry.html) function. (Reminder: you don't actually have to write the entry to a source chain to get its hash for use in a link.) ```rust use hdk::hash::*; @@ -170,7 +168,7 @@ let entry_hash_from_action = record.action().entry_hash()? assert_equal!(entry_hash_from_record, entry_hash_from_action); ``` -Finally, to get the hash of an entry from an action updates or deletes it, match the action to the appropriate variant and access the corresponding field: +Finally, to get the hash of an entry from an action that updates or deletes it, match the action to the appropriate variant and access the corresponding field: ```rust if let Action::Update(action_data) = action { @@ -203,6 +201,8 @@ let author_id = action.author(); Because an external reference comes from outside of a DHT, it's up to you to decide how to get it into the application. Typically, an external client such as a UI or bridging service would pass this value into your app. +As mentioned in the [Entry](#entry) section, an entry hash can also be considered 'external' if you don't actually write it to the DHT. + ```rust use hdk::prelude::*; use movie_integrity::*; @@ -230,7 +230,7 @@ let dna_hash = dna_info()?.hash; ``` !!! info Linking from a DNA hash is not recommended -Because every participant in an application's network takes responsibility for storing a portion of the DHT's address space, attaching many links to a well-known hash can create 'hot spots' and cause an undue CPU, storage, and network burden the peers in the neighborhood of that hash. Instead, we recommend you use [paths or anchors](#paths-and-anchors) to 'shard' responsibility throughout the DHT. 
+Because every participant in an application's network takes responsibility for storing a portion of the DHT's address space, attaching many links to a well-known hash such as the DNA hash can create 'hot spots' and cause an undue CPU, storage, and network burden on the peers in the neighborhood of that hash. Instead, we recommend you use [paths or anchors](#paths-and-anchors) to 'shard' responsibility throughout the DHT. !!! ## Retrieve links @@ -298,12 +298,12 @@ let number_of_reviews_written_by_me_in_last_month = count_links( ## Paths and anchors -Sometimes the easiest way to discover a link base is to embed it into the application's code. You can create an **anchor**, an entry whose content is a well-known blob, and hash that blob any time you need to retrieve links. This can be used to simulate collections or tables in your graph database. As [mentioned](#getting-hashes-for-use-in-linking), the entry does not even need to be stored; you can simply create it, hash it, and use the hash in your link. +Sometimes the easiest way to discover a link base is to build it into the application's code. You can create an **anchor**, an entry whose content is a well-known blob, and hash that blob any time you need to retrieve links. This can be used to simulate collections or tables in your graph database. As [mentioned](#getting-hashes-for-use-in-linking), the entry does not even need to be stored; you can simply create it, hash it, and use the hash in your link. While you can build this yourself, this is such a common pattern that the HDK implements it for you in the [`hdk::hash_path`](https://docs.rs/hdk/latest/hdk/hash_path/index.html) module. The implementation supports both anchors and **paths**, which are hierarchies of anchors. !!! info Avoiding DHT hot spots -It's recommended to not attach a large number of links to a single anchor, as that creates extra work for the peers responsible for that anchor's hash. 
Instead, use paths to split the links into appropriate 'buckets' and spread the work around. We'll give an example of that below. +Don't attach too many links to a single anchor, as that creates extra work for the peers responsible for that anchor's hash. Instead, use paths to split the links into appropriate 'buckets' and spread the work around. We'll give an example of that below. !!! ### Scaffold a simple collection diff --git a/src/pages/build/working-with-data.md b/src/pages/build/working-with-data.md index 13af0fa7d..3bba9d47d 100644 --- a/src/pages/build/working-with-data.md +++ b/src/pages/build/working-with-data.md @@ -3,7 +3,7 @@ title: Working With Data --- ::: intro -Holochain is, at its most basic, a framework for building **graph databases** on top of **content-addressed storage** that are validated and stored by **networks of peers**. Each peer contributes to the state of this database by publishing **actions** to an event journal stored on their device called their **source chain**. The source chain can also be used to hold private state. +Holochain is, at its most basic, a framework for building **graph databases** on top of **content-addressed storage** that is validated and stored by **networks of peers**. Each peer contributes to the state of this database by publishing **actions** to an event journal called their **source chain**, which is stored on their device. The source chain can also be used to hold private data. ::: ## Entries, actions, and records: primary data @@ -26,6 +26,8 @@ One entry written by two actions is considered to be the same piece of content,
[Diagram: Alice, Bob, and Carol author Action 1, Action 2, and Action 3 respectively; each action creates the same Entry.]
+### Content addresses + Entries and actions are both **addressable content**, which means that they're retrieved by their address --- which is usually the hash of their data. All addresses are 32-byte identifiers. There are four types of addressable content: @@ -35,12 +37,18 @@ There are four types of addressable content: * An **action** stores action data for a record, and its address is the hash of the serialized action content. * An **external reference** is the ID of a resource that exists outside the database, such as the hash of an IPFS resource or the public key of an Ethereum address. There's no content stored at the address; it simply serves as an anchor to attach [links](#links) to. -### Storage locations +There are a few things to note about action hashes: + +* You can't know an action's hash until you've written the action, because it's influenced by both the previous action's hash and the participant's current system time. +* When you write an action, you can specify "relaxed chain top ordering". We won't go into the details here, but it means you can't depend on the action hash even after you write the action. +* Any function that writes actions is atomic, which means that actions aren't written until after the function succeeds _and_ all actions are successfully validated. That means that you shouldn't depend on content being available at an address until _after_ the function returns a success result. -Addressable content can either be: +### Storage locations and privacy -* **Private**, stored on the author's device in their [source chain](#individual-state-histories-as-public-records), or -* **Public**, stored in the application's shared graph database and accessible to all participants. +Each DNA creates a network of peers who participate in storing bits of that DNA's database, which means that each DNA's database (and the [source chains](#individual-state-histories-as-public-records) that contribute to it) is completely separate from all others. 
This creates per-network privacy for shared data. On top of that, addressable content can either be: + +* **Private**, stored on the author's device in their source chain and accessible to them only, or +* **Public**, stored in the graph database and accessible to all participants. All actions are public, while entries can be either public or [private](/build/entries/#configure-an-entry-type). External references hold neither public nor private content, but merely point to content outside the database. @@ -57,14 +65,16 @@ A [link](/build/links-paths-and-anchors/) is a piece of metadata attached to an

[Diagram: Simon & Garfunkel linked to Sounds of Silence and Bridge over Troubled Water by links of type artist_album and artist_album_by_release_date (tags 1966-01-17 and 1970-01-26).]
-When a link's base and target don't exist as addressable content in the database, they're considered external references whose data isn't accessible to your application's back end. +When a link's base and target don't exist as addressable content in the database, they're considered **external references**, and it's up to your front end to decide how to handle them.

[Diagram: a link of type eth_wallet_to_ipfs_profile_photo between the addresses hC8kafe9...7c12 and hC8kd01f...84ce.]
### CRUD metadata graph -Holochain has a built-in [**create, read, update, and delete (CRUD)** model](/concepts/6_crud_actions/). Data in the graph database and participants' local state cannot be modified or deleted, so these kinds of mutation are simulated by attaching metadata to existing data that marks changes to its status. This builds up a graph of the history of a given piece of content and its links. We'll get deeper into this in the [next section](#adding-and-modifying-data) and in the page on [entries](/build/entries/). +Holochain has a built-in [**create, read, update, and delete (CRUD)** model](/concepts/6_crud_actions/). Data in the graph database and participants' local state cannot be modified or deleted, so these kinds of mutation are simulated by attaching metadata to existing data. This builds up a graph of the history of a given piece of content. + +We'll get deeper into this in the [next section](#adding-and-modifying-data) and in the page on [entries](/build/entries/). ### Individual state histories as public records From 41470c57be1be510d2cd3196a789bc0cebb96523 Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Fri, 27 Sep 2024 14:37:35 -0700 Subject: [PATCH 09/12] refine warnings about depending on hashes --- src/pages/build/links-paths-and-anchors.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/src/pages/build/links-paths-and-anchors.md b/src/pages/build/links-paths-and-anchors.md index 67b9b95f8..aef96ed2e 100644 --- a/src/pages/build/links-paths-and-anchors.md +++ b/src/pages/build/links-paths-and-anchors.md @@ -100,10 +100,14 @@ An address doesn't have to have content stored at it in order for you to link to Any CRUD host function that records an action on an agent's source chain, such as `create`, `update`, `delete`, `create_link`, and `delete_link`, returns the hash of the action. You can use this in links, either for further writes in the same function call or elsewhere. -!!! 
info Action hashes aren't certain until zome function lifecycle completes -Like we mentioned in [Working with Data](/guide/working-with-data/#content-addresses), actions aren't actually there until the zome function that writes them completes successfully. And if you use 'relaxed' chain top ordering, your zome function can't depend on the action hash it gets back from the CRUD host function, because the final value might change before it's written. +!!! info Actions aren't written until function lifecycle completes +Like we mentioned in [Working with Data](/guide/working-with-data/#content-addresses), zome functions are atomic, so actions aren't actually there until the zome function that writes them completes successfully. -It's safer to share action hashes with other peers or cells in a callback called `post_commit()`. If your coordinator zome defines this callback, it'll be called after every successful function call within that zome, with the actual final action hashes. +If you need to share an action hash via a signal (say, with a remote peer), it's safer to wait until the zome function has completed. You can do this by creating a callback called `post_commit()`. It'll be called after every successful function call within that zome. +!!! + +!!! info Don't depend on relaxed action hashes +If you use 'relaxed' chain top ordering, your zome function shouldn't depend on the action hash it gets back from the CRUD host function, because the final value might change by the time the actions are written. !!! 
If you have a variable that contains a [`hdk::prelude::Action`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html) or [`hdk::prelude::Record`](https://docs.rs/hdk/latest/hdk/prelude/struct.Record.html), you can also get its hash using the following methods: From 5397386180d7755e3bb1708c99c8ddf100c7849f Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Fri, 27 Sep 2024 14:54:11 -0700 Subject: [PATCH 10/12] fix code samples for anchors --- src/pages/build/links-paths-and-anchors.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/pages/build/links-paths-and-anchors.md b/src/pages/build/links-paths-and-anchors.md index aef96ed2e..4a171c86e 100644 --- a/src/pages/build/links-paths-and-anchors.md +++ b/src/pages/build/links-paths-and-anchors.md @@ -370,7 +370,7 @@ use movie_integrity::*; let parent_path = Path::from("movies_by_first_letter") .typed(LinkTypes::MovieByFirstLetterAnchor); let all_first_letter_paths = parent_path.children_paths()?; -// Do something with the paths. Note: this would be expensive to do in practice. +// Do something with the paths. Note: this would be expensive to do in practice, because each iteration is a DHT query. let links_to_all_movies = all_first_letter_paths .iter() .map(|path| get_links(path.path_entry_hash()?, LinkTypes::MovieByFirstLetter, None)?) 
@@ -411,13 +411,13 @@ let anchor_hash_for_g = anchor(LinkTypes::MovieByFirstLetterAnchor, "movies_by_f let links_to_movies_starting_with_g = get_links(anchor_hash_for_g, LinkTypes::MovieByFirstLetter, None); ``` -Retrieve the _names_ of all the second-level anchors for a top-level anchor by calling [`hdk::prelude::list_anchor_tags`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_tags.html): +Retrieve the _addresses_ of all the top-level anchors by calling [`hdk::prelude::list_anchor_type_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_type_addresses.html): ```rust use hdk::prelude::*; use movie_integrity::*; -let all_first_letters = list_anchor_tags(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter"); +let hashes_of_all_top_level_anchors = list_anchor_type_addresses(LinkTypes::MovieByFirstLetterAnchor); ``` Retrieve the _addresses_ of all the second-level anchors for a top-level anchor by calling [`hdk::prelude::list_anchor_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_addresses.html): @@ -429,13 +429,13 @@ use movie_integrity::*; let hashes_of_all_first_letters = list_anchor_addresses(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter"); ``` -Retrieve the _addresses_ of all the top-level anchors by calling [`hdk::prelude::list_anchor_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_addresses.html): +Retrieve the _names_ of all the second-level anchors for a top-level anchor by calling [`hdk::prelude::list_anchor_tags`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_tags.html): ```rust use hdk::prelude::*; use movie_integrity::*; -let hashes_of_all_first_letters = list_anchor_addresses(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter"); +let all_first_letters = list_anchor_tags(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter"); ``` ## Reference From abcfc672b93d35bc15c2d7ea686d9f3a572077e4 Mon Sep 17 
00:00:00 2001 From: Paul d'Aoust Date: Fri, 27 Sep 2024 14:54:25 -0700 Subject: [PATCH 11/12] fix erroneous link tag size --- src/pages/build/links-paths-and-anchors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/pages/build/links-paths-and-anchors.md b/src/pages/build/links-paths-and-anchors.md index 4a171c86e..f03cb858d 100644 --- a/src/pages/build/links-paths-and-anchors.md +++ b/src/pages/build/links-paths-and-anchors.md @@ -23,7 +23,7 @@ Every link has a type that you define in an integrity zome, just like [an entry] * A **base**, which is the address the link is attached to and _points from_ * A **target**, which is the address the link _points to_ * A **type** -* An optional **tag** that can hold a small amount of arbitrary bytes, up to 4 kb +* An optional **tag** that can hold a small amount of arbitrary bytes, up to 1 kb The tag could be considered link 'content' that can be used to further qualify the link, provide data about the target that saves on DHT queries, or be queried with a starts-with search. But unlike an entry's content, the HDK doesn't provide a macro that automatically deserializes the link tag's content into a Rust type. 
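Because the HDK leaves link tag (de)serialization to you, one option is a small hand-rolled byte format. The following is a minimal standard-library sketch, not HDK code: the `MovieTag` struct, its fields, and the `MAX_TAG_BYTES` constant (set to 1,000 here as a stand-in for the "up to 1 kb" limit mentioned above) are all hypothetical names chosen for illustration.

```rust
/// Assumed upper bound on tag size, standing in for the "up to 1 kb" limit.
const MAX_TAG_BYTES: usize = 1_000;

/// Hypothetical tag payload: a release year followed by a title.
/// Putting the year first means a starts-with search on the tag bytes
/// could match on the year alone.
#[derive(Debug, PartialEq)]
struct MovieTag {
    release_year: u16,
    title: String,
}

impl MovieTag {
    /// Encode as two big-endian year bytes followed by the UTF-8 title.
    fn to_bytes(&self) -> Result<Vec<u8>, String> {
        let mut bytes = Vec::with_capacity(2 + self.title.len());
        bytes.extend_from_slice(&self.release_year.to_be_bytes());
        bytes.extend_from_slice(self.title.as_bytes());
        if bytes.len() > MAX_TAG_BYTES {
            return Err(format!(
                "tag is {} bytes, limit is {}",
                bytes.len(),
                MAX_TAG_BYTES
            ));
        }
        Ok(bytes)
    }

    /// Decode bytes that were produced by `to_bytes`.
    fn from_bytes(bytes: &[u8]) -> Result<Self, String> {
        if bytes.len() < 2 {
            return Err("tag too short to contain a year".to_string());
        }
        let release_year = u16::from_be_bytes([bytes[0], bytes[1]]);
        let title = String::from_utf8(bytes[2..].to_vec())
            .map_err(|e| format!("title is not valid UTF-8: {e}"))?;
        Ok(MovieTag { release_year, title })
    }
}
```

In a zome, the bytes returned by `to_bytes` would be what you pass as the tag when creating the link, and `from_bytes` would be applied to the tag of each link you retrieve. A fixed-width field at the front keeps the prefix predictable for starts-with queries.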
From 7296c0fb8792b0e71aa1686e9de158e04e23229d Mon Sep 17 00:00:00 2001 From: Paul d'Aoust Date: Mon, 30 Sep 2024 15:16:46 -0700 Subject: [PATCH 12/12] add identifiers to nav --- src/pages/_data/navigation/mainNav.json5 | 1 + src/pages/build/entries.md | 2 +- src/pages/build/identifiers.md | 181 +++++++++++++++ src/pages/build/index.md | 5 +- src/pages/build/links-paths-and-anchors.md | 253 ++------------------- src/pages/build/working-with-data.md | 30 +-- 6 files changed, 220 insertions(+), 252 deletions(-) create mode 100644 src/pages/build/identifiers.md diff --git a/src/pages/_data/navigation/mainNav.json5 b/src/pages/_data/navigation/mainNav.json5 index 86031fb1c..a2688ad00 100644 --- a/src/pages/_data/navigation/mainNav.json5 +++ b/src/pages/_data/navigation/mainNav.json5 @@ -27,6 +27,7 @@ }, { title: "Build", url: "/build/", children: [ { title: "Working with Data", url: "/build/working-with-data/", children: [ + { title: "Identifiers", url: "/build/identifiers/" }, { title: "Entries", url: "/build/entries/" }, { title: "Links, Paths, and Anchors", url: "/build/links-paths-and-anchors/" }, ]}, diff --git a/src/pages/build/entries.md b/src/pages/build/entries.md index 595aefdeb..0e3187562 100644 --- a/src/pages/build/entries.md +++ b/src/pages/build/entries.md @@ -230,7 +230,7 @@ let delete_action_hash: ActionHash = delete_entry( )?; ``` -As with an update, this does _not_ actually remove data from the source chain or the DHT. Instead, a [`Delete` action](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/action/struct.Delete.html) is authored, which attaches to the entry creation action and marks it as 'dead'. An entry itself is only considered dead when all entry creation actions that created it are marked dead, and it can become live again in the future if a _new_ entry creation action writes it. 
Dead data can still be retrieved with [`hdk::prelude::get_details`](https://docs.rs/hdk/latest/hdk/prelude/fn.get_details.html) (see below). +As with an update, this does _not_ actually remove data from the source chain or the DHT. Instead, a [`Delete` action](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/action/struct.Delete.html) is authored, which attaches to the entry creation action and marks it as deleted. An entry itself is only considered deleted when _all_ entry creation actions that created it are marked deleted, and it can become live again in the future if a _new_ entry creation action writes it. Deleted data can still be retrieved with [`hdk::prelude::get_details`](https://docs.rs/hdk/latest/hdk/prelude/fn.get_details.html) (see below). In the future we plan to include a 'purge' functionality. This will give agents permission to actually erase an entry from their DHT store, but not its associated entry creation action. diff --git a/src/pages/build/identifiers.md b/src/pages/build/identifiers.md new file mode 100644 index 000000000..b75a8b358 --- /dev/null +++ b/src/pages/build/identifiers.md @@ -0,0 +1,181 @@ +--- +title: "Identifiers" +--- + +::: intro +Data in Holochain is **addressable content**, which means that it's retrieved using an address that's derived from the data itself. +::: + +## Address types + +The address of most data is the [Blake2b-256](https://www.blake2.net/) hash of its bytes. This goes for both [**actions** and **entries**](/build/working-with-data/#entries-actions-and-records-primary-data) (with just one exception), and there are two extra address types that have content associated with them. + +All addresses are 39 bytes long and are [multihash-friendly](https://www.multiformats.io/multihash/). Generally, you don't need to know how to construct an address. Functions and data structures that store data will give you the data's address. 
But when you see a hash in the wild, this is what it's made out of: + +| Multihash prefix | Hash | DHT location | +|------------------|----------|--------------| +| 3 bytes | 32 bytes | 4 bytes | + +The four-byte DHT location is calculated from the 32 bytes of the hash and is used in routing to the right peer. The three-byte multihash prefix will be one of the following: + +| Hash type | [`holo_hash`](https://docs.rs/holo_hash/latest/holo_hash/#types) type | Prefix in Base64 | +|-----------|-------------------------------------------------------------------------------------|------------------| +| DNA | [`DnaHash`](https://docs.rs/holo_hash/latest/holo_hash/type.DnaHash.html) | `hC0k` | +| agent ID | [`AgentPubKey`](https://docs.rs/holo_hash/latest/holo_hash/type.AgentPubKey.html) | `hCAk` | +| action | [`ActionHash`](https://docs.rs/holo_hash/latest/holo_hash/type.ActionHash.html) | `hCkk` | +| entry | [`EntryHash`](https://docs.rs/holo_hash/latest/holo_hash/type.EntryHash.html) | `hCEk` | +| external | [`ExternalHash`](https://docs.rs/holo_hash/latest/holo_hash/type.ExternalHash.html) | `hC8k` | + +You can see that, in the Rust SDK, each address is typed to what it represents. There are also a couple of composite types, [`AnyDhtHash`](https://docs.rs/holo_hash/latest/holo_hash/type.AnyDhtHash.html) and [`AnyLinkableHash`](https://docs.rs/holo_hash/latest/holo_hash/type.AnyLinkableHash.html), that certain functions (like link creation functions) accept. You can also use the above hash types as fields in your entry types. + +Here's an overview of all seven address types: + +* `DnaHash` is the hash of the DNA bundle, and is the [unique identifier for the network](/build/working-with-data/#storage-locations-and-privacy). +* `AgentPubKey` is the public key of a participant in a network. Its address is the same as the entry content --- the agent's public key, not the hash of the public key. 
+* `ActionHash` is the hash of a structure called an [action](/build/working-with-data/#entries-actions-and-records-primary-data) that records a participant's act of storing or changing private or shared data. +* `EntryHash` is the hash of an arbitrary blob of bytes called an [entry](/build/entries/), which contains application or system data. +* `ExternalHash` is the ID of a resource that exists outside the database, such as the hash of an IPFS resource or the public key of an Ethereum wallet. Holochain doesn't care about its content, as long as it's 32 bytes long. There's no content stored at the address; it simply serves as an anchor to attach [links](/build/links-paths-and-anchors/) to. +* `AnyDhtHash` is the hash of any kind of addressable content (that is, actions and entries, including `AgentPubKey` entries). +* `AnyLinkableHash` is the hash of anything that can be linked to or from (that is, all of the above). + +### The unpredictability of action hashes + +There are a few things to know about action hashes: + +* You can't know an action's hash until you've written the action, because it's influenced by both the previous action's hash and participant's current system time. +* When you write an action, you can specify "relaxed chain top ordering". We won't go into the details here, but it means an action hash may change after the action is written, so you shouldn't depend on the value of the hash within the function that writes it. +* Any function that writes to an agent's source chain is **atomic**, which means that all actions are written, but only after the function succeeds _and_ all actions are successfully validated. That means that you shouldn't depend on content being available at an address until _after_ the function returns a success result. + +## Getting hashes + +Because Holochain's graph DHT is all about connecting hashes to other hashes, here's how you get hashes. 
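Before looking at where hashes come from, it may help to see the 39-byte layout described earlier (3-byte prefix, 32-byte hash, 4-byte DHT location) as code. This is a minimal standard-library sketch and not the `holo_hash` API: the `HashKind` enum and `split_address` function are hypothetical, and the prefix bytes are simply what the Base64 prefixes in the table decode to.

```rust
/// Hypothetical enum mirroring the hash types in the prefix table.
#[derive(Debug, PartialEq)]
enum HashKind {
    Dna,
    Agent,
    Action,
    Entry,
    External,
}

/// Illustrative helper (not part of `holo_hash`): split a raw 39-byte
/// address into its kind, 32-byte hash, and 4-byte DHT location.
fn split_address(raw: &[u8]) -> Result<(HashKind, [u8; 32], [u8; 4]), String> {
    if raw.len() != 39 {
        return Err(format!("expected 39 bytes, got {}", raw.len()));
    }
    // Each three-byte prefix is the decoded form of the Base64 prefix
    // shown in the comment.
    let kind = match [raw[0], raw[1], raw[2]] {
        [0x84, 0x2d, 0x24] => HashKind::Dna,      // `hC0k`
        [0x84, 0x20, 0x24] => HashKind::Agent,    // `hCAk`
        [0x84, 0x29, 0x24] => HashKind::Action,   // `hCkk`
        [0x84, 0x21, 0x24] => HashKind::Entry,    // `hCEk`
        [0x84, 0x2f, 0x24] => HashKind::External, // `hC8k`
        other => return Err(format!("unknown prefix {:?}", other)),
    };
    let mut hash = [0u8; 32];
    hash.copy_from_slice(&raw[3..35]);
    let mut location = [0u8; 4];
    location.copy_from_slice(&raw[35..39]);
    Ok((kind, hash, location))
}
```

The sketch only separates the parts; it does not check that the location bytes actually correspond to the hash, which is something the real library handles for you.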
+ +### Action + +Any CRUD host function that records an action on an agent's source chain, such as `create`, `update`, `delete`, `create_link`, and `delete_link`, returns the hash of the action. You can use this in links, either for further writes in the same function call or elsewhere. + + +!!! info Actions aren't written until function lifecycle completes +Like we mentioned in [Working with Data](/guide/working-with-data/#content-addresses), zome functions are atomic, so actions aren't actually there until the zome function that writes them completes successfully. + +If you need to share an action hash via a signal (say, with a remote peer), it's safer to wait until the zome function has completed. You can do this by creating a callback called `post_commit()`. It'll be called after every successful function call within that zome. +!!! + +!!! info Don't depend on relaxed action hashes +If you use 'relaxed' chain top ordering, your zome function shouldn't depend on the action hash it gets back from the CRUD host function, because the final value might change by the time the actions are written. +!!! + +If you have a variable that contains a [`hdk::prelude::Action`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html) or [`hdk::prelude::Record`](https://docs.rs/hdk/latest/hdk/prelude/struct.Record.html), you can also get its hash using the following methods: + +```rust +let action_hash_from_record = record.action_address(); +let action = record.signed_action; +let action_hash_from_action = action.as_hash(); +assert_eq!(action_hash_from_record, action_hash_from_action); +``` + +(But it's worth pointing out that if you have this value, it's probably because you just retrieved the action by hash, which means you probably already know the hash.) 
+ +To get the hash of an entry creation action from an action that deletes or updates it, match on the [`Action::Update`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Update) or [`Action::Delete`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Delete) action variants and access the appropriate field: + +```rust +if let Action::Update(action_data) = action { + let replaced_action_hash = action_data.original_action_address; + // Do some things with the original action. +} else if let Action::Delete(action_data) = action { + let deleted_action_hash = action_data.deletes_address; + // Do some things with the deleted action. +} +``` + +### Entry + +To get the hash of an entry, first construct the entry struct or enum that you [defined in the integrity zome](/build/entries/#define-an-entry-type), then pass it through the [`hdk::hash::hash_entry`](https://docs.rs/hdk/latest/hdk/hash/fn.hash_entry.html) function. (You don't actually have to write the entry to a source chain to get the entry hash.) + +```rust +use hdk::hash::*; +use movie_integrity::*; + +let director_entry_hash = EntryHash::from_raw_36(vec![/* Sergio Leone's hash */]); +let movie = Movie { + title: "The Good, the Bad, and the Ugly", + director_entry_hash: director_entry_hash, + imdb_id: Some("tt0060196"), + release_date: Timestamp::from(Date::Utc("1966-12-23")), + box_office_revenue: 389_000_000, +}; +let movie_entry_hash = hash_entry(movie); +``` + +To get the hash of an entry from the action that created it, call the action's [`entry_hash`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#method.entry_hash) method. It returns an optional value, because not all actions have associated entries. 
+ +```rust +let entry_hash = action.entry_hash()?; +``` + +If you know that your action is an entry creation action, you can get the entry hash by calling its `entry_hash()` method: + +```rust +let entry_creation_action: EntryCreationAction = action.try_into()?; +let entry_hash = entry_creation_action.entry_hash(); +``` + +To get the hash of an entry from a record, you can either get it from the record itself or the contained action: + +```rust +let entry_hash_from_record = record.entry().as_option()?.hash(); +let entry_hash_from_action = record.action().entry_hash()?; +assert_eq!(entry_hash_from_record, entry_hash_from_action); +``` + +Finally, to get the hash of an entry from an action that updates or deletes it, match the action to the appropriate variant and access the corresponding field: + +```rust +if let Action::Update(action_data) = action { + let replaced_entry_hash = action_data.original_entry_address; +} else if let Action::Delete(action_data) = action { + let deleted_entry_hash = action_data.deletes_entry_address; +} +``` + +### Agent + +An agent's ID is just their public key, and an entry for their ID is stored on the DHT. The hashing function for an agent ID entry is just the literal value of the entry. This is a roundabout way of saying that you link to or from an agent using their public key as a hash. + +An agent can get their own ID by calling [`hdk::prelude::agent_info`](https://docs.rs/hdk/latest/hdk/info/fn.agent_info.html). Note that agents may change their ID if their public key has been lost or stolen, so they may have more than one ID over the course of their participation in a network. + +```rust +use hdk::prelude::*; + +let my_first_id = agent_info()?.agent_initial_pubkey; +let my_current_id = agent_info()?.agent_latest_pubkey; +``` + +All actions have their author's ID as a field.
You can get this field by calling the action's `author()` method: + +```rust +let author_id = action.author(); +``` + +### External reference + +An external reference is just any 32-byte identifier. Holochain doesn't care if it's an IPFS hash, an Ethereum wallet, the hash of a constant in your code, a very short URL, or the name of your pet cat. But because it comes from outside of a DHT, it's up to your application to decide how to construct or handle it. Typically, an external client such as a UI would do all that. + +Construct an external hash from the raw bytes or a Base64 string: + +```rust +use holo_hash::*; + +``` + +You can then use the value wherever linkable hashes can be used. + +### DNA + +There is one global hash that everyone knows, and that's the hash of the DNA itself. You can get it by calling [`hdk::prelude::dna_info`](https://docs.rs/hdk/latest/hdk/info/fn.dna_info.html). + +```rust +use hdk::prelude::*; + +let dna_hash = dna_info()?.hash; +``` \ No newline at end of file diff --git a/src/pages/build/index.md b/src/pages/build/index.md index b01cdf065..5236aefe5 100644 --- a/src/pages/build/index.md +++ b/src/pages/build/index.md @@ -16,6 +16,7 @@ This Build Guide organizes everything you need to know about developing Holochai ### Topics {data-no-toc} * [Overview](/build/working-with-data/) --- general concepts related to working with data in Holochain -* [Entries](/build/entries/) --- creating, reading, updating, and deleting -* [Links, Paths, and Anchors](/build/links-paths-and-anchors/) --- creating and deleting +* [Identifiers](/build/identifiers) --- working with hashes and other unique IDs +* [Entries](/build/entries/) --- defining, creating, reading, updating, and deleting data +* [Links, Paths, and Anchors](/build/links-paths-and-anchors/) --- creating relationships between data ::: \ No newline at end of file diff --git a/src/pages/build/links-paths-and-anchors.md
b/src/pages/build/links-paths-and-anchors.md index f03cb858d..e45ea5625 100644 --- a/src/pages/build/links-paths-and-anchors.md +++ b/src/pages/build/links-paths-and-anchors.md @@ -5,7 +5,7 @@ title: "Links, Paths, and Anchors" ::: intro A **link** connects two addresses in an application's shared database, forming a graph database on top of the underlying hash table. Links can be used to connect pieces of [addressable content](/resources/glossary/#addressable-content) in the database or references to addressable content outside the database. -**Paths** and **anchors** build on the concept of links, allowing you to create collections, pagination, indexes, and hierarchical structures. +An **anchor** is a pattern of linking from a well-known base address. Our SDK includes a library for creating hierarchies of anchors called **paths**, allowing you to create collections, pagination, indexes, and taxonomies. ::: ## Turning a hash table into a graph database @@ -18,7 +18,7 @@ But Holochain also lets you attach **links** as metadata on an address in the da ### Define a link type -Every link has a type that you define in an integrity zome, just like [an entry](/build/entries/#define-an-entry-type). Links are simple enough that they have no entry content. Instead, their data is completely contained in the actions that write them. Here's what a link creation action contains, in addition to the [common action fields](/build/working-with-data/#entries-actions-and-records-primary-data): +Every link has a type that you define in an integrity zome, just like [an entry](/build/entries/#define-an-entry-type). Links are simple enough that they're committed as an action with no associated entry. 
Here's what a link creation action contains, in addition to the [common action fields](/build/working-with-data/#entries-actions-and-records-primary-data): * A **base**, which is the address the link is attached to and _points from_ * A **target**, which is the address the link _points to_ @@ -84,158 +84,7 @@ let delete_link_action_hash = delete_link( ); ``` -A link is considered dead once its creation action has one or more delete-link actions associated with it. - -## Getting hashes for use in linking - -Because linking is all about connecting hashes to other hashes, here's how you get a hash for a piece of content. - -!!! info A note on the existence of data -An address doesn't have to have content stored at it in order for you to link to or from it. (In the case of external references, it's certain that data won't exist at the address.) If you want to require data to exist at the base or target, and if the data needs to be of a certain type, you'll need to check for this in your link validation code. - -!!! - -### Action - -Any CRUD host function that records an action on an agent's source chain, such as `create`, `update`, `delete`, `create_link`, and `delete_link`, returns the hash of the action. You can use this in links, either for further writes in the same function call or elsewhere. - - -!!! info Actions aren't written until function lifecycle completes -Like we mentioned in [Working with Data](/guide/working-with-data/#content-addresses), zome functions are atomic, so actions aren't actually there until the zome function that writes them completes successfully. - -If you need to share an action hash via a signal (say, with a remote peer), it's safer to wait until the zome function has completed. You can do this by creating a callback called `post_commit()`. It'll be called after every successful function call within that zome. -!!! - -!!! 
info Don't depend on relaxed action hashes -If you use 'relaxed' chain top ordering, your zome function shouldn't depend on the action hash it gets back from the CRUD host function, because the final value might change by the time the actions are written. -!!! - -If you have a variable that contains a [`hdk::prelude::Action`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html) or [`hdk::prelude::Record`](https://docs.rs/hdk/latest/hdk/prelude/struct.Record.html), you can also get its hash using the following methods: - -```rust -let action_hash_from_record = record.action_address(); -let action = record.signed_action; -let action_hash_from_action = action.as_hash(); -assert_eq!(action_hash_from_record, action_hash_from_action); -``` - -(But it's worth pointing out that if you have this value, it's probably because you just retrieved the action by hash, which means you probably already know the hash.) - -To get the hash of an entry creation action from an action that deletes or updates it, match on the [`Action::Update`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Update) or [`Action::Delete`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#variant.Delete) action variants and access the appropriate field: - -```rust -if let Action::Update(action_data) = action { - let replaced_action_hash = action_data.original_action_address; - // Do some things with the original action. -} else if let Action::Delete(action_data) = action { - let deleted_action_hash = action_data.deletes_address; - // Do some things with the deleted action. -} -``` - -### Entry - -To get the hash of an entry, first construct the entry struct or enum that you [defined in the integrity zome](/build/entries/#define-an-entry-type), then pass it through the [`hdk::hash::hash_entry`](https://docs.rs/hdk/latest/hdk/hash/fn.hash_entry.html) function. (Reminder: don't actually have to write the entry to a source chain to get or use the entry hash for use in a link.) 
- -```rust -use hdk::hash::*; -use movie_integrity::*; - -let movie = Movie { - title: "The Good, the Bad, and the Ugly", - director_entry_hash: EntryHash::from_raw_36(vec![ /* hash of 'Sergio Leone' entry */ ]), - imdb_id: Some("tt0060196"), - release_date: Timestamp::from(Date::Utc("1966-12-23")), - box_office_revenue: 389_000_000, -}; -let movie_entry_hash = hash_entry(movie); -``` - -To get the hash of an entry from the action that created it, call the action's [`entry_hash`](https://docs.rs/hdk/latest/hdk/prelude/enum.Action.html#method.entry_hash) method. It returns an optional value, because not all actions have associated entries. - -```rust -let entry_hash = action.entry_hash()?; -``` - -If you know that your action is an entry creation action, you can get the entry hash from its `entry_hash` field: - -```rust -let entry_creation_action: EntryCreationAction = action.into()?; -let entry_hash = action.entry_hash; -``` - -To get the hash of an entry from a record, you can either get it from the record itself or the contained action: - -```rust -let entry_hash_from_record = record.entry().as_option()?.hash(); -let entry_hash_from_action = record.action().entry_hash()? -assert_equal!(entry_hash_from_record, entry_hash_from_action); -``` - -Finally, to get the hash of an entry from an action that updates or deletes it, match the action to the appropriate variant and access the corresponding field: - -```rust -if let Action::Update(action_data) = action { - let replaced_entry_hash = action_data.original_entry_address; -} else if let Action::Delete(action_data) = action { - let deleted_entry_hash = action_data.deletes_entry_address; -} -``` - -### Agent - -An agent's ID is just their public key, and an entry for their ID is stored on the DHT. The hashing function for an agent ID entry is just the literal value of the entry. This is a roundabout way of saying that you link to or from an agent using their public key as a hash. 
- -An agent can get their own ID by calling [`hdk::prelude::agent_info`](https://docs.rs/hdk/latest/hdk/info/fn.agent_info.html). Note that agents may change their ID if their public key has been lost or stolen, so they may have more than one ID over the course of their participation in a network. - -```rust -use hdk::prelude::*; - -let my_first_id = agent_info()?.agent_initial_pubkey; -let my_current_id = agent_info()?.agent_latest_pubkey; -``` - -All actions have their author's ID as a field. You can get this field by calling the action's `author()` method: - -```rust -let author_id = action.author(); -``` - -### External reference - -Because an external reference comes from outside of a DHT, it's up to you to decide how to get it into the application. Typically, an external client such as a UI or bridging service would pass this value into your app. - -As mentioned up in the [Entry](#entry) section, an entry hash can also be considered 'external' if you don't actually write it to the DHT. - -```rust -use hdk::prelude::*; -use movie_integrity::*; - -#[hdk_extern] -fn add_movie_poster_from_ipfs(movie_entry_hash: EntryHash, ipfs_hash_bytes: Vec) { - let ipfs_hash = ExternalHash::from_raw_32(ipfs_hash_bytes); - create_link( - movie_entry_hash, - ipfs_hash, - LinkTypes::IpfsMoviePoster, - () - ); -} -``` - -### DNA - -There is one global hash that everyone knows, and that's the hash of the DNA itself. You can get it by calling [`hdk::prelude::dna_info`](https://docs.rs/hdk/latest/hdk/info/fn.dna_info.html). - -```rust -use hdk::prelude::*; - -let dna_hash = dna_info()?.hash; -``` - -!!! info Linking from a DNA hash is not recommended -Because every participant in an application's network takes responsibility for storing a portion of the DHT's address space, attaching many links to a well-known hash such as the DNA hash can create 'hot spots' and cause an undue CPU, storage, and network burden the peers in the neighborhood of that hash. 
Instead, we recommend you use [paths or anchors](#paths-and-anchors) to 'shard' responsibility throughout the DHT. -!!! +A link is considered deleted once its creation action has one or more delete-link actions associated with it. As with entries, deleted links can still be retrieved with [`hdk::prelude::get_details`](https://docs.rs/hdk/latest/hdk/prelude/fn.get_details.html) ## Retrieve links @@ -269,7 +118,7 @@ let movies_in_1960s_by_director = get_links( ); ``` -To get all live _and dead_ links, along with any deletion actions, use [`hdk::prelude::get_link_details`](https://docs.rs/hdk/latest/hdk/link/fn.get_link_details.html). This function has the same options as `get_links` +To get all live _and deleted_ links, along with any deletion actions, use [`hdk::prelude::get_link_details`](https://docs.rs/hdk/latest/hdk/link/fn.get_link_details.html). This function has the same options as `get_links` ```rust use hdk::prelude::*; @@ -300,37 +149,45 @@ let number_of_reviews_written_by_me_in_last_month = count_links( ); ``` -## Paths and anchors +!!! info Links are counted locally +Currently `count_links` retrieves all links from the remote peer, then counts them locally. We're planning on changing this behavior so it does what you expect --- count links on the remote peer and send the count, to save network traffic. +!!! + +## Anchors and paths -Sometimes the easiest way to discover a link base is to build it into the application's code. You can create an **anchor**, an entry whose content is a well-known blob, and hash that blob any time you need to retrieve links. This can be used to simulate collections or tables in your graph database. As [mentioned](#getting-hashes-for-use-in-linking), the entry does not even need to be stored; you can simply create it, hash it, and use the hash in your link. +Sometimes the easiest way to discover a link base is to build it into the application's code. 
You can create an **anchor**, a well-known address (like the hash of a string constant) to attach links to. This can be used to simulate collections or tables in your graph database. -While you can build this yourself, this is such a common pattern that the HDK implements it for you in the [`hdk::hash_path`](https://docs.rs/hdk/latest/hdk/hash_path/index.html) module. The implementation supports both anchors and **paths**, which are hierarchies of anchors. +While you can build this yourself, this is such a common pattern that the HDK implements it for you in the [`hdk::hash_path`](https://docs.rs/hdk/latest/hdk/hash_path/index.html) module. It lets you create **paths**, which are hierarchies of anchors. !!! info Avoiding DHT hot spots -Don't attach too many links to a single anchor, as that creates extra work for the peers responsible for that anchor's hash. Instead, use paths to split the links into appropriate 'buckets' and spread the work around. We'll give an example of that below. +Don't attach too many links to a single address, as that creates extra work for the peers responsible for it. Instead, use paths to split the links into appropriate 'buckets' and spread the work around. We'll give an example of that below. !!! -### Scaffold a simple collection +### Scaffold a simple collection anchor -If you've been using the scaffolding tool, you can scaffold a simple collection for an entry type with the command `hc scaffold collection`. Behind the scenes, it uses the anchor pattern. 
+The scaffolding tool can create a 'collection', which is a path that serves as an anchor for entries of a given type, along with all the functionality that creates and deletes links from that anchor to its entries: -Follow the prompts to choose the entry type, names for the link types and anchor, and the scope of the collection, which can be either: +```bash +hc scaffold collection +``` + +Follow the prompts to choose the entry type, name the link types and anchor, and define the scope of the collection, which can be either: * all entries of type, or * entries of type by author -It'll scaffold all the code needed to create a path anchor, create links, and delete links in the already scaffolded entry CRUD functions. - ### Paths -Create a path by constructing a [`hdk::hash_path::path::Path`](https://docs.rs/hdk/latest/hdk/hash_path/path/struct.Path.html) struct, hashing it, and using the hash in `create_link`. The string of the path is a simple [domain-specific language](https://docs.rs/hdk/latest/hdk/hash_path/path/struct.Path.html#impl-From%3C%26str%3E-for-Path), in which dots denote sections of the path. +When you want to create more complex collections, you'll need to use the paths library directly. + +Create a path by constructing a [`hdk::hash_path::path::Path`](https://docs.rs/hdk/latest/hdk/hash_path/path/struct.Path.html) struct, hashing it, and using the hash as a link base. The string of the path is a simple [domain-specific language](https://docs.rs/hdk/latest/hdk/hash_path/path/struct.Path.html#impl-From%3C%26str%3E-for-Path), in which dots denote sections of the path. ```rust use hdk::hash_path::path::*; use movie_integrity::*; let path_to_movies_starting_with_g = Path::from("movies_by_first_letter.g") - // A path requires a link type from the integrity zome. Here, we're using the + // A path requires a link type that you've defined in the integrity zome. Here, we're using the // `MovieByFirstLetterAnchor` type that we created. 
.typed(LinkTypes::MovieByFirstLetterAnchor); @@ -346,7 +203,7 @@ let create_link_hash = create_link( )?; ``` -Retrieve all the links on a path by constructing the path, then getting its hash: +Retrieve all the links on a path by constructing the path, then calling `get_links` with its hash and link type: ```rust use hdk::hash_path::path::*; @@ -361,7 +218,7 @@ let links_to_movies_starting_with_g = get_links( )?; ``` -Retrieve all child paths of a path by constructing the parent path, typing it, and calling its `children_paths()` method: +Retrieve all child paths of a path by constructing the parent path and calling its `children_paths()` method: ```rust use hdk::hash_path::path::*; @@ -378,66 +235,6 @@ let links_to_all_movies = all_first_letter_paths .collect(); ``` -### Anchors - -In the HDK, an 'anchor' is just a path with two levels of hierarchy. The examples below show how to implement the path-based examples above, but as anchors. Generally the implementation is simpler. - -Create an anchor by calling the [`hdk::prelude::anchor`](https://docs.rs/hdk/latest/hdk/prelude/fn.anchor.html) host function. An anchor can have two levels of hierarchy, which you give as the second and third arguments of the function. - -```rust -use hdk::prelude::*; -use movie_integrity::*; - -// This function requires a special link type to be created in the integrity -// zome. Here, we're using the `MovieByFirstLetterAnchor` type that we created. -let movies_starting_with_g_anchor_hash = anchor(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter", "g"); -let create_link_hash = create_link( - movies_starting_with_g_anchor_hash, - movie_entry_hash, - LinkTypes::MovieByFirstLetter, - () -); -``` - -The `anchor` function creates no entries, just links, and will only create links that don't currently exist. 
- -Retrieve all the linked items from an anchor just as you would any link base: - -```rust -use hdk::prelude::*; -use movie_integrity::*; - -let anchor_hash_for_g = anchor(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter", "g"); -let links_to_movies_starting_with_g = get_links(anchor_hash_for_g, LinkTypes::MovieByFirstLetter, None); -``` - -Retrieve the _addresses_ of all the top-level anchors by calling [`hdk::prelude::list_anchor_type_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_type_addresses.html): - -```rust -use hdk::prelude::*; -use movie_integrity::*; - -let hashes_of_all_top_level_anchors = list_anchor_type_addresses(LinkTypes::MovieByFirstLetterAnchor); -``` - -Retrieve the _addresses_ of all the second-level anchors for a top-level anchor by calling [`hdk::prelude::list_anchor_addresses`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_addresses.html): - -```rust -use hdk::prelude::*; -use movie_integrity::*; - -let hashes_of_all_first_letters = list_anchor_addresses(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter"); -``` - -Retrieve the _names_ of all the second-level anchors for a top-level anchor by calling [`hdk::prelude::list_anchor_tags`](https://docs.rs/hdk/latest/hdk/hash_path/anchor/fn.list_anchor_tags.html): - -```rust -use hdk::prelude::*; -use movie_integrity::*; - -let all_first_letters = list_anchor_tags(LinkTypes::MovieByFirstLetterAnchor, "movies_by_first_letter"); -``` - ## Reference * [`hdi::prelude::hdk_link_types`](https://docs.rs/hdi/latest/hdi/prelude/attr.hdk_link_types.html) diff --git a/src/pages/build/working-with-data.md b/src/pages/build/working-with-data.md index 3bba9d47d..3d4cf2b4a 100644 --- a/src/pages/build/working-with-data.md +++ b/src/pages/build/working-with-data.md @@ -8,51 +8,39 @@ Holochain is, at its most basic, a framework for building **graph databases** on ## Entries, actions, and records: primary data -Data in Holochain takes the shape of 
a [**record**](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/record/struct.Record.html). Different kinds of records have different purposes, but the thing common to all records is the [**action**](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/action/enum.Action.html): one participant's intent to manipulate their own state or the application's shared database state in some way. All actions contain: +Data in Holochain takes the shape of a [**record**](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/record/struct.Record.html). Different kinds of records have different purposes, but the thing common to all records is the [**action**](https://docs.rs/holochain_integrity_types/latest/holochain_integrity_types/action/enum.Action.html): one participant's intent to manipulate their own state and/or the application's shared database state in some way. All actions contain: * The **agent ID** of the author * A timestamp * The type of action -* The hash of the previous action in the author's history of state changes, called their [**source chain**](/concepts/3_source_chain/) (note: the first action in their chain doesn't contain this field, as it's the first) +* The hash of the previous action in the author's history of state changes, called their [**source chain**](/concepts/3_source_chain/) (note: the first action in their chain doesn't contain this field) * The index of the action in the author's source chain, called the **action seq** -Some actions also contain a **weight**, which is a calculation of the cost of storing the action and can be used for rate limiting. (Note: weighting and rate limiting isn't implemented yet.) +Some actions also contain a **weight**, which is a calculation of the cost of storing the action and can be used for spam prevention. (Note: weighting and rate limiting isn't implemented yet.) The other important part of a record is the **entry**. 
Not all action types have an entry to go along with them, but those that do, `Create` and `Update`, are called **entry creation actions** and are the main source of data in an application. It's generally most useful to _think about a **record** (entry plus creation action) as the primary unit of data_. This is because the action holds useful context about when an entry was written and by whom. -One entry written by two actions is considered to be the same piece of content, but when paired with their respective actions into records, each record is guaranteed to be unique. + +One entry written by two actions is considered to be the same piece of content, but when an entry is paired with its respective actions into records, each record is guaranteed to be unique.
[Diagram: Alice, Bob, and Carol each author one action (Action 1, Action 2, and Action 3 respectively); all three actions create the same Entry.]
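The relationship between one shared entry and several unique records can be sketched in plain Rust. This is only a toy model, not the HDK: `ToyEntry`, `ToyAction`, `toy_hash`, and `toy_create` are hypothetical names invented for illustration, and the standard library's `DefaultHasher` stands in for Holochain's real hashing. It assumes an action's hash covers its author, timestamp, and entry address, and shows that two authors writing identical content get the same entry address but different action hashes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-ins for illustration only; these are NOT the HDK types.
#[derive(Hash)]
struct ToyEntry {
    content: String,
}

#[derive(Hash)]
struct ToyAction {
    author: String,
    timestamp: u64,
    // Address of the entry this action creates.
    entry_hash: u64,
}

// Hash any hashable value with the standard library's default hasher.
fn toy_hash<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

// Build the action that records `author` writing `entry` at `timestamp`.
fn toy_create(author: &str, timestamp: u64, entry: &ToyEntry) -> ToyAction {
    ToyAction {
        author: author.to_string(),
        timestamp,
        entry_hash: toy_hash(entry),
    }
}

fn main() {
    let entry = ToyEntry {
        content: "The Good, the Bad and the Ugly".to_string(),
    };
    let by_alice = toy_create("alice", 1, &entry);
    let by_bob = toy_create("bob", 2, &entry);

    // Both actions point at the same entry address...
    assert_eq!(by_alice.entry_hash, by_bob.entry_hash);
    // ...but the actions hash differently, so each (action, entry) pair is distinct.
    assert_ne!(toy_hash(&by_alice), toy_hash(&by_bob));
}
```

In this sketch the entry address depends only on the content, which is why identical content written by different people collapses to one node, while the author and timestamp carried by each action keep the records apart.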
-### Content addresses - -Entries and actions are both **addressable content**, which means that they're retrieved by their address --- which is usually the hash of their data. All addresses are 32-byte identifiers. - -There are four types of addressable content: - -* An **entry** is an arbitrary blob of bytes, and its address is the hash of that blob. It has an **entry type**, which your application uses to deserialize it, validate it, and give it meaning. -* An **agent ID** is the public key of a participant in an application. Its address is the same as its content. -* An **action** stores action data for a record, and its address is the hash of the serialized action content. -* An **external reference** is the ID of a resource that exists outside the database, such as the hash of an IPFS resource or the public key of an Ethereum address. There's no content stored at the address; it simply serves as an anchor to attach [links](#links) to. - -There are a few things to note about action hashes: +## The graph DHT: Holochain's shared database -* You can't know an action's hash until you've written the action, because it's influenced by both the previous action's hash and participant's current system time. -* When you write an action, you can specify "relaxed chain top ordering". We won't go into the details here, but it means you can't depend on the action hash even after you write the action. -* Any function that writes actions is atomic, which means that actions aren't written until after the function succeeds _and_ all actions are successfully validated. That means that you shouldn't depend on content being available at an address until _after_ the function returns a success result. 
+Holochain keeps its shared data in a ### Storage locations and privacy -Each DNA creates a network of peers who participate in storing bits of that DNA's database, which means that each DNA's database (and the [source chains](#individual-state-histories-as-public-records) that contribute to it) is completely separate from all others. This creates a per-network privacy for shared data. On top of that, addressable content can either be: +Each [DNA](/concepts/2_application_architecture/#dna) creates a network of peers who participate in storing pieces of that DNA's database, which means that each DNA's database (and the [source chains](#individual-state-histories-as-public-records) that contribute to it) is completely separate from all others. This creates a per-network privacy for shared data. On top of that, addressable content can either be: * **Private**, stored on the author's device in their source chain and accessible to them only, or * **Public**, stored in the graph database and accessible to all participants. All actions are public, while entries can be either public or [private](/build/entries/#configure-an-entry-type). External references hold neither public nor private content, but merely point to content outside the database. -## Shared graph database +### Shared graph database Shared data in a Holochain application is represented as a [**graph database**](https://en.wikipedia.org/wiki/Graph_database) of nodes connected by edges called [**links**](/concepts/5_links_anchors/). Any kind of addressable content can be used as a graph node.