Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: use go-ipld-prime Path parsing rules #142

Merged
merged 2 commits into from
Mar 10, 2023
Merged

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented Mar 9, 2023

I'm going to extract all of this code into go-unixfsnode so this is just a utility we use from there but it can be reused by others, I think there's a generalised useful pattern here.

This PR doesn't necessarily need to be merged because we can get a go-unixfsnode release out pretty quick, I really just want to get agreement on the changes I'm making to the way paths are parsed.

The main change here is that I'm switching from the original manual string parser and using the one in go-ipld-prime, which is more forgiving (it won't error) and some of our rules change, particularly:

  • Leading / is not necessary (although our code path should always produce one since we pluck it off the input string)
  • Redundant /'s are ignored, so foo/bar is the same as foo/////bar
  • . and .. are no longer special cases for errors, they are field names and don't carry any special meaning

Removing error conditions means we don't have to handle bad paths, there are no bad paths now.

@codecov-commenter
Copy link

codecov-commenter commented Mar 9, 2023

Codecov Report

Merging #142 (4264dcc) into main (1aac36d) will increase coverage by 0.12%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #142      +/-   ##
==========================================
+ Coverage   69.60%   69.72%   +0.12%     
==========================================
  Files          55       54       -1     
  Lines        4465     4420      -45     
==========================================
- Hits         3108     3082      -26     
+ Misses       1203     1189      -14     
+ Partials      154      149       -5     
Impacted Files Coverage Δ
pkg/types/request.go 80.00% <100.00%> (+8.94%) ⬆️

... and 3 files with indirect coverage changes

Copy link
Collaborator

@hannahhoward hannahhoward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think LGTM? I would just assume all this exit the Lassie repo to a more appropriate pasture soon.

rvagg added a commit to ipfs/go-unixfsnode that referenced this pull request Mar 9, 2023
Refactored and extracted out of Lassie

Ref: filecoin-project/lassie#142
rvagg added a commit to ipfs/go-unixfsnode that referenced this pull request Mar 9, 2023
Refactored and extracted out of Lassie

Ref: filecoin-project/lassie#142
rvagg added a commit to ipfs/go-unixfsnode that referenced this pull request Mar 9, 2023
Refactored and extracted out of Lassie

Ref: filecoin-project/lassie#142
@rvagg
Copy link
Member Author

rvagg commented Mar 9, 2023

I think LGTM? I would just assume all this exit the Lassie repo to a more appropriate pasture soon.

Yep, that wast the plan but I wanted to get agreement on the go-ipld-prime Path handling change here before I just did it over there.

So I've done that now: ipfs/go-unixfsnode#45, and updated it here to use that branch. Will update to a tagged version when I get one.

@kylehuntsman
Copy link
Contributor

What was the reason for the change here? Were there problems parsing the path?

@rvagg
Copy link
Member Author

rvagg commented Mar 10, 2023

What was the reason for the change here? Were there problems parsing the path?

Well, indirectly it was inspired by this thread over here ipfs/boxo#198 and accounting for just how many places we manually do path splitting separate to Kubo, and they are all different.

I borrowed a lot of the logic for the original Lassie pathing parsing from https://github.com/ipld/go-ipld-selector-text-lite/blob/master/parser.go - it does things with .. and . and accounts for leading and trailing / and is really strict.

go-unixfsnode has one here: https://github.com/ipfs/go-unixfsnode/blob/ca00f89a2119380697bb8e4b10a628b7b172d97f/signaling.go#L23 but it's super relaxed, and its lack of flexibility makes it unsuitable to use (I'm not aware of anyone actually using it so that's what ipfs/go-unixfsnode#45 is for). With that you get weird things if you add a leading or trailing /, and // means there's an empty string segment in there, it's really not suitable unless you have very sanitised paths.

go-ipld-prime has a fairly mature understanding of paths, thanks to Eric's long background thinking about these things, all laid out nicely here: https://pkg.go.dev/github.com/ipld/go-ipld-prime/datamodel#Path, his approach is to be lenient in parsing but always form the right thing for IPLD traversal, it'll even turn numbers into indexes where that might be helpful for traversal (we don't currently use that because we're focused on unixfs file/dir name traversal).

SO, with all of that, I figure that instead of adding yet another implementation, we should consolidate properly; hence the two steps here -- lean on go-ipld-prime for pathing, and reinforce it as the place to go for IPLD path use, and extract our UnixFS specific tooling into go-unixfsnode so it can be reused by anyone else that wants it.

There's an additional step to deprecate https://github.com/ipld/go-ipld-selector-text-lite but it's not clear exactly how to do that. It's not UnixFS focused, although it is used for UnixFS, just with dag-pb pathing ('Links/19/Hash/Links/1/Hash/Links/99/Hash/Data` - yuk). I'm inclined to add a note in there suggesting people use go-unixfsnode if they can. BUT there may be a case for further consolidation of this pattern into go-ipld-prime that does all of these things and allows go-unixfsnode to piggyback it and add the unixfs semantics at the same time.

@hannahhoward hannahhoward merged commit 5473321 into main Mar 10, 2023
@kylehuntsman kylehuntsman deleted the rvagg/path-parsing branch May 25, 2023 01:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants