IRI mapping for multiple selectors (non-refined)? #454

Open
andymatuschak opened this issue Jun 7, 2024 · 0 comments

Summary: I'd like to be able to encode a URI fragment representing multiple selectors (not one refined by another), to be interpreted with the same semantics as a specific resource with multiple selectors. This doesn't seem to be possible with the current mapping.

Hypothes.is's strategy for specifying and resolving annotations uses a combination of TextQuoteSelector and TextPositionSelector. The combination allows some resilience to both document modifications and also to ambiguous matches. Here's an example:

"selector": [
  {
    "start": 1239,
    "end": 1283,
    "type": "TextPositionSelector"
  },
  {
    "type": "TextQuoteSelector",
    "prefix": "Na-na-na-na-na-na-na, na-na-na-na, hey, Jude",
    "exact": "Na-na-na-na-na-na-na, na-na-na-na, hey, Jude",
    "suffix": "Na-na-na-na-na-na-na, na-na-na-na, hey, Jude"
  }
]

To resolve the segment of interest, Hypothes.is:

  1. First tries the TextPositionSelector. If the text in that range is identical to the exact key of the TextQuoteSelector, we're done.
  2. Finds all places which match the TextQuoteSelector. It uses a fuzzy matcher, so it's tolerant of small modifications. If there's only one match, we use it.
  3. If there are multiple—as in the "Hey Jude" example above—we choose the segment which most closely matches the TextPositionSelector.
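
To make the three steps concrete, here's a rough sketch in Python. It is not Hypothes.is's actual code: the names (`resolve`, `find_quote_matches`, and the two dataclasses) are made up for illustration, plain exact matching stands in for the fuzzy matcher, prefix and suffix are ignored, and "most closely matches" is assumed to mean the smallest distance between start offsets.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass
class TextPositionSelector:
    start: int
    end: int


@dataclass
class TextQuoteSelector:
    exact: str
    prefix: str = ""
    suffix: str = ""


def find_quote_matches(text: str, quote: TextQuoteSelector) -> List[Range]:
    """Return every range where `quote.exact` occurs in `text`.

    Assumption: exact substring search replaces the fuzzy matcher,
    and prefix/suffix are not consulted.
    """
    if not quote.exact:
        return []
    matches = []
    i = text.find(quote.exact)
    while i != -1:
        matches.append((i, i + len(quote.exact)))
        i = text.find(quote.exact, i + 1)
    return matches


def resolve(
    text: str, position: TextPositionSelector, quote: TextQuoteSelector
) -> Optional[Range]:
    # Step 1: trust the position selector if the text there still matches.
    if text[position.start:position.end] == quote.exact:
        return (position.start, position.end)

    # Step 2: otherwise search for the quote; a unique match wins.
    matches = find_quote_matches(text, quote)
    if not matches:
        return None
    if len(matches) == 1:
        return matches[0]

    # Step 3: several matches, so pick the one nearest the recorded position
    # (assumed metric: distance between start offsets).
    return min(matches, key=lambda m: abs(m[0] - position.start))
```

With a repeated line like the "Hey Jude" one, the quote alone is ambiguous, and the position alone breaks as soon as text is inserted earlier in the document. Together they still recover the intended occurrence.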

With only the TextPositionSelector, there's no resilience to document changes. With only the TextQuoteSelector, there's no way to handle ambiguities. @dwhly reports here that Hypothes.is's data shows these problems do turn up in production.

Unfortunately, as far as I can tell, the IRI fragment mapping provided here only allows one selector to be encoded. This problem was briefly discussed in #93, but the group appears to have concluded that refinedBy handles these cases. I don't think it handles the case I'm describing here, but I'd love to be wrong!

Thanks for all your hard work, all.
