Discovery: Clarify the INTENT of the TEI #89

Open
oej opened this issue Nov 21, 2024 · 5 comments

@oej
Collaborator

oej commented Nov 21, 2024

We've had several discussions on the TEI syntax and the intent. This needs to be clarified.

  • The TEI is primarily an identifier. It helps with discovery.
  • Some types use DNS for discovery (all the current TEI types do).
  • The TYPE comes before the DNS name so that future extensions can use different discovery mechanisms.
@oej
Collaborator Author

oej commented Nov 22, 2024

URNs are identifiers. URLs are locators. The point of having a URN is that the resource it names can be reached at many locations. The ISBN of a book is a unique identifier; a URL to that book on Amazon is a locator.

If we remove DNS failover and load balancing, we miss the point of an identifier and might as well use URLs directly.

@ppkarwasz

My PoV is slightly different:

  • The TEI is exclusively an identifier for a product.
  • Its format is urn:tei:<subregistry>:<authority>:<product_id>.
  • Currently only the subregistry dns is defined. The authority part is a DNS name and is automatically assigned to the owner of the domain.
  • The dns subregistry does not require a dedicated resolver to find the TEA service endpoint (unlike urn:isbn). A resolution algorithm that only uses DNS and HTTPS is specified. Of course, this does not preclude dedicated TEA resolvers in the future.
  • Other subregistries might be defined in the future.
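
To make the proposed format concrete, here is a minimal parsing sketch. The names Tei and parse_tei are hypothetical, and keeping any extra colons inside the product_id part is an assumption of this sketch, not something the thread settles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tei:
    subregistry: str
    authority: str
    product_id: str


def parse_tei(value: str) -> Tei:
    """Split a TEI of the form urn:tei:<subregistry>:<authority>:<product_id>."""
    # Assumption: at most four splits, so further colons stay in product_id.
    parts = value.split(":", 4)
    if len(parts) != 5 or any(not part for part in parts):
        raise ValueError(f"not a TEI: {value!r}")
    scheme, nid, subregistry, authority, product_id = parts
    # The "urn" scheme and the namespace identifier are case-insensitive.
    if scheme.lower() != "urn" or nid.lower() != "tei":
        raise ValueError(f"not a TEI: {value!r}")
    return Tei(subregistry, authority, product_id)
```

For example, parse_tei("urn:tei:dns:commons.apache.org:example-product") yields subregistry "dns", authority "commons.apache.org" and product_id "example-product" (the product id here is made up for illustration).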

I would really like for TEA clients to have an abstract TeiResolver interface from day 1. This interface will resolve a TEI identifier into an HTTPS endpoint (with all the parameters HTTP people like). I don't think this is a big burden on implementers.
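
A rough sketch of what such an interface could look like, written in Python for brevity. Only the name TeiResolver comes from the comment above; the resolve method, its list return type (rather than a single endpoint), and the StaticTeiResolver helper are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class TeiResolver(ABC):
    """Turns a TEI into HTTPS endpoint URLs of a TEA service."""

    @abstractmethod
    def resolve(self, tei: str) -> list[str]:
        """Return candidate HTTPS endpoint URLs; raise LookupError if none."""


class StaticTeiResolver(TeiResolver):
    """Hypothetical resolver backed by a fixed table, e.g. for tests."""

    def __init__(self, table: dict[str, list[str]]) -> None:
        self._table = table

    def resolve(self, tei: str) -> list[str]:
        try:
            # Copy the list so callers cannot mutate the table.
            return list(self._table[tei])
        except KeyError:
            raise LookupError(f"cannot resolve {tei!r}") from None
```

A DNS-based resolver, or a future dedicated TEA resolver, would be another subclass; client code would depend only on TeiResolver.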

@oej
Collaborator Author

oej commented Nov 22, 2024

Ok, I'll buy that. Let's change to

  • The TEI syntax type moves after the domain
  • We require TEI clients (not servers/services) to support HTTPS DNS failover and load balancing

I'll add some example code to my repo for that.
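
Until that example code exists, here is one possible reading of the client-side requirement, as a sketch only. The thread does not say how endpoints are ranked; this assumes each candidate carries a numeric priority where a lower value is preferred (the convention DNS SRV and HTTPS records use). Endpoint and order_endpoints are hypothetical names.

```python
import random
from typing import NamedTuple, Optional


class Endpoint(NamedTuple):
    priority: int  # assumption: lower value is tried first
    url: str


def order_endpoints(
    endpoints: list[Endpoint], rng: Optional[random.Random] = None
) -> list[str]:
    """Order URLs by priority, randomising the order among equal priorities."""
    rng = rng or random.Random()
    shuffled = list(endpoints)
    # Shuffling first spreads load across endpoints that share a priority.
    rng.shuffle(shuffled)
    # list.sort is stable, so the shuffled order survives within each priority.
    shuffled.sort(key=lambda endpoint: endpoint.priority)
    return [endpoint.url for endpoint in shuffled]
```

Endpoints with the same priority share the load, while endpoints with a higher priority value act as backups.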

@ppkarwasz

ppkarwasz commented Nov 22, 2024

Ok, I'll buy that. Let's change to

  • The TEI syntax type moves after the domain

We can move the type after the domain only if a DNS domain is always the right identifier for a software producer.

I don't think it is. For example, the Apache Commons PMC could use these TEIs for their software:

  • urn:tei:dns:commons.apache.org:<something>. This is something we already defined, including a standard way to find the TEA service endpoint.
  • Some sort of PURL-based identifier: urn:tei:pkg:maven/commons-io:<something>. The maven/commons-io namespace is a better way to describe who owns the TEI, since it does not depend on the registration of the commons.apache.org domain. Of course, there is currently no way to resolve this kind of TEI.

As I mentioned elsewhere, we could have TEIs of the form urn:tei:vies:PL1234567890:<something> that are owned by whoever holds that particular EU VAT number. Resolving such a TEI will require a lot of infrastructure, but the TEI itself is much more stable over time (I don't have to renew my VAT number 😉).

Summarising: I think that your initial approach of putting <type> before <domain-name> was the correct one, although I am not sure that <domain-name> necessarily needs to be a DNS domain name. To bootstrap the system we need to use DNS domain names for now, but let us not exclude the possibility that each <type> will use a different kind of identifier in the future.
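
A small sketch of why reading the type first helps: the type selects how the following part is interpreted. The function name, the returned strings, and the loose hostname pattern are all illustrative assumptions; the thread does not define a validation rule for any type.

```python
import re

# Loose hostname check, for illustration only.
_HOSTNAME = re.compile(
    r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$",
    re.IGNORECASE,
)


def _check_dns(authority: str) -> bool:
    return _HOSTNAME.match(authority) is not None


# Only "dns" is known today; other types could be registered later.
AUTHORITY_CHECKS = {"dns": _check_dns}


def authority_status(tei: str) -> str:
    """Classify the part after <type> using the check registered for that type."""
    # Raises ValueError if the string has fewer than five colon-separated parts.
    _, _, tei_type, authority, _ = tei.split(":", 4)
    check = AUTHORITY_CHECKS.get(tei_type)
    if check is None:
        return "unknown type"
    return "valid" if check(authority) else "invalid"
```

With this layout, a string such as PL1234567890 or maven/commons-io is never mistaken for a malformed domain name: the client sees an unknown type and stops before interpreting the rest.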

@ppkarwasz

ppkarwasz commented Nov 22, 2024

  • We require TEI clients (not servers/services) to support HTTPS DNS failover and load balancing

Sure, that is something we could set in stone. TEI resolvers should provide a prioritized list of HTTPS endpoints. It doesn't really matter how they do it, as long as they do it. The TEI resolver feeds the list to the TEA client, which will try the backups if a request fails.
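
The client side of this could be sketched as follows. The function name fetch_with_failover and the injected fetch callable are hypothetical; treating any OSError as a reason to move on to the next endpoint is an assumption of this sketch.

```python
from typing import Callable, Sequence


def fetch_with_failover(
    endpoints: Sequence[str], fetch: Callable[[str], bytes]
) -> bytes:
    """Try each endpoint in the given order; return the first successful response."""
    errors: list[Exception] = []
    for url in endpoints:
        try:
            return fetch(url)
        except OSError as exc:
            # urllib.error.URLError and socket errors derive from OSError.
            errors.append(exc)
    raise ConnectionError(f"all {len(endpoints)} endpoints failed: {errors}")
```

A real client might pass something like lambda url: urllib.request.urlopen(url, timeout=5).read() as fetch. Note that HTTP error responses raised by urlopen would then also trigger a move to the next endpoint, which may or may not be the desired behaviour.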
