document background & rationale #4
I waited a bit to comment because I have more questions than answers. Here we go...
I would have expected an algorithm description of how to calculate such an ID and on the basis of what input data. Did I miss it somewhere?
I would also expect a comparison to the way onestop id works. However, I have problems with that approach as:
- It sets an arbitrary distance the stop may be temporarily moved (e.g., for maintenance)
- False positives are likely to occur (cf. "Still more 'false positive' issues with dist calc", transitland/transitland-datastore#945)
- The identifiers themselves are descriptive: shouldn’t we then just use blank nodes with a description that can be used by a reconciliation algorithm at a later time?
Reconciling a stop based on its description is something we should try to standardize (and maybe not call that an identifier?), but at least open up the possibility to have different kinds of descriptions based on the specific transport network as well as the specific back-end system’s scope.
Another project I find inspirational in that regard is this one: https://sharedstreets.io/. This type of location referencing could maybe also be a good idea for this project?
Just came across this Twitter post by @danbri: https://twitter.com/danbri/status/1220685660677443584 -- The discussion is very similar to this one. In my opinion we need two things: to strive towards global identifiers that are used everywhere and, where that is not possible, to have standardized methods to reconcile.
Maybe this isn't clear from reading the readme (yet). This project computes multiple IDs for an item: rather unstable & imprecise "stepping-stone" IDs, as well as those supposed to be stable and widespread in the future. What phrasing would you like to see to make this more clear?
I didn't document this yet because, as mentioned in #2, there's no standalone reference implementation (or thorough documentation & set of test fixtures). Therefore, there's no reliable way to make sure the Transitland and
I agree! I think we could make it more stable by picking the precision based on the type/size of the structure, but it still doesn't solve the problem. There is a more fundamental underlying question, though: For how long do you consider a moved stop to be the same as the "former" one?
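To make the precision idea concrete, here is a minimal sketch, not this project's actual algorithm. It assumes an ID is built from the structure type, a grid cell derived from the coordinates, and a normalized name. The grid sizes, the ID layout and the names `GRID_DEGREES` and `deterministic_id` are all hypothetical.

```python
import math

# Hypothetical grid sizes in degrees per kind of structure. The values are
# illustrative assumptions, not taken from this project or from Transitland.
GRID_DEGREES = {
    "platform": 0.0001,  # roughly 10 m in latitude
    "stop": 0.001,       # roughly 100 m in latitude
    "station": 0.01,     # roughly 1 km in latitude
}


def normalize_name(name: str) -> str:
    """Lower-case the name and drop everything that is not a letter or digit."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def grid_cell(value: float, grid: float) -> int:
    """Index of the grid cell that contains the coordinate."""
    return math.floor(value / grid)


def deterministic_id(kind: str, name: str, lat: float, lon: float) -> str:
    """Build an ID from kind, coarse location and normalized name."""
    grid = GRID_DEGREES[kind]
    return f"{kind}:{grid_cell(lat, grid)}:{grid_cell(lon, grid)}:{normalize_name(name)}"


# A stop moved by a few dozen metres keeps its ID at "stop" precision ...
before = deterministic_id("stop", "Berlin Hbf", 52.52513, 13.36944)
after = deterministic_id("stop", "berlin hbf", 52.52548, 13.36951)
print(before == after)

# ... but the same move changes the ID at "platform" precision.
print(
    deterministic_id("platform", "Berlin Hbf", 52.52513, 13.36944)
    == deterministic_id("platform", "Berlin Hbf", 52.52548, 13.36951)
)
```

A coarser grid tolerates larger moves, but a stop close to a cell boundary still changes its ID after a tiny move, which is why this does not solve the underlying problem.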
Definitely! I still think the goal of deterministic IDs is worth pursuing though. Even a stable-in-99%-of-the-cases ID scheme is an improvement over arbitrary vendor-specific IDs.
I thought about that too! By accident, this may end up becoming a general model/ontology for describing public transportation infrastructure though (and would then heavily overlap in scope with Transmodel, GTFS, OSM and Wikidata), so I'd like to tackle this later.
I agree, but would like to work on this once I have gotten an idea of how well this works across Europe.
What is the difference between an ID for an item made up of its properties, and a (minimal) list of properties describing it as uniquely & precisely as possible? Deterministic IDs blend the two, right?
That is what I wanted to achieve with the list of IDs for a single item. I should emphasise this more if it didn't come across from the text.
Very interesting project! Will play with it. Do you want to use it to reference e.g. a road next to the station, or just reuse its concepts? I stumbled upon this conceptually similar question in sharedstreets/sharedstreets-ref-system#23
For exactly this reason I find the term “deterministic IDs” confusing. When an ID must be deterministic, the fact that all systems need to adhere to the same structure will break certain use cases. We want a decentralized approach because different organizations have different political, organizational and cultural aspects to take into account. I like the term “deterministic ID” when used in combination with a global identification system. E.g., a certain URL may follow a certain deterministic URI scheme (e.g.,
As stops are always attached to one or multiple roads, I find this an incredibly interesting idea. It would bring together geospatial referencing as well as a global ID approach.
Are they? This may not work all the time for watercraft stops & airports. I quickly looked up a minor airport which is only connected to the road network by a
I assume we only have different interpretations of the terms we used, not of the concepts. If two systems shared a "deterministic ID", they would have to use the same structure. If they needed different structures, they would yield different IDs.
In my understanding, the two URLs that you mentioned contain one ID each.
It still has a location where you should be dropped off, right? I guess the location I’m mostly interested in is where one must enter?
I think we have a misunderstanding here. 😅 While I agree from an end-user UX point of view, this is about getting the IDs as widely adopted & stable as possible, so IMO the priorities are slightly different: As an example, there might be three data sets with 1) general station info and connections between them, 2) where & how to enter each station, with accessibility info, and 3) if the elevators at each station are working. Each of these 3 APIs considers different "aspects" of the train station to be important for identifying it, but probably all know roughly about its location, name and size. When combining all three data sets using some kind of shared ID, we enable the UX that end users want & need: just getting to know how to get from the street nearby into the train to their destination.
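As a rough illustration of the three-data-set example, the sketch below merges records that share at least one ID. It assumes every record carries a list of IDs under an `ids` key. The record layout, the ID strings and the function name `merge_by_shared_ids` are hypothetical and not part of this project.

```python
def merge_by_shared_ids(datasets):
    """Combine records from several data sets that share at least one ID.

    Simplification: a record that links two already separate groups is
    attached to the first group it matches; the groups are not merged.
    """
    merged = []  # combined items, in order of first appearance
    index = {}   # ID -> combined item that already carries this ID
    for records in datasets:
        for record in records:
            target = next((index[i] for i in record["ids"] if i in index), None)
            if target is None:
                target = {"ids": set()}
                merged.append(target)
            target["ids"].update(record["ids"])
            for key, value in record.items():
                if key != "ids":
                    target[key] = value
            for i in record["ids"]:
                index[i] = target
    return merged


# Hypothetical records; the ID strings are made up for illustration.
station_info = [{"ids": ["geo:52525:13369", "name:examplehbf"], "name": "Example Hbf"}]
entrances = [{"ids": ["name:examplehbf"], "entrance": "north side, step-free"}]
elevators = [{"ids": ["geo:52525:13369"], "elevator_working": True}]

for item in merge_by_shared_ids([station_info, entrances, elevators]):
    print(item["name"], item["entrance"], item["elevator_working"])
```

None of the three data sets needs to agree on a single canonical ID for this to work. One overlapping entry in their ID lists is enough to link the records.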
Let's continue discussing this over at public-transport/why-linked-open-transit-data#1.
@juliuste @pietercolpaert Is anything crucial missing? Keep in mind that this project is a prototype, so only the rationale is important to understand.