stations in the HAFAS API, but missing in db-stations #6

Open
derhuerst opened this issue May 15, 2017 · 7 comments

@derhuerst
Owner

derhuerst commented May 15, 2017

When querying routes from 8011167 to 8000261, I get the following station as start location:

{
  type: 'station',
  name: 'Berlin Jungfernheide',
  latitude: 52.530408,
  longitude: 13.299424,
  id: 8011167,
  platform: '4'
}

The id 8011167 does not occur in db-stations@1.0.0, taken from the stations API.
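A check like the one described above can be sketched as follows. This is only an illustration: `findMissingStations` is a hypothetical helper, and `sampleKnownStations` is a made-up stand-in, not the actual contents of db-stations@1.0.0. Ids are compared as strings as a defensive choice, in case one source delivers numbers and the other strings.

```javascript
// Hypothetical helper: return the HAFAS stations whose id is absent
// from a given list of known stations.
function findMissingStations(hafasStations, knownStations) {
  // Normalise ids to strings so 8011167 and '8011167' compare equal.
  const knownIds = new Set(knownStations.map((s) => String(s.id)));
  return hafasStations.filter((s) => !knownIds.has(String(s.id)));
}

// Stations seen in a HAFAS response (ids taken from this issue).
const seenInHafas = [
  { type: 'station', name: 'Berlin Jungfernheide', id: 8011167 },
  { type: 'station', id: 8000261 } // name not given in this thread
];

// Made-up stand-in for the station list, for illustration only.
const sampleKnownStations = [{ id: 8000261 }];

const missing = findMissingStations(seenInHafas, sampleKnownStations);
console.log(missing.map((s) => s.id));
```

With this sample data, only the Berlin Jungfernheide entry is reported as missing.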

PS: @lightsprint09 @highsource @juliuste any idea? 😉 In the long term, I'd guess the whole open data community would appreciate it if the API really contained all stations that DB provides routes for.

@derhuerst
Owner Author

derhuerst commented May 15, 2017

db-stations@0.4.0 has the following entry with that id:

{
  "ds100": "BJUF",
  "nr": 3067,
  "name": "Berlin Jungfernheide",
  "zip": "10589",
  "city": "Berlin",
  "state": "BE",
  "id": 8011167,
  "latitude": 52.530276,
  "longitude": 13.299437
}

More specifically:

curl -sL 'http://download-data.deutschebahn.com/static/datasets/haltestellen/D_Bahnhof_2016_01_alle.csv' | grep 8011167
# 8011167;BJUF;Berlin Jungfernheide;RV;13.299437;52.530276;;
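For reference, a row like the one above could be turned into an object roughly as follows. The column order in `ASSUMED_COLUMNS` is an assumption inferred from this single row and the db-stations@0.4.0 entry, not taken from the CSV header; the name `category` for the `RV` cell is hypothetical.

```javascript
// Assumed column order (inferred from one row, not from the CSV header).
// 'category' is a hypothetical name for the cell containing 'RV'.
const ASSUMED_COLUMNS = ['id', 'ds100', 'name', 'category', 'longitude', 'latitude'];

// Parse one semicolon-separated line into a plain object.
// Trailing empty cells are ignored.
function parseStationLine(line) {
  const cells = line.split(';');
  const station = {};
  ASSUMED_COLUMNS.forEach((key, i) => {
    station[key] = cells[i];
  });
  // Convert the numeric fields from strings to numbers.
  station.id = Number(station.id);
  station.longitude = Number(station.longitude);
  station.latitude = Number(station.latitude);
  return station;
}

const row = '8011167;BJUF;Berlin Jungfernheide;RV;13.299437;52.530276;;';
const parsedStation = parseStationLine(row);
console.log(parsedStation);
```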

@highsource

I've forwarded this to someone in DB S&S.

@voland10557

I don't get what exactly the problem is, but here's some background info:

DB S&S has around 5400 train stations ("Bahnhöfe" + "Haltepunkte"):
https://data.deutschebahn.com/dataset/data-stationsdaten

This list contains around 6600 stops, including tram stops, bus stops, etc., so NOT only train stations. It's not really Deutsche Bahn data (and I bet DB isn't keeping it up to date):
https://data.deutschebahn.com/dataset/data-haltestellen
Please see Verkehrsrot's post in the comment section for more info.

So if you want to work with train stations data (DB S&S stations), you should work with the static "data-stationsdaten" or StaDa-API:
https://developer.deutschebahn.com/store/apis/info?name=StaDa-Station_Data&version=v2&provider=DBOpenData

And of course the data of DB RegioNetz Infrastruktur (RNI) stations (that is not part of DB S&S):
https://data.deutschebahn.com/dataset/data-stationsdaten-regio

@derhuerst
Owner Author

@voland10557 Thanks for the explanation!

Let me first outline the (ideological) standpoint of a non-DB programmer/user:

<rant>

I'm trying to work with DB data. I, quite frankly, don't care which subdivision of Deutsche Bahn is responsible for which stations. I just want to consume the data. I don't intend to guess the dozens of DB internal abbreviations.

As someone from the outside, I can't understand why the local public transport data (buses, local trains, etc.) is outdated, especially since Deutsche Bahn afair tries to cover them with routing information. So if DB covers an area/agency, its job is to have correct data about it.

DB has, from my experience, a pretty long record of technical & legal excuses to provide a subpar user experience, both to end-users & to programmers. In the context of making long-distance public transport more convenient, to ultimately sell more tickets, it would make sense to overcome these limitations.

Just like me, the 300 other community programmers & 200 other companies out there don't intend to figure out all the weird abbreviations. They don't intend to stick together 3 inconsistent datasets. They don't intend to make their programs extra fault-proof for cases like IDs or names missing completely.

</rant>

Now the perspective of me, someone more motivated and technically involved:

  • The structure of the Stationsdaten API looks a lot like it is supposed to cover all stations DB has a record of. aufgabentraeger, stationManagement, multiple IBNRs and RIL100 ids all suggest that it is meant to act as an aggregating API. Is this correct?
  • What is the unifying identifier I can use to find a station across all APIs & datasets? IBNRs (a.k.a. EVA numbers), RIL100 identifiers, nr field from the API mentioned above?
  • Is there any dataset or stations API that covers all stations that the routing APIs cover? e.g. other countries, local public transport, buses
  • Why are there two data portals, one for static datasets & one for APIs?

@derhuerst
Owner Author

derhuerst commented May 16, 2017

My intention is to have a module db-stations that contains every station that DB has routing information for or, as in the case above, returns when queried for a different station. It would probably be useful to have flags for the following properties:

  • Is a station being serviced (as in DB trains run there) by DB (Fernverkehr/Regionalverkehr) itself or just by local/third-party companies it only has data about?
  • Is a station being managed by DB (e.g. by DB Station & Service)?
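To make the idea concrete, the two flags could look roughly like this. Everything here is a sketch: the field names `servedByDb` and `managedByDb`, the helper `filterByFlags`, and all example stations are hypothetical and made up for illustration; none of them exist in db-stations.

```javascript
// Made-up example records; servedByDb / managedByDb are hypothetical field names.
const flaggedStations = [
  { id: 1, name: 'Example Hbf', servedByDb: true, managedByDb: true },
  { id: 2, name: 'Example bus stop', servedByDb: false, managedByDb: false },
  { id: 3, name: 'Example private halt', servedByDb: true, managedByDb: false }
];

// Hypothetical helper: keep only stations whose flags match all given values.
function filterByFlags(stations, flags) {
  return stations.filter((station) =>
    Object.entries(flags).every(([key, value]) => station[key] === value)
  );
}

// Stations that DB serves but does not manage itself.
const servedNotManaged = filterByFlags(flaggedStations, {
  servedByDb: true,
  managedByDb: false
});
console.log(servedNotManaged.map((s) => s.name));
```

Consumers who only care about DB-managed stations could then filter on `managedByDb`, while routing-oriented users would keep everything.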

@derhuerst
Owner Author

@tursics

@derhuerst
Owner Author

FYI this issue still persists.

@derhuerst changed the title from "stations missing that appear in the HAFAS API" to "stations in the HAFAS API, but missing in db-stations" on Jan 25, 2018