This repository has been archived by the owner on Nov 22, 2019. It is now read-only.

pelias-deprecated/document-service

This repository is part of the Pelias project. Pelias is an open-source, open-data geocoder originally sponsored by Mapzen. Our official user documentation is here.

Pelias Document Service

** This service is deprecated and will no longer be maintained **

Overview

Module that provides a web service to aid language-agnostic importers with creating documents for insertion into an Elasticsearch index queryable by the Pelias API. Who's on First data is required in order to populate the documents' administrative hierarchy.

Installation

$ git clone git@github.com:pelias/document-service.git
$ cd document-service
$ npm install
$ npm start /path/to/whosonfirst/data

For ease of use, Who's on First data can be downloaded using scripts provided by the Pelias Who's on First module.

NPM Module

The pelias-document-service npm module can be found here:

https://npmjs.org/package/pelias-document-service

Usage

To start the document service, type: npm start <path to Who's on First data>. By default, the service runs on port 5000; this can be overridden with the PORT environment variable.

GET requests are made to the /synthesize endpoint in the format: http://localhost:5000/synthesize/<source>/<layer>.

source is the name of the data source. It is stored in the synthesized document and can be used for filtering. For example, data imported from OpenAddresses would use openaddresses for source.

layer is the type of data that this document represents. Currently, the only valid values for layer are address, street, and venue.

Parameters

The following parameters are supported for the service:

name | required | description
id | yes | a unique identifier for reference in Elasticsearch
lon | yes | longitude of the record
lat | yes | latitude of the record
name | yes | a textual name of the record such as the name of a business (for venues) or house_number + street (for addresses), used by the Pelias API to create result labels
house_number | address layer: yes; venue layer: no; street layer: invalid | house number of an address or venue
street | address/street layers: yes; venue layer: no | street of an address or venue
postcode | no | postcode of an address or venue
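
As an illustration of how the endpoint format and the parameters fit together, the following Python sketch builds a request URL. The function name synthesize_url and its base argument are hypothetical and not part of the service; the layer check simply mirrors the three valid values listed above.

```python
from urllib.parse import quote, urlencode

# The only layer values the service currently accepts.
VALID_LAYERS = {"address", "street", "venue"}


def synthesize_url(source, layer, params, base="http://localhost:5000"):
    """Build a /synthesize/<source>/<layer> URL (hypothetical helper)."""
    if layer not in VALID_LAYERS:
        raise ValueError(f"unsupported layer: {layer!r}")
    # urlencode escapes each value and encodes spaces as '+'.
    query = urlencode(params)
    return f"{base}/synthesize/{quote(source, safe='')}/{layer}?{query}"


url = synthesize_url(
    "openaddresses",
    "address",
    {
        "id": "6364a510f0268d6f",
        "lon": -73.9904095,
        "lat": 40.74427,
        "name": "30 W 26th St",
        "house_number": "30",
        "street": "W 26th St",
        "postcode": "10010",
    },
)
print(url)
```

If the service was started with a different PORT, pass the matching base value.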

Output

GET requests to the /synthesize endpoint return an application/json response that is ready to be sent to Elasticsearch for create/update, after which it is queryable by the Pelias API. For example, the output for the OpenAddresses record 30 W 26th St is:

{
  "name": {
    "default": "30 W 26th St"
  },
  "phrase": {
    "default": "30 W 26th St"
  },
  "parent": {
    "locality": [
      "New York"
    ],
    "locality_id": [
      "85977539"
    ],
    "locality_a": [
      null
    ],
    "neighbourhood": [
      "Flatiron District"
    ],
    "neighbourhood_id": [
      "85869245"
    ],
    "neighbourhood_a": [
      null
    ],
    "county": [
      "New York County"
    ],
    "county_id": [
      "102081863"
    ],
    "county_a": [
      null
    ],
    "borough": [
      "Manhattan"
    ],
    "borough_id": [
      "421205771"
    ],
    "borough_a": [
      null
    ],
    "region": [
      "New York"
    ],
    "region_id": [
      "85688543"
    ],
    "region_a": [
      "NY"
    ],
    "country": [
      "United States"
    ],
    "country_id": [
      "85633793"
    ],
    "country_a": [
      "USA"
    ]
  },
  "address_parts": {
    "number": "30",
    "street": "W 26th St",
    "zip": "10010"
  },
  "center_point": {
    "lon": -73.990409,
    "lat": 40.74427
  },
  "source": "openaddresses",
  "layer": "address",
  "source_id": "6364a510f0268d6f"
}
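
In the sample above, each administrative level in parent appears as three arrays: the name, an _id companion, and an _a (abbreviation) companion. The Python sketch below collapses them into one entry per level. It assumes the three arrays line up by position, as the sample suggests; admin_summary is a hypothetical helper and the document is a trimmed copy of the output above.

```python
def admin_summary(document):
    """Return {level: (name, id, abbreviation)} from the parent block."""
    parent = document.get("parent", {})
    summary = {}
    for key, names in parent.items():
        # The "_id" and "_a" arrays are read alongside their level below.
        if key.endswith("_id") or key.endswith("_a"):
            continue
        ids = parent.get(f"{key}_id", [])
        abbreviations = parent.get(f"{key}_a", [])
        summary[key] = (
            names[0] if names else None,
            ids[0] if ids else None,
            abbreviations[0] if abbreviations else None,
        )
    return summary


# Trimmed copy of the sample output above.
sample = {
    "parent": {
        "locality": ["New York"],
        "locality_id": ["85977539"],
        "locality_a": [None],
        "region": ["New York"],
        "region_id": ["85688543"],
        "region_a": ["NY"],
    }
}
print(admin_summary(sample))
```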

Request Examples

There are 3 types of documents that can be synthesized, each corresponding to the layer value of the request path:

  • venue
  • address
  • street

Venues

Venue documents are synthesized by calling the /synthesize/<source>/venue endpoint. Each venue has a lat, lon, id, name, and optional house_number, street, and postcode. name is typically the name of the business or point-of-interest, such as "New York Bakery" or "Yellowstone National Park". house_number and street are optional since in some cases this information is either not applicable (as in the case of national parks or water features which are defined as polygons) or confidential (such as women's shelters or other cases where point accuracy is to be purposely obscured).

Example (data from OpenStreetMap): http://localhost:5000/synthesize/openstreetmap/venue?id=264768896&lon=-73.989642&lat=40.74101&name=Flatiron+Building&house_number=175&street=5th+Avenue&postcode=10010

Addresses

Address documents can be synthesized by calling the /synthesize/<source>/address endpoint. Each address has a lat, lon, id, name, house_number, street, and optional postcode. The name value is typically just the formatted address, which can be number-prefixed, as in "30 West 26th Street, New York, NY", or -postfixed, as in "Rigaer Straße 11, Berlin, Germany", but can be anything. The document service makes no judgements on what the value of name should be; its value is determined by the caller.

Example (data from OpenAddresses): http://localhost:5000/synthesize/openaddresses/address?id=6364a510f0268d6f&lon=-73.9904095&lat=40.74427&name=30+W+26th+St&house_number=30&street=W+26th+St&postcode=10010

Streets

Street documents are synthesized using the /synthesize/<source>/street endpoint. Each street has a lat, lon, id, name, street, and optional postcode. If a street is entirely contained within a single postcode, that postcode should be supplied if available. Typically, the name value is the same as the street value, but this is not enforced.

Example (data from OpenStreetMap): http://localhost:5000/synthesize/openstreetmap/street?id=10540891&lon=-73.935546&lat=40.813082&name=Madison+Avenue&street=Madison+Avenue

Error Conditions

Client Errors

The /synthesize endpoint returns HTTP status code 400 with an error message under any of the following conditions:

  • lat value is not parseable as a finite number
  • lon value is not parseable as a finite number
  • id value is empty
  • name value is empty
  • address layer-specific:
    • house_number value is empty
    • street value is empty
  • street layer-specific:
    • house_number value is non-empty
  • venue layer-specific:
    • house_number value is non-empty and street value is empty
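
An importer may want to catch these conditions before making a request. The Python sketch below mirrors the list above and nothing more. It is not the service's own validation code: the function name and messages are hypothetical, whitespace-only values are treated as empty by assumption, and Python's float parsing may differ from the service's in edge cases.

```python
import math


def _is_blank(value):
    # Assumption: whitespace-only values count as empty.
    return value is None or str(value).strip() == ""


def client_errors(layer, params):
    """Return a list of problems that would be expected to trigger a 400."""
    errors = []
    for coord in ("lat", "lon"):
        try:
            number = float(params.get(coord))
        except (TypeError, ValueError):
            number = math.nan
        if not math.isfinite(number):
            errors.append(f"{coord} is not a finite number")
    for field in ("id", "name"):
        if _is_blank(params.get(field)):
            errors.append(f"{field} is empty")

    has_number = not _is_blank(params.get("house_number"))
    has_street = not _is_blank(params.get("street"))
    if layer == "address":
        if not has_number:
            errors.append("house_number is empty")
        if not has_street:
            errors.append("street is empty")
    elif layer == "street":
        if has_number:
            errors.append("house_number is not allowed for streets")
    elif layer == "venue":
        if has_number and not has_street:
            errors.append("street is empty but house_number is set")
    return errors


print(client_errors("venue", {"id": "1", "lat": "abc", "lon": "-73.9",
                              "name": "Flatiron Building", "house_number": "175"}))
```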

Server Errors

The /synthesize endpoint returns HTTP status code 500 when an error occurs while performing the administrative hierarchy lookup.
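
One way a caller might separate the two error classes is sketched below in Python. The exception names and the parse_response function are hypothetical, and treating 200 as the success status is an assumption, since this document only specifies the 400 and 500 cases.

```python
import json


class ClientError(Exception):
    """HTTP 400: the request parameters were rejected."""


class ServerError(Exception):
    """HTTP 500: administrative hierarchy lookup failed."""


def parse_response(status, body):
    """Turn a status code and body text into a document or an exception."""
    if status == 400:
        # Resending the same parameters is not expected to succeed.
        raise ClientError(body)
    if status == 500:
        raise ServerError(body)
    if status != 200:  # assumption: success is reported as 200
        raise RuntimeError(f"unexpected status {status}")
    return json.loads(body)


print(parse_response(200, '{"layer": "venue"}'))
```

Keeping the two classes apart lets an importer log and skip a bad record on ClientError while handling ServerError differently, for example by stopping the import.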

Writing an Importer

Because this service currently only looks up the administrative hierarchy and formats the request parameters into JSON ready to be sent to Elasticsearch, an importer needs only two capabilities: making HTTP GET requests, and indexing documents into Elasticsearch, either through an Elasticsearch library or through plain HTTP POST requests.
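
A minimal Python sketch of that flow, using only the standard library, might look like the following. The names import_record, http_get, and http_post are hypothetical. The Elasticsearch URL is supplied by the caller because this document does not specify an index name or document path. The get and post arguments exist so the HTTP calls can be swapped out, for example in tests.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def http_get(url):
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def http_post(url, body):
    request = Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8")


def import_record(source, layer, record, index_url,
                  get=http_get, post=http_post,
                  service="http://localhost:5000"):
    """Synthesize one document and forward it to Elasticsearch."""
    url = f"{service}/synthesize/{source}/{layer}?{urlencode(record)}"
    document = get(url)   # JSON text produced by the document service
    json.loads(document)  # fail early if the response is not valid JSON
    return post(index_url, document)
```

Calling import_record with the default get and post requires a running document service and a reachable Elasticsearch endpoint.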

Example importers have been written in a variety of languages: