MapboxTilesRenderer #1024

Open

wants to merge 13 commits into main

Conversation

hokieg3n1us
Contributor

Implement MapboxTilesRenderer, which adds support for outputting TMS tilesets as an mbtiles file (a SQLite database following the MBTiles 1.3 schema). This file can then be easily used in GIS clients such as QGIS, though the primary motivation was creating an output format that could easily be consumed by GeoServer (via the MBTiles community extension).

Example using data from ACLED (Armed Conflict Location & Event Data):

import dask.dataframe
import datashader
from datashader.tiles import render_tiles
from datashader.utils import lnglat_to_meters

import colorcet


def _get_extents():
    return df.x.min().compute(), df.y.min().compute(), df.x.max().compute(), df.y.max().compute()


def _load_data_func(x_range, y_range):
    return df.loc[df.x.between(*x_range) & df.y.between(*y_range)]


def _rasterize_func(df, x_range, y_range, height, width):
    cvs = datashader.Canvas(x_range=x_range, y_range=y_range,
                            plot_height=height, plot_width=width)
    agg = cvs.points(df, 'x', 'y')
    return agg


def _shader_func(agg, span=None):
    img = datashader.tf.dynspread(datashader.tf.shade(agg, cmap=colorcet.fire))
    return img


# Can be utilized to customize image with watermark, etc.
def _post_render_func(img, **kwargs):
    return img


if __name__ == '__main__':
    df = dask.dataframe.read_csv('ACLED.csv', usecols=['longitude', 'latitude']).persist()

    # Project longitude/latitude to Web Mercator (meters), which is what the tiling expects.
    df['x'], df['y'] = lnglat_to_meters(df['longitude'], df['latitude'])

    # drop() is not in-place, so reassign the result.
    df = df.drop(['longitude', 'latitude'], axis=1)

    min_zoom = 0
    max_zoom = 8
    output_path = 'output/ACLED.mbtiles'

    render_tiles(_get_extents(),
                 range(min_zoom, max_zoom + 1),
                 load_data_func=_load_data_func,
                 rasterize_func=_rasterize_func,
                 shader_func=_shader_func,
                 post_render_func=_post_render_func,
                 output_path=output_path, num_workers=4)

ACLED.mbtiles loaded into QGIS:

[screenshot of the rendered tiles displayed in QGIS]

The render_tiles function now has two big changes to its behavior.

  1. It validates the output_path immediately. If the output_path is a directory, it will create it. If it is a file path ending in mbtiles, it will create the directory structure and then set up the tables and metadata. This should only be done once, so this functionality is exposed as a static method on the MapboxTilesRenderer (see the sketch after this list).
  2. It exposes the num_workers used by the Dask Bag, so that the SQLite file doesn't lock due to extremely high concurrency.
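
For context, the tables and metadata required by the MBTiles 1.3 specification can be created with plain SQLite. Below is a minimal sketch of such a one-time setup step; setup_mbtiles is a hypothetical name, not necessarily the static method added by this PR:

import os
import sqlite3


def setup_mbtiles(output_path, name='datashader tiles'):
    """Create an empty MBTiles file with the tables required by MBTiles 1.3.

    Hypothetical helper; the PR exposes equivalent one-time setup as a
    static method on MapboxTilesRenderer.
    """
    os.makedirs(os.path.dirname(output_path) or '.', exist_ok=True)
    with sqlite3.connect(output_path) as con:
        # Tables defined by the MBTiles 1.3 specification.
        con.execute("CREATE TABLE IF NOT EXISTS metadata (name TEXT, value TEXT)")
        con.execute(
            "CREATE TABLE IF NOT EXISTS tiles ("
            "zoom_level INTEGER, tile_column INTEGER, "
            "tile_row INTEGER, tile_data BLOB)"
        )
        con.execute(
            "CREATE UNIQUE INDEX IF NOT EXISTS tile_index "
            "ON tiles (zoom_level, tile_column, tile_row)"
        )
        # Minimal required metadata entries per the spec.
        con.executemany(
            "INSERT OR REPLACE INTO metadata (name, value) VALUES (?, ?)",
            [('name', name), ('format', 'png')],
        )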

Create MapboxTilesRenderer for outputting TMS tile sets as an mbtiles file (a SQLite database following the MBTiles 1.3 specification).
Compute min/max of the span in place instead of building a list of all values and then computing with Dask. Use the dimensions of the data array for label-based indexing instead of hard-coded x, y labels (which forced the input coordinate columns to be named x and y); a short illustration is sketched below.
Include capability to provide a local_cache_path. If provided, the aggregation for the super tiles will be stored locally as a NetCDF file (instead of keeping all these in memory, which eventually overflows memory at high zoom levels). This allows most of the processing to now be done completely out of core.
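
To illustrate the label-based indexing change, here is a minimal sketch assuming an aggregate produced by a datashader Canvas; select_region is a hypothetical helper, not the PR's exact code:

import xarray as xr


def select_region(agg: xr.DataArray, x_range, y_range) -> xr.DataArray:
    """Slice an aggregate using its own dimension names rather than
    assuming they are literally called 'x' and 'y' (hypothetical helper)."""
    ydim, xdim = agg.dims  # Canvas aggregates are ordered (y, x)
    return agg.sel({xdim: slice(*x_range), ydim: slice(*y_range)})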
@hokieg3n1us
Contributor Author

Included an additional capability for out-of-core processing. An optional parameter, local_cache_path, can be provided. If provided, the aggregates generated by Datashader for the super tiles will be persisted to that local cache as NetCDF files instead of keeping all of these intermediate results in memory. These aggregates are then individually loaded by Dask workers to generate the individual tiles. This allows tile generation for larger datasets at much higher zoom levels, limited only by available disk space.

Use the netCDF4 library for writing/loading xarray DataArrays from the cache, since it properly handles unsigned data types. This is an optional dependency; an error is raised if the caching feature is used without it installed.
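
A minimal sketch of how a super-tile aggregate could be round-tripped through the local cache using the netcdf4 engine; cache_super_tile and load_super_tile are hypothetical names, not necessarily the PR's API:

import os
import uuid
import xarray as xr


def cache_super_tile(agg: xr.DataArray, local_cache_path: str) -> str:
    """Persist a super-tile aggregate to the local cache and return its path."""
    os.makedirs(local_cache_path, exist_ok=True)
    path = os.path.join(local_cache_path, f'{uuid.uuid4().hex}.nc')
    # The netcdf4 engine preserves unsigned integer dtypes correctly.
    agg.to_netcdf(path, engine='netcdf4')
    return path


def load_super_tile(path: str) -> xr.DataArray:
    """Load a cached super-tile aggregate back into memory on a Dask worker."""
    return xr.open_dataarray(path, engine='netcdf4').load()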
@jbednar
Member

jbednar commented Oct 26, 2021

Very cool! We're working on fixing the tests, at which point we should be able to review this. Let me know if you're still planning to add more amazing features!

Allow updates to an existing mbtiles file, inserting or replacing rows in the SQLite database, and make the SQL statements consistent.
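
Insert-or-replace maps naturally onto SQLite's INSERT OR REPLACE, given the unique index over (zoom_level, tile_column, tile_row). A minimal sketch, with write_tile as a hypothetical helper rather than the PR's exact code:

import sqlite3


def write_tile(mbtiles_path, zoom, column, row, png_bytes):
    """Insert a tile, or replace it if that (zoom, column, row) already exists."""
    with sqlite3.connect(mbtiles_path) as con:
        con.execute(
            "INSERT OR REPLACE INTO tiles "
            "(zoom_level, tile_column, tile_row, tile_data) VALUES (?, ?, ?, ?)",
            (zoom, column, row, sqlite3.Binary(png_bytes)),
        )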
@hokieg3n1us
Contributor Author

That's all the features I currently have planned. I'd considered supporting output to pmtiles, either directly or as a conversion step from the mbtiles format, but that can wait until that format is more widely adopted.

calculate_zoom_level_stats has moved to a parallel implementation using Dask bags that does not cache the super tiles in memory. If a local_cache_path is provided, the super tiles are cached to disk as NetCDF files. Super tiles are then either recomputed during the render_super_tiles function or loaded from the local cache.
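
A rough sketch of the bag-based statistics pass; the helper names and the load_or_compute callable are assumptions, not the PR's exact interface:

import dask.bag as db
import numpy as np


def tile_min_max(agg):
    """Return (min, max) of one super-tile aggregate, ignoring NaNs."""
    data = np.asarray(agg.data, dtype='float64')
    return np.nanmin(data), np.nanmax(data)


def calculate_span(super_tile_infos, load_or_compute, num_workers=4):
    """Reduce the global span across super tiles with a Dask bag.

    load_or_compute is a callable that either recomputes an aggregate or
    loads it from the local NetCDF cache (hypothetical interface).
    """
    bag = db.from_sequence(super_tile_infos)
    results = bag.map(lambda info: tile_min_max(load_or_compute(info))).compute(
        num_workers=num_workers)
    return min(r[0] for r in results), max(r[1] for r in results)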

Note: If using Datashader's render_tiles, the Dask scheduler should be set to 'threads' (when running locally) to prevent the Dask bag computations from copying the input DataFrame to multiple processes, which would increase memory overhead. If outputting tiles to the MBTiles format, num_workers should be tuned to prevent the SQLite database from locking during transactions. Both can be configured using dask.config.set(scheduler='threads', num_workers=4), as shown below.
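
For example (the values shown are illustrative, applied once before calling render_tiles):

import dask

# The threaded scheduler shares the input DataFrame between workers, and a
# modest worker count keeps SQLite write contention low for mbtiles output.
dask.config.set(scheduler='threads', num_workers=4)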
Handle setup of output paths.
@ianthomas23 ianthomas23 assigned ianthomas23 and unassigned jbednar Jul 18, 2022
@ianthomas23 ianthomas23 modified the milestones: v0.14.2, v0.14.3 Jul 18, 2022
@codecov

codecov bot commented May 19, 2023

Codecov Report

Merging #1024 (a0e6f1c) into main (8092f4d) will decrease coverage by 0.29%.
The diff coverage is 81.05%.

@@            Coverage Diff             @@
##             main    #1024      +/-   ##
==========================================
- Coverage   84.52%   84.24%   -0.29%     
==========================================
  Files          35       35              
  Lines        8369     8643     +274     
==========================================
+ Hits         7074     7281     +207     
- Misses       1295     1362      +67     
Impacted Files                               Coverage Δ
datashader/data_libraries/pandas.py          100.00% <ø> (ø)
datashader/tiles.py                          58.30% <45.54%> (-11.21%) ⬇️
datashader/utils.py                          80.09% <85.00%> (+0.84%) ⬆️
datashader/reductions.py                     84.68% <93.15%> (+1.56%) ⬆️
datashader/compiler.py                       91.07% <100.00%> (+0.50%) ⬆️
datashader/data_libraries/dask.py            95.23% <100.00%> (+0.07%) ⬆️
datashader/data_libraries/dask_xarray.py     98.95% <100.00%> (ø)

