Add a tutorial framework + more tutorials
asinghvi17 committed Oct 9, 2024
1 parent 2d73c2b commit 120da8d
Showing 8 changed files with 132 additions and 6 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -3,4 +3,6 @@
/docs/build/

/test/ref.parquet/
/test/real_zarray.zarr/
/test/real_zarray.zarr/

*/.CondaPkg/
8 changes: 8 additions & 0 deletions docs/CondaPkg.toml
@@ -0,0 +1,8 @@
[deps]
# netcdf4 = ""
virtualizarr = ""
xarray = ""
zarr = ""
certifi = ""
s3fs = ""
kerchunk = ""
3 changes: 3 additions & 0 deletions docs/Project.toml
@@ -1,14 +1,17 @@
[deps]
AWSS3 = "1c724243-ef5b-51ab-93f4-b0a88ac62a95"
ArchGDAL = "c9ce4bd3-c3d5-55b8-8973-c0e20141b8c3"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
CodecZstd = "6b39b394-51ab-5f42-8807-6242bab2b4c2"
CondaPkg = "992eb4ea-22a4-4c89-a5bb-47a3300528ab"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterVitepress = "4710194d-e776-4893-9690-8d956a29c365"
Downloads = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
Kerchunk = "12c09fd5-fe6a-4e79-8f42-b31f49215243"
Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
NCDatasets = "85f8d34a-cbdd-5861-8df4-14fed0d494ab"
PythonCall = "6099a3de-0909-46bc-b1f4-468b9a2dfc0d"
RasterDataSources = "3cb90ccd-e1b6-4867-9617-4276c8b2ca36"
Rasters = "a3a2b9e3-a471-40c9-b274-f788e487c689"
YAXArrays = "c21b50f5-aa40-41ea-b809-c0f5e47bfa5c"
15 changes: 15 additions & 0 deletions docs/make.jl
@@ -1,6 +1,12 @@
# The first thing to do is to make sure Python's dynamic libraries
# are loaded first.
using CondaPkg, PythonCall
PythonCall.pyimport("aiohttp")

using Kerchunk
using Documenter, DocumenterVitepress


DocMeta.setdocmeta!(Kerchunk, :DocTestSetup, :(using Kerchunk); recursive=true)

using Literate
@@ -69,6 +75,12 @@ withenv("JULIA_DEBUG" => "Literate") do # allow Literate debug output to escape
# TODO: We should probably fix the above in `process_literate_recursive!`.
end

# Now, process the tutorials
Literate.markdown(
joinpath("tutorials", "solar_dynamics_observatory.jl"), "tutorials";
flavor = Literate.DocumenterFlavor(),
)

makedocs(;
modules=[Kerchunk],
authors="Anshul Singhvi <anshulsinghvi@gmail.com> and contributors",
@@ -77,6 +89,9 @@ makedocs(;
pages=[
"Home" => "index.md",
"What is Kerchunk?" => "what_the_heck.md",
"Tutorials" => [
"tutorials/solar_dynamics_observatory.md",
],
"API" => "api.md",
"Source code" => literate_pages,
],
Binary file added docs/src/assets/favicon.png
47 changes: 42 additions & 5 deletions docs/src/index.md
@@ -4,11 +4,48 @@ CurrentModule = Kerchunk

# Kerchunk

Documentation for [Kerchunk](https://github.com/JuliaIO/Kerchunk.jl).
Kerchunk.jl is a Julia package that enables loading [Kerchunk reference catalogs](https://fsspec.github.io/kerchunk/) as [Zarr.jl](https://github.com/JuliaIO/Zarr.jl) arrays.
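
For orientation, a reference catalog maps Zarr keys either to inline content or to a byte range inside another file. The sketch below is a minimal, hand-written illustration of that lookup in plain Julia. The `refs` dictionary, the temporary file, and the `resolve` helper are all hypothetical and are not part of Kerchunk.jl's API; real catalogs are JSON files generated by the Python `kerchunk` package.

```julia
# Write a small stand-in "data file" so the example is self-contained.
path, io = mktemp()
write(io, "HEADERchunkbytesFOOTER")
close(io)

# Hypothetical catalog contents, written by hand for illustration only.
refs = Dict{String, Any}(
    ".zgroup" => "{\"zarr_format\": 2}",  # inline metadata stored as a string
    "data/0"  => Any[path, 6, 10],         # [url, offset, length] into another file
)

# `resolve` is an illustrative helper, not a Kerchunk.jl function.
function resolve(refs, key)
    ref = refs[key]
    # Inline references carry their content directly.
    ref isa AbstractString && return Vector{UInt8}(codeunits(ref))
    # Otherwise, read `len` bytes starting at the zero-based byte `offset`.
    url, offset, len = ref
    open(url) do f
        seek(f, offset)
        read(f, len)
    end
end

chunk = String(resolve(refs, "data/0"))
```

Here `chunk` holds only the ten bytes addressed by the reference, which is why a catalog can expose chunks of an existing file without copying the file itself.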

```@index
```

```@docs
ReferenceStore
## Quick start

Kerchunk.jl is simply a storage backend for [`Zarr.jl`](https://github.com/JuliaIO/Zarr.jl). Zarr.jl integrates with the more fully featured packages [`Rasters.jl`](https://github.com/rafaqz/Rasters.jl) and [`YAXArrays.jl`](https://github.com/JuliaDataCubes/YAXArrays.jl), which are the packages you will want to use to interact with Kerchunk data.

```julia
using Kerchunk, Zarr

za = Zarr.zopen("reference://path/to/kerchunk/catalog.json")
# and treat it like any other Zarr array!
# You can even wrap it in YAXArrays.jl to get DimensionalData.jl accessors:
using YAXArrays
YAXArrays.open_dataset(za)
# or open it as a Rasters.RasterStack:
using Rasters
Rasters.RasterStack(
"reference://catalog.json",
source = Rasters.Zarrsource(), # source must be explicit
lazy = true, # need to include this
)
```

It's most useful to open Kerchunk datasets as either RasterStacks or YAXArrays datasets, since both of those packages attach named dimensions to the data.

## Background

[`kerchunk`](https://fsspec.github.io/kerchunk/) is a Python package that generates the reference catalogs.

## Limitations
- No support for `gen` references with templates.
- No support for complex Jinja2 templates in `refs`. (Although Kerchunk hardly supports this either...)

## Acknowledgements

This effort was funded by the NASA MEaSUREs program in contribution to the Inter-mission Time Series of Land Ice Velocity and Elevation (ITS_LIVE) project (https://its-live.jpl.nasa.gov/).

## Alternatives and related packages

- You can always use Python's `xarray` directly via PythonCall.jl
- [FSSpec.jl](https://github.com/asinghvi17/FSSpec.jl) is an alternative storage backend for Zarr.jl that wraps the same [`fsspec`](https://github.com/fsspec/filesystem_spec) that `xarray` uses under the hood.

This package is of course built on top of [Zarr.jl](https://github.com/JuliaIO/Zarr.jl), which is a pure-Julia Zarr array library.
[YAXArrays.jl](https://github.com/JuliaDataCubes/YAXArrays.jl) is a Julia package that can wrap Zarr arrays in a DimensionalData-compatible interface.
61 changes: 61 additions & 0 deletions docs/src/tutorials/creating_kerchunks.jl
@@ -0,0 +1,61 @@
#=
# Creating Kerchunk catalogs
Kerchunk.jl is only a Kerchunk reader, meaning that if you want to create Kerchunk catalogs, you need to use the Python `kerchunk` package.
The easiest way to do this in Julia is to use the [CondaPkg.jl](https://github.com/JuliaPy/CondaPkg.jl) package to install the `kerchunk` package into a Conda environment and then use PythonCall.jl to call the `kerchunk` package. This ensures reproducibility, since you can pin the versions in the generated CondaPkg.toml as well, and package management via CondaPkg has a very similar interface to `Pkg.jl`.
=#


#=
## Setting up the Conda environment
```julia
using CondaPkg
CondaPkg.add("python")
CondaPkg.add("kerchunk")
```
=#

#=
## Creating a Kerchunk catalog
=#

using CondaPkg, PythonCall

# There are two approaches to this - either call Python explicitly via the command line, or call Python via PythonCall.jl.
# Calling Python via Julia is a nicer interface, but you will very quickly run into binary incompatibility issues.

CondaPkg.withenv() do
run(```
$(CondaPkg.which("python")) -c "
import kerchunk
import kerchunk.hdf as hdf
# do something
"
```)
end

#=
## Using PythonCall.jl
```julia
```
=#

# Let's load this using Kerchunk.jl, and see what we get!

using Zarr, Kerchunk
z = Zarr.zopen("reference://catalog.json")

using Rasters, ZarrDatasets
r = Raster(z)
Empty file added docs/src/tutorials/mur_sst.jl
