[SEDONA-680] Remove rasterio from mandatory dependencies (#1692)
jiayuasu authored Nov 23, 2024
1 parent 100d419 commit b66e768
Showing 4 changed files with 49 additions and 10 deletions.
19 changes: 15 additions & 4 deletions .github/workflows/python.yml
@@ -143,14 +143,25 @@ jobs:
       - env:
           PYTHON_VERSION: ${{ matrix.python }}
         run: find spark-shaded/target -name sedona-*.jar -exec cp {} ${VENV_PATH}/lib/python${PYTHON_VERSION}/site-packages/pyspark/jars/ \;
-      - env:
+      - name: Run tests
+        env:
           PYTHON_VERSION: ${{ matrix.python }}
         run: |
           export SPARK_HOME=${VENV_PATH}/lib/python${PYTHON_VERSION}/site-packages/pyspark
           cd python
           source ${VENV_PATH}/bin/activate
-          pytest tests
-      - env:
+          pytest -v tests
+      - name: Run basic tests without rasterio
+        env:
+          PYTHON_VERSION: ${{ matrix.python }}
+        run: |
+          export SPARK_HOME=${VENV_PATH}/lib/python${PYTHON_VERSION}/site-packages/pyspark
+          cd python
+          source ${VENV_PATH}/bin/activate
+          pip uninstall -y rasterio
+          pytest -v tests/core/test_rdd.py tests/sql/test_dataframe_api.py
+      - name: Run Spark Connect tests
+        env:
           PYTHON_VERSION: ${{ matrix.python }}
         run: |
           if [ ! -f "${VENV_PATH}/lib/python${PYTHON_VERSION}/site-packages/pyspark/sbin/start-connect-server.sh" ]
@@ -165,4 +176,4 @@ jobs:
           cd python
           source ${VENV_PATH}/bin/activate
           pip install "pyspark[connect]==${SPARK_VERSION}"
-          pytest tests/sql/test_dataframe_api.py
+          pytest -v tests/sql/test_dataframe_api.py
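
The new workflow step uninstalls rasterio and reruns a subset of the tests, so the package has to import cleanly without it. As a minimal sketch of how code or tests might detect that situation, the snippet below probes for a module without importing it. The helper name `has_optional_dependency` and the `RASTERIO_AVAILABLE` flag are hypothetical and are not part of Sedona or of this commit.

```python
import importlib.util


def has_optional_dependency(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path.

    Hypothetical helper for illustration; it is not part of Sedona.
    """
    return importlib.util.find_spec(module_name) is not None


# A raster test module could consult a flag like this before exercising
# rasterio-backed code, and skip itself when the flag is False.
RASTERIO_AVAILABLE = has_optional_dependency("rasterio")
```

`importlib.util.find_spec` only locates the module, so the check stays cheap even when rasterio is installed.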
3 changes: 3 additions & 0 deletions docs/tutorial/raster.md
@@ -615,6 +615,9 @@ raster.as_numpy_masked() # numpy array with nodata values masked as nan
 If you want to work with the raster data using `rasterio`, you can retrieve a `rasterio.DatasetReader` object using the
 `as_rasterio` method.
 
+!!!note
+    You need to have the `rasterio` package installed (version >= 1.2.10) to use this method. You can install it using `pip install rasterio`.
+
 ```python
 ds = raster.as_rasterio()  # rasterio.DatasetReader object
 # Work with the raster using rasterio
27 changes: 23 additions & 4 deletions python/sedona/sql/types.py
@@ -17,8 +17,21 @@
 
 from pyspark.sql.types import BinaryType, UserDefinedType
 
-from ..raster import raster_serde
-from ..raster.sedona_raster import SedonaRaster
+# Only support RasterType when rasterio is installed
+try:
+    import rasterio
+except ImportError:
+    rasterio = None
+
+if rasterio is not None:
+    from ..raster import raster_serde
+    from ..raster.sedona_raster import SedonaRaster
+else:
+    # We'll skip RasterType UDT registration and raise error when deserializing
+    # RasterUDT objects if rasterio is not installed
+    raster_serde = None
+    SedonaRaster = None
+
 from ..utils import geometry_serde
 
 
@@ -57,7 +70,12 @@ def serialize(self, obj):
         raise NotImplementedError("RasterType.serialize is not implemented yet")
 
     def deserialize(self, datum):
-        return raster_serde.deserialize(datum)
+        if raster_serde is not None:
+            return raster_serde.deserialize(datum)
+        else:
+            raise NotImplementedError(
+                "rasterio is not installed. Please install it to support RasterType deserialization"
+            )
 
     @classmethod
     def module(cls):
@@ -71,4 +89,5 @@ def scalaUDT(cls):
         return "org.apache.spark.sql.sedona_sql.UDT.RasterUDT"
 
 
-SedonaRaster.__UDT__ = RasterType()
+if SedonaRaster is not None:
+    SedonaRaster.__UDT__ = RasterType()
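
The change to `types.py` follows a common optional-dependency pattern: try the import once at module load, fall back to `None`, and defer the failure until the feature is actually used. The sketch below shows the same idea in a self-contained form. `optional_import` and `LazyDeserializer` are hypothetical names chosen for illustration, and the standard-library `json` module stands in for the optional backend so the example runs anywhere.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


class LazyDeserializer:
    """Hypothetical stand-in for a type whose deserialize needs an optional backend."""

    def __init__(self, backend_name):
        self.backend_name = backend_name
        # Importing the owning module never fails; only deserialize() can.
        self.backend = optional_import(backend_name)

    def deserialize(self, datum):
        if self.backend is None:
            raise NotImplementedError(
                f"{self.backend_name} is not installed. "
                "Please install it to support deserialization"
            )
        # Assumes the backend exposes loads(), as json does.
        return self.backend.loads(datum)


# json is always available, so this path succeeds.
print(LazyDeserializer("json").deserialize('{"a": 1}'))

# A missing backend only fails when deserialize() is called.
try:
    LazyDeserializer("surely_not_a_real_module_xyz123").deserialize("{}")
except NotImplementedError as exc:
    print(exc)
```

The trade-off is that a missing dependency surfaces later, at first use, so the error message needs to name the package to install, as the commit's message does.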
10 changes: 8 additions & 2 deletions python/setup.py
@@ -58,12 +58,18 @@
     long_description=long_description,
     long_description_content_type="text/markdown",
     python_requires=">=3.6",
-    install_requires=["attrs", "shapely>=1.7.0", "rasterio>=1.2.10"],
+    install_requires=["attrs", "shapely>=1.7.0"],
     extras_require={
         "spark": ["pyspark>=2.3.0"],
         "pydeck-map": ["geopandas", "pydeck==0.8.0"],
         "kepler-map": ["geopandas", "keplergl==0.3.2"],
-        "all": ["pyspark>=2.3.0", "geopandas", "pydeck==0.8.0", "keplergl==0.3.2"],
+        "all": [
+            "pyspark>=2.3.0",
+            "geopandas",
+            "pydeck==0.8.0",
+            "keplergl==0.3.2",
+            "rasterio>=1.2.10",
+        ],
     },
     project_urls={
         "Documentation": "https://sedona.apache.org",
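
With rasterio moved from `install_requires` into the `all` extra, an environment may have no rasterio at all, or one older than the `>=1.2.10` floor. The following is a rough sketch of a runtime check, not something this commit adds. The function names are hypothetical, the version parsing is deliberately simplistic (it stops at the first non-numeric component), and `importlib.metadata` requires Python 3.8 or later, which is newer than the `python_requires=">=3.6"` declared above.

```python
from importlib import metadata


def parse_release(version):
    """Turn '1.2.10' into (1, 2, 10), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def rasterio_meets_minimum(minimum="1.2.10"):
    """Return True only if rasterio is installed and at least `minimum`."""
    try:
        installed = metadata.version("rasterio")
    except metadata.PackageNotFoundError:
        return False
    return parse_release(installed) >= parse_release(minimum)


# Comparing tuples of ints avoids the string-comparison trap,
# where "1.2.9" sorts after "1.2.10".
print(parse_release("1.2.9") < parse_release("1.2.10"))
print(rasterio_meets_minimum())
```

For pre-release or otherwise unusual version strings, `packaging.version.Version` is a more complete comparison than this tuple parsing, at the cost of an extra dependency.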
