Delete duplicate geometries: Incorrect result when the output is generated in the same geopackage containing the source layer. #60023

Open
2 tasks done
ludovico85 opened this issue Dec 27, 2024 · 2 comments
Labels
Bug Either a bug report, or a bug fix. Let's hope for the latter! Data Provider Related to specific vector, raster or mesh data providers Processing Relating to QGIS Processing framework or individual Processing algorithms Regression Something which used to work, but doesn't anymore

Comments

@ludovico85

What is the bug or the crash?

Hi everyone,
I have a geopackage layer where all the geometries are duplicated (344,918 geometries).
The "Delete duplicate geometries" algorithm behaves strangely depending on the output:

  • Result saved as a temporary layer: 172,459 geometries
  • Result saved as a shapefile: 172,459 geometries
  • Result saved in a new geopackage: 172,459 geometries
  • Result saved in the source layer's geopackage: 167,281 geometries

Steps to reproduce the issue

  1. Load the layer included in the geopackage
  2. Run the Delete duplicate geometries algorithm
  3. Save the output in the same geopackage as the source layer
  4. The number of returned geometries is wrong
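
The feature counts reported above can be cross-checked outside QGIS, since a GeoPackage is an SQLite database. The following is only a minimal sketch: the function name, the `geom` column name, and the use of byte-identical geometry blobs as a stand-in for "duplicate geometries" are assumptions for illustration, not what the Processing algorithm does internally.

```python
import sqlite3


def count_rows_and_distinct_geoms(gpkg_path, table, geom_column="geom"):
    """Return (total rows, distinct geometry blobs) for one GeoPackage table.

    Assumption: duplicated geometries are stored as byte-identical blobs.
    This is stricter than a geometric equality test.
    """
    # Identifiers cannot be bound as SQL parameters, so quote them by hand.
    quoted_table = '"' + table.replace('"', '""') + '"'
    quoted_geom = '"' + geom_column.replace('"', '""') + '"'
    con = sqlite3.connect(gpkg_path)
    try:
        total, distinct = con.execute(
            f"SELECT COUNT(*), COUNT(DISTINCT {quoted_geom}) FROM {quoted_table}"
        ).fetchone()
    finally:
        con.close()
    return total, distinct
```

Running this on the source table and on each output table makes the discrepancy visible without relying on the QGIS log: for the layer described here, a correct output should hold 172,459 rows (half of 344,918).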

Versions

QGIS version 3.34.14-Prizren
QGIS code revision 0cdaf6d (https://github.com/qgis/QGIS/commit/0cdaf6d9)
Qt version 5.15.13
Python version 3.12.8
GDAL/OGR version 3.9.3
PROJ version 9.5.0
EPSG Registry database version v11.016 (2024-08-31)
GEOS version 3.13.0-CAPI-1.19.0
SQLite version 3.46.1
PDAL version 2.8.1
PostgreSQL client version 16.2
SpatiaLite version 5.1.0
QWT version 6.3.0
QScintilla2 version 2.14.1
OS version Windows 11 Version 2009

Active Python plugins
Cxf_in 9.2
FreehandRasterGeoreferencer 0.8.3
GroupStats 2.2.7
LAStools 2.1.1
latlontools 3.6.20
lizmap 3.14.3
multi_filter 1.0
profile-manager 0.31
project_report 1.2
qfieldsync v4.9.1
QPackage 1.5
QuickOSM 2.2.3
raster_tracer 0.3.3
redLayer 2.2
SelectByRelationship 0.3.3
ViewshedAnalysis 1.9
db_manager 0.1.20
processing 2.12.99

Supported QGIS version

  • I'm running a supported QGIS version according to the roadmap.

New profile

Additional context

No response

@ludovico85 ludovico85 added the Bug Either a bug report, or a bug fix. Let's hope for the latter! label Dec 27, 2024
@pigreco
Contributor

pigreco commented Dec 27, 2024

I confirm

@pigreco pigreco added the Regression Something which used to work, but doesn't anymore label Dec 27, 2024
@agiudiceandrea
Contributor

agiudiceandrea commented Dec 27, 2024

I can also confirm the issue running QGIS LTR 3.34.14 and QGIS 3.40.2 (both with GDAL/OGR 3.9.2) and QGIS 3.41.0-Master (with GDAL/OGR 3.11.0-dev) on Windows 10 from OSGeo4W.

Neither the processing log nor the Log Messages panel reports any error, so users cannot know that the issue occurred and are misled into thinking the output layer was correctly created with all the non-duplicated features, which is not the case.

The issue also occurs using a layer containing randomly generated duplicated points (1M features): it looks like it occurs when the layer contains a large number of features, but not when it contains only a limited number.

The issue didn't occur running QGIS 3.22.0 (with GDAL/OGR 3.4.0) and previous versions.

The issue doesn't occur if the OGR_SQLITE_JOURNAL=WAL environment variable is set.
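
One way to try this workaround is to start QGIS from a process whose environment already contains the variable. The sketch below only builds such an environment; the helper name is hypothetical, and the executable name in the commented launch line is an assumption that depends on the installation.

```python
import os


def env_with_wal_journal(base_env=None):
    """Return a copy of the environment with OGR_SQLITE_JOURNAL set to WAL."""
    # Copy so the calling process's own environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["OGR_SQLITE_JOURNAL"] = "WAL"
    return env


# Hypothetical launch; "qgis" stands for whatever starts QGIS on your system:
# import subprocess
# subprocess.Popen(["qgis"], env=env_with_wal_journal())
```

Setting the variable before QGIS starts avoids any doubt about whether GDAL reads it before the GeoPackage connection is opened.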

The issue also doesn't occur when the processing algorithm is executed again shortly after the first incorrect run: on the first run the .gpkg-shm and .gpkg-wal files are created only when the algorithm execution reaches 99%, whereas on the second and subsequent runs they are created right at the start of the algorithm's execution.
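
The journal state described here can be inspected with Python's standard `sqlite3` module. This is a rough sketch for observation only; the function name is hypothetical, and it assumes the sidecar files sit next to the GeoPackage with `-wal` and `-shm` appended to its file name, as the file names mentioned above suggest.

```python
import os
import sqlite3


def journal_state(gpkg_path):
    """Report the SQLite journal mode and whether WAL sidecar files exist."""
    # Check the sidecar files first: opening a connection to a WAL database
    # can itself create them.
    state = {
        suffix: os.path.exists(gpkg_path + suffix) for suffix in ("-wal", "-shm")
    }
    con = sqlite3.connect(gpkg_path)
    try:
        state["journal_mode"] = con.execute("PRAGMA journal_mode").fetchone()[0]
    finally:
        con.close()
    return state
```

Calling it before the first run and again before a later run should show whether the database has already been switched to `wal` mode, which would be consistent with the sidecar files appearing at the start of the later runs.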

@rouault, I guess PRs #47098 (implemented since QGIS 3.22.6 and 3.24.0) and OSGeo/gdal#5207 (implemented since GDAL/OGR 3.4.2 and 3.5.0) may have triggered this issue.

@agiudiceandrea agiudiceandrea added Processing Relating to QGIS Processing framework or individual Processing algorithms Data Provider Related to specific vector, raster or mesh data providers labels Dec 27, 2024