Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexing on table with Geometry #559

Open
arya-hemanshu opened this issue May 1, 2018 · 9 comments
Open

Indexing on table with Geometry #559

arya-hemanshu opened this issue May 1, 2018 · 9 comments

Comments

@arya-hemanshu
Copy link
Contributor

Description

In subject table the column shape is not indexed, indexing the shape column would significantly improve the performance of the tool.

Have gone through the documentation of creating an index on spatial column and have tried creating it, however found no performance improvement.

Upon digging a little more in the documentation found out that to spatially index a table the table should only store a single type of geometry e.g polygons and not any other geometry. There should be different tables for storing different type of geometries, e.g points, lines and polygons, then only indexes can be effective.

However, currently DC stores all type of geometries in a single subject table. It requires reviewing and think of a different approach if required.

Error log

None

@Anafi
Copy link

Anafi commented May 1, 2018

What if you convert every geometry to a geometry collection consisting of a single geometry (ST_Collect)? Does is indexing work with geometry collections?

@arya-hemanshu
Copy link
Contributor Author

@Anafi thank you for the suggestion, i will have a look at it, however i think spatial indexes only work with polygons, points and lines as while indexing postgis creates a bounding box around every geography present in the table. I will update this issue with my findings.

@Anafi
Copy link

Anafi commented May 18, 2018

@arya-hemanshu did the spatial index on Geometry Collection type work?

@arya-hemanshu
Copy link
Contributor Author

@Anafi i am not sure, as the official documentation of postgis indexing, doesn't have any examples of indexing on GeometryCollection and moreover i couldn't find any mention in the documentation that it works on GeometryCollection. Only, multipolygons, polygons, line and point examples are available in their official documentation

@Anafi
Copy link

Anafi commented May 22, 2018

When testing spatial intersection between 1,000 features of a table, having a spatial index did not show any major speed performance (1.4 sec. without spatial index, 1.3 sec with spatial index). However when tested on a table of 1,000,000 features it does make a difference (see below).

-- without spatial index
-- (it did not finish after 15 min..)
SELECT a.geom, b.geom
FROM  p0000.mytest AS a,  p0000.mytest AS b 
WHERE ST_Intersects(ST_CollectionHomogenize(a.geom), ST_CollectionHomogenize(b.geom)) AND a.id<>b.id;

CREATE INDEX mytest_gix ON p0000.mytest USING GIST (geom);

-- with spatial index operator 
-- 7.5 sec 
SELECT a.geom, b.geom
FROM  p0000.mytest AS a,  p0000.mytest AS b 
WHERE ST_CollectionHomogenize(a.geom)&&ST_CollectionHomogenize(b.geom) AND ST_Intersects(ST_CollectionHomogenize(a.geom), ST_CollectionHomogenize(b.geom)) AND a.id<>b.id;

@arya-hemanshu
Copy link
Contributor Author

@Anafi I will put an index in the code base, just had one query. These queries looks different to me, are the same? or one produces a different result then other?

@Anafi
Copy link

Anafi commented Jun 7, 2018

@arya-hemanshu The queries are different indeed. The difference is that the second one is not only using the ST_Intersects but also the && operator. That operator is used to first filter the features that their bounding boxes intersect and then ST_Intersects filters the features that their actual geometries intersect. See here

@arya-hemanshu
Copy link
Contributor Author

@Anafi is it possible to use the same queries with and without the spatial index and then compare the results?

@Anafi
Copy link

Anafi commented Jul 6, 2018

@arya-hemanshu running the second query on the table without the index takes as long as the first query (more than 15 min.) as the && operator has not any effect without the spatial index.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants