ORDER BY query performance difference between ASC and DESC #3196
Replies: 8 comments
-
Peculiar - off the top of my head I can't think of any reason why the sorting order would make that much of a difference. It needs to sort the complete result in both cases before it can apply a limit, anyway. Does the same thing happen if you only execute the subquery:
or simplified even further:
|
Beta Was this translation helpful? Give feedback.
-
Yes all of these perform well. The 'conflict' appears to be with the last
graph pattern, so commenting it out for clarity, the following performs
well:
select *
WHERE {
{ SELECT *
{
{ SELECT DISTINCT
?FileFormat_s ?FileFormatpuidIndex_o
WHERE {
?FileFormat_s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?class .
OPTIONAL{?FileFormat_s <
http://data.nationalarchives.gov.uk/formatregistry/def/registry.puidIndex>
?FileFormatpuidIndex_o}
VALUES(?class){(<
http://data.nationalarchives.gov.uk/formatregistry/def/FileFormat>)}
}
}
} order by ASC(?FileFormatpuidIndex_o) limit 5
}
# {
# VALUES(?FileFormat_p){(<
http://data.nationalarchives.gov.uk/formatregistry/def/fileFormat.Version
)(<http://data.nationalarchives.gov.uk/formatregistry/def/file_formats.ID
)(<http://data.nationalarchives.gov.uk/formatregistry/def/registry.Note>)(<
http://www.w3.org/1999/02/22-rdf-syntax-ns#type>)(<
http://www.w3.org/2000/01/rdf-schema#comment>)(<
http://www.w3.org/2000/01/rdf-schema#label>)}
?FileFormat_s ?FileFormat_p ?FileFormat_o .
# }
}
Just in case it was data related, I also chose different values in the last
graph patterns VALUE statement but still had the same effect.
*Peter J. Lawrence*
*inova8*
*Providing answers for users' information questions*
*Mobile:* +44 7546 095977 | +1 330 631 3772 | *Phone:* +44 1298 872375 |
*Skype:* PeterJLawrence
*Email:* peter.lawrence@inova8.com | peter.lawrence@nexifysolutions.com
<peter.lawrence@inova8.com> | *Web:* www.inova8.com
*LinkedIn: *
*http://www.linkedin.com/in/peterjohnlawrence
<http://www.linkedin.com/in/peterjohnlawrence>*
…On Thu, 26 Mar 2020 at 00:43, Jeen Broekstra ***@***.***> wrote:
Peculiar - off the top of my head I can't think of any reason why the
sorting order would make that much of a difference. It needs to sort the
complete result in both cases before it can apply a limit, anyway.
Does the same thing happen if you only execute the subquery:
SELECT *
{
{ SELECT DISTINCT ?FileFormat_s ?FileFormatpuidIndex_o
WHERE {
?FileFormat_s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?class .
OPTIONAL{?FileFormat_s <http://data.nationalarchives.gov.uk/formatregistry/def/registry.puidIndex> ?FileFormatpuidIndex_o}
VALUES(?class){(<http://data.nationalarchives.gov.uk/formatregistry/def/FileFormat>)}
}
}
} order by ASC(?FileFormatpuidIndex_o ) limit 5
or simplified even further:
SELECT DISTINCT ?FileFormat_s ?FileFormatpuidIndex_o
WHERE {
?FileFormat_s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?class .
OPTIONAL{ ?FileFormat_s <http://data.nationalarchives.gov.uk/formatregistry/def/registry.puidIndex> ?FileFormatpuidIndex_o }
VALUES(?class){(<http://data.nationalarchives.gov.uk/formatregistry/def/FileFormat>)}
}
order by ASC(?FileFormatpuidIndex_o ) limit 5
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#2041 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAMJVMFLKF3DKGTLWGUFHNTRJKQKRANCNFSM4LTJ4VCQ>
.
|
Beta Was this translation helpful? Give feedback.
-
Only thing I can think of is that the query planner changes the execution order unfavourably, though not sure why it'd do that for ASC but not DESC. I'll try and take a closer look. |
Beta Was this translation helpful? Give feedback.
-
@peterjohnlawrence apologies for not getting back to this sooner, I'm afraid it rather slipped my mind. We've just released RDF4J 3.2, with the new query explain feature. Could you try (in a test setup to be safe) to upgrade to this release and then try out something like this on your repository:
...and then do the same for the DESC version - this might gives us some clues as to why there is such a difference in performance. |
Beta Was this translation helpful? Give feedback.
-
Upgraded but cannot run the test java as the 3.2.0 release for HTTPTupleQuery only raises an UnsupportedOperationException exception for HTTPTupleQuery.explain However I have noticed something with the 3.2.0 repository query. If I remove the predicate IRI http://www.w3.org/2000/01/rdf-schema#comment from the VALUES list both ascending and descending queries perform well via Workbench (sub-second). If I include this predicate in the list, then the DESC behaves well but the ASC never (or limits of my patience:-)) returns. The object values associated with this predicate are long strings often with non-ascii characters. To test this I included an OFFSET and sure enough I start getting an immediate response. From this I conclude it is data-related. If I can get the Explain work I will be able to test out what is happening and report back. |
Beta Was this translation helpful? Give feedback.
-
Checking the 'data' that is causing the issue with RDF4J-Workbench, it is a UTF-8 string of 1196 characters but without any non-ASCII characters. Is there a string limit within Workbench? If I test the same query within a simple java test then there is no issue and all results are returned correctly. |
Beta Was this translation helpful? Give feedback.
-
Following up here quickly. @peterjohnlawrence could you try the latest version of RDF4J, we have drastically improved sorting, which could potentially impact your issue. |
Beta Was this translation helpful? Give feedback.
-
Also. I came across some issues with combining order by with limit. I was under the impression that limit was applied first. |
Beta Was this translation helpful? Give feedback.
-
I have a relatively simple query but if the ORDER BY ASC is used it never returns, whilst the ORDER BY DESC is very quick.
The server is running RDF4J 3.0.0
The repository uses a native store + Lucene with indices spoc,pocs,cpso,cspo,oscp,ocsp
The query is as follows. Changing order by ASC tyo order by DESC allows results to be returned immediately.
Beta Was this translation helpful? Give feedback.
All reactions