Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formally define OHDSI-SQL #290

Open
ablack3 opened this issue Aug 5, 2022 · 4 comments
Open

Formally define OHDSI-SQL #290

ablack3 opened this issue Aug 5, 2022 · 4 comments
Labels
exploratory Changes that require some research first

Comments

@ablack3
Copy link
Collaborator

ablack3 commented Aug 5, 2022

SqlRender performs substitution on strings of text but does not guarantee that its output will actually run on the desired dbms. This is because it does not check that it's input is valid OHDSI-SQL.

What if we thought of OHDSI-SQL as a sql dialect and formally defined it's structure so that we could

  • Check that inputs are valid and can be translated
  • Know that we have covered all translation cases when adding a new backend
  • Potentially provide editor support for code completion and syntax checking using LSP

The definition would look something like
image

reference: https://duckdb.org/docs/sql/statements/select

@schuemie schuemie added the exploratory Changes that require some research first label Aug 5, 2022
@schuemie
Copy link
Member

schuemie commented Aug 5, 2022

This is somewhat related to #157, but I like the idea of being able to validate the input. I looked into SQL parsers in the past, but found them hard to implement in the SqlRender syntax (i.e. I couldn't find a good Java library).

@ablack3
Copy link
Collaborator Author

ablack3 commented Aug 5, 2022

Yea I think this is really the same issue as #157 but with a focus on a formal definition of OHDSI-SQL syntax that could be checked and parsed programmatically possibly with javacc.

Artem implemented duckdb sql translations but we don't really know if we have implemented all of OHDSI SQL because we don't really know the full domain of inputs to translate.

@schuemie
Copy link
Member

schuemie commented Aug 8, 2022

Yes, I tend to use this section as an informal definition of the scope of SqlRender.

@ablack3
Copy link
Collaborator Author

ablack3 commented Aug 18, 2022

@SofiaMp from Kheiron is interested in taking this on. My description of this task is that we want to formally specify the OHDSI-SQL language since it is a programming language.

I would also suggest we separately specify the inputs to render and translate.

render: {parameterized OHDSI SQL} x {list of name value pairs} -> {non parameterized OHDSI SQL}
translate: {non-parameterized OHDSI SQL} -> {sql server, bigquery, impala, ....}

{parameterized OHDSI SQL} can contain branching logic and parameters while {non-parameterized OHDSI SQL} should not.

We need to define the complete syntax and semantics of OHDSI-SQL. Most of the work will be in defining the syntax. The only semantics that need to be defined are the non-SQL elements like branching logic, parameters, and default value declarations which are all done in the SqlRender vignettes. We don't need define the semantics of SQL.

I'll leave it up to @SofiaMp about how to approach this but the result will be a document (a new vignette perhaps) that provides a complete description of the OHDSI-SQL syntax that could be used to implement a parser.

The section @schuemie wrote is a great start https://ohdsi.github.io/SqlRender/articles/UsingSqlRender.html#functions-and-structures-supported-by-translate. @SofiaMp - think about what is missing here for the complete syntax definition? For example what are the function signatures and valid argument values? Does DATEDIFF support the firstdayofweek argument described in the MSSQL documentation?

https://homepage.cs.uri.edu/faculty/hamel/courses/2018/fall2018/csc301/lecture-notes/csc301-ln002.pdf
https://en.wikipedia.org/wiki/Syntax_(programming_languages)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exploratory Changes that require some research first
Projects
None yet
Development

No branches or pull requests

2 participants