Consider not lowercasing column names #229
-
We use bigquery, which supports column names in mixed case though doesn't allow names which differ only in case: https://cloud.google.com/bigquery/docs/schemas#column_names
When we started using bigquery, we settle on (perhaps misguided) approach of using camelCase for column names. Now it is too late to fix it as that convention is used in hundreds of column names. It works mostly fine - as bigquery persists the case of column name. Currently sqlfmt lowercases column names. An alternative way to think is that column names are like variable names and it is responsibility of author to get it right - so sqlfmt should just leave it as is. If it is really common for sqlfmt users to use inconsistent case, and so it is a desirable feature to convert it all to lowercase - perhaps considering an option to leave column names alone. |
Beta Was this translation helpful? Give feedback.
Replies: 10 comments 1 reply
-
You can disable formatting on those items
We implemented sqlfmt for this very reason, a mix of CamelCase, UPPERCASE and lowercase users, amongst other styling differences. |
Beta Was this translation helpful? Give feedback.
-
Maybe my description wasn't clear - but disabling formatting doesn't work - as we use camelCase column names everywhere. Different databases has different notion of case sensitivity for column names: For bigquery - it preserves case in column names. As bigquery preserves case, it would be nice if sqlfmt doesn't insist on making it lowercase. It is somewhat similar to the other bug #193. It is totally fine to be opinionated and choose a single style for formatting and sql keywords - but insisting on column names be lowercase makes it unusable for anyone working with dbs where column name case is preserved and have chosen to use camelCase column names. |
Beta Was this translation helpful? Give feedback.
-
Thanks for opening an issue, @amardeep, and thanks for chiming in, Randy. A few comments from me to explain where we are, how we got here, and what we can do about it:
Which is all a long way of saying, once #193 is fixed, you can run your project with If anyone reading this wants to propose and contribute another solution, I'm open to it, but I don't think I'll be building and maintaining this myself. Going to leave this issue open for further discussion, though. |
Beta Was this translation helpful? Give feedback.
-
Thanks for understanding Ted. You are right - that bigquery column names are not case sensitive, and so the query produced by sqlfmt is not invalid. And going with snake case would have been a better option. Though we can't change that now - so stuck with camelCase and those column names are much harder to read when all lowercased. I understand that properly implementing case preservation will make the code much more comples. Clickhouse dialect option will definitely help. Sqlfmt is mostly attractive for us for its consistent formatting especially with jinja templates and dbt. Without dbt/jinja, jetbrains datastudio did a good enough job, but it fails with jinja. Changing case is not actually that useful for our use case - if there was a switch to not do any case changes - that would work too - as datastudio or other tools can handle case changes when needed. I will try with clickhouse dialect once that is released. |
Beta Was this translation helpful? Give feedback.
-
FYI BigQuery JSON access is indeed case-sensitive, see here https://cloud.google.com/bigquery/docs/reference/standard-sql/json-data ie. if your JSON object contains a property named |
Beta Was this translation helpful? Give feedback.
-
@nfcampos interesting. Can you quote JSON property names using the field access operator, like |
Beta Was this translation helpful? Give feedback.
-
yea that's right, all 4 of these are equivalent ( |
Beta Was this translation helpful? Give feedback.
-
FYI this recently broke one of our data pipelines silently (Snowflake table with some nested fields like |
Beta Was this translation helpful? Give feedback.
-
I'm having the same issue. This just broke a lot of my files that used json lateral flatton functions, which are case sensitive |
Beta Was this translation helpful? Give feedback.
-
thanks for the option of disabling this behavior with the Clickhouse dialect. I also got bitten by JSON flattening in the context of snowflake. Can I suggest calling the flag eg. |
Beta Was this translation helpful? Give feedback.
Thanks for opening an issue, @amardeep, and thanks for chiming in, Randy.
A few comments from me to explain where we are, how we got here, and what we can do about it:
snake_case
for all keywords and identifiers, and I believe this preference is shared by most data people I know. But I realize that naming and style conventions aren't always under our users' control, and this preference ideally would not create a barrier to adopting sqlfmt (I think generally, line breaks and indentation are a bigger benefit than consistent capitalization)snake_case
, you can (and should) quote your identifiers. sqlfmt will never change anything abo…