Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

in_ Operator does not work for column type NUMERIC #1126

Open
pforero opened this issue Oct 7, 2024 · 1 comment
Open

in_ Operator does not work for column type NUMERIC #1126

pforero opened this issue Oct 7, 2024 · 1 comment
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@pforero
Copy link

pforero commented Oct 7, 2024

The command IN UNNEST does not work for columns with type NUMERIC the same way it works with INT64 or FLOAT64.

The in_ operator on python-bigquery-sqlalchemy uses the IN UNNEST(@PARAM_1) syntax. Therefore it is unable to run an in_ operation for columns with the NUMERIC type.

Note: IN (@PARAM_1) works well with NUMERIC. And when compiling with compile_kwargs "literal_binds"=True it uses that syntax. But for normal compile it uses IN UNNEST.

Environment details

  • OS type and version: Ubuntu 20.04.6
  • Python version: 3.12.4
  • pip version: 24.2
  • sqlalchemy-bigquery version: 1.11.0

Steps to reproduce

  1. Create a sample BigQuery table with different types. Go to the the consoles, editor and run the code example.
  2. Run in the console different versions on the test query to see what works and doesn't work.
  3. Populate the table with some values.
  4. Run an SQLAlchemy query with the in_ operator on the numeric column and it will raise an error.

Code example

--Create table for step 1
CREATE TABLE my_dataset.test_table(
 numeric_col NUMERIC,
 int_col INT64,
 float_col FLOAT64
)
--Tests for syntax that works and doesn't work with numeric
SELECT
 COUNT(*)
FROM my_dataset.test_table`
WHERE
 numeric_col IN UNNEST([1]) -- Doesn't work
 int_col IN UNNEST([1]) -- Works
 numeric_col IN (1) -- Works
 numeric_col IN UNNEST([1.1]) -- Doesn't work
 float_col IN UNNEST([1.1]) -- Works
 numeric_col IN (1.1) -- Works
--Populate the table with some values
INSERT my_dataset.test_table (numeric_col , int_col , float_col)
VALUES(1, 1, 1.1), (2.2, 2, 2.2)
# Set-up the ORM of the table
from sqlalchemy import create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

class Base(DeclarativeBase):
    pass
class TestTable(Base):
    __tablename__ = "test_table"
    __table_args__ = {"schema": "my_dataset"}
    numeric_col: Mapped[float] = mapped_column(primary_key=True)
    int_col: Mapped[int]
    float_col: Mapped[float]

bq = create_engine("bigquery://")
conn = bq.connect()
session = Session(conn)

stmt_numeric = select(TestTable).where(TestTable.numeric_col.in_([1]))
stmt_int = select(TestTable).where(TestTable.int_col.in_([1]))
stmt_float = select(TestTable).where(TestTable.float_col.in_([1.1]))

session.execute(stmt_int) # Works
session.execute(stmt_float) # Works
session.execute(stmt_numeric) # Doesn't work

Seeing as the syntax

WHERE numeric_col IN (1)

actually works it is counter intuitive seeing it doesn't work with sqlalchame-bigquery.

Thanks!

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. label Oct 7, 2024
@leahecole leahecole assigned Linchin and unassigned leahecole Oct 9, 2024
@leahecole leahecole added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Oct 9, 2024
@r1b
Copy link

r1b commented Dec 8, 2024

@pforero As a workaround, does it work if you compare against a python Decimal type? I haven't tried this end to end, but it seems like the problem might be that BigQuery does not attempt to cast array elements when using the x IN UNNEST([...]) form.

As you say, this does not work:

SELECT CAST(1.1 AS NUMERIC) IN UNNEST([1.1])

No matching signature for operator IN UNNEST for argument types: NUMERIC, ARRAY at [1:29]

But this does:

SELECT CAST(1.1 AS NUMERIC) IN UNNEST([CAST(1.1 AS NUMERIC)])

If you use a python Decimal type when constructing the WHERE clause, the parameter will use the NUMERIC type:

from decimal import Decimal
import pytest

import sqlalchemy

@pytest.fixture
def test_table(metadata):
    from sqlalchemy_bigquery import FLOAT64, INT64, NUMERIC

    return sqlalchemy.Table(
        "test_table",
        metadata,
        sqlalchemy.Column("f", FLOAT64),
        sqlalchemy.Column("i", INT64),
        sqlalchemy.Column("n", NUMERIC),
    )


def test_numeric_in_unnest(faux_conn, test_table):
    expr = sqlalchemy.select(test_table.c.n).where(test_table.c.n.in_([Decimal(1)]))

    expected_sql = (
        "SELECT `test_table`.`n` \n"
        "FROM `test_table` \n"
        "WHERE `test_table`.`n` IN UNNEST(%(n_1:NUMERIC)s)"
    )

    actual_sql = expr.compile(
        faux_conn
    ).string

    assert expected_sql == actual_sql

Note that it is currently expected that python-bigquery-sqlalchemy rewrites x IN (...) to x IN UNNEST([...]) - looks like the idea there was to avoid chewing up query parameters when the set of values is large.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

No branches or pull requests

4 participants