Impact
Deserialization of untrusted data in IPC and Parquet readers in PyArrow versions 0.14.0 to 14.0.0 allows arbitrary code execution. An application is vulnerable if it reads Arrow IPC, Feather or Parquet data from untrusted sources (for example user-supplied input files). This vulnerability only affects PyArrow, not other Apache Arrow implementations or bindings.
Note that Ibis itself makes extremely limited use of pyarrow.parquet.read_table:
- read_table is used in tests, where the input file is entirely controlled by the Ibis developers.
- read_table is used in ibis/examples/__init__.py as a fallback for backends that don't support reading Parquet directly. Parquet data used in ibis.examples is also managed by the Ibis developers. This Parquet data is generated from CSV files and SQLite databases.
- The Pandas and Dask backends both use PyArrow to read Parquet files and are therefore affected.
Ibis does not make use of APIs that directly read from either Arrow IPC files or Feather files.
Patches
Ibis imports the pyarrow_hotfix package wherever pyarrow is used, as of version 7.1.0.
Upgrading to PyArrow 14.0.1 or later is also a possible solution, starting in Ibis 7.1.0.
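To decide whether an environment needs the patch or an upgrade, it can help to compare the installed PyArrow version against the affected range given above (0.14.0 to 14.0.0, inclusive). The following is a minimal sketch, not part of Ibis or PyArrow: the helper names release_tuple, is_affected and installed_pyarrow_is_affected are hypothetical, and the parser only looks at the leading numeric components, so pre-release or development suffixes are ignored.

```python
import re
from importlib import metadata

# Affected range as stated in this advisory: 0.14.0 through 14.0.0, inclusive.
FIRST_AFFECTED = (0, 14, 0)
LAST_AFFECTED = (14, 0, 0)


def release_tuple(version: str) -> tuple:
    """Parse the leading 'X.Y[.Z]' of a version string into a 3-tuple of ints.

    Simplification (assumption): suffixes such as '.dev0' or 'rc1' are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True if the version falls inside the affected range."""
    return FIRST_AFFECTED <= release_tuple(version) <= LAST_AFFECTED


def installed_pyarrow_is_affected():
    """Return True/False for the installed PyArrow, or None if it is not installed."""
    try:
        return is_affected(metadata.version("pyarrow"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("installed PyArrow affected:", installed_pyarrow_is_affected())
```

A True result means the environment should either upgrade PyArrow or rely on pyarrow_hotfix as described in this advisory. A False result says nothing about whether the application reads untrusted data.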
Workarounds
Install pyarrow_hotfix and run import pyarrow_hotfix ahead of any and all import ibis statements.
For example:
    import ibis
becomes
    import pyarrow_hotfix
    import ibis
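In a larger application, one way to enforce this ordering is to route the import through a single helper that fails loudly when the hotfix package is missing. This is only a sketch under stated assumptions: load_hotfix_then is a hypothetical helper, not an Ibis or pyarrow_hotfix API, and it relies solely on the standard importlib module.

```python
import importlib
import importlib.util


def load_hotfix_then(target: str, hotfix: str = "pyarrow_hotfix"):
    """Import `hotfix` first, then import and return `target`.

    Hypothetical helper: raises RuntimeError when the hotfix package
    is not installed, instead of silently importing `target` without it.
    """
    if importlib.util.find_spec(hotfix) is None:
        raise RuntimeError(
            f"{hotfix} is not installed; install it before importing {target}"
        )
    importlib.import_module(hotfix)  # the hotfix is imported first
    return importlib.import_module(target)


# Intended usage (requires both packages to be installed):
# ibis = load_hotfix_then("ibis")
```

Raising an error rather than falling back keeps a missing hotfix from going unnoticed in deployments that still run an affected PyArrow version.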
References
https://www.cve.org/CVERecord?id=CVE-2023-47248
https://nvd.nist.gov/vuln/detail/CVE-2023-47248