Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in result_fetch(res@ptr, n = n) : embedded nul in string #294

Open
cboettig opened this issue Feb 2, 2023 · 5 comments
Open

Error in result_fetch(res@ptr, n = n) : embedded nul in string #294

cboettig opened this issue Feb 2, 2023 · 5 comments

Comments

@cboettig
Copy link

cboettig commented Feb 2, 2023

Issue #121 was locked unresolved. The issue remains, and the solution proposed there, emulating the behavior already used in RMySQL seems like a good way forward.

@krlmlr
Copy link
Member

krlmlr commented Feb 2, 2023

Thanks. I think the rationale for closing I gave back then still holds.

Can you somehow cast the data on the database side to make sure it appears in proper shape in R? What does your stack look like?

@her1997
Copy link

her1997 commented Sep 30, 2023

I have same problem as thread #121

When running code in the following way:

query <- "SELECT *FROM table"
result <- dbSendQuery(con, query)
df<- dbFetch(result)

I get the error: "Error in result_fetch(res@ptr, n = n) : embedded nul in string"

For now, I have internally resolved it by identifying the problematic column and changing the query to:

query <- "SELECT column1, column2,..., REPLACE(column, '\\0', '') AS new_column FROM table"

And with this, I have been able to correctly read the column that was causing the issue when reading the data.
The inconvenience is still present; the ideal situation would be to read the entire table without any conflicts.

I was using: RMariaDB_1.2.1, R version 4.0.4 (2021-02-15)

@krlmlr
Copy link
Member

krlmlr commented Apr 1, 2024

@paleolimbot: Are there plans for a MariaDB/MySQL ADBC driver, by any chance?

This database will support strings with embedded NUL characters. AFAICT, Postgres doesn't support them, and I was unable to replicate it with SQLite. How will ADBC deal with that issue?

IIUC, Arrow had the same problem, and the consensus there was to fail explicitly: apache/arrow#8365 (comment) .

@paleolimbot
Copy link

Are there plans for a MariaDB/MySQL ADBC driver, by any chance?

I'm not aware of any, but we recently have dramatically improved the Go and C++ driver framework which should make it much easier to write and maintain drivers if there is some well-known Go or C++ connector worth wrapping.

From ADBC's standpoint, I am not sure if the driver would catch this (or want to) before exporting an Arrow array. Any incompatibility would happen on conversion to an R vector, where, as you noted, the arrow package fails (or can strip them). In nanoarrow you can at least work around this by requesting that the string be converted to a blob (the arrow package doesn't allow this type of output request).

library(nanoarrow)

string_array_with_embedded_nul <- nanoarrow_array_modify(
  nanoarrow_array_init(na_string()),
  list(
    length = 1,
    buffers = list(
      NULL,
      c(0L, 9L),
      c(charToRaw("abcd"), as.raw(0x00), charToRaw("efgh"))
    )
  )
)

# Error deferred to printing because of ALTREP?
convert_array(string_array_with_embedded_nul)
#> [1]
#> Error in print.default(x): embedded nul in string: 'abcd\0efgh'

# Can always bail to a blob on convert
convert_array(string_array_with_embedded_nul, blob::blob())
#> <blob[1]>
#> [1] blob[9 B]

# Error deferred to printing because of ALTREP
as.vector(arrow::as_arrow_array(string_array_with_embedded_nul))
#> [1]
#> Error in print.default(x): embedded nul in string: 'abcd\0efgh'; to strip nuls when converting from Arrow to R, set options(arrow.skip_nul = TRUE)

# Option exists to strip them (with lots of warnings)
options(arrow.skip_nul = TRUE)
as.vector(arrow::as_arrow_array(string_array_with_embedded_nul))
#> [1]
#> Warning in print.default(x): Stripping '\0' (nul) from character vector

#> Warning in print.default(x): Stripping '\0' (nul) from character vector

#> Warning in print.default(x): Stripping '\0' (nul) from character vector
#>  "abcdefgh"

Created on 2024-04-01 with reprex v2.1.0

@krlmlr
Copy link
Member

krlmlr commented Apr 1, 2024

If there was an ADBC driver for MariaDB, I'd expect it to thread through the embedded NULLs. The users could then apply repair.

Fixing this issue in the existing backends looks like a losing battle. Implementing that ADBC driver seems to be the much more worthwhile goal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants