[FEA] Ability to round-trip all pandas columns dtypes #14149

Open
galipremsagar opened this issue Sep 21, 2023 · 0 comments
Labels: cudf.pandas (Issues specific to cudf.pandas), feature request (New feature or request), Python (Affects Python cuDF API)


Is your feature request related to a problem? Please describe.
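With the current Column design and to_pandas API implementation, a cudf Series can only be converted to a pandas Series with a NumPy dtype or a pandas nullable dtype. However, pandas also supports Arrow-backed dtypes. In the session below, to_pandas() returns NumPy int64 in all three cases, so the nullable and Arrow-backed dtypes are lost in the round trip.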

In [1]: import pandas as pd

In [2]: np_series = pd.Series([1, 2, 3], dtype='int64')

In [3]: pd_series = pd.Series([1, 2, 3], dtype=pd.Int64Dtype())

In [4]: import pyarrow as pa

In [5]: arrow_series = pd.Series([1, 2, 3], dtype=pd.ArrowDtype(pa.int64()))

In [6]: np_series
Out[6]: 
0    1
1    2
2    3
dtype: int64

In [7]: pd_series
Out[7]: 
0    1
1    2
2    3
dtype: Int64

In [8]: arrow_series
Out[8]: 
0   1
1   2
2   3
dtype: int64[pyarrow]

In [9]: import cudf

In [10]: cudf.from_pandas(np_series).to_pandas()
Out[10]: 
0    1
1    2
2    3
dtype: int64

In [11]: cudf.from_pandas(pd_series).to_pandas()
Out[11]: 
0    1
1    2
2    3
dtype: int64

In [12]: cudf.from_pandas(arrow_series).to_pandas()
Out[12]: 
0    1
1    2
2    3
dtype: int64
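
The three input series above differ only in their dtype family, and that can be checked programmatically. The following is a minimal sketch using only pandas and NumPy; `dtype_family` is a hypothetical helper name introduced for illustration, not part of pandas or cudf.

```python
import numpy as np
import pandas as pd


def dtype_family(dtype):
    """Classify a pandas dtype as 'numpy', 'arrow', or 'extension'.

    Hypothetical helper for illustration; not a pandas or cudf API.
    """
    # pd.ArrowDtype may be missing on older pandas, so look it up defensively.
    arrow_cls = getattr(pd, "ArrowDtype", None)
    # ArrowDtype is itself an ExtensionDtype, so it must be tested first.
    if arrow_cls is not None and isinstance(dtype, arrow_cls):
        return "arrow"
    if isinstance(dtype, pd.api.extensions.ExtensionDtype):
        return "extension"  # e.g. the nullable Int64Dtype
    if isinstance(dtype, np.dtype):
        return "numpy"
    raise TypeError(f"unrecognized dtype: {dtype!r}")


np_series = pd.Series([1, 2, 3], dtype="int64")
pd_series = pd.Series([1, 2, 3], dtype=pd.Int64Dtype())

print(dtype_family(np_series.dtype))  # numpy
print(dtype_family(pd_series.dtype))  # extension
```

Applied to the results of Out[10] through Out[12], a check like this would report the NumPy family every time, which is the loss this issue describes.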

Describe the solution you'd like
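I would like cudf to be able to round-trip pandas dtypes: converting a pandas object to cudf and back should return the dtype it started with, whether that is a NumPy, pandas nullable, or Arrow-backed dtype.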
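
Until something like this exists, one possible user-side workaround is to remember the original dtype and cast back after the round trip. The sketch below is an assumption about how that could look, not cudf's design; `restore_dtype` is a hypothetical helper, and the cudf call is replaced by a plain `astype("int64")` stand-in so the example runs without a GPU.

```python
import pandas as pd
from pandas.api.types import is_dtype_equal


def restore_dtype(original, round_tripped):
    """Cast round_tripped back to original's dtype if the trip changed it.

    Hypothetical helper for illustration; not a cudf API.
    """
    if is_dtype_equal(original.dtype, round_tripped.dtype):
        return round_tripped
    return round_tripped.astype(original.dtype)


pd_series = pd.Series([1, 2, 3], dtype=pd.Int64Dtype())

# Stand-in for cudf.from_pandas(pd_series).to_pandas(), which the
# session above shows returning a NumPy int64 series.
lossy = pd_series.astype("int64")

restored = restore_dtype(pd_series, lossy)
print(restored.dtype)  # Int64
```

The cast happens on the host after the data has already been converted, so it only papers over the problem. The request here is for cudf itself to carry the dtype through.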
