BUG: fix construction of Series / Index from dict keys when "str" dtype is specified explicitly #60436

Open
wants to merge 2 commits into
base: main

@tasfia8 tasfia8 commented Nov 28, 2024

The default behavior (pd.Index(d.keys())) worked correctly, but explicitly setting dtype="str" raised a ValueError. The issue stemmed from dict_keys not being converted to a proper array-like structure before being passed to StringDtype, which couldn't handle such inputs.

To fix the issue:

  • KeyView was introduced to identify dict_keys and preprocess them before they are passed to pandas internals. The keys are now converted to a list for compatibility.
  • The change passes the existing test cases and fixes the problems with the previous PR.

After the fix, both the default (pd.Index(d.keys())) and explicit (pd.Index(d.keys(), dtype="str")) cases work:
Screenshot 2024-11-19 at 3 17 08 AM
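
The reason a dict_keys object needs preprocessing can be sketched with the standard library alone. This is a minimal illustration and not the code from the PR: it assumes the "KeyView" mentioned above corresponds to collections.abc.KeysView, and the helper name keys_to_list is hypothetical.

```python
from collections.abc import KeysView, Sequence

d = {"a": 1, "b": 2}
keys = d.keys()

# A dict_keys object is a view: it is iterable and sized,
# but it is not a sequence and does not support indexing.
is_keys_view = isinstance(keys, KeysView)
is_sequence = isinstance(keys, Sequence)


def keys_to_list(data):
    """Hypothetical helper: materialize a dict keys view as a list.

    Any other input is returned unchanged.
    """
    if isinstance(data, KeysView):
        return list(data)
    return data


# The list keeps the dict's insertion order and is a plain sequence.
converted = keys_to_list(keys)
```

Once the view has been turned into a list, downstream code that expects an array-like input receives an ordinary sequence.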

@tasfia8
Author

tasfia8 commented Nov 28, 2024

@jorisvandenbossche Could you take another look and, if possible, merge? I addressed your comments in this updated PR and added the generic if condition. Besides dict.keys(), it now also handles other iterables such as dict.values() and similar types.

My IDE introduced some unintended formatting changes that I can't seem to undo, but in this commit I only changed the if condition in construction.py and nothing else.
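
A generic condition of the kind described here might look roughly like the following. This is a sketch under assumptions, not the actual change to construction.py: the function name materialize_dict_view is hypothetical, and the choice of KeysView and ValuesView as the types to match is a guess at what "generic" covers.

```python
from collections.abc import KeysView, ValuesView


def materialize_dict_view(data):
    """Hypothetical helper: convert dict key/value views to a list.

    Inputs that are not dict views (lists, strings, arrays, ...) are
    returned as-is, so existing code paths are not affected.
    """
    if isinstance(data, (KeysView, ValuesView)):
        return list(data)
    return data


d = {"x": "p", "y": "q"}
from_keys = materialize_dict_view(d.keys())
from_values = materialize_dict_view(d.values())
```

Matching on the abstract base classes rather than on the concrete dict_keys type means that views from dict subclasses and other mappings are covered as well.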

Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Dec 31, 2024
@mroeschke
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

@jorisvandenbossche
Member

@tasfia8 apologies for the slow response. Your main commit eb11a27 looks good, but there are still a bunch of unrelated changes. I will push a change to this branch to clean that up.

In addition, we also still need to add a test.

@jorisvandenbossche jorisvandenbossche changed the title from "Bug Fix: #60343 Construction of Series / Index fails from dict keys when "str" dtype is specified explicitly - PR 2" to "BUG: fix construction of Series / Index from dict keys when "str" dtype is specified explicitly" Jan 3, 2025
@jorisvandenbossche jorisvandenbossche added this to the 2.3 milestone Jan 3, 2025
@jorisvandenbossche jorisvandenbossche added the Strings (String extension data type and string data) and Constructors (Series/DataFrame/Index/pd.array Constructors) labels Jan 3, 2025
Labels
Constructors (Series/DataFrame/Index/pd.array Constructors), Strings (String extension data type and string data)
Development

Successfully merging this pull request may close these issues.

BUG (string): contruction of Series / Index fails from dict keys when "str" dtype is specified explicitly
3 participants