Add value_counts() method to SpanArray to support DataFrame.describe() #96

Open
frreiss opened this issue Aug 25, 2020 · 0 comments
Labels
bug (Something isn't working), enhancement (New feature or request), good first issue (Good for newcomers)

frreiss commented Aug 25, 2020

Calling DataFrame.describe() on a DataFrame with a span column currently raises an AttributeError because our array types do not implement the value_counts() method. The stack trace looks like this:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-55-f041b2b916b3> in <module>
----> 1 syntax_df.describe()

[...]

~/opt/miniconda3/envs/pd/lib/python3.7/site-packages/pandas/core/algorithms.py in value_counts(values, sort, ascending, normalize, bins, dropna)
    736 
    737             # handle Categorical and sparse,
--> 738             result = Series(values)._values.value_counts(dropna=dropna)
    739             result.name = name
    740             counts = result._values

AttributeError: 'CharSpanArray' object has no attribute 'value_counts'

We should implement the value_counts() method, following the example of the implementation for Pandas' built-in IntervalArray type.

Note that TokenSpanArray is currently a subclass of CharSpanArray, but the implementation of #91 may change that relationship.
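
As a rough illustration of the counting behavior that value_counts() would need, here is a minimal sketch in plain Python. ToySpanArray and its list-of-tuples storage are hypothetical stand-ins, not the project's real classes, and the sketch returns a dict for simplicity. Judging by the traceback above, pandas reads `.name` and `._values` on the result, so a real implementation would presumably return a Series indexed by the distinct spans.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# A span is a (begin, end) pair of character offsets; None marks a missing value.
Span = Optional[Tuple[int, int]]


class ToySpanArray:
    """Hypothetical stand-in for a span array; not the project's real class."""

    def __init__(self, spans: List[Span]) -> None:
        self._spans = list(spans)

    def value_counts(self, dropna: bool = True) -> Dict[Span, int]:
        """Count occurrences of each distinct span, most frequent first."""
        if dropna:
            items = [s for s in self._spans if s is not None]
        else:
            items = self._spans
        # Counter.most_common() orders entries by descending count.
        return dict(Counter(items).most_common())


arr = ToySpanArray([(0, 5), (6, 11), (0, 5), None])
counts = arr.value_counts()
print(counts)
```

The dropna parameter mirrors the keyword that pandas passes in the failing call (`value_counts(dropna=dropna)`); with dropna=False the missing entry is counted as well.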

@frreiss added the bug, enhancement, and good first issue labels on Aug 25, 2020
@frreiss changed the title from "Add value_counts() method to CharSpanArray to support DataFrame.describe()" to "Add value_counts() method to SpanArray to support DataFrame.describe()" on Jun 14, 2021