Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base.hasfastin(Iterators.Pairs) needs to check how it is storing the keys #34672

Open
oxinabox opened this issue Feb 5, 2020 · 4 comments
Open

Comments

@oxinabox
Copy link
Contributor

oxinabox commented Feb 5, 2020

Similar to JuliaCollections/OrderedCollections.jl#33

We see that:

julia> x = pairs((a=1,b=2))
pairs(::NamedTuple) with 2 entries:
  :a => 1
  :b => 2

julia> Base.hasfastin(typeof(x))
true

julia> Base.hasfastin(typeof(x.itr))
false

hasfastin if the checking of haskey is expected logorithmic or better.
but for for many uses of pairs, e.g. when applied to a NamedTuple it is linear.

But we are falling back to the default for AbstractDict which is true.

julia/base/abstractset.jl

Lines 284 to 295 in aa36eef

"""
hasfastin(T)
Determine whether the computation `x ∈ collection` where `collection::T` can be considered
as a "fast" operation (typically constant or logarithmic complexity).
The definition `hasfastin(x) = hasfastin(typeof(x))` is provided for convenience so that instances
can be passed instead of types.
However the form that accepts a type argument should be defined for new types.
"""
hasfastin(::Type) = false
hasfastin(::Union{Type{<:AbstractSet},Type{<:AbstractDict},Type{<:AbstractRange}}) = true
hasfastin(x) = hasfastin(typeof(x))

@rfourquet
Copy link
Member

I can also think of Base.ImmutableDict that has only linear lookup. On the other hand, my impression is that Base.ImmutableDict is typically used only with few elements, and NamedTuple too, which would make me think that "linear" in this case is still "fast" (and maybe for LittleDict too, AFAIR, it's supposed to be used mostly with few elements).

It would be great to have one (or more) benchmarks of the implication of changing this hasfastin trait on real use-cases.

@oxinabox
Copy link
Contributor Author

oxinabox commented Feb 5, 2020

I think the idea is on does something like:

if length(x) < 32 || hasfastin(typeof(x))
    dothing(keys(x))
else
    dothing(Set(keys(x)))
end

@rfourquet
Copy link
Member

I think the idea is on does something like:

Precisely, so for small collections it might still be better to not create a Set even when they have only linear lookup.

@oxinabox
Copy link
Contributor Author

oxinabox commented Feb 5, 2020

example of use:

julia/base/abstractset.jl

Lines 274 to 280 in 3aeb3ec

if !hasfastin(r) && rlen > FASTIN_SET_THRESHOLD
return issubset(l, Set(r))
end
end
for elt in l
elt in r || return false
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants