Change `MultiDict` to be backed by a `MultiSet` instead of a `Vector`; add `delete!(md, k, v)` #692
Draft: NHDaly wants to merge 26 commits into MultiDict-reboot from nhd-MultiDict-unordered
+350
−85
Before, constructors would accept _either_ (k=>v, k=>v...), or (k=>[v], k=>[v]...). Each worked in some contexts, but neither in all contexts. Also, before, generators did not work.

Now, there are two simple rules:
1. MultiDict constructors expect pairs or tuples or generators of {K,V}:
    - `MultiDict(1=>1,1=>2)`, `MultiDict{Int,Int}(1=>2,2=>3)`, etc
2. To manually construct a MultiDict with {K,Vector{V}}, you can construct with the _underlying Dict_ directly.
    - `MultiDict(Dict(1=>[1,2]))`, `MultiDict{Int,Int}(1=>[2],2=>[3])`, etc
- Fixes broken constructor by adding `grow_to!` overload. We need this because MultiDict isn't an AbstractDict.
- Adds `empty(::MultiDict, ::Type{K}, ::Type{V})`, which was needed in the process, and also makes us more standard! :raised-hands:
Apply @eulerkochy's formatting suggestions Co-authored-by: Koustav Chowdhury <kc99.kol@gmail.com>
Fixes constructors that must do type-promotion, e.g.:
```julia
julia> MultiDict(1 => 2, 1 => "hello")
MultiDict{Int64,Any}(Dict{Int64,Array{Any,1}}(1 => [2, "hello"]))
```
…tually allowed duplicates
```julia
julia> collect(MultiDict(3=>3, 1=>2, 1=>4))
3-element Array{Pair{Int64,Int64},1}:
 3 => 3
 1 => 2
 1 => 4
```
…extend. Per BlueStyle, we should overload via qualified name, and import names via `using`. I simply lapsed in my earlier commit.
Stop implicit overloading via `import`
A `MultiSet{T}` is very much like an `Accumulator{T,Int}`, but it exposes a single-value-based interface. The fact that it is storing counts under the hood is completely hidden from the interface.

For example, iterating the MultiSet returns each item in the MultiSet, repeated per its count, as if the multiset _actually_ contained duplicate items:
```julia
julia> show(collect(MultiSet([1,1,1,2,2])))
[2, 2, 1, 1, 1]
```
whereas the accumulator presents its count-based representation to the user:
```julia
julia> show(collect(MultiSet([1,1,1,2,2]).data))
[2 => 2, 1 => 3]
```
For more info on MultiSets, see: https://en.wikipedia.org/wiki/Multiset
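The idea of storing counts while iterating as if duplicates were present can be sketched with Base alone. `CountedBag` and `expanded` below are hypothetical names used only for illustration; they are not the PR's `MultiSet`, and the sketch assumes the counts live in a plain `Dict{T,Int}`.

```julia
# Hypothetical stand-in for the idea behind MultiSet{T}; not the PR's code.
struct CountedBag{T}
    counts::Dict{T,Int}   # item => number of occurrences
end

# Build the bag by counting each item of an iterable.
function CountedBag(items)
    T = eltype(items)
    counts = Dict{T,Int}()
    for x in items
        counts[x] = get(counts, x, 0) + 1
    end
    return CountedBag{T}(counts)
end

# Yield each item once per occurrence, hiding the counts from the caller.
expanded(b::CountedBag) =
    Iterators.flatten(Iterators.repeated(x, n) for (x, n) in b.counts)

# Total number of items, duplicates included.
Base.length(b::CountedBag) = sum(values(b.counts))

bag = CountedBag([1, 1, 1, 2, 2])
items = sort(collect(expanded(bag)))   # sorted, since Dict order is unspecified
```

Storage is proportional to the number of distinct items, while callers still see one element per occurrence.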
```julia
julia> collect(md)
5-element Array{Pair{Int64,Int64},1}:
 2 => 10
 2 => 20
 1 => 10
 1 => 10
 1 => 20

julia> collect(md[1])
3-element Array{Int64,1}:
 10
 10
 20
```
The default implementation of keys() and values() lazily constructs the keys and values out of `iterate()`, which is exactly what we'd want to do, and it's more standard using a KeySet and ValueIterator. :)
Before, it was comparing the value to the collection returned from `get()`, and now it returns the correct value. There already existed a function doing the right thing, but it was incorrectly overloaded for 2-Tuples, not Pairs, which is not what the AbstractDict interface expects.
After this PR, I think I am finally done with the MultiDict changes! :) I'll have a look at the OrderedMultiDict, like @StephenVavasis suggested, but I'm not sure when I'll get to it. (Sorry this has all been slow going; this is just my recreational programming 😅)
This PR builds on top of all the currently open PRs related to MultiDicts, to change the implementation to be built on a `Dict{K, MultiSet{V}}`. This allows logarithmic-time lookup of values, which makes `delete!(md, k, v)` and `in(kv::Pair{K,V}, md)` feasible performance-wise.

This PR then adds `delete!(md, k, v)` and updates `in(k=>v, md::MultiDict)`.

PRs included in this one:
- `MultiSet{T}()`, built on `Accumulator{T,Int}()`. #691
- `iterate()` iterate over all pairs, as if the dict actually allowed duplicate keys #676

Fixes #654.
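A rough sketch of the layout described above, using only Base. `SketchMultiDict` is a hypothetical name, a `Dict{V,Int}` of counts stands in for `MultiSet{V}`, and the choice that `delete!(md, k, v)` removes a single occurrence is an assumption rather than something the PR text states.

```julia
# Hypothetical sketch of a multi-dict whose values are stored as per-key counts.
# A Dict{V,Int} stands in for MultiSet{V}; none of these names come from the PR.
struct SketchMultiDict{K,V}
    data::Dict{K,Dict{V,Int}}
end
SketchMultiDict{K,V}() where {K,V} = SketchMultiDict{K,V}(Dict{K,Dict{V,Int}}())

# Insert one occurrence of the pair k => v.
function Base.push!(md::SketchMultiDict{K,V}, kv::Pair) where {K,V}
    counts = get!(() -> Dict{V,Int}(), md.data, kv.first)
    counts[kv.second] = get(counts, kv.second, 0) + 1
    return md
end

# Membership of a k => v pair: two dictionary lookups, no scan over the values.
function Base.in(kv::Pair, md::SketchMultiDict)
    counts = get(md.data, kv.first, nothing)
    return counts !== nothing && haskey(counts, kv.second)
end

# Assumption: remove a single occurrence of v under k; drop empty containers.
function Base.delete!(md::SketchMultiDict, k, v)
    counts = get(md.data, k, nothing)
    counts === nothing && return md
    n = get(counts, v, 0)
    if n > 1
        counts[v] = n - 1
    elseif n == 1
        delete!(counts, v)
        isempty(counts) && delete!(md.data, k)
    end
    return md
end

md = SketchMultiDict{Int,Int}()
for kv in (1 => 10, 1 => 10, 1 => 20, 2 => 10)
    push!(md, kv)
end
delete!(md, 1, 10)   # removes one of the two 1 => 10 entries
```

With a `Vector` of values per key, both the membership test and the single-value delete would need a linear scan; keying the values by count avoids that scan.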