Support union of multiple intervals #156
putianyi889 wants to merge 18 commits into JuliaMath:master
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #156 +/- ##
==========================================
- Coverage 99.11% 98.62% -0.49%
==========================================
Files 6 7 +1
Lines 225 291 +66
==========================================
+ Hits 223 287 +64
- Misses 2 4 +2
|
Is the restriction to |
|
The function calls |
hyrodium left a comment
Could you keep the original union(d1, d2) method for performance?
Before this PR
julia> using BenchmarkTools
julia> using IntervalSets
julia> i1 = zero(T) .. one(T)
0.0 .. 1.0
julia> i3 = one(T)/2 .. 2*one(T)
0.5 .. 2.0
julia> @benchmark ∪(i1, i3)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
Range (min … max): 16.133 ns … 1.314 μs ┊ GC (min … max): 0.00% … 95.32%
Time (median): 17.528 ns ┊ GC (median): 0.00%
Time (mean ± σ): 19.641 ns ± 31.241 ns ┊ GC (mean ± σ): 3.63% ± 2.33%
▄▄▂▇█▆▅▄▂▃▅▄▄▂▁ ▁▁▁▁▁ ▂
████████████████▆▆▆▆▄▅▄▄▆▇▇▆▅▆▇▇█████████▇▆▅▄▄▅▅▄▁▁▆▅▆▆▇▆▅▇ █
16.1 ns Histogram: log(frequency) by time 33.5 ns <
Memory estimate: 32 bytes, allocs estimate: 1.
After this PR
julia> using BenchmarkTools
julia> using IntervalSets
julia> i1 = zero(T) .. one(T)
0.0 .. 1.0
julia> i3 = one(T)/2 .. 2*one(T)
0.5 .. 2.0
julia> @benchmark ∪(i1, i3)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
Range (min … max): 65.323 ns … 1.561 μs ┊ GC (min … max): 0.00% … 93.07%
Time (median): 68.096 ns ┊ GC (median): 0.00%
Time (mean ± σ): 74.969 ns ± 73.735 ns ┊ GC (mean ± σ): 4.94% ± 4.81%
▄██▆▃ ▂▃▃ ▁ ▁
▇███████▇▇▇▇▇████▇▇▆▇▆▆▅▅▅▅▅▄▆▅▅▅▆▇▇▆▆▅▅▆▇█▇▇▇▇▇▇▇▆▆▆▄▅▅▅▄▅ █
65.3 ns Histogram: log(frequency) by time 114 ns <
Memory estimate: 128 bytes, allocs estimate: 2.
Co-authored-by: Yuto Horikawa <hyrodium@gmail.com>
|
The implementation now uses an iterator-based method to merge sorted intervals. The bottleneck is sorting.
For the union of 2 intervals, the old method is still 25% faster, so it is kept.
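To illustrate the sort-then-merge idea described above, here is a minimal sketch. It is not the PR's code: closed intervals are modelled as plain `(lo, hi)` tuples instead of IntervalSets types, `union_closed` is a hypothetical name, and open endpoints and empty intervals are ignored.

```julia
# Hypothetical helper, not part of IntervalSets: each closed interval
# is a (lo, hi) tuple with lo <= hi.
function union_closed(intervals)
    isempty(intervals) && throw(ArgumentError("need at least one interval"))
    # Sorting by left endpoint is the O(n log n) step (the bottleneck).
    sorted = sort(collect(intervals); by = first)
    lo, hi = sorted[1]
    # Single left-to-right sweep: extend the right endpoint, fail on a gap.
    for (a, b) in sorted[2:end]
        a > hi && throw(ArgumentError("intervals are disjoint"))
        hi = max(hi, b)
    end
    return (lo, hi)
end

union_closed([(0.5, 2.0), (0.0, 1.0), (1.5, 3.0)])  # (0.0, 3.0)
```

After sorting, the sweep itself is linear, which is why the sort dominates the cost for larger inputs.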
|
Important changes:
There is one uncovered line:
union(d::TypedEndpointsInterval) = d # 1 interval
It's there to avoid calling
union(I::TypedEndpointsInterval...) = iterunion(sort!(collect(I); lt = leftof)) # ≥21 intervals
in that case, so I don't think it's worth testing.
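The dispatch behaviour behind that one-argument method can be sketched with a stand-in type. `Span` and `hull` below are hypothetical names, not the IntervalSets API, and `hull` computes a convex hull instead of a checked union. The point is only that a fixed-arity method is more specific than a vararg one, so the single-argument call never reaches `collect` and `sort!`.

```julia
# Hypothetical stand-in for an interval type.
struct Span
    lo::Float64
    hi::Float64
end

# One argument: returned as-is, no allocation or sorting.
hull(d::Span) = d

# Two or more arguments: collect and sort by left endpoint first.
function hull(ds::Span...)
    sorted = sort!(collect(ds); by = d -> d.lo)
    return Span(sorted[1].lo, maximum(d -> d.hi, sorted))
end

# The fixed-arity method is selected for a single argument.
which(hull, Tuple{Span})
```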
|
In Julia, splatting is reasonable only for small numbers of arguments. In that case, the most straightforward dependency-free implementation is even more performant than this PR:
julia> ints = Tuple(i..(i+1) for i in 1:10)
(1 .. 2, 2 .. 3, 3 .. 4, 4 .. 5, 5 .. 6, 6 .. 7, 7 .. 8, 8 .. 9, 9 .. 10, 10 .. 11)
# this PR:
julia> @btime union($ints...)
255.479 ns (23 allocations: 1.52 KiB)
1 .. 11
# naive implementation:
julia> myunion(ints...) = reduce(union, ints)
julia> @btime myunion($ints...)
18.556 ns (0 allocations: 0 bytes)
1 .. 11
UPD: ah, I see, it does require sorting... I heard Julia is adding tuple sorting soon though?
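A small sketch of why a plain pairwise fold needs sorted input, again with closed intervals modelled as `(lo, hi)` tuples and a hypothetical `pair_union` that rejects disjoint pairs. `foldl` is used instead of `reduce` to make the left-to-right order explicit.

```julia
# Hypothetical two-argument union on closed intervals given as (lo, hi) tuples.
function pair_union((a, b), (c, d))
    max(a, c) > min(b, d) && throw(ArgumentError("disjoint intervals"))
    return (min(a, c), max(b, d))
end

unsorted = ((1, 2), (3, 4), (2, 3))

# Folding in the given order fails: (1, 2) and (3, 4) do not touch,
# even though the union of all three is connected.
fails = try
    foldl(pair_union, unsorted)
    false
catch err
    err isa ArgumentError
end

# Sorting by left endpoint first keeps every intermediate union connected.
ok = foldl(pair_union, sort(collect(unsorted); by = first))  # (1, 4)
```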
|
Yeah, don't know the current state of the Julia PR JuliaLang/julia#46104, but copying tuple sorting from there:
function tsort(x::NTuple{N}; lt::Function=isless, by::Function=identity,
               rev::Union{Bool,Nothing}=nothing, order::Base.Ordering=Base.Forward) where N
    o = Base.ord(lt,by,rev,order)
    issorted(x, o) ? x : _sort(x, o)
end
_sort(x::Union{NTuple{0}, NTuple{1}}, o::Base.Ordering) = x
function _sort(x::NTuple, o::Base.Ordering)
    a, b = Base.IteratorsMD.split(x, Val(length(x)>>1))
    merge(_sort(a, o), _sort(b, o), o)
end
merge(x::NTuple, y::NTuple{0}, o::Base.Ordering) = x
merge(x::NTuple{0}, y::NTuple, o::Base.Ordering) = y
merge(x::NTuple{0}, y::NTuple{0}, o::Base.Ordering) = x # Method ambiguity
merge(x::NTuple, y::NTuple, o::Base.Ordering) =
    (Base.lt(o, y[1], x[1]) ? (y[1], merge(x, Base.tail(y), o)...) : (x[1], merge(Base.tail(x), y, o)...))
julia> myunion(ints...) = reduce(union, tsort(ints; by=i -> (leftendpoint(i), isleftopen(i))))
julia> @btime myunion($ints...)
59.723 ns (0 allocations: 0 bytes)
If I understand correctly, it only supports type-stable tuple sorting, that is, all items must be of the exact same type.
Edit: JuliaLang/julia#52010
By the way, that implementation is exactly the same as TupleTools.jl.
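The same-type restriction comes from the `NTuple` annotation, which only matches homogeneous tuples. A minimal check, using plain numbers instead of intervals, with `accepts_ntuple` as a hypothetical helper:

```julia
# NTuple{N} only matches tuples whose elements all share one concrete type.
homogeneous = (1, 2)      # Tuple{Int, Int}
mixed       = (1, 2.0)    # Tuple{Int, Float64}

homogeneous isa NTuple{2}  # true
mixed isa NTuple{2}        # false

# Hypothetical helper with the same kind of signature as tsort above.
accepts_ntuple(x::NTuple{N}) where {N} = true
accepts_ntuple(x::Tuple) = false
```

So a call that mixes, say, integer and floating-point element types would not dispatch to a method declared on `NTuple{N}`.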
union operation is not supported #103
This also provides a way to simplify DomainSets.UnionDomain of intervals.