Add Gosper island boundary edge iterator#1138
Conversation
|
Talk explaining how this fits in: https://ajfriend.com/h3c2p-talk/ |
| for (i = 0; iter.e; i++) { | ||
| H3Index prev = iter.e; | ||
| iterStepGosper(&iter); | ||
| t_assert(iter.e != prev, "edge should advance"); |
There was a problem hiding this comment.
This checks that the new cell is different but not necessarily unique (i.e., the iterator could swap between two different values and still pass this test). The comment on this function implies a more involved test that the same value will not be returned twice.
I guess this is handled practically by do_edges_connect/check_expected_edges?
There was a problem hiding this comment.
You're right. I updated the comment to reflect that this sub-check does the weaker test.
The stronger test comes from combining the three sub-checks into check_gosper_island. If all those pass, we know:
- The iterator has the correct number of edges
- Iterator values are valid H3 directed edges
- Consecutive pairs of edges are distinct
- Edges are on the boundary of the Gosper island (origin of edge is child of initial cell, destination is not), at the correct resolution
- Consecutive edges share an endpoint (last to first)
- Loop closes back to where it started
Those properties together should guarantee that the edges produced by the iterator are all distinct.
There was a problem hiding this comment.
For (5), does that need to be "Consecutive edges share exactly one endpoint"? It seems like the last to first and consecutive pairs of edges are distinct parts covers that because the failure scenario I'm imagining would require the iterator to reverse at some point.
There was a problem hiding this comment.
Since we have the other properties, it is sufficient (and natural) to check just the one endpoint. For example,
suppose we have a path segment of edges on the Gosper boundary like:
a -> b -> c
and we are considering the next edge x in the sequence, and worried it might double back on the sequence:
a -> b -> c -> x
Let's show that it can't double back. Assume start(c) and end(c) give the first and last vertices of directed edge c. We know/require that start(x) == end(c). That means that x != c, and even that x != b, since the endpoints of c are distinct (and the same for b).
Now, consider the possibility that x = rev(c), the reversal of edge c. This would pass requirement (5), because start(rev(c)) == end(c), but we're OK because this edge doesn't pass (4): rev(c) will have the wrong origin/parent cell.
Does that cover the failure scenario you were thinking of?
| while (iter->_isPentagon && ALWAYS(iter->e != H3_NULL) && | ||
| H3_GET_INDEX_DIGIT(iter->e, iter->_parentRes + 1) == 1) { | ||
| advanceEdge(iter); | ||
| } |
There was a problem hiding this comment.
I wonder if it could be possible to fast-forward to the next non-deleted edge, as an optimization.
I guess it's possible we start somewhere in the middle of the deleted subsequence, in which case it would probably be harder to compute how to get out of it in a single step.
There was a problem hiding this comment.
That is something we could try at some point!
I'd leave it out of this PR, though, because I'd expect the time savings to be low in practice, and not worth the increase in code complexity.
| for (i = 0; iter.e; i++) { | ||
| H3Index prev = iter.e; | ||
| iterStepGosper(&iter); | ||
| t_assert(iter.e != prev, "edge should advance"); |
There was a problem hiding this comment.
For (5), does that need to be "Consecutive edges share exactly one endpoint"? It seems like the last to first and consecutive pairs of edges are distinct parts covers that because the failure scenario I'm imagining would require the iterator to reverse at some point.
justinhwang
left a comment
There was a problem hiding this comment.
Logic lgtm, but perhaps some more comments on what seemed like magic numbers to me on my first read
| // New parent cell: shift back by 6 positions. | ||
| walkPos[r] -= 6; |
There was a problem hiding this comment.
What is the intuition behind 6 positions? I could visually confirm (referencing your talk slides) that this was the case but didn't quite follow myself
There was a problem hiding this comment.
Honestly, I just printed out a bunch of sequences, noticed the pattern, and guessed at the shift until I found one that worked, and checked that it worked universally.
| // Advance to next position: Increase the cyclic index by one, but | ||
| // we use +19 == +1 (mod 18), so that it stays positive, | ||
| // even after potentially subtracting by 6 above. | ||
| walkPos[r] = (walkPos[r] + 19) % 18; |
There was a problem hiding this comment.
style nit: could we pull 18 out into a const for clarity (same for 6 below)
#define WALK_CYCLE_LEN ((int)(sizeof(walk_digit) / sizeof(walk_digit[0])))
...
walkPos[r] = (walkPos[r] + WALK_CYCLE_LEN + 1) % WALK_CYCLE_LEN;
or just WALK_CYCLE_LEN directly as 18
There was a problem hiding this comment.
Makes sense. Updated.
|
|
||
| // Set resolution to child level and initialize boundary walk. | ||
| // The first child level starts at walk position 0. | ||
| // Subsequent levels start at either 14 or 16, depending on |
There was a problem hiding this comment.
This was just another case of me looking at a bunch of sequences, noticing the pattern, guessing, realizing it shifted between resolutions, and figuring out that 14 and 16 work, and checking that it always works.
| // Advance to next edge. Increase the cyclic index by one, but | ||
| // we use +7 == +1 (mod 6) so that it stays positive, | ||
| // even after potentially subtracting by 2 above. | ||
| iter->_edgePos = (iter->_edgePos + 7) % 6; |
There was a problem hiding this comment.
nit: we could also define a const here
Closes #1114
Adds
IterEdgesGosper, an iterator that yields the directed edges forming the boundary (Gosper island outline) of a cell's child set at a given resolution.rexpanded to child resolutionR: hexagons produce6 * 3^(R-r)edges, pentagons produce5 * 3^(R-r)Usage
Next step: Speed up
cellsToMultiPolygonThis is a building block for a faster
cellsToMultiPolygon. When given a pre-compacted input cell set, the iterator allows thecellsToMultiPolygonalgorithm to walk the Gosper island boundary edges directly, skipping all internal edges. On a Colorado polygon at resolution 8, this yields a 35x speedup. The higher the resolution, the better the speedup. This speedup is separate from the 3-4x improvement already achieved with the newcellsToMultiPolygonalgorithm.The quick implementation that I used to get the 200x speedup number for the talk (slides and video) can be found here: #1168