Skip to content

Add Gosper island boundary edge iterator#1138

Open
ajfriend wants to merge 7 commits into
uber:masterfrom
ajfriend:gosper_iter
Open

Add Gosper island boundary edge iterator#1138
ajfriend wants to merge 7 commits into
uber:masterfrom
ajfriend:gosper_iter

Conversation

@ajfriend
Copy link
Copy Markdown
Collaborator

@ajfriend ajfriend commented Mar 10, 2026

Closes #1114

Adds IterEdgesGosper, an iterator that yields the directed edges forming the boundary (Gosper island outline) of a cell's child set at a given resolution.

  • Each edge's origin cell is a child of the parent cell; its destination cell is not
  • Edges form a closed counter-clockwise loop
  • For a parent at resolution r expanded to child resolution R: hexagons produce 6 * 3^(R-r) edges, pentagons produce 5 * 3^(R-r)
  • For now, this is an internal-only iterator, not included in the public API

Usage

IterEdgesGosper iter = iterInitGosper(cell, childRes);
while (iter.e) {
    // iter.e is the current directed edge
    // iter.remaining is the number of edges left
    iterStepGosper(&iter);
}

Next step: Speed up cellsToMultiPolygon

This is a building block for a faster cellsToMultiPolygon. When given a pre-compacted input cell set, the iterator allows the cellsToMultiPolygon algorithm to walk the Gosper island boundary edges directly, skipping all internal edges. On a Colorado polygon at resolution 8, this yields a 35x speedup. The higher the resolution, the better the speedup. This speedup is separate from the 3-4x improvement already achieved with the new cellsToMultiPolygon algorithm.

The quick implementation that I used to get the 200x speedup number for the talk (slides and video) can be found here: #1168

@coveralls
Copy link
Copy Markdown

coveralls commented Mar 10, 2026

Coverage Status

coverage: 99.176% (+0.01%) from 99.162% — ajfriend:gosper_iter into uber:master

@ajfriend
Copy link
Copy Markdown
Collaborator Author

Talk explaining how this fits in: https://ajfriend.com/h3c2p-talk/

Comment thread src/h3lib/lib/iterGosper.c Outdated
for (i = 0; iter.e; i++) {
H3Index prev = iter.e;
iterStepGosper(&iter);
t_assert(iter.e != prev, "edge should advance");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks that the new cell is different but not necessarily unique (i.e., the iterator could swap between two different values and still pass this test). The comment on this function implies a more involved test that the same value will not be returned twice.

I guess this is handled practically by do_edges_connect/check_expected_edges?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I updated the comment to reflect that this sub-check does the weaker test.

The stronger test comes from combining the three sub-checks into check_gosper_island. If all those pass, we know:

  1. The iterator has the correct number of edges
  2. Iterator values are valid H3 directed edges
  3. Consecutive pairs of edges are distinct
  4. Edges are on the boundary of the Gosper island (origin of edge is child of initial cell, destination is not), at the correct resolution
  5. Consecutive edges share an endpoint (last to first)
  6. Loop closes back to where it started

Those properties together should guarantee that the edges produced by the iterator are all distinct.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For (5), does that need to be "Consecutive edges share exactly one endpoint"? It seems like the last to first and consecutive pairs of edges are distinct parts covers that because the failure scenario I'm imagining would require the iterator to reverse at some point.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have the other properties, it is sufficient (and natural) to check just the one endpoint. For example,
suppose we have a path segment of edges on the Gosper boundary like:

a -> b -> c

and we are considering the next edge x in the sequence, and worried it might double back on the sequence:

a -> b -> c -> x

Let's show that it can't double back. Assume start(c) and end(c) give the first and last vertices of directed edge c. We know/require that start(x) == end(c). That means that x != c, and even that x != b, since the endpoints of c are distinct (and the same for b).

Now, consider the possibility that x = rev(c), the reversal of edge c. This would pass requirement (5), because start(rev(c)) == end(c), but we're OK because this edge doesn't pass (4): rev(c) will have the wrong origin/parent cell.

Does that cover the failure scenario you were thinking of?

Comment thread src/apps/testapps/testGosperIter.c Outdated
Comment thread src/apps/testapps/testGosperIter.c Outdated
Comment thread src/apps/testapps/testGosperIter.c
Comment on lines +136 to +139
while (iter->_isPentagon && ALWAYS(iter->e != H3_NULL) &&
H3_GET_INDEX_DIGIT(iter->e, iter->_parentRes + 1) == 1) {
advanceEdge(iter);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it could be possible to fast-forward to the next non-deleted edge, as an optimization.

I guess it's possible we start somewhere in the middle of the deleted subsequence, in which case it would probably be harder to compute how to get out of it in a single step.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is something we could try at some point!

I'd leave it out of this PR, though, because I'd expect the time savings to be low in practice, and not worth the increase in code complexity.

for (i = 0; iter.e; i++) {
H3Index prev = iter.e;
iterStepGosper(&iter);
t_assert(iter.e != prev, "edge should advance");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For (5), does that need to be "Consecutive edges share exactly one endpoint"? It seems like the last to first and consecutive pairs of edges are distinct parts covers that because the failure scenario I'm imagining would require the iterator to reverse at some point.

Copy link
Copy Markdown
Contributor

@justinhwang justinhwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic lgtm, but perhaps some more comments on what seemed like magic numbers to me on my first read

Comment on lines +97 to +98
// New parent cell: shift back by 6 positions.
walkPos[r] -= 6;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intuition behind 6 positions? I could visually confirm (referencing your talk slides) that this was the case but didn't quite follow myself

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I just printed out a bunch of sequences, noticed the pattern, and guessed at the shift until I found one that worked, and checked that it worked universally.

Comment thread src/h3lib/lib/iterGosper.c Outdated
// Advance to next position: Increase the cyclic index by one, but
// we use +19 == +1 (mod 18), so that it stays positive,
// even after potentially subtracting by 6 above.
walkPos[r] = (walkPos[r] + 19) % 18;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: could we pull 18 out into a const for clarity (same for 6 below)

#define WALK_CYCLE_LEN  ((int)(sizeof(walk_digit) / sizeof(walk_digit[0])))
...
walkPos[r] = (walkPos[r] + WALK_CYCLE_LEN + 1) % WALK_CYCLE_LEN;

or just WALK_CYCLE_LEN directly as 18

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Updated.


// Set resolution to child level and initialize boundary walk.
// The first child level starts at walk position 0.
// Subsequent levels start at either 14 or 16, depending on
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 14 or 16?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just another case of me looking at a bunch of sequences, noticing the pattern, guessing, realizing it shifted between resolutions, and figuring out that 14 and 16 work, and checking that it always works.

// Advance to next edge. Increase the cyclic index by one, but
// we use +7 == +1 (mod 6) so that it stays positive,
// even after potentially subtracting by 2 above.
iter->_edgePos = (iter->_edgePos + 7) % 6;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could also define a const here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement iterator for edges in "Gosper Island"

4 participants