Skip to content

Fix retrace2#911

Merged
simsong merged 9 commits intomainfrom
fix-retrace2
Apr 14, 2026
Merged

Fix retrace2#911
simsong merged 9 commits intomainfrom
fix-retrace2

Conversation

@simsong
Copy link
Copy Markdown
Member

@simsong simsong commented Apr 13, 2026

Following #903.

Codex discovered the problem with the -1 button: "I found a strong candidate: after tracing completes, the page constructs a second TracerController on the same DOM without removing the first controller’s event handlers. That means the frame buttons get rebound on top of the old handlers, which is exactly the kind of bug that can make -1 behave inconsistently after tracing. I’m fixing that by making the movie controls rebind cleanly instead of stacking handlers."

@simsong
Copy link
Copy Markdown
Member Author

simsong commented Apr 13, 2026

Codex:

I fixed the retrace path in both the browser and the Lambda queue flow.

The core bug was a race: the browser started polling immediately, before the retrace request had flipped the movie out of tracing completed, so it could reload the old traced ZIP and stop polling before the new retrace really ran. I fixed that by marking the movie as tracing before queueing work in movie_glue.py and only starting the browser poll after the trace request is accepted in canvas_tracer_controller.js. I also preserved the starting frame across the ZIP reload and now report completion in the existing big status area as Retraced from frame N to frame M. while keeping the player on the frame where retracing started.

I also removed the noisy console output you called out from canvas_tracer_controller.js and canvas_movie_controller.js: no more entries=, metdata_frames=, movie_frmes=, polling dumps, or browser api_key logging. On the backend, I stopped Lambda from logging the full HTTP event in main.py and removed a few raw API-key debug logs in apikey.py and odb.py.

I ran:

  • poetry run pytest -q lambda-resize/tests/queue_tracing_test.py
  • poetry run pytest -q tests/canvas_controller_syntax_test.py

Both passed. I have not been able to fully browser-drive the retrace flow from this sandbox, so the next useful check is your exact manual path again: go to frame 5, move a point, retrace, and confirm it stays on frame 5 and reports the retraced range in the status text.

@simsong simsong marked this pull request as ready for review April 13, 2026 01:25
Copilot AI review requested due to automatic review settings April 13, 2026 01:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses inconsistent tracer/movie control behavior after retracing by ensuring UI event handlers are rebound cleanly (instead of stacking), while also shifting the frontend back to using a globally loaded jQuery instance (with utils.js acting as an ES-module shim).

Changes:

  • Rebind movie/tracer control handlers using .off(...).on(...) to prevent duplicate listeners when controllers are reconstructed.
  • Restore global jQuery usage across templates and update utils.js to re-export the global jQuery $.
  • Update local dev/CI scripts and docs (including moving local Lambda to port 9811 and switching local service start/stop commands).

Reviewed changes

Copilot reviewed 27 out of 31 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/app/static/canvas_movie_controller.js Adds bind_movie_controls() and rebinding via .off().on() to avoid stacked handlers.
src/app/static/canvas_tracer_controller.js Rebinds tracer-specific handlers and fixes selector usage when disabling inputs during tracking.
src/app/static/utils.js Replaces custom DOM helper with a shim that exports the global jQuery instance.
src/app/templates/base.html Loads jquery-3.7.1.min.js globally before module scripts.
src/app/templates/demo_tracer1.html / demo_tracer2.html / demo_tracer3.html Ensures demo pages also load jQuery globally.
src/app/static/jquery-3.7.1.min.js Vendors jQuery for local/static serving.
jstests/setupTests.js Loads vendored jQuery into the jsdom environment for Jest tests.
jstests/bulk_register_users.test.js Updates tests to stub $.post (jQuery) instead of fetch.
src/app/local_lambda_debug.py, Makefile, tests/apikey_lambda_base_test.py, docs Moves local Lambda default/expected port from 9001 to 9811.
tests/fixtures/local_aws.py, Makefile*, .github/workflows/ci-cd.yml, docs Switches local MinIO/DynamoDB control calls to python3 bin/local_services.py ... and removes old bash scripts.
src/app/static/video.html Adds jQuery script include (but file remains under static/).
Comments suppressed due to low confidence (1)

src/app/static/video.html:14

  • src/app/static/video.html is served as a static asset, but it now includes Jinja ({{ url_for(...) }}) and already contains {% block %} tags. Those template directives will not be rendered when served from /static/, so the script URL will be literally {{ url_for(...) }} and fail to load. Either move this file under src/app/templates/ and render it via Flask/Jinja, or remove Jinja syntax and use a plain relative/absolute URL. Also note planttracer.js is an ES module (uses import) and should be loaded with type="module" if this page is meant to execute it.
    <script src="{{ url_for('static', filename='jquery-3.7.1.min.js') }}"></script>
    <script type="text/javascript" charset="utf8" src="static/planttracer.js"></script>

    {% block head %}{% endblock %}
    {% block script %}{% endblock %}

Starts the database if it is not running.
"""
subprocess.call( [os.path.join(ROOT_DIR,'bin/local_dynamodb_control.bash'),'start'])
subprocess.call(["python3", os.path.join(ROOT_DIR, "bin/local_services.py"), "dynamodb", "start"])
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bin/local_services.py is invoked here, but it doesn't exist in this branch (the bin/ directory currently contains only .gitignore). As a result, the local DynamoDB fixture will fail to start the service. Add/commit bin/local_services.py (or switch back to an existing start script) so this call succeeds in a clean checkout.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

"""
if os.environ.get(C.AWS_REGION, '') == 'local':
subprocess.call([os.path.join(ROOT_DIR, 'bin/local_minio_control.bash'), 'start'])
subprocess.call(["python3", os.path.join(ROOT_DIR, "bin/local_services.py"), "minio", "start"])
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bin/local_services.py is invoked here, but the script is not present in the repo, so this fixture will fail when AWS_REGION=local. Ensure bin/local_services.py is added/committed (or update the command to an existing script) before relying on it in tests.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added.

Comment thread Makefile
Comment on lines +318 to +321
python3 bin/local_services.py dynamodb start

stop_local_dynamodb: bin/DynamoDBLocal.jar
bash bin/local_dynamodb_control.bash stop
python3 bin/local_services.py dynamodb stop
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These make targets now call python3 bin/local_services.py ..., but bin/local_services.py is not present in the repository (so make start_local_dynamodb/stop_local_dynamodb will fail on a fresh checkout). Add/commit the new script or update these targets to call an existing implementation.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Comment on lines +62 to +66
python3 bin/local_services.py dynamodb start

- name: Validate local minio/S3 endpoint and make sure we have buckets
run: |
bin/local_minio_control.bash start
python3 bin/local_services.py minio start
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI now starts local DynamoDB/MinIO via python3 bin/local_services.py ..., but bin/local_services.py is not present in the repo. This will cause the workflow to fail immediately. Add/commit the script (or revert to the previous start commands) so CI can bring up its local services.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 72.09302% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.98%. Comparing base (71681a6) to head (3a86747).
⚠️ Report is 10 commits behind head on main.

Files with missing lines Patch % Lines
lambda-resize/src/resize_app/main.py 14.28% 6 Missing ⚠️
lambda-resize/src/resize_app/movie_glue.py 83.33% 2 Missing ⚠️
src/app/odb.py 0.00% 2 Missing ⚠️
src/app/apikey.py 80.00% 1 Missing ⚠️
src/app/static/utils.js 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #911      +/-   ##
==========================================
+ Coverage   57.02%   58.98%   +1.95%     
==========================================
  Files          60       60              
  Lines        5929     5627     -302     
  Branches      271      151     -120     
==========================================
- Hits         3381     3319      -62     
+ Misses       2416     2261     -155     
+ Partials      132       47      -85     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 28 out of 32 changed files in this pull request and generated 5 comments.

Comment on lines 388 to +392
data,
attempt,
url,
body: body,
movie_id: self.movie_id,
frame_start: self.pending_trace_start_frame,
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the HTTP failure log object, frame_start is taken from self.pending_trace_start_frame, but that field is set to null earlier in the same failure path. As a result the log will always show frame_start: null; capture the start frame in a local variable (or log retraceStartFrame) before clearing state.

Copilot uses AI. Check for mistakes.
Comment on lines 410 to 414
error: err && err.message,
url,
body: body,
movie_id: self.movie_id,
frame_start: self.pending_trace_start_frame,
});
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue in the network-error failure log: frame_start is logged from self.pending_trace_start_frame after it has been cleared to null, so the log loses the actual retrace start frame. Log the captured start frame value (or clear state after logging).

Copilot uses AI. Check for mistakes.
Comment on lines 173 to +177
frame_start = body.get("frame_start", 0)
LOGGER.info("trace-movie. movie_id=%s", movie_id)
return movie_glue.queue_tracing(api_key, movie_id, frame_start)
try:
movie_glue.prepare_tracing_request(api_key=api_key, movie_id=movie_id, frame_start=frame_start)
return movie_glue.queue_tracing(api_key, movie_id, frame_start)
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prepare_tracing_request() normalizes frame_start to an int, but the subsequent call to queue_tracing() uses the original frame_start value from the request body. If the body provides a string (common for JSON), the queued message may carry a non-int frame_start and diverge from what was validated/cleared. Use the normalized frame_start returned by prepare_tracing_request() (or cast once and reuse for both calls).

Copilot uses AI. Check for mistakes.
Comment on lines +178 to +180
except ValueError as e:
LOGGER.exception("trace-movie rejected: %s", e)
return Response(status_code=403, body=str(e.args))
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error response uses body=str(e.args), which stringifies a tuple (e.g. "('api_key required',)") rather than a clean message. Return str(e) (and ideally a JSON body consistent with other endpoints) so callers get a stable, readable error string.

Copilot uses AI. Check for mistakes.
Comment thread src/app/static/video.html Outdated
@simsong simsong merged commit 695a7b8 into main Apr 14, 2026
6 checks passed
@simsong simsong deleted the fix-retrace2 branch April 14, 2026 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants