Remove report_id from measurement submission#1195
Conversation
Codecov Report❌ Patch coverage is ❌ Your patch check has failed because the patch coverage (77.09%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## master #1195 +/- ##
==========================================
+ Coverage 91.84% 92.70% +0.86%
==========================================
Files 85 78 -7
Lines 7907 7388 -519
Branches 476 439 -37
==========================================
- Hits 7262 6849 -413
+ Misses 539 448 -91
+ Partials 106 91 -15
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| """ | ||
| test_name = str(content.get("test_name") or "").lower().replace("_", "") | ||
|
|
||
| cc = str(content.get("probe_cc") or "").upper().replace("_", "") |
There was a problem hiding this comment.
I think that since this is inside of the actual measurement content, we want to be a bit more strict about what we accept. That is to say we should enforce that the probe_cc is meeting all this criteria and not return a default value if it's inconsistent.
There was a problem hiding this comment.
Actually in looking at this further, I think the validation should live in the check function and in here you should just return the raw values as they as to perform validation later.
|
|
||
| raw_asn = str(content.get("probe_asn") or "AS0").upper() | ||
| if len(raw_asn) > 12 or len(raw_asn) < 3 or not raw_asn.startswith("AS"): | ||
| raw_asn = "AS0" |
There was a problem hiding this comment.
Same comment as above, the client should be setting a good value here, if they aren't we should be providing an error.
| Checks metadata consistency, raising an HTTPException and stopping | ||
| ingestion when an inconsistency is detected | ||
| """ | ||
| asn_i = normalize_asn(probe_asn) |
There was a problem hiding this comment.
I'm not sure it's correct to be doing normalization here, since the value is coming from the measurement body. If it's not in a format compatible with the spec, we should return an error to client.
| log.error(f"Unable to send report to fastpath. report_id: {report_id}") | ||
| log.error(f"Unable to send report to fastpath. measurement_uid: {msmt_uid}") | ||
| Metrics.MISSED_MSMNTS.inc() | ||
| return SubmitMeasurementResponse( |
There was a problem hiding this comment.
Do we need to return to the caller the report_id that was generated as well? I suppose it doesn't hurt since it's a value which we generated in here and the client has no other way of knowing it.
There was a problem hiding this comment.
I think the report_id is redundant to the measurement_uid in this context (it's unique per measurement). The best value we could get out of it is restoring it in the future, but in that case the probe would probably already know it before sending the measurement
But I agree it wouldn't hurt to have add it
There was a problem hiding this comment.
Norbel suggested that we keep the report_id in the response model, so I added that 👍
| metadata = metadata_from_measurement_content(content) | ||
|
|
||
| json["is_verified"] = "u" | ||
| json["content"]["report_id"] = generate_report_id( |
There was a problem hiding this comment.
I don't think we can do this, because it will otherwise result in an inconsistency between what is inside of the measurement already (that the user got through other means) and what we are replacing it with.
In this code path, I think we should be using the report_id that's taken from the URL.
There was a problem hiding this comment.
If we do this, we would still have to require the client to make a call to the generate report id function (or change it to generate the report_id on its own?). Which is fine, but we would only remove the report_id requirement from the anonc endpoint and not from the legacy one. Is this what we want?
There was a problem hiding this comment.
The /report/{report_id} API call already assumes this as the report_id is encoded inside of the URL path and it cannot work without it: https://github.com/ooni/backend/pull/1195/changes#diff-fd68970e397912b47cad3b7957ef7c811249f632e065d349a0161f06fa516591R103.
I don't think we can get rid of requiring the report_id in a way that's backward compatible.
This PR will remove the requirement of a report id from measurement submission
/receive_measurementendpoint will still expect thereport_idparameter, but this will no longer be used. Instead, a pseudo-report_id will be constructed using metadata from the measurement body itself/submit_measurement(the new anonymous credentials version of measurement submission) will remove it entirely from the pathNote that the
report_idwill no longer be usable to group measurements that were taken in the same session, as every measurement will have a distinctreport_idAfter this PR, the
report_idwill be considered a legacy key, only kept for backwards compatibilityA call to
POST /reportto get thereport_idwill no longer be required before submitting measurements.closes #1139