fix: add area column to CSVSink BASE_HEADER#2205
fix: add area column to CSVSink BASE_HEADER#2205Zeesejo wants to merge 1 commit intoroboflow:developfrom
Conversation
CSVSink iterated over detections using __iter__, which yields (xyxy, mask, confidence, class_id, tracker_id, data) tuples. The .area property is not part of that tuple, so it was never written to the CSV — accessing it via custom_data raised an AttributeError (issue roboflow#1397). Fix: - Add 'area' to BASE_HEADER so it appears as a standard column. - Compute detections.area once in parse_detection_data() and include area[i] for each row. Fixes roboflow#1397
There was a problem hiding this comment.
Pull request overview
This PR fixes CSVSink serialization to include each detection’s computed area in the emitted CSV, addressing the gap reported in issue #1397.
Changes:
- Add
"area"toCSVSink’sBASE_HEADERso it’s always written. - Compute
detections.areaonce and include the per-detection value in each serialized row.
| "confidence", | ||
| "tracker_id", | ||
| "area", | ||
| ] |
There was a problem hiding this comment.
Adding area to BASE_HEADER can produce duplicate area columns when callers still pass custom_data={'area': ...} (or if detections.data contains an area key). parse_field_names() currently appends all dynamic keys without excluding base fields, so the header can contain duplicate names and the resulting CSV becomes ambiguous. Consider filtering dynamic_header to remove any keys already in BASE_HEADER (and/or treating base-field overrides as an error).
| ] | |
| ] | |
| BASE_HEADER_SET = set(BASE_HEADER) |
| BASE_HEADER = [ | ||
| "x_min", | ||
| "y_min", | ||
| "x_max", | ||
| "y_max", | ||
| "class_id", | ||
| "confidence", | ||
| "tracker_id", | ||
| "area", | ||
| ] |
There was a problem hiding this comment.
This change updates the CSV schema (new area column) and will require updating existing CSVSink tests/fixtures that assert the header and row layouts (e.g., tests/detection/test_csv.py expected headers/rows). Please add/adjust test expectations to include area in the correct position and with the computed values so CI stays green.
Problem
CSVSinkwrites bounding-box coordinates,confidence,class_id, andtracker_idto CSV but silently omits theareaof each detection, even thoughDetections.areais a standard computed property. Users trying to log area either get an empty column or anAttributeErrorwhen passing it viacustom_data.Reported in #1397.
Root Cause
BASE_HEADER(the static list of always-written columns) never included"area":parse_detection_data()built each row from thexyxyarray + optional fields, but never calleddetections.area.Fix
Two small, self-contained changes to
csv_sink.py:1.
BASE_HEADER— add"area"2.
parse_detection_data()— compute and include area per rowdetections.areais computed once before the loop (not per-iteration) to avoid redundant work on large batches.Verification
Fixes #1397