fix: add area column to CSVSink BASE_HEADER#2205

Open
Zeesejo wants to merge 1 commit into roboflow:develop from Zeesejo:fix/l100-002-csv-sink-area

@Zeesejo Zeesejo commented Apr 7, 2026

Problem

CSVSink writes bounding-box coordinates, confidence, class_id, and tracker_id to CSV but silently omits the area of each detection, even though Detections.area is a standard computed property. Users trying to log area either get an empty column or an AttributeError when passing it via custom_data.

Reported in #1397.


Root Cause

BASE_HEADER (the static list of always-written columns) never included "area":

# before
BASE_HEADER = [
    "x_min", "y_min", "x_max", "y_max",
    "class_id", "confidence", "tracker_id",
    # ← area missing
]

parse_detection_data() built each row from the xyxy array + optional fields, but never called detections.area.


Fix

Two small, self-contained changes to csv_sink.py:

1. BASE_HEADER — add "area"

BASE_HEADER = [
    "x_min", "y_min", "x_max", "y_max",
    "class_id", "confidence", "tracker_id",
    "area",   # ← added
]

2. parse_detection_data() — compute and include area per row

areas = detections.area          # computed once outside the loop
for i in range(len(detections.xyxy)):
    row = {
        ...,
        "area": str(areas[i]),   # ← added
    }

detections.area is computed once before the loop (not per-iteration) to avoid redundant work on large batches.
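The row-building change can be sketched without the library. This is a minimal stand-in, not the code in csv_sink.py: `box_areas`, `_cell`, and `build_rows` are hypothetical names, plain tuples replace the NumPy arrays, and only the bounding-box case is covered.

```python
# Hypothetical stand-in for the row-building step; not the library's code.
BASE_HEADER = [
    "x_min", "y_min", "x_max", "y_max",
    "class_id", "confidence", "tracker_id",
    "area",
]


def box_areas(xyxy):
    """Area of each (x_min, y_min, x_max, y_max) box: width * height."""
    return [(x_max - x_min) * (y_max - y_min) for x_min, y_min, x_max, y_max in xyxy]


def _cell(values, i):
    """Render an optional per-detection field; empty string when absent."""
    return "" if values is None else str(values[i])


def build_rows(xyxy, class_id=None, confidence=None, tracker_id=None):
    areas = box_areas(xyxy)  # computed once, before the loop
    rows = []
    for i, (x_min, y_min, x_max, y_max) in enumerate(xyxy):
        rows.append({
            "x_min": str(x_min),
            "y_min": str(y_min),
            "x_max": str(x_max),
            "y_max": str(y_max),
            "class_id": _cell(class_id, i),
            "confidence": _cell(confidence, i),
            "tracker_id": _cell(tracker_id, i),
            "area": str(areas[i]),  # indexed per row, not recomputed
        })
    return rows


rows = build_rows([(10, 20, 110, 120)], class_id=[0], confidence=[0.95])
print(rows[0]["area"])  # 10000
```

Hoisting the area computation out of the loop keeps the per-row work to a single index lookup, which is the point of the note above.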


Verification

import supervision as sv
import numpy as np

detections = sv.Detections(
    xyxy=np.array([[10, 20, 110, 120]]),  # area = 100*100 = 10000
    confidence=np.array([0.95]),
    class_id=np.array([0]),
)

with sv.CSVSink("out.csv") as sink:
    sink.append(detections)

# out.csv now contains an 'area' column; these integer boxes give 10000
# (a float xyxy array would give 10000.0)

Fixes #1397

CSVSink iterated over detections using __iter__, which yields
(xyxy, mask, confidence, class_id, tracker_id, data) tuples.
The .area property is not part of that tuple, so it was never
written to the CSV — accessing it via custom_data raised an
AttributeError (issue roboflow#1397).

Fix:
- Add 'area' to BASE_HEADER so it appears as a standard column.
- Compute detections.area once in parse_detection_data() and
  include area[i] for each row.

Fixes roboflow#1397
@Zeesejo Zeesejo requested a review from SkalskiP as a code owner April 7, 2026 13:19
Copilot AI review requested due to automatic review settings April 7, 2026 13:19

Copilot AI left a comment

Pull request overview

This PR fixes CSVSink serialization to include each detection’s computed area in the emitted CSV, addressing the gap reported in issue #1397.

Changes:

  • Add "area" to CSVSink’s BASE_HEADER so it’s always written.
  • Compute detections.area once and include the per-detection value in each serialized row.

"confidence",
"tracker_id",
"area",
]

Copilot AI Apr 7, 2026

Adding area to BASE_HEADER can produce duplicate area columns when callers still pass custom_data={'area': ...} (or if detections.data contains an area key). parse_field_names() currently appends all dynamic keys without excluding base fields, so the header can contain duplicate names and the resulting CSV becomes ambiguous. Consider filtering dynamic_header to remove any keys already in BASE_HEADER (and/or treating base-field overrides as an error).

Suggested change
]
]
BASE_HEADER_SET = set(BASE_HEADER)
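
The suggestion only defines the set; one way it might be used is sketched below. `merge_field_names` and its `strict` flag are hypothetical and do not exist in the library. The function assumes the dynamic keys arrive as an ordered iterable and shows both options from the comment: dropping colliding keys, or rejecting them.

```python
BASE_HEADER = [
    "x_min", "y_min", "x_max", "y_max",
    "class_id", "confidence", "tracker_id",
    "area",
]
BASE_HEADER_SET = set(BASE_HEADER)


def merge_field_names(dynamic_keys, strict=False):
    """Return base columns followed by dynamic keys that do not collide.

    Hypothetical helper: with strict=True a collision raises instead of
    being dropped silently.
    """
    dynamic_keys = list(dynamic_keys)
    collisions = [key for key in dynamic_keys if key in BASE_HEADER_SET]
    if strict and collisions:
        raise ValueError(f"keys collide with base columns: {collisions}")

    seen = set(BASE_HEADER_SET)
    extras = []
    for key in dynamic_keys:
        if key not in seen:  # skips base fields and repeated dynamic keys
            seen.add(key)
            extras.append(key)
    return BASE_HEADER + extras


print(merge_field_names(["frame", "area"]))
# ['x_min', 'y_min', 'x_max', 'y_max', 'class_id', 'confidence',
#  'tracker_id', 'area', 'frame']
```

Silent filtering keeps existing callers working, while the strict mode makes an accidental override visible; which behaviour fits better is a decision for the maintainers.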

Comment on lines 16 to 25
BASE_HEADER = [
"x_min",
"y_min",
"x_max",
"y_max",
"class_id",
"confidence",
"tracker_id",
"area",
]

Copilot AI Apr 7, 2026

This change updates the CSV schema (new area column) and will require updating existing CSVSink tests/fixtures that assert the header and row layouts (e.g., tests/detection/test_csv.py expected headers/rows). Please add/adjust test expectations to include area in the correct position and with the computed values so CI stays green.
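
The shape of such an expectation might look like the sketch below. It is an assumption-laden illustration: `write_rows` is a hypothetical stand-in built on the standard `csv` module, whereas a real test in the repository would exercise `sv.CSVSink` and its existing fixtures.

```python
import csv
import io

EXPECTED_HEADER = [
    "x_min", "y_min", "x_max", "y_max",
    "class_id", "confidence", "tracker_id",
    "area",
]


def write_rows(rows, stream):
    """Hypothetical stand-in for the sink: header first, then one line per row."""
    writer = csv.DictWriter(stream, fieldnames=EXPECTED_HEADER)
    writer.writeheader()
    writer.writerows(rows)


def test_area_is_last_base_column():
    stream = io.StringIO(newline="")
    write_rows(
        [{
            "x_min": "10", "y_min": "20", "x_max": "110", "y_max": "120",
            "class_id": "0", "confidence": "0.95", "tracker_id": "",
            "area": "10000",
        }],
        stream,
    )
    stream.seek(0)
    reader = csv.DictReader(stream)

    # Header order matters: 'area' must follow 'tracker_id'.
    assert reader.fieldnames == EXPECTED_HEADER
    row = next(reader)
    assert row["area"] == "10000"
    assert row["tracker_id"] == ""


test_area_is_last_base_column()
```

Asserting on the full header list, rather than only on membership, catches a column landing in the wrong position.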

Copilot generated this review using guidance from repository custom instructions.
@Borda Borda changed the title from "fix: add area column to CSVSink BASE_HEADER (issue #1397)" to "fix: add area column to CSVSink BASE_HEADER" Apr 8, 2026
@Borda Borda added the "waiting for author" and "bug" (Something isn't working) labels Apr 8, 2026

Development

Successfully merging this pull request may close these issues.

Save detection area with CSVSink
