IndexError in model.inference when input contains empty strings #316

@tannonk

Description

When using knowledgator/gliner-x-base, inference fails with an IndexError on inputs containing an empty string.

Reproduction Steps

The following script demonstrates the problem:

from gliner import GLiNER

model = GLiNER.from_pretrained("knowledgator/gliner-x-base")

# The presence of the empty string in this list triggers the error
texts = ["Email CEO to approve budget", ""]
labels = ["person", "organization", "action"]

print("Running inference...")
predictions = model.inference(texts, labels, batch_size=16)
print(f"Results: {predictions}")

Traceback

Traceback (most recent call last):
  File "issue_repro.py", line 10, in <module>
    predictions = model.inference(texts, labels, batch_size=16)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../gliner/model.py", line 1290, in inference
    start_text_idx = start_token_idx_to_text_idx[start_token_idx]
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
IndexError: list index out of range

Expected Behavior

The model should handle empty strings gracefully by returning an empty list of entities at that index, e.g.:
[[{'start': 6, 'end': 9, 'text': 'CEO', 'label': 'person'}], []], as other GLiNER models do during standard inference.

Environment

  • GLiNER: v0.2.24
  • flash_attn: v2.7.4.post1+25.11
  • Model: knowledgator/gliner-x-base

Workaround

A quick fix is to skip output post-processing for empty-string inputs by modifying this section of model.py as follows:

all_entities = []
for i, output in enumerate(outputs):
    if not tokens[i]:  # FIX: empty-input case for models like knowledgator/gliner-x-base
        all_entities.append([])
        continue
    start_token_idx_to_text_idx = all_start_token_idx_to_text_idx[i]
    end_token_idx_to_text_idx = all_end_token_idx_to_text_idx[i]
    entities = []

But it would be better to handle this in the forward pass, so the model never produces ghost predictions on empty strings in the first place. Perhaps a single fix could address this and also resolve Issue #315?
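In the meantime, a caller-side workaround that doesn't touch the library code is to filter out empty inputs before calling inference and splice empty results back in afterwards. This is just a sketch: `inference_skip_empty` is a hypothetical helper, and `infer_fn` stands in for the real `model.inference` call.

```python
def inference_skip_empty(infer_fn, texts, labels, **kwargs):
    """Run infer_fn only on non-empty texts and return [] for the rest.

    infer_fn: any callable with model.inference's signature
    (this wrapper is hypothetical, not part of the GLiNER API).
    Empty and whitespace-only strings are both skipped.
    """
    # Keep (original index, text) pairs for inputs that have content
    keep = [(i, t) for i, t in enumerate(texts) if t.strip()]
    # Pre-fill the output with an empty entity list per input
    results = [[] for _ in texts]
    if keep:
        indices, subset = zip(*keep)
        preds = infer_fn(list(subset), labels, **kwargs)
        # Splice predictions back into their original positions
        for i, pred in zip(indices, preds):
            results[i] = pred
    return results
```

Called as `inference_skip_empty(model.inference, texts, labels, batch_size=16)`, this keeps the output aligned one-to-one with the input list.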
