KDA MTP (Multi-Token Prediction) support #17

### Description

Add Multi-Token Prediction (MTP) support to cuLA's inference kernels.

### Context

MTP is an inference optimization in which the model predicts several tokens per decoding step rather than one, improving throughput for autoregressive generation. Supporting MTP in cuLA's linear attention kernels would enable faster inference for models that use it.
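To make the kernel-side requirement concrete, here is a minimal pure-Python sketch of a multi-token decode step. It assumes MTP is used in the draft-and-verify style common in speculative decoding, which this issue does not state, and it uses a simplified decayed linear-attention recurrence (`S <- diag(g) S + k v^T`, `o = S^T q`) rather than KDA's or Lightning Attention's actual update rule. The names `mtp_decode_step` and `commit_accepted` are hypothetical and are not part of cuLA.

```python
def mtp_decode_step(state, qs, ks, vs, gs):
    """Run T draft tokens through a simplified decayed linear-attention recurrence.

    state: d_k x d_v nested list, the recurrent state before this step.
    qs, ks, gs: T lists of length d_k (query, key, per-channel decay).
    vs: T lists of length d_v (value).
    Returns (outputs, states), where states[t] is the state after token t.
    NOTE: illustrative recurrence only, not the actual KDA update.
    """
    outputs, states = [], []
    for q, k, v, g in zip(qs, ks, vs, gs):
        # S <- diag(g) S + k v^T (builds a new list, so the input is not mutated)
        state = [[g[i] * state[i][j] + k[i] * v[j] for j in range(len(v))]
                 for i in range(len(k))]
        # o = S^T q
        out = [sum(q[i] * state[i][j] for i in range(len(k)))
               for j in range(len(v))]
        outputs.append(out)
        states.append(state)
    return outputs, states


def commit_accepted(initial_state, states, num_accepted):
    """Select the state to carry forward once num_accepted draft tokens pass verification."""
    return initial_state if num_accepted == 0 else states[num_accepted - 1]
```

Keeping the per-token states is one possible design: if only the first `n` draft tokens are accepted, `commit_accepted(s0, states, n)` returns the same state that `n` single-token decode steps would have produced, so no recomputation is needed. A real kernel would trade this extra state storage against recomputing from the last committed state.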

### Tasks

- [ ] Design MTP integration for linear attention inference kernels
- [ ] Implement MTP support in relevant kernels (KDA, Lightning Attention)
- [ ] Add tests and benchmarks
- [ ] Document usage
