Adapt to latest vllm changes #632
eero-t
left a comment
Approved, matches changes for "GenAIExamples" in opea-project/GenAIExamples#1210 (and corresponding PR for "GenAIComps" repo).
@poussa ?
Investigating the CI failure for "agent, gaudi, ci-gaudi-values, common" test, I see 2 bugs:
(Besides the size, I think another model would be a nicer default because of the license on Meta's models.)
My vLLM PR includes the same agent and (relevant) vLLM component changes as yours, but strangely the same CI agent test succeeded for it: https://github.com/opea-project/GenAIInfra/actions/runs/12262626198/job/34212355870?pr=610
EDIT: today's push to my PR hit the same issue.
eero-t
left a comment
The --tensor-parallel-size option can be dropped, since 1 is the default value:
https://docs.vllm.ai/en/latest/usage/engine_args.html
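To illustrate the suggestion above, here is a minimal sketch of pruning arguments that only restate a default. The helper `drop_default_args` and the `DEFAULTS` table are hypothetical and not part of this PR or of vLLM; the only default assumed is the one stated in the comment (`--tensor-parallel-size` is 1), and the model name is a placeholder.

```python
# Hypothetical helper, not part of vLLM or this PR: removes command-line
# flags whose value equals a known default, so the argument list stays minimal.

# Assumption: only the default mentioned in the review comment is listed here.
DEFAULTS = {"--tensor-parallel-size": "1"}


def drop_default_args(args, defaults=DEFAULTS):
    """Return a copy of args without flags that merely restate a default."""
    cleaned = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg.startswith("--") and "=" in arg:
            # Handle the "--flag=value" form.
            flag, value = arg.split("=", 1)
            if defaults.get(flag) == value:
                i += 1
                continue
        elif arg in defaults and i + 1 < len(args) and args[i + 1] == defaults[arg]:
            # Handle the "--flag value" form: skip both the flag and its value.
            i += 2
            continue
        cleaned.append(arg)
        i += 1
    return cleaned


if __name__ == "__main__":
    # "example/model" is a placeholder, not the chart's actual default model.
    original = ["--model", "example/model", "--tensor-parallel-size", "1"]
    print(drop_default_args(original))
```

A non-default value such as `--tensor-parallel-size 2` is left untouched, so the sketch only removes arguments that have no effect.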
We need to wait for PR #642 to land first.
- Remove --enforce-eager on hpu to improve performance
- Refactor to the upstream docker entrypoint changes

Fixes issue opea-project#631.

Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
This is included in #610, so reviewing and merging that instead is another option.
Let's use #610 instead.
Description
Issues
Fixes #631.