
[docs]: refresh KT install commands #1958

Merged
JimmyPeilinLi merged 1 commit into main from docs-v061-refresh on Apr 26, 2026

Conversation

@JimmyPeilinLi (Collaborator)

Summary: update the README, SFT, and kt-kernel install commands to the current KT inference and SFT paths while keeping docs formatting minimal. Validation: ran `git diff --check`; confirmed that the sglang-kt 0.6.1 wheel metadata does not declare a dependency on kt-kernel.
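
That metadata check can be reproduced from the shell; a minimal sketch, assuming the package is published on PyPI under the name sglang-kt with the standard wheel layout:

```shell
# Fetch only the wheel (no install, no dependencies), then look for a
# kt-kernel requirement in its METADATA file.
pip download sglang-kt==0.6.1 --no-deps --only-binary=:all: -d /tmp/sglang-kt
unzip -p /tmp/sglang-kt/sglang_kt-0.6.1-*.whl '*.dist-info/METADATA' \
  | grep -i 'Requires-Dist:.*kt-kernel' \
  || echo "no kt-kernel dependency declared"
```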

@gemini-code-assist (Bot) left a comment

Code Review

This pull request updates the documentation and installation guides to reflect the transition to PyPI-based package distribution and improved integration with LLaMA-Factory. Key changes include simplified installation commands, updated tutorial links, and more detailed training/inference instructions using accelerate. Feedback focuses on maintaining necessary dependency extras for LLaMA-Factory, correcting Python version compatibility notes, and ensuring consistency between model examples used in training and inference sections.

Comment thread: README.md

```shell
# Install environment following kt-sft/README.md
USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
cd /path/to/LLaMA-Factory
pip install -e .
```

Contributor

Severity: medium

Including the [torch,metrics] extras when installing LLaMA-Factory is recommended to ensure that all necessary dependencies for training and evaluation are installed, especially since the quick start does not explicitly install them earlier.

Suggested change:

```diff
-pip install -e .
+pip install -e ".[torch,metrics]"
```

Comment thread: README_ZH.md

```shell
# Install environment following kt-sft/README.md
USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
cd /path/to/LLaMA-Factory
pip install -e .
```

Contributor

Severity: medium

It is recommended to include the [torch,metrics] extras when installing LLaMA-Factory, to ensure that all dependencies required for training and evaluation are installed.

Suggested change:

```diff
-pip install -e .
+pip install -e ".[torch,metrics]"
```

```diff
 # 1. Create a conda environment
-conda create -n Kllama python=3.12 # choose from : [3.10, 3.11, 3.12, 3.13]
+conda create -n Kllama python=3.12 # choose from : [3.11, 3.12, 3.13]
```

Contributor

Severity: medium

The suggested Python versions [3.11, 3.12, 3.13] include 3.13, but kt-kernel/README.md (lines 63 and 68) indicates that pre-built wheels are currently only provided for Python 3.10, 3.11, and 3.12. Suggesting 3.13 may lead to a source build, which contradicts the goal of avoiding local compilation stated in line 98. Additionally, Python 3.10 is missing from the list despite being supported.

Suggested change:

```diff
-conda create -n Kllama python=3.12 # choose from : [3.11, 3.12, 3.13]
+conda create -n Kllama python=3.12 # choose from : [3.10, 3.11, 3.12]
```
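
To check up front whether a pre-built wheel matches the chosen interpreter, pip can be asked for a binary-only download; a minimal sketch, assuming kt-kernel is distributed on PyPI:

```shell
# Succeeds only if a pre-built wheel exists for the active Python;
# otherwise pip errors out instead of silently building from source.
pip download kt-kernel --no-deps --only-binary=:all: -d /tmp/kt-wheels
```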

```diff
 git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
 cd LLaMA-Factory
-pip install -e ".[torch,metrics]" --no-build-isolation
+pip install -e .
```

Contributor

Severity: medium

Removing the [torch,metrics] extras from the LLaMA-Factory installation may cause the training and evaluation steps to fail if these dependencies are not already present in the environment. It is safer to include them to ensure the tutorial works as expected.

Suggested change:

```diff
-pip install -e .
+pip install -e ".[torch,metrics]"
```


```diff
 # 4. Install flash-attention, download the corresponding file based on your Python and Torch versions from: https://github.com/Dao-AILab/flash-attention/releases
-pip install flash_attn-2.8.3+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
+pip install flash-attn --no-build-isolation
```

Contributor

Severity: medium

The command has been updated to use PyPI, but the preceding comment (line 116) still instructs the user to download a wheel file from GitHub. Please update the comment to be consistent with the new installation method.
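
For readers who keep the GitHub-releases route as a fallback, the tags needed to match a wheel filename can be printed directly; a minimal sketch using standard Python and torch introspection (nothing flash-attention-specific):

```shell
# Prints the pieces of a wheel filename such as
# flash_attn-2.8.3+cu12torch2.7...-cp312-cp312-linux_x86_64.whl
python - <<'EOF'
import sys, torch
print(f"python tag: cp{sys.version_info.major}{sys.version_info.minor}")
print(f"torch: {torch.__version__}, cuda: {torch.version.cuda}")
EOF
```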

```diff
 ### Core Feature 2: Chat with the fine-tuned model (base + LoRA adapter)

-Run the command: `llamafactory-cli chat examples/inference/deepseek3_lora_sft_kt.yaml`.
+Run the command: `llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml`.
```

Contributor

Severity: medium

The command uses qwen3_lora_sft.yaml, but the tutorial is focused on DeepSeek-V3 (as seen in the training step at line 131 and the YAML example at line 221). This inconsistency will confuse users who just trained a DeepSeek-V3 model in the previous step. Please update the command to use the appropriate DeepSeek-V3 inference configuration.

Suggested change:

```diff
-Run the command: `llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml`.
+Run the command: `llamafactory-cli chat examples/inference/deepseek_v3_lora_sft.yaml`.
```

```diff
 ### Core Feature 3: Batch inference + metrics (base + LoRA adapter)

-Run the command: `API_PORT=8000 llamafactory-cli api examples/inference/deepseek3_lora_sft_kt.yaml`.
+Run the command: `API_PORT=8000 llamafactory-cli api examples/inference/qwen3_lora_sft.yaml`.
```

Contributor

Severity: medium

This API command uses qwen3_lora_sft.yaml, which is inconsistent with the DeepSeek-V3 model used throughout the rest of the guide.

Suggested change:

```diff
-Run the command: `API_PORT=8000 llamafactory-cli api examples/inference/qwen3_lora_sft.yaml`.
+Run the command: `API_PORT=8000 llamafactory-cli api examples/inference/deepseek_v3_lora_sft.yaml`.
```
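
Once the API server is running, the endpoint can be smoke-tested with a plain OpenAI-style request; a minimal sketch, assuming the OpenAI-compatible route that `llamafactory-cli api` exposes and an illustrative model name:

```shell
# Send a single chat completion to the locally served model; the model
# field and port are illustrative and should match your YAML config.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-v3-lora", "messages": [{"role": "user", "content": "Hello!"}]}'
```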

@JimmyPeilinLi merged commit 0656e01 into main on Apr 26, 2026
6 checks passed