[docs]: refresh KT install commands #1958
Conversation
Code Review
This pull request updates the documentation and installation guides to reflect the transition to PyPI-based package distribution and improved integration with LLaMA-Factory. Key changes include simplified installation commands, updated tutorial links, and more detailed training/inference instructions using accelerate. Feedback focuses on maintaining necessary dependency extras for LLaMA-Factory, correcting Python version compatibility notes, and ensuring consistency between model examples used in training and inference sections.
```shell
# Install environment following kt-sft/README.md
USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
cd /path/to/LLaMA-Factory
pip install -e .
```
Including the [torch,metrics] extras when installing LLaMA-Factory is recommended to ensure that all necessary dependencies for training and evaluation are installed, especially since the quick start does not explicitly install them earlier.
```diff
- pip install -e .
+ pip install -e ".[torch,metrics]"
```
```shell
# Install environment following kt-sft/README.md
USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
cd /path/to/LLaMA-Factory
pip install -e .
```
```diff
  # 1. Create a conda environment
- conda create -n Kllama python=3.12  # choose from: [3.10, 3.11, 3.12, 3.13]
+ conda create -n Kllama python=3.12  # choose from: [3.11, 3.12, 3.13]
```
The suggested Python versions [3.11, 3.12, 3.13] include 3.13, but kt-kernel/README.md (lines 63 and 68) indicates that pre-built wheels are currently only provided for Python 3.10, 3.11, and 3.12. Suggesting 3.13 may lead to a source build, which contradicts the goal of avoiding local compilation stated in line 98. Additionally, Python 3.10 is missing from the list despite being supported.
```diff
- conda create -n Kllama python=3.12  # choose from: [3.11, 3.12, 3.13]
+ conda create -n Kllama python=3.12  # choose from: [3.10, 3.11, 3.12]
```
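A quick shell check can warn before environment creation when the active interpreter would force a source build. This is a hedged sketch: the 3.10–3.12 range is taken from the wheel availability noted in the comment above, and `check_py_wheel` is a hypothetical helper name.

```shell
# Warn when the active Python has no pre-built kt-kernel wheel (3.10-3.12 assumed)
check_py_wheel() {
  minor="$1"
  if [ "$minor" -ge 10 ] && [ "$minor" -le 12 ]; then
    echo "ok: pre-built wheel available for Python 3.$minor"
  else
    echo "warning: Python 3.$minor likely requires building kt-kernel from source"
  fi
}

check_py_wheel "$(python3 -c 'import sys; print(sys.version_info.minor)')"
```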
```diff
  git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
  cd LLaMA-Factory
- pip install -e ".[torch,metrics]" --no-build-isolation
+ pip install -e .
```
Removing the [torch,metrics] extras from the LLaMA-Factory installation may cause the training and evaluation steps to fail if these dependencies are not already present in the environment. It is safer to include them to ensure the tutorial works as expected.
```diff
- pip install -e .
+ pip install -e ".[torch,metrics]"
```
```diff
- # 4. Install flash-attention, download the corresponding file based on your Python and Torch versions from: https://github.com/Dao-AILab/flash-attention/releases
- pip install flash_attn-2.8.3+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
+ pip install flash-attn --no-build-isolation
```
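When installing from a pre-built flash-attention wheel instead of compiling, the wheel filename must match the interpreter's CPython tag (the `cp312-cp312` part of the example filename above). A small snippet can print that tag; the Torch and CUDA components of the filename still have to be matched against the release page by hand.

```shell
# Print the CPython ABI tag (e.g. cp312) that must appear in the flash-attn wheel name
python3 -c 'import sys; print("cp{}{}".format(*sys.version_info[:2]))'
```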
### Core Feature 2: Chat with the fine-tuned model (base + LoRA adapter)
```diff
- Run the command: `llamafactory-cli chat examples/inference/deepseek3_lora_sft_kt.yaml`.
+ Run the command: `llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml`.
```
The command uses qwen3_lora_sft.yaml, but the tutorial is focused on DeepSeek-V3 (as seen in the training step at line 131 and the YAML example at line 221). This inconsistency will confuse users who just trained a DeepSeek-V3 model in the previous step. Please update the command to use the appropriate DeepSeek-V3 inference configuration.
```diff
- Run the command: `llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml`.
+ Run the command: `llamafactory-cli chat examples/inference/deepseek_v3_lora_sft.yaml`.
```
### Core Feature 3: Batch inference + metrics (base + LoRA adapter)
```diff
- Run the command: `API_PORT=8000 llamafactory-cli api examples/inference/deepseek3_lora_sft_kt.yaml`.
+ Run the command: `API_PORT=8000 llamafactory-cli api examples/inference/qwen3_lora_sft.yaml`.
```
This API command uses qwen3_lora_sft.yaml, which is inconsistent with the DeepSeek-V3 model used throughout the rest of the guide.
```diff
- Run the command: `API_PORT=8000 llamafactory-cli api examples/inference/qwen3_lora_sft.yaml`.
+ Run the command: `API_PORT=8000 llamafactory-cli api examples/inference/deepseek_v3_lora_sft.yaml`.
```
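Once the API server from this step is listening on port 8000, it exposes an OpenAI-compatible chat endpoint; a minimal smoke test might look like the sketch below. The `model` field value is an assumption for illustration — the server answers with whichever model it loaded.

```shell
# Smoke-test the OpenAI-compatible chat endpoint started by `llamafactory-cli api`
# (assumes the server from the previous step is listening on localhost:8000)
PAYLOAD='{"model": "deepseek-v3", "messages": [{"role": "user", "content": "Say hello"}]}'
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || echo "server not reachable (is the api step running?)"
```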
Summary: update the README, SFT, and kt-kernel install commands to the current KT inference and SFT paths while keeping the docs formatting minimal. Validation: `git diff --check`; confirmed the sglang-kt 0.6.1 wheel metadata does not depend on kt-kernel.