A Guide to LlamaFactory, a Fine-Tuning Framework for Multimodal Large Models


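LlamaFactory is a one-stop training and fine-tuning tool for large models that helps research institutions, enterprise R&D teams, and individual developers build and deploy AI applications quickly. It aims to provide a simple, efficient, and flexible end-to-end solution. Built around "low barrier to entry, high efficiency, strong extensibility", the platform combines an integrated toolchain, a visual interface, and automated workflows to lower the technical cost of customizing and optimizing large models, so that users can close the loop from development and debugging to production deployment.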


Official documentation:

https://llamafactory.readthedocs.io/zh-cn/latest/

Installation

Install LlamaFactory with the uv tool.

Clone the repository

git clone --depth 1 https://github.com/hiyouga/LlamaFactory.git

Install with uv

cd LlamaFactory
uv sync

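A single uv sync command installs LlamaFactory. uv creates an isolated virtual environment and resolves the dependency versions declared by the project, so version mismatches are unlikely.

Verification

Open the web UI that ships with LlamaFactory: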

uv run llamafactory-cli webui


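If the page opens normally, the installation succeeded.

Basic usage

LlamaFactory can be used in two modes: the web UI and the command line. This section briefly covers the command line.

The basic command-line functions are:

Training
Export
Inference
Evaluation

The general command-line form is llamafactory-cli + task + config file.

The task type is selected by the task argument, for example:

train: training
export: export
chat: inference
eval: evaluation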
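
The llamafactory-cli + task + config file pattern can be wrapped in a small helper. This is only an illustrative sketch: build_command, run, and TASKS are hypothetical names that are not part of LlamaFactory, and the uv run prefix assumes the uv-based installation described earlier.

```python
import subprocess

# Task names taken from the list above.
TASKS = ("train", "export", "chat", "eval")


def build_command(task: str, config_path: str, use_uv: bool = True) -> list[str]:
    """Return the argv list for `llamafactory-cli <task> <config>`."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    prefix = ["uv", "run"] if use_uv else []
    return prefix + ["llamafactory-cli", task, config_path]


def run(task: str, config_path: str) -> int:
    """Run the command and return its exit code (needs LlamaFactory installed)."""
    return subprocess.run(build_command(task, config_path)).returncode
```

Keeping the command as a list rather than a single string avoids shell quoting problems when a config path contains spaces.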

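The config file is a YAML file. The example files are clearly named and contain the training parameters and the task configuration parameters.

For training, the official repository provides many example files covering full-parameter training, LoRA fine-tuning, QLoRA fine-tuning, and other methods.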


Training

uv run llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml

# examples/train_lora/llama3_lora_sft.yaml
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
output_dir: saves/llama3-8b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy:

Export

llamafactory-cli export merge_config.yaml

# examples/merge_lora/llama3_lora_sft.yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft
template: llama3
finetuning_type: lora
### export
export_dir: models/llama3_lora_sft
export_size: 2
export_device: cpu
export_legacy_format: false
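
The export config points back at the training run: adapter_name_or_path in the export file is the output_dir of the training file, while the base model and template stay the same. A minimal sketch of that relationship follows. The helper names export_config_from_training and to_yaml_lines are hypothetical, and the renderer only handles flat scalar values, so it is not a general YAML writer.

```python
def export_config_from_training(train_cfg: dict, export_dir: str) -> dict:
    """Build an export (merge) config that points at a finished LoRA run."""
    return {
        "model_name_or_path": train_cfg["model_name_or_path"],
        # The adapter lives where training wrote its output.
        "adapter_name_or_path": train_cfg["output_dir"],
        "template": train_cfg["template"],
        "finetuning_type": train_cfg["finetuning_type"],
        "export_dir": export_dir,
    }


def to_yaml_lines(cfg: dict) -> str:
    """Render a flat dict of scalars as YAML (booleans become true/false)."""
    def fmt(value):
        return str(value).lower() if isinstance(value, bool) else str(value)
    return "\n".join(f"{key}: {fmt(value)}" for key, value in cfg.items())


train_cfg = {
    "model_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct",
    "output_dir": "saves/llama3-8b/lora/sft",
    "template": "llama3",
    "finetuning_type": "lora",
}
print(to_yaml_lines(export_config_from_training(train_cfg, "models/llama3_lora_sft")))
```

Fields such as export_size and export_device are left out of the sketch and would still need to be added by hand.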

Inference

llamafactory-cli chat inference_config.yaml

# examples/inference/llama3.yaml
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3
infer_backend: huggingface # choices: [huggingface, vllm]

Evaluation

llamafactory-cli eval examples/train_lora/llama3_lora_eval.yaml

# examples/train_lora/llama3_lora_eval.yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft
# optional
### method
finetuning_type: lora
### dataset
task: mmlu_test # mmlu_test, ceval_validation, cmmlu_test
template: fewshot
lang: en
n_shot: 5
### output
save_dir: saves/llama3-8b/lora/eval
### eval
batch_size: 4

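Fine-tuning Qwen3 VL

Model preparation

llamafactory-cli can download models automatically, but downloads from mainland China sometimes time out, so a domestic mirror site is recommended. Run the following in the shell: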

export HF_ENDPOINT="https://hf-mirror.com"
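
An export only affects the current shell session. When training is launched from a Python script instead, the same variable can be passed to the child process. The sketch below is one way to do that; mirror_env and HF_MIRROR are hypothetical names used for illustration.

```python
import os

HF_MIRROR = "https://hf-mirror.com"  # mirror URL from the export command above


def mirror_env(base_env=None) -> dict:
    """Return a copy of the environment with HF_ENDPOINT set to the mirror."""
    env = dict(os.environ if base_env is None else base_env)
    env["HF_ENDPOINT"] = HF_MIRROR
    return env
```

The returned dict can be handed to subprocess.run(..., env=mirror_env()) so that only the launched command sees the mirror setting.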

The model used here is Qwen/Qwen3-VL-2B-Instruct.

Data preparation

In LlamaFactory, dataset configuration is centralized in data/dataset_info.json.

{
"identity": {"file_name": "identity.json"},
"alpaca_en_demo": {"file_name": "alpaca_en_demo.json"},
"alpaca_zh_demo": {"file_name": "alpaca_zh_demo.json"},
"glaive_toolcall_en_demo": {
"file_name": "glaive_toolcall_en_demo.json",
"formatting": "sharegpt",
"columns": {"messages": "conversations", "tools": "tools"}
}
}

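dataset_info.json is a JSON file that registers the datasets available for use. Each key is a dataset name, and each value holds that dataset's parameters.

For example:

Dataset name: alpaca_en_demo
Dataset path: alpaca_en_demo.json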
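
The name-to-file lookup can be sketched in a few lines of Python. The excerpt is copied from the listing above; describe_dataset is a hypothetical helper, and treating a missing formatting key as alpaca is an assumption about the default rather than something stated in this article.

```python
import json

# Excerpt of data/dataset_info.json, copied from the listing above.
DATASET_INFO = json.loads("""
{
  "identity": {"file_name": "identity.json"},
  "alpaca_en_demo": {"file_name": "alpaca_en_demo.json"},
  "glaive_toolcall_en_demo": {
    "file_name": "glaive_toolcall_en_demo.json",
    "formatting": "sharegpt",
    "columns": {"messages": "conversations", "tools": "tools"}
  }
}
""")


def describe_dataset(info: dict, name: str) -> tuple[str, str]:
    """Return (file_name, formatting) for a registered dataset name."""
    if name not in info:
        raise KeyError(f"dataset {name!r} is not registered in dataset_info.json")
    entry = info[name]
    # Assumption: entries without a "formatting" key are treated as alpaca.
    return entry["file_name"], entry.get("formatting", "alpaca")
```

A lookup like this is also a cheap way to catch a dataset name in a training config that does not match any key registered in dataset_info.json.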

As for the format of the dataset files themselves, LlamaFactory currently supports two data formats: alpaca and sharegpt.

alpaca:

[{"instruction":"Describe a process of making crepes.","input":"","output":"Making crepes is an easy and delicious process! Enjoy!"},{"instruction":"Transform the following sentence using a synonym: The car sped quickly.","input":"","output":"The car accelerated rapidly."}]

sharegpt:

[{"messages":[{"content":"<audio>What's that sound?","role":"user"},{"content":"It is the sound of glass shattering.","role":"assistant"}],"audios":["mllm_demo_data/1.mp3"]}]
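
The two formats carry the same information in different shapes: alpaca stores a single turn as instruction/input/output fields, while the message-list style stores an ordered list of role/content turns. The sketch below converts one alpaca record into a message list to make the correspondence concrete. alpaca_to_messages is a hypothetical helper, not LlamaFactory's own converter, and joining instruction and input with a newline is an assumption.

```python
def alpaca_to_messages(record: dict) -> dict:
    """Convert one alpaca record into a role/content message list."""
    prompt = record["instruction"]
    if record.get("input"):
        # Assumption: a non-empty input is appended on a new line.
        prompt = f"{prompt}\n{record['input']}"
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": record["output"]},
        ]
    }


sample = {
    "instruction": "Transform the following sentence using a synonym: The car sped quickly.",
    "input": "",
    "output": "The car accelerated rapidly.",
}
print(alpaca_to_messages(sample)["messages"][1]["content"])
```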

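This walkthrough uses coco-2014-caption, which is in sharegpt format, so the data is prepared in sharegpt format.

Register the coco dataset in dataset_info.json. The key is the name that the dataset field of the training config must reference: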

{
"coco-400": {
"file_name": "coco-400.json",
"formatting": "sharegpt",
"columns": {"messages": "conversations", "id": "id"},
"tags": {"role_tag": "from", "content_tag": "value", "user_tag": "user", "assistant_tag": "assistant"}
}
}
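
Before training, it can help to check that the records in the data file actually match the columns and tags registered above. The following is a minimal sketch under stated assumptions: check_record is a hypothetical helper, the sample record is invented for illustration, and the check only covers the message list, not any image fields.

```python
# Column and tag names copied from the coco-400 entry above.
COLUMNS = {"messages": "conversations", "id": "id"}
TAGS = {"role_tag": "from", "content_tag": "value",
        "user_tag": "user", "assistant_tag": "assistant"}


def check_record(record: dict, columns: dict = COLUMNS, tags: dict = TAGS) -> list[str]:
    """Return a list of problems found in one record (empty if it looks fine)."""
    turns = record.get(columns["messages"])
    if not isinstance(turns, list) or not turns:
        return [f"missing or empty {columns['messages']!r} list"]
    problems = []
    roles = {tags["user_tag"], tags["assistant_tag"]}
    for i, turn in enumerate(turns):
        if turn.get(tags["role_tag"]) not in roles:
            problems.append(f"turn {i}: unexpected role {turn.get(tags['role_tag'])!r}")
        if not isinstance(turn.get(tags["content_tag"]), str):
            problems.append(f"turn {i}: missing {tags['content_tag']!r} text")
    return problems


# Hypothetical record, for illustration only.
good = {"id": 1, "conversations": [
    {"from": "user", "value": "Describe the picture."},
    {"from": "assistant", "value": "A dog running on a beach."},
]}
print(check_record(good))
```

Running the check over every record in the JSON file before launching a long job surfaces tag mismatches, such as a role written as human instead of user, early.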

The coco dataset file follows the format registered above; the original article illustrates it with a screenshot.

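Configuration parameters

Fine-tuning uses QLoRA, adapting one of the example files shipped with the repository. The chosen example file reads as follows: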

### model
model_name_or_path: Qwen/Qwen3-4B-Instruct-2507
quantization_bit: 4 # choices: [8 (bnb/hqq/eetq), 4 (bnb/hqq), 3 (hqq), 2 (hqq)]
quantization_method: bnb # choices: [bnb, hqq, eetq]
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all
### dataset
dataset: identity,alpaca_en_demo
template: qwen3_nothink
cutoff_len: 2048
max_samples: 1000
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/qwen3-4b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none
### train
per_device_train_batch_size: 1
gradient_accumulation_steps:

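Starting from the template above, change the parameters to fit our own setup. The key changes are:

Model name: model_name_or_path
Dataset: dataset
Template: template. Use qwen3_vl_nothink for vision-language models and qwen3_nothink for text-only language models; the wrong template causes an error.

The remaining settings (number of epochs, batch size, gradient accumulation, learning rate, output path, training logs, and so on) are all set in the file and are not discussed in detail here.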
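
The three changes can be expressed as a small override step on top of the example config. The sketch below is illustrative only: pick_template and apply_overrides are hypothetical helpers, and detecting a vision-language checkpoint by looking for "-VL-" in the model name is a naming heuristic, not a rule that LlamaFactory enforces.

```python
def pick_template(model_name: str) -> str:
    """Heuristic: Qwen3 vision-language checkpoints carry "-VL-" in their name."""
    return "qwen3_vl_nothink" if "-vl-" in model_name.lower() else "qwen3_nothink"


def apply_overrides(base_cfg: dict, model_name: str, dataset: str) -> dict:
    """Copy the example config and replace the three fields discussed above."""
    cfg = dict(base_cfg)
    cfg.update(
        model_name_or_path=model_name,
        dataset=dataset,
        template=pick_template(model_name),
    )
    return cfg


print(pick_template("Qwen/Qwen3-VL-2B-Instruct"))    # qwen3_vl_nothink
print(pick_template("Qwen/Qwen3-4B-Instruct-2507"))  # qwen3_nothink
```

Deriving the template from the model name keeps the two fields from drifting apart when the config is copied for another model.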

### model
model_name_or_path: Qwen/Qwen3-VL-2B-Instruct
quantization_bit: 4
quantization_method: bnb
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all
### dataset
dataset: coco-3000
template: qwen3_vl_nothink
cutoff_len: 2048
preprocessing_num_workers: 16
dataloader_num_workers: 4
### output
output_dir: saves/qwen3-2b-coco-3000/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none
### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 1e-5
num_train_epochs: 2

Start training

uv run llamafactory-cli train examples/train_qlora/qwen3-coco.yaml

➜ LlamaFactory git:(main) ✗ uv run llamafactory-cli train examples/train_qlora/qwen3-coco.yaml
[WARNING|2026-02-06 17:47:42] llamafactory.hparams.parser:148>> We recommend enable `upcast_layernorm` in quantized training.
[... Qwen3VLVideoProcessor, Qwen2VLImageProcessorFast and Qwen2TokenizerFast configuration dumps omitted ...]
>> ***** Running training *****
[... training summary (Num examples, Num Epochs, Instantaneous batch size per device, Gradient Accumulation steps, Total optimization steps, Number of trainable parameters), per-step metric lines and progress bars; the numeric values did not survive extraction ...]
>> Saving model checkpoint to saves/qwen3-coco/lora/sft/checkpoint
[... final metrics: epoch, total_flos, train_loss, train_runtime, train_samples_per_second, train_steps_per_second ...]
Figure saved at: saves/qwen3-coco/lora/sft/training_loss.png
llamafactory.extras.ploting:>> No metric eval_loss to plot.
llamafactory.extras.ploting:>> No metric eval_accuracy to plot.

Training record

[Figure: training record from the run above]
