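View the Model Image List

Last updated: 2025-11-03 15:50:25

The LlamaFactory image list provides a complete set of container images that package different versions of PyTorch, Transformers, CUDA, vLLM, and other deep learning components in a standardized way, giving you a ready-to-use environment for training, fine-tuning, and deploying large language models. Versions are managed as a matrix, which keeps environments consistent and experiments reproducible while covering compatibility needs from the latest hardware to legacy systems. You can pick the combination that matches your hardware and project requirements and focus on model development rather than environment setup.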

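Prerequisites

You have registered with the LLM Lab (大模型实验室). If you need help or have not registered yet, see Account Registration/Login to complete registration or login.
Your account balance is sufficient to run the instance. See the billing information for details.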

Info

Image tags follow this naming rule: lf{llamafactory version}-tf{transformers version}-torch{torch version}-cu{cuda version}-{internal build number}
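The naming rule above can be split back into its components with a regular expression. The following is a minimal sketch, not part of the platform; the names `ImageTag` and `parse_image_tag` are hypothetical, and the pattern assumes every version field consists of dot-separated digits, as in the tags listed on this page.

```python
import re
from typing import NamedTuple


class ImageTag(NamedTuple):
    """Components of an image tag (hypothetical helper type)."""
    llamafactory: str
    transformers: str
    torch: str
    cuda: str
    build: str


# Assumption: each field is a dot-separated run of digits, e.g. "4.57.1".
_VERSION = r"\d+(?:\.\d+)*"
_TAG_RE = re.compile(
    rf"^lf(?P<llamafactory>{_VERSION})"
    rf"-tf(?P<transformers>{_VERSION})"
    rf"-torch(?P<torch>{_VERSION})"
    rf"-cu(?P<cuda>{_VERSION})"
    rf"-(?P<build>{_VERSION})$"
)


def parse_image_tag(tag: str) -> ImageTag:
    """Split a tag into its version fields, or raise ValueError."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a recognised image tag: {tag!r}")
    return ImageTag(**match.groupdict())


if __name__ == "__main__":
    parsed = parse_image_tag("lf0.9.4-tf4.57.1-torch2.8.0-cu12.6-1.1")
    print(parsed.transformers, parsed.torch, parsed.cuda)
```

Keeping the fields as strings avoids losing information such as trailing zeros; convert them to integer tuples only when you need to compare versions.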

LlamaFactory 0.9.4 (current main branch)

0.9.4 (Transformers 4.57.1) main version (latest components)

Transformers	PyTorch	CUDA	vLLM	HuggingFace Hub	Image tag	Status	Notes
4.57.1	2.8.0	12.6	0.10.2	0.35.3	`lf0.9.4-tf4.57.1-torch2.8.0-cu12.6-1.1`	🟢 Main version*	Default version used when the container starts
4.57.1	2.8.0	12.8	0.10.2	0.35.3	`lf0.9.4-tf4.57.1-torch2.8.0-cu12.8-1.1`	🟢 Main version
4.57.1	2.8.0	11.8	0.10.2	0.35.3	`lf0.9.4-tf4.57.1-torch2.8.0-cu11.8-1.1`	🟢 Main version

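Info

The LlamaFactory 0.9.4 branch supports SFT and DPO training for the full Qwen3-VL series (4B, 8B, 30B-A3B, 235B-A22B). The EasyR1 framework likewise supports GRPO and DAPO reinforcement learning (RL) for all Qwen3-VL models. In testing, RL training improved the accuracy of the Qwen3-VL-30B-A3B-Thinking model on the Geometry3k dataset by 25%.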

0.9.4 (Transformers 4.57.1) model list

Model details

Model name	Series	Model type	Parameters	Notes
gpt-oss-20b	GPT-OSS	Base	20B	Open-source GPT model
gpt-oss-120b	GPT-OSS	Base	120B	Very large open-source GPT model
aya-23-8B	Aya	Multilingual	8B	Multilingual understanding and generation
aya-23-35B	Aya	Multilingual	35B	Large multilingual model
Baichuan-7B	Baichuan	Base	7B	Chinese-English bilingual base model
Baichuan-13B-Base	Baichuan	Base	13B	Chinese-English bilingual base model
Baichuan-13B-Chat	Baichuan	Chat	13B	Chinese-English bilingual chat model
Baichuan2-7B-Base	Baichuan2	Base	7B	2nd-generation Chinese-English bilingual base model
Baichuan2-13B-Base	Baichuan2	Base	13B	2nd-generation Chinese-English bilingual base model
Baichuan2-7B-Chat	Baichuan2	Chat	7B	2nd-generation chat model
Baichuan2-13B-Chat	Baichuan2	Chat	13B	2nd-generation chat model
bloom-560m	BLOOM	Base	560M	Small multilingual base model
bloom-3b	BLOOM	Base	3B	Multilingual base model
bloom-7b1	BLOOM	Base	7B	Multilingual base model
bloomz-560m	BLOOMZ	Instruction-tuned	560M	Small instruction-tuned model
bloomz-3b	BLOOMZ	Instruction-tuned	3B	Instruction-tuned model
bloomz-7b1-mt	BLOOMZ	Instruction-tuned	7B	Multitask instruction-tuned model
BlueLM-7B-Base	BlueLM	Base	7B	Chinese-English bilingual base model
BlueLM-7B-Chat	BlueLM	Chat	7B	Chinese-English bilingual chat model
Breeze-7B-Base-v1_0	Breeze	Base	7B	Lightweight Chinese base model
Breeze-7B-Instruct-v1_0	Breeze	Instruct	7B	Chinese instruct model
chatglm2-6b	ChatGLM	Chat	6B	2nd-generation chat model
chatglm3-6b-base	ChatGLM	Base	6B	3rd-generation base model
chatglm3-6b	ChatGLM	Chat	6B	3rd-generation chat model
chinese-llama-2-1.3b	Chinese-LLaMA	Base	1.3B	Small Chinese-optimized model
chinese-llama-2-7b	Chinese-LLaMA	Base	7B	Chinese-optimized model
chinese-llama-2-13b	Chinese-LLaMA	Base	13B	Large Chinese-optimized model
chinese-alpaca-2-1.3b	Chinese-Alpaca	Chat	1.3B	Small Chinese chat model
chinese-alpaca-2-7b	Chinese-Alpaca	Chat	7B	Chinese chat model
chinese-alpaca-2-13b	Chinese-Alpaca	Chat	13B	Large Chinese chat model
codegeex4-all-9b	CodeGeeX	Code	9B	Multilingual code generation
codegemma-7b	CodeGemma	Code	7B	Code generation base model
codegemma-7b-it	CodeGemma	Code	7B	Code generation instruct model
codegemma-1.1-2b	CodeGemma	Code	2B	Lightweight code model
codegemma-1.1-7b-it	CodeGemma	Code	7B	Code instruct model
Codestral-22B-v0.1	Codestral	Code	22B	Large code model
c4ai-command-r-v01	Command	RAG	-	Retrieval-augmented generation
c4ai-command-r-plus	Command	RAG	-	Enhanced RAG model
c4ai-command-r-v01-4bit	Command	Quantized	-	4-bit quantized version
c4ai-command-r-plus-4bit	Command	Quantized	-	Enhanced 4-bit quantized version
dbrx-base	DBRX	Base	-	MoE base model
dbrx-instruct	DBRX	Instruct	-	MoE instruct model
deepseek-llm-7b-base	DeepSeek-LLM	Base	7B	General-purpose base model
deepseek-llm-67b-base	DeepSeek-LLM	Base	67B	Large base model
deepseek-llm-7b-chat	DeepSeek-LLM	Chat	7B	General-purpose chat model
deepseek-llm-67b-chat	DeepSeek-LLM	Chat	67B	Large chat model
deepseek-math-7b-base	DeepSeek-Math	Math	7B	Math base model
deepseek-math-7b-instruct	DeepSeek-Math	Math	7B	Math instruct model
deepseek-moe-16b-base	DeepSeek-MoE	Base	16B	MoE base model
deepseek-moe-16b-chat	DeepSeek-MoE	Chat	16B	MoE chat model
DeepSeek-V2-Lite	DeepSeek-V2	Lightweight	-	Lightweight V2 model
DeepSeek-V2	DeepSeek-V2	Base	-	2nd-generation base model
DeepSeek-V2-Lite-Chat	DeepSeek-V2	Chat	-	Lightweight V2 chat model
DeepSeek-V2-Chat	DeepSeek-V2	Chat	-	2nd-generation chat model
DeepSeek-Coder-V2-Lite-Base	DeepSeek-Coder	Code	-	Lightweight code base model
DeepSeek-Coder-V2-Base	DeepSeek-Coder	Code	-	Code base model
DeepSeek-Coder-V2-Lite-Instruct	DeepSeek-Coder	Code	-	Lightweight code instruct model
DeepSeek-Coder-V2-Instruct	DeepSeek-Coder	Code	-	Code instruct model
deepseek-coder-6.7b-base	DeepSeek-Coder	Code	6.7B	Code base model
deepseek-coder-7b-base-v1.5	DeepSeek-Coder	Code	7B	Code base model v1.5
deepseek-coder-33b-base	DeepSeek-Coder	Code	33B	Large code base model
deepseek-coder-6.7b-instruct	DeepSeek-Coder	Code	6.7B	Code instruct model
deepseek-coder-7b-instruct-v1.5	DeepSeek-Coder	Code	7B	Code instruct model v1.5
deepseek-coder-33b-instruct	DeepSeek-Coder	Code	33B	Large code instruct model
DeepSeek-V2-Chat-0628	DeepSeek-V2	Chat	-	Specific release of the chat model
DeepSeek-V2.5	DeepSeek-V2.5	Base	-	V2.5 base model
DeepSeek-V2.5-1210	DeepSeek-V2.5	Base	-	Specific release of the base model
DeepSeek-V3-Base	DeepSeek-V3	Base	-	3rd-generation base model
DeepSeek-V3	DeepSeek-V3	Base	-	3rd-generation model
DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek-R1	Reasoning	1.5B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-7B	DeepSeek-R1	Reasoning	7B	Distilled reasoning model
DeepSeek-R1-Distill-Llama-8B	DeepSeek-R1	Reasoning	8B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-14B	DeepSeek-R1	Reasoning	14B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-32B	DeepSeek-R1	Reasoning	32B	Distilled reasoning model
DeepSeek-R1-Distill-Llama-70B	DeepSeek-R1	Reasoning	70B	Distilled reasoning model
DeepSeek-R1-Zero	DeepSeek-R1	Reasoning	-	Reasoning model trained with RL only, without SFT
DeepSeek-R1	DeepSeek-R1	Reasoning	-	Reasoning model
EXAONE-3.0-7.8B-Instruct	EXAONE	Instruct	7.8B	Bilingual (English/Korean) instruct model
falcon-7b	Falcon	Base	7B	Open-source base model
falcon-11B	Falcon	Base	11B	Medium-sized base model
falcon-40b	Falcon	Base	40B	Large base model
falcon-180b	Falcon	Base	180B	Very large base model
falcon-7b-instruct	Falcon	Instruct	7B	Instruction-tuned model
falcon-40b-instruct	Falcon	Instruct	40B	Large instruct model
falcon-180b-chat	Falcon	Chat	180B	Very large chat model
gemma-2b	Gemma	Base	2B	Lightweight base model
gemma-7b	Gemma	Base	7B	Base model
gemma-2b-it	Gemma	Instruct	2B	Lightweight instruct model
gemma-7b-it	Gemma	Instruct	7B	Instruct model
gemma-1.1-2b-it	Gemma	Instruct	2B	Version 1.1 instruct model
gemma-1.1-7b-it	Gemma	Instruct	7B	Version 1.1 instruct model
gemma-2-2b	Gemma2	Base	2B	2nd-generation lightweight base model
gemma-2-9b	Gemma2	Base	9B	2nd-generation base model
gemma-2-27b	Gemma2	Base	27B	2nd-generation large base model
gemma-2-2b-it	Gemma2	Instruct	2B	2nd-generation lightweight instruct model
gemma-2-9b-it	Gemma2	Instruct	9B	2nd-generation instruct model
gemma-2-27b-it	Gemma2	Instruct	27B	2nd-generation large instruct model
glm-4-9b	GLM	Base	9B	4th-generation base model
glm-4-9b-chat	GLM	Chat	9B	4th-generation chat model
glm-4-9b-chat-1m	GLM	Chat	9B	Long-context chat model
gpt2	GPT-2	Base	124M	Base version
gpt2-medium	GPT-2	Base	355M	Medium version
gpt2-large	GPT-2	Base	774M	Large version
gpt2-xl	GPT-2	Base	1.5B	Extra-large version
granite-3.0-1b-a400m-base	Granite	Base	1B	Granite 3.0 base model
granite-3.0-3b-a800m-base	Granite	Base	3B	Granite 3.0 base model
granite-3.0-2b-base	Granite	Base	2B	Granite 3.0 base model
granite-3.0-8b-base	Granite	Base	8B	Granite 3.0 base model
granite-3.0-1b-a400m-instruct	Granite	Instruct	1B	Granite 3.0 instruct model
granite-3.0-3b-a800m-instruct	Granite	Instruct	3B	Granite 3.0 instruct model
granite-3.0-2b-instruct	Granite	Instruct	2B	Granite 3.0 instruct model
granite-3.0-8b-instruct	Granite	Instruct	8B	Granite 3.0 instruct model
granite-3.1-1b-a400m-base	Granite	Base	1B	Granite 3.1 base model
granite-3.1-3b-a800m-base	Granite	Base	3B	Granite 3.1 base model
granite-3.1-2b-base	Granite	Base	2B	Granite 3.1 base model
granite-3.1-8b-base	Granite	Base	8B	Granite 3.1 base model
granite-3.1-1b-a400m-instruct	Granite	Instruct	1B	Granite 3.1 instruct model
granite-3.1-3b-a800m-instruct	Granite	Instruct	3B	Granite 3.1 instruct model
granite-3.1-2b-instruct	Granite	Instruct	2B	Granite 3.1 instruct model
granite-3.1-8b-instruct	Granite	Instruct	8B	Granite 3.1 instruct model
Index-1.9B	Index	Base	1.9B	Lightweight base model
Index-1.9B-Pure	Index	Base	1.9B	Pure variant of the base model
Index-1.9B-Chat	Index	Chat	1.9B	Lightweight chat model
Index-1.9B-Character	Index	Role-play	1.9B	Role-play model
Index-1.9B-32K	Index	Base	1.9B	Long-context version
internlm-7b	InternLM	Base	7B	Base model
internlm-20b	InternLM	Base	20B	Large base model
internlm-chat-7b	InternLM	Chat	7B	Chat model
internlm-chat-20b	InternLM	Chat	20B	Large chat model
internlm2-7b	InternLM2	Base	7B	2nd-generation base model
internlm2-20b	InternLM2	Base	20B	2nd-generation large base model
internlm2-chat-7b	InternLM2	Chat	7B	2nd-generation chat model
internlm2-chat-20b	InternLM2	Chat	20B	2nd-generation large chat model
internlm2_5-1_8b	InternLM2.5	Base	1.8B	InternLM2.5 lightweight base model
internlm2_5-7b	InternLM2.5	Base	7B	InternLM2.5 base model
internlm2_5-20b	InternLM2.5	Base	20B	InternLM2.5 large base model
internlm2_5-1_8b-chat	InternLM2.5	Chat	1.8B	InternLM2.5 lightweight chat model
internlm2_5-7b-chat	InternLM2.5	Chat	7B	InternLM2.5 chat model
internlm2_5-7b-chat-1m	InternLM2.5	Chat	7B	1M-token long-context chat model
internlm2_5-20b-chat	InternLM2.5	Chat	20B	InternLM2.5 large chat model
internlm3-8b-instruct	InternLM3	Instruct	8B	3rd-generation instruct model
Jamba-v0.1	Jamba	Hybrid	-	Hybrid SSM-Transformer architecture
LingoWhale-8B	LingoWhale	Base	8B	Chinese-English bilingual model
llama-7b	LLaMA	Base	7B	Classic base model
llama-13b	LLaMA	Base	13B	Medium-sized base model
llama-30b	LLaMA	Base	30B	Large base model
llama-65b	LLaMA	Base	65B	Very large base model
Llama-2-7b-hf	LLaMA-2	Base	7B	2nd-generation base model
Llama-2-13b-hf	LLaMA-2	Base	13B	2nd-generation base model
Llama-2-70b-hf	LLaMA-2	Base	70B	2nd-generation large base model
Llama-2-7b-chat-hf	LLaMA-2	Chat	7B	2nd-generation chat model
Llama-2-13b-chat-hf	LLaMA-2	Chat	13B	2nd-generation chat model
Llama-2-70b-chat-hf	LLaMA-2	Chat	70B	2nd-generation large chat model
Meta-Llama-3-8B	LLaMA-3	Base	8B	3rd-generation base model
Meta-Llama-3-70B	LLaMA-3	Base	70B	3rd-generation large base model
Meta-Llama-3-8B-Instruct	LLaMA-3	Instruct	8B	3rd-generation instruct model
Meta-Llama-3-70B-Instruct	LLaMA-3	Instruct	70B	3rd-generation large instruct model
Llama3-8B-Chinese-Chat	LLaMA-3 Chinese	Chat	8B	Chinese-optimized chat model
Llama3-70B-Chinese-Chat	LLaMA-3 Chinese	Chat	70B	Large Chinese-optimized chat model
Meta-Llama-3.1-8B	LLaMA-3.1	Base	8B	Llama 3.1 base model
Meta-Llama-3.1-70B	LLaMA-3.1	Base	70B	Llama 3.1 large base model
Meta-Llama-3.1-405B	LLaMA-3.1	Base	405B	Very large base model
Meta-Llama-3.1-8B-Instruct	LLaMA-3.1	Instruct	8B	Llama 3.1 instruct model
Meta-Llama-3.1-70B-Instruct	LLaMA-3.1	Instruct	70B	Llama 3.1 large instruct model
Meta-Llama-3.1-405B-Instruct	LLaMA-3.1	Instruct	405B	Very large instruct model
Llama3.1-8B-Chinese-Chat	LLaMA-3.1 Chinese	Chat	8B	Llama 3.1 Chinese chat model
Llama3.1-70B-Chinese-Chat	LLaMA-3.1 Chinese	Chat	70B	Llama 3.1 large Chinese chat model
Llama-3.2-1B	LLaMA-3.2	Base	1B	Llama 3.2 lightweight base model
Llama-3.2-3B	LLaMA-3.2	Base	3B	Llama 3.2 lightweight base model
Llama-3.2-1B-Instruct	LLaMA-3.2	Instruct	1B	Llama 3.2 lightweight instruct model
Llama-3.2-3B-Instruct	LLaMA-3.2	Instruct	3B	Llama 3.2 lightweight instruct model
Llama-3.3-70B-Instruct	LLaMA-3.3	Instruct	70B	Llama 3.3 large instruct model
Llama-3.2-11B-Vision	LLaMA-3.2 Multimodal	Vision	11B	Vision-language model
Llama-3.2-11B-Vision-Instruct	LLaMA-3.2 Multimodal	Vision	11B	Vision instruct model
Llama-3.2-90B-Vision	LLaMA-3.2 Multimodal	Vision	90B	Large vision model
Llama-3.2-90B-Vision-Instruct	LLaMA-3.2 Multimodal	Vision	90B	Large vision instruct model
llava-1.5-7b-hf	LLaVA	Multimodal	7B	Vision-language model
llava-1.5-13b-hf	LLaVA	Multimodal	13B	Vision-language model
llava-v1.6-vicuna-7b-hf	LLaVA	Multimodal	7B	Vicuna-based vision model
llava-v1.6-vicuna-13b-hf	LLaVA	Multimodal	13B	Vicuna-based vision model
llava-v1.6-mistral-7b-hf	LLaVA	Multimodal	7B	Mistral-based vision model
llama3-llava-next-8b-hf	LLaVA	Multimodal	8B	LLaMA 3-based vision model
llava-v1.6-34b-hf	LLaVA	Multimodal	34B	Large vision model
llava-next-72b-hf	LLaVA	Multimodal	72B	Very large vision model
llava-next-110b-hf	LLaVA	Multimodal	110B	Extremely large vision model
LLaVA-NeXT-Video-7B-hf	LLaVA-NeXT	Video	7B	Video understanding model
LLaVA-NeXT-Video-7B-DPO-hf	LLaVA-NeXT	Video	7B	DPO-optimized video model
LLaVA-NeXT-Video-7B-32K-hf	LLaVA-NeXT	Video	7B	Long-video understanding model
LLaVA-NeXT-Video-34B-hf	LLaVA-NeXT	Video	34B	Large video model
LLaVA-NeXT-Video-34B-DPO-hf	LLaVA-NeXT	Video	34B	DPO-optimized large video model
Marco-o1	Marco	Reasoning	-	Math reasoning model
MiniCPM-2B-sft-bf16	MiniCPM	Chat	2B	SFT-optimized chat model
MiniCPM-2B-dpo-bf16	MiniCPM	Chat	2B	DPO-optimized chat model
MiniCPM3-4B	MiniCPM	Chat	4B	3rd-generation chat model
MiniCPM-o-2_6	MiniCPM	Multimodal	8B	Omni-modal model (version 2.6)
MiniCPM-V-2_6	MiniCPM	Multimodal	8B	Vision-language model (version 2.6)
Ministral-8B-Instruct-2410	Ministral	Instruct	8B	Lightweight instruct model
Mistral-Nemo-Base-2407	Mistral	Base	-	Nemo base model
Mistral-Nemo-Instruct-2407	Mistral	Instruct	-	Nemo instruct model
Mistral-7B-v0.1	Mistral	Base	7B	First-generation base model
Mistral-7B-v0.2-hf	Mistral	Base	7B	Version 0.2 base model
Mistral-7B-v0.3	Mistral	Base	7B	Version 0.3 base model
Mistral-7B-Instruct-v0.1	Mistral	Instruct	7B	First-generation instruct model
Mistral-7B-Instruct-v0.2	Mistral	Instruct	7B	Version 0.2 instruct model
Mistral-7B-Instruct-v0.3	Mistral	Instruct	7B	Version 0.3 instruct model
Mistral-Small-24B-Base-2501	Mistral	Base	24B	Mistral Small base model
Mistral-Small-24B-Instruct-2501	Mistral	Instruct	24B	Mistral Small instruct model
Mixtral-8x7B-v0.1	Mixtral	Base	8x7B	MoE base model
Mixtral-8x22B-v0.1	Mixtral	Base	8x22B	Large MoE base model
Mixtral-8x7B-Instruct-v0.1	Mixtral	Instruct	8x7B	MoE instruct model
Mixtral-8x22B-Instruct-v0.1	Mixtral	Instruct	8x22B	Large MoE instruct model
Moonlight-16B-A3B	Moonlight	Base	16B	Moonlight base model
Moonlight-16B-A3B-Instruct	Moonlight	Instruct	16B	Moonlight instruct model
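OLMo-1B-hf	OLMo	Base	1B	Lightweight open-source model
OLMo-7B-hf	OLMo	Base	7B	Open-source base model
OLMo-7B-Instruct-hf	OLMo	Instruct	7B	Open-source instruct model
OLMo-1.7-7B-hf	OLMo	Base	7B	Version 1.7 base model
openchat-3.5-0106	OpenChat	Chat	-	Version 3.5 chat model
openchat-3.6-8b-20240522	OpenChat	Chat	8B	Version 3.6 chat model
OpenCoder-1.5B-Base	OpenCoder	Code	1.5B	Lightweight code base model
OpenCoder-8B-Base	OpenCoder	Code	8B	Code base model
OpenCoder-1.5B-Instruct	OpenCoder	Code	1.5B	Lightweight code instruct model
OpenCoder-8B-Instruct	OpenCoder	Code	8B	Code instruct model
Orion-14B-Base	Orion	Base	14B	Base model
Orion-14B-Chat	Orion	Chat	14B	Chat model
Orion-14B-LongChat	Orion	Chat	14B	Long-context chat model
Orion-14B-Chat-RAG	Orion	Chat	14B	RAG-enhanced chat model
Orion-14B-Chat-Plugin	Orion	Chat	14B	Chat model with plugin support
paligemma-3b-pt-224	PaliGemma	Multimodal	3B	Image understanding model
paligemma-3b-pt-448	PaliGemma	Multimodal	3B	High-resolution version
paligemma-3b-pt-896	PaliGemma	Multimodal	3B	Ultra-high-resolution version
paligemma-3b-mix-224	PaliGemma	Multimodal	3B	Mix-trained version
paligemma-3b-mix-448	PaliGemma	Multimodal	3B	Mix-trained high-resolution version
paligemma2-3b-pt-224	PaliGemma2	Multimodal	3B	2nd-generation image model
paligemma2-3b-pt-448	PaliGemma2	Multimodal	3B	2nd-generation high-resolution version
paligemma2-3b-pt-896	PaliGemma2	Multimodal	3B	2nd-generation ultra-high-resolution version
paligemma2-10b-pt-224	PaliGemma2	Multimodal	10B	2nd-generation medium-sized model
paligemma2-10b-pt-448	PaliGemma2	Multimodal	10B	2nd-generation medium-sized high-resolution version
paligemma2-10b-pt-896	PaliGemma2	Multimodal	10B	2nd-generation medium-sized ultra-high-resolution version
paligemma2-28b-pt-224	PaliGemma2	Multimodal	28B	2nd-generation large model
paligemma2-28b-pt-448	PaliGemma2	Multimodal	28B	2nd-generation large high-resolution version
paligemma2-28b-pt-896	PaliGemma2	Multimodal	28B	2nd-generation large ultra-high-resolution version
paligemma2-3b-mix-224	PaliGemma2	Multimodal	3B	2nd-generation mix-trained version
paligemma2-3b-mix-448	PaliGemma2	Multimodal	3B	2nd-generation mix-trained high-resolution version
paligemma2-10b-mix-224	PaliGemma2	Multimodal	10B	2nd-generation medium-sized mix-trained version
paligemma2-10b-mix-448	PaliGemma2	Multimodal	10B	2nd-generation medium-sized mix-trained high-resolution version
paligemma2-28b-mix-224	PaliGemma2	Multimodal	28B	2nd-generation large mix-trained version
paligemma2-28b-mix-448	PaliGemma2	Multimodal	28B	2nd-generation large mix-trained high-resolution version
phi-1_5	Phi	Base	1.5B	Small base model
phi-2	Phi	Base	2.7B	Lightweight base model
Phi-3-mini-4k-instruct	Phi-3	Instruct	-	Lightweight instruct model
Phi-3-mini-128k-instruct	Phi-3	Instruct	-	Long-context instruct model
Phi-3-medium-4k-instruct	Phi-3	Instruct	-	Medium-sized instruct model
Phi-3-medium-128k-instruct	Phi-3	Instruct	-	Medium-sized long-context instruct model
Phi-3.5-mini-instruct	Phi-3.5	Instruct	-	Phi-3.5 lightweight instruct model
Phi-3.5-MoE-instruct	Phi-3.5	Instruct	-	MoE instruct model
Phi-3-small-8k-instruct	Phi-3	Instruct	-	Small instruct model
Phi-3-small-128k-instruct	Phi-3	Instruct	-	Small long-context instruct model
phi-4	Phi	Base	-	4th-generation base model
pixtral-12b	Pixtral	Multimodal	12B	Multilingual vision-language model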
Qwen-1_8B	Qwen	Base	1.8B	Lightweight base model
Qwen-7B	Qwen	Base	7B	Base model
Qwen-14B	Qwen	Base	14B	Medium-sized base model
Qwen-72B	Qwen	Base	72B	Large base model
Qwen-1_8B-Chat	Qwen	Chat	1.8B	Lightweight chat model
Qwen-7B-Chat	Qwen	Chat	7B	Chat model
Qwen-14B-Chat	Qwen	Chat	14B	Medium-sized chat model
Qwen-72B-Chat	Qwen	Chat	72B	Large chat model
Qwen-1_8B-Chat-Int8	Qwen	Quantized	1.8B	Int8 quantized version
Qwen-1_8B-Chat-Int4	Qwen	Quantized	1.8B	Int4 quantized version
Qwen-7B-Chat-Int8	Qwen	Quantized	7B	Int8 quantized version
Qwen-7B-Chat-Int4	Qwen	Quantized	7B	Int4 quantized version
Qwen-14B-Chat-Int8	Qwen	Quantized	14B	Int8 quantized version
Qwen-14B-Chat-Int4	Qwen	Quantized	14B	Int4 quantized version
Qwen-72B-Chat-Int8	Qwen	Quantized	72B	Int8 quantized version
Qwen-72B-Chat-Int4	Qwen	Quantized	72B	Int4 quantized version
Qwen1.5-0.5B	Qwen1.5	Base	0.5B	Ultra-lightweight base model
Qwen1.5-1.8B	Qwen1.5	Base	1.8B	Lightweight base model
Qwen1.5-4B	Qwen1.5	Base	4B	Small base model
Qwen1.5-7B	Qwen1.5	Base	7B	Base model
Qwen1.5-14B	Qwen1.5	Base	14B	Medium-sized base model
Qwen1.5-32B	Qwen1.5	Base	32B	Large base model
Qwen1.5-72B	Qwen1.5	Base	72B	Very large base model
Qwen1.5-110B	Qwen1.5	Base	110B	Extremely large base model
Qwen1.5-MoE-A2.7B	Qwen1.5	Base	14.3B (2.7B active)	MoE base model
Qwen1.5-0.5B-Chat	Qwen1.5	Chat	0.5B	Ultra-lightweight chat model
Qwen1.5-1.8B-Chat	Qwen1.5	Chat	1.8B	Lightweight chat model
Qwen1.5-4B-Chat	Qwen1.5	Chat	4B	Small chat model
Qwen1.5-7B-Chat	Qwen1.5	Chat	7B	Chat model
Qwen1.5-14B-Chat	Qwen1.5	Chat	14B	Medium-sized chat model
Qwen1.5-32B-Chat	Qwen1.5	Chat	32B	Large chat model
Qwen1.5-72B-Chat	Qwen1.5	Chat	72B	Very large chat model
Qwen1.5-110B-Chat	Qwen1.5	Chat	110B	Extremely large chat model
Qwen1.5-MoE-A2.7B-Chat	Qwen1.5	Chat	14.3B (2.7B active)	MoE chat model
CodeQwen1.5-7B	CodeQwen	Code	7B	Code base model
CodeQwen1.5-7B-Chat	CodeQwen	Code	7B	Code chat model
Qwen2-0.5B	Qwen2	Base	0.5B	2nd-generation ultra-lightweight base model
Qwen2-1.5B	Qwen2	Base	1.5B	2nd-generation lightweight base model
Qwen2-7B	Qwen2	Base	7B	2nd-generation base model
Qwen2-72B	Qwen2	Base	72B	2nd-generation large base model
Qwen2-57B-A14B	Qwen2	MoE	57B (14B active)	Mixture-of-experts model
Qwen2-0.5B-Instruct	Qwen2	Instruct	0.5B	2nd-generation ultra-lightweight instruct model
Qwen2-1.5B-Instruct	Qwen2	Instruct	1.5B	2nd-generation lightweight instruct model
Qwen2-7B-Instruct	Qwen2	Instruct	7B	2nd-generation instruct model
Qwen2-72B-Instruct	Qwen2	Instruct	72B	2nd-generation large instruct model
Qwen2-57B-A14B-Instruct	Qwen2	Instruct	57B (14B active)	Mixture-of-experts instruct model
Qwen2-Math-1.5B	Qwen2-Math	Math	1.5B	Math base model
Qwen2-Math-7B	Qwen2-Math	Math	7B	Math base model
Qwen2-Math-72B	Qwen2-Math	Math	72B	Large math model
Qwen2-Math-1.5B-Instruct	Qwen2-Math	Math	1.5B	Math instruct model
Qwen2-Math-7B-Instruct	Qwen2-Math	Math	7B	Math instruct model
Qwen2-Math-72B-Instruct	Qwen2-Math	Math	72B	Large math instruct model
Qwen2.5-0.5B	Qwen2.5	Base	0.5B	Qwen2.5 ultra-lightweight base model
Qwen2.5-1.5B	Qwen2.5	Base	1.5B	Qwen2.5 lightweight base model
Qwen2.5-3B	Qwen2.5	Base	3B	Qwen2.5 small base model
Qwen2.5-7B	Qwen2.5	Base	7B	Qwen2.5 base model
Qwen2.5-14B	Qwen2.5	Base	14B	Qwen2.5 medium-sized base model
Qwen2.5-32B	Qwen2.5	Base	32B	Qwen2.5 large base model
Qwen2.5-72B	Qwen2.5	Base	72B	Qwen2.5 very large base model
Qwen2.5-0.5B-Instruct	Qwen2.5	Instruct	0.5B	Qwen2.5 ultra-lightweight instruct model
Qwen2.5-1.5B-Instruct	Qwen2.5	Instruct	1.5B	Qwen2.5 lightweight instruct model
Qwen2.5-3B-Instruct	Qwen2.5	Instruct	3B	Qwen2.5 small instruct model
Qwen2.5-7B-Instruct	Qwen2.5	Instruct	7B	Qwen2.5 instruct model
Qwen2.5-14B-Instruct	Qwen2.5	Instruct	14B	Qwen2.5 medium-sized instruct model
Qwen2.5-32B-Instruct	Qwen2.5	Instruct	32B	Qwen2.5 large instruct model
Qwen2.5-72B-Instruct	Qwen2.5	Instruct	72B	Qwen2.5 very large instruct model
Qwen2.5-Coder-0.5B	Qwen2.5-Coder	Code	0.5B	Ultra-lightweight code base model
Qwen2.5-Coder-1.5B	Qwen2.5-Coder	Code	1.5B	Lightweight code base model
Qwen2.5-Coder-3B	Qwen2.5-Coder	Code	3B	Small code base model
Qwen2.5-Coder-7B	Qwen2.5-Coder	Code	7B	Code base model
Qwen2.5-Coder-14B	Qwen2.5-Coder	Code	14B	Medium-sized code base model
Qwen2.5-Coder-32B	Qwen2.5-Coder	Code	32B	Large code base model
Qwen2.5-Coder-0.5B-Instruct	Qwen2.5-Coder	Code	0.5B	Ultra-lightweight code instruct model
Qwen2.5-Coder-1.5B-Instruct	Qwen2.5-Coder	Code	1.5B	Lightweight code instruct model
Qwen2.5-Coder-3B-Instruct	Qwen2.5-Coder	Code	3B	Small code instruct model
Qwen2.5-Coder-7B-Instruct	Qwen2.5-Coder	Code	7B	Code instruct model
Qwen2.5-Coder-14B-Instruct	Qwen2.5-Coder	Code	14B	Medium-sized code instruct model
Qwen2.5-Coder-32B-Instruct	Qwen2.5-Coder	Code	32B	Large code instruct model
Qwen2.5-Math-1.5B	Qwen2.5-Math	Math	1.5B	Lightweight math model
Qwen2.5-Math-7B	Qwen2.5-Math	Math	7B	Math model
Qwen2.5-Math-72B	Qwen2.5-Math	Math	72B	Large math model
Qwen2.5-Math-1.5B-Instruct	Qwen2.5-Math	Math	1.5B	Lightweight math instruct model
Qwen2.5-Math-7B-Instruct	Qwen2.5-Math	Math	7B	Math instruct model
Qwen2.5-Math-72B-Instruct	Qwen2.5-Math	Math	72B	Large math instruct model
QwQ-32B-Preview	QwQ	Preview	32B	Preview version
QwQ-32B	QwQ	Reasoning	32B	Official release
Qwen2-Audio-7B	Qwen2-Audio	Audio	7B	Audio base model
Qwen2-Audio-7B-Instruct	Qwen2-Audio	Audio	7B	Audio instruct model
Qwen2-VL-2B	Qwen2-VL	Multimodal	2B	Lightweight vision-language model
Qwen2-VL-7B	Qwen2-VL	Multimodal	7B	Vision-language model
Qwen2-VL-72B	Qwen2-VL	Multimodal	72B	Large vision-language model
Qwen2-VL-2B-Instruct	Qwen2-VL	Multimodal	2B	Lightweight vision instruct model
Qwen2-VL-7B-Instruct	Qwen2-VL	Multimodal	7B	Vision instruct model
Qwen2-VL-72B-Instruct	Qwen2-VL	Multimodal	72B	Large vision instruct model
QVQ-72B-Preview	QVQ	Preview	72B	Visual reasoning preview
Qwen2.5-VL-3B-Instruct	Qwen2.5-VL	Multimodal	3B	Qwen2.5 vision instruct model
Qwen2.5-VL-7B-Instruct	Qwen2.5-VL	Multimodal	7B	Qwen2.5 vision instruct model
Qwen2.5-VL-72B-Instruct	Qwen2.5-VL	Multimodal	72B	Qwen2.5 large vision instruct model
SOLAR-10.7B-v1.0	SOLAR	Base	10.7B	Base model
SOLAR-10.7B-Instruct-v1.0	SOLAR	Instruct	10.7B	Instruct model
Skywork-13B-base	Skywork	Base	13B	Base model
Skywork-o1-Open-Llama-3.1-8B	Skywork	Base	8B	Based on Llama 3.1
starcoder2-3b	StarCoder2	Code	3B	Lightweight code model
starcoder2-7b	StarCoder2	Code	7B	Code model
starcoder2-15b	StarCoder2	Code	15B	Medium-sized code model
TeleChat-1B	TeleChat	Chat	1B	Lightweight chat model
telechat-7B	TeleChat	Chat	7B	Chat model
TeleChat-12B-v2	TeleChat	Chat	12B	Version 2 chat model
TeleChat-52B	TeleChat	Chat	52B	Large chat model
TeleChat2-3B	TeleChat2	Chat	3B	2nd-generation lightweight chat model
TeleChat2-7B	TeleChat2	Chat	7B	2nd-generation chat model
TeleChat2-115B	TeleChat2	Chat	115B	2nd-generation extremely large chat model
vicuna-7b-v1.5	Vicuna	Chat	7B	LLaMA-based chat model
vicuna-13b-v1.5	Vicuna	Chat	13B	LLaMA-based chat model
Video-LLaVA-7B-hf	Video-LLaVA	Video	7B	Video understanding model
XuanYuan-6B	XuanYuan	Base	6B	Finance-domain base model
XuanYuan-70B	XuanYuan	Base	70B	Large finance-domain base model
XuanYuan2-70B	XuanYuan2	Base	70B	2nd-generation finance base model
XuanYuan-6B-Chat	XuanYuan	Chat	6B	Finance chat model
XuanYuan-70B-Chat	XuanYuan	Chat	70B	Large finance chat model
XuanYuan2-70B-Chat	XuanYuan2	Chat	70B	2nd-generation finance chat model
XVERSE-7B	XVERSE	Base	7B	Base model
XVERSE-13B	XVERSE	Base	13B	Medium-sized base model
XVERSE-65B	XVERSE	Base	65B	Large base model
XVERSE-65B-2	XVERSE	Base	65B	Version 2 base model
XVERSE-7B-Chat	XVERSE	Chat	7B	Chat model
XVERSE-13B-Chat	XVERSE	Chat	13B	Medium-sized chat model
XVERSE-65B-Chat	XVERSE	Chat	65B	Large chat model
XVERSE-MoE-A4.2B	XVERSE	Base	4.2B active	MoE model
yayi-7b-llama2	YaYi	Base	7B	Based on Llama 2
yayi-13b-llama2	YaYi	Base	13B	Based on Llama 2
Yi-6B	Yi	Base	6B	Base model
Yi-9B	Yi	Base	9B	Medium-sized base model
Yi-34B	Yi	Base	34B	Large base model
Yi-6B-Chat	Yi	Chat	6B	Chat model
Yi-34B-Chat	Yi	Chat	34B	Large chat model
Yi-1.5-6B	Yi-1.5	Base	6B	Yi-1.5 base model
Yi-1.5-9B	Yi-1.5	Base	9B	Yi-1.5 medium-sized base model
Yi-1.5-34B	Yi-1.5	Base	34B	Yi-1.5 large base model
Yi-1.5-6B-Chat	Yi-1.5	Chat	6B	Yi-1.5 chat model
Yi-1.5-9B-Chat	Yi-1.5	Chat	9B	Yi-1.5 medium-sized chat model
Yi-1.5-34B-Chat	Yi-1.5	Chat	34B	Yi-1.5 large chat model
Yi-Coder-1.5B	Yi-Coder	Code	1.5B	Lightweight code model
Yi-Coder-9B	Yi-Coder	Code	9B	Code model
Yi-Coder-1.5B-Chat	Yi-Coder	Code	1.5B	Lightweight code chat model
Yi-Coder-9B-Chat	Yi-Coder	Code	9B	Code chat model
Yi-VL-6B-hf	Yi-VL	Multimodal	6B	Vision-language model
Yi-VL-34B-hf	Yi-VL	Multimodal	34B	Large vision-language model
Yuan2-2B-hf	Yuan2	Base	2B	Lightweight base model
Yuan2-51B-hf	Yuan2	Base	51B	Large base model
Yuan2-102B-hf	Yuan2	Base	102B	Very large base model
zephyr-7b-alpha	Zephyr	Chat	7B	Alpha chat model
zephyr-7b-beta	Zephyr	Chat	7B	Beta chat model
zephyr-orpo-141b-A35b-v0.1	Zephyr	Chat	141B	Very large chat model

Tip

Compared with LlamaFactory 0.9.4 (Transformers 4.56.0), LlamaFactory 0.9.4 (Transformers 4.57.1) adds support for the following models. Choose the image version that fits your needs.

DeepSeek-Coder-V2-Lite-Instruct, DeepSeek-Coder-V2-Instruct, DeepSeek-V2-Chat, DeepSeek-V2-Chat-0628, DeepSeek-V2-Lite-Chat, DeepSeek-V2.5-1210, DeepSeek-V3-0324, DeepSeek-R1-0528-Qwen3-8B, MobileLLM-R1-140M, MobileLLM-R1-140M-base, MobileLLM-R1-360M, MobileLLM-R1-360M-base, MobileLLM-R1-950M, MobileLLM-R1-950M-base, Qwen3-Next-80B-A3B-Instruct, Qwen3-Omni-30B-A3B-Captioner, Qwen3-Omni-30B-A3B-Instruct, Qwen3-Omni-30B-A3B-Thinking, Qwen3-VL-235B-A22B-Instruct, Qwen3-VL-235B-A22B-Thinking, Qwen3-VL-30B-A3B-Instruct, Qwen3-VL-30B-A3B-Thinking, Qwen3-VL-4B-Instruct, Qwen3-VL-4B-Thinking, Qwen3-VL-8B-Instruct, Qwen3-VL-8B-Thinking

0.9.4 (Transformers 4.56.0)

Transformers	PyTorch	CUDA	vLLM	HuggingFace Hub	Image tag	Status
4.56.0	2.7.1	12.6	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.7.1-cu12.6-1.1`	🟡 Historical version
4.56.0	2.7.1	12.8	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.7.1-cu12.8-1.1`	🟡 Historical version
4.56.0	2.7.1	11.8	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.7.1-cu11.8-1.1`	🟡 Historical version
4.56.0	2.6.0	12.6	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.6.0-cu12.6-1.1`	🟡 Historical version
4.56.0	2.6.0	12.4	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.6.0-cu12.4-1.1`	🟡 Historical version
4.56.0	2.6.0	11.8	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.6.0-cu11.8-1.1`	🟡 Historical version
4.56.0	2.5.1	12.4	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.5.1-cu12.4-1.1`	🟡 Historical version
4.56.0	2.5.1	12.1	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.5.1-cu12.1-1.11`	🟡 Historical version
4.56.0	2.5.1	11.8	0.10.0	0.34.3	`lf0.9.4-tf4.56.0-torch2.5.1-cu11.8-1.1`	🟡 Historical version

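0.9.4 version feature summary

🟢 Main version: transformers 4.57.1 + vllm 0.10.2

Default configuration: PyTorch 2.8.0, CUDA 12.6
Good compatibility: supports CUDA 11.8 / 12.8

🟡 Historical version: transformers 4.56.0 + vllm 0.10.0

Broad compatibility: supports PyTorch 2.5.1-2.7.1 and CUDA 11.8-12.8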
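One way to use this matrix is to pick the image whose CUDA version is the highest one your environment can run. The sketch below illustrates that policy for the three main-version tags listed on this page; the function names `cuda_version` and `pick_tag` are hypothetical, and the "highest CUDA version not exceeding the supported maximum" rule is an assumption for illustration, not a documented platform behavior.

```python
import re
from typing import Optional, Sequence, Tuple

# Main-version tags copied from the 0.9.4 (Transformers 4.57.1) table.
MAIN_TAGS = (
    "lf0.9.4-tf4.57.1-torch2.8.0-cu12.6-1.1",
    "lf0.9.4-tf4.57.1-torch2.8.0-cu12.8-1.1",
    "lf0.9.4-tf4.57.1-torch2.8.0-cu11.8-1.1",
)


def cuda_version(tag: str) -> Tuple[int, int]:
    """Extract the CUDA (major, minor) pair from the -cu{version}- field."""
    match = re.search(r"-cu(\d+)\.(\d+)-", tag)
    if match is None:
        raise ValueError(f"no CUDA field in tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def pick_tag(max_cuda: Tuple[int, int],
             tags: Sequence[str] = MAIN_TAGS) -> Optional[str]:
    """Return the tag with the highest CUDA version <= max_cuda, else None."""
    usable = [t for t in tags if cuda_version(t) <= max_cuda]
    return max(usable, key=cuda_version, default=None)


if __name__ == "__main__":
    # A host that supports up to CUDA 12.7 cannot use the cu12.8 image,
    # so the cu12.6 image is selected.
    print(pick_tag((12, 7)))
    # Nothing in the list fits a host limited to CUDA 11.0.
    print(pick_tag((11, 0)))
```

Versions are compared as integer tuples rather than strings, so that, for example, 12.10 would sort above 12.8.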

0.9.4 (Transformers 4.56.0) model list

Model details

Model name	Series	Model type	Parameters	Notes
gpt-oss-20b	GPT-OSS	Base	20B	Open-source GPT model
gpt-oss-120b	GPT-OSS	Base	120B	Very large open-source GPT model
aya-23-8B	Aya	Multilingual	8B	Multilingual understanding and generation
aya-23-35B	Aya	Multilingual	35B	Large multilingual model
Baichuan-7B	Baichuan	Base	7B	Chinese-English bilingual base model
Baichuan-13B-Base	Baichuan	Base	13B	Chinese-English bilingual base model
Baichuan-13B-Chat	Baichuan	Chat	13B	Chinese-English bilingual chat model
Baichuan2-7B-Base	Baichuan2	Base	7B	2nd-generation Chinese-English bilingual base model
Baichuan2-13B-Base	Baichuan2	Base	13B	2nd-generation Chinese-English bilingual base model
Baichuan2-7B-Chat	Baichuan2	Chat	7B	2nd-generation chat model
Baichuan2-13B-Chat	Baichuan2	Chat	13B	2nd-generation chat model
bloom-560m	BLOOM	Base	560M	Small multilingual base model
bloom-3b	BLOOM	Base	3B	Multilingual base model
bloom-7b1	BLOOM	Base	7B	Multilingual base model
bloomz-560m	BLOOMZ	Instruction-tuned	560M	Small instruction-tuned model
bloomz-3b	BLOOMZ	Instruction-tuned	3B	Instruction-tuned model
bloomz-7b1-mt	BLOOMZ	Instruction-tuned	7B	Multitask instruction-tuned model
BlueLM-7B-Base	BlueLM	Base	7B	Chinese-English bilingual base model
BlueLM-7B-Chat	BlueLM	Chat	7B	Chinese-English bilingual chat model
Breeze-7B-Base-v1_0	Breeze	Base	7B	Lightweight Chinese base model
Breeze-7B-Instruct-v1_0	Breeze	Instruct	7B	Chinese instruct model
chatglm2-6b	ChatGLM	Chat	6B	2nd-generation chat model
chatglm3-6b-base	ChatGLM	Base	6B	3rd-generation base model
chatglm3-6b	ChatGLM	Chat	6B	3rd-generation chat model
chinese-llama-2-1.3b	Chinese-LLaMA	Base	1.3B	Small Chinese-optimized model
chinese-llama-2-7b	Chinese-LLaMA	Base	7B	Chinese-optimized model
chinese-llama-2-13b	Chinese-LLaMA	Base	13B	Large Chinese-optimized model
chinese-alpaca-2-1.3b	Chinese-Alpaca	Chat	1.3B	Small Chinese chat model
chinese-alpaca-2-7b	Chinese-Alpaca	Chat	7B	Chinese chat model
chinese-alpaca-2-13b	Chinese-Alpaca	Chat	13B	Large Chinese chat model
codegeex4-all-9b	CodeGeeX	Code	9B	Multilingual code generation
codegemma-7b	CodeGemma	Code	7B	Code generation base model
codegemma-7b-it	CodeGemma	Code	7B	Code generation instruct model
codegemma-1.1-2b	CodeGemma	Code	2B	Lightweight code model
codegemma-1.1-7b-it	CodeGemma	Code	7B	Code instruct model
Codestral-22B-v0.1	Codestral	Code	22B	Large code model
c4ai-command-r-v01	Command	RAG	-	Retrieval-augmented generation
c4ai-command-r-plus	Command	RAG	-	Enhanced RAG model
c4ai-command-r-v01-4bit	Command	Quantized	-	4-bit quantized version
c4ai-command-r-plus-4bit	Command	Quantized	-	Enhanced 4-bit quantized version
dbrx-base	DBRX	Base	-	MoE base model
dbrx-instruct	DBRX	Instruct	-	MoE instruct model
deepseek-llm-7b-base	DeepSeek-LLM	Base	7B	General-purpose base model
deepseek-llm-67b-base	DeepSeek-LLM	Base	67B	Large base model
deepseek-llm-7b-chat	DeepSeek-LLM	Chat	7B	General-purpose chat model
deepseek-llm-67b-chat	DeepSeek-LLM	Chat	67B	Large chat model
deepseek-math-7b-base	DeepSeek-Math	Math	7B	Math base model
deepseek-math-7b-instruct	DeepSeek-Math	Math	7B	Math instruct model
deepseek-moe-16b-base	DeepSeek-MoE	Base	16B	MoE base model
deepseek-moe-16b-chat	DeepSeek-MoE	Chat	16B	MoE chat model
DeepSeek-V2-Lite	DeepSeek-V2	Lightweight	-	Lightweight V2 model
DeepSeek-V2	DeepSeek-V2	Base	-	2nd-generation base model
DeepSeek-V2-Lite-Chat	DeepSeek-V2	Chat	-	Lightweight V2 chat model
DeepSeek-V2-Chat	DeepSeek-V2	Chat	-	2nd-generation chat model
DeepSeek-Coder-V2-Lite-Base	DeepSeek-Coder	Code	-	Lightweight code base model
DeepSeek-Coder-V2-Base	DeepSeek-Coder	Code	-	Code base model
DeepSeek-Coder-V2-Lite-Instruct	DeepSeek-Coder	Code	-	Lightweight code instruct model
DeepSeek-Coder-V2-Instruct	DeepSeek-Coder	Code	-	Code instruct model
deepseek-coder-6.7b-base	DeepSeek-Coder	Code	6.7B	Code base model
deepseek-coder-7b-base-v1.5	DeepSeek-Coder	Code	7B	Code base model v1.5
deepseek-coder-33b-base	DeepSeek-Coder	Code	33B	Large code base model
deepseek-coder-6.7b-instruct	DeepSeek-Coder	Code	6.7B	Code instruct model
deepseek-coder-7b-instruct-v1.5	DeepSeek-Coder	Code	7B	Code instruct model v1.5
deepseek-coder-33b-instruct	DeepSeek-Coder	Code	33B	Large code instruct model
DeepSeek-V2-Chat-0628	DeepSeek-V2	Chat	-	Specific release of the chat model
DeepSeek-V2.5	DeepSeek-V2.5	Base	-	V2.5 base model
DeepSeek-V2.5-1210	DeepSeek-V2.5	Base	-	Specific release of the base model
DeepSeek-V3-Base	DeepSeek-V3	Base	-	3rd-generation base model
DeepSeek-V3	DeepSeek-V3	Base	-	3rd-generation model
DeepSeek-V3-0324	DeepSeek-V3	Base	-	Specific release of the base model
DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek-R1	Reasoning	1.5B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-7B	DeepSeek-R1	Reasoning	7B	Distilled reasoning model
DeepSeek-R1-Distill-Llama-8B	DeepSeek-R1	Reasoning	8B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-14B	DeepSeek-R1	Reasoning	14B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-32B	DeepSeek-R1	Reasoning	32B	Distilled reasoning model
DeepSeek-R1-Distill-Llama-70B	DeepSeek-R1	Reasoning	70B	Distilled reasoning model
DeepSeek-R1-Zero	DeepSeek-R1	Reasoning	-	Reasoning model trained with RL only, without SFT
DeepSeek-R1	DeepSeek-R1	Reasoning	-	Reasoning model
DeepSeek-R1-0528-Qwen3-8B	DeepSeek-R1	Reasoning	8B	Specific release of the reasoning model
DeepSeek-R1-0528	DeepSeek-R1	Reasoning	-	Specific release of the reasoning model
EXAONE-3.0-7.8B-Instruct	EXAONE	Instruct	7.8B	Bilingual (English/Korean) instruct model
falcon-7b	Falcon	Base	7B	Open-source base model
falcon-11B	Falcon	Base	11B	Medium-sized base model
falcon-40b	Falcon	Base	40B	Large base model
falcon-180b	Falcon	Base	180B	Very large base model
falcon-7b-instruct	Falcon	Instruct	7B	Instruction-tuned model
falcon-40b-instruct	Falcon	Instruct	40B	Large instruct model
falcon-180b-chat	Falcon	Chat	180B	Very large chat model
gemma-2b	Gemma	Base	2B	Lightweight base model
gemma-7b	Gemma	Base	7B	Base model
gemma-2b-it	Gemma	Instruct	2B	Lightweight instruct model
gemma-7b-it	Gemma	Instruct	7B	Instruct model
gemma-1.1-2b-it	Gemma	Instruct	2B	Version 1.1 instruct model
gemma-1.1-7b-it	Gemma	Instruct	7B	Version 1.1 instruct model
gemma-2-2b	Gemma2	Base	2B	2nd-generation lightweight base model
gemma-2-9b	Gemma2	Base	9B	2nd-generation base model
gemma-2-27b	Gemma2	Base	27B	2nd-generation large base model
gemma-2-2b-it	Gemma2	Instruct	2B	2nd-generation lightweight instruct model
gemma-2-9b-it	Gemma2	Instruct	9B	2nd-generation instruct model
gemma-2-27b-it	Gemma2	Instruct	27B	2nd-generation large instruct model
glm-4-9b	GLM	Base	9B	4th-generation base model
glm-4-9b-chat	GLM	Chat	9B	4th-generation chat model
glm-4-9b-chat-1m	GLM	Chat	9B	Long-context chat model
gpt2	GPT-2	Base	124M	Base version
gpt2-medium	GPT-2	Base	355M	Medium version
gpt2-large	GPT-2	Base	774M	Large version
gpt2-xl	GPT-2	Base	1.5B	Extra-large version
granite-3.0-1b-a400m-base	Granite	Base	1B	Granite 3.0 base model
granite-3.0-3b-a800m-base	Granite	Base	3B	Granite 3.0 base model
granite-3.0-2b-base	Granite	Base	2B	Granite 3.0 base model
granite-3.0-8b-base	Granite	Base	8B	Granite 3.0 base model
granite-3.0-1b-a400m-instruct	Granite	Instruct	1B	Granite 3.0 instruct model
granite-3.0-3b-a800m-instruct	Granite	Instruct	3B	Granite 3.0 instruct model
granite-3.0-2b-instruct	Granite	Instruct	2B	Granite 3.0 instruct model
granite-3.0-8b-instruct	Granite	Instruct	8B	Granite 3.0 instruct model
Index-1.9B	Index	Base	1.9B	Lightweight base model
Index-1.9B-Pure	Index	Base	1.9B	Pure variant of the base model
Index-1.9B-Chat	Index	Chat	1.9B	Lightweight chat model
Index-1.9B-Character	Index	Role-play	1.9B	Role-play model
Index-1.9B-32K	Index	Base	1.9B	Long-context version
internlm-7b	InternLM	Base	7B	Base model
internlm-20b	InternLM	Base	20B	Large base model
internlm-chat-7b	InternLM	Chat	7B	Chat model
internlm-chat-20b	InternLM	Chat	20B	Large chat model
internlm2-7b	InternLM2	Base	7B	2nd-generation base model
internlm2-20b	InternLM2	Base	20B	2nd-generation large base model
internlm2-chat-7b	InternLM2	Chat	7B	2nd-generation chat model
internlm2-chat-20b	InternLM2	Chat	20B	2nd-generation large chat model
internlm2_5-1_8b	InternLM2.5	Base	1.8B	InternLM2.5 lightweight base model
internlm2_5-7b	InternLM2.5	Base	7B	InternLM2.5 base model
internlm2_5-20b	InternLM2.5	Base	20B	InternLM2.5 large base model
internlm2_5-1_8b-chat	InternLM2.5	Chat	1.8B	InternLM2.5 lightweight chat model
internlm2_5-7b-chat	InternLM2.5	Chat	7B	InternLM2.5 chat model
internlm2_5-7b-chat-1m	InternLM2.5	Chat	7B	1M-token long-context chat model
internlm2_5-20b-chat	InternLM2.5	Chat	20B	InternLM2.5 large chat model
internlm3-8b-instruct	InternLM3	Instruct	8B	3rd-generation instruct model
Jamba-v0.1	Jamba	Hybrid	-	Hybrid SSM-Transformer architecture
LingoWhale-8B	LingoWhale	Base	8B	Chinese-English bilingual model
llama-7b	LLaMA	Base	7B	Classic base model
llama-13b	LLaMA	Base	13B	Medium-sized base model
llama-30b	LLaMA	Base	30B	Large base model
llama-65b	LLaMA	Base	65B	Very large base model
Llama-2-7b-hf	LLaMA-2	Base	7B	2nd-generation base model
Llama-2-13b-hf	LLaMA-2	Base	13B	2nd-generation base model
Llama-2-70b-hf	LLaMA-2	Base	70B	2nd-generation large base model
Llama-2-7b-chat-hf	LLaMA-2	Chat	7B	2nd-generation chat model
Llama-2-13b-chat-hf	LLaMA-2	Chat	13B	2nd-generation chat model
Llama-2-70b-chat-hf	LLaMA-2	Chat	70B	2nd-generation large chat model
Meta-Llama-3-8B	LLaMA-3	Base	8B	3rd-generation base model
Meta-Llama-3-70B	LLaMA-3	Base	70B	3rd-generation large base model
Meta-Llama-3-8B-Instruct	LLaMA-3	Instruct	8B	3rd-generation instruct model
Meta-Llama-3-70B-Instruct	LLaMA-3	Instruct	70B	3rd-generation large instruct model
Llama3-8B-Chinese-Chat	LLaMA-3 Chinese	Chat	8B	Chinese-optimized chat model
Llama3-70B-Chinese-Chat	LLaMA-3 Chinese	Chat	70B	Large Chinese-optimized chat model
Meta-Llama-3.1-8B	LLaMA-3.1	Base	8B	Llama 3.1 base model
Meta-Llama-3.1-70B	LLaMA-3.1	Base	70B	Llama 3.1 large base model
Meta-Llama-3.1-405B	LLaMA-3.1	Base	405B	Very large base model
Meta-Llama-3.1-8B-Instruct	LLaMA-3.1	Instruct	8B	Llama 3.1 instruct model
Meta-Llama-3.1-70B-Instruct	LLaMA-3.1	Instruct	70B	Llama 3.1 large instruct model
Meta-Llama-3.1-405B-Instruct	LLaMA-3.1	Instruct	405B	Very large instruct model
Llama3.1-8B-Chinese-Chat	LLaMA-3.1 Chinese	Chat	8B	Llama 3.1 Chinese chat model
Llama3.1-70B-Chinese-Chat	LLaMA-3.1 Chinese	Chat	70B	Llama 3.1 large Chinese chat model
Llama-3.2-1B	LLaMA-3.2	Base	1B	Llama 3.2 lightweight base model
Llama-3.2-3B	LLaMA-3.2	Base	3B	Llama 3.2 lightweight base model
Llama-3.2-1B-Instruct	LLaMA-3.2	Instruct	1B	Llama 3.2 lightweight instruct model
Llama-3.2-3B-Instruct	LLaMA-3.2	Instruct	3B	Llama 3.2 lightweight instruct model
Llama-3.3-70B-Instruct	LLaMA-3.3	Instruct	70B	Llama 3.3 large instruct model
MiniCPM-2B-sft-bf16	MiniCPM	Chat	2B	SFT-optimized chat model
MiniCPM-2B-dpo-bf16	MiniCPM	Chat	2B	DPO-optimized chat model
MiniCPM3-4B	MiniCPM	Chat	4B	3rd-generation chat model
MiniCPM-o-2_6	MiniCPM	Multimodal	8B	Omni-modal model (version 2.6)
MiniCPM-V-2_6	MiniCPM	Multimodal	8B	Vision-language model (version 2.6)
Mistral-7B-v0.1	Mistral	Base	7B	First-generation base model
Mistral-7B-v0.2-hf	Mistral	Base	7B	Version 0.2 base model
Mistral-7B-v0.3	Mistral	Base	7B	Version 0.3 base model
Mistral-7B-Instruct-v0.1	Mistral	Instruct	7B	First-generation instruct model
Mistral-7B-Instruct-v0.2	Mistral	Instruct	7B	Version 0.2 instruct model
Mistral-7B-Instruct-v0.3	Mistral	Instruct	7B	Version 0.3 instruct model
Mixtral-8x7B-v0.1	Mixtral	Base	8x7B	MoE base model
Mixtral-8x22B-v0.1	Mixtral	Base	8x22B	Large MoE base model
Mixtral-8x7B-Instruct-v0.1	Mixtral	Instruct	8x7B	MoE instruct model
Mixtral-8x22B-Instruct-v0.1	Mixtral	Instruct	8x22B	Large MoE instruct model
OLMo-1B-hf	OLMo	Base	1B	Lightweight open-source model
OLMo-7B-hf	OLMo	Base	7B	Open-source base model
OLMo-7B-Instruct-hf	OLMo	Instruct	7B	Open-source instruct model
openchat-3.5-0106	OpenChat	Chat	-	Version 3.5 chat model
openchat-3.6-8b-20240522	OpenChat	Chat	8B	Version 3.6 chat model
Qwen-1_8B	Qwen	Base	1.8B	Lightweight base model
Qwen-7B	Qwen	Base	7B	Base model
Qwen-14B	Qwen	Base	14B	Medium-sized base model
Qwen-72B	Qwen	Base	72B	Large base model
Qwen-1_8B-Chat	Qwen	Chat	1.8B	Lightweight chat model
Qwen-7B-Chat	Qwen	Chat	7B	Chat model
Qwen-14B-Chat	Qwen	Chat	14B	Medium-sized chat model
Qwen-72B-Chat	Qwen	Chat	72B	Large chat model
Qwen1.5-0.5B	Qwen1.5	Base	0.5B	Ultra-lightweight base model
Qwen1.5-1.8B	Qwen1.5	Base	1.8B	Lightweight base model
Qwen1.5-4B	Qwen1.5	Base	4B	Small base model
Qwen1.5-7B	Qwen1.5	Base	7B	Base model
Qwen1.5-14B	Qwen1.5	Base	14B	Medium-sized base model
Qwen1.5-32B	Qwen1.5	Base	32B	Large base model
Qwen1.5-72B	Qwen1.5	Base	72B	Very large base model
Qwen1.5-110B	Qwen1.5	Base	110B	Extremely large base model
Qwen1.5-0.5B-Chat	Qwen1.5	Chat	0.5B	Ultra-lightweight chat model
Qwen1.5-1.8B-Chat	Qwen1.5	Chat	1.8B	Lightweight chat model
Qwen1.5-4B-Chat	Qwen1.5	Chat	4B	Small chat model
Qwen1.5-7B-Chat	Qwen1.5	Chat	7B	Chat model
Qwen1.5-14B-Chat	Qwen1.5	Chat	14B	Medium-sized chat model
Qwen1.5-32B-Chat	Qwen1.5	Chat	32B	Large chat model
Qwen1.5-72B-Chat	Qwen1.5	Chat	72B	Very large chat model
Qwen1.5-110B-Chat	Qwen1.5	Chat	110B	Extremely large chat model
Qwen2-0.5B	Qwen2	Base	0.5B	2nd-generation ultra-lightweight base model
Qwen2-1.5B	Qwen2	Base	1.5B	2nd-generation lightweight base model
Qwen2-7B	Qwen2	Base	7B	2nd-generation base model
Qwen2-72B	Qwen2	Base	72B	2nd-generation large base model
Qwen2-0.5B-Instruct	Qwen2	Instruct	0.5B	2nd-generation ultra-lightweight instruct model
Qwen2-1.5B-Instruct	Qwen2	Instruct	1.5B	2nd-generation lightweight instruct model
Qwen2-7B-Instruct	Qwen2	Instruct	7B	2nd-generation instruct model
Qwen2-72B-Instruct	Qwen2	Instruct	72B	2nd-generation large instruct model
SOLAR-10.7B-v1.0	SOLAR	Base	10.7B	Base model
SOLAR-10.7B-Instruct-v1.0	SOLAR	Instruct	10.7B	Instruct model
starcoder2-3b	StarCoder2	Code	3B	Lightweight code model
starcoder2-7b	StarCoder2	Code	7B	Code model
starcoder2-15b	StarCoder2	Code	15B	Medium-sized code model
TeleChat-1B	TeleChat	Chat	1B	Lightweight chat model
telechat-7B	TeleChat	Chat	7B	Chat model
vicuna-7b-v1.5	Vicuna	Chat	7B	LLaMA-based chat model
vicuna-13b-v1.5	Vicuna	Chat	13B	LLaMA-based chat model
XuanYuan-6B	XuanYuan	Base	6B	Finance-domain base model
XuanYuan-70B	XuanYuan	Base	70B	Large finance-domain base model
XuanYuan-6B-Chat	XuanYuan	Chat	6B	Finance chat model
XuanYuan-70B-Chat	XuanYuan	Chat	70B	Large finance chat model
XVERSE-7B	XVERSE	Base	7B	Base model
XVERSE-13B	XVERSE	Base	13B	Medium-sized base model
XVERSE-65B	XVERSE	Base	65B	Large base model
XVERSE-7B-Chat	XVERSE	Chat	7B	Chat model
XVERSE-13B-Chat	XVERSE	Chat	13B	Medium-sized chat model
XVERSE-65B-Chat	XVERSE	Chat	65B	Large chat model
Yi-6B	Yi	Base	6B	Base model
Yi-9B	Yi	Base	9B	Medium-sized base model
Yi-34B	Yi	Base	34B	Large base model
Yi-6B-Chat	Yi	Chat	6B	Chat model
Yi-34B-Chat	Yi	Chat	34B	Large chat model
zephyr-7b-alpha	Zephyr	Chat	7B	Alpha chat model
zephyr-7b-beta	Zephyr	Chat	7B	Beta chat model

Tip

Compared with LlamaFactory 0.9.3, LlamaFactory 0.9.4 (Transformers 4.56.0) adds support for the models listed below. Choose the image version that fits your requirements.

gpt-oss-20b, gpt-oss-120b, dots.ocr, gemma-3-270m, gemma-3-270m-it, GLM-4.1V-9B-Thinking, GLM-4.5-Air-Base, GLM-4.5-Base, GLM-4.5-Air, GLM-4.5, GLM-4.5V, granite-4.0-tiny-preview, Intern-S1-mini, Keye-VL-8B-Preview, Kimi-Dev-72B, Kimi-VL-A3B-Thinking-2506, MiniCPM4.1-8B, MiniCPM-V-4, Mistral-Small-3.2-24B-Instruct-2506, MobileLLM-R1-140M-base, MobileLLM-R1-360M-base, MobileLLM-R1-950M-base, MobileLLM-R1-140M, MobileLLM-R1-360M, MobileLLM-R1-950M, Qwen3-4B-Thinking-2507, Qwen3-30B-A3B-Thinking-2507, Qwen3-235B-A22B-Thinking-2507, Qwen3-Next-80B-A3B-Thinking, Qwen3-Next-80B-A3B-Instruct, Qwen3-Omni-30B-A3B-Captioner, Qwen3-Omni-30B-A3B-Instruct, Qwen3-Omni-30B-A3B-Thinking, Qwen3-VL-4B-Instruct, Qwen3-VL-8B-Instruct, Qwen3-VL-30B-A3B-Instruct, Qwen3-VL-235B-A22B-Instruct, Qwen3-VL-4B-Thinking, Qwen3-VL-8B-Thinking, Qwen3-VL-30B-A3B-Thinking, Qwen3-VL-235B-A22B-Thinking, Seed-OSS-36B-Base, Seed-OSS-36B-Base-woSyn, Seed-OSS-36B-Instruct

LlamaFactory 0.9.3 (previous release)

0.9.3 image list

Transformers	PyTorch	CUDA	vLLM	HuggingFace Hub	Image tag	Status
4.52.4	2.7.0	12.6	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.7.0-cu12.6-1.1`	🟡 Previous release
4.52.4	2.7.0	12.8	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.7.0-cu12.8-1.1`	🟡 Previous release
4.52.4	2.7.0	11.8	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.7.0-cu11.8-1.1`	🟡 Previous release
4.52.4	2.6.0	12.6	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.6.0-cu12.6-1.1`	🟡 Previous release
4.52.4	2.6.0	12.4	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.6.0-cu12.4-1.1`	🟡 Previous release
4.52.4	2.6.0	11.8	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.6.0-cu11.8-1.1`	🟡 Previous release
4.52.4	2.5.1	12.4	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.5.1-cu12.4-1.1`	🟡 Previous release
4.52.4	2.5.1	12.1	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.5.1-cu12.1-1.1`	🟡 Previous release
4.52.4	2.5.1	11.8	0.9.1	0.34.3	`lf0.9.3-tf4.52.4-torch2.5.1-cu11.8-1.1`	🟡 Previous release
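
The tags in the table above follow the naming rule stated earlier on this page: `lf{llamafactory version}-tf{transformers version}-torch{torch version}-cu{cuda version}-{internal build number}`. As a minimal sketch, assuming every tag follows that rule, a tag can be split into its components with a regular expression. The names `ImageTag` and `parse_image_tag` are hypothetical helpers for illustration and are not part of LlamaFactory or the platform.

```python
import re
from typing import NamedTuple


class ImageTag(NamedTuple):
    """Components of an image tag (hypothetical helper type)."""
    llamafactory: str
    transformers: str
    torch: str
    cuda: str
    build: str


# Pattern derived from the naming rule on this page; an assumption, not an official API.
_TAG_RE = re.compile(
    r"^lf(?P<llamafactory>\d+(?:\.\d+)*)"
    r"-tf(?P<transformers>\d+(?:\.\d+)*)"
    r"-torch(?P<torch>\d+(?:\.\d+)*)"
    r"-cu(?P<cuda>\d+(?:\.\d+)*)"
    r"-(?P<build>\d+(?:\.\d+)*)$"
)


def parse_image_tag(tag: str) -> ImageTag:
    """Split a tag into its version fields, or raise ValueError if it does not match."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"tag does not follow the naming rule: {tag!r}")
    return ImageTag(**match.groupdict())


if __name__ == "__main__":
    print(parse_image_tag("lf0.9.3-tf4.52.4-torch2.7.0-cu12.6-1.1"))
```

Parsing the tag rather than splitting on `-` keeps the check strict: a tag with a missing or misspelled field is rejected instead of being silently misread.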

0.9.3 feature summary

🔄 Stable release: transformers 4.52.4 + vllm 0.9.1
🔄 Broad compatibility: supports PyTorch 2.5.1 to 2.7.0 and CUDA 11.8 to 12.8
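
One way to use these compatibility ranges is to filter the 0.9.3 image list by the PyTorch version a project needs and the highest CUDA version the host supports. The sketch below hard-codes the (PyTorch, CUDA) pairs from the 0.9.3 image table. The function `pick_tag` and its selection policy (prefer the newest CUDA version that does not exceed the limit) are illustrative assumptions, not platform behavior.

```python
from typing import Optional, Tuple

# (PyTorch, CUDA) pairs copied from the 0.9.3 image table.
IMAGES_0_9_3 = [
    ("2.7.0", "12.6"), ("2.7.0", "12.8"), ("2.7.0", "11.8"),
    ("2.6.0", "12.6"), ("2.6.0", "12.4"), ("2.6.0", "11.8"),
    ("2.5.1", "12.4"), ("2.5.1", "12.1"), ("2.5.1", "11.8"),
]


def _version_key(version: str) -> Tuple[int, ...]:
    """Turn '12.4' into (12, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_tag(torch: str, max_cuda: str) -> Optional[str]:
    """Return the 0.9.3 tag for `torch` with the newest CUDA <= `max_cuda`, if any."""
    candidates = [
        cuda for t, cuda in IMAGES_0_9_3
        if t == torch and _version_key(cuda) <= _version_key(max_cuda)
    ]
    if not candidates:
        return None
    best = max(candidates, key=_version_key)
    # Transformers version and build number are the ones listed for 0.9.3.
    return f"lf0.9.3-tf4.52.4-torch{torch}-cu{best}-1.1"


if __name__ == "__main__":
    print(pick_tag("2.6.0", "12.5"))
```

Comparing versions as integer tuples avoids the string-comparison pitfall where "12.10" would sort before "12.4".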

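0.9.3 model list

Model details

Model name	Series	Model type	Parameters	Description
aya-23-8B	Aya series	Multilingual model	8B	Multilingual understanding and generation
aya-23-35B	Aya series	Multilingual model	35B	Large multilingual model
Baichuan-7B	Baichuan series	Base model	7B	Chinese-English bilingual base model
Baichuan-13B-Base	Baichuan series	Base model	13B	Chinese-English bilingual base model
Baichuan-13B-Chat	Baichuan series	Chat model	13B	Chinese-English bilingual chat model
Baichuan2-7B-Base	Baichuan2 series	Base model	7B	2nd-gen Chinese-English bilingual base model
Baichuan2-13B-Base	Baichuan2 series	Base model	13B	2nd-gen Chinese-English bilingual base model
Baichuan2-7B-Chat	Baichuan2 series	Chat model	7B	2nd-gen chat model
Baichuan2-13B-Chat	Baichuan2 series	Chat model	13B	2nd-gen chat model
bloom-560m	BLOOM series	Base model	560M	Small multilingual base model
bloom-3b	BLOOM series	Base model	3B	Multilingual base model
bloom-7b1	BLOOM series	Base model	7B	Multilingual base model
bloomz-560m	BLOOMZ series	Instruction-tuned	560M	Small instruction-tuned model
bloomz-3b	BLOOMZ series	Instruction-tuned	3B	Instruction-tuned model
bloomz-7b1-mt	BLOOMZ series	Instruction-tuned	7B	Multitask instruction-tuned model
BlueLM-7B-Base	BlueLM series	Base model	7B	Chinese-English bilingual base model
BlueLM-7B-Chat	BlueLM series	Chat model	7B	Chinese-English bilingual chat model
Breeze-7B-Base-v1_0	Breeze series	Base model	7B	Lightweight Chinese base model
Breeze-7B-Instruct-v1_0	Breeze series	Instruct model	7B	Chinese instruct model
chatglm2-6b	ChatGLM series	Chat model	6B	2nd-gen chat model
chatglm3-6b-base	ChatGLM series	Base model	6B	3rd-gen base model
chatglm3-6b	ChatGLM series	Chat model	6B	3rd-gen chat model
chinese-llama-2-1.3b	Chinese-LLaMA	Base model	1.3B	Small Chinese-optimized model
chinese-llama-2-7b	Chinese-LLaMA	Base model	7B	Chinese-optimized model
chinese-llama-2-13b	Chinese-LLaMA	Base model	13B	Large Chinese-optimized model
chinese-alpaca-2-1.3b	Chinese-Alpaca	Chat model	1.3B	Small Chinese chat model
chinese-alpaca-2-7b	Chinese-Alpaca	Chat model	7B	Chinese chat model
chinese-alpaca-2-13b	Chinese-Alpaca	Chat model	13B	Large Chinese chat model
codegeex4-all-9b	CodeGeeX series	Code model	9B	Multilingual code generation
codegemma-7b	CodeGemma series	Code model	7B	Code generation base model
codegemma-7b-it	CodeGemma series	Code model	7B	Code generation instruct model
codegemma-1.1-2b	CodeGemma series	Code model	2B	Lightweight code model
codegemma-1.1-7b-it	CodeGemma series	Code model	7B	Code instruct model
Codestral-22B-v0.1	Codestral series	Code model	22B	Large code model
c4ai-command-r-v01	Command series	RAG model	-	Retrieval-augmented generation
c4ai-command-r-plus	Command series	RAG model	-	Enhanced RAG model
c4ai-command-r-v01-4bit	Command series	Quantized model	-	4-bit quantized version
c4ai-command-r-plus-4bit	Command series	Quantized model	-	Enhanced 4-bit quantized version
dbrx-base	DBRX series	Base model	-	MoE base model
dbrx-instruct	DBRX series	Instruct model	-	MoE instruct model
deepseek-llm-7b-base	DeepSeek-LLM	Base model	7B	General-purpose base model
deepseek-llm-67b-base	DeepSeek-LLM	Base model	67B	Large base model
deepseek-llm-7b-chat	DeepSeek-LLM	Chat model	7B	General-purpose chat model
deepseek-llm-67b-chat	DeepSeek-LLM	Chat model	67B	Large chat model
deepseek-math-7b-base	DeepSeek-Math	Math model	7B	Math base model
deepseek-math-7b-instruct	DeepSeek-Math	Math model	7B	Math instruct model
deepseek-moe-16b-base	DeepSeek-MoE	Base model	16B	MoE base model
deepseek-moe-16b-chat	DeepSeek-MoE	Chat model	16B	MoE chat model
DeepSeek-V2-Lite	DeepSeek-V2	Lightweight model	-	V2 lightweight version
DeepSeek-V2	DeepSeek-V2	Base model	-	2nd-gen base model
DeepSeek-V2-Lite-Chat	DeepSeek-V2	Chat model	-	V2 lightweight chat model
DeepSeek-V2-Chat	DeepSeek-V2	Chat model	-	2nd-gen chat model
DeepSeek-Coder-V2-Lite-Base	DeepSeek-Coder	Code model	-	Lightweight code base model
DeepSeek-Coder-V2-Base	DeepSeek-Coder	Code model	-	Code base model
DeepSeek-Coder-V2-Lite-Instruct	DeepSeek-Coder	Code model	-	Lightweight code instruct model
DeepSeek-Coder-V2-Instruct	DeepSeek-Coder	Code model	-	Code instruct model
deepseek-coder-6.7b-base	DeepSeek-Coder	Code model	6.7B	Code base model
deepseek-coder-7b-base-v1.5	DeepSeek-Coder	Code model	7B	Code base model v1.5
deepseek-coder-33b-base	DeepSeek-Coder	Code model	33B	Large code base model
deepseek-coder-6.7b-instruct	DeepSeek-Coder	Code model	6.7B	Code instruct model
deepseek-coder-7b-instruct-v1.5	DeepSeek-Coder	Code model	7B	Code instruct model v1.5
deepseek-coder-33b-instruct	DeepSeek-Coder	Code model	33B	Large code instruct model
DeepSeek-V2-Chat-0628	DeepSeek-V2	Chat model	-	0628 release of the chat model
DeepSeek-V2.5	DeepSeek-V2.5	Base model	-	Generation 2.5 base model
DeepSeek-V2.5-1210	DeepSeek-V2.5	Base model	-	1210 release of the base model
DeepSeek-V3-Base	DeepSeek-V3	Base model	-	3rd-gen base model
DeepSeek-V3	DeepSeek-V3	Base model	-	3rd-gen model
DeepSeek-V3-0324	DeepSeek-V3	Base model	-	0324 release of the base model
DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek-R1	Reasoning model	1.5B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-7B	DeepSeek-R1	Reasoning model	7B	Distilled reasoning model
DeepSeek-R1-Distill-Llama-8B	DeepSeek-R1	Reasoning model	8B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-14B	DeepSeek-R1	Reasoning model	14B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-32B	DeepSeek-R1	Reasoning model	32B	Distilled reasoning model
DeepSeek-R1-Distill-Llama-70B	DeepSeek-R1	Reasoning model	70B	Distilled reasoning model
DeepSeek-R1-Zero	DeepSeek-R1	Reasoning model	-	Reasoning model trained with reinforcement learning only
DeepSeek-R1	DeepSeek-R1	Reasoning model	-	Reasoning model
DeepSeek-R1-0528-Qwen3-8B	DeepSeek-R1	Reasoning model	8B	0528 release of the reasoning model
DeepSeek-R1-0528	DeepSeek-R1	Reasoning model	-	0528 release of the reasoning model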
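EXAONE-3.0-7.8B-Instruct	EXAONE series	Instruct model	7.8B	English-Korean bilingual instruct model
falcon-7b	Falcon series	Base model	7B	Open-source base model
falcon-11B	Falcon series	Base model	11B	Mid-size base model
falcon-40b	Falcon series	Base model	40B	Large base model
falcon-180b	Falcon series	Base model	180B	Very large base model
falcon-7b-instruct	Falcon series	Instruct model	7B	Instruction-tuned model
falcon-40b-instruct	Falcon series	Instruct model	40B	Large instruct model
falcon-180b-chat	Falcon series	Chat model	180B	Very large chat model
gemma-2b	Gemma series	Base model	2B	Lightweight base model
gemma-7b	Gemma series	Base model	7B	Base model
gemma-2b-it	Gemma series	Instruct model	2B	Lightweight instruct model
gemma-7b-it	Gemma series	Instruct model	7B	Instruct model
gemma-1.1-2b-it	Gemma series	Instruct model	2B	Version 1.1 instruct model
gemma-1.1-7b-it	Gemma series	Instruct model	7B	Version 1.1 instruct model
gemma-2-2b	Gemma2 series	Base model	2B	2nd-gen lightweight base model
gemma-2-9b	Gemma2 series	Base model	9B	2nd-gen base model
gemma-2-27b	Gemma2 series	Base model	27B	2nd-gen large base model
gemma-2-2b-it	Gemma2 series	Instruct model	2B	2nd-gen lightweight instruct model
gemma-2-9b-it	Gemma2 series	Instruct model	9B	2nd-gen instruct model
gemma-2-27b-it	Gemma2 series	Instruct model	27B	2nd-gen large instruct model
glm-4-9b	GLM series	Base model	9B	4th-gen base model
glm-4-9b-chat	GLM series	Chat model	9B	4th-gen chat model
glm-4-9b-chat-1m	GLM series	Chat model	9B	Long-context chat model
gpt2	GPT-2 series	Base model	124M	Base version
gpt2-medium	GPT-2 series	Base model	355M	Medium version
gpt2-large	GPT-2 series	Base model	774M	Large version
gpt2-xl	GPT-2 series	Base model	1.5B	Extra-large version
Index-1.9B	Index series	Base model	1.9B	Lightweight base model
Index-1.9B-Pure	Index series	Base model	1.9B	Pure-variant base model
Index-1.9B-Chat	Index series	Chat model	1.9B	Lightweight chat model
Index-1.9B-Character	Index series	Role-play model	1.9B	Role-play model
Index-1.9B-32K	Index series	Base model	1.9B	Long-context version
internlm-7b	InternLM series	Base model	7B	Base model
internlm-20b	InternLM series	Base model	20B	Large base model
internlm-chat-7b	InternLM series	Chat model	7B	Chat model
internlm-chat-20b	InternLM series	Chat model	20B	Large chat model
internlm2-7b	InternLM2 series	Base model	7B	2nd-gen base model
internlm2-20b	InternLM2 series	Base model	20B	2nd-gen large base model
internlm2-chat-7b	InternLM2 series	Chat model	7B	2nd-gen chat model
internlm2-chat-20b	InternLM2 series	Chat model	20B	2nd-gen large chat model
internlm2_5-1_8b	InternLM2.5 series	Base model	1.8B	Gen-2.5 lightweight base model
internlm2_5-7b	InternLM2.5 series	Base model	7B	Gen-2.5 base model
internlm2_5-20b	InternLM2.5 series	Base model	20B	Gen-2.5 large base model
internlm2_5-1_8b-chat	InternLM2.5 series	Chat model	1.8B	Gen-2.5 lightweight chat model
internlm2_5-7b-chat	InternLM2.5 series	Chat model	7B	Gen-2.5 chat model
internlm2_5-7b-chat-1m	InternLM2.5 series	Chat model	7B	1M-token long-context chat model
internlm2_5-20b-chat	InternLM2.5 series	Chat model	20B	Gen-2.5 large chat model
internlm3-8b-instruct	InternLM3 series	Instruct model	8B	3rd-gen instruct model
Jamba-v0.1	Jamba series	Hybrid model	-	Hybrid SSM-Transformer architecture
LingoWhale-8B	LingoWhale series	Base model	8B	Chinese-English bilingual model
llama-7b	LLaMA series	Base model	7B	Classic base model
llama-13b	LLaMA series	Base model	13B	Mid-size base model
llama-30b	LLaMA series	Base model	30B	Large base model
llama-65b	LLaMA series	Base model	65B	Very large base model
Llama-2-7b-hf	LLaMA-2 series	Base model	7B	2nd-gen base model
Llama-2-13b-hf	LLaMA-2 series	Base model	13B	2nd-gen base model
Llama-2-70b-hf	LLaMA-2 series	Base model	70B	2nd-gen large base model
Llama-2-7b-chat-hf	LLaMA-2 series	Chat model	7B	2nd-gen chat model
Llama-2-13b-chat-hf	LLaMA-2 series	Chat model	13B	2nd-gen chat model
Llama-2-70b-chat-hf	LLaMA-2 series	Chat model	70B	2nd-gen large chat model
Meta-Llama-3-8B	LLaMA-3 series	Base model	8B	3rd-gen base model
Meta-Llama-3-70B	LLaMA-3 series	Base model	70B	3rd-gen large base model
Meta-Llama-3-8B-Instruct	LLaMA-3 series	Instruct model	8B	3rd-gen instruct model
Meta-Llama-3-70B-Instruct	LLaMA-3 series	Instruct model	70B	3rd-gen large instruct model
Llama3-8B-Chinese-Chat	LLaMA-3 Chinese	Chat model	8B	Chinese-optimized chat model
Llama3-70B-Chinese-Chat	LLaMA-3 Chinese	Chat model	70B	Large Chinese-optimized chat model
Meta-Llama-3.1-8B	LLaMA-3.1 series	Base model	8B	Gen-3.1 base model
Meta-Llama-3.1-70B	LLaMA-3.1 series	Base model	70B	Gen-3.1 large base model
Meta-Llama-3.1-405B	LLaMA-3.1 series	Base model	405B	Very large base model
Meta-Llama-3.1-8B-Instruct	LLaMA-3.1 series	Instruct model	8B	Gen-3.1 instruct model
Meta-Llama-3.1-70B-Instruct	LLaMA-3.1 series	Instruct model	70B	Gen-3.1 large instruct model
Meta-Llama-3.1-405B-Instruct	LLaMA-3.1 series	Instruct model	405B	Very large instruct model
Llama3.1-8B-Chinese-Chat	LLaMA-3.1 Chinese	Chat model	8B	Gen-3.1 Chinese chat model
Llama3.1-70B-Chinese-Chat	LLaMA-3.1 Chinese	Chat model	70B	Gen-3.1 large Chinese chat model
Llama-3.2-1B	LLaMA-3.2 series	Base model	1B	Gen-3.2 lightweight base model
Llama-3.2-3B	LLaMA-3.2 series	Base model	3B	Gen-3.2 lightweight base model
Llama-3.2-1B-Instruct	LLaMA-3.2 series	Instruct model	1B	Gen-3.2 lightweight instruct model
Llama-3.2-3B-Instruct	LLaMA-3.2 series	Instruct model	3B	Gen-3.2 lightweight instruct model
Llama-3.3-70B-Instruct	LLaMA-3.3 series	Instruct model	70B	Gen-3.3 large instruct model
MiniCPM-2B-sft-bf16	MiniCPM series	Chat model	2B	SFT-tuned chat model
MiniCPM-2B-dpo-bf16	MiniCPM series	Chat model	2B	DPO-tuned chat model
MiniCPM3-4B	MiniCPM series	Chat model	4B	3rd-gen chat model
MiniCPM-o-2_6	MiniCPM series	Multimodal model	8B	Omni-modal model (version 2.6)
MiniCPM-V-2_6	MiniCPM series	Multimodal model	8B	Vision-language model (version 2.6)
Mistral-7B-v0.1	Mistral series	Base model	7B	First-gen base model
Mistral-7B-v0.2-hf	Mistral series	Base model	7B	Version 0.2 base model
Mistral-7B-v0.3	Mistral series	Base model	7B	Version 0.3 base model
Mistral-7B-Instruct-v0.1	Mistral series	Instruct model	7B	First-gen instruct model
Mistral-7B-Instruct-v0.2	Mistral series	Instruct model	7B	Version 0.2 instruct model
Mistral-7B-Instruct-v0.3	Mistral series	Instruct model	7B	Version 0.3 instruct model
Mixtral-8x7B-v0.1	Mixtral series	Base model	8x7B	MoE base model
Mixtral-8x22B-v0.1	Mixtral series	Base model	8x22B	Large MoE base model
Mixtral-8x7B-Instruct-v0.1	Mixtral series	Instruct model	8x7B	MoE instruct model
Mixtral-8x22B-Instruct-v0.1	Mixtral series	Instruct model	8x22B	Large MoE instruct model
OLMo-1B-hf	OLMo series	Base model	1B	Lightweight open-source model
OLMo-7B-hf	OLMo series	Base model	7B	Open-source base model
OLMo-7B-Instruct-hf	OLMo series	Instruct model	7B	Open-source instruct model
openchat-3.5-0106	OpenChat series	Chat model	-	Version 3.5 chat model
openchat-3.6-8b-20240522	OpenChat series	Chat model	8B	Version 3.6 chat model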
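Qwen-1_8B	Qwen series	Base model	1.8B	Lightweight base model
Qwen-7B	Qwen series	Base model	7B	Base model
Qwen-14B	Qwen series	Base model	14B	Mid-size base model
Qwen-72B	Qwen series	Base model	72B	Large base model
Qwen-1_8B-Chat	Qwen series	Chat model	1.8B	Lightweight chat model
Qwen-7B-Chat	Qwen series	Chat model	7B	Chat model
Qwen-14B-Chat	Qwen series	Chat model	14B	Mid-size chat model
Qwen-72B-Chat	Qwen series	Chat model	72B	Large chat model
Qwen1.5-0.5B	Qwen1.5 series	Base model	0.5B	Ultra-lightweight base model
Qwen1.5-1.8B	Qwen1.5 series	Base model	1.8B	Lightweight base model
Qwen1.5-4B	Qwen1.5 series	Base model	4B	Small base model
Qwen1.5-7B	Qwen1.5 series	Base model	7B	Base model
Qwen1.5-14B	Qwen1.5 series	Base model	14B	Mid-size base model
Qwen1.5-32B	Qwen1.5 series	Base model	32B	Large base model
Qwen1.5-72B	Qwen1.5 series	Base model	72B	Very large base model
Qwen1.5-110B	Qwen1.5 series	Base model	110B	Giant base model
Qwen1.5-0.5B-Chat	Qwen1.5 series	Chat model	0.5B	Ultra-lightweight chat model
Qwen1.5-1.8B-Chat	Qwen1.5 series	Chat model	1.8B	Lightweight chat model
Qwen1.5-4B-Chat	Qwen1.5 series	Chat model	4B	Small chat model
Qwen1.5-7B-Chat	Qwen1.5 series	Chat model	7B	Chat model
Qwen1.5-14B-Chat	Qwen1.5 series	Chat model	14B	Mid-size chat model
Qwen1.5-32B-Chat	Qwen1.5 series	Chat model	32B	Large chat model
Qwen1.5-72B-Chat	Qwen1.5 series	Chat model	72B	Very large chat model
Qwen1.5-110B-Chat	Qwen1.5 series	Chat model	110B	Giant chat model
Qwen2-0.5B	Qwen2 series	Base model	0.5B	2nd-gen ultra-lightweight base model
Qwen2-1.5B	Qwen2 series	Base model	1.5B	2nd-gen lightweight base model
Qwen2-7B	Qwen2 series	Base model	7B	2nd-gen base model
Qwen2-72B	Qwen2 series	Base model	72B	2nd-gen large base model
Qwen2-0.5B-Instruct	Qwen2 series	Instruct model	0.5B	2nd-gen ultra-lightweight instruct model
Qwen2-1.5B-Instruct	Qwen2 series	Instruct model	1.5B	2nd-gen lightweight instruct model
Qwen2-7B-Instruct	Qwen2 series	Instruct model	7B	2nd-gen instruct model
Qwen2-72B-Instruct	Qwen2 series	Instruct model	72B	2nd-gen large instruct model
SOLAR-10.7B-v1.0	SOLAR series	Base model	10.7B	Base model
SOLAR-10.7B-Instruct-v1.0	SOLAR series	Instruct model	10.7B	Instruct model
starcoder2-3b	StarCoder2 series	Code model	3B	Lightweight code model
starcoder2-7b	StarCoder2 series	Code model	7B	Code model
starcoder2-15b	StarCoder2 series	Code model	15B	Mid-size code model
TeleChat-1B	TeleChat series	Chat model	1B	Lightweight chat model
telechat-7B	TeleChat series	Chat model	7B	Chat model
vicuna-7b-v1.5	Vicuna series	Chat model	7B	LLaMA-based chat model
vicuna-13b-v1.5	Vicuna series	Chat model	13B	LLaMA-based chat model
XuanYuan-6B	XuanYuan series	Base model	6B	Finance-domain base model
XuanYuan-70B	XuanYuan series	Base model	70B	Large finance-domain base model
XuanYuan-6B-Chat	XuanYuan series	Chat model	6B	Finance chat model
XuanYuan-70B-Chat	XuanYuan series	Chat model	70B	Large finance chat model
XVERSE-7B	XVERSE series	Base model	7B	Base model
XVERSE-13B	XVERSE series	Base model	13B	Mid-size base model
XVERSE-65B	XVERSE series	Base model	65B	Large base model
XVERSE-7B-Chat	XVERSE series	Chat model	7B	Chat model
XVERSE-13B-Chat	XVERSE series	Chat model	13B	Mid-size chat model
XVERSE-65B-Chat	XVERSE series	Chat model	65B	Large chat model
Yi-6B	Yi series	Base model	6B	Base model
Yi-9B	Yi series	Base model	9B	Mid-size base model
Yi-34B	Yi series	Base model	34B	Large base model
Yi-6B-Chat	Yi series	Chat model	6B	Chat model
Yi-34B-Chat	Yi series	Chat model	34B	Large chat model
zephyr-7b-alpha	Zephyr series	Chat model	7B	Alpha-release chat model
zephyr-7b-beta	Zephyr series	Chat model	7B	Beta-release chat model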

Tip

Compared with LlamaFactory 0.9.2, LlamaFactory 0.9.3 adds support for the models listed below. Choose the image version that fits your requirements.

DeepSeek-V3-0324, DeepSeek-R1-0528-Qwen3-8B, DeepSeek-R1-0528, gemma-3-1b-pt, gemma-3-1b-it, medgemma-27b-text-it, gemma-3-4b-pt, gemma-3-12b-pt, gemma-3-27b-pt, gemma-3-4b-it, gemma-3-12b-it, gemma-3-27b-it, medgemma-4b-pt, medgemma-4b-it, GLM-4-9B-0414, GLM-4-32B-Base-0414, GLM-4-32B-0414, GLM-Z1-9B-0414, GLM-Z1-32B-0414, granite-3.2-2b-instruct, granite-3.2-8b-instruct, granite-3.3-2b-base, granite-3.3-8b-base, granite-3.3-2b-instruct, granite-3.3-8b-instruct, granite-vision-3.2-2b, Hunyuan-7B-Instruct, InternVL2_5-2B-MPO-hf, InternVL2_5-8B-MPO-hf, InternVL3-1B-hf, InternVL3-2B-hf, InternVL3-8B-hf, InternVL3-14B-hf, InternVL3-38B-hf, InternVL3-78B-hf, Kimi-VL-A3B-Instruct, Kimi-VL-A3B-Thinking, Llama-4-Scout-17B-16E, Llama-4-Scout-17B-16E-Instruct, Llama-4-Maverick-17B-128E, Llama-4-Maverick-17B-128E-Instruct, MiMo-7B-Base, MiMo-7B-SFT, MiMo-7B-RL, MiMo-7B-RL-ZERO, MiMo-VL-7B-SFT, MiMo-VL-7B-RL, MiniCPM4-0.5B, MiniCPM4-8B, Mistral-Small-3.1-24B-Base-2503, Mistral-Small-3.1-24B-Instruct-2503, Qwen3-0.6B-Base, Qwen3-1.7B-Base, Qwen3-4B-Base, Qwen3-8B-Base, Qwen3-14B-Base, Qwen3-30B-A3B-Base, Qwen3-0.6B, Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Qwen3-14B, Qwen3-32B, Qwen3-30B-A3B, Qwen3-235B-A22B, Qwen3-0.6B-GPTQ-Int8, Qwen3-1.7B-GPTQ-Int8, Qwen3-4B-AWQ, Qwen3-8B-AWQ, Qwen3-14B-AWQ, Qwen3-32B-AWQ, Qwen3-30B-A3B-GPTQ-Int4, Qwen3-235B-A22B-GPTQ-Int4, Qwen2.5-Omni-3B, Qwen2.5-Omni-7B, Qwen2.5-Omni-7B-GPTQ-Int4, Qwen2.5-Omni-7B-AWQ, Qwen2.5-VL-32B-Instruct, Seed-Coder-8B-Base, Seed-Coder-8B-Instruct, Seed-Coder-8B-Reasoning-bf16, SmolLM-135M, SmolLM-360M, SmolLM-1.7B, SmolLM-135M-Instruct, SmolLM-360M-Instruct, SmolLM-1.7B-Instruct, SmolLM2-135M, SmolLM2-360M, SmolLM2-1.7B, SmolLM2-135M-Instruct, SmolLM2-360M-Instruct, SmolLM2-1.7B-Instruct
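
The "newly supported" lists on this page can be read as a lookup from a model name to the earliest LlamaFactory version that supports it. The sketch below uses only a few names taken from the two lists above, so `NEW_IN` is deliberately incomplete, and `minimum_version` is a hypothetical helper. Models absent from both lists are assumed here to be available from 0.9.2 onward; that assumption should be checked against the 0.9.2 model list.

```python
from typing import Dict, Set

# Partial excerpts of the "newly supported" lists on this page; extend as needed.
NEW_IN: Dict[str, Set[str]] = {
    "0.9.4": {"gpt-oss-20b", "GLM-4.5", "Qwen3-VL-8B-Instruct", "Seed-OSS-36B-Instruct"},
    "0.9.3": {"Qwen3-8B", "gemma-3-4b-it", "InternVL3-8B-hf", "MiMo-7B-RL"},
}


def minimum_version(model: str, default: str = "0.9.2") -> str:
    """Return the earliest LlamaFactory version whose list introduces `model`."""
    for version, models in NEW_IN.items():
        if model in models:
            return version
    # Assumption: anything not listed as new was already supported in 0.9.2.
    return default


if __name__ == "__main__":
    for name in ("Qwen3-8B", "gpt-oss-20b", "Qwen2-7B"):
        print(name, "->", minimum_version(name))
```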

LlamaFactory 0.9.2 (legacy release)

0.9.2 image list

Transformers	PyTorch	CUDA	vLLM	HuggingFace Hub	Image tag	Status
4.45.2	2.5.1	12.4	0.7.0	0.34.3	`lf0.9.2-tf4.45.2-torch2.5.1-cu12.4-1.1`	🔴 Legacy release
4.45.2	2.5.1	12.1	0.7.0	0.34.3	`lf0.9.2-tf4.45.2-torch2.5.1-cu12.1-1.1`	🔴 Legacy release
4.45.2	2.5.1	11.8	0.7.0	0.34.3	`lf0.9.2-tf4.45.2-torch2.5.1-cu11.8-1.1`	🔴 Legacy release

0.9.2 feature summary

⏳ Older components: transformers 4.45.2 + vllm 0.7.0
⏳ Limited support: PyTorch 2.5.1 only, with CUDA 11.8, 12.1, and 12.4

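0.9.2 model list

Model details

Model name	Series	Model type	Parameters	Description
aya-23-8B	Aya series	Multilingual model	8B	Multilingual understanding and generation
aya-23-35B	Aya series	Multilingual model	35B	Large multilingual model
Baichuan-7B	Baichuan series	Base model	7B	Chinese-English bilingual base model
Baichuan-13B-Base	Baichuan series	Base model	13B	Chinese-English bilingual base model
Baichuan-13B-Chat	Baichuan series	Chat model	13B	Chinese-English bilingual chat model
Baichuan2-7B-Base	Baichuan2 series	Base model	7B	2nd-gen Chinese-English bilingual base model
Baichuan2-13B-Base	Baichuan2 series	Base model	13B	2nd-gen Chinese-English bilingual base model
Baichuan2-7B-Chat	Baichuan2 series	Chat model	7B	2nd-gen chat model
Baichuan2-13B-Chat	Baichuan2 series	Chat model	13B	2nd-gen chat model
bloom-560m	BLOOM series	Base model	560M	Small multilingual base model
bloom-3b	BLOOM series	Base model	3B	Multilingual base model
bloom-7b1	BLOOM series	Base model	7B	Multilingual base model
bloomz-560m	BLOOMZ series	Instruction-tuned	560M	Small instruction-tuned model
bloomz-3b	BLOOMZ series	Instruction-tuned	3B	Instruction-tuned model
bloomz-7b1-mt	BLOOMZ series	Instruction-tuned	7B	Multitask instruction-tuned model
BlueLM-7B-Base	BlueLM series	Base model	7B	Chinese-English bilingual base model
BlueLM-7B-Chat	BlueLM series	Chat model	7B	Chinese-English bilingual chat model
Breeze-7B-Base-v1_0	Breeze series	Base model	7B	Lightweight Chinese base model
Breeze-7B-Instruct-v1_0	Breeze series	Instruct model	7B	Chinese instruct model
chatglm2-6b	ChatGLM series	Chat model	6B	2nd-gen chat model
chatglm3-6b-base	ChatGLM series	Base model	6B	3rd-gen base model
chatglm3-6b	ChatGLM series	Chat model	6B	3rd-gen chat model
chinese-llama-2-1.3b	Chinese-LLaMA	Base model	1.3B	Small Chinese-optimized model
chinese-llama-2-7b	Chinese-LLaMA	Base model	7B	Chinese-optimized model
chinese-llama-2-13b	Chinese-LLaMA	Base model	13B	Large Chinese-optimized model
chinese-alpaca-2-1.3b	Chinese-Alpaca	Chat model	1.3B	Small Chinese chat model
chinese-alpaca-2-7b	Chinese-Alpaca	Chat model	7B	Chinese chat model
chinese-alpaca-2-13b	Chinese-Alpaca	Chat model	13B	Large Chinese chat model
codegeex4-all-9b	CodeGeeX series	Code model	9B	Multilingual code generation
codegemma-7b	CodeGemma series	Code model	7B	Code generation base model
codegemma-7b-it	CodeGemma series	Code model	7B	Code generation instruct model
codegemma-1.1-2b	CodeGemma series	Code model	2B	Lightweight code model
codegemma-1.1-7b-it	CodeGemma series	Code model	7B	Code instruct model
Codestral-22B-v0.1	Codestral series	Code model	22B	Large code model
c4ai-command-r-v01	Command series	RAG model	-	Retrieval-augmented generation
c4ai-command-r-plus	Command series	RAG model	-	Enhanced RAG model
c4ai-command-r-v01-4bit	Command series	Quantized model	-	4-bit quantized version
c4ai-command-r-plus-4bit	Command series	Quantized model	-	Enhanced 4-bit quantized version
dbrx-base	DBRX series	Base model	-	MoE base model
dbrx-instruct	DBRX series	Instruct model	-	MoE instruct model
deepseek-llm-7b-base	DeepSeek-LLM	Base model	7B	General-purpose base model
deepseek-llm-67b-base	DeepSeek-LLM	Base model	67B	Large base model
deepseek-llm-7b-chat	DeepSeek-LLM	Chat model	7B	General-purpose chat model
deepseek-llm-67b-chat	DeepSeek-LLM	Chat model	67B	Large chat model
deepseek-math-7b-base	DeepSeek-Math	Math model	7B	Math base model
deepseek-math-7b-instruct	DeepSeek-Math	Math model	7B	Math instruct model
deepseek-moe-16b-base	DeepSeek-MoE	Base model	16B	MoE base model
deepseek-moe-16b-chat	DeepSeek-MoE	Chat model	16B	MoE chat model
DeepSeek-V2-Lite	DeepSeek-V2	Lightweight model	-	V2 lightweight version
DeepSeek-V2	DeepSeek-V2	Base model	-	2nd-gen base model
DeepSeek-V2-Lite-Chat	DeepSeek-V2	Chat model	-	V2 lightweight chat model
DeepSeek-V2-Chat	DeepSeek-V2	Chat model	-	2nd-gen chat model
DeepSeek-Coder-V2-Lite-Base	DeepSeek-Coder	Code model	-	Lightweight code base model
DeepSeek-Coder-V2-Base	DeepSeek-Coder	Code model	-	Code base model
DeepSeek-Coder-V2-Lite-Instruct	DeepSeek-Coder	Code model	-	Lightweight code instruct model
DeepSeek-Coder-V2-Instruct	DeepSeek-Coder	Code model	-	Code instruct model
deepseek-coder-6.7b-base	DeepSeek-Coder	Code model	6.7B	Code base model
deepseek-coder-7b-base-v1.5	DeepSeek-Coder	Code model	7B	Code base model v1.5
deepseek-coder-33b-base	DeepSeek-Coder	Code model	33B	Large code base model
deepseek-coder-6.7b-instruct	DeepSeek-Coder	Code model	6.7B	Code instruct model
deepseek-coder-7b-instruct-v1.5	DeepSeek-Coder	Code model	7B	Code instruct model v1.5
deepseek-coder-33b-instruct	DeepSeek-Coder	Code model	33B	Large code instruct model
DeepSeek-V2-Chat-0628	DeepSeek-V2	Chat model	-	0628 release of the chat model
DeepSeek-V2.5	DeepSeek-V2.5	Base model	-	Generation 2.5 base model
DeepSeek-V2.5-1210	DeepSeek-V2.5	Base model	-	1210 release of the base model
DeepSeek-V3-Base	DeepSeek-V3	Base model	-	3rd-gen base model
DeepSeek-V3	DeepSeek-V3	Base model	-	3rd-gen model
DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek-R1	Reasoning model	1.5B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-7B	DeepSeek-R1	Reasoning model	7B	Distilled reasoning model
DeepSeek-R1-Distill-Llama-8B	DeepSeek-R1	Reasoning model	8B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-14B	DeepSeek-R1	Reasoning model	14B	Distilled reasoning model
DeepSeek-R1-Distill-Qwen-32B	DeepSeek-R1	Reasoning model	32B	Distilled reasoning model
DeepSeek-R1-Distill-Llama-70B	DeepSeek-R1	Reasoning model	70B	Distilled reasoning model
DeepSeek-R1-Zero	DeepSeek-R1	Reasoning model	-	Reasoning model trained with reinforcement learning only
DeepSeek-R1	DeepSeek-R1	Reasoning model	-	Reasoning model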
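EXAONE-3.0-7.8B-Instruct	EXAONE series	Instruct model	7.8B	English-Korean bilingual instruct model
falcon-7b	Falcon series	Base model	7B	Open-source base model
falcon-11B	Falcon series	Base model	11B	Mid-size base model
falcon-40b	Falcon series	Base model	40B	Large base model
falcon-180b	Falcon series	Base model	180B	Very large base model
falcon-7b-instruct	Falcon series	Instruct model	7B	Instruction-tuned model
falcon-40b-instruct	Falcon series	Instruct model	40B	Large instruct model
falcon-180b-chat	Falcon series	Chat model	180B	Very large chat model
gemma-2b	Gemma series	Base model	2B	Lightweight base model
gemma-7b	Gemma series	Base model	7B	Base model
gemma-2b-it	Gemma series	Instruct model	2B	Lightweight instruct model
gemma-7b-it	Gemma series	Instruct model	7B	Instruct model
gemma-1.1-2b-it	Gemma series	Instruct model	2B	Version 1.1 instruct model
gemma-1.1-7b-it	Gemma series	Instruct model	7B	Version 1.1 instruct model
gemma-2-2b	Gemma2 series	Base model	2B	2nd-gen lightweight base model
gemma-2-9b	Gemma2 series	Base model	9B	2nd-gen base model
gemma-2-27b	Gemma2 series	Base model	27B	2nd-gen large base model
gemma-2-2b-it	Gemma2 series	Instruct model	2B	2nd-gen lightweight instruct model
gemma-2-9b-it	Gemma2 series	Instruct model	9B	2nd-gen instruct model
gemma-2-27b-it	Gemma2 series	Instruct model	27B	2nd-gen large instruct model
glm-4-9b	GLM series	Base model	9B	4th-gen base model
glm-4-9b-chat	GLM series	Chat model	9B	4th-gen chat model
glm-4-9b-chat-1m	GLM series	Chat model	9B	Long-context chat model
gpt2	GPT-2 series	Base model	124M	Base version
gpt2-medium	GPT-2 series	Base model	355M	Medium version
gpt2-large	GPT-2 series	Base model	774M	Large version
gpt2-xl	GPT-2 series	Base model	1.5B	Extra-large version
granite-3.0-1b-a400m-base	Granite series	Base model	1B	Granite 3.0 MoE base model
granite-3.0-3b-a800m-base	Granite series	Base model	3B	Granite 3.0 MoE base model
granite-3.0-2b-base	Granite series	Base model	2B	Granite 3.0 base model
granite-3.0-8b-base	Granite series	Base model	8B	Granite 3.0 base model
granite-3.0-1b-a400m-instruct	Granite series	Instruct model	1B	Granite 3.0 MoE instruct model
granite-3.0-3b-a800m-instruct	Granite series	Instruct model	3B	Granite 3.0 MoE instruct model
granite-3.0-2b-instruct	Granite series	Instruct model	2B	Granite 3.0 instruct model
granite-3.0-8b-instruct	Granite series	Instruct model	8B	Granite 3.0 instruct model
granite-3.1-1b-a400m-base	Granite series	Base model	1B	Granite 3.1 MoE base model
granite-3.1-3b-a800m-base	Granite series	Base model	3B	Granite 3.1 MoE base model
granite-3.1-2b-base	Granite series	Base model	2B	Granite 3.1 base model
granite-3.1-8b-base	Granite series	Base model	8B	Granite 3.1 base model
granite-3.1-1b-a400m-instruct	Granite series	Instruct model	1B	Granite 3.1 MoE instruct model
granite-3.1-3b-a800m-instruct	Granite series	Instruct model	3B	Granite 3.1 MoE instruct model
granite-3.1-2b-instruct	Granite series	Instruct model	2B	Granite 3.1 instruct model
granite-3.1-8b-instruct	Granite series	Instruct model	8B	Granite 3.1 instruct model
Index-1.9B	Index series	Base model	1.9B	Lightweight base model
Index-1.9B-Pure	Index series	Base model	1.9B	Pure-variant base model
Index-1.9B-Chat	Index series	Chat model	1.9B	Lightweight chat model
Index-1.9B-Character	Index series	Role-play model	1.9B	Role-play model
Index-1.9B-32K	Index series	Base model	1.9B	Long-context version
internlm-7b	InternLM series	Base model	7B	Base model
internlm-20b	InternLM series	Base model	20B	Large base model
internlm-chat-7b	InternLM series	Chat model	7B	Chat model
internlm-chat-20b	InternLM series	Chat model	20B	Large chat model
internlm2-7b	InternLM2 series	Base model	7B	2nd-gen base model
internlm2-20b	InternLM2 series	Base model	20B	2nd-gen large base model
internlm2-chat-7b	InternLM2 series	Chat model	7B	2nd-gen chat model
internlm2-chat-20b	InternLM2 series	Chat model	20B	2nd-gen large chat model
internlm2_5-1_8b	InternLM2.5 series	Base model	1.8B	Gen-2.5 lightweight base model
internlm2_5-7b	InternLM2.5 series	Base model	7B	Gen-2.5 base model
internlm2_5-20b	InternLM2.5 series	Base model	20B	Gen-2.5 large base model
internlm2_5-1_8b-chat	InternLM2.5 series	Chat model	1.8B	Gen-2.5 lightweight chat model
internlm2_5-7b-chat	InternLM2.5 series	Chat model	7B	Gen-2.5 chat model
internlm2_5-7b-chat-1m	InternLM2.5 series	Chat model	7B	1M-token long-context chat model
internlm2_5-20b-chat	InternLM2.5 series	Chat model	20B	Gen-2.5 large chat model
internlm3-8b-instruct	InternLM3 series	Instruct model	8B	3rd-gen instruct model
Jamba-v0.1	Jamba series	Hybrid model	-	Hybrid SSM-Transformer architecture
LingoWhale-8B	LingoWhale series	Base model	8B	Chinese-English bilingual model
llama-7b	LLaMA series	Base model	7B	Classic base model
llama-13b	LLaMA series	Base model	13B	Mid-size base model
llama-30b	LLaMA series	Base model	30B	Large base model
llama-65b	LLaMA series	Base model	65B	Very large base model
Llama-2-7b-hf	LLaMA-2 series	Base model	7B	2nd-gen base model
Llama-2-13b-hf	LLaMA-2 series	Base model	13B	2nd-gen base model
Llama-2-70b-hf	LLaMA-2 series	Base model	70B	2nd-gen large base model
Llama-2-7b-chat-hf	LLaMA-2 series	Chat model	7B	2nd-gen chat model
Llama-2-13b-chat-hf	LLaMA-2 series	Chat model	13B	2nd-gen chat model
Llama-2-70b-chat-hf	LLaMA-2 series	Chat model	70B	2nd-gen large chat model
Meta-Llama-3-8B	LLaMA-3 series	Base model	8B	3rd-gen base model
Meta-Llama-3-70B	LLaMA-3 series	Base model	70B	3rd-gen large base model
Meta-Llama-3-8B-Instruct	LLaMA-3 series	Instruct model	8B	3rd-gen instruct model
Meta-Llama-3-70B-Instruct	LLaMA-3 series	Instruct model	70B	3rd-gen large instruct model
Llama3-8B-Chinese-Chat	LLaMA-3 Chinese	Chat model	8B	Chinese-optimized chat model
Llama3-70B-Chinese-Chat	LLaMA-3 Chinese	Chat model	70B	Large Chinese-optimized chat model
Meta-Llama-3.1-8B	LLaMA-3.1 series	Base model	8B	Gen-3.1 base model
Meta-Llama-3.1-70B	LLaMA-3.1 series	Base model	70B	Gen-3.1 large base model
Meta-Llama-3.1-405B	LLaMA-3.1 series	Base model	405B	Very large base model
Meta-Llama-3.1-8B-Instruct	LLaMA-3.1 series	Instruct model	8B	Gen-3.1 instruct model
Meta-Llama-3.1-70B-Instruct	LLaMA-3.1 series	Instruct model	70B	Gen-3.1 large instruct model
Meta-Llama-3.1-405B-Instruct	LLaMA-3.1 series	Instruct model	405B	Very large instruct model
Llama3.1-8B-Chinese-Chat	LLaMA-3.1 Chinese	Chat model	8B	Gen-3.1 Chinese chat model
Llama3.1-70B-Chinese-Chat	LLaMA-3.1 Chinese	Chat model	70B	Gen-3.1 large Chinese chat model
Llama-3.2-1B	LLaMA-3.2 series	Base model	1B	Gen-3.2 lightweight base model
Llama-3.2-3B	LLaMA-3.2 series	Base model	3B	Gen-3.2 lightweight base model
Llama-3.2-1B-Instruct	LLaMA-3.2 series	Instruct model	1B	Gen-3.2 lightweight instruct model
Llama-3.2-3B-Instruct	LLaMA-3.2 series	Instruct model	3B	Gen-3.2 lightweight instruct model
Llama-3.3-70B-Instruct	LLaMA-3.3 series	Instruct model	70B	Gen-3.3 large instruct model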
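Llama-3.2-11B-Vision	LLaMA-3.2 multimodal	Vision model	11B	Vision-language model
Llama-3.2-11B-Vision-Instruct	LLaMA-3.2 multimodal	Vision model	11B	Vision instruct model
Llama-3.2-90B-Vision	LLaMA-3.2 multimodal	Vision model	90B	Large vision model
Llama-3.2-90B-Vision-Instruct	LLaMA-3.2 multimodal	Vision model	90B	Large vision instruct model
llava-1.5-7b-hf	LLaVA series	Multimodal model	7B	Vision-language model
llava-1.5-13b-hf	LLaVA series	Multimodal model	13B	Vision-language model
llava-v1.6-vicuna-7b-hf	LLaVA series	Multimodal model	7B	Vicuna-based vision model
llava-v1.6-vicuna-13b-hf	LLaVA series	Multimodal model	13B	Vicuna-based vision model
llava-v1.6-mistral-7b-hf	LLaVA series	Multimodal model	7B	Mistral-based vision model
llama3-llava-next-8b-hf	LLaVA series	Multimodal model	8B	LLaMA3-based vision model
llava-v1.6-34b-hf	LLaVA series	Multimodal model	34B	Large vision model
llava-next-72b-hf	LLaVA series	Multimodal model	72B	Very large vision model
llava-next-110b-hf	LLaVA series	Multimodal model	110B	Giant vision model
LLaVA-NeXT-Video-7B-hf	LLaVA-NeXT series	Video model	7B	Video understanding model
LLaVA-NeXT-Video-7B-DPO-hf	LLaVA-NeXT series	Video model	7B	DPO-tuned video model
LLaVA-NeXT-Video-7B-32K-hf	LLaVA-NeXT series	Video model	7B	Long-video understanding model
LLaVA-NeXT-Video-34B-hf	LLaVA-NeXT series	Video model	34B	Large video model
LLaVA-NeXT-Video-34B-DPO-hf	LLaVA-NeXT series	Video model	34B	DPO-tuned large video model
Marco-o1	Marco series	Reasoning model	-	Math reasoning model
MiniCPM-2B-sft-bf16	MiniCPM series	Chat model	2B	SFT-tuned chat model
MiniCPM-2B-dpo-bf16	MiniCPM series	Chat model	2B	DPO-tuned chat model
MiniCPM3-4B	MiniCPM series	Chat model	4B	3rd-gen chat model
MiniCPM-o-2_6	MiniCPM series	Multimodal model	8B	Omni-modal model (version 2.6)
MiniCPM-V-2_6	MiniCPM series	Multimodal model	8B	Vision-language model (version 2.6)
Ministral-8B-Instruct-2410	Ministral series	Instruct model	8B	Lightweight instruct model
Mistral-Nemo-Base-2407	Mistral series	Base model	-	Mistral NeMo base model
Mistral-Nemo-Instruct-2407	Mistral series	Instruct model	-	Mistral NeMo instruct model
Mistral-7B-v0.1	Mistral series	Base model	7B	First-gen base model
Mistral-7B-v0.2-hf	Mistral series	Base model	7B	Version 0.2 base model
Mistral-7B-v0.3	Mistral series	Base model	7B	Version 0.3 base model
Mistral-7B-Instruct-v0.1	Mistral series	Instruct model	7B	First-gen instruct model
Mistral-7B-Instruct-v0.2	Mistral series	Instruct model	7B	Version 0.2 instruct model
Mistral-7B-Instruct-v0.3	Mistral series	Instruct model	7B	Version 0.3 instruct model
Mistral-Small-24B-Base-2501	Mistral series	Base model	24B	Mistral Small base model
Mistral-Small-24B-Instruct-2501	Mistral series	Instruct model	24B	Mistral Small instruct model
Mixtral-8x7B-v0.1	Mixtral series	Base model	8x7B	MoE base model
Mixtral-8x22B-v0.1	Mixtral series	Base model	8x22B	Large MoE base model
Mixtral-8x7B-Instruct-v0.1	Mixtral series	Instruct model	8x7B	MoE instruct model
Mixtral-8x22B-Instruct-v0.1	Mixtral series	Instruct model	8x22B	Large MoE instruct model
Moonlight-16B-A3B	Moonlight series	Base model	16B	Moonlight base model
Moonlight-16B-A3B-Instruct	Moonlight series	Instruct model	16B	Moonlight instruct model
OLMo-1B-hf	OLMo series	Base model	1B	Lightweight open-source model
OLMo-7B-hf	OLMo series	Base model	7B	Open-source base model
OLMo-7B-Instruct-hf	OLMo series	Instruct model	7B	Open-source instruct model
OLMo-1.7-7B-hf	OLMo series	Base model	7B	Version 1.7 base model
openchat-3.5-0106	OpenChat series	Chat model	-	Version 3.5 chat model
openchat-3.6-8b-20240522	OpenChat series	Chat model	8B	Version 3.6 chat model
OpenCoder-1.5B-Base	OpenCoder series	Code model	1.5B	Lightweight code base model
OpenCoder-8B-Base	OpenCoder series	Code model	8B	Code base model
OpenCoder-1.5B-Instruct	OpenCoder series	Code model	1.5B	Lightweight code instruct model
OpenCoder-8B-Instruct	OpenCoder series	Code model	8B	Code instruct model
Orion-14B-Base	Orion series	Base model	14B	Base model
Orion-14B-Chat	Orion series	Chat model	14B	Chat model
Orion-14B-LongChat	Orion series	Chat model	14B	Long-context chat model
Orion-14B-Chat-RAG	Orion series	Chat model	14B	RAG-enhanced chat model
Orion-14B-Chat-Plugin	Orion series	Chat model	14B	Chat model with plugin support
paligemma-3b-pt-224	PaliGemma series	Multimodal model	3B	Image understanding model
paligemma-3b-pt-448	PaliGemma series	Multimodal model	3B	High-resolution version
paligemma-3b-pt-896	PaliGemma series	Multimodal model	3B	Ultra-high-resolution version
paligemma-3b-mix-224	PaliGemma series	Multimodal model	3B	Mix-trained version
paligemma-3b-mix-448	PaliGemma series	Multimodal model	3B	Mix-trained high-resolution version
paligemma2-3b-pt-224	PaliGemma2 series	Multimodal model	3B	2nd-gen image model
paligemma2-3b-pt-448	PaliGemma2 series	Multimodal model	3B	2nd-gen high-resolution version
paligemma2-3b-pt-896	PaliGemma2 series	Multimodal model	3B	2nd-gen ultra-high-resolution version
paligemma2-10b-pt-224	PaliGemma2 series	Multimodal model	10B	2nd-gen mid-size model
paligemma2-10b-pt-448	PaliGemma2 series	Multimodal model	10B	2nd-gen mid-size high-resolution version
paligemma2-10b-pt-896	PaliGemma2 series	Multimodal model	10B	2nd-gen mid-size ultra-high-resolution version
paligemma2-28b-pt-224	PaliGemma2 series	Multimodal model	28B	2nd-gen large model
paligemma2-28b-pt-448	PaliGemma2 series	Multimodal model	28B	2nd-gen large high-resolution version
paligemma2-28b-pt-896	PaliGemma2 series	Multimodal model	28B	2nd-gen large ultra-high-resolution version
paligemma2-3b-mix-224	PaliGemma2 series	Multimodal model	3B	2nd-gen mix-trained version
paligemma2-3b-mix-448	PaliGemma2 series	Multimodal model	3B	2nd-gen mix-trained high-resolution version
paligemma2-10b-mix-224	PaliGemma2 series	Multimodal model	10B	2nd-gen mid-size mix-trained version
paligemma2-10b-mix-448	PaliGemma2 series	Multimodal model	10B	2nd-gen mid-size mix-trained high-resolution version
paligemma2-28b-mix-224	PaliGemma2 series	Multimodal model	28B	2nd-gen large mix-trained version
paligemma2-28b-mix-448	PaliGemma2 series	Multimodal model	28B	2nd-gen large mix-trained high-resolution version
phi-1_5	Phi series	Base model	1.3B	Small base model
phi-2	Phi series	Base model	2.7B	Lightweight base model
Phi-3-mini-4k-instruct	Phi-3 series	Instruct model	-	Lightweight instruct model
Phi-3-mini-128k-instruct	Phi-3 series	Instruct model	-	Long-context instruct model
Phi-3-medium-4k-instruct	Phi-3 series	Instruct model	-	Mid-size instruct model
Phi-3-medium-128k-instruct	Phi-3 series	Instruct model	-	Mid-size long-context instruct model
Phi-3.5-mini-instruct	Phi-3.5 series	Instruct model	-	Gen-3.5 lightweight instruct model
Phi-3.5-MoE-instruct	Phi-3.5 series	Instruct model	-	MoE instruct model
Phi-3-small-8k-instruct	Phi-3 series	Instruct model	-	Small instruct model
Phi-3-small-128k-instruct	Phi-3 series	Instruct model	-	Small long-context instruct model
phi-4	Phi series	Base model	-	4th-gen base model
pixtral-12b	Pixtral series	Multimodal model	12B	Multilingual vision model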
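Qwen-1_8B	Qwen series	Base model	1.8B	Lightweight base model
Qwen-7B	Qwen series	Base model	7B	Base model
Qwen-14B	Qwen series	Base model	14B	Mid-size base model
Qwen-72B	Qwen series	Base model	72B	Large base model
Qwen-1_8B-Chat	Qwen series	Chat model	1.8B	Lightweight chat model
Qwen-7B-Chat	Qwen series	Chat model	7B	Chat model
Qwen-14B-Chat	Qwen series	Chat model	14B	Mid-size chat model
Qwen-72B-Chat	Qwen series	Chat model	72B	Large chat model
Qwen-1_8B-Chat-Int8	Qwen series	Quantized model	1.8B	Int8 quantized version
Qwen-1_8B-Chat-Int4	Qwen series	Quantized model	1.8B	Int4 quantized version
Qwen-7B-Chat-Int8	Qwen series	Quantized model	7B	Int8 quantized version
Qwen-7B-Chat-Int4	Qwen series	Quantized model	7B	Int4 quantized version
Qwen-14B-Chat-Int8	Qwen series	Quantized model	14B	Int8 quantized version
Qwen-14B-Chat-Int4	Qwen series	Quantized model	14B	Int4 quantized version
Qwen-72B-Chat-Int8	Qwen series	Quantized model	72B	Int8 quantized version
Qwen-72B-Chat-Int4	Qwen series	Quantized model	72B	Int4 quantized version
Qwen1.5-0.5B	Qwen1.5 series	Base model	0.5B	Ultra-lightweight base model
Qwen1.5-1.8B	Qwen1.5 series	Base model	1.8B	Lightweight base model
Qwen1.5-4B	Qwen1.5 series	Base model	4B	Small base model
Qwen1.5-7B	Qwen1.5 series	Base model	7B	Base model
Qwen1.5-14B	Qwen1.5 series	Base model	14B	Mid-size base model
Qwen1.5-32B	Qwen1.5 series	Base model	32B	Large base model
Qwen1.5-72B	Qwen1.5 series	Base model	72B	Very large base model
Qwen1.5-110B	Qwen1.5 series	Base model	110B	Giant base model
Qwen1.5-MoE-A2.7B	Qwen1.5 series	Base model	14.3B (2.7B active)	MoE base model
Qwen1.5-0.5B-Chat	Qwen1.5 series	Chat model	0.5B	Ultra-lightweight chat model
Qwen1.5-1.8B-Chat	Qwen1.5 series	Chat model	1.8B	Lightweight chat model
Qwen1.5-4B-Chat	Qwen1.5 series	Chat model	4B	Small chat model
Qwen1.5-7B-Chat	Qwen1.5 series	Chat model	7B	Chat model
Qwen1.5-14B-Chat	Qwen1.5 series	Chat model	14B	Mid-size chat model
Qwen1.5-32B-Chat	Qwen1.5 series	Chat model	32B	Large chat model
Qwen1.5-72B-Chat	Qwen1.5 series	Chat model	72B	Very large chat model
Qwen1.5-110B-Chat	Qwen1.5 series	Chat model	110B	Giant chat model
Qwen1.5-MoE-A2.7B-Chat	Qwen1.5 series	Chat model	14.3B (2.7B active)	MoE chat model
Qwen1.5-0.5B-Chat-GPTQ-Int8	Qwen1.5 series	Quantized model	0.5B	GPTQ-Int8 quantized version
Qwen1.5-0.5B-Chat-AWQ	Qwen1.5 series	Quantized model	0.5B	AWQ quantized version
Qwen1.5-1.8B-Chat-GPTQ-Int8	Qwen1.5 series	Quantized model	1.8B	GPTQ-Int8 quantized version
Qwen1.5-1.8B-Chat-AWQ	Qwen1.5 series	Quantized model	1.8B	AWQ quantized version
Qwen1.5-4B-Chat-GPTQ-Int8	Qwen1.5 series	Quantized model	4B	GPTQ-Int8 quantized version
Qwen1.5-4B-Chat-AWQ	Qwen1.5 series	Quantized model	4B	AWQ quantized version
Qwen1.5-7B-Chat-GPTQ-Int8	Qwen1.5 series	Quantized model	7B	GPTQ-Int8 quantized version
Qwen1.5-7B-Chat-AWQ	Qwen1.5 series	Quantized model	7B	AWQ quantized version
Qwen1.5-14B-Chat-GPTQ-Int8	Qwen1.5 series	Quantized model	14B	GPTQ-Int8 quantized version
Qwen1.5-14B-Chat-AWQ	Qwen1.5 series	Quantized model	14B	AWQ quantized version
Qwen1.5-32B-Chat-AWQ	Qwen1.5 series	Quantized model	32B	AWQ quantized version
Qwen1.5-72B-Chat-GPTQ-Int8	Qwen1.5 series	Quantized model	72B	GPTQ-Int8 quantized version
Qwen1.5-72B-Chat-AWQ	Qwen1.5 series	Quantized model	72B	AWQ quantized version
Qwen1.5-110B-Chat-AWQ	Qwen1.5 series	Quantized model	110B	AWQ quantized version
Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4	Qwen1.5 series	Quantized model	14.3B (2.7B active)	MoE GPTQ-Int4 quantized version
CodeQwen1.5-7B	CodeQwen series	Code model	7B	Code base model
CodeQwen1.5-7B-Chat	CodeQwen series	Code model	7B	Code chat model
CodeQwen1.5-7B-Chat-AWQ	CodeQwen series	Quantized model	7B	AWQ quantized code model
Qwen2-0.5B	Qwen2 series	Base model	0.5B	2nd-gen ultra-lightweight base model
Qwen2-1.5B	Qwen2 series	Base model	1.5B	2nd-gen lightweight base model
Qwen2-7B	Qwen2 series	Base model	7B	2nd-gen base model
Qwen2-72B	Qwen2 series	Base model	72B	2nd-gen large base model
Qwen2-57B-A14B	Qwen2 series	MoE model	57B (14B active)	Mixture-of-experts base model
Qwen2-0.5B-Instruct	Qwen2 series	Instruct model	0.5B	2nd-gen ultra-lightweight instruct model
Qwen2-1.5B-Instruct	Qwen2 series	Instruct model	1.5B	2nd-gen lightweight instruct model
Qwen2-7B-Instruct	Qwen2 series	Instruct model	7B	2nd-gen instruct model
Qwen2-72B-Instruct	Qwen2 series	Instruct model	72B	2nd-gen large instruct model
Qwen2-57B-A14B-Instruct	Qwen2 series	Instruct model	57B (14B active)	Mixture-of-experts instruct model
Qwen2-0.5B-Instruct-GPTQ-Int8	Qwen2 series	Quantized model	0.5B	GPTQ-Int8 quantized version
Qwen2-0.5B-Instruct-GPTQ-Int4	Qwen2 series	Quantized model	0.5B	GPTQ-Int4 quantized version
Qwen2-0.5B-Instruct-AWQ	Qwen2 series	Quantized model	0.5B	AWQ quantized version
Qwen2-1.5B-Instruct-GPTQ-Int8	Qwen2 series	Quantized model	1.5B	GPTQ-Int8 quantized version
Qwen2-1.5B-Instruct-GPTQ-Int4	Qwen2 series	Quantized model	1.5B	GPTQ-Int4 quantized version
Qwen2-1.5B-Instruct-AWQ	Qwen2 series	Quantized model	1.5B	AWQ quantized version
Qwen2-7B-Instruct-GPTQ-Int8	Qwen2 series	Quantized model	7B	GPTQ-Int8 quantized version
Qwen2-7B-Instruct-GPTQ-Int4	Qwen2 series	Quantized model	7B	GPTQ-Int4 quantized version
Qwen2-7B-Instruct-AWQ	Qwen2 series	Quantized model	7B	AWQ quantized version
Qwen2-72B-Instruct-GPTQ-Int8	Qwen2 series	Quantized model	72B	GPTQ-Int8 quantized version
Qwen2-72B-Instruct-GPTQ-Int4	Qwen2 series	Quantized model	72B	GPTQ-Int4 quantized version
Qwen2-72B-Instruct-AWQ	Qwen2 series	Quantized model	72B	AWQ quantized version
Qwen2-57B-A14B-Instruct-GPTQ-Int4	Qwen2 series	Quantized model	57B (14B active)	Mixture-of-experts GPTQ-Int4 quantized version
Qwen2-Math-1.5B	Qwen2-Math series	Math model	1.5B	Math base model
Qwen2-Math-7B	Qwen2-Math series	Math model	7B	Math base model
Qwen2-Math-72B	Qwen2-Math series	Math model	72B	Large math model
Qwen2-Math-1.5B-Instruct	Qwen2-Math series	Math model	1.5B	Math instruct model
Qwen2-Math-7B-Instruct	Qwen2-Math series	Math model	7B	Math instruct model
Qwen2-Math-72B-Instruct	Qwen2-Math series	Math model	72B	Large math instruct model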
Qwen2.5-0.5B	Qwen2.5系列	基座模型	0.5B	2.5代超轻量基座
Qwen2.5-1.5B	Qwen2.5系列	基座模型	1.5B	2.5代轻量基座
Qwen2.5-3B	Qwen2.5系列	基座模型	3B	2.5代小规模基座
Qwen2.5-7B	Qwen2.5系列	基座模型	7B	2.5代基座模型
Qwen2.5-14B	Qwen2.5系列	基座模型	14B	2.5代中等基座
Qwen2.5-32B	Qwen2.5系列	基座模型	32B	2.5代大规模基座
Qwen2.5-72B	Qwen2.5系列	基座模型	72B	2.5代超大规模基座
Qwen2.5-0.5B-Instruct	Qwen2.5系列	指令模型	0.5B	2.5代超轻量指令
Qwen2.5-1.5B-Instruct	Qwen2.5系列	指令模型	1.5B	2.5代轻量指令
Qwen2.5-3B-Instruct	Qwen2.5系列	指令模型	3B	2.5代小规模指令
Qwen2.5-7B-Instruct	Qwen2.5系列	指令模型	7B	2.5代指令模型
Qwen2.5-14B-Instruct	Qwen2.5系列	指令模型	14B	2.5代中等指令
Qwen2.5-32B-Instruct	Qwen2.5系列	指令模型	32B	2.5代大规模指令
Qwen2.5-72B-Instruct	Qwen2.5系列	指令模型	72B	2.5代超大规模指令
Qwen2.5-7B-Instruct-1M	Qwen2.5系列	指令模型	7B	百万字长上下文
Qwen2.5-14B-Instruct-1M	Qwen2.5系列	指令模型	14B	百万字长上下文
Qwen2.5-0.5B-Instruct-GPTQ-Int8	Qwen2.5 series	Quantized model	0.5B	GPTQ-Int8 quantized
Qwen2.5-0.5B-Instruct-GPTQ-Int4	Qwen2.5 series	Quantized model	0.5B	GPTQ-Int4 quantized
Qwen2.5-0.5B-Instruct-AWQ	Qwen2.5 series	Quantized model	0.5B	AWQ quantized
Qwen2.5-1.5B-Instruct-GPTQ-Int8	Qwen2.5 series	Quantized model	1.5B	GPTQ-Int8 quantized
Qwen2.5-1.5B-Instruct-GPTQ-Int4	Qwen2.5 series	Quantized model	1.5B	GPTQ-Int4 quantized
Qwen2.5-1.5B-Instruct-AWQ	Qwen2.5 series	Quantized model	1.5B	AWQ quantized
Qwen2.5-3B-Instruct-GPTQ-Int8	Qwen2.5 series	Quantized model	3B	GPTQ-Int8 quantized
Qwen2.5-3B-Instruct-GPTQ-Int4	Qwen2.5 series	Quantized model	3B	GPTQ-Int4 quantized
Qwen2.5-3B-Instruct-AWQ	Qwen2.5 series	Quantized model	3B	AWQ quantized
Qwen2.5-7B-Instruct-GPTQ-Int8	Qwen2.5 series	Quantized model	7B	GPTQ-Int8 quantized
Qwen2.5-7B-Instruct-GPTQ-Int4	Qwen2.5 series	Quantized model	7B	GPTQ-Int4 quantized
Qwen2.5-7B-Instruct-AWQ	Qwen2.5 series	Quantized model	7B	AWQ quantized
Qwen2.5-14B-Instruct-GPTQ-Int8	Qwen2.5 series	Quantized model	14B	GPTQ-Int8 quantized
Qwen2.5-14B-Instruct-GPTQ-Int4	Qwen2.5 series	Quantized model	14B	GPTQ-Int4 quantized
Qwen2.5-14B-Instruct-AWQ	Qwen2.5 series	Quantized model	14B	AWQ quantized
Qwen2.5-32B-Instruct-GPTQ-Int8	Qwen2.5 series	Quantized model	32B	GPTQ-Int8 quantized
Qwen2.5-32B-Instruct-GPTQ-Int4	Qwen2.5 series	Quantized model	32B	GPTQ-Int4 quantized
Qwen2.5-32B-Instruct-AWQ	Qwen2.5 series	Quantized model	32B	AWQ quantized
Qwen2.5-72B-Instruct-GPTQ-Int8	Qwen2.5 series	Quantized model	72B	GPTQ-Int8 quantized
Qwen2.5-72B-Instruct-GPTQ-Int4	Qwen2.5 series	Quantized model	72B	GPTQ-Int4 quantized
Qwen2.5-72B-Instruct-AWQ	Qwen2.5 series	Quantized model	72B	AWQ quantized
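Qwen2.5-Coder-0.5B	Qwen2.5-Coder series	Code model	0.5B	Ultra-light code base model
Qwen2.5-Coder-1.5B	Qwen2.5-Coder series	Code model	1.5B	Lightweight code base model
Qwen2.5-Coder-3B	Qwen2.5-Coder series	Code model	3B	Small code base model
Qwen2.5-Coder-7B	Qwen2.5-Coder series	Code model	7B	Code base model
Qwen2.5-Coder-14B	Qwen2.5-Coder series	Code model	14B	Mid-size code base model
Qwen2.5-Coder-32B	Qwen2.5-Coder series	Code model	32B	Large code base model
Qwen2.5-Coder-0.5B-Instruct	Qwen2.5-Coder series	Code model	0.5B	Ultra-light code instruct model
Qwen2.5-Coder-1.5B-Instruct	Qwen2.5-Coder series	Code model	1.5B	Lightweight code instruct model
Qwen2.5-Coder-3B-Instruct	Qwen2.5-Coder series	Code model	3B	Small code instruct model
Qwen2.5-Coder-7B-Instruct	Qwen2.5-Coder series	Code model	7B	Code instruct model
Qwen2.5-Coder-14B-Instruct	Qwen2.5-Coder series	Code model	14B	Mid-size code instruct model
Qwen2.5-Coder-32B-Instruct	Qwen2.5-Coder series	Code model	32B	Large code instruct model
Qwen2.5-Math-1.5B	Qwen2.5-Math series	Math model	1.5B	Lightweight math model
Qwen2.5-Math-7B	Qwen2.5-Math series	Math model	7B	Math model
Qwen2.5-Math-72B	Qwen2.5-Math series	Math model	72B	Large-scale math model
Qwen2.5-Math-1.5B-Instruct	Qwen2.5-Math series	Math model	1.5B	Lightweight math instruct model
Qwen2.5-Math-7B-Instruct	Qwen2.5-Math series	Math model	7B	Math instruct model
Qwen2.5-Math-72B-Instruct	Qwen2.5-Math series	Math model	72B	Large-scale math instruct model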
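QwQ-32B-Preview	QwQ series	Preview model	32B	Preview release
QwQ-32B	QwQ series	Reasoning model	32B	Official release
Qwen2-Audio-7B	Qwen2-Audio series	Audio model	7B	Audio base model
Qwen2-Audio-7B-Instruct	Qwen2-Audio series	Audio model	7B	Audio instruct model
Qwen2-VL-2B	Qwen2-VL series	Multimodal model	2B	Lightweight vision-language model
Qwen2-VL-7B	Qwen2-VL series	Multimodal model	7B	Vision-language model
Qwen2-VL-72B	Qwen2-VL series	Multimodal model	72B	Large vision-language model
Qwen2-VL-2B-Instruct	Qwen2-VL series	Multimodal model	2B	Lightweight vision instruct model
Qwen2-VL-7B-Instruct	Qwen2-VL series	Multimodal model	7B	Vision instruct model
Qwen2-VL-72B-Instruct	Qwen2-VL series	Multimodal model	72B	Large vision instruct model
Qwen2-VL-2B-Instruct-GPTQ-Int8	Qwen2-VL series	Quantized model	2B	Vision GPTQ-Int8 quantized
Qwen2-VL-2B-Instruct-GPTQ-Int4	Qwen2-VL series	Quantized model	2B	Vision GPTQ-Int4 quantized
Qwen2-VL-2B-Instruct-AWQ	Qwen2-VL series	Quantized model	2B	Vision AWQ quantized
Qwen2-VL-7B-Instruct-GPTQ-Int8	Qwen2-VL series	Quantized model	7B	Vision GPTQ-Int8 quantized
Qwen2-VL-7B-Instruct-GPTQ-Int4	Qwen2-VL series	Quantized model	7B	Vision GPTQ-Int4 quantized
Qwen2-VL-7B-Instruct-AWQ	Qwen2-VL series	Quantized model	7B	Vision AWQ quantized
Qwen2-VL-72B-Instruct-GPTQ-Int8	Qwen2-VL series	Quantized model	72B	Vision GPTQ-Int8 quantized
Qwen2-VL-72B-Instruct-GPTQ-Int4	Qwen2-VL series	Quantized model	72B	Vision GPTQ-Int4 quantized
Qwen2-VL-72B-Instruct-AWQ	Qwen2-VL series	Quantized model	72B	Vision AWQ quantized
QVQ-72B-Preview	QVQ series	Preview model	72B	Visual reasoning preview release
Qwen2.5-VL-3B-Instruct	Qwen2.5-VL series	Multimodal model	3B	Gen-2.5 vision instruct model
Qwen2.5-VL-7B-Instruct	Qwen2.5-VL series	Multimodal model	7B	Gen-2.5 vision instruct model
Qwen2.5-VL-72B-Instruct	Qwen2.5-VL series	Multimodal model	72B	Gen-2.5 large vision instruct model
Qwen2.5-VL-3B-Instruct-AWQ	Qwen2.5-VL series	Quantized model	3B	Vision AWQ quantized
Qwen2.5-VL-7B-Instruct-AWQ	Qwen2.5-VL series	Quantized model	7B	Vision AWQ quantized
Qwen2.5-VL-72B-Instruct-AWQ	Qwen2.5-VL series	Quantized model	72B	Vision AWQ quantized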
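SOLAR-10.7B-v1.0	SOLAR series	Base model	10.7B	Base model
SOLAR-10.7B-Instruct-v1.0	SOLAR series	Instruct model	10.7B	Instruct model
Skywork-13B-base	Skywork series	Base model	13B	Base model
Skywork-o1-Open-Llama-3.1-8B	Skywork series	Reasoning model	8B	Based on Llama 3.1
starcoder2-3b	StarCoder2 series	Code model	3B	Lightweight code model
starcoder2-7b	StarCoder2 series	Code model	7B	Code model
starcoder2-15b	StarCoder2 series	Code model	15B	Mid-size code model
TeleChat-1B	TeleChat series	Chat model	1B	Lightweight chat model
telechat-7B	TeleChat series	Chat model	7B	Chat model
TeleChat-12B-v2	TeleChat series	Chat model	12B	Version 2 chat model
TeleChat-52B	TeleChat series	Chat model	52B	Large chat model
TeleChat2-3B	TeleChat2 series	Chat model	3B	Gen-2 lightweight chat model
TeleChat2-7B	TeleChat2 series	Chat model	7B	Gen-2 chat model
TeleChat2-115B	TeleChat2 series	Chat model	115B	Gen-2 very large chat model
vicuna-7b-v1.5	Vicuna series	Chat model	7B	LLaMA-based chat model
vicuna-13b-v1.5	Vicuna series	Chat model	13B	LLaMA-based chat model
Video-LLaVA-7B-hf	Video-LLaVA series	Video model	7B	Video understanding model
XuanYuan-6B	XuanYuan series	Base model	6B	Finance-domain base model
XuanYuan-70B	XuanYuan series	Base model	70B	Large finance-domain base model
XuanYuan2-70B	XuanYuan2 series	Base model	70B	Gen-2 finance base model
XuanYuan-6B-Chat	XuanYuan series	Chat model	6B	Finance chat model
XuanYuan-70B-Chat	XuanYuan series	Chat model	70B	Large finance chat model
XuanYuan2-70B-Chat	XuanYuan2 series	Chat model	70B	Gen-2 finance chat model
XuanYuan-6B-Chat-8bit	XuanYuan series	Quantized model	6B	8-bit quantized
XuanYuan-6B-Chat-4bit	XuanYuan series	Quantized model	6B	4-bit quantized
XuanYuan-70B-Chat-8bit	XuanYuan series	Quantized model	70B	8-bit quantized
XuanYuan-70B-Chat-4bit	XuanYuan series	Quantized model	70B	4-bit quantized
XuanYuan2-70B-Chat-8bit	XuanYuan2 series	Quantized model	70B	Gen-2 8-bit quantized
XuanYuan2-70B-Chat-4bit	XuanYuan2 series	Quantized model	70B	Gen-2 4-bit quantized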
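XVERSE-7B	XVERSE series	Base model	7B	Base model
XVERSE-13B	XVERSE series	Base model	13B	Mid-size base model
XVERSE-65B	XVERSE series	Base model	65B	Large base model
XVERSE-65B-2	XVERSE series	Base model	65B	Version 2 base model
XVERSE-7B-Chat	XVERSE series	Chat model	7B	Chat model
XVERSE-13B-Chat	XVERSE series	Chat model	13B	Mid-size chat model
XVERSE-65B-Chat	XVERSE series	Chat model	65B	Large chat model
XVERSE-MoE-A4.2B	XVERSE series	Base model	4.2B active	MoE architecture
XVERSE-7B-Chat-GPTQ-Int8	XVERSE series	Quantized model	7B	GPTQ-Int8 quantized
XVERSE-7B-Chat-GPTQ-Int4	XVERSE series	Quantized model	7B	GPTQ-Int4 quantized
XVERSE-13B-Chat-GPTQ-Int8	XVERSE series	Quantized model	13B	GPTQ-Int8 quantized
XVERSE-13B-Chat-GPTQ-Int4	XVERSE series	Quantized model	13B	GPTQ-Int4 quantized
XVERSE-65B-Chat-GPTQ-Int4	XVERSE series	Quantized model	65B	GPTQ-Int4 quantized
yayi-7b-llama2	YaYi series	Base model	7B	Based on LLaMA 2
yayi-13b-llama2	YaYi series	Base model	13B	Based on LLaMA 2
Yi-6B	Yi series	Base model	6B	Base model
Yi-9B	Yi series	Base model	9B	Mid-size base model
Yi-34B	Yi series	Base model	34B	Large base model
Yi-6B-Chat	Yi series	Chat model	6B	Chat model
Yi-34B-Chat	Yi series	Chat model	34B	Large chat model
Yi-6B-Chat-8bits	Yi series	Quantized model	6B	8-bit quantized
Yi-6B-Chat-4bits	Yi series	Quantized model	6B	4-bit quantized
Yi-34B-Chat-8bits	Yi series	Quantized model	34B	8-bit quantized
Yi-34B-Chat-4bits	Yi series	Quantized model	34B	4-bit quantized
Yi-1.5-6B	Yi-1.5 series	Base model	6B	Gen-1.5 base model
Yi-1.5-9B	Yi-1.5 series	Base model	9B	Gen-1.5 mid-size base model
Yi-1.5-34B	Yi-1.5 series	Base model	34B	Gen-1.5 large base model
Yi-1.5-6B-Chat	Yi-1.5 series	Chat model	6B	Gen-1.5 chat model
Yi-1.5-9B-Chat	Yi-1.5 series	Chat model	9B	Gen-1.5 mid-size chat model
Yi-1.5-34B-Chat	Yi-1.5 series	Chat model	34B	Gen-1.5 large chat model
Yi-Coder-1.5B	Yi-Coder series	Code model	1.5B	Lightweight code model
Yi-Coder-9B	Yi-Coder series	Code model	9B	Code model
Yi-Coder-1.5B-Chat	Yi-Coder series	Code model	1.5B	Lightweight code chat model
Yi-Coder-9B-Chat	Yi-Coder series	Code model	9B	Code chat model
Yi-VL-6B-hf	Yi-VL series	Multimodal model	6B	Vision-language model
Yi-VL-34B-hf	Yi-VL series	Multimodal model	34B	Large vision-language model
Yuan2-2B-hf	Yuan2 series	Base model	2B	Lightweight base model
Yuan2-51B-hf	Yuan2 series	Base model	51B	Large base model
Yuan2-102B-hf	Yuan2 series	Base model	102B	Very large base model
zephyr-7b-alpha	Zephyr series	Chat model	7B	Alpha release chat model
zephyr-7b-beta	Zephyr series	Chat model	7B	Beta release chat model
zephyr-orpo-141b-A35b-v0.1	Zephyr series	Chat model	141B (35B active)	Very large chat model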

Version Comparison Summary

Dimension	LlamaFactory 0.9.4	LlamaFactory 0.9.3	LlamaFactory 0.9.2
Status	🟢 Main version / recommended for production	🟡 Previous version / stable	🔴 Legacy version / maintenance only
Transformers version	4.56.0-4.57.1	4.52.4	4.45.2
PyTorch support	2.5.1, 2.6.0, 2.7.1, 2.8.0	2.5.1, 2.6.0, 2.7.0	2.5.1 only
CUDA support	11.8, 12.1, 12.4, 12.6, 12.8	11.8, 12.1, 12.4, 12.6, 12.8	11.8, 12.1, 12.4
vLLM version	0.10.0-0.10.2	0.9.1	0.7.0
HuggingFace Hub	0.34.3-0.35.3	0.34.3	0.34.3

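Selection Guide

Choosing CUDA by PyTorch Version

Use the compatibility tables below to find the image that matches the PyTorch or CUDA version you need, then launch a GPU instance suited to your hardware.

PyTorch version	CUDA available in 0.9.4	CUDA available in 0.9.3	CUDA available in 0.9.2
2.8.0	11.8, 12.6, 12.8	❌ Not supported	❌ Not supported
2.7.1	11.8, 12.6, 12.8	❌ Not supported	❌ Not supported
2.7.0	❌ Not supported	11.8, 12.6, 12.8	❌ Not supported
2.6.0	11.8, 12.4, 12.6	11.8, 12.4, 12.6	❌ Not supported
2.5.1	11.8, 12.1, 12.4, 12.6	11.8, 12.1, 12.4	11.8, 12.1, 12.4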

Choosing PyTorch by CUDA Version

CUDA version	PyTorch available in 0.9.4	PyTorch available in 0.9.3	PyTorch available in 0.9.2
12.8	2.8.0, 2.7.1	2.7.0	❌ Not supported
12.6	2.8.0, 2.7.1, 2.6.0	2.7.0, 2.6.0	❌ Not supported
12.4	2.6.0, 2.5.1	2.6.0, 2.5.1	2.5.1
12.1	2.5.1	2.5.1	2.5.1
11.8	2.8.0, 2.7.1, 2.6.0, 2.5.1	2.7.0, 2.6.0, 2.5.1	2.5.1

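Summary

For the best stability and compatibility, we recommend that most users choose the LlamaFactory main version series (currently 0.9.4). When choosing, be sure to pick the image tag that matches the CUDA version of your environment.

🟢 Production: choose the 0.9.4 main version
🟡 Compatibility testing: choose an earlier 0.9.4 variant or version 0.9.3
🔴 Legacy systems: choose version 0.9.2 only when necessary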
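
When matching an image tag to your CUDA version, it can help to split the tag into its components. The sketch below assumes tags follow the naming rule given near the top of this page, `lf{llamafactory}-tf{transformers}-torch{torch}-cu{cuda}-{internal build}`; the pattern, the function name `parse_image_tag`, and the key names in the returned dictionary are illustrative and not part of any official tooling.

```python
import re
from typing import Dict, Optional

# Regular expression mirroring the documented tag naming rule.
# Each named group captures one component of the tag.
TAG_PATTERN = re.compile(
    r"^lf(?P<llamafactory>[\d.]+)"
    r"-tf(?P<transformers>[\d.]+)"
    r"-torch(?P<torch>[\d.]+)"
    r"-cu(?P<cuda>[\d.]+)"
    r"-(?P<build>[\w.]+)$"
)


def parse_image_tag(tag: str) -> Optional[Dict[str, str]]:
    """Split an image tag into its version components, or return None if it does not match."""
    match = TAG_PATTERN.match(tag)
    return match.groupdict() if match else None


if __name__ == "__main__":
    info = parse_image_tag("lf0.9.4-tf4.57.1-torch2.8.0-cu12.6-1.1")
    print(info["torch"], info["cuda"])  # prints: 2.8.0 12.6
```

A tag that does not follow the rule, such as `latest`, yields `None`, so callers can tell a malformed tag apart from a valid one.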
