FastChat Vicuna
public: 2025-04-19

See also the main item: /LLM. Official GitHub.
Follow this CSDN blog for the first-time run: CSDN (bak 2023-04-18).

Timing notes (on a Tesla V100 16G):
- convert_llama_weights_to_hf.py for LLaMA-7B: under 10 min.
- python -m fastchat.model.apply_delta for LLaMA-7B: under 10 min.
- GPTQ-for-LLaMA, quantizing LLaMA-13B to a 4-bit .pt: about 0.75 hour.

Vicuna GPTQ Models (quantized models) Comparison & WebUI Tutorial. ref: medium
See also FastChat for WebUI & RESTful API: FastChat GitHub Home. ...
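The first two timed steps above can be sketched as shell commands. This is a minimal sketch, not from the original note: all directory paths are placeholders, and the delta name `lmsys/vicuna-7b-delta-v1.1` is an assumed example of a FastChat delta repo.

```shell
# 1. Convert the original LLaMA checkpoint to Hugging Face format.
#    convert_llama_weights_to_hf.py ships with the transformers repo;
#    paths here are placeholders.
python convert_llama_weights_to_hf.py \
    --input_dir /path/to/llama-weights \
    --model_size 7B \
    --output_dir /path/to/llama-7b-hf

# 2. Apply the Vicuna delta weights on top of the converted base model
#    to obtain the Vicuna-7B checkpoint.
python -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-7b-hf \
    --target-model-path /path/to/vicuna-7b \
    --delta-path lmsys/vicuna-7b-delta-v1.1
```

Both steps need enough RAM to hold the full model; each finished in under 10 minutes for 7B in the timing notes above.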