LLAMA

ChatGLM

2023-05-29. Category & Tags: AIGC, GPT, ChatGPT, Vicuna, LLAMA, LLM, ChatGLM

public: 2025-04-19. See also the main item: /LLM. [In progress, not finished.]

See also:
Step-by-step: building a local knowledge base question-answering application with Langchain and chatglm-6b (9.5 min)
PyTorch Intro 20 - Local knowledge base LLM dialogue system (langchain-ChatGLM project) - source code analysis (series finale) - Learning the official PyTorch getting-started tutorial with Xiaoyu'er (37 min)
Using LangChain and the Chinese LLM ChatGLM-6B for automatic question answering over a local knowledge base (1.4 min)

GitHub: https://github.com/thomas-yanxin/LangChain-ChatGLM-Webui
ModelScope online demo: https://modelscope.cn/studios/AI-ModelScope/LangChain-ChatLLM/summary
OpenI: https://openi.pcl.ac.cn/Learning-Develop-Union/LangChain-ChatGLM-Webui

Install Env
ref: imClumsyPanda/langchain-ChatGLM (tested on 22.04)
Public
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
echo distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia. ...

FastChat Vicuna

2023-05-29. Category & Tags: AIGC, GPT, ChatGPT, LLAMA, LLM, FastChat, Vicuna

public: 2025-04-19. See also the main item: /LLM. Official GitHub. For a first run, follow this CSDN blog post: CSDN (backup 2023-04-18).

Timing notes (on a Tesla V100 16G):
convert_llama_weights_to_hf.py for LLAMA-7B takes under 10 min.
python -m fastchat.model.apply_delta for LLAMA-7B takes under 10 min.
GPTQ-for-LLaMA, converting LLAMA-13B to a 4-bit .pt file, takes 0.75 hour.

Vicuna GPTQ Models (quantized models) Comparison & WebUI Tutorial. ref: medium
See also FastChat for WebUI & RESTful API: FastChat GitHub Home. ...
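A rough back-of-the-envelope check shows why 4-bit quantization matters on a 16 GB card like the V100 mentioned above. The sketch below estimates the memory taken by the weights alone. The parameter counts are nominal (7e9 and 13e9, an assumption), and the function name `weight_memory_gb` is made up for this illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts (assumption: the real counts differ slightly).
NOMINAL_PARAMS = {"LLAMA-7B": 7e9, "LLAMA-13B": 13e9}
GPU_MEMORY_GB = 16  # Tesla V100 16G, as in the timing notes

for name, n_params in NOMINAL_PARAMS.items():
    for bits in (16, 4):
        gb = weight_memory_gb(n_params, bits)
        verdict = "fits" if gb < GPU_MEMORY_GB else "does not fit"
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB of weights, {verdict} in {GPU_MEMORY_GB} GB")
```

This counts weights only. Activations, the KV cache, and quantization metadata add to it, so a model that "fits" by this estimate can still run out of memory in practice.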

LlamaIndex

2023-05-29. Category & Tags: AIGC, GPT, ChatGPT, LLAMA, LLM, LlamaIndex

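public: 2025-04-19. See also the main item: /LLM. Similar/related: haystack. ref: Introduction to LlamaIndex principles and applications (architecture logic in different scenarios), by 字节字节 on bilibili.

Core features of LlamaIndex
Overall flow of the knowledge base Q&A example: load the data, split it and build the index · persist the index · query and generate.
Data connectors: APIs, pdf, ppt, docx, markdown, image, audio, video, tables…
Index: list, vector store, tree, keyword table, Pandas, SQL.
Storage: integration with various vector databases. More complex since version 0.6: split into three stores (doc, index, and vector).
Query: querying and result generation for each index type, mainly split into two parts, retrieve (recall) and synthesize (combine and generate). The extra_info in the query result supports displaying citations.
Post process: post-processing of the retrieved results, e.g. keyword filtering and re-ranking.
Customization: LLM, prompt, embedding, storage, etc.
Optimizers: optimize calls to save tokens.

Query-related features and scenarios
Vector Index - commonly used for QA.
Tree Index - for scenarios with multiple knowledge bases (parent nodes are generated recursively, bottom-up, using prompt & synthesis).
Keyword Table Index - commonly used when questions are short and contain many domain-specific terms (the keywords are also generated via a prompt).
DEFAULT_KEYWORD_EXTRACT_TEMPLATE_TMPL = ( "Some text is provided below. ...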
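To make the retrieve/synthesize split and the keyword table idea concrete, here is a toy sketch in plain Python. It is not LlamaIndex's API: the class and function names are hypothetical, keywords come from a simple regex rather than from an LLM prompt, and the synthesize step only builds the prompt text that a real system would send to a model.

```python
import re
from collections import defaultdict

STOPWORDS = {"the", "a", "an", "is", "of", "to", "and", "in", "for", "what", "how", "on", "with"}


def extract_keywords(text):
    """Stand-in for prompt-based extraction: lowercase word tokens minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9_]+", text.lower()) if w not in STOPWORDS}


class ToyKeywordTableIndex:
    """Hypothetical minimal keyword table: maps each keyword to the chunks containing it."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.table = defaultdict(set)  # keyword -> ids of chunks containing it
        for chunk_id, chunk in enumerate(self.chunks):
            for kw in extract_keywords(chunk):
                self.table[kw].add(chunk_id)

    def retrieve(self, question, top_k=2):
        """Recall step: rank chunks by how many question keywords they share."""
        hits = defaultdict(int)
        for kw in extract_keywords(question):
            for chunk_id in self.table.get(kw, ()):
                hits[chunk_id] += 1
        ranked = sorted(hits, key=lambda cid: (-hits[cid], cid))
        return [self.chunks[cid] for cid in ranked[:top_k]]


def synthesize(question, retrieved):
    """Synthesis step, reduced to building the prompt an LLM would receive."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


chunks = [
    "Vector Index is commonly used for QA.",
    "Tree Index builds parent nodes bottom-up.",
    "Keyword Table Index suits short questions with domain terms.",
]
index = ToyKeywordTableIndex(chunks)
question = "When to use Keyword Table Index?"
retrieved = index.retrieve(question, top_k=1)
print(synthesize(question, retrieved))
```

Keeping retrieval separate from synthesis means the recall strategy (keyword table, vector similarity, tree traversal) can change without touching the generation step.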

OLLAMA

2023-05-29. Category & Tags: AIGC, GPT, ChatGPT, LLAMA, LLM, OLLAMA, ChatBotOLLAMA

public: 2025-04-19. See also the main item: /LLM. On Windows, Linux, and macOS the executable runs directly, downloads the model weights automatically, and needs no network proxy. Tencent Developer, (bak).

Note: after ollama run llama2 and before npm run dev (chatbot), you also need to run ollama run mistral; otherwise you get the error 'model 'mistral:latest' not found, try pulling it first'. To run a given model, use ollama run <MODEL> directly in the folder, for example ollama run llama2:latest, ollama run qwen, or ollama run gemma. Once the models are available, run npm run dev and open localhost:3000 as prompted to select a model.

To allow listening on all network interfaces:
One time (nix): export OLLAMA_HOST=0.0.0.0:11434 && ollama run ... (Mac: launchctl setenv OLLAMA_HOST 0.0.0.0:11434)
Always (nix): vim /etc/systemd/system/ollama. ...
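The "not found, try pulling it first" error above can be caught before starting the chatbot by asking the Ollama server which models it already has. The sketch below assumes the server's `/api/tags` endpoint on the default port 11434. The helper names are made up for this example, and the demonstration runs offline on a hand-written payload in the shape that endpoint returns.

```python
import json
import urllib.request


def normalize(name):
    """Ollama treats a model name without a tag as '<name>:latest'."""
    return name if ":" in name else f"{name}:latest"


def missing_models(required, tags_payload):
    """Return the required models that are absent from an /api/tags response."""
    local = {m["name"] for m in tags_payload.get("models", [])}
    return [name for name in required if normalize(name) not in local]


def fetch_tags(base_url="http://localhost:11434"):
    """Query a running Ollama server for its local models (not called in this demo)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


# Offline demonstration: llama2 and qwen are present, mistral is not.
sample = {"models": [{"name": "llama2:latest"}, {"name": "qwen:latest"}]}
print(missing_models(["llama2", "mistral"], sample))
```

Against a live server, `missing_models(["mistral"], fetch_tags())` would return the models still to be fetched with ollama run <MODEL> before launching npm run dev.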