LLM (Large Language Model) 大语言模型
See also:
- /chatglm
- /fastchat-vicuna
- /llamaindex
- /ollama
- Chinese NLP Data: 四大名著现代汉语版、古汉语版
Interesting APPs #
Some simple examples/demos.
- 语音聊天机器人:【Open WebUI+Ollama/vLLM+CosyVoice+Whisper】终极个人聊天互动机器人-环境部署及成果展示
- 简单多模态:ollama+open-webui_知识库+多模态+文生图功能详解
Fine-tune vs. RAG vs. Prompt #
- Fine-tune ≈ Learn a course
- RAG ≈ open-book examination
- Prompt ≈ ?
Several Tools to Run the LLM’s Model (itself) #
本地大模型启动 openai 服务的 N 种方式,vllm,fastchat,llama factory,llama.cpp,ollama
Chain of Thought / 思维链 #
- 狭义: Chain of Thought = COT
- 广义: 3 种: Chain of Thought = COT; Tree of Thought = TOT; Graph of Thought = GOT. 对应数据结构分别是: list, tree, graph.
RAG (Retrieval-Augmented Generation) Basics #
Typical RAG Question-answering System with Local-knowledgebase #
ref: 从基础 RAG 到 Agent: Llamaindex 助力大模型 应用落地的思考与实践,稀土开发者大会 2024.6
typical workflow (steps/modules): #
- Data Parsing & Input
- Search
naive RAG is NOT good at: #
- Summarization / 综述总结
- Comparison / 对比分析
- Implicit Data / 含蓄暗示(间接提示)
- Multi-part Questions / 多分段综合问题
due to (Reasons):
- single-short / 单回合
- no plan / 无 Query 理解、步骤规划,或低级理解和规划
- no tools / 无工具调用
- no reflection / 无反思
- no memory / 无(上下文)记忆
thus improvements (Agentic RAG / 主动式 RAG):
- use external tools
- add reflection
- multi-turn / 多回合 (多次循环或者按需循环,直到满意或者达到最大次数)
- add query 理解/规划层 (逐步给出足够复杂的解决方案)
- add memory
Define: Agentic RAG (which has 4 components/steps): #
- Reflect / 反思 【相对成熟】
- Tool Use / 工具使用 【相对成熟】
- Routing / 路由
- Conversation Memory / 对话记忆
- Query Planning / 规划
Agent #
See also:
- langchain-ai/langgraph : langgraph examples, 2024 updating…, many, with videos on youtube
- Tool Use - Simple Data Analyst Agent with Cohere and Langchain , with videos on youtube, 2024.4
- Building a LangGraph ReAct Mini Agent: reasoning graph should NOT be complicated.
- (从官方翻译而来)手把手带你搭建 Agent 智能体!从零到一超详细原理微调讲解+代码解析项目实战,毛毛虫都能学清楚!—RAG,prompt,微调,Agent. (2.5h。重点看 React 框架如何解决问题,忽略 P6-Memroy 中的关于人的各种记忆的随意定义。三连评论换课件。)
- Microsoft MS AutoGen examples, official, agent-focus
- CrewAI examples, official, few
Sunny Define Agent: 基于 LLM 的代理人。
Define: Reflect / 反思 【相对成熟】
- Agent 检查、评估自己的工作结果,并提出改进的方法. (可以有效提高 LLM 生成的内容质量)
Define: Tool Use / 工具使用 (to generate correct input for tools w.r.t. Query, e.g.) 【相对成熟】
- e.g. 1: (Define Auto-Retrieval: ) convert query to keys which are used as Vector DB filters
- e.g. 2: Text-to-SQL as SQL DB input
- e.g. 3: generate api calls
Define: Routing / 路由 (which has 2 components):
- sematic search / 语义搜索 by Vector Query which retrives top-k results
- summarization / 归纳总结
Define: Conversation Memory / 对话记忆:
- not only keeping conversation history
- but also how to fuse info when the history is larger than input context window (compress, search etc.)
Define: Query Planning / 规划 【早期发展阶段】 (e.g. compare 2 companies income increment) :
- Agent 对用户的目标进行拆解并执行(比如一篇分析报告,拆分为提纲、分段撰写、总结小标题等,但未必和用户既定路线完全一致)。 通过连接工作流中不同的工具节点实现任务的精细编排和执行,编排难度较大, 能力上限较高,确定性较高。
- e.g. plan: company A’s income & increment; company B’s income & increment; comparison
RAG 进阶策略 ( choose considering metrics ):
pic ref: 从基础 RAG 到 Agent: Llamaindex 助力大模型 应用落地的思考与实践,稀土开发者大会 2024.6
Define: Multi-Agent Collaboration / 协作【早期发展阶段】: 单个 agent 处理大量目标/子任务是超过 agent 的能力的,可以多个 AI Agent 协同工作,分工任务,讨论和辩论想法,提出比单个智能体更好的解决方案。需要关注复杂任务中的专家角色, 而无需精确设计流程和协作关系,实现了对复杂任务的分支处理, 编排难度较小, 结果的上限较高, 但是不确定性较高。 例如长文生成、 逻辑话题等。
- 专 agent 专项任务
- 并行
- 还要考虑成本和响应时间
Define Llama Agents Framework 产品 (一个模块化的面向服务的分布式架构): 通过 “Control Plane” (包括: Orchestrator & Service Metadata) 生成各 Agent 的调用,通过 “Message Queue” 发送给对应的各个 agent,比起 autogen、crewai 等,引入了人的反馈:
RAG 未来展望(LlamaIndex):可观、可控、可定制。
See also: LlamaHub.ai for RAG components (mainly LlamaParse for Data Parsing & Input. Note: LlamaParse is online and requires api-key).
Frameworks Comparison #
Aspect | LangChain | Hugging Face Transformers | GPT-Index | DeepPavlov | PromptLayer | CrewAI | |
---|---|---|---|---|---|---|---|
Prompt Engineering | Extensive support, custom prompt templates | Basic prompt management | Advanced prompt engineering capabilities | Focus on integrating prompts with index retrieval | Flexible prompt support | Advanced prompt engineering capabilities | Advanced prompt support and customization |
Data Retrieval and Integration | Robust integration with various data sources | Focus on indexing and retrieval from multiple sources | Integration via datasets and APIs | Emphasizes retrieval-augmented generation (RAG) | Integration with various data sources and formats | Emphasis on data integration with prompt execution | Comprehensive data retrieval and integration capabilities |
Model Orchestration and Chaining | Strong orchestration for complex workflows | Limited chaining capabilities | Orchestration through pipelines and workflows | Focuses on indexing and retrieval chains | Supports chaining and workflow management | Workflow management with observability features | Robust orchestration and chaining for complex tasks |
Debugging and Observability | Good debugging tools, extensive logging capabilities | Basic logging and monitoring | Advanced logging and model monitoring | Debugging focused on retrieval issues | Logging and error handling capabilities | Integrated observability with real-time logging | Advanced debugging and observability tools |
LLM Applications (RAGs) | Strong support for Retrieval-Augmented Generation | Basic support for RAGs | Supports RAGs through pipelines | RAGs are a core feature | Supports RAGs | Integrated RAG support with advanced features | Comprehensive support for RAGs and similar applications |
Evaluations | Tools for evaluating prompt effectiveness | Basic evaluation capabilities | Comprehensive evaluation tools and metrics | Evaluation focuses on indexing accuracy | Evaluation tools for model performance | Detailed evaluation tools and metrics | Detailed evaluation tools and metrics |
Production Readiness | Well-suited for production with enterprise features | Suitable for production with indexing capabilities | Production-ready with robust API support | Designed for production with focus on indexing | Production-ready with enterprise support | Well-suited for production with enterprise capabilities | Well-suited for production with enterprise features |
Ecosystems and Integrations | Strong ecosystem with various integrations | Extensive ecosystem with numerous integrations | Focused integrations with RAG-centric tools | Good ecosystem with various integrations | Ecosystem includes integration with observability tools | Comprehensive ecosystem and integrations | |
Support (Documents, Tutorials, Community) | Extensive documentation, strong community support | Comprehensive documentation, extensive community | Good documentation, active community | Detailed documentation, community support | Good documentation, community support, tutorials | Comprehensive documentation, strong community support |
Warning: Crew AI collects anonymized usage data/info and reports to telemetry.crewai.com
.
Solution: Disable by faking it in OpenTelemetry
lib.
References:
- 4o mini
- LangChain vs. Alternatives:
- Hugging Face Transformers:
- GPT-Index:
- DeepPavlov:
- PromptLayer:
- CrewAI:
Frameworks in Details #
for fun & basic usage #
ChatBot Ollama
for development #
AutoGPT
MetaGPT
LangChain
Autogen Studio by MicroSoft
GraphRAG neo4j presentation in graphics
LLM for Scholar #
LLM + Graph + Simple Local (English) #
MS GraphRAG + Ollama 本地部署
# 拉取quantinz模型
ollama pull quentinz/bge-base-zh-v1.5:latest
# 拉取gemma模型
ollama run gemma2:9b
# 展示模型列表
ollama list
Main refs:
- GraphRag 本地测试, (bak)
- [疑似抄袭上一个 CSDN 的 ref]GraphRAG+Ollama 本地部署,保姆教程,踩坑无数,闭坑大法, (bak)
- TheAiSingularity/graphrag-local-ollama
Other refs:
- GraphRAG+Ollama 实现本地部署: using llm model mistral + embedding model nomic-embed-text
- medium: GraphRAG local setup via vLLM and Ollama : A detailed integration guide. using: Llama-3.1-8B + nomic-embed-text. (bak)
- 5 分钟手把手系列(二):本地部署 Graphrag(Pycharm+Ollama+LM Studio)
More:
faq #
Problem: “not answering in json” related problem. Hot (bypass) fix:
In graphrag/llm/openai/utils.py
, replace result = json.loads(input)
with:
result_list = input.replace('** ',' ').split('* **')[1:-1]
result = dict({'points': []})
for one_result in result_list:
result['points'].append({'description': one_result, 'score': 85})
input = json.dumps(result)
LLM + Graph + Simple Local (Chinese) #
Main refs:
Other refs:
LLM + Graph + Local (+ Neo4j) #
- 构件图:使用 Neo4j 和 LangChain 实现“从本地到全局”的 GraphRAG
- GraphRAG 部署流程及 Neo4j 展示 》§ Neo4j 可视化, (bak)
- ksachdeva/langchain-graphrag
- YouTube Local GraphRAG + Langchain + local llm = Easy AI/Chat for your Docs
TODO #
Cinnamon/kotaemon: advantages: multi-model, docker, graph-display, customizable pipeline.
refs: