Milestone
vLLM PagedAttention
gpt-oss source code / model analysis
AI Agents
"Agentic Design Pattern" study notes
Code Agent
Claude Code
Codex
Deep Research
Tongyi DeepResearch
Search-R1
R1-Searcher
Agentic RL
Policy Optimization Algorithm
What are the problems with GRPO?
GSPO
DAPO
Tool-integrated Reasoning
- multi-turn tool use
- multiple tools
- Call tools synchronously or asynchronously?
- How to select a tool?
ToRA
ToRL
Frameworks
Verl
VerlTool
LangChain / LangGraph Agent summary
LangChain 03: Message parsing
Multi-Agent System Plan & Execute