Tongyi DeepResearch: Alibaba's Open-Source AI Agent
Unveiling Tongyi DeepResearch: Alibaba's Powerful Open-Source AI Agent
Alibaba has released Tongyi DeepResearch, an open-source AI agent built for deep information-seeking tasks. The model has 30.5 billion total parameters but activates only 3.3 billion of them per token, which keeps per-token compute low relative to the model's overall size.
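To put those two figures side by side, here is a rough back-of-the-envelope sketch. It assumes the weights are stored at 2 bytes per parameter (for example bf16), which the announcement does not state. The general point is that all parameters must be held in memory, while only the active ones contribute to the compute for each token.

```python
# Figures taken from the announcement.
TOTAL_PARAMS = 30.5e9   # total parameters
ACTIVE_PARAMS = 3.3e9   # parameters activated per token

# Assumption (not from the announcement): 2 bytes per parameter, e.g. bf16.
BYTES_PER_PARAM = 2

# Share of the model that participates in computing each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Approximate size of the weights alone, in decimal gigabytes.
weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

print(f"active per token: {active_fraction:.1%}")
print(f"weights at {BYTES_PER_PARAM} bytes/param: about {weights_gb:.0f} GB")
```

Under that assumption, roughly a tenth of the parameters are used per token, while the weights alone occupy about 61 GB before any activation or cache memory is counted.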
Developed by Tongyi Lab, Tongyi DeepResearch reports state-of-the-art results across a range of agentic search benchmarks: Humanity's Last Exam, BrowseComp, BrowseComp-ZH, WebWalkerQA, xbench-DeepSearch, FRAMES, and SimpleQA. The project builds on Alibaba's earlier WebAgent initiative and extends it to more complex research scenarios.
Key Features and Innovations:
Tongyi DeepResearch stands out with several remarkable features:
- Automated Synthetic Data Generation: A scalable, fully automatic pipeline generates synthetic data. That data feeds three training stages: agentic pre-training, supervised fine-tuning, and reinforcement learning.
- Large-Scale Continual Pre-training: The model undergoes extensive continual pre-training using diverse and high-quality agentic interaction data. This process enhances the model's capabilities, keeps its knowledge fresh, and significantly strengthens its reasoning performance.
- End-to-End Reinforcement Learning: Alibaba uses a strictly on-policy reinforcement learning (RL) approach built on a customized Group Relative Policy Optimization (GRPO) framework. It combines token-level policy gradients, leave-one-out advantage estimation, and selective filtering of negative samples to keep training stable in dynamic environments.
- Flexible Agent Inference Paradigms: At inference, Tongyi DeepResearch supports two primary paradigms:
  - ReAct: Ideal for rigorously evaluating the model's intrinsic abilities.
  - Iterative Research ('Heavy' Mode): Employs a test-time scaling strategy to unlock the model's maximum performance potential.
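Two of the reinforcement learning ingredients named above, leave-one-out advantage estimation and token-level policy gradients, can be illustrated with a minimal sketch. This is not the project's training code: the function names, the toy rewards, and the log-probabilities are all made up for illustration, and the selective filtering of negative samples is left out.

```python
from typing import List


def leave_one_out_advantages(rewards: List[float]) -> List[float]:
    """Advantage of each rollout relative to the mean reward of the others.

    `rewards` holds one scalar reward per rollout sampled for the same prompt.
    """
    k = len(rewards)
    if k < 2:
        raise ValueError("need at least two rollouts per group")
    total = sum(rewards)
    # Baseline for rollout i is the mean reward of the other k - 1 rollouts.
    return [r - (total - r) / (k - 1) for r in rewards]


def token_level_pg_loss(token_logprobs: List[List[float]],
                        advantages: List[float]) -> float:
    """Policy-gradient loss averaged over all tokens in the group.

    Every token in a rollout shares that rollout's advantage. Dividing by the
    total token count (rather than averaging per sequence first) gives each
    token equal weight regardless of sequence length.
    """
    total_tokens = sum(len(seq) for seq in token_logprobs)
    weighted = sum(-adv * lp
                   for seq, adv in zip(token_logprobs, advantages)
                   for lp in seq)
    return weighted / total_tokens


# Toy group of four rollouts for one prompt (hypothetical values).
rewards = [1.0, 0.0, 0.0, 1.0]
logprobs = [[-0.5, -0.5], [-1.0], [-2.0, -2.0, -2.0], [-0.1]]

advantages = leave_one_out_advantages(rewards)
loss = token_level_pg_loss(logprobs, advantages)
print(advantages, loss)
```

In a real trainer the log-probabilities would come from the current policy and the loss would be differentiated with respect to the model parameters. The sketch only shows how the baseline and the token-level normalization are computed.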
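For readers unfamiliar with the ReAct paradigm mentioned above, the following is a generic, minimal sketch of its think-act-observe loop. It is not the repository's inference code: the scripted policy stands in for the language model, and the `search` tool and its lookup table are hypothetical. The 'Heavy' mode is not shown.

```python
from typing import Callable, Dict, List, Tuple

# A policy step returns (thought, action name, action input).
Step = Tuple[str, str, str]


def react_loop(policy: Callable[[List[str]], Step],
               tools: Dict[str, Callable[[str], str]],
               question: str,
               max_steps: int = 5) -> str:
    """Alternate between reasoning, tool calls, and observations."""
    history: List[str] = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = policy(history)
        history.append(f"Thought: {thought}")
        if action == "finish":
            return arg  # the policy decided it has the final answer
        if action in tools:
            observation = tools[action](arg)
        else:
            observation = f"unknown tool: {action}"
        history.append(f"Action: {action}[{arg}]")
        history.append(f"Observation: {observation}")
    return "no answer within step budget"


def scripted_policy(history: List[str]) -> Step:
    """Stand-in for the model: search once, then answer from the result."""
    prefix = "Observation: "
    observations = [h for h in history if h.startswith(prefix)]
    if not observations:
        return ("I should look this up.", "search", "total parameters")
    return ("The observation answers the question.", "finish",
            observations[-1][len(prefix):])


def toy_search(query: str) -> str:
    """Hypothetical tool backed by a one-entry lookup table."""
    return {"total parameters": "30.5 billion"}.get(query, "no results")


answer = react_loop(scripted_policy, {"search": toy_search},
                    "How many total parameters does the model have?")
print(answer)
```

Because the loop has no extra scaffolding around the model's own decisions, results in this mode mostly reflect what the model itself can do, which is why it suits evaluation of intrinsic ability.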
Getting Started with Tongyi DeepResearch:
The project provides a clear path for users to get started, including environment setup, dependency installation, and data preparation. The repository includes instructions for configuring inference scripts, allowing users to specify model paths, datasets, and output directories. Essential API keys and credentials for various tools can be configured within the provided shell scripts.
Model Availability:
Tongyi-DeepResearch-30B-A3B is readily available for download via HuggingFace and ModelScope, supporting a context length of up to 128K tokens.
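One possible way to fetch the weights from HuggingFace is the `snapshot_download` helper in the `huggingface_hub` package. The sketch below is hedged: the repository ID is an assumption based on the model name and should be checked against the model page, and the local directory is an arbitrary choice.

```python
# Assumption: the organization prefix is not given in this article;
# verify the exact repository ID on the HuggingFace model page.
REPO_ID = "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B"


def download_model(local_dir: str = "./Tongyi-DeepResearch-30B-A3B") -> str:
    """Download every file of the model repository and return the local path."""
    # Imported here so the module loads even if huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_model()` starts a large download, so make sure there is enough disk space first. The returned path can then be supplied as the model path in the inference scripts.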
Community and Research:
The project also highlights an extensive family of related research, including advancements in Web Agents, information seeking, and agentic RL. The repository encourages community contributions and is actively seeking talent for research intern positions.
Tongyi DeepResearch gives the open-source community a capable agent model, along with a supporting training and inference framework, for tackling complex information-seeking tasks.