Tencent Releases Hunyuan-T1: First Ultra-Large Mamba-Powered Language Model
Tencent announced the official release of Hunyuan-T1, an advanced reasoning model that represents a significant upgrade from their earlier T1-Preview version. This new model is built on TurboS, which Tencent describes as "the world's first ultra-large-scale Hybrid-Transformer-Mamba MoE large model."
Technical Foundation and Architecture
Hunyuan-T1 is built on a foundation that combines Transformer and Mamba architectures in a Mixture of Experts (MoE) configuration. According to Tencent, the model's Mamba architecture specifically optimizes long-sequence processing, enabling efficient capture of information in extended texts while reducing computational resource requirements.
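The efficiency argument behind the hybrid design comes down to scaling: self-attention's sequence-mixing cost grows quadratically with sequence length, while a Mamba-style state-space layer grows linearly. The back-of-the-envelope sketch below illustrates this gap; the dimensions and cost formulas are simplified illustrative assumptions, not Hunyuan-T1's actual architecture parameters.

```python
# Rough per-layer sequence-mixing cost: self-attention scales as ~L^2,
# a state-space (Mamba-style) recurrence scales as ~L. The constants
# below are illustrative assumptions, not Tencent's real dimensions.

def attention_flops(seq_len, d_model):
    # QK^T score matrix plus attention-weighted values: ~2 * L^2 * d
    return 2 * seq_len**2 * d_model

def ssm_flops(seq_len, d_model, d_state=16):
    # Linear scan over the sequence: ~L * d * d_state
    return seq_len * d_model * d_state

d = 4096
for L in (1_000, 10_000, 100_000):
    ratio = attention_flops(L, d) / ssm_flops(L, d)
    print(f"L={L:>7}: attention/SSM cost ratio ~ {ratio:,.0f}x")
```

Under these assumptions the gap widens linearly with context length (the ratio is simply L/8 here), which is why linear-cost layers are attractive for the long-sequence processing the announcement emphasizes.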
One of the most notable performance claims is that Hunyuan-T1 achieves decoding speeds twice as fast as comparable models under the same deployment conditions.
Training Methodology
The development team reports allocating 96.7% of their computing resources to reinforcement learning during the post-training phase. This approach focused specifically on enhancing reasoning capabilities and aligning the model with human preferences.
Tencent's training strategy incorporated:
- Curriculum learning that gradually increased data difficulty
- Step-by-step expansion of the model's context length
- Classic reinforcement learning strategies including data replay and periodic policy resetting
- A unified reward system with self-rewarding mechanisms based on comprehensive evaluation
The training data encompassed a diverse range of reasoning problems spanning mathematics, logic, science, and coding, with the inclusion of ground-truth feedback to ensure robust performance across reasoning tasks.
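The curriculum elements above can be pictured as a staged sampling loop: each stage raises the allowed problem difficulty and context length, every batch replays a fraction of earlier-stage data, and the policy is reset on a fixed schedule. The sketch below is a minimal illustration of that pattern; the stage values, replay ratio, and reset period are hypothetical, not Tencent's published hyperparameters.

```python
import random

# Illustrative curriculum-RL schedule: difficulty and context length grow
# stage by stage, batches mix in replayed easier samples, and the policy
# is periodically reset. All numbers here are hypothetical placeholders.

STAGES = [
    {"max_difficulty": 1, "context_len": 4_096},
    {"max_difficulty": 2, "context_len": 16_384},
    {"max_difficulty": 3, "context_len": 65_536},
]
REPLAY_RATIO = 0.25   # share of each batch drawn from earlier stages
RESET_EVERY = 100     # steps between periodic policy resets

def make_batch(data, stage, batch_size=8, rng=None):
    """Sample a batch at the current difficulty cap, mixing in replays."""
    rng = rng or random.Random(0)
    cap = STAGES[stage]["max_difficulty"]
    current = [x for x in data if x["difficulty"] == cap]
    easier = [x for x in data if x["difficulty"] < cap]
    n_replay = int(batch_size * REPLAY_RATIO) if easier else 0
    batch = rng.choices(current, k=batch_size - n_replay)
    if n_replay:
        batch += rng.choices(easier, k=n_replay)
    return batch

def should_reset(step):
    """Classic RL stabilizer: reset policy weights on a fixed schedule."""
    return step > 0 and step % RESET_EVERY == 0

data = [{"difficulty": d, "task": t}
        for d in (1, 2, 3) for t in ("math", "logic", "code")]
batch = make_batch(data, stage=1)
print(sum(x["difficulty"] < 2 for x in batch))  # replayed easier samples
```

The replay mix and periodic resets are the "classic reinforcement learning strategies" the announcement names: replay guards against forgetting earlier skills, and resets help the policy escape degenerate optima accumulated during a stage.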
Performance Claims
According to Tencent's internal evaluations, Hunyuan-T1 demonstrates performance comparable to other advanced reasoning models:
- 87.2 on MMLU-PRO (slightly behind OpenAI's o1)
- 69.3 on GPQA-Diamond for professional domain knowledge
- 64.9 on LiveCodeBench for coding evaluation
- 96.2 on MATH-500, reportedly close to DeepSeek R1's performance
- 91.9 on ArenaHard tasks
The company states that Hunyuan-T1 performs particularly well in cultural and creative instruction following, text summarization, and agent capabilities.
Industry Context
Hunyuan-T1 enters a landscape where reinforcement learning is increasingly prominent in the post-training phase of large language models. Tencent notes this approach has been validated by recent models including OpenAI's o-series and DeepSeek R1.
The model represents a continuation of Tencent's work that began with the mid-February 2025 release of Hunyuan T1-Preview on the Tencent Yuanbao application.
Valeriia Kuka
Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities of 100K+ members, and has authored clear, concise explainers and historical articles.