Tencent Releases Hunyuan-T1: First Ultra-Large Mamba-Powered Language Model

March 25, 2025


Tencent announced the official release of Hunyuan-T1, an advanced reasoning model that represents a significant upgrade from their earlier T1-Preview version. This new model is built on TurboS, which Tencent describes as "the world's first ultra-large-scale Hybrid-Transformer-Mamba MoE large model."

Technical Foundation and Architecture

Hunyuan-T1 is built on a foundation that combines Transformer and Mamba architectures in a Mixture of Experts (MoE) configuration. According to Tencent, the model's Mamba architecture specifically optimizes long-sequence processing, enabling efficient capture of information in extended texts while reducing computational resource requirements.
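Tencent has not published TurboS's internals, but the general shape of a hybrid Transformer-Mamba MoE block can be sketched as below. Everything here is an illustrative assumption (toy dimensions, a top-1 router, a fixed decay constant), not a detail of the actual model; the point is that the attention layer costs O(n²) in sequence length while the state-space scan costs O(n), which is why the Mamba component makes long sequences cheaper:

```python
import numpy as np

def attention_layer(x):
    # Full self-attention: every token attends to every other token (O(n^2)).
    scores = x @ x.T / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def ssm_layer(x, decay=0.9):
    # Mamba-style state-space scan: one running state updated token by token (O(n)).
    state = np.zeros(x.shape[-1])
    out = np.empty_like(x)
    for t, token in enumerate(x):
        state = decay * state + (1 - decay) * token
        out[t] = state
    return out

def moe_ffn(x, experts, router):
    # Mixture of Experts: each token is routed to its top-scoring expert only,
    # so total parameters grow without growing per-token compute.
    logits = x @ router                       # (n_tokens, n_experts)
    choice = logits.argmax(axis=-1)
    out = np.empty_like(x)
    for t, e in enumerate(choice):
        out[t] = np.tanh(x[t] @ experts[e])
    return out

rng = np.random.default_rng(0)
d, n_experts = 16, 4
x = rng.normal(size=(128, d))                 # 128 tokens, 16-dim embeddings
experts = rng.normal(size=(n_experts, d, d))
router = rng.normal(size=(d, n_experts))

# One hybrid block: attention -> SSM scan -> MoE feed-forward.
h = moe_ffn(ssm_layer(attention_layer(x)), experts, router)
print(h.shape)  # (128, 16)
```

In a real hybrid stack, many such blocks alternate, with the mix of attention and SSM layers (and the number of experts) chosen per model; the toy above only shows how the three components compose.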

One of the most notable performance claims is that Hunyuan-T1 achieves decoding speeds twice as fast as comparable models under the same deployment conditions.

Training Methodology

The development team reports allocating 96.7% of their computing resources to reinforcement learning during the post-training phase. This approach focused specifically on enhancing reasoning capabilities and aligning the model with human preferences.

Tencent's training strategy incorporated:

  • Curriculum learning that gradually increased data difficulty
  • Step-by-step expansion of the model's context length
  • Classic reinforcement learning strategies including data replay and periodic policy resetting
  • A unified reward system with self-rewarding mechanisms based on comprehensive evaluation

The training data spanned a diverse range of reasoning problems in mathematics, logic, science, and coding, with ground-truth feedback included to ensure robust performance across reasoning tasks.
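The first two strategies above, curriculum learning and stepwise context expansion, can be sketched as a staged data filter. The difficulty scores, lengths, and schedule below are assumptions for illustration; Tencent has not published the details of its curriculum:

```python
# Illustrative training examples tagged with a difficulty score and a token length.
examples = [
    {"prompt": "2 + 2 = ?",                   "difficulty": 1, "length": 8},
    {"prompt": "Prove sqrt(2) is irrational", "difficulty": 3, "length": 120},
    {"prompt": "Debug this 500-line program", "difficulty": 5, "length": 4000},
]

def curriculum_batches(examples, stages):
    """Yield (stage, batch): each stage admits harder data and a longer context."""
    pool = sorted(examples, key=lambda e: e["difficulty"])
    for max_difficulty, max_context in stages:
        batch = [e for e in pool
                 if e["difficulty"] <= max_difficulty and e["length"] <= max_context]
        yield (max_difficulty, max_context), batch

# Gradually raise both the difficulty cap and the context window.
stages = [(2, 512), (4, 2048), (6, 8192)]
for (cap, ctx), batch in curriculum_batches(examples, stages):
    print(f"difficulty<={cap}, context<={ctx}: {len(batch)} examples")
```

Each successive stage sees a strictly larger slice of the data (here 1, then 2, then 3 examples), which is the core idea: the model is never asked to handle the hardest, longest inputs until it has trained on the easier, shorter ones.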

Performance Claims

According to Tencent's internal evaluations, Hunyuan-T1 demonstrates performance comparable to other advanced reasoning models:

  • 87.2 score on MMLU-PRO (slightly behind OpenAI's o1)
  • 69.3 score on GPQA-diamond for professional domain knowledge
  • 64.9 score on LiveCodeBench for coding evaluation
  • 96.2 score on MATH-500, reportedly close to DeepSeek R1's performance
  • 91.9 score on ArenaHard tasks

The company states that Hunyuan-T1 performs particularly well in cultural and creative instruction following, text summarization, and agent capabilities.

Industry Context

Hunyuan-T1 enters a landscape where reinforcement learning is increasingly prominent in the post-training phase of large language models. Tencent notes this approach has been validated by recent models including OpenAI's o-series and DeepSeek R1.

The model represents a continuation of Tencent's work that began with the mid-February 2025 release of Hunyuan T1-Preview on the Tencent Yuanbao application.

Valeriia Kuka

Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities with 100K+ members and authored clear and concise explainers and historical articles.


© 2025 Learn Prompting. All rights reserved.