Dataset Library
Reasoning traces for distilling frontier models
Curated datasets built by querying Claude, GPT, Gemini and other frontier models with diverse coding, math, and reasoning prompts. Designed for training small open models that still think clearly.
What's included
Each dataset includes detailed reasoning traces, carefully filtered conversations, and metadata ready for fine-tuning. Listings are synced hourly from Hugging Face.
claude-4.5-opus-high-reasoning-250x
Distilled from Claude Opus 4.5
gemini-3-pro-preview-high-reasoning-1000x
Distilled from Gemini 3 Pro
claude-haiku-4.5-high-reasoning-1700x
glm-4.7-2000x
gpt-5.1-high-reasoning-1000x
Distilled from GPT-5.1
gemini-3-flash-preview
claude-sonnet-4.5-high-reasoning-250x
Distilled from Claude Sonnet 4.5
gpt-5.2-high-reasoning-250x
deepseek-v3.2-speciale-openr1-math-3k
Distilled from DeepSeek v3.2 Speciale
deepseek-v3.2-speciale-1000x
Distilled from DeepSeek v3.2 Speciale
MiniMax-M2.1-8800x
gpt-5.1-codex-max-1000x
Distilled from GPT-5.1
deepseek-v3.2-speciale-OpenCodeReasoning-3k
Distilled from DeepSeek v3.2 Speciale
MiMo-V2-Flash-2300x
minimax-m2.1-1000x
gpt-5-codex-250x
Distilled from GPT-5 Codex
gpt-5-codex-1000x
Distilled from GPT-5 Codex
gemini-3-pro-preview-high-reasoning-250x
Distilled from Gemini 3 Pro
gemini-3-flash-preview-1000x
grok-code-fast-1-1000x
Distilled from Grok
claude-haiku-4.5-1700x
convo-v1
kimi-k2-thinking-1000x
Distilled from Kimi K2
glm-4.7-350x
gemini-2.5-flash-11000x
Distilled from Gemini 2.5 Flash
kimi-k2-thinking-250x
Distilled from Kimi K2
glm-4.6-250x
Distilled from GLM 4.6
polaris-alpha-1000x
brainstorm-v3.1-grok-4-fast-200x
Distilled from Grok
sherlock-thinking-alpha-11000x
gemini-3-flash-preview-standalone-html-1k
gemini-2.5-flash-lite-2509-preview-1000x
Distilled from Gemini 2.5 Flash