GLM 4.7

GLM 4.7 is Z.ai's model released December 22, 2025 with major improvements in coding, tool usage, and multi-step reasoning. It uses a more natural conversational tone and shows improved frontend development results.

ReasoningTool UseImplicit Caching

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-4.7',
  prompt: 'Why is the sky blue?'
})

Overview About Providers Throughput Latency Uptime Status Similar FAQ

Playground

Try out GLM 4.7 by Z.ai. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

GLM 4.7

Ask GLM 4.7 anything to try it out.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	ZDR	No Training	Release Date

Z.ai

200K

5.8s

196tps

$0.60/M

$2.20/M

Read:$0.11/M

Write:—

—

12/22/2025

Novita AI

205K

1.0s

88tps

$0.60/M

$2.20/M

Read:$0.1/M

Write:—

—

12/22/2025

DeepInfra

203K

0.5s

31tps

$0.43/M

$1.75/M

Read:$0.08/M

Write:—

—

12/22/2025

Cerebras

131K

0.3s

526tps

$2.25/M

$2.75/M

Read:$2.25/M

Write:—

—

12/22/2025

Amazon Bedrock

200K

1.4s

57tps

$0.60/M

$2.20/M

—

12/22/2025

Baseten

200K

0.2s

155tps

$0.60/M

$2.20/M

Read:$0.12/M

Write:—

—

12/22/2025

More models by Z.ai

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

zai/glm-5.2-fast

1.4s

143tps

$3.00/M

$10.25/M

Read:$0.5/M

Write:—

—

06/23/2026

zai/glm-5.2

0.8s

152tps

$0.95/M

$3.00/M

Read:$0.18/M

Write:—

—

06/16/2026

zai/glm-5.1

205K

0.4s

165tps

$1.30/M

$4.30/M

Read:$0.26/M

Write:—

—

04/07/2026

zai/glm-5-turbo

203K

4.4s

51tps

$1.20/M

$4.00/M

Read:$0.24/M

Write:—

—

03/15/2026

zai/glm-5

203K

0.5s

84tps

$0.80/M

$2.56/M

Read:$0.16/M

Write:—

—

02/12/2026

zai/glm-4.7-flashx

200K

1.4s

152tps

$0.06/M

$0.40/M

Read:$0.01/M

Write:—

—

01/19/2026

About GLM 4.7

GLM 4.7 was released December 22, 2025 as a capability upgrade in Z.ai's model lineup. It brings major improvements in coding, tool usage, and multi-step reasoning, with added focus on complex agentic tasks that require sustained planning across multiple steps.

The model introduces a more natural conversational tone compared to earlier GLM generations. This improves the experience in chat-based applications and interactive coding assistants. GLM 4.7 shows improved frontend development results for teams building UI components, web applications, and design-to-code workflows.

GLM 4.7 operates within a context window of 204.8K tokens and is available through AI Gateway with the standard unified API, built-in observability, and intelligent provider routing. It represents the full-scale offering in the 4.7 generation, complemented by GLM-4.7-Flash and GLM-4.7-FlashX for speed-optimized workloads.

What To Consider When Choosing a Provider

Configuration: GLM 4.7 targets frontend tasks. If your workload involves generating React components, CSS, or converting designs to code, benchmark it against general-purpose alternatives you already use.
Configuration: The 4.7 generation includes three tiers. GLM 4.7 provides maximum capability, GLM-4.7-Flash offers speed optimization, and GLM-4.7-FlashX provides the fastest inference. Choose based on your latency-capability tradeoff.
Configuration: Multi-step reasoning updates make GLM 4.7 more reliable for agentic pipelines. Test it against your existing agent benchmarks to quantify the improvement over GLM-4.6.
Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use GLM 4.7

Best For

Frontend development: HTML, CSS, React components, and design-to-code conversion benefit from the targeted improvements
Complex agentic tasks: Multi-step reasoning, tool usage, and sustained planning across extended interactions
Interactive coding assistants: The natural conversational tone improves developer experience
Full-stack code generation: Improvements in both coding capability and tool usage apply across the stack
Production applications: The highest capability tier in the GLM-4.7 generation, paired with AI Gateway observability

Consider Alternatives When

Latency-driven workloads: GLM-4.7-Flash or GLM-4.7-FlashX provides faster inference at reduced capability
Vision capabilities needed: Evaluate GLM-4.6V or GLM-4.5V for multimodal visual input
Advanced reasoning beyond 4.7: GLM-5 introduces multiple thinking modes and improved long-range planning
Cost-efficiency priority: The flash variants in the 4.7 generation or GLM-4.5-Air may be more economical when peak capability is not essential

Conclusion

GLM 4.7 advances Z.ai's model lineup with targeted improvements in the areas that matter most for modern development workflows: coding, tool usage, multi-step reasoning, and frontend generation. As the full-scale 4.7 model, it sets the capability ceiling for the generation while GLM-4.7-Flash and FlashX variants serve speed-sensitive workloads.

Agent Stack

Core Platform

Tools

Learn

Build

Explore

GLM 4.7

Playground

Providers

More models by Z.ai

About GLM 4.7

What To Consider When Choosing a Provider

When to Use GLM 4.7

Best For

Consider Alternatives When

Conclusion