Same AI API, so why does his page type like ChatGPT while yours makes users wait 5 seconds?

Content Sharing · Published 2 months ago

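A user opens your AI assistant page, hits Send, and then stares at a blank screen for a full 5 seconds.

The spinner keeps spinning, and the user wonders: has this site frozen?

Meanwhile, the AI app next door prints its answer like a typewriter, piece by piece, as smoothly as if a real person were chatting with you.

Both apps call the same OpenAI chat API, so why is the gap so large?

It comes down to one technique: streaming responses, typically delivered over Server-Sent Events (SSE).

Today I'll walk through a complete streaming setup, front end and back end, ready to copy and adapt.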

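1. First, check whether your code looks like this (the cause of the 5-second wait)

❌ The usual approach: show nothing until the whole result has arrived

// Front-end code
async function sendMessage() {
    const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message: userInput })
    });
    
    // ⚠️ The problem is here: nothing continues until the full body has arrived
    const result = await response.json();
    
    // By now the user has already waited 5 seconds
    showMessage(result.reply);
}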

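And the back end? Most likely it looks like this:

# Python back end
from fastapi import FastAPI, Request
from openai import AsyncOpenAI

app = FastAPI()
client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

@app.post('/api/chat')
async def chat(request: Request):
    body = await request.json()
    # A single call that waits for the complete result
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": body["message"]}]
    )
    
    # The reply goes to the front end only after the model has finished the whole answer
    return {"reply": response.choices[0].message.content}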

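What goes wrong:

Problem	Symptom	Cause
Long wait	5-10 seconds after the click	The full response must arrive first
Jarring experience	The answer appears all at once	No progressive feedback
Anxious users	They assume the page has frozen	No intermediate state at all
Timeouts	Complex questions time out	A single request runs too long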

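2. How does the ChatGPT site do it? (Text appears piece by piece)

Open the ChatGPT site and ask anything. You'll notice:

The text appears a little at a time, as if a person were typing.

That is a streaming response at work.

The principle is simple:

Traditional mode:
User sends → back end waits for the model to generate everything → one response → front end displays it

Streaming mode:
User sends → back end starts receiving model output → each chunk (a token or a few characters) is pushed to the front end as it arrives → front end displays it in real time

Advantages of streaming mode:

✅ Instant feedback: display starts with the first chunk, so the perceived wait shrinks to the time to first token
✅ Progressive display: text flows in, so the wait doesn't feel dead
✅ Room for longer answers: a 10,000-character answer keeps sending data the whole time, so it is far less likely to hit an idle timeout
✅ Feels like a real conversation: the typewriter effect is more immersive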
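
To put rough numbers on the "instant feedback" point, here is a small back-of-the-envelope sketch. The per-chunk latencies are invented for illustration (0.4 s to the first chunk, then 92 chunks at 50 ms each), and `time_to_first_text` is a hypothetical helper, not part of any library. Both modes take the same total time; what changes is when the user first sees something.

```python
def time_to_first_text(chunk_latencies, streaming):
    """Seconds until the user sees any text on screen.

    chunk_latencies: assumed seconds the model needs to produce each chunk.
    """
    if not chunk_latencies:
        return 0.0
    if streaming:
        # Streaming: the first chunk is shown as soon as it is produced
        return chunk_latencies[0]
    # Traditional: nothing is shown until every chunk has been produced
    return sum(chunk_latencies)


# Hypothetical numbers: 0.4 s to the first chunk, then 92 chunks at 50 ms each
latencies = [0.4] + [0.05] * 92

print(round(time_to_first_text(latencies, streaming=False), 2))  # 5.0
print(round(time_to_first_text(latencies, streaming=True), 2))   # 0.4
```

With these made-up numbers the answer still takes about 5 seconds to finish either way, but the streaming user is reading after 0.4 seconds.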

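3. Front end: three options, pick the one that fits

Option 1: Fetch + ReadableStream (native in modern browsers)

// Front end: read the SSE stream with fetch and render text as it arrives
async function sendMessageStream(message) {
    const response = await fetch('/api/chat/stream', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message })
    });
    if (!response.ok || !response.body) {
        throw new Error(`Request failed: ${response.status}`);
    }

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    const messageElement = document.createElement('div');
    chatContainer.appendChild(messageElement);

    let buffer = '';
    // ⚡ Streaming read: display each piece of data as soon as it arrives
    while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // { stream: true } keeps a multi-byte character split across chunks intact
        buffer += decoder.decode(value, { stream: true });

        // The server sends SSE events ("data: ..." ended by a blank line)
        const events = buffer.split('\n\n');
        buffer = events.pop(); // the last piece may be incomplete
        for (const event of events) {
            const text = event.split('\n')
                .filter(line => line.startsWith('data:'))
                .map(line => line.slice(5).replace(/^ /, ''))
                .join('\n');
            if (text === '[DONE]') return;
            messageElement.textContent += text;
        }

        // Scroll to the bottom
        chatContainer.scrollTop = chatContainer.scrollHeight;
    }
}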

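What it looks like:

User: What's the fastest way to get started with Python?
AI assistant: Getting started with Python really comes down to a few core concepts...
                    (the text appears piece by piece)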
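
One thing to watch when reading a byte stream chunk by chunk: the network can cut a chunk in the middle of a multi-byte UTF-8 character. In the browser, `TextDecoder` handles that when you pass `{ stream: true }`. The same idea can be sketched in Python with the standard library's incremental decoder; `decode_chunks` is a name made up for this illustration.

```python
import codecs


def decode_chunks(chunks):
    """Decode byte chunks as they arrive, buffering any partial character."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    pieces = [decoder.decode(chunk) for chunk in chunks]
    # Flush at the end; this raises if the stream stopped mid-character
    pieces.append(decoder.decode(b"", final=True))
    return pieces


data = "你好".encode("utf-8")   # 6 bytes, 3 per character
chunks = [data[:4], data[4:]]   # the split lands inside the second character

# Decoding each chunk on its own produces replacement characters
naive = "".join(c.decode("utf-8", errors="replace") for c in chunks)
print("\ufffd" in naive)        # True

# The incremental decoder holds the dangling byte until the rest arrives
print(decode_chunks(chunks))    # ['你', '好', '']
```

This is the most common source of garbled Chinese output in a streaming UI: the encoding is fine, the chunk boundaries are not.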

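Option 2: EventSource + SSE (needs a matching back end)

// Front end: the SSE approach
// EventSource can only send GET requests, so the back end must expose a GET endpoint such as /api/chat/sse
function sendMessageSSE(message) {
    const eventSource = new EventSource(`/api/chat/sse?message=${encodeURIComponent(message)}`);
    const messageElement = document.createElement('div');
    chatContainer.appendChild(messageElement);

    eventSource.onmessage = (event) => {
        if (event.data === '[DONE]') {
            eventSource.close();
            return;
        }
        // Append to what is already shown
        messageElement.textContent += event.data;
    };

    eventSource.onerror = () => {
        // Closing here also turns off EventSource's automatic reconnect
        eventSource.close();
        console.error('Connection lost');
    };
}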
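
`EventSource` parses the `data:` framing for you. With plain `fetch` (or any non-browser client) you do it by hand: buffer the text, split on the blank line that ends each event, and strip the `data:` prefix. Below is a minimal sketch of that logic in Python. It is a simplification that assumes `\n` line endings and ignores the `event:`, `id:`, and `retry:` fields; `parse_sse` is a hypothetical helper name.

```python
def parse_sse(chunks):
    """Yield the data payload of each complete SSE event from text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # An event is complete once its terminating blank line has arrived
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            data_lines = []
            for line in frame.split("\n"):
                if line.startswith("data:"):
                    value = line[5:]
                    # Exactly one leading space after the colon is dropped
                    if value.startswith(" "):
                        value = value[1:]
                    data_lines.append(value)
            if data_lines:
                # Several data lines in one event are joined with newlines
                yield "\n".join(data_lines)


# Chunk boundaries rarely line up with event boundaries
chunks = ["data: Hel", "lo\n\ndata: wor", "ld\n\ndata: [DONE]\n\n"]
print(list(parse_sse(chunks)))  # ['Hello', 'world', '[DONE]']
```

Note that the parser only drops a single space after `data:`. Model output often starts with a space (" world"), and removing more than one would glue words together.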

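Option 3: A React hook wrapper (the same pattern ports to Vue)

// React hook wrapper
import { useState } from 'react';

// Pull the text out of one SSE event ("data: ..." lines)
function eventText(event) {
    return event.split('\n')
        .filter(line => line.startsWith('data:'))
        .map(line => line.slice(5).replace(/^ /, ''))
        .join('\n');
}

function useChatStream() {
    const [messages, setMessages] = useState([]);
    const [typing, setTyping] = useState(false);

    const sendMessage = async (content) => {
        setTyping(true);
        try {
            const response = await fetch('/api/chat/stream', {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({ message: content })
            });

            const reader = response.body.getReader();
            const decoder = new TextDecoder(); // one decoder for the whole stream
            let buffer = '';
            let reply = '';
            setMessages(prev => [...prev, { role: 'assistant', content: '' }]);

            while (true) {
                const { done, value } = await reader.read();
                if (done) break;

                buffer += decoder.decode(value, { stream: true });
                const events = buffer.split('\n\n');
                buffer = events.pop(); // keep the incomplete tail for the next read
                for (const event of events) {
                    const text = eventText(event);
                    if (text !== '[DONE]') reply += text;
                }
                // Replace the last message with a fresh object so React re-renders
                const current = reply;
                setMessages(prev => [...prev.slice(0, -1), { role: 'assistant', content: current }]);
            }
        } finally {
            setTyping(false); // reset even if the request fails
        }
    };

    return { messages, sendMessage, typing };
}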

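4. Back end: three options in Python, Node.js, and C#

Option A: Python FastAPI (recommended starting point)

import os

from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

app = FastAPI()
# The async client keeps the event loop free while waiting on the model
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.post('/api/chat/stream')
async def chat_stream(request: Request):
    body = await request.json()
    user_message = body.get('message', '')
    
    async def event_generator():
        # Call the OpenAI streaming API
        stream = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": user_message}],
            stream=True,  # ⚡ the key switch: turn on streaming
        )
        
        async for chunk in stream:
            # Some chunks carry no choices or no text (for example the final one)
            if chunk.choices and chunk.choices[0].delta.content:
                text = chunk.choices[0].delta.content
                # One SSE event per chunk; a newline in the text starts a new "data:" line
                yield "data: " + text.replace("\n", "\ndata: ") + "\n\n"
        
        yield "data: [DONE]\n\n"
    
    return StreamingResponse(
        event_generator(),
        media_type="text/event-stream",
        headers={"Cache-Control": "no-cache"},
    )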

Start it with:

uvicorn main:app --reload --host 0.0.0.0 --port 8000
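
One detail of the `data:` framing is easy to miss: model output often contains newlines (lists, code, paragraphs), and a raw newline inside a `data:` line cuts the event short. In the SSE format, a multi-line payload is sent as several `data:` lines within one event, and the client joins them back with newlines. Here is a minimal sketch; `sse_event` is a helper name invented for this illustration.

```python
def sse_event(text):
    """Format one chunk of text as a single SSE event.

    Each newline in the text starts a new "data:" line, so the blank line
    that ends the event appears exactly once, at the very end.
    """
    return "data: " + text.replace("\n", "\ndata: ") + "\n\n"


print(repr(sse_event("Hello")))
# 'data: Hello\n\n'

print(repr(sse_event("1. install\n2. run")))
# 'data: 1. install\ndata: 2. run\n\n'
```

Another common choice is to JSON-encode each payload, which escapes newlines for you. That works just as well, as long as the front end parses the JSON on its side.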

Option B: Node.js Express (a natural fit for front-end developers)

const express = require('express');
const { OpenAI } = require('openai');
const app = express();
app.use(express.json()); // without this, req.body is undefined

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.post('/api/chat/stream', async (req, res) => {
    const { message } = req.body;
    
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');
    
    try {
        const stream = await openai.chat.completions.create({
            model: 'gpt-4o-mini',
            messages: [{ role: 'user', content: message }],
            stream: true  // ⚡ the key switch
        });
        
        // Relay each chunk to the front end as it arrives
        for await (const chunk of stream) {
            const content = chunk.choices[0]?.delta?.content;
            if (content) {
                // A newline in the text starts a new "data:" line within the same event
                res.write(`data: ${content.replace(/\n/g, '\ndata: ')}\n\n`);
            }
        }
        res.write('data: [DONE]\n\n');
    } catch (err) {
        console.error('Upstream error:', err);
    } finally {
        res.end();
    }
});

app.listen(3000, () => console.log('Server running at http://localhost:3000'));

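Option C: C# ASP.NET Core (for .NET developers)

using Microsoft.AspNetCore.Mvc;
using OpenAI.Chat;

// Request body: { "message": "..." }
public record ChatRequest(string Message);

[ApiController]
[Route("api/[controller]")]
public class ChatController : ControllerBase
{
    // Assumes the official OpenAI .NET package (2.x); adjust the calls if you use a different client library
    private static readonly ChatClient Client = new(
        model: "gpt-4o-mini",
        apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY"));

    [HttpPost("stream")]
    public async Task StreamChat([FromBody] ChatRequest request)
    {
        Response.ContentType = "text/event-stream";
        Response.Headers["Cache-Control"] = "no-cache";
        
        var messages = new ChatMessage[] { new UserChatMessage(request.Message) };
        
        await foreach (var update in Client.CompleteChatStreamingAsync(messages))
        {
            foreach (var part in update.ContentUpdate)
            {
                if (string.IsNullOrEmpty(part.Text)) continue;
                // A newline in the text starts a new "data:" line within the same event
                var data = part.Text.Replace("\n", "\ndata: ");
                await Response.WriteAsync($"data: {data}\n\n");
                await Response.Body.FlushAsync();
            }
        }
        await Response.WriteAsync("data: [DONE]\n\n");
        await Response.Body.FlushAsync();
    }
}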

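5. Pitfalls: five things to know about streaming

Pitfall	Cause	Fix
CORS errors	Front end and back end are on different origins	Send an Access-Control-Allow-Origin header from the back end
SSE times out behind a proxy	Nginx closes an upstream connection after 60 s without data by default, and buffers responses	Set proxy_read_timeout 300s; and proxy_buffering off;
Garbled Chinese	A multi-byte UTF-8 character is split across chunks, or the charset is not declared	Decode with a single TextDecoder and { stream: true }; send Content-Type: text/event-stream; charset=utf-8
High memory use	Streams are not closed promptly	Close the connection after the [DONE] event, and stop the upstream request when the client disconnects
Reconnect storm	The front end retries nonstop when the network is flaky	Retry with exponential backoff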
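
For the reconnect-storm row, the usual recipe is exponential backoff with jitter: double the wait ceiling after each failure, cap it, and pick a random delay under the ceiling so that many clients don't all retry at the same instant. A minimal sketch follows; the defaults (5 retries, 1 s base, 30 s cap) are assumptions to tune, and `backoff_delays` is a hypothetical helper.

```python
import random


def backoff_delays(max_retries=5, base=1.0, cap=30.0, rng=random.random):
    """Return the wait in seconds before each reconnect attempt (full jitter)."""
    delays = []
    for attempt in range(max_retries):
        # The ceiling doubles each attempt: base, 2*base, 4*base, ... up to cap
        ceiling = min(cap, base * 2 ** attempt)
        # A random point below the ceiling spreads clients out in time
        delays.append(rng() * ceiling)
    return delays


# With the randomness pinned to 1.0 you can see the ceilings themselves
print(backoff_delays(rng=lambda: 1.0))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

On the front end the same schedule would drive `setTimeout` before each new connection attempt, giving up after the last retry.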

Example Nginx configuration:

location /api/chat/stream {
    proxy_pass http://127.0.0.1:8000;
    proxy_http_version 1.1;
    proxy_set_header Connection '';
    proxy_buffering off;
    proxy_read_timeout 300s;
    chunked_transfer_encoding on;
}
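
The high-memory row in the table deserves a sketch too. When the browser tab closes mid-answer, the server-side generator should release the upstream model stream instead of leaving it open. A `try`/`finally` inside the async generator is one way to do that. The example below uses a stand-in `FakeUpstream` class (invented here, not a real client) and simulates the disconnect by closing the generator, which is roughly what happens when the framework tears down the response; exactly how a given framework signals a disconnect is an assumption to verify for your stack.

```python
import asyncio


class FakeUpstream:
    """Stand-in for a model stream; records whether it was closed."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self.closed = False

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._tokens:
            raise StopAsyncIteration
        await asyncio.sleep(0)  # pretend to wait on the network
        return self._tokens.pop(0)

    async def close(self):
        self.closed = True


async def relay(upstream):
    """Pass tokens through and always release the upstream stream."""
    try:
        async for token in upstream:
            yield token
    finally:
        # Runs on normal completion and when the consumer goes away early
        await upstream.close()


async def demo():
    upstream = FakeUpstream(["a", "b", "c"])
    gen = relay(upstream)
    first = await gen.__anext__()  # the client received one token...
    await gen.aclose()             # ...then disconnected
    return first, upstream.closed


print(asyncio.run(demo()))  # ('a', True)
```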

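6. Complete runnable example (front end + back end)

Complete front end (single HTML file)

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>AI Streaming Chat</title>
    <style>
        #chat { width: 600px; height: 400px; border: 1px solid #ccc; overflow-y: auto; padding: 10px; }
        .user { color: blue; }
        .ai { color: green; white-space: pre-wrap; }
    </style>
</head>
<body>
    <h2>Streaming AI chat (types like ChatGPT)</h2>
    <div id="chat"></div>
    <input type="text" id="input" placeholder="Type a message...">
    <button onclick="send()">Send</button>

    <script>
        async function send() {
            const input = document.getElementById('input');
            const chat = document.getElementById('chat');
            const msg = input.value;
            
            // Use textContent rather than innerHTML so user input is never parsed as HTML
            const userMsg = document.createElement('div');
            userMsg.className = 'user';
            userMsg.textContent = `You: ${msg}`;
            chat.appendChild(userMsg);
            input.value = '';
            
            const aiMsg = document.createElement('div');
            aiMsg.className = 'ai';
            chat.appendChild(aiMsg);
            
            // ⚡ Core streaming logic
            const res = await fetch('http://localhost:8000/api/chat/stream', {
                method: 'POST',
                headers: {'Content-Type': 'application/json'},
                body: JSON.stringify({message: msg})
            });
            
            const reader = res.body.getReader();
            const decoder = new TextDecoder();
            let buffer = '';
            
            while (true) {
                const {done, value} = await reader.read();
                if (done) break;
                // stream: true keeps characters split across chunks intact
                buffer += decoder.decode(value, {stream: true});
                // Events end with a blank line; the last piece may be incomplete
                const events = buffer.split('\n\n');
                buffer = events.pop();
                for (const event of events) {
                    const text = event.split('\n')
                        .filter(line => line.startsWith('data:'))
                        .map(line => line.slice(5).replace(/^ /, ''))
                        .join('\n');
                    if (text !== '[DONE]') aiMsg.textContent += text;
                }
                chat.scrollTop = chat.scrollHeight;
            }
        }
    </script>
</body>
</html>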

Back end (Python FastAPI, save as main.py)

from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

app = FastAPI()
# Wide-open CORS is for local testing only; list your real origins in production
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"]
)

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

@app.post("/api/chat/stream")
async def chat_stream(request: Request):
    body = await request.json()
    
    async def event_stream():
        completion = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": body["message"]}],
            stream=True
        )
        async for chunk in completion:
            if chunk.choices and chunk.choices[0].delta.content:
                text = chunk.choices[0].delta.content
                # A newline in the text starts a new "data:" line within the same event
                yield "data: " + text.replace("\n", "\ndata: ") + "\n\n"
        yield "data: [DONE]\n\n"
    
    return StreamingResponse(event_stream(), media_type="text/event-stream")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)

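7. Summary: 5 seconds vs. real time, what's the difference?

Aspect	Traditional mode	Streaming mode
What the user sees	A 5-10 second wait	Text starts almost immediately
Display	Everything at once	Piece by piece
Experience	❌ Users drop off	✅ Immersive
Difficulty	⭐	⭐⭐
Code size	A few lines fewer	About 10 lines more

Remember: for chat-style AI apps, streaming is the expected default, not a nice-to-have.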

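Is your site still making users sit and wait? Switch to streaming, and the first words appear as soon as the model produces them instead of after the whole answer is done.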

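Series index

Article 01 | Topic: Calling OpenAI / LLM APIs from C#
Article 02 | Topic: Streaming output (this article)
Article 03 | Topic: Prompt engineering
Article 04 | Topic: Semantic Kernel
Article 05 | Topic: Vector databases + RAG
Article 06 | Topic: Function Calling / Tool Use
Article 07 | Topic: Multi-turn conversation and memory management
Article 08 | Topic: AI agent design
Article 09 | Topic: Running LLMs locally
Article 10 | Topic: Structured output
Article 11 | Topic: Multimodal (Vision)
Article 12 | Topic: Cost control
Article 13 | Topic: Security and compliance
Article 14 | Topic: Observability and quality evaluation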

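Tell me in the comments: does your AI app use streaming or the regular mode? What pitfalls have you hit?

#AI #Streaming #ChatGPT #Frontend #Backend #Python #NodeJS #UX

© Copyright notice

Copyright belongs to the author. Do not repost without permission.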
