Table of Contents
- Preface
- I. Environment setup
- II. Usage steps
- 1. Download the model
- 2. Real-time recording-to-text script
- 3. Fixing errors
- Summary
Preface

To implement voice input like Doubao or WeChat, there are two mainstream approaches: a cloud API (lightweight, highly accurate) and a local model (free, private, no network required). The system I am currently building needs a speech-recognition feature, so this post documents real-time recording-to-text with Faster-Whisper. Official site link: Faster-Whisper official site
I. Environment setup

If your machine has an NVIDIA GPU, you can follow this article to install CUDA and cuDNN first:
CUDA and cuDNN installation tutorial: CUDA and cuDNN installation tutorial (a detailed, step-by-step guide)
Install faster-whisper and the audio-capture package in your virtual environment:
pip install faster-whisper
pip install pyaudiowpatch
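To confirm both packages landed in the right environment before going further, you can query their installed versions with the standard library (a quick sanity check; `installed_version` is my own helper, not part of faster-whisper):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

# None here means the pip install went into a different environment
for pkg in ("faster-whisper", "PyAudioWPatch"):
    print(pkg, installed_version(pkg) or "NOT installed")
```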
II. Usage steps

1. Download the model

Choose one of the following models from Hugging Face (smaller is faster, larger is more accurate):
- Tiny (smallest/fastest): Systran/faster-whisper-tiny
- Base: Systran/faster-whisper-base
- Small: Systran/faster-whisper-small
- Medium: Systran/faster-whisper-medium
- Large-v2: Systran/faster-whisper-large-v2
- Large-v3 (best quality): Systran/faster-whisper-large-v3
- Distil-Large-v3 (distilled/fast): Systran/faster-distil-whisper-large-v3
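If you script the download step, the list above fits in a small lookup table (a sketch; the shorthand keys are my own, the repo IDs are the ones listed):

```python
# Map shorthand names to the Hugging Face repo IDs listed above
MODEL_REPOS = {
    "tiny": "Systran/faster-whisper-tiny",
    "base": "Systran/faster-whisper-base",
    "small": "Systran/faster-whisper-small",
    "medium": "Systran/faster-whisper-medium",
    "large-v2": "Systran/faster-whisper-large-v2",
    "large-v3": "Systran/faster-whisper-large-v3",
    "distil-large-v3": "Systran/faster-distil-whisper-large-v3",
}

def model_page(name):
    """Hugging Face page URL for a shorthand model name."""
    return f"https://huggingface.co/{MODEL_REPOS[name]}"

print(model_page("large-v3"))  # https://huggingface.co/Systran/faster-whisper-large-v3
```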
On the model's "Files and versions" page on Hugging Face, download the following key files and place them all in the same folder:
config.json
model.bin
tokenizer.json
vocabulary.json
preprocessor_config.json
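A missing file in that folder typically only surfaces later as a load error, so it can save a debugging round to verify the folder first. A minimal stdlib check (`check_model_dir` and the folder name are my own; the file list is the one above):

```python
import os

# The key files listed above, all expected in one folder
REQUIRED_FILES = [
    "config.json",
    "model.bin",
    "tokenizer.json",
    "vocabulary.json",
    "preprocessor_config.json",
]

def check_model_dir(model_dir):
    """Return the required files missing from model_dir (empty list = complete)."""
    return [f for f in REQUIRED_FILES
            if not os.path.isfile(os.path.join(model_dir, f))]

missing = check_model_dir("./faster-whisper-large-v3")  # hypothetical folder name
print("Missing:", missing or "none")
```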
2. Real-time recording-to-text script

The code is as follows (example):
```python
# -*- coding: utf-8 -*-
"""
@Auth  : 落花不写码
@File  : main.py
@IDE   : PyCharm
@Motto : 学习新思想,争做新青年
"""
import os
import time
import wave
import tempfile
import threading

import torch
import pyaudiowpatch as pyaudio
from faster_whisper import WhisperModel

# Length of each recording slice (seconds)
AUDIO_BUFFER = 5


def record_audio(p, device):
    """Record one slice from the given input device into a temporary WAV file."""
    # Create a temporary file to hold this slice
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        filename = f.name

    wave_file = wave.open(filename, "wb")
    wave_file.setnchannels(int(device["maxInputChannels"]))
    wave_file.setsampwidth(p.get_sample_size(pyaudio.paInt16))
    wave_file.setframerate(int(device["defaultSampleRate"]))

    def callback(in_data, frame_count, time_info, status):
        """Write incoming audio frames to the WAV file."""
        wave_file.writeframes(in_data)
        return (in_data, pyaudio.paContinue)

    try:
        stream = p.open(
            format=pyaudio.paInt16,
            channels=int(device["maxInputChannels"]),
            rate=int(device["defaultSampleRate"]),
            frames_per_buffer=1024,
            input=True,
            input_device_index=device["index"],
            stream_callback=callback,
        )
        time.sleep(AUDIO_BUFFER)  # block the main thread while recording
    except Exception as e:
        print(f"Recording error: {e}")
    finally:
        if "stream" in locals():
            stream.stop_stream()
            stream.close()
        wave_file.close()
    return filename


def whisper_audio(filename, model):
    """Transcribe one audio slice with the model."""
    try:
        # vad_filter=True drops silent stretches where nothing is spoken
        segments, info = model.transcribe(
            filename,
            beam_size=5,
            language="zh",
            vad_filter=True,
            vad_parameters=dict(min_silence_duration_ms=500),
        )
        for segment in segments:
            print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
    except Exception as e:
        print(f"Transcription error: {e}")
    finally:
        # Delete the temporary file once transcription finishes
        if os.path.exists(filename):
            os.remove(filename)


def main():
    print("Loading Whisper model...")
    # Check for a GPU
    if torch.cuda.is_available():
        device = "cuda"
        compute_type = "float16"  # or "int8_float16"
        print("Running inference on GPU (CUDA)")
    else:
        device = "cpu"
        compute_type = "int8"  # int8 is recommended on CPU
        print("Running inference on CPU")

    # Model path
    model_path = "large-v3"
    try:
        model = WhisperModel(model_path, device=device,
                             compute_type=compute_type, local_files_only=True)
        print("Model loaded!")
    except Exception as e:
        print(f"Failed to load model: {e}")
        return

    with pyaudio.PyAudio() as p:
        try:
            default_mic = p.get_default_input_device_info()
            print(f"Current microphone: {default_mic['name']} (Index: {default_mic['index']})")
            print(f"Sample rate: {default_mic['defaultSampleRate']}, channels: {default_mic['maxInputChannels']}")
            print("-" * 50)
            print("Recording continuously (press Ctrl+C to stop)...")
            while True:
                filename = record_audio(p, default_mic)
                thread = threading.Thread(target=whisper_audio, args=(filename, model))
                thread.start()
        except OSError:
            print("No default microphone found; check your system sound settings.")
        except KeyboardInterrupt:
            print("Recording stopped, exiting.")
        except Exception as e:
            print(f"Unexpected error: {e}")


if __name__ == "__main__":
    main()
```
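The control flow of the script is: the main loop blocks while one 5-second slice is recorded, then hands the finished file to a background thread, so transcription of one slice overlaps with recording of the next. Stripped of the audio and model dependencies, the pattern looks like this (`record_slice` and `transcribe` are stand-ins for `record_audio` and `whisper_audio`):

```python
import threading

def record_slice(i):
    """Stand-in for record_audio(): pretend to capture one slice to a file."""
    return f"slice_{i}.wav"

def transcribe(filename, results):
    """Stand-in for whisper_audio(): pretend to transcribe one slice."""
    results.append(f"text from {filename}")

results = []
threads = []
for i in range(3):
    filename = record_slice(i)  # recording blocks the main loop
    t = threading.Thread(target=transcribe, args=(filename, results))
    t.start()                   # transcription overlaps the next slice
    threads.append(t)

for t in threads:
    t.join()
print(results)
```

One consequence of this design: slices are cut every AUDIO_BUFFER seconds regardless of speech, so a word can be split across two slices. A shorter buffer lowers latency but raises that risk.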
3. Fixing errors

If you hit errors at runtime, force-reinstalling these pinned dependency versions resolved them for me:
pip install --force-reinstall ctranslate2==4.4.0
pip install onnxruntime==1.19.2
Summary

If this post helped you, please give it a like, a bookmark, and a follow.