Speech Interface (Faster Whisper)
Summary
This MCP server gives AI assistants voice interaction capabilities: speech-to-text and text-to-speech. It uses faster-whisper, a CTranslate2-based reimplementation of OpenAI's Whisper, for faster speech recognition, and integrates with PyAudio for audio processing. The server offers a simplified API for starting conversations and replying to user input, making it suitable for applications that need a natural-language voice interface to AI models.
Available Actions (3)
narrate_conversation
Generate audio files for conversations using multiple voices. Parameters: script (string - path to the script in JSON or Markdown format), output_path (string - path to save the output audio), script_format (string - 'json' or 'markdown')
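A rough sketch of how a client might assemble the arguments for narrate_conversation. The three parameter names come from the description above; the helper name `narrate_conversation_args`, the extension-to-format mapping, and the file paths are assumptions made for illustration and are not part of the server.

```python
from pathlib import Path

# Assumed mapping from file extension to the two documented script_format values.
_FORMAT_BY_SUFFIX = {".json": "json", ".md": "markdown", ".markdown": "markdown"}


def narrate_conversation_args(script: str, output_path: str) -> dict:
    """Build an arguments dict for narrate_conversation (hypothetical helper)."""
    suffix = Path(script).suffix.lower()
    if suffix not in _FORMAT_BY_SUFFIX:
        # Refuse to guess when the extension does not identify the format.
        raise ValueError(f"cannot infer script_format from {script!r}")
    return {
        "script": script,
        "output_path": output_path,
        "script_format": _FORMAT_BY_SUFFIX[suffix],
    }


# Hypothetical paths, used only to show the resulting shape.
args = narrate_conversation_args("scripts/interview.md", "out/interview.wav")
print(args)
```

Inferring script_format from the extension keeps the two values from drifting apart; a caller who stores scripts under other extensions would pass the format explicitly instead.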
narrate
Convert text to speech and save as an audio file. Parameters: text (string - the text to convert), output_path (string - path to save the output audio), text_file_path (optional string - path to a file containing text)
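For orientation, MCP clients invoke a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for narrate using only the standard library. The helper name `build_tool_call`, the sample text, and the output file name are assumptions for illustration; in practice an MCP client library would normally construct and send this message.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap a tool invocation in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Sample values are hypothetical; only the parameter names come from the listing.
request = build_tool_call(
    "narrate",
    {"text": "Hello from the speech server.", "output_path": "hello.wav"},
)
print(json.dumps(request, indent=2))
```

The same envelope would carry the other two actions; only `name` and `arguments` change.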
transcribe
Transcribe speech from audio or video files. Parameters: file_path (string - path to the audio/video file), include_timestamps (optional boolean - whether to include timestamps), detect_speakers (optional boolean - whether to detect speakers)
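A minimal sketch of preparing arguments for transcribe, where the two optional flags are left out unless the caller sets them. The helper name `transcribe_args` and the sample file name are assumptions; the listing does not say what the server does when a flag is omitted, so this sketch leaves that to the server.

```python
from typing import Optional


def transcribe_args(
    file_path: str,
    include_timestamps: Optional[bool] = None,
    detect_speakers: Optional[bool] = None,
) -> dict:
    """Build an arguments dict for transcribe (hypothetical helper)."""
    args: dict = {"file_path": file_path}
    optional_flags = (
        ("include_timestamps", include_timestamps),
        ("detect_speakers", detect_speakers),
    )
    for key, value in optional_flags:
        if value is None:
            continue  # omitted: the server's own default applies
        if not isinstance(value, bool):
            raise TypeError(f"{key} must be a boolean, got {type(value).__name__}")
        args[key] = value
    return args


# Hypothetical file name, used only to show the resulting shape.
print(transcribe_args("meeting.mp4", include_timestamps=True))
```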