This MCP server implementation gives AI assistants voice interaction capabilities: speech-to-text and text-to-speech. It uses faster-whisper, a CTranslate2-based reimplementation of Whisper, for faster speech recognition, and PyAudio for audio input and output. The server exposes a simplified API for starting conversations and replying to user input, which makes it suitable for applications that need a natural-language voice interface to an AI model.
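As a rough illustration of the speech-to-text side, the sketch below transcribes a file with faster-whisper and joins the resulting segments into text. It assumes the `faster-whisper` package is installed; the helper names `format_segments` and `transcribe_file`, the `"base"` model size, and the CPU/int8 settings are illustrative choices, not taken from this server's code.

```python
def format_segments(segments, include_timestamps=False):
    """Join transcription segments into one string, one segment per line.

    Each segment only needs `start`, `end`, and `text` attributes, which is
    what faster-whisper's segment objects provide.
    """
    lines = []
    for seg in segments:
        text = seg.text.strip()
        if include_timestamps:
            lines.append(f"[{seg.start:.2f}-{seg.end:.2f}] {text}")
        else:
            lines.append(text)
    return "\n".join(lines)


def transcribe_file(path, model_size="base", include_timestamps=False):
    """Transcribe an audio file (hypothetical helper, not the server's API)."""
    # Imported lazily so format_segments stays usable without the package.
    from faster_whisper import WhisperModel

    # Assumed settings: CPU inference with int8 quantization.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus metadata.
    segments, _info = model.transcribe(path)
    return format_segments(segments, include_timestamps)
```

Because `transcribe()` yields segments lazily, the actual decoding happens while `format_segments` iterates, so a long file starts producing text before the whole recording is processed.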
Generate audio files for a conversation using either JSON or Markdown format. Parameters: script (string), output_path (string), script_format (string)
Convert text to speech and save it as an audio file. Parameters: text (string), output_path (string) or text_file_path (string)
Transcribe speech from various audio and video formats. Parameters: file_path (string), include_timestamps (optional boolean), detect_speakers (optional boolean)
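The three tools above could be invoked from an MCP client roughly as follows. This is a minimal sketch that builds the JSON-RPC 2.0 `tools/call` messages defined by the Model Context Protocol. The listing does not give the tool names, so `generate_conversation`, `text_to_speech`, and `transcribe` are hypothetical, as are the file paths, the script text, and the `"markdown"` value for `script_format`.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def tool_call(name, **arguments):
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    # Arguments left as None are dropped, so optional parameters are omitted.
    args = {k: v for k, v in arguments.items() if v is not None}
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": args},
    })


# Tool names below are assumptions; only the parameter names come from the listing.
conversation_req = tool_call(
    "generate_conversation",
    script="Alice: Hi there.\nBob: Hello!",  # assumed script layout
    output_path="out/conversation",
    script_format="markdown",  # assumed accepted value
)
speech_req = tool_call(
    "text_to_speech",
    text="Hello from the voice server.",
    output_path="out/hello.wav",
)
transcribe_req = tool_call(
    "transcribe",
    file_path="meeting.mp4",
    include_timestamps=True,
    detect_speakers=None,  # optional flag left unset
)
```

Each string would then be written to the server over whichever transport the client uses (stdio or HTTP); the real tool names can be discovered with a `tools/list` request.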