Documentation Scraper

by arabold

2.5k

Summary

Docs MCP Server is a documentation scraping and retrieval system that lets AI assistants access library documentation from sources including GitHub, npm, PyPI, and web pages. Built with TypeScript, it implements a pipeline that scrapes, processes, splits, and stores documents, with features such as semantic Markdown splitting, greedy chunk optimization, and version-aware retrieval. The server exposes tools for searching documentation, finding specific versions, listing available libraries, and managing scraping jobs. It is particularly useful for developers who want AI assistants to reference accurate, up-to-date documentation without leaving their workflow.
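To make "greedy chunk optimization" concrete, here is a minimal sketch of one common way to do it: merge adjacent small chunks until adding the next one would exceed a size limit. This is an illustration only, not the server's actual code. The function name `greedyMerge`, the `maxSize` parameter, and the separator are all assumptions made for the example.

```typescript
// Hypothetical sketch: greedily merge adjacent chunks up to a size limit.
// This is NOT taken from Docs MCP Server; names and parameters are assumed.
function greedyMerge(
  chunks: string[],
  maxSize: number,
  separator = "\n\n",
): string[] {
  const merged: string[] = [];
  let current = "";

  for (const chunk of chunks) {
    if (current === "") {
      // Start a new merged chunk.
      current = chunk;
      continue;
    }
    const candidate = current + separator + chunk;
    if (candidate.length <= maxSize) {
      // The next chunk still fits, so keep growing the current one.
      current = candidate;
    } else {
      // The next chunk would overflow: emit what we have and start over.
      merged.push(current);
      current = chunk;
    }
  }

  if (current !== "") {
    merged.push(current);
  }
  return merged;
}

// Example: small pieces are packed together, in order, up to 8 characters.
const packed = greedyMerge(["aa", "bb", "cccccc", "d"], 8, " ");
console.log(packed);
```

A single chunk that is already longer than `maxSize` is emitted unchanged in this sketch; a real splitter would likely break it down further before merging.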

Available Actions (9)

scrape_docs

Starts a scraping job for documentation and returns a jobId immediately.

get_job_status

Checks the current status and progress of a specific job.

list_jobs

Lists all active and completed jobs.

cancel_job

Attempts to stop a running or queued job.

search_docs

Searches the indexed documentation for a specific library.

list_libraries

Lists all libraries currently indexed in the store.

find_version

Finds the best-matching indexed version of a library for a given target version.

remove_docs

Removes indexed documents for a specific library and version.

fetch_url

Fetches a single URL and returns its content as Markdown.
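
Because scrape_docs returns a jobId immediately, a client typically has to poll get_job_status until the job finishes, and may call cancel_job if it gives up. The sketch below shows that flow under stated assumptions: the `CallTool` function type, the argument names (`library`, `url`, `jobId`), and the status strings (`running`, `completed`, `failed`, `cancelled`) are hypothetical and are not confirmed by this listing. An in-memory fake stands in for the server so the example runs on its own.

```typescript
// Hypothetical client-side sketch of the scrape-then-poll workflow.
// Argument names and status values are assumptions, not the server's schema.
type ToolArgs = Record<string, unknown>;
type CallTool = (
  name: string,
  args: ToolArgs,
) => Promise<Record<string, unknown>>;

async function scrapeAndWait(
  callTool: CallTool,
  library: string,
  url: string,
  maxPolls = 10,
  pollDelayMs = 1000,
): Promise<string> {
  // scrape_docs is described as returning a jobId immediately.
  const started = await callTool("scrape_docs", { library, url });
  const jobId = String(started.jobId);

  for (let i = 0; i < maxPolls; i++) {
    const job = await callTool("get_job_status", { jobId });
    const status = String(job.status);
    if (status === "completed" || status === "failed" || status === "cancelled") {
      return status;
    }
    // Wait before the next poll.
    await new Promise<void>((resolve) => setTimeout(resolve, pollDelayMs));
  }

  // Gave up waiting: ask the server to stop the job.
  await callTool("cancel_job", { jobId });
  return "cancelled";
}

// In-memory fake that reports "completed" after a fixed number of polls.
function makeFakeServer(pollsUntilDone: number): CallTool {
  let polls = 0;
  return async (name, args) => {
    switch (name) {
      case "scrape_docs":
        return { jobId: "job-1" };
      case "get_job_status":
        polls += 1;
        return {
          jobId: args.jobId,
          status: polls >= pollsUntilDone ? "completed" : "running",
        };
      case "cancel_job":
        return { jobId: args.jobId, status: "cancelled" };
      default:
        throw new Error(`Unknown tool: ${name}`);
    }
  };
}

// Usage against the fake, with no delay between polls.
scrapeAndWait(makeFakeServer(3), "react", "https://example.com/docs", 5, 0)
  .then((status) => console.log("final status:", status));
```

With a real MCP client, `callTool` would be replaced by whatever tool-invocation method that client provides; the polling loop itself stays the same.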

Last Updated: April 21, 2025

Language

TypeScript