This MCP implementation, developed by JMH, is a Python-based web crawler that extracts website content and saves it as markdown files. It supports website structure mapping, batch URL processing, and configurable output settings. The project installs and deploys through FastMCP and uses BeautifulSoup and requests for web scraping. Its markdown output and simple configuration make it a good fit for content aggregation, site archiving, and building knowledge bases from web sources. It can also create content indexes and issue concurrent requests, so it serves both small personal projects and larger data collection tasks.
Extracts content from a specified URL and saves it to a markdown file. Parameters: url (string), output_path (string)
Scans the content linked from a specified URL. Parameters: url (string)
Creates an index from the provided content map and saves it to a markdown file. Parameters: content_map (string), output_path (string)
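The three operations described above can be illustrated with a minimal sketch. It is not the project's actual code: the function names (`html_to_markdown`, `scan_links`, `create_index`, `save_markdown`) are hypothetical, the sketch works on an HTML string rather than fetching a URL, and it uses the standard-library `html.parser` in place of BeautifulSoup so that it runs without extra dependencies. It also assumes that `content_map` is a JSON string mapping each URL to a page title, which the listing does not confirm.

```python
import json
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin


class _PageParser(HTMLParser):
    """Collects headings, paragraphs, and link targets from an HTML page."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # (tag, text) pairs for headings and paragraphs
        self.links = []    # raw href values, in document order
        self._tag = None   # block-level tag currently being read
        self._buf = []     # text fragments of the current block

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag in ("h1", "h2", "h3", "p"):
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            text = " ".join("".join(self._buf).split())  # collapse whitespace
            if text:
                self.blocks.append((tag, text))
            self._tag = None


def html_to_markdown(html):
    """Convert headings and paragraphs to markdown (hypothetical helper)."""
    parser = _PageParser()
    parser.feed(html)
    prefix = {"h1": "# ", "h2": "## ", "h3": "### ", "p": ""}
    return "\n\n".join(prefix[tag] + text for tag, text in parser.blocks) + "\n"


def scan_links(html, base_url):
    """Return unique absolute link targets, in order of first appearance."""
    parser = _PageParser()
    parser.feed(html)
    absolute = (urljoin(base_url, href) for href in parser.links)
    return list(dict.fromkeys(absolute))


def create_index(content_map):
    """Build a markdown index from a JSON string of {url: title} (assumed format)."""
    entries = json.loads(content_map)
    lines = ["# Content Index", ""]
    lines += [f"- [{title}]({url})" for url, title in sorted(entries.items())]
    return "\n".join(lines) + "\n"


def save_markdown(text, output_path):
    """Write markdown text to output_path, creating parent folders if needed."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
```

In a real crawler the HTML would come from an HTTP client such as requests, and each function's result would be passed to `save_markdown` together with the tool's `output_path` argument.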