Markdown Web Crawl
Summary
This MCP server, developed by JMH, is a Python-based web crawler that extracts website content and saves it as markdown files. It offers website structure mapping, batch URL processing, and configurable output settings. The project integrates with FastMCP for easy installation and deployment, and uses BeautifulSoup and requests for web scraping. Its markdown output and straightforward configuration make it well suited to content aggregation, site archiving, and building knowledge bases from web sources. Support for content indexes and concurrent requests makes it useful for both small personal projects and larger data-collection tasks.
Available Actions (3)
extract_content
Extract content from a specified URL and save it to a markdown file. Parameters: url (string), output_path (string)
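The core of an action like this is converting fetched HTML into markdown before writing it to `output_path`. The project itself uses BeautifulSoup; as a rough, stdlib-only sketch of the same idea (all names here are illustrative, not the server's actual internals), headings and paragraphs can be mapped to markdown like so:

```python
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Collects headings and paragraphs from HTML as markdown lines."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._tag = None  # the block-level tag currently being read

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "p"):
            self._tag = tag

    def handle_data(self, data):
        text = data.strip()
        if not text or self._tag is None:
            return
        if self._tag.startswith("h"):
            # h1 -> "# ", h2 -> "## ", h3 -> "### "
            self.lines.append("#" * int(self._tag[1]) + " " + text)
        else:
            self.lines.append(text)

    def handle_endtag(self, tag):
        if tag == self._tag:
            self._tag = None


def html_to_markdown(html: str) -> str:
    """Convert an HTML fragment to markdown text (illustrative helper)."""
    parser = MarkdownExtractor()
    parser.feed(html)
    return "\n\n".join(parser.lines)
```

In the real action, the HTML would come from a `requests.get(url)` call and the resulting markdown would be written to the `output_path` parameter.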
scan_linked_content
Scan the linked content from a specified URL. Parameters: url (string)
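Structure mapping of this kind typically means collecting the `<a href>` links on a page, resolving them against the page URL, and keeping only same-site links. A minimal stdlib sketch of that step (the function and class names are assumptions for illustration, not the server's API):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkScanner(HTMLParser):
    """Collects same-site absolute links from an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        # Keep only links on the same host, mirroring a site-structure scan.
        if urlparse(absolute).netloc == urlparse(self.base_url).netloc:
            if absolute not in self.links:
                self.links.append(absolute)


def scan_links(html: str, base_url: str) -> list:
    """Return the deduplicated same-site links found in the HTML."""
    scanner = LinkScanner(base_url)
    scanner.feed(html)
    return scanner.links
```

Running this over each discovered page, breadth-first, yields the content map that `create_index` consumes.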
create_index
Create an index from the scanned content map and save it to a markdown file. Parameters: content_map (string), output_path (string)
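The index step is essentially rendering the scanned content map as a markdown list of links. A small sketch, assuming the map relates each crawled URL to the markdown file it was saved as (the exact shape of the server's `content_map` parameter may differ):

```python
def create_index_markdown(content_map: dict) -> str:
    """Render a {url: saved_markdown_path} map as a markdown link list."""
    lines = ["# Content Index", ""]
    for url, path in sorted(content_map.items()):
        lines.append(f"- [{url}]({path})")
    return "\n".join(lines)
```

The real action would then write this string to the `output_path` parameter.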