Web Content Extractor
Summary
This MCP server, written in TypeScript, provides tools for scanning, extracting, and processing web page content. It uses Cheerio for HTML parsing and Turndown for HTML-to-Markdown conversion to fetch, analyze, and transform pages. It is designed for AI-assisted workflows such as web scraping, content summarization, and data extraction. It suits researchers, content creators, and developers who need to automate web content analysis, generate structured data from websites, or bring web-based information into their AI applications.
Available Actions (6)
fetch-page
Fetches a web page and converts it to Markdown. Parameters: url (required), selector (optional)
extract-links
Extracts all links from a web page with their text. Parameters: url (required), baseUrl (optional), limit (optional, default: 100)
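As a rough illustration of how baseUrl and limit might interact, the sketch below resolves already-extracted anchors against a base URL and caps the result. The names RawAnchor, ResolvedLink, and resolveLinks are hypothetical and do not come from the server. Pulling the anchors out of the HTML (which the server presumably does with Cheerio) is left out, and the default of 100 mirrors the documented default.

```typescript
// Hypothetical shapes; the server's real types are not shown in this listing.
interface RawAnchor {
  href: string;
  text: string;
}

interface ResolvedLink {
  url: string;
  text: string;
}

// Resolve each href against baseUrl, skip values that cannot be parsed as URLs,
// and stop once `limit` links have been collected.
function resolveLinks(
  anchors: RawAnchor[],
  baseUrl: string,
  limit = 100
): ResolvedLink[] {
  const links: ResolvedLink[] = [];
  for (const anchor of anchors) {
    if (links.length >= limit) break;
    try {
      // The WHATWG URL constructor handles relative and absolute hrefs alike.
      const url = new URL(anchor.href, baseUrl).href;
      links.push({ url, text: anchor.text.trim() });
    } catch {
      // Unparseable href: skip it rather than fail the whole extraction.
    }
  }
  return links;
}
```

Resolving with the standard URL constructor keeps relative paths, root-relative paths, and absolute URLs on one code path. Whether the real tool drops invalid links or reports them is not stated in the listing.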
crawl-site
Recursively crawls a website up to a specified depth. Parameters: url (required), maxDepth (optional, default: 2)
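One way a depth-limited crawl like this could work is sketched below, as a breadth-first traversal. It is not the server's actual code: LinkFetcher and crawl are hypothetical names, the page-fetching step is injected as a callback, and the same-origin restriction is an assumption, since the listing does not say whether external links are followed.

```typescript
// Hypothetical callback: returns the absolute URLs linked from a page.
type LinkFetcher = (url: string) => Promise<string[]>;

// Breadth-first crawl. The start page is depth 0; pages up to maxDepth links
// away are recorded, and pages at exactly maxDepth are not expanded further.
async function crawl(
  startUrl: string,
  getLinks: LinkFetcher,
  maxDepth = 2
): Promise<string[]> {
  const origin = new URL(startUrl).origin;
  const visited = new Set<string>([startUrl]);
  let frontier: string[] = [startUrl];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const pageUrl of frontier) {
      const links = await getLinks(pageUrl);
      for (const link of links) {
        // Assumption: only same-origin pages are followed.
        if (new URL(link).origin !== origin || visited.has(link)) continue;
        visited.add(link);
        next.push(link);
      }
    }
    frontier = next;
  }
  return Array.from(visited);
}
```

The visited set prevents loops when pages link back to each other. Processing one depth level at a time makes the maxDepth cutoff easy to reason about.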
check-links
Checks for broken links on a page. Parameters: url (required)
find-patterns
Finds URLs matching a specific pattern. Parameters: url (required), pattern (required)
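The listing does not say whether pattern is a regular expression, a glob, or a plain substring. Assuming a regular expression, the matching step could look like the minimal sketch below; findMatchingUrls is a hypothetical name used only for illustration.

```typescript
// Assumption: `pattern` is a JavaScript regular expression source string.
function findMatchingUrls(urls: string[], pattern: string): string[] {
  // No "g" flag, so RegExp.test keeps no state between calls.
  const regex = new RegExp(pattern);
  return urls.filter((url) => regex.test(url));
}

// Example: keep only numbered blog posts.
const posts = findMatchingUrls(
  [
    "https://example.com/blog/1",
    "https://example.com/about",
    "https://example.com/blog/2",
  ],
  "/blog/\\d+$"
);
```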
generate-site-map
Generates a simple XML sitemap by crawling. Parameters: url (required), maxDepth (optional, default: 2), limit (optional, default: 1000)
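To show what a simple XML sitemap means in practice, here is a minimal sketch that turns a list of already-discovered URLs into a document in the sitemaps.org 0.9 format. The function names escapeXml and buildSitemap are hypothetical, the crawling step is omitted, and the default limit of 1000 mirrors the documented default. The server's real output may include more fields, such as lastmod.

```typescript
// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap containing at most `limit` URLs, one <url> entry per line.
function buildSitemap(urls: string[], limit = 1000): string {
  const entries = urls
    .slice(0, limit)
    .map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}
```

Escaping matters because query strings often contain an ampersand, which would otherwise make the XML invalid.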