Add Web Content Extractor MCP Server | MCP Directory - Model Context Protocol

Web Content Extractor

by bsmi021

Summary

This MCP server, written in TypeScript, provides tools for scanning and analyzing web content. It uses Cheerio for HTML parsing and Turndown for HTML-to-Markdown conversion to fetch, analyze, and transform web pages. It is designed to fit into AI-assisted workflows such as web scraping, content summarization, and data extraction. It is particularly useful for researchers, content creators, and developers who need to automate web content analysis, generate structured data from websites, or bring web-based information into their AI applications.

Available Actions (6)

fetch-page

Fetches a web page and converts it to Markdown. Parameters: url (required), selector (optional)
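
As a rough illustration of how a client might invoke this action, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0). The helper name `buildFetchPageRequest`, the example URL, and the reading of `selector` as a CSS selector such as `main` are assumptions for illustration; the listing only states that `url` is required and `selector` is optional.

```typescript
// JSON-RPC 2.0 request shape that MCP uses to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds a fetch-page call.
// `selector` is omitted from the arguments when it is not supplied.
function buildFetchPageRequest(id: number, url: string, selector?: string): ToolCallRequest {
  const args: Record<string, unknown> = { url };
  if (selector !== undefined) {
    args["selector"] = selector; // assumed to be a CSS selector
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch-page", arguments: args },
  };
}

const fetchPageRequest = buildFetchPageRequest(1, "https://example.com/docs", "main");
console.log(JSON.stringify(fetchPageRequest, null, 2));
```

How the request reaches the server (stdio or another transport) depends on the client and is not shown here.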

extract-links

Extracts all links from a web page with their text. Parameters: url (required), baseUrl (optional), limit (optional, default: 100)
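
The sketch below shows one way link extraction with a base URL and a limit could work. It is not the server's code: the server is described as using Cheerio, while this dependency-free version uses a simplified regular expression as a stand-in, and the names `extractLinks` and `PageLink` are hypothetical.

```typescript
interface PageLink {
  href: string; // absolute URL after resolving against the base
  text: string; // anchor text with nested tags stripped
}

// Simplified stand-in for an HTML parser: matches <a ... href="...">...</a>.
function extractLinks(html: string, baseUrl: string, limit = 100): PageLink[] {
  const anchor = /<a\b[^>]*?\bhref\s*=\s*["']([^"']*)["'][^>]*>([\s\S]*?)<\/a>/gi;
  const links: PageLink[] = [];
  let match: RegExpExecArray | null;
  while (links.length < limit && (match = anchor.exec(html)) !== null) {
    let href: string;
    try {
      // Relative hrefs are resolved against baseUrl; absolute ones pass through.
      href = new URL(match[1] ?? "", baseUrl).href;
    } catch {
      continue; // skip hrefs that cannot be resolved to a URL
    }
    const text = (match[2] ?? "").replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    links.push({ href, text });
  }
  return links;
}

const sampleHtml =
  '<a href="/about">About <b>us</b></a> <a class="x" href="post-1">Post</a>';
console.log(extractLinks(sampleHtml, "https://example.com/blog/"));
```

A real HTML parser handles unquoted attributes, comments, and malformed markup that this pattern does not.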

crawl-site

Recursively crawls a website up to a specified depth. Parameters: url (required), maxDepth (optional, default: 2)
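
To make the role of `maxDepth` concrete, here is a minimal depth-limited crawl over an in-memory link graph. The breadth-first order, the same-origin restriction, and the `LinkSource` callback are assumptions made for this sketch; the listing does not say how the server traverses or scopes a site.

```typescript
// Hypothetical callback: returns the absolute URLs linked from a page.
type LinkSource = (url: string) => string[];

// Returns every visited URL mapped to the depth at which it was found.
function crawl(startUrl: string, getLinks: LinkSource, maxDepth = 2): Map<string, number> {
  const origin = new URL(startUrl).origin;
  const depthByUrl = new Map<string, number>([[startUrl, 0]]);
  const queue: string[] = [startUrl];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    const depth = depthByUrl.get(current) as number;
    if (depth >= maxDepth) continue; // pages at maxDepth are recorded but not expanded
    for (const next of getLinks(current)) {
      if (depthByUrl.has(next)) continue; // already seen
      if (new URL(next).origin !== origin) continue; // assumption: stay on one origin
      depthByUrl.set(next, depth + 1);
      queue.push(next);
    }
  }
  return depthByUrl;
}

// Made-up site graph standing in for real page fetches.
const sampleSite: Record<string, string[]> = {
  "https://example.com/": ["https://example.com/a", "https://other.test/"],
  "https://example.com/a": ["https://example.com/b", "https://example.com/"],
  "https://example.com/b": ["https://example.com/c"],
};
const visited = crawl("https://example.com/", (url) => sampleSite[url] ?? []);
console.log(Array.from(visited.entries()));
```

With the default depth of 2, `/c` is never reached because `/b` sits at depth 2 and is not expanded.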

check-links

Checks for broken links on a page. Parameters: url (required)
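
The listing does not define what counts as "broken". One common convention, assumed in the sketch below, is to flag links whose request failed outright or returned an HTTP status of 400 or above. The `ProbeResult` type and `findBrokenLinks` function are hypothetical names, and the probe results are made-up sample data rather than real responses.

```typescript
// Result of probing one link; `status` is null when the request itself failed.
interface ProbeResult {
  url: string;
  status: number | null;
}

// Assumed rule: network failures and 4xx/5xx responses count as broken.
function findBrokenLinks(results: ProbeResult[]): ProbeResult[] {
  return results.filter((r) => r.status === null || r.status >= 400);
}

const sampleProbes: ProbeResult[] = [
  { url: "https://example.com/", status: 200 },
  { url: "https://example.com/missing", status: 404 },
  { url: "https://example.com/moved", status: 301 },
  { url: "https://unreachable.test/", status: null },
];
console.log(findBrokenLinks(sampleProbes).map((r) => r.url));
```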

find-patterns

Finds URLs matching a specific pattern. Parameters: url (required), pattern (required)
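
The form of `pattern` is not specified in the listing. Assuming it is a regular expression, filtering a page's URLs could look like the sketch below; `findMatchingUrls` is a hypothetical name and the URLs are sample data.

```typescript
// Keeps URLs that match the pattern, dropping duplicates while preserving order.
function findMatchingUrls(urls: string[], pattern: string): string[] {
  const regex = new RegExp(pattern); // no "g" flag, so test() keeps no state between calls
  const seen = new Set<string>();
  return urls.filter((url) => {
    if (!regex.test(url) || seen.has(url)) return false;
    seen.add(url);
    return true;
  });
}

const sampleUrls = [
  "https://example.com/blog/1",
  "https://example.com/about",
  "https://example.com/blog/2",
  "https://example.com/blog/1",
];
console.log(findMatchingUrls(sampleUrls, "/blog/\\d+$"));
```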

generate-site-map

Generates a simple XML sitemap by crawling. Parameters: url (required), maxDepth (optional, default: 2), limit (optional, default: 1000)
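
For reference, a simple XML sitemap in the sitemaps.org 0.9 format can be assembled from a list of crawled URLs as sketched below. The function names and the de-duplication step are assumptions, and the exact output of the server's action may differ.

```typescript
// Escapes the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds a minimal sitemap with one <url><loc> entry per unique URL, capped at `limit`.
function buildSitemap(urls: string[], limit = 1000): string {
  const unique = Array.from(new Set(urls)).slice(0, limit);
  const entries = unique.map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

console.log(buildSitemap(["https://example.com/", "https://example.com/?a=1&b=2"]));
```

Escaping matters because query strings often contain `&`, which is not valid unescaped inside a `<loc>` element.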

Last Updated: July 7, 2025

Documentation

View GitHub repository

Language

TypeScript

Categories

Productivity, Developer Tools, Design, Artificial Intelligence, Search
