MCP tool access to MarkItDown, a library that converts many file formats (local or remote) to Markdown for LLM consumption.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
MarkItDown: a lightweight Python utility for converting various files to Markdown for use with LLMs and text analysis pipelines.
markitdown path-to-file.pdf > document.md
Or specify an output file with -o:
markitdown path-to-file.pdf -o document.md
Piping content is also supported:
cat path-to-file.pdf | markitdown
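The piping form above can also be driven from a script. The following is a minimal sketch, not part of MarkItDown itself: the helper name `to_markdown` is hypothetical, and it assumes the `markitdown` command is on `PATH` and writes UTF-8 text to standard output.

```python
import subprocess
from typing import Sequence


def to_markdown(data: bytes, cmd: Sequence[str] = ("markitdown",)) -> str:
    """Pipe raw file bytes to a converter command and return its stdout as text.

    `to_markdown` is a hypothetical helper; `cmd` defaults to the `markitdown`
    CLI, which reads the document from stdin when no path is given.
    """
    proc = subprocess.run(
        list(cmd),
        input=data,           # same role as `cat file | markitdown`
        capture_output=True,  # collect stdout/stderr instead of printing them
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    # Assumption: the command emits UTF-8 encoded Markdown.
    return proc.stdout.decode("utf-8")
```

Keeping the command as a parameter makes it possible to swap in another converter, or a stub when testing, without changing the calling code.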
from markitdown import MarkItDown

md = MarkItDown(enable_plugins=False)
result = md.convert("test.xlsx")
print(result.text_content)
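The same `convert` call can be applied to several files in a loop. The sketch below is one possible way to do that, under stated assumptions: `convert_many` is a hypothetical helper, and the converter is passed in as a callable whose result exposes `text_content`, as in the snippet above.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def convert_many(
    paths: Iterable[Path],
    out_dir: Path,
    convert: Callable[[str], object],
) -> List[Path]:
    """Convert each input file and write `<stem>.md` into `out_dir`.

    `convert` is any callable that takes a path string and returns an object
    with a `text_content` attribute (for example, `MarkItDown().convert`).
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for src in paths:
        result = convert(str(src))
        dest = out_dir / (src.stem + ".md")
        # Assumption: UTF-8 is an acceptable encoding for the output files.
        dest.write_text(result.text_content, encoding="utf-8")
        written.append(dest)
    return written
```

With MarkItDown installed, a call such as `convert_many(sorted(Path("docs").glob("*.pdf")), Path("out"), MarkItDown(enable_plugins=False).convert)` would write one Markdown file per PDF; the `docs` and `out` directory names are placeholders.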
docker build -t markitdown:latest .
docker run --rm -i markitdown:latest < ~/your-file.pdf > output.md
pip install 'markitdown[pdf, docx, pptx]'
Install via pip:
pip install 'markitdown[all]'
Or install from source:
git clone git@github.com:microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'
Allows the AI to read .ged files and genetic data
Get the LaTeX source of arXiv papers to handle mathematical content and equations
[Vectorize](https://vectorize.io) MCP server for advanced retrieval, Private Deep Research, Anything-to-Markdown file extraction and text chunking.
An MCP server to convert almost any file or web content into Markdown