8 changes: 7 additions & 1 deletion README.md
@@ -132,6 +132,12 @@ markitdown --use-plugins path-to-file.pdf

To find available plugins, search GitHub for the hashtag `#markitdown-plugin`. To develop a plugin, see `packages/markitdown-sample-plugin`.

Plugins are the recommended extension path for optional or backend-specific functionality, especially when an extension:

- adds non-default dependencies
- depends on external services or model runtimes
- changes converter behavior only for opt-in users

#### markitdown-ocr Plugin

The `markitdown-ocr` plugin adds OCR support to the PDF, DOCX, PPTX, and XLSX converters. It extracts text from embedded images using LLM Vision, following the same `llm_client` / `llm_model` pattern that MarkItDown already uses for image descriptions, and requires no new ML libraries or binary dependencies.
@@ -143,7 +149,7 @@ pip install markitdown-ocr
pip install openai # or any OpenAI-compatible client
```

**Usage:**
**Usage (Python API):**

Pass the same `llm_client` and `llm_model` you would use for image descriptions:

14 changes: 8 additions & 6 deletions packages/markitdown-ocr/README.md
@@ -26,12 +26,6 @@ pip install openai

## Usage

### Command Line

```bash
markitdown document.pdf --use-plugins --llm-client openai --llm-model gpt-4o
```

### Python API

Pass `llm_client` and `llm_model` to `MarkItDown()` exactly as you would for image descriptions:
@@ -52,6 +46,12 @@ print(result.text_content)

If no `llm_client` is provided, the plugin still loads, but OCR is silently skipped and conversion falls back to the standard built-in converter.

### Command Line

MarkItDown's built-in CLI can enable installed plugins with `--use-plugins`, but it does not currently construct Python client objects such as `OpenAI()` for you.

For that reason, this plugin is primarily configured through the Python API shown above, where you can pass `llm_client` and `llm_model` directly to `MarkItDown(...)`.
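
One way to get command-line behavior anyway is a small wrapper script that builds the client itself. The sketch below is a hypothetical example, not part of MarkItDown or this plugin: the function names (`parse_args`, `convert`, `main`) and the wrapper's own `--llm-model` flag are assumptions made for illustration. It assumes the `openai` and `markitdown` packages are installed and that `OPENAI_API_KEY` is set in the environment.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the wrapper's own arguments (hypothetical flags, not MarkItDown's CLI)."""
    parser = argparse.ArgumentParser(description="Convert a document with OCR enabled.")
    parser.add_argument("path", help="file to convert, e.g. document.pdf")
    parser.add_argument("--llm-model", default="gpt-4o", help="model name passed as llm_model")
    return parser.parse_args(argv)


def convert(path: str, model: str) -> str:
    """Build the client in Python and run the conversion with plugins enabled."""
    # Imported lazily so argument parsing works even without these packages.
    from markitdown import MarkItDown
    from openai import OpenAI

    md = MarkItDown(enable_plugins=True, llm_client=OpenAI(), llm_model=model)
    return md.convert(path).text_content


def main(argv: Optional[Sequence[str]] = None) -> int:
    args = parse_args(argv)
    print(convert(args.path, args.llm_model))
    return 0
```

Saved as a script that calls `main()`, this could be run as `python ocr_convert.py document.pdf --llm-model gpt-4o`, where `ocr_convert.py` is again only a placeholder name.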

### Custom Prompt

Override the default extraction prompt for specialized documents:
@@ -100,6 +100,8 @@ When a file is converted:
4. The returned text is inserted inline, preserving document structure
5. If the LLM call fails, conversion continues without that image's text
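
The failure handling in step 5 can be sketched as a per-image loop that skips any image whose call raises. This is a minimal illustration of the pattern, not the plugin's actual code: `ocr_images` and `extract_text` are hypothetical names, and `extract_text` stands in for whatever function sends one image to the LLM.

```python
from typing import Callable, Iterable, List


def ocr_images(images: Iterable[bytes], extract_text: Callable[[bytes], str]) -> List[str]:
    """Return the extracted text for each image, skipping images whose call fails."""
    texts: List[str] = []
    for image in images:
        try:
            texts.append(extract_text(image))
        except Exception:
            # A failed call drops only this image's text; the loop continues.
            continue
    return texts
```

Catching the exception per image, rather than around the whole loop, is what keeps one bad image from discarding the text already extracted from the others.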

This plugin is one example of the broader plugin-first extension model in MarkItDown: backend-specific OCR or document-processing logic can live in separately installed packages without changing the default core behavior.

## Supported File Formats

### PDF
2 changes: 2 additions & 0 deletions packages/markitdown-sample-plugin/README.md
@@ -71,6 +71,8 @@ sample_plugin = "markitdown_sample_plugin"

Here, the key `sample_plugin` can be any name, but should ideally be the name of the plugin. The value is the fully qualified name of the package implementing the plugin.

If your plugin needs optional configuration, `register_converters(markitdown, **kwargs)` also receives any additional keyword arguments passed to `MarkItDown(enable_plugins=True, **kwargs)`, so it can read its settings from them.
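
As a rough sketch of reading such configuration, a plugin module might look like the following. The option name `sample_prompt` and the class `ConfigurableConverter` are hypothetical and exist only for this example. The class is a plain stand-in so the snippet runs on its own, whereas a real plugin would register an actual converter implementation.

```python
from typing import Any, Optional


class ConfigurableConverter:
    """Hypothetical stand-in for a converter that accepts optional configuration."""

    def __init__(self, prompt: Optional[str] = None) -> None:
        self.prompt = prompt


def register_converters(markitdown: Any, **kwargs: Any) -> None:
    """Plugin entry point: read optional settings from kwargs, then register."""
    # `sample_prompt` is an assumed option name; unknown kwargs are simply ignored.
    prompt = kwargs.get("sample_prompt")
    markitdown.register_converter(ConfigurableConverter(prompt=prompt))
```

With this in place, a caller could write `MarkItDown(enable_plugins=True, sample_prompt="...")`, and the plugin would fall back to `None` when the option is omitted.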


## Installation
