What it does

The skill triggers automatically when the agent detects a PDF path in the conversation or a request matching one of the trigger phrases. It returns structured markdown optimized for LLM consumption: tables stay tables, headings stay headings, and footnotes get linked.
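As a rough illustration of the trigger rule described above, the check could look something like the sketch below. The regular expression, the phrase list, and the function names are all hypothetical; the skill's real trigger phrases and matching logic are not given here.

```python
import re

# Hypothetical pattern: a whitespace-free token ending in ".pdf" (case-insensitive).
PDF_PATH = re.compile(r"[^\s\"'<>]+\.pdf\b", re.IGNORECASE)

# Hypothetical trigger phrases, made up for illustration only.
TRIGGER_PHRASES = ("read this pdf", "summarize the pdf", "extract the table")


def find_pdf_paths(message: str) -> list[str]:
    """Return every token in the message that looks like a path to a PDF."""
    return PDF_PATH.findall(message)


def should_trigger(message: str) -> bool:
    """True if the message contains a PDF path or one of the trigger phrases."""
    lowered = message.lower()
    return bool(find_pdf_paths(message)) or any(p in lowered for p in TRIGGER_PHRASES)
```

Either condition alone is enough to fire, which matches the "PDF path or trigger phrase" wording; a stricter design could require both.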

How to use

Drop a PDF into the conversation and ask for what you need. The skill handles extraction transparently:

Extract the Q4 revenue table from ./reports/q4-2025.pdf and summarize the year-over-year trend.

Under the hood

The skill uses poppler for text-layer extraction and falls back to tesseract when the PDF has no text layer (scanned docs). Tables are detected with a layout heuristic and normalized before being handed back to the model.
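The poppler-then-tesseract fallback described above can be sketched roughly as follows. This is a minimal sketch, not the skill's actual implementation. It assumes the `pdftotext`, `pdftoppm`, and `tesseract` command-line tools are on the PATH, and it assumes that "no text layer" can be approximated by `pdftotext` returning only whitespace. The function names and the `min_chars` threshold are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def poppler_text(pdf: str) -> str:
    """Text-layer extraction via poppler's pdftotext ("-" sends output to stdout)."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def tesseract_text(pdf: str) -> str:
    """OCR path: rasterize pages with pdftoppm, then run tesseract on each image."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = str(Path(tmp) / "page")
        subprocess.run(["pdftoppm", "-r", "300", "-png", pdf, prefix], check=True)
        pages = []
        # pdftoppm zero-pads page numbers, so a lexicographic sort keeps page order.
        for image in sorted(Path(tmp).glob("page-*.png")):
            ocr = subprocess.run(
                ["tesseract", str(image), "stdout"],
                capture_output=True, text=True, check=True,
            )
            pages.append(ocr.stdout)
        return "\n".join(pages)


def extract_text(
    pdf: str,
    text_layer: Callable[[str], str] = poppler_text,
    ocr: Callable[[str], str] = tesseract_text,
    min_chars: int = 1,  # hypothetical threshold for "has a text layer"
) -> str:
    """Try the text layer first; fall back to OCR if it yields (almost) nothing."""
    text = text_layer(pdf)
    if len(text.strip()) >= min_chars:
        return text
    return ocr(pdf)
```

The two extractors are passed in as parameters so the fallback logic can be exercised without either tool installed, for example `extract_text("scan.pdf", text_layer=lambda p: "", ocr=lambda p: "ocr text")`.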
