Docling: Open-Source Document AI for PDFs and Forms

Docling converts PDFs, Word files, and images into clean structured data for AI and RAG pipelines, with layout, tables, and reading order preserved. For a team processing documents at volume, it replaces a paid per-page extraction API with a tool that runs locally at zero per-page cost.

Every automation project that touches paper hits the same wall as the ones that touch the web: the input is a mess. A PDF invoice, a scanned contract, a form someone filled out by hand: none of it is data a computer can use until something reads the layout, finds the tables, and works out the order the words are meant to be read in. That extraction step is the unglamorous plumbing under document automation, and the easy path has been to pay a cloud API by the page.

Docling is the open-source alternative for that job. It is MIT licensed, has roughly 62,000 GitHub stars, and converts PDFs, DOCX files, images, and other documents into clean structured data. It preserves layout, tables, and reading order, and its output is designed for feeding AI and RAG pipelines.

What it does

Docling takes a document in and gives structured data out. Point it at a PDF, a Word file, or an image, and it returns the content with its structure intact: tables recognized as tables, reading order sorted out, and layout understood rather than flattened into a jumble of characters.

That structure is the whole point. Raw OCR that dumps text in the wrong order is nearly useless to a language model, and it is worse than useless for anything that depends on tables. Docling is built to produce the kind of clean, structured output that a RAG pipeline or an AI workflow can actually consume, which is why it shows up as the ingestion layer in a lot of document-processing stacks. In plain terms, it is OCR-to-structured-data conversion: intelligent document processing you run yourself.
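As a rough illustration of the document-in, structured-data-out flow described above, here is a minimal sketch using Docling's Python package. It assumes docling is installed (pip install docling). The helper name document_to_markdown and the optional converter argument are ours, added only so the function can be exercised without the library present. Treat it as a starting point, not a production setup.

```python
def document_to_markdown(source, converter=None):
    """Convert one document (a file path or URL) to Markdown text with Docling.

    `converter` is a hypothetical injection point for this sketch; when it is
    omitted, a default Docling DocumentConverter is created.
    """
    if converter is None:
        # Imported lazily so this module still loads where docling is absent.
        from docling.document_converter import DocumentConverter
        converter = DocumentConverter()

    # convert() returns a result whose .document holds the parsed structure.
    result = converter.convert(source)

    # Markdown export keeps headings, reading order, and tables as tables.
    return result.document.export_to_markdown()


# Example call (the path is a placeholder, not a file from the article):
# print(document_to_markdown("invoice.pdf"))
```

Markdown is a convenient target because most RAG chunkers accept it directly. The same document object can also be exported in other forms if a pipeline needs the structure rather than the text.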

What it displaces

The cost comparison is against the paid document-AI APIs. AWS Textract's AnalyzeDocument charges $50 per 1,000 pages for forms, plus another $15 per 1,000 pages for table extraction, so $65 per 1,000 pages when you need both. Process invoices or contracts at volume and that meter runs steadily, page after page, month after month.

Docling runs locally at zero per-page cost. There is no API charging as documents flow through, because the documents flow through hardware you already have. What you pay instead is your own compute and the setup time to stand it up, and that is the honest caveat at the center of the trade: a per-page cloud bill versus your own infrastructure and the engineering time to run it.
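To put numbers on the cloud side of that trade, the sketch below multiplies a monthly page count by the per-1,000-page rates quoted above. The 100,000-page volume is a hypothetical example, and the calculation assumes flat rates, ignoring any volume tiers or regional pricing a real bill might include.

```python
# Rates quoted in the article, in USD per 1,000 pages.
FORMS_RATE_PER_1000 = 50.0
TABLES_RATE_PER_1000 = 15.0


def textract_monthly_cost(pages, forms=True, tables=True):
    """Estimate a monthly AnalyzeDocument bill at flat per-page rates."""
    rate = 0.0
    if forms:
        rate += FORMS_RATE_PER_1000
    if tables:
        rate += TABLES_RATE_PER_1000
    return pages / 1000 * rate


monthly = textract_monthly_cost(100_000)  # hypothetical monthly volume
print(f"${monthly:,.2f} per month, ${monthly * 12:,.2f} per year")
```

At that assumed volume the meter comes to $6,500 a month, which is the figure the local setup's compute and engineering time has to beat.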

Who it is for, and who it is not

Docling fits an ops or data team processing documents at volume (invoices, contracts, forms, statements) with a technical person who can run it. For that team the math is straightforward: above a certain page count, owning the extraction beats renting it, and the savings compound with every batch. The per-page meter that used to scale with your document volume becomes a fixed piece of infrastructure instead.
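The "certain page count" is a simple break-even: the fixed monthly cost of running extraction yourself divided by the per-page rate you would otherwise pay. The sketch below works that out. The $1,300 monthly figure is a made-up placeholder for compute plus engineering time, not a measured number, so substitute your own.

```python
import math


def break_even_pages(fixed_monthly_cost, api_rate_per_1000):
    """Smallest monthly page count at which the per-page bill reaches the fixed cost."""
    if api_rate_per_1000 <= 0:
        raise ValueError("api_rate_per_1000 must be positive")
    return math.ceil(fixed_monthly_cost * 1000 / api_rate_per_1000)


# $50 + $15 = $65 per 1,000 pages, the forms and tables rates quoted in the article.
# 1300 is a hypothetical monthly cost of self-hosting (compute plus staff time).
print(break_even_pages(1300, 65))
```

With those assumed inputs the crossover sits at 20,000 pages a month. Below it the hosted API is cheaper, and above it every extra page widens the gap in favor of running locally.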

It is the wrong tool for a one-off document or a non-technical user. If you need to pull the numbers out of a single contract this afternoon, standing up Docling is absurd; upload it to a hosted service and move on. And if nobody on your side can run and maintain it, the local-and-free story does not apply to you, because the cost was never really the per-page fee. What you were paying for was not having to run it, and a hosted API still sells that.

The bigger picture

The layer that turns real-world documents into machine-readable data keeps getting commoditized the same way scraping did. A year ago this was mostly something you paid a cloud provider to do by the page. Now a free, well-maintained, 62,000-star project does it locally, and the only cost is the compute and the hours to run it. For teams with the volume and the technical person, the marginal cost of turning a stack of PDFs into usable data is heading toward the price of the electricity.
