Parse any file
into structured text

File in, LLM-ready text out. PDF, DOCX, XLSX, PPTX and images — clean markdown and structured JSON via one API call.

$ curl https://api.files2llm.com/jobs \
  -H "X-API-Key: f2l_..." \
  -F "file=@report.pdf"

{
  "job_id": "abc-123",
  "status": "queued"
}

What you get

Clean markdown out

Structured markdown plus plain text, page by page. LLM-ready — no layout noise, no boilerplate.

Every common format

PDF, DOCX, XLSX, PPTX and images through one endpoint, with consistent ParsedDocument JSON out.

Scanned PDFs & OCR

Handles scanned documents and images with OCR, so even non-digital files become text.

Sync or webhook

Poll for status or pass a callback URL. Either approach works for interactive use and for batch ingestion.

Language detection

Each document comes back with its detected language, ready for downstream routing.

Multiple keys per account

Rotate keys safely, label them per environment, revoke individually from the dashboard.

See it in action

Same API, three flavors. Pick your stack.

# Upload a file (PDF, DOCX, XLSX, PPTX, image)
curl -X POST https://api.files2llm.com/jobs \
  -H "X-API-Key: f2l_..." \
  -F "file=@report.pdf"

# Response
{"job_id":"abc-123","status":"queued"}

# Poll until parsed
curl https://api.files2llm.com/jobs/abc-123 \
  -H "X-API-Key: f2l_..."

# Fetch the structured result (text + markdown per page)
curl https://api.files2llm.com/jobs/abc-123/result \
  -H "X-API-Key: f2l_..."
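
The poll-then-fetch steps above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not an official client: the page only says "poll until parsed", so the status names in `DONE_STATUSES` and `FAILED_STATUSES` are guesses, and the helper names `get_json` and `wait_for_result` are hypothetical. The endpoints and the `X-API-Key` header are taken from the curl calls above.

```python
import json
import time
import urllib.request

API_BASE = "https://api.files2llm.com"

# Assumption: the page does not list status values; these names are guesses.
DONE_STATUSES = {"parsed", "done", "completed"}
FAILED_STATUSES = {"failed", "error"}


def get_json(path, api_key):
    """GET a JSON document from the API, sending the X-API-Key header."""
    req = urllib.request.Request(API_BASE + path, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_result(job_id, api_key, fetch=get_json, interval=2.0,
                    max_polls=60, sleep=time.sleep):
    """Poll /jobs/{id} until a terminal status, then fetch /jobs/{id}/result."""
    for _ in range(max_polls):
        status = fetch(f"/jobs/{job_id}", api_key).get("status")
        if status in DONE_STATUSES:
            return fetch(f"/jobs/{job_id}/result", api_key)
        if status in FAILED_STATUSES:
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        sleep(interval)  # still queued or in progress: wait before the next poll
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")
```

Calling `wait_for_result("abc-123", "f2l_...")` would block until the job finishes or the poll budget runs out. The `fetch` and `sleep` parameters exist only so the loop can be exercised without network access.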

files2llm is part of the 2LLM Suite: focused APIs that turn messy inputs into LLM-ready data. For the rest, try scrape2llm, html2media, html2reel, stream2llm, and research2llm.

Start parsing files in under a minute.

Get your API key →
