Structured data extraction from unstructured text: parties, dates, amounts, and governing-law clauses from contracts; diagnoses, medications, and dosages from clinical notes; company names, transaction amounts, and counterparties from financial documents; dimensions, materials, and certifications from supplier product sheets. spaCy and Hugging Face token classification models fine-tuned on your annotated documents handle high-volume, latency-sensitive extraction; LLM-based extraction (GPT-4o structured outputs, Claude tool use) handles complex, variable-format documents where rigid entity schemas don't capture the variation. Custom entity types are defined for your domain vocabulary and trained on examples labeled in annotation tools (Label Studio, Prodigy), because a medical billing system has different entity requirements than a contract management platform. Nested entity handling extracts entities within entities (for example, a medication with its dose, route, and frequency as sub-attributes). Output is delivered as structured JSON to your database, ERP, or document management system, replacing the manual data entry step that currently sits between document receipt and system record creation.
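
To make the nested-entity and JSON-output ideas concrete, here is a minimal sketch of how a medication with its dose, route, and frequency might be represented and serialized. It uses only the Python standard library. The class names (`Medication`, `ClinicalNoteRecord`), the field names, and the sample values are illustrative assumptions, not a fixed schema; a real schema would be agreed per domain.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional


@dataclass
class Medication:
    # Parent entity; dose, route, and frequency are nested sub-attributes.
    # Field names are hypothetical and would follow your own schema.
    name: str
    dose: Optional[str] = None
    route: Optional[str] = None
    frequency: Optional[str] = None


@dataclass
class ClinicalNoteRecord:
    # Hypothetical record shape for one clinical note.
    diagnoses: List[str] = field(default_factory=list)
    medications: List[Medication] = field(default_factory=list)


def to_json(record: ClinicalNoteRecord) -> str:
    # asdict() recurses into nested dataclasses and lists of dataclasses,
    # so each medication becomes a nested JSON object.
    return json.dumps(asdict(record), indent=2)


# Illustrative values only, standing in for what an extraction model returns.
record = ClinicalNoteRecord(
    diagnoses=["type 2 diabetes mellitus"],
    medications=[
        Medication(name="metformin", dose="500 mg", route="oral", frequency="twice daily")
    ],
)
payload = to_json(record)
print(payload)
```

Keeping the sub-attributes inside the parent object, rather than emitting dose and frequency as separate flat entities, is one way to preserve which dose belongs to which medication when a note lists several drugs.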