Changelog

All notable changes to this project are documented here. The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

Starting with v0.2.0, each release will be documented with granular Added / Changed / Fixed / Removed / Security sections. v0.1.0 is the single foundational cut that establishes the baseline.

0.1.1 — 2026-04-21

Docs patch — surface the independent v0.1.0 benchmark on PyPI and crates.io.

No engine changes. This release publishes the independent, reproducible post-release audit at olga_v0.1.0_benchmark/ and links it from the crate README, the PyPI README, the MkDocs landing page, and BENCHMARKS.md. Headline result on a 50-file mixed-format corpus: Olga ran 1.62× faster and extracted 2.62× more content than a hand-routed best-of-breed pipeline. The crate metadata author field is also corrected from "Hugues Tankouo" to "Hugues Dtankouo".

0.1.0 — 2026-04-21

First public release — the end-to-end Olga pipeline in one cut.

This foundational cut ships the full intelligent document processing pipeline: a Rust core that parses PDF, DOCX, XLSX, and HTML with provenance tracking and table reconstruction; a Python distribution (olgadoc) with a strictly typed API surface; an olga CLI for inspection, extraction, search, and page-level access; runnable examples; end-to-end regression coverage; an MkDocs site; and a full CI/CD pipeline that publishes to crates.io and PyPI. The public API is stable enough for evaluation and prototyping; expect minor breaking changes on the path to 1.0.