API reference

Every symbol in the olgadoc package is documented here. The pages are generated directly from the live docstrings via mkdocstrings, so they stay in sync with the code.

Top-level classes

  • Document — the primary entry point. Open, inspect, extract.
  • Page — a single page inside a document.
  • Processability — health report for an opened document.

Exceptions

OlgaError

Bases: Exception

Raised for every error surfaced by the Olga engine.

The exception message reflects the underlying Rust error variant — encrypted document, unsupported format, decode failure, and so on — so callers can pattern-match on substrings or simply re-raise.
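The substring matching described above might look like the sketch below. The keywords "encrypted", "unsupported" and "decode" are assumptions taken from the variant names in the prose; the exact message wording is not documented here, so check it against real errors before relying on it. The `classify` helper is hypothetical, and the fallback class exists only so the sketch runs without olgadoc installed.

```python
try:
    from olgadoc import OlgaError  # the real exception, when olgadoc is installed
except ImportError:
    class OlgaError(Exception):
        """Stand-in so this sketch runs without olgadoc."""


def classify(exc: Exception) -> str:
    """Map an error message to a coarse category by substring (hypothetical helper)."""
    message = str(exc).lower()
    # Assumed keywords; the engine's actual wording may differ.
    for keyword in ("encrypted", "unsupported", "decode"):
        if keyword in message:
            return keyword
    return "other"


try:
    # Simulated failure with a made-up message, in place of a real olgadoc call.
    raise OlgaError("document is encrypted")
except OlgaError as exc:
    kind = classify(exc)
    if kind == "other":
        raise  # unknown failure: simply re-raise
    print(f"handled {kind} error: {exc}")
```

Because the match is on free text, it is more fragile than matching on an exception subclass; keeping an "other" branch that re-raises avoids silently swallowing messages the keywords do not cover.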

TypedDict payloads

Every method that returns a dict returns a real TypedDict — see:

  • Payloads: Link, Table, SearchHit, Chunk, OutlineEntry, ExtractedImage, HealthIssue and friends.
  • JSON tree: DocumentJson and the 16-variant JsonElement discriminated union returned by Document.to_json.
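As a reminder of what a TypedDict return means in practice, here is a minimal sketch. `DemoHit` and `first_hit` are hypothetical names invented for illustration; they are not olgadoc types, and the real payload fields are listed on the pages linked above.

```python
from typing import TypedDict


class DemoHit(TypedDict):
    # Hypothetical shape for illustration only; not an olgadoc payload.
    page: int
    text: str


def first_hit() -> DemoHit:
    """Return a value that type checkers treat as DemoHit."""
    return {"page": 1, "text": "hello"}


hit = first_hit()
# At runtime a TypedDict value is an ordinary dict, so normal dict
# operations (indexing, iteration, json.dumps) work unchanged.
print(type(hit) is dict, hit["page"])
```

The key names and value types are checked statically by tools such as mypy or pyright; nothing is validated at runtime.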

Module version

__version__ module-attribute

__version__: str = '0.1.1'
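Since `__version__` is a plain string, comparing it directly can give misleading results (for example, "0.10.0" sorts before "0.9.0" as text). A minimal sketch of a numeric comparison follows; `version_tuple` is a hypothetical helper, the literal stands in for `olgadoc.__version__`, and the "0.2.0" threshold is made up for illustration.

```python
def version_tuple(version: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '0.1.1' into (0, 1, 1)."""
    return tuple(int(part) for part in version.split("."))


installed = "0.1.1"  # stands in for olgadoc.__version__

# Tuples compare element by element, so 0.10.0 correctly ranks above 0.9.0.
if version_tuple(installed) < version_tuple("0.2.0"):
    print("installed version is older than 0.2.0")
```

This only handles purely numeric versions; for pre-release or local suffixes, `packaging.version.Version` is the more robust choice.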
