Client
omniread.pdf.client
PDF client abstractions for OmniRead.
This module defines the client layer responsible for retrieving raw PDF bytes from a concrete backing store.
Clients provide low-level access to PDF binaries and are intentionally decoupled from scraping and parsing logic. They do not perform validation, interpretation, or content extraction.
Typical backing stores include:

- Local filesystems
- Object storage (S3, GCS, etc.)
- Network file systems
BasePDFClient
Bases: ABC
Abstract client responsible for retrieving PDF bytes from a specific backing store (filesystem, S3, FTP, etc.).
Implementations must:

- Accept a source identifier appropriate to the backing store
- Return the full PDF binary payload
- Raise retrieval-specific errors on failure
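The three requirements above can be sketched with a small in-memory implementation. This is only an illustration: the `BasePDFClient` defined here is a stand-in that mirrors the documented interface so the snippet runs on its own, and `InMemoryPDFClient` and `PDFNotFoundError` are hypothetical names that are not part of OmniRead.

```python
from abc import ABC, abstractmethod
from typing import Any


class BasePDFClient(ABC):
    """Stand-in mirroring the documented interface (not the real OmniRead class)."""

    @abstractmethod
    def fetch(self, source: Any) -> bytes:
        """Fetch raw PDF bytes from the given source."""


class PDFNotFoundError(LookupError):
    """Hypothetical retrieval-specific error used by this sketch."""


class InMemoryPDFClient(BasePDFClient):
    """Hypothetical client whose backing store is a dict of key -> PDF bytes."""

    def __init__(self, store: dict[str, bytes]) -> None:
        self._store = dict(store)

    def fetch(self, source: str) -> bytes:
        # Accept an identifier suited to the store: here, a dictionary key.
        try:
            # Return the full binary payload, without inspecting or validating it.
            return self._store[source]
        except KeyError:
            # Raise a retrieval-specific error instead of leaking the raw KeyError.
            raise PDFNotFoundError(f"no PDF stored under key {source!r}") from None


client = InMemoryPDFClient({"report.pdf": b"%PDF-1.7\n..."})
payload = client.fetch("report.pdf")
```

A client of this kind can be handy in tests, since code that depends only on `fetch` does not need a real filesystem or network.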
fetch
abstractmethod
fetch(source: Any) -> bytes
Fetch raw PDF bytes from the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `Any` | Identifier of the PDF location, such as a file path, object storage key, or remote reference. | required |
Returns:
| Type | Description |
|---|---|
| `bytes` | Raw PDF bytes. |
Raises:
| Type | Description |
|---|---|
| `Exception` | Retrieval-specific errors defined by the implementation. |
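Because `fetch` is marked as an abstract method on an `ABC` subclass, Python refuses to instantiate any subclass that does not override it. The snippet below demonstrates this with a stand-in `BasePDFClient` that copies the documented signature; `IncompleteClient` is a hypothetical class written only for this example.

```python
from abc import ABC, abstractmethod
from typing import Any


class BasePDFClient(ABC):
    """Stand-in mirroring the documented interface (not the real OmniRead class)."""

    @abstractmethod
    def fetch(self, source: Any) -> bytes:
        """Fetch raw PDF bytes from the given source."""


class IncompleteClient(BasePDFClient):
    """Hypothetical subclass that forgets to implement fetch()."""


try:
    IncompleteClient()
    instantiated = True
    message = ""
except TypeError as exc:
    # The abc machinery rejects the class at instantiation time.
    instantiated = False
    message = str(exc)
```

The error is raised when the object is created, not when `fetch` is first called, so a missing implementation surfaces early.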
FileSystemPDFClient
Bases: BasePDFClient
PDF client that reads from the local filesystem.
This client reads PDF files directly from disk and returns their raw binary contents unchanged.
fetch
fetch(path: Path) -> bytes
Read a PDF file from the local filesystem.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path` | Filesystem path to the PDF file. | required |
Returns:
| Type | Description |
|---|---|
| `bytes` | Raw PDF bytes. |
Raises:
| Type | Description |
|---|---|
| `FileNotFoundError` | If the path does not exist. |
| `ValueError` | If the path exists but is not a file. |
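The documented behavior of `fetch` can be approximated as follows. This is a sketch written from the tables above, not the actual OmniRead source: both classes are local stand-ins so the example runs without the package installed, and the real implementation may differ in error messages or ordering of checks.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any


class BasePDFClient(ABC):
    """Stand-in mirroring the documented interface (not the real OmniRead class)."""

    @abstractmethod
    def fetch(self, source: Any) -> bytes:
        """Fetch raw PDF bytes from the given source."""


class FileSystemPDFClient(BasePDFClient):
    """Sketch that reproduces the documented contract of the filesystem client."""

    def fetch(self, path: Path) -> bytes:
        if not path.exists():
            raise FileNotFoundError(f"PDF not found: {path}")
        if not path.is_file():
            raise ValueError(f"Path is not a file: {path}")
        # No validation or parsing: the bytes are returned exactly as stored.
        return path.read_bytes()


client = FileSystemPDFClient()
outcomes = {}

with tempfile.TemporaryDirectory() as tmp:
    pdf_path = Path(tmp) / "sample.pdf"
    pdf_path.write_bytes(b"%PDF-1.4\n%%EOF\n")

    # Happy path: an existing regular file.
    data = client.fetch(pdf_path)

    # Failure paths: a missing file and a directory.
    candidates = [("missing", Path(tmp) / "missing.pdf"), ("directory", Path(tmp))]
    for label, candidate in candidates:
        try:
            client.fetch(candidate)
        except (FileNotFoundError, ValueError) as exc:
            outcomes[label] = type(exc).__name__
```

Callers that accept paths as strings would need to wrap them in `Path` first, since the documented parameter type is `Path`.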