Azure Captioner
matmmextract.inference.captioner_azure
Generate per-panel sub-captions using Azure-hosted models via the OpenAI-compatible API (Mistral, Llama, GPT, etc.).
Differences from captioner_gemini (Gemini):
- Uses openai.OpenAI with a custom base_url pointing at Azure
- Response format is a JSON schema dict (OpenAI structured outputs)
- Column names differ: image_name and reference instead of downloaded_image_name and reference_sentences
- Retries rows whose already-written JSON file contains an "error" key
Required input CSV columns
image_name, caption, reference
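Before starting a long run, it can help to confirm that the CSV header actually contains these columns. The following is a minimal sketch using only the standard library; the helper name missing_columns and the sample data are hypothetical and not part of matmmextract.

```python
import csv
import io

# Column names taken from the list above.
REQUIRED_COLUMNS = ("image_name", "caption", "reference")


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header.

    Hypothetical helper for illustration; not part of matmmextract.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in required if name not in header]


# In-memory sample whose header lacks the "reference" column.
sample = io.StringIO("image_name,caption\nfig1_a.png,SEM image of the film\n")
print(missing_columns(sample))
```

If the CSV uses other header names, the same check can be run with the names that will be passed through the *_col parameters.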
Output
One JSON file per row in output_dir, named <image_name>.json:
{
  "panels": [
    {
      "panel": "a",
      "visualization_category": "Microscopy",
      "visualization_subtype": "SEM",
      "subcaption": "...",
      "summary": "..."
    },
    ...
  ]
}
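The output files can be inspected with the standard library alone. The sketch below reads one such file, lists its panel labels, and applies a skip-or-retry rule modelled on the behaviour described on this page (re-run when the file is missing, when overwriting is requested, or when the JSON has an "error" key). The function names are hypothetical, and treating an unreadable file as failed is an assumption, not confirmed library behaviour.

```python
import json
import tempfile
from pathlib import Path


def needs_generation(json_path, overwrite=False):
    """Decide whether a figure should be (re)processed.

    Hypothetical helper mirroring the documented rule; not the library's code.
    """
    path = Path(json_path)
    if overwrite or not path.exists():
        return True
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return True  # assumption: an unreadable file counts as a failure
    # Re-run when the payload is not an object or carries an "error" key.
    return not isinstance(data, dict) or "error" in data


def panel_labels(json_path):
    """Return the "panel" label of every entry in the "panels" list."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return [entry["panel"] for entry in data.get("panels", [])]


# Demonstration with two throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    ok_file = Path(tmp) / "fig1.json"
    ok_file.write_text(json.dumps({"panels": [{"panel": "a"}, {"panel": "b"}]}))
    bad_file = Path(tmp) / "fig2.json"
    bad_file.write_text(json.dumps({"error": "request failed"}))

    ok_needs = needs_generation(ok_file)         # complete file: skip
    bad_needs = needs_generation(bad_file)       # has "error": retry
    new_needs = needs_generation(Path(tmp) / "fig3.json")  # missing: generate
    labels = panel_labels(ok_file)

print(ok_needs, bad_needs, new_needs, labels)
```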
- class CaptionResult(n_total: int = 0, n_success: int = 0, n_error: int = 0, n_skipped: int = 0, output_dir: str = '')
- n_error: int = 0
- n_skipped: int = 0
- n_success: int = 0
- n_total: int = 0
- output_dir: str = ''
- captioner(csv_path: str | Path, output_dir: str | Path, api_key: str | None = None, azure_endpoint: str = '', model_name: str = 'Mistral-Large-3', max_tokens: int = 4096, max_retries: int = 4, overwrite: bool = False, image_name_col: str = 'downloaded_image_name', caption_col: str = 'caption', reference_col: str = 'reference_sentences', requests_per_minute: int | None = None, verbose: bool = True) -> CaptionResult
Generate sub-captions for every row in csv_path using an Azure model.
- Parameters:
csv_path – CSV with columns image_name, caption, reference (column names overridable via the *_col parameters).
output_dir – Directory where one JSON file per figure is written.
api_key – Azure API key. Falls back to the AZURE_API_KEY environment variable.
azure_endpoint – Azure OpenAI-compatible endpoint URL.
model_name – Model deployment name (e.g. "Mistral-Large-3", "gpt-4o").
max_tokens – Maximum number of output tokens. Sent as max_completion_tokens for OpenAI models and as max_tokens for Mistral/Llama.
max_retries – Number of retry attempts on an API error or an existing "error" JSON.
overwrite – Re-generate even if the output JSON already exists and has no error.
image_name_col / caption_col / reference_col – Column name overrides for non-standard CSVs.
requests_per_minute – If set, throttle API calls to this rate by sleeping between requests. If None (default), no extra throttling beyond retry back-off.
verbose – Print progress.
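The requests_per_minute behaviour described above (sleeping between requests to hold a target rate) can be illustrated with a small fixed-interval limiter. This is a sketch of the general technique only: the RateLimiter class, its injectable clock and sleep arguments, and the spacing rule of 60 / requests_per_minute seconds are assumptions for illustration, and the library's actual implementation may differ.

```python
import time


class RateLimiter:
    """Space calls at least 60 / requests_per_minute seconds apart.

    Hypothetical illustration; not the class used inside matmmextract.
    """

    def __init__(self, requests_per_minute=None,
                 clock=time.monotonic, sleep=time.sleep):
        # None (or 0) disables throttling entirely.
        self.min_interval = (
            60.0 / requests_per_minute if requests_per_minute else 0.0
        )
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until the minimum interval since the last call has passed."""
        now = self._clock()
        if self._last is not None and self.min_interval > 0:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# With requests_per_minute=None every call returns immediately.
limiter = RateLimiter(requests_per_minute=None)
for _ in range(3):
    limiter.wait()  # an API request would follow each wait()
```

Injecting the clock and sleep functions keeps the limiter testable without real delays; in normal use the defaults (time.monotonic and time.sleep) apply.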
- Return type: