azure-ai-contentunderstanding-py
Cloud, DevOps & Systèmes|
Documentation
Azure AI Content Understanding SDK for Python
Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.
Installation
pip install azure-ai-contentunderstandingEnvironment Variables
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/Authentication
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)Core Workflow
Content Understanding operations are asynchronous long-running operations:
begin_analyze() (returns a poller).result())AnalyzeResult.contentsPrebuilt Analyzers
| Analyzer | Content Type | Purpose |
|----------|--------------|---------|
| prebuilt-documentSearch | Documents | Extract markdown for RAG applications |
| prebuilt-imageSearch | Images | Extract content from images |
| prebuilt-audioSearch | Audio | Transcribe audio with timing |
| prebuilt-videoSearch | Video | Extract frames, transcripts, summaries |
| prebuilt-invoice | Documents | Extract invoice fields |
Analyze Document
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
endpoint=endpoint,
credential=DefaultAzureCredential()
)
# Analyze document from URL
poller = client.begin_analyze(
analyzer_id="prebuilt-documentSearch",
inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)
result = poller.result()
# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)Access Document Content Details
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent
content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
document_content: DocumentContent = content # type: ignore
print(document_content.start_page_number)Analyze Image
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
analyzer_id="prebuilt-imageSearch",
inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)Analyze Video
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
analyzer_id="prebuilt-videoSearch",
inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)
result = poller.result()
# Access video content (AudioVisualContent)
content = result.contents[0]
# Get transcript phrases with timing
for phrase in content.transcript_phrases:
print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")
# Get key frames (for video)
for frame in content.key_frames:
print(f"Frame at {frame.time}: {frame.description}")Analyze Audio
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
analyzer_id="prebuilt-audioSearch",
inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)
result = poller.result()
# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
print(f"[{phrase.start_time}] {phrase.text}")Custom Analyzers
Create custom analyzers with field schemas for specialized extraction:
# Create custom analyzer
analyzer = client.create_analyzer(
analyzer_id="my-invoice-analyzer",
analyzer={
"description": "Custom invoice analyzer",
"base_analyzer_id": "prebuilt-documentSearch",
"field_schema": {
"fields": {
"vendor_name": {"type": "string"},
"invoice_total": {"type": "number"},
"line_items": {
"type": "array",
"items": {
"type": "object",
"properties": {
"description": {"type": "string"},
"amount": {"type": "number"}
}
}
}
}
}
}
)
# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
analyzer_id="my-invoice-analyzer",
inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)
result = poller.result()
# Access extracted fields
print(result.fields["vendor_name"])
print(result.fields["invoice_total"])Analyzer Management
# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
print(f"{analyzer.analyzer_id}: {analyzer.description}")
# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")
# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")Async Client
import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential
async def analyze_document():
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
async with ContentUnderstandingClient(
endpoint=endpoint,
credential=credential
) as client:
poller = await client.begin_analyze(
analyzer_id="prebuilt-documentSearch",
inputs=[AnalyzeInput(url="https://example.com/doc.pdf")]
)
result = await poller.result()
content = result.contents[0]
return content.markdown
asyncio.run(analyze_document())Content Types
| Class | For | Provides |
|-------|-----|----------|
| DocumentContent | PDF, images, Office docs | Pages, tables, figures, paragraphs |
| AudioVisualContent | Audio, video files | Transcript phrases, timing, key frames |
Both derive from MediaContent which provides basic info and markdown representation.
Model Imports
from azure.ai.contentunderstanding.models import (
AnalyzeInput,
AnalyzeResult,
MediaContentKind,
DocumentContent,
AudioVisualContent,
)Client Types
| Client | Purpose |
|--------|---------|
| ContentUnderstandingClient | Sync client for all operations |
| ContentUnderstandingClient (aio) | Async client for all operations |
Best Practices
begin_analyze with AnalyzeInput — this is the correct method signatureresult.contents[0] — results are returned as a listazure.identity.aio credentialsCompétences similaires
Explorez d'autres agents de la catégorie Cloud, DevOps & Systèmes