azure-ai-document-intelligence-dotnet

Cloud, DevOps & Systèmes

|

Documentation

Azure.AI.DocumentIntelligence (.NET)

Extract text, tables, and structured data from documents using prebuilt and custom models.

Installation

dotnet add package Azure.AI.DocumentIntelligence
dotnet add package Azure.Identity

Current Version: v1.0.0 (GA)

Environment Variables

DOCUMENT_INTELLIGENCE_ENDPOINT=https://<resource-name>.cognitiveservices.azure.com/
DOCUMENT_INTELLIGENCE_API_KEY=<your-api-key>
BLOB_CONTAINER_SAS_URL=https://<storage>.blob.core.windows.net/<container>?<sas-token>

Authentication

Microsoft Entra ID (Recommended)

using Azure.Identity;
using Azure.AI.DocumentIntelligence;

string endpoint = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_ENDPOINT");
var credential = new DefaultAzureCredential();
var client = new DocumentIntelligenceClient(new Uri(endpoint), credential);

> Note: Entra ID requires a custom subdomain (e.g., https://.cognitiveservices.azure.com/), not a regional endpoint.

API Key

string endpoint = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_ENDPOINT");
string apiKey = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_API_KEY");
var client = new DocumentIntelligenceClient(new Uri(endpoint), new AzureKeyCredential(apiKey));

Client Types

| Client | Purpose |

|--------|---------|

| DocumentIntelligenceClient | Analyze documents, classify documents |

| DocumentIntelligenceAdministrationClient | Build/manage custom models and classifiers |

Prebuilt Models

| Model ID | Description |

|----------|-------------|

| prebuilt-read | Extract text, languages, handwriting |

| prebuilt-layout | Extract text, tables, selection marks, structure |

| prebuilt-invoice | Extract invoice fields (vendor, items, totals) |

| prebuilt-receipt | Extract receipt fields (merchant, items, total) |

| prebuilt-idDocument | Extract ID document fields (name, DOB, address) |

| prebuilt-businessCard | Extract business card fields |

| prebuilt-tax.us.w2 | Extract W-2 tax form fields |

| prebuilt-healthInsuranceCard.us | Extract health insurance card fields |

Core Workflows

1. Analyze Invoice

using Azure.AI.DocumentIntelligence;

Uri invoiceUri = new Uri("https://example.com/invoice.pdf");

Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
    WaitUntil.Completed, 
    "prebuilt-invoice", 
    invoiceUri);

AnalyzeResult result = operation.Value;

foreach (AnalyzedDocument document in result.Documents)
{
    if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField)
        && vendorNameField.FieldType == DocumentFieldType.String)
    {
        string vendorName = vendorNameField.ValueString;
        Console.WriteLine($"Vendor Name: '{vendorName}', confidence: {vendorNameField.Confidence}");
    }

    if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField)
        && invoiceTotalField.FieldType == DocumentFieldType.Currency)
    {
        CurrencyValue invoiceTotal = invoiceTotalField.ValueCurrency;
        Console.WriteLine($"Invoice Total: '{invoiceTotal.CurrencySymbol}{invoiceTotal.Amount}'");
    }
    
    // Extract line items
    if (document.Fields.TryGetValue("Items", out DocumentField itemsField)
        && itemsField.FieldType == DocumentFieldType.List)
    {
        foreach (DocumentField item in itemsField.ValueList)
        {
            var itemFields = item.ValueDictionary;
            if (itemFields.TryGetValue("Description", out DocumentField descField))
                Console.WriteLine($"  Item: {descField.ValueString}");
        }
    }
}

2. Extract Layout (Text, Tables, Structure)

Uri fileUri = new Uri("https://example.com/document.pdf");

Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
    WaitUntil.Completed, 
    "prebuilt-layout", 
    fileUri);

AnalyzeResult result = operation.Value;

// Extract text by page
foreach (DocumentPage page in result.Pages)
{
    Console.WriteLine($"Page {page.PageNumber}: {page.Lines.Count} lines, {page.Words.Count} words");
    
    foreach (DocumentLine line in page.Lines)
    {
        Console.WriteLine($"  Line: '{line.Content}'");
    }
}

// Extract tables
foreach (DocumentTable table in result.Tables)
{
    Console.WriteLine($"Table: {table.RowCount} rows x {table.ColumnCount} columns");
    foreach (DocumentTableCell cell in table.Cells)
    {
        Console.WriteLine($"  Cell ({cell.RowIndex}, {cell.ColumnIndex}): {cell.Content}");
    }
}

3. Analyze Receipt

Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
    WaitUntil.Completed, 
    "prebuilt-receipt", 
    receiptUri);

AnalyzeResult result = operation.Value;

foreach (AnalyzedDocument document in result.Documents)
{
    if (document.Fields.TryGetValue("MerchantName", out DocumentField merchantField))
        Console.WriteLine($"Merchant: {merchantField.ValueString}");
        
    if (document.Fields.TryGetValue("Total", out DocumentField totalField))
        Console.WriteLine($"Total: {totalField.ValueCurrency.Amount}");
        
    if (document.Fields.TryGetValue("TransactionDate", out DocumentField dateField))
        Console.WriteLine($"Date: {dateField.ValueDate}");
}

4. Build Custom Model

var adminClient = new DocumentIntelligenceAdministrationClient(
    new Uri(endpoint), 
    new AzureKeyCredential(apiKey));

string modelId = "my-custom-model";
Uri blobContainerUri = new Uri("<blob-container-sas-url>");

var blobSource = new BlobContentSource(blobContainerUri);
var options = new BuildDocumentModelOptions(modelId, DocumentBuildMode.Template, blobSource);

Operation<DocumentModelDetails> operation = await adminClient.BuildDocumentModelAsync(
    WaitUntil.Completed, 
    options);

DocumentModelDetails model = operation.Value;

Console.WriteLine($"Model ID: {model.ModelId}");
Console.WriteLine($"Created: {model.CreatedOn}");

foreach (var docType in model.DocumentTypes)
{
    Console.WriteLine($"Document type: {docType.Key}");
    foreach (var field in docType.Value.FieldSchema)
    {
        Console.WriteLine($"  Field: {field.Key}, Confidence: {docType.Value.FieldConfidence[field.Key]}");
    }
}

5. Build Document Classifier

string classifierId = "my-classifier";
Uri blobContainerUri = new Uri("<blob-container-sas-url>");

var sourceA = new BlobContentSource(blobContainerUri) { Prefix = "TypeA/train" };
var sourceB = new BlobContentSource(blobContainerUri) { Prefix = "TypeB/train" };

var docTypes = new Dictionary<string, ClassifierDocumentTypeDetails>()
{
    { "TypeA", new ClassifierDocumentTypeDetails(sourceA) },
    { "TypeB", new ClassifierDocumentTypeDetails(sourceB) }
};

var options = new BuildClassifierOptions(classifierId, docTypes);

Operation<DocumentClassifierDetails> operation = await adminClient.BuildClassifierAsync(
    WaitUntil.Completed, 
    options);

DocumentClassifierDetails classifier = operation.Value;
Console.WriteLine($"Classifier ID: {classifier.ClassifierId}");

6. Classify Document

string classifierId = "my-classifier";
Uri documentUri = new Uri("https://example.com/document.pdf");

var options = new ClassifyDocumentOptions(classifierId, documentUri);

Operation<AnalyzeResult> operation = await client.ClassifyDocumentAsync(
    WaitUntil.Completed, 
    options);

AnalyzeResult result = operation.Value;

foreach (AnalyzedDocument document in result.Documents)
{
    Console.WriteLine($"Document type: {document.DocumentType}, confidence: {document.Confidence}");
}

7. Manage Models

// Get resource details
DocumentIntelligenceResourceDetails resourceDetails = await adminClient.GetResourceDetailsAsync();
Console.WriteLine($"Custom models: {resourceDetails.CustomDocumentModels.Count}/{resourceDetails.CustomDocumentModels.Limit}");

// Get specific model
DocumentModelDetails model = await adminClient.GetModelAsync("my-model-id");
Console.WriteLine($"Model: {model.ModelId}, Created: {model.CreatedOn}");

// List models
await foreach (DocumentModelDetails modelItem in adminClient.GetModelsAsync())
{
    Console.WriteLine($"Model: {modelItem.ModelId}");
}

// Delete model
await adminClient.DeleteModelAsync("my-model-id");

Key Types Reference

| Type | Description |

|------|-------------|

| DocumentIntelligenceClient | Main client for analysis |

| DocumentIntelligenceAdministrationClient | Model management |

| AnalyzeResult | Result of document analysis |

| AnalyzedDocument | Single document within result |

| DocumentField | Extracted field with value and confidence |

| DocumentFieldType | String, Date, Number, Currency, etc. |

| DocumentPage | Page info (lines, words, selection marks) |

| DocumentTable | Extracted table with cells |

| DocumentModelDetails | Custom model metadata |

| BlobContentSource | Training data source |

Build Modes

| Mode | Use Case |

|------|----------|

| DocumentBuildMode.Template | Fixed layout documents (forms) |

| DocumentBuildMode.Neural | Variable layout documents |

Best Practices

1.Use DefaultAzureCredential for production
2.Reuse client instances — clients are thread-safe
3.Handle long-running operations — Use WaitUntil.Completed for simplicity
4.Check field confidence — Always verify Confidence property
5.Use appropriate model — Prebuilt for common docs, custom for specialized
6.Use custom subdomain — Required for Entra ID authentication

Error Handling

using Azure;

try
{
    var operation = await client.AnalyzeDocumentAsync(
        WaitUntil.Completed, 
        "prebuilt-invoice", 
        documentUri);
}
catch (RequestFailedException ex)
{
    Console.WriteLine($"Error: {ex.Status} - {ex.Message}");
}

Related SDKs

| SDK | Purpose | Install |

|-----|---------|---------|

| Azure.AI.DocumentIntelligence | Document analysis (this SDK) | dotnet add package Azure.AI.DocumentIntelligence |

| Azure.AI.FormRecognizer | Legacy SDK (deprecated) | Use DocumentIntelligence instead |

Reference Links

| Resource | URL |

|----------|-----|

| NuGet Package | https://www.nuget.org/packages/Azure.AI.DocumentIntelligence |

| API Reference | https://learn.microsoft.com/dotnet/api/azure.ai.documentintelligence |

| GitHub Samples | https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/documentintelligence/Azure.AI.DocumentIntelligence/samples |

| Document Intelligence Studio | https://documentintelligence.ai.azure.com/ |

| Prebuilt Models | https://aka.ms/azsdk/formrecognizer/models |

Utiliser l'Agent azure-ai-document-intelligence-dotnet - Outil & Compétence IA | Skills Catalogue | Skills Catalogue