Azure Content Understanding C# Sample - Educational project demonstrating content analysis with infrastructure as code deployment

rogersba1/azure-ai-content-understanding-basic

Azure Content Understanding C# Client Sample

A focused .NET 8 sample application demonstrating Azure Content Understanding capabilities, with quick-start guidance and examples. It draws heavily on existing samples such as https://github.com/Azure-Samples/data-extraction-using-azure-content-understanding/ and https://github.com/Azure-Samples/azure-ai-content-understanding-dotnet.

Features

  • Terraform-based IaC to provision all necessary Azure resources
  • Azure Content Understanding API integration with authentication and configurable endpoints
  • Health checks for key Azure resources (Content Understanding, Key Vault)
  • Analyzer and Classifier samples exploring several use cases
  • End-to-end document analysis pipeline
    • Create/upload Analyzers and Classifiers
    • Run analysis of single documents
    • Run classification and analysis on single/multiple documents
  • Simple CLI operations with local export of results

📋 Prerequisites

  • .NET 8 SDK
  • (Optional) Azure CLI for authentication (az login)
  • (Optional) Terraform if you want to deploy the sample infra in iac/

🚀 Quick Start

  1. (Optional) Deploy infrastructure in iac/ using the provided scripts.

  2. Build and run the client

# From the repo root
cd src/ContentUnderstanding.Client

# Build
dotnet build

# Run (interactive by default)
dotnet run

# Preferred: subcommands (System.CommandLine)
dotnet run -- --use-cli health
dotnet run -- --use-cli analyzers
dotnet run -- --use-cli analyze --analyzer receipt --document receipt1.pdf
dotnet run -- --use-cli classifiers
dotnet run -- --use-cli classify --classifier <name> --document <file>
dotnet run -- --use-cli classify-dir --classifier <name> --directory <subfolder>

Classify a whole directory

Classify all supported files in a subfolder under Data/SampleDocuments using a classifier:

dotnet run --project .\src\ContentUnderstanding.Client -- --use-cli classify-dir --classifier <name> --directory <subfolder>
  • Non-recursive: only files directly in <subfolder> are processed.
  • Supported types: .pdf, .png, .jpg, .jpeg, .tif, .tiff, .bmp.
  • Sequential processing with per-file error logging; the run continues on errors.
  • Outputs: per-file JSON and formatted text results in Output/ plus a mandatory batch summary:
    • batch_<directory>_<classifier>_<timestamp>_summary.json
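
The behaviour listed above (non-recursive scan, extension filter, sequential processing that continues on errors, and a batch summary file) can be sketched roughly as follows. This is an illustration only, not the sample's actual implementation: the BatchSketch class, its method names, and the timestamp format are assumptions, and the classifyOne callback stands in for the real classification call.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the sample's code base.
public static class BatchSketch
{
    // Extensions this README lists as supported for classify-dir.
    private static readonly HashSet<string> Supported = new(StringComparer.OrdinalIgnoreCase)
    {
        ".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"
    };

    // Non-recursive: TopDirectoryOnly skips files in nested folders.
    public static IReadOnlyList<string> FindSupportedFiles(string directory) =>
        Directory.EnumerateFiles(directory, "*", SearchOption.TopDirectoryOnly)
            .Where(f => Supported.Contains(Path.GetExtension(f)))
            .OrderBy(f => f, StringComparer.OrdinalIgnoreCase)
            .ToList();

    // Sequential processing: a failing file is logged and recorded, then the loop continues.
    public static (int Succeeded, List<string> Failed) ProcessAll(
        IEnumerable<string> files, Action<string> classifyOne)
    {
        var succeeded = 0;
        var failed = new List<string>();
        foreach (var file in files)
        {
            try
            {
                classifyOne(file);
                succeeded++;
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine($"Failed: {Path.GetFileName(file)}: {ex.Message}");
                failed.Add(file);
            }
        }
        return (succeeded, failed);
    }

    // Follows the batch_<directory>_<classifier>_<timestamp>_summary.json pattern.
    // The timestamp format used here is an assumption.
    public static string SummaryFileName(string directory, string classifier, DateTime utcNow) =>
        $"batch_{directory}_{classifier}_{utcNow:yyyyMMdd_HHmmss}_summary.json";
}
```

A caller would pass the result of FindSupportedFiles to ProcessAll, then write the counts and the list of failed files to the name returned by SummaryFileName.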

📖 Usage Guide

Command-Line Interface

Preferred: subcommands via System.CommandLine

# Show comprehensive help
dotnet run -- --use-cli

# Health check core Azure resources (Content Understanding, Key Vault, Managed Identity)
dotnet run -- --use-cli health

# List all available analyzers
dotnet run -- --use-cli analyzers

Creating Analyzers

# Create default analyzer (receipt)
dotnet run -- --use-cli create-analyzer

# Create specific analyzer by file name
dotnet run -- --use-cli create-analyzer --analyzer-file receipt.json

Document Analysis

# Use all defaults (receipt1.pdf + receipt analyzer)
dotnet run -- --use-cli analyze

# Document-specific analysis
dotnet run -- --use-cli analyze --document receipt.png

# Analyzer-specific analysis
dotnet run -- --use-cli analyze --analyzer enginemanual

# Full control
dotnet run -- --use-cli analyze --analyzer receipt --document receipt.png

# Use absolute paths for documents outside the project
dotnet run -- --use-cli analyze --document "C:\\path\\to\\my\\document.pdf"

Supported File Formats

The application automatically detects content types for:

  • PDF: .pdf (application/pdf)
  • Images: .png, .jpg, .jpeg, .tiff, .bmp
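
Content-type detection of this kind usually comes down to a lookup on the file extension. A minimal sketch follows, assuming a hypothetical ContentTypeSketch helper. Only application/pdf is stated in this README; the image types are the standard registered MIME types, and the application/octet-stream fallback is an assumption about how unknown extensions might be handled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the sample's code base.
public static class ContentTypeSketch
{
    // Case-insensitive map from file extension to MIME type.
    private static readonly Dictionary<string, string> Map = new(StringComparer.OrdinalIgnoreCase)
    {
        [".pdf"] = "application/pdf",
        [".png"] = "image/png",
        [".jpg"] = "image/jpeg",
        [".jpeg"] = "image/jpeg",
        [".tif"] = "image/tiff",   // listed for classify-dir; included here for completeness
        [".tiff"] = "image/tiff",
        [".bmp"] = "image/bmp",
    };

    // Returns the mapped MIME type, or a generic binary type for unknown extensions (assumed fallback).
    public static string Detect(string path) =>
        Map.TryGetValue(Path.GetExtension(path), out var type) ? type : "application/octet-stream";
}
```

For example, ContentTypeSketch.Detect("receipt1.pdf") returns application/pdf, and the lookup ignores case, so "SCAN.JPG" maps to image/jpeg.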

📋 CLI Reference

Preferred subcommands:

dotnet run -- --use-cli <command> [options]

Common commands:

  • health
  • analyzers
  • analyze --analyzer --document
  • check-operation --operation-id
  • classifiers
  • create-classifier --classifier --classifier-file
  • create-analyzer --analyzer --analyzer-file
  • classify --classifier --document
  • classify-dir --classifier --directory

Project layout

azure-ai-content-understanding-basic/
├── src/
│   └── ContentUnderstanding.Client/       # Main console application (namespace ContentUnderstanding.Client)
│       ├── Program.cs                     # Main entry point with parameterized CLI
│       ├── Services/                      # HTTP service layer (ContentUnderstanding.Client.Services)
│       ├── Data/                          # Analyzer schemas and sample documents
│       ├── Models/                        # DTOs (ContentUnderstanding.Models namespace)
│       └── Output/                        # Analysis results export (git-ignored)
├── iac/                                   # Infrastructure as Code (Terraform)
├── docs/                                  # Documentation
└── README.md

Configuration

The application uses appsettings.json for non-sensitive configuration and Azure Key Vault or environment variables for secrets. See docs/CONFIGURATION.md for details.

Authentication

The application uses DefaultAzureCredential for authentication, supporting:

  • Azure CLI (az login)
  • Visual Studio credentials
  • Environment variables
  • Managed Identity (in Azure)
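
To check locally that one of these credential sources resolves, a short probe like the one below can help. It is a sketch that assumes the Azure.Identity NuGet package is referenced. The scope shown is the general Azure AI services scope and is an assumption about what this sample requests.

```csharp
using System;
using Azure.Core;
using Azure.Identity;

// DefaultAzureCredential tries its credential sources in order
// (environment variables, managed identity, developer tools such as Azure CLI, ...).
var credential = new DefaultAzureCredential();

// Assumed scope: the general Azure AI services (Cognitive Services) scope.
var context = new TokenRequestContext(new[] { "https://cognitiveservices.azure.com/.default" });

// Throws AuthenticationFailedException or CredentialUnavailableException if no source works.
AccessToken token = await credential.GetTokenAsync(context);
Console.WriteLine($"Token acquired, expires {token.ExpiresOn:u}");
```

If the call fails on a development machine, running az login first is usually the quickest fix.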

Secret Management

  • API keys are typically stored in Azure Key Vault
  • The application looks for the ai-services-key secret by default
  • No hardcoded credentials in source code

Note: Key Vault access may be restricted by network rules. For local development you can set the API key via environment variables or appsettings.Development.json if preferred.
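
The lookup order described in the note can be sketched as below: use a locally supplied key if there is one, otherwise read the ai-services-key secret from Key Vault. This assumes the Azure.Identity and Azure.Security.KeyVault.Secrets packages. The two environment variable names are hypothetical placeholders, not the sample's actual settings; see docs/CONFIGURATION.md for those.

```csharp
using System;
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

// Hypothetical variable names, used for illustration only.
const string ApiKeyVariable = "CONTENT_UNDERSTANDING_API_KEY";
const string VaultUriVariable = "KEY_VAULT_URI";

// 1. Prefer a key supplied for local development.
var apiKey = Environment.GetEnvironmentVariable(ApiKeyVariable);

// 2. Otherwise read the default secret name from Key Vault.
if (string.IsNullOrEmpty(apiKey))
{
    var vaultUri = Environment.GetEnvironmentVariable(VaultUriVariable)
        ?? throw new InvalidOperationException($"{VaultUriVariable} is not set.");

    var client = new SecretClient(new Uri(vaultUri), new DefaultAzureCredential());
    KeyVaultSecret secret = await client.GetSecretAsync("ai-services-key");
    apiKey = secret.Value;
}

// Never log the key itself; report only that one was found.
Console.WriteLine($"API key resolved ({apiKey.Length} characters).");
```

If network rules block Key Vault, the second step fails with a RequestFailedException, and in that case the local environment variable path is the one to use.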

Documentation

License

This project is licensed under the MIT License.
