A focused .NET 8 sample application demonstrating Azure Content Understanding capabilities with quick-start guidance and examples. It draws heavily on other available samples, such as https://github.com/Azure-Samples/data-extraction-using-azure-content-understanding/ and https://github.com/Azure-Samples/azure-ai-content-understanding-dotnet.
- Terraform-based IaC to deploy and provision all necessary Azure resources
- Azure Content Understanding API integration with authentication and configurable endpoints
- Health checks for key Azure resources (Content Understanding, Key Vault)
- Analyzer and Classifier samples exploring several use cases
- End-to-end document analysis pipeline
- Create/Upload Analyzers and Classifiers
- Run analysis of single documents
- Run classification and analysis on single/multiple documents
- Simple CLI operations with local export of results
- .NET 8 SDK
- (Optional) Azure CLI for authentication (`az login`)
- (Optional) Terraform if you want to deploy the sample infra in `iac/`

- (Optional) Deploy infrastructure in `iac/` using the provided scripts.
- Build and run the client
# From the repo root
cd src/ContentUnderstanding.Client
# Build
dotnet build
# Run (interactive by default)
dotnet run
# Preferred: subcommands (System.CommandLine)
dotnet run -- --use-cli health
dotnet run -- --use-cli analyzers
dotnet run -- --use-cli analyze --analyzer receipt --document receipt1.pdf
dotnet run -- --use-cli classifiers
dotnet run -- --use-cli classify --classifier <name> --document <file>
dotnet run -- --use-cli classify-dir --classifier <name> --directory <subfolder>
### Classify a whole directory

Classify all supported files in a subfolder under Data/SampleDocuments using a classifier:
dotnet run --project .\src\ContentUnderstanding.Client -- --use-cli classify-dir --classifier <name> --directory <subfolder>

- Non-recursive: only files directly in `<subfolder>` are processed.
- Supported types: .pdf, .png, .jpg, .jpeg, .tif, .tiff, .bmp.
- Sequential processing with per-file error logging; the run continues on errors.
- Outputs: per-file JSON and formatted text results in `Output/`, plus a mandatory batch summary: `batch_<directory>_<classifier>_<timestamp>_summary.json`
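
The directory behavior described above can be illustrated with a minimal C# sketch. It is not the sample's actual implementation: the class name, the `classifyFile` callback (a stand-in for the real service call), and the timestamp format in the summary file name are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class ClassifyDirSketch
{
    // Extensions listed as supported by classify-dir.
    private static readonly HashSet<string> SupportedExtensions = new(StringComparer.OrdinalIgnoreCase)
    {
        ".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"
    };

    // Non-recursive: TopDirectoryOnly ignores files in nested folders.
    public static List<string> FindSupportedFiles(string directory) =>
        Directory.EnumerateFiles(directory, "*", SearchOption.TopDirectoryOnly)
            .Where(path => SupportedExtensions.Contains(Path.GetExtension(path)))
            .OrderBy(path => path, StringComparer.OrdinalIgnoreCase)
            .ToList();

    // Sequential processing: a failure is logged and recorded, then the loop continues.
    // classifyFile is a hypothetical callback standing in for the real classification call.
    public static (int Succeeded, List<string> Failed) ProcessAll(
        IEnumerable<string> files, Action<string> classifyFile)
    {
        var succeeded = 0;
        var failed = new List<string>();
        foreach (var file in files)
        {
            try
            {
                classifyFile(file);
                succeeded++;
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine($"Failed: {file}: {ex.Message}");
                failed.Add(file);
            }
        }
        return (succeeded, failed);
    }

    // Builds the batch summary file name. The timestamp format is an assumption.
    public static string SummaryFileName(string directory, string classifier, DateTime timestamp)
    {
        var stamp = timestamp.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture);
        return $"batch_{directory}_{classifier}_{stamp}_summary.json";
    }
}
```

Collecting failures instead of rethrowing is what lets a single bad file leave the rest of the batch unaffected; the failed list is the kind of information a batch summary would typically carry.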
Preferred: subcommands via System.CommandLine
# Show comprehensive help
dotnet run -- --use-cli
# Health check core Azure resources (Content Understanding, Key Vault, Managed Identity)
dotnet run -- --use-cli health
# List all available analyzers
dotnet run -- --use-cli analyzers
# Create default analyzer (receipt)
dotnet run -- --use-cli create-analyzer
# Create specific analyzer by file name
dotnet run -- --use-cli create-analyzer --analyzer-file receipt.json
# Use all defaults (receipt1.pdf + receipt analyzer)
dotnet run -- --use-cli analyze
# Document-specific analysis
dotnet run -- --use-cli analyze --document receipt.png
# Analyzer-specific analysis
dotnet run -- --use-cli analyze --analyzer enginemanual
# Full control
dotnet run -- --use-cli analyze --analyzer receipt --document receipt.png
# Use absolute paths for documents outside the project
dotnet run -- --use-cli analyze --document "C:\\path\\to\\my\\document.pdf"

The application automatically detects content types for:
- PDF: `.pdf` (application/pdf)
- Images: `.png`, `.jpg`, `.jpeg`, `.tiff`, `.bmp`
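
As a rough illustration of extension-based detection, the following C# sketch maps the extensions listed above to content types. Only `application/pdf` is stated in this README; the image types are the standard registered MIME types, and the class name and the `application/octet-stream` fallback are assumptions rather than the sample's confirmed behavior.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ContentTypeSketch
{
    // Case-insensitive lookup so "scan.PNG" and "scan.png" behave the same.
    private static readonly Dictionary<string, string> ContentTypes = new(StringComparer.OrdinalIgnoreCase)
    {
        [".pdf"] = "application/pdf",
        [".png"] = "image/png",
        [".jpg"] = "image/jpeg",
        [".jpeg"] = "image/jpeg",
        [".tiff"] = "image/tiff",
        [".bmp"] = "image/bmp",
    };

    // Returns the content type for a path based on its extension.
    // The fallback for unknown extensions is an assumption.
    public static string Detect(string path) =>
        ContentTypes.TryGetValue(Path.GetExtension(path), out var contentType)
            ? contentType
            : "application/octet-stream";
}
```

For example, `ContentTypeSketch.Detect("receipt1.pdf")` returns `application/pdf`.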
Preferred subcommands:
dotnet run -- --use-cli <command> [options]

Common commands:
- health
- analyzers
- analyze --analyzer --document
- check-operation --operation-id
- classifiers
- create-classifier --classifier --classifier-file
- create-analyzer --analyzer --analyzer-file
- classify --classifier --document
- classify-dir --classifier --directory
azure-ai-content-understanding-basic/
├── src/
│ └── ContentUnderstanding.Client/ # Main console application (namespace ContentUnderstanding.Client)
│ ├── Program.cs # Main entry point with parameterized CLI
│ ├── Services/ # HTTP service layer (ContentUnderstanding.Client.Services)
│ ├── Data/ # Analyzer schemas and sample documents
│ ├── Models/ # DTOs (ContentUnderstanding.Models namespace)
│ └── Output/ # Analysis results export (git-ignored)
├── iac/ # Infrastructure as Code (Terraform)
├── docs/ # Documentation
└── README.md
The application uses appsettings.json for non-sensitive configuration and Azure Key Vault or environment variables for secrets. See docs/CONFIGURATION.md for details.
The application uses DefaultAzureCredential for authentication, supporting:
- Azure CLI (`az login`)
- Visual Studio credentials
- Environment variables
- Managed Identity (in Azure)
- API keys are typically stored in Azure Key Vault
- The application looks for the `ai-services-key` secret by default
- No hardcoded credentials in source code
Note: Key Vault access may be restricted by network rules. For local development you can set the API key via environment variables or appsettings.Development.json if preferred.
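
One way to combine the local override with the Key Vault default is sketched below in C#. It is an illustration only: the environment variable name `CONTENT_UNDERSTANDING_API_KEY`, the class name, and the "environment first, Key Vault second" order are assumptions, not documented behavior of this sample. The Key Vault lookup is passed in as a callback so the sketch needs no Azure packages.

```csharp
using System;
using System.Threading.Tasks;

public static class ApiKeySketch
{
    // Secret name the README says the application looks for by default.
    public const string DefaultSecretName = "ai-services-key";

    // Hypothetical environment variable name, chosen for this sketch only.
    public const string ApiKeyEnvVar = "CONTENT_UNDERSTANDING_API_KEY";

    // Returns the local override if set; otherwise asks the callback for the secret.
    // With the Azure.Identity and Azure.Security.KeyVault.Secrets packages installed,
    // the callback could wrap SecretClient, for example:
    //   async name => (await new SecretClient(vaultUri, new DefaultAzureCredential())
    //                        .GetSecretAsync(name)).Value.Value
    public static async Task<string> ResolveAsync(Func<string, Task<string>> fetchSecretAsync)
    {
        var fromEnvironment = Environment.GetEnvironmentVariable(ApiKeyEnvVar);
        if (!string.IsNullOrWhiteSpace(fromEnvironment))
        {
            return fromEnvironment;
        }
        return await fetchSecretAsync(DefaultSecretName);
    }
}
```

Keeping the Key Vault call behind a callback also makes the fallback easy to exercise locally when network rules block vault access.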
- `docs/CONFIGURATION.md` - Configuration guide
- `docs/initial_plan.md` - Project plan and status
- Azure Content Understanding docs: https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/
This project is licensed under the MIT License.