Tagging

Garot Conklin edited this page Apr 29, 2025 · 1 revision

Tagging Strategy for CloudOps AI

This document outlines the tagging strategy for CloudOps AI to ensure effective resource management and automation.

1. Operational Tags

| Tag Key | Example Value | Purpose |
| --- | --- | --- |
| Environment | prod, staging, dev | Isolate actions by environment (e.g., block remediations in prod) |
| Owner | team-noc@company.com | Identify the team responsible for escalations |
| IncidentSeverity | P1, P2, P3 | Prioritize actions based on criticality |
| AutoRemediate | true, false | Flag resources eligible for auto-fixes |
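
As an illustration of how the Environment and AutoRemediate tags might gate an action, here is a minimal sketch. The function name `may_auto_remediate` and the `blocked_environments` parameter are hypothetical, and blocking prod by default simply mirrors the example in the table; which environments are blocked is a policy choice.

```python
from typing import Collection, Mapping


def may_auto_remediate(
    tags: Mapping[str, str],
    blocked_environments: Collection[str] = frozenset({"prod"}),  # assumed policy
) -> bool:
    """Return True only if the resource opted in and its environment is not blocked."""
    opted_in = tags.get("AutoRemediate", "false").strip().lower() == "true"
    environment = tags.get("Environment", "").strip().lower()
    # Fail closed: a missing Environment tag is treated as blocked.
    return opted_in and environment != "" and environment not in blocked_environments
```

A caller that deliberately allows remediation in prod could pass `blocked_environments=frozenset()`.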

2. Cost & Billing Tags

| Tag Key | Example Value | Purpose |
| --- | --- | --- |
| CostCenter | it-ops-123 | Track spend per team |
| Project | ai-noc-agent | Allocate costs to projects |
| BudgetAlertThreshold | 90 (percentage) | Trigger cost alerts |
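
One way the BudgetAlertThreshold tag could be evaluated is sketched below. The function `budget_alert_due` and its `spend` and `budget` inputs are assumptions for illustration; where those figures come from (for example, a billing export) is outside the scope of this page.

```python
from typing import Mapping


def budget_alert_due(tags: Mapping[str, str], spend: float, budget: float) -> bool:
    """Return True when spend has reached the tagged percentage of the budget."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    # Assumption: a resource without the tag alerts only at 100% of budget.
    threshold_pct = float(tags.get("BudgetAlertThreshold", "100"))
    # Compare cross-multiplied values to avoid a division rounding error.
    return spend * 100 >= threshold_pct * budget
```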

3. Security & Compliance Tags

| Tag Key | Example Value | Purpose |
| --- | --- | --- |
| DataClassification | confidential, public | Control remediation scope (e.g., skip sensitive DBs) |
| Compliance | hipaa, pci | Enforce compliance-specific rules |
| RetentionPolicy | 30d, 1y | Guide log/backup retention |
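
RetentionPolicy values such as 30d and 1y need to be converted to a duration before they can drive a retention job. The sketch below handles only the two units shown in the table and assumes a year means 365 days; `parse_retention` is a hypothetical helper name.

```python
import re
from datetime import timedelta

# Only the units that appear in the table; "y" as 365 days is an assumption.
_DAYS_PER_UNIT = {"d": 1, "y": 365}


def parse_retention(value: str) -> timedelta:
    """Convert a RetentionPolicy tag value like '30d' or '1y' to a timedelta."""
    match = re.fullmatch(r"(\d+)([dy])", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported RetentionPolicy value: {value!r}")
    count, unit = match.groups()
    return timedelta(days=int(count) * _DAYS_PER_UNIT[unit])
```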

4. AI/Agent-Specific Tags

| Tag Key | Example Value | Purpose |
| --- | --- | --- |
| AIActionProfile | conservative, aggressive | Tune AI decision thresholds |
| LastRemediation | 2024-05-20T14:30Z | Record when the last action was taken (UTC) |
| AllowedActions | stop,restart,notify | Allowlist of permitted remediations |
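
The AllowedActions and LastRemediation tags can be combined into a single pre-flight check. The following is a sketch under stated assumptions: the one-hour cooldown, the helper names, and the rule that a missing AllowedActions tag permits nothing are all illustrative, not documented CloudOps AI behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping, Set


def parse_allowed_actions(tags: Mapping[str, str]) -> Set[str]:
    """Split the comma-separated AllowedActions tag into a set of action names."""
    raw = tags.get("AllowedActions", "")
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def in_cooldown(tags: Mapping[str, str], now: datetime,
                cooldown: timedelta = timedelta(hours=1)) -> bool:
    """Return True if LastRemediation is more recent than the assumed cooldown."""
    raw = tags.get("LastRemediation")
    if not raw:
        return False
    # Matches the timestamp format from the table, e.g. 2024-05-20T14:30Z (UTC).
    last = datetime.strptime(raw, "%Y-%m-%dT%H:%MZ").replace(tzinfo=timezone.utc)
    return now - last < cooldown


def action_permitted(tags: Mapping[str, str], action: str, now: datetime) -> bool:
    """Allow an action only if it is listed and the resource is not cooling down."""
    return action.lower() in parse_allowed_actions(tags) and not in_cooldown(tags, now)
```

Here `now` must be a timezone-aware datetime, for example `datetime.now(timezone.utc)`.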

5. Resource Context Tags

| Tag Key | Example Value | Purpose |
| --- | --- | --- |
| WorkloadType | batch, real-time | Customize monitoring (e.g., batch jobs tolerate longer downtimes) |
| DRPriority | tier1, tier3 | Influence recovery order |
| Dependencies | app:checkout-service | Map resource relationships |
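
To show how DRPriority and Dependencies might be consumed, the sketch below sorts resources by tier and splits a dependency value into its parts. It assumes that lower tier numbers recover first, that untagged resources recover last, and that each resource is a dict with hypothetical `id` and `tags` keys.

```python
import re
from typing import Iterable, List, Mapping, Tuple


def dr_rank(tags: Mapping[str, str]) -> float:
    """Map 'tier1' to 1, 'tier3' to 3; missing or malformed values sort last."""
    match = re.fullmatch(r"tier(\d+)", tags.get("DRPriority", "").strip().lower())
    return int(match.group(1)) if match else float("inf")


def recovery_order(resources: Iterable[Mapping]) -> List[str]:
    """Return resource ids ordered by DR tier (stable for equal tiers)."""
    return [r["id"] for r in sorted(resources, key=lambda r: dr_rank(r["tags"]))]


def parse_dependency(value: str) -> Tuple[str, str]:
    """Split a Dependencies value such as 'app:checkout-service' into (kind, name)."""
    kind, sep, name = value.partition(":")
    if not sep or not kind or not name:
        raise ValueError(f"expected '<kind>:<name>', got {value!r}")
    return kind, name
```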

Example YAML for Tag-Based Rules

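rules:
  - name: "TerminateHungInstances"
    # Conditions are assumed to be ANDed: all must hold for the rule to fire.
    condition:
      - "tag:Environment == 'prod'"
      - "tag:AutoRemediate == 'true'"     # resource has opted in to auto-fixes
      - "tag:WorkloadType != 'stateful'"  # never terminate stateful workloads
    actions:
      - type: "remediate"
        # The AWS CLI also requires --instance-ids; the agent is assumed to supply it.
        command: "aws ec2 terminate-instances"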
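
The condition strings in the rule above could be evaluated against a resource's tags along these lines. This is a minimal sketch, not the CloudOps AI rule engine: it assumes all conditions are ANDed, supports only the `==` and `!=` operators shown, and the names `condition_holds` and `rule_matches` are hypothetical.

```python
import re
from typing import Iterable, Mapping

# Matches strings of the form: tag:<Key> == '<value>'  or  tag:<Key> != '<value>'
_CONDITION = re.compile(r"^tag:(?P<key>[\w.-]+)\s*(?P<op>==|!=)\s*'(?P<value>[^']*)'$")


def condition_holds(condition: str, tags: Mapping[str, str]) -> bool:
    """Evaluate one condition string against a resource's tags."""
    match = _CONDITION.match(condition.strip())
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    # A missing tag never equals the value, so '!=' passes for untagged resources.
    equal = tags.get(match["key"]) == match["value"]
    return equal if match["op"] == "==" else not equal


def rule_matches(conditions: Iterable[str], tags: Mapping[str, str]) -> bool:
    """Return True only if every condition holds (assumed AND semantics)."""
    return all(condition_holds(c, tags) for c in conditions)
```

Note that under this sketch a resource with no WorkloadType tag passes the `!= 'stateful'` check; a stricter engine might require the tag to be present.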

Best Practices

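  1. Consistency: Detect missing tags with AWS Config rules (for example, the required-tags managed rule) and block untagged resource creation with IAM policy conditions such as aws:RequestTag.
  2. Automation: Use AWS Lambda functions triggered by EventBridge to auto-tag resources (e.g., propagate CostCenter from a VPC to its child resources).
  3. Governance: Monitor untagged resources with AWS Resource Groups (for example, through Tag Editor).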
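
The consistency and automation practices above reduce to two small checks that an auto-tagging function might run. The sketch below is pure Python with no AWS calls; the `REQUIRED_TAGS` baseline and both helper names are assumptions, and a child's existing tag value is deliberately never overwritten.

```python
from typing import Dict, List, Mapping, Sequence

# Hypothetical baseline; choose the keys your organization actually mandates.
REQUIRED_TAGS = ("Environment", "Owner", "CostCenter")


def missing_tags(tags: Mapping[str, str],
                 required: Sequence[str] = REQUIRED_TAGS) -> List[str]:
    """List required keys that are absent or blank on a resource."""
    return [key for key in required if not tags.get(key, "").strip()]


def inherit_tags(parent_tags: Mapping[str, str], child_tags: Mapping[str, str],
                 keys: Sequence[str] = ("CostCenter",)) -> Dict[str, str]:
    """Copy selected tags from a parent (e.g., a VPC) to a child that lacks them."""
    merged = dict(child_tags)
    for key in keys:
        if key in parent_tags and key not in merged:
            merged[key] = parent_tags[key]
    return merged
```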