xarf/xarf-spec

XARF v4 Specification

The eXtended Abuse Reporting Format (XARF) is a standard for reporting abuse incidents in a structured, machine-readable format. XARF v4 introduces a category-based architecture with seven main abuse categories and enhanced evidence handling.

📚 Documentation

🗂️ Seven Abuse Categories

XARF v4 organizes all abuse reports into seven main categories:

  1. messaging - Communication abuse (email spam, SMS, chat)
  2. connection - Network attacks (DDoS, port scans, login attacks)
  3. content - Malicious web content (phishing, malware sites, defacement)
  4. infrastructure - Compromised systems (botnets, C2, compromised servers)
  5. copyright - IP infringement (DMCA, trademark violations)
  6. vulnerability - Security vulnerabilities (CVE reports, misconfigurations)
  7. reputation - Threat intelligence (blocklist entries, IOC data)

Event Types by Category

Each category contains multiple specific event types with dedicated schemas:

| Category | Event Types | Schema Location |
| --- | --- | --- |
| messaging | spam, bulk_messaging | schemas/v4/types/messaging-*.json |
| connection | login_attack, port_scan, ddos, infected_host, sql_injection, vuln_scanning, reconnaissance, scraping | schemas/v4/types/connection-*.json |
| vulnerability | cve, open, misconfiguration | schemas/v4/types/vulnerability-*.json |
| content | phishing, malware, fraud, csam, csem, exposed_data, brand_infringement, remote_compromise, suspicious_registration | schemas/v4/types/content-*.json |
| infrastructure | botnet, compromised_server | schemas/v4/types/infrastructure-*.json |
| reputation | blocklist, threat_intelligence | schemas/v4/types/reputation-*.json |
| copyright | copyright, p2p, cyberlocker, ugc_platform, link_site, usenet | schemas/v4/types/copyright-*.json |
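When a consumer needs to check that a category/type pair is one the specification lists, the table above can be mirrored as a lookup structure. The following is a minimal sketch and not part of the repository: the names `EVENT_TYPES`, `is_known_type`, and `schema_glob` are hypothetical, and only the category names, event type names, and glob pattern are taken from the table.

```python
# Event types per category, copied from the table above.
# EVENT_TYPES, is_known_type and schema_glob are illustrative names only.
EVENT_TYPES = {
    "messaging": ["spam", "bulk_messaging"],
    "connection": [
        "login_attack", "port_scan", "ddos", "infected_host",
        "sql_injection", "vuln_scanning", "reconnaissance", "scraping",
    ],
    "vulnerability": ["cve", "open", "misconfiguration"],
    "content": [
        "phishing", "malware", "fraud", "csam", "csem", "exposed_data",
        "brand_infringement", "remote_compromise", "suspicious_registration",
    ],
    "infrastructure": ["botnet", "compromised_server"],
    "reputation": ["blocklist", "threat_intelligence"],
    "copyright": [
        "copyright", "p2p", "cyberlocker", "ugc_platform", "link_site", "usenet",
    ],
}


def is_known_type(category: str, event_type: str) -> bool:
    """Return True if the table lists event_type under category."""
    return event_type in EVENT_TYPES.get(category, [])


def schema_glob(category: str) -> str:
    """Return the schema location pattern the table gives for a category."""
    if category not in EVENT_TYPES:
        raise ValueError(f"unknown category: {category}")
    return f"schemas/v4/types/{category}-*.json"
```

For example, `is_known_type("content", "phishing")` is true, while `is_known_type("messaging", "phishing")` is false because phishing is listed under content.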

📄 Sample Reports

Sample reports are organized by version for reference and migration purposes:

samples/
├── v4/               # XARF v4 samples - one per schema type (32 total)
│   ├── messaging-spam.json
│   ├── messaging-bulk-messaging.json
│   ├── connection-login-attack.json
│   ├── connection-port-scan.json
│   ├── connection-ddos.json
│   ├── connection-infected-host.json
│   ├── connection-sql-injection.json
│   ├── connection-vuln-scanning.json
│   ├── connection-reconnaissance.json
│   ├── connection-scraping.json
│   ├── content-brand-infringement.json
│   ├── content-fraud.json
│   ├── content-remote-compromise.json
│   ├── content-suspicious-registration.json
│   ├── vulnerability-cve.json
│   ├── vulnerability-open.json
│   ├── vulnerability-misconfiguration.json
│   ├── content-phishing.json
│   ├── content-malware.json
│   ├── content-csam.json
│   ├── content-csem.json
│   ├── content-exposed-data.json
│   ├── infrastructure-botnet.json
│   ├── infrastructure-compromised-server.json
│   ├── reputation-blocklist.json
│   ├── reputation-threat-intelligence.json
│   ├── copyright-copyright.json
│   ├── copyright-p2p.json
│   ├── copyright-cyberlocker.json
│   ├── copyright-ugc-platform.json
│   ├── copyright-link-site.json
│   └── copyright-usenet.json
└── v3/               # XARF v3 samples (legacy format, migration reference)
    ├── spam_v3_sample.json
    ├── ddos_v3_sample.json
    ├── phishing_v3_sample.json
    └── botnet_v3_sample.json
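The v4 file names in the listing above appear to follow the pattern `<category>-<type>.json`, with underscores in the event type written as hyphens (for example `login_attack` becomes `connection-login-attack.json`). The sketch below encodes that observed pattern. The helper names `sample_path` and `split_sample_name` are hypothetical, and the naming rule is inferred from the listing rather than stated by the specification.

```python
from pathlib import PurePosixPath


def sample_path(category: str, event_type: str) -> PurePosixPath:
    """Build the v4 sample path for a category/type pair.

    Assumes the naming pattern seen in the listing: underscores in the
    event type become hyphens in the file name.
    """
    name = f"{category}-{event_type.replace('_', '-')}.json"
    return PurePosixPath("samples/v4") / name


def split_sample_name(filename: str) -> tuple[str, str]:
    """Recover (category, event_type) from a v4 sample file name.

    Relies on category names containing no hyphen, which holds for the
    seven categories listed in this document.
    """
    stem = PurePosixPath(filename).stem
    category, _, hyphenated_type = stem.partition("-")
    return category, hyphenated_type.replace("-", "_")
```

The two helpers are inverses for every file in the listing, so `split_sample_name("copyright-ugc-platform.json")` gives `("copyright", "ugc_platform")`.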

🚀 Quick Start

# Install dependencies (jq, python3, jsonschema)
./scripts/setup.sh

# View a sample report
cat samples/v4/messaging-spam.json

# Check JSON formatting
./scripts/format-json.sh check

# Format all JSON files  
./scripts/format-json.sh format

# Validate all samples against schemas
python3 scripts/validate-schemas.py

# Or using nix-shell (NixOS users)
nix-shell -p python3 python3Packages.jsonschema --run "python3 scripts/validate-schemas.py"

# Validate specific sample against its schema
python3 -c "
import json, jsonschema
with open('samples/v4/messaging-spam.json') as f: data = json.load(f)
with open('schemas/v4/types/messaging-spam.json') as f: schema = json.load(f)
jsonschema.validate(data, schema)
print('✅ Valid!')
"
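The single-sample check above can be extended to every v4 sample. The following is a rough sketch only and does not reproduce `scripts/validate-schemas.py`. It assumes each sample has a schema with the same file name under `schemas/v4/types/`, which this document confirms only for `messaging-spam.json`, and it assumes any `$ref` inside a schema resolves without extra configuration. `schema_for` and `validate_all` are hypothetical names.

```python
import json
from pathlib import Path


def schema_for(sample, schemas_dir="schemas/v4/types"):
    """Return the schema path assumed to match a sample (same file name)."""
    return Path(schemas_dir) / Path(sample).name


def validate_all(samples_dir="samples/v4", schemas_dir="schemas/v4/types"):
    """Validate every sample and return a list of (file name, problem) pairs."""
    import jsonschema  # third-party; installed by ./scripts/setup.sh

    failures = []
    for sample in sorted(Path(samples_dir).glob("*.json")):
        schema = schema_for(sample, schemas_dir)
        if not schema.exists():
            failures.append((sample.name, "no matching schema file"))
            continue
        try:
            jsonschema.validate(
                json.loads(sample.read_text(encoding="utf-8")),
                json.loads(schema.read_text(encoding="utf-8")),
            )
        except jsonschema.ValidationError as exc:
            failures.append((sample.name, exc.message))
    return failures
```

Run from the repository root, `validate_all()` returns an empty list when every sample validates. Collecting failures instead of stopping at the first one makes it easier to see all broken samples in a single run.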

🔧 Parser Libraries

🌐 XARF v3 Compatibility

XARF v4 maintains backward compatibility with v3 reports. See our migration guide for details.

📊 Schema Structure

{
  "xarf_version": "4.0.0",
  "report_id": "uuid-v4",
  "timestamp": "2024-01-01T12:00:00Z",
  "reporter": {
    "org": "Example Security",
    "contact": "abuse@example.com",
    "domain": "example.com",
    "type": "automated|manual|hybrid"
  },
  "sender": {
    "org": "Example Security",
    "contact": "abuse@example.com",
    "domain": "example.com"
  },
  "source_identifier": "192.0.2.1",
  "category": "messaging|connection|content|infrastructure|copyright|vulnerability|reputation",
  "type": "specific_type_per_category",
  "evidence_source": "spamtrap|honeypot|user_report|automated_scan|manual_analysis",
  "evidence": [
    {
      "content_type": "text/plain|image/png|application/pdf|message/rfc822",
      "description": "Human-readable evidence description",
      "payload": "base64_encoded_evidence_data"
    }
  ],
  "tags": ["structured:tagging", "for:classification"],
  "_internal": {
    "source_system": "system_identifier",
    "custom": "organization_specific_metadata"
  }
}
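To make the template above concrete, the sketch below fills it in for one report using only the Python standard library. It is an illustration under stated assumptions, not a reference implementation. `build_report` is a hypothetical helper, the reporter and sender values are the placeholders from the template, and the document does not say which fields are required, so the sketch fills in every top-level field shown in the template except `_internal`.

```python
import base64
import uuid
from datetime import datetime, timezone

# The seven categories listed in this document.
CATEGORIES = {
    "messaging", "connection", "content", "infrastructure",
    "copyright", "vulnerability", "reputation",
}


def build_report(category, event_type, source_identifier, evidence_text,
                 evidence_source="spamtrap"):
    """Assemble a report dict shaped like the template (hypothetical helper)."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    contact = {
        "org": "Example Security",
        "contact": "abuse@example.com",
        "domain": "example.com",
    }
    return {
        "xarf_version": "4.0.0",
        "report_id": str(uuid.uuid4()),
        # UTC timestamp in the same form as the template value.
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "reporter": {**contact, "type": "automated"},
        "sender": dict(contact),
        "source_identifier": source_identifier,
        "category": category,
        "type": event_type,
        "evidence_source": evidence_source,
        "evidence": [
            {
                "content_type": "text/plain",
                "description": "Human-readable evidence description",
                # The payload is base64-encoded, as in the template.
                "payload": base64.b64encode(
                    evidence_text.encode("utf-8")
                ).decode("ascii"),
            }
        ],
        "tags": [],
    }
```

A call such as `build_report("messaging", "spam", "192.0.2.1", "raw message text")` yields a dict that can be serialized with `json.dumps`. Whether it validates against a type schema depends on type-specific fields that this document does not show.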

🀝 Contributing

XARF v4 is an open standard. We welcome contributions from the security community:

  • Issues: Report bugs or suggest improvements
  • Samples: Contribute anonymized real-world examples
  • Documentation: Help improve clarity and completeness
  • Parsers: Implement XARF support in new languages

📄 License

MIT License - See LICENSE for details.

🔗 Links
