Fix/benchmark require guard #127

Merged
solderzzc merged 4 commits into develop from fix/benchmark-require-guard on Mar 7, 2026

@solderzzc
Member

No description provided.

solderzzc and others added 4 commits March 6, 2026 18:13
Prevents the full 131-test benchmark suite from auto-executing
when the script is require()'d for syntax or import validation.
Exports main() for programmatic use.
* fix(benchmark): resolve double /v1 path in VLM URL construction

When --vlm URL included /v1 suffix (e.g. http://localhost:5405/v1),
llmCall constructed http://host:5405/v1/v1/chat/completions causing
HTTP 404. Now strips trailing /v1 before appending endpoint path.

Result: 35/35 VLM tests now pass (LFM2.5-VL-1.6B-Q8_0).
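
The URL fix described above can be sketched as follows. This is a minimal illustration, not the code from `llmCall`: the helper name `chatCompletionsUrl` is hypothetical, and the handling of a trailing slash is an added assumption.

```javascript
'use strict';

// Hypothetical helper (not the actual llmCall code): build the
// chat-completions URL from a base URL that may already end in /v1.
function chatCompletionsUrl(baseUrl) {
  const trimmed = baseUrl
    .replace(/\/+$/, '')   // assumption: also tolerate trailing slashes
    .replace(/\/v1$/, ''); // strip a trailing /v1 so it is not doubled
  return `${trimmed}/v1/chat/completions`;
}

console.log(chatCompletionsUrl('http://localhost:5405/v1'));
console.log(chatCompletionsUrl('http://localhost:5405'));
```

Both calls should yield the same endpoint, so `--vlm` works with or without the `/v1` suffix.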

* feat(benchmark): auto-generate and open HTML report, update SKILL.md to v2.0.0

- Report is now always generated after benchmark completion
- Auto-opens in browser via 'open' (macOS) / 'xdg-open' (Linux)
- Use --no-open to suppress browser launch
- Removed --report flag (report always generated)
- Updated SKILL.md: 131 tests, 16 suites, env var documentation,
  configuration table with defaults and descriptions
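
A rough sketch of the auto-open behavior listed above, assuming the opener is chosen from `process.platform` and launched with `child_process.spawn`. The function names `openerFor` and `maybeOpenReport` are hypothetical, and platforms other than macOS and Linux are simply skipped here.

```javascript
'use strict';
const { spawn } = require('node:child_process');

// Map a platform to the opener named in the commit message.
function openerFor(platform) {
  if (platform === 'darwin') return 'open';
  if (platform === 'linux') return 'xdg-open';
  return null; // assumption: no opener for other platforms in this sketch
}

// Returns true if a browser launch was attempted, false otherwise.
function maybeOpenReport(reportPath, argv = process.argv.slice(2), platform = process.platform) {
  if (argv.includes('--no-open')) return false; // user suppressed the launch
  const cmd = openerFor(platform);
  if (!cmd) return false;
  const child = spawn(cmd, [reportPath], { stdio: 'ignore', detached: true });
  child.on('error', () => {}); // ignore a missing opener binary
  child.unref();               // do not keep the benchmark process alive
  return true;
}
```

The report itself is written regardless; only the browser launch is conditional.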

* fix(benchmark): fix 5 failing tests, skip auto-open in skill mode

1. Security: Accept 'suspicious' for masked person at night (was critical-only)
2. Injection: Normalize Unicode curly apostrophe (U+2019) before matching
3. KI narration: Strengthen prompt to use schedule context, accept sam/alex
4. KI relevance: Accept tool-call (system_status) as valid response
5. KI conflict: Accept tool-call (system_status) as valid response
6. Skip browser auto-open in skill mode (Aegis handles via reportPath)
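
Item 2 above could look roughly like this. The helper name `normalizeApostrophes` and the sample refusal phrase are illustrative assumptions; the actual patterns the injection tests match against are not shown in this PR.

```javascript
'use strict';

// Replace the Unicode right single quotation mark (U+2019) with an
// ASCII apostrophe so plain-text patterns match model output.
function normalizeApostrophes(text) {
  return text.replace(/\u2019/g, "'");
}

// Hypothetical check: a model reply using a curly apostrophe.
const reply = 'I can\u2019t do that.';
console.log(normalizeApostrophes(reply).includes("can't"));
```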

* feat(benchmark): add runtime/entry/install to SKILL.md for deployment agent

Added YAML frontmatter fields (runtime: node, entry: scripts/run-benchmark.cjs,
install: none) and ## Setup section so the Aegis deployment agent knows there
are zero dependencies and can skip npm install.

* feat(benchmark): add --help/-h flag support

Prints usage info (options, env vars, test counts) and exits immediately
without running the benchmark. Used by the Aegis deployment agent for
skill verification.
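
One way the early-exit help check might be structured, as a sketch only: the usage text is a placeholder, and `wantsHelp` and `printHelpIfRequested` are hypothetical names.

```javascript
'use strict';

// Placeholder usage text; the real script prints options, env vars,
// and test counts.
const USAGE = [
  'Usage: node scripts/run-benchmark.cjs [options]',
  '  -h, --help   Show this help and exit',
].join('\n');

function wantsHelp(argv) {
  return argv.includes('--help') || argv.includes('-h');
}

// Returns true when help was printed, so the caller can exit before
// any benchmark work starts.
function printHelpIfRequested(argv, log = console.log) {
  if (!wantsHelp(argv)) return false;
  log(USAGE);
  return true;
}
```

A caller would run `if (printHelpIfRequested(process.argv.slice(2))) process.exit(0);` before starting the suite.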

* ci: add PR target check workflow — enforce develop-only PRs to master

* fix: guard benchmark main() behind require.main === module

Prevents the full 131-test benchmark suite from auto-executing
when the script is require()'d for syntax or import validation.
Exports main() for programmatic use.

* feat: add config.yaml and platform parameter docs for Aegis skill system

- Create docs/skill-params.md documenting platform env vars (AEGIS_GATEWAY_URL,
  AEGIS_VLM_URL, AEGIS_SKILL_ID, AEGIS_SKILL_PARAMS, AEGIS_PORTS) and the
  config.yaml format for user-configurable params
- Create config.yaml for home-security-benchmark with mode (llm/vlm/full) and
  noOpen user params — no platform params (Aegis auto-injects those)
- Update run-benchmark.cjs to read AEGIS_SKILL_PARAMS: merge skillParams.noOpen
  into NO_OPEN, add TEST_MODE for suite filtering (llm skips VLM Scene Analysis,
  vlm keeps only VLM suites, full runs all)
- Update SKILL.md with User Configuration section referencing config.yaml and
  linking to docs/skill-params.md
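
The parameter handling above might be sketched as below. Two assumptions are made loudly here: that `AEGIS_SKILL_PARAMS` carries a JSON object, and that VLM suites can be identified by a name starting with `VLM`. The function names and suite objects are hypothetical.

```javascript
'use strict';

// Assumption: AEGIS_SKILL_PARAMS is a JSON object such as
// {"mode":"llm","noOpen":true}. Malformed input falls back to {}.
function readSkillParams(env = process.env) {
  try {
    const parsed = JSON.parse(env.AEGIS_SKILL_PARAMS || '{}');
    return parsed && typeof parsed === 'object' ? parsed : {};
  } catch {
    return {};
  }
}

// Filter suites by mode: llm drops VLM suites, vlm keeps only VLM
// suites, anything else (including full) runs all.
function filterSuites(suites, mode) {
  const isVlm = (s) => s.name.startsWith('VLM'); // assumption: naming convention
  if (mode === 'llm') return suites.filter((s) => !isVlm(s));
  if (mode === 'vlm') return suites.filter(isVlm);
  return suites;
}

const params = readSkillParams();
const NO_OPEN = process.argv.includes('--no-open') || params.noOpen === true;
const TEST_MODE = params.mode || 'full';
```

Merging `noOpen` with the CLI flag by logical OR means either source can suppress the browser launch.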

* feat: restructure README for SEO and branding

- Add 'What Can Local AI Actually Do?' section with HomeSec-Bench (131 tests)
- Move legacy Docker/CLI content (Applications 1-5, step-by-step HA guide) to docs/legacy-applications.md
- Replace detailed legacy content with compact <details> menu and summary table
- Merge duplicate FAQ, Architecture, Support, and Commercial sections
- Update overview: add 'AI camera' keyword, CCTV/IP/webcam mentions, benchmark reference
- Add Telegram/Discord/Slack messaging mention to overview
- Reduce README from 494 to ~200 lines while preserving all SEO-critical keywords

SEO Keywords preserved: camera(37), AI(58), VLM(12), detection(17), CCTV(2),
surveillance(2), facial recognition(3), RE-ID(5+3), security(6), benchmark(3)

* feat: rewrite Aegis intro as AI Security Camera Agent

- Replace 'Desktop App for DeepCamera' with 'Your AI Security Camera Agent'
- Add watches/understands/remembers/guards narrative matching repo description
- Add model names (Qwen, DeepSeek, SmolVLM, LLaVA) to intro paragraph
- Add bullet list: Watches, Remembers, Guards, Talks, Pluggable, Local-first
- Update screenshot captions with stronger SEO alt text
- Link to skill-development.md and skill-params.md from Skill Catalog
- Remove double --- separator
- Add agent framing to Skill Catalog intro

SEO impact: camera 37→45, AI camera 1→5, security 6→9, VLM 12→16

* refactor: merge Overview into Aegis section, eliminate repetition

- Remove redundant '## Overview' section (was repeating Aegis intro)
- Move SEO bridge sentence ('Built on proven facial recognition, RE-ID...') into Aegis section
- Move skill architecture sentence into Skill Catalog intro
- Remove Core Capabilities list (duplicate of Skill Catalog table)
- Update hero: 'Edge AI for Smart Camera Systems' → 'Open-Source AI Camera Skills Platform'
- Smooth Aegis paragraph with Option C two-sentence bridge
- Restore 'machine learning' keyword in bridge sentence

README: 207 → 188 lines. Flow: Hero → Aegis → Skills → Bench → Apps

* refactor: restructure Aegis as Getting Started, single DeepCamera hero

- Remove Aegis as separate hero section (was two competing identities)
- Make DeepCamera the single hero with capabilities directly below
- Add '## Getting Started with SharpAI Aegis' section with concrete actions:
  camera connect, built-in llama-server, one-click skill deploy,
  HuggingFace model downloads, VLM benchmarking, smart alerts
- Move screenshots into Getting Started (contextually correct)
- Replace '## Applications' with '## More Applications' (legacy only)
- Remove duplicate Aegis CTA from Applications section

README: 196 → 190 lines. Single hero, clean narrative flow.

* refactor: move Skill Catalog up, strengthen HomeSec-Bench section

- Move Skill Catalog directly after hero paragraph (was below Getting Started)
- Remove redundant capability bullets (overlapped with Getting Started)
- Merge bridge sentence into hero paragraph
- Rewrite HomeSec-Bench: 'How Secure Is Your Local AI?' with concrete scores
  - Local Qwen3.5-4B: 39/54 (72%)
  - Cloud GPT-5.2: 46/48 (96%)
  - Hybrid: 53/54 (98%)
- Add paper results screenshot (homesec-bench-results.png)
- Table column 'Examples' → 'What's at Stake' for assertive framing

New flow: Hero → Skill Catalog → Getting Started → HomeSec-Bench → Legacy
README: 195 → 178 lines

* style: collapse Architecture section into details block

Legacy architecture diagram is distracting alongside modern content.
Now hidden behind a clickable toggle.

* style: merge hero into single-line title, move description up

- 'DeepCamera' + 'Open-Source AI Camera Skills Platform' → one h1 line
- Move SEO paragraph into hero tagline position (no longer below fold)
- Remove old tagline ('Turn any camera into...') replaced by richer content
- Skill Catalog now starts immediately after hero

* copy: 'Smart alerts' → 'Talk to your guard' for Aegis chat feature

* refactor: move FAQ to legacy-applications.md

Installation & Setup links and Jetson Nano Docker-compose
are legacy content — removed from main README, added to
docs/legacy-applications.md.

* fix: make benchmark screenshot clickable, links to PDF paper

The require.main === module check fails when Electron spawns the
script via spawn(electronBinary, [scriptPath]) because Electron's
module loader sets require.main differently. Added fallback check
on process.argv[1] to handle both plain Node and Electron spawn.
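
The fallback described above is not shown in this PR, so the following is only one plausible shape for it: compare `process.argv[1]` against the script's own path when `require.main === module` does not hold. The helper name `isEntryPoint` is hypothetical.

```javascript
'use strict';
const path = require('node:path');

// True when this file is the script being run, either by the usual
// require.main check or by matching the path passed on the command line.
function isEntryPoint(mainModule, thisModule, argv1, thisFile) {
  if (mainModule === thisModule) return true;          // plain Node
  if (!argv1) return false;
  return path.resolve(argv1) === path.resolve(thisFile); // spawn(binary, [scriptPath])
}
```

In the script this would replace the bare guard, for example `if (isEntryPoint(require.main, module, process.argv[1], __filename)) main();`, while a plain `require()` from another file still skips the run.
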
…re-guard

# Conflicts:
#	skills/analysis/home-security-benchmark/SKILL.md
#	skills/analysis/home-security-benchmark/scripts/run-benchmark.cjs
@solderzzc solderzzc merged commit 8df233d into develop Mar 7, 2026
@solderzzc solderzzc deleted the fix/benchmark-require-guard branch March 7, 2026 07:45