Merged
57 commits
ba0c521
feat: add Outlook host support with mailbox tools, agent, and manifest
trsdn Feb 21, 2026
292764e
fix: correct Outlook manifest schema (remove RequestedHeight from Ite…
trsdn Feb 21, 2026
57417b1
fix: add localStorage fallback for hosts without SharedRuntime (Outlook)
trsdn Feb 22, 2026
35cbe35
fix: filter skills by host so Excel skills don't show in Outlook
trsdn Feb 22, 2026
9628883
feat: add Outlook skill with email workflows and tool guidance
trsdn Feb 22, 2026
175404b
feat: host-specific welcome suggestions for Outlook, PowerPoint, and …
trsdn Feb 22, 2026
9d22307
feat: add 13 additional Outlook JS API tools (22 total)
trsdn Feb 22, 2026
c49bb60
feat: add get_appointments tool via EWS FindItem for calendar access
trsdn Feb 22, 2026
aed7621
fix: improve get_appointments error message for disabled EWS tokens
trsdn Feb 22, 2026
d80600e
feat: add PowerPoint skill, improve prompt; remove EWS calendar tool
trsdn Feb 22, 2026
0f005db
fix: add pinnable task pane support to Outlook manifest
trsdn Feb 22, 2026
f1cd3f8
fix: improve SmartArt/group shape text extraction in PowerPoint
trsdn Feb 22, 2026
2137806
fix: remove shape.load('type') that caused InvalidArgument in PowerPoint
trsdn Feb 22, 2026
b711985
fix: handle InvalidArgument by processing slides individually
trsdn Feb 22, 2026
d5b9e1d
fix: per-shape sync and correct getImageAsBase64 signature
trsdn Feb 22, 2026
2f15606
fix: reduce get_slide_image default width to 320px for CLI output limits
trsdn Feb 22, 2026
7e0635a
feat: add iterative refinement instructions to PowerPoint agent
trsdn Feb 22, 2026
455ce92
feat: add 14 new PowerPoint tools for full JS API coverage (24 total)
trsdn Feb 22, 2026
c10cc1d
fix: agent always checks current slide with get_selected_slides first
trsdn Feb 22, 2026
3e9303d
fix: get_selected_slides returns both 1-based number and 0-based index
trsdn Feb 22, 2026
078bbb3
feat: expand Word host with 10 new tools, skill, and improved prompts
trsdn Feb 22, 2026
2ca9015
feat: add 15 new Word tools for full JS API coverage (35 total)
trsdn Feb 22, 2026
c6665fc
feat: improve PowerPoint skill with layout variety, verification loop…
trsdn Feb 22, 2026
015ac70
feat: add PPTX thumbnail grid script for visual QA
trsdn Feb 22, 2026
8815fe1
fix: add content sizing rules to prevent text overflow in PowerPoint …
trsdn Feb 22, 2026
6abd265
fix: add bold label separator rules to prevent merged text in PowerPo…
trsdn Feb 22, 2026
d142d55
fix: replace nested text array examples that cause [object Object] in…
trsdn Feb 22, 2026
06976a0
feat: split PowerPoint skill into 4 specialized skills
trsdn Feb 22, 2026
6d4b312
fix: enforce single-string colon pattern for label+description bullets
trsdn Feb 22, 2026
782ab0c
fix: tighten content limits — max 8 words per bullet, 6 words per des…
trsdn Feb 22, 2026
dc05390
fix: tighten word limits with concrete examples — 3-5 words per descr…
trsdn Feb 22, 2026
19875c3
fix: add shrinkText safety net and default to 3 columns
trsdn Feb 22, 2026
9030747
feat: enforce mandatory verification loops and tighten content limits
trsdn Feb 22, 2026
e64580d
fix: add 12-char word limit for columns and compound word fix guidance
trsdn Feb 22, 2026
49c9c08
refactor: simplify prompts — fewer rules, stronger verification loop
trsdn Feb 22, 2026
6b3ba1d
fix: emphasize bottom-edge checking in verification loop
trsdn Feb 22, 2026
52ce20a
fix: increase get_slide_image resolution from 320px to 800px
trsdn Feb 22, 2026
ad33763
feat: add region cropping to get_slide_image for targeted overflow de…
trsdn Feb 22, 2026
7426a4d
fix: match PptxGenJS slide size to presentation and remove hardcoded …
trsdn Feb 22, 2026
2f99d8f
feat: add "detailed" mode to get_slide_image — overview + 4 quadrant …
trsdn Feb 22, 2026
cce074b
refactor: remove "detailed" multi-image mode, keep single-quadrant zoom
trsdn Feb 22, 2026
181e04b
docs: update README with Outlook, PowerPoint, and Word host support
trsdn Feb 22, 2026
0448fb4
fix: increase font size guidelines and add readability check to verif…
trsdn Feb 22, 2026
a539811
feat: add progress narration section to PowerPoint agent
trsdn Feb 22, 2026
d394a3c
fix: enable auto-scroll to bottom during streaming
trsdn Feb 22, 2026
3be4aba
feat: implement multi-agent planner → worker deck orchestration
trsdn Feb 22, 2026
bcc9381
fix: capture plan in tool handler instead of parsing events
trsdn Feb 22, 2026
3aff5aa
feat: add fast/deep modes to deck orchestrator
trsdn Feb 22, 2026
880fcd3
feat: fast mode sends all slides in 1 worker session
trsdn Feb 22, 2026
e3e9ead
feat: add WorkIQ toggle and MCP stdio transport (#2)
urosstojkic Feb 22, 2026
d9bcc6c
feat: Word orchestrator, Outlook skills, WorkIQ model selection, docs…
trsdn Feb 22, 2026
0b0e156
feat: merge Word orchestrator, Outlook skills, WorkIQ integration fro…
Feb 26, 2026
76b5aa3
feat: add Outlook manifest sideload support and validate script
Feb 26, 2026
8ff97ae
fix: auto-approve MCP permission requests in integration test; fix Sk…
Feb 26, 2026
bf89a86
fix: resolve lint errors in planner tool and .cjs files
Feb 26, 2026
f6592ce
fix: pass dummy invocation to Outlook tool.handler() in E2E tests
Feb 26, 2026
03222e0
docs: add Outlook E2E coverage, attribute @trsdn, update all test tab…
Feb 26, 2026
21 changes: 12 additions & 9 deletions .github/copilot-instructions.md
@@ -34,7 +34,7 @@ Branch protection is enforced on GitHub (ruleset ID `13260767`). Any attempt to
- **Vite 7** — bundling, dev server (HMR via middleware mode in Express)
- **TypeScript 5** — type safety
- **Vitest** — unit + integration testing (jsdom env)
- **Mocha** — E2E tests inside Excel Desktop (current host runtime E2E)
- **Mocha** — E2E tests inside real Office Desktop hosts (Excel, PowerPoint, Word, Outlook)
- **Playwright** — browser UI tests for task pane flows

## Architecture
@@ -114,15 +114,15 @@ Bundled skill files in `src/skills/` provide additional context injected into th
│ • WebSocket client + session (mocked in unit tests) │
│ • Agent/skill service parsing │
├──────────────────────────────────────────────────────┤
│ Excel.run() boundary (current host implementation) │
│ Office.run() boundary (all hosts) │
├──────────────────────────────────────────────────────┤
│ E2E only (Mocha + real Excel Desktop) │
│ E2E only (Mocha + real Office Desktop) │
│ ───────────────────────────── │
│ • rangeCommands, tableCommands, sheetCommands │
│ • chartCommands, workbookCommands, commentCommands │
│ • conditionalFormatCommands, dataValidationCommands │
│ • pivotTableCommands │
│ • PowerPoint / Word commands │
│ • PowerPoint / Word / Outlook commands │
│ • OfficeRuntime.storage (real runtime) │
└──────────────────────────────────────────────────────┘
```
@@ -152,10 +152,11 @@ The task pane is split into three areas:
| --------------- | ---------- | -------------------- | ----- | ------------------------------------------------------------------------------------ |
| **Integration** | Vitest | `tests/integration/` | 36 | Component wiring; tool schemas; stores; hooks; live Copilot WebSocket |
| **UI** | Playwright | `tests-ui/` | | Browser task pane flows (real Copilot API, NO mocking) |
| **E2E (Excel)** | Mocha | `tests-e2e/` | ~187 | Excel commands inside real Excel Desktop |
| **E2E (PPT)** | Mocha | `tests-e2e-ppt/` | ~13 | PowerPoint commands inside real PowerPoint Desktop |
| **E2E (Word)** | Mocha | `tests-e2e-word/` | ~12 | Word commands inside real Word Desktop |
| ~~Unit~~ | ~~Vitest~~ | ~~`tests/unit/`~~ | | ~~DO NOT ADD NEW UNIT TESTS~~ |
| **E2E (Excel)** | Mocha | `tests-e2e/` | ~187 | Excel commands inside real Excel Desktop |
| **E2E (PPT)** | Mocha | `tests-e2e-ppt/` | ~13 | PowerPoint commands inside real PowerPoint Desktop |
| **E2E (Word)** | Mocha | `tests-e2e-word/` | ~12 | Word commands inside real Word Desktop |
| **E2E (Outlook)** | Mocha | `tests-e2e-outlook/` | ~9 | Outlook commands (requires Exchange sideloading approval) |
| ~~Unit~~ | ~~Vitest~~ | ~~`tests/unit/`~~ | | ~~DO NOT ADD NEW UNIT TESTS~~ |

### Required Test Execution After Any Code Change

@@ -165,7 +166,8 @@ The task pane is split into three areas:
2. `npm run test:e2e` — E2E tests inside real Excel Desktop (requires `npm run start:desktop` first; **must pass before marking work complete**)
3. `npm run test:e2e:ppt` — E2E tests inside real PowerPoint Desktop (requires PPT open; **must pass before marking PPT work complete**)
4. `npm run test:e2e:word` — E2E tests inside real Word Desktop (requires Word open; **must pass before marking Word work complete**)
5. `npm run test:ui` — Playwright UI tests when task pane flows are changed
5. `npm run test:e2e:outlook` — E2E tests inside real Outlook Desktop (requires Exchange sideloading approval; blocked on tenants with policy restrictions — flag as blocker if unavailable)
6. `npm run test:ui` — Playwright UI tests when task pane flows are changed

**Never consider work done until integration and E2E tests pass for the affected host(s).** If live Copilot WebSocket tests fail because the dev server is not running, start `npm run dev` as a background process and re-run — do not skip or report as blocked. If E2E tests cannot be run (Office app not open), explicitly flag this as a blocker to the user — do not silently skip them.
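
As a rough illustration of "tests for the affected host(s)", the mapping from host to E2E script could be sketched as below. The script names are copied from the list above; the `AffectedHost` type and the `e2eCommandsFor` helper are hypothetical and are not part of the repository.

```typescript
// Sketch only: the helper and type names are assumptions made for illustration.
type AffectedHost = "excel" | "powerpoint" | "word" | "outlook";

// Script names as listed in the required test execution steps.
const E2E_SCRIPT_BY_HOST: Record<AffectedHost, string> = {
  excel: "npm run test:e2e",
  powerpoint: "npm run test:e2e:ppt",
  word: "npm run test:e2e:word",
  outlook: "npm run test:e2e:outlook",
};

// Returns the E2E commands for the hosts a change touched,
// de-duplicated and in the order the hosts were given.
function e2eCommandsFor(hosts: AffectedHost[]): string[] {
  const seen = new Set<AffectedHost>();
  const commands: string[] = [];
  for (const host of hosts) {
    if (!seen.has(host)) {
      seen.add(host);
      commands.push(E2E_SCRIPT_BY_HOST[host]);
    }
  }
  return commands;
}

console.log(e2eCommandsFor(["word", "outlook", "word"]));
```

A change that touches only Word and Outlook code would then need two E2E runs in addition to the integration suite.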

@@ -238,6 +240,7 @@ The task pane is split into three areas:
- **New Excel command?** → E2E test in `tests-e2e/`
- **New PowerPoint command?** → E2E test in `tests-e2e-ppt/`
- **New Word command?** → E2E test in `tests-e2e-word/`
- **New Outlook command?** → E2E test in `tests-e2e-outlook/`
- **New task pane interaction flow?** → UI test in `tests-ui/`
- **New React component or hook behavior?** → Integration test in `tests/integration/`
- **New host routing rule?** → Integration test in `tests/integration/`
6 changes: 5 additions & 1 deletion .gitignore
@@ -43,4 +43,8 @@ Thumbs.db
.vscode/settings.json
*.js.map
__pycache__/
aitest-reports/
aitest-reports/tests-pptx/test.pptx
tests-pptx/~$*
tests-pptx/*.pptx
tests-pptx/*.jpg
tests-pptx/*.png
34 changes: 28 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Office Coding Agent

An Office add-in that embeds GitHub Copilot as an AI assistant in Excel (and other Office hosts). Built with React, [assistant-ui](https://github.com/assistant-ui/assistant-ui), Tailwind CSS, and the [GitHub Copilot SDK](https://www.npmjs.com/package/@github/copilot-sdk). The Copilot SDK integration architecture is based on [patniko/github-copilot-office](https://github.com/patniko/github-copilot-office). Requires an active GitHub Copilot subscription — no API keys or endpoint configuration needed.
An Office add-in that embeds GitHub Copilot as an AI assistant in Excel, PowerPoint, Word, and Outlook. Built with React, [assistant-ui](https://github.com/assistant-ui/assistant-ui), Tailwind CSS, and the [GitHub Copilot SDK](https://www.npmjs.com/package/@github/copilot-sdk). The Copilot SDK integration architecture is based on [patniko/github-copilot-office](https://github.com/patniko/github-copilot-office). Requires an active GitHub Copilot subscription — no API keys or endpoint configuration needed.

> **Research Project Disclaimer**
>
@@ -16,12 +16,16 @@ Node.js proxy server (src/server.mjs)
GitHub Copilot API
```

The proxy server uses the `@github/copilot-sdk` to manage the Copilot CLI lifecycle and bridges it to the browser task pane via WebSocket + JSON-RPC. Tool calls (Excel commands) flow back from the server to the browser.
The proxy server uses the `@github/copilot-sdk` to manage the Copilot CLI lifecycle and bridges it to the browser task pane via WebSocket + JSON-RPC. Tool calls flow back from the server to the browser, where host-specific handlers execute them (e.g., `Excel.run()`, `PowerPoint.run()`, `Word.run()`, or the Outlook JavaScript API).
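
The tool-call leg of that flow can be pictured with a minimal sketch. Everything here is an assumption for illustration: the `tool.call` method name, the message shapes, and the `handlersByHost` registry are hypothetical, and the stub handler stands in for a real `PowerPoint.run()` call. Only the JSON-RPC 2.0 envelope and its `-32601` "Method not found" code are standard.

```typescript
// Hypothetical shapes: the real wire format used by the proxy is not shown here.
type Host = "excel" | "powerpoint" | "word" | "outlook";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tool.call"; // assumed method name
  params: { name: string; args: Record<string, unknown> };
}

interface ToolCallResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

// Stub registry; in the add-in each handler would wrap the host's Office.js API.
const handlersByHost: Record<Host, Record<string, ToolHandler>> = {
  excel: {},
  powerpoint: { get_selected_slides: () => [{ number: 1, index: 0 }] },
  word: {},
  outlook: {},
};

// Parses one JSON-RPC request and routes it to the handler for the current host.
function handleToolCall(host: Host, raw: string): ToolCallResponse {
  const req = JSON.parse(raw) as ToolCallRequest;
  const handler = handlersByHost[host][req.params.name];
  if (!handler) {
    // -32601 is the JSON-RPC 2.0 "Method not found" error code.
    return {
      jsonrpc: "2.0",
      id: req.id,
      error: { code: -32601, message: `Unknown tool: ${req.params.name}` },
    };
  }
  return { jsonrpc: "2.0", id: req.id, result: handler(req.params.args) };
}
```

Keying the registry by host first means a tool that exists for one host is reported as unknown in another, which matches the host-routed behaviour described above.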

## Features

- **GitHub Copilot authentication** — sign in once with your GitHub account; no API keys or endpoint config
- **Host-routed tools** — Excel, PowerPoint, and Word toolsets selected by current Office host
- **Host-routed tools** — Excel, PowerPoint, Word, and Outlook toolsets selected by current Office host
- **10 Excel tool groups** — range, table, chart, sheet, workbook, comment, conditional format, data validation, pivot table, range format — covering ~83 actions
- **24 PowerPoint tools** — slides, shapes, text, images, tables, charts, notes, layouts; includes visual QA with `get_slide_image` region cropping for overflow detection
- **35 Word tools** — documents, paragraphs, tables, images, headers/footers, styles, comments, sections, fields, content controls
- **22 Outlook tools** — emails, calendar, contacts, folders, attachments, categories, search, flags, drafts
- **Agent system** — host-targeted agents with YAML frontmatter (`hosts`, `defaultForHosts`)
- **Skills system** — bundled skill files inject context into the system prompt, toggleable via SkillPicker
- **Custom agents & skills** — import local ZIP files for custom agents and skills
@@ -30,6 +34,17 @@ The proxy server uses the `@github/copilot-sdk` to manage the Copilot CLI lifecy
- **Auto-scroll chat** — thread stays pinned to newest content so follow-up output remains visible
- **Web fetch tool** — proxied through the local server to avoid CORS restrictions
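
To make the host-targeted agent idea concrete, here is a small sketch of how the `hosts` and `defaultForHosts` frontmatter fields might drive agent selection. The `AgentMeta` shape, the function names, and the sample agents are hypothetical; the repository's actual parsing and selection logic may differ.

```typescript
// Sketch under assumptions: only the field names `hosts` and `defaultForHosts`
// come from the feature list; everything else is illustrative.
type OfficeHost = "excel" | "powerpoint" | "word" | "outlook";

interface AgentMeta {
  name: string;
  hosts: OfficeHost[];           // hosts where the agent may be offered
  defaultForHosts: OfficeHost[]; // hosts where the agent is preselected
}

// Agents that declare support for the given host.
function agentsForHost(agents: AgentMeta[], host: OfficeHost): AgentMeta[] {
  return agents.filter((agent) => agent.hosts.includes(host));
}

// Prefer an agent marked as default for the host, else the first eligible one.
function defaultAgentForHost(agents: AgentMeta[], host: OfficeHost): AgentMeta | undefined {
  const eligible = agentsForHost(agents, host);
  return eligible.find((agent) => agent.defaultForHosts.includes(host)) ?? eligible[0];
}

// Made-up agents for demonstration only.
const sampleAgents: AgentMeta[] = [
  { name: "generic", hosts: ["excel", "word", "outlook"], defaultForHosts: [] },
  { name: "mail", hosts: ["outlook"], defaultForHosts: ["outlook"] },
];

console.log(defaultAgentForHost(sampleAgents, "outlook")?.name);
```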

## Agent Skills Format

A skill is a folder containing `SKILL.md`. Optional supporting docs live under `references/` inside that skill folder.

## Prerequisites

- [Node.js](https://nodejs.org/) >= 20
- Microsoft Office (Excel, PowerPoint, Word, or Outlook — desktop or Microsoft 365 web)
- An active **GitHub Copilot** subscription (individual, business, or enterprise)
- The `@github/copilot` CLI authenticated (`gh auth login` or equivalent)

## Getting Started

**👉 See [GETTING_STARTED.md](./GETTING_STARTED.md) for full setup instructions** — including authentication, starting the proxy server, registering the add-in, and sideloading into Office.
@@ -60,6 +75,8 @@ For local shared-folder sideloading and staging manifest workflows, see [docs/SI

## Available Scripts

| Script | Description |
| -------------------------------- | --------------------------------------------------------------------- |
| `npm run dev` | Start Copilot proxy + Vite dev server (port 3000) |
@@ -101,15 +118,18 @@ For local shared-folder sideloading and staging manifest workflows, see [docs/SI
| `npm run test:e2e` | Run E2E tests in Excel Desktop |
| `npm run test:e2e:ppt` | Run E2E tests in PowerPoint Desktop |
| `npm run test:e2e:word` | Run E2E tests in Word Desktop |
| `npm run test:e2e:outlook` | Run E2E tests in Outlook Desktop |
| `npm run test:e2e:all` | Run all four E2E suites in sequence |
| `npm run validate` | Validate `manifests/manifest.dev.xml` |
| `npm run validate:outlook` | Validate `manifests/manifest.outlook.dev.xml` |

## Testing

This project uses three active test layers:

- **Integration** (`tests/integration/`, Vitest) — component wiring, stores, host/tool routing, and live Copilot websocket flows
- **UI** (`tests-ui/`, Playwright) — browser taskpane behavior and regression coverage
- **E2E** (`tests-e2e*`, Mocha) — real Office host validation in Excel, PowerPoint, and Word desktop
- **E2E** (`tests-e2e*`, Mocha) — real Office host validation in Excel, PowerPoint, Word, and Outlook desktop

Unit tests are intentionally not used for new work in this repository.

@@ -132,6 +152,7 @@ npm run test:ui
npm run test:e2e
npm run test:e2e:ppt
npm run test:e2e:word
npm run test:e2e:outlook

# Validate the Office add-in manifest
npm run validate
@@ -141,7 +162,7 @@ Integration tests run as part of the default `npm test` suite.

## E2E Testing

The project includes ~187 end-to-end tests that validate all 83 Excel tools plus settings persistence and AI round-trips inside a real Excel Desktop instance.
The project includes end-to-end tests across all four Office hosts: ~187 Excel tests (tools, settings persistence, AI round-trips), ~13 PowerPoint tests, ~12 Word tests, and ~9 Outlook tests (which require Exchange sideloading approval).

### How It Works

@@ -313,7 +334,7 @@ In chat pickers:

- **`useOfficeChat`** — creates a `WebSocketCopilotClient`, opens a `BrowserCopilotSession`, maps `SessionEvent` stream to `ThreadMessage[]` for `useExternalStoreRuntime`
- **`BrowserCopilotSession.query()`** — async generator yielding `SessionEvent` objects (assistant.message_delta, tool.execution_start, session.idle, etc.)
- **`getToolsForHost(host)`** — returns `Tool[]` (Copilot SDK format) for the current Office host
- **`getToolsForHost(host)`** — returns `Tool[]` (Copilot SDK format) for the current Office host (Excel: 10 tool groups covering ~83 actions, PowerPoint: 24 tools, Word: 35 tools, Outlook: 22 tools)
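
The event-to-message mapping can be illustrated with a simplified, synchronous reducer. The event type names are the ones listed above; the payload fields (`delta`, `toolName`), the `ThreadStateSketch` shape, and `reduceEvents` itself are assumptions. The real `query()` is an async generator, so the actual hook would consume events as they arrive rather than from an array.

```typescript
// Payload fields below are hypothetical; only the `type` strings come from the text.
type SessionEventSketch =
  | { type: "assistant.message_delta"; delta: string }
  | { type: "tool.execution_start"; toolName: string }
  | { type: "session.idle" };

interface ThreadStateSketch {
  text: string;           // assistant text accumulated from deltas
  toolsStarted: string[]; // names of tools whose execution began
  done: boolean;          // true once the session reports idle
}

// Folds a sequence of session events into a single snapshot of the thread.
function reduceEvents(events: SessionEventSketch[]): ThreadStateSketch {
  const state: ThreadStateSketch = { text: "", toolsStarted: [], done: false };
  for (const event of events) {
    switch (event.type) {
      case "assistant.message_delta":
        state.text += event.delta;
        break;
      case "tool.execution_start":
        state.toolsStarted.push(event.toolName);
        break;
      case "session.idle":
        state.done = true;
        break;
    }
  }
  return state;
}
```

Because chat state is ephemeral, a snapshot like this can be rebuilt from the event stream and never needs to be persisted.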

State is minimal: `useSettingsStore` (Zustand) persists model/agent/skill configuration; chat state is ephemeral.

@@ -367,6 +388,7 @@ The proxy server architecture (`server.mjs` → `copilotProxy.mjs` → `@github/
## Acknowledgments

- **[patniko/github-copilot-office](https://github.com/patniko/github-copilot-office)** — The proxy server architecture, Copilot SDK integration pattern, and WebSocket transport design used in this project were adopted from this repository by [Patrick Nikoletich](https://github.com/patniko) and [Steve Sanderson](https://github.com/SteveSandersonMS). Their work provided the foundation for the Phase 2 migration.
- **[@trsdn (Torsten)](https://github.com/trsdn)** and **[@urosstojkic](https://github.com/urosstojkic)** — Contributed the Word document orchestrator (planner→worker pattern), 22 Outlook tools, expanded PowerPoint tooling (24 tools), WorkIQ MCP stdio integration, host-specific welcome prompts, improved auto-scroll, and new skills (Outlook email/calendar/drafting, Word formatting/tables/document-builder, PowerPoint content/layout/animation/presentation). Originally submitted as [PR #33](https://github.com/sbroenne/office-coding-agent/pull/33) and merged in [PR #45](https://github.com/sbroenne/office-coding-agent/pull/45).
- **[assistant-ui](https://github.com/assistant-ui/assistant-ui)** — React chat UI components used for the task pane thread and composer.
- **[Vercel AI SDK](https://ai-sdk.dev/)** — Original AI runtime used in Phase 1.

20 changes: 12 additions & 8 deletions docs/TESTING_STRATEGY.md
@@ -108,21 +108,23 @@ Unit tests that mock Office APIs or fabricate fake contexts provide zero confide

### E2E Tests — Mocha inside real Office hosts

| Host | Directory | Tests |
| ---------- | ----------------- | ----- |
| Excel | `tests-e2e/` | ~187 |
| PowerPoint | `tests-e2e-ppt/` | ~13 |
| Word | `tests-e2e-word/` | ~12 |
| Host | Directory | Tests |
| ---------- | ---------------------- | ----- |
| Excel | `tests-e2e/` | ~187 |
| PowerPoint | `tests-e2e-ppt/` | ~13 |
| Word | `tests-e2e-word/` | ~12 |
| Outlook | `tests-e2e-outlook/` | ~9 |

**Real Office.js APIs, real host runtime.**

## When to Write What

| Scenario | Test type | Location |
| ------------------------------------ | ---------------- | -------------------- |
| New Excel command (`Excel.run`) | E2E test | `tests-e2e/` |
| New PowerPoint command | E2E test | `tests-e2e-ppt/` |
| New Word command | E2E test | `tests-e2e-word/` |
| New Excel command (`Excel.run`) | E2E test | `tests-e2e/` |
| New PowerPoint command | E2E test | `tests-e2e-ppt/` |
| New Word command | E2E test | `tests-e2e-word/` |
| New Outlook command | E2E test | `tests-e2e-outlook/` |
| New task pane interaction flow | UI test | `tests-ui/` |
| New React component or hook behavior | Integration test | `tests/integration/` |
| New host routing rule | Integration test | `tests/integration/` |
@@ -142,6 +144,8 @@ npm run test:ui
npm run test:e2e # Excel Desktop (~187 tests)
npm run test:e2e:ppt # PowerPoint Desktop (~13 tests)
npm run test:e2e:word # Word Desktop (~12 tests)
npm run test:e2e:outlook # Outlook Desktop (~9 tests; requires Exchange sideloading approval)
npm run test:e2e:all # All four suites in sequence

# Validate manifest
npm run validate