Skip to content

Network multiplayer optimization (delta synchronization, reconnect support)#9642

Open
MostCromulent wants to merge 157 commits intoCard-Forge:masterfrom
MostCromulent:NetworkPlay/main
Open

Network multiplayer optimization (delta synchronization, reconnect support)#9642
MostCromulent wants to merge 157 commits intoCard-Forge:masterfrom
MostCromulent:NetworkPlay/main

Conversation

@MostCromulent
Copy link
Contributor

@MostCromulent MostCromulent commented Jan 30, 2026

Full Disclosure Up Front: I have zero coding ability. All code has been written by Claude Code AI based on natural language instruction. However that doesn't mean I have just accepted the first thing it spat out. I understand people may be skeptical about the "vibe coding" aspect of this, which is why I have attempted to verify the effectiveness of the code as much as I can through extensive automated testing, and why I have provided comprehensive documentation so there is full transparency for reviewers. If the devs have technical implementation questions or feedback I am going to have to rely on Claude to help me respond to that. I'm sorry if this is a mess; ultimately I am trying to help contribute to Forge in the best way I can. Yes, a comprehensive network play overhaul was probably overambitious as a first project by a non-coder.

Overview

This PR attempts to introduce major improvements to Forge's network play functionality.

These features and their implementation are comprehensively documented in Branch Documentation. In summary:

  • Delta Synchronisation Protocol: The current network protocol transfers the entire game state to each player on every update. This is replaced with a new protocol which, after initialisation, only transfers changed properties in the game state on update. If there is a sync error defaults back to the current protocol to restore the full game state. Testing indicates this process massively reduces total bandwidth usage by around ~99.5% compared to status quo. (An average of 2.39 MB per game using new protocol vs 578.58 MB per game using current protocol across an automated 100-game test).

  • Reconnection support: Allows disconnected players to reconnect to games with automatic state recovery. If a player cannot reconnect within a timeout period then they will be automatically handed to AI control so the game can continue. The host may manually skip the timeout and immediately hand control to AI.

  • Enhanced Chat Notifications: Improved visibility for network play chat messages with better formatting and player join/leave notifications. (Edit: Cosmetic chat functions moved to separate PR Network multiplayer optimization (chat) #9660)

  • Network UI Improvements: Better feedback during network operations including the prompt window identifying which player currently has priority and how long the game has been waiting for them to take an action; e.g. "Waiting for MostCromulent... (32 seconds)". (Edit: UI improvements moved to seperate PR Network multiplayer optimization (UI improvements) #9671)

  • Comprehensive Test Infrastructure: Automated headless network testing and analysis tools specifically developed to verify and debug the delta synchronisation protocol. Includes the ability to headlessly run large scale parallel AI vs AI games to completion using full network infrastructure, with comprehensive debug logging.

Dependencies

Dependencies between different features in the Branch are outlined in Feature Dependencies.` (Edit: Out of date, substantial refactoring has been done to remove feature dependencies)

Testing

I have attempted to verify the soundness of the delta synchronisation protocol as thoroughly as I can alone using the automated testing infrastructure developed for this Branch. Details of testing capabilities are outlined at Testing Documentation.

Comprehensive Validation Results (29 January 2025) | Test log analysis and all 100 test logs available for review.

  • A batch test of 100 AI vs AI games on full network infrastructure across a variety of player counts from 2-4 players, using over 200 unique decks to test a wide range of play conditions.
  • 97% success rate with zero checksum errors. The three failed games were caused by timeouts due to complex board states, not the network protocol.
  • Average bandwidth usage of 2.39 MB per game using new delta protocol vs 578.58 MB per game using current protocol.
  • 99.6% average bandwidth savings.

Conclusion
Based on testing, the new delta synchronisation network protocol in this branch appears to solve a longstanding issue with Forge's multiplayer implementation, significantly reducing bandwidth usage and latency in network multiplayer games. The other features of this branch provide a range of quality-of-life improvements for network play.

I've taken this about as far as I can alone and would welcome any feedback and testing from others.

MostCromulent and others added 30 commits January 20, 2026 20:24
This commit introduces two major features to improve multiplayer networking:

1. Delta Synchronization
   - Only transmit changed properties instead of full game state
   - Add DeltaPacket and FullStatePacket for efficient data transfer
   - DeltaSyncManager tracks changes and builds minimal update packets
   - Extend TrackableObject with change tracking methods
   - Periodic checksum validation for state consistency

2. Reconnection Support
   - Players can rejoin games after disconnection (5 min timeout)
   - GameSession and PlayerSession manage connection state
   - Secure token-based authentication for reconnection
   - Username-based fallback if credentials are lost
   - Game pauses automatically when player disconnects
   - Full state restoration on successful reconnection

New files:
- DeltaPacket.java, FullStatePacket.java - Network packets
- DeltaSyncManager.java - Server-side delta collection
- GameSession.java, PlayerSession.java - Session management
- ReconnectRequestEvent.java, ReconnectRejectedEvent.java - Events
- NetworkOptimizationTest.java - Unit tests
- NETWORK_OPTIMIZATION.md - Implementation documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
WARNING: Delta sync is NOT currently operating correctly and needs
further debugging. The system falls back to full state sync which
works, but the bandwidth optimization benefits are not yet realized.

Changes:
- Add NewObjectData to DeltaPacket for newly created objects
- Implement compact binary serialization (NetworkPropertySerializer)
  to replace Java serialization (~99% size reduction)
- Add tracker initialization after network deserialization
- Fix session credential timing (create session before startMatch)
- Add immediate GameView setting in client handler
- Add StackItemView network deserialization constructor
- Add bandwidth monitoring and debug logging
- Rename NETWORK_OPTIMIZATION.md to Branch_Documentation.md
- Add CLAUDE.md project documentation

Known issues requiring debug:
- Client may not receive/apply delta packets correctly
- Object lookup in Tracker may fail for new objects
- CardStateView property application needs verification

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add diagnostic tools for investigating deserialization byte stream
misalignment issues in delta sync:

- NetworkTrackableSerializer/Deserializer: Add byte position tracking
  (bytesWritten/bytesRead counters with getters)
- NetworkPropertySerializer: Add marker validation, ordinal validation,
  and verbose CardStateView serialize/deserialize logging
- AbstractGuiGame: Add hex dump on error, improved CardStateView null
  handling with reflection-based creation for missing states

Also:
- Rename Branch_Documentation.md to BRANCH_DOCUMENTATION.md
- Update documentation with recent changes

Note: The serialization debugging changes have not yet been tested.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Root cause: fullStateSync was replacing the entire gameView AFTER
openView had stored PlayerView references in the UI. This caused
the UI to hold orphaned PlayerView instances while delta sync
updated different instances.

Fix: When fullStateSync is called and gameView already exists,
use copyChangedProps() instead of replacing the gameView. This
preserves the existing PlayerView instances that the UI references.

Also includes:
- NetworkDebugLogger for comprehensive network debug logging
- Zone tracking and UI refresh after delta application
- Debug logging in CMatchUI for tracking PlayerView identity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add log level support to NetworkDebugLogger with separate console/file
verbosity. Console defaults to INFO (summaries only), file defaults to
DEBUG (full detail). Adds debug(), warn() methods and level configuration.

Re-categorized ~64 log calls: detailed tracing -> debug(), warnings about
missing objects -> warn(), keeping summaries and markers as log().

Updated BRANCH_DOCUMENTATION.md with debugging section.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Enhanced the chat system to provide clearer communication of server events:

System Message Styling:
- Added MessageType enum to ChatMessage (PLAYER, SYSTEM)
- Mobile: System messages displayed in blue bubbles, centered alignment
- Desktop: System messages prefixed with [SERVER] indicator
- Automatic detection: null source = system message

Host Identification:
- Host player's name appended with " (Host)" in all chat messages
- Applied to both host's own messages and relay to other players

Player Ready Notifications:
- Shows individual player ready status with count: "PlayerName is ready (2/4 players ready)"
- Broadcasts "All players ready! Starting game..." when all players ready
- Works for both local host and remote clients

Reconnection Countdown:
- 30-second interval countdown notifications during 5-minute timeout
- Format: "Waiting for [Player] to reconnect... (4:30 remaining)"
- Provides clear feedback on remaining time for reconnection

Game End Notifications:
- Announces winner: "Game ended. Winner: [PlayerName]" or "Game ended. Draw"
- Added "Returning to lobby..." notification after game ends
- Integrated with HostedMatch winner detection

Files modified:
- ChatMessage.java: Added message type system
- FServerManager.java: Ready notifications, countdown timer, winner announcement
- NetConnectUtil.java: Host indicator for local messages
- GameLobby.java: Added getHostedMatch() accessor
- FNetOverlay.java (desktop): [SERVER] prefix for system messages
- OnlineChatScreen.java (mobile): Blue styling for system messages
Updated BRANCH_DOCUMENTATION.md to include comprehensive documentation
of the chat notification improvements as Feature 3.

Documentation includes:
- Problem statement: poor visibility of server events
- Solution architecture: MessageType enum, visual styling, notifications
- Implementation details for all notification types
- Ready state tracking with player counts
- 30-second countdown timer for reconnection
- Game end winner announcements
- Host player identification
- Visual examples for both mobile and desktop platforms
- Updated files modified section

The chat enhancements complement the existing delta sync and reconnection
features by providing clear user feedback on all network play events.
…tions-EXd4y

Claude/improve chat notifications
Cleaned up the documentation to focus on current functionality rather
than development history:

Removed:
- Entire "Recent Changes (Post-Initial Commit)" section (123 lines)
  - Historical bug fixes and implementation details
  - Tracker initialization fixes
  - Session credential timing fixes
  - Change flag management details
  - Deserialization diagnostics details

Merged into main content:
- Compact binary serialization (now in Feature 1)
- Bandwidth monitoring (now in Feature 1)
- Performance metrics (99.8% reduction in CardView size)

Updated:
- Future Improvements: Removed completed items (bandwidth metrics, configurable logging)
- Known Limitations: Removed fixed items (CardStateView handling, verbose logging)

Result: Documentation is now 105 lines shorter and focuses on understanding
the current system architecture rather than historical implementation changes.
…tions-EXd4y

Claude/improve chat notifications e xd4y
Add AI takeover functionality when players fail to reconnect:
- Convert timed-out player slots to AI type instead of removing them
- Substitute player controller with AI controller to maintain game state
- Preserve player's hand, library, and battlefield under AI control
- Resume game automatically once all disconnected players have AI takeover

Add /skipreconnect chat command for host:
- Allows host to skip 5-minute reconnection timer
- Can target specific player by name: /skipreconnect <playerName>
- Or affects first disconnected player if no name specified
- Only accessible to host player (index 0)
- Immediately triggers AI takeover for the disconnected player

Implementation details:
- Added convertPlayerToAI() method to handle controller substitution
- Uses Player.dangerouslySetController() to swap in AI controller
- Creates LobbyPlayerAi and PlayerControllerAi for disconnected player
- Updates lobby slot to AI type with " (AI)" suffix
- Marks player as connected in session after AI takeover
- Broadcasts informative messages to all players

This enhancement prevents games from becoming unplayable when a player
disconnects and allows the game to continue gracefully with AI control.
Improve documentation organization and clarity:
- Move AI takeover content from middle of Session Management to its own section after GUI Instance Management
- Fix section numbering (PlayerSession was incorrectly numbered as '2.' after interrupted flow)
- Update Disconnect Handling diagram to show AI takeover path
- Update reconnection timeline example to show correct AI takeover message
- Rename subsections for better clarity (e.g., 'Chat Command Handling' -> 'Implementation: Chat Command Parsing')
- Group all AI takeover content together: overview, host command, implementation details

The AI takeover feature is now presented in a logical flow after explaining the normal reconnection process, making it easier to understand as an extension of timeout handling.
Add new subsection 'AI-Assisted Debugging with Log Files' explaining how to use NetworkDebugLogger output with Claude for troubleshooting:

- How to locate and share log files
- What information to provide along with logs
- What types of issues Claude can help diagnose
- Tips for better analysis results (DEBUG level, timestamps, both client/server logs)
- Privacy considerations when sharing logs

This helps users leverage AI assistance for complex network synchronization issues that are difficult to debug manually.
- Add AI takeover description to reconnection support feature
- Add 'Additional Resources' section linking to debugging
- Mention comprehensive debug logging with AI-assisted troubleshooting
Remove references to AI-assisted troubleshooting as this is obvious to developers:
- Simplified overview debugging link to just mention 'comprehensive debug logging'
- Removed entire 'AI-Assisted Debugging with Log Files' section including:
  - How to use logs with Claude
  - Example usage
  - Tips for better results
  - Privacy note

The debug logging system and its usage is sufficiently documented in the existing sections.
- Change heading to 'Potential Future Improvements'
- Remove spectator reconnection item
- Renumber remaining improvements
…er-Yzrli

Claude/networkplay ai takeover yzrli
Critical fixes implemented:

1. Thread Safety: Changed clients map from TreeMap to ConcurrentHashMap
   - TreeMap is not thread-safe and was accessed from Netty I/O threads
   - Could cause ConcurrentModificationException or corrupted state
   - Location: FServerManager.java:44

2. Race Condition: Fixed reconnection timeout cancellation
   - Used atomic computeIfPresent() instead of remove-then-cancel
   - Prevents race where timeout fires after successful reconnection
   - Could have caused players to be converted to AI after reconnecting
   - Locations: FServerManager.java:911-915, 1009-1013

3. Data Integrity: Added null check in delta serialization
   - Detects inconsistent state where changedProps exists but props is null
   - Prevents silent corruption where all properties serialize as null
   - Could cause client/server desynchronization
   - Location: DeltaSyncManager.java:291-298

All three issues could cause game state corruption or crashes in production.
Option 1: Remove Reflection from Hot Path (200-500% performance improvement)
-------------------------------------------------------------------------
- Changed TrackableObject.set() visibility from protected to public
- Replaced all reflection-based property setting with direct method calls
- Eliminated 100-1000x performance penalty of reflection in delta application
- Affected locations:
  * AbstractGuiGame.java:1426 - Main property setting (hot path)
  * AbstractGuiGame.java:1398-1413 - CardStateView property loop
  * AbstractGuiGame.java:1347, 1363, 1371, 1379 - CardState assignments

Performance Impact:
- Before: Method.invoke() called for EVERY property update (100-1000x slower)
- After: Direct obj.set() call (native speed)
- Expected: 200-500% faster delta packet application on clients

Option 4: Add Recursion Depth Limits (Prevent stack overflow crashes)
---------------------------------------------------------------------
- Added MAX_ATTACHMENT_DEPTH = 20 to prevent deep card attachment chains
- Added MAX_COLLECTION_SIZE = 1000 to prevent runaway collection iteration
- Updated collectCardDelta() to track recursion depth
- Added depth validation with warning logs when limits approached
- Added collection size validation for:
  * Battlefield cards
  * Zone cards (hand, graveyard, library, etc.)
  * Attached cards (equipment chains, auras)

Safety Impact:
- Prevents StackOverflowError from pathological game states
- Logs warnings when approaching limits for debugging
- Gracefully handles edge cases instead of crashing

Files Modified:
- forge-game/src/main/java/forge/trackable/TrackableObject.java
  * Made set() method public (line 68)

- forge-gui/src/main/java/forge/gamemodes/match/AbstractGuiGame.java
  * Removed 6 reflection calls, replaced with direct set() calls
  * Simplified CardStateView property application loop

- forge-gui/src/main/java/forge/gamemodes/net/server/DeltaSyncManager.java
  * Added MAX_ATTACHMENT_DEPTH and MAX_COLLECTION_SIZE constants
  * Updated collectCardDelta() signature to track depth
  * Added depth validation and collection size limits
  * Added warning logs for limit violations
Created GZIP compression utility class with:
- compress() and decompress() methods
- 512-byte threshold support
- Debug flag to disable compression
- Metrics tracking (compression ratios, poor compression logging)
- PacketData wrapper class

Note: Discovered LZ4 compression already exists in CompatibleObjectEncoder.
Evaluating whether to:
- Replace LZ4 with GZIP
- Enhance existing LZ4 with monitoring
- Add GZIP layer on top of LZ4

This file may be modified or removed based on final compression strategy.
Updates:
- Added section 7 (LZ4 Compression) to Delta Synchronization feature
- Documented that all network packets are automatically compressed via CompatibleObjectEncoder/Decoder
- Noted 60-75% compression ratio with 1-5ms overhead
- Calculated combined savings: ~97% reduction (delta sync + LZ4)
- Added CompatibleObjectEncoder/Decoder to Network Protocol file list
- Removed 'Compression' from Potential Future Improvements (already implemented)
Discovered that LZ4 compression is already implemented in CompatibleObjectEncoder/Decoder.
All network packets (including DeltaPacket and FullStatePacket) are automatically
compressed with LZ4, providing 60-75% bandwidth reduction.

No additional compression layer needed - removing the GZIP utility that was created
before discovering the existing LZ4 implementation.
…ion-Yzrli

Document existing LZ4 compression and remove from future improvements
Created DeltaSyncQuickTest.java - a simple, fast verification tool that demonstrates
delta sync bandwidth savings without requiring the full build environment.

Features:
- Simulates 300-card game state
- Tests common scenarios (tap card, draw card, combat)
- Simulates full 50-turn game with 200 updates
- Calculates bandwidth savings percentages
- Runs in ~3 seconds with javac/java

Results demonstrate:
- Single action updates: 99.5-99.97% bandwidth reduction
- Full game simulation: 99.94% reduction (12.4MB → 8KB)
- Combined with LZ4: ~97-99% total bandwidth savings

Usage: javac DeltaSyncQuickTest.java && java DeltaSyncQuickTest
Updated the Delta Synchronization description in the Overview to include
specific bandwidth reduction metrics:
- Mentions ~97-99% bandwidth reduction
- Provides concrete example: 12.4MB → 80KB for typical game
- Highlights combined effect of delta sync + LZ4 compression

This gives readers immediate understanding of the performance improvement
without needing to read the full technical details.
…ession-Yzrli

Claude/update overview compression yzrli
@squee1968
Copy link
Contributor

More broadly, would it be helpful if I move the testing infrastructure into a separate PR so the testing conducted here is on production code only? That would reduce the size of this PR by ~30 files and ~10,000 lines of code.

IMHO, A resounding YES!!! In addition, I'd also move the reconnect, chat, and GUI commits into individual PRs as well. as this all is just a bit much to put onto the dev's plate all at once.

@MostCromulent
Copy link
Contributor Author

IMHO, A resounding YES!!! In addition, I'd also move the reconnect, chat, and GUI commits into individual PRs as well. as this all is just a bit much to put onto the dev's plate all at once.

No problem, whatever I can do to make things easier for you.

Given some features like reconnect support and UI fixes are currently dependent on delta synchronisation is it okay if when I move them to separate PRs I maintain that dependency (i.e. assume a phased implementation with delta sync merging first / those features cannot operate independently)?

I can always revisit refactoring those features based on future dev feedback, but it just means for now I will be able to split things up quickly without needing to do immediate substantial refactoring.

@squee1968
Copy link
Contributor

squee1968 commented Feb 2, 2026

Given some features like reconnect support and UI fixes are currently dependent on delta synchronisation is it okay if when I move them to separate PRs I maintain that dependency (i.e. assume a phased implementation with delta sync merging first / those features cannot operate independently)?

I'm sure that would be fine, but wait for input from the devs for further input/confirmation. I'm only a contributor.

MostCromulent and others added 2 commits February 2, 2026 21:42
- Add null guard for getGameView() in NetworkGuiGame.setGameView()
- Add @OverRide annotation to AbstractGuiGame.updateDependencies()
- Add toString() to GameView for better debugging/logging

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add null checks with fallback colors for FSkin.getColor() to prevent
NullPointerException when skin is not initialized (e.g., in tests).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@tool4ever
Copy link
Contributor

Sigh, a few tests still seem to timeout here...
I have also attempted to run them locally and the result doesn't change, so blaming GitHub seems like a shot in the dark :/
Most worrying is probably this one:

NetworkOptimizationTest.testDeltaVsFullStateComparison:593 Delta size should be less than full state with ObjectOutputStream overhead expected [true] but found [false]

Anyway I've already spent 2+ hours on reviewing/testing the yield PR which is way smaller in size and right now that one still feels like it's only around 50% yet

A quick glance at the test framework does make it look like an overengineered monster, however just ripping all tests out will not really help us in tracking this branch correctness

While to me the reconnect dependency doesn't look that tight maybe try the chat stuff first. Anything to reduce the size of this 🙏

AutomatedNetworkTest runs 12+ full MTG game simulations (50-120s each),
contributing to CI runs exceeding 25 minutes. Added
skipUnlessStressTestsEnabled() matching the BatchGameTest pattern.

Tests now skip in CI unless -Drun.stress.tests=true is set.

Related to CI Run 56126261919 timeout issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MostCromulent
Copy link
Contributor Author

MostCromulent commented Feb 2, 2026

Claude AI's analysis of test build errors:

Test runtime

The root cause of the extended CI runtime is that AutomatedNetworkTest runs 12+ full MTG game simulations in CI, each taking 50-120+ seconds. These are stress/integration tests, not unit tests. BatchGameTest was recently updated with the skipUnlessStressTestsEnabled()), but AutomatedNetworkTest was missing it.

Fix applied: Added skipUnlessStressTestsEnabled() to AutomatedNetworkTest - all game-running tests will now skip in CI unless -Drun.stress.tests=true is set.

With this change only tests that should run automatically are unit tests, not full games.

AutomatedNetworkTest.testWithSpecificPrecons - FAILED (120.0s timeout)

Error: ThreadTimeoutException - game didn't complete within 120s

Analysis: The game log shows it was actively running (Turn 16) when the timeout hit. Some precon deck matchups with complex AI decisions simply take longer than 120 seconds.

Fix applied: this test is now skipped in CI per the above.

NetworkOptimizationTest.testDeltaVsFullStateComparison

Error: AssertionError: Delta size (328) should be smaller than full state size (~320).

Analysis reveals fundamental conceptual flaw with test logic. It compares:

  • Delta: DeltaPacket with 300 bytes of data → getApproximateSize() returns ~328 bytes (includes 8+8+4 bytes for sequence/timestamp/checksum + object ID overhead)
  • Full State: A raw byte[300] array serialized via ObjectOutputStream → produces ~320 bytes

These are incomparable measurements. The test doesn't compare delta vs actual GameView serialization - it compares a structured packet vs primative byte array serialization. The assertion deltaSize < fullStateSize fails because the delta packet has legitimate protocol overhead.

A real GameView used in network play contains all players' hands, libraries, graveyards, battlefields, all card states (tapped, counters, attachments, damage), combat state, stack, mana pools, and phase/turn information

Game logging from the same CI run shows delta sync working correctly:

2026-02-02T12:53:19.3267583Z [12:53:19.270] [INFO] [DeltaSync] Packet #44: Approximate=63 bytes, ActualNetwork=647 bytes, FullState=114151 bytes
2026-02-02T12:53:19.3268896Z [12:53:19.270] [INFO] [DeltaSync]   Savings: Approximate=99%, Actual=99% | Cumulative: Approximate=88919, Actual=1117913, FullState=4737904                            
...                                                                               
2026-02-02T12:53:23.0237461Z [12:53:22.833] [INFO] [DeltaSync] Packet #170: Approximate=585 bytes, ActualNetwork=1055 bytes, FullState=116245 bytes
2026-02-02T12:53:23.0238231Z [12:53:22.833] [INFO] [DeltaSync]   Savings: Approximate=99%, Actual=99% | Cumulative: Approximate=104238, Actual=4796725, FullState=19079924

Recommendation: testDeltaVsFullStateComparison should be removed, it doesn't test real behaviour.

(Note: I have not removed the test at this stage; wanted to provide this analysis for full transparency before taking any action.)

@MostCromulent MostCromulent changed the title Network multiplayer optimization (delta synchronization, reconnect support, chat and UI improvements) Network multiplayer optimization (delta synchronization, reconnect support, and UI improvements) Feb 2, 2026
Revert ChatMessage.java and FNetOverlay.java to master versions.
Chat styling features (system message coloring, MessageType enum) are
now isolated in NetworkPlay/chat for independent merge to master.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MostCromulent
Copy link
Contributor Author

MostCromulent commented Feb 2, 2026

A quick glance at the test framework does make it look like an overengineered monster

Fully accept this. I'm going to have a look at consolidating/simplifying the test infrastructure. Keeping it streamlined wasn't a real focus when I started development on this and I accept it's gotten very bloated and is making PR review harder. Claude's preliminary analysis is that we can consolidate testing infrastructure from ~30 files and ~11,000 lines of code down to about ~20 files and about ~6,000 without losing functionality, so I will give that a crack.

Edit/Question:
Since I'm reconfiguring this, do you want it set up so that it just runs unit tests without any full games in CI environment? (i.e. applies the skipUnlessStressTestsEnabled() fix recently added to Batch Tests), or would it be useful for it to run a single 1v1 AI game on network infrastructure so you have visibility over the debug logs and the delta vs fullstate comparison in CI?

@MostCromulent
Copy link
Contributor Author

MostCromulent commented Feb 2, 2026

While to me the reconnect dependency doesn't look that tight [...]

Anything to reduce the size of this 🙏

I have looked into the dependencies further and will refactor so that reconnect support (Edit: and UI improvements) can be directly integrated into current network protocol without dependency on delta sync. Reconnect relies on sending full game state (which is what current network protocol does by default), so in some ways it is actually simpler.

Once refactor is done I will remove from this branch and into a separate PR.

TO DO:

  • Consolidate/simplify testing infrastructure
  • Refactor reconnect support to remove delta sync dependency, move to separate PR
  • Refactor UI improvements to remove delta sync dependency, move to separate PR

@tool4ever
Copy link
Contributor

tool4ever commented Feb 3, 2026

TO DO:

  • Consolidate/simplify testing infrastructure
  • Refactor reconnect support to remove delta sync dependency, move to separate PR
  • Refactor UI improvements to remove delta sync dependency, move to separate PR

these sound good

testDeltaVsFullStateComparison

agree with the analysis, test was actually less meaningful than the name implied

Since I'm reconfiguring this, do you want it set up so that it just runs unit tests without any full games in CI environment? (i.e. applies the skipUnlessStressTestsEnabled() fix recently added to Batch Tests), or would it be useful for it to run a single 1v1 AI game on network infrastructure so you have visibility over the debug logs and the delta vs fullstate comparison in CI?

having a single game played to somewhat test netplay as a whole still works seems reasonable, I'd imagine if the players get very small basic lands only decks it would just add a few seconds extra
(then the more complex stuff can stay behind that switch)

MostCromulent and others added 10 commits February 4, 2026 20:58
…s reduced

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Results files now use network-debug-{batchId}-results.md pattern to
keep them grouped with their corresponding log files in directory
listings. Removes the old prefix-based naming (comprehensive-test-results,
quick-test-results, analysis-results) in favor of consistent batchId-based
naming.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move skipUnlessStressTestsEnabled() from @BeforeClass to individual
stress test methods. This allows unit tests (deck loader, configuration,
server start/stop) and testTrueNetworkTraffic to run in default CI builds
while batch, parallel, and comprehensive tests still require
-Drun.stress.tests=true.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
UnifiedNetworkHarness already logs game results, so remove redundant
logging from NetworkPlayIntegrationTest test methods.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Separates UI improvements into NetworkPlay/ui branch.
Removes from NetworkPlay/main:
- Waiting timer with player name display (CMatchUI, InputLockUI)
- Connection error detail messages (NetConnectUtil, lobby screens)
- CONN_ERROR_PREFIX constant and related localization strings

This keeps NetworkPlay/main focused on delta sync functionality only.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MostCromulent MostCromulent changed the title Network multiplayer optimization (delta synchronization, reconnect support, and UI improvements) Network multiplayer optimization (delta synchronization, reconnect support) Feb 5, 2026
Complete removal of all reconnection infrastructure. Delta sync and
normal network play are fully preserved.

Server (FServerManager): Remove ~500 lines - session management, AI
takeover, reconnection timeout/broadcast, host commands, game pause
events. Simplify DeregisterClientHandler and RegisterClientHandler.
Add clearPlayerGuis() for test harness cleanup.

Client (FGameClient, GameClientHandler): Remove session credentials,
reconnection state fields, reconnect request/accepted handling.
Simplify channelActive() to normal login path only.

Protocol/Interface: Remove gamePaused, gameResumed, reconnectAccepted,
reconnectRejected, reconnectRequest from ProtocolMethod, IGuiGame,
IGameController, and all implementations (AbstractGuiGame,
NetworkGuiGame, PlayerControllerHuman, NetGameController).

Other: Simplify FullStatePacket (remove reconnect fields), clean up
NetGuiGame (remove updateClient/sendFullStateForReconnect), remove
host command handling from NetConnectUtil, remove createGameSession
from ServerGameLobby and GameLobby.

Delete files: GameSession.java, PlayerSession.java,
ReconnectRequestEvent.java, ReconnectRejectedEvent.java.

Fix delta sync initialization: Add sendFullState() call to
NetGuiGame.openView() - the removed sendSessionCredentials() had
been performing this initialization, without which all game updates
fell back to full setGameView instead of delta packets.

Tests: Remove 11 obsolete reconnection unit tests from
DeltaSyncUnitTest (28 remain, all passing). Update integration test
parallelBatchSize from 3 to 10. Validated with 10/10 games passing,
97.9% bandwidth savings, 0 checksum mismatches.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants