fix: disambiguate select_action_from_mjai with tsumogiri/consumed (#195)#201
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes incorrect/ambiguous action selection in select_action_from_mjai by incorporating MJAI’s tsumogiri and consumed fields, and by introducing drawn_tile into observations to disambiguate discard choices.
Changes:
- Add
drawn_tile: Option<u8>toObservation/Observation3Pwith serde defaults for snapshot backward compatibility. - Centralize MJAI message parsing + legal-action selection logic in a shared helper, honoring
tsumogiri(discard) andconsumed(calls/kans). - Add parity tests for tsumogiri-based discard selection and consumed-based pon selection (4P).
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/test_mjai_parity.py | Adds regression tests for tsumogiri-disambiguated discards and consumed-disambiguated pon selection. |
| riichienv-core/src/state/mod.rs | Plumbs drawn_tile into 4P observations created from game state. |
| riichienv-core/src/state_3p/mod.rs | Plumbs drawn_tile into 3P observations created from game state. |
| riichienv-core/src/observation/mod.rs | Adds drawn_tile field and exposes the new mjai_select helper module under the python feature. |
| riichienv-core/src/observation/python.rs | Switches 4P select_action_from_mjai to use the shared MJAI parsing/selection helper. |
| riichienv-core/src/observation/mjai_select.rs | New shared MJAI parsing + action-selection logic honoring tsumogiri and consumed. |
| riichienv-core/src/observation/encode.rs | Updates constructor-based tests to pass the new drawn_tile parameter. |
| riichienv-core/src/observation_3p/mod.rs | Adds drawn_tile field and threads it through the 3P observation constructor. |
| riichienv-core/src/observation_3p/python.rs | Replaces bespoke 3P MJAI selection logic with the shared helper and removes unreachable chi handling. |
| riichienv-core/src/observation_3p/encode.rs | Updates constructor-based tests to pass the new drawn_tile parameter. |
Comments suppressed due to low confidence (1)
riichienv-core/src/observation_3p/python.rs:144
- The 3P
select_action_from_mjaipath was rewritten to use the shared disambiguation logic (tsumogiri/consumed/drawn_tile), but the new regression tests added in this PR cover only 4P. Please add analogous sanma tests (e.g. intests/env/test_sanma.py) that exercise tsumogiri-based discard selection and consumed-based pon/ankan selection in 3P as well, to ensure the fix for #195 doesn’t regress for Observation3P.
pub fn select_action_from_mjai(&self, mjai_data: &Bound<'_, PyAny>) -> Option<Action3P> {
use crate::observation::mjai_select::{parse_mjai_message, select_action};
let parsed = parse_mjai_message(mjai_data)?;
let inner_actions: Vec<Action> =
self._legal_actions.iter().map(|a| (**a).clone()).collect();
let selected = select_action(&inner_actions, &parsed, self.drawn_tile, true)?;
// Re-find the corresponding Action3P instance to return.
self._legal_actions
.iter()
.find(|a| {
a.action_type == selected.action_type
&& a.tile == selected.tile
&& a.consume_tiles == selected.consume_tiles
})
.cloned()
}
#[pyo3(name = "new_events")]
pub fn new_events_py(&self) -> Vec<String> {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Special-case Discard: filter by mjai pai then disambiguate via tsumogiri. | ||
| if tt == ActionType::Discard { | ||
| let candidates: Vec<&Action> = legal_actions | ||
| .iter() | ||
| .filter(|a| { | ||
| a.action_type == ActionType::Discard | ||
| && a.tile.is_some_and(|t| tid_to_mjai(t) == parsed.tile_str) |
There was a problem hiding this comment.
In the Discard special-case, candidates are filtered strictly by parsed.tile_str. If the MJAI dahai message is missing pai/tile (so tile_str is empty), this returns None because candidates becomes empty; the previous logic would fall back to the first matching Discard when tile_str was empty. Consider restoring that fallback by treating an empty tile_str as “match any Discard”, and then applying the tsumogiri preference when possible.
| // Special-case Discard: filter by mjai pai then disambiguate via tsumogiri. | |
| if tt == ActionType::Discard { | |
| let candidates: Vec<&Action> = legal_actions | |
| .iter() | |
| .filter(|a| { | |
| a.action_type == ActionType::Discard | |
| && a.tile.is_some_and(|t| tid_to_mjai(t) == parsed.tile_str) | |
| // Special-case Discard: if mjai pai is omitted, match any discard, then | |
| // disambiguate via tsumogiri when possible. | |
| if tt == ActionType::Discard { | |
| let candidates: Vec<&Action> = legal_actions | |
| .iter() | |
| .filter(|a| { | |
| a.action_type == ActionType::Discard | |
| && (parsed.tile_str.is_empty() | |
| || a.tile.is_some_and(|t| tid_to_mjai(t) == parsed.tile_str)) |
There was a problem hiding this comment.
Thanks for catching this! You're right that an empty tile_str is not a well-formed dahai message, and silently accepting it does mean the caller never finds out their payload is wrong. However, suddenly tightening this contract would break other systems that currently rely on the previous lenient behavior. To avoid that risk, I'll preserve backward compatibility and take your suggestion as-is.
I've also added a NOTE comment in the code documenting that the malformed-but-tolerated case is intentional, so future readers don't mistake it for an oversight. 2273e71
Fixes #195.
Observation.select_action_from_mjaiandObservation3P.select_action_from_mjaimatched legal actions using onlythe Mjai
typeandpaifields, which produced incorrect orinconsistent results in two scenarios.
Changes
drawn_tiletoObservation/Observation3P(back-compatiblewith existing base64 snapshots) and centralizes the Mjai-to-
Actionselection logic so that
tsumogiriandconsumedare honored.chibranch from the 3Ptarget_typematchfor tidiness.
Backward compatibility
serialize_to_base64/deserialize_from_base64round-trip remainsintact in both directions:
drawn_tileis filled withNonethanks to#[serde(default)].drawn_tilefield is ignoredby serde's default behavior.
select_action_from_mjaithat omittsumogiri/consumed, or that target anObservationwhosedrawn_tileisNone, transparently fall back to the previous first-match behavior.Observation(...)/Observation3P(...)constructorsaccept the same positional argument list as before;
drawn_tiledefaults to
None.