Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Thermostat cluster SetpointRaiseLower sometimes fails when mode is Both #31214

Open
Emill opened this issue Jan 2, 2024 · 2 comments
Open
Labels
app-clusters Application cluster work bug Something isn't working hvac needs triage spec Mismatch between spec and implementation

Comments

@Emill
Copy link

Emill commented Jan 2, 2024

Reproduction steps

  1. Start chip-all-clusters-app or thermostat-app.
  2. Pair chip-tool.
  3. Read the OccupiedHeatingSetpoint and OccupiedCoolingSetpoint attributes (optional):
$ ./chip-tool thermostat read occupied-heating-setpoint 3 1
...
Data = 2000
...
$ ./chip-tool thermostat read occupied-cooling-setpoint 3 1
...
Data = 2600
...

These correspond to that the heating should be turned on when the temperature gets below 20.00 degrees Celsius and cooling should be turned on when the temperature gets above 26.00 degrees Celcius.

  1. Try to lower both setpoints by a big value, like -12.7 degrees:
$ ./chip-tool thermostat setpoint-raise-lower 2 -127 3 1
...
status = 0x01 (FAILURE)
...

The server app outputs:

[1704206104.716113][243400:243400] CHIP:DMG: Received command for Endpoint=1 Cluster=0x0000_0201 Command=0x0000_0000
[1704206104.716131][243400:243400] CHIP:ZCL: Error: SetOccupiedCoolingSetpoint failed!
[1704206104.716139][243400:243400] CHIP:DMG: Endpoint 1, Cluster 0x0000_0201 update version to b09ba7c5
[1704206104.716144][243400:243400] CHIP:DMG: Endpoint=1 Cluster=0x0000_0201 Command=0x0000_0000 status 0x01 (FAILURE) (no additional context)

This failure was discovered when trying to execute the test TC-TSTAT-3.2 by first increasing Both values by 12.7 degrees and then decreasing Both values by 12.7 degrees against the test harness.

The failure occurs in thermostat-server.cpp:

    case OccupiedCoolingSetpoint::Id: {
        requested = static_cast<int16_t>(chip::Encoding::LittleEndian::Get16(value));
        if (!CoolSupported)
            return imcode::UnsupportedAttribute;
        if (requested < AbsMinCoolSetpointLimit || requested < MinCoolSetpointLimit || requested > AbsMaxCoolSetpointLimit ||
            requested > MaxCoolSetpointLimit)
            return imcode::InvalidValue;
        if (AutoSupported)
        {
            if (requested < OccupiedHeatingSetpoint + DeadBandTemp)
                return imcode::InvalidValue; // <------------------- This condition is triggered
        }
        return imcode::Success;
    }

and

    WriteCoolingSetpointStatus = OccupiedCoolingSetpoint::Set(aEndpointId, DesiredCoolingSetpoint);
    if (WriteCoolingSetpointStatus != EMBER_ZCL_STATUS_SUCCESS)
    {
        ChipLogError(Zcl, "Error: SetOccupiedCoolingSetpoint failed!");
    }
    WriteHeatingSetpointStatus = OccupiedHeatingSetpoint::Set(aEndpointId, DesiredHeatingSetpoint);
    if (WriteHeatingSetpointStatus != EMBER_ZCL_STATUS_SUCCESS)
    {
        ChipLogError(Zcl, "Error: SetOccupiedHeatingSetpoint failed!");
    }

The problem is that it first tries to write the cooling attribute followed by the heating attribute in a non-atomic fashion. When both values are to be lowered at the same time, the condition that fails (requested < OccupiedHeatingSetpoint + DeadBandTemp) for the cooling setpoint should not be tested until both values have been written. Alternatively, I think it will also work if the attributes are written in reverse order when we decrease the values.

Note that in order for the bug to trigger, initial values for the two attributes must be such that the condition will trigger. The default start-up values for the two example apps will make the bug trigger.

Also note that the second attribute will be written even if the first write failed, which is a bit strange.

Bug prevalence

Always

GitHub hash of the SDK that was being used

6ea3c34

Platform

core

Platform Version(s)

No response

Anything else?

No response

@Emill Emill added bug Something isn't working needs triage labels Jan 2, 2024
@bzbarsky-apple bzbarsky-apple added app-clusters Application cluster work hvac spec Mismatch between spec and implementation labels Jan 2, 2024
@tcarmelveilleux
Copy link
Contributor

@bzbarsky-apple Why is the spec tag added?

@bzbarsky-apple
Copy link
Contributor

@tcarmelveilleux Because the behavior this bug causes is not spec-compliant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app-clusters Application cluster work bug Something isn't working hvac needs triage spec Mismatch between spec and implementation
Projects
Status: Todo
Development

No branches or pull requests

3 participants