Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CFE_MISSION_ES_MAX_SHELL_PKT Causes ES Error When Unsigned #335

Closed
skliper opened this issue Sep 30, 2019 · 8 comments · Fixed by #584
Closed

CFE_MISSION_ES_MAX_SHELL_PKT Causes ES Error When Unsigned #335

skliper opened this issue Sep 30, 2019 · 8 comments · Fixed by #584
Assignees
Labels
bug good first issue Good for newcomers
Milestone

Comments

@skliper
Copy link
Contributor

skliper commented Sep 30, 2019

When CFE_MISSION_ES_MAX_SHELL_PKT is defined as an unsigned value (2000u, for example) and a shell command is executed that results in a shell output file of less than CFE_MISSION_ES_MAX_SHELL_PKT, ES gets stuck in a loop of sending millions of ES Shell Telemetry packets.

This error/loop is in the int32 CFE_ES_ShellOutputCommand(const char * CmdString, const char *Filename) function. I think it is caused by the FileSize variable (int32) getting converted to an uint32 and under-flowing.

A workaround to this is to keep CFE_MISSION_ES_MAX_SHELL_PKT defined as a signed integer, but I think that the fix should be for ES to fix this under-flow. The CFE_MISSION_ES_MAX_SHELL_PKT should be able to be thought of/defined as an unsigned value.

@skliper skliper self-assigned this Sep 30, 2019
@skliper
Copy link
Contributor Author

skliper commented Sep 30, 2019

Imported from trac issue 304. Created by krmoore2 on 2019-06-18T17:21:15, last modified: 2019-06-24T09:46:32

@skliper skliper added the bug label Sep 30, 2019
@skliper skliper removed their assignment Sep 30, 2019
@skliper
Copy link
Contributor Author

skliper commented Nov 1, 2019

Problem is in:

for (CurrFilePtr=0; CurrFilePtr < (FileSize - CFE_MISSION_ES_MAX_SHELL_PKT); CurrFilePtr += CFE_MISSION_ES_MAX_SHELL_PKT)
{
OS_read(fd, CFE_ES_TaskData.ShellPacket.Payload.ShellOutput, CFE_MISSION_ES_MAX_SHELL_PKT);
/* Send the packet */
CFE_SB_TimeStampMsg((CFE_SB_Msg_t *) &CFE_ES_TaskData.ShellPacket);
CFE_SB_SendMsg((CFE_SB_Msg_t *) &CFE_ES_TaskData.ShellPacket);
/* delay to not flood the pipe on large messages */
OS_TaskDelay(CFE_PLATFORM_ES_SHELL_OS_DELAY_MILLISEC);
}

Where if filesize is < CFE_MISSION_ES_MAX_SHELL_PKT, and it's set as unsigned the value will return a big number instead of going negative (to be less than CurrFilePtr)

To fix:

(CurrFilePtr+CFE_MISSION_ES_MAX_SHELL_PKT) < FileSize

but also recommend not explicitly setting #defines as unsigned unless explicitly necessary, I doubt the code has been scrubbed carefully for potential underflows.

skliper added a commit that referenced this issue Nov 1, 2019
@skliper
Copy link
Contributor Author

skliper commented Nov 1, 2019

Need to test commit

  1. Nominal build, send shell command and confirm 1 packet is sent
  2. assign CFE_MISSION_ES_MAX_SHELL_PKT as 200u and rebuild, send short shell command and confirm 1 packet is sent

@skliper skliper added the good first issue Good for newcomers label Nov 1, 2019
@jphickey
Copy link
Contributor

jphickey commented Nov 1, 2019

I would concur with the fix, although not 100% convinced it was a bug because it was really a user configuration error. But the suggested change would make it (slightly) more robust so it is probably worth changing. There are lots of other problems with the shell output code, though (like not checking OS_read return code, for instance).

We should have something in the CFE coding standards that says one should NOT be adding "u" or "l" suffixes to literals, generally. This is a prime example of an unintended consequence of this practice. The literal type suffixes tend to create more problems then they solve unless under very specific circumstances when they are necessary.

A while back we had to remove "l" suffixes on all the cfe_error.h constants because it broke on 64 bit.

@skliper
Copy link
Contributor Author

skliper commented Nov 4, 2019

@krmoore FYI

@krmoore
Copy link

krmoore commented Nov 6, 2019

How did those suffixes break on 64-bit? I don't understand why using them would be a bad thing. There's no use case for the CFE_MISSION_ES_MAX_SHELL_PKT configuration to ever be negative. It's an inherently unsigned value, so it should be treated as such, right?

@skliper
Copy link
Contributor Author

skliper commented Nov 6, 2019

It's not a 64-bit thing, it's coercion (implicit conversion)/integer promotion (see c99 rule 6.3.1.1). The result of subtracting two unsigned numbers is always positive, and FileSize gets promoted to unsigned since CFE_MISSION_ES_MAX_SHELL_PKT is unsigned.

@jphickey
Copy link
Contributor

jphickey commented Nov 6, 2019

It is correct that the CFE_MISSION_ES_MAX_SHELL_PKT value itself should never be negative, but that doesn't mean that it isn't used in a signed computation/expression.

The problem when adding these "u" (and "l") suffixes to literals is that it is not limited to the literal only. It is basically also forcing any expression containing the value to also be unsigned, due to promotion rules. So in this example even though the code was (correctly) using an int32 local variable type to perform the computation, it is because of the extra/unexpected/unnecessary "u" suffix on the literal this was forcing the entire expression to be evaluated as unsigned, and therefore the loop stop condition was never met.

As to why an "L" suffix can change behavior unexpectedly, consider this example:

int32_t value = -1;

if (value == 0xFFFFFFFF)
   /* do something */

if (value == 0xFFFFFFFFL)
   /* do something */

Although it could be argued this is poor code style. but the first "if" is going to evaluate "true" for any foreseeable machine type (Assuming that the machine has a 32-bit integer type and that the respective UINT32_MAX is 0xFFFFFFFF). The implicit conversions dictated by C say that both sides of the equality are promoted to unsigned before evaluation, and the way to do that conversion is by adding (UINT32_MAX+1) to the LHS. The result is that the expression will be true, on both 32-bit and 64-bit machines.

But by adding a superfluous "L" as in the second example, it forces both sides to be promoted to "long" instead. So the result is that the expression is false on any machine where long is bigger than 32 bits.

CFE basically does this exact thing everywhere that error codes are checked, because the error constants are defined as a hex literal.

The reality is that the implicit type rules for literals as dictated by the C standard are well defined and will achieve the desired/expected intent in the vast majority of cases. The suffixes exist only for weird corner cases where the defined rules do not work. But for portable code, literal suffixes are more likely to do harm than good.

@skliper skliper self-assigned this Apr 2, 2020
@skliper skliper added this to the 6.8.0 milestone Apr 2, 2020
astrogeco added a commit that referenced this issue Apr 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants