Skip to content

Serial monitor output cut short with multibyte (UTF-8) data #9808

Closed
@matthijskooijman

Description

@matthijskooijman

I noticed an issue with the serial monitor, where some data might get stuck and not be displayed in the monitor as expected.

To reproduce, run the following sketch (probably works on any native USB Arduino, I tried on an STM32 board which I had at hand):

void setup() {
  Serial.begin(9600);
  for (uint8_t i = 0; i < 12; ++i) {
    Serial.print("0123456789");
  }
  Serial.print("°0123456789");
  delay(10000);
  Serial.print("X");
}

void loop() { }

This prints just over 128 bytes of data, with a multi-byte character in the first 128 bytes. Then it waits a bit, and prints one more byte.

When running this, the serial monitor will first print:

012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789°0123456

Then, after 10 seconds, it adds the missing part plus the X:

789X

For this to occur, the Arduino must send its data quite fast (hence the native USB requirement). I reproduced this once on a Uno at 2Mbps, but could not reliably do that.

The problem is in the code that converts data from bytes to chars:

byte[] buf = port.readBytes(serialEvent.getEventValue());
int next = 0;
while(next < buf.length) {
while(next < buf.length && outToMessage.hasRemaining()) {
int spaceInIn = inFromSerial.remaining();
int copyNow = buf.length - next < spaceInIn ? buf.length - next : spaceInIn;
inFromSerial.put(buf, next, copyNow);
next += copyNow;
inFromSerial.flip();
bytesToStrings.decode(inFromSerial, outToMessage, false);
inFromSerial.compact();
}
outToMessage.flip();
if(outToMessage.hasRemaining()) {
char[] chars = new char[outToMessage.remaining()];
outToMessage.get(chars);
message(chars, chars.length);
}
outToMessage.clear();

This code reads an arbitrary amount of data using port.readBytes into buf. It then puts as much bytes as possible into inFromSerial, which is then converted to chars which are stored into outToMessage, continuing until outToMessage is full, which is then passed on to the serial monitor, and the entire thing is continued until buf is fully read.

However, in the case simulated with this sketch:

  • buf contains 132 bytes (12 x 01234567890 + °0123456789, not that ° is 2 bytes).
  • On the first inner loop, 128 bytes are read into inFromSerial, which are decoded into 127 characters in outToMessage.
  • On the second inner loop, the remaining 4 bytes are read into inFromSerial, but just one is decoded into outToMessage because then outToMessage is full.
  • These first 128 characters are sent to the serial monitor, as shown above. The last three bytes/characters (789) are still in inFromSerial.
  • The outer loop checks next < buf.length, which is false (since everything has been read from buf, so processing stops.
  • When later the X is sent, that is copied into inFromSerial as well, where the 789 are still waiting, and everything together is decoded and displayed.

To fix this, processing should probably just continue until inFromSerial is empty (or actually until no bytes can be read anymore, since there might be a partial UTF-8 character left in inFromSerial that must be left there until the rest is received).

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions