Skip to content

SL-18330: Send binary LLSD to LEAP plugins; extend LLSD compatibility. #29

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Feb 22, 2023

Conversation

nat-goodspeed
Copy link
Contributor

The assertion in a python-llsd pull request is that parsing large LLSD content is much faster with binary serialization than notation. Since LEAP plugins sometimes request large content, make LLLeap send binary LLSD, instead of notation, to the LEAP plugin child process.

Also improve compatibility with MIME-type headers emitted by secondlife/python-llsd.

The self-tests introduced by this change depend on secondlife/python-llsd#4 .

LLSDSerialize::serialize() emits a header string, e.g. "<? llsd/notation ?>"
for notation format. Until now, LLSDSerialize::deserialize() has required that
header to properly decode the input stream.

But none of LLSDBinaryFormatter, LLSDXMLFormatter or LLSDNotationFormatter
emit that header themselves. Nor do any of the Python llsd.format_binary(),
format_xml() or format_notation() functions. Until now, you could not use
LLSD::deserialize() to parse an arbitrary-format LLSD stream serialized by
anything but LLSDSerialize::serialize().

Change LLSDSerialize::deserialize() so that if no header is recognized,
instead of failing, it attempts to parse as notation. Add tests to exercise
this case.

The tricky part about this processing is that deserialize() necessarily reads
some number of bytes from the input stream first, to try to recognize the
header. If it fails to do so, it must prepend the bytes it has already read to
the rest of the input stream since they're probably the beginning of the
serialized data.

To support this use case, introduce cat_streambuf, a std::streambuf subclass
that (virtually) concatenates other std::streambuf instances. When read by a
std::istream, the sequence of underlying std::streambufs appears to the
consumer as a single continuous stream.
Absent a header from LLSDSerialize::serialize(), make deserialize()
distinguish between XML or notation by recognizing an initial '<'.
Since parsing binary LLSD is faster than parsing notation LLSD, send data from
the viewer to the LEAP plugin child process's stdin in binary instead of
notation.

Similarly, instead of parsing the child process's stdout using specifically a
notation parser, use the generic LLSDSerialize::deserialize() LLSD parser.

Add more LLSDSerialize Python compatibility tests.
When sending multiple LEAP packets in the same file (for testing convenience),
use a length prefix instead of delimiting with '\n'. Now that we allow a
serialization format that includes an LLSD format header (e.g.
"<?llsd/binary?>"), '\n' is part of the packet content. But in fact, testing
binary LLSD means we can't pick any delimiter guaranteed not to appear in the
packet content.

Using a length prefix also lets us pass a specific max_bytes to the subject
C++ LLSD parser.

Make llleap_test.cpp use new freestanding Python llsd package when available.
Update Python-side LEAP protocol code to work directly with encoded bytes
stream, avoiding bytes<->str encoding and decoding, which breaks binary LLSD.

Make LLSDSerialize::deserialize() recognize LLSD format header case-
insensitively. Python emits and checks for "llsd/binary", while LLSDSerialize
emits and checks for "LLSD/Binary". Once any of the headers is recognized,
pass corrected max_bytes to the specific parser.

Make deserialize() more careful about the no-header case: preserve '\n' in
content. Introduce debugging code (disabled) because it's a little tricky to
recreate.

Revert LLLeap child process stdout parser from LLSDSerialize::deserialize() to
the specific LLSDNotationParser(), as at present: the generic parser fails one
of LLLeap's integration tests for reasons that remain mysterious.
mBuffer(1024)
{}

int underflow();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly with override specifier?

}
header = hdr_buf;
header.erase(lastchar+1);
}
Copy link
Contributor

@AndrewMeadows AndrewMeadows Jan 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to have more than one '\n' or '\r' right at the end, right? This only deletes on of them. It seems to me you would want to delete them all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really excellent point! I'm glad you raised it.

erase(iterator) calls the overload that erases only a single character.

erase(size_type) calls the (size_type, count) overload whose default count erases to the end of the string. That's what I intend, and that's what I'm calling here, because find_last_not_of() returns size_type.

But the confusion is perfectly natural given that I declared auto lastchar = ... In order to distinguish the erase() overloads, the reader must either remember or look up the type returned by find_last_not_of().

I will (a) declare size_type explicitly, and (b) add a comment explaining which erase() overload that invokes.

}
// Either we ran out of mInputs or we succeeded in reading some
// characters, that is, size != 0. Tell base class what we have.
setg(mBuffer.data(), mBuffer.data(), mBuffer.data() + size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When mInputs.empty() the size variable is never initialized, but it used in either case at line 544. I think it should be explicitly initialized to zero at line 532.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted. Will fix.

@nat-goodspeed
Copy link
Contributor Author

@AndrewMeadows , please glance at updates?

@nat-goodspeed nat-goodspeed merged commit d9ef671 into contribute Feb 22, 2023
@nat-goodspeed nat-goodspeed deleted the sl-18330 branch February 22, 2023 17:12
@github-actions github-actions bot locked and limited conversation to collaborators Feb 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants