Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tinycbor mynewt upstream #83

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

vrahane
Copy link
Contributor

@vrahane vrahane commented Dec 1, 2017

Upstreaming changes made by mynewt and carrying over fixes

  • Bring tinycbor up to date with mynewt tinycbor
  • Changing implementation of cbor_encoder_get_extra_bytes_needed() and
    cbor_encoder_get_buffer_size() as part of cbor_buf_writer APIs
  • Move bytes_needed field from CborEncoder to the buf writer, this is needed by the tests mainly.
  • Fix cbor_buf_cmp to do memcmp and return complemented result
  • iterate_string_chunks(): fixing NULL compare at the end of string
    and moving it out of the iterate_string_chunks(). This is to avoid
    buffer specific parser calls in the function
  • cbor_value_get_next_byte() is removed in mynewt version of tinycbor,
    so, we track offsets of the buffer which can be used for comparison
    in the parser tests instead of calculating the offset
  • Making the decoder and parser APIs backwards compatible
  • Adding encoder writer and parser reader as part of the encoder and
    parser structure. This is to make the encoder and parser use new
    function of encoder_writer and decoder_reader without breaking backwards
    compatibility.
  • Making the old API use flat buffers by default
  • Adding APIs for initializing encoder and parser with custom writer and
    reader
  • cpp test now uses tinycbor lib
  • Make the default writer and reader conditional based on
    NO_DFLT_READER/WRITER define. This is because we want a default
    reader/writer to avoid API changes.
  • Use it->offset instead of it->ptr to track buffer offsets
  • Update resolve_indicator() static api parameters to use cbor value
    and access offsets instead of taking pointers as input parameters
  • In validate_container() do a byte by byte comparison instead of
    memcmp since we no longer have access to the buffer directly
    Also, use offsets instead of pointers to validate sorted maps
  • Added a new define for conditionally compiling in float support (NO_FLOAT_SUPPORT).
    This is because we want the float support to be compiled in by
    default.
  • Use static_assert macro instead of Static_assert. Changed to avoid
    build failures.
  • Add api to get string chunk, this is a callback which can be used by
    buffer implementations to grab a string that is divided in chunks
    which spans across multiple chained buffers

Copy link
Member

@thiagomacieira thiagomacieira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase and split into multiple PR, one per logical change. I'm not too strict on the changes being atomic, but I do want to see them grouped into logical chunks that go together.

I also need an explanation for why you've done some of the changes.

@carlescufi
Copy link

@thiagomacieira thanks for the early feedback, is 1 PR with multiple, atomic, self-contained commits OK or do you explicitly want multiple PRs?

@thiagomacieira
Copy link
Member

I would prefer multiple PRs if you can. If you need, you can have more than one commit in each PR too.

@vrahane vrahane force-pushed the tinycbor_mynewt_upstream branch 2 times, most recently from 855498c to b60f787 Compare December 7, 2017 00:07
@vrahane
Copy link
Contributor Author

vrahane commented Dec 7, 2017

Thank you for the quick review @thiagomacieira , I have rebased and resolved all conflicts. I would like to understand more about how you would like to split this PR into multiple PRs. This PR basically adds support for chained buffers with keeping backwards compatibility intact.

I have a suggestion, please let me know if this is what you would like :

  1. Make a PR for encoder and parser changes combined
  2. Make a separate PR for changes to tools
  3. Make a separate PR for changes to tests

If you would like I can even separate out parser and encoder changes.

Thank you so much for going through this.

@thiagomacieira
Copy link
Member

I'd like to see changes logically grouped. For example, the first commit in the series says "tinycbor bug fixes". That does not belong with the rest. Please submit bug fixes in separate PRs, one per bugfix.

Another one is "Fix Windows build". Well, the Windows build isn't broken right now, so you must have broken it in your series. Remove the breakage instead of adding a later patch to fix. Similarly for "Remove mynewt specific pkg.yml" -- there's no such file right now, so if there's a file to be removed, it's because an earlier commit added it. Remove the addition.

@vrahane vrahane force-pushed the tinycbor_mynewt_upstream branch from 12230ab to 12d89ef Compare December 7, 2017 22:21
@vrahane
Copy link
Contributor Author

vrahane commented Dec 7, 2017

Ah, I see what you mean now. I have created a consolidated PR for now. I will remove any bug fixes from this and put them in separate PRs. Thank you.

@vrahane vrahane force-pushed the tinycbor_mynewt_upstream branch 2 times, most recently from f447cf8 to 53c03ec Compare December 15, 2017 20:39
@thiagomacieira
Copy link
Member

Please see https://github.com/thiagomacieira/tinycbor/tree/dev for an initial API. Pay close attention to 86053d7

@vrahane vrahane force-pushed the tinycbor_mynewt_upstream branch 2 times, most recently from 6148e89 to 423f94b Compare January 3, 2018 00:02
@vrahane
Copy link
Contributor Author

vrahane commented Jan 3, 2018

@thiagomacieira Are you suggesting a change to the API that the PR adds. I have rebased and resolved the conflicts that were introduced by the most recent merges.

@thiagomacieira
Copy link
Member

@vrahane thanks for your attempt at rebasing, but you mustn't have done it right. Please look at your commit's change to cbor.h: it's removing a lot of code, the new API I added in 0.5. And your commit still does too much in a single go. I'd need you to split them into separate, logical chunks.

As for the API for reading and writing outside of a linear buffer, I had a need for it so I developed my own version (without looking at yours). I am asking you to review it and provide feedback. I have yet to add docs about it, but the API is unit-tested.

@vrahane
Copy link
Contributor Author

vrahane commented Jan 5, 2018

Thank you for the reply. The code from cbor.h has been moved to src/cbor_buf_writer.c. The constants & enums have been moved to cbor_defs.h. Code has not been removed. There are no API changes/removals in the PR. I can try to break it more into logical chunks.

Your version looks good to me but I think some functionality is missing for it to work with chained buffers. Would you like to get on a call and discuss the comparison between your changes and the changes in the PR ? That might be the best way to get an understanding what each of us want.

@thiagomacieira
Copy link
Member

You need to justify those changes: why do we need new headers?

Let's schedule a call next week (can you send me an email reminding me?). I'd like to understand why it may not work with chained buffers. I've got a PoC implementation that should work with any kind of buffer.

@vrahane
Copy link
Contributor Author

vrahane commented Jan 8, 2018

There were compilation errors with cbor.h being included in src/cbor_buf_writer.c and src/cborparser.c. Hence, the new headers which separate out cbor constants and enums from inline apis. I do not have the exact compilation error log.

I have removed the extraneous header. It was needed earlier, it is not needed anymore.
I have also removed any unrelated changes from this PR.

I will send you an email shortly scheduling a call. One case where the API you are suggesting might not work is for a string that spans across multiple chained buffers. I can compare more between your suggested API and my PR and we can discuss more over the call.

@vrahane vrahane force-pushed the tinycbor_mynewt_upstream branch 2 times, most recently from 88c8d4c to 3ef6799 Compare January 8, 2018 23:01
@thiagomacieira
Copy link
Member

What I've done for strings is basically punt the problem. Both the "dup" and "copy" string functions require linear buffers. If your buffer isn't linear, then you must use the "get_string_chunk" function, which does not try to access the string in the first place. It's up to the caller to read string from the buffers.

@vrahane
Copy link
Contributor Author

vrahane commented Jan 9, 2018

There was some confusion, because the hash you had provided was from intel/tinycbor/dev and the URL pointed to your dev branch of your fork of tinycbor. Hence, the confusion.

I see what you mean. Giving it some more thought, I can try to change Mynewt's chained buffer (mbuf) API for tinycbor and test out your implementation. Would you like me to do that ?

@thiagomacieira
Copy link
Member

Yes, that would be helpful. Right now, I have a strawman only and more testing is required.

@ccollins476ad
Copy link

ccollins476ad commented Jan 16, 2018

(Regarding the new API in https://github.com/thiagomacieira/tinycbor/tree/dev)

Just a question - how would you expect get_string_chunk() to be used with chained buffers?

static CborError get_string_chunk(CborValue *it, const void **bufferptr, size_t *len)

Maybe my understanding of this function's semantics is incorrect. Here is my understanding: a successful call causes *bufferptr to point to a buffer containing a string chunk of size *len. What if we have a 5-byte string chunk ("aaaaa") spread across two buffers, as follows:

+-------------+     +-------+
| 65 61 61 61 | --> | 61 61 |
+-------------+     +-------+

In other words, the first three characters are in the first buffer, and the remaining two are in the second buffer.

Did you have any thoughts about how get_string_chunk() should handle a case like this?

EDIT: Also, cbor_value_advance() seems to only work with fixed length buffers (it calls _cbor_value_copy_string()). Would you agree?

@thiagomacieira
Copy link
Member

That's a very good question and that's where I'd like input. It is possible with the API as I wrote it, but it's ugly.

Please note that this is more complex than what you asked. There's reading from a chained buffer and there's also reading into a chained buffer.

There are two possible implementations with the API. The internal get_string_chunk function calls the transfer_string callback with the pointer that the user passed, the number of bytes to skip and the number of bytes in the string chunk. So the callback must first skip offset bytes, then arrange for len bytes of the string chunk to be made available. The choice is here: it can store a pointer (or something else) or something else in the user's buffer then advance the buffer by len bytes, or it can do nothing.

The "do nothing" case is possible because if you're iterating chunks, then TinyCBOR will return after that, allowing the input buffer to be positioned exactly at the beginning of the string. The caller can then deal with the string directly, in zero-copy fashion. It just needs to be sure to have advanced the buffer by len before calling _cbor_value_get_string_chunk again. This is what I've done in my implementation in Qt (see https://gitlab.com/thiagomacieira/qtbase/blob/524372bc4691be05eacb845999b839a687181565/src/corelib/serialization/qcborstream.cpp#L498 and https://gitlab.com/thiagomacieira/qtbase/blob/524372bc4691be05eacb845999b839a687181565/src/corelib/serialization/qcborstream.cpp#L531).

@thiagomacieira
Copy link
Member

The consequences for the API:

  • _cbor_value_get_string_chunk: works. Whether the value stored in bufferptr is a pointer or something different depends on what the transfer_string callback does.
  • _cbor_value_copy_string: works only if buffer parameter is NULL and only if it's not the "do nothing" solution.
  • _cbor_value_dup_string: does not work (but it calls malloc(), so you didn't want it anyway)

Let'st take your 5-character example:

   +-------------+     +-------+
   | 65 61 61 61 | --> | 61 61 |
   +-------------+     +-------+

Before get_string_chunk, the buffer state (whatever it is) points to the 0x65 byte. After the call, the two options are:

  1. the input buffer points to the first 0x61 byte, len was set to 5. The caller must process 5 bytes, then advance the input buffer past the last 0x61.
  2. the input buffer points to after the last 0x61, len was set to 5 and buffer was set to something non-zero that allows the caller to find the first 0x61 (for example, an offset from the beginning of the buffer).

@ccollins476ad
Copy link

Thanks for the explanation, Thiago. It makes sense.

When you say the transfer_string callback can "do nothing", it should still add the offset to the reader's internal pointer (or offset count), right? That seems to break cbor_value_calculate_string_length(), as it causes the parser to advance past the string descriptor. Or maybe I have misunderstood something...

@thiagomacieira
Copy link
Member

No, you're correct. If it does not adjust the internal pointer, any TinyCBOR function that iterates over the string will fail.

My Qt wrapper didn't call any such function, so there was no failure. That's how it got away with "doing nothing". Or so I thought: the validate() function does need that and must be failing, it just happen not to be tested with the condition. Thanks for pointing out.

@vrahane vrahane force-pushed the tinycbor_mynewt_upstream branch 3 times, most recently from 209e557 to f7f4a78 Compare January 25, 2018 00:42
@vrahane
Copy link
Contributor Author

vrahane commented Jan 25, 2018

@thiagomacieira Can you please restart the travis-ci build. It seems it has got stuck in some intermittent state.

@vrahane vrahane force-pushed the tinycbor_mynewt_upstream branch from f7f4a78 to 9a28193 Compare January 26, 2018 21:04
cbor_reader_get16 *get16;
cbor_reader_get32 *get32;
cbor_reader_get64 *get64;
cbor_memcmp *cmp;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to add memcmp and memcpy. I'd been thinking how to achieve those and hadn't come up with this yet.

/* use this count writer if you want to try out a cbor encoding to see
* how long it would be (before allocating memory). This replaced the
* code in tinycbor.h that would try to do this once the encoding failed
* in a buffer. Its much easier to understand this way (for me)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not going to accept this.

@thiagomacieira
Copy link
Member

thiagomacieira commented Jan 29, 2018

Ok, I like your changes, but not enough to accept them. There are minor things in the change (like structures not following my coding style), but the most important thing is that it's a large growth in size.

I need some time to study the possibilities and how they affect the code it generates. Moreover, I need to merge with my own changes I've made for dev, which already group the reading and writing. I will incorporate some of your ideas, but not all. Given my current time commitments, this should take two or three months to complete.

@thiagomacieira thiagomacieira dismissed their stale review January 29, 2018 02:36

Not requesting changes. I'm going to use the change as ideas, but will not accept this review.

vrahane added a commit to apache/mynewt-mcumgr that referenced this pull request Jan 30, 2018
- Bring tinycbor up to date with mynewt tinycbor
- Changing implementation of cbor_encoder_get_extra_bytes_needed()
  and cbor_encoder_get_buffer_size() as part of cbor_buf_writer APIs
- Move bytes_needed field from CborEncoder to the buf writer, this
  is needed by the tests mainly.
- Fix cbor_buf_cmp to do memcmp and return complemented result
- iterate_string_chunks(): fixing NULL compare at the end of string
  and moving it out of the iterate_string_chunks(). This is to avoid
  buffer specific parser calls in the function
- cbor_value_get_next_byte() is removed in mynewt version of tinycbor,
  so, we track offsets of the buffer which can be used for comparison
  in the parser tests instead of calculating the offset
- Move cbor_encoder_get_extra_bytes_needed() and
  cbor_encoder_get_buffer_size() to be part of cbor_buf_writer APIs
- Add bytes_needed field to the buf writer
- Adding encoder writer and parser reader as part of the encoder and
  parser structure. This is to make the encoder and parser use new
  function of encoder_writer and decoder_reader without breaking backwards
  compatibility.
- Making the old API use flat buffers by default
- Adding APIs for initializing encoder and parser with custom writer and
  reader
- Make the default writer and reader conditional based on
  NO_DFLT_READER/WRITER define. This is because we want a default
  reader/writer to avoid API changes.
- Move enums to cbor_defs.h
- Use it->offset instead of it->ptr to track buffer offsets
- Update resolve_indicator() static api paramaters to use cbor value
  and access offsets instead of taking pointers as input parameters
- In validate_container() do a byte by byte comparison instead of
  memcmp since we no longer have access to teh buffer directly
  Also, use offets instead of pointers to validate sorted maps
- Added a new dfine for conditionally compiling in float support (NO_FLOAT_SUPPORT).
  This is because we want the float support to be compiled in by
  default.
- Use static_assert macro instead of Static_assert. Changed to avoid
  build failures.
- Add api to get string chunk, this is a callback which can be used by
  buffer implementations to grab a string that is divided in chunks
  which spans across multiple chained buffers
@vrahane vrahane force-pushed the tinycbor_mynewt_upstream branch from 9a28193 to ece8b46 Compare February 6, 2018 22:59
vrahane added a commit to vrahane/zephyr that referenced this pull request Feb 7, 2018
- Pulling in code from intel/tinycbor#83 as a
  patch to zephyr's ext/tinycbor. This is to facilitate the use of
  chained buffers functionality for tinycbor while it is in development
  on https://github.com/intel/tinycbor
- Bring tinycbor up to date with mynewt tinycbor
- Changing implementation of cbor_encoder_get_extra_bytes_needed() and
  cbor_encoder_get_buffer_size() as part of cbor_buf_writer APIs
- Move bytes_needed field from CborEncoder to the buf writer, this is
  needed by the tests mainly.
- Fix cbor_buf_cmp to do memcmp and return complemented result
  iterate_string_chunks(): fixing NULL compare at the end of string
  and moving it out of the iterate_string_chunks(). This is to avoid
  buffer specific parser calls in the function
- cbor_value_get_next_byte() is removed in mynewt version of tinycbor,
  so, we track offsets of the buffer which can be used for comparison
  in the parser tests instead of calculating the offset
- Making the decoder and parser APIs backwards compatible
- Adding encoder writer and parser reader as part of the encoder and
  parser structure. This is to make the encoder and parser use new
  function of encoder_writer and decoder_reader without breaking
  backwards compatibility.
- Making the old API use flat buffers by default
- Adding APIs for initializing encoder and parser with custom writer
  and reader
- Make the default writer and reader conditional based on
  NO_DFLT_READER/WRITER define. This is because we want a default
  reader/writer to avoid API changes.
- Use it->offset instead of it->ptr to track buffer offsets
- Update resolve_indicator() static api parameters to use cbor value
  and access offsets instead of taking pointers as input parameters
- In validate_container() do a byte by byte comparison instead of
  memcmp since we no longer have access to the buffer directly
- Also, use offsets instead of pointers to validate sorted maps
- Added a new define for conditionally compiling in float support
  (NO_FLOAT_SUPPORT).
- This is because we want the float support to be compiled in by
  default.
- Use static_assert macro instead of Static_assert. Changed to avoid
  build failures.
- Add api to get string chunk, this is a callback which can be used by
  buffer implementations to grab a string that is divided in chunks
  which spans across multiple chained buffers
- Add KConfig and CMakeList files for configuration and build
- Delete .gitignore and .gitattributes
- Remove tools, tests and examples as they are not really needed for
  building the library

Signed-off-by: Vipul Rahane <vipulrahane@apache.org>
- Make half float encode/decode conditional
- Change defaults for CBOR_WITHOUT_OPEN_MEMSTREAM,
  CBOR_NO_HALF_FLOAT_TYPE and CBOR_NO_FLOATING_POINT
  to y so that half float, float and open
  memstream support along with newlib libc does not
  get compiled in
- src/cborpretty.c, src/cbortojson.c and src/cborvalidation.c
  conditionally include math.h and half float type support
- Conditionally include math.h in src/compilersupport_p.h to avoid
  newlib libc from getting compiled in
- Conditionally compile src/cborparser_dup_string.c if newlib libc is
  compiled in

Signed-off-by: Vipul Rahane <vipulrahane@apache.org>
Copy link
Member

@thiagomacieira thiagomacieira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you split the FP part into a separate PR, with just one commit?

@@ -33,8 +33,8 @@
#endif

#include "cbor.h"
#include "compilersupport_p.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our includes must always come before system's.

ccollins476ad added a commit to apache/mynewt-mcumgr that referenced this pull request Feb 13, 2018
Note: This is commit 3ef6799f633df19a84b99ccd0f21fb8c02201690 of a PR
against the tinycbor library:
intel/tinycbor#83
ccollins476ad added a commit to apache/mynewt-mcumgr that referenced this pull request Feb 13, 2018
Note: This is commit 3ef6799f633df19a84b99ccd0f21fb8c02201690 of a PR
against the tinycbor library:
intel/tinycbor#83
ccollins476ad added a commit to apache/mynewt-mcumgr that referenced this pull request Feb 15, 2018
Note: This is commit 3ef6799f633df19a84b99ccd0f21fb8c02201690 of a PR
against the tinycbor library:
intel/tinycbor#83
@@ -37,7 +41,9 @@
# include <assert.h>
#endif
#include <float.h>
#ifndef CBOR_NO_FLOATING_TYPE

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CBOR_NO_FLOATING_POINT?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or CBOR_NO_HALF_FLOAT_TYPE?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants