Skip to content

Conversation

@winder
Copy link
Contributor

@winder winder commented Jul 17, 2021

Summary

If asset strings contain non-printable characters they are stored in a byte array.
Return -b64 asset strings in most endpoints.

Test Plan

New unit test

@winder winder self-assigned this Jul 19, 2021
Comment on lines 100 to 102
ap.AssetName = strings.ReplaceAll(ap.AssetName, "\000", "")
ap.UnitName = strings.ReplaceAll(ap.UnitName, "\000", "")
ap.URL = strings.ReplaceAll(ap.URL, "\000", "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks incorrect, but I'm probably not understanding the context :
going from imported AssetParams to persisted AssetParams should base64 the result into the "XXXBytes" variant only when needed. Otherwise, it should remain in the original variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is for the transaction's JSON format, these fields happen to be unused but I left it in like this in case Anya wants to use it.


// In case the DB layer filled the name with non-printable utf8
if asset.Params.Name != nil {
name := util.PrintableUTF8OrEmpty(*asset.Params.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while it's ok to add this, I would suggest you'll add a logging here to indicate that it happened, as it implies there is an issue elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining another DB implemented the API. I don't think it's strictly an issue for the DB layer to ignore this presentation requirement. For Postgres, the tests verify that this case would never be triggered.

@codecov-commenter
Copy link

codecov-commenter commented Jul 20, 2021

Codecov Report

Merging #577 (bcb65e0) into develop (d45623b) will increase coverage by 3.34%.
The diff coverage is 64.21%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #577      +/-   ##
===========================================
+ Coverage    43.80%   47.14%   +3.34%     
===========================================
  Files           24       25       +1     
  Lines         3801     3873      +72     
===========================================
+ Hits          1665     1826     +161     
+ Misses        1873     1767     -106     
- Partials       263      280      +17     
Impacted Files Coverage Δ
api/handlers.go 3.68% <0.00%> (-0.13%) ⬇️
types/types.go 0.00% <0.00%> (ø)
idb/postgres/internal/encoding/decode.go 58.82% <58.82%> (ø)
idb/postgres/internal/encoding/encode.go 96.00% <90.90%> (+2.89%) ⬆️
api/converter_utils.go 86.15% <100.00%> (+0.12%) ⬆️
idb/postgres/postgres.go 42.49% <100.00%> (+6.12%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d45623b...bcb65e0. Read the comment docs.

@tolikzinovyev
Copy link
Contributor

Looks good overall. Left some comments in the code. Could you also add tests in idb/postgres/internal/encoding similar to https://github.com/algorand/indexer/tree/tolik/acct-rewrite4/idb/postgres/internal/encoding?

@winder winder linked an issue Jul 21, 2021 that may be closed by this pull request
@winder winder changed the title utf8 support for asset strings non utf8 support for asset strings Jul 21, 2021
URLBytes: []byte(params.URL),
}

ret.AssetName = util.PrintableUTF8OrEmpty(params.AssetName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be simpler to do:

if params.AssetName is not printable {
  params.AssetNameBytes = ...
  params.AssetName = nil
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this function from the go-algorand implementation, it lets us skip 16 if statements which is pretty nice. Some discussion about the different versions here if you're curious: algorand/go-algorand#2555 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, most of the times it is better. But in this function we set AssetNameBytes and then conditionally unset it, which took me some effort to understand. I suggest IsPrintableUTF8() is factored out of PrintableUTF8OrEmpty() and used here.

'"reserve": "EQJSQTOITX64CPGA5VRKBKLEJR57YXVNLTW5DKZMESIPTEOLRDNWQIJCGU"' \
'"total": 1000000000000' \
'"unit-name": "bogo"' \
'"unit-name-b64": "Ym9nbw=="'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also need tests where the strings are not utf-8-printable. I think a go test will suffice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I don't have a good way to test the HTTP layer. You're right that this is missing the final check for removing name/unit-name/url if they contain non-utf8 characters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mock IndexerDb and test the api functions? Ideally, we would test api functions with an implementation (implementations) of IndexerDb, but a bit of refactoring will need to be done beforehand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I have run some tests manually.

I think it would be good to do some refactoring to the api.Serve, the pattern used by the block generator is pretty nice and allowed me to start/stop the server multiple times for the performance test runner.

@tolikzinovyev
Copy link
Contributor

LGTM. I will let @tsachiherman and @brianolson take a look as well.

Copy link
Contributor

@brianolson brianolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it another read and generally looks good but one quibble over file layout.

"github.com/algorand/indexer/types"
)

type assetParams struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the split of types.go/encode.go/decode.go
I'd rather see one type declaration and its encode and decode in one file (OOP style). Easier to make sure encode and decode match that way. Easier to find the type being referred to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This split makes a lot more sense in the context of Tolik's rewrite, this pattern is used to wrap many more objects than I've done here.

I would probably opt for a single file, only because it would let you use simple text searches to find everything. But don't have a strong opinion either way. For now, I'd like to match his PR to make the merge slightly easier, maybe we can postpone this discussion for the rewrite PR?

Reference: https://github.com/algorand/indexer/blob/tolik/acct-rewrite4/idb/postgres/internal/encoding/types.go

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't block over it but I still don't like it. I still want type-encode-decode bundled together in one file. types.go makes the most sense when the only thing happening to them is via reflection (e.g. go-codec loading and storing serialization forms).

"github.com/algorand/go-algorand-sdk/encoding/json"
"github.com/algorand/indexer/types"

sdk_types "github.com/algorand/go-algorand-sdk/types"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The grouping here is weird; it should have been like this, right ?

import(
 	"github.com/algorand/go-algorand-sdk/encoding/json"
 	sdk_types "github.com/algorand/go-algorand-sdk/types" 	

 	"github.com/algorand/indexer/types"
)

Copy link
Contributor

@tsachiherman tsachiherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me.

@winder winder merged commit 6e4d737 into develop Jul 22, 2021
@winder winder deleted the will/asset-b64 branch July 22, 2021 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

non utf8 support for asset strings

6 participants