Set the default charset as UTF-8 for REST calls with application/json MediaType #1437

@lqiu96

Description

Discovered while implementing the Showcase Compliance Suite:

2023/03/03 17:19:42   request: {
  "name":  "Extreme values",
  "info":  {
    "fString":  "non-ASCII+non-printable string ? ? ? \"\\/\b\f\r\t? works, not newlines yet",
    "fInt32":  2147483647,
    "fSint32":  2147483647,
    "fSfixed32":  2147483647,
    "fUint32":  4294967295,
    "fFixed32":  4294967295,
    "fInt64":  "9223372036854775807",
    "fSint64":  "9223372036854775807",
    "fSfixed64":  "9223372036854775807",
    "fUint64":  "18446744073709551615",
    "fFixed64":  "18446744073709551615",
    "fDouble":  1.7976931348623157e+308,
    "fFloat":  3.4028235e+38,
    "fBool":  false,
    "fBytes":  "",
    "fKingdom":  "LIFE_KINGDOM_UNSPECIFIED",
    "fChild":  null,
    "pString":  "Goodbye",
    "pInt32":  2147483647,
    "pDouble":  1.7976931348623157e+308,
    "pBool":  false
  },
  "serverVerify":  true,
  "fInt32":  0,
  "fInt64":  "0",
  "fDouble":  0
}
Received Info:  f_string:"non-ASCII+non-printable string ? ? ? \"\\/\x08\x0c\r\t? works, not newlines yet"  f_int32:2147483647  f_sint32:2147483647  f_sfixed32:2147483647  f_uint32:4294967295  f_fixed32:4294967295  f_int64:9223372036854775807  f_sint64:9223372036854775807  f_sfixed64:9223372036854775807  f_uint64:18446744073709551615  f_fixed64:18446744073709551615  f_double:1.7976931348623157e+308  f_float:3.4028235e+38  p_string:"Goodbye"  p_int32:2147483647  p_double:1.7976931348623157e+308  p_bool:false
Expected Info:  f_string:"non-ASCII+non-printable string ☺ → ← \"\\/\x08\x0c\r\tሴ works, not newlines yet"  f_int32:2147483647  f_sint32:2147483647  f_sfixed32:2147483647  f_uint32:4294967295  f_fixed32:4294967295  f_int64:9223372036854775807  f_sint64:9223372036854775807  f_sfixed64:9223372036854775807  f_uint64:18446744073709551615  f_fixed64:18446744073709551615  f_double:1.7976931348623157e+308  f_float:3.4028235e+38  p_string:"Goodbye"  p_int32:2147483647  p_double:1.7976931348623157e+308  p_bool:false
2023/03/03 17:19:42 server error: (ComplianceSuiteRequestMismatchError) contents of request "Extreme values" do not match test suite

The Unicode values aren't sent properly as part of the request. The correct values are ☺ → ←, but they are sent as ? ? ?.
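The substitution can be reproduced with the JDK alone. The following is a minimal sketch, not the gax-httpjson code path: the class name `CharsetLossDemo` and the helper `roundTrip` are made up for illustration. It encodes the three expected characters with ISO-8859-1 and with UTF-8 and decodes them again, assuming the request body is ultimately turned into bytes with the configured charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of gax-java or google-http-java-client.
class CharsetLossDemo {
    // Encodes the text with the given charset and decodes it with the same
    // charset, mimicking a client writing a body and a server reading it.
    static String roundTrip(String text, Charset charset) {
        byte[] wire = text.getBytes(charset);
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        String original = "\u263A \u2192 \u2190"; // the three characters from the log: ☺ → ←

        // ISO-8859-1 cannot map these characters, so String.getBytes
        // substitutes the charset's replacement byte, which is '?'.
        String latin1 = roundTrip(original, StandardCharsets.ISO_8859_1);
        System.out.println(latin1.equals("? ? ?")); // prints "true"

        // UTF-8 can encode every Unicode code point, so nothing is lost.
        String utf8 = roundTrip(original, StandardCharsets.UTF_8);
        System.out.println(utf8.equals(original)); // prints "true"
    }
}
```

The comparison is printed as a boolean rather than as the decoded text so that the result does not depend on the console's own encoding.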

Issue:

  1. The MediaType for REST requests is set here:
    https://github.com/googleapis/gapic-generator-java/blob/c23f981e2ac3c573bed51e725dc7061551179400/gax-java/gax-httpjson/src/main/java/com/google/api/gax/httpjson/HttpRequestRunnable.java#L175

It sets the MediaType to just application/json, with no charset parameter, when it could be application/json; charset=utf-8. Adding the charset would ensure that the Unicode values are encoded properly.

  2. The default charset is not set to UTF-8:
    https://github.com/googleapis/google-http-java-client/blob/84216c5cfe2f1600464d34661a208cf165e11b9b/google-http-client/src/main/java/com/google/api/client/http/AbstractHttpContent.java#L94-L98

Instead, it's set to StandardCharsets.ISO_8859_1, which can represent only the first 256 Unicode code points, so characters such as ☺ cannot be encoded.

This seems to be due to this comment: googleapis/google-http-java-client#300 (comment)
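How the two points interact can be sketched as follows. This is an illustrative stand-in, not the code in AbstractHttpContent: the class `ContentTypeCharset` and the method `resolve` are hypothetical. They only model the behavior described above, where a media type without a charset parameter falls back to a default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper; it models the fallback described in this issue and is
// not the actual google-http-java-client implementation.
class ContentTypeCharset {
    // Returns the charset named in a Content-Type value, or the supplied
    // fallback when the value carries no charset parameter.
    static Charset resolve(String contentType, Charset fallback) {
        for (String part : contentType.split(";")) {
            String param = part.trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // Parameter values may be quoted, e.g. charset="utf-8".
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                return Charset.forName(name);
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset legacyDefault = StandardCharsets.ISO_8859_1;

        // No charset parameter: the fallback decides how the body is encoded.
        System.out.println(resolve("application/json", legacyDefault));

        // Explicit charset parameter: the fallback is never consulted.
        System.out.println(resolve("application/json; charset=utf-8", legacyDefault));
    }
}
```

Under this model, either change avoids the mismatch: adding `charset=utf-8` to the media type, or making the fallback UTF-8 for application/json.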

Metadata

Labels

priority: p2 (Moderately-important priority. Fix may not be included in next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
