-
Notifications
You must be signed in to change notification settings - Fork 23
Input/Output refactor, breaking changes #74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
import scala.scalajs.js | ||
|
||
trait JsInterop { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A JsInterop
companion would be nice, not everyone is so keen to inherit helpers or import the whole world from commons
.
import com.avsystem.commons.misc.TimestampConversions | ||
|
||
trait JavaTimeInterop { | ||
implicit def instantTimestampConversions(instant: Instant): TimestampConversions = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add JDate
to the mix, it's not my favourite either, but it's still widely used. It's also the only JVM time type present in SJS as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already in JBasicUtils
because JDate
is available also in ScalaJS while JavaTimeInterop
is JVM only.
* | ||
* @param millis milliseconds since UNIX epoch, UTC | ||
*/ | ||
class Timestamp(val millis: Long) extends AnyVal { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for it not to be final?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Value classes are always final
* | ||
* @param millis milliseconds since UNIX epoch, UTC | ||
*/ | ||
class Timestamp(val millis: Long) extends AnyVal { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like that we'll finally have a general-purpose cross-compiled timestamp representation in commons
, but it will be often useless for SJS frontends without DateTime
-like formatting. Our go-to solution for now is to transform JVM formats to Moment.js formats, which binds us to JDate
most of the time anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to have a sensible cross-compiled, immutable, comparable, serializable and human-readable type that could be put into RPC interfaces.
I know that in order to show it to the user, fronted will have to convert it to something raw (millis, js.Date
) so it can be passed to moment.js
or whatever other formatting library. However, this can now happen at the edge of our codebase and JDate
will no longer pollute it, most importantly our RPC interfaces.
import scala.concurrent.duration.{FiniteDuration, TimeUnit} | ||
|
||
/** | ||
* Millisecond-precision, general purpose, cross compiled timestamp representation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instant
has nanosecond resolution. JS has performance.now()
with kinda-usually-microseconds resolution (browser dependent). I wouldn't introduce a new, less precise representation as an intermediate one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I am aware of the precision loss but I think it's still worth the cost of having to remember about it.
I'm not sure what you mean by this intermediate precision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that this new representation has neat conversions to all the different "times" one can have to use. Its loss of precision disables us from using it as an intermediate format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Precision loss happens only at converting Instant
to Timestamp
- all the other conversions are fine. I can change the name of toTimestamp
extension method on Instant
to truncateToTimestamp
so that it's clear that it loses precision. WDYT?
I don't think we need cross-compiled instant data type with more than millisecond precision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sgtm
sealed trait JsonDateFormat | ||
object JsonDateFormat { | ||
/** | ||
* Specifies that a timestamp should be represented in ISO 8861 format with UTC time zone, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ISO 8861? 🤣 https://www.evs.ee/products/iso-8861-1998
case JsonBinaryFormat.ByteArray => | ||
val builder = new mutable.ArrayBuilder.ofByte | ||
val li = readList() | ||
while (li.hasNext) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Input
s/Output
s could look nicer if e.g. ListInput
s were Traversable
(Once)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely not, I will not make it inherit collections API. Input
s and Output
s are as raw as possible.
case JsonBinaryFormat.HexString => | ||
val hex = checkedValue[String](JsonType.string) | ||
val result = new Array[Byte](hex.length / 2) | ||
var i = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(0 until result.length).map(...
should not be much slower and you could even go all the way .toArray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want this as raw and fast as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmark it then 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will. But on the first sight your version should be much slower because not only will it create intermediate collection but also every byte will be boxed in that collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introducing foreach
was relatively harmless but map
+ toArray
cut off more than half of benchmark score. Ultimately I made a tail-recursive loop.
sb.toString | ||
} | ||
} | ||
|
||
trait BaseJsonOutput { | ||
protected final def writeJsonString(builder: JStringBuilder, str: String): Unit = { | ||
protected final def indent(builder: JStringBuilder, options: JsonOptions, depth: Int): Unit = | ||
options.indentSize match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.foreach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't want that lambda in here - this is called often and I want this as fast as possible.
package serialization | ||
|
||
/** | ||
* This ugly workaround has been introduced when standard `Option` encoding changed from zero-or-one element list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't be so hard on yourself, seen worse than that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about the performance of GenCodec[Option[T]]
with these LegacyOptionEncoding
input and output. I think we should check it with some benchmarks.
|
||
final class TimestampConversions(private val millis: Long) extends AnyVal { | ||
def toJsDate: js.Date = new js.Date(millis.toDouble) | ||
def toJDate: JDate = new JDate(millis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toTimestamp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
|
||
def parse(string: String): Long = { | ||
def fail = throw new ReadFailure(s"invalid ISO instant: $string") | ||
if (regex.test(string)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably this regex test makes the whole method much slower. This affects JsonStringInput
, so could you at least measure the difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very naive benchmark shows no significant difference (5-20%).
I want this to be secured with this regex because I want this to work as closely as possible in both JS and JVM.
|
||
final class TimestampConversions(private val millis: Long) extends AnyVal { | ||
def toInstant: Instant = Instant.ofEpochMilli(millis) | ||
def toJDate: JDate = new JDate(millis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toTimestamp
|
||
implicit def optionCodec[T: GenCodec]: GenCodec[Option[T]] = create[Option[T]]( | ||
locally { | ||
case i: LegacyOptionEncodingInput if i.legacyEncodingEnabled => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern matching looks like something slow. Could you create a benchmark to compare this version with clean codec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's just an instanceof
check - I don't see a reason why would be particularly slow.
But yeah, measuring is always good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 will check that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you were right - now I just left a method in Input
& Output
instead of having it in separate traits.
/** | ||
* Specifies format used by `JsonStringOutput.writeBinary` / `JsonStringInput.readBinary` to represent byte arrays. | ||
*/ | ||
sealed trait JsonBinaryFormat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see lots of JS generated from this file. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be a ValueEnum
but I wanted to leave possibility to make a case class
there.
while (i < result.length) { | ||
result(i) = ((reader.fromHex(hex.charAt(2 * i)) << 4) | reader.fromHex(hex.charAt(2 * i + 1))).toByte | ||
i += 1 | ||
def loop(i: Int): Unit = if (i < result.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tailrec?
import japgolly.scalajs.benchmark.gui.BenchmarkGUI | ||
import org.scalajs.dom._ | ||
|
||
object Main { | ||
def main(args: Array[String]): Unit = { | ||
val body = document.getElementById("body") | ||
BenchmarkGUI.renderSuite(body)(Benchmarks.suite) | ||
BenchmarkGUI.renderSuite(body)(IsoInstantBenchmarks.suite) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BenchmarkGUI.renderMenu(body)(
IsoInstantBenchmarks.suite,
JsonBenchmarks.suite
)
This PR contains some controversial, breaking but IMHO necessary changes in serialization framework:
InputType
has been dropped.It was used only to check if
Input
containsnull
so it was replaced by simpleisNull
method. This change is primarily motivated by the fact that some "simple" types may need to use array/object encoding, e.g. #64 - this way array/object values would not be properly distinguishable from "simple" values andinputType
method would have to lie in some way.GenCodec
forOption
has been changed to use unwrapped encoding withnull
representingNone
.My original motivation for using list-based encoding was to be able to express
null
andSome(null)
values forOption
. It was additionally supported by an assumption that we don't care too much about serialized format as long as deserializes properly. This is no longer true asGenCodec
is now being used for various purposes and may affect the way our OSS is used outside the company (e.g. through Udash).Currently, the only way to ensure unwrapped encoding of optional values is to use
Opt
/OptArg
which I definitely don't want to impose on people using our libraries. For this reason I see no other choice but to introduce this breaking change and somehow manage the breakage in our own use cases which primarily involve serialization to MongoDB. Therefore, I left an ugly backdoor thatInput
andOutput
implementations may use to retain the previous encoding ofOption
for compatibility purposes - seeLegacyOptionEncodingInput
/LegacyOptionEncodingOutput
.JsonOptions
for better control of output format ofJsonStringOutput
(timestamp format, binary data format, etc.). See documentation ofJsonOptions
for more details.The main breaking changes in JSON serialization is that timestamps are now by default serialized as ISO formatted strings, e.g.
"2010-04-10T08:50:41.262Z"
and byte arrays are serialized as actual JSON (signed) byte arrays instead of hex strings. This of course can be adjusted inJsonOptions
to use previous format.Timestamp
type that serves as a minimal, platform-independent representation of UTC based instant.