Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rascal and JSON/CVS libraries do not support partial datetime (time without date and vice versa) or time offsets (they are normalized to instants) #1533

Open
jurgenvinju opened this issue Sep 27, 2021 · 7 comments
Labels

Comments

@jurgenvinju
Copy link
Member

jurgenvinju commented Sep 27, 2021

Describe the bug

Given the issues identified in RAP 11: https://docs.google.com/document/d/1xXRZnPBifewgKs57mRYBI75QjSF5Hg-Lw9tQ_STWtQQ/edit

it is important that programmers can check if a year field (for example) is present or not. That would also be consistent
given the way we treat the fields of values of the loc type.

Currently however, datetime has many fields but none of them are supported by the ? and has operators.

There are also issues with zone offsets while reading/writing them in JSON and CSV. The offset information is normalized to the instanst that it represents.

See also RAP11: Rascal and Vallang should keep datetime values immutable, and have different values for when the zone offsets or zone names are different. Even though this would make "equal" instants different, that would allow data/code scientists in Rascal to faithfully deal with such discrepancies and make choices based on the context that they are in.

@DavyLandman
Copy link
Member

related: #1508

On that branch I also fixed some of the parsing bugs in our implementation for time or date only literals.

I think it's a hole in our value type tree that two datetimes sometimes throw exceptions when you compare then against eachother.

@jurgenvinju
Copy link
Member Author

jurgenvinju commented Sep 27, 2021

In what way was that branch still a work in progress?

The current situation is very simple:

  • many tests are failing that accept random datetime parameters because large parts of the Rascal implementation (and Rascal test code) is based on the assumption that datetime values always have date and time.
  • About four months ago the vallang random value generator was improved to also start generating datetime values with only time and with only date information.
  • Today I upped the dependency of vallang to 0.14.1 to synchronize the dependency on Java 11 compiler and run-time
  • Ergo, here we are with 10 failing tests.

Since there are many issues related to datetime that still need serious consideration (design) and fixing, the best might be
to configure the random generator to generate only the values that the Rascal implementation can handle.

Another solution in my mind requires either splitting datetime into date, time and datetime, such that random values can be generated for each independently, or providing implementations of the ? and has operator for all the datetime fields (like .year? and .time?. This is too much for this java11 branch.

Is there a third option that I am missing @DavyLandman @PaulKlint @tvdstorm ?

@jurgenvinju
Copy link
Member Author

My current ideas go to a temporary change in the vallang library like so:

	boolean partialDateTime = "true".equals(System.getProperty("vallang.random.partialDateTime"));
		
        try {
			if (partialDateTime && random.nextDouble() > 0.8) {
				LocalTime result = LocalTime.ofSecondOfDay(between(random, ChronoField.SECOND_OF_DAY));
                                   return vf.time(
					result.getHour(), 
					result.getMinute(), 
					result.getSecond(), 
					(int) TimeUnit.MILLISECONDS.convert(result.getNano(), TimeUnit.NANOSECONDS)
				);

That way the vallang library can use -Dvallang.random.partialDateTime=true and that code remains well tested, while the (main) user of the vallang library (rascal project) can be improved in this respect when we are ready to do so.

@jurgenvinju
Copy link
Member Author

Ok. ran a test and that would solve half of the issues

Next issue: the Rascal implementation does not deal well with zone offsets:

[ERROR] jsonWithDatetime1: <1318,68>(org.rascalmpl.test.infrastructure.RascalJUnitParallelRecursiveTestRunner$ModuleTester)  Time elapsed: 0 s  <<< ERROR!
java.lang.Exception: 
Test lang::rascal::tests::library::lang::json::JSONIOTests::jsonWithDatetime1 failed due to
        test returned false

Actual parameters:
        datetime =>$2019-10-17T17:57:50.000-04:01$



[ERROR] jsonStreaming2: <2361,193>(org.rascalmpl.test.infrastructure.RascalJUnitParallelRecursiveTestRunner$ModuleTester)  Time elapsed: 0 s  <<< ERROR!
java.lang.Exception: 
Test lang::rascal::tests::library::lang::json::JSONIOTests::jsonStreaming2 failed due to
        test returned false

Actual parameters:
        D =>date($2015-06-27T04:55:48.000+10:16$)



[ERROR] printTime_simpleFormat: <950,90>(org.rascalmpl.test.infrastructure.RascalJUnitParallelRecursiveTestRunner$ModuleTester)  Time elapsed: 0.001 s  <<< ERROR!
java.lang.Exception: 
Test lang::rascal::tests::library::DateTime::printTime_simpleFormat failed due to
        test returned false

Actual parameters:
        datetime =>$1968-08-04T14:12:33.000-13:53$



[ERROR] printDateTime_simpleFormat: <1046,123>(org.rascalmpl.test.infrastructure.RascalJUnitParallelRecursiveTestRunner$ModuleTester)  Time elapsed: 0 s  <<< ERROR!
java.lang.Exception: 
Test lang::rascal::tests::library::DateTime::printDateTime_simpleFormat failed due to
        test returned false

Actual parameters:
        datetime =>$2009-09-21T07:46:03.000-06:50$

That's another kind of random datetime value that was not generated before, but it is now. I'll look into the details to see if this can be fixed at the Rascal side

@jurgenvinju
Copy link
Member Author

Ah. We loose offsets when writing/reading JSON like this:

@Override
      public IValue visitDateTime(Type type) throws IOException {
        try {
          switch (in.peek()) {
            case STRING:
              // toEpocMilli is the wrong do-er
              return vf.datetime(format.get().parse(in.nextString()).toInstant().toEpochMilli());
            case NUMBER:
              return vf.datetime(in.nextLong());
            default:
              throw new IOException("Expected a datetime instant " + in.getPath());
          }
        } catch (ParseException e) {
          throw new IOException("Could not parse date: " + in.getPath());
        }
      }
      ```

@jurgenvinju jurgenvinju changed the title ? and hasnot supported on datetime fields Rascal and JSON/CVS libraries do not support partial datetime (time without date and vice versa) or time offsets (they are normalized to instants) Sep 27, 2021
@DavyLandman
Copy link
Member

okay, so my branch solves quite some of those serialization bugs. (there are quite a few)

but, I stopped since there is this problem of not all datetimes being equal. I think adding a field that limits the random generator is a nice approach to keep going. But I do think we need to consider splitting up datetime into datetime&date &time.

@jurgenvinju
Copy link
Member Author

Very much agreed.

Now working on Zulu time issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants