Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zero date problem #65

Open
dazoe opened this issue Dec 9, 2015 · 18 comments
Open

Zero date problem #65

dazoe opened this issue Dec 9, 2015 · 18 comments

Comments

@dazoe
Copy link

dazoe commented Dec 9, 2015

I came across a problem with a date time that shows up in mysql as 0000-00-00 and mysql-native crashes with core.time.TimeException@std\datetime.d(9079): 0 is not a valid month of the year.

@Abscissa Abscissa added the bug label Apr 14, 2016
@Marenz
Copy link

Marenz commented Jan 3, 2018

I have the same problem, in my case with a timestamp-typed field.

The problem seems to be that phobos Date struct doesn't accept such dates..

Not sure what a good solution would be

@Abscissa
Copy link

It appears that '0000-00-00' is a special "zero" date or "dummy date":
https://dev.mysql.com/doc/refman/5.7/en/using-date.html

I see three possibilities:

  1. Interpret '0000-00-00' as NULL. According to the link above, this is what ODBC does, plus certain configurations of MySQL server disallow this date entirely, so there is precedent for this. It's nice and simple and clean, but it prevents user code from being able to directly deal with this scenario if it wants to.

  2. Use a custom type INSTEAD of Date that can be in one of two states: the "special zero date" or an actual std.datetime.Date. This allows client code to distinguish the two, if needed, at the cost of additional complexity to both mysql-native and user-code.

  3. Use a custom type that MEANS "special zero date", and emit either that type OR an actual std.datetime.Date depending on the value received from the server. This has the same pros/cons as the second option above. Compared to that, it might be more complicated in some ways, or less complicated in some ways.

Although the other options are more of a pain, I'm leaning against the first option, just because I don't like the idea of potential use-case scenarios that mysql-native can't handle.

Maybe there's some good way I can do the first option, but provide users with the ability to distinguish between "special zero date" and "null" if they specifically ask for the ability???

@Abscissa Abscissa modified the milestones: v2.2.0, v2.3.0 Feb 26, 2018
@schveiguy
Copy link
Collaborator

schveiguy commented May 2, 2018

Just came across this too. I think the best answer is to pick a valid date, but one that is unlikely to ever be used. I don't think you want to use a different type, because there are serialization packages that are expecting a date in that field.

I don't think you want to return NULL because NULL could mean something different than a dummy date, and again, someone may not expect NULL in a non-null column.

Looking at the Date constructor, it requires a valid month between 1 and 12, a valid day (based on month and year), and the year can be anything.

What I would recommend is that we pick a "dummy date" of something like 1/1/0000, and store that as a setting in case people want to change it.

@Marenz
Copy link

Marenz commented May 2, 2018

The third option would kind of break the direct mapping of a result row to a struct because the type could be either dater or invalid-date. So I would add that as a con.

The first option prevents distinction between actual NULL and "0000-00-00".

Apart from a bit extra complexity, the second option seems to be the best way from my perspective.

@Abscissa
Copy link

Abscissa commented May 3, 2018

I don't think you want to return NULL because NULL could mean something different than a dummy date, and again, someone may not expect NULL in a non-null column.

The first option prevents distinction between actual NULL and "0000-00-00".

Yea, I agree. That's why it's the main one I was leaning against.

What I would recommend is that we pick a "dummy date" of something like 1/1/0000, and store that as a setting in case people want to change it.

I'll admit, it's tempting. It's simple and easy (for mysqln), and there's a certain logic to it. But I'm not sure I'm comfortable with the idea that if a DB contains "0000-00-00", mysqln would unceremoniously tell you the DB contains "0000-01-01":

A: At worst, that could be argued to be a form of data corruption or bug.

B: This would be a confusing surprise for those already familiar with the way MySQL dates [can] work [depending on server configuration].

C. Even if the "alternate zero-date" is configurable, the user would still need to 1. know about this quirk and 2. find a value, if any, which suits their use-case. Not exactly simple and easy for the user when this issue does crop up.

D: It means mysqln still wouldn't be able to distinguish between "0000-00-00" and whatever the substitute value is.

E: I know, and agree, it's an "unlikely" value in most cases, but I'm not sure it's appropriate for something as low-level as a DB client lib to make judgments about, and rely on, what is or isn't likely in real world use. Even if it's configurable, it's still making the judgment that "If you need to distinguish all possible values of a MySQL date, then tough luck." How likely is that? Not sure it's appropriate for mysqln to say.

F: And then there's still the whole mess of "What if the user needs to interface with some goofy legacy DB and for some annoying reason needs to write a zero-date?". New can of worms.

I'm starting to think we need to begin by admitting Phobos's Date simply does not match up with MySQL Date, even if they're frustratingly close to matching. So instead of forcing these mismatched puzzle pieces together, the best thing is probably to just bite the bullet and take the breaking change of an adapting, mysqln Date type which can match up with MySQL Date by wrapping Phobos Date and adding "isZeroMonth", "isZeroDay", and "isZeroDate" flags.

Breakages can be minimized by doing the following:

  1. The mysqln Date type exposes the underlying Phobos Date's interface and can be converted to Phobos Date, throwing a better exception if it's a MySQL zero-date (Is it possible this possible for this to be an implicit conversion? How would that be done? When I think "make custom D type implicitly convert", I think "alias this", but that doesn't really offer a hook one could use to validate before the conversion.) At least with this exception, the cause is more clear, and the user has the ability to work around it by checking "isZeroDate" first.

  2. mysqln offers an optional (non-default) setting so a developer can proactively opt-in to "Just return Phobos's Date". The user has a choice between mysql throwing upon receipt of a zero date, or returning a user-chosen sentinel (default: 0000-01-01). Although this creates another question: On what level (or levels) is this setting offered? exec/query? Connection? A global (ick)? A -version= identifier (ick)?

Thoughts?

@schveiguy
Copy link
Collaborator

Some updates: I spoke with @jmdavis at dconf.

He said that what may be useful is to keep throwing, but allow a configuration to change the behavior. Since everything is returned as a Variant, we have a lot of options that could be handled this way:

  1. Specify a default Date/DateTime
  2. Specify a different lambda/function that should handle the parsing.
  3. Possibly say "use specialized mysqln Date" as a convenience

As to how this looks API-wise, I would suggest a global that affects new connections, and also allow changing on individual connections.

I will caution that using a different type by default than phobos Date is going to break things -- other libraries can deal with Date (e.g. vibe.d json serialization), they probably can't deal with a proposed mysqln Date.

I'll admit, it's tempting. It's simple and easy (for mysqln), and there's a certain logic to it. But I'm not sure I'm comfortable with the idea that if a DB contains "0000-00-00", mysqln would unceremoniously tell you the DB contains "0000-01-01"

I'll say that in PHP it kind of works this way (this is what I saw from my PHP side when I was getting the exception), the result was something like 12/31/-01. Not that PHP is something to replicate, but it is a precedent :)

@schveiguy
Copy link
Collaborator

schveiguy commented May 3, 2018

C. Even if the "alternate zero-date" is configurable, the user would still need to 1. know about this quirk and 2. find a value, if any, which suits their use-case. Not exactly simple and easy for the user when this issue does crop up.

Argument against: You have to know about the existing "quirk" anyway. I didn't know until I got an exception, and that's not exactly a friendly result either. To be honest, if I knew about this, I would have added the option to avoid this (Mysql has options to disable storing this value).

And the set of people for whom 1/1/0000 is a valid date is probably pretty small.

@schveiguy
Copy link
Collaborator

To be honest, if I knew about this, I would have added the option to avoid this (Mysql has options to disable storing this value).

Hm... I would actually like this to be done automatically in mysql native whenever a connection is opened. I'll add another enhancement request.

@Abscissa
Copy link

Abscissa commented May 3, 2018

Some updates: I spoke with @jmdavis at dconf.
[...]

Sounds like we're thinking along similar lines.

Making "throw upon zero-date" the default makes sense. I like that.

As to how this looks API-wise, I would suggest a global that affects new connections, and also allow changing on individual connections.

My only concern is what happens when two different packages communicating with mysqln (ie, other third party libs, plus the overall application) each have their own way they want to doing things?

For example: Somebody's vibe.d webapp uses a third-party ORM built on top of mysqln, plus something else (logging-to-db? login/user management tool? etc.) also uses the same DB via mysqln. But one wants to use mysqln with "zero-date" configured one way, and the other wants to use mysqln with "zero-date" configured the other way. What then? Or do we force lib developers to either be pedantically mindful of the setting or risk being SOL when they fail to and they're combined with another lib wants it configured a different way?

Argument against: You have to know about the existing "quirk" anyway. I didn't know until I got an exception, and that's not exactly a friendly result either. To be honest, if I knew about this, I would have added the option to avoid this (Mysql has options to disable storing this value).

If "Just convert it to 0000-01-01" was the default, then people who need to know about it will fail to be informed. It's fine if the user specifically requests that behavior, because then they're guaranteed to already know about it. Otherwise, I think an exception is better, because then they're forced to notice. Which we kind of have right now (albiet, a rather confusing, unclear exception), we just need a way for the user to actually do something to handle it once they are made aware.

I'll say that in PHP it kind of works this way (this is what I saw from my PHP side when I was getting the exception), the result was something like 12/31/-01. Not that PHP is something to replicate, but it is a precedent :)

Ha ha, well, PHP is a goldmine of precedents for bad designs :)

@ghost91-
Copy link

ghost91- commented May 3, 2018

The mysqln Date type exposes the underlying Phobos Date's interface and can be converted to Phobos Date, throwing a better exception if it's a MySQL zero-date (Is it possible this possible for this to be an implicit conversion? How would that be done? When I think "make custom D type implicitly convert", I think "alias this", but that doesn't really offer a hook one could use to validate before the conversion.) At least with this exception, the cause is more clear, and the user has the ability to work around it by checking "isZeroDate" first.

You can use alias this with @property member functions, so you could do something similar to this:

struct MysqlDate
{
    bool isZeroDate() @property
    {
        /* ... */
    }

    DateTime getDateTime() @property
    {
        if (isZeroDate())
        {
            throw new Exception("Not a valid DateTime");
        }
        else
        {
            return DateTime( /* ... */ );
        }
    }

    alias getDateTime this;
}

@Abscissa
Copy link

Abscissa commented May 3, 2018

You can use alias this with @Property member functions, so you could do something similar to this:

Ooh, awesome! Didn't know that! (The possibility didn't even occur to me.)

@Abscissa Abscissa changed the title Null date problem Zero date problem May 4, 2018
@schveiguy
Copy link
Collaborator

@ghost91- It's a good idea, the only issue I would have is that a Variant is not going to be able to use the alias this.

In other words, if a Variant contains mysql.Date, and I use v.get!(std.datetime.date.Date), it's going to throw. This means there are still lots of code that will break.

I think it's a good idea to have an opt-in mechanism to do this. I still would argue for a runtime configuration for this -- done at the global and/or connection level, and at the same time, provide options for those who don't want to deal with a different type (e.g. allow a default).

@Abscissa
Copy link

Abscissa commented May 10, 2018

Just as a note, partly so I don't forget, I just noticed here that '0000-00-00' is not the only date mysqln needs to be mindful of: It's possible to have any arbitrary dates, like '1972-00-27' or '2001-03-00' where just the month or day (or both) is zero. In light of this, it's probably best to assume a mysql date could potentially be ANYTHING that matches:

NNNN-NN-NN

Where each "N" is 0-9.

@schveiguy
Copy link
Collaborator

It's possible to have any arbitrary dates

This strongly suggests the library needs a mysql Date type. I'd make it POD only, basically, a struct with 3 integer fields, which can be implicitly cast to std.datetime.Date, and then all the checking happens on the conversion. I'd still recommend making it opt-in, as most code written that uses mysql-native is going to be already expecting std.datetime.Date.

@Abscissa
Copy link

Unfortunately, looks like this is going to have to wait until after I cleanup the internal low-level protocol-handling code:

Due to the way the low-level internals are right now, if I want to do this in a way I can be confident I haven't missed any important cases or code-paths, I need to change all the low-level internals to use the new invalid-date-compatible MySQLDate/MySQLDateTime types instead of their Phobos counterparts (and then attempt conversion to the Phobos types when construction each struct Row).

This requires making changes in and around the consume/decode functions (in packet_helpers.d). Unfortunately, nearly every time I've ever tried to make alterations around those functions, including this time (see here and here), it results in mysteriously corrupted data or protocol state. (Which is why I've been itching so much to finally clean up those internals.)

My current attempt is in the 'issue65' branch if anyone wants to take a look. So far, it doesn't expose the new date types for reading or support sending them to the DB, it only uses them internally, and converts to std.datetime for public use (throwing a custom exception if the conversion can't be done). But as I mentioned above, it currently leads to corruption in a couple regression tests (one of which hangs).

@Abscissa Abscissa removed this from the v2.3.0 milestone May 14, 2018
@schveiguy
Copy link
Collaborator

Getting back to purging my database of this nonsense, and I realize, I don't have a good library to use to rid myself of these dates! I can't use mysql-native to search all columns in all tables 😛

Looking forward to both a fix for this and the underlying protocol cleanup (which is sure to make progress easier).

@Sandman83
Copy link

Just as a note, partly so I don't forget, I just noticed here that '0000-00-00' is not the only date mysqln needs to be mindful of: It's possible to have any arbitrary dates, like '1972-00-27' or '2001-03-00' where just the month or day (or both) is zero. In light of this, it's probably best to assume a mysql date could potentially be ANYTHING that matches:

NNNN-NN-NN

Where each "N" is 0-9.

There are validity checks in phobos, std.datetime.date.d like
bool valid(string units)(int value)
and
bool valid(string units)(int year, int month, int day)

Another reason for a POD here, I would say.

@schveiguy
Copy link
Collaborator

Since we are moving away from Variant, this might be a good time to add a MySQLDate type that alias this' itself it a phobos Date. I'll revisit this once the safe branch is pulled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants