-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(mssql): week time grain should respect datefirst setting #10811
Conversation
@@ -46,7 +46,7 @@ class MssqlEngineSpec(BaseEngineSpec): | |||
"PT0.5H": "DATEADD(minute, DATEDIFF(minute, 0, {col}) / 30 * 30, 0)", | |||
"PT1H": "DATEADD(hour, DATEDIFF(hour, 0, {col}), 0)", | |||
"P1D": "DATEADD(day, DATEDIFF(day, 0, {col}), 0)", | |||
"P1W": "DATEADD(week, DATEDIFF(week, 0, {col}), 0)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The P1W
time grain tends to refer to the official weekly date truncation, in this case WEEK
. There are a few time grains that target specific weekday starts that can be seen here: https://github.com/apache/incubator-superset/blob/master/superset/db_engine_specs/base.py#L94-L95 . Do you think it would make sense to leave P1W
unchanged, and add two new time grains, specifically 1969-12-28T00:00:00Z/P1W
to also refer to the regular WEEK
interval, and 1969-12-29T00:00:00Z/P1W
to refer to the proposed monday starting week definition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we want see more or less consistent behaviour of Superset across all the supported engines we should check how other engines calculate P1W interval i.e. what "the official weekly date truncation" actually means: respecting or not respecting the current locale. If they do respect the locale then we should change it for MS SQL server accordingly (as proposed). Else we surely can introduce these 2 explicit "starting Monday"/"starting Sunday" intervals, but I think it also worth mentioning as a remark in the documentation to avoid confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With "official weekly date truncation" I was referring to what the database by default means by WEEK
, which can mean different things for different databases. Personally, I tend to use the weekly grain to see what 7 day aggregates are; I usually don't so much care if it's a week starting on Sunday or Monday. For the use case where this is important, I suggest using the dedicated time grains where supported. Going forward, I do agree it would be great to start introducing more consistency in what these terms mean, and we encourage and gladly accept contributions to that end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With regard to the proposed change here, and given the ambiguity of what P1W
means, I think this change makes sense as a "best effort" weekly time grain (for the lack of a better term). But if you could add the explicit Sunday and Monday starting grains, then users could have the option of using either the local week definition, or a fixed one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, finally
- As of "P1W" I still believe that Superset should behave expectedly i.e. if the underlying database/connection has certain locale settings Superset should not override them unless the user wants to specify it explicitly. I also have added the
DATEADD(day, DATEDIFF(day, 0, {col}), 0)
wrap around the{col}
in order to correctly truncate the time part in case of MS SQL 2005/2008. Not the least is that the previous version has returned a wrong (T+1) value for the axis' labels / cells text, though is hasn't affected the grouping itself. - I added the support of "1969-12-28T00:00:00Z/P1W" and "1969-12-29T00:00:00Z/P1W" grains for the user to have a choice to explicitly indicate which start of the week he or she prefers for the query. For the "week_start_sunday" I used the previous version of "P1W" with subtraction of one day. The "week_start_monday" is calculated in similar fashion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@binome74 I'm ok with defaulting to the default locale of the database. In the latest commit I didn't see the original DATEPART
call; did I understand correctly that the current syntax works similarly, but supports older versions of SQL Server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
"P1W": "DATEADD(day, 1 - DATEPART(weekday, {col}), DATEADD(day, DATEDIFF(day, 0, {col}), 0))"
here DATEADD(day, 1 - DATEPART(weekday, {col})
does the job
and DATEADD(day, DATEDIFF(day, 0, {col}), 0))
just truncates the time part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds good. Stylistic request: can we change the reserved names (day
, weekday
etc) to all caps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also need to make the changes proposed by the linter:
superset/db_engine_specs/mssql.py:54:0: C0301: Line too long (99/88) (line-too-long)
superset/db_engine_specs/mssql.py:49:0: C0301: Line too long (102/88) (line-too-long)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. Did it so.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue |
…usted the "week" grain for better backward compatibility with MS SQL 2005/2008.
Codecov Report
@@ Coverage Diff @@
## master #10811 +/- ##
==========================================
- Coverage 77.04% 76.84% -0.21%
==========================================
Files 1041 1041
Lines 56068 56126 +58
Branches 7742 7742
==========================================
- Hits 43198 43129 -69
- Misses 12612 12739 +127
Partials 258 258
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue |
Is there anything else I need to do to have my PR accepted? |
@binome74 this looks good, but the tests have expired. Can you close and then just reopen this PR to retrigger all the test runs? |
@binome74 I'm terribly sorry this PR got overlooked. I rebased it, fixed the conflict and will merge once CI passes. Thanks again so much for the contribution, looking forward to getting this in! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* "P1W" grain should respect DATEFIRST setting in MS SQL Server * Added "week_start_sunday" and "week_start_monday" grains support. Adjusted the "week" grain for better backward compatibility with MS SQL 2005/2008. * Stylistic and linter-requested changes * fix test Co-authored-by: Valeriy Aleksashkin <v.aleksashkin@gmail.com> Co-authored-by: Ville Brofeldt <ville.v.brofeldt@gmail.com>
SUMMARY
In MS SQL Server specifying SET DATEFIRST has no effect on DATEDIFF. DATEDIFF always uses Sunday as the first day of the week to ensure the function operates in a deterministic way. This is not desirable for locales where weeks start on Monday. To get the first day of a calendar week for the current locale use DATEPART which does respect the DATEFIRST setting.
TEST PLAN
ADDITIONAL INFORMATION