-
Notifications
You must be signed in to change notification settings - Fork 14.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(db engines): add support for Opendistro Elasticsearch (AWS ES) #12602
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Quick first pass review
def make_label_compatible(cls, label: str) -> str: | ||
new_label = super().make_label_compatible(label) | ||
return new_label.replace(".", "_") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of overriding make_label_compatible
, we should define a custom _mutate_label
method that does this. See e.g.
superset/superset/db_engine_specs/bigquery.py
Lines 113 to 135 in 9771b82
@staticmethod | |
def _mutate_label(label: str) -> str: | |
""" | |
BigQuery field_name should start with a letter or underscore and contain only | |
alphanumeric characters. Labels that start with a number are prefixed with an | |
underscore. Any unsupported characters are replaced with underscores and an | |
md5 hash is added to the end of the label to avoid possible collisions. | |
:param label: Expected expression label | |
:return: Conditionally mutated label | |
""" | |
label_hashed = "_" + hashlib.md5(label.encode("utf-8")).hexdigest() | |
# if label starts with number, add underscore as first character | |
label_mutated = "_" + label if re.match(r"^\d", label) else label | |
# replace non-alphanumeric characters with underscores | |
label_mutated = re.sub(r"[^\w]+", "_", label_mutated) | |
if label_mutated != label: | |
# add first 5 chars from md5 hash to label to avoid possible collisions | |
label_mutated += label_hashed[:6] | |
return label_mutated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thks! I guess we should rename _mutate_label
to mutate_label
. This is still on it's infancy, still missing the dbapi PR merge and release and on this side the time grains are different for opendistro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree on renaming (let's have a separate PR for that). I guess we should have a separate spec for elasticsearch
and odelasticsearch
to properly support the differing time grains?
Codecov Report
@@ Coverage Diff @@
## master #12602 +/- ##
==========================================
+ Coverage 65.05% 67.36% +2.31%
==========================================
Files 1021 489 -532
Lines 50095 28772 -21323
Branches 5141 0 -5141
==========================================
- Hits 32587 19383 -13204
+ Misses 17332 9389 -7943
+ Partials 176 0 -176
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
When is the plan to merge this PR? Any time lines? |
@syamat this still needs some work on the dialect side also https://github.com/preset-io/elasticsearch-dbapi, but hopping to have this merged in 2 weeks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One quick question/comment
@classmethod | ||
def convert_dttm(cls, target_type: str, dttm: datetime) -> Optional[str]: | ||
if target_type.upper() == utils.TemporalType.DATETIME: | ||
return f"""'{dttm.isoformat(timespec="seconds")}'""" | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't ES/OD support other temporal types like DATE
or TIMESTAMP
'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my understanding by default Elasticsearch only supports date
field type
https://opendistro.github.io/for-elasticsearch-docs/docs/sql/datatypes/#date-and-time-types
...
By default, the Elasticsearch DSL uses the date type as the only date-time related type that contains all information of an absolute time point.
...
For the SQL endpoint, we can use date functions that will return other date/time types out of them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks for clarifying!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
* master: fix: UI toast typo (apache#13026) feat(db engines): add support for Opendistro Elasticsearch (AWS ES) (apache#12602) fix(build): black failing on master, add to required checks (apache#13039) fix: time filter db migration optimization (apache#13015) fix: untranslated text content of Dashboard page (apache#13024) fix(ci): remove signature requirements for commits to master (apache#13034) fix: add alerts and report to default config (apache#12999) docs(changelog): add entries for 1.0.1 (apache#12981) ci: skip cypress if no code changes (apache#12982) chore: add cypress required checks for branch protection (apache#12970) Refresh dashboard list after bulk delete (apache#12945) Updates storybook to version 6.1.17 (apache#13014) feat: Save datapanel state in local storage (apache#12996) fix: added text and changed margins (apache#12923) chore: Swap Slack Url 2 more places (apache#13004)
…pache#12602) * feat(db engines): add support for Opendistro Elasticsearch (AWS ES) * add time grains * lint * bump elasticsearch-dbapi version * add tests * fix test
@villebro When do you plan to release a new version with this improvement ? |
…pache#12602) * feat(db engines): add support for Opendistro Elasticsearch (AWS ES) * add time grains * lint * bump elasticsearch-dbapi version * add tests * fix test (cherry picked from commit b3a814f)
SUMMARY
Adds Elasticsearch opendistro support. Most work was done upstream on the release of
elasticsearch-dbapi
0.2.0.Checkout https://github.com/preset-io/elasticsearch-dbapi/blob/master/README.md for more details.
ADDITIONAL INFORMATION