[explore] proper filtering of NULLs and '' #4651

mistercrunch · 2018-03-20T06:24:04Z

codecov-io · 2018-03-20T17:26:26Z

Codecov Report

Merging #4651 into master will decrease coverage by 0.06%.
The diff coverage is 75.51%.

@@            Coverage Diff             @@
##           master    #4651      +/-   ##
==========================================
- Coverage   76.98%   76.91%   -0.07%     
==========================================
  Files          44       44              
  Lines        8498     8522      +24     
==========================================
+ Hits         6542     6555      +13     
- Misses       1956     1967      +11

Impacted Files	Coverage Δ
superset/connectors/sqla/models.py	`75.4% <50%> (-0.92%)`	⬇️
superset/connectors/druid/models.py	`81.16% <75%> (-0.45%)`	⬇️
superset/connectors/base/models.py	`90.5% <86.95%> (-0.61%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44c2d5b...741832e.

hughhhh · 2018-03-21T04:56:23Z

Definitely need this!!

hughhhh · 2018-03-21T08:18:13Z

Did some work on the Druid side. I can pick up the rest, but I wanted to get some insight before I go on.

https://github.com/hughhhh/incubator-superset/commit/3bc3bd46d05a96335f9663745f68429f10b5eff3

jeffreythewang

I had a [WIP] idea (tc-dc#8) for handling nulls that involves showing a tooltip next to the label, to distinguish between actual null values and "null" string values.

(I've only tested with Druid, and so far it only works when Enable Filter Select is on.)

jeffreythewang · 2018-03-22T19:03:51Z

superset/assets/javascripts/components/AlteredSliceTag.jsx

@@ -61,7 +61,7 @@ export default class AlteredSliceTag extends React.Component {
        return '[]';
      }
      return value.map((v) => {
-        const filterVal = v.val.constructor === Array ? `[${v.val.join(', ')}]` : v.val;
+        const filterVal = v.val && v.val.constructor === Array ? `[${v.val.join(', ')}]` : v.val;


Might be my unfamiliarity, but why not use Array.isArray like in other parts of the code?

mistercrunch (Mar 23, 2018): Didn't write that line, but FWIW someone on Stack Overflow says it's the best way:
https://stackoverflow.com/questions/767486/how-do-you-check-if-a-variable-is-an-array-in-javascript

betodealmeida · 2018-04-12T21:28:51Z

superset/connectors/base/models.py

+        def handle_single_value(v):
+            # backward compatibility with previous <select> components
+            if isinstance(v, basestring):
+                v = v.strip().strip("'").strip('"')


You can do a single strip to remove whitespace and single/double quotes:

v = v.strip(' \'"')

Or, to also take care of tabs and line feeds:

v = v.strip('\t\n \'"')
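A quick sketch of how the three variants behave, written for Python 3 (so plain `str` rather than `basestring`). The function names are made up for the comparison and are not from the PR. One thing it illustrates: a single `strip(' \'"')` no longer removes tabs or newlines, which the bare `strip()` in the original chain did, so the second suggestion is the closer replacement.

```python
def strip_chained(v):
    # Original form: whitespace first, then single quotes, then double quotes.
    return v.strip().strip("'").strip('"')


def strip_once(v):
    # First suggestion: spaces and both quote characters in one pass.
    return v.strip(' \'"')


def strip_once_with_whitespace(v):
    # Second suggestion: also covers tabs and line feeds.
    return v.strip('\t\n \'"')


samples = ["  'abc'  ", "\t'abc'\n", "\"'abc'\""]
for s in samples:
    print(repr(s), repr(strip_chained(s)), repr(strip_once(s)),
          repr(strip_once_with_whitespace(s)))
```

Note that the single-pass versions also peel off nested or interleaved quotes (the third sample), which the chained version only does for one layer of each quote type in a fixed order.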

betodealmeida · 2018-04-12T21:30:50Z

superset/connectors/base/models.py

+            return v
+        if isinstance(values, (list, tuple)):
+            values = [handle_single_value(v) for v in values]
+        values = handle_single_value(values)


This is a bit confusing... I'd add the second call to an else block (unless I'm missing something):

if isinstance(values, (list, tuple)):
    values = [handle_single_value(v) for v in values]
else:
    values = handle_single_value(values)
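Spelled out as a standalone sketch (Python 3, so `str` stands in for `basestring`). `normalize_values` is a hypothetical wrapper name, and only the stripping step from the diff is reproduced; the real handler may do more.

```python
def handle_single_value(v):
    # Only strings are cleaned up; other types pass through unchanged.
    if isinstance(v, str):
        v = v.strip('\t\n \'"')
    return v


def normalize_values(values):
    # Each input takes exactly one branch, so a list or tuple is never
    # handed to handle_single_value as a whole.
    if isinstance(values, (list, tuple)):
        values = [handle_single_value(v) for v in values]
    else:
        values = handle_single_value(values)
    return values
```

With the `else`, a reader does not have to work out that calling `handle_single_value` on a list happens to be a no-op.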

betodealmeida · 2018-04-12T21:32:06Z

superset/connectors/base/models.py

+            values = [handle_single_value(v) for v in values]
+        values = handle_single_value(values)
+        if is_list_target and not isinstance(values, (tuple, list)):
+            values = [values]


What if is_list_target is true but values is a tuple? In this case it would return a tuple, is that ok?

The variable should probably be called is_iterable_target, both tuple and list will work here.
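To make the tuple case concrete, here is a small sketch of the wrapping step on its own. `wrap_for_list_target` is a hypothetical name, not a function from the PR. Taken in isolation, these two lines pass a tuple through unchanged; in the hunk as shown, though, the list comprehension that runs first would already have turned a tuple into a list.

```python
def wrap_for_list_target(values, is_list_target):
    # Mirrors the two lines under review: a scalar is wrapped in a list
    # when the operator expects a collection; lists and tuples are left alone.
    if is_list_target and not isinstance(values, (tuple, list)):
        values = [values]
    return values


print(wrap_for_list_target('a', True))
print(wrap_for_list_target(('a', 'b'), True))
print(wrap_for_list_target('a', False))
```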

betodealmeida · 2018-04-12T21:33:22Z

superset/connectors/druid/models.py

-                    eq = utils.string_to_num(eq)
-
+            eq = cls.filter_values_handler(
+                eq, is_list_target=op in ('in', 'not in'),


This is a bit hard to read, can we define is_list_target outside the function call, like you did for is_numeric_col?
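Something along these lines, as a sketch. `filter_values_handler` here is a stand-in stub rather than the method from the PR, and `build_eq` is a hypothetical helper that exists only to make the example runnable.

```python
def filter_values_handler(values, is_list_target=False):
    # Stand-in stub: wraps a scalar when a list is expected.
    if is_list_target and not isinstance(values, (tuple, list)):
        values = [values]
    return values


def build_eq(eq, op):
    # Naming the condition first keeps the call itself short and readable.
    is_list_target = op in ('in', 'not in')
    return filter_values_handler(eq, is_list_target=is_list_target)
```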

betodealmeida · 2018-04-12T21:33:47Z

superset/connectors/sqla/models.py

+            eq = self.filter_values_handler(
+                flt.get('val'),
+                target_column_is_numeric=col_obj.is_num,
+                is_list_target=op in ('in', 'not in'))


Same as above here.

betodealmeida · 2018-04-12T21:34:02Z

superset/viz.py

@@ -1693,6 +1693,8 @@ def run_extra_queries(self):
        for flt in filters:
            qry['groupby'] = [flt]
            df = self.get_df_payload(query_obj=qry).get('df')
+            print(df)
+            print(df.dtypes)


betodealmeida · 2018-04-12T21:34:24Z

tests/druid_func_tests.py

        filtr = {'col': 'A', 'op': '==', 'val': []}
        res = DruidDatasource.get_filters([filtr], [])
-        self.assertEqual('', res.filter['filter']['value'])
+        self.assertEqual(None, res.filter['filter']['value'])


Use assertIsNone instead.
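For reference, a minimal standalone sketch of the suggested assertion. The `res_filter` dictionary is a hand-built stand-in for `res.filter`, not output from `DruidDatasource.get_filters`.

```python
import unittest


class FilterValueIsNoneExample(unittest.TestCase):
    def test_value_is_none(self):
        res_filter = {'filter': {'value': None}}
        # assertIsNone checks identity with None and states the intent
        # more directly than assertEqual(None, ...).
        self.assertIsNone(res_filter['filter']['value'])
```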

betodealmeida · 2018-04-12T21:34:35Z

tests/druid_func_tests.py


    def test_get_filters_handles_none_for_string_types(self):
        filtr = {'col': 'A', 'op': '==', 'val': None}
        res = DruidDatasource.get_filters([filtr], [])
-        self.assertEqual('', res.filter['filter']['value'])
+        self.assertEqual(None, res)


mistercrunch · 2018-04-13T00:21:44Z

@betodealmeida addressed your comments

TODO: handling of Druid equivalents

Error "unorderable types: str() < int()" occurs when grouping by a numerical Druid column that contains null values.

* druid/pydruid returns strings in the dataframe, with NAs for nulls
* Superset has custom logic around get_fillna_for_col that fills in the NULLs based on the declared column type (FLOAT here), so now we have a mixed bag of types in the series
* pandas chokes on pivot_table or groupby operations as it cannot sort mixed types

The approach here is to stringify and fillna('<NULL>') to get a consistent series.
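The underlying failure can be reproduced without pandas, since it comes from Python 3 refusing to order `str` against `int`. The sketch below is a plain-Python analogue of the stringify-and-fill approach, not the code from the PR; `stringify_with_null_label` and the sample values are made up for illustration.

```python
NULL_LABEL = '<NULL>'


def stringify_with_null_label(values, null_label=NULL_LABEL):
    """Return values as strings, replacing None with a placeholder label."""
    return [null_label if v is None else str(v) for v in values]


# Strings as returned by the source, with a numeric 0 filled in for a null:
mixed = ['1.5', '2.0', 0, '3.25']
try:
    sorted(mixed)
    mixed_sort_failed = False
except TypeError:
    # Python 3 cannot compare str with int, so sorting mixed types fails.
    mixed_sort_failed = True

# Filling nulls with a string label instead keeps the series homogeneous.
raw = ['1.5', '2.0', None, '3.25']
consistent = stringify_with_null_label(raw)
sorted_consistent = sorted(consistent)
print(mixed_sort_failed, sorted_consistent)
```

The trade-off is that the values now sort as strings rather than as numbers, which is acceptable for grouping keys but not for numeric ordering.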

* [WiP] [explore] proper filtering of NULLs and ''

  TODO: handling of Druid equivalents

* Unit tests
* Some refactoring
* [druid] fix 'Unorderable types' when col has nulls

  Error "unorderable types: str() < int()" occurs when grouping by a numerical Druid column that contains null values.

  * druid/pydruid returns strings in the dataframe, with NAs for nulls
  * Superset has custom logic around get_fillna_for_col that fills in the NULLs based on the declared column type (FLOAT here), so now we have a mixed bag of types in the series
  * pandas chokes on pivot_table or groupby operations as it cannot sort mixed types

  The approach here is to stringify and fillna('<NULL>') to get a consistent series.

* typo
* Fix druid_func tests
* Addressing more comments
* last touches

mistercrunch force-pushed the nulls branch from 4dc7ee6 to 2f2f752 Compare March 20, 2018 06:25

mistercrunch mentioned this pull request Mar 20, 2018

[Filter Box] Boolean values show up blank in selector #4297

Closed

3 tasks

jeffreythewang reviewed Mar 22, 2018

mistercrunch force-pushed the nulls branch 4 times, most recently from f513bc8 to bf32f56 Compare March 27, 2018 16:53

hughhhh mentioned this pull request Mar 30, 2018

Need isNull isNotNull type filter option #4716

Closed

3 tasks

mistercrunch force-pushed the nulls branch from bf32f56 to b4798d2 Compare April 5, 2018 18:33

mistercrunch mentioned this pull request Apr 11, 2018

[druid] fix 'Unorderable types' when col has nulls #4771

Closed

mistercrunch force-pushed the nulls branch 2 times, most recently from e0956e6 to aeb5bf2 Compare April 11, 2018 05:25

mistercrunch changed the title ~~[WiP] [explore] proper filtering of NULLs and ''~~ [explore] proper filtering of NULLs and '' Apr 11, 2018

mistercrunch force-pushed the nulls branch from 570e108 to 004aac5 Compare April 11, 2018 18:16

betodealmeida reviewed Apr 12, 2018

mistercrunch force-pushed the nulls branch from 9616fa8 to fb45522 Compare April 13, 2018 00:19

mistercrunch force-pushed the nulls branch from 2b9336e to e1611c6 Compare April 18, 2018 04:16

mistercrunch added 8 commits April 18, 2018 04:44

[WiP] [explore] proper filtering of NULLs and ''

e4b449d

TODO: handling of Druid equivalents

Unit tests

a66801d

Some refactoring

6e74170

typo

289b5aa

Fix druid_func tests

ffe30a1

Addressing more comments

01aa3a2

last touches

741832e

mistercrunch force-pushed the nulls branch from e1611c6 to 741832e Compare April 18, 2018 04:45

mistercrunch merged commit eac97ce into apache:master Apr 18, 2018

mistercrunch deleted the nulls branch April 18, 2018 05:26

mistercrunch mentioned this pull request Apr 23, 2018

[explore, PostgreSQL] filter values not correctly displayed for boolean columns #3007

Closed

3 tasks

stephenLYZ mentioned this pull request Jan 11, 2022

fix(sql): unable to filter text with quotes #17881

Merged

9 tasks

mistercrunch added the 🏷️ bot and 🚢 0.25.0 labels Feb 27, 2024
