BUG: fixed .str.contains(..., na=False) for categorical series #22170
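
For orientation, the behaviour named in the title can be sketched as follows. This is a minimal illustration rather than the PR's own test: the sample values and the pattern `"an"` are made up, and the output comparison assumes a pandas release that includes this fix (0.24.0 or later, per the whatsnew file touched in the commit list).

```python
import numpy as np
import pandas as pd

# Illustrative data (not taken from the PR): a categorical Series
# with one missing value.
s = pd.Series(["apple", "banana", np.nan], dtype="category")

# With na=False, the missing entry should come back as False
# ("no match") instead of being propagated as NaN.
result = s.str.contains("an", na=False)

# The same call on an object-dtype Series, for comparison.
expected = pd.Series(["apple", "banana", np.nan], dtype=object).str.contains(
    "an", na=False
)

print(result.tolist())
print(expected.tolist())
```

On a fixed version, the categorical and object-dtype results should agree.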

Merged on Nov 20, 2018 (36 commits). Changes from 1 commit are shown below.

Commits
077d136
BUG: fixed .str.contains(..., na=False) for categorical series
pulkitmaloo Aug 2, 2018
2ae44d1
na argument for _wrap_results
pulkitmaloo Aug 2, 2018
a1b3d7b
fixed str.contains for missing values
pulkitmaloo Aug 17, 2018
dbd990b
PEP8 Issue: added whitespace after ','
pulkitmaloo Aug 17, 2018
9d5d2c2
Merge branch 'master' of https://github.com/pulkitmaloo/pandas into c…
pulkitmaloo Aug 24, 2018
90aef7b
na argument for wrap_results in match
pulkitmaloo Aug 24, 2018
69f16af
BUG: fixed .str.contains(..., na=False) for categorical series
pulkitmaloo Aug 2, 2018
78cf8c7
na argument for _wrap_results
pulkitmaloo Aug 2, 2018
93bb24a
fixed str.contains for missing values
pulkitmaloo Aug 17, 2018
6c2700f
PEP8 Issue: added whitespace after ','
pulkitmaloo Aug 17, 2018
6649129
na argument for wrap_results in match
pulkitmaloo Aug 24, 2018
5c87e81
Merge branch 'categorical_bug' of https://github.com/pulkitmaloo/pand…
pulkitmaloo Aug 24, 2018
a09dcc5
Merge branch 'master' into categorical_bug
pulkitmaloo Aug 25, 2018
3abdea5
Update circle-27-compat.yaml
pulkitmaloo Sep 22, 2018
d136599
Update travis-27-locale.yaml
pulkitmaloo Sep 22, 2018
b942e16
Merge branch 'categorical_bug' of https://github.com/pulkitmaloo/pand…
pulkitmaloo Sep 22, 2018
82f9b9e
added tests for na arg for categorical and objects
pulkitmaloo Sep 22, 2018
53e9253
updated _wrap_results with arg fill_value and removed na
pulkitmaloo Sep 23, 2018
9f0286f
merging
pulkitmaloo Sep 24, 2018
ffa9969
BUG: fixed .str.contains(..., na=False) for categorical series
pulkitmaloo Aug 2, 2018
07c1d73
na argument for _wrap_results
pulkitmaloo Aug 2, 2018
7f1f2e2
fixed str.contains for missing values
pulkitmaloo Aug 17, 2018
f6cb04f
PEP8 Issue: added whitespace after ','
pulkitmaloo Aug 17, 2018
1f0256a
na argument for wrap_results in match
pulkitmaloo Aug 24, 2018
7025f34
Update travis-27-locale.yaml
pulkitmaloo Sep 22, 2018
7542448
added tests for na arg for categorical and objects
pulkitmaloo Sep 22, 2018
f1b4274
updated _wrap_results with arg fill_value and removed na
pulkitmaloo Sep 23, 2018
386ab98
Merge branch 'categorical_bug' of https://github.com/pulkitmaloo/pand…
pulkitmaloo Sep 24, 2018
7a09c44
Update travis-27-locale.yaml
pulkitmaloo Sep 24, 2018
3408920
fixed line too long
pulkitmaloo Sep 24, 2018
3288d11
whatsnew note
pulkitmaloo Sep 25, 2018
6c87770
Update v0.24.0.txt
pulkitmaloo Sep 25, 2018
d242647
Update doc/source/whatsnew/v0.24.0.txt
TomAugspurger Oct 19, 2018
def1b4e
Merge branch 'master' into PR_TOOL_MERGE_PR_22170
jreback Nov 18, 2018
fd99431
whatsnew
jreback Nov 18, 2018
44b36a4
cleanup
jreback Nov 18, 2018

Merge branch 'master' into PR_TOOL_MERGE_PR_22170
jreback committed Nov 18, 2018
commit def1b4eb8db4c08fba8bcf0dc859a9e94cd53f5d
125 changes: 5 additions & 120 deletions .circleci/config.yml
@@ -1,43 +1,6 @@
version: 2
jobs:

# --------------------------------------------------------------------------
# 0. py27_compat
# --------------------------------------------------------------------------
py27_compat:
docker:
- image: continuumio/miniconda:latest
# databases configuration
- image: circleci/postgres:9.6.5-alpine-ram
environment:
POSTGRES_USER: postgres
POSTGRES_DB: pandas_nosetest
- image: circleci/mysql:8-ram
environment:
MYSQL_USER: "root"
MYSQL_HOST: "localhost"
MYSQL_ALLOW_EMPTY_PASSWORD: "true"
MYSQL_DATABASE: "pandas_nosetest"
environment:
JOB: "2.7_COMPAT"
ENV_FILE: "ci/circle-27-compat.yaml"
LOCALE_OVERRIDE: "it_IT.UTF-8"
MINICONDA_DIR: /home/ubuntu/miniconda3
steps:
- checkout
- run:
name: build
command: |
./ci/install_circle.sh
./ci/show_circle.sh
- run:
name: test
command: ./ci/run_circle.sh --skip-slow --skip-network

# --------------------------------------------------------------------------
# 1. py36_locale
# --------------------------------------------------------------------------
py36_locale:
build:
docker:
- image: continuumio/miniconda:latest
# databases configuration
@@ -54,94 +17,16 @@ jobs:

environment:
JOB: "3.6_LOCALE"
ENV_FILE: "ci/circle-36-locale.yaml"
ENV_FILE: "ci/deps/circle-36-locale.yaml"
LOCALE_OVERRIDE: "zh_CN.UTF-8"
MINICONDA_DIR: /home/ubuntu/miniconda3
steps:
- checkout
- run:
name: build
command: |
./ci/install_circle.sh
./ci/show_circle.sh
./ci/circle/install_circle.sh
./ci/circle/show_circle.sh
- run:
name: test
command: ./ci/run_circle.sh --skip-slow --skip-network

# --------------------------------------------------------------------------
# 2. py36_locale_slow
# --------------------------------------------------------------------------
py36_locale_slow:
docker:
- image: continuumio/miniconda:latest
# databases configuration
- image: circleci/postgres:9.6.5-alpine-ram
environment:
POSTGRES_USER: postgres
POSTGRES_DB: pandas_nosetest
- image: circleci/mysql:8-ram
environment:
MYSQL_USER: "root"
MYSQL_HOST: "localhost"
MYSQL_ALLOW_EMPTY_PASSWORD: "true"
MYSQL_DATABASE: "pandas_nosetest"

environment:
JOB: "3.6_LOCALE_SLOW"
ENV_FILE: "ci/circle-36-locale_slow.yaml"
LOCALE_OVERRIDE: "zh_CN.UTF-8"
MINICONDA_DIR: /home/ubuntu/miniconda3
steps:
- checkout
- run:
name: build
command: |
./ci/install_circle.sh
./ci/show_circle.sh
- run:
name: test
command: ./ci/run_circle.sh --only-slow --skip-network

# --------------------------------------------------------------------------
# 3. py35_ascii
# --------------------------------------------------------------------------
py35_ascii:
docker:
- image: continuumio/miniconda:latest
# databases configuration
- image: circleci/postgres:9.6.5-alpine-ram
environment:
POSTGRES_USER: postgres
POSTGRES_DB: pandas_nosetest
- image: circleci/mysql:8-ram
environment:
MYSQL_USER: "root"
MYSQL_HOST: "localhost"
MYSQL_ALLOW_EMPTY_PASSWORD: "true"
MYSQL_DATABASE: "pandas_nosetest"

environment:
JOB: "3.5_ASCII"
ENV_FILE: "ci/circle-35-ascii.yaml"
LOCALE_OVERRIDE: "C"
MINICONDA_DIR: /home/ubuntu/miniconda3
steps:
- checkout
- run:
name: build
command: |
./ci/install_circle.sh
./ci/show_circle.sh
- run:
name: test
command: ./ci/run_circle.sh --skip-slow --skip-network


workflows:
version: 2
build_and_test:
jobs:
- py27_compat
- py36_locale
- py36_locale_slow
- py35_ascii
command: ./ci/circle/run_circle.sh --skip-slow --skip-network
1 change: 0 additions & 1 deletion .gitignore
@@ -109,6 +109,5 @@ doc/build/html/index.html
# Windows specific leftover:
doc/tmp.sv
doc/source/styled.xlsx
doc/source/templates/
env/
doc/source/savefig/
13 changes: 11 additions & 2 deletions .pep8speaks.yml
@@ -3,9 +3,18 @@
scanner:
diff_only: True # If True, errors caused by only the patch are shown

# Opened issue in pep8speaks, so we can directly use the config in setup.cfg
# (and avoid having to duplicate it here):
# https://github.com/OrkoHunter/pep8speaks/issues/95

pycodestyle:
max-line-length: 79
ignore: # Errors and warnings to ignore
ignore:
- W503, # line break before binary operator
- W504, # line break after binary operator
- E402, # module level import not at top of file
- E722, # do not use bare except
- E731, # do not assign a lambda expression, use a def
- W503 # line break before binary operator
- C406, # Unnecessary list literal - rewrite as a dict literal.
- C408, # Unnecessary dict call - rewrite as a literal.
- C409 # Unnecessary list passed to tuple() - rewrite as a tuple literal.
50 changes: 20 additions & 30 deletions .travis.yml
@@ -23,7 +23,7 @@ env:

git:
# for cloning
depth: 1000
depth: 1500

matrix:
fast_finish: true
@@ -34,55 +34,49 @@ matrix:
include:
- dist: trusty
env:
- JOB="3.7" ENV_FILE="ci/travis-37.yaml" TEST_ARGS="--skip-slow --skip-network"
- JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" TEST_ARGS="--skip-slow --skip-network"

- dist: trusty
env:
- JOB="2.7, locale, slow, old NumPy" ENV_FILE="ci/travis-27-locale.yaml" LOCALE_OVERRIDE="zh_CN.UTF-8" SLOW=true
- JOB="2.7, locale, slow, old NumPy" ENV_FILE="ci/deps/travis-27-locale.yaml" LOCALE_OVERRIDE="zh_CN.UTF-8" SLOW=true
addons:
apt:
packages:
- language-pack-zh-hans
- dist: trusty
env:
- JOB="2.7, lint" ENV_FILE="ci/travis-27.yaml" TEST_ARGS="--skip-slow" LINT=true
- JOB="2.7" ENV_FILE="ci/deps/travis-27.yaml" TEST_ARGS="--skip-slow"
addons:
apt:
packages:
- python-gtk2
- dist: trusty
env:
- JOB="3.6, coverage" ENV_FILE="ci/travis-36.yaml" TEST_ARGS="--skip-slow --skip-network" PANDAS_TESTING_MODE="deprecate" COVERAGE=true DOCTEST=true
# In allow_failures
- dist: trusty
env:
- JOB="3.6, slow" ENV_FILE="ci/travis-36-slow.yaml" SLOW=true
# In allow_failures
- JOB="3.6, lint, coverage" ENV_FILE="ci/deps/travis-36.yaml" TEST_ARGS="--skip-slow --skip-network" PANDAS_TESTING_MODE="deprecate" COVERAGE=true LINT=true
- dist: trusty
env:
- JOB="3.7, NumPy dev" ENV_FILE="ci/travis-37-numpydev.yaml" TEST_ARGS="--skip-slow --skip-network -W error" PANDAS_TESTING_MODE="deprecate"
- JOB="3.7, NumPy dev" ENV_FILE="ci/deps/travis-37-numpydev.yaml" TEST_ARGS="--skip-slow --skip-network -W error" PANDAS_TESTING_MODE="deprecate"
addons:
apt:
packages:
- xsel

# In allow_failures
- dist: trusty
env:
- JOB="3.6, doc" ENV_FILE="ci/travis-36-doc.yaml" DOC=true
- JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" SLOW=true

# In allow_failures
- dist: trusty
env:
- JOB="3.6, doc" ENV_FILE="ci/deps/travis-36-doc.yaml" DOC=true
allow_failures:
- dist: trusty
env:
- JOB="3.6, slow" ENV_FILE="ci/travis-36-slow.yaml" SLOW=true
- dist: trusty
env:
- JOB="3.7, NumPy dev" ENV_FILE="ci/travis-37-numpydev.yaml" TEST_ARGS="--skip-slow --skip-network -W error" PANDAS_TESTING_MODE="deprecate"
addons:
apt:
packages:
- xsel
- JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" SLOW=true
- dist: trusty
env:
- JOB="3.6, doc" ENV_FILE="ci/travis-36-doc.yaml" DOC=true
- JOB="3.6, doc" ENV_FILE="ci/deps/travis-36-doc.yaml" DOC=true

before_install:
- echo "before_install"
@@ -114,22 +108,18 @@ script:
- ci/run_build_docs.sh
- ci/script_single.sh
- ci/script_multi.sh
- ci/lint.sh
- ci/doctests.sh
- echo "checking imports"
- source activate pandas && python ci/check_imports.py
- echo "script done"
- ci/code_checks.sh

after_success:
- ci/upload_coverage.sh

after_script:
- echo "after_script start"
- source activate pandas && pushd /tmp && python -c "import pandas; pandas.show_versions();" && popd
- if [ -e /tmp/single.xml ]; then
ci/print_skipped.py /tmp/single.xml;
- if [ -e test-data-single.xml ]; then
ci/print_skipped.py test-data-single.xml;
fi
- if [ -e /tmp/multiple.xml ]; then
ci/print_skipped.py /tmp/multiple.xml;
- if [ -e test-data-multiple.xml ]; then
ci/print_skipped.py test-data-multiple.xml;
fi
- echo "after_script done"
6 changes: 3 additions & 3 deletions README.md
@@ -56,8 +56,8 @@
 <tr>
 <td></td>
 <td>
-<a href="https://ci.appveyor.com/project/pandas-dev/pandas">
-<img src="https://ci.appveyor.com/api/projects/status/86vn83mxgnl4xf1s/branch/master?svg=true" alt="appveyor build status" />
+<a href="https://dev.azure.com/pandas-dev/pandas/_build/latest?definitionId=1&branch=master">
+<img src="https://dev.azure.com/pandas-dev/pandas/_apis/build/status/pandas-dev.pandas?branch=master" alt="Azure Pipelines build status" />
 </a>
 </td>
 </tr>
@@ -97,7 +97,7 @@ easy and intuitive. It aims to be the fundamental high-level building block for
 doing practical, **real world** data analysis in Python. Additionally, it has
 the broader goal of becoming **the most powerful and flexible open source data
 analysis / manipulation tool available in any language**. It is already well on
-its way toward this goal.
+its way towards this goal.

 ## Main Features
 Here are just a few of the things that pandas does well:
17 changes: 4 additions & 13 deletions asv_bench/benchmarks/algorithms.py
@@ -9,16 +9,12 @@
     try:
         hashing = import_module(imp)
         break
-    except:
+    except (ImportError, TypeError, ValueError):
         pass

-from .pandas_vb_common import setup  # noqa
-

 class Factorize(object):

-    goal_time = 0.2
-
     params = [True, False]
     param_names = ['sort']

@@ -40,8 +36,6 @@ def time_factorize_string(self, sort):

 class Duplicated(object):

-    goal_time = 0.2
-
     params = ['first', 'last', False]
     param_names = ['keep']

@@ -63,8 +57,6 @@ def time_duplicated_string(self, keep):

 class DuplicatedUniqueIndex(object):

-    goal_time = 0.2
-
     def setup(self):
         N = 10**5
         self.idx_int_dup = pd.Int64Index(np.arange(N * 5))
@@ -77,8 +69,6 @@ def time_duplicated_unique_int(self):

 class Match(object):

-    goal_time = 0.2
-
     def setup(self):
         self.uniques = tm.makeStringIndex(1000).values
         self.all = self.uniques.repeat(10)
@@ -90,8 +80,6 @@ def time_match_string(self):

 class Hashing(object):

-    goal_time = 0.2
-
     def setup_cache(self):
         N = 10**5

@@ -126,3 +114,6 @@ def time_series_timedeltas(self, df):

     def time_series_dates(self, df):
         hashing.hash_pandas_object(df['dates'])
+
+
+from .pandas_vb_common import setup  # noqa: F401
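
The narrowed `except` clause in this file follows a common pattern for optional imports: try several candidate module paths and swallow only the errors an import can reasonably raise. A standalone sketch of that pattern, using only the standard library, might look like the following. The helper name `import_first_available` and the module names passed to it are hypothetical and are not part of the pandas benchmark suite.

```python
from importlib import import_module


def import_first_available(candidates):
    """Return the first module in `candidates` that imports, else None.

    Hypothetical helper for illustration only.
    """
    for name in candidates:
        try:
            return import_module(name)
        except (ImportError, TypeError, ValueError):
            # Only import-related failures are swallowed here. A bare
            # `except:` would also catch KeyboardInterrupt and SystemExit.
            continue
    return None


# The first name is deliberately nonexistent, so the loop falls through
# to the standard-library `json` module.
mod = import_first_available(["hypothetical_missing_module_xyz", "json"])
print(mod.__name__)
```

Listing the exception types also satisfies the E722 check ("do not use bare except") that appears in the `.pep8speaks.yml` diff above.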
9 changes: 3 additions & 6 deletions asv_bench/benchmarks/attrs_caching.py
@@ -5,13 +5,9 @@
 except ImportError:
     from pandas.util.decorators import cache_readonly

-from .pandas_vb_common import setup  # noqa
-

 class DataFrameAttributes(object):

-    goal_time = 0.2
-
     def setup(self):
         self.df = DataFrame(np.random.randn(10, 6))
         self.cur_index = self.df.index
@@ -25,8 +21,6 @@ def time_set_index(self):

 class CacheReadonly(object):

-    goal_time = 0.2
-
     def setup(self):

         class Foo:
@@ -38,3 +32,6 @@ def prop(self):

     def time_cache_readonly(self):
         self.obj.prop
+
+
+from .pandas_vb_common import setup  # noqa: F401
This is a condensed version of the merge commit.