Bloomberg Hackathon

Contributing Guidelines / Help:
https://github.com/pydata/pandas/wiki

Dev Docs:
http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html

Docs:
- Docstring examples/links: https://github.com/pydata/pandas/issues/3439, https://github.com/pydata/pandas/issues/2916, https://github.com/pydata/pandas/issues/3324
- ~~Docs on ipython startup files: #5748~~
- Links to API docs in the tutorials: https://github.com/pydata/pandas/issues/3705, #1967
- better docs on DataFrame.apply (examples): https://github.com/pydata/pandas/issues/5299
- ~~GA docs: https://github.com/pydata/pandas/issues/3508~~
- doc groupby NA group work-around: https://github.com/pydata/pandas/issues/5456
- consistent imports in all documentation: #1967
- DOC: improve groupby reference docs #6944 
- documenting Cython classes (e.g. Timestamp): "cyfunction is not a python function" #5218
- Redesigning/reorganising the documentation website

Perf:
- ~~vbench on different group sizes: https://github.com/pydata/pandas/issues/6787~~
- use seed in vbenches: https://github.com/pydata/pandas/issues/8144
- pandas + [airspeed velocity](https://spacetelescope.github.io/asv/index.html) #8361
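
For the seeding item above, a minimal sketch of why a fixed seed helps benchmark setup. The helper name `make_benchmark_frame`, the seed value, and the frame shape are hypothetical and are not taken from vbench or the linked issue.

```python
import numpy as np
import pandas as pd


def make_benchmark_frame(seed=1234, nrows=1000, ncols=4):
    """Build random benchmark input reproducibly (hypothetical helper)."""
    # A local RandomState avoids mutating the global NumPy RNG state.
    rng = np.random.RandomState(seed)
    columns = ["c%d" % i for i in range(ncols)]
    return pd.DataFrame(rng.randn(nrows, ncols), columns=columns)


first = make_benchmark_frame()
second = make_benchmark_frame()
# Same seed, same data: timing differences then come from the code under test.
assert first.equals(second)
```

With identical inputs on every run, a change in timing is easier to attribute to the code rather than to the data.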

Tests:
- matplotlib lib to check plots: https://github.com/pydata/pandas/issues/5379
- harmonize testing namespace: https://github.com/pydata/pandas/issues/8023
- verify timedelta algos: https://github.com/pydata/pandas/issues/5986
- non_unique storage tests for HDFStore: https://github.com/pydata/pandas/issues/7813

Bugs:
- Make changes in numpy API for 1.7: https://github.com/pydata/pandas/issues/8329
- accept scalar in Panel construction: https://github.com/pydata/pandas/issues/8285
- fillna bug: https://github.com/pydata/pandas/issues/8251
- HDFStore modifying columns when passed: https://github.com/pydata/pandas/issues/7212
- Plot label = None instead of provided label in line plot #8905
- plot with kind=scatter fails when providing an array for the size #8852
- Grouper(key='A') gives AttributeError when applying function #8795
- ValueError exception with pd.resample #8683
- to_csv issue with chunksize when large number of columns #8621
- EASY: pd.option_context without 'with' changes option values #8514 
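
For the `pd.option_context` item above, a small sketch of the intended usage as a context manager, where the option is changed only inside the `with` block. The option name and value are chosen for illustration; this shows the expected behaviour, not the bug reported in #8514.

```python
import pandas as pd

before = pd.get_option("display.max_rows")

# The option is overridden only for the duration of the with block.
with pd.option_context("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")

# On exit the previous value is restored.
after = pd.get_option("display.max_rows")
assert inside == 5
assert after == before
```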

Enhancements:
- ENH: support TimedeltaIndex plotting #8711
- as suggested below, an enhancement to `read_csv` (or maybe `read_repr/string`) to allow round-tripping of the repr (can also serve as a basis for `read_clipboard`)
- raise on invalid compression options in HDFStore: https://github.com/pydata/pandas/issues/4582
- ~~accept Period in DatetimeIndex for start/end: https://github.com/pydata/pandas/issues/6780~~
- don't use bare Exceptions: https://github.com/pydata/pandas/issues/7948
- add more Series/Index ops: https://github.com/pydata/pandas/issues/6382
- ~~to_dict orient param: https://github.com/pydata/pandas/issues/7840~~
- sort_index to generic.py: https://github.com/pydata/pandas/issues/8283
- get to take an axis argument: https://github.com/pydata/pandas/issues/6703
- axis argument to append: https://github.com/pydata/pandas/issues/8295
- better error on invalid input to cut: https://github.com/pydata/pandas/issues/7751
- ~~level kw to any/all: https://github.com/pydata/pandas/issues/8302~~
- astype accepting a dict: https://github.com/pydata/pandas/issues/7271
- ~~clean up code by removing core/array.py: https://github.com/pydata/pandas/issues/8359~~
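
For the "astype accepting a dict" item above, a sketch of what per-column conversion looks like. It assumes a pandas release in which `DataFrame.astype` accepts a column-to-dtype mapping; when this list was written that was only a proposal in the linked issue. The column names and data are made up.

```python
import pandas as pd

df = pd.DataFrame({"a": ["1", "2"], "b": ["1.5", "2.5"]})

# Map each column name to its target dtype (assumes dict support in astype).
converted = df.astype({"a": "int64", "b": "float64"})

assert str(converted["a"].dtype) == "int64"
assert str(converted["b"].dtype) == "float64"
```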

IO:
- to_clipboard bug/improvements: https://github.com/pydata/pandas/issues/8304
- to_html to actually create links: https://github.com/pydata/pandas/issues/2679, https://github.com/pydata/pandas/issues/6488, https://github.com/pydata/pandas/issues/4987
- max_colwidth with groupby: https://github.com/pydata/pandas/issues/7856
- date_formatting in to_csv not being passed through: https://github.com/pydata/pandas/issues/7791
- generate gbq schema: https://github.com/pydata/pandas/issues/8325
- make `to_latex` work with multi-index: https://github.com/pydata/pandas/issues/8336
- `Series.to_html` not working so well: https://github.com/pydata/pandas/issues/5563
- to_csv issue with chunksize when large number of columns  #8621 

Excel Oriented:
- More customization of Excel input/output could be great, i.e. making it easier to specify per-column colors/formatting, float formats, etc. The code base isn't too complicated there (just a mixture of the formatter and the ExcelWriter stuff) and you could make rapid progress because it's really easy to test and create samples. I think the result would be very immediately rewarding (better looking things, easier to make reports, etc.). Plus for #4679 and #8272 you'd get a better sense of pandas internals too.
- dtypes per column in read in (#8212 and #8272)
- Treat multiple rows/columns as MultiIndex/hierarchical columns #4679
- More flexible output formatting for Excel (mentioned in #8191 , but I'm going to put up an issue about having something like per-column styles, also #1663).
- Allow writing multiple tables to same sheet and/or setting starting position for a particular sheet (mentioned by Wes in an older issue but I can't find it right now).
- extend Excel writers to write to OpenDocument format #2311
- allow ExcelWriter to automatically convert lists and dict to string #8188
- Performance benchmarks for Excel writers and readers #7171
- Support BytesIO output in ExcelWriter #7074

SQL:
- ENH: specify dtype in `to_sql` per column: #8778
- BUG: to_sql fails with datetime.time values with sqlite fallback mode #8341 

More advanced:
- add write_index option to HDFStore: https://github.com/pydata/pandas/issues/8319
- start/stop on HDFStore fixed: https://github.com/pydata/pandas/issues/8287
- block splitting in HDFStore: https://github.com/pydata/pandas/issues/8284
- tz handling using HDFStore / fixed: https://github.com/pydata/pandas/issues/8165, https://github.com/pydata/pandas/issues/7775
- PeriodIndex in HDFStore: https://github.com/pydata/pandas/issues/7796
- Write meta data as a CArray: https://github.com/pydata/pandas/issues/6245
- Modify methods for HDFStore: https://github.com/pydata/pandas/issues/6857
- Integrate multi-index reindexing a bit more: https://github.com/pydata/pandas/issues/7895
- Timedelta support in groupby: #5724
- support numeric index ops: https://github.com/pydata/pandas/issues/8226
- allow index to be referenced like a column: https://github.com/pydata/pandas/issues/8162

Collaborative Efforts:
- Missing data support in numpy: https://github.com/pydata/pandas/issues/8350

@jorisvandenbossche @cpcloud @TomAugspurger @hayd 
cc @shoyer
cc @immerrr

