tablib vs dataset vs pandas: what areas do they cover, what is tablib's feature vision? #136

Closed
tony opened this issue Jan 17, 2014 · 1 comment

@tony
Member

tony commented Jan 17, 2014

@kennethreitz: Greetings. I saw https://github.com/kennethreitz/tablib/issues/124 and didn't want to burden you with another issue, but I have a few questions about tablib:

  1. Where does tablib stand against https://github.com/pudo/dataset?

    It's not a duplicate effort, but I find dataset's interoperability with databases to be a great, closely related value-add.

    Actually, from my POV, I see a bit of redundancy (I hope I convey this correctly): I need to import tabular data into a table-friendly Python object, which is a Dataset in tablib and a table in dataset. tablib can import from a wide variety of formats, while dataset maintains a DB back end.

    I could possibly glue the two together somehow, but I'm not sure it would look elegant, and it might come at a performance cost.

    Is there a chance to potentially kill two birds with one stone and see if collaboration / cooperation is possible?

  2. Where does tablib stand against https://github.com/pydata/pandas and its DataFrames?

    I think pandas is a little too heavy to bring into an application I'd like to keep light. It depends on numpy. However, it covers a lot of turf, and I also see some interesting activity around pandas DataFrames and SQL at ENH: sql support pandas-dev/pandas#4163.

  3. This question is for @kennethreitz and everybody. I'm interested in speedups for Python containers and data structures. Do Cython or the Python C API stand any chance of offering a speedup if Dataset, Databook, and / or other parts of tablib were written in C? Is there anything in tablib that stands to benefit from C optimization?

  4. Overall

    What areas does tablib cover? What does it not cover? To be honest, I would really like a go-to for relational and tabular data without the overhead of pandas.

@kennethreitz
Contributor

  1. Dataset is a SQL library, Tablib is not.
  2. Pandas is an incredibly powerful data science tool. Tablib is intended for small dataset creation and exporting into friendly formats.
  3. No interest or need for speedups. This isn't a scientific library, and I want it to be as simple to install as possible. If you want things to run faster, I suggest you use PyPy as your runtime.
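
For anyone weighing the "glue both together" idea from question 1, here is a rough sketch of the division of labor described above: one step parses tabular data and exports it to a friendly format, another step owns the SQL storage. It deliberately uses only the standard library (csv, json, sqlite3) as stand-ins, so it is not the API of tablib or dataset. The sample data and the helper names (load_rows, store_rows, export_json) are made up for illustration.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data; any small CSV would do.
CSV_TEXT = "name,age\nAda,36\nGrace,45\n"


def load_rows(csv_text):
    """Parse CSV text into a list of dicts (the 'import tabular data' role)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def store_rows(conn, table, rows):
    """Create a table from the row keys and insert every row (the 'database' role)."""
    columns = list(rows[0].keys())
    # Identifiers cannot be bound as parameters, so they are quoted here;
    # values are always bound with "?" placeholders.
    col_sql = ", ".join('"{}"'.format(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, col_sql))
    conn.executemany(
        'INSERT INTO "{}" ({}) VALUES ({})'.format(table, col_sql, placeholders),
        [tuple(row[c] for c in columns) for row in rows],
    )


def export_json(conn, table):
    """Read the table back and serialize it to JSON (the 'export' role)."""
    cursor = conn.execute('SELECT * FROM "{}" ORDER BY rowid'.format(table))
    headers = [d[0] for d in cursor.description]
    return json.dumps([dict(zip(headers, r)) for r in cursor.fetchall()])


conn = sqlite3.connect(":memory:")
rows = load_rows(CSV_TEXT)
store_rows(conn, "people", rows)
result = export_json(conn, "people")
print(result)
```

The columns are created without types, so every value round-trips as the text that came out of the CSV. A real glue layer would need to decide where type conversion happens, which is part of why the two libraries stay separate in scope.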
