
Commit fa80887

Improve reader interface (#90)
* 🔨 improve reader interface
* 🔨 shrink reader code
* This is an auto-commit, updating project meta data, such as changelog.rst, contributors.rst
* 🔥 remove redundant functionalities, never will use. what's the point
* 📚 updated doc string and the tutorial
* 🔨 update import statements
* 🔬 more test coverage
* This is an auto-commit, updating project meta data, such as changelog.rst, contributors.rst
* 💚 fix unit test failure
* 📚 update reader plugin example
* 💄 update coding style
* 📚 fix index rst file
* This is an auto-commit, updating project meta data, such as changelog.rst, contributors.rst

Co-authored-by: chfw <chfw@users.noreply.github.com>
1 parent 29c2668 commit fa80887

33 files changed, +337 -76 lines changed
File renamed without changes.

.moban.yml

Lines changed: 1 addition & 1 deletion
@@ -6,5 +6,5 @@ targets:
   - setup.py: io_setup.py.jj2
   - .travis.yml: custom_travis.yml.jj2
   - README.rst: io_readme.rst.jj2
-  - "docs/source/index.rst": "docs/source/index.rst"
+  - "docs/source/index.rst": "docs/source/index.rst.jj2"
   - .gitignore: gitignore.jj2

CONTRIBUTORS.rst

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 
+
 5 contributors
 ================================================================================

docs/source/extensions.rst

Lines changed: 28 additions & 0 deletions
@@ -1,3 +1,31 @@
+Extend pyexcel-io Tutorial
+================================================================================
+
+pyexcel-io itself comes with csv support.
+
+Reader
+--------------------------------------------------------------------------------
+
+Suppose we have a yaml file containing a dictionary whose values are
+two-dimensional arrays. The task is to write a reader plugin for pyexcel-io so that
+we can use get_data() to read it out.
+
+Example yaml data:
+
+.. literalinclude:: ../../examples/test.yaml
+   :language: yaml
+
+Example code:
+
+.. literalinclude:: ../../examples/custom_yaml_reader.py
+   :language: python
+
+
+Writer
+--------------------------------------------------------------------------------
+
+
+
 Working with xls, xlsx, and ods formats
 ================================================================================

docs/source/index.rst

Lines changed: 105 additions & 4 deletions
@@ -3,7 +3,16 @@
    You can adapt this file completely to your liking, but it should at least
    contain the root `toctree` directive.

-{%include "header.rst.jj2" %}
+`pyexcel-io` - Let you focus on data, instead of file formats
+================================================================================
+
+:Author: chfw
+:Source code: http://github.com/pyexcel/pyexcel-io.git
+:Issues: http://github.com/pyexcel/pyexcel-io/issues
+:License: New BSD License
+:Development: |release|
+:Released: |version|
+:Generated: |today|

 Introduction
 --------------------------------------------------------------------------------
@@ -33,11 +42,104 @@ as of 2014. They are invented and supported by `pyexcel-io`_.
 Installation
 --------------------------------------------------------------------------------

-{%include "installation.rst.jj2" %}
+
+You can install pyexcel-io via pip:
+
+.. code-block:: bash
+
+    $ pip install pyexcel-io
+
+
+or clone it and install it:
+
+.. code-block:: bash
+
+    $ git clone https://github.com/pyexcel/pyexcel-io.git
+    $ cd pyexcel-io
+    $ python setup.py install

 For individual excel file formats, please install them as you wish:

-{%include "io-plugins-list.rst.jj2" %}
+.. _file-format-list:
+.. _a-map-of-plugins-and-file-formats:
+
+.. table:: A list of file formats supported by external plugins
+
+   ======================== ======================= ================= ==================
+   Package name             Supported file formats  Dependencies      Python versions
+   ======================== ======================= ================= ==================
+   `pyexcel-io`_ >=v0.6.0   csv, csvz [#f1]_, tsv,                    3.6+
+                            tsvz [#f2]_
+   `pyexcel-io`_ <=0.5.20   same as above                             2.6, 2.7, 3.3,
+                                                                      3.4, 3.5, 3.6
+                                                                      pypy
+   `pyexcel-xls`_           xls, xlsx(read only),   `xlrd`_,          same as above
+                            xlsm(read only)         `xlwt`_
+   `pyexcel-xlsx`_          xlsx                    `openpyxl`_       same as above
+   `pyexcel-ods3`_          ods                     `pyexcel-ezodf`_, 2.6, 2.7, 3.3, 3.4
+                                                    lxml              3.5, 3.6
+   `pyexcel-ods`_           ods                     `odfpy`_          same as above
+   ======================== ======================= ================= ==================
+
+.. table:: Dedicated file reader and writers
+
+   ======================== ======================= ================= ==================
+   Package name             Supported file formats  Dependencies      Python versions
+   ======================== ======================= ================= ==================
+   `pyexcel-xlsxw`_         xlsx(write only)        `XlsxWriter`_     Python 2 and 3
+   `pyexcel-xlsxr`_         xlsx(read only)         lxml              same as above
+   `pyexcel-xlsbr`_         xlsb(read only)         pyxlsb            same as above
+   `pyexcel-odsr`_          read only for ods, fods lxml              same as above
+   `pyexcel-odsw`_          write only for ods      loxun             same as above
+   `pyexcel-htmlr`_         html(read only)         lxml,html5lib     same as above
+   `pyexcel-pdfr`_          pdf(read only)          pdftables         Python 2 only.
+   ======================== ======================= ================= ==================
+
+
+Plugin shopping guide
+------------------------
+
+Unlike csv files, xlsx and ods files are zip archives of a folder containing a lot of
+xml files.
+
+The dedicated readers for excel files can read them as a stream.
+
+
+In order to manage the list of plugins installed, you need to use pip to add or remove
+a plugin. When you use virtualenv, you can have different plugins per virtual
+environment. In the situation where you have multiple plugins that do the same thing
+in your environment, you need to tell pyexcel which plugin to use per function call.
+For example, if pyexcel-ods and pyexcel-odsr are both installed and you want get_array
+to use pyexcel-odsr, pass the library name: get_array(..., library='pyexcel-odsr').
+
+
+
+.. _pyexcel-io: https://github.com/pyexcel/pyexcel-io
+.. _pyexcel-xls: https://github.com/pyexcel/pyexcel-xls
+.. _pyexcel-xlsx: https://github.com/pyexcel/pyexcel-xlsx
+.. _pyexcel-ods: https://github.com/pyexcel/pyexcel-ods
+.. _pyexcel-ods3: https://github.com/pyexcel/pyexcel-ods3
+.. _pyexcel-odsr: https://github.com/pyexcel/pyexcel-odsr
+.. _pyexcel-odsw: https://github.com/pyexcel/pyexcel-odsw
+.. _pyexcel-pdfr: https://github.com/pyexcel/pyexcel-pdfr
+
+.. _pyexcel-xlsxw: https://github.com/pyexcel/pyexcel-xlsxw
+.. _pyexcel-xlsxr: https://github.com/pyexcel/pyexcel-xlsxr
+.. _pyexcel-xlsbr: https://github.com/pyexcel/pyexcel-xlsbr
+.. _pyexcel-htmlr: https://github.com/pyexcel/pyexcel-htmlr
+
+.. _xlrd: https://github.com/python-excel/xlrd
+.. _xlwt: https://github.com/python-excel/xlwt
+.. _openpyxl: https://bitbucket.org/openpyxl/openpyxl
+.. _XlsxWriter: https://github.com/jmcnamara/XlsxWriter
+.. _pyexcel-ezodf: https://github.com/pyexcel/pyexcel-ezodf
+.. _odfpy: https://github.com/eea/odfpy
+
+
+.. rubric:: Footnotes
+
+.. [#f1] zipped csv file
+.. [#f2] zipped tsv file

 After that, you can start getting and saving data in the loaded format. There
 are two plugins for the same file format, e.g. pyexcel-ods3 and pyexcel-ods.
@@ -91,7 +193,6 @@ get_data(.., library='pyexcel-ods')
    csvz
    sqlalchemy
    django
-   options
    extensions


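Review note: the "Plugin shopping guide" text added above says that when several installed plugins handle the same format, the caller names one with `library=...`. The sketch below only illustrates that selection idea with a plain dictionary. The names `INSTALLED` and `choose_plugin`, and the rule of falling back to the first candidate when no library is given, are assumptions made for this note and are not pyexcel-io's actual lookup code.

```python
# Hypothetical registry: file type -> plugin package names, in install order.
# This mirrors the pyexcel-ods / pyexcel-odsr example from the docs text.
INSTALLED = {
    "ods": ["pyexcel-ods", "pyexcel-odsr"],
    "csv": ["pyexcel-io"],
}


def choose_plugin(registry, file_type, library=None):
    """Pick a plugin for file_type, honouring an explicit library name."""
    candidates = registry.get(file_type, [])
    if not candidates:
        raise LookupError("no plugin installed for %r" % file_type)
    if library is None:
        # Assumption of this sketch: default to the first candidate.
        return candidates[0]
    if library in candidates:
        return library
    raise LookupError("%r does not handle %r" % (library, file_type))


default_choice = choose_plugin(INSTALLED, "ods")
explicit_choice = choose_plugin(INSTALLED, "ods", library="pyexcel-odsr")
print(default_choice, explicit_choice)
```

With the real library the equivalent call would be the one quoted in the docs, `get_array(..., library='pyexcel-odsr')`.
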
examples/custom_yaml_reader.py

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+import yaml
+from pyexcel_io import get_data
+from pyexcel_io.sheet import NamedContent
+from pyexcel_io.plugins import IOPluginInfoChainV2
+from pyexcel_io.plugin_api import ISheet, IReader
+
+
+class YourSingleSheet(ISheet):
+    def __init__(self, your_native_sheet):
+        self.two_dimensional_array = your_native_sheet
+
+    def row_iterator(self):
+        yield from self.two_dimensional_array
+
+    def column_iterator(self, row):
+        yield from row
+
+
+class YourReader(IReader):
+    def __init__(self, file_name, file_type, **keywords):
+        self.file_handle = open(file_name, "r")
+        self.native_book = yaml.safe_load(self.file_handle)
+        self.content_array = [
+            NamedContent(key, values)
+            for key, values in self.native_book.items()
+        ]
+
+    def read_sheet(self, sheet_index):
+        two_dimensional_array = self.content_array[sheet_index].payload
+        return YourSingleSheet(two_dimensional_array)
+
+    def close(self):
+        self.file_handle.close()
+
+
+IOPluginInfoChainV2(__name__).add_a_reader(
+    relative_plugin_class_path="YourReader",
+    locations=["file"],
+    file_types=["yaml"],
+    stream_type="text",
+)
+
+if __name__ == "__main__":
+    data = get_data("test.yaml")
+    print(data)
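
Review note: the example above implements `row_iterator`, `column_iterator` and `read_sheet`, but the commit does not show how a caller drives them. The sketch below is one plausible way such an interface could be consumed. `SketchSheet`, `SketchReader` and `read_all` are stand-ins written for this note; they do not import pyexcel-io, and the real `get_data` may do more (for example type conversion or row filtering).

```python
class SketchSheet:
    """Stand-in for YourSingleSheet: wraps a two-dimensional array."""

    def __init__(self, rows):
        self.rows = rows

    def row_iterator(self):
        yield from self.rows

    def column_iterator(self, row):
        yield from row


class SketchReader:
    """Stand-in for YourReader: keeps (name, payload) pairs in order."""

    def __init__(self, book):
        self.content_array = list(book.items())

    def read_sheet(self, sheet_index):
        _, payload = self.content_array[sheet_index]
        return SketchSheet(payload)


def read_all(reader):
    """Hypothetical consumer: walk both iterators to rebuild every sheet."""
    result = {}
    for index, (name, _) in enumerate(reader.content_array):
        sheet = reader.read_sheet(index)
        result[name] = [
            list(sheet.column_iterator(row)) for row in sheet.row_iterator()
        ]
    return result


# Same shape as examples/test.yaml, written as a Python literal.
book = {"sheet 1": [[1, 2, 3], [2, 3, 4]], "sheet 2": [["A", "B", "C"]]}
data = read_all(SketchReader(book))
print(data)
```

Splitting iteration into rows and then cells lets a consumer stop early or transform cells one at a time without the sheet class building a full copy first.
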

examples/test.yaml

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+sheet 1:
+  - - 1
+    - 2
+    - 3
+  - - 2
+    - 3
+    - 4
+sheet 2:
+  - - A
+    - B
+    - C

pyexcel-io.yml

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ dependencies:
   - lml>=0.0.4
 test_dependencies:
   - pyexcel
-  - pyexcel-xls
+  - pyexcel-xls==0.5.9
   - SQLAlchemy
   - pyexcel-xlsxw
 extra_dependencies:

pyexcel_io/_compact.py

Lines changed: 0 additions & 4 deletions
@@ -51,8 +51,4 @@ def is_string(atype):
     if atype == str:
         return True

-    elif PY2:
-        if atype == unicode:
-            return True
-
     return False
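
Review note: with the `PY2` branch gone, the tail of `is_string` reduces to a single comparison against `str`. The sketch below is reconstructed from the visible hunk only; anything the real function does above line 51 is not shown in this diff and is assumed absent here.

```python
def is_string(atype):
    """Return True when atype is the str type (sketch of the post-commit tail)."""
    if atype == str:
        return True

    return False


print(is_string(str), is_string(int))
```

On Python 3 the removed `unicode` check has no counterpart, since `str` is already the text type.
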

pyexcel_io/database/exporters/django.py

Lines changed: 1 addition & 1 deletion
@@ -7,8 +7,8 @@
     :copyright: (c) 2014-2020 by Onni Software Ltd.
     :license: New BSD License, see LICENSE for more details
 """
+from pyexcel_io.plugin_api import IReader
 from pyexcel_io.database.querysets import QuerysetsReader
-from pyexcel_io.plugin_api.abstract_reader import IReader


 class DjangoModelReader(QuerysetsReader):
