cell extraction

Your package works great, but I had to modify it slightly. Currently each cell is extracted like this:

```
self._insert(row_ind, col_ind, row_span, col_span, self._transformer(cell.get_text()))
```

This is fine if the content is plain text, but it's a problem if the cell contains links you want to keep, because `get_text()` drops the markup.

I have modified it to:

```
class Extractor(object):
    def __init__(self, table, id_=None, cell_transformer=None):
        ...
        self._cell_transformer = cell_transformer if cell_transformer else lambda x: x.get_text()

    def parse(self):
        ...
        self._insert(row_ind, col_ind, row_span, col_span, self._cell_transformer(cell))
```

This allows the caller to implement the cell extraction if required; the default still returns the cell's text.
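For example, a transformer that keeps links could look roughly like the sketch below. `keep_links` is a hypothetical name, and it assumes the `cell` passed to the callback is a BeautifulSoup tag (so `get_text`, `find_all` and `get` are available); the returned shape is just one possible choice.

```python
def keep_links(cell):
    """Hypothetical cell transformer: return the cell text plus any link targets.

    Assumes `cell` is a BeautifulSoup Tag (a <td> or <th>), which is what the
    modified Extractor passes to the callback.
    """
    text = cell.get_text(strip=True)
    # Collect (link text, href) pairs for every anchor that has an href.
    links = [
        (a.get_text(strip=True), a.get("href"))
        for a in cell.find_all("a")
        if a.get("href")
    ]
    # Plain cells stay plain strings; cells with links become a small dict.
    return {"text": text, "links": links} if links else text
```

With the modified constructor above it would be passed as `Extractor(html, cell_transformer=keep_links)`.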

Also, getting the result currently takes three lines:
```
ext = Extractor(html)
ext.parse()
print(ext.return_list())
```
It would be nicer to just do:

```
result = Extractor().parse(html)
```
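In the meantime, a small wrapper gets close to that. This is only a sketch: `extract` is a hypothetical helper, not part of the package, and the extractor class is passed in as an argument so the snippet doesn't assume an import path. It relies only on the `parse()` and `return_list()` methods shown above.

```python
def extract(table, extractor_cls, **kwargs):
    """Hypothetical one-call helper: build the extractor, parse, return rows.

    `extractor_cls` is expected to be the package's Extractor class (or
    anything with the same constructor, parse() and return_list() methods).
    Extra keyword arguments are forwarded to the constructor.
    """
    ext = extractor_cls(table, **kwargs)
    ext.parse()
    return ext.return_list()
```

Usage would then be `result = extract(html, Extractor)`.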

Thanks, this package is small but useful :)



