Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opening CSV from a URI is failing #374

Closed
kiranchangsathya opened this issue Jan 5, 2017 · 5 comments
Closed

Opening CSV from a URI is failing #374

kiranchangsathya opened this issue Jan 5, 2017 · 5 comments

Comments

@kiranchangsathya
Copy link

with the latest version 2.7.1, opening a CSV from a URI is not working. Getting "No such file found" error. However this was fine with 2.5.1 or 2.6.0, so had to lock down these version.

@iainbeeston
Copy link
Contributor

I'm seeing this as well. I've tracked down the issue to c107923. Before that commit, the only place in the code that loaded csv data was Roo::CSV.each_row(), and that method had separate code for handling data from uris, streams or local files. Everything else in Roo::CSV would to call that method to access csv data, and so every other method didn't need to know where the data was coming from.

In c107923, Roo::CSV. read_cells() was refactored and some of the code was moved into two new methods: Roo::CSV.set_column_count() and Roo::CSV.set_row_count(). But instead of calling Roo::CSV.each_row() they have their own code for loading the csv data, and can't handle loading data from a uri.

The important bit that Roo::CSV.set_column_count() and Roo::CSV.set_row_count() are missing is that for uris Roo::CSV.each_row() downloads the csv file before parsing it. The code for this is in Roo::CSV.each_row_using_tempdir() and it looks like it could be refactored so that it can be reused by the other methods as well.

@iainbeeston
Copy link
Contributor

It's possible that loading csv data from a stream is broken as well, but I haven't been able to test that

@cthornbe
Copy link

we have the same issue, I reverted back to 2.6.0

@sarojkh
Copy link

sarojkh commented Jun 1, 2018

We had similar issue as well. Reverted back to 2.6.0 to fix it.

@matiasgarcia
Copy link

Has anyone found a workaround for this? Looks like downgrading roo to 2.6.0 works, but 2.7.0 is way more performant than 2.6.0

chopraanmol1 added a commit to chopraanmol1/roo that referenced this issue Oct 5, 2018
### Summary

Fixes roo-rb#378 and roo-rb#374 (Introduced in roo-rb#368)

### Benchmark

```
file_name = 'test/files/Bibelbund.csv'

MemoryProfiler.report{ Roo::Spreadsheet.open(file_name).tap{|x|(2..x.last_row).each{|i| x.row(i)}} }
puts Benchmark.measure{ Roo::Spreadsheet.open(file_name).tap{|x|(2..x.last_row).each{|i| x.row(i)}} }
```

Master
```
Total allocated: 39705265 bytes (561479 objects)
Total retained:  768 bytes (4 objects)

  0.300000   0.000000   0.300000 (  0.304877)
```

Modified:
```
Total allocated: 16952085 bytes (234487 objects)
Total retained:  768 bytes (4 objects)

  0.190000   0.000000   0.190000 (  0.181199)
```
chopraanmol1 added a commit to chopraanmol1/roo that referenced this issue Oct 5, 2018
### Summary

Fixes roo-rb#378 and roo-rb#374 (Introduced in roo-rb#368)

### Benchmark

```
file_name = 'test/files/Bibelbund.csv'

MemoryProfiler.report{ Roo::Spreadsheet.open(file_name).tap{|x|(2..x.last_row).each{|i| x.row(i)}} }
puts Benchmark.measure{ Roo::Spreadsheet.open(file_name).tap{|x|(2..x.last_row).each{|i| x.row(i)}} }
```

Master
```
Total allocated: 39705265 bytes (561479 objects)
Total retained:  768 bytes (4 objects)

  0.300000   0.000000   0.300000 (  0.304877)
```

Modified:
```
Total allocated: 16952085 bytes (234487 objects)
Total retained:  768 bytes (4 objects)

  0.190000   0.000000   0.190000 (  0.181199)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants