No column detected from CSV with no data (header row only) #815

Closed
AlexandreGarino opened this issue Jul 6, 2020 · 4 comments · Fixed by #909

@AlexandreGarino

Hi,

I read a CSV file with no data (header row only) and I noticed that no column was detected.

The resulting table is empty.

final ByteArrayInputStream inputStream = new ByteArrayInputStream("Column1;Column2;Column3\n".getBytes(StandardCharsets.UTF_8));

final CsvReadOptions readOptions = ((CsvReadOptions.Builder) CsvReadOptions.builder(inputStream)
        .columnTypesToDetect(Arrays.asList(ColumnType.STRING)))
        .header(true)
        .separator(';')
        .build();

final Table table = Table.read().csv(readOptions);

assertThat(table.columns(), hasSize(0)); // OK

Did I do something wrong?

Any help would be greatly appreciated.

@lwhite1
Collaborator

lwhite1 commented Jul 11, 2020

I guess this would be a bug, but a fairly minor one from my perspective.

I don't see the use case for reading a file with nothing but a header line. Tablesaw is geared more towards analyzing existing datasets than towards building datasets in memory. Of course you can create a table entirely in code, but I've only ever used that for testing.

Have you found any workarounds? I think I would look at either (a) loading the file with the column types pre-specified (I notice all your columns are strings), or (b) just reading the headers using a standard Java file-reading approach and creating the table in code by looping over the names.
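For what it's worth, here is a minimal sketch of the first half of option (b), using only the JDK. The class `HeaderNames` and the method `readHeaderNames` are hypothetical names made up for illustration, not part of Tablesaw. The split is deliberately naive and assumes the header contains no quoted fields with embedded separators.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not a Tablesaw class): reads only the header row of a CSV.
final class HeaderNames {

    // Returns the column names found on the first line, or an empty list
    // if the input is empty. Assumes UTF-8 and an unquoted header row.
    static List<String> readHeaderNames(InputStream in, char separator) {
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            List<String> names = new ArrayList<>();
            String headerLine = reader.readLine();
            if (headerLine == null || headerLine.isEmpty()) {
                return names;
            }
            // Pattern.quote keeps separators such as '|' or '.' from being read as regex.
            String regex = Pattern.quote(String.valueOf(separator));
            for (String name : headerLine.split(regex, -1)) {
                names.add(name.trim());
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the names in hand, the table itself would still have to be built in code by looping over them and adding one empty column per name (string columns, in the example from this issue).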

@AlexandreGarino
Author

AlexandreGarino commented Jul 11, 2020

Hi,

The code snippet is here just to reproduce the bug.

The real code programmatically scans a folder for new CSV files and, based on some criteria, applies some business logic (we compute new columns from existing columns).

For now, when we read the table, if the table is empty we copy the file as-is.
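A check like this could also run before the file ever reaches Tablesaw. The sketch below is only an illustration under that assumption, not our production code. `HeaderOnlyCheck` and `hasOnlyHeader` are hypothetical names, and blank trailing lines are treated as "no data".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper: detects a CSV that has a header row but no data rows.
final class HeaderOnlyCheck {

    // True when the input has a first line and every following line is blank.
    // An entirely empty input returns false, since it has no header either.
    static boolean hasOnlyHeader(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            if (reader.readLine() == null) {
                return false;
            }
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    return false; // found at least one data row
                }
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

When the check returns true, the file can be copied as-is (for example with `java.nio.file.Files.copy`) and the business logic skipped.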

lujop added a commit to lujop/tablesaw that referenced this issue Apr 26, 2021
lujop added a commit to lujop/tablesaw that referenced this issue Apr 27, 2021
lujop added a commit to lujop/tablesaw that referenced this issue Apr 30, 2021
lwhite1 pushed a commit that referenced this issue May 9, 2021
…eaders (#909)

* Fix #822 and #815

* Apply PR requestes changes

* Changes asked in PR

* Rename variable for better code readibility

@imagejan

imagejan commented Mar 29, 2023

Is there an option now for keeping all columns when reading a header-only file (even when the column types can't be determined of course)?

In our use case, we write many tables automatically in batch, and some of them could end up being empty. Nevertheless, we have the same column headers across files. Currently, we get an exception (in MoBIE, which depends on tablesaw) because tablesaw returns a Table without any columns. I would have expected at least columnNames() to be the same as in the (header-only) input file.

/cc @tischi

@lwhite1
Collaborator

lwhite1 commented Mar 29, 2023 via email
