Integer column randomly typed as string when threading enabled

The following code:
```
using CSV
using DataFrames
using Random

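NCOLS = 30
NROWS = 150
Random.seed!(1)  # fixed seed, so the file contents are meant to be identical on every run
fname = tempname()
f = open(fname, "w")
# Header row: col1,col2,...,col30
write(f, join("col".*string.(1:NCOLS), ","))
write(f, "\r\n")
# Data rows of random Int16 values (0:NROWS yields NROWS + 1 rows)
for i in 0:NROWS
	write(f, join(string.(rand(Int16, NCOLS)), ","))
	write(f, "\r\n")
end
close(f)
# Read the same file with the default task count and with a single task
df_by_threads = CSV.read(fname, DataFrame)
df_single_threaded = CSV.read(fname, DataFrame; ntasks=1)
# Compare the inferred column element types
print(eltype.(eachcol(df_by_threads)) == eltype.(eachcol(df_single_threaded)))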
```
prints `false`, at least on my Windows machine with 8 threads, running CSV v0.10.7.

If I reduce `NCOLS` or `NROWS`, it prints `true`. With a different random seed it may also print `true`.
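To explore that sensitivity, the reproduction can be wrapped in two small helpers and run for several row counts. This is only a sketch: `write_random_csv` and `types_match` are hypothetical names introduced here, the row counts are arbitrary, and the outcome of each run is not guaranteed to match what I see.

```julia
using CSV
using DataFrames
using Random

# Hypothetical helper: writes a file of the same shape as in the reproduction
# (header row plus nrows + 1 rows of random Int16 values, CRLF line endings).
function write_random_csv(fname, ncols, nrows; seed = 1)
    Random.seed!(seed)
    open(fname, "w") do f
        write(f, join("col" .* string.(1:ncols), ","), "\r\n")
        for _ in 0:nrows
            write(f, join(string.(rand(Int16, ncols)), ","), "\r\n")
        end
    end
    return fname
end

# Hypothetical helper: true when the default read and the single-task read
# infer the same element type for every column.
function types_match(fname)
    threaded = CSV.read(fname, DataFrame)
    single = CSV.read(fname, DataFrame; ntasks = 1)
    return eltype.(eachcol(threaded)) == eltype.(eachcol(single))
end

# Arbitrary row counts, chosen only for illustration.
for nrows in (50, 100, 150)
    fname = write_random_csv(tempname(), 30, nrows)
    println("NROWS = ", nrows, ": types match = ", types_match(fname))
end
```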

Inspecting the columns of `df_by_threads` shows that at least one column has type `String7`. Which column is affected can vary between runs, and sometimes there are two such columns, even though the data written to the file is fixed.
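A quick way to see which columns are affected is to list every column whose element type differs between the two reads. The helper below is a sketch; `mismatched_columns` is a hypothetical name, not part of CSV.jl or DataFrames.jl, and the two small frames are stand-ins for `df_by_threads` and `df_single_threaded`.

```julia
using DataFrames

# Hypothetical helper: returns `name => (eltype in a, eltype in b)` for every
# column of `a` whose element type differs from the same-named column of `b`.
function mismatched_columns(a::AbstractDataFrame, b::AbstractDataFrame)
    return [name => (eltype(a[!, name]), eltype(b[!, name]))
            for name in names(a)
            if eltype(a[!, name]) != eltype(b[!, name])]
end

# Stand-in frames; with the reproduction above one would pass
# df_by_threads and df_single_threaded instead.
threaded = DataFrame(col1 = [1, 2], col2 = ["3", "4"])
single = DataFrame(col1 = [1, 2], col2 = [3, 4])

mismatches = mismatched_columns(threaded, single)
foreach(println, mismatches)
```

Running this against the real frames several times in a row should show whether the set of affected columns changes between runs.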

Integer column randomly typed as string when threading enabled #1047
