Parallel parsing of Csv files #1159

@nhirschey

It should be more obvious how to parallelize CSV file parsing. This matters for large files on multicore machines. Two possibilities:

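1. Add some sort of Parallel option to CsvFile.Load() that triggers a parallel map under the hood. For example, a simple version of the function might be:

```fsharp
open FSharp.Data
open FSharp.Collections.ParallelSeq // provides PSeq

type test = CsvProvider<"test.csv">

// Streams lines and parses each one in parallel.
// Splitting on lines assumes no quoted field spans multiple lines.
let LazyParallelCsvLoadRows (file: string) =
    System.IO.File.ReadLines(file)
    |> Seq.skip 1 // skip the header row
    |> PSeq.map (fun r -> test.ParseRows(r).[0])

// Reads the whole file into memory, then parses the lines in parallel.
let EagerParallelCsvLoadRows (file: string) =
    System.IO.File.ReadAllLines(file)
    |> Array.skip 1 // skip the header row
    |> Array.Parallel.map (fun r -> test.ParseRows(r).[0])
```

2. Don't change any code, but add something like this as an example in the website documentation.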

Thoughts?
