C# library for parsing and exporting tabular data in delimited format (e.g. CSV).
The Parser.Parse()
method takes a TextReader
and returns DataTable
.
var parser = new Parser();
using (DataTable myData = parser.Parse(myTextReader))
{
// Make use of `myData` here…
}
For processing a large amount of delimited data, the Parser.ParseReader()
method takes a TextReader
and returns System.Data.Common.DbDataReader
, which provides a fast, forward-only stream of rows. This allows for processing each row in turn, rather than reading a whole file into memory.
var parser = new Parser();
var reader = parser.ParseReader(myTextReader))
while (reader.Read())
{
// Each field for the current row can be retrieved using the column index:
var field1 = reader[0];
var field2 = reader[1];
// etc…
}
FieldSeparator
- the character used as field delimiter in the text file. Default:,
(i.e., CSV).UseFirstRowAsColumnHeaders
- specifies whether the first row of the text file should be treated as a header row. Default:true
.TrimColumnHeaders
- specifies whether the column headers, if present, should have whitespace trimmed before being used as a key.
The Exporter.Export()
method takes a DataTable
and writes to a TextWriter
.
string myCsv;
using (var writer = new StringWriter())
{
var exporter = new Exporter();
exporter.Export(myData, writer);
myCsv = writer.ToString();
}
For exporting a large amount of delimited data, the Exporter.ExportReader()
method takes a System.Data.Common.DbDataReader
instance and writes to a specified TextWriter
, which provides a fast, streamed-based generation of row data. This allows for processing each row in turn, rather than needing to retain the whole data source in memory.
using (var fileWriter = File.CreateText("my.csv"))
{
var exporter = new Exporter();
exporter.ExportReader(myDataReader, fileWriter);
}
FieldSeparator
- the character used as field delimiter in the text file. Default:,
(i.e., CSV).SanitizeStrings
- specifies whether strings should be sanitized, prepending blacklisted characters at the start of the string with a single quote'
. Default:true
.IncludeEscapeCharacters
- specifies whether each value should be escaped by wrapping in quotation marks. Default:true
.OutputColumnHeaders
- specifies whether an initial row containing column names should be written to the output. Default:true
.UseExtendedPropertyForColumnName(string key)
- specifies a key that is used to search on the ExtendedProperties of a DataColumn. If it finds a value this will be used as the column header, if no match is found it will default to the column's ColumnName. This should be used if you are required to output a different column header to what is stored on the column's ColumnName.
The Exporter
class constructor has an overload that takes an IProgress<int>
. If supplied, the Exporter
will use the progress provider to report the current row number as an int
after each row is written to the output file.
The Parser
and Exporter
support setting columns (identified by column name) as "text" wherein data is wrapped in quotes and preceded with an equals sign, as follows: ="1337"
To set columns as text, call the SetColumnsAsText
method on either Parser
or Exporter
.
parser.SetColumnsAsText(new[] { "Foo", "Bar" });
exporter.SetColumnsAsText(new[] { "Baz", "Qux" });
To clear any columns previously set as text, call the ClearColumnsAsText
method on either Parser
or Exporter
.
parser.ClearColumnsAsText();
exporter.ClearColumnsAsText();
The Exporter
supports preventing specific columns from being sanitized. This is primarily to aid numerical columns with negative signs e.g. -1.23
, but can be used for any column.
To prevent columns from being sanitized, call the SetColumnsAsSanitizationPrevented
method on the Exporter
.
exporter.SetColumnsAsSanitizationPrevented(new[] { "Foo", "Bar" });
To clear any columns previously set to be not sanitization, call the ClearColumnsSanitizationPrevented
method on the Exporter
.
exporter.ClearColumnsSanitizationPrevented();
- Console application demonstrating the use of Windows-1252 encoding on .NET Core.