CSV File Uploading Practices between Frontend and Backend #152

@reboottime

Introduction to CSV, its applications, and challenges

What CSV is and what it is used for

CSV stands for Comma-Separated Values. It is a plain-text format commonly used for exchanging data between applications and for importing and exporting spreadsheet data.

Structure of CSV

  • Rows: Each line in a CSV file represents a row of data, which typically corresponds to one record or data entry.
  • Columns: Within each row, values are separated by a delimiter, most often a comma (,). A semicolon, a tab, or any other character can serve as the delimiter, depending on the requirements.
  • Headers (optional): The first row of a CSV file is often used to store column names, which provide a label for each column.
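
The structure above can be sketched in TypeScript as follows. This is a minimal illustration, not a full CSV implementation: `parseCsv` and `toRecords` are hypothetical helper names, the delimiter is assumed to be a single character, and quoted fields are assumed to follow the common convention where a literal double quote is written as two double quotes.

```typescript
// Hypothetical helper: parse CSV text into rows of string fields.
// Assumes a single-character delimiter. Handles quoted fields, doubled
// quotes ("") inside quoted fields, and both \n and \r\n line endings.
function parseCsv(text: string, delimiter: string = ","): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // a doubled quote is one literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch; // delimiters and line breaks are kept as data
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === delimiter) {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat \r\n as one break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
  }
  // Flush the last row when the text does not end with a line break.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Hypothetical helper: treat the first row as headers and label each value.
function toRecords(rows: string[][]): Record<string, string>[] {
  const [header, ...data] = rows;
  if (!header) return [];
  return data.map((values) => {
    const record: Record<string, string> = {};
    header.forEach((name, i) => {
      record[name] = values[i] ?? ""; // missing trailing values become ""
    });
    return record;
  });
}
```

For example, `toRecords(parseCsv("id;name\n1;Ada", ";"))` would treat `id` and `name` as column labels. Every value stays a string here; converting to numbers or dates is left to the caller.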

Considerations and Challenges

  • General Challenges

    • Data Types: CSV carries no type information, so every value is read as a string. If your data includes numbers or other non-text types, you need to convert them explicitly in your code.
    • Quoting: If a value contains the delimiter character itself (e.g., a comma) or a line break, it needs to be enclosed in double quotes so that parsers do not split it.
    • Encoding: Pay attention to the character encoding of your CSV files, especially when dealing with international characters.
    • Parsing Errors: Be prepared to handle cases where the CSV data doesn't follow the expected structure.
  • Challenges of parsing large CSV files

  • Performance:

    • Memory and responsiveness: Parsing a large CSV file in the browser can increase memory usage significantly and block the UI, which may cause performance issues or even crash the page.
    • Progress: The user needs some indication of how far the parsing has progressed.
    • Optimization and chunking: Parsing large CSV files efficiently requires techniques such as chunking, where the file is processed in smaller segments to reduce memory consumption and improve performance.
  • Solution directions

    • Web Workers: Use Web Workers to run the parsing task in a separate thread.
    • Chunking: Break down the CSV file into smaller chunks and process them sequentially. This can help manage memory and prevent long blocking times.
    • Streaming: If possible, stream the CSV data and process it in chunks as it arrives, rather than loading the entire file into memory.
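
The chunking and progress ideas above can be sketched in TypeScript as follows. This is one possible approach under stated assumptions, not a definitive implementation: `ChunkedLineReader`, `ByteSource`, and `processInChunks` are hypothetical names, the file is assumed to be UTF-8, and quoted fields are assumed not to contain line breaks (otherwise a record could be split across two lines). `ByteSource` is written so that a browser `File` or `Blob` should satisfy it structurally.

```typescript
// Hypothetical helper: turns a sequence of byte chunks into complete lines.
// A partial line at the end of a chunk is held back until the next chunk.
class ChunkedLineReader {
  // With { stream: true }, the decoder buffers a multi-byte character
  // that is cut in half by a chunk boundary.
  private decoder = new TextDecoder("utf-8");
  private remainder = "";

  // Decode one chunk and return the lines it completes.
  push(chunk: Uint8Array): string[] {
    const text = this.remainder + this.decoder.decode(chunk, { stream: true });
    const lines = text.split(/\r?\n/);
    this.remainder = lines.pop() ?? ""; // last piece may be incomplete
    return lines;
  }

  // Return whatever is left after the final chunk.
  flush(): string[] {
    const text = this.remainder + this.decoder.decode();
    this.remainder = "";
    return text === "" ? [] : [text];
  }
}

// Minimal shape needed from the input (assumption: File and Blob match it).
interface ByteSource {
  size: number;
  slice(start: number, end: number): { arrayBuffer(): Promise<ArrayBuffer> };
}

// Read the source slice by slice so that only one chunk is in memory at a
// time, and report progress as a fraction between 0 and 1 after each chunk.
async function processInChunks(
  source: ByteSource,
  onLines: (lines: string[]) => void,
  onProgress: (fraction: number) => void,
  chunkSize: number = 1024 * 1024, // 1 MiB, an arbitrary default
): Promise<void> {
  const reader = new ChunkedLineReader();
  for (let offset = 0; offset < source.size; offset += chunkSize) {
    const end = Math.min(offset + chunkSize, source.size);
    const buffer = await source.slice(offset, end).arrayBuffer();
    onLines(reader.push(new Uint8Array(buffer)));
    onProgress(end / source.size);
  }
  const rest = reader.flush();
  if (rest.length > 0) onLines(rest);
}
```

Because the loop awaits each slice, the browser can handle other events between chunks. The same function could also be called from inside a Web Worker, with `onLines` and `onProgress` forwarding results to the main thread via `postMessage`, to keep parsing off the UI thread entirely.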
