Import/Export Assistant #161

@FelixKirsch

Is your feature request related to a problem? Please describe.
Currently, refinery lets the user provide upload options that define how the uploaded data should be imported (e.g. column separator, line terminator). But the import often does not work as expected, for example because the defined options do not exactly match the format of the uploaded file.
Further, users cannot specify that only part of the data should be imported into refinery, nor can they map imported data to existing data in refinery (e.g. user data).

Describe the solution you'd like

The import and export should be supported by an assistant. This assistant would preview how the uploaded data would be imported into refinery and provide more options.

Preview
The assistant should include a view that displays how one or a few sample records would look in the import or export.
For an import, it would show which attributes would be created and the values they would hold for the sample records. For an export, it would show the exported record, for example the generated JSON string.
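A minimal sketch of such a preview, assuming the upload is plain CSV text and the options are passed straight through to `pandas.read_csv`. The helper names `preview_import` and `preview_export` are hypothetical and not part of refinery:

```python
import io

import pandas as pd


def preview_import(raw_text, n_records=2, **read_csv_options):
    """Parse only the first n_records rows and report what would be created.

    Hypothetical helper: read_csv_options are forwarded to pandas.read_csv
    (e.g. sep, lineterminator), mirroring the existing upload options.
    """
    frame = pd.read_csv(io.StringIO(raw_text), nrows=n_records, **read_csv_options)
    return {
        "attributes": list(frame.columns),            # attributes that would be created
        "records": frame.to_dict(orient="records"),   # sample values per attribute
    }


def preview_export(frame, n_records=1):
    """Return the JSON string that would be exported for the first records."""
    return frame.head(n_records).to_json(orient="records")


sample = "id;headline\n1;Hello world\n2;Second record\n3;Third record\n"

# With the default separator "," the whole line ends up in a single attribute,
# which is the kind of mismatch the preview is meant to surface.
wrong = preview_import(sample)
right = preview_import(sample, sep=";")

exported = preview_export(pd.read_csv(io.StringIO(sample), sep=";"))
```

Comparing `wrong["attributes"]` with `right["attributes"]` shows the user immediately whether the chosen separator matches the file, before anything is written to the project.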

Provide more options

  • Pandas import options
    As is already supported, the user should be able to specify pandas import options, such as column separator and line terminator.
  • Mappings
    Users should be able to create mappings for the imported data. For example, a mapping between users in the import and users in refinery.
  • Extraction data
    In refinery, extraction data is labeled at the token level (tokens are defined by spaCy). Other labeling tools follow different approaches, e.g. Label Studio lets the user label any character span (charspan). Charspans must therefore be matched to tokens when such data is imported into refinery. Different strategies can be applied for the matching, e.g. expanding the charspan to the nearest token boundaries. The user should be able to choose between these strategies.
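The charspan matching described in the last item can be sketched as follows. This is an illustration under stated assumptions, not refinery's implementation: a simple regex tokenizer stands in for spaCy, the function `align_charspan` is hypothetical, and of the two strategies only "expand" is named in this issue ("contract" is added as one possible alternative):

```python
import re


def tokenize(text):
    """Stand-in tokenizer (assumption): (start, end) character offsets per token."""
    return [(m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def align_charspan(tokens, start, end, strategy="expand"):
    """Map the character span [start, end) to a token range (first, last_exclusive).

    "expand":   include every token that overlaps the span at all.
    "contract": include only tokens that lie completely inside the span.
    Returns None if no token qualifies.
    """
    if strategy == "expand":
        hits = [i for i, (s, e) in enumerate(tokens) if s < end and e > start]
    elif strategy == "contract":
        hits = [i for i, (s, e) in enumerate(tokens) if s >= start and e <= end]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    if not hits:
        return None
    return hits[0], hits[-1] + 1


text = "Refinery imports labels."
tokens = tokenize(text)  # offsets of "Refinery", "imports", "labels", "."

# Charspan 9..20 covers "imports lab": it cuts the token "labels" in half.
expanded = align_charspan(tokens, 9, 20, strategy="expand")
contracted = align_charspan(tokens, 9, 20, strategy="contract")
```

Here "expand" yields the tokens "imports" and "labels", while "contract" keeps only "imports". A span that contains no complete token returns `None` under "contract", which the assistant could flag to the user in the preview.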

Additional context
test finding v1.5.0

    Labels

    enhancement: New feature or request
