
Fast upsert mode #34

@jqnatividad


Currently, DP+, like Datapusher and xloader, only does drop & replace and does not support upserts.

It'd be great if DP+ could support upserts in a performant way.

This can be done by:

  • adding a resource-level metadata field that the Data Publisher can set to enable upsert mode.
  • when a resource has upsert mode enabled, instead of drop & replace, DP+ will:
    • compare the schemas of the existing resource and the new CSV to check that they are identical (qsv can do this very quickly)
    • if they are not identical, abort with a message stating that the resource is in upsert mode and the schemas do not match
    • if the schemas are identical, do a PostgreSQL COPY of the file to be pushed into a temporary table
    • then run an INSERT INTO ... ON CONFLICT DO UPDATE to upsert the temporary table into the existing resource
    • finally, drop the temporary table
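
The steps above could be sketched roughly as follows. This is a minimal illustration, not DP+ code. The function names (`read_header`, `build_upsert_statements`, `plan_upsert`), the temporary table name `dp_upsert_tmp`, and the `key_columns` parameter are all hypothetical. The schema check here compares only CSV header names using Python's `csv` module as a stand-in for the qsv-based comparison, and it ignores column types. The sketch also assumes the existing table has a unique index or primary key on `key_columns`, which PostgreSQL requires for `ON CONFLICT (...)` to resolve a conflict target. It only builds the SQL statements and does not connect to a database.

```python
import csv
import io


def read_header(csv_text):
    """Return the header row (list of column names) of a CSV given as text."""
    return next(csv.reader(io.StringIO(csv_text)))


def quote_ident(name):
    """Double-quote a PostgreSQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_upsert_statements(table, columns, key_columns, temp_table="dp_upsert_tmp"):
    """Build the SQL for: create temp table, COPY, upsert, drop temp table.

    Assumption: `table` has a unique index or primary key on `key_columns`.
    """
    if not key_columns or not set(key_columns) <= set(columns):
        raise ValueError("key_columns must be a non-empty subset of columns")

    cols = ", ".join(quote_ident(c) for c in columns)
    keys = ", ".join(quote_ident(c) for c in key_columns)
    non_keys = [c for c in columns if c not in key_columns]

    if non_keys:
        # EXCLUDED refers to the row that was proposed for insertion.
        updates = ", ".join(
            f"{quote_ident(c)} = EXCLUDED.{quote_ident(c)}" for c in non_keys
        )
        action = f"DO UPDATE SET {updates}"
    else:
        # Every column is part of the key, so there is nothing to update.
        action = "DO NOTHING"

    t, tmp = quote_ident(table), quote_ident(temp_table)
    return [
        f"CREATE TEMP TABLE {tmp} (LIKE {t} INCLUDING DEFAULTS)",
        f"COPY {tmp} ({cols}) FROM STDIN WITH (FORMAT csv, HEADER true)",
        f"INSERT INTO {t} ({cols}) SELECT {cols} FROM {tmp} "
        f"ON CONFLICT ({keys}) {action}",
        f"DROP TABLE {tmp}",
    ]


def plan_upsert(existing_columns, new_csv_text, table, key_columns):
    """Abort on a header mismatch, otherwise return the upsert statements."""
    new_columns = read_header(new_csv_text)
    if list(existing_columns) != new_columns:
        raise ValueError(
            "resource is in upsert mode and the schemas do not match: "
            f"existing={list(existing_columns)} new={new_columns}"
        )
    return build_upsert_statements(table, new_columns, key_columns)


if __name__ == "__main__":
    sample_csv = "id,name,amount\n1,alpha,10\n2,beta,20\n"
    for stmt in plan_upsert(["id", "name", "amount"], sample_csv, "my_resource", ["id"]):
        print(stmt + ";")
```

In a real implementation, the `COPY ... FROM STDIN` statement would be fed the CSV through the database driver's copy facility, and all four statements would most likely run inside a single transaction so that a failed upsert leaves the existing resource untouched.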


Labels: enhancement
