
Allow to update duplicates on conflict in PostgreSQL #40


Merged
merged 2 commits into jamis:master on Mar 25, 2019

Conversation

@sobstel (Contributor) commented Aug 21, 2018

This PR adds support for PostgreSQL 9.5+ ON CONFLICT DO UPDATE (https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5#INSERT_..._ON_CONFLICT_DO_NOTHING.2FUPDATE_.28.22UPSERT.22.29).
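For reference, the clause this feature generates looks roughly like the following (a minimal sketch of the SQL-building step; the method name and exact quoting here are illustrative, not the gem's actual internals):

```ruby
# Sketch: build a PostgreSQL ON CONFLICT DO UPDATE clause.
# `conflict_columns` name the unique constraint's columns (the "conflict target");
# each inserted column is then overwritten from the proposed row via EXCLUDED.
def on_conflict_do_update(conflict_columns, columns)
  target = conflict_columns.join(', ')
  assignments = columns.map { |c| %{"#{c}"=EXCLUDED."#{c}"} }.join(', ')
  " ON CONFLICT (#{target}) DO UPDATE SET #{assignments}"
end

on_conflict_do_update(%w[title], %w[title author])
# => ' ON CONFLICT (title) DO UPDATE SET "title"=EXCLUDED."title", "author"=EXCLUDED."author"'
```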

It has been implemented as suggested in #34 (comment).

However, in hindsight I feel that these things (ignore, update_duplicates, etc.) make this simple gem a bit too bloated. You should probably do standard stuff with the gem and use custom queries for everything else (I eventually ended up with a custom script for my use case: https://gist.github.com/sobstel/1b022e17f2f2bb276e766f7fea974b17).

Feel free to merge it if you disagree ;-)

@mberlanda (Collaborator) left a comment

A couple of remarks about testing and the options passed to the method.

This is a great initiative and could actually extend support for MySQL as well. To recap:

  • add support for update_duplicates as a boolean, to be consistent with the MySQL implementation
  • consider passing update_duplicates a list of columns to update (supported by both MySQL and PG) instead of the conflicting columns (you may rely on the primary key for this)
  • more testing

Book.bulk_insert(*destination_columns, update_duplicates: true) do |worker|
  worker.add(...)
  worker.add(...)
  # ...
end

# Update duplicate rows (PostgreSQL)
Book.bulk_insert(*destination_columns, update_duplicates: %w[title]) do |worker|
  worker.add(...)
  # ...
end
@mberlanda (Collaborator) commented:

It may be a little misleading that this argument's type changes according to the adapter.

Assuming that someone has set their primary keys correctly, I would find it more intuitive for update_duplicates: to provide the list of columns to update. This would make the option supported for MySQL as well.

@sobstel (Contributor, Author) commented Nov 20, 2018

> I would consider more intuitive that update_duplicates: will provide the list of the columns to update.

I would normally consider it more intuitive as well, but Postgres requires an inference specification or a constraint name (in short: it needs the conflicting columns). I've spent a while on this and find it hard to come up with a satisfying abstraction that works for both MySQL and Postgres.
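The asymmetry is easy to see side by side (a hedged sketch with simplified clause builders, not the gem's code): MySQL infers the unique key itself and only needs the columns to update, while PostgreSQL must additionally be told which columns conflict.

```ruby
# MySQL: the server picks whichever unique key conflicts, so the clause
# only lists the columns to overwrite.
def mysql_upsert_clause(update_columns)
  ' ON DUPLICATE KEY UPDATE ' +
    update_columns.map { |c| "`#{c}`=VALUES(`#{c}`)" }.join(', ')
end

# PostgreSQL: the conflict target (or a constraint name) is mandatory,
# in addition to the SET list.
def pg_upsert_clause(conflict_columns, update_columns)
  " ON CONFLICT (#{conflict_columns.join(', ')}) DO UPDATE SET " +
    update_columns.map { |c| %{"#{c}"=EXCLUDED."#{c}"} }.join(', ')
end

mysql_upsert_clause(%w[author])
# => ' ON DUPLICATE KEY UPDATE `author`=VALUES(`author`)'
pg_upsert_clause(%w[title], %w[author])
# => ' ON CONFLICT (title) DO UPDATE SET "author"=EXCLUDED."author"'
```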

@mberlanda (Collaborator) commented:

What about considering the following scenarios?

  • false, the default
  • %w[...], only the selected columns
  • true, all destination columns

@sobstel (Contributor, Author) commented:

I don't think that solves the issue. We have destination columns (the data you insert) and we have update_duplicates (the unique columns for Postgres, which MySQL determines automatically).

  • %w[...], only the selected columns

How would that work for MySQL?

  • true, all destination columns

How would that work for PostgreSQL?

@mberlanda (Collaborator) commented:

Hey @sobstel, thank you for your patience 😄

What I meant is the following:

Let's imagine we have a foos table with columns a, b, c, d.

destination_columns = [:a, :b]
# I would see 3 use cases

# PostgreSQL
Foo.bulk_insert(*destination_columns, update_duplicates: false) # clear
Foo.bulk_insert(*destination_columns, update_duplicates: [:a])  # clear
Foo.bulk_insert(*destination_columns, update_duplicates: true)
# => true will be a shortcut for update_duplicates: destination_columns

# MySQL
Foo.bulk_insert(*destination_columns, update_duplicates: false) # clear
Foo.bulk_insert(*destination_columns, update_duplicates: true)  # clear
Foo.bulk_insert(*destination_columns, update_duplicates: [:a])
# => you will update only a

elsif adapter_name =~ /^mysql/i && update_duplicates
  update_values = @columns.map do |column|
    "`#{column.name}`=VALUES(`#{column.name}`)"
  end.join(', ')
  ' ON DUPLICATE KEY UPDATE ' + update_values

Suggested new implementation:

elsif adapter_name =~ /^mysql/i && update_duplicates
  columns_scope =
    if update_duplicates.is_a?(Array)
      ->(c) { update_duplicates.map(&:to_s).include?(c.name) }
    else
      ->(c) { c }
    end
  update_values = @columns.select(&columns_scope).map do |column|
    "`#{column.name}`=VALUES(`#{column.name}`)"
  end.join(', ')
  ' ON DUPLICATE KEY UPDATE ' + update_values

You can try it out in irb easily:

$ irb
irb(main):001:0> update_duplicates = ['a', :b, 1]
=> ["a", :b, 1]
irb(main):002:0> columns = ["b", "1"]
=> ["b", "1"]
irb(main):003:0> columns_scope = ->(c) { update_duplicates.map(&:to_s).include?(c) }
=> #<Proc:0x00007fa17980ba20@(irb):3 (lambda)>
irb(main):004:0> columns.select(&columns_scope)
=> ["b", "1"]
irb(main):005:0> id = ->(c) { c }
=> #<Proc:0x00007fa1768544d0@(irb):5 (lambda)>
irb(main):006:0> update_duplicates.select(&id)
=> ["a", :b, 1]

@sobstel (Contributor, Author) commented Jan 3, 2019

Thanks for your reply, and sorry for another long wait. Unfortunately I'm struggling to find more time to invest in this.

Actually, you have the implementation done above. Feel free to update it, or even to create a separate PR with the final solution including your improvements.


Would love to see this go into the code; we cannot use this amazing gem because of the lack of Postgres support.

@mberlanda mberlanda dismissed their stale review January 14, 2019 22:12

@sobstel I would keep #40 (comment) as an improvement for a separate PR. Can you prove with an example that this implementation is working? If so, I would be OK to pass it.

@ponnusamygit commented Mar 25, 2019

@sobstel's PR is working fine.

[screenshot]

Inserted value 'Grade 1', then updated it to 'Grade 2'.

@mberlanda, are you merging this?

@mberlanda (Collaborator) left a comment

This works and has been open for a while. Since it looks like it is in pretty high demand, I will merge it, but I do not have the rights to release. (cc @jamis, could you bump a minor version? 🙇)

@sobstel I guess we need to find some energy and time to create some adapters, because we are starting to get too many regexps like adapter_name =~ /\APost(?:greSQL|GIS)/i.
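The adapter refactoring hinted at here could start as small as centralizing the regexp dispatch in one place (a hypothetical sketch, not the gem's code; the constant and method names are made up):

```ruby
# Map adapter_name to a symbol once, instead of repeating regexps like
# /\APost(?:greSQL|GIS)/i at every call site that builds SQL.
ADAPTERS = {
  /\APost(?:greSQL|GIS)/i => :postgresql,
  /\Amysql/i              => :mysql,
  /\Asqlite/i             => :sqlite,
}.freeze

def adapter_key(adapter_name)
  ADAPTERS.each { |pattern, key| return key if adapter_name =~ pattern }
  :generic
end

adapter_key('PostGIS') # => :postgresql
adapter_key('Mysql2')  # => :mysql
```

Each symbol could then point at a small strategy object that knows how to build its own upsert clause, which would keep the adapter-specific branching out of the insert path.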

@mberlanda mberlanda merged commit 3ded6b0 into jamis:master Mar 25, 2019
@sobstel (Contributor, Author) commented Apr 4, 2019

> @sobstel I guess we need to find some energies and time to create some adapters because we are starting to get too many regexp like adapter_name =~ /\APost(?:greSQL|GIS)/i.

I agree, but I'm still not able to set aside any time for this. I'll try to keep it in mind.

4 participants