Allow to update duplicates on conflict in PostgreSQL #40

sobstel · 2018-08-21T18:08:39Z

It adds support for PostgreSQL 9.5+ ON CONFLICT DO UPDATE (
https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5#INSERT_..._ON_CONFLICT_DO_NOTHING.2FUPDATE_.28.22UPSERT.22.29).

It has been implemented as suggested in #34 (comment).

However, in a hindsight I feel that these things (ignore, update_duplicates, etc) make this simple gem a bit too bloated. You should probably do standard stuff with a gem and use... custom queries for all the others (I ended up with custom script for my use case eventually: https://gist.github.com/sobstel/1b022e17f2f2bb276e766f7fea974b17).

Feel free to merge it if you disagree ;-)

mberlanda

A couple of remarks about testing and the options passed to the method.

This is a great initiative and could actually extend more support for MySQL as well. To recap:

add support for update_duplicates with a boolean to be consistent with MySQL implementation
consider to pass to update_duplicates a list of column to update (supported by both MySQL and PG) instead of the conflicting columns (you may set a primary key for this)
more testing

test/bulk_insert/worker_test.rb

mberlanda · 2018-08-23T07:34:19Z

README.md

 Book.bulk_insert(*destination_columns, update_duplicates: true) do |worker|
  worker.add(...)
  worker.add(...)
  # ...
 end
+
+# Update duplicate rows (PostgreSQL)
+Book.bulk_insert(*destination_columns, update_duplicates: %w[title]) do |worker|


It may be a little bit misleading this change of argument type according to the adapter.

I would support the option update_duplicates: boolean argument to be consistent in the gem implementation (this is the reason the of original ON CONFLICT DO NOTHING to reproduce the behaviour of INSERT_IGNORE Psql conflict | Made ignore parameter dependent on adapter name #19)

The options to pass to this argument could be:

the columns to update (supported also in mysql https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html)

the conflicting columns (by default the primary key for MySQL)

Assuming that some has set correctly its primary keys, I would consider more intuitive that update_duplicates: will provide the list of the columns to update. This will make the option supported for MySQL as well.

I would consider more intuitive that update_duplicates: will provide the list of the columns to update.

I would normally consider it a more intuitive as well, but Postgres requires inference specification or constraint name (in short: it needs conflicting columns). I've spent a while on it now and I find it hard to come up with satisfying abstraction that will work both for MySQL and Postgres.

what about considering the following scenarios:

false, default

%[], only selected columns

true, all destination columns
?

I don't think it solves the issue. We have destination columns (data you insert) and we have update_duplicates (unique columns for postgres, which is done automatically by mysql).

%[], only selected columns

How it would for MySQL?

true, all destination columns

How it would work for PostgreSQL?

hey @sobstel , thank you for your patience 😄

What I meant is the following:

Let's imagine we have a foos table with columns a, b, c, d.

destination_columns = [:a, :b] # I would see 3 use cases # PG Foo.bulk_insert(*destination_columns, update_duplicates: false) #clear Foo.bulk_insert(*destination_columns, update_duplicates: [:a]) #clear Foo.bulk_insert(*destination_columns, update_duplicates: true) # => true will be a shortcut for update_duplicates: destination_columns # Mysql Foo.bulk_insert(*destination_columns, update_duplicates: false) #clear Foo.bulk_insert(*destination_columns, update_duplicates: true) # clear Foo.bulk_insert(*destination_columns, update_duplicates: [:a]) # => you will update only a

bulk_insert/lib/bulk_insert/worker.rb

Lines 155 to 159 in 91b1a5b

elsif adapter_name =~ /^mysql/i && update_duplicates

update_values = @columns.map do |column|

"`#{column.name}`=VALUES(`#{column.name}`)"

end.join(', ')

' ON DUPLICATE KEY UPDATE ' + update_values

New implementation suggested:

elsif adapter_name =~ /^mysql/i && update_duplicates columns_scope = \ if update_duplicates.is_a?Array ->(c) { update_duplicates.map(&:to_s).include?(c.name) } else ->(c) { c } end update_values = @columns.select(&columns_scope).map do |column| "`#{column.name}`=VALUES(`#{column.name}`)" end.join(', ') ' ON DUPLICATE KEY UPDATE ' + update_values

You can try it out in irb easily:

$ irb irb(main):001:0> update_duplicates = ['a', :b, 1] => ["a", :b, 1] irb(main):002:0> columns = ["b", "1"] => ["b", "1"] irb(main):003:0> columns_scope = ->(c) { update_duplicates.map(&:to_s).include?(c) } => #<Proc:0x00007fa17980ba20@(irb):3 (lambda)> irb(main):004:0> columns.select(&columns_scope) => ["b", "1"] irb(main):005:0> id = ->(c) { c } => #<Proc:0x00007fa1768544d0@(irb):5 (lambda)> irb(main):006:0> update_duplicates.select(&id) => ["a", :b, 1]

Thanks for your reply and sorry for another long wait. Unfortunately I struggle to find more time to invest into it, sorry.

Actually you have implementation done above. Feel free to update it or even to create separate PR with final solution including your improvements.

Would love to see this goes into code, we cannot use this amazing gem because lack of postgress support

@sobstel

@sobstel I would keep #40 (comment) as an improvement for a separated PR. Can you prove with an example that this implementation is working? If so, I would be OK to pass it

ponnusamygit · 2019-03-25T04:04:21Z

@sobstel's PR is working fine,

Inserted value 'Grade 1', updated it 'Grade 2'.

@mberlanda , Are you merging this?

mberlanda

This work and has been opened since a while. Since it looks it is pretty demanded i will merge but I do not have the rights to release. (cc @jamis could bump a minor version? 🙇 )

@sobstel I guess we need to find some energies and time to create some adapters because we are starting to get too many regexp like adapter_name =~ /\APost(?:greSQL|GIS)/i.

sobstel · 2019-04-04T19:57:52Z

@sobstel I guess we need to find some energies and time to create some adapters because we are starting to get too many regexp like adapter_name =~ /\APost(?:greSQL|GIS)/i.

I agree, but I'm still not able to accommodate any time for this, but I'll try to keep it in mind.

Allow to update duplicates on conflict in PostgreSQL

501dd65

mberlanda previously requested changes Aug 23, 2018

View reviewed changes

Restore test for postgres on conflict nothing

77f2f6c

mberlanda approved these changes Mar 25, 2019

View reviewed changes

mberlanda merged commit 3ded6b0 into jamis:master Mar 25, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow to update duplicates on conflict in PostgreSQL #40

Allow to update duplicates on conflict in PostgreSQL #40

sobstel commented Aug 21, 2018

mberlanda left a comment

mberlanda Aug 23, 2018

sobstel Nov 20, 2018 •

edited

Loading

mberlanda Nov 21, 2018

sobstel Nov 22, 2018

mberlanda Nov 23, 2018

sobstel Jan 3, 2019 •

edited

Loading

randavidovitz Mar 3, 2019

ponnusamygit commented Mar 25, 2019 •

edited

Loading

mberlanda left a comment

sobstel commented Apr 4, 2019

	elsif adapter_name =~ /^mysql/i && update_duplicates
	update_values = @columns.map do \|column\|
	"`#{column.name}`=VALUES(`#{column.name}`)"
	end.join(', ')
	' ON DUPLICATE KEY UPDATE ' + update_values

Allow to update duplicates on conflict in PostgreSQL #40

Allow to update duplicates on conflict in PostgreSQL #40

Conversation

sobstel commented Aug 21, 2018

mberlanda left a comment

Choose a reason for hiding this comment

mberlanda Aug 23, 2018

Choose a reason for hiding this comment

sobstel Nov 20, 2018 • edited Loading

Choose a reason for hiding this comment

mberlanda Nov 21, 2018

Choose a reason for hiding this comment

sobstel Nov 22, 2018

Choose a reason for hiding this comment

mberlanda Nov 23, 2018

Choose a reason for hiding this comment

sobstel Jan 3, 2019 • edited Loading

Choose a reason for hiding this comment

randavidovitz Mar 3, 2019

Choose a reason for hiding this comment

ponnusamygit commented Mar 25, 2019 • edited Loading

mberlanda left a comment

Choose a reason for hiding this comment

sobstel commented Apr 4, 2019

sobstel Nov 20, 2018 •

edited

Loading

sobstel Jan 3, 2019 •

edited

Loading

ponnusamygit commented Mar 25, 2019 •

edited

Loading