(PDB-5747) Batch database inserts #3986

rbrw · 2024-07-01T18:40:13Z

No description provided.

CLAassistant · 2024-07-01T18:40:18Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

In order to support upcoming changes to batch inserts, adjust the index error handler to also accept a collection of resources and report any that might be large enough to have provoked the error.

Even though the actual pg limit is (currently) 8191 bytes, we can't be certain of the culprit because, if nothing else, that limit applies after possible in-line compression: https://www.postgresql.org/message-id/8326.1289618101%40sss.pgh.pa.us

Treat a resource as suspicious if the UTF-8 encoded length of any of its indexed keys exceeds 8000. Just use UTF-8 for now since the other encodings seem unlikely: https://www.postgresql.org/docs/current/multibyte.html#MULTIBYTE-CHARSET-SUPPORTED
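A minimal sketch of the suspicion check described above, written in Python rather than the project's Clojure. The function names, the dict-shaped resource, and the choice of `type` and `title` as the indexed keys are assumptions made for illustration; only the 8000-byte threshold and the UTF-8 measurement come from the commit message.

```python
# Threshold from the commit message; pg's own limit is (currently) 8191 bytes,
# but it applies after possible in-line compression, so this is only a heuristic.
SUSPICIOUS_BYTES = 8000


def utf8_len(value):
    """Return the length of value in bytes once UTF-8 encoded."""
    return len(str(value).encode("utf-8"))


def suspicious_resources(resources, indexed_keys=("type", "title")):
    """Return the resources with any indexed key longer than the threshold.

    indexed_keys is a hypothetical default, not taken from the PuppetDB schema.
    """
    return [
        r for r in resources
        if any(utf8_len(r.get(k, "")) > SUSPICIOUS_BYTES for k in indexed_keys)
    ]
```

Measuring bytes rather than characters matters here: a title of 4001 two-byte characters is already over the threshold even though its character count is far below it.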

Use a composed transducer for much of the work, avoiding multiple layers of transient garbage (per-map data structures). Can't use a simple composed function even if we wanted to, given the filter. Handle the filtering more directly.

rbrw added the don't merge label Jul 1, 2024

rbrw force-pushed the pdb-5747-batch-db-inserts branch from cc89ad8 to 80d5135 Compare July 11, 2024 20:53

austb approved these changes Jul 16, 2024

rbrw added 2 commits July 16, 2024 13:06

(PDB-5747) project: add -XX:-OmitStackTraceInFastThrow to dev profile

6458592

So we'll always see stack traces during development.

(PDB-5747) resource-key-too-big-for-pg-index: fix resource key

d658163

Rename test from exceeding-db-index-limit-produces-annotated-error and make sure the type and title in the key match the type and title in the resource.

rbrw force-pushed the pdb-5747-batch-db-inserts branch 3 times, most recently from 8bfe5c0 to 6fb376a Compare July 16, 2024 21:15

austb removed the don't merge label Jul 16, 2024

austb marked this pull request as ready for review July 16, 2024 21:16

austb requested review from a team as code owners July 16, 2024 21:16

rbrw added 17 commits July 16, 2024 16:55

(PDB-5747) update-catalog-resources!: report *resource* file/line

3697558

When reporting an index error, get the file and line from the resource itself, not just the attempted updates.

(PDB-5747) storage: replace diff-fn with explicit diffing

df57faa

(PDB-5747) puppetdb.jdbc: reformat ns before further changes

47d036f

(PDB-5747) scf.storage: reformat ns before further changes

2b1600c

(PDB-5747) Drop copied jdbc code (e.g. for insert-multi on-conflict)

a904cb7

Drop the code we copied from java.jdbc (in order to add support for on-conflict) in favor of hsql's :on-conflict support. Change realize paths to batch and to be more efficient.
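As a rough illustration of what batching inserts with conflict handling can look like, here is a Python sketch (not the project's Clojure, and not the hsql API). The helper names, the default batch size, the `%s` placeholder style, and the choice of `ON CONFLICT DO NOTHING` are all assumptions; the commit only says that hsql's :on-conflict support is used and that the paths now batch.

```python
from itertools import islice


def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_statement(table, columns, n_rows):
    """Build one parameterized multi-row INSERT that skips conflicting rows."""
    row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values = ", ".join([row] * n_rows)
    return (f"INSERT INTO {table} ({', '.join(columns)}) "
            f"VALUES {values} ON CONFLICT DO NOTHING")


def batched_inserts(table, columns, rows, size=500):
    """Return one (sql, params) pair per batch, ready for a DB-API cursor."""
    out = []
    for chunk in batches(rows, size):
        # Flatten the rows so the parameters line up with the placeholders.
        params = [value for r in chunk for value in r]
        out.append((insert_statement(table, columns, len(chunk)), params))
    return out
```

Each pair can be passed to a cursor's `execute`, so a few statements replace one round trip per row.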

(PDB-5747) add-certnames: batch explicitly

10f184c

(PDB-5747) Drop convert-tags-array

431a35e

(PDB-5747) add-params!: batch explicitly

c3c9ffa

(PDB-5747) insert-catalog-resources!: batch pg inserts

68cc476

(PDB-5747) replace-edges!: batch pg inserts

cbed5bb

(PDB-5747) store-catalog-inputs!: batch pg inserts

41b07da

(PDB-5747) update-packages insert-packages: batch pg inserts

f1f6138

(PDB-5747) insert-missing-packages: batch pg inserts

346812e

(Use plan to avoid creating unnecessary clojure data structures.)

(PDB-5747) add-report!*: move events insert-resource-events

ce58b85

...in advance of further changes.

(PDB-5747) insert-resource-events: replace specter with map

b4df135

Switch to a map/update so we'll be able to compose a whole set of the operations as transducers.

(PDB-5747) insert-resource-events: move filter up so we can comp

9767462

Filter before remove-dupes so we can compose it with the other operations when we switch them to transducers.
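To illustrate why putting the filter first makes the steps composable, here is a sketch using Python generators as a loose stand-in for Clojure transducers. The mechanism differs, but both process one item at a time without building an intermediate collection per step. The event fields and the particular filter and normalization are hypothetical.

```python
def keep_valid(events):
    """Drop events with no status (hypothetical filter)."""
    for e in events:
        if e.get("status") is not None:
            yield e


def remove_dupes(events):
    """Keep only the first event seen for each (type, title) pair."""
    seen = set()
    for e in events:
        key = (e["resource_type"], e["resource_title"])
        if key not in seen:
            seen.add(key)
            yield e


def normalize(events):
    """Lower-case the status (hypothetical per-event transformation)."""
    for e in events:
        yield {**e, "status": e["status"].lower()}


def pipeline(events):
    """Compose the steps; each event flows through all three in one pass."""
    return normalize(remove_dupes(keep_valid(events)))
```

Because the filter runs first, the later steps only ever see events that will be kept, and the whole chain can be consumed lazily by whatever performs the insert.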

rbrw added 8 commits July 16, 2024 16:56

(PDB-5747) insert-resource-events: batch pg inserts

c340a41

(PDB-5747) add-report!*: update indentation

65c0ec7

Apparently clojure-mode (now?) handles time! like time.

(PDB-5747) add-report!*: simplify and do less work

643dac3

Merge the existing "maybe" helpers into the code, and only add keys when needed, instead of adding and then conditionally removing them.
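A small Python sketch of the "only add keys when needed" pattern, as opposed to adding every key and then stripping the empty ones. The function name and the field names are hypothetical and are not taken from add-report!*.

```python
def report_row(report):
    """Build a row, adding optional keys only when they have a value."""
    row = {"certname": report["certname"], "hash": report["hash"]}
    for key in ("metrics", "logs"):  # hypothetical optional fields
        value = report.get(key)
        if value is not None:
            row[key] = value
    return row
```

The row never holds a key that later has to be removed, so there is no second pass over it.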

(PDB-5747) scf.storage: simplify ensure and id functions

47d2ee2

(PDB-5747) add-catalog-metadata!: avoid returning db data via execute-one!

1029204

(PDB-5747) add-facts!: avoid returning db data via execute-one!

6dcdedb

(PDB-5747) add-report!*: return just id via select-one!

ae097c7

rbrw force-pushed the pdb-5747-batch-db-inserts branch from 6fb376a to ae097c7 Compare July 16, 2024 22:00

austb merged commit f5be6a1 into puppetlabs:main Jul 16, 2024
