Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ReturnValuesOnConditionCheckFailure (#245) #246

Merged
merged 3 commits into from
Aug 26, 2024

Conversation

guregu
Copy link
Owner

@guregu guregu commented Aug 22, 2024

See: #245

Adds two ways of obtaining the current value if a condition check fails.

  • Specify that the error should return the item with IncludeItemInCondCheckFail
  • Unmarshal the item from an error, use UnmarshalItemFromCondCheckFailed for Put/Update/Delete or UnmarshalItemsFromTxCondCheckFailed for WriteTx
  • Use the CurrentValue methods for Put/Update/Delete

Currently it will always set input.ReturnValuesOnConditionCheckFailure = types.ReturnValuesOnConditionCheckFailureAllOld if there is a condition. I can add an opt-out if anyone needs it. It doesn't affect the cost of queries.
Update: added IncludeItemInCondCheckFail to control this.

@extemporalgenome
Copy link

It doesn't affect the cost of queries.

It could increase egress traffic costs for those cases where it's not desired. It could also slow time-to-last-byte in bandwidth constrained cases.

It this PR always asking for the item state on cond failure, or just one of those new methods is called?

Is the distinction you're mentioning just between AllOld and the only-changes enum variant?

@extemporalgenome
Copy link

Looking through the PR, it looks like the value is extracted from the error. Would it be hard to implement as such so the importing code continues to use CurrentValue to specify the destination, while using new companion methods, like ReturnValueOnCondFailure (or perhaps a catch-all configuration method which takes an binary-or'd enum of the cases in which the return value is desired), and then just store the result in the pointer passed to CurrentValue if the relevant outcome (success, cond failure, tx failure) applies?

@guregu
Copy link
Owner Author

guregu commented Aug 22, 2024

I was waffling a bit about whether to make it default or not.

The easiest way to make it hard to misuse, IMO, would be to make the error-extracting functions private, and only enable returning the failed item for terminal methods like CurrentValue (which is what you suggested, I think? Apologies if I misunderstood).

I think that if it were disabled by default it's a little too easy to shoot yourself in the foot by not specifying that you want the failed item, assuming an API like:

err := table.Put(item).ReturnValueOnCondCheckFailure(true).Run() // now it's ok to grab the item from Error
err := table.Put(item).Run(); _, err2 := UnmarshalItemFromCondCheckFailed(err, &item) // fails for a potentially confusing reason if you aren't familiar with the underlying api

For transactions I think that without the failed items the errors can become borderline useless when you have more than one item with conditions, it's hard to tell which items were the problem ones. However I do understand that traffic may be a concern for some (even if DynamoDB doesn't charge for bandwidth, there are cross-AZ ingress fees and such).

@guregu
Copy link
Owner Author

guregu commented Aug 23, 2024

Hmm, after some pondering I'm thinking it's probably best to have an individual toggle. It can be useful in the case where you only care about some of all of the items in a transaction (for example, I could see a case where you have multiple conditions but only one item you need to grab info from).

The latest commit changes it to be off by default and adds a method (IncludeItemInCondCheckFail) to toggle it. CurrentValue will enable it too. I tried to make the error message from UnmarshalItemFromCondCheckFailed better in the case where it's disabled.

I could make it included by default and you need to opt-out with IncludeItemInCondCheckFail(false), but not sure if that would be a good idea. In general, I try to make the user-friendly option the default but still allow for advanced performance boosting settings (Limit vs SearchLimit is an example of that), so it's tempting to make it include by default, but I could see arguments for the other side as well. 🤔

@extemporalgenome
Copy link

DynamoDB is a database that gives you a lot of control (and for high-scale use-cases, is one in which you need to exert that control). My organization appreciates your library because it's convenient, yes, but generally avoids surprises (i.e. the application retains control.

Broadly, the main important detail for retaining control is behaving more like a query builder than a "opaque ORM" (and this library does query building quite nicely), and by providing either an opt-out or an opt-in to anything that could categorically impact performance/control. The request for SearchLimit fell into the category of a control-impacting opt-out.

Whether it's an opt-out or an opt-in probably isn't as important in the long term. In the short term, if you do start instructing Dynamo to return values when not asked, you could surprise current users in some cases.

Personally, I'd tend to favor opt-in (i.e. if you don't know you need it, then you don't need it), and because an extra line of code shouldn't generally be of much stylistic concern in Go, especially when it regards the behavior of a network round-trip.

In this particular case, it seems reasonable for Value/CurrentValue methods to implicitly handle the CondCheckFail case, and rely on opt-out if the user doesn't want it, presuming that writes will typically be designed to succeed more often than not.

Sorry for the inconsistent-seeming responses: they're just various thoughts that you can do with as you will. I think the PR as it is right now (with the latest commit) will provide my team good benefit, thanks! It is a little unusual to see a errors used as a value-passing mechanism, but it doesn't really hurt anything besides adding a few more symbols to the package API.

@guregu
Copy link
Owner Author

guregu commented Aug 24, 2024

Thanks! I agree passing values in an error is weird, but that's (unfortunately?) how the AWS SDK decided to implement it. I try to always pass the AWS-generated errors as-is so it's a consequence of that.

@extemporalgenome
Copy link

Out of interest, does the V2 SDK also pass these values via error? If not, it seems like it could be worth unexporting the new error-value-extraction functions until the time of V2. Even if V2 does also pass these via error, your method implementations do what they need to do (i.e. extract the value from the error), but that doesn't mean we need to expose the underlying helpers.

@guregu
Copy link
Owner Author

guregu commented Aug 24, 2024

This PR is for v2 (of the SDK and this library). The SDK bits are here: https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/dynamodb@v1.34.6/types#ConditionalCheckFailedException, this error is returned by the SDK and the Item field has the item we need. These kinds of special errors were broken for a while in the v1 SDK, although I think they did eventually fix it (or at least add a way to work around it), but I'd need to make a patch against the dynamo/v1 branch to support it.

I would like to avoid more clutter in the exported symbols but I'll probably need at least the one for transactions (afaik, there's no way to get the current value on successful Updates in a transaction, so something like CurrentValue isn't possible), so I'll probably just keep them for now.

@extemporalgenome
Copy link

Got it. In any case, the PR LGTM, thanks!

@guregu guregu merged commit f0ac63c into master Aug 26, 2024
2 checks passed
@guregu guregu deleted the old-value-condcheck branch August 26, 2024 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants