Description
The current implementation of AzureBlobStorageDataStore does not appear to be safe for multiple instances reading and writing the underlying blob where it stores its data. It seems to have problems on multiple levels...
Singleton, but not thread safe
Within the app it is treated as a singleton, which means multiple threads could be accessing the class at the same time. The class in general is not hardened for thread safety, so that could be a problem for starters.
Multi-instance definitely not safe
The real problem comes if the app is scaled out and multiple instances are running: there would definitely be problems, simply because there are no concurrency checks between the point when the blob is read by the Cache method and the point when it is potentially written later by the Commit method.
Consider the following scenario:
- Instance 1 does a read of the blob
- Instance 2 does a read of the blob
- Instance 1 calls methods that mutate the data in memory
- Instance 2 calls methods that mutate the data in memory
- Instance 1 commits
- Instance 2 commits
The current implementation always allows Instance 2 to stomp on Instance 1's changes.
This is because the ETag is not recorded by the implementation at read time. The Commit method does fetch the ETag right before it writes, so it knows it is overwriting whatever version of the data is currently stored, but that is not necessarily the version that was read by the Cache method and subsequently mutated.
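The lost update above can be reproduced with a small in-memory sketch (this is an assumed stand-in, not the real store or the Azure SDK). The key point is that fetching the ETag immediately before writing makes the If-Match check pass trivially, so both commits succeed:

```python
import itertools

class FakeBlob:
    """In-memory stand-in (assumed) for the single JSON blob."""
    def __init__(self, data):
        self._etags = itertools.count(1)
        self.data = dict(data)
        self.etag = next(self._etags)

    def read(self):
        return dict(self.data), self.etag

    def write(self, data, if_match):
        # The check passes whenever the caller supplies the *current*
        # etag -- which a write-time fetch always does.
        if if_match != self.etag:
            raise RuntimeError("412 Precondition Failed")
        self.data, self.etag = dict(data), next(self._etags)

blob = FakeBlob({"a": 1})

data1, _ = blob.read()   # Instance 1 reads
data2, _ = blob.read()   # Instance 2 reads
data1["a"] = 2           # Instance 1 mutates in memory
data2["b"] = 3           # Instance 2 mutates in memory

# Each instance fetches the etag right before writing, as Commit does
# today, so both writes pass the check:
blob.write(data1, if_match=blob.etag)   # Instance 1 commits
blob.write(data2, if_match=blob.etag)   # Instance 2 commits

print(blob.data)   # → {'a': 1, 'b': 3}: Instance 1's change is silently lost
```

Instance 2 read the blob before Instance 1's commit, so its write carries the stale value of "a", and the write-time ETag check does nothing to catch that.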
Possible solutions
The potential fix is to remember the ETag at read time in the Cache method and then use that value at write time in Commit. However, this introduces a problem that does not exist today, because the current implementation simply allows clobbering: what happens if Commit fails because another instance has updated the underlying data? Like any concurrency error, handling it has to ripple all the way upstream to the logic that was attempting to change the data in the first place, so that it can refresh the data it was working with and attempt to apply the same logical changes again. Even that may not be possible, in which case the failure has to propagate all the way back to the origin of the operation.
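The simplest form of that upstream ripple is a refresh-and-retry loop around the read/mutate/commit cycle. A sketch, again against an assumed in-memory stand-in rather than the real API (the names `commit_with_retry` and `ConflictError` are hypothetical):

```python
import itertools

class ConflictError(Exception):
    """Raised when the stored etag no longer matches the one read earlier."""

class FakeBlob:
    """In-memory stand-in (assumed) for the blob, with per-write etags."""
    def __init__(self, data):
        self._etags = itertools.count(1)
        self.data, self.etag = dict(data), next(self._etags)

    def read(self):
        return dict(self.data), self.etag

    def write(self, data, if_match):
        if if_match != self.etag:        # optimistic concurrency check
            raise ConflictError("412 Precondition Failed")
        self.data, self.etag = dict(data), next(self._etags)

def commit_with_retry(blob, mutate, max_attempts=3):
    """Read (recording the etag), apply the logical change, and commit
    with that same etag; on conflict, refresh and reapply."""
    for _ in range(max_attempts):
        data, etag = blob.read()              # remember etag at read time
        mutate(data)                          # reapply the logical change
        try:
            blob.write(data, if_match=etag)   # commit with the read-time etag
            return
        except ConflictError:
            continue                          # someone else wrote; retry
    raise ConflictError("could not commit after retries")

blob = FakeBlob({"count": 0})

_, stale_etag = blob.read()
blob.write({"count": 99}, if_match=blob.etag)  # a competing instance commits

try:
    blob.write({"count": 1}, if_match=stale_etag)  # stale write is rejected
except ConflictError:
    pass

# The retry loop reapplies the mutation against fresh data and succeeds:
commit_with_retry(blob, lambda d: d.update(count=d["count"] + 1))
print(blob.data)   # → {'count': 100}
```

Note that this only works when the logical change can be safely reapplied to refreshed data; as described above, when it cannot, the conflict has to surface to the caller.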
Now, it's possible that the implementation could act at a more logical level and detect non-conflicting changes to the overall JSON, but that might actually make the case that the data shouldn't be stored in a single JSON blob anyway. Splitting it up would mean you wouldn't have to load/cache the whole thing constantly as one document, and you wouldn't have to write that merge logic, because you could rely on a per-blob ETag instead. That would, of course, require a bigger change.
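To make the per-blob idea concrete, here is a sketch (assumed container and naming, not the real layout) where each logical entry lives in its own blob with its own ETag, so updates to different entries never conflict:

```python
class FakeContainer:
    """In-memory stand-in (assumed) for a blob container, keyed by blob name."""
    def __init__(self):
        self._blobs = {}   # name -> (data, etag)
        self._next = 1

    def read(self, name):
        return self._blobs[name]

    def write(self, name, data, if_match=None):
        current = self._blobs.get(name)
        # Existing blobs require a matching etag; new blobs are created freely.
        if current is not None and if_match != current[1]:
            raise RuntimeError("412 Precondition Failed")
        self._blobs[name] = (dict(data), self._next)
        self._next += 1

container = FakeContainer()
container.write("entry-a", {"v": 1})
container.write("entry-b", {"v": 2})

# Two instances updating *different* entries no longer conflict, because
# each entry carries its own etag:
_, etag_a = container.read("entry-a")
_, etag_b = container.read("entry-b")
container.write("entry-a", {"v": 10}, if_match=etag_a)
container.write("entry-b", {"v": 20}, if_match=etag_b)
```

Conflicts then only occur when two instances touch the same entry, which is exactly the case where an ETag check should fire.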