Description
The current ETag handling is, IMHO, more problematic than helpful at this stage, especially when there is a failure during import. See 255b7f9#diff-938a299f8406c1d3defaec48838bc4f6f1307635f5d2ba36e9777227cedbb383R46
- We are not using ETags correctly: ETags are designed to be passed back in HTTP headers. We should instead send the proper HTTP header as per https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-None-Match and integrate this carefully into the processing, as this typically requires streamed requests (stream=True in the requests library) and proper handling of the HTTP status code. In all cases this should end up being a single request, not a HEAD followed by another GET request.
- ETags are unlikely to be meant for long-term storage in the DB, which is what we are doing now.
- The `create_etag` function name is misleading, as this function creates, checks, and saves the ETag.
- If an import fails, we have to manually delete the records in the DB to restart the import.
- The time it takes to download the data seems very small compared with the time it takes to perform the import.
- Once the initial import is done, incremental imports should be much smaller, making ETags even less relevant.
Therefore, I think we need to reconsider using ETags.
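
If we do keep ETags, the single-request flow described above could look roughly like the sketch below. This is only an illustration, not existing project code: `fetch_if_changed` is a hypothetical helper, and `session` is assumed to be a `requests.Session` (or any object with the same `get()` interface).

```python
from typing import Iterator, Optional, Tuple


def fetch_if_changed(
    session, url: str, known_etag: Optional[str] = None, chunk_size: int = 65536
) -> Tuple[Optional[str], Optional[Iterator[bytes]]]:
    """Issue one conditional GET and return (etag, chunks).

    chunks is None when the server answers 304 Not Modified.
    `session` is assumed to behave like requests.Session (hypothetical helper).
    """
    headers = {}
    if known_etag:
        # Ask the server to skip the body if our copy is still current.
        headers["If-None-Match"] = known_etag

    # stream=True defers the body download until the caller iterates over it.
    response = session.get(url, headers=headers, stream=True, timeout=30)

    if response.status_code == 304:
        # Not Modified: nothing to import; release the connection.
        response.close()
        return known_etag, None

    # Any 4xx/5xx status is a real failure and should surface to the caller.
    response.raise_for_status()
    return response.headers.get("ETag"), response.iter_content(chunk_size=chunk_size)
```

A caller would pass something like `requests.Session()` and consume the returned chunks during the import. One possible design choice is to persist the returned ETag only after the import has succeeded, so that a failed import is simply retried on the next run instead of requiring manual cleanup in the DB.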