Releases: octue/octue-sdk-python

Use correct formatter for analysis logger

14 Sep 16:39
e970cb9

Summary

Simplify and correct the choice of log formatter on different platforms.

Contents

Enhancements

  • Remove COMPUTE_PROVIDER and USE_OCTUE_LOG_HANDLER from build arguments for Google Cloud Run Dockerfile (just keep as environment variables)

Fixes

  • Use correct formatter for analysis logger
  • Stop allowing COMPUTE_PROVIDER environment variable to override USE_OCTUE_LOG_HANDLER

Refactoring

  • Move decision on which log formatter to use to new octue.log_handlers.get_formatter function
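As a rough illustration of why keeping this decision in one function helps, here is a minimal sketch. It is not the SDK's actual `get_formatter` implementation, which these notes do not show: the function name `choose_formatter`, both format strings, and the `"GOOGLE_CLOUD_RUN"` value are assumptions made for the example.

```python
import logging
import os

# Hypothetical format strings; the SDK's real formats are not given in these notes.
LOCAL_FORMAT = "[%(asctime)s | %(levelname)s | %(name)s] %(message)s"
CLOUD_FORMAT = "[%(levelname)s | %(name)s] %(message)s"


def choose_formatter(environ=None):
    """Pick a log formatter from the environment in a single place.

    Hypothetical helper: the variable value checked here is an assumption.
    """
    environ = os.environ if environ is None else environ

    # Assumption: the cloud platform timestamps log lines itself, so the
    # cloud format leaves the timestamp out.
    if environ.get("COMPUTE_PROVIDER") == "GOOGLE_CLOUD_RUN":
        return logging.Formatter(CLOUD_FORMAT)

    return logging.Formatter(LOCAL_FORMAT)
```

With the choice made in one function, every handler can call it rather than re-reading environment variables, which makes the platform-specific behaviour easier to test.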

Testing

  • Expand and update log handler tests

Revert "Acknowledge Pub/Sub messages received on Google Cloud Run straight away"

08 Sep 15:08
0b737a8

Reverts #221

We've had to revert due to limitations in Google Cloud Run:

  • Only instances that haven't returned an HTTP status code yet are counted as active
  • Acknowledgement of trigger Pub/Sub messages can only be done by returning a status code
  • Threads can be launched so the trigger message can be acknowledged to avoid it being sent again, but the instance will then be treated as idle and killed after around 15 minutes
  • Together, this means that acknowledgement of trigger Pub/Sub messages can only be done after all processing has completed
  • There is also a 600s maximum acknowledgement deadline, after which the message is sent again. This spawns extra containers that carry out the same computation multiple times until the first instance has finished and acknowledged the trigger message
  • This means that long-running processes on Google Cloud Run can cause the same processing to happen many times, wasting a lot of compute resource
  • The only solution we can see to this currently (while staying on Google Cloud Run) is to set the acknowledgement deadline to its maximum of 600s and set the message retention deadline to its minimum of 600s so messages for longer-running processes aren't resent
  • The long-term solution is to stop using Cloud Run and use something like Knative instead
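The cost of the redelivery behaviour described above can be estimated with a little arithmetic. This is a simplified sketch, assuming the message is redelivered exactly once each time the acknowledgement deadline expires and that nothing else (such as message retention) stops redelivery; `expected_deliveries` is a name invented for this example.

```python
import math


def expected_deliveries(processing_seconds, ack_deadline_seconds=600):
    """Estimate how many times a trigger message is delivered before the
    first instance finishes and acknowledges it.

    Simplified model: one redelivery per expired acknowledgement deadline.
    """
    if processing_seconds <= 0:
        return 1

    return max(1, math.ceil(processing_seconds / ack_deadline_seconds))


# Under this model, a one-hour analysis with the maximum 600s deadline
# would be started six times (at 0s, 600s, ..., 3000s).
one_hour_runs = expected_deliveries(3600)
```

Anything that finishes within the deadline is delivered once; beyond that, the duplicated work grows roughly linearly with the length of the analysis.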

Acknowledge Pub/Sub messages received on Google Cloud Run straight away

07 Sep 20:20
b69ca57

Summary

Acknowledge Pub/Sub messages received on Google Cloud Run straight away rather than waiting for the analysis to complete first. This avoids triggering the same analysis multiple times (due to Pub/Sub sending the same message multiple times), wasting compute resource, and adding a large amount of noise to the logs.

Contents

Fixes

  • Answer questions from Google Cloud Run in a thread

Make input values optional when asking a question

07 Sep 19:44
34ede2b

Summary

Allow questions with only an input manifest to be asked (questions with only input values are already allowed).

Contents

Fixes

  • Make input_values optional in Service.ask

Allow input manifests referencing local files to be used when asking questions

07 Sep 16:30
6456ce2

Summary

Allow the input manifest sent to a child service to reference local files if the user confirms that the child will have access to them. This is useful if there are several children running on a single machine (or several machines with a shared filesystem) that produce files so large that it would cost too much or take too long to upload and download these from cloud storage repeatedly. A good example of this might be running heavy-computation/big data children on a high-performance computing cluster.

Contents

New features

  • Allow input manifests referencing local files to be used if the files can be accessed by the child and the allow_local_files parameter is True
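The kind of opt-in check this implies might look like the following sketch. Only the `allow_local_files` parameter name comes from these notes; the `validate_manifest_paths` function, the `gs://` prefix test, and the error message are assumptions for illustration and not the SDK's actual code.

```python
def validate_manifest_paths(paths, allow_local_files=False):
    """Reject local file paths unless the caller has explicitly opted in.

    Hypothetical helper: treats any path not starting with "gs://" as local.
    """
    local_paths = [path for path in paths if not path.startswith("gs://")]

    if local_paths and not allow_local_files:
        raise ValueError(
            f"Local files {local_paths!r} can only be used if the child can "
            "access them and allow_local_files is True."
        )

    return paths
```

Making the caller opt in keeps the default safe: a child running elsewhere is never handed paths it cannot read unless the asker has confirmed the filesystem is shared.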

Testing

  • Loosen deprecation warning test

Quality Checklist

  • New features are fully tested (No matter how much Coverage Karma you have)

Make OrderedMessageHandler work with no timeout

07 Sep 15:14
48e34e0

Contents

Fixes

  • Make OrderedMessageHandler.handle_messages work with no timeout

Only retry transient errors in Google Pub/Sub code

07 Sep 15:01
45a3b27

Summary

Simplify Google Pub/Sub retries and restrict them to transient errors, specifically removing retries for NotFound errors (these were triggering many unneeded retries on Google Cloud Run). This also stops the retry schedule being proportional to the timeout for waiting for an answer to a question, which could lead to very long retry schedules for large timeouts (e.g. for questions that involve long analyses).

Contents

Enhancements

  • Add timeout parameter to Service.ask

Fixes

  • Only retry transient errors in Google Pub/Sub Service and GooglePubSubHandler
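The general pattern can be sketched with the standard library alone. This is not the SDK's or the Google client library's retry code: the exception classes, `call_with_retries`, and the delay values below are all stand-ins invented for the example. The two points it illustrates are that only transient errors are retried, and that the retry schedule is fixed rather than derived from the question timeout.

```python
import time


class TransientError(Exception):
    """Stand-in for an error worth retrying (e.g. a temporary outage)."""


class NotFoundError(Exception):
    """Stand-in for a permanent error that retrying will not fix."""


def call_with_retries(func, delays=(0.1, 0.2, 0.4), sleep=time.sleep):
    """Call func, retrying only on TransientError with a fixed schedule.

    Any other exception, including NotFoundError, propagates immediately.
    """
    for delay in delays:
        try:
            return func()
        except TransientError:
            sleep(delay)

    # Final attempt: let any error, transient or not, reach the caller.
    return func()
```

Because the schedule is a fixed tuple, a question with a very long answer timeout gets the same short retry window as any other.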

Ensure crc32c hashing on cloud upload works for binary files

07 Sep 10:55
a768a53

Contents

Fixes

  • Ensure crc32c hash calculation for cloud upload fidelity check works for binary files
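For reference, a CRC32C checksum over a file's raw bytes can be sketched in pure Python as below. This is an illustration, not the SDK's implementation (which these notes do not show, and which most likely uses a compiled library). A common cause of this class of bug is reading the file in text mode, though the notes do not say whether that was the cause here. The function names are invented for the example; the base64 encoding of the big-endian checksum follows the form Google Cloud Storage uses for its crc32c metadata.

```python
import base64


def _make_table():
    """Build the lookup table for CRC-32C (reflected Castagnoli polynomial)."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Return the CRC-32C checksum of a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def crc32c_base64_of_file(path):
    """Hash a file's raw bytes; "rb" mode means binary files work too."""
    with open(path, "rb") as f:
        checksum = crc32c(f.read())
    return base64.b64encode(checksum.to_bytes(4, "big")).decode()
```

Opening the file with `"rb"` avoids any text decoding, so files containing bytes that are not valid UTF-8 hash without error.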

Testing

  • Make deprecation warning test less stringent

Fix error messages about uppercase characters in tags and labels

06 Sep 15:52
7708c56

Contents

Fixes

  • Fix error messages about uppercase characters in tags and labels

Format answerer exceptions properly when sending to asker

31 Aug 10:33
bafeb35

Summary

Ensure exceptions with multiple arguments of any type are formatted and sent correctly to the asking service by the answering service.

Contents

Fixes

  • Format answerer exceptions properly when sending to asker
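One way to format an exception with several arguments of mixed types for sending to another service is sketched below. The function name `serialise_exception` and the dictionary keys are assumptions for illustration; the SDK's actual message format is not given in these notes.

```python
def serialise_exception(error):
    """Turn an exception into a plain dictionary that is safe to send as JSON.

    Hypothetical helper: each argument is converted with str() so that
    non-string arguments (numbers, dictionaries, etc.) do not break formatting.
    """
    return {
        "type": type(error).__name__,
        "message": ", ".join(str(argument) for argument in error.args),
    }


# An exception raised with a string, an integer and a dictionary.
payload = serialise_exception(ValueError("bad input", 3, {"key": 1}))
```

Converting each argument individually avoids relying on the exception having exactly one string argument.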