cronjob fixes:
- cache_dir = /tmp since it's the only writable place
- always pull the latest image to be safe
- polish resources: 250MB is not enough for the script

Signed-off-by: Tomas Tomecek <ttomecek@redhat.com>
TomasTomecek committed Feb 8, 2024
1 parent 73a4a56 commit 6e7e36e
Showing 3 changed files with 15 additions and 2 deletions.
7 changes: 6 additions & 1 deletion files/compile_extraction_dataset.py
@@ -56,7 +56,12 @@
 with open(os.path.join(tmp_dir, 'q_a_extract.json'), 'w') as f:
     json.dump(parsed, f)
 
-dataset = load_dataset('json', data_files=os.path.join(tmp_dir, 'q_a_extract.json'))
+# cache_dir: /tmp is the only writable place in an openshift pod
+dataset = load_dataset(
+    'json',
+    data_files=os.path.join(tmp_dir, 'q_a_extract.json'),
+    cache_dir="/tmp"
+)
 
 if "HF_TOKEN" not in os.environ:
     raise RuntimeError("Please set HF_TOKEN so you can upload the data set to HF.")
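For context, a minimal sketch of an alternative approach (not part of this commit): the datasets library also honors the HF_DATASETS_CACHE environment variable, so the cache could be relocated process-wide instead of per call. The /tmp path below assumes the same location the commit uses.

    # Sketch, not from this commit: relocate the whole datasets cache via the
    # environment; the variable is read when the library is imported.
    import os
    os.environ["HF_DATASETS_CACHE"] = "/tmp"  # set before importing datasets

    from datasets import load_dataset

    dataset = load_dataset("json", data_files="/tmp/q_a_extract.json")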
8 changes: 8 additions & 0 deletions openshift/dataset-cron.yaml
@@ -17,6 +17,7 @@ spec:
        containers:
        - name: upload-dataset
          image: quay.io/log-detective/website:latest
+         imagePullPolicy: Always
          # for some reason we need to explicitly call the command like this -_-
          command: ["python3", "/usr/bin/compile_extraction_dataset.py"]
          env:
@@ -25,4 +26,11 @@
              secretKeyRef:
                name: hf-secret
                key: token
+         resources:
+           requests:
+             memory: "550Mi"
+             cpu: "250m"
+           limits:
+             memory: "550Mi"
+             cpu: "250m"
        restartPolicy: OnFailure
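As a side note on sizing the new 550Mi request/limit: one hypothetical way to pick such a number is to log the script's peak resident set size at the end of a local run (on Linux, ru_maxrss is reported in kilobytes). The snippet below is a sketch, not part of this commit.

    # Sketch: print peak RSS so the pod's memory limit can be sized with headroom.
    import resource

    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"peak RSS: {peak_kb / 1024:.0f} MiB")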
2 changes: 1 addition & 1 deletion openshift/log-detective.yaml
@@ -32,7 +32,7 @@ spec:
             cpu: "50m"
           limits:
             memory: "800Mi"
-            cpu: "1"
+            cpu: "500m"
   replicas: 1
   strategy:
     type: Recreate
