-
Notifications
You must be signed in to change notification settings - Fork 510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integration of IBM's Cloud Object Storage: COS #1966
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @Cohen-J-Omer I left some comments. It should be ready after this. I'll let @romilbhardwaj sign off on this
|
||
Follow the next steps to create/extract the above fields: | ||
|
||
1. Create/Select a COS instance from the `web console <https://cloud.ibm.com/objectstorage/>`__. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cohen-J-Omer It would be sufficient to add an example in the yaml section starting with
# *** Copying files from IBM COS ***
just like how it's written for S3 and GCS
sky/data/data_utils.py
Outdated
from multiprocessing import pool | ||
import concurrent.futures | ||
import os | ||
import subprocess | ||
from typing import Any, Callable, Dict, List, Optional, Tuple | ||
import urllib.parse | ||
import re | ||
import textwrap | ||
import threading | ||
from filelock import FileLock | ||
from enum import Enum | ||
|
||
from sky import exceptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from multiprocessing import pool | |
import concurrent.futures | |
import os | |
import subprocess | |
from typing import Any, Callable, Dict, List, Optional, Tuple | |
import urllib.parse | |
import re | |
import textwrap | |
import threading | |
from filelock import FileLock | |
from enum import Enum | |
from sky import exceptions | |
import concurrent.futures | |
from enum import Enum | |
from multiprocessing import pool | |
import os | |
import subprocess | |
import textwrap | |
import threading | |
from typing import Any, Callable, Dict, List, Optional, Tuple | |
import urllib.parse | |
from filelock import FileLock | |
from sky import exceptions |
1. Add process lock due to `subprocess_utils.run_in_parallel`. 2. Move locks to ibm adaptor.
Hey @landscapepainter @romilbhardwaj , |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cohen-J-Omer Thanks for adding support for parallel storage deletion! Would you mind merging master with the repo and confirm if it passes test_smoke.py
? There was a recent update in test_multiple_buckets_creation_and_deletion
.
README.md
Outdated
@@ -39,7 +39,7 @@ SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offer | |||
SkyPilot **abstracts away cloud infra burdens**: | |||
- Launch jobs & clusters on any cloud | |||
- Easy scale-out: queue and run many jobs, automatically managed | |||
- Easy access to object stores (S3, GCS, R2) | |||
- Easy access to object stores (S3, GCS, R2, IBM COS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the interest of keeping the README concise, can we exclude this change for now? We can add it when users start using it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're breaking my heart here @romilbhardwaj...
Will remove.
If not all supported object stores are listed, I suggest implying there are more, e.g.:
Easy access to object stores, such as: S3, GCS, R2
sky/data/data_utils.py
Outdated
from sky.utils import ux_utils | ||
|
||
Client = Any | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unintended?
Hey @landscapepainter , @romilbhardwaj
|
Thanks @Cohen-J-Omer. I ran into an issue when running final tests: I ran
The test passed, but note the leaked semaphore warning. Now whenever I run any sky command, including
If I switch branch to master, the warning goes away. Switching back to storage-ibm brings back the warning. This is likely stemming from the global locks created in For reference, I'm on M2 Macbook with python 3.9.13. Also, can you merge with master? |
Hey @romilbhardwaj,
|
Hmm, I restarted my machine and have 17G memory free. The warning still shows up only when i checkout If I remove L24 (
does not trigger the warning. BTW, do we need both |
- drop multithread protection on cos client/resource
Hey @romilbhardwaj,
I hope we can now merge this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Cohen-J-Omer, I noticed that your recent merge with master branch does not reflect the update to the test I mentioned. The test is still outdated in this repository.
There was a recent update in
test_multiple_buckets_creation_and_deletion
.
Please merge with the recent master branch and check if the updated test passes.
With regard to the error @romilbhardwaj encountered, I ran pytest tests/test_smoke.py::test_ibm_storage_mounts --ibm
on my machine, Debian 4.19.260-1 (2022-09-29) x86_64 GNU/Linux, with python 3.8 and did not come across the same error. Seems like it's an issue for M1/M2 machines:
Hey @landscapepainter, @romilbhardwaj: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Cohen-J-Omer - all looks good! I tested by installing from scratch in a docker container (both, pip install -e .[ibm]
and pip install -e .[aws]
). This should be ready to go. Thanks again for the hard work!
tests/test_smoke.py
Outdated
from sky.adaptors import ibm | ||
from sky.data.data_utils import Rclone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: alphabetical ordering for imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Hope I understood you, since i had to reorder 2 other import lines in the process.
Support for IBM's COS.
Notes:
getting-started/installation.rst
) to provide instructions for setting up IBM credentials.sky check
still returnsIBM:enabled
, even if the necessary fields for storage are missing in~/.ibm/credentials.yaml
.Issue
I wasn't able to make
persistent: False
delete the used bucket, either after task concluded, or cluster terminated. Would appreciate some guidance.Tested
notes on the rest of the tests