Antalya Smart tag selection for integration/runner #809

Open
wants to merge 4 commits into
base: antalya
55 changes: 54 additions & 1 deletion tests/integration/runner
@@ -15,6 +15,9 @@ from typing import Any

 from integration_test_images import get_docker_env

+sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "ci")))
+from digest_helper import DockerDigester
+from ci import check_missing_images_on_dockerhub

 def random_str(length: int = 6) -> str:
     alphabet = string.ascii_lowercase + string.digits
@@ -299,7 +302,7 @@ if __name__ == "__main__":

     parser.add_argument(
         "--docker-image-version",
-        default="latest",
+        default=None,
         help="Version of docker image which runner will use to run tests",
     )

Expand Down Expand Up @@ -427,6 +430,56 @@ if __name__ == "__main__":
else:
logging.info("Unknown image %s", image)

# Get the image tags that CICD generates for the current branch.
# These tags will be used to populate unspecified arguments (--docker-image-version, --docker-compose-images-tags)
# If the files an image depends on have been changed, the tag will not exist until CI job has been run.
# These files change rarely, so it is usually safe to assume the tags exist.

docker_digester = DockerDigester()
imagename_digest_dict = (
docker_digester.get_all_digests()
) # 'image name - digest' mapping

What is in this map? As I understand it, it looks like altinityinfra/integration-test => sha256:74ac23582ebab40378fa66806e796f2143b4b5a6956a2daf9cd98ff33d0d6117. Which digest does it contain when multiple versions of an image are available?

Collaborator Author

This contains the digests that match the state of the Docker image sources in your current branch.

Collaborator

Sorry, I don't get the idea.
@strtgbb, could you explain it more explicitly/verbosely?
What is the sequence of actions?

Collaborator Author

@strtgbb strtgbb Jun 11, 2025

When CICD runs, it calculates hashes for the dockerfiles that it builds. It then publishes the images using the hashes to form the tags.
DockerDigester.get_all_digests() calculates the tags that CICD would use. The unspecified arguments can then be populated using those tags.
It's a relatively safe assumption that CICD has already run on the base of your branch and that you have not changed the dockerfiles (and thus the calculated hashes).
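
The fallback described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the runner's actual code: `resolve_image_version`, the image name `example/dind-image`, and any tag values passed in are hypothetical; only the `--docker-image-version` flag and its `None` default come from the diff.

```python
import argparse
from typing import Dict, List

# Hypothetical image name, used only for this illustration.
EXAMPLE_IMAGE_NAME = "example/dind-image"


def resolve_image_version(argv: List[str], calculated_tags: Dict[str, str]) -> str:
    """Return the explicit --docker-image-version, or the calculated tag if unset."""
    parser = argparse.ArgumentParser()
    # default=None lets us tell "not specified" apart from an explicit value.
    parser.add_argument("--docker-image-version", default=None)
    args = parser.parse_args(argv)
    if args.docker_image_version is None:
        # Nothing was passed on the command line: use the calculated tag.
        return calculated_tags[EXAMPLE_IMAGE_NAME]
    return args.docker_image_version
```

An explicit value always wins, so passing `--docker-image-version` remains the escape hatch when the calculated tag has not been published yet.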

Collaborator

@ilejn ilejn Jun 12, 2025

Makes sense.
I suggest adding the comment above to the script.

Do you know what DIND stands for?
Shouldn't we promote the change to upstream?

Member

It is not only the Dockerfile that gets hashed, but the whole directory used to build the Docker image, along with all of its dependent Docker images.
docker/images.json describes all Docker images that are used in CI/CD. An excerpt:

{
    "docker/test/integration/base": {
        "only_amd64": true,
        "name": "altinityinfra/integration-test",
        "dependent": [
            "docker/test/integration/clickhouse_with_unity_catalog"
        ]
    },
...
    "docker/test/integration/clickhouse_with_unity_catalog": {
        "name": "altinityinfra/integration-test-with-unity-catalog",
        "dependent": []
    }
}

Corresponding directories:

$ ls docker/test/integration/base
Dockerfile  requirements.txt

$ ls docker/test/integration/clickhouse_with_unity_catalog
Dockerfile

So, to get a hash for the altinityinfra/integration-test Docker image, DockerDigester.get_all_digests() has to hash the following files:

docker/test/integration/base/Dockerfile
docker/test/integration/base/requirements.txt
docker/test/integration/clickhouse_with_unity_catalog/Dockerfile
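
To make the idea concrete, here is a minimal sketch of hashing a fixed set of files into a short, stable tag. It is an assumption-laden illustration: `digest_files`, the choice of SHA-256, and the 12-character truncation are hypothetical, and the real `DockerDigester` may combine files differently.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def digest_files(root: Path, relative_paths: Iterable[str], length: int = 12) -> str:
    """Return a short hex digest over the given files (hypothetical helper)."""
    hasher = hashlib.sha256()
    # Sort so that the result does not depend on the order the paths were listed in.
    for rel in sorted(relative_paths):
        # Hash the relative path as well, so a rename changes the digest
        # while the checkout location does not.
        hasher.update(rel.encode())
        hasher.update((root / rel).read_bytes())
    return hasher.hexdigest()[:length]
```

With a scheme like this, editing any of the listed files produces a different tag, which is why a tag calculated locally may not exist on the registry until CI has rebuilt the image.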

Collaborator Author

@ilejn
DIND stands for Docker in Docker. It means that the image will be configured for nested Docker. It's the image that you're specifying with --docker-image-version.
I don't think upstream will care for this. It's a little bit hacky, and they don't support old releases like we do.

+    if args.docker_image_version is None:
+        args.docker_image_version = imagename_digest_dict[
+            DIND_INTEGRATION_TESTS_IMAGE_NAME
+        ]
+        print(
+            f"Calculated digest {args.docker_image_version} for DIND integration tests image"
+        )
+        missing_images = check_missing_images_on_dockerhub(
+            {
+                DIND_INTEGRATION_TESTS_IMAGE_NAME: args.docker_image_version,
+            }
+        )
+        if missing_images:
+            print(
+                f"Error: The following Docker image with calculated digest does not exist: {missing_images}."
+                "\nPlease specify manually with --docker-image-version"
+            )
+            sys.exit(1)
+
+    if "DOCKER_BASE_TAG" not in env_tags:
+        BASE_INTEGRATION_TEST_IMAGE_NAME = "altinityinfra/integration-test"
+        base_integration_image_tag = imagename_digest_dict[
+            BASE_INTEGRATION_TEST_IMAGE_NAME
+        ]
+        env_tags += get_docker_env(
+            BASE_INTEGRATION_TEST_IMAGE_NAME,
+            base_integration_image_tag,
+        )
+        print(
+            f"Calculated digest {base_integration_image_tag} for base integration test image"
+        )
+        missing_images = check_missing_images_on_dockerhub(
+            {BASE_INTEGRATION_TEST_IMAGE_NAME: base_integration_image_tag}
+        )
+        if missing_images:
+            print(
+                f"Error: The following Docker image with calculated digest does not exist: {missing_images}."
+                "\nPlease specify manually with --docker-compose-images-tags"
+            )
+            sys.exit(1)

     # create named volume which will be used inside to store images and other docker related files,
     # to avoid redownloading it every time
     #