
MatchingEngineIndex.create_tree_ah_index in Vertex Pipelines times out after 900 seconds #1870

Closed
@chrisk447

Description


Environment details

  • Running in Vertex Pipelines
  • OS type and version: gcr.io/deeplearning-platform-release/base-cu100
  • google-cloud-aiplatform version: 1.20.0
  • KFP version: 1.8.18

Steps to reproduce
Build a KFP component that calls aiplatform.MatchingEngineIndex.create_tree_ah_index.

List google-cloud-aiplatform==1.20.0 among the packages to install and use gcr.io/deeplearning-platform-release/base-cu100 as the Docker image. Build the pipeline JSON using the component function from kfp.v2.dsl.

Create a Vertex Pipeline from the pipeline JSON.

Expected result
The pipeline should keep running until the Matching Engine index is fully created.

Code Example

def create_tree_ah_index(
    display_name: str,
    jsonl_formatted_data_uri: str,
    dimensions: int = 100,
    approximate_neighbors_count: int = 150,
    distance_measure_type: str = "DOT_PRODUCT_DISTANCE",
    leaf_node_embedding_count: int = 500,
    leaf_nodes_to_search_percent: float = 7,
    description: str = "ANN index",
    labels: dict = {"label_name": "label_value"},
    sync: bool = False
):
    from google.cloud import aiplatform
    import logging
    import time
    logging.basicConfig(level=logging.INFO)

    logging.info("ANN Index")
    try:
        tree_ah_index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
            display_name=display_name,
            contents_delta_uri=jsonl_formatted_data_uri,
            dimensions=dimensions,
            approximate_neighbors_count=approximate_neighbors_count,
            distance_measure_type=distance_measure_type,
            leaf_node_embedding_count=leaf_node_embedding_count,
            leaf_nodes_to_search_percent=leaf_nodes_to_search_percent,
            description=description,
            labels=labels,
            sync=sync
        )
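        # With sync=False the SDK runs the creation in a background future,
        # so poll until that future reports completion.
        # Note: _are_futures_done() is a private SDK method.
        while True:
            if tree_ah_index._are_futures_done():
                index_resource_name = tree_ah_index.resource_name
                logging.info("Index successfully created with ID : %s ", index_resource_name)
                break
            logging.info("Polling the operation every 3 minutes to create index...")
            time.sleep(180)

    except Exception as e:
        # logging.exception also records the traceback; pass the error as a lazy argument.
        logging.exception("The index creation failed: %s", e)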
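For reference, the polling loop above can be written with an explicit overall deadline, so that a long-running wait fails loudly instead of looping forever. This is only a sketch of the loop structure: `wait_until_done`, its parameters, and the simulated operation are hypothetical names introduced here, not part of google-cloud-aiplatform, and the helper does not change the 900-second timeout that the SDK reports in the stack trace below.

```python
import time
from typing import Callable


def wait_until_done(
    is_done: Callable[[], bool],
    poll_interval_s: float = 180.0,
    deadline_s: float = 4 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call is_done() repeatedly until it returns True.

    Returns the number of checks made. Raises TimeoutError if
    deadline_s seconds pass before is_done() returns True.
    """
    start = clock()
    checks = 0
    while True:
        checks += 1
        if is_done():
            return checks
        if clock() - start >= deadline_s:
            raise TimeoutError(
                f"not done after {deadline_s} seconds ({checks} checks)"
            )
        sleep(poll_interval_s)


# Simulated operation (hypothetical) that reports completion on the third check.
# A no-op sleep is passed so the demo finishes instantly.
answers = iter([False, False, True])
checks_made = wait_until_done(lambda: next(answers), sleep=lambda seconds: None)
print(checks_made)  # 3
```

Passing `sleep` and `clock` as parameters keeps the helper testable without real waiting; in a pipeline component the defaults would be used and `is_done` would wrap whatever completion check is available.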

Stack trace

[KFP Executor 2022-12-22 18:33:31,182 INFO]: Create MatchingEngineIndex backing LRO: projects/589820861215/locations/us-central1/indexes/8058285535598215168/operations/2805451540867842048
[KFP Executor 2022-12-22 18:36:30,704 INFO]: Polling the operation every 3 minutes to create index...
[KFP Executor 2022-12-22 18:36:30,704 INFO]: Polling the operation every 3 minutes to create index...
[KFP Executor 2022-12-22 18:36:30,704 INFO]: Polling the operation every 3 minutes to create index...
[KFP Executor 2022-12-22 18:36:30,704 INFO]: Polling the operation every 3 minutes to create index...
[KFP Executor 2022-12-22 18:48:31,057 ERROR]: The index creation failed: MatchingEngineIndex resource has not been created. Resource failed with: Operation did not complete within the designated timeout of 900 seconds.
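
As a quick check on the log above, the time between the "Create MatchingEngineIndex backing LRO" line and the error line can be computed from their timestamps. A minimal sketch using only the standard library; the two timestamp strings are copied from the log, and the variable names are illustrative.

```python
from datetime import datetime

# Timestamp format used by the KFP executor log lines above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

lro_created = datetime.strptime("2022-12-22 18:33:31,182", LOG_TIME_FORMAT)
failure_logged = datetime.strptime("2022-12-22 18:48:31,057", LOG_TIME_FORMAT)

elapsed_seconds = (failure_logged - lro_created).total_seconds()
print(elapsed_seconds)  # 899.875
```

The gap is just under 900 seconds, which is consistent with the "designated timeout of 900 seconds" in the error message.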

Labels: api: vertex-ai (issues related to the googleapis/python-aiplatform API)
