[cosmos] Bulk API does not retry when some Operations throttle #29100

Open

Description

  • Package Name: azure/cosmos
  • Package Version: 4.0.0
  • Operating system: Windows 10
  • nodejs
    • version: v18.16.0

Describe the bug
Context: I use an Azure Function to process data from an Event Hub and write it to Cosmos DB with the bulk API.
Throttled operations are not retried according to the retry policy set on the CosmosClient.

connectionPolicy: {
    retryOptions: {
      maxRetryAttemptCount: 50,
      fixedRetryIntervalInMilliseconds: 1_000,
      maxWaitTimeInSeconds: 120,
    },
 },

To Reproduce
Steps to reproduce the behavior:

  1. Create a CosmosClient with a retry policy
  2. Try to upsert 100 documents into a container provisioned with 400 RU
  3. The response comes back with some failed operations (statusCode: 429)

Expected behavior
The failed operations should be retried according to the retryPolicy

Additional context

I dug a bit into the SDK codebase and found that all the operations are sent in a single request to:
https://<COSMOS_ENPOINT>:443/dbs/<DATABASE_ID>/colls/<CONTAINER_ID>/docs

The server responds with status 207 and a body containing the status of each operation.
The SDK only checks the status of the overall request: it throws an error, which is caught and retried, only when that status code is 429. Since the request's status code is 207, no error is thrown, and the response body is never checked to ensure that all the operations succeeded.
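
To illustrate the per-operation check that seems to be missing, here is a minimal sketch that inspects each entry of a bulk response instead of the request-level status. `splitThrottled` is a hypothetical helper, not something from the SDK, and it assumes that `results[i]` corresponds to `operations[i]` and that each result exposes a `statusCode` field, as in the repro script.

```javascript
// Hypothetical helper (not part of @azure/cosmos): separates the operations
// that were throttled (statusCode 429) from the ones that were not.
// Assumption: results[i] is the outcome of operations[i].
function splitThrottled(operations, results) {
  const throttled = [];
  const settled = [];
  results.forEach((result, index) => {
    const entry = { index, operation: operations[index], result };
    if (result.statusCode === 429) {
      throttled.push(entry);
    } else {
      settled.push(entry);
    }
  });
  return { throttled, settled };
}

// Mocked response: the request as a whole "succeeded" (207),
// but the second operation was throttled.
const sample = splitThrottled(
  [{ id: "a" }, { id: "b" }, { id: "c" }],
  [{ statusCode: 201 }, { statusCode: 429 }, { statusCode: 200 }]
);
console.log(sample.throttled.map(({ index }) => index)); // [ 1 ]
```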

Here is the small Node script I use to reproduce the issue:

const cosmos = require("@azure/cosmos");
const crypto = require("crypto");
const dotEnv = require("dotenv");

const id = crypto.randomUUID();
dotEnv.config();

const client = new cosmos.CosmosClient({
  endpoint: process.env["COSMOS_DATABASE_ENDPOINT"],
  key: process.env["COSMOS_DATABASE_KEY"],
  connectionPolicy: {
    retryOptions: {
      maxRetryAttemptCount: 50,
      fixedRetryIntervalInMilliseconds: 1_000,
      maxWaitTimeInSeconds: 120,
    },
  },
});
const container = client
  .database(process.env["COSMOS_DATABASE_ID"])
  .container(process.env["COSMOS_CONTAINER_ID"]);

const operationsChunk = Array.from({ length: 100 }, (_, index) => ({
  operationType: cosmos.BulkOperationType.Upsert,
  id: `${id}-${index}`,
  partitionKey: `serial-${id}-${index}`,
  resourceBody: {
    id: `${id}-${index}`,
    serial: `serial-${id}-${index}`,
    history: [{ value: `${id}-${index}`, ts: 0 }],
  },
}));

(async () => {
  const results = await container.items.bulk(operationsChunk, {
    continueOnError: true,
  });
  console.log(results.filter(({ statusCode }) => statusCode === 429));
})();
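
As a possible client-side workaround until throttled operations are retried by the SDK, the 429 entries can be resubmitted manually. This is only a sketch under assumptions: `bulkWithRetry`, `maxAttempts` and `delayMs` are hypothetical names, the wait is a fixed delay rather than anything derived from the server response, and results are assumed to come back in the same order as the submitted operations.

```javascript
// Hypothetical workaround (not part of @azure/cosmos).
// bulkFn: async (operations) => results, one result per operation, same order.
async function bulkWithRetry(bulkFn, operations, { maxAttempts = 5, delayMs = 1000 } = {}) {
  const finalResults = new Array(operations.length);
  // Track each pending operation together with its position in the original array.
  let pending = operations.map((operation, index) => ({ operation, index }));

  for (let attempt = 1; attempt <= maxAttempts && pending.length > 0; attempt++) {
    const results = await bulkFn(pending.map(({ operation }) => operation));
    const stillThrottled = [];
    results.forEach((result, i) => {
      // Always record the latest outcome, so a final 429 stays visible to the caller.
      finalResults[pending[i].index] = result;
      if (result.statusCode === 429) {
        stillThrottled.push(pending[i]);
      }
    });
    pending = stillThrottled;
    if (pending.length > 0 && attempt < maxAttempts) {
      // Fixed back-off before resubmitting only the throttled operations.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return finalResults;
}
```

With the script above, it could be called as `bulkWithRetry((ops) => container.items.bulk(ops, { continueOnError: true }), operationsChunk, { maxAttempts: 50, delayMs: 1_000 })`. Operations that are still throttled after the last attempt keep their 429 result, so the caller can still detect them.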
Metadata

Labels

  • Client: This issue points to a problem in the data-plane of the library.
  • Cosmos
  • Service Attention: Workflow: This issue is responsible by Azure service team.
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that

Projects

  • Status

    In Progress
