
@fschade fschade commented Oct 22, 2025

Problem description

The current posixfs blobstore implementation relies on buffered writes from the upload to the destination file; that destination file can be read (for example by a PROPFIND in another goroutine) while it is still being written.

The problem arises as soon as other concurrent operations rely on the etag (etag = mtime + id)!

For example, if a PROPFIND resolves that file while it is being written (in a different goroutine), the etag will differ from the etag in the initial upload response header!

As soon as the file write is done, the etag matches the one from the tus POST response header again.

POST (etag 1) -> PROPFIND (etag 2) -> SLEEP -> PROPFIND (etag 1)

The error rate depends on the OC environment: it happens more often if the file is large or post-processing is enabled, but it also occurs if the file is written synchronously. It depends on the timing, the request count, and the file size.
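
The race can be illustrated with a minimal Go sketch (the file name and node id are hypothetical, and the real etag calculation may differ; per the description above it is derived from the node id and the mtime):

package main

import (
	"fmt"
	"os"
	"time"
)

// calcEtag mimics an etag derived from the node id and mtime ("etag = mtime + id").
func calcEtag(id string, mtime time.Time) string {
	return fmt.Sprintf(`"%s-%d"`, id, mtime.UnixNano())
}

func main() {
	f, _ := os.Create("blob.bin")
	defer f.Close()

	f.Write([]byte("first chunk")) // upload starts writing to the destination file
	fi, _ := os.Stat("blob.bin")
	etag1 := calcEtag("node-id", fi.ModTime()) // etag from a stat early in the write

	time.Sleep(10 * time.Millisecond)
	f.Write([]byte("second chunk")) // write continues, mtime moves forward

	fi, _ = os.Stat("blob.bin")
	etag2 := calcEtag("node-id", fi.ModTime()) // etag from a stat later in the write (e.g. a concurrent PROPFIND)

	fmt.Println(etag1 == etag2) // false: a stat taken mid-write yields a different etag than one taken once the write is done
}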

Fallout

Having a different etag within the PROPFIND response tells the client that some data has changed on the remote.
This accidentally incorrect etag instructs the client to download the file again, which can have serious consequences from many points of view, such as:

  • unnecessary network traffic
  • conflicts if the file has been edited locally in the meantime

The file can only be downloaded if all binary data has been written (otherwise, the download request fails), but timing is crucial here too.

If the PROPFIND returns a different etag after the upload, but all bytes have been written in the meantime, and post-processing is complete, the aforementioned conditions may occur.

Issues

Reproduce and validation

Reproducing things that rely on timing is not easy, especially if multiple routines and users are involved.
For that reason I wrote some k6 tests to reproduce the behavior reliably!

K6-CDPERF (I will create a PR for that test too)

import {queryJson, queryXml, randomString, store} from '@opencloud-eu/k6-tdk/lib/utils'
import { check } from 'k6';
import exec from 'k6/execution'
import {Options} from 'k6/options'

import {clientFor} from '@/shortcuts'
import {envValues} from '@/values'
import {getPoolItem} from '@/utils';
import {userPool} from '@/pools';
import {sleep} from 'k6';
import {randomBytes} from 'k6/crypto';
import { expect } from 'https://jslib.k6.io/k6-testing/0.5.0/index.js';

export interface Environment {
  adminData: {
    adminRoot: string;
  };
  actorData: {
    actorLogin: string;
    actorPassword: string;
    actorRoot: string;
  }[];
}

/**/
export const settings = {
  ...envValues(),
}

/**/
export const options: Options = {
  vus: 1,
  insecureSkipTLSVerify: true
}

const uploadedResources: {
  [key: string]: { etag: string };
} = {}


export default async function actor(): Promise<void> {
  const user = getPoolItem({pool: userPool, n: exec.vu.idInTest})
  const userStore = store(user.userLogin)

  const client = await userStore.setOrGet('client', async () => {
    return clientFor(user)
  })

  const root = await userStore.setOrGet('root', async () => {
    const getMyDrivesResponse = await client.me.getMyDrives({params: {$filter: "driveType eq 'personal'"}})
    const [actorRoot = user.userLogin] = queryJson("$.value[?(@.driveType === 'personal')].id", getMyDrivesResponse?.body)
    sleep(settings.sleep.after_request)

    return actorRoot
  })

  const uploadName = `file-${randomString()}.bin`
  const uploadResourceResponse = client.resource.uploadResource({
    root: [root].join('/'),
    resourcePath: uploadName,
    resourceBytes: randomBytes(1000 * 1000 * 1)
  })

  const uploadEtag = uploadResourceResponse.headers['Oc-Etag'].split('"').join('')
  uploadedResources[uploadResourceResponse.headers['Oc-Fileid']] = {etag: uploadEtag}

  const rootResourcePropertiesResponse = client.resource.getResourceProperties({
    root,
    resourcePath: ""
  })

  const [, ...rootResourcePropstats] = queryXml("$['d:multistatus']['d:response'][*]['d:propstat']", rootResourcePropertiesResponse.body)
  expect(rootResourcePropstats).toHaveLength(Object.keys(uploadedResources).length)
  check({uploadedResources, rootResourcePropstats}, {
    'upload.count match': (r) => r.rootResourcePropstats.length === Object.keys(r.uploadedResources).length,
  });

  rootResourcePropstats.forEach(propstat => {
    if(Array.isArray(propstat)) propstat = propstat[0]

    const fileID = propstat["d:prop"]["oc:fileid"]
    check({uploadedResources, fileID, propstat}, {
      'upload.fileid exists': (r) => Object.keys(r.uploadedResources).includes(r.fileID),
      'upload.id.etag match': (r) => uploadedResources[fileID].etag === r.propstat["d:prop"]["d:getetag"].split('"').join(''),
    });

    expect(uploadedResources).toHaveProperty(fileID)
    expect(uploadedResources[fileID].etag).toBe(propstat["d:prop"]["d:getetag"].split('"').join(''))
  })
}

Solution

The PosixFS blobstore must ensure that all bytes are made available atomically, and only once the upload is complete and ready!

So I'm now writing everything to a .tmp file, and as soon as everything is complete, the mtime and extended attributes are transferred from the placeholder to this .tmp file.

Once all this has been done, the placeholder is replaced with os.Rename. Since we don't support Windows, this is the best solution in my opinion.
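
A minimal sketch of that finalize step (not the actual code from this PR; the function and package names are made up, and the xattr copy uses golang.org/x/sys/unix as one possible way to do it):

package blobstore // hypothetical package name

import (
	"bytes"
	"os"

	"golang.org/x/sys/unix"
)

// atomicFinalize carries the mtime and extended attributes over from the
// placeholder to the fully written .tmp file and then swaps it into place
// with a single rename.
func atomicFinalize(placeholder, tmp string) error {
	fi, err := os.Stat(placeholder)
	if err != nil {
		return err
	}

	// copy the extended attributes (node metadata) from the placeholder
	buf := make([]byte, 64*1024)
	n, err := unix.Listxattr(placeholder, buf)
	if err != nil {
		return err
	}
	for _, name := range bytes.Split(buf[:n], []byte{0}) {
		if len(name) == 0 {
			continue
		}
		val := make([]byte, 64*1024)
		vn, err := unix.Getxattr(placeholder, string(name), val)
		if err != nil {
			return err
		}
		if err := unix.Setxattr(tmp, string(name), val[:vn], 0); err != nil {
			return err
		}
	}

	// carry the placeholder's mtime over so the etag (mtime + id) stays stable
	if err := os.Chtimes(tmp, fi.ModTime(), fi.ModTime()); err != nil {
		return err
	}

	// rename(2) replaces the placeholder atomically: readers either see the
	// old placeholder or the complete blob, never a half-written file
	return os.Rename(tmp, placeholder)
}

Since the .tmp file lives next to the placeholder on the same filesystem, the rename cannot fail with a cross-device error and is atomic on POSIX systems.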

After all that, the tests are green (tested for ~800 uploads and 10 concurrent users).

Before Fix (1 minute, 10 users, 41 failures, 838 completions, 8.4 GB)

k6 run packages/k6-tests/artifacts/oc-tus-upload-propfind-etag-default-k6.js --duration=1m --vus=10

         /\      Grafana   /‾‾/  
    /\  /  \     |\  __   /  /   
   /  \/    \    | |/ /  /   ‾‾\ 
  /          \   |   (  |  (‾)  |
 / __________ \  |_|\_\  \_____/ 

     execution: local
        script: packages/k6-tests/artifacts/oc-tus-upload-propfind-etag-default-k6.js
        output: -

     scenarios: (100.00%) 1 scenario, 10 max VUs, 1m30s max duration (incl. graceful stop):
              * default: 10 looping VUs for 1m0s (gracefulStop: 30s)



  █ TOTAL RESULTS 

    checks_total.......................: 73622  1212.068429/s
    checks_succeeded...................: 99.94% 73581 out of 73622
    checks_failed......................: 0.05%  41 out of 73622

    ✓ authn -> logonResponse - status
    ✓ authn -> authorizeResponse - status
    ✓ authn -> accessTokenResponse - status
    ✓ client -> role.getMyDrives - status
    ✓ client -> resource.uploadResource - status
    ✓ client -> resource.getResourceProperties - status
    ✓ upload.count match
    ✓ upload.fileid exists
    ✗ upload.id.etag match
      ↳  99% — ✓ 35493 / ✗ 41

    HTTP
    http_req_duration.......................................................: avg=84.09ms  min=10.52ms  med=74.75ms  max=996.62ms p(90)=145.83ms p(95)=174.24ms
      { expected_response:true }............................................: avg=84.09ms  min=10.52ms  med=74.75ms  max=996.62ms p(90)=145.83ms p(95)=174.24ms
    http_req_failed.........................................................: 0.00%  0 out of 1716
    http_reqs...............................................................: 1716   28.251194/s

    EXECUTION
    iteration_duration......................................................: avg=721.74ms min=164.87ms med=669.58ms max=2.77s    p(90)=1.11s    p(95)=1.29s   
    iterations..............................................................: 838    13.796329/s
    vus.....................................................................: 10     min=10        max=10
    vus_max.................................................................: 10     min=10        max=10

    NETWORK
    data_received...........................................................: 58 MB  958 kB/s
    data_sent...............................................................: 8.4 GB 138 MB/s




running (1m00.7s), 00/10 VUs, 838 complete and 0 interrupted iterations
default ✓ [======================================] 10 VUs  1m0s

After Fix (1 minute, 10 users, 0 failures, 871 completions, 8.7 GB)

k6 run packages/k6-tests/artifacts/oc-tus-upload-propfind-etag-default-k6.js --duration=1m --vus=10

         /\      Grafana   /‾‾/                                                                                                                                                         
    /\  /  \     |\  __   /  /                                                                                                                                                          
   /  \/    \    | |/ /  /   ‾‾\                                                                                                                                                        
  /          \   |   (  |  (‾)  |                                                                                                                                                       
 / __________ \  |_|\_\  \_____/ 

     execution: local
        script: packages/k6-tests/artifacts/oc-tus-upload-propfind-etag-default-k6.js
        output: -

     scenarios: (100.00%) 1 scenario, 10 max VUs, 1m30s max duration (incl. graceful stop):
              * default: 10 looping VUs for 1m0s (gracefulStop: 30s)



  █ TOTAL RESULTS 

    checks_total.......................: 79393   1303.024585/s
    checks_succeeded...................: 100.00% 79393 out of 79393
    checks_failed......................: 0.00%   0 out of 79393

    ✓ authn -> logonResponse - status
    ✓ authn -> authorizeResponse - status
    ✓ authn -> accessTokenResponse - status
    ✓ client -> role.getMyDrives - status
    ✓ client -> resource.uploadResource - status
    ✓ client -> resource.getResourceProperties - status
    ✓ upload.count match
    ✓ upload.fileid exists
    ✓ upload.id.etag match

    HTTP
    http_req_duration.......................................................: avg=82.81ms  min=8.18ms   med=75.94ms max=1.27s p(90)=141ms p(95)=162.3ms
      { expected_response:true }............................................: avg=82.81ms  min=8.18ms   med=75.94ms max=1.27s p(90)=141ms p(95)=162.3ms
    http_req_failed.........................................................: 0.00%  0 out of 1782
    http_reqs...............................................................: 1782   29.246783/s

    EXECUTION
    iteration_duration......................................................: avg=697.99ms min=160.18ms med=641.7ms max=3.16s p(90)=1.13s p(95)=1.22s  
    iterations..............................................................: 871    14.295145/s
    vus.....................................................................: 6      min=6         max=10
    vus_max.................................................................: 10     min=10        max=10

    NETWORK
    data_received...........................................................: 63 MB  1.0 MB/s
    data_sent...............................................................: 8.7 GB 143 MB/s




running (1m00.9s), 00/10 VUs, 871 complete and 0 interrupted iterations
default ✓ [======================================] 10 VUs  1m0s

@fschade fschade marked this pull request as draft October 22, 2025 16:04
@fschade fschade force-pushed the oc-issue-1523-atomic-posixfs-uploads branch 2 times, most recently from ad35ecf to 3bcf1aa Compare October 23, 2025 07:36
@fschade fschade requested review from butonic and rhafer October 23, 2025 08:32
@fschade fschade force-pushed the oc-issue-1523-atomic-posixfs-uploads branch from 3bcf1aa to dc0e8ff Compare October 23, 2025 13:59
@fschade fschade self-assigned this Oct 24, 2025
@fschade fschade added the Priority:p1-urgent Urgent priority label Oct 24, 2025
@github-project-automation github-project-automation bot moved this to Qualification in OpenCloud Team Board Oct 24, 2025
@fschade fschade moved this from Qualification to In Progress in OpenCloud Team Board Oct 24, 2025

@dragotin dragotin left a comment

I like this fix.


@butonic butonic left a comment

This PR now writes the blob to a temporary file before moving it over the existing file. This makes the mtime returned in propfind requests reflect the change on disk.

@fschade fschade force-pushed the oc-issue-1523-atomic-posixfs-uploads branch from dc0e8ff to e65b97c Compare October 24, 2025 14:35
@fschade fschade marked this pull request as ready for review October 27, 2025 10:10
@fschade fschade merged commit 5f317aa into opencloud-eu:main Oct 27, 2025
19 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in OpenCloud Team Board Oct 27, 2025
