[Azure Devops Repos] clone failure as packs use REF_DELTA objects: fix with git-style delta resolution #1025

Open
@bittrance

Description

Current behavior 😯

It seems that Azure DevOps (at least over SSH) has some form of pagination when reading packs. Given the code below, fetching a trivial repo (one file, one commit) works correctly and produces the following output:

created handshake
created negotiate (round 1)
created receiving pack
created remote
created read pack
created remote: Azure Repos
Found 3 objects to send. (20 ms)
Some("read pack") incremented to Some(759)
created create index file
created indexing
created decompressing
Some("decompressing") incremented to Some(985)
Some("indexing") incremented to Some(1)
Some("decompressing") incremented to Some(1022)
Some("indexing") incremented to Some(2)
Some("decompressing") incremented to Some(1214)
Some("indexing") incremented to Some(3)
Some("read pack") Info done 759 items in 0.00s (223952 items/s)
Some("indexing") Info done 3 items in 0.00s (2480 items/s)
Some("decompressing") Info done 1214 items in 0.00s (994769 items/s)
Some("create index file") incremented to Some(1)
created Resolving
created Decoding
created thread 0
created thread 1
created thread 3
created thread 2
created thread 4
created thread 5
created thread 7
created thread 6
Some("Resolving") Info done 0 items in 0.05s (0 items/s)
Some("Decoding") Info done 0 items in 0.05s (0 items/s)
Some("create index file") incremented to Some(2)
created sorting by id
Some("create index file") incremented to Some(3)
created writing index file
created writing fan-out table
Some("writing index file") incremented to Some(1)
created writing ids
Some("writing index file") incremented to Some(2)
created writing crc32
Some("writing index file") incremented to Some(3)
created writing offsets
Some("writing index file") incremented to Some(4)
Some("writing index file") Success done 1.2KB in 0.00s (5.0MB/s)
Some("create index file") Success done 3.0 objects in 0.05s (57.0 objects/s)

When fed the URL of a non-trivial repo, the output looks like this. Note the suspicious 8192 in the read pack progress: I've tested a few different repos and they all advance in "pack steps" of 8192 (a possible explanation is sketched right after the log).

created handshake
created negotiate (round 1)
created receiving pack
created remote
created read pack
created remote: Azure Repos
Found 679 objects to send. (200 ms)
Some("read pack") incremented to Some(8192)
created create index file
created indexing
created decompressing
Some("decompressing") incremented to Some(985)
Some("indexing") incremented to Some(1)
Some("decompressing") incremented to Some(1022)
Some("indexing") incremented to Some(2)
Some("decompressing") incremented to Some(1214)
Some("indexing") incremented to Some(3)
Some("decompressing") incremented to Some(1488)
Some("indexing") incremented to Some(4)
Some("decompressing") incremented to Some(1770)
Some("indexing") incremented to Some(5)
Some("decompressing") incremented to Some(2259)
Some("indexing") incremented to Some(6)
Some("decompressing") incremented to Some(2997)
Some("indexing") incremented to Some(7)
Some("decompressing") incremented to Some(3803)
Some("indexing") incremented to Some(8)
Some("decompressing") incremented to Some(3838)
Some("indexing") incremented to Some(9)
Some("decompressing") incremented to Some(5320)
Some("indexing") incremented to Some(10)
Some("read pack") incremented to Some(16384)
Some("read pack") incremented to Some(24576)
Some("decompressing") incremented to Some(7051)
Some("indexing") incremented to Some(11)
Some("decompressing") incremented to Some(61187)
Some("indexing") incremented to Some(12)
Some("decompressing") incremented to Some(61223)
Some("indexing") incremented to Some(13)
Some("decompressing") incremented to Some(61432)
Some("indexing") incremented to Some(14)
Some("decompressing") incremented to Some(62040)
Some("indexing") incremented to Some(15)
Some("decompressing") incremented to Some(62671)
Some("indexing") incremented to Some(16)
Some("decompressing") incremented to Some(62788)
Some("indexing") incremented to Some(17)
Some("decompressing") incremented to Some(64644)
Some("indexing") incremented to Some(18)
Some("decompressing") incremented to Some(64789)
Some("indexing") incremented to Some(19)
Info done 24576 items in 0.23s (106798 items/s)
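
An aside on the 8192: the read pack counter appears to count raw pack bytes (it ended at 759 for the trivial pack above), and here it only ever advances in whole 8 KiB steps, which is the pattern you get when progress is bumped once per filled fixed-size read buffer. A toy sketch of that pattern, under that assumption only; pump_pack and on_bytes are made-up names, not gix API.

use std::io::Read;

// Toy sketch, not gix code: if the pack stream is pulled through a fixed
// 8 KiB buffer and the "read pack" counter is bumped once per buffer fill,
// the counter only ever shows multiples of 8192 until the stream ends.
fn pump_pack(mut from: impl Read, mut on_bytes: impl FnMut(usize)) -> std::io::Result<usize> {
    let mut buf = [0u8; 8192];
    let mut total = 0usize;
    loop {
        let n = from.read(&mut buf)?;
        if n == 0 {
            break; // stream ended; in the hanging case above, it never does
        }
        total += n;
        on_bytes(n); // full buffer fills make this advance in 8192-byte steps
        // ...hand &buf[..n] to the pack indexer here...
    }
    Ok(total)
}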

At this point it hangs. Eventually, the timeout occurs and the following error is raised:

Error: Fetch(WritePack(IndexWrite(PackEntryDecode(NotFound { object_id: Sha1(cdd16b2c111f7de5457e4c3182ed4d781a062a2a) }))))

Though I suppose this is just a consequence of us not having received the object in question.
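
For reference on the fix suggested in the title: a REF_DELTA pack entry carries only a delta plus the SHA-1 of its base object, and that base may sit elsewhere in the pack or (for thin packs) outside it entirely; git resolves such entries by looking the base up by id and applying the delta to it. The sketch below only illustrates the delta-application half, written against git's documented delta format; it is not gix's implementation, and read_varint/apply_delta are names made up for illustration.

/// Parse a little-endian base-128 varint as used in git delta headers.
fn read_varint(data: &[u8], pos: &mut usize) -> usize {
    let (mut value, mut shift) = (0usize, 0);
    loop {
        let byte = data[*pos];
        *pos += 1;
        value |= ((byte & 0x7f) as usize) << shift;
        shift += 7;
        if byte & 0x80 == 0 {
            break;
        }
    }
    value
}

/// Apply a git-format delta to `base`, producing the target object's bytes.
/// For a REF_DELTA entry, `base` is whatever object the entry's SHA-1 points
/// at, looked up by id before this step.
fn apply_delta(base: &[u8], delta: &[u8]) -> Vec<u8> {
    let mut pos = 0;
    let base_size = read_varint(delta, &mut pos);
    let target_size = read_varint(delta, &mut pos);
    assert_eq!(base_size, base.len(), "delta was made against a different base");

    let mut target = Vec::with_capacity(target_size);
    while pos < delta.len() {
        let opcode = delta[pos];
        pos += 1;
        if opcode & 0x80 != 0 {
            // Copy instruction: the low bits say which offset/size bytes follow.
            let (mut offset, mut size) = (0usize, 0usize);
            for bit in 0..4 {
                if opcode & (1 << bit) != 0 {
                    offset |= (delta[pos] as usize) << (8 * bit);
                    pos += 1;
                }
            }
            for bit in 0..3 {
                if opcode & (0x10 << bit) != 0 {
                    size |= (delta[pos] as usize) << (8 * bit);
                    pos += 1;
                }
            }
            if size == 0 {
                size = 0x10000;
            }
            target.extend_from_slice(&base[offset..offset + size]);
        } else if opcode != 0 {
            // Insert instruction: the next `opcode` bytes are literal data.
            target.extend_from_slice(&delta[pos..pos + opcode as usize]);
            pos += opcode as usize;
        } else {
            panic!("opcode 0 is reserved in git's delta format");
        }
    }
    assert_eq!(target.len(), target_size);
    target
}

That fits the NotFound error above: the base cdd16b2c... is never found when the delta entry is decoded, so there is nothing to apply the delta to.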

I'll try to dig deeper tomorrow.

Expected behavior 🤔

If it works for small Azure DevOps repos, surely it should work for large ones. :) The same code has no problem fetching a GitHub repo of non-trivial size.

Steps to reproduce 🕹

Code to reproduce the problem:

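// Bare-bones Progress implementation that prints every event to stdout; it is
// what produced the logs above. Step, Unit, Id and MessageLevel are the types
// of gix's prodash-based progress API.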
#[derive(Default)]
struct P {
    max: Option<Step>,
    step: Option<Step>,
    unit: Option<Unit>,
    name: Option<String>,
}

impl Progress for P {
    type SubProgress = P;

    fn add_child(&mut self, name: impl Into<String>) -> Self::SubProgress {
        let mut p = P::default();
        p.set_name(name.into());
        p
    }

    fn add_child_with_id(&mut self, name: impl Into<String>, _id: Id) -> Self::SubProgress {
        let mut p = P::default();
        p.set_name(name.into());
        p
    }

    fn init(&mut self, max: Option<Step>, unit: Option<Unit>) {
        self.max = max;
        self.unit = unit;
    }

    fn set(&mut self, step: Step) {
        self.step = Some(step);
        println!("{:?} start at {:?}", self.name, self.step);
    }

    fn step(&self) -> Step {
        self.step.clone().unwrap_or_default()
    }

    fn inc_by(&mut self, step: Step) {
        *self.step.get_or_insert(Step::default()) += step;
        println!("{:?} incremented to {:?}", self.name, self.step);
    }

    fn set_name(&mut self, name: impl Into<String>) {
        let name = name.into();
        println!("created {}", name);
        self.name = Some(name);
    }

    fn name(&self) -> Option<String> {
        self.name.clone()
    }

    fn id(&self) -> Id {
        todo!()
    }

    fn message(&self, level: MessageLevel, message: impl Into<String>) {
        println!("{:?} {:?} {}", self.name, level, message.into());
    }
}

fn clone_repo(
    config: &GitConfig,
    deadline: Instant,
    target: &Path,
) -> Result<Repository, Error> {
    let should_interrupt = AtomicBool::new(false);
    let cancel = AtomicBool::new(false);
    scope(|s| {
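        // Watchdog: wait until either the deadline passes or the fetch signals
        // completion via `cancel`, then raise `should_interrupt` (a no-op if
        // the fetch already finished) so gix aborts an in-flight fetch.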
        s.spawn(|| {
            while Instant::now() < deadline && !cancel.load(Ordering::Relaxed) {
                sleep(Duration::from_secs(1));
            }
            should_interrupt.store(true, Ordering::Relaxed);
        });
        let p = P::default();
        let (repo, _outcome) = gix::prepare_clone(config.url.clone(), target)
            .unwrap()
            .fetch_only(p, &should_interrupt)?;
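        // The fetch returned before the deadline; let the watchdog thread exit.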
        cancel.store(true, Ordering::Relaxed);
        Ok(repo)
    })
}
