Summary
When installing a `gitlab:` shorthand git dependency from a private GitLab repo, `pacote`'s git fetcher tries the hosted tarball URL first. For private repos, GitLab redirects unauthenticated requests to `/users/sign_in` with HTTP 302 → 200. Since the final response is HTTP 200 (not an error), npm's HTTP client treats it as a successful download. The tar extractor then tries to parse the HTML sign-in page as a tarball and fails with `TAR_BAD_ARCHIVE`.
The fallback to `git clone` in `lib/git.js` only triggers on HTTP errors:
```javascript
// lib/git.js ~line 258
}).extract(tmp).then(() => handler(...), er => {
  // fall back to ssh download if tarball fails
  if (er.constructor.name.match(/^Http/)) {
    return this.#clone(handler, false)
  } else {
    throw er
  }
})
```
`TAR_BAD_ARCHIVE` is not an HTTP error, so the handler rethrows it instead of falling back to clone. A one-line fix resolves the issue:
```diff
-  if (er.constructor.name.match(/^Http/)) {
+  if (er.constructor.name.match(/^Http/) || er.code === 'TAR_BAD_ARCHIVE') {
```
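The widened condition can also be read as a small standalone predicate. The following is a minimal sketch of that logic, not pacote's actual code: `shouldFallBackToClone` and the stand-in `HttpErrorGeneral` class are hypothetical names introduced here purely for illustration.

```javascript
// Hypothetical helper (not part of pacote): decides whether a failed
// tarball fetch should fall back to `git clone`.
function shouldFallBackToClone (er) {
  // Current behavior: only errors whose class name starts with "Http".
  const isHttpError = /^Http/.test(er.constructor.name)
  // Proposed addition: a tar error raised after a "successful" download.
  const isBadArchive = er.code === 'TAR_BAD_ARCHIVE'
  return isHttpError || isBadArchive
}

// Stand-in error class, assumed here only to mimic an "Http*" class name.
class HttpErrorGeneral extends Error {}

// Stand-in for the error surfaced by the tar extractor in this report.
const tarError = Object.assign(
  new Error('Unrecognized archive format'),
  { code: 'TAR_BAD_ARCHIVE' }
)

console.log(shouldFallBackToClone(new HttpErrorGeneral('404'))) // true
console.log(shouldFallBackToClone(tarError)) // true
console.log(shouldFallBackToClone(new Error('unrelated'))) // false
```

Keeping the check on `er.code` rather than on the message text avoids depending on the exact wording of the tar error.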
Steps to reproduce
- Have a private repo on GitLab (e.g., `gitlab:myorg/my-private-pkg#1.0.0`)
- Ensure no GitLab HTTPS auth is configured (only SSH)
- Run `npm install gitlab:myorg/my-private-pkg#1.0.0`
What happens
```
npm warn tar TAR_ENTRY_INVALID checksum failure
npm warn tar TAR_BAD_ARCHIVE: Unrecognized archive format
npm error code TAR_BAD_ARCHIVE
npm error TAR_BAD_ARCHIVE: Unrecognized archive format
```
Debug logs show the tar parser receiving HTML (`<!DOCTYPE html>`, the GitLab sign-in page) instead of a tarball. The HTTP request to `https://gitlab.com/{user}/{project}/repository/archive.tar.gz?ref={tag}` gets a 302 redirect to `/users/sign_in`, which returns 200 with HTML.
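To make the failure mode concrete, a downloaded body can be told apart from a real `.tar.gz` by its first two bytes, since gzip streams start with the magic bytes `0x1f 0x8b`. The snippet below is only an illustrative sketch for debugging; `looksLikeGzip` is a hypothetical helper, and nothing in this report implies pacote performs such a check.

```javascript
const zlib = require('node:zlib')

// Hypothetical helper: true if the buffer starts with the gzip magic bytes.
function looksLikeGzip (buf) {
  return buf.length >= 2 && buf[0] === 0x1f && buf[1] === 0x8b
}

// Stand-in for the sign-in page returned after the redirect.
const htmlBody = Buffer.from('<!DOCTYPE html><html><body>Sign in</body></html>')
// Stand-in for a genuine gzip-compressed archive body.
const gzipBody = zlib.gzipSync(Buffer.from('example payload'))

console.log(looksLikeGzip(htmlBody)) // false
console.log(looksLikeGzip(gzipBody)) // true
```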
What should happen
`pacote` should fall back to `git clone` (as it does for HTTP errors) when tarball extraction fails, since a tar error after a "successful" HTTP download indicates the response wasn't actually a tarball.
Environment
- npm 11.11.0 (Node 24.14.1); also reproducible on npm 10.x / Node 20, Node 22
- pacote version: bundled with npm
- hosted-git-info 9.0.2
- GitLab.com (private repos)
Related
This likely affects all hosted git providers that return 200 with HTML for unauthenticated archive requests instead of a proper HTTP error status.
- `hosted-git-info` needs to update the deprecated GitLab tarball URL template