# Package Registry Security
Package registries (npm, PyPI, crates.io, Maven Central, RubyGems, NuGet) are the distribution layer for software dependencies. They are critical infrastructure for the software supply chain and a primary target for attackers because a single malicious package can propagate to millions of downstream projects.
## Trust model
Registries operate on an open-publish model: anyone can create an account and upload a package under any unclaimed name. This keeps the barrier to publishing low and helps the open-source ecosystem move quickly, but it creates fundamental security challenges:
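- **No identity verification** for publishers (npm requires 2FA for maintainers of high-impact packages and PyPI requires it for all accounts, but 2FA protects an account rather than proving who is behind it)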
- **Names are first-come-first-served**, enabling [[Namesquatting]]
- **No link between package name and source code**: a package named `react-utils` has no verified connection to React
- **Metadata is self-declared**: repository links, descriptions, and keywords are unverified, enabling [[Starjacking]]
## Registry-level defenses
### Name policies
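- **Similarity checks**: npm rejects new names that differ from an existing package only in punctuation (for example `reactnative` when `react-native` exists); edit-distance checks against popular names are a complementary heuristic
- **Name normalization**: PyPI compares names case-insensitively and treats runs of `-`, `_`, and `.` as equivalent (`my-package` = `my_package` = `My.Package`)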
- **Scoped namespaces**: npm's `@scope/package` prevents collisions; PyPI lacks true namespaces
- **Limitation**: these defenses don't catch [[Slopsquatting]] because hallucinated names are often dissimilar to any existing package
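
The name policies above can be sketched in a few lines. This is a minimal illustration, not any registry's actual implementation: `normalize_pypi_name` follows the PEP 503 normalization rule, while `strip_punctuation`, `is_suspiciously_similar`, and the `max_distance` threshold are hypothetical names and values chosen for the example.

```python
import re


def normalize_pypi_name(name: str) -> str:
    """PEP 503 normalization: collapse runs of '-', '_', '.' into '-' and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def strip_punctuation(name: str) -> str:
    """Drop separators so names differing only in punctuation compare equal."""
    return re.sub(r"[-_.]", "", name).lower()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def is_suspiciously_similar(
    candidate: str, popular_names: list[str], max_distance: int = 1
) -> bool:
    """Hypothetical policy: flag punctuation-only collisions and near misses."""
    stripped = strip_punctuation(candidate)
    for popular in popular_names:
        target = strip_punctuation(popular)
        # Same letters, different separators (e.g. react_native vs react-native).
        if stripped == target and candidate != popular:
            return True
        # Within max_distance single-character edits of a popular name.
        if 0 < edit_distance(stripped, target) <= max_distance:
            return True
    return False
```

A name such as `requets` is flagged against `requests`, but an invented name like `fastjson-helper-utils` is not close to anything and passes, which is the gap that [[Slopsquatting]] exploits.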
### Provenance and signing
- **npm provenance attestations**: link a package version to a specific CI build and source commit (Sigstore-based)
- **PyPI Trusted Publishers**: an opt-in mechanism that lets a project publish from a pre-registered CI/CD workflow using short-lived OIDC tokens instead of long-lived API tokens
- **Sigstore**: keyless signing infrastructure used by npm, PyPI, and others
- **Limitation**: adoption is still low; most packages have no provenance attestation
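
As a rough way to measure that adoption gap in your own dependency tree, the sketch below checks whether a version's registry metadata declares a provenance attestation. It assumes the metadata has already been fetched and parsed into a dict, and that it has the shape npm appears to use (`dist.attestations.provenance.predicateType`); treat that field layout as an assumption to confirm against real registry responses. `has_provenance` and `provenance_coverage` are hypothetical helper names.

```python
from typing import Any

# Assumed prefix of the SLSA provenance predicate type.
SLSA_PREDICATE_PREFIX = "https://slsa.dev/provenance/"


def has_provenance(version_metadata: dict[str, Any]) -> bool:
    """Return True if the metadata declares a SLSA provenance attestation.

    This only checks that an attestation is declared. It does not verify
    the attestation's signature or its contents.
    """
    dist = version_metadata.get("dist") or {}
    attestations = dist.get("attestations") or {}
    if not isinstance(attestations, dict):
        return False
    provenance = attestations.get("provenance") or {}
    if not isinstance(provenance, dict):
        return False
    predicate = provenance.get("predicateType", "")
    return isinstance(predicate, str) and predicate.startswith(SLSA_PREDICATE_PREFIX)


def provenance_coverage(versions: dict[str, dict[str, Any]]) -> float:
    """Fraction of the given package versions that declare provenance."""
    if not versions:
        return 0.0
    attested = sum(1 for metadata in versions.values() if has_provenance(metadata))
    return attested / len(versions)
```

Presence of the field is only a signal. Cryptographic verification needs a Sigstore-aware tool, for example `npm audit signatures` for npm projects.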
### Malware scanning
- npm, PyPI, and others run automated malware scans on uploads
- Third-party tools such as Socket analyze package behavior (network calls, filesystem access, `eval` usage) rather than only matching known CVEs
- Scans catch known patterns but miss novel techniques
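
To make the last point concrete, here is a toy static scanner for Python source built on the standard-library `ast` module. It is a simplified sketch, not how any registry or vendor actually scans; the `SUSPICIOUS_CALLS` and `SUSPICIOUS_MODULES` lists and the `scan_source` name are illustrative assumptions.

```python
import ast

# Illustrative pattern lists; a real scanner would use far richer signals.
SUSPICIOUS_CALLS = {"eval", "exec", "compile", "__import__"}
SUSPICIOUS_MODULES = {"socket", "subprocess", "ctypes"}


def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line number, description) pairs for known-suspicious patterns."""
    findings: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls by name, e.g. eval("...").
            if node.func.id in SUSPICIOUS_CALLS:
                findings.append((node.lineno, f"call to {node.func.id}()"))
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append((node.lineno, f"import of {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in SUSPICIOUS_MODULES:
                findings.append((node.lineno, f"import from {node.module}"))
    return sorted(findings)
```

The scanner flags `import socket` and a direct `eval(...)` call, but a lightly obfuscated lookup such as `getattr(__builtins__, 'ev' + 'al')` produces no findings, which illustrates why pattern-based scans miss novel techniques.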
## Consumer-side defenses
1. **Private registry proxies** (Artifactory, Verdaccio, Nexus): cache approved packages, block direct access to public registries
2. **Lockfiles**: pin exact versions and integrity hashes; review lockfile diffs as part of code review
3. **[[Software Composition Analysis (SCA)]]**: scan dependencies continuously
4. **Allowlists**: restrict which packages and publishers your CI/CD can install
5. **Namespace reservation**: publish placeholder packages for internal names on public registries (prevents [[Dependency Confusion]])
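
The integrity hashes mentioned under lockfiles can be checked with a few lines of standard-library code. The sketch below assumes the Subresource Integrity string format (`sha512-<base64 digest>`) that npm lockfiles use in their `integrity` field; `compute_integrity` and `verify_integrity` are hypothetical helper names, and the artifact is assumed to be already downloaded into memory.

```python
import base64
import hashlib
import hmac

# Algorithms accepted in an SRI-style integrity string.
ALLOWED_ALGORITHMS = {"sha256", "sha384", "sha512"}


def compute_integrity(artifact: bytes, algorithm: str = "sha512") -> str:
    """Build an SRI-style string: '<algorithm>-<base64 of raw digest>'."""
    digest = hashlib.new(algorithm, artifact).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def verify_integrity(artifact: bytes, expected: str) -> bool:
    """Check downloaded bytes against the integrity value pinned in a lockfile."""
    algorithm, _, _ = expected.partition("-")
    if algorithm not in ALLOWED_ALGORITHMS:
        # Reject weak or unknown algorithms instead of guessing.
        return False
    actual = compute_integrity(artifact, algorithm)
    # Constant-time comparison of the two encoded strings.
    return hmac.compare_digest(actual.encode("utf-8"), expected.encode("utf-8"))
```

A mismatch means the bytes served now differ from the bytes that were pinned, whether through tampering, a republished artifact, or a corrupted download, and the install should stop.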
## Open problems
- No universal namespace ownership (anyone can publish `company-auth` on npm)
- Cross-registry confusion: a PyPI package and an npm package can share a name but be completely different things
- AI agents that install packages autonomously can bypass the human review gates that would otherwise catch a suspicious dependency
- No standardized way to revoke or deprecate malicious packages across registries
## Related
- [[Software Supply Chain Security]]
- [[Namesquatting]]
- [[Typosquatting]]
- [[Slopsquatting]]
- [[Dependency Confusion]]
- [[Starjacking]]
- [[Software Composition Analysis (SCA)]]
- [[Least Privilege Principle]]
- [[Zero Trust Security]]
- [[AI Skill Supply Chain Security]]
- [[Attack surface]]