§Extending the Package Calculus
I’ve been back at work on the package management paper. In a previous log I described the ‘The Package Calculus’ as a core model of expressivity of package management systems.
This week I’ve formalised how we can extend this core to support npm’s peer dependencies. But first, peer dependencies are only needed as functionality of npm as it allows multiple versions of a single package to exist in a resolution.
§Concurrent Versions
(A, 1) Δ (B, {1})
(A, 1) Δ (C, {1})
(B, 1) Δ (D, {1})
(C, 1) Δ (D, {3})
The figure above exhibits the ‘diamond dependency problem’, so called as the graph with edges {(A, B), (A, C), (B, D), (C, D)} forms a diamond, and has no solution. To address this, some package managers allow multiple versions of a package to exist in the resolution. For example, Cargo supports multiple versions of the same package coexisting in a resolution, provided they have incompatible semantic versions. We will define a Concurrent Package Calculus to model this behaviour, and show how we can reduce it to the core calculus.
§Concurrent Package Calculus
Granularity Function
We define g : V → G mapping a version to a granularity which is the same value for conflicting versions of a package. For example, in Cargo we can define g(x.y.z) = x to extract the major version of a package. For npm, we can define g(v) = v as it allows duplication regardless of version (except for peer dependencies). We can emulate the single-version condition with g(v) = ϵ, all versions conflict.
Concurrent Resolution
Given dependencies ΔC, root rC, and a granularity function g, a resolution SC is valid if the following conditions hold:
Root Inclusion: rC ∈ SC
Dependency Closure: ∀p ∈ SC. pΔC(n, vs) ⟹ ∃v ∈ vs. (n, v) ∈ SC
Version Granularity: ∀(n, v), (n, v′) ∈ SC. v ≠ v′ ⟹ g(v) ≠ g(v′)
Packages of the same name and different versions in SC must have distinct values of their version applied to g.
We write 𝒮C(ΔC, rC, g) for the set of all resolutions of rC in ΔC under g.
Note that this means multiple version sets can be selected even if it would be possible to use just one, which is consistent with Cargo’s implementation.
Example: Consider the following dependencies with g(x.y.z) = x:
(A, 1.0.0) Δ_C (B, {1.0.0})
(A, 1.0.0) Δ_C (C, {1.0.0})
(B, 1.0.0) Δ_C (D, {1.0.0, 2.0.0, 2.0.1})
(C, 1.0.0) Δ_C (D, {2.0.0, 2.0.1, 3.0.0})
§Reduction to the Core Calculus
To reduce the Concurrent Package Calculus to the core calculus we push the granular version into the package name to provide version granularity with version uniqueness. We introduce intermediate packages that allow for depending on multiple granular versions of now-distinct package names.
Concurrent Reduction
Given RC, ΔC, rC, and g, we define a reduction to the core calculus with R, Δ, and r, as follows:
Packages:
- ∀(n, v) ∈ RC. (⟨n, g(v)⟩, v) ∈ R where ⟨n, g(v)⟩ ∈ N
- r = (⟨n, g(v)⟩, v) where rC = (n, v)
Dependencies: For each (n, v)ΔC(m, vs):
Direct case (single granular version): If ∃w ∈ G such that ∀u ∈ vs. g(u) = w, then (⟨n, g(v)⟩, v)Δ(⟨m, w⟩, vs).
Split case (multiple granular versions): Otherwise, W ≔ {g(u) ∣ u ∈ vs} and i ≔ ⟨n, v, m⟩ where i ∈ N:
- ∀w ∈ W. (i, w) ∈ R and (i, w)Δ(⟨m, w⟩, {u ∈ vs ∣ g(u) = w})
- (⟨n, g(v)⟩, v)Δ(i, W)
Example: The reduction of the previous example produces:
(⟨A, 1⟩, 1.0.0) Δ (⟨B, 1⟩, {1.0.0})
(⟨A, 1⟩, 1.0.0) Δ (⟨C, 1⟩, {1.0.0})
(⟨B, 1⟩, 1.0.0) Δ (⟨B, 1.0.0, D⟩, {1, 2})
(⟨B, 1.0.0, D⟩, 1) Δ (⟨D, 1⟩, {1.0.0})
(⟨B, 1.0.0, D⟩, 2) Δ (⟨D, 2⟩, {2.0.0, 2.0.1})
(⟨C, 1⟩, 1.0.0) Δ (⟨C, 1.0.0, D⟩, {2, 3})
(⟨C, 1.0.0, D⟩, 2) Δ (⟨D, 2⟩, {2.0.0, 2.0.1})
(⟨C, 1.0.0, D⟩, 3) Δ (⟨D, 3⟩, {3.0.0})
Soundness: If S is a valid resolution in 𝒮(Δ, r), then SC = {(n, v) ∣ (⟨n, g(v)⟩, v) ∈ S} is a valid resolution in 𝒮C(ΔC, rC, g).
Completeness: If SC is a valid resolution in 𝒮C(ΔC, rC, g), then S = {(⟨n, g(v)⟩, v) ∣ (n, v) ∈ SC} ∪ {(⟨n, v, m⟩, g(u)) ∣ (n, v)Δ(m, vs), (m, u) ∈ S, u ∈ vs} is a valid resolution in 𝒮(Δ, r).
§The Peer Dependency Package Calculus
npm supports ‘peer dependencies’ which express that a child package requires its parent to provide a peer dependency rather than installing its own copy. The child package’s version constraints on the peer dependency must be satisfied by the parent’s chosen version. Plugins use peer dependencies to constrain the versions of their base package they can be co-installed with. This mechanism only makes sense in the context of concurrent versions, where multiple versions of such a base package can coexist.
§Peer Package Calculus
We extend the Concurrent Package Calculus to model peer dependencies.
Peer Dependencies
We define peer dependencies Θ ⊆ P × (N × 𝒫(V)) as a relation where pΘ(n, vs) denotes that package p’s parent can only depend on the peer package name n if it has a version in vs.
Peer Dependency Resolution
Given dependencies ΔC, peer dependencies Θ, granularity function g, and root rC, a resolution SΘ and parent function π : SΘ \ {rC} → SΘ are valid if:
Concurrent Resolution: SΘ is valid in 𝒮C(ΔC, g, rC)
Parent Closure: ∀p ∈ SΘ. pΔC(n, vs) ⟹ ∃v ∈ vs. (n, v) ∈ SΘ ∧ p = π((n, v))
The parent of a package is the package whose dependency caused its inclusion.
Peer Satisfaction: ∀p ∈ SΘ. pΘ(n, vs) ∧ π(p)ΔC(n, us) ⟹ ∃(n, v) ∈ SΘ. v ∈ (vs ∩ us)
If a non-root package p has a peer dependency on n, and its parent depends on n, then the resolved version of n must satisfy both constraints.
We write 𝒮Θ(ΔC, Θ, g, rC) ∋ (SΘ, π) for the set of all resolutions and parent functions of rC in ΔC and Θ under g.
Example: Consider where parent A depends on child B with a peer dependency on C and g(v) = v:
(A, 1) Δ_C (B, {1})
(A, 1) Δ_C (C, {2, 3})
(B, 1) Θ (C, {1, 2})
§Reduction to the Core Calculus
We reduce peer dependencies to the Core Package Calculus with a modified Concurrent Package Calculus reduction using intermediate package dependencies to constrain the peer package version selection to the intersection of the parent and the child constraints.
Peer Dependency Reduction
Given RC, ΔC, Θ, rC, and g, we define a reduction to the core calculus with R, Δ, and r, as follows:
Packages:
- ∀(n, v) ∈ RC. (⟨n, g(v)⟩, v) ∈ R where ⟨n, g(v)⟩ ∈ N
- r = (⟨n, g(v)⟩, v) where rC = (n, v)
Dependencies: For each (n, v)ΔC(m, vs), with i ≔ ⟨n, v, m⟩ where i ∈ N:
Direct case: If ∃w ∈ G such that ∀u ∈ vs. g(u) = w (single granular version), and $\not\exists (n, v) \Delta_C (o, us).\ \exists u \in us.\ (o, u) \Theta (m, ws)$ (not a peer dependency), then (⟨n, g(v)⟩, v)Δ(⟨m, w⟩, vs)
Split case (multiple granular versions or a peer dependency): Otherwise,
- ∀u ∈ vs. (i, u) ∈ R and (i, u)Δ(⟨m, g(u)⟩, {u})
- (⟨n, g(v)⟩, v)Δ(i, vs)
Note we use the full version in the intermediate package to allow selection of an exact peer package version as the intersection of constraints, regardless of g.
Peer Dependencies (another child o has a peer dependency on m): If ∃(n, v)ΔC(o, us). ∃u ∈ us. (o, u)Θ(m, ws), then
- ∀w ∈ ws. (i, w) ∈ R and (i, w)Δ(⟨m, g(w)⟩, {w})
- ∀u ∈ us. (o, u)Θ(m, ws) ⟹ (⟨n, v, o⟩, u)Δ(i, ws)
Example: The reduction of the previous example produces:
(⟨A, 1⟩, 1) Δ (⟨A, 1, B⟩, {1})
(⟨A, 1, B⟩, 1) Δ (⟨B, 1⟩, {1})
(⟨A, 1⟩, 1) Δ (⟨A, 1, C⟩, {2, 3})
(⟨A, 1, C⟩, 1) Δ (⟨C, 1⟩, {1})
(⟨A, 1, C⟩, 2) Δ (⟨C, 2⟩, {2})
(⟨A, 1, C⟩, 3) Δ (⟨C, 3⟩, {3})
(⟨A, 1, B⟩, 1) Δ (⟨A, 1, C⟩, {1, 2})
The key constraint is that ⟨A, 1, B⟩ (representing B as selected by A) must satisfy B’s peer dependency requiring C ∈ {1, 2}, while A directly constrains C to {2, 3}. The intersection {1, 2} ∩ {2, 3} = {2} means only C version 2 can be selected.
Soundness: If S is a valid resolution in 𝒮(Δ, r), then SΘ = {(n, v) ∣ (⟨n, g(v)⟩, v) ∈ S} π = {((m, u), (n, v)) ∣ (⟨n, v, m⟩, u) ∈ S} are a valid resolution and parent function in 𝒮Θ(ΔC, Θ, g, rC).
Completeness: If (SΘ, π) ∈ 𝒮Θ(ΔC, Θ, g, rC), then S = {(⟨n, g(v)⟩, v) ∣ (n, v) ∈ SΘ} ∪ {(⟨n, v, m⟩, u) ∣ (m, u) ∈ SΘ, (m, u) ≠ rC, π((m, u)) = (n, v)} is a valid resolution in 𝒮(Δ, r).
§What’s next?
So far I’ve formalised how the Package Calculus can model version ordering, version formula, conflicts, concurrent versions, features, package formula, variable formula, virtual packages, dependency cycles, optional dependencies, and precise dependencies.
I have an idea to demonstrate that we can invert reductions, to use Pac as the intermediate format for bi-directional translation of dependency information between ecosystems. This requires more proofs, not to mention an implementation.
Not covered is how these extensions compose; some are straightforward, such as virtual packages and conflicts, like Debian; but others are less clear, how do features and concurrent versions interact, such as in Cargo.
There is also the matter of formalising that these extensions themselves are capable of modelling the core calculus. This is straightfoward for some (e.g. setting the granularity function to the identity function in the concurrent calculus), but is not true for precise dependencies (e.g. Nix).
Finally, I’m not sure if this warrants the name ‘Calculus’ – suggestions welcome for a better name!