the chasm at depth four, and tensor rank : old results, new insights
TRANSCRIPT
the chasm at depth four, and tensor rank:old results, new insights
Suryajith Chillara 1 Mrinal Kumar 2 Ramprasad Saptharishi 3 V Vinay4
November 4, 20161Chennai Mathematical Institute.
2Rutgers University.
3Tata Institute of Fundamental Research.
4Limberlink Technologies and Chennai Mathematical Institute.
Arithmetic circuits and formulas
Depth reduction
Depth reduction to depth four circuits
Log product decompositions for homogeneous formulas
Depth reduction of homogeneous formulas
Application
Set-multilinear polynomials
Tensors
Raz’s work
Tensor rank upper bound
Conclusion
arithmetic circuits and formulas
1
arithmetic circuits
Definition
An Arithmetic circuit Φ over the field F and the set of variablesX = (x1, x2, . . . , xn) is a directed acyclic graph as follows.
∙ Every gate with in-degree 0 is called an input gate and is labelledeither by a variable or a field element from F.
∙ Every other gate is labelled by either ×(product gate) or +(sumgate).
∙ The size of Φ is the number of edges present in it.
∙ The depth of Φ is the maximal depth of a gate in it.
2
arithmetic circuit
×
+
1 x1
+
1 x2
+
1 xn−1
+
1 xn
· · ·
Figure: The arithmetic circuit for f(x) =∑S⊆[n]
∏i∈S
xi =∏i∈S
(1+ xi)
3
arithmetic formula
Definition
An arithmetic formula is an arithmetic circuit where the fan-out ofevery gate is at most 1.
4
formula size lower bounds
Strassen showed the following connection between the Tensor rankand the formula size complexity.
Theorem
For an explicit tensor T : [n]3 → F, the size of the correspondingpolynomial f is at least Ω(TensorRank(f)).
∙ The best size lower bound one could obtain from this is Ω(n2).∙ State of the art lower bound on a tensor of order 3 is linear (dueto Alexeev et al.).
5
formula size lower bounds
Strassen showed the following connection between the Tensor rankand the formula size complexity.
Theorem
For an explicit tensor T : [n]3 → F, the size of the correspondingpolynomial f is at least Ω(TensorRank(f)).
∙ The best size lower bound one could obtain from this is Ω(n2).∙ State of the art lower bound on a tensor of order 3 is linear (dueto Alexeev et al.).
5
depth reduction
depth reduction
Theorem ([AV08][Koi12][Tav15])
Any arithmetic circuit of size s computing a degree d polynomial canbe computed by a depth-4 circuit of bottom fan-in t and size at mostsO( d
t ).
7
arithmetic circuit classes
Definition (ΣΠΣΠ circuit)
∙ A ΣΠΣΠ circuit is composed of alternating layers of addition andmultiplication gates.
∙ A ΣΠ[D]ΣΠ[t] circuit computes a polynomial of the form
s∑i=1
di∏j=1
fi,j(x1, . . . , xn)
where∙ s is the fan-in of + gate at the top∙ fi,js are polynomials of degree at most t∙ dj < D for all j.
8
arithmetic circuit classes
Definition (ΣΠΣΠ circuit)
∙ A ΣΠΣΠ circuit is composed of alternating layers of addition andmultiplication gates.
∙ A ΣΠ[D]ΣΠ[t] circuit computes a polynomial of the form
s∑i=1
di∏j=1
fi,j(x1, . . . , xn)
where∙ s is the fan-in of + gate at the top∙ fi,js are polynomials of degree at most t∙ dj < D for all j.
8
depth-4 circuits
+
×
...
×
+
...
+
×
x1 x3
×
x2 x5
×
x3 x6
+
...
×
...
. . . . . .
. . . . . .
. . . . . .
9
log product decomposition for homogeneous formulas
Theorem ([HY11])
Let f be an n-variate degree d polynomial computed by a size shomogeneous formula. Then, f can be expressed as
f =s∑
i=1
fi1 · fi2 · · · fir (1)
where
1. the expression is homogeneous,2. for each i, j, we have
( 13)j d ≤ deg(fij) ≤
( 23)j d and r = Θ(logd),
3. each fij is also computed by homogeneous formulas of size atmost s.
10
depth reduction to depth four
Theorem
Let f be a homogeneous n-variate degree d polynomial computed bya size s homogeneous formula. Then for any 0 < t ≤ d, f can beequivalently computed by a homogeneous ΣΠ[a]ΣΠ[t] formula of topfan-in s10(d/t) where
a >110
dt log t.
11
depth reduction to depth four
Proof.
We start with equation (1) which is easily seen to be a homogeneousΣΠΣΠ[2d/3] circuit with top fan-in s:
f =s∑
i=1
fi1 · fi2 · · · fir
1. For each summand fi1 . . . fir in the RHS, pick the gate fij with largestdegree (if there is a tie, pick the one with smaller index j). If fij hasdegree more than t, expand that fij in-place using (1).
2. Repeat this process until all fij’s on the RHS have degree at most t.
12
depth reduction to depth four
g
g11g12 . . .g1r gs1gs2 . . .gsr· · ·
s
g111g112 . . . g11r
·g12 . . . g1r· · ·
g1s1g1s2 . . . g1sr
·g12 · · · g1r
s
gs11gs12 . . . gs1r
·gs2 . . . gsr· · ·
gss1gss2 · · · gssr
·gs2 · · · gsr
s
Figure: Depth reduction analysis
13
depth reduction to depth four
1. Consider a potential function — the total degree of all the bigfactors.
2. Geometric progression of degrees =⇒ total degree of smallfactors in a summand is at most 3t. Total degree of big terms is atleast d− 3t.
3. Potential drops by at most 3t per iteration. =⇒ at least d/3titerations are needed.
4. Every expansion introduces at least (log3 t) non-trivial terms.
14
application
set-multilinear polynomials
Definition (Set-multilinear polynomials)
Let x = x1 ⊔ · · · ⊔ xd be a partition of variables and let |xi| = mi. Apolynomial f(x) is said to be set-multilinear (SML) with respect to theabove partition if every monomial m in f satisfies |m ∩ Xi| = 1 for alli ∈ [d].
16
tensors
Definition
A tensor of order d can be defined as the function T : [n]d → F.
17
tensors as set-multilinear polynomials
Consider the partition of variables x = x1 ⊔ · · · ⊔ xd such that each|xi| = n.
Definition
Given a tensor T : [n]d → F, we can define a set-multilinearpolynomial f(x) as follows.
f(x) =∑
(i1,i2,...,id)∈[n]dT(i1, i2, . . . , id) · x1,i1 · x2,i2 · · · · xd,id .
18
tensor rank
A rank-1 tensor is precisely a set-multilinear product of linear formssuch as
f(x) = ℓ1(x1) · · · ℓd(xd)
where each ℓi(xi) is a linear form in the variables in xi.
19
tensor rank
Definition (Tensor rank, as set-multilinear polynomials)
For polynomial f(x) that is set-multilinear with respect tox = x1 ⊔ · · · ⊔ xd, the tensor rank of f (denoted by TensorRank(f)) isthe smallest r for which f can be expressed as a set-multilinear ΣΠΣcircuit:
f(x) =r∑
i=1
ℓi1(x1) · · · ℓid(xd).
20
properties of tensor rank
1. Sub-additive: For two SML polynomials f and g overx = x1 ⊔ · · · ⊔ xd,
TensorRank(f+ g) ≤ TensorRank(f) + TensorRank(g).
2. Sub-multiplicative: For two SML polynomials f and g overx1 ⊔ · · · ⊔ xd1 and xd1+1 ⊔ · · · ⊔ xd respectively,
TensorRank(f · g) ≤ TensorRank(f) · TensorRank(g).
3. For any SML polynomial f over x = x1 ⊔ · · · ⊔ xd such that each|xi| = n,
TensorRank(f) ≤ nd−1.
21
raz’s work
Theorem ([Raz10])
Let c be a constant. Let Φ be a formula of size s ≤ nc computing aset-multilinear polynomial f(x) with respect to x = x1 ⊔ · · · ⊔ xd. Ifd = O(logn/ log logn), then,
TensorRank(f) ≤ nd
nd/ exp(c) .
Corollary
For a family of tensors Tn : [n]d → F, with the tensor rank of thecorresponding SML polynomial f greater than nd(1−1/ exp(c)) for aconstant c, then f can not be computed by an arithmetic formula ofsize nc.
22
raz’s work
Theorem ([Raz10])
Let c be a constant. Let Φ be a formula of size s ≤ nc computing aset-multilinear polynomial f(x) with respect to x = x1 ⊔ · · · ⊔ xd. Ifd = O(logn/ log logn), then,
TensorRank(f) ≤ nd
nd/ exp(c) .
Corollary
For a family of tensors Tn : [n]d → F, with the tensor rank of thecorresponding SML polynomial f greater than nd(1−1/ exp(c)) for aconstant c, then f can not be computed by an arithmetic formula ofsize nc.
22
tensor rank bound for homogeneous formulas
Further improvement in case of homogeneous formulas:
Theorem
Let f be a set-multilinear polynomial with respect to x = x1 ⊔ · · · ⊔ xdthat is computed by a homogeneous formula (not necessarilyset-multilinear) Φ of size s = nc. If d is sub-polynomial in n, that islogd = o(logn), then
TensorRank(f) <nd
nd/ exp(c) .
23
tensor rank upper bound for hom. formulas
1. A homogeneous formula Φ of size nc can be reduced to a ΣΠΣΠ[t]
formula Φ′ of size n10c(d/t) for a parameter t.2. Consider a term T = Q1 · · ·Qa in Φ′ where a ≥ d log t
10t . This is notnecessarily a SML product.
3. We shall extract the set multilinear components from each of thefactors in T.
24
upper bound
1. For any S ⊂ [d] let Qi,S be the sum of monomials of Qi over thepartitions ⊔j∈Sxj and let di = deg(Qi) for all i ∈ [a].
2.SML(T) =
∑S1⊔···⊔Sa=[d]
|Si|=di
Q1,S1 · · ·Qa,Sa .
3. The tensor rank of each summand is upper bounded bynd1−1nd2−1 · · ·nda−1 and the number of summands is at most( dd1
)(d−d1d2
)· · ·
(d−∑a−1i=1 di
da
).
4.TensorRank(SML(T)) ≤ nd
na ·(
dd1 d2 · · · da
)and
TensorRank(f) ≤ n10c(d/t) · TensorRank(SML(T)).
25
conclusion
conclusion
1. Proving the existence of tensors with high rank would imply betterlower bounds.
2. State of the art lower bounds on explicit tensors (due to Alexeev,Forbes, and Tsimerman) for tensors do not help us get even thequadratic lower bounds for (homogeneous) formulas.
27
Questions ?
28
Thank you!1
1The theme of these slides is based on the the theme made available athttps://github.com/matze/mtheme undercba licence.
29
29