the chasm at depth four, and tensor rank : old results, new insights

the chasm at depth four, and tensor rank:old results, new insights

Suryajith Chillara 1 Mrinal Kumar 2 Ramprasad Saptharishi 3 V Vinay4

November 4, 20161Chennai Mathematical Institute.

2Rutgers University.

3Tata Institute of Fundamental Research.

4Limberlink Technologies and Chennai Mathematical Institute.

Arithmetic circuits and formulas

Depth reduction

Depth reduction to depth four circuits

Log product decompositions for homogeneous formulas

Depth reduction of homogeneous formulas

Application

Set-multilinear polynomials

Tensors

Raz’s work

Tensor rank upper bound

Conclusion

arithmetic circuits and formulas

1

arithmetic circuits

Definition

An Arithmetic circuit Φ over the field F and the set of variablesX = (x1, x2, . . . , xn) is a directed acyclic graph as follows.

∙ Every gate with in-degree 0 is called an input gate and is labelledeither by a variable or a field element from F.

∙ Every other gate is labelled by either ×(product gate) or +(sumgate).

∙ The size of Φ is the number of edges present in it.

∙ The depth of Φ is the maximal depth of a gate in it.

2

arithmetic circuit

×

+

1 x1

+

1 x2

+

1 xn−1

+

1 xn

· · ·

Figure: The arithmetic circuit for f(x) =∑S⊆[n]

∏i∈S

xi =∏i∈S

(1+ xi)

3

arithmetic formula

Definition

An arithmetic formula is an arithmetic circuit where the fan-out ofevery gate is at most 1.

4

formula size lower bounds

Strassen showed the following connection between the Tensor rankand the formula size complexity.

Theorem

For an explicit tensor T : [n]3 → F, the size of the correspondingpolynomial f is at least Ω(TensorRank(f)).

∙ The best size lower bound one could obtain from this is Ω(n2).∙ State of the art lower bound on a tensor of order 3 is linear (dueto Alexeev et al.).

5

depth reduction

depth reduction

Theorem ([AV08][Koi12][Tav15])

Any arithmetic circuit of size s computing a degree d polynomial canbe computed by a depth-4 circuit of bottom fan-in t and size at mostsO( d

t ).

7

arithmetic circuit classes

Definition (ΣΠΣΠ circuit)

∙ A ΣΠΣΠ circuit is composed of alternating layers of addition andmultiplication gates.

∙ A ΣΠ[D]ΣΠ[t] circuit computes a polynomial of the form

s∑i=1

di∏j=1

fi,j(x1, . . . , xn)

where∙ s is the fan-in of + gate at the top∙ fi,js are polynomials of degree at most t∙ dj < D for all j.

8

depth-4 circuits

+

×

...

×

+

...

+

×

x1 x3

×

x2 x5

×

x3 x6

+

...

×

...

. . . . . .

. . . . . .

. . . . . .

9

log product decomposition for homogeneous formulas

Theorem ([HY11])

Let f be an n-variate degree d polynomial computed by a size shomogeneous formula. Then, f can be expressed as

f =s∑

i=1

fi1 · fi2 · · · fir (1)

where

1. the expression is homogeneous,2. for each i, j, we have

( 13)j d ≤ deg(fij) ≤

( 23)j d and r = Θ(logd),

3. each fij is also computed by homogeneous formulas of size atmost s.

10

depth reduction to depth four

Theorem

Let f be a homogeneous n-variate degree d polynomial computed bya size s homogeneous formula. Then for any 0 < t ≤ d, f can beequivalently computed by a homogeneous ΣΠ[a]ΣΠ[t] formula of topfan-in s10(d/t) where

a >110

dt log t.

11


Proof.

We start with equation (1) which is easily seen to be a homogeneousΣΠΣΠ[2d/3] circuit with top fan-in s:

f =s∑

i=1

fi1 · fi2 · · · fir

1. For each summand fi1 . . . fir in the RHS, pick the gate fij with largestdegree (if there is a tie, pick the one with smaller index j). If fij hasdegree more than t, expand that fij in-place using (1).

2. Repeat this process until all fij’s on the RHS have degree at most t.

12


g

g11g12 . . .g1r gs1gs2 . . .gsr· · ·

s

g111g112 . . . g11r

·g12 . . . g1r· · ·

g1s1g1s2 . . . g1sr

·g12 · · · g1r

s

gs11gs12 . . . gs1r

·gs2 . . . gsr· · ·

gss1gss2 · · · gssr

·gs2 · · · gsr

s

Figure: Depth reduction analysis

13


1. Consider a potential function — the total degree of all the bigfactors.

2. Geometric progression of degrees =⇒ total degree of smallfactors in a summand is at most 3t. Total degree of big terms is atleast d− 3t.

3. Potential drops by at most 3t per iteration. =⇒ at least d/3titerations are needed.

4. Every expansion introduces at least (log3 t) non-trivial terms.

14

application

set-multilinear polynomials

Definition (Set-multilinear polynomials)

Let x = x1 ⊔ · · · ⊔ xd be a partition of variables and let |xi| = mi. Apolynomial f(x) is said to be set-multilinear (SML) with respect to theabove partition if every monomial m in f satisfies |m ∩ Xi| = 1 for alli ∈ [d].

16

tensors

Definition

A tensor of order d can be defined as the function T : [n]d → F.

17

tensors as set-multilinear polynomials

Consider the partition of variables x = x1 ⊔ · · · ⊔ xd such that each|xi| = n.

Definition

Given a tensor T : [n]d → F, we can define a set-multilinearpolynomial f(x) as follows.

f(x) =∑

(i1,i2,...,id)∈[n]dT(i1, i2, . . . , id) · x1,i1 · x2,i2 · · · · xd,id .

18

tensor rank

A rank-1 tensor is precisely a set-multilinear product of linear formssuch as

f(x) = ℓ1(x1) · · · ℓd(xd)

where each ℓi(xi) is a linear form in the variables in xi.

19

tensor rank

Definition (Tensor rank, as set-multilinear polynomials)

For polynomial f(x) that is set-multilinear with respect tox = x1 ⊔ · · · ⊔ xd, the tensor rank of f (denoted by TensorRank(f)) isthe smallest r for which f can be expressed as a set-multilinear ΣΠΣcircuit:

f(x) =r∑

i=1

ℓi1(x1) · · · ℓid(xd).

20

properties of tensor rank

1. Sub-additive: For two SML polynomials f and g overx = x1 ⊔ · · · ⊔ xd,

TensorRank(f+ g) ≤ TensorRank(f) + TensorRank(g).

2. Sub-multiplicative: For two SML polynomials f and g overx1 ⊔ · · · ⊔ xd1 and xd1+1 ⊔ · · · ⊔ xd respectively,

TensorRank(f · g) ≤ TensorRank(f) · TensorRank(g).

3. For any SML polynomial f over x = x1 ⊔ · · · ⊔ xd such that each|xi| = n,

TensorRank(f) ≤ nd−1.

21

raz’s work

Theorem ([Raz10])

Let c be a constant. Let Φ be a formula of size s ≤ nc computing aset-multilinear polynomial f(x) with respect to x = x1 ⊔ · · · ⊔ xd. Ifd = O(logn/ log logn), then,

TensorRank(f) ≤ nd

nd/ exp(c) .

Corollary

For a family of tensors Tn : [n]d → F, with the tensor rank of thecorresponding SML polynomial f greater than nd(1−1/ exp(c)) for aconstant c, then f can not be computed by an arithmetic formula ofsize nc.

22

tensor rank bound for homogeneous formulas

Further improvement in case of homogeneous formulas:

Theorem

Let f be a set-multilinear polynomial with respect to x = x1 ⊔ · · · ⊔ xdthat is computed by a homogeneous formula (not necessarilyset-multilinear) Φ of size s = nc. If d is sub-polynomial in n, that islogd = o(logn), then

TensorRank(f) <nd

nd/ exp(c) .

23

tensor rank upper bound for hom. formulas

1. A homogeneous formula Φ of size nc can be reduced to a ΣΠΣΠ[t]

formula Φ′ of size n10c(d/t) for a parameter t.2. Consider a term T = Q1 · · ·Qa in Φ′ where a ≥ d log t

10t . This is notnecessarily a SML product.

3. We shall extract the set multilinear components from each of thefactors in T.

24

upper bound

1. For any S ⊂ [d] let Qi,S be the sum of monomials of Qi over thepartitions ⊔j∈Sxj and let di = deg(Qi) for all i ∈ [a].

2.SML(T) =

∑S1⊔···⊔Sa=[d]

|Si|=di

Q1,S1 · · ·Qa,Sa .

3. The tensor rank of each summand is upper bounded bynd1−1nd2−1 · · ·nda−1 and the number of summands is at most( dd1

)(d−d1d2

)· · ·

(d−∑a−1i=1 di

da

).

4.TensorRank(SML(T)) ≤ nd

na ·(

dd1 d2 · · · da

)and

TensorRank(f) ≤ n10c(d/t) · TensorRank(SML(T)).

25

conclusion

conclusion

1. Proving the existence of tensors with high rank would imply betterlower bounds.

2. State of the art lower bounds on explicit tensors (due to Alexeev,Forbes, and Tsimerman) for tensors do not help us get even thequadratic lower bounds for (homogeneous) formulas.

27

Questions ?

28

Thank you!1

1The theme of these slides is based on the the theme made available athttps://github.com/matze/mtheme undercba licence.

29

https://github.com/matze/mtheme

the chasm at depth four, and tensor rank : old results, new insights

Education