1. Berclaz, J., Fleuret, F., Turetken, E., and Fua, P. Multiple Object Tracking using
K-Shortest Paths Optimization. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 2011.
2. Carroll, J. D. and Chang, J.-J. Analysis of individual differences in multidimensional
scaling via an N-way generalization of “Eckart–Young” decomposition. Psychometrika, 35
(3):283–319, Sep 1970.
3. El-Yaniv, R. and Pechyony, D. Transductive rademacher complexity and its applications.
In Learning Theory, volume 4539, pp. 157–171. 2007.
4. Fazel, M., Hindi, H., and Boyd, S. P. A rank minimization heuristic with application
to minimum order system approximation. In Proceedings of the 2001 American Control
Conference. (Cat. No.01CH37148), volume 6, pp. 4734–4739, June 2001.
5. Guo, X., Yao, Q., and Kwok, J. T. Efficient sparse low-rank tensor completion using the
Frank–Wolfe algorithm. In Proceedings of the Thirty-First AAAI Conference on Artificial
Intelligence, February 4-9, 2017, San Francisco, California, USA, pp. 1948–1954, 2017.
6. Hackbusch, W. Tensor Spaces and Numerical Tensor Calculus. Springer Series in
Computational Mathematics. Springer Berlin Heidelberg, 2012. ISBN 9783642280276.
7. Harshman, R. A. Foundations of the PARAFAC procedure: models and conditions for an
explanatory multimodal factor analysis. UCLA Working Papers in Phonetics, 16:1–84,
1970.
8. Hillar, C. J. and Lim, L.-H. Most tensor problems are NP-hard. J. ACM, 60(6), November
2013. ISSN 0004-5411.
9. Hitchcock, F. L. The expression of a tensor or a polyadic as a sum of products. J. Math.
Phys., 6(1):164–189, 1927.
10. Imaizumi, M., Maehara, T., and Hayashi, K. On tensor train rank minimization:
Statistical efficiency and scalable algorithm. In NIPS, pp. 3933–3942, 2017.
11. Karatzoglou, A., Amatriain, X., Baltrunas, L., and Oliver, N. Multiverse recommendation:
N-dimensional tensor factorization for context-aware collaborative filtering. In RecSys, pp.
79–86. ACM, 2010.
12. Kolda, T. G. and Bader, B. W. Tensor Decompositions and Applications. SIAM Review,
51(3):455–500, 2009.
13. Latała, R. Some estimates of norms of random matrices. Proc. Amer. Math. Soc., 133
(5):1273–1282, 2005.
14. Lim, L.-H. and Comon, P. Blind multilinear identification. IEEE Trans. Information
Theory, 60(2):1260–1280, 2014.
15. Liu, J., Musialski, P., Wonka, P., and Ye, J. Tensor completion for estimating missing
values in visual data. In ICCV, pp. 2114–2121, 2009.
16. Mu, C., Huang, B., Wright, J., and Goldfarb, D. Square deal: Lower bounds and
improved relaxations for tensor recovery. In ICML, pp. 73–81, 2014.
17. Oseledets, I. V. Tensor-train decomposition. SIAM J. Sci. Comput., 33(5):2295–2317,
September 2011. ISSN 1064-8275.
18. Rai, P., Hu, C., Harding, M., and Carin, L. Scalable probabilistic tensor factorization for
binary and count data. In IJCAI, pp. 3770–3776. AAAI Press, 2015. ISBN 978-1-57735-738-4.
19. Raskutti, G., Chen, H., and Yuan, M. Convex regularization for high-dimensional
multi-response tensor regression. CoRR, abs/1512.01215v2, 2015.
20. Shamir, O. and Shalev-Shwartz, S. Matrix completion with the trace norm: Learning,
bounding, and transducing. Journal of Machine Learning Research, 15:3401–3423, 2014.
21. Song, Q., Ge, H., Caverlee, J., and Hu, X. Tensor completion algorithms in big data
analytics. CoRR, abs/1711.10105, 2017.
22. Tomioka, R. and Suzuki, T. Convex tensor decomposition via structured schatten norm
regularization. In NIPS, 2013.
23. Wimalawarne, K., Sugiyama, M., and Tomioka, R. Multitask learning meets tensor
factorization: task imputation via convex optimization. In NIPS, 2014.
24. Yang, Y., Feng, Y., and Suykens, J. A. K. A rank-one tensor updating algorithm for
tensor completion. IEEE Signal Processing Letters, 22(10):1633–1637, Oct 2015. ISSN
1070-9908.
25. Yuan, M. and Zhang, C.-H. On tensor completion via nuclear norm minimization.
Foundations of Computational Mathematics, 16(4):1031–1068, Aug 2016.
26. Zheng, V. W., Cao, B., Zheng, Y., Xie, X., and Yang, Q. Collaborative filtering meets
mobile recommendation: A user-centered approach. In AAAI, AAAI’10, pp. 236–241.
AAAI Press, 2010.
A Dual Norms of Reshaped Tensor Nuclear Norms

In this section, we discuss the dual norm of the proposed reshaped tensor nuclear norm. The
dual norm is useful in developing optimization procedures and proving theoretical bounds.
The dual norm of the tensor nuclear norm [24, 25] of a $K$-mode tensor $\mathcal{T} \in \mathbb{R}^{n_1 \times \cdots \times n_K}$ is
given by
$$\|\mathcal{T}\|_{\mathrm{op}} = \max_{\|y_i\|_2 = 1,\ 1 \le i \le K} \langle \mathcal{T},\, y_1 \otimes y_2 \otimes \cdots \otimes y_K \rangle. \tag{6}$$
This definition applies to all the tensor nuclear norms, including the reshaped norms.
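As a numerical companion to (6), the following is a minimal NumPy sketch (the function name is ours, not the paper's) that estimates $\|\mathcal{T}\|_{\mathrm{op}}$ by alternating rank-1 power iteration. Since this maximization is non-convex (and NP-hard in general [8]), the iteration may only reach a local maximum, so it should be read as a heuristic rather than an exact solver.

```python
import numpy as np

def tensor_op_norm(T, iters=200, seed=0):
    """Estimate ||T||_op = max <T, y_1 x ... x y_K> over unit vectors y_i
    by alternating (higher-order) power iteration. A heuristic: it can stop
    at a local maximum, so several random restarts may be needed."""
    rng = np.random.default_rng(seed)
    ys = [rng.standard_normal(n) for n in T.shape]
    ys = [y / np.linalg.norm(y) for y in ys]
    for _ in range(iters):
        for k in range(T.ndim):
            v = T
            # contract every mode except k, highest axis first so indices stay valid
            for j in range(T.ndim - 1, -1, -1):
                if j != k:
                    v = np.tensordot(v, ys[j], axes=(j, 0))
            ys[k] = v / np.linalg.norm(v)
    val = T
    for j in range(T.ndim - 1, -1, -1):
        val = np.tensordot(val, ys[j], axes=(j, 0))
    return float(val)

# Sanity check: for K = 2 this approximates the largest singular value.
A = np.random.default_rng(1).standard_normal((5, 7))
print(tensor_op_norm(A), np.linalg.svd(A, compute_uv=False)[0])
```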
The next lemma provides the dual norm for the reshaped latent tensor nuclear norm.

Lemma 1 The dual norm of the reshaped latent tensor nuclear norm of a tensor $\mathcal{W} \in \mathbb{R}^{n_1 \times \cdots \times n_K}$ for a collection of $G$ reshaping sets $D_L = (D^{(1)}, \ldots, D^{(G)})$ is
$$\|\mathcal{W}\|_{r\,\mathrm{latent}(D_L)*} = \max_{g \in \{1, \ldots, G\}} \|W_{(D^{(g)})}\|_{\mathrm{op}}.$$
Proof Using the standard formulation of the dual norm, we write the dual norm $\|\mathcal{W}\|_{r\,\mathrm{latent}(D_L)*}$ as
$$\|\mathcal{W}\|_{r\,\mathrm{latent}(D_L)*} = \sup_{\mathcal{X}} \sum_{k=1}^{G} \big\langle \mathcal{X}^{(k)}, \mathcal{W} \big\rangle \quad \text{s.t.} \quad \inf_{\mathcal{X}^{(1)} + \cdots + \mathcal{X}^{(G)} = \mathcal{X}} \sum_{k=1}^{G} \big\| X^{(k)}_{(D^{(k)})} \big\|_\star \le 1. \tag{7}$$
The solution to (7) resides on the simplex defined by $\inf_{\mathcal{X}^{(1)} + \cdots + \mathcal{X}^{(G)} = \mathcal{X}} \sum_{k=1}^{G} \|X^{(k)}_{(D^{(k)})}\|_\star \le 1$, and one of the extreme points of this simplex is a solution. Then, we can take any $g \in \{1, \ldots, G\}$ such that $\mathcal{X}^{(g)} = \mathcal{X}$ and $\mathcal{X}^{(k)} = 0$ for all $k \ne g$, and rearrange (7) as
$$\|\mathcal{W}\|_{r\,\mathrm{latent}(D_L)*} = \sup_{g \in \{1, \ldots, G\}} \big\langle X_{(D^{(g)})}, W_{(D^{(g)})} \big\rangle \quad \text{s.t.} \quad \|X_{(D^{(g)})}\|_\star \le 1,$$
which results in the following:
$$\|\mathcal{W}\|_{r\,\mathrm{latent}(D_L)*} = \max_{g \in \{1, \ldots, G\}} \|W_{(D^{(g)})}\|_{\mathrm{op}}.$$
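Lemma 1 suggests a direct way to evaluate the dual norm numerically: reshape $\mathcal{W}$ by each reshaping set and take the largest operator norm. Below is a hedged sketch (reusing the heuristic `tensor_op_norm` from the earlier sketch; `reshape_by_sets` and `latent_dual_norm` are our illustrative names), so the result is a numerical estimate rather than an exact value.

```python
import numpy as np

def reshape_by_sets(W, D):
    """Reshape W by the reshaping sets D = (D_1, ..., D_M): permute the modes
    so each group D_j is contiguous, then flatten each group into one mode."""
    order = [i for Dj in D for i in Dj]
    shape = [int(np.prod([W.shape[i] for i in Dj])) for Dj in D]
    return np.transpose(W, order).reshape(shape)

def latent_dual_norm(W, DL):
    """max_g ||W_(D^(g))||_op from Lemma 1, estimated with the heuristic
    tensor_op_norm above."""
    return max(tensor_op_norm(reshape_by_sets(W, Dg)) for Dg in DL)

# Example: a 4-mode tensor with two candidate reshapings into matrices.
W = np.random.default_rng(0).standard_normal((3, 4, 3, 4))
print(latent_dual_norm(W, [([0, 1], [2, 3]), ([0, 2], [1, 3])]))
```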
B Proofs of theoretical analysis

In this section, we provide proofs of the theoretical analysis in Section 4.
First, we prove the following useful lemmas. These lemmas bound the tensor nuclear
norm and the reshaped tensor nuclear norms with respect to the multilinear rank and the CP rank of a tensor.
Lemma 2 Let $\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_K}$ be a $K$-mode tensor with a multilinear rank of $(r_1, \ldots, r_K)$. Let $r_{\mathrm{cp}}$ be the CP rank of $\mathcal{X}$. Then
$$\|\mathcal{X}\|_\star = \min\left\{ \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j \,\middle|\, \mathcal{X} = \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j\, u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Kj},\ \|u_{kj}\|_2 = 1,\ \gamma_j \ge \gamma_{j+1} > 0 \right\} \le \frac{\prod_{k=1}^{K} r_k}{\max_{k=1,\ldots,K} r_k}\, \gamma_1,$$
where $\gamma_i$ is the $i$th singular value of $\mathcal{X}$.
Proof Let us consider the Tucker decomposition of $\mathcal{X}$:
$$\mathcal{X} = \sum_{j_1=1}^{r_1} \sum_{j_2=1}^{r_2} \cdots \sum_{j_K=1}^{r_K} C_{j_1,\ldots,j_K}\, u^{(1)}_{j_1} \otimes u^{(2)}_{j_2} \otimes \cdots \otimes u^{(K)}_{j_K},$$
where $C \in \mathbb{R}^{r_1 \times \cdots \times r_K}$ is the core tensor and $u^{(j)}_i \in \mathbb{R}^{n_j}$, $\|u^{(j)}_i\|_2 = 1$, $i = 1, \ldots, r_j$, $j = 1, \ldots, K$, are component vectors.
Following Chapter 8 of [6], we can express the above Tucker decomposition as
$$\mathcal{X} = \sum_{j_2=1}^{r_2} \cdots \sum_{j_K=1}^{r_K} \underbrace{\left( \sum_{j_1=1}^{r_1} C_{j_1,\ldots,j_K}\, u^{(1)}_{j_1} \right)}_{\hat{u}^{(1)}[j_2,\ldots,j_K]\, \in\, \mathbb{R}^{n_1}} \otimes\, u^{(2)}_{j_2} \otimes \cdots \otimes u^{(K)}_{j_K}, \tag{8}$$
where we have taken the summation over the multiplications of core tensor elements with
the component vectors of mode 1. It is easy to see that we can instead take the summation over the
component vectors of any other mode.
By considering $\hat{u}^{(1)}[j_2,\ldots,j_K] = \gamma[j_2,\ldots,j_K]\, \frac{\hat{u}^{(1)}[j_2,\ldots,j_K]}{\|\hat{u}^{(1)}[j_2,\ldots,j_K]\|_2}$, where $\gamma[j_2,\ldots,j_K] = \|\hat{u}^{(1)}[j_2,\ldots,j_K]\|_2$, this leads to a CP decomposition with at most
$$\frac{\prod_{k=1}^{K} r_k}{\max_{k=1,\ldots,K} r_k}$$
rank-one terms (summing over the mode with the largest rank removes the largest factor from the count), so $r_{\mathrm{cp}}$ is at most this number. By arranging the $\gamma[j_2,\ldots,j_K]$ in descending order along with the component
vectors $\hat{u}^{(1)}[j_2,\ldots,j_K]$, and renaming them as $\gamma_1 \ge \gamma_2 \ge \ldots$ and $u_{1j}$, respectively, we obtain
$$\|\mathcal{X}\|_\star = \min\left\{ \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j \,\middle|\, \mathcal{X} = \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j\, u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Kj},\ \|u_{kj}\|_2 = 1,\ \gamma_j \ge \gamma_{j+1} > 0 \right\},$$
where $u_{kj} \in \{u^{(k)}_1, \ldots, u^{(k)}_{r_k}\}$ are component vectors from (8) for each $k = 2, \ldots, K$.
The final bound is then immediate, since each $\gamma_j \le \gamma_1$ and there are at most $\frac{\prod_{k=1}^{K} r_k}{\max_{k=1,\ldots,K} r_k}$ terms:
$$\|\mathcal{X}\|_\star = \min\left\{ \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j \,\middle|\, \mathcal{X} = \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j\, u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Kj},\ \|u_{kj}\|_2 = 1,\ \gamma_j \ge \gamma_{j+1} > 0 \right\} \le \frac{\prod_{k=1}^{K} r_k}{\max_{k=1,\ldots,K} r_k}\, \gamma_1.$$
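The counting step of this proof (absorbing the core along the mode with the largest rank) can be made concrete. The following is a hedged NumPy sketch of that Tucker-to-CP rewriting; `tucker_to_cp` is our illustrative name, and the returned list has at most $\prod_k r_k / \max_k r_k$ rank-one terms, matching the bound above.

```python
import itertools
import numpy as np

def tucker_to_cp(C, U):
    """Rewrite a Tucker decomposition (core C, factors U[k] of shape (n_k, r_k))
    as a weighted sum of rank-1 terms by absorbing the core along the mode
    with the largest rank, as in the proof of Lemma 2."""
    ranks = C.shape
    k_star = int(np.argmax(ranks))                 # mode with the largest rank
    other = [k for k in range(C.ndim) if k != k_star]
    gammas, factors = [], []
    for idx in itertools.product(*[range(ranks[k]) for k in other]):
        sl = [0] * C.ndim
        for k, j in zip(other, idx):
            sl[k] = j
        sl[k_star] = slice(None)                   # core fiber along mode k_star
        u_hat = U[k_star] @ C[tuple(sl)]           # \hat{u} vector; gamma = its norm
        gamma = np.linalg.norm(u_hat)
        if gamma > 0:
            vecs = [U[k][:, j] for k, j in zip(other, idx)]
            vecs.insert(k_star, u_hat / gamma)
            gammas.append(gamma)
            factors.append(vecs)
    return gammas, factors                         # at most prod(r)/max(r) terms
```

Summing `gammas[j]` times the outer product of `factors[j]` recovers the input tensor exactly, and `len(gammas)` equals $\prod_{k \ne k^*} r_k$, which is $\prod_k r_k / \max_k r_k$.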
Lemma 3 Let $\mathcal{X} \in \mathbb{R}^{n \times \cdots \times n}$ be a $K$-mode tensor with a multilinear rank of $(r_1, \ldots, r_K)$. We consider a set of $M$ reshaping sets $D_i$, $i = 1, \ldots, M$. Let $r_{\mathrm{cp}}$ be the CP rank of $\mathcal{X}$. Then
$$\|X_{(D_1,\ldots,D_M)}\|_\star = \min\left\{ \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j \,\middle|\, X_{(D_1,\ldots,D_M)} = \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j\, u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Mj},\ \|u_{kj}\|_2 = 1,\ \gamma_j \ge \gamma_{j+1} > 0 \right\} \le \frac{\prod_{k=1}^{K} r_k}{\max_{j=1,\ldots,M} \prod_{i \in D_j} r_i}\, \gamma_1,$$
where $\gamma_i$ is the $i$th singular value of $X_{(D_1,\ldots,D_M)}$.
Proof Let us consider the Tucker decomposition of $\mathcal{X}$:
$$\mathcal{X} = \sum_{j_1=1}^{r_1} \sum_{j_2=1}^{r_2} \cdots \sum_{j_K=1}^{r_K} C_{j_1,\ldots,j_K}\, u^{(1)}_{j_1} \otimes u^{(2)}_{j_2} \otimes \cdots \otimes u^{(K)}_{j_K},$$
where $C \in \mathbb{R}^{r_1 \times \cdots \times r_K}$ is the core tensor, $(r_1, \ldots, r_K)$ is the multilinear rank, and $u^{(j)}_i \in \mathbb{R}^{n}$, $\|u^{(j)}_i\|_2 = 1$, $i = 1, \ldots, r_j$, $j = 1, \ldots, K$, are component vectors. We rearrange the Tucker
decomposition for the reshaped tensor $X_{(D_1,\ldots,D_M)}$ as
$$X_{(D_1,\ldots,D_M)} = \sum_{j_{a'}, j_{b'}, \ldots \in D_2} \cdots \sum_{j_{a''}, j_{b''}, \ldots \in D_M} \underbrace{\left( \sum_{j_a, j_b, \ldots \in D_1} C_{j_1,\ldots,j_K}\, \Pi_{D_1}\big( u^{(a)}_{j_a} \otimes u^{(b)}_{j_b} \otimes \cdots \big) \right)}_{\hat{u}^{1}[D'_2,\ldots,D'_M]\, \in\, \mathbb{R}^{\mathrm{prod}(D_1)}} \otimes\, \Pi_{D_2}\big( u^{(a')}_{j_{a'}} \otimes u^{(b')}_{j_{b'}} \otimes \cdots \big) \otimes \cdots \otimes \Pi_{D_M}\big( u^{(a'')}_{j_{a''}} \otimes u^{(b'')}_{j_{b''}} \otimes \cdots \big),$$
where $\Pi_{D_j}$ vectorizes the outer product of the component vectors whose modes belong to $D_j$, with $a, b, \ldots \in D_1$, $a', b', \ldots \in D_2$, and so on.
Taking $\hat{u}^{1}[D'_2,\ldots,D'_M] = \gamma[D'_2,\ldots,D'_M]\, \frac{\hat{u}^{1}[D'_2,\ldots,D'_M]}{\|\hat{u}^{1}[D'_2,\ldots,D'_M]\|_2}$, where $\gamma[D'_2,\ldots,D'_M] = \|\hat{u}^{1}[D'_2,\ldots,D'_M]\|_2$, we can consider the above summation over any reshaping set, and it is easy
to see that this arrangement gives a CP decomposition with at most
$$\frac{\prod_{k=1}^{K} r_k}{\max_{j=1,\ldots,M} \prod_{i \in D_j} r_i}$$
rank-one terms, so $r_{\mathrm{cp}}$ is at most this number.
By arranging the $\gamma[D'_2,\ldots,D'_M]$ in descending order along with the component vectors
$\hat{u}^{1}[D'_2,\ldots,D'_M]$, and renaming them as $\gamma_1 \ge \gamma_2 \ge \ldots$ and $u_{1j}$, respectively, we obtain
$$\|X_{(D_1,\ldots,D_M)}\|_\star = \min\left\{ \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j \,\middle|\, X_{(D_1,\ldots,D_M)} = \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j\, u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Mj},\ \|u_{kj}\|_2 = 1,\ \gamma_j \ge \gamma_{j+1} > 0 \right\},$$
where $u_{kj} \in \big\{ \Pi_{D_k}\big( u^{(a')}_{1} \otimes u^{(b')}_{1} \otimes \cdots \big), \ldots, \Pi_{D_k}\big( u^{(a')}_{r_{a'}} \otimes u^{(b')}_{r_{b'}} \otimes \cdots \big) \big\}$ are components for each
$k = 2, \ldots, M$ and $a', b', \ldots \in D_k$.
The following inequality is trivial:
$$\|X_{(D_1,\ldots,D_M)}\|_\star = \min\left\{ \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j \,\middle|\, X_{(D_1,\ldots,D_M)} = \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j\, u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Mj},\ \|u_{kj}\|_2 = 1,\ \gamma_j \ge \gamma_{j+1} > 0 \right\} \le \frac{\prod_{k=1}^{K} r_k}{\max_{j=1,\ldots,M} \prod_{i \in D_j} r_i}\, \gamma_1.$$
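When $M = 2$, the reshaped tensor is a matrix, so both sides of the lemma are computable and the bound can be spot-checked numerically. A small sketch under the assumption of an order-4 tensor with equal mode size $n$ (all variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, ranks = 8, (2, 3, 2, 3)
C = rng.standard_normal(ranks)
U = [np.linalg.qr(rng.standard_normal((n, r)))[0] for r in ranks]
X = np.einsum('abcd,ia,jb,kc,ld->ijkl', C, U[0], U[1], U[2], U[3])

D = ([0, 1], [2, 3])                               # reshaping sets D_1, D_2
s = np.linalg.svd(X.reshape(n * n, n * n), compute_uv=False)
nuclear = s.sum()                                  # ||X_(D1,D2)||_*
bound = np.prod(ranks) / max(np.prod([ranks[i] for i in Dj]) for Dj in D) * s[0]
print(nuclear <= bound + 1e-9, nuclear, bound)     # here the bound is 6 * gamma_1
```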
Lemma 4 Let $\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_K}$ be a $K$-mode tensor with a CP rank of $r_{\mathrm{cp}}$. We
consider a set of $M$ reshaping sets $D_i$, $i = 1, \ldots, M$. Then
$$\|X_{(D_1,\ldots,D_M)}\|_\star \le r_{\mathrm{cp}}\, \gamma_1,$$
where $\gamma_i$ is the $i$th singular value of $X_{(D_1,\ldots,D_M)}$.
Proof Let us consider $\mathcal{X}$ as
$$\mathcal{X} = \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j\, u_{1j} \otimes u_{2j} \otimes \cdots \otimes u_{Kj},$$
with $\|u_{kj}\|_2 = 1$ and $\gamma_j \ge \gamma_{j+1} > 0$. For the reshaping sets $(D_1, \ldots, D_M)$, we rearrange $\mathcal{X}$ as
$$X_{(D_1,\ldots,D_M)} = \sum_{j=1}^{r_{\mathrm{cp}}} \gamma_j\, (\circ_{i_1 \in D_1} u_{i_1 j}) \otimes (\circ_{i_2 \in D_2} u_{i_2 j}) \otimes \cdots \otimes (\circ_{i_M \in D_M} u_{i_M j}),$$
where $a \circ b = [a_1 b^\top, a_2 b^\top, \ldots, a_n b^\top]^\top$ is the Khatri–Rao product [12]. It is easy to verify that
$\mathrm{vec}((a \circ b) \otimes (c \circ d)) = \mathrm{vec}(a \otimes b \otimes c \otimes d)$, which indicates that $\mathrm{vec}(\mathcal{X}) = \mathrm{vec}(X_{(D_1,\ldots,D_M)})$.
Using the fact that $\mathrm{Rank}(a \otimes b) \le \mathrm{Rank}(a)\mathrm{Rank}(b)$ from [12], we have
$$\mathrm{Rank}(X_{(D_1,\ldots,D_M)}) \le \mathrm{Rank}(\mathcal{X}) = r_{\mathrm{cp}}.$$
This leads to the final observation:
$$\|X_{(D_1,\ldots,D_M)}\|_\star \le r_{\mathrm{cp}}\, \gamma_1.$$
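The vectorization identity used here is easy to check numerically. A minimal sketch, with $\circ$ realized as `np.kron` and row-major vectorization (the identity as stated presumably assumes a compatible vectorization order):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c, d = (rng.standard_normal(n) for n in (2, 3, 4, 5))

T = np.einsum('i,j,k,l->ijkl', a, b, c, d)   # a (x) b (x) c (x) d
M = np.outer(np.kron(a, b), np.kron(c, d))   # (a o b) (x) (c o d)

# vec((a o b) (x) (c o d)) == vec(a (x) b (x) c (x) d) under row-major vec
print(np.allclose(T.ravel(), M.ravel()))     # True
```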
In order to prove the Rademacher complexities in Theorems 1 and 2, we use the following
lemma from [19].

Lemma 5 (Raskutti, Chen, and Yuan, 2015) Consider a $K$-mode tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_K}$
with entries drawn from an i.i.d. Gaussian tensor ensemble. Then
$$\mathbb{E}\|\mathcal{X}\|_{\mathrm{op}} \le 4 \log(4K) \sqrt{\sum_{k=1}^{K} n_k}.$$
Given a tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_K}$ with Gaussian entries, we can write
$$\mathbb{E}\|\mathcal{X}\|_{\mathrm{op}} = \mathbb{E}\Big\| \sum_{i_1,i_2,\ldots,i_K} X_{i_1,i_2,\ldots,i_K}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_K} \Big\|_{\mathrm{op}},$$
where $e_{i_k}$ is the vector with 1 at the $i_k$th element and zeros elsewhere. Due to
each $X_{i_1,i_2,\ldots,i_K}$ being a symmetric Gaussian entry, we have
$$\mathbb{E}\|\mathcal{X}\|_{\mathrm{op}} = \mathbb{E}_g \mathbb{E}_\epsilon \Big\| \sum_{i_1,i_2,\ldots,i_K} \epsilon_{i_1,i_2,\ldots,i_K} |X_{i_1,i_2,\ldots,i_K}|\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_K} \Big\|_{\mathrm{op}},$$
where $\epsilon_{i_1,i_2,\ldots,i_K} \in \{-1, 1\}$. Using Jensen's inequality, we have
$$\mathbb{E}_g \mathbb{E}_\epsilon \Big\| \sum_{i_1,i_2,\ldots,i_K} \epsilon_{i_1,i_2,\ldots,i_K} |X_{i_1,i_2,\ldots,i_K}|\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_K} \Big\|_{\mathrm{op}} \ge \mathbb{E}_\epsilon \Big\| \sum_{i_1,i_2,\ldots,i_K} \epsilon_{i_1,i_2,\ldots,i_K}\, \mathbb{E}_g|X_{i_1,i_2,\ldots,i_K}|\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_K} \Big\|_{\mathrm{op}} = \sqrt{\frac{2}{\pi}}\, \mathbb{E}_\epsilon \Big\| \sum_{i_1,i_2,\ldots,i_K} \epsilon_{i_1,i_2,\ldots,i_K}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_K} \Big\|_{\mathrm{op}}.$$
This shows that we can use Lemma 5 to bound the operator norm of tensors with Bernoulli random variables.
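This comparison can be illustrated by Monte Carlo on small tensors, reusing the heuristic `tensor_op_norm` from the sketch in Appendix A (so the values are approximate); the last printed number is the Lemma 5 bound $4\log(4K)\sqrt{\sum_k n_k}$ for reference.

```python
import numpy as np

rng = np.random.default_rng(2)
shape, trials = (6, 7, 8), 20
gauss = np.mean([tensor_op_norm(rng.standard_normal(shape)) for _ in range(trials)])
rade = np.mean([tensor_op_norm(rng.choice([-1.0, 1.0], size=shape))
                for _ in range(trials)])
K = len(shape)
print(gauss >= np.sqrt(2 / np.pi) * rade,          # the Jensen comparison above
      4 * np.log(4 * K) * np.sqrt(sum(shape)))     # Lemma 5 bound
```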
Next, we give the detailed proof of Theorem 1.

Proof of Theorem 1: We expand the Rademacher complexity in (5) as
$$\mathcal{R}_S(l \circ \mathcal{W}) = \mathbb{E}_\sigma \frac{1}{|S|} \sup_{W \in \mathcal{W}} \sum_{i_1,\ldots,i_K} \Sigma_{i_1,\ldots,i_K}\, l(X_{i_1,\ldots,i_K}, W_{i_1,\ldots,i_K}),$$
where $\Sigma_{i_1,\ldots,i_K} = \sigma_j$ when $(i_1,\ldots,i_K)$ is the $j$th sample in $S$, and $\Sigma_{i_1,\ldots,i_K} = 0$ otherwise.
We analyze the Rademacher complexity as
$$\begin{aligned} \mathcal{R}_S(l \circ \mathcal{W}) &= \mathbb{E}_\sigma \frac{1}{|S|} \sup_{W \in \mathcal{W}} \sum_{i_1,\ldots,i_K} \Sigma_{i_1,\ldots,i_K}\, l(X_{i_1,\ldots,i_K}, W_{i_1,\ldots,i_K}) \\ &\le \mathbb{E}_\sigma \frac{1}{|S|} \sup_{W \in \mathcal{W}} \sum_{i_1,\ldots,i_K} \Sigma_{i_1,\ldots,i_K}\, W_{i_1,\ldots,i_K} \quad \text{(Rademacher contraction)} \\ &\le \frac{1}{|S|}\, \mathbb{E}_\sigma \sup_{W \in \mathcal{W}} \|W_{(D_1,\ldots,D_M)}\|_\star\, \|\Sigma_{(D_1,\ldots,D_M)}\|_{\star*} \quad \text{(duality relationship)}. \end{aligned} \tag{9}$$
(a) Given that the tensor has a multilinear rank of $(r_1, \ldots, r_K)$, using Lemma 3 we
know that
$$\|W_{(D_1,\ldots,D_M)}\|_\star \le \frac{\prod_{k=1}^{K} r_k}{\max_{j=1,\ldots,M} \prod_{i \in D_j} r_i}\, \gamma_1(W_{(D_1,\ldots,D_M)}). \tag{10}$$
Using Lemma 5, we can bound $\mathbb{E}_\sigma \|\Sigma_{(D_1,\ldots,D_M)}\|_{\star*}$ as
$$\mathbb{E}_\sigma \|\Sigma_{(D_1,\ldots,D_M)}\|_{\star*} \le 4 \log(4M) \sqrt{\sum_{j=1}^{M} \prod_{p \in D_j} n_p}. \tag{11}$$
By substituting (10) and (11) into (9), we obtain the following bound:
$$\mathcal{R}_S(l \circ \mathcal{W}) \le \frac{c\Lambda}{|S|}\, \frac{\prod_{k=1}^{K} r_k}{\max_{j=1,\ldots,M} \prod_{i \in D_j} r_i}\, \gamma_1(W_{(D_1,\ldots,D_M)})\, \log(4M) \sqrt{\sum_{j=1}^{M} \prod_{p \in D_j} n_p}. \tag{12}$$
(b) Given that the tensor has a CP rank of $r_{\mathrm{cp}}$, using Lemma 4 we have
$$\|W_{(D_1,\ldots,D_M)}\|_\star \le r_{\mathrm{cp}}\, \gamma_1(W_{(D_1,\ldots,D_M)}). \tag{13}$$
From Lemma 5, we have
$$\mathbb{E}_\sigma \|\Sigma_{(D_1,\ldots,D_M)}\|_{\star*} \le 4 \log(4M) \sqrt{\sum_{j=1}^{M} \prod_{p \in D_j} n_p}. \tag{14}$$
By substituting (13) and (14) into (9), we obtain the desired bound:
$$\mathcal{R}_S(l \circ \mathcal{W}) \le \frac{c\Lambda}{|S|}\, r_{\mathrm{cp}}\, \gamma_1(W_{(D_1,\ldots,D_M)})\, \log(4M) \sqrt{\sum_{j=1}^{M} n^{|D_j|}}. \tag{15}$$
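For reference, the right-hand sides of (12) and (15) are straightforward to evaluate numerically. The sketch below does so with the constants $c$ and $\Lambda$ folded into a single illustrative argument (`theorem1_bound` is our name, not the paper's).

```python
import numpy as np

def theorem1_bound(n, ranks, D, S_size, gamma1, c_lambda=1.0, r_cp=None):
    """Right-hand side of (12) (multilinear-rank case) or, if r_cp is given,
    of (15) (CP-rank case). n maps each mode index to its dimension."""
    M = len(D)
    size = np.sqrt(sum(np.prod([n[p] for p in Dj]) for Dj in D))
    if r_cp is None:
        rank = np.prod(ranks) / max(np.prod([ranks[i] for i in Dj]) for Dj in D)
    else:
        rank = r_cp
    return c_lambda / S_size * rank * gamma1 * np.log(4 * M) * size

# Example: order-4 tensor, n_k = 10, reshaped into a 100 x 100 matrix.
print(theorem1_bound([10] * 4, (2, 3, 2, 3), ([0, 1], [2, 3]), 500, 1.0))
```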
Next, we give the proof of Theorem 2.

Proof of Theorem 2: We expand the Rademacher complexity in (5) using the latent tensors
$W^{(1)}, \ldots, W^{(G)}$ for the reshaped latent tensor nuclear norm as
$$\mathcal{R}_S(l \circ (W^{(1)} + \cdots + W^{(G)})) = \mathbb{E}_\sigma \frac{1}{|S|} \sup_{W^{(1)} + \cdots + W^{(G)} = W \in \mathcal{W}_{rl}} \sum_{i_1,\ldots,i_K} \Sigma_{i_1,\ldots,i_K}\, l(X_{i_1,\ldots,i_K}, W_{i_1,\ldots,i_K}),$$
where $\Sigma_{i_1,\ldots,i_K} = \sigma_j$ when $(i_1,\ldots,i_K)$ is the $j$th sample in $S$, and $\Sigma_{i_1,\ldots,i_K} = 0$ otherwise.
We can analyze the Rademacher complexity as
$$\begin{aligned} \mathcal{R}_S(l \circ (W^{(1)} + \cdots + W^{(G)})) &= \mathbb{E}_\sigma \frac{1}{|S|} \sup_{W^{(1)} + \cdots + W^{(G)} = W \in \mathcal{W}_{rl}} \sum_{i_1,\ldots,i_K} \Sigma_{i_1,\ldots,i_K}\, l(X_{i_1,\ldots,i_K}, W_{i_1,\ldots,i_K}) \\ &\le \mathbb{E}_\sigma \frac{1}{|S|} \sup_{W^{(1)} + \cdots + W^{(G)} = W \in \mathcal{W}_{rl}} \sum_{i_1,\ldots,i_K} \Sigma_{i_1,\ldots,i_K}\, W_{i_1,\ldots,i_K} \quad \text{(Rademacher contraction)} \\ &\le \frac{1}{|S|}\, \mathbb{E}_\sigma \sup_{W^{(1)} + \cdots + W^{(G)} = W \in \mathcal{W}_{rl}} \|W\|_{r\,\mathrm{latent}}\, \|\Sigma\|_{r\,\mathrm{latent}*} \quad \text{(duality relationship)}. \end{aligned} \tag{16}$$
(a) For a tensor with multilinear rank, using Lemma 3, we obtain
$$\|W\|_{r\,\mathrm{latent}} = \inf_{W^{(1)} + \cdots + W^{(G)} = W} \sum_{g=1}^{G} \big\| W^{(g)}_{(D^{(g)})} \big\|_\star \le \min_{g \in \{1,\ldots,G\}} \frac{\prod_{k=1}^{K} r_k}{\max_{j=1,\ldots,M} \ldots}$$