Fast Bilinear Algorithms for Symmetric Tensor Contractions

  • 1 Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
  • 2 Department of Electrical Engineering and Computer Science and Department of Mathematics, University of California, Berkeley, Berkeley, CA, USA
Edgar Solomonik (ORCID iD: https://orcid.org/0000-0002-6480-9066) and James Demmel

Abstract

In matrix-vector multiplication, matrix symmetry does not permit a straightforward reduction in computational cost. More generally, in contractions of symmetric tensors, the symmetries are not preserved in the usual algebraic form of contraction algorithms. We introduce an algorithm that reduces the bilinear complexity (number of computed elementwise products) for most types of symmetric tensor contractions. In particular, it lowers the bilinear complexity of symmetrized contractions of symmetric tensors of order s+v and v+t by a factor of (s+t+v)!/(s! t! v!) to leading order. The algorithm computes a symmetric tensor of bilinear products, then subtracts unwanted parts of its partial sums. Special cases of this algorithm provide improvements to the bilinear complexity of the multiplication of a symmetric matrix and a vector, the symmetrized vector outer product, and the symmetrized product of symmetric matrices. While the algorithm requires more additions for each elementwise product, the total number of operations is in some cases lower than that of classical algorithms, for tensors of any size. We provide a round-off error analysis of the algorithm and demonstrate that the error is not too large in practice. Finally, we provide an optimized implementation for one variant of the symmetry-preserving algorithm, which achieves speedups of up to 4.58× for a particular tensor contraction, relative to a classical approach that casts the problem as a matrix-matrix multiplication.
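To make the idea of "a symmetric tensor of bilinear products, then subtracting unwanted parts of its partial sums" concrete, the following is a minimal sketch for the simplest special case, a symmetric matrix times a vector. It is one plausible formulation written for illustration and may differ in detail from the authors' own; the function name `symv_fewer_products` and the sample data are assumptions. The symmetric intermediate is Z[i][j] = A[i][j] * (b[i] + b[j]), which needs one product per unordered pair (i, j). Summing a row of Z gives the desired entry of the result plus the unwanted term (sum over j of A[i][j]) * b[i], which is removed with one further product per row.

```python
def symv_fewer_products(A, b):
    """Multiply a symmetric matrix A (list of lists) by a vector b.

    Returns (c, products), where products counts elementwise
    multiplications: n(n+1)/2 for the symmetric intermediate plus n
    for the corrections, versus n*n for the classical algorithm.
    """
    n = len(b)
    products = 0

    # Symmetric intermediate, computed once per unordered pair i <= j.
    Z = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            Z[i][j] = Z[j][i] = A[i][j] * (b[i] + b[j])
            products += 1

    # Row sums of Z contain the extra term (sum_j A[i][j]) * b[i];
    # subtract it using one more product per row.
    c = []
    for i in range(n):
        row_sum = sum(A[i])  # additions only
        c.append(sum(Z[i]) - row_sum * b[i])
        products += 1

    return c, products


# Hypothetical sample data: a 4x4 symmetric integer matrix.
A = [[2, 1, 0, 3],
     [1, 4, 5, 1],
     [0, 5, 6, 2],
     [3, 1, 2, 7]]
b = [1, 2, 3, 4]
c, products = symv_fewer_products(A, b)
```

For this case (s = 1, t = 0, v = 1) the factor from the abstract is 2!/(1! 0! 1!) = 2, matching the reduction from n² products to roughly n²/2 for large n. The sketch trades multiplications for extra additions, as the abstract notes.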
