Journal article

Fully Connected Networks on a Diet With the Mediterranean Matrix Multiplication

Hassan Eshkiki, Benjamin Mora, Xianghua Xie

IEEE Transactions on Neural Networks and Learning Systems, Volume: 35, Issue: 1, Pages: 634 - 647

Swansea University Authors: Hassan Eshkiki, Benjamin Mora, Xianghua Xie

  • 60038.VoR.pdf

    PDF | Version of Record

    © 2022 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License.



Published in: IEEE Transactions on Neural Networks and Learning Systems
ISSN: 2162-237X (print), 2162-2388 (online)
Published: Institute of Electrical and Electronics Engineers (IEEE), 2024

URI: https://cronfa.swan.ac.uk/Record/cronfa60038
Abstract: This article proposes the Mediterranean matrix multiplication, a new, simple and practical randomized algorithm that samples angles between the rows and columns of two matrices of sizes m×n and n×p to approximate matrix multiplication in O(k(mn+np+mp)) steps, where k is a constant related only to the desired precision. The number of instructions carried out is mainly bounded by bitwise operators, amenable to a simplified processing architecture and compressed matrix weights. Results show that the method is superior in size and number of operations to the standard approximation with signed matrices. Equally important, this article demonstrates a first application to machine learning inference by showing that the weights of fully connected layers can be compressed between 30× and 100× with little to no loss in inference accuracy. The requirement for pure floating-point operations is also reduced, as our algorithm relies mainly on simpler bitwise operators.
Keywords: Complexity theory, Approximation algorithms, Neural networks, Monte Carlo methods, Heart, Transforms, Inference algorithms
College: Faculty of Science and Engineering
Funders: This work was supported in part by the UK Government through the Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/R51312X/1 and in part by the Natural Environment Research Council (NERC) under Grant NE/W502911/1.
Issue: 1
Start Page: 634
End Page: 647
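
For readers who want a concrete feel for randomized approximate matrix multiplication, the sketch below implements the classic Monte Carlo column/row-sampling estimator (in the style of Drineas, Kannan, and Mahoney) in Python/NumPy. It is not the Mediterranean algorithm described in the abstract above, which samples angles between rows and columns and relies mainly on bitwise operators; it only illustrates the general idea of trading an exact O(mnp) product for a k-sample approximation. The function name monte_carlo_matmul and all parameter choices are illustrative assumptions, not taken from the article.

```python
# A minimal sketch of randomized approximate matrix multiplication by
# column/row sampling (the standard Monte Carlo estimator). NOTE: this is
# NOT the Mediterranean algorithm from the article above; it only shows the
# general trade-off between sample count k and accuracy.
import numpy as np


def monte_carlo_matmul(A: np.ndarray, B: np.ndarray, k: int, rng=None) -> np.ndarray:
    """Approximate A @ B using k sampled, rescaled column/row outer products.

    A is m x n and B is n x p. The estimator sums k outer products
    A[:, i] B[i, :], so the cost is O(k*m*p) for accumulation plus
    O(n*(m + p)) for the sampling probabilities, instead of O(m*n*p)
    for the exact product.
    """
    rng = np.random.default_rng(rng)
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"

    # Importance-sampling probabilities proportional to |A[:, i]| * |B[i, :]|,
    # which minimize the variance of the estimator.
    weights = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    probs = weights / weights.sum()

    # Draw k column indices and accumulate the rescaled outer products.
    idx = rng.choice(n, size=k, p=probs)
    C = np.zeros((m, p))
    for i in idx:
        C += np.outer(A[:, i], B[i, :]) / (k * probs[i])
    return C


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 500))
    B = rng.standard_normal((500, 300))
    approx = monte_carlo_matmul(A, B, k=100, rng=rng)
    exact = A @ B
    rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
    print(f"relative Frobenius error with k=100 samples: {rel_err:.3f}")
```

Increasing k tightens the approximation at a proportional cost in outer-product accumulations, loosely mirroring the role of the precision constant k in the article's O(k(mn+np+mp)) bound.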