Journal article

Stochastic weight matrix dynamics during learning and Dyson Brownian motion

Gert Aarts, Biagio Lucini, Chanju Park

Physical Review E, Volume: 111, Issue: 1

Swansea University Authors: Gert Aarts, Biagio Lucini

  • 68607.VoR.pdf

    PDF | Version of Record

    Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license.

    Download (915.79KB)

Abstract

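We demonstrate that the update of weight matrices in learning algorithms can be described in the framework of Dyson Brownian motion, thereby inheriting many features of random matrix theory. We relate the level of stochasticity to the ratio of the learning rate and the minibatch size, providing more robust evidence to a previously conjectured scaling relationship. We discuss universal and nonuniversal features in the resulting Coulomb gas distribution and identify the Wigner surmise and Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian restricted Boltzmann machine.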
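
The Wigner surmise named in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' method and does not reproduce their teacher-student or restricted Boltzmann machine experiments. It assumes a plain 2x2 real symmetric Gaussian matrix, for which the normalized eigenvalue spacing s follows P(s) = (pi/2) s exp(-pi s^2 / 4). All function names are hypothetical and chosen for illustration only.

```python
import math
import random


def goe_2x2_spacing(rng, sigma=1.0):
    """Eigenvalue spacing of a random real symmetric matrix [[a, b], [b, c]].

    Assumption: diagonal entries have variance sigma^2 and the off-diagonal
    entry has variance sigma^2 / 2 (the usual Gaussian orthogonal ensemble).
    """
    a = rng.gauss(0.0, sigma)
    c = rng.gauss(0.0, sigma)
    b = rng.gauss(0.0, sigma / math.sqrt(2.0))
    # Eigenvalues are (a + c)/2 +/- sqrt(((a - c)/2)^2 + b^2),
    # so their difference is sqrt((a - c)^2 + 4 b^2).
    return math.sqrt((a - c) ** 2 + 4.0 * b ** 2)


def wigner_surmise_cdf(s):
    """Cumulative distribution of P(s) = (pi/2) s exp(-pi s^2 / 4)."""
    return 1.0 - math.exp(-math.pi * s * s / 4.0)


def normalized_spacings(n_samples=200_000, seed=0):
    """Sample spacings and rescale them so that their mean equals one."""
    rng = random.Random(seed)
    spacings = [goe_2x2_spacing(rng) for _ in range(n_samples)]
    mean = sum(spacings) / len(spacings)
    return [s / mean for s in spacings]


def empirical_cdf(samples, s):
    """Fraction of samples that are less than or equal to s."""
    return sum(1 for x in samples if x <= s) / len(samples)


if __name__ == "__main__":
    samples = normalized_spacings()
    for s in (0.25, 0.5, 1.0, 2.0):
        print(s, empirical_cdf(samples, s), wigner_surmise_cdf(s))
```

Running the script prints the sampled and analytic cumulative distributions side by side. The small weight at small s reflects the eigenvalue repulsion that the Coulomb gas picture describes.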

Published in: Physical Review E
ISSN: 2470-0045 2470-0053
Published: American Physical Society (APS) 2025

URI: https://cronfa.swan.ac.uk/Record/cronfa68607
first_indexed 2025-01-09T20:33:57Z
last_indexed 2025-01-27T20:29:55Z
id cronfa68607
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2025-01-27T13:58:16.0162628</datestamp><bib-version>v2</bib-version><id>68607</id><entry>2024-12-19</entry><title>Stochastic weight matrix dynamics during learning and Dyson Brownian motion</title><swanseaauthors><author><sid>1ba0dad382dfe18348ec32fc65f3f3de</sid><ORCID>0000-0002-6038-3782</ORCID><firstname>Gert</firstname><surname>Aarts</surname><name>Gert Aarts</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>7e6fcfe060e07a351090e2a8aba363cf</sid><ORCID>0000-0001-8974-8266</ORCID><firstname>Biagio</firstname><surname>Lucini</surname><name>Biagio Lucini</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-12-19</date><deptcode>BGPS</deptcode><abstract>We demonstrate that the update of weight matrices in learning algorithms can be described in the framework of Dyson Brownian motion, thereby inheriting many features of random matrix theory. We relate the level of stochasticity to the ratio of the learning rate and the minibatch size, providing more robust evidence to a previously conjectured scaling relationship. 
We discuss universal and nonuniversal features in the resulting Coulomb gas distribution and identify the Wigner surmise and Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian restricted Boltzmann machine.</abstract><type>Journal Article</type><journal>Physical Review E</journal><volume>111</volume><journalNumber>1</journalNumber><paginationStart/><paginationEnd/><publisher>American Physical Society (APS)</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>2470-0045</issnPrint><issnElectronic>2470-0053</issnElectronic><keywords/><publishedDay>8</publishedDay><publishedMonth>1</publishedMonth><publishedYear>2025</publishedYear><publishedDate>2025-01-08</publishedDate><doi>10.1103/physreve.111.015303</doi><url/><notes/><college>COLLEGE NANME</college><department>Biosciences Geography and Physics School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>BGPS</DepartmentCode><institution>Swansea University</institution><apcterm>SU Library paid the OA fee (TA Institutional Deal)</apcterm><funders>G. A. and B. L. are supported by STFC Consolidated Grant No. ST/X000648/1. B. L. is further supported by the UKRI EPSRC ExCALIBUR ExaTEPP Project No. EP/X017168/1. C. P. is supported by the UKRI AIMLAC CDT EP/S023992/1.</funders><projectreference/><lastEdited>2025-01-27T13:58:16.0162628</lastEdited><Created>2024-12-19T15:46:22.6943315</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Biosciences, Geography and Physics - Biosciences</level></path><authors><author><firstname>Gert</firstname><surname>Aarts</surname><orcid>0000-0002-6038-3782</orcid><order>1</order></author><author><firstname>Biagio</firstname><surname>Lucini</surname><orcid>0000-0001-8974-8266</orcid><order>2</order></author><author><firstname>Chanju</firstname><surname>Park</surname><orcid>0009-0009-2750-6080</orcid><order>3</order></author></authors><documents><document><filename>68607__33417__40defca735da42ccb212234dcc30f882.pdf</filename><originalFilename>68607.VoR.pdf</originalFilename><uploaded>2025-01-27T13:55:38.4566864</uploaded><type>Output</type><contentLength>937765</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807>
title Stochastic weight matrix dynamics during learning and Dyson Brownian motion
spellingShingle Stochastic weight matrix dynamics during learning and Dyson Brownian motion
Gert Aarts
Biagio Lucini
author_id_str_mv 1ba0dad382dfe18348ec32fc65f3f3de
7e6fcfe060e07a351090e2a8aba363cf
author_id_fullname_str_mv 1ba0dad382dfe18348ec32fc65f3f3de_***_Gert Aarts
7e6fcfe060e07a351090e2a8aba363cf_***_Biagio Lucini
author Gert Aarts
Biagio Lucini
author2 Gert Aarts
Biagio Lucini
Chanju Park
format Journal article
container_title Physical Review E
container_volume 111
container_issue 1
publishDate 2025
institution Swansea University
issn 2470-0045
2470-0053
doi_str_mv 10.1103/physreve.111.015303
publisher American Physical Society (APS)
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Biosciences, Geography and Physics - Biosciences{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Biosciences, Geography and Physics - Biosciences
document_store_str 1
active_str 0
description We demonstrate that the update of weight matrices in learning algorithms can be described in the framework of Dyson Brownian motion, thereby inheriting many features of random matrix theory. We relate the level of stochasticity to the ratio of the learning rate and the minibatch size, providing more robust evidence to a previously conjectured scaling relationship. We discuss universal and nonuniversal features in the resulting Coulomb gas distribution and identify the Wigner surmise and Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian restricted Boltzmann machine.
published_date 2025-01-08T05:20:09Z
_version_ 1851731539089424384
score 11.089864