Journal article
Stochastic weight matrix dynamics during learning and Dyson Brownian motion
Physical Review E, Volume: 111, Issue: 1
Swansea University Authors:
Gert Aarts, Biagio Lucini
PDF | Version of Record
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license.
DOI (Published version): 10.1103/physreve.111.015303
Abstract
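We demonstrate that the update of weight matrices in learning algorithms can be described in the framework of Dyson Brownian motion, thereby inheriting many features of random matrix theory. We relate the level of stochasticity to the ratio of the learning rate and the minibatch size, providing more robust evidence to a previously conjectured scaling relationship. We discuss universal and nonuniversal features in the resulting Coulomb gas distribution and identify the Wigner surmise and Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian restricted Boltzmann machine.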
| Published in: | Physical Review E |
|---|---|
| ISSN: | 2470-0045 2470-0053 |
| Published: | American Physical Society (APS) 2025 |
| Online Access: | Check full text |
| URI: | https://cronfa.swan.ac.uk/Record/cronfa68607 |
| first_indexed | 2025-01-09T20:33:57Z |
|---|---|
| last_indexed | 2025-01-27T20:29:55Z |
| id | cronfa68607 |
| recordtype | SURis |
| fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2025-01-27T13:58:16.0162628</datestamp><bib-version>v2</bib-version><id>68607</id><entry>2024-12-19</entry><title>Stochastic weight matrix dynamics during learning and Dyson Brownian motion</title><swanseaauthors><author><sid>1ba0dad382dfe18348ec32fc65f3f3de</sid><ORCID>0000-0002-6038-3782</ORCID><firstname>Gert</firstname><surname>Aarts</surname><name>Gert Aarts</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>7e6fcfe060e07a351090e2a8aba363cf</sid><ORCID>0000-0001-8974-8266</ORCID><firstname>Biagio</firstname><surname>Lucini</surname><name>Biagio Lucini</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-12-19</date><deptcode>BGPS</deptcode><abstract>We demonstrate that the update of weight matrices in learning algorithms can be described in the framework of Dyson Brownian motion, thereby inheriting many features of random matrix theory. We relate the level of stochasticity to the ratio of the learning rate and the minibatch size, providing more robust evidence to a previously conjectured scaling relationship. 
We discuss universal and nonuniversal features in the resulting Coulomb gas distribution and identify the Wigner surmise and Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian restricted Boltzmann machine.</abstract><type>Journal Article</type><journal>Physical Review E</journal><volume>111</volume><journalNumber>1</journalNumber><paginationStart/><paginationEnd/><publisher>American Physical Society (APS)</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>2470-0045</issnPrint><issnElectronic>2470-0053</issnElectronic><keywords/><publishedDay>8</publishedDay><publishedMonth>1</publishedMonth><publishedYear>2025</publishedYear><publishedDate>2025-01-08</publishedDate><doi>10.1103/physreve.111.015303</doi><url/><notes/><college>COLLEGE NAME</college><department>Biosciences Geography and Physics School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>BGPS</DepartmentCode><institution>Swansea University</institution><apcterm>SU Library paid the OA fee (TA Institutional Deal)</apcterm><funders>G. A. and B. L. are supported by STFC Consolidated Grant No. ST/X000648/1. B. L. is further supported by the UKRI EPSRC ExCALIBUR ExaTEPP Project No. EP/X017168/1. C. P.
is supported by the UKRI AIMLAC CDT EP/S023992/1.</funders><projectreference/><lastEdited>2025-01-27T13:58:16.0162628</lastEdited><Created>2024-12-19T15:46:22.6943315</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Biosciences, Geography and Physics - Biosciences</level></path><authors><author><firstname>Gert</firstname><surname>Aarts</surname><orcid>0000-0002-6038-3782</orcid><order>1</order></author><author><firstname>Biagio</firstname><surname>Lucini</surname><orcid>0000-0001-8974-8266</orcid><order>2</order></author><author><firstname>Chanju</firstname><surname>Park</surname><orcid>0009-0009-2750-6080</orcid><order>3</order></author></authors><documents><document><filename>68607__33417__40defca735da42ccb212234dcc30f882.pdf</filename><originalFilename>68607.VoR.pdf</originalFilename><uploaded>2025-01-27T13:55:38.4566864</uploaded><type>Output</type><contentLength>937765</contentLength><contentType>application/pdf</contentType><version>Version of Record</version><cronfaStatus>true</cronfaStatus><documentNotes>Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807> |
| title | Stochastic weight matrix dynamics during learning and Dyson Brownian motion |
| author_id_str_mv | 1ba0dad382dfe18348ec32fc65f3f3de 7e6fcfe060e07a351090e2a8aba363cf |
| author_id_fullname_str_mv | 1ba0dad382dfe18348ec32fc65f3f3de_***_Gert Aarts 7e6fcfe060e07a351090e2a8aba363cf_***_Biagio Lucini |
| author | Gert Aarts Biagio Lucini |
| author2 | Gert Aarts Biagio Lucini Chanju Park |
| format | Journal article |
| container_title | Physical Review E |
| container_volume | 111 |
| container_issue | 1 |
| publishDate | 2025 |
| institution | Swansea University |
| issn | 2470-0045 2470-0053 |
| doi_str_mv | 10.1103/physreve.111.015303 |
| publisher | American Physical Society (APS) |
| college_str | Faculty of Science and Engineering |
| hierarchytype | |
| hierarchy_top_id | facultyofscienceandengineering |
| hierarchy_top_title | Faculty of Science and Engineering |
| hierarchy_parent_id | facultyofscienceandengineering |
| hierarchy_parent_title | Faculty of Science and Engineering |
| department_str | School of Biosciences, Geography and Physics - Biosciences{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Biosciences, Geography and Physics - Biosciences |
| document_store_str | 1 |
| active_str | 0 |
| published_date | 2025-01-08T05:20:09Z |

