Towards glueball masses of large-N SU(N) pure-gauge theories without topological freezing

In commonly used Monte Carlo algorithms for lattice gauge theories the integrated autocorrelation time of the topological charge is known to be exponentially-growing as the continuum limit is approached. This topological freezing , whose severity increases with the size of the gauge group, can result in potentially large systematics. To provide a direct quantiﬁcation of the latter, we focus on SU(6) Yang–Mills theory at a lattice spacing for which conventional methods associated to the decorrelation of the topological charge have an unbearable computational cost. We adopt the recently proposed parallel tempering on boundary conditions algorithm, which has been shown to remove systematic eﬀects related to topological freezing, and compute glueball masses with a typical accuracy of 2 − 5%. We observe no sizeable systematic eﬀect in the mass of the ﬁrst lowest-lying glueball states, with respect to calculations performed at nearly-frozen topological sector.


Introduction
Based on the confining properties of QCD, it is predicted that gaugeinvariant bound states made of gluons alone, called glueballs, should appear in the spectrum as asymptotic states.So far, glueballs have eluded experimental detection, although candidate events have been recently found [1].From the theoretical point of view, during last decades several studies have appeared in the literature where glueball masses are computed in the nonperturbative setting provided by numerical lattice field theory simulations, where many intriguing predictions can been derived, both in relation to QCD [2,3,4,5,6,7,8] or to possible Standard Model extensions [9].Regarding QCD glueballs, the majority of such predictions has been obtained for quarkless pure-gauge theories in the large number of colors (N) limit N → ∞.Large-N pure SU(N) gauge theories provide a reasonable approximation of real-world N = 3 QCD [10], as finite-N corrections are suppressed as powers of 1/N 2 , and enable us to avoid some technical complications (e.g., all glueballs are exactly non-interacting and have infinite lifetime at N = ∞).
The extraction of glueball masses from lattice simulations at large N is a non-trivial task and several sources of systematic errors have to be addressed to obtain reliable results.In the last decades, enormous progress has been made in the development and the refinement of the relevant techniques [11,12,13,14,15,16,2,3,17,4].Nevertheless, systematic effects related to topological freezing [18] have never been addressed in a satisfactory way, so far.
Standard local updating algorithms suffer from non-ergodicity in the vicinity of the continuum limit: as a → 0, the Markov chain of configurations explored by the system tends to remain trapped in a fixed topological sector.This problem becomes exponentially more severe as N is increased.When N is large, the evolution of the topological charge along the Monte Carlo trajectory freezes already for coarse lattice spacings [19,20,21,22].In particular, there is ample numerical evidence that the autocorrelation time of the topological charge τ (Q) diverges exponentially as a function of 1/a and/or N [20,21,22], thus making ergodic exploration of different topological sectors rapidly unfeasible as the continuum limit and/or the large-N limit are approached.In practice, when N ≥ 6 and a ∼ 0.1 fm or below, essentially very few to no fluctuations of Q are observed during reasonably-long Monte Carlo histories 1 .
Since there is theoretical evidence that computing glueball masses on a fixed topological sector may introduce a bias [24], it is of utmost importance to check that any systematic error related to the restriction in a fixed topological sector is under control within the typical precision achieved in actual simulations.The effect of fixed topology has however never been systematically probed on large-N glueball mass computations from the lattice.
In this letter we make a first step in this direction by removing any systematic effect related to the freezing problem through the Parallel Tempering on Boundary Conditions (PTBC).The PTBC algorithm was proposed by M. Hasenbusch [25] for 2d large-N CP N −1 models and was recently employed both in the latter case and in large-N SU(N) pure-gauge theories [26,27] to improve state of the art of large-N topology from the lattice.In particular, the PTBC algorithm has been shown to provide a dramatic enhancement compared to standard algorithms when looking at the evolution of the topological charge Q, allowing to achieve a gain of several orders of magnitude in terms of τ (Q). 2  We generated 20k well-decorrelated gauge configurations for SU(6) at a ≃ 0.0938 fm adopting the PTBC algorithm.These configurations were then used to compute glueball masses for the first few lightest states using standard methods.Since the PTBC algorithm is designed to restore ergodicity, our simulations frequently explored Q = 0 sectors.Hence, for our model, we were able to provide the first results for glueball masses free of any systematics related to topological freezing.
This letter is organized as follows: in Sec. 2 we describe our numerical setup, in Sec. 3 we show our results for glueball masses and compare them with results obtained with standard algorithms, finally in Sec. 4 we draw our conclusions.

Lattice setup
We consider a collection of N r hypercubic lattice replicas with L 4 sites.Replicas differ from one another only in the boundary conditions imposed on the links on a small sub-region of the lattice, D, which we call the de-1 An exploratory study of the topological charge in the theory with dynamical fermions for 3 ≤ N ≤ 5 has been performed in [23].
2 Other recently proposed algorithms to avoid topological freezing include [28].
fect.Boundary conditions on the defect are chosen in order to interpolate between open boundary conditions (OBC) [29] and periodic boundary conditions (PBC), while are taken periodic elsewhere for every replica.Each replica is evolved independently using standard local algorithms.After each replica has been updated, swaps among different replicas are proposed and accepted/rejected by means of a standard Metropolis test.Iterations over the full lattice are alternated with hierarchical updates over small sub-lattices centered around D to improve the efficiency of the algorithm.
In practice, the lattice action of the r th replica looks like where β is the bare coupling, Π x,µν is the plaquette computed on the gauge configuration of the r th replica and is used to impose boundary conditions on the links crossing orthogonally the defect d }.In our simulations we used L (2) and the defect is kept fixed in the position described here; its position is however effectively moved by translating the periodic copy (which is translation-invariant).Coefficients c(r) interpolate between c(0) = 1 (PBC) and c(N r − 1) = 0 (OBC) and are tuned through short runs to make swap probabilities uniform among different replicas.
Glueball masses are computed on the periodic r = 0 replica using standard techniques, which we here succinctly summarize.We define a variational basis B = {O i (t)} of time-dependent operators, with quantum numbers compatible with the desired glueball state.We only consider zero-momentum operators O i (t) = x O i (t, x), where O i (t, x) are gauge-invariant local operators expressed as traces of products of links taken over closed space-like lattice paths.We also include in B operators obtained from blocked and smeared links.Once B is chosen, to extract the lightest state in the selected channel, we compute C ij (t) = O i (t)O j (0) and, through the Generalized EigenValue method (GEV), we obtain the eigenvector v i related to the largest eigenvalue of the generalized eigenvalue problem The correlator of the best overlapping operator between the vacuum and the desired glueball state is then obtained as C best (t) ≡ C ij (t)v i v j .The glueball mass m is finally obtained in lattice units through a best fit of the expression where the fit is performed over a range where the effective mass shows a plateau.For more details about the glueball mass extraction procedure, the choice of the variational basis for each B channel and the smearing algorithms adopted in the context of glueball mass computations, we refer, e.g., to Refs.[11,12,13,15,2,3,17,4,9,7,8].

Results
We simulated SU(6) at β = 25.452(a ≃ 0.0938 fm) on a 16 4 lattice with a cubic L d = 3 defect.For each of 10 independent runs, we collected 2000 well-decorrelated configurations, stored every 200 parallel tempering steps, after discarding the first 10000 parallel tempering steps for thermalization.A single parallel tempering step is performed as follows: 1.Each replica is updated in parallel with a full lattice sweep of a 4:1 combination of over-relaxation and over-heat-bath algorithms (in the following, this combination will be referred to as "standard updating step").2. Swaps are proposed between replica pairs (r, r + 1), first for r even, then for r odd or viceversa (order decided stochastically).Swaps of odd and even (r, r + 1) pairs are proposed in parallel and accepted with probability After the swap proposals, the periodic r = 0 replica is translated of 1 lattice site along a random direction to effectively move the position of the defect.3.Each replica is updated in parallel with hierarchical sweeps on small sub-lattices centered around the defect.After each hierarchical iteration, the swaps and the r = 0 replica translations are performed as in 2.
It is clear that a single parallel tempering step requires a factor of ∼ N r larger numerical effort compared to a standard updating step.Nonetheless, even when considering this overhead, the obtained gains in terms of decorrelation of the topological charge make the parallel tempering algorithm the obvious method of choice between the two, as it will be manifest in the following.With this implementation, collecting the sample of configurations employed for this study required ∼ 2.3M core-hours on Intel Skylake processors.We chose a uniform ∼ 30% swap probability for all pairs; to reach it we needed N r = 30 replicas.In Fig. 1 we show the behavior of c(r) for every r and the related swap acceptances (left plot above).Choosing approximately uniform swap probabilities among different replica pairs ensured that a given configuration explored uniformly all boundary conditions c(r) in a randomwalk fashion, which is a necessary condition for the correct operation of the PTBC algorithm (left plot below).
Moreover, In Fig. 1 we also show the histogram of the obtained sampling of the topological charge Q (right plot above) and the history of the topological charge evolution in our typical run compared to the evolution obtained with standard algorithms (right plot below).This quantity was computed from the standard clover definition on smoothened configurations, obtained after 20 cooling steps, and rounded to the nearest integer using the so-called alpha-rounding method explained in, e.g., Refs.[30,22,27].The PTBC algorithm is capable of performing an ergodic sampling of the space of configurations with respect to the topological charge, and allows to observe numerous fluctuations of Q in a case where, with standard algorithms, only a handful would be observed, cfr.Fig. 1.The gain in terms of the integrated autocorrelation time of the topological charge τ (Q) is dramatic: while for the standard run we estimate τ std (Q) ∼ 5000, with the PTBC algorithm we find τ PTBC (Q) = 92(8) (where τ PTBC was obtained keeping into account that a single PTBC step requires a numerical effort which is larger by approximately a factor of ∼ N r compared to a standard updating step).
We then employed the generated sample to compute glueball masses for the Ground State (GS) of all R PC channels, with the exception of A −− 1 and A −+ 2 , which appear to be heavier than our ultra-violet cut-off Λ UV ∼ 2/a, and defined the dimensionless ratios m R PC /m A ++ 1 ∼ m J PC /m 0 ++ .Here R stands for a particular representation of the octahedral group, J stands for the corresponding representation of SO(3) in the continuum, and PC stands for the spatial parity and charge conjugation quantum numbers.For the ground states in the R PC channels, we can establish the following correspondence Right plot above: histogram of the topological charge obtained from our configuration sample, generated with the PTBC algorithm.Left plot below: random walk of a configuration through different replicas.Time along horizontal axis is expressed in units of PTBC steps, and the shown time window corresponds to ∼ 10 −2 % of our total statistics.Right plot below: Monte Carlo evolution of the topological charge Q obtained with the PTBC and with the standard algorithms.The time on the horizontal axis is expressed in units of standard updating steps for both algorithms (Monte Carlo time of the PTBC run was rescaled with a factor of N r ), and the shown time window corresponds to ∼ 0.2% of the total statistics collected with the PTBC algorithm.

among representations of the octahedral group and representations of SO(3):
The ratios m R PC /m A ++ 1 were then obtained with a precision of the order of 2 − 5%.We then compared our results for such quantities with those reported in Ref. [8], using am A ++ 1 to fix a common lattice spacing scale.As the sample of configurations analyzed in Ref. [8] was obtained from standard local algorithms, it is affected by severe topological freezing.The comparison was performed as follows.We first extrapolated the finite-a results of Ref. [8] towards the continuum limit by fitting to the data, using c R PC and m J P C /m 0 ++ as fitting parameters.Then, we computed the value of m R PC /m A ++ 1 expected at our value of am A ++ 1 , according to the best fit of Eq. ( 3) above.To allow a comparison, we report in Tab. 1 our results for the mass of the GS in each R PC channel, expressed in lattice units, as well as our determinations for m R PC /m A ++ 1 .The determinations from Ref. [8], using the procedure described above, are also reported in Tab. 1, in the rightmost column.
A comparison can also be made from Fig. 2, where, for brevity, we just show the masses of the first 4 lightest states above 0 ++ (the GS of A ++ 1 ): 2 ++ (obtained from the weighted arithmetic mean between the mass of the GS of the E ++ channel and the mass of the GS of the T ++ 2 channel, which are expected to become degenerate in the continuum limit), 0 −+ (the GS of the A −+ 1 channel), 1 +− (the GS of the T +− 1 channel) and 2 −+ (the GS of the E −+ channel).We stress that no difference was observed in heavier channels compared to the results we are displaying in Fig. 2, cfr.Tab. 1.
As a matter of fact, in none of the explored cases any systematic effect related to topological freezing was observed.Our results always fall on top of the ones obtained by interpolating those of Ref. [8], see Figs. 2. This is a strong indication that, even when focusing on channels with the same quantum numbers PC = −+ as the topological charge (for example, the 0 −+ or the 2 −+ channels), no systematic error on glueball mass determinations related to topological freezing can be appreciated within our ∼ 2 − 5% level of accuracy.
As a final comment, we observe that our error bars are generally larger than those reported in Ref. [8], especially for heavier states.This is related to the procedure adopted to extract glueball masses.Indeed, our uncertainties are dominated by systematic effects related to the exponential fit of Eq. (1).
In principle, the mass m should be obtained by fitting the large-t asymptotic behaviour of C best (t) with a single exponential as in Eq. (1).In practice, the contamination by larger-mass states at small t and the effects of statistical noise at large t hinder this procedure and produce sizable systematic  1: Summary of the obtained results for the mass m R PC of the GS of all accessible R PC channels in lattice units for N = 6 and β = 25.452 on a 16 4 lattice, obtained from gauge configurations generated with the PTBC algorithm.We also compare our results for the ratios m R PC /m A ++ 1 with those obtained interpolating the best fit of Eq. ( 3) to results of Ref. [8] for our value of am A ++ 1 .
errors.The choice of the fitting range [t min , t max ] is thus crucial and is determined as follows.We look for a plateau in the effective mass in Eq. ( 2), which should signal that the single-exponential asymptotic regime has set in.If a plateau can be identified over an interval [t 1 , t 2 ], we set t min = t 1 .The value of t max is then chosen as the largest t ≤ t 2 which allows to obtain a single-exponential best fit in [t min , t max ] with a reasonable value of the χ 2 /dof, where reasonable means that the corresponding p-value is between 5% and 95%.If instead a plateau cannot be clearly identified, we estimate m from an envelope of the quasi-plateau of m eff .We stress that such procedure tends to be harder for states whose mass is close to or above the lattice ultra-violet cut-off.In those cases, the plateau is typically very short, the effects of noise immediately apparent and the systematics more prominent.The net result is that the mass of heavier states tends to be determined less precisely.
The results above were obtained after an expensive computation, as the production of our sample of configurations required a budget of approximately ∼ 2.3M core-hours on the cluster where simulations were run.Unfortunately, our resources did not allow us to improve our statistics further, so as to reach an accuracy comparable to that of Ref. [8] also for heavier states.Nonetheless, we observe a substantial agreement between our results and those in Ref. [8] also for the latter states, confirming the picture that already emerges for lighter states, where our accuracy is mostly of the same order of magnitude as the one achieved in Ref. [8], as can be appreciated from Tab. 1.

Conclusions
In this letter we applied the PTBC algorithm proposed by M. Hasenbusch to perform the first determinations of glueball masses on the lattice at large-N without any systematic effect related to topological freezing.We did so in the pure SU (6) gauge theory, at a ≃ 0.0938 fm.The masses of the first few low-lying glueball states were computed from a sample of 20k well-decorrelated configurations.We compared our results with those obtained from simulations performed with standard local algorithms, and thus affected by severe topological freezing.No systematic effect related to the non-ergodicity of the standard algorithms was observed in the value of glueball masses within our 2 − 5% level of accuracy.This is a first robust indication3 that estimates of glueball masses obtained in a fixed topological sector at large-N can be trusted at up to the few percents level.Moreover, this shows that the PTBC algorithm is a perfectly viable solution to the problem of accurately computing glueball masses at  (left-below plot) and m 2 −+ (right-below plot) to m 0 ++ obtained from configurations generated with the PTBC algorithm (diamond points), compared to those of Ref. [8], obtained from configurations generated with standard local algorithms (empty round points).The correspondence among R PC and J PC channels was done according to Tab. 1. Full round points, dashed lines and shadowed areas represent, respectively, the continuum extrapolation results, linear best fits and related fit errors of data of Ref. [8].
large-N without the effects of topological freezing.This algorithm could be easily adopted in more extensive future studies, both to extend our current results to larger values of N and/or to finer lattice spacings.
Several possible future directions can be explored to further clarify the relationship between glueball mass computations and topological properties.An independent way of probing the sensitivity of glueball masses to the choice of a fixed topological sector is to study their dependence on the dimensionless parameter θ, that couples the global topological charge Q to the standard Yang-Mills action.In particular, the quantity m 2 ≡ d 2 m glueball dθ 2 | θ=0 is expected to control the magnitude of systematics effects related to the restriction of the sample of configurations to a fixed topological sector when computing m glueball [24].The computation of m 2 has been tackled from θ = 0 simulations for the 0 ++ state [20], but only compatible-with-zero determinations have been reported for N ≥ 4.This problem could be re-examined from the point of view of the PTBC algorithm in combination with imaginary-θ simulations, which have been shown to improve the computation of higher-order terms in the θ-expansion [32,22,33,27].Another possible improvement could be achieved by introducing topological operators in the variational basis used in the GEV method, which could help in detecting any possible coupling of glueball states to topological modes.Finally, intriguing insights might come from the application of recent Neural Network techniques [34], for instance by training a Neural Network to distinguish correlators computed from samples of configurations with different global topology.AccelerateAI A100 GPU system, which are part funded by the European Regional Development Fund (ERDF) via Welsh Government.

Figure 1 :
Figure1: Left plot above: choice of c(r) and related swap probabilities.Dotted line represents naive uniform choice of c(r).Right plot above: histogram of the topological charge obtained from our configuration sample, generated with the PTBC algorithm.Left plot below: random walk of a configuration through different replicas.Time along horizontal axis is expressed in units of PTBC steps, and the shown time window corresponds to ∼ 10 −2 % of our total statistics.Right plot below: Monte Carlo evolution of the topological charge Q obtained with the PTBC and with the standard algorithms.The time on the horizontal axis is expressed in units of standard updating steps for both algorithms (Monte Carlo time of the PTBC run was rescaled with a factor of N r ), and the shown time window corresponds to ∼ 0.2% of the total statistics collected with the PTBC algorithm.

Figure 2 :
Figure2: Results for the ratios of m 2 ++ (left-above plot), m 0 −+ (right-above plot), m 1 +− (left-below plot) and m 2 −+ (right-below plot) to m 0 ++ obtained from configurations generated with the PTBC algorithm (diamond points), compared to those of Ref.[8], obtained from configurations generated with standard local algorithms (empty round points).The correspondence among R PC and J PC channels was done according to Tab. 1. Full round points, dashed lines and shadowed areas represent, respectively, the continuum extrapolation results, linear best fits and related fit errors of data of Ref.[8].