Previously I had alluded to some of the major projects I’ve been working on. One has come to fruition and can be found on the arXiv and in the Astrophysical Journal&. It has taken many years to assemble the data in this paper, during which time the models purporting to explain some of it have evolved considerably while consistently failing to address the real problems they raise. There is a lot to explore, so it will take more than one post.
Here I start with the empirical basis: the stellar mass and baryonic Tully-Fisher relations. The Tully-Fisher relation was originally discovered as a relation between luminosity and linewidth in rotationally supported galaxies – spirals and irregulars. It immediately proved useful as an extragalactic distance indicator. As such, it was instrumental in breaking the impasse in the Hubble constant* debate (back when it was 50 vs. 100, not 67 vs. 73), and it remains useful in this role.
Physically, the obvious interpretation was that luminosity is a proxy for stellar mass and linewidth*^ is a proxy for rotation speed. This is correct. Of the various rotation speeds one can define and measure, the one that works best, in terms of minimizing the scatter in the relation, is the flat rotation speed measured in the outer parts of extended rotation curves. See Stark et al. (2009) and Trachternach et al. (2009) for further examples. The scatter is basically a function of data quality.
On the mass axis, converting measured flux to luminosity to mass is a bit dicier, as we need to know the distance for the first step and the stellar mass-to-light ratio for the second. There is inevitably some intrinsic scatter in the mass-to-light ratio of a stellar population. While I don’t doubt that luminosity is a proxy for stellar mass, improving on it is hard to do: there are many instances in which simply assuming a straight mapping of light to mass can be as effective as applying fancier population models. We might^ finally be getting past that, so it is worth discussing a bit.
The procedure to convert starlight into stellar mass involves the construction of stellar population models that use the color(s) or spectral energy distribution of a galaxy to infer the types of stars that make the light. This is a long-argued subject; suffice it to say there are a number of points where it can go wrong. The most obvious is the IMF: the initial mass function, i.e., the spectrum of masses with which stars are born. Most of the light we see from a galaxy is produced by its higher mass stars, which are disproportionately bright (there is a steep scaling of stellar luminosity with mass). But most of the mass is locked up in low mass stars that contribute little to the total luminosity. So we are, in effect, using the light of the few to represent the mass of the many. That goes badly wrong if we don’t know the relative mix, i.e., the shape of the IMF. This has been the subject of much research, and over many decades has been narrowed down pretty well. While I hope that this is almost settled, the specter of the IMF lurks as a menace to all stellar mass determinations.
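To put rough numbers on "the light of the few, the mass of the many", here is a toy calculation. It assumes a single power-law IMF with a Salpeter-like slope of 2.35, a main-sequence scaling L ∝ M^3.5, and mass limits of 0.1 and 100 M☉. All of these are illustrative assumptions of mine, and the sketch ignores stellar evolution entirely, so it describes a freshly formed population rather than a real galaxy.

```python
# Toy model: what fraction of the mass and of the light comes from
# stars below 1 solar mass? All parameters below are assumptions.
ALPHA = 2.35              # assumed IMF slope: dN/dm ~ m**(-ALPHA)
BETA = 3.5                # assumed luminosity scaling: L ~ m**BETA
M_MIN, M_MAX = 0.1, 100.0 # assumed mass limits in solar masses


def powerlaw_integral(exponent, lo, hi):
    """Integral of m**exponent dm from lo to hi (exponent must not be -1)."""
    p = exponent + 1.0
    return (hi ** p - lo ** p) / p


def fraction_below(m_cut, weight_exponent):
    """Share of the m**weight_exponent-weighted total from stars below m_cut."""
    num = powerlaw_integral(weight_exponent - ALPHA, M_MIN, m_cut)
    den = powerlaw_integral(weight_exponent - ALPHA, M_MIN, M_MAX)
    return num / den


mass_frac = fraction_below(1.0, 1.0)    # weight each star by its mass
light_frac = fraction_below(1.0, BETA)  # weight each star by its luminosity

print(f"mass in stars below 1 Msun:  {mass_frac:.1%}")
print(f"light from stars below 1 Msun: {light_frac:.3%}")
```

Under these assumptions, well over half the mass sits in stars that contribute a negligible share of the light, which is why the shape of the IMF matters so much.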
There is a lot else we need to know to build a stellar population model. This includes such essentials as the spectra of individual stars of each and every type and stellar evolution as a function of mass and composition including exotic phases like the asymptotic giant branch. There are a lot of places where this can go badly wrong, and sometimes^% does. So I wouldn’t say we know how to do this perfectly, but we have become pretty good at it.
Converting light to mass suffices to plot the stellar mass Tully-Fisher relation. That accounts for most of the baryonic mass of high mass spirals, but it ignores the mass of the interstellar gas. This can be appreciable in lower mass systems. Indeed, the standard issue dwarf galaxy in the field is more gas than stars:

With measurements of mass and rotation speed, we can construct the Tully-Fisher relation:

The stellar mass Tully-Fisher relation is a good correlation by the standards of extragalactic astronomy. The majority of studies in the literature are restricted to massive% galaxies, mostly those with M* > 10¹⁰ M☉ where stars dominate the baryonic mass budget so the omission of gas is not obvious. As we look to lower masses, the relation bends and the scatter increases. That this happens right where gas starts to become important to the mass budget suggests that we’re missing an important component, and voila – a nice, continuous relation that is linear in log space is restored when we plot the baryonic mass Mb = M* + Mg. Indeed, the data are consistent with a simple power law
Mb = A Vf⁴
with A = 50 M☉ km⁻⁴ s⁴. The intercept A has consistently been measured within 10% of this value over the past couple of decades. That this is an integer power law so that the intercept has real physical units is intriguing. That doesn’t happen in most astronomical scaling laws, which are usually more happenstance, like the mass-luminosity relation for main sequence stars.
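For a feel for the numbers, here is a minimal sketch that evaluates the relation read as Mb = A Vf⁴ with A = 50 M☉ km⁻⁴ s⁴, and inverts it. The function names are mine, and the input velocities are round illustrative values, not measurements.

```python
A_BTFR = 50.0  # Msun km^-4 s^4, the intercept quoted in the text


def baryonic_mass(v_flat_kms):
    """Baryonic mass in solar masses for a flat rotation speed in km/s."""
    return A_BTFR * v_flat_kms ** 4


def flat_velocity(m_baryon_msun):
    """Flat rotation speed in km/s for a baryonic mass in solar masses."""
    return (m_baryon_msun / A_BTFR) ** 0.25


for v in (50.0, 100.0, 200.0):
    print(f"Vf = {v:5.0f} km/s -> Mb = {baryonic_mass(v):.2e} Msun")
```

The fourth power is steep: doubling the flat velocity raises the implied baryonic mass by a factor of 16.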
Why limit ourselves to rotationally supported galaxies? Let’s plot every known type of gravitationally bound extragalactic object, from the smallest ultrafaint dwarfs to the largest clusters of galaxies. Note that I’ve flipped the axes to accommodate the huge dynamic range in baryonic mass, roughly twelve (12) orders of magnitude. This is like having gnats at one end of the scale and blue whales at the other. On that scale, a person is a regular galaxy like the Milky Way.

One improvement from twenty years ago, aside from the greater number of objects and the increase in dynamic range, is the accuracy of the mass measurements. I tried a number of prescriptions for the stellar mass-to-light ratio in McGaugh (2005), which resulted in a range of possible slopes. Now we just use the stellar mass from precise population models (Duey et al. 2025) and recover my best estimate from back then. The room to dodge the obvious conclusion about the slope of the relation by complaining about the choice of stellar mass estimator – a popular course of action back then – is gone. Another technical issue we’ve spent a lot of effort working on is how to put all these very different systems on the same scale of Vf. I won’t elaborate on this here: if you’re interested in that level of detail, you can go read the paper and references therein. If we got this wrong, it would add to the scatter in the relation, and/or create offsets between different types of data.
Both of the extended Tully-Fisher relations, that in stellar mass (top panel) and that in baryonic mass (bottom panel, the extended BTFR) are good correlations. That in baryonic mass is clearly better in the sense that it is tighter over a larger dynamic range. From small dwarf galaxies (Mb ~ 5 × 10⁵ M☉) to groups of galaxies (5 × 10¹² M☉), the data are consistent with a single power law (Mb ~ Vf⁴) for all systems with remarkably little scatter. Outside this range, the data for both the lowest and the highest mass systems deviate from a straight line towards higher mass at a given flat velocity. I don’t put much credence in the smallest systems as I think there is little chance that their measured velocity dispersions are representative of their equilibrium gravitational potential. For all practical purposes, our knowledge runs out as we hit the regime of ultrafaint# dwarfs. The deviations of the most massive systems, clusters of galaxies, are more difficult to dismiss.
Restricting our attention for the moment to the range where a single power law suffices to describe the data, we note that there is not much scatter in the BTFR. Some of it is from random uncertainties; these dominate most studies and lead to a lot more scatter than seen here: these data are very good. We can account for the known observational errors and subtract off their contribution to estimate the intrinsic scatter in the relation. This is the variance of the data from a perfect line. The intrinsic scatter for the best data (the WISE-SPARC sample of Duey et al. 2026) is about 0.11 dex in mass – about what we expect$ for stellar populations. That doesn’t leave much room for other sources of scatter, so the underlying physical relation has to be very tight indeed: essentially perfect over the range 5 × 10⁵ < Mb < 5 × 10¹² M☉.
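The idea of subtracting off the error contribution can be sketched as a simple difference in quadrature. This is a simplified illustration and not necessarily the procedure used in the paper; the function names are mine and the input numbers are hypothetical, chosen only to show the arithmetic.

```python
import math


def intrinsic_scatter(observed_rms_dex, error_rms_dex):
    """Quadrature subtraction: sigma_int^2 = sigma_obs^2 - sigma_err^2.

    Returns 0.0 when the quoted errors alone already account for
    (or exceed) the observed scatter.
    """
    diff = observed_rms_dex ** 2 - error_rms_dex ** 2
    return math.sqrt(diff) if diff > 0.0 else 0.0


def dex_to_factor(dex):
    """Convert a scatter in log10 space (dex) to a multiplicative factor."""
    return 10.0 ** dex


# Hypothetical inputs: 0.15 dex observed, 0.10 dex from measurement errors.
sigma_int = intrinsic_scatter(0.15, 0.10)
print(f"intrinsic scatter ~ {sigma_int:.2f} dex")
print(f"0.11 dex corresponds to a factor of ~ {dex_to_factor(0.11):.2f} in mass")
```

In linear terms, 0.11 dex is a scatter of a bit under 30% in mass.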
Scatter will also occur if our mass budget is incomplete. We can see this in the transition from the stars-only relation to the BTFR. There is a lot of scatter in the stellar mass Tully-Fisher relation around 10⁷ < Mb < 10⁹ M☉. Galaxies in this mass range are sometimes star-dominated and sometimes gas-dominated. The gas fraction is all over the place. This shows up as scatter in the stellar mass Tully-Fisher relation. That’s not real; it is a sign that we’ve missed an important mass reservoir. This is cured when we add in the gas mass, which is dominated by atomic gas (HI to spectroscopists and astronomers). That this addition removes the scatter and restores a single power law relation strongly suggests that there are no further substantial reservoirs** of baryonic material that we’re missing.
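A toy mock makes the logic concrete. The sketch below places synthetic galaxies exactly on Mb = A Vf⁴ (with A = 50 M☉ km⁻⁴ s⁴), assigns each a random gas fraction, and compares the scatter about the line in stellar mass and in baryonic mass. The velocity range and the uniform gas-fraction distribution are assumptions made purely for illustration; none of this is real data.

```python
import math
import random
import statistics

A_BTFR = 50.0  # Msun km^-4 s^4


def mock_scatter(n=1000, seed=1):
    """Return (stellar, baryonic) rms residuals in dex for a synthetic sample."""
    rng = random.Random(seed)
    res_star, res_bary = [], []
    for _ in range(n):
        v_flat = 10.0 ** rng.uniform(1.3, 1.9)  # assumed: ~20 to ~80 km/s
        m_bary = A_BTFR * v_flat ** 4           # exactly on the relation
        f_gas = rng.uniform(0.1, 0.9)           # assumed: gas fraction varies widely
        m_star = (1.0 - f_gas) * m_bary
        line = math.log10(A_BTFR) + 4.0 * math.log10(v_flat)
        res_star.append(math.log10(m_star) - line)
        res_bary.append(math.log10(m_bary) - line)
    return statistics.pstdev(res_star), statistics.pstdev(res_bary)


s_star, s_bary = mock_scatter()
print(f"stars only: {s_star:.2f} dex, stars + gas: {s_bary:.2e} dex")
```

In this mock, the stars-only scatter is entirely an artifact of ignoring the gas: it is just the spread in log(1 − f_gas), and it vanishes once the full baryonic mass is used.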
This logic applies to other systems as well. Bright spirals do not need much correction because their baryonic mass is dominated by stars. Their stellar mass Tully-Fisher relation is pretty much already their BTFR.
Perhaps this applies to clusters of galaxies as well? There was a huge correction from stars-only to stars plus gas. The gas in this case is the hot, ionized plasma of the intracluster medium (ICM) that belongs to the cluster itself and not any individual galaxy within it. That goes most of the way to close the gap between the stars-only cluster data and the extrapolation of the BTFR fit to individual galaxies, but not all the way. So perhaps we are still missing an important baryonic mass component? It happened before – we didn’t know about the ICM for decades after Zwicky first identified the missing mass problem in clusters – so perhaps there are still more baryons to discover there.
It could also be that the apparent offset occurs because we’ve failed to put clusters on the same Vf scale as galaxies. This is not easy to do, and we’ve spent a lot of time worrying about it. I don’t think this is what’s going on, though it would make my life a lot simpler if it were. Different indicators – dynamics vs. ICM hydrostatics vs. gravitational lensing – can give somewhat different answers, but not in a way that “fixes” the problem: I see no viable path in which the offset turns out to be a simple difference in the way the depth of the gravitational potential is measured. I would love to be wrong here, but I’m not dismissing the offset for clusters as I am for ultrafaint dwarfs (which I don’t do lightly).
Perhaps the extrapolation of the BTFR from individual galaxies to clusters is simply not appropriate. They’re very different kinds of systems, after all. To dig into that, we need some theoretical perspective – why does the observed power law happen? Should we expect different systems to share the same BTFR?
Theory is something I’ve studiously avoided in this post: the possibility that there are baryons that remain to be discovered in clusters can be inferred empirically. All the other data line up, so why not clusters? But unless and until these hypothetical additional baryons are discovered, that’s just one possibility. How likely this possibility seems to be diverges rapidly once we overlay a theoretical preference, which I will leave to future posts. (I did warn it would take more than one.)
&This paper appears in ApJ volume 1001. The literature has grown quite a bit since I started contributing to it in volume 342. The Astrophysical Journal was founded in 1895. So I’ve been contributing to it for a little over a quarter of its temporal existence, but nearly twice the number of volumes have been published in that shorter time. It’s no wonder none of us can keep up.
*Indeed, Tully & Fisher’s “preliminary estimate of the Hubble constant is H0 = 80 km/s/Mpc” remains correct to this day, within the uncertainties (hard to estimate at the time, but roughly ±10 km/s/Mpc).
*^There appears to be an irreducible intrinsic scatter in the linewidth: it is not a perfect proxy for rotation speed. Linewidths are observationally easier to obtain than resolved, extended rotation curves, so the numbers of galaxies in samples using linewidths can be very large without ever approaching the quality provided by resolved interferometric observations. Bigger samples are not necessarily better.
^I emphasize might here because the community seems to have moved towards reporting stellar masses as if we observe these rather than the luminosities and colors/SEDs that the mass estimates are based upon. The latter are data – observed quantities – while stellar masses are a derived quantity that is inevitably model dependent. This doesn’t stop being true just because we decide to invest a lot of faith in our models.
^%The Sloan Digital Sky Survey provides stellar masses based on models that are known to be wrong in the near infrared. Since SDSS itself is entirely optical, one might not notice. If one mixes SDSS data with near-IR data, one will get the wrong answer.
%This is a classic selection effect. Brighter objects can be seen at a much greater distance than dim ones, so probe a much larger volume. Consequently, their raw numbers always dominate surveys even if their number density is low. Stars are a great example: most of the stars you can see at night are intrinsically luminous: bright stars that are rather far away. Mundane, low mass stars do not stand out even when nearby.
#This isn’t for lack of observations of ultrafaint dwarfs, it’s the underlying assumptions.
$No amount of information suffices to perfectly specify the stellar mass that produces an observed luminosity and SED (spectral energy distribution/set of colors), so one always expects at least some intrinsic scatter in the stellar mass-to-light ratio. I’ve seen estimates that range from 0.1 – 0.2 dex for near-IR colors. That’s as good as it can get as there is always some transient population (e.g., AGB stars) that produces an amount of light that depends on the star formation rate some time ago, not what we measure now. Optical colors are worse in the sense of having more intrinsic scatter, as they are more susceptible to the comings and goings of bright but short-lived stars whose numbers fluctuate with the stochastic star formation rate. Finding 0.11 dex intrinsic scatter is pretty much as good as it can get. (By dex we mean the scatter in log space.)
**We noted this effect in the original BTFR paper to argue that it was unlikely that we were missing substantial amounts of molecular gas (H2), which was a concern at the time. Flash forward, and we were right: the molecular gas mass is almost always a distant third behind stars and atomic gas in the baryonic mass budgets of individual galaxies. Nowadays, the concern is about the mass of baryons in the circumgalactic medium (CGM). That’s getting ahead of the story, which I’ll save for a future post. For now, it suffices to note that any baryonic mass in the CGM is far beyond the radius where the flat velocity is measured, so is not relevant to the sums here.
Does not the recent Zhang et al paper (https://arxiv.org/abs/2602.06082) bring the clusters at the high end of Fig 3 closer to the orange line? Supported by analysis of 46 nearby galaxy clusters, they postulate that stellar remnants make up much of the apparent mass deficit. They argue for MOND with top-heavy IMFs.
Yes, it certainly does, if we adopt their mass estimates. We had already made our own mass estimates, which are what is shown here. Their solution falls into the category of baryonic dark matter that I allude to: there is more baryonic mass than meets the eye.
Stubborn! The BTFR endures in spite of our best efforts to dismiss it. For those of us with backgrounds in analytical chemistry and particle analysis, endurance of a spectral width/mass relationship over this range of magnitudes is unfathomable. Something is right about MOND. Something is wrong about gravity.
The success of MOND at galactic scales may be evidence against universality, not evidence for a better universal law.
That distinction is key.
Because once you interpret MOND as a contextual emergent closure instead of a replacement universal law, the cluster problem stops being an embarrassment needing hidden particles (the exact same mindset leading to fictional dark matter).
It becomes a regime boundary signal.
I haven’t talked about MOND in this post. Just looking at the data, there is a remarkable uniformity that is nigh-on universal. Maybe it isn’t the single power law predicted by MOND – one can imagine other functional forms – but the deviations are small and restricted to a small mass range at either end of the relation. A single power law is a lot closer to being universal than it has any right to be, so portraying this as evidence against universality requires ignoring the forest to complain about a few outlying trees.
The point is not that clusters are numerous enough to invalidate the relation. It is that they are dynamically different enough that their systematic deviations may carry disproportionate conceptual significance. It seems unlikely to be a coincidence that both the extended BTFR and MOND itself begin to struggle specifically in clusters.
Yes, that’s certainly a possibility, as is the presence of still-unidentified baryonic mass.
What is really interesting about the BTFR is that it under-predicts “baryonic” mass at the low end, and is essentially dead-on accurate once there is sufficient sampling of rotational and dispersion support in quadrature. It is also interesting that the “observed” baryon fraction systematically increases with observed object core temperatures. Clusters/super-clusters have a ~15-16% baryon fraction, HSB galaxies are in the 5-10% (ish) range, and LSBs/dwarfs are in the 1-3% range.
What is fascinating is that when I model DM as a family of ordinary baryons with their own ladder of ionization energies, for rotationally supported galaxies, it recovers a slope 4 BTFR by assuming DM is arranged in an extended axisymmetric disk, strongly gravitationally coupled with the luminous mass distribution. The exact slope 4 comes from truncated Mestel geometry and the normalization comes from a marginal stability condition. It competes really well on SPARC and produces a scalable dimensionless kernel. So the disk halo “conspiracy” is essentially strongly coupled but stratified disk geometry, and subsequent shapes like ellipticals and spheroidals come from environmental processing. Galaxies are actually “born as disks” from the outset, NOT as NFW halos – they EVOLVE to other morphologies via tides, stripping, wet and dry mergers. This is WHY spirals and thin disks dominate by number count, especially in the field. The BTFR is a huge clue. No gravity modification needed, geometry suffices.
“The Tully-Fisher relation was originally discovered as a relation between luminosity and linewidth in rotationally supported galaxies – spirals and irregulars. It immediately proved useful as an extragalactic distance indicator.” Why are Tully & Fisher not Nobel Laureates in Astronomy?
“The Baryonic Tully-Fisher Relation of Gas Rich Galaxies as a Test of LCDM and MOND”, S. McGaugh, 2011 https://arxiv.org/abs/1107.2934
“The deep-MOND limit — a study in Primary vs secondary predictions”, M. Milgrom, 2025 https://arxiv.org/abs/2510.16520
Hypothesis 1. Tully & Fisher are Nobel-Prize-less because there is no FUNDAMOND theory that string theorists view favorably.
Hypothesis 2. General relativity theory predicts twice as much gravitational lensing as Newtonian gravitational theory.
Hypothesis 3. FUNDAMOND string theory predicts twice as much gravitational lensing as MOND.
What is the McGaugh opinion of the 3 preceding hypotheses?
There is no Nobel prize in astronomy. In Nobel’s time, astronomy and mathematics were considered the same field. Story is that his wife ran off with an astro/mathematician, dooming those of us in either field to his eternal snub.
Since Ryle & Hewish, I suggest that the “Nobel Prize in Physics” is really the “Nobel Prize in Physics, Astrophysics, or Astronomy”.
As of 2025, 10 astrophysicists have won Nobel Prizes in Physics.
“Verification of the anecdote about Edwin Hubble and the Nobel Prize”, Kohji Tsumura, 2017
https://arxiv.org/abs/1705.10125
“The Nobel prizes in physics for astrophysics and gravitation and the Nobel prize for black holes: Past, present, and future”, José P. S. Lemos, 2021
https://arxiv.org/abs/2112.14346
Yes, but it is spotty. Radio astronomers are more likely to be recognized than traditional optical astronomers for reasons that I suspect have more to do with the social interconnectedness (or not) of the fields than anything else.
Some laws are limited, some look universal. No-one can argue that all are limited, because we don’t have them all – let’s look at it again in 500 years. The TFR was the 20th century version, the BTFR, which you put on the map in 2000, is so tight and covers so many scales, it’s possible to turn it around in some situations, assume the law is true, and work from there. That gives an estimate for the missing baryons in clusters for instance, though as you say, clusters may be too different to fit well.
But there’s the question of whether when we turn it around, we should take the rule to be as we find it nearby. It seems to me that because further out it effectively needs adjusting by a simple factor multiplied in (which increases with distance, and may scale with 1 + z), there’s a need to explain why a galaxy of the same mass spins faster at a higher z.
This is particularly so because at very high redshifts galaxy formation times seem way faster than expected. There are also other things (I’ve been working on a curve that brings together data from various areas [to discuss things and possibly work together, get in touch] with a conceptual explanation underneath it). With the BTFR, do you think these terms might need to be taken separately? It’s arguable that we should call them adjustments to a simpler theory, rather than a built-in component of a more complicated one.
Yes, one has to be careful about turning it around. I also think it is early days for data at high redshift. While it is clear that galaxies became massive earlier than expected in LCDM, the kinematics are still in dispute. My reading is that the best evidence so far shows no hint of evolution in the BTFR – galaxies of the same mass spin at the same rate at high z, not faster. On the other hand, the linear sizes of galaxies are smaller by (1+z), which rings alarm bells.
Sorry, it seems that LCDM predicts evolution of the BTFR, MOND says it doesn’t, and measurements tend to agree with MOND. Is the linear size of galaxies inversely proportional to 1 + z across a large range, and accurately?
The size evolution has been seen by different workers, and seems to be real – and really weird – but I’m not clear on how much we Have to believe it yet.
The circumstellar disks around stars also form spirals, and these disks contain dust and gas. Perhaps this is similar to the formation principle of spiral galaxies. Spiral galaxies are mainly composed of a disk of many stars and gas. When such a structure rotates, it cannot be simply calculated using universal gravitation. Perhaps it will naturally form a spiral shape, thus explaining the flat velocity. It may require supercomputer simulation.
There is some similarity in morphology. There’s also an important difference in the mass distribution. The circumstellar disk around a star is completely dominated by the gravity of the central star while a spiral galaxy has an extended mass distribution that involves dark matter (whatever that means).
“While it is clear that galaxies become massive earlier than expected in LCDM, the kinematics are still in dispute.” Is the ΛCDM propaganda mill far more powerful than the MOND propaganda mill?
According to Prof. Turner of the U. of Chicago, “The fact the need for dark matter in galaxies occurs at a universal acceleration is remarkable and can be accounted for in ΛCDM … Further, the adherents of MOND have not been successful in turning MOND into a relativistic theory of gravity that can match the many successes of ΛCDM beyond galactic rotation curves, let alone make new predictions that can decisively test it. Moreover, the evidence for dark matter today goes well beyond galactic rotation curves, and now includes the dark matter in clusters of galaxies, the CMB determination of the matter and baryon densities, and the necessity of dark matter for explaining the large-scale structure of the Universe.”
Turner, Michael S. “Everyone wants something better than ΛCDM.” Proceedings of the National Academy of Sciences 123, no. 8 (Feb. 13, 2026): e2526436123. https://www.pnas.org/doi/abs/10.1073/pnas.2526436123
https://arxiv.org/abs/2510.05483
It seems to me that dark matter particles might be endowed with more and more surprising properties (perhaps leading all the way to miracles) — however, I think Prof. Turner has woefully underestimated the importance of MOND. There is still about 1 month remaining for a MOND guru to submit a Letter to the Editor of the Proceedings of NAS to specifically dispute the MONDian section of the article “Everyone wants something better than ΛCDM” & present a pro-MOND perspective.
Turner isn’t underestimating the importance of MOND, he lives in abject terror of it, because it represents the deathknell of his religion – hence the need he feels to denigrate and downplay it at every opportunity. He is the prime example of a scientist turned religious zealot; there is no evidence that could persuade him, and he is willfully blind to all that contradicts his preferred outcome. It is not a good use of my time to argue with walls or scientific bigots.
So you let him talk, since scientific investigations will continue anyway. It is a pity that the general opinion has to progress only due to the demise of the old generation. I’ve seen a picture Milgrom used though, with the Titanic (DM) and the iceberg (MOND); they have been warned sufficiently many times. Ah well, the Gaia wide binary data will pretty surely convince many good scientists after it is released, I guess. I don’t buy Banik’s approach anymore, thanks to the careful analysis of Chae and Hernandez.
This is very interesting, the extended TFR reaching into new regimes raises a natural question: is the scatter purely observational, or does it carry dynamical structure?
In the Gravitational Memory programme (seven papers, Zenodo), we find that adding a coherence state ψ to the BTFR yields ΔAIC = -24.8 on 143 SPARC galaxies, and the BTFR residual correlates with ψ_mem, the component of ψ not reconstructible from instantaneous observables (ρ = -0.480, p < 10⁻⁴).
More broadly, the RAR scatter itself separates into two dynamical branches when sorted by ψ (KS p = 6 × 10⁻⁹⁵). At g_bar < a₀, ψ separates G_eff into two populations: 4.9 vs 1.6 (p = 5 × 10⁻¹⁰). This suggests that low acceleration is necessary but not sufficient, dynamically organised coherence is additionally required.
This would imply that both the TFR and RAR are not single universal relations but statistical projections of multiple coherence states. Systems without organised rotational structure (wide binaries, DF2-class UDGs) remain Newtonian regardless of their acceleration regime.
A validation companion with falsification criteria: https://zenodo.org/communities/gravitational-memory
I would welcome any assessment of whether the extended TFR data show ψ-dependent structure in their residuals.
Javier Meizoso Fernández
There are no perceptible residuals from the BTFR. There can, of course, always be something below the detection limit. But that’s pretty tight now. Restricting myself to rotating galaxies for which we have the best data, the scatter is now limited by that in the stellar population mass-to-light ratios. We are unable to perceive anything below this limit, but the possibilities are strongly restricted by this limit.
Thank you for this response.
You make a fair point about the BTFR: the scatter is impressively tight and M/L uncertainty dominates. I agree that the BTFR alone may not be the best place to look for this.
However, the signal does appear clearly in two other diagnostics on the same SPARC data:
The RAR scatter, which is larger than the BTFR scatter, separates into two branches when sorted by ψ (KS p = 6 × 10⁻⁹⁵). This is well above the M/L floor.
The 2×2 test: at g_bar < a₀, galaxies with high ψ show G_eff = 4.9 while those with low ψ show G_eff = 1.6 (p = 5 × 10⁻¹⁰). This separation is not driven by M/L uncertainty.
The BTFR may be too tight to reveal ψ-structure, but the RAR scatter appears to be exactly the right diagnostic, large enough to contain structure, small enough to have been dismissed as noise.
Would you expect the RAR residuals to be fully explained by M/L uncertainties as well, or is there room for dynamical structure there?
ESA now has a countdown for Gaia DR4:
https://www.cosmos.esa.int/web/gaia/data-release-4
Will it bring the decision or new conundrums?
I have an intuition about the offset at the cluster end. In the BTFR, individual galaxies can be approximated as point sources because their mass is concentrated and compact. But clusters are large and diffuse — their mass distribution is too spread out to be treated as a single point mass. If the long-range correction term scales with the square root of mass, then a diffuse system with the same total mass would produce a weaker overall correction than a compact one. Is this picture consistent with the trend you observed? Would more diffuse systems show a larger offset?
There are some pretty diffuse galaxies. Indeed, MOND was the only theory to correctly predict in advance that low surface brightness galaxies would fall on the same BTFR as other galaxies. Conventionally they should be well off. So I don’t think this is going to help us with clusters. I will note that the baryonic mass of clusters involves a big extrapolation so is rather uncertain. But it would require a lot more extrapolation to “fix” this.
I think the key distinction might be “compact mass source vs. absence of such a source” rather than diffuseness per se. Consider the Bullet Cluster. The hot gas is pressure-supported and non-rotating — it genuinely lacks a compact, rotationally supported core. In contrast, the galaxies, including the stripped one, remain compact enough. If the long-range correction in MOND requires a compact source to be fully activated, then the gas contributes little while the compact galaxy drives most of the effect. This would naturally offset the lensing peak from the gas peak — the classic “dark matter separation” — without invoking dark matter. Extending this to clusters: if much of the baryonic mass is in diffuse, pressure-supported gas that cannot activate the full correction, the BTFR would indeed show an offset at the cluster end — not because baryons are missing, but because they are in a form that doesn’t fully participate in the same dynamical mechanism. Just a conjecture, curious if it holds up to your data.
As a side note, what if the essence of a₀ in MOND is simply dark energy? Phenomenologically, dark energy acts like a uniformly stretched rubber band—it drives galaxies apart, and by the same logic, it should stretch the range of gravity locally as well. Setting aside the nature of dark energy, and regardless of whether dark matter exists, doesn’t it follow from first principles that the dark energy phenomenon must inevitably share at least part of the gravitational correction? Moreover, by the cosmological principle, any correction to gravity from dark energy must fall off as 1/r, not 1/r²—otherwise the laws of physics would change with distance from the observer. This also explains why the so-called dark matter halo appears in the outskirts of galaxies and takes a flat shape.
It is certainly an intriguing numerical coincidence that a0 ~ a_L/(2 pi) where a_L = c^2*sqrt(Lambda/3). It has been suggested that a particle reads its inertial mass in the Unruh radiation that is generated when accelerated. This would be a form of modified inertia by which particles are easier to push around at low acceleration rather than being affected by stronger gravity.
Yes, that’s what I believe, because it’s the most ‘harmless’ modification for all inverse square law gravity theories (which are almost all of them). The acceleration of the particle and its energy/mass lead to a local disturbance in spacetime curvature, akin to all nearby particles accelerating the opposite way for some portion proportional to both the mass and acceleration of the particle. So I conjecture that Unruh radiation is generated locally when a particle accelerates.
The energy required for this would be why there is inertia, rather than having everything instantly jump from standstill to lightspeed. It’s one explanation of why a force is required to get acceleration, proportional to mass and the target acceleration. And as Meng and Stacy mention, Milgrom suggested MOND might be due to Unruh radiation when combined with a cosmological constant. I just added the above conjecture that any accelerating particle with mass/energy also creates Unruh radiation locally.
The benefits of this would be that it doesn’t modify gravity at all, which erases any dynamical disagreement with string theory, loop quantum gravity, whatever inverse square law gravity. But I don’t really understand where the division by 2 pi comes from. But I like it for explaining what causes the need for force and the existence of inertia, while being compatible with other gravity theories (the graviton, especially. It’s less magical to have a particle carrying the force, rather than it just jumping out of the equations of invisible fields).
Indeed, I find modified inertia appealing for the same reasons.
I also do not understand where the 2pi comes from, nor am I sure it is the correct factor to turn ~ into =. It is another intriguing number (like Planck’s h vs. h_bar) but here lies the path to numerological madness.
Thank you very much for sharing! But could it be that the truth behind a₀ is far simpler and more direct than any of us have imagined? Recently, while reworking some derivations, I found that the point-mass approximation already accounts for most of the deviations in both galaxy rotation curves and gravitational lensing—quite naturally, without extra assumptions. Merging galaxy clusters turn out to be a perfect laboratory for testing this idea: they simultaneously contain diffuse gas (no compact core), observable lensing signals, and compact galaxy cores, all of which can be compared directly within a single framework.
Another thread that keeps recurring in my mind is 2π: it is both the ratio of a circle’s circumference to its radius and the natural scaling factor by which the causal information of a gravitational wave or gravitational potential decays with distance as it propagates uniformly over a two-dimensional spherical surface. The underlying logic is quite intuitive: suppose the gravitational signal propagates uniformly outward on a two-dimensional spherical surface, with the total amount of causal information conserved. Take a one-dimensional cross-section—that is, along any chosen geodesic direction of the propagating gravity—and the information density along that line must fall off as the sphere’s circumference grows. Since the circumference is 2πr, the information density per unit arc length becomes (total information) / (2πr)—so the decay law naturally goes as 1/(2πr). By the same token, a point within a gravitational field can be regarded equivalently as being accelerated by a force… Of course, the exact conversions still require more rigorous formulas to express. But when the same number keeps reappearing in different guises, it’s hard not to wonder: could there be some inevitable logical connection behind it?
If we’re throwing conceptual ideas around, and it can help to do that, here’s one that came out of thinking: if those cosmology number coincidences with a0 actually were something, what would it be? This idea hasn’t been thought through much so far (if anyone writes about it please mention where it came from), but what I like about it is that it tries to address a clue at the heart of MOND, and very few ideas that I’ve had or seen do. So, good to try, even if you fail. The clue boils down to: ‘every acceleration that tries to go below a0 gets combined with a0 in an equal kind of way (geometric mean)’. Why should that be?
2pi can be a geometric phase factor in horizon-boundary equations. Trying to get an acceleration out of some basic numbers, suppose cH0/2pi or something like it represents an acceleration related to the background expansion of space, and it’s 1.2e-10 m/s^2. Matter can go below this acceleration, but beyond there it’s only partly decoupled from space, so it gets affected by what the background space is doing, and these two accelerations combine.
This may have various problems. One is that a0 wouldn’t be the same number in different eras, but we measure it (out to some distance) as the same. There is something in a cosmological theory I have that might partly cancel that, but I’ve not looked into it. Anyway, this is a set of comparatively loose ideas; I hope it’s of interest.
Yes these are interesting avenues, thanks.
The picture with that is the line on a graph for acceleration vs radius. At a0 the Newton line starts descending less steeply than it otherwise would, and roughly bisects the angle between a horizontal line and where it would go without MOND. Every acceleration below a0 is getting pulled up halfway towards a0.
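The “pulled up halfway” picture can be written out explicitly. This is only a sketch, assuming the standard deep-MOND limit in which the observed acceleration a is the geometric mean of the Newtonian acceleration a_N and a0; the symbols a_N, G, M, r and V are introduced here for illustration.

```latex
\begin{align}
a &= \sqrt{a_N\, a_0} \qquad (a_N \ll a_0) \\
\log a &= \tfrac{1}{2}\left(\log a_N + \log a_0\right) \\
a_N &= \frac{GM}{r^2} \;\Rightarrow\; a = \frac{\sqrt{G M a_0}}{r},
\qquad \frac{d\log a}{d\log r} = -1 \ \text{rather than}\ -2
\end{align}
```

In log space a sits exactly midway between a_N and a0, and the slope of -1 is midway between the Newtonian -2 and a horizontal line. Setting a = V^2/r then gives V^4 = G M a0, which is the flat rotation speed behind the BTFR.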
The numbers work better if you make one change. Looking at cH0/2pi = a0, you can get an estimate for H0 from a0 x 2pi/c. This gives H0 = 77.61 km/s/Mpc, which is high. But it’s possible to adjust the original acceleration to use the age of the universe as the unit of time, instead of the Hubble time.
A basic acceleration is R_H / t_H^2 = (c/H0) / (1/H0)^2 = cH0. Divide by 2pi, for whatever exact reason, and adjust t to change from the Hubble time to the age of the universe, and you get 1.048 x cH0/2pi = a0. So H0 is then a0 x 2pi/(1.048c), which gives 74.05 km/s/Mpc.
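As a rough check of the arithmetic, the two estimates can be reproduced in a few lines. This is a minimal sketch: it takes a0 = 1.2e-10 m/s^2 and the 1.048 adjustment factor as quoted above without deriving them, and the function name `hubble_from_a0` and the megaparsec conversion constant are introduced here for illustration.

```python
import math

C = 299_792_458.0        # speed of light in m/s (exact by definition)
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec
A0 = 1.2e-10             # MOND acceleration scale in m/s^2, as quoted in the thread


def hubble_from_a0(a0, age_factor=1.0):
    """Invert age_factor * c * H0 / (2*pi) = a0 for H0, returned in km/s/Mpc.

    age_factor is the commenter's Hubble-time-to-age adjustment (1.048);
    it is taken as given here, not derived.
    """
    h0_si = 2.0 * math.pi * a0 / (age_factor * C)  # H0 in s^-1
    return h0_si * KM_PER_MPC                      # (km/Mpc) / s = km/s/Mpc


h0_plain = hubble_from_a0(A0)
h0_adjusted = hubble_from_a0(A0, age_factor=1.048)
print(f"H0 from a0 = c*H0/(2*pi):       {h0_plain:.1f} km/s/Mpc")
print(f"H0 with the 1.048 adjustment:   {h0_adjusted:.1f} km/s/Mpc")
```

The result scales linearly with the adopted a0, so a 10% uncertainty in a0 moves either estimate by about 7 to 8 km/s/Mpc.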
So if nearby measurements are right, and H0 is 73 to 74, then all the numbers land more or less exactly right, and it looks less like an approximate relationship, and even less like a coincidence. Any thoughts on the concept? – that the local expansion of space underneath an orbiting object involves an acceleration, a0 (space is in motion at a small scale), and that affects matter, and boosts any acceleration below a0, combining the two, as in MOND. This is unexpected because in standard physics, gravitationally bound systems are unaffected by the expansion. But perhaps they’re partially affected by it in their outer regions.
If you look at the mathematics of MOND, and then say “when accelerations get low to a certain level, they blend with some general background acceleration at that same level”, to me that fits very well.
PS There’s an error in the algebra above, which was done quickly, but the result is much the same.
That a0 is so close to c^2*sqrt(Lambda) is most likely no coincidence. The question is how to read that clue.
Numerology tells me that the formula a0 = c^2*sqrt(Lambda)/8 is an almost perfect fit for the observed value; much better than with 2 pi*sqrt(3) instead of 8, which is quite a bit off. Also, when I’m doing numerology, I like simple numbers, and 8 is simpler than 2 pi*sqrt(3).
I guess that the factor 2 pi*sqrt(3) is motivated by a formula for Unruh radiation: Every inertial observer in de Sitter spacetime feels immersed in a “thermal bath” at temperature T = sqrt(Lambda)/(2 pi*sqrt(3)) [cf. Gibbons/Hawking 1977, https://doi.org/10.1103/PhysRevD.15.2738, formulas 2.7 and 4.25], where units with G = hbar = k = c = 1 are used.
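The two candidate factors are easy to compare numerically. A minimal sketch, assuming a round value Lambda = 1.1e-52 m^-2 (not given in the thread, so treat it as an assumption) and the commonly quoted a0 = 1.2e-10 m/s^2; the variable names are illustrative.

```python
import math

C = 299_792_458.0       # speed of light in m/s (exact by definition)
LAMBDA = 1.1e-52        # cosmological constant in m^-2 (ASSUMED round value)
A0_OBSERVED = 1.2e-10   # MOND acceleration scale in m/s^2 (commonly quoted value)

a_lambda = C**2 * math.sqrt(LAMBDA)                   # c^2 * sqrt(Lambda)
a0_eight = a_lambda / 8.0                             # candidate with factor 8
a0_gh = a_lambda / (2.0 * math.pi * math.sqrt(3.0))   # candidate with 2*pi*sqrt(3)

for label, value in [("c^2 sqrt(Lambda) / 8", a0_eight),
                     ("c^2 sqrt(Lambda) / (2 pi sqrt(3))", a0_gh)]:
    ratio = value / A0_OBSERVED
    print(f"{label}: {value:.3e} m/s^2 (ratio to observed: {ratio:.2f})")
```

With these inputs the factor 8 lands within a few percent of 1.2e-10 m/s^2, while 2 pi*sqrt(3) falls short by more than a quarter. Both numbers scale as sqrt(Lambda), so the comparison is only as good as the adopted Lambda and a0.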
But I don’t buy the idea that MOND has anything to do with Unruh radiation, or more generally with any quantum effect in curved spacetime.
All those effects should be expected to be unobservably tiny, especially in the MOND regime, where accelerations and spacetime curvature are small.
Also, a0 = c^2*sqrt(Lambda)/8 does not contain any factor hbar. Apparently a0 does not depend on hbar. If the difference between MOND and Newtonian gravity is caused by quantum effects, what exactly would be the idea? That (for small accelerations a>0) the mass m in the formula F[a,hbar] = m*a^2/a0 depends somehow on hbar and a?
Let’s assume that: F[a,hbar] = m[a,hbar]*a^2/a0. Let’s also assume that m[a,hbar] tends to m[a,0] as hbar tends to 0. If the difference between MOND and Newtonian gravity is caused by quantum effects, then F[a,hbar] must tend to the Newtonian limit m[a,0]*a as hbar tends to 0. Hence m[a,hbar]/a0 tends to m[a,0]/a as hbar tends to 0. This is true for every sufficiently small a>0. That is only possible if the Newtonian-limit mass m[a,0] is 0. But that mass is of course allowed to be nonzero!
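The limit argument can be spelled out step by step. This is only a restatement in the same notation, assuming a0 is held fixed as hbar tends to 0:

```latex
\begin{align}
F(a,\hbar) &= \frac{m(a,\hbar)\,a^2}{a_0},
\qquad \lim_{\hbar\to 0} m(a,\hbar) = m(a,0) \\
\lim_{\hbar\to 0} F(a,\hbar) &= \frac{m(a,0)\,a^2}{a_0}
\quad \text{(from the assumed form)} \\
\lim_{\hbar\to 0} F(a,\hbar) &= m(a,0)\,a
\quad \text{(required Newtonian limit)} \\
\Rightarrow\; m(a,0)\,a\,(a - a_0) &= 0
\;\Rightarrow\; m(a,0) = 0 \ \text{for all } 0 < a < a_0
\end{align}
```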
(The problem remains even if we allow a0 to depend on hbar.)
Maybe I lack imagination, but this whole idea is not how quantum effects usually work. In contrast, the Unruh temperature formula above becomes kT = sqrt(Lambda)/(2 pi*sqrt(3)) * hbar*c when we make all constants explicit. Here T tends to 0 as hbar tends to 0. This is what one would expect from a quantum effect.
TLDR: Wherever MOND comes from, my best guess is that it has nothing to do with quantum effects.
I think there may be an intervening step (or several). Before we get to a viable quantum theory of gravity, we need a complete classical theory. MOND suggests that GR is not that. I don’t know how many steps there are between GR+MOND to a more complete classical theory to a quantum theory. One might hope it can all be solved in one brilliant leap, but that’s a lot to hope for.
I also do not see how Unruh radiation can provide a mechanism for inertia, but that doesn’t mean it can’t be done. See https://ui.adsabs.harvard.edu/abs/1994AnPhy.229..384M/abstract
Regarding the discussion on a0, I’d like to share a recent result. I used the geometric mechanism established in merging galaxy clusters to recalibrate the SPARC galaxy sample. Specifically, from the analysis of merging clusters, I found that the point-mass approximation serves as an upper bound on the strength of the geometric correction term. Applying this insight to the galactic scale, I grouped the SPARC galaxies by bulge fraction (as a proxy for central density):
– Strong-bulge galaxies (closest to the point-mass limit): deviation only about +8.1%
– Intermediate-bulge galaxies: deviation about +9.9%
– Weak-bulge galaxies (disk-dominated, farthest from the point-mass limit): deviation about +26.4%
The higher the central density, the closer the geometric correction term approaches the point-mass limit, and the smaller the deviation — fully consistent with the picture from merging clusters, where compact galaxy cores anchor the lensing peaks while diffuse gas contributes negligibly. The effective mass-to-light ratio calibrated from the strong-bulge group is about 0.51 (considering the geometric correction term, the equivalent observed mass-to-light ratio would be slightly higher, close to the SPARC-recommended 0.7), within the normal range of a Chabrier IMF.
This result means that the way a0 influences galaxy rotation curves — specifically, the effective strength of its correction term under different mass distributions — needs to be adjusted according to how well the point-mass approximation holds. The same a0 value applies almost directly in compact systems, but is systematically overestimated by the point-mass approximation in diffuse ones.
(The a0 value used in the fit differs slightly from the MOND value, but the correlation with bulge fraction remains unchanged.)
Meng: there is a nesting limit in WordPress, so I can’t reply directly to your comment below. However, I want to make clear that we do not assume the point mass limit in our calculations. Bulges are assumed to be spherical and disks cylindrical with finite thickness.
I agree with everything in your first paragraph. Milgrom’s article you cite is long, I haven’t read it yet, so I cannot yet comment on your second paragraph. Instead, for what it’s worth, my general view on the matter is as follows.
MOND depends on a single parameter a0 = c^2 sqrt(Lambda)/8. As we let a0 (equivalently, Lambda) tend to 0 in the formulas, the theory tends to Newtonian gravity: for accelerations a >> a0, MOND behaves like Newtonian gravity; and in the limit a0 -> 0, every a lies in that regime.
Now one can try to somehow explain MOND by quantum theory (via Unruh effects or whatever). This means showing that in some secret way the MOND formula depends on hbar. Once that dependence has been made explicit in the formula, every MOND prediction must tend to the Newtonian prediction as hbar tends to 0.
The same principle applies if one tries to find an explanation for MOND involving any other constant instead of hbar.
But Occam’s razor suggests that we don’t need hbar or any other constant for that: a0 has the desired limit property already, and apparently a0 depends only on c^2 sqrt(Lambda), not on hbar or anything else. The crucial point here seems to be Lambda.
As I see it, the main problem is not to find an explanation for MOND in terms of known physics (like Unruh radiation, which is theoretically well understood, although its effects are so tiny that we don’t have any chance to measure them). Something new is needed. The problem is to explain why the same constant Lambda that occurs in Einstein’s equation occurs in MOND, and how we can derive MOND from more fundamental ideas: what’s the full theory here?
Physics has been in similar situations before, with the constants 1/c and hbar instead of Lambda. In each case, a profound new insight was needed.
What we need is a theory that is to Lambda as Special Relativity is to 1/c, and as quantum theory is to hbar. A theory that fully clarifies the meaning of the constant. I believe that MOND is fundamentally about Lambda, and Lambda is not understood.
If we apply Occam’s razor and strip away all unnecessary assumptions, keeping only what cannot be further reduced — the finite speed of light and the finite age of the universe — then dark energy can also be understood through the following logic.
From any observation point, we can construct a light cone composed of historical events. The boundary of this light cone — the particle horizon — is a two-dimensional cross-section separating the observable from the unobservable. The properties of this two-dimensional cross-section closely resemble those of a black hole horizon. If we imagine the entire light cone as a “gravitational well” (where the gravitational force decays as 1/r rather than 1/r², as I discussed above), then we, as observers, sit at the top of this well. Any mass M located below us (and every other point in the universe fits this description) will feel a stronger gravitational pull than we do.
The key lies in this: as M orbits around us, it simultaneously orbits around the bottom of this gravitational well. Could it be that the additional centrifugal acceleration it experiences because it is at the bottom of the well — an acceleration that points precisely toward us, the observers — is the physical origin of a0?
Actually, I’ve approached this question from four or five equivalent physical perspectives, and each time I arrived at nearly the same conclusion. I recently organized these ideas into a preprint, though it’s still a work in progress: numerical verification is ongoing, and I keep finding errata as the revisions go. The core logic was already in place in the earliest draft. What remains is refining the numerical estimates and tightening the presentation. It only offers one possible solution among many, of course. But numerical coincidences can happen once or twice — they shouldn’t happen so many times.
I once attempted to derive the 2π factor from first principles; although the logical chain proved too long for me to guarantee its stability, it nonetheless provided some valuable clues. Interestingly, most derivations of the 2π factor can be condensed into remarkably short logical paths when viewed through the lens of the holographic principle. Perhaps the deepest clues lie not in the mathematics alone, but in finding the right geometric language for the physics.
For anyone curious, the messy draft is here: https://doi.org/10.5281/zenodo.20270500
“… MOND was the only theory to correctly predict in advance that low surface brightness galaxies would fall on the same BTFR as other galaxies.” How many basic victories does Milgromian dynamics score against Newtonian dynamics?
Should the MOND experts make appropriate entries in the Comments section of the following?
“World’s largest ever survey of physicists, results & reaction” (survey completed in August 2025), Phil Halper, May 2026
https://www.youtube.com/watch?v=6B004Gsv9Ks&t=771s (dark matter section starting at 12:51 of 1:13:24)
There is a table comparing MOND and LCDM at
https://astroweb.case.edu/ssm/mond/LCDMmondtesttable.html
The other major project underway is an update to that: expect a MOND white paper in a few months.
I commented on an earlier version of this survey somewhere I cannot now find. I find it amusing that axions now lead WIMPs as DM candidates; this is the natural progression I pointed out in 2008 – see
https://astroweb.case.edu/ssm/darkmatter/WIMPexperiments.html
So here again the field is twenty years behind. More surprising is that MOND and hybrids do as well as they do.
Is there a mass value for missing baryons in clusters that would remove both offsets, the a0 to a++ one, and the BTFR one?
The velocity is under-predicted: the baryons we see explain ~80% of the velocity with MOND. Since M ~ V^4 in MOND, that’s roughly a factor of two in additional mass that’s required. That’s less than we added going from stars only to stars + hot gas, but it is still a lot!
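The “factor of two” follows directly from the fourth-power scaling. A quick sketch of the arithmetic, assuming the observed baryons account for exactly 80% of the velocity; the variable names are illustrative.

```python
# Fraction of the observed velocity that the visible baryons explain in MOND
# (the ~80% figure quoted above, taken here as exactly 0.8).
velocity_fraction = 0.8

# With M ~ V^4, raising the predicted velocity to the observed one requires
# the mass to grow by (V_observed / V_predicted)^4.
mass_factor = (1.0 / velocity_fraction) ** 4

print(f"required mass increase: x{mass_factor:.2f}")
```

The steep V^4 dependence makes the answer sensitive to the input: a velocity fraction of 85% instead of 80% would lower the required factor to about 1.9.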
This is great! Thanks. I downloaded the paper; it’ll take me a while to read. Figure 3 is awesome. So I have two stupid questions. I realize you haven’t discussed MOND in this post, and the questions are about the External Field Effect (EFE). First, at the low mass end, is there any EFE that could account for some of the scatter? (I think a simpler version of this question is: do the Magellanic Clouds have any EFE from the Milky Way?)* The second question is about the EFE in clusters at the high mass end. The gas looks to be ~90% of the mass in the graph. Aren’t all the galaxies in the cluster living in the EFE of the whole cluster? (I’m clearly confused about this.) Also, any EFE in clusters is going to shift things in the ‘wrong’ direction. (I think.)
Thanks for the post, I look forward to the future instalments.
*A perfectly reasonable response is go do the calculation.
Yes. (Yes.) and Yes. You are not confused. The EFE matters in clusters and for dwarfs in the Local Group. It certainly does add to the scatter, but that’s hard to perceive on this scale. It would also push in the wrong direction for the discrepant cases.
Thanks, that’s good to know. The EFE is not like the linear physics I’m more used to thinking about. So if we assume that MOND is correct at all scales, then at some point it’s the total mass of the universe that sets the minimum acceleration limit. TBH that feels somewhat satisfying. I do hope you’ll post more about clusters of galaxies. These are weird beasts I don’t understand. And when I read about them there are always references to dark matter.
“The velocity is under-predicted: the baryons we see explain ~80% of the velocity with MOND. Since M ~ V^4 in MOND, that’s roughly a factor of two in additional mass that’s required. That’s less than we added going from stars only to stars + hot gas, but it is still a lot!” Consider: HYPOTHESIS 1: General relativity theory predicts twice as much gravitational lensing as Newtonian gravitational theory. HYPOTHESIS 2: FUNDAMOND string theory (based on FUNDAMOND inertia) predicts twice as much gravitational lensing as MOND. HYPOTHESIS 3: Guendelman has revolutionized string theory (“Dynamical String Tension Theories with target space scale invariance SSB and restoration”) & provided a highly plausible mathematical basis for FUNDAMOND string theory. HYPOTHESIS 4: Guendelman’s new version of string theory bolsters the case for the inflaton field by providing a mathematical basis for dark-energy inertia, providing an additional destabilization of black holes — this allows black holes to emit both Hawking radiation & waves of inflatons — the waves of inflatons emitted by black holes might explain the dark energy phenomenon. Has Alan Guth studied MOND’s predictions? What does Guth think about MOND? Note that Eduardo Guendelman is one of Guth’s co-authors.
https://www.rankless.org/authors/alan-h-guth
What strikes me is that the BTFR has crossed the line from “interesting correlation” to “boundary condition on theory.” A successful model should not merely fit individual rotation curves or invoke enough hidden variables to absorb the relation; it should explain why baryonic mass and asymptotic velocity know about each other with so little scatter across such different systems. The cluster offset is then especially valuable: not a nuisance, but a regime test.
I agree with you entirely up until the last sentence. I do not know what a regime test is in this instance. What regime? Mass? Potential well depth? Word salad generator?
“… modified inertia appealing …”
String theorists might refuse to consider MOND because MOND is inconsistent with special relativity theory.
Let us consider (type A) FUNDAMOND modified gravitation (MG) theories not based on modified inertia (MI) versus (type B) MI theories based primarily on FUNDAMOND inertia. According to Milgrom:
“ … Who is afraid of modified inertia?
The interpretation of MOND as MI was on the table from the very inception of MOND … Yet, very little work has been done in this promising vein since then – far less than on developing MG theories, and on studying their implications analytically and numerically.
Why is it so?
It is true that MI theories seem to entail more drastic modifications of standard dynamics, and there is a natural tendency to minimize change. To boot, there has been much work done already on modifying general-relativistic gravity from the very advent of GR itself. So, from what I have heard, developers feel much more comfortable with searching and propounding MG theories. It is also true that it has proven rather hard to incorporate the idea of MI in full-fledged theories, based on the standard requirement – symmetries, conservation laws, etc. All this seems to have kept developers away.”
Milgrom, Mordehai. “MOND as manifestation of modified inertia.” arXiv preprint arXiv:2310.14334 (2023). https://arxiv.org/abs/2310.14334
I suggest that we make a maximum effort to convince some of the younger string theorists that Milgrom is the Kepler of contemporary cosmology, and FUNDAMOND string theory with FUNDAMOND inertia is very promising.
If Vf is not due to intrinsic baryon velocity, but rather the associated spectral line width that is used to determine Vf is instead directly related to the same spectral “distortion” that determines the precise inferred baryon temperature, then could we get a tight power law like this without any real dark matter or modified gravity?
I am interested in what kinds of observational entropy could be projected to appear as either DM or MOND behavior, and whether this could support the conjecture that perhaps GR and QFT are complementary descriptions of such observational entropy at extreme scales.
The potential distortion in measurement wouldn’t affect the Newtonian Dynamics of course, but enters at the scale where the deviation in our own acceleration starts to affect the measurement, i.e. a0.
The cosmic connection with a0 may be that the horizon is but a projection – defined by the entropy of the observer’s frame – as is the frame’s acceleration.
So what we know is real, is real. What we don’t know becomes the projection. Sometimes what we don’t know is actually methodically removed information. But has it truly been removed, or just moved?
I have a substrate-level derivation that forces BTFR slope-4 with zero intrinsic scatter from a substrate-cosmology boundary identification; would you read a 3-page summary?
Sorry, but no. I’m sure this seems like a small ask to you, but I get roughly a dozen requests like this every single day. I’m not a fast reader, and these are not easy matters to consider. There’s simply no way.
In the Milgrom paper that Stacy linked above on 16 May, at 9:07 AM, Milgrom repeatedly emphasizes the need for “non-locality” in any viable inertia-based version of MOND. I’m assuming that Milgrom is implying that such a version of MOND would need to satisfy Mach’s Principle (I just skimmed over the paper). Now, Mach’s Principle, as I understand it, is basically non-locality switched on continuously – every particle in the universe continually communicating information to every other particle instantaneously to explain inertia. But quantum mechanics (QM) only allows for a single, ‘one-off’, non-local interaction, where two quantum particles have become entangled, forming a single quantum system, after physically interacting in close proximity at an earlier time. They remain correlated no matter how far apart they become, as long as neither is measured. When the correlated quantum property of one is measured, say spin, then the spin state of the other is instantly determined even if they are a billion light years apart. I’ve looked at a number of papers that claim to incorporate Mach’s Principle, but I strongly doubt that anyone really has come up with a mechanism to allow for its realization in their model.
I am not aware of a viable mechanism to incorporate Mach’s Principle. Einstein reportedly tried hard to include it in GR but ultimately gave up. Neither of those things means it is wrong, just that we don’t understand how it would work – which is, of course, the issue with MOND (and dark matter, for that matter).
I don’t think quantum entanglement is relevant to the non-locality of modified inertia. It does indeed depend on where everything else is, and that’s pain enough, but this does not require constant communication, only that we reside on a non-trivial space-like surface that is defined by where everything else was on the past light cone. So where everything else *was* that could contribute to the now, not where it *is* in this moment in some particular frame.