Previously I had alluded to some of the major projects I’ve been working on. One has come to fruition and can be found on the arXiv and in the Astrophysical Journal&. It has taken many years to assemble the data in this paper, during which time the models purporting to explain some of it have evolved considerably while consistently failing to address the real problems they raise. There is a lot to explore, so it will take more than one post.
Here I start with the empirical basis: the stellar mass and baryonic Tully-Fisher relations. The Tully-Fisher relation was originally discovered as a relation between luminosity and linewidth in rotationally supported galaxies – spirals and irregulars. It immediately proved useful as an extragalactic distance indicator. As such, it was instrumental in breaking the impasse in the Hubble constant* debate (back when it was 50 vs. 100, not 67 vs. 73), and it remains useful in this role.
Physically, the obvious interpretation was that luminosity is a proxy for stellar mass and linewidth*^ is a proxy for rotation speed. This is correct. Of the various rotation speeds one can define and measure, the one that works best, in terms of minimizing the scatter in the relation, is the flat rotation speed measured in the outer parts of extended rotation curves. See Stark et al. (2009) and Trachternach et al. (2009) for further examples. The scatter is basically a function of data quality.
On the mass axis, converting measured flux to luminosity to mass is a bit dicier, as we need to know the distance for the first step and the stellar mass-to-light ratio for the second. There is inevitably some intrinsic scatter in the mass-to-light ratio of a stellar population. While I don’t doubt that luminosity is a proxy for stellar mass, improving on it is hard to do: there are many instances in which simply assuming a straight mapping of light to mass can be as effective as applying fancier population models. We might^ finally be getting past that, so it is worth discussing a bit.
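To make the two steps concrete, here is a minimal sketch of the flux-to-luminosity-to-mass conversion. Everything numerical in it is an assumption for illustration: the flux and distances are made up, the mass-to-light ratio of 0.5 is just a placeholder, and the bolometric solar luminosity stands in for what would really be a band-specific solar value. The function names are hypothetical.

```python
import math

MPC_IN_M = 3.0857e22   # metres per megaparsec
L_SUN_W = 3.828e26     # nominal solar luminosity in watts (bolometric; a simplification)

def luminosity_lsun(flux_w_m2, distance_mpc):
    """Step 1: flux -> luminosity via the inverse-square law, in solar units."""
    d_m = distance_mpc * MPC_IN_M
    return 4.0 * math.pi * d_m**2 * flux_w_m2 / L_SUN_W

def stellar_mass_msun(flux_w_m2, distance_mpc, mass_to_light=0.5):
    """Step 2: luminosity -> stellar mass with an ASSUMED mass-to-light ratio."""
    return mass_to_light * luminosity_lsun(flux_w_m2, distance_mpc)

# Hypothetical galaxy: the flux and distances below are invented for illustration.
flux = 1.0e-13                             # W / m^2
m_ref = stellar_mass_msun(flux, 10.0)      # adopted distance of 10 Mpc
m_far = stellar_mass_msun(flux, 11.0)      # same flux, distance 10% larger

print(f"stellar mass at 10 Mpc: {m_ref:.3e} Msun")
print(f"mass ratio for a 10% larger distance: {m_far / m_ref:.2f}")
```

Because luminosity scales as distance squared, a 10% distance error becomes a 21% error in mass before the mass-to-light ratio even enters, which is why the first step matters as much as the second.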
The procedure to convert starlight into stellar mass involves the construction of stellar population models that use the color(s) or spectral energy distribution of a galaxy to infer the types of stars that make the light. This is a long-argued subject; suffice it to say there are a number of points where it can go wrong. The most obvious is the IMF: the initial mass function, the spectrum of masses with which stars are born. Most of the light we see from a galaxy is produced by its higher mass stars, which are disproportionately bright (there is a steep scaling of stellar luminosity with mass). But most of the mass is locked up in low mass stars that contribute little to the total luminosity. So we are, in effect, using the light of the few to represent the mass of the many. That goes badly wrong if we don’t know the relative mix, i.e., the shape of the IMF. This has been the subject of much research, and over many decades has been narrowed down pretty well. While I hope that this is almost settled, the specter of the IMF lurks as a menace to all stellar mass determinations.
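A toy calculation shows how lopsided "the light of the few, the mass of the many" can be. This is only a sketch under loudly stated assumptions: a single power-law IMF with a Salpeter-like slope, a luminosity that scales as mass to the 3.5 power at all masses, mass limits of 0.1 and 100 solar masses, and a population in which every star is still alive. None of these choices come from the paper, and real population models are far more careful.

```python
def powerlaw_integral(exponent, lo, hi):
    """Integral of m**exponent dm from lo to hi (valid for exponent != -1)."""
    p = exponent + 1.0
    return (hi**p - lo**p) / p

# All of the following are ASSUMED toy values, not fitted quantities.
IMF_SLOPE = -2.35                        # dN/dm proportional to m**-2.35
LUM_INDEX = 3.5                          # L proportional to m**3.5
M_MIN, M_SPLIT, M_MAX = 0.1, 1.0, 100.0  # solar masses

def fraction_below_split(weight_index):
    """Share of the IMF-weighted total of m**weight_index from stars below M_SPLIT."""
    exponent = IMF_SLOPE + weight_index
    low = powerlaw_integral(exponent, M_MIN, M_SPLIT)
    total = powerlaw_integral(exponent, M_MIN, M_MAX)
    return low / total

mass_frac_low = fraction_below_split(1.0)         # weight each star by its mass
light_frac_low = fraction_below_split(LUM_INDEX)  # weight each star by its luminosity

print(f"mass in stars below {M_SPLIT} Msun:  {mass_frac_low:.1%}")
print(f"light from stars below {M_SPLIT} Msun: {light_frac_low:.4%}")
```

With these assumed numbers, stars below one solar mass hold roughly 60% of the mass while contributing a negligible share of the light. The contrast is exaggerated because the toy population still contains its most massive, short-lived stars, but the sense of the problem is the same in aged populations.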
There is a lot else we need to know to build a stellar population model. This includes such essentials as the spectra of individual stars of each and every type and stellar evolution as a function of mass and composition including exotic phases like the asymptotic giant branch. There are a lot of places where this can go badly wrong, and sometimes^% does. So I wouldn’t say we know how to do this perfectly, but we have become pretty good at it.
Converting light to mass suffices to plot the stellar mass Tully-Fisher relation. That accounts for most of the baryonic mass of high mass spirals, but it ignores the mass of the interstellar gas. This can be appreciable in lower mass systems. Indeed, the standard issue dwarf galaxy in the field is more gas than stars:

With measurements of mass and rotation speed, we can construct the Tully-Fisher relation:

The stellar mass Tully-Fisher relation is a good correlation by the standards of extragalactic astronomy. The majority of studies in the literature are restricted to massive% galaxies, mostly those with M* > 10¹⁰ M☉ where stars dominate the baryonic mass budget so the omission of gas is not obvious. As we look to lower masses, the relation bends and the scatter increases. That this happens right where gas starts to become important to the mass budget suggests that we’re missing an important component, and voila – a nice, continuous relation that is linear in log space is restored when we plot the baryonic mass Mb = M* + Mg. Indeed, the data are consistent with a simple power law
Mb = A Vf⁴
with A = 50 M☉ km⁻⁴ s⁴. The intercept A has consistently been measured within 10% of this value over the past couple of decades. That this is an integer power law so that the intercept has real physical units is intriguing. That doesn’t happen in most astronomical scaling laws, which are usually more happenstance, like the mass-luminosity relation for main sequence stars.
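For readers who want to play with the numbers, here is a minimal sketch of the relation, taking the slope to be exactly 4 (as the units of A imply) and using the quoted normalization. The function names and the example star and gas masses are invented for illustration.

```python
A_BTFR = 50.0  # Msun km^-4 s^4, the normalization quoted in the text

def baryonic_mass(m_star_msun, m_gas_msun):
    """Baryonic mass as the sum of stars and gas, Mb = M* + Mg."""
    return m_star_msun + m_gas_msun

def btfr_mass(v_flat_kms, a=A_BTFR):
    """Baryonic mass (Msun) implied by a flat rotation speed (km/s), assuming slope 4."""
    return a * v_flat_kms**4

def btfr_velocity(m_b_msun, a=A_BTFR):
    """Flat rotation speed (km/s) implied by a baryonic mass (Msun), assuming slope 4."""
    return (m_b_msun / a) ** 0.25

for v in (50.0, 100.0, 200.0):
    print(f"Vf = {v:5.0f} km/s  ->  Mb = {btfr_mass(v):.2e} Msun")

# Hypothetical gas-rich dwarf: these masses are made up.
m_b = baryonic_mass(m_star_msun=2.0e8, m_gas_msun=8.0e8)
print(f"Mb = {m_b:.1e} Msun  ->  Vf = {btfr_velocity(m_b):.1f} km/s")
```

A flat speed of 100 km/s corresponds to 5 × 10⁹ M☉, and doubling the speed raises the mass by a factor of 16, which is what makes velocity errors so costly on the mass axis.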
Why limit ourselves to rotationally supported galaxies? Let’s plot every known type of gravitationally bound extragalactic object, from the smallest ultrafaint dwarfs to the largest clusters of galaxies. Note that I’ve flipped the axes to accommodate the huge dynamic range in baryonic mass, roughly twelve (12) orders of magnitude. This is like having gnats at one end of the scale and blue whales at the other. On that scale, a person is a regular galaxy like the Milky Way.

One improvement from twenty years ago, aside from the greater number of objects and the increase in dynamic range, is the accuracy of the mass measurements. I tried a number of prescriptions for the stellar mass-to-light ratio in McGaugh (2005), which resulted in a range of possible slopes. Now we just use the stellar mass from precise population models (Duey et al. 2025) and recover my best estimate from back then. The room to dodge the obvious conclusion about the slope of the relation by complaining about the choice of stellar mass estimator – a popular course of action back then – is gone. Another technical issue we’ve spent a lot of effort working on is how to put all these very different systems on the same scale of Vf. I won’t elaborate on this here: if you’re interested in that level of detail, you can go read the paper and references therein. If we got this wrong, it would add to the scatter in the relation, and/or create offsets between different types of data.
Both of the extended Tully-Fisher relations, that in stellar mass (top panel) and that in baryonic mass (bottom panel, the extended BTFR) are good correlations. That in baryonic mass is clearly better in the sense that it is tighter over a larger dynamic range. From small dwarf galaxies (Mb ~ 5 × 10⁵ M☉) to groups of galaxies (5 × 10¹² M☉), the data are consistent with a single power law (Mb ~ Vf⁴) for all systems with remarkably little scatter. Outside this range, the data for both the lowest and the highest mass systems deviate from a straight line towards higher mass at a given flat velocity. I don’t put much credence in the smallest systems as I think there is little chance that their measured velocity dispersions are representative of their equilibrium gravitational potential. For all practical purposes, our knowledge runs out as we hit the regime of ultrafaint# dwarfs. The deviations of the most massive systems, clusters of galaxies, are more difficult to dismiss.
Restricting our attention for the moment to the range where a single power law suffices to describe the data, we note that there is not much scatter in the BTFR. Some of it is from random uncertainties; these dominate most studies and lead to a lot more scatter than seen here: these data are very good. We can account for the known observational errors and subtract off their contribution to estimate the intrinsic scatter in the relation. This is the variance of the data from a perfect line. The intrinsic scatter for the best data (the WISE-SPARC sample of Duey et al. 2026) is about 0.11 dex in mass – about what we expect$ for stellar populations. That doesn’t leave much room for other sources of scatter, so the underlying physical relation has to be very tight indeed: essentially perfect over the range 5 × 10⁵ < Mb < 5 × 10¹² M☉.
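The idea of subtracting off the observational errors can be sketched with a simple moment estimate: take the variance of the residuals about the line, remove the mean squared measurement error, and whatever is left is attributed to intrinsic scatter. This is an assumption-laden illustration, not necessarily the procedure used in the paper, and the residuals and errors below are invented.

```python
import math

def intrinsic_scatter(residuals_dex, errors_dex):
    """Moment estimate of intrinsic scatter (dex): observed variance minus mean error variance."""
    n = len(residuals_dex)
    mean = sum(residuals_dex) / n
    observed_var = sum((r - mean) ** 2 for r in residuals_dex) / (n - 1)
    error_var = sum(e ** 2 for e in errors_dex) / n
    excess = observed_var - error_var
    # If the errors already explain all of the scatter, report zero rather than a complex number.
    return math.sqrt(excess) if excess > 0.0 else 0.0

# Made-up residuals in log10(Mb) about the fitted line, with made-up 1-sigma errors.
residuals = [-0.20, 0.00, 0.20]
errors = [0.10, 0.10, 0.10]

print(f"intrinsic scatter: {intrinsic_scatter(residuals, errors):.3f} dex")
```

The estimate is only as good as the error bars fed into it: overestimated errors hide real scatter, and underestimated errors manufacture it.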
Scatter will also occur if our mass budget is incomplete. We can see this in the transition from the stars-only relation to the BTFR. There is a lot of scatter in the stellar mass Tully-Fisher relation around 10⁷ < Mb < 10⁹ M☉. Galaxies in this mass range are sometimes star-dominated and sometimes gas-dominated. The gas fraction is all over the place. This shows up as scatter in the stellar mass Tully-Fisher relation. That’s not real; it is a sign that we’ve missed an important mass reservoir. This is cured when we add in the gas mass, which is dominated by atomic gas (HI to spectroscopists and astronomers). That this addition removes the scatter and restores a single power law relation strongly suggests that there are no further substantial reservoirs** of baryonic material that we’re missing.
This logic applies to other systems as well. Bright spirals do not need much correction because their baryonic mass is dominated by stars. Their stellar mass Tully-Fisher relation is pretty much already their BTFR.
Perhaps this applies to clusters of galaxies as well? There was a huge correction from stars-only to stars plus gas. The gas in this case is the hot, ionized plasma of the intracluster medium (ICM) that belongs to the cluster itself and not any individual galaxy within it. That goes most of the way to close the gap between the stars-only cluster data and the extrapolation of the BTFR fit to individual galaxies, but not all the way. So perhaps we are still missing an important baryonic mass component? It happened before – we didn’t know about the ICM for decades after Zwicky first identified the missing mass problem in clusters – so perhaps there are still more baryons to discover there.
It could also be that the apparent offset occurs because we’ve failed to put clusters on the same Vf scale as galaxies. This is not easy to do, and we’ve spent a lot of time worrying about it. I don’t think this is what’s going on, though it would make my life a lot simpler if it were. Different indicators – dynamics vs. ICM hydrostatics vs. gravitational lensing – can give somewhat different answers, but not in a way that “fixes” the problem: I see no viable path in which the offset turns out to be a simple difference in the way the depth of the gravitational potential is measured. I would love to be wrong here, but I’m not dismissing the offset for clusters as I am for ultrafaint dwarfs (which I don’t do lightly).
Perhaps the extrapolation of the BTFR from individual galaxies to clusters is simply not appropriate. They’re very different kinds of systems, after all. To dig into that, we need some theoretical perspective – why does the observed power law happen? Should we expect different systems to share the same BTFR?
Theory is something I’ve studiously avoided in this post: the possibility that there are baryons that remain to be discovered in clusters can be inferred empirically. All the other data line up, so why not clusters? But unless and until these hypothetical additional baryons are discovered, that’s just one possibility. How likely this possibility seems to be diverges rapidly once we overlay a theoretical preference, which I will leave to future posts. (I did warn it would take more than one.)
&This paper appears in ApJ volume 1001. The literature has grown quite a bit since I started contributing to it in volume 342. The Astrophysical Journal was founded in 1895. So I’ve been contributing to it for a little over a quarter of its temporal existence, but nearly twice the number of volumes have been published in that shorter time. It’s no wonder none of us can keep up.
*Indeed, Tully & Fisher’s “preliminary estimate of the Hubble constant is H0 = 80 km/s/Mpc” remains correct to this day, within the uncertainties (hard to estimate at the time, but roughly ±10 km/s/Mpc).
*^There appears to be an irreducible intrinsic scatter in the linewidth: it is not a perfect proxy for rotation speed. Linewidths are observationally easier to obtain than resolved, extended rotation curves, so the numbers of galaxies in samples using linewidths can be very large without ever approaching the quality provided by resolved interferometric observations. Bigger samples are not necessarily better.
^I emphasize might here because the community seems to have moved towards reporting stellar masses as if we observe these rather than the luminosities and colors/SEDs that the mass estimates are based upon. The latter are data – observed quantities – while stellar masses are a derived quantity that is inevitably model dependent. This doesn’t stop being true just because we decide to invest a lot of faith in our models.
^%The Sloan Digital Sky Survey provides stellar masses based on models that are known to be wrong in the near infrared. Since SDSS itself is entirely optical, one might not notice. If one mixes SDSS data with near-IR data, one will get the wrong answer.
%This is a classic selection effect. Brighter objects can be seen at a much greater distance than dim ones, so probe a much larger volume. Consequently, their raw numbers always dominate surveys even if their number density is low. Stars are a great example: most of the stars you can see at night are intrinsically luminous: bright stars that are rather far away. Mundane, low mass stars do not stand out even when nearby.
#This isn’t for lack of observations of ultrafaint dwarfs, it’s the underlying assumptions.
$No amount of information suffices to perfectly specify the stellar mass that produces an observed luminosity and SED (spectral energy distribution/set of colors), so one always expects at least some intrinsic scatter in the stellar mass-to-light ratio. I’ve seen estimates that range from 0.1 – 0.2 dex for near-IR colors. That’s as good as it can get as there is always some transient population (e.g., AGB stars) that produces an amount of light that depends on the star formation rate some time ago, not what we measure now. Optical colors are worse in the sense of having more intrinsic scatter, as they are more susceptible to the comings and goings of bright but short-lived stars whose numbers fluctuate with the stochastic star formation rate. Finding 0.11 dex intrinsic scatter is pretty much as good as it can get. (By dex we mean the scatter in log space.)
**We noted this effect in the original BTFR paper to argue that it was unlikely that we were missing substantial amounts of molecular gas (H2), which was a concern at the time. Flash forward, and we were right: the molecular gas mass is almost always a distant third behind stars and atomic gas in the baryonic mass budgets of individual galaxies. Nowadays, the concern is about the mass of baryons in the circumgalactic medium (CGM). That’s getting ahead of the story, which I’ll save for a future post. For now, it suffices to note that any baryonic mass in the CGM is far beyond the radius where the flat velocity is measured, so is not relevant to the sums here.
Does not the recent Zhang et al paper (https://arxiv.org/abs/2602.06082) bring the clusters at the high end of Fig 3 closer to the orange line? Supported by analysis of 46 nearby galaxy clusters, they postulate that stellar remnants make up much of the apparent mass deficit. They argue for MOND with top-heavy IMFs.
Yes, it certainly does, if we adopt their mass estimates. We had already made our own mass estimates, which are what is shown here. Their solution falls into the category of baryonic dark matter that I allude to: there is more baryonic mass than meets the eye.
Stubborn! The BTFR endures in spite of our best efforts to dismiss it. For those of us with backgrounds in analytical chemistry and particle analysis, endurance of a spectral width/mass relationship over this range of magnitudes is unfathomable. Something is right about MOND. Something is wrong about gravity.
The success of MOND at galactic scales may be evidence against universality, not evidence for a better universal law.
That distinction is key.
Because once you interpret MOND as a contextual emergent closure instead of a replacement universal law, the cluster problem stops being an embarrassment needing hidden particles (the exact same mindset leading to fictional dark matter).
It becomes a regime boundary signal.
I haven’t talked about MOND in this post. Just looking at the data, there is a remarkable uniformity that is nigh-on universal. Maybe it isn’t the single power law predicted by MOND – one can imagine other functional forms – but the deviations are small and restricted to a small mass range at either end of the relation. A single power law is a lot closer to being universal than it has any right to be, so portraying this as evidence against universality requires ignoring the forest to complain about a few outlying trees.
The point is not that clusters are numerous enough to invalidate the relation. It is that they are dynamically different enough that their systematic deviations may carry disproportionate conceptual significance. It seems unlikely to be a coincidence that both the extended BTFR and MOND itself begin to struggle specifically in clusters.
Yes, that’s certainly a possibility, as is the presence of still-unidentified baryonic mass.
What is really interesting about the BTFR is that it under-predicts “baryonic” mass at the low end, and is essentially dead-on accurate once there is sufficient sampling of rotational and dispersion support in quadrature. It is also interesting that the “observed” baryon fraction systematically increases with observed object core temperatures. Clusters/super-clusters have a ~15-16% baryon fraction, HSB galaxies are in the 5-10% (ish) range, and LSBs/dwarfs are in the 1-3% range.
What is fascinating is that when I model DM as a family of ordinary baryons with their own ladder of ionization energies, for rotationally supported galaxies, it recovers a slope 4 BTFR by assuming DM is arranged in an extended axisymmetric disk, strongly gravitationally coupled with the luminous mass distribution. The exact slope 4 comes from truncated Mestel geometry and the normalization comes from a marginal stability condition. It competes really well on SPARC and produces a scalable dimensionless kernel. So the disk halo “conspiracy” is essentially strongly coupled but stratified disk geometry, and subsequent shapes like ellipticals and spheroidals come from environmental processing. Galaxies are actually “born as disks” from the outset, NOT as NFW halos – they EVOLVE to other morphologies via tides, stripping, wet and dry mergers. This is WHY spirals and thin disks dominate by number count, especially in the field. The BTFR is a huge clue. No gravity modification needed, geometry suffices.
“The Tully-Fisher relation was originally discovered as a relation between luminosity and linewidth in rotationally supported galaxies – spirals and irregulars. It immediately proved useful as an extragalactic distance indicator.” Why are Tully & Fisher not Nobel Laureates in Astronomy?
“The Baryonic Tully-Fisher Relation of Gas Rich Galaxies as a Test of LCDM and MOND”, S. McGaugh, 2011 https://arxiv.org/abs/1107.2934
“The deep-MOND limit — a study in Primary vs secondary predictions”, M. Milgrom, 2025 https://arxiv.org/abs/2510.16520
Hypothesis 1. Tully & Fisher are Nobel-Prize-less because there is no FUNDAMOND theory that string theorists view favorably.
Hypothesis 2. General relativity theory predicts twice as much gravitational lensing as Newtonian gravitational theory.
Hypothesis 3. FUNDAMOND string theory predicts twice as much gravitational lensing as MOND.
What is the McGaugh opinion of the 3 preceding hypotheses?
There is no Nobel prize in astronomy. In Nobel’s time, astronomy and mathematics were considered the same field. Story is that his wife ran off with an astro/mathematician, dooming those of us in either field to his eternal snub.
Since Ryle & Hewish, I suggest that the “Nobel Prize in Physics” is really the “Nobel Prize in Physics, Astrophysics, or Astronomy”.
As of 2025, 10 astrophysicists have won Nobel Prizes in Physics.
“Verification of the anecdote about Edwin Hubble and the Nobel Prize”, Kohji Tsumura, 2017
https://arxiv.org/abs/1705.10125
“The Nobel prizes in physics for astrophysics and gravitation and the Nobel prize for black holes: Past, present, and future”, José P. S. Lemos, 2021
https://arxiv.org/abs/2112.14346
Yes, but it is spotty. Radio astronomers are more likely to be recognized than traditional optical astronomers for reasons that I suspect have more to do with the social interconnectedness (or not) of the fields than anything else.
Some laws are limited, some look universal. No-one can argue that all are limited, because we don’t have them all – let’s look at it again in 500 years. The TFR was the 20th century version, the BTFR, which you put on the map in 2000, is so tight and covers so many scales, it’s possible to turn it around in some situations, assume the law is true, and work from there. That gives an estimate for the missing baryons in clusters for instance, though as you say, clusters may be too different to fit well.
But there’s the question of whether when we turn it around, we should take the rule to be as we find it nearby. It seems to me that because further out it effectively needs adjusting by a simple factor multiplied in (which increases with distance, and may scale with 1 + z), there’s a need to explain why a galaxy of the same mass spins faster at a higher z.
This is particularly so because at very high redshifts galaxy formation times seem way faster than expected. There are also other things (I’ve been working on a curve that brings together data from various areas [to discuss things and possibly work together, get in touch] with a conceptual explanation underneath it). With the BTFR, do you think these terms might need to be taken separately? It’s arguable that we should call them adjustments to a simpler theory, rather than a built-in component of a more complicated one.
Yes, one has to be careful about turning it around. I also think it is early days for data at high redshift. While it is clear that galaxies became massive earlier than expected in LCDM, the kinematics are still in dispute. My reading is that the best evidence so far shows no hint of evolution in the BTFR – galaxies of the same mass spin at the same rate at high z, not faster. On the other hand, the linear sizes of galaxies are smaller by (1+z), which rings alarm bells.
Sorry, it seems that LCDM predicts evolution of the BTFR, MOND says it doesn’t, and measurements tend to agree with MOND. Is the linear size of galaxies inversely proportional to 1 + z across a large range, and accurately?
The size evolution has been seen by different workers, and seems to be real – and really weird – but I’m not clear on how much we Have to believe it yet.
The circumstellar disks around stars also form spirals, and these disks contain dust and gas. Perhaps this is similar to the formation principle of spiral galaxies. Spiral galaxies are mainly composed of a disk of many stars and gas. When such a structure rotates, it cannot be simply calculated using universal gravitation. Perhaps it will naturally form a spiral shape, thus explaining the flat velocity. It may require supercomputer simulation.
There is some similarity in morphology. There’s also an important difference in the mass distribution. The circumstellar disk around a star is completely dominated by the gravity of the central star while a spiral galaxy has an extended mass distribution that involves dark matter (whatever that means).
“While it is clear that galaxies become massive earlier than expected in LCDM, the kinematics are still in dispute.” Is the ΛCDM propaganda mill far more powerful than the MOND propaganda mill?
According to Prof. Turner of the U. of Chicago, “The fact the need for dark matter in galaxies occurs at a universal acceleration is remarkable and can be accounted for in ΛCDM … Further, the adherents of MOND have not been successful in turning MOND into a relativistic theory of gravity that can match the many successes of ΛCDM beyond galactic rotation curves, let alone make new predictions that can decisively test it. Moreover, the evidence for dark matter today goes well beyond galactic rotation curves, and now includes the dark matter in clusters of galaxies, the CMB determination of the matter and baryon densities, and the necessity of dark matter for explaining the large-scale structure of the Universe.”
Turner, Michael S. “Everyone wants something better than ΛCDM.” Proceedings of the National Academy of Sciences 123, no. 8 (Feb. 13, 2026): e2526436123. https://www.pnas.org/doi/abs/10.1073/pnas.2526436123
https://arxiv.org/abs/2510.05483
It seems to me that dark matter particles might be endowed with more and more surprising properties (perhaps leading all the way to miracles) — however, I think Prof. Turner has woefully underestimated the importance of MOND. There is still about 1 month remaining for a MOND guru to submit a Letter to the Editor of the Proceedings of NAS to specifically dispute the MONDian section of the article “Everyone wants something better than ΛCDM” & present a pro-MOND perspective.
Turner isn’t underestimating the importance of MOND, he lives in abject terror of it, because it represents the death knell of his religion – hence the need he feels to denigrate and downplay it at every opportunity. He is the prime example of a scientist turned religious zealot; there is no evidence that could persuade him, and he is willfully blind to all that contradicts his preferred outcome. It is not a good use of my time to argue with walls or scientific bigots.
So you let him talk, since scientific investigations will continue anyway. It is a pity that the general opinion has to progress only due to the demise of the old generation. I’ve seen a picture Milgrom used though, with a Titanic (DM) and the iceberg (MOND); they have been warned sufficiently many times. Ah well, the Gaia wide binary data will pretty surely convince many good scientists after it is released, I guess. I don’t buy Banik’s approach anymore, thanks to the careful analysis of Chae and Hernandez.
This is very interesting, the extended TFR reaching into new regimes raises a natural question: is the scatter purely observational, or does it carry dynamical structure?
In the Gravitational Memory programme (seven papers, Zenodo), we find that adding a coherence state ψ to the BTFR yields ΔAIC = -24.8 on 143 SPARC galaxies, and the BTFR residual correlates with ψ_mem, the component of ψ not reconstructible from instantaneous observables (ρ = -0.480, p < 10⁻⁴).
More broadly, the RAR scatter itself separates into two dynamical branches when sorted by ψ (KS p = 6 × 10⁻⁹⁵). At g_bar < a₀, ψ separates G_eff into two populations: 4.9 vs 1.6 (p = 5 × 10⁻¹⁰). This suggests that low acceleration is necessary but not sufficient, dynamically organised coherence is additionally required.
This would imply that both the TFR and RAR are not single universal relations but statistical projections of multiple coherence states. Systems without organised rotational structure (wide binaries, DF2-class UDGs) remain Newtonian regardless of their acceleration regime.
A validation companion with falsification criteria: https://zenodo.org/communities/gravitational-memory
I would welcome any assessment of whether the extended TFR data show ψ-dependent structure in their residuals.
Javier Meizoso Fernández
There are no perceptible residuals from the BTFR. There can, of course, always be something below the detection limit. But that’s pretty tight now. Restricting myself to rotating galaxies for which we have the best data, the scatter is now limited by that in the stellar population mass-to-light ratios. We are unable to perceive anything below this limit, but the possibilities are strongly restricted by this limit.