Previously I had alluded to some of the major projects I’ve been working on. One has come to fruition and can be found on the arXiv and in the Astrophysical Journal&. It has taken many years to assemble the data in this paper, during which time the models purporting to explain some of it have evolved considerably while consistently failing to address the real problems they raise. There is a lot to explore, so it will take more than one post.

Here I start with the empirical basis: the stellar mass and baryonic Tully-Fisher relations. The Tully-Fisher relation was originally discovered as a relation between luminosity and linewidth in rotationally supported galaxies – spirals and irregulars. It immediately proved useful as an extragalactic distance indicator. As such, it was instrumental in breaking the impasse in the Hubble constant* debate (back when it was 50 vs. 100, not 67 vs. 73), and it remains useful in this role.

Physically, the obvious interpretation was that luminosity is a proxy for stellar mass and linewidth*^ is a proxy for rotation speed. This is correct. Of the various rotation speeds one can define and measure, the one that works best, in terms of minimizing the scatter in the relation, is the flat rotation speed measured in the outer parts of extended rotation curves. See Stark et al. (2009) and Trachternach et al. (2009) for further examples. The scatter is basically a function of data quality.

On the mass axis, converting measured flux to luminosity to mass is a bit dicier, as we need to know the distance for the first step and the stellar mass-to-light ratio for the second. There is inevitably some intrinsic scatter in the mass-to-light ratio of a stellar population. While I don’t doubt that luminosity is a proxy for stellar mass, improving on it is hard to do: there are many instances in which simply assuming a straight mapping of light to mass can be as effective as applying fancier population models. We might^ finally be getting past that, so it is worth discussing a bit.

The procedure to convert starlight into stellar mass involves the construction of stellar population models that use the color(s) or spectral energy distribution of a galaxy to infer the types of stars that make the light. This is a long-argued subject; suffice it to say there are a number of points where it can go wrong. The most obvious is the initial mass function (IMF): the spectrum of masses with which stars are born. Most of the light we see from galaxies is produced by their higher mass stars, which are disproportionately bright (there is a steep scaling of stellar luminosity with mass). But most of the mass is locked up in low mass stars that contribute little to the total luminosity. So we are, in effect, using the light of the few to represent the mass of the many. That goes badly wrong if we don’t know the relative mix, i.e., the shape of the IMF. This has been the subject of much research, and over many decades has been narrowed down pretty well. While I hope that this is almost settled, the specter of the IMF lurks as a menace to all stellar mass determinations.

There is a lot else we need to know to build a stellar population model. This includes such essentials as the spectra of individual stars of each and every type and stellar evolution as a function of mass and composition including exotic phases like the asymptotic giant branch. There are a lot of places where this can go badly wrong, and sometimes^% does. So I wouldn’t say we know how to do this perfectly, but we have become pretty good at it.

Converting light to mass suffices to plot the stellar mass Tully-Fisher relation. That accounts for most of the baryonic mass of high mass spirals, but it ignores the mass of the interstellar gas. This can be appreciable in lower mass systems. Indeed, the standard issue dwarf galaxy in the field is more gas than stars:

Figure 1 from McGaugh et al. (2019): The gas and stellar masses of rotating galaxies. Blue points are galaxies in the SPARC database (Lelli et al. 2016b) and the gas rich galaxies discussed by McGaugh (2012). The location of the Milky Way is noted in red (McGaugh 2016): it is a typical bright spiral. Grey points are the sample of Bradford et al. (2015). The line is the line of equality where M* = Mg.

With measurements of mass and rotation speed, we can construct the Tully-Fisher relation:

Figure 4 from McGaugh et al. (2019): The stellar mass (left) and baryonic Tully-Fisher relation (right). Data from Lelli et al. (2016b) and McGaugh (2012) are shown as blue points if both axes are measured with at least 20% accuracy; less accurate data are shown in grey. The latter include cases for which the rotation curve does not extend far enough to measure Vf, in which case the last measured point is used. These cases are systematically offset to lower velocity. Inclination uncertainties and distance errors also contribute to the scatter. The better the data, the tighter the relation. The location of the Milky Way is noted in red (you are here).

The stellar mass Tully-Fisher relation is a good correlation by the standards of extragalactic astronomy. The majority of studies in the literature are restricted to massive% galaxies, mostly those with M* > 10¹⁰ M☉ where stars dominate the baryonic mass budget so the omission of gas is not obvious. As we look to lower masses, the relation bends and the scatter increases. That this happens right where gas starts to become important to the mass budget suggests that we’re missing an important component, and voila – a nice, continuous relation that is linear in log space is restored when we plot the baryonic mass Mb = M* + Mg. Indeed, the data are consistent with a simple power law

$$M_b = A \, V_f^4$$

with A = 50 M☉ km⁻⁴ s⁴. The intercept A has consistently been measured within 10% of this value over the past couple of decades. That this is an integer power law so that the intercept has real physical units is intriguing. That doesn’t happen in most astronomical scaling laws, which are usually more happenstance, like the mass-luminosity relation for main sequence stars.
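As a quick numerical illustration, the sketch below evaluates the power law in both directions. The function names are my own invention, and the only inputs are the slope of 4 and the intercept of 50 quoted here; the velocities are round illustrative numbers, not measurements.

```python
# Minimal sketch of the BTFR as a pure power law, M_b = A * V_f**4.
# Function names are hypothetical; only A and the slope come from the text.
A_BTFR = 50.0  # intercept in Msun km^-4 s^4


def baryonic_mass(v_flat_kms: float) -> float:
    """Baryonic mass (Msun) predicted for a flat rotation speed (km/s)."""
    return A_BTFR * v_flat_kms ** 4


def flat_velocity(m_baryon_msun: float) -> float:
    """Inverse relation: flat rotation speed (km/s) for a baryonic mass (Msun)."""
    return (m_baryon_msun / A_BTFR) ** 0.25


if __name__ == "__main__":
    # Illustrative velocities only.
    for v in (50.0, 100.0, 200.0):
        print(f"V_f = {v:5.0f} km/s  ->  M_b = {baryonic_mass(v):.2e} Msun")
```

The steepness is the thing to notice: doubling the flat velocity raises the predicted baryonic mass by a factor of 16, so a 200 km/s galaxy comes out at 8 x 10¹⁰ M☉.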

Why limit ourselves to rotationally supported galaxies? Let’s plot every known type of gravitationally bound extragalactic object, from the smallest ultrafaint dwarfs to the largest clusters of galaxies. Note that I’ve flipped the axes to accommodate the huge dynamic range in baryonic mass, roughly twelve (12) orders of magnitude. This is like having gnats at one end of the scale and blue whales at the other. On that scale, a person is a regular galaxy like the Milky Way.

Figure 3 from McGaugh et al. (2026): Extended Tully-Fisher relations plotting the flat-equivalent circular velocity of extragalactic systems as a function of stellar mass (top panel) and baryonic mass (bottom panel). Data for rotationally supported galaxies are depicted by circles; squares represent pressure supported systems. The blue circles are galaxies with directly measured distances, Vf from rotation curves, and stellar masses from WISE photometry from Duey et al. (2026, in preparation). Green circles are gas-rich galaxies (Mg > M*; Stark et al. 2009; Trachternach et al. 2009; Bernstein-Cooper et al. 2014; McNichols et al. 2016; Iorio et al. 2017; Namumba et al. 2025; Xu et al. 2025) not already in Duey et al. (2026). Yellow points are Local Group galaxies, both spirals and dwarfs (McGaugh et al. 2021); gray squares are ultrafaint dwarfs (Lelli et al. 2017). Lensing results for early- and late-type galaxies (Mistele et al. 2024a) are shown as pink squares and magenta circles, respectively. Red squares are clusters of galaxies (Mistele et al. 2025), and purple squares are groups of galaxies (McGaugh et al. 2026). The orange line is the BTFR fit only to rotating galaxies over a more limited range (about three orders of magnitude in baryonic mass, from Mb ~ 4 x 10⁸ to 4 x 10¹¹ M☉) by McGaugh (2005).

One improvement from twenty years ago, aside from the greater number of objects and the increase in dynamic range, is the accuracy of the mass measurements. I tried a number of prescriptions for the stellar mass-to-light ratio in McGaugh (2005), which resulted in a range of possible slopes. Now we just use the stellar mass from precise population models (Duey et al. 2025) and recover my best estimate from back then. The room to dodge the obvious conclusion about the slope of the relation by complaining about the choice of stellar mass estimator – a popular course of action back then – is gone. Another technical issue we’ve spent a lot of effort working on is how to put all these very different systems on the same scale of Vf. I won’t elaborate on this here: if you’re interested in that level of detail, you can go read the paper and references therein. If we got this wrong, it would add to the scatter in the relation, and/or create offsets between different types of data.

Both of the extended Tully-Fisher relations, that in stellar mass (top panel) and that in baryonic mass (bottom panel, the extended BTFR), are good correlations. That in baryonic mass is clearly better in the sense that it is tighter over a larger dynamic range. From small dwarf galaxies (Mb ~ 5 x 10⁵ M☉) to groups of galaxies (5 x 10¹² M☉), the data are consistent with a single power law (Mb ~ Vf⁴) for all systems with remarkably little scatter. Outside this range, the data for both the lowest and the highest mass systems deviate from a straight line towards higher mass at a given flat velocity. I don’t put much credence in the smallest systems as I think there is little chance that their measured velocity dispersions are representative of their equilibrium gravitational potential. For all practical purposes, our knowledge runs out as we hit the regime of ultrafaint# dwarfs. The deviations of the most massive systems, clusters of galaxies, are more difficult to dismiss.

Restricting our attention for the moment to the range where a single power law suffices to describe the data, we note that there is not much scatter in the BTFR. Some of it is from random uncertainties; these dominate most studies and lead to a lot more scatter than seen here: these data are very good. We can account for the known observational errors and subtract off their contribution to estimate the intrinsic scatter in the relation. This is the variance of the data from a perfect line. The intrinsic scatter for the best data (the WISE-SPARC sample of Duey et al. 2026) is about 0.11 dex in mass – about what we expect$ for stellar populations. That doesn’t leave much room for other sources of scatter, so the underlying physical relation has to be very tight indeed: essentially perfect over the range 5 x 10⁵ < Mb < 5 x 10¹² M☉.
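The bookkeeping behind "subtract off their contribution" can be sketched as a subtraction in quadrature. This is a simplified illustration and not necessarily the procedure used in the paper: it assumes independent Gaussian errors and a fixed slope of 4, the function names are hypothetical, and the input numbers are made up purely to exercise the arithmetic.

```python
import math

SLOPE = 4.0  # assumed BTFR slope, M_b ~ V_f^4


def error_in_mass_dex(sigma_logm: float, sigma_logv: float) -> float:
    """Observational error projected onto the mass axis (dex).

    An error sigma_logv in log velocity maps to SLOPE * sigma_logv in log mass;
    the two contributions are added in quadrature (assumes independence).
    """
    return math.hypot(sigma_logm, SLOPE * sigma_logv)


def intrinsic_scatter(sigma_obs: float, sigma_err: float) -> float:
    """Intrinsic scatter (dex) left after removing errors in quadrature.

    Returns 0.0 if the errors already account for all the observed scatter.
    """
    variance = sigma_obs ** 2 - sigma_err ** 2
    return math.sqrt(variance) if variance > 0.0 else 0.0


def dex_to_factor(dex: float) -> float:
    """Convert a scatter in dex (log10 units) to a multiplicative factor."""
    return 10.0 ** dex


if __name__ == "__main__":
    # Hypothetical inputs, not values from the paper.
    sigma_err = error_in_mass_dex(sigma_logm=0.08, sigma_logv=0.02)
    sigma_int = intrinsic_scatter(sigma_obs=0.15, sigma_err=sigma_err)
    print(f"error budget: {sigma_err:.3f} dex, intrinsic: {sigma_int:.3f} dex")
    print(f"0.11 dex is a factor of {dex_to_factor(0.11):.2f}")
```

Note how the slope amplifies velocity errors: a 0.02 dex uncertainty in Vf counts as 0.08 dex on the mass axis. For scale, 0.11 dex corresponds to a factor of about 1.29 in mass.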

Scatter will also occur if our mass budget is incomplete. We can see this in the transition from the stars-only relation to the BTFR. There is a lot of scatter in the stellar mass Tully-Fisher relation around 10⁷ < Mb < 10⁹ M☉. Galaxies in this mass range are sometimes star-dominated and sometimes gas-dominated. The gas fraction is all over the place. This shows up as scatter in the stellar mass Tully-Fisher relation. That’s not real; it is a sign that we’ve missed an important mass reservoir. This is cured when we add in the gas mass, which is dominated by atomic gas (HI to spectroscopists and astronomers). That this addition removes the scatter and restores a single power law relation strongly suggests that there are no further substantial reservoirs** of baryonic material that we’re missing.
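To see why a variable gas fraction inflates the scatter of the stars-only relation, here is a toy calculation. It assumes every galaxy sits exactly on a slope-4 power law with the intercept of 50 quoted earlier, and it assigns hypothetical gas fractions f_g = Mg/Mb; none of the numbers are measurements, and the function names are mine.

```python
import math

A_BTFR = 50.0  # Msun km^-4 s^4, intercept quoted in the text


def stellar_mass(v_flat_kms: float, gas_fraction: float) -> float:
    """Stellar mass (Msun) of a galaxy lying exactly on the BTFR,
    given its gas fraction f_g = M_g / (M_* + M_g)."""
    if not 0.0 <= gas_fraction < 1.0:
        raise ValueError("gas_fraction must be in [0, 1)")
    m_baryon = A_BTFR * v_flat_kms ** 4
    return (1.0 - gas_fraction) * m_baryon


def stellar_offset_dex(gas_fraction: float) -> float:
    """How far the stars-only point falls below the BTFR, in dex."""
    return -math.log10(1.0 - gas_fraction)


if __name__ == "__main__":
    v = 60.0  # km/s, an illustrative dwarf galaxy
    for fg in (0.1, 0.5, 0.9):  # hypothetical gas fractions
        print(f"f_g = {fg:.1f}: M_* = {stellar_mass(v, fg):.2e} Msun, "
              f"{stellar_offset_dex(fg):.2f} dex below the BTFR")
```

At a fixed flat velocity, letting the gas fraction range from 10% to 90% spreads the stellar masses by a factor of nine, while the baryonic mass does not move at all. That spread is the spurious scatter that disappears once the gas is counted.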

This logic applies to other systems as well. Bright spirals do not need much correction because their baryonic mass is dominated by stars. Their stellar mass Tully-Fisher relation is pretty much already their BTFR.

Perhaps this applies to clusters of galaxies as well? There was a huge correction from stars-only to stars plus gas. The gas in this case is the hot, ionized plasma of the intracluster medium (ICM) that belongs to the cluster itself and not any individual galaxy within it. That goes most of the way to close the gap between the stars-only cluster data and the extrapolation of the BTFR fit to individual galaxies, but not all the way. So perhaps we are still missing an important baryonic mass component? It happened before – we didn’t know about the ICM for decades after Zwicky first identified the missing mass problem in clusters – so perhaps there are still more baryons to discover there.

It could also be that the apparent offset occurs because we’ve failed to put clusters on the same Vf scale as galaxies. This is not easy to do, and we’ve spent a lot of time worrying about it. I don’t think this is what’s going on, though it would make my life a lot simpler if it were. Different indicators – dynamics vs. ICM hydrostatics vs. gravitational lensing – can give somewhat different answers, but not in a way that “fixes” the problem: I see no viable path in which the offset turns out to be a simple difference in the way the depth of the gravitational potential is measured. I would love to be wrong here, but I’m not dismissing the offset for clusters as I am for ultrafaint dwarfs (which I don’t do lightly).

Perhaps the extrapolation of the BTFR from individual galaxies to clusters is simply not appropriate. They’re very different kinds of systems, after all. To dig into that, we need some theoretical perspective – why does the observed power law happen? Should we expect different systems to share the same BTFR?

Theory is something I’ve studiously avoided in this post: the possibility that there are baryons that remain to be discovered in clusters can be inferred empirically. All the other data line up, so why not clusters? But unless and until these hypothetical additional baryons are discovered, that’s just one possibility. How likely this possibility seems to be diverges rapidly once we overlay a theoretical preference, which I will leave to future posts. (I did warn it would take more than one.)


&This paper appears in ApJ volume 1001. The literature has grown quite a bit since I started contributing to it in volume 342. The Astrophysical Journal was founded in 1895. So I’ve been contributing to it for a little over a quarter of its temporal existence, but nearly twice the number of volumes have been published in that shorter time. It’s no wonder none of us can keep up.

*Indeed, Tully & Fisher’s “preliminary estimate of the Hubble constant is H0 = 80 km/s/Mpc” remains correct to this day, within the uncertainties (hard to estimate at the time, but roughly ±10 km/s/Mpc).

*^There appears to be an irreducible intrinsic scatter in the linewidth: it is not a perfect proxy for rotation speed. Linewidths are observationally easier to obtain than resolved, extended rotation curves, so the numbers of galaxies in samples using linewidths can be very large without ever approaching the quality provided by resolved interferometric observations. Bigger samples are not necessarily better.

^I emphasize might here because the community seems to have moved towards reporting stellar masses as if we observe these rather than the luminosities and colors/SEDs that the mass estimates are based upon. The latter are data – observed quantities – while stellar masses are a derived quantity that is inevitably model dependent. This doesn’t stop being true just because we decide to invest a lot of faith in our models.

^%The Sloan Digital Sky Survey provides stellar masses based on models that are known to be wrong in the near infrared. Since SDSS itself is entirely optical, one might not notice. If one mixes SDSS data with near-IR data, one will get the wrong answer.

%This is a classic selection effect. Brighter objects can be seen at a much greater distance than dim ones, so probe a much larger volume. Consequently, their raw numbers always dominate surveys even if their number density is low. Stars are a great example: most of the stars you can see at night are intrinsically luminous: bright stars that are rather far away. Mundane, low mass stars do not stand out even when nearby.
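The volume argument can be made concrete with a short sketch. It assumes an idealized flux-limited survey in Euclidean space with no extinction; the function names are hypothetical, and the luminosities and flux limit are arbitrary illustrative values in arbitrary units.

```python
import math


def max_distance(luminosity: float, flux_limit: float) -> float:
    """Largest distance at which a source stays above the flux limit,
    from the inverse-square law f = L / (4 pi d^2)."""
    return math.sqrt(luminosity / (4.0 * math.pi * flux_limit))


def survey_volume(luminosity: float, flux_limit: float) -> float:
    """Volume of the sphere within which such a source is detectable."""
    d = max_distance(luminosity, flux_limit)
    return 4.0 / 3.0 * math.pi * d ** 3


def volume_ratio(luminosity_ratio: float) -> float:
    """Ratio of surveyed volumes for two luminosities: V ~ L^(3/2)."""
    return luminosity_ratio ** 1.5


if __name__ == "__main__":
    # Arbitrary units; only the ratio matters.
    ratio = survey_volume(100.0, 1.0) / survey_volume(1.0, 1.0)
    print(f"100x the luminosity -> {ratio:.0f}x the volume")
```

Under these assumptions, an object 100 times more luminous is visible over 1000 times the volume, so it can dominate a sample even if it is a hundred times rarer per unit volume.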

#This isn’t for lack of observations of ultrafaint dwarfs, it’s the underlying assumptions.

$No amount of information suffices to perfectly specify the stellar mass that produces an observed luminosity and SED (spectral energy distribution/set of colors), so one always expects at least some intrinsic scatter in the stellar mass-to-light ratio. I’ve seen estimates that range from 0.1 – 0.2 dex for near-IR colors. That’s as good as it can get as there is always some transient population (e.g., AGB stars) that produces an amount of light that depends on the star formation rate some time ago, not what we measure now. Optical colors are worse in the sense of having more intrinsic scatter, as they are more susceptible to the comings and goings of bright but short-lived stars whose numbers fluctuate with the stochastic star formation rate. Finding 0.11 dex intrinsic scatter is pretty much as good as it can get. (By dex we mean the scatter in log space.)

**We noted this effect in the original BTFR paper to argue that it was unlikely that we were missing substantial amounts of molecular gas (H2), which was a concern at the time. Flash forward, and we were right: the molecular gas mass is almost always a distant third behind stars and atomic gas in the baryonic mass budgets of individual galaxies. Nowadays, the concern is about the mass of baryons in the circumgalactic medium (CGM). That’s getting ahead of the story, which I’ll save for a future post. For now, it suffices to note that any baryonic mass in the CGM is far beyond the radius where the flat velocity is measured, so is not relevant to the sums here.
