Last time, we started talking about the data in the recent paper The Baryonic Mass-Halo Mass Relation of Extragalactic Systems. Here, we’ll put on our dark matter hat, and use the data to make an accounting of the mass – both the dark matter and the baryons in all their various forms. From this conventional perspective we will obtain a method for relating what we see to what we don’t. In the context of LCDM cosmology, this provides an alternative approach to abundance matching. It also provides a test: are the two consistent?
The conventional picture we have in mind is a baryonic galaxy residing in a dark matter halo bathed in a background of intergalactic matter.

Fig. 1 of McGaugh et al. (2026): Conceptual elements of a galaxy: the stars (yellow/blue) and atomic gas (green) of NGC 6946 (Spitzer 3.6µ and 21 cm data: F. Walter et al. 2008) are shown embedded in an extended dark matter halo (black). The dark matter density decreases continuously with radius so the halo has no hard edge, but for convenience we adopt the common convention that the radius r200 marks the boundary of the dark matter halo and the dividing line between the circumgalactic medium (CGM) and the intergalactic medium (IGM; orange). The stars and atomic gas illustrated here appear within r < 20 kpc while r200 ≈ 220 kpc (not shown to scale).
I’ve talked here about the stars and gas a lot because that’s what we see. These are the essential components that define a galaxy and comprise the mass that correlates with rotation velocity to make the baryonic Tully-Fisher relation (BTFR). I’ve talked a bit about the stuff between the galaxies, the intergalactic medium (IGM), but I don’t think I’ve previously had cause to talk much about the circumgalactic medium (CGM). As the name implies, this is gas in the vicinity of a galaxy, but not in the galaxy itself – at least not the part we can readily see. In the notional picture above, the distinction between the CGM and the IGM is the boundary of the dark matter halo that nominally demarcates gravitationally bound from unbound material.
Notional is doing a lot of work here. There’s a lot of gas in the IGM, and some of it is certainly in the vicinity of galaxies, so in that regard counts as circum-galactic. But there’s no hard and fast distinction between these components just as there’s no hard edge to a dark matter halo. Our brains don’t like that, so we impose notional boundaries and proceed as if these are meaningful.
Proceeding thus, we expect our dark matter halo* to contain its fair share of the cosmic baryon fraction, fb = Mb/M200 = 0.157 according to the Planck flavor of LCDM cosmology. We can test this by adding up all the baryons and comparing that to the total mass enclosed by r200. This is straightforward for the stars and gas we see, but not for the stuff we don’t see – both dark matter and the gas in the CGM.
There are some measurements of the CGM, but these tend to be statistical in nature (if we stack data for a bunch of galaxies, we sorta see something), not the precise, individual, galaxy-by-galaxy measurements that we have for the stars and atomic gas. The stars and atomic gas are the mass in the extended Tully-Fisher relations we discussed previously, and are the bulk of the normal material in the galaxies we see. The bulk of the CGM lies at much larger radii, beyond the stars and atomic gas, but within the notional edge of the dark matter halo, as depicted above. Since we don’t measure it directly in individual galaxies, we’re gonna leave the mass of the CGM as an open question rather than something to be included in the sum of known baryonic mass.
The situation is even murkier for the dark matter, which we don’t see at all, so we don’t have a good way to measure the “total” mass of dark matter halos. This isn’t even a well-defined quantity in principle since halos are not expected to have a hard edge. Conventionally, we adopt the radius within which the mean enclosed density is two hundred times the cosmic critical density, r200, as the notional edge. There are obscure historical reasons for this choice that I do not have the patience to describe. One could make other choices, arguably better choices, but r200 is the most common choice used in the literature so we’ll stick with it here. The halo mass is the mass enclosed by this radius, M200. If one goes through the math, it turns out that the circular speed of a test particle, V200, orbiting at r200 scales with the Hubble parameter [h = H0/(100 km/s/Mpc)] such that V200 = h r200 when V200 is in km/s and r200 is in kpc. The dynamical mass (rV²/G) can then be written
M200 = V200³/(G h)
with G expressed in kpc (km/s)² M☉⁻¹.
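The scaling just described can be checked numerically. Since V200 = h r200, the dynamical mass r200 V200²/G reduces to V200³/(G h). The sketch below evaluates that; the value h = 0.7 and the function names are assumptions made for illustration, not values taken from the paper.

```python
# Newton's constant in units of kpc (km/s)^2 / Msun.
G_KPC = 4.30091e-6


def r200_kpc(v200_kms, h=0.7):
    """Radius r200 in kpc from V200 = h * r200 (V200 in km/s, r200 in kpc).

    h = 0.7 is an assumed value for illustration.
    """
    return v200_kms / h


def m200_msun(v200_kms, h=0.7):
    """Halo mass M200 = r200 * V200^2 / G = V200^3 / (G h), in solar masses."""
    return v200_kms**3 / (G_KPC * h)


# Illustrative circular speeds only; these are not measurements.
for v in (50.0, 100.0, 200.0):
    print(f"V200 = {v:5.0f} km/s   r200 = {r200_kpc(v):6.1f} kpc   "
          f"M200 = {m200_msun(v):.2e} Msun")
```

Because the mass goes as the cube of the velocity, a factor of two in V200 is a factor of eight in M200, which is why the choice of velocity fudge factor matters so much further down.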
That is a lot of huffing and puffing to get a way to relate the halo mass to something we can (kinda sorta) measure. The flat rotation velocity Vf has always been taken as the signature of the dark matter halo. One therefore expects V200 ~ Vf. Indeed, these quantities cannot differ by much if dark matter is what explains flat rotation curves. However, the notional radius of the dark matter halo where V200 occurs is much larger, by roughly an order of magnitude or more, than the radius where Vf is measured, so the two need not be identical, depending on the halo model. To relate what we measure to what we’d like to know we define a little ol’ fudge factor, fv, such that:
Vf = fv V200
If a rotation curve stays flat indefinitely (as our empirical experience suggests), fv = 1. If instead dark matter halos behave as they should in LCDM, then the rotation speed should gradually decline as we approach the halo’s edge so that fv > 1. How much greater?
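For a rough analytic feel before looking at fits, one can ask how far the peak circular speed of an idealized NFW halo sits above V200. This is only a sketch: it uses the peak speed as a stand-in for Vf, and the concentrations are illustrative values that do not come from the paper.

```python
import math


def nfw_mu(x):
    """Dimensionless NFW enclosed-mass function ln(1+x) - x/(1+x)."""
    return math.log(1.0 + x) - x / (1.0 + x)


def nfw_v_over_v200(r_over_r200, c):
    """Circular speed of an NFW halo, in units of V200, at radius r/r200.

    c is the concentration r200/rs. Follows from V^2 = G M(<r) / r.
    """
    s = r_over_r200
    return math.sqrt(nfw_mu(c * s) / (s * nfw_mu(c)))


def nfw_peak_ratio(c, n=20000):
    """Peak of V/V200 inside r200, found by a simple scan over radius."""
    return max(nfw_v_over_v200(i / n, c) for i in range(1, n + 1))


# Illustrative concentrations (assumed, not fitted to any galaxy).
for c in (5, 10, 20):
    print(f"c = {c:2d}: Vmax/V200 = {nfw_peak_ratio(c):.2f}")
```

By construction the curve returns V/V200 = 1 at r200, so any value above one is a measure of how much the rotation speed has declined by the notional edge.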
One way to estimate the fudge factor fv is to fit dark matter halo models to data. This process does not directly measure V200, but it does provide an estimate of that quantity based on the data available at smaller radii. One can do this for as many halo models as one has the patience to consider. For example, here are the results for two common halo models, the traditional pseudo-isothermal halo first adopted to explain flat rotation curves and the CDM-expected NFW halo:

The result for pseudo-isothermal halos is consistent with fv = 1, as expected – this model was adopted to make flat rotation curves. There is nevertheless some scatter. This typically happens because the observed rotation is not observed to be flat over a large enough range of radii to enforce flatness further out (as often happens in dwarf galaxies) or because the stars account for so much of the mass over the observed range that the inferred dark matter component is still rising (as often happens in bright, high surface brightness galaxies). This sort of haziness is inevitable when one only measures the inner few percent of the notional virial radius.
The result for NFW halos is approximately fv = 1.4, albeit with a lot more scatter. This happens for the same reasons as above, with the additional problem that the dark matter profile in real galaxies rarely looks like NFW. Of all the many halo models considered by Li et al. (2020), NFW consistently performs the worst. One is forcing a fit of a function that would rather not. One signature of this misfit is the occurrence of very large V200 for dwarf galaxies with small Vf. Taken literally, this would mean that some of the smallest dwarf galaxies reside in dark matter halos that outweigh those of giants like the Milky Way. This seems absurd, and it is. For example, by this approach, the dwarf galaxy NGC 3109 residing just outside the Local Group outweighs the Local Group and both its giants, Andromeda and the Milky Way, put together. But it is pretty clear from the local velocity field that the entire Local Group is not orbiting this little dwarf.
The estimation of huge V200 for galaxies with small Vf happens because of the cusp-core problem. The density cusp predicted by NFW expects a curved shape for the inner rotation curve while the data show a more gradual, quasi-linear rise. Any decent fitting program will realize that it can make a curve look like a straight line if it stretches it out enough, so it does exactly this by making the halo very large. That sorta fits the data, but it makes no physical sense. Between this systematic effect and the large scatter induced by the other effects discussed above, one is better off inferring V200 from Vf with a fixed fudge factor. So we’ll do that, leaving the exact value of fv as an open question, but noting that for most objects it almost certainly resides in the narrow range
That’s a lot of words to say the observed flat rotation speed gives us our best kinematic estimator of the dark matter halo mass. In this context, bear in mind the small scatter in the extended Tully-Fisher relations. This contrasts with the large scatter seen in the fits above. This strongly implies that Vf is more closely tied to the underlying mass^ than are the model-specific halo fits to the entire rotation curve. That might seem counterintuitive given that Vf is only a portion of the rotation curve (albeit a well-defined portion). However, it makes more sense when one considers that rotation curve fits must consider the contribution of stars as well as dark matter. Since the stellar mass-to-light ratio is never perfectly known, there is a degeneracy between the two that contributes to the scatter seen above. That variation is not real, it’s just an artifact of the fitting procedure. But when we get to large radii, beyond the confounding effects of the stellar population, the signature of the dominant mass becomes apparent in the flat rotation speed.
We saw above that we expect the halo mass M200 to correlate with V200. We observe that baryonic mass Mb correlates with the flat rotation velocity Vf. The natural assumption is that the stuff we see is proportional to the total (mostly dark) mass while the observed flat velocity is a property of the halo. Hence Mb ~ M200 and Vf ~ V200. This simple argument has been the basis for many papers claiming to explain the Tully-Fisher relation over the course of many years. This would be entirely satisfactory if it weren’t so completely wrong.
Here we need to introduce another fudge factor, mb, that relates the mass we see to the halo that spawned each galaxy:
Mb = mb M200
The obvious assumption is that mb is a constant for all galaxies, in which case Tully-Fisher follows because Mb ~ M200 ~ V200³ and V200 ~ Vf. The wee problem is that this predicts a Tully-Fisher relation with slope 3: Mb ~ Vf³ when we observe one with slope 4: Mb ~ Vf⁴. In order to reconcile these two, our new fudge factor cannot be a constant. Worse, we need to fine tune it to transform the predicted power law into the observed one: mb ~ Vf. That… doesn’t make any sense.
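The bookkeeping behind that last step can be written out explicitly. This is a sketch that takes the two fudge factors to be defined as Vf = fv V200 and mb = Mb/M200, consistent with the discussion above, and treats fv as constant:

```latex
\begin{align}
M_b &= m_b\, M_{200}, \qquad
M_{200} = \frac{V_{200}^3}{G h}, \qquad
V_f = f_v V_{200} \\
\Rightarrow\quad M_b &= \frac{m_b}{f_v^3}\,\frac{V_f^3}{G h} \\
M_b \propto V_f^4 \quad&\Rightarrow\quad \frac{m_b}{f_v^3} \propto V_f
\quad\Rightarrow\quad m_b \propto V_f \propto M_b^{1/4}
\quad (\text{constant } f_v).
\end{align}
```

The only escape from a velocity-dependent mb would be to push the trend into fv instead, which would require fv to fall systematically as Vf rises.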
We can refrain from thinking and plunge ahead to simply plot the baryon fraction. While we’re at it, let’s also plot the stellar mass fraction m* = M*/M200 because that is more commonly discussed in the literature. (Often stellar masses are available for galaxies without the corresponding gas mass measurements.) These fractions have to be increasing functions of circular velocity, or equivalently, mass (mb ~ Vf ~ Mb^(1/4)):

To be specific, I’ve computed the halo mass assuming fv = 1. Different assumptions just slide the data up and down; the trend persists. This is discussed more in the paper if you’re interested in such details.
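To see how the choice of fv slides the data, here is a minimal sketch of the baryon fraction computed from an observed baryonic mass and flat rotation speed. The galaxy used is hypothetical, h = 0.7 is an assumed value, and the function names are made up for illustration.

```python
G_KPC = 4.30091e-6   # Newton's constant in kpc (km/s)^2 / Msun
F_B = 0.157          # cosmic baryon fraction quoted in the text


def halo_mass(vf_kms, fv=1.0, h=0.7):
    """M200 in Msun from the flat rotation speed, with V200 = Vf / fv.

    h = 0.7 is an assumed value for illustration.
    """
    v200 = vf_kms / fv
    return v200**3 / (G_KPC * h)


def baryon_fraction(mb_msun, vf_kms, fv=1.0, h=0.7):
    """mb = Mb / M200 for a given choice of the velocity fudge factor fv."""
    return mb_msun / halo_mass(vf_kms, fv, h)


# Hypothetical galaxy, not a measurement from the paper.
mb_example, vf_example = 1.0e10, 120.0
for fv in (1.0, 1.2, 1.4):
    frac = baryon_fraction(mb_example, vf_example, fv)
    print(f"fv = {fv:.1f}: mb = {frac:.4f}  ({frac / F_B:.0%} of cosmic)")
```

Since M200 scales as fv⁻³, every point in the plot shifts by the same factor fv³ in mb, so the slope of the trend is untouched by this choice.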
This gives a nifty way to relate what we can see to what we can’t. There’s a simple formula:
where fb = 0.157 is the cosmic baryon fraction and M0 = 5 × 10¹³ M☉ is the scale where the function bends, transitioning from the Mb ~ Vf⁴ of the BTFR that holds over most of the mass range to the mb = fb of rich galaxy clusters. The precise value of the turnover mass is not well constrained, as it happens in the one place that is not well sampled by the available data. Indeed, there is nothing special about the functional form; it is simply a choice that transitions nicely from one regime to the other. There’s no physics in it&. Still, this is a useful way to estimate the halo mass of pretty much any extragalactic object just by summing up its observed baryonic mass.
Indeed, this kinematic mass-matching relation is better than the widely used abundance matching relations in that it has less scatter. Abundance matching generally relies on stellar mass; that results in more scatter for the same reasons discussed for Tully-Fisher. This is particularly apparent at the low mass end of the top panel above, where galaxies of the same circular velocity (halo mass) have very different stellar masses. This goes away when baryonic mass is used instead.
There is reasonable agreement between abundance matching and kinematics at intermediate masses. The lines representing various abundance matching relations parallel the kinematic data. The offsets that are apparent can be cured by an appropriate choice of fv. Always a free parameter to the rescue there is.
At the high mass end, things go amiss again. Partly this is because abundance matching relations reference the stellar mass of the “central” galaxy. The picture is that each halo contains one central galaxy with many satellite galaxies in subhalos, so what matters is the stellar mass of the central. This is overly simplistic: galaxy clusters are messy, the brightest galaxy isn’t necessarily at the center, and most have substructure with multiple groups rather than a single hierarchy. Besides that, the stellar mass tells you little about the halo mass without further environmental context: a galaxy with M* ~ 4 × 10¹¹ M☉ could reside in halo masses spanning a couple of orders of magnitude.
Setting aside the issue of centrals, there is a serious tension for individual high mass galaxies. The stellar mass fraction suggested by kinematics keeps going up where that of abundance matching turns over. This is due to the linearity of the Tully-Fisher relation compared to the knee in the Schechter function shape of the stellar mass function. The two don’t match up, as discussed previously. This same tension has long been with us; in the ’90s we were concerned with the difference between “the luminosity function normalization” and “the Tully-Fisher normalization.” This tension never went away. Still, the tension between abundance matching and kinematics doesn’t seem tragic, and might be remedied with some appropriate finagling of both the baryon fraction and the velocity fudge factor.
But where are all the baryons? They’re all accounted for in clusters, which reach the cosmic baryon fraction. But in no other system is the checksum complete. There is a missing baryon problem locally in each and every dark matter halo below the cluster scale. To confound matters further, there is a fine-tuning problem: the amount of missing baryons scales precisely with the amount of observed baryons.
The logarithmic plot above may understate the magnitude of the problem. To clarify this, we can plot the ratio of missing-to-observed baryons on a linear scale, at least in part:

The scatter blows up when we plot linear ratios; this is an artifact of error propagation. Nevertheless, it is helpful to see that the local missing baryon problem is not subtle. It is already a factor of ~2 for groups and ~3 for bright galaxies. It’s not as if we’ve misplaced a few percent of the baryons. Most of the baryons that should be associated with galaxy dark matter halos are not in evidence.
This problem has been known for a while, but doesn’t seem to be acknowledged to be a problem. Not all baryons need condense down into the central galaxy; some might be left behind, still mixed in with the dark matter halo. The widespread assumption seems to be that the missing baryons are probably in the CGM.
Accounting for the missing baryons with gas in the CGM almost works in bright galaxies like the Milky Way where we need “only” a factor of a few. Recent estimates suggest that the CGM is comparable in mass to the stars, or even somewhat more. These are very uncertain, as this mass is dispersed in diffuse gas over an enormous volume, and the total mass estimates often involve large extrapolations: the CGM is detected most readily near the central galaxy, but most of its implied mass is way far out near r200. Accepting these estimates at face value leads to the star symbols in the plot above. This makes the checksum complete provided the halo is not too massive, as happens if fv ≈ 1.4. This is what we expect for NFW halos, so it might work out if those were viable. However, there is a bigger issue.
The local missing baryon problem gets progressively worse for lower mass galaxies. For 10¹⁰ M☉ galaxies – not all that much smaller than the Milky Way (Mb = 7 × 10¹⁰ M☉) – the problem isn’t a factor of two or three: there are ~6 baryons missing for every one that is observed. For 10⁹ M☉ galaxies, the deficit is an order of magnitude. For even lower mass galaxies, the difference is so large we have to abandon the linear plot lest the interesting parts for bright galaxies get scrunched into invisibility. By the time we get to small dwarf galaxies of 10⁶ M☉, the ratio of missing-to-observed baryons approaches 100:1. It is not plausible to imagine that the CGM of dwarf galaxies explains this deficit. (And yes, we’ve looked.)
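The trend in these ratios can be roughed out in a few lines. This sketch assumes a pure power-law BTFR, Mb = A Vf⁴, with a normalization A = 50 M☉ (km/s)⁻⁴ that is an illustrative assumption rather than the fit from the paper, together with fv = 1 and an assumed h = 0.7. The outputs depend on all three choices, so expect only rough agreement with the numbers quoted above.

```python
G_KPC = 4.30091e-6   # Newton's constant in kpc (km/s)^2 / Msun
F_B = 0.157          # cosmic baryon fraction quoted in the text
A_BTFR = 50.0        # ASSUMED BTFR normalization in Msun / (km/s)^4


def vf_from_mb(mb_msun):
    """Flat rotation speed in km/s implied by the assumed BTFR, Mb = A Vf^4."""
    return (mb_msun / A_BTFR) ** 0.25


def missing_to_observed(mb_msun, fv=1.0, h=0.7):
    """Ratio of missing to observed baryons: (fb * M200 - Mb) / Mb.

    fv = 1 and h = 0.7 are assumed values for illustration.
    """
    v200 = vf_from_mb(mb_msun) / fv
    m200 = v200**3 / (G_KPC * h)
    return (F_B * m200 - mb_msun) / mb_msun


for log_mb in (6, 8, 9, 10, 11):
    mb = 10.0**log_mb
    print(f"Mb = 1e{log_mb:<2d} Msun   Vf = {vf_from_mb(mb):6.1f} km/s   "
          f"missing/observed = {missing_to_observed(mb):6.1f}")
```

With a slope-4 BTFR and M200 ~ V200³, the expected-to-observed ratio falls as Mb^(-1/4), so it changes by a factor of ten for every four decades in baryonic mass.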
A common explanation for this variation is that low mass dark matter halos have shallower potential wells, so have a harder time holding onto their baryons. Supernovae can drive material out of galaxies; these go off with the same energy regardless of the galaxy they’re in so they may be more effective at blowing baryons out of lower mass systems. There is sufficient energy (IF properly% distributed) to completely unbind the baryons, so they might wind up in the IGM, defeating any hope of completing the checksum. This is the sort of argument that sounds clever but fails to address the real problem. The difficulty isn’t just ridding ourselves of these meddlesome baryons, it is getting rid of exactly the right amount each and every time.
As awkward as it is to realize that most of the baryons that should be in low mass halos are not in evidence, it is not difficult to imagine ways in which this might happen, like the aforementioned supernova-driven galactic winds. The more dire aspect of the problem is the fine-tuning. Galaxies of the same observed baryonic mass are always missing the same amount of baryons, whether that’s a factor of 2 or 10 or 100. If the visible parts of a dwarf galaxy are only 1% of the available baryons, you’d expect a lot of scatter. Sometimes a halo of that mass might have 2% or even 3% of its baryons condense to the parts we see. That would show up in the scatter in a way it does not: galaxies of the same circular velocity (halo mass) have the same baryonic mass every time. They don’t vary by factors of two (or more). So while we can build models that make the baryon fraction just so, the fact that we can write a simple equation for it with practically zero scatter is profoundly uncomfortable.
An extra bit of weirdness is that in LCDM, galaxies are built hierarchically by merging small objects into large ones. This poses a teleological problem. Consider a small halo at high redshift. If it remains alone, then it will contain a dwarf galaxy at low redshift that has a low baryon fraction. But if it merges into a larger system, then by the current time that larger system has to have a larger baryon fraction. In effect, a low mass halo has to know where it will end up some billions of years in the future. Will it remain alone and unmerged? Better blow out all those baryons! Will it merge into a larger system? Better hang on to the right amount of baryons. Does that system merge into a still larger object? Hope it held onto even more baryons, in exactly the right amount at every step along dozens of mergers.
I can imagine all this happening in a stochastic fashion with the net result being that more massive systems wind up with a higher baryon fraction, at least on average. I cannot give credence to this process resulting in the small observed scatter. As people are always telling me, “galaxies are complicated.” Indeed, they should be – in LCDM. But in reality they’re not! They obey simple scaling laws, laws that do not follow naturally from LCDM.
The local missing baryon problem encapsulates one of the fine-tuning problems that has never been satisfactorily explained. This alone would be considered fatal for most theories. For LCDM, it is just another problem to be addressed through the eternal tweaking of models and simulations.
*Strictly speaking, M200 refers to all mass within r200, baryons as well as dark matter. I’m going to call it halo mass anyway, because that’s what we mean, the baryons are a small fraction of the total, and because that’s what everybody does in the literature. If we make some other choice for the definition of the mass of the halo, MΔ, then the inferred baryon fraction of an object scales by M200/MΔ. The cosmic baryon fraction does not care what choice we make, so the implicit assumption is that one asymptotes to the cosmic fraction if one gets far enough out, irrespective of what rΔ we adopt. While this is a sensible assumption – individual objects must merge into the larger cosmos at some point – there is no guarantee that the universe cooperates. For example, the baryon fraction in galaxies declines with increasing radius, but that in galaxy clusters increases with radius. I’ve seen hints that it doesn’t really settle down to the cosmic (or any particular) value. These are only hints – considerable extrapolation is involved – so we’ll ignore this inconvenience and assume that the baryon fractions of individual objects do in fact converge to the cosmic value far enough out.
^It makes the most sense if the underlying total mass is the observed baryonic mass.
&I made a very similar fit in McGaugh et al. (2010) but didn’t publish it because there was no physics in it. Since then the field has been awash in abundance matching relations that were similarly fit sans physics. There has been much ink spilled justifying it post-facto with feedback, but I have refrained from this exercise in intellectual onanism.
%It is common to assume in simulations that a large fraction (50 – 100%) of the energy from supernovae is returned to the surrounding gas. This process is not resolved in cosmological simulations; all the energy return happens as part of the “subgrid” physics, so the feedback efficiency is set, in practice, to make things work out as well as possible.
Observationally, most of the SN energy finds its way out along the path of least resistance where the density of the surrounding gas is smallest (“chimneys”). This process couples to the surrounding gas with only a few percent efficiency.