Extended Tully-Fisher relations

Previously I had alluded to some of the major projects I’ve been working on. One has come to fruition and can be found on the arXiv and in the Astrophysical Journal&. It has taken many years to assemble the data in this paper, during which time the models purporting to explain some of it have evolved considerably while consistently failing to address the real problems they raise. There is a lot to explore, so it will take more than one post.

Here I start with the empirical basis: the stellar mass and baryonic Tully-Fisher relations. The Tully-Fisher relation was originally discovered as a relation between luminosity and linewidth in rotationally supported galaxies – spirals and irregulars. It immediately proved useful as an extragalactic distance indicator. As such, it was instrumental in breaking the impasse in the Hubble constant* debate (back when it was 50 vs. 100, not 67 vs. 73), and it remains useful in this role.

Physically, the obvious interpretation was that luminosity is a proxy for stellar mass and linewidth*^ is a proxy for rotation speed. This is correct. Of the various rotation speeds one can define and measure, the one that works best, in terms of minimizing the scatter in the relation, is the flat rotation speed measured in the outer parts of extended rotation curves. See Stark et al. (2009) and Trachternach et al. (2009) for further examples. The scatter is basically a function of data quality.

On the mass axis, converting measured flux to luminosity to mass is a bit dicier, as we need to know the distance for the first step and the stellar mass-to-light ratio for the second. There is inevitably some intrinsic scatter in the mass-to-light ratio of a stellar population. While I don’t doubt that luminosity is a proxy for stellar mass, improving on it is hard to do: there are many instances in which simply assuming a straight mapping of light to mass can be as effective as applying fancier population models. We might^ finally be getting past that, so it is worth discussing a bit.

The procedure to convert starlight into stellar mass involves the construction of stellar population models that use the color(s) or spectral energy distribution of a galaxy to infer the types of stars that make the light. This is a long-argued subject; suffice it to say there are a number of points where it can go wrong. The most obvious is the IMF; the initial spectrum of masses with which stars are born. Most of the light we see from galaxies is produced by its higher mass stars, which are disproportionately bright (there is a steep scaling of stellar luminosity with mass). But most of the mass is locked up in low mass stars that contribute little to the total luminosity. So we are, in effect, using the light of the few to represent the mass of the many. That would go badly wrong if we don’t know the relative mix, i.e., the shape of the IMF. This has been the subject of much research, and over many decades has been narrowed down pretty well. While I hope that this is almost settled, the specter of the IMF lurks as a menace to all stellar mass determinations.
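
To put rough numbers on "the light of the few representing the mass of the many", here is a toy calculation. It is a sketch under loudly stated assumptions that are not from the post: a single Salpeter-like power-law IMF (dN/dm proportional to m^-2.35) between 0.1 and 100 solar masses, and a main-sequence luminosity scaling L proportional to m^3.5. It ignores stellar evolution entirely (massive stars die quickly), so it overstates the dominance of bright stars in a real, aged population.

```python
# Toy model only: assumed Salpeter-like IMF slope and L ~ m^3.5 scaling.
ALPHA = 2.35                 # assumed IMF slope: dN/dm ~ m**-ALPHA
BETA = 3.5                   # assumed luminosity scaling: L ~ m**BETA
M_MIN, M_MAX = 0.1, 100.0    # assumed stellar mass limits in solar masses


def power_law_integral(exponent: float, lo: float, hi: float) -> float:
    """Analytic integral of m**exponent dm from lo to hi (exponent != -1)."""
    p = exponent + 1.0
    return (hi ** p - lo ** p) / p


def mass_fraction_below(split: float) -> float:
    """Fraction of the total stellar mass in stars lighter than `split`."""
    total = power_law_integral(1.0 - ALPHA, M_MIN, M_MAX)
    return power_law_integral(1.0 - ALPHA, M_MIN, split) / total


def light_fraction_above(split: float) -> float:
    """Fraction of the total luminosity from stars heavier than `split`."""
    total = power_law_integral(BETA - ALPHA, M_MIN, M_MAX)
    return power_law_integral(BETA - ALPHA, split, M_MAX) / total


if __name__ == "__main__":
    print(f"mass fraction below 1 Msun:  {mass_fraction_below(1.0):.3f}")
    print(f"light fraction above 1 Msun: {light_fraction_above(1.0):.5f}")
```

In this toy model, more than half of the mass sits in stars below one solar mass, while nearly all of the light comes from stars above it. Changing `ALPHA` shifts that balance, which is why the shape of the IMF matters so much for the mass-to-light ratio.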

There is a lot else we need to know to build a stellar population model. This includes such essentials as the spectra of individual stars of each and every type and stellar evolution as a function of mass and composition including exotic phases like the asymptotic giant branch. There are a lot of places where this can go badly wrong, and sometimes^% does. So I wouldn’t say we know how to do this perfectly, but we have become pretty good at it.

Converting light to mass suffices to plot the stellar mass Tully-Fisher relation. That accounts for most of the baryonic mass of high mass spirals, but it ignores the mass of the interstellar gas. This can be appreciable in lower mass systems. Indeed, the standard issue dwarf galaxy in the field is more gas than stars:

Figure 1 from McGaugh et al. (2019): The gas and stellar masses of rotating galaxies. Blue points are galaxies in the SPARC database (Lelli et al. 2016b) and the gas rich galaxies discussed by McGaugh (2012). The location of the Milky Way is noted in red (McGaugh 2016): it is a typical bright spiral. Grey points are the sample of Bradford et al. (2015). The line is the line of equality where M* = Mg.

With measurements of mass and rotation speed, we can construct the Tully-Fisher relation:

Figure 4 from McGaugh et al. (2019): The stellar mass (left) and baryonic Tully-Fisher relation (right). Data from Lelli et al. (2016b) and McGaugh (2012) are shown as blue points if both axes are measured with at least 20% accuracy; less accurate data are shown in grey. The latter include cases for which the rotation curve does not extend far enough to measure Vf, in which case the last measured point is used. These cases are systematically offset to lower velocity. Inclination uncertainties and distance errors also contribute to the scatter. The better the data, the tighter the relation. The location of the Milky Way is noted in red (you are here).

The stellar mass Tully-Fisher relation is a good correlation by the standards of extragalactic astronomy. The majority of studies in the literature are restricted to massive% galaxies, mostly those with M* > 10^10 M☉ where stars dominate the baryonic mass budget so the omission of gas is not obvious. As we look to lower masses, the relation bends and the scatter increases. That this happens right where gas starts to become important to the mass budget suggests that we’re missing an important component, and voila – a nice, continuous relation that is linear in log space is restored when we plot the baryonic mass Mb = M*+Mg. Indeed, the data are consistent with a simple power law

M_b = A \, V_f^4

with A = 50 M☉ km^-4 s^4. The intercept A has consistently been measured within 10% of this value over the past couple of decades. That this is an integer power law so that the intercept has real physical units is intriguing. That doesn’t happen in most astronomical scaling laws, which are usually more happenstance, like the mass-luminosity relation for main sequence stars.
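
As a quick illustration of what the power law implies, the sketch below simply evaluates the relation with the intercept quoted above. The function names are hypothetical and this is not the fitting code from the paper; it only turns a flat velocity in km/s into a baryonic mass in solar masses, and back.

```python
# Illustrative only: evaluates Mb = A * Vf^4 with the intercept quoted in the text.
A_BTFR = 50.0  # solar masses per (km/s)^4


def baryonic_mass(v_flat_kms: float) -> float:
    """Baryonic mass in solar masses implied by a flat velocity in km/s."""
    return A_BTFR * v_flat_kms ** 4


def flat_velocity(m_baryon_msun: float) -> float:
    """Inverse relation: flat velocity in km/s for a baryonic mass in solar masses."""
    return (m_baryon_msun / A_BTFR) ** 0.25


if __name__ == "__main__":
    for v in (50.0, 100.0, 200.0):
        print(f"Vf = {v:5.0f} km/s -> Mb = {baryonic_mass(v):.2e} Msun")
```

Because of the fourth power, doubling the flat velocity raises the implied baryonic mass by a factor of 16: 100 km/s corresponds to 5 x 10^9 M☉ and 200 km/s to 8 x 10^10 M☉.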

Why limit ourselves to rotationally supported galaxies? Let’s plot every known type of gravitationally bound extragalactic object, from the smallest ultrafaint dwarfs to the largest clusters of galaxies. Note that I’ve flipped the axes to accommodate the huge dynamic range in baryonic mass, roughly twelve (12) orders of magnitude. This is like having gnats at one end of the scale and blue whales at the other. On that scale, a person is a regular galaxy like the Milky Way.

Figure 3 from McGaugh et al. (2026): Extended Tully-Fisher relations plotting the flat-equivalent circular velocity of extragalactic systems as a function of stellar mass (top panel) and baryonic mass (bottom panel). Data for rotationally supported galaxies are depicted by circles; squares represent pressure supported systems. The blue circles are galaxies with directly measured distances, Vf from rotation curves, and stellar masses from WISE photometry from Duey et al. (2026, in preparation). Green circles are gas-rich galaxies (Mg > M*; Stark et al. 2009; Trachternach et al. 2009; Bernstein-Cooper et al. 2014; McNichols et al. 2016; Iorio et al. 2017; Namumba et al. 2025; Xu et al. 2025) not already in Duey et al. (2026). Yellow points are Local Group galaxies, both spirals and dwarfs (McGaugh et al. 2021); gray squares are ultrafaint dwarfs (Lelli et al. 2017). Lensing results for early- and late-type galaxies (Mistele et al. 2024a) are shown as pink squares and magenta circles, respectively. Red squares are clusters of galaxies (Mistele et al. 2025), and purple squares are groups of galaxies (McGaugh et al. 2026). The orange line is the BTFR fit only to rotating galaxies over a more limited range (about three orders of magnitude in baryonic mass, from Mb ~ 4 x 10^8 to 4 x 10^11 M☉) by McGaugh (2005).

One improvement from twenty years ago, aside from the greater number of objects and the increase in dynamic range, is the accuracy of the mass measurements. I tried a number of prescriptions for the stellar mass-to-light ratio in McGaugh (2005), which resulted in a range of possible slopes. Now we just use the stellar mass from precise population models (Duey et al. 2025) and recover my best estimate from back then. The room to dodge the obvious conclusion about the slope of the relation by complaining about the choice of stellar mass estimator – a popular course of action back then – is gone. Another technical issue we’ve spent a lot of effort working on is how to put all these very different systems on the same scale of Vf. I won’t elaborate on this here: if you’re interested in that level of detail, you can go read the paper and references therein. If we got this wrong, it would add to the scatter in the relation, and/or create offsets between different types of data.

Both of the extended Tully-Fisher relations, that in stellar mass (top panel) and that in baryonic mass (bottom panel, the extended BTFR) are good correlations. That in baryonic mass is clearly better in the sense that it is tighter over a larger dynamic range. From small dwarf galaxies (Mb ~ 5 x 10^5 M☉) to groups of galaxies (5 x 10^12 M☉), the data are consistent with a single power law (Mb ~ Vf^4) for all systems with remarkably little scatter. Outside this range, the data for both the lowest and the highest mass systems deviate from a straight line towards higher mass at a given flat velocity. I don’t put much credence in the smallest systems as I think there is little chance that their measured velocity dispersions are representative of their equilibrium gravitational potential. For all practical purposes, our knowledge runs out as we hit the regime of ultrafaint# dwarfs. The deviations of the most massive systems, clusters of galaxies, are more difficult to dismiss.

Restricting our attention for the moment to the range where a single power law suffices to describe the data, we note that there is not much scatter in the BTFR. Some of it is from random uncertainties; these dominate most studies and lead to a lot more scatter than seen here: these data are very good. We can account for the known observational errors and subtract off their contribution to estimate the intrinsic scatter in the relation. This is the variance of the data from a perfect line. The intrinsic scatter for the best data (the WISE-SPARC sample of Duey et al. 2026) is about 0.11 dex in mass – about what we expect$ for stellar populations. That doesn’t leave much room for other sources of scatter, so the underlying physical relation has to be very tight indeed: essentially perfect over the range 5 x 10^5 < Mb < 5 x 10^12 M☉.
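
The idea of subtracting off the observational errors can be sketched in a few lines. This is a minimal illustration, assuming independent Gaussian errors in log space that add in quadrature and a fixed line with slope 4 and A = 50; it is not the procedure used in the paper. The function name and the sample numbers are hypothetical.

```python
import math

# Minimal sketch: intrinsic scatter = observed scatter minus measurement
# errors, in quadrature. Assumes independent Gaussian errors in log space.
SLOPE = 4.0                 # BTFR slope from the text
LOG_A = math.log10(50.0)    # BTFR intercept from the text, in log10 units


def intrinsic_scatter(log_mb, log_vf, err_log_mb, err_log_vf):
    """Estimate the intrinsic scatter (dex in mass) about the fixed BTFR line.

    All inputs are equal-length sequences of log10 values and their 1-sigma
    errors. Velocity errors are projected onto the mass axis via the slope.
    """
    n = len(log_mb)
    # Mean squared residual in log mass about the fixed line
    var_obs = sum(
        (m - (LOG_A + SLOPE * v)) ** 2 for m, v in zip(log_mb, log_vf)
    ) / n
    # Mean variance expected from the quoted measurement errors alone
    var_err = sum(
        em ** 2 + (SLOPE * ev) ** 2 for em, ev in zip(err_log_mb, err_log_vf)
    ) / n
    # Clip at zero when the errors already account for all of the scatter
    return math.sqrt(max(var_obs - var_err, 0.0))


if __name__ == "__main__":
    # Hypothetical galaxies: log10(Vf), log10(Mb), and their errors in dex
    log_vf = [1.8, 2.0, 2.2, 2.3]
    log_mb = [9.05, 9.55, 10.60, 10.85]
    err_mb = [0.10, 0.08, 0.08, 0.10]
    err_vf = [0.02, 0.02, 0.015, 0.015]
    sigma = intrinsic_scatter(log_mb, log_vf, err_mb, err_vf)
    print(f"intrinsic scatter ~ {sigma:.2f} dex")
```

Note that a velocity error is multiplied by the slope when projected onto the mass axis, so even small uncertainties in Vf eat up a large share of the observed scatter.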

Scatter will also occur if our mass budget is incomplete. We can see this in the transition from the stars-only relation to the BTFR. There is a lot of scatter in the stellar mass Tully-Fisher relation around 10^7 < Mb < 10^9 M☉. Galaxies in this mass range are sometimes star-dominated and sometimes gas-dominated. The gas fraction is all over the place. This shows up as scatter in the stellar mass Tully-Fisher relation. That’s not real; it is a sign that we’ve missed an important mass reservoir. This is cured when we add in the gas mass, which is dominated by atomic gas (HI to spectroscopists and astronomers). That this addition removes the scatter and restores a single power law relation strongly suggests that there are no further substantial reservoirs** of baryonic material that we’re missing.

This logic applies to other systems as well. Bright spirals do not need much correction because their baryonic mass is dominated by stars. Their stellar mass Tully-Fisher relation is pretty much already their BTFR.

Perhaps this applies to clusters of galaxies as well? There was a huge correction from stars-only to stars plus gas. The gas in this case is the hot, ionized plasma of the intracluster medium (ICM) that belongs to the cluster itself and not any individual galaxy within it. That goes most of the way to close the gap between the stars-only cluster data and the extrapolation of the BTFR fit to individual galaxies, but not all the way. So perhaps we are still missing an important baryonic mass component? It happened before – we didn’t know about the ICM for decades after Zwicky first identified the missing mass problem in clusters – so perhaps there are still more baryons to discover there.

It could also be that the apparent offset occurs because we’ve failed to put clusters on the same Vf scale as galaxies. This is not easy to do, and we’ve spent a lot of time worrying about it. I don’t think this is what’s going on, though it would make my life a lot simpler if it were. Different indicators – dynamics vs. ICM hydrostatics vs. gravitational lensing – can give somewhat different answers, but not in a way that “fixes” the problem: I see no viable path in which the offset turns out to be a simple difference in the way the depth of the gravitational potential is measured. I would love to be wrong here, but I’m not dismissing the offset for clusters as I am for ultrafaint dwarfs (which I don’t do lightly).

Perhaps the extrapolation of the BTFR from individual galaxies to clusters is simply not appropriate. They’re very different kinds of systems, after all. To dig into that, we need some theoretical perspective – why does the observed power law happen? Should we expect different systems to share the same BTFR?

Theory is something I’ve studiously avoided in this post: the possibility that there are baryons that remain to be discovered in clusters can be inferred empirically. All the other data line up, so why not clusters? But unless and until these hypothetical additional baryons are discovered, that’s just one possibility. How likely this possibility seems to be diverges rapidly once we overlay a theoretical preference, which I will leave to future posts. (I did warn it would take more than one.)


&This paper appears in ApJ volume 1001. The literature has grown quite a bit since I started contributing to it in volume 342. The Astrophysical Journal was founded in 1895. So I’ve been contributing to it for a little over a quarter of its temporal existence, but nearly twice the number of volumes have been published in that shorter time. It’s no wonder none of us can keep up.

*Indeed, Tully & Fisher’s “preliminary estimate of the Hubble constant is H0 = 80 km/s/Mpc” remains correct to this day, within the uncertainties (hard to estimate at the time, but roughly ±10 km/s/Mpc).

*^There appears to be an irreducible intrinsic scatter in the linewidth: it is not a perfect proxy for rotation speed. Linewidths are observationally easier to obtain than resolved, extended rotation curves, so the numbers of galaxies in samples using linewidths can be very large without ever approaching the quality provided by resolved interferometric observations. Bigger samples are not necessarily better.

^I emphasize might here because the community seems to have moved towards reporting stellar masses as if we observe these rather than the luminosities and colors/SEDs that the mass estimates are based upon. The latter are data – observed quantities – while stellar masses are a derived quantity that is inevitably model dependent. This doesn’t stop being true just because we decide to invest a lot of faith in our models.

^%The Sloan Digital Sky Survey provides stellar masses based on models that are known to be wrong in the near infrared. Since SDSS itself is entirely optical, one might not notice. If one mixes SDSS data with near-IR data, one will get the wrong answer.

%This is a classic selection effect. Brighter objects can be seen at a much greater distance than dim ones, so probe a much larger volume. Consequently, their raw numbers always dominate surveys even if their number density is low. Stars are a great example: most of the stars you can see at night are intrinsically luminous: bright stars that are rather far away. Mundane, low mass stars do not stand out even when nearby.

#This isn’t for lack of observations of ultrafaint dwarfs, it’s the underlying assumptions.

$No amount of information suffices to perfectly specify the stellar mass that produces an observed luminosity and SED (spectral energy distribution/set of colors), so one always expects at least some intrinsic scatter in the stellar mass-to-light ratio. I’ve seen estimates that range from 0.1 – 0.2 dex for near-IR colors. That’s as good as it can get as there is always some transient population (e.g., AGB stars) that produces an amount of light that depends on the star formation rate some time ago, not what we measure now. Optical colors are worse in the sense of having more intrinsic scatter, as they are more susceptible to the comings and goings of bright but short-lived stars whose numbers fluctuate with the stochastic star formation rate. Finding 0.11 dex intrinsic scatter is pretty much as good as it can get. (By dex we mean the scatter in log space.)

**We noted this effect in the original BTFR paper to argue that it was unlikely that we were missing substantial amounts of molecular gas (H2), which was a concern at the time. Flash forward, and we were right: the molecular gas mass is almost always a distant third behind stars and atomic gas in the baryonic mass budgets of individual galaxies. Nowadays, the concern is about the mass of baryons in the circumgalactic medium (CGM). That’s getting ahead of the story, which I’ll save for a future post. For now, it suffices to note that any baryonic mass in the CGM is far beyond the radius where the flat velocity is measured, so is not relevant to the sums here.

Yep, it’s a religion

I have been concerned for years that dark matter was morphing from legitimate science into a cold, dark religion. I have been reluctant to put it that way, because there are lots of scientists who work on dark matter that have not fallen entirely down that rabbit hole and who continue to make valuable contributions working in that context. But a recent experience reminded me that my concerns were not misplaced, and there are plenty of scientists who have fallen irredeemably down this rabbit hole. No matter what answer the future holds to be correct, many current scientists will have gone to their graves in denial of it.

Where is the boundary between science and religion? It is hard to assess where the borderline is. But it is easy to see when people are far over the line – so far over that it doesn’t really matter where exactly the line is. One can attend any conference on the subject to find people who unabashedly assert that dark matter exists without question. Not just that acceleration discrepancies have been amply demonstrated empirically, but that the only possible interpretation is dark matter. If asked whether this invisible mass is in the room with us now, they will enthusiastically# answer yes! Since dark matter has not been detected in the laboratory, this assertion is an expression of faith – the hallmark of religion – not of an established scientific fact. What we have established is that there are discrepancies between what we see and what we get when we assume Newtonian gravity (or GR, if needed). What we don’t know is whether the cause of these discrepancies is some form of invisible mass (dark matter) or if the equations we employ are inadequate (modified gravity [or more generally, dynamics]).

Indeed, these days many people will assert that dark matter has already been detected, usually citing astronomical evidence that used to be considered too feeble to merit a Nobel prize. Funny how repeating a mantra long enough morphs an aspiration into accepted reality. Modern physics is not providing a strong falsification of the supposition that science is a social construct.

A prominent example of an observation of the sky that is frequently cited as absolutely requiring cold dark matter is the acoustic power spectrum of the cosmic microwave background. Quoting clayton from a few years ago:

the primary reason to believe in the phenomenon of cold dark matter is the very high precision with which we measure the CMB power spectrum, especially modes beyond the second acoustic peak. There is a stone-cold, qualitative, crystal clear prediction of CDM about the relative sizes of the second and third peaks that modified gravity profoundly and irredeemably gets wrong: it thinks the third peak should be relatively larger* than the second… whereas CDM thinks they should be about the same

I would accept that this were conclusive proof of dark matter if this were the unique prediction of dark matter: that there was no other way to do it, so all other approaches were indeed irredeemable. (Quite the strong language, eh?) The problem is that CDM is not the one unique way to fit these data. Skordis & Zlosnik showed that it is possible to write a modified gravity theory that also fits the CMB data:

CMB power spectrum observed by Planck fit by AeST (Skordis & Zlosnik 2021).

This does not prove the AeST theory of Skordis & Zlosnik is correct, but it does demonstrate that it is possible to write a modified gravity theory that does indeed do what it is frequently asserted to be impossible for a modified gravity theory to do. I’ve heard of a couple of other theories that can also do this (the relativistic Khronon theory of Blanchet and nonlocal MOND as discussed by Deffayet & Woodard), so clearly this success is not uniquely limited to cold dark matter, or even a particular modified gravity theory. The work of Skordis & Zlosnik (2021) was known and in the literature before clayton made the assertion above in late 2022, so either he wasn’t paying attention (likely) or is convinced that it is impossible so doesn’t even consider the possibility (also likely). The former just says we’re all too busy, but the latter is a mark of religious thinking: my god is the only god, thou shalt have no other hypotheses before& me.

Many people are very impressed with the quality of the LCDM fit to the CMB. That is indeed very good, but there are enough free parameters that we were going to get a fit to any physically plausible power spectrum. If not, we’ve never been shy about making up new parameters. (Evolving dark energy, anyone? How about a running power spectrum? There’s a whole bag of possibilities!) What I’ve been more impressed with is the consistency of the fit to the CMB data with the many independent constraints on conventional cosmology. Or at least it was, until it wasn’t.

The Hubble tension has gotten steadily worse (in terms of statistical significance), and it really does not look like local measurements are to blame, nor is it the only tension. People seem to miss that it is the CMB-fitted value of the Hubble constant that has evolved over time to spoil the concordance that got us to believe in LCDM in the first place. But if the CMB is the cornerstone of your religion, all other data must inevitably be at fault and can be ignored: there is an entire community of cosmologists who choose to believe the best-fit Planck cosmology to the exclusion of all other data. It’s like the bad old days of the Hubble tension all over again, with the physics community choosing to believe the lower value of H0 because it makes more sense for the aspects of cosmology that they care about while those in the astronomical community who actually measure H0 find a persistently higher value.

A real tension in LCDM implies the need for new physics of the unknown variety. One doesn’t want to go there if it can be helped. I didn’t consider MOND until I was already concerned for the viability of dark matter. There are real problems for the paradigm that its more intense advocates simply deny, brush aside without real thought, or choose to remain ignorant of. When they are confronted with a problem, they are pretty creative about making stuff up on the spot. Anything to avoid having to confront the unspeakable – another hallmark of religion.

For example, cold dark matter is scale free. That’s foundational to the hypothesis. So the existence of an acceleration scale in the kinematic data is anathema to CDM. When I first pointed this contradiction out, there were a variety of assertions to the effect of “does too!” One example is provided by Kaplinghat & Turner, who claim to show “how Milgrom’s law comes about in the cold dark matter theory of structure formation.” That would, indeed, be ideal, and is a requirement for any theory to be successful.

Wee problem: they demonstrate no such thing. CDM is scale free, yet K&T claim that it explains Milgrom’s Law, which is predicated on the existence of an acceleration scale. Well, which is it? Is CDM scale free? Or does it explain the acceleration scale? We can’t have it both ways: their very premise is self-contradictory. It is absurd on its face.

The acceleration scale is defined by baryons, for which K&T have no model. To connect baryons with dark matter, they make a hand-waving argument about galaxies reaching a0 at the edge of their disks. This is not even a concept of a model and does not begin to suffice as an explanation for many reasons, a prominent one being that low surface brightness galaxies have accelerations less than a0 everywhere:

Centripetal acceleration curves color coded by galaxy surface brightness. Low surface brightness galaxies (blue colors) have low (sub-a0) accelerations everywhere: there is no edge at which they reach a0. (Adapted from McGaugh 2020.)
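
For readers who want to check this against a rotation curve, the centripetal acceleration is just V^2/R. The sketch below is a minimal unit-conversion helper with hypothetical names and made-up example values. The value a0 = 1.2 x 10^-10 m/s^2 is the commonly quoted one and is an assumption here, since the post does not state a number.

```python
# Minimal sketch: centripetal acceleration V^2/R compared with a0.
KM_TO_M = 1.0e3        # metres per kilometre
KPC_TO_M = 3.0857e19   # metres per kiloparsec
A0 = 1.2e-10           # m/s^2; assumed value, not quoted in the post


def centripetal_acceleration(v_kms: float, r_kpc: float) -> float:
    """Centripetal acceleration in m/s^2 for speed in km/s at radius in kpc."""
    v = v_kms * KM_TO_M
    r = r_kpc * KPC_TO_M
    return v * v / r


if __name__ == "__main__":
    # Hypothetical points: a bright spiral and a low surface brightness galaxy
    for label, v, r in (("bright spiral", 200.0, 8.0), ("LSB galaxy", 80.0, 10.0)):
        g = centripetal_acceleration(v, r)
        print(f"{label}: g = {g:.2e} m/s^2 = {g / A0:.2f} a0")
```

With these made-up inputs the bright spiral sits above a0 at 8 kpc, while the slowly rotating, extended galaxy is well below a0 at 10 kpc.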

Milgrom pointed out this and many other shortcomings of their scenario, so I feel no need to elaborate further. Milgrom eviscerated their paper so thoroughly that the proper course of action would have been to retract it. Instead, they simply never acknowledge the criticism, and persist to this day in pushing it as some sort of valid scientific explanation. It is not; it does not withstand even mild critical scrutiny. But it doesn’t need to: it reassures the faithful that all is well. They hear what they want to hear without questioning its veracity. That’s another hallmark of religion.

I have refrained from saying these things in the past because I’m too nice. For example, a few years ago I started then abandoned the draft text below, which I simply cut & paste:


One of the things that attracted me to a career in science is the notion of objectivity. I grew up for a time in the bible belt, where people earnestly believed things that were obviously untrue, even to the eyes of a small child. On the occasions that I had the temerity to point out the obvious, the contradictions posed by facts never had an impact on their belief system. Rather, it inevitably earned me a warning that I was going to hell. No few of these people seemed to think it was their religious duty to send me there prematurely, or at least to make life on Earth a living hell.

Scientists eschew such behavior, but are also human, so often engage in it anyway. I’ve encountered it a lot. I get it; I went through the same denial, grief, and anger over the prospect of losing my good friend cold dark matter. The stages of grief never brought something back from the dead, but it has engendered a lot of blame-the-messenger.

Here’s an example, from a review by Mike Turner:

Excerpted from Turner (2021).

There is a lot of misinformation packed into this short paragraph.

The first clue is right there at the beginning, in red: the heading “False starts.” This is false framing, a classic tool of propagandists. It starts from the outset by asserting that the topic to be discussed is wrong at a level of knowledge so common it requires no justification. This is not the way one starts an objective discussion, much less a scientific one.

Turner then misconstrues what Milgrom did. He didn’t notice the scale a0 in the data, for which there was scant evidence at the time. Rather, Milgrom made the obvious statement that the inference of dark matter relied on the assumption that dynamics, as encapsulated by the laws of inertia and gravity, is the same on the very different scales of galaxies as in the solar system where they were established, so we ought to consider if dynamics might change in some way. He quickly excluded a size dependence as a possibility. How he settled on acceleration is beyond the scope of this post, and not for me to say. Neither is it for Turner to say.

After a brief and incomplete description of what MOND is, Turner allows that “this one-parameter model fits all the rotation-curve data”. Even in making this admission, he chooses to call it a model rather than a theory. A model is something specific you build in the context of a theory, like a halo model in CDM. MOND is more than that.

Turner quickly moves on without contemplating any meaning that rotation curves might hold. Let’s pause to consider that.

First, I would not say that MOND fits all the rotation curve data. It fits most galaxies, but there are a minority of weird cases that are not well fit. The weird cases inevitably don’t make sense in terms of dark matter either, so on the whole I interpret this to be the usual price of dealing with astronomical data – some of it is just goofy. Setting such cases aside, I can and have fit the same data with all sorts of dark matter halo models. MOND requires fewer parameters, which is important, but the difference isn’t in the fitting. The difference is in predictive ability. I can use MOND to predict the dynamics of galaxies a priori, and have done so many times. I cannot use any flavor of dark matter theory to do the same, and it’s not for lack of trying.

The predictive power of MOND must be telling us something, even if it is something about the nature of dark matter or the process of galaxy formation. There are many papers written on this, some deep and profound, others absurd and banal. Turner cites none of them, nor displays any awareness that such work exists. I would venture to guess that is because acknowledging such work would imply that there is something to debate here, something he would apparently rather not admit.


That’s where I left off. It’s exhausting deciphering other people’s false assertions. Moreover, I just don’t like criticizing other people, no matter how richly they deserve it. (Turner has never refrained from criticizing me in ad hominem terms: on one occasion$ he showed my picture to an audience and called me “the enemy.”) A large segment of the particle physics and cosmology community appears to think this way, and has succumbed to a scientific version of bible thumping in which you can assert any absurd thing so long as it falls within the framework of the holy LCDM. They really need to find something better to do.

I had hoped we were past this, but I heard a talk last week that was exactly in this mode. To paraphrase, the talk went

We’re sure dark matter exists. We have been sure about it for decades. In that time, we have been repeatedly proven wrong about what it is. Rather than re-think our paradigm in the face of these repeated failures, we double down yet again on the existence of this invisible, undetected mass, asserting aggressively% that it must be true while eliding or misrepresenting the evidence that it is not. This enables us to make up a whole lot of exciting new possibilities for what the dark matter might be and conceive of ever more grandiose experiments to continue not to detect it. You must believe in dark matter!

This was not a science talk so much as an indoctrination session. It was as if I had stumbled into a revivalist tent where some hothead was preaching to the choir. This is the kind of talk that misled an entire generation into wasting their careers at the bottom of a mine shaft searching for WIMPs. At least WIMPs were a well-motivated hypothesis; this kind of talk could lead a new generation down an even greater variety of garden paths.

I am well aware that I might fall prey to this attitude myself. That’s why I set criteria by which I would change my mind: detect dark matter already, or at least provide a satisfactory explanation as to how MOND comes about. Neither of those criteria has been met. There are claims to do the latter, but so far these are just variations on models I tried and found to fail long ago. If I thought these could work, I would have said so. At the same time, I don’t see any dark matter advocates taking up the challenge to specify what would change their minds. When I ask them what could falsify dark matter, I get dumbfounded looks – the deer-in-the-headlights face one gets when the immediate response “why would you even ask that?” is checked by a distant memory that scientific theories are supposed to be falsifiable.

Personally, I found it humbling to encounter MOND in my own data. I too thought we understood the universe with dark matter. But who ordered this? Certainly not me: my own conventional, dark-matter based predictions were falsified. No one else working in the context of dark matter had got it right at the time either. Only Milgrom ordered this.

And what is this? There is a direct connection between what we see and what we get. Even in ignorance of MOND, the radial acceleration relation encodes a one-to-one relation between the distribution of baryons and the effective force. This is so direct that one can write down a single equation connecting the two:

$$g_{obs} = F(g_N/a_0)\,g_N.$$

The observed acceleration is a simple function of that predicted by Newton for the stars and gas that we see. There is no mention of unseen mass; everything is specified by what we can see is there.
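To make the equation concrete, here is a minimal sketch in Python. The text does not specify the function F; the exponential form below is one that is often used in fits to the radial acceleration relation, and both that form and the value a0 = 1.2e-10 m/s² are assumptions made for illustration. The names `interpolation_F`, `g_obs`, and `A0` are hypothetical.

```python
import math

A0 = 1.2e-10  # m/s^2; assumed value of the acceleration scale a0


def interpolation_F(x):
    """Assumed interpolation function F(x) = 1 / (1 - exp(-sqrt(x))).

    Here x = g_N / a0. This particular form is an illustrative choice,
    not something fixed by the equation in the text.
    """
    # -expm1(-s) equals 1 - exp(-s) and stays accurate for small s.
    return 1.0 / (-math.expm1(-math.sqrt(x)))


def g_obs(g_newton, a0=A0):
    """Observed acceleration predicted from the Newtonian (baryonic) one."""
    return interpolation_F(g_newton / a0) * g_newton


if __name__ == "__main__":
    for g_n in (1e-8, 1e-10, 1e-12, 1e-14):
        print(f"g_N = {g_n:.1e} m/s^2  ->  g_obs = {g_obs(g_n):.3e} m/s^2")
```

With this choice of F, the output tends to g_N at high accelerations and to sqrt(g_N a0) at low accelerations. Only the baryonic acceleration and the single scale a0 enter, so nothing in the calculation refers to unseen mass.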

I’ve sometimes heard astronomers complain about the reductionist ethos of physics, trying to cram all the complexity of the entire universe into a theory of everything. But here it is appropriate: there is a single, apparently universal force-law at work in galaxies. That’s telling us something profound. And yet if questioned about this, the physicists are the ones who will complain that galaxies are complicated, so they should be exempted from having to explain them. Galaxies should be complicated – in LCDM. But they’re observed not to be, in the sense that a single equation suffices to describe their kinematics. The problem isn’t that galaxies are inexplicably complicated, it’s that they should be but aren’t.

I am deeply disappointed that many scientists apparently lack the physical intuition to immediately recognize the import of the simple relation between what we see and what we get. It is the same sort of thing Newton noticed in the solar system: everything happens as if the gravitational force is proportional to the product of the masses and the inverse square of their separation. He didn’t understand why at the time, and was criticized for indulging in magical thinking: how can there be action at a distance? But that’s what the data were saying, and the same applies now. We might not yet understand the why, but the data look as if MOND is what’s happening in this universe.


#The framing has morphed over the years. A recent advent is that some people have started proactively asserting that invisible mass is in the room with us now in order to avoid having to answer it as a question that makes them sound like loonies.

*He means the third peak should be smaller than the second, not larger, if by “it” he means modified gravity with the baryon density expected from big bang nucleosynthesis, which was the hypothesis that correctly predicted the first-to-second peak ratio but does indeed get the second-to-third peak ratio wrong. Funny how the CMB community was able to completely ignore the successful prediction for several years, but were then suddenly all over the latter failure. The third peak falsifies the ansatz on which that particular prediction was built, not the entire concept of modified gravity. This would be like asserting that all possible forms of dark matter are excluded because we haven’t yet detected WIMPs. It is a classic failure of objectivity, which is another hallmark of faith-based argumentation: we know His name is [insert favorite deity], not [insert any other deity].

&Or after me. Dark matter was my first hypothesis, and I’m here to tell you that True Believers do not suffer second hypotheses or those who stray from the fold. I guess that’s why so many scientists who are MOND-curious keep it on the down low. Wise, perhaps (that’s why tenure needs to be a thing), but hardly the ideal of the open and free exchange of scientific ideas.

$I wasn’t there, but one audience member (not someone I knew) thought it was so over the top that he told me about it, sharing a link with a video. (I did not retain that link, and doubt the hosting conference website is still active.)

%Argument weak here. RAISE VOICE!

Paradigm Shifts in Modern Astrophysics


I see that I’ve been posting once a month so far in 2026. I’ve lots to say but no time to say it. Some of it good, some of it bad, maybe sometime I’ll get around to it. No guarantees. On the good side, I’ve been working on a big project or two; may have something to say about those soon. I’ve also been meaning to write about the Planet 9 anomaly for months stretching into years now. Fascinating stuff related to MOND but not something I’ve worked on myself. On the bad side, I’ve been obliged to waste yet more time on my university administration’s insistence on merging our department into physics based on a snap decision made by a disinterested leader who employed all the forethought typically reserved for bombing a random country in the Middle East.

So I have had no time for novel posts lately, and today is no different. However, I thought readers of this blog would appreciate the post Paradigm Shifts in Modern Astrophysics: Applying Thomas Kuhn’s The Structure of Scientific Revolutions to Dark Matter at Heritage Diner that was pointed out to me by Moti Milgrom. Since I wouldn’t have seen it had he not mentioned it, perhaps that’s the case for you as well. I’m not gonna re-post it verbatim – you can read it there yourself – but I am going to offer a running commentary with a few observations, both personal and historical. So bring it up in a separate browser window and let’s read along…

This post riffs off of Kuhn’s The Structure of Scientific Revolutions as it pertains to dark matter and MOND. If you’re not familiar with it, Kuhn’s work on the philosophy of science is foundational to the way in which a lot of physical scientists approach their field (whether they realize it or not). Philosophers of science have done a lot more since then, but I’m not going to attempt to go there. I will look back to Popper* to note that I’ve heard Kuhn depicted as being some sort of antithesis to Popper. I don’t see it that way. To be pithy, Popper tells us how science should be done while Kuhn tells us how it is done. Who could have imagined that a human endeavor would be messy in practice and not always live up to its ideal?

I’m not sure how to do this; I guess I’ll excerpt relevant quotes and riff off those. The basic thesis is that dark matter is on the brink of a Kuhnian paradigm shift.

We are living through exactly that moment in modern astrophysics.

I certainly hope so! This moment in the history of science is taking a long damn time. A century ago, we went from “classical physics explains everything” to “quantum mechanics, WTF?” in the space of about a decade. I’ve been working on matters related to MOND for over thirty years now, dark matter longer than that, and of course Milgrom started more than a decade before I did.

The essay discusses the “cartography of collapse,” which includes crisis and revolution:

The third stage is crisis — triggered when anomalies accumulate beyond the paradigm’s absorptive capacity. And the fourth is revolution, in which a new framework displaces the old not through incremental persuasion but through a gestalt shift, what Kuhn famously described as seeing the same duck-rabbit drawing and suddenly recognizing a rabbit where you had always seen a duck.

This resonated with me because I had exactly this experience. I started my career as much a believer in dark matter as anyone. I was barely aware that MOND existed (this seems to remain a common condition). But it reared its ugly head in my own data for low surface brightness galaxies. Try as I might – and I tried mighty hard, for a long time – I could not reconcile how the shapes of rotation curves depended on surface brightness as they should according to Newton while simultaneously lying exactly on the Tully-Fisher relation without any hint of dependence on surface brightness+. I could explain one or another, but not both simultaneously – at least, not without engaging in some form of tautology that made it so. I came up with a lot of those, and that has been a full-time occupation for many theorists ever since.

For me, this gradually became a genuine crisis. I pounded my head against the wall for months. Then, as I was wrestling with this problem, I happened to attend a talk by Milgrom. I almost didn’t go. I remember thinking “modified gravity? Who wants to hear about that?” But I did, and in a few short lines on the board, Milgrom derived from MOND exactly the result I found so confusing in terms of dark matter. This chance meeting in Middle Earth (Cambridge, UK) changed how I saw the universe. The change wasn’t immediate – it had to ferment a while – but ultimately I found myself asking myself over and over how this stupid theory could have its predictions come true when there was so much evidence for dark matter. Finally I realized that the evidence for dark matter assumes that gravity is normal; really it was just evidence of a discrepancy, and it could be that the assumption was at fault. That realization was sudden: where I’d always seen a duck, suddenly I could also see a rabbit.

Most scientists have not had this experience. What constitutes a crisis serious enough to contemplate a paradigm change is a highly personal matter of judgement. It happened in my data, so I took it seriously, but others didn’t care. So I made predictions for their data. Some of those came true, but they rejected the evidence of their own data. It just could not be so! At what point does a mere problem amount to a true anomaly?

Part of the sociological issue is that the dark matter paradigm has been in a constant state of crisis since its inception. The reasons vary over time. Sometimes valid solutions have been found to the crisis du jour, other times we’ve chosen to just live with it. It is much easier to live with a bad solution than to rethink one’s entire world view.

The problem with being in a constant state of crisis is that it seems like nothing can ever be a genuine crisis. Every foundational change is just another new normal. We complain, say it can’t be so, argue, offer bad ideas, reject them, get used to them, then eventually accept that one of them maybe isn’t so bad, so that must be what is going on. After a few years It is Known and people convince themselves that we expected just that all along.

It takes a lot of evidentiary weight for a paradigm to change, and it takes a lot of time for that to accumulate. But, as Kuhn recognized, mere facts are not enough. Humans and their attitudes matter. As Feyerabend noted,

The normal component [i.e. the accepted paradigm and its adherents] is large and well entrenched. Hence, a change of the normal component is very noticeable. So is the resistance of the normal component to change. This resistance becomes especially strong and noticeable in periods where a change seems to be imminent.

P. Feyerabend in Criticism and Growth of Knowledge

The post correctly points out that dark matter itself was an anomaly going back to Zwicky in 1933. This is often depicted^ as the first detection of dark matter, but it was also noted by Oort in 1932. Zwicky was aware of Oort’s work and cited him, but they’re very different results. Oort was worried about a factor of ~2 discrepancy in stellar dynamics in our local chunk of the Milky Way; Zwicky discovered a discrepancy of a factor of ~1000 in the Coma cluster of galaxies. These both imply the need for unseen mass, but the results are not at all the same. In retrospect, Oort’s discrepancy is a subtle detection of a flat rotation curve while Zwicky’s discrepancy was (at least) two distinct discrepancies: what we now consider the usual cosmic dark matter, but also missing baryons: most of the normal matter in clusters is in the hot, diffuse intracluster medium, not in the stars in the galaxies that Zwicky could see and account for. The modern discrepancy is only a factor of ~6, which is rather less than 1,000. (The distance scale also played a role in exaggerating Zwicky’s result.)

This all seemed crazy in the 1930s, even in the immediate aftermath of the quantum revolution. Consequently, Zwicky’s work was mostly ignored$. The subject of dark matter didn’t really take off until the 1970s. Considerable credit goes (rightly) to Vera Rubin, though many others made essential contributions – just on the subject of rotation curves, Albert Bosma, Mort Roberts, and Seth Shostak all made important contributions, the relative importance of which depends on who you ask.

An important aspect of scientific revolutions is persistence. Vera was persistent. She was fond of relating the story of showing her first (1970) flat rotation curve of Andromeda to Allan Sandage, only to have him dismiss it as “the effect of looking at a bright galaxy.” What the heck did that mean? Nothing, of course – it is the sort of stupid thing that smart people say when confronted with the inconceivable. So Vera persisted, and by the end of the decade had shown that flat rotation curves were the rule, not some strange exception. They became accepted as a de facto law of nature, and the dark matter interpretation was solidly in place by 1980.

The scientific community absorbed this anomaly not by questioning Newtonian gravity or Einstein’s general relativity, but by proposing an invisible scaffolding — a halo of non-luminous, non-interacting matter surrounding every galaxy. Dark matter became not a crisis but a patch.

Indeed, this seemed the most appropriate (scientifically conservative) course of action at the time, as summarized in this exchange (also from the early 1980s):

To emphasize the essence of what is said here:

Tohline: I might be so bold as to suggest that the validity of Newton’s law should now be seriously questioned.

Rubin: The point you raise is worth keeping in mind although I believe most of us would rather alter Newtonian gravitational theory only as a last resort.

This was a very reasonable attitude, at the time. But I’ve heard the phrase “only as a last resort” many times now over the course of many years from many different scientists. At what point have we reached the last resort? In the case of dark matter, once we’ve convinced ourselves that invisible mass has to exist, how can we possibly disabuse ourselves of that notion, should it happen to be wrong?

In Kuhnian terms the last resort is reached when the weight of anomalies in the standard paradigm becomes too great to sustain. But that point is never reached for many die-hard adherents. Whatever the right answer about dark matter turns out to be, I’m sure many brilliant people will go to their graves in denial. Hence the more cynical phrase

Science progresses one funeral at a time.%

But does it? What if the adherents of an ingrained but incorrect paradigm breed faster than they go away? I’ve seen True Believers train graduate students who’ve gone on to train students of their own. Each generation seems to accept without serious examination the inadequate explanations for the anomalies made by their antecedents, so the weight of the anomalies doesn’t accumulate; instead, each one gets swept separately under the proverbial rug and forgotten. Forgetting is important: when new anomalies come to light, hands are waved and new explanations are promulgated; no one checks if the new explanations contradict the previous generation of explanations. What passed before is a solved problem, and we need never speak of it again.

This is not a recipe for a scientific revolution, but for a thousand years of dark epicycles.

Returning to the post,

By the late 1980s and early 1990s, dark matter had been formally incorporated into the reigning cosmological framework. Lambda-CDM — where Lambda refers to the cosmological constant (a proxy for dark energy) and CDM stands for Cold Dark Matter — became the standard model of cosmology.

The essence of this statement is correct but some of the details are not. Dark matter was widely accepted by 1980. That’s still a little before my time, but my impression is that the magnitude of the discrepancy was at first a factor of two, so it could simply have been normal baryons that were hard to see. However, the discrepancy rapidly snowballed to an order of magnitude, so we needed something non-baryonic. This was happening simultaneously with talk of supersymmetry and grand unified theories in particle physics that could readily provide new particles to be candidates for the dark matter, leading to the shotgun marriage of particle physics and cosmology, two communities that had had little to do with each other before then, and which still make an odd couple. Cosmology as traditionally practiced by astronomers needed dark matter but didn’t much care what it was; particle physics was all about the possibility of new particles but didn’t care about the details of the astronomical evidence.

To rephrase the above quote, I think it is fair to say that “by the late 1980s and early 1990s, cold dark matter had been formally incorporated into the reigning cosmological framework.” But that framework was not yet LCDM, it was Ωm = 1 SCDM. The Lambda only came to prominence by the end of the 1990s, as I’ve related elsewhere. This process is depicted by many scientists as a revolution in itself, and in many regards it was. The cosmological constant had been very far out of favor; rehabilitating it was a grueling experience and no trivial matter. But it wasn’t really a scientific revolution in the sense that Kuhn meant: our picture didn’t fundamentally change, we just learned to accept a parameter& that was already there but that we didn’t like.

The post goes on to note the absence of dark matter detections:

This silence is itself an anomaly… as the silence deepens, the null result itself becomes harder to dismiss.

This is correct, and yet… Physicists have built many experiments that have achieved extraordinary sensitivities. If cold dark matter was composed of WIMPs as originally hypothesized, we would have detected them long ago. Initially, the reaction was to modify WIMPs. Did we say the cross-section would be 10⁻³⁹ cm²? We meant 10⁻⁴⁴ cm². When that was excluded, we slid the cross section still lower, but people also started giving themselves permission to think the unthinkable. By unthinkable I mean a particle that can’t be detected, not modified gravity. That’s more unthinkable. So the anomaly isn’t dismissed, but it is treated with less gravity than it should be, and certainly with less import than a positive detection would have been granted. Did we say WIMPs? We didn’t mean just WIMPs. It could be anything. (They damn well meant WIMPs and only WIMPs#. Anyone who tells you otherwise is gaslighting*% you, and probably themselves.)

The post goes on to talk about MOND. It gives me too much credit for the gravitational lensing work. This was done by Tobias Mistele, and our work is based on that of Brouwer et al. But it is correct to note that these data are a problem for the dark matter paradigm. Rotation curves remain flat beyond where dark matter halos should end. If correct, this is a genuine anomaly. Perhaps in some distant future it will be recognized** as such in retrospect; at present it seems mostly to be ignored.

It goes on to talk about the JWST observations. Yeah, that part is correct. The community seems to be in the usual process of gaslighting itself into denial of the anomaly. For the first two years after JWST started returning images of the deep universe, people were aghast. How can this be so? It was all anyone could talk about. But then the unexpected became the new normal. Hands were waved, star formation was accepted to be absurdly efficient, and people accepted the impossible. I no longer hear the talk of how problematic the JWST observations are; this chatter simply stopped.

Anomalies don’t weigh a paradigm down if we don’t accept that they’re anomalies. But I’ve lived through the revolution; it’s hard to see a positive outcome while it is still ongoing. For it is certainly true that

What waits on the other side of the dark matter revolution — if that is what is coming — we cannot yet know.

The future is the unknown territory. We don’t know, and can never know, if dark matter doesn’t exist – it is impossible to prove the negative. But we do know MOND works much better than it should in a universe made of dark matter. That demands a scientific explanation that is still wanting. But MOND by itself is not a complete answer, so we are like the blind men in the parable of the elephant, each sensing a different part of reality but as yet unable to see the whole.

Still, there is reason for optimism. The article closes by noting that

Kuhn’s deepest insight was not that science changes. It is that the change, when it comes, is never merely technical. It is a reorganization of the world itself — the universe seen suddenly whole in a configuration it has always had, but that we had simply lacked the paradigm to perceive.

Not knowing how things ultimately work out is good, actually. One way or the other, there is still fundamental science to be done. We have not reached the stage of looking for our discoveries in the sixth place of decimals.


*Trivia I just learned looking at Popper’s Wikipedia page: he was spending his last days in London around the same time I was a postdoc in Cambridge just starting to struggle with the scientific and philosophical implications of the dark matter-MOND miasma.

Unrelated trivia: I was at a workshop in Jerusalem early in the century but missed the opportunity to meet Jacob Bekenstein because I was too shy to bother the great man.

+If you do not find this confusing, you are not thinking clearly.

^A nice, brief summary of this early history is related by Einasto. This is the first place I’ve seen the citation to Opik (1915) written out. I’ve only heard it mentioned verbally before, so I’ll have to try looking that up later.

The full story is way more complicated than this sounds, and still gets debated off and on. The amplitude of the Oort discrepancy is much smaller today. Locally, the 3D density of mass seems to be accounted for by known stars, gas, and stellar remnants (which were still a new thing in the 1930s). So this Oort limit shows no discrepancy. There remains a modest discrepancy in the 2D dynamical surface density. It appears to me to boil down to the vertical restoring force having a (sometimes ignored) term that depends on the gradient of the rotation curve. Were that falling in a normal Newtonian way, there would be no discrepancy. But it isn’t; this deviation from Newton in the radial direction leads to the Oort discrepancy in the vertical direction. Instead of being as negative as Newton predicts, dV/dR is close to zero, hence my description of this as an indirect detection of a[n almost] flat rotation curve. (dV/dR = -1.7 km/s/kpc, so not exactly zero, but a lot closer to zero than Newton without dark matter would have it be.) The vertical discrepancy is nevertheless much reduced, now being well below a factor of two.

$To his apparently great embitterment. He had some choice things to say about astronomers of his time. I am inclined to suspect that those who praise Zwicky the loudest today would have been among those he had reason to complain about had they been contemporaries.

%This is attributed to Planck, but he had a lot more nuanced things to say about it in his Nobel Prize lecture.

&Einstein disavowed the cosmological constant as his “greatest blunder,” so one argument against it was (for a long time) that it should never have been a part of the theory of General Relativity in the first place. I wonder how things might have gone had that been the case – that he had never introduced Lambda. Perhaps then the data that led to us accepting Lambda would have required a genuine revolution, but it isn’t obvious that we would have accepted it (we might still be debating it), nor is it apparent that LCDM is what comes out of such a revolution. But we don’t get to do that experiment: the Great Man had suggested Lambda, so it was OK to bring it back: we weren’t wrecking his theory by introducing a crazy new entity, we were just admitting an unlikely (antigravity-like) component thereof.

#Or axions! Or warm or self-interacting dark matter. Or macros nee strange nuggets! Or or or… Sure, there have been lots of ideas for what the dark matter could be. But when we say that “by the late 1980s and early 1990s, cold dark matter had been formally incorporated into the reigning cosmological framework” what the vast majority of scientists working on the topic (including myself) meant was that CDM == WIMPs. We were aggressively derisive of other ideas, and these are only dredged up again now because of the experimental non-detection of WIMPs. WIMPs are still a better dark matter candidate than the others for the same reasons that we were derisive of the others back in the day. We haven’t been looking as hard for the others, so comparable experimental limits do not yet exist. To quote myself,

The concept of dark matter is not falsifiable. If we exclude one candidate, we are free to make up another one. After WIMPs, the next obvious candidate is axions. Should those be falsified, we invent something else. (Particle physicists love to do this. The literature is littered with half-baked dark matter candidates invented for dubious reasons, often to explain phenomena with obvious astrophysical causes. The ludicrous uproar over the ATIC and PAMELA cosmic ray experiments is a good example.)

McGaugh (2008)

*%An easy way to deflate such gaslighting is to ask why so many experiments have been built to search for WIMPs but not all these other allegedly great dark matter candidates. After a pause and dismayed stare, you’ll probably get an answer about “looking under the lamp post” because that’s where it is possible to make detections. That’s sorta true, but it isn’t the real reason. The real reason is that we all drank the Kool-Aid of the WIMP miracle, so genuinely believed that the dark matter had to be WIMPs, not merely that they were a convenient experimental target. (I did not chug the Kool-Aid as hard as the people who based entire careers on building WIMP detection experiments, but I did buy into the idea to the exclusion of other possibilities for dark matter – as did most everyone else.)

**In retrospect, Galileo’s observations of the angular size and phases of Venus were utterly fatal to the geocentric paradigm. That’s easy to say now; at the time it was just another piece of evidence.

Has dark matter been detected in the Milky Way?


If a title is posed as a question, the answer is usually

No.

There has been a little bit of noise that dark matter might have been detected near the center of the Milky Way. The chatter seems to have died down quickly, for, as usual, this claim is greatly exaggerated. Indeed, the claim isn’t even made in the actual paper so much as in the scuttlebutt# related to it. The scientific claim that is made is that

The halo excess spectrum can be fitted by annihilation with a particle mass mχ ∼ 0.5–0.8 TeV and cross section ⟨συ⟩ ≃ (5–8)×10⁻²⁵ cm³ s⁻¹ for the bb̄ channel.

Totani (2025)

What the heck does that mean?

First, the “excess spectrum” refers to a portion of the gamma ray emission detected by the Fermi telescope that exceeds that from known astrophysical sources. This signal might be from a WIMP with a mass in the range of 500 – 800 GeV. That’s a bit heavier than originally anticipated (~100 GeV), but not ridiculous. The cross-section is the probability for an interaction with bottom quarks and anti-quarks. (The Higgs boson can decay into b quarks.)

Astrophysical sources at the Galactic center

There is a long-running issue with the interpretation of excess signals as dark matter. Most of the detected emission is from known astrophysical sources, hence the term “excess.” There being an excess implies that we understand all the sources. There are a lot of astrophysical sources at the Galactic center:

The center of the Milky Way as seen by the South African MeerKAT radio telescope with a close up from JWST. Image credit: NASA, ESA, CSA, STScI, SARAO, S. Crowe (UVA), J. Bally (CU), R. Fedriani (IAA-CSIC), I. Heywood (Oxford).

As you can see, the center of the Galaxy is a busy place. It is literally the busiest place in the Galaxy. Attributing any “excess” to non-baryonic dark matter is contingent on understanding all of the astrophysical sources so that they can be correctly subtracted off. Looking at the complexity of the image above, that’s a big if, which we’ll come back to later. But first, how does dark matter even come into a discussion of emission from the Galactic center?

Indirect WIMP detection

Dark matter does not emit light – not directly, anyway. But WIMP dark matter is hypothesized to interact with Standard Model particles through the weak nuclear force, which is what provides a window to detect it in the laboratory. So how does that work? Here is the notional Feynman diagram:

Conceivable Interactions between WIMPs (X) and standard model particles (q). The diagram can be read left to right to represent WIMPs scattering off of atomic nuclei, top to bottom to represent WIMPs annihilating into standard model particles, or bottom to top to represent the production of dark matter particles in high energy collisions.

The devious brilliance of this Feynman diagram is that we don’t need to know how the interaction works. There are many possibilities, but that’s a detail – that central circle is where the magic happens; what exactly that magic is can remain TBD. All that matters is that it can happen (with some probability quantified by the interaction cross-section), so all the pathways illustrated above should be possible.

Direct detection experiments look for scattering of WIMPs off of nuclei in underground detectors. They have not seen anything. In principle, WIMPs could be created in sufficiently high-energy collisions of Standard Model particles. The LHC has more than adequate energy to produce dark matter particles in this way, but no such signal has been seen$. The potential signal we’re discussing here is an example of indirect detection. There are a number of possibilities for this, but the most obvious^ one follows from WIMPs being their own anti-particles, so they occasionally meet in space and annihilate into Standard Model particles.

The most obvious product of WIMP annihilations is a pair of gamma rays, hence the potential for the Fermi gamma ray telescope to detect their decay products. Here is a simulated image of the gamma ray sky resulting from dark matter annihilations:

Simulated image from the Via Lactea II simulation (Fig. 1 of Kuhlen et al. 2008).

The dark regions are the brightest, where the dark matter density is highest. That includes the center of the Milky Way (white circle) and also sub-halos that might contain dwarf satellite galaxies.

Since we don’t really know how the magic interaction happens, but have plenty of theoretical variations, many other things are also possible, some of which might be cosmic rays:

Fig. 3 of Topchiev et al. (2017) illustrating possible decay channels for WIMP annihilations. Gamma rays are one inevitable product, but other particles might also be produced. These would be born with energies much higher than their rest masses (~100 GeV, while electrons and positrons have masses of 0.5 MeV) so would be moving near the speed of light. In effect, dark matter could be a source of cosmic rays.

The upshot of all this is that the detection of an “excess” of unexpected but normal particles might be a sign of dark matter.

Sociology: different perspectives from different communities

A lot hinges on the confidence with which we can disentangle expected from unexpected. Once we’ve accounted for the sources we already knew about, there are always new sources to be discovered. That’s astronomy. So initially, the communal attitude was that we shouldn’t claim a signal was due to dark matter until all astrophysical signals had been thoroughly excluded. That never happened: we just kept discovering new astrophysical sources. But at some point, the communal attitude transformed into one of eager credulity. It was no longer embarrassing to make a wrong claim; instead, marginal and dubious claims were made eagerly in the hopes of claiming a Nobel prize. If it didn’t work out, oh well, just try again. And again and again and again. There is apparently no shame in claiming to see the invisible when you’re completely convinced it is there to be seen.

This switch in sociology happened in the mid to late ’00s as people calling themselves astroparticle& physicists became numerous. These people were remarkably uninterested in astrophysics or astrophysical sources in their own right but very interested in dark matter. They were quick to claim that any and every quirk in data was a sign of dark matter. I can’t help but wonder if this behavior is inherited from the long drought in interesting particle collider results, which gradually evolved into a propensity for high energy particle phenomenologists to leap on every two-sigma blip as a sign of new physics, dumping hundreds of preprints on arXiv after each signal of marginal significance was announced. It is always a sprint to exercise the mental model-building muscles and make up some shit in the brief weeks before the signal inevitably goes away again.

Let’s review a few examples of previous indirect dark matter detection claims.

Cosmic rays from Kaluza-Klein dark matter – or not

This topic has a long and sordid history. In the late ’00s, there were numerous claims of an excess in cosmic rays. ATIC saw too many electrons for the astrophysical background, and PAMELA saw an apparent rise in the positron fraction, perhaps indicating a source with a peak energy around 620 GeV. (If the signal is from dark matter, the rest mass of the WIMP is imprinted in the energy spectrum of its decay products.) The combination of excess electrons and extra positrons seemed fishy enough* to some to point to new physics: dark matter. There were of course more sober analyses, for example:

Fig. 3 from Aharonian et al. (2009): The energy spectrum E^3 dN/dE of cosmic-ray electrons measured by H.E.S.S. and balloon experiments. Also shown are calculations for a Kaluza-Klein signature in the H.E.S.S. data with a mass of 620 GeV and a flux as determined from the ATIC data (dashed-dotted line), the background model fitted to low-energy ATIC and high-energy H.E.S.S. data (dashed line) and the sum of the two contributions (solid line). The shaded regions represent the approximate systematic error as in Fig. 2.

A few things to note about this plot: first, the data are noisy – science is hard. The ATIC and H.E.S.S. data are not really consistent – one shows an excess, the other does not. The excess is over a background model that is overly simplistic – the high energy astrophysicists I knew were shouting that the apparent signal could easily be caused by a nearby pulsar##. The advocates for a detection in the astroparticle community simply ignored this point, or if pressed, asserted that it seemed unlikely.

One problem that arose with the dark matter interpretation was that there wasn’t enough of it. Space is big and the dark matter density is low, so it is hard to get WIMPs together to annihilate. Indeed, the expected signal scales as the square of the WIMP density, so is very sensitive to just how much dark matter is lurking about. The average density in the solar neighborhood needed to explain astronomical data is around 0.3 to 0.4 GeV cm^-3; this falls short of producing the observed signal (if real) by a factor of ~500.
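To put the density-squared scaling in numbers, here is a minimal sketch. It assumes the simplest possible case: a uniform enhancement of the local dark matter density with the annihilation cross-section held fixed. The function name is made up for illustration.

```python
import math


def density_enhancement_for_boost(boost):
    """Uniform density enhancement needed for a given signal boost.

    Annihilation needs two WIMPs to meet, so the rate per unit volume
    scales as density squared. A signal boost B therefore needs the
    density to go up by sqrt(B) (assumption: uniform enhancement).
    """
    return math.sqrt(boost)


shortfall = 500.0              # factor quoted in the text
local_density = (0.3, 0.4)     # GeV cm^-3, range quoted in the text

enhancement = density_enhancement_for_boost(shortfall)
needed = tuple(enhancement * rho for rho in local_density)

print(f"density enhancement: ~{enhancement:.0f}x")
print(f"required density: {needed[0]:.1f} to {needed[1]:.1f} GeV cm^-3")
```

Under these assumptions a shortfall of ~500 in signal corresponds to a local density about 22 times higher than the value inferred from astronomical data.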

An ordinary scientist might have taken this setback as a sign that he$$ was barking up the wrong tree. Not to be discouraged, the extraordinary astroparticle physicists started talking about the “boost factor.” If there is a region of enhanced dark matter density, then the gamma ray/cosmic ray signal would be boosted, potentially by a lot given the density-squared dependence. This is not quite as crazy as it sounds, as cold dark matter halos are predicted to be lumpy: there should be lots of sub-halos within each halo (and many sub-sub halos within those, right the way down). So, what are the odds that we happen to live near enough to a subhalo that could result in the required boost factor?

The odds are small but nonzero. I saw someone at a conference in 2009 make a completely theoretical attempt to derive those odds. He took a merger tree from some simulation and calculated the chance that we’d be near one of these lumps. Then he expanded that to include a spectrum of plausible merger trees for Milky Way-mass dark matter halos. The noisier merger histories gave higher probabilities, as halos with more recent mergers tend to be lumpier, having had a fresh injection of subhalos that haven’t had time to erode away through dynamical friction into the larger central halo.

This was all very sensible sounding, in theory – and only in theory. We don’t live in any random galaxy. We live in the Milky Way and we know quite a bit about it. One of those things is that it has had a rather quiet merger history by the standards of simulated merger trees. To be sure, there have been some mergers, like the Gaia-Enceladus Sausage. But these are few and far between compared to the expectations of the simulations our theorist was considering. Moreover, we’d know if it weren’t, because mergers tend to heat the stellar disk and puff up its thickness. The spiral disk of the Milky Way is pretty cold dynamically, which places limits on how much mass has merged and when. Indeed, there is a whole subfield dedicated to the study of the thick disk, which seems to have been puffed up in an ancient event ~8 Gyr ago. Since then it has been pretty quiet, though more subtle things can and do happen.

The speaker did not mention any of that. He had a completely theoretical depiction of the probabilities unsullied by observational evidence, and was succeeding in persuading those who wanted to believe that the small probability he came up with was nevertheless reasonable. It was a mixed audience: along with the astroparticle physicists were astronomers like myself, including one of the world’s experts on the thick disk, Rosy Wyse. However, she was too polite to call this out, so after watching the discussion devolve towards accepting the unlikely as probable, I raised my hand to comment: “We know the Milky Way’s merger history isn’t as busy as the models that give a high probability.” This was met with utter incredulity. How could astronomy teach us anything about dark matter? It’s not like the evidence is 100% astronomical in nature, or… wait, it is. But no, no waiting or self-reflection was involved. It rapidly became clear that the majority of people calling themselves astroparticle physicists were ignorant of some relevant astrophysics that any astronomy grad student would be expected to know. It just wasn’t in their training or knowledge base. Consequently, it was strange and shocking&& for them to learn about it this way. So the discussion trended towards denial, at which point Rosy spoke up to say yes, we know this. Duh. (I paraphrase.)

The interpretation of the excess cosmic ray signal as dark matter persisted a few years, but gradually cooler heads prevailed and the pulsar interpretation became widely accepted to be more plausible – as it always had been. Indeed, claiming cosmic rays were from dark matter became almost disreputable, as it richly deserved to be. So much so that when the AMS cosmic ray experiment joined the party late, it had essentially zero impact. I didn’t hear anyone advocating for it, even in whispers at workshops. It seemed more like its Nobel laureate PI just wanted a second Nobel prize, please and thank you, and even the astroparticle community felt embarrassed for him.

This didn’t preclude the same story from playing out repeatedly.

Gamma rays from WIMPs – or not

In the lead-up to a conference on dark matter hosted at Harvard in 2014, there were claims that the Fermi telescope – the same one that is again in the news – had seen a gamma ray line around 126 GeV that was attributed to dark matter. This claim had many red flags. The mass was close to the Higgs particle mass, which was kinda weird. The signal was primarily seen on the limb of the Earth, which is exactly where you’d expect garbage noise to creep in. Most telling, the Fermi team itself was not making this claim. It came from others who were analyzing their data. I am no fan of science by big teams – they tend to become bureaucratic behemoths that create red tape for their participants and often suppress internal dissent** – but one thing they do not do is leave Nobel prizes unanalyzed in their data. The Fermi team’s silence in this matter was deafening.

In short, this first claim of gamma rays from dark matter looked to be very much on the same trajectory as that from cosmic rays. So I was somewhat surprised when I saw the draft program for the Harvard conference, as it had an entire afternoon session devoted to this topic. I wrote the organizers to politely ask if they really thought this would still be a thing by the time the conference happened. One of them was an enthusiastic proponent, so yes.

Narrator: it was not.

By the time the conference happened, the related claims had all collapsed, and all the scientists invited to speak about it talked instead about something completely different, as if it had never been a thing at all.

X-rays from sterile neutrinos – or not

Later, there was the 3.5 keV line. If one squinted really hard at X-ray data, it looked like there might sorta kinda be an unidentified line. This didn’t look particularly convincing, and there are instances when new lines have been discovered in astronomical data rather than laboratory data (e.g., helium was first recognized in the spectrum of the sun, hence the name; also nebulium, which was later recognized to be ionized oxygen), so again, one needed to consider the astrophysical possibilities.

Of course, it was much more exciting to claim it was dark matter. Never mind that it was a silly energy scale, being far too low mass to be cold dark matter (people seem to have forgotten*# the Lee-Weinberg limit, which requires m_X > 2 GeV); a few keV is rather less than a few GeV. No matter, we can always come up with an appropriate particle – in this case, sterile neutrinos*$.

If you’ve read this far, you can see how this was going to pan out.

Gamma rays from WIMPs again, maybe maybe

So now we have a renewed claim that the Fermi excess is dark matter. Given the history related above, the reader may appreciate that my first reaction was Really? Are we doing this again?

“Many people have speculated that if we knew exactly why the bowl of petunias had thought that we would know a lot more about the nature of the Universe than we do now.”

― Douglas Adams, The Hitchhiker’s Guide to the Galaxy

This is different from the claim a decade ago. The claimed mass is different, and the signal is real, being part of the mess of emission from the Galactic center. The trick, as is so often the case, is disentangling the dark matter signal from the plausible astrophysical sources.

Indeed, the signal is not new, only this particular fit with WIMP dark matter is. There had, of course, been discussion of all this before, but it faded out when it became clear that the Fermi signal was well explained by a population of millisecond pulsars. Astrophysics was again the more obvious interpretation*%. Or perhaps not: I suppose if you’re part of a community convinced that dark matter exists, that is spending an enormous amount of time and resources looking for a signal from dark matter, and whose basic knowledge of astrophysics extends little beyond “astronomical data show dark matter exists but are messy so there’s always room to play” then maybe invoking an invisible agent from an unknown dark sector seems just as plausible as an obvious astrophysical source. Hmmm… that would have sounded crazy to me even back when, like them, I was sure that dark matter had to exist and be made of WIMPs, but here we are.

Looking around in the literature, I see there is still a somewhat active series of papers on this subject. They split between no way and maybe.

For example, Manconi et al. (2025) show that the excess signal has the same distribution on the sky as the light from old stars in the Galaxy. The distribution of stars is asymmetrical thanks to the Galactic bar, which we see at an angle somewhere around 30 degrees, so one end is nearer to us than the other, creating a classic “X/peanut” shape seen in other edge-on barred spiral galaxies. So not only is the spectrum of the signal consistent with millisecond pulsars, it has the same distribution on the sky as the stars from which millisecond pulsars are born. So no way is this dark matter: it is clearly an astrophysical signal.

Not to be dissuaded by such a completely devastating combination of observations, Muru et al. (2025) argue that sure, the signal looks like the stars, but the dark matter could have exactly the same distribution as the stars. They cite the Hestia simulations of the Local Group as an example where this happens. Looking at those, they’re not as unrealistic as many simulations, but they appear to suffer the common affliction of too much dark mass near the center. That leaves the dark matter more room to be non-spherical, so maybe lumpy in the same way as the stars, and also to provide a higher annihilation signal from the high density of dark matter. So they say maybe, calling the pulsar and dark matter interpretations “equally compelling.”

Returning to Totani’s sort-of claimed detection, he also says

This cross section is larger than the upper limits from dwarf galaxies and the canonical thermal relic value, but considering various uncertainties, especially the density profile of the MW halo, the dark matter interpretation of the 20 GeV “Fermi halo” remains feasible.

Totani (2025)

OK, so there’s a lot to break down in this one sentence.

The canonical thermal relic value is kinda central to the whole WIMP paradigm, so needing a value higher than that is a red flag reminiscent of the need for a boost factor for the cosmic ray signal. There aren’t really enough WIMPs there to do the job unless we juice their effectiveness at making gamma rays. The juice factor is an order of magnitude here: Steigman et al. (2012) give 2.2 x 10^-26 cm^3 s^-1 for what the thermal cross-section should be vs. the (5-8) x 10^-25 cm^3 s^-1 suggested by Totani (2025).
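For a quick check of the size of that juice factor, the arithmetic can be sketched as follows. It uses only the two cross-sections quoted above; the variable names are illustrative.

```python
# Cross-sections in cm^3 s^-1, as quoted in the text
thermal = 2.2e-26            # Steigman et al. (2012)
claimed = (5e-25, 8e-25)     # range suggested by Totani (2025)

# Ratio of the claimed cross-section to the thermal relic value
ratios = [value / thermal for value in claimed]

print(f"required enhancement: {ratios[0]:.0f}x to {ratios[1]:.0f}x")
```

With the numbers as quoted, the required cross-section exceeds the thermal relic value by a factor of a few tens.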

It is also worth noting that one point of Steigman’s paper is that as a well-posed hypothesis, the WIMP cross section can be calculated; it isn’t a free parameter to play with, so needing the cross-section to be larger than the upper limits from dwarf galaxies is another red flag. If this is indeed a dark matter signal from the Galactic center, then the subhalos in which dwarf satellites reside should also be visible, as in the simulated image from via Lactea above. They are not, despite having fewer messy astrophysical signals to compete with.

So “remains feasible” is doing a lot of work here. That’s the scientific way of saying “almost certainly wrong, but maybe? Because I’d really like for it to work out that way.”

The dark matter distribution in the Milky Way

One of the critical things here is the density of dark matter near the Galactic center, as the signal scales as the square of the density. Totani (2025) simply adopts the via Lactea simulation to represent the dark matter halo of the Galaxy in his calculations. This is a reasonable choice from a purely theoretical perspective, but it is not a conservative choice for the problem at hand.

What do we know empirically? The via Lactea simulation was dark matter only. There is no stellar disk, just a dark matter halo appropriate to the Milky Way. So let’s add that halo to a baryonic mass model of the Galaxy:

The rotation curve of the via Lactea dark matter halo (red curve) combined with the Milky Way baryon distribution (light blue line). The total rotation (dark blue line) overshoots the data.

The important part for the Galactic center signal is the region at small radius – the first kpc or two. Like most simulations, via Lactea has a cuspy central region of high dark matter density that is inconsistent with data. This overshoots the equivalent circular velocity curve from observed stellar motions. I could fix the fit above by reducing the stellar mass, but that’s not really an option in the Milky Way – we need a maximal stellar disk to explain the microlensing rate towards the center of the Galaxy. The “various uncertainties, especially the density profile of the MW halo” statement elides this inconvenient fact. Astronomical uncertainties are ever-present, but do not favor a dark matter signal here.

We can subtract the baryonic mass model from the rotation curve data to infer what the dark matter distribution needs to be. This is done in the plot below, where it is compared to the via Lactea halo:

The empirical dark matter halo density profile of the Milky Way (blue line) compared to the via Lactea simulation (red line).
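The subtraction works in quadrature, since it is the squares of the circular speeds that add. A minimal sketch, assuming a spherical halo for the enclosed mass; the input numbers are made up for illustration and are not the Milky Way data shown in the plot.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun


def halo_velocity(v_obs, v_bar):
    """Halo contribution to the circular speed: V_dm^2 = V_obs^2 - V_bar^2."""
    v_squared = v_obs ** 2 - v_bar ** 2
    if v_squared < 0:
        raise ValueError("baryons alone exceed the observed rotation speed")
    return math.sqrt(v_squared)


def enclosed_mass(r_kpc, v_kms):
    """Enclosed mass for a spherical distribution: M(<r) = V^2 r / G."""
    return v_kms ** 2 * r_kpc / G


# Hypothetical values: radius in kpc, speeds in km/s
radius, v_obs, v_bar = 8.0, 230.0, 190.0

v_dm = halo_velocity(v_obs, v_bar)
m_dm = enclosed_mass(radius, v_dm)

print(f"V_dm = {v_dm:.1f} km/s, M_dm(<{radius:.0f} kpc) = {m_dm:.2e} Msun")
```

Repeating this at each radius gives the enclosed dark mass profile, and the density follows from how that enclosed mass grows with radius.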

The empirical dark matter density profile of the Milky Way does not continue to rise inwards as steeply as the simulation predicts. It shows the same proclivity for a shallower core as pretty much every other galaxy in the sky. This reduced density of dark matter in the central couple of kpc means the signal from WIMP annihilation should be much lower than calculated from the simulated distribution. Remember – the WIMP annihilation signal scales as the square of the dark matter density, so the turn-down seen at small radii in the log-log plot above is brutal. There isn’t enough dark matter there to do what it is claimed to be doing.
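To illustrate how brutal the density-squared dependence is, here is a toy comparison of a cusp and a core. Everything in it is an assumption chosen for illustration: an NFW shape for the cusp, a Burkert-like shape for the core, the same normalization and a 20 kpc scale radius for both, and a simple volume integral of density squared rather than a proper line-of-sight calculation. These are not the profiles in the plot above.

```python
def nfw(x):
    """NFW shape: cuspy, density rises as 1/r toward the center."""
    return 1.0 / (x * (1.0 + x) ** 2)


def cored(x):
    """Burkert-like shape: flat core inside the scale radius."""
    return 1.0 / ((1.0 + x) * (1.0 + x * x))


def rho_squared_volume_integral(profile, x_max, steps=20000):
    """Midpoint-rule integral of rho^2 * x^2 dx from 0 to x_max."""
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += profile(x) ** 2 * x * x * dx
    return total


r_scale = 20.0   # kpc, assumed scale radius for both profiles
r_inner = 2.0    # kpc, the central region of interest
x_max = r_inner / r_scale

cusp = rho_squared_volume_integral(nfw, x_max)
core = rho_squared_volume_integral(cored, x_max)
ratio = cusp / core

print(f"cusp/core annihilation signal inside {r_inner:.0f} kpc: ~{ratio:.0f}x")
```

For these toy choices the cusp outshines the core by a factor of a few hundred within the central 2 kpc, even though the two profiles agree at large radii.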

Cry wolf

There have now been so many claims to detect dark matter that have come and gone that it is getting to be like the fable of the boy who cried wolf. A long series of unpersuasive claims does not inspire confidence that the next will be correct. Indeed, it has the opposite effect: it is going to be really hard to take future claims seriously.

It’s almost as if this invisible dark matter stuff doesn’t exist.


Note added: Jeff Grube points out in the comments that Wang & Duan (2025) have a recent paper showing that the dark matter signal discussed here also predicts an antiproton signal that is already excluded by AMS data. While I find this unsurprising, it is an excellent check. Indeed, it would have caused me to think again had the antiproton signal been there: independent corroboration from a separate experiment is how science is supposed to work.


#It has become a pattern for advocates of dark matter to write a speculative paper for the journals that is fairly restrained in its claims, then hype it as an actual detection to the press. It’s like “Even I think this is probably wrong, but let’s make the claim on the off chance it pans out.”

$Ironically, a detection from a particle collider would be a non-detection. The signature of dark matter produced in a collision would be an imbalance between the mass-energy that goes into the collision and that measured in detected particles coming out of it. The mass-energy converted into WIMPs would escape the detector undetected. This is analogous to how neutrinos were first identified, though Fermi was reluctant to make up an invisible, potentially undetectable particle – a conservative value system that modern particle physicists have abandoned. The 13,000 GeV collision energy of the LHC is more than adequate to make ~100 GeV WIMPs, so the failure of this detection mode is telling.

^A less obvious possibility is spontaneous decay. This would happen if WIMPs are unstable and decay with a finite half-life. The shorter the half-life, the more decays, and the stronger the resulting signal. This implies some fine-tuning in the half-life – if it is much longer than a Hubble time, then it happens so seldom it is irrelevant; if it is shorter than a Hubble time, then dark matter halos evaporate and stable galaxies don’t exist.

&Astroparticle physics, also known as particle astrophysics, is a relatively new field. It is also an oxymoron, being a branch of particle physics with only aspirational delusions of relevance to astrophysics. I say that to be rude to people who are rude to astronomers, but it is also true. Astrophysics is the physics of objects in the sky, and as such, requires all of physics. Physics is a broad field, so some aspects are more relevant than others. When I teach a survey course, it touches on gravity, electromagnetism, atomic and molecular quantum mechanics, nuclear physics, and with the discovery of exoplanets, increasingly on geophysics. Particle physics doesn’t come up. It’s just not relevant, except where it overlaps with nuclear physics. (As poorly as particle physicists think of astronomers, they seem to think even less of nuclear physicists, whom they consider to be failed particle physicists (if only they were smart enough!) and nuclear physicists hate them in return.) This new field of astroparticle physics seems to be all about dark matter as driven by early universe cosmology, with contempt for everything that happens in the 13 billion years following the production of the relic radiation seen as the microwave background. Anything later is dismissed as mere “gastrophysics” that is too complicated to understand so cannot possibly inform fundamental physics. I guess that’s true if one chooses to remain ignorant of it.

*Fishy results can also indicate something fishy with the data. I had a conversation with an instrument builder at the time who pointed out that PAMELA had chosen to fly without a particular discriminator in order to save weight; he suggested that its absence could explain the apparent upturn in positrons.

##There is a relatively nearby pulsar that fits the bill. It has a name: Geminga. This illustrates the human tendency to see what we’re looking for. The astroparticle community was looking for dark matter, so that’s what many of them saw in the excess cosmic ray signal. High energy astrophysicists work on neutron stars, so the obvious interpretation to them was a pulsar. One I recall being particularly scornful of the dark matter interpretation when there was an obvious astrophysical source. I also remember the astroparticle people being quick to dismiss the pulsar interpretation because it seemed unlikely to them for one to be so close but really they hadn’t thought about it before: that pulsars could do this was news to them, and many preferred to believe the dark matter interpretation.

$$All the people barking were men.

&&This experience opened my eyes to the existence of an entire community of scientists who were working on dark matter in somewhat gratuitous ignorance of the astronomical evidence for dark matter. To them, the existence of the stuff had already been demonstrated; the interesting thing now was to find the responsible particle. But they were clearly missing many important ingredients – another example is disk stability, a foundational reason to invoke dark matter that seems to routinely come as a surprise to particle physicists. This disconnect is part of what motivated me to develop an entire semester course on dark matter, which I’ve taught every other year since 2013 and will teach again this coming semester. The first time I taught it, I worried that there wasn’t enough material for a whole semester. Now a semester isn’t enough time.

**I had a college friend (sadly now deceased) who was part of the team that discovered the Higgs. That was big business, to the extent that there were two experiments – one to claim the detection, and another on the same beam to do the confirmation. The first experiment exceeded the arbitrary 5σ threshold to claim a 5.2σ detection, but the second only reached 4.9σ. So, in all appropriateness, he asked in a meeting if they could/should really announce a detection. A Nobel prize was on the line, so the answer was straightforward: Do you want a detection or not? (His words.)

*#Rather than forget, some choose to fiddle ways around the Lee-Weinberg limit. This has led to the sub-genre of “light dark matter” which means lightweight, not luminous. I’d say this was the worst name ever, but the same people talk about dark photons with a straight face, so irony continues to bleed out.

*$Ironically, a sterile neutrino has also been invoked to address problems in MOND.

*%I was amused once to see one of the more rabid advocates of dark matter signals of this type give an entire talk hyping the various possibilities only to mention pulsars at the end with a sigh, admitting that the Fermi signal looked exactly like that.

The odd primordial halo of the Milky Way

The mass distribution of dark matter halos that we infer from observations tells us where the dark matter needs to be now. This differs from the mass distribution it had to start, as it gets altered by the process of galaxy formation. It is the primordial distribution that dark matter-only simulations predict most robustly. We* reverse-engineer the collapse of the baryons that make up the visible Galaxy to infer the primordial distribution, which turns out to be… odd.

The Gaia rotation curve and the mass of the Milky Way

As we discussed a couple of years ago, Gaia DR3 data indicate a declining rotation curve for the Milky Way. This decline becomes more steep, nearly Keplerian, in the outskirts of the Milky Way (17 < R < 30 kpc). This may or may not be consistent with data further out, which gets hard to interpret as the LMC (at 50 kpc) perturbs orbits and the observed motions may not correspond to orbits in dynamical equilibrium. So how much do the data inform us about the gravitational potential?

Milky Way rotation curve (various data) including Gaia DR3 (multiple analyses). Also shown is the RAR model (blue line) that was fit to the terminal velocities from 3 < R < 8.2 kpc (gray points) and predates other data illustrated here.

I am skeptical of the Keplerian portion of this result (as discussed at length at the time) because other galaxies don’t do that. However, I am a big fan of listening to the data, and the people actually doing the work. Taken at face value, the Gaia data show a Keplerian decline with a total mass around 2 x 10^11 M☉. If correct, this falsifies MOND.
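For reference, this is what a Keplerian decline means quantitatively. The sketch below treats the quoted total mass as a point mass, which only makes sense beyond the radius where the enclosed mass has stopped growing; the function name is illustrative.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun


def keplerian_speed(mass_msun, r_kpc):
    """Circular speed around a point mass: V = sqrt(G M / r)."""
    return math.sqrt(G * mass_msun / r_kpc)


total_mass = 2e11  # Msun, the face-value Gaia mass quoted in the text

for r in (17.0, 20.0, 25.0, 30.0):
    print(f"R = {r:4.0f} kpc: V = {keplerian_speed(total_mass, r):5.1f} km/s")
```

For this mass the speed falls from roughly 225 km/s at 17 kpc to about 170 km/s at 30 kpc, following V ∝ R^(-1/2).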

How does dark matter fare? There is an implicit assumption made by many in the community that any failing of MOND is an automatic win for dark matter. However, it has been my experience that observations that are problematic for MOND are also problematic for dark matter. So let’s check.

Short answer: this is really weird in terms of dark matter. How weird? For starters, most recent non-Gaia dynamical analyses suggest a total mass closer to 10^12 M☉, a factor of five higher than the Gaia value. I’m old enough to remember when the accepted mass was 2 x 10^12 M☉, an order of magnitude higher. Yet even this larger mass is smaller than suggested by abundance matching recipes, which give more like 4 x 10^12 M☉. So somewhere in the range 2 – 40 x 10^11 M☉.

The Milky Way’s mass has been adjusted so often, have we finally hit it?

The guy was all over the road. I had to swerve a number of times before I hit him.

Boston Driver’s Handbook (1982 edition)&

If it sounds like we’re all over the map, that’s because we are. It is very hard to constrain the total mass of a dark matter halo. We can’t see it, nor tell where it ends. We infer, indirectly, that the edge is way out beyond the tracers we can see. Heck, even speaking of an “edge” is ill-defined. Theoretically, we expect it to taper off with the density of dark matter falling as ρ ~ r^-3, so there is no definitive edge. Somewhat arbitrarily,** we adopt the radius that encloses a density 200 times the average density of the universe as the “virial” radius. This is all completely notional, and it gets worse, as the process of forming a galaxy changes the initial mass distribution. What we observe today is the changed form, not the primordial initial condition for which the notional mass is defined.
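To give a feel for the scale this convention implies, here is a minimal sketch of the notional radius for a given mass. It assumes the reference density is the critical density (one common convention; others use the mean matter density) and a Hubble constant of 70 km/s/Mpc. Both choices are assumptions made for illustration.

```python
G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun


def r200_kpc(m200_msun, h0=70.0, overdensity=200.0):
    """Radius enclosing a mean density of overdensity * rho_crit.

    With rho_crit = 3 H0^2 / (8 pi G), the enclosed mass is
    M = overdensity * H0^2 * r^3 / (2 G), so
    r^3 = 2 G M / (overdensity * H0^2).
    """
    h0_per_kpc = h0 / 1000.0  # km/s/Mpc -> km/s/kpc
    return (2.0 * G * m200_msun / (overdensity * h0_per_kpc ** 2)) ** (1.0 / 3.0)


for mass in (2e11, 1e12, 4e12):
    print(f"M200 = {mass:.0e} Msun -> r200 = {r200_kpc(mass):.0f} kpc")
```

Under these assumptions, masses between 2 x 10^11 and 4 x 10^12 M☉ correspond to notional radii of roughly 120 to 330 kpc, far beyond most of the tracers.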

Adiabatic compression during galaxy formation

To form a visible galaxy, baryons must dissipate and sink to the center of their parent dark matter halo. This process changes the mass distribution and alters the halo from its primordial state. In effect, the gravity of the sinking baryons drags some dark matter along# with them.

The change to the dark matter halo is often called adiabatic compression. The actual process need not be adiabatic, but that’s how we approximate it. We’ve tested this approximation with detailed numerical simulations, and it works pretty well, at least if you do it right (there are boring debates about technique). What happens makes sense intuitively: the response of the primordial halo to the infall of baryons is to become more dense at the center. While this makes sense physically, it is problematic for LCDM as it takes an NFW halo that is already too dense at the center to be consistent with data and makes it more dense. This has been known forever, so opposing this is one thing feedback is invoked to do, which it may or may not do, depending on how it really works. Even if feedback can really turn a compressed cusp into a core, it is widely expected to be important only in low mass galaxies where the gravitational potential well isn’t too deep. It isn’t supposed to be all that important in galaxies as massive as the Milky Way, though I’m sure that can change as needed.
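As a rough illustration of the effect, the sketch below uses the simplest textbook version of adiabatic compression, in which r M(r) is conserved for each shell on a circular orbit (the approximation usually attributed to Blumenthal et al. 1986). This is a swapped-in simplification, not the procedure used for the fits discussed in this post. The halo and disk parameters are hypothetical, and the disk is treated as spherical for simplicity.

```python
import math


def nfw_mass(r, m200, c, r200):
    """Mass enclosed within r for an NFW halo."""
    def m(x):
        return math.log(1.0 + x) - x / (1.0 + x)
    return m200 * m(c * r / r200) / m(c)


def disk_mass(r, m_disk, r_d):
    """Exponential disk mass within r (spherical approximation)."""
    y = r / r_d
    return m_disk * (1.0 - (1.0 + y) * math.exp(-y))


def compressed_radius(r_i, m200, c, r200, m_disk, r_d):
    """Final radius of the dark matter shell that started at r_i.

    Solves r_f * [M_b(r_f) + (1 - f_b) * M_i(r_i)] = r_i * M_i(r_i)
    by bisection, with baryon fraction f_b = m_disk / m200.
    """
    f_b = m_disk / m200
    m_i = nfw_mass(r_i, m200, c, r200)
    target = r_i * m_i
    lo, hi = 0.0, r_i / (1.0 - f_b)   # the root is bracketed by these
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * (disk_mass(mid, m_disk, r_d) + (1.0 - f_b) * m_i) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters: masses in Msun, lengths in kpc
M200, C, R200 = 6e11, 7.0, 174.0
M_DISK, R_D = 6e10, 2.5

for r_i in (2.0, 8.0, 30.0, 174.0):
    r_f = compressed_radius(r_i, M200, C, R200, M_DISK, R_D)
    print(f"shell at {r_i:6.1f} kpc ends up at {r_f:6.1f} kpc")
```

Inner shells are dragged inward substantially while the outermost shell barely moves, which is the sense in which the halo becomes denser at the center. This simple recipe is known to overstate the effect, which is part of what the debates about technique are about.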

There are a variety of challenges to implementing an accurate compression computation, so we usually don’t bother: the standard practice is to assume a halo model and fit it to the data. That will, at best, give a description of the current dark matter halo, not what it started as, which is our closest point of comparison with theory. To give an example of the effect, here is a Milky Way model I built a decade ago:

Figure 13 from McGaugh (2016): Milky Way rotation curve from the data of Luna et al. (2006, red points) and McClure-Griffiths & Dickey (2007, gray points) together with a bulgeless baryonic mass model (black line). The total rotation is approximately fit (blue line) with an adiabatically compressed NFW halo (solid green line) using the procedure implemented by Sellwood & McGaugh (2005). The primordial halo before compression is shown as the dashed line. The parameters of the primordial halo are a concentration c = 7 and a mass M200 = 6 x 10^11 M☉. Fitting NFW to the present halo instead gives c = 14, M200 = 4 x 10^11 M☉, so the difference is appreciable and depends on the quality and radial extent of the available data.

The change from the green dashed line to the solid green line is the difference compression makes. That’s what happens if a baryon distribution like that of the Milky Way settles in an NFW halo. The inferred mass M200 is lower and the concentration c higher than it originally was – and it is the original version that we should compare to the expectations of LCDM.

When I built this model, I considered several choices for the bulge/bar fraction: something reasonable, something probably too large, and something definitely too small (zero). The model above is the last case of zero bulge/bar. I show it because it is the only case for which the compression procedure worked. If there is a larger central concentration of baryons – i.e., a bulge and/or a bar – then the compression is greater. Too great, in fact: I could not obtain a fit (see also Binney & Piffl and this related discussion).

The calculation of the compression requires knowledge of the primordial halo parameters, which is what one is trying to obtain. So one has to guess an initial state, run the code, check how close it came, then iterate the initial guess. This is computationally expensive, so I was just eyeballing the fit above. Pengfei has done a lot of work to implement a method that iteratively computes the compression and rigorously fits it to data. So we decided to apply it to the newer Gaia DR3 data.
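To make the guess, compress, compare loop concrete, here is a minimal Python sketch of a single compression step. It is not the Sellwood & McGaugh (2005) procedure, nor Pengfei’s fitting code: it swaps in the simpler textbook approximation of Blumenthal et al. (1986), in which r M(r) is conserved for circular orbits, and it treats the disk as a spherical mass distribution. The disk mass and scale length, the Hubble constant, and all function names are assumptions for illustration; only c = 7 and M200 = 6 × 10¹¹ M☉ are taken from the Milky Way model above.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun
H0 = 0.073      # assumed Hubble constant in km/s/kpc (73 km/s/Mpc)

def r200_kpc(m200):
    """Radius enclosing a mean density of 200 times the critical density."""
    return (G * m200 / (100.0 * H0**2)) ** (1.0 / 3.0)

def nfw_mass(r, m200, c):
    """Mass (Msun) enclosed within radius r (kpc) of a primordial NFW halo."""
    rs = r200_kpc(m200) / c
    mu = lambda x: math.log(1.0 + x) - x / (1.0 + x)
    return m200 * mu(r / rs) / mu(c)

def disk_mass(r, m_disk, r_disk):
    """Enclosed mass of an exponential disk, crudely treated as spherical."""
    y = r / r_disk
    return m_disk * (1.0 - (1.0 + y) * math.exp(-y))

def contract_radius(r_i, m200, c, m_disk, r_disk):
    """Solve r_f [M_b(r_f) + (1 - f_b) M_i(r_i)] = r_i M_i(r_i) by bisection."""
    f_b = m_disk / m200            # assumed: baryons start mixed with the halo
    m_i = nfw_mass(r_i, m200, c)   # initial total mass inside r_i

    def residual(r_f):
        m_final = disk_mass(r_f, m_disk, r_disk) + (1.0 - f_b) * m_i
        return r_f * m_final - r_i * m_i

    lo, hi = 1e-6 * r_i, 2.0 * r_i  # residual is negative at lo, positive at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def compressed_halo_velocity(r_i, m200, c, m_disk, r_disk):
    """Final radius (kpc) of the shell that started at r_i, and the
    dark matter circular speed (km/s) at that final radius."""
    f_b = m_disk / m200
    r_f = contract_radius(r_i, m200, c, m_disk, r_disk)
    m_dm = (1.0 - f_b) * nfw_mass(r_i, m200, c)
    return r_f, math.sqrt(G * m_dm / r_f)

# One forward pass for a guessed primordial halo (disk values are assumed).
for r_i in (5.0, 10.0, 20.0, 40.0):
    r_f, v_dm = compressed_halo_velocity(r_i, 6e11, 7.0, 6e10, 2.5)
    print(f"r_i = {r_i:5.1f} kpc -> r_f = {r_f:5.2f} kpc, V_dm = {v_dm:6.1f} km/s")
```

A fit would wrap this in an outer loop: add the baryonic contribution in quadrature, compare with the observed rotation curve, adjust the guessed (M200, c), and repeat. That outer loop is where the expense comes from.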

Fitting the Gaia rotation curve with adiabatically compressed halos

We need two inputs here: one, the rotation curve to fit, and two, the baryonic distribution of the Milky Way. The latter is hard to specify given our location within the Milky Way, so there are many different estimates. We tried a dozen.

Another challenge of doing this is deciding which rotation curve data to fit. We chose to focus on the rotation curve of Jiao et al. (2023) because they made estimates of the systematic as well as random errors. The statistics of Gaia are so good it is practically impossible to fit any equilibrium model to them. There are aspects of the data for which we have to consider non-equilibrium effects (spiral arms, the bar, “snails” from external perturbations) so the usual assumptions are at best an approximation, plus there can always be systematic errors. So the approach is to believe the data, but with the uncertainty estimate of Jiao et al. (2023) that includes systematics.

For a halo model, we started with the boilerplate LCDM NFW halo$. This doesn’t fit the data. Indeed, all attempts to fit NFW halos fail in similar ways for all of the different baryonic mass models we tried. The quasi-Keplerian part of the Gaia rotation curve simply cannot be fit: the NFW halo inevitably requires more mass further out.
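A quick way to see why NFW struggles with a quasi-Keplerian decline is to compare how the halo’s own circular speed changes between 17 and 30 kpc with what a point mass would give. The sketch below is halo-only (no baryons, no compression), and it borrows c = 7 and M200 = 6 × 10¹¹ M☉ from the older Milky Way model purely for illustration; these are not the fitted values of Li et al. (2025). The Hubble constant and the function names are assumptions.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun
H0 = 0.073      # assumed Hubble constant in km/s/kpc (73 km/s/Mpc)

def nfw_velocity(r, m200, c):
    """Circular speed (km/s) at radius r (kpc) of an NFW halo."""
    r200 = (G * m200 / (100.0 * H0**2)) ** (1.0 / 3.0)
    rs = r200 / c
    mu = lambda x: math.log(1.0 + x) - x / (1.0 + x)
    m_enc = m200 * mu(r / rs) / mu(c)
    return math.sqrt(G * m_enc / r)

def keplerian_ratio(r_in, r_out):
    """V(r_out) / V(r_in) when all the mass lies inside r_in."""
    return math.sqrt(r_in / r_out)

# Illustrative halo parameters (assumed, see the text above).
nfw_ratio = nfw_velocity(30.0, 6e11, 7.0) / nfw_velocity(17.0, 6e11, 7.0)
kep_ratio = keplerian_ratio(17.0, 30.0)

print(f"NFW halo:  V(30 kpc) / V(17 kpc) = {nfw_ratio:.2f}")
print(f"Keplerian: V(30 kpc) / V(17 kpc) = {kep_ratio:.2f}")
```

For a halo like this the scale radius lies beyond 17 kpc, so the enclosed mass is still growing quickly there and the halo contribution does not fall the way a Keplerian curve does.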

Here are a few examples of the NFW fits:


Fig. A.3 from Li et al. (2025). Fits of Galactic circular velocities using the NFW model implementing adiabatic halo contraction using 3 baryonic models. [Another 9 appear in the paper.] Data points with errors are the rotation velocities from Jiao et al. (2023), while open triangles show the data from Eilers et al. (2019), which are not fitted. [The radius ranges from 5 to 30 kpc.] Blue, purple, green and black solid lines correspond to the contributions by the stellar disk, central bar, gas (and dust if any), and compressed dark matter halo, respectively. The total contributions are shown using red solid lines. Black dashed lines are the inferred primordial halos.

LCDM as represented by NFW suffers the same failure mode as seen in MOND (plot at top): both theories overshoot the Gaia rotation curve at R > 17 kpc. This is an example of how data that are problematic for MOND are also problematic for dark matter.

We do have more freedom in the case of dark matter. So we tried a different halo model, Einasto. (For this and many other halo models, see Pengfei’s epic compendium of dark matter halo fits.) Where NFW has two parameters, a concentration c and mass M200, Einasto has a third parameter that modulates the shape of the density profile%. For a very specific choice of this third parameter (α = 0.17), it looks basically the same as NFW. But if we let α be free, then we can obtain a fit. Of all the baryonic models, the RAR model+compressed Einasto fits best:


Fig. 1 from Li et al. (2025). Example of a circular velocity fit using the McGaugh19$$ model for baryonic mass distributions. The purple, blue, and green lines represent the contributions of the bar, disk, and gas components, respectively. The solid and dashed black lines show the current and primordial dark matter halos, respectively. The solid red line indicates the total velocity profile. The black points show the latest Gaia measurements (Jiao et al. 2023), and the gray upward triangles and squares show the terminal velocities from (McClure-Griffiths & Dickey 2007, 2016), and Portail et al. (2017), respectively. The data marked with open symbols were not fit because they do not consider the systematic uncertainties.

So it is possible to obtain a fit considering adiabatic compression. But at what price? The parameters of the best-fit primordial Einasto halo shown above are c = 5.1, M200 = 1.2 × 10¹¹ M☉, and α = 2.75. That’s pretty far from the α = 0.17 expected in LCDM. The mass is lower than low. The concentration is also low. There are expectation values for all these quantities in LCDM, and all of them miss the mark.


Fig. 2 from Li et al. (2025). Halo masses and concentrations of the primordial Galactic halos derived from the Gaia circular velocity fits using 12 baryonic models. The red and blue stars with errors represent the halos with and without adiabatic contraction, respectively. The predicted halo mass-concentration relation within 1 σ from simulations (Dutton & Macciò 2014) is shown as the declining band. The vertical band shows the expected range of the MW halo mass according to the abundance-matching relation (Moster et al. 2013). The upper and lower limits are set by the highest stellar mass model plus 1 σ and the lowest stellar mass model minus 1 σ, respectively.

The expectation for mass and concentration is shown as the bands above. If the primordial halo were anything like what it should be in LCDM, the halo parameters represented by the red stars should be where the bands intersect. They’re nowhere close. The same goes for the shape parameter. The halo should have a density profile like the blue band in the plot below; instead it is more like the red band.


Fig. 3 from Li et al. (2025). Structure of the inferred primordial and current Galactic halos, along with predictions for the cold and warm dark matter. The density profiles are scaled so that there is no need to assume or consider the masses or concentrations for these halos. The gray band indicates the range of the current halos derived from the Gaia velocity fits using the 12 baryonic models, and the red band shows their corresponding primordial halos within 1σ. The blue band presents the simulated halos with cold dark matter only (Dutton & Macciò 2014). The purple band shows the warm dark matter halos (normalized to match the primordial Galactic halo) with a core size spanning from 4.56 kpc (WDM5 in Macciò et al. 2012) to 7.0 kpc, corresponding to a particle mass of 0.05 keV and lower.

So the primordial halo of the Milky Way is pretty odd. From the perspective of LCDM, the mass is too low and the concentration is too low. The inner profile is too flat (a core rather than a cusp) and the outer profile is too steep. This outer steepness is a large part of why the mass comes out so low; there just isn’t a lot of halo out there. The characteristic density ρs is at least in the right ballpark, so aside from the inner slope, the outer slope, the mass, and the concentration, LCDM is doing great.

What if we ignore the naughty bits?

It is really hard for any halo model to fit the steep decline of the Gaia rotation curve at R > 17 kpc. Doing so is what makes the halo mass so small. I’m skeptical about this part of the data, so do things improve if we don’t sweat that part?

Ignoring the data at R > 17 kpc allows the mass to be larger, consistent with other dynamical determinations if not quite with abundance matching. However, the inner parts of the rotation curve still prefer a low density core. That is, something like the warm dark matter halo depicted as the purple band above rather than NFW with its dense central cusp. Or self-interacting dark matter. Or cold dark matter with just-so feedback. Or really anything that obfuscates the need to confront the dangerous question: why does MOND perform better?


*This post is based on the recently published paper by my former student Pengfei Li, who is now faculty at Nanjing University. They have a press release about it.

&A few months after reading this in the Boston Driver’s Handbook, this exact thing happened to me.

**This goes back to BBKS in 1986 when the bedrock assumption was that the universe had Ωm = 1, for which the virial radius encloses a mean density of 178 times the critical density. 200 was close enough, and stuck, even though for LCDM the virial radius corresponds to an overdensity close to 100, which is even further out.
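For the curious, the two overdensities can be checked with a couple of lines of Python. This sketch uses the fitting formula of Bryan & Norman (1998) for a flat universe with a cosmological constant, which is my choice here and not something cited above; the value Ωm = 0.3 for LCDM is likewise an assumed round number. For Ωm = 1 the formula reduces to the spherical collapse value 18π² ≈ 178.

```python
import math

def virial_overdensity(omega_m):
    """Virial overdensity relative to the critical density for a flat
    universe with a cosmological constant (Bryan & Norman 1998 fit)."""
    x = omega_m - 1.0
    return 18.0 * math.pi**2 + 82.0 * x - 39.0 * x**2

for omega_m in (1.0, 0.3):  # Einstein-de Sitter, and an assumed LCDM value
    print(f"Omega_m = {omega_m:.1f}: overdensity = {virial_overdensity(omega_m):.0f}")
```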

#This is one of many processes that occur in simulations, which are great for examining the statistics of simulated galaxy-like objects but completely useless for modeling individual galaxies in the real universe. There may be similar objects, but one can never say “this galaxy is represented by that simulated thing.” To model a real galaxy requires a customized approach.

$NFW halos consistently perform worse in fitting data than any other halo model, of which there are many. NFW has been falsified as a viable representation of reality so many times that I can’t recall them all, and yet it remains the go-to model. I think that’s partly thanks to its simplicity – it is mathematically straightforward to implement – and to the fact that it is what simulations predict: LCDM halos should look like NFW. People, including scientists, often struggle to differentiate simulation from reality, so we keep flogging the dead horse.

%The density profile of the NFW halo model asymptotes to power laws at both small and large radii: ρ → r⁻¹ as r → 0 and ρ → r⁻³ as r → ∞. The third parameter of Einasto allows a much wider range of shapes.

Einasto profiles. Einasto is observationally indistinguishable from NFW for α = 0.17, but allows many other shapes.
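The difference in shape is easiest to see in the logarithmic slope of the density profile. Here is a small sketch using the standard forms: NFW has ρ ∝ 1/[x(1+x)²] with x = r/rs, and Einasto has ρ ∝ exp{−(2/α)[x^α − 1]} with x = r/r₋₂, the radius where the slope is −2. The function names and the sampled radii are arbitrary choices for illustration.

```python
import math

def nfw_log_slope(x):
    """d ln(rho) / d ln(r) for NFW, with x = r / r_s."""
    return -(1.0 + 3.0 * x) / (1.0 + x)

def einasto_density(x, alpha):
    """Einasto density in units of rho_-2, with x = r / r_-2."""
    return math.exp(-(2.0 / alpha) * (x**alpha - 1.0))

def einasto_log_slope(x, alpha):
    """d ln(rho) / d ln(r) for Einasto, with x = r / r_-2."""
    return -2.0 * x**alpha

radii = (0.1, 1.0, 10.0)
print("NFW          ", [round(nfw_log_slope(x), 2) for x in radii])
for alpha in (0.17, 1.0, 2.75):
    slopes = [round(einasto_log_slope(x, alpha), 2) for x in radii]
    print(f"Einasto {alpha:4.2f} ", slopes)
```

With α = 0.17 the slopes track NFW closely over this range of radii. As α grows, the inner slope heads toward zero (a core) and the outer profile steepens rapidly, which is the behavior the Gaia fits prefer.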

$$The McGaugh19 model used here is the one with a reasonable bulge/bar. This dense component can be fit in this case because we start with a halo model with a core rather than a cusp (closer to α = 1 than to the α = 0.17 of NFW/LCDM).

( There are none )

Currently, English is the lingua franca of science. It wasn’t always that way, and there’s no reason to expect it always will be. A century ago, all the great physicists who wanted to be part of the quantum revolution went to study in Germany. “Been to Germany” was a desirable bragging point on a cv. Then this little thing called WWII happened, and the gravitational center of physics research, and science more generally, moved to the United States. Now “Been to America” is a bragging point for a German cv.

American Science – the world’s gold standard

The post-war success of American science wasn’t happenstance, it was an outcome of intentional government policy. Investment in science research was seen as an essential element of national security. It also became a phenomenal engine for the growth of knowledge and technology that underpins many essential elements of modern society that we’ve come to take for granted but shouldn’t, like this here internet*. The relatively modest investments (as a fraction of the federal budget) that made this possible have been repaid many times over in economic growth.

Part of the way in which the federal government has invested in science over the past 75 years is through research grants from agencies like NSF, NIH, and NASA awarded to individual scientists via their university employers. This has created a web of interconnected success: grants fund the science, develop new technologies and facilities, train new scientists, help support the environment that makes this possible (including universities), and enable a society where science thrives. American leadership in science seems to be taken for granted, but it only happens with effort and investment. The past three quarters of a century give a clear answer to whether this investment is worthwhile: Absolutely YES.

A legitimate question is what level of investment is appropriate. America’s scientific leadership has been slipping because other nations have witnessed our success and many have taken steps to replicate it. That’s good. But if one wants to maintain leadership for all the value that provides, or even remain competitive, one needs to invest more, not less.

Instead, the budget currently before congress can only be described as a rampage of draconian budget reductions. NASA science is down 47%; NSF 56%. Even NIH, the core agency for research that impacts medicine that we all rely on at some point, is down 37%. Heck, a military unit is considered destroyed if it suffers 30% casualties; these cuts are deeper. This is how you destroy something while pretending not to do so. Rather than simply murder American science outright, the “big, beautiful bill” drags it behind the woodshed, ties it up, thrashes it half to death, and leaves it to bleed out, killing it slowly enough to preserve plausible deniability.

This is a prescription to abandon American leadership in science:

This is all being done in the name of rooting out fraud, waste and abuse. This is an excuse, an assertion without merit. In other words, pure, unadulterated political bullshit.

I’ve worked closely with NSF and NASA. NSF is incredibly efficient – an achievement made largely in response to years of congressional complaint. Funny how the same congresspeople keep complaining even after the agency has done everything they asked. NASA is less efficient, but that’s largely on the side that funds crewed spaceflight, which is super expensive if you don’t want to routinely explode. The science-funding side of NASA is basically pocket change.

Whether any of this research spending is wasteful depends on your value system. But there is no fraud to speak of, nor abuse. Grant budgets are closely scrutinized at many levels. Success rates are low (typically 20% before the cuts; they’re projected to be 7% afterwards. One might as well shoot dice.) The issue is not that fraudulent grants get funded, it is that there isn’t enough funding to support all the excellent proposals. One could literally double** the funding of the science agencies and there would still be meritorious grant proposals that went unfunded.

Personal Experience so far in 2025

I thought I would share some personal experience with how this has been unfolding, both as a member of a research university where I sit on university-wide committees that oversee such things, and as an individual scientist.

Overhead

In February, the Trump administration announced that the overhead rate for NIH grants would be limited to 15%. This is an odd-sounding technicality to most people, so first some background. I didn’t invent the federal grant system, and I do think there are some ways in which it could be improved. But this is the system that has developed, and changing it constructively would require lengthy study and consideration, not the sudden jolt that is being applied.

When a scientist like myself applies for a grant, we mostly focus on the science we want to do. But part of the process is making a budget: what will it cost to achieve the scientific goals? This usually involves funding for junior researchers (grad students and postdocs), money for laboratory equipment or travel to facilities like observatories, and in the system we have, partial funding for the PI (principal investigator). How much salary funding the PI is supposed to obtain from grants varies by field; for the physical sciences it is usually two or three months of summer*** salary.

For my colleagues in the School of Medicine, the average salary support from grants is around 50%; in some departments it is as high as 70%. So cuts to NIH funding are a big deal, even the overhead rate. Overhead is the amount of support provided to the university to keep the lights on, the buildings open, for safe and modern laboratories, administrative support, etc. – all the ecological support necessary to maintain a thriving research environment. Each university negotiates its overhead rate separately with one of the federal funding agencies; there are only a handful of federal employees who know how to do this, as it involves complicated formulae for laboratory space and all sorts of other factors affecting operations. The typical overhead rate is ~50%, so for every two dollars of direct spending (e.g., grad student salary), another dollar**** goes to the university to keep things running. This has gradually become an essential portion of the overall budget of universities over the years, so cuts to the overhead rate are de facto cuts to everything a university does.

The CWRU School of Medicine is a very successful research college. Its cancer research group is particularly renowned, including the only scientists on campus who rank ahead of yours truly in impact according to the Stanford-Elsevier science-wide author databases of standardized citation indicators. It is a large part of the overall campus research effort and is largely funded by NIH. The proposed cut to the overhead rate to 15% would correspond to a $54 million reduction in the university’s annual budget (about 6% of the total, if I recall right).

Not many organizations can gracefully miss $54 million, so this prospect caused much consternation. There were lawsuits (by many universities, not just us), injunctions, petitions by the government to change venue so as to dodge the injunctions, and so far, no concrete action. So spending on existing grants continued as normal, for now. There was guarded optimism in our administration that we’d at least get through the fiscal year without immediate tragedy.

Then another insidious thing started to happen. NIH simply ceased disbursing new grants. Sure, you can spend on existing grants. You can apply for new grants and some of you will even be successful – on paper. We just won’t send you the money. There were administrative hijinx to achieve this end that are too complicated to bother explaining; the administration is very creative at bending/reinterpreting/making up rules to obtain the outcome they want. They did eventually start slow-walking some new grants, so again giving the appearance of normality while in practice choking off an important funding source. In the long run, that’s a bigger deal than the overhead rate. It doesn’t matter what the overhead rate is if it is a percentage of zero.

Now maybe there is some better way to fund science, and it shouldn’t be the role of the federal government. OK, so what would that be? It would be good planning to have a replacement system in place before trashing the existing one. But no one is doing that. Private foundations cannot possibly pick up the slack. So will my colleagues in the School of Medicine suffer 50% salary cuts? Most people couldn’t handle that, but their dean is acting like it’s a possibility.

From the outside, the current situation may look almost normal but it is not. There is no brilliant plan to come up with some better funding scheme. Things will crash soon if not all at once. I expect our university – and many across the country – to be forced to take draconian budget action of their own. Not today, not tomorrow, but soon. What that looks like I don’t know, but I don’t see how it fails to include mass layoffs. Aside from the human cost that obviously entails, it also means we can’t do as much in either research or education. Since this is happening nation-wide, we will all be reduced as a consequence.

As a nation, this is choosing to fail.

My own recent experience with grants

I can’t begin to describe how difficult it is to write a successful grant. There is so much that goes into it; it’s like cramming everything I’ve ever written in this blog into 15 pages without leaving anything out. You don’t dare leave anything out because if you leave out obscure reference X you can be sure the author of X will be on the panel and complain that you’re unaware of important result X. More importantly, every talented colleague I have – and there are many – is doing the same thing, competing for the same shrinking pot. It’s super competitive, and has been for so long that I’ve heard serious suggestions of simply drawing proposals at random, lottery style. Strange as this sounds, this procedure would be more fair than the multiple-jeopardy merit evaluation we have at present: if a proposal almost succeeds one year and a panel tells you to just improve this one thing, next year a different panel may hate that one thing and ask for something different. Feedback from panels used to be extremely useful; now it is just a list of prefab excuses for why you got rejected again.

NSF

I’ve mostly worked with NSF and NASA. I had an NSF proposal that was really well received in 2023; the review was basically “we would have funded this if we had enough money but we didn’t and something else edged you out.” This happens a lot, so I resubmitted it last year. Same result. There was a time when you could expect to succeed through perseverance; that time already seemed to have come to an end, dissolving into a crap shoot, even before the proposed cuts.

In the good old days of which I hear tell, but entirely before my time, NSF had something called an accomplishment-based renewal. Basically you could get a continuation of your grant as long as you were doing Good Things. I never experienced that; all my grants have been a standard three years and done. Getting new grants means writing an entirely new proposal and all the work that entails. It’s exhausting and can be a distraction from actually doing the science. But the legacy of accomplishment-based renewals lives on; as part of the fifteen pages of an NSF grant, you are required to spend five saying what great things you accomplished with previous funding. For me, as it relates to the most recent proposal, that’d be SPARC.

SPARC has been widely used as a database. It is in great demand by the community. So great that when our web server was down recently for the better part of a week for some extensive updates, I immediately got a stack of email asking where was it and when would it be back? The SPARC data paper has been cited over 600 times; the Radial Acceleration Relation based on it over 500. Those are Babe Ruth numbers, easily in the top percentile of citation rates. These are important results, and the data are clearly data the community want. The new proposal would have provided that and more, a dozen-fold, but apparently that’s not good enough.

NASA

While waiting to hear of that predictable disappointment, I tried to rally for NASA ROSES. These Research Opportunities in Space and Earth Science are traditionally announced on Valentine’s Day. ROSES on Valentine’s day? Get it? Yuk, yuk. I didn’t, until it didn’t happen at the appointed time. There were any number of announcements from different parts of NASA saying different things, mostly to the effect of “any day now.” So in March, I logged into my NSPIRES account to see what was available. Here’s the screenshot:

NASA proposals due within 30 days of March 25, 2025.

OK, those are the dregs from last year: the last of the proposal opportunities from ROSES 2024. The program appropriate for my project already passed; I’m looking for the 2025 edition. So let’s filter for future opportunities:

( There are none )

OK. Clearly NASA is going through some things. Let’s all just take a chill pill and come back and check on them three months later:

Huh, same result: future opportunities? ( There are none ) Who coulda guessed? It’s like it’s a feature rather than a bug.

Maybe NASA will get around to slow-walking grants like NIH. But there will be a lot less money at whatever rate it gets doled out – to the manifest detriment of science in the United States and everyone everywhere who is interested in science in general and astrophysics in particular.

The bottom line

Make no mistake, the cuts congress***** and the administration intend to make to US science agencies are so severe that they amount to a termination of science as we’ve come to know it. It is a willful abandonment of American leadership in scientific endeavors. It is culture-war hatred for nerds and eggheads rendered as public policy. The scientific endeavor in the US is already suffering, and it will get much worse. There will be some brain drain, but I’m more concerned with the absence of brain nourishment. We risk murdering the careers of a generation of aspiring scientists.

I am reminded of what I said in the acknowledgements of my own Ph.D. thesis many years ago:

As I recall the path that has brought me here, I am both amazed and appalled by the amount of time, effort, and energy I have put into the production of this document. But this pales in comparison to the amount of tolerance and support (both moral and financial) required to bring a person to this point. It is difficult to grasp the depth and breadth of community commitment the doctoral process requires, let alone acknowledge all who contribute to its successful completion.

S. McGaugh, Ph.D. thesis, 1992

Was that investment not worthwhile? I think it was. But it will be impossible for an aspiring young American like me to do science the way I have done. The career path is already difficult; in future it looks like the opportunity simply won’t exist.

Science is a tiny piece of American greatness that the Trump administration – with the active help of republicans in congress and a corrupt, partisan Supreme Court – has idly tossed in the bonfire. I have focused my comments to what I know directly from my own experience. Millions upon millions of Americans are currently experiencing other forms of malignant maladministration. It’s as if competent government matters after all.

In the longer term, a likely result of the current perfidy is not just a surrender of American leadership, but that the lingua franca of science moves on from English to some other language that is less hostile to it.

I hate politics and have no interest in debating it. I’m not the one who chose to suddenly undo decades of successful bipartisan science policy in a way that has a very direct negative impact on the country, my field, and me personally. Since politics invites divisive argument, the comments section will not be open.


*I don’t know who doesn’t know this, but the internet was developed by universities and the NSF. It grew out of previous efforts by the military (DARPAnet) and private industry (DECnet), but what we now know as the internet was pioneered by academic scientists funded by NSF. I’ve sometimes seen this period (1985 – 1995) referred to as NSFnet to distinguish it from the internet after it was made available to the public in 1995. But that’s not what we called it back then; we called it the internet. That’s what it was; that’s where the name came from.

I’ve been on the internet since 1987. I personally was against sharing it with the public for selfish reasons. As a scientist, I was driving truckloads of data+ along narrow lanes of limited bandwidth; I didn’t want to share the road with randos sharing god knows what. That greedy people (e.g., Mark Zuckerberg) would fence off parts of the fruits of public investment and profit by gatekeeping who could see what and harvesting gobs of personal data had not occurred to me as something that would be allowed.

I relate this bit of personal experience because I’ve seen a lot of tech bros try to downplay the role of NSF and claim its successes by asserting that they invented the internet. They did not; they merely colonized and monetized it. It was invented by scientists to share data.

+I once personally choked what is now known as arXiv by submitting a preprint with galaxy images larger than the system could handle at the time. The submission set off a doom-loop of warning emails that throttled things for many hours before I succeeded in killing all the guilty unix processes. That’s why the comments of that preprint have a link (long since defunct) to a version of the paper on the Institute of Astronomy’s local server.


**I’m old enough to remember, not all that long ago, when there was a bipartisan commitment to double science funding. That didn’t happen. It really did have widespread bipartisan support, but the science budget is a tiny portion of discretionary spending which itself is a tiny portion of the overall federal budget. The effort got lost in reconciliation.


***I would prefer a system that is less focused on the individual PI; it is a very American, social-Darwinist approach to get you to compete by dangling the carrot of more pay. But that carrot long ago evolved into a stick; getting grants is a de facto job requirement, not merely an occasional success. Overall I can’t complain; I’ve been very successful, managing to remain fully funded over the course of my career, up until very recently. Now my grants are finished so my salary is down 25%. In the current environment I don’t expect to see that again.


****Is this a fair rate? I have no idea – not my specialty. But we recently had external consultants brought in to review our expenses; I think the board of trustees expected to identify wasteful spending that could be cut, and that was certainly the attitude the consultants brought in with them. After actually reviewing everything, their report was “Geez, this operation is super-efficient; there’s no fat to cut and really the whole operation should cost more than it does.” While that’s specific to my college, it seems to me to be a pretty accurate depiction of NSF as well.


*****The republicans are pushing through this poisonous budget with a one seat majority in the House of Representatives. One. Seat. It literally could not be closer to a 50/50 split. So don’t go thinking “Americans voted for this.” Americans couldn’t be more divided.

Sad to think how much tragedy could be averted if a single republican in congress grew a spine and put country before party.

Some persistent cosmic tensions

I took the occasion of the NEIU debate to refresh my knowledge of the status of some of the persistent tensions in cosmology. There wasn’t enough time to discuss those, so I thought I’d go through a few of them here. These issues tend to get downplayed or outright ignored when we hype LCDM’s successes.

When I teach cosmology, I like to have the students do a project in which they each track down a measurement of some cosmic parameter, and then report back on it. The idea, when I started doing this back in 1999, was to combine the different lines of evidence to see if we reach a consistent concordance cosmology. Below is an example from the 2002 graduate course at the University of Maryland. Does it all hang together? I ask the students to debate the pros and cons of the various lines of evidence.

The mass density parameter Ωm = ρm/ρcrit and the Hubble parameter h = H0/(100 km/s/Mpc) from various constraints (colored lines) available in 2002. I later added the first (2003) WMAP result (box). The combination of results excludes the grey region; only the white portion is viable: this is the concordance region.
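The two axes of this diagram are tied together through the critical density, ρcrit = 3H0²/(8πG), so a given physical mass density maps to a different Ωm depending on h. A minimal sketch of that bookkeeping follows; the unit choices and function names are mine, and the example values of Ωm and h are just round numbers, not a measurement from the plot.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun

def rho_crit(h):
    """Critical density in Msun / Mpc^3 for H0 = 100 h km/s/Mpc."""
    hubble = 0.1 * h                                   # km/s/kpc
    rho_kpc = 3.0 * hubble**2 / (8.0 * math.pi * G)    # Msun / kpc^3
    return rho_kpc * 1e9                               # 1 Mpc^3 = 1e9 kpc^3

def omega_m(rho_m, h):
    """Mass density parameter for a mass density rho_m in Msun / Mpc^3."""
    return rho_m / rho_crit(h)

# The same physical density gives a different Omega_m for a different h.
rho_m = 0.3 * rho_crit(0.7)
for h in (0.5, 0.7, 1.0):
    print(f"h = {h:.1f}: Omega_m = {omega_m(rho_m, h):.3f}")
```

Because ρcrit scales as h², a constraint on the physical density traces a curve of constant Ωm h² in this plane, not a horizontal line.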

The concordance cosmology is the small portion of this diagram that was not ruled out. This is the way in which LCDM was established. Before we had either the CMB acoustic power spectrum or Type Ia supernovae, LCDM was pretty much a done deal based on a wide array of other astronomical evidence. It was the subsequentα agreement of the Type Ia SN and the CMB that cemented the picture in place.

The implicit assumption in this approach is that we have identified the correct cosmology by process of elimination: whatever is left over must be the right answer. But what if nothing is left over?

I have long worried that we’ve painted ourselves into a corner: maybe the concordance window is merely the least unlikely spot before everything is excluded. Excluding everything would effectively falsify LCDM cosmology, if not the more basic picture of an expanding universe% emerging from a hot big bang. Once one permits oneself to think this way, then it occurs to one that perhaps the reason we have to invoke the twin tooth fairies of dark matter and dark energy is to get FLRW to approximate some deeper, underlying theory.

Most cosmologists do not appear to contemplate this frightening scenario. And indeed, before we believe something so drastic, we have to have thoroughly debunked the standard picture – something rather difficult to do when 95% of it is invisible. It also means believing all the constraints that call the standard picture into question (hence why contradictory results experience considerably more scrutiny* than conforming results). The fact is that some results are more robust than others. The trick is deciding which to trust.^

In the diagram above, the range of Ωm from cluster mass-to-light ratios comes from some particular paper. There are hundreds of papers on this topic, if not thousands. I do not recall which one this particular illustration came from, but most of the estimates I’ve seen from the same method come in somewhat higher. So if we slide those green lines up, the allowed concordance window gets larger.

The practice of modern cosmology has necessarily been an exercise in judgement: which lines of evidence should we most trust? For example, there is a line up there for rotation curves. That was my effort to ask what combination of cosmological parameters led to dark matter halo densities that were tolerable to the rotation curve data of the time. Dense cosmologies give birth to dense dark matter halos, so everything above that line was excluded because those parameters cram too much dark matter into too little space. This was a pretty conservative limit at the time, but it is predicated on the insistence of theorists that dark matter halos had to have the NFW form predicted by dark matter-only simulations. Since that time, simulations including baryons have found any number of ways to alter the initial cusp. This in turn means that the constraint no longer applies as the halo might have been altered from its original, cosmology-predicted initial form. Whether the mechanisms that might cause such alterations are themselves viable becomes a separate question.

If we believed all of the available constraints, then there is no window left and FLRW is already ruled out. But not all of those data are correct, and some contradict each other, even absent the assumption of FLRW. So which do we believe? Finding one’s path in this field is like traipsing through an intellectual mine field full of hardened positions occupied by troops dedicated to this or that combination of parameters.

H0 = 100! No, repent you fools, H0 = 50! (Comic by Paul North)

It is in every way an invitation to confirmation bias. The answer we get depends on how we weigh disparate lines of evidence. We are prone to give greater weight to lines of evidence that conform to our pre-established+ beliefs.

So, with that warning, let’s plunge ahead.

The modern Hubble tension

Gone but not yet forgotten are the Hubble wars between camps Sandage (H0 = 50!) and de Vaucouleurs (H0 = 100!). These were largely resolved early this century thanks to the Hubble Space Telescope Key Project on the distance scale. Obtaining this measurement was the major motivation to launch HST in the first place. Finally, this long standing argument was resolved: nearly everyone agreed that H0 = 72 km/s/Mpc.

That agreement was long-lived by the standards of cosmology, but did not last forever. Here is an illustration of the time dependence of H0 measurements this century, from Freedman (2021):

There are many illustrations like this; I choose this one because it looks great and seems to have become the go-to for illustrating the situation. Indeed, it seems to inform the attitude of many scientists close to but not directly involved in the H0 debate. They seem to perceive this as a debate between Adam Riess and Wendy Freedman, who have become associated with the Cepheid and TRGB$ calibrations, respectively. This is a gross oversimplification, as they are not the only actors on a very big stage&. Even in this plot, the first Cepheid point is from Freedman’s HST Key Project. But this apparent dichotomy between calibrators and people seems to be how the subject is perceived by scientists who have neither time nor reason for closer scrutiny. Let’s scrutinize.

Fits to the acoustic power spectrum of the CMB agreed with astronomical measurements of H0 for the first decade of the century. Concordance was confirmed. The current tension appeared with the first CMB data from Planck. Suddenly the grey band of the CMB best-fit no longer overlapped with the blue band of astronomical measurements. This came as a shock. Then a new (red) band appeared, distinguishing the “local” H0 calibrated by the TRGB from that calibrated by Cepheids.

I think I mentioned that cosmology was an invitation to confirmation bias. If you put a lot of weight on CMB fits, as many cosmologists do, then it makes sense from that perspective that the TRGB measurement is the correct one and the Cepheid H0 must be wrong. This is easy to imagine given the history of systematic errors that plagued the subject throughout the twentieth century. This confirmation bias makes one inclined to give more credence to the new# TRGB calibration, which is only in modest tension with the CMB value. The narrative is then simplified to two astronomical methods that are subject to systematic uncertainty: one that agrees with the right answer and one that does not. Ergo, the Cepheid H0 is in systematic error.

This narrative oversimplifies the matter to the point of being actively misleading, and the plot above abets this by focusing on only two of the many local measurements. There is no perfect way to do this, but I had a go at it last year. In the plot below, I cobbled together all the data I could without going ridiculously far back, but chose to show only one point per independent group, the most recent one available from each, the idea being that the same people don’t get new votes every time they tweak their result – that’s basically what is illustrated above. The most recent points from above are labeled Cepheids & TRGB (the date of the TRGB goes to the full Chicago-Carnegie paper, not Freedman’s summary paper where the above plot can be found). See McGaugh (2024) for the references.

When I first made this plot, I discovered that many measurements of the Hubble constant are not all that precise: the plot was an indecipherable forest of error bars. So I chose to make a cut at a statistical uncertainty of 3 km/s/Mpc: worse than that, the data are shown as open symbols sans error bars; better than that, the datum gets explicit illustration of both its statistical and systematic uncertainty. One could make other choices, but the point is that this choice paints a different picture from the choice made above. One of these local measurements is not like the others, inviting a different version of confirmation bias: the TRGB point is the outlier, so perhaps it is the one that is wrong.

Recent measurements of the Hubble constant (left) and the calibration of the baryonic Tully-Fisher relation (right) underpinning one of those measurements.

I highlight the measurement our group made not to note that we’ve done this too so much as to highlight an underappreciated aspect of the apparent tension between Cepheid and TRGB calibrations. There are 50 galaxies that calibrate the baryonic Tully-Fisher relation, split nearly evenly between galaxies whose distance is known through Cepheids (blue points) and TRGB (red points). They give the same answer. There is no tension between Cepheids and the TRGB here.

Chasing this up, it appears to me that what happened was that Freedman’s group reanalyzed the data that calibrate the TRGB, and wound up with a slightly different answer. This difference does not appear to be in the calibration equation (the absolute magnitude of the tip of the red giant branch didn’t change that much), but in something to do with how the tip magnitude is extracted. Maybe, I guess? I couldn’t follow it all the way, and I got bad vibes reminding me of when I tried to sort through Sandage’s many corrections in the early ’90s. That doesn’t make it wrong, but the point is that the discrepancy is not between Cepheids and TRGB calibrations so much as it is between the TRGB as implemented by Freedman’s group and the TRGB as implemented by others. The depiction of the local Hubble constant debate as being between Cepheid and TRGB calibrations is not just misleading, it is wrong.

Can we get away from Cepheids and the TRGB entirely? Yes. The black points above are for megamasers and gravitational lensing. These are geometric methods that do not require intermediate calibrators like Cepheids at all. It’s straight trigonometry. Both indicate H0 > 70. Which way is our confirmation bias leaning now?

The way these things are presented has an impact on scientific consensus. A fascinating experiment on this has been done in a recent conference report. Sometimes people poll conference attendees in an attempt to gauge consensus; this report surveys conference attendees “to take a snapshot of the attitudes of physicists working on some of the most pressing questions in modern physics.” One of the topics queried is the Hubble tension. Survey says:

Table XII from arXiv:2503.15776 in which scientists at the 2024 conference Black Holes Inside and Out vote on their opinion about the most likely solution of the Hubble tension.

First, a shout out to the 1/4 of scientists who expressed no opinion. That’s the proper thing to do when you’re not close enough to a subject to make a well-informed judgement. Whether one knows enough to do this is itself a judgement call, and we often let our arrogance override our reluctance to over-share ill-informed opinions.

Second, a shout out to the folks who did the poll for including a line for systematics in the CMB. That is a logical possibility, even if only 3 of the 72 participants took it seriously. This corroborates the impression I have that most physicists seem to think the CMB is perfect like some kind of holy scripture written in fire on the primordial sky, so must be correct and cannot be questioned, amen. That’s silly; systematics are always a possibility in any observation of the sky. In the case of the CMB, I suspect it is not some instrumental systematic but the underlying assumption of LCDM FLRW that is the issue; once one assumes that, then indeed, the best fit to the Planck data as published is H0 = 67.4, with H0 > 68 being right out. (I’ve checked.)

A red flag that the CMB is where the problem lies is the systematic variation of the best-fit parameters along the trench of minimum χ^2:

The time evolution of best-fit CMB cosmology parameters. These have steadily drifted away from the LCDM concordance window while the astronomical measurements that established it have not.

I’ve shown this plot and variations for other choices of H0 before, yet it never fails to come as a surprise when I show it to people who work closely on the subject. I’m gonna guess that extends to most of the people who participated in the survey above. Some red flags prove to be false alarms, some don’t, but one should at least be aware of them and take them into consideration when making a judgement like this.

The plurality (35%) of those polled selected “systematic error in supernova data” as the most likely cause of the Hubble tension. It is indeed a common attitude, as I mentioned above, that the Hubble tension is somehow a problem of systematic errors in astronomical data like back in the bad old days** of Sandage & de Vaucouleurs.

Let’s unpack this a bit. First, the framing: systematic error in supernova data is not the issue. There may, of course, be systematic uncertainties in supernova data, but that’s not a contender for what is causing the apparent Hubble tension. The debate over the local value of H0 is in the calibrators of supernovae. This is often expressed as a tension between Cepheid and TRGB calibrators, but as we’ve seen, even that is misleading. So posing the question this way is all kinds of revealing, including of some implicit confirmation bias. It’s like putting the right answer of a multiple choice question first and then making up some random alternatives.

So what do we learn from this poll for consensus? There is no overwhelming consensus, and the most popular choice appears to be ill-informed. This could be a meme. Tell me you’re not an expert on a subject by expressing an opinion as if you were.

The kicker here is that this was a conference on black hole physics. There seems to have been some fundamental gravitational and quantum physics discussed, which is all very interesting, but this is a community that is pretty far removed from the nitty-gritty of astronomical observations. There are many other polls reported in this conference report, many of them about esoteric aspects of black holes that I find interesting but would not myself venture an opinion on: it’s not my field. It appears that a plurality of participants at this particular conference might want to consider adopting that policy for fields beyond their own expertise.

I don’t want to be too harsh, but it seems like we are repeating the same mistakes we made in the 1980s. As I’ve related before, I came to astronomy from physics with the utter assurance that H0 had to be 50. It was Known. Then I met astronomers who were actually involved in measuring H0 and they were like, “Maybe it is ~80?” This hurt my brain. It could not be so! and yet they turned out to be correct within the uncertainties of the time. Today, similar strong opinions are being expressed by the same community (and sometimes by the same people) who were wrong then, so it wouldn’t surprise me if they are wrong now. Putting how they think things should be ahead of how they are is how they roll.

There are other tensions besides the Hubble tension, but I’ll get to them in future posts. This is enough for now.


αAs I’ve related before, I date the genesis of concordance LCDM to the work of Ostriker & Steinhardt (1995), though there were many other contributions leading to it (e.g., Efstathiou et al. 1990). Certainly many of us anticipated that the Type Ia SN experiments would confirm or deny this picture. Since the issue of confirmation bias is ever-present in cosmic considerations, it is important to understand this context: the acceleration of the expansion rate that is often depicted as a novel discovery in 1998 was an expected result. So much so that at a conference in 1997 in Aspen I recall watching Michael Turner badger the SN presenters to Proclaim Lambda already. One of the representatives from the SN teams was Richard Ellis, who wasn’t having it: the SN data weren’t there yet even if the attitude was. Amusingly, I later heard Turner claim to have been completely surprised by the 1998 discovery, as if he hadn’t been pushing for it just the year before. Aspen is a good venue for discussion; I commented at the time that the need to rehabilitate the cosmological constant was a big stop sign in the sky. He glared at me, and I’ve been on his shit list ever since.

%I will not be entertaining assertions that the universe is not expanding in the comments: that’s beyond the scope of this post.

*Every time a paper corroborating a prediction of MOND is published, the usual suspects get on social media to complain that the referee(s) who reviewed the paper must be incompetent. This is a classic case of admitting you don’t understand how the process works by disparaging what happened in a process to which you weren’t privy. Anyone familiar with the practice of refereeing will appreciate that the opposite is true: claims that seem extraordinary are consistently held to a higher standard.

^Note that it is impossible to exclude the act of judgement. There are approaches to minimizing this in particular experiments, e.g., by doing a blind analysis of large scale structure data. But you’ve still assumed a paradigm in which to analyze those data; that’s a judgement call. It is also a judgement call to decide to believe only large scale data and ignore evidence below some scale.

+I felt this hard when MOND first cropped up in my data for low surface brightness galaxies. I remember thinking How can this stupid theory get any predictions right when there is so much evidence for dark matter? It took a while for me to realize that dark matter really meant mass discrepancies. The evidence merely indicates a problem, the misnomer presupposes the solution. I had been working so hard to interpret things in terms of dark matter that it came as a surprise that once I allowed myself to try interpreting things in terms of MOND I no longer had to work so hard: lots of observations suddenly made sense.

$TRGB = Tip of the Red Giant Branch. Low metallicity stars reach a consistent maximum luminosity as they evolve up the red giant branch, providing a convenient standard candle.

&Where the heck is Tully? He seldom seems to get acknowledged despite having played a crucial role in breaking the tyranny of H0 = 50 in the 1970s, having published steadily on the topic, and leading a group that continues to provide accurate measurements to this day. Do physics-trained cosmologists even know who he is?

#The TRGB was a well-established method before it suddenly appears on this graph. That it appears this way shortly after the CMB told us what answer we should get is a more worrisome potential example of confirmation bias, reminiscent of the situation with the primordial deuterium abundance.

**Aside from the tension between the TRGB as implemented by Freedman’s group and the TRGB as implemented by others, I’m not aware of any serious hint of systematics in the calibration of the distance scale. Can it still happen? Sure! But people are well aware of the dangers and watch closely for them. At this juncture, there is ample evidence that we may indeed have gotten past this.

Ha! I knew the Riess reference off the top of my head, but lots of people have worked on this so I typed “hubble calibration not a systematic error” into Google to search for other papers only to have its AI overview confidently assert

The statement that Hubble calibration is not a systematic error is incorrect

Google AI

That gave me a good laugh. It’s bad enough when overconfident underachievers shout about this from the wrong peak of the Dunning-Kruger curve without AI adding its recycled opinion to the noise, especially since its “opinion” is constructed from the noise.

The best search engine for relevant academic papers is NASA ADS; putting the same text in the abstract box returns many hits that I’m not gonna wade through. (A well-structured ADS search doesn’t read so casually; apparently the same still applies to Google.)

On the timescale for galaxy formation

I’ve been wanting to expand on the previous post ever since I wrote it, which is over a month ago now. It has been a busy end to the semester. Plus, there’s a lot to say – nothing that hasn’t been said before, somewhere, somehow, yet still a lot to cobble together into a coherent story – if that’s even possible. This will be a long post, and there will be more after to narrate the story of our big paper in the ApJ. My sole ambition here is to express the predictions of galaxy formation theory in LCDM and MOND in the broadest strokes.

A theory is only as good as its prior. We can always fudge things after the fact, so what matters most is what we predict in advance. What do we expect for the timescale of galaxy formation? To tell you what I’m going to tell you, it takes a long time to build a massive galaxy in LCDM, but it happens much faster in MOND.

Basic Considerations

What does it take to make a galaxy? A typical giant elliptical galaxy has a stellar mass of 9 x 10^10 M☉. That’s a bit more than our own Milky Way, which has a stellar mass of 5 or 6 x 10^10 M☉ (depending who you ask) with another 10^10 M☉ or so in gas. So, in classic astronomy/cosmology style, let’s round off and say a big galaxy is about 10^11 M☉. That’s a hundred billion stars, give or take.

An elliptical galaxy (NGC 3379, left) and two spiral galaxies (NGC 628 and NGC 891, right).

How much of the universe does it take to make one big galaxy? The critical density of the universe is the over/under point for whether an expanding universe expands forever, or has enough self-gravity to halt the expansion and ultimately recollapse. Numerically, this quantity is ρcrit = 3H0^2/(8πG), which for H0 = 73 km/s/Mpc works out to 10^-29 g/cm^3 or 1.5 x 10^-7 M☉/pc^3. This is a very small number, but provides the benchmark against which we measure densities in cosmology. The density of any substance X is ΩX = ρX/ρcrit. The stars and gas in galaxies are made of baryons, and we know the baryon density pretty well from Big Bang Nucleosynthesis: Ωb = 0.04. That means the average density of normal matter is very low, only about 4 x 10^-31 g/cm^3. That’s less than one hydrogen atom per cubic meter – most of space is an excellent vacuum!

This being the case, we need to scoop up a large volume to make a big galaxy. Going through the math, to gather up enough mass to make a 10^11 M☉ galaxy, we need a sphere with a radius of 1.6 Mpc. That’s in today’s universe; in the past the universe was denser by (1+z)^3, so at z = 10 that’s “only” 140 kpc. Still, modern galaxies are much smaller than that; the effective edge of the disk of the Milky Way is at a radius of about 20 kpc, and most of the baryonic mass is concentrated well inside that: the typical half-light radius of a 10^11 M☉ galaxy is around 6 kpc. That’s a long way to collapse.
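For anyone who wants to go through the math themselves, here is a minimal Python sketch of the arithmetic above. It is only a back-of-the-envelope check, not anything from the paper: the function names are made up for illustration, the physical constants are standard rounded values that I am supplying, and the sphere is assumed to be filled at the mean baryon density with Ωb = 0.04 and H0 = 73 km/s/Mpc as quoted in the text.

```python
import math

# Physical constants in cgs units (standard values, rounded; supplied here as assumptions).
G = 6.674e-8          # gravitational constant, cm^3 g^-1 s^-2
MPC_CM = 3.0857e24    # centimeters per megaparsec
PC_CM = 3.0857e18     # centimeters per parsec
MSUN_G = 1.989e33     # grams per solar mass


def critical_density_cgs(h0_km_s_mpc):
    """Critical density 3 H0^2 / (8 pi G) in g/cm^3."""
    h0 = h0_km_s_mpc * 1.0e5 / MPC_CM  # convert km/s/Mpc to 1/s
    return 3.0 * h0**2 / (8.0 * math.pi * G)


def gathering_radius_mpc(mass_msun, omega_b, h0_km_s_mpc, z=0.0):
    """Physical radius (Mpc) of a sphere at the mean baryon density containing mass_msun."""
    rho_b = omega_b * critical_density_cgs(h0_km_s_mpc)  # g/cm^3 today
    rho_b_msun_pc3 = rho_b * PC_CM**3 / MSUN_G           # Msun/pc^3 today
    rho_b_msun_pc3 *= (1.0 + z) ** 3                     # the universe was denser in the past
    volume_pc3 = mass_msun / rho_b_msun_pc3
    radius_pc = (3.0 * volume_pc3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius_pc / 1.0e6


rho_crit = critical_density_cgs(73.0)
radius_today = gathering_radius_mpc(1e11, 0.04, 73.0)
radius_z10 = gathering_radius_mpc(1e11, 0.04, 73.0, z=10.0)

print(f"rho_crit = {rho_crit:.2e} g/cm^3")
print(f"R(z=0)   = {radius_today:.2f} Mpc")
print(f"R(z=10)  = {1e3 * radius_z10:.0f} kpc")
```

The numbers printed should land close to the rounded values quoted above; small differences come from rounding the constants and the result.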

Monolithic Galaxy Formation

Given this much information, an early concept was monolithic galaxy formation. We have a big ball of gas in the early universe that collapses to form a galaxy. Why and how this got started was fuzzy. But we knew how much mass we needed and the volume it had to come from, so we can consider what happens as the gas collapses to create a galaxy.

Here we hit a big astrophysical reality check. Just how does the gas collapse? It has to dissipate energy to do so, and cool to form stars. Once stars form, they may feed energy back into the surrounding gas, reheating it and potentially preventing the formation of more stars. These processes are nontrivial to compute ab initio, and attempting to do so obsesses much of the community. We don’t agree on how these things work, so they are the knobs theorists can turn to change an answer they don’t like.

Even if we don’t understand star formation in detail, we do observe that stars have formed, and can estimate how many. Moreover, we do understand pretty well how stars evolve once formed. Hence a common approach is to build stellar population models with some prescribed star formation history and see what works. Spiral galaxies like the Milky Way formed a lot of stars in the past, and continue to do so today. To make 5 x 10^10 M☉ of stars in 13 Gyr requires an average star formation rate of 4 M☉/yr. The current measured star formation rate of the Milky Way is estimated to be 2 ± 0.7 M☉/yr, so the star formation rate has been nearly constant (averaging over stochastic variations) over time, perhaps with a gradual decline. Giant elliptical galaxies, in contrast, are “red and dead”: they have no current star formation and appear to have made most of their stars long ago. Rather than a roughly constant rate of star formation, they peaked early and declined rapidly. The cessation of star formation is also called quenching.

A common way to formulate the star formation rate in galaxies as a whole is the exponential star formation rate, SFR(t) = SFR0 e^(-t/τ). A spiral galaxy has a low baseline star formation rate SFR0 and a long burn time τ ~ 10 Gyr while an elliptical galaxy has a high initial star formation rate and a short e-folding time like τ ~ 1 Gyr. Many variations on this theme are possible, and are of great interest astronomically, but this basic distinction suffices for our discussion here. From the perspective of the observed mass and stellar populations of local galaxies, the standard picture for a giant elliptical was a large, monolithic island universe that formed the vast majority of its stars early on then quenched with a short e-folding timescale.
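To make the distinction concrete, here is a small Python sketch that integrates the exponential star formation history analytically. Treat it as an illustration under stated assumptions: the function names are hypothetical, the masses (5 x 10^10 and 10^11 M☉), e-folding times (10 and 1 Gyr) and the 13 Gyr span are taken from the round numbers used in this post, and mass returned to the gas by evolving stars is ignored.

```python
import math


def mass_formed(sfr0, tau_gyr, t_gyr):
    """Mass (Msun) formed by time t for SFR(t) = SFR0 * exp(-t/tau).

    sfr0 is in Msun/yr; tau and t are in Gyr. Stellar mass loss is ignored,
    so this is mass formed, not mass surviving in stars.
    """
    return sfr0 * tau_gyr * 1.0e9 * (1.0 - math.exp(-t_gyr / tau_gyr))


def sfr0_for_mass(total_msun, tau_gyr, t_gyr):
    """Initial rate SFR0 (Msun/yr) needed to form total_msun by time t."""
    return total_msun / (tau_gyr * 1.0e9 * (1.0 - math.exp(-t_gyr / tau_gyr)))


def half_mass_time(tau_gyr, t_gyr):
    """Time (Gyr) by which half of the mass formed by t_gyr is already in place."""
    fraction_total = 1.0 - math.exp(-t_gyr / tau_gyr)
    return -tau_gyr * math.log(1.0 - 0.5 * fraction_total)


T_NOW = 13.0  # Gyr of star formation, the round number used in the text

# (label, total mass formed in Msun, e-folding time in Gyr): illustrative choices
for label, total, tau in [("spiral", 5e10, 10.0), ("elliptical", 1e11, 1.0)]:
    sfr0 = sfr0_for_mass(total, tau, T_NOW)
    sfr_now = sfr0 * math.exp(-T_NOW / tau)
    t_half = half_mass_time(tau, T_NOW)
    print(f"{label:10s} SFR0 = {sfr0:7.2f} Msun/yr, "
          f"SFR(now) = {sfr_now:.2e} Msun/yr, half mass at {t_half:.2f} Gyr")
```

With these inputs the spiral case starts at several M☉/yr and ends a little under 2 M☉/yr, in line with the measured Milky Way rate, while the elliptical case starts near 100 M☉/yr and has half its stars in place well within the first Gyr.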

Galaxies as Island Universes

The density parameter Ω provides another useful way to think about galaxy formation. As cosmologists, we obsess about the global value of Ω because it determines the expansion history and ultimate fate of the universe. Here it has a more modest application. We can think of the region in the early universe that will ultimately become a galaxy as its own little closed universe. With a density parameter Ω > 1, it is destined to recollapse.

A fun and funny fact of the Friedmann equation is that the matter density parameter Ωm → 1 at early times, so the early universe when galaxies form is matter dominated. It is also very uniform (more on that below). So any subset that is a bit more dense than average will have Ω > 1 just because the average is very close to Ω = 1. We can then treat this region as its own little universe (a “top-hat overdensity”) and use the Friedmann equation to solve for its evolution, as in this sketch:

The expansion of the early universe a(t) (blue line). A locally overdense region may behave as a closed universe, recollapsing in a finite time (red line) to potentially form a galaxy.

That’s great, right? We have a simple, analytic solution derived from first principles that explains how a galaxy forms. We can plug in the numbers to find how long it takes to form our basic, big 10^11 M☉ galaxy and… immediately encounter a problem. We need to know how overdense our protogalaxy starts out. Is its effective initial Ωm = 2? 10? What value, at what time? The higher it is, the faster the evolution from initially expanding along with the rest of the universe to decoupling from the Hubble flow to collapsing. We know the math but we still need to know the initial condition.

Annoying Initial Conditions

The initial condition for galaxy formation is observed in the cosmic microwave background (CMB) at z = 1090. Where today’s universe is remarkably lumpy, the early universe is incredibly uniform. It is so smooth that it is homogeneous and isotropic to one part in a hundred thousand. This is annoyingly smooth, in fact. It would help to have some lumps – primordial seeds with Ω > 1 – from which structure can grow. The observed seeds are too tiny; the typical initial amplitude is 10^-5 so Ωm = 1.00001. That takes forever to decouple and recollapse; it hasn’t yet had time to happen.
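How long is forever? A rough Python sketch follows, using the textbook result that a closed, matter-only Friedmann universe with initial density parameter Ωi and expansion rate Hi runs from bang to crunch in a time π Ωi / [Hi (Ωi - 1)^(3/2)]. Everything beyond that formula is an assumption made for illustration: the function names are invented, the expansion rate at z = 1090 is approximated with a matter-only scaling using Ωm = 0.3 and H0 = 73 km/s/Mpc (radiation is neglected, which is not quite right that early), and the region is treated as a perfect top-hat.

```python
import math

# 1 km/s/Mpc expressed in units of 1/Gyr (approximate conversion).
KM_S_MPC_TO_PER_GYR = 1.0 / 977.8


def collapse_time_in_hubble_times(omega_i):
    """Bang-to-crunch time of a closed, matter-only region in units of 1/H_i."""
    if omega_i <= 1.0:
        raise ValueError("only an overdense region (omega_i > 1) recollapses")
    return math.pi * omega_i / (omega_i - 1.0) ** 1.5


def hubble_rate_matter_era(z, h0=73.0, omega_m0=0.3):
    """Approximate H(z) in km/s/Mpc, keeping only the matter term (assumed values)."""
    return h0 * math.sqrt(omega_m0) * (1.0 + z) ** 1.5


def collapse_time_gyr(omega_i, z_i):
    """Collapse time in Gyr for a region with density parameter omega_i at redshift z_i."""
    h_i = hubble_rate_matter_era(z_i) * KM_S_MPC_TO_PER_GYR  # 1/Gyr
    return collapse_time_in_hubble_times(omega_i) / h_i


for omega_i in (10.0, 2.0, 1.00001):
    t_gyr = collapse_time_gyr(omega_i, 1090.0)
    print(f"Omega_i = {omega_i:<8g} at z = 1090 -> collapse after {t_gyr:.3g} Gyr")
```

Under these assumptions a region that starts with Ωi of 2 or 10 collapses almost immediately, while the observed Ωi = 1.00001 gives a collapse time orders of magnitude longer than the age of the universe.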

The cosmic microwave background as observed by ESA’s Planck satellite. This is an all-sky picture of the relic radiation field – essentially a snapshot of the universe when it was just a few hundred thousand years old. The variations in color are variations in temperature which correspond to variations in density. These variations are tiny, only about one part in 100,000. The early universe was very uniform; the real picture is a boring blank grayscale. We have to crank the contrast way up to see these minute variations.

We would like to know how the big galaxies of today – enormous agglomerations of stars and gas and dust separated by inconceivably vast distances – came to be. How can this happen starting from such homogeneous initial conditions, where all the mass is equally distributed? Gravity is an attractive force that makes the rich get richer, so it will grow the slight initial differences in density, but it is also weak and slow to act. A basic result in gravitational perturbation theory is that overdensities grow at the same rate the universe expands, which is inversely related to redshift. So if we see tiny fluctuations in density with amplitude 10^-5 at z = 1000, they should have only grown by a factor of 1000 and still be small today (10^-2 at z = 0). But we see structures of much higher contrast than that. You can’t get here from there.
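The growth argument is a one-liner, sketched here in Python. The assumption is the simple matter-dominated scaling described above, with the overdensity growing in proportion to the scale factor, 1/(1+z); the function names are just for illustration.

```python
def linear_growth_factor(z_start, z_end=0.0):
    """Growth of a linear overdensity between two redshifts, assuming delta scales as 1/(1+z)."""
    return (1.0 + z_start) / (1.0 + z_end)


def grown_amplitude(delta_start, z_start, z_end=0.0):
    """Overdensity amplitude at z_end given amplitude delta_start at z_start."""
    return delta_start * linear_growth_factor(z_start, z_end)


delta_today = grown_amplitude(1e-5, 1000.0)
shortfall = 1.0 / delta_today  # extra growth needed to reach order-unity contrast

print(f"growth factor since z = 1000: {linear_growth_factor(1000.0):.0f}")
print(f"amplitude today: {delta_today:.3g}")
print(f"shortfall relative to delta ~ 1: factor of {shortfall:.0f}")
```

The shortfall of roughly a factor of a hundred is the gap that something beyond baryons growing at the normal rate has to fill.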

The rich large scale structure we see today is impossible starting from the smooth observed initial conditions. Yet here we are, so we have to do something to goose the process. This is one of the original motivations for invoking cold dark matter (CDM). If there is a substance that does not interact with photons, it can start to clump up early without leaving too large a mark on the relic radiation field. In effect, the initial fluctuations in mass are larger, just in the invisible substance. (That’s not to say the CDM doesn’t leave a mark on the CMB; it does, but it is subtle and entirely another story.) So the idea is that dark matter forms gravitational structures first, and the baryons fall in later to make galaxies.

An illustration of the linear growth of overdensities. Structure can grow in the dark matter (long dashed lines) with the baryons catching up only after decoupling (short dashed line). In effect, the dark matter gives structure formation a head start, nicely explaining the apparently impossible growth factor. This has been the standard picture for what seems like forever (illustration from Schramm 1992).

With the right amount of CDM – and it has to be just the right amount of a dynamically cold form of non-baryonic dark matter (stuff we still don’t know actually exists) – we can explain how the growth factor is 10^5 since recombination instead of a mere 10^3. The dark matter got a head start over the stuff we can see; it looks like 10^5 because the normal matter lagged behind, being entangled with the radiation field in a way the dark matter was not.

This has been the imperative need in structure formation theory for so long that it has become undisputed lore; an element of the belief system so deeply embedded that it is practically impossible to question. I risk getting ahead of the story, but it is important to point out that, like the interpretation of so much of the relevant astrophysical data, this belief assumes that gravity is normal. This assumption dictates the growth rate of structure, which in turn dictates the need to invoke CDM to allow structure to form in the available time. If we drop this assumption, then we have to work out what happens in each and every alternative that we might consider. That definitely gets ahead of the story, so first let’s understand what we should expect in LCDM.

Hierarchical Galaxy formation in LCDM

LCDM predicts some things remarkably well but others not so much. The dark matter is well-behaved, responding only to gravity. Baryons, on the other hand, are messy – one has to worry about hydrodynamics in the gas, star formation, feedback, dust, and probably even magnetic fields. In a nutshell, LCDM simulations are very good at predicting the assembly of dark mass, but converting that into observational predictions relies on our incomplete knowledge of messy astrophysics. We know what the mass should be doing, but we don’t know so well how that translates to what we see. Mass good, light bad.

Starting with the assembly of mass, the first thing we learn is that the story of monolithic galaxy formation outlined above has to be wrong. Early density fluctuations start out tiny, even in dark matter. God didn’t plunk down island universes of galaxy mass then say “let there be galaxies!” The annoying initial conditions mean that little dark matter halos form first. These subsequently merge hierarchically to make ever bigger halos. Rather than top-down monolithic galaxy formation, we have the bottom-up hierarchical formation of dark matter halos.

The hierarchical agglomeration of dark matter halos into ever larger objects is often depicted as a merger tree. Here are four examples from the high resolution Illustris TNG50 simulation (Pillepich et al. 2019; Nelson et al. 2019).

Examples of merger trees from the TNG50-1 simulation (Pillepich et al. 2019; Nelson et al. 2019). Objects have been selected to have very nearly the same stellar mass at z=0. Mass is built up through a series of mergers. One large dark matter halo today (at top) has many antecedents (small halos at bottom). These merge hierarchically as illustrated by the connecting lines. The size of the symbol is proportional to the halo mass. I have added redshift and the corresponding age of the universe for vanilla LCDM in a more legible font. The color bar illustrates the specific star formation rate: the top row has objects that are still actively star forming like spirals; those in the bottom row are “red and dead” – things that have stopped forming stars, like giant elliptical galaxies. In all cases, there is a lot of merging and a modest rate of growth, with the typical object taking about half a Hubble time (~7 Gyr) to assemble half of its final stellar mass.

The hierarchical assembly of mass is generic in CDM. Indeed, it is one of its most robust predictions. Dark matter halos start small, and grow larger by a succession of many mergers. This gradual agglomeration is slow: note how tiny the dark matter halos at z = 10 are.
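For readers who want to translate between redshift and cosmic age themselves, here is a small sketch using the standard closed-form age for a flat universe containing only matter and a cosmological constant. The parameter values (H0 = 70 km/s/Mpc, Ωm = 0.3) are generic assumptions for illustration, not necessarily those used to label the figure, and radiation is neglected.

```python
import math

# 1 / (1 km/s/Mpc) expressed in Gyr
GYR_PER_INVERSE_H0_UNIT = 977.8


def age_at_redshift(z, h0=70.0, omega_m=0.3):
    """Age of a flat LCDM universe (matter + Lambda only) at redshift z, in Gyr.

    h0 and omega_m are illustrative defaults, not values taken from the post.
    """
    omega_l = 1.0 - omega_m
    hubble_time_gyr = GYR_PER_INVERSE_H0_UNIT / h0
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return (2.0 / (3.0 * math.sqrt(omega_l))) * math.asinh(x) * hubble_time_gyr


for z in (0, 1, 4, 10, 14):
    print(f"z = {z:2d}: age = {age_at_redshift(z):6.2f} Gyr")
```

With these assumed parameters, z = 10 corresponds to roughly half a billion years after the Big Bang, and half a Hubble time is about 7 Gyr.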

Strictly speaking, it isn’t even meaningful to talk about a single galaxy over the span of a Hubble time. It is hard to avoid this mental trap: surely the Milky Way has always been the Milky Way? So one imagines its evolution over time. This is monolithic thinking. Hierarchically, “the galaxy” refers at best to the largest progenitor, the object that traces the left edge of the merger trees above. But the other protogalactic chunks that eventually merge together are as much part of the final galaxy as the progenitor that happens to be largest.

This complicated picture is complicated further by what we can see being stars, not mass. The luminosity we observe forms through a combination of in situ growth (star formation in the largest progenitor) and ex situ growth through merging. There is no reason for some preferred set of protogalaxies to form stars faster than the others (though of course there is some scatter about the mean), so presumably the light traces the mass of stars formed, which in turn traces the underlying dark mass. Presumably.

That we should see lots of little protogalaxies at high redshift is nicely illustrated by this lookback cone from Yung et al. (2022). Here the color and size of each point correspond to the stellar mass. Massive objects are common at low redshift but become progressively rare at high redshift, petering out at z > 4 and basically absent at z = 10. This realization of the observable stellar mass tracks the assembly of dark mass seen in merger trees.

Fig. 2 from Yung et al. (2022) illustrating what an observer would see looking back through their simulation to high redshift.

This is what we expect to see in LCDM: lots of small protogalaxies at high redshift; the building blocks of later galaxies that had not yet merged. The observation of galaxies much brighter than this at high redshift by JWST poses a fundamental challenge to the paradigm: mass appears not to be subdivided as expected. So it is entirely justifiable that people have been freaking out that what we see are bright galaxies that are apparently already massive. That shouldn’t happen; it wasn’t predicted to happen; how can this be happening?

That’s all background that is assumed knowledge for our ApJ paper, so we’re only now getting to its Figure 1. This combines one of the merger trees above with its stellar mass evolution. The left panel shows the assembly of dark mass; the right panel shows the growth of stellar mass in the largest progenitor. This is what we expect to see in observations.


Fig. 1 from McGaugh et al. (2024): A merger tree for a model galaxy from the TNG50-1 simulation (Pillepich et al. 2019; Nelson et al. 2019, left panel) selected to have M* ≈ 9 × 10¹⁰ M☉ at z = 0; i.e., the stellar mass of a local L* giant elliptical galaxy (Driver et al. 2022). Mass assembles hierarchically, starting from small halos at high redshift (bottom edge) with the largest progenitor traced along the left edge of the merger tree. The growth of stellar mass of the largest progenitor is shown in the right panel. This example (jagged line) is close to the median (dashed line) of comparable mass objects (Rodriguez-Gomez et al. 2016), and within the range of the scatter (the shaded band shows the 16th – 84th percentiles). A monolithic model that forms at zf = 10 and evolves with an exponentially declining star formation rate with τ = 1 Gyr (purple line) is shown for comparison. The latter model forms most of its stars earlier than occurs in the simulation.

For comparison, we also show the stellar mass growth of a monolithic model for a giant elliptical galaxy. This is the classic picture we had for such galaxies before we realized that galaxy formation had to be hierarchical. This particular monolithic model forms at zf = 10 and follows an exponential star formation rate with τ = 1 Gyr. It is one of the models published by Franck & McGaugh (2017). It is, in fact, the first model I asked Jay to construct when he started the project. Not because we expected it to best describe the data, as it turns out to do, but because the simple exponential model is a touchstone of stellar population modeling. It was a starter model: do this basic thing first to make sure you’re doing it right. We chose τ = 1 Gyr because that was the typical number bandied about for elliptical galaxies, and zf = 10 because that seemed ridiculously early for a massive galaxy to form. At the time we built the model, it was ludicrously early to imagine a massive galaxy would form, from an LCDM perspective. A formation redshift zf = 10 was, less than a decade ago, practically indistinguishable from the beginning of time, so we expected it to provide a limit that the data would not possibly approach.
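The mass growth of such an exponential model is simple enough to write down directly. The sketch below is not the Franck & McGaugh (2017) model itself, just the bare exponential: a star formation rate proportional to exp(−(t − t_f)/τ) integrates to a formed-mass fraction of 1 − exp(−(t − t_f)/τ). It ignores the mass that stars return to the gas as they evolve. The formation time of 0.47 Gyr is an assumed age for zf = 10 in a generic flat LCDM cosmology (H0 = 70 km/s/Mpc, Ωm = 0.3), and the function names are made up for this example.

```python
import math

# Assumed cosmic age at z_f = 10 for H0 = 70 km/s/Mpc, Omega_m = 0.3 (illustrative).
T_FORM_GYR = 0.47


def monolithic_mass_fraction(age_gyr, t_form_gyr=T_FORM_GYR, tau_gyr=1.0):
    """Fraction of the final stellar mass formed by cosmic age `age_gyr`
    for a star formation rate proportional to exp(-(t - t_form) / tau).
    Mass returned by evolving stars is ignored in this sketch."""
    elapsed = age_gyr - t_form_gyr
    if elapsed <= 0.0:
        return 0.0
    return 1.0 - math.exp(-elapsed / tau_gyr)


def half_mass_delay(tau_gyr=1.0):
    """Time after formation at which half of the final mass has formed."""
    return tau_gyr * math.log(2.0)


half_mass_age = T_FORM_GYR + half_mass_delay()
print(f"half of the stars are in place by a cosmic age of {half_mass_age:.2f} Gyr")
for age in (0.5, 1.0, 2.0, 4.0, 7.0):
    print(f"age {age:4.1f} Gyr: fraction formed = {monolithic_mass_fraction(age):.3f}")
```

For τ = 1 Gyr the half-mass point comes about 0.7 Gyr after formation, so under these assumptions the monolithic model has half its stars in place a little over a billion years after the Big Bang, compared with the roughly 7 Gyr typical of the simulated merger trees.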

In a remarkably short period, JWST has transformed z = 10 from inconceivable to run of the mill. I’m not going to go into the data yet – this all-theory post is already a lot – but to offer one spoiler: the data are consistent with this monolithic model. If we want to “fix” LCDM, we have to make the red line into the purple line for enough objects to explain the data. That proves to be challenging. But that’s moving the goalposts; the prediction was that we should see little protogalaxies at high redshift, not massive, monolith-style objects. Just look at the merger trees at z = 10!

Accelerated Structure Formation in MOND

In order to address these issues in MOND, we have to go back to the beginning. What is the evolution of a spherical region (a top-hat overdensity) that might collapse to form a galaxy? How does a spherical region under the influence of MOND evolve within an expanding universe?

The solution to this problem was first found by Felten (1984), who was trying to play the Newtonian cosmology trick in MOND. In conventional dynamics, one can solve the equation of motion for a point on the surface of a uniform sphere that is initially expanding and recover the essence of the Friedmann equation. It was reasonable to check if cosmology might be that simple in MOND. It was not. The appearance of a0 as a physical scale makes the solution scale-dependent: there is no general solution that one can imagine applies to the universe as a whole.

Felten reasonably saw this as a failure. There were, however, some appealing aspects of his solution. For one, there was no such thing as a critical density. All MOND universes would eventually recollapse irrespective of their density (in the absence of the repulsion provided by a cosmological constant). It could take a very long time, which depended on the density, but the ultimate fate was always the same. There was no special value of Ω, and hence no flatness problem. The latter obsessed people at the time, so I’m somewhat surprised that no one seems to have made this connection. Too soon*, I guess.

There it sat for many years, an obscure solution for an obscure theory to which no one gave credence. When I became interested in the problem a decade later, I started methodically checking all the classic results. I was surprised to find how many things we needed dark matter to explain were just as well (or better) explained by MOND. My exact quote was “surprised the bejeepers out of us.” So, what about galaxy formation?

I started with the top-hat overdensity, and had the epiphany that Felten had already obtained the solution. He had been trying to solve all of cosmology, which didn’t work. But he had solved the evolution of a spherical region that starts out expanding with the rest of the universe but subsequently collapses under the influence of MOND. The overdensity didn’t need to be large, it just needed to be in the low acceleration regime. Something like the red cycloidal line in the second plot above could happen in a finite time. But how much?

The solution depends on scale and needs to be solved numerically. I am not the greatest programmer, and I had a lot else on my plate at the time. I was in no rush, as I figured I was the only one working on it. This is usually a good assumption with MOND, but not in this case. Bob Sanders had had the same epiphany around the same time, which I discovered when I received his manuscript to referee. So all credit is due to Bob: he said these things first.

First, he noted that galaxy formation in MOND is still hierarchical. Small things form first. Crudely speaking, structure formation is very similar to the conventional case, but now the goose comes from the change in the force law rather than extra dark mass. MOND is nonlinear, so the whole process gets accelerated. To compare with the linear growth of CDM:

A sketch of how structures grow over time under the influence of cold dark matter (left, from Schramm 1992, same as above) and MOND (right, from Sanders & McGaugh 2002; see also this further discussion and previous post). The slow linear growth of CDM (long-dashed line, left panel) is replaced by a rapid, nonlinear growth in MOND (solid lines at right; numbers correspond to different scales). Nonlinear growth moderates after cosmic expansion begins to accelerate (dashed vertical line in right panel).

The net effect is the same. A cosmic web of large scale structure emerges. They look qualitatively similar, but everything happens faster in MOND. This is why observations have persistently revealed structures that are more massive and were in place earlier than expected in contemporaneous LCDM models.

Simulated structure formation in ΛCDM (top) and MOND (bottom) showing the more rapid emergence of similar structures in MOND (note the redshift of each panel). From McGaugh (2015).

In MOND, small objects like globular clusters form first, but galaxies of a range of masses all collapse on a relatively short cosmic timescale. How short? Let’s consider our typical 10¹¹ M☉ galaxy. Solving Felten’s equation for the evolution of a sphere numerically, peak expansion is reached after 300 Myr and collapse happens in a similar time. The whole galaxy is in place speedy quick, and the initial conditions don’t really matter: a uniform, initially expanding sphere in the low acceleration regime will behave this way. From our distant vantage point thirteen billion years later, the whole process looks almost monolithic (the purple line above) even though it is a chaotic hierarchical mess for the first few hundred million years (z > 14). In particular, it is easy to form half of the stellar mass early on: the mass is already assembled.

The evolution of a 10¹¹ M☉ sphere that starts out expanding with the universe but decouples and collapses under the influence of MOND (dotted line). It reaches maximum expansion after 300 Myr and recollapses in a similar time, so the entire object is in place after 600 Myr. (A version of this plot with a logarithmic time axis appears as Fig. 2 in our paper.) The inset shows the evolution of smaller shells within such an object (Fig. 2 from Sanders 2008). The inner regions collapse first followed by outer shells. These oscillate and cross, mixing and ultimately forming a reasonable size galaxy – see Sanders’s Table 1 and also his Fig. 4 for the collapse times for objects of other masses. These early results are corroborated by Eappen et al. (2022), who further demonstrate that the details of feedback are not important in MOND, unlike LCDM.
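To give a flavor of what such a numerical solution involves, here is a deliberately stripped-down sketch. It is not Felten’s full equation nor the calculation by Sanders: it treats only the deep-MOND limit, where the acceleration is √(g_N a0), so a shell of radius r enclosing mass M feels √(G M a0)/r. It ignores the Newtonian regime at small radii, the cosmological background, and shell crossing. The initial radius and velocity are hypothetical values chosen only to put the shell in the low acceleration regime; they are not calibrated to reproduce the 300 Myr quoted above. The function name is likewise made up for this example.

```python
import math

G_KPC = 4.30091e-6          # Newton's constant in kpc (km/s)^2 / Msun
A0_KPC = 3703.0             # a0 = 1.2e-10 m/s^2 expressed in (km/s)^2 / kpc
GYR_PER_TIME_UNIT = 0.9778  # 1 kpc / (km/s) expressed in Gyr


def deep_mond_turnaround(mass_msun, r_init_kpc, v_init_kms, dt=1e-5):
    """Integrate r'' = -sqrt(G M a0) / r for an initially expanding shell
    until it stops expanding. Deep-MOND limit only (an assumption of this
    sketch). Returns (time to maximum expansion in Gyr, maximum radius in kpc).
    dt is the time step in kpc / (km/s)."""
    v_flat_sq = math.sqrt(G_KPC * mass_msun * A0_KPC)  # (km/s)^2
    r, v, t = r_init_kpc, v_init_kms, 0.0
    acc = -v_flat_sq / r
    while v > 0.0:
        # velocity-Verlet (leapfrog) step
        v_half = v + 0.5 * dt * acc
        r += dt * v_half
        acc = -v_flat_sq / r
        v = v_half + 0.5 * dt * acc
        t += dt
    return t * GYR_PER_TIME_UNIT, r


# Hypothetical initial conditions: 20 kpc shell expanding at 250 km/s.
t_turn_gyr, r_max_kpc = deep_mond_turnaround(1e11, 20.0, 250.0)
print(f"maximum expansion: r = {r_max_kpc:.1f} kpc after {t_turn_gyr * 1000:.0f} Myr")
```

The logarithmic potential means every shell turns around eventually, whatever its starting velocity: energy conservation gives r_max = r_i exp(v_i² / (2√(G M a0))), which offers a handy check on the integrator. This is the absence of a critical density that Felten noted.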

This is what JWST sees: galaxies that are already massive when the universe is just half a billion years old. I’m sure I should say more but I’m exhausted now and you may be too, so I’m gonna stop here by noting that in 1998, when Bob Sanders predicted that “Objects of galaxy mass are the first virialized objects to form (by z=10),” the contemporaneous prediction of LCDM was that “present-day disc [galaxies] were assembled recently (at z<=1)” and “there is nothing above redshift 7.” One of these predictions has been realized. It is rare in science that such a clear a priori prediction comes true, let alone one that seemed so unreasonable at the time, and which took a quarter century to corroborate.


*I am not quite this old: I was still an undergraduate in 1984. I hadn’t even decided to be an astronomer at that point; I certainly hadn’t started following the literature. The first time I heard of MOND was in a graduate course taught by Doug Richstone in 1988. He only mentioned it in passing while talking about dark matter, writing the equation on the board and saying maybe it could be this. I recall staring at it for a long few seconds, then shaking my head and muttering “no way.” I then completely forgot about it, not thinking about it again until it came up in our data for low surface brightness galaxies. I expect most other professionals have the same initial reaction, which is fair. The test of character comes when it crops up in their data, as it is doing now for the high redshift galaxy community.

Nobel prizes that were, that might have been, and others that have not yet come to pass

The time is approaching when Nobel prizes are awarded. This inevitably leads to a lot of speculation and chattering rumor. Last year one publication, I think it was Physics Today, went so far as to publish a list of things various people thought should be recognized. This aspirational list was led, of course, by dark matter. It was even formatted the way prize awards are phrased, saying something like “the prize goes to [blank] for the discovery of dark matter.” This would certainly be a prize-worthy discovery, if made. So far it hasn’t been, and I expect it never will be: blank will remain blank forever. I’d be happy to be proved wrong, as forever is a long time to wait for corroboration of this prediction.

While the laboratory detection of dark matter is a slam-dunk for a Nobel prize, there are plenty of discoveries that drive the missing mass problem that are already worthy of this recognition. The issue is too big for a single prize. Laboratory detection would be the culmination of a search that has been motivated by astronomical observations. The Nobel prize in physics has sometimes been awarded for astronomical discoveries – and should be, for those that impact fundamental physics or motivate entire fields like the search for dark matter – so let’s think about what those might be.

An obvious historical example would be Kepler’s Laws. Kepler predates Nobel by a few centuries, but there is no doubt that his identification of the eponymous laws of planetary motion impacted fundamental physics, being one of the key sets of facts that led Newton to his universal law of gravity. Whether Tycho Brahe should also be named as the person who made the observations on which Kepler’s work is based is the sort of question the prize committee has to wrestle with. I would say yes: the prize is for “the person who shall have made the most important discovery or invention within the field of physics.” In this case, the discovery that led to gravity was a set of rules – how the orbits of planets behave – that required both observational work (Brahe’s) and numerical analysis (Kepler’s) to achieve.

One could of course also give a prize to Newton some decades later, though theories are not generally considered discoveries. The line can be hazy. For example, the Nobel Prize in Physics 1921 was awarded to Albert Einstein “for his services to Theoretical Physics, and especially for his discovery of the law of the photoelectric effect.” The “especially” is reserved for the empirical law, not relativity, though I guess “services to theoretical physics” is doing a lot of work there.

Reading up on that I was mildly surprised to learn that the committee had a hard time finding deserving recipients, initially skipping 1918 and 1921 but awarding those prizes in the subsequent year to Planck and Einstein, respectively. I wonder if they struggled with the definition of discovery: need it be experimental? For many, the answer is yes. A theory by itself, untethered from experimental or observational corroboration, does not a discovery make.

I don’t think they need to skip years any more, as the list of plausible nominees has grown so long that deserving people die waiting to be recognized: the Nobel prize is not awarded posthumously. The story is that this is what happened to both Henrietta Leavitt (who discovered the Cepheid period-luminosity relation) and Edwin Hubble (who used Leavitt’s relation for Cepheids to measure distances to other galaxies, thereby changing the course of cosmology). There is also the issue of what counts as physics. At the time, these were very astronomical discoveries. In retrospect, it is obvious that the impact Hubble had on cosmology counts as physics as well.

The same can be said for the discovery of flat rotation curves. I have made the case before that Vera Rubin and Albert Bosma (and arguably others) deserve the Nobel prize for this discovery. Note that I do not say the discovery of dark matter, because (1) that’s not what they did*, and (2) flat rotation curves are enough. Flat rotation curves are a de facto law of nature. That’s enough, every bit as much as Einstein’s “discovery of the law of the photoelectric effect.” A laboratory detection of dark matter would be another discovery worthy of a Nobel prize, but we already missed out on recognizing Rubin for this one.

Conflating discoveries with their interpretation has precluded recognition of other important astronomical discoveries – discoveries that implicate basic physics regardless of their ultimate interpretation, be it cold dark matter or MOND or something else we have yet to figure out. So, what are some others?

One obvious one is the Tully-Fisher relation. This is another de facto law of nature. Tully has been recognized for his work with the Gruber prize, so it’s not like it hasn’t been recognized. What remains lacking is recognition that this is a fundamental law of physics, at least the baryonic version when flat rotation speeds are measured.

Philip Mannheim pointed out to me that Milgrom deserves the prize for the discovery of the acceleration scale a0. This is a new constant of nature. That’s enough.

Milgrom went further, developing the whole MOND paradigm around this new scale. But that is extra credit material that needn’t be correct. Unfortunately, the controversial nature of MOND, deserved or not, serves to obscure that there is a new constant of nature whose discovery is analogous to Planck’s discovery of his eponymous constant. People argue over whether a0 is a single constant (it is) or whether it evolves over cosmic time (not so far as I can tell). The latter objection could be raised for Planck’s constant or Newton’s constant; these were established when it wasn’t possible to test whether their values might have varied over cosmic time. Now that we can, we do check! and so far, no: h, G, and a0 all appear to be constants of nature, to the extent we are able to perceive.

The above discoveries are all worthy of recognition by a Nobel prize. They are all connected by the radial acceleration relation, which is another worthy observational discovery in its own right. This is one that clearly transgresses the boundaries of physics and astronomy, as the early versions (Sanders 1990, McGaugh 1999, 2004) appeared in the astronomical literature, but more recent ones in the physics literature (McGaugh et al. 2016, Mistele et al. 2024). Sadly, the community seems perpetually stuck looping through the stages of Louis Agassiz’s progression of responses to scientific discoveries. It shouldn’t be: this is an empirical relation that has long been well established and repeatedly confirmed. It suffers from association with MOND, but no reference to MOND is made in the construction of the observed relation. It’s right there in the data:

The radial acceleration relation as traced by both early (red) and late (cyan) type galaxies via both kinematics and gravitational lensing. The low acceleration behavior maps smoothly onto the Newtonian behavior seen in the solar system at higher accelerations. If Newton’s discovery of the inverse square force law would warrant a Nobel prize, as surely it would had the prize existed in Newton’s time, then so does the discovery of a systematically new behavior.

*Rubin and Bosma both argued, sensibly, that the interpretation of flat rotation curves required dark matter. That’s an interpretation, not a discovery. That rotation curves were flat, over and over again in every galaxy examined, to indefinitely large radii, was the observational discovery.