Take it where?

I had written most of the post below the line before an exchange with a senior colleague who accused me of asking us to abandon General Relativity (GR). Anyone who read the last post knows that this is the opposite of true. So how does this happen?

Much of the field is mired in bad ideas that seemed like good ideas in the 1980s. There has been some progress, but the idea that MOND is an abandonment of GR I recognize as a misconception from that time. It arose because the initial MOND hypothesis suggested modifying the law of inertia without showing a clear path to how this might be consistent with GR. GR was built on the Equivalence Principle (EP), the equivalence¹ of gravitational charge with inertial mass. The original MOND hypothesis directly contradicted that, so it was a fair concern in 1983. It was not by 1984². I was still an undergraduate then, so I don’t know the sociology, but I get the impression that most of the community wrote MOND off at this point and never gave it further thought.

I guess this is why I still encounter people with this attitude, that someone is trying to rob them of GR. It feels like we’re always starting at square one, like there has been zero progress in forty years. I hope it isn’t that bad, but I admit my patience is wearing thin.

I’m trying to help you. Don’t waste your entire career chasing phantoms.

What MOND does ask us to abandon is the Strong Equivalence Principle. Not the Weak EP, nor even the Einstein EP. Just the Strong EP. That’s a much more limited ask than abandoning all of GR. Indeed, all flavors of EP are subject to experimental test. The Weak EP has been repeatedly validated, but there is nothing about MOND that implies platinum would fall differently from titanium. Experimental tests of the Strong EP are less favorable.

I understand that MOND seems impossible. It also keeps having its predictions come true. This combination is what makes it important. The history of science is chock full of ideas that were initially rejected as impossible or absurd, going all the way back to heliocentrism. The greater the cognitive dissonance, the more important the result.


Continuing the previous discussion of UT, where do we go from here? If we accept that maybe we have all these problems in cosmology because we’re piling on auxiliary hypotheses to continue to be able to approximate UT with FLRW, what now?

I don’t know.

It’s hard to accept that we don’t understand something we thought we understood. Scientists hate revisiting issues that seem settled. Feels like a waste of time. It also feels like a waste of time continuing to add epicycles to a zombie theory, be it LCDM or MOND or the phoenix universe or tired light or whatever fantasy reality you favor. So, painful as it may be, one has to find a little humility to step back and take account of what we know empirically independent of the interpretive veneer of theory.

As I’ve said before, I think we do know that the universe is expanding and passed through an early hot phase that bequeathed us the primordial abundances of the light elements (BBN) and the relic radiation field that we observe as the cosmic microwave background (CMB). There’s a lot more to it than that, and I’m not going to attempt to recite it all here.

Still, to give one pertinent example, BBN only works if the expansion rate is as expected during the epoch of radiation domination. So whatever is going on has to converge to that early on. This is hardly surprising for UT since it was stipulated to contain GR in the relevant limit, but we don’t actually know how it does so until we work out what UT is – a tall order that we can’t expect to accomplish overnight, or even over the course of many decades without a critical mass of scientists thinking about it (and not being vilified by other scientists for doing so).

Another example is that the cosmological principle – that the universe is homogeneous and isotropic – is observed to be true in the CMB. The temperature is the same all over the sky to one part in 100,000. That’s isotropy. The temperature is tightly coupled to the density, so if the temperature is the same everywhere, so is the density. That’s homogeneity. So both of the assumptions made by the cosmological principle are corroborated by observations of the CMB.

The cosmological principle is extremely useful for solving the equations of GR as applied to the whole universe. If the universe has a uniform density on average, then the solution is straightforward (though it is rather tedious to work through to the Friedmann equation). If the universe is not homogeneous and isotropic, then it becomes a nightmare to solve the equations. One needs to know where everything was for all of time.

Starting from the uniform condition of the CMB, it is straightforward to show that the assumption of homogeneity and isotropy should persist on large scales up to the present day. “Small” things like galaxies go nonlinear and collapse, but huge volumes containing billions of galaxies should remain in the linear regime and these small-scale variations average out. One cubic Gigaparsec will have the same average density as the next, and the next, so the cosmological principle continues to hold today.

Anyone spot the rub? I said homogeneity and isotropy should persist. This statement assumes GR. Perhaps it doesn’t hold in UT?

This aspect of cosmology is so deeply embedded in everything that we do in the field that it was only recently that I realized it might not hold absolutely – and I’ve been actively contemplating such a possibility for a long time. Shouldn’t have taken me so long. Felten (1984) realized right away that a MONDian universe would depart from isotropy by late times. I read that paper long ago but didn’t grasp the significance of that statement. I did absorb that in the absence of a cosmological constant (which no one believed in at the time), the universe would inevitably recollapse, regardless of what the density was. This seems like an elegant solution to the flatness/coincidence problem that obsessed cosmologists at the time. There is no special value of the mass density that provides an over/under line demarcating eternal expansion from eventual recollapse, so there is no coincidence problem. All naive MOND cosmologies share the same ultimate fate, so it doesn’t matter what we observe for the mass density.

MOND departs from isotropy for the same reason it forms structure fast: it is inherently non-linear. As well as predicting that big galaxies would form by z=10, Sanders (1998) correctly anticipated the size of the largest structures collapsing today (things like the local supercluster Laniakea) and the scale of homogeneity (a few hundred Mpc if there is a cosmological constant). Pretty much everyone who looked into it came to similar conclusions.
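To make the non-linearity concrete, here is a minimal sketch. It assumes the commonly used “simple” interpolation function μ(x) = x/(1+x) and an acceleration scale a0 ≈ 1.2 × 10⁻¹⁰ m/s²; neither is specified in this post, and the function name is hypothetical. In the Newtonian regime, doubling the source mass doubles the acceleration. In the low acceleration regime it grows only by about √2, so the response to a sum of masses is not the sum of the responses.

```python
import math

A0 = 1.2e-10  # m/s^2; assumed value of the MOND acceleration scale


def mond_acceleration(g_newton, a0=A0):
    """Solve a * mu(a / a0) = g_newton for a, using the 'simple'
    interpolation function mu(x) = x / (1 + x) (an assumption).

    This reduces to the quadratic a^2 - g_newton * a - g_newton * a0 = 0.
    """
    return 0.5 * g_newton + math.sqrt(0.25 * g_newton**2 + g_newton * a0)


# Double the Newtonian acceleration (i.e. double the source mass)
# far below and far above a0, and compare the response.
g_low = 1e-4 * A0
g_high = 1e4 * A0
ratio_low = mond_acceleration(2 * g_low) / mond_acceleration(g_low)
ratio_high = mond_acceleration(2 * g_high) / mond_acceleration(g_high)

print(f"low acceleration regime:  doubling the mass scales a by {ratio_low:.3f}")
print(f"high acceleration regime: doubling the mass scales a by {ratio_high:.3f}")
```

The same square-root response is why superposition fails: one cannot compute the field of each mass separately and add them up, which is what makes both structure formation and the large-scale background hard to treat analytically.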

But MOND and cosmology, as we know it in the absence of UT, are incompatible. Where LCDM encompasses both cosmology and the dynamics of bound systems (dark matter halos³), MOND addresses the dynamics of low acceleration systems (the most common examples being individual galaxies) but says nothing about cosmology. So how do we proceed?

For starters, we have to admit our ignorance. From there, one has to assume some expanding background – that much is well established – and ask what happens to particles responding to a MONDian force-law in this background, starting from the very nearly uniform initial condition indicated by the CMB. From that simple starting point, it turns out one can get a long way without knowing the details of the cosmic expansion history or the metric that so obsess cosmologists. These are interesting things, to be sure, but they are aspects of UT we don’t know and can manage without to some finite extent.

For one, the thermal history of the universe is pretty much the same with or without dark matter, with or without a cosmological constant. Without dark matter, structure can’t get going until after thermal decoupling (when the matter is free to diverge thermally from the temperature of the background radiation). After that happens, around z = 200, the baryons suddenly find themselves in the low acceleration regime, newly free to respond to the nonlinear force of MOND, and structure starts forming fast, with the consequences previously elaborated.

But what about the expansion history? The geometry? The big questions of cosmology?

Again, I don’t know. MOND is a dynamical theory that extends Newton. It doesn’t address these questions. Hence the need for UT.

I’ve encountered people who refuse to acknowledge⁴ that MOND gets predictions like z=10 galaxies right without a proper theory for cosmology. That attitude puts the cart before the horse. One doesn’t look for UT unless well motivated. That one is able to correctly predict 25 years in advance something that comes as a huge surprise to cosmologists today is the motivation. Indeed, the degree of surprise and the longevity of the prediction amplify the motivation: if this doesn’t get your attention, what possibly could?

There is no guarantee that our first attempt at UT (or our second or third or fourth) will work out. It is possible that in the search for UT, one comes up with a theory that fails to do what was successfully predicted by the more primitive theory. That just lets you know you’ve taken a wrong turn. It does not mean that a correct UT doesn’t exist, or that the initial prediction was some impossible fluke.

One candidate theory for UT is bimetric MOND. This appears to justify the assumptions made by Sanders’s early work, and provide a basis for a relativistic theory that leads to rapid structure formation. Whether it can also fit the acoustic power spectrum of the CMB as well as LCDM and AeST has yet to be seen. These things take time and effort. What they really need is a critical mass of people working on the problem – a community that enjoys the support of other scientists and funding institutions like NSF. Until we have that⁵, progress will remain grudgingly slow.


¹ The equivalence of gravitational charge and inertial mass means that the m in F = GMm/d² is identically the same as the m in F = ma. Modified gravity changes the former; modified inertia the latter.

² Bekenstein & Milgrom (1984) showed how a modification of Newtonian gravity could avoid the non-conservation issues suffered by the original hypothesis of modified inertia. They also outlined a path towards a generally covariant theory that Bekenstein pursued for the rest of his life. That he never managed to obtain a completely satisfactory version is often cited as evidence that it can’t be done, since he was widely acknowledged as one of the smartest people in the field. One wonders why he persisted if, as these detractors would have us believe, the smart thing to do was not even try.

³ The data for galaxies do not look like the dark matter halos predicted by LCDM.

⁴ I have entirely lost patience with this attitude. If a phenomenon is correctly predicted in advance in the literature, we are obliged as scientists to take it seriously+. Pretending that it is not meaningful in the absence of UT is just an avoidance strategy: an excuse to ignore inconvenient facts.

+ I’ve heard eminent scientists describe MOND’s predictive ability as “magic.” This also seems like an avoidance strategy. I, for one, do not believe in magic. That it works as well as it does – that it works at all – must be telling us something about the natural world, not the supernatural.

⁵ There does exist a large and active community of astroparticle physicists trying to come up with theories for what the dark matter could be. That’s good: that’s what needs to happen, and we should exhaust all possibilities. We should do the same for new dynamical theories.

What we have here is a failure to communicate

Kuhn noted that as paradigms reach their breaking point, there is a divergence of opinions between scientists about what the important evidence is, or what even counts as evidence. This has come to pass in the debate over whether dark matter or modified gravity is a better interpretation of the acceleration discrepancy problem. It sometimes feels like we’re speaking about different topics in a different language. That’s why I split the diagram version of the dark matter tree as I did:

Evidence indicating acceleration discrepancies in the universe and various flavors of hypothesized solutions.

Astroparticle physicists seem to be well-informed about the cosmological evidence (top) and favor solutions in the particle sector (left). As more of these people entered the field in the ’00s and began attending conferences where we overlapped, I recognized gaping holes in their knowledge about the dynamical evidence (bottom) and related hypotheses (right). This was part of my motivation to develop an evidence-based course¹ on dark matter, to try to fill in the gaps in essential knowledge that were obviously being missed in the typical graduate physics curriculum. Though popular on my campus, not everyone in the field has the opportunity to take this course. It seems that the chasm has continued to grow, though not for lack of attempts at communication.

Part of the problem is a phase difference: many of the questions that concern astroparticle physicists (structure formation is a big one) were addressed 20 years ago in MOND. There is also a difference in texture: dark matter rarely predicts things but always explains them, even if it doesn’t. MOND often nails some predictions but leaves other things unexplained – just a complete blank. So they’re asking questions that are either way behind the curve or as-yet unanswerable. Progress rarely follows a smooth progression in linear time.

I have become aware of a common construction among many advocates of dark matter to criticize “MOND people.” First, I don’t know what a “MOND person” is. I am a scientist who works on a number of topics, among them both dark matter and MOND. I imagine the latter makes me a “MOND person,” though I still don’t really know what that means. It seems to be a generic straw man. Users of this term consistently paint such a luridly ridiculous picture of what MOND people do or do not do that I don’t recognize it as a legitimate depiction of myself or of any of the people I’ve met who work on MOND. I am left to wonder, who are these “MOND people”? They sound very bad. Are there any here in the room with us?

I am under no illusions as to what these people likely say when I am out of earshot. Someone recently pointed me to a comment on Peter Woit’s blog that I would not have come across on my own. I am specifically named. Here is a screenshot:

From a reply to a post of Peter Woit on December 8, 2022. I omit the part about right-handed neutrinos as irrelevant to the discussion here.

This concisely pinpoints where the field² is at, both right and wrong. Let’s break it down.

let me just remind everyone that the primary reason to believe in the phenomenon of cold dark matter is the very high precision with which we measure the CMB power spectrum, especially modes beyond the second acoustic peak

This is correct, but it is not the original reason to believe in CDM. The history of the subject matters, as we already believed in CDM quite firmly before any modes of the acoustic power spectrum of the CMB were measured. The original reasons to believe in cold dark matter were (1) that the measured, gravitating mass density exceeds the mass density of baryons as indicated by BBN, so there is stuff out there with mass that is not normal matter, and (2) large scale structure has grown by a factor of 10⁵ from the very smooth initial condition indicated initially by the nondetection of fluctuations in the CMB, while normal matter (with normal gravity) can only get us a factor of 10³ (there were upper limits excluding this before there was a detection). Structure formation additionally imposes the requirement that whatever the dark matter is moves slowly (hence “cold”) and does not interact via electromagnetism in order to evade making too big an impact on the fluctuations in the CMB (hence the need, again, for something non-baryonic).
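The factor of 10³ follows from linear perturbation theory: in a matter-dominated universe with normal gravity, density contrasts grow in proportion to the scale factor, a = 1/(1+z). Here is a rough sketch of that arithmetic. It assumes the baryons are released near z ≈ 1100 (a round number not given in the text) and ignores any late-time slowing of growth; the function name is hypothetical.

```python
def linear_growth_factor(z_start, z_end=0.0):
    """Growth of a density contrast between two redshifts, assuming it
    scales with the scale factor a = 1 / (1 + z) (matter domination)."""
    return (1.0 + z_start) / (1.0 + z_end)


Z_RELEASE = 1100.0  # assumed redshift at which baryons are free to collapse
REQUIRED_GROWTH = 1e5  # growth factor quoted in the text

available = linear_growth_factor(Z_RELEASE)
shortfall = REQUIRED_GROWTH / available

print(f"growth available to baryons alone: ~{available:.0f}")
print(f"shortfall relative to 1e5: ~{shortfall:.0f}x")
```

The shortfall of roughly two orders of magnitude is what something else has to supply, whether a mass component that starts collapsing before the baryons are released or a force law that grows structure faster.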

When cold dark matter became accepted as the dominant paradigm, fluctuations in the CMB had not yet been measured. The absence of observable fluctuations at a larger level sufficed to indicate the need for CDM. This, together with Ω_m > Ω_b from BBN (which seemed the better of the two arguments at the time), sufficed to convince me, along with most everyone else who was interested in the problem, that the answer had³ to be CDM.

This all happened before the first fluctuations were observed by COBE in 1992. By that time, we already believed firmly in CDM. The COBE observations caused initial confusion and great consternation – it was too much! We actually had a prediction from then-standard SCDM, and it had predicted an even lower level of fluctuations than what COBE observed. This did not cause us (including me) to doubt CDM (though there was one suggestion that it might be due to self-interacting dark matter); it seemed a mere puzzle to accommodate, not an anomaly. And accommodate it we did: the power in the large scale fluctuations observed by COBE is part of how we got LCDM, albeit only a modest part. A lot of younger scientists seem to have been taught that the power spectrum is some incredibly successful prediction of CDM when in fact it has surprised us at nearly every turn.

As I’ve related here before, it wasn’t until the end of the century that CMB observations became precise enough to provide a test that might distinguish between CDM and MOND. That test initially came out in favor of MOND – or at least in favor of the absence of dark matter: No-CDM, which I had suggested as a proxy for MOND. Cosmologists and dark matter advocates consistently omit this part of the history of the subject.

I had hoped that when MOND cropped up in their data, cosmologists would experience the same surprise and doubt and reevaluation that I had experienced when it cropped up in mine. Instead, they went into denial, ignoring the successful prediction of the first-to-second peak amplitude ratio, or, worse, making up stories that it hadn’t happened. Indeed, the amplitude of the second peak was so surprising that the first paper to measure it omitted mention of it entirely. Just didn’t talk about it, let alone admit that “Gee, this crazy prediction came true!” as I had with MOND in LSB galaxies. Consequently, I decided that it was better to spend my time working on topics where progress could be made. This is why most of my work on the CMB predates “modes beyond the second peak” just as our strong belief in CDM also predated that evidence. Indeed, communal belief in CDM was undimmed when the modes defining the second peak were observed, despite the No-CDM proxy for MOND being the only hypothesis to correctly predict it quantitatively a priori.

That said, I agree with Clayton’s assessment that

CDM thinks [the second and third peak] should be about the same

That this is the best evidence now is both correct and a much weaker argument than it is made out to be. It sounds really strong, because a formal fit to the CMB data requires a dark matter component at extremely high confidence – something approaching 100 sigma. This analysis assumes that dark matter exists. It does not contemplate that something else might cause the same effect, so all it really does, yet again, is demonstrate that General Relativity cannot explain cosmology when restricted to the material entities we concretely know to exist.

Given the timing, the third peak was not a strong element of my original prediction, as we did not yet have either a first or second peak. We hadn’t yet clearly observed peaks at all, so what I was doing was pretty far-sighted, but I wasn’t thinking that far ahead. However, the natural prediction for the No-CDM picture I was considering was indeed that the third peak should be lower than the second, as I’ve discussed before.

The No-CDM model (blue line) that correctly predicted the amplitude of the second peak fails to predict that of the third. Data from the Planck satellite; model line from McGaugh (2004); figure from McGaugh (2015).

In contrast, in CDM, the acoustic power spectrum of the CMB can do a wide variety of things:

Acoustic power spectra calculated for the CMB for a variety of cosmic parameters. From Dodelson & Hu (2002).

Given the diversity of possibilities illustrated here, there was never any doubt that a model could be fit to the data, provided that oscillations were observed as expected in any of the theories under consideration here. Consequently, I do not find fits to the data, though excellent, to be anywhere near as impressive as commonly portrayed. What does impress me is consistency with independent data.

What impresses me even more are a priori predictions. These are the gold standard of the scientific method. That’s why I worked my younger self’s tail off to make a prediction for the second peak before the data came out. In order to make a clean test, you need to know what both theories predict, so I did this for both LCDM and No-CDM. Here are the peak ratios predicted before there were data to constrain them, together with the data that came after:

The first-to-second (left) and second-to-third (right) peak amplitude ratios in LCDM (red) and No-CDM (blue) as predicted by Ostriker & Steinhardt (1995) and McGaugh (1999). Subsequent data as labeled.

The left hand panel shows the predicted amplitude ratio of the first-to-second peak, A1:2. This is the primary quantity that I predicted for both paradigms. There is a clear distinction between the predicted bands. I was not unique in my prediction for LCDM; the same thing can be seen in other contemporaneous models. All contemporaneous models. I was the only one who was not surprised by the data when they came in, as I was the only one who had considered the model that got the prediction right: No-CDM.

The same No-CDM model fails to correctly predict the second-to-third peak ratio, A2:3. It is, in fact, way off, while LCDM is consistent with A2:3, just as Clayton says. This is a strong argument against No-CDM, because No-CDM makes a clear and unequivocal prediction that it gets wrong. Clayton calls this

a stone-cold, qualitative, crystal clear prediction of CDM

which is true. It is also qualitative, so I call it weak sauce. LCDM could be made to fit a very large range of A2:3, but it had already got A1:2 wrong. We had to adjust the baryon density outside the allowed range in order to make it consistent with the CMB data. The generous upper limit that LCDM might conceivably have predicted in advance of the CMB data was A1:2 < 2.06, which is still clearly less than observed. For the first years of the century, the attitude was that BBN had been close, but not quite right – preference being given to the value needed to fit the CMB. Nowadays, BBN and the CMB are said to be in great concordance, but this is only true if one restricts oneself to deuterium measurements obtained after the “right” answer was known from the CMB. Prior to that, practically all of the measurements for all of the important isotopes of the light elements, deuterium, helium, and lithium, all concurred that the baryon density Ω_b h² < 0.02, with the consensus value being Ω_b h² = 0.0125 ± 0.0005. This is barely half the value subsequently required to fit the CMB (Ω_b h² = 0.0224 ± 0.0001). But what’s a factor of two among cosmologists? (In this case, 4 sigma.)

Taking the data at face value, the original prediction of LCDM was falsified by the second peak. But, no problem, we can move the goal posts, in this case by increasing the baryon density. The successful prediction of the third peak only comes after the goal posts have been moved to accommodate the second peak. Citing only the comparable size of the third peak to the second while not acknowledging that the second was too small elides the critical fact that No-CDM got something right, a priori, that LCDM did not. No-CDM failed only after LCDM had already failed. The difference is that I acknowledge its failure while cosmologists elide this inconvenient detail. Perhaps the second peak amplitude is a fluke, but it was a unique prediction that was exactly nailed and remains true in all subsequent data. That’s a pretty remarkable fluke⁴.

LCDM wins ugly here by virtue of its flexibility. It has greater freedom to fit the data – any of the models in the figure of Dodelson & Hu will do. In contrast, No-CDM is the single blue line in my figure above, and nothing else. Plausible variations in the baryon density make hardly any difference: A1:2 has to have the value that was subsequently observed, and no other. It passed that test with flying colors. It flunked the subsequent test posed by A2:3. For LCDM this isn’t even a test, it is an exercise in fitting the data with a model that has enough parameters⁵ to do so.

There were a number of years at the beginning of the century during which the No-CDM prediction for A1:2 was repeatedly confirmed by multiple independent experiments, but before the third peak was convincingly detected. During this time, cosmologists exhibited the same attitude that Clayton displays here: the answer has to be CDM! This warrants mention because the evidence Clayton cites did not yet exist. Clearly the as-yet unobserved third peak was not the deciding factor.

In those days, when No-CDM was the only correct a priori prediction, I would point out to cosmologists that it had got A1:2 right when I got the chance (which was rarely: I was invited to plenty of conferences in those days, but none on the CMB). The typical reaction was outright denial⁶, though sometimes it warranted a dismissive “That’s not a MOND prediction.” The latter is a fair criticism. No-CDM is just General Relativity without CDM. It represented MOND as a proxy under the ansatz that MOND effects had not yet manifested in a way that affected the CMB. I expected that this ansatz would fail at some point, and discussed some of the ways that this should happen. One that’s relevant today is that galaxies form early in MOND, so reionization happens early, and gravitational lensing effects are amplified. There is evidence for both of these now. What I did not anticipate was a departure from a damping spectrum around L=600 (between the second and third peaks). That’s a clear deviation from the prediction, which falsifies the ansatz but not MOND itself. After all, they were correct in noting that this wasn’t a MOND prediction per se, just a proxy. MOND, like Newtonian dynamics before it, is relativity adjacent, but not itself a relativistic theory. Neither can explain the CMB on their own. If you find that an unsatisfactory answer, imagine how I feel.

The same people who complained then that No-CDM wasn’t a real MOND prediction now want to hold MOND to the No-CDM predicted power spectrum and nothing else. First it was “the second peak isn’t a real MOND prediction!” Then, when the third peak was observed, it became “no way MOND can do this!” This isn’t just hypocritical, it is bad science. The obvious way to proceed would be to build on the theory that had the greater, if incomplete, predictive success. Instead, the reaction has consistently been to cherry-pick the subset of facts that precludes the need for serious rethinking.

This brings us to sociology, so let’s examine some more of what Clayton has to say:

Any talk I’ve ever seen by McGaugh (or more exotic modified gravity people like Verlinde) elides this fact, and they evade the questions when I put my hand up to ask. I have invited McGaugh to a conference before specifically to discuss this point, and he just doesn’t want to.

Now you’re getting personal.

There is so much to unpack here, I hardly know where to start. By saying I “elide this fact” about the qualitative equality of the second and third peak, Clayton is basically accusing me of lying by omission. This is pretty rich coming from a community that consistently elides the history I relate above, and never addresses the question raised by MOND’s predictive power.

Intellectual honesty is very important to me – being honest that MOND predicted what I saw in low surface brightness galaxies where my own prediction was wrong is what got me into this mess in the first place. It would have been vastly more convenient to pretend that I never heard of MOND (at first I hadn’t⁷) and act like that never happened. That would be a lie of omission. It would be a large lie, a lie that denies an important aspect of how the world works (what we’re supposed to uncover through science), the sort of lie that cleric Paul Gerhardt may have had in mind when he said

When a man lies, he murders some part of the world.

Paul Gerhardt

Clayton is, in essence, accusing me of exactly that by failing to mention the CMB in talks he has seen. That might be true – I give a lot of talks. He hasn’t been to most of them, and I usually talk about things I’ve done more recently than 2004. I’ve commented explicitly on this complaint before

There’s only so much you can address in a half hour talk. [This is a recurring problem. No matter what I say, there always seems to be someone who asks “why didn’t you address X?” where X is usually that person’s pet topic. Usually I could do so, but not in the time allotted.]

– so you may appreciate my exasperation at being accused of dishonesty by someone whose complaint is so predictable that I’ve complained before about people who make this complaint. I’m only human – I can’t cover all subjects for all audiences every time all the time. Moreover, I do tend to choose to discuss subjects that may be news to an audience, not simply reprise the greatest hits they want to hear. Clayton obviously knows about the third peak; he doesn’t need to hear about it from me. This is the scientific equivalent of shouting Freebird! at a concert.

It isn’t like I haven’t talked about it. I have been rigorously honest about the CMB, and certainly have not omitted mention of the third peak. Here is a comment from February 2003 when the third peak was only tentatively detected:

Page et al. (2003) do not offer a WMAP measurement of the third peak. They do quote a compilation of other experiments by Wang et al. (2003). Taking this number at face value, the second to third peak amplitude ratio is A2:3 = 1.03 +/- 0.20. The LCDM expectation value for this quantity was 1.1, while the No-CDM expectation was 1.9. By this measure, LCDM is clearly preferable, in contradiction to the better measured first-to-second peak ratio.
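As a quick check of the arithmetic in that comment, here is a minimal sketch. It treats the quoted uncertainty as a symmetric Gaussian error, which is an assumption, and the helper name is hypothetical.

```python
def sigma_offset(predicted, measured, uncertainty):
    """Distance between a prediction and a measurement, in units of
    the measurement uncertainty (assumed symmetric and Gaussian)."""
    return abs(predicted - measured) / uncertainty


# Second-to-third peak amplitude ratio quoted from Wang et al. (2003).
A23_MEASURED = 1.03
A23_ERROR = 0.20

lcdm_offset = sigma_offset(1.1, A23_MEASURED, A23_ERROR)
nocdm_offset = sigma_offset(1.9, A23_MEASURED, A23_ERROR)

print(f"LCDM   expectation is {lcdm_offset:.2f} sigma from the measurement")
print(f"No-CDM expectation is {nocdm_offset:.2f} sigma from the measurement")
```

Even with the large uncertainty of that early compilation, the LCDM expectation sits well inside the error bar while the No-CDM expectation is more than four sigma away.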

Or here, in March 2006:

the Boomerang data and the last credible point in the 3-year WMAP data both have power that is clearly in excess of the no-CDM prediction. The most natural interpretation of this observation is forcing by a mass component that does not interact with photons, such as non-baryonic cold dark matter.

There are lots like this, including my review for CJP and this talk given at KITP where I had been asked to explicitly take the side of MOND in a debate format for an audience of largely particle physicists. The CMB, including the third peak, appears on the fourth slide, which is right up front, not being elided at all. In the first slide, I tried to encapsulate the attitudes of both sides:

I did the same at a meeting in Stony Brook where I got a weird vibe from the audience; they seemed to think I was lying about the history of the second peak that I recount above. It will be hard to agree on an interpretation if we can’t agree on documented historical facts.

More recently, this image appears on slide 9 of this lecture from the cosmology course I just taught (Fall 2022):

I recognize this slide from talks I’ve given over the past five plus years; this class is the most recent place I’ve used it, not the first. On some occasions I wrote “The 3rd peak is the best evidence for CDM.” I do not recall all of the talks in which I used this; many of them were likely colloquia for physics departments where one has more time to cover things than in a typical conference talk. Regardless, these apparently were not the talks that Clayton attended. Rather than it being the case that I never address this subject, the more conservative interpretation of the experience he relates would be that I happened not to address it in the small subset of talks that he happened to attend.

But do go off, dude: tell everyone how I never address this issue and evade questions about it.

I have been extraordinarily patient with this sort of thing, but I confess to a great deal of exasperation at the perpetual whataboutism that many scientists engage in. It is used reflexively to shut down discussion of alternatives: dark matter has to be right for this reason (here the CMB); nothing else matters (galaxy dynamics), so we should forbid discussion of MOND. Even if dark matter proves to be correct, the CMB is being used as an excuse to not address the question of the century: why does MOND get so many predictions right? Any scientist with a decent physical intuition who takes the time to rub two brain cells together in contemplation of this question will realize that there is something important going on that simply invoking dark matter does not address.

In fairness to McGaugh, he pointed out some very interesting features of galactic DM distributions that do deserve answers. But it turns out that there are a plurality of possibilities, from complex DM physics (self interactions) to unmodelable SM physics (stellar feedback, galaxy-galaxy interactions). There are no such alternatives to CDM to explain the CMB power spectrum.

Thanks. This is nice, and why I say it would be easier to just pretend to never have heard of MOND. Indeed, this succinctly describes the trajectory I was on before I became aware of MOND. I would prefer to be recognized for my own work – of which there is plenty – than an association with a theory that is not my own – an association that is born of honestly reporting a surprising observation. I find my reception to be more favorable if I just talk about the data, but what is the point of taking data if we don’t test the hypotheses?

I have gone to great extremes to consider all the possibilities. There is not a plurality of viable possibilities; most of these things do not work. The specific ideas that are cited here are known not to work. SIDM appears to work because it has more free parameters than are required to describe the data. This is a common failing of dark matter models that simply fit some functional form to observed rotation curves. They can be made to fit the data, but they cannot be used to predict the way MOND can.

Feedback is even worse. Never mind the details of specific feedback models, and think about what is being said here: the observations are to be explained by “unmodelable [standard model] physics.” This is a way of saying that dark matter claims to explain the phenomena while declining to make a prediction. Don’t worry – it’ll work out! How can that be considered better than or even equivalent to MOND when many of the problems we invoke feedback to solve are caused by the predictions of MOND coming true? We’re just invoking unmodelable physics as a deus ex machina to make dark matter models look like something they are not. Are physicists straight-up asserting that it is better to have a theory that is unmodelable than one that makes predictions that come true?

Returning to the CMB, are there no “alternatives to CDM to explain the CMB power spectrum”? I certainly do not know how to explain the third peak with the No-CDM ansatz. For that we need a relativistic theory, like Bekenstein’s TeVeS. This initially seemed promising, as it solved the long-standing problem of gravitational lensing in MOND. However, it quickly became clear that it did not work for the CMB. Nevertheless, I learned from this that there could be more to the CMB oscillations than allowed by the simple No-CDM ansatz. The scalar field (an entity theorists love to introduce) in TeVeS-like theories could play a role analogous to cold dark matter in the oscillation equations. That means that what I thought was a killer argument against MOND – the exact same argument Clayton is making – is not as absolute as I had thought.

Writing down a new relativistic theory is not trivial. It is not what I do. I am an observational astronomer. I only play at theory when I can’t get telescope time.

Comic from the Far Side by Gary Larson.

So in the mid-00’s, I decided to let theorists do theory and started the first steps in what would ultimately become the SPARC database (it took a decade and a lot of effort by Jim Schombert and Federico Lelli in addition to myself). On the theoretical side, it also took a long time to make progress because it is a hard problem. Thanks to work by Skordis & Zlosnik on a theory they [now] call AeST8, it is possible to fit the acoustic power spectrum of the CMB:

CMB power spectrum observed by Planck fit by AeST (Skordis & Zlosnik 2021).

This fit is indistinguishable from that of LCDM.

I consider this to be a demonstration, not necessarily the last word on the correct theory, but hopefully an iteration towards one. The point here is that it is possible to fit the CMB. That’s all that matters for our current discussion: contrary to the steady insistence of cosmologists over the past 15 years, CDM is not the only way to fit the CMB. There may be other possibilities that we have yet to figure out. Perhaps even a plurality of possibilities. This is hard work and to make progress we need a critical mass of people contributing to the effort, not shouting rubbish from the peanut gallery.

As I’ve done before, I like to take the language used in favor of dark matter, and see if it also fits when I put on a MOND hat:

As a galaxy dynamicist, let me just remind everyone that the primary reason to believe in MOND as a physical theory and not some curious dark matter phenomenology is the very high precision with which MOND predicts, a priori, the dynamics of low-acceleration systems, especially low surface brightness galaxies whose kinematics were practically unknown at the time of its inception. There is a stone-cold, quantitative, crystal clear prediction of MOND that the kinematics of galaxies follows uniquely from their observed baryon distributions. This is something CDM profoundly and irremediably gets wrong: it predicts that the dark matter halo should have a central cusp9 that is not observed, and makes no prediction at all for the baryon distribution, let alone does it account for the detailed correspondence between bumps and wiggles in the baryon distribution and those in rotation curves. This is observed over and over again in hundreds upon hundreds of galaxies, each of which has its own unique mass distribution so that each and every individual case provides a distinct, independent test of the hypothesized force law. In contrast, CDM does not even attempt a comparable prediction: rather than enabling the real-world application to predict that this specific galaxy will have this particular rotation curve, it can only refer to the statistical properties of galaxy-like objects formed in numerical simulations that resemble real galaxies only in the abstract, and can never be used to directly predict the kinematics of a real galaxy in advance of the observation – an ability that has been demonstrated repeatedly by MOND. The simple fact that the simple formula of MOND is so repeatably correct in mapping what we see to what we get is to me the most convincing way to see that we need a grander theory that contains MOND and exactly MOND in the low acceleration limit, irrespective of the physical mechanism by which this is achieved.

That is stronger language than I would ordinarily permit myself. I do so entirely to show the danger of being so darn sure. I actually agree with clayton’s perspective in his quote; I’m just showing what it looks like if we adopt the same attitude with a different perspective. The problems pointed out for each theory are genuine, and the supposed solutions are not obviously viable (in either case). Sometimes I feel like we’re up the proverbial creek without a paddle. I do not know what the right answer is, and you should be skeptical of anyone who is sure that he does. Being sure is the sure road to stagnation.


1It may surprise some advocates of dark matter that I barely touch on MOND in this course, only getting to it at the end of the semester, if at all. It really is evidence-based, with a focus on the dynamical evidence as there is a lot more to this than seems to be appreciated by most physicists*. We also teach a course on cosmology, where students get the material that physicists seem to be more familiar with.

*I once had a colleague who was in a physics department ask how to deal with opposition to developing a course on galaxy dynamics. Apparently, some of the physicists there thought it was not a rigorous subject worthy of an entire semester course – an attitude that is all too common. I suggested that she pointedly drop the textbook of Binney & Tremaine on their desks. She reported back that this technique proved effective.

2I do not know who clayton is; that screen name does not suffice as an identifier. He claims to have been in contact with me at some point, which is certainly possible: I talk to a lot of people about these issues. He is welcome to contact me again, though he may wish to consider opening with an apology.

3One of the hardest realizations I ever had as a scientist was that both of the reasons (1) and (2) that I believed to absolutely require CDM assumed that gravity was normal. If one drops that assumption, as one must to contemplate MOND, then these reasons don’t require CDM so much as they highlight that something is very wrong with the universe. That something could be MOND instead of CDM, both of which are in the category of who ordered that?

4In the early days (late ’90s) when I first started asking why MOND gets any predictions right, one of the people I asked was Joe Silk. He dismissed the rotation curve fits of MOND as a fluke. There were 80 galaxies that had been fit at the time, which seemed like a lot of flukes. I mention this because one of the persistent myths of the subject is that MOND is somehow guaranteed to magically fit rotation curves. Erwin de Blok and I explicitly showed that this was not true in a 1998 paper.

5I sometimes hear cosmologists speak in awe of the thousands of observed CMB modes that are fit by half a dozen LCDM parameters. This is impressive, but we’re fitting a damped and driven oscillation – those thousands of modes are not all physically independent. Moreover, as can be seen in the figure from Dodelson & Hu, some free parameters provide more flexibility than others: there is plenty of flexibility in a model with dark matter to fit the CMB data. Only with the Planck data do minor tensions arise, the reaction to which is generally to add more free parameters, like decoupling the primordial helium abundance from that of deuterium, which is anathema to standard BBN so is sometimes portrayed as exciting, potentially new physics.

For some reason, I never hear the same people speak in equal awe of the hundreds of galaxy rotation curves that can be fit by MOND with a universal acceleration scale and a single physical free parameter, the mass-to-light ratio. Such fits are over-constrained, and every single galaxy is an independent test. Indeed, MOND can predict rotation curves parameter-free in cases where gas dominates so that the stellar mass-to-light ratio is irrelevant.

How should we weigh the relative merit of these very different lines of evidence?

6On a number of memorable occasions, people shouted “No you didn’t!” On a smaller number of those occasions (exactly two), they bothered to look up the prediction in the literature and then wrote to apologize and agree that I had indeed predicted that.

7If you read this paper, part of what you will see is me being confused about how low surface brightness galaxies could adhere so tightly to the Tully-Fisher relation. They should not. In retrospect, one can see that this was a MOND prediction coming true, but at the time I didn’t know about that; all I could see was that the result made no sense in the conventional dark matter picture.

Some while after we published that paper, Bob Sanders, who was at the same institute as my collaborators, related to me that Milgrom had written to him and asked “Do you know these guys?”

8Initially they had called it RelMOND, or just RMOND. AeST stands for Aether-Scalar-Tensor, and is clearly a step along the lines that Bekenstein made with TeVeS.

In addition to fitting the CMB, AeST retains the virtues of TeVeS in terms of providing a lensing signal consistent with the kinematics. However, it is not obvious that it works in detail – Tobias Mistele has a brand new paper testing it, and it doesn’t look good at extremely low accelerations. With that caveat, it significantly outperforms extant dark matter models.

There is an oft-repeated fallacy that comes up any time a MOND-related theory has a problem: “MOND doesn’t work therefore it has to be dark matter.” This only ever seems to hold when you don’t bother to check what dark matter predicts. In this case, we should but don’t detect the edge of dark matter halos at higher accelerations than where AeST runs into trouble.

9Another question I’ve posed for over a quarter century now is what would falsify CDM? The first person to give a straight answer to this question was Simon White, who said that cusps in dark matter halos were an ironclad prediction; they had to be there. Many years later, it is clear that they are not, but does anyone still believe this is an ironclad prediction? If it is, then CDM is already falsified. If it is not, then what would be? It seems like the paradigm can fit any surprising result, no matter how unlikely a priori. This is not a strength, it is a weakness. We can, and do, add epicycle upon epicycle to save the phenomenon. This has been my concern for CDM for a long time now: not that it gets some predictions wrong, but that it can apparently never get a prediction so wrong that we can’t patch it up, so we can never come to doubt it if it happens to be wrong.

Artistic license with the dark matter tree

We are visual animals. What we see informs our perception of the world, so it often helps to make a sketch to help conceptualize difficult material. When first confronted with MOND phenomenology in galaxies that I had been sure were dark matter dominated, I made a sketch to help organize my thoughts. Here is a scan of the original dark matter tree that I drew on a transparency (pre-powerpoint!) in 1995:

The original dark matter tree.

At the bottom are the roots of the problem: the astronomical evidence for mass discrepancies. From these grow the trunk, which splits into categories of possible solutions, which in turn branch into ever more specific possibilities. Most of these items were already old news at the time: I was categorizing, not inventing. Indeed, some things have been rebranded over time without changing all that much, with strange nuggets now being known as macros (a generalization to describe dark matter candidates of nuclear density) and asymmetric gravity becoming MOG. The more things change, the more they stay the same.

I’ve used this picture many times in talks, both public and scientific. It helps to focus the mind. I updated it for the 2012 review Benoit Famaey and I wrote (see our Fig. 1), but I don’t think I really improved on the older version, which Don Lincoln had adapted for the cover illustration of an issue of Physics Teacher (circa 2013), with some embellishment by their graphic artists. That’s pretty good, but I prefer my original.

Though there is no lack of buds on the tree, there have certainly been more ideas for dark matter candidates over the past thirty years, so I went looking to see if someone had attempted a similar exercise to categorize or at least corral all the ideas people have considered. Tim Tait made one such figure, but you have to already be an expert to make any sense of it, it being a sort of Venn diagram of the large conceptual playground that is theoretical particle physics.

There is also this recent figure by Bertone & Tait:

This is nice: well organized and pleasantly symmetric, and making good use of color to distinguish different types of possibilities. One can recognize many of the same names from the original tree like MACHOs and MOND, along with newer, related entities like Macros and TeVeS. Interestingly, WIMPs are not mentioned, despite dominating the history of the field. They are subsumed under supersymmetry, which is now itself just a sub-branch of weak-scale possibilities rather than the grand unified theory of manifest inevitability that it was once considered to be. It is a sign of how far we have come that the number one candidate, the one that remains the focus of dozens of large experiments, doesn’t even come up by name. It is also a sign of how far we have yet to go that it seems preferable to many to invent new dark matter candidates than take seriously alternatives that have had much greater predictive success.

A challenge one faces in doing this exercise is to decide which candidates deserve mention, and which are just specific details that should be grouped under some more major branch. As a practical matter, it is impossible to wedge everything in, nor does every wild idea we’ve ever thought up deserve equal mention: Kaluza-Klein dark matter is not a coequal peer to WIMPs. But how do we be fair about making that call? It may not be possible.

I wanted to see how the new diagram mapped to the old tree, so I chopped it up and grafted each piece onto the appropriate branch of the original tree:

New blossoms on the old dark matter tree.

This works pretty well. It looks like the tree has blossomed with more ideas, which it has. There are more possibilities along well-established branches, and entirely new branches that I could only anticipate with question marks that allowed for the possibility of things we had not yet thought up. The tree is getting bushy.

Ultimately, the goal is not to have an ever bushier tree, but rather the opposite: we want to find the right answer. As an experimentalist, one wants to either detect or exclude specific dark matter candidates. As a scientist, I want to apply the wealth of observational knowledge we have accumulated like a chainsaw in the hands of an overzealous gardener to hack off misleading branches until the tree has been pruned down to a single branch, the one (and hopefully only one) correct answer.

As much as I like Bertone & Tait’s hexagonal image, it is very focused on ideas in particle physics. Five of the six branches are various forms of dark matter, while the possibility of modified gravity is grudgingly acknowledged in only one. It is illustrated as a dull grey that is unlike the bright, cheerful colors granted to the various flavors of dark matter candidates. To be sure, there are more ideas for solutions to the mass discrepancy problem from particle physics than from anywhere else, but that doesn’t mean they all deserve equal mention. One looking at this diagram might get the impression that the odds of dark matter:modified gravity are 5:1, which seems at once both biased against the latter and yet considerably more generous than its authors likely intended.

There is no mention at all of the data at the roots of the problem. That is all subsumed in the central DARK MATTER, as if we’re looking down at the top of the tree and recognize that it must have a central trunk, but cannot see its roots. This is indeed an apt depiction of the division between physics and astronomy. Proposed candidates for dark matter have emerged primarily from the particle physics community, which is what the hexagon categorizes. It takes for granted the evidence for dark matter, which is entirely astronomical in nature. This is not a trivial point; I’ve often encountered particle physicists who are mystified that astronomers have the temerity to think they can contribute to the dark matter debate despite 100% (not 90%, nor 99%, nor even 99.9%, but 100%) of the evidence for mass discrepancies stemming from observations of the sky. Apparently, our job was done when we told them we needed something unseen, and we should remain politely quiet while the Big Brains figure it out.

For a categorization of solutions, I suppose it is tolerable, if dangerous, to leave off the evidence and so divorce the solutions from the origins of the problem. There is another problem with placing DARK MATTER at the center. This is a linguistic problem that raises deep epistemological issues that most scientists working in the field rarely bother to engage with. Words matter; the names we use frame how we think about the problem. By calling it the dark matter problem, we presuppose the answer. A more appropriate term might be mass discrepancy, which was in use for a while by more careful-minded people, but it seems to have fallen into disuse. Dark matter is easier to say and sounds way more cool.

Jacob Bekenstein pointed out that an even better term would be acceleration discrepancy. That’s what we measure, after all. The centripetal acceleration in spiral galaxies exceeds that predicted by the observed distribution of visible matter. Mass is an inference, and a sloppy one at that: dynamical data only constrain the mass enclosed by the last measured point. The total mass of a dark matter halo depends on how far it extends, which we never observe because the darn stuff is invisible. And of course we only infer the existence of dark matter by assuming that the force law is correct. That gravity as taught to us by Einstein and Newton should apply to galaxies seems like a pretty darn good assumption, but it is just that. By calling it the dark matter problem, we make it all about unseen mass and neglect the possibility that the inference might go astray with that first, basic assumption.
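To make the distinction concrete, here is a minimal Python sketch that compares the centripetal acceleration implied by a measured rotation speed with the Newtonian acceleration expected from the enclosed baryons. The inputs (100 km/s at a radius of 20 kpc around 10^10 solar masses of baryons) are hypothetical numbers chosen only for illustration, the function names are my own, and treating the baryons as a point mass is a crude simplification that real mass models do not make.

```python
# Illustrative only: a hypothetical galaxy with its baryons treated as a point mass.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.086e19     # kiloparsec, m


def observed_acceleration(v_kms, r_kpc):
    """Centripetal acceleration V^2 / R in m/s^2 from a measured rotation speed."""
    v = v_kms * 1.0e3
    return v**2 / (r_kpc * KPC)


def newtonian_acceleration(m_bar_msun, r_kpc):
    """Newtonian G M / R^2 in m/s^2 for an enclosed baryonic mass (point-mass approximation)."""
    return G * m_bar_msun * M_SUN / (r_kpc * KPC) ** 2


# Hypothetical inputs, not measurements of any real galaxy.
g_obs = observed_acceleration(100.0, 20.0)
g_bar = newtonian_acceleration(1.0e10, 20.0)
discrepancy = g_obs / g_bar

print(f"g_obs = {g_obs:.2e} m/s^2, g_bar = {g_bar:.2e} m/s^2, ratio = {discrepancy:.1f}")
```

With these made-up inputs the measured acceleration exceeds the Newtonian expectation by a factor of four to five. That ratio is the thing actually measured; turning it into a mass requires the further assumption that the force law is right.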

So I’ve made a new picture, placing the acceleration discrepancy at the center where it belongs. The astronomical observations that inform the problem are on the vertical axis while the logical possibilities for physics solutions are on the horizontal axis. I’ve been very spare in filling in both: I’m trying to trace the logical possibilities with a minimum of bias and clutter, so I’ve retained some ideas that are pretty well excluded.

For example, on the dark matter side, MACHOs are pretty well excluded at this point, as are most (all?) dark matter candidates composed of Standard Model particles. Normal matter just doesn’t cut it, but I’ve left that sector in as a logical possibility that was considered historically and shouldn’t be forgotten. On the dynamical side, one of the first thoughts is that galaxies are big so perhaps the force law changes at some appropriate scale much larger than the solar system. At this juncture, we have excluded all modifications to the force law that are made at a specific length scale.

The acceleration discrepancy diagram.

There are too many lines of observational evidence to do justice to here. I’ve lumped an enormous amount of it into a small number of categorical bins. This is not ideal, but some key points are at least mentioned. I invite the reader to try doing the exercise with pencil and paper. There are serious limits imposed by what you can physically display in a font the eye can read with a complexity limited to that which does not make the head explode. I fear I may already be pushing both.

I have made a split between dynamical and cosmological evidence. These tend to push the interpretation one way or the other, as hinted by the colors. Which way one goes depends entirely on how one weighs rather disparate lines of evidence.

I’ve also placed the things that were known from the outset of the modern dark matter paradigm closer to the center than those that were not. That galaxies and clusters of galaxies needed something more than meets the eye was known, and informed the need for dark matter. That the dynamics of galaxies over a huge range of mass, size, surface brightness, gas fraction, and morphology are organized by a few simple empirical relations was not yet known. The Baryonic Tully-Fisher Relation (BTFR) and the Radial Acceleration Relation (RAR) are critical pieces of evidence that did not inform the construction of the current paradigm, and are not satisfactorily explained by it.
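For readers who have not seen these two relations written down, the sketch below encodes forms commonly quoted in the literature: a RAR fitting function, g_obs = g_bar / (1 - exp(-sqrt(g_bar / g†))), and the deep-MOND form of the BTFR, M_b = V_f^4 / (G a0). Neither the functional form nor the value of the acceleration scale (taken here as 1.2 × 10^-10 m/s/s) is specified in this post; both are assumptions brought in from published work, and the function names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
A0 = 1.2e-10       # assumed acceleration scale in m/s^2 (not given in this post)


def rar_g_obs(g_bar, g_dagger=A0):
    """Assumed RAR fitting function: observed acceleration from the baryonic one."""
    return g_bar / (1.0 - math.exp(-math.sqrt(g_bar / g_dagger)))


def btfr_baryonic_mass(v_flat_kms, a0=A0):
    """Deep-MOND BTFR, M_b = V_f^4 / (G a0), returned in solar masses."""
    v = v_flat_kms * 1.0e3
    return v**4 / (G * a0) / M_SUN


# High accelerations: the function returns essentially the Newtonian value.
high = rar_g_obs(1.0e-8)
# Low accelerations: the function approaches sqrt(g_bar * g_dagger).
low = rar_g_obs(1.0e-13)
# Baryonic mass implied by a flat rotation speed of 100 km/s (hypothetical input).
m_b = btfr_baryonic_mass(100.0)

print(f"high: {high:.3e}  low: {low:.3e}  M_b: {m_b:.2e} Msun")
```

The point of the sketch is that both relations take only the observed baryons and a single acceleration scale as input. There is no halo parameter to adjust galaxy by galaxy.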

Similarly for cosmology, the non-baryonic cold dark matter paradigm was launched by the observation that the dynamical mass density apparently exceeds that allowed for normal matter by primordial nucleosynthesis. This, together with the need to grow the observed large scale structure from the very smooth initial condition indicated by the cosmic microwave background (CMB), convinced nearly everyone (including myself) that there must be some new form of non-baryonic dark matter particle outside the realm of the Standard Model. Detailed observations of the power spectra of both galaxies and the CMB are important corroborating observations that did not yet exist at the time the idea took hold. We also got our predictions for these things very wrong initially, hence the need to change from Standard CDM to Lambda CDM.

Most of the people I have met who work on dark matter candidates seem to be well informed of cosmological constraints. In contrast, their knowledge of galaxy dynamics often seems to start and end with “rotation curves are flat.” There is quite a lot more to it than that. But, by and large, they stopped listening at “therefore we need dark matter” and were off and running with ideas for what it could be. There is a need to reassess the viability of these ideas in the light of the BTFR and the RAR.

People who work on galaxy dynamics are concerned with the obvious connections between dynamics and the observed stars and are inclined to be suspicious of the cosmological inference requiring non-baryonic dark matter. Over the years, I have repeatedly been approached by eminent dynamicists who have related in hushed tones, lest the cosmologists overhear, that the dark matter must be baryonic. I can understand their reticence, since I was, originally, one of those people who they didn’t want to have overhear. Baryonic dark matter was crazy – we need more mass than is allowed by big bang nucleosynthesis! I usually refrained from raising this issue, as I have plenty of reasons to sympathize, and try to be a sympathetic ear even when I don’t. I did bring it up in an extended conversation with Vera Rubin once, who scoffed that the theorists were too clever by half. She reckoned that if she could demonstrate that Ωm = 1 in baryons one day, that they would have somehow fixed nucleosynthesis by the next. Her attitude was well-grounded in experience.

A common attitude among advocates of non-baryonic dark matter is that the power spectrum of the CMB requires its existence. Fits to the data require a non-baryonic component at something like 100 sigma. That’s pretty significant evidence.

The problem with this attitude is that it assumes General Relativity (GR). That’s the theory in which the fits are made. There is, indeed, no doubt that the existence of cold dark matter is required in order to make the fits in the context of GR: it does not work without it. To take this as proof of the existence of cold dark matter is entirely circular logic. Indeed, that we have to invent dark matter as a tooth fairy to save GR might be interpreted as evidence against it, or at least as an indication that there might exist a still more general theory.

Nevertheless, I do have sympathy for the attitude that any idea that is going to work has to explain all the data – including both dynamical and cosmological evidence. Where one has to be careful is to assume that the explanation we currently have is unique – so unique that no other theory could ever conceivably explain it. By that logic, MOND is the only theory that uniquely predicted both the BTFR and the RAR. So if we’re being even-handed, cold dark matter is ruled out by the dynamical relations identified after its invention at least as much as its competitors are excluded by the detailed, later measurement of the power spectrum of the CMB.

If we believe all the data, and hold all theories to the same high standard, none survive. Not a single one. A common approach seems to be to hold one’s favorite theory to a lower standard. I will not dignify that with a repudiation. The challenge with data, both astronomical and cosmological, is figuring out what to believe. It has gotten better, but you can’t rely on every measurement being right, or – harder to bear in mind – actually measuring what you want it to measure. Do the orbits of gas clouds in spiral galaxies trace the geodesics of test particles in perfectly circular motion? Does the assumption of hydrostatic equilibrium in the intracluster medium (ICM) of clusters of galaxies provide the same tracer of the gravitational potential as dynamics? There is an annoying offset in the acceleration scale measured by the two distinct methods. Is that real, or some systematic? It seems to be real, but it is also suspicious for appearing exactly where the change in method occurs.

The characteristic acceleration scale in extragalactic systems as a function of their observed baryonic mass. This is always close to the ubiquitous scale of 10^-10 m/s/s first recognized by Milgrom. There is a persistent offset for clusters of galaxies that occurs where we switch from dynamical to hydrostatic tracers of the potential (Fig. 48 from Famaey & McGaugh 2012).

One will go mad trying to track down every conceivable systematic. Trust me, I’ve done the experiment. So an exercise I like to do is to ask what theory minimizes the amount of data I have to ignore. I spent several years reviewing all the data in order to do this exercise when I first got interested in this problem. To my surprise, it was MOND that did best by this measure, not dark matter. To this date, clusters of galaxies remain the most problematic for MOND in having a discrepant acceleration scale – a real problem that we would not hesitate to sweep under the rug if dark matter suffered it. For example, the offset the EAGLE simulation requires to [sort of] match the RAR is almost exactly the same amplitude as what MOND needs to match clusters. Rather than considering this to be a problem, they apply the required offset and call it natural to have missed by this much.

Most of the things we call evidence for dark matter are really evidence for the acceleration discrepancy. A mental hang up I had when I first came to the problem was that there’s so much evidence for dark matter. That is a misstatement stemming from the linguistic bias I noted earlier. There’s so much evidence for the acceleration discrepancy. I still see professionals struggle with this, often citing results as being contradictory to MOND that actually support it. They seem not to have bothered to check, as I have, and are content to repeat what they heard someone else assert. I sometimes wonder if the most lasting contribution to science made by the dark matter paradigm is as one giant Asch conformity experiment.

If we repeat today the exercise of minimizing the amount of data we have to disbelieve, the theory that fares best is the Aether Scalar Tensor (AeST) theory of Skordis & Zlosnik. It contains MOND in the appropriate limit while also providing an excellent fit to the power spectrum of galaxies and the CMB (see also the updated plots in their paper). Hybrid models struggle to do both while the traditional approach of simply adding mass in new particles does not provide a satisfactory explanation of the MOND phenomenology. They can be excluded unless we indulge in the special pleading that invokes feedback or other ad hoc auxiliary hypotheses. Similarly, more elaborate ideas like self-interacting dark matter were dead on arrival for providing a mechanism to solve the wrong problem: the cores inferred in dark matter halos are merely a symptom of the more general MONDian phenomenology; the proposed solution addresses the underlying disease about as much as a band-aid helps an amputation.

Does that mean AeST is the correct theory? Only in the sense that MOND was the best theory when I first did this exercise in the previous century. The needle has swung back and forth since then, so it might swing again. But I do hope that it is a step in a better direction.

The Angel Particle

The dominant paradigm for dark matter has long been the weakly interacting massive particle (WIMP). WIMPs are hypothetical particles motivated by supersymmetry. This is a well-posed scientific hypothesis insofar as it makes a testable prediction: the cold dark matter thought to dominate the cosmic mass budget should be composed of a particle with a mass in the neighborhood of 100 GeV that interacts via the weak nuclear force – hence the name.

That WIMPs couple to the weak nuclear force as well as to gravity is what gives us a window to detect them in the laboratory. They should scatter off of nuclei of comparable mass, albeit only on the rare occasions dictated by the weak force. If we build big enough detectors, we should see it happen. This is what a whole host of massive, underground experiments have been looking for. So far, these experiments have succeeded in failing to detect WIMPs: if WIMPs existed with the properties we predicted them to have, they would have been detected by now.

The failure to find WIMPs has led to the consideration of a myriad of other possibilities. Few of these are as well motivated as the original WIMP. Some have nifty properties that might help with the phenomenology of galaxies. Most are woefully uninformed by such astrophysical considerations, as it is hard enough to do the particle physics without violating some basic constraint.

One possibility that most of us have been reluctant to contemplate is a particle that doesn’t interact at all via strong, weak, or electromagnetic forces. We already know that dark matter cannot interact via electromagnetism, as it wouldn’t be dark. It is similarly difficult to hide a particle that responds to the strong force (though people have of course tried, with strange nuggets in the ’80s and their modern reincarnation, the macro). But why should a particle have to interact at least through the weak force, as WIMPs do? No reason. So what if there is a particle that has zero interaction with standard model particles? It has mass and therefore gravity, but otherwise interacts with the rest of the universe not at all. Let’s call this the Angel Particle, because it will never reveal itself, no matter how much we pray for divine intervention.

I first heard this idea mooted in a talk by Tom Shutt in the early teens. He is a leader in the search for WIMPs, and has been since the outset. So to suggest that the dark matter is something that simply cannot be detected in the laboratory was anathema. A logical possibility to be noted, but only in passing with a shudder of existential dread: the legions of experimentalists looking for dark matter are wasting their time if there is no conceivable signal to detect.

Flash forward a decade, and what was anathema then seems reasonable now that WIMPs remain AWOL. I hear some theorists saying “why not?” with a straight face. “Why shouldn’t there be a particle that doesn’t interact with anything else?”

On the one hand, it’s true. As long as we’re making up particles outside the boundaries of known physics, I know of nothing that precludes us from inventing one that has zero interactions. On the other hand, how would we ever know? We would just give up on laboratory searches, and accept on faith that “gravitational detection” from astronomical evidence is adequate – and indeed, the only possible evidence for invisible mass.

Experimentalists go home! Your services are not required.

To me, this is not physics. There is no way to falsify this hypothesis, or even test it. I was already concerned that WIMPs are not strictly falsifiable. They can be confirmed if found in the laboratory, but if they are not found, we can always tweak the prediction – all the way to this limit of zero interaction, a situation I’ve previously described as the express elevator to hell.

If there is no way to test a hypothesis to destruction, it is metaphysics, not physics. Entertaining the existence of a particle with zero interaction cross-section is a logical possibility, but it is also a form of magical thinking. It provides a way to avoid confronting the many problems with the current paradigm. Indeed, it provides an excuse to never have to deal with them. This way lies madness, and the end of scientific rationalism. We might just as well imagine that angels are responsible for moving objects about.

Indeed, the only virtue of this hypothesis that springs to mind is to address the age-old question: how many angels can dance on the head of a pin? We know from astronomical data that the local density of angel particles must be about 1/4 GeV cm⁻³. Let’s say the typical pin head is a cylinder with a diameter of 2.5 mm and a thickness of 1 mm, giving it a volume of about 5 mm³. Doing a few unit conversions, this means a dark mass of about 1 MeV* per pin head, so exactly one angel can occupy the head of a pin if the mass of the Angel particle is 1 MeV.

Of course, we have no idea what the mass of the Angel particle is, so we’ve really only established a limit: 1 MeV is the upper limit for the mass of an angel that can fit on the head of a pin. If it weighs more than 1 MeV, the answer is zero: an angel is too fat to fit on the head of a pin. If angels weigh less than 1 MeV, then they can fit in numbers inversely proportional to their mass. If the mass is as small as 1 eV, then a million angels can party on the vast dance floor that is the head of a pin.
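For anyone who wants to check the unit conversions, here is a minimal sketch of the arithmetic. It assumes nothing beyond the numbers quoted above (a local density of 1/4 GeV per cubic centimeter and a cylindrical pin head 2.5 mm across and 1 mm thick); the function names are made up for illustration.

```python
import math

# Local dark matter density quoted in the text, in GeV per cubic centimetre.
LOCAL_DENSITY_GEV_PER_CM3 = 0.25


def pin_head_volume_cm3(diameter_mm: float = 2.5, thickness_mm: float = 1.0) -> float:
    """Volume of a cylindrical pin head, converted from mm^3 to cm^3."""
    radius_mm = diameter_mm / 2.0
    volume_mm3 = math.pi * radius_mm ** 2 * thickness_mm
    return volume_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3


def dark_mass_mev(volume_cm3: float) -> float:
    """Dark mass enclosed in the given volume, in MeV."""
    return LOCAL_DENSITY_GEV_PER_CM3 * volume_cm3 * 1000.0  # 1 GeV = 1000 MeV


def angels_per_pin(angel_mass_mev: float) -> float:
    """Mean number of particles of the given mass inside one pin head."""
    return dark_mass_mev(pin_head_volume_cm3()) / angel_mass_mev


if __name__ == "__main__":
    print(f"pin head volume: {pin_head_volume_cm3() * 1000.0:.1f} mm^3")
    print(f"dark mass per pin head: {dark_mass_mev(pin_head_volume_cm3()):.1f} MeV")
    for mass_mev in (1.0, 1e-3, 1e-6):  # 1 MeV, 1 keV, 1 eV
        print(f"angel mass {mass_mev:g} MeV -> {angels_per_pin(mass_mev):.3g} per pin")
```

With these inputs the cylinder comes out near 5 mm³ and the enclosed dark mass a little over 1 MeV, so the round number of 1 MeV is an order-of-magnitude statement, which is all the joke requires.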

So I guess we still haven’t answered the age-old question, and it looks like we never will.


*An electron is about half an MeV, so it is tempting to imagine dark matter composed of positronium. This does not work for many reasons, not least of which is that a mass of 1 MeV is a coincidence of the volume of the head of a pin that I made up for ease of calculation without bothering to measure the size of an actual pin – not to mention that the size of pins has nothing whatever to do with the dark matter problem. Another reason is that, being composed of an electron and its antiparticle the positron, positronium is unstable and self-annihilates into gamma rays in less than a nanosecond – rather less than the Hubble time that we require for dark matter to still be around at this juncture. Consequently, this hypothesis is immediately off by a factor of 10²⁸, which is the sort of thing that tends to happen when you try to construct dark matter from known particles – hence the need to make up entirely new stuff.
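The size of that mismatch can be checked in a few lines. This is a rough sketch with assumed round numbers that do not appear in the footnote itself: a Hubble time of about 14 billion years and a lifetime of about 0.125 ns for the short-lived (singlet) state of positronium.

```python
import math

SECONDS_PER_YEAR = 3.156e7              # approximate length of a year in seconds
HUBBLE_TIME_YEARS = 1.4e10              # assumed: roughly 14 billion years
PARA_POSITRONIUM_LIFETIME_S = 1.25e-10  # assumed: about 0.125 ns (singlet state)


def lifetime_shortfall(required_s: float, actual_s: float) -> float:
    """Factor by which the actual lifetime falls short of the required one."""
    return required_s / actual_s


hubble_time_s = HUBBLE_TIME_YEARS * SECONDS_PER_YEAR
shortfall = lifetime_shortfall(hubble_time_s, PARA_POSITRONIUM_LIFETIME_S)
orders_of_magnitude = math.log10(shortfall)

if __name__ == "__main__":
    print(f"Hubble time: {hubble_time_s:.2e} s")
    print(f"shortfall: about 10^{orders_of_magnitude:.1f}")
```

To the nearest order of magnitude this lands on the factor quoted in the footnote; the longer-lived triplet state would shave off a few orders of magnitude without changing the conclusion.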

God forbid we contemplate that maybe the force law might be broken. How crazy would that be?

Define “better”

Dark matter remains undetected in the laboratory. This has been true for forever, so I don’t know what drives the timing of the recent spate of articles encouraging us to keep the faith, that dark matter is still a better idea than anything else. This depends on how we define “better.”

There is a long-standing debate in the philosophy of science about the relative merits of accommodation and prediction. A scientific theory should have predictive power. It should also explain all the relevant data. To do the latter almost inevitably requires some flexibility in order to accommodate things that didn’t turn out exactly as predicted. What is the right mix? Do we lean more towards prediction, or accommodation? The answer to that defines “better” in this context.

One of the recent articles is titled “The dark matter hypothesis isn’t perfect, but the alternatives are worse” by Paul Sutter. This perfectly encapsulates the choice one has to make in what is unavoidably a value judgement. Is it better to accommodate, or to predict (see the Spergel Principle)? Dr. Sutter comes down on the side of accommodation. He notes a couple of failed predictions of dark matter, but mentions no specific predictions of MOND (successful or not) while concluding that dark matter is better because it explains more.

One important principle in science is objectivity. We should be even-handed in the evaluation of evidence for and against a theory. In practice, that is very difficult. As I’ve written before, it made me angry when the predictions of MOND came true in my data for low surface brightness galaxies. I wanted dark matter to be right. I felt sure that it had to be. So why did this stupid MOND theory have any of its predictions come true?

One way to check your objectivity is to look at it from both sides. If I put on a dark matter hat, then I largely agree with what Dr. Sutter says. To quote one example:

The dark matter hypothesis isn’t perfect. But then again, no scientific hypothesis is. When evaluating competing hypotheses, scientists can’t just go with their guts, or pick one that sounds cooler or seems simpler. We have to follow the evidence, wherever it leads. In almost 50 years, nobody has come up with a MOND-like theory that can explain the wealth of data we have about the universe. That doesn’t make MOND wrong, but it does make it a far weaker alternative to dark matter.

Paul Sutter

OK, so now let’s put on a MOND hat. Can I make the same statement?

The MOND hypothesis isn’t perfect. But then again, no scientific hypothesis is. When evaluating competing hypotheses, scientists can’t just go with their guts, or pick one that sounds cooler or seems simpler. We have to follow the evidence, wherever it leads. In almost 50 years, nobody has detected dark matter, nor come up with a dark matter-based theory with the predictive power of MOND. That doesn’t make dark matter wrong, but it does make it a far weaker alternative to MOND.

So, which of these statements is true? Well, both of them. How do we weigh the various lines of evidence? Is it more important to explain a large variety of the data, or to be able to predict some of it? This is one of the great challenges when comparing dark matter and MOND. They are incommensurate: the set of relevant data is not the same for both. MOND makes no pretense to provide a theory of cosmology, so it doesn’t even attempt to explain much of the data so beloved by cosmologists. Dark matter explains everything, but, broadly defined, it is not a theory so much as an inference – assuming gravitational dynamics are inviolate, we need more mass than meets the eye. It’s a classic case of comparing apples and oranges.

While dark matter is a vague concept in general, one can build specific theories of dark matter that are predictive. Simulations with generic cold dark matter particles predict cuspy dark matter halos. Galaxies are thought to reside in these halos, which dominate their dynamics. This overlaps with the predictions of MOND, which follow from the observed distribution of normal matter. So, do galaxies look like tracer particles orbiting in cuspy halos? Or do their dynamics follow from the observed distribution of light via Milgrom’s strange formula? The relevant subset of the data very clearly indicate the latter. When head-to-head comparisons like this can be made, the a priori predictions of MOND win, hands down, over and over again. [If this statement sounds wrong, try reading the relevant scientific literature. Being an expert on dark matter does not automatically make one an expert on MOND. To be qualified to comment, one should know what predictive successes MOND has had. People who say variations of “MOND only fits rotation curves” are proudly proclaiming that they lack this knowledge.]

It boils down to this: if you want to explain extragalactic phenomena, use dark matter. If you want to make a prediction – in advance! – that will come true, use MOND.

A lot of the debate comes down to claims that anything MOND can do, dark matter can do better. Or at least as well. Or, if not as well, good enough. This is why conventionalists are always harping about feedback: it is the deus ex machina they invoke in any situation where they need to explain why their prediction failed. This does nothing to explain why MOND succeeded where they failed.

This post-hoc reasoning is profoundly unsatisfactory. Dark matter, being invisible, allows us lots of freedom to cook up an explanation for pretty much anything. My long-standing concern for the dark matter paradigm is not the failure of any particular prediction, but that, like epicycles, it has too much explanatory power. We could use it to explain pretty much anything. Rotation curves flat when they should be falling? Add some dark matter. No such need? No dark matter. Rising rotation curves? Sure, we could explain that too: add more dark matter. Only we don’t, because that situation doesn’t arise in nature. But we could if we had to. (See, e.g., Fig. 6 of de Blok & McGaugh 1998.)

There is no requirement in dark matter that rotation curves be as flat as they are. If we start from the prior knowledge that they are, then of course that’s what we get. If instead we independently try to build models of galactic disks in dark matter halos, very few of them wind up with realistic looking rotation curves. This shouldn’t be surprising: there are, in principle, an uncountably infinite number of combinations of galaxies and dark matter halos. Even if we impose some sensible restrictions (e.g., scaling the mass of one component with that of the other), we still don’t get it right. That’s one reason that we have to add feedback, which suffices according to some, and not according to others.

In contrast, the predictions of MOND are unique. The kinematics of an object follow from its observed mass distribution. The two are tied together by the hypothesized force law. There is a one-to-one relation between what you see and what you get.
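To make the one-to-one relation concrete, here is a minimal sketch of how a rotation speed follows from the mass alone. Several things here are assumptions for illustration rather than anything specified above: the so-called simple interpolation function μ(x) = x/(1+x), the value a₀ = 1.2 × 10⁻¹⁰ m s⁻², and a point mass standing in for the observed mass distribution. A real application would compute the Newtonian acceleration from the observed stars and gas.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10      # assumed value of Milgrom's acceleration constant, m s^-2
M_SUN = 1.989e30  # solar mass, kg
KPC = 3.086e19    # kiloparsec, m


def newtonian_acceleration(mass_kg: float, radius_m: float) -> float:
    """Newtonian acceleration of a point mass (stand-in for a real galaxy)."""
    return G * mass_kg / radius_m ** 2


def mond_acceleration(g_newton: float, a0: float = A0) -> float:
    """Solve mu(g/a0) * g = g_N for g, with the assumed mu(x) = x / (1 + x)."""
    return 0.5 * g_newton + math.sqrt(0.25 * g_newton ** 2 + g_newton * a0)


def circular_speed_kms(mass_msun: float, radius_kpc: float) -> float:
    """Circular speed in km/s at the given radius; there are no free parameters."""
    radius_m = radius_kpc * KPC
    g_n = newtonian_acceleration(mass_msun * M_SUN, radius_m)
    return math.sqrt(mond_acceleration(g_n) * radius_m) / 1000.0


def asymptotic_speed_kms(mass_msun: float) -> float:
    """Low-acceleration limit, V^4 = G * M * a0: a flat rotation curve."""
    return (G * mass_msun * M_SUN * A0) ** 0.25 / 1000.0


if __name__ == "__main__":
    mass_msun = 1e10  # hypothetical baryonic mass in solar masses
    for radius_kpc in (1, 5, 20, 100):
        print(f"R = {radius_kpc:>3} kpc: V = {circular_speed_kms(mass_msun, radius_kpc):.0f} km/s")
    print(f"asymptotic V = {asymptotic_speed_kms(mass_msun):.0f} km/s")
```

The point of the sketch is what is missing: once the mass is given, there is no halo parameter left to adjust. The speed at every radius is fixed, and at large radii it settles to a constant set by the mass alone.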

This was not expected in dark matter. It makes no sense that this should be so. The baryonic tail should not wag the dark matter dog.

From the perspective of building dark matter models, it’s like the proverbial needle in the haystack: the haystack is the volume of possible baryonic disk plus dark matter halo combinations; the one that “looks like” MOND is the needle. Somehow nature plucks the MOND-like needle out of the dark matter haystack every time it makes a galaxy.

The dark matter haystack. Galaxies might lie anywhere in this voluminous, multiparameter space, but in practice they inevitably seem to reside in the negligibly small part of the volume that “looks like” MOND.

Dr. Sutter says that we shouldn’t go with our gut. That’s exactly what I wanted to do, long ago, to maintain my preference for dark matter. I’d love to do that now so that I could stop having this argument with otherwise reasonable people.

Instead of going with my gut, I’m making a probabilistic statement. In Bayesian terms, the odds of observing MONDian behavior given the prior that we live in a universe made of dark matter are practically zero. In MOND, observing MONDian behavior is the only thing that can happen. That’s what we observe in galaxies, over and over again. Any information criterion shows a strong quantitative preference for MOND when dynamical evidence is considered. That does not happen when cosmological data are considered because MOND makes no prediction there. Concluding that dark matter is better overlooks the practical impossibility that MOND-like phenomenology is observed at all. Of course, once one knows this is what the data show, it seems a lot more likely, and I can see that effect in the literature over the long arc of scientific history. This is why, to me, predictive power is more important than accommodation: what we predict before we know the answer is more important than whatever we make up once the answer is known.
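Schematically, the probabilistic statement looks like the following. The symbols are illustrative and not drawn from any particular analysis: D_i is the observation that galaxy i behaves in the MOND-like way, ε_i is the small probability of that outcome in a dark matter universe, and the galaxies are assumed to be independent tests for which MOND predicts the observed outcome with probability close to one.

```latex
\frac{P(\mathrm{MOND} \mid D)}{P(\mathrm{DM} \mid D)}
  = \frac{P(D \mid \mathrm{MOND})}{P(D \mid \mathrm{DM})}
    \times \frac{P(\mathrm{MOND})}{P(\mathrm{DM})},
\qquad
\frac{P(D \mid \mathrm{MOND})}{P(D \mid \mathrm{DM})}
  \approx \prod_{i=1}^{N} \frac{1}{\epsilon_i}
```

Under those assumptions the likelihood ratio grows with every galaxy added, so even a strong prior preference for dark matter is eventually overwhelmed by the dynamical data.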

The successes of MOND are sometimes minimized by lumping all galaxies into a single category. That’s not correct. Every galaxy has a unique mass distribution; each one is an independent test. The data for galaxies extend over a large dynamic range, from dwarfs to giants, from low to high surface brightness, from gas to star dominated cases. Dismissing this by saying “MOND only explains rotation curves” is like dismissing Newton for only explaining planets – as if every planet, moon, comet, and asteroid aren’t independent tests of Newton’s inverse square law.

Two galaxies with very different mass distributions. Neither is well explained by dark matter, which provides no reason for the detailed shapes encapsulated by Sancisi’s Law. In contrast, MOND describes these naturally: features in the rotation curves follow from those in the baryon distributions because the force law tells them to.

MOND does explain more than rotation curves. That was the first thing I checked. I spent several years looking at all of the data, and have reviewed the situation many times since. What I found surprising is how much MOND explains, if you let it. More disturbing was how often I came across claims in the literature that MOND was falsified by X only to try the analysis myself and find that, no, if you bother to do it right, that’s pretty much just what it predicts. Not in every case, of course – no hypothesis is perfect – but I stopped bothering after several hundred cases. Literally hundreds. I can’t keep up with every new claim, and it isn’t my job to do so. My experience has been that as the data improve, so too does its agreement with MOND.

Dr. Sutter’s article goes farther, repeating a common misconception that “the tweaking of gravity under MOND is explicitly designed to explain the motions of stars within galaxies.” This is an overstatement so strong as to be factually wrong. MOND was explicitly designed to produce flat rotation curves – as was dark matter. However, there is a lot more to it than that. Once we write down the force law, we’re stuck with it. It has lots of other unavoidable consequences that lead to genuine predictions. Milgrom explicitly laid out what these consequences would be, and basically all of them have subsequently been observed. I include a partial table in my last review; it only ends where it does because I had to stop somewhere. These were genuine, successful, a priori predictions – the gold standard in science. Some of them can be explained with dark matter, but many cannot: they make no sense, and dark matter can only accommodate them thanks to its epic flexibility.

Dr. Sutter makes a number of other interesting points. He says we shouldn’t “pick [a hypothesis] that sounds cooler or seems simpler.” I’m not sure which seems cooler here – a universe pervaded by a mysterious invisible mass that we can’t [yet] detect in the laboratory but nevertheless controls most of what goes on out there seems pretty cool to me. That there might also be some fundamental aspect of the basic theory of gravitational dynamics that we’re missing also seems like a pretty cool possibility. Those are purely value judgments.

Simplicity, however, is a scientific value known as Occam’s razor. The simpler of competing theories is to be preferred. That’s clearly MOND: we make one adjustment to the force law, and that’s it. What we lack is a widely accepted, more general theory that encapsulates both MOND and General Relativity.

In dark matter, we multiply entities unnecessarily – there is extra mass composed of unknown particles that have no place in the Standard Model of particle physics (which is quite full up) so we have to imagine physics beyond the standard model and perhaps an entire dark sector because why just one particle when 85% of the mass is dark? and there could also be dark photons to exchange forces that are only active in the dark sector as well as entire hierarchies of dark particles that maybe have their own ecosystem of dark stars, dark planets, and maybe even dark people. We, being part of the “normal” matter, are just a minority constituent of this dark universe; a negligible bit of flotsam compared to the dark sector. Doesn’t it make sense to imagine that the dark sector has as rich and diverse a set of phenomena as the “normal” sector? Sure – if you don’t mind abandoning Occam’s razor. Note that I didn’t make any of this stuff up; everything I said in that breathless run-on sentence I’ve heard said by earnest scientists enthusiastic about how cool the dark sector could be. Bugger Occam.

There is also the matter of timescales. Dr. Sutter mentions that “In almost 50 years, nobody has come up with a MOND-like theory” that does all that we need it to do. That’s true, but for the typo. Next year (2023) will mark the 40th anniversary of Milgrom’s first publications on MOND, so it hasn’t been half a century yet. But I’ve heard recurring complaints to this effect before, that finding the deeper theory is taking too long. Let’s examine that, shall we?

First, remember some history. When Newton introduced his inverse square law of universal gravity, it was promptly criticized as a form of magical thinking: How, Sir, can you have action at a distance? The conception at the time was that you had to be in physical contact with an object to exert a force on it. For the sun to exert a force on the earth, or the earth on the moon, seemed outright magical. Leibniz famously accused Newton of introducing ‘occult’ forces. As a consequence, Newton was careful to preface his description of universal gravity as everything happening as if the force was his famous inverse square law. The “as if” is doing a lot of work here, basically saying, in modern parlance “OK, I don’t get how this is possible, I know it seems really weird, but that’s what it looks like.” I say the same about MOND: galaxies behave as if MOND is the effective force law. The question is why.

As near as I can tell from reading the history around this (and I don’t know how clear it is), it took about 20 years for Newton to realize that there was a good geometric reason for the inverse square law. We expect our freshman physics students to see that immediately. Obviously Newton was smarter than the average freshman, so why’d it take so long? Was he, perhaps, preoccupied with the legitimate-seeming criticisms of action at a distance? It is hard to see past a fundamental stumbling block like that, and I wonder if the situation now is analogous. Perhaps we are missing something now that will seem obvious in retrospect, distracted by criticisms that will seem absurd in the future.

Many famous scientists built on the dynamics introduced by Newton. The Poisson equation isn’t named the Newton equation because Newton didn’t come up with it even though it is fundamental to Newtonian dynamics. Same for the Lagrangian. And the classical Hamiltonian. These developments came many decades after Newton himself, and required the efforts of many brilliant scientists integrated over a lot of time. By that standard, forty years seems pretty short: one doesn’t arrive at a theory of everything overnight.

What is the right measure? The integrated effort of the scientific community is more relevant than absolute time. Over the past forty years, I’ve seen a lot of push back against even considering MOND as a legitimate theory. Don’t talk about that! This isn’t exactly encouraging, so not many people have worked on it. I can count on my fingers the number of people who have made important contributions to the theoretical development of MOND. (I am not one of them. I am an observer following the evidence, wherever it leads, even against my gut feeling and to the manifest detriment of my career.) It is hard to make progress without a critical mass of people working on a problem.

Of course, people have been looking for dark matter for those same 40 years. More, really – if you want to go back to Oort and Zwicky, it has been 90 years. But for the first half century of dark matter, no one was looking hard for it – it took that long to gel as a serious problem. These things take time.

Nevertheless, for several decades now there has been an enormous amount of effort put into all aspects of the search for dark matter: experimental, observational, and theoretical. There is and has been a critical mass of people working on it for a long time. There have been thousands of talented scientists who have contributed to direct detection experiments in dozens of vast underground laboratories, who have combed through data from X-ray and gamma-ray observatories looking for the telltale signs of dark matter decay or annihilation, who have checked for the direct production of dark matter particles in the LHC; even theorists who continue to hypothesize what the heck the dark matter could be and how we might go about detecting it. This research has been well funded, with billions of dollars having been spent in the quest for dark matter. And what do we have to show for it?

Zero. Nada. Zilch. Squat. A whole lot of nothing.

This is equal to the amount of funding that goes to support research on MOND. There is no faster way to get a grant proposal rejected than to say nice things about MOND. So on the one hand, we have a small number of people working on the proverbial shoestring, while on the other, we have a huge community that has poured vast resources into the attempt to detect dark matter. If we really believe it is taking too long, perhaps we should try funding MOND as generously as we do dark matter.

LZ: another non-detection

Just as I was leaving for a week’s vacation, the dark matter search experiment LZ reported its first results. Now that I’m back, I see that I didn’t miss anything. Here is their figure of merit:

The latest experimental limits on WIMP dark matter from LZ (arXiv:2207.03764). The parameter space above the line is excluded. Note the scale on the y-axis, bearing in mind that the original expectation was for a cross section around 10⁻³⁹ cm², well above the top edge of this graph.

LZ is a merger of two previous experiments compelled to grow still bigger in the never-ending search for dark matter. It contains “seven active tonnes of liquid xenon,” which is an absurd amount, being a substantial fraction of the entire terrestrial supply. It all has to be kept at cryogenic temperatures to stay liquid, and filtered of all contaminants, including naturally radioactive isotopes that might mimic the sought-after signal of dark matter scattering off of xenon nuclei. It is a technological tour de force.

The technology is really fantastic. The experimentalists have accomplished amazing things in building these detectors. They have accomplished the target sensitivity, and then some. If WIMPs existed, they should have found them by now.

WIMPs have not been discovered. As the experiments have improved, the theorists have been obliged to repeatedly move the goalposts. The original (1980s) expectation for the interaction cross-section was 10⁻³⁹ cm². That was quickly excluded, but more careful (1990s) calculation suggested perhaps more like 10⁻⁴² cm². This was also excluded experimentally. By the late 2000s, the “prediction” had migrated to 10⁻⁴⁶ cm². This has also now been excluded, so the goalposts have been moved to 10⁻⁴⁸ cm². This migration has been driven entirely by the data; there is nothing miraculous about a WIMP with this cross section.

As remarkable a technological accomplishment as experiments like LZ are, they are becoming the definition of insanity: repeating the same action but expecting a different result.

For comparison, consider the LIGO detection of gravitational waves. A large team of scientists worked unspeakably hard to achieve the detection of a tiny effect. It took 40 years of failure before success was obtained. Until that point, it seemed much the same: repeating the same action but expecting a different result.

Except it wasn’t, because there was a clear expectation for the sensitivity that was required to detect gravitational waves. Once that sensitivity was achieved, they were detected. It wasn’t that simple of course, but close enough for our purposes: it took a long time to get where they were going, but they achieved success once they got there. Having a clear prediction is essential.

In the case of WIMP searches, there was also a clear prediction. The required sensitivity was achieved – long ago. Nothing was found, so the goalposts were moved – by a lot. Then the new required sensitivity was achieved, still without detection. Repeatedly.

It always makes sense to look harder for something you expect if at first you don’t succeed. But at some point, you have to give up: you ain’t gonna find it. This is disappointing, but we’ve all experienced this kind of disappointment at some point in our lives. The tricky part is deciding when to give up.

In science, the point to give up is when your hypothesis is falsified. The original WIMP hypothesis was falsified a long time ago. We keep it on life support with modifications, often obfuscating (to our students and to ourselves) that the WIMPs we’re talking about today are no longer the WIMPs we originally conceived.

I sometimes like to imagine the thought experiment of sending some of the more zealous WIMP advocates back in time to talk to their younger selves. What would they say? How would they respond to themselves? These are not people who like to be contradicted by anyone, even themselves, so I suspect it would go something like this:

Old scientist: “Hey, kid – I’m future you. This experiment you’re about to spend your life working on won’t detect what you’re looking for.”

Young scientist: “Uh huh. You say you’re me from the future, Mr. Credibility? Tell me: at what point do I go senile, you doddering old fool?”

Old scientist: “You don’t. It just won’t work out the way you think. On top of dark matter, there’s also dark energy…”

Young scientist: “What the heck is dark energy, you drooling crackpot?”

Old scientist: “The cosmological constant.”

Young scientist: “The cosmological constant! You can’t expect people to take you seriously talking about that rubbish. GTFO.”

That’s the polite version that doesn’t end in fisticuffs. It’s easy to imagine this conversation going south much faster. I know that if 1993 me had received a visit from 1998 me telling me that in five years I would have come to doubt WIMPs, and also would have demonstrated that the answer to the missing mass problem might not be dark matter at all, I… would not have taken it well.

That’s why predictions are important in science. They tell us when to change our mind. When to stop what we’re doing because it’s not working. When to admit that we were wrong, and maybe consider something else. Maybe that something else won’t prove correct. Maybe the next ten something elses won’t. But we’ll never find out if we won’t let go of the first wrong thing.

Cosmic whack-a-mole

The fine-tuning problem encountered by dark matter models that I talked about last time is generic. The knee-jerk reaction of most workers seems to be “let’s build a more sophisticated model.” That’s reasonable – if there is any hope of recovery. The attitude is that dark matter has to be right so something has to work out. This fails to even contemplate the existential challenge that the fine-tuning problem imposes.

Perhaps I am wrong to be pessimistic, but my concern is well informed by years upon years trying to avoid this conclusion. Most of the claims I have seen to the contrary are just specialized versions of the generic models I had already built: they contain the same failings, but these go unrecognized because the presumption is that something has to work out, so people are often quick to declare “close enough!”

In my experience, fixing one thing in a model often breaks something else. It becomes a game of cosmic whack-a-mole. If you succeed in suppressing the scatter in one relation, it pops out somewhere else. A model that seems like it passes the test you built it to pass flunks as soon as you confront it with another test.

Let’s consider a few examples.


Squeezing the toothpaste tube

Our efforts to evade one fine-tuning problem often lead to another. This has been my general experience in many efforts to construct viable dark matter models. It is like squeezing a tube of toothpaste: every time we smooth out the problems in one part of the tube, we simply squeeze them into a different part. There are many published claims to solve this problem or that, but they frequently fail to acknowledge (or notice) that the purported solution to one problem creates another.

One example is provided by Courteau and Rix (1999). They invoke dark matter domination to explain the lack of residuals in the Tully-Fisher relation. In this limit, Mb/R ​≪ ​MDM/R and the baryons leave no mark on the rotation curve. This can reconcile the model with the Tully-Fisher relation, but it makes a strong prediction. It is not just the flat rotation speed that is the same for galaxies of the same mass, but the entirety of the rotation curve, V(R) at all radii. The stars are just convenient tracers of the dark matter halo in this limit; the dynamics are entirely dominated by the dark matter. The hypothesized solution fixes the problem that is addressed, but creates another problem that is not addressed, in this case the observed variation in rotation curve shape.

The limit of complete dark matter domination is not consistent with the shapes of rotation curves. Galaxies of the same baryonic mass have the same flat outer velocity (Tully-Fisher), but the shapes of their rotation curves vary systematically with surface brightness (de Blok & McGaugh, 1996; Tully and Verheijen, 1997; McGaugh and de Blok, 1998a,b; Swaters et al., 2009, 2012; Lelli et al., 2013, 2016c). High surface brightness galaxies have steeply rising rotation curves while LSB galaxies have slowly rising rotation curves (Fig. 6). This systematic dependence of the inner rotation curve shape on the baryon distribution excludes the SH hypothesis in the limit of dark matter domination: the distribution of the baryons clearly has an impact on the dynamics.

Fig. 6. Rotation curve shapes and surface density. The left panel shows the rotation curves of two galaxies, one HSB (NGC 2403, open circles) and one LSB (UGC 128, filled circles) (de Blok & McGaugh, 1996; Verheijen and de Blok, 1999; Kuzio de Naray et al., 2008). These galaxies have very nearly the same baryonic mass (~ 10¹⁰ M☉), and asymptote to approximately the same flat rotation speed (~ 130 km s⁻¹). Consequently, they are indistinguishable in the Tully-Fisher plane (Fig. 4). However, the inner shapes of the rotation curves are readily distinguishable: the HSB galaxy has a steeply rising rotation curve while the LSB galaxy has a more gradual rise. This is a general phenomenon, as illustrated by the central density relation (right panel: Lelli et al., 2016c) where each point is one galaxy; NGC 2403 and UGC 128 are highlighted as open points. The central dynamical mass surface density (Σdyn) measured by the rate of rise of the rotation curve (Toomre, 1963) correlates with the central surface density of the stars (Σ0) measured by their surface brightness. The line shows 1:1 correspondence: no dark matter is required near the centers of HSB galaxies. The need for dark matter appears below 1000 M☉ pc⁻² and grows systematically greater to lower surface brightness. This is the origin of the statement that LSB galaxies are dark matter dominated.

A more recent example of this toothpaste tube problem for SH-type models is provided by the EAGLE simulations (Schaye et al., 2015). These are claimed (Ludlow et al., 2017) to explain one aspect of the observations, the radial acceleration relation (McGaugh et al., 2016), but fail to explain another, the central density relation (Lelli et al., 2016c) seen in Fig. 6. This was called the ‘diversity’ problem by Oman et al. (2015), who note that the rotation velocity at a specific, small radius (2 kpc) varies considerably from galaxy to galaxy observationally (Fig. 6), while simulated galaxies show essentially no variation, with only a small amount of scatter. This diversity problem is exactly the same problem that was pointed out before [compare Fig. 5 of Oman et al. (2015) to Fig. 14 of McGaugh and de Blok (1998a)].

There is no single, universally accepted standard galaxy formation model, but a common touchstone is provided by Mo et al. (1998). Their base model has a constant ratio of luminous to dark mass md [their assumption (i)], which provides a reasonable description of the sizes of galaxies as a function of mass or rotation speed (Fig. 7). However, this model predicts the wrong slope (3 rather than 4) for the Tully-Fisher relation. This is easily remedied by making the luminous mass fraction proportional to the rotation speed (md ​∝ ​Vf), which then provides an adequate fit to the Tully-Fisher4 relation. This has the undesirable effect of destroying the consistency of the size-mass relation. We can have one or the other, but not both.
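The slope problem can be seen in a few lines. The sketch below is my own illustration, not code from Mo et al. (1998): it assumes the usual virial scaling M = V³/(10 G H) for the halo, an illustrative Hubble constant of 70 km/s/Mpc, and hypothetical values for md. Only the logarithmic slopes matter here, not the normalization.

```python
import math

G = 4.30091e-6   # Newton's constant in kpc (km/s)^2 / Msun
H0 = 0.07        # assumed Hubble constant in km/s/kpc (70 km/s/Mpc)

def halo_mass(v):
    """Assumed virial scaling: halo mass (Msun) for circular speed v (km/s)."""
    return v**3 / (10.0 * G * H0)

def baryonic_mass(v, md0=0.05, v0=None):
    """Baryonic mass md * M_halo.

    md0 is a hypothetical baryon fraction. If v0 is None, md is constant;
    otherwise md scales linearly with v, normalized to md0 at v = v0.
    """
    md = md0 if v0 is None else md0 * v / v0
    return md * halo_mass(v)

def log_slope(mass_of_v, v1=50.0, v2=200.0):
    """Logarithmic slope d ln M / d ln V between two speeds."""
    return math.log(mass_of_v(v2) / mass_of_v(v1)) / math.log(v2 / v1)

slope_constant_md = log_slope(lambda v: baryonic_mass(v))
slope_variable_md = log_slope(lambda v: baryonic_mass(v, v0=100.0))
print(slope_constant_md, slope_variable_md)
```

Constant md inherits the slope of 3 from the halo; letting md scale with the rotation speed buys the extra power of V needed to reach 4, at the cost described above.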

Fig. 7. Galaxy size (as measured by the exponential disk scale length, left) and mass (right) as a function of rotation velocity. The latter is the Baryonic Tully-Fisher relation; the data are the same as in Fig. 4. The solid lines are Mo et al. (1998) models with constant md (their equations 12 and 16). This is in reasonable agreement with the size-speed relation but not the BTFR. The latter may be fit by adopting a variable md ​∝ ​Vf (dashed lines), but this ruins agreement with the size-speed relation. This is typical of dark matter models in which fixing one thing breaks another.

This failure of the Mo et al. (1998) model provides another example of the toothpaste tube problem. By fixing one problem, we create another. The only way forward is to consider more complex models with additional degrees of freedom.

Feedback

It has become conventional to invoke ‘feedback’ to address the various problems that afflict galaxy formation theory (Bullock & Boylan-Kolchin, 2017; De Baerdemaker and Boyd, 2020). It goes by other monikers as well, variously being called ‘gastrophysics’5 for gas phase astrophysics, or simply ‘baryonic physics’ for any process that might intervene between the relatively simple (and calculable) physics of collisionless cold dark matter and messy observational reality (which is entirely illuminated by the baryons). This proliferation of terminology obfuscates the boundaries of the subject and precludes a comprehensive discussion.

Feedback is not a single process, but rather a family of distinct processes. The common feature of different forms of feedback is the deposition of energy from compact sources into the surrounding gas of the interstellar medium. This can, at least in principle, heat gas and drive large-scale winds, either preventing gas from cooling and forming too many stars, or ejecting it from a galaxy outright. This in turn might affect the distribution of dark matter, though the effect is weak: one must move a lot of baryons for their gravity to impact the dark matter distribution.

There are many kinds of feedback, and many devils in the details. Massive, short-lived stars produce copious amounts of ultraviolet radiation that heats and ionizes the surrounding gas and erodes interstellar dust. These stars also produce strong winds through much of their short (~ 10 Myr) lives, and ultimately explode as Type II supernovae. These three mechanisms each act in a distinct way on different time scales. That’s just the feedback associated with massive stars; there are many other mechanisms (e.g., Type Ia supernovae are distinct from Type II supernovae, and Active Galactic Nuclei are a completely different beast entirely). The situation is extremely complicated. While the various forms of stellar feedback are readily apparent on the small scales of stars, it is far from obvious that they have the desired impact on the much larger scales of entire galaxies.

For any one kind of feedback, there can be many substantially different implementations in galaxy formation simulations. Independent numerical codes do not generally return compatible results for identical initial conditions (Scannapieco et al., 2012): there is no consensus on how feedback works. Among the many different computational implementations of feedback, at most one can be correct.

Most galaxy formation codes do not resolve the scale of single stars where stellar feedback occurs. They rely on some empirically calibrated, analytic approximation to model this ‘sub-grid physics’ — which is to say, they don’t simulate feedback at all. Rather, they simulate the accumulation of gas in one resolution element, then follow some prescription for what happens inside that unresolved box. This provides ample opportunity for disputes over the implementation and effects of feedback. For example, feedback is often cited as a way to address the cusp-core problem — or not, depending on the implementation (e.g., Benítez-Llambay et al., 2019; Bose et al., 2019; Di Cintio et al., 2014; Governato et al., 2012; Madau et al., 2014; Read et al., 2019). High resolution simulations (Bland-Hawthorn et al., 2015) indicate that the gas of the interstellar medium is less affected by feedback effects than assumed by typical sub-grid prescriptions: most of the energy is funneled through the lowest density gas — the course of least resistance — and is lost to the intergalactic medium without much impacting the galaxy in which it originates.

From the perspective of the philosophy of science, feedback is an auxiliary hypothesis invoked to patch up theories of galaxy formation. Indeed, since there are many distinct flavors of feedback that are invoked to carry out a variety of different tasks, feedback is really a suite of auxiliary hypotheses. This violates parsimony to an extreme and brutal degree.

This concern for parsimony is not specific to any particular feedback scheme; it is not just a matter of which feedback prescription is best. The entire approach is to invoke as many free parameters as necessary to solve any and all problems that might be encountered. There is little doubt that such models can be constructed to match the data, even data that bear little resemblance to the obvious predictions of the paradigm (McGaugh and de Blok, 1998a; Mo et al., 1998). So the concern is not whether ΛCDM galaxy formation models can explain the data; it is that they can’t not.


One could go on at much greater length about feedback and its impact on galaxy formation. This is pointless. It is a form of magical thinking to expect that the combined effects of numerous complicated feedback effects are going to always add up to looking like MOND in each and every galaxy. It is also the working presumption of an entire field of modern science.

Two Hypotheses

OK, basic review is over. Shit’s gonna get real. Here I give a short recounting of the primary reason I came to doubt the dark matter paradigm. This is entirely conventional – my concern about the viability of dark matter is a contradiction within its own context. It had nothing to do with MOND, which I was blissfully ignorant of when I ran head-long into this problem in 1994. Most of the community chooses to remain blissfully ignorant, which I understand: it’s way more comfortable. It is also why the field has remained mired in the ’90s, with all the apparent progress since then being nothing more than the perpetual reinvention of the same square wheel.


To make a completely generic point that does not depend on the specifics of dark matter halo profiles or the details of baryonic assembly, I discuss two basic hypotheses for the distribution of disk galaxy size at a given mass. These broad categories I label SH (Same Halo) and DD (Density begets Density) following McGaugh and de Blok (1998a). In both cases, galaxies of a given baryonic mass are assumed to reside in dark matter halos of a corresponding total mass. Hence, at a given halo mass, the baryonic mass is the same, and variations in galaxy size follow from one of two basic effects:

  • SH: variations in size follow from variations in the spin of the parent dark matter halo.
  • DD: variations in surface brightness follow from variations in the density of the dark matter halo.

Recall that at a given luminosity, size and surface brightness are not independent, so variation in one corresponds to variation in the other. Consequently, we have two distinct ideas for why galaxies of the same mass vary in size. In SH, the halo may have the same density profile ρ(r), and it is only variations in angular momentum that dictate variations in the disk size. In DD, variations in the surface brightness of the luminous disk are reflections of variations in the density profile ρ(r) of the dark matter halo. In principle, one could have a combination of both effects, but we will keep them separate for this discussion, and note that mixing them defeats the virtues of each without curing their ills.

The SH hypothesis traces back to at least Fall and Efstathiou (1980). The notion is simple: variations in the size of disks correspond to variations in the angular momentum of their host dark matter halos. The mass destined to become a dark matter halo initially expands with the rest of the universe, reaching some maximum radius before collapsing to form a gravitationally bound object. At the point of maximum expansion, the nascent dark matter halos torque one another, inducing a small but non-zero net spin in each, quantified by the dimensionless spin parameter λ (Peebles, 1969). One then imagines that as a disk forms within a dark matter halo, it collapses until it is centrifugally supported: λ → 1 from some initially small value (typically λ ​≈ ​0.05, Barnes & Efstathiou, 1987, with some modest distribution about this median value). The spin parameter thus determines the collapse factor and the extent of the disk: low spin halos harbor compact, high surface brightness disks while high spin halos produce extended, low surface brightness disks.

The distribution of primordial spins is fairly narrow, and does not correlate with environment (Barnes & Efstathiou, 1987). The narrow distribution was invoked as an explanation for Freeman’s Law: the small variation in spins from halo to halo resulted in a narrow distribution of disk central surface brightness (van der Kruit, 1987). This association, while apparently natural, proved to be incorrect: when one goes through the mathematics to transform spin into scale length, even a narrow distribution of initial spins predicts a broad distribution in surface brightness (Dalcanton, Spergel, & Summers, 1997; McGaugh and de Blok, 1998a). Indeed, it predicts too broad a distribution: to prevent the formation of galaxies much higher in surface brightness than observed, one must invoke a stability criterion (Dalcanton, Spergel, & Summers, 1997; McGaugh and de Blok, 1998a) that precludes the existence of very high surface brightness disks. While it is physically quite reasonable that such a criterion should exist (Ostriker and Peebles, 1973), the observed surface density threshold does not emerge naturally, and must be inserted by hand. It is an auxiliary hypothesis invoked to preserve SH. Once done, size variations and the trend of average size with mass work out in reasonable quantitative detail (e.g., Mo et al., 1998).
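To see why a narrow spin distribution does not give a narrow surface brightness distribution, here is a minimal sketch. It assumes the simplest scaling, in which the disk scale length is proportional to the spin at fixed mass (Rd ∝ λ), so the central surface density goes as Σ0 ∝ M/Rd² ∝ λ⁻². The lognormal width of 0.5 in ln λ is an illustrative value I have assumed, not a number from the text.

```python
import math

def surface_density_scatter(sigma_ln_spin):
    """Scatter in ln(Sigma0) at fixed mass, assuming Rd ∝ λ so Sigma0 ∝ λ^-2."""
    return 2.0 * sigma_ln_spin

def ln_scatter_to_magnitudes(sigma_ln):
    """Convert a scatter in ln(surface density) to magnitudes per square arcsec."""
    return 2.5 * sigma_ln / math.log(10.0)

sigma_ln_spin = 0.5  # assumed width of the lognormal spin distribution
sigma_ln_sigma0 = surface_density_scatter(sigma_ln_spin)
sigma_mu = ln_scatter_to_magnitudes(sigma_ln_sigma0)

# Factor in surface density spanned between -2 sigma and +2 sigma in spin.
two_sigma_range = math.exp(4.0 * sigma_ln_sigma0)
print(sigma_ln_sigma0, sigma_mu, two_sigma_range)
```

Under these assumptions the scatter in surface brightness is about one magnitude, and the ±2σ range in spin maps to a factor of roughly fifty in surface density. The squaring is what turns narrow into broad.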

Angular momentum conservation must hold for an isolated galaxy, but the assumption made in SH is stronger: baryons conserve their share of the angular momentum independently of the dark matter. It is considered a virtue that this simple assumption leads to disk sizes that are about right. However, this assumption is not well justified. Baryons and dark matter are free to exchange angular momentum with each other, and are seen to do so in simulations that track both components (e.g., Book et al., 2011; Combes, 2013; Klypin et al., 2002). There is no guarantee that this exchange is equitable, and in general it is not: as baryons collapse to form a small galaxy within a large dark matter halo, they tend to lose angular momentum to the dark matter. This is a one-way street that runs in the wrong direction, with the final destination uncomfortably invisible with most of the angular momentum sequestered in the unobservable dark matter. Worse still, if we impose rigorous angular momentum conservation among the baryons, the result is a disk with a completely unrealistic surface density profile (van den Bosch, 2001a). It then becomes necessary to pick and choose which baryons manage to assemble into the disk and which are expelled or otherwise excluded, thereby solving one problem by creating another.

Early work on LSB disk galaxies led to a rather different picture. Compared to the previously known population of HSB galaxies around which our theories had been built, the LSB galaxy population has a younger mean stellar age (de Blok & van der Hulst, 1998; McGaugh and Bothun, 1994), a lower content of heavy elements (McGaugh, 1994), and a systematically higher gas fraction (McGaugh and de Blok, 1997; Schombert et al., 1997). These properties suggested that LSB galaxies evolve more gradually than their higher surface brightness brethren: they convert their gas into stars over a much longer timescale (McGaugh et al., 2017). The obvious culprit for this difference is surface density: lower surface brightness galaxies have less gravity, hence less ability to gather their diffuse interstellar medium into dense clumps that could form stars (Gerritsen and de Blok, 1999; Mihos et al., 1999). It seemed reasonable to ascribe the low surface density of the baryons to a correspondingly low density of their parent dark matter halos.

One way to think about a region in the early universe that will eventually collapse to form a galaxy is as a so-called top-hat over-density. The mass density Ωm → 1 ​at early times, irrespective of its current value, so a spherical region (the top-hat) that is somewhat over-dense early on may locally exceed the critical density. We may then consider this finite region as its own little closed universe, and follow its evolution with the Friedmann equations with Ω ​> ​1. The top-hat will initially expand along with the rest of the universe, but will eventually reach a maximum radius and recollapse. When that happens depends on the density. The greater the over-density, the sooner the top-hat will recollapse. Conversely, a lesser over-density will take longer to reach maximum expansion before recollapsing.
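The dependence of collapse time on over-density can be sketched with the standard spherical top-hat result. This assumes an Einstein-de Sitter background (Ωm = 1, appropriate at early times), a purely growing-mode perturbation whose linear over-density grows as t^(2/3), and collapse when the linearly extrapolated over-density reaches (3/20)(12π)^(2/3) ≈ 1.686. The function names and time units are mine.

```python
import math

# Linearly extrapolated over-density at collapse for a spherical top-hat
# in an Einstein-de Sitter background (approximately 1.686).
DELTA_C = 0.15 * (12.0 * math.pi) ** (2.0 / 3.0)

def collapse_time(delta_i, t_i=1.0):
    """Collapse time of a top-hat with linear over-density delta_i at time t_i.

    Assumes growing-mode linear growth, delta ∝ t^(2/3), so the result is in
    the same (arbitrary) units as t_i.
    """
    if delta_i <= 0.0:
        raise ValueError("only over-dense regions (delta_i > 0) recollapse")
    return t_i * (DELTA_C / delta_i) ** 1.5

def turnaround_time(delta_i, t_i=1.0):
    """Time of maximum expansion: half the collapse time, by symmetry."""
    return 0.5 * collapse_time(delta_i, t_i)

t_dense = collapse_time(0.02)    # the greater over-density
t_diffuse = collapse_time(0.01)  # the lesser over-density
print(t_dense, t_diffuse, t_diffuse / t_dense)
```

Halving the initial over-density delays collapse by a factor of 2^(3/2), a little under three.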

Everything about LSB galaxies suggested that they were lower density, late-forming systems. It therefore seemed quite natural to imagine a distribution of over-densities and corresponding collapse times for top-hats of similar mass, and to associate LSB galaxies with the lesser over-densities (Dekel and Silk, 1986; McGaugh, 1992). More recently, some essential aspects of this idea have been revived under the moniker of “assembly bias” (e.g. Zehavi et al., 2018).

The work that informed the DD hypothesis was based largely on photometric and spectroscopic observations of LSB galaxies: their size and surface brightness, color, chemical abundance, and gas content. DD made two obvious predictions that had not yet been tested at that juncture. First, late-forming halos should reside preferentially in low density environments. This is a generic consequence of Gaussian initial conditions: big peaks defined on small (e.g., galaxy) scales are more likely to be found in big peaks defined on large (e.g., cluster) scales, and vice-versa. Second, the density of the dark matter halo of an LSB galaxy should be lower than that of an equal mass halo containing an HSB galaxy. This predicts a clear signature in their rotation speeds, which should be lower for lower density.

The prediction for the spatial distribution of LSB galaxies was tested by Bothun et al. (1993) and Mo et al. (1994). The test showed the expected effect: LSB galaxies were less strongly clustered than HSB galaxies. They are clustered: both galaxy populations follow the same large scale structure, but HSB galaxies adhere more strongly to it. In terms of the correlation function, the LSB sample available at the time had about half the amplitude r0 as comparison HSB samples (Mo et al., 1994). The effect was even more pronounced on the smallest scales (<2 Mpc: Bothun et al., 1993), leading Mo et al. (1994) to construct a model that successfully explained both small and large scale aspects of the spatial distribution of LSB galaxies simply by associating them with dark matter halos that lacked close interactions with other halos. This was strong corroboration of the DD hypothesis.

One way to test the prediction of DD that LSB galaxies should rotate more slowly than HSB galaxies was to use the Tully-Fisher relation (Tully and Fisher, 1977) as a point of reference. Originally identified as an empirical relation between optical luminosity and the observed line-width of single-dish 21 cm observations, more fundamentally it turns out to be a relation between the baryonic mass of a galaxy (stars plus gas) and its flat rotation speed: the Baryonic Tully-Fisher relation (BTFR: McGaugh et al., 2000). This relation is a simple power law of the form

$M_b = A V_f^4$ (equation 1)

with A ≈ 50 M☉ km⁻⁴ s⁴ (McGaugh, 2005).
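As a quick sanity check on the numbers, equation (1) can be evaluated directly. This is a minimal sketch using the normalization quoted above; the function names are mine.

```python
A_BTFR = 50.0  # normalization in Msun km^-4 s^4 (McGaugh, 2005)

def baryonic_mass(v_flat):
    """Baryonic mass (Msun) for a flat rotation speed v_flat (km/s), equation 1."""
    return A_BTFR * v_flat**4

def flat_velocity(m_baryon):
    """Flat rotation speed (km/s) for a baryonic mass m_baryon (Msun)."""
    return (m_baryon / A_BTFR) ** 0.25

m_b = baryonic_mass(130.0)   # about 1.4e10 Msun
v_f = flat_velocity(m_b)     # recovers 130 km/s
print(m_b, v_f)
```

A flat rotation speed of 130 km/s corresponds to a baryonic mass a bit over 10¹⁰ solar masses, in line with the values quoted for NGC 2403 and UGC 128 in Fig. 6. The steep fourth power means a factor of two in velocity is a factor of sixteen in mass.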

Aaronson et al. (1979) provided a straightforward interpretation for a relation of this form. A test particle orbiting a mass M at a distance R will have a circular speed V

$V^2 = GM/R$ (equation 2)

where G is Newton’s constant. If we square this, a relation like the Tully-Fisher relation follows:

$V^4 = (GM/R)^2 \propto M\Sigma$ (equation 3)

where we have introduced the surface mass density Σ = M/R². The Tully-Fisher relation M ∝ V⁴ is recovered if Σ is constant, exactly as expected from Freeman’s Law (Freeman, 1970).

LSB galaxies, by definition, have central surface brightnesses (and corresponding stellar surface densities Σ0) that are less than the Freeman value. Consequently, DD predicts, through equation (3), that LSB galaxies should shift systematically off the Tully-Fisher relation: lower Σ means lower velocity. The predicted effect is not subtle2 (Fig. 4). For the range of surface brightness that had become available, the predicted shift should have stood out like the proverbial sore thumb. It did not (Hoffman et al., 1996; McGaugh and de Blok, 1998a; Sprayberry et al., 1995; Zwaan et al., 1995). This had an immediate impact on galaxy formation theory: compare Dalcanton et al. (1995, who predict a shift in Tully-Fisher with surface brightness) with Dalcanton et al. (1997b, who do not).
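To put a number on "not subtle": equation (3) gives V⁴ ∝ MΣ, so at fixed mass the velocity should scale as Σ^(1/4). The sketch below just evaluates that scaling. It is the naive prediction of equation (3), nothing more, and the function names are mine.

```python
import math

def velocity_ratio(sigma_ratio):
    """Predicted V_low / V_high at fixed mass, from V^4 ∝ M Σ (equation 3)."""
    if sigma_ratio <= 0.0:
        raise ValueError("surface density ratio must be positive")
    return sigma_ratio ** 0.25

def velocity_shift_dex(sigma_ratio):
    """The same prediction expressed as a shift in log10(V)."""
    return math.log10(velocity_ratio(sigma_ratio))

ratio = velocity_ratio(0.1)      # a factor of ten lower surface density
shift = velocity_shift_dex(0.1)  # a quarter of a dex slower
print(ratio, shift)
```

A factor of ten in surface density should move a galaxy by a quarter of a dex in velocity, which is a large offset compared to the scatter of a tight relation.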

Fig. 4. The Baryonic Tully-Fisher relation and residuals. The top panel shows the flat rotation velocity of galaxies in the SPARC database (Lelli et al., 2016a) as a function of the baryonic mass (stars plus gas). The sample is restricted to those objects for which both quantities are measured to better than 20% accuracy. The bottom panel shows velocity residuals around the solid line in the top panel as a function of the central surface density of the stellar disks. Variations in the stellar surface density predict variations in velocity along the dashed line. These would translate to shifts illustrated by the dotted lines in the top panel, with each dotted line representing a shift of a factor of ten in surface density. The predicted dependence on surface density is not observed (Courteau & Rix, 1999; McGaugh and de Blok, 1998a; Sprayberry et al., 1995; Zwaan et al., 1995).

Instead of the systematic variation of velocity with surface brightness expected at fixed mass, there was none. Indeed, there is no hint of a second parameter dependence. The relation is incredibly tight by the standards of extragalactic astronomy (Lelli et al., 2016b): baryonic mass and the flat rotation speed are practically interchangeable.

The above derivation is overly simplistic. The radius at which we should make a measurement is ill-defined, and the surface density is dynamical: it includes both stars and dark matter. Moreover, galaxies are not spherical cows: one needs to solve the Poisson equation for the observed disk geometry of LTGs, and account for the varying radial contributions of luminous and dark matter. While this can be made to sound intimidating, the numerical computations are straightforward and rigorous (e.g., Begeman et al., 1991; Casertano & Shostak, 1980; Lelli et al., 2016a). It still boils down to the same sort of relation (modulo geometrical factors of order unity), but with two mass distributions: one for the baryons Mb(R), and one for the dark matter MDM(R). Though the dark matter is more massive, it is also more extended. Consequently, both components can contribute non-negligibly to the rotation over the observed range of radii:

$V^2(R) = GM/R = G(M_b/R + M_{DM}/R)$ (equation 4)

where for clarity we have omitted* geometrical factors. The only absolute requirement is that the baryonic contribution should begin to decline once the majority of baryonic mass is encompassed. It is when rotation curves persist in remaining flat past this point that we infer the need for dark matter.
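The decline of the baryonic term is easy to see numerically. This sketch takes the baryonic part of equation (4) in the same spherical simplification, with the enclosed mass of an exponential disk. The disk mass (10¹⁰ M☉) and scale length (2 kpc) are illustrative values I have assumed, not fits to any galaxy.

```python
import math

G = 4.30091e-6  # Newton's constant in kpc (km/s)^2 / Msun

def enclosed_disk_mass(r, m_disk, r_d):
    """Mass of an exponential disk inside radius r (kpc), scale length r_d (kpc)."""
    x = r / r_d
    return m_disk * (1.0 - (1.0 + x) * math.exp(-x))

def v_baryon(r, m_disk, r_d):
    """Baryonic term of equation 4, V^2 = G M_b(R) / R, geometry factors omitted."""
    return math.sqrt(G * enclosed_disk_mass(r, m_disk, r_d) / r)

M_DISK, R_D = 1.0e10, 2.0                  # assumed illustrative values
radii = [0.25 * i for i in range(1, 121)]  # 0.25 to 30 kpc
speeds = [v_baryon(r, M_DISK, R_D) for r in radii]

peak_index = max(range(len(radii)), key=speeds.__getitem__)
r_peak, v_peak = radii[peak_index], speeds[peak_index]
v_outer = speeds[-1]                       # close to Keplerian by 30 kpc
print(r_peak, v_peak, v_outer)
```

The baryonic contribution peaks at a couple of scale lengths and falls off in Keplerian fashion beyond that. Whatever keeps the observed curve flat out there has to come from the other term.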

A recurrent problem in testing galaxy formation theories is that they seldom make ironclad predictions; I attempt a brief summary in Table 1. SH represents a broad class of theories with many variants. By construction, the dark matter halos of galaxies of similar stellar mass are similar. If we associate the flat rotation velocity with halo mass, then galaxies of the same mass have the same circular velocity, and the problem posed by Tully-Fisher is automatically satisfied.

Table 1. Predictions of DD and SH for LSB galaxies.

Observation                 DD    SH
Evolutionary rate           +     +
Size distribution           +     +
Clustering                  +     X
Tully-Fisher relation       X     ?
Central density relation    +     X

While it is common to associate the flat rotation speed with the dark matter halo, this is a half-truth: the observed velocity is a combination of baryonic and dark components (eq. (4)). It is thus a rather curious coincidence that rotation curves are as flat as they are: the Keplerian decline of the baryonic contribution must be precisely balanced by an increasing contribution from the dark matter halo. This fine-tuning problem was dubbed the “disk-halo conspiracy” (Bahcall & Casertano, 1985; van Albada & Sancisi, 1986). The solution offered for the disk-halo conspiracy was that the formation of the baryonic disk has an effect on the distribution of the dark matter. As the disk settles, the dark matter halo responds through a process commonly referred to as adiabatic compression that brings the peak velocities of disk and dark components into alignment (Blumenthal et al., 1986). Some rearrangement of the dark matter halo in response to the change of the gravitational potential caused by the settling of the disk is inevitable, so this seemed a plausible explanation.

The observation that LSB galaxies obey the Tully-Fisher relation greatly compounds the fine-tuning (McGaugh and de Blok, 1998a; Zwaan et al., 1995). The amount of adiabatic compression depends on the surface density of stars (Sellwood and McGaugh, 2005b): HSB galaxies experience greater compression than LSB galaxies. This should enhance the predicted shift between the two in Tully-Fisher. Instead, the amplitude of the flat rotation speed remains unperturbed.

The generic failings of dark matter models were discussed at length by McGaugh and de Blok (1998a). The same problems have been encountered by others. For example, Fig. 5 shows model galaxies formed in a dark matter halo with identical total mass and density profile but with different spin parameters (van den Bosch, 2001b). Variations in the assembly and cooling history were also considered, but these make little difference and are not relevant here. The point is that smaller (larger) spin parameters lead to more (less) compact disks that contribute more (less) to the total rotation, exactly as anticipated from variations in the term Mb/R in equation (4). The nominal variation is readily detectable, and stands out prominently in the Tully-Fisher diagram (Fig. 5). This is exactly the same fine-tuning problem that was pointed out by Zwaan et al. (1995) and McGaugh and de Blok (1998a).
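A toy version of this experiment shows the effect of the Mb/R term. This is not the van den Bosch (2001b) model: it is a sketch of equation (4) in its spherical simplification, with two exponential disks of the same mass but different scale lengths placed in one fixed pseudo-isothermal halo. The halo form and every parameter value are assumptions chosen only for illustration.

```python
import math

G = 4.30091e-6  # Newton's constant in kpc (km/s)^2 / Msun

def disk_mass(r, m_disk, r_d):
    """Enclosed mass of an exponential disk with scale length r_d (kpc)."""
    x = r / r_d
    return m_disk * (1.0 - (1.0 + x) * math.exp(-x))

def halo_mass(r, v_inf, r_core):
    """Enclosed mass of an assumed pseudo-isothermal halo, rho ∝ 1/(1 + (r/r_core)^2)."""
    return (v_inf**2 / G) * (r - r_core * math.atan(r / r_core))

def total_velocity(r, m_disk, r_d, v_inf=130.0, r_core=5.0):
    """Equation 4 with both components: V^2 = G (M_b + M_DM) / R."""
    return math.sqrt(G * (disk_mass(r, m_disk, r_d) + halo_mass(r, v_inf, r_core)) / r)

M_DISK = 1.0e10                     # same baryonic mass for both models
COMPACT_RD, EXTENDED_RD = 1.5, 6.0  # kpc; stand-ins for low and high spin

inner = [total_velocity(3.0, M_DISK, rd) for rd in (COMPACT_RD, EXTENDED_RD)]
outer = [total_velocity(60.0, M_DISK, rd) for rd in (COMPACT_RD, EXTENDED_RD)]
print(inner, outer)
```

With the halo held fixed, the compact disk rotates much faster than the extended one at the radii where the disk is actually observed, and the two curves only converge far out where the halo dominates. Keeping the measured velocity the same for both requires the halo to know about the disk.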

What I describe as a fine-tuning problem is not portrayed as such by van den Bosch (2000) and van den Bosch and Dalcanton (2000), who argued that the data could be readily accommodated in the dark matter picture. The difference is between accommodating the data once known, and predicting it a priori. The dark matter picture is extraordinarily flexible: one is free to distribute the dark matter as needed to fit any data that evinces a non-negative mass discrepancy, even data that are wrong (de Blok & McGaugh, 1998). It is another matter entirely to construct a realistic model a priori; in my experience it is quite easy to construct models with plausible-seeming parameters that bear little resemblance to real galaxies (e.g., the low-spin case in Fig. 5). A similar conundrum is encountered when constructing models that can explain the long tidal tails observed in merging and interacting galaxies: models with realistic rotation curves do not produce realistic tidal tails, and vice-versa (Dubinski et al., 1999). The data occupy a very narrow sliver of the enormous volume of parameter space available to dark matter models, a situation that seems rather contrived.

Fig. 5. Model galaxy rotation curves and the Tully-Fisher relation. Rotation curves (left panel) for model galaxies of the same mass but different spin parameters λ from van den Bosch (2001b, see his Fig. 3). Models with lower spin have more compact stellar disks that contribute more to the rotation curve (V² = GM/R; R being smaller for the same M). These models are shown as square points on the Baryonic Tully-Fisher relation (right) along with data for real galaxies (grey circles: Lelli et al., 2016b) and a fit thereto (dashed line). Differences in the cooling history result in modest variation in the baryonic mass at fixed halo mass as reflected in the vertical scatter of the models. This is within the scatter of the data, but variation due to the spin parameter is not.

Both DD and SH predict residuals from Tully-Fisher that are not observed. I consider this to be an unrecoverable failure for DD, which was my hypothesis (McGaugh, 1992), so I worked hard to salvage it. I could not. For SH, Tully-Fisher might be recovered in the limit of dark matter domination, which requires further consideration.


I will save the further consideration for a future post, as that can take infinite words (there are literally thousands of ApJ papers on the subject). The real problem that rotation curve data pose generically for the dark matter interpretation is the fine-tuning required between baryonic and dark matter components – the balancing act explicit in the equations above. This, by itself, constitutes a practical falsification of the dark matter paradigm.

Without going into interesting but ultimately meaningless details (maybe next time), the only way to avoid this conclusion is to choose to be unconcerned with fine-tuning. If you choose to say fine-tuning isn’t a problem, then it isn’t a problem. Worse, many scientists don’t seem to understand that they’ve even made this choice: it is baked into their assumptions. There is no risk of questioning those assumptions if one never stops to think about them, much less worry that there might be something wrong with them.

Much of the field seems to have sunk into a form of scientific nihilism. The attitude I frequently encounter when I raise this issue boils down to “Don’t care! Everything will magically work out! LA LA LA!”


*Strictly speaking, eq. (4) only holds for spherical mass distributions. I make this simplification here to emphasize the fact that both mass and radius matter. This essential scaling persists for any geometry: the argument holds in complete generality.

Common ground

In order to agree on an interpretation, we first have to agree on the facts. Even when we agree on the facts, the available set of facts may admit multiple interpretations. This was an obvious and widely accepted truth early in my career*. Since then, the field has decayed into a haphazardly conceived set of unquestionable absolutes that are based on a large but well-curated subset of facts that gratuitously ignores any subset of facts that are inconvenient.

Sadly, we seem to have entered a post-truth period in which facts are drowned out by propaganda. I went into science to get away from people who place faith before facts, and comfortable fictions ahead of uncomfortable truths. Unfortunately, a lot of those people seem to have followed me here. This manifests as people who quote what are essentially pro-dark matter talking points at me like I don’t understand LCDM, when all it really does is reveal that they are posers** who picked up on some common myths about the field without actually reading the relevant journal articles.

Indeed, a recent experience taught me a new psychology term: identity protective cognition. Identity protective cognition is the tendency for people in a group to selectively credit or dismiss evidence in patterns that reflect the beliefs that predominate in their group. When it comes to dark matter, the group happens to be a scientific one, but the psychology is the same: I’ve seen people twist themselves into logical knots to protect their belief in dark matter from being subject to critical examination. They do it without even recognizing that this is what they’re doing. I guess this is a human foible we cannot escape.

I’ve addressed these issues before, but here I’m going to start a series of posts on what I think some of the essential but underappreciated facts are. This is based on a talk that I gave at a conference on the philosophy of science in 2019, back when we had conferences, and published in Studies in History and Philosophy of Science. I paid the exorbitant open access fee (the journal changed its name – and publication policy – during the publication process), so you can read the whole thing all at once if you are eager. I’ve already written it to be accessible, so mostly I’m going to post it here in what I hope are digestible chunks, and may add further commentary if it seems appropriate.

Cosmic context

Cosmology is the science of the origin and evolution of the universe: the biggest of big pictures. The modern picture of the hot big bang is underpinned by three empirical pillars: an expanding universe (Hubble expansion), Big Bang Nucleosynthesis (BBN: the formation of the light elements through nuclear reactions in the early universe), and the relic radiation field (the Cosmic Microwave Background: CMB) (Harrison, 2000; Peebles, 1993). The discussion here will take this framework for granted.

The three empirical pillars fit beautifully with General Relativity (GR). Making the simplifying assumptions of homogeneity and isotropy, Einstein’s equations can be applied to treat the entire universe as a dynamical entity. As such, it is compelled either to expand or contract. Running the observed expansion backwards in time, one necessarily comes to a hot, dense, early phase. This naturally explains the CMB, which marks the transition from an opaque plasma to a transparent gas (Sunyaev and Zeldovich, 1980; Weiss, 1980). The abundances of the light elements can be explained in detail with BBN provided the universe expands in the first few minutes as predicted by GR when radiation dominates the mass-energy budget of the universe (Boesgaard & Steigman, 1985).

The marvelous consistency of these early universe results with the expectations of GR builds confidence that the hot big bang is the correct general picture for cosmology. It also builds overconfidence that GR is completely sufficient to describe the universe. Maintaining consistency with modern cosmological data is only possible with the addition of two auxiliary hypotheses: dark matter and dark energy. These invisible entities are an absolute requirement of the current version of the most-favored cosmological model, ΛCDM. The very name of this model is born of these dark materials: Λ is Einstein’s cosmological constant, of which ‘dark energy’ is a generalization, and CDM is cold dark matter.

Dark energy does not enter much into the subject of galaxy formation. It mainly helps to set the background cosmology in which galaxies form, and plays some role in the timing of structure formation. This discussion will not delve into such details, and I note only that it was surprising and profoundly disturbing that we had to reintroduce (e.g., Efstathiou et al., 1990; Ostriker and Steinhardt, 1995; Perlmutter et al., 1999; Riess et al., 1998; Yoshii and Peterson, 1995) Einstein’s so-called ‘greatest blunder.’

Dark matter, on the other hand, plays an intimate and essential role in galaxy formation. The term ‘dark matter’ is dangerously crude, as it can reasonably be used to mean anything that is not seen. In the cosmic context, there are at least two forms of unseen mass: normal matter that happens not to glow in a way that is easily seen — not all ordinary material need be associated with visible stars — and non-baryonic cold dark matter. It is the latter form of unseen mass that is thought to dominate the mass budget of the universe and play a critical role in galaxy formation.

Cold Dark Matter

Cold dark matter is some form of slow moving, non-relativistic (‘cold’) particulate mass that is not composed of normal matter (baryons). Baryons are the family of particles that include protons and neutrons. As such, they compose the bulk of the mass of normal matter, and it has become conventional to use this term to distinguish between normal, baryonic matter and the non-baryonic dark matter.

The distinction between baryonic and non-baryonic dark matter is no small thing. Non-baryonic dark matter must be a new particle that resides in a new ‘dark sector’ that is completely distinct from the usual stable of elementary particles. We do not just need some new particle, we need one (or many) that reside in some sector beyond the framework of the stubbornly successful Standard Model of particle physics. Whatever the solution to the mass discrepancy problem turns out to be, it requires new physics.

The cosmic dark matter must be non-baryonic for two basic reasons. First, the mass density of the universe measured gravitationally (Ωm ≈ 0.3, e.g., Faber and Gallagher, 1979; Davis et al., 1980, 1992) clearly exceeds the mass density in baryons as constrained by BBN (Ωb ≈ 0.05, e.g., Walker et al., 1991). There is something gravitating that is not ordinary matter: Ωm > Ωb.

The second reason follows from the absence of large fluctuations in the CMB (Peebles and Yu, 1970; Silk, 1968; Sunyaev and Zeldovich, 1980). The CMB is extraordinarily uniform in temperature across the sky, varying by only ~1 part in 10^5 (Smoot et al., 1992). These small temperature variations correspond to variations in density. Gravity is an attractive force; it will make the rich grow richer. Small density excesses will tend to attract more mass, making them larger, attracting more mass, and leading to the formation of large scale structures, including galaxies. But gravity is also a weak force: this process takes a long time. In the long but finite age of the universe, gravity plus known baryonic matter does not suffice to go from the initially smooth, highly uniform state of the early universe to the highly clumpy, structured state of the local universe (Peebles, 1993). The solution is to boost the process with an additional component of mass (the cold dark matter) that gravitates without interacting with the photons, thus getting a head start on the growth of structure while not aggravating the amplitude of temperature fluctuations in the CMB.

Taken separately, one might argue away the need for dark matter. Taken together, these two distinct arguments convinced nearly everyone, including myself, of the absolute need for non-baryonic dark matter. Consequently, CDM became established as the leading paradigm during the 1980s (Peebles, 1984; Steigman and Turner, 1985). The paradigm has snowballed since that time, the common attitude among cosmologists being that CDM has to exist.

From an astronomical perspective, the CDM could be any slow-moving, massive object that does not interact with photons nor participate in BBN. The range of possibilities is at once limitless yet highly constrained. Neutrons would suffice if they were stable in vacuum, but they are not. Primordial black holes are a logical possibility, but if made of normal matter, they must somehow form in the first second after the Big Bang to not impair BBN. At this juncture, microlensing experiments have excluded most plausible mass ranges that primordial black holes could occupy (Mediavilla et al., 2017). It is easy to invent hypothetical dark matter candidates, but difficult for them to remain viable.

From a particle physics perspective, the favored candidate is a Weakly Interacting Massive Particle (WIMP: Peebles, 1984; Steigman and Turner, 1985). WIMPs are expected to be the lightest stable supersymmetric partner particle that resides in the hypothetical supersymmetric sector (Martin, 1998). The WIMP has been the odds-on favorite for so long that it is often used synonymously with the more generic term ‘dark matter.’ It is the hypothesized particle that launched a thousand experiments. Experimental searches for WIMPs have matured over the past several decades, making extraordinary progress in not detecting dark matter (Aprile et al., 2018). Virtually all of the parameter space in which WIMPs had been predicted to reside (Trotta et al., 2008) is now excluded. Worse, the existence of the supersymmetric sector itself, once seemingly a sure thing, remains entirely hypothetical, and appears at this juncture to be a beautiful idea that nature declined to implement.

In sum, we must have cold dark matter for both galaxies and cosmology, but we have as yet no clue to what it is.


* There is a trope that late in their careers, great scientists come to the opinion that everything worth discovering has been discovered, because they themselves already did everything worth doing. That is not a concern I have – I know we haven’t discovered all there is to discover. Yet I see no prospect for advancing our fundamental understanding simply because there aren’t enough of us pulling in the right direction. Most of the community is busy barking up the wrong tree, and refuses to be distracted from their focus on the invisible squirrel that isn’t there.

** Many of these people are the product of the toxic culture that Simon White warned us about. They wave the sausage of galaxy formation and feedback like a magic wand that excuses all faults while being proudly ignorant of how the sausage was made. Bitch, please. I was there when that sausage was made. I helped make the damn sausage. I know what went into it, and I recognize when it tastes wrong.

Galaxy models in compressed halos

The last post was basically an introduction to this one, which is about the recent work of Pengfei Li. In order to test a theory, we need to establish its prior. What do we expect?

The prior for fully formed galaxies after 13 billion years of accretion and evolution is not an easy problem. The dark matter halos need to form first, with the baryonic component assembling afterwards. We know from dark matter-only structure formation simulations that the initial condition (A) of the dark matter halo should resemble an NFW halo, and from observations that the end product of baryonic assembly needs to look like a real galaxy (Z). How the universe gets from A to Z is a whole alphabet of complications.

The simplest thing we can do is ignore B-Y and combine a model galaxy with a model dark matter halo. The simplest model for a spiral galaxy is an exponential disk. True to its name, the azimuthally averaged stellar surface density falls off exponentially from a central value over some scale length. This is a tolerable approximation of the stellar disks of spiral galaxies, ignoring their central bulges and their gas content. It is an inadequate yet surprisingly decent starting point for describing gravitationally bound collections of hundreds of billions of stars with just two parameters.
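As a minimal sketch of that two-parameter model, the surface density and the mass enclosed within a given radius can be written down directly. The function names and the numerical values here are illustrative assumptions, not numbers from the paper.

```python
import math

def surface_density(R, sigma0, Rd):
    """Exponential disk: Sigma(R) = Sigma0 * exp(-R / Rd)."""
    return sigma0 * math.exp(-R / Rd)

def enclosed_mass(R, sigma0, Rd):
    """Mass within radius R, from integrating 2*pi*R*Sigma(R) dR."""
    x = R / Rd
    return 2.0 * math.pi * sigma0 * Rd**2 * (1.0 - (1.0 + x) * math.exp(-x))

def total_mass(sigma0, Rd):
    """Total disk mass: 2*pi*Sigma0*Rd^2."""
    return 2.0 * math.pi * sigma0 * Rd**2

# Illustrative (assumed) values: Sigma0 in Msun/pc^2, Rd in pc.
sigma0, Rd = 500.0, 3000.0
M_total = total_mass(sigma0, Rd)
frac_within_one_Rd = enclosed_mass(Rd, sigma0, Rd) / M_total
```

The two parameters are the central surface density and the scale length; everything else about the disk, including its total mass, follows from them.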

So a basic galaxy model is an exponential disk in an NFW dark matter halo. This is the type of model I discussed in the last post, the kind I was considering two decades ago, and the kind of model still frequently considered. It is an obvious starting point. However, we know that this starting point is not adequate. On the baryonic side, we should model all the major mass components: bulge, disk, and gas. On the halo side, we need to understand how the initial halo depends on its assembly history and how it is modified by the formation of the luminous galaxy within it. The common approach to do all that is to run a giant cosmological simulation and watch what happens. That’s great, provided we know how to model all the essential physics. The action of gravity in an expanding universe we can compute well enough, but we do not enjoy the same ability to calculate the various non-gravitational effects of baryons.

Rather than blindly accept the outcome of simulations that have become so complicated that no one really seems to understand them, it helps to break the problem down into its basic steps. There is a lot going on, but what we’re concerned about here boils down to a tug of war between two competing effects: adiabatic compression tends to concentrate the dark matter, while feedback tends to redistribute it outwards.

Adiabatic compression refers to the response of the dark matter halo to infalling baryons. Though this name stuck, the process isn’t necessarily adiabatic, and the A-word tends to blind people to a generic and inevitable physical process. As baryons condense into the centers of dark matter halos, the gravitational potential is non-stationary. The distribution of dark matter has to respond to this redistribution of mass: the infall of dissipating baryons drags some dark matter in with them, so we expect dark matter halos to become more centrally concentrated. The most common approach to computing this effect is to assume the process is adiabatic (hence the name). This means a gentle settling that is gradual enough to be time-reversible: you can imagine running the movie backwards, unlike a sudden, violent event like a car crash. It needn’t be rigorously adiabatic, but the compressive response of the halo is inevitable. Indeed, forming a thin, dynamically cold, well-organized rotating disk in a preferred plane – i.e., a spiral galaxy – pretty much requires a period during which the adiabatic assumption is a decent approximation. There is a history of screwing up even this much, but Jerry Sellwood showed that it could be done correctly and that when one does so, it reproduces the results of more expensive numerical simulations. This provides a method to go beyond a simple exponential disk in an NFW halo: we can compute what happens to an NFW halo in response to an observed mass distribution.
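To make the idea concrete, here is a rough sketch of the simplest textbook version of this calculation: the circular-orbit approximation usually attributed to Blumenthal et al. (1986), in which r × M(<r) is conserved for each shell of dark matter. This is a swapped-in simplification, not the more careful calculation credited to Sellwood above and not the method behind the results shown here. It also assumes the baryons initially trace the dark matter with a fixed fraction f_b and end up in an exponential disk treated as spherical; all numbers are illustrative.

```python
import math

def nfw_mass(r, M_tot, rs, r_vir):
    """Initial enclosed mass: NFW profile normalized so that M(r_vir) = M_tot."""
    def mu(x):
        return math.log1p(x) - x / (1.0 + x)
    return M_tot * mu(r / rs) / mu(r_vir / rs)

def baryon_mass(r, M_b, Rd):
    """Final enclosed baryonic mass: an exponential disk, treated as spherical."""
    x = r / Rd
    return M_b * (1.0 - (1.0 + x) * math.exp(-x))

def contracted_radius(r_i, M_tot, rs, r_vir, f_b, Rd):
    """Final radius r_f of the dark matter shell that started at r_i.

    Solves r_f * [M_b(r_f) + (1 - f_b) * M_i(r_i)] = r_i * M_i(r_i) by bisection.
    """
    M_i = nfw_mass(r_i, M_tot, rs, r_vir)
    M_dm = (1.0 - f_b) * M_i          # dark mass inside the shell is conserved
    target = r_i * M_i

    def excess(r_f):
        return r_f * (baryon_mass(r_f, f_b * M_tot, Rd) + M_dm) - target

    lo, hi = 0.0, r_i
    while excess(hi) < 0.0:           # widen the bracket if the shell expands
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative (assumed) numbers: masses in Msun, lengths in kpc.
M_tot, rs, r_vir, f_b, Rd = 1.0e12, 20.0, 200.0, 0.05, 3.0
r_i = 10.0
r_f = contracted_radius(r_i, M_tot, rs, r_vir, f_b, Rd)
```

With these numbers the shell that started at 10 kpc ends up at a smaller radius: the same dark mass is packed into a smaller volume, which is the compression described above.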

After infall and compression, baryons form stars that produce energy in the form of radiation, stellar winds, and the blast waves of supernova explosions. These are sources of energy that complicate what until now has been a straightforward calculation of gravitational dynamics. With sufficient coupling to the surrounding gas, these energy sources might be converted into enough kinetic energy to alter the equilibrium mass distribution and the corresponding gravitational potential. I say might because we don’t really know how this works, and it is a lot more complicated than I’ve made it sound. So let’s not go there, and instead just calculate the part we do know how to calculate. What happens from the inevitable adiabatic compression in the limit of zero feedback?

We have calculated this for a grid of model galaxies that matches the observed distribution of real galaxies. This is important; it often happens that people do not explore a realistic parameter space. Here is a plot of size against stellar mass:

The size of galaxy disks as measured by the exponential scale length as a function of stellar mass. Grey points are real galaxies; red circles are model galaxies with parameters chosen to cover the same parameter space. This, and all plots, from Li et al. (2022).

Note that at a given stellar mass, there is a wide range of sizes. This is an essential aspect of galaxy properties; one has to explain size variations as well as the trend with mass. This obvious point has been frequently forgotten and rediscovered in the literature.

The two parameter plot above only suffices to approximate the stellar disks of spiral and irregular galaxies. Real galaxies have bulges and interstellar gas. We include these in our models so that they cover the same distribution as real galaxies in terms of bulge mass, size, and gas fraction. We then assign a dark matter halo to each model galaxy using an abundance matching relation (the stellar mass tells us the halo mass) and adopt the cosmologically appropriate halo mass-concentration relation. These specify the initial condition of the NFW halo in which each model galaxy is presumed to reside.
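Once a halo mass and concentration are in hand, the initial NFW halo is fully specified. The sketch below shows that last step only. The halo mass and concentration are supplied by hand as assumed, illustrative values, standing in for what abundance matching and the mass-concentration relation would provide; the choice of h = 0.7 and the 200-times-critical mass definition are likewise assumptions.

```python
import math

G = 4.30091e-6               # gravitational constant, kpc (km/s)^2 / Msun
RHO_CRIT = 277.5 * 0.7**2    # critical density, Msun / kpc^3, assuming h = 0.7

def nfw_mu(x):
    """Dimensionless NFW mass function: ln(1 + x) - x / (1 + x)."""
    return math.log1p(x) - x / (1.0 + x)

def r200_from_mass(M200):
    """Radius enclosing a mean density of 200 times critical."""
    return (3.0 * M200 / (4.0 * math.pi * 200.0 * RHO_CRIT)) ** (1.0 / 3.0)

def nfw_enclosed_mass(r, M200, c):
    """Mass within radius r (kpc) for an NFW halo of mass M200 and concentration c."""
    rs = r200_from_mass(M200) / c   # scale radius
    return M200 * nfw_mu(r / rs) / nfw_mu(c)

def nfw_vcirc(r, M200, c):
    """Circular speed in km/s from V^2 = G M(<r) / r."""
    return math.sqrt(G * nfw_enclosed_mass(r, M200, c) / r)

# Illustrative (assumed) halo: in the text these two numbers would come from
# abundance matching and the mass-concentration relation, respectively.
M200, c = 1.0e12, 10.0
r200 = r200_from_mass(M200)
v200 = nfw_vcirc(r200, M200, c)
```

For this assumed halo the virial radius comes out near 200 kpc and the circular speed there near 145 km/s.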

At this point, it is worth remarking that there are a variety of abundance matching relations in the literature. Some of these give tragically bad predictions for the kinematics. I won’t delve into this here, but do want to note that in what follows, we have adopted the most favorable abundance matching relation, which turns out to be that of Kravtsov et al. (2018). Note that this means that we are already engaged in a kind of fine-tuning by cherry-picking the most favorable relation.

Before considering adiabatic compression, let’s see what happens if we simply add our model galaxies to NFW halos. This is the same exercise we did last time with exponential disks; now we’re including bulges and gas:

Galaxy models in the RAR plane. Models are color coded by their stellar surface density. The dotted line is 1:1 (Newton with no dark matter or other funny business). The black line is the fit to the observed RAR.

This looks pretty good, at least at a first glance. Most of the models fall nearly on top of each other. This isn’t entirely true, as the most massive models overpredict the RAR. This is a generic consequence of the bend in abundance matching relations. This bend is mildest in the Kravtsov relation, which is what makes it “best” here – other relations, like the commonly cited one of Behroozi, predict a lot more high-acceleration models. One sees only a hint of that here.

The scatter is respectably small, mostly solving the problem I initially encountered in the nineties. Despite predicting a narrow relation, the models do have a finite scatter that is a bit more than we observe. This isn’t too tragic, so maybe we can work with it. These models also miss the low acceleration end of the relation by a modest but appreciable amount. This seems more significant, as we found the same thing for pure exponential models: it is hard to make this part of the problem go away.

Including bulges in the models extends them to high accelerations. This would seem to explain a region of the RAR that pure exponential models do not address. Bulges are high surface density, star dominated regions, so they fall on the 1:1 part of the RAR at high accelerations.

And then there are the hooks. These are obvious in the plot above. They occur in low and intermediate mass galaxies that lack a significant bulge component. A pure exponential disk has a peak acceleration at finite radius, but an NFW halo has its peak at zero radius. So if you imagine following a given model line inwards in radius, it goes up in acceleration until it reaches the maximum for the disk along the x-axis. The baryonic component of the acceleration then starts to decline while that due to the NFW halo continues to rise. The model doubles back to lower baryonic acceleration while continuing to higher total acceleration, making the little hook shape. This deviation from the RAR is not commonly observed; indeed, these hooks are the signature of the cusp-core problem in the RAR plane.
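The hook can be reproduced with a few lines of arithmetic. The sketch below uses the standard thin-disk formula of Freeman (1970) for the exponential disk and G M(<r)/r² for the NFW halo. The Bessel functions are hand-rolled from their series and integral representations to keep the example dependency-free (scipy.special provides i0, i1, k0 and k1 for real use). The disk and halo parameters are assumed, illustrative values chosen to be dark matter dominated, not models from the paper.

```python
import math

G = 4.30091e-6               # kpc (km/s)^2 / Msun
RHO_CRIT = 277.5 * 0.7**2    # Msun / kpc^3, assuming h = 0.7

def bessel_i(n, x, terms=30):
    """Modified Bessel function I_n(x) for integer n >= 0, from its power series."""
    return sum((x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def bessel_k(n, x, t_max=20.0, steps=4000):
    """K_n(x) as the integral of exp(-x cosh t) cosh(n t) over t >= 0 (trapezoid rule)."""
    h = t_max / steps
    total = 0.0
    for j in range(steps + 1):
        t = j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(n * t)
    return total * h

def g_disk(R, M_disk, Rd):
    """Midplane radial acceleration of a thin exponential disk (Freeman 1970)."""
    y = R / (2.0 * Rd)
    bessel_term = bessel_i(0, y) * bessel_k(0, y) - bessel_i(1, y) * bessel_k(1, y)
    return G * M_disk / Rd**2 * y * bessel_term

def nfw_mu(x):
    """Dimensionless NFW mass function: ln(1 + x) - x / (1 + x)."""
    return math.log1p(x) - x / (1.0 + x)

def g_nfw(r, M200, c):
    """Radial acceleration G M(<r) / r^2 of an NFW halo."""
    r200 = (3.0 * M200 / (800.0 * math.pi * RHO_CRIT)) ** (1.0 / 3.0)
    return G * M200 * nfw_mu(r * c / r200) / (nfw_mu(c) * r * r)

# Illustrative (assumed) low-mass galaxy: masses in Msun, lengths in kpc.
M_disk, Rd = 3.0e8, 1.5
M200, c = 1.0e11, 12.0
radii = [Rd * 0.05 * 100.0 ** (i / 49.0) for i in range(50)]   # 0.05 to 5 Rd
g_bar = [g_disk(R, M_disk, Rd) for R in radii]
g_tot = [gb + g_nfw(R, M200, c) for gb, R in zip(g_bar, radii)]
i_peak = max(range(len(radii)), key=lambda i: g_bar[i])
```

Walking inward from `radii[i_peak]`, the entries of `g_bar` decline while those of `g_tot` keep rising, so a plot of `g_tot` against `g_bar` doubles back on itself. Accelerations here are in (km/s)²/kpc.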

Results so far are mixed. With the “right” choice of abundance matching relation, we are well ahead of where we were at the turn of the century, but some real problems remain. We have yet to compute the necessary adiabatic contraction, so hopefully doing that right will result in further improvement. So let’s make a rigorous calculation of the compression that would result from forming a galaxy of the stipulated parameters.

Galaxy models in the RAR plane after compression.

Adiabatic compression makes things worse. There is a tiny improvement at low accelerations, but the most pronounced effects are at small radii where accelerations are large. Compression makes cuspy halos cuspier, making the hooks more pronounced. Worse, the strong concentration of starlight that is a bulge inevitably leads to strong compression. These models don’t approach the 1:1 line at high acceleration, and never can: higher acceleration means higher stellar surface density means greater compression. One cannot start from an NFW halo and ever reach a state of baryon domination; too much dark matter is always in the mix.

It helps to look at the residual diagram. The RAR is a log-log plot over a large dynamic range; this can hide small but significant deviations. For some reason, people who claim to explain the RAR with dark matter models never seem to show these residuals.

As above, with the observed RAR divided out. Model galaxies are mostly above the RAR. The cusp-core problem is exacerbated in disks, and bulges never reach the 1:1 line at high accelerations.
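For anyone who wants to make this kind of residual plot for their own models, here is a minimal sketch. It assumes the commonly used fitting function for the observed RAR, g_obs = g_bar / (1 − exp(−√(g_bar/g†))) with g† ≈ 1.2 × 10⁻¹⁰ m/s², taken from the published fit rather than stated in this post. The model point at the end is hypothetical.

```python
import math

G_DAGGER = 1.2e-10                   # m/s^2, acceleration scale of the RAR fit (assumed)
KMS2_PER_KPC = 1.0e6 / 3.0857e19     # converts (km/s)^2/kpc to m/s^2

def rar_fit(g_bar):
    """Fitted RAR: g_obs = g_bar / (1 - exp(-sqrt(g_bar / g_dagger))), in m/s^2."""
    return g_bar / (-math.expm1(-math.sqrt(g_bar / G_DAGGER)))

def rar_residual(g_bar, g_tot):
    """Residual in dex: positive when the model lies above the observed relation."""
    return math.log10(g_tot / rar_fit(g_bar))

# A hypothetical model point: g_bar = 150 and g_tot = 1900 in (km/s)^2/kpc.
g_bar = 150.0 * KMS2_PER_KPC
g_tot = 1900.0 * KMS2_PER_KPC
residual = rar_residual(g_bar, g_tot)
```

Dividing out the fit in this way flattens the relation, so deviations that are hard to see on a log-log plot spanning several decades show up directly as offsets from zero.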

The models built to date don’t have the right shape to explain the RAR, at least when examined closely. Still, I’m pleased: what we’ve done here comes closer than all my many previous efforts, and most of the other efforts that are out there. Even so, I wouldn’t claim it as a success. Indeed, the inevitable compressive effects that occur at high surface densities mean that we can’t invoke simple offsets to accommodate the data: if a model gets the shape of the RAR right but the normalization wrong, it doesn’t work to simply shift it over.

So, where does that leave us? Up the proverbial creek? Perhaps. We have yet to consider feedback, which is too complicated to delve into here. But while we haven’t engaged in any specific fine-tuning, we have already engaged in some cherry picking. First, we’ve abandoned the natural proportionality between halo and disk mass, replacing it with abundance matching. This is no small step, as it converts a single-valued parameter of our theory to a rolling function of mass. Abundance matching has become familiar enough that people seem to have been lulled into thinking it is natural. There is nothing natural about it. Regardless of how much fancy jargon we use to justify it, it’s still basically a rolling fudge factor – the scientific equivalent of a lipstick-smothered pig.

Abundance matching does, at least, use data that are independent of the kinematics to set the relation between stellar and halo mass, and it does go in the right direction for the RAR. This only gets us into the right ballpark, and only if we cherry-pick the particular abundance matching relation that we use. So we’re well down the path of tuning whether we realize it or not. Invoking feedback is simply another step along this path.

Feedback is usually invoked in the kinematic context to convert cusps into cores. That could help with the hooks. This kind of feedback is widely thought to affect low and intermediate mass galaxies, or galaxies of a particular stellar to halo mass ratio. Opinions vary a bit, but it is generally not thought to have such a strong effect on massive galaxies. And yet, we find that we need some (second?) kind of feedback for them, as we need to move bulges back onto the 1:1 line in the RAR plane. That’s perhaps related to the cusp-core problem, but it’s also different. Getting bulges right requires a fine-tuned amount of feedback to exactly cancel out the effects of compression. A third distinct place where the models need some help is at low accelerations. This is far from the region where feedback is thought to have much effect at all.

I could go on, and perhaps will in a future post. Point is, we’ve been tuning our feedback prescriptions to match observed facts about galaxies, not computing how we think it really works. We don’t know how to do the latter, and there is no guarantee that our approximations do justice to reality. So on the one hand, I don’t doubt that with enough tinkering this process can be made to work in a model. On the other hand, I do question whether this is how the universe really works.