What we have here is a failure to communicate

Kuhn noted that as paradigms reach their breaking point, there is a divergence of opinions between scientists about what the important evidence is, or what even counts as evidence. This has come to pass in the debate over whether dark matter or modified gravity is a better interpretation of the acceleration discrepancy problem. It sometimes feels like we’re speaking about different topics in a different language. That’s why I split the diagram version of the dark matter tree as I did:

Evidence indicating acceleration discrepancies in the universe and various flavors of hypothesized solutions.

Astroparticle physicists seem to be well-informed about the cosmological evidence (top) and favor solutions in the particle sector (left). As more of these people entered the field in the ’00s and began attending conferences where we overlapped, I recognized gaping holes in their knowledge about the dynamical evidence (bottom) and related hypotheses (right). This was part of my motivation to develop an evidence-based course[1] on dark matter, to try to fill in the gaps in essential knowledge that were obviously being missed in the typical graduate physics curriculum. Though popular on my campus, not everyone in the field has the opportunity to take this course. It seems that the chasm has continued to grow, though not for lack of attempts at communication.

Part of the problem is a phase difference: many of the questions that concern astroparticle physicists (structure formation is a big one) were addressed 20 years ago in MOND. There is also a difference in texture: dark matter rarely predicts things but always explains them, even if it doesn’t. MOND often nails some predictions but leaves other things unexplained – just a complete blank. So they’re asking questions that are either way behind the curve or as-yet unanswerable. Progress rarely follows a smooth progression in linear time.

I have become aware of a common construction among many advocates of dark matter to criticize “MOND people.” First, I don’t know what a “MOND person” is. I am a scientist who works on a number of topics, among them both dark matter and MOND. I imagine the latter makes me a “MOND person,” though I still don’t really know what that means. It seems to be a generic straw man. Users of this term consistently paint such a luridly ridiculous picture of what MOND people do or do not do that I don’t recognize it as a legitimate depiction of myself or of any of the people I’ve met who work on MOND. I am left to wonder, who are these “MOND people”? They sound very bad. Are there any here in the room with us?

I am under no illusions as to what these people likely say when I am out of earshot. Someone recently pointed me to a comment on Peter Woit’s blog that I would not have come across on my own. I am specifically named. Here is a screen shot:

From a reply to a post of Peter Woit on December 8, 2022. I omit the part about right-handed neutrinos as irrelevant to the discussion here.

This concisely pinpoints where the field[2] is at, both right and wrong. Let’s break it down.

let me just remind everyone that the primary reason to believe in the phenomenon of cold dark matter is the very high precision with which we measure the CMB power spectrum, especially modes beyond the second acoustic peak

This is correct, but it is not the original reason to believe in CDM. The history of the subject matters, as we already believed in CDM quite firmly before any modes of the acoustic power spectrum of the CMB were measured. The original reasons to believe in cold dark matter were (1) that the measured, gravitating mass density exceeds the mass density of baryons as indicated by BBN, so there is stuff out there with mass that is not normal matter, and (2) large scale structure has grown by a factor of 10^5 from the very smooth initial condition indicated initially by the nondetection of fluctuations in the CMB, while normal matter (with normal gravity) can only get us a factor of 10^3 (there were upper limits excluding this before there was a detection). Structure formation additionally imposes the requirement that whatever the dark matter is moves slowly (hence “cold”) and does not interact via electromagnetism in order to evade making too big an impact on the fluctuations in the CMB (hence the need, again, for something non-baryonic).
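The factor of 10^3 in point (2) can be checked on the back of an envelope. The sketch below is a rough illustration rather than the calculation made at the time. It assumes linear growth in a matter-dominated universe, where density contrasts grow in proportion to the scale factor; it assumes baryons can only start growing at recombination, taken here as z ≈ 1100; and it assumes an initial contrast of order 10^-5. All names and constants are illustrative.

```python
# Rough sketch: how much can baryon-only perturbations grow since recombination?
# Assumptions (not from the original text): linear growth with delta ∝ a
# (matter-dominated, Einstein-de Sitter-like), recombination at z ~ 1100,
# and an initial density contrast of order 1e-5.

Z_RECOMBINATION = 1100.0   # assumed redshift at which baryons decouple from photons
INITIAL_CONTRAST = 1e-5    # assumed amplitude of the initial fluctuations


def linear_growth_factor(z_start: float, z_end: float = 0.0) -> float:
    """Growth of a linear perturbation between two redshifts when delta ∝ a = 1/(1+z)."""
    return (1.0 + z_start) / (1.0 + z_end)


growth = linear_growth_factor(Z_RECOMBINATION)
contrast_today = INITIAL_CONTRAST * growth
required_growth = 1.0 / INITIAL_CONTRAST  # growth needed to reach a contrast of order 1

print(f"growth available to baryons: ~{growth:.0f}")
print(f"density contrast today: ~{contrast_today:.3f}")
print(f"growth required for nonlinear structure: ~{required_growth:.0e}")
```

Under these assumptions the available growth is about 10^3, which leaves the density contrast near one percent today. That is two orders of magnitude short of the roughly 10^5 needed to form the structure we see.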

When cold dark matter became accepted as the dominant paradigm, fluctuations in the CMB had not yet been measured. The absence of observable fluctuations at a larger level sufficed to indicate the need for CDM. This, together with Ω_m > Ω_b from BBN (which seemed the better of the two arguments at the time), sufficed to convince me, along with most everyone else who was interested in the problem, that the answer had[3] to be CDM.

This all happened before the first fluctuations were observed by COBE in 1992. By that time, we already believed firmly in CDM. The COBE observations caused initial confusion and great consternation – it was too much! We actually had a prediction from then-standard SCDM, and it had predicted an even lower level of fluctuations than what COBE observed. This did not cause us (including me) to doubt CDM (though there was one suggestion that it might be due to self-interacting dark matter); it seemed a mere puzzle to accommodate, not an anomaly. And accommodate it we did: the power in the large scale fluctuations observed by COBE is part of how we got LCDM, albeit only a modest part. A lot of younger scientists seem to have been taught that the power spectrum is some incredibly successful prediction of CDM when in fact it has surprised us at nearly every turn.

As I’ve related here before, it wasn’t until the end of the century that CMB observations became precise enough to provide a test that might distinguish between CDM and MOND. That test initially came out in favor of MOND – or at least in favor of the absence of dark matter: No-CDM, which I had suggested as a proxy for MOND. Cosmologists and dark matter advocates consistently omit this part of the history of the subject.

I had hoped that when MOND cropped up in their data, cosmologists would experience the same surprise, doubt, and reevaluation that I had experienced when it cropped up in mine. Instead, they went into denial, ignoring the successful prediction of the first-to-second peak amplitude ratio, or, worse, making up stories that it hadn’t happened. Indeed, the amplitude of the second peak was so surprising that the first paper to measure it omitted mention of it entirely. Just didn’t talk about it, let alone admit that “Gee, this crazy prediction came true!” as I had with MOND in LSB galaxies. Consequently, I decided that it was better to spend my time working on topics where progress could be made. This is why most of my work on the CMB predates “modes beyond the second peak” just as our strong belief in CDM also predated that evidence. Indeed, communal belief in CDM was undimmed when the modes defining the second peak were observed, despite the No-CDM proxy for MOND being the only hypothesis to correctly predict it quantitatively a priori.

That said, I agree with clayton’s assessment that

CDM thinks [the second and third peak] should be about the same

That this is the best evidence now is both correct and a much weaker argument than it is made out to be. It sounds really strong, because a formal fit to the CMB data requires a dark matter component at extremely high confidence – something approaching 100 sigma. This analysis assumes that dark matter exists. It does not contemplate that something else might cause the same effect, so all it really does, yet again, is demonstrate that General Relativity cannot explain cosmology when restricted to the material entities we concretely know to exist.

Given the timing, the third peak was not a strong element of my original prediction, as we did not yet have either a first or second peak. We hadn’t yet clearly observed peaks at all, so what I was doing was pretty far-sighted, but I wasn’t thinking that far ahead. However, the natural prediction for the No-CDM picture I was considering was indeed that the third peak should be lower than the second, as I’ve discussed before.

The No-CDM model (blue line) that correctly predicted the amplitude of the second peak fails to predict that of the third. Data from the Planck satellite; model line from McGaugh (2004); figure from McGaugh (2015).

In contrast, in CDM, the acoustic power spectrum of the CMB can do a wide variety of things:

Acoustic power spectra calculated for the CMB for a variety of cosmic parameters. From Dodelson & Hu (2002).

Given the diversity of possibilities illustrated here, there was never any doubt that a model could be fit to the data, provided that oscillations were observed as expected in any of the theories under consideration here. Consequently, I do not find fits to the data, though excellent, to be anywhere near as impressive as commonly portrayed. What does impress me is consistency with independent data.

What impresses me even more are a priori predictions. These are the gold standard of the scientific method. That’s why I worked my younger self’s tail off to make a prediction for the second peak before the data came out. In order to make a clean test, you need to know what both theories predict, so I did this for both LCDM and No-CDM. Here are the peak ratios predicted before there were data to constrain them, together with the data that came after:

The first-to-second (left) and second-to-third (right) peak amplitude ratios in LCDM (red) and No-CDM (blue) as predicted by Ostriker & Steinhardt (1995) and McGaugh (1999). Subsequent data as labeled.

The left hand panel shows the predicted amplitude ratio of the first-to-second peak, A1:2. This is the primary quantity that I predicted for both paradigms. There is a clear distinction between the predicted bands. I was not unique in my prediction for LCDM; the same thing can be seen in other contemporaneous models. All contemporaneous models. I was the only one who was not surprised by the data when they came in, as I was the only one who had considered the model that got the prediction right: No-CDM.

The same No-CDM model fails to correctly predict the second-to-third peak ratio, A2:3. It is, in fact, way off, while LCDM is consistent with A2:3, just as Clayton says. This is a strong argument against No-CDM, because No-CDM makes a clear and unequivocal prediction that it gets wrong. Clayton calls this

a stone-cold, qualitative, crystal clear prediction of CDM

which is true. It is also qualitative, so I call it weak sauce. LCDM could be made to fit a very large range of A2:3, but it had already got A1:2 wrong. We had to adjust the baryon density outside the allowed range in order to make it consistent with the CMB data. The generous upper limit that LCDM might conceivably have predicted in advance of the CMB data was A1:2 < 2.06, which is still clearly less than observed. For the first years of the century, the attitude was that BBN had been close, but not quite right – preference being given to the value needed to fit the CMB. Nowadays, BBN and the CMB are said to be in great concordance, but this is only true if one restricts oneself to deuterium measurements obtained after the “right” answer was known from the CMB. Prior to that, practically all of the measurements for all of the important isotopes of the light elements, deuterium, helium, and lithium, concurred that the baryon density Ω_b h^2 < 0.02, with the consensus value being Ω_b h^2 = 0.0125 ± 0.0005. This is barely half the value subsequently required to fit the CMB (Ω_b h^2 = 0.0224 ± 0.0001). But what’s a factor of two among cosmologists? (In this case, 4 sigma.)

Taking the data at face value, the original prediction of LCDM was falsified by the second peak. But, no problem, we can move the goal posts, in this case by increasing the baryon density. The successful prediction of the third peak only comes after the goal posts have been moved to accommodate the second peak. Citing only the comparable size of third peak to the second while not acknowledging that the second was too small elides the critical fact that No-CDM got something right, a priori, that LCDM did not. No-CDM failed only after LCDM had already failed. The difference is that I acknowledge its failure while cosmologists elide this inconvenient detail. Perhaps the second peak amplitude is a fluke, but it was a unique prediction that was exactly nailed and remains true in all subsequent data. That’s a pretty remarkable fluke[4].

LCDM wins ugly here by virtue of its flexibility. It has greater freedom to fit the data – any of the models in the figure of Dodelson & Hu will do. In contrast, No-CDM is the single blue line in my figure above, and nothing else. Plausible variations in the baryon density make hardly any difference: A1:2 has to have the value that was subsequently observed, and no other. It passed that test with flying colors. It flunked the subsequent test posed by A2:3. For LCDM this isn’t even a test, it is an exercise in fitting the data with a model that has enough parameters[5] to do so.

There were a number of years at the beginning of the century during which the No-CDM prediction for the A1:2 was repeatedly confirmed by multiple independent experiments, but before the third peak was convincingly detected. During this time, cosmologists exhibited the same attitude that Clayton displays here: the answer has to be CDM! This warrants mention because the evidence Clayton cites did not yet exist. Clearly the as-yet unobserved third peak was not the deciding factor.

In those days, when No-CDM was the only correct a priori prediction, I would point out to cosmologists that it had got A1:2 right when I got the chance (which was rarely: I was invited to plenty of conferences in those days, but none on the CMB). The typical reaction was usually outright denial[6], though sometimes it warranted a dismissive “That’s not a MOND prediction.” The latter is a fair criticism. No-CDM is just General Relativity without CDM. It represented MOND as a proxy under the ansatz that MOND effects had not yet manifested in a way that affected the CMB. I expected that this ansatz would fail at some point, and discussed some of the ways that this should happen. One that’s relevant today is that galaxies form early in MOND, so reionization happens early, and the amplitude of gravitational lensing effects is amplified. There is evidence for both of these now. What I did not anticipate was a departure from a damping spectrum around L=600 (between the second and third peaks). That’s a clear deviation from the prediction, which falsifies the ansatz but not MOND itself. After all, they were correct in noting that this wasn’t a MOND prediction per se, just a proxy. MOND, like Newtonian dynamics before it, is relativity adjacent, but not itself a relativistic theory. Neither can explain the CMB on their own. If you find that an unsatisfactory answer, imagine how I feel.

The same people who complained then that No-CDM wasn’t a real MOND prediction now want to hold MOND to the No-CDM predicted power spectrum and nothing else. First it was “the second peak isn’t a real MOND prediction!”; then, when the third peak was observed, it became “no way MOND can do this!” This isn’t just hypocritical, it is bad science. The obvious way to proceed would be to build on the theory that had the greater, if incomplete, predictive success. Instead, the reaction has consistently been to cherry-pick the subset of facts that precludes the need for serious rethinking.

This brings us to sociology, so let’s examine some more of what Clayton has to say:

Any talk I’ve ever seen by McGaugh (or more exotic modified gravity people like Verlinde) elides this fact, and they evade the questions when I put my hand up to ask. I have invited McGaugh to a conference before specifically to discuss this point, and he just doesn’t want to.

Now you’re getting personal.

There is so much to unpack here, I hardly know where to start. By saying I “elide this fact” about the qualitative equality of the second and third peak, Clayton is basically accusing me of lying by omission. This is pretty rich coming from a community that consistently elides the history I relate above, and never addresses the question raised by MOND’s predictive power.

Intellectual honesty is very important to me – being honest that MOND predicted what I saw in low surface brightness galaxies where my own prediction was wrong is what got me into this mess in the first place. It would have been vastly more convenient to pretend that I never heard of MOND (at first I hadn’t[7]) and act like that never happened. That would be a lie of omission. It would be a large lie, a lie that denies an important aspect of how the world works (what we’re supposed to uncover through science), the sort of lie that cleric Paul Gerhardt may have had in mind when he said

When a man lies, he murders some part of the world.

Paul Gerhardt

Clayton is, in essence, accusing me of exactly that by failing to mention the CMB in talks he has seen. That might be true – I give a lot of talks. He hasn’t been to most of them, and I usually talk about things I’ve done more recently than 2004. I’ve commented explicitly on this complaint before

There’s only so much you can address in a half hour talk. [This is a recurring problem. No matter what I say, there always seems to be someone who asks “why didn’t you address X?” where X is usually that person’s pet topic. Usually I could do so, but not in the time allotted.]

– so you may appreciate my exasperation at being accused of dishonesty by someone whose complaint is so predictable that I’ve complained before about people who make this complaint. I’m only human – I can’t cover all subjects for all audiences every time all the time. Moreover, I do tend to choose to discuss subjects that may be news to an audience, not simply reprise the greatest hits they want to hear. Clayton obviously knows about the third peak; he doesn’t need to hear about it from me. This is the scientific equivalent of shouting Freebird! at a concert.

It isn’t like I haven’t talked about it. I have been rigorously honest about the CMB, and certainly have not omitted mention of the third peak. Here is a comment from February 2003 when the third peak was only tentatively detected:

Page et al. (2003) do not offer a WMAP measurement of the third peak. They do quote a compilation of other experiments by Wang et al. (2003). Taking this number at face value, the second to third peak amplitude ratio is A2:3 = 1.03 +/- 0.20. The LCDM expectation value for this quantity was 1.1, while the No-CDM expectation was 1.9. By this measure, LCDM is clearly preferable, in contradiction to the better measured first-to-second peak ratio.

Or here, in March 2006:

the Boomerang data and the last credible point in the 3-year WMAP data both have power that is clearly in excess of the no-CDM prediction. The most natural interpretation of this observation is forcing by a mass component that does not interact with photons, such as non-baryonic cold dark matter.
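The numbers in the February 2003 comment can be turned into a rough significance for each model. The sketch below does only that arithmetic. It uses the measured A2:3 and the two expectation values exactly as quoted in that comment; treating the quoted uncertainty as a simple Gaussian error and ignoring any theoretical spread in the expectations are assumptions made for illustration.

```python
# Sketch: distance of each model's expectation from the measured A2:3,
# in units of the quoted uncertainty. The numbers are those quoted in the
# February 2003 comment; treating the error as Gaussian is an assumption.

MEASURED_A23 = 1.03                      # compilation value quoted in the comment
SIGMA_A23 = 0.20                         # quoted uncertainty
EXPECTED = {"LCDM": 1.1, "No-CDM": 1.9}  # quoted expectation values


def n_sigma(expected: float, measured: float, sigma: float) -> float:
    """Absolute deviation of an expectation from a measurement, in units of sigma."""
    return abs(expected - measured) / sigma


deviations = {
    name: n_sigma(value, MEASURED_A23, SIGMA_A23) for name, value in EXPECTED.items()
}
for name, dev in deviations.items():
    print(f"{name}: {dev:.2f} sigma from the measured A2:3")
```

With these inputs the LCDM expectation sits about 0.35 sigma from the measurement and the No-CDM expectation about 4.35 sigma away, which is the sense in which LCDM was "clearly preferable" by this measure even with the large early error bar.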

There are lots like this, including my review for CJP and this talk given at KITP where I had been asked to explicitly take the side of MOND in a debate format for an audience of largely particle physicists. The CMB, including the third peak, appears on the fourth slide, which is right up front, not being elided at all. In the first slide, I tried to encapsulate the attitudes of both sides:

I did the same at a meeting in Stony Brook where I got a weird vibe from the audience; they seemed to think I was lying about the history of the second peak that I recount above. It will be hard to agree on an interpretation if we can’t agree on documented historical facts.

More recently, this image appears on slide 9 of this lecture from the cosmology course I just taught (Fall 2022):

I recognize this slide from talks I’ve given over the past five plus years; this class is the most recent place I’ve used it, not the first. On some occasions I wrote “The 3rd peak is the best evidence for CDM.” I do not recall which all talks I used this in; many of them were likely colloquia for physics departments where one has more time to cover things than in a typical conference talk. Regardless, these apparently were not the talks that Clayton attended. Rather than it being the case that I never address this subject, the more conservative interpretation of the experience he relates would be that I happened not to address it in the small subset of talks that he happened to attend.

But do go off, dude: tell everyone how I never address this issue and evade questions about it.

I have been extraordinarily patient with this sort of thing, but I confess to a great deal of exasperation at the perpetual whataboutism that many scientists engage in. It is used reflexively to shut down discussion of alternatives: dark matter has to be right for this reason (here the CMB); nothing else matters (galaxy dynamics), so we should forbid discussion of MOND. Even if dark matter proves to be correct, the CMB is being used as an excuse to not address the question of the century: why does MOND get so many predictions right? Any scientist with a decent physical intuition who takes the time to rub two brain cells together in contemplation of this question will realize that there is something important going on that simply invoking dark matter does not address.

In fairness to McGaugh, he pointed out some very interesting features of galactic DM distributions that do deserve answers. But it turns out that there are a plurality of possibilities, from complex DM physics (self interactions) to unmodelable SM physics (stellar feedback, galaxy-galaxy interactions). There are no such alternatives to CDM to explain the CMB power spectrum.

Thanks. This is nice, and why I say it would be easier to just pretend to never have heard of MOND. Indeed, this succinctly describes the trajectory I was on before I became aware of MOND. I would prefer to be recognized for my own work – of which there is plenty – than an association with a theory that is not my own – an association that is born of honestly reporting a surprising observation. I find my reception to be more favorable if I just talk about the data, but what is the point of taking data if we don’t test the hypotheses?

I have gone to great extremes to consider all the possibilities. There is not a plurality of viable possibilities; most of these things do not work. The specific ideas that are cited here are known not to work. SIDM appears to work because it has more free parameters than are required to describe the data. This is a common failing of dark matter models that simply fit some functional form to observed rotation curves. They can be made to fit the data, but they cannot be used to predict the way MOND can.

Feedback is even worse. Never mind the details of specific feedback models, and think about what is being said here: the observations are to be explained by “unmodelable [standard model] physics.” This is a way of saying that dark matter claims to explain the phenomena while declining to make a prediction. Don’t worry – it’ll work out! How can that be considered better than or even equivalent to MOND when many of the problems we invoke feedback to solve are caused by the predictions of MOND coming true? We’re just invoking unmodelable physics as a deus ex machina to make dark matter models look like something they are not. Are physicists straight-up asserting that it is better to have a theory that is unmodelable than one that makes predictions that come true?

Returning to the CMB, are there no “alternatives to CDM to explain the CMB power spectrum”? I certainly do not know how to explain the third peak with the No-CDM ansatz. For that we need a relativistic theory, like Bekenstein’s TeVeS. This initially seemed promising, as it solved the long-standing problem of gravitational lensing in MOND. However, it quickly became clear that it did not work for the CMB. Nevertheless, I learned from this that there could be more to the CMB oscillations than allowed by the simple No-CDM ansatz. The scalar field (an entity theorists love to introduce) in TeVeS-like theories could play a role analogous to cold dark matter in the oscillation equations. That means that what I thought was a killer argument against MOND – the exact same argument Clayton is making – is not as absolute as I had thought.

Writing down a new relativistic theory is not trivial. It is not what I do. I am an observational astronomer. I only play at theory when I can’t get telescope time.

Comic from the Far Side by Gary Larson.

So in the mid-00’s, I decided to let theorists do theory and started the first steps in what would ultimately become the SPARC database (it took a decade and a lot of effort by Jim Schombert and Federico Lelli in addition to myself). On the theoretical side, it also took a long time to make progress because it is a hard problem. Thanks to work by Skordis & Zlosnik on a theory they [now] call AeST[8], it is possible to fit the acoustic power spectrum of the CMB:

CMB power spectrum observed by Planck fit by AeST (Skordis & Zlosnik 2021).

This fit is indistinguishable from that of LCDM.

I consider this to be a demonstration, not necessarily the last word on the correct theory, but hopefully an iteration towards one. The point here is that it is possible to fit the CMB. That’s all that matters for our current discussion: contrary to the steady insistence of cosmologists over the past 15 years, CDM is not the only way to fit the CMB. There may be other possibilities that we have yet to figure out. Perhaps even a plurality of possibilities. This is hard work and to make progress we need a critical mass of people contributing to the effort, not shouting rubbish from the peanut gallery.

As I’ve done before, I like to take the language used in favor of dark matter, and see if it also fits when I put on a MOND hat:

As a galaxy dynamicist, let me just remind everyone that the primary reason to believe in MOND as a physical theory and not some curious dark matter phenomenology is the very high precision with which MOND predicts, a priori, the dynamics of low-acceleration systems, especially low surface brightness galaxies whose kinematics were practically unknown at the time of its inception. There is a stone-cold, quantitative, crystal clear prediction of MOND that the kinematics of galaxies follows uniquely from their observed baryon distributions. This is something CDM profoundly and irremediably gets wrong: it predicts that the dark matter halo should have a central cusp[9] that is not observed, and makes no prediction at all for the baryon distribution, let alone does it account for the detailed correspondence between bumps and wiggles in the baryon distribution and those in rotation curves. This is observed over and over again in hundreds upon hundreds of galaxies, each of which has its own unique mass distribution so that each and every individual case provides a distinct, independent test of the hypothesized force law. In contrast, CDM does not even attempt a comparable prediction: rather than enabling the real-world application to predict that this specific galaxy will have this particular rotation curve, it can only refer to the statistical properties of galaxy-like objects formed in numerical simulations that resemble real galaxies only in the abstract, and can never be used to directly predict the kinematics of a real galaxy in advance of the observation – an ability that has been demonstrated repeatedly by MOND. The simple fact that the simple formula of MOND is so repeatably correct in mapping what we see to what we get is to me the most convincing way to see that we need a grander theory that contains MOND and exactly MOND in the low acceleration limit, irrespective of the physical mechanism by which this is achieved.

That is stronger language than I would ordinarily permit myself. I do so entirely to show the danger of being so darn sure. I actually agree with clayton’s perspective in his quote; I’m just showing what it looks like if we adopt the same attitude with a different perspective. The problems pointed out for each theory are genuine, and the supposed solutions are not obviously viable (in either case). Sometimes I feel like we’re up the proverbial creek without a paddle. I do not know what the right answer is, and you should be skeptical of anyone who is sure that he does. Being sure is the sure road to stagnation.


[1] It may surprise some advocates of dark matter that I barely touch on MOND in this course, only getting to it at the end of the semester, if at all. It really is evidence-based, with a focus on the dynamical evidence as there is a lot more to this than seems to be appreciated by most physicists*. We also teach a course on cosmology, where students get the material that physicists seem to be more familiar with.

*I once had a colleague who was in a physics department ask how to deal with opposition to developing a course on galaxy dynamics. Apparently, some of the physicists there thought it was not a rigorous subject worthy of an entire semester course – an attitude that is all too common. I suggested that she pointedly drop the textbook of Binney & Tremaine on their desks. She reported back that this technique proved effective.

[2] I do not know who clayton is; that screen name does not suffice as an identifier. He claims to have been in contact with me at some point, which is certainly possible: I talk to a lot of people about these issues. He is welcome to contact me again, though he may wish to consider opening with an apology.

[3] One of the hardest realizations I ever had as a scientist was that both of the reasons (1) and (2) that I believed to absolutely require CDM assumed that gravity was normal. If one drops that assumption, as one must to contemplate MOND, then these reasons don’t require CDM so much as they highlight that something is very wrong with the universe. That something could be MOND instead of CDM, both of which are in the category of who ordered that?

[4] In the early days (late ’90s) when I first started asking why MOND gets any predictions right, one of the people I asked was Joe Silk. He dismissed the rotation curve fits of MOND as a fluke. There were 80 galaxies that had been fit at the time, which seemed like a lot of flukes. I mention this because one of the persistent myths of the subject is that MOND is somehow guaranteed to magically fit rotation curves. Erwin de Blok and I explicitly showed that this was not true in a 1998 paper.

[5] I sometimes hear cosmologists speak in awe of the thousands of observed CMB modes that are fit by half a dozen LCDM parameters. This is impressive, but we’re fitting a damped and driven oscillation – those thousands of modes are not all physically independent. Moreover, as can be seen in the figure from Dodelson & Hu, some free parameters provide more flexibility than others: there is plenty of flexibility in a model with dark matter to fit the CMB data. Only with the Planck data do minor tensions arise, the reaction to which is generally to add more free parameters, like decoupling the primordial helium abundance from that of deuterium, which is anathema to standard BBN so is sometimes portrayed as exciting, potentially new physics.

For some reason, I never hear the same people speak in equal awe of the hundreds of galaxy rotation curves that can be fit by MOND with a universal acceleration scale and a single physical free parameter, the mass-to-light ratio. Such fits are over-constrained, and every single galaxy is an independent test. Indeed, MOND can predict rotation curves parameter-free in cases where gas dominates so that the stellar mass-to-light ratio is irrelevant.

How should we weigh the relative merit of these very different lines of evidence?

6On a number of memorable occasions, people shouted “No you didn’t!” On a smaller number of those occasions (exactly two), they bothered to look up the prediction in the literature and then wrote to apologize and agree that I had indeed predicted that.

7If you read this paper, part of what you will see is me being confused about how low surface brightness galaxies could adhere so tightly to the Tully-Fisher relation. They should not. In retrospect, one can see that this was a MOND prediction coming true, but at the time I didn’t know about that; all I could see was that the result made no sense in the conventional dark matter picture.

Some while after we published that paper, Bob Sanders, who was at the same institute as my collaborators, related to me that Milgrom had written to him and asked “Do you know these guys?”

8Initially they had called it RelMOND, or just RMOND. AeST stands for Aether-Scalar-Tensor, and is clearly a step along the lines that Bekenstein made with TeVeS.

In addition to fitting the CMB, AeST retains the virtues of TeVeS in terms of providing a lensing signal consistent with the kinematics. However, it is not obvious that it works in detail – Tobias Mistele has a brand new paper testing it, and it doesn’t look good at extremely low accelerations. With that caveat, it significantly outperforms extant dark matter models.

There is an oft-repeated fallacy that comes up any time a MOND-related theory has a problem: “MOND doesn’t work therefore it has to be dark matter.” This only ever seems to hold when you don’t bother to check what dark matter predicts. In this case, we should but don’t detect the edge of dark matter halos at higher accelerations than where AeST runs into trouble.

9Another question I’ve posed for over a quarter century now is what would falsify CDM? The first person to give a straight answer to this question was Simon White, who said that cusps in dark matter halos were an ironclad prediction; they had to be there. Many years later, it is clear that they are not, but does anyone still believe this is an ironclad prediction? If it is, then CDM is already falsified. If it is not, then what would be? It seems like the paradigm can fit any surprising result, no matter how unlikely a priori. This is not a strength, it is a weakness. We can, and do, add epicycle upon epicycle to save the phenomenon. This has been my concern for CDM for a long time now: not that it gets some predictions wrong, but that it can apparently never get a prediction so wrong that we can’t patch it up, so we can never come to doubt it if it happens to be wrong.

Question of the Year (and a challenge)

Why does MOND get any predictions right?

That’s the question of the year, and perhaps of the century. I’ve been asking it since before this century began, and I have yet to hear a satisfactory answer. Most of the relevant scientific community has aggressively failed to engage with it. Even if MOND is wrong for [insert favorite reason], this does not relieve us of the burden to understand why it gets many predictions right – predictions that have repeatedly come as a surprise to the community that has declined to engage, preferring to ignore the elephant in the room.

It is not good enough to explain MOND phenomenology post facto with some contrived LCDM model. That’s mostly1 what is on offer, being born of the attitude that we’re sure LCDM is right, so somehow MOND phenomenology must emerge from it. We could just as [un]reasonably adopt the attitude that MOND is correct, so surely LCDM phenomenology happens as a result of trying to fit the standard cosmological model to some deeper, subtly different theory.

A basic tenet of the scientific method is that if a theory has its predictions come true, we are obliged to acknowledge its efficacy. This is how we know when to change our minds. This holds even if we don’t like said theory – especially if we don’t like it.

That was my experience with MOND. It correctly predicted the kinematics of the low surface brightness galaxies I was interested in. Dark matter did not. The data falsified all the models available at the time, including my own dark matter-based hypothesis. The only successful a priori predictions were those made by Milgrom. So what am I to conclude2 from this? That he was wrong?

Since that time, MOND has been used to make a lot of further predictions that came true. Predictions for specific objects that cannot even be made with LCDM. Post-hoc explanations abound, but are not satisfactory as they fail to address the question of the year. If LCDM is correct, why is it that MOND keeps making novel predictions that LCDM consistently finds surprising? This has happened over and over again.

I understand the reluctance to engage. It really ticked me off that my own model was falsified. How could this stupid theory of Milgrom’s do better for my galaxies? Indeed, how could it get anything right? I had no answer to this, nor does the wider community. It is not for lack of trying on my part; I’ve spent a lot of time3 building conventional dark matter models. They don’t work. Most of the models made by others that I’ve seen are just variations on models I had already considered and rejected as obviously unworkable. They might look workable from one angle, but they inevitably fail from some other, solving one problem at the expense of another.

Predictive success does not guarantee that a theory is right, but it does make it better than competing theories that fail for the same prediction. This is where MOND and LCDM are difficult to compare, as the relevant data are largely incommensurate. Where one is eloquent, the other tends to be muddled. However, it has been my experience that MOND more frequently reproduces the successes of dark matter than vice-versa. I expect this statement comes as a surprise to some, as it certainly did to me (see the comment line of astro-ph/9801102). The people who say the opposite clearly haven’t bothered to check2 as I have, or even to give MOND a real chance. If you come to a problem sure you know the answer, no data will change your mind. Hence:

A challenge: What would falsify the existence of dark matter?

If LCDM is a scientific theory, it should be falsifiable4. Dark matter, by itself, is a concept, not a theory: mass that is invisible. So how can we tell if it’s not there? Once we have convinced ourselves that the universe is full of invisible stuff that we can’t see or (so far) detect any other way, how do we disabuse ourselves of this notion, should it happen to be wrong? If it is correct, we can in principle find it in the lab, so its existence can be confirmed. But is it falsifiable? How?

That is my challenge to the dark matter community: what would convince you that the dark matter picture is wrong? Answers will vary, as it is up to each individual to decide for themselves how to answer. But there has to be an answer. To leave this basic question unaddressed is to abandon the scientific method.

I’ll go first. Starting in 1985 when I was first presented evidence in a class taught by Scott Tremaine, I was as much of a believer in dark matter as anyone. I was even a vigorous advocate, for a time. What convinced me to first doubt the dark matter picture was the fine-tuning I had to engage in to salvage it. It was only after that experience that I realized that the problems I was encountering were caused by the data doing what MOND had predicted – something that really shouldn’t happen if dark matter is running the show. But the MOND part came after; I had already become dubious about dark matter in its own context.

Falsifiability is a question every scientist who works on dark matter needs to face. What would cause you to doubt the existence of dark matter? Nothing is not a scientific answer. Neither is it correct to assert that the evidence for dark matter is already overwhelming. That is a misstatement: the evidence for acceleration discrepancies is overwhelming, but these can be interpreted as evidence for either dark matter or MOND.

The important thing is to establish criteria by which you would change your mind. I changed my mind before: I am no longer convinced that the solution to the acceleration discrepancy has to be non-baryonic dark matter. I will change my mind again if the evidence warrants. Let me state, yet again, what would cause me to doubt that MOND is a critical element of said solution. There are lots of possibilities, as MOND is readily falsifiable. Three important ones are:

  1. MOND getting a fundamental prediction wrong;
  2. Detecting dark matter;
  3. Answering the question of the year.

None of these have happened yet. Just shouting “MOND is falsified already!” doesn’t make it so: the evidence has to be both clear and satisfactory. For example,

  1. MOND might be falsified by cluster data, but its apparent failure is not fundamental. There is a residual missing mass problem in the richest clusters, but there’s nothing in MOND that says we have to have detected all the baryons by now. Indeed, LCDM doesn’t fare better, just differently, with both theories suffering a missing baryon problem. The chief difference is that we’re willing to give LCDM endless mulligans but MOND none at all. Where the problem for MOND in clusters comes up all the time, the analogous problem in LCDM is barely discussed, and is not even recognized as a problem.
  2. A detection of dark matter would certainly help. To be satisfactory, it can’t be an isolated signal in a lone experiment that no one else can reproduce. If a new particle is detected, its properties have to be correct (e.g., it has the right mass density, etc.). As always, we must be wary of some standard model event masquerading as dark matter. WIMP detectors will soon reach the neutrino background accumulated from all the nuclear emissions of stars over the course of cosmic history, at which time they will start detecting weakly interacting particles as intended: neutrinos. Those aren’t the dark matter, but what are the odds that the first of those neutrino detections will be eagerly misinterpreted as dark matter?
  3. Finally, the question of the year: why does MOND get any prediction right? To provide a satisfactory answer to this, one must come up with a physical model that provides a compelling explanation for the phenomena and has the same ability as MOND to make novel predictions. Just building a post-hoc model to match the data, which is the most common approach, doesn’t provide a satisfactory, let alone a compelling, explanation for the phenomenon, and provides no predictive power at all. If it did, we could have predicted MOND-like phenomenology and wouldn’t have to build these models after the fact.

So far, none of these three things have been clearly satisfied. The greatest danger to MOND comes from MOND itself: the residual mass discrepancy in clusters, the tension in Galactic data (some of which favor MOND, others of which don’t), and the apparent absence of dark matter in some galaxies. While these are real problems, they are also of the scale that is expected in the normal course of science: there are always tensions and misleading tidbits of information; I personally worry the most about the Galactic data. But even if my first point is satisfied and MOND fails on its own merits, that does not make dark matter better.

A large segment of the scientific community seems to suffer a common logical fallacy: any problem with MOND is seen as a success for dark matter. That’s silly. One has to evaluate the predictions of dark matter for the same observation to see how it fares. My experience has been that observations that are problematic for MOND are also problematic for dark matter. The latter often survives by not making a prediction at all, which is hardly a point in its favor.

Other situations are just plain weird. For example, it is popular these days to cite the absence of dark matter in some ultradiffuse galaxies as a challenge to MOND, which it is. But neither does it make sense to have galaxies without dark matter in a universe made of dark matter. Such a situation can be arranged, but the circumstances are rather contrived and usually involve some non-equilibrium dynamics. That’s fine; that can happen on rare occasions, but disequilibrium situations can happen in MOND too (the claims of falsification inevitably assume equilibrium). We can’t have it both ways, permitting special circumstances for one theory but not for the other. Worse, some examples of galaxies that are claimed to be devoid of dark matter are as much a problem for LCDM as for MOND. A disk galaxy devoid of either can’t happen; we need something to stabilize disks.

So where do we go from here? Who knows! There are fundamental questions that remain unanswered, and that’s a good thing. There is real science yet to be done. We can make progress if we stick to the scientific method. There is more to be done than measuring cosmological parameters to the sixth place of decimals. But we have to start by setting standards for falsification. If there is no observation or experimental result that would disabuse you of your current belief system, then that belief system is more akin to religion than to science.


1There are a few ideas, like superfluid dark matter, that try to automatically produce MOND phenomenology. This is what needs to happen. It isn’t clear yet whether these ideas work, but reproducing the MOND phenomenology naturally is a minimum standard that has to be met for a model to be viable. Run of the mill CDM models that invoke feedback do not meet this standard. They can always be made to reproduce the data once observed, but not to predict it in advance as MOND does.


2There is a common refrain that “MOND fits rotation curves and nothing else.” This is a myth, plain and simple. A good, old-fashioned falsehood sustained by the echo chamber effect. (That’s what I heard!) Seriously: if you are a scientist who thinks this, what is your source? Did it come from a review of MOND, or from idle chit-chat? How many MOND papers have you read? What do you actually know about it? Ignorance is not a strong position from which to draw a scientific conclusion.


3Like most of the community, I have invested considerably more effort in dark matter than in MOND. Where I differ from much of the galaxy formation community* is in admitting when those efforts fail. There is a temptation to slap some lipstick on the dark matter pig and claim success just to go along to get along, but what is the point of science if that is what we do when we encounter an inconvenient result? For me, MOND has been an incredibly inconvenient result. I would love to be able to falsify it, but so far intellectual honesty forbids.

*There is a widespread ethos of toxic positivity in the galaxy formation literature, which habitually puts a more positive spin on results than is objectively warranted. I’m aware of at least one prominent school where students are taught “to be optimistic” and omit mention of caveats that might detract from a model’s reception. This is effective in a careerist sense, but antithetical to the scientific endeavor.


4The word “falsification” carries a lot of philosophical baggage that I don’t care to get into here. The point is that there must be a way to tell if a theory is wrong. If there is not, we might as well be debating the number of angels that can dance on the head of a pin.

Remain Skeptical

I would like to write something positive to close out the year. Apparently, it is not in my nature, as I am finding it difficult to do so. I try not to say anything if I can’t say anything nice, and as a consequence I have said little here for weeks at a time.

Still, there are good things that happened this year. JWST launched a year ago. The predictions I made for it at that time have since been realized. There have been some bumps along the way, with some of the photometric redshifts for very high z galaxies turning out to be wrong. They have not all turned out to be wrong, and the current consensus seems to be converging on the existence of a good number of relatively bright galaxies at z > 10. Some of these have been ‘confirmed’ by spectroscopy.

I remain skeptical of some of the spectra as well as the photometric redshifts. There isn’t much spectrum to see at these rest frame ultraviolet wavelengths. There aren’t a lot of obvious, distinctive features in the spectra that make for definitive line identifications, and the universe is rather opaque to the UV photons blueward of the Lyman break. Here is an example from the JADES survey:

Images and spectra of z > 10 galaxy candidates from JADES. [Image Credits: NASA, ESA, CSA, M. Zamani (ESA/Webb), Leah Hustak (STScI); Science Credits: Brant Robertson (UC Santa Cruz), S. Tacchella (Cambridge), E. Curtis-Lake (UOH), S. Carniani (Scuola Normale Superiore), JADES Collaboration]

Despite the lack of distinctive spectral lines, there is a clear shape that is ramping up towards the blue until hitting a sharp edge. This is consistent with the spectrum of a star forming galaxy with young stars that make a lot of UV light: the upward bend is expected for such a population, and hard to explain otherwise. The edge is caused by opacity: intervening gas and dust gobbles up those photons, few of which are likely to even escape their host galaxy, much less survive the billions of light-years to be traversed between there-then and here-now. So I concur that the most obvious interpretation of these spectra is that of high-z galaxies even if we don’t have the satisfaction of seeing blatantly obvious emission lines like C IV or Mg II (ionized species of carbon and magnesium that are frequently seen in the spectra of quasars). [The obscure nomenclature dates back to nineteenth century laboratory spectroscopy. Mg I is neutral, Mg II singly ionized, C IV triply ionized.]

Even if we seem headed towards consensus on the reality of big galaxies at high redshift, the same cannot yet be said about their interpretation. This certainly came as a huge surprise to astronomers, though not to me. The obvious interpretation is the theory that predicted this observation in advance, no?

Apparently not. Another predictable phenomenon is that people will gaslight themselves into believing that this was expected all along. I have been watching in real time as the community makes the transition from “there is nothing above redshift 7” (the prediction of LCDM contemporary with Bob Sanders’s MOND prediction that galaxy mass objects form by z=10) to “this was unexpected and is genuinely problematic!” to “Nah, we’re good.” This is the same trajectory I’ve seen the community take with the cusp-core problem, the missing satellite problem, the RAR, the existence of massive clusters of galaxies at surprisingly high redshift, etc., etc. A theory is only good to the extent that its predictions are not malleable enough to be made to fit any observation.

As I was trying to explain on twitter that individually high mass galaxies had not been expected in LCDM, someone popped into my feed to assert that they had multiple simulations with galaxies that massive. That certainly had not been the case all along, so this just tells me that LCDM doesn’t really make a prediction here that can’t be fudged (crank up the star formation efficiency!). This is worse than no prediction at all: you can never know that you’re wrong, as you can fix any failing. Worse, it has been my experience that there is always someone willing to play the role of fixer, usually some ambitious young person eager to gain credit for saving the most favored theory. It works – I can point to many Ivy League careers that followed this approach. They don’t even have to work hard at it, as the community is predisposed to believe what they want to hear.

These are all reasons why predictions made in advance of the relevant observation are the most valuable.

That MOND has consistently predicted, in advance, results that were surprising to LCDM is a fact that the community apparently remains unaware of. Communication is inefficient, so for a long time I thought this sufficed as an explanation. That is no longer the case; the only explanation that fits the sociological observations is that the ignorance is willful.

“It is difficult to get a man to understand something, when his salary depends on his not understanding it.”

Upton Sinclair

We have been spoiled. The last 400 years have given us the impression that science progresses steadily and irresistibly forward. This is in no way guaranteed. Science progresses in fits and starts; it only looks continuous when the highlights are viewed in retrospective soft focus. Progress can halt and even regress, as happened abruptly with the many engineering feats of the Romans with the fall of their empire. Science is a human endeavor subject to human folly, and we might just as easily have a thousand years of belief in invisible mass as we did in epicycles.

Despite all this, I remain guardedly optimistic that we can and will progress. I don’t know what the right answer is. The first step is to let go of being sure that we do.

I’ll end with a quote pointed out to me by David Merritt that seems to apply today as it did centuries ago:

“The scepticism of that generation was the most uncompromising that the world has known; for it did not even trouble to deny: it simply ignored. It presented a blank wall of perfect indifference alike to the mysteries of the universe and to the solutions of them.”

Books and Characters by Lytton Strachey (chapter on Mme du Deffand)

Live long, and prosper in the new year. Above all, remain skeptical.

Not quiet on the northern front

It has been two months since my last post. Sorry for the extended silence, but I do have a real job. It is not coincidental that my last post precedes the start of the semester. It has been the best of semesters, but mostly the worst of semesters.

On the positive side, I’m teaching our upper level cosmology course. The students are great, really interested and interactive. Interest has always run high, going back to the first time I taught it (in 1999) as a graduate course at the University of Maryland. Aficionados of web history may marvel at the old course website, which was one of the first of its kind, as was the class – prior to that, graduate level cosmology was often taught as part of extragalactic astronomy. Being a new member of the faculty, it was an obvious gap to fill. I also remember with bemusement receiving Mike A’Hearn (comet expert and PI of Deep Impact) as an envoy from the serious-minded planetary scientists, who wondered if there was enough legitimate substance to the historically flaky subject of cosmology to teach a full three credit graduate course on the subject. Being both an expert and a skeptic, it was easy to reassure him: yes.

That class was large for a graduate level course, being taken in equal numbers by both astronomy and physics students. The astronomers were shocked and horrified that I went so deeply into the background theory to frame the course from the outset, and frequently asked “what’s a metric?” while the physicists loved that part. When we got to observational constraints, you could see the astronomers’ eyes glaze – not the distance scale again – while the physicists desperately asked “what’s a distance modulus?” This dichotomy persists.

This semester’s course is the largest it has ever been, up 70% from previous already-large enrollments. This is consistent with the explosive growth of the field. Interest in the field has never been higher. The number of astronomy majors has doubled over the past decade, having doubled already in the preceding decade.

Astronomy bachelor’s degrees as reported by the American Institute of Physics.

That’s the good news. The bad news is that over the past four years, our department has been allowed to wither. In 2018, we were the smallest astronomy department in the country, with five tenured professors and an observatory manager who functioned as research faculty. The inevitable retirements that we had warned our administration were coming arrived, and we were allowed to fall off the demographic cliff (a common problem here and at many institutions). Despite the clear demand and the depth, breadth, and diversity of the available talent pool, the only faculty hire we have made in the past decade was an instructor (a rank that differs from a professor in having no research obligations), so now we are a department of two tenured professors and one instructor. I thought we were already small! It boggles the mind when you realize that the three of us are obliged to cover literally the entire universe in our curriculum.

Though always a small department, we managed. Now we don’t manage so much as cling to the edge of the cliff by our fingernails. We can barely cover the required courses for our majors. During the peak of concern about the Covid pandemic, we Chairs were asked to provide a plan for covering courses should one or some of our faculty become ill for an extended period. What a joke. The only “plan” I could offer was “don’t get sick.”

We did at least get along, which is not the case with faculty in all departments. The only minor tension we sometimes encountered was the distribution of research students. A Capstone (basically a senior thesis) is required here, and some faculty wound up with a higher supervisory load than others. That is baked-in now, as we have fewer faculty but more students to supervise.

We have reached a breaking point. The only way to address the problems we face is to hire new faculty. So the solution proffered by the dean is to merge our department into Physics.

Regardless of any other pros and cons, a merger does nothing to address the fundamental problem: we need astronomers to teach the astronomy curriculum. We need astronomers to conduct astronomy research, and to have a critical mass for a viable research community. In short, we need astronomers to do astronomy.

I have been Chair of the CWRU Department of Astronomy for over seven years now. Prof. Mihos served in this capacity for six years before that. No sane faculty member wants to be Chair; it is a service obligation we take on because there are tasks that need doing to serve our students and enable our research. Though necessary, these tasks are a drain on the person doing them, and detract from our ability to help our students and conduct research. Having sustained the department for this long to be told we needn’t have bothered is a deep and profound betrayal. I did not come here to turn out the lights.

Define “better”

Dark matter remains undetected in the laboratory. This has been true forever, so I don’t know what drives the timing of the recent spate of articles encouraging us to keep the faith, that dark matter is still a better idea than anything else. This depends on how we define “better.”

There is a long-standing debate in the philosophy of science about the relative merits of accommodation and prediction. A scientific theory should have predictive power. It should also explain all the relevant data. To do the latter almost inevitably requires some flexibility in order to accommodate things that didn’t turn out exactly as predicted. What is the right mix? Do we lean more towards prediction, or accommodation? The answer to that defines “better” in this context.

One of the recent articles is titled “The dark matter hypothesis isn’t perfect, but the alternatives are worse” by Paul Sutter. This perfectly encapsulates the choice one has to make in what is unavoidably a value judgement. Is it better to accommodate, or to predict (see the Spergel Principle)? Dr. Sutter comes down on the side of accommodation. He notes a couple of failed predictions of dark matter, but mentions no specific predictions of MOND (successful or not) while concluding that dark matter is better because it explains more.

One important principle in science is objectivity. We should be even-handed in the evaluation of evidence for and against a theory. In practice, that is very difficult. As I’ve written before, it made me angry when the predictions of MOND came true in my data for low surface brightness galaxies. I wanted dark matter to be right. I felt sure that it had to be. So why did this stupid MOND theory have any of its predictions come true?

One way to check your objectivity is to look at it from both sides. If I put on a dark matter hat, then I largely agree with what Dr. Sutter says. To quote one example:

The dark matter hypothesis isn’t perfect. But then again, no scientific hypothesis is. When evaluating competing hypotheses, scientists can’t just go with their guts, or pick one that sounds cooler or seems simpler. We have to follow the evidence, wherever it leads. In almost 50 years, nobody has come up with a MOND-like theory that can explain the wealth of data we have about the universe. That doesn’t make MOND wrong, but it does make it a far weaker alternative to dark matter.

Paul Sutter

OK, so now let’s put on a MOND hat. Can I make the same statement?

The MOND hypothesis isn’t perfect. But then again, no scientific hypothesis is. When evaluating competing hypotheses, scientists can’t just go with their guts, or pick one that sounds cooler or seems simpler. We have to follow the evidence, wherever it leads. In almost 50 years, nobody has detected dark matter, nor come up with a dark matter-based theory with the predictive power of MOND. That doesn’t make dark matter wrong, but it does make it a far weaker alternative to MOND.

So, which of these statements is true? Well, both of them. How do we weigh the various lines of evidence? Is it more important to explain a large variety of the data, or to be able to predict some of it? This is one of the great challenges when comparing dark matter and MOND. They are incommensurate: the set of relevant data is not the same for both. MOND makes no pretense to provide a theory of cosmology, so it doesn’t even attempt to explain much of the data so beloved by cosmologists. Dark matter explains everything, but, broadly defined, it is not a theory so much as an inference – assuming gravitational dynamics are inviolate, we need more mass than meets the eye. It’s a classic case of comparing apples and oranges.

While dark matter is a vague concept in general, one can build specific theories of dark matter that are predictive. Simulations with generic cold dark matter particles predict cuspy dark matter halos. Galaxies are thought to reside in these halos, which dominate their dynamics. This overlaps with the predictions of MOND, which follow from the observed distribution of normal matter. So, do galaxies look like tracer particles orbiting in cuspy halos? Or do their dynamics follow from the observed distribution of light via Milgrom’s strange formula? The relevant subset of the data very clearly indicate the latter. When head-to-head comparisons like this can be made, the a priori predictions of MOND win, hands down, over and over again. [If this statement sounds wrong, try reading the relevant scientific literature. Being an expert on dark matter does not automatically make one an expert on MOND. To be qualified to comment, one should know what predictive successes MOND has had. People who say variations of “MOND only fits rotation curves” are proudly proclaiming that they lack this knowledge.]

It boils down to this: if you want to explain extragalactic phenomena, use dark matter. If you want to make a prediction – in advance! – that will come true, use MOND.

A lot of the debate comes down to claims that anything MOND can do, dark matter can do better. Or at least as well. Or, if not as well, good enough. This is why conventionalists are always harping about feedback: it is the deus ex machina they invoke in any situation where they need to explain why their prediction failed. This does nothing to explain why MOND succeeded where they failed.

This post-hoc reasoning is profoundly unsatisfactory. Dark matter, being invisible, allows us lots of freedom to cook up an explanation for pretty much anything. My long-standing concern for the dark matter paradigm is not the failure of any particular prediction, but that, like epicycles, it has too much explanatory power. We could use it to explain pretty much anything. Rotation curves flat when they should be falling? Add some dark matter. No such need? No dark matter. Rising rotation curves? Sure, we could explain that too: add more dark matter. Only we don’t, because that situation doesn’t arise in nature. But we could if we had to. (See, e.g., Fig. 6 of de Blok & McGaugh 1998.)

There is no requirement in dark matter that rotation curves be as flat as they are. If we start from the prior knowledge that they are, then of course that’s what we get. If instead we independently try to build models of galactic disks in dark matter halos, very few of them wind up with realistic looking rotation curves. This shouldn’t be surprising: there are, in principle, an uncountably infinite number of combinations of galaxies and dark matter halos. Even if we impose some sensible restrictions (e.g., scaling the mass of one component with that of the other), we still don’t get it right. That’s one reason that we have to add feedback, which suffices according to some, and not according to others.

In contrast, the predictions of MOND are unique. The kinematics of an object follow from its observed mass distribution. The two are tied together by the hypothesized force law. There is a one-to-one relation between what you see and what you get.

This was not expected in dark matter. It makes no sense that this should be so. The baryonic tail should not wag the dark matter dog.
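To make "what you see is what you get" concrete, here is a minimal sketch of how a rotation speed follows from the visible mass once a force law is fixed. The text does not name an interpolation function, so the sketch assumes the commonly used "simple" form mu(x) = x / (1 + x), a value a0 = 1.2e-10 m/s^2, and a point-mass galaxy. All three are illustrative assumptions, and the function names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # acceleration scale, m s^-2 (assumed value)
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.086e19     # kiloparsec, m


def mond_acceleration(g_newton, a0=A0):
    """Solve mu(a/a0) * a = g_newton for the assumed mu(x) = x / (1 + x).

    This reduces to the quadratic a^2 - g_newton*a - g_newton*a0 = 0,
    of which we take the positive root.
    """
    return 0.5 * g_newton + math.sqrt(0.25 * g_newton**2 + g_newton * a0)


def circular_speed(mass_kg, radius_m, a0=A0):
    """Circular speed around a point mass (a crude stand-in for a galaxy)."""
    g_newton = G * mass_kg / radius_m**2
    return math.sqrt(mond_acceleration(g_newton, a0) * radius_m)


def asymptotic_speed(mass_kg, a0=A0):
    """Low-acceleration limit: V^4 = G * M * a0, independent of radius."""
    return (G * mass_kg * a0) ** 0.25


mass = 5e10 * M_SUN  # hypothetical baryonic mass
for r_kpc in (5, 20, 80, 320):
    v_kms = circular_speed(mass, r_kpc * KPC) / 1e3
    print(f"r = {r_kpc:4d} kpc   V = {v_kms:6.1f} km/s")
print(f"asymptotic V = {asymptotic_speed(mass) / 1e3:6.1f} km/s")
```

There is no halo parameter to adjust here: the only input is the mass, and the speed settles toward a flat value of roughly 170 km/s for this choice. A real test would use the measured distribution of stars and gas rather than a point mass.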

From the perspective of building dark matter models, it’s like the proverbial needle in the haystack: the haystack is the volume of possible baryonic disk plus dark matter halo combinations; the one that “looks like” MOND is the needle. Somehow nature plucks the MOND-like needle out of the dark matter haystack every time it makes a galaxy.

The dark matter haystack. Galaxies might lie anywhere in this voluminous, multiparameter space, but in practice they inevitably seem to reside in the negligibly small part of the volume that “looks like” MOND.

Dr. Sutter says that we shouldn’t go with our gut. That’s exactly what I wanted to do, long ago, to maintain my preference for dark matter. I’d love to do that now so that I could stop having this argument with otherwise reasonable people.

Instead of going with my gut, I’m making a probabilistic statement. In Bayesian terms, the odds of observing MONDian behavior given the prior that we live in a universe made of dark matter are practically zero. In MOND, observing MONDian behavior is the only thing that can happen. That’s what we observe in galaxies, over and over again. Any information criterion shows a strong quantitative preference for MOND when dynamical evidence is considered. That does not happen when cosmological data are considered because MOND makes no prediction there. Concluding that dark matter is better overlooks the practical impossibility that MOND-like phenomenology is observed at all. Of course, once one knows this is what the data show, it seems a lot more likely, and I can see that effect in the literature over the long arc of scientific history. This is why, to me, predictive power is more important than accommodation: what we predict before we know the answer is more important than whatever we make up once the answer is known.
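A schematic way to write this down, with symbols introduced here purely for illustration: let D be the observation of MOND-like dynamics in a galaxy, and let ε stand for the small, unquantified probability of that outcome under dark matter. Neither symbol nor any numerical value comes from a fit.

```latex
\frac{P(\mathrm{MOND}\mid D)}{P(\mathrm{DM}\mid D)}
  = \underbrace{\frac{P(D\mid \mathrm{MOND})}{P(D\mid \mathrm{DM})}}_{\text{Bayes factor } K}
    \times \frac{P(\mathrm{MOND})}{P(\mathrm{DM})},
\qquad
K \approx \frac{1}{\varepsilon} \gg 1
\quad \text{for } P(D\mid \mathrm{MOND}) \approx 1,\; P(D\mid \mathrm{DM}) = \varepsilon \ll 1 .
```

This ratio can only be formed for the dynamical data. Where MOND makes no prediction, as with the cosmological data, there is no likelihood to put in the numerator.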

The successes of MOND are sometimes minimized by lumping all galaxies into a single category. That’s not correct. Every galaxy has a unique mass distribution; each one is an independent test. The data for galaxies extend over a large dynamic range, from dwarfs to giants, from low to high surface brightness, from gas to star dominated cases. Dismissing this by saying “MOND only explains rotation curves” is like dismissing Newton for only explaining planets – as if every planet, moon, comet, and asteroid aren’t independent tests of Newton’s inverse square law.

Two galaxies with very different mass distributions. Neither is well explained by dark matter, which provides no reason for the detailed shapes encapsulated by Sancisi’s Law. In contrast, MOND describes these naturally: features in the rotation curves follow from those in the baryon distributions because the force law tells them to.

MOND does explain more than rotation curves. That was the first thing I checked. I spent several years looking at all of the data, and have reviewed the situation many times since. What I found surprising is how much MOND explains, if you let it. More disturbing was how often I came across claims in the literature that MOND was falsified by X only to try the analysis myself and find that, no, if you bother to do it right, that’s pretty much just what it predicts. Not in every case, of course – no hypothesis is perfect – but I stopped bothering after several hundred cases. Literally hundreds. I can’t keep up with every new claim, and it isn’t my job to do so. My experience has been that as the data improve, so too does their agreement with MOND.

Dr. Sutter’s article goes farther, repeating a common misconception that “the tweaking of gravity under MOND is explicitly designed to explain the motions of stars within galaxies.” This is an overstatement so strong as to be factually wrong. MOND was explicitly designed to produce flat rotation curves – as was dark matter. However, there is a lot more to it than that. Once we write down the force law, we’re stuck with it. It has lots of other unavoidable consequences that lead to genuine predictions. Milgrom explicitly laid out what these consequences would be, and basically all of them have subsequently been observed. I include a partial table in my last review; it only ends where it does because I had to stop somewhere. These were genuine, successful, a priori predictions – the gold standard in science. Some of them can be explained with dark matter, but many cannot: they make no sense, and dark matter can only accommodate them thanks to its epic flexibility.

Dr. Sutter makes a number of other interesting points. He says we shouldn’t “pick [a hypothesis] that sounds cooler or seems simpler.” I’m not sure which seems cooler here – a universe pervaded by a mysterious invisible mass that we can’t [yet] detect in the laboratory but nevertheless controls most of what goes on out there seems pretty cool to me. That there might also be some fundamental aspect of the basic theory of gravitational dynamics that we’re missing also seems like a pretty cool possibility. Those are purely value judgments.

Simplicity, however, is a scientific value known as Occam’s razor. The simpler of competing theories is to be preferred. That’s clearly MOND: we make one adjustment to the force law, and that’s it. What we lack is a widely accepted, more general theory that encapsulates both MOND and General Relativity.

In dark matter, we multiply entities unnecessarily – there is extra mass composed of unknown particles that have no place in the Standard Model of particle physics (which is quite full up) so we have to imagine physics beyond the standard model and perhaps an entire dark sector because why just one particle when 85% of the mass is dark? and there could also be dark photons to exchange forces that are only active in the dark sector as well as entire hierarchies of dark particles that maybe have their own ecosystem of dark stars, dark planets, and maybe even dark people. We, being part of the “normal” matter, are just a minority constituent of this dark universe; a negligible bit of flotsam compared to the dark sector. Doesn’t it make sense to imagine that the dark sector has as rich and diverse a set of phenomena as the “normal” sector? Sure – if you don’t mind abandoning Occam’s razor. Note that I didn’t make any of this stuff up; everything I said in that breathless run-on sentence I’ve heard said by earnest scientists enthusiastic about how cool the dark sector could be. Bugger Occam.

There is also the matter of timescales. Dr. Sutter mentions that “In almost 50 years, nobody has come up with a MOND-like theory” that does all that we need it to do. That’s true, but for the typo. Next year (2023) will mark the 40th anniversary of Milgrom’s first publications on MOND, so it hasn’t been half a century yet. But I’ve heard recurring complaints to this effect before, that finding the deeper theory is taking too long. Let’s examine that, shall we?

First, remember some history. When Newton introduced his inverse square law of universal gravity, it was promptly criticized as a form of magical thinking: How, Sir, can you have action at a distance? The conception at the time was that you had to be in physical contact with an object to exert a force on it. For the sun to exert a force on the earth, or the earth on the moon, seemed outright magical. Leibnitz famously accused Newton of introducing ‘occult’ forces. As a consequence, Newton was careful to preface his description of universal gravity as everything happening as if the force was his famous inverse square law. The “as if” is doing a lot of work here, basically saying, in modern parlance “OK, I don’t get how this is possible, I know it seems really weird, but that’s what it looks like.” I say the same about MOND: galaxies behave as if MOND is the effective force law. The question is why.

As near as I can tell from reading the history around this (and I don’t know how clear it is), it took about 20 years for Newton to realize that there was a good geometric reason for the inverse square law. We expect our freshman physics students to see that immediately. Obviously Newton was smarter than the average freshman, so why’d it take so long? Was he, perhaps, preoccupied with the legitimate-seeming criticisms of action at a distance? It is hard to see past a fundamental stumbling block like that, and I wonder if the situation now is analogous. Perhaps we are missing something now that will seem obvious in retrospect, distracted by criticisms that will seem absurd in the future.

Many famous scientists built on the dynamics introduced by Newton. The Poisson equation isn’t named the Newton equation because Newton didn’t come up with it even though it is fundamental to Newtonian dynamics. Same for the Lagrangian. And the classical Hamiltonian. These developments came many decades after Newton himself, and required the efforts of many brilliant scientists integrated over a lot of time. By that standard, forty years seems pretty short: one doesn’t arrive at a theory of everything overnight.

What is the right measure? The integrated effort of the scientific community is more relevant than absolute time. Over the past forty years, I’ve seen a lot of push back against even considering MOND as a legitimate theory. Don’t talk about that! This isn’t exactly encouraging, so not many people have worked on it. I can count on my fingers the number of people who have made important contributions to the theoretical development of MOND. (I am not one of them. I am an observer following the evidence, wherever it leads, even against my gut feeling and to the manifest detriment of my career.) It is hard to make progress without a critical mass of people working on a problem.

Of course, people have been looking for dark matter for those same 40 years. More, really – if you want to go back to Oort and Zwicky, it has been 90 years. But for the first half century of dark matter, no one was looking hard for it – it took that long to gel as a serious problem. These things take time.

Nevertheless, for several decades now there has been an enormous amount of effort put into all aspects of the search for dark matter: experimental, observational, and theoretical. There is and has been a critical mass of people working on it for a long time. There have been thousands of talented scientists who have contributed to direct detection experiments in dozens of vast underground laboratories, who have combed through data from X-ray and gamma-ray observatories looking for the telltale signs of dark matter decay or annihilation, who have checked for the direct production of dark matter particles in the LHC; even theorists who continue to hypothesize what the heck the dark matter could be and how we might go about detecting it. This research has been well funded, with billions of dollars having been spent in the quest for dark matter. And what do we have to show for it?

Zero. Nada. Zilch. Squat. A whole lot of nothing.

This is equal to the amount of funding that goes to support research on MOND. There is no faster way to get a grant proposal rejected than to say nice things about MOND. So on the one hand, we have a small number of people working on the proverbial shoestring, while on the other, we have a huge community that has poured vast resources into the attempt to detect dark matter. If we really believe it is taking too long, perhaps we should try funding MOND as generously as we do dark matter.

By the wayside

I noted last time that in the rush to analyze the first of the JWST data, that “some of these candidate high redshift galaxies will fall by the wayside.” As Maurice Aabe notes in the comments there, this has already happened.

I was concerned because of previous work with Jay Franck in which we found that photometric redshifts were simply not adequately precise to identify the clusters and protoclusters we were looking for. Consequently, we made it a selection criterion when constructing the CCPC to require spectroscopic redshifts. The issue then was that it wasn’t good enough to have a rough idea of the redshift, as the photometric method often provides (what exactly it provides depends in a complicated way on the redshift range, the stellar population modeling, and the wavelength range covered by the observational data that is available). To identify a candidate protocluster, you want to know that all the potential member galaxies are really at the same redshift.

This requirement is somewhat relaxed for the field population, in which a common approach is to ask broader questions of the data like “how many galaxies are at z ~ 6? z ~ 7?” etc. Photometric redshifts, when done properly, ought to suffice for this. However, I had noticed in Jay’s work that there were times when apparently reasonable photometric redshift estimates went badly wrong. So it made the ganglia twitch when I noticed that in early JWST work – specifically Table 2 of the first version of a paper by Adams et al. – there were seven objects with candidate photometric redshifts, and three already had a preexisting spectroscopic redshift. The photometric redshifts were mostly around z ~ 9.7, but the three spectroscopic redshifts were all smaller: two z ~ 7.6, one 8.5.

Three objects are not enough to infer a systematic bias, so I made a mental note and moved on. But given our previous experience, it did not inspire confidence that all the available cases disagreed, and that all the spectroscopic redshifts were lower than the photometric estimates. These things combined to give this observer a serious case of “the heebie-jeebies.”

Adams et al. have now posted a revised analysis in which many (not all) redshifts change, and change by a lot. Here is their new Table 4:

Table 4 from Adams et al. (2022, version 2).

There are some cases here that appear to confirm and improve the initial estimate of a high redshift. For example, SMACS-z11e had a very uncertain initial redshift estimate. In the revised analysis, it is still at z~11, but with much higher confidence.

That said, it is hard to put a positive spin on these numbers. 23 of 31 redshifts change, and many change drastically. Those that change all become smaller. The highest surviving redshift estimate is z ~ 15 for SMACS-z16b. Among the objects with very high candidate redshifts, some are practically local (e.g., SMACS-z12a, F150DB-075, F150DA-058).

So… I had expected that this could go wrong, but I didn’t think it would go this wrong. I was concerned about the photometric redshift method – how well we can model stellar populations, especially at young ages dominated by short lived stars that in the early universe are presumably lower metallicity than well-studied nearby examples, the degeneracies between galaxies at very different redshifts but presenting similar colors over a finite range of observed passbands, dust (the eternal scourge of observational astronomy, expected to be an especially severe affliction in the ultraviolet that gets redshifted into the near-IR for high-z objects, both because dust is very efficient at scattering UV photons and because this efficiency varies a lot with metallicity and the exact grain size distribution of the dust), when is a dropout really a dropout indicating the location of the Lyman break and when is it just a lousy upper limit of a shabby detection, etc. – I could go on, but I think I already have. It will take time to sort these things out, even in the best of worlds.

We do not live in the best of worlds.

It appears that a big part of the current uncertainty is a calibration error. There is a pipeline for handling JWST data that has an in-built calibration for how many counts in a JWST image correspond to what astronomical magnitude. The JWST instrument team warned us that the initial estimate of this calibration would “improve as we go deeper into Cycle 1” – see slide 13 of Jane Rigby’s AAS presentation.

I was not previously aware of this caveat, though I’m certainly not surprised by it. This is how these things work – one makes an initial estimate based on the available data, and one improves it as more data become available. Apparently, JWST is outperforming its specs, so it is seeing as much as 0.3 magnitudes deeper than anticipated. This means that people were inferring objects to be that much too bright, hence the appearance of lots of galaxies that seem to be brighter than expected, and an apparent systematic bias to high z for photometric redshift estimators.
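For a sense of scale, a magnitude offset converts to a flux ratio through the standard Pogson relation, in which 2.5 magnitudes is a factor of 10 in flux. The sketch below is plain arithmetic on the 0.3 magnitude figure quoted above; the function name is hypothetical and nothing here touches the actual JWST pipeline.

```python
def flux_ratio(delta_mag):
    """Flux ratio corresponding to a magnitude difference (Pogson relation)."""
    return 10 ** (0.4 * delta_mag)


offset = 0.3  # magnitudes, the figure quoted in the text
ratio = flux_ratio(offset)
print(f"{offset} mag corresponds to a flux ratio of {ratio:.2f}")
```

A 0.3 magnitude offset is about a 32% error in inferred flux, which is large when the redshift estimate hinges on colors and on whether a faint object counts as detected in a given band.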

I was not at the AAS meeting, let alone Dr. Rigby’s presentation there. Even if I had been, I’m not sure I would have appreciated the potential impact of that last bullet point on nearly the last slide. So I’m not the least bit surprised that this error has propagated into the literature. This is unfortunate, but at least this time it didn’t lead to something as bad as the Challenger space shuttle disaster in which the relevant warning from the engineers was reputed to have been buried in an obscure bullet point list.

So now we need to take a deep breath and do things right. I understand the urgency to get the first exciting results out, and they are still exciting. There are still some interesting high z candidate galaxies, and lots of empirical evidence predating JWST indicating that galaxies may have become too big too soon. However, we can only begin to argue about the interpretation of this once we agree to what the facts are. At this juncture, it is more important to get the numbers right than to post early, potentially ill-advised takes on arXiv.

That said, I’d like to go back to writing my own ill-advised take to post on arXiv now.

LZ: another non-detection

Just as I was leaving for a week’s vacation, the dark matter search experiment LZ reported its first results. Now that I’m back, I see that I didn’t miss anything. Here is their figure of merit:

The latest experimental limits on WIMP dark matter from LZ (arXiv:2207.03764). The parameter space above the line is excluded. Note the scale on the y-axis bearing in mind that the original expectation was for a cross section around 10⁻³⁹ cm², well above the top edge of this graph.

LZ is a merger of two previous experiments compelled to grow still bigger in the never-ending search for dark matter. It contains “seven active tonnes of liquid xenon,” which is an absurd amount, being a substantial fraction of the entire terrestrial supply. It all has to be kept at cryogenic temperatures (liquid xenon sits at roughly 170 K, about −100 °C) and filtered of all contaminants, including naturally radioactive isotopes that might mimic the sought-after signal of dark matter scattering off of xenon nuclei. It is a technological tour de force.

The technology is really fantastic. The experimentalists have accomplished amazing things in building these detectors. They have accomplished the target sensitivity, and then some. If WIMPs existed, they should have found them by now.

WIMPs have not been discovered. As the experiments have improved, the theorists have been obliged to repeatedly move the goalposts. The original (1980s) expectation for the interaction cross-section was 10⁻³⁹ cm². That was quickly excluded, but more careful (1990s) calculation suggested perhaps more like 10⁻⁴² cm². This was also excluded experimentally. By the late 2000s, the “prediction” had migrated to 10⁻⁴⁶ cm². This has also now been excluded, so the goalposts have been moved to 10⁻⁴⁸ cm². This migration has been driven entirely by the data; there is nothing miraculous about a WIMP with this cross section.

As remarkable a technological accomplishment as experiments like LZ are, they are becoming the definition of insanity: repeating the same action but expecting a different result.

For comparison, consider the LIGO detection of gravitational waves. A large team of scientists worked unspeakably hard to achieve the detection of a tiny effect. It took 40 years of failure before success was obtained. Until that point, it seemed much the same: repeating the same action but expecting a different result.

Except it wasn’t, because there was a clear expectation for the sensitivity that was required to detect gravitational waves. Once that sensitivity was achieved, they were detected. It wasn’t that simple of course, but close enough for our purposes: it took a long time to get where they were going, but they achieved success once they got there. Having a clear prediction is essential.

In the case of WIMP searches, there was also a clear prediction. The required sensitivity was achieved – long ago. Nothing was found, so the goalposts were moved – by a lot. Then the new required sensitivity was achieved, still without detection. Repeatedly.

It always makes sense to look harder for something you expect if at first you don’t succeed. But at some point, you have to give up: you ain’t gonna find it. This is disappointing, but we’ve all experienced this kind of disappointment at some point in our lives. The tricky part is deciding when to give up.

In science, the point to give up is when your hypothesis is falsified. The original WIMP hypothesis was falsified a long time ago. We keep it on life support with modifications, often obfuscating (to our students and to ourselves) that the WIMPs we’re talking about today are no longer the WIMPs we originally conceived.

I sometimes like to imagine the thought experiment of sending some of the more zealous WIMP advocates back in time to talk to their younger selves. What would they say? How would they respond to themselves? These are not people who like to be contradicted by anyone, even themselves, so I suspect it would go something like

Old scientist: “Hey, kid – I’m future you. This experiment you’re about to spend your life working on won’t detect what you’re looking for.”

Young scientist: “Uh huh. You say you’re me from the future, Mr. Credibility? Tell me: at what point do I go senile, you doddering old fool?”

Old scientist: “You don’t. It just won’t work out the way you think. On top of dark matter, there’s also dark energy…”

Young scientist: “What the heck is dark energy, you drooling crackpot?”

Old scientist: “The cosmological constant.”

Young scientist: “The cosmological constant! You can’t expect people to take you seriously talking about that rubbish. GTFO.”

That’s the polite version that doesn’t end in fisticuffs. It’s easy to imagine this conversation going south much faster. I know that if 1993 me had received a visit from 1998 me telling me that in five years I would have come to doubt WIMPs, and also would have demonstrated that the answer to the missing mass problem might not be dark matter at all, I… would not have taken it well.

That’s why predictions are important in science. They tell us when to change our mind. When to stop what we’re doing because it’s not working. When to admit that we were wrong, and maybe consider something else. Maybe that something else won’t prove correct. Maybe the next ten something elses won’t. But we’ll never find out if we won’t let go of the first wrong thing.

Some Outsider Perspective from Insiders

Avi Loeb has a nice recent post Recalculating Academia, in which he discusses some of the issues confronting modern academia. One of the reasons I haven’t written here for a couple of months is despondency over the same problems. If you’re here reading this, you’ll likely be interested in what he has to say.

I am not eager to write at length today, but I do want to amplify some of the examples he gives with my own experience. For example, he notes that there are

theoretical physicists who avoid the guillotine of empirical tests for half a century by dedicating their career to abstract conjectures, avoid the risk of being proven wrong while demonstrating mathematical virtuosity.

Avi Loeb

I recognize many kinds of theoretical physicists who fit this description. My first thought was string theory, which took off in the mid-80s when I was a grad student at Princeton, ground zero for that movement in the US. (The Russians indulged in this independently.) I remember a colloquium in which David Gross advocated the “theory of everything” with gratuitous religious fervor to a large audience of eager listeners quavering with anticipation; the whole event had the texture of religious revelation. It was captivating and convincing, up until the point near the end when he noted that experimental tests were many orders of magnitude beyond any experiment conceivable at the time. That… wasn’t physics to me. If this was the path the field was going down, I wanted no part of it. This was one of many factors that precipitated my departure from the toxic sludge that was grad student life in the Princeton physics department.

I wish I could say I had been proven wrong. Instead, decades later, physics has nothing to show for its embrace of string theory. There have been some impressive developments in mathematics stemming from it. Mathematics, not physics. And yet, there persists a large community of theoretical physicists who wander endlessly in the barren and practically infinite parameter space of multidimensional string theory. Maybe there is something relevant to physical reality there, or maybe it hasn’t been found because there isn’t. At what point does one admit that the objective being sought just ain’t there? [Death. For many people, the answer seems to be never. They keep repeating the same fruitless endeavor until they die.]

We do have new physics, in the form of massive neutrinos and the dark matter problem and the apparent acceleration of the expansion rate of the universe. What we don’t have is the expected evidence for supersymmetry, the crazy-bold yet comparatively humble first step on the road to string theory. If they had got even this much right, we should have seen evidence for it at the LHC, for example in the decay of the aptly named BS meson. If supersymmetric particles existed, they should provide many options for the meson to decay into, which otherwise has few options in the Standard Model of particle physics. This was a strong prediction of minimal supersymmetry, so much so that it was called the Golden Test of supersymmetry. After hearing this over and over in the ’80s and ’90s, I have not heard it again any time in this century. I’m not sure when the theorists stopped talking about this embarrassment, but I suspect it is long enough ago now that it will come as a surprise to younger scientists, even those who work in the field. Supersymmetry flunked the golden test, and it flunked it hard. Rather than abandon the theory (some did), we just stopped talking about it. There persists a large community of theorists who take supersymmetry for granted, and react with hostility if you question that Obvious Truth. They will tell you with condescension that only minimal supersymmetry is ruled out; there is an enormous parameter space still open for their imaginations to run wild, unbridled by experimental constraint. This is both true and pathetic.

Reading about the history of physics, I learned that there was a community of physicists who persisted believing in aether for decades after the Michelson-Morley experiment. After all, only some forms of aether were ruled out. This was true, at the time, but we don’t bother with that detail when teaching physics now. Instead, it gets streamlined to “aether was falsified by Michelson-Morley.” This is, in retrospect, true, and we don’t bother to mention those who pathetically kept after it.

The standard candidate for dark matter, the WIMP, is a supersymmetric particle. If supersymmetry is wrong, WIMPs don’t exist. And yet, there is a large community of particle physicists who persist in building ever bigger and better experiments designed to detect WIMPs. Funny enough, they haven’t detected anything. It was a good hypothesis, 38 years ago. Now it’s just a bad habit. The better ones tacitly acknowledge this, attributing their continuing efforts to the streetlight effect: you look where you can see.

Prof. Loeb offers another pertinent example:

When I ask graduating students at their thesis exam whether the cold dark matter paradigm will be proven wrong if their computer simulations will be in conflict with future data, they almost always say that any disagreement will indicate that they should add a missing ingredient to their theoretical model in order to “fix” the discrepancy.

Avi Loeb

This is indeed the attitude. So much so that no additional ingredient seems too absurd if it is what we need to save the phenomenon. Feedback is the obvious example in my own field, as that (or the synonyms “baryon physics” or “gastrophysics”) is invoked to explain away any and all discrepancies. It sounds simple, since feedback is a real effect that does happen, but this single word does a lot of complicated work under the hood. There are many distinct kinds of feedback: stellar winds, UV radiation from massive stars, supernovae when those stars explode, X-rays from compact sources like neutron stars, and relativistic jets from supermassive black holes at the centers of galactic nuclei. These are the examples of feedback that I can think of off the top of my head; there are probably more. All of these things have perceptible, real-world effects on the relevant scales, with, for example, stars blowing apart the dust and gas of their stellar cocoons after they form. This very real process has bugger all to do with what feedback is invoked to do on galactic scales. Usually, supernovae are blamed by theorists for any and all problems in dwarf galaxies, while observers tell me that stellar winds do most of the work in disrupting star forming regions. Confronted with this apparent discrepancy, the usual answer is that it doesn’t matter how the energy is input into the interstellar medium, just that it is. Yet we can see profound differences between stellar winds and supernova explosions, so this does not inspire confidence for the predictive power of theories that generically invoke feedback to explain away problems that wouldn’t be there in a healthy theory.

This started a long time ago. I had already lost patience with this unscientific attitude to the point that I dubbed it the

Spergel Principle: “It is better to postdict than to predict.”

McGaugh 1998

This continues to go on and has now done so for so long that generations of students seem to think that this is how science is supposed to be done. If asked about hypothesis testing and whether a theory can be falsified, many theorists will first look mystified, then act put out. Why would you even ask that? (One does not question the paradigm.) The minority of better ones then rally to come up with some reason to justify that yes, what they’re talking about can be falsified, so it does qualify as physics. But those goalposts can always be moved.

A good example of moving goalposts is the cusp-core problem. When I first encountered this in the mid to late ’90s, I tried to figure a way out of it, but failed. So I consulted one of the very best theorists, Simon White. When I asked him what he thought would constitute a falsification of cold dark matter, he said cusps: “cusps have to be there” [in the center of a dark matter halo]. Flash forward to today, when nobody would accept that as a falsification of cold dark matter: it can be fixed by feedback. Which would be fine, if it were true, which isn’t really clear. At best it provides a post facto explanation for an unpredicted phenomenon without addressing the underlying root cause, that the baryon distribution is predictive of the dynamics.

This is like putting a band-aid on a Tyrannosaurus. It’s already dead and fossilized. And if it isn’t, well, you got bigger problems.

Another disease common to theory is avoidance. A problem is first ignored, then the data are blamed for showing the wrong thing, then they are explained in a way that may or may not be satisfactory. Either way, it is treated as something that had been expected all along.

In a parallel to this gaslighting, I’ve noticed that it has become fashionable of late to describe unsatisfactory explanations as “natural.” Saying that something can be explained naturally is a powerful argument in science. The traditional meaning is that ok, we hadn’t contemplated this phenomenon before it surprised us, but if we sit down and work it out, it makes sense. The “making sense” part means that an answer falls out of a theory easily when the right question is posed. If you need to run gazillions of supercomputer CPU hours of a simulation with a bunch of knobs for feedback to get something that sorta kinda approximates reality but not really, your result does not qualify as natural. It might be right – that’s a more involved adjudication – but it doesn’t qualify as natural and the current fad to abuse this term again does not inspire confidence that the results of such simulations might somehow be right. Just makes me suspect the theorists are fooling themselves.

I haven’t even talked about astroparticle physicists or those who engage in fantasies about the multiverse. I’ll just close by noting that Popper’s criterion for falsification was intended to distinguish between physics and metaphysics. That’s not the same as right or wrong, but physics is subject to experimental test while metaphysics is the stuff of late night bull sessions. The multiverse is manifestly metaphysical. Cool to think about, has lots of implications for philosophy and religion, but not physics. Even Gross has warned against treading down the garden path of the multiverse. (Tell me that you’re warning others not to make the same mistakes you made without admitting you made mistakes.)

There are a lot of scientists who would like to do away with Popper, or any requirement that physics be testable. These are inevitably the same people whose fancy turns to metascapes of mathematically beautiful if fruitless theories, and want to pass off their metaphysical ramblings as real physics. Don’t buy it.

A brief history of the Radial Acceleration Relation


In science, all new and startling facts must encounter in sequence the responses

1. It is not true!

2. It is contrary to orthodoxy.

3. We knew it all along.

Louis Agassiz (circa 1861)

This expression exactly depicts the progression of the radial acceleration relation. Some people were ahead of this curve, others are still behind it, but it quite accurately depicts the mass sociology. This is how we react to startling new facts.

For quotation purists, I’m not sure exactly what the original phrasing was. I have paraphrased it to be succinct and have substituted orthodoxy for religion, because even scientists can have orthodoxies: holy cows that must not be slaughtered.

I might even add a precursor stage zero to the list above:

0. It goes unrecognized.

This is to say that if a new fact is sufficiently startling, we don’t just disbelieve it (stage 1); at first we fail to see it at all. We lack the cognitive framework to even recognize how important it is. An example is provided by the 1941 detection of the microwave background by Andrew McKellar. In retrospect, this is as persuasive as the 1964 detection of Penzias and Wilson to which we usually ascribe the discovery. At the earlier time, there was simply no framework for recognizing what it was that was being detected. It appears to me that P&W didn’t know what they were looking at either until Peebles explained it to them.

The radial acceleration relation was first posed as the mass discrepancy-acceleration relation. They’re fundamentally the same thing, just plotted in a slightly different way. The mass discrepancy-acceleration relation shows the ratio of total mass to that which is visible. This is basically the ratio of the observed acceleration to that predicted by the observed baryons. This is useful to see how much dark matter is needed, but by construction the axes are not independent, as both measured quantities are used in forming the ratio.

The radial acceleration relation shows independent observations along each axis: observed vs. predicted acceleration. Though measured independently, they are not physically independent, as the baryons contribute some to the total observed acceleration – they do have mass, after all. One can construct a halo acceleration relation by subtracting the baryonic contribution away from the total; in principle the remainders are physically independent. Unfortunately, the axes again become observationally codependent, and the uncertainties blow up, especially in the baryon dominated regime. Which of these depictions is preferable depends a bit on what you’re looking to see; here I just want to note that they are the same information packaged somewhat differently.
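To make the three packagings concrete, here is a minimal sketch in Python. The function name, the unit conversion, and the rotation-curve numbers are all made up for illustration; they are not taken from any of the data sets discussed here. It assumes centripetal acceleration g = V²/r for both the observed rotation speed and the speed predicted from the baryons.

```python
import numpy as np

KM_PER_KPC = 3.0857e16  # kilometres in one kiloparsec

def accelerations(r_kpc, v_obs_kms, v_bar_kms):
    """Return (g_obs, g_bar) in m/s^2 from radii in kpc and speeds in km/s."""
    r_m = np.asarray(r_kpc, dtype=float) * KM_PER_KPC * 1e3   # kpc -> m
    v_obs = np.asarray(v_obs_kms, dtype=float) * 1e3          # km/s -> m/s
    v_bar = np.asarray(v_bar_kms, dtype=float) * 1e3
    return v_obs**2 / r_m, v_bar**2 / r_m

# Hypothetical rotation curve (illustrative values only)
r = [2.0, 5.0, 10.0]         # kpc
v_obs = [60.0, 80.0, 85.0]   # observed rotation speed, km/s
v_bar = [45.0, 50.0, 42.0]   # speed predicted by stars + gas, km/s

g_obs, g_bar = accelerations(r, v_obs, v_bar)

# Radial acceleration relation: plot g_obs against g_bar.
# Mass discrepancy-acceleration relation: plot this ratio against g_bar.
mass_discrepancy = g_obs / g_bar
# Halo acceleration relation: what remains after subtracting the baryons.
g_halo = g_obs - g_bar
```

Note that `mass_discrepancy` and `g_halo` both reuse `g_bar`, which is the codependence of the axes described above; only the plot of `g_obs` against `g_bar` keeps the two measurements separate.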

To the best of my knowledge, the first mention of the mass discrepancy-acceleration relation in the scientific literature is by Sanders (1990). Its existence is explicit in MOND (Milgrom 1983), but here it is possible to draw a clear line between theory and data. I am only speaking of the empirical relation as it appears in the data, irrespective of anything specific to MOND.

I met Bob Sanders, along with many other talented scientists, in a series of visits to the University of Groningen in the early 1990s. Despite knowing him and having talked to him about rotation curves, I was unaware that he had done this.

Stage 0: It goes unrecognized.

For me, stage one came later in the decade at the culmination of a several years’ campaign to examine the viability of the dark matter paradigm from every available perspective. That’s a long paper, which nevertheless drew considerable praise from many people who actually read it. If you go to the bother of reading it today, you will see the outlines of many issues that are still debated and others that have been forgotten (e.g., the fine-tuning issues).

Around this time (1998), the dynamicists at Rutgers were organizing a meeting on galaxy dynamics, and asked me to be one of the speakers. I couldn’t possibly discuss everything in the paper in the time allotted, so was looking for a way to show the essence of the challenge the data posed. Consequently, I reinvented the wheel, coming up with the mass discrepancy-acceleration relation. Here I show the same data that I had then in the form of the radial acceleration relation:

The Radial Acceleration Relation from the data in McGaugh (1999). Plot credit: Federico Lelli. (There is a time delay in publication: the 1998 meeting’s proceedings appeared in 1999.)

I recognize this version of the plot as having been made by Federico Lelli. I’ve made this plot many times, but this is the version I came across first, and it is better than mine in that the opacity of the points illustrates where the data are concentrated. I had been working on low surface brightness galaxies; these have low accelerations, so that part of the plot is well populated.

The data show a clear correlation. By today’s standards, it looks crude. Going on what we had then, it was fantastic. Correlations practically never look this good in extragalactic astronomy, and they certainly don’t happen by accident. Low quality data can hide a correlation – uncertainties cause scatter – but they can’t create a correlation where one doesn’t exist.

This result was certainly startling if not as new as I then thought. That’s why I used the title How Galaxies Don’t Form. This was contrary to our expectations, as I had explained in exhaustive detail in the long paper and revisit in a recent review for philosophers and historians of science.

I showed the same result later that year (1998) at a meeting on the campus of the University of Maryland where I was a brand new faculty member. It was a much shorter presentation, so I didn’t have time to justify the context or explain much about the data. Contrary to the reception at Rutgers where I had adequate time to speak, the hostility of the audience to the result was palpable, their stony silence eloquent. They didn’t want to believe it, and plenty of people got busy questioning the data.

Stage 1: It is not true.

I spent the next five years expanding and improving the data. More rotation curves became available thanks to the work of many, particularly Erwin de Blok, Marc Verheijen, and Rob Swaters. That was great, but the more serious limitation was how well we could measure the stellar mass distribution needed to predict the baryonic acceleration.

The mass models we could build at the time were based on optical images. A mass model takes the observed light distribution, assigns a mass-to-light ratio, and makes a numerical solution of the Poisson equation to obtain the gravitational force corresponding to the observed stellar mass distribution. This is how we obtain the stellar contribution to the predicted baryonic force; the same procedure is applied to the observed gas distribution. The blue part of the spectrum is the best place in which to observe low contrast, low surface brightness galaxies as the night sky is darkest there, at least during new moon. That’s great for measuring the light distribution, but what we want is the stellar mass distribution. The mass-to-light ratio is expected to have a lot of scatter in the blue band simply from the happenstance of recent star formation, which makes bright blue stars that are short-lived. If there is a stochastic uptick in the star formation rate, then the mass-to-light ratio goes down because there are lots of bright stars. Wait a few hundred million years and these die off, so the mass-to-light ratio gets bigger (in the absence of further new star formation). The time-integrated stellar mass may not change much, but the amount of blue light it produces does. Consequently, we expect to see well-observed galaxies trace distinct lines in the radial acceleration plane, even if there is a single universal relation underlying the phenomenon. This happens simply because we expect to get M*/L wrong from one galaxy to the next: in 1998, I had simply assumed all galaxies had the same M*/L for lack of any better prescription. Clearly, a better prescription was warranted.
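The effect of a wrong M*/L can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual mass-modeling code: the function name and all numbers are hypothetical, the stellar contribution is assumed to be tabulated for M*/L = 1 so that it scales linearly in V² with the adopted ratio, and the gas contribution is taken as fixed.

```python
import numpy as np

def v_bar_squared(v_star_unit, v_gas, ml_ratio):
    """Baryonic V^2 in (km/s)^2.

    v_star_unit: stellar contribution computed for M*/L = 1 (km/s)
    v_gas:       gas contribution (km/s)
    ml_ratio:    assumed stellar mass-to-light ratio
    """
    v_star_unit = np.asarray(v_star_unit, dtype=float)
    v_gas = np.asarray(v_gas, dtype=float)
    return ml_ratio * v_star_unit**2 + v_gas**2

# Hypothetical components at three radii (illustrative values only)
v_star_unit = [40.0, 50.0, 45.0]
v_gas = [10.0, 20.0, 30.0]

low = v_bar_squared(v_star_unit, v_gas, ml_ratio=0.5)
high = v_bar_squared(v_star_unit, v_gas, ml_ratio=1.0)

# At fixed radius g_bar = V_bar^2 / r, so the radius cancels and the
# horizontal shift in log10(g_bar) is just the log of the V^2 ratio.
shift_dex = np.log10(high / low)
```

Doubling M*/L moves each point along the predicted-acceleration axis by less than log10(2) ≈ 0.3 dex, with the largest shifts where stars dominate over gas. A galaxy whose M*/L is misjudged therefore traces its own offset line rather than scattering randomly about the mean relation.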

In those days, I traveled through Tucson to observe at Kitt Peak with some frequency. On one occasion, I found myself with a few hours to kill between coming down from the mountain and heading to the airport. I wandered over to the Steward Observatory at the University of Arizona to see who I might see. A chance meeting in the wild west: I encountered Eric Bell and Roelof de Jong, who were postdocs there at the time. I knew Eric from his work on the stellar populations of low surface brightness galaxies, an interest closely aligned with my own, and Roelof from my visits to Groningen.

As we got to talking, Eric described to me work they were doing on stellar populations, and how they thought it would be possible to break the age-metallicity degeneracy using near-IR colors in addition to optical colors. They were mostly focused on improving the age constraints on stars in LSB galaxies, but as I listened, I realized they had constructed a more general, more powerful tool. At my encouragement (read their acknowledgements), they took on this more general task, ultimately publishing the classic Bell & de Jong (2001). In it, they built a table that enabled one to look up the expected mass-to-light ratio of a complex stellar population – one actively forming stars – as a function of color. This was a big step forward over my educated guess of a constant mass-to-light ratio: there was now a way to use a readily observed property, color, to improve the estimated M*/L of each galaxy in a well-calibrated way.

Combining the new stellar population models with all the rotation curves then available, I obtained an improved mass discrepancy-acceleration relation:

The Radial Acceleration Relation from the data in McGaugh (2004); version using Bell’s stellar population synthesis models to estimate M*/L (see Fig. 5 for other versions). Plot credit: Federico Lelli.

Again, the relation is clear, but with scatter. Even with the improved models of Bell & de Jong, some individual galaxies have M*/L that are wrong – that’s inevitable in this game. What you cannot know is which ones! Note, however, that there are now 74 galaxies in this plot, and almost all of them fall on top of each other where the point density is large. There are some obvious outliers; those are presumably just that: the trees that fall outside the forest because of the expected scatter in M*/L estimates.

I tried a variety of prescriptions for M*/L in addition to that of Bell & de Jong. Though they differed in texture, they all told a consistent story. A relation was clearly present; only its detailed form varied with the adopted prescription.

The prescription that minimized the scatter in the relation was the M*/L obtained in MOND fits. That’s a tautology: by construction, a MOND fit finds the M*/L that puts a galaxy on this relation. However, we can generalize the result. Maybe MOND is just a weird, unexpected way of picking a number that has this property; it doesn’t have to be the true mass-to-light ratio in nature. But one can then define a ratio Q

Equation 21 of McGaugh (2004).

that relates the “true” mass-to-light ratio to the number that gives a MOND fit. They don’t have to be identical, but MOND does return M*/L that are reasonable in terms of stellar populations, so Q ~ 1. Individual values could vary, and the mean could be a bit more or less than unity, but not radically different. One thing that impressed me at the time about the MOND fits (most of which were made by Bob Sanders) was how well they agreed with the stellar population models, recovering the correct amplitude, the correct dependence on color in different bandpasses, and also giving the expected amount of scatter (more in the blue than in the near-IR).

Fig. 7 of McGaugh (2004). Stellar mass-to-light ratios of galaxies in the blue B-band (top) and near-IR K-band (bottom) as a function of B-V color for the prescription of maximum disk (left) and MOND (right). Each point represents one galaxy for which the requisite data were available at the time. The line represents the mean expectation of stellar population synthesis models from Bell et al. (2003). These lines are completely independent of the data: neither the normalization nor the slope has been fit to the dynamical data. The red points are due to Sanders & Verheijen (1998); note the weak dependence of M*/L on color in the near-IR.

The obvious interpretation is that we should take seriously a theory that obtains good fits with a single free parameter that checks out admirably well with independent astrophysical constraints, in this case the M*/L expected for stellar populations. But I knew many people would not want to do that, so I defined Q to generalize to any M*/L in any (dark matter) context one might want to consider.

Indeed, Q allows us to write a general expression for the rotation curve of the dark matter halo (essentially the HAR alluded to above) in terms of that of the stars and gas:

Equation 22 of McGaugh (2004).

The stars and the gas are observed, and μ is the MOND interpolation function assumed in the fit that leads to Q. Except now the interpolation function isn’t part of some funny new theory; it is just the shape of the radial acceleration relation – a relation that is there empirically. The only fit factor between these data and any given model is Q – a single number of order unity. This does leave some wiggle room, but not much.
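For readers who want to see how an expression of this kind comes about, here is a sketch reconstructed from the definitions in the text rather than quoted from the paper. The notation is an assumption: Υ denotes a stellar mass-to-light ratio, V_* is the stellar contribution evaluated with the MOND-fit value of M*/L, V_g is the gas contribution, V_h is the halo contribution, and the MOND fit is taken in its simple algebraic form.

```latex
% Assumed definition: ratio of the "true" M*/L to the MOND-fit value
Q \equiv \frac{\Upsilon_{\star}}{\Upsilon_{\mathrm{MOND}}}

% The MOND fit reproduces the observed rotation curve from the baryons alone:
V^{2} = \frac{V_{\star}^{2} + V_{g}^{2}}{\mu}

% In a dark matter context the same observed curve is a sum of components,
% with the stellar term rescaled by Q:
V^{2} = V_{h}^{2} + Q\,V_{\star}^{2} + V_{g}^{2}

% Equating the two and solving for the halo:
V_{h}^{2} = \left(\frac{1}{\mu} - Q\right) V_{\star}^{2}
          + \left(\frac{1}{\mu} - 1\right) V_{g}^{2}
```

As a check on the algebra, at high accelerations μ → 1, so with Q = 1 the required halo contribution vanishes, as it should where the baryons account for the observed rotation.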

I went off to a conference to describe this result. At the 2006 meeting Galaxies in the Cosmic Web in New Mexico, I went out of my way at the beginning of the talk to show that even if we ignore MOND, this relation is present in the data, and it provides a strong constraint on the required distribution of dark matter. We may not know why this relation happens, but we can use it, modulo only the modest uncertainty in Q.

Having bent over backwards to distinguish the data from the theory, I was disappointed when, immediately at the end of my talk, prominent galaxy formation theorist Anatoly Klypin loudly shouted

“We don’t have to explain MOND!”

It stinks of MOND!

But you do have to explain the data. The problem was and is that the data look like MOND. It is easy to conflate one with the other; I have noticed that a lot of people have trouble keeping the two separate. Just because you don’t like the theory doesn’t mean that the data are wrong. What Anatoly was saying was that

2. It is contrary to orthodoxy.

Despite phrasing the result in a way that would be useful to galaxy formation theorists, they did not, by and large, claim to explain it at the time – it was contrary to orthodoxy so didn’t need to be explained. Looking at the list of papers that cite this result, the early adopters were not the target audience of galaxy formation theorists, but rather others citing it to say variations of “no way dark matter explains this.”

At this point, it was clear to me that further progress required a better way to measure the stellar mass distribution. Looking at the stellar population models, the best hope was to build mass models from near-infrared rather than optical data. The near-IR is dominated by old stars, especially red giants. Galaxies that have been forming stars actively for a Hubble time tend towards a quasi-equilibrium in which red giants are replenished by stellar evolution at about the same rate they move on to the next phase. One therefore expects the mass-to-light ratio to be more nearly constant in the near-IR. Not perfectly so, of course, but a 2 or 3 micron image is as close to a map of the stellar mass of a galaxy as we’re likely to get.

Around this time, the University of Maryland had begun a collaboration with Kitt Peak to build a big infrared camera, NEWFIRM, for the 4m telescope. Rob Swaters was hired to help write software to cope with the massive data flow it would produce. The instrument was divided into quadrants, each of which had a field of view sufficient to hold a typical galaxy. When it went on the telescope, we developed an efficient observing method that I called “four-shooter”, shuffling the target galaxy from quadrant to quadrant so that in processing we could remove the numerous instrumental artifacts intrinsic to its InSb detectors. This eventually became one of the standard observing modes in which the instrument was operated.

NEWFIRM in the lab in Tucson. Most of the volume is for cryogenics: the IR detectors are helium-cooled to 30 K. Partial student for scale.

I was optimistic that we could make rapid progress, and at first we did. But despite all the work, despite all the active cooling involved, we were still on the ground. The night sky was painfully bright in the IR. Indeed, the thermal component dominated, so we could observe during full moon. To an observer of low surface brightness galaxies attuned to any hint of scattered light from so much as a crescent moon, I cannot describe how discombobulating it was to walk outside the dome and see the full fricking moon. So bright. So wrong. And that wasn’t even the limiting factor: the thermal background was.

We had hit a surface brightness wall, again. We could do the bright galaxies this way, but the LSBs that sample the low acceleration end of the radial acceleration relation were rather less accessible. Not inaccessible, but there was a better way.

The Spitzer Space Telescope was active at this time. Jim Schombert and I started winning time to observe LSB galaxies with it. We discovered that space is dark. There was no atmosphere to contend with. No scattered light from the clouds or the moon or the OH lines that afflict that part of the sky spectrum. No ground-level warmth. The data were fantastic. In some sense, they were too good: the biggest headache we faced was blotting out all the background galaxies that shone right through the optically thin LSB galaxies.

Still, it took a long time to collect and analyze the data. We were starting to get results by the early-teens, but it seemed like it would take forever to get through everything I hoped to accomplish. Fortunately, when I moved to Case Western, I was able to hire Federico Lelli as a postdoc. Federico’s involvement made all the difference. After many months of hard, diligent, and exacting work, he constructed what is now the SPARC database. Finally all the elements were in place to construct an empirical radial acceleration relation with absolutely minimal assumptions about the stellar mass-to-light ratio.

In parallel with the observational work, Jim Schombert had been working hard to build realistic stellar population models that extended to the 3.6 micron band of Spitzer. Spitzer had been built to look redwards of this, further into the IR. 3.6 microns was its shortest wavelength passband. But most models at the time stopped at the K-band, the 2.2 micron band that is the reddest passband that is practically accessible from the ground. They contain pretty much the same information, but we still need to calculate the band-specific value of M*/L.

Being a thorough and careful person, Jim considered not just the star formation history of a model stellar population as a variable, and not just its average metallicity, but also the metallicity distribution of its stars, making sure that these were self-consistent with the star formation history. Realistic metallicity distributions are skewed; it turns out that this subtle effect tends to counterbalance the color dependence of the age effect on M*/L in the near-IR part of the spectrum. The net result is that we expect M*/L to be very nearly constant for all late type galaxies.

This is the best possible result. To a good approximation, we expected all of the galaxies in the SPARC sample to have the same mass-to-light ratio. What you see is what you get. No variable M*/L, no equivocation, just data in, result out.

We did still expect some scatter, as that is an irreducible fact of life in this business. But even that we expected to be small, between 0.1 and 0.15 dex (roughly 25 – 40%). Still, we expected the occasional outlier, galaxies that sit well off the main relation just because our nominal M*/L didn’t happen to apply in that case.
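For anyone not used to thinking in dex, the conversion to a percentage is a one-liner. The helper name below is hypothetical; the only thing it assumes is the standard definition that a scatter of x dex corresponds to a multiplicative factor of 10^x.

```python
def dex_to_percent(dex):
    """Fractional increase, in percent, corresponding to a factor of 10**dex."""
    return (10.0**dex - 1.0) * 100.0

expected_low = dex_to_percent(0.10)   # lower end of the expected scatter
expected_high = dex_to_percent(0.15)  # upper end of the expected scatter

print(f"0.10 dex is about {expected_low:.0f}%")
print(f"0.15 dex is about {expected_high:.0f}%")
```

This gives about 26% and 41%, which is where the rough 25 – 40% range comes from.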

One day as I walked past Federico’s office, he called for me to come look at something. He had plotted all the data together assuming a single M*/L. There… were no outliers. The assumption of a constant M*/L in the near-IR didn’t just work, it worked far better than we had dared to hope. The relation leapt straight out of the data:

The Radial Acceleration Relation from the data in McGaugh et al. (2016). Plot credit: Federico Lelli.

Over 150 galaxies, with nearly 2700 resolved measurements among them, each galaxy with its own distinctive mass distribution, all pile on top of each other without effort. There was plenty of effort in building the database, but once it was there, the result appeared, no muss, no fuss. No fitting or fiddling. Just the measurements and our best estimate of the mean M*/L, applied uniformly to every individual galaxy in the sample. The scatter was only 0.12 dex, within the range expected from the population models.

No MOND was involved in the construction of this relation. It may look like MOND, but we neither use MOND nor need it in any way to see the relation. It is in the data. Perhaps this is the sort of result for which we would have to invent MOND if it did not already exist. But the dark matter paradigm is very flexible, and many papers have since appeared that claim to explain the radial acceleration relation. We have reached

3. We knew it all along.

On the one hand, this is good: the community is finally engaging with a startling fact that has been pointedly ignored for decades. On the other hand, many of the claims to explain the radial acceleration relation are transparently incorrect on their face, being nothing more than elaborations of models I considered and discarded as obviously unworkable long ago. They do not provide a satisfactory explanation of the predictive power of MOND, and inevitably fail to address important aspects of the problem, like disk stability. Rather than grapple with the deep issues the new and startling fact poses, it has become fashionable to simply assert that one’s favorite model explains the radial acceleration relation, and does so naturally.

There is nothing natural about the radial acceleration relation in the context of dark matter. Indeed, it is difficult to imagine a less natural result – hence stages one and two. So on the one hand, I welcome the belated engagement, and am willing to consider serious models. On the other hand, if someone asserts that this is natural and that we expected it all along, then the engagement isn’t genuine: they’re just fooling themselves.

Early Days. This was one of Vera Rubin’s favorite expressions. I always had a hard time with it, as many things are very well established. Yet it seems that we have yet to wrap our heads around the problem. Vera’s daughter, Judy Young, once likened the situation to the parable of the blind men and the elephant. Much is known, yes, but the problem is so vast that each of us can perceive only a part of the whole, and the whole may be quite different from the part that is right before us.

So I guess Vera is right as always: these remain Early Days.

A script for every observational test


Science progresses through hypothesis testing. The primary mechanism for distinguishing between hypotheses is predictive power. The hypothesis that can predict new phenomena is “better.” This is especially true for surprising, a priori predictions: it matters more when the new phenomenon was not expected in the context of an existing paradigm.

I’ve seen this happen many times now. MOND has had many predictive successes. As a theory, it has been exposed to potential falsification, and passed many tests. These have often been in the form of phenomena that had not been anticipated in any other way, and were initially received as strange to the point of seeming impossible. It is exactly the situation envisioned in Putnam’s “no miracles” argument: it is unlikely to the point of absurdity that a wholly false theory should succeed in making so many predictions of such diversity and precision.

MOND has many doubters, which I can understand. What I don’t get is the ignorance I so often encounter among them. To me, the statement that MOND has had many unexpected predictions come true is a simple statement of experiential fact. I suspect it will be received by some as a falsehood. It shouldn’t be, so if you don’t know what I’m talking about, you should try reading the relevant literature. What papers about MOND have you actually read?

Ignorance is not a strong basis for making scientific judgements. Before I criticize something, I make sure I know what I’m talking about. That’s rarely true of the complaints I hear against MOND. There are legitimate ones, to be sure, but for the most part I hear assertions like

  • MOND is guaranteed to fit rotation curves.
  • It fits rotation curves but does nothing else.
  • It is just a fitting tool with no predictive power.

These are myths, plain and simple. They are easily debunked, and were long ago. Yet I hear them repeated often by people who think they know better, one as recently as last week. Serious people who expect to be taken seriously as scientists, and yet they repeat known falsehoods as if they were established fact. Is there a recycling bin of debunked myths that gets passed around? I guess it is easy to believe a baseless rumor when it conforms to your confirmation bias: no need for fact-checking!

Aside from straight-up reality denial, another approach is to claim that dark matter predicts exactly the same thing, whatever it is. I’ve seen this happen so often, I know how the script always goes:


• We make a new observation X that is surprising.
• We test the hypothesis, and report the result: “Gee, MOND predicted this strange effect, and we see evidence of it in the data.”
• Inevitable Question: What does LCDM predict?
• Answer: Not that.
• Q: But what does it predict?
• A: It doesn’t really make a clear prediction on this subject, so we have to build some kind of model to even begin to address this question. In the most obvious models one can construct, it predicts Y. Y is not the same as X.
• Q: What about more complicated models?
• A: One can construct more complicated models, but they are not unique. They don’t make a prediction so much as provide a menu of options from which we may select the results that suit us. The obvious danger is that it becomes possible to do anything, and we have nothing more than an epicycle theory of infinite possibilities. If we restrict ourselves to considering the details of serious models that have only been partially fine-tuned over the course of the development of the field, then there are still a lot of possibilities. Some of them come closer to reality than others but still don’t really do the right thing for the following reasons…[here follows 25 pages of minutiae in the ApJ considering every up/down left/right stand on your head and squint possibility that still winds up looking more like Y than like X.] You certainly couldn’t predict X this way, as MOND did a priori.
• Q: That’s too long to read. Dr. Z says it works, so he must be right since we already know that LCDM is correct.

The thing is, Dr. Z did not predict X ahead of time. MOND did. Maybe Dr. Z’s explanation in terms of dark matter makes sense. Often it does not, but even if it does, so what? Why should I be more impressed with a theory that only explains things after they’re observed when another predicted them a priori?

There are lots of Dr. Z’s. No matter how carefully one goes through the minutiae, no matter how clearly one demonstrates that X cannot work in a purely conventional CDM context, there is always someone who says it does. That’s what people want to hear, so that’s what they choose to believe. Way easier that way. Or, as it has been noted before

Faced with the choice between changing one’s mind and proving that there is no need to do so, almost everybody gets busy on the proof.

J. K. Galbraith (1965)