On the timescale for galaxy formation

I’ve been wanting to expand on the previous post ever since I wrote it, which is over a month ago now. It has been a busy end to the semester. Plus, there’s a lot to say – nothing that hasn’t been said before, somewhere, somehow, yet still a lot to cobble together into a coherent story – if that’s even possible. This will be a long post, and there will be more afterwards to narrate the story of our big paper in the ApJ. My sole ambition here is to express the predictions of galaxy formation theory in LCDM and MOND in the broadest strokes.

A theory is only as good as its prior. We can always fudge things after the fact, so what matters most is what we predict in advance. What do we expect for the timescale of galaxy formation? To tell you what I’m going to tell you, it takes a long time to build a massive galaxy in LCDM, but it happens much faster in MOND.

Basic Considerations

What does it take to make a galaxy? A typical giant elliptical galaxy has a stellar mass of 9 × 10^10 M☉. That’s a bit more than our own Milky Way, which has a stellar mass of 5 or 6 × 10^10 M☉ (depending on who you ask) with another 10^10 M☉ or so in gas. So, in classic astronomy/cosmology style, let’s round off and say a big galaxy is about 10^11 M☉. That’s a hundred billion stars, give or take.

An elliptical galaxy (NGC 3379, left) and two spiral galaxies (NGC 628 and NGC 891, right).

How much of the universe does it take to make one big galaxy? The critical density of the universe is the over/under point for whether an expanding universe expands forever, or has enough self-gravity to halt the expansion and ultimately recollapse. Numerically, this quantity is ρ_crit = 3H_0²/(8πG), which for H_0 = 73 km/s/Mpc works out to 10^-29 g/cm³ or 1.5 × 10^-7 M☉/pc³. This is a very small number, but provides the benchmark against which we measure densities in cosmology. The density of any substance X is Ω_X = ρ_X/ρ_crit. The stars and gas in galaxies are made of baryons, and we know the baryon density pretty well from Big Bang Nucleosynthesis: Ω_b = 0.04. That means the average density of normal matter is very low, only about 4 × 10^-31 g/cm³. That’s less than one hydrogen atom per cubic meter – most of space is an excellent vacuum!
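For those who want to check the arithmetic, here is a quick sketch in Python (standard CGS constants; the numbers are rounded as in the text):

```python
import math

# CGS constants
G    = 6.674e-8          # cm^3 g^-1 s^-2
Msun = 1.989e33          # g
pc   = 3.086e18          # cm
Mpc  = 1e6 * pc

H0 = 73 * 1e5 / Mpc      # 73 km/s/Mpc converted to s^-1

rho_crit = 3 * H0**2 / (8 * math.pi * G)
print(f"rho_crit = {rho_crit:.1e} g/cm^3")                    # ~1e-29
print(f"         = {rho_crit * pc**3 / Msun:.1e} Msun/pc^3")  # ~1.5e-7

Omega_b = 0.04
print(f"rho_b    = {Omega_b * rho_crit:.1e} g/cm^3")          # ~4e-31
```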

This being the case, we need to scoop up a large volume to make a big galaxy. Going through the math, to gather up enough mass to make a 10^11 M☉ galaxy, we need a sphere with a radius of 1.6 Mpc. That’s in today’s universe; in the past the universe was denser by (1+z)³, so at z = 10 that’s “only” 140 kpc. Still, modern galaxies are much smaller than that; the effective edge of the disk of the Milky Way is at a radius of about 20 kpc, and most of the baryonic mass is concentrated well inside that: the typical half-light radius of a 10^11 M☉ galaxy is around 6 kpc. That’s a long way to collapse.
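And the volume arithmetic, continuing the snippet above:

```python
# radius of the sphere that contains the baryons of one big galaxy
Mgal  = 1e11 * Msun
rho_b = Omega_b * rho_crit
R0 = (3 * Mgal / (4 * math.pi * rho_b)) ** (1 / 3)
print(f"R(z=0)  = {R0 / Mpc:.1f} Mpc")                    # ~1.6 Mpc
z = 10                                                    # earlier universe denser by (1+z)^3
print(f"R(z={z}) = {R0 / (1 + z) / (1e3 * pc):.0f} kpc")  # ~140 kpc
```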

Monolithic Galaxy Formation

Given this much information, an early concept was monolithic galaxy formation. We have a big ball of gas in the early universe that collapses to form a galaxy. Why and how this got started was fuzzy. But we knew how much mass we needed and the volume it had to come from, so we can consider what happens as the gas collapses to create a galaxy.

Here we hit a big astrophysical reality check. Just how does the gas collapse? It has to dissipate energy to do so, and cool to form stars. Once stars form, they may feed energy back into the surrounding gas, reheating it and potentially preventing the formation of more stars. These processes are nontrivial to compute ab initio, and attempting to do so obsesses much of the community. We don’t agree on how these things work, so they are the knobs theorists can turn to change an answer they don’t like.

Even if we don’t understand star formation in detail, we do observe that stars have formed, and can estimate how many. Moreover, we do understand pretty well how stars evolve once formed. Hence a common approach is to build stellar population models with some prescribed star formation history and see what works. Spiral galaxies like the Milky Way formed a lot of stars in the past, and continue to do so today. To make 5 × 10^10 M☉ of stars in 13 Gyr requires an average star formation rate of 4 M☉/yr. The current measured star formation rate of the Milky Way is estimated to be 2 ± 0.7 M☉/yr, so the star formation rate has been nearly constant (averaging over stochastic variations) over time, perhaps with a gradual decline. Giant elliptical galaxies, in contrast, are “red and dead”: they have no current star formation and appear to have made most of their stars long ago. Rather than a roughly constant rate of star formation, they peaked early and declined rapidly. The cessation of star formation is also called quenching.

A common way to formulate the star formation rate in galaxies as a whole is the exponential star formation rate, SFR(t) = SFR_0 e^(−t/τ). A spiral galaxy has a low baseline star formation rate SFR_0 and a long burn time τ ~ 10 Gyr while an elliptical galaxy has a high initial star formation rate and a short e-folding time like τ ~ 1 Gyr. Many variations on this theme are possible, and are of great interest astronomically, but this basic distinction suffices for our discussion here. From the perspective of the observed mass and stellar populations of local galaxies, the standard picture for a giant elliptical was a large, monolithic island universe that formed the vast majority of its stars early on then quenched with a short e-folding timescale.
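Integrating this form gives the stellar mass in place at time t: M*(t) = SFR_0 τ (1 − e^(−t/τ)), neglecting the mass that dying stars return to the gas. A sketch with illustrative numbers of my own choosing (not fits to anything):

```python
import numpy as np

def stellar_mass(t_gyr, sfr0, tau_gyr):
    """M*(t) for SFR(t) = SFR0 * exp(-t/tau); sfr0 in Msun/yr, times in Gyr."""
    return sfr0 * tau_gyr * 1e9 * (1 - np.exp(-t_gyr / tau_gyr))

# spiral-like: low baseline SFR, long burn time
print(f"spiral:     {stellar_mass(13, 6, 10):.1e} Msun")   # ~4e10, Milky Way-ish
# elliptical-like: high initial SFR, short e-folding time
print(f"elliptical: {stellar_mass(13, 90, 1):.1e} Msun")   # ~9e10, red and dead after ~2 Gyr
```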

Galaxies as Island Universes

The density parameter Ω provides another useful way to think about galaxy formation. As cosmologists, we obsess about the global value of Ω because it determines the expansion history and ultimate fate of the universe. Here it has a more modest application. We can think of the region in the early universe that will ultimately become a galaxy as its own little closed universe. With a density parameter Ω > 1, it is destined to recollapse.

A fun and funny fact of the Friedmann equation is that the matter density parameter Ω_m → 1 at early times, so the early universe when galaxies form is matter dominated. It is also very uniform (more on that below). So any subset that is a bit more dense than average will have Ω > 1 just because the average is very close to Ω = 1. We can then treat this region as its own little universe (a “top-hat overdensity”) and use the Friedmann equation to solve for its evolution, as in this sketch:

The expansion of the early universe a(t) (blue line). A locally overdense region may behave as a closed universe, recollapsing in a finite time (red line) to potentially form a galaxy.

That’s great, right? We have a simple, analytic solution derived from first principles that explains how a galaxy forms. We can plug in the numbers to find how long it takes to form our basic, big 10^11 M☉ galaxy and… immediately encounter a problem. We need to know how overdense our protogalaxy starts out. Is its effective initial Ω_m = 2? 10? What value, at what time? The higher it is, the faster the evolution from initially expanding along with the rest of the universe to decoupling from the Hubble flow to collapsing. We know the math but we still need to know the initial condition.
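For reference, the textbook parametric solution for such a closed region, in terms of its initial density parameter Ω_i and expansion rate H_i (normalizing a_i = 1), is the cycloid:

```latex
a(\theta) = \frac{a_{\max}}{2}\left(1-\cos\theta\right), \qquad
t(\theta) = \frac{a_{\max}}{2H_i\sqrt{\Omega_i-1}}\left(\theta-\sin\theta\right), \qquad
a_{\max} = \frac{\Omega_i}{\Omega_i-1}.
```

Turnaround is at θ = π and collapse at θ = 2π, so the collapse time is t_coll = πΩ_i/[H_i(Ω_i − 1)^(3/2)], which diverges as Ω_i → 1: the closer the initial density is to critical, the longer the wait.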

Annoying Initial Conditions

The initial condition for galaxy formation is observed in the cosmic microwave background (CMB) at z = 1090. Where today’s universe is remarkably lumpy, the early universe is incredibly uniform. It is so smooth that it is homogeneous and isotropic to one part in a hundred thousand. This is annoyingly smooth, in fact. It would help to have some lumps – primordial seeds with Ω > 1 – from which structure can grow. The observed seeds are too tiny; the typical initial amplitude is 10^-5 so Ω_m = 1.00001. That takes forever to decouple and recollapse; it hasn’t yet had time to happen.

The cosmic microwave background as observed by ESA’s Planck satellite. This is an all-sky picture of the relic radiation field – essentially a snapshot of the universe when it was just a few hundred thousand years old. The variations in color are variations in temperature which correspond to variations in density. These variations are tiny, only about one part in 100,000. The early universe was very uniform; the real picture is a boring blank grayscale. We have to crank the contrast way up to see these minute variations.

We would like to know how the big galaxies of today – enormous agglomerations of stars and gas and dust separated by inconceivably vast distances – came to be. How can this happen starting from such homogeneous initial conditions, where all the mass is equally distributed? Gravity is an attractive force that makes the rich get richer, so it will grow the slight initial differences in density, but it is also weak and slow to act. A basic result in gravitational perturbation theory is that overdensities grow at the same rate the universe expands, which is inversely related to redshift. So if we see tiny fluctuations in density with amplitude 10^-5 at z = 1000, they should have only grown by a factor of 1000 and still be small today (10^-2 at z = 0). But we see structures of much higher contrast than that. You can’t get here from there.
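In equations: for a matter-dominated universe, the linear overdensity grows as

```latex
\delta \propto a = \frac{1}{1+z} \quad\Longrightarrow\quad
\delta(z{=}0) \approx \delta_i\,(1+z_i) \approx 10^{-5}\times 10^{3} = 10^{-2} \ll 1 .
```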

The rich large scale structure we see today is impossible starting from the smooth observed initial conditions. Yet here we are, so we have to do something to goose the process. This is one of the original motivations for invoking cold dark matter (CDM). If there is a substance that does not interact with photons, it can start to clump up early without leaving too large a mark on the relic radiation field. In effect, the initial fluctuations in mass are larger, just in the invisible substance. (That’s not to say the CDM doesn’t leave a mark on the CMB; it does, but it is subtle and entirely another story.) So the idea is that dark matter forms gravitational structures first, and the baryons fall in later to make galaxies.

An illustration of the linear growth of overdensities. Structure can grow in the dark matter (long dashed lines) with the baryons catching up only after decoupling (short dashed line). In effect, the dark matter gives structure formation a head start, nicely explaining the apparently impossible growth factor. This has been the standard picture for what seems like forever (illustration from Schramm 1992).

With the right amount of CDM – and it has to be just the right amount of a dynamically cold form of non-baryonic dark matter (stuff we still don’t know actually exists) – we can explain how the growth factor is 10^5 since recombination instead of a mere 10^3. The dark matter got a head start over the stuff we can see; it looks like 10^5 because the normal matter lagged behind, being entangled with the radiation field in a way the dark matter was not.

This has been the imperative need in structure formation theory for so long that it has become undisputed lore; an element of the belief system so deeply embedded that it is practically impossible to question. I risk getting ahead of the story, but it is important to point out that, like the interpretation of so much of the relevant astrophysical data, this belief assumes that gravity is normal. This assumption dictates the growth rate of structure, which in turn dictates the need to invoke CDM to allow structure to form in the available time. If we drop this assumption, then we have to work out what happens in each and every alternative that we might consider. That definitely gets ahead of the story, so first let’s understand what we should expect in LCDM.

Hierarchical Galaxy Formation in LCDM

LCDM predicts some things remarkably well but others not so much. The dark matter is well-behaved, responding only to gravity. Baryons, on the other hand, are messy – one has to worry about hydrodynamics in the gas, star formation, feedback, dust, and probably even magnetic fields. In a nutshell, LCDM simulations are very good at predicting the assembly of dark mass, but converting that into observational predictions relies on our incomplete knowledge of messy astrophysics. We know what the mass should be doing, but we don’t know so well how that translates to what we see. Mass good, light bad.

Starting with the assembly of mass, the first thing we learn is that the story of monolithic galaxy formation outlined above has to be wrong. Early density fluctuations start out tiny, even in dark matter. God didn’t plunk down island universes of galaxy mass then say “let there be galaxies!” The annoying initial conditions mean that little dark matter halos form first. These subsequently merge hierarchically to make ever bigger halos. Rather than top-down monolithic galaxy formation, we have the bottom-up hierarchical formation of dark matter halos.

The hierarchical agglomeration of dark matter halos into ever larger objects is often depicted as a merger tree. Here are four examples from the high resolution Illustris TNG50 simulation (Pillepich et al. 2019; Nelson et al. 2019).

Examples of merger trees from the TNG50-1 simulation (Pillepich et al. 2019; Nelson et al. 2019). Objects have been selected to have very nearly the same stellar mass at z=0. Mass is built up through a series of mergers. One large dark matter halo today (at top) has many antecedents (small halos at bottom). These merge hierarchically as illustrated by the connecting lines. The size of the symbol is proportional to the halo mass. I have added redshift and the corresponding age of the universe for vanilla LCDM in a more legible font. The color bar illustrates the specific star formation rate: the top row has objects that are still actively star forming like spirals; those in the bottom row are “red and dead” – things that have stopped forming stars, like giant elliptical galaxies. In all cases, there is a lot of merging and a modest rate of growth, with the typical object taking about half a Hubble time (~7 Gyr) to assemble half of its final stellar mass.

The hierarchical assembly of mass is generic in CDM. Indeed, it is one of its most robust predictions. Dark matter halos start small, and grow larger by a succession of many mergers. This gradual agglomeration is slow: note how tiny the dark matter halos at z = 10 are.

Strictly speaking, it isn’t even meaningful to talk about a single galaxy over the span of a Hubble time. It is hard to avoid this mental trap: surely the Milky Way has always been the Milky Way? So one imagines its evolution over time. This is monolithic thinking. Hierarchically, “the galaxy” refers at best to the largest progenitor, the object that traces the left edge of the merger trees above. But the other protogalactic chunks that eventually merge together are as much part of the final galaxy as the progenitor that happens to be largest.

This complicated picture is complicated further by the fact that what we see is stars, not mass. The luminosity we observe forms through a combination of in situ growth (star formation in the largest progenitor) and ex situ growth through merging. There is no reason for some preferred set of protogalaxies to form stars faster than the others (though of course there is some scatter about the mean), so presumably the light traces the mass of stars formed, which traces the underlying dark mass. Presumably.

That we should see lots of little protogalaxies at high redshift is nicely illustrated by this lookback cone from Yung et al (2022). Here the color and size of each point corresponds to the stellar mass. Massive objects are common at low redshift but become progressively rare at high redshift, petering out at z > 4 and basically absent at z = 10. This realization of the observable stellar mass tracks the assembly of dark mass seen in merger trees.

Fig. 2 from Yung et al. (2022) illustrating what an observer would see looking back through their simulation to high redshift.

This is what we expect to see in LCDM: lots of small protogalaxies at high redshift; the building blocks of later galaxies that had not yet merged. The observation of galaxies much brighter than this at high redshift by JWST poses a fundamental challenge to the paradigm: mass appears not to be subdivided as expected. So it is entirely justifiable that people have been freaking out that what we see are bright galaxies that are apparently already massive. That shouldn’t happen; it wasn’t predicted to happen; how can this be happening?

That’s all background that is assumed knowledge for our ApJ paper, so we’re only now getting to its Figure 1. This combines one of the merger trees above with its stellar mass evolution. The left panel shows the assembly of dark mass; the right panel shows the growth of stellar mass in the largest progenitor. This is what we expect to see in observations.


Fig. 1 from McGaugh et al (2024): A merger tree for a model galaxy from the TNG50-1 simulation (Pillepich et al. 2019; Nelson et al. 2019, left panel) selected to have M* ≈ 9 × 10^10 M☉ at z = 0; i.e., the stellar mass of a local L* giant elliptical galaxy (Driver et al. 2022). Mass assembles hierarchically, starting from small halos at high redshift (bottom edge) with the largest progenitor traced along the left edge of the merger tree. The growth of stellar mass of the largest progenitor is shown in the right panel. This example (jagged line) is close to the median (dashed line) of comparable mass objects (Rodriguez-Gomez et al. 2016), and within the range of the scatter (the shaded band shows the 16th – 84th percentiles). A monolithic model that forms at z_f = 10 and evolves with an exponentially declining star formation rate with τ = 1 Gyr (purple line) is shown for comparison. The latter model forms most of its stars earlier than occurs in the simulation.

For comparison, we also show the stellar mass growth of a monolithic model for a giant elliptical galaxy. This is the classic picture we had for such galaxies before we realized that galaxy formation had to be hierarchical. This particular monolithic model forms at z_f = 10 and follows an exponential star formation rate with τ = 1 Gyr. It is one of the models published by Franck & McGaugh (2017). It is, in fact, the first model I asked Jay to construct when he started the project. Not because we expected it to best describe the data, as it turns out to do, but because the simple exponential model is a touchstone of stellar population modeling. It was a starter model: do this basic thing first to make sure you’re doing it right. We chose τ = 1 Gyr because that was the typical number bandied about for elliptical galaxies, and z_f = 10 because that seemed ridiculously early for a massive galaxy to form. At the time we built the model, it was ludicrously early to imagine a massive galaxy would form, from an LCDM perspective. A formation redshift z_f = 10 was, less than a decade ago, practically indistinguishable from the beginning of time, so we expected it to provide a limit that the data would not possibly approach.

In a remarkably short period, JWST has transformed z = 10 from inconceivable to run of the mill. I’m not going to go into the data yet – this all-theory post is already a lot – but to offer one spoiler: the data are consistent with this monolithic model. If we want to “fix” LCDM, we have to make the red line into the purple line for enough objects to explain the data. That proves to be challenging. But that’s moving the goalposts; the prediction was that we should see little protogalaxies at high redshift, not massive, monolith-style objects. Just look at the merger trees at z = 10!

Accelerated Structure Formation in MOND

In order to address these issues in MOND, we have to go back to the beginning. What is the evolution of a spherical region (a top-hat overdensity) that might collapse to form a galaxy? How does a spherical region under the influence of MOND evolve within an expanding universe?

The solution to this problem was first found by Felten (1984), who was trying to play the Newtonian cosmology trick in MOND. In conventional dynamics, one can solve the equation of motion for a point on the surface of a uniform sphere that is initially expanding and recover the essence of the Friedmann equation. It was reasonable to check if cosmology might be that simple in MOND. It was not. The appearance of a_0 as a physical scale makes the solution scale-dependent: there is no general solution that one can imagine applies to the universe as a whole.

Felten reasonably saw this as a failure. There were, however, some appealing aspects of his solution. For one, there was no such thing as a critical density. All MOND universes would eventually recollapse irrespective of their density (in the absence of the repulsion provided by a cosmological constant). It could take a very long time, which depended on the density, but the ultimate fate was always the same. There was no special value of Ω, and hence no flatness problem. The latter obsessed people at the time, so I’m somewhat surprised that no one seems to have made this connection. Too soon*, I guess.

There it sat for many years, an obscure solution for an obscure theory to which no one gave credence. When I became interested in the problem a decade later, I started methodically checking all the classic results. I was surprised to find how many things we needed dark matter to explain were just as well (or better) explained by MOND. My exact quote was “surprised the bejeepers out of us.” So, what about galaxy formation?

I started with the top-hat overdensity, and had the epiphany that Felten had already obtained the solution. He had been trying to solve all of cosmology, which didn’t work. But he had solved the evolution of a spherical region that starts out expanding with the rest of the universe but subsequently collapses under the influence of MOND. The overdensity didn’t need to be large, it just needed to be in the low acceleration regime. Something like the red cycloidal line in the second plot above could happen in a finite time. But how long?

The solution depends on scale and needs to be solved numerically. I am not the greatest programmer, and I had a lot else on my plate at the time. I was in no rush, as I figured I was the only one working on it. This is usually a good assumption with MOND, but not in this case. Bob Sanders had had the same epiphany around the same time, which I discovered when I received his manuscript to referee. So all credit is due to Bob: he said these things first.

First, he noted that galaxy formation in MOND is still hierarchical. Small things form first. Crudely speaking, structure formation is very similar to the conventional case, but now the goose comes from the change in the force law rather than extra dark mass. MOND is nonlinear, so the whole process gets accelerated. To compare with the linear growth of CDM:

A sketch of how structures grow over time under the influence of cold dark matter (left, from Schramm 1992, same as above) and MOND (right, from Sanders & McGaugh 2002; see also this further discussion and previous post). The slow linear growth of CDM (long-dashed line, left panel) is replaced by a rapid, nonlinear growth in MOND (solid lines at right; numbers correspond to different scales). Nonlinear growth moderates after cosmic expansion begins to accelerate (dashed vertical line in right panel).

The net effect is the same. A cosmic web of large scale structure emerges. They look qualitatively similar, but everything happens faster in MOND. This is why observations have persistently revealed structures that are more massive and were in place earlier than expected in contemporaneous LCDM models.

Simulated structure formation in ΛCDM (top) and MOND (bottom) showing the more rapid emergence of similar structures in MOND (note the redshift of each panel). From McGaugh (2015).

In MOND, small objects like globular clusters form first, but galaxies of a range of masses all collapse on a relatively short cosmic timescale. How short? Let’s consider our typical 10^11 M☉ galaxy. Solving Felten’s equation for the evolution of a sphere numerically, peak expansion is reached after 300 Myr and collapse happens in a similar time. The whole galaxy is in place speedy quick, and the initial conditions don’t really matter: a uniform, initially expanding sphere in the low acceleration regime will behave this way. From our distant vantage point thirteen billion years later, the whole process looks almost monolithic (the purple line above) even though it is a chaotic hierarchical mess for the first few hundred million years (z > 14). In particular, it is easy to form half of the stellar mass early on: the mass is already assembled.

The evolution of a 10^11 M☉ sphere that starts out expanding with the universe but decouples and collapses under the influence of MOND (dotted line). It reaches maximum expansion after 300 Myr and recollapses in a similar time, so the entire object is in place after 600 Myr. (A version of this plot with a logarithmic time axis appears as Fig. 2 in our paper.) The inset shows the evolution of smaller shells within such an object (Fig. 2 from Sanders 2008). The inner regions collapse first followed by outer shells. These oscillate and cross, mixing and ultimately forming a reasonable size galaxy – see Sanders’s Table 1 and also his Fig. 4 for the collapse times for objects of other masses. These early results are corroborated by Eappen et al. (2022), who further demonstrate that the details of feedback are not important in MOND, unlike LCDM.
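For the curious, here is a toy version of this calculation – my own sketch, not the code behind the figure. It integrates the equation of motion for the surface of the sphere with the MOND acceleration, starting from Hubble-flow initial conditions at decoupling in a baryon-only universe. The starting epoch, the neglect of radiation, and the “simple” interpolation function are my assumptions; with them, turnaround and recollapse each take a few hundred Myr, in the ballpark of the figure.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Units: kpc, Myr, Msun.
G  = 4.50e-12   # kpc^3 / (Msun Myr^2)
a0 = 3.87e-3    # kpc / Myr^2  (Milgrom's constant, 1.2e-10 m/s^2)
H0 = 7.47e-5    # 1/Myr  (73 km/s/Mpc)
Ob, Ok = 0.04, 0.96   # baryons only; curvature fills the rest (no CDM, no Lambda)

M  = 1e11                          # Msun
zi = 1090                          # start at decoupling
ri = 1600.0 / (1 + zi)             # 1.6 comoving Mpc -> physical kpc
Hi = H0 * np.sqrt(Ob*(1+zi)**3 + Ok*(1+zi)**2)   # radiation neglected for simplicity
vi = Hi * ri                       # initially expanding with the Hubble flow

def g_mond(gN):
    # solve g*mu(g/a0) = gN for the "simple" function mu(x) = x/(1+x)
    return 0.5 * gN * (1.0 + np.sqrt(1.0 + 4.0 * a0 / gN))

def rhs(t, y):
    r, v = y
    return [v, -g_mond(G * M / r**2)]

hit_center = lambda t, y: y[0] - 0.01 * ri   # stop when (nearly) recollapsed
hit_center.terminal = True

sol = solve_ivp(rhs, [0.0, 2000.0], [ri, vi], events=hit_center,
                max_step=0.5, rtol=1e-8)
i_max = np.argmax(sol.y[0])
print(f"max radius {sol.y[0][i_max]:.0f} kpc at t = {sol.t[i_max]:.0f} Myr; "
      f"recollapse by t = {sol.t[-1]:.0f} Myr")
```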

This is what JWST sees: galaxies that are already massive when the universe is just half a billion years old. I’m sure I should say more but I’m exhausted now and you may be too, so I’m gonna stop here by noting that in 1998, when Bob Sanders predicted that “Objects of galaxy mass are the first virialized objects to form (by z=10),” the contemporaneous prediction of LCDM was that “present-day disc [galaxies] were assembled recently (at z<=1)” and “there is nothing above redshift 7.” One of these predictions has been realized. It is rare in science that such a clear a priori prediction comes true, let alone one that seemed so unreasonable at the time, and which took a quarter century to corroborate.


*I am not quite this old: I was still an undergraduate in 1984. I hadn’t even decided to be an astronomer at that point; I certainly hadn’t started following the literature. The first time I heard of MOND was in a graduate course taught by Doug Richstone in 1988. He only mentioned it in passing while talking about dark matter, writing the equation on the board and saying maybe it could be this. I recall staring at it for a long few seconds, then shaking my head and muttering “no way.” I then completely forgot about it, not thinking about it again until it came up in our data for low surface brightness galaxies. I expect most other professionals have the same initial reaction, which is fair. The test of character comes when it crops up in their data, as it is doing now for the high redshift galaxy community.

What if we never find dark matter?

Some people have asked me to comment on the Scientific American article What if We Never Find Dark Matter? by Slatyer & Tait. For the most part, I find it unobjectionable – from a certain point of view. It is revealing to examine this point of view, starting with the title, which frames the subject in a way that gives us permission to believe in dark matter while never finding it. This framing is profoundly unscientific, as it invites a form of magical thinking that could usher in a thousand years of dark epicycles (feedback being the modern epicycle) on top of the decades it has already sustained.

The article does recognize that a modification of gravity is at least a logical possibility. The mere mention of this is progress, if grudging and slow. They can’t bring themselves to name a specific theory: they never say MOND and only allude obliquely to a single relativistic theory as if saying its name out loud would bring a curse% upon their house.

Of course, they mention modified gravity merely to dismiss it:

A universe without dark matter would require striking modifications to the laws of gravity… [which] seems exceptionally difficult.

Yes it is. But it has also proven exceptionally difficult to detect dark matter. That hasn’t stopped people from making valiant efforts to do so. So the argument is that we should try really hard to accomplish the exceptionally difficult task of detecting dark matter, but we shouldn’t bother trying to modify gravity because doing so would be exceptionally difficult.

This speaks to motivations – is one idea better motivated? In the 1980s, cold dark matter was motivated by both astronomical observations and physical theory. Absent the radical thought of modifying gravity, we had a clear need for unseen mass. Some of that unseen mass could simply have been undetected normal matter, but most of it needed to be some form of non-baryonic dark matter that exceeded the baryon density allowed by Big Bang Nucleosynthesis and did not interact directly with photons. That meant entirely new physics from beyond the Standard Model of particle physics: no particle in the known stable of particles suffices. This new physics was seen as a good thing, because particle physicists already had the feeling that there should be something more than the Standard Model. There was a desire for Grand Unified Theories (GUTs) and supersymmetry (SUSY). SUSY naturally provides a home for particles that could be the dark matter, in particular the Weakly Interacting Massive Particles (WIMPs) that are the prime target for the vast majority of experiments that are working to achieve the exceptionally difficult task of detecting them. So there was a confluence of reasons from very different perspectives to make the search for WIMPs very well motivated.

That was then. Fast forward a few decades, and the search for WIMPs has failed. Repeatedly. Continuing to pursue it is an example of the sunk cost fallacy. We keep doing it because we’ve already done so much of it that surely we should keep going. So I feel the need to comment on this seemingly innocuous remark:

although many versions of supersymmetry predict WIMP dark matter, the converse isn’t true; WIMPs are viable dark matter candidates even in a universe without supersymmetry.

Strictly speaking, this is correct. It is also weak sauce. The neutrino is an example of a weakly interacting particle that has some mass. We know neutrinos exist, and they reside in the Standard Model – no need for supersymmetry. We also know that they cannot be the dark matter, so it would be disingenuous to conflate the two. Beyond that, it is possible to imagine a practically infinite variety of particles that are weakly interacting but not part of supersymmetry. That’s just throwing mud at the wall. SUSY WIMPs were extraordinarily well motivated, with the WIMP miracle being the beautiful argument that launched a thousand experiments. But lacking SUSY – which seems practically dead at this juncture – WIMPs as originally motivated are dead along with it. The motivation for more generic WIMPs is lacking, so the above statement is nothing more than an assertion that runs interference for the fact that we no longer have good reason to expect WIMPs at all.

There is also an element of disciplinary-centric thinking: if you’re a particle physicist, you can build a dark matter detector and maybe make a major discovery or at least get great gobs of grants in the effort to do so. If instead what is going on is really a modification of gravity, then your expertise is irrelevant and there is no reason to keep shoveling money into your field. Worse, a career spent at the bottom of a mine shaft working on dark matter detectors is a waste of effort. I can understand why people don’t want to hear that message, but that just brings us back to the sunk cost fallacy.

Speaking of money, I occasionally get scientists who come up to me Big Mad that grant money gets spent on MOND research, as that would be a waste of taxpayer money. I can assure them that no government dollars have been harmed in the pursuit of MOND research. Certainly not in the U.S., at any rate. But lots and lots of tax dollars have been burned in the search for dark matter, and the article we’re discussing advocates spending a whole lot more to search for dark matter candidates that are nowhere near as well motivated as WIMPs were. That’s why I keep asking: how do we know when to stop? I don’t expect other scientists to agree with my interpretation of the data, but I do expect them to have a criterion whereby they would accede that dark matter is incorrect. If we lack any notion of how we could figure out that we are wrong, then we’ve made the leap from science to religion. So far, such criteria are sadly lacking, and I see precious little evidence of people rising to the challenge. Indeed, I frequently get the opposite, as other scientists have frequently asserted to me that they would only consider MOND as a last resort. OK, when does that happen? There’s always another particle we can think up, so the answer seems to be “never.”

I wrote long ago that “After WIMPs, the next obvious candidate is axions.” Sure enough, this article spills a lot of ink discussing axions. Rather than dwell on this different doomed idea for dark matter, let’s take a gander at the remarkable art made to accompany the article, because we are visual animals and graphical representations are important.

Artwork by Olena Shmahalo that accompanies the article by Slatyer & Tait.

Where to start? Right in the center is a scroll of an old-timey star chart. On top of that are several depictions of what I guess are meant to be galaxies*. Around those is an ethereal dragon representing the unknown dark matter. The depiction of dark matter as an unfathomable monster is at once both spot on and weirdly anthropomorphic. Is this a fabled beast the adventurous hero is supposed to seek out and slay? or befriend? or maybe it is a tale in which he grows during the journey to realize he has been on the wrong path the whole time? I love the dragon as art, but as a representation of a scientific subject it imparts an aura of teleological biology to something that is literally out of this world, residing in a dark sector that is not part of our daily experience and may be entirely inaccessible to our terrestrial experimentation. Off the edge of the map and on into extra dimensions: here there be monsters.

The representations here are fantastic. There is the coffee mug and the candle to represent the hard work of those of us who burn the candle at both ends wrestling with the dark matter problem. There’s a magnifying glass to represent how hard the experimentalists have looked for the dark matter. Scattered around are various totems, like the Polaroid-style picture at right depicting the gravitational lensing around a black hole. This is cool, but has squat to do with the missing mass problem. It’s more a nod to General Relativity and the Faith we have therein, albeit in a regime many orders of magnitude removed from the one that concerns us here. On the left is an old newspaper article about WIMPs, complete with a sketch of a Feynman diagram that depicts how we might detect them. And at the top, peeking out of a book, as it were a thought made long ago now seeking new relevance, a note saying Axions!

I can save everyone a lot of time, effort, and expense. It ain’t WIMPs and it ain’t axions. Nor is the dark matter any of the plethora of other ideas illustrated in the eye-watering depiction of the landscape of particle possibilities in the article. These simply add mass while providing no explanation of the observed MOND phenomenology. This phenomenology is fundamental to the problem, so any approach that ignores it is doomed to failure. I’m happy to consider explanations based on dark matter, but these need to have a direct connection to baryons baked-in to be viable. None of the ideas they discuss meet this minimum criterion.

Of course it could be that MOND – either as modified gravity or modified inertia, an important possibility that usually gets overlooked – is essentially correct and that’s why it keeps having predictions come true. That’s what motivates considering it now: repeated and sustained predictive success, particularly for phenomena that dark matter does not provide a satisfactory explanation for.

Of course, this article advocating dark matter is at pains to dismiss modified gravity as a possibility:

The changes [of modified gravity] would have to mimic the effects of dark matter in astrophysical systems ranging from giant clusters of galaxies to the Milky Way’s smallest satellite galaxies. In other words, they would need to apply across an enormous range of scales in distance and time, without contradicting the host of other precise measurements we’ve gathered about how gravity works. The modifications would also need to explain why, if dark matter is just a modification to gravity—which is universally associated with all matter—not all galaxies and clusters appear to contain dark matter. Moreover, the most sophisticated attempts to formulate self-consistent theories of modified gravity to explain away dark matter end up invoking a type of dark matter anyway, to match the ripples we observe in the cosmic microwave background, leftover light from the big bang.

That’s a lot, so let’s break it down. First, that modified gravity “would have to mimic the effects of dark matter” gets it exactly backwards. It is dark matter that has to mimic the effects of MOND. That’s an easy call: dark matter plus baryons could combine in a large variety of ways that might bear no resemblance to MOND. Indeed, they should do that: the obvious prediction of LCDM-like theories is an exponential disk in an NFW halo. In contrast, there is one and only one thing that can happen in MOND since there is a single effective force law that connects the dynamics to the observed distribution of baryons. Galaxies didn’t have to do that, shouldn’t do that, but remarkably they do. The uniqueness of this relation poses a problem for dark matter that has been known since the previous century:

Reluctant conclusions from McGaugh & de Blok (1998). As we said at the time, “This result surprised the bejeepers out of us, too.”

This basic conclusion has not changed over the years, only gotten stronger. The equation coupling dark to luminous matter I wrote down in all generality in McGaugh (2004) and again in McGaugh et al. (2016). The latter paper is published in Physical Review Letters, arguably the most prominent physics journal, and is in the top percentile of citation rates, so it isn’t some minuscule detail buried in an obscure astronomical journal that might have eluded the attention of particle physicists. It is the implication that conclusion [1] could be correct that bounces off a protective shell of cognitive dissonance so hard that the necessary corollary [2] gets overlooked.

OK, that’s just the first sentence. Let’s carry on with “[the modification] would need to apply across an enormous range of scales in distance and time, without contradicting the host of other precise measurements we’ve gathered about how gravity works.” Well, duh. That’s the first thing I checked. Thoroughly and repeatedly. I’ve written many reviews on the subject. They’re either unaware of some well-established results, or choose to ignore them.

The reason MOND doesn’t contradict the host of other constraints about how gravity works is simple. It happens in the low acceleration regime, where the only test of gravity is provided by the data that evince the mass discrepancy. If we had posed galaxy observations as a test of GR, we would have concluded that it fails at low accelerations. Of course we didn’t do that; we observed galaxies because we were interested in how they worked, then inferred the need for dark matter when gravity as we currently know it failed to explain the data. Other tests, regardless how precise, are irrelevant if they probe accelerations higher than Milgrom’s constant (1.2 × 10^-10 m/s²).

Continuing on, there is the complaint that “modifications would also need to explain why… not all galaxies and clusters appear to contain dark matter.” Yep, you gotta explain all the data. That starts with the vast majority of the data that do follow the radial acceleration relation, which is not satisfactorily explained by dark matter. They skip+ past that part, preferring to ignore the forest in order to complain about a few outlying trees. There are some interesting cases, to be sure, but this complaint about objects lacking dark matter is misplaced for deeper reasons. It makes no sense in terms of dark matter that there are objects without dark matter. That shouldn’t happen in LCDM any more than in MOND$. One winds up invoking non-equilibrium effects, which we can do in MOND just as we do in dark matter. It is not satisfactory in either case, but it is weird to complain about it for one theory while not for the other. This line of argument is perilously close to the a priori fallacy.

The last line, “the most sophisticated attempts to formulate self-consistent theories of modified gravity to explain away dark matter end up invoking a type of dark matter anyway, to match the ripples we observe in the cosmic microwave background” actually has some merit. The theory they’re talking about is Aether-Scalar-Tensor (AeST) theory, which I guess earns the badge of “most sophisticated” because it fits the power spectrum of the cosmic microwave background (CMB).

I’ve discussed the CMB in detail before, so won’t belabor it here. I will note that the microwave background is only one piece of many lines of evidence, and the conclusion one reaches depends on how one chooses to weigh the various incommensurate evidence. That they choose to emphasize this one thing while entirely eliding the predictive successes of MOND is typical, but does not encourage me to take this as a serious argument, especially when I had more success predicting important aspects of the microwave background than did the entire community that persistently cites the microwave background to the exclusion of all else.

It is also a bit strange to complain that AeST “explain[s] away dark matter [but] end[s] up invoking a type of dark matter.” I think what they mean here is true at the level of quantum field theory where all particles are fields and all fields are particles, but beyond that, they aren’t the same thing at all. It is common for modified gravity theories to invoke scalar fields#, and this is an important degree of freedom that enables AeST to fit the CMB. TeVeS also added a scalar and tensor field, but could not fit the CMB, so this approach isn’t guaranteed to work. But are these a type of dark matter? Or are our ideas of dark matter mimicking a scalar field? It seems like this argument could cut either way, and we’re just granting dark matter priority as a concept because we thought of it first. I don’t think nature cares about the order of our thoughts.

None of this addresses the question of the year. Why does MOND get any predictions right? Just saying “dark matter does it” is not sufficient. Until scientists engage seriously with this question, they’re doomed to chasing phantoms that aren’t there to catch.


%From what I’ve seen, they’re probably right to fear the curses of their colleagues for such blasphemy. Very objective, very scientific.

*Galaxies are nature’s artwork; human imitations never seem adequate. These look more like fried eggs to me. On the whole, this art is exceptionally well informed by science, or at least by particle physics, but not so much by astronomy. And therein lies the greater problem: there is a whole field of physics devoted to dark matter that is entirely motivated by astronomical observations yet its practitioners are, by and large, remarkably ignorant of anything more than the most rudimentary aspects of the data that motivate their field’s existence.

+There seems to be a common misconception that anything we observe is automatically explained by dark matter. That’s only true at the level of inference: any excess gravity is attributable to unseen mass. That’s why a hypothesis is only as good as its prior; a mere inference isn’t science, you have to make a prediction. Once you do that, you find dark matter might do lots of things that are not at all like the MONDian phenomenology that we observe. While I would hope the need for predictions is obvious, many scientists seem to conflate observation with prediction – if we observe it, that’s what dark matter must predict!

$The discrepancy should only appear below the critical acceleration scale in MOND. So strictly speaking, MOND does predict that there should be objects without dark matter: systems that are high acceleration. The central regions of globular clusters and elliptical galaxies are such regions, and MOND fares well there. In contrast, it is rather hard to build a sensible dark matter model that is as baryon dominated as observed. So this is an example of MOND explaining the absence of dark matter better than dark matter theory. This is related to the observation that the apparent need for dark matter only appears at low accelerations, at a scale that dark matter knows nothing about.

#I, personally, am skeptical of this approach, as it seems too generic (let’s add some new freedom!) when it feels like we’re missing something fundamental, perhaps along the lines of Mach’s Principle. However, I also recognize that this is a feeling on my part; it is outside my training to have a meaningful opinion.

Progressive Approximations in Mass Modeling


I have said I wasn’t going to attempt to teach an entire graduate course on galaxy dynamics in this forum, and I’m not. But I can give some pointers for those who want to try it for themselves. It also provides some useful context for fans of Deur’s approach.

The go-to textbook for this topic is Galactic Dynamics by Binney & Tremaine. The first edition was published in 1987, conveniently when I switched to grad school in astronomy. It was already a deep and well-developed field at that time; this is a compendium of considerable scientific knowledge.

Fun story: a colleague in a joint physics & astronomy department once complained to me that she wanted to develop a course in galaxy dynamics, which is a staple of graduate programs in astronomy & astrophysics. However, there was a certain senior colleague who objected, saying that since it was astronomy, it couldn’t possibly be a rigorous subject worthy of a full-semester graduate course. This is a casual bias that astronomers often encounter when talking to physicists, many of whom have attitudes about the subject that were trapped in amber sometime in the Jurassic. I suggested that she walk into his office and drop a copy of Galactic Dynamics on his desk from on high, as (1) it would make a hefty impact, and (2) no one who so much as skims this book could persist in this toxic attitude.

She later reported that she had done this, and it had worked.

Galactic Dynamics is not a starter book. It is the textbook we use when teaching the graduate course that this is not. A useful how-to guide for the specific material I’ll discuss here is provided by Federico Lelli. In brief, to model the gravitational potential of an observed distribution of matter, we can make one of the following series of approximations:

This is a slide I sometimes use to introduce mass modeling in science talks as a reminder for expert audiences.

All science is an approximation at some level. The crudest approximation we can employ here is to imagine that all of the mass resides at a central point. In this limit, the potential is simply

V² = GM/R

where V is the orbital speed of a test particle on a circular orbit, G is Newton’s constant, M is the mass, and R is the distance from the point mass. Galaxies are not point masses, so this is a terrible approximation, as can be seen by the divergent V ~ R^(-1/2) behavior as R → 0 (the dotted line above).

The next bad approximation one can make is a spherical cow: assume the mass is distributed in a sphere that is projected as the image we see on the sky. This at least incorporates the fact that the mass is not all concentrated at a point, so

V² = GM(R)/R

acknowledges that the mass M is spread out as a function of radius. Since we cannot see dark matter, we almost always assume it to be a spherical cow.

For the luminous disk of a spiral galaxy, a common approximation is the so-called exponential disk:

Σ(R) = Σ_0 e^(−R/R_d)

where Σ_0 is the central surface density of stars and R_d is the scale length of the disk – the characteristic size over which the surface brightness declines exponentially. This can be integrated by parts to obtain an expression for the enclosed mass M(R) which I leave as an exercise for the eager reader. This provides a handy analytic formula, the rotation curve of which is illustrated above by the dashed line.
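For the impatient reader, the integral works out to

```latex
M(R) = 2\pi\Sigma_0 \int_0^R e^{-r/R_d}\, r\, dr
     = 2\pi\Sigma_0 R_d^2 \left[1 - \left(1+\frac{R}{R_d}\right)e^{-R/R_d}\right],
```

so the total mass of the disk is M_d = 2πΣ_0R_d².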

Spiral galaxies are fairly thin when seen edge-on, so the spherical cow is not a great approximation. In a classic paper, Freeman (1970) solved the Poisson equation for the case of a razor-thin exponential disk, where one meets modified Bessel functions of the first and second kind (denoted “ikik” above). These must be evaluated numerically, but one can make a tabulation for use with any choice of disk mass and scale length. Such a thin disk is illustrated by the grey line above for a choice of stellar mass and scale length appropriate to NGC 6946.
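In code, the razor-thin disk is nearly a one-liner once scipy supplies the Bessel functions. A sketch using the standard Freeman (1970) expression; the example numbers are illustrative, not a fit to NGC 6946:

```python
import numpy as np
from scipy.special import i0, i1, k0, k1

G = 4.301e-6  # kpc (km/s)^2 / Msun

def v_thin_disk(R, M_disk, Rd):
    """Freeman (1970): circular speed of a razor-thin exponential disk.
    R and Rd in kpc, M_disk in Msun; returns km/s."""
    Sigma0 = M_disk / (2 * np.pi * Rd**2)   # central surface density
    y = R / (2 * Rd)
    return np.sqrt(4 * np.pi * G * Sigma0 * Rd * y**2
                   * (i0(y) * k0(y) - i1(y) * k1(y)))

# illustrative disk: 5e10 Msun with a 3 kpc scale length
R = np.arange(0.5, 20.5, 0.5)
print(np.round(v_thin_disk(R, 5e10, 3.0), 1))  # peaks near R ~ 2.2 Rd
```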

The spiral galaxy NGC 6946, aka the fireworks galaxy.

Spiral galaxies are not razor thin of course. We only see a projected image on the sky, so for a galaxy like NGC 6946, we may have a good measurement of its azimuthally averaged light (and presumed stellar mass) distribution Σ(R) but we have no idea how thick it is. Here, we have to make an educated guess based on observations of edge-on galaxies. A ballpark average is R:z = 8:1, but some galaxies are thicker and others thinner, so this becomes an approximation with an associated uncertainty. This uncertainty cannot be unambiguously eliminated; it is one of the known unknowns that comprise the inevitable systematic errors in astronomy. Fortunately, allowing for a finite thickness only takes the harsh edge off of the thin disk case, and the assumption one chooses makes little difference to the result (compare the lines labeled thick and thin above).

The exponential disk formula Σ(R) is an azimuthal average over an image like that of NGC 6946. This approximation captures none of the spiral structure: it only tells us about the average rate at which the surface brightness falls off. It also imposes a smooth shape on that fall off that our eyes can see is not necessarily a great approximation. So the next level of approximation is to solve the Poisson equation numerically for the observed surface brightness profile, Σ(R), not just the exponential approximation thereto. This is the blue line in the bottom right graph above.

There are important differences between using the numerical solution for the observed light distribution and the exponential disk approximation. This has been known since the 1980s, but the analytic expression is so convenient that people need an occasional reminder not to trust it too much. Jerry Sellwood felt the need to provide this reminder in 1999:

Small apparent differences in the shape of the mass profile (left) correspond to pronounced differences in the rotation curve (right). I chose the example of NGC 6946 in part because the exponential approximation for it is pretty good. Nevertheless, the details matter, so the best practice is to build numerical mass models, as we did for SPARC.

Building numerical mass models is tractable for external galaxies, where we can see the entire light distribution. It is not possible for our own Milky Way, since we are located within it and cannot see it as a whole. Consequently, the vast majority of Milky Way models rely on the exponential approximation; so far as I’m aware, I’m the only one who has built a model that attempts to get beyond this.

Numerical mass models are still an approximation. We’re assuming that the gravitational potential is static and azimuthally symmetric. Taking the next step would require abandoning these assumptions to model the spiral arms. The Poisson equation can handle that, but it becomes dicey because the arms rotate with some pattern speed (generally unknown) and may grow or dissolve or reform on some unknown timescale. The potential at any given point is time variable even in equilibrium, so we need not just a numerical solution but a live numerical simulation to keep track of it. That can be done, but it has to be done on a case by case basis, and the answer will depend somewhat on additional assumptions that have to be introduced to run the simulation, like specifying a dark matter halo.

One can generalize further to consider the full 3D potential, e.g., to allow for asymmetry in the z-direction as well as in azimuth. One can further imagine non-equilibrium processes, such as external perturbations. There is good evidence that the Milky Way suffers both of these effects, the passage of the Large Magellanic Cloud being one obvious and apparently large perturbation. So we are in the awkward position that the Gaia data now oblige us to consider the entire run of possible effects through non-equilibrium processes in a mass distribution that is not completely symmetric in any of the three spatial dimensions, but for the main mass component we are stuck with the inadequate approximation of an exponential disk.

Geometry appears to play a crucial role in the approach of Deur to the acceleration discrepancy problem. The essential claim is that the discrepancy correlates with flattening: highly flattened systems like spirals evince the classic discrepancy while spherical systems like E0 galaxies show none. Big if true!

A useful plot appears on slide 44:

Some measure of the discrepancy as a function of apparent ellipticity.

This is the one example shown that goes into the plot of many determinations of the slope a on the following slide. It being the only one, it is the only thing I have to evaluate without chasing down every other case. Looking at this, I am not inclined to do so.

At first it looks persuasive: the best fit slope is clear. There is no reason why the discrepancy should depend on the projected ellipticity of a triaxial 3D blob of stars, so this must be telling us something important. I’d be on board with that if it were true, but I’ve seen too many non-correlations masquerading as correlations to believe this one. The fitted slope is strongly influenced by the one point at large ellipticity; absent that, a slope of zero works fine. Mostly what I see here is a lot of scatter, which is normal in extragalactic astronomy. Since there are only a few points at high and low ellipticity, we don’t know what would happen if we went out and got more data. But I bet that what would happen is that the high ellipticity points would wind up looking like those in the middle: a big blob of scatter, with no significant correlation.

I’d kinda like to be wrong about this one, so I won’t even get into the theory side, which I find sorta compelling but ultimately unpersuasive. Why are gravitons confined to a disk? What happens way far out? Surely the flatness of the disk at tens of kpc is not dictating the flatness at 1000 kpc.

Surely.

Why’d it have to be MOND?


I want to take another step back in perspective from the last post to say a few words about what the radial acceleration relation (RAR) means and what it doesn’t mean. Here it is again:

The Radial Acceleration Relation over many decades. The grey region is forbidden – there cannot be less acceleration than caused by the observed baryons. The entire region above the diagonal line (yellow) is accessible to dark matter models as the sum of baryons and however much dark matter the model prescribes. MOND is the blue line.
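For concreteness, the blue line is well described by the fitting function we published in McGaugh et al. (2016), with the fitted acceleration scale g† numerically indistinguishable from Milgrom’s a_0. A sketch of it:

```python
import numpy as np

gdagger = 1.2e-10  # m/s^2, the fitted acceleration scale (a.k.a. Milgrom's a0)

def g_obs(g_bar):
    """Radial acceleration relation fitting function (McGaugh et al. 2016):
    observed acceleration as a function of the Newtonian acceleration
    computed from the baryons alone."""
    return g_bar / (1.0 - np.exp(-np.sqrt(g_bar / gdagger)))

# limits: g_obs -> g_bar at high acceleration, sqrt(g_bar * gdagger) at low
for gb in [1e-8, 1e-10, 1e-12]:
    print(f"g_bar = {gb:.0e} -> g_obs = {g_obs(gb):.2e} m/s^2")
```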

This information was not available when the dark matter paradigm was developed. We observed excess motion, like flat rotation curves, and inferred the existence of extra mass. That was perfectly reasonable given the information available at the time. It is not now: we need to reassess as we learn more.

There is a clear organization to the data at both high and low acceleration. No objective observer with a well-developed physical intuition would look at this and think “dark matter.” The observed behavior does not follow from one force law plus some arbitrary amount of invisible mass. That could do literally anything in the yellow region above, and beyond the bounds of the plot, both upwards and to the left. Indeed, there is no obvious reason why the data don’t fall all over the place. One of the lingering, niggling concerns is the 5:1 ratio of dark matter:baryons – why is it in the same ballpark, when it could be pretty much anything? Why should the data organize in terms of acceleration? There is no reason for dark matter to do this.

Plausible dark matter models have been predicted to do a variety of things – things other than what we observe. The problem for dark matter is that real objects only occupy a tiny line through the vast region available to them in the plot above. This is a fine-tuning problem: why do the data reside only where they do when they could be all over the place? I recognized this as a problem for dark matter before I became aware$ of MOND. That it turns out that the data follow the line uniquely predicted* by MOND is just chef’s kiss: there is a fine-tuning problem for dark matter because MOND is the effective force law.

The argument against dark matter is that the data could reside anywhere in the yellow region above, but don’t. The argument against MOND is that a small portion of the data fall a little off the blue line. Arguing that such objects, be they clusters of galaxies or particular individual galaxies, falsify MOND while ignoring the fine-tuning problem faced by dark matter is a case of refusing to see the forest for a few outlying trees.%

So to return to the question posed in the title of this post, I don’t know why it had to be MOND. That’s just what we observe. Pretending dark matter does the same thing is a false presumption.


$I’d heard of MOND only vaguely, and, like most other scientists in the field, had paid it no mind until it reared its ugly head in my own data.

*I talk about MOND here because I believe in giving credit where credit is due. MOND predicted this; no other theory did so. Dark matter theories did not predict this. My dark matter-based galaxy formation theory did not predict this. Other dark matter-based galaxy formation theories (including simulations) continue to fail to explain this. Other hypotheses of modified gravity also did not predict what is observed. Who+ ordered this?

Modified Dynamics. Very dangerous. You go first.

Many people in the field hate MOND, often with an irrational intensity that has the texture of religion. It’s not as if I woke up one morning and decided to like MOND – sometimes I wish I had never heard of it – but disliking a theory doesn’t make it wrong, and ignoring it doesn’t make it go away. MOND and only MOND predicted the observed RAR a priori. So far, MOND and only MOND provides a satisfactory explanation thereof. We might not like it, but there it is in the data. We’re not going to progress until we get over our fear of MOND and cope with it. Imagining that it will somehow fall out of simulations with just the right baryonic feedback prescription is a form of magical thinking, not science.

MOND. Why’d it have to be MOND?

+Milgrom. Milgrom ordered this.


%I expect many cosmologists would argue the same in reverse for the cosmic microwave background (CMB) and other cosmological constraints. I have some sympathy for this. The fit to the power spectrum of the CMB seems too good to be an accident, and it points to the same parameters as other constraints. Well, mostly – the Hubble tension might be a clue that things could unravel, as if they haven’t already. The situation is not symmetric – where MOND predicted what we observe a priori with a minimum of assumptions, LCDM is an amalgam of one free parameter after another after another: dark matter and dark energy are, after all, auxiliary hypotheses we invented to save FLRW cosmology. When they don’t suffice, we invent more. Feedback is a single word that represents a whole Pandora’s box of extra degrees of freedom, and we can invent crazier things as needed. The result is a Frankenstein’s monster of a cosmology that we all agree is the same entity, but when we examine it closely the pieces don’t fit, and one cosmologist’s LCDM is not really the same as that of the next. They just seem to agree because they use the same words to mean somewhat different things. Simply agreeing that there has to be non-baryonic dark matter has not helped us conjure up detections of the dark matter particles in the laboratory, or given us the clairvoyance to explain# what MOND predicted a priori. So rather than agree that dark matter must exist because cosmology works so well, I think the appearance of working well is a chimera of many moving parts. Rather, cosmology, as we currently understand it, works if and only if non-baryonic dark matter exists in the right amount. That requires a laboratory detection to confirm.

#I have a disturbing lack of faith that a satisfactory explanation can be found.

The Radial Acceleration Relation starting from high accelerations

In the previous post, we discussed how lensing data extend the Radial Acceleration Relation (RAR) seen in galaxy kinematics to very low accelerations. Let’s zoom out now, and look at things at higher accelerations and from a historical perspective.

This all started with Kepler’s Laws of Planetary Motion, which are explained by Newton’s Universal Gravitation – the inverse square law gbar = GM/r² is exactly what is needed to explain the observed centripetal acceleration, gobs = V²/r. It also explains the surface gravity of the Earth. Indeed, it was the famous falling apple that is reputed to have given Newton the epiphany that the same force that made the apple fall to the ground also made the Moon circle the Earth and the planets revolve around the sun.

The inverse square law holds over more than six decades of observed acceleration in the solar system, from the one gee we feel here on the surface of the Earth to the outskirts patrolled by Neptune.

Planetary motion in the radial acceleration plane. The dotted line is Newton’s inverse square law of universal gravity.*

The inverse square force law is what it takes to make the planetary data line up. A different force law would give a line with a different slope in this plot. No force law at all would give chaos, with planets all over the place in this plot, if, say, the solar system were run by a series of deferents and epicycles as envisioned for Ptolemaic cosmologies. In such a system, there is no reason to expect the organization seen above. It would require considerable contrivance to make it so.
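This is easy to check at home. Here is a minimal sketch in Python (circular-orbit approximation, rounded ephemeris values; the numbers are illustrative, not a precision ephemeris):

```python
import numpy as np

G = 6.674e-11        # gravitational constant [m^3 kg^-1 s^-2]
M_sun = 1.989e30     # solar mass [kg]
AU = 1.496e11        # astronomical unit [m]
yr = 3.156e7         # year [s]

# Rounded semi-major axes [AU] and orbital periods [yr]
planets = {
    "Mercury": (0.387, 0.241), "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000), "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.86), "Saturn": (9.537, 29.45),
    "Uranus": (19.19, 84.02), "Neptune": (30.07, 164.8),
}

for name, (a, P) in planets.items():
    r = a * AU
    v = 2 * np.pi * r / (P * yr)   # mean orbital speed (circular approximation)
    g_obs = v**2 / r               # observed centripetal acceleration
    g_bar = G * M_sun / r**2       # Newtonian inverse-square prediction
    print(f"{name:8s} g_obs = {g_obs:.2e}  g_bar = {g_bar:.2e} m/s^2")
```

Earth comes out at about 6 × 10⁻³ m/s² and Neptune at about 6.6 × 10⁻⁶ m/s²; together with the one gee at the Earth’s surface, that spans the six decades noted above.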

Newtonian gravity and General Relativity are exquisitely well-tested in the solar system. There are also some very precise tests at higher accelerations that GR passes with flying colors. The story at lower accelerations is another matter. The most remote solar system probes we’ve launched are the Voyager and Pioneer missions. These probe down to ~10⁻⁶ m/s²; below that is uncharted territory.

The RAR extended from high solar system accelerations to the much lower accelerations typical of galaxies – note the change in scale. Some early rotation curves (of NGC 55, NGC 801, NGC 2403, NGC 2841, & UGC 2885) are shown as lines. These probed an entirely new regime of acceleration. The departure of these lines from the dotted line is the signature of flat rotation curves indicating the acceleration discrepancy/need for dark matter. This discrepancy was clear by the end of the 1970s, but the amplitude of the discrepancy then was modest.

Galaxies (and extragalactic data in general) probe an acceleration range that is unprecedented from the perspective of solar system tests. General Relativity has passed so many precise tests that the usual presumption is that it applies at all scales. But it is an assumption that it applies at scales where it hasn’t been tested. Galaxies and cosmology pose such a test. That we need to invoke dark matter to save the phenomenon would be interpreted as a failure if we had set out to test the theory rather than assume it applied.

It was clear from flat rotation curves that something extra was needed. However, when we invented the dark matter paradigm, it was not clear that the data were organized in terms of acceleration. As the data continued to improve, it became clear that the vast majority of galaxies adhered to a single, apparently universal+ radial acceleration relation. What had been a hint of systematic behavior in early data became clean and clear. The data did not exhibit the scatter that was expected from a sum of a baryonic disk and a non-baryonic dark matter halo – there is no reason that these two distinct components should sum to the single effective force law that is observed.

The RAR with modern data for both early (red triangles) and late (cyan circles) morphological types. The blue line is the prediction of MOND: there is a transition at an acceleration scale to a force law that is universal but no longer inverse-square.

The observed force law happened to already have a name: MOND. If it had been something else, then we could have claimed to discover something new. But instead we were obliged to admit that the unexpected thing we had found had in fact been predicted by Milgrom.

This predictive power now extends to much lower accelerations. Again, only MOND got this prediction right in advance.

The RAR as above, extended by weak gravitational lensing observations. These follow the prediction of MOND as far as they are credible.

The data could have done many different things here. It could have continued along the dotted line, in which case we’d have no need for dark matter or modified gravity. It could have scattered all over the place – this is the natural expectation of dark matter theories, as there is no reason to expect the gravitational potential of the dominant dark matter halo to be dictated by the distribution of baryons. Yet the data evince the exceptional degree of organization seen above.
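For concreteness, the blue line can be written down in one line of code. Here is a minimal sketch using the fitting function and acceleration scale (g† = 1.2 × 10⁻¹⁰ m/s²) from the 2016 RAR paper; it limits to Newton at high acceleration and to √(gbar·g†) at low acceleration:

```python
import numpy as np

g_dagger = 1.2e-10   # fitted acceleration scale [m/s^2]

def g_obs(g_bar):
    """RAR fitting function: Newtonian (g_obs -> g_bar) at high acceleration,
    MONDian (g_obs -> sqrt(g_bar * g_dagger)) at low acceleration."""
    return g_bar / (1.0 - np.exp(-np.sqrt(g_bar / g_dagger)))

for gb in np.logspace(-13, -8, 6):
    print(f"g_bar = {gb:.1e} -> g_obs = {g_obs(gb):.1e} m/s^2")
```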

It requires considerable contrivance to explain the RAR with dark matter. No viable explanation yet exists, despite many unconvincing claims to this effect. I have worked more on trying to explain this in terms of dark matter than I have on MOND, and all I can tell you is what doesn’t work. Every explanation I’ve seen so far is a special case of a model I had previously considered and rejected as obviously unworkable. At this point, I don’t see how dark matter can ever plausibly do what the data require.

I worry that dark matter has become an epicycle theory. We’re sure it is right, so whatever we observe, no matter how awkward or unexpected, must be what it does. But what if it is wrong, and it does not exist? How do we ever disabuse ourselves of the notion that there is invisible mass once we’ve convinced ourselves that there has to be?

Of course, MOND has its own problems. Clusters of galaxies are systems$ for which it persistently fails to explain the amplitude of the observed acceleration discrepancy. So let’s add those to the plot as well:

As above, with clusters of galaxies added (x: Sanders 2003; +: Li et al. 2023).

So: do clusters violate the RAR, or follow it? I’d say yes and yes – the offset, though modest in amplitude in this depiction, is statistically significant. But there is also a similar scaling with acceleration; only the amplitude is off. The former makes no sense in MOND; the latter makes no sense in terms of dark matter, which did not predict a RAR at all.

Clusters are the strongest evidence against MOND. Just being evidence against MOND doesn’t automatically make it evidence in favor of dark matter. I often pose myself the question: which theory requires me to disbelieve the least amount of data? When I first came to the problem, I was shocked to find that the answer was clearly MOND. Since then, it has gone back and forth, but rather than a clear answer emerging, what has happened is more a divergence of different lines of evidence: that which favors the standard cosmology is incommensurate with that which favors MOND. This leads to considerable cognitive dissonance.

One way to cope with cognitive dissonance is to engage with a problem from different perspectives. If I put on a MOND hat, I worry about the offset seen above for clusters. If I put on a dark matter hat, I worry about the same kind of offset for every system that is not a rich cluster of galaxies. Most critics of MOND seem unconcerned about this problem for dark matter, so how much should a critic of dark matter worry about it in MOND?


*For the hyper-pedantic: the eccentricity of each orbit causes the exact location of each planet in the first plot to oscillate up and down along the dotted line. The extent of this oscillation is smaller than the size of each symbol with the exception of Mercury, which has a relatively high eccentricity (but nowhere near enough to reach Venus).

+There are a few exceptions, of course – there are always exceptions in astronomy. The issue is whether these are physically meaningful, or the result of systematic uncertainties or non-equilibrium processes. The claimed discrepancies range from dubious to unconvincing to obviously wrong.

$I’ve heard some people criticize MOND because the centroid of the lensing signal does not peak around the gas in the Bullet cluster. This assumes that the gas represents the majority of the baryons. We know this is not the case, and that there is some missing mass in clusters. Whatever it is, it is clearly more centrally concentrated than the gas, so we don’t expect the lensing signal to peak where the gas is. All the Bullet cluster teaches us is that whatever this stuff is, it is collisionless. So this particular complaint is a logical fallacy of the red herring and/or straw man variety, born of not understanding MOND well enough to criticize it accurately. Why bother to do that when you come to the problem already sure that MOND is wrong? I understand this line of thought extraordinarily well, because that’s the attitude I started with, and I’ve seen it repeated by many colleagues. The difference is that I bothered to educate myself.

A personal note – I will be on vacation next week, so won’t be quick to respond to comments.

Clusters of galaxies ruin everything

A common refrain I hear is that MOND works well in galaxies, but not in clusters of galaxies. The oft-unspoken but absolutely intended implication is that we can therefore dismiss MOND and never speak of it again. That’s silly.

Even if MOND is wrong, that it works as well as it does is surely telling us something. I would like to know why that is. Perhaps it has something to do with the nature of dark matter, but we need to engage with it to make sense of it. We will never make progress if we ignore it.

Like the seventeenth century cleric Paul Gerhardt, I’m a stickler for intellectual honesty:

“When a man lies, he murders some part of the world.”

Paul Gerhardt

I would extend this to ignoring facts. One should not only be truthful, but also as complete as possible. It does not suffice to be truthful about things that support a particular position while eliding unpleasant or unpopular facts* that point in another direction. By ignoring the successes of MOND, we murder a part of the world.

Clusters of galaxies are problematic in different ways for different paradigms. Here I’ll recap three ways in which they point in different directions.

1. Cluster baryon fractions

An unpleasant fact for MOND is that it does not suffice to explain the mass discrepancy in clusters of galaxies. When we apply Milgrom’s formula to galaxies, it explains the discrepancy that is conventionally attributed to dark matter. When we apply MOND to clusters, it comes up short. This has been known for a long time; here is a figure from the review Sanders & McGaugh (2002):

Figure 10 from Sanders & McGaugh (2002): (Left) the Newtonian dynamical mass of clusters of galaxies within an observed cutoff radius (rout) vs. the total observable mass in 93 X-ray-emitting clusters of galaxies (White et al. 1997). The solid line corresponds to Mdyn = Mobs (no discrepancy). (Right) the MOND dynamical mass within rout vs. the total observable mass for the same X-ray-emitting clusters. From Sanders (1999).

The Newtonian dynamical mass exceeds what is seen in baryons (left). There is a missing mass problem in clusters. The inference is that the difference is made up by dark matter – presumably the same non-baryonic cold dark matter that we need in cosmology.

When we apply MOND, the data do not fall on the line of equality as they should (right panel). There is still excess mass. MOND suffers a missing baryon problem in clusters.
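To get a feel for the numbers, here is an order-of-magnitude sketch with a made-up but representative cluster (geometric factors of order unity ignored; the deep-MOND line is the isothermal estimate from Milgrom’s early work on the subject):

```python
G = 6.674e-11               # gravitational constant [m^3 kg^-1 s^-2]
a0 = 1.2e-10                # Milgrom's acceleration constant [m/s^2]
Mpc = 3.086e22              # [m]
Msun = 1.989e30             # [kg]

# Representative rich cluster: velocity dispersion ~1000 km/s out to ~1 Mpc
sigma = 1.0e6               # [m/s]
r_out = 1.0 * Mpc

M_newton = 2 * sigma**2 * r_out / G    # isothermal-sphere Newtonian estimate
M_mond = 9 * sigma**4 / (4 * G * a0)   # deep-MOND isothermal estimate

print(f"Newtonian dynamical mass: {M_newton / Msun:.1e} Msun")
print(f"MOND dynamical mass:      {M_mond / Msun:.1e} Msun")
```

MOND knocks the required mass down by a factor of a few, but that is still roughly twice the baryonic mass actually observed in rich clusters, which is the residual discrepancy in the right panel above.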

The common line of reasoning is that MOND still needs dark matter in clusters, so why consider it further? The whole point of MOND is to do away with the need of dark matter, so it is terrible if we need both! Why not just have dark matter?

This attitude was reinforced by the discovery of the Bullet Cluster. You can “see” the dark matter.

An artistic rendition of data for the Bullet Cluster. Pink represents hot X-ray emitting gas, blue the mass concentration inferred through gravitational lensing, and the optical image shows many galaxies. There are two clumps of galaxies that collided and passed through one another, getting ahead of the gas which shocked on impact and lags behind as a result. The gas of the smaller “bullet” subcluster shows a distinctive shock wave.

Of course, we can’t really see the dark matter. What we see is that the mass required by gravitational lensing observations exceeds what we see in normal matter: this is the same discrepancy that Zwicky first noticed in the 1930s. The important thing about the Bullet Cluster is that the mass is associated with the location of the galaxies, not with the gas.

The baryons that we know about in clusters are mostly in the gas, which outweighs the stars by roughly an order of magnitude. So we might expect, in a modified gravity theory like MOND, that the lensing signal would peak on the gas, not the stars. That would be true if the gas we see were indeed the majority of the baryons. We already knew from the first plot above that this is not the case.

I use the term missing baryons above intentionally. If one already believes in dark matter, then it is perfectly reasonable to infer that the unseen mass in clusters is the non-baryonic cold dark matter. But there is nothing about the data for clusters that requires this. There is also no reason to expect every baryon to be detected. So the unseen mass in clusters could just be ordinary matter that does not happen to be in a form we can readily detect.

I do not like the missing baryon hypothesis for clusters in MOND. I struggle to imagine how we could hide the required amount of baryonic mass, which is comparable to or exceeds the gas mass. But we know from the first figure that such a component is indicated. Indeed, the Bullet Cluster falls at the top end of the plots above, being one of the most massive objects known. From that perspective, it is perfectly ordinary: it shows the same discrepancy every other cluster shows. So the discovery of the Bullet was neither here nor there to me; it was just another example of the same problem. Indeed, it would have been weird if it hadn’t shown the same discrepancy that every other cluster showed. That it does so in a nifty visual is, well, nifty, but so what? I’m more concerned that the entire population of clusters shows a discrepancy than that this one nifty case does so.

The one new thing that the Bullet Cluster did teach us is that whatever the missing mass is, it is collisionless. The gas shocked when it collided, and lags behind the galaxies. Whatever the unseen mass is, it passed through unscathed, just like the galaxies. Anything with mass separated by lots of space will do that: stars, galaxies, cold dark matter particles, hard-to-see baryonic objects like brown dwarfs or black holes, or even massive [potentially sterile] neutrinos. All of those are logical possibilities, though none of them make a heck of a lot of sense.

As much as I dislike the possibility of unseen baryons, it is important to keep the history of the subject in mind. When Zwicky discovered the need for dark matter in clusters, the discrepancy was huge: a factor of a thousand. Some of that was due to having the distance scale wrong, but most of it was due to seeing only stars. It wasn’t until 40 some years later that we started to recognize that there was intracluster gas, and that it outweighed the stars. So for a long time, the mass ratio of dark to luminous mass was around 70:1 (using a modern distance scale), and we didn’t worry much about the absurd size of this number; mostly we just cited it as evidence that there had to be something massive and non-baryonic out there.

Really there were two missing mass problems in clusters: a baryonic missing mass problem, and a dynamical missing mass problem. Most of the baryons turned out to be in the form of intracluster gas, not stars. So the 70:1 ratio changed to 7:1. That’s a big change! It brings the ratio down from a silly number to something that is temptingly close to the universal baryon fraction of cosmology. Consequently, it becomes reasonable to believe that clusters are fair samples of the universe. All the baryons have been detected, and the remaining discrepancy is entirely due to non-baryonic cold dark matter.

That’s a relatively recent realization. For decades, we didn’t recognize that most of the normal matter in clusters was in an as-yet unseen form. There had been two distinct missing mass problems. Could it happen again? Have we really detected all the baryons, or are there still more lurking there to be discovered? I think it unlikely, but fifty years ago I would also have thought it unlikely that there would have been more mass in intracluster gas than in stars in galaxies. I was ten years old then, but it is clear from the literature that no one else was seriously worried about this at the time. Heck, when I first read Milgrom’s original paper on clusters, I thought he was engaging in wishful thinking to invoke the X-ray gas as possibly containing a lot of the mass. Turns out he was right; it just isn’t quite enough.

All that said, I nevertheless think the residual missing baryon problem MOND suffers in clusters is a serious one. I do not see a reasonable solution. Unfortunately, as I’ve discussed before, LCDM suffers an analogous missing baryon problem in galaxies, so pick your poison.

It is reasonable to imagine in LCDM that some of the missing baryons on galaxy scales are present in the form of warm/hot circum-galactic gas. We’ve been looking for that for a while, and have had some success – at least for bright galaxies where the discrepancy is modest. But the problem gets progressively worse for lower mass galaxies, so it is a bold presumption that the check-sum will work out. There is no indication (beyond faith) that it will, and the fact that it gets progressively worse for lower masses is a direct consequence of the data for galaxies looking like MOND rather than LCDM.

Consequently, both paradigms suffer a residual missing baryon problem. One is seen as fatal while the other is barely seen.

2. Cluster collision speeds

A novel thing the Bullet Cluster provides is a way to estimate the speed at which its subclusters collided. You can see the shock front in the X-ray gas in the picture above. The morphology of this feature is sensitive to the speed and other details of the collision. In order to reproduce it, the two subclusters had to collide head-on, in the plane of the sky (practically all the motion is transverse), and fast. I mean, really fast: nominally 4700 km/s. That is more than the virial speed of either cluster, and more than you would expect from dropping one object onto the other. How likely is this to happen?

There is now an enormous literature on this subject, which I won’t attempt to review. It was recognized early on that the high apparent collision speed was unlikely in LCDM. The chances of observing the Bullet Cluster even once in an LCDM universe range from merely unlikely (~10%) to completely absurd (< 3 × 10⁻⁹). Answers this varied follow from what aspects of both observation and theory are considered, and the annoying fact that the distribution of collision speed probabilities plummets like a stone, so that slightly different estimates of the “true” collision speed make a big difference to the inferred probability. The “true” gravitationally induced collision speed is somewhat uncertain because the hydrodynamics of the gas plays a role in shaping the shock morphology. There is a long debate about this which bores me; it boils down to it being easy to explain a few hundred extra km/s but hard to get up to the extra 1000 km/s that is needed.

At its simplest, we can imagine the two subclusters forming in the early universe, initially expanding apart along with the Hubble flow like everything else. At some point, their mutual attraction overcomes the expansion, and the two start to fall together. How fast can they get going in the time allotted?

The Bullet Cluster is one of the most massive systems in the universe, so there is lots of dark mass to accelerate the subclusters towards each other. The object is less massive in MOND, even spotting it some unseen baryons, but the long-range force is stronger. Which effect wins?

Gary Angus wrote a code to address this simple question both conventionally and in MOND. Turns out, the longer range force wins this race. MOND is good at making things go fast. While the collision speed of the Bullet Cluster is problematic for LCDM, it is rather natural in MOND. Here is a comparison:

A reasonable answer falls out of MOND with no fuss and no muss. There is room for some hydrodynamical+ hijinks, but it isn’t needed, and the amount that is reasonable makes an already reasonable result more reasonable, boosting the collision speed from the edge of the observed band to pretty much smack in the middle. This is the sort of thing that keeps me puzzled: much as I’d like to go with the flow and just accept that it has to be dark matter that’s correct, it seems like every time there is a big surprise in LCDM, MOND just does it. Why? This must be telling us something.
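To see why the long-range force wins, here is a toy version of the timing argument. This is emphatically not Gary’s code, just a heavily simplified sketch with made-up masses: pure radial infall from rest, no expansion, no hydrodynamics:

```python
import numpy as np

G, a0 = 6.674e-11, 1.2e-10              # SI units; a0 is Milgrom's constant
Mpc, Msun, Gyr = 3.086e22, 1.989e30, 3.156e16

def infall(M, r0, t_max, mond=False, dt=1e13):
    """Drop a pair of point masses (combined mass M) from rest at separation r0
    and integrate for t_max; stop early if they 'collide' (separation < 0.25 Mpc).
    Returns (final separation [Mpc], relative speed [km/s])."""
    r, v, t = r0, 0.0, 0.0
    while t < t_max and r > 0.25 * Mpc:
        g = G * M / r**2                # Newtonian mutual attraction
        if mond and g < a0:
            g = np.sqrt(g * a0)         # toy MOND boost in the low-acceleration regime
        v += g * dt
        r -= v * dt
        t += dt
    return r / Mpc, v / 1e3

# Made-up illustrative numbers: LCDM gets ~10x more mass (the dark halo),
# MOND gets only the baryons but the stronger long-range force.
print("LCDM-ish:", infall(2e15 * Msun, 10 * Mpc, 10 * Gyr))
print("MOND-ish:", infall(2e14 * Msun, 10 * Mpc, 10 * Gyr, mond=True))
```

With these made-up numbers, the MOND pair collides within the allotted time at several thousand km/s, while the Newtonian pair, despite its much larger mass, is still megaparsecs apart and moving at less than half that speed. The real calculation is more careful, but the qualitative lesson is the same: MOND is good at making things go fast.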

3. Cluster formation times

Structure is predicted to form earlier in MOND than in LCDM. This is true for both galaxies and clusters of galaxies. In his thesis, Jay Franck found lots of candidate clusters at redshifts higher than expected. Even groups of clusters:

Figure 7 from Franck & McGaugh (2016). A group of four protocluster candidates at z = 3.5 that are proximate in space. The left panel is the sky association of the candidates, while the right panel shows their galaxy distribution along the LOS. The ellipses/boxes show the search volume boundaries (Rsearch = 20 cMpc, Δz ± 20 cMpc). Three of these (CCPC-z34-005, CCPC-z34-006, CCPC-z35-003) exist in a chain along the LOS stretching ≤120 cMpc. This may become a supercluster-sized structure at z = 0.

The cluster candidates at high redshift that Jay found are more common in the real universe than seen with mock observations made using the same techniques within the Millennium simulation. Their velocity dispersions are also larger than those of comparable simulated objects. This implies that the amount of mass that has assembled is larger than expected at that time in LCDM, or that speeds are boosted by something like MOND, or that nothing has settled into anything like equilibrium yet. The last option seems most likely to me, but that doesn’t reconcile matters with LCDM, as we don’t see the same effect in the simulation.

MOND also predicts the early emergence of the cosmic web, which would explain the early appearance of very extended structures like the “big ring.” While some of these very large scale structures are probably not real, too many such things are being noted for all of them to be illusions. The knee-jerk denials of all such structures remind me of the shock cosmologists expressed at seeing quasars at redshifts as high as 4 (even 4.9! how can it be so?) or clusters at redshift 2, or the original CfA stickman, which surprised the bejeepers out of everybody in 1987. So many times I’ve been told that a thing can’t be true because it violates theoreticians’ preconceptions, only for it to prove true, and ultimately to be recast as something the theorists expected all along.

Well, which is it?

So, as the title says, clusters ruin everything. The residual missing baryon problem that MOND suffers in clusters is both pernicious and persistent. It isn’t the outright falsification that many people presume it to be, but it sure doesn’t sit right. On the other hand, both the collision speeds of clusters (there are more examples now than just the Bullet Cluster) and the early appearance of clusters at high redshift are considerably more natural in MOND than in LCDM. So the data for clusters cut both ways. Taking the most obvious interpretation of the Bullet Cluster data, this one object falsifies both LCDM and MOND.

As always, the conclusion one draws depends on how one weighs the different lines of evidence. This is always an invitation to cognitive dissonance: accepting that which supports our pre-existing world view and rejecting the validity of evidence that calls it into question. That’s why we have the scientific method. It was application of the scientific method that caused me to change my mind: maybe I was wrong to be so sure of the existence of cold dark matter? Maybe I’m wrong now to take MOND seriously? That’s why I’ve set criteria by which I would change my mind. What are yours?


*In the discussion associated with a debate held at KITP in 2018, one particle physicist said “We should just stop talking about rotation curves.” Straight-up said it out loud! No notes, no irony, no recognition that the dark matter paradigm faces problems beyond rotation curves.

+There are now multiple examples of colliding cluster systems known. They’re a mess (Abell 520 is also called “the train wreck cluster”), so I won’t attempt to describe them all. In Angus & McGaugh (2008) we did note that MOND predicted that high collision speeds would be more frequent than in LCDM, and I have seen nothing to make me doubt that. Indeed, Xavier Hernandez pointed out to me that supersonic shocks like that of the Bullet Cluster are often observed, but basically never occur in cosmological simulations.

Discussion of Dark Matter and Modified Gravity

To start the new year, I provide a link to a discussion I had with Simon White on Phil Halper’s YouTube channel:

In this post I’ll say little that we don’t talk about, but will add some background and mildly amusing anecdotes. I’ll also try addressing the one point of factual disagreement. For the most part, Simon & I entirely agree about the relevant facts; what we’re discussing is the interpretation of those facts. It was a perfectly civil conversation, and I hope it can provide an example for how it is possible to have a positive discussion about a controversial topic+ without personal animus.

First, I’ll comment on the title, in particular the “vs.” This is not really Simon vs. me. This is a discussion between two scientists who are trying to understand how the universe works (no small ask!). We’ve been asked to advocate for different viewpoints, so one might call it “Dark Matter vs. MOND.” I expect Simon and I could swap sides and have an equally interesting discussion. One needs to be able to do that in order to not simply be a partisan hack. It’s not like MOND is my theory – I falsified my own hypothesis long ago, and got dragged reluctantly into this business for honestly reporting that Milgrom got right what I got wrong.

For those who don’t know, Simon White is one of the preeminent scholars working on cosmological computer simulations, having done important work on galaxy formation and structure formation, the baryon fraction in clusters, and the structure of dark matter halos (Simon is the W in NFW halos). He was a Reader at the Institute of Astronomy at the University of Cambridge where we overlapped (it was my first postdoc) before he moved on to become the director of the Max Planck Institute for Astrophysics where he was mentor to many people now working in the field.

That’s a very short summary of a long and distinguished career; Simon has done lots of other things. I highlight these works because they came up at some point in our discussion. Davis, Efstathiou, Frenk, & White are the “gang of four” that was mentioned; around Cambridge I also occasionally heard them referred to as the Cold Dark Mafia. The baryon fraction of clusters was one of the key observations that led from SCDM to LCDM.

The subject of galaxy formation runs throughout our discussion. It is always a fraught issue how things form in astronomy. It is one thing to understand how stars evolve, once made; making them in the first place is another matter. Hard as that is to do in simulations, galaxy formation involves the extra element of dark matter in an expanding universe. Understanding how galaxies come to be is essential to predicting anything about what they are now, at least in the context of LCDM*. Both Simon and I have worked on this subject our entire careers, in very much the same framework if from different perspectives – by which I mean he is a theorist who does some observational work while I’m an observer who does some theory, not LCDM vs. MOND.

When Simon moved to Max Planck, the center of galaxy formation work moved as well – it seemed like he took half of Cambridge astronomy with him. This included my then-office mate, Houjun Mo. At one point I refer to the paper Mo & I wrote on the clustering of low surface brightness galaxies and how I expected them to reside in late-forming dark matter halos**. I often cite Mo, Mao, & White as a touchstone of galaxy formation theory in LCDM; they subsequently wrote an entire textbook about it. (I was already warning them then that I didn’t think their explanations of the Tully-Fisher relation were viable, at least not when combined with the effect we have subsequently named the diversity of rotation curve shapes.)

When I first began to worry that we were barking up the wrong tree with dark matter, I asked myself what could falsify it. It was hard to come up with good answers, and I worried it wasn’t falsifiable. So I started asking other people what would falsify cold dark matter. Most did not answer. They often had a shocked look like they’d never thought about it, and would rather not***. It’s a bind: no one wants it to be false, but most everyone accepts that for it to qualify as physical science it should be falsifiable. So it was a question that always provoked a record-scratch moment in which most scientists simply freeze up.

Simon was one of the first to give a straight answer to this question without hesitation, circa 1999. At that point it was clear that dark matter halos formed central density cusps in simulations; so those “cusps had to exist” in the centers of galaxies. At that point, we believed that to mean all galaxies. The question was complicated by the large dynamical contribution of stars in high surface brightness galaxies, but low surface brightness galaxies were dark matter dominated down to small radii. So we thought these were the ideal place to test the cusp hypothesis.

We no longer believe that. After many attempts at evasion, cold dark matter failed this test; feedback was invoked, and the goalposts started to move. There is now a consensus among simulators that feedback in intermediate mass galaxies can alter the inner mass distribution of dark matter halos. Exactly how this happens depends on who you ask, but it is at least possible to explain the absence of the predicted cusps. This goes in the right direction to explain some data, but by itself does not suffice to address the thornier question of why the distribution of baryons is predictive of the kinematics even when the mass is dominated by dark matter. This is why the discussion focused on the lowest mass galaxies where there hasn’t been enough star formation to drive the feedback necessary to alter cusps. Some of these galaxies can be described as having cusps, but probably not all. Thinking only in those terms elides the fact that MOND has a better record of predictive success. I want to know why this happens; it must surely be telling us something important about how the universe works.

The one point of factual disagreement we encountered had to do with the mass profile of galaxies at large radii as traced by gravitational lensing. It is always necessary to agree on the facts before debating their interpretation, so we didn’t press this far. Afterwards, Simon sent a citation to what he was talking about: this paper by Wang et al. (2016). In particular, look at their Fig. 4:

Fig. 4 of Wang et al. (2016). The excess surface density inferred from gravitational lensing for galaxies in different mass bins (data points) compared to mock observations of the same quantity made from within a simulation (lines). Looks like excellent agreement.

This plot quantifies the mass distribution around isolated galaxies to very large scales. There is good agreement between the lensing observations and the mock observations made within a simulation. Indeed, one can see an initial downward bend corresponding to the outer part of an NFW halo (the “one-halo term”), then an inflection to different behavior due to the presence of surrounding dark matter halos (the “two-halo term”). This is what Simon was talking about when he said gravitational lensing was in good agreement with LCDM.

I was thinking of a different, closely related result. I had in mind the work of Brouwer et al. (2021), which I discussed previously. Very recently, Dr. Tobias Mistele has made a revised analysis of these data. That’s worthy of its own post, so I’ll leave out the details, which can be found in this preprint. The bottom line is in Fig. 2, which shows the radial acceleration relation derived from gravitational lensing around isolated galaxies:

The radial acceleration relation from weak gravitational lensing (colored points) extending existing kinematic data (grey points) to lower acceleration corresponding to very large radii (~ 1 Mpc). The dashed line is the prediction of MOND. Looks like excellent agreement.

This plot quantifies the radial acceleration due to the gravitational potential of isolated galaxies down to very low accelerations. There is good agreement between the lensing observations and the extrapolation of the radial acceleration relation predicted by MOND. There are no features until extremely low acceleration, where there may be a hint of the external field effect. This is what I was talking about when I said gravitational lensing was in good agreement with MOND, and that the data indicated a single halo with an r⁻² density profile that extends far out where we ought to see the r⁻³ behavior of NFW.
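For readers who don’t have these profiles memorized, here is a quick sketch of the distinction. NFW rolls from ρ ∝ r⁻¹ in the center to r⁻³ far outside the scale radius, while an isothermal halo (ρ ∝ r⁻², i.e., a flat rotation curve) holds the same slope everywhere:

```python
import numpy as np

def rho_nfw(r, r_s=1.0, rho_s=1.0):
    """NFW density profile in units of the scale radius r_s."""
    x = r / r_s
    return rho_s / (x * (1.0 + x)**2)

# Logarithmic slope d(log rho)/d(log r) at several radii
for x in [0.01, 0.1, 1.0, 10.0, 100.0]:
    r1, r2 = 0.99 * x, 1.01 * x
    slope = np.log(rho_nfw(r2) / rho_nfw(r1)) / np.log(r2 / r1)
    print(f"r/r_s = {x:6.2f} -> slope = {slope:+.2f}")
# An isothermal halo has slope -2 everywhere; NFW only passes through -2
# near r_s on its way from -1 to -3.
```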

The two plots above use the same method applied to the same kind of data. They should be consistent, yet they seem to tell a different story. This is the point of factual disagreement Simon and I had, so we let it be. No point in arguing about the interpretation when you can’t agree on the facts.

I do not know why these results differ, and I’m not going to attempt to solve it here. I suspect it has something to do with sample selection. Both studies rely on isolated galaxies, but how do we define that? How well do we achieve the goal of identifying isolated galaxies? No galaxy is an island; at some level, there is always a neighbor. But is it massive enough to perturb the lensing signal, or can we successfully define samples of galaxies that are effectively isolated, so that we’re only looking at the gravitational potential of that galaxy and not that of it plus some neighbors? Looks like there is some work left to do to sort this out.

Stepping back from that, we agreed on pretty much everything else. MOND as a fundamental theory remains incomplete. LCDM requires us to believe that 95% of the mass-energy content of the universe is something unknown and perhaps unknowable. Dark matter has become familiar as a term but remains a mystery so long as it goes undetected in the laboratory. Perhaps it exists and cannot be detected – this is a logical possibility – but that would be the least satisfactory result possible: we might as well resume counting angels on the head of a pin.

The community has been working on these issues for a long time. I have been working on this for a long time. It is a big problem. There is lots left to do.


+I get a lot of kill-the-messenger from people who are not capable of discussing controversial topics without personal animus. A lot comes, inevitably, from people who assume they know more about the subject than I do but actually know much less. It is really amazing how many scientists equate me as a person with MOND as a theory without bothering to do any fact-checking. This is logical fallacy 101.

*The predictions of MOND are insensitive to the details of galaxy formation. Though that is of course an interesting question, we don’t need to answer it in order to make predictions. All we need is the mass distribution that the kinematics respond to – we don’t need to know how it got that way. This is like the solar system, where it suffices to know Newton’s laws to compute orbits; we don’t need to know how the sun and planets formed. In contrast, one needs to know how a galaxy was assembled in LCDM to have any hope of predicting what its distribution of dark matter is and then using that to predict kinematics.

**The ideas Mo & I discussed thirty years ago have reappeared in the literature under the designation “assembly bias.”

***It was often accompanied by “why would you even ask that?” followed by a pained, constipated expression when they realized that every physical theory has to answer that question.

Holiday Concordance

Screw the Earth and its smoking habit. The end of 2023 approaches, so let’s talk about the whole universe, which is its own special kind of mess.

As I’ve related before, our current cosmology, LCDM, was established over the course of the 1990s through a steady drip, drip, drip of results in observational cosmology – what Peebles calls the classic cosmological tests. There were many contributory results; I’m not going to attempt to go through them all. Important among them were the age problem, the realization that the mass density was lower than expected, and that there was more structure on large scales+ than predicted. These established LCDM in the mid-1990s as the “concordance model” – the most probable flavor of FLRW universe. Here is the key figure from Ostriker & Steinhardt depicting the then-allowed region of the density parameter and Hubble constant:

The addition of the cosmological constant to the standard model – replacing SCDM with LCDM – was a brain-wrenching ordeal. Lambda had long been anathema, and there was a region in which an open universe was possible, even reasonable (stripes over shade in the figure above). Moreover, this strange new LCDM made the seemingly inconceivable prediction that not only was the universe expanding [itself the older mind-bender brought to us by Hubble (and Slipher and Lemaître)], the expansion rate should be accelerating. This sounded like crazy talk at the time, so it was greeted with great rejoicing when corroborated by observations of Type Ia supernovae.

A further prediction that could distinguish LCDM from then-viable open models was the geometry of the universe. Open models have a negative curvature (Ωk < 0, in which initially parallel light beams diverge) while the geometry in LCDM should be uniquely flat (Ωk = 0, in which initially parallel light beams remain parallel forever). Uniqueness is important, as it makes for a strong prediction, such as the location of the first peak of the acoustic power spectrum of the cosmic microwave background. In LCDM, this location was predicted to be ℓ ≈ 200 with little flexibility. For viable open models, it was more like ℓ ≈ 800 with a great deal of flexibility. The interpretation of the supernova data relied heavily on the assumption of a flat geometry, so I recall breathing a sigh of relief* when ℓ ≈ 200 was clearly observed.

Where are we now? I decided to reconstruct the Ostriker & Steinhardt plot with modern data. Here it is, with the axes swapped for reasons unrelated to this post. Deal with it.

The concordance region (white space) in the mass density-expansion rate space where the allowed regions (colored bands) of many constraints intersect. Illustrated constraints include a direct measurement of the Hubble constant, the age of the universe, the cluster baryon fraction, and large scale structure. Also shown are the best-fit values from CMB fits labeled by their date of publication (WMAP in orange; Planck in yellow). These follow the green line of constant ΩmH0³; combinations of parameters along the line are tolerable but regions away from it are strongly excluded.

There is lots to be said here. First, note the scale. As the accuracy of the data has improved, it has become possible to zoom in. My version of the figure is a wee postage stamp on that of Ostriker & Steinhardt. Nevertheless, the concordance region is in pretty much the same spot. Not exactly, of course; the biggest thing that has changed is that the age constraint is now completely incompatible with an open universe, so I haven’t bothered depicting that case. Indeed, for the illustrated Hubble constant, the Hubble time (the age of a completely empty, “coasting” universe) is 13.4 Gyr. This is consistent with the illustrated age (13.80 ± 0.75 Gyr) only for Ωm ≈ 0, which is far off the left edge of the plot.
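As a sanity check, the Hubble time is just the inverse of the Hubble constant in consistent units:

```python
Mpc_km = 3.086e19                 # kilometers per megaparsec
Gyr_s = 3.156e16                  # seconds per gigayear
H0 = 73.0 / Mpc_km                # Hubble constant [1/s]
print(f"t_H = 1/H0 = {1.0 / H0 / Gyr_s:.1f} Gyr")   # -> 13.4 Gyr
```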

Second, the CMB best-fit values follow a line of constant ΩmH0³. This is a deep trench in χ² space. The region outside this trench is strongly excluded – it’s kinda the grand canyon of cosmology. Even a little off, and you’re standing on the rim looking a long way down, knowing that a much better fit is only a short step away. Once you’re in the valley of χ², one must hunt along its bottom to find the true minimum. In the mid-’00s, a decade after Ostriker & Steinhardt, the best fit fell smack in the middle of the concordance region defined by completely independent data. It was this additional concordance that impressed me most, more than the detailed CMB fits themselves. This convinced the vast majority of scientists practicing in the field that it had to be LCDM and could only be LCDM and nothing but LCDM.
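To make the trench concrete: the combination Ωm·h³ (with h = H0/100 km/s/Mpc) barely budges between fits even as H0 and Ωm individually drift. The values below are approximate published central values quoted from memory, so treat them as illustrative rather than authoritative:

```python
# Approximate best-fit (h, Omega_m) pairs, from memory; illustrative only
fits = {
    "WMAP9  (2013)": (0.693, 0.287),
    "Planck (2013)": (0.673, 0.315),
    "Planck (2018)": (0.674, 0.315),
}
for name, (h, Om) in fits.items():
    print(f"{name}: Omega_m * h^3 = {Om * h**3:.4f}")
# All land near 0.096 even though H0 and Omega_m individually wander.
```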

Since that time, the best-fit CMB value has wandered down the trench, away from the concordance region. These are the results that changed, not everything else. This temporal variation suggests a systematic in the interpretation of the CMB data rather than in the local distance scale.

I recall being at a conference (the Bright & Dark Universe in Naples in 2017) when the latest Planck results were announced. There was a palpable sense in the audience of having been whacked by a blunt object, like walking into a closed door you thought was open. We’d been doing precision cosmology for a long time and had settled on an answer informed by lots of independent lines of evidence, but they were telling us the One True answer was off over there. Not crazy far, but not consistent with the concordance we had come to expect. Worse, they had these crazy tiny error bars – not only were they getting an answer outside the concordance region, it was in tension with pretty much everything else. Not strong tension, but enough to make us all uncomfortable if not outright object. Indeed, there was a definite vibe that people were afraid to object. Not terrified, but nervous. Worried about being on the wrong side of the community. I get it. I know a lot about that.

People are remarkably talented at refashioning the past. Over the past five years, the Planck best-fit parameters have come to be synonymous with LCDM: all else is moot. Young scientists can be forgiven for not realizing it was ever otherwise, just as they might have been taught that cosmic acceleration was discovered by the supernova experiments totally out of the blue. These are convenient oversimplifications that elide so many pertinent events as to be tantamount to gaslighting. We refashion the past until there was never a serious controversy, then it seems strange that some of us think there still is. Sorry, not so fast, there definitely is: if you use the Planck value of the Hubble constant to estimate distances to local galaxies, you will get it wrong%, along with all distance-dependent quantities.

I’m old enough to remember a time when there was a factor of two uncertainty in the Hubble constant (50 vs. 100) and the age constraint was the most accurate one in this plot. Thanks to genuine progress, the Hubble constant is now the more precise. Consequently, of all the data one could plot above, this is the choice that matters most to where the concordance region falls. If I adopt our own estimate (H0 = 75.1 ± 2.3 km/s/Mpc), then the concordance band gets wider and slides up a little but is basically the same as above. If instead I adopt the lowest highly accurate value, H0 = 69.8 ± 0.8 km/s/Mpc, the window slides down, but not enough to be consistent with the Planck results. Indeed, it stays to the left of the CMB constraint, becoming inconsistent with the mass density as well as the expansion rate.

Dang it, now I want to make that plot. Processing… OK, here it is:

As above, but with a lower measurement of H0. Only the range of statistical uncertainty is illustrated, as a systematic uncertainty corresponds to a calibration error that slides H0 up and down – i.e., the exact situation being illustrated relative to the figure above. These two plots illustrate the range of outcomes that are possible from slightly discordant modern direct measurements of the Hubble constant; it is hard to go lower. Doing so doesn’t really help, as it would just shift the tension from H0 to Ωm.

Yes, as I expected: the allowed range slides down but remains to the left of the green line. It is less inconsistent with the Planck H0, but that isn’t the only thing that matters. It is also inconsistent with the matter density. Indeed, it misses the CMB-allowed trench entirely. There is no allowed FLRW universe here.

These are only two parameters. Though arguably the most important, there are others, all of which matter to CMB fits. These are difficult to visualize simultaneously. We could, for starters, plot the baryon density as a third axis. If we did so, the concordance region would become a 3D object. It would also get squeezed, depending on what we think the baryon density actually is. Even restricting ourselves to the above-plotted constraints, there is some tension between the cluster baryon fraction and large scale structure constraint along the new third axis. I’m sure I could find in the literature more or less consistent values; this way the madness of cherry-picking lies.

There are many other constraints that could be added here. I’ve tried to stay consistent with the spirit of the original plot without making it illegible by overburdening it with lots and lots of data that all say pretty much the same thing. Nor do I wish to engage in cherry-picking. There are so many results out there that I’m sure one could find some combination that slides the allowed box this way or that – but only a little.

Whenever I’ve taught cosmology, I’ve made it a class exercise$ to investigate diagrams like this, with each student choosing an observational constraint to explore and champion. As a result, I’ve seen many variations on the above plots over the years, but since I first taught it in 1999 they’ve always been consistent with pretty much the same concordance region. It often happens that there is no concordance region; there are so many constraints that when you put them all together, nothing is left. We then debate which results to believe, or not, a process that has always been a part of the practice of cosmology.

We have painted ourselves into a corner. The usual interpretation is that we have painted ourselves into the correct corner: we live in this strange LCDM universe. It is also possible that there really is nothing left, the concordance window is closed, and we’ve falsified FLRW cosmology. That is a fate most fear to contemplate, and it seems less likely than mistakes in some discordant results, so we inevitably go down the path of cognitive dissonance, giving more credence to results that are consistent with our favorite set of LCDM parameters and less to those that do not. This is widely done without contemplating the possibility that the weird FLRW parameters we’ve ended up with are weird because they are just an approximation to some deeper theory.

So, as 2023 winds to an end, we [still] know pretty well what the parameters of cosmology are. While the tension between H0 = 67 and 73 km/s/Mpc is real, it seems like small beans compared to the successful isolation of a narrow concordance window. Sure beats arguing between 50 and 100! Even deciding which concordance window is right seems like a small matter compared to the deeper issues raised by LCDM: what is the cold dark matter? Does it really exist, or is it just a mythical entity we’ve invented for the convenient calculation of cosmic quantities? What the heck do we even mean by Lambda? Does the whole picture hang together so well that it must be correct? Or can it be falsified? Has it already been? How do we decide?

I’m sure we’ll be arguing over these questions for a long time to come.


+Structure formation is often depicted as a great success of cosmology, but it was the failure of the previous standard model, SCDM, to predict enough structure on large scales that led to its demise and its replacement by LCDM, which now faces a similar problem. The observer’s experience has consistently been that there is more structure in place earlier and on larger scales than had been anticipated before its observation.

*I believe in giving theories credit where credit is due. Putting on a cosmologist’s hat, the location of the first peak was a great success of LCDM. It was the amplitude of the second peak that came as a great surprise – unless you can take off the cosmology hat and don a MOND hat – then it was predicted. What is surprising from that perspective is the amplitude of the third peak, which makes more sense in LCDM. It seems impossible to some people that I can wear both hats without my head exploding, so they seem to simply assume I don’t think about it from their perspective when in reality it is the other way around.

%As adjudicated by galaxies with distances known from direct measurements provided by Cepheids or the tip of the red giant branch or surface brightness fluctuations or geometric methods, etc., etc., etc.

$This is a great exercise, but only works if CMB results are excluded. There has to be some narrative suspense: will the various disparate lines of evidence indeed line up? Since CMB fits constrain all parameters simultaneously, and brook no dissent, they suck the joy away from everything else in the sky and drain all interest in the debate.

Full speed in reverse!

People have been asking me about comments in a recent video by Sabine Hossenfelder. I have not watched it, but the quote I’m asked about is “the higher the uncertainty of the data, the better MOND seems to work,” with the implication that this might mean that MOND is a systematic artifact of data interpretation. I believe, because they consulted me about it, that this claim emerged from recent work by Sabine’s student Maria Khelashvili on fitting the SPARC data.

Let me address the point about data interpretation first. Fitting the SPARC data had exactly nothing to do with attracting my attention to MOND. Detailed MOND fits to these data are not particularly important in the overall scheme of things, as I’ll discuss in excruciating detail below. Indeed, these data didn’t even exist until relatively recently.

It may, at this juncture in time, surprise some readers to learn that I was once a strong advocate for cold dark matter. I was, like many of its current advocates, rather derisive of alternatives, the most prominent at the time being baryonic dark matter. What attracted my attention to MOND was that it made a priori predictions that were corroborated, quite unexpectedly, in my data for low surface brightness galaxies. These results were surprising in terms of dark matter then and to this day remain difficult to understand. After a lot of struggle to save dark matter, I realized that the best we could hope to do with dark matter was to contrive a model that reproduced after the fact what MOND had predicted a priori. That can never be satisfactory.

So – I changed my mind. I admitted that I had been wrong to be so completely sure that the solution to the missing mass problem had to be some new form of non-baryonic dark matter. It was not easy to accept this possibility. It took a long time and tremendous effort to admit that Milgrom had got right something that the rest of us had got wrong. But he had – his predictions came true, so what was I supposed to say? That he was wrong?

Perhaps I am wrong to take MOND seriously? I would love to be able to honestly say it is wrong so I can stop having this argument over and over. I’ve stipulated the conditions whereby I would change my mind to again believe that dark matter is indeed the better option. These conditions have not been met. Few dark matter advocates have answered the challenge to stipulate what could change their minds.

People seem to have become obsessed with making fits to data. That’s great, but it is not fundamental. Making a priori predictions is fundamental, and has nothing to do with fitting data. By construction, the prediction comes before the data. Perhaps this is one way to distinguish between incremental and revolutionary science. Fitting data is incremental science that seeks the best version of an accepted paradigm. Successful predictions are the hallmark of revolutionary science that make one take notice and say, hey, maybe something entirely different is going on.

One of the predictions of MOND is that the RAR should exist. It was not expected in dark matter. As a quick review of the history, here is the RAR as it was known in 2004 and as of 2016:

The radial acceleration relation constructed from data available in 2004 and that from 2016.

The big improvement provided by SPARC was a uniform estimate of the stellar mass surface density of galaxies based on Spitzer near-infrared data. These are used to construct the x-axis: gbar is what Newton predicts for the observed mass distribution. SPARC was a vast improvement over the optical data we had previously, to the point that the intrinsic scatter is negligibly small: the observed scatter can be attributed to the various uncertainties and the expected scatter in stellar mass-to-light ratios. The latter never goes away, but did turn out to be at the low end of the range we expected. It could easily have looked worse, as it did in 2004, even if the underlying physical relation was perfect.
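For those who want the nuts and bolts, here is a minimal sketch of how each point on the plot is constructed from a SPARC-style mass model. The column names and fiducial mass-to-light ratios below are illustrative assumptions, not the actual SPARC pipeline:

```python
import numpy as np

ACC = 1.0e6 / 3.0857e19   # converts (km/s)^2 / kpc into m/s^2

def rar_point(R, Vobs, Vgas, Vdisk, Vbul, ml_disk=0.5, ml_bul=0.7):
    """One (gbar, gobs) point. R in kpc, velocities in km/s.
    Vdisk and Vbul are the Newtonian contributions for M/L = 1;
    the fiducial mass-to-light ratios here are illustrative assumptions."""
    vbar2 = Vgas * np.abs(Vgas) + ml_disk * Vdisk**2 + ml_bul * Vbul**2
    gbar = ACC * vbar2 / R      # what Newton predicts for the baryons (x-axis)
    gobs = ACC * Vobs**2 / R    # what the rotation curve measures (y-axis)
    return gbar, gobs
```

The signed Vgas term is a convention for the rare radii where the gas distribution produces a net outward force; everything else is just Newton plus an assumed mapping from light to mass.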

Negligibly small intrinsic scatter is the best one can hope to find. The issue now is the fit quality to individual galaxies (not just the group plot above). We already know MOND fits rotation curve data. The claim that appears in Dr. Hossenfelder’s video boils down to dark matter providing better fits. This would be important if it told us something about nature. It does not. All it teaches us about is the hazards of fitting data for which the errors are not well behaved.

While SPARC provides a robust estimate of gbar, gobs is based on a heterogeneous set of rotation curves drawn from a literature spanning decades. The error bars on these rotation curves have not been estimated in a uniform way, so we cannot blindly fit the data with our favorite software tool and expect that to teach us something about physical reality. I find myself having to say this to physicists over and over and over and over and over again: you cannot trust astronomical error bars to behave as Gaussian random variables the way one would like and expect in a controlled laboratory setting.

Astronomy is not conducted in a controlled laboratory. It is an observational science. We cannot put the entire universe in a box and control all the variables. We can hope to improve the data and approach this ideal, but right now we’re nowhere near it. These fitting analyses assume that we are.

Screw it. I really am sick of explaining this over and over, so I’m just going to cut & paste verbatim what I told Hossenfelder & Khelashvili by email when they asked. This is not the first time I’ve written an email like this, and I’m sure it won’t be the last.


Excruciating details: what I said to Hossenfelder & Khelashvili about the perils of rotation curve fitting on 22 September 2023, in response to their request for comments on the draft of the relevant paper:

First, the work of Desmond is a good place to look for an opinion independent of mine. 

Second, in my experience, the fit quality you find is what I’ve found before: DM halos with a constant density core consistently give the best fits in terms of chi^2, then MOND, then NFW. The success of cored DM halos happens because the cored halo is an extremely flexible fitting function: the core radius and core density can be traded off to fit any dog’s leg, and are highly degenerate with the stellar M*/L. NFW works less well because it has a less flexible shape. But both work because they have more parameters [than MOND].
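To make the flexibility comparison concrete, here is a quick sketch of the three fitting functions at issue (units of kpc, km/s, and Msun; the conventions are mine, not anyone’s pipeline):

```python
import numpy as np

G = 4.300917e-6      # Newton's constant in kpc (km/s)^2 / Msun
A0 = 3.7e3           # a0 = 1.2e-10 m/s^2 expressed in (km/s)^2 / kpc

def v_nfw(r, v200, c200, h0=0.073):
    """NFW circular speed; r in kpc, v200 in km/s, h0 in km/s/kpc.
    Uses V200 = 10 H0 R200 to set the virial radius."""
    r200 = v200 / (10.0 * h0)
    m = lambda y: np.log(1.0 + y) - y / (1.0 + y)   # enclosed-mass shape
    x = r / r200
    return v200 * np.sqrt(m(c200 * x) / (x * m(c200)))

def v_burkert(r, rho0, r0):
    """Burkert (cored) circular speed; rho0 in Msun/kpc^3, r0 in kpc.
    The core density and radius trade off against each other and M*/L."""
    x = r / r0
    mass = np.pi * rho0 * r0**3 * (np.log((1 + x)**2 * (1 + x**2)) - 2 * np.arctan(x))
    return np.sqrt(G * mass / r)

def v_mond(r, vbar):
    """MOND prediction from the baryons alone, via the 'simple' interpolation
    function; with a0 fixed, the only freedom is M*/L hiding inside vbar."""
    gbar = vbar**2 / r
    gobs = 0.5 * gbar + np.sqrt(0.25 * gbar**2 + gbar * A0)
    return np.sqrt(gobs * r)
```

A dark matter fit then adds the halo in quadrature, V^2 = Vbar^2 + Vhalo^2, so it brings M*/L plus two halo parameters to the table; the MOND curve has only M*/L.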

Third, statistics will not save us here. I once hoped that the BIC would sort this out, but having gone down that road, I believe the BIC does not penalize models sufficiently for adding free parameters. You allude to this at the end of section 3.2. When you go from MOND (which, with a0 fixed, has only one parameter, M*/L, to account for everything) to a dark matter halo (which has at a minimum 3 parameters: M*/L plus two to describe the halo), you gain an enormous amount of freedom – the volume of possible parameter space grows enormously. But the BIC just says if you had 20 degrees of freedom before, now you have 22. That does not remotely capture the amount of flexibility gained: some free parameters are more equal than others. MOND fits and DM halo fits are not the same beast; we can’t compare them this way any more than we can compare apples and snails.
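A toy example of how little the BIC charges for that freedom – the chi^2 values here are invented for illustration:

```python
import numpy as np

def bic(chi2, k, n):
    # For Gaussian errors, BIC = chi^2 + k*ln(n) up to a model-independent constant.
    return chi2 + k * np.log(n)

# Illustrative (made-up) numbers for one rotation curve with n = 25 points:
n = 25
print(bic(30.0, 1, n))   # MOND: M*/L only, a0 fixed
print(bic(22.0, 3, n))   # halo: M*/L plus two halo parameters
# The halo pays only 2*ln(25) ~ 6.4 for its two extra parameters, no matter
# how much larger the volume of parameter space they open up.
```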

Worse, to do this right requires that the uncertainties be real random errors. They are not. SPARC provides homogeneous mass models based on near-IR observations of the stellar mass distribution. Those should be OK to the extent that near-IR light == stellar mass. That is a decent mapping, but not perfect. Consequently, we expect the occasional galaxy to misbehave. UGC 128 is a case where the MOND fit was great with optical data, then became terrible with near-IR data. The absolute differences in the data are not great, but the difference in the formal chi^2 is. So is that a failure of the model, or of the data to represent what we want them to represent?

This happens all the time in astronomy. Here, we want to know the circular velocity of a test particle in the gravitational potential predicted by the baryonic mass distribution. We never measure either of those quantities. What we measure is (i) the stellar light distribution and (ii) the Doppler velocities of gas. We assume we can map stellar light to stellar mass and Doppler velocity to orbital speed, but no mass model is perfect, nor is any patch of observed gas guaranteed to be on a purely circular orbit. These are known unknowns: uncertainties that we know are real but cannot easily quantify. The assumptions that we have to make to do the analysis dominate over the random errors in many cases. We also assume that galaxies are in dynamical equilibrium, but 20% of spirals show gross side-to-side asymmetries, and at least 50% mild ones. So what is the circular motion in those cases? (F579-1 is a good example.)

While SPARC is homogeneous in its photometry, it is extremely heterogeneous in its rotation curve measurements. We’re working on fixing that, but it’ll take a while. Consequently, as you note, some galaxies have little constraining power while others appear to have lots. That’s because many of the rotation curve velocity uncertainties are either grossly over- or underestimated. To see this, plot the cumulative distribution of chi^2 for any of your models (or see the CDFs published by Li et al. 2018 for the RAR and Li et al. 2020 for dark matter halos of many flavors – so many, I can’t recall how many CDFs we published). Anyway, for a good model, the reduced chi^2 is always close to one, so the CDF should go up sharply and reach one quickly – there shouldn’t be many cases with very low chi^2 or very high chi^2. Unfortunately, rotation curve data do not do this for any type of model. There are always way too many cases with chi^2 << 1 and also too many with chi^2 >> 1. One might conclude that all models are unacceptable – or that the error bars are Messed Up. I think the second option is the case. If so, then this sort of analysis will always have the power to mislead.
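The check itself is easy to sketch. The file name and degrees of freedom below are placeholders for whatever your fits produce:

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Compare the empirical CDF of reduced chi^2 across the galaxy sample to what
# well-behaved Gaussian errors would produce. Inputs are hypothetical.
chi2nu = np.loadtxt("chi2nu_per_galaxy.txt")   # placeholder: one value per galaxy
dof = 20                                       # assume ~20 degrees of freedom each

x = np.sort(chi2nu)
ecdf = np.arange(1, len(x) + 1) / len(x)
expected = stats.chi2.cdf(x * dof, dof)        # CDF if the error bars were honest

plt.step(x, ecdf, where="post", label="observed")
plt.plot(x, expected, label="expected for Gaussian errors")
plt.xlabel(r"$\chi^2_\nu$"); plt.ylabel("CDF"); plt.legend(); plt.show()
```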

I insert Fig. 1 from Li et al. (2020) so you don’t have to go look it up. The CDF of a statistically good model would rise sharply, being an almost vertical line at chi^2 = 1. No model of any flavor does that. That’s in large part because the uncertainties on some rotation curves are too large, while those on others are too small. The greater flexibility of dark matter models makes them incrementally better than MOND for the cases with error bars that are too small – hence the corollary statement that “the higher the uncertainty of the data, the better MOND seems to work.” This happens because dark matter models are allowed to chase bogus outliers with tiny error bars in a way that MOND cannot. That doesn’t make dark matter better, it just makes it easier to fool.

A key thing to watch out for is the outsized effect of a few points with tiny error bars. Among galaxies with high chi^2, what often happens is that there is one point with a tiny error bar that does not agree with the rest of the data for any smoothly continuous rotation curve. Fitting programs penalize a model for missing this point by many sigma, so they will do anything they can to make it better. So if you let a0 vary with a flat prior, it will go to some very silly values in order to buy a tiny improvement in chi^2. Formally, that’s a better fit, so you say OK, a0 has to vary. But if you plot the fitted rotation curves with fixed and variable a0, you will be hard pressed to see the difference. Chi^2 is different, sure, but both will have chi^2 >> 1 – a lousy fit either way – and we haven’t really gained anything meaningful from the greater fitting freedom. Really it is just that one point that is Wrong even though it has a tiny error bar – which you can see relative to the other points, never mind the model. Dark matter halos have more flexibility from the beginning, so this is less obvious for them even though the same thing happens.
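A toy version of the problem, with invented numbers:

```python
import numpy as np

# Toy illustration: a flat 150 km/s rotation curve with honest 5 km/s error
# bars, plus one discrepant point that carries a 1 km/s error bar.
rng = np.random.default_rng(0)
v = 150.0 + rng.normal(0.0, 5.0, 20)
err = np.full(20, 5.0)
v[10], err[10] = 135.0, 1.0            # the "Wrong" point with a tiny error bar

model = np.full(20, 150.0)
chi2 = np.sum(((v - model) / err) ** 2)
print(chi2)                            # point 10 contributes ~225 all by itself
# A fitter will warp the whole model (or drive a0 to silly values) to shave
# sigmas off this one point, while the curve barely changes by eye.
```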

So that’s another big point – what is the prior for a dark matter halo? [Your] Table 1 allows V200 and C200 to be pretty much anything. So yes, you will find a fit from that range. For Burkert halos, there is no prior, since these do not emerge from any theory – they’re just a flexible French curve. For NFW halos, there is a prior from cosmology – see McGaugh et al. (2007) among a zillion other possible references, including Li et al. (2020). In any [L]CDM cosmology, the parameters V200 and C200 correlate – they are not independent. So a reasonable prior would be a Gaussian in log(C200) at a given V200 as specified by some simulation (Macciò et al.; see Li et al. 2020). Another prior is how V200 (or M200) relates to the observed baryonic mass (or stellar mass). This one is pretty dodgy. Originally, we expected a fixed ratio between baryonic and dark mass. So when I did this kind of analysis in the ’90s, I found NFW flunked hard compared to MOND. (I didn’t know about the BIC then.) Galaxy DM halos simply do not look like NFW halos that form in LCDM and host galaxies with a few percent of their mass in the luminous disk, even though this was the standard model for many years (Mo, Mao, & White 1998). If we drop the assumption that luminous galaxies are always a fixed fraction of their dark matter halos, then better fits can be obtained. I suspect your uniform-prior fits have halo masses all over the place; they probably don’t correlate well with the baryonic mass, nor are their C200 and V200 parameters likely to correlate as they are predicted to do. If you apply the expected mass-concentration and stellar mass-halo mass relations as priors, then NFW will come off worse in your analysis, because you’ve restricted the halos to where they ought to live.
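Schematically, such a prior looks like this in code. The slope, normalization, and scatter are placeholders of roughly the right order, not a specific calibration – swap in whichever simulation’s mass-concentration relation you trust:

```python
import numpy as np

G = 4.300917e-6   # kpc (km/s)^2 / Msun

def ln_prior_nfw(v200, c200, h0=0.073, slope=-0.094, norm=9.0, scatter=0.11):
    """Schematic LCDM prior: Gaussian in log10(c200) about an assumed
    mass-concentration relation. slope/norm/scatter are placeholders."""
    r200 = v200 / (10.0 * h0)                   # kpc, from V200 = 10 H0 R200
    m200 = 100.0 * h0**2 * r200**3 / G          # Msun, from V200^2 = G M200 / R200
    logc_mean = np.log10(norm) + slope * np.log10(m200 / 1e12)
    return -0.5 * ((np.log10(c200) - logc_mean) / scatter) ** 2
```

The point is not the particular numbers; it is that (V200, C200) pairs far from this locus should be charged for it, which a uniform prior never does.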

So, as you say – it all comes down to the prior.

Even applying a stellar mass-halo mass relation from abundance matching isn’t really independent information, though that’s the best you can hope to do. I was saying 20+ years ago that fixed mass ratios wouldn’t work, but nobody then wanted to abandon that obvious assumption. Since then, they’ve been forced to do so. There is no good physical reason for it (feedback is the deus ex machina of all problems in the field); what happened is that the data – including kinematic data (McGaugh et al. 2010) – forced us to drop the obvious assumption. So adopting a modern stellar mass-halo mass relation will give you a stronger prior than a uniform prior, but that choice has already been informed by the kinematic data that you’re trying to fit. How do we properly penalize the model for cheating about its “prior” by peeking at past data?

So, again – it all comes down to the prior. I think it would be important here to better constrain the priors on the DM halo fits. Li et al. (2020) discuss this. Even then we’re not done, because galaxy formation modifies the form of the halo function we’re fitting. They shouldn’t end up as NFW even if they start out that way – see Li et al. 2022a & b. Those papers consider the inevitable effects of adiabatic compression, but not of feedback. If feedback really has the effects on DM halos that are frequently advertised, then neither NFW nor Burkert are appropriate fitting functions – they’re not what LCDM+feedback predicts. Good luck extracting a legitimate prediction from simulations, though. So we’re stuck doing what you’re trying to do: adopt some functional form to represent the DM halo, and see what fits. What you’ve done here agrees with my experience: cored DM halos work best. But they don’t represent an LCDM prediction, or any other broader theory, so – so what?

Another detail to be wary of – the radial range over which the rotation curve data constrain the DM halo fit is often rather limited compared to the size of the halo. To complicate matters further, the inner regions are often star-dominated, so there is not much of a handle on DM where the data are best, beyond the fact that many galaxies prefer not to have a cusp since the stars already get the job done at small R. So one ends up with V_DM(R) constrained from 3% to 10% of the virial radius, or something like that. V200 and C200 are defined at the notional virial radius, so there are many combinations of these parameters that might adequately fit the observed range while being quite different elsewhere. Even worse, NFW halos are pretty self-similar – there are combinations of (C200, V200) that are highly degenerate, so you can’t really tell the difference between them even with excellent data – the confidence contours look like bananas in C200-V200 space, with low C/high V often being as good as high C/low V. Even even even worse is that the observed V_DM(R) is often approximately a straight line. Any function looks like a straight line if you stretch it out enough. Consequently, the fits to LSB galaxies often tend to absurdly low C200 and high V200: NFW never looks like a straight line, but it does if you blow it up enough. So one ends up inferring that the halo masses of tiny galaxies are nearly as big as those of huge galaxies, or more so! My favorite example was NGC 3109, a tiny dwarf on the edge of the Local Group. A straight NFW fit suggests that the halo of this one little galaxy weighs more than the entire Local Group – M31 + MW + everything else combined. This is the sort of absurd result that comes from fitting the NFW halo form to a limited radial range of data.
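The banana is easy to reproduce schematically, reusing the v_nfw sketch above (all numbers invented):

```python
import numpy as np

# Pseudo-data over the limited radial range that the data actually constrain.
r = np.linspace(3.0, 12.0, 10)                  # kpc
v_data = v_nfw(r, v200=120.0, c200=12.0)        # the "true" halo
err = 5.0                                       # km/s, a typical error bar

v_grid = np.linspace(80.0, 1200.0, 200)
c_grid = np.linspace(0.5, 30.0, 200)
chi2 = np.array([[np.sum(((v_data - v_nfw(r, v, c)) / err) ** 2)
                  for v in v_grid] for c in c_grid])
# Contouring chi2 over (v_grid, c_grid) shows an acceptable region that is
# not an ellipse around (120, 12) but a banana reaching toward low c200 and
# absurdly high v200, because very different NFW halos look nearly alike
# over 3-12 kpc.
```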

I don’t know that this helps you much, but you see a few of the concerns. 

How things go mostly right or badly wrong


People often ask me how “perfect” MOND has to be. The short answer is that it agrees with galaxy data as “perfectly” as we can perceive – i.e., the scatter in the credible data is accounted for entirely by known errors and the expected scatter in stellar mass-to-light ratios. Sometimes it nevertheless looks like it goes badly wrong. That’s often because we need to know both the mass distribution and the kinematics perfectly. Here I’ll use the Milky Way as an example of how easily things can look bad when they aren’t.

First, an update. I had hoped to stop talking about the Milky Way after the recent series of posts. But it is in the news, and there is always more to say. A new realization of the rotation curve from the Gaia DR3 data has appeared, so let’s look at all the DR3 data together:

Gaia DR3 realizations of the Milky Way rotation curve. The most recent version of these data from Poder et al (2023) are shown as blue squares over the range 5 < R < 13 kpc. Other Gaia DR3 realizations include Ou et al. (2023, green circles), Wang et al. (2023, magenta downward pointing triangles), and Zhou et al. (2023, purple triangles).

The new Gaia realization does not go very far out, and has larger uncertainties. That doesn’t mean it is worse; it may simply be more conservative in estimating uncertainties, not making a claim where the data don’t substantiate it. Neither does that mean the other realizations are wrong: differences like these are what happen between independent analyses. Indeed, all the independent realizations of the Gaia data are pretty consistent, despite the different stellar selection criteria and analysis techniques. This is especially true for R < 17 kpc, where there are lots of stars informing the measurements. Even beyond that, I would say they are consistent at the level we’d expect for astronomy.

Zooming out to compare with other results:

The Milky Way rotation curve. The model line from McGaugh (2018) is shown with data from various sources. The abscissa switches from linear to logarithmic at 10 kpc to wedge it all in. The location of the Large Magellanic Cloud at 50 kpc is noted. Gaia DR3 data (Poder et al., Ou et al., Wang et al., and Zhou et al.) are shown as in the plot above. The small black squares are the Gaia DR2 realization of Eilers et al. (2019) reanalyzed to include the effect of bumps and wiggles by McGaugh (2019). Non-Gaia data include blue horizontal branch stars (light blue squares) and red giants (red squares) in the stellar halo (Bird et al. 2022), globular clusters (Watkins et al. 2019, pink triangles), VVV stars (Portail et al. 2017, dark grey squares at R < 2.2 kpc), and terminal velocities (McClure-Griffiths & Dickey 2007, 2016, light grey points from 3 < R < 8 kpc). These terminal velocities are the only data that inform the model line; everything else follows.

Overall, I would say the data paint a pretty consistent picture. The biggest tension amongst the data illustrated here is between the outermost Gaia points around R = 25 kpc and the corresponding results from halo stars. One is consistent with the model line and the other is not. We shouldn’t allow the model to inform our interpretation; the important point is that the independent data disagree with each other. This happens all the time in astronomy. Sometimes it boils down to different assumptions; sometimes it is a real discrepancy. Either way, one has to learn* to cope.

The sharp-eyed will also notice an apparent tension between the DR2 data (black squares) and DR3 around 6 and 7 kpc. This is not real – it is an artifact of different treatments of the term in the Jeans equation for the logarithmic derivative of the density profile of the tracer particles. That’s a choice made in the analysis. The data are entirely consistent when treated consistently.
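For the record, the term at issue enters the axisymmetric Jeans equation schematically as follows (a sketch that neglects the tilt term; roughly the form these analyses use):

$$ v_c^2(R) \;\simeq\; \langle v_\phi^2 \rangle \;-\; \sigma_R^2 \left( 1 + \frac{\partial \ln \nu}{\partial \ln R} + \frac{\partial \ln \sigma_R^2}{\partial \ln R} \right), $$

where ν(R) is the density profile of the tracer stars. Different assumed ν(R) give different logarithmic derivatives, and hence slightly different inferred circular speeds at the same radius – exactly the sort of apparent offset seen here.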

Putting on an empiricist’s hat, I will say that the kink in the slope of the Gaia data around R = 18 kpc looks unnatural. That doesn’t happen in other galaxies. Rather than belabor the point further, I’ll simply say that this is how things mostly go right but also a little wrong. This is as good as we can hope for in [extra]galactic astronomy.

In contrast, it is easy to go very wrong. To give an example, here is a model of the Milky Way that was built to approximately match the rotation curve of Sofue (2020).


Fig. 1 from Dai et al. (2022). Note the logarithmic abscissa. Their caption: The rotation curve of the Milky Way. The data (solid dark circles with error bars) for r < 100 kpc come from [22], while for r > 100 kpc from [23]. The solid, dashed and dotted lines describe the contribution from the bulge, stellar disk and dark matter halo respectively, within a ΛCDM model of the galaxy. The dash-dot line is the total contribution of all three components. The parameters of each component are taken from [24]. For comparison, the Milky Way rotation curve from Gaia DR2 is shown in color. The red dots are data from [34], the blue upward-pointing triangles are from [35], while the cyan downward-pointing triangles are from [36].

This realization of the rotation curve is very different from that seen above. Note that the rotation curve (black points) is very different from that of Gaia (red points) over the same radial range. These independent data are inconsistent; at least one of them is wrong. The data extend to very large radii, encompassing not only the LMC but also Andromeda (780 kpc away). I am already concerned about the effects of the LMC at 50 kpc; Andromeda is twice the baryonic mass of the Milky Way so anything beyond 260 kpc is more Andromeda’s territory than ours – depending on which side we’re talking about. The uncertainties are so big out there they provide no constraining power anyway.

In terms of MOND-required perfection, things fall apart for the Dai model already at very small radii. Dai et al. (2022) chose to fit their bulge component to the high amplitude terminal velocities of Sofue. That’s a reasonable thing to do, if we think the terminal velocities represent circular motion. Because of the non-circular motions that sustain the Galactic bar, they almost certainly do not – that’s why I restricted use of terminal velocities to larger radii. We also know something about the light distribution:

The inner 3 kpc of the Milky Way. The circles are the terminal velocities of Sofue (2020); the squares are the equivalent circular velocity of the potential reconstructed from the kinematics of stars in the VVV survey (Portail et al. 2017). The line is the bulge-bar model of McGaugh (2008) based on the light distribution reported by Binney et al (1997).

This is essentially the same graph as I showed before, but showing only the Newtonian bulge-bar component, and on a logarithmic abscissa for comparison with the plot of Dai et al. The two bulge models are very different. That of Dai et al. is more massive and more compact, as required to match the terminal velocities. There may be galaxies out there that look like this, but the Milky Way is not one of them.

Indeed, Newton’s prediction for the rotation curve of the bulge-bar component – the line labeled bulge/bar based on what the Milky Way looks like – is in good agreement with the effective circular speed curve obtained from stellar data. It is not consistent with the terminal velocities. We could increase the amplitude of the Newtonian prediction by increasing the mass-to-light ratio of the stars (I have adopted the value I expect for stellar populations), but the shape would still be wrong. This does not come as a surprise to most Galactic astronomers, because we know there is a bar in the center of the Milky Way and we know that bars induce non-circular motions, so we do not expect the terminal velocities to be a fair tracer of the rotation curve in this region. That’s why Portail et al. had to go to great lengths in their analysis to reconstruct the equivalent circular velocity, as did I just to build the bulge-bar model.

The thing about predicting rotation curves from the observed mass, as MOND does, is that you have to get both the kinematic data and the mass distribution right. The velocity predicted at any radius depends on the mass enclosed by that radius. So if we get the bulge badly wrong, everything spirals down the drain from there.
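A minimal sketch of why, in the spherical toy case (units of kpc, km/s, and Msun; the “simple” interpolation function is an illustrative choice, and real fits use the full disk/bulge solution):

```python
import numpy as np

G, A0 = 4.300917e-6, 3.7e3   # G in kpc (km/s)^2/Msun; a0 in (km/s)^2/kpc

def v_mond_from_mass(r, m_enclosed):
    """MOND circular speed at radius r (kpc) given the baryonic mass
    enclosed within r (Msun), via the 'simple' interpolation function."""
    gbar = G * m_enclosed / r**2                              # Newtonian acceleration
    gobs = 0.5 * gbar + np.sqrt(0.25 * gbar**2 + gbar * A0)   # MOND boost
    return np.sqrt(gobs * r)

# If the bulge mass inside ~1 kpc is overestimated, gbar -- and hence the
# predicted velocity -- is wrong at every radius outside it too, since
# m_enclosed(r) carries the error outward.
```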

Dai et al. (2022) compare their model to the acceleration residuals predicted by MOND for their mass model. If all is well, the data should scatter around the constant line at zero in this graph:

Fig. 4 from Dai et al. (2022). Their caption: [The radial acceleration relation] recast as a comparison between the total acceleration, a, and the MOND prediction, aM, as a function of the acceleration due to baryons, aB. The solid horizontal line is a = aM. The circles and squares with error bars represent the Milky Way and M31 data, while the gray dots are from the EAGLE simulation of ΛCDM in [1]. For aB > 10^-10 m/s^2 any difference between a and aM is unclear. However, once aB drops well below 10^-11 m/s^2, the discrepancy emerges. The short-dashed line is the ΛCDM fitting curve of the MW. The dash-dot line is the ΛCDM fitting curve of M31. The mass range** of galaxies in EAGLE’s data is chosen to be between 5 × 10^10 M☉ and 5 × 10^11 M☉. For comparison, the Milky Way rotation curve from Gaia data release II is shown in color. The red dots are data from [34], the blue triangles are from [35], while the cyan down triangles are from [36]. While the EAGLE simulation does not match the data perfectly, these plots indicate that it is much easier to accommodate a systematic downward trend with the ΛCDM model than with MOND.

Things are not well.

The interpretation that is offered (right in the figure caption) is that MOND is wrong and the LCDM-based EAGLE simulation does a better, if not perfect, job of explaining things. We already know that’s not right. The alternate interpretation is that this is not a valid representation of the prediction of MOND, because their mass model does not follow from the observed distribution of light. They get neither the baryonic mass distribution and its predicted acceleration aB nor the total acceleration a right in the plot above.

In terms of dark matter, the model of Dai et al. may appear viable. In terms of MOND, it is way off, not just a little off. The residuals are only zero, as they should be, for a narrow range of accelerations, 2 to 3 × 10^-10 m/s^2. That’s more Newton than MOND, and appears to correspond to the limited range in radii over which their model matches the rotation curve data in their Fig. 1 (roughly 4 to 6 kpc). It doesn’t really fit the data elsewhere, and the restrictions on a MOND fit are considerably more stringent than on the sort of dark matter model they construct: there’s no reason to expect their model to behave like MOND in the first place.

And, hoo boy, does it ever not behave like MOND. Look at how far those red points – the Gaia DR2 data – deviate from zero in their Fig. 4. Those are the exact same data that agree well with the model line I show above – the data that were correctly predicted in advance. My model is a reasonable representation of the radial force predicted by MOND, with the model line in my plot being equivalent to the zero line in theirs.

This is how things can go badly wrong. To properly apply MOND, we need to measure both the kinematics and the baryonic mass distribution correctly. If we screw either up, as is easy to do in astronomy, then the result will look very wrong, even if it shouldn’t. Combine this with the eagerness many people have to dismiss MOND outright, and you wind up with lots of articles claiming that MOND is wrong – even when that’s not really the story the data tell. This happens over and over again, so the field remains stagnant.


*This is a large part of the cultural difference between physics and astronomy. Physicists are spoiled by laboratory experiments done in controlled conditions in which one can measure to the sixth decimal place. In contrast, astronomy is an observational rather than experimental science. We can’t put the universe in a box and control all the systematics – measuring most quantities to 1% is a tall order. Consequently, astronomers are used to being wrong. While I wouldn’t say that astronomers cope with it gracefully, they’re well aware that it happens, that it has happened a lot historically, and that it will continue to happen in the future. It is a risk we all take in trying to understand a universe so much vaster than ourselves. This makes astronomers rather more tolerant of surprising results – results where the first response is “that can’t be right!” but also informed by the experience that “we’ve been wrong before!” Physicists coming to the field generally lack this experience and take the error bars way too seriously. I notice this attitude creeping into the younger generation of astronomers: people who’ve received their data from distant observatories and performed CPU-intensive MCMC error analyses, and so want to believe them, but often lack the experience of dozens of nights spent at the observatory sweating a thousand ill-controlled but consequential details, like walking out to a beautiful sunrise decorated by wisps of cirrus clouds. When did those arrive?!?


**The data that define the radial acceleration relation come from galaxies spanning six decades in stellar mass, so this one-decade range from the simulations is tiny – it is literally comparing a factor of ten to a factor of a million. What happens outside the illustrated mass range? Are lower masses even resolved?