Non-equilibrium dynamics in galaxies that appear to lack dark matter: ultradiffuse galaxies

Previously, we discussed non-equilibrium dynamics in tidal dwarf galaxies (TDGs). These form in interactions between giant galaxies, events that are manifestly a departure from equilibrium – a circumstance that makes TDGs potentially a decisive test to distinguish between dark matter and MOND, and simultaneously precludes confident application of that test. There are other galaxies for which I suspect non-equilibrium dynamics may play a role, among them some (not all) of the so-called ultradiffuse galaxies (UDGs).

UDGs

The term UDG has been adopted for galaxies below a certain surface brightness threshold with a size (half-light radius) in excess of 1.5 kpc (van Dokkum et al. 2015). I find the stipulation about the size to be redundant, as surface brightness* is already a measure of diffuseness. But OK, whatever, these things are really spread out. That means they should be good tests of MOND like low surface brightness galaxies before them: their low stellar surface densities mean** that they should be in the regime of low acceleration and evince large mass discrepancies when isolated. It also makes them susceptible to the external field effect (EFE) in MOND when they are not isolated, and perhaps also to tidal disruption.

To give some context, here is a plot of the size-mass relation for Local Group dwarf spheroidals. Typically they have masses comparable to globular clusters, but much larger sizes – a few hundred parsecs instead of just a few. As with more massive galaxies, these pressure supported dwarfs are all over the place – at a given mass, some are large while others are relatively compact. All but the one most massive galaxy in this plot are in the MOND regime. For convenience, I’ll refer to the black points labelled with names as UDGs+.

The size (radius encompassing half of the total light) and stellar mass of Local Group dwarf spheroidals (green points selected by McGaugh et al. 2021 to be relatively safe from external perturbation) along with two more Local Group dwarfs that are subject to the EFE (Crater 2 and Antlia 2) and the two UDGs NGC 1052-DF2 and DF4. Dotted lines show loci of constant surface density. For reference, the solar neighborhood has ~40 M☉ pc⁻²; the centers of high surface brightness galaxies frequently exceed 1,000 M☉ pc⁻².

The UDGs are big and diffuse. This makes them susceptible to the EFE and tidal effects. The lower the density of a system, the easier it is for external systems to mess with it. The ultimate example is a system that gets so close to a dominant central mass that it is tidally disrupted. That can happen conventionally; the stronger effective force of MOND increases tidal effects. Indeed, there is only a fairly narrow regime between the isolated case and tidally-induced disequilibrium where the EFE modifies the internal dynamics in a quasi-static way.

The trouble is the s-word: static. In order to test theories, we assume that the dynamical systems we observe are in equilibrium. Though often a good assumption, it doesn’t always hold. If we forget we made the assumption, we might think we’ve falsified a theory when all we’ve done is discover a system that is out of equilibrium. The universe is a very dynamic place – the whole thing is expanding, after all – so we need to be wary of static thinking.

Equilibrium MOND formulae

That said, let’s indulge in some static thinking. An isolated, pressure supported galaxy in the MOND regime will have an equilibrium velocity dispersion
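σiso, given (in standard form) by

\[
\sigma_{\rm iso} = \left(\frac{4}{81}\, G\, M\, a_0\right)^{1/4} ,
\]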

where M is the mass (the stellar mass in the case of a gas-free dwarf spheroidal), G is Newton’s constant, and a0 is Milgrom’s acceleration constant. The number 4/81 is a geometrical factor that assumes we’re observing a spherical system with isotropic orbits, neither of which is guaranteed even in the equilibrium case, and deviations from this idealized situation are noticeable. Still, this is as simple as it gets: if you know the mass, you can predict the characteristic speed at which stars move. Mass is all that matters: we don’t care about the radius as we must with Newton (v² = GM/r); the only other quantities are constants of nature.

But what do we mean by isolated? In MOND, it is that the internal acceleration of the system, gin, exceeds that from external sources, gex: gin ≫ gex. For a pressure supported dwarf, gin ≈ 3σ²/r (so here the size of the dwarf does matter, as does the location of a star within it), while the external field from a giant host galaxy would be gex = Vf²/D where Vf is the flat rotation speed stipulated by the baryonic mass of the host and D is the distance from the host to the dwarf satellite. The distance is not a static quantity. As a dwarf orbits its host, D will vary by an amount that depends on the eccentricity of the orbit, and the external field will vary with it, so it is possible to have an orbit in which a dwarf satellite dips in and out of the EFE regime. Many Local Group dwarfs straddle the line gin ≈ gex, and it takes time to equilibrate, so static thinking can go awry.

It is possible to define a sample of Local Group dwarfs that have sufficiently high internal accelerations (but also in the MOND regime with gex ≪ gin ≪ a0) that we can pretend they are isolated, and the above equation applies. Such dwarfs should& fall on the BTFR, which they do:

The baryonic Tully-Fisher relation (BTFR) including pressure supported dwarfs (green points) with their measured velocity dispersions matched to the flat rotation speeds of rotationally supported galaxies (blue points) via the prescription of McGaugh et al. (2021). The large blue points are rotators in the Local Group (with Andromeda and the Milky Way up near the top); smaller points are spirals with direct distance measurements (Schombert et al. 2020). The Local Group dwarfs assessed to be safe from external perturbation are on the BTFR (for Vf = 2σ); Crater 2 and the UDGs near NGC 1052 are not.

In contrast, three of the four UDGs considered here do not fall on the BTFR. Should they?

Conventionally, in terms of dark matter, probably they should. There is no reason for them to deviate from whatever story we make up to explain the BTFR for everything else. That they do means we have to make up a separate story for them. I don’t want to go deeply into this here since the cold dark matter model doesn’t really explain the observed BTFR in the first place. But even accepting that it does so after invoking feedback (or whatever), does it tolerate deviants? In a broad sense, yes: since it doesn’t require the particular form of the BTFR that’s observed, it is no problem to deviate from it. In a more serious sense, no: if one comes up with a model that explains the small scatter of the BTFR, it is hard to make that same model defy said small scatter. I know, I’ve tried. Lots. One winds up with some form of special pleading in pretty much any flavor of dark matter theory on top of whatever special pleading we invoked to explain the BTFR in the first place. This is bad, but perhaps not as bad as it seems once one realizes that not everything has to be in equilibrium all the time.

In MOND, the BTFR is absolute – for isolated systems in equilibrium. In the EFE regime, galaxies can and should deviate from it even if they are in equilibrium. This always goes in the sense of having a lower characteristic velocity for a given mass, so below the line in the plot. To get above the line would require being out of equilibrium through some process that inflates velocities (if systematic errors are not to blame, which also sometimes happens).

The velocity dispersion in the EFE regime (gin ≪ gex ≪ a0) is slightly more complicated than this isolated case:
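schematically (with r the size of the system and an order-unity geometric factor, analogous to the 4/81 above, suppressed),

\[
\sigma_{\rm efe}^2 \approx \frac{G_{\rm eff}\, M}{r} .
\]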

This is just like Newton except the effective value of the gravitational constant is modified. It gets a boost^ by how far the system is in the MOND regime: Geff ≈ G(a0/gex). An easy way to tell which regime an object is in is to calculate both velocity dispersions σiso and σefe: the smaller one is the one that applies#. An upshot of this is that systems in the EFE regime should deviate from the BTFR to the low velocity side. The amplitude of the deviation depends on the system and the EFE: both the size and mass matter, as does gex. Indeed, if an object is on an eccentric orbit, then the velocity dispersion can vary with the EFE as the distance of the satellite from its host varies, so over time the object would trace out some variable path in the BTFR plane.

Three of the four UDGs fall off the BTFR, so that sounds mostly right, qualitatively. Is it? Yes, for Crater 2, but not really for the others. Even for Crater 2 it is only a partial answer, as non-equilibrium effects may play a role. This gets involved for Crater 2, then more so for the others, so let’s start with Crater 2.

Crater 2 – the velocity dispersion

The velocity dispersion of Crater 2 was correctly predicted a priori by the formula for σefe above. It is a tiny number, 2 km/s, and that’s what was subsequently observed. Crater 2 is very low mass, ~3 × 10⁵ M☉, which is barely a globular cluster, but it is even more spread out than the typical dwarf spheroidal, having an effective surface density of only ~0.05 M☉ pc⁻². If it were isolated, MOND predicts that it would have a higher velocity dispersion – all of 4 km/s. That’s what it would take to put it on the BTFR above. The seemingly modest difference between 2 and 4 km/s makes for a clear offset. But despite its substantial current distance from the Milky Way (~120 kpc), Crater 2 is so low surface density that it is still subject to the external field effect, which lowers its equilibrium velocity dispersion. Unlike isolated galaxies, it should be offset from the BTFR according to MOND.
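
To make the “compute both, keep the smaller” rule concrete, here is a minimal Python sketch with rough, illustrative inputs for a Crater 2-like dwarf. The half-light radius, the Milky Way’s Vf, and the order-unity geometric factors are assumptions of this sketch (they are not the values used in the published prediction), so it reproduces the published numbers only approximately.

```python
import math

# Constants (SI units)
G = 6.674e-11      # Newton's constant [m^3 kg^-1 s^-2]
a0 = 1.2e-10       # Milgrom's acceleration constant [m/s^2]
MSUN = 1.989e30    # solar mass [kg]
KPC = 3.086e19     # kiloparsec [m]

# Rough, illustrative inputs for a Crater 2-like dwarf (assumed for this sketch)
M = 3e5 * MSUN     # stellar mass
r = 1.1 * KPC      # characteristic (half-light-ish) radius
D = 120 * KPC      # distance from the Milky Way
Vf = 190e3         # Milky Way flat rotation speed [m/s]

# Isolated MOND prediction: sigma^4 = (4/81) G M a0
sigma_iso = (4.0 / 81.0 * G * M * a0) ** 0.25

# EFE regime: quasi-Newtonian with Geff ~ G a0/gex; the factor of 3 mirrors gin ~ 3 sigma^2 / r
g_ex = Vf**2 / D
G_eff = G * a0 / g_ex
sigma_efe = math.sqrt(G_eff * M / (3.0 * r))

# Consistency check on the regime: here g_in << g_ex, so the EFE should indeed apply
g_in = 3.0 * sigma_efe**2 / r

# The formula that reports the lower velocity dispersion is the one that applies
regime = "EFE" if sigma_efe < sigma_iso else "isolated"
print(f"g_in = {g_in:.1e} m/s^2, g_ex = {g_ex:.1e} m/s^2")
print(f"sigma_iso ~ {sigma_iso/1e3:.1f} km/s, sigma_efe ~ {sigma_efe/1e3:.1f} km/s ({regime} regime)")
```

With these inputs the sketch lands near the 4 km/s (isolated) and 2 km/s (EFE) values quoted above.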

LCDM struggles to explain the low mass end of the BTFR because it predicts a halo mass-circular speed relation Mhalo ~ Vhalo³ that differs from the observed Mb ~ Vf⁴. A couple of decades ago, it looked like massive galaxies might be consistent with the shallower power law, but that anticipates higher velocities for small systems. The low velocity dispersion of Crater 2 is thus doubly weird in LCDM. Its internal velocities are too small not just once – the BTFR is already lower than was expected – but twice, being below even that.

An object with a large radial extent like Crater 2 probes far out into its notional dark matter halo, making the nominal prediction$ of LCDM around 17 km/s, albeit with a huge expected scatter. Even if we can explain the low mass end of the BTFR and its unnaturally low scatter in LCDM, we now have to explain this exception to it – an exception that is natural in MOND, but is on the wrong side of the probability distribution for LCDM. That’s one of the troubles with tuning LCDM to mimic MOND: if you succeed in explaining the first thing, you still fail to anticipate the other. There is no EFE% in LCDM, no reason to anticipate that σefe applies rather than σiso, and no reason to expect via feedback that this distinction has anything to do with the dynamical accelerations gin and gex.

But wait – this is a post about non-equilibrium dynamics. That can happen in LCDM too. Indeed, one expects that satellite galaxies suffer tidal effects in the field of their giant host. The primary effect is that the dark matter subhalos in which dwarf satellites reside are stripped from the outside in. Their dark matter becomes part of the large halo of the host. But the stars are well-cocooned in the inner cusp of the NFW halo which is more robust than the outskirts of the subhalo, so the observable velocity dispersion barely evolves until most of the dark mass has been stripped away. Eventually, the stars too get stripped, forming tidal streams. Most of the damage occurs during pericenter passage when satellites are closest to their host. What’s left is no longer in equilibrium, with the details depending on the initial conditions of the dwarf on infall, the orbit, the number of pericenter passages, etc., etc.

What does not come out of this process is Crater 2 – at least not naturally. It has stars very far out – these should get stripped outright if the subhalo has been eviscerated to the point where its velocity dispersion is only 2 km/s. This tidal limitation has been noted by Errani et al.: “the large size of kinematically cold ‘feeble giant’ satellites like Crater 2 or Antlia 2 cannot be explained as due to tidal effects alone in the Lambda Cold Dark Matter scenario.” To save LCDM, we need something extra, some additional special pleading on top of non-equilibrium tidal effects, which is why I previously referred to Crater 2 as the Bullet Cluster of LCDM: an observation so problematic that it amounts to a falsification.

Crater 2 – the orbit

We held a workshop on dwarf galaxies on CWRU’s campus in 2017 where issues pertaining to both dark matter and MOND were discussed. The case of Crater 2 was one of the things discussed, and it was included in the list of further tests for both theories (see above links). Basically the expectation in LCDM is that most subhalo orbits are radial (highly eccentric), so that is likely to be the case for Crater 2. In contrast, the ultradiffuse blob that is Crater 2 would not survive a close passage by the Milky Way given the strong tidal force exerted by MOND, so the expectation was for a more tangential (quasi-circular) orbit that keeps it at a safe distance.

Subsequently, it became possible to constrain orbits with Gaia data. The exact orbit depends on the gravitational potential of the Milky Way, which isn’t perfectly known. However, several plausible choices of the global potential give an eccentricity around 0.6. That’s not exactly radial, but it’s pretty far from circular, placing the pericenter around 30 kpc. That’s much closer than its current distance, and well into the regime where it should be tidally disrupted in MOND. No way it survives such a close passage!

So which is it? MOND predicted the correct velocity dispersion, which LCDM struggles to explain. Yet the orbit is reasonable in LCDM, but incompatible with MOND.

Simulations of dwarf satellites

It occurs to me that we might be falling victim to static thinking somewhere. We talked about the impact of tides on dark matter halos a bit above. What should we expect in MOND?

The first numerical simulations of dwarf galaxies orbiting a giant host were conducted by Brada & Milgrom (2000). Their work is specific to the Aquadratic Lagrangian (AQUAL) theory proposed by Bekenstein & Milgrom (1984). This was the first demonstration that it was possible to write a version of MOND that conserved momentum and energy. Since then, a number of different approaches have been demonstrated. These can be subtly different, so it is challenging to know which (if any) is correct. Sorting that out is well beyond the scope of this post, so let’s stick to what we can learn from Brada & Milgrom.

Brada & Milgrom followed the evolution of low surface density dwarfs of a range of masses as they orbited a giant host galaxy. One thing they found was that the behavior of the numerical model could deviate from the analytic expectation of quasi-equilibrium enshrined in the equations above. For an eccentric orbit, the external field varies with distance from the host. If there is enough time to respond to this, the change can be adiabatic (reversible), and the static approximation may be close enough. However, as the external field varies more rapidly and/or the dwarf is more fragile, the numerical solution departs from the simple analytic approximation. For example:

Fig. 2 of Brada & Milgrom (2000): showing the numerically calculated (dotted line) variation of radius (left) and characteristic velocity (right) for a dwarf on a mildly eccentric orbit (peri- and apocenter of roughly 60 and 90 kpc, respectively, for a Milky Way-like host). Also shown is the variation in the EFE as the dwarf’s distance from the host varies (solid line). Dwarfs go through a breathing mode of increasing/decreasing size and decreasing/increasing velocity dispersion in phase with the orbit. If this process is adiabatic, it tracks the solid line and the static EFE approximation holds. This is not always the case in the simulation, so applying our usual assumption of dynamical equilibrium will result in an error stipulated by the difference between the dotted and solid lines. The amplitude of this error depends on the size, mass, and orbital history of each and every dwarf satellite.

As long as the behavior is adiabatic, the dwarf can be stable indefinitely even as it goes through periodic expansion and contraction in phase with the orbit. Departure from adiabaticity means that every passage will be different. Some damage will be done on the first passage, more on the second, and so on. As a consequence, reality will depart from our simple analytic expectations.

I was aware of this when I made the prediction for the velocity dispersion of Crater 2, and hedged appropriately. Indeed, I worried that Crater 2 should already be out of equilibrium. Nevertheless, I took solace in two things: first, the orbital timescale is long, over a Gyr, so departures from the equilibrium prediction might not have had time to make a dramatic difference. Second, this expectation is consistent with the slow evolution of the characteristic velocity for the most Crater 2-like, m=1 model of Brada & Milgrom (bottom track in the right panel below):

Fig. 4 of Brada & Milgrom (2000): The variation of the size and characteristic velocity of dwarf models of different mass. The more massive models approximate the adiabatic limit, which gradually breaks down for the lowest mass models. In this example, the m = 1 and 2 models explode, with the scale size growing gradually without recovering.

What about the size? That is not constant except for the most massive (m=16) model. The m=3 and 4 models recover, albeit not adiabatically. The m=4 model almost returns to its original size, but the m=3 model has puffed up after one orbit. The m=1 and 2 models explode.

One can see this by eye. The continuous growth in radii of the lower mass models is obvious. If one looks closely, one can also see the expansion then contraction of the heavier models.

Fig. 5 of Brada & Milgrom (2000): AQUAL numerical simulations of dwarf satellites orbiting a more massive host galaxy. The parameter m describes the mass and effective surface density of the satellite; all the satellites are in the MOND regime and subject to the external field of the host galaxy, which exceeds their internal accelerations. In dimensionless simulation units, m = 5 × 10⁻⁵, which for a satellite of the Milky Way corresponds roughly to a stellar mass of 3 × 10⁶ M☉. For real dwarf satellite galaxies, the scale size is also relevant, but the sequence of m above suffices to illustrate the increasingly severe effects of the external field as m decreases.

The current size of Crater 2 is unusual. It is very extended for its mass. If the current version of Crater 2 has a close passage with the Milky Way, it won’t survive. But we know it already had a close passage, so it should be expanding now as a result. (I did discuss the potential for non-equilibrium effects.) Knowing now that there was a pericenter passage in the (not exactly recent) past, we need to imagine running back the clock on the simulations. It would have been smaller in the past, so maybe it started with a normal size, and now appears so large because of its pericenter passage. The dynamics predict something like that; it is static thinking to assume it was always thus.

The dotted line shows a possible evolutionary track for Crater 2 as it expands after pericenter passage. Its initial condition would have been amongst the other dwarf spheroidals. It could also have lost some mass in the process, so any of the green low-mass dwarfs might be similar to the progenitor.

This is a good example of a phenomenon I’ve encountered repeatedly with MOND. It predicts something right, but seems to get something else wrong. If we’re already sure it is wrong, we stop there and never think further. But when one bothers to follow through on what the theory really predicts, more often than not the apparently problematic observation is in fact what we should have expected in the first place.

DF2 and DF4

DF2 and DF4 are two UDGs in the vicinity of the giant galaxy NGC 1052. They have very similar properties, being practically identical in size and mass within the errors. They are similar to Crater 2 in that they are larger than other galaxies of the same mass.

When it was first discovered, NGC 1052-DF2 was portrayed as a falsification of MOND. On closer examination, it turned out that, had I known about it in advance, I could have used MOND to correctly predict its velocity dispersion, just as for the dwarfs of Andromeda. This seemed like yet another case where the initial interpretation contrary to MOND melted away to actually be a confirmation. At this point, I’ve seen literally hundreds^^ of cases like that. Indeed, this particular incident made me realize that there would always be new cases like that, so I decided to stop spending my time addressing every single one.

Since then, DF2 has been the target of many intensive observing campaigns. Apparently it is easier to get lots of telescope time to observe a single object that might have the capacity to falsify MOND than it is to get a more modest amount to study everything else in the universe. That speaks volumes about community priorities and the biases that inform them. At any rate, there is now lots more data on this one object. In some sense there is too much – there has been an active debate in the literature over the best distance determination (which affects the mass) and the most accurate velocity dispersion. Some of these combinations are fine with MOND, but others are not. Let’s consider the worst case scenario.

In the worst case scenario, both DF2 and DF4 are too far from NGC 1052 for its current EFE to have much impact, and they have relatively low velocity dispersions for their luminosity, around 8 km/s, so they fall below the BTFR. Worse for MOND is that this is about what one expects from Newton for the stars alone. Consequently, these galaxies are sometimes referred to as being “dark matter free.” That’s a problem for MOND, which predicts a larger velocity dispersion for systems in equilibrium.

Perhaps we are falling prey to static thinking, and these objects are not in equilibrium. While their proximity to neighboring galaxies and the EFE to which they are presently exposed depends on the distance, which is disputed, it is clear that they live in a rough neighborhood with lots of more massive galaxies that could have bullied them in a close passage at some point in the past. Looking at Fig. 4 of Brada & Milgrom above, I see that galaxies whacked out of equilibrium not only expand in radius, potentially explaining the unusually large sizes of these UDGs, but they also experience a period during which their velocity dispersion is below the equilibrium value. The amplitude of the dip in these simulations is about right to explain the appearance of being dark-matter-free.

It is thus conceivable that DF2 and DF4 (the two are nearly identical in the relevant respects) suffered some sort of interaction that perturbed them into their current state. Their apparent absence of a mass discrepancy and the apparent falsification of MOND that follows therefrom might simply be a chimera of static thinking.

Make no mistake: this is a form of special pleading. The period of depressed velocity dispersion does not last indefinitely, so we have to catch them at a somewhat special time. How special depends on the nature of the interaction and its timescale. This can be long in intergalactic space (Gyrs), so it may not be crazy special, but we don’t really know how special. To say more, we would have to do detailed simulations to map out the large parameter space of possibilities for these objects.

I’d be embarrassed for MOND to have to make this kind of special pleading if we didn’t also have to do it for LCDM. A dwarf galaxy being dark matter free in LCDM shouldn’t happen. Galaxies form in dark matter halos; it is very hard to get rid of the dark matter while keeping the galaxy. The most obvious way to do it, in rare cases, is through tidal disruption, though one can come up with other possibilities. These amount to the same sort of special pleading we’re contemplating on behalf of MOND.

Recently, Tang et al. (2024) argue that DF2 and DF4 are “part of a large linear substructure of dwarf galaxies that could have been formed from a high-velocity head-on encounter of two gas-rich galaxies” which might have stripped the dark matter while leaving the galactic material. That sounds… unlikely. Whether it is more or less unlikely than what it would take to preserve MOND is hard to judge. It appears that we have to indulge in some sort of special pleading no matter what: it simply isn’t natural for galaxies to lack dark matter in a universe made of dark matter, just as it is unnatural for low acceleration systems to not manifest a mass discrepancy in MOND. There is no world model in which these objects make sense.

Tang et al. (2024) also consider a number of other possibilities, which they conveniently tabulate:

Table 3 from Tang et al. (2024).

There are many variations on awkward hypotheses for how these particular UDGs came to be in LCDM. They’re all forms of special pleading. Even putting on my dark matter hat, most sound like crazy talk to me. (Stellar feedback? Really? Is there anything it cannot do?) It feels like special pleading on top of special pleading; it’s special pleading all the way down. All we have left to debate is which form of special pleading seems less unlikely than the others.

I don’t find this debate particularly engaging. Something weird happened here. What that might be is certainly of interest, but I don’t see how we can hope to extract from it a definitive test of world models.

Antlia 2

The last of the UDGs in the first plot above is Antlia 2, which I now regret including – not because it isn’t interesting, but because this post is getting exhausting. Certainly to write, perhaps to read.

Antlia 2 is on the BTFR, which would ordinarily be unremarkable. In this case it is weird in MOND, as the EFE should take it off the BTFR: the observed velocity dispersion is 6 km/s, but the static EFE formula predicts that it should only be 3 km/s. In this respect it should be like Crater 2.

First, I’d like to point out that, as an observer, it is amazing to me that we can seriously discuss the difference between 3 and 6 km/s. These are tiny numbers by the standard of the field. The more strident advocates of cold dark matter used to routinely assume that our rotation curve observations suffered much larger systematic errors than that in order to (often blithely) assert that everything was OK with cuspy halos – so who are you going to believe, our big, beautiful simulations or those lying data?

I’m not like that, so I do take the difference seriously. My next question, whenever MOND is a bit off like this, is what does LCDM predict?

I’ll wait.

Well, no, I won’t, because I’ve been waiting for thirty years, and the answer, when there is one, keeps changing. The nominal answer, as best I can tell, is ~20 km/s. As with Crater 2, the large scale size of this dwarf means it should sample a large portion of its dark matter halo, so the expected characteristic speed is much higher than 6 km/s. So while the static MOND prediction may be somewhat off here, the static LCDM expectation fares even worse.

This happens a lot. Whenever I come across a case that doesn’t make sense in MOND, it usually doesn’t make sense in dark matter either.

In this case, the failure of the static-case prediction is apparently caused by tidal perturbation. Like Crater 2, Antlia 2 may have a large half-light radius because it is expanding in the way seen in the simulations of Brada & Milgrom. But it appears to be a bit further down that path, with member stars stretched out along the orbital path. They start to trace a small portion of a much deeper gravitational potential, so the apparent velocity dispersion goes up in excess of the static prediction.

Fig. 9 from Ji et al. (2021) showing tidal features in Antlia 2 considering the effects of the Milky Way alone (left panel) and of the Milky Way and the Large Magellanic Cloud together (central panel) along with the position-velocity diagram from individual stars (right panel). The object is clearly not the isotropic, spherical cow presumed by the static equation for the velocity dispersion. Indeed, it is elongated as would be expected from tidal effects, with individual member stars apparently leaking out.

This is essentially what I inferred must be happening in the ultrafaint dwarfs of the Milky Way. There is no way that these tiny objects deep in the potential well of the Milky Way escape tidal perturbation%% in MOND. They may be stripped of their stars and their velocity dispersions may get tidally stirred up. Indeed, Antlia 2 looks very much like the MOND prediction for the formation of tidal streams from such dwarfs made by McGaugh & Wolf (2010). Unlike dark matter models in which stars are first protected, then lost in pulses during pericenter passages, the stronger tides of MOND combined with the absence of a protective dark matter cocoon mean that stars leak out gradually all along the orbit of the dwarf. The rate is faster when the external field is stronger at pericenter passage, but the mass loss is more continuous. This is a good way to make long stellar streams, which are ubiquitous in the stellar halo of the Milky Way.

So… so what?

It appears that aspects of the observations of the UDGs discussed here that seem problematic for MOND may not be as bad for the theory as they at first seem. Indeed, it appears that the noted problems may instead be a consequence of the static assumptions we usually adopt to do the analysis. The universe is a dynamic place, so we know this assumption does not always hold. One has to judge each case individually to assess whether this is reasonable or not.

In the cases of Crater 2 and Antlia 2, yes, the stranger aspects of the observations fit well with non-equilibrium effects. Indeed, the unusually large half-light radii of these low mass dwarfs may well be a result of expansion after tidal perturbation. That this might happen was specifically anticipated for Crater 2, and Antlia 2 fits the bill described by McGaugh & Wolf (2010) as anticipated by the simulations of Brada & Milgrom (2000) even though it was unknown at the time.

In the cases of DF2 and DF4, it is less clear what is going on. I’m not sure which data to believe, and I want to refrain from cherry-picking, so I’ve discussed the worst-case scenario above. But the data don’t make a heck of a lot of sense in any world view; the many hypotheses made in the dark matter context seem just as contrived and unlikely as a tidally-induced, temporary dip in the velocity dispersion that might happen in MOND. I don’t find any of these scenarios to be satisfactory.

This is a long post, and we have only discussed four galaxies. We should bear in mind that the vast majority of galaxies do as predicted by MOND; a few discrepant cases are always to be expected in astronomy. That MOND works at all is a problem for the dark matter paradigm: that it would do so was not anticipated by any flavor of dark matter theory, and there remains no satisfactory explanation of why MOND appears to happen in a universe made of dark matter. These four galaxies are interesting cases, but they may be an example of missing the forest for the trees.


*As it happens, the surface brightness threshold adopted in the definition of UDGs is exactly the same as I suggested for VLSBGs (very low surface brightness galaxies: McGaugh 1996), once the filter conversions have been made. At the time, this was the threshold of our knowledge, and I and other early pioneers of LSB galaxies were struggling to convince the community that such things might exist. Up until that time, the balance of opinion was that they did not, so it is gratifying to see that they do.

**This expectation is specific to MOND; it doesn’t necessarily hold in dark matter where the acceleration in the central regions of diffuse galaxies can be dominated by the cusp of the dark matter halo. These were predicted to exceed what is observed, hence the cusp-core problem.

+Measuring by surface brightness, Crater 2 and Antlia 2 are two orders of magnitude more diffuse than the prototypical ultradiffuse galaxies DF2 and DF4. Crater 2 is not quite large enough to count as a UDG by the adopted size definition, but Antlia 2 is. So does that make it super-ultra diffuse? Would it even be astronomy without terrible nomenclature?

&I didn’t want to use a MOND-specific criterion in McGaugh et al. (2021) because I was making a more general point, so the green points are overly conservative from the perspective of the MOND isolation criterion: there are more dwarfs for which this works. Indeed, we had great success in predicting velocity dispersions in exactly this fashion in McGaugh & Milgrom (2013a, 2013b). And XXVIII was a case not included above that we highlighted as a great test of MOND, being low mass (~4×10⁵ M☉) but still qualifying as isolated, and its dispersion came in (6.6 +2.9/−2.1 km/s in one measurement, 4.9 ± 1.6 km/s in another) as predicted a priori (4.3 +0.8/−0.7 km/s). Hopefully the Rubin Observatory will discover many more similar objects that are truly isolated; these will be great additional tests, though one wonders how much more piling-on needs to be done.

^This is an approximation that is reasonable for the small accelerations involved. More generally we have Geff = G/μ(|gex+gin|/a0) where μ is the MOND interpolation function and one takes the vector sum of all relevant accelerations.

#This follows because the boost from MOND is limited by how far into the low acceleration regime an object is. If the EFE is important, the boost will be less than in the isolated case. As we said in 2013, “the case that reports the lower velocity dispersion is always the formally correct one.” I mention it again here because apparently people are good at scraping equations from papers without reading the associated instructions, so one gets statements like “the theory does not specify precisely when the EFE formula should replace the isolated MOND prediction.” Yes it does. We told you precisely when the EFE formula should replace the isolated formula. It is when it reports the lower velocity dispersion. We also noted this as the reason for not giving σefe in the tables in cases where it didn’t apply, so there were multiple flags. It took half a dozen coauthors to not read that. I’d hate to see how their Ikea furniture turned out.

$As often happens with LCDM, there are many nominal predictions. One common theme is that “Despite spanning four decades in luminosity, dSphs appear to inhabit halos of comparable peak circular velocity.” So nominally, one would expect a faint galaxy like Crater 2 to have a similar velocity dispersion to a much brighter one like Fornax, and the luminosity would have practically no power to predict the velocity dispersion, contrary to what we observe in the BTFR.

%There is the 2-halo term – once you get far enough from the center of a dark matter halo (the 1-halo term), there are other halos out there. These provide additional unseen mass, so can boost the velocity. The EFE in MOND has the opposite effect, and occurs for completely different physical reasons, so they’re not at all the same.

^^For arbitrary reasons of human psychology, the threshold many physicists set for “always happens” is around 100 times. That is, if a phenomenon is repeated 100 times, it is widely presumed to be a general rule. That was the threshold Vera Rubin hit when convincing the community that flat rotation curves were the general rule, not just some peculiar cases. That threshold has also been hit and exceeded by detailed MOND fits to rotation curves, and it seems to be widely accepted that this is the general rule even if many people deny the obvious implications. By now, the same is true of apparent exceptions to MOND ceasing to be exceptions as the data improve. Unfortunately, people tend to stop listening once they hear what they want to hear (in this case, “falsifies MOND”) and fail to pay attention to further developments.

%%It is conceivable that the ultrafaint dwarfs might elude tidal disruption in dark matter models if they reside in sufficiently dense dark matter halos. This seems unlikely given the obvious tidal effects on much more massive systems like the Sagittarius dwarf and the Magellanic Clouds, but it could in principle happen. Indeed, if one calculates the mass density from the observed velocity dispersion, one infers that they do reside in dense dark matter halos. In order to do this calculation, we are obliged to assume that the objects are in equilibrium. This is, of course, a form of static thinking: the possibility of tidal stirring that enhances the velocity dispersion above the equilibrium value is excluded by assumption. The assumption of equilibrium is so basic that it is easy to unwittingly engage in circular reasoning. I know, as I did exactly that myself to begin with.

The fault in our stars: blame them, not the dark matter!

As discussed in recent posts, the appearance of massive galaxies in the early universe was predicted a priori by MOND (Sanders 1998, Sanders 2008, Eappen et al. 2022). This is problematic for LCDM. How problematic? That’s always the rub.

The data follow the evolutionary track of a monolithic model (purple line) rather than the track of the largest progenitor predicted by hierarchical LCDM (dotted lines leading to different final masses).

The problem that JWST observations pose for LCDM is that there is a population of galaxies in the high redshift universe that appear to evolve as giant monoliths rather than assembling hierarchically. Put that way, it is a fatal flaw: hierarchical assembly of mass is fundamental to the paradigm. But we don’t observe mass, we observe light. So the obvious “fix” is to adjust the mapping of observed light to predicted dark halo mass in order to match the observations. How plausible is this?

Merger trees from the Illustris-TNG50 simulation showing the hierarchical assembly of L* galaxies. The dotted lines in the preceding plot show the stellar mass growth of the largest progenitor, which is on the left of each merger tree. All progenitors were predicted to be tiny at z > 3, well short of what we observe.

Before trying to wriggle out of the basic result, note that doing so is not plausible from the outset. We need to make the curve of growth of the largest progenitors “look like” the monolithic model. They shouldn’t, by construction, so everything that follows is a fudge to try to avoid the obvious conclusion. But this sort of fudging has been done so many times before in so many ways (the “Frenk Principle” was coined nearly thirty years ago) that many scientists in the field have known nothing else. They seem to think that this is how science is supposed to work. This in turn feeds a convenient attitude that evades the duty to acknowledge that a theory is in trouble when it persistently has to be adjusted to make itself look like a competitor.

That noted, let’s wriggle!

Observational dodges

The first dodge is denial: somehow the JWST data are wrong or misleading. Early on, there were plausible concerns about the validity of some (some) photometric redshifts. There are enough spectroscopic redshifts now that this point is moot.

A related concern is that we “got lucky” with where we pointed JWST to start with, and the results so far are not typical of the universe at large. This is not quite as crazy as it sounds: the field of view of JWST is tiny, so there is no guarantee that the first snapshot will be representative. Moreover, a number of the first pointings intentionally targeted rich fields containing massive clusters, i.e., regions known to be atypical. However, as observations have accumulated, I have seen no indications of a reversal of our first impression, but rather lots of corroboration. So this hedge also now borders on reality denial.

A third observational concern that we worried a lot about in Franck & McGaugh (2017) is contamination by active galactic nuclei (AGN). Luminosity produced by accretion onto supermassive black holes (e.g., quasars) was more common in the early universe. Perhaps some of the light we are attributing to stars is actually produced by AGN. That’s a real concern, but long story short, AGN contamination isn’t enough to explain everything else away. Indeed, the AGN themselves are a problem in their own right: how do we make the supermassive black holes that power AGN so rapidly that they appear already in the early universe? Like the galaxies they inhabit, the black holes that power AGN should take a long time to assemble in the absence of the heavy seeds naturally provided by MOND but not dark matter.

An evergreen concern in astronomy is extinction by dust. Dust could play a role (Ferrara et al. 2023), but this would be a weird effect for it to have. Dust is made by stars, so we naively expect it to build up along with them. In order to explain high redshift JWST data with dust we have to do the opposite: make a lot of dust very early without a lot of stars, then eject it systematically from galaxies so that the net extinction declines with time – a galactic reveal sort of like a cosmic version of the dance of the seven veils. The rate of ejection for all galaxies must necessarily be fine-tuned to balance the barely evolving UV luminosity function with the rapidly evolving dark matter halo mass function. This evolution of the extinction has to coordinate with the dark matter evolution over a rather small window of cosmic time, there being only ~10⁸ yr between z = 14 and 11. This seems like an implausible way to explain an unchanging luminosity density, which is more naturally explained by simply having stars form and be there for their natural lifetimes.
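
If you want to check that timing, a two-line sketch suffices (this assumes the astropy package and its built-in Planck18 parameters; any standard choice of cosmological parameters gives much the same answer):

```python
from astropy.cosmology import Planck18

# Cosmic time elapsed between z = 14 and z = 11 in a standard LCDM cosmology
print((Planck18.age(11) - Planck18.age(14)).to('Myr'))   # roughly 120 Myr, i.e., ~1e8 yr
```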

Figure 5 from McGaugh et al. (2024): The UV luminosity function (left) observed by Donnan et al. (2024; points) compared to that predicted for ΛCDM by Yung et al. (2023; lines) as a function of redshift. Lines and points are color coded by redshift, with dark blue, light blue, green, orange, and red corresponding to z = 9, 10, 11, 12, and 14, respectively. There is a clear excess in the number density of galaxies that becomes more pronounced with redshift, ranging from a factor of ~2 at z = 9 to an order of magnitude at z ≥ 11 (right). This excess occurs because the predicted number of sources declines with redshift while the observed numbers remain nearly constant, with the data at z = 9, 10, and 11 being right on top of each other.

The basic observation is that there is too much UV light produced by galaxies at all redshifts z > 9. What we’d rather have is the stellar mass function. JWST was designed to see optical light at the redshift of galaxy formation, but the universe surprised us and formed so many stars so early that we are stuck making inferences with the UV anyway. The relation of UV light to mass is dodgy, providing a knob to twist. So up next is the physics of light production.

In our discussion to this point, we have assumed that we know how to compute the luminosity evolution of a stellar population given a prescription for its star formation history. This is no small feat. This subject has a rich history with plenty of ups and downs, like most of astronomy. I’m not going to attempt to review all that here. I think we have this figured out well enough to do what we need to do for the purposes of our discussion here, but there are some obvious knobs to turn, so let’s turn ’em.

Blame the stars!

As noted above, we predict mass but observe light. So the program now is to squeeze more light out of less mass. Early dark matter halos too small? No problem; just make them brighter. More specifically, we need to make models in which the small dark matter halos that form first are better at producing photons from the small amount of baryons that they possess than are their low-redshift descendants. We have observational constraints on the latter; local star formation is inefficient, but maybe that wasn’t always the case. So the first obvious thing to try is to make star formation more efficient.

Super Efficient Star Formation

First, note that stellar populations evolve pretty much as we expect for stars, so this is a bit tricky. We have to retain the evolution we understand well for most of cosmic time while giving a big boost at early times. One way to do that is to have two distinct modes of star formation: the one we think of as normal that persists to this day, and an additional mode of super-efficient star formation (SESF) at play in the early universe. This way we retain the usual results while potentially giving us the extra boost that we need to explain the JWST data. We argue that this is the least implausible path to preserving LCDM. We’re trying to make it work, and to anticipate the arguments Dr. Z would make.

This SESF mode of star formation needs to be very efficient indeed, as there are galaxies that appear to have converted essentially all of their available baryons into stars. Let’s pause to observe that this is pretty silly. Space is very empty; it is hard to get enough mass together to form stars at all: there’s good reason that it is inefficient locally! The early universe is a bit denser by virtue of being smaller; at z = 9 the expansion factor is only 1/(1+z) = 0.1 of what it is now, so the density is (1+z)³ = 1,000 times greater. ON AVERAGE. That’s not really a big boost when it comes to forming structures like stars since the initial condition was extraordinarily uniform. The lack of early structure by far outweighs the difference in density; that is precisely why we’re having a problem. Still, I can at least imagine that there are regions that experience a cascade of violent relaxation and SESF once some threshold in gas density is exceeded that differentiates the normal mode of star formation from SESF. Why a threshold in the gas? Because there’s nothing obvious in the dark matter picture to distinguish the galaxies that result from one or the other mode. CDM itself is scale free, after all, so we have to imagine a scale set by baryons that funnels protogalaxies into one mode or the other. Why, physically, is there a particular gas density that makes that happen? That’s a great question.

There have been observational indications that local star formation is related to a gas surface density threshold, so maybe there’s another threshold that kicks it up another notch. That’s just a plausibility argument, but that’s the straw I’m clutching at to justify SESF as the least implausible option. We know there’s at least one way in which a surface density scale might matter to star formation.

Writing out the (1+z)³ argument for the density above tickled the memory that I’d seen something similar claimed elsewhere. Looking it up, indeed Boylan-Kolchin (2024) does this, getting an extra (1+z)³ [for a total of (1+z)⁶] by invoking a surface density Σ that follows from an acceleration scale g: Σ = g/(πG). Very MONDish, that. At any rate, the extra boost is claimed to lift a corner of dark matter halo parameter space into the realm of viability. So, sure. Why not make that step two.
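
For scale, taking g = a0 ≈ 1.2 × 10⁻¹⁰ m s⁻² (my reading of the acceleration scale being invoked), that surface density works out to

\[
\Sigma = \frac{a_0}{\pi G} \approx 0.6\ \mathrm{kg\,m^{-2}} \approx 270\ M_\odot\,\mathrm{pc^{-2}} ,
\]

within a factor of two of Milgrom’s characteristic surface density a0/(2πG) ≈ 140 M☉ pc⁻² – hence “very MONDish.”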

However we do it, making stars super-efficiently is what the data appear to require – if we confine our consideration to the mass predicted by LCDM. It’s a way of covering the lack of mass with a surplus of stars. Any mechanism that makes stars more efficiently will boost the dotted lines in the M*-z diagram above in the right direction. Do they map into the data (and the monolithic model) as needed? Unclear! All we’ve done so far is offer plausibility arguments that maybe it could be so, not demonstrate a model that works without fine-tuning that woulda coulda shoulda made the right prediction in the first place.

The ideas become less plausible from here.

Blame the IMF!

The next obvious idea after making more stars in total is to just make more of the high mass stars that produce UV photons. The IMF is a classic boogeyman to accomplish this. I discussed this briefly before, and it came up in a related discussion in which it was suggested that “in the end what will probably happen is that the IMF will be found to be highly redshift dependent.”

OK, so, first, what is the IMF? The Initial Mass Function is the spectrum of masses with which stars form: how many stars of each mass, ranging from the brown dwarf limit (0.08 M☉) to the most massive stars formed (around 100 M☉). The number of stars formed in any star forming event is a strong function of mass: low mass stars are common, high mass stars are rare. Here, though, is the rub: integrating over the whole population, low mass stars contain most of the mass, but high mass stars produce most of the light. This makes the conversion of mass to light quite sensitive to the IMF.
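
To see how lopsided this split is, here is a minimal sketch that assumes a single Salpeter-like power law (dN/dM ~ M^-2.35 from 0.08 to 100 M☉) and a crude main-sequence scaling L ~ M^3.5; both are illustrative assumptions for the sketch, not a real population-synthesis calculation.

```python
# Minimal sketch: where the mass and the light of a stellar population come from,
# assuming a Salpeter-like IMF dN/dM ~ M**-2.35 from 0.08 to 100 Msun and a crude
# main-sequence scaling L ~ M**3.5 (both are illustrative assumptions).

def powerlaw_integral(p, lo, hi):
    """Integral of M**p dM from lo to hi (valid for p != -1)."""
    return (hi ** (p + 1) - lo ** (p + 1)) / (p + 1)

M_MIN, M_MAX, M_SPLIT = 0.08, 100.0, 8.0   # Msun; ~8 Msun separates the short-lived O/B stars
IMF_SLOPE, L_SLOPE = -2.35, 3.5

# Mass is weighted by M * dN/dM; light by L(M) * dN/dM ~ M**3.5 * dN/dM
mass_frac = (powerlaw_integral(IMF_SLOPE + 1, M_SPLIT, M_MAX)
             / powerlaw_integral(IMF_SLOPE + 1, M_MIN, M_MAX))
light_frac = (powerlaw_integral(IMF_SLOPE + L_SLOPE, M_SPLIT, M_MAX)
              / powerlaw_integral(IMF_SLOPE + L_SLOPE, M_MIN, M_MAX))

print(f"fraction of the MASS  in stars above {M_SPLIT:.0f} Msun: {mass_frac:.2f}")   # ~0.13
print(f"fraction of the LIGHT from stars above {M_SPLIT:.0f} Msun: {light_frac:.2f}")  # ~1.00
```

With those crude assumptions, stars above 8 M☉ hold only about a tenth of the mass but produce essentially all of the light, so tweaking the high-mass end of the IMF moves the luminosity a lot while hardly changing the mass.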

The number of UV photons produced by a stellar population is especially sensitive to the IMF as only the most massive and short-lived O and B stars produce them. This is low-hanging fruit for the desperate theorist: just a few more of those UV-bright, short-lived stars, please! If we adjust the IMF to produce more of these high mass stars, then they crank out lots more UV photons (which goes in the direction we need) but they don’t contribute much to the total mass. Better yet, they don’t live long. They’re like icicles as murder weapons in mystery stories: they do their damage then melt away, leaving no further evidence. (Strictly speaking that’s not true: they leave corpses in the form of neutron stars or stellar mass black holes, but those are practically invisible. They also explode as supernovae, boosting the production of metals, but the amount is uncertain enough to get away with murder.)

There is a good plausibility argument for a variable IMF. To form a star, gravity has to overcome gas pressure to induce collapse. Gas pressure depends on temperature, and interstellar gas can cool more efficiently when it contains some metals (here I mean metals in the astronomy sense, which is everything in the periodic table that’s not hydrogen or helium). It doesn’t take much; a little oxygen (one of the first products of supernova explosions) goes a long way to make cooling more efficient than a primordial gas composed of only hydrogen and helium. Consequently, low metallicity regions have higher gas temperatures, so it makes sense that gas clouds would need more gravity to collapse, leading to higher mass stars. The early universe started with zero metals, and it takes time for stars to make them and to return them to the interstellar medium, so voila: metallicity varies with time so the IMF varies with redshift.

This sound physical argument is simple enough to make that it can be done in a small part of a blog post. This has helped it persist in our collective astronomical awareness for many decades. Unfortunately, it appears to have bugger-all to do with reality.

If metallicity plays a strong role in determining the IMF, we would expect to see it in stellar populations of different metallicity. We measure the IMF for solar metallicity stars in the solar neighborhood. Globular clusters are composed of stars formed shortly after the Big Bang and have low metallicities. So following this line of argument, we anticipate that they would have a different IMF. There is no evidence that this is the case. Still, we only really need to tweak the high-mass end of the IMF, and those stars died a long time ago, so maybe this argument applies for them if not for the long-lived, low-mass stars that we observe today.

In addition to counting individual stars, we can get a constraint on the galaxy-wide average IMF from the scatter in the Tully-Fisher relation. The physical relation depends on mass, but we rely on light to trace that. So if the IMF varies wildly from galaxy to galaxy, it will induce scatter in Tully-Fisher. This is not observed; the amount of intrinsic scatter that we see is consistent with that expected for stochastic variations in the star formation history for a fixed IMF. That’s a pretty strong constraint, as it doesn’t take much variation in the IMF to cause a lot of scatter that we don’t see. This constraint applies to entire galaxies, so it tolerates variations in the IMF in individual star forming events, but whatever is setting the IMF apparently tends to the same result when averaged over the many star forming events it takes to build a galaxy.

Variation in the IMF has come up repeatedly over the years because it provides so much convenient flexibility. Early in my career, it was commonly invoked to explain the variation in spectral hardness with metallicity. If one looks at the spectra of HII regions (interstellar gas ionized by hot young stars), there is a trend for lower metallicity HII regions to be ionized by hotter stars. The argument above was invoked: clearly the IMF tended to have more high mass stars in low metallicity environments. However, the light emitted by stars also depends on metallicity; low metallicity stars are bluer than their high metallicity equivalents because there are fewer UV absorption lines from iron in their atmospheres. Taking care to treat the stars and interstellar gas self-consistently and integrating over a fixed IMF, I showed that the observed variation in spectral hardness was entirely explained by the variation in metallicity. There didn’t need to be more high mass stars in low metallicity regions; the stars were just hotter because that’s what happens in low metallicity stars. (I didn’t set out to do this; I was just trying to calibrate an abundance indicator that I would need for my thesis.)

Another example where excess high mass stars were invoked was to explain the apparently high optical depth to the surface of last scattering reported by WMAP. If those words don’t mean anything to you, don’t worry – all it means is that a couple of decades ago, we thought we needed lots more UV photons at high redshift (z ~ 17) than CDM naturally provided. The solution was, you guessed it, an IMF rich in high mass stars. Indeed, this result launched a thousand papers on supermassive Population III stars that didn’t pan out for reasons that were easily anticipated at the time. Nowadays, analysis of the Planck data suggests a much lower optical depth than initially inferred by WMAP, but JWST is observing too many UV photons at high redshift to remain consistent with Planck. This apparent tension for LCDM is a natural consequence of early structure formation in MOND; indeed, it is another thing that was specifically predicted (see section 3.1 of McGaugh 2004).

I relate all these stories of encounters with variations in the high mass end of the IMF because they’ve never once panned out. Maybe this time will be different.

Stochastic Star Formation

What else can we think up? There’s always another possibility. It’s a big universe, after all.

One suggestion I haven’t discussed yet is that high redshift galaxies appear overly bright from stochastic fluctuations in their early star formation. This again invokes the dubious relation between stellar mass and UV light, but in a more subtle way than simply stocking the IMF with a bunch more high mass stars. Instead, it notes that the instantaneous star formation rate is stochastic. The massive stars that produce all the UV light are short-lived, so the number present will fluctuate up and down. Over time, this averages out, but there hasn’t been much time yet in the early universe. So maybe the high redshift galaxies that seem to be over-luminous are just those that happen to be near a peak in the ups and downs of star formation. Galaxies will be brightest and most noticeable in this peak phase, so the real mass is less than it appears – albeit there must be a lot of galaxies in the off phase for every one that we see in the on phase.

One expects a lot of scatter in the inferred stellar mass in the early universe due to stochastic variations in the star formation rate. As time goes on, these average out and the inferred stellar mass becomes steady. That’s pretty much what is observed (data). The data track the monolithic model (purple line) and sometimes exceed it in the early, stochastic phase. The data bear no resemblance to hierarchical LCDM (orange line).

This makes a lot of sense to me. Indeed, it should happen at some level, especially in the chaotic early universe. It is also what I infer to be going on to explain why some measurements scatter above the monolithic line. That is the baseline star formation history for this population, with some scatter up and down at early times. Simply scattering from the orange LCDM line isn’t going to look like the purple monolithic line. The shape is wrong and the amplitude difference is too great to overcome in this fashion.

What else?

I’m sure we’ll come up with something, but I think I’ve covered everything I’ve heard so far. Indeed, most of these possibilities are obvious enough that I thought them up myself and wrote about them in McGaugh et al (2024). I don’t see anything in the wide-ranging discussion at KITP that wasn’t already in my paper.

I note this because I want to point out that we are following a well-worn script. This is the part where I tick off all the possibilities for more complicated LCDM models and point out their shortcomings. I expect the same response:

That’s too long to read. Dr. Z says it works, so he must be right since we already know that LCDM is correct.

Triton Station, 8 February 2022

People will argue about which of these auxiliary hypotheses is preferable. MOND is not an auxiliary hypothesis, but an entirely different paradigm, so it won’t be part of the discussion. After some debate, one of the auxiliaries (SESF not IMF!) will be adopted as the “standard” picture. This will be repeated until it becomes familiar, and once it is familiar it will seem that it was always so, and then people will assert that there was never a problem, indeed, that we expected it all along. This self-gaslighting reminds me of Feynman’s warning:

The first principle is that you must not fool yourself and you are the easiest person to fool.

Richard Feynman

What is persistently lacking in the community is any willingness to acknowledge, let alone engage with, the deeper question of why we have to keep invoking ad hoc patches to somehow match what MOND correctly predicted a priori. The sociology of invoking arbitrary auxiliary hypotheses to make these sorts of excuses for LCDM has been so consistently on display for so long that I wrote this parody a year ago:


It always seems to come down to special pleading:

Please don’t falsify LCDM! I ran out of computer time. I had a disk crash. I didn’t have a grant for supercomputer time. My simulation data didn’t come back from the processing center. A senior colleague insisted on a rewrite. Someone stole my laptop. There was an earthquake, a terrible flood, locusts! It wasn’t my fault! I swear to God!

And the community loves LCDM, so we fall for it every time.

Oh, LCDM. LCDM, honey.

PS – to appreciate the paraphrased quotes here, you need to hear them as they would be spoken by the pictured actors. So if you do not instantly recognize this scene from the Blues Brothers, you need to correct this shortcoming in your cultural education to get the full effect of the reference.

On the timescale for galaxy formation

On the timescale for galaxy formation

I’ve been wanting to expand on the previous post ever since I wrote it, which is over a month ago now. It has been a busy end to the semester. Plus, there’s a lot to say – nothing that hasn’t been said before, somewhere, somehow, yet still a lot to cobble together into a coherent story – if that’s even possible. This will be a long post, and there will be more after to narrate the story of our big paper in the ApJ. My sole ambition here is to express the predictions of galaxy formation theory in LCDM and MOND in the broadest strokes.

A theory is only as good as its prior. We can always fudge things after the fact, so what matters most is what we predict in advance. What do we expect for the timescale of galaxy formation? To tell you what I’m going to tell you, it takes a long time to build a massive galaxy in LCDM, but it happens much faster in MOND.

Basic Considerations

What does it take to make a galaxy? A typical giant elliptical galaxy has a stellar mass of 9 × 10^10 M☉. That’s a bit more than our own Milky Way, which has a stellar mass of 5 or 6 × 10^10 M☉ (depending who you ask) with another 10^10 M☉ or so in gas. So, in classic astronomy/cosmology style, let’s round off and say a big galaxy is about 10^11 M☉. That’s a hundred billion stars, give or take.

An elliptical galaxy (NGC 3379, left) and two spiral galaxies (NGC 628 and NGC 891, right).

How much of the universe does it take to make one big galaxy? The critical density of the universe is the over/under point for whether an expanding universe expands forever, or has enough self-gravity to halt the expansion and ultimately recollapse. Numerically, this quantity is ρcrit = 3H0^2/(8πG), which for H0 = 73 km/s/Mpc works out to 10^-29 g/cm^3 or 1.5 × 10^-7 M☉/pc^3. This is a very small number, but provides the benchmark against which we measure densities in cosmology. The density of any substance X is ΩX = ρX/ρcrit. The stars and gas in galaxies are made of baryons, and we know the baryon density pretty well from Big Bang Nucleosynthesis: Ωb = 0.04. That means the average density of normal matter is very low, only about 4 × 10^-31 g/cm^3. That’s less than one hydrogen atom per cubic meter – most of space is an excellent vacuum!

This being the case, we need to scoop up a large volume to make a big galaxy. Going through the math, to gather up enough mass to make a 10^11 M☉ galaxy, we need a sphere with a radius of 1.6 Mpc. That’s in today’s universe; in the past the universe was denser by (1+z)^3, so at z = 10 that’s “only” 140 kpc. Still, modern galaxies are much smaller than that; the effective edge of the disk of the Milky Way is at a radius of about 20 kpc, and most of the baryonic mass is concentrated well inside that: the typical half-light radius of a 10^11 M☉ galaxy is around 6 kpc. That’s a long way to collapse.
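For anyone who wants to check the arithmetic in the last two paragraphs, here is a minimal sketch, assuming the H0 = 73 km/s/Mpc and Ωb = 0.04 quoted above. It is purely illustrative; nothing in the argument depends on it.

```python
# Sketch: how big a sphere of baryons at the cosmic mean density does it
# take to enclose 10^11 Msun?  Values follow the round numbers in the text.
import numpy as np

G    = 6.674e-8    # cm^3 g^-1 s^-2
Msun = 1.989e33    # g
Mpc  = 3.086e24    # cm

H0       = 73e5 / Mpc                         # 73 km/s/Mpc in s^-1
rho_crit = 3 * H0**2 / (8 * np.pi * G)        # ~1e-29 g/cm^3
rho_b    = 0.04 * rho_crit                    # Omega_b = 0.04 -> ~4e-31 g/cm^3

M = 1e11 * Msun                               # a big galaxy's worth of baryons
R = (3 * M / (4 * np.pi * rho_b)) ** (1 / 3)  # radius of the required sphere

print("rho_crit ~ %.1e g/cm^3" % rho_crit)
print("R today  ~ %.1f Mpc" % (R / Mpc))
print("R at z=10 ~ %.0f kpc" % (R / Mpc * 1e3 / 11))   # denser by (1+z)^3
```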

Monolithic Galaxy Formation

Given this much information, an early concept was monolithic galaxy formation. We have a big ball of gas in the early universe that collapses to form a galaxy. Why and how this got started was fuzzy. But we knew how much mass we needed and the volume it had to come from, so we can consider what happens as the gas collapses to create a galaxy.

Here we hit a big astrophysical reality check. Just how does the gas collapse? It has to dissipate energy to do so, and cool to form stars. Once stars form, they may feed energy back into the surrounding gas, reheating it and potentially preventing the formation of more stars. These processes are nontrivial to compute ab initio, and attempting to do so obsesses much of the community. We don’t agree on how these things work, so they are the knobs theorists can turn to change an answer they don’t like.

Even if we don’t understand star formation in detail, we do observe that stars have formed, and can estimate how many. Moreover, we do understand pretty well how stars evolve once formed. Hence a common approach is to build stellar population models with some prescribed star formation history and see what works. Spiral galaxies like the Milky Way formed a lot of stars in the past, and continue to do so today. To make 5 × 10^10 M☉ of stars in 13 Gyr requires an average star formation rate of 4 M☉/yr. The current measured star formation rate of the Milky Way is estimated to be 2 ± 0.7 M☉/yr, so the star formation rate has been nearly constant (averaging over stochastic variations) over time, perhaps with a gradual decline. Giant elliptical galaxies, in contrast, are “red and dead”: they have no current star formation and appear to have made most of their stars long ago. Rather than a roughly constant rate of star formation, they peaked early and declined rapidly. The cessation of star formation is also called quenching.

A common way to formulate the star formation rate in galaxies as a whole is the exponential star formation rate, SFR(t) = SFR0 e^(-t/τ). A spiral galaxy has a low baseline star formation rate SFR0 and a long burn time τ ~ 10 Gyr while an elliptical galaxy has a high initial star formation rate and a short e-folding time like τ ~ 1 Gyr. Many variations on this theme are possible, and are of great interest astronomically, but this basic distinction suffices for our discussion here. From the perspective of the observed mass and stellar populations of local galaxies, the standard picture for a giant elliptical was a large, monolithic island universe that formed the vast majority of its stars early on then quenched with a short e-folding timescale.
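As a quick illustration of these two limiting cases, here is a sketch that evaluates M*(t) = SFR0 τ (1 − e^(-t/τ)) for a spiral-like and an elliptical-like history. The particular SFR0 values are my own choices, picked only to land near the stellar masses quoted above; treat them as illustrative, not as fits to any real galaxy.

```python
# Sketch: cumulative stellar mass from SFR(t) = SFR0 * exp(-t/tau).
# The SFR0 values are illustrative choices, not fits to real galaxies.
import numpy as np

def mstar(t, sfr0, tau):
    """Stellar mass (Msun) formed by time t (Gyr) for an exponential SFH."""
    return sfr0 * tau * 1e9 * (1.0 - np.exp(-t / tau))

t_now = 13.0  # Gyr of star formation

for label, sfr0, tau in [("spiral-like     (tau = 10 Gyr)", 7.5, 10.0),
                         ("elliptical-like (tau =  1 Gyr)", 90.0, 1.0)]:
    m_now = mstar(t_now, sfr0, tau)
    # time at which half of today's stellar mass was already in place
    t_half = -tau * np.log(1.0 - 0.5 * m_now / (sfr0 * tau * 1e9))
    print("%s: M* = %.1e Msun, SFR(now) = %.1f Msun/yr, half formed by %.1f Gyr"
          % (label, m_now, sfr0 * np.exp(-t_now / tau), t_half))
```

The spiral-like case ends up forming stars at a couple of M☉/yr today with half its stars in place only after several Gyr, while the elliptical-like case is essentially done (quenched) within the first Gyr – which is the distinction drawn above.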

Galaxies as Island Universes

The density parameter Ω provides another useful way to think about galaxy formation. As cosmologists, we obsess about the global value of Ω because it determines the expansion history and ultimate fate of the universe. Here it has a more modest application. We can think of the region in the early universe that will ultimately become a galaxy as its own little closed universe. With a density parameter Ω > 1, it is destined to recollapse.

A fun and funny fact of the Friedmann equation is that the matter density parameter Ωm → 1 at early times, so the early universe when galaxies form is matter dominated. It is also very uniform (more on that below). So any subset that is a bit more dense than average will have Ω > 1 just because the average is very close to Ω = 1. We can then treat this region as its own little universe (a “top-hat overdensity”) and use the Friedmann equation to solve for its evolution, as in this sketch:

The expansion of the early universe a(t) (blue line). A locally overdense region may behave as a closed universe, recollapsing in a finite time (red line) to potentially form a galaxy.

That’s great, right? We have a simple, analytic solution derived from first principles that explains how a galaxy forms. We can plug in the numbers to find how long it takes to form our basic, big 10^11 M☉ galaxy and… immediately encounter a problem. We need to know how overdense our protogalaxy starts out. Is its effective initial Ωm = 2? 10? What value, at what time? The higher it is, the faster the evolution from initially expanding along with the rest of the universe to decoupling from the Hubble flow to collapsing. We know the math but we still need to know the initial condition.

Annoying Initial Conditions

The initial condition for galaxy formation is observed in the cosmic microwave background (CMB) at z = 1090. Where today’s universe is remarkably lumpy, the early universe is incredibly uniform. It is so smooth that it is homogeneous and isotropic to one part in a hundred thousand. This is annoyingly smooth, in fact. It would help to have some lumps – primordial seeds with Ω > 1 – from which structure can grow. The observed seeds are too tiny; the typical initial amplitude is 10^-5 so Ωm = 1.00001. That takes forever to decouple and recollapse; it hasn’t yet had time to happen.

The cosmic microwave background as observed by ESA’s Planck satellite. This is an all-sky picture of the relic radiation field – essentially a snapshot of the universe when it was just a few hundred thousand years old. The variations in color are variations in temperature which correspond to variations in density. These variations are tiny, only about one part in 100,000. The early universe was very uniform; the real picture is a boring blank grayscale. We have to crank the contrast way up to see these minute variations.
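To put a number on “takes forever,” here is a sketch using the standard parametric (cycloid) solution for a closed, matter-only region: its total lifetime from bang to crunch is t_coll = π Ω / [H (Ω − 1)^(3/2)], with Ω and H evaluated at some starting epoch. The choice of epoch (z = 1000), the matter-dominated expansion rate there, and the assumption that the region simply expands with the Hubble flow are rough simplifications for illustration only.

```python
# Sketch: recollapse time of a top-hat overdensity under normal gravity.
# Closed matter-only region: t_collapse = pi * Omega / (H * (Omega - 1)**1.5),
# evaluated at the starting epoch (cycloid solution; the tiny age of the
# universe at that epoch is neglected).
import numpy as np

H0      = 73.0 / 978.0 / 1e3                        # 73 km/s/Mpc in Myr^-1
z_start = 1000.0
H_start = H0 * np.sqrt(0.3) * (1 + z_start)**1.5    # rough matter-dominated H(z)

for Omega in [1.00001, 1.1, 2.0, 10.0]:
    t_coll = np.pi * Omega / (H_start * (Omega - 1.0)**1.5)   # Myr
    print("Omega_i = %-7g ->  recollapse after ~%.3g Gyr" % (Omega, t_coll / 1e3))
```

With Ω − 1 = 10^-5 the answer comes out at many tens of Gyr, much longer than the age of the universe, while an Ω of a few collapses almost immediately on cosmic timescales. That is the sense in which the observed seeds are too tiny.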

We would like to know how the big galaxies of today – enormous agglomerations of stars and gas and dust separated by inconceivably vast distances – came to be. How can this happen starting from such homogeneous initial conditions, where all the mass is equally distributed? Gravity is an attractive force that makes the rich get richer, so it will grow the slight initial differences in density, but it is also weak and slow to act. A basic result in gravitational perturbation theory is that overdensities grow at the same rate the universe expands, which is inversely related to redshift. So if we see tiny fluctuations in density with amplitude 10^-5 at z = 1000, they should have only grown by a factor of 1000 and still be small today (10^-2 at z = 0). But we see structures of much higher contrast than that. You can’t get here from there.
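The bookkeeping in that paragraph is worth making explicit, since so much hangs on it; here is a two-line version, using the standard rule of thumb that linear overdensities grow in proportion to the expansion factor:

```python
# Sketch: linear growth of a density contrast, delta ~ (1+z)^-1 during the
# matter-dominated era (a crude but standard rule of thumb).
z_cmb, delta_cmb = 1000, 1e-5            # redshift and amplitude of CMB fluctuations
print("delta today ~ %.0e" % (delta_cmb * (1 + z_cmb)))              # only ~1e-2
print("growth needed to reach delta ~ 1: ~%.0e" % (1 / delta_cmb))   # ~1e5
```

A factor of a thousand when a factor of a hundred thousand is needed.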

The rich large scale structure we see today is impossible starting from the smooth observed initial conditions. Yet here we are, so we have to do something to goose the process. This is one of the original motivations for invoking cold dark matter (CDM). If there is a substance that does not interact with photons, it can start to clump up early without leaving too large a mark on the relic radiation field. In effect, the initial fluctuations in mass are larger, just in the invisible substance. (That’s not to say the CDM doesn’t leave a mark on the CMB; it does, but it is subtle and entirely another story.) So the idea is that dark matter forms gravitational structures first, and the baryons fall in later to make galaxies.

An illustration of the linear growth of overdensities. Structure can grow in the dark matter (long dashed lines) with the baryons catching up only after decoupling (short dashed line). In effect, the dark matter gives structure formation a head start, nicely explaining the apparently impossible growth factor. This has been the standard picture for what seems like forever (illustration from Schramm 1992).

With the right amount of CDM – and it has to be just the right amount of a dynamically cold form of non-baryonic dark matter (stuff we still don’t know actually exists) – we can explain how the growth factor is 10^5 since recombination instead of a mere 10^3. The dark matter got a head start over the stuff we can see; it looks like 10^5 because the normal matter lagged behind, being entangled with the radiation field in a way the dark matter was not.

This has been the imperative need in structure formation theory for so long that it has become undisputed lore; an element of the belief system so deeply embedded that it is practically impossible to question. I risk getting ahead of the story, but it is important to point out that, like the interpretation of so much of the relevant astrophysical data, this belief assumes that gravity is normal. This assumption dictates the growth rate of structure, which in turn dictates the need to invoke CDM to allow structure to form in the available time. If we drop this assumption, then we have to work out what happens in each and every alternative that we might consider. That definitely gets ahead of the story, so first let’s understand what we should expect in LCDM.

Hierarchical Galaxy Formation in LCDM

LCDM predicts some things remarkably well but others not so much. The dark matter is well-behaved, responding only to gravity. Baryons, on the other hand, are messy – one has to worry about hydrodynamics in the gas, star formation, feedback, dust, and probably even magnetic fields. In a nutshell, LCDM simulations are very good at predicting the assembly of dark mass, but converting that into observational predictions relies on our incomplete knowledge of messy astrophysics. We know what the mass should be doing, but we don’t know so well how that translates to what we see. Mass good, light bad.

Starting with the assembly of mass, the first thing we learn is that the story of monolithic galaxy formation outlined above has to be wrong. Early density fluctuations start out tiny, even in dark matter. God didn’t plunk down island universes of galaxy mass then say “let there be galaxies!” The annoying initial conditions mean that little dark matter halos form first. These subsequently merge hierarchically to make ever bigger halos. Rather than top-down monolithic galaxy formation, we have the bottom-up hierarchical formation of dark matter halos.

The hierarchical agglomeration of dark matter halos into ever larger objects is often depicted as a merger tree. Here are four examples from the high resolution Illustris TNG50 simulation (Pillepich et al. 2019; Nelson et al. 2019).

Examples of merger trees from the TNG50-1 simulation (Pillepich et al. 2019; Nelson et al. 2019). Objects have been selected to have very nearly the same stellar mass at z=0. Mass is built up through a series of mergers. One large dark matter halo today (at top) has many antecedents (small halos at bottom). These merge hierarchically as illustrated by the connecting lines. The size of the symbol is proportional to the halo mass. I have added redshift and the corresponding age of the universe for vanilla LCDM in a more legible font. The color bar illustrates the specific star formation rate: the top row has objects that are still actively star forming like spirals; those in the bottom row are “red and dead” – things that have stopped forming stars, like giant elliptical galaxies. In all cases, there is a lot of merging and a modest rate of growth, with the typical object taking about half a Hubble time (~7 Gyr) to assemble half of its final stellar mass.

The hierarchical assembly of mass is generic in CDM. Indeed, it is one of its most robust predictions. Dark matter halos start small, and grow larger by a succession of many mergers. This gradual agglomeration is slow: note how tiny the dark matter halos at z = 10 are.

Strictly speaking, it isn’t even meaningful to talk about a single galaxy over the span of a Hubble time. It is hard to avoid this mental trap: surely the Milky Way has always been the Milky Way, so one imagines its evolution over time. This is monolithic thinking. Hierarchically, “the galaxy” refers at best to the largest progenitor, the object that traces the left edge of the merger trees above. But the other protogalactic chunks that eventually merge together are as much part of the final galaxy as the progenitor that happens to be largest.

This complicated picture is complicated further by what we can see being stars, not mass. The luminosity we observe forms through a combination of in situ growth (star formation in the largest progenitor) and ex situ growth through merging. There is no reason for some preferred set of protogalaxies to form stars faster than the others (though of course there is some scatter about the mean), so presumably the light traces the mass of stars formed traces the underlying dark mass. Presumably.

That we should see lots of little protogalaxies at high redshift is nicely illustrated by this lookback cone from Yung et al (2022). Here the color and size of each point corresponds to the stellar mass. Massive objects are common at low redshift but become progressively rare at high redshift, petering out at z > 4 and basically absent at z = 10. This realization of the observable stellar mass tracks the assembly of dark mass seen in merger trees.

Fig. 2 from Yung et al. (2022) illustrating what an observer would see looking back through their simulation to high redshift.

This is what we expect to see in LCDM: lots of small protogalaxies at high redshift; the building blocks of later galaxies that had not yet merged. The observation of galaxies much brighter than this at high redshift by JWST poses a fundamental challenge to the paradigm: mass appears not to be subdivided as expected. So it is entirely justifiable that people have been freaking out that what we see are bright galaxies that are apparently already massive. That shouldn’t happen; it wasn’t predicted to happen; how can this be happening?

That’s all background that is assumed knowledge for our ApJ paper, so we’re only now getting to its Figure 1. This combines one of the merger trees above with its stellar mass evolution. The left panel shows the assembly of dark mass; the right panel shows the growth of stellar mass in the largest progenitor. This is what we expect to see in observations.


Fig. 1 from McGaugh et al. (2024): A merger tree for a model galaxy from the TNG50-1 simulation (Pillepich et al. 2019; Nelson et al. 2019, left panel) selected to have M∗ ≈ 9 × 10^10 M⊙ at z = 0; i.e., the stellar mass of a local L∗ giant elliptical galaxy (Driver et al. 2022). Mass assembles hierarchically, starting from small halos at high redshift (bottom edge), with the largest progenitor traced along the left edge of the merger tree. The growth of stellar mass of the largest progenitor is shown in the right panel. This example (jagged line) is close to the median (dashed line) of comparable mass objects (Rodriguez-Gomez et al. 2016), and within the range of the scatter (the shaded band shows the 16th – 84th percentiles). A monolithic model that forms at zf = 10 and evolves with an exponentially declining star formation rate with τ = 1 Gyr (purple line) is shown for comparison. The latter model forms most of its stars earlier than occurs in the simulation.

For comparison, we also show the stellar mass growth of a monolithic model for a giant elliptical galaxy. This is the classic picture we had for such galaxies before we realized that galaxy formation had to be hierarchical. This particular monolithic model forms at zf = 10 and follows an exponential star formation rate with τ = 1 Gyr. It is one of the models published by Franck & McGaugh (2017). It is, in fact, the first model I asked Jay to construct when he started the project. Not because we expected it to best describe the data, as it turns out to do, but because the simple exponential model is a touchstone of stellar population modeling. It was a starter model: do this basic thing first to make sure you’re doing it right. We chose τ = 1 Gyr because that was the typical number bandied about for elliptical galaxies, and zf = 10 because that seemed ridiculously early for a massive galaxy to form. At the time we built the model, it was ludicrously early to imagine a massive galaxy would form, from an LCDM perspective. A formation redshift zf = 10 was, less than a decade ago, practically indistinguishable from the beginning of time, so we expected it to provide a limit that the data would not possibly approach.

In a remarkably short period, JWST has transformed z = 10 from inconceivable to run of the mill. I’m not going to go into the data yet – this all-theory post is already a lot – but to offer one spoiler: the data are consistent with this monolithic model. If we want to “fix” LCDM, we have to make the red line into the purple line for enough objects to explain the data. That proves to be challenging. But that’s moving the goalposts; the prediction was that we should see little protogalaxies at high redshift, not massive, monolith-style objects. Just look at the merger trees at z = 10!

Accelerated Structure Formation in MOND

In order to address these issues in MOND, we have to go back to the beginning. What is the evolution of a spherical region (a top-hat overdensity) that might collapse to form a galaxy? How does a spherical region under the influence of MOND evolve within an expanding universe?

The solution to this problem was first found by Felten (1984), who was trying to play the Newtonian cosmology trick in MOND. In conventional dynamics, one can solve the equation of motion for a point on the surface of a uniform sphere that is initially expanding and recover the essence of the Friedmann equation. It was reasonable to check if cosmology might be that simple in MOND. It was not. The appearance of a0 as a physical scale makes the solution scale-dependent: there is no general solution that one can imagine applies to the universe as a whole.

Felten reasonably saw this as a failure. There were, however, some appealing aspects of his solution. For one, there was no such thing as a critical density. All MOND universes would eventually recollapse irrespective of their density (in the absence of the repulsion provided by a cosmological constant). It could take a very long time, which depended on the density, but the ultimate fate was always the same. There was no special value of ฮฉ, and hence no flatness problem. The latter obsessed people at the time, so I’m somewhat surprised that no one seems to have made this connection. Too soon*, I guess.

There it sat for many years, an obscure solution for an obscure theory to which no one gave credence. When I became interested in the problem a decade later, I started methodically checking all the classic results. I was surprised to find how many things we needed dark matter to explain were just as well (or better) explained by MOND. My exact quote was “surprised the bejeepers out of us.” So, what about galaxy formation?

I started with the top-hat overdensity, and had the epiphany that Felten had already obtained the solution. He had been trying to solve all of cosmology, which didn’t work. But he had solved the evolution of a spherical region that starts out expanding with the rest of the universe but subsequently collapses under the influence of MOND. The overdensity didn’t need to be large, it just needed to be in the low acceleration regime. Something like the red cycloidal line in the second plot above could happen in a finite time. But how long?

The solution depends on scale and needs to be solved numerically. I am not the greatest programmer, and I had a lot else on my plate at the time. I was in no rush, as I figured I was the only one working on it. This is usually a good assumption with MOND, but not in this case. Bob Sanders had had the same epiphany around the same time, which I discovered when I received his manuscript to referee. So all credit is due to Bob: he said these things first.

First, he noted that galaxy formation in MOND is still hierarchical. Small things form first. Crudely speaking, structure formation is very similar to the conventional case, but now the goose comes from the change in the force law rather than extra dark mass. MOND is nonlinear, so the whole process gets accelerated. To compare with the linear growth of CDM:

A sketch of how structures grow over time under the influence of cold dark matter (left, from Schramm 1992, same as above) and MOND (right, from Sanders & McGaugh 2002; see also this further discussion and previous post). The slow linear growth of CDM (long-dashed line, left panel) is replaced by a rapid, nonlinear growth in MOND (solid lines at right; numbers correspond to different scales). Nonlinear growth moderates after cosmic expansion begins to accelerate (dashed vertical line in right panel).

The net effect is the same. A cosmic web of large scale structure emerges. They look qualitatively similar, but everything happens faster in MOND. This is why observations have persistently revealed structures that are more massive and were in place earlier than expected in contemporaneous LCDM models.

Simulated structure formation in ΛCDM (top) and MOND (bottom) showing the more rapid emergence of similar structures in MOND (note the redshift of each panel). From McGaugh (2015).

In MOND, small objects like globular clusters form first, but galaxies of a range of masses all collapse on a relatively short cosmic timescale. How short? Let’s consider our typical 10^11 M☉ galaxy. Solving Felten’s equation for the evolution of a sphere numerically, peak expansion is reached after 300 Myr and collapse happens in a similar time. The whole galaxy is in place speedy quick, and the initial conditions don’t really matter: a uniform, initially expanding sphere in the low acceleration regime will behave this way. From our distant vantage point thirteen billion years later, the whole process looks almost monolithic (the purple line above) even though it is a chaotic hierarchical mess for the first few hundred million years (z > 14). In particular, it is easy to form half of the stellar mass early on: the mass is already assembled.

The evolution of a 10^11 M⊙ sphere that starts out expanding with the universe but decouples and collapses under the influence of MOND (dotted line). It reaches maximum expansion after 300 Myr and recollapses in a similar time, so the entire object is in place after 600 Myr. (A version of this plot with a logarithmic time axis appears as Fig. 2 in our paper.) The inset shows the evolution of smaller shells within such an object (Fig. 2 from Sanders 2008). The inner regions collapse first followed by outer shells. These oscillate and cross, mixing and ultimately forming a reasonable size galaxy – see Sanders’s Table 1 and also his Fig. 4 for the collapse times for objects of other masses. These early results are corroborated by Eappen et al. (2022), who further demonstrate that the details of feedback are not important in MOND, unlike LCDM.
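To get a feel for where timescales like these come from, here is a minimal sketch in the spirit of Felten’s calculation, though it is not his exact equation: a single shell enclosing a fixed 10^11 M☉ of baryons, launched with the Hubble flow shortly after decoupling and decelerated by a MONDian acceleration. The starting redshift, the expansion rate there, and the interpolation function ν are rough assumptions on my part; the careful treatments are in Felten (1984) and Sanders (1998, 2008).

```python
# Sketch: a shell enclosing M = 1e11 Msun of baryons, expanding with the
# Hubble flow after decoupling and decelerated by a MONDian acceleration.
# Units: kpc, Myr, Msun.  Starting values and nu() are illustrative assumptions.
import numpy as np
from scipy.integrate import solve_ivp

G  = 4.5e-12    # kpc^3 Msun^-1 Myr^-2
a0 = 3.9e-3     # kpc Myr^-2 (about 1.2e-10 m/s^2)
M  = 1e11       # baryonic mass enclosed by the shell (Msun)

def accel(r):
    """MONDian acceleration: ~Newtonian for gN >> a0, ~sqrt(gN*a0) for gN << a0."""
    gN = G * M / r**2
    nu = 0.5 + np.sqrt(0.25 + a0 / gN)    # 'simple' interpolation function
    return gN * nu

def rhs(t, y):
    r, v = y
    return [v, -accel(r)]

# Assumed start: the radius enclosing M at the mean baryon density at z ~ 200,
# moving with a rough expansion rate for a low-density (no-CDM) universe there.
z_i = 200.0
r_i = 1.6e3 / (1.0 + z_i)   # kpc (the comoving 1.6 Mpc from earlier)
H_i = 0.05                  # Myr^-1
v_i = H_i * r_i             # kpc/Myr

def turnaround(t, y):       # v = 0 marks maximum expansion
    return y[1]
turnaround.terminal = True

sol = solve_ivp(rhs, [0.0, 2000.0], [r_i, v_i], events=turnaround,
                max_step=1.0, rtol=1e-8)
print("maximum expansion at ~%.0f Myr, r_max ~ %.0f kpc" % (sol.t[-1], sol.y[0][-1]))
```

With these crude starting values the shell turns around within a couple hundred Myr at a radius of a few tens of kpc – the same order as the careful results quoted above, though somewhat shorter; the exact numbers are quite sensitive to the assumed starting velocity because the MOND potential is logarithmic. The qualitative point survives any such fiddling: a baryon-only sphere that would coast outward forever under Newtonian gravity turns around and collapses on a sub-Gyr timescale once the MONDian acceleration takes over.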

This is what JWST sees: galaxies that are already massive when the universe is just half a billion years old. I’m sure I should say more but I’m exhausted now and you may be too, so I’m gonna stop here by noting that in 1998, when Bob Sanders predicted that “Objects of galaxy mass are the first virialized objects to form (by z=10),” the contemporaneous prediction of LCDM was that “present-day disc [galaxies] were assembled recently (at z<=1)” and “there is nothing above redshift 7.” One of these predictions has been realized. It is rare in science that such a clear a priori prediction comes true, let alone one that seemed so unreasonable at the time, and which took a quarter century to corroborate.


*I am not quite this old: I was still an undergraduate in 1984. I hadn’t even decided to be an astronomer at that point; I certainly hadn’t started following the literature. The first time I heard of MOND was in a graduate course taught by Doug Richstone in 1988. He only mentioned it in passing while talking about dark matter, writing the equation on the board and saying maybe it could be this. I recall staring at it for a long few seconds, then shaking my head and muttering “no way.” I then completely forgot about it, not thinking about it again until it came up in our data for low surface brightness galaxies. I expect most other professionals have the same initial reaction, which is fair. The test of character comes when it crops up in their data, as it is doing now for the high redshift galaxy community.

Decision Trees & Philosophical Blunders

Decision Trees & Philosophical Blunders

Given recent developments in the long-running hunt for dark matter and the difficulty interpreting what this means, it seems like a good juncture to re-up* this:


The history of science is a decision tree. Vertices appear where we must take one or another branching. Sometimes, we take the wrong road for the right reasons.

A good example is the geocentric vs. heliocentric cosmology. The ancient Greeks knew that in many ways it made more sense for the earth to revolve around the sun than vice-versa. Yet they were very clever. Ptolemy and others tested for the signature of the earth’s orbit in the seasonal wobbling in the positions of stars, or parallax. If the earth is moving around the sun, nearby stars should appear to move on the sky as the earth moves from one side of the sun to the other. Try blinking back and forth between your left and right eyes to see this effect, noting how nearby objects appear to move relative to distant ones.

Problem is, Ptolemy did not find the parallax. Quite reasonably, he inferred that the earth stayed put. We know now that this was the wrong branch to choose, but it persisted as the standard world view for many centuries. It turns out that even the nearest stars are so distant that their angular parallax is tiny (the angle of parallax is inversely proportional to distance). Precision sufficient for measuring the parallax was not achieved until the 19th century, by which time astronomers were already convinced it must happen.

Ptolemy was probably aware of this possibility, though it must have seemed quite unreasonable to conjecture at that time that the stars could be so very remote. The fact was that parallax was not observed. Either the earth did not move, or the stars were ridiculously distant. Which sounds more reasonable to you?

So, science took the wrong branch. Once this happened, sociology kicked in. Generation after generation of intelligent scholars confirmed the lack of parallax until the opposing branch seemed so unlikely that it became heretical to even discuss. It is very hard to reverse back up the decision tree and re-assess what seems to be such a firm conclusion. It took the Copernican revolution to return to that ancient decision branch and try the other one.

Cosmology today faces a similar need to take a few steps back on the decision tree. The problem now is the issue of the mass discrepancy, typically attributed to dark matter. When it first became apparent that things didn’t add up when one applied the usual Law of Gravity to the observed dynamics of galaxies, there was a choice. Either lots of matter is present which happens to be dark, or the Law of Gravity has to be amended. Which sounds more reasonable to you?

Having traveled down the road dictated by the Dark Matter decision branch, cosmologists find themselves trapped in a web of circular logic entirely analogous to the famous Ptolemaic epicycles. Not many of them realize it yet, much less admit that this is what is going on. But if you take a few steps back up the decision branch, you find a few attempts to alter the equations of gravity. Most of these failed almost immediately, encouraging cosmologists down the dark matter path just as Ptolemy wisely chose a geocentric cosmology. However, one of these theories is not only consistent with the data, it actually predicts many important new results. This theory is known as MOND (MOdified Newtonian Dynamics). It was introduced in 1983 by Moti Milgrom of the Weizmann Institute in Israel.

MOND accurately describes the effective force law in galaxies based only on the observed stars and gas. What this means is unclear, but it clearly means something! It is conceivable that dark and luminous matter somehow interact to mimic the behavior stipulated by MOND. This is not expected, and requires a lot of epicyclic thinking to arrange. The more straightforward interpretation is that MOND is correct, and we took the wrong branch of the decision tree back in the ’70s.

MOND has dire implications for much modern cosmological thought which has developed symbiotically with dark matter. As yet, no one has succeeded in writing down a theory which encompasses both MOND and General Relativity. This leaves open many questions in cosmology that were thought to be solved, such as the expansion history of the universe. There is nothing a scientist hates to do more than unlearn what was thought to be well established. It is this sociological phenomenon that makes it so difficult to climb back up the decision tree to the faulty branching.

Once one returns and takes the correct branch, the way forward is not necessarily obvious. The host of questions which had been assigned seemingly reasonable explanations along the faulty branch must be addressed anew. And there will always be those incapable of surrendering the old world view irrespective of the evidence.

In my opinion, the new successes of MOND can not occur by accident. They are a strong sign that we are barking up the wrong tree with dark matter. A grander theory encompassing both MOND and General Relativity must exist, even if no one has as yet been clever enough to figure it out (few have tried).

These all combine to make life as a cosmologist interesting. Sometimes it is exciting. Often it is frustrating. Most of the time, “interesting” takes on the meaning implied by the old Chinese curse:

MAY YOU LIVE IN INTERESTING TIMES

Like it or not, we do.


*I wrote this in 2000. I leave it to the reader to decide how much progress has been made since then.

Why’d it have to be MOND?

Why’d it have to be MOND?

I want to take another step back in perspective from the last post to say a few words about what the radial acceleration relation (RAR) means and what it doesn’t mean. Here it is again:

The Radial Acceleration Relation over many decades. The grey region is forbidden – there cannot be less acceleration than caused by the observed baryons. The entire region above the diagonal line (yellow) is accessible to dark matter models as the sum of baryons and however much dark matter the model prescribes. MOND is the blue line.

This information was not available when the dark matter paradigm was developed. We observed excess motion, like flat rotation curves, and inferred the existence of extra mass. That was perfectly reasonable given the information available at the time. It is not now: we need to reassess as we learn more.

There is a clear organization to the data at both high and low acceleration. No objective observer with a well-developed physical intuition would look at this and think “dark matter.” The observed behavior does not follow from one force law plus some arbitrary amount of invisible mass. That could do literally anything in the yellow region above, and beyond the bounds of the plot, both upwards and to the left. Indeed, there is no obvious reason why the data don’t fall all over the place. One of the lingering, niggling concerns is the 5:1 ratio of dark matter:baryons – why is it in the same ballpark, when it could be pretty much anything? Why should the data organize in terms of acceleration? There is no reason for dark matter to do this.

Plausible dark matter models have been predicted to do a variety of things – things other than what we observe. The problem for dark matter is that real objects only occupy a tiny line through the vast region available to them in the plot above. This is a fine-tuning problem: why do the data reside only where they do when they could be all over the place? I recognized this as a problem for dark matter before I became aware$ of MOND. That it turns out that the data follow the line uniquely predicted* by MOND is just chef’s kiss: there is a fine-tuning problem for dark matter because MOND is the effective force law.
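For concreteness, the line in question can be written in one line of code. One commonly used form of the relation (the fitting function from the RAR papers, as I recall it) maps the acceleration expected from the baryons alone onto the observed acceleration; take the specific functional form as one convenient parameterization rather than a unique prescription.

```python
# Sketch: the radial acceleration relation as a one-line function.
# g_obs = g_bar / (1 - exp(-sqrt(g_bar/a0))) with a0 ~ 1.2e-10 m/s^2:
# Newtonian (g_obs ~ g_bar) at high acceleration, ~sqrt(g_bar*a0) at low.
import numpy as np

A0 = 1.2e-10  # m/s^2

def g_obs(g_bar, a0=A0):
    """Predicted total acceleration given the Newtonian acceleration of the baryons."""
    return g_bar / (1.0 - np.exp(-np.sqrt(g_bar / a0)))

for g_bar in [1e-8, 1e-9, 1e-10, 1e-11, 1e-12]:
    print("g_bar = %.0e m/s^2 -> g_obs = %.2e (mass discrepancy ~ %.1f)"
          % (g_bar, g_obs(g_bar), g_obs(g_bar) / g_bar))
```

The last column is the acceleration discrepancy: negligible above a0 and growing steadily below it, which is the organization by acceleration that the data exhibit.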

The argument against dark matter is that the data could reside anywhere in the yellow region above, but don’t. The argument against MOND is that a small portion of the data fall a little off the blue line. Arguing that such objects, be they clusters of galaxies or particular individual galaxies, falsify MOND while ignoring the fine-tuning problem faced by dark matter is a case of refusing to see the forest for a few outlying trees.%

So to return to the question posed in the title of this post, I don’t know why it had to be MOND. That’s just what we observe. Pretending dark matter does the same thing is a false presumption.


$I’d heard of MOND only vaguely, and, like most other scientists in the field, had paid it no mind until it reared its ugly head in my own data.

*I talk about MOND here because I believe in giving credit where credit is due. MOND predicted this; no other theory did so. Dark matter theories did not predict this. My dark matter-based galaxy formation theory did not predict this. Other dark matter-based galaxy formation theories (including simulations) continue to fail to explain this. Other hypotheses of modified gravity also did not predict what is observed. Who+ ordered this?

Modified Dynamics. Very dangerous. You go first.

Many people in the field hate MOND, often with an irrational intensity that has the texture of religion. It’s not as if I woke up one morning and decided to like MOND – sometimes I wish I had never heard of it – but disliking a theory doesn’t make it wrong, and ignoring it doesn’t make it go away. MOND and only MOND predicted the observed RAR a priori. So far, MOND and only MOND provides a satisfactory explanation thereof. We might not like it, but there it is in the data. We’re not going to progress until we get over our fear of MOND and cope with it. Imagining that it will somehow fall out of simulations with just the right baryonic feedback prescription is a form of magical thinking, not science.

MOND. Why’d it have to be MOND?

+Milgrom. Milgrom ordered this.


%I expect many cosmologists would argue the same in reverse for the cosmic microwave background (CMB) and other cosmological constraints. I have some sympathy for this. The fit to the power spectrum of the CMB seems too good to be an accident, and it points to the same parameters as other constraints. Well, mostly – the Hubble tension might be a clue that things could unravel, as if they haven’t already. The situation is not symmetric – where MOND predicted what we observe a priori with a minimum of assumptions, LCDM is an amalgam of one free parameter after another after another: dark matter and dark energy are, after all, auxiliary hypotheses we invented to save FLRW cosmology. When they don’t suffice, we invent more. Feedback is a single word that represents a whole Pandora’s box of extra degrees of freedom, and we can invent crazier things as needed. The result is a Frankenstein’s monster of a cosmology that we all agree is the same entity, but when we examine it closely the pieces don’t fit, and one cosmologist’s LCDM is not really the same as that of the next. They just seem to agree because they use the same words to mean somewhat different things. Simply agreeing that there has to be non-baryonic dark matter has not helped us conjure up detections of the dark matter particles in the laboratory, or given us the clairvoyance to explain# what MOND predicted a priori. So rather than agree that dark matter must exist because cosmology works so well, I think the appearance of working well is a chimera of many moving parts. Rather, cosmology, as we currently understand it, works if and only if non-baryonic dark matter exists in the right amount. That requires a laboratory detection to confirm.

#I have a disturbing lack of faith that a satisfactory explanation can be found.

A Response to Recent Developments Concerning the Gravitational Potential of the Milky Way

A Response to Recent Developments Concerning the Gravitational Potential of the Milky Way

In the series of recent posts I’ve made about the Milky Way, I missed an important reply made in the comments by Francois Hammer, one of the eminent scientists doing the work. I was on to writing the next post when he wrote it, and simply didn’t see it until yesterday. Dr. Hammer has some important things to say that are both illustrative of the specific topic and also of how science should work. I wanted to highlight his concerns with their own post, so, with his permission, I cut & paste his comments below, making this, in effect, a guest post by Francois Hammer.


There are two aspects we’d like to mention, as they may help to clarify part of the debate:
1- When saying “Gaia is great, but has its limits. It is really optimized for nearby stars (within a few kpc). Outside of that, the statistics… leave something to be desired. Is it safe to push out beyond 20 kpc?”, one may wonder whether the significance of Gaia data has been really understood.
In the Eilers et al. 2019 DR2 rotation curve, you may see points with small error bar up to 21-22 kpc. Gaia DR3 provides proper motion (systematics) uncertainties that are 2 times smaller than from Gaia DR2, so it can easily goes to 25 kpc or more.
The gain in quality for parallaxes is indeed smaller (30% gain). However, our results cannot be affected by distance estimates, since the large number of stars with parallax estimates in Wang et al. (2023) is giving the same rotation curve than that from (a lower number of) RGB stars with spectrophotometric distances (Ou et al. 2023), i.e., following Eilers et al. 2019. And both show a Keplerian decline, which was already noticeable with DR2 results from Eilers et al 2019. The latter authors said in their conclusions: “We do see a mild but significant deviation from the straightly declining circular velocity curve at R≈19–21 kpc of Δv≈15 km s^-1.” Our work using Gaia DR3 is nothing else than having a factor 2 better in accounting for systematics, and then being able to resolve what looks like a Keplerian decrease of the rotation curve.
We may also mention here that one of us participated to an unprecedented study of the kinematics LMC (Gaia Collaboration 2021, Luri’s paper), which is at 50 kpc. Unless one proves everything that people has done about the LMC and MW is wrong, and that the data are too uncertain to conclude anything about what happens at R=17-25 kpc, the above clarifications about Gaia accuracy are truly necessary for people reading your blog.
2- The argument that the result “violates a gazillion well-established constraints.” has to be taken with some caution, since otherwise, no one can do any progress in the field. In fact, the problem with many probes (so-called “satellites”) in the MW halo, is the fact that one cannot guarantee whether or not their orbits are at equilibrium with the MW potential. This is the reverse for the MW disk, for which stars are rotating in the disk, and, e.g., at 25 kpc, they have likely experience 7-8 orbits since the last merger (Gaia-Sausage-Enceladus), about 9 billion years ago. In other words, the mass provided by a system mostly at equilibrium, likely supersedes masses provided by systems that equilibrium conditions are not secured. An interesting example of this is given by globular clusters (GCs). If taken as an ensemble of 156 GCs (from Baumgardt catalog), just by removing Pyxis and Terzan 8, the MW mass inside 50 kpc passes from 5.5 to 2.1 × 10^11 Msun. This is likely because these two GCs may have come quite recently, meaning that their initial kinetic energy is still contributing to their total energy. A similar mass overestimate could happen if one accounts the LMC or Leo I as MW satellites at equilibrium with the MW potential. So we agree that near 25 kpc the disk of the MW may show signs of less-equilibrium, or sign of slightly less circular orbits due to different phenomenas discussed in the blog. However, why taking into account objects for which there is no proof they are at equilibrium as being the true measurements?
In our work, we have considerably focused in understanding and expanding the whole contribution of systematics, which may comes from Gaia data, but also from assumptions about stellar profile (i.e., deviations from exponential profiles), from the Sun distance and proper motion and so on. You may find a description in Ou et al.’s Figure 5 and Jiao et al.’s Figure 4, both showing that systematics cannot gives much more than 10% error on circular velocity estimates. This is an area where we are considered by the Local Group community as being quite conservative, and following Gaia specialists with who we have worked to deliver the EDR3 catalog of dwarf galaxy motions (Li, Hammer, Babusiaux et al 2021) up to about 150 kpc. Jiao et al. paper main contribution is the fair accounting of systematics, which analysis shows error bars that are much larger than those from other sources of errors especially in MW outskirts (see Fig. 2).

Francois Hammer, 24 September 2023

The image at top is Fig. 2 from Jiao et al. illustrating their assessment of the rotation curve and its systematic uncertainties.

Take it where?

Take it where?

I had written most of the post below the line before an exchange with a senior colleague who accused me of asking us to abandon General Relativity (GR). Anyone who read the last post knows that this is the opposite of true. So how does this happen?

Much of the field is mired in bad ideas that seemed like good ideas in the 1980s. There has been some progress, but the idea that MOND is an abandonment of GR I recognize as a misconception from that time. It arose because the initial MOND hypothesis suggested modifying the law of inertia without showing a clear path to how this might be consistent with GR. GR was built on the Equivalence Principle (EP), the equivalence¹ of gravitational charge with inertial mass. The original MOND hypothesis directly contradicted that, so it was a fair concern in 1983. It was not by 1984². I was still an undergraduate then, so I don’t know the sociology, but I get the impression that most of the community wrote MOND off at this point and never gave it further thought.

I guess this is why I still encounter people with this attitude, that someone is trying to rob them of GR. It feels like we’re always starting at square one, like there has been zero progress in forty years. I hope it isn’t that bad, but I admit my patience is wearing thin.

I’m trying to help you. Don’t waste your entire career chasing phantoms.

What MOND does ask us to abandon is the Strong Equivalence Principle. Not the Weak EP, nor even the Einstein EP. Just the Strong EP. That’s a much more limited ask than abandoning all of GR. Indeed, all flavors of EP are subject to experimental test. The Weak EP has been repeatedly validated, but there is nothing about MOND that implies platinum would fall differently from titanium. Experimental tests of the Strong EP are less favorable.

I understand that MOND seems impossible. It also keeps having its predictions come true. This combination is what makes it important. The history of science is chock full of ideas that were initially rejected as impossible or absurd, going all the way back to heliocentrism. The greater the cognitive dissonance, the more important the result.


Continuing the previous discussion of UT, where do we go from here? If we accept that maybe we have all these problems in cosmology because we’re piling on auxiliary hypotheses to continue to be able to approximate UT with FLRW, what now?

I don’t know.

It’s hard to accept that we don’t understand something we thought we understood. Scientists hate revisiting issues that seem settled. Feels like a waste of time. It also feels like a waste of time continuing to add epicycles to a zombie theory, be it LCDM or MOND or the phoenix universe or tired light or whatever fantasy reality you favor. So, painful as it may be, one has to find a little humility to step back and take account of what we know empirically independent of the interpretive veneer of theory.

As I’ve said before, I think we do know that the universe is expanding and passed through an early hot phase that bequeathed us the primordial abundances of the light elements (BBN) and the relic radiation field that we observe as the cosmic microwave background (CMB). There’s a lot more to it than that, and I’m not going to attempt to recite it all here.

Still, to give one pertinent example, BBN only works if the expansion rate is as expected during the epoch of radiation domination. So whatever is going on has to converge to that early on. This is hardly surprising for UT since it was stipulated to contain GR in the relevant limit, but we don’t actually know how it does so until we work out what UT is – a tall order that we can’t expect to accomplish overnight, or even over the course of many decades without a critical mass of scientists thinking about it (and not being vilified by other scientists for doing so).

Another example is that the cosmological principle – that the universe is homogeneous and isotropic – is observed to be true in the CMB. The temperature is the same all over the sky to one part in 100,000. That’s isotropy. The temperature is tightly coupled to the density, so if the temperature is the same everywhere, so is the density. That’s homogeneity. So both of the assumptions made by the cosmological principle are corroborated by observations of the CMB.

The cosmological principle is extremely useful for solving the equations of GR as applied to the whole universe. If the universe has a uniform density on average, then the solution is straightforward (though it is rather tedious to work through to the Friedmann equation). If the universe is not homogeneous and isotropic, then it becomes a nightmare to solve the equations. One needs to know where everything was for all of time.

Starting from the uniform condition of the CMB, it is straightforward to show that the assumption of homogeneity and isotropy should persist on large scales up to the present day. “Small” things like galaxies go nonlinear and collapse, but huge volumes containing billions of galaxies should remain in the linear regime and these small-scale variations average out. One cubic Gigaparsec will have the same average density as the next as the next, so the cosmological principle continues to hold today.

Anyone spot the rub? I said homogeneity and isotropy should persist. This statement assumes GR. Perhaps it doesn’t hold in UT?

This aspect of cosmology is so deeply embedded in everything that we do in the field that it was only recently that I realized it might not hold absolutely – and I’ve been actively contemplating such a possibility for a long time. Shouldn’t have taken me so long. Felten (1984) realized right away that a MONDian universe would depart from isotropy by late times. I read that paper long ago but didn’t grasp the significance of that statement. I did absorb that in the absence of a cosmological constant (which no one believed in at the time), the universe would inevitably recollapse, regardless of what the density was. This seems like an elegant solution to the flatness/coincidence problem that obsessed cosmologists at the time. There is no special value of the mass density that provides an over/under line demarcating eternal expansion from eventual recollapse, so there is no coincidence problem. All naive MOND cosmologies share the same ultimate fate, so it doesn’t matter what we observe for the mass density.

MOND departs from isotropy for the same reason it forms structure fast: it is inherently non-linear. As well as predicting that big galaxies would form by z=10, Sanders (1998) correctly anticipated the size of the largest structures collapsing today (things like the local supercluster Laniakea) and the scale of homogeneity (a few hundred Mpc if there is a cosmological constant). Pretty much everyone who looked into it came to similar conclusions.

But MOND and cosmology, as we know it in the absence of UT, are incompatible. Where LCDM encompasses both cosmology and the dynamics of bound systems (dark matter halos³), MOND addresses the dynamics of low acceleration systems (the most common examples being individual galaxies) but says nothing about cosmology. So how do we proceed?

For starters, we have to admit our ignorance. From there, one has to assume some expanding background – that much is well established – and ask what happens to particles responding to a MONDian force-law in this background, starting from the very nearly uniform initial condition indicated by the CMB. From that simple starting point, it turns out one can get a long way without knowing the details of the cosmic expansion history or the metric that so obsess cosmologists. These are interesting things, to be sure, but they are aspects of UT we don’t know and can manage without to some finite extent.

For one, the thermal history of the universe is pretty much the same with or without dark matter, with or without a cosmological constant. Without dark matter, structure can’t get going until after thermal decoupling (when the matter is free to diverge thermally from the temperature of the background radiation). After that happens, around z = 200, the baryons suddenly find themselves in the low acceleration regime, newly free to respond to the nonlinear force of MOND, and structure starts forming fast, with the consequences previously elaborated.

But what about the expansion history? The geometry? The big questions of cosmology?

Again, I don’t know. MOND is a dynamical theory that extends Newton. It doesn’t address these questions. Hence the need for UT.

I’ve encountered people who refuse to acknowledge⁴ that MOND gets predictions like z=10 galaxies right without a proper theory for cosmology. That attitude puts the cart before the horse. One doesn’t look for UT unless well motivated. That one is able to correctly predict 25 years in advance something that comes as a huge surprise to cosmologists today is the motivation. Indeed, the degree of surprise and the longevity of the prediction amplify the motivation: if this doesn’t get your attention, what possibly could?

There is no guarantee that our first attempt at UT (or our second or third or fourth) will work out. It is possible that in the search for UT, one comes up with a theory that fails to do what was successfully predicted by the more primitive theory. That just lets you know you’ve taken a wrong turn. It does not mean that a correct UT doesn’t exist, or that the initial prediction was some impossible fluke.

One candidate theory for UT is bimetric MOND. This appears to justify the assumptions made by Sanders’s early work, and provide a basis for a relativistic theory that leads to rapid structure formation. Whether it can also fit the acoustic power spectrum of the CMB as well as LCDM and AeST has yet to be seen. These things take time and effort. What they really need is a critical mass of people working on the problem – a community that enjoys the support of other scientists and funding institutions like NSF. Until we have that5, progress will remain grudgingly slow.


1The equivalence of gravitational charge and inertial mass means that the m in F=GMm/d² is identically the same as the m in F=ma. Modified gravity changes the former; modified inertia the latter.

2Bekenstein & Milgrom (1984) showed how a modification of Newtonian gravity could avoid the non-conservation issues suffered by the original hypothesis of modified inertia. They also outlined a path towards a generally covariant theory that Bekenstein pursued for the rest of his life. That he never managed to obtain a completely satisfactory version is often cited as evidence that it can’t be done, since he was widely acknowledged as one of the smartest people in the field. One wonders why he persisted if, as these detractors would have us believe, the smart thing to do was not even try.

3The data for galaxies do not look like the dark matter halos predicted by LCDM.

4I have entirely lost patience with this attitude. If a phenomenon is correctly predicted in advance in the literature, we are obliged as scientists to take it seriously+. Pretending that it is not meaningful in the absence of UT is just an avoidance strategy: an excuse to ignore inconvenient facts.

+I’ve heard eminent scientists describe MOND’s predictive ability as “magic.” This also seems like an avoidance strategy. I, for one, do not believe in magic. That it works as well as it does – that it works at all – must be telling us something about the natural world, not the supernatural.

5There does exist a large and active community of astroparticle physicists trying to come up with theories for what the dark matter could be. That’s good: that’s what needs to happen, and we should exhaust all possibilities. We should do the same for new dynamical theories.

What we have here is a failure to communicate

Kuhn noted that as paradigms reach their breaking point, there is a divergence of opinions between scientists about what the important evidence is, or what even counts as evidence. This has come to pass in the debate over whether dark matter or modified gravity is a better interpretation of the acceleration discrepancy problem. It sometimes feels like we’re speaking about different topics in a different language. That’s why I split the diagram version of the dark matter tree as I did:

Evidence indicating acceleration discrepancies in the universe and various flavors of hypothesized solutions.

Astroparticle physicists seem to be well-informed about the cosmological evidence (top) and favor solutions in the particle sector (left). As more of these people entered the field in the ’00s and began attending conferences where we overlapped, I recognized gaping holes in their knowledge about the dynamical evidence (bottom) and related hypotheses (right). This was part of my motivation to develop an evidence-based course1 on dark matter, to try to fill in the gaps in essential knowledge that were obviously being missed in the typical graduate physics curriculum. The course is popular on my campus, but not everyone in the field has the opportunity to take it. It seems that the chasm has continued to grow, though not for lack of attempts at communication.

Part of the problem is a phase difference: many of the questions that concern astroparticle physicists (structure formation is a big one) were addressed 20 years ago in MOND. There is also a difference in texture: dark matter rarely predicts things but always explains them, even if it doesn’t. MOND often nails some predictions but leaves other things unexplained – just a complete blank. So they’re asking questions that are either way behind the curve or as-yet unanswerable. Progress rarely follows a smooth progression in linear time.

I have become aware of a common construction among many advocates of dark matter to criticize “MOND people.” First, I don’t know what a “MOND person” is. I am a scientist who works on a number of topics, among them both dark matter and MOND. I imagine the latter makes me a “MOND person,” though I still don’t really know what that means. It seems to be a generic straw man. Users of this term consistently paint such a luridly ridiculous picture of what MOND people do or do not do that I don’t recognize it as a legitimate depiction of myself or of any of the people I’ve met who work on MOND. I am left to wonder, who are these “MOND people”? They sound very bad. Are there any here in the room with us?

I am under no illusions as to what these people likely say when I am out of earshot. Someone recently pointed me to a comment on Peter Woit’s blog that I would not have come across on my own. I am specifically named. Here is a screen shot:

From a reply to a post of Peter Woit on December 8, 2022. I omit the part about right-handed neutrinos as irrelevant to the discussion here.

This concisely pinpoints where the field2 is at, both right and wrong. Let’s break it down.

let me just remind everyone that the primary reason to believe in the phenomenon of cold dark matter is the very high precision with which we measure the CMB power spectrum, especially modes beyond the second acoustic peak

This is correct, but it is not the original reason to believe in CDM. The history of the subject matters, as we already believed in CDM quite firmly before any modes of the acoustic power spectrum of the CMB were measured. The original reasons to believe in cold dark matter were (1) that the measured, gravitating mass density exceeds the mass density of baryons as indicated by BBN, so there is stuff out there with mass that is not normal matter, and (2) large scale structure has grown by a factor of 10⁵ from the very smooth initial condition indicated by the early nondetection of fluctuations in the CMB, while normal matter (with normal gravity) can only get us a factor of 10³ (upper limits excluded the larger fluctuations that baryons alone would require before there was any detection at all). Structure formation additionally imposes the requirement that whatever the dark matter is moves slowly (hence “cold”) and does not interact via electromagnetism in order to evade making too big an impact on the fluctuations in the CMB (hence the need, again, for something non-baryonic).
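
To spell out the arithmetic behind point (2) – this is just the standard textbook estimate, nothing specific to this discussion – perturbations in ordinary matter cannot grow until the baryons are released by the photons, and thereafter, with normal gravity in a matter-dominated universe, they grow only in proportion to the expansion:

\[ \delta \propto a = \frac{1}{1+z} \quad\Rightarrow\quad \frac{\delta(z=0)}{\delta(z \simeq 10^{3})} \sim 10^{3}, \]

whereas getting from the smoothness of the CMB (δ ~ 10⁻⁵) to the nonlinear structure we see today (δ ≳ 1) requires growth by a factor of ~10⁵.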

When cold dark matter became accepted as the dominant paradigm, fluctuations in the CMB had not yet been measured. The absence of observable fluctuations at the larger level that baryons alone would require sufficed to indicate the need for CDM. This, together with Ωm > Ωb from BBN (which seemed the better of the two arguments at the time), was enough to convince me, along with most everyone else who was interested in the problem, that the answer had3 to be CDM.

This all happened before the first fluctuations were observed by COBE in 1992. By that time, we already believed firmly in CDM. The COBE observations caused initial confusion and great consternation – it was too much! We actually had a prediction from then-standard SCDM, and it had predicted an even lower level of fluctuations than what COBE observed. This did not cause us (including me) to doubt CDM (though there was one suggestion that it might be due to self-interacting dark matter); it seemed a mere puzzle to accommodate, not an anomaly. And accommodate it we did: the power in the large scale fluctuations observed by COBE is part of how we got LCDM, albeit only a modest part. A lot of younger scientists seem to have been taught that the power spectrum is some incredibly successful prediction of CDM when in fact it has surprised us at nearly every turn.

As I’ve related here before, it wasn’t until the end of the century that CMB observations became precise enough to provide a test that might distinguish between CDM and MOND. That test initially came out in favor of MOND – or at least in favor of the absence of dark matter: No-CDM, which I had suggested as a proxy for MOND. Cosmologists and dark matter advocates consistently omit this part of the history of the subject.

I had hoped that, when MOND cropped up in their data, cosmologists would experience the same surprise and doubt and reevaluation that I had experienced when it cropped up in my own. Instead, they went into denial, ignoring the successful prediction of the first-to-second peak amplitude ratio, or, worse, making up stories that it hadn’t happened. Indeed, the amplitude of the second peak was so surprising that the first paper to measure it omitted mention of it entirely. Just didn’t talk about it, let alone admit that “Gee, this crazy prediction came true!” as I had with MOND in LSB galaxies. Consequently, I decided that it was better to spend my time working on topics where progress could be made. This is why most of my work on the CMB predates “modes beyond the second peak” just as our strong belief in CDM also predated that evidence. Indeed, communal belief in CDM was undimmed when the modes defining the second peak were observed, despite the No-CDM proxy for MOND being the only hypothesis to correctly predict it quantitatively a priori.

That said, I agree with clayton’s assessment that

CDM thinks [the second and third peak] should be about the same

That this is the best evidence now is both correct and a much weaker argument than it is made out to be. It sounds really strong, because a formal fit to the CMB data requires a dark matter component at extremely high confidence – something approaching 100 sigma. This analysis assumes that dark matter exists. It does not contemplate that something else might cause the same effect, so all it really does, yet again, is demonstrate that General Relativity cannot explain cosmology when restricted to the material entities we concretely know to exist.

Given the timing, the third peak was not a strong element of my original prediction, as we did not yet have either a first or second peak. We hadn’t yet clearly observed peaks at all, so what I was doing was pretty far-sighted, but I wasn’t thinking that far ahead. However, the natural prediction for the No-CDM picture I was considering was indeed that the third peak should be lower than the second, as I’ve discussed before.

The No-CDM model (blue line) that correctly predicted the amplitude of the second peak fails to predict that of the third. Data from the Planck satellite; model line from McGaugh (2004); figure from McGaugh (2015).

In contrast, in CDM, the acoustic power spectrum of the CMB can do a wide variety of things:

Acoustic power spectra calculated for the CMB for a variety of cosmic parameters. From Dodelson & Hu (2002).

Given the diversity of possibilities illustrated here, there was never any doubt that a model could be fit to the data, provided that oscillations were observed as expected in any of the theories under consideration here. Consequently, I do not find fits to the data, though excellent, to be anywhere near as impressive as commonly portrayed. What does impress me is consistency with independent data.

What impresses me even more are a priori predictions. These are the gold standard of the scientific method. That’s why I worked my younger self’s tail off to make a prediction for the second peak before the data came out. In order to make a clean test, you need to know what both theories predict, so I did this for both LCDM and No-CDM. Here are the peak ratios predicted before there were data to constrain them, together with the data that came after:

The first-to-second (left) and second-to-third (right) peak amplitude ratios in LCDM (red) and No-CDM (blue) as predicted by Ostriker & Steinhardt (1995) and McGaugh (1999). Subsequent data as labeled.

The left hand panel shows the predicted amplitude ratio of the first-to-second peak, A1:2. This is the primary quantity that I predicted for both paradigms. There is a clear distinction between the predicted bands. I was not unique in my prediction for LCDM; the same thing can be seen in other contemporaneous models. All contemporaneous models. I was the only one who was not surprised by the data when they came in, as I was the only one who had considered the model that got the prediction right: No-CDM.

The same No-CDM model fails to correctly predict the second-to-third peak ratio, A2:3. It is, in fact, way off, while LCDM is consistent with A2:3, just as Clayton says. This is a strong argument against No-CDM, because No-CDM makes a clear and unequivocal prediction that it gets wrong. Clayton calls this

a stone-cold, qualitative, crystal clear prediction of CDM

which is true. It is also qualitative, so I call it weak sauce. LCDM could be made to fit a very large range of A2:3, but it had already got A1:2 wrong. We had to adjust the baryon density outside the allowed range in order to make it consistent with the CMB data. The generous upper limit that LCDM might conceivably have predicted in advance of the CMB data was A1:2 < 2.06, which is still clearly less than observed. For the first years of the century, the attitude was that BBN had been close, but not quite right – preference being given to the value needed to fit the CMB. Nowadays, BBN and the CMB are said to be in great concordance, but this is only true if one restricts oneself to deuterium measurements obtained after the “right” answer was known from the CMB. Prior to that, practically all of the measurements of the important light element isotopes – deuterium, helium, and lithium – concurred that the baryon density Ωbh² < 0.02, with the consensus value being Ωbh² = 0.0125 ± 0.0005. This is barely half the value subsequently required to fit the CMB (Ωbh² = 0.0224 ± 0.0001). But what’s a factor of two among cosmologists? (In this case, 4 sigma.)

Taking the data at face value, the original prediction of LCDM was falsified by the second peak. But, no problem, we can move the goal posts, in this case by increasing the baryon density. The successful prediction of the third peak only comes after the goal posts have been moved to accommodate the second peak. Citing only the comparable size of the third peak to the second while not acknowledging that the second was too small elides the critical fact that No-CDM got something right, a priori, that LCDM did not. No-CDM failed only after LCDM had already failed. The difference is that I acknowledge its failure while cosmologists elide this inconvenient detail. Perhaps the second peak amplitude is a fluke, but it was a unique prediction that was exactly nailed and remains true in all subsequent data. That’s a pretty remarkable fluke4.

LCDM wins ugly here by virtue of its flexibility. It has greater freedom to fit the data – any of the models in the figure of Dodelson & Hu will do. In contrast, No-CDM is the single blue line in my figure above, and nothing else. Plausible variations in the baryon density make hardly any difference: A1:2 has to have the value that was subsequently observed, and no other. It passed that test with flying colors. It flunked the subsequent test posed by A2:3. For LCDM this isn’t even a test, it is an exercise in fitting the data with a model that has enough parameters5 to do so.

There were a number of years at the beginning of the century during which the No-CDM prediction for the A1:2 was repeatedly confirmed by multiple independent experiments, but before the third peak was convincingly detected. During this time, cosmologists exhibited the same attitude that Clayton displays here: the answer has to be CDM! This warrants mention because the evidence Clayton cites did not yet exist. Clearly the as-yet unobserved third peak was not the deciding factor.

In those days, when No-CDM was the only correct a priori prediction, I would point out to cosmologists that it had got A1:2 right when I got the chance (which was rarely: I was invited to plenty of conferences in those days, but none on the CMB). The typical reaction was outright denial6, though sometimes it warranted a dismissive “That’s not a MOND prediction.” The latter is a fair criticism. No-CDM is just General Relativity without CDM. It represented MOND as a proxy under the ansatz that MOND effects had not yet manifested in a way that affected the CMB. I expected that this ansatz would fail at some point, and discussed some of the ways that this should happen. One that’s relevant today is that galaxies form early in MOND, so reionization happens early, and the amplitude of gravitational lensing effects is amplified. There is evidence for both of these now. What I did not anticipate was a departure from a damping spectrum around L=600 (between the second and third peaks). That’s a clear deviation from the prediction, which falsifies the ansatz but not MOND itself. After all, they were correct in noting that this wasn’t a MOND prediction per se, just a proxy. MOND, like Newtonian dynamics before it, is relativity adjacent, but not itself a relativistic theory. Neither can explain the CMB on its own. If you find that an unsatisfactory answer, imagine how I feel.

The same people who complained then that No-CDM wasn’t a real MOND prediction now want to hold MOND to the No-CDM predicted power spectrum and nothing else. First it was “the second peak isn’t a real MOND prediction!” Then, when the third peak was observed, it became “no way MOND can do this!” This isn’t just hypocritical, it is bad science. The obvious way to proceed would be to build on the theory that had the greater, if incomplete, predictive success. Instead, the reaction has consistently been to cherry-pick the subset of facts that precludes the need for serious rethinking.

This brings us to sociology, so let’s examine some more of what Clayton has to say:

Any talk I’ve ever seen by McGaugh (or more exotic modified gravity people like Verlinde) elides this fact, and they evade the questions when I put my hand up to ask. I have invited McGaugh to a conference before specifically to discuss this point, and he just doesn’t want to.

Now you’re getting personal.

There is so much to unpack here, I hardly know where to start. By saying I “elide this fact” about the qualitative equality of the second and third peak, Clayton is basically accusing me of lying by omission. This is pretty rich coming from a community that consistently elides the history I relate above, and never addresses the question raised by MOND’s predictive power.

Intellectual honesty is very important to me – being honest that MOND predicted what I saw in low surface brightness galaxies where my own prediction was wrong is what got me into this mess in the first place. It would have been vastly more convenient to pretend that I never heard of MOND (at first I hadn’t7) and act like that never happened. That would be a lie of omission. It would be a large lie, a lie that denies an important aspect of how the world works (what we’re supposed to uncover through science), the sort of lie that cleric Paul Gerhardt may have had in mind when he said

When a man lies, he murders some part of the world.

Paul Gerhardt

Clayton is, in essence, accusing me of exactly that by failing to mention the CMB in talks he has seen. That might be true – I give a lot of talks. He hasn’t been to most of them, and I usually talk about things I’ve done more recently than 2004. I’ve commented explicitly on this complaint before

There’s only so much you can address in a half hour talk. [This is a recurring problem. No matter what I say, there always seems to be someone who asks “why didn’t you address X?” where X is usually that person’s pet topic. Usually I could do so, but not in the time allotted.]

– so you may appreciate my exasperation at being accused of dishonesty by someone whose complaint is so predictable that I’ve complained before about people who make this complaint. I’m only human – I can’t cover all subjects for all audiences every time all the time. Moreover, I do tend to choose to discuss subjects that may be news to an audience, not simply reprise the greatest hits they want to hear. Clayton obviously knows about the third peak; he doesn’t need to hear about it from me. This is the scientific equivalent of shouting Freebird! at a concert.

It isn’t like I haven’t talked about it. I have been rigorously honest about the CMB, and certainly have not omitted mention of the third peak. Here is a comment from February 2003 when the third peak was only tentatively detected:

Page et al. (2003) do not offer a WMAP measurement of the third peak. They do quote a compilation of other experiments by Wang et al. (2003). Taking this number at face value, the second to third peak amplitude ratio is A2:3 = 1.03 +/- 0.20. The LCDM expectation value for this quantity was 1.1, while the No-CDM expectation was 1.9. By this measure, LCDM is clearly preferable, in contradiction to the better measured first-to-second peak ratio.

Or here, in March 2006:

the Boomerang data and the last credible point in the 3-year WMAP data both have power that is clearly in excess of the no-CDM prediction. The most natural interpretation of this observation is forcing by a mass component that does not interact with photons, such as non-baryonic cold dark matter.

There are lots like this, including my review for CJP and this talk given at KITP where I had been asked to explicitly take the side of MOND in a debate format for an audience of largely particle physicists. The CMB, including the third peak, appears on the fourth slide, which is right up front, not being elided at all. In the first slide, I tried to encapsulate the attitudes of both sides:

I did the same at a meeting in Stony Brook where I got a weird vibe from the audience; they seemed to think I was lying about the history of the second peak that I recount above. It will be hard to agree on an interpretation if we can’t agree on documented historical facts.

More recently, this image appears on slide 9 of this lecture from the cosmology course I just taught (Fall 2022):

I recognize this slide from talks I’ve given over the past five-plus years; this class is the most recent place I’ve used it, not the first. On some occasions I wrote “The 3rd peak is the best evidence for CDM.” I do not recall all of the talks in which I used it; many of them were likely colloquia for physics departments where one has more time to cover things than in a typical conference talk. Regardless, these apparently were not the talks that Clayton attended. Rather than it being the case that I never address this subject, the more conservative interpretation of the experience he relates would be that I happened not to address it in the small subset of talks that he happened to attend.

But do go off, dude: tell everyone how I never address this issue and evade questions about it.

I have been extraordinarily patient with this sort of thing, but I confess to a great deal of exasperation at the perpetual whataboutism that many scientists engage in. It is used reflexively to shut down discussion of alternatives: dark matter has to be right for this reason (here the CMB); nothing else matters (galaxy dynamics), so we should forbid discussion of MOND. Even if dark matter proves to be correct, the CMB is being used as an excuse to not address the question of the century: why does MOND get so many predictions right? Any scientist with a decent physical intuition who takes the time to rub two brain cells together in contemplation of this question will realize that there is something important going on that simply invoking dark matter does not address.

In fairness to McGaugh, he pointed out some very interesting features of galactic DM distributions that do deserve answers. But it turns out that there are a plurality of possibilities, from complex DM physics (self interactions) to unmodelable SM physics (stellar feedback, galaxy-galaxy interactions). There are no such alternatives to CDM to explain the CMB power spectrum.

Thanks. This is nice, and why I say it would be easier to just pretend to never have heard of MOND. Indeed, this succinctly describes the trajectory I was on before I became aware of MOND. I would prefer to be recognized for my own work – of which there is plenty – rather than for an association with a theory that is not my own – an association born of honestly reporting a surprising observation. I find my reception to be more favorable if I just talk about the data, but what is the point of taking data if we don’t test the hypotheses?

I have gone to great extremes to consider all the possibilities. There is not a plurality of viable possibilities; most of these things do not work. The specific ideas that are cited here are known not to work. SIDM appears to work because it has more free parameters than are required to describe the data. This is a common failing of dark matter models that simply fit some functional form to observed rotation curves. They can be made to fit the data, but they cannot be used to predict the way MOND can.

Feedback is even worse. Never mind the details of specific feedback models, and think about what is being said here: the observations are to be explained by “unmodelable [standard model] physics.” This is a way of saying that dark matter claims to explain the phenomena while declining to make a prediction. Don’t worry – it’ll work out! How can that be considered better than or even equivalent to MOND when many of the problems we invoke feedback to solve are caused by the predictions of MOND coming true? We’re just invoking unmodelable physics as a deus ex machina to make dark matter models look like something they are not. Are physicists straight-up asserting that it is better to have a theory that is unmodelable than one that makes predictions that come true?

Returning to the CMB, are there no “alternatives to CDM to explain the CMB power spectrum”? I certainly do not know how to explain the third peak with the No-CDM ansatz. For that we need a relativistic theory, like Bekenstein’s TeVeS. This initially seemed promising, as it solved the long-standing problem of gravitational lensing in MOND. However, it quickly became clear that it did not work for the CMB. Nevertheless, I learned from this that there could be more to the CMB oscillations than allowed by the simple No-CDM ansatz. The scalar field (an entity theorists love to introduce) in TeVeS-like theories could play a role analogous to cold dark matter in the oscillation equations. That means that what I thought was a killer argument against MOND – the exact same argument Clayton is making – is not as absolute as I had thought.

Writing down a new relativistic theory is not trivial. It is not what I do. I am an observational astronomer. I only play at theory when I can’t get telescope time.

Comic from the Far Side by Gary Larson.

So in the mid-00’s, I decided to let theorists do theory and started the first steps in what would ultimately become the SPARC database (it took a decade and a lot of effort by Jim Schombert and Federico Lelli in addition to myself). On the theoretical side, it also took a long time to make progress because it is a hard problem. Thanks to work by Skordis & Zlosnik on a theory they [now] call AeST8, it is possible to fit the acoustic power spectrum of the CMB:

CMB power spectrum observed by Planck fit by AeST (Skordis & Zlosnik 2021).

This fit is indistinguishable from that of LCDM.

I consider this to be a demonstration, not necessarily the last word on the correct theory, but hopefully an iteration towards one. The point here is that it is possible to fit the CMB. That’s all that matters for our current discussion: contrary to the steady insistence of cosmologists over the past 15 years, CDM is not the only way to fit the CMB. There may be other possibilities that we have yet to figure out. Perhaps even a plurality of possibilities. This is hard work and to make progress we need a critical mass of people contributing to the effort, not shouting rubbish from the peanut gallery.

As I’ve done before, I like to take the language used in favor of dark matter, and see if it also fits when I put on a MOND hat:

As a galaxy dynamicist, let me just remind everyone that the primary reason to believe in MOND as a physical theory and not some curious dark matter phenomenology is the very high precision with which MOND predicts, a priori, the dynamics of low-acceleration systems, especially low surface brightness galaxies whose kinematics were practically unknown at the time of its inception. There is a stone-cold, quantitative, crystal clear prediction of MOND that the kinematics of galaxies follows uniquely from their observed baryon distributions. This is something CDM profoundly and irremediably gets wrong: it predicts that the dark matter halo should have a central cusp9 that is not observed, and makes no prediction at all for the baryon distribution, let alone does it account for the detailed correspondence between bumps and wiggles in the baryon distribution and those in rotation curves. This is observed over and over again in hundreds upon hundreds of galaxies, each of which has its own unique mass distribution so that each and every individual case provides a distinct, independent test of the hypothesized force law. In contrast, CDM does not even attempt a comparable prediction: rather than enabling the real-world application to predict that this specific galaxy will have this particular rotation curve, it can only refer to the statistical properties of galaxy-like objects formed in numerical simulations that resemble real galaxies only in the abstract, and can never be used to directly predict the kinematics of a real galaxy in advance of the observation – an ability that has been demonstrated repeatedly by MOND. The simple fact that the simple formula of MOND is so repeatably correct in mapping what we see to what we get is to me the most convincing way to see that we need a grander theory that contains MOND and exactly MOND in the low acceleration limit, irrespective of the physical mechanism by which this is achieved.

That is stronger language than I would ordinarily permit myself. I do so entirely to show the danger of being so darn sure. I actually agree with clayton’s perspective in his quote; I’m just showing what it looks like if we adopt the same attitude with a different perspective. The problems pointed out for each theory are genuine, and the supposed solutions are not obviously viable (in either case). Sometimes I feel like we’re up the proverbial creek without a paddle. I do not know what the right answer is, and you should be skeptical of anyone who is sure that he does. Being sure is the sure road to stagnation.


1It may surprise some advocates of dark matter that I barely touch on MOND in this course, only getting to it at the end of the semester, if at all. It really is evidence-based, with a focus on the dynamical evidence as there is a lot more to this than seems to be appreciated by most physicists*. We also teach a course on cosmology, where students get the material that physicists seem to be more familiar with.

*I once had a colleague who was in a physics department ask how to deal with opposition to developing a course on galaxy dynamics. Apparently, some of the physicists there thought it was not a rigorous subject worthy of an entire semester course – an attitude that is all too common. I suggested that she pointedly drop the textbook of Binney & Tremaine on their desks. She reported back that this technique proved effective.

2I do not know who clayton is; that screen name does not suffice as an identifier. He claims to have been in contact with me at some point, which is certainly possible: I talk to a lot of people about these issues. He is welcome to contact me again, though he may wish to consider opening with an apology.

3One of the hardest realizations I ever had as a scientist was that both of the reasons (1) and (2) that I believed to absolutely require CDM assumed that gravity was normal. If one drops that assumption, as one must to contemplate MOND, then these reasons don’t require CDM so much as they highlight that something is very wrong with the universe. That something could be MOND instead of CDM, both of which are in the category of who ordered that?

4In the early days (late ’90s) when I first started asking why MOND gets any predictions right, one of the people I asked was Joe Silk. He dismissed the rotation curve fits of MOND as a fluke. There were 80 galaxies that had been fit at the time, which seemed like a lot of flukes. I mention this because one of the persistent myths of the subject is that MOND is somehow guaranteed to magically fit rotation curves. Erwin de Blok and I explicitly showed that this was not true in a 1998 paper.

5I sometimes hear cosmologists speak in awe of the thousands of observed CMB modes that are fit by half a dozen LCDM parameters. This is impressive, but we’re fitting a damped and driven oscillation – those thousands of modes are not all physically independent. Moreover, as can be seen in the figure from Dodelson & Hu, some free parameters provide more flexibility than others: there is plenty of flexibility in a model with dark matter to fit the CMB data. Only with the Planck data do minor tensions arise, the reaction to which is generally to add more free parameters, like decoupling the primordial helium abundance from that of deuterium, which is anathema to standard BBN so is sometimes portrayed as exciting, potentially new physics.

For some reason, I never hear the same people speak in equal awe of the hundreds of galaxy rotation curves that can be fit by MOND with a universal acceleration scale and a single physical free parameter, the mass-to-light ratio. Such fits are over-constrained, and every single galaxy is an independent test. Indeed, MOND can predict rotation curves parameter-free in cases where gas dominates so that the stellar mass-to-light ratio is irrelevant.
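
To make concrete what such a prediction involves, here is a minimal sketch in Python – my own illustration of the procedure, not code from any published analysis. It uses the widely adopted “simple” interpolation function and a0 ≈ 1.2×10⁻¹⁰ m/s² (about 3700 (km/s)²/kpc); in a real application the input baryonic curve V_bar(r) would come from the observed gas and stars rather than the made-up numbers below.

import numpy as np

A0 = 3.7e3  # a0 ~ 1.2e-10 m/s^2 expressed in (km/s)^2 per kpc

def mond_velocity(r_kpc, v_bar_kms, a0=A0):
    """Predict a rotation curve from the baryonic one using the 'simple'
    interpolation function nu(y) = 1/2 + sqrt(1/4 + 1/y), with y = g_bar/a0."""
    g_bar = v_bar_kms**2 / r_kpc            # Newtonian acceleration of the baryons
    nu = 0.5 + np.sqrt(0.25 + a0 / g_bar)   # boost factor; tends to 1 when g_bar >> a0
    g_obs = nu * g_bar                      # predicted total acceleration
    return np.sqrt(g_obs * r_kpc)           # predicted rotation speed in km/s

# Made-up, Keplerian-declining baryonic curve for a small gas-dominated galaxy:
r = np.array([2.0, 4.0, 8.0, 16.0])         # kpc
v_bar = np.array([30.0, 21.2, 15.0, 10.6])  # km/s
print(mond_velocity(r, v_bar))              # ~[55, 53, 52, 51] km/s: nearly flat

The declining Newtonian curve of the baryons maps onto a nearly flat predicted curve, with the only freedom (for a star-dominated galaxy) being the mass-to-light ratio that scales V_bar. That is the sense in which each observed rotation curve is an independent, over-constrained test.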

How should we weigh the relative merit of these very different lines of evidence?

6On a number of memorable occasions, people shouted “No you didn’t!” On a smaller number of those occasions (exactly two), they bothered to look up the prediction in the literature and then wrote to apologize and agree that I had indeed predicted that.

7If you read this paper, part of what you will see is me being confused about how low surface brightness galaxies could adhere so tightly to the Tully-Fisher relation. They should not. In retrospect, one can see that this was a MOND prediction coming true, but at the time I didn’t know about that; all I could see was that the result made no sense in the conventional dark matter picture.

Some while after we published that paper, Bob Sanders, who was at the same institute as my collaborators, related to me that Milgrom had written to him and asked “Do you know these guys?”

8Initially they had called it RelMOND, or just RMOND. AeST stands for Aether-Scalar-Tensor, and is clearly a step along the lines that Bekenstein made with TeVeS.

In addition to fitting the CMB, AeST retains the virtues of TeVeS in terms of providing a lensing signal consistent with the kinematics. However, it is not obvious that it works in detail – Tobias Mistele has a brand new paper testing it, and it doesn’t look good at extremely low accelerations. With that caveat, it significantly outperforms extant dark matter models.

There is an oft-repeated fallacy that comes up any time a MOND-related theory has a problem: “MOND doesn’t work therefore it has to be dark matter.” This only ever seems to hold when you don’t bother to check what dark matter predicts. In this case, we should but don’t detect the edge of dark matter halos at higher accelerations than where AeST runs into trouble.

9Another question I’ve posed for over a quarter century now is what would falsify CDM? The first person to give a straight answer to this question was Simon White, who said that cusps in dark matter halos were an ironclad prediction; they had to be there. Many years later, it is clear that they are not, but does anyone still believe this is an ironclad prediction? If it is, then CDM is already falsified. If it is not, then what would be? It seems like the paradigm can fit any surprising result, no matter how unlikely a priori. This is not a strength, it is a weakness. We can, and do, add epicycle upon epicycle to save the phenomenon. This has been my concern for CDM for a long time now: not that it gets some predictions wrong, but that it can apparently never get a prediction so wrong that we can’t patch it up, so we can never come to doubt it if it happens to be wrong.

Question of the Year (and a challenge)

Why does MOND get any predictions right?

That’s the question of the year, and perhaps of the century. I’ve been asking it since before this century began, and I have yet to hear a satisfactory answer. Most of the relevant scientific community has aggressively failed to engage with it. Even if MOND is wrong for [insert favorite reason], this does not relieve us of the burden to understand why it gets many predictions right – predictions that have repeatedly come as a surprise to the community that has declined to engage, preferring to ignore the elephant in the room.

It is not good enough to explain MOND phenomenology post facto with some contrived LCDM model. That’s mostly1 what is on offer, being born of the attitude that we’re sure LCDM is right, so somehow MOND phenomenology must emerge from it. We could just as [un]reasonably adopt the attitude that MOND is correct, so surely LCDM phenomenology happens as a result of trying to fit the standard cosmological model to some deeper, subtly different theory.

A basic tenet of the scientific method is that if a theory has its predictions come true, we are obliged to acknowledge its efficacy. This is how we know when to change our minds. This holds even if we don’t like said theory – especially if we don’t like it.

That was my experience with MOND. It correctly predicted the kinematics of the low surface brightness galaxies I was interested in. Dark matter did not. The data falsified all the models available at the time, including my own dark matter-based hypothesis. The only successful a priori predictions were those made by Milgrom. So what am I to conclude2 from this? That he was wrong?

Since that time, MOND has been used to make a lot of further predictions that came true. Predictions for specific objects that cannot even be made with LCDM. Post-hoc explanations abound, but are not satisfactory as they fail to address the question of the year. If LCDM is correct, why is it that MOND keeps making novel predictions that LCDM consistently finds surprising? This has happened over and over again.

I understand the reluctance to engage. It really ticked me off that my own model was falsified. How could this stupid theory of Milgrom’s do better for my galaxies? Indeed, how could it get anything right? I had no answer to this, nor does the wider community. It is not for lack of trying on my part; I’ve spent a lot of time3 building conventional dark matter models. They don’t work. Most of the models made by others that I’ve seen are just variations on models I had already considered and rejected as obviously unworkable. They might look workable from one angle, but they inevitably fail from some other, solving one problem at the expense of another.

Predictive success does not guarantee that a theory is right, but it does make it better than competing theories that fail for the same prediction. This is where MOND and LCDM are difficult to compare, as the relevant data are largely incommensurate. Where one is eloquent, the other tends to be muddled. However, it has been my experience that MOND more frequently reproduces the successes of dark matter than vice-versa. I expect this statement comes as a surprise to some, as it certainly did to me (see the comment line of astro-ph/9801102). The people who say the opposite clearly haven’t bothered to check2 as I have, or even to give MOND a real chance. If you come to a problem sure you know the answer, no data will change your mind. Hence:

A challenge: What would falsify the existence of dark matter?

If LCDM is a scientific theory, it should be falsifiable4. Dark matter, by itself, is a concept, not a theory: mass that is invisible. So how can we tell if it’s not there? Once we have convinced ourselves that the universe is full of invisible stuff that we can’t see or (so far) detect any other way, how do we disabuse ourselves of this notion, should it happen to be wrong? If it is correct, we can in principle find it in the lab, so its existence can be confirmed. But is it falsifiable? How?

That is my challenge to the dark matter community: what would convince you that the dark matter picture is wrong? Answers will vary, as it is up to each individual to decide for themself how to answer. But there has to be an answer. To leave this basic question unaddressed is to abandon the scientific method.

I’ll go first. Starting in 1985 when I was first presented evidence in a class taught by Scott Tremaine, I was as much of a believer in dark matter as anyone. I was even a vigorous advocate, for a time. What convinced me to first doubt the dark matter picture was the fine-tuning I had to engage in to salvage it. It was only after that experience that I realized that the problems I was encountering were caused by the data doing what MOND had predicted – something that really shouldn’t happen if dark matter is running the show. But the MOND part came after; I had already become dubious about dark matter in its own context.

Falsifiability is a question every scientist who works on dark matter needs to face. What would cause you to doubt the existence of dark matter? Nothing is not a scientific answer. Neither is it correct to assert that the evidence for dark matter is already overwhelming. That is a misstatement: the evidence for acceleration discrepancies is overwhelming, but these can be interpreted as evidence for either dark matter or MOND.

The important thing is to establish criteria by which you would change your mind. I changed my mind before: I am no longer convinced that the solution to the acceleration discrepancy has to be non-baryonic dark matter. I will change my mind again if the evidence warrants. Let me state, yet again, what would cause me to doubt that MOND is a critical element of said solution. There are lots of possibilities, as MOND is readily falsifiable. Three important ones are:

  1. MOND getting a fundamental prediction wrong;
  2. Detecting dark matter;
  3. Answering the question of the year.

None of these have happened yet. Just shouting “MOND is falsified already!” doesn’t make it so: the evidence has to be both clear and satisfactory. For example,

  1. MOND might be falsified by cluster data, but its apparent failure is not fundamental. There is a residual missing mass problem in the richest clusters, but there’s nothing in MOND that says we have to have detected all the baryons by now. Indeed, LCDM doesn’t fare better, just differently, with both theories suffering a missing baryon problem. The chief difference is that we’re willing to give LCDM endless mulligans but MOND none at all. Where the problem for MOND in clusters comes up all the time, the analogous problem in LCDM is barely discussed, and is not even recognized as a problem.
  2. A detection of dark matter would certainly help. To be satisfactory, it can’t be an isolated signal in a lone experiment that no one else can reproduce. If a new particle is detected, its properties have to be correct (e.g., it has the right mass density, etc.). As always, we must be wary of some standard model event masquerading as dark matter. WIMP detectors will soon reach the neutrino background accumulated from all the nuclear emissions of stars over the course of cosmic history, at which time they will start detecting weakly interacting particles as intended: neutrinos. Those aren’t the dark matter, but what are the odds that the first of those neutrino detections will be eagerly misinterpreted as dark matter?
  3. Finally, the question of the year: why does MOND get any prediction right? To provide a satisfactory answer to this, one must come up with a physical model that provides a compelling explanation for the phenomena and has the same ability as MOND to make novel predictions. Just building a post-hoc model to match the data, which is the most common approach, doesn’t provide a satisfactory, let alone a compelling, explanation for the phenomenon, and provides no predictive power at all. If it did, we could have predicted MOND-like phenomenology and wouldn’t have to build these models after the fact.

So far, none of these three things have been clearly satisfied. The greatest danger to MOND comes from MOND itself: the residual mass discrepancy in clusters, the tension in Galactic data (some of which favor MOND, others of which don’t), and the apparent absence of dark matter in some galaxies. While these are real problems, they are also of the scale that is expected in the normal course of science: there are always tensions and misleading tidbits of information; I personally worry the most about the Galactic data. But even if my first point is satisfied and MOND fails on its own merits, that does not make dark matter better.

A large segment of the scientific community seems to suffer a common logical fallacy: any problem with MOND is seen as a success for dark matter. That’s silly. One has to evaluate the predictions of dark matter for the same observation to see how it fares. My experience has been that observations that are problematic for MOND are also problematic for dark matter. The latter often survives by not making a prediction at all, which is hardly a point in its favor.

Other situations are just plain weird. For example, it is popular these days to cite the absence of dark matter in some ultradiffuse galaxies as a challenge to MOND, which they are. But neither does it make sense to have galaxies without dark matter in a universe made of dark matter. Such a situation can be arranged, but the circumstances are rather contrived and usually involve some non-equilibrium dynamics. That’s fine; that can happen on rare occasions, but disequilibrium situations can happen in MOND too (the claims of falsification inevitably assume equilibrium). We can’t have it both ways, permitting special circumstances for one theory but not for the other. Worse, some examples of galaxies that are claimed to be devoid of dark matter are as much a problem for LCDM as for MOND. A disk galaxy devoid of either can’t happen; we need something to stabilize disks.

So where do we go from here? Who knows! There are fundamental questions that remain unanswered, and that’s a good thing. There is real science yet to be done. We can make progress if we stick to the scientific method. There is more to be done than measuring cosmological parameters to the sixth place of decimals. But we have to start by setting standards for falsification. If there is no observation or experimental result that would disabuse you of your current belief system, then that belief system is more akin to religion than to science.


1There are a few ideas, like superfluid dark matter, that try to automatically produce MOND phenomenology. This is what needs to happen. It isn’t clear yet whether these ideas work, but reproducing the MOND phenomenology naturally is a minimum standard that has to be met for a model to be viable. Run of the mill CDM models that invoke feedback do not meet this standard. They can always be made to reproduce the data once observed, but not to predict it in advance as MOND does.


2There is a common refrain that “MOND fits rotation curves and nothing else.” This is a myth, plain and simple. A good, old-fashioned falsehood sustained by the echo chamber effect. (That’s what I heard!) Seriously: if you are a scientist who thinks this, what is your source? Did it come from a review of MOND, or from idle chit-chat? How many MOND papers have you read? What do you actually know about it? Ignorance is not a strong position from which to draw a scientific conclusion.


3Like most of the community, I have invested considerably more effort in dark matter than in MOND. Where I differ from much of the galaxy formation community* is in admitting when those efforts fail. There is a temptation to slap some lipstick on the dark matter pig and claim success just to go along to get along, but what is the point of science if that is what we do when we encounter an inconvenient result? For me, MOND has been an incredibly inconvenient result. I would love to be able to falsify it, but so far intellectual honesty forbids.

*There is a widespread ethos of toxic positivity in the galaxy formation literature, which habitually puts a more positive spin on results than is objectively warranted. I’m aware of at least one prominent school where students are taught “to be optimistic” and omit mention of caveats that might detract from a model’s reception. This is effective in a careerist sense, but antithetical to the scientific endeavor.


4The word “falsification” carries a lot of philosophical baggage that I don’t care to get into here. The point is that there must be a way to tell if a theory is wrong. If there is not, we might as well be debating the number of angels that can dance on the head of a pin.