Why’d it have to be MOND?

I want to take another step back in perspective from the last post to say a few words about what the radial acceleration relation (RAR) means and what it doesn’t mean. Here it is again:

The Radial Acceleration Relation over many decades. The grey region is forbidden – there cannot be less acceleration than caused by the observed baryons. The entire region above the diagonal line (yellow) is accessible to dark matter models as the sum of baryons and however much dark matter the model prescribes. MOND is the blue line.

This information was not available when the dark matter paradigm was developed. We observed excess motion, like flat rotation curves, and inferred the existence of extra mass. That was perfectly reasonable given the information available at the time. It is not now: we need to reassess as we learn more.

There is a clear organization to the data at both high and low acceleration. No objective observer with a well-developed physical intuition would look at this and think “dark matter.” The observed behavior does not follow from one force law plus some arbitrary amount of invisible mass. That could do literally anything in the yellow region above, and beyond the bounds of the plot, both upwards and to the left. Indeed, there is no obvious reason why the data don’t fall all over the place. One of the lingering, niggling concerns is the 5:1 ratio of dark matter:baryons – why is it in the same ballpark, when it could be pretty much anything? Why should the data organize in terms of acceleration? There is no reason for dark matter to do this.

Plausible dark matter models have been predicted to do a variety of things – things other than what we observe. The problem for dark matter is that real objects only occupy a tiny line through the vast region available to them in the plot above. This is a fine-tuning problem: why do the data reside only where they do when they could be all over the place? I recognized this as a problem for dark matter before I became aware$ of MOND. That it turns out that the data follow the line uniquely predicted* by MOND is just chef’s kiss: there is a fine-tuning problem for dark matter because MOND is the effective force law.
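To make the "tiny line" concrete, here is a minimal sketch of a fitting function commonly used for the RAR in the literature. The functional form and the value of the acceleration scale `G_DAGGER` are assumptions brought in for illustration; neither is spelled out in this post. At high acceleration the curve runs along the line of equality, and at low acceleration it approaches the square root of the product of the baryonic acceleration and the acceleration scale.

```python
import math

# Assumed acceleration scale in m/s^2 (illustrative; close to Milgrom's a0)
G_DAGGER = 1.2e-10

def rar_g_obs(g_bar, g_dagger=G_DAGGER):
    """Observed acceleration for a given baryonic acceleration.

    Assumed fitting form: g_obs = g_bar / (1 - exp(-sqrt(g_bar / g_dagger))).
    """
    x = math.sqrt(g_bar / g_dagger)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x
    return g_bar / (-math.expm1(-x))

if __name__ == "__main__":
    for g_bar in (1e-8, 1e-10, 1e-12):
        g_obs = rar_g_obs(g_bar)
        print(f"g_bar={g_bar:.1e}  g_obs={g_obs:.2e}  ratio={g_obs / g_bar:.2f}")
```

A dark matter model is free to land anywhere with `g_obs >= g_bar`; the point of the paragraph above is that real galaxies cluster around a single curve like this one.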

The argument against dark matter is that the data could reside anywhere in the yellow region above, but don’t. The argument against MOND is that a small portion of the data fall a little off the blue line. Arguing that such objects, be they clusters of galaxies or particular individual galaxies, falsify MOND while ignoring the fine-tuning problem faced by dark matter is a case of refusing to see the forest for a few outlying trees.%

So to return to the question posed in the title of this post, I don’t know why it had to be MOND. That’s just what we observe. Pretending dark matter does the same thing is a false presumption.


$I’d heard of MOND only vaguely, and, like most other scientists in the field, had paid it no mind until it reared its ugly head in my own data.

*I talk about MOND here because I believe in giving credit where credit is due. MOND predicted this; no other theory did so. Dark matter theories did not predict this. My dark matter-based galaxy formation theory did not predict this. Other dark matter-based galaxy formation theories (including simulations) continue to fail to explain this. Other hypotheses of modified gravity also did not predict what is observed. Who+ ordered this?

Modified Dynamics. Very dangerous. You go first.

Many people in the field hate MOND, often with an irrational intensity that has the texture of religion. It’s not as if I woke up one morning and decided to like MOND – sometimes I wish I had never heard of it – but disliking a theory doesn’t make it wrong, and ignoring it doesn’t make it go away. MOND and only MOND predicted the observed RAR a priori. So far, MOND and only MOND provides a satisfactory explanation thereof. We might not like it, but there it is in the data. We’re not going to progress until we get over our fear of MOND and cope with it. Imagining that it will somehow fall out of simulations with just the right baryonic feedback prescription is a form of magical thinking, not science.

MOND. Why’d it have to be MOND?

+Milgrom. Milgrom ordered this.


%I expect many cosmologists would argue the same in reverse for the cosmic microwave background (CMB) and other cosmological constraints. I have some sympathy for this. The fit to the power spectrum of the CMB seems too good to be an accident, and it points to the same parameters as other constraints. Well, mostly – the Hubble tension might be a clue that things could unravel, as if they haven’t already. The situation is not symmetric – where MOND predicted what we observe a priori with a minimum of assumptions, LCDM is an amalgam of one free parameter after another after another: dark matter and dark energy are, after all, auxiliary hypotheses we invented to save FLRW cosmology. When they don’t suffice, we invent more. Feedback is a single word that represents a whole Pandora’s box of extra degrees of freedom, and we can invent crazier things as needed. The result is a Frankenstein’s monster of a cosmology that we all agree is the same entity, but when we examine it closely the pieces don’t fit, and one cosmologist’s LCDM is not really the same as that of the next. They just seem to agree because they use the same words to mean somewhat different things. Simply agreeing that there has to be non-baryonic dark matter has not helped us conjure up detections of the dark matter particles in the laboratory, or given us the clairvoyance to explain# what MOND predicted a priori. So rather than agree that dark matter must exist because cosmology works so well, I think the appearance of working well is a chimera of many moving parts. Rather, cosmology, as we currently understand it, works if and only if non-baryonic dark matter exists in the right amount. That requires a laboratory detection to confirm.

#I have a disturbing lack of faith that a satisfactory explanation can be found.

It is not linear

I just got back from a visit to the Carnegie Institution of Washington where I gave a talk and saw some old friends. I was a postdoc at the Department of Terrestrial Magnetism (DTM) in the ’90s. DTM is so-named because in their early days they literally traveled the world mapping the magnetic field. When I was there, DTM+ had a small extragalactic astronomy group including Vera Rubin*, Francois Schweizer, and John Graham. Working there as a Carnegie Fellow gave me great latitude to pursue whatever science I wanted, with the benefit of discussions with these great astronomers. After my initial work on low surface brightness galaxies had brought MOND to my attention, much of the follow-up work checking all (and I do mean all) the other constraints was done there, ultimately resulting in the triptych of papers showing that the bulk of the evidence available at that time favored MOND over the dark matter interpretation.

When I joined the faculty at the University of Maryland in 1998, I saw the need to develop a graduate course on cosmology, which did not exist there at that time. I began to consider how cosmic structure might form in MOND, but was taken aback when Simon White asked me to referee a paper on the subject by Bob Sanders. He had found much of what I was finding, that there was no way to avoid an early burst of speedy galaxy formation. I had been scooped!

It has taken a quarter century to test our predictions, so any concern about who said what first seems silly now. Indeed, the bigger problem is informing people that these predictions were made at all. I had a huge eye roll last month when Physics Magazine came out with

February 12, 2024
NEWS FEATURE
JWST Sees More Galaxies than Expected
February 9, 2024

The new JWST observatory is revealing far more bright galaxies in the early Universe than anyone predicted, and astrophysicists have more than one explanation for the puzzle.

Physics Magazine

Far more bright galaxies in the early Universe than anyone predicted! Who could have predicted it? I guess I am anyone.

Joking aside, this is a great illustration of the inefficiency of scientific communication. I wrote a series of papers on the subject. I wasn’t alone; so did others. I gave talks about it. I’ve emphasized it in scientific reviews. My papers are frequently cited, ranking in the top 2% among the top 2% across all sciences. They’re cited by prominent cosmologists. Heck, I’ve even blogged about it. And yet, it comes as such a surprise that it couldn’t have possibly happened, to the extent that no one bothers to check what is in the literature. (There was a similar sociology around the prediction of the CMB second peak. It didn’t happen if we don’t look.)

So what did the Physics Magazine article talk about? More than one explanation, most of which are the conventionalist approaches we’ve talked about before – make star formation more efficient, or adjust the IMF (the mass spectrum with which stars form) to squeeze more UV photons out of fewer baryons. But there is also a paper by Sabti et al. that basically asserts “this can’t be happening!” which is exactly the point.

Sabti et al. ask whether one can boost the amplitude of structure formation in a way that satisfies both the new JWST observations and previous Hubble data. The answer is no:

We consider beyond-ΛCDM power-spectrum enhancements and show that any departure large enough to reproduce the abundance of ultramassive JWST candidates is in conflict with the HST data.

Sabti et al.

At first, this struck me as some form of reality denial, like an assertion that the luminosity density could not possibly exceed LCDM predictions, even though that is exactly what it is observed to do:

The integrated UV luminosity density as a function of redshift from Adams et al. (2023). The data exceed the expectation for z > 10, even with the goal posts in motion.

On a closer read, I realized my initial impression was wrong; they are making a much better argument. The star formation rate is what is really constrained by the UV luminosity, but if that is attributed to stellar mass, you can’t get there from here – even with some jiggering of structure formation. That appears to be correct, within the framework of their considerations. Yet an alteration of structure formation is exactly what led to the now-corroborated prediction of Sanders (1998), so something still seemed odd. Just how were they altering it?

It took a close read, but the issue is in their equation 3. They allow for more structure formation by increasing the amplitude. However, they maintain the usual linear growth rate. In effect, they boost the amplitude of the linear dashed line in the left panel below, while maintaining its shape:

The growth rate of structure in CDM (linear, at left) and MOND (nonlinear, at right).

This is strongly constrained at both higher and lower redshifts, so only a little boost in amplitude is possible, assuming linear growth. So what they’ve correctly shown is that the usual linear growth rate of LCDM cannot do what needs to be done. That just emphasizes my point: to get the rapid growth we observe in the narrow time range available above redshift ten, the rate of growth needs to be nonlinear.

It’s not linear from Star Trek DS9.

Nonlinearity is unavoidable in MOND – hence the prediction of big galaxies at high redshift. Nonlinearity is a bear to calculate, which is part of the reason nobody wants to go there. Tough nougies. They teach us in grad school that the early universe is simple. It is a mantra to many who work in the field. I’m sorry, did God promise this? I understand the reasons why the early universe should be simple in standard FLRW cosmology, but what if the universe we live in isn’t that? No one has standing to promise that the early universe is as simple as expected. That’s just a fairy tale cosmologists tell their young so they can sleep at night.


+DTM has since been merged with the Geophysical Laboratory to become the Earth and Planets Laboratory. These departments shared the Broad Branch Road campus but maintained a friendly rivalry in the soccer Mud Cup, so named because the first Mud Cup was played on a field that was such a quagmire that we all became completely covered in mud. It was great fun.

*Vera was always adamant that she was not a physicist, and yet a search returns the thumbnail

even though the Wikipedia article itself does not (at present) make this spurious “and physicist” assertion.

The evolution of the luminosity density

The results from the high redshift universe keep pouring in from JWST. It is a full time job, and then some, just to keep track. One intriguing aspect is the luminosity density of the universe at z > 10. I had not thought this to be problematic for LCDM, as it only depends on the overall number density of stars, not whether they’re in big or small galaxies. I checked this a couple of years ago, and it was fine. At that point we were limited to z < 10, so what about higher redshift?

It helps to have in mind the contrasting predictions of distinct hypotheses, so a quick reminder. LCDM predicts a gradual build up of the dark matter halo mass function that should presumably be tracked by the galaxies within these halos. MOND predicts that galaxies of a wide range of masses form abruptly, including the biggest ones. The big distinction I’ve focused on is the formation epoch of the most massive galaxies. These take a long time to build up in LCDM, which typically takes half a Hubble time (~7 billion years; z < 1) for a giant elliptical to build up half its final stellar mass. Baryonic mass assembly is considerably more rapid in MOND, so this benchmark can be attained much earlier, even within the first billion years after the Big Bang (z > 5).
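To put the redshift benchmarks above on a clock, here is a minimal sketch of the age of the universe as a function of redshift in a flat LCDM model, using the closed-form solution for matter plus a cosmological constant. The parameter values (`H0 = 70`, `OMEGA_M = 0.3`) are illustrative assumptions rather than fitted values, and radiation is neglected, which matters little at these redshifts.

```python
import math

# Assumed illustrative parameters for a flat LCDM cosmology
H0 = 70.0                   # Hubble constant in km/s/Mpc
OMEGA_M = 0.3               # matter density parameter
OMEGA_L = 1.0 - OMEGA_M     # flatness: Lambda makes up the rest

# 1 / (1 km/s/Mpc) expressed in Gyr
HUBBLE_TIME_UNIT_GYR = 977.8

def age_gyr(z):
    """Age of a flat matter + Lambda universe at redshift z, in Gyr."""
    hubble_time = HUBBLE_TIME_UNIT_GYR / H0
    arg = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return (2.0 / (3.0 * math.sqrt(OMEGA_L))) * hubble_time * math.asinh(arg)

if __name__ == "__main__":
    t0 = age_gyr(0.0)
    for z in (0.0, 1.0, 5.0, 10.0):
        t = age_gyr(z)
        print(f"z={z:4.1f}  age={t:5.2f} Gyr  lookback={t0 - t:5.2f} Gyr")
```

With these assumed parameters, z = 1 sits several billion years after the Big Bang, while z = 5 falls at a little over one billion years, which is the contrast drawn in the paragraph above.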

In both theories, astrophysics plays a role. How does gas condense into galaxies, and then form into stars? Gravity just tells us when we can assemble the mass, not how it becomes luminous. So the critical question is whether the high redshift galaxies JWST sees are indeed massive. They’re much brighter than had been predicted by LCDM, and in-line with the simplest evolutionary models one can build in MOND, so the latter is the more natural interpretation. However, it is much harder to predict how many galaxies form in MOND; it is straightforward to show that they should form fast but much harder to figure out how many do so – i.e., how many baryons get incorporated into collapsed objects, and how many get left behind, stranded in the intergalactic medium? Consequently, the luminosity density – the total number of stars, regardless of what size galaxies they’re in – did not seem like a straight-up test the way the masses of individual galaxies are.

It is not difficult to produce lots of stars at high redshift in LCDM. But those stars should be in many protogalactic fragments, not individually massive galaxies. As a reminder, here is the merger tree for a galaxy that becomes a bright elliptical at low redshift:

Merger tree from De Lucia & Blaizot 2007 showing the hierarchical build-up of massive galaxies from many protogalactic fragments.

At large lookback times, i.e., high redshift, galaxies are small protogalactic fragments that have not yet assembled into a large island universe. This happens much faster in MOND, so we expect that for many (not necessarily all) galaxies, this process is basically complete after a mere billion years or so, often less. In both theories, your mileage will vary: each galaxy will have its own unique formation history. Nevertheless, that’s the basic difference: big galaxies form quickly in MOND while they should still be little chunks at high z in LCDM.

The hierarchical formation of structure is a fundamental prediction of LCDM, so this is in principle a place it can break. That is why many people are following the usual script of blaming astrophysics, i.e., how stars form, not how mass assembles. The latter is fundamental while the former is fungible.

Gradual mass assembly is so fundamental that its failure would break LCDM. Indeed, it is so deeply embedded in the mental framework of people working on it that it doesn’t seem to occur to most of them to consider the possibility that it could work any other way. It simply has to work that way; we were taught so in grad school!

Here is a sketch of how structures grow over time under the influence of cold dark matter (left, from Schramm 1992) and MOND (right, from Sanders & McGaugh 2002; see also this further discussion). The slow linear growth of CDM (long-dashed line, left panel) is replaced by a rapid, nonlinear growth in MOND (solid lines at right; numbers correspond to different scales). Nonlinear growth moderates after cosmic expansion begins to accelerate (dashed vertical line in right panel).

A principal result in perturbation theory applied to density fluctuations in an expanding universe governed by General Relativity is that the growth rate of these proto-objects is proportional to the expansion rate of the universe – hence the linear long-dashed line in the left diagram. The baryons cannot match the observations by themselves because the universe has “only” expanded by a factor of a thousand since recombination while structure has grown by a factor of a hundred thousand. This was one of the primary motivations for inventing cold dark matter in the first place: it can grow at the theory-specified rate without obliterating the observed isotropy% of the microwave background. The skeletal structure of the cosmic web grows in cold dark matter first; the baryons fall in afterwards (short-dashed line in left panel).
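The arithmetic behind that shortfall is simple enough to write down. The sketch below uses the two factors quoted above and the linear-theory rule that the density contrast grows in proportion to the scale factor. The starting contrast `delta_initial` is an assumed illustrative value, not a number from this post, and the late-time slowdown of growth is ignored.

```python
# Factors quoted in the text
expansion_since_recombination = 1.0e3   # growth of the scale factor
required_structure_growth = 1.0e5       # growth needed in the density contrast

# Linear theory (matter-dominated): contrast grows in proportion to the scale factor
linear_growth = expansion_since_recombination
shortfall = required_structure_growth / linear_growth

# Assumed illustrative starting contrast for baryons at recombination
delta_initial = 1.0e-5
delta_today_linear = delta_initial * linear_growth

if __name__ == "__main__":
    print(f"linear growth available: {linear_growth:.0e}")
    print(f"shortfall factor:        {shortfall:.0f}")
    print(f"contrast today (linear): {delta_today_linear:.0e}")
```

Under these assumptions, baryons growing linearly on their own end up a factor of a hundred short and never reach a contrast of order unity, which is the gap that either cold dark matter or a modified force law has to close.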

That’s how it works. Without dark matter, structure cannot form, so we needn’t consider MOND nor speak of it ever again forever and ever, amen.

Except, of course, that isn’t necessarily how structure formation works in MOND. Like every other inference of dark matter, the slow growth of perturbations assumes that gravity is normal. If we consider a different force law, then we have to revisit this basic result. Exactly how structure formation works in MOND is not a settled subject, but the panel at right illustrates how I think it might work. One seemingly unavoidable aspect is that MOND is nonlinear, so the growth rate becomes nonlinear at some point, which is rather early on if Milgrom’s constant a0 does not evolve. Rather than needing dark matter to achieve a growth factor of 10^5, the boost to the force law enables baryons to do it on their own. That, in a nutshell, is why MOND predicts the early formation of big galaxies.

The same nonlinearity that makes structure grow fast in MOND also makes it very hard to predict the mass function. My nominal expectation is that the present-day galaxy baryonic mass function is established early and galaxies mostly evolve as closed boxes after that. Not exclusively; mergers still occasionally happen, as might continued gas accretion. In addition to the big galaxies that form their stars rapidly and eventually become giant elliptical galaxies, there will also be a population for which gas accretion is gradual^ enough to settle into a preferred plane and evolve into a spiral galaxy. But that is all gas physics and hand waving; for the mass function I simply don’t know how to extract a prediction from a nonlinear version of the Press-Schechter formalism. Somebody smarter than me should try that.

We do know how to do it for LCDM, at least for the dark matter halos, so there is a testable prediction there. The observable test depends on the messy astrophysics of forming stars and the shape of the mass function. The total luminosity density integrates over the shape, so is a rather forgiving test, as it doesn’t distinguish between stars in lots of tiny galaxies or the same number in a few big ones. Consequently, I hadn’t put much stock in it. But it is also a more robustly measured quantity, so perhaps it is more interesting than I gave it credit for, at least once we get to such high redshift that there should be hardly any stars.

Here is a plot of the ultraviolet (UV) luminosity density from Adams et al. (2023):

Fig. 8 from Adams et al. (2023) showing the integrated UV luminosity density as a function of redshift. UV light is produced by short-lived, massive stars, so makes a good proxy for the star formation rate (right axis).

The lower line is one+ a priori prediction of LCDM. I checked this back when JWST was launched, and saw no issues up to z=10, which remains true. However, the data now available at higher redshift are systematically higher than the prediction. The reason for this is simple, and the same as we’ve discussed before: dark matter halos are just beginning to get big; they don’t have enough baryons in them to make that many stars – at least not for the usual assumptions, or even just from extrapolating what we know quasi-empirically. (I say “quasi” because the extrapolation requires a theory-dependent rate of mass growth.)

The dashed line is what I consider to be a reasonable adjustment of the a priori prediction. Putting on an LCDM hat, it is actually closer to what I would have predicted myself because it has a constant star formation efficiency which is one of the knobs I prefer to fix empirically and then not touch. With that, everything is good up to z=10.5, maybe even to z=12 if we only believe* the data with uncertainties. But the bulk of the high redshift data sit well above the plausible expectation of LCDM, so grasping at the dangling ends of the biggest error bars seems unlikely to save us from a fall.

Ignoring the model lines, the data flatten out at z > 10, which is another way of saying that the UV luminosity function isn’t evolving when it should be. This redshift range does not correspond to much cosmic time, only a few hundred million years, so it makes the empiricist in me uncomfortable to invoke astrophysical causes. We have to imagine that the physical conditions change rapidly in the first sliver of cosmic time at just the right fine-tuned rate to make it look like there is no evolution at all, then settle down into a star formation efficiency that remains constant in perpetuity thereafter.

Harikane et al. (2023) also come to the conclusion that there is too much star formation going on at high redshift (their Fig. 18 is like that of Adams above, but extending all the way to z=0). Like many, they appear to be unaware that the early onset of structure formation had been predicted, so discuss three conventional astrophysical solutions as if these were the only possibilities. Translating from their section 6, the astrophysical options are:

  • Star formation was more efficient early on
  • Active Galactic Nuclei (AGN)
  • A top heavy IMF

This is a pretty broad view of the things that are being considered currently, though I’m sure people will add to this list as time goes forward and entropy increases.

Taking these in reverse order, the idea of a top heavy IMF is that preferentially more massive stars form early on. These produce more light per unit mass, so one gets brighter galaxies than predicted with a normal IMF. This is an idea that recurs every so often; see, e.g., section 3.1.1 of McGaugh (2004) where I discuss it in the related context of trying to get LCDM models to reionize the universe early enough. Supermassive Population III stars were all the rage back then. Changing the mass spectrum& with which stars form is one of those uber-free parameters that good modelers refrain from twiddling because it gives too much freedom. It is not a single knob so much as a Pandora’s box full of knobs that invoke a thousand Salpeter’s demons to do nearly anything at the price of understanding nothing.

As it happens, the option of a grossly variable IMF is already disfavored by the existence of quenched galaxies at z~3 that formed a normal stellar population at much higher redshift (z~11). These galaxies are composed of stars that have the spectral signatures appropriate for a population that formed with a normal IMF and evolved as stars do. This is exactly what we expect for galaxies that form early and evolve passively. Adjusting the IMF to explain the obvious makes a mockery of Occam’s razor.

AGN is a catchall term for objects like quasars that are powered by supermassive black holes at the centers of galaxies. This is a light source that is non-stellar, so we’ll overestimate the stellar mass if we mistake some light from AGN# as being from stars. In addition, we know that AGN were more prolific in the early universe. That in itself is also a problem: just as forming galaxies early is hard, so too is it hard to form enough supermassive black holes that early. So this just becomes the same problem in a different guise. Besides, the resolution of JWST is good enough to see where the light is coming from, and it ain’t all from unresolved AGN. Harikane et al. estimate that the AGN contribution is only ~10%.

That leaves the star formation efficiency, which is certainly another knob to twiddle. On the one hand, this is a reasonable thing to do, since we don’t really know what the star formation efficiency in the early universe was. On the other, we expected the opposite: star formation should, if anything, be less efficient at high redshift when the metallicity was low so there were few ways for gas to cool, which is widely considered to be a prerequisite for initiating star formation. Indeed, inefficient cooling was an argument in favor of a top-heavy IMF (perhaps stars need to be more massive to overcome higher temperatures in the gas from which they form), so these two possibilities contradict one another: we can have one but not both.

To me, the star formation efficiency is the most obvious knob to twiddle, but it has to be rather fine-tuned. There isn’t much cosmic time over which the variation must occur, and yet it has to change rapidly and in such a way as to precisely balance the non-evolving UV luminosity function against a rapidly evolving dark matter halo mass function. Once again, we’re in the position of having to invoke astrophysics that we don’t understand to make up for a manifest deficit in the behavior of dark matter. Funny how those messy baryons always cover up for that clean, pure, simple dark matter.

I could go on about these possibilities at great length (and did in the 2004 paper cited above). I decline to do so any farther: we keep digging this hole just to fill it again. These ideas only seem reasonable as knobs to turn if one doesn’t see any other way out, which is what happens if one has absolute faith in structure formation theory and is blissfully unaware of the predictions of MOND. So I can already see the community tromping down the familiar path of persuading ourselves that the unreasonable is reasonable, that what was not predicted is what we should have expected all along, that everything is fine with cosmology when it is anything but. We’ve done it so many times before.


Initially I had the cat stuffed back in the bag image here, but that was really for a theoretical paper that I didn’t quite make it to in this post. You’ll see it again soon. The observations discussed here are by observers doing their best in the context they know, so that image doesn’t seem appropriate here.


%We were convinced of the need for non-baryonic dark matter before any fluctuations in the microwave background were detected; their absence at the level of one part in a thousand sufficed.

^The assembly of baryonic mass can and in most cases should be rapid. It is the settling of gas into a rotationally supported structure that takes time – this is influenced by gas physics, not just gravity. Regardless of gravity theory, gas needs to settle gently into a rotating disk in order for spiral galaxies to exist.

+There are other predictions that differ in detail, but this is a reasonable representative of the basic expectation.

*This is not necessarily unreasonable, as there is some proclivity to underestimate the uncertainties. That’s a general statement about the field; I have made no attempt to assess how reasonable these particular error bars are.

&Top-heavy refers to there being more than the usual complement of bright but short-lived (tens of millions of years) stars. These stars are individually high mass (bigger than the sun), while long-lived stars are low mass. Though individually low in mass, these faint stars are very numerous. When one integrates over the population, one finds that most of the total stellar mass resides in the faint, low mass stars while much of the light is produced by the high mass stars. So a top heavy IMF explains high redshift galaxies by making them out of the brightest stars that require little mass to build. However, these stars will explode and go away on a short time scale, leaving little behind. If we don’t outright truncate the mass function (so many knobs here!), there could be some longer-lived stars leftover, but they must be few enough for the whole galaxy to fade to invisibility or we haven’t gained anything. So it is surprising, from this perspective, to see massive galaxies that appear to have evolved normally without any of these knobs getting twiddled.
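The mass-versus-light bookkeeping in this footnote can be checked with a few power-law integrals. The sketch below assumes a single power-law IMF with the Salpeter slope between 0.1 and 100 solar masses and a rough main-sequence scaling of luminosity as mass to the 3.5 power. All of these numbers are textbook-style assumptions chosen for illustration (the luminosity scaling in particular is too steep for the most massive stars), and the "top-heavy" slope of 1.5 is purely hypothetical.

```python
def power_integral(p, a, b):
    """Integral of m**p from a to b, valid for p != -1."""
    q = p + 1.0
    return (b ** q - a ** q) / q

def fraction_below(m_split, weight_exp, alpha=2.35, m_lo=0.1, m_hi=100.0):
    """Fraction of the IMF-weighted quantity m**weight_exp from stars below m_split.

    Assumes dN/dm proportional to m**(-alpha) between m_lo and m_hi (solar masses).
    weight_exp = 1 gives stellar mass; weight_exp = 3.5 gives a rough luminosity.
    """
    p = weight_exp - alpha
    return power_integral(p, m_lo, m_split) / power_integral(p, m_lo, m_hi)

# Salpeter-like slope (assumed): where do the mass and the light live?
mass_frac_low = fraction_below(1.0, weight_exp=1.0)
light_frac_high = 1.0 - fraction_below(1.0, weight_exp=3.5)

# Hypothetical top-heavy slope: far less mass ends up in long-lived stars
mass_frac_low_top_heavy = fraction_below(1.0, weight_exp=1.0, alpha=1.5)

if __name__ == "__main__":
    print(f"mass in stars below 1 Msun (Salpeter):   {mass_frac_low:.2f}")
    print(f"light from stars above 1 Msun:           {light_frac_high:.4f}")
    print(f"mass in stars below 1 Msun (top-heavy):  {mass_frac_low_top_heavy:.2f}")
```

Under these assumptions most of the mass sits below a solar mass while nearly all of the light comes from above it, and flattening the slope moves the mass budget sharply toward the short-lived stars.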

#Excess AGN were one possibility Jay Franck considered in his thesis as the explanation for what we then considered to be hyperluminous galaxies, but the known luminosity function of AGN up to z = 4 couldn’t explain the entire excess. With the clarity of hindsight, we were just seeing the same sorts of bright, early galaxies that JWST has brought into sharper focus.

Clusters of galaxies ruin everything

A common refrain I hear is that MOND works well in galaxies, but not in clusters of galaxies. The oft-unspoken but absolutely intended implication is that we can therefore dismiss MOND and never speak of it again. That’s silly.

Even if MOND is wrong, that it works as well as it does is surely telling us something. I would like to know why that is. Perhaps it has something to do with the nature of dark matter, but we need to engage with it to make sense of it. We will never make progress if we ignore it.

Like the seventeenth century cleric Paul Gerhardt, I’m a stickler for intellectual honesty:

“When a man lies, he murders some part of the world.”

Paul Gerhardt

I would extend this to ignoring facts. One should not only be truthful, but also as complete as possible. It does not suffice to be truthful about things that support a particular position while eliding unpleasant or unpopular facts* that point in another direction. By ignoring the successes of MOND, we murder a part of the world.

Clusters of galaxies are problematic in different ways for different paradigms. Here I’ll recap three ways in which they point in different directions.

1. Cluster baryon fractions

An unpleasant fact for MOND is that it does not suffice to explain the mass discrepancy in clusters of galaxies. When we apply Milgrom’s formula to galaxies, it explains the discrepancy that is conventionally attributed to dark matter. When we apply MOND to clusters, it comes up short. This has been known for a long time; here is a figure from the review Sanders & McGaugh (2002):

Figure 10 from Sanders & McGaugh (2002): (Left) the Newtonian dynamical mass of clusters of galaxies within an observed cutoff radius (rout) vs. the total observable mass in 93 X-ray-emitting clusters of galaxies (White et al. 1997). The solid line corresponds to Mdyn = Mobs (no discrepancy). (Right) the MOND dynamical mass within rout vs. the total observable mass for the same X-ray-emitting clusters. From Sanders (1999).

The Newtonian dynamical mass exceeds what is seen in baryons (left). There is a missing mass problem in clusters. The inference is that the difference is made up by dark matter – presumably the same non-baryonic cold dark matter that we need in cosmology.

When we apply MOND, the data do not fall on the line of equality as they should (right panel). There is still excess mass. MOND suffers a missing baryon problem in clusters.
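To make the comparison between the two panels concrete, here is a minimal sketch of how a Newtonian and a MOND dynamical mass differ for the same measured motion. It assumes a test particle on a circular orbit (a simplification; cluster masses are usually derived from the hot gas), the "simple" interpolation function μ(x) = x/(1+x), and a0 = 1.2 × 10⁻¹⁰ m/s². The speed and radius are hypothetical round numbers, not values from Sanders (1999), and this is not necessarily the procedure used there.

```python
# Physical constants in SI units.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # Milgrom's acceleration scale, m s^-2 (commonly quoted value)
MSUN = 1.989e30    # solar mass, kg
MPC = 3.0857e22    # megaparsec, m


def mu_simple(x):
    """The 'simple' MOND interpolation function (an assumed choice)."""
    return x / (1.0 + x)


def newtonian_mass(v_ms, r_m):
    """Mass (kg) needed to hold a circular orbit of speed v at radius r in Newtonian gravity."""
    return v_ms**2 * r_m / G


def mond_mass(v_ms, r_m):
    """Mass (kg) needed for the same orbit in MOND: M = mu(g/a0) * M_Newton."""
    g = v_ms**2 / r_m  # observed centripetal acceleration
    return mu_simple(g / A0) * newtonian_mass(v_ms, r_m)


# Hypothetical cluster-like numbers: 1000 km/s at 1 Mpc.
v, r = 1.0e6, 1.0 * MPC
print(f"Newtonian: {newtonian_mass(v, r) / MSUN:.2e} Msun")
print(f"MOND:      {mond_mass(v, r) / MSUN:.2e} Msun")
```

The MOND mass is always the smaller of the two, and the gap shrinks as the acceleration rises above a0. The cluster problem is that even this reduced mass still exceeds the baryons we can see.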

The common line of reasoning is that MOND still needs dark matter in clusters, so why consider it further? The whole point of MOND is to do away with the need for dark matter, so it is terrible if we need both! Why not just have dark matter?

This attitude was reinforced by the discovery of the Bullet Cluster. You can “see” the dark matter.

An artistic rendition of data for the Bullet Cluster. Pink represents hot X-ray emitting gas, blue the mass concentration inferred through gravitational lensing, and the optical image shows many galaxies. There are two clumps of galaxies that collided and passed through one another, getting ahead of the gas which shocked on impact and lags behind as a result. The gas of the smaller “bullet” subcluster shows a distinctive shock wave.

Of course, we can’t really see the dark matter. What we see is that the mass required by gravitational lensing observations exceeds what we see in normal matter: this is the same discrepancy that Zwicky first noticed in the 1930s. The important thing about the Bullet Cluster is that the mass is associated with the location of the galaxies, not with the gas.

The baryons that we know about in clusters are mostly in the gas, which outweighs the stars by roughly an order of magnitude. So we might expect, in a modified gravity theory like MOND, that the lensing signal would peak up on the gas, not the stars. That would be true, if the gas we see were indeed the majority of the baryons. We already knew from the first plot above that this is not the case.

I use the term missing baryons above intentionally. If one already believes in dark matter, then it is perfectly reasonable to infer that the unseen mass in clusters is the non-baryonic cold dark matter. But there is nothing about the data for clusters that requires this. There is also no reason to expect every baryon to be detected. So the unseen mass in clusters could just be ordinary matter that does not happen to be in a form we can readily detect.

I do not like the missing baryon hypothesis for clusters in MOND. I struggle to imagine how we could hide the required amount of baryonic mass, which is comparable to or exceeds the gas mass. But we know from the first figure that such a component is indicated. Indeed, the Bullet Cluster falls at the top end of the plots above, being one of the most massive objects known. From that perspective, it is perfectly ordinary: it shows the same discrepancy every other cluster shows. So the discovery of the Bullet was neither here nor there to me; it was just another example of the same problem. Indeed, it would have been weird if it hadn’t shown the same discrepancy that every other cluster showed. That it does so in a nifty visual is, well, nifty, but so what? I’m more concerned that the entire population of clusters shows a discrepancy than that this one nifty case does so.

The one new thing that the Bullet Cluster did teach us is that whatever the missing mass is, it is collisionless. The gas shocked when it collided, and lags behind the galaxies. Whatever the unseen mass is, it passed through unscathed, just like the galaxies. Anything with mass separated by lots of space will do that: stars, galaxies, cold dark matter particles, hard-to-see baryonic objects like brown dwarfs or black holes, or even massive [potentially sterile] neutrinos. All of those are logical possibilities, though none of them make a heck of a lot of sense.

As much as I dislike the possibility of unseen baryons, it is important to keep the history of the subject in mind. When Zwicky discovered the need for dark matter in clusters, the discrepancy was huge: a factor of a thousand. Some of that was due to having the distance scale wrong, but most of it was due to seeing only stars. It wasn’t until 40 some years later that we started to recognize that there was intracluster gas, and that it outweighed the stars. So for a long time, the mass ratio of dark to luminous mass was around 70:1 (using a modern distance scale), and we didn’t worry much about the absurd size of this number; mostly we just cited it as evidence that there had to be something massive and non-baryonic out there.

Really there were two missing mass problems in clusters: a baryonic missing mass problem, and a dynamical missing mass problem. Most of the baryons turned out to be in the form of intracluster gas, not stars. So the 70:1 ratio changed to 7:1. That’s a big change! It brings the ratio down from a silly number to something that is temptingly close to the universal baryon fraction of cosmology. Consequently, it becomes reasonable to believe that clusters are fair samples of the universe. All the baryons have been detected, and the remaining discrepancy is entirely due to non-baryonic cold dark matter.
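The arithmetic behind that shift is worth spelling out. A dark-to-luminous ratio of N:1 means the luminous matter is 1/(1+N) of the total. The sketch below compares the two ratios quoted above with a cosmic baryon fraction of roughly 0.16, which is an approximate value I am assuming here, not a number taken from this post.

```python
def luminous_fraction(dark_to_luminous):
    """Fraction of the total mass that is luminous, given a dark:luminous ratio of N:1."""
    return 1.0 / (1.0 + dark_to_luminous)


# Approximate universal baryon fraction (assumed round value for comparison).
COSMIC_BARYON_FRACTION = 0.16

for ratio in (70.0, 7.0):
    frac = luminous_fraction(ratio)
    print(f"{ratio:>4.0f}:1 -> luminous fraction {frac:.3f} "
          f"(cosmic value is about {COSMIC_BARYON_FRACTION})")
```

Seen this way, counting only the stars leaves clusters an order of magnitude short of the cosmic fraction, while including the gas brings them to within a few tens of percent of it.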

That’s a relatively recent realization. For decades, we didn’t recognize that most of the normal matter in clusters was in an as-yet unseen form. There had been two distinct missing mass problems. Could it happen again? Have we really detected all the baryons, or are there still more lurking there to be discovered? I think it unlikely, but fifty years ago I would also have thought it unlikely that there would have been more mass in intracluster gas than in stars in galaxies. I was ten years old then, but it is clear from the literature that no one else was seriously worried about this at the time. Heck, when I first read Milgrom’s original paper on clusters, I thought he was engaging in wishful thinking to invoke the X-ray gas as possibly containing a lot of the mass. Turns out he was right; it just isn’t quite enough.

All that said, I nevertheless think the residual missing baryon problem MOND suffers in clusters is a serious one. I do not see a reasonable solution. Unfortunately, as I’ve discussed before, LCDM suffers an analogous missing baryon problem in galaxies, so pick your poison.

It is reasonable to imagine in LCDM that some of the missing baryons on galaxy scales are present in the form of warm/hot circum-galactic gas. We’ve been looking for that for a while, and have had some success – at least for bright galaxies where the discrepancy is modest. But the problem gets progressively worse for lower mass galaxies, so it is a bold presumption that the check-sum will work out. There is no indication (beyond faith) that it will, and the fact that it gets progressively worse for lower masses is a direct consequence of the data for galaxies looking like MOND rather than LCDM.

Consequently, both paradigms suffer a residual missing baryon problem. One is seen as fatal while the other is barely seen.

2. Cluster collision speeds

A novel thing the Bullet Cluster provides is a way to estimate the speed at which its subclusters collided. You can see the shock front in the X-ray gas in the picture above. The morphology of this feature is sensitive to the speed and other details of the collision. In order to reproduce it, the two subclusters had to collide head-on, in the plane of the sky (practically all the motion is transverse), and fast. I mean, really fast: nominally 4700 km/s. That is more than the virial speed of either cluster, and more than you would expect from dropping one object onto the other. How likely is this to happen?

There is now an enormous literature on this subject, which I won’t attempt to review. It was recognized early on that the high apparent collision speed was unlikely in LCDM. The chances of observing the Bullet Cluster even once in an LCDM universe range from merely unlikely (~10%) to completely absurd (< 3 × 10⁻⁹). Answers this varied follow from what aspects of both observation and theory are considered, and the annoying fact that the distribution of collision speed probabilities plummets like a stone so that slightly different estimates of the “true” collision speed make a big difference to the inferred probability. The “true” gravitationally induced collision speed is itself somewhat uncertain because the hydrodynamics of the gas plays a role in shaping the shock morphology. There is a long debate about this which bores me; it boils down to it being easy to explain a few hundred extra km/s but hard to get up to the extra 1000 km/s that is needed.

At its simplest, we can imagine the two subclusters forming in the early universe, initially expanding apart along with the Hubble flow like everything else. At some point, their mutual attraction overcomes the expansion, and the two start to fall together. How fast can they get going in the time allotted?

The Bullet Cluster is one of the most massive systems in the universe, so there is lots of dark mass to accelerate the subclusters towards each other. The object is less massive in MOND, even spotting it some unseen baryons, but the long-range force is stronger. Which effect wins?
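A rough feel for this competition comes from a toy energy-conservation estimate: drop a test particle from rest at some starting separation and ask how fast it is moving at a smaller separation, using v² = 2∫g dr. This is only a sketch under loud assumptions: a point mass, no cosmic expansion, the "simple" MOND interpolation function (for which g = g_N/2 + √(g_N²/4 + g_N a0) in spherical symmetry), and hypothetical masses and separations that I made up for illustration. It is not the calculation of Angus & McGaugh (2008).

```python
import math

G = 6.674e-11      # m^3 kg^-1 s^-2
A0 = 1.2e-10       # Milgrom's acceleration scale, m s^-2
MSUN = 1.989e30    # kg
MPC = 3.0857e22    # m


def g_newton(mass_kg, r_m):
    """Newtonian acceleration toward a point mass."""
    return G * mass_kg / r_m**2


def g_mond_simple(mass_kg, r_m):
    """MOND acceleration for the 'simple' interpolation function (assumed choice)."""
    gn = g_newton(mass_kg, r_m)
    return 0.5 * gn + math.sqrt(0.25 * gn**2 + gn * A0)


def infall_speed(accel, mass_kg, r_start_m, r_end_m, steps=2000):
    """Speed (m/s) gained falling from rest at r_start to r_end.

    Integrates v^2 = 2 * integral of g dr with the trapezoid rule in ln(r).
    """
    u0, u1 = math.log(r_end_m), math.log(r_start_m)
    du = (u1 - u0) / steps
    total = 0.0
    for i in range(steps + 1):
        r = math.exp(u0 + i * du)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * accel(mass_kg, r) * r * du  # dr = r du
    return math.sqrt(2.0 * total)


# Hypothetical setup: fall from 5 Mpc to 0.5 Mpc.
r_start, r_end = 5.0 * MPC, 0.5 * MPC
m_conventional = 1.0e15 * MSUN   # hypothetical total mass including dark matter
m_mond = 2.0e14 * MSUN           # hypothetical, smaller mass for the MOND case

print(f"Newton: {infall_speed(g_newton, m_conventional, r_start, r_end) / 1e3:.0f} km/s")
print(f"MOND:   {infall_speed(g_mond_simple, m_mond, r_start, r_end) / 1e3:.0f} km/s")
```

For a fixed mass the MOND speed is always the larger, because the extra term falls off more slowly with distance. Whether it still wins once the MOND mass is reduced depends on the numbers chosen, which is why a proper calculation is needed.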

Gary Angus wrote a code to address this simple question both conventionally and in MOND. Turns out, the longer range force wins this race. MOND is good at making things go fast. While the collision speed of the Bullet Cluster is problematic for LCDM, it is rather natural in MOND. Here is a comparison:

A reasonable answer falls out of MOND with no fuss and no muss. There is room for some hydrodynamical+ high jinx, but it isn’t needed, and the amount that is reasonable makes an already reasonable result more reasonable, boosting the collision speed from the edge of the observed band to pretty much smack in the middle. This is the sort of thing that keeps me puzzled: much as I’d like to go with the flow and just accept that it has to be dark matter that’s correct, it seems like every time there is a big surprise in LCDM, MOND just does it. Why? This must be telling us something.

3. Cluster formation times

Structure is predicted to form earlier in MOND than in LCDM. This is true for both galaxies and clusters of galaxies. In his thesis, Jay Franck found lots of candidate clusters at redshifts higher than expected. Even groups of clusters:

Figure 7 from Franck & McGaugh (2016). A group of four protocluster candidates at z = 3.5 that are proximate in space. The left panel is the sky association of the candidates, while the right panel shows their galaxy distribution along the LOS. The ellipses/boxes show the search volume boundaries (Rsearch = 20 cMpc, Δz ± 20 cMpc). Three of these (CCPC-z34-005, CCPC-z34-006, CCPC-z35-003) exist in a chain along the LOS stretching ≤120 cMpc. This may become a supercluster-sized structure at z = 0.

The cluster candidates at high redshift that Jay found are more common in the real universe than seen with mock observations made using the same techniques within the Millennium simulation. Their velocity dispersions are also larger than comparable simulated objects. This implies that the amount of mass that has assembled is larger than expected at that time in LCDM, or that speeds are boosted by something like MOND, or nothing has settled into anything like equilibrium yet. The last option seems most likely to me, but that doesn’t reconcile matters with LCDM, as we don’t see the same effect in the simulation.

MOND also predicts the early emergence of the cosmic web, which would explain the early appearance of very extended structures like the “big ring.” While some of these very large scale structures are probably not real, there seem to be too many such things being noted for all of them to be an illusion. The knee-jerk denials of all such structures remind me of the shock cosmologists expressed at seeing quasars at redshifts as high as 4 (even 4.9! how can it be so?) or clusters at redshift 2, or the original CfA stickman, which surprised the bejeepers out of everybody in 1987. So many times I’ve been told that a thing can’t be true because it violates theoreticians’ preconceptions, only for it to prove to be true, and ultimately to be something the theorists expected all along.

Well, which is it?

So, as the title says, clusters ruin everything. The residual missing baryon problem that MOND suffers in clusters is both pernicious and persistent. It isn’t the outright falsification that many people presume it to be, but it sure don’t sit right. On the other hand, both the collision speeds of clusters (there are more examples now than just the Bullet Cluster) and the early appearance of clusters at high redshift are considerably more natural in MOND than in LCDM. So the data for clusters cut both ways. Taking the most obvious interpretation of the Bullet Cluster data, this one object falsifies both LCDM and MOND.

As always, the conclusion one draws depends on how one weighs the different lines of evidence. This is always an invitation to the bane of cognitive dissonance, accepting that which supports our pre-existing world view and rejecting the validity of evidence that calls it into question. That’s why we have the scientific method. It was application of the scientific method that caused me to change my mind: maybe I was wrong to be so sure of the existence of cold dark matter? Maybe I’m wrong now to take MOND seriously? That’s why I’ve set criteria by which I would change my mind. What are yours?


*In the discussion associated with a debate held at KITP in 2018, one particle physicist said “We should just stop talking about rotation curves.” Straight-up said it out loud! No notes, no irony, no recognition that the dark matter paradigm faces problems beyond rotation curves.

+There are now multiple examples of colliding cluster systems known. They’re a mess (Abell 520 is also called “the train wreck cluster”), so I won’t attempt to describe them all. In Angus & McGaugh (2008) we did note that MOND predicted that high collision speeds would be more frequent than in LCDM, and I have seen nothing to make me doubt that. Indeed, Xavier Hernandez pointed out to me that supersonic shocks like that of the Bullet Cluster are often observed, but basically never occur in cosmological simulations.

Quantifying the excess masses of high redshift galaxies

As predicted, JWST has been seeing big galaxies at high redshift. There are now many papers on the subject, ranging in tone from “this is a huge problem for LCDM” to “this is not a problem for LCDM at all” – a dichotomy that persists. So – which is it?

It will take some time to sort out. There are several important aspects to the problem, one of which is agreeing on what LCDM actually predicts. It is fairly robust at predicting the number density of dark matter halos as a function of mass. To convert that into something observable requires understanding how baryons find their way into dark matter halos at early times, how those baryons condense into regions dense enough to form stars, what kinds of stars form there (thus determining observables like luminosity and spectral shape), and what happens in the immediate aftermath of early star formation (does feedback shut off star formation quickly or does it persist or is there some distribution over all possibilities). This is what simulators attempt to do. It is hard work, and they are a long way from agreeing with each other. Many of them appear to be a long way from agreeing with themselves, as their answers continue to evolve – sometimes because of genuine progress in the simulations, but sometimes in response to unanticipated* observations.

Observationally, we can hope to measure at least two distinct things: the masses of individual galaxies, and their number density – how many galaxies of a particular mass exist in a specified volume. I have mostly been worried about the first issue, as it appears that individual galaxies got too big too fast. In the hierarchical galaxy formation picture of LCDM, the massive galaxies of today were assembled from many smaller protogalaxies over an extended period of time, so big galaxies don’t emerge until comparatively late: it takes about seven billion years for a typical bright galaxy to assemble half its stellar mass. (The same hierarchical process is accelerated in MOND so galaxies can already be massive at z ≈ 10.) That there are examples of individual galaxies that are already massive in the early universe is a big issue.

How common should massive galaxies be? There are always early adopters: objects that grew faster than average for their mass. We’ll always see the brightest things first, so is what we’re seeing with JWST typical? Or is it just the bright tip of an iceberg that is perfectly reasonable in LCDM? This is what the luminosity function helps quantify: just how many galaxies of each mass are there? If we can quantify that, then we can quantify how many we should be able to see with a given survey of specified depth and sky coverage.

Astronomers have been measuring the galaxy luminosity function for a long time. Doing so at high redshift has always been an ambition, so JWST is hardly the first telescope to contribute to the subject. It is the newest and best, opening a regime where we had hoped to see protogalactic fragments directly. Instead, the first thing we see are galaxies bigger than we expected (in LCDM). This has been building for some time, so let’s take a step back to provide some context.

Steinhardt et al. (2016) pointed out what they call “the impossibly early galaxy problem.” They quantified this by comparing the observed luminosity function in various redshift bins to that predicted by LCDM. We’ve discussed their Fig. 1 before, so let’s look now at their Fig. 4:

Figure 4 from Steinhardt et al. (2016). Colors correspond to redshift, with z = 4, 5, 6, 7, 8, 9, 10 being represented by blue, green, yellow, orange, red, pink, and black: there are fewer objects at high redshift where they’ve had less time to form. (a) Expected halo mass to monochromatic UV luminosity ratio, along with the required evolution to reconcile observation with theory, and (b) resulting corrected halo-mass functions derived as in Figure 1 with Mhalo/LUV evolving due to a stellar population starting at low metallicity at z = 12 and aging along the star-forming main sequence, as described in Section 4.1.1. Such a model would be reasonable given observational constraints, but cannot produce agreement between measured UV luminosity functions and simulated halo-mass functions.

In a perfect model, the points (data) would match the lines (theory) of the same color (redshift). This is not the case – observed galaxies are persistently brighter than predicted. Making that prediction is subject to all the conversions from dark matter mass to stellar mass to observed luminosity we mentioned above, so they also show what they expect and what it would take to match the data. These are the different lines in the top panel. There is a lot of discussion of this in their paper that boils down to these lines are different, and we cannot plausibly make them the same.

The word “plausibly” is doing a lot of work in that last sentence. Just because one set of authors finds something to be impossible (despite their best efforts) doesn’t mean anyone else accepts that. We usually don’t, even when we should**.

It occurs to me that not every reader may appreciate how redshift corresponds to cosmic time. So here is a graph for vanilla LCDM parameters:

The age-redshift relation for the vanilla LCDM cosmology. Everything at z > 3 is in the early universe, i.e., the first two billion years after the Big Bang. Everything at z > 10 is in the very early universe, the first half billion years when there has not yet been time to form big galaxies hierarchically.

Things don’t change much if we adopt slightly different cosmologies: this aspect of LCDM is well established. We used to think it would take at least a couple of billion years to form a big galaxy, so anything at z > 3 is surprising from that perspective. That’s not wrong, as there is an inverse relation between age and redshift, with increasing redshifts crammed into an ever smaller window of time. So while z = 5 and 10 sound very different, there is only about 700 Myr between them. That sounds like a long time to you and me, but the sun will only complete 3 orbits around the Galaxy in that time. This is why it is hard to imagine an object as large as the Milky Way starting from the near-homogeneity of the very early universe then having time to expand, decouple, recollapse, and form into something coherent so “quickly.” There is a much larger distance for material to travel than the current circumference of the solar circle, and not much time in which to do it. If we want to get it done by z = 10, there is less than 500 Myr available – about two orbits of the sun. We just can’t get there fast enough.
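For readers who want to check these numbers, here is a minimal sketch of the age-redshift relation using the closed-form expression for a flat universe with matter and a cosmological constant (radiation neglected, which matters little at these redshifts). The parameter values H0 = 70 km/s/Mpc and Ωm = 0.3 are my assumption of what "vanilla" means, and the 230 Myr period for the sun's orbit around the Galaxy is likewise an assumed round number.

```python
import math

H0 = 70.0                      # Hubble constant, km/s/Mpc (assumed vanilla value)
OMEGA_M = 0.3                  # matter density parameter (assumed vanilla value)
OMEGA_L = 1.0 - OMEGA_M        # flat universe: the rest is the cosmological constant
HUBBLE_TIME_GYR = 977.8 / H0   # 1/H0 in Gyr (1 km/s/Mpc corresponds to 977.8 Gyr)
SUN_ORBIT_GYR = 0.23           # assumed Galactic orbital period of the sun


def age_gyr(z):
    """Age of a flat matter + Lambda universe at redshift z, in Gyr."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return (2.0 / (3.0 * math.sqrt(OMEGA_L))) * HUBBLE_TIME_GYR * math.asinh(x)


for z in (0, 3, 5, 10):
    print(f"z = {z:>2}: age = {age_gyr(z):.2f} Gyr")

gap = age_gyr(5) - age_gyr(10)
print(f"z = 10 to z = 5: {gap * 1000:.0f} Myr, "
      f"about {gap / SUN_ORBIT_GYR:.1f} solar orbits")
```

Changing H0 or Ωm within the range people argue about shifts these ages by modest amounts, which is the sense in which this aspect of LCDM is well established.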

We’ve quickly become jaded to the absurdly high redshifts revealed by JWST, but there’s not much difference in cosmic time between these seemingly ever higher redshifts. Very early epochs were already being probed before JWST; JWST just brings them into excruciating focus. To provide some historical perspective about what “high redshift” means, here is a quote from Schramm (1992). The full text is behind a paywall, so I’ll just quote a relevant paragraph:

Pushing the opposite direction from the “zone of mystery” epoch [the dark ages] between the background radiation and the existence of objects at high redshift is the discovery of objects at higher and higher redshift. The higher the redshift of objects found, the harder it is to have the slow growth of Figure 5 [SCDM] explain their existence. Some high redshift objects can be dismissed as statistical fluctuations if the bulk of objects still formed late. In the last year, the number of quasars with redshifts > 4 has gone to 30, with one having a redshift as large as 4.9… While such constraints are not yet a serious problem for linear growth models, eventually they might be.

David Schramm, 1992

Here we have a cosmologist already concerned 30 years ago that objects exist at z > 4. Crazy, that! Back then, the standard model was SCDM; one of the reasons to switch to LCDM was to address exactly this problem. That only buys us a couple of billion years, so now we’re smack up against the same problem all over again, just shifted to higher redshift. Some people are even invoking statistical fluctuations: same as it ever was.

Consequently, a critical question is how common these massive galaxies are. Sure, massive galaxies exist before we expected them. But are they just statistical fluctuations? This is a question we can address with the luminosity function.

Here is the situation just before JWST was launched. Yung et al. (2019) made a good faith effort to establish a prior: they made predictions for what JWST would see. This is how science is supposed to work. In the figure below, I compare that to what was known (Stefanon et al. 2021) from the Spitzer Space Telescope, in many ways the predecessor to JWST:

Figure 4 from McGaugh (2024). The number density Φ of galaxies as a function of their stellar mass 𝑀∗, color coded by redshift with 𝑧=6, 7, 8, 9, 10 in dark blue, light blue, green, orange, and red, respectively. The left panel shows predicted stellar mass functions [lines] with the corresponding data [circles]. The right panel shows the ratio of the observed-to-predicted density of galaxies. There is a clear excess of massive galaxies at high redshifts.

If you just look at the mass functions in the left panel, things look pretty good. This is one of the dangers of the logarithmic plots necessary to illustrate the large dynamic range of astronomical data: large differences may look small in log-log space. So I also plot the ratio of densities at right. There one can see a clear excess in the number density of high mass galaxies. There are nearly an order of magnitude more 10¹⁰ M☉ galaxies than expected at z ≈ 8!
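The point about logarithmic axes is simple arithmetic: an offset of Δ dex on a log plot is a factor of 10^Δ in the quantity itself. The sketch below uses made-up log densities purely to illustrate the conversion; they are not read off the figure.

```python
def density_ratio(log_phi_observed, log_phi_predicted):
    """Observed-to-predicted number density ratio from log10 densities."""
    return 10.0 ** (log_phi_observed - log_phi_predicted)


# Hypothetical offsets between data and model on a log10 axis.
for offset_dex in (0.1, 0.3, 0.5, 0.9):
    ratio = density_ratio(-4.0, -4.0 - offset_dex)
    print(f"offset of {offset_dex:.1f} dex -> factor of {ratio:.1f}")
```

On an axis that spans many decades, an offset of half a dex is easy to overlook by eye, yet it already amounts to a factor of about three.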

For technical reasons I don’t care to delve into, it is difficult to get the volume estimate right when constructing the luminosity function. So I can imagine there might be some systematic effects to scale the ratio up or down. That wouldn’t do anything to explain the bump at high masses, and it is rather harder to get the shape wrong, especially at the bright end. The faint end of the luminosity function is the hard part!

The Spitzer data already probes the early universe, before JWST reported results. As those have come in, it has started to be possible to construct luminosity functions at very high redshift. Here are some measurements from Harikane et al. (2023), Finkelstein et al. (2023), and Robertson et al. (2023) together with revised predictions from Yung et al. (2024).

Figure 5 from McGaugh (2024). The number density of galaxies as a function of their rest-frame ultraviolet absolute magnitude observed by JWST, a proxy for stellar mass at high redshift. The left panel shows predicted luminosity functions [lines], color coded by redshift: blue, green, orange, red for 𝑧=9, 11, 12, 14, respectively. Data in the corresponding redshift bins are shown as squares, circles, and triangles. The right panel shows the ratio of the observed-to-predicted density of galaxies. The observed luminosity function barely evolves, in contrast to the prediction of substantial evolution as the first dark matter halos assemble. There is a large excess of bright galaxies at the highest redshifts observed.

Again, we see that there is an excess of bright galaxies at the highest redshifts.

As we look to progressively higher redshift, the light we observe shifts from familiar optical bands to the ultraviolet. This was a huge part of the motivation to build JWST: it is optimized for the infrared, so we can observe the redshifted optical light as our eyes would see it. Astronomers always push to the edge of what a telescope can do, so we start to run into this problem again at the highest redshifts. The mapping of ultraviolet light to stellar mass is one of the harder tasks in stellar population work, much less mapping that to a dark matter halo mass. So one promising conventional idea is “the up-scattering in UV luminosity of small, abundant halos due to stochastic, high efficiency star formation during the initial phases of galaxy formation (unregulated star formation)” discussed$ by Finkelstein et al. (2023). I like this because, yeah, we expect lots of little halos, star formation is messy and star formation during the first phases of galaxy formation should be especially messy, so it is easy to imagine little halos stochastically lighting up in the UV. But can this be enough?

It remains to be seen if the observations can be explained by this or any of the usual tweaks to star formation. It seems like a big gap to overcome. I mean, just look at the left panel of the final figure above. The observed UV luminosity function is barely evolving while the prediction of LCDM is dropping like a rock. Indeed, the mass functions get jagged, which may be an indication that there are so few dark matter halos in the simulation volume at the redshift in question that they do not suffice to define a smooth mass function. Indeed, Harikane et al. estimate a luminosity density of ∼7 × 10⁻⁶ mag⁻¹ Mpc⁻³ at 𝑧≈16. This point is omitted from the figure above because the corresponding prediction is NAN (not a number): there just isn’t anything big enough in the simulation to be so bright that early.

There is good reason to be skeptical of the data at 𝑧≈16. There is also good reason to be skeptical of the simulations. These have yet to converge, and even the predictions of the same group continue to evolve. Yung et al. (2019) did the right thing to establish a prior before JWST’s launch, but they haven’t stuck by it. The density of rare, massive galaxies has gone up by a factor of 2 to 2.5 in Yung et al. (2024). They attribute this to the use of higher resolution simulations, which may very well be correct: in order to track the formation of the earliest structures, you have to resolve them. But it doesn’t exactly inspire confidence that we actually know what LCDM predicts, and it feels like the same sort of moving of the goalposts that I’ve witnessed over and over and over and over and over again.

It always seems to come down to special pleading:

Please don’t falsify LCDM! I ran out of computer time. I had a disk crash. I didn’t have a grant for supercomputer time. My simulation data didn’t come back from the processing center. A senior colleague insisted on a rewrite. Someone stole my laptop. There was an earthquake, a terrible flood, locusts! It wasn’t my fault! I swear to God!

And the community loves LCDM, so we fall for it every time.

Oh, LCDM. LCDM, honey.

*There is always a danger in turning knobs to fit the data, and there are plenty of knobs to turn. So what LCDM predicts is a very serious matter – a theory is only as good as its prior, and we should be skeptical if theorists keep adjusting what that is in response to observations they failed to predict. This is true even in the absence of the existential threat of MOND which implies that the entire field of cosmological simulations is betrayed by its most fundamental assumptions, reducing it to “garbage in, garbage out.”

**When I first found that MOND had predicted our observations of low surface brightness galaxies where dark matter had not, despite my best efforts to make it work out, Ortwin Gerhard asked me if he “had to believe it.” My instant reaction was “this is astronomy, we don’t have to believe anything.” More seriously, this question applies on many levels: do we believe the data? do we believe the interpretation? is this the only possible conclusion? At the time, I had already tried very hard to fix it, and had failed. Still, I was willing to imagine there might be some way out, and maybe someone could figure out something I had not. Since that time, lots of other people have tried and also failed. This has not kept some of them from claiming that they have succeeded, but they never seem to address the underlying problem, and most of these models are mere variations on things I tried and dismissed as obviously unworkable.

Now, as then, what we are obliged to believe is the data, to the limits of their accuracy. The data have improved substantially, and at this point it is clear that the radial acceleration relation exists+ and has remarkably small intrinsic scatter. What we can always argue about is the interpretation: sure, it looks exactly like MOND, and MOND was the only theory that predicted it in advance, and we haven’t been able to come up with a reasonable explanation in terms of dark matter, but perhaps one can be found in some dark matter model that does not yet exist.

+Of course, there will always be some people behind the times and in a state of denial, as this subject seems to defeat rationalism in the hearts and minds of particle physicists in the same way Darwin still enrages some of the more religiously inclined.

$I directly quote Finkelstein’s coauthor Mauro Giavalisco from an email exchange.

Discussion of Dark Matter and Modified Gravity

To start the new year, I provide a link to a discussion I had with Simon White on Phil Halper’s YouTube channel:

In this post I’ll say little that we don’t talk about, but will add some background and mildly amusing anecdotes. I’ll also try addressing the one point of factual disagreement. For the most part, Simon & I entirely agree about the relevant facts; what we’re discussing is the interpretation of those facts. It was a perfectly civil conversation, and I hope it can provide an example for how it is possible to have a positive discussion about a controversial topic+ without personal animus.

First, I’ll comment on the title, in particular the “vs.” This is not really Simon vs. me. This is a discussion between two scientists who are trying to understand how the universe works (no small ask!). We’ve been asked to advocate for different viewpoints, so one might call it “Dark Matter vs. MOND.” I expect Simon and I could swap sides and have an equally interesting discussion. One needs to be able to do that in order to not simply be a partisan hack. It’s not like MOND is my theory – I falsified my own hypothesis long ago, and got dragged reluctantly into this business for honestly reporting that Milgrom got right what I got wrong.

For those who don’t know, Simon White is one of the preeminent scholars working on cosmological computer simulations, having done important work on galaxy formation and structure formation, the baryon fraction in clusters, and the structure of dark matter halos (Simon is the W in NFW halos). He was a Reader at the Institute of Astronomy at the University of Cambridge where we overlapped (it was my first postdoc) before he moved on to become the director of the Max Planck Institute for Astrophysics where he was mentor to many people now working in the field.

That’s a very short summary of a long and distinguished career; Simon has done lots of other things. I highlight these works because they came up at some point in our discussion. Davis, Efstathiou, Frenk, & White are the “gang of four” that was mentioned; around Cambridge I also occasionally heard them referred to as the Cold Dark Mafia. The baryon fraction of clusters was one of the key observations that led from SCDM to LCDM.

The subject of galaxy formation runs throughout our discussion. It is always a fraught issue how things form in astronomy. It is one thing to understand how stars evolve, once made; making them in the first place is another matter. Hard as that is to do in simulations, galaxy formation involves the extra element of dark matter in an expanding universe. Understanding how galaxies come to be is essential to predicting anything about what they are now, at least in the context of LCDM*. Both Simon and I have worked on this subject our entire careers, in very much the same framework if from different perspectives – by which I mean he is a theorist who does some observational work while I’m an observer who does some theory, not LCDM vs. MOND.

When Simon moved to Max Planck, the center of galaxy formation work moved as well – it seemed like he took half of Cambridge astronomy with him. This included my then-office mate, Houjun Mo. At one point I refer to the paper Mo & I wrote on the clustering of low surface brightness galaxies and how I expected them to reside in late-forming dark matter halos**. I often cite Mo, Mao, & White as a touchstone of galaxy formation theory in LCDM; they subsequently wrote an entire textbook about it. (I was already warning them then that I didn’t think their explanations of the Tully-Fisher relation were viable, at least not when combined with the effect we have subsequently named the diversity of rotation curve shapes.)

When I first began to worry that we were barking up the wrong tree with dark matter, I asked myself what could falsify it. It was hard to come up with good answers, and I worried it wasn’t falsifiable. So I started asking other people what would falsify cold dark matter. Most did not answer. They often had a shocked look like they’d never thought about it, and would rather not***. It’s a bind: no one wants it to be false, but most everyone accepts that for it to qualify as physical science it should be falsifiable. So it was a question that always provoked a record-scratch moment in which most scientists simply freeze up.

Simon was one of the first to give a straight answer to this question without hesitation, circa 1999. At that point it was clear that dark matter halos formed central density cusps in simulations; so those “cusps had to exist” in the centers of galaxies. At that point, we believed that to mean all galaxies. The question was complicated by the large dynamical contribution of stars in high surface brightness galaxies, but low surface brightness galaxies were dark matter dominated down to small radii. So we thought these were the ideal place to test the cusp hypothesis.

We no longer believe that. After many attempts at evasion, cold dark matter failed this test; feedback was invoked, and the goalposts started to move. There is now a consensus among simulators that feedback in intermediate mass galaxies can alter the inner mass distribution of dark matter halos. Exactly how this happens depends on who you ask, but it is at least possible to explain the absence of the predicted cusps. This goes in the right direction to explain some data, but by itself does not suffice to address the thornier question of why the distribution of baryons is predictive of the kinematics even when the mass is dominated by dark matter. This is why the discussion focused on the lowest mass galaxies where there hasn’t been enough star formation to drive the feedback necessary to alter cusps. Some of these galaxies can be described as having cusps, but probably not all. Thinking only in those terms elides the fact that MOND has a better record of predictive success. I want to know why this happens; it must surely be telling us something important about how the universe works.

The one point of factual disagreement we encountered had to do with the mass profile of galaxies at large radii as traced by gravitational lensing. It is always necessary to agree on the facts before debating their interpretation, so we didn’t press this far. Afterwards, Simon sent a citation to what he was talking about: this paper by Wang et al. (2016). In particular, look at their Fig. 4:

Fig. 4 of Wang et al. (2016). The excess surface density inferred from gravitational lensing for galaxies in different mass bins (data points) compared to mock observations of the same quantity made from within a simulation (lines). Looks like excellent agreement.

This plot quantifies the mass distribution around isolated galaxies to very large scales. There is good agreement between the lensing observations and the mock observations made within a simulation. Indeed, one can see an initial downward bend corresponding to the outer part of an NFW halo (the “one-halo term”), then an inflection to different behavior due to the presence of surrounding dark matter halos (the “two-halo term”). This is what Simon was talking about when he said gravitational lensing was in good agreement with LCDM.

I was thinking of a different, closely related result. I had in mind the work of Brouwer et al. (2021), which I discussed previously. Very recently, Dr. Tobias Mistele has made a revised analysis of these data. That’s worthy of its own post, so I’ll leave out the details, which can be found in this preprint. The bottom line is in Fig. 2, which shows the radial acceleration relation derived from gravitational lensing around isolated galaxies:

The radial acceleration relation from weak gravitational lensing (colored points) extending existing kinematic data (grey points) to lower acceleration corresponding to very large radii (~ 1 Mpc). The dashed line is the prediction of MOND. Looks like excellent agreement.

This plot quantifies the radial acceleration due to the gravitational potential of isolated galaxies to very low accelerations. There is good agreement between the lensing observations and the extrapolation of the radial acceleration relation predicted by MOND. There are no features until extremely low acceleration where there may be a hint of the external field effect. This is what I was talking about when I said gravitational lensing was in good agreement with MOND, and that the data indicated a single halo with an r⁻² density profile that extends far out where we ought to see the r⁻³ behavior of NFW.
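To put rough numbers on that extrapolation, here is a minimal Python sketch of one commonly used fitting function for the radial acceleration relation, g_obs = g_bar / (1 − exp(−√(g_bar/a0))). Both the functional form and the value a0 = 1.2e-10 m/s² are assumptions brought in for illustration; neither is quoted in the text above, the function name is mine, and this is not the analysis performed in the preprint.

```python
import math

A0 = 1.2e-10  # m/s^2; assumed acceleration scale, not quoted in this post


def rar_g_obs(g_bar, a0=A0):
    """Observed acceleration from the baryonic one via an RAR fitting function."""
    x = math.sqrt(g_bar / a0)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny
    return g_bar / -math.expm1(-x)


if __name__ == "__main__":
    # Illustrative baryonic accelerations spanning the high and low regimes
    for g_bar in (1e-8, 1e-10, 1e-12, 1e-14):
        g_obs = rar_g_obs(g_bar)
        deep_limit = math.sqrt(g_bar * A0)
        print(f"g_bar={g_bar:.1e}  g_obs={g_obs:.3e}  sqrt(g_bar*a0)={deep_limit:.3e}")
```

At high acceleration the function returns essentially g_bar; at low acceleration it tends to √(g_bar·a0). For a point mass that limit falls off as 1/r, which is the same radial dependence as the acceleration inside an r⁻² halo with a flat rotation curve.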

The two plots above use the same method applied to the same kind of data. They should be consistent, yet they seem to tell a different story. This is the point of factual disagreement Simon and I had, so we let it be. No point in arguing about the interpretation when you can’t agree on the facts.

I do not know why these results differ, and I’m not going to attempt to solve it here. I suspect it has something to do with sample selection. Both studies rely on isolated galaxies, but how do we define that? How well do we achieve the goal of identifying isolated galaxies? No galaxy is an island; at some level, there is always a neighbor. But is it massive enough to perturb the lensing signal, or can we successfully define samples of galaxies that are effectively isolated, so that we’re only looking at the gravitational potential of that galaxy and not that of it plus some neighbors? Looks like there is some work left to do to sort this out.

Stepping back from that, we agreed on pretty much everything else. MOND as a fundamental theory remains incomplete. LCDM requires us to believe that 95% of the mass-energy content of the universe is something unknown and perhaps unknowable. Dark matter has become familiar as a term but remains a mystery so long as it goes undetected in the laboratory. Perhaps it exists and cannot be detected – this is a logical possibility – but that would be the least satisfactory result possible: we might as well resume counting angels on the head of a pin.

The community has been working on these issues for a long time. I have been working on this for a long time. It is a big problem. There is lots left to do.


+I get a lot of kill the messenger from people who are not capable of discussing controversial topics without personal animus. A lot of it, inevitably, comes from people who assume they know more about the subject than I do but actually know much less. It is really amazing how many scientists equate me as a person with MOND as a theory without bothering to do any fact-checking. This is logical fallacy 101.

*The predictions of MOND are insensitive to the details of galaxy formation. Though of course an interesting question, we don’t need that in order to make predictions. All we need is the mass distribution that the kinematics respond to – we don’t need to know how it got that way. This is like the solar system, where it suffices to know Newton’s laws to compute orbits; we don’t need to know how the sun and planets formed. In contrast, one needs to know how a galaxy was assembled in LCDM to have any hope of predicting what its distribution of dark matter is and then using that to predict kinematics.

**The ideas Mo & I discussed thirty years ago have reappeared in the literature under the designation “assembly bias.”

***It was often accompanied by “why would you even ask that?” followed by a pained, constipated expression when they realized that every physical theory has to answer that question.

Holiday Concordance

Screw the Earth and its smoking habit. The end of 2023 approaches, so let’s talk about the whole universe, which is its own special kind of mess.

As I’ve related before, our current cosmology, LCDM, was established over the course of the 1990s through a steady drip, drip, drip of results in observational cosmology – what Peebles calls the classic cosmological tests. There were many contributory results; I’m not going to attempt to go through them all. Important among them were the age problem, the realization that the mass density was lower than expected, and that there was more structure on large scales+ than predicted. These established LCDM in the mid-1990s as the “concordance model” – the most probable flavor of FLRW universe. Here is the key figure from Ostriker & Steinhardt depicting the then-allowed region of the density parameter and Hubble constant:

The addition of the cosmological constant to the standard model – replacing SCDM with LCDM – was a brain-wrenching ordeal. Lambda had long been anathema, and there was a region in which an open universe was possible, even reasonable (stripes over shade in the figure above). Moreover, this strange new LCDM made the seemingly inconceivable prediction that not only was the universe expanding [itself the older mind-bender brought to us by Hubble (and Slipher and Lemaître)], the expansion rate should be accelerating. This sounded like crazy talk at the time, so it was greeted with great rejoicing when corroborated by observations of Type Ia supernovae.

A further prediction that could distinguish LCDM from then-viable open models was the geometry of the universe. Open models have a negative curvature (Ωk < 0, in which initially parallel light beams diverge) while the geometry in LCDM should be uniquely flat (Ωk = 0, in which initially parallel light beams remain parallel forever). Uniqueness is important, as it makes for a strong prediction, such as the location of the first peak of the acoustic power spectrum of the cosmic microwave background. In LCDM, this location was predicted to be ℓ ≈ 200 with little flexibility. For viable open models, it was more like ℓ ≈ 800 with a great deal of flexibility. The interpretation of the supernova data relied heavily on the assumption of a flat geometry, so I recall breathing a sigh of relief* when ℓ ≈ 200 was clearly observed.

Where are we now? I decided to reconstruct the Ostriker & Steinhardt plot with modern data. Here it is, with the axes swapped for reasons unrelated to this post. Deal with it.

The concordance region (white space) in the mass density-expansion rate space where the allowed regions (colored bands) of many constraints intersect. Illustrated constraints include a direct measurement of the Hubble constant, the age of the universe, the cluster baryon fraction, and large scale structure. Also shown are the best-fit values from CMB fits labeled by their date of publication (WMAP in orange; Planck in yellow). These follow the green line of constant ΩmH0³; combinations of parameters along the line are tolerable but regions away from it are strongly excluded.

There is lots to be said here. First, note the scale. As the accuracy of the data has improved, it has become possible to zoom in. My version of the figure is a wee postage stamp on that of Ostriker & Steinhardt. Nevertheless, the concordance region is in pretty much the same spot. Not exactly, of course; the biggest thing that has changed is that the age constraint is now completely incompatible with an open universe, so I haven’t bothered depicting it. Indeed, for the illustrated Hubble constant, the Hubble time (the age of a completely empty, “coasting” universe) is 13.4 Gyr. This is consistent with the illustrated age (13.80 ± 0.75 Gyr) only for Ωm ≈ 0, which is far off the left edge of the plot.
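The Hubble time quoted above is just 1/H0 with the units converted. A minimal Python sketch follows, assuming the illustrated Hubble constant is H0 = 73 km/s/Mpc; that value is not stated in this paragraph and is chosen here because it reproduces the quoted 13.4 Gyr. The function name is hypothetical.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in 10^9 Julian years


def hubble_time_gyr(h0_km_s_mpc):
    """Return 1/H0 in Gyr for a Hubble constant given in km/s/Mpc."""
    return KM_PER_MPC / h0_km_s_mpc / SECONDS_PER_GYR


if __name__ == "__main__":
    # 73 km/s/Mpc is an assumed value for the plotted H0
    print(f"Hubble time for H0 = 73: {hubble_time_gyr(73.0):.1f} Gyr")
```

Any matter content decelerates the early expansion and makes a Lambda-free universe younger than the Hubble time, which is why an age of 13.80 Gyr can only be reconciled with an open model at this H0 as Ωm approaches zero.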

Second, the CMB best-fit values follow a line of constant ΩmH0³. This is a deep trench in χ² space. The region outside this trench is strongly excluded – it’s kinda the grand canyon of cosmology. Even a little off, and you’re standing on the rim looking a long way down, knowing that a much better fit is only a short step away. Once you’re in the valley of χ², you must hunt along its bottom to find the true minimum. In the mid-’00s, a decade after Ostriker & Steinhardt, the best fit fell smack in the middle of the concordance region defined by completely independent data. It was this additional concordance that impressed me most, more than the detailed CMB fits themselves. This convinced the vast majority of scientists practicing in the field that it had to be LCDM and could only be LCDM and nothing but LCDM.
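To see how steep the trade-off along that trench is, here is a small Python sketch that holds Ωm·H0³ fixed and asks what mass density goes with a given Hubble constant. The reference point (H0 = 67.4 km/s/Mpc, Ωm = 0.315) is an assumption: Planck-like numbers supplied for illustration, not values read off the figure, and the function name is hypothetical.

```python
def omega_m_on_trench(h0, h0_ref=67.4, omega_m_ref=0.315):
    """Mass density on the line of constant Omega_m * H0**3 through a reference point.

    The default reference point is an assumed, Planck-like pair of values.
    """
    return omega_m_ref * (h0_ref / h0) ** 3


if __name__ == "__main__":
    for h0 in (67.4, 70.0, 73.0, 75.0):
        print(f"H0 = {h0:5.1f} km/s/Mpc  ->  Omega_m = {omega_m_on_trench(h0):.3f}")
```

Because of the cube, a shift of several percent in H0 along the trench demands a considerably larger fractional change in Ωm, so constraints on the mass density bite just as hard as direct measurements of the expansion rate.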

Since that time, the best-fit CMB value has wandered down the trench, away from the concordance region. These are the results that changed, not everything else. This temporal variation suggests a systematic in the interpretation of the CMB data rather than in the local distance scale.

I recall being at a conference (the Bright & Dark Universe in Naples in 2017) when the latest Planck results were announced. There was a palpable sense in the audience of having been whacked by a blunt object, like walking into a closed door you thought was open. We’d been doing precision cosmology for a long time and had settled on an answer informed by lots of independent lines of evidence, but they were telling us the One True answer was off over there. Not crazy far, but not consistent with the concordance we had come to expect. Worse, they had these crazy tiny error bars – not only were they getting an answer outside the concordance region, it was in tension with pretty much everything else. Not strong tension, but enough to make us all uncomfortable if not outright object. Indeed, there was a definite vibe that people were afraid to object. Not terrified, but nervous. Worried about being on the wrong side of the community. I get it. I know a lot about that.

People are remarkably talented at refashioning the past. Over the past five years, the Planck best-fit parameters have come to be synonymous with LCDM: all else is moot. Young scientists can be forgiven for not realizing it was ever otherwise, just as they might have been taught that cosmic acceleration was discovered by the supernova experiments totally out of the blue. These are convenient oversimplifications that elide so many pertinent events as to be tantamount to gaslighting. We refashion the past until there was never a serious controversy, then it seems strange that some of us think there still is. Sorry, not so fast, there definitely is: if you use the Planck value of the Hubble constant to estimate distances to local galaxies, you will get it wrong%, along with all distance-dependent quantities.

I’m old enough to remember a time when there was a factor of two uncertainty in the Hubble constant (50 vs. 100) and the age constraint was the most accurate one in this plot. Thanks to genuine progress, the Hubble constant is now the more precise. Consequently, of all the data one could plot above, this is the choice that matters most to where the concordance region falls. If I adopt our own estimate (H0 = 75.1 ± 2.3 km/s/Mpc), then the concordance band gets wider and slides up a little but is basically the same as above. If instead I adopt the lowest highly accurate value, H0 = 69.8 ± 0.8 km/s/Mpc, the window slides down, but not enough to be consistent with the Planck results. Indeed, it stays to the left of the CMB constraint, becoming inconsistent with the mass density as well as the expansion rate.
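For a rough sense of how discordant the two direct measurements quoted above are with each other, one can divide their difference by the quadrature sum of their errors. This is a minimal sketch that assumes independent Gaussian uncertainties, which is a simplification (distance-scale measurements often share calibration systematics); the function name is hypothetical.

```python
import math


def tension_sigma(value_a, err_a, value_b, err_b):
    """Difference between two measurements in units of their combined error.

    Assumes independent, Gaussian uncertainties added in quadrature.
    """
    return abs(value_a - value_b) / math.hypot(err_a, err_b)


if __name__ == "__main__":
    # The two H0 values quoted in the text, in km/s/Mpc
    sigma = tension_sigma(75.1, 2.3, 69.8, 0.8)
    print(f"75.1 +/- 2.3 vs 69.8 +/- 0.8: {sigma:.1f} sigma")
```

The same function can be pointed at any pair of constraints in the plot, with the caveat that a single number like this says nothing about which of the two results, if either, is affected by systematics.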

Dang it, now I want to make that plot. Processing… OK, here it is:

As above, but with a lower measurement of H0. Only the range of statistical uncertainty is illustrated, as a systematic uncertainty corresponds to a calibration error that slides H0 up and down – i.e., the exact situation being illustrated relative to the figure above. These two plots illustrate the range of outcomes that are possible from slightly discordant direct modern measurements of the Hubble constant; it is hard to go lower. Doing so doesn’t really help as it would just shift the tension from H0 to Ωm.

Yes, as I expected: the allowed range slides down but remains to the left of the green line. It is less inconsistent with the Planck H0, but that isn’t the only thing that matters. It is also inconsistent with the matter density. Indeed, it misses the CMB-allowed trench entirely. There is no allowed FLRW universe here.

These are only two parameters. Though arguably the most important, there are others, all of which matter to CMB fits. These are difficult to visualize simultaneously. We could, for starters, plot the baryon density as a third axis. If we did so, the concordance region would become a 3D object. It would also get squeezed, depending on what we think the baryon density actually is. Even restricting ourselves to the above-plotted constraints, there is some tension between the cluster baryon fraction and large scale structure constraint along the new third axis. I’m sure I could find in the literature more or less consistent values; this way the madness of cherry-picking lies.

There are many other constraints that could be added here. I’ve tried to stay consistent with the spirit of the original plot without making it illegible by overburdening it with lots and lots of data that all say pretty much the same thing. Nor do I wish to engage in cherry-picking. There are so many results out there that I’m sure one could find some combination that slides the allowed box this way or that – but only a little.

Whenever I’ve taught cosmology, I’ve made it a class exercise$ to investigate diagrams like this, with each student choosing an observational constraint to explore and champion. As a result, I’ve seen many variations on the above plots over the years, but since I first taught it in 1999 they’ve always been consistent with pretty much the same concordance region. It often happens that there is no concordance region; there are so many constraints that when you put them all together, nothing is left. We then debate which results to believe, or not, a process that has always been a part of the practice of cosmology.

We have painted ourselves into a corner. The usual interpretation is that we have painted ourselves into the correct corner: we live in this strange LCDM universe. It is also possible that there really is nothing left, the concordance window is closed, and we’ve falsified FLRW cosmology. That is a fate most fear to contemplate, and it seems less likely than mistakes in some discordant results, so we inevitably go down the path of cognitive dissonance, giving more credence to results that are consistent with our favorite set of LCDM parameters and less to those that are not. This is widely done without contemplating the possibility that the weird FLRW parameters we’ve ended up with are weird because they are just an approximation to some deeper theory.

So, as 2023 winds to an end, we [still] know pretty well what the parameters of cosmology are. While the tension between H0 = 67 and 73 km/s/Mpc is real, it seems like small beans compared to the successful isolation of a narrow concordance window. Sure beats arguing between 50 and 100! Even deciding which concordance window is right seems like a small matter compared to the deeper issues raised by LCDM: what is the cold dark matter? Does it really exist, or is it just a mythical entity we’ve invented for the convenient calculation of cosmic quantities? What the heck do we even mean by Lambda? Does the whole picture hang together so well that it must be correct? Or can it be falsified? Has it already been? How do we decide?

I’m sure we’ll be arguing over these questions for a long time to come.


+Structure formation is often depicted as a great success of cosmology, but it was the failure of the previous standard model, SCDM, to predict enough structure on large scales that led to its demise and its replacement by LCDM, which now faces a similar problem. The observer’s experience has consistently been that there is more structure in place earlier and on larger scales than had been anticipated before its observation.

*I believe in giving theories credit where credit is due. Putting on a cosmologist’s hat, the location of the first peak was a great success of LCDM. It was the amplitude of the second peak that came as a great surprise – unless you can take off the cosmology hat and don a MOND hat – then it was predicted. What is surprising from that perspective is the amplitude of the third peak, which makes more sense in LCDM. It seems impossible to some people that I can wear both hats without my head exploding, so they seem to simply assume I don’t think about it from their perspective when in reality it is the other way around.

%As adjudicated by galaxies with distances known from direct measurements provided by Cepheids or the tip of the red giant branch or surface brightness fluctuations or geometric methods, etc., etc., etc.

$This is a great exercise, but only works if CMB results are excluded. There has to be some narrative suspense: will the various disparate lines of evidence indeed line up? Since CMB fits constrain all parameters simultaneously, and brook no dissent, they suck the joy away from everything else in the sky and drain all interest in the debate.

Recent Developments Concerning the Gravitational Potential of the Milky Way. II. A Closer Look at the Data

Continuing from last time, let’s compare recent rotation curve determinations from Gaia DR3:

Fig. 1 from Jiao et al. comparing three different realizations of the Galactic rotation curve from Gaia DR3. The vertical lines* mark the range of the Ou et al. data considered by Chan & Chung Law (2023).

These are different analyses of the same dataset. The Gaia data release is immense, with billions of stars. There are gazillions of ways to parse these data. So it is reasonable to have multiple realizations, and we shouldn’t expect them to necessarily agree perfectly: do we look exclusively at K giants? A stars? Only stars with proper motion and/or parallax data more accurate than some limit? etc. Of course we want to understand any differences, but that’s not going to happen here.

My first observation is that the various analyses are broadly consistent. They all show a steady decline over a large range of radii. Nothing shocking there; it is fairly typical for bright, compact galaxies like the Milky Way to have somewhat declining rotation curves. The issue here, of course, is how much, and what does it mean?

Looking more closely, not all of the data agree with each other, or even with themselves. There are offsets between the three at radii around the sun (we live just outside R = 8 kpc) where you’d naively think they would agree the best. They’re very consistent from 13 < R < 17 kpc, then they start to diverge a little. The Ou data have a curious uptick right around R = 17 kpc, which I wouldn’t put much stock in; weird kinks like that sometimes happen in astronomical data. But it can’t be consistent with a continuous mass distribution, and will come up again for other reasons.

As an astronomer, I’m happy with the level of agreement I see here. It is not perfect, in the sense that there are some points from one data set whose error bars do not overlap with those of other data sets in places. That’s normal in astronomy, and one of the reasons that we can never entirely trust the stated uncertainties. Jiao et al. make a thorough and yet still incomplete assessment of the systematic uncertainties, winding up with larger error bars on the Wang et al. realization of the data.

For example, one of the issues we have to contend with – just one – is the distance to each star in the sample. Distances to individual objects are hard, and subject to systematic uncertainties. The reason to choose A stars or K giants is because you think you know their luminosity, so can estimate their distance. That works, but the resulting distances aren’t necessarily consistent (let alone correct) among the different groups. That by itself could be the source of the modest difference we see between data sets.

Chan & Chung Law use the Ou et al. realization of the data to make some strong claims. One is that the gradient of the rotation curve is -5 km/s/kpc, and this excludes MOND at high confidence. Here is their plot.

You will notice that, as they say, these are the data of Ou et al., being identical to the same points in the plot from Jiao et al. above – provided you only look in the range between the lines, 17 < R < 23 kpc. This is where the kink at R = 17 kpc comes in. They appear to have truncated the data right where it needs to be truncated to ignore the point with a noticeably lower velocity, which would surely affect the determination of the slope and reduce its confidence level. They also exclude the point with a really big error bar that nominally is within their radial range. That’s OK, as it has little significance: its large error bar means it contributes little to the constraint. That is not the case for the datum just inside of R = 17 kpc, or the rest of the data at smaller radii for that matter. These have a manifestly shallower slope. Looking at the line boundaries added to Jiao’s plot, it appears that they selected the range of the data with the steepest gradient. This is called cherry-picking.

It is a strange form of cherry-picking, as there is no physical reason to expect a linear fit to be appropriate. A Keplerian downturn has velocity decline as the inverse square root of radius (see the dotted line above). These data, over this limited range, may be consistent with a Keplerian downturn, but certainly do not establish that it is required.
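A short numerical sketch shows why a narrow radial window cannot do much to tell these shapes apart. It compares a Keplerian curve, V ∝ R^(−1/2), with the straight line drawn through its own end points over 17 < R < 23 kpc. The normalization of 200 km/s at 17 kpc is an illustrative assumption, not a measured value, and the function names are hypothetical.

```python
import math


def keplerian_velocity(r_kpc, r0_kpc, v0_km_s):
    """Keplerian curve, V proportional to R**-0.5, normalised to v0 at r0."""
    return v0_km_s * math.sqrt(r0_kpc / r_kpc)


def max_deviation(r_in, r_out, v_in, n_samples=13):
    """Largest gap (km/s) between the Keplerian curve and its end-point chord."""
    v_out = keplerian_velocity(r_out, r_in, v_in)
    slope = (v_out - v_in) / (r_out - r_in)  # mean gradient in km/s/kpc
    worst = 0.0
    for i in range(n_samples):
        r = r_in + (r_out - r_in) * i / (n_samples - 1)
        chord = v_in + slope * (r - r_in)
        worst = max(worst, abs(chord - keplerian_velocity(r, r_in, v_in)))
    return worst


if __name__ == "__main__":
    # Illustrative inputs only: 200 km/s at 17 kpc, window out to 23 kpc
    print(f"max |line - Keplerian| = {max_deviation(17.0, 23.0, 200.0):.2f} km/s")
```

With these illustrative inputs the straight line and the Keplerian curve differ by less than 2 km/s anywhere in the window, so a good linear fit over such a short range neither requires nor excludes a Keplerian decline.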

Contrast the statements of Chan & Chung Law with the more measured statement from the paper where the data analysis is actually performed:

… a low mass for the Galaxy is driven by the functional forms tested, given that it probes beyond our measurements. It is found to be in tension with mass measurements from globular clusters, dwarf satellites, and streams.

Ou et al. (2023)

What this means is that the data do not go far enough out to measure the total mass. The low mass that is inferred from the data is a result of fitting some specific choice of halo form to it. They note that the result disagrees with other data, as I discussed last time.

Rather than cherry-pick the data, we should look at all of it. Let’s see, I’ve done that before. We looked at the Wang et al. (2023) data via Jiao et al. previously, and just discussed the Ou et al. data. That leaves the new Zhou et al. data, so let’s look at those:

Milky Way rotation curve with RAR model (blue line from 2018) and the Gaia DR3 data as realized by Zhou et al. (2023: purple triangles). The dashed line shows the number of stars (right axis) informing each datum.

These data were the last of the current crop that I looked at. They look… pretty good in comparison with the pre-existing RAR model. Not exactly the falsification I had been led to expect.

So – the three different realizations of the Gaia DR3 data are largely consistent, yet one is being portrayed as a falsification of MOND while another is in good agreement with its prediction.

This is why you have to take astronomical error bars with a grain of salt. Three different groups are using data from the same source to obtain very nearly the same result. It isn’t quite the same result, as some of the data disagree at the formal limits of their uncertainty. No big deal – that’s what happens in astronomy. The number of stars per bin helps illustrate one reason why: we go from thousands of stars per bin near the sun to tens of stars in wider bins at R > 20 kpc. That’s not necessarily problematic, but it is emblematic of what we’re dealing with: great gobs of data up close, but only scarce scratches of it far away where systematic effects are more pernicious.

In the meantime, one realization of these data is being portrayed as a death knell for a theory that successfully predicts another realization of the same data. Well, which is it?


*Thanks to Moti Milgrom for pointing out the restricted range of radii considered by Chan & Chung Law and adding the vertical lines to this figure.

Recent Developments Concerning the Gravitational Potential of the Milky Way. I.

Recent results from the third data release (DR3) from Gaia have led to a flurry of papers. Some are good, some are great, some are neither of those. It is apparent from the comments last time that while I’ve kept my pledge to never dumb it down, I have perhaps been assuming more background knowledge on the part of readers than is adequate. I can’t cram a graduate education in astronomy into one web page, but will try to provide a little relevant context.

Galactic Astronomy is an ancient field, dating back at least to the Herschels. There is a lot that is known in the field. There have also been a lot of misleading observations, going back just as far to Herschel’s map of the Milky Way, which was severely limited by extinction from interstellar dust. That’s easy to say now, but Herschel’s map was the standard for over a century – longer than our modern map has persisted.

So a lot has changed, including a lot that seemed certain, so I try to keep an open mind. The astronomers working with the Gaia data – the ones deriving the rotation curve – are simply following where those data take them, as they should. There are others using their analyses to less credible ends. A lot of context is required to distinguish the two.

The total mass of the Milky Way

There are a lot of constraints on the mass of the Milky Way that predate Gaia; it’s not like these are the first data that address the issue. Indeed, there are lots and lots and lots of other applicable data acquired using different methods over the course of many decades. Here is a summary plot of determinations of the mass of the Milky Way compiled by Wang et al. (2019).

This is an admirable compilation, and yet no such compilation can be complete. There are just so many determinations by lots of independent authors. Still, this is nice for listing multiple results from many distinct methodologies. They all consistently give numbers around 10^12 solar masses. (Cast in these terms, my own estimate is 1.4 x 10^12 albeit with a substantial systematic uncertainty.) I’ve added a point for the total mass according to the alleged Keplerian downturn seen in the Gaia data, 2 x 10^11 solar masses. One of these things is not like the others.

The difference from the bulk of the data has nearly every astronomer rolling our collective eyes. Most of us straight up don’t believe it. That’s not to say the Gaia data are wrong, but the interpretation of those data as indicative of such a small, finite total mass seems unlikely in the light of all other results.

As I discussed briefly last time, it is conceivable that previous results are wrong or misleading due to some systematic effect or bad assumption. For example, mass estimates based on “satellite phenomenon” require the assumption that the satellite galaxies are indeed satellites of the Milky Way on bound orbits. That seems like a really good assumption, as without it, their presence is an instantaneous coincidence particular to the most recent few percent of a Hubble time: they wouldn’t have been nearby more than a billion years ago, and won’t be around for even a few hundred million more. That sounds like a long time to you and me, but it is not that long on a cosmic scale. Maybe they’re raining down all the time to give the appearance of a steady state? Where have I heard that before?

Even if we’re willing to dismiss satellite constraints, that doesn’t suffice. It isn’t good enough to find fault with one set of determinations; one must question all distinct methods. I could probably do that; there’s always a systematic uncertainty that might be bigger than expected or an assumption that could go badly wrong. But it is asking a lot for all of them to conspire to be wrong at the same time by the same amount. (The assumption of Newtonian gravity is a catch-all.)

Some constraints are more difficult to dodge than others. For example, the escape velocity method merely notes that there are fast moving stars in the solar neighborhood. Those stars are many billions of years old, and wouldn’t be here if the gravitational potential couldn’t contain them. The mass implied by the Gaia quasi-Keplerian downturn doesn’t suffice.
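To put a rough number on that, here is a minimal sketch of the most generous case: pretend the entire 2 x 10^11 solar masses sits at the Galactic center, which gives an upper limit on the escape speed for a spherical mass distribution of that total. The solar radius of 8 kpc and the function name are my own illustrative assumptions, not values taken from any of the papers discussed.

```python
import math

# Gravitational constant in kpc (km/s)^2 / Msun.
G = 4.30091e-6

def point_mass_escape_speed(mass_msun, radius_kpc):
    """Escape speed (km/s) at radius_kpc if all of mass_msun were a central point.

    For a spherical distribution with the same total mass, the true escape
    speed at that radius can only be lower, so this is an upper limit.
    """
    return math.sqrt(2.0 * G * mass_msun / radius_kpc)

R_SUN_KPC = 8.0          # assumed solar Galactocentric radius (illustrative)
LOW_MASS_MSUN = 2e11     # total mass implied by the quasi-Keplerian downturn

v_max = point_mass_escape_speed(LOW_MASS_MSUN, R_SUN_KPC)
print(f"upper limit on local escape speed: {v_max:.0f} km/s")  # about 464 km/s
```

Any bound star in the solar neighborhood moving faster than this ceiling would be unbound in the low-mass interpretation; the comparison to make is against whichever measured local escape speed you trust.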

That said, the total mass of the Milky Way as expressed above is a rather notional quantity. M200 occurs roughly 200 kpc out for the Milky Way, give or take a lot. And the “200” in the subscript has nothing to do with that radius being 200 kpc for reasons too technical and silly to delve into. So my biggest concern about the compilation above is not that the data are wrong so much as they are being extrapolated to an idealized radius that we don’t directly observe. This extrapolation is usually done by assuming the potential of an NFW halo, which makes perfect sense in terms of LCDM but none whatsoever empirically, since NFW predicts the wrong density profile at small, intermediate, and large radii: where the density profile ρ ∝ r^(-α) is predicted to have α = (1,2,3), it is persistently observed to be more like (0,1,2). While the latter profile is empirically more realistic, it also fails to converge to a finite total mass, rendering the concept meaningless.
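The convergence issue is easy to see by integrating a pure power law. The sketch below is only an illustration: it treats the outer halo as a single power law ρ ∝ r^(-α) and works in arbitrary units, which is a simplification of the piecewise slopes quoted above, and the function name is my own.

```python
import math

def shell_mass(alpha, r_in, r_out):
    """Mass between r_in and r_out for rho = r**(-alpha).

    Units are arbitrary (the prefactor 4*pi*rho0*r0**alpha is dropped).
    Integrates r**2 * r**(-alpha) dr analytically.
    """
    p = 3.0 - alpha
    if abs(p) < 1e-12:
        # alpha = 3: the integral is logarithmic
        return math.log(r_out / r_in)
    return (r_out**p - r_in**p) / p

# Push the outer edge out by factors of ten and watch the mass.
for alpha in (2, 3, 4):
    masses = [shell_mass(alpha, 1.0, r_out) for r_out in (10.0, 100.0, 1000.0)]
    print(alpha, [round(m, 3) for m in masses])
```

With an outer slope of α = 2 the mass grows linearly with radius without limit (which is just a flat rotation curve), and with α = 3 it still grows, but only logarithmically. Only slopes steeper than 3 give a total that settles down, which is why an "edge" like M200 has to be imposed by convention.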

Rather than indulge yet again in a discussion of the virtues and vices of different dark matter halo profiles, let’s look at an observationally more robust quantity: the enclosed mass. Wang et al. also provide a tabulation of this quantity from many sources, as depicted here:

Rotation curve constraints implied by the enclosed mass measurements tabulated by Wang et al. (2019) combined with the halo stars and globular clusters previously discussed. The location of the Large Magellanic Cloud is also indicated; data beyond this radius (and perhaps even within it) are subject to perturbation by the passage of the LMC. The RAR-based model is shown as the blue line; the light blue line includes a very uncertain estimate of the effect of the coronal gas. This is very diffuse and extended, and only becomes significant at very large radii. The dotted line is the Keplerian curve for a mass of 2 x 10^11 solar masses.

Not all of the enclosed mass data are consistent with one another. The bulk of them are consistent with the RAR model Milky Way (blue line). None of them are consistent with the small mass indicated by recent Gaia analyses (dotted line). Hence the collective unwillingness of most astronomers to accept the low-mass interpretation.
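For readers who want to reproduce the dotted line, or to convert between an enclosed mass and the equivalent circular speed, here is a minimal sketch. It assumes spherical symmetry and Newtonian gravity, V^2 = G M(<r) / r; the function names and the sample radii are my own choices for illustration.

```python
import math

# Gravitational constant in kpc (km/s)^2 / Msun.
G = 4.30091e-6

def keplerian_speed(mass_msun, radius_kpc):
    """Circular speed (km/s) around a point mass: V = sqrt(G M / r)."""
    return math.sqrt(G * mass_msun / radius_kpc)

def enclosed_mass(speed_kms, radius_kpc):
    """Newtonian enclosed mass (Msun) implied by a circular speed: M = V^2 r / G."""
    return speed_kms**2 * radius_kpc / G

# Keplerian curve for the low-mass interpretation, 2 x 10^11 Msun.
for r in (20.0, 50.0, 100.0, 200.0):
    print(f"r = {r:5.0f} kpc  V_kep = {keplerian_speed(2e11, r):6.1f} km/s")
```

At the 50 kpc radius of the LMC this Keplerian curve has already dropped to roughly 131 km/s, and it keeps falling as the inverse square root of radius. The inverse function, `enclosed_mass`, is how a tabulated mass within some radius gets turned into a point on a rotation curve plot.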

An important thing to note when considering data at large radii, especially those beyond 50 kpc, is that 50 kpc is the current Galactocentric radius of the Large Magellanic Cloud. The LMC brings with it its own dark matter halo, which perturbs the outer regions of the Milky Way. This effect is surprisingly strong*, and leads to the inference that the mass ratio of the two is only 4 or 5:1 even though the luminosity ratio is more like 20:1. This makes the interpretation of the data beyond 50 kpc problematic. If we use that as a pretext to ignore it, then we infer that our low mass Milky Way is no more massive than the LMC – an apparently absurd situation.
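The arithmetic behind that last statement is short enough to spell out. This is a back-of-the-envelope sketch that assumes the 4 or 5:1 ratio is reckoned against a conventional Milky Way mass of about 10^12 solar masses; that reference mass, and the function name, are my assumptions for illustration.

```python
def lmc_mass_from_ratio(milky_way_mass_msun, ratio):
    """LMC mass (Msun) implied by a Milky Way : LMC mass ratio of ratio:1."""
    return milky_way_mass_msun / ratio

CONVENTIONAL_MW = 1e12   # Msun, assumed reference mass for the quoted ratio
LOW_MASS_MW = 2e11       # Msun, the Keplerian-downturn interpretation

for ratio in (4, 5):
    lmc = lmc_mass_from_ratio(CONVENTIONAL_MW, ratio)
    print(f"{ratio}:1 -> LMC ~ {lmc:.1e} Msun; "
          f"low-mass Milky Way / LMC = {LOW_MASS_MW / lmc:.2f}")
```

Under that assumption the LMC comes out at 2 to 2.5 x 10^11 solar masses, so the low-mass Milky Way is at best equal to its own satellite.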

There are many rabbit holes we could dig down here, but the basic message is that a small Milky Way mass violates a gazillion well-established constraints. That doesn’t mean the Gaia data are wrong, but it does call into question their interpretation. So next time we’ll look more closely at the data.


*This is not surprising in MOND. The LMC is in the right place at the right time to cause the Galactic warp. The LMC as a candidate perturber to excite the Galactic warp was recognized early, but the conventional mass was thought to be much too small to do the job. The small baryonic mass of the LMC in MOND is not a problem as the long range nature of the force law makes tidal effects more pronounced: it works out about right.

Take it where?


I had written most of the post below the line before an exchange with a senior colleague who accused me of asking us to abandon General Relativity (GR). Anyone who read the last post knows that this is the opposite of true. So how does this happen?

Much of the field is mired in bad ideas that seemed like good ideas in the 1980s. There has been some progress, but the idea that MOND is an abandonment of GR I recognize as a misconception from that time. It arose because the initial MOND hypothesis suggested modifying the law of inertia without showing a clear path to how this might be consistent with GR. GR was built on the Equivalence Principle (EP), the equivalence[1] of gravitational charge with inertial mass. The original MOND hypothesis directly contradicted that, so it was a fair concern in 1983. It was not by 1984[2]. I was still an undergraduate then, so I don’t know the sociology, but I get the impression that most of the community wrote MOND off at this point and never gave it further thought.

I guess this is why I still encounter people with this attitude, that someone is trying to rob them of GR. It feels like we’re always starting at square one, like there has been zero progress in forty years. I hope it isn’t that bad, but I admit my patience is wearing thin.

I’m trying to help you. Don’t waste your entire career chasing phantoms.

What MOND does ask us to abandon is the Strong Equivalence Principle. Not the Weak EP, nor even the Einstein EP. Just the Strong EP. That’s a much more limited ask than abandoning all of GR. Indeed, all flavors of EP are subject to experimental test. The Weak EP has been repeatedly validated, but there is nothing about MOND that implies platinum would fall differently from titanium. Experimental tests of the Strong EP are less favorable.

I understand that MOND seems impossible. It also keeps having its predictions come true. This combination is what makes it important. The history of science is chock full of ideas that were initially rejected as impossible or absurd, going all the way back to heliocentrism. The greater the cognitive dissonance, the more important the result.


Continuing the previous discussion of UT, where do we go from here? If we accept that maybe we have all these problems in cosmology because we’re piling on auxiliary hypotheses to continue to be able to approximate UT with FLRW, what now?

I don’t know.

It’s hard to accept that we don’t understand something we thought we understood. Scientists hate revisiting issues that seem settled. Feels like a waste of time. It also feels like a waste of time continuing to add epicycles to a zombie theory, be it LCDM or MOND or the phoenix universe or tired light or whatever fantasy reality you favor. So, painful as it may be, one has to find a little humility to step back and take account of what we know empirically independent of the interpretive veneer of theory.

As I’ve said before, I think we do know that the universe is expanding and passed through an early hot phase that bequeathed us the primordial abundances of the light elements (BBN) and the relic radiation field that we observe as the cosmic microwave background (CMB). There’s a lot more to it than that, and I’m not going to attempt to recite it all here.

Still, to give one pertinent example, BBN only works if the expansion rate is as expected during the epoch of radiation domination. So whatever is going on has to converge to that early on. This is hardly surprising for UT since it was stipulated to contain GR in the relevant limit, but we don’t actually know how it does so until we work out what UT is – a tall order that we can’t expect to accomplish overnight, or even over the course of many decades without a critical mass of scientists thinking about it (and not being vilified by other scientists for doing so).

Another example is that the cosmological principle – that the universe is homogeneous and isotropic – is observed to be true in the CMB. The temperature is the same all over the sky to one part in 100,000. That’s isotropy. The temperature is tightly coupled to the density, so if the temperature is the same everywhere, so is the density. That’s homogeneity. So both of the assumptions made by the cosmological principle are corroborated by observations of the CMB.

The cosmological principle is extremely useful for solving the equations of GR as applied to the whole universe. If the universe has a uniform density on average, then the solution is straightforward (though it is rather tedious to work through to the Friedmann equation). If the universe is not homogeneous and isotropic, then it becomes a nightmare to solve the equations. One needs to know where everything was for all of time.

Starting from the uniform condition of the CMB, it is straightforward to show that the assumption of homogeneity and isotropy should persist on large scales up to the present day. “Small” things like galaxies go nonlinear and collapse, but huge volumes containing billions of galaxies should remain in the linear regime and these small-scale variations average out. One cubic Gigaparsec will have the same average density as the next, and the next, so the cosmological principle continues to hold today.

Anyone spot the rub? I said homogeneity and isotropy should persist. This statement assumes GR. Perhaps it doesn’t hold in UT?

This aspect of cosmology is so deeply embedded in everything that we do in the field that it was only recently that I realized it might not hold absolutely – and I’ve been actively contemplating such a possibility for a long time. Shouldn’t have taken me so long. Felten (1984) realized right away that a MONDian universe would depart from isotropy by late times. I read that paper long ago but didn’t grasp the significance of that statement. I did absorb that in the absence of a cosmological constant (which no one believed in at the time), the universe would inevitably recollapse, regardless of what the density was. This seems like an elegant solution to the flatness/coincidence problem that obsessed cosmologists at the time. There is no special value of the mass density that provides an over/under line demarcating eternal expansion from eventual recollapse, so there is no coincidence problem. All naive MOND cosmologies share the same ultimate fate, so it doesn’t matter what we observe for the mass density.

MOND departs from isotropy for the same reason it forms structure fast: it is inherently non-linear. As well as predicting that big galaxies would form by z=10, Sanders (1998) correctly anticipated the size of the largest structures collapsing today (things like the local supercluster Laniakea) and the scale of homogeneity (a few hundred Mpc if there is a cosmological constant). Pretty much everyone who looked into it came to similar conclusions.

But MOND and cosmology, as we know it in the absence of UT, are incompatible. Where LCDM encompasses both cosmology and the dynamics of bound systems (dark matter halos[3]), MOND addresses the dynamics of low acceleration systems (the most common examples being individual galaxies) but says nothing about cosmology. So how do we proceed?

For starters, we have to admit our ignorance. From there, one has to assume some expanding background – that much is well established – and ask what happens to particles responding to a MONDian force-law in this background, starting from the very nearly uniform initial condition indicated by the CMB. From that simple starting point, it turns out one can get a long way without knowing the details of the cosmic expansion history or the metric that so obsess cosmologists. These are interesting things, to be sure, but they are aspects of UT we don’t know and can manage without to some finite extent.

For one, the thermal history of the universe is pretty much the same with or without dark matter, with or without a cosmological constant. Without dark matter, structure can’t get going until after thermal decoupling (when the matter is free to diverge thermally from the temperature of the background radiation). After that happens, around z = 200, the baryons suddenly find themselves in the low acceleration regime, newly free to respond to the nonlinear force of MOND, and structure starts forming fast, with the consequences previously elaborated.

But what about the expansion history? The geometry? The big questions of cosmology?

Again, I don’t know. MOND is a dynamical theory that extends Newton. It doesn’t address these questions. Hence the need for UT.

I’ve encountered people who refuse to acknowledge[4] that MOND gets predictions like z=10 galaxies right without a proper theory for cosmology. That attitude puts the cart before the horse. One doesn’t look for UT unless well motivated. That one is able to correctly predict 25 years in advance something that comes as a huge surprise to cosmologists today is the motivation. Indeed, the degree of surprise and the longevity of the prediction amplify the motivation: if this doesn’t get your attention, what possibly could?

There is no guarantee that our first attempt at UT (or our second or third or fourth) will work out. It is possible that in the search for UT, one comes up with a theory that fails to do what was successfully predicted by the more primitive theory. That just lets you know you’ve taken a wrong turn. It does not mean that a correct UT doesn’t exist, or that the initial prediction was some impossible fluke.

One candidate theory for UT is bimetric MOND. This appears to justify the assumptions made by Sanders’s early work, and provide a basis for a relativistic theory that leads to rapid structure formation. Whether it can also fit the acoustic power spectrum of the CMB as well as LCDM and AeST has yet to be seen. These things take time and effort. What they really need is a critical mass of people working on the problem – a community that enjoys the support of other scientists and funding institutions like NSF. Until we have that[5], progress will remain grudgingly slow.


[1] The equivalence of gravitational charge and inertial mass means that the m in F = GMm/d^2 is identically the same as the m in F = ma. Modified gravity changes the former; modified inertia the latter.

[2] Bekenstein & Milgrom (1984) showed how a modification of Newtonian gravity could avoid the non-conservation issues suffered by the original hypothesis of modified inertia. They also outlined a path towards a generally covariant theory that Bekenstein pursued for the rest of his life. That he never managed to obtain a completely satisfactory version is often cited as evidence that it can’t be done, since he was widely acknowledged as one of the smartest people in the field. One wonders why he persisted if, as these detractors would have us believe, the smart thing to do was not even try.

[3] The data for galaxies do not look like the dark matter halos predicted by LCDM.

[4] I have entirely lost patience with this attitude. If a phenomenon is correctly predicted in advance in the literature, we are obliged as scientists to take it seriously+. Pretending that it is not meaningful in the absence of UT is just an avoidance strategy: an excuse to ignore inconvenient facts.

+I’ve heard eminent scientists describe MOND’s predictive ability as “magic.” This also seems like an avoidance strategy. I, for one, do not believe in magic. That it works as well as it does – that it works at all – must be telling us something about the natural world, not the supernatural.

[5] There does exist a large and active community of astroparticle physicists trying to come up with theories for what the dark matter could be. That’s good: that’s what needs to happen, and we should exhaust all possibilities. We should do the same for new dynamical theories.