I have become despondent for the progress of science.

Despite enormous progress both observational and computational, we have made little progress in solving the missing mass problem. The issue is not one of technical progress. It is psychological.

Words matter. We are hung up on missing mass as literal dark matter. As Bekenstein pointed out, a less misleading name would have been the acceleration discrepancy, because the problem only appears at low accelerations. But that sounds awkward. We humans like our simple catchphrases, and often cling to them no matter what. We called it dark matter, so it must be dark matter!

Vera Rubin succinctly stated the appropriately conservative attitude of most scientists in 1982 during the discussion at IAU 100:

To highlight the end of her quote:

I believe most of us would rather alter Newtonian gravitational theory only as a last resort.

Rubin, V.C. 1983, in the proceedings of IAU Symposium 100: Internal Kinematics and Dynamics of Galaxies, p. 10.


In 1982, this was exactly the right attitude. It had been clearly established that there was a discrepancy between what you see and what you get. But that was about it. So, we could add a little mass that’s hard to see, or we could change a fundamental law of nature. Easy call.

By this time, the evidence for a discrepancy was clear, but the hypothesized solutions were still in development. This was before the publication of the suggestion of Peebles and separately by Steigman & Turner of cold dark matter. This was before the publication of Milgrom’s first papers on MOND. (Note that these ideas took years to develop, so much of this work was simultaneous and not done in a vacuum.) All that was clear was that something extra was needed. It wasn’t even clear how much – a factor of two in mass sufficed for many of the early observations. At that time, it was easy to imagine that amount to be lurking in low mass stars. No need for new physics, either gravitational or particle.

The situation quickly snowballed. From a factor of two, we soon needed a factor of ten. Whatever was doing the gravitating, it exceeded the mass density allowed in normal matter by big bang nucleosynthesis. By the time I was a grad student in the late ’80s, it was obvious that there had to be some kind of dark mass, and it had to be non-baryonic. That meant new particle physics (e.g., a WIMP). The cold dark matter paradigm took root.

Like a fifty year mortgage, we are basically still stuck with this decision we made in the ’80s. It made sense then, given what was then known. Does it still? At what point have we reached the last resort? More importantly, apparently, how do we persuade ourselves that we have reached this point?

Peebles provides a nice recent summary of all the ways in which LCDM is a good approximation to cosmologically relevant observations. There are a lot, and I don’t disagree with him. The basic argument is that it is very unlikely that these things all agree unless LCDM is basically correct.

Trouble is, the exact same argument applies for MOND. I’m not going to justify this here – it should be obvious. If it isn’t, you haven’t been paying attention. It is unlikely to the point of absurdity that a wholly false theory should succeed in making so many predictions of such diversity and precision as MOND has.

These are both examples of what philosophers of science call a No Miracles Argument. The problem is that it cuts both ways. I will refrain from editorializing here on which would be the bigger miracle, and simply note that the obvious thing to do is try to combine the successes of both, especially given that they don’t overlap much. And yet, the Venn diagram of scientists working to satisfy both ends is vanishingly small. Not zero, but the vast majority of the community remains stuck in the ’80s: it has to be cold dark matter. I remember having this attitude, and how hard it was to realize that it might be wrong. The intellectual blinders imposed by this attitude are more opaque than a brick wall. This psychological hangup is the primary barrier to real scientific progress (as opposed to incremental progress in the sense used by Kuhn).

Unfortunately, both CDM and MOND rely on a tooth fairy. In CDM, it is the conceit that non-baryonic dark matter actually exists. This requires new physics beyond the Standard Model of particle physics. All the successes of LCDM follow if and only if dark matter actually exists. This we do not know (contrary to many assertions to this effect); all we really know is that there are discrepancies. Whether the discrepancies are due to literal dark matter or a change in the force law is maddeningly ambiguous. Of course, the conceit in MOND is not just that there is a modified force law, but that there must be a physical mechanism by which it occurs. The first part is the well-established discrepancy. The last part remains wanting.

When we think we know, we cease to learn.

Dr. Radhakrishnan

The best scientists are always in doubt. As well as enumerating its successes, Peebles also discusses some of the ways in which LCDM might be better. Should massive galaxies appear as they do? (Not really.) Should the voids really be so empty? (MOND predicted that one.) I seldom hear these concerns from other cosmologists. That’s because they’re not in doubt. The attitude is that dark matter has to exist, and any contrary evidence is simply a square peg that can be made to fit the round hole if we pound hard enough.

And so, we’re stuck still pounding the ideas of the ’80s into the heads of innocent students, creating a closed ecosystem of stagnant ideas self-perpetuated by the echo chamber effect. I see no good way out of this; indeed, the quality of debate is palpably lower now than it was in the previous century.

So I have become despondent for the progress of science.

Bias all the way down

Bias all the way down

It often happens that data are ambiguous and open to multiple interpretations. The evidence for dark matter is an obvious example. I frequently hear permutations on the statement

We know dark matter exists; we just need to find it.

This is said in all earnestness by serious scientists who clearly believe what they say. They mean it. Unfortunately, meaning something in all seriousness, indeed, believing it with the intensity of religious fervor, does not guarantee that it is so.

The way the statement above is phrased is a dangerous half-truth. What the data show beyond any dispute is that there is a discrepancy between what we observe in extragalactic systems (including cosmology) and the predictions of Newton & Einstein as applied to the visible mass. If we assume that the equations Newton & Einstein taught us are correct, then we inevitably infer the need for invisible mass. That seems like a very reasonable assumption, but it is just that: an assumption. Moreover, it is an assumption that is only tested on the relevant scales by the data that show a discrepancy. One could instead infer that theory fails this test – it does not work to predict observed motions when applied to the observed mass. From this perspective, it could just as legitimately be said that

A more general theory of dynamics must exist; we just need to figure out what it is.

That puts an entirely different complexion on exactly the same problem. The data are the same; they are not to blame. The difference is how we interpret them.

Neither of these statements are correct: they are both half-truths; two sides of the same coin. As such, one risks being wildly misled. If one only hears one, the other gets discounted. That’s pretty much where the field is now, and has it been stuck there for a long time.

That’s certainly where I got my start. I was a firm believer in the standard dark matter interpretation. The evidence was obvious and overwhelming. Not only did there need to be invisible mass, it had to be some new kind of particle, like a WIMP. Almost certainly a WIMP. Any other interpretation (like MACHOs) was obviously stupid, as it violated some strong constraint, like Big Bang Nucleosynthesis (BBN). It had to be non-baryonic cold dark matter. HAD. TO. BE. I was sure of this. We were all sure of this.

What gets us in trouble is not what we don’t know. It’s what we know for sure that just ain’t so.

Josh Billings

I realized in the 1990s that the above reasoning was not airtight. Indeed, it has a gaping hole: we were not even considering modifications of dynamical laws (gravity and inertia). That this was a possibility, even a remote one, came as a profound and deep shock to me. It took me ages of struggle to admit it might be possible, during which I worked hard to save the standard picture. I could not. So it pains me to watch the entire community repeat the same struggle, repeat the same failures, and pretend like it is a success. That last step follows from the zeal of religious conviction: the outcome is predetermined. The answer still HAS TO BE dark matter.

So I asked myself – what if we’re wrong? How could we tell? Once one has accepted that the universe is filled with invisible mass that can’t be detected by any craft available known to us, how can we disabuse ourselves of this notion should it happen to be wrong?

One approach that occurred to me was a test in the power spectrum of the cosmic microwave background. Before any of the peaks had been measured, the only clear difference one expected was a bigger second peak with dark matter, and a smaller one without it for the same absolute density of baryons as set by BBN. I’ve written about the lead up to this prediction before, and won’t repeat it here. Rather, I’ll discuss some of the immediate fall out – some of which I’ve only recently pieced together myself.

The first experiment to provide a test of the prediction for the second peak was Boomerang. The second was Maxima-1. I of course checked the new data when they became available. Maxima-1 showed what I expected. So much so that it barely warranted comment. One is only supposed to write a scientific paper when one has something genuinely new to say. This didn’t rise to that level. It was more like checking a tick box. Besides, lots more data were coming; I couldn’t write a new paper every time someone tacked on an extra data point.

There was one difference. The Maxima-1 data had a somewhat higher normalization. The shape of the power spectrum was consistent with that of Boomerang, but the overall amplitude was a bit higher. The latter mattered not at all to my prediction, which was for the relative amplitude of the first to second peaks.

Systematic errors, especially in the amplitude, were likely in early experiments. That’s like rule one of observing the sky. After examining both data sets and the model expectations, I decided the Maxima-1 amplitude was more likely to be correct, so I asked what offset was necessary to reconcile the two. About 14% in temperature. This was, to me, no big deal – it was not relevant to my prediction, and it is exactly the sort of thing one expects to happen in the early days of a new kind of observation. It did seem worth remarking on, if not writing a full blown paper about, so I put it in a conference presentation (McGaugh 2000), which was published in a journal (IJMPA, 16, 1031) as part of the conference proceedings. This correctly anticipated the subsequent recalibration of Boomerang.

The figure from McGaugh (2000) is below. Basically, I said “gee, looks like the Boomerang calibration needs to be adjusted upwards a bit.” This has been done in the figure. The amplitude of the second peak remained consistent with the prediction for a universe devoid of dark matter. In fact, if got better (see Table 4 of McGaugh 2004).

Plot from McGaugh (2000): The predictions of LCDM (left) and no-CDM (right) compared to Maxima-1 data (open points) and Boomerang data (filled points, corrected in normalization). The LCDM model shown is the most favorable prediction that could be made prior to observation of the first two peaks; other then-viable choices of cosmic parameters predicted a higher second peak. The no-CDM got the relative amplitude right a priori, and remains consistent with subsequent data from WMAP and Planck.

This much was trivial. There was nothing new to see, at least as far as the test I had proposed was concerned. New data were pouring in, but there wasn’t really anything worth commenting on until WMAP data appeared several years later, which persisted in corroborating the peak ratio prediction. By this time, the cosmological community had decided that despite persistent corroborations, my prediction was wrong.

That’s right. I got it right, but then right turned into wrong according to the scuttlebutt of cosmic gossip. This was a falsehood, but it took root, and seems to have become one of the things that cosmologists know for sure that just ain’t so.

How did this come to pass? I don’t know. People never asked me. My first inkling was 2003, when it came up in a chance conversation with Marv Leventhal (then chair of Maryland Astronomy), who opined “too bad the data changed on you.” This shocked me. Nothing relevant in the data had changed, yet here was someone asserting that it had like it was common knowledge. Which I suppose it was by then, just not to me.

Over the years, I’ve had the occasional weird conversation on the subject. In retrospect, I think the weirdness stemmed from a divergence of assumed knowledge. They knew I was right then wrong. I knew the second peak prediction had come true and remained true in all subsequent data, but the third peak was a different matter. So there were many opportunities for confusion. In retrospect, I think many of these people were laboring under the mistaken impression that I had been wrong about the second peak.

I now suspect this started with the discrepancy between the calibration of Boomerang and Maxima-1. People seemed to be aware that my prediction was consistent with the Boomerang data. Then they seem to have confused the prediction with those data. So when the data changed – i.e., Maxima-1 was somewhat different in amplitude, then it must follow that the prediction now failed.

This is wrong on many levels. The prediction is independent of the data that test it. It is incredibly sloppy thinking to confuse the two. More importantly, the prediction, as phrased, was not sensitive to this aspect of the data. If one had bothered to measure the ratio in the Maxima-1 data, one would have found a number consistent with the no-CDM prediction. This should be obvious from casual inspection of the figure above. Apparently no one bothered to check. They didn’t even bother to understand the prediction.

Understanding a prediction before dismissing it is not a hard ask. Unless, of course, you already know the answer. Then laziness is not only justified, but the preferred course of action. This sloppy thinking compounds a number of well known cognitive biases (anchoring bias, belief bias, confirmation bias, to name a few).

I mistakenly assumed that other people were seeing the same thing in the data that I saw. It was pretty obvious, after all. (Again, see the figure above.) It did not occur to me back then that other scientists would fail to see the obvious. I fully expected them to complain and try and wriggle out of it, but I could not imagine such complete reality denial.

The reality denial was twofold: clearly, people were looking for any excuse to ignore anything associated with MOND, however indirectly. But they also had no clear prior for LCDM, which I did establish as a point of comparison. A theory is only as good as its prior, and all LCDM models made before these CMB data showed the same thing: a bigger second peak than was observed. This can be fudged: there are ample free parameters, so it can be made to fit; one just had to violate BBN (as it was then known) by three or four sigma.

In retrospect, I think the very first time I had this alternate-reality conversation was at a conference at the University of Chicago in 2001. Andrey Kravtsov had just joined the faculty there, and organized a conference to get things going. He had done some early work on the cusp-core problem, which was still very much a debated thing at the time. So he asked me to come address that topic. I remember being on the plane – a short ride from Cleveland – when I looked at the program. Nearly did a spit take when I saw that I was to give the first talk. There wasn’t a lot of time to organize my transparencies (we still used overhead projectors in those days) but I’d given the talk many times before, so it was enough.

I only talked about the rotation curves of low surface brightness galaxies in the context of the cusp-core problem. That was the mandate. I didn’t talk about MOND or the CMB. There’s only so much you can address in a half hour talk. [This is a recurring problem. No matter what I say, there always seems to be someone who asks “why didn’t you address X?” where X is usually that person’s pet topic. Usually I could do so, but not in the time allotted.]

About halfway through this talk on the cusp-core problem, I guess it became clear that I wasn’t going to talk about things that I hadn’t been asked to talk about, and I was interrupted by Mike Turner, who did want to talk about the CMB. Or rather, extract a confession from me that I had been wrong about it. I forget how he phrased it exactly, but it was the academic equivalent of “Have you stopped beating your wife lately?” Say yes, and you admit to having done so in the past. Say no, and you’re still doing it. What I do clearly remember was him prefacing it with “As a test of your intellectual honesty” as he interrupted to ask a dishonest and intentionally misleading question that was completely off-topic.

Of course, the pretext for his attack question was the Maxima-1 result. He phrased it in a way that I had to agree that those disproved my prediction, or be branded a liar. Now, at the time, there were rumors swirling that the experiment – some of the people who worked on it were there – had detected the third peak, so I thought that was what he was alluding to. Those data had not yet been published and I certainly had not seen them, so I could hardly answer that question. Instead, I answered the “intellectual honesty” affront by pointing to a case where I had said I was wrong. At one point, I thought low surface brightness galaxies might explain the faint blue galaxy problem. On closer examination, it became clear that they could not provide a complete explanation, so I said so. Intellectual honesty is really important to me, and should be to all scientists. I have no problem admitting when I’m wrong. But I do have a problem with demands to admit that I’m wrong when I’m not.

To me, it was obvious that the Maxima-1 data were consistent with the second peak. The plot above was already published by then. So it never occurred to me that he thought the Maxima-1 data were in conflict with what I had predicted – it was already known that it was not. Only to him, it was already known that it was. Or so I gather – I have no way to know what others were thinking. But it appears that this was the juncture in which the field suffered a psychotic break. We are not operating on the same set of basic facts. There has been a divergence in personal realities ever since.

Arthur Kosowsky gave the summary talk at the end of the conference. He told me that he wanted to address the elephant in the room: MOND. I did not think the assembled crowd of luminary cosmologists were mature enough for that, so advised against going there. He did, and was incredibly careful in what he said: empirical, factual, posing questions rather than making assertions. Why does MOND work as well as it does?

The room dissolved into chaotic shouting. Every participant was vying to say something wrong more loudly than the person next to him. (Yes, everyone shouting was male.) Joel Primack managed to say something loudly enough for it to stick with me, asserting that gravitational lensing contradicted MOND in a way that I had already shown it did not. It was just one of dozens of superficial falsehoods that people take for granted to be true if they align with one’s confirmation bias.

The uproar settled down, the conference was over, and we started to disperse. I wanted to offer Arthur my condolences, having been in that position many times. Anatoly Klypin was still giving it to him, keeping up a steady stream of invective as everyone else moved on. I couldn’t get a word in edgewise, and had a plane home to catch. So when I briefly caught Arthur’s eye, I just said “told you” and moved on. Anatoly paused briefly, apparently fathoming that his behavior, like that of the assembled crowd, was entirely predictable. Then the moment of awkward self-awareness passed, and he resumed haranguing Arthur.



Reality check

Before we can agree on the interpretation of a set of facts, we have to agree on what those facts are. Even if we agree on the facts, we can differ about their interpretation. It is OK to disagree, and anyone who practices astrophysics is going to be wrong from time to time. It is the inevitable risk we take in trying to understand a universe that is vast beyond human comprehension. Heck, some people have made successful careers out of being wrong. This is OK, so long as we recognize and correct our mistakes. That’s a painful process, and there is an urge in human nature to deny such things, to pretend they never happened, or to assert that what was wrong was right all along.

This happens a lot, and it leads to a lot of weirdness. Beyond the many people in the field whom I already know personally, I tend to meet two kinds of scientists. There are those (usually other astronomers and astrophysicists) who might be familiar with my work on low surface brightness galaxies or galaxy evolution or stellar populations or the gas content of galaxies or the oxygen abundances of extragalactic HII regions or the Tully-Fisher relation or the cusp-core problem or faint blue galaxies or big bang nucleosynthesis or high redshift structure formation or joint constraints on cosmological parameters. These people behave like normal human beings. Then there are those (usually particle physicists) who have only heard of me in the context of MOND. These people often do not behave like normal human beings. They conflate me as a person with a theory that is Milgrom’s. They seem to believe that both are evil and must be destroyed. My presence, even the mere mention of my name, easily destabilizes their surprisingly fragile grasp on sanity.

One of the things that scientists-gone-crazy do is project their insecurities about the dark matter paradigm onto me. People who barely know me frequently attribute to me motivations that I neither have nor recognize. They presume that I have some anti-cosmology, anti-DM, pro-MOND agenda, and are remarkably comfortably about asserting to me what it is that I believe. What they never explain, or apparently bother to consider, is why I would be so obtuse? What is my motivation? I certainly don’t enjoy having the same argument over and over again with their ilk, which is the only thing it seems to get me.

The only agenda I have is a pro-science agenda. I want to know how the universe works.

This agenda is not theory-specific. In addition to lots of other astrophysics, I have worked on both dark matter and MOND. I will continue to work on both until we have a better understanding of how the universe works. Right now we’re very far away from obtaining that goal. Anyone who tells you otherwise is fooling themselves – usually by dint of ignoring inconvenient aspects of the evidence. Everyone is susceptible to cognitive dissonance. Scientists are no exception – I struggle with it all the time. What disturbs me is the number of scientists who apparently do not. The field is being overrun with posers who lack the self-awareness to question their own assumptions and biases.

So, I feel like I’m repeating myself here, but let me state my bias. Oh wait. I already did. That’s why it felt like repetition. It is.

The following bit of this post is adapted from an old web page I wrote well over a decade ago. I’ve lost track of exactly when – the file has been through many changes in computer systems, and unix only records the last edit date. For the linked page, that’s 2016, when I added a few comments. The original is much older, and was written while I was at the University of Maryland. Judging from the html style, it was probably early to mid-’00s. Of course, the sentiment is much older, as it shouldn’t need to be said at all.

I will make a few updates as seem appropriate, so check the link if you want to see the changes. I will add new material at the end.

Long standing remarks on intellectual honesty

The debate about MOND often degenerates into something that falls well short of the sober, objective discussion that is suppose to characterize scientific debates. One can tell when voices are raised and baseless ad hominem accusations made. I have, with disturbing frequency, found myself accused of partisanship and intellectual dishonesty, usually by people who are as fair and balanced as Fox News.

Let me state with absolute clarity that intellectual honesty is a bedrock principle of mine. My attitude is summed up well by the quote

When a man lies, he murders some part of the world.

Paul Gerhardt

I first heard this spoken by the character Merlin in the movie Excalibur (1981 version). Others may have heard it in a song by Metallica. As best I can tell, it is originally attributable to the 17th century cleric Paul Gerhardt.

This is a great quote for science, as the intent is clear. We don’t get to pick and choose our facts. Outright lying about them is antithetical to science.

I would extend this to ignoring facts. One should not only be honest, but also as complete as possible. It does not suffice to be truthful while leaving unpleasant or unpopular facts unsaid. This is lying by omission.

I “grew up” believing in dark matter. Specifically, Cold Dark Matter, presumably a WIMP. I didn’t think MOND was wrong so much as I didn’t think about it at all. Barely heard of it; not worth the bother. So I was shocked – and angered – when it its predictions came true in my data for low surface brightness galaxies. So I understand when my colleagues have the same reaction.

Nevertheless, Milgrom got the prediction right. I had a prediction, it was wrong. There were other conventional predictions, they were also wrong. Indeed, dark matter based theories generically have a very hard time explaining these data. In a Bayesian sense, given the prior that we live in a ΛCDM universe, the probability that MONDian phenomenology would be observed is practically zero. Yet it is. (This is very well established, and has been for some time.)

So – confronted with an unpopular theory that nevertheless had some important predictions come true, I reported that fact. I could have ignored it, pretended it didn’t happen, covered my eyes and shouted LA LA LA NOT LISTENING. With the benefit of hindsight, that certainly would have been the savvy career move. But it would also be ignoring a fact, and tantamount to a lie.

In short, though it was painful and protracted, I changed my mind. Isn’t that what the scientific method says we’re suppose to do when confronted with experimental evidence?

That was my experience. When confronted with evidence that contradicted my preexisting world view, I was deeply troubled. I tried to reject it. I did an enormous amount of fact-checking. The people who presume I must be wrong have not had this experience, and haven’t bothered to do any fact-checking. Why bother when you already are sure of the answer?

Willful Ignorance

I understand being skeptical about MOND. I understand being more comfortable with dark matter. That’s where I started from myself, so as I said above, I can empathize with people who come to the problem this way. This is a perfectly reasonable place to start.

For me, that was over a quarter century ago. I can understand there being some time lag. That is not what is going on. There has been ample time to process and assimilate this information. Instead, most physicists have chosen to remain ignorant. Worse, many persist in spreading what can only be described as misinformation. I don’t think they are liars; rather, it seems that they believe their own bullshit.

To give an example of disinformation, I still hear said things like “MOND fits rotation curves but nothing else.” This is not true. The first thing I did was check into exactly that. Years of fact-checking went into McGaugh & de Blok (1998), and I’ve done plenty more since. It came as a great surprise to me that MOND explained the vast majority of the data as well or better than dark matter. Not everything, to be sure, but lots more than “just” rotation curves. Yet this old falsehood still gets repeated as if it were not a misconception that was put to rest in the previous century. We’re stuck in the dark ages by choice.

It is not a defensible choice. There is no excuse to remain ignorant of MOND at this juncture in the progress of astrophysics. It is incredibly biased to point to its failings without contending with its many predictive successes. It is tragi-comically absurd to assume that dark matter provides a better explanation when it cannot make the same predictions in advance. MOND may not be correct in every particular, and makes no pretense to be a complete theory of everything. But it is demonstrably less wrong than dark matter when it comes to predicting the dynamics of systems in the low acceleration regime. Pretending like this means nothing is tantamount to ignoring essential facts.

Even a lie of omission murders a part of the world.

Galaxy Stellar and Halo Masses: tension between abundance matching and kinematics

Galaxy Stellar and Halo Masses: tension between abundance matching and kinematics

Mass is a basic quantity. How much stuff does an astronomical object contain? For a galaxy, mass can mean many different things: that of its stars, stellar remnants (e.g., white dwarfs, neutron stars), atomic gas, molecular clouds, plasma (ionized gas), dust, Bok globules, black holes, habitable planets, biomass, intelligent life, very small rocks… these are all very different numbers for the same galaxy, because galaxies contain lots of different things. Two things that many scientists have settled on as Very Important are a galaxy’s stellar mass and its dark matter halo mass.

The mass of a galaxy’s dark matter halo is not well known. Most measurement provide only lower limits, as tracers fade out before any clear end is reached. Consequently, the “total” mass is a rather notional quantity. So we’ve adopted as a convention the mass M200 contained within an over-density of 200 times the critical density of the universe. This is a choice motivated by an ex-theory that would take an entire post to explain unsatisfactorily, so do not question the convention: all choices are bad, so we stick with it.

One of the long-standing problems the cold dark matter paradigm has is that the galaxy luminosity function should be steep but is observed to be shallow. This sketch shows the basic issue. The number density of dark matter halos as a function of mass is expected to be a power law – one that is well specified once the cosmology is known and a convention for the mass is adopted. The obvious expectation is that the galaxy luminosity function should just be a downshifted version of the halo mass function: one galaxy per halo, with the stellar mass proportional to the halo mass. This was such an obvious assumption [being provision (i) of canonical galaxy formation in LCDM] that it was not seriously questioned for over a decade. (Minor point: a turn down at the high mass end could be attributed to gas cooling times: the universe didn’t have time to cool and assemble a galaxy above some threshold mass, but smaller things had plenty of time for gas to cool and form stars.)

The number density of galaxies (blue) and dark matter halos (red) as a function of their mass. Our original expectation is on the left: the galaxy mass function should be a down-shifted version of the halo mass function, up to a gas cooling limit. Dashed grey lines illustrate the correspondence of galaxies with dark matter halos of proportional mass: M* = md M200. On the right is the current picture of abundance matching with the grey lines connecting galaxies with dark matter halos of equal cosmic density in which they are supposed to reside. In effect, we make the proportionality factor md a rolling, mass-dependent fudge factor.

The galaxy luminosity function does not look like a shifted version of the halo mass function. It has the wrong slope at the faint end. At no point is the size of the shift equal to what one would expect from the mass of available baryons. The proportionality factor md is too small; this is sometimes called the over-cooling problem, in that a lot more baryons should have cooled to form stars than apparently did so. So, aside from the shape and the normalization, it’s a great match.

We obsessed about this problem all through the ’90s. At one point, I thought I had solved it. Low surface brightness galaxies were under-represented in galaxy surveys. They weren’t missed entirely, but their masses could be systematically underestimated. This might matter a lot because the associated volume corrections are huge. A small systematic in mass would get magnified into a big one in density. Sadly, after a brief period of optimism, it became clear that this could not work to solve the entire problem, which persists.

Circa 2000, a local version of the problem became known as the missing satellites problem. This is a down-shifted version of the mismatch between the galaxy luminosity function and the halo mass function that pervades the entire universe: few small galaxies are observed where many are predicted. To give visual life to the numbers we’re talking about, here is an image of the dark matter in a simulation of a Milky Way size galaxy:

Dark Matter in the Via Lactea simulation (Diemand et al. 2008). The central region is the main dark matter halo which would contain a large galaxy like the Milky Way. All the lesser blobs are subhalos. A typical galaxy-sized dark matter halo should contain many, many subhalos. Naively, we expect each subhalo to contain a dwarf satellite galaxy. Structure is scale-free in CDM, so major galaxies should look like miniature clusters of galaxies.

In contrast, real galaxies have rather fewer satellites that meet the eye:

NGC 6946 and environs. The points are foreground stars, ignore them. The neighborhood of NGC 6946 appears to be pretty empty – there is no swarm of satellite galaxies as in the simulation above. I know of two dwarf satellite galaxies in this image, both of low surface brightness. The brighter one (KK98-250) the sharp-eyed may find between the bright stars at top right. The fainter one (KK98-251) is nearby KK98-250, a bit down and to the left of it; good luck seeing it on this image from the Digital Sky Survey. That’s it. There are no other satellite galaxies visible here. There can of course be more that are too low in surface brightness to detect. The obvious assumption of a one-to-one relation between stellar and halo mass cannot be sustained; there must instead be a highly non-linear relation between mass and light so that subhalos contain only contain dwarfs of extraordinarily low surface brightness.

By 2010, we’d thrown in the towel, and decided to just accept that this aspect of the universe was too complicated to predict. The story now is that feedback changes the shape of the luminosity function at both the faint and the bright ends. Exactly how depends on who you ask, but the predicted halo mass function is sacrosanct so there must be physical processes that make it so. (This is an example of the Frenk Principle in action.)

Lacking a predictive theory, theorists instead came up with a clever trick to relate galaxies to their dark matter halos. This has come to be known as abundance matching. We measure the number density of galaxies as a function of stellar mass. We know, from theory, what the number density of dark matter halos should be as a function of halo mass. Then we match them up: galaxies of a given density live in halos of the corresponding density, as illustrated by the horizontal gray lines in the right panel of the figure above.

There have now been a number of efforts to quantify this. Four examples are given in the figure below (see this paper for references), together with kinematic mass estimates.

The ratio of stellar to halo mass as a function of dark matter halo mass. Lines represent the abundance matching relations derived by assigning galaxies to dark matter halos based on their cosmic abundance. Points are independent halo mass estimates based on kinematics (McGaugh et al. 2010). The horizontal dashed line represents the maximum stellar mass that would result if all available baryons were turned into stars. (Mathematically, this happens when md equals the cosmic baryon fraction, about 15%.)

The abundance matching relations have a peak around a halo mass of 1012 M and fall off to either side. This corresponds to the knee in the galaxy luminosity function. For whatever reason, halos of this mass seem to be most efficient at converting their available baryons into stars. The shape of these relations mean that there is a non-linear relation between stellar mass and halo mass. At the low mass end, a big range in stellar mass is compressed into a small range in halo mass. The opposite happens at high mass, where the most massive galaxies are generally presumed to be the “central” galaxy of a cluster of galaxies. We assign the most massive halos to big galaxies understanding that they may be surrounded by many subhalos, each containing a cluster galaxy.

Around the same time, I made a similar plot, but using kinematic measurements to estimate halo masses. Both methods are fraught with potential systematics, but they seem to agree reasonably well – at least over the range illustrated above. It gets dodgy above and below that. The agreement is particularly good for lower mass galaxies. There seems to be a departure for the most massive individual galaxies, but why worry about that when the glass is 3/4 full?

Skip ahead a decade, and some people think we’ve solved the missing satellite problem. One key ingredient of that solution is that the Milky Way resides in a halo that is on the lower end of the mass range that has traditionally been estimated for it (1 to 2 x 1012 M). This helps because the number of subhalos scales with mass: clusters are big halos with lots of galaxy-size halos; the Milky Way is a galaxy-sized halo with lots of smaller subhalos. Reality does not look like that, but having a lower mass means fewer subhalos, so that helps. It does not suffice. We must invoke feedback effects to make the relation between light and mass nonlinear. Then the lowest mass satellites may be too dim to detect: selection effects have to do a lot of work. It also helps to assume the distribution of satellites is isotropic, which looks to be true in the simulation, but not so much in reality where known dwarf satellites occupy a planar distribution. We also need to somehow fudge the too-big-to-fail problem, in which the more massive subhalos appear not to be occupied by luminous galaxies at all. Given all that, we can kinda sorta get in the right ballpark. Kinda, sorta, provided that we live in a galaxy whose halo mass is closer to 1012 M than to 2 x 1012 M.

At an IAU meeting in Shanghai (in July 2019, before travel restrictions), the subject of the mass of the Milky Way was discussed at length. It being our home galaxy, there are many ways in which to constrain the mass, some of which take advantage of tracers that go out to greater distances than we can obtain elsewhere. Speaker after speaker used different methods to come to a similar conclusion, with the consensus hedging on the low side (roughly 1 – 1.5 x 1012 M). A nice consequence would be that the missing satellite problem may no longer be a problem.

Galaxies in general and the Milky Way in particular are different and largely distinct subfields. Different data studied by different people with distinctive cultures. In the discussion at the end of the session, Pieter van Dokkum pointed out that from the perspective of other galaxies, the halo mass ought to follow from abundance matching, which for a galaxy like the Milky Way ought to be more like 3 x 1012 M, considerably more than anyone had suggested, but hard to exclude because most of that mass could be at distances beyond the reach of the available tracers.

This was not well received.

The session was followed by a coffee break, and I happened to find myself standing in line next to Pieter. I was still processing his comment, and decided he was right – from a certain point of view. So we got to talking about it, and wound up making the plot below, which appears in a short research note. (For those who know the field, it might be assumed that Pieter and I hate each other. This is not true, but we do frequently disagree, so the fact that we do agree about this is itself worthy of note.)

The Local Group and its two most massive galaxies, the Milky Way and Andromeda (M31), in the stellar mass-halo mass plane. Lines are the abundance matching relations from above. See McGaugh & van Dokkum for further details. The remaining galaxies of the Local Group all fall off the edge of this plot, and do not add up to anything close to either the Milky Way or Andromeda alone.

The Milky Way and Andromeda are the 1012 M gorillas of the Local Group. There are many dozens of dwarf galaxies, but none of them are comparable in mass, even with the boost provided by the non-linear relation between mass and luminosity. To astronomical accuracy, in terms of mass, the Milky Way plus Andromeda are the Local Group. There are many distinct constraints, on each galaxy as an individual, and on the Local Group as a whole. Any way we slice it, all three entities lie well off the relation expected from abundance matching.

There are several ways one could take it from here. One might suppose that abundance matching is correct, and we have underestimated the mass with other measurements. This happens all the time with rotation curves, which typically do not extend far enough out into the halo to give a good constraint on the total mass. This is hard to maintain for the Local Group, where we have lots of tracers in the form of dwarf satellites, and there are constraints on the motions of galaxies on still larger scales. Moreover, a high mass would be tragic for the missing satellite problem.

One might instead imagine that there is some scatter in the abundance matching relation, and we just happen to live in a galaxy that has a somewhat low mass for its luminosity. This is almost reasonable for the Milky Way, as there is some overlap between kinematic mass estimates and the expectations of abundance matching. But the missing satellite problem bites again unless we are pretty far off the central value of the abundance matching relation. Other Milky Way-like galaxies ought to fall on the other end of the spectrum, with more mass and more satellites. A lot of work is going on to look for satellites around other spirals, which is hard work (see NGC 6946 above). There is certainly scatter in the number of satellites from system to system, but whether this is theoretically sensible or enough to explain our Milky Way is not yet apparent.

There is a tendency in the literature to invoke scatter when and where needed. Here, it is important to bear in mind that there is little scatter in the Tully-Fisher relation. This is a relation between stellar mass and rotation velocity, with the latter supposedly set by the halo mass. We can’t have it both ways. Lots of scatter in the stellar mass-halo mass relation ought to cause a corresponding amount of scatter in Tully-Fisher. This is not observed. It is a much stronger than most people seem to appreciate, as even subtle effects are readily perceptible. Consequently, I think it unlikely that we can nuance the relation between halo mass and observed rotation speed to satisfy both relations without a lot of fine-tuning, which is usually a sign that something is wrong.

There are a lot of moving parts in modern galaxy formation simulations that need to be fine-tuned: the effects of halo mass, merging, dissipation, [non]adiabatic compression, angular momentum transport, gas cooling, on-going accretion of gas from the intergalactic medium, expulsion of gas in galactic winds, re-accretion of expelled gas via galactic fountains, star formation and the ensuing feedback from radiation pressure, stellar winds, supernovae, X-rays from stellar remnants, active galactic nuclei, and undoubtedly other effects I don’t recall off the top of my head. Visualization from the Dr. Seuss suite of simulations.

A lot of effort has been put into beating down the missing satellite problem around the Milky Way. Matters are worse for Andromeda. Kinematic halo mass estimates are typically in the same ballpark as the Milky Way. Some are a bit bigger, some are lower. Lower is a surprise, because the stellar mass of M31 is clearly bigger than that of the Milky Way, placing it is above the turnover where the efficiency of star formation is maximized. In this regime, a little stellar mass goes a long way in terms of halo mass. Abundance matching predicts that a galaxy of Andromeda’s stellar mass should reside in a dark matter halo of at least 1013 M. That’s quite a bit more than 1 or 2 x 1012 M, even by astronomical standards. Put another way, according to abundance matching, the Local Group should have the Milky Way as its most massive occupant. Just the Milky Way. Not the Milky Way plus Andromeda. Despite this, the Local Group is not anomalous among similar groups.

Words matter. A lot boils down to what we consider to be “close enough” to call similar. I do not consider the Milky Way and Andromeda to be all that similar. They are both giant spirals, yes, but galaxies are all individuals. Being composed of hundreds of billions of stars, give or take, leaves a lot of room for differences. In this case, the Milky Way and Andromeda are easily distinguished in the Tully-Fisher plane. Andromeda is about twice the baryonic mass of the Milky Way. It also rotates faster. The error bars on these quantities do not come close to overlapping – that would be one criterion for considering them to be similar – a criterion they do not meet. Even then, there could be other features that might be readily distinguished, but let’s say a rough equality in the Tully-Fisher plane would indicate stellar and halo masses that are “close enough” for our present discussion. They aren’t: to me, the Milky Way and M31 are clearly different galaxies.

I spent a fair amount of time reading the recent literature on satellites searches, and I was struck by the ubiquity with which people make the opposite assumption, treating the Milky Way and Andromeda as interchangeable galaxies of similar mass. Why would they do this? If one looks at the kinematic halo mass as the defining characteristic of a galaxy, they’re both close to 1012 M, with overlapping error bars on M200. By that standard, it seems fair. Is it?

Luminosity is observable. Rotation speed is observable. There are arguments to be had about how to convert luminosity into stellar mass, and what rotation speed measure is “best.” These are sometimes big arguments, but they are tiny in scale compared to estimating notional quantities like the halo mass. The mass M200 is not an observable quantity. As such, we have no business using it as a defining characteristic of a galaxy. You know a galaxy when you see it. The same cannot be said of a dark matter halo. Literally.

If, for some theoretically motivated reason, we want to use halo mass as a standard then we need to at least use a consistent method to assess its value from directly observable quantities. The methods we use for the Milky Way and M31 are not applicable beyond the Local Group. Nowhere else in the universe do we have such an intimate picture of the kinematic mass from a wide array of independent methods with tracers extending to such large radii. There are other standards we could apply, like the Tully-Fisher relation. That we can do outside the Local Group, but by that standard we would not infer that M31 and the Milky Way are the same. Other observables we can fairly apply to other galaxies are their luminosities (stellar masses) and cosmic number densities (abundance matching). From that perspective, what we know from all the other galaxies in the universe is that the factor of ~2 difference in stellar mass between Andromeda and the Milky Way should be huge in terms of halo mass. If it were anywhere else in the universe, we wouldn’t treat these two galaxies as interchangeably equal. This is the essence of Pieter’s insight: abundance matching is all about the abundance of dark matter halos, so that would seem to be the appropriate metric by which to predict the expected number of satellites, not the kinematic halo mass that we can’t measure in the same way anywhere else in the universe.

That isn’t to say we don’t have some handle on kinematic halo masses, it’s just that most of that information comes from rotation curves that don’t typically extend as far as the tracers that we have in the Local Group. Some rotation curves are more extended than others, so one has to account for that variation. Typically, we can only put a lower limit on the halo mass, but if we assume a profile like NFW – the standard thing to do in LCDM, then we can sometimes exclude halos that are too massive.

Abundance matching has become important enough to LCDM that we included it as a prior in fitting dark matter halo models to rotation curves. For example:

The stellar mass-halo mass relation from rotation curve fits (Li et al 2020). Each point is one galaxy; the expected abundance matching relation (line) is not recovered (left) unless it is imposed as a prior (right). The data are generally OK with this because the amount of mass at radii beyond the end of the rotation curve is not strongly constrained. Still, there are some limits on how crazy this can get.

NFW halos are self-similar: low mass halos look very much like high mass halos over the range that is constrained by data. Consequently, if you have some idea what the total mass of the halo should be, as abundance matching provides, and you impose that as a prior, the fits for most galaxies say “OK.” The data covering the visible galaxy have little power to constrain what is going on with the dark matter halo at much larger radii, so the fits literally fall into line when told to do so, as seen in Pengfei‘s work.

That we can impose abundance matching as a prior does not necessarily mean the result is reasonable. The highest halo masses that abundance matching wants in the plot above are crazy talk from a kinematic perspective. I didn’t put too much stock in this, as the NFW halo itself, the go-to standard of LCDM, provides the worst description of the data among all the dozen or so halo models that we considered. Still, we did notice that even with abundance matching imposed as a prior, there are a lot more points above the line than below it at the high mass end (above the bend in the figure above). The rotation curves are sometimes pushing back against the imposed prior; they often don’t want such a high halo mass. This was explored in some detail by Posti et al., who found a similar effect.

I decided to turn the question around. Can we use abundance matching to predict the halo and hence rotation curve of a massive galaxy? The largest spiral in the local universe, UGC 2885, has one of the most extended rotation curves known, meaning that it does provide some constraint on the halo mass. This galaxy has been known as an important case since Vera Rubin’s work in the ’70s. With a modern distance scale, its rotation curve extends out 80 kpc. That’s over a quarter million light-years – a damn long way, even by the standards of galaxies. It also rotates remarkably fast, just shy of 300 km/s. It is big and massive.

(As an aside, Vera once offered a prize for anyone who found a disk that rotated faster than 300 km/s. Throughout her years of looking at hundreds of galaxies, UGC 2885 remained the record holder, with 300 seeming to be a threshold that spirals did not exceed. She told me that she did pay out, but on a technicality: someone showed her a gas disk around a supermassive black hole in Keplerian rotation that went up to 500 km/s at its peak. She lamented that she had been imprecise in her language, as that was nothing like what she meant, which was the flat rotation speed of a spiral galaxy.)

That aside aside, if we take abundance matching at face value, then the stellar mass of a galaxy predicts the mass of its dark matter halo. Using the most conservative (in that it returns the lowest halo mass) of the various abundance matching relations indicates that with a stellar mass of about 2 x 1011 M, UGC 2885 should have a halo mass of 3 x 1013 M. Combining this with a well-known relation between halo concentration and mass for NFW halos, we then know what the rotation curve should be. Doing this for UGC 2885 yields a tragic result:

The extended rotation curve of UGC 2885 (points). The declining dotted line is the rotation curve predicted by the observed stars and gas. The rising dashed line is the halo predicted by abundance matching. Combining this halo with the observed stars and gas should result in the solid line. This greatly exceeds the data. UGC 2885 does not reside in an NFW halo that is anywhere near as massive as predicted by abundance matching.

The data do not allow for the predicted amount of dark matter. If we fit the rotation curve, we obtain a “mere” M200 = 5 x 1012 M. Note that this means that UGC 2885 is basically the Milky Way and Andromeda added together in terms of both stellar mass and halo mass – if added to the M*-M200 plot above, it would land very close to the open circle representing the more massive halo estimate for the combination of MW+M31, and be just as discrepant from the abundance matching relations. We get the same result regardless of which direction we look at it from.

Objectively, 5 x 1012 M is a huge dark matter halo for a single galaxy. It’s just not the yet-more massive halo that is predicted by abundance matching. In this context, UGC 2885 apparently has a serious missing satellites problem, as it does not appear to be swimming in a sea of satellite galaxies the way we’d expect for the central galaxy of such high mass halo.

UGC 2885 appears to be pretty lonely in this image from the DSS. I see a few candidate satellite galaxies amidst the numerous foreground stars, but nothing like what you’d expect for dark matter subhalos from a simulation like the via Lactea. This impression does not change when imaged in more detail with HST.

It is tempting to write this off as a curious anecdote. Another outlier. Sure, that’s always possible, but this is more than a bit ridiculous. Anyone who wants to go this route I refer to Snoop Dog.

I spent much of my early career obsessed with selection effects. These preclude us from seeing low surface brightness galaxies as readily as brighter ones. However, it isn’t binary – a galaxy has to be extraordinarily low surface brightness before it becomes effectively invisible. The selection effect is a bias – and a very strong one – but not an absolute screen that prevents us from finding low surface brightness galaxies. That makes it very hard to sustain the popular notion that there are lots of subhalos that simply contain ultradiffuse galaxies that cannot currently be seen. I’ve been down this road many times as an optimist in favor of this interpretation. It hasn’t worked out. Selection effects are huge, but still nowhere near big enough to overcome the required deficit.

Having the satellite galaxies that inhabit subhalos be low in surface brightness is a necessary but not sufficient criterion. It is also necessary to have a highly non-linear stellar mass-halo mass relation at low mass. In effect, luminosity and halo mass become decoupled: satellite galaxies spanning a vast range in luminosity must live in dark matter halos that cover only a tiny range. This means that it should not be possible to predict stellar motions in these galaxies from their luminosity. The relation between mass and light has just become too weak and messy.

And yet, we can do exactly that. Over and over again. This simply should not be possible in LCDM.

The Fat One – a test of structure formation with the most massive cluster of galaxies

The Fat One – a test of structure formation with the most massive cluster of galaxies

A common objection to MOND is that it does not entirely reconcile the mass discrepancy in clusters of galaxies. This can be seen as an offset in the acceleration scale between individual galaxies and clusters. This is widely seen as definitive proof of dark matter, but this is just defaulting to our confirmation bias without checking if it is really any better: just because MOND does something wrong doesn’t automatically mean that LCDM does it right.

The characteristic acceleration (in units of Milgrom’s constant a0) of extragalactic objects as a function of their baryonic mass, ranging from tiny dwarf galaxies to giant clusters of galaxies. Clusters are offset from individual galaxies, implying a residual missing mass problem for MOND. From Famaey & McGaugh (2012).

I do see clusters as a problem for MOND, and there are some aspects of clusters that make good sense in LCDM. Unlike galaxies, cluster mass profiles are generally consistent with the predicted NFW halos (modulo their own core problem). That’s not a contradiction to MOND, which should do the same thing as Newton in the Newtonian regime. But rich clusters also have baryon fractions close to that expected from cosmology. From that perspective, it looks pretty reasonable. This success does not extend to lower mass clusters; in the plot above, the low mass green triangles should be higher than the higher mass gray triangles in order for all clusters to have the cosmic baryon fraction. They should not parallel the prediction of MOND. Within individual clusters, baryons are not as well mixed with dark matter as expected: they tend to have too much unseen mass at small radius, which is basically the same problem encountered by MOND.

There are other tests, one of which is the growth of clusters. Structure is predicted to form hierarchically in LCDM: small objects form first, and pile on to make bigger ones, with the largest clusters being the last to form. So there is a test in how massive a cluster can get as a function of redshift. This is something for which LCDM makes a clear prediction. In MOND, my expectation is that structure forms faster so that massive objects are in place at higher redshift than expected in LCDM. This post is mostly about clusters in LCDM, so henceforth all masses will be conventional masses, including the putative dark matter.

Like so many things, there is a long history to this. For example, in the late ’90s, Megan Donahue reported a high temperature of ~ 12 keV for the intracluster gas in the cluster MS1054-0321. This meant that it was massive for its redshift: 7.4 x 1014 h-1 M (dark matter and all) at z = 0.829, when the universe was only about half its current age. (Little h is the Hubble constant in units of 100 km/s/Mpc. Since we’re now pretty sure h < 1, the true mass is higher, more like 1015 M.) That’s a lot of solar masses to assemble in the available time. In 1997, this was another nail in the coffin of SCDM, which was already a zombie theory by then. But the loss of Ωm = 1 was still raw for some people, I guess, because she got a lot of grief for it. Can’t be true! Clusters don’t get that big that early! At least they shouldn’t. In SCDM.

Structure formation in SCDM was elegant in that in continues perpetually: as the universe expands, bigger and bigger structures continue to form; statistically, later epochs look like scaled-up versions of earlier epochs. In LCDM, this symmetry is broken by the decline in density as the universe expands. Consequently, structure forms earlier in LCDM: the action has to happen when there is still some density to work with, and the accelerated expansion provides some extra time (what’s a few billion years among cosmologists?) for mass to get together. Consequently, MS1054-0321 is not problematic in LCDM.

The attitude persisted, however. In the mid-’00s, Jim Schombert and I started using the wide field near-IR camera NEWFIRM to study high redshift clusters. Jim had a clever way of identifying them, which turned out not to be particularly hard, e.g., MS 1426.9+1052 at z = 1.83. This is about 10 Gyr ago, and made the theorists squirm. That didn’t leave enough time for a cluster to form. On multiple occasions I had the following conversation with different theorists:

me: Hey, look at this clusters at z = 1.8.

theorist: That isn’t a cluster.

me: Sure it is. There’s the central galaxy, which contains a bright radio source (QSO). You can see lots of other galaxies around it. That’s what a cluster looks like.

theorist: Must be a chance projection.

me: There are spectra for many of the surrounding galaxies; they’re all at the same redshift.

theorist: …

me: So… a cluster at z = 1.8. Pretty cool, huh?

theorist: That isn’t a cluster.

This work became part of Jay Frank’s thesis. He found evidence for more structure at even higher redshift. A lot of this apparent clustering probably is not real… the statistics get worse as you push farther out: fewer galaxies, worse data. But there were still a surprising number of objects in apparent association up to and beyond z > 5. That’s pretty much all of time, leaving a mere Gyr to go from the completely homogeneous universe that we see in the CMB at z = 1090 to the first stars around z ~ 20 to the first galaxies to big galaxies to protoclusters – or whatever we want to call these associations of many galaxies in the same place on the sky at the same redshift.

Jay did a lot of work to estimate the rate of false positives. Long story short, we expect about 1/3 of the protoclusters he identified to be real structures. That’s both bad and good – lots of chaff, but some wheat too. One thing Jay did was to analyze the Millennium simulation in the same way as the data. This allows us to quantify what we would see if the universe looked like an LCDM simulation.

The plot below shows the characteristic brightness of galaxies at various redshifts. For the pros, this is the knee in the Schechter function fit to the luminosity distribution of galaxies in redshift bins. We saw the same thing in protoclusters and in the field: galaxies were brighter than anticipated in the simulation. Between redshifts 3 < z < 4, the characteristic magnitude is expected to be 23. That’s pretty faint. In the data, it’s more like 21. That’s also faint, but about a factor of 6 brighter than they should be. That’s a lot of stars that have formed before they’re supposed to, in galaxies that are bigger than they should yet be, with some of them already clustering together ahead of their time.

The characteristic magnitude of galaxies in the Spitzer 4.5 micron band as a function of redshift in the Millennium simulation (black squares) and in reality (circles). This is a classic backwards astronomical plot in which larger magnitudes are fainter sources. At high redshift, simulations predict that galaxies should not yet have grown to become as bright as they are observed to be. From Franck (2017).

This has been the observer’s experience. Donahue wasn’t the first, and Franck won’t be the last. Every time we look, we see more structure in place sooner than had been expected before it was seen. I don’t hear people complaining about our clusters at z = 1.8 anymore; those have been seen enough to become normalized. Perhaps they have even been explained satisfactorily. But they sure weren’t expected, much less predicted.

So, just how big can a cluster get? Mortonson et al. (2011) set out to answer this question. The graph below shows the upper limit they predict for the most massive cluster in the universe as a function of redshift. This declines as redshift increases because we’re looking back in time; high redshift clusters haven’t had time time to assemble more mass than the upper most line. They project this into what would be discovered in an all-sky survey, and more realistic surveys of finite size. Basically nothing should exist above these lines.

The predicted maximum mass of galaxy clusters as a function of redshift from Mortonson et al. (2011). Each line is the predicted upper limit for the corresponding amount of sky surveyed. The green line illustrates the area of the sky in which El Gordo was discovered. The points show independent mass estimates for El Gordo from Menanteau et al. (2012) and Jee et al. (2014). These are significantly above the predicted upper limit.

Their prediction was almost immediately put to the test by the discovery of El Gordo, a big fat cluster at z = 0.87 reported by Menanteau et al. (2012), who published the X-ray image above. It is currently the record holder for the most massive known object that is thought to be gravitationally bound, weighing in at 2 or 3 x 1015 M, depending on who you ask. That’s about a thousand Milky Ways, plus a few hundred Andromedas. Give or take.

El Gordo straddles the uppermost line in the graph above. A naive reading of the first mass estimate suggests that it’s roughly a 50/50 proposition whether the entire observable universe should contain exactly one El Gordo. However, El Gordo was discovered in something less than a full sky survey. The appropriate comparison is to the green line, which El Gordo clearly exceeds – by about 3 sigma. This is the case for both of the illustrated mass estimates as the high mass point has a larger error bar. They both exceed the green line by a hair less than 3 sigma. Formally, this means that the chance of finding El Gordo in our universe is only a few percent.

A few percent is not good. Neither is it terrible – I’ve often commented here on how the uncertainties are larger than they seem. This is especially true of the tails of the distribution. So maybe a few percent is pessimistic; sometimes that’s how the dice roll. On the other hand, the odds aren’t better than 10%: El Gordo is not likely to exist however we slice the uncertainties. Whether we should be worried about it is just a matter of how surprising it is. A similar situation arises with the collision velocity of the Bullet cluster, which is either absurdly unlikely (about 1 chance in 10 billion) or merely unusual (maybe 1 in 10). So I made the above plot by adding El Gordo to the predictions of Mortonson et al., and filed it away under

Recently, Elena Asencio, Indranil Banik, and Pavel Kroupa have made a more thorough study. They have their own blog post, so I won’t repeat the technical description. Basically, they sift through a really big LCDM simulation to find objects that could be (or become) like El Gordo.

The short answer is that it doesn’t happen, similar to big voids. They estimate that the odds of El Gordo existing are a bit less than one in a billion. I’m sure one can quibble with details, but we’re not going to save LCDM with factors of two in a probability that starts this low. El Gordo just shouldn’t exist.

The probability is lower than in the graph above because it isn’t just a matter of mass. It is also the mass ratio of the merging clumps (both huge clusters in their own right), their collision speed, impact parameter, and morphology. As they are aware, one must be careful not to demand a perfect match, since there is only one reality. But neither is it just a matter of assembling mass; that understates the severity of the problem. This is where simulations are genuinely helpful: one can ask how often does this happen? If the answer is never, one can refine the query to be more lenient. The bottom line here is that you can’t be lenient enough to get something like El Gordo.

Here is their money plot. To be like El Gordo, an object would have to be up on the red line. That’s well above 5 sigma, which is the threshold where we traditionally stop quibbling about percentiles and just say Nope. Not an accident.

Logarithmic mass as a function of expansion factor [how big the universe is. This is inversely related to redshift: a = 1/(1+z)]. The color scale gives the number density of objects of a given mass as a function of how far the universe has expanded. The solid lines show the corresponding odds (in sigma) of finding such a thing in a large LCDM simulation. Figure from Asencio et al (2020).

In principle, this one object falsifies the LCDM structure formation paradigm. We are reluctant to put too much emphasis on a single object (unless it is the bullet cluster and we have clickbait to sell) as its a big universe, so there can always be one unicorn or magnetic monopole somewhere. Ascencio et al note that a similar constraint follows for the Bullet cluster itself, which also should not exist, albeit at a lower significance. That’s two unicorns: we can’t pretend that this is a one-off occurrence. The joint probability of living in a universe with both El Gordo and the Bullet cluster is even lower than either alone.

Looking at Ascencio’s figure, it strikes me as odd not only that we find huge things at high redshift, but also that we don’t see still bigger objects at low redshift. There were already these huge clusters ramming into each other when the universe had only expanded to half its present size. This process should continue to build still bigger clusters, as indicated by the lines in the plot. The sweet spot for finding really massive clusters should be about z = 0.5, by which time they could have reached a mass of nearly 1016 M as readily (or not!) as El Gordo could reach its mass by its observed redshift. (The lines turn down for the largest expansion factors/lowest redshifts because surveys cover a fixed area on the sky, which is a conical volume in 3D. We reside at the point of the cone, and need to see a ways out before a volume large enough to contain a giant cluster has been covered.)

I have never heard a report of a cluster anywhere near to 1016 M. A big cluster is 1015 M. While multiple examples of clusters this big are known, to the best of my knowledge, El Gordo is the record holding Fat One at twice or thrice that. The nearest challenger I can readily find is RX J1347.5-1145 at z=0.451 (close to the survey sweet spot) weighing in at 2 x 1015 M. Clusters just don’t seem to get bigger than that. This mass is OK at low redshift, but at higher z we shouldn’t see things as big as El Gordo. Given that we do see them at z = 0.87 (a = 0.535), why don’t we see still bigger ones at lower redshift? Perhaps structure formation saturates, but that’s not what LCDM predicts. If we can somehow explain El Gordo at high z, we are implicitly predicting still bigger clusters at lower redshift – objects we have yet to discover, if they exist, which they shouldn’t.

Which is the point.

The image featured at top is an X-ray image of the hot gas in the intracluster medium of El Gordo from NASA/CXC/Rutgers/J. Hughes et al.

Big Trouble in a Deep Void

Big Trouble in a Deep Void

The following is a guest post by Indranil Banik, Moritz Haslbauer, and Pavel Kroupa (bios at end) based on their new paper

Modifying gravity to save cosmology

Cosmology is currently in a major crisis because of many severe tensions, the most serious and well-known being that local observations of how quickly the Universe is expanding (the so-called ‘Hubble constant’) exceed the prediction of the standard cosmological model, ΛCDM. This prediction is based on the cosmic microwave background (CMB), the most ancient light we can observe – which is generally thought to have been emitted about 400,000 years after the Big Bang. For ΛCDM to fit the pattern of fluctuations observed in the CMB by the Planck satellite and other experiments, the Hubble constant must have a particular value of 67.4 ± 0.5 km/s/Mpc. Local measurements are nearly all above this ‘Planck value’, but are consistent with each other. In our paper, we use a local value of 73.8 ± 1.1 km/s/Mpc using a combination of supernovae and gravitationally lensed quasars, two particularly precise yet independent techniques.

This unexpectedly rapid local expansion of the Universe could be due to us residing in a huge underdense region, or void. However, a void wide and deep enough to explain the Hubble tension is not possible in ΛCDM, which is built on Einstein’s theory of gravity, General Relativity. Still, there is quite strong evidence that we are indeed living within a large void with a radius of about 300 Mpc, or one billion light years. This evidence comes from many surveys covering the whole electromagnetic spectrum, from radio to X-rays. The most compelling evidence comes from analysis of galaxy number counts in the near-infrared, giving the void its name of the Keenan-Barger-Cowie (KBC) void. Gravity from matter outside the void would pull more than matter inside it, making the Universe appear to expand faster than it actually is for an observer inside the void. This ‘Hubble bubble’ scenario (depicted in Figure 1) could solve the Hubble tension, a possibility considered – and rejected – in several previous works (e.g. Kenworthy+ 2019). We will return to their objections against this idea.

Figure 1: Illustration of the Universe’s large scale structure. The darker regions are voids, and the bright dots represent galaxies. The arrows show how gravity from surrounding denser regions pulls outwards on galaxies in a void. If we were living in such a void (as indicated by the yellow star), the Universe would expand faster locally than it does on average. This could explain the Hubble tension. Credit: Technology Review

One of the main objections seemed to be that since such a large and deep void is incompatible with ΛCDM, it can’t exist. This is a common way of thinking, but the problem with it was clear to us from a very early stage. The first part of this logic is sound – assuming General Relativity, a hot Big Bang, and that the state of the Universe at early times is apparent in the CMB (i.e. it was flat and almost homogeneous then), we are led to the standard flat ΛCDM model. By studying the largest suitable simulation of this model (called MXXL), we found that it should be completely impossible to find ourselves inside a void with the observed size and depth (or fractional underdensity) of the KBC void – this possibility can be rejected with more confidence than the discovery of the Higgs boson when first announced. We therefore applied one of the leading alternative gravity theories called Milgromian Dynamics (MOND), a controversial idea developed in the early 1980s by Israeli physicist Mordehai Milgrom. We used MOND (explained in a simple way here) to evolve a small density fluctuation forwards from early times, studying if 13 billion years later it fits the density and velocity field of the local Universe. Before describing our results, we briefly introduce MOND and explain how to use it in a potentially viable cosmological framework. Astronomers often assume MOND cannot be extended to cosmological scales (typically >10 Mpc), which is probably true without some auxiliary assumptions. This is also the case for General Relativity, though in that case the scale where auxiliary assumptions become crucial is only a few kpc, namely in galaxies.

MOND was originally designed to explain why galaxies rotate faster in their outskirts than they should if one applies General Relativity to their luminous matter distribution. This discrepancy gave rise to the idea of dark matter halos around individual galaxies. For dark matter to cluster on such scales, it would have to be ‘cold’, or equivalently consist of rather heavy particles (above a few thousand eV/c2, or a millionth of a proton mass). Any lighter and the gravity from galaxies could not hold on to the dark matter. MOND assumes these speculative and unexplained cold dark matter haloes do not exist – the need for them is after all dependent on the validity of General Relativity. In MOND once the gravity from any object gets down to a certain very low threshold called a0, it declines more gradually with increasing distance, following an inverse distance law instead of the usual inverse square law. MOND has successfully predicted many galaxy rotation curves, highlighting some remarkable correlations with their visible mass. This is unexpected if they mostly consist of invisible dark matter with quite different properties to visible mass. The Local Group satellite galaxy planes also strongly favour MOND over ΛCDM, as explained using the logic of Figure 2 and in this YouTube video.

Figure 2: the satellite galaxies of the Milky Way and Andromeda mostly lie within thin planes. These are difficult to form unless the galaxies in them are tidal dwarfs born from the interaction of two major galaxies. Since tidal dwarfs should be free of dark matter due to the way they form, the satellites in the satellite planes should have rather weak self-gravity in ΛCDM. This is not the case as measured from their high internal velocity dispersions. So the extra gravity needed to hold galaxies together should not come from dark matter that can in principle be separated from the visible.

To extend MOND to cosmology, we used what we call the νHDM framework (with ν pronounced “nu”), originally proposed by Angus (2009). In this model, the cold dark matter of ΛCDM is replaced by the same total mass in sterile neutrinos with a mass of only 11 eV/c2, almost a billion times lighter than a proton. Their low mass means they would not clump together in galaxies, consistent with the original idea of MOND to explain galaxies with only their visible mass. This makes the extra collisionless matter ‘hot’, hence the name of the model. But this collisionless matter would exist inside galaxy clusters, helping to explain unusual configurations like the Bullet Cluster and the unexpectedly strong gravity (even in MOND) in quieter clusters. Considering the universe as a whole, νHDM has the same overall matter content as ΛCDM. This makes the overall expansion history of the universe very similar in both models, so both can explain the amounts of deuterium and helium produced in the first few minutes after the Big Bang. They should also yield similar fluctuations in the CMB because both models contain the same amount of dark matter. These fluctuations would get somewhat blurred by sterile neutrinos of such a low mass due to their rather fast motion in the early Universe. However, it has been demonstrated that Planck data are consistent with dark matter particles more massive than 10 eV/c2. Crucially, we showed that the density fluctuations evident in the CMB typically yield a gravitational field strength of 21 a0 (correcting an earlier erroneous estimate of 570 a0 in the above paper), making the gravitational physics nearly identical to General Relativity. Clearly, the main lines of early Universe evidence used to argue in favour of ΛCDM are not sufficiently unique to distinguish it from νHDM (Angus 2009).

The models nonetheless behave very differently later on. We estimated that for redshifts below about 50 (when the Universe is older than about 50 million years), the gravity would typically fall below a0 thanks to the expansion of the Universe (the CMB comes from a redshift of 1100). After this ‘MOND moment’, both the ordinary matter and the sterile neutrinos would clump on large scales just like in ΛCDM, but there would also be the extra gravity from MOND. This would cause structures to grow much faster (Figure 3), allowing much wider and deeper voids.

Figure 3: Evolution of the density contrast within a 300 co-moving Mpc sphere in different Newtonian (red) and MOND (blue) models, shown as a function of the Universe’s size relative to its present size (this changes almost linearly with time). Notice the much faster structure growth in MOND. The solid blue line uses a time-independent external field on the void, while the dot-dashed blue line shows the effect of a stronger external field in the past. This requires a deeper initial void to match present-day observations.

We used this basic framework to set up a dynamical model of the void. By making various approximations and trying different initial density profiles, we were able to simultaneously fit the apparent local Hubble constant, the observed density profile of the KBC void, and many other observables like the acceleration parameter, which we come to below. We also confirmed previous results that the same observables rule out standard cosmology at 7.09σ significance. This is much more than the typical threshold of 5σ used to claim a discovery in cases like the Higgs boson, where the results agree with prior expectations.

One objection to our model was that a large local void would cause the apparent expansion of the Universe to accelerate at late times. Equivalently, observations that go beyond the void should see a standard Planck cosmology, leading to a step-like behaviour near the void edge. At stake is the so-called acceleration parameter q0 (which we defined oppositely to convention to correct a historical error). In ΛCDM, we expect q0 = 0.55, while in general much higher values are expected in a Hubble bubble scenario. The objection of Kenworthy+ (2019) was that since the observed q0 is close to 0.55, there is no room for a void. However, their data analysis fixed q0 to the ΛCDM expectation, thereby removing any hope of discovering a deviation that might be caused by a local void. Other analyses (e.g. Camarena & Marra 2020b) which do not make such a theory-motivated assumption find q0 = 1.08, which is quite consistent with our best-fitting model (Figure 4). We also discussed other objections to a large local void, for instance the Wu & Huterer (2017) paper which did not consider a sufficiently large void, forcing the authors to consider a much deeper void to try and solve the Hubble tension. This led to some serious observational inconsistencies, but a larger and shallower void like the observed KBC void seems to explain the data nicely. In fact, combining all the constraints we applied to our model, the overall tension is only 2.53σ, meaning the data have a 1.14% chance of arising if ours were the correct model. The actual observations are thus not the most likely consequence of our model, but could plausibly arise if it were correct. Given also the high likelihood that some if not all of the observational errors we took from publications are underestimates, this is actually a very good level of consistency.

Figure 4: The predicted local Hubble constant (x-axis) and acceleration parameter (y-axis) as measured with local supernovae (black dot, with red error ellipses). Our best-fitting models with different initial void density profiles (blue symbols) can easily explain the observations. However, there is significant tension with the prediction of ΛCDM based on parameters needed to fit Planck observations of the CMB (green dot). In particular, local observations favour a higher acceleration parameter, suggestive of a local void.

Unlike other attempts to solve the Hubble tension, ours is unique in using an already existing theory (MOND) developed for a different reason (galaxy rotation curves). The use of unseen collisionless matter made of hypothetical sterile neutrinos is still required to explain the properties of galaxy clusters, which otherwise do not sit well with MOND. In addition, these neutrinos provide an easy way to explain the CMB and background expansion history, though recently Skordis & Zlosnik (2020) showed that this is possible in MOND with only ordinary matter. In any case, MOND is a theory of gravity, while dark matter is a hypothesis that more matter exists than meets the eye. The ideas could both be right, and should be tested separately.

A dark matter-MOND hybrid thus appears to be a very promising way to resolve the current crisis in cosmology. Still, more work is required to construct a fully-fledged relativistic MOND theory capable of addressing cosmology. This could build on the theory proposed by Skordis & Zlosnik (2019) in which gravitational waves travel at the speed of light, which was considered to be a major difficulty for MOND. We argued that such a theory would enhance structure formation to the required extent under a wide range of plausible theoretical assumptions, but this needs to be shown explicitly starting from a relativistic MOND theory. Cosmological structure formation simulations are certainly required in this scenario – these are currently under way in Bonn. Further observations would also help greatly, especially of the matter density in the outskirts of the KBC void at distances of about 500 Mpc. This could hold vital clues to how quickly the void has grown, helping to pin down the behaviour of the sought-after MOND theory.

There is now a very real prospect of obtaining a single theory that works across all astronomical scales, from the tiniest dwarf galaxies up to the largest structures in the Universe & its overall expansion rate, and from a few seconds after the birth of the Universe until today. Rather than argue whether this theory looks more like MOND or standard cosmology, what we should really do is combine the best elements of both, paying careful attention to all observations.


Indranil Banik is a Humboldt postdoctoral fellow in the Helmholtz Institute for Radiation and Nuclear Physics (HISKP) at the University of Bonn, Germany. He did his undergraduate and masters at Trinity College, Cambridge, and his PhD at Saint Andrews under Hongsheng Zhao. His research focuses on testing whether gravity continues to follow the Newtonian inverse square law at the low accelerations typical of galactic outskirts, with MOND being the best-developed alternative.

Moritz Haslbauer is a PhD student at the Max Planck Institute for Radio Astronomy (MPIfR) in Bonn. He obtained his undergraduate degree from the University of Vienna and his masters from the University of Bonn. He works on the formation and evolution of galaxies and their distribution in the local Universe in order to test different cosmological models and gravitational theories. Prof. Pavel Kroupa is his PhD supervisor.

Pavel Kroupa is a professor at the University of Bonn and professorem hospitem at Charles University in Prague. He went to school in Germany and South Africa, studied physics in Perth, Australia, and obtained his PhD at Trinity College, Cambridge, UK. He researches stellar populations and their dynamics as well as the dark matter problem, therewith testing gravitational theories and cosmological models.

Link to the published science paper.

YouTube video on the paper


Indranil Banik’s YouTube channel.

Cosmology, then and now

Cosmology, then and now

I have been busy teaching cosmology this semester. When I started on the faculty of the University of Maryland in 1998, there was no advanced course on the subject. This seemed like an obvious hole to fill, so I developed one. I remember with fond bemusement the senior faculty, many of them planetary scientists, sending Mike A’Hearn as a stately ambassador to politely inquire if cosmology had evolved beyond a dodgy subject and was now rigorous enough to be worthy of a 3 credit graduate course.

Back then, we used transparencies or wrote on the board. It was novel to have a course web page. I still have those notes, and marvel at the breadth and depth of work performed by my younger self. Now that I’m teaching it for the first time in a decade, I find it challenging to keep up. Everything has to be adapted to an electronic format, and be delivered remotely during this damnable pandemic. It is a less satisfactory experience, and it has precluded posting much here.

Another thing I notice is that attitudes have evolved along with the subject. The baseline cosmology, LCDM, has not changed much. We’ve tilted the power spectrum and spiked it with extra baryons, but the basic picture is that which emerged from the application of classical observational cosmology – measurements of the Hubble constant, the mass density, the ages of the oldest stars, the abundances of the light elements, number counts of faint galaxies, and a wealth of other observational constraints built up over decades of effort. Here is an example of combining such constraints, and exercise I have students do every time I teach the course:

Observational constraints in the mass density-Hubble constant plane assembled by students in my cosmology course in 2002. The gray area is excluded. The open window is the only space allowed; this is LCDM. The box represents the first WMAP estimate in 2003. CMB estimates have subsequently migrated out of the allowed region to lower H0 and higher mass density, but the other constraints have not changed much, most famously H0, which remains entrenched in the low to mid-70s.

These things were known by the mid-90s. Nowadays, people seem to think Type Ia SN discovered Lambda, when really they were just icing on a cake that was already baked. The location of the first peak in the acoustic power spectrum of the microwave background was corroborative of the flat geometry required by the picture that had developed, but trailed the development of LCDM rather than informing its construction. But students entering the field now seem to have been given the impression that these were the only observations that mattered.

Worse, they seem to think these things are Known, as if there’s never been a time that we cosmologists have been sure about something only to find later that we had it quite wrong. This attitude is deleterious to the progress of science, as it precludes us from seeing important clues when they fail to conform to our preconceptions. To give one recent example, everyone seems to have decided that the EDGES observation of 21 cm absorption during the dark ages is wrong. The reason? Because it is impossible in LCDM. There are technical reasons why it might be wrong, but these are subsidiary to Attitude: we can’t believe it’s true, so we don’t. But that’s what makes a result important: something that makes us reexamine how we perceive the universe. If we’re unwilling to do that, we’re no longer doing science.

A Significant Theoretical Advance

A Significant Theoretical Advance

The missing mass problem has been with us many decades now. Going on a century if you start counting from the work of Oort and Zwicky in the 1930s. Not quite a half a century if we date it from the 1970s when most of the relevant scientific community started to take it seriously. Either way, that’s a very long time for a major problem to go unsolved in physics. The quantum revolution that overturned our classical view of physics was lightning fast in comparison – see the discussion of Bohr’s theory in the foundation of quantum mechanics in David Merritt’s new book.

To this day, despite tremendous efforts, we have yet to obtain a confirmed laboratory detection of a viable dark matter particle – or even a hint of persuasive evidence for the physics beyond the Standard Model of Particle Physics (e.g., supersymmetry) that would be required to enable the existence of such particles. We cannot credibly claim (as many of my colleagues insist they can) to know that such invisible mass exists. All we really know is that there is a discrepancy between what we see and what we get: the universe and the galaxies within it cannot be explained by General Relativity and the known stable of Standard Model particles.

If we assume that General Relativity is both correct and sufficient to explain the universe, which seems like a very excellent assumption, then we are indeed obliged to invoke non-baryonic dark matter. The amount of astronomical evidence that points in this direction is overwhelming. That is how we got to where we are today: once we make the obvious, imminently well-motivated assumption, then we are forced along a path in which we become convinced of the reality of the dark matter, not merely as a hypothetical convenience to cosmological calculations, but as an essential part of physical reality.

I think that the assumption that General Relativity is correct is indeed an excellent one. It has repeatedly passed many experimental and observational tests too numerous to elaborate here. However, I have come to doubt the assumption that it suffices to explain the universe. The only data that test it on scales where the missing mass problem arises is the data from which we infer the existence of dark matter. Which we do by assuming that General Relativity holds. The opportunity for circular reasoning is apparent – and frequently indulged.

It should not come as a shock that General Relativity might not be completely sufficient as a theory in all circumstances. This is exactly the motivation for and the working presumption of quantum theories of gravity. That nothing to do with cosmology will be affected along the road to quantum gravity is just another assumption.

I expect that some of my colleagues will struggle to wrap their heads around what I just wrote. I sure did. It was the hardest thing I ever did in science to accept that I might be wrong to be so sure it had to be dark matter – because I was sure it was. As sure of it as any of the folks who remain sure of it now. So imagine my shock when we obtained data that made no sense in terms of dark matter, but had been predicted in advance by a completely different theory, MOND.

When comparing dark matter and MOND, one must weigh all evidence in the balance. Much of the evidence is gratuitously ambiguous, so the conclusion to which one comes depends on how one weighs the more definitive lines of evidence. Some of this points very clearly to MOND, while other evidence prefers non-baryonic dark matter. One of the most important lines of evidence in favor of dark matter is the acoustic power spectrum of the cosmic microwave background (CMB) – the pattern of minute temperature fluctuations in the relic radiation field imprinted on the sky a few hundred thousand years after the Big Bang.

The equations that govern the acoustic power spectrum require General Relativity, but thankfully the small amplitude of the temperature variations permits them to be solved in the limit of linear perturbation theory. So posed, they can be written as a damped and driven oscillator. The power spectrum favors features corresponding to standing waves at the epoch of recombination when the universe transitioned rather abruptly from an opaque plasma to a transparent neutral gas. The edge of a cloud provides an analog: light inside the cloud scatters off the water molecules and doesn’t get very far: the cloud is opaque. Any light that makes it to the edge of the cloud meets no further resistance, and is free to travel to our eyes – which is how we perceive the edge of the cloud. The CMB is the expansion-redshifted edge of the plasma cloud of the early universe.

An easy way to think about a damped and a driven oscillator is a kid being pushed on a swing. The parent pushing the child is a driver of the oscillation. Any resistance – like the child dragging his feet – damps the oscillation. Normal matter (baryons) damps the oscillations – it acts as a net drag force on the photon fluid whose oscillations we observe. If there is nothing going on but General Relativity plus normal baryons, we should see a purely damped pattern of oscillations in which each peak is smaller than the one before it, as seen in the solid line here:

The CMB acoustic power spectrum predicted by General Relativity with no cold dark matter (line) and as observed by the Planck satellite (data points).

As one can see, the case of no Cold Dark Matter (CDM) does well to explain the amplitudes of the first two peaks. Indeed, it was the only hypothesis to successfully predict this aspect of the data in advance of its observation. The small amplitude of the second peak came as a great surprise from the perspective of LCDM. However, without CDM, there is only baryonic damping. Each peak should have a progressively lower amplitude. This is not observed. Instead, the third peak is almost the same amplitude as the second, and clearly higher than expected in the pure damping scenario of no-CDM.

CDM provides a net driving force in the oscillation equations. It acts like the parent pushing the kid. Even though the kid drags his feet, the parent keeps pushing, and the amplitude of the oscillation is maintained. For the third peak at any rate. The baryons are an intransigent child and keep dragging their feet; eventually they win and the power spectrum damps away on progressively finer angular scales (large 𝓁 in the plot).

As I wrote in this review, the excess amplitude of the third peak over the no-CDM prediction is the best evidence to my mind in favor of the existence of non-baryonic CDM. Indeed, this observation is routinely cited by many cosmologists to absolutely require dark matter. It is argued that the observed power spectrum is impossible without it. The corollary is that any problem the dark matter picture encounters is a mere puzzle. It cannot be an anomaly because the CMB tells us that CDM has to exist.

Impossible is a high standard. I hope the reader can see the flaw in this line of reasoning. It is the same as above. In order to compute the oscillation power spectrum, we have assumed General Relativity. While not replacing it, the persistent predictive successes of a theory like MOND implies the existence of a more general theory. We do not know that such a theory cannot explain the CMB until we develop said theory and work out its predictions.

That said, it is a tall order. One needs a theory that provides a significant driving term without a large amount of excess invisible mass. Something has to push the swing in a universe full of stuff that only drags its feet. That does seem nigh on impossible. Or so I thought until I heard a talk by Pedro Ferreira where he showed how the scalar field in TeVeS – the relativistic MONDian theory proposed by Bekenstein – might play the same role as CDM. However, he and his collaborators soon showed that the desired effect was indeed impossible, at least in TeVeS: one could not simultaneously fit the third peak and the data preceding the first. This was nevertheless an important theoretical development, as it showed how it was possible, at least in principle, to affect the peak ratios without massive amounts of non-baryonic CDM.

At this juncture, there are two options. One is to seek a theory that might work, and develop it to the point where it can be tested. This is a lot of hard work that is bound to lead one down many blind alleys without promise of ultimate success. The much easier option is to assume that it cannot be done. This is the option adopted by most cosmologists, who have spent the last 15 years arguing that the CMB power spectrum requires the existence of CDM. Some even seem to consider it to be a detection thereof, in which case we might wonder why we bother with all those expensive underground experiments to detect the stuff.

Rather fewer people have invested in the approach that requires hard work. There are a few brave souls who have tried it; these include Constantinos Skordis and Tom Złosnik. Very recently, the have shown a version of a relativistic MOND theory (which they call RelMOND) that does fit the CMB power spectrum. Here is the plot from their paper:


Note that black line in their plot is the fit of the LCDM model to the Planck power spectrum data. Their theory does the same thing, so it necessarily fits the data as well. Indeed, a good fit appears to follow for a range of parameters. This is important, because it implies that little or no fine-tuning is needed: this is just what happens. That is arguably better than the case for LCDM, in which the fit is very fine-tuned. Indeed, that was a large point of making the measurement, as it requires a very specific set of parameters in order to work. It also leads to tensions with independent measurements of the Hubble constant, the baryon density, and the amplitude of the matter power spectrum at low redshift.

As with any good science result, this one raises a host of questions. It will take time to explore these. But this in itself is a momentous result. Irrespective if RelMOND is the right theory or, like TeVeS, just a step on a longer path, it shows that the impossible is in fact possible. The argument that I have heard repeated by cosmologists ad nauseam like a rosary prayer, that dark matter is the only conceivable way to explain the CMB power spectrum, is simply WRONG.

The Hubble Constant from the Baryonic Tully-Fisher Relation

The Hubble Constant from the Baryonic Tully-Fisher Relation

The distance scale is fundamental to cosmology. How big is the universe? is pretty much the first question we ask when we look at the Big Picture.

The primary yardstick we use to describe the scale of the universe is Hubble’s constant: the H0 in

v = H0 D

that relates the recession velocity (redshift) of a galaxy to its distance. More generally, this is the current expansion rate of the universe. Pick up any book on cosmology and you will find a lengthy disquisition on the importance of this fundamental parameter that encapsulates the size, age, critical density, and potential fate of the cosmos. It is the first of the Big Two numbers in cosmology that expresses the still-amazing fact that the entire universe is expanding.

Quantifying the distance scale is hard. Throughout my career, I have avoided working on it. There are quite enough, er, personalities on the case already.


No need for me to add to the madness.

Not that I couldn’t. The Tully-Fisher relation has long been used as a distance indicator. It played an important role in breaking the stranglehold that H0 = 50 km/s/Mpc had on the minds of cosmologists, including myself. Tully & Fisher (1977) found that it was approximately 80 km/s/Mpc. Their method continues to provide strong constraints to this day: Kourkchi et al. find H0 = 76.0 ± 1.1(stat) ± 2.3(sys) km s-1 Mpc-1. So I’ve been happy to stay out of it.

Until now.


I am motivated in part by the calibration opportunity provided by gas rich galaxies, in part by the fact that tension in independent approaches to constrain the Hubble constant only seems to be getting worse, and in part by a recent conference experience. (Remember when we traveled?) Less than a year ago, I was at a cosmology conference in which I heard an all-too-typical talk that asserted that the Planck H0 = 67.4 ± 0.5 km/s/Mpc had to be correct and everybody who got something different was a stupid-head. I’ve seen this movie before. It is the same community (often the very same people) who once insisted that H0 had to be 50, dammit. They’re every bit as overconfident as before, suffering just as much from confirmation bias (LCDM! LCDM! LCDM!), and seem every bit as likely to be correct this time around.

So, is it true? We have the data, we’ve just refrained from using it in this particular way because other people were on the case. Let’s check.

The big hassle here is not measuring H0 so much as quantifying the uncertainties. That’s the part that’s really hard. So all credit goes to Jim Schombert, who rolled up his proverbial sleeves and did all the hard work. Federico Lelli and I mostly just played the mother-of-all-jerks referees (I’ve had plenty of role models) by asking about every annoying detail. To make a very long story short, none of the items under our control matter at a level we care about, each making < 1 km/s/Mpc difference to the final answer.

In principle, the Baryonic Tully-Fisher relation (BTFR) helps over the usual luminosity-based version by including the gas, which extends application of the relation to lower mass galaxies that can be quite gas rich. Ignoring this component results in a mess that can only be avoided by restricting attention to bright galaxies. But including it introduces an extra parameter. One has to adopt a stellar mass-to-light ratio to put the stars and the gas on the same footing. I always figured that would make things worse – and for a long time, it did. That is no longer the case. So long as we treat the calibration sample that defines the BTFR and the sample used to measure the Hubble constant self-consistently, plausible choices for the mass-to-light ratio return the same answer for H0. It’s all relative – the calibration changes with different choices, but the application to more distant galaxies changes in the same way. Same for the treatment of molecular gas and metallicity. It all comes out in the wash. Our relative distance scale is very precise. Putting an absolute number on it simply requires a lot of calibrating galaxies with accurate, independently measured distances.

Here is the absolute calibration of the BTFR that we obtain:

The Baryonic Tully-Fisher relation calibrated with 50 galaxies with direct distance determinations from either the Tip of the Red Giant Branch method (23) or Cepheids (27).

In constructing this calibrated BTFR, we have relied on distance measurements made or compiled by the Extragalactic Distance Database, which represents the cumulative efforts of Tully and many others to map out the local universe in great detail. We have also benefited from the work of Ponomareva et al, which provides new calibrator galaxies not already in our SPARC sample. Critically, they also measure the flat velocity from rotation curves, which is a huge improvement in accuracy over the more readily available linewidths commonly employed in Tully-Fisher work, but is expensive to obtain so remains the primary observational limitation on this procedure.

Still, we’re in pretty good shape. We now have 50 galaxies with well measured distances as well as the necessary ingredients to construct the BTFR: extended, resolved rotation curves, HI fluxes to measure the gas mass, and Spitzer near-IR data to estimate the stellar mass. This is a huge sample for which to have all of these data simultaneously. Measuring distances to individual galaxies remains challenging and time-consuming hard work that has been done by others. We are not about to second-guess their results, but we can note that they are sensible and remarkably consistent.

There are two primary methods by which the distances we use have been measured. One is Cepheids – the same type of variable stars that Hubble used to measure the distance to spiral nebulae to demonstrate their extragalactic nature. The other is the tip of the red giant branch (TRGB) method, which takes advantage of the brightest red giants having nearly the same luminosity. The sample is split nearly 50/50: there are 27 galaxies with a Cepheid distance measurement, and 23 with the TRGB. The two methods (different colored points in the figure) give the same calibration, within the errors, as do the two samples (circles vs. diamonds). There have been plenty of mistakes in the distance scale historically, so this consistency is important. There are many places where things could go wrong: differences between ourselves and Ponomareva, differences between Cepheids and the TRGB as distance indicators, mistakes in the application of either method to individual galaxies… so many opportunities to go wrong, and yet everything is consistent.

Having  followed the distance scale problem my entire career, I cannot express how deeply impressive it is that all these different measurements paint a consistent picture. This is a credit to a large community of astronomers who have worked diligently on this problem for what seems like aeons. There is a temptation to dismiss distance scale work as having been wrong in the past, so it can be again. Of course that is true, but it is also true that matters have improved considerably. Forty years ago, it was not surprising when a distance indicator turned out to be wrong, and distances changed by a factor of two. That stopped twenty years ago, thanks in large part to the Hubble Space Telescope, a key goal of which had been to nail down the distance scale. That mission seems largely to have been accomplished, with small differences persisting only at the level that one expects from experimental error. One cannot, for example, make a change to the Cepheid calibration without creating a tension with the TRGB data, or vice-versa: both have to change in concert by the same amount in the same direction. That is unlikely to the point of wishful thinking.

Having nailed down the absolute calibration of the BTFR for galaxies with well-measured distances, we can apply it to other galaxies for which we know the redshift but not the distance. There are nearly 100 suitable galaxies available in the SPARC database. Consistency between them and the calibrator galaxies requires

H0 = 75.1 +/- 2.3 (stat) +/- 1.5 (sys) km/s/Mpc.

This is consistent with the result for the standard luminosity-linewidth version of the Tully-Fisher relation reported by Kourkchi et al. Note also that our statistical (random/experimental) error is larger, but our systematic error is smaller. That’s because we have a much smaller number of galaxies. The method is, in principle, more precise (mostly because rotation curves are more accurate than linewidhts), so there is still a lot to be gained by collecting more data.

Our measurement is also consistent with many other “local” measurements of the distance scale,

hubbletension1but not with “global” measurements. See the nice discussion by Telescoper and the paper from which it comes. A Hubble constant in the 70s is the answer that we’ve consistently gotten for the past 20 years by a wide variety of distinct methods, including direct measurements that are not dependent on lower rungs of the distance ladder, like gravitational lensing and megamasers. These are repeatable experiments. In contrast, as I’ve pointed out before, it is the “global” CMB-fitted value of the Hubble parameter that has steadily diverged from the concordance region that originally established LCDM.

So, where does this leave us? In the past, it was easy to dismiss a tension of this sort as due to some systematic error, because that happened all the time – in the 20th century. That’s not so true anymore. It looks to me like the tension is real.


The halo mass function

The halo mass function

I haven’t written much here of late. This is mostly because I have been busy, but also because I have been actively refraining from venting about some of the sillier things being said in the scientific literature. I went into science to get away from the human proclivity for what is nowadays called “fake news,” but we scientists are human too, and are not immune from the same self-deception one sees so frequently exercised in other venues.

So let’s talk about something positive. Current grad student Pengfei Li recently published a paper on the halo mass function. What is that and why should we care?

One of the fundamental predictions of the current cosmological paradigm, ΛCDM, is that dark matter clumps into halos. Cosmological parameters are known with sufficient precision that we have a very good idea of how many of these halos there ought to be. Their number per unit volume as a function of mass (so many big halos, so many more small halos) is called the halo mass function.

An important test of the paradigm is thus to measure the halo mass function. Does the predicted number match the observed number? This is hard to do, since dark matter halos are invisible! So how do we go about it?

Galaxies are thought to form within dark matter halos. Indeed, that’s kinda the whole point of the ΛCDM galaxy formation paradigm. So by counting galaxies, we should be able to count dark matter halos. Counting galaxies was an obvious task long before we thought there was dark matter, so this should be straightforward: all one needs is the measured galaxy luminosity function – the number density of galaxies as a function of how bright they are, or equivalently, how many stars they are made of (their stellar mass). Unfortunately, this goes tragically wrong.

Galaxy stellar mass function and the predicted halo mass function
Fig. 5 from the review by Bullock & Boylan-Kolchin. The number density of objects is shown as a function of their mass. Colored points are galaxies. The solid line is the predicted number of dark matter halos. The dotted line is what one would expect for galaxies if all the normal matter associated with each dark matter halo turned into stars.

This figure shows a comparison of the observed stellar mass function of galaxies and the predicted halo mass function. It is from a recent review, but it illustrates a problem that goes back as long as I can remember. We extragalactic astronomers spent all of the ’90s obsessing over this problem. [I briefly thought that I had solved this problem, but I was wrong.] The observed luminosity function is nearly flat while the predicted halo mass function is steep. Consequently, there should be lots and lots of faint galaxies for every bright one, but instead there are relatively few. This discrepancy becomes progressively more severe to lower masses, with the predicted number of halos being off by a factor of many thousands for the faintest galaxies. The problem is most severe in the Local Group, where the faintest dwarf galaxies are known. Locally it is called the missing satellite problem, but this is just a special case of a more general problem that pervades the entire universe.

Indeed, the small number of low mass objects is just one part of the problem. There are also too few galaxies at large masses. Even where the observed and predicted numbers come closest, around the scale of the Milky Way, they still miss by a large factor (this being a log-log plot, even small offsets are substantial). If we had assigned “explain the observed galaxy luminosity function” as a homework problem and the students had returned as an answer a line that had the wrong shape at both ends and at no point intersected the data, we would flunk them. This is, in effect, what theorists have been doing for the past thirty years. Rather than entertain the obvious interpretation that the theory is wrong, they offer more elaborate interpretations.

Faced with the choice between changing one’s mind and proving that there is no need to do so, almost everybody gets busy on the proof.

J. K. Galbraith

Theorists persist because this is what CDM predicts, with or without Λ, and we need cold dark matter for independent reasons. If we are unwilling to contemplate that ΛCDM might be wrong, then we are obliged to pound the square peg into the round hole, and bend the halo mass function into the observed luminosity function. This transformation is believed to take place as a result of a variety of complex feedback effects, all of which are real and few of which are likely to have the physical effects that are required to solve this problem. That’s way beyond the scope of this post; all we need to know here is that this is the “physics” behind the transformation that leads to what is currently called Abundance Matching.

Abundance matching boils down to drawing horizontal lines in the above figure, thus matching galaxies with dark matter halos with equal number density (abundance). So, just reading off the graph, a galaxy of stellar mass M* = 108 M resides in a dark matter halo of 1011 M, one like the Milky Way with M* = 5 x 1010 M resides in a 1012 M halo, and a giant galaxy with M* = 1012 M is the “central” galaxy of a cluster of galaxies with a halo mass of several 1014 M. And so on. In effect, we abandon the obvious and long-held assumption that the mass in stars should be simply proportional to that in dark matter, and replace it with a rolling fudge factor that maps what we see to what we predict. The rolling fudge factor that follows from abundance matching is called the stellar mass–halo mass relation. Many of the discussions of feedback effects in the literature amount to a post hoc justification for this multiplication of forms of feedback.

This is a lengthy but insufficient introduction to a complicated subject. We wanted to get away from this, and test the halo mass function more directly. We do so by use of the velocity function rather than the stellar mass function.

The velocity function is the number density of galaxies as a function of how fast they rotate. It is less widely used than the luminosity function, because there is less data: one needs to measure the rotation speed, which is harder to obtain than the luminosity. Nevertheless, it has been done, as with this measurement from the HIPASS survey:

Galaxy velocity function
The number density of galaxies as a function of their rotation speed (Zwaan et al. 2010). The bottom panel shows the raw number of galaxies observed; the top panel shows the velocity function after correcting for the volume over which galaxies can be detected. Faint, slow rotators cannot be seen as far away as bright, fast rotators, so the latter are always over-represented in galaxy catalogs.

The idea here is that the flat rotation speed is the hallmark of a dark matter halo, providing a dynamical constraint on its mass. This should make for a cleaner measurement of the halo mass function. This turns out to be true, but it isn’t as clean as we’d like.

Those of you who are paying attention will note that the velocity function Martin Zwaan measured has the same basic morphology as the stellar mass function: approximately flat at low masses, with a steep cut off at high masses. This looks no more like the halo mass function than the galaxy luminosity function did. So how does this help?

To measure the velocity function, one has to use some readily obtained measure of the rotation speed like the line-width of the 21cm line. This, in itself, is not a very good measurement of the halo mass. So what Pengfei did was to fit dark matter halo models to galaxies of the SPARC sample for which we have good rotation curves. Thanks to the work of Federico Lelli, we also have an empirical relation between line-width and the flat rotation velocity. Together, these provide a connection between the line-width and halo mass:

Halo mass-line width relation
The relation Pengfei found between halo mass (M200) and line-width (W) for the NFW (ΛCDM standard) halo model fit to rotation curves from the SPARC galaxy sample.

Once we have the mass-line width relation, we can assign a halo mass to every galaxy in the HIPASS survey and recompute the distribution function. But now we have not the velocity function, but the halo mass function. We’ve skipped the conversion of light to stellar mass to total mass and used the dynamics to skip straight to the halo mass function:

Empirical halo mass function
The halo mass function. The points are the data; these are well fit by a Schechter function (black line; this is commonly used for the galaxy luminosity function). The red line is the prediction of ΛCDM for dark matter halos.

The observed mass function agrees with the predicted one! Test successful! Well, mostly. Let’s think through the various aspects here.

First, the normalization is about right. It does not have the offset seen in the first figure. As it should not – we’ve gone straight to the halo mass in this exercise, and not used the luminosity as an intermediary proxy. So that is a genuine success. It didn’t have to work out this well, and would not do so in a very different cosmology (like SCDM).

Second, it breaks down at high mass. The data shows the usual Schechter cut-off at high mass, while the predicted number of dark matter halos continues as an unabated power law. This might be OK if high mass dark matter halos contain little neutral hydrogen. If this is the case, they will be invisible to HIPASS, the 21cm survey on which this is based. One expects this, to a certain extent: the most massive galaxies tend to be gas-poor ellipticals. That helps, but only by shifting the turn-down to slightly higher mass. It is still there, so the discrepancy is not entirely cured. At some point, we’re talking about large dark matter halos that are groups or even rich clusters of galaxies, not individual galaxies. Still, those have HI in them, so it is not like they’re invisible. Worse, examining detailed simulations that include feedback effects, there do seem to be more predicted high-mass halos that should have been detected than actually are. This is a potential missing gas-rich galaxy problem at the high mass end where galaxies are easy to detect. However, the simulations currently available to us do not provide the information we need to clearly make this determination. They don’t look right, so far as we can tell, but it isn’t clear enough to make a definitive statement.

Finally, the faint-end slope is about right. That’s amazing. The problem we’ve struggled with for decades is that the observed slope is too flat. Here a steep slope just falls out. It agrees with the ΛCDM down to the lowest mass bin. If there is a missing satellite-type problem here, it is at lower masses than we probe.

That sounds great, and it is. But before we get too excited, I hope you noticed that the velocity function from the same survey is flat like the luminosity function. So why is the halo mass function steep?

When we fit rotation curves, we impose various priors. That’s statistics talk for a way of keeping parameters within reasonable bounds. For example, we have a pretty good idea of what the mass-to-light ratio of a stellar population should be. We can therefore impose as a prior that the fit return something within the bounds of reason.

One of the priors we imposed on the rotation curve fits was that they be consistent with the stellar mass-halo mass relation. Abundance matching is now part and parcel of ΛCDM, so it made sense to apply it as a prior. The total mass of a dark matter halo is an entirely notional quantity; rotation curves (and other tracers) pretty much never extend far enough to measure this. So abundance matching is great for imposing sense on a parameter that is otherwise ill-constrained. In this case, it means that what is driving the slope of the halo mass function is a prior that builds-in the right slope. That’s not wrong, but neither is it an independent test. So while the observationally constrained halo mass function is consistent with the predictions of ΛCDM; we have not corroborated the prediction with independent data. What we really need at low mass is some way to constrain the total mass of small galaxies out to much larger radii that currently available. That will keep us busy for some time to come.