Common ground

In order to agree on an interpretation, we first have to agree on the facts. Even when we agree on the facts, the available set of facts may admit multiple interpretations. This was an obvious and widely accepted truth early in my career*. Since then, the field has decayed into a haphazardly conceived set of unquestionable absolutes that are based on a large but well-curated subset of facts that gratuitously ignores any facts that are inconvenient.

Sadly, we seem to have entered a post-truth period in which facts are drowned out by propaganda. I went into science to get away from people who place faith before facts, and comfortable fictions ahead of uncomfortable truths. Unfortunately, a lot of those people seem to have followed me here. This manifests as people who quote what are essentially pro-dark matter talking points at me like I don’t understand LCDM, when all it really does is reveal that they are posers** who picked up on some common myths about the field without actually reading the relevant journal articles.

Indeed, a recent experience taught me a new psychology term: identity protective cognition. Identity protective cognition is the tendency for people in a group to selectively credit or dismiss evidence in patterns that reflect the beliefs that predominate in their group. When it comes to dark matter, the group happens to be a scientific one, but the psychology is the same: I’ve seen people twist themselves into logical knots to protect their belief in dark matter from being subject to critical examination. They do it without even recognizing that this is what they’re doing. I guess this is a human foible we cannot escape.

I’ve addressed these issues before, but here I’m going to start a series of posts on what I think some of the essential but underappreciated facts are. This is based on a talk that I gave at a conference on the philosophy of science in 2019, back when we had conferences, and published in Studies in History and Philosophy of Science. I paid the exorbitant open access fee (the journal changed its name – and publication policy – during the publication process), so you can read the whole thing all at once if you are eager. I’ve already written it to be accessible, so mostly I’m going to post it here in what I hope are digestible chunks, and may add further commentary if it seems appropriate.

Cosmic context

Cosmology is the science of the origin and evolution of the universe: the biggest of big pictures. The modern picture of the hot big bang is underpinned by three empirical pillars: an expanding universe (Hubble expansion), Big Bang Nucleosynthesis (BBN: the formation of the light elements through nuclear reactions in the early universe), and the relic radiation field (the Cosmic Microwave Background: CMB) (Harrison, 2000; Peebles, 1993). The discussion here will take this framework for granted.

The three empirical pillars fit beautifully with General Relativity (GR). Making the simplifying assumptions of homogeneity and isotropy, Einstein’s equations can be applied to treat the entire universe as a dynamical entity. As such, it is compelled either to expand or contract. Running the observed expansion backwards in time, one necessarily comes to a hot, dense, early phase. This naturally explains the CMB, which marks the transition from an opaque plasma to a transparent gas (Sunyaev and Zeldovich, 1980; Weiss, 1980). The abundances of the light elements can be explained in detail with BBN provided the universe expands in the first few minutes as predicted by GR when radiation dominates the mass-energy budget of the universe (Boesgaard & Steigman, 1985).

The marvelous consistency of these early universe results with the expectations of GR builds confidence that the hot big bang is the correct general picture for cosmology. It also builds overconfidence that GR is completely sufficient to describe the universe. Maintaining consistency with modern cosmological data is only possible with the addition of two auxiliary hypotheses: dark matter and dark energy. These invisible entities are an absolute requirement of the current version of the most-favored cosmological model, ΛCDM. The very name of this model is born of these dark materials: Λ is Einstein’s cosmological constant, of which ‘dark energy’ is a generalization, and CDM is cold dark matter.

Dark energy does not enter much into the subject of galaxy formation. It mainly helps to set the background cosmology in which galaxies form, and plays some role in the timing of structure formation. This discussion will not delve into such details, and I note only that it was surprising and profoundly disturbing that we had to reintroduce (e.g., Efstathiou et al., 1990; Ostriker and Steinhardt, 1995; Perlmutter et al., 1999; Riess et al., 1998; Yoshii and Peterson, 1995) Einstein’s so-called ‘greatest blunder.’

Dark matter, on the other hand, plays an intimate and essential role in galaxy formation. The term ‘dark matter’ is dangerously crude, as it can reasonably be used to mean anything that is not seen. In the cosmic context, there are at least two forms of unseen mass: normal matter that happens not to glow in a way that is easily seen — not all ordinary material need be associated with visible stars — and non-baryonic cold dark matter. It is the latter form of unseen mass that is thought to dominate the mass budget of the universe and play a critical role in galaxy formation.

Cold Dark Matter

Cold dark matter is some form of slow moving, non-relativistic (‘cold’) particulate mass that is not composed of normal matter (baryons). Baryons are the family of particles that include protons and neutrons. As such, they compose the bulk of the mass of normal matter, and it has become conventional to use this term to distinguish between normal, baryonic matter and the non-baryonic dark matter.

The distinction between baryonic and non-baryonic dark matter is no small thing. Non-baryonic dark matter must be a new particle that resides in a new ‘dark sector’ that is completely distinct from the usual stable of elementary particles. We do not just need some new particle, we need one (or many) that reside in some sector beyond the framework of the stubbornly successful Standard Model of particle physics. Whatever the solution to the mass discrepancy problem turns out to be, it requires new physics.

The cosmic dark matter must be non-baryonic for two basic reasons. First, the mass density of the universe measured gravitationally (Ωm ≈ 0.3, e.g., Faber and Gallagher, 1979; Davis et al., 1980, 1992) clearly exceeds the mass density in baryons as constrained by BBN (Ωb ≈ 0.05, e.g., Walker et al., 1991). There is something gravitating that is not ordinary matter: Ωm > Ωb.

The second reason follows from the absence of large fluctuations in the CMB (Peebles and Yu, 1970; Silk, 1968; Sunyaev and Zeldovich, 1980). The CMB is extraordinarily uniform in temperature across the sky, varying by only ~1 part in 10^5 (Smoot et al., 1992). These small temperature variations correspond to variations in density. Gravity is an attractive force; it will make the rich grow richer. Small density excesses will tend to attract more mass, making them larger, attracting more mass, and leading to the formation of large scale structures, including galaxies. But gravity is also a weak force: this process takes a long time. In the long but finite age of the universe, gravity plus known baryonic matter does not suffice to go from the initially smooth, highly uniform state of the early universe to the highly clumpy, structured state of the local universe (Peebles, 1993). The solution is to boost the process with an additional component of mass, the cold dark matter, that gravitates without interacting with the photons, thus getting a head start on the growth of structure while not aggravating the amplitude of temperature fluctuations in the CMB.
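The timing argument can be made roughly quantitative. The sketch below is a back-of-the-envelope estimate, not a calculation taken from the papers cited above. It assumes that small density contrasts grow in proportion to the cosmic scale factor (the linear-theory result for a matter-dominated universe) and that the CMB was emitted at a redshift of about 1100. Neither assumption is stated in the text; both are standard textbook values used here for illustration.

```python
# Back-of-the-envelope growth of structure with baryons alone.
# Assumptions (not from the text): linear growth proportional to the
# scale factor, and a CMB emission redshift of about 1100.

delta_cmb = 1e-5   # fractional fluctuation amplitude seen in the CMB
z_cmb = 1100.0     # assumed redshift at which the CMB was emitted

# The scale factor has grown by (1 + z) since then, so a linear
# perturbation can have grown by at most about that factor.
growth_factor = 1.0 + z_cmb
delta_today = delta_cmb * growth_factor

print(f"growth factor since the CMB: {growth_factor:.0f}")
print(f"density contrast today:      {delta_today:.3f}")
print("collapsed structures need a contrast of order 1 or more")
```

Under these assumptions the contrast today is of order one percent, well short of the order-unity contrast needed to make galaxies. The density contrast in the baryons is a few times larger than the temperature contrast, but that does not close the gap.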

Taken separately, one might argue away the need for dark matter. Taken together, these two distinct arguments convinced nearly everyone, including myself, of the absolute need for non-baryonic dark matter. Consequently, CDM became established as the leading paradigm during the 1980s (Peebles, 1984; Steigman and Turner, 1985). The paradigm has snowballed since that time, the common attitude among cosmologists being that CDM has to exist.

From an astronomical perspective, the CDM could be any slow-moving, massive object that does not interact with photons nor participate in BBN. The range of possibilities is at once limitless yet highly constrained. Neutrons would suffice if they were stable in vacuum, but they are not. Primordial black holes are a logical possibility, but if made of normal matter, they must somehow form in the first second after the Big Bang to not impair BBN. At this juncture, microlensing experiments have excluded most plausible mass ranges that primordial black holes could occupy (Mediavilla et al., 2017). It is easy to invent hypothetical dark matter candidates, but difficult for them to remain viable.

From a particle physics perspective, the favored candidate is a Weakly Interacting Massive Particle (WIMP: Peebles, 1984; Steigman and Turner, 1985). WIMPs are expected to be the lightest stable supersymmetric partner particle that resides in the hypothetical supersymmetric sector (Martin, 1998). The WIMP has been the odds-on favorite for so long that it is often used synonymously with the more generic term ‘dark matter.’ It is the hypothesized particle that launched a thousand experiments. Experimental searches for WIMPs have matured over the past several decades, making extraordinary progress in not detecting dark matter (Aprile et al., 2018). Virtually all of the parameter space in which WIMPs had been predicted to reside (Trotta et al., 2008) is now excluded. Worse, the existence of the supersymmetric sector itself, once seemingly a sure thing, remains entirely hypothetical, and appears at this juncture to be a beautiful idea that nature declined to implement.

In sum, we must have cold dark matter for both galaxies and cosmology, but we have as yet no clue to what it is.


* There is a trope that late in their careers, great scientists come to the opinion that everything worth discovering has been discovered, because they themselves already did everything worth doing. That is not a concern I have – I know we haven’t discovered all there is to discover. Yet I see no prospect for advancing our fundamental understanding simply because there aren’t enough of us pulling in the right direction. Most of the community is busy barking up the wrong tree, and refuses to be distracted from their focus on the invisible squirrel that isn’t there.

** Many of these people are the product of the toxic culture that Simon White warned us about. They wave the sausage of galaxy formation and feedback like a magic wand that excuses all faults while being proudly ignorant of how the sausage was made. Bitch, please. I was there when that sausage was made. I helped make the damn sausage. I know what went into it, and I recognize when it tastes wrong.

Galaxy models in compressed halos

The last post was basically an introduction to this one, which is about the recent work of Pengfei Li. In order to test a theory, we need to establish its prior. What do we expect?

The prior for fully formed galaxies after 13 billion years of accretion and evolution is not an easy problem. The dark matter halos need to form first, with the baryonic component assembling afterwards. We know from dark matter-only structure formation simulations that the initial condition (A) of the dark matter halo should resemble an NFW halo, and from observations that the end product of baryonic assembly needs to look like a real galaxy (Z). How the universe gets from A to Z is a whole alphabet of complications.

The simplest thing we can do is ignore B-Y and combine a model galaxy with a model dark matter halo. The simplest model for a spiral galaxy is an exponential disk. True to its name, the azimuthally averaged stellar surface density falls off exponentially from a central value over some scale length. This is a tolerable approximation of the stellar disks of spiral galaxies, ignoring their central bulges and their gas content. It is an inadequate yet surprisingly decent starting point for describing gravitationally bound collections of hundreds of billions of stars with just two parameters.
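As a minimal sketch of what "two parameters" means here, the functions below write out the exponential disk in terms of a central surface density and a scale length. The function names and the example numbers are hypothetical, chosen only for illustration. The enclosed-mass formula is the standard analytic integral of the exponential profile.

```python
import math

def surface_density(r, sigma0, r_d):
    """Azimuthally averaged surface density of an exponential disk."""
    return sigma0 * math.exp(-r / r_d)

def total_mass(sigma0, r_d):
    """Total disk mass: the profile integrated over the whole plane."""
    return 2.0 * math.pi * sigma0 * r_d ** 2

def enclosed_mass(r, sigma0, r_d):
    """Disk mass inside radius r (analytic integral of the profile)."""
    x = r / r_d
    return total_mass(sigma0, r_d) * (1.0 - (1.0 + x) * math.exp(-x))

# Hypothetical example values: 500 Msun/pc^2 and a 3 kpc scale length.
sigma0 = 500.0   # central surface density in Msun / pc^2
r_d = 3000.0     # scale length in pc

print(f"total mass: {total_mass(sigma0, r_d):.2e} Msun")
for n in (1, 2, 4):
    frac = enclosed_mass(n * r_d, sigma0, r_d) / total_mass(sigma0, r_d)
    print(f"fraction of mass inside {n} scale lengths: {frac:.2f}")
```

With the mass fixed, changing the scale length alone moves a model up or down in size at a given mass, and changes its surface density accordingly.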

So a basic galaxy model is an exponential disk in an NFW dark matter halo. This is the type of model I discussed in the last post, the kind I was considering two decades ago, and the kind of model still frequently considered. It is an obvious starting point. However, we know that this starting point is not adequate. On the baryonic side, we should model all the major mass components: bulge, disk, and gas. On the halo side, we need to understand how the initial halo depends on its assembly history and how it is modified by the formation of the luminous galaxy within it. The common approach to do all that is to run a giant cosmological simulation and watch what happens. That’s great, provided we know how to model all the essential physics. The action of gravity in an expanding universe we can compute well enough, but we do not enjoy the same ability to calculate the various non-gravitational effects of baryons.

Rather than blindly accept the outcome of simulations that have become so complicated that no one really seems to understand them, it helps to break the problem down into its basic steps. There is a lot going on, but what we’re concerned about here boils down to a tug of war between two competing effects: adiabatic compression tends to concentrate the dark matter, while feedback tends to redistribute it outwards.

Adiabatic compression refers to the response of the dark matter halo to infalling baryons. Though this name stuck, the process isn’t necessarily adiabatic, and the A-word tends to blind people to a generic and inevitable physical process. As baryons condense into the centers of dark matter halos, the gravitational potential is non-stationary. The distribution of dark matter has to respond to this redistribution of mass: the infall of dissipating baryons drags some dark matter in with them, so we expect dark matter halos to become more centrally concentrated. The most common approach to computing this effect is to assume the process is adiabatic (hence the name). This means a gentle settling that is gradual enough to be time-reversible: you can imagine running the movie backwards, unlike a sudden, violent event like a car crash. It needn’t be rigorously adiabatic, but the compressive response of the halo is inevitable. Indeed, forming a thin, dynamically cold, well-organized rotating disk in a preferred plane – i.e., a spiral galaxy – pretty much requires a period during which the adiabatic assumption is a decent approximation. There is a history of screwing up even this much, but Jerry Sellwood showed that it could be done correctly and that when one does so, it reproduces the results of more expensive numerical simulations. This provides a method to go beyond a simple exponential disk in an NFW halo: we can compute what happens to an NFW halo in response to an observed mass distribution.
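To make the idea concrete, here is a minimal sketch of the simplest textbook version of the calculation, in which dark matter shells move on circular orbits and the product of radius and enclosed mass is conserved. This is a swapped-in simplification, not the more careful method described above, and it is generally thought to overstate the compression somewhat. The function names and the halo and disk numbers are hypothetical, and the disk is treated as if it were spherical.

```python
import math

def nfw_mass(r, m_vir, r_vir, c):
    """Total mass inside radius r for an NFW halo (before compression)."""
    def m(x):
        return math.log(1.0 + x) - x / (1.0 + x)
    return m_vir * m(c * r / r_vir) / m(c)

def disk_mass(r, m_disk, r_d):
    """Mass of an exponential disk inside r, treated as if spherical."""
    x = r / r_d
    return m_disk * (1.0 - (1.0 + x) * math.exp(-x))

def compressed_radius(r_i, m_vir, r_vir, c, m_disk, r_d):
    """Radius r_f to which a dark matter shell starting at r_i settles.

    Solves r_f * [M_disk(<r_f) + M_dm(<r_i)] = r_i * M_i(<r_i) by bisection.
    """
    f_b = m_disk / m_vir              # baryons start out mixed with the halo
    m_i = nfw_mass(r_i, m_vir, r_vir, c)
    m_dm = (1.0 - f_b) * m_i          # dark mass carried along with the shell
    target = r_i * m_i

    lo, hi = 0.0, r_i / (1.0 - f_b)   # these two values bracket the root
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * (disk_mass(mid, m_disk, r_d) + m_dm) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical numbers for illustration only.
halo = dict(m_vir=1e12, r_vir=200.0, c=10.0)   # Msun, kpc, dimensionless
disk = dict(m_disk=5e10, r_d=3.0)              # Msun, kpc
for r_i in (5.0, 10.0, 20.0, 50.0):
    r_f = compressed_radius(r_i, **halo, **disk)
    print(f"shell at {r_i:5.1f} kpc settles to {r_f:5.1f} kpc")
```

Every shell inside the region where the disk is more concentrated than the original halo ends up at a smaller radius, which is the compressive response described above.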

After infall and compression, baryons form stars that produce energy in the form of radiation, stellar winds, and the blast waves of supernova explosions. These are sources of energy that complicate what until now has been a straightforward calculation of gravitational dynamics. With sufficient coupling to the surrounding gas, these energy sources might be converted into enough kinetic energy to alter the equilibrium mass distribution and the corresponding gravitational potential. I say might because we don’t really know how this works, and it is a lot more complicated than I’ve made it sound. So let’s not go there, and instead just calculate the part we do know how to calculate. What happens from the inevitable adiabatic compression in the limit of zero feedback?

We have calculated this for a grid of model galaxies that matches the observed distribution of real galaxies. This is important; it often happens that people do not explore a realistic parameter space. Here is a plot of size against stellar mass:

The size of galaxy disks as measured by the exponential scale length as a function of stellar mass. Grey points are real galaxies; red circles are model galaxies with parameters chosen to cover the same parameter space. This, and all plots, from Li et al. (2022).

Note that at a given stellar mass, there is a wide range of sizes. This is an essential aspect of galaxy properties; one has to explain size variations as well as the trend with mass. This obvious point has been frequently forgotten and rediscovered in the literature.

The two parameter plot above only suffices to approximate the stellar disks of spiral and irregular galaxies. Real galaxies have bulges and interstellar gas. We include these in our models so that they cover the same distribution as real galaxies in terms of bulge mass, size, and gas fraction. We then assign a dark matter halo to each model galaxy using an abundance matching relation (the stellar mass tells us the halo mass) and adopt the cosmologically appropriate halo mass-concentration relation. These specify the initial condition of the NFW halo in which each model galaxy is presumed to reside.
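Once a halo mass and a concentration have been assigned, the initial NFW halo is fully specified. The sketch below shows that step only. It takes the halo mass and concentration as inputs and does not reproduce any abundance matching or mass-concentration relation. The convention that the halo is bounded where its mean density is 200 times critical, the Hubble constant of 70 km/s/Mpc, and all function names are assumptions made for illustration.

```python
import math

G = 4.30091e-6    # gravitational constant in kpc (km/s)^2 / Msun
H0 = 70.0         # assumed Hubble constant in km/s/Mpc

def critical_density():
    """Critical density in Msun / kpc^3 for the assumed H0."""
    h0_per_kpc = H0 / 1000.0                     # km/s/kpc
    return 3.0 * h0_per_kpc ** 2 / (8.0 * math.pi * G)

def r200(m200):
    """Radius (kpc) inside which the mean density is 200 times critical."""
    rho = 200.0 * critical_density()
    return (3.0 * m200 / (4.0 * math.pi * rho)) ** (1.0 / 3.0)

def nfw_enclosed_mass(r, m200, c):
    """NFW mass (Msun) inside radius r (kpc) for halo mass m200 and concentration c."""
    def m(x):
        return math.log(1.0 + x) - x / (1.0 + x)
    r_s = r200(m200) / c                         # NFW scale radius
    return m200 * m(r / r_s) / m(c)

def nfw_circular_velocity(r, m200, c):
    """Circular speed (km/s) of the halo alone at radius r (kpc)."""
    return math.sqrt(G * nfw_enclosed_mass(r, m200, c) / r)

# Hypothetical halo: 10^12 Msun with concentration 10.
m200, c = 1e12, 10.0
print(f"r200 = {r200(m200):.0f} kpc")
for r in (2.0, 10.0, 50.0):
    print(f"V_halo({r:4.1f} kpc) = {nfw_circular_velocity(r, m200, c):.0f} km/s")
```

The halo acceleration at any radius follows as the square of the circular speed divided by the radius, which is the quantity that gets added to the baryonic acceleration in the plots that follow.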

At this point, it is worth remarking that there are a variety of abundance matching relations in the literature. Some of these give tragically bad predictions for the kinematics. I won’t delve into this here, but do want to note that in what follows, we have adopted the most favorable abundance matching relation, which turns out to be that of Kravtsov et al. (2018). Note that this means that we are already engaged in a kind of fine-tuning by cherry-picking the most favorable relation.

Before considering adiabatic compression, let’s see what happens if we simply add our model galaxies to NFW halos. This is the same exercise we did last time with exponential disks; now we’re including bulges and gas:

Galaxy models in the RAR plane. Models are color coded by their stellar surface density. The dotted line is 1:1 (Newton with no dark matter or other funny business). The black line is the fit to the observed RAR.

This looks pretty good, at least at a first glance. Most of the models fall nearly on top of each other. This isn’t entirely true, as the most massive models overpredict the RAR. This is a generic consequence of the bend in abundance matching relations. This bend is mildest in the Kravtsov relation, which is what makes it “best” here – other relations, like the commonly cited one of Behroozi, predict a lot more high-acceleration models. One sees only a hint of that here.

The scatter is respectably small, mostly solving the problem I initially encountered in the nineties. Despite predicting a narrow relation, the models do have a finite scatter that is a bit more than we observe. This isn’t too tragic, so maybe we can work with it. These models also miss the low acceleration end of the relation by a modest but appreciable amount. This seems more significant, as we found the same thing for pure exponential models: it is hard to make this part of the problem go away.

Including bulges in the models extends them to high accelerations. This would seem to explain a region of the RAR that pure exponential models do not address. Bulges are high surface density, star dominated regions, so they fall on the 1:1 part of the RAR at high accelerations.

And then there are the hooks. These are obvious in the plot above. They occur in low and intermediate mass galaxies that lack a significant bulge component. A pure exponential disk has a peak acceleration at finite radius, but an NFW halo has its peak at zero radius. So if you imagine following a given model line inwards in radius, it goes up in acceleration until it reaches the maximum for the disk along the x-axis. The baryonic component of the acceleration then starts to decline while that due to the NFW halo continues to rise. The model doubles back to lower baryonic acceleration while continuing to higher total acceleration, making the little hook shape. This deviation from the RAR is not commonly observed; indeed, these hooks are the signature of the cusp-core problem in the RAR plane.

Results so far are mixed. With the “right” choice of abundance matching relation, we are well ahead of where we were at the turn of the century, but some real problems remain. We have yet to compute the necessary adiabatic contraction, so hopefully doing that right will result in further improvement. So let’s make a rigorous calculation of the compression that would result from forming a galaxy of the stipulated parameters.

Galaxy models in the RAR plane after compression.

Adiabatic compression makes things worse. There is a tiny improvement at low accelerations, but the most pronounced effects are at small radii where accelerations are large. Compression makes cuspy halos cuspier, making the hooks more pronounced. Worse, the strong concentration of starlight that is a bulge inevitably leads to strong compression. These models don’t approach the 1:1 line at high acceleration, and never can: higher acceleration means higher stellar surface density means greater compression. One cannot start from an NFW halo and ever reach a state of baryon domination; too much dark matter is always in the mix.

It helps to look at the residual diagram. The RAR is a log-log plot over a large dynamic range; this can hide small but significant deviations. For some reason, people who claim to explain the RAR with dark matter models never seem to show these residuals.

As above, with the observed RAR divided out. Model galaxies are mostly above the RAR. The cusp-core problem is exacerbated in disks, and bulges never reach the 1:1 line at high accelerations.
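For readers who want to compute such residuals themselves, here is a minimal sketch. It assumes the commonly used fitting function for the observed RAR from McGaugh, Lelli, & Schombert (2016), with an acceleration scale of 1.2e-10 m/s^2. That this is exactly the black line in the plots is an assumption, and the model point at the end is hypothetical.

```python
import math

G_DAGGER = 1.2e-10   # m/s^2, acceleration scale of the assumed fitting function

def rar_fit(g_bar):
    """Observed acceleration expected from the fit at baryonic acceleration g_bar."""
    x = math.sqrt(g_bar / G_DAGGER)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is small.
    return g_bar / (-math.expm1(-x))

def residual_dex(g_bar, g_model):
    """Logarithmic offset of a model point from the fit, in dex."""
    return math.log10(g_model / rar_fit(g_bar))

# Limiting behavior: 1:1 at high acceleration, a growing discrepancy at low.
for g_bar in (1e-8, 1e-10, 1e-12):
    print(f"g_bar = {g_bar:.0e}  ->  g_obs/g_bar = {rar_fit(g_bar) / g_bar:.2f}")

# Hypothetical model point sitting above the relation.
print(f"residual: {residual_dex(1e-11, 6e-11):+.2f} dex")
```

Dividing out the fit in this way turns a relation spanning several decades into a flat line at zero, so that offsets of a tenth of a dex become visible.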

The models built to date don’t have the right shape to explain the RAR, at least when examined closely. Still, I’m pleased: what we’ve done here comes closer than all my many previous efforts, and most of the other efforts that are out there. Still, I wouldn’t claim it as a success. Indeed, the inevitable compressive effects that occur at high surface densities mean that we can’t invoke simple offsets to accommodate the data: if a model gets the shape of the RAR right but the normalization wrong, it doesn’t work to simply shift it over.

So, where does that leave us? Up the proverbial creek? Perhaps. We have yet to consider feedback, which is too complicated to delve into here. Instead, note that while we haven’t engaged in any specific fine-tuning, we have already engaged in some cherry picking. First, we’ve abandoned the natural proportionality between halo and disk mass, replacing it with abundance matching. This is no small step, as it converts a single-valued parameter of our theory to a rolling function of mass. Abundance matching has become familiar enough that people seem to have been lulled into thinking this is natural. There is nothing natural about it. Regardless of how much fancy jargon we use to justify it, it’s still basically a rolling fudge factor – the scientific equivalent of a lipstick-smothered pig.

Abundance matching does, at least, use data that are independent of the kinematics to set the relation between stellar and halo mass, and it does go in the right direction for the RAR. This only gets us into the right ballpark, and only if we cherry-pick the particular abundance matching relation that we use. So we’re well down the path of tuning whether we realize it or not. Invoking feedback is simply another step along this path.

Feedback is usually invoked in the kinematic context to convert cusps into cores. That could help with the hooks. This kind of feedback is widely thought to affect low and intermediate mass galaxies, or galaxies of a particular stellar to halo mass ratio. Opinions vary a bit, but it is generally not thought to have such a strong effect on massive galaxies. And yet, we find that we need some (second?) kind of feedback for them, as we need to move bulges back onto the 1:1 line in the RAR plane. That’s perhaps related to the cusp-core problem, but it’s also different. Getting bulges right requires a fine-tuned amount of feedback to exactly cancel out the effects of compression. A third distinct place where the models need some help is at low accelerations. This is far from the region where feedback is thought to have much effect at all.

I could go on, and perhaps will in a future post. Point is, we’ve been tuning our feedback prescriptions to match observed facts about galaxies, not computing how we think it really works. We don’t know how to do the latter, and there is no guarantee that our approximations do justice to reality. So on the one hand, I don’t doubt that with enough tinkering this process can be made to work in a model. On the other hand, I do question whether this is how the universe really works.

What should we expect for the radial acceleration relation?

In the previous post, I related some of the history of the Radial Acceleration Relation (henceforth RAR). Here I’ll discuss some of my efforts to understand it. I’ve spent more time trying to do this in terms of dark matter than pretty much anything else, but I have not published most of those efforts. As I related briefly in this review, that’s because most of the models I’ve considered are obviously wrong. That I have refrained from publishing explanations of the RAR that are manifestly incorrect has not precluded others from doing so.

A theory is only as good as its prior. If a theory makes a clear prediction, preferably ahead of time, then we can test it. If it has not done so ahead of time, that’s still OK, if we can work out what it would have predicted without being guided by the data. A good historical example of this is the explanation of the excess perihelion precession of Mercury provided by General Relativity. The anomaly had been known for decades, but the right answer falls out of the theory without input from the data. A more recent example is our prediction of the velocity dispersions of the dwarf satellites of Andromeda. Some cases were genuine a priori predictions, but even in the cases that weren’t, the prediction is what it is irrespective of the measurement.

Dark matter-based explanations of the RAR do not fall in either category. They have always chased the data and been informed by it. This has been going on for so long that new practitioners have entered the field unaware of the extent to which the simulations they inherited had already been informed by the data. They legitimately seem to think that there has been no fine-tuning of the models because they weren’t personally present for every turn of the knob.

So let’s set the way-back machine. I became concerned about fine-tuning problems in the context of galaxy dynamics when I was trying to explain the Tully-Fisher relation of low surface brightness galaxies in the mid-1990s. This was before I was more than dimly aware that MOND existed, much less took it seriously. Many of us were making earnest efforts to build proper galaxy formation theories at the time (e.g., Mo, McGaugh, & Bothun 1994; Dalcanton, Spergel, & Summers 1997; Mo, Mao, & White 1998 [MMW]; McGaugh & de Blok 1998), though of course these were themselves informed by observations to date. My own paper had started as an effort to exploit the new things we had discovered about low surface brightness galaxies to broaden our conventional theory of galaxy formation, but over the course of several years, turned into a falsification of some of the ideas I had considered in my 1992 thesis. Dalcanton’s model evolved from one that predicted a shift in Tully-Fisher (as mine had) to one that did not (after the data said no). It may never be possible to completely separate theoretical prediction from concurrent data, but it is possible to ask what a theory plausibly predicts. What is the LCDM prior for the RAR?

In order to do this, we need to predict both the baryon distribution (gbar) and that of the dark matter (gobs-gbar). Unfortunately, nobody seems to really agree on what LCDM predicts for galaxies. There seems to be a general consensus that dark matter halos should start out with the NFW form, but opinions vary widely about whether and how this is modified during galaxy formation. The baryonic side of the issue is simply seen as a problem.

That there is no clear prediction is in itself a problem. I distinctly remember expressing my concerns to Martin Rees while I was still a postdoc. He said not to worry; galaxies were such non-linear entities that we shouldn’t be surprised by anything they do. This verbal invocation of a blanket dodge for any conceivable observation did not inspire confidence. Since then, I’ve heard that excuse repeated by others. I have lost count of the number of more serious, genuine, yet completely distinct LCDM predictions I have seen, heard, or made myself. Many dozens, at a minimum; perhaps hundreds at this point. Some seem like they might work but don’t while others don’t even cross the threshold of predicting both axes of the RAR. There is no coherent picture that adds up to an agreed set of falsifiable predictions. Individual models can be excluded, but not the underlying theory.

To give one example, let’s consider the specific model of MMW. I make this choice here for two reasons. One, it is a credible effort by serious workers and has become a touchstone in the field, to the point that a sizeable plurality of practitioners might recognize it as a plausible prior – i.e., the closest thing we can hope to get to a legitimate, testable prior. Two, I recently came across one of my many unpublished attempts to explain the RAR which happens to make use of it. Unix says that the last time I touched these files was nearly 22 years ago, in 2000. The postscript generated then is illegible now, so I have to update the plot:

The prediction of MMW (lines) compared to data (points). Each colored line represents a model galaxy of a given mass. Different lines of the same color represent models with different disk scale lengths, as galaxies of the same mass exist over a range of sizes. Models are only depicted over the range of radii typically observed in real galaxies.

At first glance, this might look OK. The trend is at least in the right direction. This is not a success so much as it is an inevitable consequence of the fact that the observed acceleration includes the contribution of the baryons. The area below the dashed line is excluded, as it is impossible to have gobs < gbar. Moreover, since gobs = gbar+gDM, some correlation in this plane is inevitable. Quite a lot, if baryons dominate, as they always seem to do at high accelerations. Not that these models explain the high acceleration part of the RAR, but I’ll leave that detail for later. For now, note that this is a log-log plot. That the models miss the data a little to the eye translates to a large quantitative error. Individual model galaxies sometimes fall too high, sometimes too low: the model predicts considerably more scatter than is observed. The RAR is not predicted to be a narrow relation, but one with lots of scatter with large intrinsic deviations from the mean. That’s the natural prediction of MMW-type models.
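Two points in this paragraph are easy to check numerically: how large a modest-looking offset on a log-log plot really is, and why the sum of baryonic and dark contributions forces some correlation. The sketch below uses hypothetical numbers throughout, including an arbitrary fixed dark matter acceleration that is not taken from any model.

```python
import math

def dex_to_factor(offset_dex):
    """Multiplicative error corresponding to an offset on a log axis."""
    return 10.0 ** offset_dex

for offset in (0.1, 0.2, 0.3):
    print(f"{offset:.1f} dex is a factor of {dex_to_factor(offset):.2f}")

# Hold the dark matter acceleration fixed (arbitrary value in m/s^2) and
# vary only the baryons: g_obs still tracks g_bar once baryons dominate.
g_dm = 3e-11
g_bar_values = [1e-12, 1e-11, 1e-10, 1e-9]
g_obs_values = [g_bar + g_dm for g_bar in g_bar_values]

for g_bar, g_obs in zip(g_bar_values, g_obs_values):
    print(f"log g_bar = {math.log10(g_bar):6.2f}   log g_obs = {math.log10(g_obs):6.2f}")
```

An offset that looks small by eye is a quantitative miss of tens of percent or more, and the points can never fall below the 1:1 line no matter what the halo does.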

I have explored many flavors of [L]CDM models. They generically predict more scatter in the RAR than is observed. This is the natural expectation, and some fine-tuning has to be done to reduce the scatter to the observed level. The inevitable need for fine-tuning is why I became concerned for the dark matter paradigm, even before I became aware that MOND predicted exactly this. It is also why the observed RAR was considered to be against orthodoxy at the time: everybody’s prior was for a large scatter. It wasn’t just me.

In order to build a model, one has to make some assumptions. The obvious assumption to make, at the time, was a constant ratio of dark matter to baryons. Indeed, for many years, the working assumption was that this was about 10:1, maybe 20:1. This type of assumption is built into the models of MMW, who thought that they worked provided “(i) the masses of disks are a few percent of those of their haloes”. The (i) is there because it is literally their first point, and the assumption that everybody made. We were terrified of dropping this natural assumption, as the obvious danger is that it becomes a rolling fudge factor, assuming any value that is convenient for explaining any given observation.

Unfortunately, it had already become clear by this time from the data that a constant ratio of dark to luminous matter could not work. The earliest I said this on the record is 1996. [That was before LCDM had supplanted SCDM as the most favored cosmology. From that perspective, the low baryon fractions of galaxies seemed natural; it was clusters of galaxies that were weird.] I pointed out the likely failure of (i) to Mo when I first saw a draft of MMW (we had been office mates in Cambridge). I’ve written various papers about it since. The point here is that, from the perspective of the kinematic data, the ratio of dark to luminous mass has to vary. It cannot be a constant as we had all assumed. But it has to vary in a way that doesn’t introduce scatter into relations like the RAR or the Baryonic Tully-Fisher relation, so we have to fine-tune this rolling fudge factor so that it varies with mass but always obtains the same value at the same mass.

A constant ratio of dark to luminous mass wasn’t just a convenient assumption. There is good physical reason to expect that this should be the case. The baryons in galaxies have to cool and dissipate to form a galaxy in the center of a dark matter halo. This takes time, imposing an upper limit on galaxy mass. But the baryons in small galaxies have ample time to cool and condense, so one naively expects that they should all do so. That would have been natural. It would also lead to a steeply increasing luminosity function, which is not observed, leading to the over-cooling and missing satellite problems.

Reconciling the observed and predicted mass functions is one of the reasons we invoke feedback. The energy produced by the stars that form in the first gas to condense is an energy source that feeds back into the surrounding gas. This can, in principle, reheat the remaining gas or expel it entirely, thereby precluding it from condensing and forming more stars as in the naive expectation. In principle. In practice, we don’t know how this works, or even if the energy provided by star formation couples to the surrounding gas in a way that does what we need it to do. Simulations do not have the resolution to follow feedback in detail, so instead make some assumptions (“subgrid physics”) about how this might happen, and tune the assumed prescription to fit some aspect of the data. Once this is done, it is possible to make legitimate predictions about other aspects of the data, provided they are unrelated. But we still don’t know if that’s how feedback works, and in no way is it natural. Rather, it is a deus ex machina that we invoke to save us from a glaring problem without really knowing how it works or even if it does. This is basically just theoretical hand-waving in the computational age.

People have been invoking feedback as a panacea for all ills in galaxy formation theory for so long that it has become familiar. Once something becomes familiar, everybody knows it. Since everybody knows that feedback has to play some role, it starts to seem like it was always expected. This is easily confused with being natural.

I could rant about the difficulty of making predictions with feedback afflicted models, but never mind the details. Let’s find some aspect of the data that is independent of the kinematics that we can use to specify the dark to luminous mass ratio. The most obvious candidate is abundance matching, in which the number density of observed galaxies is matched to the predicted number density of dark matter halos. We don’t have to believe feedback-based explanations to apply this, we merely have to accept that there is some mechanism to make the dark to luminous mass ratio variable. Whatever it is that makes this happen had better predict the right thing for both the mass function and the kinematics.

When it comes to the RAR, the application of abundance matching to assign halo masses to observed galaxies works out much better than the natural assumption of a constant ratio. This was first pointed out by Di Cintio & Lelli (2016), which inspired me to consider appropriately modified models. All I had to do was update the relation between stellar and halo mass from a constant ratio to a variable specified by abundance matching. This gives rather better results:

A model like that from 2000 but updated by assigning halo masses using an abundance matching relation.

This looks considerably better! The predicted scatter is much lower. How is this accomplished?

Abundance matching results in a non-linear relation between stellar mass and halo mass. For the RAR, the scatter is reduced by narrowing the dynamic range of halo masses relative to the observed stellar masses. There is less variation in gDM. Empirically, this is what needs to happen – to a crude first approximation, the data are roughly consistent with all galaxies living in the same halo – i.e., no variation in halo mass with stellar mass. This was already known before abundance matching became rife; both the kinematic data and the mass function push us in this direction. There’s nothing natural about any of this; it’s just what we need to do to accommodate the data.

Still, it is tempting to say that we’ve succeeded in explaining the RAR. Indeed, some people have built the same kind of models to claim exactly this. While matters are clearly better, really we’re just less far off. By reducing the dynamic range in halo masses that are occupied by galaxies, the partial contribution of gDM to the gobs axis is compressed, and model lines perforce fall closer together. There’s less to distinguish an L* galaxy from a dwarf galaxy in this plane.

Nevertheless, there’s still too much scatter in the models. Harry Desmond made a specific study of this, finding that abundance matching “significantly overpredicts the scatter in the relation and its normalisation at low acceleration”, which is exactly what I’ve been saying. The offset in the normalization at low acceleration is obvious from inspection in the figure above: the models overshoot the low acceleration data. This led Navarro et al. to argue that there was a second acceleration scale, “an effective minimum acceleration probed by kinematic tracers in isolated galaxies” a little above 10^-11 m/s/s. The models do indeed do this, over a modest range in gbar, and there is some evidence for it in some data. This does not persist in the more reliable data; those shown above are dominated by atomic gas so there isn’t even the systematic uncertainty of the stellar mass-to-light ratio to save us.

The astute observer will notice some pink model lines that fall well above the RAR in the plot above. These are for the most massive galaxies, those with luminosities in excess of L*. Below the knee in the Schechter function, there is a small range of halo masses for a given range of stellar masses. Above the knee, this situation is reversed. Consequently, the nonlinearity of abundance matching works against us instead of for us, and the scatter explodes. One can suppress this with an apt choice of abundance matching relation, but we shouldn’t get to pick and choose which relation we use. It can be made to work only because there remains enough uncertainty in abundance matching to select the “right” one. There is nothing natural about any of this.
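To make the compression below the knee and the blow-up above it concrete, here is a minimal Python sketch. It assumes a double power-law stellar-to-halo mass relation; the functional form, the function names, and all four parameter values (LOG_M1, NORM, BETA, GAMMA) are illustrative assumptions, not the abundance matching relation used for the figure above.

```python
import math

# Hypothetical double power-law stellar-to-halo mass relation.
# All four values are illustrative assumptions, not taken from the post.
LOG_M1 = 11.6   # log10 of the halo mass at the knee (solar masses)
NORM = 0.035    # stellar-to-halo mass ratio at the knee
BETA = 1.4      # controls the slope below the knee
GAMMA = 0.6     # controls the slope above the knee


def stellar_mass(halo_mass):
    """Stellar mass (solar masses) assigned to a halo of the given mass."""
    x = halo_mass / 10.0 ** LOG_M1
    return halo_mass * 2.0 * NORM / (x ** -BETA + x ** GAMMA)


def halo_mass(mstar, lo=1e8, hi=1e18):
    """Invert stellar_mass() by bisection in log space (the relation is monotonic)."""
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    for _ in range(200):
        mid = 0.5 * (log_lo + log_hi)
        if stellar_mass(10.0 ** mid) < mstar:
            log_lo = mid
        else:
            log_hi = mid
    return 10.0 ** (0.5 * (log_lo + log_hi))


def halo_range_dex(mstar_low, mstar_high):
    """Range of halo mass (in dex) spanned by a given range of stellar mass."""
    return math.log10(halo_mass(mstar_high) / halo_mass(mstar_low))


# One decade in stellar mass on either side of the knee.
below_knee = halo_range_dex(1e8, 1e9)
above_knee = halo_range_dex(1e11, 1e12)
print(f"below the knee: {below_knee:.2f} dex in halo mass per dex in stellar mass")
print(f"above the knee: {above_knee:.2f} dex in halo mass per dex in stellar mass")
```

With these assumed numbers, a decade in stellar mass below the knee should map onto well under a decade in halo mass, while a decade above the knee should map onto considerably more than a decade. That asymmetry is the sense in which the nonlinearity first helps and then hurts.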

There are also these little hooks, the kinks at the high acceleration end of the models. I’ve mostly suppressed them here (as did Navarro et al.) but they’re there in the models if one plots to small enough radii. This is the signature of the cusp-core problem in the RAR plane. The hooks occur because the exponential disk model has a maximum acceleration at a finite radius that is a little under one scale length; this marks the maximum value that such a model can reach in gbar. In contrast, the acceleration gDM of an NFW halo continues to increase all the way to zero radius. Consequently, the predicted gobs continues to increase even after gbar has peaked and starts to decline again. This leads to little hook-shaped loops at the high acceleration end of the models in the RAR plane.
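The origin of the hooks can be checked numerically. The sketch below assumes a razor-thin exponential disk (the Freeman 1970 solution) and an NFW halo, each in its own arbitrary units. The helper names and the series evaluation of the Bessel functions are choices made for illustration; this is not the code behind the models plotted above.

```python
import math

EULER_GAMMA = 0.5772156649015329


def bessel_i0_i1_k0_k1(x, terms=40):
    """Modified Bessel functions I0, I1, K0, K1 from power series (adequate for small x)."""
    q = x * x / 4.0
    i0 = i1 = k0_sum = 0.0
    term = 1.0       # q**k / (k!)**2
    harmonic = 0.0   # H_k = 1 + 1/2 + ... + 1/k
    for k in range(terms):
        if k > 0:
            term *= q / (k * k)
            harmonic += 1.0 / k
        i0 += term
        i1 += term / (k + 1)   # q**k / (k! (k+1)!)
        k0_sum += term * harmonic
    i1 *= x / 2.0
    k0 = -(math.log(x / 2.0) + EULER_GAMMA) * i0 + k0_sum
    k1 = (1.0 / x - i1 * k0) / i0   # Wronskian: I0*K1 + I1*K0 = 1/x
    return i0, i1, k0, k1


def disk_acceleration(r_over_rd):
    """g_bar of a razor-thin exponential disk, in units of 2*pi*G*Sigma0."""
    y = r_over_rd / 2.0
    i0, i1, k0, k1 = bessel_i0_i1_k0_k1(y)
    return y * (i0 * k0 - i1 * k1)


def nfw_acceleration(r_over_rs):
    """g_DM of an NFW halo, in units of 4*pi*G*rho_s*r_s."""
    x = r_over_rs
    return (math.log1p(x) - x / (1.0 + x)) / (x * x)


# Locate the radius (in disk scale lengths) where the disk acceleration peaks.
radii = [0.05 + 0.01 * n for n in range(296)]   # 0.05 to 3.0 scale lengths
r_peak = max(radii, key=disk_acceleration)
print(f"disk acceleration peaks near R = {r_peak:.2f} scale lengths")

# The halo acceleration keeps rising toward the center.
for x in (1.0, 0.1, 0.01):
    print(f"NFW g at r/r_s = {x}: {nfw_acceleration(x):.3f}")
```

Inside the peak radius gbar declines toward the center while gDM is still rising, so a model track in the RAR plane turns back toward lower gbar as gobs continues upward. That is the hook.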

These hooks were going to be the segue to discuss more sophisticated models built by Pengfei Li, but that’s going to be a whole ‘nother post because these are quite enough words for now. So, until next time, don’t invest in bitcoins, Russian oil, or LCDM models that claim to explain the RAR.

The curious case of AGC 114905: an isolated galaxy devoid of dark matter?

It’s early in the new year, so what better time to violate my own resolutions? I prefer to be forward-looking and not argue over petty details, or chase wayward butterflies. But sometimes the devil is in the details, and the occasional butterfly can be entertaining if distracting. Today’s butterfly is the galaxy AGC 114905, which has recently been in the news.

There are a couple of bandwagons here: one to rebrand very low surface brightness galaxies as ultradiffuse, and another to get overly excited when these types of galaxies appear to lack dark matter. The nomenclature is terrible, but that’s normal for astronomy so I would overlook it, except that in this case it gives the impression that there is some new population of galaxies behaving in an unexpected fashion, when instead it looks to me like the opposite is the case. The extent to which there are galaxies lacking dark matter is fundamental to our interpretation of the acceleration discrepancy (aka the missing mass problem), so bears closer scrutiny. The evidence for galaxies devoid of dark matter is considerably weaker than the current bandwagon portrays.

If it were just one butterfly (e.g., NGC 1052-DF2), I wouldn’t bother. Indeed, it was that specific case that made me resolve to ignore such distractions as a waste of time. I’ve seen this movie literally hundreds of times, I know how it goes:

  • Observations of this one galaxy falsify MOND!
  • Hmm, doing the calculation right, that’s what MOND predicts.
  • OK, but better data shrink the error bars and now MOND falsified.
  • Are you sure about…?
  • Yes. We like this answer, let’s stop thinking about it now.
  • As the data continue to improve, it approaches what MOND predicts.
  • <crickets>

Over and over again. DF44 is another example that has followed this trajectory, and there are many others. This common story is not widely known – people lose interest once they get the answer they want. Irrespective of whether we can explain this weird case or that, there is a deeper story here about data analysis and interpretation that seems not to be widely appreciated.

My own experience inevitably colors my attitude about this, as it does for us all, so let’s start thirty years ago when I was writing a dissertation on low surface brightness (LSB) galaxies. I did many things in my thesis, most of them well. One of the things I tried to do then was derive rotation curves for some LSB galaxies. This was not the main point of the thesis, and arose almost as an afterthought. It was also not successful, and I did not publish the results because I didn’t believe them. It wasn’t until a few years later, with improved data, analysis software, and the concerted efforts of Erwin de Blok, that we started to get a handle on things.

The thing that really bugged me at the time was not the Doppler measurements, but the inclinations. One has to correct the observed velocities by the inclination of the disk, 1/sin(i). The inclination can be constrained by the shape of the image and by the variation of velocities across the face of the disk. LSB galaxies presented raggedy images and messy velocity fields. I found it nigh on impossible to constrain their inclinations at the time, and it remains a frequent struggle to this day.

Here is an example of the LSB galaxy F577-V1 that I find lurking around on disk from all those years ago:

The LSB galaxy F577-V1 (B-band image, left) and the run of the eccentricity of ellipses fit to the atomic gas data (right).

A uniform disk projected on the sky at some inclination will have a fixed corresponding eccentricity, with zero being the limit of a circular disk seen perfectly face-on (i = 0). Do you see a constant value of the eccentricity in the graph above? If you say yes, go get your eyes checked.

What we see in this case is a big transition from a fairly eccentric disk to one that is more nearly face on. The inclination doesn’t have a sudden warp; the problem is that the assumption of a uniform disk is invalid. This galaxy has a bar – a quasi-linear feature, common in many spiral galaxies, that is supported by non-circular orbits. Even face-on, the bar will look elongated simply because it is. Indeed, the sudden change in eccentricity is one way to define the end of the bar, which the human eye-brain can do easily by looking at the image. So in a case like this, one might adopt the inclination from the outer points, and that might even be correct. But note that there are spiral arms along the outer edge that are visible to the eye, so it isn’t clear that even these isophotes are representative of the shape of the underlying disk. Worse, we don’t know what happens beyond the edge of the data; the shape might settle down at some other level that we can’t see.

This was so frustrating, I swore never to have anything to do with galaxy kinematics ever again. Over 50 papers on the subject later, all I can say is D’oh! Repeatedly.

Bars are rare in LSB galaxies, but it struck me as odd that we saw any at all. We discovered unexpectedly that they were dark matter dominated – the inferred dark halo outweighs the disk, even within the edge defined by the stars – but that meant that the disks should be stable against the formation of bars. My colleague Chris Mihos agreed, and decided to look into it. The answer was yes, LSB galaxies should be stable against bar formation, at least internally generated bars. Sometimes bars are driven by external perturbations, so we decided to simulate the close passage of a galaxy of similar mass – basically, whack it real hard and see what happens:

Simulation of an LSB galaxy during a strong tidal encounter with another galaxy. Closest approach is at t=24 in simulation units (between the first and second box). A linear bar does not form, but the model galaxy does suffer a strong and persistent oval distortion: all these images are shown face-on (i=0). From Mihos et al (1997).

This was a conventional simulation, with a dark matter halo constructed to be consistent with the observed properties of the LSB galaxy UGC 128. The results are not specific to this case; it merely provides numerical corroboration of the more general case that we showed analytically.

Consider the image above in the context of determining galaxy inclinations from isophotal shapes. We know this object is face-on because we can control our viewing angle in the simulation. However, we would not infer i=0 from this image. If we didn’t know it had been perturbed, we would happily infer a substantial inclination – in this case, easily as much as 60 degrees! This is an intentionally extreme case, but it illustrates how a small departure from a purely circular shape can be misinterpreted as an inclination. This is a systematic error, and one that usually makes the inclination larger than it is: it is possible to appear oval when face-on, but it is not possible to appear more face-on than perfectly circular.
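The geometry behind this misreading is simple enough to sketch in Python. This assumes a razor-thin, intrinsically circular disk, for which the apparent axis ratio is b/a = cos(i) and the eccentricity is sin(i). The axis ratio of 0.5 is a hypothetical value chosen to correspond to the 60 degrees quoted above, not a measurement from the simulation.

```python
import math


def inclination_from_axis_ratio(b_over_a):
    """Inclination (degrees) inferred for a thin circular disk with apparent axis ratio b/a."""
    return math.degrees(math.acos(b_over_a))


def eccentricity(b_over_a):
    """Eccentricity of the projected ellipse; equals sin(i) for a thin circular disk."""
    return math.sqrt(1.0 - b_over_a ** 2)


# A face-on (i = 0) but oval galaxy with a hypothetical apparent axis ratio of 0.5
# would be assigned a large inclination if it were assumed to be circular.
apparent = inclination_from_axis_ratio(0.5)
print(f"inferred inclination: {apparent:.0f} degrees (true value: 0)")
print(f"eccentricity of that ellipse: {eccentricity(0.5):.2f}")
```

Because b/a can never exceed one, any intrinsic ovalness pushes the inferred inclination up, never down, which is why the error is systematic rather than random.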

Around the same time, Erwin and I were making fits to the LSB galaxy data – with both dark matter halos and MOND. By this point in my career, I had deeply internalized that the data for LSB galaxies were never perfect. So we sweated every detail, and worked through every “what if?” This was a particularly onerous task for the dark matter fits, which could do many different things if this or that were assumed – we discussed all the plausible possibilities at the time. (Subsequently, a rich literature sprang up discussing many unreasonable possibilities.) By comparison, the MOND fits were easy. They had fewer knobs, and in 2/3 of the cases they simply worked, no muss, no fuss.

For the other 1/3 of the cases, we noticed that the shape of the MOND-predicted rotation curves was usually right, but the amplitude was off. How could it work so often, and yet miss in this weird way? That sounded like a systematic error, and the inclination was the most obvious culprit, with 1/sin(i) making a big difference for small inclinations. So we decided to allow this as a fit parameter, to see whether a fit could be obtained, and judge how [un]reasonable this was. Here is an example for two galaxies:

UGC 1230 (left) and UGC 5005 (right). Ovals show the nominally measured inclination (i=22° for UGC 1230 and 41° for UGC 5005, respectively) and the MOND best-fit value (i=17° and 30°). From de Blok & McGaugh (1998).

The case of UGC 1230 is memorable to me because it had a good rotation curve, despite being more face-on than widely considered acceptable for analysis. And for good reason: the difference between 22 and 17 degrees makes a huge difference to the fit, changing it from way off to picture perfect.
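A quick way to see why a few degrees matter so much near face-on is to evaluate the 1/sin(i) correction directly. This is a minimal sketch with hypothetical function names; it only rescales velocities and does not reproduce the actual MOND fits.

```python
import math


def deprojected_velocity(v_los, inclination_deg):
    """Rotation speed implied by a line-of-sight velocity at a given inclination."""
    return v_los / math.sin(math.radians(inclination_deg))


def velocity_ratio(i_old_deg, i_new_deg):
    """Factor by which the rotation speed changes when the inclination is revised."""
    return deprojected_velocity(1.0, i_new_deg) / deprojected_velocity(1.0, i_old_deg)


# UGC 1230: nominal 22 degrees versus the MOND best-fit 17 degrees.
near_face_on = velocity_ratio(22.0, 17.0)
# The same 5 degree revision for a hypothetical, more inclined galaxy.
more_inclined = velocity_ratio(65.0, 60.0)
print(f"22 -> 17 degrees: velocities scale by {near_face_on:.2f}")
print(f"65 -> 60 degrees: velocities scale by {more_inclined:.2f}")
```

The same five degrees that barely moves the rotation curve of a well-inclined galaxy rescales the whole curve by more than a quarter near face-on, which is the sense in which the amplitude can be off while the shape is right.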

Rotation curve fits for UGC 1230 (top) and UGC 5005 (bottom) with the inclination fixed (left) and fit (right). From de Blok & McGaugh (1998).

What I took away from this exercise is how hard it is to tell the difference between inclination values for relatively face-on galaxies. UGC 1230 is obvious: the ovals for the two inclinations are practically on top of each other. The difference in the case of UGC 5005 is more pronounced, but look at the galaxy. The shape of the outer isophote where we’re trying to measure this is raggedy as all get out; this is par for the course for LSB galaxies. Worse, look further in – this galaxy has a bar! The central bar is almost orthogonal to the kinematic major axis. If we hadn’t observed as deeply as we had, we’d think the minor axis was the major axis, and the inclination was something even higher.

I remember Erwin quipping that he should write a paper on how to use MOND to determine inclinations. This was a joke between us, but only half so: using the procedure in this way would be analogous to using Tully-Fisher to measure distances. We would simply be applying an empirically established procedure to constrain a property of a galaxy – luminosity from line-width in the case of Tully-Fisher; inclination from rotation curve shape here. That we don’t understand why this works has never stopped astronomers before.

Systematic errors in inclination happen all the time. Big surveys don’t have time to image deeply – they have too much sky area to cover – and if there is follow-up about the gas content, it inevitably comes in the form of a single dish HI measurement. This is fine; it is what we can do en masse. But an unresolved single dish measurement provides no information about the inclination, only a pre-inclination line-width (which itself is a crude proxy for the flat rotation speed). The inclination we have to take from the optical image, which would key on the easily detected, high surface brightness central region of the image. That’s the part that is most likely to show a bar-like distortion, so one can expect lots of systematic errors in the inclinations determined in this way. I provided a long yet still incomplete discussion of these issues in McGaugh (2012). This is both technical and intensely boring, so not even the pros read it.

This brings us to the case of AGC 114905, which is part of a sample of ultradiffuse galaxies discussed previously by some of the same authors. On that occasion, I kept to the code, and refrained from discussion. But for context, here are those data on a recent Baryonic Tully-Fisher plot. Spoiler alert: that post was about a different sample of galaxies that seemed to be off the relation but weren’t.

Baryonic Tully-Fisher relation showing the ultradiffuse galaxies discussed by Mancera Piña et al. (2019) as gray circles. These are all outliers from the relation; AGC 114905 is highlighted in orange. Placing much meaning in the outliers is a classic case of missing the forest for the trees. The outliers are trees. The Tully-Fisher relation is the forest.

On the face of it, these ultradiffuse galaxies (UDGs) are all very serious outliers. This is weird – they’re not some scatter off to one side, they’re just way off on their own island, with no apparent connection to the rest of established reality. By calling them a new name, UDG, it makes it sound plausible that these are some entirely novel population of galaxies that behave in a new way. But they’re not. They are exactly the same kinds of galaxies I’ve been talking about. They’re all blue, gas rich, low surface brightness, fairly isolated galaxies – all words that I’ve frequently used to describe my thesis sample. These UDGs are all a few billion solar masses in baryonic mass, very similar to F577-V1 above. You could give F577-V1 a different name, slip it into the sample, and nobody would notice that it wasn’t like one of the others.

The one slight difference is implied by the name: UDGs are a little lower in surface brightness. Indeed, once filter transformations are taken into account, the definition of ultradiffuse is equal to what I arbitrarily called very low surface brightness in 1996. Most of my old LSB sample galaxies have central stellar surface brightnesses at or a bit above 10 solar masses per square parsec while the UDGs here are a bit under this threshold. For comparison, in typical high surface brightness galaxies this quantity is many hundreds, often around a thousand. Nothing magic happens at the threshold of 10 solar masses per square parsec, so this line of definition between LSB and UDG is an observational distinction without a physical difference. So what are the odds of a different result for the same kind of galaxies?

Indeed, what really matters is the baryonic surface density, not just the stellar surface brightness. A galaxy made purely of gas but no stars would have zero optical surface brightness. I don’t know of any examples of that extreme, but we came close to it with the gas rich sample of Trachternach et al. (2009) when we tried this exact same exercise a decade ago. Despite selecting that sample to maximize the chance of deviations from the Baryonic Tully-Fisher relation, we found none – at least none that were credible: there were deviant cases, but their data were terrible. There were no deviants among the better data. This sample is comparable to or even more extreme than the UDGs in terms of baryonic surface density, so the UDGs can’t be exceptions because they’re a genuinely new population, whatever name we call them by.

The key thing is the credibility of the data, so let’s consider the data for AGC 114905. The kinematics are pretty well ordered; the velocity field is well observed for this kind of beast. It ought to be; they invested over 40 hours of JVLA time into this one galaxy. That’s more than went into my entire LSB thesis sample. The authors are all capable, competent people. I don’t think they’ve done anything wrong, per se. But they do seem to have climbed aboard the bandwagon of dark matter-free UDGs, and have talked themselves into believing smaller error bars on the inclination than I am persuaded is warranted.

Here is the picture of AGC 114905 from Mancera Piña et al. (2021):

AGC 114905 in stars (left) and gas (right). The contours of the gas distribution are shown on top of the stars in white. Figure 1 from Mancera Piña et al. (2021).

This messy morphology is typical of very low surface brightness galaxies – hence their frequent classification as Irregular galaxies. Though messier, it shares some morphological traits with the LSB galaxies shown above. The central light distribution is elongated with a major axis that is not aligned with that of the gas. The gas is raggedy as all get out. The contours are somewhat boxy; this is a hint that something hinky is going on beyond circular motion in a tilted axisymmetric disk.

The authors do the right thing and worry about the inclination, checking to see what it would take to be consistent with either LCDM or MOND, which is about i=11° instead of the 30° indicated by the shape of the outer isophote. They even build a model to check the plausibility of the smaller inclination:

Contours of models of disks with different inclinations (lines, as labeled) compared to the outer contour of the gas distribution of AGC 114905. Figure 7 from Mancera Piña et al. (2021).

Clearly the black line (i=30°) is a better fit to the shape of the gas distribution than the blue dashed line (i=11°). Consequently, they “find it unlikely that we are severely overestimating the inclination of our UDG, although this remains the largest source of uncertainty in our analysis.” I certainly agree with the latter phrase, but not the former. I think it is quite likely that they are overestimating the inclination. I wouldn’t even call it a severe overestimation; more like par for the course with this kind of object.

As I have emphasized above and elsewhere, there are many things that can go wrong in this sort of analysis. But if I were to try to put my finger on the most important thing, here it would be the inclination. The modeling exercise is good, but it assumes “razor-thin axisymmetric discs.” That’s a reasonable thing to do when building such a model, but we have to bear in mind that real disks are neither. The thickness of the disk probably doesn’t matter too much for a nearly face-on case like this, but the assumption of axisymmetry is extraordinarily dubious for an Irregular galaxy. That’s how they got the name.

It is hard to build models that are not axisymmetric. Once you drop this simplifying assumption, where do you even start? So I don’t fault them for stopping at this juncture, but I can also imagine doing as de Blok suggested, using MOND to set the inclination. Then one could build models with asymmetric features by trial and error until a match is obtained. Would we know that such a model would be a better representation of reality? No. Could we exclude such a model? Also no. So the bottom line is that I am not convinced that the uncertainty in the inclination is anywhere near as small as the adopted ±3°.

That’s very deep in the devilish details. If one is worried about a particular result, one can back off and ask if it makes sense in the context of what we already know. I’ve illustrated this process previously. First, check the empirical facts. Every other galaxy in the universe with credible data falls on the Baryonic Tully-Fisher relation, including very similar galaxies that go by a slightly different name. Hmm, strike one. Second, check what we expect from theory. I’m not a fan of theory-informed data interpretation, but we know that LCDM, unlike SCDM before it, at least gets the amplitude of the rotation speed in the right ballpark (Vflat ~ V200). Except here. Strike two. As much as we might favor LCDM as the standard cosmology, it has now been extraordinarily well established that MOND has considerable success in not just explaining but predicting these kinds of data, with literally hundreds of examples. One hundred was the threshold Vera Rubin obtained to refute excuses made to explain away the first few flat rotation curves. We’ve crossed that threshold: MOND phenomenology is as well established now as flat rotation curves were at the inception of the dark matter paradigm. So while I’m open to alternative explanations for the MOND phenomenology, seeing that a few trees stand out from the forest is never going to be as important as the forest itself.

The Baryonic Tully-Fisher relation exists empirically; we have to explain it in any theory. Either we explain it, or we don’t. We can’t have it both ways, just conveniently throwing away our explanation to accommodate any discrepant observation that comes along. That’s what we’d have to do here: if we can explain the relation, we can’t very well explain the outliers. If we explain the outliers, it trashes our explanation for the relation. If some galaxies are genuine exceptions, then there are probably exceptional reasons for them to be exceptions, like a departure from equilibrium. That can happen in any theory, rendering such a test moot: a basic tenet of objectivity is that we don’t get to blame a missed prediction of LCDM on departures from equilibrium without considering the same possibility for MOND.

This brings us to a physical effect that people should be aware of. We touched on the bar stability above, and how a galaxy might look oval even when seen face on. This happens fairly naturally in MOND simulations of isolated disk galaxies. They form bars and spirals and their outer parts wobble about. See, for example, this simulation by Nils Wittenburg. This particular example is a relatively massive galaxy; the lopsidedness reminds me of M101 (Watkins et al. 2017). Lower mass galaxies deeper in the MOND regime are likely even more wobbly. This happens because disks are only marginally stable in MOND, not the over-stabilized entities that have to be hammered to show a response as in our early simulation of UGC 128 above. The point is that there is good reason to expect even isolated face-on dwarf Irregulars to look, well, irregular, leading to exactly the issues with inclination determinations discussed above. Rather than being a contradiction to MOND, AGC 114905 may illustrate one of its inevitable consequences.

I don’t like to bicker at this level of detail, but it makes a profound difference to the interpretation. I do think we should be skeptical of results that contradict well established observational reality – especially when over-hyped. God knows I was skeptical of our own results, which initially surprised the bejeepers out of me, but have been repeatedly corroborated by subsequent observations.

I guess I’m old now, so I wonder how I come across to younger practitioners; perhaps as some scary undead monster. But mates, these claims about UDGs deviating from established scaling relations are off the edge of the map.

What JWST will see

Big galaxies at high redshift!

That’s my prediction, anyway. A little context first.

New Year, New Telescope

First, JWST finally launched. This has been a long-delayed NASA mission; the launch had been put off so many times it felt like a living example of Zeno’s paradox: ever closer but never quite there. A successful launch is always a relief – rockets do sometimes blow up on lift off – but there is still sweating to be done: it has one of the most complex deployments of any space mission. This is still a work in progress, but to start the new year, I thought it would be nice to look forward to what we hope to see.

JWST is a major space telescope optimized for observing in the near and mid-infrared. This enables observation of redshifted light from the earliest galaxies. This should enable us to see them as they would appear to our eyes had we been around at the time. And that time is long, long ago, in galaxies very far away: in principle, we should be able to see the first galaxies in their infancy, 13+ billion years ago. So what should we expect to see?

Early galaxies in LCDM

A theory is only as good as its prior. In LCDM, structure forms hierarchically: small objects emerge first, then merge into larger ones. It takes time to build up large galaxies like the Milky Way; the common estimate early on was that it would take at least a billion years to assemble an L* galaxy, and it could easily take longer. Ach, terminology: L* is the characteristic luminosity of the Schechter function we commonly use to describe the number density of galaxies of various sizes, and an L* galaxy is one of about that luminosity. L* galaxies like the Milky Way are common, but the number of brighter galaxies falls precipitously. Bigger galaxies exist, but they are rare above this characteristic brightness, so L* is shorthand for a galaxy of typical brightness.
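To make "falls precipitously" concrete, here is a minimal sketch of the Schechter form, in which the number density goes as a power law in L/L* times an exponential cutoff. The faint-end slope alpha = -1.2 and the unit normalization are illustrative assumptions, not values taken from any particular survey.

```python
import math

def schechter(l_over_lstar, phi_star=1.0, alpha=-1.2):
    """Schechter number density per unit L/L*.

    phi_star (normalization) and alpha (faint-end slope) are
    illustrative assumptions, not fitted values.
    """
    x = l_over_lstar
    return phi_star * x**alpha * math.exp(-x)

# Compare the abundance at several luminosities to that at L = L*.
for x in (0.1, 1.0, 3.0, 10.0):
    ratio = schechter(x) / schechter(1.0)
    print(f"L = {x:4.1f} L*: number density relative to L* = {ratio:.2e}")
```

With these assumed parameters, galaxies ten times brighter than L* come out several orders of magnitude rarer than L* galaxies themselves, while fainter galaxies are more numerous. That exponential cutoff is what makes L* a useful shorthand.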

We expect galaxies to start small and slowly build up in size. This is a very basic prediction of LCDM. The hierarchical growth of dark matter halos is fundamental, and relatively easy to calculate. How this translates to the visible parts of galaxies is more fraught, depending on the details of baryonic infall, star formation, and the many kinds of feedback. [While I am a frequent critic of model feedback schemes implemented in hydrodynamic simulations on galactic scales, there is no doubt that feedback happens on the much smaller scales of individual stars and their nurseries. These are two very different things for which we confusingly use the same word since the former is the aspirational result of the latter.] That said, one only expects to assemble mass so fast, so the natural expectation is to see small galaxies first, with larger galaxies emerging slowly as their host dark matter halos merge together.

Here is an example of a model formation history that results in the brightest galaxy in a cluster (from De Lucia & Blaizot 2007). Little things merge to form bigger things (hence “hierarchical”). This happens a lot, and it isn’t really clear when you would say the main galaxy had formed. The final product (at lookback time zero, at redshift z=0) is a big galaxy composed of old stars – fairly typical for a giant elliptical. But the most massive progenitor is still rather small 8 billion years ago, over 4 billion years after the Big Bang. The final product doesn’t really emerge until the last major merger around 4 billion years ago. This is just one example in one model, and there are many different models, so your mileage will vary. But you get the idea: it takes a long time and a lot of mergers to assemble a big galaxy.

Brightest cluster galaxy merger tree. Time progresses upwards from early in the universe at bottom to the present day at top. Every line is a small galaxy that merges to ultimately form the larger galaxy. Symbols are color-coded by B−V color (red meaning old stars, blue young) and their area scales with the stellar mass (bigger circles being bigger galaxies). From De Lucia & Blaizot (2007).

It is important to note that in a hierarchical model, the age of a galaxy is not the same as the age of the stars that make up the galaxy. According to De Lucia & Blaizot, the stars of the brightest cluster galaxies

“are formed very early (50 per cent at z~5, 80 per cent at z~3)”

but do so

“in many small galaxies”

– i.e., the little progenitor circles in the plot above. The brightest cluster galaxies in their model build up rather slowly, such that

“half their final mass is typically locked-up in a single galaxy after z~0.5.”

De Lucia & Blaizot (2007)

So all the star formation happens early in the little things, but the final big thing emerges later – a lot later, only reaching half its current size when the universe is about 8 Gyr old. (That’s roughly when the solar system formed: we are late-comers to this party.) Given this prediction, one can imagine that JWST should see lots of small galaxies at high redshift, their early star formation popping off like firecrackers, but it shouldn’t see any big galaxies early on – not really at z > 3 and certainly not at z > 5.
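For readers who want to check the conversion between redshift and cosmic age used here, the following is a rough sketch for a flat universe containing only matter and a cosmological constant, for which the age has a closed form. The parameter values (H0 = 70 km/s/Mpc, matter density 0.3) are assumptions chosen for illustration, and radiation is neglected, which is harmless at these redshifts.

```python
import math

# Assumed, illustrative cosmological parameters (not taken from the post).
H0 = 70.0                 # Hubble constant in km/s/Mpc
OMEGA_M = 0.3             # matter density parameter
OMEGA_L = 1.0 - OMEGA_M   # flat universe: the remainder is Lambda

KM_PER_MPC = 3.0857e19
SEC_PER_GYR = 3.15576e16
HUBBLE_TIME_GYR = KM_PER_MPC / H0 / SEC_PER_GYR  # 1/H0 expressed in Gyr

def age_gyr(z):
    """Age in Gyr of a flat matter + Lambda universe at redshift z (radiation neglected)."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(OMEGA_L)) * HUBBLE_TIME_GYR * math.asinh(x)

for z in (0.5, 3, 5, 10, 20):
    print(f"z = {z:>4}: age = {age_gyr(z):5.2f} Gyr")
```

Under these assumptions, z = 0.5 corresponds to an age of roughly 8 Gyr, and z = 10 to a bit under half a billion years; the exact numbers shift modestly with the adopted parameters.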

Big galaxies in the data at early times?

While JWST is eagerly awaited, people have not been idle about looking into this. There have been many deep surveys made with the Hubble Space Telescope, augmented by the infrared capable (and now sadly defunct) Spitzer Space Telescope. These have already spied a number of big galaxies at surprisingly high redshift. So surprising that Steinhardt et al. (2016) dubbed it “The Impossibly Early Galaxy Problem.” This is their key plot:

The observed (points) and predicted (lines) luminosity functions of galaxies at various redshifts (colors). If all were well, the points would follow the lines of the same color. Instead, galaxies appear to be brighter than expected, already big at the highest redshifts probed. From Steinhardt et al. (2016).

There are lots of caveats to this kind of work. Constructing the galaxy luminosity function is a challenging task at any redshift; getting it right at high redshift especially so. While what counts as “high” varies, I’d say everything on the above plot counts. Steinhardt et al. (2016) worry about these details at considerable length but don’t find any plausible way out.

Around the same time, one of our graduate students, Jay Franck, was looking into similar issues. One of the things he found was that not only were there big galaxies in place early on, but they were also in clusters (or at least protoclusters) early and often. That is to say, not only are the galaxies too big too soon, so are the clusters in which they reside.

Dr. Franck made his own comparison of data to models, using the Millennium simulation to devise an apples-to-apples comparison:

The apparent magnitude m* at 4.5 microns of L* galaxies in clusters as a function of redshift. Circles are data; squares represent the Millennium simulation. These diverge at z > 2: galaxies are brighter (smaller m*) than predicted (Fig. 5.5 from Franck 2017).

The result is that the data look more like big galaxies formed early already as big galaxies. The solid lines are “passive evolution” models in which all the stars form in a short period starting at z=10. This starting point is an arbitrary choice, but there is little cosmic time between z = 10 and 20 – just a few hundred million years, barely one spin around the Milky Way. This is a short time in stellar evolution, so is practically the same as starting right at the beginning of time. As Jay put it,

“High redshift cluster galaxies appear to be consistent with an old stellar population… they do not appear to be rapidly assembling stellar mass at these epochs.”

Franck 2017

We see old stars, but we don’t see the predicted assembly of galaxies via mergers, at least not at the expected time. Rather, it looks like some galaxies were already big very early on.

As someone who has worked mostly on well resolved, relatively nearby galaxies, all this makes me queasy. Jay, and many others, have worked desperately hard to squeeze knowledge from the faint smudges detected by first generation space telescopes. JWST should bring these into much better focus.

Early galaxies in MOND

To go back to the first line of this post, big galaxies at high redshift did not come as a surprise to me. It is what we expect in MOND.

Structure formation is generally considered a great success of LCDM. It is straightforward and robust to calculate on large scales in linear perturbation theory. Individual galaxies, on the other hand, are highly non-linear objects, making them hard beasts to tame in a model. In MOND, it is the other way around – predicting the behavior of individual galaxies is straightforward – only the observed distribution of mass matters, not all the details of how it came to be that way – but what happens as structure forms in the early universe is highly non-linear.

The non-linearity of MOND makes it hard to work with computationally. It is also crucial to how structure forms. I provide here an outline of how I expect structure formation to proceed in MOND. This page is now old, even ancient in internet time, as the golden age for this work was 15 – 20 years ago, when all the essential predictions were made and I was naive enough to think cosmologists were amenable to reason. Since the horizon of scientific memory is shorter than that, I felt it necessary to review in 2015. That is now itself over the horizon, so with the launch of JWST, it seems appropriate to remind the community yet again that these predictions exist.

This 1998 paper by Bob Sanders is a foundational paper in this field (see also Sanders 2001 and the other references given on the structure formation page). He says, right in the abstract,

“Objects of galaxy mass are the first virialized objects to form (by z = 10), and larger structure develops rapidly.”

Sanders (1998)

This was a remarkable prediction to make in 1998. Galaxies, much less larger structures, were supposed to take much longer to form. It takes time to go from the small initial perturbations that we see in the CMB at z=1000 to large objects like galaxies. Indeed, it takes at least a few hundred million years simply in free fall time to assemble a galaxy’s worth of mass, a hard limit. Here Sanders was saying that an L* galaxy might assemble as early as half a billion years after the Big Bang.

So how can this happen? Without dark matter to lend a helping hand, structure formation in the very early universe is inhibited by the radiation field. This inhibition is removed around z ~ 200; exactly when being very sensitive to the baryon density. At this point, the baryon perturbations suddenly find themselves deep in the MOND regime, and behave as if there is a huge amount of dark matter. Structure proceeds hierarchically, as it must, but on a highly compressed timescale. To distinguish it from LCDM hierarchical galaxy formation, let’s call it prompt structure formation. In prompt structure formation, we expect

  • Early reionization (z ~ 20)
  • Some L* galaxies by z ~ 10
  • Early emergence of the cosmic web
  • Massive clusters already at z > 2
  • Large, empty voids
  • Large peculiar velocities
  • A very large homogeneity scale, maybe fractal over 100s of Mpc

There are already indications of all of these things, nearly all of which were predicted in advance of the relevant observations. I could elaborate, but that is beyond the scope of this post. People should read the references* if they’re keen.

*Reading the science papers is mandatory for the pros, who often seem fond of making straw man arguments about what they imagine MOND might do without bothering to check. I once referred some self-styled experts in structure formation to Sanders’s work. They promptly replied “That would mean structures of 10^18 M☉!” when what he said was

“The largest objects being virialized now would be clusters of galaxies with masses in excess of 10^14 M☉. Superclusters would only now be reaching maximum expansion.”

Sanders (1998)

The exact numbers are very sensitive to cosmological parameters, as he discussed, but I have no idea where they got 10^18, other than just making stuff up. More importantly, Sanders’s statement clearly presaged the observation of very massive clusters at surprisingly high redshift and the discovery of the Laniakea Supercluster.

These are just the early predictions of prompt structure formation, made in the same spirit that enabled me to predict the second peak of the microwave background and the absorption signal observed by EDGES at cosmic dawn. Since that time, at least two additional schools of thought as to how MOND might impact cosmology have emerged. One of them is the sterile neutrino MOND cosmology suggested by Angus and being actively pursued by the Bonn-Prague research group. Very recently, there is of course the new relativistic theory of Skordis & Złośnik which fits the cosmologists’ holy grail of the power spectrum in both the CMB at z = 1090 and galaxies at z = 0. There should be an active exchange and debate between these approaches, with perhaps new ones emerging.

Instead, we lack critical mass. Most of the community remains entirely obsessed with pursuing the vain chimera of invisible mass. I fear that this will eventually prove to be one of the greatest wastes of brainpower (some of it my own) in the history of science. I can only hope I’m wrong, as many brilliant people seem likely to waste their career running garbage in-garbage out computer simulations or at the bottom of a mine shaft failing to detect what isn’t there.

A beautiful mess

JWST can’t answer all of these questions, but it will help enormously with galaxy formation, which is bound to be messy. It’s not like L* galaxies are going to spring fully formed from the void like Athena from the forehead of Zeus. The early universe must be a chaotic place, with clumps of gas condensing to form the first stars that irradiate the surrounding intergalactic gas with UV photons before detonating as the first supernovae, and the clumps of stars merging to form giant elliptical galaxies while elsewhere gas manages to pool and settle into the large disks of spiral galaxies. When all this happens, how it happens, and how big galaxies get how fast are all to be determined – but now accessible to direct observation thanks to JWST.

It’s going to be a confusing, beautiful mess, in the best possible way – one that promises to test and challenge our predictions and preconceptions about structure formation in the early universe.

The RAR extended by weak lensing

Last time, I expressed despondency about the lack of progress due to attitudes that in many ways remain firmly entrenched in the 1980s. Recently a nice result has appeared, so maybe there is some hope.

The radial acceleration relation (RAR) measured in rotationally supported galaxies extends down to an observed acceleration of about g_obs = 10^-11 m/s/s, about one part in 1000000000000 of the acceleration we feel here on the surface of the Earth. In some extreme dwarfs, we get down below 10^-12 m/s/s. But accelerations this low are hard to find except in the depths of intergalactic space.

Weak lensing data

Brouwer et al have obtained a new constraint down to 10^-12.5 m/s/s using weak gravitational lensing. This technique empowers one to probe the gravitational potential of massive galaxies out to nearly 1 Mpc. (The bulk of the luminous mass is typically confined within a few kpc.) To do this, one looks for the net statistical distortion in galaxies behind a lensing mass like a giant elliptical galaxy. I always found this approach a little scary, because you can’t see the signal directly with your eyes the way you can the velocities in a galaxy measured with a long slit spectrograph. Moreover, one has to bin and stack the data, so the result isn’t for an individual galaxy, but rather the average of galaxies within the bin, however defined. There are further technical issues that make this challenging, but it’s what one has to do to get farther out.

Doing all that, Brouwer et al obtained this RAR:

The radial acceleration relation from weak lensing measured by Brouwer et al (2021). The red squares and bluescale at the top right are the RAR from rotating galaxies (McGaugh et al 2016). The blue, black, and orange points are the new weak lensing results.

To parse a few of the details: there are two basic results here, one from the GAMA survey (the blue points) and one from KiDS. KiDS is larger so has smaller formal errors, but relies on photometric redshifts (which use lots of colors to guess the best match redshift). That’s probably OK in a statistical sense, but they are not as accurate as the spectroscopic redshifts measured for GAMA. There is a lot of structure in redshift space that gets washed out by photometric redshift estimates. The fact that the two basically agree hopefully means that this doesn’t matter here.

There are two versions of the KiDS data, one using just the stellar mass to estimate g_bar, and another that includes an estimate of the coronal gas mass. Many galaxies are surrounded by a hot corona of gas. This is negligible at small radii where the stars dominate, but becomes progressively more important as part of the baryonic mass budget as one moves out. How important? Hard to say. But it certainly matters on scales of a few hundred kpc (this is the CGM in the baryon pie chart, which suggests roughly equal mass in stars (all within a few tens of kpc) and hot coronal gas (mostly out beyond 100 kpc)). This corresponds to the orange points; the black points are what happens if we neglect this component (which certainly isn’t zero). So the truth is in there somewhere – this seems to be the dominant systematic uncertainty.

Getting past these pesky details, this result is cool on many levels. First, the RAR appears to persist as a relation. That needn’t have happened. Second, it extends the RAR by a couple of decades to much lower accelerations. Third, it applies to non-rotating as well as rotationally supported galaxies (more on that in a bit). Fourth, the data at very low accelerations follow a straight line with a slope of about 1/2 in this log-log plot. That means g_obs ~ g_bar^(1/2). That provides a test of theory.

What does it mean?

Empirically, this is a confirmation that a known if widely unexpected relation extends further than previously known. That’s pretty neat in its own right, without any theoretical baggage. We used to be able to appreciate empirical relations better (e.g., the stellar main sequence!) before we understood what they meant. Now we seem to put the cart (theory) before the horse (data). That said, we do want to use data to test theories. Usually I discuss dark matter first, but that is complicated, so let’s start with MOND.

Test of MOND

MOND predicts what we see.

I am tempted to leave it at that, because it’s really that simple. But experience has taught me that no result is so obvious that someone won’t claim exactly the opposite, so let’s explore it a bit more.

There are three tests: whether the relation (i) exists, (ii) has the right slope, and (iii) has the right normalization. Tests (i) and (ii) are an immediate pass. It also looks like (iii) is very nearly correct, but it depends in detail on the baryonic mass-to-light ratio – that of the stars plus any coronal gas.

MOND is represented by the grey line that’s hard to see, but goes through the data at both high and low acceleration. At high accelerations, this particular line is a fitting function I chose for convenience. There’s nothing special about it, nor is it even specific to MOND. That was the point of our 2016 RAR paper: this relation exists in the data whether it is due to MOND or not. Conceivably, the RAR might be a relation that only applies to rotating galaxies for some reason that isn’t MOND. That’s hard to sustain, since the data look like MOND – so much so that the two are impossible to distinguish in this plane.

In terms of MOND, the RAR traces the interpolation function that quantifies the transition from the Newtonian regime where g_obs = g_bar to the deep MOND regime where g_obs ~ g_bar^(1/2). MOND does not specify the precise form of the interpolation function, just the asymptotic limits. The data trace that transition, providing an empirical assessment of the shape of the interpolation function around the acceleration scale a0. That’s interesting and will hopefully inform further theory development, but it is not critical to testing MOND.

What MOND does very explicitly predict is the asymptotic behavior g_obs ~ g_bar^(1/2) in the deep MOND regime of low accelerations (g_obs << a0). That the lensing data are well into this regime makes them an excellent test of this strong prediction of MOND. It passes with flying colors: the data have precisely the slope anticipated by Milgrom nearly 40 years ago.
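As a sketch of how the two limits connect, the snippet below evaluates one commonly used fitting form for the RAR, g_obs = g_bar / (1 - exp(-sqrt(g_bar/g†))), and checks its behavior at high and low acceleration. I am assuming this is the form meant by the grey line, with the scale g† = 1.2 × 10^-10 m/s/s standing in for a0; the function names are my own.

```python
import math

G_DAGGER = 1.2e-10  # assumed acceleration scale in m/s/s, standing in for a0

def g_obs_rar(g_bar):
    """Assumed RAR fitting form: g_bar / (1 - exp(-sqrt(g_bar / G_DAGGER)))."""
    x = math.sqrt(g_bar / G_DAGGER)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return g_bar / -math.expm1(-x)

def deep_mond(g_bar):
    """Deep-MOND asymptote: g_obs = sqrt(a0 * g_bar)."""
    return math.sqrt(G_DAGGER * g_bar)

def log_slope(func, g_lo, g_hi):
    """Average logarithmic slope d(log g_obs)/d(log g_bar) between two points."""
    rise = math.log10(func(g_hi)) - math.log10(func(g_lo))
    run = math.log10(g_hi) - math.log10(g_lo)
    return rise / run

print("Newtonian end, g_obs/g_bar at 1e-8:", g_obs_rar(1e-8) / 1e-8)
print("Low-acceleration slope (1e-15 to 1e-14):", log_slope(g_obs_rar, 1e-15, 1e-14))
print("Ratio to deep-MOND asymptote at 1e-14:", g_obs_rar(1e-14) / deep_mond(1e-14))
```

At high accelerations the ratio g_obs/g_bar tends to one, and at low accelerations the logarithmic slope tends to 1/2, which is the asymptotic behavior the lensing data test. Any interpolation function with the same two limits would do the same; only the shape of the transition near a0 differs.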

This didn’t have to happen. All sorts of other things might have happened. Indeed, as we discussed in Lelli et al (2017), there were some hints that the relation flattened, saturating at a constant g_obs around 10^-11 m/s/s. I was never convinced that this was real, as it only appears in the least certain data, and there were already some weak lensing data to lower accelerations.

Milgrom (2013) analyzed weak lensing data that were available then, obtaining this figure:

Velocity dispersion-luminosity relation obtained from weak lensing data by Milgrom (2013). Lines are the expectation of MOND for mass-to-light ratios ranging from 1 to 6 in the r’-band, as labeled. The sample is split into red (early type, elliptical) and blue (late type, spiral) galaxies. The early types have a systematically higher M/L, as expected for their older stellar populations.

The new data corroborate this result. Here is a similar figure from Brouwer et al:

The RAR from weak lensing for galaxies split by Sérsic index (left) and color (right).

Just looking at these figures, one can see the same type-dependent effect found by Milgrom. However, there is an important difference: Milgrom’s plot leaves the unknown mass-to-light ratio as a free parameter, while the new plot has an estimate of this built-in. So if the adopted M/L is correct, then the red and blue galaxies form parallel RARs that are almost but not quite exactly the same. That would not be consistent with MOND, which should place everything on the same relation. However, this difference is well within the uncertainty of the baryonic mass estimate – not just the M/L of the stars, but also the coronal gas content (i.e., the black vs. orange points in the first plot). MOND predicted this behavior well in advance of the observation, so one would have to bend over backwards, rub one’s belly, and simultaneously punch oneself in the face to portray this as anything short of a fantastic success of MOND.

The data! Look at the data!

I say that because I’m sure people will line up to punch themselves in the face in exactly this fashion*. One of the things that persuades me to suspect that there might be something to MOND is the lengths to which people will go to deny even its most obvious successes. At the same time, they are more than willing to cut any amount of slack necessary to save LCDM. An example is provided by Ludlow et al., who claim to explain the RAR ‘naturally’ from simulations – provided they spot themselves a magic factor of two in the stellar mass-to-light ratio. If it were natural, they wouldn’t need that arbitrary factor. By the same token, if you recognize that you might have been that far off about M*/L, you have to extend that same grace to MOND as you do to LCDM. That’s a basic tenet of objectivity, which used to be a value in science. It doesn’t look like a correction as large as a factor of two is necessary here given the uncertainty in the coronal gas. So, preemptively: Get a grip, people.

MOND predicts what we see. No other theory beat it to the punch. The best one can hope to do is to match its success after the fact by coming up with some other theory that looks just like MOND.

Test of LCDM

In order to test LCDM, we have to agree what LCDM predicts. That agreement is lacking. There is no clear prediction. This complicates the discussion, as the best one can hope to do is give a thorough discussion of all the possibilities that people have so far considered, which differ in important ways. That exercise is necessarily incomplete – people can always come up with new and different ideas for how to explain what they didn’t predict. I’ve been down the road of being thorough many times, which gets so complicated that no one reads it. So I will not attempt to be thorough here, and only explore enough examples to give a picture of where we’re currently at.

The tests are the same as above: should the relation (i) exist? (ii) have the observed slope? and (iii) normalization?

The first problem for LCDM is that the relation exists (i). There is no reason to expect this relation to exist. There was (and in some corners, continues to be) a lot of denial that the RAR even exists, because it shouldn’t. It does, and it looks just like what MOND predicts. LCDM is not MOND, and did not anticipate this behavior because there is no reason to do so.

If we persist past this point – and it is not obvious that we should – then we may say, OK, here’s this unexpected relation; how do we explain it? For starters, we do have a prediction for the density profiles of dark matter halos; these fall off as r^-3. That translates to some slope in the RAR plane, but not a unique relation, as the normalization can and should be different for each halo. But it’s not even the right slope. The observed slope corresponds to a logarithmic potential in which the density profile falls off as r^-2. That’s what is required to give a flat rotation curve in Newtonian dynamics, which is why the pseudoisothermal halo was the standard model before simulations gave us the NFW halo with its r^-3 fall off. The lensing data are like a flat rotation curve that extends indefinitely far out; they are not like an NFW halo.
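The slope argument can be checked with a few lines of arithmetic. This is a simplified sketch under stated assumptions: the baryons are treated as a point mass far out, so g_bar falls as 1/r^2, and the observed acceleration is taken to be entirely that of the halo, G M(r)/r^2. Radii are in units of the halo scale radius, and the choice of 30 scale radii is arbitrary.

```python
import math

def nfw_mass(x):
    """Dimensionless enclosed mass of an NFW halo at x = r / r_s."""
    return math.log1p(x) - x / (1.0 + x)

def pseudo_iso_mass(x):
    """Dimensionless enclosed mass of a pseudoisothermal halo at x = r / r_c."""
    return x - math.atan(x)

def rar_slope(mass_profile, x, h=1e-4):
    """Local slope d(ln g_obs)/d(ln g_bar) for g_obs = M(x)/x^2 and g_bar = 1/x^2.

    Assumes a point-mass baryon distribution and neglects the baryonic
    contribution to g_obs (halo-dominated outskirts).
    """
    # Central difference for d(ln M)/d(ln x).
    dlnm = (math.log(mass_profile(x * math.exp(h)))
            - math.log(mass_profile(x * math.exp(-h)))) / (2.0 * h)
    # d ln g_obs / d ln x = dlnm - 2 and d ln g_bar / d ln x = -2.
    return (dlnm - 2.0) / -2.0

for name, profile in (("pseudoisothermal", pseudo_iso_mass), ("NFW", nfw_mass)):
    print(f"{name:>16}: RAR slope at 30 scale radii = {rar_slope(profile, 30.0):.2f}")
```

In this toy setup the r^-2 halo gives a slope close to 1/2 far out, while the NFW halo gives a noticeably steeper slope that creeps toward 1 as the radius grows. That is the sense in which NFW is "not even the right slope."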

That’s just stating the obvious. To do more requires building a model. Here is an example from Oman et al. of a model that follows the logic I just outlined, adding some necessary and reasonable assumptions about the baryons:

The “slight offset” from the observed RAR mentioned in the caption is the factor of two in stellar mass they spotted themselves in Ludlow et al. (2017).

The model is the orange line. It deviates from the black line that is the prediction of MOND. The data look like MOND, not like the orange line.

One can of course build other models. Brouwer et al discuss some. I will not explore these in detail, and only note that the models are not consistent, so there is no clear prediction from LCDM. To explore just one a little further, this figure appears at the very end of their paper, in appendix C:

The orange line in this case is some extrapolation of the model of Navarro et al. (2017).** This also does not work, though it doesn’t fail by as much as the model of Oman et al. I don’t understand how they make the extrapolation here, as a major prediction of Navarro et al. was that g_obs would saturate at 10^-11 m/s/s; the orange line should flatten out near the middle of this plot. Indeed, they argued that we would never observe any lower accelerations, and that

“extending observations to radii well beyond the inner halo regions should lead to systematic deviations from the MDAR.”

– Navarro et al (2017)

This is a reasonable prediction for LCDM, but it isn’t what happened – the RAR continues as predicted by MOND. (The MDAR is equivalent to the RAR.)

The astute reader may notice that many of these theorists are frequently coauthors, so you might expect they’d come up with a self-consistent model and stick to it. Unfortunately, consistency is not a hobgoblin that afflicts galaxy formation theory, and there are as many predictions as there are theorists (more for the prolific ones). They’re all over the map – which is the problem. LCDM makes no prediction to which everyone agrees. This makes it impossible to test the theory. If one model is wrong, that is just because that particular model is wrong, not because the theory is under threat. The theory is never under threat as there always seems to be another modeler who will claim success where others fail, whether they genuinely succeed or not. That they claim success is all that is required. Cognitive dissonance then takes over, people believe what they want to hear, and all anomalies are forgiven and forgotten. There never seems to be a proper prior that everyone would agree falsifies the theory if it fails. Galaxy formation in LCDM has become epicycles on steroids.

Whither now?

I have no idea. Continue to improve the data, of course. But the more important thing that needs to happen is a change in attitude. The attitude is that LCDM as a cosmology must be right so the mass discrepancy must be caused by non-baryonic dark matter so any observation like this must have a conventional explanation, no matter how absurd and convoluted. We’ve been stuck in this rut since before we even put the L in CDM. We refuse to consider alternatives so long as the standard model has not been falsified, but I don’t see how it can be falsified to the satisfaction of all – there’s always a caveat, a rub, some out that we’re willing to accept uncritically, no matter how silly. So in the rut we remain.

A priori predictions are an important part of the scientific method because they can’t be fudged. On the rare occasions when they come true, it is supposed to make us take note – even change our minds. These lensing results are just another of many previous corroborations of a priori predictions by MOND. What people do with that knowledge – build on it, choose to ignore it, or rant in denial – is up to them.


*Bertolt Brecht mocked this attitude amongst the Aristotelian philosophers in his play about Galileo, noting how they were eager to criticize the new dynamics if the heavier rock beat the lighter rock to the ground by so much as a centimeter in the Leaning Tower of Pisa experiment while turning a blind eye to their own prediction being off by a hundred meters.

**I worked hard to salvage dark matter, which included a lot of model building. I recognize the model of Navarro et al as a slight variation on a model I built in 2000 but did not publish because it was obviously wrong. It takes a lot of time to write a scientific paper, so a lot of null results never get reported. In 2000 when I did this, the natural assumption to make was that galaxies all had about the same disk fraction (the ratio of stars to dark matter, e.g., assumption (i) of Mo et al 1998). This predicts far too much scatter in the RAR, which is why I abandoned the model. Since then, this obvious and natural assumption has been replaced by abundance matching, in which the stellar mass fraction is allowed to vary to account for the difference between the predicted halo mass function and the observed galaxy luminosity function. In effect, we replaced a universal constant with a rolling fudge factor***. This has the effect of compressing the range of halo masses for a given range of stellar masses. This in turn reduces the “predicted” scatter in the RAR, just by taking away some of the variance that was naturally there. One could do better still with even more compression, as the data are crudely consistent with all galaxies living in the same dark matter halo. This is of course a consequence of MOND, in which the conventionally inferred dark matter halo is just the “extra” force specified by the interpolation function.

***This is an example of what I’ll call prediction creep for want of a better term. Originally, we thought that galaxies corresponded to balls of gas that had had time to cool and condense. As data accumulated, we realized that the baryon fractions of galaxies were not equal to the cosmic value f_b; they were rather less. That meant that only a fraction of the baryons available in a dark matter halo had actually cooled to form the visible disk. So we introduced a parameter m_d = M_disk/M_tot (as Mo et al. called it) where the disk is the visible stars and gas and the total includes that and all the dark matter out to the notional edge of the dark matter halo. We could have any m_d < f_b, but they were in the same ballpark for massive galaxies, so it seemed reasonable to think that the disk fraction was a respectable fraction of the baryons – and the same for all galaxies, perhaps with some scatter. This also does not work; low mass galaxies have much lower m_d than high mass galaxies. Indeed, m_d becomes ridiculously small for the smallest galaxies, less than 1% of the available f_b (a problem I’ve been worried about since the previous century). At each step, there has been a creep in what we “predict.” All the baryons should condense. Well, most of them. OK, fewer in low mass galaxies. Why? Feedback! How does that work? Don’t ask! You don’t want to know. So for a while the baryon fraction of a galaxy was just a random number stochastically generated by chance and feedback. That is reasonable (feedback is chaotic) but it doesn’t work; the variation of the disk fraction is a clear function of mass that has to have little scatter (or it pumps up the scatter in the Tully-Fisher relation). So we gradually backed our way into a paradigm where the disk fraction is a function m_d(M*). This has been around long enough that we have gotten used to the idea. Instead of seeing it for what it is – a rolling fudge factor – we call it natural as if it had been there from the start, as if we expected it all along. This is prediction creep. We did not predict anything of the sort. This is just an expectation built through familiarity with requirements imposed by the data, not genuine predictions made by the theory. It has become common to assert that some unnatural results are natural; this stems in part from assuming part of the answer: any model built on abundance matching is unnatural to start, because abundance matching is unnatural. Necessary, but not remotely what we expected before all the prediction creep. It’s creepy how flexible our predictions can be.

Despondency

I have become despondent for the progress of science.

Despite enormous progress both observational and computational, we have made little progress in solving the missing mass problem. The issue is not one of technical progress. It is psychological.

Words matter. We are hung up on missing mass as literal dark matter. As Bekenstein pointed out, a less misleading name would have been the acceleration discrepancy, because the problem only appears at low accelerations. But that sounds awkward. We humans like our simple catchphrases, and often cling to them no matter what. We called it dark matter, so it must be dark matter!

Vera Rubin succinctly stated the appropriately conservative attitude of most scientists in 1982 during the discussion at IAU 100:

To highlight the end of her quote:

I believe most of us would rather alter Newtonian gravitational theory only as a last resort.

Rubin, V.C. 1983, in the proceedings of IAU Symposium 100: Internal Kinematics and Dynamics of Galaxies, p. 10.

Exactly.

In 1982, this was exactly the right attitude. It had been clearly established that there was a discrepancy between what you see and what you get. But that was about it. So, we could add a little mass that’s hard to see, or we could change a fundamental law of nature. Easy call.

By this time, the evidence for a discrepancy was clear, but the hypothesized solutions were still in development. This was before Peebles and, separately, Steigman & Turner published the suggestion of cold dark matter. This was before the publication of Milgrom’s first papers on MOND. (Note that these ideas took years to develop, so much of this work was simultaneous and not done in a vacuum.) All that was clear was that something extra was needed. It wasn’t even clear how much – a factor of two in mass sufficed for many of the early observations. At that time, it was easy to imagine that amount to be lurking in low mass stars. No need for new physics, either gravitational or particle.

The situation quickly snowballed. From a factor of two, we soon needed a factor of ten. Whatever was doing the gravitating, it exceeded the mass density allowed in normal matter by big bang nucleosynthesis. By the time I was a grad student in the late ’80s, it was obvious that there had to be some kind of dark mass, and it had to be non-baryonic. That meant new particle physics (e.g., a WIMP). The cold dark matter paradigm took root.

Like a fifty year mortgage, we are basically still stuck with this decision we made in the ’80s. It made sense then, given what was then known. Does it still? At what point have we reached the last resort? More importantly, apparently, how do we persuade ourselves that we have reached this point?

Peebles provides a nice recent summary of all the ways in which LCDM is a good approximation to cosmologically relevant observations. There are a lot, and I don’t disagree with him. The basic argument is that it is very unlikely that these things all agree unless LCDM is basically correct.

Trouble is, the exact same argument applies for MOND. I’m not going to justify this here – it should be obvious. If it isn’t, you haven’t been paying attention. It is unlikely to the point of absurdity that a wholly false theory should succeed in making so many predictions of such diversity and precision as MOND has.

These are both examples of what philosophers of science call a No Miracles Argument. The problem is that it cuts both ways. I will refrain from editorializing here on which would be the bigger miracle, and simply note that the obvious thing to do is try to combine the successes of both, especially given that they don’t overlap much. And yet, the Venn diagram of scientists working to satisfy both ends is vanishingly small. Not zero, but the vast majority of the community remains stuck in the ’80s: it has to be cold dark matter. I remember having this attitude, and how hard it was to realize that it might be wrong. The intellectual blinders imposed by this attitude are more opaque than a brick wall. This psychological hangup is the primary barrier to real scientific progress (as opposed to incremental progress in the sense used by Kuhn).

Unfortunately, both CDM and MOND rely on a tooth fairy. In CDM, it is the conceit that non-baryonic dark matter actually exists. This requires new physics beyond the Standard Model of particle physics. All the successes of LCDM follow if and only if dark matter actually exists. This we do not know (contrary to many assertions to this effect); all we really know is that there are discrepancies. Whether the discrepancies are due to literal dark matter or a change in the force law is maddeningly ambiguous. Of course, the conceit in MOND is not just that there is a modified force law, but that there must be a physical mechanism by which it occurs. The first part is the well-established discrepancy. The last part remains wanting.

When we think we know, we cease to learn.

Dr. Radhakrishnan

The best scientists are always in doubt. As well as enumerating its successes, Peebles also discusses some of the ways in which LCDM might be better. Should massive galaxies appear as they do? (Not really.) Should the voids really be so empty? (MOND predicted that one.) I seldom hear these concerns from other cosmologists. That’s because they’re not in doubt. The attitude is that dark matter has to exist, and any contrary evidence is simply a square peg that can be made to fit the round hole if we pound hard enough.

And so, we’re stuck still pounding the ideas of the ’80s into the heads of innocent students, creating a closed ecosystem of stagnant ideas self-perpetuated by the echo chamber effect. I see no good way out of this; indeed, the quality of debate is palpably lower now than it was in the previous century.

So I have become despondent for the progress of science.

Bias all the way down

It often happens that data are ambiguous and open to multiple interpretations. The evidence for dark matter is an obvious example. I frequently hear permutations on the statement

We know dark matter exists; we just need to find it.

This is said in all earnestness by serious scientists who clearly believe what they say. They mean it. Unfortunately, meaning something in all seriousness, indeed, believing it with the intensity of religious fervor, does not guarantee that it is so.

The way the statement above is phrased is a dangerous half-truth. What the data show beyond any dispute is that there is a discrepancy between what we observe in extragalactic systems (including cosmology) and the predictions of Newton & Einstein as applied to the visible mass. If we assume that the equations Newton & Einstein taught us are correct, then we inevitably infer the need for invisible mass. That seems like a very reasonable assumption, but it is just that: an assumption. Moreover, it is an assumption that is only tested on the relevant scales by the data that show a discrepancy. One could instead infer that theory fails this test – it does not work to predict observed motions when applied to the observed mass. From this perspective, it could just as legitimately be said that

A more general theory of dynamics must exist; we just need to figure out what it is.

That puts an entirely different complexion on exactly the same problem. The data are the same; they are not to blame. The difference is how we interpret them.

Neither of these statements is correct: they are both half-truths; two sides of the same coin. As such, one risks being wildly misled. If one only hears one, the other gets discounted. That’s pretty much where the field is now, and it has been stuck there for a long time.

That’s certainly where I got my start. I was a firm believer in the standard dark matter interpretation. The evidence was obvious and overwhelming. Not only did there need to be invisible mass, it had to be some new kind of particle, like a WIMP. Almost certainly a WIMP. Any other interpretation (like MACHOs) was obviously stupid, as it violated some strong constraint, like Big Bang Nucleosynthesis (BBN). It had to be non-baryonic cold dark matter. HAD. TO. BE. I was sure of this. We were all sure of this.

What gets us in trouble is not what we don’t know. It’s what we know for sure that just ain’t so.

Josh Billings

I realized in the 1990s that the above reasoning was not airtight. Indeed, it has a gaping hole: we were not even considering modifications of dynamical laws (gravity and inertia). That this was a possibility, even a remote one, came as a profound and deep shock to me. It took me ages of struggle to admit it might be possible, during which I worked hard to save the standard picture. I could not. So it pains me to watch the entire community repeat the same struggle, repeat the same failures, and pretend like it is a success. That last step follows from the zeal of religious conviction: the outcome is predetermined. The answer still HAS TO BE dark matter.

So I asked myself – what if we’re wrong? How could we tell? Once one has accepted that the universe is filled with invisible mass that can’t be detected by any craft known to us, how can we disabuse ourselves of this notion should it happen to be wrong?

One approach that occurred to me was a test in the power spectrum of the cosmic microwave background. Before any of the peaks had been measured, the only clear difference one expected was a bigger second peak with dark matter, and a smaller one without it for the same absolute density of baryons as set by BBN. I’ve written about the lead up to this prediction before, and won’t repeat it here. Rather, I’ll discuss some of the immediate fall out – some of which I’ve only recently pieced together myself.

The first experiment to provide a test of the prediction for the second peak was Boomerang. The second was Maxima-1. I of course checked the new data when they became available. Maxima-1 showed what I expected. So much so that it barely warranted comment. One is only supposed to write a scientific paper when one has something genuinely new to say. This didn’t rise to that level. It was more like checking a tick box. Besides, lots more data were coming; I couldn’t write a new paper every time someone tacked on an extra data point.

There was one difference. The Maxima-1 data had a somewhat higher normalization. The shape of the power spectrum was consistent with that of Boomerang, but the overall amplitude was a bit higher. The latter mattered not at all to my prediction, which was for the relative amplitude of the first to second peaks.

Systematic errors, especially in the amplitude, were likely in early experiments. That’s like rule one of observing the sky. After examining both data sets and the model expectations, I decided the Maxima-1 amplitude was more likely to be correct, so I asked what offset was necessary to reconcile the two. About 14% in temperature. This was, to me, no big deal – it was not relevant to my prediction, and it is exactly the sort of thing one expects to happen in the early days of a new kind of observation. It did seem worth remarking on, if not writing a full blown paper about, so I put it in a conference presentation (McGaugh 2000), which was published in a journal (IJMPA, 16, 1031) as part of the conference proceedings. This correctly anticipated the subsequent recalibration of Boomerang.
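To see why an overall calibration offset cannot affect a peak-ratio test, here is a minimal sketch. The peak heights are made-up placeholders, not Boomerang or Maxima-1 measurements; only the 14% temperature offset comes from the text. Power scales as temperature squared, so a 14% shift in temperature is roughly a 30% shift in power, yet the ratio of the first to second peak is untouched.

```python
# Illustrative only: the peak heights are hypothetical placeholders,
# not measured values from any experiment.


def rescale_power(power, temperature_factor):
    """Apply a temperature calibration factor to a power-spectrum amplitude.

    Power scales as temperature squared.
    """
    return power * temperature_factor ** 2


def peak_ratio(first_peak, second_peak):
    """Relative amplitude of the first to second acoustic peak."""
    return first_peak / second_peak


# Hypothetical peak powers in arbitrary units.
first_peak, second_peak = 5000.0, 2100.0

# The ~14% temperature offset quoted in the text.
offset = 1.14

first_rescaled = rescale_power(first_peak, offset)
second_rescaled = rescale_power(second_peak, offset)

power_change = offset ** 2 - 1.0  # fractional change in power
ratio_before = peak_ratio(first_peak, second_peak)
ratio_after = peak_ratio(first_rescaled, second_rescaled)

print(f"power changes by {power_change:.1%}")
print(f"peak ratio before: {ratio_before:.3f}, after: {ratio_after:.3f}")
```

Any multiplicative recalibration cancels in the ratio, which is why a test phrased in terms of relative peak amplitude is insensitive to the normalization dispute.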

The figure from McGaugh (2000) is below. Basically, I said “gee, looks like the Boomerang calibration needs to be adjusted upwards a bit.” This has been done in the figure. The amplitude of the second peak remained consistent with the prediction for a universe devoid of dark matter. In fact, it got better (see Table 4 of McGaugh 2004).

Plot from McGaugh (2000): The predictions of LCDM (left) and no-CDM (right) compared to Maxima-1 data (open points) and Boomerang data (filled points, corrected in normalization). The LCDM model shown is the most favorable prediction that could be made prior to observation of the first two peaks; other then-viable choices of cosmic parameters predicted a higher second peak. The no-CDM got the relative amplitude right a priori, and remains consistent with subsequent data from WMAP and Planck.

This much was trivial. There was nothing new to see, at least as far as the test I had proposed was concerned. New data were pouring in, but there wasn’t really anything worth commenting on until WMAP data appeared several years later, which persisted in corroborating the peak ratio prediction. By this time, the cosmological community had decided that despite persistent corroborations, my prediction was wrong.

That’s right. I got it right, but then right turned into wrong according to the scuttlebutt of cosmic gossip. This was a falsehood, but it took root, and seems to have become one of the things that cosmologists know for sure that just ain’t so.

How did this come to pass? I don’t know. People never asked me. My first inkling was 2003, when it came up in a chance conversation with Marv Leventhal (then chair of Maryland Astronomy), who opined “too bad the data changed on you.” This shocked me. Nothing relevant in the data had changed, yet here was someone asserting that it had like it was common knowledge. Which I suppose it was by then, just not to me.

Over the years, I’ve had the occasional weird conversation on the subject. In retrospect, I think the weirdness stemmed from a divergence of assumed knowledge. They knew I was right then wrong. I knew the second peak prediction had come true and remained true in all subsequent data, but the third peak was a different matter. So there were many opportunities for confusion. In retrospect, I think many of these people were laboring under the mistaken impression that I had been wrong about the second peak.

I now suspect this started with the discrepancy between the calibration of Boomerang and Maxima-1. People seemed to be aware that my prediction was consistent with the Boomerang data. Then they seem to have confused the prediction with those data. So when the data changed (i.e., Maxima-1 was somewhat different in amplitude), it must follow that the prediction now failed.

This is wrong on many levels. The prediction is independent of the data that test it. It is incredibly sloppy thinking to confuse the two. More importantly, the prediction, as phrased, was not sensitive to this aspect of the data. If one had bothered to measure the ratio in the Maxima-1 data, one would have found a number consistent with the no-CDM prediction. This should be obvious from casual inspection of the figure above. Apparently no one bothered to check. They didn’t even bother to understand the prediction.

Understanding a prediction before dismissing it is not a hard ask. Unless, of course, you already know the answer. Then laziness is not only justified, but the preferred course of action. This sloppy thinking compounds a number of well known cognitive biases (anchoring bias, belief bias, confirmation bias, to name a few).

I mistakenly assumed that other people were seeing the same thing in the data that I saw. It was pretty obvious, after all. (Again, see the figure above.) It did not occur to me back then that other scientists would fail to see the obvious. I fully expected them to complain and try and wriggle out of it, but I could not imagine such complete reality denial.

The reality denial was twofold: clearly, people were looking for any excuse to ignore anything associated with MOND, however indirectly. But they also had no clear prior for LCDM, which I did establish as a point of comparison. A theory is only as good as its prior, and all LCDM models made before these CMB data showed the same thing: a bigger second peak than was observed. This can be fudged: there are ample free parameters, so it can be made to fit; one just had to violate BBN (as it was then known) by three or four sigma.

In retrospect, I think the very first time I had this alternate-reality conversation was at a conference at the University of Chicago in 2001. Andrey Kravtsov had just joined the faculty there, and organized a conference to get things going. He had done some early work on the cusp-core problem, which was still very much a debated thing at the time. So he asked me to come address that topic. I remember being on the plane – a short ride from Cleveland – when I looked at the program. Nearly did a spit take when I saw that I was to give the first talk. There wasn’t a lot of time to organize my transparencies (we still used overhead projectors in those days) but I’d given the talk many times before, so it was enough.

I only talked about the rotation curves of low surface brightness galaxies in the context of the cusp-core problem. That was the mandate. I didn’t talk about MOND or the CMB. There’s only so much you can address in a half hour talk. [This is a recurring problem. No matter what I say, there always seems to be someone who asks “why didn’t you address X?” where X is usually that person’s pet topic. Usually I could do so, but not in the time allotted.]

About halfway through this talk on the cusp-core problem, I guess it became clear that I wasn’t going to talk about things that I hadn’t been asked to talk about, and I was interrupted by Mike Turner, who did want to talk about the CMB. Or rather, extract a confession from me that I had been wrong about it. I forget how he phrased it exactly, but it was the academic equivalent of “Have you stopped beating your wife lately?” Say yes, and you admit to having done so in the past. Say no, and you’re still doing it. What I do clearly remember was him prefacing it with “As a test of your intellectual honesty” as he interrupted to ask a dishonest and intentionally misleading question that was completely off-topic.

Of course, the pretext for his attack question was the Maxima-1 result. He phrased it in a way that I had to agree that it disproved my prediction, or be branded a liar. Now, at the time, there were rumors swirling that the experiment – some of the people who worked on it were there – had detected the third peak, so I thought that was what he was alluding to. Those data had not yet been published and I certainly had not seen them, so I could hardly answer that question. Instead, I answered the “intellectual honesty” affront by pointing to a case where I had said I was wrong. At one point, I thought low surface brightness galaxies might explain the faint blue galaxy problem. On closer examination, it became clear that they could not provide a complete explanation, so I said so. Intellectual honesty is really important to me, and should be to all scientists. I have no problem admitting when I’m wrong. But I do have a problem with demands to admit that I’m wrong when I’m not.

To me, it was obvious that the Maxima-1 data were consistent with the second peak. The plot above was already published by then. So it never occurred to me that he thought the Maxima-1 data were in conflict with what I had predicted – it was already known that it was not. Only to him, it was already known that it was. Or so I gather – I have no way to know what others were thinking. But it appears that this was the juncture in which the field suffered a psychotic break. We are not operating on the same set of basic facts. There has been a divergence in personal realities ever since.

Arthur Kosowsky gave the summary talk at the end of the conference. He told me that he wanted to address the elephant in the room: MOND. I did not think the assembled crowd of luminary cosmologists were mature enough for that, so advised against going there. He did, and was incredibly careful in what he said: empirical, factual, posing questions rather than making assertions. Why does MOND work as well as it does?

The room dissolved into chaotic shouting. Every participant was vying to say something wrong more loudly than the person next to him. (Yes, everyone shouting was male.) Joel Primack managed to say something loudly enough for it to stick with me, asserting that gravitational lensing contradicted MOND in a way that I had already shown it did not. It was just one of dozens of superficial falsehoods that people take for granted to be true if they align with their confirmation bias.

The uproar settled down, the conference was over, and we started to disperse. I wanted to offer Arthur my condolences, having been in that position many times. Anatoly Klypin was still giving it to him, keeping up a steady stream of invective as everyone else moved on. I couldn’t get a word in edgewise, and had a plane home to catch. So when I briefly caught Arthur’s eye, I just said “told you” and moved on. Anatoly paused briefly, apparently fathoming that his behavior, like that of the assembled crowd, was entirely predictable. Then the moment of awkward self-awareness passed, and he resumed haranguing Arthur.

Divergence

Reality check

Before we can agree on the interpretation of a set of facts, we have to agree on what those facts are. Even if we agree on the facts, we can differ about their interpretation. It is OK to disagree, and anyone who practices astrophysics is going to be wrong from time to time. It is the inevitable risk we take in trying to understand a universe that is vast beyond human comprehension. Heck, some people have made successful careers out of being wrong. This is OK, so long as we recognize and correct our mistakes. That’s a painful process, and there is an urge in human nature to deny such things, to pretend they never happened, or to assert that what was wrong was right all along.

This happens a lot, and it leads to a lot of weirdness. Beyond the many people in the field whom I already know personally, I tend to meet two kinds of scientists. There are those (usually other astronomers and astrophysicists) who might be familiar with my work on low surface brightness galaxies or galaxy evolution or stellar populations or the gas content of galaxies or the oxygen abundances of extragalactic HII regions or the Tully-Fisher relation or the cusp-core problem or faint blue galaxies or big bang nucleosynthesis or high redshift structure formation or joint constraints on cosmological parameters. These people behave like normal human beings. Then there are those (usually particle physicists) who have only heard of me in the context of MOND. These people often do not behave like normal human beings. They conflate me as a person with a theory that is Milgrom’s. They seem to believe that both are evil and must be destroyed. My presence, even the mere mention of my name, easily destabilizes their surprisingly fragile grasp on sanity.

One of the things that scientists-gone-crazy do is project their insecurities about the dark matter paradigm onto me. People who barely know me frequently attribute to me motivations that I neither have nor recognize. They presume that I have some anti-cosmology, anti-DM, pro-MOND agenda, and are remarkably comfortable about asserting to me what it is that I believe. What they never explain, or apparently bother to consider, is why I would be so obtuse. What is my motivation? I certainly don’t enjoy having the same argument over and over again with their ilk, which is the only thing it seems to get me.

The only agenda I have is a pro-science agenda. I want to know how the universe works.

This agenda is not theory-specific. In addition to lots of other astrophysics, I have worked on both dark matter and MOND. I will continue to work on both until we have a better understanding of how the universe works. Right now we’re very far away from obtaining that goal. Anyone who tells you otherwise is fooling themselves – usually by dint of ignoring inconvenient aspects of the evidence. Everyone is susceptible to cognitive dissonance. Scientists are no exception – I struggle with it all the time. What disturbs me is the number of scientists who apparently do not. The field is being overrun with posers who lack the self-awareness to question their own assumptions and biases.

So, I feel like I’m repeating myself here, but let me state my bias. Oh wait. I already did. That’s why it felt like repetition. It is.

The following bit of this post is adapted from an old web page I wrote well over a decade ago. I’ve lost track of exactly when – the file has been through many changes in computer systems, and unix only records the last edit date. For the linked page, that’s 2016, when I added a few comments. The original is much older, and was written while I was at the University of Maryland. Judging from the html style, it was probably early to mid-’00s. Of course, the sentiment is much older, as it shouldn’t need to be said at all.

I will make a few updates as seem appropriate, so check the link if you want to see the changes. I will add new material at the end.


Long standing remarks on intellectual honesty

The debate about MOND often degenerates into something that falls well short of the sober, objective discussion that is supposed to characterize scientific debates. One can tell when voices are raised and baseless ad hominem accusations made. I have, with disturbing frequency, found myself accused of partisanship and intellectual dishonesty, usually by people who are as fair and balanced as Fox News.

Let me state with absolute clarity that intellectual honesty is a bedrock principle of mine. My attitude is summed up well by the quote

When a man lies, he murders some part of the world.

Paul Gerhardt

I first heard this spoken by the character Merlin in the movie Excalibur (1981 version). Others may have heard it in a song by Metallica. As best I can tell, it is originally attributable to the 17th century cleric Paul Gerhardt.

This is a great quote for science, as the intent is clear. We don’t get to pick and choose our facts. Outright lying about them is antithetical to science.

I would extend this to ignoring facts. One should not only be honest, but also as complete as possible. It does not suffice to be truthful while leaving unpleasant or unpopular facts unsaid. This is lying by omission.

I “grew up” believing in dark matter. Specifically, Cold Dark Matter, presumably a WIMP. I didn’t think MOND was wrong so much as I didn’t think about it at all. Barely heard of it; not worth the bother. So I was shocked – and angered – when its predictions came true in my data for low surface brightness galaxies. So I understand when my colleagues have the same reaction.

Nevertheless, Milgrom got the prediction right. I had a prediction, it was wrong. There were other conventional predictions, they were also wrong. Indeed, dark matter based theories generically have a very hard time explaining these data. In a Bayesian sense, given the prior that we live in a ΛCDM universe, the probability that MONDian phenomenology would be observed is practically zero. Yet it is. (This is very well established, and has been for some time.)

So – confronted with an unpopular theory that nevertheless had some important predictions come true, I reported that fact. I could have ignored it, pretended it didn’t happen, covered my eyes and shouted LA LA LA NOT LISTENING. With the benefit of hindsight, that certainly would have been the savvy career move. But it would also be ignoring a fact, and tantamount to a lie.

In short, though it was painful and protracted, I changed my mind. Isn’t that what the scientific method says we’re supposed to do when confronted with experimental evidence?

That was my experience. When confronted with evidence that contradicted my preexisting world view, I was deeply troubled. I tried to reject it. I did an enormous amount of fact-checking. The people who presume I must be wrong have not had this experience, and haven’t bothered to do any fact-checking. Why bother when you already are sure of the answer?


Willful Ignorance

I understand being skeptical about MOND. I understand being more comfortable with dark matter. That’s where I started from myself, so as I said above, I can empathize with people who come to the problem this way. This is a perfectly reasonable place to start.

For me, that was over a quarter century ago. I can understand there being some time lag. That is not what is going on. There has been ample time to process and assimilate this information. Instead, most physicists have chosen to remain ignorant. Worse, many persist in spreading what can only be described as misinformation. I don’t think they are liars; rather, it seems that they believe their own bullshit.

To give an example of disinformation, I still hear statements like “MOND fits rotation curves but nothing else.” This is not true. The first thing I did was check into exactly that. Years of fact-checking went into McGaugh & de Blok (1998), and I’ve done plenty more since. It came as a great surprise to me that MOND explained the vast majority of the data as well as or better than dark matter. Not everything, to be sure, but lots more than “just” rotation curves. Yet this old falsehood still gets repeated as if it were not a misconception that was put to rest in the previous century. We’re stuck in the dark ages by choice.

It is not a defensible choice. There is no excuse to remain ignorant of MOND at this juncture in the progress of astrophysics. It is incredibly biased to point to its failings without contending with its many predictive successes. It is tragi-comically absurd to assume that dark matter provides a better explanation when it cannot make the same predictions in advance. MOND may not be correct in every particular, and makes no pretense to be a complete theory of everything. But it is demonstrably less wrong than dark matter when it comes to predicting the dynamics of systems in the low acceleration regime. Pretending like this means nothing is tantamount to ignoring essential facts.

Even a lie of omission murders a part of the world.

Galaxy Stellar and Halo Masses: tension between abundance matching and kinematics

Mass is a basic quantity. How much stuff does an astronomical object contain? For a galaxy, mass can mean many different things: that of its stars, stellar remnants (e.g., white dwarfs, neutron stars), atomic gas, molecular clouds, plasma (ionized gas), dust, Bok globules, black holes, habitable planets, biomass, intelligent life, very small rocks… these are all very different numbers for the same galaxy, because galaxies contain lots of different things. Two things that many scientists have settled on as Very Important are a galaxy’s stellar mass and its dark matter halo mass.

The mass of a galaxy’s dark matter halo is not well known. Most measurements provide only lower limits, as tracers fade out before any clear end is reached. Consequently, the “total” mass is a rather notional quantity. So we’ve adopted as a convention the mass M200 contained within an over-density of 200 times the critical density of the universe. This is a choice motivated by an ex-theory that would take an entire post to explain unsatisfactorily, so do not question the convention: all choices are bad, so we stick with it.
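For concreteness, the convention can be written out. This is the standard textbook form rather than anything specific to this post, and the symbols R200 (the radius within which the mean density is 200 times critical), H0 (the Hubble constant), and V200 (the circular speed at R200) are introduced here only for illustration.

```latex
\rho_{\rm crit} = \frac{3 H_0^2}{8 \pi G}, \qquad
M_{200} = \frac{4}{3} \pi R_{200}^3 \left( 200\, \rho_{\rm crit} \right)
        = \frac{100\, H_0^2\, R_{200}^3}{G}, \qquad
V_{200} = \sqrt{\frac{G M_{200}}{R_{200}}} = 10\, H_0\, R_{200}
```

Under this convention the mass, radius, and velocity are interchangeable labels for the same notional halo, with M200 scaling as the cube of either R200 or V200.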

One of the long-standing problems the cold dark matter paradigm has is that the galaxy luminosity function should be steep but is observed to be shallow. This sketch shows the basic issue. The number density of dark matter halos as a function of mass is expected to be a power law – one that is well specified once the cosmology is known and a convention for the mass is adopted. The obvious expectation is that the galaxy luminosity function should just be a downshifted version of the halo mass function: one galaxy per halo, with the stellar mass proportional to the halo mass. This was such an obvious assumption [being provision (i) of canonical galaxy formation in LCDM] that it was not seriously questioned for over a decade. (Minor point: a turn down at the high mass end could be attributed to gas cooling times: the universe didn’t have time to cool and assemble a galaxy above some threshold mass, but smaller things had plenty of time for gas to cool and form stars.)

The number density of galaxies (blue) and dark matter halos (red) as a function of their mass. Our original expectation is on the left: the galaxy mass function should be a down-shifted version of the halo mass function, up to a gas cooling limit. Dashed grey lines illustrate the correspondence of galaxies with dark matter halos of proportional mass: M* = md M200. On the right is the current picture of abundance matching with the grey lines connecting galaxies with dark matter halos of equal cosmic density in which they are supposed to reside. In effect, we make the proportionality factor md a rolling, mass-dependent fudge factor.

The galaxy luminosity function does not look like a shifted version of the halo mass function. It has the wrong slope at the faint end. At no point is the size of the shift equal to what one would expect from the mass of available baryons. The proportionality factor md is too small; this is sometimes called the over-cooling problem, in that a lot more baryons should have cooled to form stars than apparently did so. So, aside from the shape and the normalization, it’s a great match.

We obsessed about this problem all through the ’90s. At one point, I thought I had solved it. Low surface brightness galaxies were under-represented in galaxy surveys. They weren’t missed entirely, but their masses could be systematically underestimated. This might matter a lot because the associated volume corrections are huge. A small systematic in mass would get magnified into a big one in density. Sadly, after a brief period of optimism, it became clear that this could not work to solve the entire problem, which persists.

Circa 2000, a local version of the problem became known as the missing satellites problem. This is a down-shifted version of the mismatch between the galaxy luminosity function and the halo mass function that pervades the entire universe: few small galaxies are observed where many are predicted. To give visual life to the numbers we’re talking about, here is an image of the dark matter in a simulation of a Milky Way size galaxy:

Dark Matter in the Via Lactea simulation (Diemand et al. 2008). The central region is the main dark matter halo which would contain a large galaxy like the Milky Way. All the lesser blobs are subhalos. A typical galaxy-sized dark matter halo should contain many, many subhalos. Naively, we expect each subhalo to contain a dwarf satellite galaxy. Structure is scale-free in CDM, so major galaxies should look like miniature clusters of galaxies.

In contrast, real galaxies have rather fewer satellites that meet the eye:

NGC 6946 and environs. The points are foreground stars, ignore them. The neighborhood of NGC 6946 appears to be pretty empty – there is no swarm of satellite galaxies as in the simulation above. I know of two dwarf satellite galaxies in this image, both of low surface brightness. The brighter one (KK98-250) the sharp-eyed may find between the bright stars at top right. The fainter one (KK98-251) is nearby KK98-250, a bit down and to the left of it; good luck seeing it on this image from the Digital Sky Survey. That’s it. There are no other satellite galaxies visible here. There can of course be more that are too low in surface brightness to detect. The obvious assumption of a one-to-one relation between stellar and halo mass cannot be sustained; there must instead be a highly non-linear relation between mass and light so that subhalos contain only dwarfs of extraordinarily low surface brightness.

By 2010, we’d thrown in the towel, and decided to just accept that this aspect of the universe was too complicated to predict. The story now is that feedback changes the shape of the luminosity function at both the faint and the bright ends. Exactly how depends on who you ask, but the predicted halo mass function is sacrosanct so there must be physical processes that make it so. (This is an example of the Frenk Principle in action.)

Lacking a predictive theory, theorists instead came up with a clever trick to relate galaxies to their dark matter halos. This has come to be known as abundance matching. We measure the number density of galaxies as a function of stellar mass. We know, from theory, what the number density of dark matter halos should be as a function of halo mass. Then we match them up: galaxies of a given density live in halos of the corresponding density, as illustrated by the horizontal gray lines in the right panel of the figure above.
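The matching procedure just described can be sketched in a few lines. This is a toy version under loud assumptions: both the galaxy stellar mass function and the halo mass function are stand-in Schechter-type curves with made-up parameters (shallow faint-end slope for galaxies, steep for halos), not fits to data or to any simulation, and the function names are hypothetical.

```python
import numpy as np

def schechter_dn_dlogm(log_m, phi_star, log_m_char, alpha):
    """Number density per dex for a Schechter-type mass function."""
    x = 10.0 ** (log_m - log_m_char)
    return np.log(10.0) * phi_star * x ** (alpha + 1.0) * np.exp(-x)

def cumulative_density(log_m, dn_dlogm):
    """n(>M): trapezoid integral from each grid mass up to the top of the grid."""
    steps = 0.5 * (dn_dlogm[1:] + dn_dlogm[:-1]) * np.diff(log_m)
    return np.concatenate([np.cumsum(steps[::-1])[::-1], [0.0]])

def abundance_match(log_mstar):
    """Return log10 halo mass with the same cumulative density as log_mstar."""
    log_ms = np.arange(7.0, 13.0, 0.01)
    log_mh = np.arange(9.0, 16.0, 0.01)
    # Illustrative parameters only (assumed, not measured)
    n_gal = cumulative_density(
        log_ms, schechter_dn_dlogm(log_ms, 4e-3, 10.8, -1.3))
    n_halo = cumulative_density(
        log_mh, schechter_dn_dlogm(log_mh, 3e-5, 14.0, -1.9))
    # Cumulative density of galaxies above the requested stellar mass
    log_n = np.interp(log_mstar, log_ms[:-1], np.log10(n_gal[:-1]))
    # Halo mass at that same density (reverse so densities are increasing)
    return np.interp(log_n, np.log10(n_halo[:-1])[::-1], log_mh[:-1][::-1])

if __name__ == "__main__":
    for lm in (8.0, 10.0, 11.5):
        lh = float(abundance_match(lm))
        print(f"log M* = {lm:4.1f} -> log M200 = {lh:5.2f}, "
              f"M*/M200 = {10.0 ** (lm - lh):.4f}")
```

Even with invented inputs, the qualitative behavior follows from the mismatch in slopes: a wide range of stellar mass maps onto a narrow range of halo mass at the faint end, and the ratio of stellar to halo mass peaks at intermediate mass.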

There have now been a number of efforts to quantify this. Four examples are given in the figure below (see this paper for references), together with kinematic mass estimates.

The ratio of stellar to halo mass as a function of dark matter halo mass. Lines represent the abundance matching relations derived by assigning galaxies to dark matter halos based on their cosmic abundance. Points are independent halo mass estimates based on kinematics (McGaugh et al. 2010). The horizontal dashed line represents the maximum stellar mass that would result if all available baryons were turned into stars. (Mathematically, this happens when md equals the cosmic baryon fraction, about 15%.)

The abundance matching relations have a peak around a halo mass of 10^12 M☉ and fall off to either side. This corresponds to the knee in the galaxy luminosity function. For whatever reason, halos of this mass seem to be most efficient at converting their available baryons into stars. The shape of these relations means that there is a non-linear relation between stellar mass and halo mass. At the low mass end, a big range in stellar mass is compressed into a small range in halo mass. The opposite happens at high mass, where the most massive galaxies are generally presumed to be the “central” galaxy of a cluster of galaxies. We assign the most massive halos to big galaxies understanding that they may be surrounded by many subhalos, each containing a cluster galaxy.

Around the same time, I made a similar plot, but using kinematic measurements to estimate halo masses. Both methods are fraught with potential systematics, but they seem to agree reasonably well – at least over the range illustrated above. It gets dodgy above and below that. The agreement is particularly good for lower mass galaxies. There seems to be a departure for the most massive individual galaxies, but why worry about that when the glass is 3/4 full?

Skip ahead a decade, and some people think we’ve solved the missing satellite problem. One key ingredient of that solution is that the Milky Way resides in a halo that is on the lower end of the mass range that has traditionally been estimated for it (1 to 2 x 10^12 M☉). This helps because the number of subhalos scales with mass: clusters are big halos with lots of galaxy-size halos; the Milky Way is a galaxy-sized halo with lots of smaller subhalos. Reality does not look like that, but having a lower mass means fewer subhalos, so that helps. It does not suffice. We must invoke feedback effects to make the relation between light and mass nonlinear. Then the lowest mass satellites may be too dim to detect: selection effects have to do a lot of work. It also helps to assume the distribution of satellites is isotropic, which looks to be true in the simulation, but not so much in reality where known dwarf satellites occupy a planar distribution. We also need to somehow fudge the too-big-to-fail problem, in which the more massive subhalos appear not to be occupied by luminous galaxies at all. Given all that, we can kinda sorta get in the right ballpark. Kinda, sorta, provided that we live in a galaxy whose halo mass is closer to 10^12 M☉ than to 2 x 10^12 M☉.

At an IAU meeting in Shanghai (in July 2019, before travel restrictions), the subject of the mass of the Milky Way was discussed at length. It being our home galaxy, there are many ways in which to constrain the mass, some of which take advantage of tracers that go out to greater distances than we can obtain elsewhere. Speaker after speaker used different methods to come to a similar conclusion, with the consensus hedging on the low side (roughly 1 – 1.5 x 10^12 M☉). A nice consequence would be that the missing satellite problem may no longer be a problem.

Galaxies in general and the Milky Way in particular are different and largely distinct subfields. Different data studied by different people with distinctive cultures. In the discussion at the end of the session, Pieter van Dokkum pointed out that from the perspective of other galaxies, the halo mass ought to follow from abundance matching, which for a galaxy like the Milky Way ought to be more like 3 x 10^12 M☉, considerably more than anyone had suggested, but hard to exclude because most of that mass could be at distances beyond the reach of the available tracers.

This was not well received.

The session was followed by a coffee break, and I happened to find myself standing in line next to Pieter. I was still processing his comment, and decided he was right – from a certain point of view. So we got to talking about it, and wound up making the plot below, which appears in a short research note. (For those who know the field, it might be assumed that Pieter and I hate each other. This is not true, but we do frequently disagree, so the fact that we do agree about this is itself worthy of note.)

The Local Group and its two most massive galaxies, the Milky Way and Andromeda (M31), in the stellar mass-halo mass plane. Lines are the abundance matching relations from above. See McGaugh & van Dokkum for further details. The remaining galaxies of the Local Group all fall off the edge of this plot, and do not add up to anything close to either the Milky Way or Andromeda alone.

The Milky Way and Andromeda are the 10^12 M☉ gorillas of the Local Group. There are many dozens of dwarf galaxies, but none of them are comparable in mass, even with the boost provided by the non-linear relation between mass and luminosity. To astronomical accuracy, in terms of mass, the Milky Way plus Andromeda are the Local Group. There are many distinct constraints, on each galaxy as an individual, and on the Local Group as a whole. Any way we slice it, all three entities lie well off the relation expected from abundance matching.

There are several ways one could take it from here. One might suppose that abundance matching is correct, and we have underestimated the mass with other measurements. This happens all the time with rotation curves, which typically do not extend far enough out into the halo to give a good constraint on the total mass. This is hard to maintain for the Local Group, where we have lots of tracers in the form of dwarf satellites, and there are constraints on the motions of galaxies on still larger scales. Moreover, a high mass would be tragic for the missing satellite problem.

One might instead imagine that there is some scatter in the abundance matching relation, and we just happen to live in a galaxy that has a somewhat low mass for its luminosity. This is almost reasonable for the Milky Way, as there is some overlap between kinematic mass estimates and the expectations of abundance matching. But the missing satellite problem bites again unless we are pretty far off the central value of the abundance matching relation. Other Milky Way-like galaxies ought to fall on the other end of the spectrum, with more mass and more satellites. A lot of work is going on to look for satellites around other spirals, which is hard work (see NGC 6946 above). There is certainly scatter in the number of satellites from system to system, but whether this is theoretically sensible or enough to explain our Milky Way is not yet apparent.

There is a tendency in the literature to invoke scatter when and where needed. Here, it is important to bear in mind that there is little scatter in the Tully-Fisher relation. This is a relation between stellar mass and rotation velocity, with the latter supposedly set by the halo mass. We can’t have it both ways. Lots of scatter in the stellar mass-halo mass relation ought to cause a corresponding amount of scatter in Tully-Fisher. This is not observed. It is a much stronger constraint than most people seem to appreciate, as even subtle effects are readily perceptible. Consequently, I think it unlikely that we can nuance the relation between halo mass and observed rotation speed to satisfy both relations without a lot of fine-tuning, which is usually a sign that something is wrong.
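A rough way to see how scatter in one relation feeds the other is sketched here. It assumes that the observed rotation speed simply tracks the halo's characteristic velocity, which scales as the cube root of M200 by the definition of M200, and it assumes a Tully-Fisher slope of 4 (mass proportional to the fourth power of velocity). Both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def propagate_scatter(sigma_log_mhalo, tf_slope=4.0):
    """Propagate scatter in log halo mass (at fixed stellar mass) to Tully-Fisher.

    Assumes V is proportional to M200^(1/3), so scatter in log V is one third
    of the scatter in log M200. The default slope of 4 is an assumed value.
    Returns (scatter in log V, equivalent scatter in log mass at fixed V).
    """
    sigma_log_v = sigma_log_mhalo / 3.0
    sigma_log_mass = tf_slope * sigma_log_v
    return sigma_log_v, sigma_log_mass

if __name__ == "__main__":
    # Example input: a factor of two (0.3 dex) scatter in halo mass
    s_v, s_m = propagate_scatter(0.3)
    print(f"scatter in log V: {s_v:.2f} dex; in log M at fixed V: {s_m:.2f} dex")
```

In this simple picture the cube root softens the scatter in velocity, but the steepness of Tully-Fisher hands it back when expressed as mass at fixed rotation speed.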

There are a lot of moving parts in modern galaxy formation simulations that need to be fine-tuned: the effects of halo mass, merging, dissipation, [non]adiabatic compression, angular momentum transport, gas cooling, on-going accretion of gas from the intergalactic medium, expulsion of gas in galactic winds, re-accretion of expelled gas via galactic fountains, star formation and the ensuing feedback from radiation pressure, stellar winds, supernovae, X-rays from stellar remnants, active galactic nuclei, and undoubtedly other effects I don’t recall off the top of my head. Visualization from the Dr. Seuss suite of simulations.

A lot of effort has been put into beating down the missing satellite problem around the Milky Way. Matters are worse for Andromeda. Kinematic halo mass estimates are typically in the same ballpark as the Milky Way. Some are a bit bigger, some are lower. Lower is a surprise, because the stellar mass of M31 is clearly bigger than that of the Milky Way, placing it above the turnover where the efficiency of star formation is maximized. In this regime, a little stellar mass goes a long way in terms of halo mass. Abundance matching predicts that a galaxy of Andromeda’s stellar mass should reside in a dark matter halo of at least 10^13 M☉. That’s quite a bit more than 1 or 2 x 10^12 M☉, even by astronomical standards. Put another way, according to abundance matching, the Local Group should have the Milky Way as its most massive occupant. Just the Milky Way. Not the Milky Way plus Andromeda. Despite this, the Local Group is not anomalous among similar groups.

Words matter. A lot boils down to what we consider to be “close enough” to call similar. I do not consider the Milky Way and Andromeda to be all that similar. They are both giant spirals, yes, but galaxies are all individuals. Being composed of hundreds of billions of stars, give or take, leaves a lot of room for differences. In this case, the Milky Way and Andromeda are easily distinguished in the Tully-Fisher plane. Andromeda is about twice the baryonic mass of the Milky Way. It also rotates faster. The error bars on these quantities do not come close to overlapping – that would be one criterion for considering them to be similar – a criterion they do not meet. Even then, there could be other features that might be readily distinguished, but let’s say a rough equality in the Tully-Fisher plane would indicate stellar and halo masses that are “close enough” for our present discussion. They aren’t: to me, the Milky Way and M31 are clearly different galaxies.

I spent a fair amount of time reading the recent literature on satellite searches, and I was struck by the ubiquity with which people make the opposite assumption, treating the Milky Way and Andromeda as interchangeable galaxies of similar mass. Why would they do this? If one looks at the kinematic halo mass as the defining characteristic of a galaxy, they’re both close to 10^12 M☉, with overlapping error bars on M200. By that standard, it seems fair. Is it?

Luminosity is observable. Rotation speed is observable. There are arguments to be had about how to convert luminosity into stellar mass, and what rotation speed measure is “best.” These are sometimes big arguments, but they are tiny in scale compared to estimating notional quantities like the halo mass. The mass M200 is not an observable quantity. As such, we have no business using it as a defining characteristic of a galaxy. You know a galaxy when you see it. The same cannot be said of a dark matter halo. Literally.

If, for some theoretically motivated reason, we want to use halo mass as a standard then we need to at least use a consistent method to assess its value from directly observable quantities. The methods we use for the Milky Way and M31 are not applicable beyond the Local Group. Nowhere else in the universe do we have such an intimate picture of the kinematic mass from a wide array of independent methods with tracers extending to such large radii. There are other standards we could apply, like the Tully-Fisher relation. That we can do outside the Local Group, but by that standard we would not infer that M31 and the Milky Way are the same. Other observables we can fairly apply to other galaxies are their luminosities (stellar masses) and cosmic number densities (abundance matching). From that perspective, what we know from all the other galaxies in the universe is that the factor of ~2 difference in stellar mass between Andromeda and the Milky Way should be huge in terms of halo mass. If it were anywhere else in the universe, we wouldn’t treat these two galaxies as interchangeably equal. This is the essence of Pieter’s insight: abundance matching is all about the abundance of dark matter halos, so that would seem to be the appropriate metric by which to predict the expected number of satellites, not the kinematic halo mass that we can’t measure in the same way anywhere else in the universe.

That isn’t to say we don’t have some handle on kinematic halo masses, it’s just that most of that information comes from rotation curves that don’t typically extend as far as the tracers that we have in the Local Group. Some rotation curves are more extended than others, so one has to account for that variation. Typically, we can only put a lower limit on the halo mass, but if we assume a profile like NFW – the standard thing to do in LCDM – then we can sometimes exclude halos that are too massive.

Abundance matching has become important enough to LCDM that we included it as a prior in fitting dark matter halo models to rotation curves. For example:

The stellar mass-halo mass relation from rotation curve fits (Li et al 2020). Each point is one galaxy; the expected abundance matching relation (line) is not recovered (left) unless it is imposed as a prior (right). The data are generally OK with this because the amount of mass at radii beyond the end of the rotation curve is not strongly constrained. Still, there are some limits on how crazy this can get.

NFW halos are self-similar: low mass halos look very much like high mass halos over the range that is constrained by data. Consequently, if you have some idea what the total mass of the halo should be, as abundance matching provides, and you impose that as a prior, the fits for most galaxies say “OK.” The data covering the visible galaxy have little power to constrain what is going on with the dark matter halo at much larger radii, so the fits literally fall into line when told to do so, as seen in Pengfei’s work.

That we can impose abundance matching as a prior does not necessarily mean the result is reasonable. The highest halo masses that abundance matching wants in the plot above are crazy talk from a kinematic perspective. I didn’t put too much stock in this, as the NFW halo itself, the go-to standard of LCDM, provides the worst description of the data among all the dozen or so halo models that we considered. Still, we did notice that even with abundance matching imposed as a prior, there are a lot more points above the line than below it at the high mass end (above the bend in the figure above). The rotation curves are sometimes pushing back against the imposed prior; they often don’t want such a high halo mass. This was explored in some detail by Posti et al., who found a similar effect.

I decided to turn the question around. Can we use abundance matching to predict the halo and hence rotation curve of a massive galaxy? The largest spiral in the local universe, UGC 2885, has one of the most extended rotation curves known, meaning that it does provide some constraint on the halo mass. This galaxy has been known as an important case since Vera Rubin’s work in the ’70s. With a modern distance scale, its rotation curve extends out 80 kpc. That’s over a quarter million light-years – a damn long way, even by the standards of galaxies. It also rotates remarkably fast, just shy of 300 km/s. It is big and massive.

(As an aside, Vera once offered a prize for anyone who found a disk that rotated faster than 300 km/s. Throughout her years of looking at hundreds of galaxies, UGC 2885 remained the record holder, with 300 seeming to be a threshold that spirals did not exceed. She told me that she did pay out, but on a technicality: someone showed her a gas disk around a supermassive black hole in Keplerian rotation that went up to 500 km/s at its peak. She lamented that she had been imprecise in her language, as that was nothing like what she meant, which was the flat rotation speed of a spiral galaxy.)

That aside aside, if we take abundance matching at face value, then the stellar mass of a galaxy predicts the mass of its dark matter halo. Using the most conservative (in that it returns the lowest halo mass) of the various abundance matching relations indicates that with a stellar mass of about 2 x 10^11 M☉, UGC 2885 should have a halo mass of 3 x 10^13 M☉. Combining this with a well-known relation between halo concentration and mass for NFW halos, we then know what the rotation curve should be. Doing this for UGC 2885 yields a tragic result:

The extended rotation curve of UGC 2885 (points). The declining dotted line is the rotation curve predicted by the observed stars and gas. The rising dashed line is the halo predicted by abundance matching. Combining this halo with the observed stars and gas should result in the solid line. This greatly exceeds the data. UGC 2885 does not reside in an NFW halo that is anywhere near as massive as predicted by abundance matching.
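The halo part of that prediction can be sketched as follows. This is a minimal illustration of the circular velocity of an NFW halo, not a reproduction of the fit in the figure: the Hubble constant (70 km/s/Mpc) and the concentration (c = 6) are assumed values chosen for illustration, since the specific concentration-mass relation used is not given here, and the function names are hypothetical.

```python
import math

# Gravitational constant in kpc (km/s)^2 / Msun
G = 4.30091e-6

def v200_km_s(m200_msun, h0_km_s_mpc=70.0):
    """Circular velocity at R200, from M200 = V200^3 / (10 G H0)."""
    h0 = h0_km_s_mpc / 1000.0  # km/s/kpc
    return (10.0 * G * h0 * m200_msun) ** (1.0 / 3.0)

def nfw_velocity(r_kpc, m200_msun, c, h0_km_s_mpc=70.0):
    """Circular velocity of an NFW halo at radius r (halo only, no stars or gas)."""
    v200 = v200_km_s(m200_msun, h0_km_s_mpc)
    r200 = v200 / (10.0 * h0_km_s_mpc / 1000.0)  # kpc
    x = r_kpc / r200

    def mu(y):
        # Dimensionless enclosed-mass function of the NFW profile
        return math.log(1.0 + y) - y / (1.0 + y)

    return v200 * math.sqrt(mu(c * x) / (x * mu(c)))

if __name__ == "__main__":
    # Halo mass predicted by abundance matching vs. the mass from the fit,
    # both evaluated at the last measured radius of 80 kpc with assumed c = 6
    for m200 in (3e13, 5e12):
        v = nfw_velocity(80.0, m200, c=6.0)
        print(f"M200 = {m200:.0e} Msun -> V_halo(80 kpc) = {v:.0f} km/s")
```

With these assumed inputs, the halo of 3 x 10^13 solar masses alone already demands a rotation speed at 80 kpc well above the roughly 300 km/s that is observed, before the stars and gas are even added.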

The data do not allow for the predicted amount of dark matter. If we fit the rotation curve, we obtain a “mere” M200 = 5 x 10^12 M☉. Note that this means that UGC 2885 is basically the Milky Way and Andromeda added together in terms of both stellar mass and halo mass – if added to the M*-M200 plot above, it would land very close to the open circle representing the more massive halo estimate for the combination of MW+M31, and be just as discrepant from the abundance matching relations. We get the same result regardless of which direction we look at it from.

Objectively, 5 x 10^12 M☉ is a huge dark matter halo for a single galaxy. It’s just not the yet-more massive halo that is predicted by abundance matching. In this context, UGC 2885 apparently has a serious missing satellites problem, as it does not appear to be swimming in a sea of satellite galaxies the way we’d expect for the central galaxy of such a high mass halo.

UGC 2885 appears to be pretty lonely in this image from the DSS. I see a few candidate satellite galaxies amidst the numerous foreground stars, but nothing like what you’d expect for dark matter subhalos from a simulation like Via Lactea. This impression does not change when imaged in more detail with HST.

It is tempting to write this off as a curious anecdote. Another outlier. Sure, that’s always possible, but this is more than a bit ridiculous. Anyone who wants to go this route I refer to Snoop Dogg.

I spent much of my early career obsessed with selection effects. These preclude us from seeing low surface brightness galaxies as readily as brighter ones. However, it isn’t binary – a galaxy has to be extraordinarily low surface brightness before it becomes effectively invisible. The selection effect is a bias – and a very strong one – but not an absolute screen that prevents us from finding low surface brightness galaxies. That makes it very hard to sustain the popular notion that there are lots of subhalos that simply contain ultradiffuse galaxies that cannot currently be seen. I’ve been down this road many times as an optimist in favor of this interpretation. It hasn’t worked out. Selection effects are huge, but still nowhere near big enough to overcome the required deficit.

Having the satellite galaxies that inhabit subhalos be low in surface brightness is a necessary but not sufficient criterion. It is also necessary to have a highly non-linear stellar mass-halo mass relation at low mass. In effect, luminosity and halo mass become decoupled: satellite galaxies spanning a vast range in luminosity must live in dark matter halos that cover only a tiny range. This means that it should not be possible to predict stellar motions in these galaxies from their luminosity. The relation between mass and light has just become too weak and messy.

And yet, we can do exactly that. Over and over again. This simply should not be possible in LCDM.