The last post was basically an introduction to this one, which is about the recent work of Pengfei Li. In order to test a theory, we need to establish its prior. What do we expect?
The prior for fully formed galaxies after 13 billion years of accretion and evolution is not an easy problem. The dark matter halos need to form first, with the baryonic component assembling afterwards. We know from dark matter-only structure formation simulations that the initial condition (A) of the dark matter halo should resemble an NFW halo, and from observations that the end product of baryonic assembly needs to look like a real galaxy (Z). How the universe gets from A to Z is a whole alphabet of complications.
The simplest thing we can do is ignore B-Y and combine a model galaxy with a model dark matter halo. The simplest model for a spiral galaxy is an exponential disk. True to its name, the azimuthally averaged stellar surface density falls off exponentially from a central value over some scale length. This is a tolerable approximation of the stellar disks of spiral galaxies, ignoring their central bulges and their gas content. It is an inadequate yet surprisingly decent starting point for describing gravitationally bound collections of hundreds of billions of stars with just two parameters.
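The two parameters are the central surface density and the scale length. As a minimal sketch, with illustrative values that are assumptions rather than numbers taken from any fit, the profile and the mass it encloses can be written as:

```python
import math

def surface_density(R, sigma0, Rd):
    """Exponential disk: Sigma(R) = sigma0 * exp(-R / Rd)."""
    return sigma0 * math.exp(-R / Rd)

def enclosed_mass(R, sigma0, Rd):
    """Mass inside radius R, from integrating 2*pi*R*Sigma(R) dR from 0 to R."""
    x = R / Rd
    return 2.0 * math.pi * sigma0 * Rd**2 * (1.0 - (1.0 + x) * math.exp(-x))

# Assumed, illustrative parameters: central surface density in Msun/kpc^2
# and scale length in kpc.
sigma0 = 5.0e8
Rd = 3.0

total_mass = 2.0 * math.pi * sigma0 * Rd**2   # limit of enclosed_mass as R -> infinity
for n in (1, 2, 4):
    drop = surface_density(n * Rd, sigma0, Rd) / sigma0
    frac = enclosed_mass(n * Rd, sigma0, Rd) / total_mass
    print(f"R = {n} Rd: Sigma/Sigma0 = {drop:.3f}, enclosed mass fraction = {frac:.3f}")
```

Everything about the model disk follows from those two numbers, which is what makes it such a convenient starting point.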
So a basic galaxy model is an exponential disk in an NFW dark matter halo. This is the type of model I discussed in the last post, the kind I was considering two decades ago, and the kind of model still frequently considered. It is an obvious starting point. However, we know that this starting point is not adequate. On the baryonic side, we should model all the major mass components: bulge, disk, and gas. On the halo side, we need to understand how the initial halo depends on its assembly history and how it is modified by the formation of the luminous galaxy within it. The common approach to do all that is to run a giant cosmological simulation and watch what happens. That’s great, provided we know how to model all the essential physics. The action of gravity in an expanding universe we can compute well enough, but we do not enjoy the same ability to calculate the various non-gravitational effects of baryons.
Rather than blindly accept the outcome of simulations that have become so complicated that no one really seems to understand them, it helps to break the problem down into its basic steps. There is a lot going on, but what we’re concerned about here boils down to a tug of war between two competing effects: adiabatic compression tends to concentrate the dark matter, while feedback tends to redistribute it outwards.
Adiabatic compression refers to the response of the dark matter halo to infalling baryons. Though this name stuck, the process isn’t necessarily adiabatic, and the A-word tends to blind people to a generic and inevitable physical process. As baryons condense into the centers of dark matter halos, the gravitational potential is non-stationary. The distribution of dark matter has to respond to this redistribution of mass: the infall of dissipating baryons drags some dark matter in with them, so we expect dark matter halos to become more centrally concentrated. The most common approach to computing this effect is to assume the process is adiabatic (hence the name). This means a gentle settling that is gradual enough to be time-reversible: you can imagine running the movie backwards, unlike a sudden, violent event like a car crash. It needn’t be rigorously adiabatic, but the compressive response of the halo is inevitable. Indeed, forming a thin, dynamically cold, well-organized rotating disk in a preferred plane – i.e., a spiral galaxy – pretty much requires a period during which the adiabatic assumption is a decent approximation. There is a history of screwing up even this much, but Jerry Sellwood showed that it could be done correctly and that when one does so, it reproduces the results of more expensive numerical simulations. This provides a method to go beyond a simple exponential disk in an NFW halo: we can compute what happens to an NFW halo in response to an observed mass distribution.
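To get a feel for the size of the effect, here is a sketch of the classic circular-orbit approximation usually attributed to Blumenthal et al. (1986), in which the product r M(r) is conserved for each shell of dark matter. This is the textbook shortcut, not the more careful calculation referred to above, and it is generally found to overstate the compression. The halo mass, virial radius, concentration, baryon fraction, and disk scale length are all illustrative assumptions, and the disk's enclosed mass is treated as if it were spherical.

```python
import math

def nfw_shape(x):
    """Dimensionless NFW enclosed-mass profile, m(x) = ln(1+x) - x/(1+x)."""
    return math.log1p(x) - x / (1.0 + x)

def initial_mass(r, m_vir, r_vir, c):
    """Total (dark + baryonic) mass inside r before the baryons condense."""
    return m_vir * nfw_shape(c * r / r_vir) / nfw_shape(c)

def baryon_mass(r, m_b, r_d):
    """Final baryonic mass inside r: the enclosed mass of an exponential disk,
    treated as spherical (a simplifying assumption)."""
    x = r / r_d
    return m_b * (1.0 - (1.0 + x) * math.exp(-x))

def contracted_radius(r_i, m_vir, r_vir, c, f_b, r_d):
    """Solve r_f * [M_b(r_f) + (1 - f_b) * M_i(r_i)] = r_i * M_i(r_i) for r_f."""
    m_i = initial_mass(r_i, m_vir, r_vir, c)
    m_dm = (1.0 - f_b) * m_i          # dark mass carried along with the shell
    target = r_i * m_i
    lo, hi = 0.0, r_i / (1.0 - f_b)   # the left-hand side brackets target on this interval
    for _ in range(200):              # plain bisection
        mid = 0.5 * (lo + hi)
        if mid * (baryon_mass(mid, f_b * m_vir, r_d) + m_dm) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed halo and disk: 1e12 Msun inside 200 kpc, concentration 10,
# 5% of the mass in baryons that settle into a disk of scale length 3 kpc.
params = dict(m_vir=1.0e12, r_vir=200.0, c=10.0, f_b=0.05, r_d=3.0)
for r_i in (5.0, 10.0, 20.0, 50.0, 200.0):
    r_f = contracted_radius(r_i, **params)
    print(f"shell starting at {r_i:6.1f} kpc ends at {r_f:6.1f} kpc (ratio {r_f / r_i:.2f})")
```

Shells that start well inside the halo end up at smaller radii, while a shell at the virial radius barely moves: the halo becomes more centrally concentrated, as described above.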
After infall and compression, baryons form stars that produce energy in the form of radiation, stellar winds, and the blast waves of supernova explosions. These are sources of energy that complicate what until now has been a straightforward calculation of gravitational dynamics. With sufficient coupling to the surrounding gas, these energy sources might be converted into enough kinetic energy to alter the equilibrium mass distribution and the corresponding gravitational potential. I say might because we don’t really know how this works, and it is a lot more complicated than I’ve made it sound. So let’s not go there, and instead just calculate the part we do know how to calculate. What happens from the inevitable adiabatic compression in the limit of zero feedback?
We have calculated this for a grid of model galaxies that matches the observed distribution of real galaxies. This is important; it often happens that people do not explore a realistic parameter space. Here is a plot of size against stellar mass:
Note that at a given stellar mass, there is a wide range of sizes. This is an essential aspect of galaxy properties; one has to explain size variations as well as the trend with mass. This obvious point has been frequently forgotten and rediscovered in the literature.
The two parameter plot above only suffices to approximate the stellar disks of spiral and irregular galaxies. Real galaxies have bulges and interstellar gas. We include these in our models so that they cover the same distribution as real galaxies in terms of bulge mass, size, and gas fraction. We then assign a dark matter halo to each model galaxy using an abundance matching relation (the stellar mass tells us the halo mass) and adopt the cosmologically appropriate halo mass-concentration relation. These specify the initial condition of the NFW halo in which each model galaxy is presumed to reside.
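Once a halo mass and a concentration are in hand, the NFW initial condition is fully specified. The sketch below shows the bookkeeping under stated assumptions: the halo is defined at 200 times the critical density, the Hubble constant is taken to be 70 km/s/Mpc, and the mass and concentration are simply typed in rather than drawn from an abundance matching or mass-concentration relation.

```python
import math

G = 4.30091e-6      # gravitational constant in kpc (km/s)^2 / Msun
H0 = 0.070          # km/s/kpc, i.e. 70 km/s/Mpc (assumed value)
RHO_CRIT = 3.0 * H0**2 / (8.0 * math.pi * G)   # critical density in Msun/kpc^3

def nfw_parameters(m200, c):
    """Return (r200, r_s, rho_s) for an NFW halo of mass m200 and concentration c."""
    r200 = (3.0 * m200 / (4.0 * math.pi * 200.0 * RHO_CRIT)) ** (1.0 / 3.0)
    r_s = r200 / c
    rho_s = m200 / (4.0 * math.pi * r_s**3 * (math.log1p(c) - c / (1.0 + c)))
    return r200, r_s, rho_s

def nfw_enclosed_mass(r, r_s, rho_s):
    """Mass inside radius r for an NFW profile with scale radius r_s and density rho_s."""
    x = r / r_s
    return 4.0 * math.pi * rho_s * r_s**3 * (math.log1p(x) - x / (1.0 + x))

# Assumed inputs for illustration only.
m200, c = 1.0e12, 10.0
r200, r_s, rho_s = nfw_parameters(m200, c)
print(f"r200 = {r200:.0f} kpc, r_s = {r_s:.1f} kpc, rho_s = {rho_s:.3g} Msun/kpc^3")
```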
At this point, it is worth remarking that there are a variety of abundance matching relations in the literature. Some of these give tragically bad predictions for the kinematics. I won’t delve into this here, but do want to note that in what follows, we have adopted the most favorable abundance matching relation, which turns out to be that of Kravtsov et al. (2018). Note that this means that we are already engaged in a kind of fine-tuning by cherry-picking the most favorable relation.
Before considering adiabatic compression, let’s see what happens if we simply add our model galaxies to NFW halos. This is the same exercise we did last time with exponential disks; now we’re including bulges and gas:
This looks pretty good, at least at a first glance. Most of the models fall nearly on top of each other. This isn’t entirely true, as the most massive models overpredict the RAR. This is a generic consequence of the bend in abundance matching relations. This bend is mildest in the Kravtsov relation, which is what makes it “best” here – other relations, like the commonly cited one of Behroozi, predict a lot more high-acceleration models. One sees only a hint of that here.
The scatter is respectably small, mostly solving the problem I initially encountered in the nineties. Despite predicting a narrow relation, the models do have a finite scatter that is a bit more than we observe. This isn’t too tragic, so maybe we can work with it. These models also miss the low acceleration end of the relation by a modest but appreciable amount. This seems more significant, as we found the same thing for pure exponential models: it is hard to make this part of the problem go away.
Including bulges in the models extends them to high accelerations. This would seem to explain a region of the RAR that pure exponential models do not address. Bulges are high surface density, star dominated regions, so they fall on the 1:1 part of the RAR at high accelerations.
And then there are the hooks. These are obvious in the plot above. They occur in low and intermediate mass galaxies that lack a significant bulge component. A pure exponential disk has a peak acceleration at finite radius, but an NFW halo has its peak at zero radius. So if you imagine following a given model line inwards in radius, it goes up in acceleration until it reaches the maximum for the disk along the x-axis. The baryonic component of the acceleration then starts to decline while that due to the NFW halo continues to rise. The model doubles back to lower baryonic acceleration while continuing to higher total acceleration, making the little hook shape. This deviation from the RAR is not commonly observed; indeed, these hooks are the signature of the cusp-core problem in the RAR plane.
Results so far are mixed. With the “right” choice of abundance matching relation, we are well ahead of where we were at the turn of the century, but some real problems remain. We have yet to compute the necessary adiabatic contraction, so hopefully doing that right will result in further improvement. So let’s make a rigorous calculation of the compression that would result from forming a galaxy of the stipulated parameters.
Adiabatic compression makes things worse. There is a tiny improvement at low accelerations, but the most pronounced effects are at small radii where accelerations are large. Compression makes cuspy halos cuspier, making the hooks more pronounced. Worse, the strong concentration of starlight that is a bulge inevitably leads to strong compression. These models don’t approach the 1:1 line at high acceleration, and never can: higher acceleration means higher stellar surface density means greater compression. One cannot start from an NFW halo and ever reach a state of baryon domination; too much dark matter is always in the mix.
It helps to look at the residual diagram. The RAR is a log-log plot over a large dynamic range; this can hide small but significant deviations. For some reason, people who claim to explain the RAR with dark matter models never seem to show these residuals.
The models built to date don’t have the right shape to explain the RAR, at least when examined closely. Still, I’m pleased: what we’ve done here comes closer than all my many previous efforts, and most of the other efforts that are out there. Still, I wouldn’t claim it as a success. Indeed, the inevitable compressive effects that occur at high surface densities mean that we can’t invoke simple offsets to accommodate the data: if a model gets the shape of the RAR right but the normalization wrong, it doesn’t work to simply shift it over.
So, where does that leave us? Up the proverbial creek? Perhaps. We have yet to consider feedback, which is too complicated to delve into here. Instead, while we haven’t engaged in any specific fine-tuning, we have already engaged in some cherry picking. First, we’ve abandoned the natural proportionality between halo and disk mass, replacing it with abundance matching. This is no small step, as it converts a single-valued parameter of our theory to a rolling function of mass. Abundance matching has become familiar enough that people seem to be lulled into thinking this is natural. There is nothing natural about it. Regardless of how much fancy jargon we use to justify it, it’s still basically a rolling fudge factor – the scientific equivalent of a lipstick smothered pig.
Abundance matching does, at least, use data that are independent of the kinematics to set the relation between stellar and halo mass, and it does go in the right direction for the RAR. This only gets us into the right ballpark, and only if we cherry-pick the particular abundance matching relation that we use. So we’re well down the path of tuning whether we realize it or not. Invoking feedback is simply another step along this path.
Feedback is usually invoked in the kinematic context to convert cusps into cores. That could help with the hooks. This kind of feedback is widely thought to affect low and intermediate mass galaxies, or galaxies of a particular stellar to halo mass ratio. Opinions vary a bit, but it is generally not thought to have such a strong effect on massive galaxies. And yet, we find that we need some (second?) kind of feedback for them, as we need to move bulges back onto the 1:1 line in the RAR plane. That’s perhaps related to the cusp-core problem, but it’s also different. Getting bulges right requires a fine-tuned amount of feedback to exactly cancel out the effects of compression. A third distinct place where the models need some help is at low accelerations. This is far from the region where feedback is thought to have much effect at all.
I could go on, and perhaps will in a future post. Point is, we’ve been tuning our feedback prescriptions to match observed facts about galaxies, not computing how we think it really works. We don’t know how to do the latter, and there is no guarantee that our approximations do justice to reality. So on the one hand, I don’t doubt that with enough tinkering this process can be made to work in a model. On the other hand, I do question whether this is how the universe really works.
In the previous post, I related some of the history of the Radial Acceleration Relation (henceforth RAR). Here I’ll discuss some of my efforts to understand it. I’ve spent more time trying to do this in terms of dark matter than pretty much anything else, but I have not published most of those efforts. As I related briefly in this review, that’s because most of the models I’ve considered are obviously wrong. Just because I have refrained from publishing explanations of the RAR that are manifestly incorrect has not precluded others from doing so.
A theory is only as good as its prior. If a theory makes a clear prediction, preferably ahead of time, then we can test it. If it has not done so ahead of time, that’s still OK, if we can work out what it would have predicted without being guided by the data. A good historical example of this is the explanation of the excess perihelion precession of Mercury provided by General Relativity. The anomaly had been known for decades, but the right answer falls out of the theory without input from the data. A more recent example is our prediction of the velocity dispersions of the dwarf satellites of Andromeda. Some cases were genuine a priori predictions, but even in the cases that weren’t, the prediction is what it is irrespective of the measurement.
Dark matter-based explanations of the RAR do not fall in either category. They have always chased the data and been informed by it. This has been going on for so long that new practitioners have entered the field unaware of the extent to which the simulations they inherited had already been informed by the data. They legitimately seem to think that there has been no fine-tuning of the models because they weren’t personally present for every turn of the knob.
So let’s set the way-back machine. I became concerned about fine-tuning problems in the context of galaxy dynamics when I was trying to explain the Tully-Fisher relation of low surface brightness galaxies in the mid-1990s. This was before I was more than dimly aware that MOND existed, much less taken it seriously. Many of us were making earnest efforts to build proper galaxy formation theories at the time (e.g., Mo, McGaugh, & Bothun 1994; Dalcanton, Spergel, & Summers 1997; Mo, Mao, & White 1998 [MMW]; McGaugh & de Blok 1998), though of course these were themselves informed by observations to date. My own paper had started as an effort to exploit the new things we had discovered about low surface brightness galaxies to broaden our conventional theory of galaxy formation, but over the course of several years, turned into a falsification of some of the ideas I had considered in my 1992 thesis. Dalcanton’s model evolved from one that predicted a shift in Tully-Fisher (as mine had) to one that did not (after the data said no). It may never be possible to completely separate theoretical prediction from concurrent data, but it is possible to ask what a theory plausibly predicts. What is the LCDM prior for the RAR?
In order to do this, we need to predict both the baryon distribution (gbar) and that of the dark matter (gobs-gbar). Unfortunately, nobody seems to really agree on what LCDM predicts for galaxies. There seems to be a general consensus that dark matter halos should start out with the NFW form, but opinions vary widely about whether and how this is modified during galaxy formation. The baryonic side of the issue is simply seen as a problem.
That there is no clear prediction is in itself a problem. I distinctly remember expressing my concerns to Martin Rees while I was still a postdoc. He said not to worry; galaxies were such non-linear entities that we shouldn’t be surprised by anything they do. This verbal invocation of a blanket dodge for any conceivable observation did not inspire confidence. Since then, I’ve heard that excuse repeated by others. I have lost count of the number of more serious, genuine, yet completely distinct LCDM predictions I have seen, heard, or made myself. Many dozens, at a minimum; perhaps hundreds at this point. Some seem like they might work but don’t while others don’t even cross the threshold of predicting both axes of the RAR. There is no coherent picture that adds up to an agreed set of falsifiable predictions. Individual models can be excluded, but not the underlying theory.
To give one example, let’s consider the specific model of MMW. I make this choice here for two reasons. One, it is a credible effort by serious workers and has become a touchstone in the field, to the point that a sizeable plurality of practitioners might recognize it as a plausible prior – i.e., the closest thing we can hope to get to a legitimate, testable prior. Two, I recently came across one of my many unpublished attempts to explain the RAR which happens to make use of it. Unix says that the last time I touched these files was nearly 22 years ago, in 2000. The postscript generated then is illegible now, so I have to update the plot:
At first glance, this might look OK. The trend is at least in the right direction. This is not a success so much as it is an inevitable consequence of the fact that the observed acceleration includes the contribution of the baryons. The area below the dashed line is excluded, as it is impossible to have gobs < gbar. Moreover, since gobs = gbar+gDM, some correlation in this plane is inevitable. Quite a lot, if baryons dominate, as they always seem to do at high accelerations. Not that these models explain the high acceleration part of the RAR, but I’ll leave that detail for later. For now, note that this is a log-log plot. That the models miss the data a little to the eye translates to a large quantitative error. Individual model galaxies sometimes fall too high, sometimes too low: the model predicts considerably more scatter than is observed. The RAR is not predicted to be a narrow relation, but one with lots of scatter with large intrinsic deviations from the mean. That’s the natural prediction of MMW-type models.
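The point that some correlation is built in can be checked with a toy experiment: draw gbar and gDM independently at random, so that the halo knows nothing about the baryons, and the sum still correlates with gbar and never drops below the 1:1 line. The ranges below (in m/s/s) and the random seed are arbitrary assumptions chosen only to span the accelerations of interest; this is not a galaxy model of any kind.

```python
import math
import random

random.seed(1)   # fixed seed so the toy experiment is repeatable

def log_uniform(lo, hi):
    """Draw a value uniformly in log10 between lo and hi."""
    return 10.0 ** random.uniform(math.log10(lo), math.log10(hi))

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

n = 2000
gbar = [log_uniform(1e-12, 1e-8) for _ in range(n)]    # baryonic acceleration
gdm = [log_uniform(1e-11, 1e-10) for _ in range(n)]    # drawn independently of gbar
gobs = [b + d for b, d in zip(gbar, gdm)]              # gobs = gbar + gDM

r = pearson([math.log10(g) for g in gbar], [math.log10(g) for g in gobs])
print(f"correlation of log(gobs) with log(gbar): {r:.2f}")
print("any point below the 1:1 line?", any(o < b for o, b in zip(gobs, gbar)))
```

A strong correlation in this plane is therefore not, by itself, evidence of anything; what matters is the shape and the scatter.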
I have explored many flavors of [L]CDM models. They generically predict more scatter in the RAR than is observed. This is the natural expectation, and some fine-tuning has to be done to reduce the scatter to the observed level. The inevitable need for fine-tuning is why I became concerned for the dark matter paradigm, even before I became aware that MOND predicted exactly this. It is also why the observed RAR was considered to be against orthodoxy at the time: everybody’s prior was for a large scatter. It wasn’t just me.
In order to build a model, one has to make some assumptions. The obvious assumption to make, at the time, was a constant ratio of dark matter to baryons. Indeed, for many years, the working assumption was that this was about 10:1, maybe 20:1. This type of assumption is built into the models of MMW, who thought that they worked provided “(i) the masses of disks are a few percent of those of their haloes”. The (i) is there because it is literally their first point, and the assumption that everybody made. We were terrified of dropping this natural assumption, as the obvious danger is that it becomes a rolling fudge factor, assuming any value that is convenient for explaining any given observation.
Unfortunately, it had already become clear by this time from the data that a constant ratio of dark to luminous matter could not work. The earliest I said this on the record is 1996. [That was before LCDM had supplanted SCDM as the most favored cosmology. From that perspective, the low baryon fractions of galaxies seemed natural; it was clusters of galaxies that were weird.] I pointed out the likely failure of (i) to Mo when I first saw a draft of MMW (we had been office mates in Cambridge). I’ve written various papers about it since. The point here is that, from the perspective of the kinematic data, the ratio of dark to luminous mass has to vary. It cannot be a constant as we had all assumed. But it has to vary in a way that doesn’t introduce scatter into relations like the RAR or the Baryonic Tully-Fisher relation, so we have to fine-tune this rolling fudge factor so that it varies with mass but always obtains the same value at the same mass.
A constant ratio of dark to luminous mass wasn’t just a convenient assumption. There is good physical reason to expect that this should be the case. The baryons in galaxies have to cool and dissipate to form a galaxy in the center of a dark matter halo. This takes time, imposing an upper limit on galaxy mass. But the baryons in small galaxies have ample time to cool and condense, so one naively expects that they should all do so. That would have been natural. It would also lead to a steeply increasing luminosity function, which is not observed, leading to the over-cooling and missing satellite problems.
Reconciling the observed and predicted mass functions is one of the reasons we invoke feedback. The energy produced by the stars that form in the first gas to condense is an energy source that feeds back into the surrounding gas. This can, in principle, reheat the remaining gas or expel it entirely, thereby precluding it from condensing and forming more stars as in the naive expectation. In principle. In practice, we don’t know how this works, or even if the energy provided by star formation couples to the surrounding gas in a way that does what we need it to do. Simulations do not have the resolution to follow feedback in detail, so instead make some assumptions (“subgrid physics”) about how this might happen, and tune the assumed prescription to fit some aspect of the data. Once this is done, it is possible to make legitimate predictions about other aspects of the data, provided they are unrelated. But we still don’t know if that’s how feedback works, and in no way is it natural. Rather, it is a deus ex machina that we invoke to save us from a glaring problem without really knowing how it works or even if it does. This is basically just theoretical hand-waving in the computational age.
People have been invoking feedback as a panacea for all ills in galaxy formation theory for so long that it has become familiar. Once something becomes familiar, everybody knows it. Since everybody knows that feedback has to play some role, it starts to seem like it was always expected. This is easily confused with being natural.
I could rant about the difficulty of making predictions with feedback afflicted models, but never mind the details. Let’s find some aspect of the data that is independent of the kinematics that we can use to specify the dark to luminous mass ratio. The most obvious candidate is abundance matching, in which the number density of observed galaxies is matched to the predicted number density of dark matter halos. We don’t have to believe feedback-based explanations to apply this, we merely have to accept that there is some mechanism to make the dark to luminous mass ratio variable. Whatever it is that makes this happen had better predict the right thing for both the mass function and the kinematics.
When it comes to the RAR, the application of abundance matching to assign halo masses to observed galaxies works out much better than the natural assumption of a constant ratio. This was first pointed out by Di Cintio & Lelli (2016), which inspired me to consider appropriately modified models. All I had to do was update the relation between stellar and halo mass from a constant ratio to a variable specified by abundance matching. This gives rather better results:
This looks considerably better! The predicted scatter is much lower. How is this accomplished?
Abundance matching results in a non-linear relation between stellar mass and halo mass. For the RAR, the scatter is reduced by narrowing the dynamic range of halo masses relative to the observed stellar masses. There is less variation in gDM. Empirically, this is what needs to happen – to a crude first approximation, the data are roughly consistent with all galaxies living in the same halo – i.e., no variation in halo mass with stellar mass. This was already known before abundance matching became rife; both the kinematic data and the mass function push us in this direction. There’s nothing natural about any of this; it’s just what we need to do to accommodate the data.
Still, it is tempting to say that we’ve succeeded in explaining the RAR. Indeed, some people have built the same kind of models to claim exactly this. While matters are clearly better, really we’re just less far off. By reducing the dynamic range in halo masses that are occupied by galaxies, the partial contribution of gDM to the gobs axis is compressed, and model lines perforce fall closer together. There’s less to distinguish an L* galaxy from a dwarf galaxy in this plane.
Nevertheless, there’s still too much scatter in the models. Harry Desmond made a specific study of this, finding that abundance matching “significantly overpredicts the scatter in the relation and its normalisation at low acceleration”, which is exactly what I’ve been saying. The offset in the normalization at low acceleration is obvious from inspection in the figure above: the models overshoot the low acceleration data. This led Navarro et al. to argue that there was a second acceleration scale, “an effective minimum acceleration probed by kinematic tracers in isolated galaxies” a little above 10⁻¹¹ m/s/s. The models do indeed do this, over a modest range in gbar, and there is some evidence for it in some data. This does not persist in the more reliable data; those shown above are dominated by atomic gas so there isn’t even the systematic uncertainty of the stellar mass-to-light ratio to save us.
The astute observer will notice some pink model lines that fall well above the RAR in the plot above. These are for the most massive galaxies, those with luminosities in excess of L*. Below the knee in the Schechter function, there is a small range of halo masses for a given range of stellar masses. Above the knee, this situation is reversed. Consequently, the nonlinearity of abundance matching works against us instead of for us, and the scatter explodes. One can suppress this with an apt choice of abundance matching relation, but we shouldn’t get to pick and choose which relation we use. It can be made to work only because there remains enough uncertainty in abundance matching to select the “right” one. There is nothing natural about any of this.
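The reversal on either side of the knee can be illustrated with a double power law of the general kind used to summarize abundance matching results. The functional form and every parameter value below are illustrative assumptions; they are not the relation of any particular paper mentioned here.

```python
import math

# Assumed, illustrative parameters of a double power-law stellar-to-halo mass relation.
LOG_M1 = 11.6    # log10 of the halo mass at the knee, in Msun
NORM = 0.035     # stellar-to-halo mass ratio at the knee
BETA = 1.4       # controls the slope below the knee
GAMMA = 0.6      # controls the slope above the knee

def stellar_mass(log_mh):
    """Stellar mass (Msun) assigned to a halo of mass 10**log_mh Msun."""
    u = 10.0 ** (log_mh - LOG_M1)
    ratio = 2.0 * NORM / (u ** -BETA + u ** GAMMA)
    return ratio * 10.0 ** log_mh

def log_halo_mass(log_mstar, lo=8.0, hi=17.0):
    """Invert the (monotonic) relation by bisection in log10 halo mass."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if math.log10(stellar_mass(mid)) < log_mstar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Below the knee: three decades of stellar mass.
span_low = log_halo_mass(10.0) - log_halo_mass(7.0)
# Above the knee: one decade of stellar mass.
span_high = log_halo_mass(12.0) - log_halo_mass(11.0)
print(f"3 dex in stellar mass below the knee spans {span_low:.2f} dex in halo mass")
print(f"1 dex in stellar mass above the knee spans {span_high:.2f} dex in halo mass")
```

Below the knee the halo masses are squeezed into a narrower range than the stellar masses, which pulls the model lines together; above it the halo mass range stretches out instead.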
There are also these little hooks, the kinks at the high acceleration end of the models. I’ve mostly suppressed them here (as did Navarro et al.) but they’re there in the models if one plots to small enough radii. This is the signature of the cusp-core problem in the RAR plane. The hooks occur because the exponential disk model has a maximum acceleration at a finite radius that is a little under one scale length; this marks the maximum value that such a model can reach in gbar. In contrast, the acceleration gDM of an NFW halo continues to increase all the way to zero radius. Consequently, the predicted gobs continues to increase even after gbar has peaked and starts to decline again. This leads to little hook-shaped loops at the high acceleration end of the models in the RAR plane.
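The hook can be reproduced with two standard formulas: the in-plane acceleration of a razor-thin exponential disk (the Freeman 1970 result, written with modified Bessel functions) and the acceleration of a spherical NFW halo. The sketch below uses only the standard library, so the Bessel functions are evaluated with a short series and a simple quadrature; the disk and halo parameters are illustrative assumptions, not values from the models discussed here.

```python
import math

G = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / Msun

def bessel_i01(x, terms=30):
    """I0(x) and I1(x) from their power series (adequate for small to moderate x)."""
    q = 0.25 * x * x
    t0, t1 = 1.0, 0.5 * x
    i0, i1 = t0, t1
    for k in range(1, terms):
        t0 *= q / (k * k)
        t1 *= q / (k * (k + 1))
        i0 += t0
        i1 += t1
    return i0, i1

def bessel_k01(x, dt=0.005):
    """K0(x) and K1(x) from K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt,
    using the trapezoid rule; intended for 0 < x up to a few tens."""
    k0 = k1 = 0.5 * math.exp(-x)      # t = 0 endpoint, weight 1/2
    n = 1
    while True:
        ch = math.cosh(n * dt)
        if x * ch > 60.0:             # remaining terms are negligible
            break
        w = math.exp(-x * ch)
        k0 += w
        k1 += w * ch
        n += 1
    return k0 * dt, k1 * dt

def g_disk(R, sigma0, Rd):
    """gbar: in-plane acceleration of a razor-thin exponential disk."""
    y = R / (2.0 * Rd)
    i0, i1 = bessel_i01(y)
    k0, k1 = bessel_k01(y)
    return 2.0 * math.pi * G * sigma0 * y * (i0 * k0 - i1 * k1)

def g_nfw(r, rho_s, r_s):
    """gDM: acceleration of a spherical NFW halo, G M(<r) / r^2."""
    x = r / r_s
    return 4.0 * math.pi * G * rho_s * r_s * (math.log1p(x) - x / (1.0 + x)) / x**2

# Assumed parameters: a low surface density disk in a halo that dominates it.
Rd, sigma0 = 2.0, 3.98e7      # kpc, Msun/kpc^2 (about 1e9 Msun in total)
r_s, rho_s = 10.0, 1.0e7      # kpc, Msun/kpc^3

radii = [0.05 * Rd * k for k in range(1, 101)]    # 0.05 Rd out to 5 Rd
gbar = [g_disk(R, sigma0, Rd) for R in radii]
gdm = [g_nfw(R, rho_s, r_s) for R in radii]
gobs = [b + d for b, d in zip(gbar, gdm)]

i_peak = max(range(len(radii)), key=lambda i: gbar[i])
print(f"gbar peaks at R = {radii[i_peak] / Rd:.2f} Rd")
# Walk inward from the peak: gbar declines while gobs keeps climbing.
for i in range(i_peak, max(i_peak - 6, -1), -1):
    print(f"R = {radii[i] / Rd:.2f} Rd  gbar = {gbar[i]:8.1f}  gobs = {gobs[i]:8.1f}")
```

Accelerations are in (km/s)^2/kpc. Plotting gobs against gbar for these lists traces a line that climbs toward the peak in gbar and then doubles back to lower gbar at still higher gobs.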
These hooks were going to be the segue to discuss more sophisticated models built by Pengfei Li, but that’s going to be a whole ‘nother post because these are quite enough words for now. So, until next time, don’t invest in bitcoins, Russian oil, or LCDM models that claim to explain the RAR.
That’s my prediction, anyway. A little context first.
New Year, New Telescope
First, JWST finally launched. This has been a long-delayed NASA mission; the launch had been put off so many times it felt like a living example of Zeno’s paradox: ever closer but never quite there. A successful launch is always a relief – rockets do sometimes blow up on lift off – but there is still sweating to be done: it has one of the most complex deployments of any space mission. This is still a work in progress, but to start the new year, I thought it would be nice to look forward to what we hope to see.
JWST is a major space telescope optimized for observing in the near and mid-infrared. This enables observation of redshifted light from the earliest galaxies. This should enable us to see them as they would appear to our eyes had we been around at the time. And that time is long, long ago, in galaxies very far away: in principle, we should be able to see the first galaxies in their infancy, 13+ billion years ago. So what should we expect to see?
Early galaxies in LCDM
A theory is only as good as its prior. In LCDM, structure forms hierarchically: small objects emerge first, then merge into larger ones. It takes time to build up large galaxies like the Milky Way; the common estimate early on was that it would take at least a billion years to assemble an L* galaxy, and it could easily take longer. Ach, terminology: an L* galaxy is the characteristic luminosity of the Schechter function we commonly use to describe the number density of galaxies of various sizes. L* galaxies like the Milky Way are common, but the number of brighter galaxies falls precipitously. Bigger galaxies exist, but they are rare above this characteristic brightness, so L* is shorthand for a galaxy of typical brightness.
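For the record, the Schechter function is a power law with an exponential cutoff above L*. A minimal sketch, in which the faint-end slope and the normalization are assumed values chosen only for illustration, shows how quickly the numbers drop once L exceeds L*:

```python
import math

def schechter(l_ratio, phi_star=1.0, alpha=-1.25):
    """Number density per unit L/L*: phi* (L/L*)^alpha exp(-L/L*).
    phi_star and alpha are assumed, illustrative values."""
    return phi_star * l_ratio ** alpha * math.exp(-l_ratio)

for l_ratio in (0.1, 1.0, 3.0, 10.0):
    rel = schechter(l_ratio) / schechter(1.0)
    print(f"L = {l_ratio:4.1f} L*: density relative to L* = {rel:.2e}")
```

The power law governs the faint end, while the exponential is what makes galaxies much brighter than L* so rare.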
We expect galaxies to start small and slowly build up in size. This is a very basic prediction of LCDM. The hierarchical growth of dark matter halos is fundamental, and relatively easy to calculate. How this translates to the visible parts of galaxies is more fraught, depending on the details of baryonic infall, star formation, and the many kinds of feedback. [While I am a frequent critic of model feedback schemes implemented in hydrodynamic simulations on galactic scales, there is no doubt that feedback happens on the much smaller scales of individual stars and their nurseries. These are two very different things for which we confusingly use the same word since the former is the aspirational result of the latter.] That said, one only expects to assemble mass so fast, so the natural expectation is to see small galaxies first, with larger galaxies emerging slowly as their host dark matter halos merge together.
Here is an example of a model formation history that results in the brightest galaxy in a cluster (from De Lucia & Blaizot 2007). Little things merge to form bigger things (hence “hierarchical”). This happens a lot, and it isn’t really clear when you would say the main galaxy had formed. The final product (at lookback time zero, at redshift z=0) is a big galaxy composed of old stars – fairly typical for a giant elliptical. But the most massive progenitor is still rather small 8 billion years ago, over 4 billion years after the Big Bang. The final product doesn’t really emerge until the last major merger around 4 billion years ago. This is just one example in one model, and there are many different models, so your mileage will vary. But you get the idea: it takes a long time and a lot of mergers to assemble a big galaxy.
It is important to note that in a hierarchical model, the age of a galaxy is not the same as the age of the stars that make up the galaxy. According to De Lucia & Blaizot, the stars of the brightest cluster galaxies
“are formed very early (50 per cent at z~5, 80 per cent at z~3)”
but do so
“in many small galaxies”
– i.e., the little progenitor circles in the plot above. The brightest cluster galaxies in their model build up rather slowly, such that
“half their final mass is typically locked-up in a single galaxy after z~0.5.”
So all the star formation happens early in the little things, but the final big thing emerges later – a lot later, only reaching half its current size when the universe is about 8 Gyr old. (That’s roughly when the solar system formed: we are late-comers to this party.) Given this prediction, one can imagine that JWST should see lots of small galaxies at high redshift, their early star formation popping off like firecrackers, but it shouldn’t see any big galaxies early on – not really at z > 3 and certainly not at z > 5.
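To translate the redshifts quoted above into cosmic ages, here is a rough sketch using the standard closed-form age of a flat universe with matter and a cosmological constant. The parameter values (H0 = 70 km/s/Mpc, matter density 0.3) are round-number assumptions of mine, radiation is neglected, and the function name is just for this example.

```python
import math

def age_gyr(z, h0=70.0, omega_m=0.3):
    """Age of a flat LCDM universe at redshift z, in Gyr (radiation neglected).

    h0 (km/s/Mpc) and omega_m are assumed round numbers, not fitted values.
    """
    omega_l = 1.0 - omega_m
    hubble_time = 977.8 / h0  # 1/H0 in Gyr when H0 is in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return (2.0 / (3.0 * math.sqrt(omega_l))) * hubble_time * math.asinh(x)

# The redshifts that appear in the De Lucia & Blaizot quotes.
for z in (0.5, 3.0, 5.0):
    print(f"z = {z}: the universe is {age_gyr(z):.1f} Gyr old")
```

With these assumed parameters, z ~ 0.5 comes out a bit over 8 Gyr after the Big Bang, while z = 3 and z = 5 both fall within the first couple of billion years.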
Big galaxies in the data at early times?
While JWST is eagerly awaited, people have not been idle about looking into this. There have been many deep surveys made with the Hubble Space Telescope, augmented by the infrared capable (and now sadly defunct) Spitzer Space Telescope. These have already spied a number of big galaxies at surprisingly high redshift. So surprising that Steinhardt et al. (2016) dubbed it “The Impossibly Early Galaxy Problem.” This is their key plot:
There are lots of caveats to this kind of work. Constructing the galaxy luminosity function is a challenging task at any redshift; getting it right at high redshift especially so. While what counts as “high” varies, I’d say everything on the above plot counts. Steinhardt et al. (2016) worry about these details at considerable length but don’t find any plausible way out.
Around the same time, one of our graduate students, Jay Franck, was looking into similar issues. One of the things he found was that not only were there big galaxies in place early on, but they were also in clusters (or at least protoclusters) early and often. That is to say, not only are the galaxies too big too soon, so are the clusters in which they reside.
Dr. Franck made his own comparison of data to models, using the Millennium simulation to devise an apples-to-apples comparison:
The result is that the data look more like big galaxies formed early already as big galaxies. The solid lines are “passive evolution” models in which all the stars form in a short period starting at z=10. This starting point is an arbitrary choice, but there is little cosmic time between z = 10 and 20 – just a few hundred million years, barely one spin around the Milky Way. This is a short time in stellar evolution, so is practically the same as starting right at the beginning of time. As Jay put it,
“High redshift cluster galaxies appear to be consistent with an old stellar population… they do not appear to be rapidly assembling stellar mass at these epochs.”
We see old stars, but we don’t see the predicted assembly of galaxies via mergers, at least not at the expected time. Rather, it looks like some galaxies were already big very early on.
As someone who has worked mostly on well resolved, relatively nearby galaxies, all this makes me queasy. Jay, and many others, have worked desperately hard to squeeze knowledge from the faint smudges detected by first generation space telescopes. JWST should bring these into much better focus.
Early galaxies in MOND
To go back to the first line of this post, big galaxies at high redshift did not come as a surprise to me. It is what we expect in MOND.
Structure formation is generally considered a great success of LCDM. It is straightforward and robust to calculate on large scales in linear perturbation theory. Individual galaxies, on the other hand, are highly non-linear objects, making them hard beasts to tame in a model. In MOND, it is the other way around – predicting the behavior of individual galaxies is straightforward – only the observed distribution of mass matters, not all the details of how it came to be that way – but what happens as structure forms in the early universe is highly non-linear.
The non-linearity of MOND makes it hard to work with computationally. It is also crucial to how structure forms. I provide here an outline of how I expect structure formation to proceed in MOND. This page is now old, even ancient in internet time, as the golden age for this work was 15 – 20 years ago, when all the essential predictions were made and I was naive enough to think cosmologists were amenable to reason. Since the horizon of scientific memory is shorter than that, I felt it necessary to review in 2015. That is now itself over the horizon, so with the launch of JWST, it seems appropriate to remind the community yet again that these predictions exist.
This was a remarkable prediction to make in 1998. Galaxies, much less larger structures, were supposed to take much longer to form. It takes time to go from the small initial perturbations that we see in the CMB at z=1000 to large objects like galaxies. Indeed, it takes at least a few hundred million years simply in free fall time to assemble a galaxy’s worth of mass, a hard limit. Here Sanders was saying that an L* galaxy might assemble as early as half a billion years after the Big Bang.
So how can this happen? Without dark matter to lend a helping hand, structure formation in the very early universe is inhibited by the radiation field. This inhibition is removed around z ~ 200; exactly when being very sensitive to the baryon density. At this point, the baryon perturbations suddenly find themselves deep in the MOND regime, and behave as if there is a huge amount of dark matter. Structure proceeds hierarchically, as it must, but on a highly compressed timescale. To distinguish it from LCDM hierarchical galaxy formation, let’s call it prompt structure formation. In prompt structure formation, we expect
Early reionization (z ~ 20)
Some L* galaxies by z ~ 10
Early emergence of the cosmic web
Massive clusters already at z > 2
Large, empty voids
Large peculiar velocities
A very large homogeneity scale, maybe fractal over 100s of Mpc
There are already indications of all of these things, nearly all of which were predicted in advance of the relevant observations. I could elaborate, but that is beyond the scope of this post. People should read the references* if they’re keen.
*Reading the science papers is mandatory for the pros, who often seem fond of making straw man arguments about what they imagine MOND might do without bothering to check. I once referred some self-styled experts in structure formation to Sanders’s work. They promptly replied “That would mean structures of 10^18 M☉!” when what he said was
“The largest objects being virialized now would be clusters of galaxies with masses in excess of 10^14 M☉. Superclusters would only now be reaching maximum expansion.”
The exact numbers are very sensitive to cosmological parameters, as Sanders discussed, but I have no idea where the “experts” got 10^18, other than just making stuff up. More importantly, Sanders’s statement clearly presaged the observation of very massive clusters at surprisingly high redshift and the discovery of the Laniakea Supercluster.
Instead, we lack critical mass. Most of the community remains entirely obsessed with pursuing the vain chimera of invisible mass. I fear that this will eventually prove to be one of the greatest wastes of brainpower (some of it my own) in the history of science. I can only hope I’m wrong, as many brilliant people seem likely to waste their careers running garbage-in, garbage-out computer simulations or at the bottom of a mine shaft failing to detect what isn’t there.
A beautiful mess
JWST can’t answer all of these questions, but it will help enormously with galaxy formation, which is bound to be messy. It’s not like L* galaxies are going to spring fully formed from the void like Athena from the forehead of Zeus. The early universe must be a chaotic place, with clumps of gas condensing to form the first stars that irradiate the surrounding intergalactic gas with UV photons before detonating as the first supernovae, and the clumps of stars merging to form giant elliptical galaxies while elsewhere gas manages to pool and settle into the large disks of spiral galaxies. When all this happens, how it happens, and how big galaxies get how fast are all to be determined – but now accessible to direct observation thanks to JWST.
It’s going to be a confusing, beautiful mess, in the best possible way – one that promises to test and challenge our predictions and preconceptions about structure formation in the early universe.
Mass is a basic quantity. How much stuff does an astronomical object contain? For a galaxy, mass can mean many different things: that of its stars, stellar remnants (e.g., white dwarfs, neutron stars), atomic gas, molecular clouds, plasma (ionized gas), dust, Bok globules, black holes, habitable planets, biomass, intelligent life, very small rocks… these are all very different numbers for the same galaxy, because galaxies contain lots of different things. Two things that many scientists have settled on as Very Important are a galaxy’s stellar mass and its dark matter halo mass.
The mass of a galaxy’s dark matter halo is not well known. Most measurements provide only lower limits, as tracers fade out before any clear end is reached. Consequently, the “total” mass is a rather notional quantity. So we’ve adopted as a convention the mass M200 contained within an over-density of 200 times the critical density of the universe. This is a choice motivated by an ex-theory that would take an entire post to explain unsatisfactorily, so do not question the convention: all choices are bad, so we stick with it.
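To make the M200 convention concrete, here is a small sketch that converts a halo mass into the radius r200 within which the mean density is 200 times critical. The Hubble constant of 70 km/s/Mpc is an assumed round number, and the function names are mine, made up for this example.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun

def critical_density(h0=70.0):
    """Critical density in Msun/kpc^3 for H0 in km/s/Mpc (70 is an assumption)."""
    h0_per_kpc = h0 / 1000.0  # km/s/kpc
    return 3.0 * h0_per_kpc**2 / (8.0 * math.pi * G)

def r200_kpc(m200, h0=70.0):
    """Radius (kpc) enclosing a mean density of 200 x critical for mass m200 (Msun)."""
    rho = 200.0 * critical_density(h0)
    return (3.0 * m200 / (4.0 * math.pi * rho)) ** (1.0 / 3.0)

for m200 in (1e12, 1e13):
    print(f"M200 = {m200:.0e} Msun  ->  r200 = {r200_kpc(m200):.0f} kpc")
```

For a halo of 10^12 M☉ this gives a radius of a couple hundred kpc, far beyond where most kinematic tracers are available, which is why the quantity is notional.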
One of the long-standing problems the cold dark matter paradigm has is that the galaxy luminosity function should be steep but is observed to be shallow. This sketch shows the basic issue. The number density of dark matter halos as a function of mass is expected to be a power law – one that is well specified once the cosmology is known and a convention for the mass is adopted. The obvious expectation is that the galaxy luminosity function should just be a downshifted version of the halo mass function: one galaxy per halo, with the stellar mass proportional to the halo mass. This was such an obvious assumption [being provision (i) of canonical galaxy formation in LCDM] that it was not seriously questioned for over a decade. (Minor point: a turn down at the high mass end could be attributed to gas cooling times: the universe didn’t have time to cool and assemble a galaxy above some threshold mass, but smaller things had plenty of time for gas to cool and form stars.)
The galaxy luminosity function does not look like a shifted version of the halo mass function. It has the wrong slope at the faint end. At no point is the size of the shift equal to what one would expect from the mass of available baryons. The proportionality factor m_d is too small; this is sometimes called the over-cooling problem, in that a lot more baryons should have cooled to form stars than apparently did so. So, aside from the shape and the normalization, it’s a great match.
We obsessed about this problem all through the ’90s. At one point, I thought I had solved it. Low surface brightness galaxies were under-represented in galaxy surveys. They weren’t missed entirely, but their masses could be systematically underestimated. This might matter a lot because the associated volume corrections are huge. A small systematic in mass would get magnified into a big one in density. Sadly, after a brief period of optimism, it became clear that this could not work to solve the entire problem, which persists.
Circa 2000, a local version of the problem became known as the missing satellites problem. This is a down-shifted version of the mismatch between the galaxy luminosity function and the halo mass function that pervades the entire universe: few small galaxies are observed where many are predicted. To give visual life to the numbers we’re talking about, here is an image of the dark matter in a simulation of a Milky Way size galaxy:
In contrast, real galaxies have rather fewer satellites that meet the eye:
By 2010, we’d thrown in the towel, and decided to just accept that this aspect of the universe was too complicated to predict. The story now is that feedback changes the shape of the luminosity function at both the faint and the bright ends. Exactly how depends on who you ask, but the predicted halo mass function is sacrosanct so there must be physical processes that make it so. (This is an example of the Frenk Principle in action.)
Lacking a predictive theory, theorists instead came up with a clever trick to relate galaxies to their dark matter halos. This has come to be known as abundance matching. We measure the number density of galaxies as a function of stellar mass. We know, from theory, what the number density of dark matter halos should be as a function of halo mass. Then we match them up: galaxies of a given density live in halos of the corresponding density, as illustrated by the horizontal gray lines in the right panel of the figure above.
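The matching step itself is simple enough to sketch in a few lines. What follows is a toy version under loud assumptions: the galaxy mass function is a Schechter form and the halo mass function is a power law with a cutoff, both with made-up parameters, and every function name is hypothetical. It is meant to show the bookkeeping, not to reproduce any published relation.

```python
import math
from bisect import bisect_left

def cumulative(log_m, dn_dlogm):
    """Number density above each mass, from dn/dlog10(M) on a uniform log grid."""
    step = log_m[1] - log_m[0]
    total, out = 0.0, []
    for phi in reversed(dn_dlogm):
        total += phi * step
        out.append(total)
    return out[::-1]

def abundance_match(log_mstar, phi_gal, log_mhalo, phi_halo):
    """Give each stellar mass the halo mass with the same cumulative density."""
    n_gal = cumulative(log_mstar, phi_gal)
    n_halo = cumulative(log_mhalo, phi_halo)
    # Reverse so that density increases along the list, as bisect requires.
    n_up, m_up = n_halo[::-1], log_mhalo[::-1]
    matched = []
    for n in n_gal:
        i = min(max(bisect_left(n_up, n), 1), len(n_up) - 1)
        # Interpolate log(M_halo) in log(n); clamp outside the tabulated range.
        x0, x1 = math.log10(n_up[i - 1]), math.log10(n_up[i])
        t = min(max((math.log10(n) - x0) / (x1 - x0), 0.0), 1.0)
        matched.append(m_up[i - 1] + t * (m_up[i] - m_up[i - 1]))
    return matched

# Toy inputs: assumed shapes and numbers, for illustration only.
log_mstar = [8.0 + 0.02 * i for i in range(201)]  # 10^8 to 10^12 Msun
log_mhalo = [9.5 + 0.02 * i for i in range(276)]  # 10^9.5 to 10^15 Msun

def toy_galaxy_phi(log_m, phi_star=4e-3, log_knee=10.7, alpha=-1.3):
    x = 10 ** (log_m - log_knee)
    return math.log(10) * phi_star * x ** (alpha + 1) * math.exp(-x)

def toy_halo_phi(log_m, norm=1e-2, slope=-0.9, log_cut=14.5):
    return norm * 10 ** (slope * (log_m - 12.0)) * math.exp(-10 ** (log_m - log_cut))

log_mh = abundance_match(
    log_mstar, [toy_galaxy_phi(m) for m in log_mstar],
    log_mhalo, [toy_halo_phi(m) for m in log_mhalo],
)
for i in (0, 50, 100, 135, 175):
    print(f"log M* = {log_mstar[i]:.1f}  ->  log M200 = {log_mh[i]:.2f}")
```

Note what the procedure does not do: it never checks whether the assigned halo mass agrees with any dynamical measurement. The one-to-one rank ordering is the whole assumption.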
There have now been a number of efforts to quantify this. Four examples are given in the figure below (see this paper for references), together with kinematic mass estimates.
The abundance matching relations have a peak around a halo mass of 10^12 M☉ and fall off to either side. This corresponds to the knee in the galaxy luminosity function. For whatever reason, halos of this mass seem to be most efficient at converting their available baryons into stars. The shape of these relations means that there is a non-linear relation between stellar mass and halo mass. At the low mass end, a big range in stellar mass is compressed into a small range in halo mass. The opposite happens at high mass, where the most massive galaxies are generally presumed to be the “central” galaxy of a cluster of galaxies. We assign the most massive halos to big galaxies understanding that they may be surrounded by many subhalos, each containing a cluster galaxy.
Around the same time, I made a similar plot, but using kinematic measurements to estimate halo masses. Both methods are fraught with potential systematics, but they seem to agree reasonably well – at least over the range illustrated above. It gets dodgy above and below that. The agreement is particularly good for lower mass galaxies. There seems to be a departure for the most massive individual galaxies, but why worry about that when the glass is 3/4 full?
Skip ahead a decade, and some people think we’ve solved the missing satellite problem. One key ingredient of that solution is that the Milky Way resides in a halo that is on the lower end of the mass range that has traditionally been estimated for it (1 to 2 x 10^12 M☉). This helps because the number of subhalos scales with mass: clusters are big halos with lots of galaxy-size halos; the Milky Way is a galaxy-sized halo with lots of smaller subhalos. Reality does not look like that, but having a lower mass means fewer subhalos, so that helps. It does not suffice. We must invoke feedback effects to make the relation between light and mass nonlinear. Then the lowest mass satellites may be too dim to detect: selection effects have to do a lot of work. It also helps to assume the distribution of satellites is isotropic, which looks to be true in the simulation, but not so much in reality where known dwarf satellites occupy a planar distribution. We also need to somehow fudge the too-big-to-fail problem, in which the more massive subhalos appear not to be occupied by luminous galaxies at all. Given all that, we can kinda sorta get in the right ballpark. Kinda, sorta, provided that we live in a galaxy whose halo mass is closer to 10^12 M☉ than to 2 x 10^12 M☉.
At an IAU meeting in Shanghai (in July 2019, before travel restrictions), the subject of the mass of the Milky Way was discussed at length. It being our home galaxy, there are many ways in which to constrain the mass, some of which take advantage of tracers that go out to greater distances than we can obtain elsewhere. Speaker after speaker used different methods to come to a similar conclusion, with the consensus hedging on the low side (roughly 1 – 1.5 x 10^12 M☉). A nice consequence would be that the missing satellite problem may no longer be a problem.
Galaxies in general and the Milky Way in particular are different and largely distinct subfields. Different data studied by different people with distinctive cultures. In the discussion at the end of the session, Pieter van Dokkum pointed out that from the perspective of other galaxies, the halo mass ought to follow from abundance matching, which for a galaxy like the Milky Way ought to be more like 3 x 10^12 M☉, considerably more than anyone had suggested, but hard to exclude because most of that mass could be at distances beyond the reach of the available tracers.
This was not well received.
The session was followed by a coffee break, and I happened to find myself standing in line next to Pieter. I was still processing his comment, and decided he was right – from a certain point of view. So we got to talking about it, and wound up making the plot below, which appears in a short research note. (For those who know the field, it might be assumed that Pieter and I hate each other. This is not true, but we do frequently disagree, so the fact that we do agree about this is itself worthy of note.)
The Milky Way and Andromeda are the 10^12 M☉ gorillas of the Local Group. There are many dozens of dwarf galaxies, but none of them are comparable in mass, even with the boost provided by the non-linear relation between mass and luminosity. To astronomical accuracy, in terms of mass, the Milky Way plus Andromeda are the Local Group. There are many distinct constraints, on each galaxy as an individual, and on the Local Group as a whole. Any way we slice it, all three entities lie well off the relation expected from abundance matching.
There are several ways one could take it from here. One might suppose that abundance matching is correct, and we have underestimated the mass with other measurements. This happens all the time with rotation curves, which typically do not extend far enough out into the halo to give a good constraint on the total mass. This is hard to maintain for the Local Group, where we have lots of tracers in the form of dwarf satellites, and there are constraints on the motions of galaxies on still larger scales. Moreover, a high mass would be tragic for the missing satellite problem.
One might instead imagine that there is some scatter in the abundance matching relation, and we just happen to live in a galaxy that has a somewhat low mass for its luminosity. This is almost reasonable for the Milky Way, as there is some overlap between kinematic mass estimates and the expectations of abundance matching. But the missing satellite problem bites again unless we are pretty far off the central value of the abundance matching relation. Other Milky Way-like galaxies ought to fall on the other end of the spectrum, with more mass and more satellites. A lot of work is going on to look for satellites around other spirals, which is hard work (see NGC 6946 above). There is certainly scatter in the number of satellites from system to system, but whether this is theoretically sensible or enough to explain our Milky Way is not yet apparent.
There is a tendency in the literature to invoke scatter when and where needed. Here, it is important to bear in mind that there is little scatter in the Tully-Fisher relation. This is a relation between stellar mass and rotation velocity, with the latter supposedly set by the halo mass. We can’t have it both ways. Lots of scatter in the stellar mass-halo mass relation ought to cause a corresponding amount of scatter in Tully-Fisher. This is not observed. It is a much stronger constraint than most people seem to appreciate, as even subtle effects are readily perceptible. Consequently, I think it unlikely that we can nuance the relation between halo mass and observed rotation speed to satisfy both relations without a lot of fine-tuning, which is usually a sign that something is wrong.
A lot of effort has been put into beating down the missing satellite problem around the Milky Way. Matters are worse for Andromeda. Kinematic halo mass estimates are typically in the same ballpark as the Milky Way. Some are a bit bigger, some are lower. Lower is a surprise, because the stellar mass of M31 is clearly bigger than that of the Milky Way, placing it above the turnover where the efficiency of star formation is maximized. In this regime, a little stellar mass goes a long way in terms of halo mass. Abundance matching predicts that a galaxy of Andromeda’s stellar mass should reside in a dark matter halo of at least 10^13 M☉. That’s quite a bit more than 1 or 2 x 10^12 M☉, even by astronomical standards. Put another way, according to abundance matching, the Local Group should have the Milky Way as its most massive occupant. Just the Milky Way. Not the Milky Way plus Andromeda. Despite this, the Local Group is not anomalous among similar groups.
Words matter. A lot boils down to what we consider to be “close enough” to call similar. I do not consider the Milky Way and Andromeda to be all that similar. They are both giant spirals, yes, but galaxies are all individuals. Being composed of hundreds of billions of stars, give or take, leaves a lot of room for differences. In this case, the Milky Way and Andromeda are easily distinguished in the Tully-Fisher plane. Andromeda is about twice the baryonic mass of the Milky Way. It also rotates faster. The error bars on these quantities do not come close to overlapping – that would be one criterion for considering them to be similar – a criterion they do not meet. Even then, there could be other features that might be readily distinguished, but let’s say a rough equality in the Tully-Fisher plane would indicate stellar and halo masses that are “close enough” for our present discussion. They aren’t: to me, the Milky Way and M31 are clearly different galaxies.
I spent a fair amount of time reading the recent literature on satellite searches, and I was struck by the ubiquity with which people make the opposite assumption, treating the Milky Way and Andromeda as interchangeable galaxies of similar mass. Why would they do this? If one looks at the kinematic halo mass as the defining characteristic of a galaxy, they’re both close to 10^12 M☉, with overlapping error bars on M200. By that standard, it seems fair. Is it?
Luminosity is observable. Rotation speed is observable. There are arguments to be had about how to convert luminosity into stellar mass, and what rotation speed measure is “best.” These are sometimes big arguments, but they are tiny in scale compared to estimating notional quantities like the halo mass. The mass M200 is not an observable quantity. As such, we have no business using it as a defining characteristic of a galaxy. You know a galaxy when you see it. The same cannot be said of a dark matter halo. Literally.
If, for some theoretically motivated reason, we want to use halo mass as a standard then we need to at least use a consistent method to assess its value from directly observable quantities. The methods we use for the Milky Way and M31 are not applicable beyond the Local Group. Nowhere else in the universe do we have such an intimate picture of the kinematic mass from a wide array of independent methods with tracers extending to such large radii. There are other standards we could apply, like the Tully-Fisher relation. That we can do outside the Local Group, but by that standard we would not infer that M31 and the Milky Way are the same. Other observables we can fairly apply to other galaxies are their luminosities (stellar masses) and cosmic number densities (abundance matching). From that perspective, what we know from all the other galaxies in the universe is that the factor of ~2 difference in stellar mass between Andromeda and the Milky Way should be huge in terms of halo mass. If it were anywhere else in the universe, we wouldn’t treat these two galaxies as interchangeably equal. This is the essence of Pieter’s insight: abundance matching is all about the abundance of dark matter halos, so that would seem to be the appropriate metric by which to predict the expected number of satellites, not the kinematic halo mass that we can’t measure in the same way anywhere else in the universe.
That isn’t to say we don’t have some handle on kinematic halo masses, it’s just that most of that information comes from rotation curves that don’t typically extend as far as the tracers that we have in the Local Group. Some rotation curves are more extended than others, so one has to account for that variation. Typically, we can only put a lower limit on the halo mass, but if we assume a profile like NFW – the standard thing to do in LCDM – then we can sometimes exclude halos that are too massive.
Abundance matching has become important enough to LCDM that we included it as a prior in fitting dark matter halo models to rotation curves. For example:
NFW halos are self-similar: low mass halos look very much like high mass halos over the range that is constrained by data. Consequently, if you have some idea what the total mass of the halo should be, as abundance matching provides, and you impose that as a prior, the fits for most galaxies say “OK.” The data covering the visible galaxy have little power to constrain what is going on with the dark matter halo at much larger radii, so the fits literally fall into line when told to do so, as seen in Pengfei’s work.
That we can impose abundance matching as a prior does not necessarily mean the result is reasonable. The highest halo masses that abundance matching wants in the plot above are crazy talk from a kinematic perspective. I didn’t put too much stock in this, as the NFW halo itself, the go-to standard of LCDM, provides the worst description of the data among all the dozen or so halo models that we considered. Still, we did notice that even with abundance matching imposed as a prior, there are a lot more points above the line than below it at the high mass end (above the bend in the figure above). The rotation curves are sometimes pushing back against the imposed prior; they often don’t want such a high halo mass. This was explored in some detail by Posti et al., who found a similar effect.
I decided to turn the question around. Can we use abundance matching to predict the halo and hence rotation curve of a massive galaxy? The largest spiral in the local universe, UGC 2885, has one of the most extended rotation curves known, meaning that it does provide some constraint on the halo mass. This galaxy has been known as an important case since Vera Rubin’s work in the ’70s. With a modern distance scale, its rotation curve extends out 80 kpc. That’s over a quarter million light-years – a damn long way, even by the standards of galaxies. It also rotates remarkably fast, just shy of 300 km/s. It is big and massive.
(As an aside, Vera once offered a prize for anyone who found a disk that rotated faster than 300 km/s. Throughout her years of looking at hundreds of galaxies, UGC 2885 remained the record holder, with 300 seeming to be a threshold that spirals did not exceed. She told me that she did pay out, but on a technicality: someone showed her a gas disk around a supermassive black hole in Keplerian rotation that went up to 500 km/s at its peak. She lamented that she had been imprecise in her language, as that was nothing like what she meant, which was the flat rotation speed of a spiral galaxy.)
That aside aside, if we take abundance matching at face value, then the stellar mass of a galaxy predicts the mass of its dark matter halo. Using the most conservative (in that it returns the lowest halo mass) of the various abundance matching relations indicates that with a stellar mass of about 2 x 10^11 M☉, UGC 2885 should have a halo mass of 3 x 10^13 M☉. Combining this with a well-known relation between halo concentration and mass for NFW halos, we then know what the rotation curve should be. Doing this for UGC 2885 yields a tragic result:
The data do not allow for the predicted amount of dark matter. If we fit the rotation curve, we obtain a “mere” M200 = 5 x 10^12 M☉. Note that this means that UGC 2885 is basically the Milky Way and Andromeda added together in terms of both stellar mass and halo mass – if added to the M*-M200 plot above, it would land very close to the open circle representing the more massive halo estimate for the combination of MW+M31, and be just as discrepant from the abundance matching relations. We get the same result regardless of which direction we look at it from.
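To get a feel for the size of the mismatch, here is a back-of-the-envelope sketch of the circular speed contributed by an NFW halo alone at 80 kpc, for the abundance-matched mass and for the fitted mass. The concentrations (6 and 8) are illustrative guesses of mine rather than values from any particular concentration-mass relation, the Hubble constant is assumed to be 70 km/s/Mpc, and the baryons are left out entirely, so this is not the calculation behind the figure.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun

def nfw_velocity(r_kpc, m200, conc, h0=70.0):
    """Circular speed (km/s) of an NFW halo at radius r_kpc.

    m200 is in Msun and conc is the concentration r200/r_s. The Hubble
    constant h0 (km/s/Mpc) is an assumed value.
    """
    rho_crit = 3.0 * (h0 / 1000.0) ** 2 / (8.0 * math.pi * G)  # Msun/kpc^3
    r200 = (3.0 * m200 / (4.0 * math.pi * 200.0 * rho_crit)) ** (1.0 / 3.0)
    v200 = math.sqrt(G * m200 / r200)

    def mass_shape(y):
        # Dimensionless enclosed-mass profile of an NFW halo.
        return math.log(1.0 + y) - y / (1.0 + y)

    x = r_kpc / r200
    return v200 * math.sqrt(mass_shape(conc * x) / (x * mass_shape(conc)))

# Halo-only speeds at the last measured radius of UGC 2885 (80 kpc).
# Both concentrations are illustrative guesses, not published values.
v_matched = nfw_velocity(80.0, 3e13, conc=6.0)
v_fitted = nfw_velocity(80.0, 5e12, conc=8.0)
print(f"abundance-matched halo: {v_matched:.0f} km/s")
print(f"fitted halo:            {v_fitted:.0f} km/s")
```

Under these assumptions the abundance-matched halo by itself already spins well above the observed rotation speed of just under 300 km/s, before any stars or gas are added, while the lower-mass halo leaves room for them.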
Objectively, 5 x 10^12 M☉ is a huge dark matter halo for a single galaxy. It’s just not the yet-more massive halo that is predicted by abundance matching. In this context, UGC 2885 apparently has a serious missing satellites problem, as it does not appear to be swimming in a sea of satellite galaxies the way we’d expect for the central galaxy of such a high mass halo.
It is tempting to write this off as a curious anecdote. Another outlier. Sure, that’s always possible, but this is more than a bit ridiculous. Anyone who wants to go this route I refer to Snoop Dogg.
I spent much of my early career obsessed with selection effects. These preclude us from seeing low surface brightness galaxies as readily as brighter ones. However, it isn’t binary – a galaxy has to be extraordinarily low surface brightness before it becomes effectively invisible. The selection effect is a bias – and a very strong one – but not an absolute screen that prevents us from finding low surface brightness galaxies. That makes it very hard to sustain the popular notion that there are lots of subhalos that simply contain ultradiffuse galaxies that cannot currently be seen. I’ve been down this road many times as an optimist in favor of this interpretation. It hasn’t worked out. Selection effects are huge, but still nowhere near big enough to overcome the required deficit.
Having the satellite galaxies that inhabit subhalos be low in surface brightness is a necessary but not sufficient criterion. It is also necessary to have a highly non-linear stellar mass-halo mass relation at low mass. In effect, luminosity and halo mass become decoupled: satellite galaxies spanning a vast range in luminosity must live in dark matter halos that cover only a tiny range. This means that it should not be possible to predict stellar motions in these galaxies from their luminosity. The relation between mass and light has just become too weak and messy.
A subject of long-standing interest in extragalactic astronomy is how stars form in galaxies. Some galaxies are “red and dead” – most of their stars formed long ago, and have evolved as stars will: the massive stars live bright but short lives, leaving the less massive ones to linger longer, producing relatively little light until they swell up to become red giants as they too near the end of their lives. Other galaxies, including our own Milky Way, made some stars in the ancient past and are still actively forming stars today. So what’s the difference?
The difference between star forming galaxies and those that are red and dead turns out to be both simple and complicated. For one, star forming galaxies have a supply of cold gas in their interstellar media, the fuel from which stars form. Dead galaxies have very little in the way of cold gas. So that’s simple: star forming galaxies have the fuel to make stars, dead galaxies don’t. But why that difference? That’s a more complicated question I’m not going to begin to touch in this post.
One can see current star formation in galaxies in a variety of ways. These usually relate to the ultraviolet (UV) photons produced by short-lived stars. Only O stars are hot enough to produce the ionizing radiation that powers the emission of HII (pronounced ‘H-two’) regions – regions of ionized gas that are like cosmic neon lights. O stars power HII regions but live less than 10 million years. That’s a blink of the eye on the cosmic timescale, so if you see HII regions, you know stars have formed recently enough that the short-lived O stars are still around.
Measuring the intensity of the Hα Balmer line emission provides a proxy for the number of UV photons that ionize the gas, which in turn basically counts the number of O stars that produce the ionizing radiation. This number, divided by the short life-spans of O stars, measures the current star formation rate (SFR).
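As a rough illustration of that chain of reasoning, here is a minimal sketch that turns an Hα flux into a star formation rate. It assumes the widely quoted Kennicutt (1998) calibration, SFR = 7.9 × 10^-42 L(Hα) with L in erg/s, which presumes a Salpeter mass spectrum and no dust correction; that choice, the function names, and the example flux and distance are assumptions for illustration, not necessarily what was used for the data discussed here.

```python
import math

MPC_IN_CM = 3.0857e24   # centimetres per megaparsec
KENNICUTT98 = 7.9e-42   # (Msun/yr) per (erg/s); assumed calibration (Salpeter IMF)


def halpha_luminosity(flux_cgs, distance_mpc):
    """H-alpha luminosity in erg/s from flux (erg/s/cm^2) and distance (Mpc)."""
    d_cm = distance_mpc * MPC_IN_CM
    return 4.0 * math.pi * d_cm ** 2 * flux_cgs


def sfr_from_halpha(luminosity_cgs):
    """Star formation rate in Msun/yr for an H-alpha luminosity in erg/s."""
    return KENNICUTT98 * luminosity_cgs


if __name__ == "__main__":
    # Hypothetical galaxy: flux 1e-13 erg/s/cm^2 at 10 Mpc.
    lum = halpha_luminosity(1e-13, 10.0)
    print(f"L(Halpha) = {lum:.3e} erg/s, SFR = {sfr_from_halpha(lum):.4f} Msun/yr")
```

Every factor in the calibration constant hides one of the uncertainties listed next, so the output should be read as good to a factor of a few at best.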
There are many uncertainties in the calibration of this SFR: how many UV photons do O stars emit? Over what time span? How many of these ionizing photons are converted into Hα, and how many are absorbed by dust or manage to escape into intergalactic space? For every O star that comes and goes, how many smaller stars are born along with it? This latter question is especially pernicious, as most stellar mass resides in small stars. The O stars are only the tip of the iceberg; we are using the tip to extrapolate the size of the entire iceberg.
Astronomers have obsessed over these and related questions for a long time. See, for example, the review by Kennicutt & Evans. Suffice it to say we have a surprisingly decent handle on it, and yet the systematic uncertainties remain substantial. Different methods give the same answer to within an order of magnitude, but often differ by a factor of a few. The difference is often in the mass spectrum of stars that is assumed, but even rationalizing that to the same scale, the same data can be interpreted to give different answers, based on how much UV we estimate to be absorbed by dust.
In addition to the current SFR, one can also measure the stellar mass. This follows from the total luminosity measured from starlight. Many of the same concerns apply, but are somewhat less severe because more of the iceberg is being measured. For a long time we weren’t sure we could do better than a factor of two, but this work has advanced to the point where the integrated stellar masses of galaxies can be estimated to ~20% accuracy.
A diagram that has become popular in the last decade or so is the so-called star forming main sequence. This name is made in analogy with the main sequence of stars, the physics of which is well understood. Whether this is an appropriate analogy is debatable, but the terminology seems to have stuck. In the case of galaxies, the main sequence of star forming galaxies is a plot of star formation rate against stellar mass.
The star forming main sequence is shown in the graph below. It is constructed from data from the SINGS survey (red points) and our own work on dwarf low surface brightness (LSB) galaxies (blue points). Each point represents one galaxy. Its stellar mass is determined by adding up the light emitted by all the stars, while the SFR is estimated from the Hα emission that traces the ionizing UV radiation of the O stars.
The data show a nice correlation, albeit with plenty of intrinsic scatter. This is hardly surprising, as the two axes are not physically independent. They are measuring different quantities that trace the same underlying property: star formation over different time scales. The y-axis is a measure of the quasi-instantaneous star formation rate; the x-axis is the SFR integrated over the age of the galaxy.
Since the stellar mass is the time integral of the SFR, one expects the slope of the star forming main sequence (SFMS) to be one. This is illustrated by the diagonal line marked “Hubble time.” A galaxy forming stars at a constant rate for the age of the universe will fall on this line.
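The arithmetic behind the “Hubble time” line is short enough to sketch. The snippet below assumes a Hubble time of 13.8 Gyr (an assumed round value) and ignores mass returned to the interstellar medium by dying stars; the function names are illustrative.

```python
import math

T_HUBBLE_YR = 13.8e9  # assumed age scale in years


def sfr_on_line(stellar_mass, fraction=1.0):
    """Constant SFR (Msun/yr) that builds stellar_mass (Msun) in fraction * T_HUBBLE_YR."""
    return stellar_mass / (fraction * T_HUBBLE_YR)


def loglog_slope(mass_lo, mass_hi, fraction=1.0):
    """Slope of log(SFR) against log(M*) along a constant-formation-time line."""
    d_log_sfr = math.log10(sfr_on_line(mass_hi, fraction)) - math.log10(sfr_on_line(mass_lo, fraction))
    d_log_mass = math.log10(mass_hi) - math.log10(mass_lo)
    return d_log_sfr / d_log_mass


if __name__ == "__main__":
    for frac in (1.0, 0.1, 0.01):
        print(f"fraction {frac}: SFR for 1e8 Msun = {sfr_on_line(1e8, frac):.4f} Msun/yr, "
              f"slope = {loglog_slope(1e6, 1e10, frac):.3f}")
```

Changing the formation time only shifts the line up or down; the slope stays at one because SFR is simply mass divided by a fixed time.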
The data for LSB galaxies scatter about a line with slope unity. The best-fit line has a normalization a bit less than that of a constant SFR for a Hubble time. This might mean that the galaxies are somewhat younger than the universe (a little must be true, but need not be much), have a slowly declining SFR (an exponential decline with an e-folding time of a Hubble time works well), or it could just be an error in the calibration of one or both axes. The systematic errors discussed above are easily large enough to account for the difference.
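To put a number on the declining-SFR option, here is a small sketch under simple assumptions: the SFR falls as exp(-t/τ) from the start, the galaxy is as old as the universe, and no stellar mass is returned to the gas. The ratio of the current SFR to the past average then depends only on x = T/τ.

```python
import math


def current_to_average_sfr(age_over_tau):
    """Ratio of today's SFR to the lifetime average for SFR(t) = SFR0 * exp(-t / tau).

    Stellar mass formed is SFR0 * tau * (1 - exp(-x)) with x = T / tau, so the
    past average is that mass divided by T, and the ratio is x * exp(-x) / (1 - exp(-x)).
    """
    x = age_over_tau
    return x * math.exp(-x) / (-math.expm1(-x))


if __name__ == "__main__":
    ratio = current_to_average_sfr(1.0)  # e-folding time equal to the age
    print(f"current / average SFR = {ratio:.3f} ({math.log10(ratio):+.2f} dex)")
```

For an e-folding time equal to the age, the current SFR sits at a bit under 60% of the past average, a modest offset below the constant-SFR line and well within the calibration uncertainties.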
To first order, the SFR in LSB galaxies is constant when averaged over billions of years. On the timescale of millions of years appropriate to O stars, the instantaneous SFR bounces up and down. It looks pretty stochastic: galaxies form stars at a steady average rate that varies up and down on short timescales.
Short-term fluctuations in the SFR explain the data with current SFR higher than the past average. These are the points that stray into the gray region of the plot, which becomes increasingly forbidden towards the top left. This is because galaxies that form stars so fast for too long will build up their entire stellar mass in the blink of a cosmic eye. This is illustrated by the lines marked as 0.1 and 0.01 of a Hubble time. A galaxy above these lines would make all its stars in < 2 Gyr; it would have had to be born yesterday. No galaxies reside in this part of the diagram. Those that approach it are called “starbursts:” they’re forming stars at a high specific rate (relative to their mass) but this is presumably a brief-lived phenomenon.
Note that the most massive of the SINGS galaxies all fall below the extrapolation of the line fit to the LSB galaxies (dotted line). They are forming a lot of stars in an absolute sense, simply because they are giant galaxies. But the current SFR is lower than the past average, as if they were winding down. This “quenching” seems to be a mass-dependent phenomenon: more massive galaxies evolve faster, burning through their gas supply before dwarfs do. Red and dead galaxies have already completed this process; the massive spirals of today are weary giants that may join the red and dead galaxy population in the future.
One consequence of mass-dependent quenching is that it skews attempts to fit relations to the SFMS. There are very many such attempts in the literature; these usually have a slope less than one. The dashed line in the plot above gives one specific example. There are many others.
If one looks only at the most massive SINGS galaxies, the slope is indeed shallower than one. Selection effects bias galaxy catalogs strongly in favor of the biggest and brightest, so most work has been done on massive galaxies with M* > 10^10 M☉. That only covers the top one tenth of the area of this graph. If that’s what you’ve got to work with, you get a shallow slope like the dashed line.
The dashed line does a lousy job of extrapolating to low mass. This is obvious from the dwarf galaxy data. It is also obvious from the simple mathematical considerations outlined above. Low mass galaxies could only fall on the dashed line if they were born yesterday. Otherwise, their high specific star formation rates would over-produce their observed stellar mass.
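A quick way to see the problem is to compute how long a galaxy would need to build its observed stellar mass at the SFR a shallow fit assigns it. The fit below is hypothetical: a slope of 0.6 normalized to 1 M☉/yr at 10^10 M☉, chosen only to illustrate the behavior and not taken from any particular paper or from the dashed line in the plot. The 13.8 Gyr Hubble time is likewise an assumed round number.

```python
import math

T_HUBBLE_YR = 13.8e9        # assumed age scale in years
HYP_SLOPE = 0.6             # hypothetical shallow slope (illustration only)
HYP_SFR_AT_PIVOT = 1.0      # hypothetical SFR (Msun/yr) at the pivot mass
PIVOT_MASS = 1e10           # pivot stellar mass in Msun


def shallow_fit_sfr(stellar_mass):
    """SFR (Msun/yr) predicted by the hypothetical shallow power-law fit."""
    return HYP_SFR_AT_PIVOT * 10 ** (HYP_SLOPE * (math.log10(stellar_mass) - math.log10(PIVOT_MASS)))


def buildup_time_yr(stellar_mass):
    """Years needed to form stellar_mass at the fitted SFR held constant."""
    return stellar_mass / shallow_fit_sfr(stellar_mass)


if __name__ == "__main__":
    for mass in (1e10, 1e8, 1e6):
        t = buildup_time_yr(mass)
        print(f"M* = {mass:.0e} Msun: build-up time = {t:.2e} yr = {t / T_HUBBLE_YR:.3f} Hubble times")
```

With any slope below one, the build-up time shrinks steadily toward low mass, so the extrapolated line inevitably crosses into the zone where galaxies would have to be implausibly young.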
Despite this simple physical limit, fits to the SFMS that stray into the forbidden zone are ubiquitous in the literature. In addition to selection effects, I suspect the calibrations of both SFR and stellar mass are in part to blame. Galaxies will stray into the forbidden zone if the stellar mass is underestimated or the SFR is overestimated, or some combination of the two. Probably both are going on at some level. I suspect the larger problem is in the SFR. In particular, it appears that many measurements of the SFR have been over-corrected for the effects of dust. Such a correction certainly has to be made, but since extinction corrections are exponential, it is easy to over-do. Indeed, I suspect this is why the dashed line overshoots even the bright galaxies from SINGS.
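The sensitivity to dust is easy to quantify. Extinction is measured in magnitudes, so a correction of A magnitudes multiplies the inferred SFR by 10^(0.4 A). The sketch below shows how an error in the adopted extinction propagates; the example values are illustrative, not measurements.

```python
def dust_correction_factor(extinction_mag):
    """Multiplicative factor applied to an observed SFR for extinction_mag magnitudes of dust."""
    return 10 ** (0.4 * extinction_mag)


def overcorrection_factor(assumed_mag, true_mag):
    """Factor by which the SFR is overestimated if assumed_mag is used instead of true_mag."""
    return dust_correction_factor(assumed_mag) / dust_correction_factor(true_mag)


if __name__ == "__main__":
    for extra in (0.25, 0.5, 1.0):
        print(f"over-correcting by {extra} mag inflates the SFR by x{overcorrection_factor(1.0 + extra, 1.0):.2f}")
```

Half a magnitude too much correction already inflates the SFR by nearly 60%, and a full magnitude by a factor of 2.5.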
This brings us back to the terminology of the main sequence. Among stars, the main sequence is defined by low mass stars that evolve slowly. There is a turn-off point, and an associated mass, where stars transition from the main sequence to the sub giant branch. They then ascend the red giant branch as they evolve.
If we project this terminology onto galaxies, the main sequence should be defined by the low mass dwarfs. These are nowhere near to exhausting their gas supplies, so can continue to form stars far into the future. They establish a star forming main sequence of slope unity because that’s what the math says they must do.
Most of the literature on this subject refers to massive star forming galaxies. These are not the main sequence. They are the turn-off population. Massive spirals are near to exhausting their gas supply. Star formation is winding down as the fuel runs out.
Red and dead galaxies are the next stage, once star formation has stopped entirely. I suppose these are the red giants in this strained analogy to individual stars. That is appropriate insofar as most of the light from red and dead galaxies is produced by red giant stars. But is this really the right way to think about it? Or are we letting our terminology get the best of us?
The week of June 5, 2017, we held a workshop on dwarf galaxies and the dark matter problem. The workshop was attended by many leaders in the field – giants of dwarf galaxy research. It was held on the campus of Case Western Reserve University and supported by the John Templeton Foundation. It resulted in many fascinating discussions which I can’t possibly begin to share in full here, but I’ll say a few words.
Dwarf galaxies are among the most dark matter dominated objects in the universe. Or, stated more properly, they exhibit the largest mass discrepancies. This makes them great places to test theories of dark matter and modified gravity. By the end, we had come up with a few important tests for both ΛCDM and MOND. A few of these we managed to put on a white board. These are hardly a complete list, but provide a basis for discussion.
UFDs in field: Over the past few years, a number of extremely tiny dwarf galaxies have been identified as satellites of the Milky Way galaxy. These “ultrafaint dwarfs” are vaguely defined as being fainter than 100,000 solar luminosities, with the smallest examples having only a few hundred stars. This is absurdly small by galactic standards, having the stellar content of individual star clusters within the Milky Way. Indeed, it is not obvious to me that all of the ultrafaint dwarfs deserve to be recognized as dwarf galaxies, as some may merely be fragmentary portions of the Galactic stellar halo composed of stars coincident in phase space. Nevertheless, many may well be stellar systems external to the Milky Way that orbit it as dwarf satellites.
That multitudes of minuscule dark matter halos exist is a fundamental prediction of the ΛCDM cosmogony. These should often contain ultrafaint dwarf galaxies, and not only as satellites of giant galaxies like the Milky Way. Indeed, one expects to see many ultrafaints in the “field” beyond the orbital vicinity of the Milky Way where we have found them so far. These are predicted to exist in great numbers, and contain uniformly old stars. The “old stars” portion of the prediction stems from the reionization of the universe impeding star formation in the smallest dark matter halos. Upcoming surveys like LSST should provide a test of this prediction.
From an empirical perspective, I do expect that we will continue to discover galaxies of ever lower luminosity and surface brightness. In the field, I expect that these will be predominantly gas rich dwarfs like Leo P rather than gas-free, old stellar systems like the satellite ultrafaints. My expectation is an extrapolation of past experience, not a theory-specific prediction.
No Large Cores: Many of the simulators present at the workshop showed that if the energy released by supernovae was well directed, it could reshape the steep (‘cuspy’) interior density profiles of dark matter halos into something more like the shallow (‘cored’) interiors that are favored by data. I highlight the if because I remain skeptical that supernova energy couples as strongly as required and assumed (basically 100%). Even assuming favorable feedback, there seemed to be broad (if not unanimous) consensus among the simulators present that at sufficiently low masses, not enough stars would form to produce the requisite energy. Consequently, low mass halos should not have shallow cores, but instead retain their primordial density cusps. Hence clear measurement of a large core in a low mass dwarf galaxy (stellar mass < 1 million solar masses) would be a serious problem. Unfortunately, I’m not clear that we quantified “large,” but something more than a few hundred parsecs should qualify.
Radial Orbit for Crater 2: Several speakers highlighted the importance of the recently discovered dwarf satellite Crater 2. This object has a velocity dispersion that is unexpectedly low in ΛCDM, but was predicted by MOND. The “fix” in ΛCDM is to imagine that Crater 2 has suffered a large amount of tidal stripping by a close passage of the Milky Way. Hence it is predicted to be on a radial orbit (one that basically just plunges in and out). This can be tested by measuring the proper motion of its stars with Hubble Space Telescope, for which there exists a recently approved program.
DM Substructures: As noted above, there must exist numerous low mass dark matter halos in the cold dark matter cosmogony. These may be detected as substructure in the halos of larger galaxies by means of their gravitational lensing even if they do not contain dwarf galaxies. Basically, a lumpy dark matter halo bends light in subtly but detectably different ways from a smooth halo.
No Wide Binaries in UFDs: As a consequence of dynamical friction against the background dark matter, binary stars cannot remain at large separations over a Hubble time: their orbits should decay. In the absence of dark matter, this should not happen (it cannot if there is nowhere for the orbital energy to go, like into dark matter particles). Thus the detection of a population of widely separated binary stars would be problematic. Indeed, Pavel Kroupa argued that the apparent absence of strong dynamical friction already excludes particle dark matter as it is usually imagined.
Short dynamical times/common mergers: This is related to dynamical friction. In the hierarchical cosmogony of cold dark matter, mergers of halos (and the galaxies they contain) must be frequent and rapid. Dark matter halos are dynamically sticky, soaking up the orbital energy and angular momentum between colliding galaxies to allow them to stick and merge. Such mergers should go to completion on fairly short timescales (a mere few hundred million years).
A few distinctive predictions for MOND were also identified.
Tangential Orbit for Crater 2: In contrast to ΛCDM, we expect that the ‘feeble giant’ Crater 2 could not survive a close encounter with the Milky Way. Even at its rather large distance of 120 kpc from the Milky Way, it is so feeble that it is not immune from the external field of its giant host. Consequently, we expect that Crater 2 must be on a more nearly circular orbit, and not on a radial orbit as suggested in ΛCDM. The orbit does not need to be perfectly circular of course, but it should be more tangential than radial.
This provides a nice test that distinguishes between the two theories. Either the orbit of Crater 2 is more radial or more tangential. Bear in mind that Crater 2 already constitutes a problem for ΛCDM. What we’re discussing here is how to close what is basically a loophole whereby we can excuse an otherwise unanticipated result in ΛCDM.
I believe the question mark was added on the white board to permit the logical if unlikely possibility that one could write a MOND theory with an undetectably small EFE.
Position of UFDs on RAR: We chose to avoid making the radial acceleration relation (RAR) a focus of the meeting – there was quite enough to talk about as it was – but it certainly came up. The ultrafaint dwarfs sit “too high” on the RAR, an apparent problem for MOND. Indeed, when I first worked on this subject with Joe Wolf, I initially thought this was a fatal problem for MOND.
My initial thought was wrong. This is not a problem for MOND. The RAR applies to systems in dynamical equilibrium. There is a criterion in MOND to check whether this essential condition may be satisfied. Basically all of the ultrafaints flunk this test. There is no reason to think they are in dynamical equilibrium, so no reason to expect that they should be exactly on the RAR.
Some advocates of ΛCDM seemed to think this was a fudge, a lame excuse morally equivalent to the fudges made in ΛCDM that its critics complain about. This is a false equivalency that reminds me of this cartoon:
The ultrafaints are a handful of the least-well measured galaxies on the RAR. Before we obsess about these, it is necessary to provide a satisfactory explanation for the more numerous, much better measured galaxies that establish the RAR in the first place. MOND does this. ΛCDM does not. Holding one theory to account for the least reliable of measurements before holding another to account for everything up to that point is like, well, like the cartoon… I could put an NGC number to each of the lines Bugs draws in the sand.
Long dynamical times/less common mergers: Unlike ΛCDM, dynamical friction should be relatively ineffective in MOND. It lacks the large halos of dark matter that act as invisible catchers’ mitts to make galaxies stick and merge. Personally, I do not think this is a great test, because we are a long way from understanding dynamical friction in MOND.
These are just a few of the topics discussed at the workshop, and all of those are only a few of the issues that matter to the bigger picture. While the workshop was great in every respect, perhaps the best thing was that it got people from different fields/camps/perspectives talking. That is progress.
I am grateful for progress, but I must confess that to me it feels excruciatingly slow. Models of galaxy formation in the context of ΛCDM have made credible steps forward in addressing some of the phenomenological issues that concern me. Yet they still seem to me to be very far from where they need to be. In particular, there seems to be no engagement with the fundamental question I have posed here before, and that I posed at the beginning of the workshop: Why does MOND get any predictions right?
A research programme is said to be progressing as long as its theoretical growth anticipates its empirical growth, that is as long as it keeps predicting novel facts with some success (“progressive problemshift”); it is stagnating if its theoretical growth lags behind its empirical growth, that is as long as it gives only post-hoc explanations either of chance discoveries or of facts anticipated by, and discovered in, a rival programme (“degenerating problemshift”) (Lakatos, 1971, pp. 104–105).
The recent history of modern cosmology is rife with post-hoc explanations of unanticipated facts. The cusp-core problem and the missing satellites problem are prominent examples. These are explained after the fact by invoking feedback, a vague catch-all that many people agree solves these problems even though none of them agree on how it actually works.
There are plenty of other problems. To name just a few: satellite planes (unanticipated correlations in phase space), the emptiness of voids, and the early formation of structure (see section 4 of Famaey & McGaugh for a longer list and section 6 of Silk & Mamon for a positive spin on our list). Each problem is dealt with in a piecemeal fashion, often by invoking solutions that contradict each other while buggering the principle of parsimony.
It goes like this. A new observation is made that does not align with the concordance cosmology. Hands are wrung. Debate is had. Serious concern is expressed. A solution is put forward. Sometimes it is reasonable, sometimes it is not. In either case it is rapidly accepted so long as it saves the paradigm and prevents the need for serious thought. (“Oh, feedback does that.”) The observation is no longer considered a problem through familiarity and exhaustion of patience with the debate, regardless of how [un]satisfactory the proffered solution is. The details of the solution are generally forgotten (if ever learned). When the next problem appears the process repeats, with the new solution often contradicting the now-forgotten solution to the previous problem.
This has been going on for so long that many junior scientists now seem to think this is how science is supposed to work. It is all they’ve experienced. And despite our claims to be interested in fundamental issues, most of us are impatient with re-examining issues that were thought to be settled. All it takes is one bold assertion that everything is OK, and the problem is perceived to be solved whether it actually is or not.
That is the process we apply to little problems. The Big Problems remain the post hoc elements of dark matter and dark energy. These are things we made up to explain unanticipated phenomena. That we need to invoke them immediately casts the paradigm into what Lakatos called degenerating problemshift. Once we’re there, it is hard to see how to get out, given our propensity to overindulge in the honey that is the infinity of free parameters in dark matter models.
Note that there is another aspect to what Lakatos said about facts anticipated by, and discovered in, a rival programme. Two examples spring immediately to mind: the Baryonic Tully-Fisher Relation and the Radial Acceleration Relation. These are predictions of MOND that were unanticipated in the conventional dark matter picture. Perhaps we can come up with post hoc explanations for them, but that is exactly what Lakatos would describe as degenerating problemshift. The rival programme beat us to it.
In my experience, this is a good description of what is going on. The field of dark matter has stagnated. Experimenters look harder and harder for the same thing, repeating the same experiments in hope of a different result. Theorists turn knobs on elaborate models, gifting themselves new free parameters every time they get stuck.
On the flip side, MOND keeps predicting novel facts with some success, so it remains in the stage of progressive problemshift. Unfortunately, MOND remains incomplete as a theory, and doesn’t address many basic issues in cosmology. This is a different kind of unsatisfactory.
In the meantime, I’m still waiting to hear a satisfactory answer to the question I’ve been posing for over two decades now. Why does MOND get any predictions right? It has had many a priori predictions come true. Why does this happen? It shouldn’t. Ever.
A recent paper in Nature by Genzel et al. reports declining rotation curves for high redshift galaxies. I have been getting a lot of questions about this result, which would be very important if true. So I thought I’d share a few thoughts here.
Nature is a highly reputable journal – in most fields of science. In Astronomy, it has a well earned reputation as the place to publish sexy but incorrect results. They have been remarkably consistent about this, going back to my earliest grad school memories, like a quasar pair being interpreted as a wide gravitational lens indicating the existence of cosmic strings. This was sexy at that time, because cosmic strings were thought to be a likely by-product of cosmic Inflation, threading the universe with remnants of the Inflationary phase. Cool, huh? Many Big Names signed on to this Exciting Discovery, which was Widely Discussed at the time. The only problem was that it was complete nonsense.
Genzel et al. look likely to build on this reputation. In Astronomy, we are always chasing the undiscovered, which often means the most distant. This is a wonderful thing: the universe is practically infinite; there is always something new to discover. An occasional downside is the temptation to over-interpret and oversell data on the edge.
Let’s start with some historical perspective. Here is the position-velocity diagram of NGC 7331 as measured by Rubin et al. (1965):
The rotation curve goes up, then it goes down. One would not claim the discovery of flat rotation curves from these data.
Here is the modern rotation curve of the same galaxy:
As the data improved, the flattening became clear. In order to see this, you need to observe to large radius. The original data didn’t do that. It isn’t necessarily wrong; it just doesn’t go far enough out.
Now let’s look at the position-velocity diagrams published by Genzel et al.:
They go up, they go down. This is the normal morphology of the rotation curves of bright, high surface brightness galaxies. First they rise steeply, then they roll over, then they decline slowly and gradually flatten out.
It looks to me like the Genzel et al. data do the first two things. They go up. They roll over. Maybe they start to come down a tiny bit. Maybe. They simply do not extend far enough to see the flattening, if it is there. Their claim that the rotation curves are falling is not persuasive: this is asking more of the data than is warranted. Historically, there are many examples of claims of “declining” rotation curves. DDO 154 is one famous example. These claims were not very persuasive at the time, and did not survive closer examination.
I have developed the habit of looking at the data before I read the text of a paper. I did that in this case, and saw what I expected to see from years of experience working on low redshift galaxies. I wasn’t surprised until I read the text and saw the claim that these galaxies somehow differed from those at low redshift.
It takes some practice to look at the data without being influenced by lines drawn to misguide the eye. That’s what the model lines drawn in red do. I don’t have access to the data, so I can’t re-plot them without those lines. So instead I have added, by eye, a crude estimate of what I would expect for galaxies like this. In most cases, the data do not distinguish between falling and flat rotation curves. In the case labeled 33h, the data look slightly more consistent with a flat rotation curve. In 10h, they look slightly more consistent with a falling rotation curve. That appearance is mostly driven by the outermost point with large error bars on the approaching side. Taken literally, this velocity is unphysical: it declines faster than Keplerian. They interpret this in terms of thick disks, but it could be a clue that Something is Wrong.
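The “faster than Keplerian” check can be made concrete. Outside a spherical mass distribution the circular velocity can fall no faster than R^(-1/2), so a logarithmic slope steeper than -0.5 between two points is suspicious. The sketch below tests a pair of points; the radii and velocities in the example are hypothetical, not read off the Genzel et al. figures, and measurement errors are ignored.

```python
import math


def log_slope(r_inner, v_inner, r_outer, v_outer):
    """Logarithmic slope d ln(v) / d ln(R) between two rotation curve points."""
    return math.log(v_outer / v_inner) / math.log(r_outer / r_inner)


def is_faster_than_keplerian(r_inner, v_inner, r_outer, v_outer):
    """True if the velocity falls more steeply than the point-mass R**-0.5 limit."""
    return log_slope(r_inner, v_inner, r_outer, v_outer) < -0.5


if __name__ == "__main__":
    # Hypothetical points: 200 km/s at 5 kpc, then a candidate outer point at 10 kpc.
    for v_out in (180.0, 141.0, 120.0):
        slope = log_slope(5.0, 200.0, 10.0, v_out)
        print(f"v(10 kpc) = {v_out} km/s: slope = {slope:.2f}, "
              f"faster than Keplerian: {is_faster_than_keplerian(5.0, 200.0, 10.0, v_out)}")
```

A single point with large error bars that fails this test says more about the measurement than about the galaxy.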
The basic problem is that the high redshift data do not extend to large radii. They simply do not go far enough out to distinguish between flat and declining rotation curves. Most do not extend beyond 10 kpc. If we plot the data for NGC 7331 with R < 10 kpc, we get this:
Here I’ve plotted both sides in order to replicate the appearance of Genzel’s plots. I’ve also included an exponential disk model in red. Never mind that this is a lousy representation of the true mass model. It gives a good fit, no?
The rotation curve is clearly declining. Unless you observe further out:
The data of Genzel et al. do not allow us to distinguish between “normal” flat rotation curves and genuinely declining ones.
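For readers who want to reproduce the shape of the red model curve, here is a minimal sketch of the rotation curve of a razor-thin exponential disk (the Freeman 1970 formula). To stay self-contained it evaluates the modified Bessel functions by direct numerical integration of their standard integral representations rather than calling a library. The disk mass and scale length are hypothetical round numbers, not a fit to NGC 7331.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun


def _trapezoid(func, a, b, n):
    """Plain trapezoid rule for func on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


def bessel_i(order, x):
    """Modified Bessel I_n(x) = (1/pi) * integral_0^pi exp(x cos t) cos(n t) dt."""
    return _trapezoid(lambda t: math.exp(x * math.cos(t)) * math.cos(order * t), 0.0, math.pi, 400) / math.pi


def bessel_k(order, x):
    """Modified Bessel K_n(x) = integral_0^inf exp(-x cosh t) cosh(n t) dt, for x > 0."""
    return _trapezoid(lambda t: math.exp(-x * math.cosh(t)) * math.cosh(order * t), 0.0, 20.0, 2000)


def v_disk(radius_kpc, disk_mass, scale_length_kpc):
    """Circular speed (km/s) of a thin exponential disk at radius_kpc."""
    y = radius_kpc / (2.0 * scale_length_kpc)
    sigma0 = disk_mass / (2.0 * math.pi * scale_length_kpc ** 2)  # central surface density
    bessel_term = bessel_i(0, y) * bessel_k(0, y) - bessel_i(1, y) * bessel_k(1, y)
    return math.sqrt(4.0 * math.pi * G * sigma0 * scale_length_kpc * y * y * bessel_term)


if __name__ == "__main__":
    mass, r_d = 1e11, 3.0  # hypothetical disk: 1e11 Msun, 3 kpc scale length
    for r in (2.0, 4.0, 6.6, 10.0, 20.0, 30.0):
        print(f"R = {r:5.1f} kpc: V_disk = {v_disk(r, mass, r_d):6.1f} km/s")
```

The disk-only curve peaks near 2.2 scale lengths and falls beyond it. Data confined to within 10 kpc straddle that peak, which is why a pure disk can appear to fit even when the real rotation curve stays flat further out.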
This is just taking the data as presented. I have refrained from making methodological criticisms, and will continue to do so. I will only note that it is possible to make a considerably more sophisticated, 3D analysis. Di Teodoro et al. (2016) have done this for very similar data. They find much lower velocity dispersions (not the thick disks claimed by Genzel et al.) and flat rotation curves:
There is no guarantee that the same results will follow for the Genzel et al. data, but it would be nice to see the same 3D analysis techniques applied.
Since I am unpersuaded that the Genzel et al. data extend far enough out to test for flat rotation, I looked for a comparison that I could make so far as the data do go. Fig. 3 of Genzel et al. shows the dark matter fraction as a function of circular velocity. This contains the same information as Fig. 12 of McGaugh (2016), which I re-plot here in terms of the dark matter fraction:
The data of Genzel et al. follow the trends established by local galaxies. They are confined to the bright, high surface brightness end of these relations, but that is to be expected: the brightest galaxies are always the most readily observed, especially at high redshift.
Genzel et al. only plot the left panel. As I have shown many times before, the strongest correlation of dynamical-to-baryonic mass is with surface brightness, not mass or its proxies luminosity and circular velocity. This is an essential aspect of the mass discrepancy problem; it is unfortunate that many scientists working on the topic appear to remain unaware of this basic fact.
From these diagrams, I infer that there is no discernible evolution in the properties of bright galaxies out to high redshift (z = 2.4 for their most distant case). The data presented by Genzel et al. sit exactly where one would expect from the relations established by local galaxies. That in itself might seem surprising, and perhaps warrants a Letter to Nature. But most of the words in Genzel et al. are about a surprising sort of evolution in which galaxy rotation curves decline at high redshift, so they have less dark matter then than now. I do not see that their data sustain such an interpretation.
So far everything I have said is empirical. If I put on a theory hat, the claims of Genzel et al. persist in making no sense.
First, ΛCDM. Fundamental to the ΛCDM cosmogony is the notion that dark matter halos form first, with baryons falling in subsequently. It has to happen in that order to satisfy the constraints on the growth of structure from the cosmic microwave background. The temperature fluctuations in the CMB are small because the baryons haven’t yet been able to clump up. In order for them to form galaxies as quickly as observed, the dark matter must already be forming the seeds of dark matter halos for the baryons to subsequently fall into. Without this order of battle, our explanation of structure formation is out the window.
Next, MOND. If rotation curves are indeed falling as claimed, this would falsify MOND, or at least make it a phenomenon that only applies in the local universe. But, as discussed, the high-z galaxies look like local ones. That doesn’t falsify MOND; it rather encourages the basic picture of structure formation we have in that context: galaxies form early and settle down into the form the modified force law stipulates. Indeed, the apparent lack of evolution implies that Milgrom’s acceleration constant a0 is indeed constant, and does not vary (as sometimes speculated) in concert with the expansion rate as hinted at by the numerical coincidence a0 ~ cH0. I cannot place a meaningful limit on the evolution of a0 from the data as presented, but it appears to be small. Rather than falsifying MOND, the high-z data look to be consistent with it – so far as they go.
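The numerical coincidence, and what it would imply if a0 tracked the expansion rate, can be sketched in a few lines. The values assumed here are a0 = 1.2 × 10^-10 m/s², H0 = 70 km/s/Mpc, and a flat ΛCDM expansion history with Ωm = 0.3; these are illustrative round numbers, and the scaling of a0 with H(z) is the speculation being tested, not an established result.

```python
import math

C_M_S = 2.998e8          # speed of light in m/s
MPC_IN_M = 3.0857e22     # metres per megaparsec
A0 = 1.2e-10             # Milgrom's acceleration constant in m/s^2 (assumed value)
H0_KM_S_MPC = 70.0       # assumed Hubble constant
OMEGA_M = 0.3            # assumed matter density for a flat LCDM expansion history


def c_times_h0():
    """c * H0 expressed as an acceleration in m/s^2."""
    return C_M_S * H0_KM_S_MPC * 1e3 / MPC_IN_M


def hubble_ratio(z):
    """H(z) / H0 for flat LCDM with matter and a cosmological constant only."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + (1.0 - OMEGA_M))


if __name__ == "__main__":
    print(f"c * H0 = {c_times_h0():.2e} m/s^2, which is {c_times_h0() / A0:.1f} x a0")
    print(f"H(z = 2.4) / H0 = {hubble_ratio(2.4):.2f}")
```

Under these assumptions, an a0 that scaled with the expansion rate would be roughly 3.5 times larger at z = 2.4, which would move high redshift galaxies off the local relations. They do not appear to be displaced.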
So, in summary: the data at high redshift appear completely consistent with those at low redshift. The claim of falling rotation curves would be problematic to both ΛCDM and MOND. However, this claim is not persuasive – the data simply do not extend far enough out.
Early 21st century technology has enabled us to do at high redshift what could barely be done at low redshift in the mid-20th century. That’s impressive. But these high-z data look a lot like low-z data circa 1970. A lot has changed since then. Right now, for the exploration of the high redshift universe, I will borrow one of Vera Rubin’s favorite phrases: These are Early Days.
Recently I have been complaining about the low standards to which science has sunk. It has become normal to be surprised by an observation, express doubt about the data, blame the observers, slowly let it sink in, bicker and argue for a while, construct an unsatisfactory model that sort-of, kind-of explains the surprising data but not really, call it natural, then pretend like that’s what we expected all along. This has been going on for so long that younger scientists might be forgiven if they think this is how science is supposed to work. It is not.
At the root of the scientific method is hypothesis testing through prediction and subsequent observation. Ideally, the prediction comes before the experiment. The highest standard is a prediction made before the fact in ignorance of the ultimate result. This is incontrovertibly superior to post-hoc fits and hand-waving explanations: it is how we’re supposed to avoid playing favorites.
I predicted the velocity dispersion of Crater 2 in advance of the observation, for both ΛCDM and MOND. The prediction for MOND is reasonably straightforward. That for ΛCDM is fraught. There is no agreed method by which to do this, and it may be that the real prediction is that this sort of thing is not possible to predict.
The reason it is difficult to predict the velocity dispersions of specific, individual dwarf satellite galaxies in ΛCDM is that the stellar mass-halo mass relation must be strongly non-linear to reconcile the steep mass function of dark matter sub-halos with their small observed numbers. This is closely related to the M*-Mhalo relation found by abundance matching. The consequence is that the luminosity of dwarf satellites can change a lot for tiny changes in halo mass.
Long story short, the nominal expectation for ΛCDM is a lot of scatter. Photometrically identical dwarfs can live in halos with very different velocity dispersions. The trend between mass, luminosity, and velocity dispersion is so weak that it might barely be perceptible. The photometric data should not be predictive of the velocity dispersion.
It is hard to get even a ballpark answer that doesn’t make reference to other measurements. Empirically, there is some correlation between size and velocity dispersion. This “predicts” σ = 17 km/s. That is not a true theoretical prediction; it is just the application of data to anticipate other data.
Abundance matching relations provide a highly uncertain estimate. The first time I tried to do this, I got unphysical answers (σ = 0.1 km/s, which is less than the stars alone would cause without dark matter – about 0.5 km/s). The application of abundance matching requires extrapolation of fits to data at high mass to very low mass. Extrapolating the M*-Mhalo relation over many decades in mass is very sensitive to the low mass slope of the fitted relation, so it depends on which one you pick.
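To see why the extrapolation is so touchy, here is a minimal sketch. It assumes a pure power law M* ∝ Mhalo^α below some pivot point. The pivot masses, the range of slopes, and the stellar mass are hypothetical round numbers chosen for illustration, not fits taken from any particular abundance matching paper.

```python
import math

def halo_mass_from_stellar_mass(m_star, alpha, pivot_mstar=1e8, pivot_mhalo=1e11):
    """Invert a hypothetical power law M* = pivot_mstar * (Mhalo / pivot_mhalo)**alpha.

    All masses are in solar masses. The pivot values are illustrative assumptions.
    """
    return pivot_mhalo * (m_star / pivot_mstar) ** (1.0 / alpha)

m_star = 3e5  # assumed stellar mass of a faint dwarf, in Msun

# Try several plausible-looking low-mass slopes and compare the implied halo masses.
for alpha in (1.5, 2.0, 2.5, 3.0):
    m_halo = halo_mass_from_stellar_mass(m_star, alpha)
    print(f"alpha = {alpha:.1f}: Mhalo ~ 10^{math.log10(m_halo):.1f} Msun")
```

Under these assumptions, changing nothing but the slope moves the inferred halo mass by close to an order of magnitude, which is why the answer depends so much on which fitted relation one picks.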
Since my first pick did not work, let’s go with the value suggested to me by James Bullock: σ = 11 km/s. That is the mid-value (the blue lines in the figure above); the true value could easily scatter higher or lower. Very hard to predict with any precision. But given the luminosity and size of Crater 2, we expect numbers like 11 or 17 km/s.
The measured velocity dispersion is σ = 2.7 ± 0.3 km/s.
This is incredibly low. Shockingly so, considering the enormous size of the system (1 kpc half light radius). The NFW halos predicted by ΛCDM don’t do that.
Basically, NFW halos, including the sub-halos imagined to host dwarf satellite galaxies, have rotation curves that rise rapidly and stay high in proportion to the cube root of the halo mass. This property makes it very challenging to explain a low velocity at a large radius: exactly the properties observed in Crater 2.
Let’s not fail to appreciate how extremely wrong this is. The original version of the graph above stopped at 5 km/s. It didn’t extend to lower values because they were absurd. There was no reason to imagine that this would be possible. Indeed, the point of their paper was that the observed dwarf velocity dispersions were already too low. To get to lower velocity, you need an absurdly low mass sub-halo – around 10⁷ M☉. In contrast, the usual inference of masses for sub-halos containing dwarfs of similar luminosity is around 10⁹ M☉ to 10¹⁰ M☉. So the low observed velocity dispersion – especially at such a large radius – seems nigh on impossible.
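The cube-root scaling can be made concrete with a rough sketch. It uses the standard virial definition of a halo (mean enclosed density 200 times critical), for which V200 = (10 G H0 M200)^(1/3). The Hubble constant of 70 km/s/Mpc is an assumption, and sub-halos are tidally truncated objects for which this field-halo definition is only a guide, so treat the numbers as indicative of the scaling rather than as a prediction for any particular dwarf.

```python
G = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / Msun
H0 = 0.07        # assumed Hubble constant: 70 km/s/Mpc expressed in km/s/kpc

def v200(m200):
    """Circular velocity (km/s) at r200 for a halo of virial mass m200 (Msun).

    Follows from M200 = 100 H0^2 r200^3 / G combined with V200^2 = G M200 / r200.
    """
    return (10.0 * G * H0 * m200) ** (1.0 / 3.0)

# Compare the sub-halo masses discussed in the text.
for m200 in (1e7, 1e9, 1e10):
    print(f"M200 = {m200:.0e} Msun -> V200 = {v200(m200):.1f} km/s")
```

Because velocity goes only as the cube root of mass, getting down to a few km/s takes a drop of two to three orders of magnitude in halo mass relative to the usual inference.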
More generally, there is no way in ΛCDM to predict the velocity dispersions of particular individual dwarfs. There is too much intrinsic scatter in the highly non-linear relation between luminosity and halo mass. Given the photometry, all we can say is “somewhere in this ballpark.” Making an object-specific prediction is impossible.
The predicted velocity dispersion is σ = 2.1 +0.9/-0.6 km/s.
I’m an equal opportunity scientist. In addition to ΛCDM, I also considered MOND. The successful prediction is that of MOND. (The quoted uncertainty reflects the uncertainty in the stellar mass-to-light ratio.) The difference is that MOND makes a specific prediction for every individual object. And it comes true. Again.
MOND is a funny theory. The amplitude of the mass discrepancy it induces depends on how low the acceleration of a system is. If Crater 2 were off by itself in the middle of intergalactic space, MOND would predict it should have a velocity dispersion of about 4 km/s.
But Crater 2 is not isolated. It is close enough to the Milky Way that there is an additional, external acceleration imposed by the Milky Way. The net result is that the acceleration isn’t quite as low as it would be were Crater 2 all by its lonesome. Consequently, the predicted velocity dispersion is a measly 2 km/s. As observed.
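For the curious, the two regimes can be sketched in a few lines. The isolated case uses the deep-MOND relation σ⁴ = (4/81) G M a₀. The external-field case is a simplified quasi-Newtonian estimate in which Newton’s constant is boosted by a₀/g_ext; its exact form here (full stellar mass, a factor of 3, the 3D half-light radius) is my shorthand and may differ in detail from the estimator in the paper. The luminosity, the stellar mass-to-light ratio of 2, and the Milky Way circular speed of 200 km/s are assumed round numbers; the 1 kpc half-light radius and 120 kpc distance are the values quoted in this post.

```python
import math

G = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / Msun
A0 = 3.7e3       # Milgrom's a0 (about 1.2e-10 m/s^2) in (km/s)^2 / kpc

def sigma_isolated(mass):
    """Deep-MOND velocity dispersion (km/s) of an isolated system of mass `mass` (Msun)."""
    return (4.0 / 81.0 * G * mass * A0) ** 0.25

def sigma_efe(mass, r_half_3d, g_ext):
    """Quasi-Newtonian estimate (km/s) when the external field dominates.

    Assumes g_internal < g_ext < a0, so that G is effectively boosted by a0 / g_ext.
    r_half_3d is in kpc and g_ext in (km/s)^2 / kpc.
    """
    g_eff = G * A0 / g_ext
    return math.sqrt(g_eff * mass / (3.0 * r_half_3d))

luminosity = 1.6e5             # assumed V-band luminosity, Lsun
mass = 2.0 * luminosity        # assumed stellar M/L = 2
r_half_3d = (4.0 / 3.0) * 1.0  # deproject the ~1 kpc half-light radius, kpc
g_ext = 200.0 ** 2 / 120.0     # assumed V_MW = 200 km/s at D = 120 kpc

print(f"isolated: {sigma_isolated(mass):.1f} km/s")
print(f"with EFE: {sigma_efe(mass, r_half_3d, g_ext):.1f} km/s")
```

With these inputs the isolated estimate lands near 4 km/s and the external-field estimate near 2 km/s. Changing the assumed mass-to-light ratio shifts both numbers, which is where the quoted uncertainty comes from.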
In MOND, this is called the External Field Effect (EFE). Theoretically, the EFE is rather disturbing, as it breaks the Strong Equivalence Principle. In particular, Local Position Invariance in gravitational experiments is violated: the velocity dispersion of a dwarf satellite depends on whether it is isolated from its host or not. Weak equivalence (the universality of free fall) and the Einstein Equivalence Principle (which excludes gravitational experiments) may still hold.
We identified several pairs of photometrically identical dwarfs around Andromeda. Some are subject to the EFE while others are not. We see the predicted effect of the EFE: isolated dwarfs have higher velocity dispersions than their twins afflicted by the EFE.
If it is just a matter of sub-halo mass, the current location of the dwarf should not matter. The velocity dispersion certainly should not depend on the bizarre MOND criterion for whether a dwarf is affected by the EFE or not. It isn’t a simple distance-dependency. It depends on the ratio of internal to external acceleration. A relatively dense dwarf might still behave as an isolated system close to its host, while a really diffuse one might be affected by the EFE even when very remote.
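The criterion can be illustrated with a short sketch. It estimates the internal acceleration in the deep-MOND limit as g_in = √(g_N a₀), with g_N the Newtonian acceleration from half the stellar mass at the 3D half-light radius, and compares it to the host’s pull g_ext = V²/D. The flat host rotation speed of 200 km/s and both example dwarfs are hypothetical; the second one is invented purely to show a dense object staying in the isolated regime at a smaller distance.

```python
import math

G = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / Msun
A0 = 3.7e3       # Milgrom's a0 (about 1.2e-10 m/s^2) in (km/s)^2 / kpc
V_HOST = 200.0   # assumed flat rotation speed of the host galaxy, km/s

def internal_acceleration(mass, r_half_3d):
    """Characteristic deep-MOND internal acceleration at the 3D half-light radius."""
    g_newton = G * (0.5 * mass) / r_half_3d ** 2
    return math.sqrt(g_newton * A0)

def regime(mass, r_half_3d, distance):
    """Return 'EFE' if the host's field exceeds the internal one, else 'isolated'."""
    g_ext = V_HOST ** 2 / distance
    return "EFE" if internal_acceleration(mass, r_half_3d) < g_ext else "isolated"

# A very diffuse dwarf far from its host (Crater 2-like, assumed stellar mass).
print(regime(mass=3.2e5, r_half_3d=4.0 / 3.0, distance=120.0))
# A hypothetical compact, more massive dwarf that is closer in.
print(regime(mass=1e7, r_half_3d=0.3, distance=80.0))
```

The point of the comparison is that distance alone does not decide the regime: the diffuse object is dominated by the external field even though it is farther away.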
When Crater 2 was first discovered, I ground through the math and tweeted the prediction. I didn’t want to write a paper for just one object. However, I eventually did so because I realized that Crater 2 is important as an extreme example of a dwarf so diffuse that it is affected by the EFE despite being very remote (120 kpc from the Milky Way). This is not easy to reproduce any other way. Indeed, MOND with the EFE is the only way that I am aware of whereby it is possible to predict, in advance, the velocity dispersion of this particular dwarf.
If I put my ΛCDM hat back on, it gives me pause that any method can make this prediction. As discussed above, this shouldn’t be possible. There is too much intrinsic scatter in the halo mass-luminosity relation.
If we cook up an explanation for the radial acceleration relation, we still can’t make this prediction. The RAR fit we obtained empirically predicts 4 km/s. This is indistinguishable from MOND for isolated objects. But the RAR itself is just an empirical law – it provides no reason to expect deviations, nor how to predict them. MOND does both, does it right, and has done so before, repeatedly. In contrast, the acceleration of Crater 2 is below the minimum allowed in ΛCDM according to Navarro et al.
For these reasons I consider Crater 2 to be the bullet cluster of ΛCDM. Just as the bullet cluster seems like a straight-up contradiction to MOND, so too does Crater 2 for ΛCDM. It is something ΛCDM really can’t do. The difference is that you can just look at the bullet cluster. With Crater 2 you actually have to understand MOND as well as ΛCDM, and think it through.
So what can we do to save ΛCDM?
Whatever it takes, per usual.
One possibility is that Crater 2 may represent the “bright” tip of the extremely low surface brightness “stealth” fossils predicted by Bovill & Ricotti. Their predictions are encouraging for getting the size and surface brightness in the right ballpark. But I see no reason in this context to expect such a low velocity dispersion. They anticipate dispersions consistent with the ΛCDM discussion above, and correspondingly high mass-to-light ratios that are greater than observed for Crater 2 (M/L ≈ 10⁴ rather than ~50).
A plausible suggestion I heard was from James Bullock. While noting that reionization should preclude the existence of galaxies in halos below 5 km/s, as we need for Crater 2, he suggested that tidal stripping could reduce an initially larger sub-halo to this point. I am dubious about this, as my impression from the simulations of Penarrubia was that the outer regions of the sub-halo were stripped first while leaving the inner regions (where the NFW cusp predicts high velocity dispersions) largely intact until near complete dissolution. In this context, it is important to bear in mind that the low velocity dispersion of Crater 2 is observed at large radii (1 kpc, not tens of pc). Still, I can imagine ways in which this might be made to work in this particular case, depending on its orbit. Tony Sohn has an HST program to measure the proper motion; this should constrain whether the object has ever passed close enough to the center of the Milky Way to have been tidally disrupted.
Josh Bland-Hawthorn pointed out to me that he made simulations that suggest a halo with a mass as low as 10⁷ M☉ could make stars before reionization and retain them. This contradicts much of the conventional wisdom outlined above because they find a much lower (and in my opinion, more realistic) feedback efficiency for supernova feedback than assumed in most other simulations. If this is correct (as it may well be!) then it might explain Crater 2, but it would wreck all the feedback-based explanations given for all sorts of other things in ΛCDM, like the missing satellite problem and the cusp-core problem. We can’t have it both ways.
I’m sure people will come up with other clever ideas. These will inevitably be ad hoc suggestions cooked up in response to a previously inconceivable situation. This will ring hollow to me until we explain why MOND can predict anything right at all.
In the case of Crater 2, it isn’t just a matter of retrospectively explaining the radial acceleration relation. One also has to explain why exceptions to the RAR occur following the very specific, bizarre, and unique EFE formulation of MOND. If I could do that, I would have done so a long time ago.
No matter what we come up with, the best we can hope to do is a post facto explanation of something that MOND predicted correctly in advance. Can that be satisfactory?
There has been another attempt to explain away the radial acceleration relation as being fine in ΛCDM. That’s good; I’m glad people are finally starting to address this issue. But let’s be clear: this is a beginning, not a solution. Indeed, it seems more like a rush to create truth by assertion than an honest scientific investigation. I would be more impressed if these papers were (i) refereed rather than rushed onto the arXiv, and (ii) honestly addressed the requirements I laid out.
Rather than consider the assertions piecemeal, let’s take a step back. We have established that galaxies obey a single effective force law. Federico Lelli has shown that this applies to pressure supported elliptical galaxies as well as rotating disks.
Let’s start with what Newton said about the solar system: “Everything happens… as if the force between two bodies is directly proportional to the product of their masses and inversely proportional to the square of the distance between them.” Knowing how this story turns out, consider the following.
Suppose someone came to you and told you Newton was wrong. The solar system doesn’t operate on an inverse square law, it operates on an inverse cube law. It just looks like an inverse square law because there is dark matter arranged just so as to make this so. No matter whether we look at the motion of the planets around the sun, or moons around their planets, or any of the assorted miscellaneous asteroids and cometary debris. Everything happens as if there is an inverse square law, when really it is an inverse cube law plus dark matter arranged just so.
Would you believe this assertion?
I hope not. It is a gross violation of the rule of parsimony. Occam would spin in his grave.
Yet this is exactly what we’re doing with dark matter halos. There is one observed, effective force law in galaxies. The dark matter has to be arranged just so as to make this so.
Convenient that it is invisible.
Maybe dark matter will prove to be correct, but there is ample reason to worry. I worry that we have not yet detected it. We are well past the point that we should have. The supersymmetric sector in which WIMP dark matter is hypothesized to live flunked the “golden test” of the Bs meson decay, and looks more and more like a brilliant idea nature declined to implement. And I wonder why the radial acceleration relation hasn’t been predicted before if it is such a “natural” outcome of galaxy formation simulations. Are we doing fair science here? Or just trying to shove the cat back in the bag?
I really don’t know what the final answer will look like. But I’ve talked to a lot of scientists who seem pretty darn sure. If you are sure you know the final answer, then you are violating some basic principles of the scientific method: the principle of parsimony, the principle of doubt, and the principle of objectivity. Mind your confirmation bias!
That’ll do for now. What wonders await among tomorrow’s arXiv postings?