LCDM has met the enemy, and it is itself

David Merritt recently published the article “Cosmology and convention” in Studies in History and Philosophy of Science. This article is remarkable in many respects. For starters, it is rare that a practicing scientist reads a paper on the philosophy of science, much less publishes one in a philosophy journal.

I was initially loath to start reading this article, frankly for fear of boredom: me reading about cosmology and the philosophy of science is like coals to Newcastle. I could not have been more wrong. It is a genuine page turner that should be read by everyone interested in cosmology.

I have struggled for a long time with whether dark matter constitutes a falsifiable scientific hypothesis. It straddles the border: specific dark matter candidates (e.g., WIMPs) are confirmable – a laboratory detection is both possible and plausible – but the concept of dark matter can never be excluded. If we fail to find WIMPs in the range of mass-cross section parameter space where we expected them, we can change the prediction. This moving of the goal posts has already happened repeatedly.

The cross-section vs. mass parameter space for WIMPs. The original, “natural” weak interaction cross-section (10⁻³⁹ cm²) was excluded long ago, as were early attempts to map out the theoretically expected parameter space (upper pink region). Later predictions drifted to progressively lower cross-sections. These evaded experimental limits at the time, and confident predictions were made that the dark matter would be found. More recent data show otherwise: the gray region is excluded by PandaX (2016). [This plot was generated with the help of DMTools hosted at Brown.]

I do not find it encouraging that the goal posts keep moving. This raises the question, how far can we go? Arbitrarily low cross-sections can be extracted from theory if we work at it hard enough. How hard should we work? That is, what criteria do we set whereby we decide the WIMP hypothesis is mistaken?

There has to be some criterion by which we would consider the WIMP hypothesis to be falsified. Without such a criterion, it does not satisfy the strictest definition of a scientific hypothesis. If at some point we fail to find WIMPs and are dissatisfied with the theoretical fine-tuning required to keep them hidden, we are free to invent some other dark matter candidate. No WIMPs? Must be axions. Not axions? Would you believe light dark matter? [Worst. Name. Ever.] And so on, ad infinitum. The concept of dark matter is not falsifiable, even if specific dark matter candidates are subject to being made to seem very unlikely (e.g., brown dwarfs).

Faced with this situation, we can consult the philosophy of science. Merritt discusses how many of the essential tenets of modern cosmology follow from what Popper would term “conventionalist stratagems” – ways to dodge serious consideration that a treasured theory is threatened. I find this a compelling terminology, as it formalizes an attitude I have witnessed among scientists, especially cosmologists, many times. It was put more colloquially by J.K. Galbraith:

“Faced with the choice between changing one’s mind and proving that there is no need to do so, almost everybody gets busy on the proof.”

Boiled down (Keuth 2005), the conventionalist stratagems Popper identifies are

  1. ad hoc hypotheses
  2. modification of ostensive definitions
  3. doubting the reliability of the experimenter
  4. doubting the acumen of the theorist

These are stratagems to be avoided according to Popper. At the least they are pitfalls to be aware of, but as Merritt discusses, modern cosmology has marched down exactly this path, doing each of these in turn.

The ad hoc hypotheses of ΛCDM are of course Λ and CDM. Faced with the observation of a metric that cannot be reconciled with the prior expectation of a decelerating expansion rate, we re-invoke Einstein’s greatest blunder, Λ. We even generalize the notion and give it a fancy new name, dark energy, which has the convenient property that it can fit any observed set of monotonic distance-redshift pairs. Faced with an excess of gravitational attraction over what can be explained by normal matter, we invoke non-baryonic dark matter: some novel form of mass that has no place in the standard model of particle physics, has yet to show any hint of itself in the laboratory, and cannot be decisively excluded by experiment.

We didn’t accept these ad hoc add-ons easily or overnight. Persuasive astronomical evidence drove us there, but all these data really show is that something dire is wrong: General Relativity plus known standard model particles cannot explain the universe. Λ and CDM are more a first guess than a final answer. They’ve been around long enough that they have become familiar, almost beyond doubt. Nevertheless, they remain unproven ad hoc hypotheses.

The sentiment that is often asserted is that cosmology works so well that dark matter and dark energy must exist. But a more conservative statement would be that our present understanding of cosmology is correct if and only if these dark entities exist. The onus is on us to detect dark matter particles in the laboratory.

That’s just the first conventionalist stratagem. I could give many examples of violations of the other three, just from my own experience. That would make for a very long post indeed.

Instead, you should go read Merritt’s paper. There are too many things there to discuss, at least in a single post. You’re best going to the source. Be prepared for some cognitive dissonance.


Crater 2: the Bullet Cluster of LCDM

Recently I have been complaining about the low standards to which science has sunk. It has become normal to be surprised by an observation, express doubt about the data, blame the observers, slowly let it sink in, bicker and argue for a while, construct an unsatisfactory model that sort-of, kind-of explains the surprising data but not really, call it natural, then pretend like that’s what we expected all along. This has been going on for so long that younger scientists might be forgiven if they think this is how science is supposed to work. It is not.

At the root of the scientific method is hypothesis testing through prediction and subsequent observation. Ideally, the prediction comes before the experiment. The highest standard is a prediction made before the fact in ignorance of the ultimate result. This is incontrovertibly superior to post-hoc fits and hand-waving explanations: it is how we’re supposed to avoid playing favorites.

I predicted the velocity dispersion of Crater 2 in advance of the observation, for both ΛCDM and MOND. The prediction for MOND is reasonably straightforward. That for ΛCDM is fraught. There is no agreed method by which to do this, and it may be that the real prediction is that this sort of thing is not possible to predict.

The reason it is difficult to predict the velocity dispersions of specific, individual dwarf satellite galaxies in ΛCDM is that the stellar mass-halo mass relation must be strongly non-linear to reconcile the steep mass function of dark matter sub-halos with their small observed numbers. This is closely related to the M*-Mhalo relation found by abundance matching. The consequence is that the luminosity of dwarf satellites can change a lot for tiny changes in halo mass.

Fig. 11 from Tollerud et al. (2011, ApJ, 726, 108). The width of the bands illustrates the minimal scatter expected between dark halo and measurable properties. A dwarf of a given luminosity could reside in dark halos differing by two decades in mass, with a corresponding effect on the velocity dispersion.

Long story short, the nominal expectation for ΛCDM is a lot of scatter. Photometrically identical dwarfs can live in halos with very different velocity dispersions. The trend between mass, luminosity, and velocity dispersion is so weak that it might barely be perceptible. The photometric data should not be predictive of the velocity dispersion.

It is hard to get even a ballpark answer that doesn’t make reference to other measurements. Empirically, there is some correlation between size and velocity dispersion. This “predicts” σ = 17 km/s. That is not a true theoretical prediction; it is just the application of data to anticipate other data.

Abundance matching relations provide a highly uncertain estimate. The first time I tried to do this, I got unphysical answers (σ = 0.1 km/s, which is less than the stars alone would cause without dark matter – about 0.5 km/s). The application of abundance matching requires extrapolation of fits to data at high mass to very low mass. Extrapolating the M*-Mhalo relation over many decades in mass is very sensitive to the low mass slope of the fitted relation, so it depends on which one you pick.
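To make the sensitivity concrete, here is a minimal sketch that assumes a pure power law, M* ∝ Mhalo^slope, anchored at a hypothetical pivot (10⁹ M☉ of stars in a 10¹¹ M☉ halo). The pivot, the slopes, and the function names are illustrative assumptions, not values taken from any published abundance matching fit.

```python
def stellar_mass(halo_mass, slope, pivot_halo=1e11, pivot_stellar=1e9):
    """Toy power law: M* proportional to Mhalo**slope (pivot values are hypothetical)."""
    return pivot_stellar * (halo_mass / pivot_halo) ** slope


def halo_mass(stellar, slope, pivot_halo=1e11, pivot_stellar=1e9):
    """Inverse of the toy relation: the halo mass implied by a given stellar mass."""
    return pivot_halo * (stellar / pivot_stellar) ** (1.0 / slope)


# Steepness: doubling the halo mass multiplies the stellar mass by 2**slope.
for slope in (1.5, 2.0, 3.0):
    boost = stellar_mass(2e9, slope) / stellar_mass(1e9, slope)
    print(f"slope {slope}: 2x halo mass -> {boost:.1f}x stellar mass")

# Extrapolation: the halo inferred for a 3e5 Msun dwarf depends on the slope chosen.
for slope in (1.5, 2.0, 3.0):
    print(f"slope {slope}: Mhalo ~ {halo_mass(3e5, slope):.1e} Msun")
```

In this toy setup, the inferred halo mass for the same faint dwarf shifts by more than a factor of ten between the shallowest and steepest slopes, which is the sense in which the answer "depends on which one you pick."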


Since my first pick did not work, let’s go with the value suggested to me by James Bullock: σ = 11 km/s. That is the mid-value (the blue lines in the figure above); the true value could easily scatter higher or lower. Very hard to predict with any precision. But given the luminosity and size of Crater 2, we expect numbers like 11 or 17 km/s.

The measured velocity dispersion is σ = 2.7 ± 0.3 km/s.

This is incredibly low. Shockingly so, considering the enormous size of the system (1 kpc half light radius). The NFW halos predicted by ΛCDM don’t do that.
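For a sense of scale, the dispersions can be converted to an enclosed dynamical mass. The sketch below assumes a standard half-light estimator of the form M ≈ 4σ²R/G (the form popularized by Wolf et al. 2010); the post does not say which estimator, if any, was used, and the 1 kpc radius is the rounded value quoted above. This is the mass inside the half-light radius, not a total halo mass.

```python
G = 4.301e-6  # Newton's constant in kpc (km/s)^2 / Msun


def half_light_mass(sigma_kms, r_eff_kpc):
    """Enclosed mass within the half-light radius, M ~ 4 sigma^2 R / G (assumed estimator)."""
    return 4.0 * sigma_kms ** 2 * r_eff_kpc / G


# Observed dispersion versus the two rough LCDM expectations quoted in the text.
for label, sigma in (("observed", 2.7), ("expected", 11.0), ("expected", 17.0)):
    mass = half_light_mass(sigma, 1.0)
    print(f"{label} sigma = {sigma} km/s -> M(<1 kpc) ~ {mass:.1e} Msun")
```

Because the enclosed mass scales as σ², the gap between 2.7 km/s and 11 km/s corresponds to more than an order of magnitude in mass within the same radius.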

To illustrate how far off this is, I have adapted this figure from Boylan-Kolchin et al. (2012).

Fig. 1 of MNRAS, 422, 1203 illustrating the “too big to fail” problem: observed dwarfs have lower velocity dispersions than sub-halos that must exist and should host similar or even more luminous dwarfs that apparently do not exist. I have had to extend the range of the original graph to lower velocities in order to include Crater 2.

Basically, NFW halos, including the sub-halos imagined to host dwarf satellite galaxies, have rotation curves that rise rapidly and stay high in proportion to the cube root of the halo mass. This property makes it very challenging to explain a low velocity at a large radius: exactly the properties observed in Crater 2.

Let’s not fail to appreciate how extremely wrong this is. The original version of the graph above stopped at 5 km/s. It didn’t extend to lower values because they were absurd. There was no reason to imagine that this would be possible. Indeed, the point of their paper was that the observed dwarf velocity dispersions were already too low. To get to lower velocity, you need an absurdly low mass sub-halo – around 10⁷ M☉. In contrast, the usual inference of masses for sub-halos containing dwarfs of similar luminosity is around 10⁹ M☉ to 10¹⁰ M☉. So the low observed velocity dispersion – especially at such a large radius – seems nigh on impossible.
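The cube-root scaling between velocity and halo mass can be sketched with the virial relation V₂₀₀ = (10 G H₀ M₂₀₀)^(1/3). This assumes the M₂₀₀ mass definition and H₀ = 70 km/s/Mpc, neither of which is specified in the post; sub-halo masses are defined in various ways in the literature, and a circular velocity differs from a line-of-sight dispersion by a factor of order unity, so treat the numbers as indicative only.

```python
G = 4.301e-9  # Newton's constant in Mpc (km/s)^2 / Msun
H0 = 70.0     # Hubble constant in km/s/Mpc (assumed value)


def v200(m200_msun):
    """Virial circular velocity for an M200 halo: V200 = (10 G H0 M200)^(1/3)."""
    return (10.0 * G * H0 * m200_msun) ** (1.0 / 3.0)


for mass in (1e7, 1e9, 1e10):
    print(f"M200 = {mass:.0e} Msun -> V200 ~ {v200(mass):.1f} km/s")
```

Under these assumptions a 10⁷ M☉ halo sits below the 5 km/s floor of the original graph, while 10⁹ to 10¹⁰ M☉ halos land in the tens of km/s.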

More generally, there is no way in ΛCDM to predict the velocity dispersions of particular individual dwarfs. There is too much intrinsic scatter in the highly non-linear relation between luminosity and halo mass. Given the photometry, all we can say is “somewhere in this ballpark.” Making an object-specific prediction is impossible.

Except that it is possible. I did it. In advance.

The predicted velocity dispersion is σ = 2.1 +0.9/-0.6 km/s.

I’m an equal opportunity scientist. In addition to ΛCDM, I also considered MOND. The successful prediction is that of MOND. (The quoted uncertainty reflects the uncertainty in the stellar mass-to-light ratio.) The difference is that MOND makes a specific prediction for every individual object. And it comes true. Again.

MOND is a funny theory. The amplitude of the mass discrepancy it induces depends on how low the acceleration of a system is. If Crater 2 were off by itself in the middle of intergalactic space, MOND would predict it should have a velocity dispersion of about 4 km/s.

But Crater 2 is not isolated. It is close enough to the Milky Way that there is an additional, external acceleration imposed by the Milky Way. The net result is that the acceleration isn’t quite as low as it would be were Crater 2 all by its lonesome. Consequently, the predicted velocity dispersion is a measly 2 km/s. As observed.
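The two regimes can be sketched numerically. What follows is a rough illustration, not the published calculation: it assumes the deep-MOND isolated estimator σ = (4/81 · G M a₀)^(1/4) and, for the external field case, a Newtonian-style estimator with G boosted by a₀/g_ext. The inputs (luminosity 1.6×10⁵ L☉, stellar M/L = 2, 3D half-light radius 1.4 kpc, Milky Way circular speed 180 km/s at 120 kpc) are assumed for illustration and are not given in this post; the numerical prefactor in the EFE estimator depends on the mass estimator adopted.

```python
import math

G = 4.301e-6  # Newton's constant in kpc (km/s)^2 / Msun
A0 = 3703.0   # Milgrom's a0 = 1.2e-10 m/s^2, converted to (km/s)^2 / kpc


def sigma_isolated(mass_msun):
    """Deep-MOND dispersion of an isolated, isotropic system (assumed estimator)."""
    return (4.0 / 81.0 * G * mass_msun * A0) ** 0.25


def sigma_efe(mass_msun, r_half_kpc, g_ext):
    """Quasi-Newtonian estimate: Newtonian estimator with G rescaled by a0/g_ext."""
    g_eff = G * A0 / g_ext
    return math.sqrt(g_eff * mass_msun / (3.0 * r_half_kpc))


# Assumed illustrative inputs (not taken from the post).
stellar_mass = 2.0 * 1.6e5   # M/L = 2 times L = 1.6e5 Lsun
g_ext = 180.0 ** 2 / 120.0   # V^2 / D for the Milky Way at 120 kpc

print(f"isolated: {sigma_isolated(stellar_mass):.1f} km/s")
print(f"with EFE: {sigma_efe(stellar_mass, 1.4, g_ext):.1f} km/s")
```

With these inputs the isolated case comes out near 4 km/s and the external field case near 2 km/s. In the external field regime σ scales as the square root of the mass, so changing the assumed M/L by a factor of two moves the answer by a factor of about 1.4 in either direction.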

In MOND, this is called the External Field Effect (EFE). Theoretically, the EFE is rather disturbing, as it breaks the Strong Equivalence Principle. In particular, Local Position Invariance in gravitational experiments is violated: the velocity dispersion of a dwarf satellite depends on whether it is isolated from its host or not. Weak equivalence (the universality of free fall) and the Einstein Equivalence Principle (which excludes gravitational experiments) may still hold.

We identified several pairs of photometrically identical dwarfs around Andromeda. Some are subject to the EFE while others are not. We see the predicted effect of the EFE: isolated dwarfs have higher velocity dispersions than their twins afflicted by the EFE.

If it is just a matter of sub-halo mass, the current location of the dwarf should not matter. The velocity dispersion certainly should not depend on the bizarre MOND criterion for whether a dwarf is affected by the EFE or not. It isn’t a simple distance-dependency. It depends on the ratio of internal to external acceleration. A relatively dense dwarf might still behave as an isolated system close to its host, while a really diffuse one might be affected by the EFE even when very remote.
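The criterion can be sketched as a comparison of two accelerations. The example assumes a characteristic internal acceleration g_in ≈ 3σ_iso²/r and an external acceleration g_ext ≈ V²/D from the host; both definitions, the host speed of 180 km/s, and the two dwarfs below are hypothetical choices made for illustration.

```python
G = 4.301e-6  # Newton's constant in kpc (km/s)^2 / Msun
A0 = 3703.0   # Milgrom's a0 in (km/s)^2 / kpc


def efe_regime(mass_msun, r_half_kpc, host_speed_kms, distance_kpc):
    """Classify a dwarf by comparing internal and external accelerations."""
    sigma_iso_sq = (4.0 / 81.0 * G * mass_msun * A0) ** 0.5  # isolated deep-MOND sigma^2
    g_in = 3.0 * sigma_iso_sq / r_half_kpc                    # assumed internal acceleration
    g_ext = host_speed_kms ** 2 / distance_kpc                # acceleration from the host
    regime = "isolated" if g_in > g_ext else "EFE-dominated"
    return regime, g_in, g_ext


# Hypothetical dense dwarf, close to its host.
print(efe_regime(1e7, 0.3, 180.0, 80.0))
# Hypothetical diffuse dwarf, far from its host.
print(efe_regime(3.2e5, 1.4, 180.0, 120.0))
```

In this toy comparison the compact dwarf at 80 kpc still behaves as isolated, while the diffuse one at 120 kpc is dominated by the external field, so distance alone does not decide the outcome.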

When Crater 2 was first discovered, I ground through the math and tweeted the prediction. I didn’t want to write a paper for just one object. However, I eventually did so because I realized that Crater 2 is important as an extreme example of a dwarf so diffuse that it is affected by the EFE despite being very remote (120 kpc from the Milky Way). This is not easy to reproduce any other way. Indeed, MOND with the EFE is the only way that I am aware of whereby it is possible to predict, in advance, the velocity dispersion of this particular dwarf.

If I put my ΛCDM hat back on, it gives me pause that any method can make this prediction. As discussed above, this shouldn’t be possible. There is too much intrinsic scatter in the halo mass-luminosity relation.

If we cook up an explanation for the radial acceleration relation, we still can’t make this prediction. The RAR fit we obtained empirically predicts 4 km/s. This is indistinguishable from MOND for isolated objects. But the RAR itself is just an empirical law – it provides no reason to expect deviations, nor how to predict them. MOND does both, does it right, and has done so before, repeatedly. In contrast, the acceleration of Crater 2 is below the minimum allowed in ΛCDM according to Navarro et al.

For these reasons I consider Crater 2 to be the bullet cluster of ΛCDM. Just as the bullet cluster seems like a straight-up contradiction to MOND, so too does Crater 2 for ΛCDM. It is something ΛCDM really can’t do. The difference is that you can just look at the bullet cluster. With Crater 2 you actually have to understand MOND as well as ΛCDM, and think it through.

So what can we do to save ΛCDM?

Whatever it takes, per usual.

One possibility is that Crater 2 may represent the “bright” tip of the extremely low surface brightness “stealth” fossils predicted by Bovill & Ricotti. Their predictions are encouraging for getting the size and surface brightness in the right ballpark. But I see no reason in this context to expect such a low velocity dispersion. They anticipate dispersions consistent with the ΛCDM discussion above, and correspondingly high mass-to-light ratios that are greater than observed for Crater 2 (M/L ≈ 10⁴ rather than ~50).

A plausible suggestion I heard came from James Bullock. While noting that reionization should preclude the existence of galaxies in halos below 5 km/s, as we need for Crater 2, he suggested that tidal stripping could reduce an initially larger sub-halo to this point. I am dubious about this, as my impression from the simulations of Penarrubia was that the outer regions of the sub-halo were stripped first while leaving the inner regions (where the NFW cusp predicts high velocity dispersions) largely intact until near complete dissolution. In this context, it is important to bear in mind that the low velocity dispersion of Crater 2 is observed at large radii (1 kpc, not tens of pc). Still, I can imagine ways in which this might be made to work in this particular case, depending on its orbit. Tony Sohn has an HST program to measure the proper motion; this should constrain whether the object has ever passed close enough to the center of the Milky Way to have been tidally disrupted.

Josh Bland-Hawthorn pointed out to me that he made simulations that suggest a halo with a mass as low as 10⁷ M☉ could make stars before reionization and retain them. This contradicts much of the conventional wisdom outlined above because they find a much lower (and in my opinion, more realistic) feedback efficiency for supernova feedback than assumed in most other simulations. If this is correct (as it may well be!) then it might explain Crater 2, but it would wreck all the feedback-based explanations given for all sorts of other things in ΛCDM, like the missing satellite problem and the cusp-core problem. We can’t have it both ways.

Without super-efficient supernova feedback, the Local Group would be filled with a million billion ultrafaint dwarf galaxies!

I’m sure people will come up with other clever ideas. These will inevitably be ad hoc suggestions cooked up in response to a previously inconceivable situation. This will ring hollow to me until we explain why MOND can predict anything right at all.

In the case of Crater 2, it isn’t just a matter of retrospectively explaining the radial acceleration relation. One also has to explain why exceptions to the RAR occur following the very specific, bizarre, and unique EFE formulation of MOND. If I could do that, I would have done so a long time ago.

No matter what we come up with, the best we can hope to do is a post facto explanation of something that MOND predicted correctly in advance. Can that be satisfactory?

Crater 2: prediction verified.

The arXiv brought an early Xmas gift in the form of a measurement of the velocity dispersion of Crater 2. Crater 2 is an extremely diffuse dwarf satellite of the Milky Way. Upon its discovery, I realized there was an opportunity to predict its velocity dispersion based on the reported photometry. The fact that it is very large (half light radius a bit over 1 kpc) and relatively far from the Milky Way (120 kpc) makes it a unique and critical case. I will expand on that in another post, or you could read the paper. But for now:

The predicted velocity dispersion is σ = 2.1 +0.9/-0.6 km/s.

This prediction appeared in press in advance of the measurement (ApJ, 832, L8). The uncertainty reflects the uncertainty in the mass-to-light ratio.

The measured velocity dispersion is σ = 2.7 ± 0.3 km/s

as reported by Caldwell et al.

Isn’t that how science is supposed to work? Make the prediction first? Not just scramble to explain it after the fact?

Reckless disregard for the scientific method

There has been another attempt to explain away the radial acceleration relation as being fine in ΛCDM. That’s good; I’m glad people are finally starting to address this issue. But let’s be clear: this is a beginning, not a solution. Indeed, it seems more like a rush to create truth by assertion than an honest scientific investigation. I would be more impressed if these papers were (i) refereed rather than rushed onto the arXiv, and (ii) honestly addressed the requirements I laid out.

This latest paper complains about IC 2574 not falling on the radial acceleration relation. This is the galaxy that I just pointed out (about the same time they must have been posting the preprint) does adhere to the relation. So, I guess post-factual reality has come to science.

Rather than consider the assertions piecemeal, let’s take a step back. We have established that galaxies obey a single effective force law. Federico Lelli has shown that this applies to pressure supported elliptical galaxies as well as rotating disks.

The radial acceleration relation, including pressure supported early type galaxies and dwarf spheroidals.

Let’s start with what Newton said about the solar system: “Everything happens… as if the force between two bodies is directly proportional to the product of their masses and inversely proportional to the square of the distance between them.” Knowing how this story turns out, consider the following.

Suppose someone came to you and told you Newton was wrong. The solar system doesn’t operate on an inverse square law, it operates on an inverse cube law. It just looks like an inverse square law because there is dark matter arranged just so as to make this so. No matter whether we look at the motion of the planets around the sun, or moons around their planets, or any of the assorted miscellaneous asteroids and cometary debris. Everything happens as if there is an inverse square law, when really it is an inverse cube law plus dark matter arranged just so.

Would you believe this assertion?

I hope not. It is a gross violation of the rule of parsimony. Occam would spin in his grave.

Yet this is exactly what we’re doing with dark matter halos. There is one observed, effective force law in galaxies. The dark matter has to be arranged just so as to make this so.

Convenient that it is invisible.

Maybe dark matter will prove to be correct, but there is ample reason to worry. I worry that we have not yet detected it. We are well past the point that we should have. The supersymmetric sector in which WIMP dark matter is hypothesized to live flunked the “golden test” of the Bs meson decay, and looks more and more like a brilliant idea nature declined to implement. And I wonder why the radial acceleration relation hasn’t been predicted before if it is such a “natural” outcome of galaxy formation simulations. Are we doing fair science here? Or just trying to shove the cat back in the bag?


I really don’t know what the final answer will look like. But I’ve talked to a lot of scientists who seem pretty darn sure. If you are sure you know the final answer, then you are violating some basic principles of the scientific method: the principle of parsimony, the principle of doubt, and the principle of objectivity. Mind your confirmation bias!

That’ll do for now. What wonders await among tomorrow’s arXiv postings?

Going in Circles

Sam: This looks strangely familiar.

Frodo: That’s because we’ve been here before. We’re going in circles!

Last year, Oman et al. published a paper entitled “The unexpected diversity of dwarf galaxy rotation curves”. This term, diversity, has gained some traction among the community of scientists who simulate the formation of galaxies. From my perspective, this terminology captures some of the story, but misses most of it.

Let’s review.

Set the Wayback Machine, Mr. Peabody!

It was established (by van Albada & Sancisi and by Kent) in the ’80s that rotation curves were generally well described as maximal disks: the inner rotation curve was dominated by the stars, with a gradual transition to the flat outer part which required dark matter. By that time, I had become interested in low surface brightness (LSB) galaxies, which had not been studied in such detail. My nominal expectation was that LSB galaxies were stretched out versions of more familiar spiral galaxies. As such they’d also have maximal disks, but lower peak velocities (since V² ≈ GM/R and LSBs had larger R for the same M).
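That nominal expectation is easy to put in numbers. The sketch below uses the point-mass approximation V² ≈ GM/R with a hypothetical baryonic mass of 10¹⁰ M☉ and two hypothetical characteristic radii; the values are placeholders chosen only to show the scaling.

```python
import math

G = 4.301e-6  # Newton's constant in kpc (km/s)^2 / Msun


def v_circ(mass_msun, radius_kpc):
    """Characteristic rotation speed from V^2 ~ G M / R."""
    return math.sqrt(G * mass_msun / radius_kpc)


mass = 1e10                     # hypothetical baryonic mass, same for both galaxies
v_hsb = v_circ(mass, 3.0)       # compact, high surface brightness disk
v_lsb = v_circ(mass, 12.0)      # stretched out, low surface brightness disk

print(f"HSB: {v_hsb:.0f} km/s, LSB: {v_lsb:.0f} km/s, ratio {v_lsb / v_hsb:.2f}")
```

If the stars dominated the dynamics, stretching the disk by a factor of four at fixed mass would halve the characteristic velocity.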

By the mid-1990s, we had shown that this was not the case. LSB galaxies had the same rotation velocity as more concentrated galaxies of the same luminosity. This meant that LSB galaxies were dark matter dominated. This result is now widely known (to the point that it is often taken for granted), but it had not been expected. One interesting consequence was that LSB galaxies were a convenient laboratory for testing the dark matter hypothesis.

So what do we expect? There were, and are, many ideas for what dark matter should do. One of the leading hypotheses to emerge (around the same time) was the NFW halo obtained from structure formation simulations using cold dark matter. If a galaxy is dark matter dominated, then to a good approximation we expect the stars to act as tracer particles: the rotation curve should just reflect that of the underlying dark matter halo.

This did not turn out well. The rotation curves of low surface brightness galaxies do not look like NFW halos. One example is provided by the LSB galaxy F583-1, reproduced here from Fig. 14 of McGaugh & de Blok (1998).

The rotation curve of LSB galaxy F583-1 (filled points) as reported in McGaugh & de Blok (1998). Open points are what is left after subtracting the contribution of the stars and the gas: this is the rotation curve of the dark matter halo. Lines are example NFW halos. The data do not behave as predicted by NFW, a generic problem in LSB galaxies.

This was bad for NFW. But there is a more general problem, irrespective of the particular form of the dark matter halo. The M*-Mhalo relation required by abundance matching means that galaxies of the same luminosity live in nearly identical dark matter halos. When dark matter dominates, galaxies of the same luminosity should thus have the same rotation curve.

We can test this by comparing the rotation curves of Tully-Fisher pairs: galaxies with the same luminosity and flat rotation velocity, but different surface brightness. The high surface brightness NGC 2403 and low surface brightness UGC 128 are such a pair. So for 20 years, I have been showing their rotation curves:

The rotation curves of NGC 2403 (red points) and UGC 128 (open points). The top panel shows radius in physical units; the bottom panel shows the same data with the radius scaled by the scale length of the disk. This is larger for the LSB galaxies (blue lines in top panel) and has the net effect that the normalized rotation curves are practically indistinguishable.

If NGC 2403 and UGC 128 reside in the same dark matter halo, they should have basically the same rotation curve in physical units [V(R in kpc)]. They don’t. But they do have pretty much the same rotation curve when radius is scaled by the size of the disk [V(R/Rd)]. The dynamics “knows about” the baryons, in contradiction to the expectation for dark matter dominated galaxies.

Oman et al. have rediscovered the top panel (which they call diversity) but they don’t notice the bottom panel (which one might call uniformity). That galaxies of the same luminosity have different rotation curves remains surprising to simulations, at least the EAGLE and APOSTLE simulations Oman et al. discuss. (Note that APOSTLE was called LG by Oman et al.) Oman et al. illustrate the point with a number of rotation curves, for example, their Fig. 5:

Fig. 5 from Oman et al. (2015).

Oman et al. show that the rotation curves of LSB galaxies rise more slowly than predicted by simulations, and have a different shape. This is the same problem that we pointed out two decades ago. Indeed, note that the lower left panel is F583-1: the same galaxy noted above, showing the same discrepancy. The new thing is that these simulations include the effects of baryons (shaded regions). Baryons do not help to resolve the problem, at least as implemented in EAGLE and APOSTLE.

It is tempting to be snarky and say that this quantifies how many years simulators are behind observers. But that would be too generous. Observers had already noticed the systematic illustrated in the bottom panel of the NGC 2403/UGC 128 figure in the previous millennium. Simulators are just now coming to grips with the top panel. The full implications of the bottom panel seem not yet to have disturbed their dreams of dark matter.

Perhaps that passes snarky and on into rude, but it isn’t like we haven’t been telling them exactly this for years and years and years. The initial reaction was not mere disbelief, but outright scorn. The data disagree with simulations, so the data must be wrong! Seriously, this was the attitude. I don’t doubt that it persists in some of the colder, darker corners of the communal astro-theoretical intellect.

Indeed, Ludlow et al. provide an example. These are essentially the same people who wrote Oman et al. Though Oman et al. point out a problem when comparing the simulations to data, Ludlow et al. claim that the observed uniformity is “a Natural Outcome of Galaxy Formation in CDM halos”. Seriously. This is in their title.

Well, which is it? Is the diversity of rotation curves a problem for simulations? Or is their uniformity a “natural outcome”? This is not natural at all.

Note that the lower right panel of the figure from Oman et al. contains the galaxy IC 2574. This galaxy obviously deviates from the expectation of the simulations. These predict accelerations that are much larger than observed at small radii. Yet Ludlow et al. claim to explain the radial acceleration relation.

This situation is self-contradictory. Either the simulations explain the RAR, or they fail to explain the “diversity” of rotation curves. These are not independent statements.

I can think of two explanations: either (i) the data that define the RAR don’t include diverse galaxies, or (ii) the simulations are not producing realistic galaxies. In the latter case, it is possible that both the rotation curve and the baryon distribution are off in a way that maintains some semblance of the observed RAR.

I know (i) is not correct. Galaxies like F583-1 and IC 2574 help define the RAR. This is one reason why the RAR is problematic for simulations.

The rotation curve of IC 2574 (left) and its location along the RAR (right).

That leaves (ii). Though the correlation Ludlow et al. show misses the data, the real problem is worse. They only obtain the semblance of the right relation because the simulated galaxies apparently don’t have the same range of surface brightness as real galaxies. They’re not just missing V(R); now that they include baryons they are also getting the distribution of luminous mass wrong.

I have no doubt that this problem can be fixed. Doing so is “simply” a matter of revising the feedback prescription until the desired result is obtained. This is called fine-tuning.

What is Natural?

I have been musing for a while on the idea of writing about Naturalness in science, particularly as it applies to the radial acceleration relation. As a scientist, the concept of Naturalness is very important to me, especially when it comes to the interpretation of data. When I sat down to write, I made the mistake of first Googling the term.

The top Google hits bear little resemblance to what I mean by Naturalness. The closest match is specific to a particular, rather narrow concept in theoretical particle physics. I mean something much more general. I know many scientific colleagues who share this ideal. I also get the impression that this ideal is being eroded and cheapened, even among scientists, in our post-factual society.

I suspect the reason a better hit for Naturalness doesn’t come up more naturally in a Google search is, at least in part, an age effect. As wonderful a search engine as Google may be, it is lousy at identifying things B.G. (Before Google). The concept of Naturalness has been embedded in the foundations of science for centuries, to the point where it is absorbed by osmosis by students of any discipline: it doesn’t need to be formally taught; there probably is no appropriate website.

In many sciences, we are often faced with messy and incomplete data. In Astronomy in particular, there are often complicated astrophysical processes well beyond our terrestrial experience that allow a broad range of interpretations. Some of these are natural while others are contrived. Usually, the most natural interpretation is the correct one. In this regard, what I mean by Naturalness is closely related to Occam’s Razor, but it is something more as well. It is that which follows – naturally – from a specific hypothesis.

An obvious astronomical example: Kepler’s Laws follow naturally from Newton’s Universal Law of Gravity. It is a trivial amount of algebra to show that Kepler’s third Law, P² = a³ (with the period P in years and the semi-major axis a in astronomical units), follows as a direct consequence of Newton’s inverse square law. The first law, that orbits are ellipses, follows with somewhat more math. The second law follows from the conservation of angular momentum.
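
As a quick numerical check of the third law, here is a minimal sketch for the simplest case of a circular orbit, where the inverse square law gives P = 2π √(a³/GM). The constants (the Sun's GM, the astronomical unit, the Julian year) and the function name are assumptions chosen for illustration.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, gravitational parameter of the Sun
AU = 1.495978707e11         # m, astronomical unit
YEAR = 365.25 * 86400.0     # s, Julian year

def orbital_period_years(a_au):
    """Period of a circular orbit of radius a_au (in AU) around the Sun.

    Equating the inverse-square acceleration GM/a^2 with the centripetal
    acceleration (2*pi/P)^2 * a gives P^2 = 4*pi^2 * a^3 / GM.
    """
    a = a_au * AU
    return 2.0 * math.pi * math.sqrt(a**3 / GM_SUN) / YEAR

# Roughly Mercury, Earth, Jupiter, Neptune
for a_au in (0.387, 1.0, 5.203, 30.07):
    p = orbital_period_years(a_au)
    print(f"a = {a_au:6.3f} AU  P = {p:8.3f} yr  P^2/a^3 = {p**2 / a_au**3:.4f}")
```

The ratio P²/a³ comes out the same for every orbit, and is very close to one in these units. Nothing beyond the inverse square law was put in.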

It isn’t just that Newtonian gravity is the simplest explanation for planetary orbits. It is that all the phenomena identified by Kepler follow naturally from Newton’s insight. This isn’t obvious just by positing an inverse square law. But in exploring the consequences of such a hypothesis, one finds that one clue after another falls into place like the pieces of a jigsaw puzzle. This is what I mean by Naturalness.

I expect that this sense of Naturalness – the fitting together of the pieces of the puzzle – is what gave Newton encouragement that he was on the right path with the inverse square law. Let’s not forget that both Newton and his inverse square law came in for a lot of criticism at the time. Both Leibniz and Huygens objected to action at a distance, for good reason. I suspect this is why Newton prefaced his phrasing of the inverse square law with the modifier as if: “Everything happens… as if the force between two bodies is directly proportional to the product of their masses and inversely proportional to the square of the distance between them.” He is not claiming that this is right, that it has to be so. Just that it sure looks that way.

The situation with the radial acceleration relation in galaxies today is the same. Everything happens as if there is a single effective force law in galaxies. This is true regardless of what the ultimate reason proves to be.

The natural explanation for the single effective force law indicated by the radial acceleration relation is that there is indeed a unique force law at work. In this case, such a force law has already been hypothesized: MOND. Often MOND is dismissed for other reasons, though reports of its demise have repeatedly been exaggerated. Perhaps MOND is just the first approximation of some deeper theory. Perhaps, like action at a distance, we simply don’t yet understand the underlying reasons for it.

Another quick-trick simulation result


There has already been one very quick attempt to match ΛCDM galaxy formation simulations to the radial acceleration relation (RAR). Another rapid preprint by the Durham group has appeared. It doesn’t do everything I ask for from simulations, but it does do a respectable number of them. So how does it do?

First, there is some eye-rolling language in the title and the abstract. Two words: natural (in the title) and accommodated (in the abstract). I can’t not address these before getting to the science.

Natural. As I have discussed repeatedly in this blog, and in the refereed literature, there is nothing natural about this. If it were so natural, we’d have been talking about it since Bob Sanders pointed this out in 1990, or since I quantified it better in 1998 and 2004. Instead, the modus operandi of much of the simulation community over the past couple of decades has been to pour scorn on the quality of rotation curve data because it did not look like their simulations. Now it is natural?


Accommodate. Accommodation is an important issue in the philosophy of science. I have no doubt that the simulators are clever enough to find a way to accommodate the data. That is why I have, for 20 years, been posing the question What would falsify ΛCDM? I have heard (or come up with myself) only a few good answers, and I fear the real answer is that it can’t be. It is so flexible, with so many freely adjustable parameters, that it can be made to accommodate pretty much anything. I’m more impressed by predictions that come ahead of time.

That’s one reason I want to see what the current generation of simulations say before entertaining those made with full knowledge of the RAR. At least these quick preprints are using existing simulations, so while not predictions in the strictest sense, at least they haven’t been fine-tuned specifically to reproduce the RAR. Lots of other observations, yes, but not this particular one.

Ludlow et al. show a small number of model rotation curves that vary from wildly unrealistic (their NoAGN models peak at 500 km/s; no disk galaxy in the universe comes anywhere close to that… Vera Rubin once offered a prize for any that exceeded 300 km/s) to merely implausible (their StrongFB model is in the right ballpark, but has a very rapidly rising rotation curve). In all cases, their dark matter halos seem little affected by feedback, in contrast to the claims of other simulation groups. It will be interesting to follow the debate between simulators as to what we should really expect.

They do find a RAR-like correlation. Remarkably, the details don’t seem to depend much on the feedback scheme. This motivates some deeper consideration of the RAR.

The RAR plots observed centripetal acceleration, g_obs, against that predicted by the observed distribution of baryons, g_bar. We chose these coordinates because this seems to be the fundamental empirical correlation, and the two quantities are measured in completely independent ways: rotation curves vs. photometry. While measured independently, some correlation is guaranteed: physically, g_obs includes g_bar. Things only become weird when the correlation persists as g_obs ≫ g_bar.
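
For reference, the fitting function used for the data in McGaugh, Lelli, & Schombert (2016) is g_obs = g_bar / (1 − exp(−√(g_bar/g†))), with a single fit parameter g†. The sketch below evaluates it to show the two regimes. The function and variable names are assumptions for illustration.

```python
import math

G_DAGGER = 1.2e-10  # m s^-2, acceleration scale fitted to the data

def g_obs_from_g_bar(g_bar, g_dagger=G_DAGGER):
    """Fitting function g_obs = g_bar / (1 - exp(-sqrt(g_bar / g_dagger)))."""
    x = math.sqrt(g_bar / g_dagger)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is small
    return g_bar / (-math.expm1(-x))

for g_bar in (1e-8, 1e-9, 1e-10, 1e-11, 1e-12):
    g_obs = g_obs_from_g_bar(g_bar)
    print(f"g_bar = {g_bar:.0e}  g_obs = {g_obs:.2e}  g_obs/g_bar = {g_obs / g_bar:.2f}")
```

At high accelerations the ratio tends to one, which is the guaranteed part of the correlation. At low accelerations g_obs approaches √(g_bar g†), so a larger g† means a larger predicted g_obs at the same g_bar.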

The models are well fit by the functional form we found for the data, but with a different value of the fit parameter: g† = 3 rather than 1.2 × 10⁻¹⁰ m s⁻². That’s a factor of 2.5 off – a factor that is considered fatal for MOND in galaxy clusters. Is it OK here?

The uncertainty in the fit value is 1.20 ± 0.02. So formally, 3 is off by 90σ. However, the real dominant uncertainty is systematic: what is the true mean mass-to-light ratio at 3.6 microns? We estimated the systematic uncertainty to be ± 0.24 based on an extensive survey of plausible stellar population models. So 3 is only 7.5σ off.
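
The arithmetic behind those two numbers is simple enough to spell out. This sketch uses only the values quoted above, in units of 10⁻¹⁰ m s⁻²; the helper name is an assumption.

```python
fit_value = 1.20       # fit to the data
model_value = 3.0      # fit to the simulations
random_err = 0.02      # formal (random) uncertainty on the data fit
systematic_err = 0.24  # systematic uncertainty from the mass-to-light ratio

def offset_in_sigma(value, reference, sigma):
    """Distance between value and reference in units of sigma."""
    return abs(value - reference) / sigma

print(f"ratio: {model_value / fit_value:.1f}")
print(f"random only: {offset_in_sigma(model_value, fit_value, random_err):.1f} sigma")
print(f"systematic:  {offset_in_sigma(model_value, fit_value, systematic_err):.1f} sigma")
```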

The problem with systematic uncertainties is that they do not obey Gaussian statistics. So I decided to see what we might need to do to obtain g† = 3 × 10⁻¹⁰ m s⁻². This can be done if we take sufficient liberties with the mass-to-light ratio.

The radial acceleration relation as observed (open points fit by blue line) and modeled (red line). Filled points are the same data with the disk mass-to-light ratio reduced by a factor of two.

Indeed, we can get in the right ballpark simply by reducing the assumed mass-to-light ratio of stellar disks by a factor of two. We don’t make the same factor of two adjustment to the bulge components, because the data don’t approach the 1:1 line at high accelerations if this is done. So rather than our fiducial model with M*/L(disk) = 0.5 Msun/Lsun and M*/L(bulge) = 0.7 Msun/Lsun (open points in plot), we have M*/L(disk) = 0.25 Msun/Lsun and M*/L(bulge) = 0.7 Msun/Lsun (filled points in plot). Let’s pretend we don’t know anything about stars and ignore the fact that this change corresponds to truncating the IMF of the stellar disk so that M dwarfs don’t exist in disks, but they do in bulges. We then find a tolerable match to the simulations (red line).

Amusingly, the data are now more linear than the functional form we assumed. If this is what we thought stars did, we wouldn’t have picked the functional form the simulations apparently reproduce. We would have drawn a straight line through the data – at least most of it.

That much isn’t too much of a problem for the models, though it is an interesting question whether they get the shape of the RAR right for the normalization they appear to demand. There is a serious problem though. That becomes apparent in the lowest acceleration points, which deviate strongly below the red line. (The formal error bars are smaller than the size of the points.)

It is easy to understand why this happens. As we go from high to low accelerations, we transition from bulge dominance to stellar disk dominance to gas dominance. Those last couple of bins are dominated by atomic gas, not stars. So it doesn’t matter what we adopt for the stellar mass-to-light ratio. That’s where the data sit: well off the simulated line.

Is this fatal for these models? As presented, yes. The simulations persist in predicting higher accelerations than observed. This has been the problem all along.

There are other issues. The scatter in the simulated RAR is impressively small. Much smaller than I expected. Smaller even than the observational scatter. But the latter is dominated by observational errors: the intrinsic relation is much tighter, consistent with a δ-function. The intrinsic scatter is what they should be comparing their results to. They either fail to understand, or conveniently choose to gloss over, the distinction between intrinsic scatter and that induced by random errors.

It is worth noting that some of the same authors make this same mistake – and it is a straight up mistake – in discussing the scatter in the baryonic Tully-Fisher relation. The assertion there is “the scatter in the simulated BTF is smaller than observed”. But the observed scatter is dominated by observational errors, which we have taken great care to assess. Once this is done, there is practically no room left over for intrinsic scatter, which is what the models display. This is important, as it completely inverts the stated interpretation. Rather than having less scatter than observed, the simulations exhibit more scatter than allowed.
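
To make the distinction concrete, here is a minimal sketch of how intrinsic scatter is usually separated from error-induced scatter, assuming independent and roughly Gaussian errors so that the two add in quadrature. The numbers are hypothetical and chosen only to show how the interpretation can invert.

```python
import math

def intrinsic_scatter(observed, from_errors):
    """Intrinsic scatter left after removing error-induced scatter in quadrature.

    Assumes observed^2 = intrinsic^2 + from_errors^2. Returns 0.0 when the
    errors already account for all of the observed scatter.
    """
    excess = observed**2 - from_errors**2
    return math.sqrt(excess) if excess > 0.0 else 0.0

# Hypothetical values in dex, for illustration only
observed = 0.13      # total scatter measured in the data
from_errors = 0.12   # scatter expected from observational errors alone
simulated = 0.08     # scatter in a simulated relation (no observational errors)

allowed = intrinsic_scatter(observed, from_errors)
print(f"allowed intrinsic scatter: {allowed:.2f} dex")
print("simulated < observed:", simulated < observed)
print("simulated < allowed intrinsic:", simulated < allowed)
```

With these made-up numbers the simulated scatter is smaller than the observed scatter, yet larger than the intrinsic scatter the data allow. The second comparison is the one that matters.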

Can these problems be fixed? No doubt. See the comments on accommodation above.

La Fin de Quoi?


Last time, I addressed some of the problems posed by the radial acceleration relation for galaxy formation theory in the LCDM cosmogony. Predictably, some have been quick to assert there is no problem at all. The first such claim is by Keller & Wadsley in a preprint titled “La Fin du MOND: LCDM is Fully Consistent with SPARC Acceleration Data.”

There are good things about this paper, bad things, and the potential for great ugliness.


The good:

  This is exactly the reaction that I had hoped to see in response to the radial acceleration relation (RAR): people going to their existing simulations and checking what answer they got. The answer looks promising. The same relation is apparent in the simulations as in the data. That’s good.

  These simulations already existed. They haven’t been tuned to match this particular observation. That’s good. The cynic might note that the last 15+ years of galaxy formation simulations have been driven by the need to add feedback to match data, including the shapes of rotation curves. Nevertheless, I see no guarantee that the RAR will fall out of this process.

  The scatter in the simulations is 0.05 dex. The scatter in the data not explained by random errors is 0.06 dex. This agreement is good. I think the source of the scatter needs to be explored further (see below), but it is at least in the right ballpark, which is by no means guaranteed.

  The authors make a genuine prediction for how the RAR should evolve with redshift. That isn’t just good; it is bold and laudable.

The bad:

  There are only 18 simulated galaxies to compare to 153 real ones. I appreciate the difficulty in generating these simulations, but we really need a bigger sample. The large number of sampled points (1800) is less important given the simulators’ ability to parse the data as finely as their CPU allows them to resolve. I also wonder if the lowest acceleration points extend beyond the range sampled in comparable galaxies. Typically the data peter out around an HI surface density of 1 Msun/pc^2.

  The comparison they make to Fig. 3 of arxiv:1609.05917 is great. I would like to see something like Figs. 1 and 2 from that paper as well. What range of galaxy properties do the models span? What do individual mass models look like?

Fig. 1 from McGaugh, Lelli, & Schombert (2016) showing the range of luminosity and surface brightness covered by the SPARC data. Galaxies range over a factor of 50,000 in luminosity. The shaded region shows the range explored by the simulations discussed by Keller & Wadsley, which cover a factor of 15. Note that this is a logarithmic scale. On a linear scale, the simulations cover 0.03% of the range covered by the data along the x-axis. The range covered along the y-axis was not specified.

  My biggest concern is that there is a limited dynamic range in the simulations, which span only a factor of 15 in disk mass: from 1.7E10 to 2.7E11 Msun. For comparison, the data span 1E7 to 5E11 Lsun in [3.6] luminosity, a factor of 50,000. The simulations only sample the top 0.03% of this range.

  Basically, the simulated galaxies go from a little less massive than the Milky Way up to a bit more massive than Andromeda. Comparing this range to the RAR and declaring the problem solved is like fitting the Milky Way and Andromeda and declaring all problems in the Local Group solved without looking at any of the dwarfs. It is at lower mass scales and for lower surface brightness galaxies that problems become severe. Consequently, the most the authors can claim is a promising start on understanding a tiny fraction of bright galaxies, not a complete explanation of the RAR.

  Indeed, while the authors quantify the mass range over which their simulated galaxies extend, they make no mention of either size or surface brightness. Are these comparable to real galaxies of similar mass? Too narrow a range in size at fixed mass, as seems likely in a small sample, may act to artificially suppress the scatter.  Put another way: if the simulated galaxies only cover a tiny region of Fig. 1 above, it is hardly surprising if they exhibit little scatter.

  The apparent match between the simulated and observed scatter seems good. But the “left over” observational scatter of 0.06 dex is the same as what we expect from scatter in the mass-to-light ratio.  That is irreducible. There has to be some variation in stellar populations, and it is much easier to imagine this number getting bigger than being much smaller.

  In the simulations, the stellar mass is presumably known perfectly, so I expect the scatter has a different source. Presumably there is scatter from halo to halo as seen in other simulations. That’s natural in LCDM, but there isn’t any room for it if we also have to accommodate scatter from the mass-to-light ratio. The apparent equality of observed and simulated scatter is meaningless if they represent scatter in different quantities.

  I have trouble believing that the RAR follows simply from dissipative collapse without feedback. I’ve worked on this before, so I’m pretty sure it does not work this way. It is true that a single model does something like this as a result of dissipative collapse. It is not true that an ensemble of such models is guaranteed to fall on the same relation.

  There are many examples of galaxies with the same mass but very different scale lengths. In the absence of feedback, shorter scale lengths lead to more compression of the dark matter halo. One winds up with more dark matter where there are more baryons. This is the opposite of what we see in the data.

  This makes me suspect the dynamic range in the simulations is a problem. Not only do they cover little range in mass compared to the data, but this particular conclusion may only be reached if there is virtually no dynamic range in size at a given mass. That is hardly surprising given the small sample size.
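
The dynamic-range numbers quoted above can be checked directly. This sketch compares the ranges exactly as quoted, without converting between disk mass and [3.6] luminosity through a mass-to-light ratio, which is a simplifying assumption.

```python
import math

sim_lo, sim_hi = 1.7e10, 2.7e11   # simulated disk masses, Msun
data_lo, data_hi = 1e7, 5e11      # SPARC [3.6] luminosities, Lsun

sim_factor = sim_hi / sim_lo      # quoted as a factor of 15
data_factor = data_hi / data_lo   # a factor of 50,000

print(f"simulations span a factor of {sim_factor:.1f}")
print(f"data span a factor of {data_factor:.0f}")
print(f"ratio of the two factors: {100 * 15 / data_factor:.2f}%")
print(f"in decades: {math.log10(sim_factor):.1f} of {math.log10(data_factor):.1f}")
```

The 0.03% figure corresponds to the ratio of the two factors, 15 to 50,000. On a logarithmic axis the simulations cover a bit more than one decade out of nearly five.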

The ugly:

  The title.

  This paper has nothing to do with MOND, nor says anything about it. Why is it in the title?

  At best, the authors have shown that, over a rather limited dynamic range, simulations in LCDM might reproduce post facto what MOND predicted a priori. If so, LCDM survives this test (as far as it goes). But in no respect can this be considered a problem for MOND, which predicted the phenomenon over 30 years ago. This is a classic problem in the philosophy of science: should we put more weight on the a priori prediction, or on the capacity of a more flexible theory to accommodate the same observation later on?

  The title is revealing of a deep-rooted bias. It tarnishes what might be an important result and does a disservice to the objectivity we’re supposed to value in science.


  I am eager to see whether other simulations agree with these results. Not all simulators implement feedback in the same way, nor get the same results. The most dangerous aspect of this paper is that it may give people an excuse to think the problem is solved so they never have to think about it again. The RAR is a test that needs to be applied every time to each and every batch of simulations. If they don’t pass this test, they’re wrong. Unfortunately, there is precedent in the galaxy formation community to take reassurances such as this for granted, and not to bother to perform the test.



Four Strikes


So the radial acceleration relation is a new law of nature. What does it mean?

One reason we have posed it as a law of nature is that it is interpretation-free. It is a description of how nature works – in this case, a rule for how galaxies rotate. Why nature behaves thus is another matter.

Some people have been saying the RAR (I tire of typing out “radial acceleration relation”) is a problem for dark matter, while others seem to think otherwise. Let’s examine this.

The RAR has a critical scale g† = 1.2 × 10⁻¹⁰ m s⁻². At high acceleration, above this scale, we don’t need dark matter: systems like the solar system or the centers of high surface brightness galaxies are WYSIWYG. At low accelerations, below this scale, we begin to need dark matter. The lower the acceleration, the more dark matter we need.
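
To get a feel for how small this scale is, here is a rough sketch comparing it with the Sun's Newtonian pull on the planets. It assumes standard values for the Sun's GM and the astronomical unit and ignores the Galaxy's own gravitational field, so it is an order-of-magnitude illustration only.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, gravitational parameter of the Sun
AU = 1.495978707e11         # m, astronomical unit
G_DAGGER = 1.2e-10          # m s^-2, the critical scale of the RAR

def solar_acceleration(r_au):
    """Newtonian acceleration toward the Sun at r_au astronomical units."""
    return GM_SUN / (r_au * AU) ** 2

for name, r_au in (("Earth", 1.0), ("Neptune", 30.07)):
    g = solar_acceleration(r_au)
    print(f"{name:8s} g = {g:.2e} m s^-2, or {g / G_DAGGER:.1e} times the critical scale")

# Distance at which the Sun's pull alone would drop to the critical scale
r_cross_au = math.sqrt(GM_SUN / G_DAGGER) / AU
print(f"the Sun's pull falls to the critical scale at about {r_cross_au:.0f} AU")
```

Every planet sits many orders of magnitude above the critical scale, which is why the solar system is firmly in the WYSIWYG regime.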

OK, so this means there is little to no dark matter when the baryons are dense (high g_bar), but progressively more as g_bar becomes smaller than the critical scale g†. Low g_bar happens when the surface density of baryons is low. So the amount of dark matter scales inversely with baryonic surface density.

That’s weird.

This is weird for a number of reasons. First, there is no reason for the dark matter to care what the baryons are doing when dark matter dominates. When g_obs ≫ g_bar the dark matter greatly outweighs the baryons, which simply become tracer particles in the gravitational potential of the dark matter halo. There is no reason for the dark matter to know or care about what the baryonic tracer particles are doing. And yet the RAR persists as a tight correlation well into this regime. It is as if the baryonic tail wags the dark matter dog.

Second, there should be more dark matter where there are more baryons. Galaxies form by baryons falling into dark matter halos. As they do so, they dissipate energy and sink to the center of the halo. In doing so, they drag some of the dark matter along with them in a process commonly referred to as “adiabatic compression.” In practice, the process need not be adiabatic, but the dark matter must respond to the rearrangement of the gravitational potential caused by the dissipative infall of the baryons.

These topics have been discussed at great length in the galaxy formation literature. Great arguments have erupted time and again about how best to implement the compression in models, and how big the effect is in practice. These details need not concern us here. What matters is that they are non-negotiable fundamentals of the dark matter paradigm.

Galaxies form by baryonic infall within dark matter halos. The halos form first while the baryons are still coupled to the photons prior to last scattering. This is one of the fundamental reasons we need non-baryonic cold dark matter that does not interact with photons: to get a jump on structure formation. Without it, we cannot get from the smooth initial condition observed in the cosmic microwave background to the rich amount of structure we see today.

As the baryons fall into halos, they must sink to the center to form galaxies. Why? Dark matter halos are much bigger than the galaxies that reside within them. All tracers of the gravitational potential say so. Initially, this might seem odd, as the baryons might be expected to just track the dominant dark matter. But baryons are different: they can dissipate energy. By so doing, they can sink to the center – not all baryons need to sink to the centers of their dark matter halos, but enough to make a galaxy. This they must do in order to form the galaxies that we observe – galaxies that are more centrally condensed than their dark matter halos.

That’s enough, in return, to affect the dark matter. As the baryons dissipate, the gravitational potential is non-stationary. The dark matter distribution must respond to this change in the total gravitational potential. The net result is a further concentration of the dark matter towards the center of the halo: in effect, the baryons drag some dark matter along with them.

I have worked on adiabatic compression myself, but a nice illustration is given by this figure from Elbert et al. (2016):

Dark matter halos formed in numerical simulations illustrating the effect of adiabatic compression. On the left is a pristine halo without baryons. In the middle is a halo after formation of a disk galaxy. On the right is a halo after formation of a more compact disk.

One can see by eye the compression caused by the baryons. The more dense the baryons become, the more dark matter they drag towards the center with them.
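
The same trend can be reproduced with a toy calculation. This is a minimal sketch of the classic adiabatic contraction prescription (Blumenthal et al. 1986), in which r M(r) is conserved for dark matter shells on circular orbits. It is not the method used in the simulations shown above. The isothermal initial halo, the 10% disk mass fraction, the spherical treatment of the disk, and all names are assumptions made for illustration. Radii are in units of the halo radius and masses in units of the total mass.

```python
import math

F_BARYON = 0.1   # assumed fraction of the total mass that ends up in the disk

def initial_mass(r):
    """Total mass inside r before infall: isothermal halo with M(1) = 1."""
    return r

def disk_mass(r, scale_length):
    """Mass of an exponential disk inside r (treated as spherical)."""
    x = r / scale_length
    return F_BARYON * (1.0 - (1.0 + x) * math.exp(-x))

def contracted_radius(r_initial, scale_length):
    """Final radius of the dark matter shell that started at r_initial.

    Solves r_i * M_i(r_i) = r_f * [M_disk(r_f) + (1 - f) * M_i(r_i)] for r_f
    by bisection; the right-hand side increases monotonically with r_f.
    """
    target = r_initial * initial_mass(r_initial)
    dark = (1.0 - F_BARYON) * initial_mass(r_initial)
    lo, hi = 0.0, 2.0 * r_initial
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mid * (disk_mass(mid, scale_length) + dark) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for scale_length in (0.03, 0.01):   # extended disk, then compact disk
    r_f = contracted_radius(0.05, scale_length)
    print(f"disk scale length {scale_length}: shell at 0.05 moves to {r_f:.4f}")
```

The shell moves inward in both cases, and further inward for the more compact disk. The same dark mass ends up inside a smaller radius, so denser baryons mean denser dark matter.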

The fundamental elements of the dark matter paradigm, galaxy formation by baryonic infall and dissipation accompanied by the inevitable compression of the dark matter halo, inevitably lead us to expect that more baryons in the center means more dark matter as well. We observe the exact opposite in the RAR. As baryons become denser, they become the dominant component, to the point where they are the only component. Rather than more dark matter as we expect, more baryons means less dark matter in reality.

Third, the RAR correlation is continuous and apparently scatter-free over all accelerations. The data map from the regime of no dark matter at high accelerations to lots of dark matter at low accelerations in perfect 1:1 harmony with the distribution of the baryons. If we observe the distribution of baryons, we know the corresponding distribution of dark matter. The tail doesn’t just wag the dog. It tells it to sit, beg, and roll over.

Fourth, there is a critical scale in the data, g†. That’s the scale where the mass discrepancy sets in. This is a purely empirical statement.

Cold dark matter is scale free. Being scale free is fundamental to its nature. It is essential to fitting the large scale structure, which it does quite well.

So why is there this ridiculous acceleration scale in the data?!? Who ordered this?! It should not be there.

So yes, the radial acceleration relation is a problem for the cold dark matter paradigm.

Tully-Fisher: the Second Law


Previously I noted how we teach about Natural Law, but we no longer speak in those terms. All the Great Laws are already known, right? Surely there can’t be such things left to discover!

That rotation curves tend towards asymptotic flatness is, for all practical purposes, a law of nature. It is tempting to leap straight to the interpretation (dark matter!), but it is worth appreciating the discovery for itself. It isn’t like rotation curves merely exceed what can be explained by the stars and gas, nor that they rise and fall willy-nilly. The striking, ever-repeated observation is an indefinitely extended radial range with near-constant rotation velocity.

The rotation curves of galaxies over a large dynamic range in mass, from the most massive spiral with a well measured rotation curve (UGC 2885) to tiny, low mass, low surface brightness, gas rich dwarfs.

New Laws of Nature aren’t discovered every day. This discovery should have warranted a Nobel prize for Vera Rubin and Albert Bosma. If only we were able to see it in those terms three decades ago. Instead, we phrased it in terms of dark matter, and that was a radical enough idea that it has to await verification in the laboratory. Now the prize will go to some experimental group (should there ever be a successful detection) while the new law of nature goes unrecognized. That’s OK – there should be a Nobel prize for a verified laboratory detection of non-baryonic dark matter, should that ever occur – but there should also be a Nobel prize for flat rotation curves, and it should have been awarded a long time ago.

It takes a while to appreciate these things. Another well known yet unrecognized Law of Nature is the Tully-Fisher relation. First discovered as a relation between luminosity and line-width (figure from Tully & Fisher 1977), this relation is most widely known for its utility in measuring the cosmic distance scale.

The original Tully-Fisher relation.

At the time, it gave the “wrong” answer (H0 ≠ 50), and Sandage is reputed to have suppressed its publication for a couple of years. This is one reason astronomy journals have, and should have, a high acceptance rate – too many historical examples of bad behavior to protect sacred cows.

Besides its utility as a distance indicator, the Tully-Fisher relation has profound implications for physical theory. It is not merely a relation between two observables of which only one is distance-dependent. It is a link between the observed mass and the physics that sets the flat velocity.

The stellar mass Tully-Fisher relation (left) and the baryonic Tully-Fisher relation (right). In both cases, the x-axis is the flat rotation velocity measured from resolved rotation curves. In the right panel, the y-axis is the baryonic mass – the sum of observed stars and gas. The latter appears to be a law of nature from which galaxies never stray.

The original y-axis of the Tully-Fisher relation, luminosity, was a proxy for stellar mass. The line-width was a proxy for rotation velocity, of which there are many variants. At this point it is clear that the more fundamental variables are baryonic mass – the sum of observed stars and gas – and the flat rotation velocity.

I had an argument – of the best scientific sort – with Renzo Sancisi in 1995. I was disturbed that our then-new low surface brightness galaxies were falling on the same Tully-Fisher relation as previously known high surface brightness galaxies of comparable luminosity. The conventional explanation for the Tully-Fisher relation up to that point invoked Freeman’s Law – the notion (now deprecated) that all spirals had the same central surface brightness. This had the effect of suppressing the radius term in Newton’s

V² = GM/R.

Galaxies followed a scaling between luminosity (mass) and velocity because they all had the same R at a given M.

By construction, this was not true for low surface brightness galaxies. They have larger radii at fixed luminosity (representing the mass M). That’s what makes them low surface brightness – their stars are more spread out. Yet they fall smack on the same Tully-Fisher relation!
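
A short sketch shows what Newton leads us to expect here. If all disks share one mean surface density Σ, then R = √(M/πΣ) and V² = GM/R gives V⁴ ∝ MΣ: a single Tully-Fisher-like relation. Lower Σ at fixed M should then lower V. The surface densities below are hypothetical, and the disk is reduced to a single characteristic radius, which is a deliberate simplification.

```python
import math

G = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / Msun

def newton_velocity(mass, radius):
    """Circular speed from V^2 = G M / R (mass in Msun, radius in kpc)."""
    return math.sqrt(G * mass / radius)

def radius_at_surface_density(mass, sigma):
    """Radius of a disk with the given mass and mean surface density (Msun/kpc^2)."""
    return math.sqrt(mass / (math.pi * sigma))

sigma_hsb = 1.0e8            # hypothetical high surface brightness disk
sigma_lsb = sigma_hsb / 16   # hypothetical low surface brightness disk

for mass in (1e9, 1e10, 1e11):
    v_hsb = newton_velocity(mass, radius_at_surface_density(mass, sigma_hsb))
    v_lsb = newton_velocity(mass, radius_at_surface_density(mass, sigma_lsb))
    print(f"M = {mass:.0e}  V_hsb = {v_hsb:6.1f}  V_lsb = {v_lsb:6.1f}  ratio = {v_lsb / v_hsb:.2f}")
```

A factor of 16 in surface density should shift the velocity by a factor of two at fixed mass. That is the offset one would expect low surface brightness galaxies to show, and it is not observed.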

Renzo and I looked at the result and argued up and down, this way and that about the data, the relation, everything. We were getting no closer to understanding it, or agreeing on what it meant. Finally he shouted “TULLY-FISHER IS GOD!” to which I retorted “NEWTON IS GOD!”

It was a healthy exchange of viewpoints.

Renzo made his assertion because, in his vast experience as an observer, galaxies always fell on the Tully-Fisher relation. I made mine, because, well, duh. The problem is that the observed Tully-Fisher relation does not follow from Newton.

But Renzo was right. Galaxies do always fall on the Tully-Fisher relation. There are no residuals from the baryonic Tully-Fisher relation. Neither size nor surface brightness are second parameters. The relation cares not whether a galaxy disk has a bar or not. It does not matter whether a galaxy is made of stars or gas. It does not depend on environment or pretty much anything else one can imagine. Indeed, there is no intrinsic scatter to the relation, as best we can tell. If a galaxy rotates, it follows the baryonic Tully-Fisher relation.

The baryonic Tully-Fisher relation is a law of nature. If you measure the baryonic mass, you know what the flat rotation speed will be, and vice-versa. The baryonic Tully-Fisher relation is the second law of rotating galaxies.
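
In practice the relation works as a two-way lookup. The sketch below assumes a pure power law M_b = A V_f⁴. Neither the slope of 4 nor the normalization A ≈ 47 Msun km⁻⁴ s⁴ is stated in this post; both are assumed from published calibrations and should be treated as adjustable.

```python
A_BTFR = 47.0   # Msun (km/s)^-4, assumed normalization, not taken from this post

def baryonic_mass(v_flat, a=A_BTFR):
    """Baryonic mass (Msun) from flat rotation speed (km/s), M_b = A * V_f^4."""
    return a * v_flat**4

def flat_velocity(m_baryon, a=A_BTFR):
    """Flat rotation speed (km/s) from baryonic mass (Msun)."""
    return (m_baryon / a) ** 0.25

for v in (50.0, 100.0, 200.0):
    m = baryonic_mass(v)
    print(f"V_f = {v:5.0f} km/s  M_b = {m:.2e} Msun  back to V_f = {flat_velocity(m):.0f} km/s")
```

With a slope of 4, doubling the rotation speed corresponds to a factor of 16 in baryonic mass.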