A research programme is said to be progressing as long as its theoretical growth anticipates its empirical growth, that is as long as it keeps predicting novel facts with some success (“progressive problemshift”); it is stagnating if its theoretical growth lags behind its empirical growth, that is as long as it gives only post-hoc explanations either of chance discoveries or of facts anticipated by, and discovered in, a rival programme (“degenerating problemshift”) (Lakatos, 1971, pp. 104–105).
The recent history of modern cosmology is rife with post-hoc explanations of unanticipated facts. The cusp-core problem and the missing satellites problem are prominent examples. These are explained after the fact by invoking feedback, a vague catch-all that many people agree solves these problems even though none of them agree on how it actually works.
There are plenty of other problems. To name just a few: satellite planes (unanticipated correlations in phase space), the emptiness of voids, and the early formation of structure (see section 4 of Famaey & McGaugh for a longer list and section 6 of Silk & Mamon for a positive spin on our list). Each problem is dealt with in a piecemeal fashion, often by invoking solutions that contradict each other while buggering the principle of parsimony.
It goes like this. A new observation is made that does not align with the concordance cosmology. Hands are wrung. Debate is had. Serious concern is expressed. A solution is put forward. Sometimes it is reasonable, sometimes it is not. In either case it is rapidly accepted so long as it saves the paradigm and prevents the need for serious thought. (“Oh, feedback does that.”) The observation is no longer considered a problem through familiarity and exhaustion of patience with the debate, regardless of how [un]satisfactory the proffered solution is. The details of the solution are generally forgotten (if ever learned). When the next problem appears the process repeats, with the new solution often contradicting the now-forgotten solution to the previous problem.
This has been going on for so long that many junior scientists now seem to think this is how science is supposed to work. It is all they’ve experienced. And despite our claims to be interested in fundamental issues, most of us are impatient with re-examining issues that were thought to be settled. All it takes is one bold assertion that everything is OK, and the problem is perceived to be solved whether it actually is or not.
That is the process we apply to little problems. The Big Problems remain the post hoc elements of dark matter and dark energy. These are things we made up to explain unanticipated phenomena. That we need to invoke them immediately casts the paradigm into what Lakatos called degenerating problemshift. Once we’re there, it is hard to see how to get out, given our propensity to overindulge in the honey that is the infinity of free parameters in dark matter models.
Note that there is another aspect to what Lakatos said about facts anticipated by, and discovered in, a rival programme. Two examples spring immediately to mind: the Baryonic Tully-Fisher Relation and the Radial Acceleration Relation. These are predictions of MOND that were unanticipated in the conventional dark matter picture. Perhaps we can come up with post hoc explanations for them, but that is exactly what Lakatos would describe as degenerating problemshift. The rival programme beat us to it.
In my experience, this is a good description of what is going on. The field of dark matter has stagnated. Experimenters look harder and harder for the same thing, repeating the same experiments in hope of a different result. Theorists turn knobs on elaborate models, gifting themselves new free parameters every time they get stuck.
On the flip side, MOND keeps predicting novel facts with some success, so it remains in the stage of progressive problemshift. Unfortunately, MOND remains incomplete as a theory, and doesn’t address many basic issues in cosmology. This is a different kind of unsatisfactory.
In the meantime, I’m still waiting to hear a satisfactory answer to the question I’ve been posing for over two decades now. Why does MOND get any predictions right? It has had many a priori predictions come true. Why does this happen? It shouldn’t. Ever.
Note: this is a guest post by David Merritt, following on from his paper on the philosophy of science as applied to aspects of modern cosmology.
Stacy kindly invited me to write a guest post, expanding on some of the arguments in my paper. I’ll start out by saying that I certainly don’t think of my paper as a final word on anything. I see it more like an opening argument — and I say this, because it’s my impression that the issues which it raises have not gotten nearly the attention they deserve from the philosophers of science. It is that community that I was hoping to reach, and that fact dictated much about the content and style of the paper. Of course, I’m delighted if astrophysicists find something interesting there too.
My paper is about epistemology, and in particular, whether the standard cosmological model respects Popper’s criterion of falsifiability— which he argued (quite convincingly) is a necessary condition for a theory to be considered scientific. Now, falsifying a theory requires testing it, and testing it means (i) using the theory to make a prediction, then (ii) checking to see if the prediction is correct. In the case of dark matter, the cleanest way I could think of to do this was via so-called “direct detection”, since the rotation curve of the Milky Way makes a pretty definite prediction about the density of dark matter at the Sun’s location. (Although as I argued, even this is not enough, since the theory says nothing at all about the likelihood that the DM particles will interact with normal matter even if they are present in a detector.)
What about the large-scale evidence for dark matter — things like the power spectrum of density fluctuations, baryon acoustic oscillations, the CMB spectrum etc.? In the spirit of falsification, we can ask what the standard model predicts for these things; and the answer is: it does not make any definite prediction. The reason is that — to predict quantities like these — one needs first to specify the values of a set of additional parameters: things like the mean densities of dark and normal matter; the numbers that determine the spectrum of initial density fluctuations; etc. There are roughly half a dozen such “free parameters”. Cosmologists never even try to use data like these to falsify their theory; their goal is to make the theory work, and they do this by picking the parameter values that optimize the fit between theory and data.
Philosophers of science are quite familiar with this sort of thing, and they have a rule: “You can’t use the data twice.” You can’t use data to adjust the parameters of a theory, and then turn around and claim that those same data support the theory. But this is exactly what cosmologists do when they argue that the existence of a “concordance model” implies that the standard cosmological model is correct. What “concordance” actually shows is that the standard model can be made consistent: i.e. that one does not require different values for the same parameter. Consistency is good, but by itself it is a very weak argument in favor of a theory’s correctness. Furthermore, as Stacy has emphasized, the supposed “concordance” vanishes when you look at the values of the same parameters as they are determined in other, independent ways. The apparent tension in the Hubble constant is just the latest example of this; another, long-standing example is the very different value for the mean baryon density implied by the observed lithium abundance. There are other examples. True “convergence” in the sense understood by the philosophers (confirmation of the value of a single parameter in multiple, independent experiments) is essentially lacking in cosmology.
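The “can’t use the data twice” rule can be illustrated with a toy fit. This is a minimal sketch with made-up numbers, not a cosmological calculation: the straight-line model, the noise level, and the names fit_line and sum_sq are all assumptions for illustration. The point is only that parameters tuned to a data set will, by construction, match that data set at least as well as any other choice, including the values that actually generated it, so the quality of the fit is not independent evidence for the model.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def sum_sq(xs, ys, a, b):
    """Sum of squared residuals of the data about the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical "truth" and noisy measurements (all values invented).
TRUE_A, TRUE_B = 1.0, 2.0
rng = random.Random(0)
xs = [float(i) for i in range(10)]
ys = [TRUE_A + TRUE_B * x + rng.gauss(0.0, 1.0) for x in xs]

fit_a, fit_b = fit_line(xs, ys)
ss_fitted = sum_sq(xs, ys, fit_a, fit_b)   # parameters tuned to these data
ss_truth = sum_sq(xs, ys, TRUE_A, TRUE_B)  # parameters fixed in advance

# The tuned parameters can never do worse on the data they were tuned to.
print(ss_fitted <= ss_truth)
```

A genuine test would compare the fitted model against data that played no role in setting its parameters.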
Now, even though those half-dozen parameters give cosmologists a great deal of freedom to adjust their model and to fit the data, the freedom is not complete. This is because — when adjusting parameters — they fix certain things: what Imre Lakatos called the “hard core” of a research program: the assumptions that a theorist is absolutely unwilling to abandon, come hell or high water. In our case, the “hard core” includes Einstein’s theory of gravity, but it also includes a number of less-obvious things; for instance, the assumption that the dark matter responds to gravity in the same way as any collisionless fluid of normal matter would respond. (The latter assumption is not made in many alternative theories.) Because of the inflexibility of the “hard core”, there are going to be certain parameter values that are also more-or-less fixed by the data. When a cosmologist says “The third peak in the CMB requires dark matter”, what she is really saying is: “Assuming the fixed hard core, I find that any reasonable fit to the data requires the parameter defining the dark-matter density to be significantly greater than zero.” That is a much weaker statement than “Dark matter must exist”. Statements like “We know that dark matter exists” put me in mind of the 18th century chemists who said things like “Based on my combustion experiments, I conclude that phlogiston exists and that it has a negative mass”. We know now that the behavior the chemists were ascribing to the release of phlogiston was actually due to oxidation. But the “hard core” of their theory (“Combustibles contain an inflammable principle which they release upon burning”) forbade them from considering different models. It took Lavoisier’s arguments to finally convince them of the existence of oxygen.
The fact that the current cosmological model has a fixed “hard core” also implies that, in principle, it can be falsified. But, at the risk of being called a cynic, I have little doubt that if a new, falsifying observation should appear, even a very compelling one, the community will respond as it has so often in the past: via a conventionalist stratagem. Pavel Kroupa has a wonderful graphic, reproduced below, that shows just how often predictions of the standard cosmological model have been falsified: a couple of dozen times, according to latest count; and these are only the major instances. Historians and philosophers of science have documented that theories that evolve in this way often end up on the scrap heap. To the extent that my paper is of interest to the astronomical community, I hope that it gets people to thinking about whether the current cosmological model is headed in that direction.
David Merritt recently published the article “Cosmology and convention” in Studies in History and Philosophy of Science. This article is remarkable in many respects. For starters, it is rare that a practicing scientist reads a paper on the philosophy of science, much less publishes one in a philosophy journal.
I was initially loath to start reading this article, frankly for fear of boredom: me reading about cosmology and the philosophy of science is like coals to Newcastle. I could not have been more wrong. It is a genuine page turner that should be read by everyone interested in cosmology.
I have struggled for a long time with whether dark matter constitutes a falsifiable scientific hypothesis. It straddles the border: specific dark matter candidates (e.g., WIMPs) are confirmable – a laboratory detection is both possible and plausible – but the concept of dark matter can never be excluded. If we fail to find WIMPs in the range of mass-cross section parameter space where we expected them, we can change the prediction. This moving of the goal post has already happened repeatedly.
I do not find it encouraging that the goal posts keep moving. This raises the question, how far can we go? Arbitrarily low cross-sections can be extracted from theory if we work at it hard enough. How hard should we work? That is, what criteria do we set whereby we decide the WIMP hypothesis is mistaken?
There has to be some criterion by which we would consider the WIMP hypothesis to be falsified. Without such a criterion, it does not satisfy the strictest definition of a scientific hypothesis. If at some point we fail to find WIMPs and are dissatisfied with the theoretical fine-tuning required to keep them hidden, we are free to invent some other dark matter candidate. No WIMPs? Must be axions. Not axions? Would you believe light dark matter? [Worst. Name. Ever.] And so on, ad infinitum. The concept of dark matter is not falsifiable, even if specific dark matter candidates are subject to being made to seem very unlikely (e.g., brown dwarfs).
Faced with this situation, we can consult the philosophy of science. Merritt discusses how many of the essential tenets of modern cosmology follow from what Popper would term “conventionalist stratagems” – ways to dodge serious consideration that a treasured theory is threatened. I find this a compelling terminology, as it formalizes an attitude I have witnessed among scientists, especially cosmologists, many times. It was put more colloquially by J.K. Galbraith:
“Faced with the choice between changing one’s mind and proving that there is no need to do so, almost everybody gets busy on the proof.”
Boiled down (Keuth 2005), the conventionalist stratagems Popper identifies are
ad hoc hypotheses
modification of ostensive definitions
doubting the reliability of the experimenter
doubting the acumen of the theorist
These are stratagems to be avoided according to Popper. At the least they are pitfalls to be aware of, but as Merritt discusses, modern cosmology has marched down exactly this path, doing each of these in turn.
The ad hoc hypotheses of ΛCDM are of course Λ and CDM. Faced with the observation of a metric that cannot be reconciled with the prior expectation of a decelerating expansion rate, we re-invoke Einstein’s greatest blunder, Λ. We even generalize the notion and give it a fancy new name, dark energy, which has the convenient property that it can fit any observed set of monotonic distance-redshift pairs. Faced with an excess of gravitational attraction over what can be explained by normal matter, we invoke non-baryonic dark matter: some novel form of mass that has no place in the standard model of particle physics, has yet to show any hint of itself in the laboratory, and cannot be decisively excluded by experiment.
We didn’t accept these ad hoc add-ons easily or overnight. Persuasive astronomical evidence drove us there, but all these data really show is that something dire is wrong: General Relativity plus known standard model particles cannot explain the universe. Λ and CDM are more a first guess than a final answer. They’ve been around long enough that they have become familiar, almost beyond doubt. Nevertheless, they remain unproven ad hoc hypotheses.
The sentiment that is often asserted is that cosmology works so well that dark matter and dark energy must exist. But a more conservative statement would be that our present understanding of cosmology is correct if and only if these dark entities exist. The onus is on us to detect dark matter particles in the laboratory.
That’s just the first conventionalist stratagem. I could give many examples of violations of the other three, just from my own experience. That would make for a very long post indeed.
Instead, you should go read Merritt’s paper. There are too many things there to discuss, at least in a single post. You’re best going to the source. Be prepared for some cognitive dissonance.
Recently I have been complaining about the low standards to which science has sunk. It has become normal to be surprised by an observation, express doubt about the data, blame the observers, slowly let it sink in, bicker and argue for a while, construct an unsatisfactory model that sort-of, kind-of explains the surprising data but not really, call it natural, then pretend like that’s what we expected all along. This has been going on for so long that younger scientists might be forgiven if they think this is how science is supposed to work. It is not.
At the root of the scientific method is hypothesis testing through prediction and subsequent observation. Ideally, the prediction comes before the experiment. The highest standard is a prediction made before the fact in ignorance of the ultimate result. This is incontrovertibly superior to post-hoc fits and hand-waving explanations: it is how we’re supposed to avoid playing favorites.
I predicted the velocity dispersion of Crater 2 in advance of the observation, for both ΛCDM and MOND. The prediction for MOND is reasonably straightforward. That for ΛCDM is fraught. There is no agreed method by which to do this, and it may be that the real prediction is that this sort of thing is not possible to predict.
The reason it is difficult to predict the velocity dispersions of specific, individual dwarf satellite galaxies in ΛCDM is that the stellar mass-halo mass relation must be strongly non-linear to reconcile the steep mass function of dark matter sub-halos with their small observed numbers. This is closely related to the M*-Mhalo relation found by abundance matching. The consequence is that the luminosity of dwarf satellites can change a lot for tiny changes in halo mass.
Long story short, the nominal expectation for ΛCDM is a lot of scatter. Photometrically identical dwarfs can live in halos with very different velocity dispersions. The trend between mass, luminosity, and velocity dispersion is so weak that it might barely be perceptible. The photometric data should not be predictive of the velocity dispersion.
It is hard to get even a ballpark answer that doesn’t make reference to other measurements. Empirically, there is some correlation between size and velocity dispersion. This “predicts” σ = 17 km/s. That is not a true theoretical prediction; it is just the application of data to anticipate other data.
Abundance matching relations provide a highly uncertain estimate. The first time I tried to do this, I got unphysical answers (σ = 0.1 km/s, which is less than the stars alone would cause without dark matter – about 0.5 km/s). The application of abundance matching requires extrapolation of fits to data at high mass to very low mass. Extrapolating the M*-Mhalo relation over many decades in mass is very sensitive to the low mass slope of the fitted relation, so it depends on which one you pick.
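To see why the extrapolation is so sensitive, here is a rough sketch that treats the low-mass end of the M*-Mhalo relation as a single power law, M* ∝ Mhalo^α, and inverts it. The anchor point, the slopes, and the stellar mass are hypothetical placeholders chosen for illustration (they are not the fitted relations referred to above), and the function name is mine.

```python
def halo_mass_from_stellar_mass(m_star, slope, m_star_anchor=1e8, m_halo_anchor=1e11):
    """Invert an assumed power law M* = M*_anchor * (Mhalo / Mhalo_anchor)**slope.

    All masses are in solar masses. The anchor values are hypothetical.
    """
    return m_halo_anchor * (m_star / m_star_anchor) ** (1.0 / slope)

M_STAR = 3e5  # assumed stellar mass of a faint dwarf, solar masses

for slope in (1.5, 2.0, 2.5):
    m_halo = halo_mass_from_stellar_mass(M_STAR, slope)
    print(f"slope {slope}: inferred halo mass ~ {m_halo:.1e} Msun")
```

Under these assumptions a modest change in the low-mass slope moves the inferred halo mass by a factor of several, and this toy extrapolation spans only a few decades in stellar mass.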
Since my first pick did not work, let’s go with the value suggested to me by James Bullock: σ = 11 km/s. That is the mid-value (the blue lines in the figure above); the true value could easily scatter higher or lower. Very hard to predict with any precision. But given the luminosity and size of Crater 2, we expect numbers like 11 or 17 km/s.
The measured velocity dispersion is σ = 2.7 ± 0.3 km/s.
This is incredibly low. Shockingly so, considering the enormous size of the system (1 kpc half light radius). The NFW halos predicted by ΛCDM don’t do that.
Basically, NFW halos, including the sub-halos imagined to host dwarf satellite galaxies, have rotation curves that rise rapidly and stay high in proportion to the cube root of the halo mass. This property makes it very challenging to explain a low velocity at a large radius: exactly the properties observed in Crater 2.
Let’s not fail to appreciate how extremely wrong this is. The original version of the graph above stopped at 5 km/s. It didn’t extend to lower values because they were absurd. There was no reason to imagine that this would be possible. Indeed, the point of their paper was that the observed dwarf velocity dispersions were already too low. To get to lower velocity, you need an absurdly low mass sub-halo – around 10⁷ M☉. In contrast, the usual inference of masses for sub-halos containing dwarfs of similar luminosity is around 10⁹ M☉ to 10¹⁰ M☉. So the low observed velocity dispersion – especially at such a large radius – seems nigh on impossible.
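The cube-root scaling makes the size of the problem easy to estimate. As a back-of-the-envelope sketch, assuming only that the characteristic velocity of a halo scales as the cube root of its mass (the function name is mine; the masses are the ones quoted above):

```python
def velocity_ratio(mass_high, mass_low):
    """Ratio of characteristic velocities for two halos, assuming V ∝ M**(1/3)."""
    return (mass_high / mass_low) ** (1.0 / 3.0)

# Halo masses in solar masses, as quoted in the text.
usual_low, usual_high = 1e9, 1e10
needed = 1e7

print(round(velocity_ratio(usual_low, needed), 2))   # factor relative to the lower usual mass
print(round(velocity_ratio(usual_high, needed), 2))  # factor relative to the higher usual mass
```

On this scaling, the sub-halo needed for Crater 2 has a characteristic velocity roughly 5 to 10 times lower than the halos usually inferred for dwarfs of similar luminosity.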
More generally, there is no way in ΛCDM to predict the velocity dispersions of particular individual dwarfs. There is too much intrinsic scatter in the highly non-linear relation between luminosity and halo mass. Given the photometry, all we can say is “somewhere in this ballpark.” Making an object-specific prediction is impossible.
The predicted velocity dispersion is σ = 2.1 +0.9/-0.6 km/s.
I’m an equal opportunity scientist. In addition to ΛCDM, I also considered MOND. The successful prediction is that of MOND. (The quoted uncertainty reflects the uncertainty in the stellar mass-to-light ratio.) The difference is that MOND makes a specific prediction for every individual object. And it comes true. Again.
MOND is a funny theory. The amplitude of the mass discrepancy it induces depends on how low the acceleration of a system is. If Crater 2 were off by itself in the middle of intergalactic space, MOND would predict it should have a velocity dispersion of about 4 km/s.
But Crater 2 is not isolated. It is close enough to the Milky Way that there is an additional, external acceleration imposed by the Milky Way. The net result is that the acceleration isn’t quite as low as it would be were Crater 2 all by its lonesome. Consequently, the predicted velocity dispersion is a measly 2 km/s. As observed.
In MOND, this is called the External Field Effect (EFE). Theoretically, the EFE is rather disturbing, as it breaks the Strong Equivalence Principle. In particular, Local Position Invariance in gravitational experiments is violated: the velocity dispersion of a dwarf satellite depends on whether it is isolated from its host or not. Weak equivalence (the universality of free fall) and the Einstein Equivalence Principle (which excludes gravitational experiments) may still hold.
We identified several pairs of photometrically identical dwarfs around Andromeda. Some are subject to the EFE while others are not. We see the predicted effect of the EFE: isolated dwarfs have higher velocity dispersions than their twins afflicted by the EFE.
If it is just a matter of sub-halo mass, the current location of the dwarf should not matter. The velocity dispersion certainly should not depend on the bizarre MOND criterion for whether a dwarf is affected by the EFE or not. It isn’t a simple distance-dependency. It depends on the ratio of internal to external acceleration. A relatively dense dwarf might still behave as an isolated system close to its host, while a really diffuse one might be affected by the EFE even when very remote.
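For the curious, the isolated estimate and the EFE criterion can be sketched in a few lines. This is a rough illustration, not the calculation in the paper: it assumes the standard deep-MOND estimator for an isolated, pressure-supported system, σ⁴ = (4/81) G M a0, approximates the internal acceleration as 3σ²/r and the external one as V²/D for a flat Milky Way rotation curve, and uses illustrative inputs (a luminosity of 1.6 × 10⁵ L☉, M/L = 2, V = 200 km/s) that are my assumptions rather than values quoted above. Only the roughly 1 kpc half light radius and the 120 kpc distance come from the text.

```python
G = 4.301e-6   # Newton's constant in kpc (km/s)^2 / Msun
A0 = 3.7e3     # MOND acceleration scale, ~1.2e-10 m/s^2, in (km/s)^2 / kpc

def sigma_isolated(mass):
    """Deep-MOND velocity dispersion (km/s) of an isolated system of given mass (Msun)."""
    return (4.0 / 81.0 * G * mass * A0) ** 0.25

def internal_acceleration(sigma, radius_kpc):
    """Rough internal acceleration, 3*sigma^2 / r, in (km/s)^2 / kpc."""
    return 3.0 * sigma ** 2 / radius_kpc

def external_acceleration(v_host, distance_kpc):
    """Acceleration from the host, V^2 / D, assuming a flat rotation curve."""
    return v_host ** 2 / distance_kpc

# Assumed inputs (illustrative, see the caveats above).
stellar_mass = 2.0 * 1.6e5   # M/L = 2 times L = 1.6e5 Lsun
half_light_radius = 1.0      # kpc
distance = 120.0             # kpc
v_milky_way = 200.0          # km/s

sigma_iso = sigma_isolated(stellar_mass)
g_in = internal_acceleration(sigma_iso, half_light_radius)
g_ex = external_acceleration(v_milky_way, distance)

print(f"isolated sigma ~ {sigma_iso:.1f} km/s")
print(f"g_in / g_ex ~ {g_in / g_ex:.2f}")
print("EFE matters" if g_in < g_ex else "effectively isolated")
```

With these inputs the isolated dispersion comes out near 4 km/s, and the internal acceleration is well below the external one, which is the condition for the EFE to matter even at 120 kpc.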
When Crater 2 was first discovered, I ground through the math and tweeted the prediction. I didn’t want to write a paper for just one object. However, I eventually did so because I realized that Crater 2 is important as an extreme example of a dwarf so diffuse that it is affected by the EFE despite being very remote (120 kpc from the Milky Way). This is not easy to reproduce any other way. Indeed, MOND with the EFE is the only way that I am aware of whereby it is possible to predict, in advance, the velocity dispersion of this particular dwarf.
If I put my ΛCDM hat back on, it gives me pause that any method can make this prediction. As discussed above, this shouldn’t be possible. There is too much intrinsic scatter in the halo mass-luminosity relation.
If we cook up an explanation for the radial acceleration relation, we still can’t make this prediction. The RAR fit we obtained empirically predicts 4 km/s. This is indistinguishable from MOND for isolated objects. But the RAR itself is just an empirical law – it provides no reason to expect deviations, nor how to predict them. MOND does both, does it right, and has done so before, repeatedly. In contrast, the acceleration of Crater 2 is below the minimum allowed in ΛCDM according to Navarro et al.
For these reasons I consider Crater 2 to be the bullet cluster of ΛCDM. Just as the bullet cluster seems like a straight-up contradiction to MOND, so too does Crater 2 for ΛCDM. It is something ΛCDM really can’t do. The difference is that you can just look at the bullet cluster. With Crater 2 you actually have to understand MOND as well as ΛCDM, and think it through.
So what can we do to save ΛCDM?
Whatever it takes, per usual.
One possibility is that Crater 2 may represent the “bright” tip of the extremely low surface brightness “stealth” fossils predicted by Bovill & Ricotti. Their predictions are encouraging for getting the size and surface brightness in the right ballpark. But I see no reason in this context to expect such a low velocity dispersion. They anticipate dispersions consistent with the ΛCDM discussion above, and correspondingly high mass-to-light ratios that are greater than observed for Crater 2 (M/L ≈ 10⁴ rather than ~50).
A plausible suggestion I heard was from James Bullock. While noting that reionization should preclude the existence of galaxies in halos below 5 km/s, as we need for Crater 2, he suggested that tidal stripping could reduce an initially larger sub-halo to this point. I am dubious about this, as my impression from the simulations of Penarrubia was that the outer regions of the sub-halo were stripped first while leaving the inner regions (where the NFW cusp predicts high velocity dispersions) largely intact until near complete dissolution. In this context, it is important to bear in mind that the low velocity dispersion of Crater 2 is observed at large radii (1 kpc, not tens of pc). Still, I can imagine ways in which this might be made to work in this particular case, depending on its orbit. Tony Sohn has an HST program to measure the proper motion; this should constrain whether the object has ever passed close enough to the center of the Milky Way to have been tidally disrupted.
Josh Bland-Hawthorn pointed out to me that he made simulations that suggest a halo with a mass as low as 10⁷ M☉ could make stars before reionization and retain them. This contradicts much of the conventional wisdom outlined above because they find a much lower (and in my opinion, more realistic) feedback efficiency for supernova feedback than assumed in most other simulations. If this is correct (as it may well be!) then it might explain Crater 2, but it would wreck all the feedback-based explanations given for all sorts of other things in ΛCDM, like the missing satellite problem and the cusp-core problem. We can’t have it both ways.
I’m sure people will come up with other clever ideas. These will inevitably be ad hoc suggestions cooked up in response to a previously inconceivable situation. This will ring hollow to me until we explain why MOND can predict anything right at all.
In the case of Crater 2, it isn’t just a matter of retrospectively explaining the radial acceleration relation. One also has to explain why exceptions to the RAR occur following the very specific, bizarre, and unique EFE formulation of MOND. If I could do that, I would have done so a long time ago.
No matter what we come up with, the best we can hope to do is a post facto explanation of something that MOND predicted correctly in advance. Can that be satisfactory?
The arXiv brought an early Xmas gift in the form of a measurement of the velocity dispersion of Crater 2. Crater 2 is an extremely diffuse dwarf satellite of the Milky Way. Upon its discovery, I realized there was an opportunity to predict its velocity dispersion based on the reported photometry. The fact that it is very large (half light radius a bit over 1 kpc) and relatively far from the Milky Way (120 kpc) make it a unique and critical case. I will expand on that in another post, or you could read the paper. But for now:
The predicted velocity dispersion is σ = 2.1 +0.9/-0.6 km/s.
This prediction appeared in press in advance of the measurement (ApJ, 832, L8). The uncertainty reflects the uncertainty in the mass-to-light ratio.
The measured velocity dispersion is σ = 2.7 ± 0.3 km/s
There has been another attempt to explain away the radial acceleration relation as being fine in ΛCDM. That’s good; I’m glad people are finally starting to address this issue. But let’s be clear: this is a beginning, not a solution. Indeed, it seems more like a rush to create truth by assertion than an honest scientific investigation. I would be more impressed if these papers were (i) refereed rather than rushed onto the arXiv, and (ii) honestly addressed the requirements I laid out.
Rather than consider the assertions piecemeal, let’s take a step back. We have established that galaxies obey a single effective force law. Federico Lelli has shown that this applies to pressure supported elliptical galaxies as well as rotating disks.
Let’s start with what Newton said about the solar system: “Everything happens… as if the force between two bodies is directly proportional to the product of their masses and inversely proportional to the square of the distance between them.” Knowing how this story turns out, consider the following.
Suppose someone came to you and told you Newton was wrong. The solar system doesn’t operate on an inverse square law, it operates on an inverse cube law. It just looks like an inverse square law because there is dark matter arranged just so as to make this so. No matter whether we look at the motion of the planets around the sun, or moons around their planets, or any of the assorted miscellaneous asteroids and cometary debris. Everything happens as if there is an inverse square law, when really it is an inverse cube law plus dark matter arranged just so.
Would you believe this assertion?
I hope not. It is a gross violation of the rule of parsimony. Occam would spin in his grave.
Yet this is exactly what we’re doing with dark matter halos. There is one observed, effective force law in galaxies. The dark matter has to be arranged just so as to make this so.
Convenient that it is invisible.
Maybe dark matter will prove to be correct, but there is ample reason to worry. I worry that we have not yet detected it. We are well past the point that we should have. The supersymmetric sector in which WIMP dark matter is hypothesized to live flunked the “golden test” of the Bs meson decay, and looks more and more like a brilliant idea nature declined to implement. And I wonder why the radial acceleration relation hasn’t been predicted before if it is such a “natural” outcome of galaxy formation simulations. Are we doing fair science here? Or just trying to shove the cat back in the bag?
I really don’t know what the final answer will look like. But I’ve talked to a lot of scientists who seem pretty darn sure. If you are sure you know the final answer, then you are violating some basic principles of the scientific method: the principle of parsimony, the principle of doubt, and the principle of objectivity. Mind your confirmation bias!
That’ll do for now. What wonders await among tomorrow’s arXiv postings?
Frodo: That’s because we’ve been here before. We’re going in circles!
Last year, Oman et al. published a paper entitled “The unexpected diversity of dwarf galaxy rotation curves”. This term, diversity, has gained some traction among the community of scientists who simulate the formation of galaxies. From my perspective, this terminology captures some of the story, but misses most of it.
It was established (by van Albada & Sancisi and by Kent) in the ’80s that rotation curves were generally well described as maximal disks: the inner rotation curve was dominated by the stars, with a gradual transition to the flat outer part which required dark matter. By that time, I had become interested in low surface brightness (LSB) galaxies, which had not been studied in such detail. My nominal expectation was that LSB galaxies were stretched out versions of more familiar spiral galaxies. As such they’d also have maximal disks, but lower peak velocities (since V² ≈ GM/R and LSBs had larger R for the same M).
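That expectation is simple enough to put in numbers. A minimal sketch, assuming the simple scaling V² ≈ GM/R and invented values for the mass and radii (the function name is mine):

```python
import math

G = 4.301e-6  # Newton's constant in kpc (km/s)^2 / Msun

def circular_velocity(mass_msun, radius_kpc):
    """Characteristic velocity (km/s) from V^2 ≈ G M / R."""
    return math.sqrt(G * mass_msun / radius_kpc)

mass = 1e10          # hypothetical stellar mass, Msun
r_compact = 2.0      # hypothetical radius of a high surface brightness disk, kpc
r_diffuse = 8.0      # same mass stretched over a 4x larger radius, kpc

v_hsb = circular_velocity(mass, r_compact)
v_lsb = circular_velocity(mass, r_diffuse)
print(f"HSB: {v_hsb:.0f} km/s, LSB: {v_lsb:.0f} km/s, ratio {v_lsb / v_hsb:.2f}")
```

Quadrupling the radius at fixed mass should halve the velocity. That is the naive expectation for a stretched-out disk that dominates its own dynamics.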
By the mid-1990s, we had shown that this was not the case. LSB galaxies had the same rotation velocity as more concentrated galaxies of the same luminosity. This meant that LSB galaxies were dark matter dominated. This result is now widely known (to the point that it is often taken for granted), but it had not been expected. One interesting consequence was that LSB galaxies were a convenient laboratory for testing the dark matter hypothesis.
So what do we expect? There were, and are, many ideas for what dark matter should do. One of the leading hypotheses to emerge (around the same time) was the NFW halo obtained from structure formation simulations using cold dark matter. If a galaxy is dark matter dominated, then to a good approximation we expect the stars to act as tracer particles: the rotation curve should just reflect that of the underlying dark matter halo.
This did not turn out well. The rotation curves of low surface brightness galaxies do not look like NFW halos. One example is provided by the LSB galaxy F583-1, reproduced here from Fig. 14 of McGaugh & de Blok (1998).
This was bad for NFW. But there is a more general problem, irrespective of the particular form of the dark matter halo. The M*-Mhalo relation required by abundance matching means that galaxies of the same luminosity live in nearly identical dark matter halos. When dark matter dominates, galaxies of the same luminosity should thus have the same rotation curve.
We can test this by comparing the rotation curves of Tully-Fisher pairs: galaxies with the same luminosity and flat rotation velocity, but different surface brightness. The high surface brightness NGC 2403 and low surface brightness UGC 128 are such a pair. So for 20 years, I have been showing their rotation curves:
If NGC 2403 and UGC 128 reside in the same dark matter halo, they should have basically the same rotation curve in physical units [V(R in kpc)]. They don’t. But they do have pretty much the same rotation curve when radius is scaled by the size of the disk [V(R/Rd)]. The dynamics “knows about” the baryons, in contradiction to the expectation for dark matter dominated galaxies.
Oman et al. have rediscovered the top panel (which they call diversity) but they don’t notice the bottom panel (which one might call uniformity). That galaxies of the same luminosity have different rotation curves remains surprising to simulations, at least the EAGLE and APOSTLE simulations Oman et al. discuss. (Note that APOSTLE was called LG by Oman et al.) Oman et al. illustrate the point with a number of rotation curves, for example, their Fig. 5:
Oman et al. show that the rotation curves of LSB galaxies rise more slowly than predicted by simulations, and have a different shape. This is the same problem that we pointed out two decades ago. Indeed, note that the lower left panel is F583-1: the same galaxy noted above, showing the same discrepancy. The new thing is that these simulations include the effects of baryons (shaded regions). Baryons do not help to resolve the problem, at least as implemented in EAGLE and APOSTLE.
It is tempting to be snarky and say that this quantifies how many years simulators are behind observers. But that would be too generous. Observers had already noticed the systematic illustrated in the bottom panel of the NGC 2403/UGC 128 figure in the previous millennium. Simulators are just now coming to grips with the top panel. The full implications of the bottom panel seem not yet to have disturbed their dreams of dark matter.
Perhaps that passes snarky and on into rude, but it isn’t like we haven’t been telling them exactly this for years and years and years. The initial reaction was not mere disbelief, but outright scorn. The data disagree with simulations, so the data must be wrong! Seriously, this was the attitude. I don’t doubt that it persists in some of the colder, darker corners of the communal astro-theoretical intellect.
Indeed, Ludlow et al. provide an example. These are essentially the same people as wrote Oman et al. Though Oman et al. point out a problem when comparing the simulations to data, Ludlow et al. claim that the observed uniformity is “a Natural Outcome of Galaxy Formation in CDM halos”. Seriously. This is in their title.
Well, which is it? Is the diversity of rotation curves a problem for simulations? Or is their uniformity a “natural outcome”? This is not natural at all.
Note that the lower right panel of the figure from Oman et al. contains the galaxy IC 2574. This galaxy obviously deviates from the expectation of the simulations. These predict accelerations that are much larger than observed at small radii. Yet Ludlow et al. claim to explain the radial acceleration relation.
This situation is self-contradictory. Either the simulations explain the RAR, or they fail to explain the “diversity” of rotation curves. These are not independent statements.
I can think of two explanations: either (i) the data that define the RAR don’t include diverse galaxies, or (ii) the simulations are not producing realistic galaxies. In the latter case, it is possible that both the rotation curve and the baryon distribution are off in a way that maintains some semblance of the observed RAR.
I know (i) is not correct. Galaxies like F583-1 and IC 2574 help define the RAR. This is one reason why the RAR is problematic for simulations.
That leaves (ii). Though the correlation Ludlow et al. show misses the data, the real problem is worse. They only obtain the semblance of the right relation because the simulated galaxies apparently don’t have the same range of surface brightness as real galaxies. They’re not just missing V(R); now that they include baryons they are also getting the distribution of luminous mass wrong.
I have no doubt that this problem can be fixed. Doing so is “simply” a matter of revising the feedback prescription until the desired result is obtained. This is called fine-tuning.
I have been musing for a while on the idea of writing about Naturalness in science, particularly as it applies to the radial acceleration relation. As a scientist, the concept of Naturalness is very important to me, especially when it comes to the interpretation of data. When I sat down to write, I made the mistake of first Googling the term.
The top Google hits bear little resemblance to what I mean by Naturalness. The closest match is specific to a particular, rather narrow concept in theoretical particle physics. I mean something much more general. I know many scientific colleagues who share this ideal. I also get the impression that this ideal is being eroded and cheapened, even among scientists, in our post-factual society.
I suspect the reason a better hit for Naturalness doesn’t come up more naturally in a Google search is, at least in part, an age effect. As wonderful a search engine as Google may be, it is lousy at identifying things B.G. (Before Google). The concept of Naturalness has been embedded in the foundations of science for centuries, to the point where it is absorbed by osmosis by students of any discipline: it doesn’t need to be formally taught; there probably is no appropriate website.
In many sciences, we are often faced with messy and incomplete data. In Astronomy in particular, there are often complicated astrophysical processes well beyond our terrestrial experience that allow a broad range of interpretations. Some of these are natural while others are contrived. Usually, the most natural interpretation is the correct one. In this regard, what I mean by Naturalness is closely related to Occam’s Razor, but it is something more as well. It is that which follows – naturally – from a specific hypothesis.
An obvious astronomical example: Kepler’s Laws follow naturally from Newton’s Universal Law of Gravity. It is a trivial amount of algebra to show that Kepler’s third Law, P² = a³ (for P in years and a in AU), follows as a direct consequence of Newton’s inverse square law. The first law, that orbits are ellipses, follows with somewhat more math. The second law follows with the conservation of angular momentum.
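For the record, the trivial algebra looks like this. It is a sketch for the special case of a circular orbit of radius a around a central mass M much larger than the orbiting mass m; the general elliptical case gives the same result with a as the semi-major axis.

```latex
\begin{align}
  \frac{G M m}{a^{2}} &= \frac{m v^{2}}{a}
    && \text{(gravity supplies the centripetal force)} \\
  v &= \frac{2 \pi a}{P}
    && \text{(one circumference per period)} \\
  \Rightarrow \quad P^{2} &= \frac{4 \pi^{2}}{G M}\, a^{3}
\end{align}
```

For orbits around the Sun, with P measured in years and a in AU, the constant 4π²/GM equals one, which is the familiar form of the third law.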
It isn’t just that Newtonian gravity is the simplest explanation for planetary orbits. It is that all the phenomena identified by Kepler follow naturally from Newton’s insight. This isn’t obvious just by positing an inverse square law. But in exploring the consequences of such a hypothesis, one finds that one clue after another falls into place like the pieces of a jigsaw puzzle. This is what I mean by Naturalness.
I expect that this sense of Naturalness – the fitting together of the pieces of the puzzle – is what gave Newton encouragement that he was on the right path with the inverse square law. Let’s not forget that both Newton and his inverse square law came in for a lot of criticism at the time. Both Leibniz and Huygens objected to action at a distance, for good reason. I suspect this is why Newton prefaced his phrasing of the inverse square law with the modifier as if: “Everything happens… as if the force between two bodies is directly proportional to the product of their masses and inversely proportional to the square of the distance between them.” He is not claiming that this is right, that it has to be so. Just that it sure looks that way.
The situation with the radial acceleration relation in galaxies today is the same. Everything happens as if there is a single effective force law in galaxies. This is true regardless of what the ultimate reason proves to be.
The natural explanation for the single effective force law indicated by the radial acceleration relation is that there is indeed a unique force law at work. In this case, such a force law has already been hypothesized: MOND. Often MOND is dismissed for other reasons, though reports of its demise have repeatedly been exaggerated. Perhaps MOND is just the first approximation of some deeper theory. Perhaps, like action at a distance, we simply don’t yet understand the underlying reasons for it.
First, there is some eye-rolling language in the title and the abstract. Two words: natural (in the title) and accommodated (in the abstract). I can’t not address these before getting to the science.
Natural. As I have discussed repeatedly in this blog, and in the refereed literature, there is nothing natural about this. If it were so natural, we’d have been talking about it since Bob Sanders pointed this out in 1990, or since I quantified it better in 1998 and 2004. Instead, the modus operandi of much of the simulation community over the past couple of decades has been to pour scorn on the quality of rotation curve data because it did not look like their simulations. Now it is natural?
Accommodate. Accommodation is an important issue in the philosophy of science. I have no doubt that the simulators are clever enough to find a way to accommodate the data. That is why I have, for 20 years, been posing the question: What would falsify ΛCDM? I have heard (or come up with myself) only a few good answers, and I fear the real answer is that it can’t be. It is so flexible, with so many freely adjustable parameters, that it can be made to accommodate pretty much anything. I’m more impressed by predictions that come ahead of time.
That’s one reason I want to see what the current generation of simulations say before entertaining those made with full knowledge of the RAR. At least these quick preprints are using existing simulations, so while not predictions in the strictest sense, at least they haven’t been fine-tuned specifically to reproduce the RAR. Lots of other observations, yes, but not this particular one.
Ludlow et al. show a small number of model rotation curves that vary from wildly unrealistic (their NoAGN models peak at 500 km/s; no disk galaxy in the universe comes anywhere close to that… Vera Rubin once offered a prize for any that exceeded 300 km/s) to merely implausible (their StrongFB model is in the right ballpark, but has a very rapidly rising rotation curve). In all cases, their dark matter halos seem little affected by feedback, in contrast to the claims of other simulation groups. It will be interesting to follow the debate between simulators as to what we should really expect.
They do find a RAR-like correlation. Remarkably, the details don’t seem to depend much on the feedback scheme. This motivates some deeper consideration of the RAR.
The RAR plots observed centripetal acceleration, gobs, against that predicted by the observed distribution of baryons, gbar. We chose these coordinates because this seems to be the fundamental empirical correlation, and the two quantities are measured in completely independent ways: rotation curves vs. photometry. While measured independently, some correlation is guaranteed: physically, gobs includes gbar. Things only become weird when the correlation persists as gobs ≫ gbar.
The models are well fit by the functional form we found for the data, but with a different value of the fit parameter: g† = 3 × 10⁻¹⁰ m s⁻² rather than 1.2 × 10⁻¹⁰ m s⁻². That’s a factor of 2.5 off – a factor that is considered fatal for MOND in galaxy clusters. Is it OK here?
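To see what a factor of 2.5 in g† does, here is a minimal sketch. The post does not spell out the functional form, so I am assuming the one published with the RAR (arxiv:1609.05917): g_obs = g_bar / (1 − exp(−√(g_bar/g†))). The function name and the two test accelerations are my own choices for illustration.

```python
import math

def rar(g_bar, g_dagger):
    """Assumed RAR form: g_obs = g_bar / (1 - exp(-sqrt(g_bar / g_dagger))).

    Uses expm1 so the denominator stays accurate at very low accelerations.
    """
    return g_bar / (-math.expm1(-math.sqrt(g_bar / g_dagger)))

G_DAGGER_DATA = 1.2e-10  # m/s^2, fit to the data
G_DAGGER_SIM = 3.0e-10   # m/s^2, fit to the simulations

high = 1.0e-8   # m/s^2, illustrative high-acceleration point
low = 1.0e-13   # m/s^2, illustrative low-acceleration point

# At high acceleration both curves hug the 1:1 line.
high_data = rar(high, G_DAGGER_DATA) / high
high_sim = rar(high, G_DAGGER_SIM) / high

# At low acceleration g_obs tends to sqrt(g_bar * g_dagger), so the two
# curves differ by roughly sqrt(3 / 1.2) = sqrt(2.5).
low_ratio = rar(low, G_DAGGER_SIM) / rar(low, G_DAGGER_DATA)

print(f"g_obs/g_bar at high acceleration: {high_data:.4f} (data), {high_sim:.4f} (sim)")
print(f"sim/data ratio of g_obs at low acceleration: {low_ratio:.2f}")
```

The normalization hardly matters where baryons dominate; the difference only shows up at low accelerations, which is where the comparison has to be made.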
The uncertainty in the fit value is 1.20 ± 0.02. So formally, 3 is off by 90σ. However, the real dominant uncertainty is systematic: what is the true mean mass-to-light ratio at 3.6 microns? We estimated the systematic uncertainty to be ± 0.24 based on an extensive survey of plausible stellar population models. So 3 is only 7.5σ off.
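The arithmetic behind those two numbers is just the offset divided by the adopted uncertainty. A minimal sketch, with all values in units of 10⁻¹⁰ m s⁻² and variable names of my own choosing:

```python
fit_value = 1.20   # best-fit g-dagger for the data
sim_value = 3.0    # g-dagger that fits the simulations
stat_err = 0.02    # formal (random) uncertainty on the fit
sys_err = 0.24     # estimated systematic uncertainty from the mass-to-light ratio

offset = sim_value - fit_value
n_sigma_stat = offset / stat_err
n_sigma_sys = offset / sys_err

print(f"{n_sigma_stat:.0f} sigma (statistical), {n_sigma_sys:.1f} sigma (systematic)")
```

Treating the systematic term as if it were a Gaussian sigma is a convenience here, not a claim that it behaves like one.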
The problem with systematic uncertainties is that they do not obey Gaussian statistics. So I decided to see what we might need to do to obtain g† = 3 × 10⁻¹⁰ m s⁻². This can be done if we take sufficient liberties with the mass-to-light ratio.
Indeed, we can get in the right ball park simply by reducing the assumed mass-to-light ratio of stellar disks by a factor of two. We don’t make the same factor of two adjustment to the bulge components, because the data don’t approach the 1:1 line at high accelerations if this is done. So rather than our fiducial model with M*/L(disk) = 0.5 M⊙/L⊙ and M*/L(bulge) = 0.7 M⊙/L⊙ (open points in plot), we have M*/L(disk) = 0.25 M⊙/L⊙ and M*/L(bulge) = 0.7 M⊙/L⊙ (filled points in plot). Let’s pretend like we don’t know anything about stars and ignore the fact that this change corresponds to truncating the IMF of the stellar disk so that M dwarfs don’t exist in disks, but they do in bulges. We then find a tolerable match to the simulations (red line).
Amusingly, the data are now more linear than the functional form we assumed. If this is what we thought stars did, we wouldn’t have picked the functional form the simulations apparently reproduce. We would have drawn a straight line through the data – at least most of it.
That much isn’t too much of a problem for the models, though it is an interesting question whether they get the shape of the RAR right for the normalization they appear to demand. There is a serious problem though. That becomes apparent in the lowest acceleration points, which deviate strongly below the red line. (The formal error bars are smaller than the size of the points.)
It is easy to understand why this happens. As we go from high to low accelerations, we transition from bulge dominance to stellar disk dominance to gas dominance. Those last couple of bins are dominated by atomic gas, not stars. So it doesn’t matter what we adopt for the stellar mass-to-light ratio. That’s where the data sit: well off the simulated line.
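Here is a minimal sketch of why the stellar mass-to-light ratio stops mattering at the low acceleration end. The baryonic acceleration is the sum of the disk and bulge terms (each scaled by its mass-to-light ratio) and the gas term. The per-component accelerations below are made-up illustrative numbers, not values from the data; only the mass-to-light ratios (0.5, 0.25, 0.7) come from the text.

```python
def g_bar(g_disk_unit, g_bulge_unit, g_gas, ml_disk, ml_bulge=0.7):
    """Baryonic acceleration in m/s^2.

    g_disk_unit and g_bulge_unit are the stellar contributions for a
    mass-to-light ratio of 1; g_gas does not depend on any mass-to-light ratio.
    """
    return ml_disk * g_disk_unit + ml_bulge * g_bulge_unit + g_gas

# Hypothetical bins: (disk, bulge, gas) accelerations in m/s^2.
disk_dominated = (1.0e-10, 0.0, 1.0e-12)
gas_dominated = (1.0e-13, 0.0, 5.0e-12)

def change_when_halving_disk_ml(components):
    """Ratio of g_bar with M*/L(disk) = 0.25 to g_bar with M*/L(disk) = 0.5."""
    return g_bar(*components, ml_disk=0.25) / g_bar(*components, ml_disk=0.5)

ratio_disk = change_when_halving_disk_ml(disk_dominated)
ratio_gas = change_when_halving_disk_ml(gas_dominated)

print(f"disk-dominated bin: g_bar changes by a factor of {ratio_disk:.2f}")
print(f"gas-dominated bin:  g_bar changes by a factor of {ratio_gas:.3f}")
```

Halving the disk mass-to-light ratio shifts the star-dominated bin by nearly a factor of two and leaves the gas-dominated bin essentially where it was.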
Is this fatal for these models? As presented, yes. The simulations persist in predicting higher accelerations than observed. This has been the problem all along.
There are other issues. The scatter in the simulated RAR is impressively small. Much smaller than I expected. Smaller even than the observational scatter. But the latter is dominated by observational errors: the intrinsic relation is much tighter, consistent with a δ-function. The intrinsic scatter is what they should be comparing their results to. They either fail to understand, or conveniently choose to gloss over, the distinction between intrinsic scatter and that induced by random errors.
It is worth noting that some of the same authors make this same mistake – and it is a straight up mistake – in discussing the scatter in the baryonic Tully-Fisher relation. The assertion there is “the scatter in the simulated BTF is smaller than observed”. But the observed scatter is dominated by observational errors, which we have taken great care to assess. Once this is done, there is practically no room left over for intrinsic scatter, which is what the models display. This is important, as it completely inverts the stated interpretation. Rather than having less scatter than observed, the simulations exhibit more scatter than allowed.
Can these problems be fixed? No doubt. See the comments on accommodation above.
There are good things about this paper, bad things, and the potential for great ugliness.
This is exactly the reaction that I had hoped to see in response to the radial acceleration relation (RAR): people going to their existing simulations and checking what answer they got. The answer looks promising. The same relation is apparent in the simulations as in the data. That’s good.
These simulations already existed. They haven’t been tuned to match this particular observation. That’s good. The cynic might note that the last 15+ years of galaxy formation simulations have been driven by the need to add feedback to match data, including the shapes of rotation curves. Nevertheless, I see no guarantee that the RAR will fall out of this process.
The scatter in the simulations is 0.05 dex. The scatter in the data not explained by random errors is 0.06 dex. This agreement is good. I think the source of the scatter needs to be explored further (see below), but it is at least in the right ballpark, which is by no means guaranteed.
The authors make a genuine prediction for how the RAR should evolve with redshift. That isn’t just good; it is bold and laudable.
There are only 18 simulated galaxies to compare to 153 real ones. I appreciate the difficulty in generating these simulations, but we really need a bigger sample. The large number of sampled points (1800) is less important given the simulators’ ability to parse the data as finely as their CPU allows them to resolve. I also wonder if the lowest acceleration points extend beyond the range sampled in comparable galaxies. Typically the data peter out around an HI surface density of 1 Msun/pc^2.
The comparison they make to Fig. 3 of arxiv:1609.05917 is great. I would like to see something like Fig. 1 and 2 from that paper as well. What range of galaxy properties do the models span? What do individual mass models look like?
My biggest concern is that there is a limited dynamic range in the simulations, which span only a factor of 15 in disk mass: from 1.7E10 to 2.7E11 Msun. For comparison, the data span 1E7 to 5E11 Lsun in [3.6] luminosity, a factor of 50,000. The simulations only sample the top 0.03% of this range.
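The comparison is easy to reproduce. A minimal sketch follows; like the paragraph above, it compares the simulated range in disk mass directly with the observed range in [3.6] luminosity, which implicitly assumes a roughly constant mass-to-light ratio.

```python
import math

sim_min, sim_max = 1.7e10, 2.7e11   # simulated disk masses, Msun
data_min, data_max = 1.0e7, 5.0e11  # observed [3.6] luminosities, Lsun

sim_factor = sim_max / sim_min      # a bit under 16
data_factor = data_max / data_min   # 50,000

# Fraction of the observed range covered, counted linearly as in the text.
fraction_linear = sim_factor / data_factor

# The same comparison in logarithmic terms (dex).
sim_dex = math.log10(sim_factor)
data_dex = math.log10(data_factor)

print(f"simulations: factor of {sim_factor:.1f} ({sim_dex:.1f} dex)")
print(f"data:        factor of {data_factor:.0f} ({data_dex:.1f} dex)")
print(f"linear fraction of the observed range: {100 * fraction_linear:.2f}%")
```

Even counted in dex, which is the more generous way to look at it, the simulations cover only about a quarter of the observed range.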
Basically, the simulated galaxies go from a little less massive than the Milky Way up to a bit more massive than Andromeda. Comparing this range to the RAR and declaring the problem solved is like fitting the Milky Way and Andromeda and declaring all problems in the Local Group solved without looking at any of the dwarfs. It is at lower mass scales and for lower surface brightness galaxies that problems become severe. Consequently, the most the authors can claim is a promising start on understanding a tiny fraction of bright galaxies, not a complete explanation of the RAR.
Indeed, while the authors quantify the mass range over which their simulated galaxies extend, they make no mention of either size or surface brightness. Are these comparable to real galaxies of similar mass? Too narrow a range in size at fixed mass, as seems likely in a small sample, may act to artificially suppress the scatter. Put another way: if the simulated galaxies only cover a tiny region of Fig. 1 above, it is hardly surprising if they exhibit little scatter.
The apparent match between the simulated and observed scatter seems good. But the “left over” observational scatter of 0.06 dex is the same as what we expect from scatter in the mass-to-light ratio. That is irreducible. There has to be some variation in stellar populations, and it is much easier to imagine this number getting bigger than being much smaller.
In the simulations, the stellar mass is presumably known perfectly, so I expect the scatter has a different source. Presumably there is scatter from halo to halo as seen in other simulations. That’s natural in LCDM, but there isn’t any room for it if we also have to accommodate scatter from the mass-to-light ratio. The apparent equality of observed and simulated scatter is meaningless if they represent scatter in different quantities.
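A minimal sketch of the scatter budget makes the point. I am assuming the two sources of scatter are independent and therefore add in quadrature; that is my assumption for illustration, not something the paper states.

```python
import math

ml_scatter = 0.06    # dex, expected from variation in the mass-to-light ratio
halo_scatter = 0.05  # dex, scatter seen in the simulations
budget = 0.06        # dex, observed scatter not explained by random errors

# Independent sources of scatter add in quadrature (assumption).
total = math.hypot(ml_scatter, halo_scatter)

# Room left for any other source once mass-to-light scatter is accounted for.
room_left = math.sqrt(max(budget**2 - ml_scatter**2, 0.0))

print(f"combined scatter: {total:.3f} dex, against a budget of {budget:.2f} dex")
print(f"room left for halo-to-halo scatter: {room_left:.3f} dex")
```

If real galaxies had both kinds of scatter, the total would be about 0.08 dex, which exceeds what the data allow.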
I have trouble believing that the RAR follows simply from dissipative collapse without feedback. I’ve worked on this before, so I’m pretty sure it does not work this way. It is true that a single model does something like this as a result of dissipative collapse. It is not true that an ensemble of such models is guaranteed to fall on the same relation.
There are many examples of galaxies with the same mass but very different scale lengths. In the absence of feedback, shorter scale lengths lead to more compression of the dark matter halo. One winds up with more dark matter where there are more baryons. This is the opposite of what we see in the data.
This makes me suspect the dynamic range in the simulations is a problem. Not only do they cover little range in mass compared to the data, but this particular conclusion may only be reached if there is virtually no dynamic range in size at a given mass. That is hardly surprising given the small sample size.
This paper has nothing to do with MOND, nor says anything about it. Why is it in the title?
At best, the authors have shown that, over a rather limited dynamic range, simulations in LCDM might reproduce post facto what MOND predicted a priori. If so, LCDM survives this test (as far as it goes). But in no respect can this be considered a problem for MOND, which predicted the phenomenon over 30 years ago. This is a classic problem in the philosophy of science: should we put more weight on the a priori prediction, or on the capacity of a more flexible theory to accommodate the same observation later on?
The title is revealing of a deep-rooted bias. It tarnishes what might be an important result and does a disservice to the objectivity we’re supposed to value in science.
DO OTHER SIMULATIONS AGREE?
I am eager to see whether other simulations agree with these results. Not all simulators implement feedback in the same way, nor get the same results. The most dangerous aspect of this paper is that it may give people an excuse to think the problem is solved so they never have to think about it again. The RAR is a test that needs to be applied every time to each and every batch of simulations. If they don’t pass this test, they’re wrong. Unfortunately, there is precedent in the galaxy formation community to take reassurances such as this for granted, and not to bother to perform the test.
THE RAR TEST MUST BE PERFORMED FOR ALL SIMULATIONS. ALWAYS.