There is a rule of thumb in scientific publication that if a title is posed as a question, the answer is no.
It sucks being so far ahead of the field that I get to watch people repeat the mistakes I made (or almost made) and warned against long ago. There have been persistent claims of deviations of one sort or another from the Baryonic Tully-Fisher relation (BTFR). So far, these have all been obviously wrong, for reasons we’ve discussed before. It all boils down to data quality. The credibility of data is important, especially in astronomy.
A relation is clear in the plot above, but it’s a mess. There’s lots of scatter, especially at low mass. There is also a systematic tendency for low mass galaxies to fall to the left of the main relation, appearing to rotate too slowly for their mass.
There is no quality control in the plot above. I have thrown all the mud at the wall. Let’s now do some quality control. The plotted quantities are the baryonic mass and the flat rotation speed. We haven’t actually measured the flat rotation speed in all these cases. For some, we’ve simply taken the last measured point. This was an issue we explicitly pointed out in Stark et al (2009):
If we include a galaxy like UGC 4173, we expect it will be offset to the low velocity side because we haven’t measured the flat rotation speed. We’ve merely taken that last point and hoped it is close enough. Sometimes it is, depending on your tolerance for systematic errors. But the plain fact is that we haven’t measured the flat rotation speed in this case. We don’t even know if it has one; it is only empirical experience with other examples that leads us to expect it to flatten if we manage to observe further out.
For our purpose here, it is as if we hadn’t measured this galaxy at all. So let’s not pretend like we have, and restrict the plot to galaxies for which the flat velocity is measured:
The scatter in the BTFR decreases dramatically when we exclude the galaxies for which we haven’t measured the relevant quantities. This is a simple matter of data quality. We’re no longer pretending to have measured a quantity that we haven’t measured.
There are still some outliers as there are still things that can go wrong. Inclinations are a challenge for some galaxies, as are distance determinations. Remember that Tully-Fisher was first employed as a distance indicator. If we look at the plot above from that perspective, the outliers have obviously been assigned the wrong distance, and we would assign a new one by putting them on the relation. That, in a nutshell, is how astronomical distance indicators work.
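To make the distance-indicator logic concrete, here is a minimal sketch. It assumes a relation of the form M_b = A V_f^4 with A = 47 solar masses per (km/s)^4; neither the slope nor the normalization is stated in this post, so treat both as assumptions, and the example galaxy is hypothetical. It also assumes the baryonic mass comes from measured fluxes, so that it scales as distance squared while the rotation speed does not depend on distance.

```python
import math

# ASSUMPTION: relation of the form M_b = A * V_f**4. Neither the slope
# nor the value of A is given in this post.
A_BTFR = 47.0  # Msun / (km/s)^4, assumed normalization


def btfr_mass(v_flat_kms):
    """Baryonic mass (Msun) on the assumed relation for a flat speed in km/s."""
    return A_BTFR * v_flat_kms ** 4


def tully_fisher_distance(d_assumed_mpc, m_baryon_at_d, v_flat_kms):
    """Distance (Mpc) that would place the galaxy exactly on the assumed relation.

    Masses inferred from fluxes scale as distance squared, so the
    distance scales as the square root of the mass ratio.
    """
    return d_assumed_mpc * math.sqrt(btfr_mass(v_flat_kms) / m_baryon_at_d)


# Hypothetical outlier: V_f = 100 km/s, catalogued at 10 Mpc with
# M_b = 2.35e9 Msun, i.e. half the mass the assumed relation expects.
d_new = tully_fisher_distance(10.0, 2.35e9, 100.0)
print(f"distance that restores the relation: {d_new:.2f} Mpc")
```

A galaxy that appears to have half the expected mass only needs to be about 41% further away than assumed to land on the relation, which is why distance errors are a natural suspect for outliers.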
If we restrict the data to those with accurate measurements, we get
Now the outliers are gone. They were outliers because they had crappy data. This is completely unsurprising. Some astronomical data are always crappy. You plot crap against crap, you get crap. If, on the other hand, you look at the high quality data, you get a high quality correlation. Even then, you can never be sure that you’ve excluded all the crap, as there are often unknown unknowns – systematic errors you don’t know about and can’t control for.
We have done the exercise of varying the tolerance limits on data quality many times. We have shown that the scatter varies as expected with data quality. If we consider high quality data, we find a small scatter in the BTFR. If we consider low quality data, we get to plot more points, but the scatter goes up. You can see this by eye above. We can quantify this, and have. The amount of scatter varies as expected with the size of the uncertainties. Bigger errors, bigger scatter. Smaller errors, smaller scatter. This shouldn’t be hard to understand.
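The "bigger errors, bigger scatter" statement can be illustrated with a toy simulation. This is only a sketch under stated assumptions: errors are Gaussian and independent, they are expressed in dex of baryonic mass, and the intrinsic scatter of 0.05 dex is a made-up number chosen for illustration, not a measurement.

```python
import math
import random


def observed_scatter(sigma_err_dex, sigma_int_dex=0.05, n=20000, seed=1):
    """RMS residual (dex) about a relation for a given measurement error.

    Each simulated galaxy gets an intrinsic offset from the relation
    plus an independent measurement error; the relation itself cancels
    out of the residual, so only the two offsets are drawn.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        intrinsic = rng.gauss(0.0, sigma_int_dex)  # hypothetical intrinsic offset
        error = rng.gauss(0.0, sigma_err_dex)      # measurement error
        total += (intrinsic + error) ** 2
    return math.sqrt(total / n)


for sigma_err in (0.1, 0.3):
    measured = observed_scatter(sigma_err)
    expected = math.hypot(0.05, sigma_err)  # quadrature sum of the two terms
    print(f"errors {sigma_err:.2f} dex: scatter {measured:.3f}, expected {expected:.3f}")
```

Under these assumptions the observed scatter tracks the quadrature sum of the intrinsic scatter and the errors, so a low quality sample shows a large scatter even when the underlying relation is tight.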
So why do people – many of them good scientists – keep screwing this up?
There are several answers. One is that measuring the flat rotation speed is hard. We have only done it for a couple hundred galaxies. This seems like a tiny number in the era of the Sloan Digital Sky Survey, which enables any newbie to assemble a sample of tens of thousands of galaxies… with photometric data. It doesn’t provide any kinematic data. Measuring the stellar mass with the photometric data doesn’t do one bit of good for this problem if you don’t have the kinematic axis to plot against. Consequently, it doesn’t matter how big such a sample is.
Other measurements often provide a proxy measurement that seems like it ought to be close enough to use. If not the flat rotation speed, maybe you have a line width or a maximum speed or V2.2 or the hybrid S0.5 or some other metric. That’s fine, so long as you recognize you’re plotting something different so should expect to get something different – not the BTFR. Again, we’ve shown that the flat rotation speed is the measure that minimizes the scatter; if you utilize some other measure you’re gonna get more scatter. That may be useful for some purposes, but it only tells you about what you measured. It doesn’t tell you anything about the scatter in the BTFR constructed with the flat rotation speed if you didn’t measure the flat rotation speed.
Another possibility is that there exist galaxies that fall off the BTFR that we haven’t observed yet. It is a big universe, after all. This is a known unknown unknown: we know that we don’t know if there are non-conforming galaxies. If the relation is indeed absolute, then we never can find any, but never can we know that they don’t exist, only that we haven’t yet found any credible examples.
I’ve addressed the possibility of nonconforming galaxies elsewhere, so all I’ll say here is that I have spent my entire career seeking out the extremes in galaxy properties. Many times I have specifically sought out galaxies that should deviate from the BTFR for some clear reason, only to be surprised when they fall bang on the BTFR. Over and over and over again. It makes me wonder how Vera Rubin felt when her observations kept turning up flat rotation curves. Shouldn’t happen, but it does – over and over and over again. So far, I haven’t found any credible deviations from the BTFR, nor have I seen credible cases provided by others – just repeated failures of quality control.
Finally, an underlying issue is often – not always, but often – an obsession with salvaging the dark matter paradigm. That’s hard to do if you acknowledge that the observed BTFR – its slope, normalization, lack of scale length residuals, negligible intrinsic scatter; indeed, the very quantities that define it – were anticipated and explicitly predicted by MOND and only MOND. It is easy to confirm the dark matter paradigm if you never acknowledge this to be a problem. Often, people redefine the terms of the issue in some manner that is more tractable from the perspective of dark matter. From that perspective, neither the “cold” baryonic mass nor the flat rotation speed has any special meaning, so why even consider them? That is the road to MONDness.
This expression exactly depicts the progression of the radial acceleration relation. Some people were ahead of this curve, others are still behind it, but it quite accurately depicts the mass sociology. This is how we react to startling new facts.
For quotation purists, I’m not sure exactly what the original phrasing was. I have paraphrased it to be succinct and have substituted orthodoxy for religion, because even scientists can have orthodoxies: holy cows that must not be slaughtered.
I might even add a precursor stage zero to the list above:
0. It goes unrecognized.
This is to say that if a new fact is sufficiently startling, we don’t just disbelieve it (stage 1); at first we fail to see it at all. We lack the cognitive framework to even recognize how important it is. An example is provided by the 1941 detection of the microwave background by Andrew McKellar. In retrospect, this is as persuasive as the 1964 detection of Penzias and Wilson to which we usually ascribe the discovery. At the earlier time, there was simply no framework for recognizing what it was that was being detected. It appears to me that P&W didn’t know what they were looking at either until Peebles explained it to them.
The radial acceleration relation was first posed as the mass discrepancy-acceleration relation. They’re fundamentally the same thing, just plotted in a slightly different way. The mass discrepancy-acceleration relation shows the ratio of total mass to that which is visible. This is basically the ratio of the observed acceleration to that predicted by the observed baryons. This is useful to see how much dark matter is needed, but by construction the axes are not independent, as both measured quantities are used in forming the ratio.
The radial acceleration relation shows independent observations along each axis: observed vs. predicted acceleration. Though measured independently, they are not physically independent, as the baryons contribute some to the total observed acceleration – they do have mass, after all. One can construct a halo acceleration relation by subtracting the baryonic contribution away from the total; in principle the remainders are physically independent. Unfortunately, the axes again become observationally codependent, and the uncertainties blow up, especially in the baryon dominated regime. Which of these depictions is preferable depends a bit on what you’re looking to see; here I just want to note that they are the same information packaged somewhat differently.
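To see that these depictions are repackagings of the same two measurements, here is a small sketch. The function names and the sample point are hypothetical; the only physics assumed is the centripetal acceleration V^2/R for circular motion.

```python
KPC_IN_KM = 3.0857e16  # kilometres per kiloparsec


def acceleration(v_kms, r_kpc):
    """Centripetal acceleration V^2/R in m/s^2 for V in km/s and R in kpc."""
    v_ms = v_kms * 1.0e3
    r_m = r_kpc * KPC_IN_KM * 1.0e3
    return v_ms ** 2 / r_m


def three_depictions(v_obs_kms, v_bar_kms, r_kpc):
    """Return the (x, y) pair each depiction would plot for one data point."""
    g_obs = acceleration(v_obs_kms, r_kpc)  # from the observed rotation speed
    g_bar = acceleration(v_bar_kms, r_kpc)  # predicted by the observed baryons
    return {
        "radial acceleration": (g_bar, g_obs),
        "mass discrepancy-acceleration": (g_bar, g_obs / g_bar),
        "halo acceleration": (g_bar, g_obs - g_bar),
    }


# Hypothetical point: 100 km/s observed, 50 km/s expected from baryons, at 10 kpc.
point = three_depictions(100.0, 50.0, 10.0)
for name, (x, y) in point.items():
    print(f"{name}: x = {x:.3e}, y = {y:.3e}")
```

The halo version is a difference of two measured numbers, so where the baryons dominate (g_obs close to g_bar) a small fractional error in either becomes a large fractional error in the remainder.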
To the best of my knowledge, the first mention of the mass discrepancy-acceleration relation in the scientific literature is by Sanders (1990). Its existence is explicit in MOND (Milgrom 1983), but here it is possible to draw a clear line between theory and data. I am only speaking of the empirical relation as it appears in the data, irrespective of anything specific to MOND.
I met Bob Sanders, along with many other talented scientists, in a series of visits to the University of Groningen in the early 1990s. Despite knowing him and having talked to him about rotation curves, I was unaware that he had done this.
Stage 0: It goes unrecognized.
For me, stage one came later in the decade at the culmination of a several years’ campaign to examine the viability of the dark matter paradigm from every available perspective. That’s a long paper, which nevertheless drew considerable praise from many people who actually read it. If you go to the bother of reading it today, you will see the outlines of many issues that are still debated and others that have been forgotten (e.g., the fine-tuning issues).
Around this time (1998), the dynamicists at Rutgers were organizing a meeting on galaxy dynamics, and asked me to be one of the speakers. I couldn’t possibly discuss everything in the paper in the time allotted, so was looking for a way to show the essence of the challenge the data posed. Consequently, I reinvented the wheel, coming up with the mass discrepancy-acceleration relation. Here I show the same data that I had then in the form of the radial acceleration relation:
I recognize this version of the plot as having been made by Federico Lelli. I’ve made this plot many times, but this is the version I came across first, and it is better than mine in that the opacity of the points illustrates where the data are concentrated. I had been working on low surface brightness galaxies; these have low accelerations, so that part of the plot is well populated.
The data show a clear correlation. By today’s standards, it looks crude. Going on what we had then, it was fantastic. Correlations practically never look this good in extragalactic astronomy, and they certainly don’t happen by accident. Low quality data can hide a correlation – uncertainties cause scatter – but they can’t create a correlation where one doesn’t exist.
I showed the same result later that year (1998) at a meeting on the campus of the University of Maryland where I was a brand new faculty member. It was a much shorter presentation, so I didn’t have time to justify the context or explain much about the data. Contrary to the reception at Rutgers where I had adequate time to speak, the hostility of the audience to the result was palpable, their stony silence eloquent. They didn’t want to believe it, and plenty of people got busy questioning the data.
Stage 1: It is not true.
I spent the next five years expanding and improving the data. More rotation curves became available thanks to the work of many, particularly Erwin de Blok, Marc Verheijen, and Rob Swaters. That was great, but the more serious limitation was how well we could measure the stellar mass distribution needed to predict the baryonic acceleration.
The mass models we could build at the time were based on optical images. A mass model takes the observed light distribution, assigns a mass-to-light ratio, and makes a numerical solution of the Poisson equation to obtain the gravitational force corresponding to the observed stellar mass distribution. This is how we obtain the stellar contribution to the predicted baryonic force; the same procedure is applied to the observed gas distribution. The blue part of the spectrum is the best place in which to observe low contrast, low surface brightness galaxies as the night sky is darkest there, at least during new moon. That’s great for measuring the light distribution, but what we want is the stellar mass distribution. The mass-to-light ratio is expected to have a lot of scatter in the blue band simply from the happenstance of recent star formation, which makes bright blue stars that are short-lived. If there is a stochastic uptick in the star formation rate, then the mass-to-light ratio goes down because there are lots of bright stars. Wait a few hundred million years and these die off, so the mass-to-light ratio gets bigger (in the absence of further new star formation). The time-integrated stellar mass may not change much, but the amount of blue light it produces does. Consequently, we expect to see well-observed galaxies trace distinct lines in the radial acceleration plane, even if there is a single universal relation underlying the phenomenon. This happens simply because we expect to get M*/L wrong from one galaxy to the next: in 1998, I had simply assumed all galaxies had the same M*/L for lack of any better prescription. Clearly, a better prescription was warranted.
In those days, I traveled through Tucson to observe at Kitt Peak with some frequency. On one occasion, I found myself with a few hours to kill between coming down from the mountain and heading to the airport. I wandered over to the Steward Observatory at the University of Arizona to see who I might see. A chance meeting in the wild west: I encountered Eric Bell and Roelof de Jong, who were postdocs there at the time. I knew Eric from his work on the stellar populations of low surface brightness galaxies, an interest closely aligned with my own, and Roelof from my visits to Groningen.
As we got to talking, Eric described to me work they were doing on stellar populations, and how they thought it would be possible to break the age-metallicity degeneracy using near-IR colors in addition to optical colors. They were mostly focused on improving the age constraints on stars in LSB galaxies, but as I listened, I realized they had constructed a more general, more powerful tool. At my encouragement (read their acknowledgements), they took on this more general task, ultimately publishing the classic Bell & de Jong (2001). In it, they built a table that enabled one to look up the expected mass-to-light ratio of a complex stellar population – one actively forming stars – as a function of color. This was a big step forward over my educated guess of a constant mass-to-light ratio: there was now a way to use a readily observed property, color, to improve the estimated M*/L of each galaxy in a well-calibrated way.
Combining the new stellar population models with all the rotation curves then available, I obtained an improved mass discrepancy-acceleration relation:
Again, the relation is clear, but with scatter. Even with the improved models of Bell & de Jong, some individual galaxies have M*/L that are wrong – that’s inevitable in this game. What you cannot know is which ones! Note, however, that there are now 74 galaxies in this plot, and almost all of them fall on top of each other where the point density is large. There are some obvious outliers; those are presumably just that: the trees that fall outside the forest because of the expected scatter in M*/L estimates.
I tried a variety of prescriptions for M*/L in addition to that of Bell & de Jong. Though they differed in texture, they all told a consistent story. A relation was clearly present; only its detailed form varied with the adopted prescription.
The prescription that minimized the scatter in the relation was the M*/L obtained in MOND fits. That’s a tautology: by construction, a MOND fit finds the M*/L that puts a galaxy on this relation. However, we can generalize the result. Maybe MOND is just a weird, unexpected way of picking a number that has this property; it doesn’t have to be the true mass-to-light ratio in nature. But one can then define a ratio Q
that relates the “true” mass-to-light ratio to the number that gives a MOND fit. They don’t have to be identical, but MOND does return M*/L that are reasonable in terms of stellar populations, so Q ~ 1. Individual values could vary, and the mean could be a bit more or less than unity, but not radically different. One thing that impressed me at the time about the MOND fits (most of which were made by Bob Sanders) was how well they agreed with the stellar population models, recovering the correct amplitude, the correct dependence on color in different bandpasses, and also giving the expected amount of scatter (more in the blue than in the near-IR).
The obvious interpretation is that we should take seriously a theory that obtains good fits with a single free parameter that checks out admirably well with independent astrophysical constraints, in this case the M*/L expected for stellar populations. But I knew many people would not want to do that, so I defined Q to generalize to any M*/L in any (dark matter) context one might want to consider.
Indeed, Q allows us to write a general expression for the rotation curve of the dark matter halo (essentially the HAR alluded to above) in terms of that of the stars and gas:
The stars and the gas are observed, and μ is the MOND interpolation function assumed in the fit that leads to Q. Except now the interpolation function isn’t part of some funny new theory; it is just the shape of the radial acceleration relation – a relation that is there empirically. The only fit factor between these data and any given model is Q – a single number of order unity. This does leave some wiggle room, but not much.
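For readers who want the algebra spelled out, the two expressions described in words above can be sketched as follows. The notation is an assumption made for illustration: Υ* is the true stellar mass-to-light ratio, Υ_MOND the value returned by the MOND fit, v_* the stellar rotation curve for unit mass-to-light ratio, V_g the gas contribution, V the observed rotation speed, and a_0 the MOND acceleration scale. The direction of the ratio (true over fitted) is likewise assumed; the text only fixes that Q compares the two and is of order unity.

```latex
% Assumed definition of Q: true M*/L over the MOND-fit value
Q \equiv \frac{\Upsilon_*}{\Upsilon_{\rm MOND}}

% A MOND fit means the baryons, scaled by 1/\mu, reproduce the observed speed V:
\mu(x)\, V^2 = \Upsilon_{\rm MOND}\, v_*^2 + V_g^2 ,
\qquad x = \frac{V^2}{r\, a_0}

% The halo is whatever remains after subtracting the true baryonic contribution:
V_h^2 = V^2 - \Upsilon_*\, v_*^2 - V_g^2

% Substituting the first two lines into the third:
V_h^2 = \left( \frac{1}{\mu} - Q \right) \Upsilon_{\rm MOND}\, v_*^2
      + \left( \frac{1}{\mu} - 1 \right) V_g^2
```

Written this way, everything on the right-hand side is either observed or fixed by the empirical shape of the relation, and Q is the only number left to adjust.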
I went off to a conference to describe this result. At the 2006 meeting Galaxies in the Cosmic Web in New Mexico, I went out of my way at the beginning of the talk to show that even if we ignore MOND, this relation is present in the data, and it provides a strong constraint on the required distribution of dark matter. We may not know why this relation happens, but we can use it, modulo only the modest uncertainty in Q.
Having bent over backwards to distinguish the data from the theory, I was disappointed when, immediately at the end of my talk, prominent galaxy formation theorist Anatoly Klypin loudly shouted
“We don’t have to explain MOND!”
But you do have to explain the data. The problem was and is that the data look like MOND. It is easy to conflate one with the other; I have noticed that a lot of people have trouble keeping the two separate. Just because you don’t like the theory doesn’t mean that the data are wrong. What Anatoly was saying was that
2. It is contrary to orthodoxy.
Despite phrasing the result in a way that would be useful to galaxy formation theorists, they did not, by and large, claim to explain it at the time – it was contrary to orthodoxy so didn’t need to be explained. Looking at the list of papers that cite this result, the early adopters were not the target audience of galaxy formation theorists, but rather others citing it to say variations of “no way dark matter explains this.”
At this point, it was clear to me that further progress required a better way to measure the stellar mass distribution. Looking at the stellar population models, the best hope was to build mass models from near-infrared rather than optical data. The near-IR is dominated by old stars, especially red giants. Galaxies that have been forming stars actively for a Hubble time tend towards a quasi-equilibrium in which red giants are replenished by stellar evolution at about the same rate they move on to the next phase. One therefore expects the mass-to-light ratio to be more nearly constant in the near-IR. Not perfectly so, of course, but a 2 or 3 micron image is as close to a map of the stellar mass of a galaxy as we’re likely to get.
Around this time, the University of Maryland had begun a collaboration with Kitt Peak to build a big infrared camera, NEWFIRM, for the 4m telescope. Rob Swaters was hired to help write software to cope with the massive data flow it would produce. The instrument was divided into quadrants, each of which had a field of view sufficient to hold a typical galaxy. When it went on the telescope, we developed an efficient observing method that I called “four-shooter”, shuffling the target galaxy from quadrant to quadrant so that in processing we could remove the numerous instrumental artifacts intrinsic to its InSb detectors. This eventually became one of the standard observing modes in which the instrument was operated.
I was optimistic that we could make rapid progress, and at first we did. But despite all the work, despite all the active cooling involved, we were still on the ground. The night sky was painfully bright in the IR. Indeed, the thermal component dominated, so we could observe during full moon. To an observer of low surface brightness galaxies attuned to any hint of scattered light from so much as a crescent moon, I cannot describe how discombobulating it was to walk outside the dome and see the full fricking moon. So bright. So wrong. And that wasn’t even the limiting factor: the thermal background was.
We had hit a surface brightness wall, again. We could do the bright galaxies this way, but the LSBs that sample the low acceleration end of the radial acceleration relation were rather less accessible. Not inaccessible, but there was a better way.
The Spitzer Space Telescope was active at this time. Jim Schombert and I started winning time to observe LSB galaxies with it. We discovered that space is dark. There was no atmosphere to contend with. No scattered light from the clouds or the moon or the OH lines that afflict that part of the sky spectrum. No ground-level warmth. The data were fantastic. In some sense, they were too good: the biggest headache we faced was blotting out all the background galaxies that shone right through the optically thin LSB galaxies.
Still, it took a long time to collect and analyze the data. We were starting to get results by the early-teens, but it seemed like it would take forever to get through everything I hoped to accomplish. Fortunately, when I moved to Case Western, I was able to hire Federico Lelli as a postdoc. Federico’s involvement made all the difference. After many months of hard, diligent, and exacting work, he constructed what is now the SPARC database. Finally all the elements were in place to construct an empirical radial acceleration relation with absolutely minimal assumptions about the stellar mass-to-light ratio.
In parallel with the observational work, Jim Schombert had been working hard to build realistic stellar population models that extended to the 3.6 micron band of Spitzer. Spitzer had been built to look redwards of this, further into the IR. 3.6 microns was its shortest wavelength passband. But most models at the time stopped at the K-band, the 2.2 micron band that is the reddest passband that is practically accessible from the ground. They contain pretty much the same information, but we still need to calculate the band-specific value of M*/L.
Being a thorough and careful person, Jim considered not just the star formation history of a model stellar population as a variable, and not just its average metallicity, but also the metallicity distribution of its stars, making sure that these were self-consistent with the star formation history. Realistic metallicity distributions are skewed; it turns out that this subtle effect tends to counterbalance the color dependence of the age effect on M*/L in the near-IR part of the spectrum. The net result is that we expect M*/L to be very nearly constant for all late type galaxies.
This is the best possible result. To a good approximation, we expected all of the galaxies in the SPARC sample to have the same mass-to-light ratio. What you see is what you get. No variable M*/L, no equivocation, just data in, result out.
We did still expect some scatter, as that is an irreducible fact of life in this business. But even that we expected to be small, between 0.1 and 0.15 dex (roughly 25 – 40%). Still, we expected the occasional outlier, galaxies that sit well off the main relation just because our nominal M*/L didn’t happen to apply in that case.
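The conversion from dex to a percentage is simple arithmetic: a scatter of d dex corresponds to a multiplicative factor of 10^d. A minimal sketch (the function name is just for illustration):

```python
def dex_to_percent(dex):
    """Fractional excess, in percent, corresponding to a factor of 10**dex."""
    return (10.0 ** dex - 1.0) * 100.0


for d in (0.10, 0.15):
    print(f"{d:.2f} dex is a factor of {10.0 ** d:.3f}, or about {dex_to_percent(d):.0f}%")
```

This gives roughly 26% and 41% for the two ends of the quoted range, consistent with the rounded figures above.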
One day as I walked past Federico’s office, he called for me to come look at something. He had plotted all the data together assuming a single M*/L. There… were no outliers. The assumption of a constant M*/L in the near-IR didn’t just work, it worked far better than we had dared to hope. The relation leapt straight out of the data:
Over 150 galaxies, with nearly 2700 resolved measurements among them, each with their own distinctive mass distribution, all pile on top of each other without effort. There was plenty of effort in building the database, but once it was there, the result appeared, no muss, no fuss. No fitting or fiddling. Just the measurements and our best estimate of the mean M*/L, applied uniformly to every individual galaxy in the sample. The scatter was only 0.12 dex, within the range expected from the population models.
No MOND was involved in the construction of this relation. It may look like MOND, but we neither use MOND nor need it in any way to see the relation. It is in the data. Perhaps this is the sort of result for which we would have to invent MOND if it did not already exist. But the dark matter paradigm is very flexible, and many papers have since appeared that claim to explain the radial acceleration relation. We have reached
3. We knew it all along.
On the one hand, this is good: the community is finally engaging with a startling fact that has been pointedly ignored for decades. On the other hand, many of the claims to explain the radial acceleration relation are transparently incorrect on their face, being nothing more than elaborations of models I considered and discarded as obviously unworkable long ago. They do not provide a satisfactory explanation of the predictive power of MOND, and inevitably fail to address important aspects of the problem, like disk stability. Rather than grapple with the deep issues the new and startling fact poses, it has become fashionable to simply assert that one’s favorite model explains the radial acceleration relation, and does so naturally.
There is nothing natural about the radial acceleration relation in the context of dark matter. Indeed, it is difficult to imagine a less natural result – hence stages one and two. So on the one hand, I welcome the belated engagement, and am willing to consider serious models. On the other hand, if someone asserts that this is natural and that we expected it all along, then the engagement isn’t genuine: they’re just fooling themselves.
Early Days. This was one of Vera Rubin’s favorite expressions. I always had a hard time with it, as many things are very well established. Yet it seems that we have yet to wrap our heads around the problem. Vera’s daughter, Judy Young, once likened the situation to the parable of the blind men and the elephant. Much is known, yes, but the problem is so vast that each of us can perceive only a part of the whole, and the whole may be quite different from the part that is right before us.
So I guess Vera is right as always: these remain Early Days.
It’s early in the new year, so what better time to violate my own resolutions? I prefer to be forward-looking and not argue over petty details, or chase wayward butterflies. But sometimes the devil is in the details, and the occasional butterfly can be entertaining if distracting. Today’s butterfly is the galaxy AGC 114905, which has recently been in the news.
There are a couple of bandwagons here: one to rebrand very low surface brightness galaxies as ultradiffuse, and another to get overly excited when these types of galaxies appear to lack dark matter. The nomenclature is terrible, but that’s normal for astronomy so I would overlook it, except that in this case it gives the impression that there is some new population of galaxies behaving in an unexpected fashion, when instead it looks to me like the opposite is the case. The extent to which there are galaxies lacking dark matter is fundamental to our interpretation of the acceleration discrepancy (aka the missing mass problem), so bears closer scrutiny. The evidence for galaxies devoid of dark matter is considerably weaker than the current bandwagon portrays.
If it were just one butterfly (e.g., NGC 1052-DF2), I wouldn’t bother. Indeed, it was that specific case that made me resolve to ignore such distractions as a waste of time. I’ve seen this movie literally hundreds of times, I know how it goes:
Observations of this one galaxy falsify MOND!
Hmm, doing the calculation right, that’s what MOND predicts.
OK, but better data shrink the error bars and now MOND is falsified.
Are you sure about…?
Yes. We like this answer, let’s stop thinking about it now.
As the data continue to improve, it approaches what MOND predicts.
Over and over again. DF44 is another example that has followed this trajectory, and there are many others. This common story is not widely known – people lose interest once they get the answer they want. Irrespective of whether we can explain this weird case or that, there is a deeper story here about data analysis and interpretation that seems not to be widely appreciated.
My own experience inevitably colors my attitude about this, as it does for us all, so let’s start thirty years ago when I was writing a dissertation on low surface brightness (LSB) galaxies. I did many things in my thesis, most of them well. One of the things I tried to do then was derive rotation curves for some LSB galaxies. This was not the main point of the thesis, and arose almost as an afterthought. It was also not successful, and I did not publish the results because I didn’t believe them. It wasn’t until a few years later, with improved data, analysis software, and the concerted efforts of Erwin de Blok, that we started to get a handle on things.
The thing that really bugged me at the time was not the Doppler measurements, but the inclinations. One has to correct the observed velocities by the inclination of the disk, 1/sin(i). The inclination can be constrained by the shape of the image and by the variation of velocities across the face of the disk. LSB galaxies presented raggedy images and messy velocity fields. I found it nigh on impossible to constrain their inclinations at the time, and it remains a frequent struggle to this day.
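To see why poorly constrained inclinations are so troublesome, here is a minimal sketch of the 1/sin(i) correction. The line-of-sight speed of 40 km/s and the 5 degree inclination shift are hypothetical numbers chosen only to show the trend.

```python
import math


def deproject(v_los_kms, incl_deg):
    """Rotation speed in the disk plane from the line-of-sight speed, V = V_los / sin(i)."""
    return v_los_kms / math.sin(math.radians(incl_deg))


v_los = 40.0  # km/s, hypothetical observed line-of-sight speed
for incl in (60.0, 30.0):
    v_rot = deproject(v_los, incl)
    v_shifted = deproject(v_los, incl - 5.0)  # same data, inclination 5 degrees lower
    change = 100.0 * (v_shifted / v_rot - 1.0)
    print(f"i = {incl:.0f} deg: V = {v_rot:.1f} km/s, 5 deg lower gives +{change:.0f}%")
```

The same 5 degree uncertainty matters far more for a nearly face-on disk than for a well-inclined one, since the correction factor diverges as i approaches zero.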
Here is an example of the LSB galaxy F577-V1 that I find lurking around on disk from all those years ago:
A uniform disk projected on the sky at some inclination will have a fixed corresponding eccentricity, with zero being the limit of a circular disk seen perfectly face-on (i = 0). Do you see a constant value of the eccentricity in the graph above? If you say yes, go get your eyes checked.
What we see in this case is a big transition from a fairly eccentric disk to one that is more nearly face on. The inclination doesn’t have a sudden warp; the problem is that the assumption of a uniform disk is invalid. This galaxy has a bar – a quasi-linear feature, common in many spiral galaxies, that is supported by non-circular orbits. Even face-on, the bar will look elongated simply because it is. Indeed, the sudden change in eccentricity is one way to define the end of the bar, which the human eye-brain can do easily by looking at the image. So in a case like this, one might adopt the inclination from the outer points, and that might even be correct. But note that there are spiral arms along the outer edge that are visible to the eye, so it isn’t clear that even these isophotes are representative of the shape of the underlying disk. Worse, we don’t know what happens beyond the edge of the data; the shape might settle down at some other level that we can’t see.
This was so frustrating, I swore never to have anything to do with galaxy kinematics ever again. Over 50 papers on the subject later, all I can say is D’oh! Repeatedly.
Bars are rare in LSB galaxies, but it struck me as odd that we saw any at all. We discovered unexpectedly that LSB galaxies were dark matter dominated – the inferred dark halo outweighs the disk, even within the edge defined by the stars – but that meant that the disks should be stable against the formation of bars. My colleague Chris Mihos agreed, and decided to look into it. The answer was yes, LSB galaxies should be stable against bar formation, at least internally generated bars. Sometimes bars are driven by external perturbations, so we decided to simulate the close passage of a galaxy of similar mass – basically, whack it real hard and see what happens:
This was a conventional simulation, with a dark matter halo constructed to be consistent with the observed properties of the LSB galaxy UGC 128. The results are not specific to this case; it merely provides numerical corroboration of the more general case that we showed analytically.
Consider the image above in the context of determining galaxy inclinations from isophotal shapes. We know this object is face-on because we can control our viewing angle in the simulation. However, we would not infer i=0 from this image. If we didn’t know it had been perturbed, we would happily infer a substantial inclination – in this case, easily as much as 60 degrees! This is an intentionally extreme case, but it illustrates how a small departure from a purely circular shape can be misinterpreted as an inclination. This is a systematic error, and one that usually makes the inclination larger than it is: it is possible to appear oval when face-on, but it is not possible to appear more face-on than perfectly circular.
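To put rough numbers on this, here is a small sketch of the inclination one would infer from an isophote's axis ratio under the usual assumption of an intrinsically circular, razor-thin disk. The axis ratios are made up; they stand in for disks that are all truly face-on but not perfectly round.

```python
import math

def inferred_inclination_deg(axis_ratio):
    """Inclination implied by an isophote's minor-to-major axis ratio,
    assuming the disk is intrinsically circular and razor thin."""
    if not 0.0 < axis_ratio <= 1.0:
        raise ValueError("axis ratio must be in (0, 1]")
    return math.degrees(math.acos(axis_ratio))

# Hypothetical intrinsic axis ratios for disks that are all truly face-on.
for q in (1.0, 0.95, 0.9, 0.8, 0.5):
    print(f"b/a = {q:4.2f} -> inferred i = {inferred_inclination_deg(q):4.1f} deg")
```

An axis ratio of one half reads as 60 degrees, and even a 5% departure from round already reads as about 18 degrees. The error can only go one way, because no axis ratio maps to an inclination below zero.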
Around the same time, Erwin and I were making fits to the LSB galaxy data – with both dark matter halos and MOND. By this point in my career, I had deeply internalized that the data for LSB galaxies were never perfect. So we sweated every detail, and worked through every “what if?” This was a particularly onerous task for the dark matter fits, which could do many different things if this or that were assumed – we discussed all the plausible possibilities at the time. (Subsequently, a rich literature sprang up discussing many unreasonable possibilities.) By comparison, the MOND fits were easy. They had fewer knobs, and in 2/3 of the cases they simply worked, no muss, no fuss.
For the other 1/3 of the cases, we noticed that the shape of the MOND-predicted rotation curves was usually right, but the amplitude was off. How could it work so often, and yet miss in this weird way? That sounded like a systematic error, and the inclination was the most obvious culprit, with 1/sin(i) making a big difference for small inclinations. So we decided to allow this as a fit parameter, to see whether a fit could be obtained, and judge how [un]reasonable this was. Here is an example for two galaxies:
The case of UGC 1230 is memorable to me because it had a good rotation curve, despite being more face-on than widely considered acceptable for analysis. And for good reason: the difference between 22 and 17 degrees makes a huge difference to the fit, changing it from way off to picture perfect.
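A quick sketch of why five degrees matters so much at low inclination. It compares the deprojected speed for the same observed velocities at the two inclinations; the conversion to a mass offset assumes, purely for illustration, that baryonic mass scales as the fourth power of velocity along the BTFR.

```python
import math

def rotation_speed_ratio(i_new_deg, i_old_deg):
    """Factor by which the deprojected speed changes when the adopted
    inclination is revised from i_old to i_new (same observed velocities)."""
    return math.sin(math.radians(i_old_deg)) / math.sin(math.radians(i_new_deg))

ratio = rotation_speed_ratio(17.0, 22.0)
print(f"V(17 deg) / V(22 deg) = {ratio:.2f}")
# Assumption for illustration: baryonic mass scales as V^4 along the BTFR.
print(f"equivalent offset in mass from a V^4 relation: {ratio**4:.1f}")
```

Under these assumptions, the speed changes by a bit over a quarter, which is equivalent to a factor of nearly three in mass relative to a V^4 relation.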
What I took away from this exercise is how hard it is to tell the difference between inclination values for relatively face-on galaxies. UGC 1230 is obvious: the ovals for the two inclinations are practically on top of each other. The difference in the case of UGC 5005 is more pronounced, but look at the galaxy. The shape of the outer isophote where we’re trying to measure this is raggedy as all get out; this is par for the course for LSB galaxies. Worse, look further in – this galaxy has a bar! The central bar is almost orthogonal to the kinematic major axis. If we hadn’t observed as deeply as we had, we’d think the minor axis was the major axis, and the inclination was something even higher.
I remember Erwin quipping that he should write a paper on how to use MOND to determine inclinations. This was a joke between us, but only half so: using the procedure in this way would be analogous to using Tully-Fisher to measure distances. We would simply be applying an empirically established procedure to constrain a property of a galaxy – luminosity from line-width in the case of Tully-Fisher; inclination from rotation curve shape here. That we don’t understand why this works has never stopped astronomers before.
Systematic errors in inclination happen all the time. Big surveys don’t have time to image deeply – they have too much sky area to cover – and if there is follow-up about the gas content, it inevitably comes in the form of a single dish HI measurement. This is fine; it is what we can do en masse. But an unresolved single dish measurement provides no information about the inclination, only a pre-inclination line-width (which itself is a crude proxy for the flat rotation speed). The inclination we have to take from the optical image, which would key on the easily detected, high surface brightness central region of the image. That’s the part that is most likely to show a bar-like distortion, so one can expect lots of systematic errors in the inclinations determined in this way. I provided a long yet still incomplete discussion of these issues in McGaugh (2012). This is both technical and intensely boring, so not even the pros read it.
This brings us to the case of AGC 114905, which is part of a sample of ultradiffuse galaxies discussed previously by some of the same authors. On that occasion, I kept to the code, and refrained from discussion. But for context, here are those data on a recent Baryonic Tully-Fisher plot. Spoiler alert: that post was about a different sample of galaxies that seemed to be off the relation but weren’t.
On the face of it, these ultradiffuse galaxies (UDGs) are all very serious outliers. This is weird – they’re not some scatter off to one side, they’re just way off on their own island, with no apparent connection to the rest of established reality. By calling them a new name, UDG, it makes it sound plausible that these are some entirely novel population of galaxies that behave in a new way. But they’re not. They are exactly the same kinds of galaxies I’ve been talking about. They’re all blue, gas rich, low surface brightness, fairly isolated galaxies – all words that I’ve frequently used to describe my thesis sample. These UDGs are all a few billion solar masses in baryonic mass, very similar to F577-V1 above. You could give F577-V1 a different name, slip it into the sample, and nobody would notice that it wasn’t like one of the others.
The one slight difference is implied by the name: UDGs are a little lower in surface brightness. Indeed, once filter transformations are taken into account, the definition of ultradiffuse is equal to what I arbitrarily called very low surface brightness in 1996. Most of my old LSB sample galaxies have central stellar surface brightnesses at or a bit above 10 solar masses per square parsec while the UDGs here are a bit under this threshold. For comparison, in typical high surface brightness galaxies this quantity is many hundreds, often around a thousand. Nothing magic happens at the threshold of 10 solar masses per square parsec, so this line of definition between LSB and UDG is an observational distinction without a physical difference. So what are the odds of a different result for the same kind of galaxies?
Indeed, what really matters is the baryonic surface density, not just the stellar surface brightness. A galaxy made purely of gas but no stars would have zero optical surface brightness. I don’t know of any examples of that extreme, but we came close to it with the gas rich sample of Trachternach et al. (2009) when we tried this exact same exercise a decade ago. Despite selecting that sample to maximize the chance of deviations from the Baryonic Tully-Fisher relation, we found none – at least none that were credible: there were deviant cases, but their data were terrible. There were no deviants among the better data. This sample is comparable to or even more extreme than the UDGs in terms of baryonic surface density, so the UDGs can’t be an exception because they’re a genuinely new population, whatever name we call them by.
The key thing is the credibility of the data, so let’s consider the data for AGC 114905. The kinematics are pretty well ordered; the velocity field is well observed for this kind of beast. It ought to be; they invested over 40 hours of JVLA time into this one galaxy. That’s more than went into my entire LSB thesis sample. The authors are all capable, competent people. I don’t think they’ve done anything wrong, per se. But they do seem to have climbed aboard the bandwagon of dark matter-free UDGs, and have talked themselves into believing smaller error bars on the inclination than I am persuaded is warranted.
AGC 114905 in stars (left) and gas (right). The contours of the gas distribution are shown on top of the stars in white. Figure 1 from Mancera Piña et al. (2021).
This messy morphology is typical of very low surface brightness galaxies – hence their frequent classification as Irregular galaxies. Though messier, it shares some morphological traits with the LSB galaxies shown above. The central light distribution is elongated with a major axis that is not aligned with that of the gas. The gas is raggedy as all get out. The contours are somewhat boxy; this is a hint that something hinky is going on beyond circular motion in a tilted axisymmetric disk.
The authors do the right thing and worry about the inclination, checking to see what it would take to be consistent with either LCDM or MOND, which is about i=11° instead of the 30° indicated by the shape of the outer isophote. They even build a model to check the plausibility of the smaller inclination:
Clearly the black line (i=30°) is a better fit to the shape of the gas distribution than the blue dashed line (i=11°). Consequently, they “find it unlikely that we are severely overestimating the inclination of our UDG, although this remains the largest source of uncertainty in our analysis.” I certainly agree with the latter phrase, but not the former. I think it is quite likely that they are overestimating the inclination. I wouldn’t even call it a severe overestimation; more like par for the course with this kind of object.
As I have emphasized above and elsewhere, there are many things that can go wrong in this sort of analysis. But if I were to try to put my finger on the most important thing, here it would be the inclination. The modeling exercise is good, but it assumes “razor-thin axisymmetric discs.” That’s a reasonable thing to do when building such a model, but we have to bear in mind that real disks are neither. The thickness of the disk probably doesn’t matter too much for a nearly face-on case like this, but the assumption of axisymmetry is extraordinarily dubious for an Irregular galaxy. That’s how they got the name.
It is hard to build models that are not axisymmetric. Once you drop this simplifying assumption, where do you even start? So I don’t fault them for stopping at this juncture, but I can also imagine doing as de Blok suggested, using MOND to set the inclination. Then one could build models with asymmetric features by trial and error until a match is obtained. Would we know that such a model would be a better representation of reality? No. Could we exclude such a model? Also no. So the bottom line is that I am not convinced that the uncertainty in the inclination is anywhere near as small as the adopted ±3°.
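For a sense of scale, here is a rough sketch of how a ±3° inclination uncertainty maps onto the deprojected rotation speed, again assuming a thin disk in circular rotation. Applying the same ±3° at 11° is a hypothetical comparison, not something taken from the paper.

```python
import math

def speed_range(i_deg, delta_deg):
    """Fractional range in deprojected speed for inclination i +/- delta,
    relative to the speed at the nominal inclination."""
    s = math.sin(math.radians(i_deg))
    high = s / math.sin(math.radians(i_deg - delta_deg)) - 1.0  # lower i, faster
    low = s / math.sin(math.radians(i_deg + delta_deg)) - 1.0   # higher i, slower
    return low, high

for inc in (30.0, 11.0):
    low, high = speed_range(inc, 3.0)
    print(f"i = {inc:4.1f} +/- 3 deg -> V changes by {low:+.0%} to {high:+.0%}")
```

The same angular uncertainty that looks harmless at 30° is a much larger fraction of the velocity nearer to face-on, so the size of the adopted error bar on the inclination carries most of the weight of the result.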
That’s very deep in the devilish details. If one is worried about a particular result, one can back off and ask if it makes sense in the context of what we already know. I’ve illustrated this process previously. First, check the empirical facts. Every other galaxy in the universe with credible data falls on the Baryonic Tully-Fisher relation, including very similar galaxies that go by a slightly different name. Hmm, strike one. Second, check what we expect from theory. I’m not a fan of theory-informed data interpretation, but we know that LCDM, unlike SCDM before it, at least gets the amplitude of the rotation speed in the right ballpark (Vflat ~ V200). Except here. Strike two. As much as we might favor LCDM as the standard cosmology, it has now been extraordinarily well established that MOND has considerable success in not just explaining but predicting these kinds of data, with literally hundreds of examples. One hundred was the threshold Vera Rubin obtained to refute excuses made to explain away the first few flat rotation curves. We’ve crossed that threshold: MOND phenomenology is as well established now as flat rotation curves were at the inception of the dark matter paradigm. So while I’m open to alternative explanations for the MOND phenomenology, seeing that a few trees stand out from the forest is never going to be as important as the forest itself.
This brings us to a physical effect that people should be aware of. We touched on the bar stability above, and how a galaxy might look oval even when seen face on. This happens fairly naturally in MOND simulations of isolated disk galaxies. They form bars and spirals and their outer parts wobble about. See, for example, this simulation by Nils Wittenburg. This particular example is a relatively massive galaxy; the lopsidedness reminds me of M101 (Watkins et al. 2017). Lower mass galaxies deeper in the MOND regime are likely even more wobbly. This happens because disks are only marginally stable in MOND, not the over-stabilized entities that have to be hammered to show a response as in our early simulation of UGC 128 above. The point is that there is good reason to expect even isolated face-on dwarf Irregulars to look, well, irregular, leading to exactly the issues with inclination determinations discussed above. Rather than being a contradiction to MOND, AGC 114905 may illustrate one of its inevitable consequences.
I don’t like to bicker at this level of detail, but it makes a profound difference to the interpretation. I do think we should be skeptical of results that contradict well established observational reality – especially when over-hyped. God knows I was skeptical of our own results, which initially surprised the bejeepers out of me, but have been repeatedly corroborated by subsequent observations.
I guess I’m old now, so I wonder how I come across to younger practitioners; perhaps as some scary undead monster. But mates, these claims about UDGs deviating from established scaling relations are off the edge of the map.
We have a new paper on the arXiv. This is a straightforward empiricist’s paper that provides a reality check on the calibration of the Baryonic Tully-Fisher relation (BTFR) and the distance scale using well-known Local Group galaxies. It also connects observable velocity measures in rotating and pressure supported dwarf galaxies: the flat rotation speed of disks is basically twice the line-of-sight velocity dispersion of dwarf spheroidals.
First, the reality check. Previously we calibrated the BTFR using galaxies with distances measured by reliable methods like Cepheids and the Tip of the Red Giant Branch (TRGB) method. Application of this calibration obtains the Hubble constant H0 = 75.1 +/- 2.3 km/s/Mpc, which is consistent with other local measurements but in tension with the value obtained from fitting the Planck CMB data. All of the calibrator galaxies are nearby (most are within 10 Mpc, which is close by extragalactic standards), but none of them are in the Local Group (galaxies within ~1 Mpc like Andromeda and M33). The distances to Local Group galaxies are pretty well known at this point, so if we got the BTFR calibration right, they had better fall right on it.
They do. From high to low mass, the circles in the plot below are Andromeda, the Milky Way, M33, the LMC, SMC, and NGC 6822. All fall on the externally calibrated BTFR, which extrapolates well to still lower mass dwarf galaxies like WLM, DDO 210, and DDO 216 (and even Leo P, the smallest rotating galaxy known).
The agreement of the BTFR with Local Group rotators is so good that it is tempting to say that there is no way to reconcile this with a low Hubble constant of 67 km/s/Mpc. Doing so would require all of these galaxies to be more distant by the factor 75/67 = 1.11. That doesn’t sound too bad, but applying it means that Andromeda would have to be 875 kpc distant rather than the 785 ± 25 adopted by the source of our M31 data, Chemin et al. There is a long history of distance measurements to M31 so many opinions can be found, but it isn’t just M31 – all of the Local Group galaxy distances would have to be off by this factor. This seems unlikely to the point of absurdity, but as colleague and collaborator Jim Schombert reminds me, we’ve seen such things before with the distance scale.
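The arithmetic is easy to check. This sketch assumes only that the Hubble constant returned by the calibration scales inversely with the calibrator distances; the function name is invented for illustration.

```python
def rescaled_distance(distance_kpc, h0_calibrated, h0_target):
    """Distance needed for the same calibration to return h0_target,
    given that inferred H0 scales inversely with calibrator distance."""
    return distance_kpc * h0_calibrated / h0_target

factor = 75.0 / 67.0
d_m31 = rescaled_distance(785.0, 75.0, 67.0)
print(f"stretch factor = {factor:.3f}")
print(f"M31 would need to be at about {d_m31:.0f} kpc instead of 785 kpc")
```

The figures in the text are rounded; the unrounded factor is close to 1.12, which puts M31 within a few kpc of the distance quoted above and well outside the ± 25 kpc on the adopted value.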
So that’s the reality check: the BTFR works as it should in the Local Group – at least for the rotating galaxies (circles in the plot above). What about the pressure supported galaxies (the squares)?
Galaxies come in two basic kinematic types: rotating disks or pressure supported ellipticals. Disks are generally thin, with most of the stars orbiting in the same direction in the same plane on nearly circular orbits. Ellipticals are quasi-spherical blobs of stars on rather eccentric orbits oriented all over the place. This is an oversimplification, of course; real galaxies have a mix of orbits, but usually most of the kinetic energy is invested in one or the other, rotation or random motions. We can measure the speeds of stars and gas in these configurations, which provides information about the kinetic energy and corresponding gravitational binding energy. That’s how we get at the gravitational potential and infer the need for dark matter – or at least, the existence of acceleration discrepancies.
We would like to have full 6D phase space information for all stars – their location in 3D configuration space and their momentum in each direction. In practice, usually all we can measure is the Doppler line-of-sight speed. For rotating galaxies, we can [attempt to] correct the observed velocity for the inclination of the disk, and get an idea of the in-plane rotation speed. For ellipticals, we get the velocity dispersion along the line of sight in whatever orientation we happen to get. If the orbits are isotropic, then one direction of view is as good as any other. In general that need not be the case, but it is hard to constrain the anisotropy of orbits, so usually we assume isotropy and call it Close Enough for Astronomy.
For isotropic orbits, the velocity dispersion σ* is related to the circular velocity Vc of a test particle by Vc = √3 σ*. The square root of three appears because the kinetic energy of isotropic orbits is evenly divided among the three cardinal directions. These quantities depend in a straightforward way on the gravitational potential, which can be computed for the stuff we can see but not for that which we can’t. The stars tend to dominate the potential at small radii in bright galaxies. This is a complication we’ll ignore here by focusing on the outskirts of rotating galaxies where rotation curves are flat and dwarf spheroidals where stars never dominate. In both cases, we are in a limit where we can neglect the details of the stellar distribution: only the dark mass matters, or, in the case of MOND, only the total normal mass but not its detailed distribution (which does matter for the shape of a rotation curve, but not its flat amplitude).
Rather than worry about theory or the gory details of phase space, let’s just ask the data. How do we compare apples with apples? What is the factor βc that makes Vo = βc σ* an equality?
One notices that the data for pressure supported dwarfs nicely parallel those for rotating galaxies. We estimate βc by finding the shift that puts the dwarf spheroidals on the BTFR (on average). We only do this for the dwarfs that are not obviously affected by tidal effects, since the velocity dispersions of tidally disturbed dwarfs may not reflect the equilibrium gravitational potential. I have discussed this at great length in McGaugh & Wolf, so I refer the reader eager for more details there. Here I merely note that the exercise is meaningful only for those dwarfs that parallel the BTFR; it can’t apply to those that don’t regardless of the reason.
That caveat aside, this works quite well for βc = 2.
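The idea of finding the shift can be sketched in a few lines. This is a toy version, not the procedure in the paper: it assumes a BTFR of the form M = A V^4 with an illustrative normalization, and it uses made-up dwarfs constructed to scatter around βc = 2 so that the estimator has a known answer to recover.

```python
import math

A_BTFR = 50.0  # Msun / (km/s)^4; assumed normalization, for illustration only

def btfr_velocity(m_bar):
    """Flat rotation speed (km/s) that a V^4 relation assigns to a mass."""
    return (m_bar / A_BTFR) ** 0.25

def estimate_beta_c(dwarfs):
    """Average shift, in log space, that moves (mass, sigma) pairs onto the relation."""
    logs = [math.log10(btfr_velocity(m) / sigma) for m, sigma in dwarfs]
    return 10.0 ** (sum(logs) / len(logs))

# Made-up dwarfs: (baryonic mass in Msun, line-of-sight dispersion in km/s).
# They are constructed to scatter around beta_c = 2, not taken from any catalog.
mock = [(A_BTFR * (2.0 * s * f) ** 4, s)
        for s, f in [(6.0, 0.9), (8.0, 1.0), (9.5, 1.0), (11.0, 1.0 / 0.9)]]

print(f"recovered beta_c = {estimate_beta_c(mock):.2f}")
```

Averaging in log space treats a dwarf that sits a given factor above the relation the same as one that sits the same factor below it, which seems the natural choice for a multiplicative shift.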
The numerically inclined reader will note that 2 > √3. One would expect the latter for isotropic orbits, which we implicitly average over by using the data for all these dwarfs together. So the likely explanation for the larger value of βc is that the outer velocities of rotation curves are measured at larger radii than the velocity dispersions of dwarf spheroidals. The value of βc accounts for the different effective radii of measurement as illustrated by the rotation curves below.
Once said, this seems obvious. The velocity dispersions of dwarf spheroidals are measured by observing the Doppler shifts of individual member stars. This measurement is necessarily made where the stars are. In contrast, the flat portions of rotation curves are traced by atomic gas at radii that typically extend beyond the edge of the optical disk. So we should expect a difference; βc = 2 quantifies it.
One small caveat is that in order to compare apples with apples, we have to adopt a mass-to-light ratio for the stars in dwarf spheroidals in order to compare them with the combined mass of stars and gas in rotating galaxies. Indeed, the dwarf irregulars that overlap with the dwarf spheroidals in mass are made more of gas than stars, so there is always the risk of some systematic difference between the two mass scales. In the paper, we quantify the variation of βc with the choice of M*/L. If you’re interested in that level of detail, you should read the paper.
I should also note that MOND predicts βc = 2.12. Taken at face value, this implies that MOND prefers an average mass-to-light ratio slightly higher than what we assumed. This is well within the uncertainties, and we already know that MOND is the only theory capable of predicting the velocity dispersions of dwarf spheroidals in advance. We can always explain this after the fact with dark matter, which is what people generally do, often in apparent ignorance that MOND also correctly predicts which dwarfs they’ll have to invoke tidal disruption for. How such models can be considered satisfactory is quite beyond my capacity, but it does save one from the pain of having to critically reassess one’s belief system.
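A back-of-the-envelope comparison of the three values of βc mentioned above. The dispersion is hypothetical, and the conversion to a mass factor assumes, for illustration only, that mass scales as V^4 along the BTFR; this is not the calculation in the paper.

```python
import math

beta_isotropic = math.sqrt(3.0)  # isotropic orbits, same radius
beta_empirical = 2.0             # shift found from the data
beta_mond = 2.12                 # MOND prediction quoted above

sigma = 9.0  # km/s, a hypothetical dwarf spheroidal dispersion
for name, beta in [("isotropic", beta_isotropic),
                   ("empirical", beta_empirical),
                   ("MOND", beta_mond)]:
    print(f"{name:9s}: beta_c = {beta:.2f} -> V_o = {beta * sigma:.1f} km/s")

# Assumption for illustration: mass scales as V^4 along the BTFR, so at fixed
# sigma a larger beta_c corresponds to a larger mass on the relation.
mass_factor = (beta_mond / beta_empirical) ** 4
print(f"mass factor for MOND vs empirical beta_c: {mass_factor:.2f}")
```

Under that assumption, the gap between 2 and 2.12 corresponds to roughly a quarter more mass, which is one way to read the remark about a slightly higher mass-to-light ratio.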
That’s all beyond the scope of the current paper. Here we just provide a nifty empirical result. If you want to make an apples-to-apples comparison of dwarf spheroidals with rotating dwarf irregulars, you will do well to assume Vo = 2σ*.
This title is an example of what has come to be called Betteridge’s law. This is a relatively recent name for an old phenomenon: if a title is posed as a question, the answer is no. This is especially true in science, whether the authors are conscious of it or not.
Pengfei Li completed his Ph.D. recently, fitting all manner of dark matter halos as well as the radial acceleration relation (RAR) to galaxies in the SPARC database. For the RAR, he found that galaxy data were consistent with a single, universal acceleration scale, g†. There is of course scatter in the data, but this appears to us to be consistent with what we expect from variation in the mass-to-light ratios of stars and the various uncertainties in the data.
This conclusion has been controversial despite being painfully obvious. I have my own law for data interpretation in astronomy:
Obvious results provoke opposition. The more obvious the result, the stronger the opposition.
The constancy of the acceleration scale is such a case. Where we do not believe we can distinguish between galaxies, others think they can – using our own data! Here it is worth contemplating what all is involved in building a database like SPARC – we were the ones who did the work, after all. In the case of the photometry, we observed the galaxies, we reduced the data, we cleaned the images of foreground contaminants (stars), we fit isophotes, we built mass models – that’s a very short version of what we did in order to be able to estimate the acceleration predicted by Newtonian gravity for the observed distribution of stars. That’s one axis of the RAR. The other is the observed acceleration, which comes from rotation curves, which require even more work. I will spare you the work flow; we did some galaxies ourselves, and took others from the literature in full appreciation of what we could and could not believe — which we have a deep appreciation for because we do the same kind of work ourselves. In contrast, the people claiming to find the opposite of what we find obtained the data by downloading it from our website. The only thing they do is the very last step in the analysis, making fits with Bayesian statistics the same as we do, but in manifest ignorance of the process by which the data came to be. This leads to an underappreciation of the uncertainty in the uncertainties.
This is another rule of thumb in science: outside groups are unlikely to discover important things that were overlooked by the group that did the original work. An example from about seven years ago was the putative 126 GeV line in Fermi satellite data. This was thought by some at the time to be evidence for dark matter annihilating into gamma rays with energy corresponding to the rest mass of the dark matter particles and their anti-particles. This would be a remarkable, Nobel-winning discovery, if true. Strange then that the claim was not made by the Fermi team themselves. Did outsiders beat them to the punch with their own data? It can happen: sometimes large collaborations can be slow to move on important results, wanting to vet everything carefully or warring internally over its meaning while outside investigators move more swiftly. But it can also be that the vetting shows that the exciting result is not credible.
I recall the 126 GeV line being a big deal. There was an entire session devoted to it at a conference I was scheduled to attend. Our time is valuable: I can’t go to every interesting conference, and don’t want to spend time on conferences that aren’t interesting. I was skeptical, simply because of the rule of thumb. I wrote the organizers, and asked if they really thought that this would still be a thing by the time the conference happened in a few months’ time. Some of them certainly thought so, so it went ahead. As it happened, it wasn’t. Not a single speaker who was scheduled to talk about the 126 GeV line actually did so. In a few short months, it had gone from an exciting result sure to win a Nobel prize to nada.
This happens all the time. Science isn’t as simple as a dry table of numbers and error bars. This is especially true in astronomy, where we are observing objects in the sky. It is never possible to do an ideal experiment in which one controls for all possible systematics: the universe is not a closed box in which we can control the conditions. Heck, we don’t even know what all the unknowns are. It is a big friggin’ universe.
The practical consequence of this is that the uncertainty in any astronomical measurement is almost always larger than its formal error bar. There are effects we can quantify and include appropriately in the error assessment. There are things we can not. We know they’re there, but that doesn’t mean we can put a meaningful number on them.
Indeed, the sociology of this has evolved over the course of my career. Back in the day, everybody understood these things, and took the stated errors with a grain of salt. If it was important to estimate the systematic uncertainty, it was common to estimate a wide band, in effect saying “I’m pretty sure it is in this range.” Nowadays, it has become common to split out terms for random and systematic error. This is helpful to the non-specialist, but it can also be misleading because, so stated, the confidence interval on the systematic looks like a 1 sigma error even though it is not likely to have a Gaussian distribution. Being 3 sigma off of the central value might be a lot more likely than this implies — or a lot less.
People have become more careful in making error estimates, which ironically has made matters worse. People seem to think that they can actually believe the error bars. Sometimes you can, but sometimes not. Many people don’t know how much salt to take it with, or realize that they should take it with a grain of salt at all. Worse, more and more folks come over from particle physics where extraordinary accuracy is the norm. They are completely unprepared to cope with astronomical data, or even fully process that the error bars may not be what they think they are. There is no appreciation for the uncertainties in the uncertainties, which is absolutely fundamental in astrophysics.
Consequently, one gets overly credulous analyses. In the case of the RAR, a number of papers have claimed that the acceleration scale isn’t constant. Not even remotely! Why do they make this claim?
Below is a histogram of raw acceleration scales from SPARC galaxies. In effect, they are claiming that they can tell the difference between galaxies in the tail on one side of the histogram from those on the opposite side. We don’t think we can, which is the more conservative claim. The width of the histogram is just the scatter that one expects from astronomical data, so the data are consistent with zero intrinsic scatter. That’s not to say that’s necessarily what Nature is doing: we can never measure zero scatter, so it is always conceivable that there is some intrinsic variation in the characteristic acceleration scale. All we can say is that if it is there, it is so small that we cannot yet resolve it.
Posed as a histogram like this, it is easy to see that there is a characteristic value – the peak – with some scatter around it. The entire issue is whether that scatter is due to real variation from galaxy to galaxy, or if it is just noise. One way to check this is to make quality cuts: in the plot above, the gray-striped histogram plots every available galaxy. The solid blue one makes some mild quality cuts, like knowing the distance to better than 20%. That matters, because the acceleration scale is a quantity that depends on distance – a notoriously difficult quantity to measure accurately in astronomy. When this quality cut is imposed, the width of the histogram shrinks. The better data make a tighter histogram – just as one would expect if the scatter is due to noise. If instead the scatter is a real, physical effect, it should, if anything, be more pronounced in the better data.
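The logic of the quality cut can be illustrated with a toy simulation. Everything here is invented: every mock galaxy shares exactly the same acceleration scale, and the noise is tied to the distance error by an arbitrary recipe. It is a sketch of the argument, not a model of SPARC.

```python
import random
import statistics

random.seed(1)
TRUE_LOG_G = 0.0  # every mock galaxy shares one acceleration scale (log units)

galaxies = []
for _ in range(2000):
    dist_err = random.uniform(0.05, 0.5)         # fractional distance error
    noise_dex = random.gauss(0.0, dist_err / 2)  # toy mapping of error to scatter
    galaxies.append((dist_err, TRUE_LOG_G + noise_dex))

all_vals = [g for _, g in galaxies]
good_vals = [g for err, g in galaxies if err < 0.2]  # distance known to 20%

print(f"all galaxies : scatter = {statistics.pstdev(all_vals):.3f} dex")
print(f"quality cut  : scatter = {statistics.pstdev(good_vals):.3f} dex")
```

With no intrinsic variation put in, the histogram still has width, and it narrows when the cut is applied. Real galaxy-to-galaxy variation would set a floor that no quality cut could get under.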
This should not be difficult to understand. And yet – other representations of the data give a different impression, like this one:
This figure tells a very different story. The characteristic acceleration does not just scatter around a universal value. There is a clear correlation from one end of the plot to the other. Indeed, it is a perfectly smooth transition, because “Galaxy” is the number of each galaxy ordered by the value of its acceleration, from lowest to highest. The axes are not independent; they represent identically the same quantity. It is a plot of x against x. If properly projected into a histogram, it would look like the one above.
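To see how a rank-ordered plot manufactures a trend, here is a minimal sketch. It assumes a hypothetical sample in which every galaxy has exactly the same true value and differs only by Gaussian measurement noise; the sample size, noise level, and seed are arbitrary choices for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical sample: one universal true value plus measurement noise.
true_value = 1.0
noise = 0.1
measured = [random.gauss(true_value, noise) for _ in range(150)]

# The "Galaxy" axis: each galaxy numbered in order of its measured value.
ranked = sorted(measured)

# The ranked sequence rises monotonically by construction, whatever the data.
is_monotonic = all(a <= b for a, b in zip(ranked, ranked[1:]))

print(is_monotonic)
print(ranked[0], ranked[-1])  # the two "discrepant" ends are just the noise tails
```

Plotting `ranked` against its index gives a smooth rising curve even though nothing varies from galaxy to galaxy in this toy sample.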
This is a terrible way to plot data. It makes it look like there is a correlation where there is none. Setting this aside, there is a potential issue with the most discrepant galaxies – those at either extreme. There are more points that are roughly 3 sigma from a constant value than there should be for a sample this size. If this is the right assessment of the uncertainty, then there is indeed some variation from galaxy to galaxy. Not much, but the galaxies at the left hand side of the plot are different from those on the right hand side.
But can we believe the formal uncertainties that inform this error analysis? If you’ve read this far, you will anticipate that the answer to this question obeys Betteridge’s law. No.
One of the reasons we can’t just assign confidence intervals and believe them like a common physicist is that there are other factors in the analysis – nuisance parameters in Bayesian verbiage – with which the acceleration scale covaries. That’s a fancy way of saying that if we turn one knob, it affects another. We assign priors to the nuisance parameters (e.g., the distance to each galaxy and its inclination) based on independent measurements. But there is still some room to slop around. The question is really what to believe at the end of the analysis. We don’t think we can distinguish the acceleration scale from one galaxy to another, but this other analysis says we should. So which is it?
It is easy at this point to devolve into accusations of picking priors to obtain a preconceived result. I don’t think anyone is doing that. But how to show it?
Pengfei had the brilliant idea to perform the same analysis as Marra et al., but allowing Newton’s constant to vary. This is Big G, a universal constant that’s been known to be a constant of nature for centuries. It surely does not vary. However, G appears in our equations, so we can test for variation therein. Pengfei did this, following the same procedure as Marra et al., and finds the same kind of graph – now for G instead of g+.
You see here the same kind of trend for Newton’s constant as one sees above for the acceleration scale. The same data have been analyzed in the same way. It has also been plotted in the same way, giving the impression of a correlation where there is none. The result is also the same: if we believe the formal uncertainties, the best-fit G is different for the galaxies at the left than from those to the right.
I’m pretty sure Newton’s constant does not vary this much. I’m entirely sure that the rotation curve data we analyze are not capable of making this determination. It would be absurd to claim so. The same absurdity extends to the acceleration scale g+. If we don’t believe the variation in G, there’s no reason to believe that in g+.
So what is going on here? It boils down to the errors on the rotation curves not representing the uncertainty in the circular velocity as we would like for them to. There are all sorts of reasons for this, observational, physical, and systematic. I’ve written about this at great length elsewhere, and I haven’t the patience to do so again here. It is turgidly technical to the extent that even the pros don’t read it. It boils down to the ancient, forgotten wisdom of astronomy: you have to take the errors with a grain of salt.
Here is the cumulative distribution (CDF) of reduced chi squared for the plot above.
Two things to notice here. First, the CDF looks the same regardless of whether we let Newton’s constant vary or not, or how we assign the Bayesian priors. There’s no value added in letting it vary – just as we found for the characteristic acceleration scale in the first place. Second, the reduced chi squared is rarely close to one. It should be! As a goodness of fit measure, one claims to have a good fit when chi squared is equal to one. The majority of these are not good fits! Rather than the gradual slope we see here, the CDF of chi squared should be a nearly straight vertical line. That’s nothing like what we see.
If one interprets this literally, there are many large chi squared values well in excess of unity. These are bad fits, and the model should be rejected. That’s exactly what Rodrigues et al. (2018) found, rejecting the constancy of the acceleration scale at 10 sigma. By their reasoning, we must also reject the constancy of Newton’s constant with the same high confidence. That’s just silly.
One strange thing: the people complaining that the acceleration scale is not constant are only testing that hypothesis. Their presumption is that if the data reject that, it falsifies MOND. The attitude is that this is an automatic win for dark matter. Is it? They don’t bother checking.
We do. We can do the same exercise with dark matter. We find the same result. The CDF looks the same; there are many galaxies with chi squared that is too large.
Having found the same result for dark matter halos that we found for the RAR, if we apply the same logic, then all proposed model halos are excluded. There are too many bad fits with overly large chi squared.
We have now ruled out all conceivable models. Dark matter is falsified. MOND is falsified. Nothing works. Look on these data, ye mighty, and despair.
But wait! Should we believe the error bars that lead to the end of all things? What would Betteridge say?
Here is the rotation curve of DDO 170 fit with the RAR. Look first at the left box, with the data (points) and the fit (red line). Then look at the fit parameters in the right box.
Looking at the left panel, this is a good fit. The line representing the model provides a reasonable depiction of the data.
Looking at the right panel, this is a terrible fit. The reduced chi squared is 4.9. That’s a lot larger than one! The model is rejected with high confidence.
Well, which is it? Lots of people fall into the trap of blindly trusting statistical tests like chi squared. Statistics can only help your brain. They can’t replace it. Trust your eye-brain. This is a good fit. Chi squared is overly large not because this is a bad model but because the error bars are too small. The absolute amount by which the data “miss” is just a few km/s. This is not much by the standards of galaxies, and could easily be explained by a small departure of the tracer from a purely circular orbit – a physical effect we expect at that level. Or it could simply be that the errors are underestimated. Either way, it isn’t a big deal. It would be incredibly naive to take chi squared at face value.
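The arithmetic behind this is simple enough to sketch. The rotation curve below is invented for illustration (it is not DDO 170), with a model that misses every point by 2 km/s; the only thing that changes between the two evaluations is the size of the error bars.

```python
def reduced_chi2(data, model, errors, n_params):
    """Chi squared per degree of freedom for a fit with n_params free parameters."""
    dof = len(data) - n_params
    chi2 = sum(((d - m) / e) ** 2 for d, m, e in zip(data, model, errors))
    return chi2 / dof

# Hypothetical velocities in km/s; the data miss the model by 2 km/s everywhere.
model = [30.0, 40.0, 45.0, 48.0, 50.0, 50.0]
data = [32.0, 38.0, 47.0, 46.0, 52.0, 48.0]

quoted = [1.0] * 6                     # formal errors of 1 km/s
realistic = [2.0 * e for e in quoted]  # the same errors, doubled

tight = reduced_chi2(data, model, quoted, n_params=1)
loose = reduced_chi2(data, model, realistic, n_params=1)

print(tight)  # 4.8: "rejected with high confidence"
print(loose)  # 1.2: a perfectly acceptable fit
```

Same data, same model, same 2 km/s misses. Doubling the error bars divides chi squared by four, which is all it takes to turn a rejected model into a good one.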
If you want to see a dozen plots like this for all the various models fit to each of over a hundred galaxies, see Li et al. (2020). The bottom line is always the same. The same galaxies are poorly fit by any model — dark matter or MOND. Chi squared is too big not because all conceivable models are wrong, but because the formal errors are underestimated in many cases.
This comes as no surprise to anyone with experience working with astronomical data. We can work to improve the data and the error estimation – see, for example, Sellwood et al (2021). But we can’t blindly turn the crank on some statistical black box and expect all the secrets of the universe to tumble out onto a silver platter for our delectation. There’s a little more to it than that.
This Thanksgiving, I’d highlight something positive. Recently, Bob Sanders wrote a paper pointing out that gas rich galaxies are strong tests of MOND. The usual fit parameter, the stellar mass-to-light ratio, is effectively negligible when gas dominates. The MOND prediction follows straight from the gas distribution, for which there is no equivalent freedom. We understand the 21 cm spin-flip transition well enough to relate observed flux directly to gas mass.
In any human endeavor, there are inevitably unsung heroes who carry enormous amounts of water but seem to get no credit for it. Sanders is one of those heroes when it comes to the missing mass problem. He was there at the beginning, and has a valuable perspective on how we got to where we are. I highly recommend his books, The Dark Matter Problem: A Historical Perspective and Deconstructing Cosmology.
In bright spiral galaxies, stars are usually 80% or so of the mass, gas only 20% or less. But in many dwarf galaxies, the mass ratio is reversed. These are often low surface brightness and challenging to observe. But it is a worthwhile endeavor, as their rotation curve is predicted by MOND with extraordinarily little freedom.
Though gas rich galaxies do indeed provide an excellent test of MOND, nothing in astronomy is perfectly clean. The stellar mass-to-light ratio is an irreducible need-to-know parameter. We also need to know the distance to each galaxy, as we do not measure the gas mass directly, but rather the flux of the 21 cm line. The gas mass scales with flux and the square of the distance (see equation 7E7), so to get the gas mass right, we must first get the distance right. We also need to know the inclination of a galaxy as projected on the sky in order to get the rotation to which we’re fitting right, as the observed line of sight Doppler velocity is only sin(i) of the full, in-plane rotation speed. The 1/sin(i) correction becomes increasingly sensitive to errors as i approaches zero (face-on galaxies).
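A rough sketch of how the two nuisance parameters propagate follows. The gas mass uses the standard optically thin 21 cm relation with its usual coefficient of about 2.36e5 (solar masses per Jy km/s per Mpc squared), which is an assumption here, since the equation the text points to is not reproduced in this post. The flux, distances, and inclinations are hypothetical.

```python
import math

def hi_mass_msun(flux_jy_kms, distance_mpc):
    """Atomic gas mass in solar masses from the integrated 21 cm flux,
    assuming optically thin emission (standard coefficient, not from this post)."""
    return 2.36e5 * distance_mpc ** 2 * flux_jy_kms

def deprojected_speed(v_los_kms, inclination_deg):
    """In-plane rotation speed from the line-of-sight speed: V = V_los / sin(i)."""
    return v_los_kms / math.sin(math.radians(inclination_deg))

# A 10% error in distance becomes a 21% error in gas mass.
flux = 100.0  # Jy km/s, hypothetical
mass_ratio = hi_mass_msun(flux, 4.4) / hi_mass_msun(flux, 4.0)

def speed_change(i_deg, error_deg=5.0):
    """Factor by which the rotation speed changes if i is overestimated by error_deg."""
    return deprojected_speed(1.0, i_deg - error_deg) / deprojected_speed(1.0, i_deg)

inclined = speed_change(70.0)  # a few percent for a well-inclined galaxy
face_on = speed_change(25.0)   # over 20% for a nearly face-on one

print(mass_ratio, inclined, face_on)
```

The same five degree error is a minor annoyance at 70 degrees and a serious problem at 25 degrees.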
The mass-to-light ratio is a physical fit parameter that tells us something meaningful about the amount of stellar mass that produces the observed light. In contrast, for our purposes here, distance and inclination are “nuisance” parameters. These nuisance parameters can be, and generally are, measured independently from mass modeling. However, these measurements have their own uncertainties, so one has to be careful about taking these measured values as-is. One of the powerful aspects of Bayesian analysis is the ability to account for these uncertainties to allow for the distance to be a bit off the measured value, so long as it is not too far off, as quantified by the measurement uncertainties. This is what current graduate student Pengfei Li did in Li et al. (2018). The constraints on MOND are so strong in gas rich galaxies that often the nuisance parameters cannot be ignored, even when they’re well measured.
To illustrate what I’m talking about, let’s look at one famous example, DDO 154. This galaxy is over 90% gas. The stars (pictured above) just don’t matter much. If the distance and inclination are known, the MOND prediction for the rotation curve follows directly. Here is an example of a MOND fit from a recent paper:
This is terrible! The MOND fit – essentially a parameter-free prediction – misses all of the data. MOND is falsified. If one is inclined to hate MOND, as many seem to be, then one stops here. No need to think further.
If one is familiar with the ups and downs in the history of astronomy, one might not be so quick to dismiss it. Indeed, one might notice that the shape of the MOND prediction closely tracks the shape of the data. There’s just a little difference in scale. That’s kind of amazing for a theory that is wrong, especially when it is amplifying the green line to predict the red one: it needn’t have come anywhere close.
Here is the fit to the same galaxy using the same data [already] published in Li et al.:
Now we have a good fit, using the same data! How can this be so?
I have not checked what Ren et al. did to obtain their MOND fits, but having done this exercise myself many times, I recognize the slight offset they find as a typical consequence of holding the nuisance parameters fixed. What if the measured distance is a little off?
Distance estimates to DDO 154 in the literature range from 3.02 Mpc to 6.17 Mpc. The formally most accurate distance measurement is 4.04 ± 0.08 Mpc. In the fit shown here, we obtained 3.87 ± 0.16 Mpc. The error bars on these distances overlap, so they are the same number, to measurement accuracy. These data do not falsify MOND. They demonstrate that it is sensitive enough to tell the difference between 3.8 and 4.1 Mpc.
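As a quick check on "the same number, to measurement accuracy," one can compare the two distances quoted above. This sketch assumes the two uncertainties are independent and Gaussian, so that they add in quadrature.

```python
import math

# Distances to DDO 154 quoted in the text, in Mpc.
d_measured, err_measured = 4.04, 0.08  # formally most accurate measurement
d_fit, err_fit = 3.87, 0.16            # value obtained in the fit

difference = d_measured - d_fit
combined_error = math.hypot(err_measured, err_fit)  # errors added in quadrature
n_sigma = difference / combined_error

print(round(n_sigma, 2))  # 0.95
```

The two values differ by slightly less than one combined standard deviation.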
One will never notice this from a dark matter fit. Ren et al. also make fits with self-interacting dark matter (SIDM). The nifty thing about SIDM is that it makes quasi-constant density cores in dark matter halos. Halos of this form are not predicted by “ordinary” cold dark matter (CDM), but often give better fits than either MOND or the NFW halos of dark matter-only CDM simulations. For this galaxy, Ren et al. obtain the following SIDM fit.
This is a great fit. Goes right through the data. That makes it better, right?
Not necessarily. In addition to the mass-to-light ratio (and the nuisance parameters of distance and inclination), dark matter halo fits have [at least] two additional free parameters to describe the dark matter halo, such as its mass and core radius. These parameters are highly degenerate – one can obtain equally good fits for a range of mass-to-light ratios and core radii: one makes up for what the other misses. Parameter degeneracy of this sort is usually a sign that there is too much freedom in the model. In this case, the data are adequately described by one parameter (the MOND fit M*/L, not counting the nuisances in common), so using three (M*/L, Mhalo, Rcore) is just an exercise in fitting a French curve. There is ample freedom to fit the data. As a consequence, you’ll never notice that one of the nuisance parameters might be a tiny bit off.
In other words, you can fool a dark matter fit, but not MOND. Erwin de Blok and I demonstrated this 20 years ago. A common myth at that time was that “MOND is guaranteed to fit rotation curves.” This seemed patently absurd to me, given how it works: once you stipulate the distribution of baryons, the rotation curve follows from a simple formula. If the two don’t match, they don’t match. There is no guarantee that it’ll work. Instead, it can’t be forced.
As an illustration, Erwin and I tried to trick it. We took two galaxies that are identical in the Tully-Fisher plane (NGC 2403 and UGC 128) and swapped their mass distribution and rotation curve. These galaxies have the same total mass and the same flat velocity in the outer part of the rotation curve, but the detailed distribution of their baryons differs. If MOND can be fooled, this closely matched pair ought to do the trick. It does not.
Our failure to trick MOND should not surprise anyone who bothers to look at the math involved. There is a one-to-one relation between the distribution of the baryons and the resulting rotation curve. If there is a mismatch between them, a fit cannot be obtained.
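For concreteness, here is a sketch of one such one-to-one mapping. It assumes the fitting function commonly used for the radial acceleration relation, g_obs = g_bar / (1 - exp(-sqrt(g_bar/g+))), with an acceleration scale of 1.2e-10 m/s^2; neither the functional form nor the value is given in this post, and the fits discussed here may differ in detail. The helper names and the sample radius and speed are hypothetical.

```python
import math

G_PLUS = 1.2e-10   # m/s^2, assumed value of the acceleration scale g+
KPC_M = 3.0857e19  # metres per kiloparsec

def g_obs_from_g_bar(g_bar):
    """Observed acceleration implied by the baryonic one (assumed RAR form)."""
    x = math.sqrt(g_bar / G_PLUS)
    return g_bar / (-math.expm1(-x))  # -expm1(-x) equals 1 - exp(-x)

def v_obs_kms(v_bar_kms, r_kpc):
    """Predicted rotation speed from the Newtonian baryonic speed at radius r."""
    r_m = r_kpc * KPC_M
    g_bar = (v_bar_kms * 1e3) ** 2 / r_m
    return math.sqrt(g_obs_from_g_bar(g_bar) * r_m) / 1e3

# High accelerations recover Newton; low ones approach sqrt(g_bar * g+).
high = g_obs_from_g_bar(1e-7) / 1e-7
low = g_obs_from_g_bar(1e-14) / math.sqrt(1e-14 * G_PLUS)

print(high, low)
print(v_obs_kms(20.0, 5.0))  # hypothetical: baryons alone give 20 km/s at 5 kpc
```

There is no knob in `v_obs_kms` apart from the baryonic speed and the radius. Once the baryon distribution is stipulated, the prediction either matches the data or it does not.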
We also attempted to play this same trick on dark matter. The standard dark matter halo fitting function at the time was the pseudo-isothermal halo, which has a constant density core. It is very similar to the halos of SIDM and to the cored dark matter halos produced by baryonic feedback in some simulations. Indeed, that is the point of those efforts: they are trying to capture the success of cored dark matter halos in fitting rotation curve data.
Dark matter halos with a quasi-constant density core do indeed provide good fits to rotation curves. Too good. They are easily fooled, because they have too many degrees of freedom. They will fit pretty much any plausible data that you throw at them. This is why the SIDM fit to DDO 154 failed to flag distance as a potential nuisance. It can’t. You could double (or halve) the distance and still find a good fit.
This is why parameter degeneracy is bad. You get lost in parameter space. Once lost there, it becomes impossible to distinguish between successful, physically meaningful fits and fitting epicycles.
Astronomical data are always subject to improvement. For example, the THINGS project obtained excellent data for a sample of nearby galaxies. I made MOND fits to all the THINGS (and other) data for the MOND review Famaey & McGaugh (2012). Here’s the residual diagram, which has been on my web page for many years:
These are, by and large, good fits. The residuals have a well defined peak centered on zero. DDO 154 was one of the THINGS galaxies; let’s see what happens if we use those data.
The first thing one is likely to notice is that the THINGS data are much better resolved than the previous generation used above. The first thing I noticed was that THINGS had assumed a distance of 4.3 Mpc. This was prior to the measurement of 4.04, so let’s just start over from there. That gives the MOND prediction shown above.
And it is a prediction. I haven’t adjusted any parameters yet. The mass-to-light ratio is set to the mean I expect for a star forming stellar population, 0.5 in solar units in the Spitzer 3.6 micron band. D=4.04 Mpc and i=66 as tabulated by THINGS. The result is pretty good considering that no parameters have been harmed in the making of this plot. Nevertheless, MOND overshoots a bit at large radii.
Constraining the inclinations for gas rich dwarf galaxies like DDO 154 is a bit of a nightmare. Literature values range from 20 to 70 degrees. Seriously. THINGS itself allows the inclination to vary with radius; 66 is just a typical value. Looking at the fit Pengfei obtained, i=61. Let’s try that.
The fit is now satisfactory. One tweak to the inclination, and we’re done. This tweak isn’t even a fit to these data; it was adopted from Pengfei’s fit to the above data. This tweak to the inclination is comfortably within any plausible assessment of the uncertainty in this quantity. The change in sin(i) corresponds to a mere 4% in velocity. I could probably do a tiny bit better with further adjustment – I have left both the distance and the mass-to-light ratio fixed – but that would be a meaningless exercise in statistical masturbation. The result just falls out: no muss, no fuss.
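The size of that tweak is easy to verify, since the deprojected rotation speed scales as 1/sin(i). A minimal check, using only the two inclinations quoted above:

```python
import math

i_things = 66.0  # degrees, as tabulated by THINGS
i_fit = 61.0     # degrees, from Pengfei's fit

# Rotation speed scales as 1/sin(i), so the ratio of sines is the velocity change.
velocity_ratio = math.sin(math.radians(i_things)) / math.sin(math.radians(i_fit))

print(round(100 * (velocity_ratio - 1), 1))  # about 4.5 percent
```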
Hence the point Bob Sanders makes. Given the distribution of gas, the rotation curve follows. And it works, over and over and over, within the bounds of the uncertainties on the nuisance parameters.
One cannot do the same exercise with dark matter. It has ample ability to fit rotation curve data, once those are provided, but zero power to predict it. If all had been well with ΛCDM, the rotation curves of these galaxies would look like NFW halos. Or any number of other permutations that have been discussed over the years. In contrast, MOND makes one unique prediction (that was not at all anticipated in dark matter), and that’s what the data do. Out of the huge parameter space of plausible outcomes from the messy hierarchical formation of galaxies in ΛCDM, Nature picks the one that looks exactly like MOND.
It is a bad sign for a theory when it can only survive by mimicking its alternative. This is the case here: ΛCDM must imitate MOND. There are now many papers asserting that it can do just this, but none of those were written before the data were provided. Indeed, I consider it to be problematic that clever people can come up with ways to imitate MOND with dark matter. What couldn’t it imitate? If the data had all looked like technicolor space donkeys, we could probably find a way to make that so as well.
Cosmologists will rush to say “microwave background!” I have some sympathy for that, because I do not know how to explain the microwave background in a MOND-like theory. At least I don’t pretend to, even if I had more predictive success there than their entire community. But that would be a much longer post.
For now, note that the situation is even worse for dark matter than I have so far made it sound. In many dwarf galaxies, the rotation velocity exceeds that attributable to the baryons (with Newton alone) at practically all radii. By a lot. DDO 154 is a very dark matter dominated galaxy. The baryons should have squat to say about the dynamics. And yet, all you need to know to predict the dynamics is the baryon distribution. The baryonic tail wags the dark matter dog.
But wait, it gets better! If you look closely at the data, you will note a kink at about 1 kpc, another at 2, and yet another around 5 kpc. These kinks are apparent in both the rotation curve and the gas distribution. This is an example of Sancisi’s Law: “For any feature in the luminosity profile there is a corresponding feature in the rotation curve and vice versa.” This is a general rule, as Sancisi observed, but it makes no sense when the dark matter dominates. The features in the baryon distribution should not be reflected in the rotation curve.
The observed baryons orbit in a disk with nearly circular orbits confined to the same plane. The dark matter moves on eccentric orbits oriented every which way to provide pressure support to a quasi-spherical halo. The baryonic and dark matter occupy very different regions of phase space, the six dimensional volume of position and momentum. The two are not strongly coupled, communicating only by the weak force of gravity in the standard CDM paradigm.
One of the first lessons of galaxy dynamics is that galaxy disks are subject to a variety of instabilities that grow bars and spiral arms. These are driven by disk self-gravity. The same features do not appear in elliptical galaxies because they are pressure supported, 3D blobs. They don’t have disks so they don’t have disk self-gravity, much less the features that lead to the bumps and wiggles observed in rotation curves.
Elliptical galaxies are a good visual analog for what dark matter halos are believed to be like. The orbits of dark matter particles are unable to sustain features like those seen in baryonic disks. They are featureless for the same reasons as elliptical galaxies. They don’t have disks. A rotation curve dominated by a spherical dark matter halo should bear no trace of the features that are seen in the disk. And yet they’re there, often enough for Sancisi to have remarked on it as a general rule.
It gets worse still. One of the original motivations for invoking dark matter was to stabilize galactic disks: a purely Newtonian disk of stars is not a stable configuration, yet the universe is chock full of long-lived spiral galaxies. The cure was to place them in dark matter halos.
The problem for dwarfs is that they have too much dark matter. The halo stabilizes disks by suppressing the formation of structures that stem from disk self-gravity. But you need some disk self-gravity to have the observed features. That can be tuned to work in bright spirals, but it fails in dwarfs because the halo is too massive. As a practical matter, there is no disk self-gravity in dwarfs – it is all halo, all the time. And yet, we do see such features. Not as strong as in big, bright spirals, but definitely present. Whenever someone tries to analyze this aspect of the problem, they inevitably come up with a requirement for more disk self-gravity in the form of unphysically high stellar mass-to-light ratios (something I predicted would happen). In contrast, this is entirely natural in MOND (see, e.g., Brada & Milgrom 1999 and Tiret & Combes 2008), where it is all disk self-gravity since there is no dark matter halo.
The net upshot of all this is that it doesn’t suffice to mimic the radial acceleration relation as many simulations now claim to do. That was not a natural part of CDM to begin with, but perhaps it can be done with smooth model galaxies. In most cases, such models lack the resolution to see the features seen in DDO 154 (and in NGC 1560 and in IC 2574, etc.). If they attain such resolution, they had better not show such features, as that would violate some basic considerations. But then they wouldn’t be able to describe this aspect of the data.
Simulators by and large seem to remain sanguine that this will all work out. Perhaps I have become too cynical, but I recall hearing that 20 years ago. And 15. And ten… basically, they’ve always assured me that it will work out even though it never has. Maybe tomorrow will be different. Or would that be the definition of insanity?
The Milky Way Galaxy in which we live seems to be a normal spiral galaxy. But it can be hard to tell. Our perspective from within it precludes a “face-on” view like the picture above, which combines some real data with a lot of artistic liberty. Some local details we can measure in extraordinary detail, but the big picture is hard. Just how big is the Milky Way? The absolute scale of our Galaxy has always been challenging to measure accurately from our spot within it.
For some time, we have had a remarkably accurate measurement of the angular speed of the sun around the center of the Galaxy provided by the proper motion of Sagittarius A*. Sgr A* is the radio source associated with the supermassive black hole at the center of the Galaxy. By watching how it appears to move across the sky, Reid & Brunthaler found our relative angular speed to be 6.379 milliarcseconds/year. That’s a pretty amazing measurement: a milliarcsecond is one one-thousandth of one arcsecond, which is one sixtieth of one arcminute, which is one sixtieth of a degree. A pretty small angle.
The proper motion of an object depends on the ratio of its speed to its distance. So this high precision measurement does not itself tell us how big the Milky Way is. We could be far from the center and moving fast, or close and moving slow. Close being a relative term when our best estimates of the distance to the Galactic center hover around 8 kpc (26,000 light-years), give or take half a kpc.
This situation has recently improved dramatically thanks to the Gravity collaboration. They have observed the close passage of a star (S2) past the central supermassive black hole Sgr A*. Their chief interest is in the resulting relativistic effects: gravitational redshift and Schwarzschild precession, which provide a test of General Relativity. Unsurprisingly, it passes with flying colors.
As a consequence of their fitting process, we get for free some other interesting numbers. The mass of the central black hole is 4.1 million solar masses, and the distance to it is 8.122 kpc. The quoted uncertainty is only 31 pc. That’s parsecs, not kiloparsecs. Previously, I had seen credible claims that the distance to the Galactic center was 7.5 kpc. Or 7.9. Or 8.3. Or 8.5. There was a time when it was commonly thought to be about 10 kpc, i.e., we weren’t even sure what column the first digit belonged in. Now we know it to several decimal places. Amazing.
Knowing both the Galactocentric distance and the proper motion of Sgr A* nails down the relative speed of the sun: 245.6 km/s. Of this, 12.2 km/s is “solar motion,” which is how much the sun deviates from a circular orbit. Correcting for this gives us the circular speed of an imaginary test particle orbiting at the sun’s location: 233.3 km/s, accurate to 1.4 km/s.
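The conversion from proper motion to speed can be sketched in a few lines. The factor 4.74047 is the standard conversion (one astronomical unit per year, in km/s), which makes a proper motion in milliarcseconds per year times a distance in kpc come out in km/s. Treating the solar motion as a simple scalar subtraction is an assumption of this sketch.

```python
# 1 AU/yr expressed in km/s; equivalently km/s per (mas/yr * kpc).
K = 4.74047

mu_sgr_a = 6.379     # mas/yr, proper motion of Sgr A* quoted above
r0_kpc = 8.122       # kpc, distance to the Galactic center
solar_motion = 12.2  # km/s, the sun's deviation from a circular orbit

v_sun = K * mu_sgr_a * r0_kpc    # speed of the sun relative to Sgr A*
v_circ = v_sun - solar_motion    # circular speed at the solar radius

# The 31 pc distance uncertainty alone contributes just under 1 km/s.
dv_from_distance = K * mu_sgr_a * 0.031

print(round(v_sun, 1), round(v_circ, 1), round(dv_from_distance, 2))
```

This reproduces the 245.6 km/s quoted above. The plain subtraction lands within about 0.1 km/s of the quoted 233.3 km/s, a difference that presumably reflects rounding of the inputs.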
The distance and circular speed at the solar circle are the long sought Galactic Constants. These specify the scale of the Milky Way. Knowing them also pins down the rotation curve interior to the sun. This is well constrained by the “terminal velocities,” which provide a precise mapping of relative speeds, but need the Galactic Constants for an absolute scale.
A few years ago, I built a model Milky Way rotation curve that fit the terminal velocity data. What I was interested in then was to see if I could use the radial acceleration relation (RAR) to infer the mass distribution of the Galactic disk. The answer was yes. Indeed, it makes for a clear improvement over the traditional approach of assuming a purely exponential disk in the sense that the kinematically inferred bumps and wiggles in the rotation curve correspond to spiral arms known from star counts, as in external spiral galaxies.
Now that the Galactic constants are Known, it seems worth updating the model. This results in the surface density profile
with the corresponding rotation curve
The model data are available from the Milky Way section of my model pages.
Finding a model that matches both the terminal velocity and the highly accurate Galactic constants is no small feat. Indeed, I worried it was impossible: the speed at the solar circle is down to 233 km/s from a high of 249 km/s just a couple of kpc interior. This sort of variation is possible, but it requires a ring of mass outside the sun. This appears to be the effect of the Perseus spiral arm.
For the new Galactic constants and the current calibration of the RAR, the stellar mass of the Milky Way works out to just under 62 billion solar masses. The largest uncertainty in this is from the asymmetry in the terminal velocities, which are slightly different in the first and fourth quadrants. This is likely a real asymmetry in the mass distribution of the Milky Way. Treating it as an uncertainty, the range of variation corresponds to about 5% up or down in stellar mass.
With the stellar mass determined in this way, we can estimate the local density of dark matter. This is the critical number that is needed for experimental searches: just how much of the stuff should we expect? The answer is very precise: 0.257 GeV per cubic cm. This is a bit less than is usually assumed, which makes it a tiny bit harder on the hard-working experimentalists.
The accuracy of the dark matter density is harder to assess. The biggest uncertainty is that in stellar mass. We know the total radial force very well now, but how much is due to stars, and how much to dark matter (or whatever)? The RAR provides a unique method for constraining the stellar contribution, and does so well enough that there is very little formal uncertainty in the dark matter density. This, however, depends on the calibration of the RAR, which itself is subject to systematic uncertainty at the 20% level. This is not as bad as it sounds, because a recalibration of the RAR changes its shape in a way that tends to trade off with stellar mass while not much changing the implied dark matter density. So even with these caveats, this is the most accurate measure of the dark matter density to date.
This is all about the radial force. One can also measure the force perpendicular to the disk. This vertical force implies about twice the dark matter density. This may be telling us something about the shape of the dark matter halo – rather than being spherical as usually assumed, it might be somewhat squashed. It is easy to say that, but it seems a strange circumstance: the stars provide most of the restoring force in the vertical direction, and apparently dominate the radial force. Subtracting off the stellar contribution is thus a challenging task: the total force isn’t much greater than that from the stars alone. Subtracting one big number from another to measure a small one is fraught with peril: the uncertainties tend to blow up in your face.
Returning to the Milky Way, it seems in all respects to be a normal spiral galaxy. With the stellar mass found here, we can compare it to other galaxies in scaling relations like Tully-Fisher. It does not stand out from the crowd: our home is a fairly normal place for this time in the Universe.
It is possible to address many more details with a model like this. See the original!
As soon as I wrote it, I realized that the title is much more general than anything that can be fit in a blog post. Bekenstein argued long ago that the missing mass problem should instead be called the acceleration discrepancy, because that’s what it is – a discrepancy that occurs in conventional dynamics at a particular acceleration scale. So in that sense, it is the entire history of dark matter. For that, I recommend the excellent book The Dark Matter Problem: A Historical Perspective by Bob Sanders.
Here I mean more specifically my own attempts to empirically constrain the relation between the mass discrepancy and acceleration. Milgrom introduced MOND in 1983, no doubt after a long period of development and refereeing. He anticipated essentially all of what I’m going to describe. But not everyone is eager to accept MOND as a new fundamental theory, and people often suffer from a very human tendency to confuse fact and theory. So I have gone out of my way to demonstrate what is empirically true in the data – facts – irrespective of theoretical interpretation (MOND or otherwise).
What is empirically true, and now observationally established beyond a reasonable doubt, is that the mass discrepancy in rotating galaxies correlates with centripetal acceleration. The lower the acceleration, the more dark matter one appears to need. Or, as Bekenstein might have put it, the amplitude of the acceleration discrepancy grows as the acceleration itself declines.
Bob Sanders made the first empirical demonstration that I am aware of that the mass discrepancy correlates with acceleration. In a wide ranging and still relevant 1990 review, he showed that the amplitude of the mass discrepancy correlated with the acceleration at the last measured point of a rotation curve. It did not correlate with radius.
I was completely unaware of this when I became interested in the problem a few years later. I wound up reinventing the very same term – the mass discrepancy, which I defined as the ratio of dynamically measured mass to that visible in baryons: D = Mtot/Mbar. When there is no dark matter, Mtot = Mbar and D = 1.
My first demonstration of this effect was presented at a conference at Rutgers in 1998. This considered the mass discrepancy at every radius and every acceleration within all the galaxies that were available to me at that time. Though messy, as is often the case in extragalactic astronomy, the correlation was clear. Indeed, this was part of a broader review of galaxy formation; the title, abstract, and much of the substance remain relevant today.
I spent much of the following five years collecting more data, refining the analysis, and sweating the details of uncertainties and systematic instrumental effects. In 2004, I published an extended and improved version, now with over 5 dozen galaxies.
Here I’ve used a population synthesis model to estimate the mass-to-light ratio of the stars. This is the only unknown; everything else is measured. Note that the vast majority of galaxies land on top of each other. There are a few that do not, as you can perceive in the parallel sets of points offset from the main body. But that happens in only a few cases, as expected – no population model is perfect. Indeed, this one was surprisingly good, as the vast majority of the individual galaxies are indistinguishable in the pile that defines the main relation.
I explored how the estimation of the stellar mass-to-light ratio affected this mass discrepancy-acceleration relation in great detail in the 2004 paper. The details differ with the choice of estimator, but the bottom line was that the relation persisted for any plausible choice. The relation exists. It is an empirical fact.
At this juncture, further improvement was no longer limited by rotation curve data, which is what we had been working to expand through the early ’00s. Now it was the stellar mass. The measurement of stellar mass was based on optical measurements of the luminosity distribution of stars in galaxies. These are perfectly fine data, but it is hard to map the starlight that we measured to the stellar mass that we need for this relation. The population synthesis models were good, but they weren’t good enough to avoid the occasional outlier, as can be seen in the figure above.
One thing the models all agreed on (before they didn’t, then they did again) was that the near-infrared would provide a more robust way of mapping stellar mass than the optical bands we had been using up till then. This was the clear way forward, and perhaps the only hope for improving the data further. Fortunately, technology was keeping pace. Around this time, I became involved in helping the effort to develop the NEWFIRM near-infrared camera for the national observatories, and NASA had just launched the Spitzer space telescope. These were the right tools in the right place at the right time. Ultimately, the high accuracy of the deep images obtained from the dark of space by Spitzer at 3.6 microns was to prove most valuable.
Jim Schombert and I spent much of the following decade observing in the near-infrared. Many other observers were doing this as well, filling the Spitzer archive with useful data while we concentrated on our own list of low surface brightness galaxies. This paragraph cannot suffice to convey the long term effort and enormity of this program. But by the mid-teens, we had accumulated data for hundreds of galaxies, including all those for which we also had rotation curves and HI observations. The latter had been obtained over the course of decades by an entire independent community of radio observers, and represent an integrated effort that dwarfs our own.
On top of the observational effort, Jim had been busy building updated stellar population models. We have a sophisticated understanding of how stars work, but things can get complicated when you put billions of them together. Nevertheless, Jim’s work – and that of a number of independent workers – indicated that the relation between Spitzer’s 3.6 micron luminosity measurements and stellar mass should be remarkably simple – basically just a constant conversion factor for nearly all star forming galaxies like those in our sample.
Things came together when Federico Lelli joined Case Western as a postdoc in 2014. He had completed his Ph.D. in the rich tradition of radio astronomy, and was the perfect person to move the project forward. After a couple more years of effort, curating the rotation curve data and building mass models from the Spitzer data, we were in the position to build the relation for over a dozen dozen galaxies. With all the hard work done, making the plot was a matter of running a pre-prepared computer script.
Federico ran his script. The plot appeared on his screen. In a stunned voice, he called me into his office. We had expected an improvement with the Spitzer data – hence the decade of work – but we had also expected there to be a few outliers. There weren’t. Any.
All. the. galaxies. fell. right. on. top. of. each. other.
This plot differs from those above because we had decided to plot the measured acceleration against that predicted by the observed baryons so that the two axes would be independent. The discrepancy, defined as the ratio, depended on both. D is essentially the ratio of the y-axis to the x-axis of this last plot, dividing out the unity slope where D = 1.
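For readers who want to see the bookkeeping, here is a minimal sketch of how the two axes and the discrepancy relate at a single point of a rotation curve. It assumes the usual centripetal form g = V²/R for both the observed speed and the speed predicted from the baryons; the function name and the input numbers are made up for illustration and are not taken from SPARC.

```python
KM_PER_KPC = 3.0856775814913673e16  # kilometres in one kiloparsec


def accelerations(v_obs_kms, v_bar_kms, r_kpc):
    """Return (g_obs, g_bar, D) for one rotation curve point.

    v_obs_kms: observed rotation speed [km/s]
    v_bar_kms: speed predicted from the observed baryons [km/s]
    r_kpc:     radius [kpc]
    Accelerations are returned in m/s^2; D is dimensionless.
    """
    r_m = r_kpc * KM_PER_KPC * 1e3          # kpc -> m
    g_obs = (v_obs_kms * 1e3) ** 2 / r_m    # measured centripetal acceleration
    g_bar = (v_bar_kms * 1e3) ** 2 / r_m    # acceleration predicted by baryons
    return g_obs, g_bar, g_obs / g_bar      # D is the ratio of the two axes


# Illustrative (hypothetical) numbers: the galaxy rotates twice as fast
# as its baryons alone would predict at 10 kpc.
g_obs, g_bar, D = accelerations(v_obs_kms=100.0, v_bar_kms=50.0, r_kpc=10.0)
print(f"g_obs = {g_obs:.2e} m/s^2, g_bar = {g_bar:.2e} m/s^2, D = {D:.1f}")
```

Since the radius cancels in the ratio, D here is just (V_obs/V_bar)². Plotting g_obs against g_bar keeps the two axes independent.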
This was one of the most satisfactory moments of my long career, in which I have been fortunate to have had many satisfactory moments. It is right up there with the eureka moment I had that finally broke the long-standing loggerhead about the role of selection effects in Freeman’s Law. (Young astronomers – never heard of Freeman’s Law? You’re welcome.) Or the epiphany that, gee, maybe what we’re calling dark matter could be a proxy for something deeper. It was also gratifying that it was quickly recognized as such, with many of the colleagues I first presented it to saying it was the highlight of the conference where it was first unveiled.
Regardless of the ultimate interpretation of the radial acceleration relation, it clearly exists in the data for rotating galaxies. The discrepancy appears at a characteristic acceleration scale, g† = 1.2 × 10⁻¹⁰ m/s/s. That number is in the data. Why? is a deeply profound question.
It isn’t just that the acceleration scale is somehow fundamental. The amplitude of the discrepancy depends systematically on the acceleration. Above the critical scale, all is well: no need for dark matter. Below it, the amplitude of the discrepancy – the amount of dark matter we infer – increases systematically. The lower the acceleration, the more dark matter one infers.
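To make "increases systematically" concrete, here is a small sketch. This post does not write down a functional form, so the code assumes the fitting function commonly quoted for the radial acceleration relation in the literature, g_obs = g_bar / (1 − exp(−√(g_bar/g†))). Treat that choice, and the function name, as assumptions made for illustration.

```python
import math

G_DAGGER = 1.2e-10  # m/s^2, the acceleration scale quoted in the text


def rar_g_obs(g_bar, g_dagger=G_DAGGER):
    """Assumed RAR fitting function: observed acceleration from baryonic."""
    x = math.sqrt(g_bar / g_dagger)
    # -expm1(-x) equals 1 - exp(-x), computed accurately for small x
    return g_bar / (-math.expm1(-x))


for g_bar in (1e-8, 1e-10, 1e-12):
    discrepancy = rar_g_obs(g_bar) / g_bar
    print(f"g_bar = {g_bar:.0e} m/s^2 -> D = {discrepancy:.2f}")
```

Well above g† the ratio is indistinguishable from one. Well below it, this form tends to √(g_bar g†), so the inferred discrepancy keeps growing as the acceleration drops.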
The relation for rotating galaxies has no detectable scatter – it is a near-perfect relation. Whether this persists, and holds for other systems, is the interesting outstanding question. It appears, for example, that dwarf spheroidal galaxies may follow a slightly different relation. However, the emphasis here is on slightly. Very few of these data pass the same quality criteria that the SPARC data plotted above do. It’s like comparing mud pies with diamonds.
Whether the scatter in the radial acceleration relation is zero or merely very tiny is important. That’s the difference between a new fundamental force law (like MOND) and a merely spectacular galaxy scaling relation. For this reason, it seems to be controversial. It shouldn’t be: I was surprised at how tight the relation was myself. But I don’t get to report that there is lots of scatter when there isn’t. To do so would be profoundly unscientific, regardless of the wants of the crowd.
Of course, science is hard. If you don’t do everything right, from the measurements to the mass models to the stellar populations, you’ll find some scatter where perhaps there isn’t any. There are so many creative ways to screw up that I’m sure people will continue to find them. Myself, I prefer to look forward: I see no need to continuously re-establish what has been repeatedly demonstrated in the history briefly outlined above.
One experience I’ve frequently had in Astronomy is that there is no result so obvious that someone won’t claim the exact opposite. Indeed, the more obvious the result, the louder the claim to contradict it.
There is a very obvious acceleration scale in galaxies. It can be seen in several ways. Here I describe a nice way that is completely independent of any statistics or model fitting: no need to argue over how to set priors.
Simple dimensional analysis shows that a galaxy with a flat rotation curve has a characteristic acceleration
g† = 0.8 Vf⁴/(G Mb)
where Vf is the flat rotation speed, Mb is the baryonic mass, and G is Newton’s constant. The factor 0.8 arises from the disk geometry of rotating galaxies, which are not spherical cows. This is first year grad school material: see Binney & Tremaine. I include it here merely to place the characteristic acceleration g† on the same scale as Milgrom’s acceleration constant a0.
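As a quick sketch of the arithmetic, the formula can be evaluated directly. The function name and the input values below are hypothetical, chosen only to show the units working out; the product G M⊙ is the nominal IAU value.

```python
GM_SUN = 1.3271244e20  # G times one solar mass [m^3/s^2], nominal IAU value


def characteristic_acceleration(v_flat_kms, m_bar_msun):
    """g_dagger = 0.8 Vf^4 / (G Mb), returned in m/s^2.

    v_flat_kms: flat rotation speed [km/s]
    m_bar_msun: baryonic mass (stars + gas) [solar masses]
    """
    v_flat = v_flat_kms * 1e3  # km/s -> m/s
    return 0.8 * v_flat ** 4 / (GM_SUN * m_bar_msun)


# Hypothetical galaxy: Vf = 200 km/s, Mb = 8e10 solar masses
g_dag = characteristic_acceleration(200.0, 8e10)
print(f"g_dagger = {g_dag:.2e} m/s^2")
```

For inputs like these the result comes out of order 10⁻¹⁰ m/s/s. Note the fourth power: a 5% error in Vf shifts g† by about 20%, which is why the quality of the flat velocity measurement matters so much.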
These are all known numbers or measurable quantities. There are no free parameters: nothing to fiddle; nothing to fit. The only slightly tricky quantity is the baryonic mass, which is the sum of stars and gas. For the stars, we measure the light but need the mass, so we must adopt a mass-to-light ratio, M*/L. Here I adopt the simple model used to construct the radial acceleration relation: a constant 0.5 M⊙/L⊙ at 3.6 microns for galaxy disks, and 0.7 M⊙/L⊙ for bulges. This is the best present choice from stellar population models; the basic story does not change with plausible variations.
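The baryonic mass bookkeeping is simple enough to write out. This sketch uses the mass-to-light ratios quoted above (0.5 for disks, 0.7 for bulges, at 3.6 microns). The function name and the galaxy values are invented for illustration, and the gas mass is taken as a given input rather than derived from the HI flux.

```python
def baryonic_mass(l_disk, l_bulge, m_gas, ml_disk=0.5, ml_bulge=0.7):
    """Mb = (M*/L)_disk * L_disk + (M*/L)_bulge * L_bulge + M_gas.

    Luminosities are 3.6 micron values in solar luminosities;
    masses are in solar masses.
    """
    return ml_disk * l_disk + ml_bulge * l_bulge + m_gas


# Hypothetical star-dominated spiral
m_spiral = baryonic_mass(l_disk=1e10, l_bulge=2e9, m_gas=3e9)

# Hypothetical gas-rich dwarf: doubling M*/L barely moves the total
m_dwarf = baryonic_mass(l_disk=1e8, l_bulge=0.0, m_gas=1e9)
m_dwarf_heavy = baryonic_mass(l_disk=1e8, l_bulge=0.0, m_gas=1e9, ml_disk=1.0)

print(f"spiral: {m_spiral:.2e} Msun")
print(f"dwarf:  {m_dwarf:.2e} -> {m_dwarf_heavy:.2e} Msun when M*/L doubles")
```

In the gas-rich case, doubling the stellar mass-to-light ratio changes the baryonic mass by only about 5%, so the adopted M*/L hardly matters for such galaxies.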
This is all it takes to compute the characteristic acceleration of galaxies. Here is the resulting histogram for SPARC galaxies:
Do you see the acceleration scale? It’s right there in the data.
I first employed this method in 2011, where I found <g†> = 1.24 ± 0.14 Å s⁻² for a sample of gas rich galaxies that predates and is largely independent of the SPARC data. This is consistent with the SPARC result <g†> = 1.20 ± 0.02 Å s⁻². This consistency provides some reassurance that the mass-to-light scale is near to correct since the gas rich galaxies are not sensitive to the choice of M*/L. Indeed, the value of Milgrom’s constant has not changed meaningfully since Begeman, Broeils, & Sanders (1991).
The width of the acceleration histogram is dominated by measurement uncertainties and scatter in M*/L. We have assumed that M*/L is constant here, but this cannot be exactly true. It is a good approximation in the near-infrared, but there must be some variation from galaxy to galaxy, as each galaxy has its own unique star formation history. Intrinsic scatter in M*/L due to population difference broadens the distribution. The intrinsic distribution of characteristic accelerations must be smaller.
I have computed the scatter budget many times. It always comes up the same: known uncertainties and scatter in M*/L gobble up the entire budget. There is very little room left for intrinsic variation in <g†>. The upper limit is < 0.06 dex, an absurdly tiny number by the standards of extragalactic astronomy. The data are consistent with negligible intrinsic scatter, i.e., a universal acceleration scale. Apparently a fundamental acceleration scale is present in galaxies.
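A scatter budget of this kind amounts to subtracting known contributions in quadrature. The sketch below assumes independent errors and simple first-order propagation through g† ∝ Vf⁴/Mb. Every number in it is a placeholder chosen for illustration, not a value from the actual analysis.

```python
import math


def scatter_in_log_g(sigma_log_v, sigma_log_mb):
    """Expected scatter in log10(g_dagger) from errors in Vf and Mb.

    Assumes independent errors; the factor 4 comes from the Vf^4 dependence.
    """
    return math.hypot(4.0 * sigma_log_v, sigma_log_mb)


def intrinsic_scatter(sigma_observed, *sigma_known):
    """Quadrature-subtract known contributions from the observed width.

    Returns 0.0 if the known terms already account for everything.
    """
    remainder = sigma_observed ** 2 - sum(s ** 2 for s in sigma_known)
    return math.sqrt(max(remainder, 0.0))


# Placeholder values in dex (hypothetical, for illustration only)
expected = scatter_in_log_g(sigma_log_v=0.02, sigma_log_mb=0.10)
leftover = intrinsic_scatter(0.13, 4.0 * 0.02, 0.10)
print(f"expected from known errors: {expected:.3f} dex")
print(f"room left for intrinsic scatter: {leftover:.3f} dex")
```

With inputs like these, the known terms consume nearly the whole observed width, which is the qualitative situation described above.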
The radial acceleration relation connects what we see in visible mass with what we get in galaxy dynamics. This is true in a statistical sense, with remarkably little scatter. The SPARC data are consistent with a single, universal force law in galaxies. One that appears to be sourced by the baryons alone.
This was not expected with dark matter. Indeed, it would be hard to imagine a less natural result. We can only salvage the dark matter picture by tweaking it to make it mimic its chief rival. This is not a healthy situation for a theory.
On the other hand, if these results really do indicate the action of a single universal force law, then it should be possible to fit each individual galaxy. This has been done many times before, with surprisingly positive results. Does it work for the entirety of SPARC?
For the impatient, the answer is yes. Graduate student Pengfei Li has addressed this issue in a paper in press at A&A. There are some inevitable goofballs; this is astronomy after all. But by and large, it works much better than I expected – the goof rate is only about 10%, and the worst goofs are for the worst data.
Fig. 1 from the paper gives the example of NGC 2841. This case has been historically problematic for MOND, but a good fit falls out of the Bayesian MCMC procedure employed. We marginalize over the nuisance parameters (distance and inclination) in addition to the stellar mass-to-light ratio of disk and bulge. These come out a tad high in this case, but everything is within the uncertainties. A long standing historical problem is easily solved by application of Bayesian statistics.
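For those wondering what it means to treat distance and inclination as nuisance parameters, here is a rough sketch of how a trial value of each rescales a rotation curve point. It rests on standard scalings assumed here rather than quoted from the paper: radius scales linearly with distance, baryonic velocity contributions scale as the square root of distance, and the observed speed scales with the ratio of sines of the inclination. The function names are hypothetical, and the actual fitting code may differ in detail.

```python
import math


def baryonic_speed(v_gas, v_disk, v_bulge, ml_disk, ml_bulge):
    """Combine component speeds (stellar ones computed for M*/L = 1) in quadrature.

    Assumes all component contributions are positive.
    """
    return math.sqrt(v_gas ** 2 + ml_disk * v_disk ** 2 + ml_bulge * v_bulge ** 2)


def rescale_point(r_kpc, v_obs, v_bar, dist_ratio, inc_old_deg, inc_new_deg):
    """Rescale one data point for a trial distance and inclination.

    dist_ratio is (trial distance) / (catalog distance).
    """
    sin_ratio = (math.sin(math.radians(inc_old_deg))
                 / math.sin(math.radians(inc_new_deg)))
    r_new = r_kpc * dist_ratio               # physical radius grows with distance
    v_obs_new = v_obs * sin_ratio            # deprojection changes with inclination
    v_bar_new = v_bar * math.sqrt(dist_ratio)  # M scales as D^2, R as D, so V^2 as D
    return r_new, v_obs_new, v_bar_new


v_bar = baryonic_speed(v_gas=30.0, v_disk=80.0, v_bulge=0.0,
                       ml_disk=0.5, ml_bulge=0.7)
r2, v_obs2, v_bar2 = rescale_point(10.0, 100.0, v_bar, dist_ratio=1.21,
                                   inc_old_deg=60.0, inc_new_deg=60.0)
print(f"R = {r2:.1f} kpc, V_obs = {v_obs2:.1f} km/s, V_bar = {v_bar2:.1f} km/s")
```

In a Bayesian fit, each trial distance and inclination would carry a prior reflecting its measured value and uncertainty, so the sampler can nudge them but not run away with them.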
Another example is provided by the low surface brightness (LSB) dwarf galaxy IC 2574. Note that like all LSB galaxies, it lies at the low acceleration end of the RAR. This is what attracted my attention to the problem a long time ago: the mass discrepancy is large everywhere, so conventionally dark matter dominates. And yet, the luminous matter tells you everything you need to know to predict the rotation curve. This makes no physical sense whatsoever: it is as if the baryonic tail wags the dark matter dog.
In this case, the mass-to-light ratio of the stars comes out a bit low. LSB galaxies like IC 2574 are gas rich; the stellar mass is pretty much an afterthought to the fitting process. That’s good: there is very little freedom; the rotation curve has to follow almost directly from the observed gas distribution. If it doesn’t, there’s nothing to be done to fix it. But it is also bad: since the stars contribute little to the total mass budget, their mass-to-light ratio is not well constrained by the fit – changing it a lot makes little overall difference. This renders the formal uncertainty on the mass-to-light ratio highly dubious. The quoted number is correct for the data as presented, but it does not reflect the inevitable systematic errors that afflict astronomical observations in a variety of subtle ways. In this case, a small change in the innermost velocity measurements (as happens in the THINGS data) could change the mass-to-light ratio by a huge factor (and well outside the stated error) without doing squat to the overall fit.
We can address statistically how [un]reasonable the required fit parameters are. Short answer: they’re pretty darn reasonable. Here is the distribution of 3.6 micron band mass-to-light ratios.
From a stellar population perspective, we expect roughly constant mass-to-light ratios in the near-infrared, with some scatter. The fits to the rotation curves give just that. There is no guarantee that this should work out. It could be a meaningless fit parameter with no connection to stellar astrophysics. Instead, it reproduces the normalization, color dependence, and scatter expected from completely independent stellar population models.
The stellar mass-to-light ratio is practically inaccessible in the context of dark matter fits to rotation curves, as it is horribly degenerate with the parameters of the dark matter halo. That MOND returns reasonable mass-to-light ratios is one of those important details that keeps me wondering. It seems like there must be something to it.
Unsurprisingly, once we fit the mass-to-light ratio and the nuisance parameters, the scatter in the RAR itself practically vanishes. It does not entirely go away, as we fit only one mass-to-light ratio per galaxy (two in the handful of cases with a bulge). The scatter in the individual velocity measurements has been minimized, but some remains. The amount that remains is tiny (0.06 dex) and consistent with what we’d expect from measurement errors and mild asymmetries (non-circular motions).
For those unfamiliar with extragalactic astronomy, it is common for “correlations” to be weak and have enormous intrinsic scatter. Early versions of the Tully-Fisher relation were considered spooky-tight with a mere 0.4 mag. of scatter. In the RAR we have a relation as near to perfect as we’re likely to get. The data are consistent with a single, universal force law – at least in the radial direction in rotating galaxies.
That’s a strong statement. It is hard to understand in the context of dark matter. If you think you do, you are not thinking clearly.
So how strong is this statement? Very. We tried fits allowing additional freedom. None is necessary. One can of course introduce more parameters, but we find that no more are needed. The bare minimum is the mass-to-light ratio (plus the nuisance parameters of distance and inclination); these entirely suffice to describe the data. Allowing more freedom does not meaningfully improve the fits.
For example, I have often seen it asserted that MOND fits require variation in the acceleration constant of the theory. If this were true, I would have zero interest in the theory. So we checked.
Here we learn something important about the role of priors in Bayesian fits. If we allow the critical acceleration g† to vary from galaxy to galaxy with a flat prior, it does indeed do so: it flops around all over the place. Aha! So g† is not constant! MOND is falsified!
Well, no. Flat priors are often problematic, as they have no physical motivation. By allowing for a wide variation in g†, one is inviting covariance with other parameters. As g† goes wild, so too does the mass-to-light ratio. This wrecks the stellar mass Tully-Fisher relation by introducing a lot of unnecessary variation in the mass-to-light ratio: luminosity correlates nicely with rotation speed, but stellar mass picks up a lot of extraneous scatter. Worse, all this variation in both g† and the mass-to-light ratio does very little to improve the fits. It does a tiny bit – χ² gets infinitesimally better, so the fitting program takes it. But the improvement is not statistically meaningful.
In contrast, with a Gaussian prior, we get essentially the same fits, but with practically zero variation in g†. The reduced χ² actually gets a bit worse thanks to the extra, unnecessary, degree of freedom. This demonstrates that for these data, g† is consistent with a single, universal value. For whatever reason it may occur physically, this number is in the data.
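The difference between the two priors is easy to see in a toy calculation. In this sketch the prior centre and width (1.20 ± 0.02 in units of 10⁻¹⁰ m/s/s, borrowed from the SPARC mean quoted earlier), the flat prior bounds, and the χ² values are all assumptions for illustration. They are not necessarily the settings used in the paper.

```python
import math


def log_prior_flat(g, lo=0.1, hi=10.0):
    """Flat prior: every value inside the (assumed) bounds is equally acceptable."""
    return 0.0 if lo <= g <= hi else -math.inf


def log_prior_gaussian(g, mean=1.20, sigma=0.02):
    """Gaussian prior: wandering away from the mean costs log-probability."""
    return -0.5 * ((g - mean) / sigma) ** 2


def log_posterior(chi2, log_prior):
    """Log posterior up to a constant: Gaussian likelihood plus prior."""
    return -0.5 * chi2 + log_prior


# Toy case: moving g_dagger from 1.2 to 2.4 buys a tiny chi^2 improvement.
chi2_universal, chi2_wild = 50.0, 49.8   # hypothetical fit statistics

for name, prior in (("flat", log_prior_flat), ("gaussian", log_prior_gaussian)):
    keep = log_posterior(chi2_universal, prior(1.2))
    wild = log_posterior(chi2_wild, prior(2.4))
    print(f"{name:8s} prior prefers {'wild' if wild > keep else 'universal'} g_dagger")
```

Under the flat prior the sampler happily chases the infinitesimal χ² gain. Under the Gaussian prior the same gain is nowhere near enough to pay for the excursion.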
We have made the SPARC data public, so anyone who wants to reproduce these results may easily do so. Just mind your priors, and don’t take every individual error bar too seriously. There is a long tail to high χ2 that persists for any type of model. If you get a bad fit with the RAR, you will almost certainly get a bad fit with your favorite dark matter halo model as well. This is astronomy, fergodssake.