25 years a heretic

People seem to like to do retrospectives at year’s end. I take a longer view, but the end of 2020 seems like a fitting time to do that. Below is the text of a paper I wrote in 1995 with collaborators at the Kapteyn Institute of the University of Groningen. The last edit date is from December of that year, so this text (in plain TeX, not LaTeX!) is now a quarter century old. I am just going to cut & paste it as-was; I even managed to recover the original figures and translate them into something web-friendly (postscript to jpeg). This is exactly how it was.

This was my first attempt to express in the scientific literature my concerns for the viability of the dark matter paradigm, and my puzzlement that the only theory to get any genuine predictions right was MOND. It was the hardest admission in my career that this could be even a remote possibility. Nevertheless, intellectual honesty demanded that I report it. To fail to do so would be an act of reality denial antithetical to the foundational principles of science.

It was never published. There were three referees. Initially, one was positive, one was negative, and one insisted that rotation curves weren’t flat. There was one iteration; this is the resubmitted version in which the concerns of the second referee were addressed to his apparent satisfaction by making the third figure a lot more complicated. The third referee persisted that none of this was valid because rotation curves weren’t flat. Seems like he had a problem with something beyond the scope of this paper, but the net result was rejection.

One valid concern that ran through the refereeing process from all sides was “what about everything else?” This is a good question that couldn’t fit into a short letter like this. Thanks to the support of Vera Rubin and a Carnegie Fellowship, I spent the next couple of years looking into everything else. The results were published in 1998 in a series of three long papers: one on dark matter, one on MOND, and one making detailed fits.

This had started from a very different place intellectually with my efforts to write a paper on galaxy formation that would have been similar to contemporaneous papers like Dalcanton, Spergel, & Summers and Mo, Mao, & White. This would have followed from my thesis and from work with Houjun Mo, who was an office mate when we were postdocs at the IoA in Cambridge. (The ideas discussed in Mo, McGaugh, & Bothun have been reborn recently in the galaxy formation literature under the moniker of “assembly bias.”) But I had realized by then that my ideas – and those in the papers cited – were wrong. So I didn’t write a paper that I knew to be wrong. I wrote this one instead.

Nothing substantive has changed since. Reading it afresh, I’m amazed how many of the arguments over the past quarter century were anticipated here. As a scientific community, we are stuck in a rut, and seem to prefer to spin the wheels and dig ourselves in deeper rather than consider the plain if difficult path out.


Testing hypotheses of dark matter and alternative gravity with low surface density galaxies

The missing mass problem remains one of the most vexing in astrophysics. Observations clearly indicate either the presence of a tremendous amount of as yet unidentified dark matter1,2, or the need to modify the law of gravity3-7. These hypotheses make vastly different predictions as a function of density. Observations of the rotation curves of galaxies of much lower surface brightness than previously studied therefore provide a powerful test for discriminating between them. The dark matter hypothesis requires a surprisingly strong relation between the surface brightness and mass to light ratio8, placing stringent constraints on theories of galaxy formation and evolution. Alternatively, the observed behaviour is predicted4 by one of the hypothesised alterations of gravity known as modified Newtonian dynamics3,5 (MOND).

Spiral galaxies are observed to have asymptotically flat [i.e., V(R) ~ constant for large R] rotation curves that extend well beyond their optical edges. This trend continues for as far (many, sometimes > 10 galaxy scale lengths) as can be probed by gaseous tracers1,2 or by the orbits of satellite galaxies9. Outside a galaxy’s optical radius, the gravitational acceleration is a_N = GM/R^2 = V^2/R, so one expects V(R) ~ R^{-1/2}. This Keplerian behaviour is not observed in galaxies.

One approach to this problem is to increase M in the outer parts of galaxies in order to provide the extra gravitational acceleration necessary to keep the rotation curves flat. Indeed, this is the only option within the framework of Newtonian gravity since both V and R are directly measured. The additional mass must be invisible, dominant, and extend well beyond the optical edge of the galaxies.

Postulating the existence of this large amount of dark matter which reveals itself only by its gravitational effects is a radical hypothesis. Yet the kinematic data force it upon us, so much so that the existence of dark matter is generally accepted. Enormous effort has gone into attempting to theoretically predict its nature and experimentally verify its existence, but to date there exists no convincing detection of any hypothesised dark matter candidate, and many plausible candidates have been ruled out10.

Another possible solution is to alter the fundamental equation a_N = GM/R^2. Our faith in this simple equation is very well founded on extensive experimental tests of Newtonian gravity. Since it is so fundamental, altering it is an even more radical hypothesis than invoking the existence of large amounts of dark matter of completely unknown constituent components. However, a radical solution is required either way, so both possibilities must be considered and tested.

A phenomenological theory specifically introduced to address the problem of the flat rotation curves is MOND3. It has no other motivation and so far there is no firm physical basis for the theory. It provides no satisfactory cosmology, having yet to be reconciled with General Relativity. However, with the introduction of one new fundamental constant (an acceleration a_0), it is empirically quite successful in fitting galaxy rotation curves11-14. It hypothesises that for accelerations a < a_0 = 1.2 × 10^{-10} m s^{-2}, the effective acceleration is given by a_eff = (a_N a_0)^{1/2}. This simple prescription works well with essentially only one free parameter per galaxy, the stellar mass to light ratio, which is subject to independent constraint by stellar evolution theory. More importantly, MOND makes predictions which are distinct and testable. One specific prediction4 is that the asymptotic (flat) value of the rotation velocity, V_a, is V_a = (GMa_0)^{1/4}. Note that V_a does not depend on R, but only on M in the regime of small accelerations (a < a_0).
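To spell out the algebra behind this prediction (it follows in two lines from the quantities just defined, with nothing added beyond them):

```latex
% In the low-acceleration regime (a < a_0): a_eff = (a_N a_0)^{1/2}, with a_N = GM/R^2.
% Equating the centripetal acceleration to a_eff:
\[
  \frac{V^2}{R} = \left(\frac{GM}{R^2}\,a_0\right)^{1/2} = \frac{(GMa_0)^{1/2}}{R}
  \quad\Longrightarrow\quad
  V^4 = GMa_0 , \qquad V_a = (GMa_0)^{1/4},
\]
% which is independent of R: the rotation curve is asymptotically flat at V_a.
```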

In contrast, Newtonian gravity depends on both M and R. Replacing R with a mass surface density variable S = M(R)/R^2, the Newtonian prediction becomes M S ~ V_a^4, which contrasts with the MOND prediction M ~ V_a^4. These relations are the theoretical basis in each case for the observed luminosity-linewidth relation L ~ V_a^4 (better known as the Tully-Fisher15 relation. Note that the observed value of the exponent is bandpass dependent, but does obtain the theoretical value of 4 in the near infrared16, which is considered the best indicator of the stellar mass. The systematic variation with bandpass is a very small effect compared to the difference between the two gravitational theories, and must be attributed to dust or stars under either theory.) To transform from theory to observation one requires the mass to light ratio Y: Y = M/L = S/s, where s is the surface brightness. Note that in the purely Newtonian case, M and L are very different functions of R, so Y is itself a strong function of R. We define Y to be the mass to light ratio within the optical radius R*, as this is the only radius which can be measured by observation. The global mass to light ratio would be very different (since M ~ R for R > R*, the total masses of dark haloes are not measurable), but the particular choice of definition does not affect the relevant functional dependences, which are all that matters here. The predictions become Y^2 s L ~ V_a^4 for Newtonian gravity8,16 and Y L ~ V_a^4 for MOND4.
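Explicitly, using only the definitions above (S = M/R^2, Y = M/L, s = S/Y):

```latex
% Newtonian: V_a^2 = GM/R, hence V_a^4 = G^2 M^2/R^2 = G^2 M S.
% Substituting M = Y L and S = Y s, and similarly for MOND (V_a^4 = G a_0 M):
\[
  V_a^4 \propto M S = Y^2 s L \quad \text{(Newtonian gravity)},
  \qquad
  V_a^4 \propto M = Y L \quad \text{(MOND)}.
\]
```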

The only sensible17 null hypothesis that can be constructed is that the mass to light ratio be roughly constant from galaxy to galaxy. Clearly distinct predictions thus emerge if galaxies of different surface brightnesses s are examined. In the Newtonian case there should be a family of parallel Tully-Fisher relations for each surface brightness. In the case of MOND, all galaxies should follow the same Tully-Fisher relation irrespective of surface brightness.

Recently it has been shown that extreme objects such as low surface brightness galaxies8,18 (those with central surface brightnesses fainter than s_0 = 23 B mag arcsec^{-2}, corresponding to 40 L_sun pc^{-2}) obey the same Tully-Fisher relation as do the high surface brightness galaxies (typically with s_0 = 21.65 B mag arcsec^{-2} or 140 L_sun pc^{-2}) which originally15 defined it. Fig. 1 shows the luminosity-linewidth plane for galaxies ranging over a factor of 40 in surface brightness. Regardless of surface brightness, galaxies fall on the same Tully-Fisher relation.
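For scale, the size of the shift expected in the Newtonian case follows directly from the relations above (a back-of-the-envelope evaluation, not numbers read off the figure):

```latex
% At fixed L and constant Y, Y^2 s L ~ V_a^4 implies V_a \propto s^{1/4}, so a
% factor of 40 in surface brightness corresponds to
\[
  \frac{V_a(\text{high } s)}{V_a(\text{low } s)} = 40^{1/4} \approx 2.5,
  \qquad\text{or equivalently}\qquad
  \Delta m = 2.5 \log_{10} 40 \approx 4~\text{mag at fixed } V_a,
\]
% far larger than the observed scatter about the relation.
```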

The luminosity-linewidth (Tully-Fisher) relation for spiral galaxies over a large range in surface brightness. The B-band relation is shown; the same result is obtained in all bands8,18. Absolute magnitudes are measured from apparent magnitudes assuming H_0 = 75 km/s/Mpc. Rotation velocities V_a are directly proportional to observed 21 cm linewidths W_20 (measured as the full width at 20% of maximum), corrected for inclination [a factor of sin^{-1}(i)]. Open symbols are an independent sample which defines42 the Tully-Fisher relation (solid line). The dotted lines show the expected shift of the Tully-Fisher relation for each step in surface brightness away from the canonical value s_0 = 21.5 if the mass to light ratio remains constant. Low surface brightness galaxies are plotted as solid symbols, binned by surface brightness: red triangles: 22 < s_0 < 23; green squares: 23 < s_0 < 24; blue circles: s_0 > 24. One galaxy with two independent measurements is connected by a line. This gives an indication of the typical uncertainty, which is sufficient to explain nearly all the scatter. Contrary to the clear expectation of a readily detectable shift as indicated by the dotted lines, galaxies fall on the same Tully-Fisher relation regardless of surface brightness, as predicted by MOND.

MOND predicts this behaviour in spite of the very different surface densities of low surface brightness galaxies. Understanding this observational fact in the framework of standard Newtonian gravity requires a subtle relation8 between surface brightness and the mass to light ratio to keep the product s Y^2 constant. If we retain normal gravity and the dark matter hypothesis, this result is unavoidable, and the null hypothesis of similar mass to light ratios (which, together with an assumed constancy of surface brightness, is usually invoked to explain the Tully-Fisher relation) is strongly rejected. Instead, the current epoch surface brightness is tightly correlated with the properties of the dark matter halo, placing strict constraints on models of galaxy formation and evolution.

The mass to light ratios computed for both cases are shown as a function of surface brightness in Fig. 2. Fig. 2 is based solely on galaxies with full rotation curves19,20 and surface photometry, so Va and R* are directly measured. The correlation in the Newtonian case is very clear (Fig. 2a), confirming our inference8 from the Tully-Fisher relation. Such tight correlations are very rare in extragalactic astronomy, and the Y-s relation is probably the real cause of an inferred Y-L relation. The latter is much weaker because surface brightness and luminosity are only weakly correlated21-24.

The mass to light ratio Y (in M/L) determined with (a) Newtonian dynamics and (b) MOND, plotted as a function of central surface brightness. The mass determination for Newtonian dynamics is M = V^2 R*/G and for MOND is M = V^4/(G a_0). We have adopted as a consistent definition of the optical radius R* four scale lengths of the exponential optical disc. This is where discs tend to have edges, and contains essentially all the light21,22. The definition of R* makes a tremendous difference to the absolute value of the mass to light ratio in the Newtonian case, but makes no difference at all to the functional relation, which will be present regardless of the precise definition. These mass measurements are more sensitive to the inclination corrections than is the Tully-Fisher relation since there is a sin^{-2}(i) term in the Newtonian case and one of sin^{-4}(i) for MOND. It is thus very important that the inclination be accurately measured, and we have retained only galaxies which have adequate inclination determinations — error bars are plotted for a nominal uncertainty of 6 degrees. The sensitivity to inclination manifests itself as an increase in the scatter from (a) to (b). The derived mass is also very sensitive to the measured value of the asymptotic velocity itself, so we have used only those galaxies for which this can be taken directly from a full rotation curve19,20,42. We do not employ profile widths; the velocity measurements here are independent of those in Fig. 1. In both cases, we have subtracted off the known atomic gas mass19,20,42, so what remains is essentially only the stars and any dark matter that may exist. A very strong correlation (regression coefficient = 0.85) is apparent in (a): this is the mass to light ratio — surface brightness conspiracy. The slope is consistent (within the errors) with the theoretical expectation s ~ Y^{-2} derived from the Tully-Fisher relation8. At the highest surface brightnesses, the mass to light ratio is similar to that expected for the stellar population. At the faintest surface brightnesses, it has increased by a factor of nearly ten, indicating increasing dark matter domination within the optical disc as surface brightness decreases, or a very systematic change in the stellar population, or both. In (b), the mass to light ratio scatters about a constant value of 2. This mean value, and the lack of a trend, is what is expected for stellar populations17,21-24.
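As a purely illustrative numerical sketch of the two mass determinations quoted above (the galaxy parameters are invented for illustration, not taken from the actual sample):

```python
import numpy as np

# Physical constants (SI units)
G = 6.674e-11           # m^3 kg^-1 s^-2
a0 = 1.2e-10            # MOND acceleration constant, m s^-2
Msun = 1.989e30         # kg
kpc = 3.086e19          # m

# Hypothetical galaxy, for illustration only:
V = 100e3               # asymptotic rotation velocity, m/s (100 km/s)
R_star = 4 * 3.0 * kpc  # optical radius = four scale lengths of a 3 kpc disc
L = 5e9                 # luminosity in solar units
M_gas = 2e9 * Msun      # atomic gas mass to subtract

# Mass determinations as defined in the caption:
M_newton = V**2 * R_star / G      # Newtonian mass within R_star
M_mond = V**4 / (G * a0)          # MOND mass from the asymptotic velocity

Y_newton = (M_newton - M_gas) / Msun / L
Y_mond = (M_mond - M_gas) / Msun / L
print(f"Newtonian Y = {Y_newton:.1f}, MOND Y = {Y_mond:.1f} (solar units)")
```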

The Y-s relation is not predicted by any dark matter theory25,26. It can not be purely an effect of the stellar mass to light ratio, since no other stellar population indicator such as color21-24 or metallicity27,28 is so tightly correlated with surface brightness. In principle it could be an effect of the stellar mass fraction, as the gas mass to light ratio follows a relation very similar to that of total mass to light ratio20. We correct for this in Fig. 2 by subtracting the known atomic gas mass so that Y refers only to the stars and any dark matter. We do not correct for molecular gas, as this has never been detected in low surface brightness galaxies to rather sensitive limits30 so the total mass of such gas is unimportant if current estimates31 of the variation of the CO to H2 conversion factor with metallicity are correct. These corrections have no discernible effect at all in Fig. 2 because the dark mass is totally dominant. It is thus very hard to see how any evolutionary effect in the luminous matter can be relevant.

In the case of MOND, the mass to light ratio directly reflects that of the stellar population once the correction for gas mass fraction is made. There is no trend of Y* with surface brightness (Fig. 2b), a more natural result and one which is consistent with our studies of the stellar populations of low surface brightness galaxies21-23. These suggest that Y* should be roughly constant or slightly declining as surface brightness decreases, with much scatter. The mean value Y* = 2 is also expected from stellar evolutionary theory17, which always gives a number 0 < Y* < 10 and usually gives 0.5 < Y* < 3 for disk galaxies. This is particularly striking since Y* is the only free parameter allowed to MOND, and the observed mean is very close to that directly observed29 in the Milky Way (1.7 ± 0.5 M/L).

The essence of the problem is illustrated by Fig. 3, which shows the rotation curves of two galaxies of essentially the same luminosity but vastly different surface brightnesses. Though the asymptotic velocities are the same (as required by the Tully-Fisher relation), the rotation curve of the low surface brightness galaxy rises less quickly than that of the high surface brightness galaxy as expected if the mass is distributed like the light. Indeed, the ratio of surface brightnesses is correct to explain the ratio of velocities at small radii if both galaxies have similar mass to light ratios. However, if this continues to be the case as R increases, the low surface brightness galaxy should reach a lower asymptotic velocity simply because R* must be larger for the same L. That this does not occur is the problem, and poses very significant systematic constraints on the dark matter distribution.

The rotation curves of two galaxies, one of high surface brightness11 (NGC 2403; open circles) and one of low surface brightness19 (UGC 128; filled circles). The two galaxies have very nearly the same asymptotic velocity, and hence luminosity, as required by the Tully-Fisher relation. However, they have central surface brightnesses which differ by a factor of 13. The lines give the contributions to the rotation curves of the various components. Green: luminous disk. Blue: dark matter halo. Red: luminous disk (stars and gas) with MOND. Solid lines refer to NGC 2403 and dotted lines to UGC 128. The fits for NGC 2403 are taken from ref. 11, for which the stars have Y* = 1.5 M/L. For UGC 128, no specific fit is made: the blue and green dotted lines are simply the NGC 2403 fits scaled by the ratio of disk scale lengths h. This provides a remarkably good description of the UGC 128 rotation curve and illustrates one possible manifestation of the fine tuning problem: if disks have similar Y, the halo parameters ρ_0 and R_0 must scale with the disk parameters s_0 and h while conspiring to keep the product ρ_0 R_0^2 fixed at any given luminosity. Note also that the halo of NGC 2403 gives an adequate fit to the rotation curve of UGC 128. This is another possible manifestation of the fine tuning problem: all galaxies of the same luminosity have the same halo, with Y systematically varying with s_0 so that Y* goes to zero as s_0 goes to zero. Neither of these is exactly correct because the contribution of the gas can not be set to zero as is mathematically possible with the stars. This causes the resulting fine tuning problems to be even more complex, involving more parameters. Alternatively, the green dotted line is the rotation curve expected by MOND for a galaxy with the observed luminous mass distribution of UGC 128.

Satisfying the Tully-Fisher relation has led to some expectation that haloes all have the same density structure. This simplest possibility is immediately ruled out. In order to obtain L ~ V_a^4 ~ M S, one might suppose that the mass surface density S is constant from galaxy to galaxy, irrespective of the luminous surface density s. This achieves the correct asymptotic velocity V_a, but requires that the mass distribution, and hence the complete rotation curve, be essentially identical for all galaxies of the same luminosity. This is obviously not the case (Fig. 3), as the rotation curves of lower surface brightness galaxies rise much more gradually than those of higher surface brightness galaxies (also a prediction4 of MOND). It might be possible to have approximately constant density haloes if the highest surface brightness disks are maximal and the lowest minimal in their contribution to the inner parts of the rotation curves, but this then requires fine tuning of Y*, with this systematically decreasing with surface brightness.

The expected form of the halo mass distribution depends on the dominant form of dark matter. This could exist in three general categories: baryonic (e.g., MACHOs), hot (e.g., neutrinos), and cold exotic particles (e.g., WIMPs). The first two make no specific predictions. Baryonic dark matter candidates are most subject to direct detection, and most plausible candidates have been ruled out10, with remaining suggestions of necessity sounding increasingly contrived32. Hot dark matter is not relevant to the present problem. Even if neutrinos have a small mass, their velocities considerably exceed the escape velocities of the haloes of low mass galaxies where the problem is most severe. Cosmological simulations involving exotic cold dark matter33,34 have advanced to the point where predictions are being made about the density structure of haloes. These take the form33,34 ρ(R) = ρ_H/[R(R + R_H)^b], where ρ_H characterises the halo density and R_H its radius, with b ~ 2 to 3. The characteristic density depends on the mean density of the universe at the collapse epoch, and is generally expected to be greater for lower mass galaxies since these collapse first in such scenarios. This goes in the opposite sense of the observations, which show that low mass and low surface brightness galaxies are less, not more, dense. The observed behaviour is actually expected in scenarios which do not smooth on a particular mass scale and hence allow galaxies of the same mass to collapse at a variety of epochs25, but in this case the Tully-Fisher relation should not be universal. Worse, note that at small R < R_H, ρ(R) ~ R^{-1}. It has already been noted32,35 that such a steep interior density distribution is completely inconsistent with the few (4) analysed observations of dwarf galaxies. Our data19,20 confirm and considerably extend this conclusion for 24 low surface brightness galaxies over a wide range in luminosity.

The failure of the predicted exotic cold dark matter density distribution either rules out this form of dark matter, indicates some failing in the simulations (in spite of wide-spread consensus), or requires some mechanism to redistribute the mass. Feedback from star formation is usually invoked for the last of these, but this can not work for two reasons. First, an objection in principle: a small mass of stars and gas must have a dramatic impact on the distribution of the dominant dark mass, with which they can only interact gravitationally. More mass redistribution is required in less luminous galaxies since they start out denser but end up more diffuse; of course progressively less baryonic material is available to bring this about as luminosity declines. Second, an empirical objection: in this scenario, galaxies explode and gas is lost. However, progressively fainter and lower surface brightness galaxies, which need to suffer more severe explosions, are actually very gas rich.

Observationally, dark matter haloes are inferred to have density distributions1,2,11 with constant density cores, ρ(R) = ρ_0/[1 + (R/R_0)^g]. Here, ρ_0 is the core density and R_0 is the core size, with g ~ 2 being required to produce flat rotation curves. For g = 2, the rotation curve resulting from this mass distribution is V(R) = V_a [1 - (R_0/R) tan^{-1}(R/R_0)]^{1/2}, where the asymptotic velocity is V_a = (4πG ρ_0 R_0^2)^{1/2}. To satisfy the Tully-Fisher relation, V_a, and hence the product ρ_0 R_0^2, must be the same for all galaxies of the same luminosity. To decrease the rate of rise of the rotation curves as surface brightness decreases, R_0 must increase. Together, these two require a fine tuning conspiracy to keep the product ρ_0 R_0^2 constant while R_0 must vary with the surface brightness at a given luminosity. Luminosity and surface brightness themselves are only weakly correlated, so there exists a wide range in one parameter at any fixed value of the other. Thus the structural properties of the invisible dark matter halo dictate those of the luminous disk, or vice versa. So, s and L give the essential information about the mass distribution without recourse to kinematic information.
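The conspiracy can be made concrete with a small numerical sketch (assuming the g = 2 cored profile above, with invented halo parameters): two haloes that share the same product ρ_0 R_0^2 reach the same asymptotic velocity while rising at very different rates.

```python
import numpy as np

G = 6.674e-11                             # m^3 kg^-1 s^-2
kpc = 3.086e19                            # m
Msun_per_pc3 = 1.989e30 / (3.086e16)**3   # kg/m^3

def v_halo(R, rho0, R0):
    """Rotation curve of the cored (g = 2) halo quoted in the text:
    V(R) = V_a * sqrt(1 - (R0/R) * arctan(R/R0)), V_a = sqrt(4 pi G rho0 R0^2)."""
    Va = np.sqrt(4 * np.pi * G * rho0 * R0**2)
    return Va * np.sqrt(1.0 - (R0 / R) * np.arctan(R / R0))

R = np.array([2, 5, 10, 20, 40]) * kpc

# Halo A: dense and compact.  Halo B: 16x less dense with a 4x larger core.
# Both have the same rho0 * R0^2, hence the same asymptotic velocity.
rho_A, R0_A = 0.02 * Msun_per_pc3, 2 * kpc
rho_B, R0_B = rho_A / 16.0, 4 * R0_A

for label, rho0, R0 in [("A", rho_A, R0_A), ("B", rho_B, R0_B)]:
    print(label, np.round(v_halo(R, rho0, R0) / 1e3, 1), "km/s")
# Same flat velocity at large R, very different inner rise: the rho_0-R_0
# conspiracy that must hold at fixed luminosity.
```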

A strict s-ρ_0-R_0 relation is rigorously obeyed only if the haloes are spherical and dominate throughout. This is probably a good approximation for low surface brightness galaxies but may not be for those of the highest surface brightness. However, a significant non-halo contribution can at best replace one fine tuning problem with another (e.g., surface brightness being strongly correlated with the stellar population mass to light ratio instead of halo core density) and generally causes additional conspiracies.

There are two perspectives for interpreting these relations, with the preferred perspective depending strongly on the philosophical attitude one has towards empirical and theoretical knowledge. One view is that these are real relations which galaxies and their haloes obey. As such, they provide a positive link between models of galaxy formation and evolution and reality.

The other view is that this list of fine tuning requirements makes it rather unattractive to maintain the dark matter hypothesis. MOND provides an empirically more natural explanation for these observations. In addition to the Tully-Fisher relation, MOND correctly predicts the systematics of the shapes of the rotation curves of low surface brightness galaxies19,20 and fits the specific case of UGC 128 (Fig. 3). Low surface brightness galaxies were stipulated4 to be a stringent test of the theory because they should be well into the regime a < a_0. This is now observed to be true, and to the limit of observational accuracy the predictions of MOND are confirmed. The critical acceleration scale a_0 is apparently universal, so there is a single force law acting in galactic disks for which MOND provides the correct description. The cause of this could be either a particular dark matter distribution36 or a real modification of gravity. The former is difficult to arrange, and a single force law strongly supports the latter hypothesis since in principle the dark matter could have any number of distributions which would give rise to a variety of effective force laws. Even if MOND is not correct, it is essential to understand why it so closely describes the observations. Though the data can not exclude Newtonian dynamics, with a working empirical alternative (really an extension) at hand, we would not hesitate to reject as incomplete any less venerable hypothesis.

Nevertheless, MOND itself remains incomplete as a theory, being more of a Kepler’s Law for galaxies. It provides only an empirical description of kinematic data. While successful for disk galaxies, it was thought to fail in clusters of galaxies37. Recently it has been recognized that there exist two missing mass problems in galaxy clusters, one of which is now solved38: most of the luminous matter is in X-ray gas, not galaxies. This vastly improves the consistency of MOND with cluster dynamics39. The problem with the theory remains a reconciliation with Relativity and thereby standard cosmology (which is itself in considerable difficulty38,40), and a lack of any prediction about gravitational lensing41. These are theoretical problems which need to be more widely addressed in light of MOND’s empirical success.

ACKNOWLEDGEMENTS. We thank R. Sanders and M. Milgrom for clarifying aspects of a theory with which we were previously unfamiliar. SSM is grateful to the Kapteyn Astronomical Institute for enormous hospitality during visits when much of this work was done. [Note added in 2020: this work was supported by a cooperative grant funded by the EU and would no longer be possible thanks to Brexit.]

REFERENCES

  1. Rubin, V. C. Science 220, 1339-1344 (1983).
  2. Sancisi, R. & van Albada, T. S. in Dark Matter in the Universe, IAU Symp. No. 117, (eds. Knapp, G. & Kormendy, J.) 67-80 (Reidel, Dordrecht, 1987).
  3. Milgrom, M. Astrophys. J. 270, 365-370 (1983).
  4. Milgrom, M. Astrophys. J. 270, 371-383 (1983).
  5. Bekenstein, J. D. & Milgrom, M. Astrophys. J. 286, 7-14 (1984).
  6. Mannheim, P. D. & Kazanas, D. Astrophys. J. 342, 635-651 (1989).
  7. Sanders, R. H. Astron. Astrophys. Rev. 2, 1-28 (1990).
  8. Zwaan, M.A., van der Hulst, J. M., de Blok, W. J. G. & McGaugh, S. S. Mon. Not. R. astr. Soc., 273, L35-L38, (1995).
  9. Zaritsky, D. & White, S. D. M. Astrophys. J. 435, 599-610 (1994).
  10. Carr, B. Ann. Rev. Astr. Astrophys., 32, 531-590 (1994).
  11. Begeman, K. G., Broeils, A. H. & Sanders, R. H. Mon. Not. R. astr. Soc. 249, 523-537 (1991).
  12. Kent, S. M. Astr. J. 93, 816-832 (1987).
  13. Milgrom, M. Astrophys. J. 333, 689-693 (1988).
  14. Milgrom, M. & Braun, E. Astrophys. J. 334, 130-134 (1988).
  15. Tully, R. B., & Fisher, J. R. Astr. Astrophys., 54, 661-673 (1977).
  16. Aaronson, M., Huchra, J., & Mould, J. Astrophys. J. 229, 1-17 (1979).
  17. Larson, R. B. & Tinsley, B. M. Astrophys. J. 219, 48-58 (1978).
  18. Sprayberry, D., Bernstein, G. M., Impey, C. D. & Bothun, G. D. Astrophys. J. 438, 72-82 (1995).
  19. van der Hulst, J. M., Skillman, E. D., Smith, T. R., Bothun, G. D., McGaugh, S. S. & de Blok, W. J. G. Astr. J. 106, 548-559 (1993).
  20. de Blok, W. J. G., McGaugh, S. S., & van der Hulst, J. M. Mon. Not. R. astr. Soc. (submitted).
  21. McGaugh, S. S., & Bothun, G. D. Astr. J. 107, 530-542 (1994).
  22. de Blok, W. J. G., van der Hulst, J. M., & Bothun, G. D. Mon. Not. R. astr. Soc. 274, 235-259 (1995).
  23. Ronnback, J., & Bergvall, N. Astr. Astrophys., 292, 360-378 (1994).
  24. de Jong, R. S. Ph.D. thesis, University of Groningen (1995).
  25. Mo, H. J., McGaugh, S. S. & Bothun, G. D. Mon. Not. R. astr. Soc. 267, 129-140 (1994).
  26. Dalcanton, J. J., Spergel, D. N., Summers, F. J. Astrophys. J., (in press).
  27. McGaugh, S. S. Astrophys. J. 426, 135-149 (1994).
  28. Ronnback, J., & Bergvall, N. Astr. Astrophys., 302, 353-359 (1995).
  29. Kuijken, K. & Gilmore, G. Mon. Not. R. astr. Soc., 239, 605-649 (1989).
  30. Schombert, J. M., Bothun, G. D., Impey, C. D., & Mundy, L. G. Astron. J., 100, 1523-1529 (1990).
  31. Wilson, C. D. Astrophys. J. 448, L97-L100 (1995).
  32. Moore, B. Nature 370, 629-631 (1994).
  33. Navarro, J. F., Frenk, C. S., & White, S. D. M. Mon. Not. R. astr. Soc., 275, 720-728 (1995).
  34. Cole, S. & Lacey, C. Mon. Not. R. astr. Soc., in press.
  35. Flores, R. A. & Primack, J. R. Astrophys. J. 427, 1-4 (1994).
  36. Sanders, R. H., & Begeman, K. G. Mon. Not. R. astr. Soc. 266, 360-366 (1994).
  37. The, L. S., & White, S. D. M. Astron. J., 95, 1642-1651 (1988).
  38. White, S. D. M., Navarro, J. F., Evrard, A. E. & Frenk, C. S. Nature 366, 429-433 (1993).
  39. Sanders, R. H. Astron. Astrophys. 284, L31-L34 (1994).
  40. Bolte, M., & Hogan, C. J. Nature 376, 399-402 (1995).
  41. Bekenstein, J. D. & Sanders, R. H. Astrophys. J. 429, 480-490 (1994).
  42. Broeils, A. H., Ph.D. thesis, Univ. of Groningen (1992).

Oh… you don’t want to look in there

This post is a recent conversation with David Garofalo for his blog.


Today we talk to Dr. Stacy McGaugh, Chair of the Astronomy Department at Case Western Reserve University.

David: Hi Stacy. You had set out to disprove MOND and instead found evidence to support it. That sounds like the poster child for how science works. Was praise forthcoming?

Stacy: In the late 1980s and into the 1990s, I set out to try to understand low surface brightness galaxies. These are diffuse systems of stars and gas that rotate like the familiar bright spirals, but whose stars are much more spread out. Why? How did these things come to be? Why were they different from brighter galaxies? How could we explain their properties? These were the problems I started out working on that inadvertently set me on a collision course with MOND.

I did not set out to prove or disprove either MOND or dark matter. I was not really even aware of MOND at that time. I had heard of it only on a couple of occasions, but I hadn’t paid any attention, and didn’t really know anything about it. Why would I bother? It was already well established that there had to be dark matter.

I worked to develop our understanding of low surface brightness galaxies in the context of dark matter. Their blue colors, low metallicities, high gas fractions, and overall diffuse nature could be explained if they had formed in dark matter halos that are themselves lower than average density: they occupy the low concentration side of the distribution of dark matter halos at a given mass. I found this interpretation quite satisfactory, so it gave me no cause to doubt dark matter to that point.

This picture made two genuine predictions that had yet to be tested. First, low surface brightness galaxies should be less strongly clustered than brighter galaxies. Second, having their mass spread over a larger area, they should shift off of the Tully-Fisher relation defined by denser galaxies. The first prediction came true, and for a period I was jubilant that we had made an important new contribution to our understanding of both galaxies and dark matter. The second prediction failed badly: low surface brightness galaxies adhere to the same Tully-Fisher relation that other galaxies follow.

I tried desperately to understand the failure of the second prediction in terms of dark matter. I tried what seemed like a thousand ways to explain this, but ultimately they were all tautological: I could only explain it if I assumed the answer from the start. The adherence of low surface brightness galaxies to the Tully-Fisher relation poses a serious fine-tuning problem: the distribution of dark matter must be adjusted to exactly counterbalance that of the visible matter so as not to leave any residuals. This makes no sense, and anyone who claims it does is not thinking clearly.

It was in this crisis of comprehension in which I became aware that MOND predicted exactly what I was seeing. No fine-tuning was required. Low surface brightness galaxies followed the same Tully-Fisher relation as other galaxies because the modified force law stipulates that they must. It was only at this point (in the mid-’90s) at which I started to take MOND seriously. If it had got this prediction right, what else did it predict?

I was still convinced that the right answer had to be dark matter. There was, after all, so much evidence for it. So this one prediction must be a fluke; surely it would fail the next test. That was not what happened: MOND passed test after test after test, successfully predicting observations both basic and detailed that dark matter theory got wrong or did not even address. It was only after this experience that I realized that what I thought was evidence for dark matter was really just evidence that something was wrong: the data cannot be explained with ordinary gravity without invisible mass. The data – and here I mean ALL the data – were mostly ambiguous: they did not clearly distinguish whether the problem was with mass we couldn’t see or with the underlying equations from which we inferred the need for dark matter.

So to get back to your original question, yes – this is how science should work. I hadn’t set out to test MOND, but I had inadvertently performed exactly the right experiment for that purpose. MOND had its predictions come true where the predictions of other theories did not: both my own theory and those of others who were working in the context of dark matter. We got it wrong while MOND got it right. That led me to change my mind: I had been wrong to be sure the answer had to be dark matter, and to be so quick to dismiss MOND. Admitting this was the most difficult struggle I ever faced in my career.

David: From the perspective of dark matter, how does one understand MOND’s success?

Stacy: One does not.

That the predictions of MOND should come true in a universe dominated by dark matter makes no sense.

Before I became aware of MOND, I spent lots of time trying to come up with dark matter-based explanations for what I was seeing. It didn’t work. Since then, I have continued to search for a viable explanation with dark matter. I have not been successful. Others have claimed such success, but whenever I look at their work, it always seems that what they assert to be a great success is just a specific elaboration of a model I had already considered and rejected as obviously unworkable. The difference boils down to Occam’s razor. If you give dark matter theory enough free parameters, it can be adjusted to “predict” pretty much anything. But the best we can hope to do with dark matter theory is to retroactively explain what MOND successfully predicted in advance. Why should we be impressed by that?

David: Does MOND fail in clusters?

Stacy: Yes and no: there are multiple tests in clusters. MOND passes some and flunks others – as does dark matter.

The most famous test is the baryon fraction. This should be one in MOND – all the mass is normal baryonic matter. With dark matter, it should be the cosmic ratio of normal to dark matter (about 1:5).

MOND fails this test: it explains most of the discrepancy in clusters, but not all of it. The dark matter picture does somewhat better here, as the baryon fraction is close to the cosmic expectation — at least for the richest clusters of galaxies. In smaller clusters and groups of galaxies, the normal matter content falls short of the cosmic value. So both theories suffer a “missing baryon” problem: MOND in rich clusters; dark matter in everything smaller.

Another test is the mass-temperature relation. Both theories predict a relation between the mass of a cluster and the temperature of the gas it contains, but they predict different slopes for this relation. MOND gets the slope right but the amplitude wrong, leading to the missing baryon problem above. Dark matter gets the amplitude right for the most massive clusters, but gets the slope wrong – which leads to it having a missing baryon problem for systems smaller than the largest clusters.
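The different slopes follow from a schematic dimensional argument (the usual back-of-the-envelope reasoning under standard simplifying assumptions, not a derivation specific to the data discussed here):

```latex
% Newtonian virial/hydrostatic equilibrium: kT ~ G M mu m_p / R; if clusters
% share roughly the same characteristic density, M \propto R^3, so
\[
  kT \propto \frac{GM}{R} \propto M^{2/3}
  \quad\Longrightarrow\quad
  M \propto T^{3/2}.
\]
% In the deep-MOND regime, G M a_0 = V^4 and kT ~ mu m_p V^2, so
\[
  kT \propto V^2 \propto M^{1/2}
  \quad\Longrightarrow\quad
  M \propto T^{2}.
\]
```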

There are other tests. Clusters continue to merge; the collision velocity of merging clusters is predicted to be higher in MOND than with dark matter. For example, the famous bullet cluster, which is often cited as a contradiction to MOND, has a collision speed that is practically impossible with dark matter: there just isn’t enough time for the two components of the bullet to accelerate up to the observed relative speed if they fall together under the influence of normal gravity and the required amount of dark mass. People have argued over the severity of this perplexing problem, but the high collision speed happens quite naturally in MOND as a consequence of its greater effective force of attraction. So, taken at face value, the bullet cluster both confirms and refutes both theories!

I could go on… one expects clusters to form earlier and become more massive in MOND than in dark matter. There are some indications that this is the case – the highest redshift clusters came as a surprise to conventional structure formation theory – but the relative numbers of clusters as a function of mass seem to agree well with current expectations with dark matter. So clusters are a mixed bag.

More generally, there is a widespread myth that MOND fits rotation curves, but gets nothing else right. This is what I expected to find when I started fact checking, but the opposite is true. MOND explains a huge variety of data well. The presumptive superiority of dark matter is just that – a presumption.

David: At a physics colloquium two decades ago, Vera Rubin described how theorists were willing and eager to explain her data to her. At an astronomy colloquium a few years later, you echoed that sentiment in relation to your data on velocity curves. One concludes that theorists are uniquely insightful and generous people. Is there anyone you would like to thank for putting you straight? 
 
Stacy:  So they perceive themselves to be.

MOND has made many successful a priori predictions. This is the golden standard of the scientific method. If there is another explanation for it, I’d like to know what it is.

As your question supposes, many theorists have offered such explanations. At most one of them can be correct. I have yet to hear a satisfactory explanation.


David: What are MOND people working on these days? 
 
Stacy: Any problem that is interesting in extragalactic astronomy is interesting in the context of MOND. Outstanding questions include planes of satellite dwarf galaxies, clusters of galaxies, the formation of large scale structure, and the microwave background. MOND-specific topics include the precise value of the MOND acceleration constant, predicting the velocity dispersions of dwarf galaxies, and the search for the predicted external field effect, which is a unique signature of MOND.

The phrasing of this question raises a sociological issue. I don’t know what a “MOND person” is. Before now, I have only heard it used as a pejorative.

I am a scientist who has worked on many topics. MOND is just one of them. Does that make me a “MOND person”? I have also worked on dark matter, so am I also a “dark matter person”? Are these mutually exclusive?

I have attended conferences where I have heard people say ‘“MOND people” do this’ or ‘“MOND people” fail to do that.’ Never does the speaker of these words specify who they’re talking about: “MOND people” are a nameless Other. In all cases, I am more familiar with the people and the research they pretend to describe, but in no way do I recognize what they’re talking about. It is just a way of saying “Those People” are Bad.

There are many experts on dark matter in the world. I am one of them. There are rather fewer experts on MOND. I am also one of them. Every one of these “MOND people” is also an expert on dark matter. This situation is not reciprocated: many experts on dark matter are shockingly ignorant about MOND. I was once guilty of that myself, but realized that ignorance is not a sound basis for a scientific judgement.

David: Are you tired of getting these types of questions? 
 
Stacy: Yes and no.

No, in that these are interesting questions about fundamental science. That is always fun to talk about.

Yes, in that I find myself having the same arguments over and over again, usually with scientists who remain trapped in the misconceptions I suffered myself a quarter century ago, but whose minds are closed to ideas that threaten their sacred cows. If dark matter is a real, physical substance, then show me a piece already.

Cosmology, then and now

I have been busy teaching cosmology this semester. When I started on the faculty of the University of Maryland in 1998, there was no advanced course on the subject. This seemed like an obvious hole to fill, so I developed one. I remember with fond bemusement the senior faculty, many of them planetary scientists, sending Mike A’Hearn as a stately ambassador to politely inquire if cosmology had evolved beyond a dodgy subject and was now rigorous enough to be worthy of a 3 credit graduate course.

Back then, we used transparencies or wrote on the board. It was novel to have a course web page. I still have those notes, and marvel at the breadth and depth of work performed by my younger self. Now that I’m teaching it for the first time in a decade, I find it challenging to keep up. Everything has to be adapted to an electronic format, and be delivered remotely during this damnable pandemic. It is a less satisfactory experience, and it has precluded posting much here.

Another thing I notice is that attitudes have evolved along with the subject. The baseline cosmology, LCDM, has not changed much. We’ve tilted the power spectrum and spiked it with extra baryons, but the basic picture is that which emerged from the application of classical observational cosmology – measurements of the Hubble constant, the mass density, the ages of the oldest stars, the abundances of the light elements, number counts of faint galaxies, and a wealth of other observational constraints built up over decades of effort. Here is an example of combining such constraints, an exercise I have students do every time I teach the course:

Observational constraints in the mass density-Hubble constant plane assembled by students in my cosmology course in 2002. The gray area is excluded. The open window is the only space allowed; this is LCDM. The box represents the first WMAP estimate in 2003. CMB estimates have subsequently migrated out of the allowed region to lower H0 and higher mass density, but the other constraints have not changed much, most famously H0, which remains entrenched in the low to mid-70s.
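Here is a minimal sketch of that sort of exercise, applying just one of the classical constraints (the ages of the oldest stars) in a flat LCDM model; the 12 Gyr age floor and the parameter grid are illustrative choices, not the values actually used in the course:

```python
import numpy as np

def age_flat_lcdm(H0, Om):
    """Age of a flat LCDM universe in Gyr, for H0 in km/s/Mpc."""
    H0_inv_gyr = 977.8 / H0                 # 1/H0 in Gyr
    OL = 1.0 - Om
    if OL <= 0:
        return H0_inv_gyr * 2.0 / 3.0       # Einstein-de Sitter limit
    return H0_inv_gyr * (2.0 / (3.0 * np.sqrt(OL))) * np.arcsinh(np.sqrt(OL / Om))

Om_grid = np.linspace(0.1, 1.0, 10)
t_star = 12.0    # illustrative lower limit from the oldest stars, in Gyr

for H0 in (50, 60, 70, 80, 90):
    ok = [Om for Om in Om_grid if age_flat_lcdm(H0, Om) > t_star]
    limit = f"{max(ok):.1f}" if ok else "none"
    print(f"H0 = {H0} km/s/Mpc: Omega_m allowed up to {limit}")
# High H0 combined with high Omega_m gives a universe younger than its oldest
# stars; that is the kind of corner the gray excluded region closes off.
```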

These things were known by the mid-90s. Nowadays, people seem to think Type Ia SN discovered Lambda, when really they were just icing on a cake that was already baked. The location of the first peak in the acoustic power spectrum of the microwave background was corroborative of the flat geometry required by the picture that had developed, but trailed the development of LCDM rather than informing its construction. But students entering the field now seem to have been given the impression that these were the only observations that mattered.

Worse, they seem to think these things are Known, as if there’s never been a time that we cosmologists have been sure about something only to find later that we had it quite wrong. This attitude is deleterious to the progress of science, as it precludes us from seeing important clues when they fail to conform to our preconceptions. To give one recent example, everyone seems to have decided that the EDGES observation of 21 cm absorption during the dark ages is wrong. The reason? Because it is impossible in LCDM. There are technical reasons why it might be wrong, but these are subsidiary to Attitude: we can’t believe it’s true, so we don’t. But that’s what makes a result important: something that makes us reexamine how we perceive the universe. If we’re unwilling to do that, we’re no longer doing science.

A Philosophical Approach to MOND

A Philosophical Approach to MOND is a new book by David Merritt. This is a major development in both the science of cosmology and astrophysics, on the one hand, and the philosophy and history of science on the other. It should be required reading for anyone interested in any of these topics.

For many years, David Merritt was a professor of astrophysics who specialized in gravitational dynamics, leading a number of breakthroughs in the effects of supermassive black holes in galaxies on the orbits of stars around them. He has since transitioned to the philosophy of science. This may not sound like a great leap, but it is: these are different scholarly fields, each with their own traditions, culture, and required background education. Changing fields like this is a bit like switching boats mid-stream: even a strong swimmer may flounder in the attempt given the many boulders academic disciplines traditionally place in the stream of knowledge to mark their territory. Merritt has managed the feat with remarkable grace, devouring the background reading and coming up to speed in a different discipline to the point of a lucid fluency.

For the most part, practicing scientists have little interaction with philosophers and historians of science. Worse, we tend to have little patience for them. The baseline presumption of many physical scientists is that we know what we’re doing; there is nothing the philosophers can teach us. In the daily practice of what Kuhn called normal science, this is close to true. When instead we are faced with potential paradigm shifts, the philosophy of science is critical, and the absence of training in it on the part of many scientists becomes glaring.

In my experience, most scientists seem to have heard of Popper and Kuhn. If that. Physical scientists will almost always pay lip service to Popper’s ideal of falsifiability, and that’s pretty much the extent of it. Living up to that ideal is another matter. If an idea that is near and dear to their hearts and careers is under threat, the knee-jerk response is more commonly “let’s not get carried away!”

There is more to the philosophy of science than that. The philosophers of science have invested lots of effort in considering both how science works in practice (e.g., Kuhn) and how it should work (Popper, Lakatos, …). The practice and the ideal of science are not always the same thing.

The debate about dark matter and MOND hinges on the philosophy of science in a profound way. I do not think it is possible to make real progress out of our current intellectual morass without a deep examination of what science is and what it should be.

Merritt takes us through the methodology of scientific research programs, spelling out what we’ve learned from past experience (the history of science) and from careful consideration of how science should work (its philosophical basis). For example, all scientists agree that it is important for a scientific theory to have predictive power. But we are disturbingly fuzzy on what that means. I frequently hear my colleagues say things like “my theory predicts that” in reference to some observation, when in fact no such prediction was made in advance. What they usually mean is that it fits well with the theory. This is sometimes true – they could have predicted the observation in advance if they had considered that particular case. But sometimes it is retroactive fitting more than prediction – consistency, perhaps, but it could have gone a number of other ways equally well. Worse, it is sometimes a post facto assertion that is simply false: not only was the prediction not made in advance, but the observation was genuinely surprising at the time it was made. Only in retrospect is it “correctly” “predicted.”

The philosophers have considered these situations. One thing I appreciate is Merritt’s review of the various takes philosophers have on what counts as a prediction. I wish I had known these things when I wrote the recent review in which I took a very restrictive definition to avoid the foible above. The philosophers provide better definitions, of which more than one can be usefully applicable. I’m not going to go through them here: you should read Merritt’s book, and those of the philosophers he cites.

From this philosophical basis, Merritt makes a systematic, dare I say, scientific, analysis of the basic tenets of MOND and MONDian theories, and how they fare with regard to their predictions and observational tests. Along the way, he also considers the same material in the light of the dark matter paradigm. Of comparable import to confirmed predictions are surprising observations: if a new theory predicts that the sun will rise in the morning, that isn’t either new or surprising. If instead a theory expects one thing but another is observed, that is surprising, and it counts against that theory even if it can be adjusted to accommodate the new fact. I have seen this happen over and over with dark matter: surprising observations (e.g., the absence of cusps in dark matter halos, the small numbers of dwarf galaxies, downsizing in which big galaxies appear to form earliest) are at first ignored, doubted, debated, then partially explained with some mental gymnastics until it is Known and of course, we knew it all along. Merritt explicitly points out examples of this creeping determinism, in which scientists come to believe they predicted something they merely rationalized post-facto (hence the preeminence of genuinely a priori predictions that can’t be fudged).

Merritt’s book is also replete with examples of scientists failing to take alternatives seriously. This is natural: we have invested an enormous amount of time developing physical science to the point we have now reached; there is an enormous amount of background material that cannot simply be ignored or discarded. All too often, we are confronted with crackpot ideas that do exactly this. This makes us reluctant to consider ideas that sound crazy at first blush, and most of us will rightly display considerable irritation when asked to do so. For reasons both valid and not, MOND skirts this boundary. I certainly didn’t take it seriously myself, nor really considered it at all, until its predictions came true in my own data. It was so far below my radar that at first I did not even recognize that this is what had happened. But I did know I was surprised; what I was seeing did not make sense in terms of dark matter. So, from this perspective, I can see why other scientists are quick to dismiss it. I did so myself, initially. I was wrong to do so, and so are they.

A common failure mode is to ignore MOND entirely: despite dozens of confirmed predictions, it simply remains off the radar for many scientists. They seem never to have given it a chance, so they simply don’t pay attention when it gets something right. This is pure ignorance, which is not a strong foundation from which to render a scientific judgement.

Another common reaction is to acknowledge then dismiss. Merritt provides many examples where eminent scientists do exactly this with a construction like: “MOND correctly predicted X but…” where X is a single item, as if this is the only thing that [they are aware that] it does. Put this way, it is easy to dismiss – a common refrain I hear is “MOND fits rotation curves but nothing else.” This is a long-debunked falsehood that is asserted and repeated until it achieves the status of common knowledge within the echo chamber of scientists who refuse to think outside the dark matter box.

This is where the philosophy of science is crucial to finding our way forward. Merritt’s book illuminates how this is done. If you are reading these words, you owe it to yourself to read his book.

Predictive Power in Science

“Winning isn’t everything. It’s the only thing.”

Red Sanders

This is a wise truth that has often been poorly interpreted. I despise some of the results that this sports quote has had in American culture. It has fostered a culture of bad sportsmanship in some places: an acceptance, even a dictum, that the ends justify the means – up to and including cheating, provided you can get away with it.

Winning every time is an impossible standard. In any competitive event, someone will win a particular game, and someone else will lose. Every participant will be on the losing side some of the time. Learning to lose gracefully despite a great effort is an essential aspect of sportsmanship that must be taught and learned, because it sure as hell isn’t part of human nature.

But there is wisdom here. The quote originates with a football coach. Football is a sport where there is a lot of everything – to even have a chance of winning, you have to do everything right. Not just performance on the field, but strategic choices made before and during the game, and mundane but essential elements like getting the right personnel on the field for each play. What? We’re punting? I thought it was third down!

You can do everything right and still lose. And that’s what I interpret the quote to really mean. You have to do everything to compete. But people will only judge you to be successful if you win.

To give a recent example, the Kansas City Chiefs won this year’s Superbowl. It was only a few months ago, though it seems much longer in pandemic time. The Chiefs dominated the Superbowl, but they nearly didn’t make it past the AFC Championship game.

The Tennessee Titans dominated the early part of the AFC Championship game. They had done everything right. They had peaked at the right time as a team in the overly long and brutal NFL season. They had an excellent game plan, just as they had had in handily defeating the highly favored New England Patriots on the way to the Championship game. Their defense admirably contained the high octane Chiefs offense. It looked like they were going to the Superbowl.

Then one key injury occurred. The Titans lost the only defender who could match up one on one with tight end Travis Kelce. This had an immediate impact on the game, as the Chiefs quickly realized they could successfully throw to Kelce over and over after not having been able to do so at all. The Titans were obliged to double-cover, which opened up other opportunities. The Chiefs’ offense went from impotent to unstoppable.

I remember this small detail because Kelce is a local boy. He attended the same high school as my daughters, playing on the same field they would (only shortly later) march on with the marching band during half times. If it weren’t for this happenstance of local interest, I probably wouldn’t have noticed this detail of the game, much less remember it.

The bigger point is that the Titans did everything right as a team. They lost anyway. All most people will remember is that the Chiefs won the Super Bowl, not that the Titans almost made it there. Hence the quote:

“Winning isn’t everything. It’s the only thing.”

The hallmark of science is predictive power. This is what distinguishes it from other forms of knowledge. The gold standard is a prediction that is made and published in advance of the experiment that tests it. This eliminates the ability to hedge: either we get it right in advance, or we don’t.

The importance of such a prediction depends on how surprising it is. Predicting that the sun will rise tomorrow is not exactly a bold prediction, is it? If instead we have a new idea that changes how we think about how the world works, and makes a prediction that is distinct from current wisdom, then that’s very important. Judging how important a particular prediction may be is inevitably subjective.

That’s very important!

It is rare that we actually meet the gold standard of a priori prediction, but it does happen. A prominent example is the prediction of gravitational lensing by General Relativity. Einstein pointed out that his theory predicted twice the light-bending that Newtonian theory did. Eddington organized an expedition to measure this effect during a solar eclipse, and claimed to confirm Einstein’s prediction within a few years of it having been made. This is reputed to have had a strong impact that led to widespread acceptance of the new theory. Some of that was undoubtedly due to Eddington’s cheerleading: it does not suffice merely to make a successful prediction; that it has happened also needs to become widely known.

It is impossible to anticipate every conceivable experimental result and publish a prediction for it in advance. So there is another situation: does a theory predict what is observed? This has several standards. The highest standard deserves a silver medal. This happens when you work out the prediction of a theory, and you find that it gives exactly what is observed, with very little leeway. If you had had the opportunity to make the prediction in advance, it would have risen to the gold standard.

Einstein provides another example of a silver-standard prediction. A long-standing problem in planetary dynamics was the excess precession of the perihelion of Mercury. The orientation of the elliptical orbit of Mercury changes slowly, with the major axis of the ellipse pivoting by 574 arcseconds per century. That’s a tiny rate of angular change, but we’ve been keeping very accurate records of where the planets are for a very long time, so it was well measured. Indeed, it was recognized early that precession would be caused by torques from other planets: it isn’t just Mercury going around the sun; the rest of the solar system matters too. Planetary torques are responsible for most of the effect, but not all. By 1859, Urbain Le Verrier had worked out that the torques from known planets should only amount to 532 arcseconds per century. [I am grossly oversimplifying some fascinating history. Go read up on it!] The point is that there was an excess, unexplained precession of 43 arcseconds per century. This discrepancy was known, known to be serious, and had no satisfactory explanation for many decades before Einstein came on the scene. No way he could go back in time and make a prediction before he was born! But when he worked out the implications of his new theory for this problem, the right answer fell straight out. It explained an ancient and terrible problem without any sort of fiddling: it had to be so.
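
Since the numbers are all on the table, the general relativistic rate is easy to check for oneself. The sketch below uses the standard formula for the perihelion advance per orbit, 6πGM/[c²a(1−e²)], with rounded textbook values for Mercury's orbit; it recovers the famous 43 arcseconds per century.

```python
import math

# Rounded standard values for the Sun and Mercury's orbit
GM_SUN = 1.327e20    # m^3/s^2, gravitational parameter of the Sun
C = 2.998e8          # m/s, speed of light
A = 5.79e10          # m, semi-major axis of Mercury's orbit
E = 0.2056           # orbital eccentricity
PERIOD_DAYS = 87.97  # orbital period in days

# GR perihelion advance per orbit: 6*pi*G*M / (c^2 * a * (1 - e^2))
dphi_per_orbit = 6.0 * math.pi * GM_SUN / (C**2 * A * (1.0 - E**2))

orbits_per_century = 100.0 * 365.25 / PERIOD_DAYS
arcsec_per_century = math.degrees(dphi_per_orbit * orbits_per_century) * 3600.0
print(f"{arcsec_per_century:.1f} arcseconds per century")  # ~43
```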

The data for the precession of the perihelion of Mercury were far superior to the first gravitational lensing measurements made by Eddington and his colleagues. The precession was long known and accurately measured, the post facto prediction clean and unambiguous. So in this case, the silver standard was perhaps better than the gold standard. Hence the question once posed to me by a philosopher of science: why should we care whether the prediction came in advance of the observation? If X is a consequence of a theory, and X is observed, what difference does it make which came first?

In principle, none. In practice, it depends. I made the hedge above of “very little leeway.” If there is zero leeway, then silver is just as good as gold. There is no leeway to fudge it, so the order doesn’t matter.

It is rare that there is no leeway to fudge it. Theorists love to explore arcane facets of their ideas. They are exceedingly clever at finding ways to “explain” observations that their theory did not predict, even those that seem impossible for their theory to explain. So the standard by which such a post-facto “prediction” must be judged depends on the flexibility of the theory, and the extent to which one indulges said flexibility. If it is simply a matter of fitting for some small number of unknown parameters that are perhaps unknowable in advance, then I would award that a bronze medal. If instead one must strain to twist the theory to make it work out, then that merits at best an asterisk: “we fit* it!” can quickly become “*we’re fudging it!” That’s why truly a priori prediction is the gold standard. There is no way to go back in time and fudge it.

An important corollary is that if a theory gets its predictions right in advance, then we are obliged to acknowledge the efficacy of that theory. The success of a priori predictions is the strongest possible sign that the successful theory is a step in the right direction. This is how we try to maintain objectivity in science: it is how we know when to suck it up and say “OK, my favorite theory got this wrong, but this other theory I don’t like got its prediction exactly right. I need to re-think this.” This ethos has been part of science for as long as I can remember, and a good deal longer than that. I have heard some argue that this is somehow outdated and that we should give up this ethos. This is stupid. If we give up the principle of objectivity, science would quickly degenerate into a numerological form of religion: my theory is always right! and I can bend the numbers to make it seem so.

Hence the hallmark of science is predictive power. Can a theory be applied to predict real phenomena? It doesn’t matter whether the prediction is made in advance or not – with the giant caveat that “predictions” not be massaged to fit the facts. There is always a temptation to massage one’s favorite theory – and obfuscate the extent to which one is doing so. Consequently, truly a priori prediction must necessarily remain the gold standard in science. The power to make such predictions is fundamental.

Predictive power in science isn’t everything. It’s the only thing.

 


As I was writing this, I received email to the effect that these issues are also being discussed elsewhere, by Jim Baggot and Sabine Hossenfelder. I have not yet read what they have to say.

Hypothesis testing with gas rich galaxies

This Thanksgiving, I’d like to highlight something positive. Recently, Bob Sanders wrote a paper pointing out that gas rich galaxies are strong tests of MOND. The usual fit parameter, the stellar mass-to-light ratio, is effectively negligible when gas dominates. The MOND prediction follows straight from the gas distribution, for which there is no equivalent freedom. We understand the 21 cm spin-flip transition well enough to relate observed flux directly to gas mass.

In any human endeavor, there are inevitably unsung heroes who carry enormous amounts of water but seem to get no credit for it. Sanders is one of those heroes when it comes to the missing mass problem. He was there at the beginning, and has a valuable perspective on how we got to where we are. I highly recommend his books, The Dark Matter Problem: A Historical Perspective and Deconstructing Cosmology.

In bright spiral galaxies, stars are usually 80% or so of the mass, gas only 20% or less. But in many dwarf galaxies, the mass ratio is reversed. These are often low surface brightness and challenging to observe. But it is a worthwhile endeavor, as their rotation curves are predicted by MOND with extraordinarily little freedom.

Though gas rich galaxies do indeed provide an excellent test of MOND, nothing in astronomy is perfectly clean. The stellar mass-to-light ratio is an irreducible need-to-know parameter. We also need to know the distance to each galaxy, as we do not measure the gas mass directly, but rather the flux of the 21 cm line. The gas mass scales with the flux and the square of the distance (see equation 7E7), so to get the gas mass right, we must first get the distance right. We also need to know the inclination of a galaxy as projected on the sky in order to get the rotation velocity we’re fitting right, as the observed line-of-sight Doppler velocity is only sin(i) of the full, in-plane rotation speed. The 1/sin(i) correction becomes increasingly sensitive to errors as i approaches zero (face-on galaxies).
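
To make those scalings concrete, here is a minimal sketch in Python. The 2.36 × 10^5 coefficient is the standard optically thin 21 cm conversion; the numbers fed in at the bottom are purely illustrative, not the measured values for any particular galaxy.

```python
import numpy as np

def hi_gas_mass(flux_jy_kms, distance_mpc):
    """Atomic gas mass from the integrated 21 cm flux, assuming optically thin gas:
    M_HI [Msun] ~ 2.36e5 * D[Mpc]^2 * S[Jy km/s].
    Any error in the distance enters the gas mass squared."""
    return 2.36e5 * distance_mpc**2 * flux_jy_kms

def deproject_velocity(v_los_kms, inclination_deg):
    """In-plane rotation speed from the line-of-sight velocity: v_rot = v_los / sin(i).
    The correction blows up as i approaches zero (face-on)."""
    return v_los_kms / np.sin(np.radians(inclination_deg))

# Illustrative numbers only:
print(hi_gas_mass(flux_jy_kms=100.0, distance_mpc=4.0))          # ~3.8e8 Msun
print(deproject_velocity(v_los_kms=40.0, inclination_deg=61.0))  # ~46 km/s
```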

The mass-to-light ratio is a physical fit parameter that tells us something meaningful about the amount of stellar mass that produces the observed light. In contrast, for our purposes here, distance and inclination are “nuisance” parameters. These nuisance parameters can be, and generally are, measured independently from mass modeling. However, these measurements have their own uncertainties, so one has to be careful about taking the measured values as-is. One of the powerful aspects of Bayesian analysis is the ability to account for these uncertainties: the distance is allowed to drift a bit from the measured value, but not too far, as quantified by the measurement errors. This is what current graduate student Pengfei Li did in Li et al. (2018). The constraints on MOND are so strong in gas rich galaxies that often the nuisance parameters cannot be ignored, even when they’re well measured.
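
Schematically, the trick is to let the nuisance parameters float while penalizing departures from their independently measured values. The toy log-posterior below is only a sketch of that idea, not the actual machinery of Li et al. (2018); the model interface and parameter names are hypothetical.

```python
import numpy as np

def log_posterior(params, data, priors):
    """Toy log-posterior: chi-squared likelihood plus Gaussian priors on the
    nuisance parameters (distance and inclination).

    params: (ml_star, distance_mpc, inclination_deg)
    data:   dict with 'radius', 'v_obs', 'dv_obs', and a 'model' callable
    priors: dict with (mean, sigma) tuples for 'D' and 'i'
    """
    ml_star, dist, inc = params
    # hypothetical model interface: predicted rotation curve for these parameters
    v_model = data["model"](data["radius"], ml_star, dist, inc)
    chi2 = np.sum(((data["v_obs"] - v_model) / data["dv_obs"]) ** 2)
    # Gaussian priors keep D and i near their measured values, within the errors
    lp = -0.5 * ((dist - priors["D"][0]) / priors["D"][1]) ** 2
    lp += -0.5 * ((inc - priors["i"][0]) / priors["i"][1]) ** 2
    return -0.5 * chi2 + lp
```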

To illustrate what I’m talking about, let’s look at one famous example, DDO 154. This galaxy is over 90% gas. The stars (pictured above) just don’t matter much. If the distance and inclination are known, the MOND prediction for the rotation curve follows directly. Here is an example of a MOND fit from a recent paper:

The MOND fit to DDO 154 from Ren et al. (2018). The black points are the rotation curve data, the green line is the Newtonian expectation for the baryons, and the red line is their MOND fit.

This is terrible! The MOND fit – essentially a parameter-free prediction – misses all of the data. MOND is falsified. If one is inclined to hate MOND, as many seem to be, then one stops here. No need to think further.

If one is familiar with the ups and downs in the history of astronomy, one might not be so quick to dismiss it. Indeed, one might notice that the shape of the MOND prediction closely tracks the shape of the data. There’s just a little difference in scale. That’s kind of amazing for a theory that is wrong, especially when it is amplifying the green line to predict the red one: it needn’t have come anywhere close.

Here is the fit to the same galaxy using the same data [already] published in Li et al.:

The MOND fit to DDO 154 from Li et al. (2018) using the same data as above, as tabulated in SPARC.

Now we have a good fit, using the same data! How can this be so?

I have not checked what Ren et al. did to obtain their MOND fits, but having done this exercise myself many times, I recognize the slight offset they find as a typical consequence of holding the nuisance parameters fixed. What if the measured distance is a little off?

Distance estimates to DDO 154 in the literature range from 3.02 Mpc to 6.17 Mpc. The formally most accurate distance measurement is 4.04 ± 0.08 Mpc. In the fit shown here, we obtained 3.87 ± 0.16 Mpc. The error bars on these distances overlap, so they are the same number, to measurement accuracy. These data do not falsify MOND. They demonstrate that it is sensitive enough to tell the difference between 3.8 and 4.1 Mpc.

One will never notice this from a dark matter fit. Ren et al. also make fits with self-interacting dark matter (SIDM). The nifty thing about SIDM is that it makes quasi-constant density cores in dark matter halos. Halos of this form are not predicted by “ordinary” cold dark matter (CDM), but often give better fits than either MOND or the NFW halos of dark matter-only CDM simulations. For this galaxy, Ren et al. obtain the following SIDM fit.

The SIDM fit to DDO 154 from Ren et al.

This is a great fit. Goes right through the data. That makes it better, right?

Not necessarily. In addition to the mass-to-light ratio (and the nuisance parameters of distance and inclination), dark matter halo fits have [at least] two additional free parameters to describe the dark matter halo, such as its mass and core radius. These parameters are highly degenerate – one can obtain equally good fits for a range of mass-to-light ratios and core radii: one makes up for what the other misses. Parameter degeneracy of this sort is usually a sign that there is too much freedom in the model. In this case, the data are adequately described by one parameter (the MOND fit M*/L, not counting the nuisances in common), so using three (M*/L, Mhalo, Rcore) is just an exercise in fitting a French curve. There is ample freedom to fit the data. As a consequence, you’ll never notice that one of the nuisance parameters might be a tiny bit off.

In other words, you can fool a dark matter fit, but not MOND. Erwin de Blok and I demonstrated this 20 years ago. A common myth at that time was that “MOND is guaranteed to fit rotation curves.” This seemed patently absurd to me, given how it works: once you stipulate the distribution of baryons, the rotation curve follows from a simple formula. If the two don’t match, they don’t match. There is no guarantee that it’ll work. Instead, it can’t be forced.
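
That “simple formula” can be written in a few lines. The sketch below uses the so-called simple interpolation function, one common choice (the theory only fixes the limiting behavior), with the usual a0 ≈ 1.2 × 10^-10 m/s².

```python
import numpy as np

A0 = 1.2e-10     # m/s^2, Milgrom's acceleration constant (approximate)
KPC = 3.086e19   # m per kpc
KMS = 1.0e3      # m/s per km/s

def mond_rotation_curve(r_kpc, v_baryon_kms, a0=A0):
    """Predicted rotation speed given the Newtonian speed of the baryons alone.

    Uses the 'simple' interpolation function nu(y) = 1/2 + sqrt(1/4 + 1/y)
    with y = g_Newton / a0. No dark matter halo parameters appear anywhere.
    """
    r = np.asarray(r_kpc, dtype=float) * KPC
    v_b = np.asarray(v_baryon_kms, dtype=float) * KMS
    g_newton = v_b**2 / r                       # Newtonian acceleration of the baryons
    nu = 0.5 + np.sqrt(0.25 + a0 / g_newton)    # MOND boost factor
    return np.sqrt(nu * g_newton * r) / KMS     # predicted speed in km/s
```

If the baryonic curve and the observed curve do not match through this mapping, there is nothing left to turn.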

As an illustration, Erwin and I tried to trick it. We took two galaxies that are identical in the Tully-Fisher plane (NGC 2403 and UGC 128) and swapped their mass distribution and rotation curve. These galaxies have the same total mass and the same flat velocity in the outer part of the rotation curve, but the detailed distribution of their baryons differs. If MOND can be fooled, this closely matched pair ought to do the trick. It does not.

An attempt to fit MOND to a hybrid galaxy with the rotation curve of NGC 2403 and the baryon distribution of UGC 128. The mass-to-light ratio is driven to unphysical values (6 in solar units), but an acceptable fit is not obtained.

Our failure to trick MOND should not surprise anyone who bothers to look at the math involved. There is a one-to-one relation between the distribution of the baryons and the resulting rotation curve. If there is a mismatch between them, a fit cannot be obtained.

We also attempted to play this same trick on dark matter. The standard dark matter halo fitting function at the time was the pseudo-isothermal halo, which has a constant density core. It is very similar to the halos of SIDM and to the cored dark matter halos produced by baryonic feedback in some simulations. Indeed, that is the point of those efforts: they  are trying to capture the success of cored dark matter halos in fitting rotation curve data.

A fit to the hybrid galaxy with a cored (pseudo-isothermal) dark matter halo. A satisfactory fit is readily obtained.

Dark matter halos with a quasi-constant density core do indeed provide good fits to rotation curves. Too good. They are easily fooled, because they have too many degrees of freedom. They will fit pretty much any plausible data that you throw at them. This is why the SIDM fit to DDO 154 failed to flag distance as a potential nuisance. It can’t. You could double (or halve) the distance and still find a good fit.
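
For reference, the pseudo-isothermal halo has a simple closed form, and the total model adds the scaled stellar, gas, and halo contributions in quadrature, as sketched below (function names are just for illustration). It is easy to see where the freedom comes from: a higher core density can compensate for a lower M*/L or a smaller core radius and return nearly the same total curve.

```python
import numpy as np

G = 4.301e-6  # kpc (km/s)^2 / Msun

def v_pseudo_isothermal(r_kpc, rho0_msun_kpc3, r_core_kpc):
    """Rotation speed of a pseudo-isothermal halo (constant-density core):
    V^2(r) = 4 pi G rho0 Rc^2 [1 - (Rc/r) arctan(r/Rc)]."""
    r = np.asarray(r_kpc, dtype=float)
    rc = r_core_kpc
    return np.sqrt(4.0 * np.pi * G * rho0_msun_kpc3 * rc**2
                   * (1.0 - (rc / r) * np.arctan(r / rc)))

def v_total(r_kpc, ml_star, v_star_kms, v_gas_kms, rho0, r_core):
    """Disk + gas + halo in quadrature; M*/L scales the stellar contribution."""
    return np.sqrt(ml_star * np.asarray(v_star_kms)**2
                   + np.asarray(v_gas_kms)**2
                   + v_pseudo_isothermal(r_kpc, rho0, r_core)**2)
```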

This is why parameter degeneracy is bad. You get lost in parameter space. Once lost there, it becomes impossible to distinguish between successful, physically meaningful fits and fitting epicycles.

Astronomical data are always subject to improvement. For example, the THINGS project obtained excellent data for a sample of nearby galaxies. I made MOND fits to all the THINGS (and other) data for the MOND review Famaey & McGaugh (2012). Here’s the residual diagram, which has been on my web page for many years:

Residuals of MOND fits from Famaey & McGaugh (2012).

These are, by and large, good fits. The residuals have a well-defined peak centered on zero. DDO 154 was one of the THINGS galaxies; let’s see what happens if we use those data.

The rotation curve of DDO 154 from THINGS (points with error bars). The Newtonian expectation for stars is the green line; the gas is the blue line. The red line is the MOND prediction. Note that the gas greatly outweighs the stars beyond 1.5 kpc; the stellar mass-to-light ratio has extremely little leverage in this MOND fit.

The first thing one is likely to notice is that the THINGS data are much better resolved than the previous generation used above. The first thing I noticed was that THINGS had assumed a distance of 4.3 Mpc. This was prior to the measurement of 4.04, so let’s just start over from there. That gives the MOND prediction shown above.

And it is a prediction. I haven’t adjusted any parameters yet. The mass-to-light ratio is set to the mean I expect for a star forming stellar population, 0.5 in solar units in the Spitzer 3.6 micron band. D=4.04 Mpc and i=66 as tabulated by THINGS. The result is pretty good considering that no parameters have been harmed in the making of this plot. Nevertheless, MOND overshoots a bit at large radii.

Constraining the inclinations for gas rich dwarf galaxies like DDO 154 is a bit of a nightmare. Literature values range from 20 to 70 degrees. Seriously. THINGS itself allows the inclination to vary with radius; 66 is just a typical value. Looking at the fit Pengfei obtained, i=61. Let’s try that.

MOND fit to the THINGS data for DDO 154 with the inclination adjusted to the value found by Li et al. (2018).

The fit is now satisfactory. One tweak to the inclination, and we’re done. This tweak isn’t even a fit to these data; it was adopted from Pengfei’s fit to the above data. This tweak to the inclination is comfortably within any plausible assessment of the uncertainty in this quantity. The change in sin(i) corresponds to a mere 4% in velocity. I could probably do a tiny bit better with further adjustment – I have left both the distance and the mass-to-light ratio fixed – but that would be a meaningless exercise in statistical masturbation. The result just falls out: no muss, no fuss.
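
The quoted 4% is easy to verify: since the deprojected speed is v_los/sin(i), lowering the inclination from 66° to 61° rescales the rotation curve by the ratio of the sines.

```python
import math
# sin(66)/sin(61) ~ 1.044, i.e. about a 4% change in the deprojected velocities.
print(math.sin(math.radians(66.0)) / math.sin(math.radians(61.0)))
```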

Hence the point Bob Sanders makes. Given the distribution of gas, the rotation curve follows. And it works, over and over and over, within the bounds of the uncertainties on the nuisance parameters.

One cannot do the same exercise with dark matter. It has ample ability to fit rotation curve data once they are provided, but zero power to predict them. If all had been well with ΛCDM, the rotation curves of these galaxies would look like NFW halos. Or any number of other permutations that have been discussed over the years. In contrast, MOND makes one unique prediction (that was not at all anticipated in dark matter), and that’s what the data do. Out of the huge parameter space of plausible outcomes from the messy hierarchical formation of galaxies in ΛCDM, Nature picks the one that looks exactly like MOND.

This outcome is illogical.

It is a bad sign for a theory when it can only survive by mimicking its alternative. This is the case here: ΛCDM must imitate MOND. There are now many papers asserting that it can do just this, but none of those were written before the data were provided. Indeed, I consider it to be problematic that clever people can come up with ways to imitate MOND with dark matter. What couldn’t it imitate? If the data had all looked like technicolor space donkeys, we could probably find a way to make that so as well.

Cosmologists will rush to say “microwave background!” I have some sympathy for that, because I do not know how to explain the microwave background in a MOND-like theory. At least I don’t pretend to, even if I have had more predictive success there than their entire community. But that would be a much longer post.

For now, note that the situation is even worse for dark matter than I have so far made it sound. In many dwarf galaxies, the rotation velocity exceeds that attributable to the baryons (with Newton alone) at practically all radii. By a lot. DDO 154 is a very dark matter dominated galaxy. The baryons should have squat to say about the dynamics. And yet, all you need to know to predict the dynamics is the baryon distribution. The baryonic tail wags the dark matter dog.

But wait, it gets better! If you look closely at the data, you will note a kink at about 1 kpc, another at 2, and yet another around 5 kpc. These kinks are apparent in both the rotation curve and the gas distribution. This is an example of Sancisi’s Law: “For any feature in the luminosity profile there is a corresponding feature in the rotation curve and vice versa.” This is a general rule, as Sancisi observed, but it makes no sense when the dark matter dominates. The features in the baryon distribution should not be reflected in the rotation curve.

The observed baryons orbit in a disk with nearly circular orbits confined to the same plane. The dark matter moves on eccentric orbits oriented every which way to provide pressure support to a quasi-spherical halo. The baryons and the dark matter occupy very different regions of phase space, the six-dimensional volume of position and momentum. The two are not strongly coupled, communicating only by the weak force of gravity in the standard CDM paradigm.

One of the first lessons of galaxy dynamics is that galaxy disks are subject to a variety of instabilities that grow bars and spiral arms. These are driven by disk self-gravity. The same features do not appear in elliptical galaxies because they are pressure supported, 3D blobs. They don’t have disks so they don’t have disk self-gravity, much less the features that lead to the bumps and wiggles observed in rotation curves.

Elliptical galaxies are a good visual analog for what dark matter halos are believed to be like. The orbits of dark matter particles are unable to sustain features like those seen in  baryonic disks. They are featureless for the same reasons as elliptical galaxies. They don’t have disks. A rotation curve dominated by a spherical dark matter halo should bear no trace of the features that are seen in the disk. And yet they’re there, often enough for Sancisi to have remarked on it as a general rule.

It gets worse still. One of the original motivations for invoking dark matter was to stabilize galactic disks: a purely Newtonian disk of stars is not a stable configuration, yet the universe is chock full of long-lived spiral galaxies. The cure was to place them in dark matter halos.

The problem for dwarfs is that they have too much dark matter. The halo stabilizes disks by  suppressing the formation of structures that stem from disk self-gravity. But you need some disk self-gravity to have the observed features. That can be tuned to work in bright spirals, but it fails in dwarfs because the halo is too massive. As a practical matter, there is no disk self-gravity in dwarfs – it is all halo, all the time. And yet, we do see such features. Not as strong as in big, bright spirals, but definitely present. Whenever someone tries to analyze this aspect of the problem, they inevitably come up with a requirement for more disk self-gravity in the form of unphysically high stellar mass-to-light ratios (something I predicted would happen). In contrast, this is entirely natural in MOND (see, e.g., Brada & Milgrom 1999 and Tiret & Combes 2008), where it is all disk self-gravity since there is no dark matter halo.

The net upshot of all this is that it doesn’t suffice to mimic the radial acceleration relation as many simulations now claim to do. That was not a natural part of CDM to begin with, but perhaps it can be done with smooth model galaxies. In most cases, such models lack the resolution to see the features seen in DDO 154 (and in NGC 1560, and in IC 2574, etc.). If they attain such resolution, they had better not show such features, as that would violate the basic considerations above. But then they wouldn’t be able to describe this aspect of the data.

Simulators by and large seem to remain sanguine that this will all work out. Perhaps I have become too cynical, but I recall hearing that 20 years ago. And 15. And ten… basically, they’ve always assured me that it will work out even though it never has. Maybe tomorrow will be different. Or would that be the definition of insanity?

 

 

It Must Be So. But which Must?

In the last post, I noted some of the sociological overtones underpinning attitudes about dark matter and modified gravity theories. I didn’t get as far as the more scientifically  interesting part, which  illustrates a common form of reasoning in physics.

About modified gravity theories, Bertone & Tait state

“the only way these theories can be reconciled with observations is by effectively, and very precisely, mimicking the behavior of cold dark matter on cosmological scales.”

Leaving aside just which observations need to be mimicked so precisely (I expect they mean power spectrum; perhaps they consider this to be so obvious that it need not be stated), this kind of reasoning is both common and powerful – and frequently correct. Indeed, this is exactly the attitude I expressed in my review a few years ago for the Canadian Journal of Physics, quoted in the image above. I get it. There are lots of positive things to be said for the standard cosmology.

The upshot of this reasoning is, in effect, that “cosmology works so well that non-baryonic dark matter must exist.” I have sympathy for this attitude, but I also remember many examples in the history of cosmology where it has gone badly wrong. There was a time, not so long ago, that the matter density had to be the critical value, and the Hubble constant had to be 50 km/s/Mpc. By and large, it is the same community that insisted on those falsehoods with great intensity that continues to insist on conventionally conceived cold dark matter with similarly fundamentalist insistence.

I think it is an overstatement to say that the successes of cosmology (as we presently perceive them) prove the existence of dark matter. A more conservative statement is that the ΛCDM cosmology is correct if, and only if, dark matter exists. But does it? That’s a separate question, which is why laboratory searches are so important – including null results. It was, after all, the null result of Michelson & Morley that ultimately put an end to the previous version of an invisible aetherial medium, and sparked a revolution in physics.

Here I point out that the same reasoning asserted by Bertone & Tait as a slam dunk in favor of dark matter can just as accurately be asserted in favor of MOND. To directly paraphrase the above statement:

“the only way ΛCDM can be reconciled with observations is by effectively, and very precisely, mimicking the behavior of MOND on galactic scales.”

This is a terrible problem for dark matter. Even if it were true, as is often asserted, that MOND only fits rotation curves, this would still be tantamount to a falsification of dark matter by the same reasoning applied by Bertone & Tait.

Let’s look at just one example, NGC 1560:

 

The rotation curve of NGC 1560 (points) together with the Newtonian expectation (black line) and the MOND fit (blue line). Data from Begeman et al. (1991) and Gentile et al. (2010).

MOND fits the details of this rotation curve in excruciating detail. It provides just the right amount of boost over the Newtonian expectation, which varies from galaxy to galaxy. Features in the baryon distribution are reflected in the rotation curve. That is required in MOND, but makes no sense in dark matter, where the excess velocity over the Newtonian expectation is attributed to a dynamically hot, dominant, quasi-spherical dark matter halo. Such entities cannot support the features commonly seen in thin, dynamically cold disks. Even if they could, there is no reason that features in the dominant dark matter halo should align with those in the disk: a sphere isn’t a disk. In short, it is impossible to explain this with dark matter – to the extent that anything is ever impossible for the invisible.

NGC 1560 is a famous case because it has such an obvious feature. It is common to dismiss this as some non-equilibrium fluke that should simply be ignored. That is always a dodgy path to tread, but might be OK if it were only this galaxy. But similar effects are seen over and over again, to the point that they earned an empirical moniker: Renzo’s Rule. Renzo’s rule is known to every serious student of rotation curves, but has not informed the development of most dark matter theory. Ignoring this information is like leaving money on the table.

MOND fits not just NGC 1560, but very nearly* every galaxy we measure. It does so with excruciatingly little freedom. The only physical fit parameter is the stellar mass-to-light ratio. The gas fraction of NGC 1560 is 75%, so M*/L plays little role. We understand enough about stellar populations to have an idea what to expect; MOND fits return mass-to-light ratios that compare well with the normalization, color dependence, and band-pass dependent scatter expected from stellar population synthesis models.

The mass-to-light ratio from MOND fits (points) in the blue (left panel) and near-infrared (right panel) pass-bands plotted against galaxy color (blue to the left, red to the right). From the perspective of stellar populations, one expects more scatter and a steeper color dependence in the blue band, as observed. The lines are stellar population models from Bell et al. (2003). These are completely independent, and have not been fit to the data in any way. One could hardly hope for better astrophysical agreement.

 

One can also fit rotation curve data with dark matter halos. These require a minimum of three parameters to MOND’s one. In addition to M*/L, one also needs at least two parameters to describe the dark matter halo of each galaxy – typically some characteristic mass and radius. In practice, one finds that such fits are horribly degenerate: one cannot cleanly constrain all three parameters, much less recover a sensible distribution of M*/L. One cannot construct the plot above simply by asking the data what they want, as one can with MOND.

The “disk-halo degeneracy” in dark matter halo fits to rotation curves has been much discussed in the literature. Obsessed over, dismissed, revived, and ultimately ignored without satisfactory understanding. Well, duh. This approach uses three parameters per galaxy when it takes only one to describe the data. Degeneracy between the excess fit parameters is inevitable.

From a probabilistic perspective, there is a huge volume of viable parameter space that could (and should) be occupied by galaxies composed of dark matter halos plus luminous galaxies. Two identical dark matter halos might host very different luminous galaxies, so would have rotation curves that differed with the baryonic component. Two similar looking galaxies might reside in rather different dark matter halos, again having rotation curves that differ.

The probabilistic volume in MOND is much smaller. Absolutely tiny by comparison. There is exactly one and only one thing each rotation curve can do: what the particular distribution of baryons in each galaxy says it should do. This is what we observe in Nature.

The only way ΛCDM can be reconciled with observations is by effectively, and very precisely, mimicking the behavior of MOND on galactic scales. There is a vast volume of parameter space that the rotation curves of galaxies could, in principle, inhabit. The naive expectation was exponential disks in NFW halos. Real galaxies don’t look like that. They look like MOND. Magically, out of the vast parameter space available to galaxies in the dark matter picture, they only ever pick the tiny sub-volume that very precisely mimics MOND.

The ratio of probabilities is huge. So many dark matter models are possible (and have been mooted over the years) that it is indefinably huge. The odds of observing MOND-like phenomenology in a ΛCDM universe are practically zero. This amounts to a practical falsification of dark matter.

I’ve never said dark matter is falsified, because I don’t think it is a falsifiable concept. It is like epicycles – you can always fudge it in some way. But at a practical level, it was falsified a long time ago.

That is not to say MOND has to be right. That would be falling into the same logical trap that says ΛCDM has to be right. Obviously, both have virtues that must be incorporated into whatever the final answer may be. There are some efforts in this direction, but by and large this is not how science is being conducted at present. The standard script is to privilege those data that conform most closely to our confirmation bias, and pour scorn on any contradictory narrative.

In my assessment, the probability of ultimate success through ignoring inconvenient data is practically zero. Unfortunately, that is the course upon which much of the field is currently set.


*There are of course exceptions: no data are perfect, so even the right theory will get it wrong once in a while. The goof rate for MOND fits is about what I expect: rare, but  more frequent for lower quality data. Misfits are sufficiently rare that to obsess over them is to refuse to see the forest for a few outlying trees.

Here’s a residual plot of MOND fits. See the peak at right? That’s the forest. See the tiny tail to one side? That’s an outlying tree.

Residuals of MOND rotation curve fits from Famaey & McGaugh (2012).

Dwarf Satellite Galaxies. III. The dwarfs of Andromeda

Like the Milky Way, our nearest giant neighbor, Andromeda (aka M31), has several dozen dwarf satellite galaxies. A few of these were known and had measured velocity dispersions at the time of my work with Joe Wolf, as discussed previously. Also like the Milky Way, the number of known objects has grown rapidly in recent years – thanks in this case largely to the PAndAS survey.

PAndAS imaged the area around M31 and M33, finding many individual red giant stars. These trace out the debris from interactions and mergers as small dwarfs are disrupted and consumed by their giant host. The survey also turned up previously unknown dwarf satellites.

The PAndAS survey field. Dwarf satellites are circled.

As the PAndAS survey started reporting the discovery of new dwarf satellites around Andromeda, it occurred to me that this provided the opportunity to make genuine a priori predictions. These are the gold standard of the scientific method. We could use the observed luminosity and size of the newly discovered dwarfs to predict their velocity dispersions.

I tried to do this for both ΛCDM and MOND. I will not discuss the ΛCDM case much, because it can’t really be done. But it is worth understanding why this is.

In ΛCDM, the velocity dispersion is determined by the dark matter halo. This has only a tenuous connection to the observed stars, so just knowing how big and bright a dwarf is doesn’t provide much predictive power about the halo. This can be seen from this figure by Tollerud et al (2011):

Virial mass of the dark matter halo as a function of galaxy luminosity. Dwarf satellites reside in the wide colored band of low luminosities.

This graph is obtained by relating the number density of galaxies (an observed quantity) to that of the dark matter halos in which they reside (a theoretical construct). It is highly non-linear, deviating strongly from the one-to-one line we expected early on. There is no reason to expect this particular relation; it is imposed on us by the fact that the observed luminosity function of galaxies is rather flat while the predicted halo mass function is steep. Nowadays, this is usually called the missing satellite problem, but that is a misnomer, as the mismatch is not peculiar to satellites: it pervades the field.

Addressing the missing satellites problem would be another long post, so let’s just accept that the relation between mass and light has to follow something like that illustrated above. If a dwarf galaxy has a luminosity of a million suns, one can read off the graph that it should live in a dark halo with a mass of about 10^10 M☉. One could use this to predict the velocity dispersion, but not very precisely, because there’s a big range corresponding to that luminosity (the bands in the figure). It could be as much as 10^11 M☉ or as little as 10^9 M☉. This corresponds to a wide range of velocity dispersions. This wide range is unavoidable because of the difference in the luminosity function and halo mass function. Small variations in one lead to big variations in the other, and some scatter in dark halo properties is unavoidable.

Consequently, we only have a vague range of expected velocity dispersions in ΛCDM. In practice, we never make this prediction. Instead, we compare the observed velocity dispersion to the luminosity and say “gee, this galaxy has a lot of dark matter” or “hey, this one doesn’t have much dark matter.” There’s no rigorously testable prior.

In MOND, what you see is what you get. The velocity dispersion has to follow from the observed stellar mass. This is straightforward for isolated galaxies: M* ∝ σ^4 – this is essentially the equivalent of the Tully-Fisher relation for pressure supported systems. If we can estimate the stellar mass from the observed luminosity, the predicted velocity dispersion follows.

Many dwarf satellites are not isolated in the MONDian sense: they are subject to the external field effect (EFE) from their giant hosts. The over-under for whether the EFE applies is the point when the internal acceleration from all the stars of the dwarf on each other is equal to the external acceleration from orbiting the giant host. The amplitude of the discrepancy in MOND depends on how low the total acceleration is relative to the critical scale a0. The external field in effect adds some acceleration that wouldn’t otherwise be there, making the discrepancy less than it would be for an isolated object. This means that two otherwise identical dwarfs may be predicted to have different velocity dispersions, depending on whether or not they are subject to the EFE. This is a unique prediction of MOND that has no analog in ΛCDM.

It is straightforward to derive the equation to predict velocity dispersions in the extreme limits of isolated (aex ≪ ain < a0) or EFE dominated (ain ≪ aex < a0) objects. In reality, there are many objects for which ain ≈ aex, and no simple formula applies. In practice, we apply the formula that more nearly applies, and pray that this approximation is good enough.
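
In the isolated limit that equation is compact: the deep-MOND isothermal estimate is σ^4 = (4/81) G M* a0, the pressure-supported analog of Tully-Fisher mentioned above. The sketch below implements just this limit; the EFE-dominated limit also involves the size of the system and the strength of the external field, and is omitted here.

```python
G = 6.674e-11    # m^3 kg^-1 s^-2
A0 = 1.2e-10     # m/s^2
MSUN = 1.989e30  # kg

def sigma_isolated_kms(m_star_msun, a0=A0):
    """Predicted velocity dispersion of an isolated, deep-MOND dwarf:
    sigma = (4 G M a0 / 81)^(1/4), returned in km/s."""
    return (4.0 * G * m_star_msun * MSUN * a0 / 81.0) ** 0.25 / 1.0e3

# e.g. a dwarf with a few hundred thousand solar masses of stars (illustrative):
print(sigma_isolated_kms(3e5))  # ~4 km/s
```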

There are many other assumptions and approximations that must be made in any theory: that an object is spherical, isotropic, and in dynamical equilibrium. All of these must fail at some level, but it is the last one that is the most serious concern. In the case of the EFE, one must also make the approximation that the object is in equilibrium at the current level of the external field. That is never true, as both the amplitude and the vector of the external field vary as a dwarf orbits its host. But it might be an adequate approximation if this variation is slow. In the case of a circular orbit, only the vector varies. In general the orbits are not known, so we make the instantaneous approximation and once again pray that it is good enough. There is a fairly narrow window between where the EFE becomes important and where we slip into the regime of tidal disruption, but let’s plow ahead and see how far we can get, bearing in mind that the EFE is a dynamical variable of which we only have a snapshot.

To predict the velocity dispersion in the isolated case, all we need to know is the luminosity and a stellar mass-to-light ratio. Assuming the dwarfs of Andromeda to be old stellar populations, I adopted a V-band mass-to-light ratio of 2 give or take a factor of 2. That usually dominates the uncertainty, though the error in the distance can sometimes impact the luminosity at a level that impacts the prediction.

To predict the velocity dispersion in the EFE case, we again need the stellar mass, but now also need to know the size of the stellar system and the intensity of the external field to which it is subject. The latter depends on the mass of the host galaxy and the distance from it to the dwarf. This latter quantity is somewhat fraught: it is straightforward to measure the projected distance on the sky, but we need the 3D distance – how far in front or behind each dwarf is as well as its projected distance from the host. This is often a considerable contributor to the error budget. Indeed, some dwarfs may be inferred to be in the EFE regime for the low end of the range of adopted stellar mass-to-light ratio, and the isolated regime for the high end.
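
The external field itself is simple to estimate: for a host with a roughly flat rotation curve, the acceleration it exerts at distance D is just V²/D. The numbers below are representative rather than adopted values.

```python
KPC = 3.086e19  # m per kpc
A0 = 1.2e-10    # m/s^2

def external_field(v_host_kms, d_kpc):
    """External acceleration from a host with a flat rotation curve: g_ex = V^2 / D."""
    return (v_host_kms * 1.0e3) ** 2 / (d_kpc * KPC)

# A dwarf ~200 kpc from a host rotating at ~230 km/s (illustrative numbers):
g_ex = external_field(230.0, 200.0)
print(g_ex, g_ex / A0)  # ~8.6e-12 m/s^2, a modest fraction of a0
```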

In this fashion, we predicted velocity dispersions for the dwarfs of Andromeda. We in this case were Milgrom and myself. I had never collaborated with him before, and prefer to remain independent. But I also wanted to be sure I got the details described above right. Though it wasn’t much work to make the predictions once the preliminaries were established, it was time consuming to collect and vet the data. As we were writing the paper, velocity dispersion measurements started to appear. People like Michelle Collins, Erik Tollerud, and Nicolas Martin were making follow-up observations, and publishing velocity dispersions for the objects we were making predictions for. That was great, but they were too good – they were observing and publishing faster than we could write!

Nevertheless, we managed to make and publish a priori predictions for 10 dwarfs before any observational measurements were published. We also made blind predictions for the other known dwarfs of Andromeda, and checked the predicted velocity dispersions against all measurements that we could find in the literature. Many of these predictions were quickly tested by on-going programs (i.e., people were out to measure velocity dispersions, whether we predicted them or not). Enough data rolled in that we were soon able to write a follow-up paper testing our predictions.

Nailed it. Good data were soon available to test the predictions for 8 of the 10* a priori cases. All 8 were consistent with our predictions. I was particularly struck by the case of And XXVIII, which I had called out as perhaps the best test. It was isolated, so the messiness of the EFE didn’t apply, and the uncertainties were low. Moreover, the predicted velocity dispersion was low – a good deal lower than broadly expected in ΛCDM: 4.3 km/s, with an uncertainty just under 1 km/s. Two independent observations were subsequently reported. One found 4.9 ± 1.6 km/s, the other 6.6 ± 2.1 km/s, both in good agreement within the uncertainties.

We made further predictions in the second paper as people had continued to discover new dwarfs. These also came true. Here is a summary plot for all of the dwarfs of Andromeda:

The velocity dispersions of the dwarf satellites of Andromeda. Each numbered box corresponds to one dwarf (x=1 is for And I and so on). Measured velocity dispersions have a number next to them that is the number of stars on which the measurement is based. MOND predictions are circles: green if isolated, open if the EFE applies. Points appear within each box in the order they appeared in the literature, from left to right. The vast majority of Andromeda’s dwarfs are consistent with MOND (large green circles). Two cases are ambiguous (large yellow circles), having velocity dispersions based on only a few stars. Only And V appears to be problematic (large red circle).

MOND works well for And I, And II, And III, And VI, And VII, And IX, And X, And XI, And XII, And XIII, And XIV, And XV, And XVI, And XVII, And XVIII, And XIX, And XX, And XXI, And XXII, And XXIII, And XXIV, And XXV, And XXVIII, And XXIX, And XXXI, And XXXII, and And XXXIII. There is one problematic case: And V. I don’t know what is going on there, but note that systematic errors frequently happen in astronomy. It’d be strange if there weren’t at least one goofy case.

Nevertheless, the failure of And V could be construed as a falsification of MOND. It ought to work in every single case. But recall the discussion of assumptions and uncertainties above. Is falsification really the story these data tell?

We do have experience with various systematic errors. For example, we predicted that the isolated dwarf spheroidal Cetus should have a velocity dispersion in MOND of 8.2 km/s. There was already a published measurement of 17 ± 2 km/s, so we reported that MOND was wrong in this case by over 3σ. Or at least we started to do so. Right before we submitted that paper, a new measurement appeared: 8.3 ± 1 km/s. This is an example of how the data can sometimes change by rather more than the formal error bars suggest is possible. In this case, I suspect the original observations lacked the spectral resolution to resolve the velocity dispersion. At any rate, the new measurement (8.3 km/s) was somewhat more consistent with our prediction (8.2 km/s).

The same predictions cannot even be made in ΛCDM. The velocity data can always be fit once they are in hand. But there is no agreed method to predict the velocity dispersion of a dwarf from its observed luminosity. As discussed above, this should not even be possible: there is too much scatter in the halo mass-stellar mass relation at these low masses.

An unsung predictive success of MOND absent from the graph above is And IV. When And IV was discovered in the general direction of Andromeda, it was assumed to be a new dwarf satellite – hence the name. Milgrom looked at the velocities reported for this object, and said it had to be a background galaxy. No way it could be a dwarf satellite – at least not in MOND. I see no reason why it couldn’t have been in ΛCDM. It is absent from the graph above, because it was subsequently confirmed to be much farther away (7.2 Mpc vs. 750 kpc for Andromeda).

The box for And XXVII is empty because this system is manifestly out of equilibrium. It is more of a stellar stream than a dwarf, appearing as a smear in the PAndAS image rather than as a self-contained dwarf. I do not recall what the story with the other missing object (And VIII) is.

While writing the follow-up paper, I also noticed that there were a number of Andromeda dwarfs that were photometrically indistinguishable: basically the same in terms of size and stellar mass. But some were isolated while others were subject to the EFE. MOND predicts that the EFE cases should have lower velocity dispersion than the isolated equivalents.

The velocity dispersions of the dwarfs of Andromeda, highlighting photometrically matched pairs – dwarfs that should be indistinguishable, but aren’t because of the EFE.

And XXVIII (isolated) has a higher velocity dispersion than its near-twin And XVII (EFE). The same effect might be acting in And XVIII (isolated) and And XXV (EFE). This is clear if we accept the higher velocity dispersion measurement for And XVIII, but an independent measurement begs to differ. The former has more stars, so is probably more reliable, but we should be cautious. The effect is not clear in And XVI (isolated) and And XXI (EFE), but the difference in the prediction is small and the uncertainties are large.

An aggressive person might argue that these pairs of dwarfs constitute a positive detection of the EFE. I don’t think the data for the matched pairs warrant that, at least not yet. On the other hand, the appropriate use of the EFE was essential to all the predictions, not just the matched pairs.

The positive detection of the EFE is important, as it is a unique prediction of MOND. I see no way to tune ΛCDM galaxy simulations to mimic this effect. Of course, there was a  very recent time when it seemed impossible for them to mimic the isolated predictions of MOND. They claim to have come a long way in that regard.

But that’s what we’re stuck with: tuning ΛCDM to make it look like MOND. This is why a priori predictions are important. There is ample flexibility to explain just about anything with dark matter. What we can’t seem to do is predict the same things that MOND successfully predicts… predictions that are both quantitative and very specific. We’re not arguing that dwarfs in general live in ~15 or 30 km/s halos, as we must in ΛCDM. In MOND we can say this dwarf will have this velocity dispersion and that dwarf will have that velocity dispersion. We can distinguish between 4.9 and 7.3 km/s. And we can do it over and over and over. I see no way to do the equivalent in ΛCDM, just as I see no way to explain the acoustic power spectrum of the CMB in MOND.

This is not to say there are no problematic cases for MOND. Read, Walker, & Steger have recently highlighted the matched pair of Draco and Carina as an issue. And they are – though here I already have reason to suspect Draco is out of equilibrium, which makes it challenging to analyze. Whether it is actually out of equilibrium or not is a separate question.

I am not thrilled that we are obliged to invoke non-equilibrium effects in both theories. But there is a difference. Brada & Milgrom provided a quantitative criterion to indicate when this was an issue before I ran into the problem. In ΛCDM, the low velocity dispersions of objects like And XIX, XXI, XXV and Crater 2 came as a complete surprise despite having been predicted by MOND. Tidal disruption was only invoked after the fact – and in an ad hoc fashion. There is no way to know in advance which dwarfs are affected, as there is no criterion equivalent to that of Brada. We just say “gee, that’s a low velocity dispersion. Must have been disrupted.” That might be true, but it gives no explanation for why MOND predicted it in the first place – which is to say, it isn’t really an explanation at all.

I still do not understand why MOND gets any predictions right if ΛCDM is the universe we live in, let alone so many. Shouldn’t happen. Makes no sense.

If this doesn’t confuse you, you are not thinking clearly.


*The other two dwarfs were also measured, but with only 4 stars in one and 6 in the other. These are too few for a meaningful velocity dispersion measurement.

Dwarf Satellite Galaxies and Low Surface Brightness Galaxies in the Field. I.

The Milky Way and its nearest giant neighbor Andromeda (M31) are surrounded by a swarm of dwarf satellite galaxies. Aside from relatively large beasties like the Large Magellanic Cloud or M32, the majority of these are the so-called dwarf spheroidals. There are several dozen examples known around each giant host, like the Fornax dwarf pictured above.

Dwarf Spheroidal (dSph) galaxies are ellipsoidal blobs devoid of gas that typically contain a million stars, give or take an order of magnitude. Unlike globular clusters, which may have a similar star count, dSphs are diffuse, with characteristic sizes of hundreds of parsecs (vs. a few pc for globulars). This makes them among the lowest surface brightness systems known.

This subject has a long history, and has become a major industry in recent years. In addition to the “classical” dwarfs that have been known for decades, there have also been many comparatively recent discoveries, often of what have come to be called “ultrafaint” dwarfs. These are basically dSphs with luminosities less than 100,000 suns, sometimes comprising as few as a few hundred stars. New discoveries are still being made, and there is reason to hope that the LSST will discover many more. Summed up, the known dwarf satellites are proverbial drops in the bucket compared to their giant hosts, which contain hundreds of billions of stars. Dwarfs could rain in for a Hubble time and not perturb the mass budget of the Milky Way.

Nevertheless, tiny dwarf Spheroidals are excellent tests of theories like CDM and MOND. Going back to the beginning, in the early ’80s, Milgrom was already engaged in a discussion about the predictions of his then-new theory (before it was even published) with colleagues at the IAS, where he had developed the idea during a sabbatical visit. They were understandably skeptical, preferring – as many still do – to believe that some unseen mass was the more conservative hypothesis. Dwarf spheroidals came up even then, as their very low surface brightness meant low acceleration in MOND. This in turn meant large mass discrepancies. If you could measure their dynamics, they would have large mass-to-light ratios. Larger than could be explained by stars conventionally, and larger than the discrepancies already observed in bright galaxies like Andromeda.

This prediction of Milgrom’s – there from the very beginning – is important because of how things change (or don’t). At that time, Scott Tremaine summed up the contrasting expectation of the conventional dark matter picture:

“There is no reason to expect that dwarfs will have more dark matter than bright galaxies.” *

This was certainly the picture I had in my head when I first became interested in low surface brightness (LSB) galaxies in the mid-80s. At that time I was ignorant of MOND; my interest was piqued by the argument of Disney that there could be a lot of as-yet undiscovered LSB galaxies out there, combined with my first observing experiences with the then-newfangled CCD cameras which seemed to have a proclivity for making clear otherwise hard-to-see LSB features. At the time, I was interested in finding LSB galaxies. My interest in what made them rotate came  later.

The first indication, to my knowledge, that dSph galaxies might have large mass discrepancies was provided by Marc Aaronson in 1983. This tentative discovery was hugely important, but the velocity dispersion of Draco (one of the “classical” dwarfs) was based on only 3 stars, so was hardly definitive. Nevertheless, by the end of the ’90s, it was clear that large mass discrepancies were a defining characteristic of dSphs. Their conventionally computed M/L went up systematically as their luminosity declined. This was not what we had expected in the dark matter picture, but was, at least qualitatively, in agreement with MOND.

My own interests had focused more on LSB galaxies in the field than on dwarf satellites like Draco. Greg Bothun and Jim Schombert had identified enough of these to construct a long list of LSB galaxies that served as targets for my Ph.D. thesis. Unlike the pressure-supported ellipsoidal blobs of stars that are the dSphs, the field LSBs we studied were gas rich, rotationally supported disks – mostly late type galaxies (Sd, Sm, & Irregulars). Regardless of composition, gas or stars, low surface density means that MOND predicts low acceleration. This need not be true conventionally, as the dark matter can do whatever the heck it wants. Though I was blissfully unaware of it at the time, we had constructed the perfect sample for testing MOND.

Having studied the properties of our sample of LSB galaxies, I developed strong ideas about their formation and evolution. Everything we had learned – their blue colors, large gas fractions, and low star formation rates – suggested that they evolved slowly compared to higher surface brightness galaxies. Star formation gradually sputtered along, having a hard time gathering enough material to make stars in their low density interstellar media. Perhaps they even formed late, an idea I took a shining to in the early ’90s. This made two predictions: field LSB galaxies should be less strongly clustered than bright galaxies, and should spin slower at a given mass.

The first prediction follows because the collapse time of dark matter halos correlates with their larger scale environment. Dense things collapse first and tend to live in dense environments. If LSBs were low surface density because they collapsed late, it followed that they should live in less dense environments.

I didn’t know how to test this prediction. Fortunately, fellow postdoc and office mate in Cambridge at the time, Houjun Mo, did. It came true. The LSB galaxies I had been studying were clustered like other galaxies, but not as strongly. This was exactly what I expected, and I thought sure we were on to something. All that remained was to confirm the second prediction.

At the time, we did not have a clear idea of what dark matter halos should be like. NFW halos were still in the future. So it seemed reasonable that late forming halos should have lower densities (lower concentrations in the modern terminology). More importantly, the sum of dark and luminous density was certainly less. Dynamics follow from the distribution of mass as Velocity² ∝ Mass/Radius. For a given mass, low surface brightness galaxies had a larger radius, by construction. Even if the dark matter didn’t play along, the reduction in the concentration of the luminous mass should lower the rotation velocity.
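
To put a number on that expectation (a back-of-the-envelope scaling, nothing more):

```latex
V \simeq \sqrt{\frac{GM}{R}} \;\propto\; R^{-1/2} \quad (\text{at fixed } M),
```

so a galaxy four times more extended than another of the same mass should, naively, rotate about a factor of two slower.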

Indeed, the standard explanation of the Tully-Fisher relation was just this. Aaronson, Huchra, & Mould had argued that galaxies obeyed the Tully-Fisher relation because they all had essentially the same surface brightness (Freeman’s law) thereby taking variation in the radius out of the equation: galaxies of the same mass all had the same radius. (If you are a young astronomer who has never heard of Freeman’s law, you’re welcome.) With our LSB galaxies, we had a sample that, by definition, violated Freeman’s law. They had large radii for a given mass. Consequently, they should have lower rotation velocities.
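
Written out, the Aaronson et al. argument is one line of algebra (the notation here is mine):

```latex
V^2 \simeq \frac{GM}{R}, \qquad \Sigma \equiv \frac{L}{\pi R^2}
\;\;\Longrightarrow\;\;
V^4 \simeq \pi G^2 \left(\frac{M}{L}\right)^{2} \Sigma\, L .
```

With M/L and Σ (Freeman’s law) roughly constant, V⁴ ∝ L is the Tully-Fisher relation. A sample whose surface brightness is ten times lower at a given luminosity should then rotate slower by a factor of 10^(1/4) ≈ 1.8. That is the shift we went looking for.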

Up to that point, I had not taken much interest in rotation curves. In contrast, colleagues at the University of Groningen were all about rotation curves. Working with Thijs van der Hulst, Erwin de Blok, and Martin Zwaan, we set out to quantify where LSB galaxies fell in relation to the Tully-Fisher relation. I confidently predicted that they would shift off of it – an expectation shared by many at the time. They did not.

[Figure] The Tully-Fisher relation: disk mass vs. flat rotation speed (circa 1996). Galaxies are binned by surface brightness, with the highest surface brightness galaxies marked red and the lowest blue. The lines show the expected shift following the argument of Aaronson et al. Contrary to this expectation, galaxies of all surface brightnesses follow the same Tully-Fisher relation.

I was flummoxed. My prediction was wrong. That of Aaronson et al. was wrong. Poking about the literature, everyone who had made a clear prediction in the conventional context was wrong. It made no sense.

I spent months banging my head against the wall. One quick and easy solution was to blame the dark matter. Maybe the rotation velocity was set entirely by the dark matter, and the distribution of luminous mass didn’t come into it. Surely that’s what the flat rotation velocity was telling us? All about the dark matter halo?

Problem is, we measure the velocity where the luminous mass still matters. In galaxies like the Milky Way, it matters quite a lot. It does not work to imagine that the flat rotation velocity is set by some property of the dark matter halo alone. What matters to what we measure is the combination of luminous and dark mass. The luminous mass is important in high surface brightness galaxies, and progressively less so in lower surface brightness galaxies. That should leave some kind of mark on the Tully-Fisher relation, but it doesn’t.
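
In the standard language of rotation curve mass modelling (with Υ★ the stellar mass-to-light ratio), what gets fit is the quadrature sum of the baryonic and halo contributions:

```latex
V_{\mathrm{obs}}^2(r) \;=\; V_{\mathrm{bar}}^2(r) + V_{\mathrm{halo}}^2(r),
\qquad
V_{\mathrm{bar}}^2 \;=\; V_{\mathrm{gas}}^2 + \Upsilon_\star V_{\mathrm{disk}}^2 ,
```

so the halo term can only dominate the measurement where the baryonic term is negligible – which it is not in high surface brightness galaxies.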

[Figure] Residuals from the Tully-Fisher relation as a function of size at a given mass. Compact galaxies are to the left, diffuse ones to the right. The red dashed line is what Newton predicts: more compact galaxies should rotate faster at a given mass. Fundamental physics? Tully-Fisher don’t care. Tully-Fisher don’t give a sh*t.

I worked long and hard to understand this in terms of dark matter. Every time I thought I had found the solution, I realized that it was a tautology. Somewhere along the line, I had made an assumption that guaranteed that I got the answer I wanted. It was a hopeless fine-tuning problem. The only way to satisfy the data was to have the dark matter contribution scale up as that of the luminous mass scaled down. The more stretched out the light, the more compact the dark – in exact balance to maintain zero shift in Tully-Fisher.

This made no sense at all. Over twenty years on, I have yet to hear a satisfactory conventional explanation. Most workers seem to assert, in effect, that “dark matter does it” and move along. Perhaps they are wise to do so.

[Figure] Working on the thing can drive you mad.

As I was struggling with this issue, I happened to hear a talk by Milgrom. I almost didn’t go. “Modified gravity” was in the title, and I remember thinking, “why waste my time listening to that nonsense?” Nevertheless, against my better judgement, I went. Not knowing that anyone in the audience worked on either LSB galaxies or Tully-Fisher, Milgrom proceeded to derive the MOND prediction:

“The asymptotic circular velocity is determined only by the total mass of the galaxy: Vf⁴ = a0GM.”

In a few lines, he derived rather trivially what I had been struggling to understand for months. The lack of surface brightness dependence in Tully-Fisher was entirely natural in MOND. It falls right out of the modified force law, and had been explicitly predicted over a decade before I struggled with the problem.
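
For a feel of the numbers, here is a minimal sketch of that prediction in Python; the baryonic mass is an illustrative round value, not data from this post:

```python
# Minimal numerical check of Milgrom's prediction Vf^4 = a0 * G * M.

G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
a0 = 1.2e-10       # Milgrom's acceleration constant [m s^-2]
M_SUN = 1.989e30   # solar mass [kg]

def vf_mond(m_baryon_kg):
    """Asymptotic (flat) rotation speed predicted by MOND, in m/s."""
    return (a0 * G * m_baryon_kg) ** 0.25

m_baryon = 1e10 * M_SUN          # 10^10 solar masses worth of baryons (illustrative)
print(vf_mond(m_baryon) / 1e3)   # ~112 km/s, with no reference to how the
                                 # baryons are distributed
```

The surface brightness simply does not appear: only the total baryonic mass matters.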

I scraped my jaw off the floor, determined to examine this crazy theory more closely. By the time I got back to my office, cognitive dissonance had already started to set in. Couldn’t be true. I had more pressing projects to complete, so I didn’t think about it again for many moons.

When I did, I decided I should start by reading the original MOND papers. I was delighted to find a long list of predictions, many of them specifically to do with surface brightness. We had just collected fresh data on LSB galaxies, which provided a new window on the low acceleration regime. I had the data to finally falsify this stupid theory.

Or so I thought. As I went through the list of predictions, my assumption that MOND had to be wrong was challenged by each item. It was barely an afternoon’s work: check, check, check. Everything I had struggled for months to understand in terms of dark matter tumbled straight out of MOND.

I was faced with a choice. I knew this would be an unpopular result. I could walk away and simply pretend I had never run across it. That’s certainly how it had been up until then: I had been blissfully unaware of MOND and its perniciously successful predictions. No need to admit otherwise.

Had I realized just how unpopular it would prove to be, maybe that would have been the wiser course. But even contemplating such a course felt criminal. I was put in mind of Paul Gerhardt’s admonition for intellectual honesty:

“When a man lies, he murders some part of the world.”

Ignoring what I had learned seemed tantamount to just that. So many predictions coming true couldn’t be an accident. There was a deep clue here; ignoring it wasn’t going to bring us closer to the truth. Actively denying it would be an act of wanton vandalism against the scientific method.

Still, I tried. I looked long and hard for reasons not to report what I had found. Surely there must be some reason this could not be so?

Indeed, the literature provided many papers that claimed to falsify MOND. To my shock, few withstood critical examination. Commonly a straw man representing MOND was falsified, not MOND itself. At a deeper level, it was implicitly assumed that any problem for MOND was an automatic victory for dark matter. This did not obviously follow, so I started re-doing the analyses for both dark matter and MOND. More often than not, I found either that the problems for MOND were greatly exaggerated, or that the genuinely problematic cases were a problem for both theories. Dark matter has more flexibility to explain outliers, but outliers happen in astronomy. All too often the temptation was to refuse to see the forest for a few trees.

The first MOND analysis of the classical dwarf spheroidals provides a good example. Completed only a few years before I encountered the problem, it examined low surface brightness systems deep in the MOND regime. These were gas poor, pressure supported dSph galaxies, unlike my gas rich, rotating LSB galaxies, but the critical feature was low surface brightness. This was the most directly comparable result. Better yet, the study had been made by two brilliant scientists (Ortwin Gerhard & David Spergel) whom I admire enormously. Surely this work would explain how my result was a mere curiosity.

Indeed, reading their abstract, it was clear that MOND did not work for the dwarf spheroidals. Whew: LSB systems where it doesn’t work. All I had to do was figure out why, so I read the paper.

As I read beyond the abstract, the answer became less and less clear. The results were all over the map. Two dwarfs (Sculptor and Carina) seemed unobjectionable in MOND. Two dwarfs (Draco and Ursa Minor) had mass-to-light ratios that were too high for stars, even in MOND. That is, there still appeared to be a need for dark matter even after MOND had been applied. On the flip side, Fornax had a mass-to-light ratio that was too low for the old stellar populations assumed to dominate dwarf spheroidals. Results all over the map are par for the course in astronomy, especially for a pioneering attempt like this. What were the uncertainties?

Milgrom wrote a rebuttal. By then, there were measured velocity dispersions for two more dwarfs. Of these seven dwarfs, he found that

“within just the quoted errors on the velocity dispersions and the luminosities, the MOND M/L values for all seven dwarfs are perfectly consistent with stellar values, with no need for dark matter.”

Well, he would say that, wouldn’t he? I determined to repeat the analysis and error propagation.

[Figure] Mass-to-light ratios determined with MOND for eight dwarf spheroidals (named, as published in McGaugh & de Blok 1998). The various symbols refer to different determinations. Mine are the solid circles. The dashed lines show the plausible range for stellar populations.

The net result: they were both right. M/L was still too high for Draco and Ursa Minor, and still too low for Fornax. But this was only significant at the 2σ level, if that – hardly enough to condemn a theory. Carina, Leo I, Leo II, Sculptor, and Sextans all had fairly reasonable mass-to-light ratios. The tally was now different: instead of going 2 for 5 as Gerhard & Spergel found, MOND was now 5 for 8. One could choose to obsess about the outliers, or one could choose to see the more positive pattern; either spin could be put on the result. But it was clearly more positive than the first attempt had indicated.

The mass estimator in MOND scales as the fourth power of velocity (or velocity dispersion in the case of isolated dSphs), so the too-high M*/L of Draco and Ursa Minor didn’t disturb me too much. A small overestimation of the velocity dispersion would lead to a large overestimation of the mass-to-light ratio. Just about every systematic uncertainty one can think of pushes in this direction, so it would be surprising if such an overestimate didn’t happen once in a while.
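
That fourth power makes the error budget unforgiving; a quick propagation shows why:

```latex
M \propto \sigma^4
\;\;\Longrightarrow\;\;
\frac{\delta M}{M} \approx 4\,\frac{\delta\sigma}{\sigma},
```

so a mere 10% overestimate of the velocity dispersion inflates the inferred M/L by a factor of 1.1⁴ ≈ 1.5.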

Given this, I was more concerned about the low M*/L of Fornax. That was weird.

Up until that point (1998), we had been assuming that the stars in dSphs were all old, like those in globular clusters. That corresponds to a high M*/L, maybe 3 in solar units in the V-band. Shortly after this time, people started to look closely at the stars in the classical dwarfs with the Hubble. Lo and behold, the stars in Fornax were surprisingly young. That means a low M*/L, 1 or less. In retrospect, MOND was trying to tell us that: it returned a low M*/L for Fornax because the stars there are young. So what was taken to be a failing of the theory was actually a predictive success.

Hmm.

And Gee. This is a long post. There is a lot more to tell, but enough for now.


*I have a long memory, but it is not perfect. I doubt I have the exact wording right, but this does accurately capture the sentiment from the early ’80s when I was an undergraduate at MIT and Scott Tremaine was on the faculty there.

A brief history of the acceleration discrepancy

As soon as I wrote it, I realized that the title is much more general than anything that can be fit in a blog post. Bekenstein argued long ago that the missing mass problem should instead be called the acceleration discrepancy, because that’s what it is – a discrepancy that occurs in conventional dynamics at a particular acceleration scale. So in that sense, it is the entire history of dark matter. For that, I recommend the excellent book The Dark Matter Problem: A Historical Perspective by Bob Sanders.

Here I mean more specifically my own attempts to empirically constrain the relation between the mass discrepancy and acceleration. Milgrom introduced MOND in 1983, no doubt after a long period of development and refereeing. He anticipated essentially all of what I’m going to describe. But not everyone is eager to accept MOND as a new fundamental theory, and many suffer from a very human tendency to confuse fact and theory. So I have gone out of my way to demonstrate what is empirically true in the data – facts – irrespective of theoretical interpretation (MOND or otherwise).

What is empirically true, and now observationally established beyond a reasonable doubt, is that the mass discrepancy in rotating galaxies correlates with centripetal acceleration. The lower the acceleration, the more dark matter one appears to need. Or, as Bekenstein might have put it, the amplitude of the acceleration discrepancy grows as the acceleration itself declines.

Bob Sanders made the first empirical demonstration that I am aware of that the mass discrepancy correlates with acceleration. In a wide ranging and still relevant 1990 review, he showed that the amplitude of the mass discrepancy correlated with the acceleration at the last measured point of a rotation curve. It did not correlate with radius.

[Figure] The acceleration discrepancy from Sanders (1990).

I was completely unaware of this when I became interested in the problem a few years later. I wound up reinventing the very same term – the mass discrepancy, which I defined as the ratio of dynamically measured mass to that visible in baryons: D = Mtot/Mbar. When there is no dark matter, Mtot = Mbar and D = 1.
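
In practice, for a rotating galaxy, this pointwise discrepancy reduces to a ratio of squared velocities. Here is a minimal sketch, with made-up numbers standing in for real rotation curve data:

```python
import numpy as np

# Pointwise mass discrepancy along a rotation curve:
# D(r) = Mtot(<r) / Mbar(<r) ~ Vobs(r)^2 / Vbar(r)^2
# in the usual spherical-equivalent shorthand.

# Illustrative numbers only -- not real data.
r     = np.array([ 2.0,   5.0,  10.0,  20.0])   # radius [kpc]
v_obs = np.array([90.0, 110.0, 115.0, 115.0])   # observed rotation speed [km/s]
v_bar = np.array([85.0,  95.0,  80.0,  60.0])   # stars + gas contribution [km/s]

D     = (v_obs / v_bar) ** 2    # mass discrepancy at each radius
g_bar = v_bar ** 2 / r          # baryonic acceleration [km^2 s^-2 kpc^-1]

print(np.round(D, 2))           # rises outward...
print(np.round(g_bar, 1))       # ...as the baryonic acceleration falls
```

The question is then whether D correlates better with radius or with acceleration.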

My first demonstration of this effect was presented at a conference at Rutgers in 1998. This considered the mass discrepancy at every radius and every acceleration within all the galaxies that were available to me at that time. Though messy, as is often the case in extragalactic astronomy, the correlation was clear. Indeed, this was part of a broader review of galaxy formation; the title, abstract, and much of the substance remains relevant today.

[Figure] The mass discrepancy – the ratio of dynamically measured mass to that visible in luminous stars and gas – as a function of centripetal acceleration. Each point is a measurement along a rotation curve; two dozen galaxies are plotted together. A constant mass-to-light ratio is assumed for all galaxies.

I spent much of the following five years collecting more data, refining the analysis, and sweating the details of uncertainties and systematic instrumental effects. In 2004, I published an extended and improved version, now with over 5 dozen galaxies.

[Figure] One panel from Fig. 5 of McGaugh (2004). The mass discrepancy is plotted against the acceleration predicted by the baryons (in units of km² s⁻² kpc⁻¹).

Here I’ve used a population synthesis model to estimate the mass-to-light ratio of the stars. This is the only unknown; everything else is measured. Note that the vast majority of galaxies land on top of each other. There are a few that do not, as you can perceive in the parallel sets of points offset from the main body. But that happens in only a few cases, as expected – no population model is perfect. Indeed, this one was surprisingly good, as the vast majority of the individual galaxies are indistinguishable in the pile that defines the main relation.

I explored how the estimation of the stellar mass-to-light ratio affected this mass discrepancy-acceleration relation in great detail in the 2004 paper. The details differ with the choice of estimator, but the bottom line was that the relation persisted for any plausible choice. The relation exists. It is an empirical fact.

At this juncture, further improvement was no longer limited by rotation curve data, which is what we had been working to expand through the early ’00s. Now it was the stellar mass. The estimate of stellar mass was based on optical measurements of the luminosity distribution of stars in galaxies. These are perfectly fine data, but it is hard to map the starlight that we measured to the stellar mass that we need for this relation. The population synthesis models were good, but they weren’t good enough to avoid the occasional outlier, as can be seen in the figure above.

One thing the models all agreed on (before they didn’t, then they did again) was that the near-infrared would provide a more robust way of mapping stellar mass than the optical bands we had been using up till then. This was the clear way forward, and perhaps the only hope for improving the data further. Fortunately, technology was keeping pace. Around this time, I became involved in helping the effort to develop the NEWFIRM near-infrared camera for the national observatories, and NASA had just launched the Spitzer space telescope. These were the right tools in the right place at the right time. Ultimately, the high accuracy of the deep images obtained from the dark of space by Spitzer at 3.6 microns was to prove most valuable.

Jim Schombert and I spent much of the following decade observing in the near-infrared. Many other observers were doing this as well, filling the Spitzer archive with useful data while we concentrated on our own list of low surface brightness galaxies. This paragraph cannot suffice to convey the long-term effort and sheer scale of this program. But by the mid-teens, we had accumulated data for hundreds of galaxies, including all those for which we also had rotation curves and HI observations. The latter had been obtained over the course of decades by an entire independent community of radio observers, and represent an integrated effort that dwarfs our own.

On top of the observational effort, Jim had been busy building updated stellar population models. We have a sophisticated understanding of how stars work, but things can get complicated when you put billions of them together. Nevertheless, Jim’s work – and that of a number of independent workers – indicated that the relation between Spitzer’s 3.6 micron luminosity measurements and stellar mass should be remarkably simple – basically just a constant conversion factor for nearly all star forming galaxies like those in our sample.
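
The practical upshot is nearly a one-line conversion from 3.6 micron luminosity to stellar mass. The conversion factor below is an assumed, illustrative round number of the sort these models return for star forming disks, not a value taken from this post:

```python
# Stellar mass from a Spitzer 3.6 micron luminosity with a constant
# mass-to-light ratio (UPSILON_36 is an assumption for illustration).

UPSILON_36 = 0.5   # [M_sun / L_sun] at 3.6 microns

def stellar_mass(L36_solar):
    """Stellar mass [M_sun] from 3.6 micron luminosity [L_sun]."""
    return UPSILON_36 * L36_solar

print(f"{stellar_mass(2e10):.1e}")   # a 2e10 L_sun disk -> about 1e10 M_sun of stars
```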

Things came together when Federico Lelli joined Case Western as a postdoc in 2014. He had completed his Ph.D. in the rich tradition of radio astronomy, and was the perfect person to move the project forward. After a couple more years of effort, curating the rotation curve data and building mass models from the Spitzer data, we were in a position to build the relation for over a dozen dozen galaxies. With all the hard work done, making the plot was a matter of running a pre-prepared computer script.

Federico ran his script. The plot appeared on his screen. In a stunned voice, he called me into his office. We had expected an improvement with the Spitzer data – hence the decade of work – but we had also expected there to be a few outliers. There weren’t. Any.

All. the. galaxies. fell. right. on. top. of. each. other.

[Figure] The radial acceleration relation. The centripetal acceleration measured from rotation curves is plotted against that predicted by the observed baryons. 2693 points from 153 distinct galaxies are plotted together (bluescale); individual galaxies do not distinguish themselves in this plot. Indeed, the width of the scatter (inset) is entirely explicable by observational uncertainties and the expected scatter in stellar mass-to-light ratios. From McGaugh et al. (2016).

This plot differs from those above because we had decided to plot the measured acceleration against that predicted by the observed baryons, so that the two axes would be independent. The discrepancy, being defined as a ratio, depends on both: D is essentially the ratio of the y-axis to the x-axis of this last plot, with D = 1 along the line of unity.
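
A minimal sketch of how the two axes are constructed, together with the functional form used in McGaugh et al. (2016) to describe the relation; the input numbers here are toy values, not SPARC data:

```python
import numpy as np

KPC = 3.086e19        # one kiloparsec in metres
G_DAGGER = 1.2e-10    # characteristic acceleration scale [m s^-2]

def g_from_v(r_kpc, v_kms):
    """Centripetal acceleration g = V^2 / r, in m s^-2."""
    return (v_kms * 1e3) ** 2 / (r_kpc * KPC)

def rar(g_bar, g_dagger=G_DAGGER):
    """Fitting function of McGaugh, Lelli, & Schombert (2016):
    g_obs = g_bar / (1 - exp(-sqrt(g_bar / g_dagger)))."""
    return g_bar / (1.0 - np.exp(-np.sqrt(g_bar / g_dagger)))

# Both axes are the same kind of quantity -- V^2/r -- evaluated once with the
# observed speed (y-axis) and once with the speed the baryons alone would
# produce (x-axis).  Toy numbers:
g_bar = g_from_v(10.0, 80.0)    # baryons alone: ~2e-11 m/s^2
print(rar(g_bar))               # the relation predicts g_obs ~ 6e-11 m/s^2
print(rar(g_bar) / g_bar)       # i.e. a discrepancy D of roughly 3 at this point
```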

This was one of the most satisfactory moments of my long career, in which I have been fortunate to have had many satisfactory moments. It is right up there with the eureka moment I had that finally broke the long-standing logjam over the role of selection effects in Freeman’s Law. (Young astronomers – never heard of Freeman’s Law? You’re welcome.) Or the epiphany that, gee, maybe what we’re calling dark matter could be a proxy for something deeper. It was also gratifying that it was quickly recognized as such, with many of the colleagues I first presented it to saying it was the highlight of the conference where it was first unveiled.

Regardless of the ultimate interpretation of the radial acceleration relation, it clearly exists in the data for rotating galaxies. The discrepancy appears at a characteristic acceleration scale, g = 1.2 × 10⁻¹⁰ m/s/s. That number is in the data. Why? is a deeply profound question.

It isn’t just that the acceleration scale is somehow fundamental. The amplitude of the discrepancy depends systematically on the acceleration. Above the critical scale, all is well: no need for dark matter. Below it, the amplitude of the discrepancy – the amount of dark matter we infer – increases systematically. The lower the acceleration, the more dark matter one infers.
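
Quantitatively, well below the characteristic scale (call it g†), the relation approaches a simple square-root behaviour – which is just another way of saying the same thing:

```latex
g_{\mathrm{bar}} \ll g_\dagger : \qquad
g_{\mathrm{obs}} \;\rightarrow\; \sqrt{g_{\mathrm{bar}}\, g_\dagger}
\;\;\Longrightarrow\;\;
D = \frac{g_{\mathrm{obs}}}{g_{\mathrm{bar}}} \;\rightarrow\; \sqrt{\frac{g_\dagger}{g_{\mathrm{bar}}}} .
```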

The relation for rotating galaxies has no detectable scatter – it is a near-perfect relation. Whether this persists, and holds for other systems, is the interesting outstanding question. It appears, for example, that dwarf spheroidal galaxies may follow a slightly different relation. However, the emphasis here is on slightly. Very few of these data pass the same quality criteria that the SPARC data plotted above do. It’s like comparing mud pies with diamonds.

Whether the scatter in the radial acceleration relation is zero or merely very tiny is important. That’s the difference between a new fundamental force law (like MOND) and a merely spectacular galaxy scaling relation. For this reason, it seems to be controversial. It shouldn’t be: I was surprised at how tight the relation was myself. But I don’t get to report that there is lots of scatter when there isn’t. To do so would be profoundly unscientific, regardless of the wants of the crowd.

Of course, science is hard. If you don’t do everything right, from the measurements to the mass models to the stellar populations, you’ll find some scatter where perhaps there isn’t any. There are so many creative ways to screw up that I’m sure people will continue to find them. Myself, I prefer to look forward: I see no need to continuously re-establish what has been repeatedly demonstrated in the history briefly outlined above.