25 years a heretic

People seem to like to do retrospectives at year’s end. I take a longer view, but the end of 2020 seems like a fitting time to do that. Below is the text of a paper I wrote in 1995 with collaborators at the Kapteyn Institute of the University of Groningen. The last edit date is from December of that year, so this text (in plain TeX, not LaTeX!) is now a quarter century old. I am just going to cut & paste it as-was; I even managed to recover the original figures and translate them into something web-friendly (postscript to jpeg). This is exactly how it was.

This was my first attempt to express in the scientific literature my concerns for the viability of the dark matter paradigm, and my puzzlement that the only theory to get any genuine predictions right was MOND. It was the hardest admission in my career that this could be even a remote possibility. Nevertheless, intellectual honesty demanded that I report it. To fail to do so would be an act of reality denial antithetical to the foundational principles of science.

It was never published. There were three referees. Initially, one was positive, one was negative, and one insisted that rotation curves weren’t flat. There was one iteration; this is the resubmitted version in which the concerns of the second referee were addressed to his apparent satisfaction by making the third figure a lot more complicated. The third referee persisted that none of this was valid because rotation curves weren’t flat. Seems like he had a problem with something beyond the scope of this paper, but the net result was rejection.

One valid concern that ran through the refereeing process from all sides was “what about everything else?” This is a good question that couldn’t fit into a short letter like this. Thanks to the support of Vera Rubin and a Carnegie Fellowship, I spent the next couple of years looking into everything else. The results were published in 1998 in a series of three long papers: one on dark matter, one on MOND, and one making detailed fits.

This had started from a very different place intellectually with my efforts to write a paper on galaxy formation that would have been similar to contemporaneous papers like Dalcanton, Spergel, & Summers and Mo, Mao, & White. This would have followed from my thesis and from work with Houjun Mo, who was an office mate when we were postdocs at the IoA in Cambridge. (The ideas discussed in Mo, McGaugh, & Bothun have been reborn recently in the galaxy formation literature under the moniker of “assembly bias.”) But I had realized by then that my ideas – and those in the papers cited – were wrong. So I didn’t write a paper that I knew to be wrong. I wrote this one instead.

Nothing substantive has changed since. Reading it afresh, I’m amazed how many of the arguments over the past quarter century were anticipated here. As a scientific community, we are stuck in a rut, and seem to prefer to spin the wheels to dig ourselves in deeper than consider the plain if difficult path out.

Testing hypotheses of dark matter and alternative gravity with low surface density galaxies

The missing mass problem remains one of the most vexing in astrophysics. Observations clearly indicate either the presence of a tremendous amount of as yet unidentified dark matter^1,2, or the need to modify the law of gravity^3-7. These hypotheses make vastly different predictions as a function of density. Observations of the rotation curves of galaxies of much lower surface brightness than previously studied therefore provide a powerful test for discriminating between them. The dark matter hypothesis requires a surprisingly strong relation between the surface brightness and mass to light ratio⁸, placing stringent constraints on theories of galaxy formation and evolution. Alternatively, the observed behaviour is predicted⁴ by one of the hypothesised alterations of gravity known as modified Newtonian dynamics^3,5 (MOND).

Spiral galaxies are observed to have asymptotically flat [i.e., V(R) ~ constant for large R] rotation curves that extend well beyond their optical edges. This trend continues for as far (many, sometimes > 10 galaxy scale lengths) as can be probed by gaseous tracers^1,2 or by the orbits of satellite galaxies⁹. Outside a galaxy’s optical radius, the gravitational acceleration is a_N = GM/R² = V²/R so one expects V(R) ~ R^-1/2. This Keplerian behaviour is not observed in galaxies.
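As a rough illustration of the Keplerian expectation (a sketch added for this web version, not part of the 1995 text), the snippet below evaluates V = (GM/R)^1/2 for a point mass. The mass and radii are arbitrary illustrative values and the function name is hypothetical.

```python
import math

# Gravitational constant in units of kpc (km/s)^2 / Msun
G_KPC = 4.30091e-6


def keplerian_velocity_kms(mass_msun, radius_kpc):
    """Circular speed around a point mass: V = sqrt(G M / R)."""
    return math.sqrt(G_KPC * mass_msun / radius_kpc)


if __name__ == "__main__":
    mass = 1.0e11  # illustrative enclosed mass in solar masses (assumed)
    for radius in (10.0, 20.0, 40.0):  # kpc, all beyond the assumed optical edge
        print(f"R = {radius:5.1f} kpc  V = {keplerian_velocity_kms(mass, radius):6.1f} km/s")
```

Quadrupling the radius halves the speed in this picture, which is the decline that observed rotation curves fail to show.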

One approach to this problem is to increase M in the outer parts of galaxies in order to provide the extra gravitational acceleration necessary to keep the rotation curves flat. Indeed, this is the only option within the framework of Newtonian gravity since both V and R are directly measured. The additional mass must be invisible, dominant, and extend well beyond the optical edge of the galaxies.

Postulating the existence of this large amount of dark matter which reveals itself only by its gravitational effects is a radical hypothesis. Yet the kinematic data force it upon us, so much so that the existence of dark matter is generally accepted. Enormous effort has gone into attempting to theoretically predict its nature and experimentally verify its existence, but to date there exists no convincing detection of any hypothesised dark matter candidate, and many plausible candidates have been ruled out¹⁰.

Another possible solution is to alter the fundamental equation a_N = GM/R². Our faith in this simple equation is very well founded on extensive experimental tests of Newtonian gravity. Since it is so fundamental, altering it is an even more radical hypothesis than invoking the existence of large amounts of dark matter of completely unknown constituent components. However, a radical solution is required either way, so both possibilities must be considered and tested.

A phenomenological theory specifically introduced to address the problem of the flat rotation curves is MOND³. It has no other motivation and so far there is no firm physical basis for the theory. It provides no satisfactory cosmology, having yet to be reconciled with General Relativity. However, with the introduction of one new fundamental constant (an acceleration a₀), it is empirically quite successful in fitting galaxy rotation curves^11-14. It hypothesises that for accelerations a < a₀ = 1.2 x 10^-10 m s^-2, the effective acceleration is given by a_eff = (a_N a₀)^1/2. This simple prescription works well with essentially only one free parameter per galaxy, the stellar mass to light ratio, which is subject to independent constraint by stellar evolution theory. More importantly, MOND makes predictions which are distinct and testable. One specific prediction⁴ is that the asymptotic (flat) value of the rotation velocity, V_a, is V_a = (GMa₀)^1/4. Note that V_a does not depend on R, but only on M in the regime of small accelerations (a < a₀).
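The asymptotic velocity follows from the prescription above: setting a_eff = V²/R with a_N = GM/R² gives V⁴/R² = GMa₀/R², so R cancels. A minimal numerical sketch of V_a = (GMa₀)^1/4 follows; the constants are standard SI values, while the example mass and the function name are assumptions for illustration only.

```python
G_SI = 6.674e-11      # m^3 kg^-1 s^-2
M_SUN_KG = 1.989e30   # kg
A0_SI = 1.2e-10       # m s^-2, the value of a0 quoted in the text


def mond_asymptotic_velocity_kms(mass_msun, a0=A0_SI):
    """Flat rotation speed in the low acceleration regime: V_a = (G M a0)^(1/4)."""
    v_m_per_s = (G_SI * mass_msun * M_SUN_KG * a0) ** 0.25
    return v_m_per_s / 1.0e3


if __name__ == "__main__":
    # Illustrative baryonic masses in solar masses (assumed, not from the paper)
    for mass in (1.0e9, 1.0e10, 1.0e11):
        print(f"M = {mass:.0e} Msun  V_a = {mond_asymptotic_velocity_kms(mass):6.1f} km/s")
```

Because of the fourth root, a factor of 16 in mass changes V_a by only a factor of 2.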

In contrast, Newtonian gravity depends on both M and R. Replacing R with a mass surface density variable S = M(R)/R², the Newtonian prediction becomes M S ~ V_a⁴, which contrasts with the MOND prediction M ~ V_a⁴. These relations are the theoretical basis in each case for the observed luminosity-linewidth relation L ~ V_a⁴, better known as the Tully-Fisher¹⁵ relation. (Note that the observed value of the exponent is bandpass dependent, but does attain the theoretical value of 4 in the near infrared¹⁶, which is considered the best indicator of the stellar mass. The systematic variation with bandpass is a very small effect compared to the difference between the two gravitational theories, and must be attributed to dust or stars under either theory.) To transform from theory to observation one requires the mass to light ratio Y: Y = M/L = S/s, where s is the surface brightness. Note that in the purely Newtonian case, M and L are very different functions of R, so Y is itself a strong function of R. We define Y to be the mass to light ratio within the optical radius R_*, as this is the only radius which can be measured by observation. The global mass to light ratio would be very different (since M ~ R for R > R_*, the total masses of dark haloes are not measurable), but the particular choice of definition does not affect the relevant functional dependence, which is all that matters. The predictions become Y²sL ~ V_a⁴ for Newtonian gravity^8,16 and YL ~ V_a⁴ for MOND⁴.
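For readers who want the intermediate algebra spelled out, the scalings can be written as follows. This is only a restatement of the relations in the text using the same symbols (it assumes the amsmath `align` environment and is not taken from the original TeX source).

```latex
\begin{align}
\text{Newton:}\quad & V_a^2 = \frac{GM}{R}
   \;\Rightarrow\; V_a^4 = G^2\,\frac{M^2}{R^2} = G^2\, M S,
   \qquad S \equiv \frac{M}{R^2} \\
\text{MOND:}\quad & \frac{V_a^2}{R} = \sqrt{\frac{GM}{R^2}\,a_0}
   \;\Rightarrow\; V_a^4 = G M a_0 \\
\text{With } Y = \frac{M}{L} = \frac{S}{s}:\quad
   & M S = (Y L)(Y s) = Y^2 s L \propto V_a^4 \ \ \text{(Newton)},
   \qquad M = Y L \propto V_a^4 \ \ \text{(MOND)}
\end{align}
```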

The only sensible¹⁷ null hypothesis that can be constructed is that the mass to light ratio be roughly constant from galaxy to galaxy. Clearly distinct predictions thus emerge if galaxies of different surface brightnesses s are examined. In the Newtonian case there should be a family of parallel Tully-Fisher relations for each surface brightness. In the case of MOND, all galaxies should follow the same Tully-Fisher relation irrespective of surface brightness.
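To make the size of the expected offset concrete: at fixed V_a and constant Y, the Newtonian scaling Y²sL ~ V_a⁴ implies L ~ 1/s, so a galaxy whose central surface brightness is fainter by Δμ magnitudes should be brighter by Δμ magnitudes in total luminosity, while MOND predicts no offset. The sketch below encodes just that arithmetic; the function names are hypothetical and the Δμ values are illustrative.

```python
import math


def newtonian_luminosity_ratio(delta_mu_mag):
    """L / L_ref at fixed V_a and constant Y when the central surface
    brightness is fainter than the reference by delta_mu_mag magnitudes."""
    surface_brightness_ratio = 10.0 ** (-0.4 * delta_mu_mag)  # s / s_ref
    return 1.0 / surface_brightness_ratio                     # L ~ 1/s


def tully_fisher_shift_mag(delta_mu_mag, theory):
    """Expected shift in absolute magnitude at fixed V_a (negative = brighter)."""
    if theory == "newton":
        return -2.5 * math.log10(newtonian_luminosity_ratio(delta_mu_mag))
    if theory == "mond":
        return 0.0
    raise ValueError("theory must be 'newton' or 'mond'")


if __name__ == "__main__":
    for delta_mu in (1.0, 2.0, 3.0):
        print(delta_mu,
              tully_fisher_shift_mag(delta_mu, "newton"),
              tully_fisher_shift_mag(delta_mu, "mond"))
```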

Recently it has been shown that extreme objects such as low surface brightness galaxies^8,18 (those with central surface brightnesses fainter than s₀ = 23 B mag. arcsec^-2, corresponding to 40 L_☉ pc^-2) obey the same Tully-Fisher relation as do the high surface brightness galaxies (typically with s₀ = 21.65 B mag. arcsec^-2 or 140 L_☉ pc^-2) which originally¹⁵ defined it. Fig. 1 shows the luminosity-linewidth plane for galaxies ranging over a factor of 40 in surface brightness. Regardless of surface brightness, galaxies fall on the same Tully-Fisher relation.

The luminosity-linewidth (Tully-Fisher) relation for spiral galaxies over a large range in surface brightness. The B-band relation is shown; the same result is obtained in all bands^8,18. Absolute magnitudes are measured from apparent magnitudes assuming H₀ = 75 km/s/Mpc. Rotation velocities V_a are directly proportional to observed 21 cm linewidths (measured as the full width at 20% of maximum) W₂₀ corrected for inclination [sin^-1(i)]. Open symbols are an independent sample which defines⁴² the Tully-Fisher relation (solid line). The dotted lines show the expected shift of the Tully-Fisher relation for each step in surface brightness away from the canonical value s₀ = 21.5 if the mass to light ratio remains constant. Low surface brightness galaxies are plotted as solid symbols, binned by surface brightness: red triangles: 22 < s₀ < 23; green squares: 23 < s₀ < 24; blue circles: s₀ > 24. One galaxy with two independent measurements is connected by a line. This gives an indication of the typical uncertainty which is sufficient to explain nearly all the scatter. Contrary to the clear expectation of a readily detectable shift as indicated by the dotted lines, galaxies fall on the same Tully-Fisher relation regardless of surface brightness, as predicted by MOND.

MOND predicts this behaviour in spite of the very different surface densities of low surface brightness galaxies. Understanding this observational fact in the framework of standard Newtonian gravity requires a subtle relation⁸ between surface brightness and the mass to light ratio to keep the product sY² constant. If we retain normal gravity and the dark matter hypothesis, this result is unavoidable, and the null hypothesis of similar mass to light ratios (which, together with an assumed constancy of surface brightness, is usually invoked to explain the Tully-Fisher relation) is strongly rejected. Instead, the current epoch surface brightness is tightly correlated with the properties of the dark matter halo, placing strict constraints on models of galaxy formation and evolution.

The mass to light ratios computed for both cases are shown as a function of surface brightness in Fig. 2. Fig. 2 is based solely on galaxies with full rotation curves^19,20 and surface photometry, so V_a and R_* are directly measured. The correlation in the Newtonian case is very clear (Fig. 2a), confirming our inference⁸ from the Tully-Fisher relation. Such tight correlations are very rare in extragalactic astronomy, and the Y-s relation is probably the real cause of an inferred Y-L relation. The latter is much weaker because surface brightness and luminosity are only weakly correlated^21-24.

The mass to light ratio Y (in M_☉/L_☉) determined with (a) Newtonian dynamics and (b) MOND, plotted as a function of central surface brightness. The mass determination for Newtonian dynamics is M = V² R_*/G and for MOND is M = V⁴/(G a₀). We have adopted as a consistent definition of the optical radius R_* four scale lengths of the exponential optical disc. This is where discs tend to have edges, and contains essentially all the light^21,22. The definition of R_* makes a tremendous difference to the absolute value of the mass to light ratio in the Newtonian case, but makes no difference at all to the functional relation, which will be present regardless of the precise definition. These mass measurements are more sensitive to the inclination corrections than is the Tully-Fisher relation since there is a sin^-2(i) term in the Newtonian case and one of sin^-4(i) for MOND. It is thus very important that the inclination be accurately measured, and we have retained only galaxies which have adequate inclination determinations; error bars are plotted for a nominal uncertainty of 6 degrees. The sensitivity to inclination manifests itself as an increase in the scatter from (a) to (b). The derived mass is also very sensitive to the measured value of the asymptotic velocity itself, so we have used only those galaxies for which this can be taken directly from a full rotation curve^19,20,42. We do not employ profile widths; the velocity measurements here are independent of those in Fig. 1. In both cases, we have subtracted off the known atomic gas mass^19,20,42, so what remains is essentially only the stars and any dark matter that may exist. A very strong correlation (regression coefficient = 0.85) is apparent in (a): this is the conspiracy between mass to light ratio and surface brightness. The slope is consistent (within the errors) with the theoretical expectation s ~ Y^-2 derived from the Tully-Fisher relation⁸.
At the highest surface brightnesses, the mass to light ratio is similar to that expected for the stellar population. At the faintest surface brightnesses, it has increased by a factor of nearly ten, indicating increasing dark matter domination within the optical disc as surface brightness decreases or a very systematic change in the stellar population, or both. In (b), the mass to light ratio scatters about a constant value of 2. This mean value, and the lack of a trend, is what is expected for stellar populations^17,21-24.
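The two mass estimators in this caption, and their different sensitivity to inclination, can be sketched numerically as below. This is an illustrative sketch only: the velocity, radius and inclinations are made-up inputs, the function names are hypothetical, and the inclination factor simply assumes the deprojected velocity scales as 1/sin(i).

```python
import math

G_KPC = 4.30091e-6            # kpc (km/s)^2 / Msun
KPC_IN_M = 3.0857e19          # metres per kiloparsec
A0_SI = 1.2e-10               # m s^-2, value quoted in the text
A0_KPC = A0_SI * KPC_IN_M / 1.0e6   # converted to (km/s)^2 / kpc


def newtonian_mass(v_kms, r_opt_kpc):
    """Enclosed mass within the optical radius: M = V^2 R_* / G."""
    return v_kms ** 2 * r_opt_kpc / G_KPC


def mond_mass(v_kms):
    """Total mass from the asymptotic velocity: M = V^4 / (G a0)."""
    return v_kms ** 4 / (G_KPC * A0_KPC)


def inclination_mass_factor(i_true_deg, i_assumed_deg, power):
    """Ratio (derived mass) / (true mass) when i_assumed is used in place of
    i_true; power = 2 for the Newtonian estimator, 4 for MOND."""
    ratio = math.sin(math.radians(i_true_deg)) / math.sin(math.radians(i_assumed_deg))
    return ratio ** power


if __name__ == "__main__":
    v, r_opt = 100.0, 12.0     # km/s and kpc, assumed example values
    print(f"Newtonian mass: {newtonian_mass(v, r_opt):.2e} Msun")
    print(f"MOND mass:      {mond_mass(v):.2e} Msun")
    # Effect of a 6 degree inclination error at i = 45 degrees
    print(inclination_mass_factor(45.0, 51.0, 2), inclination_mass_factor(45.0, 51.0, 4))
```

The MOND factor is the square of the Newtonian one, which is why the same inclination uncertainty produces more scatter in panel (b).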

The Y-s relation is not predicted by any dark matter theory^25,26. It can not be purely an effect of the stellar mass to light ratio, since no other stellar population indicator such as color^21-24 or metallicity^27,28 is so tightly correlated with surface brightness. In principle it could be an effect of the stellar mass fraction, as the gas mass to light ratio follows a relation very similar to that of total mass to light ratio²⁰. We correct for this in Fig. 2 by subtracting the known atomic gas mass so that Y refers only to the stars and any dark matter. We do not correct for molecular gas, as this has never been detected in low surface brightness galaxies to rather sensitive limits³⁰ so the total mass of such gas is unimportant if current estimates³¹ of the variation of the CO to H₂ conversion factor with metallicity are correct. These corrections have no discernible effect at all in Fig. 2 because the dark mass is totally dominant. It is thus very hard to see how any evolutionary effect in the luminous matter can be relevant.

In the case of MOND, the mass to light ratio directly reflects that of the stellar population once the correction for gas mass fraction is made. There is no trend of Y* with surface brightness (Fig. 2b), a more natural result and one which is consistent with our studies of the stellar populations of low surface brightness galaxies^21-23. These suggest that Y* should be roughly constant or slightly declining as surface brightness decreases, with much scatter. The mean value Y* = 2 is also expected from stellar evolutionary theory¹⁷, which always gives a number 0 < Y* < 10 and usually gives 0.5 < Y* < 3 for disk galaxies. This is particularly striking since Y* is the only free parameter allowed to MOND, and the observed mean is very close to that directly observed²⁹ in the Milky Way (1.7 ± 0.5 M_☉/L_☉).

The essence of the problem is illustrated by Fig. 3, which shows the rotation curves of two galaxies of essentially the same luminosity but vastly different surface brightnesses. Though the asymptotic velocities are the same (as required by the Tully-Fisher relation), the rotation curve of the low surface brightness galaxy rises less quickly than that of the high surface brightness galaxy as expected if the mass is distributed like the light. Indeed, the ratio of surface brightnesses is correct to explain the ratio of velocities at small radii if both galaxies have similar mass to light ratios. However, if this continues to be the case as R increases, the low surface brightness galaxy should reach a lower asymptotic velocity simply because R_* must be larger for the same L. That this does not occur is the problem, and poses very significant systematic constraints on the dark matter distribution.

The rotation curves of two galaxies, one of high surface brightness¹¹ (NGC 2403; open circles) and one of low surface brightness¹⁹ (UGC 128; filled circles). The two galaxies have very nearly the same asymptotic velocity, and hence luminosity, as required by the Tully-Fisher relation. However, they have central surface brightnesses which differ by a factor of 13. The lines give the contributions to the rotation curves of the various components. Green: luminous disk. Blue: dark matter halo. Red: luminous disk (stars and gas) with MOND. Solid lines refer to NGC 2403 and dotted lines to UGC 128. The fits for NGC 2403 are taken from ref. 11, for which the stars have Y* = 1.5 M_☉/L_☉. For UGC 128, no specific fit is made: the blue and green dotted lines are simply the NGC 2403 fits scaled by the ratio of disk scale lengths h. This provides a remarkably good description of the UGC 128 rotation curve and illustrates one possible manifestation of the fine tuning problem: if disks have similar Y, the halo parameters p₀ and R₀ must scale with the disk parameters s₀ and h while conspiring to keep the product p₀ R₀² fixed at any given luminosity. Note also that the halo of NGC 2403 gives an adequate fit to the rotation curve of UGC 128. This is another possible manifestation of the fine tuning problem: all galaxies of the same luminosity have the same halo, with Y systematically varying with s₀ so that Y* goes to zero as s₀ goes to zero. Neither of these is exactly correct because the contribution of the gas can not be set to zero as is mathematically possible with the stars. This causes the resulting fine tuning problems to be even more complex, involving more parameters. Alternatively, the red dotted line is the rotation curve expected by MOND for a galaxy with the observed luminous mass distribution of UGC 128.

Satisfying the Tully-Fisher relation has led to some expectation that haloes all have the same density structure. This simplest possibility is immediately ruled out. In order to obtain L ~ V_a⁴ ~ MS, one might suppose that the mass surface density S is constant from galaxy to galaxy, irrespective of the luminous surface density s. This achieves the correct asymptotic velocity V_a, but requires that the mass distribution, and hence the complete rotation curve, be essentially identical for all galaxies of the same luminosity. This is obviously not the case (Fig. 3), as the rotation curves of lower surface brightness galaxies rise much more gradually than those of higher surface brightness galaxies (also a prediction⁴ of MOND). It might be possible to have approximately constant density haloes if the highest surface brightness disks are maximal and the lowest minimal in their contribution to the inner parts of the rotation curves, but this then requires fine tuning of Y*, which would have to decrease systematically with surface brightness.

The expected form of the halo mass distribution depends on the dominant form of dark matter. This could exist in three general categories: baryonic (e.g., MACHOs), hot (e.g., neutrinos), and cold exotic particles (e.g., WIMPs). The first two make no specific predictions. Baryonic dark matter candidates are most subject to direct detection, and most plausible candidates have been ruled out¹⁰ with remaining suggestions of necessity sounding increasingly contrived³². Hot dark matter is not relevant to the present problem. Even if neutrinos have a small mass, their velocities considerably exceed the escape velocities of the haloes of low mass galaxies where the problem is most severe. Cosmological simulations involving exotic cold dark matter^33,34 have advanced to the point where predictions are being made about the density structure of haloes. These take the form^33,34 p(R) = p_H/[R(R+R_H)^b] where p_H characterises the halo density and R_H its radius, with b ~ 2 to 3. The characteristic density depends on the mean density of the universe at the collapse epoch, and is generally expected to be greater for lower mass galaxies since these collapse first in such scenarios. This goes in the opposite sense of the observations, which show that low mass and low surface brightness galaxies are less, not more, dense. The observed behaviour is actually expected in scenarios which do not smooth on a particular mass scale and hence allow galaxies of the same mass to collapse at a variety of epochs²⁵, but in this case the Tully-Fisher relation should not be universal. Worse, note that at small R < R_H, p(R) ~ R^-1. It has already been noted^32,35 that such a steep interior density distribution is completely inconsistent with the few (4) analysed observations of dwarf galaxies. Our data^19,20 confirm and considerably extend this conclusion for 24 low surface brightness galaxies over a wide range in luminosity.
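The inner cusp of the simulated profile can be checked numerically. The sketch below evaluates the logarithmic slope d ln p / d ln R of p(R) = p_H/[R(R+R_H)^b] at small and large radii; the normalisation and scale radius are arbitrary illustrative numbers and the function names are hypothetical.

```python
import math


def cusped_density(r, rho_h=1.0, r_h=1.0, b=2.0):
    """Simulation-motivated profile from the text: p(R) = p_H / [R (R + R_H)^b]."""
    return rho_h / (r * (r + r_h) ** b)


def log_slope(density, r, eps=1.0e-4):
    """Numerical d ln(density) / d ln(r) using a centred difference in log space."""
    r_lo, r_hi = r * (1.0 - eps), r * (1.0 + eps)
    return (math.log(density(r_hi)) - math.log(density(r_lo))) / (math.log(r_hi) - math.log(r_lo))


if __name__ == "__main__":
    for r in (1.0e-4, 1.0, 1.0e4):   # radii in units of R_H
        print(f"R/R_H = {r:8.0e}  slope = {log_slope(cusped_density, r):.3f}")
```

Well inside R_H the slope approaches -1, the steep interior behaviour discussed above, and it steepens to -(1+b) far outside.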

The failure of the predicted exotic cold dark matter density distribution either rules out this form of dark matter, indicates some failing in the simulations (in spite of wide-spread consensus), or requires some mechanism to redistribute the mass. Feedback from star formation is usually invoked for the last of these, but this can not work for two reasons. First, an objection in principle: a small mass of stars and gas must have a dramatic impact on the distribution of the dominant dark mass, with which they can only interact gravitationally. More mass redistribution is required in less luminous galaxies since they start out denser but end up more diffuse; of course progressively less baryonic material is available to bring this about as luminosity declines. Second, an empirical objection: in this scenario, galaxies explode and gas is lost. However, progressively fainter and lower surface brightness galaxies, which need to suffer more severe explosions, are actually very gas rich.

Observationally, dark matter haloes are inferred to have density distributions^1,2,11 with constant density cores, p(R) = p₀/[1 + (R/R₀)^g]. Here, p₀ is the core density and R₀ is the core size with g ~ 2 being required to produce flat rotation curves. For g = 2, the rotation curve resulting from this mass distribution is V(R) = V_a [1-(R₀/R) tan^-1(R/R₀)]^1/2 where the asymptotic velocity is V_a = (4πG p₀ R₀²)^1/2. To satisfy the Tully-Fisher relation, V_a, and hence the product p₀ R₀², must be the same for all galaxies of the same luminosity. To decrease the rate of rise of the rotation curves as surface brightness decreases, R₀ must increase. Together, these two require a fine tuning conspiracy to keep the product p₀ R₀² constant while R₀ must vary with the surface brightness at a given luminosity. Luminosity and surface brightness themselves are only weakly correlated, so there exists a wide range in one parameter at any fixed value of the other. Thus the structural properties of the invisible dark matter halo dictate those of the luminous disk, or vice versa. So, s and L give the essential information about the mass distribution without recourse to kinematic information.
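The trade-off between p₀ and R₀ can be illustrated with the g = 2 rotation curve given above. In this sketch the two haloes share the same product p₀R₀², and hence the same V_a, but the one with the larger core rises more slowly. The core densities and radii are invented for illustration and the function names are hypothetical.

```python
import math

G_KPC = 4.30091e-6   # kpc (km/s)^2 / Msun


def asymptotic_velocity(rho0, r0):
    """V_a = (4 pi G p0 R0^2)^(1/2); rho0 in Msun/kpc^3, r0 in kpc, result in km/s."""
    return math.sqrt(4.0 * math.pi * G_KPC * rho0 * r0 ** 2)


def halo_velocity(r, rho0, r0):
    """Cored (g = 2) halo: V(R) = V_a [1 - (R0/R) arctan(R/R0)]^(1/2)."""
    x = r / r0
    return asymptotic_velocity(rho0, r0) * math.sqrt(1.0 - math.atan(x) / x)


if __name__ == "__main__":
    compact = (1.0e8, 2.0)    # (p0, R0): assumed values for a compact core
    diffuse = (2.5e7, 4.0)    # R0 doubled, p0 reduced fourfold: same p0 R0^2
    for r in (1.0, 2.0, 5.0, 10.0, 30.0):   # kpc
        print(f"R = {r:5.1f}  compact: {halo_velocity(r, *compact):6.1f}"
              f"  diffuse: {halo_velocity(r, *diffuse):6.1f} km/s")
```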

A strict s-p₀-R₀ relation is rigorously obeyed only if the haloes are spherical and dominate throughout. This is probably a good approximation for low surface brightness galaxies but may not be for those of the highest surface brightness. However, a significant non-halo contribution can at best replace one fine tuning problem with another (e.g., surface brightness being strongly correlated with the stellar population mass to light ratio instead of halo core density) and generally causes additional conspiracies.

There are two perspectives for interpreting these relations, with the preferred perspective depending strongly on the philosophical attitude one has towards empirical and theoretical knowledge. One view is that these are real relations which galaxies and their haloes obey. As such, they provide a positive link between models of galaxy formation and evolution and reality.

The other view is that this list of fine tuning requirements makes it rather unattractive to maintain the dark matter hypothesis. MOND provides an empirically more natural explanation for these observations. In addition to the Tully-Fisher relation, MOND correctly predicts the systematics of the shapes of the rotation curves of low surface brightness galaxies^19,20 and fits the specific case of UGC 128 (Fig. 3). Low surface brightness galaxies were stipulated⁴ to be a stringent test of the theory because they should be well into the regime a < a₀. This is now observed to be true, and to the limit of observational accuracy the predictions of MOND are confirmed. The critical acceleration scale a₀ is apparently universal, so there is a single force law acting in galactic disks for which MOND provides the correct description. The cause of this could be either a particular dark matter distribution³⁶ or a real modification of gravity. The former is difficult to arrange, and a single force law strongly supports the latter hypothesis since in principle the dark matter could have any number of distributions which would give rise to a variety of effective force laws. Even if MOND is not correct, it is essential to understand why it so closely describes the observations. Though the data can not exclude Newtonian dynamics, with a working empirical alternative (really an extension) at hand, we would not hesitate to reject as incomplete any less venerable hypothesis.

Nevertheless, MOND itself remains incomplete as a theory, being more of a Kepler’s Law for galaxies. It provides only an empirical description of kinematic data. While successful for disk galaxies, it was thought to fail in clusters of galaxies³⁷. Recently it has been recognized that there exist two missing mass problems in galaxy clusters, one of which is now solved³⁸: most of the luminous matter is in X-ray gas, not galaxies. This vastly improves the consistency of MOND with cluster dynamics³⁹. The problem with the theory remains a reconciliation with Relativity and thereby standard cosmology (which is itself in considerable difficulty^38,40), and a lack of any prediction about gravitational lensing⁴¹. These are theoretical problems which need to be more widely addressed in light of MOND’s empirical success.

ACKNOWLEDGEMENTS. We thank R. Sanders and M. Milgrom for clarifying aspects of a theory with which we were previously unfamiliar. SSM is grateful to the Kapteyn Astronomical Institute for enormous hospitality during visits when much of this work was done. [Note added in 2020: this work was supported by a cooperative grant funded by the EU and would no longer be possible thanks to Brexit.]

REFERENCES

1. Rubin, V. C. Science 220, 1339-1344 (1983).
2. Sancisi, R. & van Albada, T. S. in Dark Matter in the Universe, IAU Symp. No. 117 (eds. Knapp, G. & Kormendy, J.) 67-80 (Reidel, Dordrecht, 1987).
3. Milgrom, M. Astrophys. J. 270, 365-370 (1983).
4. Milgrom, M. Astrophys. J. 270, 371-383 (1983).
5. Bekenstein, J. D. & Milgrom, M. Astrophys. J. 286, 7-14 (1984).
6. Mannheim, P. D. & Kazanas, D. Astrophys. J. 342, 635-651 (1989).
7. Sanders, R. H. Astr. Astrophys. Rev. 2, 1-28 (1990).
8. Zwaan, M. A., van der Hulst, J. M., de Blok, W. J. G. & McGaugh, S. S. Mon. Not. R. astr. Soc. 273, L35-L38 (1995).
9. Zaritsky, D. & White, S. D. M. Astrophys. J. 435, 599-610 (1994).
10. Carr, B. Ann. Rev. Astr. Astrophys. 32, 531-590 (1994).
11. Begeman, K. G., Broeils, A. H. & Sanders, R. H. Mon. Not. R. astr. Soc. 249, 523-537 (1991).
12. Kent, S. M. Astr. J. 93, 816-832 (1987).
13. Milgrom, M. Astrophys. J. 333, 689-693 (1988).
14. Milgrom, M. & Braun, E. Astrophys. J. 334, 130-134 (1988).
15. Tully, R. B. & Fisher, J. R. Astr. Astrophys. 54, 661-673 (1977).
16. Aaronson, M., Huchra, J. & Mould, J. Astrophys. J. 229, 1-17 (1979).
17. Larson, R. B. & Tinsley, B. M. Astrophys. J. 219, 48-58 (1978).
18. Sprayberry, D., Bernstein, G. M., Impey, C. D. & Bothun, G. D. Astrophys. J. 438, 72-82 (1995).
19. van der Hulst, J. M., Skillman, E. D., Smith, T. R., Bothun, G. D., McGaugh, S. S. & de Blok, W. J. G. Astr. J. 106, 548-559 (1993).
20. de Blok, W. J. G., McGaugh, S. S. & van der Hulst, J. M. Mon. Not. R. astr. Soc. (submitted).
21. McGaugh, S. S. & Bothun, G. D. Astr. J. 107, 530-542 (1994).
22. de Blok, W. J. G., van der Hulst, J. M. & Bothun, G. D. Mon. Not. R. astr. Soc. 274, 235-259 (1995).
23. Ronnback, J. & Bergvall, N. Astr. Astrophys. 292, 360-378 (1994).
24. de Jong, R. S. Ph.D. thesis, University of Groningen (1995).
25. Mo, H. J., McGaugh, S. S. & Bothun, G. D. Mon. Not. R. astr. Soc. 267, 129-140 (1994).
26. Dalcanton, J. J., Spergel, D. N. & Summers, F. J. Astrophys. J. (in press).
27. McGaugh, S. S. Astrophys. J. 426, 135-149 (1994).
28. Ronnback, J. & Bergvall, N. Astr. Astrophys. 302, 353-359 (1995).
29. Kuijken, K. & Gilmore, G. Mon. Not. R. astr. Soc. 239, 605-649 (1989).
30. Schombert, J. M., Bothun, G. D., Impey, C. D. & Mundy, L. G. Astr. J. 100, 1523-1529 (1990).
31. Wilson, C. D. Astrophys. J. 448, L97-L100 (1995).
32. Moore, B. Nature 370, 629-631 (1994).
33. Navarro, J. F., Frenk, C. S. & White, S. D. M. Mon. Not. R. astr. Soc. 275, 720-728 (1995).
34. Cole, S. & Lacey, C. Mon. Not. R. astr. Soc. (in press).
35. Flores, R. A. & Primack, J. R. Astrophys. J. 427, 1-4 (1994).
36. Sanders, R. H. & Begeman, K. G. Mon. Not. R. astr. Soc. 266, 360-366 (1994).
37. The, L. S. & White, S. D. M. Astr. J. 95, 1642-1651 (1988).
38. White, S. D. M., Navarro, J. F., Evrard, A. E. & Frenk, C. S. Nature 366, 429-433 (1993).
39. Sanders, R. H. Astr. Astrophys. 284, L31-L34 (1994).
40. Bolte, M. & Hogan, C. J. Nature 376, 399-402 (1995).
41. Bekenstein, J. D. & Sanders, R. H. Astrophys. J. 429, 480-490 (1994).
42. Broeils, A. H. Ph.D. thesis, University of Groningen (1992).

Statistical detection of the external field effect from large scale structure

A unique prediction of MOND

One curious aspect of MOND as a theory is the External Field Effect (EFE). The modified force law depends on an absolute acceleration scale, with motion being amplified over the Newtonian expectation when the force per unit mass falls below the critical acceleration scale a₀ = 1.2 x 10^-10 m/s/s. Usually we consider a galaxy to be an island universe: it is a system so isolated that we need consider only its own gravity. This is an excellent approximation in most circumstances, but in principle all sources of gravity from all over the universe matter.

The EFE in dwarf satellite galaxies

An example of the EFE is provided by dwarf satellite galaxies – small galaxies orbiting a larger host. It can happen that the stars in such a dwarf feel a stronger acceleration towards the host than to each other – the external field exceeds the internal self-gravity of the dwarf. In this limit, they’re more a collection of stars in a common orbit around the larger host than they are a self-gravitating island universe.

A weird consequence of the EFE in MOND is that a dwarf galaxy orbiting a large host will behave differently than it would if it were isolated in the depths of intergalactic space. MOND obeys the Weak Equivalence Principle but does not obey local position invariance. That means it violates the Strong Equivalence Principle while remaining consistent with the Einstein Equivalence Principle, a subtle but important distinction about how gravity self-gravitates.

Nothing like this happens conventionally, with or without dark matter. Gravity is local; it doesn’t matter what the rest of the universe is doing. Larger systems don’t impact smaller ones except in the extreme of tidal disruption, where the null geodesics diverge within the lesser object because it is no longer small compared to the gradient in the gravitational field. An amusing, if extreme, example is spaghettification. The EFE in MOND is a much subtler effect: when near a host, there is an extra source of acceleration, so a dwarf satellite is not as deep in the MOND regime as the equivalent isolated dwarf. Consequently, there is less of a boost from MOND: stars move a little slower, and conventionally one would infer a bit less dark matter.

The importance of the EFE in dwarf satellite galaxies is well documented. It was essential to the a priori prediction of the velocity dispersion in Crater 2 (where MOND correctly anticipated a velocity dispersion of just 2 km/s where the conventional expectation with dark matter was more like 17 km/s) and to the correct prediction of that for NGC 1052-DF2 (13 rather than 20 km/s). Indeed, one can see the difference between isolated and EFE cases in matched pairs of dwarf satellites of Andromeda. Andromeda has enough satellites that one can pick out otherwise indistinguishable dwarfs where one happens to be subject to the EFE while its twin is practically isolated. The speeds of stars in the dwarfs affected by the EFE are consistently lower, as predicted. For example, the relatively isolated dwarf satellite of Andromeda known as And XXVIII has a velocity dispersion of 5 km/s, while its near twin And XVII (which has very nearly the same luminosity and size) is affected by the EFE and consequently has a velocity dispersion of only 3 km/s.

The case of dwarf satellites is the most obvious place where the EFE occurs. In principle, it applies everywhere all the time. It is most obvious in dwarf satellites because the external field can be comparable to or even greater than the internal field. In principle, the EFE also matters even when smaller than the internal field, albeit only a little bit: the extra acceleration causes an object to be not quite as deep in the MOND regime.
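To make the "not quite as deep in the MOND regime" idea concrete, here is a minimal toy sketch. It rests on assumptions that are not in the text above: the so-called simple interpolation function mu(x) = x / (1 + x), and a one-dimensional approximation in which the internal and external fields are aligned. The function names are made up for illustration, and this is not the fitting function used in any published analysis.

```python
import math

A0 = 1.2e-10  # MOND acceleration scale a0 in m/s^2 (value quoted in the text)


def mond_acceleration(g_newton, a0=A0):
    """Solve mu(g/a0) * g = g_newton for g, ASSUMING mu(x) = x / (1 + x)."""
    # g^2 / (a0 + g) = g_N  ->  g^2 - g_N * g - g_N * a0 = 0 (take the positive root)
    return 0.5 * (g_newton + math.sqrt(g_newton**2 + 4.0 * g_newton * a0))


def internal_acceleration(g_newton_int, g_ext, a0=A0):
    """Toy 1D estimate of the internal acceleration inside an external field g_ext."""
    # Newtonian-equivalent of the external field under the same assumed mu(x)
    g_newton_ext = g_ext**2 / (a0 + g_ext)
    # Apply the MOND relation to the total field, then subtract the external part
    g_total = mond_acceleration(g_newton_int + g_newton_ext, a0)
    return g_total - g_ext


g_n = 0.01 * A0  # hypothetical Newtonian internal acceleration, deep in the MOND regime
isolated = internal_acceleration(g_n, 0.0)
weak_efe = internal_acceleration(g_n, 0.02 * A0)   # external field well below a0
strong_efe = internal_acceleration(g_n, 1.0 * A0)  # external field comparable to a0

for label, g in [("isolated", isolated), ("weak EFE", weak_efe), ("strong EFE", strong_efe)]:
    print(f"{label:>10}: boost over Newton = {g / g_n:.2f}")
```

In this toy, the boost over the Newtonian value shrinks as the external field grows, but never drops below one. The exact numbers depend on the assumed interpolation function and on the geometry, so only the ordering should be taken seriously.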

The EFE from large scale structure

Even in the depths of intergalactic space, there is some non-zero acceleration due to everything else in the universe. This is very reminiscent of Mach’s Principle, which Einstein reputedly struggled hard to incorporate into General Relativity. I’m not going to solve that in a blog post, but note that MOND is much more in the spirit of Mach and Lorentz and Einstein than its detractors generally seem to presume.

Here I describe the apparent detection of the subtle effect of a small but non-zero background acceleration field. This is very different from the case of dwarf satellites where the EFE can exceed the internal field. It is just a small tweak to the dominant internal fields of very nearly isolated island universes. It’s like the lapping of waves on their shores: hardly relevant to the existence of the island, but a pleasant feature as you walk along the beach.

The universe has structure; there are places with lots of galaxies (groups, clusters, walls, sheets) and other places with very few (voids). This large scale structure should impose a low-level but non-zero acceleration field that should vary in amplitude from place to place and affect all galaxies in their outskirts. For this reason, we do not expect rotation curves to remain flat forever; even in MOND, there comes an over-under point where the gravity of everything else takes over from any individual object. A test particle at the see-saw balance point between the Milky Way and Andromeda may not know which galaxy to call daddy, but it sure knows they’re both there. The background acceleration field matters to such diverse subjects as groups of galaxies and Lyman alpha absorbers at high redshift.

As an historical aside, Lyman alpha absorbers at high redshift were initially found to deviate from MOND by many orders of magnitude. That was without the EFE. With the EFE, the discrepancy is much smaller, but persists. The amplitude of the EFE at high redshift is very uncertain. I expect it is higher in MOND than estimated because structure forms fast in MOND; this might suffice to solve the problem. Whether or not this is the case, it makes a good example of how a simple calculation can make MOND seem way off when it isn’t. If I had a dollar for every time I’ve seen that happen, I could fly first class.

I made an early estimate of the average intergalactic acceleration field, finding the typical environmental acceleration e_env to be about 2% of a₀ (e_env ~ 2.6 x 10^-12 m/s/s, see just before eq. 31). This is highly uncertain and should be location dependent, differing a lot from voids to richer environments. It is hard to find systems that probe much below 10% of a₀, and the effect it would cause on the average (non-satellite) galaxy is rather subtle, so I have mostly neglected this background acceleration as, well, pretty negligible.
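For reference, the "about 2%" figure follows directly from the two numbers quoted in this post, with a₀ = 1.2 x 10^-10 m/s/s:

```latex
\[
\frac{e_{\rm env}}{a_0} \approx
\frac{2.6\times10^{-12}\ \mathrm{m\,s^{-2}}}{1.2\times10^{-10}\ \mathrm{m\,s^{-2}}}
\approx 0.022
\]
```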

This changed recently thanks to Kyu-Hyun Chae and Harry Desmond. We met at a conference in Bonn a year ago September. (Remember travel? I used to complain about how much travel work involved. Now I miss it – especially as experience demonstrates that some things really do require in-person interaction.) Kyu thought we should be able to tease out the EFE from SPARC data in a statistical way, and Harry offered to make a map of the environmental acceleration based on the locations of known galaxies. This is a distinct improvement over the crude average of my ancient first estimate as it specifies the EFE that ought to occur at the location of each individual galaxy. The results of this collaboration were recently published open-access in the Astrophysical Journal.

This did not come easily. I think I mentioned that the predicted effect is subtle. We’re no longer talking about the effect of a big host on a tiny dwarf up close to it. We’re talking about the background of everything on giant galaxies. Space is incomprehensibly vast, so every galaxy is far, far away, and the expected effect is small. So my first reaction was “Sure. Great idea. No way can we do this with current data.” I am pleased to report that I was wrong: with lots of hard work, perseverance, and the power of Bayesian statistics, we have obtained a positive detection of the EFE.

One reason for my initial skepticism was the importance of data quality. The rotation curves in SPARC are a heterogeneous lot, being the accumulated work of an entire community of radio astronomers over the course of several decades. Some galaxies are bright and their data stupendous, others… not so much. Having started myself working on low surface brightness galaxies – the least stupendous of them all – and having spent much of the past quarter century working long and hard to improve the data, I tend to be rather skeptical of what can be accomplished.

An example of a galaxy with good data is NGC 5055 (aka M63, aka the Sunflower galaxy, pictured atop as viewed by the Hubble Space Telescope). NGC 5055 happens to reside in a relatively high acceleration environment for a spiral, with e_env ~ 9% of a₀. For comparison, the acceleration at the last measured point of its rotation curve is about 15% of a₀. So they’re within a factor of two, which is pretty much the strongest effect in the whole sample. This additional bit of acceleration means NGC 5055 is not quite as deep in the MOND regime as it would be all by its lonesome, with the net effect that the rotation curve is predicted to decline a little bit faster than it would in the isolated case, as you can see in the figure below. See that? Or is it too subtle? I think I mentioned the effect was pretty subtle.

The rotation curve of NGC 5055 (velocity in km/s vs. radius in kpc). The blue and green bands are the rotation expected from the observed stars and gas. The red band is the MOND fit with (left) and without (right) the external field effect (EFE) from Chae et al. ΔBIC is a statistical measure that indicates that the fit with the EFE is a meaningful improvement over that without (in technical terms, “way better”).
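For readers unfamiliar with ΔBIC, here is a minimal sketch of how a Bayesian Information Criterion comparison between two fits works. It assumes Gaussian errors, so that BIC reduces to chi-squared plus a penalty of ln(n) per free parameter, and it adopts the sign convention that positive values favor the fit with the EFE. The function names and all numbers are made up for illustration; none of them come from Chae et al.

```python
import math


def bic(chi2, n_params, n_points):
    """BIC for Gaussian errors, up to a constant shared by both models."""
    # Each extra free parameter costs ln(n_points), so it must earn its keep in chi2
    return chi2 + n_params * math.log(n_points)


def delta_bic(chi2_without, k_without, chi2_with, k_with, n_points):
    """BIC(without EFE) - BIC(with EFE); positive values favor including e (assumed convention)."""
    return bic(chi2_without, k_without, n_points) - bic(chi2_with, k_with, n_points)


# Made-up illustrative numbers, NOT values from any real rotation curve fit
example = delta_bic(chi2_without=80.0, k_without=3, chi2_with=50.0, k_with=4, n_points=30)
no_gain = delta_bic(chi2_without=50.0, k_without=3, chi2_with=50.0, k_with=4, n_points=30)

print(f"toy Delta BIC when the extra parameter helps: {example:.1f}")
print(f"toy Delta BIC when it does not help:          {no_gain:.1f}")
```

The point of the penalty term is that an extra parameter that does not improve the fit makes ΔBIC negative under this convention, which guards against a free parameter that merely appears to help.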

That this case works well is encouraging. I like to start with a good case: if you can’t see what you’re looking for in the best of the data, stop. But I still didn’t hold out much hope for the rest of the sample. Then Kyu showed that the most isolated galaxies – those subject to the lowest environmental accelerations – showed no effect. That sounds boring, but null results are important. It could happen that the environmental acceleration was a promiscuous free parameter that appears to improve a fit without really adding any value. That it declined to do that in cases where it shouldn’t was intriguing. The galaxies in the most extreme environments show an effect when they should, but don’t when they shouldn’t.

Statistical detection of the EFE

Statistics become useful for interpreting the entirety of the large sample of galaxies. Because of the variability in data quality, we knew some cases would go astray. But we only need to know if the fit for any galaxy is improved relative to the case where the EFE is neglected, so each case sets its own standard. This relative measure is more robust than analyses that require an assessment of the absolute fit quality. All we’re really asking the data is whether the presence of an EFE helps. To my initial and ongoing amazement, it does.

The environmental acceleration predicted by the distribution of known galaxies, e_env, against the amplitude e of an external field that provides the best-fit to each rotation curve (Fig. 5 of Chae et al).

The figure above shows the amplitude of the EFE that best fits each rotation curve along the x-axis. The median is 5% of a₀. This is non-zero at 4.7σ, and our detection of the EFE is comparable in quality to that of the Baryon Acoustic Oscillation or the accelerated expansion of the universe when these were first accepted. Of course, these were widely anticipated effects, while the EFE is expected only in MOND. Personally, I think it is a mistake to obsess over the number of σ, which is not as robust as people like to think. I am more impressed that the peak of the color map (the darkest color in the data density map above) is positive definite and clearly non-zero.

Taken together, the data prefer a small but clearly non-zero EFE. That’s a statistical statement for the whole sample. Of course, the amplitude (e) of the EFE inferred for individual galaxies is uncertain, and is occasionally negative. This is unphysical: it shouldn’t happen. Nevertheless, it is statistically expected given the amount of uncertainty in the data: for error bars this size, some of the data should spill over to e < 0.

I didn’t initially think we could detect the EFE in this way because I expected that the error bars would wash out the effect. That is, I expected the colored blob above would be smeared out enough that the peak would encompass zero. That’s not what happened, me of little faith. I am also encouraged that the distribution skews positive: the error bars scatter points in both directions, and wind up positive more often than negative. That’s an indication that they started from an underlying distribution centered on e > 0, not e = 0.
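The spill-over to e < 0 can be illustrated with a toy Monte Carlo. This is only a sketch under loud assumptions: every mock galaxy is given the same true e (set to the 5% median quoted above), the measurement error is Gaussian with a hypothetical width chosen for illustration, and the number of mock galaxies is arbitrary. None of this reproduces the actual analysis.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the toy run is repeatable

TRUE_E = 0.05  # underlying e in units of a0 (the median quoted in the text)
SIGMA = 0.05   # HYPOTHETICAL per-galaxy measurement error, chosen for illustration
N = 10_000     # number of mock galaxies; arbitrary toy value

# Scatter each mock measurement around the same positive true value
measured = [rng.gauss(TRUE_E, SIGMA) for _ in range(N)]

frac_negative = sum(e < 0 for e in measured) / N
median_e = statistics.median(measured)

print(f"fraction of mock galaxies with e < 0: {frac_negative:.3f}")
print(f"median of the mock measurements:      {median_e:.3f}")
```

Even though every true value is positive, a noticeable minority of the mock measurements land below zero, while the median stays positive. Had the true value been zero, the points would split evenly between positive and negative.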

The y-axis in the figure above is the estimate of the environmental acceleration based on the 2M++ galaxy catalog. This is entirely independent of the best fit e from rotation curves. It is the expected EFE from the distribution of mass that we know about. The median environmental EFE found in this way is 3% of a₀. This is pretty close to the 2% I estimated over 20 years ago. Given the uncertainties, it is quite compatible with the median of 5% found from the rotation curve fits.

In an ideal world where all quantities are perfectly known, there would be a correlation between the external field inferred from the best fit to the rotation curves and that of the environment predicted by large scale structure. We are nowhere near to that ideal. I can conceive of improving both measurements, but I find it hard to imagine getting to the point where we can see a correlation between e and e_env. The data quality required on both fronts would be stunning.

Then again, I never thought we could get this far, so I am game to give it a go.

Oh… you don’t want to look in there

This post is a recent conversation with David Garofalo for his blog.

Today we talk to Dr. Stacy McGaugh, Chair of the Astronomy Department at Case Western Reserve University.

David: Hi Stacy. You had set out to disprove MOND and instead found evidence to support it. That sounds like the poster child for how science works. Was praise forthcoming?

Stacy: In the late 1980s and into the 1990s, I set out to try to understand low surface brightness galaxies. These are diffuse systems of stars and gas that rotate like the familiar bright spirals, but whose stars are much more spread out. Why? How did these things come to be? Why were they different from brighter galaxies? How could we explain their properties? These were the problems I started out working on that inadvertently set me on a collision course with MOND.

I did not set out to prove or disprove either MOND or dark matter. I was not really even aware of MOND at that time. I had heard of it only on a couple of occasions, but I hadn’t paid any attention, and didn’t really know anything about it. Why would I bother? It was already well established that there had to be dark matter.

I worked to develop our understanding of low surface brightness galaxies in the context of dark matter. Their blue colors, low metallicities, high gas fractions, and overall diffuse nature could be explained if they had formed in dark matter halos that are themselves lower than average density: they occupy the low concentration side of the distribution of dark matter halos at a given mass. I found this interpretation quite satisfactory, so it gave me no cause to doubt dark matter to that point.

This picture made two genuine predictions that had yet to be tested. First, low surface brightness galaxies should be less strongly clustered than brighter galaxies. Second, having their mass spread over a larger area, they should shift off of the Tully-Fisher relation defined by denser galaxies. The first prediction came true, and for a period I was jubilant that we had made an important new contribution to our understanding of both galaxies and dark matter. The second prediction failed badly: low surface brightness galaxies adhere to the same Tully-Fisher relation that other galaxies follow.

I tried desperately to understand the failure of the second prediction in terms of dark matter. I tried what seemed like a thousand ways to explain this, but ultimately they were all tautological: I could only explain it if I assumed the answer from the start. The adherence of low surface brightness galaxies to the Tully-Fisher relation poses a serious fine-tuning problem: the distribution of dark matter must be adjusted to exactly counterbalance that of the visible matter so as not to leave any residuals. This makes no sense, and anyone who claims it does is not thinking clearly.

It was in this crisis of comprehension that I became aware that MOND predicted exactly what I was seeing. No fine-tuning was required. Low surface brightness galaxies followed the same Tully-Fisher relation as other galaxies because the modified force law stipulates that they must. It was only at this point (in the mid-’90s) that I started to take MOND seriously. If it had got this prediction right, what else did it predict?

I was still convinced that the right answer had to be dark matter. There was, after all, so much evidence for it. So this one prediction must be a fluke; surely it would fail the next test. That was not what happened: MOND passed test after test after test, successfully predicting observations both basic and detailed that dark matter theory got wrong or did not even address. It was only after this experience that I realized that what I thought was evidence for dark matter was really just evidence that something was wrong: the data cannot be explained with ordinary gravity without invisible mass. The data – and here I mean ALL the data – were mostly ambiguous: they did not clearly distinguish whether the problem was with mass we couldn’t see or with the underlying equations from which we inferred the need for dark matter.

So to get back to your original question, yes – this is how science should work. I hadn’t set out to test MOND, but I had inadvertently performed exactly the right experiment for that purpose. MOND had its predictions come true where the predictions of other theories did not: both my own theory and those of others who were working in the context of dark matter. We got it wrong while MOND got it right. That led me to change my mind: I had been wrong to be sure the answer had to be dark matter, and to be so quick to dismiss MOND. Admitting this was the most difficult struggle I ever faced in my career.

David: From the perspective of dark matter, how does one understand MOND’s success?

Stacy: One does not.

That the predictions of MOND should come true in a universe dominated by dark matter makes no sense.

Before I became aware of MOND, I spent lots of time trying to come up with dark matter-based explanations for what I was seeing. It didn’t work. Since then, I have continued to search for a viable explanation with dark matter. I have not been successful. Others have claimed such success, but whenever I look at their work, it always seems that what they assert to be a great success is just a specific elaboration of a model I had already considered and rejected as obviously unworkable. The difference boils down to Occam’s razor. If you give dark matter theory enough free parameters, it can be adjusted to “predict” pretty much anything. But the best we can hope to do with dark matter theory is to retroactively explain what MOND successfully predicted in advance. Why should we be impressed by that?

David: Does MOND fail in clusters?

Stacy: Yes and no: there are multiple tests in clusters. MOND passes some and flunks others – as does dark matter.

The most famous test is the baryon fraction. This should be one in MOND – all the mass is normal baryonic matter. With dark matter, it should be the cosmic ratio of normal to dark matter (about 1:5).

MOND fails this test: it explains most of the discrepancy in clusters, but not all of it. The dark matter picture does somewhat better here, as the baryon fraction is close to the cosmic expectation — at least for the richest clusters of galaxies. In smaller clusters and groups of galaxies, the normal matter content falls short of the cosmic value. So both theories suffer a “missing baryon” problem: MOND in rich clusters; dark matter in everything smaller.

Another test is the mass-temperature relation. Both theories predict a relation between the mass of a cluster and the temperature of the gas it contains, but they predict different slopes for this relation. MOND gets the slope right but the amplitude wrong, leading to the missing baryon problem above. Dark matter gets the amplitude right for the most massive clusters, but gets the slope wrong – which leads to it having a missing baryon problem for systems smaller than the largest clusters.

There are other tests. Clusters continue to merge; the collision velocity of merging clusters is predicted to be higher in MOND than with dark matter. For example, the famous bullet cluster, which is often cited as a contradiction to MOND, has a collision speed that is practically impossible with dark matter: there just isn’t enough time for the two components of the bullet to accelerate up to the observed relative speed if they fall together under the influence of normal gravity and the required amount of dark mass. People have argued over the severity of this perplexing problem, but the high collision speed happens quite naturally in MOND as a consequence of its greater effective force of attraction. So, taken at face value, the bullet cluster both confirms and refutes both theories!

I could go on… one expects clusters to form earlier and become more massive in MOND than in dark matter. There are some indications that this is the case – the highest redshift clusters came as a surprise to conventional structure formation theory – but the relative numbers of clusters as a function of mass seem to agree well with current expectations with dark matter. So clusters are a mixed bag.

More generally, there is a widespread myth that MOND fits rotation curves, but gets nothing else right. This is what I expected to find when I started fact checking, but the opposite is true. MOND explains a huge variety of data well. The presumptive superiority of dark matter is just that – a presumption.

David: At a physics colloquium two decades ago, Vera Rubin described how theorists were willing and eager to explain her data to her. At an astronomy colloquium a few years later, you echoed that sentiment in relation to your data on velocity curves. One concludes that theorists are uniquely insightful and generous people. Is there anyone you would like to thank for putting you straight?

Stacy: So they perceive themselves to be.

MOND has made many successful a priori predictions. This is the gold standard of the scientific method. If there is another explanation for it, I’d like to know what it is.

As your question supposes, many theorists have offered such explanations. At most one of them can be correct. I have yet to hear a satisfactory explanation.

David: What are MOND people working on these days?

Stacy: Any problem that is interesting in extragalactic astronomy is interesting in the context of MOND. Outstanding questions include planes of satellite dwarf galaxies, clusters of galaxies, the formation of large scale structure, and the microwave background. MOND-specific topics include the precise value of the MOND acceleration constant, predicting the velocity dispersions of dwarf galaxies, and the search for the predicted external field effect, which is a unique signature of MOND.

The phrasing of this question raises a sociological issue. I don’t know what a “MOND person” is. Before now, I have only heard it used as a pejorative.

I am a scientist who has worked on many topics. MOND is just one of them. Does that make me a “MOND person”? I have also worked on dark matter, so am I also a “dark matter person”? Are these mutually exclusive?

I have attended conferences where I have heard people say ‘“MOND people” do this’ or ‘“MOND people” fail to do that.’ Never does the speaker of these words specify who they’re talking about: “MOND people” are a nameless Other. In all cases, I am more familiar with the people and the research they pretend to describe, but in no way do I recognize what they’re talking about. It is just a way of saying “Those People” are Bad.

There are many experts on dark matter in the world. I am one of them. There are rather fewer experts on MOND. I am also one of them. Every one of these “MOND people” is also an expert on dark matter. This situation is not reciprocated: many experts on dark matter are shockingly ignorant about MOND. I was once guilty of that myself, but realized that ignorance is not a sound basis on which to base a scientific judgement.

David: Are you tired of getting these types of questions?

Stacy: Yes and no.

No, in that these are interesting questions about fundamental science. That is always fun to talk about.

Yes, in that I find myself having the same arguments over and over again, usually with scientists who remain trapped in the misconceptions I suffered myself a quarter century ago, but whose minds are closed to ideas that threaten their sacred cows. If dark matter is a real, physical substance, then show me a piece already.

Triton Station

A Blog About the Science and Sociology of Cosmology and Dark Matter
