A Blog About the Science and Sociology of Cosmology and Dark Matter
Stacy McGaugh is an astrophysicist and cosmologist who studies galaxies, dark matter, and theories of modified gravity. He is an expert on low surface brightness galaxies, a class of objects in which the stars are spread thin compared to bright galaxies like our own Milky Way. He demonstrated that these dim galaxies appear to be dark matter dominated, providing unique tests of theories of galaxy formation and modified gravity.
Professor McGaugh is currently the chair of the Department of Astronomy at Case Western Reserve University in Cleveland, Ohio, and director of the Warner and Swasey Observatory. Previously he was a member of the faculty at the University of Maryland, having also held research fellowships at Rutgers, the Department of Terrestrial Magnetism of the Carnegie Institution of Washington, and the Institute of Astronomy at the University of Cambridge after earning his Ph.D. from the University of Michigan.
The dominant paradigm for dark matter has long been the weakly interacting massive particle (WIMP). WIMPs are hypothetical particles motivated by supersymmetry. This is a well-posed scientific hypothesis insofar as it makes a testable prediction: the cold dark matter thought to dominate the cosmic mass budget should be composed of a particle with a mass in the neighborhood of 100 GeV that interacts via the weak nuclear force – hence the name.
That WIMPs couple to the weak nuclear force as well as to gravity is what gives us a window to detect them in the laboratory. They should scatter off of nuclei of comparable mass, albeit only on the rare occasions dictated by the weak force. If we build big enough detectors, we should see it happen. This is what a whole host of massive, underground experiments have been looking for. So far, these experiments have succeeded in failing to detect WIMPs: if WIMPs existed with the properties we predicted them to have, they would have been detected by now.
The failure to find WIMPs has led to the consideration of a myriad of other possibilities. Few of these are as well motivated as the original WIMP. Some have nifty properties that might help with the phenomenology of galaxies. Most are woefully uninformed by such astrophysical considerations, as it is hard enough to do the particle physics without violating some basic constraint.
One possibility that most of us have been reluctant to contemplate is a particle that doesn’t interact at all via strong, weak, or electromagnetic forces. We already know that dark matter cannot interact via electromagnetism, as it wouldn’t be dark. It is similarly difficult to hide a particle that responds to the strong force (though people have of course tried, with strange nuggets in the ’80s and their modern reincarnation, the macro). But why should a particle have to interact at least through the weak force, as WIMPs do? No reason. So what if there is a particle that has zero interaction with standard model particles? It has mass and therefore gravity, but otherwise interacts with the rest of the universe not at all. Let’s call this the Angel Particle, because it will never reveal itself, no matter how much we pray for divine intervention.
I first heard this idea mooted in a talk by Tom Shutt in the early teens. He is a leader in the search for WIMPs, and has been since the outset. So to suggest that the dark matter is something that simply cannot be detected in the laboratory was anathema. A logical possibility to be noted, but only in passing with a shudder of existential dread: the legions of experimentalists looking for dark matter are wasting their time if there is no conceivable signal to detect.
Flash forward a decade, and what was anathema then seems reasonable now that WIMPs remain AWOL. I hear some theorists saying “why not?” with a straight face. “Why shouldn’t there be a particle that doesn’t interact with anything else?”
On the one hand, it’s true. As long as we’re making up particles outside the boundaries of known physics, I know of nothing that precludes us from inventing one that has zero interactions. On the other hand, how would we ever know? We would just give up on laboratory searches, and accept on faith that “gravitational detection” from astronomical evidence is adequate – and indeed, the only possible evidence for invisible mass.
Experimentalists go home! Your services are not required.
To me, this is not physics. There is no way to falsify this hypothesis, or even test it. I was already concerned that WIMPs are not strictly falsifiable. They can be confirmed if found in the laboratory, but if they are not found, we can always tweak the prediction – all the way to this limit of zero interaction, a situation I’ve previously described as the express elevator to hell.
If there is no way to test a hypothesis to destruction, it is metaphysics, not physics. Entertaining the existence of a particle with zero interaction cross-section is a logical possibility, but it is also a form of magical thinking. It provides a way to avoid confronting the many problems with the current paradigm. Indeed, it provides an excuse to never have to deal with them. This way lies madness, and the end of scientific rationalism. We might just as well imagine that angels are responsible for moving objects about.
Indeed, the only virtue of this hypothesis that springs to mind is to address the age-old question: how many angels can dance on the head of a pin? We know from astronomical data that the local density of angel particles must be about 1/4 GeV cm⁻³. Let’s say the typical pin head is a cylinder with a diameter of 2.5 mm and a thickness of 1 mm, giving it a volume of about 5 mm³. Doing a few unit conversions, this means a dark mass of about 1 MeV* per pin head, so exactly one angel can occupy the head of a pin if the mass of the Angel particle is 1 MeV.
Of course, we have no idea what the mass of the Angel particle is, so we’ve really only established a limit: 1 MeV is the upper limit for the mass of an angel that can fit on the head of a pin. If it weighs more than 1 MeV, the answer is zero: an angel is too fat to fit on the head of a pin. If angels weigh less than 1 MeV, then they can fit numbers in inverse proportion to their mass. If it is as small as 1 eV, then a million angels can party on the vast dance floor that is the head of a pin.
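For anyone who wants to check the unit conversions, here is a minimal sketch of the arithmetic in Python. It takes the made-up pin head dimensions and the quoted local density at face value; the function names are invented for illustration. With these inputs the cylinder comes out at roughly 4.9 mm³ and the enclosed dark mass at roughly 1.2 MeV, which rounds to the 1 MeV used above.

```python
import math

RHO_GEV_PER_CM3 = 0.25   # local dark matter density quoted in the post
DIAMETER_MM = 2.5        # made-up pin head diameter from the post
THICKNESS_MM = 1.0       # made-up pin head thickness from the post


def pin_head_volume_mm3(diameter_mm=DIAMETER_MM, thickness_mm=THICKNESS_MM):
    """Volume of a cylindrical pin head, in cubic millimetres."""
    radius_mm = diameter_mm / 2.0
    return math.pi * radius_mm ** 2 * thickness_mm


def dark_mass_mev(volume_mm3, rho_gev_per_cm3=RHO_GEV_PER_CM3):
    """Dark mass-energy enclosed in the given volume, in MeV."""
    volume_cm3 = volume_mm3 / 1000.0              # 1 cm^3 = 1000 mm^3
    return rho_gev_per_cm3 * volume_cm3 * 1000.0  # 1 GeV = 1000 MeV


def angels_per_pin(angel_mass_ev, budget_ev=1_000_000):
    """Whole angels that fit within the mass budget of one pin head.

    The default budget is the 1 MeV the post rounds to; both arguments
    are in eV so the division stays in exact integer arithmetic.
    """
    return budget_ev // angel_mass_ev


if __name__ == "__main__":
    volume = pin_head_volume_mm3()
    print(f"pin head volume: {volume:.2f} mm^3")
    print(f"dark mass per pin head: {dark_mass_mev(volume):.2f} MeV")
    for mass_ev in (2_000_000, 1_000_000, 1_000, 1):
        print(f"angel mass {mass_ev} eV -> {angels_per_pin(mass_ev)} angel(s)")
```

The integer division is deliberate: an angel heavier than the budget gives zero, and lighter angels fit in inverse proportion to their mass.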
So I guess we still haven’t answered the age-old question, and it looks like we never will.
*An electron is about half an MeV, so it is tempting to imagine dark matter composed of positronium. This does not work for many reasons, not least of which is that a mass of 1 MeV is a coincidence of the volume of the head of a pin that I made up for ease of calculation without bothering to measure the size of an actual pin – not to mention that the size of pins has nothing whatever to do with the dark matter problem. Another reason is that, being composed of an electron and its antiparticle the positron, positronium is unstable and self-annihilates into gamma rays in less than a nanosecond – rather less than the Hubble time that we require for dark matter to still be around at this juncture. Consequently, this hypothesis is immediately off by a factor of 10²⁸, which is the sort of thing that tends to happen when you try to construct dark matter from known particles – hence the need to make up entirely new stuff.
God forbid we contemplate that maybe the force law might be broken. How crazy would that be?
I’ve reached the point in the semester teaching cosmology where I’ve gone through the details of what we call the three empirical pillars of the hot big bang:
Hubble Expansion
Primordial [Big Bang] Nucleosynthesis (BBN)
Relic Radiation (aka the Cosmic Microwave Background; CMB)
These form an interlocking set of evidence and consistency checks that leave little room for doubt that we live in an expanding universe that passed through an early, hot phase that bequeathed us the isotopes of the light elements (mostly hydrogen and helium with a dash of lithium) and left us bathing in the relic radiation that we perceive all across the sky as the CMB, the redshifted epoch of last scattering. While I worry about everything, as any good scientist does, I do not seriously doubt that this basic picture is essentially correct.
This basic picture is rather general. Many people seem to conflate it with one specific realization, namely Lambda Cold Dark Matter (LCDM). That’s understandable, because LCDM is the only model that remains viable within the framework of General Relativity (GR). However, that does not inevitably mean it must be so; one can imagine more general theories than GR that contain all the usual early universe results. Indeed, it is hard to imagine otherwise, since such a theory – should it exist – has to reproduce all the successes of GR just as GR had to reproduce all the successes of Newton.
Writing a theory that generalizes GR is a very tall order, so how would we know if we should even attempt such a daunting enterprise? This is not an easy question to answer. I’ve been posing it to myself and others for a quarter century. Answers received range from “Why would you even ask that, you fool?” to “Obviously GR needs to be supplanted by a quantum theory of gravity.”
One red flag that a theory might be in trouble is when one has to invoke tooth fairies to preserve it. These are what the philosophers of science more properly call auxiliary hypotheses: unexpected elements that are not part of the original theory that we have been obliged to add in order to preserve it. Modern cosmology requires two:
Non-baryonic cold dark matter
Lambda (or its generalization, dark energy)
LCDM. The tooth fairies are right there in the name.
Lambda and CDM are in no way required by the original big bang hypothesis, and indeed, both came as a tremendous surprise. They are auxiliary hypotheses forced on us by interpreting the data strictly within the framework of GR. If we restrict ourselves to this framework, they are absolute requirements. That doesn’t guarantee they exist; hence the need to conduct laboratory experiments to detect them. If we permit ourselves to question the framework, then we say, gee, who ordered this?
Let me be clear that the data are absolutely clear that something is wrong. There is no doubt of the need for dark matter in the conventional framework of GR. I teach an entire semester course on the many and various empirical manifestations of mass discrepancies in the universe. There is no doubt that the acceleration discrepancy (as Bekenstein called it) is a real set of observed phenomena. At issue is the interpretation: does this indicate literal invisible mass, or is it an indication of the failings of current theory?
Similarly for Lambda. Here is a nice plot of the expansion history of the universe by Saul Perlmutter. The colors delineate the region of possible models in which the expansion either decelerates or accelerates. There is no doubt that the data fall on the accelerating side.
I’m old enough to remember when the blue (accelerating) region of this diagram was forbidden. Couldn’t happen. Data falling in that portion of the diagram would falsify cosmology. The only reason it didn’t is because we could invoke Einstein’s greatest blunder as an auxiliary hypothesis to patch up our hypothesis. That we had to do so is why the whole dark energy thing is such a big deal. Ironically, one can find many theoretical physicists eagerly pursuing modified theories of gravity to explain the need for Lambda without for a moment considering whether this might also apply to the dark matter problem.
When and where one enters the field matters. At the turn of the century, dark energy was the hot, new, interesting problem, and many people chose to work on it. Dark matter was already well established. So much so that students of that era (who are now faculty and science commentators) understandably confuse the empirical dark matter problem with its widely accepted if still hypothetical solution in the form of some as-yet undiscovered particle. Indeed, overcoming this mindset in myself was the hardest challenge I have faced in an entire career full of enormous challenges.
Another issue with dark matter, as commonly conceptualized, is that it cannot be normal matter that happens not to shine as stars. It is very reasonable to imagine that there are dark baryons, and it is pretty clear that there are. Early on (circa 1980), it seemed like this might suffice. It does not. However, it helped the notion of dark matter transition from an obvious affront to the scientific method to a plausible if somewhat outlandish hypothesis to an inevitable requirement for some entirely new form of particle. That last part is key: we don’t just need ordinary mass that is hard to see, we need some form of non-baryonic entity that is completely invisible and resides entirely outside the well-established boundaries of the standard model of particle physics and that has persistently evaded laboratory signals where predicted.
One becomes concerned about a theory when it becomes too complicated. In the case of cosmology, it isn’t just the Lambda and the cold dark matter. These are just a part of a much larger balancing act. The Hubble tension is a late comer to a long list of tensions among independent observations that have been mounting for so long that I reproduce here a transparency I made to illustrate the situation. That’s right, a transparency, because this was already an issue before the end of the twentieth century.
The details have changed, but the situation remains the same. The chief thing that has changed is the advent of precision cosmology. Fits to CMB data are now so accurate that we’ve lost our historical perspective on the slop traditionally associated with cosmological observables. CMB fits are of course made under the assumption of GR+Lambda+CDM. Rather than question these assumptions when some independent line of evidence disagrees, we assume that the independent line of evidence is wrong. The opportunities for confirmation bias are rife.
I hope that it is obvious to everyone that Lambda and CDM are auxiliary hypotheses. I took the time to spell it out because most scientists have subsumed them so deeply into their belief systems that they forget that’s what they are. It is easy to find examples of people criticizing MOND as a tooth fairy as if dark matter is not itself the biggest, most flexible, literally invisible tooth fairy you can imagine. We expected none of this!
I wish to highlight here one other tooth fairy: feedback. It is less obvious that this is a tooth fairy, since it is a very real physical effect. Indeed, it is a whole suite of distinct physical effects, each with very different mechanisms and modes of operation. There are, for example, stellar winds, UV radiation from massive stars, supernovae when those stars explode, X-rays from compact sources like neutron stars, and relativistic jets from supermassive black holes at the centers of galactic nuclei. The mechanisms that drive these effects occur on scales that are impossibly tiny from the perspective of cosmology, so they cannot be modeled directly in cosmological simulations. The only computer that has both the size and the resolution to do this calculation is the universe itself.
To account for effects below their resolution limit, simulators have come up with a number of schemes to account for this “sub-grid physics.” Therein lies the rub. There are many different approaches to this, and they do not all produce the same results. We do not understand feedback well enough to model it accurately as subgrid physics. Simulators usually invoke supernova feedback as the primary effect in dwarf galaxies, while observers tell us that stellar winds do most of the damage on the scale of star forming regions – a scale that is much smaller than the scale simulators are concerned with, that of entire galaxies. What the two communities mean by the word feedback is not the same.
On the one hand, it is normal in the course of the progress of science to need to keep working on something like how best to model feedback. On the other hand, feedback has become the go-to explanation for any observation that does not conform to the predictions of LCDM. In that application, it becomes an auxiliary hypothesis. Many plausible implementations of feedback have been rejected for doing the wrong thing in simulations. But what if one of those was the right implementation, and the underlying theory is wrong? How can we tell when we keep iterating the implementation to get the right answer?
Bear in mind that there are many forms of feedback. That one word upon which our entire cosmology has become dependent is not a single auxiliary hypothesis. It is more like a Russian nesting doll of multiple tooth fairies, one inside another. Imagining that these different, complicated effects must necessarily add up to just the right outcome is dangerous: anything we get wrong we can just blame on some unknown imperfection in the feedback prescription. Indeed, most of the papers on this topic that I see aren’t even addressing the right problem. Often they claim to fix the cusp-core problem without addressing the fact that this is merely one symptom of the observed MOND phenomenology in galaxies. This is like putting a bandage on an amputation and pretending like the treatment is complete.
The universe is weirder than we know, and perhaps weirder than we can know. This provides boundless opportunity for self-delusion.
It has been two months since my last post. Sorry for the extended silence, but I do have a real job. It is not coincidental that my last post precedes the start of the semester. It has been the best of semesters, but mostly the worst of semesters.
On the positive side, I’m teaching our upper level cosmology course. The students are great, really interested and interactive. Interest has always run high, going back to the first time I taught it (in 1999) as a graduate course at the University of Maryland. Aficionados of web history may marvel at the old course website, which was one of the first of its kind, as was the class – prior to that, graduate level cosmology was often taught as part of extragalactic astronomy. Being a new member of the faculty, it was an obvious gap to fill. I also remember with bemusement receiving Mike A’Hearn (comet expert and PI of Deep Impact) as an envoy from the serious-minded planetary scientists, who wondered if there was enough legitimate substance to the historically flaky subject of cosmology to teach a full three credit graduate course on the subject. Being both an expert and a skeptic, it was easy to reassure him: yes.
That class was large for a graduate level course, being taken in equal numbers by both astronomy and physics students. The astronomers were shocked and horrified that I went so deeply into the background theory to frame the course from the outset, and frequently asked “what’s a metric?” while the physicists loved that part. When we got to observational constraints, you could see the astronomers’ eyes glaze – not the distance scale again – while the physicists desperately asked “what’s a distance modulus?” This dichotomy persists.
This semester’s course is the largest it has ever been, up 70% from previous already-large enrollments. This is consistent with the explosive growth of the field. Interest in the field has never been higher. The number of astronomy majors has doubled over the past decade, having doubled already in the preceding decade.
That’s the good news. The bad news is that over the past four years, our department has been allowed to wither. In 2018, we were the smallest astronomy department in the country, with five tenured professors and an observatory manager who functioned as research faculty. The inevitable retirements that we had warned our administration were coming arrived, and we were allowed to fall off the demographic cliff (a common problem here and at many institutions). Despite the clear demand and the depth, breadth, and diversity of the available talent pool, the only faculty hire we have made in the past decade was an instructor (a rank that differs from a professor in having no research obligations), so now we are a department of two tenured professors and one instructor. I thought we were already small! It boggles the mind when you realize that the three of us are obliged to cover literally the entire universe in our curriculum.
Though always a small department, we managed. Now we don’t manage so much as cling to the edge of the cliff by our fingernails. We can barely cover the required courses for our majors. During the peak of concern about the Covid pandemic, we Chairs were asked to provide a plan for covering courses should one or some of our faculty become ill for an extended period. What a joke. The only “plan” I could offer was “don’t get sick.”
We did at least get along, which is not the case with faculty in all departments. The only minor tension we sometimes encountered was the distribution of research students. A Capstone (basically a senior thesis) is required here, and some faculty wound up with a higher supervisory load than others. That is baked-in now, as we have fewer faculty but more students to supervise.
We have reached a breaking point. The only way to address the problems we face is to hire new faculty. So the solution proffered by the dean is to merge our department into Physics.
Regardless of any other pros and cons, a merger does nothing to address the fundamental problem: we need astronomers to teach the astronomy curriculum. We need astronomers to conduct astronomy research, and to have a critical mass for a viable research community. In short, we need astronomers to do astronomy.
I have been Chair of the CWRU Department of Astronomy for over seven years now. Prof. Mihos served in this capacity for six years before that. No sane faculty member wants to be Chair; it is a service obligation we take on because there are tasks that need doing to serve our students and enable our research. Though necessary, these tasks are a drain on the person doing them, and detract from our ability to help our students and conduct research. Having sustained the department for this long to be told we needn’t have bothered is a deep and profound betrayal. I did not come here to turn out the lights.
Dark matter remains undetected in the laboratory. This has been true for forever, so I don’t know what drives the timing of the recent spate of articles encouraging us to keep the faith, that dark matter is still a better idea than anything else. This depends on how we define “better.”
There is a long-standing debate in the philosophy of science about the relative merits of accommodation and prediction. A scientific theory should have predictive power. It should also explain all the relevant data. To do the latter almost inevitably requires some flexibility in order to accommodate things that didn’t turn out exactly as predicted. What is the right mix? Do we lean more towards prediction, or accommodation? The answer to that defines “better” in this context.
One of the recent articles is titled “The dark matter hypothesis isn’t perfect, but the alternatives are worse” by Paul Sutter. This perfectly encapsulates the choice one has to make in what is unavoidably a value judgement. Is it better to accommodate, or to predict (see the Spergel Principle)? Dr. Sutter comes down on the side of accommodation. He notes a couple of failed predictions of dark matter, but mentions no specific predictions of MOND (successful or not) while concluding that dark matter is better because it explains more.
One important principle in science is objectivity. We should be even-handed in the evaluation of evidence for and against a theory. In practice, that is very difficult. As I’ve written before, it made me angry when the predictions of MOND came true in my data for low surface brightness galaxies. I wanted dark matter to be right. I felt sure that it had to be. So why did this stupid MOND theory have any of its predictions come true?
One way to check your objectivity is to look at it from both sides. If I put on a dark matter hat, then I largely agree with what Dr. Sutter says. To quote one example:
The dark matter hypothesis isn’t perfect. But then again, no scientific hypothesis is. When evaluating competing hypotheses, scientists can’t just go with their guts, or pick one that sounds cooler or seems simpler. We have to follow the evidence, wherever it leads. In almost 50 years, nobody has come up with a MOND-like theory that can explain the wealth of data we have about the universe. That doesn’t make MOND wrong, but it does make it a far weaker alternative to dark matter.
OK, so now let’s put on a MOND hat. Can I make the same statement?
The MOND hypothesis isn’t perfect. But then again, no scientific hypothesis is. When evaluating competing hypotheses, scientists can’t just go with their guts, or pick one that sounds cooler or seems simpler. We have to follow the evidence, wherever it leads. In almost 50 years, nobody has detected dark matter, nor come up with a dark matter-based theory with the predictive power of MOND. That doesn’t make dark matter wrong, but it does make it a far weaker alternative to MOND.
So, which of these statements is true? Well, both of them. How do we weigh the various lines of evidence? Is it more important to explain a large variety of the data, or to be able to predict some of it? This is one of the great challenges when comparing dark matter and MOND. They are incommensurate: the set of relevant data is not the same for both. MOND makes no pretense to provide a theory of cosmology, so it doesn’t even attempt to explain much of the data so beloved by cosmologists. Dark matter explains everything, but, broadly defined, it is not a theory so much as an inference – assuming gravitational dynamics are inviolate, we need more mass than meets the eye. It’s a classic case of comparing apples and oranges.
While dark matter is a vague concept in general, one can build specific theories of dark matter that are predictive. Simulations with generic cold dark matter particles predict cuspy dark matter halos. Galaxies are thought to reside in these halos, which dominate their dynamics. This overlaps with the predictions of MOND, which follow from the observed distribution of normal matter. So, do galaxies look like tracer particles orbiting in cuspy halos? Or do their dynamics follow from the observed distribution of light via Milgrom’s strange formula? The relevant subset of the data very clearly indicates the latter. When head-to-head comparisons like this can be made, the a priori predictions of MOND win, hands down, over and over again. [If this statement sounds wrong, try reading the relevant scientific literature. Being an expert on dark matter does not automatically make one an expert on MOND. To be qualified to comment, one should know what predictive successes MOND has had. People who say variations of “MOND only fits rotation curves” are proudly proclaiming that they lack this knowledge.]
It boils down to this: if you want to explain extragalactic phenomena, use dark matter. If you want to make a prediction – in advance! – that will come true, use MOND.
A lot of the debate comes down to claims that anything MOND can do, dark matter can do better. Or at least as well. Or, if not as well, good enough. This is why conventionalists are always harping about feedback: it is the deus ex machina they invoke in any situation where they need to explain why their prediction failed. This does nothing to explain why MOND succeeded where they failed.
This post-hoc reasoning is profoundly unsatisfactory. Dark matter, being invisible, allows us lots of freedom to cook up an explanation for pretty much anything. My long-standing concern for the dark matter paradigm is not the failure of any particular prediction, but that, like epicycles, it has too much explanatory power. We could use it to explain pretty much anything. Rotation curves flat when they should be falling? Add some dark matter. No such need? No dark matter. Rising rotation curves? Sure, we could explain that too: add more dark matter. Only we don’t, because that situation doesn’t arise in nature. But we could if we had to. (See, e.g., Fig. 6 of de Blok & McGaugh 1998.)
There is no requirement in dark matter that rotation curves be as flat as they are. If we start from the prior knowledge that they are, then of course that’s what we get. If instead we independently try to build models of galactic disks in dark matter halos, very few of them wind up with realistic looking rotation curves. This shouldn’t be surprising: there are, in principle, an uncountably infinite number of combinations of galaxies and dark matter halos. Even if we impose some sensible restrictions (e.g., scaling the mass of one component with that of the other), we still don’t get it right. That’s one reason that we have to add feedback, which suffices according to some, and not according to others.
In contrast, the predictions of MOND are unique. The kinematics of an object follow from its observed mass distribution. The two are tied together by the hypothesized force law. There is a one-to-one relation between what you see and what you get.
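To make the one-to-one relation concrete, here is a minimal sketch in Python of how a rotation speed follows from the visible mass alone. It rests on several assumptions that are not in this post: the galaxy is replaced by a point mass (a crude stand-in for a real mass distribution), the interpolation is the so-called simple function mu(x) = x / (1 + x), which is one common choice among several, and a0 is set to a commonly quoted value of 1.2e-10 m/s². The function names and the 10^11 solar mass example are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # assumed MOND acceleration scale, m s^-2
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.086e19     # kiloparsec, m


def newtonian_acceleration(mass_kg, radius_m):
    """Newtonian acceleration of a point mass at the given radius."""
    return G * mass_kg / radius_m ** 2


def mond_acceleration(g_newton, a0=A0):
    """Solve mu(g / a0) * g = g_newton for g, with mu(x) = x / (1 + x).

    This reduces to the quadratic g^2 - g_newton * g - g_newton * a0 = 0,
    whose positive root is returned.
    """
    return 0.5 * (g_newton + math.sqrt(g_newton ** 2 + 4.0 * g_newton * a0))


def circular_speed(acceleration, radius_m):
    """Speed of a circular orbit with the given centripetal acceleration."""
    return math.sqrt(acceleration * radius_m)


def flat_speed(mass_kg, a0=A0):
    """Asymptotic flat rotation speed in the low-acceleration limit."""
    return (G * mass_kg * a0) ** 0.25


if __name__ == "__main__":
    mass = 1e11 * M_SUN  # hypothetical galaxy mass, for illustration only
    print(f"asymptotic speed: {flat_speed(mass) / 1e3:.0f} km/s")
    for r_kpc in (1, 10, 100, 1000):
        r = r_kpc * KPC
        g_n = newtonian_acceleration(mass, r)
        v_newton = circular_speed(g_n, r) / 1e3
        v_mond = circular_speed(mond_acceleration(g_n), r) / 1e3
        print(f"r = {r_kpc:4d} kpc  Newton: {v_newton:6.1f} km/s  MOND: {v_mond:6.1f} km/s")
```

There is no halo to tune in this sketch: once the mass and the force law are fixed, the curve is fixed. The Newtonian speed keeps falling with radius while the MOND speed levels off at a value set by the mass alone.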
From the perspective of building dark matter models, it’s like the proverbial needle in the haystack: the haystack is the volume of possible baryonic disk plus dark matter halo combinations; the one that “looks like” MOND is the needle. Somehow nature plucks the MOND-like needle out of the dark matter haystack every time it makes a galaxy.
Dr. Sutter says that we shouldn’t go with our gut. That’s exactly what I wanted to do, long ago, to maintain my preference for dark matter. I’d love to do that now so that I could stop having this argument with otherwise reasonable people.
Instead of going with my gut, I’m making a probabilistic statement. In Bayesian terms, the odds of observing MONDian behavior given the prior that we live in a universe made of dark matter are practically zero. In MOND, observing MONDian behavior is the only thing that can happen. That’s what we observe in galaxies, over and over again. Any information criterion shows a strong quantitative preference for MOND when dynamical evidence is considered. That does not happen when cosmological data are considered because MOND makes no prediction there. Concluding that dark matter is better overlooks the practical impossibility that MOND-like phenomenology is observed at all. Of course, once one knows this is what the data show, it seems a lot more likely, and I can see that effect in the literature over the long arc of scientific history. This is why, to me, predictive power is more important than accommodation: what we predict before we know the answer is more important than whatever we make up once the answer is known.
The successes of MOND are sometimes minimized by lumping all galaxies into a single category. That’s not correct. Every galaxy has a unique mass distribution; each one is an independent test. The data for galaxies extend over a large dynamic range, from dwarfs to giants, from low to high surface brightness, from gas to star dominated cases. Dismissing this by saying “MOND only explains rotation curves” is like dismissing Newton for only explaining planets – as if every planet, moon, comet, and asteroid aren’t independent tests of Newton’s inverse square law.
MOND does explain more than rotation curves. That was the first thing I checked. I spent several years looking at all of the data, and have reviewed the situation many times since. What I found surprising is how much MOND explains, if you let it. More disturbing was how often I came across claims in the literature that MOND was falsified by X only to try the analysis myself and find that, no, if you bother to do it right, that’s pretty much just what it predicts. Not in every case, of course – no hypothesis is perfect – but I stopped bothering after several hundred cases. Literally hundreds. I can’t keep up with every new claim, and it isn’t my job to do so. My experience has been that as the data improve, so too does their agreement with MOND.
Dr. Sutter’s article goes farther, repeating a common misconception that “the tweaking of gravity under MOND is explicitly designed to explain the motions of stars within galaxies.” This is an overstatement so strong as to be factually wrong. MOND was explicitly designed to produce flat rotation curves – as was dark matter. However, there is a lot more to it than that. Once we write down the force law, we’re stuck with it. It has lots of other unavoidable consequences that lead to genuine predictions. Milgrom explicitly laid out what these consequences would be, and basically all of them have subsequently been observed. I include a partial table in my last review; it only ends where it does because I had to stop somewhere. These were genuine, successful, a priori predictions – the gold standard in science. Some of them can be explained with dark matter, but many cannot: they make no sense, and dark matter can only accommodate them thanks to its epic flexibility.
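To make "once we write down the force law, we're stuck with it" concrete, here is a minimal sketch of a MOND-style force law for a point mass. The interpolation function mu(x) = x/(1+x), the value a0 = 1.2e-10 m/s^2, and all function names are illustrative assumptions, not taken from the text above. The point is only that a single acceleration scale fixes both the flatness of the rotation curve and its amplitude, V^4 = G M a0.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # MOND acceleration scale, m s^-2 (commonly quoted value; assumed here)
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.086e19     # one kiloparsec, m


def newtonian_acceleration(mass_kg, radius_m):
    """Newtonian acceleration g_N = G M / r^2 for a point mass."""
    return G * mass_kg / radius_m**2


def mond_acceleration(g_newton, a0=A0):
    """Solve mu(a/a0) * a = g_N for a, with the assumed mu(x) = x / (1 + x).

    That choice gives the quadratic a^2 - g_N a - g_N a0 = 0; we take the
    positive root. It tends to g_N when g_N >> a0 and to sqrt(g_N a0)
    when g_N << a0.
    """
    return 0.5 * (g_newton + math.sqrt(g_newton**2 + 4.0 * g_newton * a0))


def circular_velocity_kms(mass_msun, radius_kpc):
    """Circular speed (km/s) at a given radius around a point mass."""
    radius_m = radius_kpc * KPC
    g_n = newtonian_acceleration(mass_msun * M_SUN, radius_m)
    return math.sqrt(mond_acceleration(g_n) * radius_m) / 1e3


def asymptotic_velocity_kms(mass_msun):
    """Flat rotation speed from V^4 = G M a0, in km/s."""
    return (G * mass_msun * M_SUN * A0) ** 0.25 / 1e3


if __name__ == "__main__":
    mass = 1e11  # solar masses, an arbitrary illustrative value
    for r in (10, 100, 500, 1000):
        print(r, "kpc:", round(circular_velocity_kms(mass, r), 1), "km/s")
    print("asymptote:", round(asymptotic_velocity_kms(mass), 1), "km/s")
```

Nothing here can be tuned galaxy by galaxy: once the mass is given, the outer velocity follows, which is why each galaxy counts as a separate test.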
Dr. Sutter makes a number of other interesting points. He says we shouldn’t “pick [a hypothesis] that sounds cooler or seems simpler.” I’m not sure which seems cooler here – a universe pervaded by a mysterious invisible mass that we can’t [yet] detect in the laboratory but nevertheless controls most of what goes on out there seems pretty cool to me. That there might also be some fundamental aspect of the basic theory of gravitational dynamics that we’re missing also seems like a pretty cool possibility. Those are purely value judgments.
Simplicity, however, is a scientific value known as Occam’s razor. The simpler of competing theories is to be preferred. That’s clearly MOND: we make one adjustment to the force law, and that’s it. What we lack is a widely accepted, more general theory that encapsulates both MOND and General Relativity.
In dark matter, we multiply entities unnecessarily – there is extra mass composed of unknown particles that have no place in the Standard Model of particle physics (which is quite full up) so we have to imagine physics beyond the standard model and perhaps an entire dark sector because why just one particle when 85% of the mass is dark? and there could also be dark photons to exchange forces that are only active in the dark sector as well as entire hierarchies of dark particles that maybe have their own ecosystem of dark stars, dark planets, and maybe even dark people. We, being part of the “normal” matter, are just a minority constituent of this dark universe; a negligible bit of flotsam compared to the dark sector. Doesn’t it make sense to imagine that the dark sector has as rich and diverse a set of phenomena as the “normal” sector? Sure – if you don’t mind abandoning Occam’s razor. Note that I didn’t make any of this stuff up; everything I said in that breathless run-on sentence I’ve heard said by earnest scientists enthusiastic about how cool the dark sector could be. Bugger Occam.
There is also the matter of timescales. Dr. Sutter mentions that “In almost 50 years, nobody has come up with a MOND-like theory” that does all that we need it to do. That’s true, but for the typo. Next year (2023) will mark the 40th anniversary of Milgrom’s first publications on MOND, so it hasn’t been half a century yet. But I’ve heard recurring complaints to this effect before, that finding the deeper theory is taking too long. Let’s examine that, shall we?
First, remember some history. When Newton introduced his inverse square law of universal gravity, it was promptly criticized as a form of magical thinking: How, Sir, can you have action at a distance? The conception at the time was that you had to be in physical contact with an object to exert a force on it. For the sun to exert a force on the earth, or the earth on the moon, seemed outright magical. Leibnitz famously accused Newton of introducing ‘occult’ forces. As a consequence, Newton was careful to preface his description of universal gravity as everything happening as if the force was his famous inverse square law. The “as if” is doing a lot of work here, basically saying, in modern parlance “OK, I don’t get how this is possible, I know it seems really weird, but that’s what it looks like.” I say the same about MOND: galaxies behave as if MOND is the effective force law. The question is why.
As near as I can tell from reading the history around this (and I don’t know how clear the record is), it took about 20 years for Newton to realize that there was a good geometric reason for the inverse square law. We expect our freshman physics students to see that immediately. Obviously Newton was smarter than the average freshman, so why’d it take so long? Was he, perhaps, preoccupied with the legitimate-seeming criticisms of action at a distance? It is hard to see past a fundamental stumbling block like that, and I wonder if the situation now is analogous. Perhaps we are missing something now that will seem obvious in retrospect, distracted by criticisms that will seem absurd in the future.
Many famous scientists built on the dynamics introduced by Newton. The Poisson equation isn’t named the Newton equation because Newton didn’t come up with it even though it is fundamental to Newtonian dynamics. Same for the Lagrangian. And the classical Hamiltonian. These developments came many decades after Newton himself, and required the efforts of many brilliant scientists integrated over a lot of time. By that standard, forty years seems pretty short: one doesn’t arrive at a theory of everything overnight.
What is the right measure? The integrated effort of the scientific community is more relevant than absolute time. Over the past forty years, I’ve seen a lot of push back against even considering MOND as a legitimate theory. Don’t talk about that! This isn’t exactly encouraging, so not many people have worked on it. I can count on my fingers the number of people who have made important contributions to the theoretical development of MOND. (I am not one of them. I am an observer following the evidence, wherever it leads, even against my gut feeling and to the manifest detriment of my career.) It is hard to make progress without a critical mass of people working on a problem.
Of course, people have been looking for dark matter for those same 40 years. More, really – if you want to go back to Oort and Zwicky, it has been 90 years. But for the first half century of dark matter, no one was looking hard for it – it took that long to gel as a serious problem. These things take time.
Nevertheless, for several decades now there has been an enormous amount of effort put into all aspects of the search for dark matter: experimental, observational, and theoretical. There is and has been a critical mass of people working on it for a long time. There have been thousands of talented scientists who have contributed to direct detection experiments in dozens of vast underground laboratories, who have combed through data from X-ray and gamma-ray observatories looking for the telltale signs of dark matter decay or annihilation, who have checked for the direct production of dark matter particles in the LHC; even theorists who continue to hypothesize what the heck the dark matter could be and how we might go about detecting it. This research has been well funded, with billions of dollars having been spent in the quest for dark matter. And what do we have to show for it?
Zero. Nada. Zilch. Squat. A whole lot of nothing.
This is equal to the amount of funding that goes to support research on MOND. There is no faster way to get a grant proposal rejected than to say nice things about MOND. So on the one hand, we have a small number of people working on the proverbial shoestring, while on the other, we have a huge community that has poured vast resources into the attempt to detect dark matter. If we really believe it is taking too long, perhaps we should try funding MOND as generously as we do dark matter.
I noted last time that in the rush to analyze the first of the JWST data, “some of these candidate high redshift galaxies will fall by the wayside.” As Maurice Aabe notes in the comments there, this has already happened.
I was concerned because of previous work with Jay Franck in which we found that photometric redshifts were simply not adequately precise to identify the clusters and protoclusters we were looking for. Consequently, we made it a selection criterion when constructing the CCPC to require spectroscopic redshifts. The issue then was that it wasn’t good enough to have a rough idea of the redshift, as the photometric method often provides (what exactly it provides depends in a complicated way on the redshift range, the stellar population modeling, and the wavelength range covered by the observational data that is available). To identify a candidate protocluster, you want to know that all the potential member galaxies are really at the same redshift.
This requirement is somewhat relaxed for the field population, in which a common approach is to ask broader questions of the data like “how many galaxies are at z ~ 6? z ~ 7?” etc. Photometric redshifts, when done properly, ought to suffice for this. However, I had noticed in Jay’s work that there were times when apparently reasonable photometric redshift estimates went badly wrong. So it made the ganglia twitch when I noticed that in early JWST work – specifically Table 2 of the first version of a paper by Adams et al. – there were seven objects with candidate photometric redshifts, and three already had a preexisting spectroscopic redshift. The photometric redshifts were mostly around z ~ 9.7, but the three spectroscopic redshifts were all smaller: two z ~ 7.6, one 8.5.
Three objects are not enough to infer a systematic bias, so I made a mental note and moved on. But given our previous experience, it did not inspire confidence that all the available cases disagreed, and that all the spectroscopic redshifts were lower than the photometric estimates. These things combined to give this observer a serious case of “the heebie-jeebies.”
Adams et al have now posted a revised analysis in which many (not all) redshifts change, and change by a lot. Here is their new Table 4:
There are some cases here that appear to confirm and improve the initial estimate of a high redshift. For example, SMACS-z11e had a very uncertain initial redshift estimate. In the revised analysis, it is still at z~11, but with much higher confidence.
That said, it is hard to put a positive spin on these numbers. 23 of 31 redshifts change, and many change drastically. Those that change all become smaller. The highest surviving redshift estimate is z ~ 15 for SMACS-z16b. Among the objects with very high candidate redshifts, some are practically local (e.g., SMACS-z12a, F150DB-075, F150DA-058).
So… I had expected that this could go wrong, but I didn’t think it would go this wrong. I was concerned about the photometric redshift method – how well we can model stellar populations, especially at young ages dominated by short lived stars that in the early universe are presumably lower metallicity than well-studied nearby examples, the degeneracies between galaxies at very different redshifts but presenting similar colors over a finite range of observed passbands, dust (the eternal scourge of observational astronomy, expected to be an especially severe affliction in the ultraviolet that gets redshifted into the near-IR for high-z objects, both because dust is very efficient at scattering UV photons and because this efficiency varies a lot with metallicity and the exact grain size distribution of the dust), when is a dropout really a dropout indicating the location of the Lyman break and when is it just a lousy upper limit of a shabby detection, etc. – I could go on, but I think I already have. It will take time to sort these things out, even in the best of worlds.
We do not live in the best of worlds.
It appears that a big part of the current uncertainty is a calibration error. There is a pipeline for handling JWST data that has an in-built calibration for how many counts in a JWST image correspond to what astronomical magnitude. The JWST instrument team warned us that the initial estimate of this calibration would “improve as we go deeper into Cycle 1” – see slide 13 of Jane Rigby’s AAS presentation.
I was not previously aware of this caveat, though I’m certainly not surprised by it. This is how these things work – one makes an initial estimate based on the available data, and one improves it as more data become available. Apparently, JWST is outperforming its specs, so it is seeing as much as 0.3 magnitudes deeper than anticipated. This means that people were inferring objects to be that much too bright, hence the appearance of lots of galaxies that seem to be brighter than expected, and an apparent systematic bias to high z for photometric redshift estimators.
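For a sense of scale, a magnitude offset converts to a flux ratio through the usual relation ratio = 10^(0.4 Δm). The sketch below assumes the calibration issue acts as a pure zero-point offset applied uniformly; the function name is illustrative and not part of any JWST pipeline.

```python
def flux_ratio_from_magnitude_offset(delta_mag):
    """Flux ratio corresponding to a magnitude difference (Pogson relation)."""
    return 10.0 ** (0.4 * delta_mag)


if __name__ == "__main__":
    offset = 0.3  # magnitudes, the "as much as" figure quoted above
    ratio = flux_ratio_from_magnitude_offset(offset)
    print(f"A {offset} mag offset corresponds to a flux ratio of {ratio:.3f}")
```

An offset of 0.3 magnitudes is therefore a flux error of roughly 30 percent at most, which matters when the inferred redshift depends on colors measured across several passbands.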
I was not at the AAS meeting, let alone Dr. Rigby’s presentation there. Even if I had been, I’m not sure I would have appreciated the potential impact of that last bullet point on nearly the last slide. So I’m not the least bit surprised that this error has propagated into the literature. This is unfortunate, but at least this time it didn’t lead to something as bad as the Challenger space shuttle disaster in which the relevant warning from the engineers was reputed to have been buried in an obscure bullet point list.
So now we need to take a deep breath and do things right. I understand the urgency to get the first exciting results out, and they are still exciting. There are still some interesting high z candidate galaxies, and lots of empirical evidence predating JWST indicating that galaxies may have become too big too soon. However, we can only begin to argue about the interpretation of this once we agree to what the facts are. At this juncture, it is more important to get the numbers right than to post early, potentially ill-advised takes on arXiv.
That said, I’d like to go back to writing my own ill-advised take to post on arXiv now.
There has been a veritable feeding frenzy going on with the first JWST data. This is to be expected. Also to be expected is that some of these early results will ultimately prove to have been premature. So – caveat emptor! That said, I want to highlight one important aspect of these early results, there being too many to do them all justice.
The basic theme is that people are finding very faint yet surprisingly bright galaxies that are consistent with being at redshift 9 and above. The universe has expanded by a factor of ten since then, when it was barely half a billion years old. That’s a long time to you and me, and even to a geologist, but it is a relatively short time for a universe that is now over 13 billion years old, and it isn’t a lot of time for objects as large as galaxies to form.
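The numbers in this paragraph can be checked with the closed-form age of a flat universe containing only matter and a cosmological constant. This is a rough sketch: the parameter values (H0 = 70 km/s/Mpc, Ωm = 0.3) are assumed round numbers rather than fitted values, radiation is neglected, and the function name is illustrative.

```python
import math

KM_PER_MPC = 3.0857e19     # kilometres in one megaparsec
SEC_PER_GYR = 3.15576e16   # seconds in one billion (Julian) years


def age_at_redshift_gyr(z, h0_km_s_mpc=70.0, omega_m=0.3):
    """Age of a flat matter + Lambda universe at redshift z, in Gyr.

    Uses t(z) = 2 / (3 H0 sqrt(OL)) * asinh( sqrt(OL/Om) * (1+z)^(-3/2) ),
    with OL = 1 - Om. Radiation is ignored, which is adequate at z ~ 10.
    """
    omega_l = 1.0 - omega_m
    hubble_time_gyr = KM_PER_MPC / h0_km_s_mpc / SEC_PER_GYR
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return hubble_time_gyr * 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(x)


if __name__ == "__main__":
    for z in (0, 5, 9):
        print(f"z = {z}: expansion since then = {1 + z}x, "
              f"age = {age_at_redshift_gyr(z):.2f} Gyr")
```

With these assumed parameters the age at z = 9 comes out a little over half a billion years and the present age a bit over 13 billion years, consistent with the figures quoted above; the expansion factor since redshift z is simply 1 + z.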
In the standard LCDM cosmogony, we expect large galaxies to build up from the merger of many smaller galaxies. These smaller galaxies form first, and many of the stars that end up in big galaxies may have formed in these smaller galaxies prior to merging. So when we look to high redshift, we expect to catch this formation-by-merging process in action. We should see lots of small, actively star forming protogalactic fragments (Searle-Zinn fragments in Old School speak) before they’ve had time to assemble into the large galaxies we see relatively nearby to us at low redshift.
So what are we seeing? Here is one example from Labbe et al.:
Not much to look at, is it? But really it is pretty awesome for light that has been traveling 13 billion years to get to us and had its wavelength stretched by a factor of ten. Measuring the brightness in these various passbands enables us to estimate both its redshift and stellar mass:
Eighty five billion solar masses is a lot of stars. It’s a bit bigger than the Milky Way, which has had the full 13+ billion years to make its complement of roughly 60 billion solar masses of stars. Object 19424 is a big galaxy, and it grew up fast.
In LCDM, it is not particularly hard to build a model that forms a lot of stars early on. What is challenging is assembling this many into a single object. We should see lots of much smaller fragments (and may yet still) but we shouldn’t see many really big objects like this already in place. How many there are is a critical question.
Labbe et al. make an estimate of the stellar mass density in massive high redshift galaxies, and find it to be rather a lot. This is a fraught exercise in the best of circumstances when one has excellent data for thousands of galaxies. Here we have only a handful. We must also assume that the small region surveyed is typical, which it may not be. Moreover, the photometric redshift method illustrated above is fraught. It looks convincing. It is convincing. It also gives me the heebie-jeebies. Many times I have seen photometric redshifts turn out to be wrong when good spectroscopic data are obtained. But usually the method works, and it’s what we got so far, so let’s see where this ride takes us.
A short paper that nicely illustrates the prime issue is provided by Prof. Boylan-Kolchin. His key figure:
The basic issue is that there are too many stars in these big galaxies. There are many astrophysical uncertainties about how stars form: how fast, how efficiently, with what mass distribution, etc., etc. – much of the literature is obsessed with these issues. In contrast, once the parameters of cosmology are known, as we think them to be, it is relatively straightforward to calculate the number density of dark matter halos as a function of mass at a given redshift. This is the dark skeleton on which large scale structure depends; getting this right is absolutely fundamental to the cold dark matter picture.
Every dark matter halo should host a universal fraction of normal matter. The baryon fraction (fb) is known to be very close to 16% in LCDM. Prof. Boylan-Kolchin points out that this sets an important upper limit on how many stars could possibly form. The shaded region in the figure above is excluded: there simply isn’t enough normal matter to make that many stars. The data of Labbe et al. fall in this region, which should be impossible.
The data only fall a little way into the excluded region, so maybe it doesn’t look that bad, but the real situation is more dire. Star formation is very inefficient, but the shaded region assumes that all the available material has been converted into stars. A more realistic expectation is closer to the gray line (ε = 0.1), not the hard limit where all the available material has been magically turned into stars with a cosmic snap of the fingers.
Indeed, I would argue that the real efficiency ε is likely lower than 0.1 as it is locally. This runs into problems with precursors of the JWST result, so we’ve already been under pressure to tweak this free parameter upwards. Turning it up to eleven is just the inevitable consequence of needing to get more stars to form in the first big halos to appear sooner than the theory naturally predicts.
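The limit being described is simple arithmetic: the stellar mass of a halo cannot exceed ε × fb × M_halo. The sketch below uses the numbers quoted in the text (fb of about 16%, ε = 1 for the hard limit and 0.1 for the gray line, and the 85 billion solar masses estimated for object 19424); the function names are illustrative.

```python
F_BARYON = 0.16  # cosmic baryon fraction quoted in the text


def max_stellar_mass(halo_mass_msun, efficiency=1.0, f_baryon=F_BARYON):
    """Largest stellar mass a halo can host: M* <= efficiency * f_b * M_halo."""
    return efficiency * f_baryon * halo_mass_msun


def min_halo_mass(stellar_mass_msun, efficiency=1.0, f_baryon=F_BARYON):
    """Smallest halo mass able to supply a given stellar mass."""
    return stellar_mass_msun / (efficiency * f_baryon)


if __name__ == "__main__":
    m_star = 8.5e10  # solar masses, the estimate quoted for object 19424
    for eps in (1.0, 0.1):
        print(f"efficiency = {eps}: halo mass >= {min_halo_mass(m_star, eps):.2e} Msun")
```

Lowering ε from 1 to 0.1 raises the required halo mass tenfold, and such massive halos are correspondingly rarer at high redshift, which is why the realistic case is more dire than the hard limit suggests.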
So, does this spell doom for LCDM? I doubt it. There are too many uncertainties at present. It is an intriguing result, but it will take a lot of follow-up work to sort out. I expect some of these candidate high redshift galaxies will fall by the wayside, and turn out to be objects at lower redshift. How many, and how that impacts the basic result, remains to be determined.
After years of testing LCDM, it would be ironic if it could be falsified by this one simple (expensive, technologically amazing) observation. Still, it is something important to watch, as it is at least conceivable that we could measure a stellar mass density that is impossibly high. Whither then?
I went on a bit of a twitter bender yesterday about the early claims about high mass galaxies at high redshift, which went on long enough I thought I should share it here.
For those watching the astro community freak out about bright, high redshift galaxies being detected by JWST, some historical context in an amusing anecdote…
The 1998 October conference was titled “After the dark ages, when galaxies were young (the universe at 2 < z < 5).” That right there tells you what we were expecting. Redshift 5 was high – when the universe was a mere billion years old. Before that, not much going on (dark ages).
This was when the now famous SN Ia results corroborating the acceleration of the expansion rate predicted by concordance LCDM were shiny and new. Many of us already strongly suspected we needed to put the Lambda back in cosmology; the SN results sealed the deal.
One of the many lines of evidence leading to the rehabilitation of Lambda – previously anathema – was that we needed a bit more time to get observed structures to form. One wants the universe to be older than its contents, an off and on problem with globular clusters for forever.
A natural question that arises is just how early do galaxies form? The horizon of z=7 came up in discussion at lunch, with those of us who were observers wondering how we might access that (JWST being the answer long in the making).
Famed simulator Carlos Frenk was there, and assured us not to worry. He had already done LCDM simulations, and knew the timing.
He also added “don’t quote me on that,” which I’ve respected until now, but I think the statute of limitations has expired.
Everyone present immediately pulled out their wallet and chipped in $5 to endow the “7-up” prize for the first persuasive detection of an object at or above redshift seven.
A committee was formed to evaluate claims that might appear in the literature, composed of Carlos, Vera Rubin, and Bruce Partridge. They made it clear that they would require a high standard of evidence: at least two well-identified lines; no dropouts or photo-z’s.
That standard wasn’t met for over a decade, with z=6.96 being the record holder for a while. The 7-up prize was entirely tongue in cheek, and everyone forgot about it. Marv Leventhal had offered to hold the money; I guess he ended up pocketing it.
I believe the winner of the 7-up prize should have been Nial Tanvir for GRB090423 at z~8.2, but I haven’t checked if there might be other credible claims, and I can’t speak for the committee.
At any rate, I don’t think anyone would now seriously dispute that there are galaxies at z>7. The question is how big do they get, how early? And the eternal mobile goalpost, what does LCDM really predict?
Carlos was not wrong. There is no hard cutoff, so I won’t quibble about arbitrary boundaries like z=7. It takes time to assemble big galaxies, & LCDM does make a reasonably clear prediction about the timeline for that to occur. Basically, they shouldn’t be all that big that soon.
Here is a figure adapted from the thesis Jay Franck wrote here 5 years ago using Spitzer data (round points). It shows the characteristic brightness (Schechter M*) of galaxies as a function of redshift. The data diverge from the LCDM prediction (squares) as redshift increases.
Remarkably, the data roughly follow the green line, which is an L* galaxy magically put in place at the inconceivably high redshift of z=10. Galaxies seem to have gotten big impossibly early. This is why you see us astronomers flipping our lids at the JWST results. Can’t happen.
Except that it can, and was predicted to do so by Bob Sanders a quarter century ago: “Objects of galaxy mass are the first virialized objects to form (by z=10) and larger structure develops rapidly.”
The reason is MOND. After decoupling, the baryons find themselves bereft of radiation support and suddenly deep in the low acceleration regime. Structure grows fast and becomes nonlinear almost immediately. It’s as if there is tons more dark matter than we infer nowadays.
I refereed that paper, and was a bit disappointed that Bob had beat me to it: I was doing something similar at the time, with similar results. Instead of being hard to form structure quickly as in LCDM, it’s practically impossible to avoid in MOND.
He beat me to it, so I abandoned writing that paper. No need to say the same thing twice! Didn’t think we’d have to wait so long to test it.
I’ve reviewed this many times. Most recently in January, in anticipation of JWST, on my blog.
But you get the point. Every time you see someone describe the big galaxies JWST is seeing as unexpected, what they mean is unexpected in LCDM. It doesn’t surprise me at all. It is entirely expected in MOND, and was predicted a priori.
The really interesting thing to me, though, remains what LCDM really predicts. I already see people rationalizing excuses. I’ve seen this happen before. Many times. That’s why the field is in a rut.
So are we gonna talk our way out of it this time? I’m no longer interested in how; I’m sure someone will suggest something that will gain traction no matter how unsatisfactory.
The only interesting question is if LCDM makes a prediction here that can’t be fudged. If it does, then it can be falsified. If it doesn’t, it isn’t science.
But can we? Is LCDM subject to falsification? Or will we yet again gaslight ourselves into believing that we knew it all along?
LZ is a merger of two previous experiments compelled to grow still bigger in the never-ending search for dark matter. It contains “seven active tonnes of liquid xenon,” which is an absurd amount, being a substantial fraction of the entire terrestrial supply. It all has to be super-cooled to near absolute zero and filtered of all contaminants that might include naturally radioactive isotopes that might mimic the sought-after signal of dark matter scattering off of xenon nuclei. It is a technological tour de force.
The technology is really fantastic. The experimentalists have accomplished amazing things in building these detectors. They have accomplished the target sensitivity, and then some. If WIMPs existed, they should have found them by now.
WIMPs have not been discovered. As the experiments have improved, the theorists have been obliged to repeatedly move the goalposts. The original (1980s) expectation for the interaction cross-section was 10⁻³⁹ cm². That was quickly excluded, but more careful (1990s) calculation suggested perhaps more like 10⁻⁴² cm². This was also excluded experimentally. By the late 2000s, the “prediction” had migrated to 10⁻⁴⁶ cm². This has also now been excluded, so the goalposts have been moved to 10⁻⁴⁸ cm². This migration has been driven entirely by the data; there is nothing miraculous about a WIMP with this cross section.
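To put the size of that migration in one number, the snippet below tabulates the four cross-sections quoted above and counts the orders of magnitude between them. The era labels paraphrase the text, and the helper name is illustrative.

```python
import math

# Expected WIMP-nucleon cross-sections quoted in the text, in cm^2.
EXPECTED_CROSS_SECTIONS = [
    ("1980s (original)", 1e-39),
    ("1990s", 1e-42),
    ("late 2000s", 1e-46),
    ("current", 1e-48),
]


def orders_of_magnitude_moved(start_cm2, end_cm2):
    """Number of decades separating two cross-sections."""
    return math.log10(start_cm2 / end_cm2)


if __name__ == "__main__":
    first = EXPECTED_CROSS_SECTIONS[0][1]
    for era, sigma in EXPECTED_CROSS_SECTIONS:
        shift = orders_of_magnitude_moved(first, sigma)
        print(f"{era}: {sigma:.0e} cm^2, {shift:.0f} decades below the original")
```

From the original expectation to the current one, the target has moved by about nine orders of magnitude.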
As remarkable a technological accomplishment as experiments like LZ are, they are becoming the definition of insanity: repeating the same action but expecting a different result.
For comparison, consider the LIGO detection of gravitational waves. A large team of scientists worked unspeakably hard to achieve the detection of a tiny effect. It took 40 years of failure before success was obtained. Until that point, it seemed much the same: repeating the same action but expecting a different result.
Except it wasn’t, because there was a clear expectation for the sensitivity that was required to detect gravitational waves. Once that sensitivity was achieved, they were detected. It wasn’t that simple of course, but close enough for our purposes: it took a long time to get where they were going, but they achieved success once they got there. Having a clear prediction is essential.
In the case of WIMP searches, there was also a clear prediction. The required sensitivity was achieved – long ago. Nothing was found, so the goalposts were moved – by a lot. Then the new required sensitivity was achieved, still without detection. Repeatedly.
It always makes sense to look harder for something you expect if at first you don’t succeed. But at some point, you have to give up: you ain’t gonna find it. This is disappointing, but we’ve all experienced this kind of disappointment at some point in our lives. The tricky part is deciding when to give up.
In science, the point to give up is when your hypothesis is falsified. The original WIMP hypothesis was falsified a long time ago. We keep it on life support with modifications, often obfuscating (to our students and to ourselves) that the WIMPs we’re talking about today are no longer the WIMPs we originally conceived.
I sometimes like to imagine the thought experiment of sending some of the more zealous WIMP advocates back in time to talk to their younger selves. What would they say? How would they respond to themselves? These are not people who like to be contradicted by anyone, even themselves, so I suspect it would go something like
Old scientist: “Hey, kid – I’m future you. This experiment you’re about to spend your life working on won’t detect what you’re looking for.”
Young scientist: “Uh huh. You say you’re me from the future, Mr. Credibility? Tell me: at what point do I go senile, you doddering old fool?”
Old scientist: “You don’t. It just won’t work out the way you think. On top of dark matter, there’s also dark energy…”
Young scientist: “What the heck is dark energy, you drooling crackpot?”
Old scientist: “The cosmological constant.”
Young scientist: “The cosmological constant! You can’t expect people to take you seriously talking about that rubbish. GTFO.”
That’s the polite version that doesn’t end in fisticuffs. It’s easy to imagine this conversation going south much faster. I know that if 1993 me had received a visit from 1998 me telling me that in five years I would have come to doubt WIMPs, and also would have demonstrated that the answer to the missing mass problem might not be dark matter at all, I… would not have taken it well.
That’s why predictions are important in science. They tell us when to change our mind. When to stop what we’re doing because it’s not working. When to admit that we were wrong, and maybe consider something else. Maybe that something else won’t prove correct. Maybe the next ten something elses won’t. But we’ll never find out if we won’t let go of the first wrong thing.
Avi Loeb has a nice recent post Recalculating Academia, in which he discusses some of the issues confronting modern academia. One of the reasons I haven’t written here for a couple of months is despondency over the same problems. If you’re here reading this, you’ll likely be interested in what he has to say.
I am not eager to write at length today, but I do want to amplify some of the examples he gives with my own experience. For example, he notes that there are
theoretical physicists who avoid the guillotine of empirical tests for half a century by dedicating their career to abstract conjectures, avoid the risk of being proven wrong while demonstrating mathematical virtuosity.
I recognize many kinds of theoretical physicists who fit this description. My first thought was string theory, which took off in the mid-80s when I was a grad student at Princeton, ground zero for that movement in the US. (The Russians indulged in this independently.) I remember a colloquium in which David Gross advocated the “theory of everything” with gratuitous religious fervor to a large audience of eager listeners quavering with an anticipation that had the texture of religious revelation. It was captivating and convincing, up until the point near the end when he noted that experimental tests were many orders of magnitude beyond any experiment conceivable at the time. That… wasn’t physics to me. If this was the path the field was going down, I wanted no part of it. This was one of many factors that precipitated my departure from the toxic sludge that was grad student life in the Princeton physics department.
I wish I could say I had been proven wrong. Instead, decades later, physics has nothing to show for its embrace of string theory. There have been some impressive developments in mathematics stemming from it. Mathematics, not physics. And yet, there persists a large community of theoretical physicists who wander endlessly in the barren and practically infinite parameter space of multidimensional string theory. Maybe there is something relevant to physical reality there, or maybe it hasn’t been found because there isn’t. At what point does one admit that the objective being sought just ain’t there? [Death. For many people, the answer seems to be never. They keep repeating the same fruitless endeavor until they die.]
We do have new physics, in the form of massive neutrinos and the dark matter problem and the apparent acceleration of the expansion rate of the universe. What we don’t have is the expected evidence for supersymmetry, the crazy-bold yet comparatively humble first step on the road to string theory. If they had got even this much right, we should have seen evidence for it at the LHC, for example in the decay of the aptly named BS meson. If supersymmetric particles existed, they should provide many options for the meson to decay into, which otherwise has few options in the Standard Model of particle physics. This was a strong prediction of minimal supersymmetry, so much so that it was called the Golden Test of supersymmetry. After hearing this over and over in the ’80s and ’90s, I have not heard it again any time in this century. I’m nor sure when the theorists stopped talking about this embarrassment, but I suspect it is long enough ago now that it will come as a surprise to younger scientists, even those who work in the field. Supersymmetry flunked the golden test, and it flunked it hard. Rather than abandon the theory (some did), we just stopped talking about. There persists a large community of theorists who take supersymmetry for granted, and react with hostility if you question that Obvious Truth. They will tell you with condescension that only minimal supersymmetry is ruled out; there is an enormous parameter space still open for their imaginations to run wild, unbridled by experimental constraint. This is both true and pathetic.
Reading about the history of physics, I learned that there was a community of physicists who persisted believing in aether for decades after the Michelson-Morley experiment. After all, only some forms of aether were ruled out. This was true, at the time, but we don’t bother with that detail when teaching physics now. Instead, it gets streamlined to “aether was falsified by Michelson-Morley.” This is, in retrospect, true, and we don’t bother to mention those who pathetically kept after it.
The standard candidate for dark matter, the WIMP, is a supersymmetric particle. If supersymmetry is wrong, WIMPs don’t exist. And yet, there is a large community of particle physicists who persist in building ever bigger and better experiments designed to detect WIMPs. Funny enough, they haven’t detected anything. It was a good hypothesis, 38 years ago. Now it’s just a bad habit. The better ones tacitly acknowledge this, attributing their continuing efforts to the streetlight effect: you look where you can see.
Prof. Loeb offers another pertinent example:
When I ask graduating students at their thesis exam whether the cold dark matter paradigm will be proven wrong if their computer simulations will be in conflict with future data, they almost always say that any disagreement will indicate that they should add a missing ingredient to their theoretical model in order to “fix” the discrepancy.
This is indeed the attitude. So much so that no additional ingredient seems too absurd if it is what we need to save the phenomenon. Feedback is the obvious example in my own field, as that (or the synonyms “baryon physics” or “gastrophysics”) is invoked to explain away any and all discrepancies. It sounds simple, since feedback is a real effect that does happen, but this single word does a lot of complicated work under the hood. There are many distinct kinds of feedback: stellar winds, UV radiation from massive stars, supernovae when those stars explode, X-rays from compact sources like neutron stars, and relativistic jets from supermassive black holes at the centers of galactic nuclei. These are the examples of feedback that I can think of off the top of my head; there are probably more. All of these things have perceptible, real-world effects on the relevant scales, with, for example, stars blowing apart the dust and gas of their stellar cocoons after they form. This very real process has bugger all to do with what feedback is invoked to do on galactic scales. Usually, supernovae are blamed by theorists for any and all problems in dwarf galaxies, while observers tell me that stellar winds do most of the work in disrupting star forming regions. Confronted with this apparent discrepancy, the usual answer is that it doesn’t matter how the energy is input into the interstellar medium, just that it is. Yet we can see profound differences between stellar winds and supernova explosions, so this does not inspire confidence for the predictive power of theories that generically invoke feedback to explain away problems that wouldn’t be there in a healthy theory.
This started a long time ago. I had already lost patience with this unscientific attitude to the point that I dubbed it the
Spergel Principle: “It is better to postdict than to predict.”
This continues to go on and has now done so for so long that generations of students seem to think that this is how science is supposed to be done. If asked about hypothesis testing and whether a theory can be falsified, many theorists will first look mystified, then act put out. Why would you even ask that? (One does not question the paradigm.) The minority of better ones then rally to come up with some reason to justify that yes, what they’re talking about can be falsified, so it does qualify as physics. But those goalposts can always be moved.
A good example of moving goalposts is the cusp-core problem. When I first encountered this in the mid to late ’90s, I tried to figure a way out of it, but failed. So I consulted one of the very best theorists, Simon White. When I asked him what he thought would constitute a falsification of cold dark matter, he said cusps: “cusps have to be there” [in the center of a dark matter halo]. Flash forward to today, when nobody would accept that as a falsification of cold dark matter: it can be fixed by feedback. Which would be fine, if it were true, which isn’t really clear. At best it provides a post facto explanation for an unpredicted phenomenon without addressing the underlying root cause, that the baryon distribution is predictive of the dynamics.
This is like putting a band-aid on a Tyrannosaurus. It’s already dead and fossilized. And if it isn’t, well, you got bigger problems.
Another disease common to theory is avoidance. A problem is first ignored, then the data are blamed for showing the wrong thing, then they are explained in a way that may or may not be satisfactory. Either way, it is treated as something that had been expected all along.
In a parallel to this gaslighting, I’ve noticed that it has become fashionable of late to describe unsatisfactory explanations as “natural.” Saying that something can be explained naturally is a powerful argument in science. The traditional meaning is that ok, we hadn’t contemplated this phenomenon before it surprised us, but if we sit down and work it out, it makes sense. The “making sense” part means that an answer falls out of a theory easily when the right question is posed. If you need to run gazillions of supercomputer CPU hours of a simulation with a bunch of knobs for feedback to get something that sorta kinda approximates reality but not really, your result does not qualify as natural. It might be right – that’s a more involved adjudication – but it doesn’t qualify as natural and the current fad to abuse this term again does not inspire confidence that the results of such simulations might somehow be right. Just makes me suspect the theorists are fooling themselves.
I haven’t even talked about astroparticle physicists or those who engage in fantasies about the multiverse. I’ll just close by noting that Popper’s criterion for falsification was intended to distinguish between physics and metaphysics. That’s not the same as right or wrong, but physics is subject to experimental test while metaphysics is the stuff of late night bull sessions. The multiverse is manifestly metaphysical. Cool to think about, has lots of implications for philosophy and religion, but not physics. Even Gross has warned against treading down the garden path of the multiverse. (Tell me that you’re warning others not to make the same mistakes you made without admitting you made mistakes.)
There are a lot of scientists who would like to do away with Popper, or any requirement that physics be testable. These are inevitably the same people whose fancy turns to metascapes of mathematically beautiful if fruitless theories, and want to pass off their metaphysical ramblings as real physics. Don’t buy it.
In previous posts, I briefly described some of the results that provoked a crisis of faith in the mid-1990s. Up until that point, I was an ardent believer in the cold dark matter paradigm. But it no longer made sense as an explanation for galaxy dynamics. It didn’t just not make sense, it seemed strewn with self-contradictions, all of which persist to this day.
Amidst this crisis of faith, there came a chance meeting in Middle-Earth: Moti Milgrom visited Cambridge, where I was a postdoc at the time, and gave a talk. I almost didn’t go to this talk because it had modified gravity in the title and who wanted to waste their time listening to that nonsense? I had yet to make any connection between the self-contradictions the data posed for dark matter and something as dire as an entirely different paradigm.
Despite my misgivings, I did go to Milgrom’s talk. Not knowing that I was there or what I worked on, he casually remarked on some specific predictions for low surface brightness galaxies. These sounded like what I was seeing, in particular the things that were most troublesome for the dark matter interpretation. I became interested.
Long story short, it is a case in which, had MOND not already existed, we would have had to invent it. As Sherlock Holmes famously put it:
When you have eliminated the impossible, whatever remains, however improbable, must be the truth.
Sir Arthur Conan Doyle
Modified Newtonian Dynamics
There is one and only one theory that predicted in advance the observations described above: the Modified Newtonian Dynamics (MOND) introduced by Milgrom (1983a,b,c). MOND is an extension of Newtonian theory (Milgrom, 2020). It is not a generally covariant theory, so is not, by itself, a complete replacement for General Relativity. Nevertheless, it makes unique, testable predictions within its regime of applicability (McGaugh, 2020).
The basic idea of MOND is that the force law is modified at an acceleration scale, $a_0$. For large accelerations, $g \gg a_0$, everything is normal and Newtonian: $g = g_N$, where $g_N$ is the acceleration predicted by the observed luminous mass distribution obtained by solving the Poisson equation. At low accelerations, the effective acceleration tends towards the limit
$g \rightarrow \sqrt{a_0 g_N}$ for $g \ll a_0$. (5)
The motivation to make an acceleration-based modification is to explain flat rotation curves (Bosma, 1981; Rubin et al., 1978) in a way that also gives a steep Tully-Fisher relation similar to that which is observed (Aaronson et al., 1979). A test particle in a circular orbit around a point mass $M_p$ in the deep MOND regime (eq. (5)) will experience a centripetal acceleration
$V_c^2/R = \sqrt{a_0 G M_p / R^2}$. (6)
Note that the term for the radius $R$ cancels out, so eq. (6) reduces to
$V_c^4 = a_0 G M_p$ (7)
which the reader will recognize as the Baryonic Tully-Fisher relation
$M_b = A V_f^4$ (8)
with $A = \zeta/(a_0 G)$ where $\zeta$ is a geometrical factor of order unity.
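For readers who want the intermediate algebra, the step from eq. (6) to eq. (7) can be written out as follows. This is only a restatement using the symbols already defined, with the Newtonian acceleration of a point mass taken to be $g_N = G M_p / R^2$.

```latex
\begin{align}
\frac{V_c^2}{R} &= \sqrt{a_0\, g_N}
                 = \sqrt{\frac{a_0 G M_p}{R^2}}
                 = \frac{\sqrt{a_0 G M_p}}{R} \\
V_c^2 &= \sqrt{a_0 G M_p} \\
V_c^4 &= a_0 G M_p
\end{align}
```

The radius drops out in the second line, so the circular speed in the low-acceleration regime does not depend on where it is measured.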
This simple math explains the flatness of rotation curves. This is not a prediction; it was an input that motivated the theory, as it motivated dark matter. Unlike dark matter, in which rotation curves might rise or fall, the rotation curves of isolated galaxies must tend towards asymptotic flatness.
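The asymptotic flatness can be illustrated numerically with a rough sketch for a point mass. The text does not specify how to interpolate between the Newtonian and low-acceleration regimes, so the "simple" interpolation function used here is an assumption chosen only because it reproduces both limits; the mass and radii are hypothetical.

```python
import math

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # MOND acceleration scale, m s^-2 (Begeman et al. 1991 value)
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.0857e19    # kiloparsec, m


def mond_acceleration(g_newton, a0=A0):
    """Effective acceleration for a given Newtonian acceleration.

    ASSUMPTION: uses the "simple" interpolation function, which the text
    does not specify. It tends to g = g_N when g_N >> a0 and to
    g = sqrt(a0 * g_N) when g_N << a0.
    """
    return g_newton * (0.5 + math.sqrt(0.25 + a0 / g_newton))


def circular_velocity_kms(mass_msun, radius_kpc):
    """Circular speed (km/s) around a point mass at the given radius."""
    radius_m = radius_kpc * KPC
    g_newton = G * mass_msun * M_SUN / radius_m**2
    return math.sqrt(mond_acceleration(g_newton) * radius_m) / 1e3


def asymptotic_velocity_kms(mass_msun):
    """Flat velocity (km/s) from eq. (7): V^4 = a0 * G * M."""
    return (A0 * G * mass_msun * M_SUN) ** 0.25 / 1e3


if __name__ == "__main__":
    mass = 1e10  # hypothetical point mass, in solar masses
    for radius in (1, 5, 20, 50, 100, 200):
        speed = circular_velocity_kms(mass, radius)
        print(f"R = {radius:4d} kpc   V = {speed:6.1f} km/s")
    print(f"asymptote: {asymptotic_velocity_kms(mass):.1f} km/s")
```

At small radii the speed falls in the Keplerian manner, and at large radii it levels off at the value given by eq. (7) rather than continuing to decline.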
MOND also explains the Tully-Fisher relation. Indeed, there are several distinct aspects to this prediction. That the relation exists at all is a strong prediction. Fundamentally, the Baryonic Tully-Fisher Relation (BTFR) is a relation between the baryonic mass of a galaxy and its flat rotation speed. There is no dark matter involved: Vf is not a property of a dark matter halo, but of the galaxy itself.
Another aspect of the Tully-Fisher relation is its normalization. This is set by fundamental constants: Newton’s constant, $G$, and the acceleration scale of MOND, $a_0$. For $\zeta = 0.8$, $A = 50\ M_\odot\,\mathrm{km}^{-4}\,\mathrm{s}^{4}$. However, there is no theory that predicts the value of $a_0$, which has to be set by the data. Moreover, this scale is distance-dependent, so the precise value of $a_0$ varies with adjustments to the distance scale. For this reason, in part, the initial estimate of $a_0 = 2 \times 10^{-10}\ \mathrm{m\,s^{-2}}$ by Milgrom (1983a) was a bit high. Begeman et al. (1991) used the best data then available to obtain $a_0 = 1.2 \times 10^{-10}\ \mathrm{m\,s^{-2}}$. The value of Milgrom’s acceleration constant has not varied meaningfully since then (Famaey and McGaugh, 2012; Li et al., 2018; McGaugh, 2011; McGaugh et al., 2016; Sanders and McGaugh, 2002). This is a consistency check, but not a genuine prediction.
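The quoted normalization can be checked with a few lines of arithmetic. The sketch below evaluates A = ζ/(a0 G) and converts it to solar masses per (km/s)^4; the function names and the SI values adopted for G and the solar mass are choices made here, not taken from the text.

```python
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
M_PER_KM = 1.0e3   # metres per kilometre


def btfr_normalization(a0, zeta=0.8):
    """Return A = zeta / (a0 * G) in units of Msun km^-4 s^4."""
    a_si = zeta / (a0 * G)                 # kg s^4 m^-4
    return a_si * M_PER_KM**4 / M_SUN      # 1 m^-4 = 1e12 km^-4


def flat_velocity_kms(baryonic_mass_msun, a0=1.2e-10, zeta=0.8):
    """Invert M_b = A * V_f^4 to get the flat rotation speed in km/s."""
    return (baryonic_mass_msun / btfr_normalization(a0, zeta)) ** 0.25


if __name__ == "__main__":
    print(f"A(a0 = 1.2e-10) = {btfr_normalization(1.2e-10):.1f} Msun km^-4 s^4")
    print(f"A(a0 = 2.0e-10) = {btfr_normalization(2.0e-10):.1f} Msun km^-4 s^4")
    print(f"V_f for 5e10 Msun = {flat_velocity_kms(5e10):.0f} km/s")
```

With the Begeman et al. value of a0 this gives A close to 50, in line with the number quoted above, while the larger initial estimate of a0 gives a correspondingly smaller A.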
An important consequence of MOND is that the Tully-Fisher relation is absolute: it should have no dependence on size or surface brightness (Milgrom, 1983a). The mass of baryons is the only thing that sets the flat amplitude of the rotation speed. It matters not at all how those baryons are distributed. MOND was the only theory to correctly predict this in advance of the observation (McGaugh and de Blok, 1998b). The fine-tuning problem that we face conventionally is imposed by this otherwise unanticipated result.
The absolute nature of the Tully-Fisher relation in MOND further predicts that it has no physical residuals whatsoever. That is to say, scatter around the relation can only be caused by observational errors and scatter in the mass-to-light ratios of the stars. The latter is an irreducible unknown: we measure the luminosity produced by the stars in a galaxy, but what we need to know is the mass of those stars. The conversion between them can never be perfect, and inevitably introduces some scatter into the relation. Nevertheless, we can make our best effort to account for known sources of scatter. Between scatter expected from observational uncertainties and that induced by variations in the mass-to-light ratio, the best data are consistent with the prediction of zero intrinsic scatter (McGaugh, 2005, 2012; Lelli et al., 2016b, 2019). Of course, it is impossible to measure zero, but it is possible to set an upper limit on the intrinsic scatter that is very tight by extragalactic standards (< 6%; Lelli et al., 2019). This leaves very little room for variations beyond the inevitable impact of the stellar mass-to-light ratio. The scatter is no longer entirely accounted for when lower quality data are considered (McGaugh, 2012), but this is expected in astronomy: lower quality data inevitably admit systematic uncertainties that are not readily accounted for in the error budget.
Milgrom (1983a) made a number of other specific predictions. In MOND, the acceleration expected for kinematics follows from the surface density of baryons. Consequently, low surface brightness means low acceleration. Interpreted in terms of conventional dynamics, the prediction is that the ratio of dynamical mass to light, Mdyn/L, should increase as surface brightness decreases. This happens both globally, in that LSB galaxies appear to be more dark matter dominated than HSB galaxies (see Fig. 4(b) of McGaugh and de Blok, 1998a), and locally, in that the need for dark matter sets in at smaller radii in LSB galaxies than in HSB galaxies (Figs. 3 and 14 of McGaugh and de Blok, 1998b; Famaey and McGaugh, 2012, respectively).
One may also test this prediction by plotting the rotation curves of galaxies binned by surface brightness: acceleration should scale with surface brightness. It does (Figs. 4 and 16 of McGaugh and de Blok, 1998b; Famaey and McGaugh, 2012, respectively). This observation has been confirmed by near-infrared data. The systematic variation of color-coded surface brightness is already obvious with optical data, as in Fig. 15 of Famaey and McGaugh (2012), but these suffer some scatter from variations in the stellar mass-to-light ratio. This scatter practically vanishes with near-infrared data, which provide such a good tracer of the surface mass density of stars that the equivalent plot is a near-perfect rainbow (Fig. 3 of both McGaugh et al., 2019; McGaugh, 2020). The data strongly corroborate the prediction of MOND that acceleration follows from baryonic surface density.
The central density relation (Fig. 6, Lelli et al., 2016c) was also predicted by MOND (Milgrom, 2016). Both the shape and the amplitude of the correlation are correct. Moreover, the surface density Σ† at which the data bend follows directly from the acceleration scale of MOND: a0 = GΣ†. This surface density also corresponds to the stability limit for disks (Brada & Milgrom, 1999; Milgrom, 1989). The scale we had to insert by hand in dark matter models is a consequence of MOND.
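To put a number on the characteristic surface density, the relation a0 = GΣ† can be inverted and converted to the units astronomers usually quote. This is a small sketch; the function names and the adopted SI constants are assumptions made here for illustration.

```python
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # MOND acceleration scale, m s^-2
M_SUN = 1.989e30   # solar mass, kg
PC = 3.0857e16     # parsec, m


def critical_surface_density(a0=A0):
    """Sigma_dagger = a0 / G, returned in Msun pc^-2."""
    sigma_si = a0 / G                  # kg m^-2
    return sigma_si * PC**2 / M_SUN


def characteristic_acceleration(sigma_msun_pc2):
    """Acceleration scale G * Sigma (m s^-2) for a surface density in Msun pc^-2."""
    return G * sigma_msun_pc2 * M_SUN / PC**2


if __name__ == "__main__":
    print(f"Sigma_dagger = {critical_surface_density():.0f} Msun/pc^2")
    # Hypothetical low surface density disk, for comparison with a0.
    ratio = characteristic_acceleration(10.0) / A0
    print(f"G*Sigma / a0 for 10 Msun/pc^2 = {ratio:.3f}")
```

The result is several hundred solar masses per square parsec, so a disk with a surface density of order ten solar masses per square parsec sits well inside the low-acceleration regime.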
Since MOND is a force law, the entirety of the rotation curve should follow from the baryonic mass distribution. The stellar mass-to-light ratio can modulate the amplitude of the stellar contribution to the rotation curve, but not its shape, which is specified by the observed distribution of light. Consequently, there is rather limited freedom in fitting rotation curves.
Example fits are shown in Fig. 8. The procedure is to construct Newtonian mass models by numerically solving the Poisson equation to determine the gravitational potential that corresponds to the observed baryonic mass distribution. Indeed, it is important to make a rigorous solution of the Poisson equation in order to capture details in the shape of the mass distribution (e.g., the wiggles in Fig. 8). Common analytic approximations like the exponential disk assume these features out of existence. Building proper mass models involves separate observations for the stars, conducted at optical or near-infrared wavelengths, and the gas of the interstellar medium, which is traced by radio wavelength observations. It is sometimes necessary to consider separate mass-to-light ratios for the stellar bulge and disk components, as there can be astrophysical differences between these distinct stellar populations (Baade, 1944). This distinction applies in any theory.
The gravitational potential of each baryonic component is represented by the circular velocity of a test particle in Fig. 8. The amplitude of the rotation curve of the mass model for each stellar component scales as the square root of its mass-to-light ratio. There is no corresponding mass-to-light ratio for the gas of the interstellar medium as there is a well-understood relation between the observed flux at 21 cm and the mass of hydrogen atoms that emit it (Draine, 2011). Consequently, the line for the gas components in Fig. 8 is practically fixed.
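The bookkeeping described above can be sketched in a few lines. The example assumes the stellar components were computed for a mass-to-light ratio of one and that the components combine in quadrature, which is the usual convention for mass models; the function name and the input numbers are hypothetical, and the sketch ignores the case of a central gas hole, where the gas term can act outward.

```python
import math


def baryonic_velocity(v_disk, v_bulge, v_gas, ml_disk, ml_bulge):
    """Total baryonic circular speed at one radius (same units as inputs).

    v_disk and v_bulge are the speeds computed for M/L = 1, so each
    stellar term scales as sqrt(M/L). The gas term has no such factor.
    """
    v_squared = ml_disk * v_disk**2 + ml_bulge * v_bulge**2 + v_gas**2
    return math.sqrt(v_squared)


if __name__ == "__main__":
    # Hypothetical component speeds (km/s) at three radii.
    disk = [60.0, 90.0, 80.0]
    bulge = [40.0, 25.0, 15.0]
    gas = [10.0, 20.0, 30.0]
    for ml in (0.3, 0.5, 0.7):
        total = [baryonic_velocity(d, b, g, ml, ml)
                 for d, b, g in zip(disk, bulge, gas)]
        print(ml, [round(v, 1) for v in total])
```

Changing the mass-to-light ratio rescales the stellar terms but leaves their radial shape and the gas term untouched, which is why the fits have so little freedom.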
In addition to the mass-to-light ratio, there are two “nuisance” parameters that are sometimes considered in MOND fits: distance and inclination. These are known from independent observations, but of course these have some uncertainty. Consequently, the best MOND fit sometimes occurs for slightly different values of the distance and inclination, within their observational uncertainties (Begeman et al., 1991; de Blok & McGaugh, 1998; Sanders, 1996).
Distance matters because it sets the absolute scale. The farther away a galaxy is, the greater its mass for the same observed flux. The distances to individual galaxies are notoriously difficult to measure. Though usually not important, small changes to the distance can occasionally have powerful effects, especially in gas rich galaxies. Compare, for example, the fit to DDO 154 by Li et al. (2018) to that of Ren et al. (2019).
Inclinations matter because we must correct the observed velocities for the inclination of each galaxy as projected on the sky. The inclination correction is $V = V_{\mathrm{obs}}/\sin(i)$, so is small at large inclinations (edge-on) but large at small inclinations (face-on). For this reason, dynamical analyses often impose an inclination limit. This is an issue in any theory, but MOND is particularly sensitive since $M \propto V^4$ so any errors in the inclination are amplified to the fourth power (see Fig. 2 of de Blok & McGaugh, 1998). Worse, inclination estimates can suffer systematic errors (de Blok & McGaugh, 1998; McGaugh, 2012; Verheijen, 2001): a galaxy seen face-on may have an oval distortion that makes it look more inclined than it is, but it can’t be more face-on than face-on.
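A quick numerical illustration shows how strongly the fourth power amplifies inclination errors. The function names and the example angles below are hypothetical; the only physics used is the deprojection V = Vobs/sin(i) and the scaling of mass with the fourth power of velocity.

```python
import math


def deprojected_velocity(v_obs, inclination_deg):
    """Correct an observed line-of-sight speed for inclination."""
    return v_obs / math.sin(math.radians(inclination_deg))


def mond_mass_ratio(assumed_deg, true_deg):
    """Inferred mass using the assumed inclination, relative to the mass
    inferred with the true inclination, given M proportional to V^4."""
    ratio_v = math.sin(math.radians(true_deg)) / math.sin(math.radians(assumed_deg))
    return ratio_v**4


if __name__ == "__main__":
    # The same 5 degree error at a nearly face-on and a nearly edge-on inclination.
    for assumed, true in ((25.0, 30.0), (75.0, 80.0)):
        print(f"assumed {assumed:.0f} deg, true {true:.0f} deg: "
              f"mass ratio = {mond_mass_ratio(assumed, true):.2f}")
```

A five degree error near face-on changes the inferred mass by roughly a factor of two, while the same error near edge-on changes it by less than ten percent.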
MOND fits will fail if either the distance or inclination is wrong. Such problems cannot be discerned in fits with dark matter halos, which have ample flexibility to absorb the imparted variance (see Fig. 6 of de Blok & McGaugh, 1998). Consequently, a fit with a dark matter halo will not fail if the distance happens to be wrong; we just won’t notice it.
The best-fit mass-to-light ratios found in MOND rotation curve fits can be checked against independent stellar population models. There is no guarantee that this procedure will return plausible values for the stellar mass-to-light ratio. Nevertheless, MOND fits recover the amplitude that is expected for stellar populations, the expected variation with color, and the band-dependent scatter (e.g., Fig. 28 of Famaey and McGaugh, 2012). Indeed, to a good approximation, the rotation curve can be predicted directly from near-infrared data (McGaugh, 2020; Sanders and Verheijen, 1998) modulo only the inevitable scatter in the mass-to-light ratio. This is a spectacular success of the paradigm that is not shared by dark matter fits (de Blok et al., 2003; de Blok & McGaugh, 1997; Kent, 1987).
Gas rich galaxies provide an even stronger test. When gas dominates the mass budget, the mass-to-light ratio of the stars ceases to have much leverage on the fit. There is no fitting parameter for gas equivalent to the mass-to-light ratio for stars: the gas mass follows directly from the observations. This enables MOND to predict the locations of such galaxies in the Baryonic Tully-Fisher plane (McGaugh, 2011) and essentially their full rotation curves (Sanders, 2019) with no free parameters (McGaugh, 2020).
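The logic of the gas rich test can be sketched as follows. The 21 cm conversion coefficient and the helium correction factor are standard values that are not quoted in the text, so treat them as assumptions; the galaxy parameters are hypothetical, and A = 50 is the normalization quoted above.

```python
HI_MASS_COEFF = 2.36e5   # Msun per (Mpc^2 Jy km/s), optically thin 21 cm conversion
HELIUM_FACTOR = 1.33     # ASSUMPTION: scales atomic hydrogen mass to total gas mass
BTFR_A = 50.0            # Msun km^-4 s^4, normalization quoted in the text


def gas_mass_msun(distance_mpc, hi_flux_jy_kms):
    """Gas mass from the integrated 21 cm flux and the distance."""
    return HELIUM_FACTOR * HI_MASS_COEFF * distance_mpc**2 * hi_flux_jy_kms


def predicted_flat_velocity_kms(distance_mpc, hi_flux_jy_kms,
                                luminosity_lsun, stellar_ml):
    """Flat rotation speed predicted by M_b = A * V_f^4."""
    stellar_mass = stellar_ml * luminosity_lsun
    baryonic_mass = gas_mass_msun(distance_mpc, hi_flux_jy_kms) + stellar_mass
    return (baryonic_mass / BTFR_A) ** 0.25


if __name__ == "__main__":
    # Hypothetical gas-dominated dwarf: 10 Mpc, 100 Jy km/s, 1e8 Lsun.
    for ml in (0.25, 0.5, 1.0):
        v_f = predicted_flat_velocity_kms(10.0, 100.0, 1e8, ml)
        print(f"M/L = {ml:4.2f}  ->  V_f = {v_f:.1f} km/s")
```

Changing the stellar mass-to-light ratio by a factor of four moves the predicted velocity by less than a percent in this example, because the gas mass, which has no free parameter, dominates the baryonic budget.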
It should be noted that the acceleration scale a0 is kept fixed when fitting rotation curves. If one allows a0 to vary, both it and the mass-to-light ratio spread over an unphysically large range of values (Li et al., 2018). The two are highly degenerate, causing such fits to be meaningless (Li et al., 2021): the data do not have the power to constrain multiple parameters per galaxy.
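One simple way to see why the two parameters trade off, offered here as an illustration rather than as the argument made in the cited papers, is to rearrange eq. (8) for a star-dominated galaxy. The symbols $\Upsilon_\star$ (stellar mass-to-light ratio) and $L$ (luminosity) are introduced here for the purpose and assume that the gas mass is negligible.

```latex
\begin{align}
V_f^4 &= \frac{a_0 G}{\zeta}\, M_b \approx \frac{a_0 G}{\zeta}\, \Upsilon_\star L \\
a_0\, \Upsilon_\star &\approx \frac{\zeta\, V_f^4}{G L}
\end{align}
```

The observables on the right fix only the product on the left, so a fit that frees both quantities can raise one and lower the other without changing the outer rotation curve.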
The most serious, though certainly not the only, outstanding challenge to MOND is the dynamics of clusters of galaxies (Angus et al., 2008; Sanders and McGaugh, 2002). Contrary to the case in most individual galaxies and some groups of galaxies (Milgrom, 2018, 2019), MOND typically falls short of correcting the mass discrepancy in rich clusters by a factor of ~2 in mass. This can be taken as completely fatal, or as being remarkably close by the standards of astrophysics. Which option one chooses seems to be mostly a matter of confirmation bias: those who are quick to dismiss MOND are happy to spot their own models a factor of two in mass, and even to assert that it is natural to do so (e.g., Ludlow et al., 2017). MOND is hardly alone in suffering problems with clusters of galaxies, which also present problems for ΛCDM (e.g., Angus & McGaugh, 2008; Asencio et al., 2021; Meneghetti et al., 2020).
A common fallacy seems to be that any failing of MOND is automatically considered to be support for ΛCDM. This is seldom the case. More often than not, observations that are problematic for MOND are also problematic for ΛCDM. We do not perceive them as such because we are already convinced that non-baryonic dark matter must exist. From that perspective, any problem encountered by ΛCDM is a mere puzzle that will inevitably be solved, while any problem encountered by MOND is a terminal failure of an irredeemably blasphemous hypothesis. This speaks volumes about human nature but says nothing about how the universe works.
The plain fact is that MOND made many a priori predictions that subsequently came true. This is the essence of the scientific method. ΛCDM and MOND are largely incommensurate, but whenever I have been able to make a straight comparison, MOND has been the more successful theory. So what am I supposed to say? That it is wrong? Perhaps it is, but that doesn’t make dark matter right. Rather, the predictive successes of MOND must be teaching us something. The field will not progress until these are incorporated into mainstream thinking.