Yep, it’s a religion

I have been concerned for years that dark matter was morphing from legitimate science into a cold, dark religion. I have been reluctant to put it that way, because there are lots of scientists who work on dark matter that have not fallen entirely down that rabbit hole and who continue to make valuable contributions working in that context. But a recent experience reminded me that my concerns were not misplaced, and there are plenty of scientists who have fallen irredeemably down this rabbit hole. No matter what answer the future holds to be correct, many current scientists will have gone to their graves in denial of it.

Where is the boundary between science and religion? It is hard to assess where the borderline is. But it is easy to see when people are far over the line – so far over that it doesn’t really matter where exactly the line is. One can attend any conference on the subject to find people who unabashedly assert that dark matter exists without question. Not just that acceleration discrepancies have been amply demonstrated empirically, but that the only possible interpretation is dark matter. If asked whether this invisible mass is in the room with us now, they will enthusiastically^# answer yes! Since dark matter has not been detected in the laboratory, this assertion is an expression of faith – the hallmark of religion – not of an established scientific fact. What we have established is that there are discrepancies between what we see and what we get when we assume Newtonian gravity (or GR, if needed). What we don’t know is whether the cause of these discrepancies is some form of invisible mass (dark matter) or if the equations we employ are inadequate (modified gravity [or more generally, dynamics]).

Indeed, these days many people will assert that dark matter has already been detected, usually citing astronomical evidence that used to be considered too feeble to merit a Nobel prize. Funny how repeating a mantra long enough morphs an aspiration into accepted reality. Modern physics is not providing a strong falsification of the supposition that science is a social construct.

A prominent example of an observation of the sky that is frequently cited as absolutely requiring cold dark matter is the acoustic power spectrum of the cosmic microwave background. Quoting clayton from a few years ago:

the primary reason to believe in the phenomenon of cold dark matter is the very high precision with which we measure the CMB power spectrum, especially modes beyond the second acoustic peak. There is a stone-cold, qualitative, crystal clear prediction of CDM about the relative sizes of the second and third peaks that modified gravity profoundly and irredeemably gets wrong: it thinks the third peak should be relatively larger* than the second… whereas CDM thinks they should be about the same

I would accept that this were conclusive proof of dark matter if this were the unique prediction of dark matter: that there was no other way to do it, so all other approaches were indeed irredeemable. (Quite the strong language, eh?) The problem is that CDM is not the one unique way to fit these data. Skordis & Zlosnik showed that it is possible to write a modified gravity theory that also fits the CMB data:

*CMB power spectrum observed by Planck fit by AeST (Skordis & Zlosnik 2021).*

This does not prove the AeST theory of Skordis & Zlosnik is correct, but it does demonstrate that it is possible to write a modified gravity theory that does indeed do what it is frequently asserted to be impossible for a modified gravity theory to do. I’ve heard of a couple of other theories that can also do this (the relativistic Khronon theory of Blanchet and nonlocal MOND as discussed by Deffayet & Woodard), so clearly this success is not uniquely limited to cold dark matter, or even a particular modified gravity theory. The work of Skordis & Zlosnik (2021) was known and in the literature before clayton made the assertion above in late 2022, so either he wasn’t paying attention (likely) or is convinced that it is impossible so doesn’t even consider the possibility (also likely). The former just says we’re all too busy, but the latter is a mark of religious thinking: my god is the only god, thou shalt have no other hypotheses before^& me.

Many people are very impressed with the quality of the LCDM fit to the CMB. That is indeed very good, but there are enough free parameters that we were going to get a fit to any physically plausible power spectrum. If not, we’ve never been shy about making up new parameters. (Evolving dark energy, anyone? How about a running power spectrum? There’s a whole bag of possibilities!) What I’ve been more impressed with is the consistency of the fit to the CMB data with the many independent constraints on conventional cosmology. Or at least it was, until it wasn’t.

The Hubble tension has gotten steadily worse (in terms of statistical significance), and it really does not look like local measurements are to blame, nor is it the only tension. People seem to miss that it is the CMB-fitted value of the Hubble constant that has evolved over time to spoil the concordance that got us to believe in LCDM in the first place. But if the CMB is the cornerstone of your religion, all other data must inevitably be at fault and can be ignored: there is an entire community of cosmologists who choose to believe the best-fit Planck cosmology to the exclusion of all other data. It’s like the bad old days of the Hubble tension all over again, with the physics community choosing to believe the lower value of H₀ because it makes more sense for the aspects of cosmology that they care about while those in the astronomical community who actually measure H₀ find a persistently higher value.

A real tension in LCDM implies the need for new physics of the unknown variety. One doesn’t want to go there if it can be helped. I didn’t consider MOND until I was already concerned for the viability of dark matter. There are real problems for the paradigm that its more intense advocates simply deny, brush aside without real thought, or choose to remain ignorant of. When they are confronted with a problem, they are pretty creative about making stuff up on the spot. Anything to avoid having to confront the unspeakable – another hallmark of religion.

For example, cold dark matter is scale free. That’s foundational to the hypothesis. So the existence of an acceleration scale in the kinematic data is anathema to CDM. When I first pointed this contradiction out, there were a variety of assertions to the effect of “does too!” One example is provided by Kaplinghat & Turner, who claim to show “how Milgrom’s law comes about in the cold dark matter theory of structure formation.” That would, indeed, be ideal, and is a requirement for any theory to be successful.

Wee problem: they demonstrate no such thing. CDM is scale free, yet K&T claim that it explains Milgrom’s Law, which is predicated on the existence of an acceleration scale. Well, which is it? Is CDM scale free? Or does it explain the acceleration scale? We can’t have it both ways: their very premise is self-contradictory. It is absurd on its face.

The acceleration scale is defined by baryons, for which K&T have no model. To connect baryons with dark matter, they make a hand-waving argument about galaxies reaching a₀ at the edge of their disks. This is not even a concept of a model and does not begin to suffice as an explanation for many reasons, a prominent one being that low surface brightness galaxies have accelerations less than a₀ everywhere:

Centripetal acceleration curves color coded by galaxy surface brightness. Low surface brightness galaxies (blue colors) have low (sub-a₀) accelerations everywhere: there is no edge at which they reach a₀. (Adapted from McGaugh 2020.)

Milgrom pointed out this and many other shortcomings of their scenario, so I feel no need to elaborate further. Milgrom eviscerated their paper so thoroughly that the proper course of action would have been to retract it. Instead, they simply never acknowledge the criticism, and persist to this day in pushing it as some sort of valid scientific explanation. It is not; it does not withstand even mild critical scrutiny. But it doesn’t need to: it reassures the faithful that all is well. They hear what they want to hear without questioning its veracity. That’s another hallmark of religion.

I have refrained from saying these things in the past because I’m too nice. For example, a few years ago I started then abandoned the draft text below, which I simply cut & paste:

One of the things that attracted me to a career in science is the notion of objectivity. I grew up for a time in the bible belt, where people earnestly believed things that were obviously untrue, even to the eyes of a small child. On the occasions that I had the temerity to point out the obvious, the contradictions posed by facts never had an impact on their belief system. Rather, it inevitably earned me a warning that I was going to hell. No few of these people seemed to think it was their religious duty to send me there prematurely, or at least to make life on Earth a living hell.

Scientists eschew such behavior, but are also human, so often engage in it anyway. I’ve encountered it a lot. I get it; I went through the same denial, grief, and anger over the prospect of losing my good friend cold dark matter. The stages of grief never brought something back from the dead, but they have engendered a lot of blame-the-messenger.

Here’s an example, from a review by Mike Turner:

There is a lot of misinformation packed into this short paragraph.

The first clue is right there at the beginning, in red: the heading “False starts.” This is false framing, a classic tool of propagandists. It starts from the outset by asserting that the topic to be discussed is wrong at a level of knowledge so common it requires no justification. This is not the way one starts an objective discussion, much less a scientific one.

Turner then misconstrues what Milgrom did. He didn’t notice the scale a₀ in the data, for which there was scant evidence at the time. Rather, Milgrom made the obvious statement that the inference of dark matter relied on the assumption that dynamics, as encapsulated by the laws of inertia and gravity, is the same on the very different scales of galaxies as in the solar system where they were established, so we ought to consider if dynamics might change in some way. He quickly excluded a size dependence as a possibility. How he settled on acceleration is beyond the scope of this post, and not for me to say. Neither is it for Turner to say.

After a brief and incomplete description of what MOND is, Turner allows that “this one-parameter model fits all the rotation-curve data”. Even in making this admission, he chooses to call it a model rather than a theory. A model is something specific you build in the context of a theory, like a halo model in CDM. MOND is more than that.

Turner quickly moves on without contemplating any meaning that rotation curves might hold. Let’s pause to consider that.

First, I would not say that MOND fits all the rotation curve data. It fits most galaxies, but there are a minority of weird cases that are not well fit. The weird cases inevitably don’t make sense in terms of dark matter either, so on the whole I interpret this to be the usual price of dealing with astronomical data – some of it is just goofy. Setting such cases aside, I can and have fit the same data with all sorts of dark matter halo models. MOND requires fewer parameters, which is important, but the difference isn’t in the fitting. The difference is in predictive ability. I can use MOND to predict the dynamics of galaxies a priori, and have done so many times. I cannot use any flavor of dark matter theory to do the same, and it’s not for lack of trying.

The predictive power of MOND must be telling us something, even if it is something about the nature of dark matter or the process of galaxy formation. There are many papers written on this, some deep and profound, others absurd and banal. Turner cites none of them, nor displays any awareness that such work exists. I would venture to guess that is because acknowledging such work would imply that there is something to debate here, something he would apparently rather not admit.

That’s where I left off. It’s exhausting deciphering other people’s false assertions. Moreover, I just don’t like criticizing other people, no matter how richly they deserve it. (Turner has never refrained from criticizing me in ad hominem terms: on one occasion^$ he showed my picture to an audience and called me “the enemy.”) A large segment of the particle physics and cosmology community appears to think this way, and has succumbed to a scientific version of bible thumping in which you can assert any absurd thing so long as it falls within the framework of the holy LCDM. They really need to find something better to do.

I had hoped we were past this, but I heard a talk last week that was exactly in this mode. To paraphrase, the talk went

We’re sure dark matter exists. We have been sure about it for decades. In that time, we have been repeatedly proven wrong about what it is. Rather than re-think our paradigm in the face of these repeated failures, we double down yet again on the existence of this invisible, undetected mass, asserting aggressively^% that it must be true while eliding or misrepresenting the evidence that it is not. This enables us to make up a whole lot of exciting new possibilities for what the dark matter might be and conceive of ever more grandiose experiments to continue not to detect it. You must believe in dark matter!

This was not a science talk so much as an indoctrination session. It was as if I had stumbled into a revivalist tent where some hothead was preaching to the choir. This is the kind of talk that misled an entire generation into wasting their careers at the bottom of a mine shaft searching for WIMPs. At least WIMPs were a well-motivated hypothesis; this kind of talk could lead a new generation down an even greater variety of garden paths.

I am well aware that I might fall prey to this attitude myself. That’s why I set criteria by which I would change my mind: detect dark matter already, or at least provide a satisfactory explanation as to how MOND comes about. Neither of those criteria has been met. There are claims to do the latter, but so far these are just variations on models I tried and found to fail long ago. If I thought these could work, I would have said so. At the same time, I don’t see any dark matter advocates taking up the challenge to specify what would change their minds. When I ask them what could falsify dark matter, I get dumbfounded looks – the deer-in-the-headlights face one gets when the immediate response “why would you even ask that?” is checked by a distant memory that scientific theories are supposed to be falsifiable.

Personally, I found it humbling to encounter MOND in my own data. I too thought we understood the universe with dark matter. But who ordered this? Certainly not me: my own conventional, dark-matter based predictions were falsified. No one else working in the context of dark matter had got it right at the time either. Only Milgrom ordered this.

And what is this? There is a direct connection between what we see and what we get. Even in ignorance of MOND, the radial acceleration relation encodes a one-to-one relation between the distribution of baryons and the effective force. This is so direct that one can write down a single equation connecting the two:

g_obs = F(g_N/a₀) g_N.

The observed acceleration is a simple function of that predicted by Newton for the stars and gas that we see. There is no mention of unseen mass; everything is specified by what we can see is there.
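As a minimal sketch of how such a relation can be evaluated, the snippet below assumes one specific choice for the function F: the fitting form F(x) = 1/(1 − exp(−√x)) used for the radial acceleration relation by McGaugh, Lelli & Schombert (2016), with a₀ = 1.2 × 10⁻¹⁰ m s⁻². Neither the form of F nor the value of a₀ is specified in the text above, so both are assumptions here, and the function name is purely illustrative.

```python
import math

A0 = 1.2e-10  # m/s^2; ASSUMED value of the acceleration scale a0


def g_obs(g_newton, a0=A0):
    """Observed acceleration implied by the Newtonian (baryonic) acceleration.

    ASSUMES the interpolation function F(x) = 1 / (1 - exp(-sqrt(x))),
    with x = g_N / a0. Other choices of F are possible.
    """
    x = g_newton / a0
    return g_newton / (1.0 - math.exp(-math.sqrt(x)))


# Sample the relation from the Newtonian regime down to well below a0.
for g_n in (1e-8, 1e-10, 1e-12):
    ratio = g_obs(g_n) / g_n
    print(f"g_N = {g_n:.1e} m/s^2 -> g_obs / g_N = {ratio:.2f}")
```

With this choice of F, the output approaches g_N at high accelerations and approaches √(g_N a₀) at low accelerations, so the only input is the acceleration predicted from the stars and gas.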

I’ve sometimes heard astronomers complain about the reductionist ethos of physics, trying to cram all the complexity of the entire universe into a theory of everything. But here it is appropriate: there is a single, apparently universal force-law at work in galaxies. That’s telling us something profound. And yet if questioned about this, the physicists are the ones who will complain that galaxies are complicated, so they should be exempted from having to explain them. Galaxies should be complicated – in LCDM. But they’re observed not to be, in the sense that a single equation suffices to describe their kinematics. The problem isn’t that galaxies are inexplicably complicated, it’s that they should be but aren’t.

I am deeply disappointed that many scientists apparently lack the physical intuition to immediately recognize the import of the simple relation between what we see and what we get. It is the same sort of thing Newton noticed in the solar system: everything happens as if the gravitational force is proportional to the product of the masses and the inverse square of their separation. He didn’t understand why at the time, and was criticized for indulging in magical thinking: how can there be action at a distance? But that’s what the data were saying, and the same applies now. We might not yet understand the why, but the data look as if MOND is what’s happening in this universe.

^#The framing has morphed over the years. A recent advent is that some people have started proactively asserting that invisible mass is in the room with us now in order to avoid having to answer it as a question that makes them sound like loonies.

*He means the third peak should be smaller than the second, not larger, if by “it” he means modified gravity with the baryon density expected from big bang nucleosynthesis, which was the hypothesis that correctly predicted the first-to-second peak ratio but does indeed get the second-to-third peak ratio wrong. Funny how the CMB community was able to completely ignore the successful prediction for several years, but were then suddenly all over the latter failure. The third peak falsifies the ansatz on which that particular prediction was built, not the entire concept of modified gravity. This would be like asserting that all possible forms of dark matter are excluded because we haven’t yet detected WIMPs. It is a classic failure of objectivity, which is another hallmark of faith-based argumentation: we know His name is [insert favorite deity], not [insert any other deity].

^&Or after me. Dark matter was my first hypothesis, and I’m here to tell you that True Believers do not suffer second hypotheses or those who stray from the fold. I guess that’s why so many scientists who are MOND-curious keep it on the down low. Wise, perhaps (that’s why tenure needs to be a thing), but hardly the ideal of the open and free exchange of scientific ideas.

^$I wasn’t there, but one audience member (not someone I knew) thought it was so over the top that he told me about it, sharing a link with a video. (I did not retain that link, and doubt the hosting conference website is still active.)

^%Argument weak here. RAISE VOICE!

The well of inspiration

AP News recently ran a story on the interface of religious inspiration and the search for dark matter. I thought about commenting on it, then thought better of it, but now here I am again. I know some but not all of the people who are quoted, all good scientists. The article ends with a nice quote from Jennifer Wiseman:

“Studying the deep universe may make us feel insignificant,” Wiseman said. “But it also gives us a sense of unity that we’re all on the same planet. … The hope is we get a sense of joy, humility and love from these contemplations.”

I’ve known Dr. Wiseman since undergrad days, before either of us were Ph.Ds. We don’t agree on much that is specific to religion, but somehow manage to be friends anyway, and I completely agree with the sentiments she expresses here.

There is a tendency to portray religion as being at odds with science, and that can certainly happen. But they share more in common than this trope implies. Both arise from a deep wellspring in the human spirit, the same wellspring: the desire to know. How does the universe work? How did it come to be so? Why? and on and on. Where they differ is in approach: when encountering an unknown, especially things that are fundamentally unknowable (e.g., does God exist?), religion asserts an answer and asks that we accept it on faith. Faith is anathema to the scientific process; one is, in principle, to search for the truth through observation and experimentation.

Religion and science come into conflict when scientific knowledge encroaches on religion’s known unknowns. There was a time when the structure of the heavens must have seemed unknowable; the realm of the sacred, safe from access by mundane human knowledge. So it is easy to see how Galileo came into conflict with the Church despite being a faithful member:

Church: Scripture teaches us that the Earth is the unmoving center of mortal corruption; the heavens above us are perfect and unchanging.

Galileo: I can see spots on the sun and mountains on the moon.

Church: We are the center about which the heavens rotate every day, clearly the center of all rotation.

Galileo: I see satellites circling Jupiter.

Church: God is great; He could do one thing one day and an entirely different thing another day*.

Galileo: It appears that the Earth revolves around the Sun, never the other way around.

Church: Do you see these instruments of torture?

This is a tongue-in-cheek portrayal of a serious historical incident, but my point is that conflict arises when scientific knowledge encroaches onto turf that had formerly been the exclusive province of religion. Addressing our spiritual need to know can be inspirational, but it can also be a let down. Giving specific answers to formerly unknowable questions can be a bit of a mood killer. Is that all there is?

Another downside of scientific knowledge is that it can never be complete. There is always something left unknown, and we are not supposed to fill that void with something we take on faith. Yet we are profoundly uncomfortable with not knowing. A little extrapolation at and beyond the fringes of current knowledge is natural, and can sometimes provide a useful way forward in driving new discoveries. But it can get carried away, e.g., string theory. So extraordinary caution is also warranted: the temptation to fill in the blank is where empiricism transitions to theology.

Understanding that we know a lot – as we do at this juncture in the history of science – and yet still don’t know something important, like what most of the universe is made of, is hard to accept. It becomes even harder when admitting something we thought we understood is wrong, or is less complete than we thought. We know most of the mass in the universe is made of non-baryonic cold dark matter^&. Don’t we?

The dynamical evidence for acceleration discrepancies is abundant. That these must be caused by non-baryonic cold dark matter is less clear. This is where cosmology intrudes. Cosmology has always been the nexus where science and religion meet, with philosophical imperatives often obscuring essential observational facts. Before Kepler, orbits had to be circular. After Inflation, the density parameter had to be one. (We meant Ω_m = 1, not the modern weak-sauce version Ω_m + Ω_Λ = 1.) The mass density is larger than the baryon density from BBN (Ω_m > Ω_b), so there has to be non-baryonic dark matter. That such a substance is also required to fit the acoustic power spectrum of the CMB amplifies our faith in the existence of such stuff.

We think we’ve solved cosmology, and that solution requires non-baryonic cold dark matter. So to admit that maybe we were wrong about dark matter just because it persistently remains undetected and provides unsatisfactory explanations of many astronomical observations and is consistently outperformed in predictive capacity by an alternative theory is to admit that we don’t understand as much about the universe as we thought. That’s really, really hard. Our unwillingness to admit that maybe we have been wrong about such an important issue is where human nature kicks in to blur the line between scientific knowledge and religious faith. It’s much easier to ignore those nagging doubts^% and have faith that we were right all along.

Cosmology works so well that dark matter has to exist. I’ve heard this sentiment expressed over and over by many different scientists, yet this is an assertion of faith. A more conservative statement is that cosmology as we currently conceive it works if, and only if, an appropriate form of non-baryonic dark matter exists with the required cosmic density. If not, then we need a new model – something that presumably^{^} stems from a more general underlying theory.

Since both religion and science arise from the same desire to know, it is easy to have faith that we know more than we actually do.

This is the sticking point we’ve hit in the dark matter debate. We’ve been calling the acceleration discrepancy the dark matter problem for so long that this linguistic mistake has morphed into an absolute certainty that invisible mass exists. It has become a matter of faith.

*Of course an omnipotent deity could do that. Apparently He chooses not to do so, sparking the schism between theists, for whom God actively intervenes in miraculous ways in real world affairs, and deists, who view God more as the great watchmaker, setting creation in motion but not interfering with its operation. Deism is conducive to science, as the search to identify the rules by which the universe works is to gain insight – however remote – into the mind of God. I suspect this attitude informed Einstein’s complaint against quantum mechanics: “God does not play dice with the universe.”

^&I certainly thought so before I didn’t.

^%The social pressure to conform to the preferred cosmology is enormous. I know I’m only making it worse for myself. But an explanation that omits MOND is a lie of omission. Here science and religion certainly overlap in the sentiments expressed by the seventeenth century cleric Paul Gerhardt:

“When a man lies, he murders some part of the world.”

Assertions that somehow “feedback” explains MOND are simply a numerical form of magical thinking; an excuse to not have to explain the inexplicable. By eliding MOND, we murder a part of the world.

^{^}Here I am extrapolating at the fringes of knowledge.

The baryons are mostly in the intergalactic medium. Mostly.

My colleague Jim Schombert pointed out a nifty new result published in Nature Astronomy which you probably can’t access so here is a link to what looks to be the preliminary version. The authors use the Deep Synoptic Array (DSA) to discover some new Fast Radio Bursts (FRBs), many of which are apparently in galaxies at large enough distances to provide an interesting probe of the intervening intergalactic medium (IGM).

There is lots that’s new and cool here. The DSA-110 is able to localize FRBs well enough to figure out where they are, which is an interesting challenge and impressive technological accomplishment. FRBs themselves remain something of a mystery. They are observed as short (typically millisecond), high intensity pulses of very low frequency radio emission, typically 1,400 MHz or less. What causes these pulses isn’t entirely clear, but they might be produced in the absurdly intense magnetic fields around some neutron stars.

FRBs are intrinsically luminous – lots of energy packed into a short burst – so can be detected from cosmological distances. The trick is to find them (blink and miss it!) and also to localize them on the sky. That’s challenging to do at these frequencies well enough to uniquely associate them with optical sources like candidate host galaxies. To quote from their website, “DSA-110 is a radio interferometer purpose-built for fast radio burst (FRB) detection and direct localization.” It was literally made to do this.

Connor et al. analyze dozens of known FRBs and report nine new ones, covering enough of the sky to probe an interesting cosmological volume. Host galaxies with known redshifts define a web of pencil-beam probes – the paths that the radio waves have to traverse to get here. Low frequency radio waves are incredibly useful as a probe of the intervening space because they are sensitive to the density of intervening electrons, providing a measure of how many there are between us and each FRB.

Most of intergalactic space is so empty that the average density of matter is orders of magnitude lower than the best vacuum we can achieve in the laboratory. But there is some matter there, and of course intergalactic space is huge, so even low densities might add up to a lot. This provides a good way to find out how much.

The speed of light is the ultimate speed limit, in a vacuum. When propagating through a medium like glass or water, the effective speed of light is reduced by the index of refraction. For low frequency radio waves, the exceedingly low density of free electrons of the IGM suffices to slow them down a bit. This effect, called the dispersion measure, is frequency dependent. It usually comes up in the context of pulsars for which the width of their pulses is spread by the effect, but it works for any radio source with appropriate observable frequencies, like FRBs. The dispersion measure tells us the product of the distance and the density traversed along the line of sight to the source, so is usually expressed in typical obscure astronomical fashion as pc cm^-3. This is really a column density, the number per square cm, but with host galaxies of known redshift the distance is known independently and we get a measure of the average electron volume density along the line of sight.
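To make the units concrete, here is a small sketch of the two relations just described: the frequency-dependent arrival delay and the mean electron density implied by a dispersion measure over a known path length. The dispersion constant (about 4.149 ms for frequencies in GHz and DM in pc cm^-3) is the standard cold-plasma value. The example DM, distance, and band edges are made-up illustrative numbers, not values from Connor et al., and the simple division ignores the cosmological (1+z) factors a real analysis would need.

```python
K_DM_MS = 4.149  # ms GHz^2 per (pc cm^-3); standard dispersion constant


def dispersion_delay_ms(dm, f_lo_ghz, f_hi_ghz):
    """Extra arrival delay (ms) of the low frequency relative to the high one."""
    return K_DM_MS * dm * (f_lo_ghz ** -2 - f_hi_ghz ** -2)


def mean_electron_density(dm, distance_pc):
    """Average electron density (cm^-3) along a path of known length.

    DM is in pc cm^-3, so dividing by the path length in pc leaves cm^-3.
    Simplification: no cosmological (1+z) corrections are applied.
    """
    return dm / distance_pc


# HYPOTHETICAL example values, chosen only to illustrate the units.
example_dm = 500.0        # pc cm^-3
example_distance = 1.0e9  # pc (1 Gpc)

delay = dispersion_delay_ms(example_dm, 1.3, 1.5)
n_e = mean_electron_density(example_dm, example_distance)
print(f"delay across the band: {delay:.0f} ms")
print(f"mean electron density: {n_e:.1e} cm^-3")
```

The delay grows toward lower frequencies as the inverse square of the frequency, which is why the effect is measurable in the radio and negligible at optical wavelengths.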

That’s it. That by itself provides a good measure of the density of intergalactic matter. The IGM is highly ionized, with a neutral fraction < 10^-4, so counting electrons is the same as counting atoms. (Not every nucleus is hydrogen, so they adopt 0.875 electrons per baryon to account for the neutrons in helium and heavier elements. We know the neutral fraction is low in the IGM because hydrogen is incredibly opaque to ultraviolet radiation: absorption would easily be seen, yet there is no Gunn-Peterson trough until z > 6.) This leads to a baryon density of Ω_Bh² = 0.025 ± 0.003, which is 5% of the critical density for a reasonable Hubble parameter of h = 0.7.
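The arithmetic behind that last step is short enough to spell out. The sketch below simply divides the quoted Ω_Bh² by h² for h = 0.7; holding h fixed when scaling the uncertainty is a simplifying assumption made here, not something stated in the paper.

```python
def omega_b(omega_b_h2, h):
    """Convert the physical baryon density Omega_B h^2 to Omega_B for a given h."""
    return omega_b_h2 / h ** 2


h = 0.7
value = omega_b(0.025, h)
# ASSUMPTION: h is treated as exact, so the error bar scales the same way.
error = omega_b(0.003, h)
print(f"Omega_B = {value:.3f} +/- {error:.3f}")
```

For these inputs the result is a little over 0.05, which is the 5% of the critical density quoted above.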

This solves the cosmic missing baryon problem. There had been an order of magnitude discrepancy when most of the baryons we knew about were in stars. It gradually became clear that many of the baryons were in various forms of tenuous plasma in the space between galaxies, for example in the Lyman alpha forest, but these didn’t account for everything so a decade ago a third of the baryons expected from BBN were still unaccounted for in the overall baryon budget. Now that checksum is complete. Indeed, if anything, we now have a small (if not statistically significant) baryon surplus⁺.

Here is a graphic representing the distribution of baryons among the various reservoirs. Connor et al. find that the fraction in the intergalactic medium is f_IGM = 0.76 +0.10/-0.11. Three quarters of the baryons are Out There, spread incredibly thin throughout the vastness of cosmic space, with an absolute density of a few x 10^-31 g cm^-3, which is about one atom per cubic meter. Most of the atoms are hydrogen, so “normal” for most of the universe is one proton and one electron in a box a meter across rather than the 10^-10 m occupied by a bound hydrogen atom. That’s a whole lot of empty.
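As a rough check on those numbers, the following sketch converts Ω_Bh² and f_IGM into an absolute mean density. It uses the standard definition of the critical density, ρ_crit = 3H₀²/(8πG), with rounded physical constants, and makes two simplifying assumptions of its own: the density is evaluated at the present day, and every baryon is counted as a single proton mass.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2
MPC_M = 3.0857e22      # metres per megaparsec
M_PROTON = 1.6726e-27  # kg


def critical_density_h1():
    """Critical density for h = 1 in kg m^-3 (the general value scales as h^2)."""
    h100 = 1.0e5 / MPC_M  # 100 km/s/Mpc expressed in s^-1
    return 3.0 * h100 ** 2 / (8.0 * math.pi * G)


def igm_density(omega_b_h2=0.025, f_igm=0.76):
    """Mean present-day IGM baryon density in kg m^-3.

    Omega_B h^2 times the h = 1 critical density is independent of h.
    """
    return f_igm * omega_b_h2 * critical_density_h1()


rho = igm_density()
rho_cgs = rho * 1.0e-3     # kg m^-3 -> g cm^-3
n_per_m3 = rho / M_PROTON  # ASSUMPTION: each baryon counted as one proton
print(f"IGM density: {rho_cgs:.1e} g cm^-3, {n_per_m3:.2f} protons per m^3")
```

Under these assumptions the density comes out at a few × 10^-31 g cm^-3, and the number density lands within an order of magnitude of one atom per cubic meter.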

Connor et al. assess that about 3/4 of all baryons are in the intergalactic medium (IGM), give or take 10% – the side bars illustrate the range of uncertainty. Many of the remaining baryons are in other forms of space plasma associated with but not in galaxies: the intracluster medium (ICM) of rich clusters, the intragroup medium (IGroupM) of smaller groups, and the circumgalactic medium (CGM) associated with individual galaxies. All the stars in all the galaxies add up to less than 10%, and the cold (non-ionized) atomic and molecular gas in galaxies comprises about 1% of the baryons.

The other reservoirs of baryons pale in comparison to the IGM. Most are still in some form of diffuse space plasma, like the intracluster media of clusters of galaxies and groups of galaxies, or associated with but not in individual galaxies (the circumgalactic medium). These distinctions are a bit fuzzy, as are the uncertainties on each component, especially the CGM (f_CGM = 0.08 +0.07/-0.06). This leaves some room for a lower overall baryon density, but not much.

Connor et al. get some constraint on the CGM by looking at the increase in the dispersion measure for FRBs with sight-lines that pass close to intervening galaxies vs. those that don’t. This shows that there does seem to be some extra gas associated with such galaxies, but not enough to account for all the baryons that should be associated with their dark matter halos. So the object-by-object checksum of how the baryons are partitioned remains problematic, and I hope to have more to say about it in the near future. Connor et al. argue that some of the baryons have to have been blown entirely out of their original dark matter halos by feedback; they can’t all be lurking there or there would be less dispersion measure from the general IGM between us and relatively nearby galaxies where there is no intervening CGM*.

The baryonic content of visible galaxies – the building blocks of the universe that most readily meet the eye – is less than 10% of the total baryon density. Most of that is in stars and their remnants, which contain about 5% of the baryons, give or take a few percent stemming from the uncertainty in the stellar initial mass function. The cold gas – both neutral atomic gas and the denser molecular gas from which stars form – only adds up to about 1% of all baryons. What we see most readily is only a fraction of what’s out there, even when restricting our consideration to normal matter: mostly the baryons are in the IGM. Mostly.

The new baryon inventory is now in good agreement with big bang nucleosynthesis: Ω_Bh² = 0.025 ± 0.003 is consistent with Ω_bh² = 0.0224 ± 0.0001 from Planck CMB fits. It is more consistent with this and the higher baryon density favored by deuterium than it is with lithium, but isn’t accurate enough to exclude the latter. Irrespective of this important detail, I feel better that the third of the baryons that used to be missing (or perhaps not there at all) are now accounted for. The agreement of the checksum of the baryon inventory with the density of baryons consistent with BBN is an encouraging success of this deeply fundamental aspect of the hot big bang cosmology.

⁺Looking at their equation 2, there is some degeneracy between the baryon density Ω_b and the fraction of ionized baryons Out There. Lower Ω_b would mean a higher baryon fraction in the diffuse ionized state. This is already large, so there is only a little room to trade off between the two.

*What counts as CGM is a bit dicey. Putting on a cosmology hat, the definition Connor et al. adopt involving a range of masses of dark matter halos appropriate for individual galaxies is a reasonable one, and it makes sense to talk about the baryon fraction of those objects relative to the cosmic value, of which they fall short (f_gas = 0.35 +0.30/-0.25 in individual galaxies where f_* < 0.35: these don’t add up to unity). Switching to MOND, the notional association of the CGM with the virial radii of host dark matter halos is meaningless, so it doesn’t matter if the gas in the vicinity of galaxies was once part of them and got blown out or simply never accreted in the first place. In LCDM we require at least some blow out to explain the sub-cosmic baryon fractions, while in MOND I’m inclined to suspect that the dominant process is non-accretion due to inefficient galaxy formation. Of course, the universe may indulge in a mix of both physical effects, in either paradigm!

^%Unlike FLRW cosmology, there is no special scale defined by the critical density; a universe experiencing the MOND force-law will ultimately recollapse whatever its density, at least in the absence of something that acts like anti-gravity (i.e., dark energy). In retrospect, this is a more satisfactory solution of the flatness problem than Inflation, as there is nothing surprising about the observed density being what it is. There is no worry about it being close to but not quite equal to the critical density since the critical density is no longer a special scale.

The Deuterium-Lithium tension in Big Bang Nucleosynthesis

There are many tensions in the era of precision cosmology. The most prominent, at present, is the Hubble tension – the difference between traditional measurements, which consistently obtain H₀ = 73 km/s/Mpc, and the best fit* to the acoustic power spectrum of the cosmic microwave background (CMB) observed by Planck, H₀ = 67 km/s/Mpc. There are others of varying severity that are less widely discussed. In this post, I want to talk about a persistent tension in the baryon density implied by the measured primordial abundances of deuterium and lithium⁺. Unlike the tension in H₀, this problem is not nearly as widely discussed as it should be.

Framing

Part of the reason that this problem is not seen as an important tension has to do with the way in which it is commonly framed. In most discussions, it is simply the primordial lithium problem. Deuterium agrees with the CMB, so those must be right and lithium must be wrong. Once framed that way, it becomes a trivial matter specific to one untrustworthy (to cosmologists) observation. It’s a problem for specialists to sort out what went wrong with lithium: the “right” answer is otherwise known, so this tension is not real, making it unworthy of wider discussion. However, as we shall see, this might not be the right way to look at it.

It’s a bit like calling the acceleration discrepancy the dark matter problem. Once we frame it this way, it biases how we see the entire problem. Solving this problem becomes a matter of finding the dark matter. It precludes consideration of the logical possibility that the observed discrepancies occur because the force law changes on the relevant scales. This is the mental block I struggled mightily with when MOND first cropped up in my data; this experience makes it easy to see when other scientists succumb to it sans struggle.

Big Bang Nucleosynthesis (BBN)

I’ve talked about the cosmic baryon density here a lot, but I’ve never given an overview of BBN itself. That’s because it is well-established, and has been for a long time – I assume you, the reader, already know about it or are competent to look it up. There are many good resources for that, so I’ll give only as much of a sketch as is necessary for the subsequent narrative – a sketch that will be both too little for the experts and too much for the subsequent narrative that most experts are unaware of.

Primordial nucleosynthesis occurs in the first few minutes after the Big Bang when the universe is the right temperature and density to be one big fusion reactor. The protons and available neutrons fuse to form helium and other isotopes of the light elements. Neutrons are slightly more massive and less numerous than protons to begin with. In addition, free neutrons decay with a half-life of roughly ten minutes, so are outnumbered by protons when nucleosynthesis happens. The vast majority of the available neutrons pair up with protons and wind up in ⁴He while most of the protons remain on their own as the most common isotope of hydrogen, ¹H. The resulting abundance ratio is one alpha particle for every dozen protons, or in terms of mass fractions^&, X_p = 3/4 hydrogen and Y_p = 1/4 helium. That is the basic composition with which the universe starts; heavy elements are produced subsequently in stars and supernova explosions.
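The arithmetic behind those numbers is short enough to check by hand. Here is a minimal sketch, assuming the usual textbook neutron-to-proton ratio of about 1/7 at the time of nucleosynthesis (that number is my assumption, not given above), that every neutron ends up in ⁴He, and that a ⁴He nucleus weighs four proton masses:

```python
from fractions import Fraction

def helium_mass_fraction(n_over_p):
    """Mass fraction in 4He if every neutron is locked up in helium.

    Each 4He nucleus takes 2 neutrons and 2 protons, so the helium mass is
    twice the neutron mass budget: Y = 2(n/p) / (1 + n/p).
    """
    return 2 * n_over_p / (1 + n_over_p)

n_over_p = Fraction(1, 7)   # ASSUMED textbook value, not taken from the post

Y_p = helium_mass_fraction(n_over_p)   # helium mass fraction
X_p = 1 - Y_p                          # hydrogen mass fraction
# Number of alpha particles per free proton (4He taken as 4 proton masses).
alphas_per_proton = (Y_p / 4) / X_p

print(f"X_p = {X_p}, Y_p = {Y_p}, alphas per proton = {alphas_per_proton}")
```

With these assumptions the sketch recovers X_p = 3/4, Y_p = 1/4, and one alpha particle per dozen protons.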

Though ¹H and ⁴He are by far the most common products of BBN, there are traces of other isotopes that emerge from BBN:

The time evolution of the relative numbers of light element isotopes through BBN. As the universe expands, nuclear reactions “freeze-out” and establish primordial abundances for the indicated species. The precise outcome depends on the baryon density, Ω_b. This plot illustrates a particular choice of *Ω_b*; different *Ω_b* result in observationally distinguishable abundances. (Figures like this are so ubiquitous in discussions of the early universe that I have not been able to identify the original citation for this particular version.)

After hydrogen and helium, the next most common isotope to emerge from BBN is deuterium, ²H. It is the first thing made (one proton plus one neutron) but most of it gets processed into ⁴He, so after a brief peak, its abundance declines. How much it declines is very sensitive to Ω_b: the higher the baryon density, the more deuterium gets gobbled up by helium before freeze-out. The following figure illustrates how the abundance of each isotope depends on Ω_b:

“Schramm diagram” adopted from Cyburt et al. (2003) showing the abundance of ⁴He by mass fraction (top) and the number relative to hydrogen of deuterium (D = ²H), helium-3, and lithium as a function of the baryon-to-photon ratio. We measure the photon density in the CMB, so this translates directly to the baryon density^$ *Ω_b*h² (top axis).

If we can go out and measure the primordial abundances of these various isotopes, we can constrain the baryon density.

The Baryon Density

It works! Each isotope provides an independent estimate of Ω_bh², and they agree pretty well. This was the first and for a long time the only over-constrained quantity in cosmology. So while I am going to quibble about the exact value of Ω_bh², I don’t doubt that the basic picture is correct. There are too many details we have to get right in the complex nuclear reaction chains coupled to the decreasing temperature of a universe expanding at the rate required during radiation domination for this to be an accident. It is an exquisite success of the standard Hot Big Bang cosmology, albeit not one specific to LCDM.

Getting at primordial, rather than current, abundances is an interesting observational challenge too involved to go into much detail here. Suffice it to say that it can be done, albeit to varying degrees of satisfaction. We can then compare the measured abundances to the theoretical BBN abundance predictions to infer the baryon density.

The Schramm diagram with measured abundances (orange boxes) for the isotopes of the light elements. The thickness of the box illustrates the uncertainty: tiny for deuterium and large for *⁴He* because of the large zoom on the axis scale. The lithium abundance could correspond to either low or high baryon density. ³He is omitted because its uncertainty is too large to provide a useful constraint.

Deuterium is considered the best baryometer because its relic abundance is very sensitive to Ω_bh²: a small change in baryon density corresponds to a large change in D/H. In contrast, ⁴He is a great confirmation of the basic picture – the primordial mass fraction has to come in very close to 1/4 – but the precise value is not very sensitive to Ω_bh². Most of the neutrons end up in helium no matter what, so it is hard to distinguish^# a few more from a few less. (Note the huge zoom on the linear scale for ⁴He. If we plotted it logarithmically with decades of range as we do the other isotopes, it would be a nearly flat line.) Lithium is annoying for being double-valued right around the interesting baryon density so that the observed lithium abundance can correspond to two values of Ω_bh². This behavior stems from the trade off with ⁷Be which is produced at a higher rate but decays to ⁷Li after a few months. For this discussion the double-valued ambiguity of lithium doesn’t matter, as the problem is that the deuterium abundance indicates Ω_bh² that is even higher than the higher branch of lithium.

BBN pre-CMB

The diagrams above and below show the situation in the 1990s before CMB estimates became available. Consideration of all the available data in the review of Walker et al. led to the value Ω_bh² = 0.0125 ± 0.0025. This value** was so famous that it was Known. It formed the basis of my predictions for the CMB for both LCDM and no-CDM. This prediction hinged on BBN being correct, and that we understood the experimental bounds on the baryon density. A few years after Walker’s work, Copi et al. provided the estimate⁺⁺ 0.009 < Ω_bh² < 0.02. Those were the extreme limits of the time, as illustrated by the green box below:

The baryon density as it was known before detailed observations of the acoustic power spectrum of the CMB. BBN was a mature subject before 1990; the massive reviews of Walker et al. and Copi et al. creak with the authority of a solved problem. The controversial tension at the time was between the high and low deuterium measurements from Hogan and Tytler, which were at the extreme ends of the ranges indicated by the bulk of the data in the reviews.

Up until this point, the constraints on BBN had come mostly from helium observations in nearby galaxies and lithium measurements in metal poor stars. It was only just then becoming possible to obtain high quality spectra of sufficiently high redshift quasars to see weak deuterium lines associated with strongly damped primary hydrogen absorption in intergalactic gas along the line of sight. This is great: deuterium is the most sensitive baryometer, the redshifts were high enough to be early in the history of the universe close to primordial times, and the gas was in the middle of intergalactic nowhere so shouldn’t be altered by astrophysical processes. These are ideal conditions, at least in principle.

First results were binary. Craig Hogan obtained a high deuterium abundance, corresponding to a low baryon density. Really low. From my Walker et al.-informed confirmation bias, too low. It was a brand new result, so promising but probably wrong. Then Tytler and his collaborators came up with the opposite result: low deuterium abundance corresponding to a high baryon density: Ω_bh² = 0.019 ± 0.001. That seemed pretty high at the time, but at least it was within the bound Ω_bh² < 0.02 set by Copi et al. There was a debate between these high/low deuterium camps that ended in a rare act of intellectual honesty by a cosmologist when Hogan^&& conceded. We seemed to have settled on the high-end of the allowed range, just under Ω_bh² = 0.02.

Enter the CMB

CMB data started to be useful for constraining the baryon density in 2000 and improved rapidly. By that point, LCDM was already well-established, and I had published predictions for both LCDM and no-CDM. In the absence of cold dark matter, one expects a damping spectrum, with each peak lower than the one before it. For the narrow (factor of two) Known range of possible baryon densities, all the no-CDM models run together to essentially the same first-to-second peak ratio.

Peak locations measured by WMAP in 2003 (points) compared to the a priori (1999) predictions of LCDM (red tone lines) and no-CDM (blue tone lines). Models are normalized in amplitude around the first peak.

Adding CDM into the mix adds a driver to the oscillations. This fights the baryonic damping: the CDM is like a parent pushing a swing while the baryons are the kid dragging his feet. This combination makes just about any pattern of peaks possible. Not all free parameters are made equal: the addition of a single free parameter, Ω_CDM, makes it possible to fit any plausible pattern of peaks. Without it (no-CDM means Ω_CDM = 0), only the damping spectrum is allowed.

For BBN as it was known at the time, the clear difference was in the relative amplitude^$$ of the first and second peaks. As can be seen above, the prediction for no-CDM was correct and that for LCDM was not. So we were done, right?

Of course not. To the CMB community, the only thing that mattered was the fit to the CMB power spectrum, not some obscure prediction based on BBN. Whatever the fit said was True; too bad for BBN if it didn’t agree.

The way to fit the unexpectedly small^## second peak was to crank up the baryon density. To do that, Tegmark & Zaldarriaga (2000) needed 0.022 < Ω_bh² < 0.040. That’s the first blue point below. This was the first time that I heard it suggested that the baryon density could be so high.

The baryon density from deuterium (red triangles) before and after (dotted vertical line) estimates from the CMB (blue points). The horizontal dotted line is the pre-CMB upper limit of *Copi et al.*

The astute reader will note that the CMB-fit 0.022 < Ω_bh² < 0.040 sits entirely outside the BBN bounds 0.009 < Ω_bh² < 0.02. So we’re done, right? Well, no – the community simply ignored the successful a priori prediction of the no-CDM scenario. That was certainly easier than wrestling with its implications, and no one seems to have paused to contemplate why the observed peak ratio came in exactly at the one unique value that it could obtain in the case of no-CDM.
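The claim that the two ranges do not overlap is simple interval arithmetic. A minimal sketch using only the two ranges quoted above (the helper name is purely illustrative):

```python
def intervals_overlap(a, b):
    """Return True if intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return max(a[0], b[0]) <= min(a[1], b[1])

bbn_bounds = (0.009, 0.020)   # Copi et al. range quoted above
cmb_fit = (0.022, 0.040)      # Tegmark & Zaldarriaga (2000) range quoted above

overlap = intervals_overlap(bbn_bounds, cmb_fit)
gap = cmb_fit[0] - bbn_bounds[1]   # separation between the two ranges

print(f"overlap: {overlap}, gap in Omega_b h^2: {gap:.3f}")
```

The lower edge of the first CMB fit sits above the upper edge of the pre-CMB BBN range, so there is no value of Ω_bh² that satisfies both.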

For a few years, the attitude seemed to be that BBN was close but not quite right. As the CMB data improved, the baryon density came down, ultimately settling on Ω_bh² = 0.0224 ± 0.0001. Part of the reason for this decline from the high initial estimate is covariance. In this case, the tilt plays a role: the baryon density declined as n_s = 1 → 0.965 ± 0.004. Getting the second peak amplitude right takes a combination of both.

Now we’re back in the ballpark, almost: Ω_bh² = 0.0224 is not ridiculously far above the BBN limit Ω_bh² < 0.02. Close enough for Spergel et al. (2003) to say “The remarkable agreement between the baryon density inferred from D/H values and our [WMAP] measurements is an important triumph for the basic big bang model.” This was certainly true given the size of the error bars on both deuterium and the CMB at the time. It also elides^*** any mention of either helium or lithium or the fact that the new Known was not consistent with the previous Known. Ω_bh² = 0.0224 was always the ally; Ω_bh² = 0.0125 was always the enemy.

Note, however, that deuterium made a leap from below Ω_bh² = 0.02 to above 0.02 exactly when the CMB indicated that it should do so. They iterated to better agreement and pretty much stayed there. Hopefully that is the correct answer, but given the history of the field, I can’t help worrying about confirmation bias. I don’t know if that is what’s going on, but if it were, this convergence over time is what it would look like.

Lithium does not concur

Taking the deuterium results at face value, there really is excellent agreement with the LCDM fit to the CMB, so I have some sympathy for the desire to stop there. Deuterium is the best baryometer, after all. Helium is hard to get right at a precise enough level to provide a comparable constraint, and lithium, well, lithium is measured in stars. Stars are tiny, much smaller than galaxies, and we know those are too puny to simulate.

Spite & Spite (1982) [those are names, pronounced “speet”; we’re not talking about spiteful stars] discovered what is now known as the Spite plateau, a level of constant lithium abundance in metal poor stars, apparently indicative of the primordial lithium abundance. Lithium is a fragile nucleus; it can be destroyed in stellar interiors. It can also be formed as the fragmentation product of cosmic ray collisions with heavier nuclei. Both of these things go on in nature, making some people distrustful of any lithium abundance. However, the Spite plateau is a sort of safe zone where neither effect appears to dominate. The abundance of lithium observed there is indeed very much in the right ballpark to be a primordial abundance, so that’s the most obvious interpretation.

Lithium indicates a lowish baryon density. Modern estimates are in the same range as BBN of old; they have not varied systematically with time. There is no tension between lithium and pre-CMB deuterium, but it disagrees with LCDM fits to the CMB and with post-CMB deuterium. This tension is both persistent and statistically significant (Fields 2011 describes it as “4–5σ”).

*The baryon density from lithium (yellow symbols) over time. Stars are measurements in groups of stars on the Spite plateau; the square represents the approximate value from the ISM of the SMC.*

I’ve seen many models that attempt to fix the lithium abundance, e.g., by invoking enhanced convective mixing via <<mumble mumble>> so that lithium on the surface of stars is subject to destruction deep in the stellar interior in a previously unexpected way. This isn’t exactly satisfactory – it should result in a mess, not a well-defined plateau – and other attempts I’ve seen to explain away the problem do so with at least as much contrivance. All of these models appeared after lithium became a problem; they’re clearly motivated by the assumption bias that the CMB is correct, so the discrepancy is specific to lithium, so there must be something weird about stars that explains it.

Another way to illustrate the tension is to use Ω_bh² from the Planck fit to predict what the primordial lithium abundance should be. The Planck-predicted band is clearly higher than and offset from the stars of the Spite plateau. There should be a plateau, sure, but it’s in the wrong place.

The lithium abundance in metal poor stars (points), the interstellar medium of the Small Magellanic Cloud (green band), and the primordial lithium abundance expected for the best-fit Planck LCDM. For reference, *[Fe/H] = -3* means an iron abundance that is one one-thousandth that of the sun.
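For readers unfamiliar with the bracket notation: [Fe/H] is the base-10 logarithm of a star's iron-to-hydrogen ratio relative to the sun's. A minimal sketch of the conversion (the function name is illustrative only):

```python
def abundance_relative_to_solar(bracket_value):
    """Convert a bracket abundance such as [Fe/H] into a ratio relative to solar.

    [Fe/H] = log10( (Fe/H)_star / (Fe/H)_sun ), so the ratio is 10**[Fe/H].
    """
    return 10.0 ** bracket_value

for fe_h in (0, -1, -2, -3):
    ratio = abundance_relative_to_solar(fe_h)
    print(f"[Fe/H] = {fe_h:+d}  ->  {ratio:g} x solar")
```

Each step of -1 is another factor of ten below the solar iron abundance.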

An important recent observation is that a similar lithium abundance is obtained in the metal poor interstellar gas of the Small Magellanic Cloud. That would seem to obviate any explanation based on stellar physics.

The Schramm diagram with the Planck CMB-LCDM value added (vertical line). This agrees well with deuterium measurements made after CMB data became available, but not with those before, nor with the measured abundance of lithium.

We can also illustrate the tension on the Schramm diagram. This version adds the best-fit CMB value and the modern deuterium abundance. These are indeed in excellent agreement, but they don’t intersect with lithium. The deuterium-lithium tension appears to be real, and comparable in significance to the H₀ tension.

So what’s the answer?

I don’t know. The logical options are

A systematic error in the primordial lithium abundance
A systematic error in the primordial deuterium abundance
Physics beyond standard BBN

I don’t like any of these solutions. The data for both lithium and deuterium are what they are. As astronomical observations, both are subject to the potential for systematic errors and/or physical effects that complicate their interpretation. I am also extremely reluctant to consider modifications to BBN. There are occasional suggestions to this effect, but it is a lot easier to break than it is to fix, especially for what is a fairly small disagreement in the absolute value of Ω_bh².

I have left the CMB off the list because it isn’t part of BBN: its constraint on the baryon density is real, but involves completely different physics. It also involves different assumptions, i.e., the LCDM model and all its invisible baggage, while BBN is just what happens to ordinary nucleons during radiation domination in the early universe. CMB fits are corroborative of deuterium only if we assume LCDM, which I am not inclined to accept: deuterium disagreed with the subsequent CMB data before it agreed. Whether that’s just progress or a sign of confirmation bias, I also don’t know. But I do know confirmation bias has bedeviled the history of cosmology, and as the H₀ debate shows, we clearly have not outgrown it.

The appearance of confirmation bias is augmented by the response time of each measured elemental abundance. Deuterium is measured using high redshift quasars; the community that does that work is necessarily tightly coupled to cosmology. Its response was practically instantaneous: as soon as the CMB suggested that the baryon density needed to be higher, conforming D/H measurements appeared. Indeed, I recall when that first high red triangle appeared in the literature, a colleague snarked to me “we can do that too!” In those days, those of us who had been paying attention were all shocked at how quickly Ω_bh² = 0.0125 ± 0.0025 was abandoned for literally double that value, Ω_Bh² = 0.025 ± 0.001. That’s 4.6 sigma for those keeping score.
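For those who do want to keep score, here is a minimal sketch of the arithmetic. It assumes the two estimates are independent with Gaussian errors that add in quadrature. That is the conventional choice and it reproduces the quoted number, though the text does not spell the method out:

```python
import math

def tension_sigma(value_1, error_1, value_2, error_2):
    """Difference between two measurements in units of their combined error.

    Assumes independent Gaussian uncertainties, combined in quadrature.
    """
    return abs(value_1 - value_2) / math.hypot(error_1, error_2)

old_known = (0.0125, 0.0025)   # pre-CMB value quoted above
new_value = (0.025, 0.001)     # post-CMB deuterium value quoted above

sigma = tension_sigma(*old_known, *new_value)
print(f"{sigma:.1f} sigma")
```

The same function can be pointed at any other pair of values and uncertainties quoted in this post.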

The primordial helium abundance is measured in nearby dwarf galaxies. That community is aware of cosmology, but not as strongly coupled to it. Estimates of the primordial helium abundance have drifted upwards over time, corresponding to higher implied baryon densities. It’s as if confirmation bias is driving things towards the same result, but on a timescale that depends on the sociological pressure of the CMB imperative.

**Fig. 8** from Steigman (2012) *showing the history of primordial helium mass fraction (Y_P) determinations as a function of time.*

I am not accusing anyone of trying to obtain a particular result. Confirmation bias can be a lot more subtle than that. There is an entire field of study of it in psychology. We “humans actively sample evidence to support prior beliefs” – none of us are immune to it.

In this case, how we sample evidence depends on the field we’re active in. Lithium is measured in stars. One can have a productive career in stellar physics while entirely ignoring cosmology; it is the least likely to be perturbed by edicts from the CMB community. The inferred primordial lithium abundance has not budged over time.

What’s your confirmation bias?

I try not to succumb to confirmation bias, but I know that’s impossible. The best I can do is change my mind when confronted with new evidence. This is why I went from being sure that non-baryonic dark matter had to exist to taking seriously MOND as the theory that predicted what I observed.

I do try to look at things from all perspectives. Here, the CMB has been a roller coaster. Putting on an LCDM hat, the location of the first peak came in exactly where it was predicted: this was strong corroboration of a flat FLRW geometry. What does it mean in MOND? No idea – MOND doesn’t make a prediction about that. The amplitude of the second peak came in precisely as predicted for the case of no-CDM. This was corroboration of the ansatz inspired by MOND, and the strongest possible CMB-based hint that we might be barking up the wrong tree with LCDM.

As an exercise, I went back and maxed out the baryon density as it was known before the second peak was observed. We already thought we knew LCDM parameters well enough to do this. We couldn’t. The amplitude of the second peak came as a huge surprise to LCDM; everyone acknowledged that at the time (if pressed; many simply ignored it). Nowadays this is forgotten, or people have gaslit themselves into believing this was expected all along. It was not.

**Fig. 45** from Famaey & McGaugh (2012): *WMAP data are shown with the a priori prediction of no-CDM (blue line) and the most favorable prediction that could have been made ahead of time for LCDM (red line).*

From the perspective of no-CDM, we don’t really care whether deuterium or lithium hits closer to the right baryon density. All plausible baryon densities predict essentially the same A_1:2 amplitude ratio. Once we admit CDM as a possibility, then the second peak amplitude becomes very sensitive to the mix of CDM and baryons. From this perspective, the lithium-indicated baryon density is unacceptable. That’s why it is important to have a test that is independent of the CMB. Both deuterium and lithium provide that, but they disagree about the answer.

Once we broke BBN to fit the second peak in LCDM, we were admitting (if not to ourselves) that the a priori prediction of LCDM had failed. Everything after that is a fitting exercise. There are enough free parameters in LCDM to fit any plausible power spectrum. Cosmologists are fond of saying there are thousands of independent multipoles, but that overstates the case: it doesn’t matter how finely we sample the wave pattern, it matters what the wave pattern is. That is not as over-constrained as it is made to sound. LCDM is, nevertheless, an excellent fit to the CMB data; the test then is whether the parameters of this fit are consistent with independent measurements. It was until it wasn’t; that’s why we face all these tensions now.

Despite the success of the prediction of the second peak, no-CDM gets the third peak wrong. It does so in a way that is impossible to fix short of invoking new physics. We knew that had to happen at some level; empirically that level occurs at L = 600. After that, it becomes a fitting exercise, just as it is in LCDM – only now, one has to invent a new theory of gravity in which to make the fit. That seems like a lot to ask, so while it remained as a logical possibility, LCDM seemed the more plausible explanation for the CMB if not dynamical data. From this perspective, that A_1:2 came out bang on the value predicted by no-CDM must just be one heck of a cosmic fluke. That’s easy to accept if you were unaware of the prediction or scornful of its motivation; less so if you were the one who made it.

Either way, the CMB is now beyond our ability to predict. It has become a fitting exercise, the chief issue being what paradigm in which to fit it. In LCDM, the fit follows easily enough; the question is whether the result agrees with other data: are these tensions mere hiccups in the great tradition of observational cosmology? Or are they real, demanding some new physics?

The widespread attitude among cosmologists is that it will be impossible to fit the CMB in any way other than LCDM. That is a comforting thought (it has to be CDM!) and for a long time seemed reasonable. However, it has been contradicted by the success of Skordis & Zlosnik (2021) using AeST, which can fit the CMB as well as LCDM.

AeST is a very important demonstration that one does not need dark matter to fit the CMB. One does need other fields⁺⁺⁺, so now the reality of those has to be examined. Where this show stops, nobody knows.

I’ll close by noting that the uniqueness claimed by the LCDM fit to the CMB is a property more correctly attributed to MOND in galaxies. It is less obvious that this is true because it is always possible to fit a dark matter model to data once presented with the data. That’s not science, that’s fitting French curves. To succeed, a dark matter model must “look like” MOND. It obviously shouldn’t do that, so modelers refuse to go there, and we continue to spin our wheels and dig the rut of our field deeper.

Note added in proof, as it were: I’ve been meaning to write about this subject for a long time, but hadn’t, in part because I knew it would be long and arduous. Being deeply interested in the subject, I had to slap myself repeatedly to refrain from spending even more time updating the plots with publication date as an axis: nothing has changed, so that would serve only to feed my OCD. Even so, it has taken a long time to write, which I mention because I had completed the vast majority of this post before the IAU announced on May 15 that Cooke & Pettini have been awarded the Gruber prize for their precision deuterium abundance. This is excellent work (it is one of the deuterium points in the relevant plot above), and I’m glad to see this kind of hard, real-astronomy work recognized.

The award of a prize is a recognition of meritorious work but is not a guarantee that it is correct. So this does not alter any of the concerns that I express here, concerns that I’ve expressed for a long time. It does make my OCD feel obliged to comment at least a little on the relevant observations, which is itself considerably involved, but I will tack on some brief discussion below, after the footnotes.

*These methods were in agreement before they were in tension, e.g., Spergel et al. (2003) state: “The agreement between the HST Key Project value and our [WMAP CMB] value, h = 0.72 ±0.05, is striking, given that the two methods rely on different observables, different underlying physics, and different model assumptions.”

⁺Here I mean the abundance of the primary isotope of lithium, ⁷Li. There is a different problem involving the apparent overabundance of ⁶Li. I’m not talking about that here; I’m talking about the different baryon densities inferred separately from the abundances of D/H and ⁷Li/H.

^&By convention, X, Y, and Z are the mass fractions of hydrogen, helium, and everything else. Since the universe starts from a primordial abundance of X_p = 3/4 and Y_p = 1/4, and stars are seen to have approximately that composition plus a small sprinkling of everything else (for the sun, Z ≈ 0.02), and since iron lines are commonly measured in stars to trace Z, astronomers fell into the habit of calling Z the metallicity even though oxygen is the third most common element in the universe today (by both number and mass). Since everything in the periodic table that isn’t hydrogen and helium is a small fraction of the mass, all the heavier elements are often referred to collectively as metals despite the unintentional offense to chemistry.

^$The factor of h² appears because of the definition of the critical density ρ_c = (3H₀²)/(8πG): Ω_b = ρ_b/ρ_c. The physics cares about the actual density ρ_b but Ω_bh² = 0.02 is a lot more convenient to write than ρ_b,now = 3.75 × 10⁻³¹ g/cm³.
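The arithmetic in this footnote can be checked with a minimal sketch that evaluates ρ_c for h = 1 and scales it by Ω_bh² = 0.02. The constants (G and the megaparsec in cgs units) are standard rounded values supplied here as assumptions, not numbers taken from the text.

```python
import math

# Assumed physical constants in cgs units (standard values, rounded).
G_CGS = 6.674e-8          # gravitational constant, cm^3 g^-1 s^-2
CM_PER_MPC = 3.0857e24    # centimeters in one megaparsec


def critical_density(h=1.0):
    """Critical density rho_c = 3 H0^2 / (8 pi G) in g/cm^3 for H0 = 100 h km/s/Mpc."""
    h0_per_second = 100.0 * h * 1.0e5 / CM_PER_MPC   # convert km/s/Mpc to 1/s
    return 3.0 * h0_per_second**2 / (8.0 * math.pi * G_CGS)


def baryon_density(omega_b_h2):
    """Physical baryon density today; the h^2 cancels, so rho_c is evaluated at h = 1."""
    return omega_b_h2 * critical_density(h=1.0)


rho_b = baryon_density(0.02)
print(f"rho_c(h=1) = {critical_density():.3e} g/cm^3")
print(f"rho_b      = {rho_b:.2e} g/cm^3")
```

With these rounded constants the result lands within a fraction of a percent of the value quoted above; the last digit depends on the precision adopted for G and the megaparsec.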

^#I’ve worked on helium myself, but was never able to do better than Y_p = 0.25 ± 0.01. This corroborates the basic BBN picture, but does not suffice as a precise measure of the baryon density. To do that, one must obtain a result accurate to the third place of decimals, as discussed in the exquisite works of Kris Davidson, Bernie Pagel, Evan Skillman, and their collaborators. It’s hard to do for both observational reasons and because a wealth of subtle atomic physics effects come into play at that level of precision – helium has multiple lines; their parent population levels depend on the ionization mechanism, the plasma temperature, its density, and fluorescence effects as well as abundance.

**The value reported by Walker et al. was phrased as Ω_bh₅₀² = 0.05 ± 0.01, where h₅₀ = H₀/(50 km/s/Mpc); translating this to the more conventional h = H₀/(100 km/s/Mpc) decreases these numbers by a factor of four and leads to the impression of more significant digits than were claimed. It is interesting to consider the psychological effect of this numerology. For example, the modern CMB best-fit value in this phrasing is Ω_bh₅₀² = 0.09, four sigma higher than the value Known from the combined assessment of the light isotope abundances. That seems like a tension – not just involving lithium, but the CMB vs. all of BBN. Amusingly, the higher baryon density needed to obtain a CMB fit assuming LCDM is close to the threshold where we might have gotten away without the dynamical need (Ω_m > Ω_b) for non-baryonic dark matter that motivated non-baryonic dark matter in the first place. (For further perspective at a critical juncture in the development of the field, see Peebles 1999).
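The change of convention in this footnote is simple enough to sketch. Since h₅₀ = 2h, any value of Ω_bh₅₀² is four times the corresponding Ω_bh². The numbers below are the ones quoted in the footnote; the function and variable names are illustrative only.

```python
def h50_to_h100(value_h50):
    """Convert Omega_b * h50^2 to Omega_b * h^2; h50 = 2h, so h50^2 = 4 h^2."""
    return value_h50 / 4.0


bbn_h50, bbn_err_h50 = 0.05, 0.01   # Walker et al. value as quoted in the text
cmb_h50 = 0.09                      # modern CMB best fit as quoted in the text

bbn_h100 = h50_to_h100(bbn_h50)
bbn_err_h100 = h50_to_h100(bbn_err_h50)
cmb_h100 = h50_to_h100(cmb_h50)

# The offset in units of the BBN uncertainty is the same in either convention.
n_sigma = (cmb_h50 - bbn_h50) / bbn_err_h50

print(f"BBN: Omega_b h^2 = {bbn_h100:.4f} +/- {bbn_err_h100:.4f}")
print(f"CMB: Omega_b h^2 = {cmb_h100:.4f}")
print(f"offset = {n_sigma:.1f} sigma")
```

Dividing 0.05 ± 0.01 by four produces extra decimal places that were never measured, which is the false impression of precision the footnote describes.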

The use of h₅₀ itself is an example of the confirmation bias I’ve mentioned before as prevalent at the time, that Ω_m = 1 and H₀ = 50 km/s/Mpc. I would love to be able to do the experiment of sending the older cosmologists who are now certain of LCDM back in time to share the news with their younger selves who were then equally certain of SCDM. I suspect their younger selves would ask their older selves at what age they went insane, if they didn’t simply beat themselves up.

⁺⁺Craig Copi is a colleague here at CWRU, so I’ve asked him about the history of this. He seemed almost apologetic, since the current “right” baryon density from the CMB now is higher than his upper limit, but that’s what the data said at the time. The CMB gives a more accurate value only once you assume LCDM, so perhaps BBN was correct in the first place.

^&&Or succumbed to peer pressure, as that does happen. I didn’t witness it myself, so don’t know.

^$$The absolute amplitude of the no-CDM model is too high in a transparent universe. Part of the prediction of MOND is that reionization happens early, causing the universe to be a tiny bit opaque. This combination came out just right for τ = 0.17, which was the original WMAP measurement. It also happens to be consistent with the EDGES cosmic dawn signal and the growing body of evidence from JWST.

^##The second peak was unexpectedly small from the perspective of CDM; it was both natural and expected in no-CDM. At the time, it was computationally expensive to calculate power spectra, so people had pre-computed coarse grids within which to hunt for best fits. The range covered by the grids was informed by extant knowledge, of which BBN was only one element. From a dynamical perspective, Ω_m > 0.2 was adopted as a hard limit that imposed an edge in the grids of the time. There was no possibility of finding no-CDM as the best fit because it had been excluded as a possibility from the start.

***Spergel et al. (2003) also say “the best-fit Ω_bh² value for our fits is relatively insensitive to cosmological model and dataset combination as it depends primarily on the ratio of the first to second peak heights (Page et al. 2003b)” which is of course the basis of the prediction I made using the baryon density as it was Known at the time. They make no attempt to test that prediction, nor do they cite it.

⁺⁺⁺I’ve heard some people assert that this is dark matter by a different name, so is a success of the traditional dark matter picture rather than of modified gravity. That’s not at all correct. It’s just stage three in the list of reactions to surprising results identified by Louis Agassiz.

All of the figures below are from Cooke & Pettini (2018), which I employ here to briefly illustrate how D/H is measured. This is the level of detail I didn’t want to get into for deuterium, helium, or lithium, which are comparably involved.

First, here is a spectrum of the quasar they observe, Q1243+307. The quasar itself is not the object of interest here, though quasars are certainly interesting! Instead, we’re looking at the absorption lines along the line of sight; the quasar is being used as a spotlight to illuminate the gas between it and us.

**Figure 1.** Final combined and flux-calibrated spectrum of Q1243+307 (black histogram) shown with the corresponding error spectrum (blue histogram) and zero level (green dashed line). The red tick marks above the spectrum indicate the locations of the Lyman series absorption lines of the sub-DLA at redshift z_abs = 2.52564. Note the exquisite signal-to-noise ratio (S/N) of the combined spectrum, which varies from S/N ≃ 80 near the Lyα absorption line of the sub-DLA (∼4300 Å) to S/N ≃ 25 at the Lyman limit of the sub-DLA, near 3215 Å in the observed frame.

The big hump around 4330 Å is Lyman α emission from the quasar itself. Lyα is the n = 2 to 1 transition of hydrogen, Lyβ is the n = 3 to 1 transition, and so on. The rest frame wavelength of Lyα is far into the ultraviolet at 1216 Å; we see it redshifted to z = 2.558. The rest of the spectrum is continuum and emission lines from the quasar with absorption lines from stuff along the line of sight. Note that the red end of the spectrum at wavelengths longer than 4400 Å is mostly smooth with only the occasional absorption line. Blueward of 4300 Å, there is a huge jumble. This is not noise, this is the Lyα forest. Each of those lines is absorption from hydrogen in clouds at different distances, hence different redshifts, along the line of sight.
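The wavelengths mentioned here follow from λ_obs = λ_rest(1 + z). A minimal sketch, assuming standard rest wavelengths for Lyα and the Lyman limit (these two constants are supplied here, not quoted in the text):

```python
LYA_REST = 1215.67          # Lyman-alpha rest wavelength, Angstrom (assumed standard value)
LYMAN_LIMIT_REST = 911.75   # Lyman limit rest wavelength, Angstrom (assumed standard value)


def observed_wavelength(rest_wavelength, z):
    """Wavelength at which a line emitted or absorbed at redshift z is observed."""
    return rest_wavelength * (1.0 + z)


z_quasar = 2.558     # quasar emission redshift from the text
z_abs = 2.52564      # sub-DLA absorption redshift from the Figure 1 caption

lya_emission = observed_wavelength(LYA_REST, z_quasar)
lya_absorber = observed_wavelength(LYA_REST, z_abs)
limit_absorber = observed_wavelength(LYMAN_LIMIT_REST, z_abs)

print(f"Quasar Ly-alpha emission:  {lya_emission:.1f} A")
print(f"Sub-DLA Ly-alpha:          {lya_absorber:.1f} A")
print(f"Sub-DLA Lyman limit:       {limit_absorber:.1f} A")
```

The absorber sits at slightly lower redshift than the quasar, so its Lyα line falls just blueward of the emission hump, and its Lyman limit lands near the 3215 Å quoted in the caption.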

Most of the clouds in the Lyα forest are ephemeral. The cross section for Lyα is huge so it takes very little hydrogen to gobble it up. Most of these lines represent very low column densities of neutral hydrogen gas. Once in a while though, one encounters a higher column density cloud that has enough hydrogen to be completely opaque to Lyα. These are damped Lyα systems. In damped systems, one can often spot the higher order Lyman lines (these are marked in red in the figure). It also means that there is enough hydrogen present to have a shot at detecting the slightly shifted version of Lyα of deuterium. This is where the abundance ratio D/H is measured.
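How slight is the shift of the deuterium line? A rough sketch, assuming only that hydrogen-like transition wavelengths scale inversely with the reduced mass of the electron. The mass ratios and rest wavelength are rounded standard values supplied here as assumptions.

```python
C_KM_S = 299792.458            # speed of light, km/s
LYA_REST_H = 1215.67           # hydrogen Ly-alpha rest wavelength, Angstrom (assumed)
ME_OVER_MP = 1.0 / 1836.15     # electron-to-proton mass ratio (assumed, rounded)
ME_OVER_MD = 1.0 / 3670.48     # electron-to-deuteron mass ratio (assumed, rounded)
Z_ABS = 2.52564                # absorber redshift from the Figure 1 caption

# Wavelength scales as 1/(reduced mass), i.e. as (1 + m_e/M_nucleus),
# so lambda_D / lambda_H = (1 + me/md) / (1 + me/mp).
wavelength_ratio = (1.0 + ME_OVER_MD) / (1.0 + ME_OVER_MP)

velocity_shift = C_KM_S * (wavelength_ratio - 1.0)    # km/s; negative means blueward
rest_shift = LYA_REST_H * (wavelength_ratio - 1.0)    # Angstrom in the rest frame
observed_shift = rest_shift * (1.0 + Z_ABS)           # Angstrom in the observed frame

print(f"velocity shift: {velocity_shift:.1f} km/s")
print(f"rest-frame shift: {rest_shift:.3f} A")
print(f"observed shift at z_abs: {observed_shift:.2f} A")
```

The deuterium line sits roughly 82 km/s blueward of the hydrogen line, a little over one ångström in the observed frame here, so it is seen against the wing of the far stronger hydrogen absorption.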

To measure D/H, one has not only to detect the lines, but also to model and subtract the continuum. This is a tricky business in the best of times, but here its importance is magnified by the huge difference between the primary Lyα line which is so strong that it is completely black and the deuterium Lyα line which is incredibly weak. A small error in the continuum placement will not matter to the measurement of the absorption by the primary line, but it could make a huge difference to that of the weak line. I won’t even venture to discuss the nonlinear difference between these limits due to the curve of growth.
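The asymmetry between the strong and weak lines can be illustrated with a toy calculation. This is only a sketch of line-center depths under a hypothetical 1% continuum misplacement; it is not a Voigt profile fit, it ignores the curve of growth, and the numbers are made up for illustration.

```python
def inferred_depth(true_depth, continuum_error):
    """Absorption depth inferred when the continuum is misplaced by a fractional error.

    true_depth: fraction of the true continuum removed at line center (0 to 1).
    continuum_error: (adopted continuum / true continuum) - 1.
    """
    transmitted = 1.0 - true_depth                      # flux in units of the true continuum
    return 1.0 - transmitted / (1.0 + continuum_error)  # depth relative to adopted continuum


eps = 0.01   # hypothetical 1% continuum misplacement
for label, depth in [("saturated line (black core)", 1.0), ("weak line", 0.02)]:
    measured = inferred_depth(depth, eps)
    change = 100.0 * (measured / depth - 1.0)
    print(f"{label}: true depth {depth:.2f}, inferred {measured:.4f} ({change:+.1f}%)")
```

A line that is already black stays black no matter where the continuum is drawn, while a line that removes 2% of the flux changes by about half its depth for a 1% continuum shift.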

**Figure 2.** Lyα profile of the absorption system at *z_abs = 2.52564* toward the quasar Q1243+307 (black histogram) overlaid with the best-fitting model profile (red line), continuum (long dashed blue line), and zero-level (short dashed green line). The top panels show the raw, extracted counts scaled to the maximum value of the best-fitting continuum model. The bottom panels show the continuum normalized flux spectrum. The label provided in the top left corner of every panel indicates the source of the data. The blue points below each spectrum show the normalized fit residuals, (data–model)/error, of all pixels used in the analysis, and the gray band represents a confidence interval of ±2σ. The S/N is comparable between the two data sets at this wavelength range, but it is markedly different near the high order Lyman series lines (see Figures 4 and 5). The red tick marks above the spectra in the bottom panels show the absorption components associated with the main gas cloud (Components 2, 3, 4, 5, 6, 8, and 10 in Table 2), while the blue tick marks indicate the fitted blends. Note that some blends are also detected in Lyβ–Lyε.

The above examples look pretty good. The authors make the necessary correction for the varying spectral sensitivity of the instrument, and take great care to simultaneously fit the emission of the quasar and the absorption. I don’t think they’ve done anything wrong; indeed, it looks like they did everything right – just as the people measuring lithium in stars have.

Still, as an experienced spectroscopist, I find some subtle details that make me queasy. There are two independent observations, which is awesome, and the data look almost exactly the same, a triumph of repeatability. The fitted models are nearly identical, but if you look closely, you can see the model cuts slightly differently along the left edge of the damped absorption around 4278 Å in the two versions of the spectrum, and again along the continuum towards the right edge.

These differences are small, so hopefully don’t matter. But what is the continuum, really? The model line goes through the data, because what else could one possibly do? But there is so much Lyα absorption, is that really continuum? Should the continuum perhaps trace the upper envelope of the data? A physical effect that I worry about is that weak Lyα is so ubiquitous, we never see the true continuum but rather continuum minus a tiny bit of extraordinarily weak (Gunn-Peterson) absorption. If the true continuum from the quasar is just a little higher, then the primary hydrogen absorption is unaffected but the weak deuterium absorption would go up a little. That means slightly higher D/H, which means lower Ω_bh², which is the direction in which the measurement would need to move to come into closer agreement with lithium.

Is the D/H measurement in error? I don’t know. I certainly hope not, and I see no reason to think it is. I do worry that it could be. The continuum level is one thing that could go wrong; there are others. My point is merely that we shouldn’t assume it has to be lithium that is in error.

An important check is whether the measured D/H ratio depends on metallicity or column density. It does not. There is no variation with metallicity as measured by the logarithmic oxygen abundance relative to solar (left panel below). Nor does it appear to depend on the amount of hydrogen in the absorbing cloud (right panel). In the early days of this kind of work there appeared to be a correlation, raising the specter of a systematic. That is not indicated here.

**Figure 6.** Our sample of seven high precision D/H measures (symbols with error bars); the green symbol represents the new measure that we report here. The weighted mean value of these seven measures is shown by the red dashed and dotted lines, which represent the 68% and 95% confidence levels, respectively. The left and right panels show the dependence of D/H on the oxygen abundance and neutral hydrogen column density, respectively. Assuming the Standard Model of cosmology and particle physics, the right vertical axis of each panel shows the conversion from D/H to the universal baryon density. This conversion uses the Marcucci et al. (2016) theoretical determination of the d(p,γ)³He cross-section. The dark and light shaded bands correspond to the 68% and 95% confidence bounds on the baryon density derived from the CMB (Planck Collaboration et al. 2016).

I’ll close by noting that Ω_bh² from this D/H measurement is indeed in very good agreement with the best-fit Planck CMB value. The question remains whether the physics assumed by that fit, baryons+non-baryonic cold dark matter+dark energy in a strictly FLRW cosmology, is the correct assumption to make.

Some more persistent cosmic tensions

I set out last time to discuss some of the tensions that persist in afflicting cosmic concordance, but didn’t get past the Hubble tension. Since then, I’ve come across more of that, e.g., Boubel et al (2024a), who use a variant of Tully-Fisher to obtain H₀ = 73.3 ± 2.1(stat) ± 3.5(sys) km/s/Mpc. Having done that sort of work, their systematic uncertainty term seemed large to me. I then came across Scolnic et al. (2024) who trace this issue back to one apparently erroneous calibration amongst many, and correct the results to H₀ = 76.3 ± 2.1(stat) ± 1.5(sys) km/s/Mpc. Boubel is an author of the latter paper, so apparently agrees with this revision. Fortunately they didn’t go all Sandage-de Vaucouleurs on us, but even so, this provides a good example of how fraught this field can get. It also demonstrates the opportunity for confirmation bias, as the revised numbers are almost exactly what we find ourselves. (New results coming soon!)
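To see how much the systematic term matters, one can combine the statistical and systematic uncertainties. The sketch below adds them in quadrature, which is a common convention that assumes the two are independent; it is not necessarily how either paper would quote a total.

```python
import math


def total_uncertainty(stat, syst):
    """Combine statistical and systematic errors in quadrature (assumes independence)."""
    return math.hypot(stat, syst)


# (H0, stat, sys) in km/s/Mpc, as quoted in the text
measurements = {
    "Boubel et al. (2024a)": (73.3, 2.1, 3.5),
    "Scolnic et al. (2024)": (76.3, 2.1, 1.5),
}

totals = {
    name: total_uncertainty(stat, syst)
    for name, (_, stat, syst) in measurements.items()
}

for name, (h0, stat, syst) in measurements.items():
    print(f"{name}: H0 = {h0:.1f} +/- {totals[name]:.1f} km/s/Mpc")
```

Under that assumption the total uncertainty drops from about 4.1 to about 2.6 km/s/Mpc once the calibration is corrected.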

It’s a dang mess.

The Hubble tension is only the most prominent of many persistent tensions, so let’s wade into some of the rest.

The persistent tension in the amplitude of the power spectrum

The tension that cosmologists seem to stress about most after the Hubble tension is that in σ₈. σ₈ quantifies the amplitude of the power spectrum; it is a measure of the rms fluctuation in mass in spheres of 8 h⁻¹ Mpc. Historically, this scale was chosen because early work by Peebles & Yu (1970) indicated that this was the scale on which the rms contrast in galaxy numbers* is unity. This is also a handy dividing line between linear and nonlinear regimes. On much larger scales, the fluctuations are smaller (a giant sphere is closer to the average for the whole universe) so can be treated in the limit of linear perturbation theory. Individual galaxies are “small” by this standard, so can’t be treated⁺ so simply, which is the excuse many cosmologists use to run shrieking from discussing them.
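For concreteness, the rms fluctuation in spheres of radius R is conventionally computed from the power spectrum as σ²(R) = (1/2π²) ∫ k² P(k) W²(kR) dk, with W the Fourier transform of a spherical top-hat. The sketch below implements that integral with a deliberately crude, hypothetical P(k) (not a fit to any data) and normalizes it to σ₈ = 0.811, the Planck value discussed below. It shows only how the definition works and why larger spheres fluctuate less.

```python
import numpy as np


def tophat_window(x):
    """Fourier transform of a spherical top-hat, W(x) = 3 (sin x - x cos x) / x^3."""
    return 3.0 * (np.sin(x) - x * np.cos(x)) / x**3


def toy_power_spectrum(k, k_turn=0.02):
    """Hypothetical P(k): rises as k, turns over near k_turn, falls as k^-3. Not a fit."""
    return k / (1.0 + (k / k_turn) ** 2) ** 2


def sigma_r(radius, amplitude=1.0):
    """rms mass fluctuation in spheres of the given radius (h^-1 Mpc)."""
    k = np.logspace(-4, 2, 4000)   # wavenumber grid, h/Mpc
    integrand = amplitude * toy_power_spectrum(k) * k**3 * tophat_window(k * radius) ** 2
    integrand /= 2.0 * np.pi**2
    ln_k = np.log(k)
    # Trapezoid rule in ln k: sigma^2 = integral of k^3 P(k) W^2 / (2 pi^2) dln k
    variance = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(ln_k))
    return np.sqrt(variance)


# Normalize the toy spectrum so that sigma(8 h^-1 Mpc) matches the chosen sigma_8.
target_sigma8 = 0.811
amplitude = (target_sigma8 / sigma_r(8.0)) ** 2

for radius in (8.0, 80.0):
    print(f"R = {radius:5.1f} h^-1 Mpc: sigma = {sigma_r(radius, amplitude):.3f}")
```

The value printed for the larger sphere depends entirely on the toy spectrum, so only the trend should be taken from it.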

As we progressed from wrapping our heads around an expanding universe to quantifying the large scale structure (LSS) therein, the power spectrum statistically describing LSS became part of the canonical set of cosmological parameters. I don’t myself consider it to be on par with the Big Two, the Hubble constant H₀ and the density parameter Ω_m, but many cosmologists do seem partial to it despite the lack of phase information. Consequently, any tension in the amplitude σ₈ garners attention.

The tension in σ₈ has been persistent insofar as I recall debates in the previous century where some kinds of data indicated σ₈ ~ 0.5 while other data preferred σ₈ ~ 1. Some of that tension was in underlying assumptions (SCDM before LCDM). Today, the difference is [mostly] between the Planck best-fit amplitude σ₈ = 0.811 ± 0.006 and various local measurements that typically yield 0.7something. For example, Karim et al. (2024) find low σ₈ for emission line galaxies, even after specifically pursuing corrections in a necessary dust model that pushed things in the right direction:

**Fig. 16** from Karim et al. (2024): *Estimates of σ₈ from emission line galaxies (red and blue), luminous red galaxies (grey), and Planck (green).*

As with so many cosmic parameters, there is degeneracy, in this case between σ₈ and Ω_m. Physically this happens because you get more power when you have more stuff (Ω_m), but the different tracers are sensitive to it in different ways. Indeed, if I put on a cosmology hat, I personally am not too worried about this tension – emission line galaxies are typically lower mass than luminous red galaxies, so one expects that there may be a difference in these populations. The Planck value is clearly offset from both, but doesn’t seem too far afield. We wouldn’t fret at all if it weren’t for Planck’s damnably small error bars.

This tension is also evident as a function of redshift. Here are measures of the combination of parameters fσ₈ = Ω_m(z)^γσ₈ measured and compiled by Boubel et al (2024b):

**Fig. 16** from Boubel et al (2024b). *LCDM* matches the data for σ₈ = 0.74 (green line); the purple line is the expectation from Planck (σ₈ = 0.81). The inset shows the error ellipse, which is clearly offset from the Planck value (crossed lines), particularly for the GR^& value of γ = 0.55.

The line representing the Planck value σ₈ = 0.81 overshoots most of the low redshift data, particularly those with the smallest uncertainties. The green line has σ₈ = 0.74, so is a tad lower than Planck in the same sense as other low redshift measures. Again, the offset is modest, but it does look significant. The tension is persistent but not a show-stopper, so we generally shrug our shoulders and proceed as if it will inevitably work out.
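The two curves in the figure can be approximated with the usual growth-index parameterization, f = Ω_m(z)^γ, with σ₈(z) scaled by the linear growth factor obtained from dlnD/dlna = f. The sketch below assumes flat LCDM with Ω_m = 0.315 (the Planck value quoted later in the text) and γ = 0.55. It is not a reproduction of the analysis in Boubel et al., only an illustration of how the two amplitudes propagate.

```python
import numpy as np


def omega_m_z(z, omega_m0):
    """Matter density parameter at redshift z in flat LCDM (radiation neglected)."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + 1.0 - omega_m0)


def f_sigma8(z, sigma8_0, omega_m0=0.315, gamma=0.55, n_steps=2000):
    """f * sigma8 at redshift z, with growth rate f = Omega_m(z)^gamma."""
    # Growth factor ratio from dlnD/dln a = f:
    # D(z)/D(0) = exp(-integral_0^z f(z') / (1 + z') dz')
    zs = np.linspace(0.0, z, n_steps)
    integrand = omega_m_z(zs, omega_m0) ** gamma / (1.0 + zs)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(zs))
    growth_ratio = np.exp(-integral)
    return omega_m_z(z, omega_m0) ** gamma * sigma8_0 * growth_ratio


for z in (0.0, 0.3, 0.6, 1.0):
    low = f_sigma8(z, 0.74)
    planck = f_sigma8(z, 0.81)
    print(f"z = {z:.1f}: f*sigma8 = {low:.3f} (sigma8 = 0.74), {planck:.3f} (sigma8 = 0.81)")
```

At fixed Ω_m and γ the two curves differ by the constant factor 0.74/0.81 at every redshift, so the offset between them is purely one of amplitude.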

The persistent tension in the cosmic mass density

A persistent tension that nobody seems to worry about is that in the density parameter Ω_m. Fits to the Planck CMB acoustic power spectrum currently peg Ω_m = 0.315±0.007, but as we’ve seen before, this covaries with the Hubble constant. Twenty years ago, WMAP indicated Ω_m = 0.24 and H₀ = 73, in good agreement with the concordance region of other measurements, both then and now. As with H₀, the tension is posed by the itty bitty uncertainties on the Planck fit.

Experienced cosmologists may be inclined to scoff at such tiny error bars. I was, so I’ve confirmed them myself. There is very little wiggle room to match the Planck data within the framework of the LCDM model. I emphasize that last bit because it is an assumption now so deeply ingrained that it is usually left unspoken. If we leave that part out, then the obvious interpretation is that Planck is correct and all measurements that disagree with it must suffer from some systematic error. This seems to be what most cosmologists believe at present. If we don’t leave that part out, perhaps because we’re aware of other possibilities so are not willing to grant this assumption, then the various tensions look like failures of a model that’s already broken. But let’s not go there today, and stay within the conventional framework.

There are lots of ways to estimate the gravitating mass density of the universe. Indeed, it was the persistent, early observation that the mass density Ω_m exceeded that in baryons, Ω_b, from big bang nucleosynthesis that got the non-baryonic dark matter show on the road: there appears to be something out there gravitating that’s not normal matter. This was the key observation that launched non-baryonic cold dark matter: if Ω_m > Ω_b, there has^% to be some kind of particle that is non-baryonic.

So what is Ω_m? Most estimates have spanned the range 0.2 < Ω_m < 0.4. In the 1980s and into the 1990s, this seemed close enough to Ω_m = 1, by the standards of cosmology, that most Inflationary cosmologists presumed it would work out to what Inflation predicted, Ω_m = 1 exactly. Indeed, I remember that community directing some rather vicious tongue-lashings at observers, castigating them to look harder: you will surely get Ω_m = 1 if you do it right, you fools. But despite the occasional claim to get this “right” answer, the vast majority of the evidence never pointed that way. As I’ve related before, an important step on the path to LCDM – probably the most important step – was convincing everyone that really Ω_m < 1.

Discerning between Ω_m = 0.2 and 0.3 is a lot more challenging than determining that Ω_m < 1, so we tend to treat either as acceptable. That’s not really fair in this age of precision cosmology. There are far too many estimates of the mass density to review here, so I’ll just note a couple of discrepant examples while also acknowledging that it is easy to find dynamical estimates that agree with Planck.

To give a specific example, Mohayaee & Tully (2005) obtained Ω_m = 0.22 ± 0.02 by looking at peculiar velocities in the local universe. This was consistent with other constraints at the time, including WMAP, but is 4.5σ from the current Planck value. That’s not quite the 5σ we arbitrarily define to be an undeniable difference, but it’s plenty significant.
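The 4.5σ figure follows from dividing the difference between the two values by their combined uncertainty. A minimal sketch, assuming the two errors are independent and Gaussian so that they add in quadrature:

```python
import math


def tension_sigma(value_a, err_a, value_b, err_b):
    """Difference between two measurements in units of their combined uncertainty."""
    return abs(value_a - value_b) / math.hypot(err_a, err_b)


local = (0.22, 0.02)      # Mohayaee & Tully (2005), as quoted in the text
planck = (0.315, 0.007)   # Planck fit, as quoted in the text

n_sigma = tension_sigma(*local, *planck)
print(f"Omega_m tension: {n_sigma:.1f} sigma")
```

The combined uncertainty is dominated by the local measurement; the Planck error bar barely changes the answer.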

There have of course been other efforts to do this, and many of them lead to the same result, or sometimes even lower Ω_m. For example, Shaya et al. (2022) use the Numerical Action Method developed by Peebles to attempt to work out the motions of nearly 10,000 galaxies – not just their Hubble expansion, but their individual trajectories under the mutual influence of each other’s gravity and whatever else may be out there. The resulting deviations from a pure Hubble flow depend on how much mass is associated with each galaxy and whatever other density there is to perturb things.

**Fig. 4** from Shaya et al (2022): The gravitating mass density as a function of scale. After some local variations (hello Virgo cluster!), the data converge to Ω_m = 0.12. Reaching Ω_m = 0.24 requires an equal, additional amount of mass in “interhalo matter.” Even more mass would be required to reach the Planck value (red line added to original figure).

This result is in even greater tension with Planck than the earlier work by Mohayaee & Tully (2005). I find the need to invoke interhalo matter disturbing, since it acts as a pedestal in their analysis: extra mass density that is uniform everywhere. This is necessary so that it contributes to the global mass density Ω_m but does not contribute to perturbing the Hubble flow.

One can imagine mass that is uniformly distributed easily enough, but what bugs me is that dark matter should not do this. There is no magic segregation between dark matter that forms into halos that contain galaxies and dark matter that just hangs out in the intergalactic medium and declines to participate in any gravitational dynamics. That’s not an option available to it: if it gravitates, it should clump. To pull this off, we’d need to live in a universe made of two distinct kinds of dark matter: cold dark matter that clumps and a fluid that gravitates globally but does not clump, sort of an anti-dark energy.

Alternatively, we might live in an underdense region such that the local Ω_m is less than the global Ω_m. This is an idea that comes and goes for one reason or another, but it has always been hard to sustain. The convergence to low Ω_m looks pretty steady out to ~100 Mpc in the plot above; that’s a pretty big hole. Recall the non-linearity scale discussed above; this scale is a factor of ten larger so over/under-densities should typically be ±10%. This one is -60%, so I guess we’d have to accept that we’re not Copernican observers after all.
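The size of the required hole is a one-line calculation. The sketch below takes the local value from Shaya et al. and the Planck value as the global one; the ±10% typical fluctuation is the rough expectation stated above, not a computed quantity.

```python
def density_contrast(local_omega_m, global_omega_m):
    """Fractional over- or under-density of the local region relative to the global mean."""
    return local_omega_m / global_omega_m - 1.0


local_omega_m = 0.12        # converged value from Shaya et al. (2022), as quoted
planck_omega_m = 0.315      # Planck fit, as quoted
typical_fluctuation = 0.10  # rough expectation on this scale, per the text

delta = density_contrast(local_omega_m, planck_omega_m)
print(f"local density contrast: {100.0 * delta:+.0f}%")
print(f"ratio to typical fluctuation: {abs(delta) / typical_fluctuation:.1f}")
```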

The persistent tension in bulk flows

Once we get past the basic Hubble expansion, individual galaxies each have their own peculiar motion, and beyond that we have bulk flows. These have been around a long time. We obsessed a lot about them for a while with discoveries like the Great Attractor. It was weird; I remember some pundits talking about “plate tectonics” in the universe, like there were giant continents of galaxy superclusters wandering around in random directions relative to the frame of the microwave background. Many of us, including me, couldn’t grok this, so we chose not to sweat it.

There is no single problem posed by bulk flows^, and of course you can find those who argue they pose no problem at all. We are in motion relative to the cosmic (CMB) frame^$, but that’s just our Milky Way’s peculiar motion. The strange fact is that it’s not just us; the entirety of the local universe seems to have an unexpected peculiar motion. There are lots of ways to quantify this; here’s a summary table from Courtois et al (2025):

**Table 1** from Courtois et al (2025): *various attempts to measure the scale of dynamical homogeneity.*

As we look to large scales, we expect the universe to converge to homogeneity – that’s the Cosmological Principle, which is one of those assumptions that is so fundamental that we forget we made it. The same holds for dynamics – as we look to large scales, we expect the peculiar motions to average out, and converge to a pure Hubble flow. The table above summarizes our efforts to measure the scale on which this happens – or doesn’t. It also shows what we expect on the second line, “predicted LCDM,” where you can see the expected convergence in the declining bulk velocities as the scale probed increases. The third line is for “cosmic variance;” when you see these words it usually means something is amiss so in addition to the usual uncertainties we’re going to entertain the possibility that we live in an abnormal universe.

Like most people, I was comfortably ignoring this issue until recently, when we had a visit and a talk from one of the protagonists listed above, Richard Watkins (W23). One of the problems that challenge this sort of work is the need for a large sample of galaxies with complete sky coverage. That’s observationally challenging to obtain. Real data are heterogeneous; treating this properly demands a more sophisticated treatment than the usual top-hat or Gaussian approaches. Watkins described in detail what a better way could be, and patiently endured the many questions my colleagues and I peppered him with. This is hard to do right, which gives aid and comfort to the inclination to ignore it. After hearing his talk, I don’t think we should do that.

Panel from **Fig. 7** of Watkins et al. (2023): The magnitude of the bulk flow as a function of scale. The green points are the data and the red dashed line is the expectation of LCDM. The blue dotted line is an estimate of known systematic effects.

The data do not converge with increasing scale as expected. It isn’t just the local space density Ω_m that’s weird, it’s also the way in which things move. And “local” isn’t at all small here, with the effect persisting out beyond 300 Mpc for any plausible h = H₀/100.
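To make the idea of a bulk flow concrete: only the line-of-sight component of each galaxy's peculiar velocity is observable, so the simplest estimator fits a single vector B to the radial velocities via v_r = B·r̂. The sketch below does this by ordinary least squares on a mock catalog. Everything in it is hypothetical (the catalog, the input flow, the scatter), and it is far cruder than the estimator Watkins et al. actually use, which is designed for heterogeneous data with incomplete sky coverage.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical mock catalog: random directions with uniform full-sky coverage.
n_galaxies = 2000
directions = rng.normal(size=(n_galaxies, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

true_bulk_flow = np.array([300.0, -200.0, 100.0])   # km/s, made up for the demo
noise = rng.normal(scale=300.0, size=n_galaxies)    # km/s, assumed per-galaxy scatter

# Only the line-of-sight component of each peculiar velocity is observable.
radial_velocity = directions @ true_bulk_flow + noise

# Least-squares solution of radial_velocity = directions @ bulk_flow
bulk_flow, *_ = np.linalg.lstsq(directions, radial_velocity, rcond=None)

print("recovered bulk flow (km/s):", np.round(bulk_flow, 1))
print("magnitude (km/s):", round(float(np.linalg.norm(bulk_flow)), 1))
```

With uniform sky coverage the fit recovers the input vector to within the noise. With real, patchy coverage the components become correlated and the simple fit can be biased, which is why the more careful treatment matters.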

This is formally a highly significant result, with the authors noting that “the probability of observing a bulk flow [this] large … is small, only about 0.015 per cent.” Looking at the figure above, I’d say that’s a fairly conservative statement. A more colloquial way of putting it would be “no way we gonna reconcile this!” That said, one always has to worry about systematics. They’ve made every effort to account for these, but there can always be unknown unknowns.

Mapping the Universe

It is only possible to talk about these things thanks to decades of effort to map the universe. One has to survey a large area of sky to identify galaxies in the first place, then do follow-up work to obtain redshifts from spectra. This has become big business, but to do what we’ve just been talking about, it is further necessary to separate peculiar velocities from the Hubble flow. To do that, we need to estimate distances by some redshift-independent method, like Tully-Fisher. Tully has been doing this his entire career, with the largest and most recent data product being Cosmicflows-4. Such data reveal not only large bulk flows, but extensive structure in velocity space:
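Separating the peculiar velocity from the Hubble flow amounts to subtracting H₀D from the observed recession velocity, which is why a redshift-independent distance is needed. A minimal sketch in the low-redshift approximation, using a hypothetical galaxy; the value H₀ = 67.4 is the Planck fit, supplied here as an assumption since the text does not quote it.

```python
def peculiar_velocity(cz, distance_mpc, h0):
    """Line-of-sight peculiar velocity (km/s) in the low-redshift approximation."""
    return cz - h0 * distance_mpc


cz = 7300.0        # hypothetical observed recession velocity, km/s
distance = 100.0   # hypothetical redshift-independent distance, Mpc

for h0 in (67.4, 73.0):
    v_pec = peculiar_velocity(cz, distance, h0)
    print(f"H0 = {h0:.1f} km/s/Mpc: peculiar velocity = {v_pec:+.0f} km/s")
```

The same galaxy is either at rest in the Hubble flow or moving at several hundred km/s depending on the adopted H₀. A distance error has the same effect, so the inferred flows are only as good as the distance scale behind them.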

The Laniakea supercluster of galaxies (Tully et al. 2014).

We have a long way to go to wrap our heads around all of this.

Persistent tensions persist

I’ve discussed a few of the tensions that persist in cosmic data. Whether these are mere puzzles or a mounting pile of anomalies is a matter of judgement. They’ve been around for a while, so it isn’t fair to suggest that all of the data are consistent with LCDM. Nevertheless, I hear exactly this asserted with considerable frequency. It’s as if the definition of all is perpetually shrinking to include only the data that meet the consistency criterion. Yet it’s the discrepant bits that are interesting for containing new information; we need to grapple with them if the field is to progress.

*This was well before my time, so I am probably getting some aspect of the history wrong or oversimplifying it in some gross way. Crudely speaking, if you randomly plop down spheres of this size, some will be found to contain the cosmic average number of galaxies, some twice that, some half that. That the modern value of σ₈ is close to unity means that Peebles got it basically right with the data that were available back then and that galaxy light very nearly traces mass, which is not guaranteed in a universe dominated by dark matter.

⁺It amazes me how pervasively “galaxies are complicated” is used as an excuse⁺⁺ to ignore all small scale evidence.

Not all of us are limited to working on the simplest systems. In this case, it doesn’t matter. The LCDM prediction here is that galaxies should be complicated because they are nonlinear. But the observation is that they are simple – so simple that they obey a single effective force law. That’s the contradiction right there, regardless of what flavor of complicated might come out of some high resolution simulation.

⁺⁺At one KITP conference I attended, a particle-cosmologist said during a discussion session, in all seriousness and with a straight face, “We should stop talking about rotation curves.” Because scientific truth is best revealed by ignoring the inconvenient bits. David Merritt remarked on this in his book A Philosophical Approach to MOND. He surveyed the available cosmology textbooks, and found that not a single one of them mentioned the acceleration scale in the data. I guess that would go some way to explaining why statements of basic observational facts are often met with stunned silence. What’s obvious and well-established to me is a wellspring of fresh if incredible news to them. I’d probably give them the stink-eye about the cosmological constant if I hadn’t been paying the slightest attention to cosmology for the past thirty years.

^&There is an elegant approach to parameterizing the growth of structure in theories that deviate modestly from GR. In this context, such theories are usually invoked as an alternative to dark energy, because it is socially acceptable to modify GR to explain dark energy but not dark matter. The curious hysteresis of that strange and seemingly self-contradictory attitude aside, this approach cannot be adapted to MOND because it assumes linearity while MOND is inherently nonlinear. My very crude, back-of-the-envelope expectation for MOND is very nearly constant γ ~ 0.4 (depending on the scale probed) out to high redshift. The bend we see in the conventional models around z ~ 0.6 will occur at z > 2 (and probably much higher) because structure forms fast in MOND. It is annoyingly difficult to put a more precise redshift on this prediction because it also depends on the unknown metric. So this is more of a hunch than a quantitative prediction. Still, it will be interesting to see if roughly constant fσ₈ persists to higher redshift.

^%The inference that non-baryonic dark matter has to exist assumes that gravity is normal in the sense taught to us by Newton and Einstein. If some other theory of gravity applies, then one has to reassess the data in that context. This is one of the first considerations I made of MOND in the cosmological context, finding Ω_m ≈ Ω_b.

^MOND is effective at generating large bulk flows.

^$Fun fact: you can type the name of a galaxy into NED (the NASA Extragalactic Database) and it will give you lots of information, including its recession velocity referenced to a variety of frames of reference and the corresponding distance from the Hubble law V = H₀D. Naively, you might think that the obvious choice of reference frame is the CMB. You’d be wrong. If you use this, you will get the wrong distance to the galaxy. Of all the choices available there, it consistently performs the worst as adjudicated by direct distance measurements (e.g., Cepheids).

NED used to provide a menu of choices for the value of H₀ to use. It says something about the social-tyranny of precision cosmology that it now defaults to the Planck value. If you use this, you will get the wrong distance to the galaxy. Even if the Planck H₀ turns out to be correct in some global sense, it does not work for real galaxies that are relatively near to us. That’s what it means to have all the “local” measurements based on direct distance measurements (e.g., Cepheids) consistently give a larger H₀.
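To make the size of the effect concrete, here is a minimal sketch of the Hubble-law distance V = H₀D evaluated for two values of H₀: the Planck best fit of 67.4 and a local value of 73 km/s/Mpc. The recession velocity of 1000 km/s is a hypothetical round number chosen for illustration, not a measurement of any particular galaxy, and the function name is likewise just illustrative.

```python
def hubble_distance_mpc(velocity_km_s, h0_km_s_mpc):
    """Distance in Mpc implied by the Hubble law V = H0 * D."""
    return velocity_km_s / h0_km_s_mpc


# Hypothetical recession velocity, chosen only for illustration.
velocity = 1000.0  # km/s

for label, h0 in [("Planck CMB fit", 67.4), ("local calibration", 73.0)]:
    distance = hubble_distance_mpc(velocity, h0)
    print(f"H0 = {h0:5.1f} ({label}): D = {distance:.2f} Mpc")
```

Since D scales as 1/H₀, the smaller Planck value always returns the larger distance for the same velocity, by the ratio 73/67.4, or about 8%.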

*Galaxies in the local universe are closer than they appear.* Photo by P.S. Pratheep, www.pratheep.com

Some persistent cosmic tensions

I took the occasion of the NEIU debate to refresh my knowledge of the status of some of the persistent tensions in cosmology. There wasn’t enough time to discuss those, so I thought I’d go through a few of them here. These issues tend to get downplayed or outright ignored when we hype LCDM’s successes.

When I teach cosmology, I like to have the students do a project in which they each track down a measurement of some cosmic parameter, and then report back on it. The idea, when I started doing this back in 1999, was to combine the different lines of evidence to see if we reach a consistent concordance cosmology. Below is an example from the 2002 graduate course at the University of Maryland. Does it all hang together? I ask the students to debate the pros and cons of the various lines of evidence.

The mass density parameter Ω_m = ρ_m/ρ_crit and the Hubble parameter h = H₀/(100 km/s/Mpc) from various constraints (colored lines) available in 2002. I later added the first (2003) WMAP result (box). The combination of results excludes the grey region; only the white portion is viable: this is the concordance region.

The concordance cosmology is the small portion of this diagram that was not ruled out. This is the way in which LCDM was established. Before we had either the CMB acoustic power spectrum or Type Ia supernovae, LCDM was pretty much a done deal based on a wide array of other astronomical evidence. It was the subsequent^α agreement of the Type Ia SN and the CMB that cemented the picture in place.

The implicit assumption in this approach is that we have identified the correct cosmology by process of elimination: whatever is left over must be the right answer. But what if nothing is left over?

I have long worried that we’ve painted ourselves into a corner: maybe the concordance window is merely the least unlikely spot before everything is excluded. Excluding everything would effectively falsify LCDM cosmology, if not the more basic picture of an expanding universe^% emerging from a hot big bang. Once one permits oneself to think this way, then it occurs to one that perhaps the reason we have to invoke the twin tooth fairies of dark matter and dark energy is to get FLRW to approximate some deeper, underlying theory.

Most cosmologists do not appear to contemplate this frightening scenario. And indeed, before we believe something so drastic, we have to have thoroughly debunked the standard picture – something rather difficult to do when 95% of it is invisible. It also means believing all the constraints that call the standard picture into question (hence why contradictory results experience considerably more scrutiny* than conforming results). The fact is that some results are more robust than others. The trick is deciding which to trust.^{^}

In the diagram above, the range of Ω_m from cluster mass-to-light ratios comes from some particular paper. There are hundreds of papers on this topic, if not thousands. I do not recall which one this particular illustration came from, but most of the estimates I’ve seen from the same method come in somewhat higher. So if we slide those green lines up, the allowed concordance window gets larger.

The practice of modern cosmology has necessarily been an exercise in judgement: which lines of evidence should we most trust? For example, there is a line up there for rotation curves. That was my effort to ask what combination of cosmological parameters led to dark matter halo densities that were tolerable to the rotation curve data of the time. Dense cosmologies give birth to dense dark matter halos, so everything above that line was excluded because those parameters cram too much dark matter into too little space. This was a pretty conservative limit at the time, but it is predicated on the insistence of theorists that dark matter halos had to have the NFW form predicted by dark matter-only simulations. Since that time, simulations including baryons have found any number of ways to alter the initial cusp. This in turn means that the constraint no longer applies as the halo might have been altered from its original, cosmology-predicted initial form. Whether the mechanisms that might cause such alterations are themselves viable becomes a separate question.

If we believed all of the available constraints, then there is no window left and FLRW is already ruled out. But not all of those data are correct, and some contradict each other, even absent the assumption of FLRW. So which do we believe? Finding one’s path in this field is like traipsing through an intellectual mine field full of hardened positions occupied by troops dedicated to this or that combination of parameters.

It is in every way an invitation to confirmation bias. The answer we get depends on how we weigh disparate lines of evidence. We are prone to give greater weight to lines of evidence that conform to our pre-established⁺ beliefs.

So, with that warning, let’s plunge ahead.

The modern Hubble tension

Gone but not yet forgotten are the Hubble wars between camps Sandage (H₀ = 50!) and de Vaucouleurs (H₀ = 100!). These were largely resolved early this century thanks to the Hubble Space Telescope Key Project on the distance scale. Obtaining this measurement was the major motivation to launch HST in the first place. Finally, this long standing argument was resolved: nearly everyone agreed that H₀ = 72 km/s/Mpc.

That agreement was long-lived by the standards of cosmology, but did not last forever. Here is an illustration of the time dependence of H₀ measurements this century, from Freedman (2021):

There are many illustrations like this; I choose this one because it looks great and seems to have become the go-to for illustrating the situation. Indeed, it seems to inform the attitude of many scientists close to but not directly involved in the H₀ debate. They seem to perceive this as a debate between Adam Riess and Wendy Freedman, who have become associated with the Cepheid and TRGB^$ calibrations, respectively. This is a gross oversimplification, as they are not the only actors on a very big stage^&. Even in this plot, the first Cepheid point is from Freedman’s HST Key Project. But this apparent dichotomy between calibrators and people seems to be how the subject is perceived by scientists who have neither time nor reason for closer scrutiny. Let’s scrutinize.

Fits to the acoustic power spectrum of the CMB agreed with astronomical measurements of H₀ for the first decade of the century. Concordance was confirmed. The current tension appeared with the first CMB data from Planck. Suddenly the grey band of the CMB best-fit no longer overlapped with the blue band of astronomical measurements. This came as a shock. Then a new (red) band appears, distinguishing the “local” H₀ calibrated by the TRGB from that calibrated by Cepheids.

I think I mentioned that cosmology was an invitation to confirmation bias. If you put a lot of weight on CMB fits, as many cosmologists do, then it makes sense from that perspective that the TRGB measurement is the correct one and the Cepheid H₀ must be wrong. This is easy to imagine given the history of systematic errors that plagued the subject throughout the twentieth century. This confirmation bias makes one inclined to give more credence to the new^# TRGB calibration, which is only in modest tension with the CMB value. The narrative is then simplified to two astronomical methods that are subject to systematic uncertainty: one that agrees with the right answer and one that does not. Ergo, the Cepheid H₀ is in systematic error.

This narrative oversimplifies the matter to the point of being actively misleading, and the plot above abets this by focusing on only two of the many local measurements. There is no perfect way to do this, but I had a go at it last year. In the plot below, I cobbled together all the data I could without going ridiculously far back, but chose to show only one point per independent group, the most recent one available from each, the idea being that the same people don’t get new votes every time they tweak their result – that’s basically what is illustrated above. The most recent points from above are labeled Cepheids & TRGB (the date of the TRGB goes to the full Chicago-Carnegie paper, not Freedman’s summary paper where the above plot can be found). See McGaugh (2024) for the references.

When I first made this plot, I discovered that many measurements of the Hubble constant are not all that precise: the plot was an indecipherable forest of error bars. So I chose to make a cut at a statistical uncertainty of 3 km/s/Mpc: worse than that, the data are shown as open symbols sans error bars; better than that, the datum gets explicit illustration of both its statistical and systematic uncertainty. One could make other choices, but the point is that this choice paints a different picture from the choice made above. One of these local measurements is not like the others, inviting a different version of confirmation bias: the TRGB point is the outlier, so perhaps it is the one that is wrong.

*Recent measurements of the Hubble constant (left) and the calibration of the baryonic Tully-Fisher relation (right) underpinning one of those measurements.*

I highlight the measurement our group made not to note that we’ve done this too so much as to highlight an underappreciated aspect of the apparent tension between Cepheid and TRGB calibrations. There are 50 galaxies that calibrate the baryonic Tully-Fisher relation, split nearly evenly between galaxies whose distance is known through Cepheids (blue points) and TRGB (red points). They give the same answer. There is no tension between Cepheids and the TRGB here.

Chasing this up, it appears to me that what happened was that Freedman’s group reanalyzed the data that calibrate the TRGB, and wound up with a slightly different answer. This difference does not appear to be in the calibration equation (the absolute magnitude of the tip of the red giant branch didn’t change that much), but in something to do with how the tip magnitude is extracted. Maybe, I guess? I couldn’t follow it all the way, and I got bad vibes reminding me of when I tried to sort through Sandage’s many corrections in the early ’90s. That doesn’t make it wrong, but the point is that the discrepancy is not between Cepheids and TRGB calibrations so much as it is between the TRGB as implemented by Freedman’s group and the TRGB as implemented by others. The depiction of the local Hubble constant debate as being between Cepheid and TRGB calibrations is not just misleading, it is wrong.

Can we get away from Cepheids and the TRGB entirely? Yes. The black points above are for megamasers and gravitational lensing. These are geometric methods that do not require intermediate calibrators like Cepheids at all. It’s straight trigonometry. Both indicate H₀ > 70. Which way is our confirmation bias leaning now?

The way these things are presented has an impact on scientific consensus. A fascinating experiment on this has been done in a recent conference report. Sometimes people poll conference attendees in an attempt to gauge consensus; this report surveys conference attendees “to take a snapshot of the attitudes of physicists working on some of the most pressing questions in modern physics.” One of the topics queried is the Hubble tension. Survey says:

*Table XII from arXiv:2503.15776 in which scientists at the 2024 conference* Black Holes Inside and Out vote on their opinion about the most likely solution of the Hubble tension.

First, a shout out to the 1/4 of scientists who expressed no opinion. That’s the proper thing to do when you’re not close enough to a subject to make a well-informed judgement. Whether one knows enough to do this is itself a judgement call, and we often let our arrogance override our reluctance to over-share ill-informed opinions.

Second, a shout out to the folks who did the poll for including a line for systematics in the CMB. That is a logical possibility, even if only 3 of the 72 participants took it seriously. This corroborates the impression I have that most physicists seem to think the CMB is perfect, like some kind of holy scripture written in fire on the primordial sky, so must be correct and cannot be questioned, amen. That’s silly; systematics are always a possibility in any observation of the sky. In the case of the CMB, I suspect it is not some instrumental systematic but the underlying assumption of LCDM FLRW that is the issue; once one assumes that, then indeed, the best fit to the Planck data as published is H₀ = 67.4, with H₀ > 68 being right out. (I’ve checked.)

A red flag that the CMB is where the problem lies is the systematic variation of the best-fit parameters along the trench of minimum χ²:

*The time evolution of best-fit CMB cosmology parameters. These have steadily drifted away from the LCDM concordance window while the astronomical measurements that established it have not.*

I’ve shown this plot and variations for other choices of H₀ before, yet it never fails to come as a surprise when I show it to people who work closely on the subject. I’m gonna guess that extends to most of the people who participated in the survey above. Some red flags prove to be false alarms, some don’t, but one should at least be aware of them and take them into consideration when making a judgement like this.

The plurality (35%) of those polled selected “systematic error in supernova data” as the most likely cause of the Hubble tension. It is indeed a common attitude, as I mentioned above, that the Hubble tension is somehow a problem of systematic errors in astronomical data like back in the bad old days^** of Sandage & de Vaucouleurs.

Let’s unpack this a bit. First, the framing: systematic error in supernova data is not the issue. There may, of course, be systematic uncertainties in supernova data, but that’s not a contender for what is causing the apparent Hubble tension. The debate over the local value of H₀ is in the calibrators of supernovae. This is often expressed as a tension between Cepheid and TRGB calibrators, but as we’ve seen, even that is misleading. So posing the question this way is all kinds of revealing, including of some implicit confirmation bias. It’s like putting the right answer of a multiple choice question first and then making up some random alternatives.

So what do we learn from this poll for consensus? There is no overwhelming consensus, and the most popular choice appears to be ill-informed. This could be a meme. Tell me you’re not an expert on a subject by expressing an opinion as if you were.

The kicker here is that this was a conference on black hole physics. There seems to have been some fundamental gravitational and quantum physics discussed, which is all very interesting, but this is a community that is pretty far removed from the nitty-gritty of astronomical observations. There are many other polls reported in this conference report, many of them about esoteric aspects of black holes that I find interesting but would not myself venture an opinion on: it’s not my field. It appears that a plurality of participants at this particular conference might want to consider adopting that policy for fields beyond their own expertise.

I don’t want to be too harsh, but it seems like we are repeating the same mistakes we made in the 1980s. As I’ve related before, I came to astronomy from physics with the utter assurance that H₀ had to be 50. It was Known. Then I met astronomers who were actually involved in measuring H₀ and they were like, “Maybe it is ~80?” This hurt my brain. It could not be so! And yet they turned out to be correct within the uncertainties of the time. Today, similar strong opinions are being expressed by the same community (and sometimes by the same people) who were wrong then, so it wouldn’t surprise me if they are wrong now. Putting how they think things should be ahead of how they are is how they roll.

There are other tensions besides the Hubble tension, but I’ll get to them in future posts. This is enough for now.

^αAs I’ve related before, I date the genesis of concordance LCDM to the work of Ostriker & Steinhardt (1995), though there were many other contributions leading to it (e.g., Efstathiou et al. 1990). Certainly many of us anticipated that the Type Ia SN experiments would confirm or deny this picture. Since the issue of confirmation bias is ever-present in cosmic considerations, it is important to understand this context: the acceleration of the expansion rate that is often depicted as a novel discovery in 1998 was an expected result. So much so that at a conference in 1997 in Aspen I recall watching Michael Turner badger the SN presenters to Proclaim Lambda already. One of the representatives from the SN teams was Richard Ellis, who wasn’t having it: the SN data weren’t there yet even if the attitude was. Amusingly, I later heard Turner claim to have been completely surprised by the 1998 discovery, as if he hadn’t been pushing for it just the year before. Aspen is a good venue for discussion; I commented at the time that the need to rehabilitate the cosmological constant was a big stop sign in the sky. He glared at me, and I’ve been on his shit list ever since.

^%I will not be entertaining assertions that the universe is not expanding in the comments: that’s beyond the scope of this post.

*Every time a paper corroborating a prediction of MOND is published, the usual suspects get on social media to complain that the referee(s) who reviewed the paper must be incompetent. This is a classic case of admitting you don’t understand how the process works by disparaging what happened in a process to which you weren’t privy. Anyone familiar with the practice of refereeing will appreciate that the opposite is true: claims that seem extraordinary are consistently held to a higher standard.

^{^}Note that it is impossible to exclude the act of judgement. There are approaches to minimizing this in particular experiments, e.g., by doing a blind analysis of large scale structure data. But you’ve still assumed a paradigm in which to analyze those data; that’s a judgement call. It is also a judgement call to decide to believe only large scale data and ignore evidence below some scale.

⁺I felt this hard when MOND first cropped up in my data for low surface brightness galaxies. I remember thinking How can this stupid theory get any predictions right when there is so much evidence for dark matter? It took a while for me to realize that dark matter really meant mass discrepancies. The evidence merely indicates a problem, the misnomer presupposes the solution. I had been working so hard to interpret things in terms of dark matter that it came as a surprise that once I allowed myself to try interpreting things in terms of MOND I no longer had to work so hard: lots of observations suddenly made sense.

^$TRGB = Tip of the Red Giant Branch. Low metallicity stars reach a consistent maximum luminosity as they evolve up the red giant branch, providing a convenient standard candle.

^&Where the heck is Tully? He seldom seems to get acknowledged despite having played a crucial role in breaking the tyranny of H₀ = 50 in the 1970s, having published steadily on the topic, and leading a group that continues to provide accurate measurements to this day. Do physics-trained cosmologists even know who he is?

^#The TRGB was a well-established method before it suddenly appears on this graph. That it appears this way shortly after the CMB told us what answer we should get is a more worrisome potential example of confirmation bias, reminiscent of the situation with the primordial deuterium abundance.

^**Aside from the tension between the TRGB as implemented by Freedman’s group and the TRGB as implemented by others, I’m not aware of any serious hint of systematics in the calibration of the distance scale. Can it still happen? Sure! But people are well aware of the dangers and watch closely for them. At this juncture, there is ample evidence that we may indeed have gotten past this.

Ha! I knew the Riess reference off the top of my head, but lots of people have worked on this so I typed “hubble calibration not a systematic error” into Google to search for other papers only to have its AI overview confidently assert

The statement that Hubble calibration is not a systematic error is incorrect
Google AI

That gave me a good laugh. It’s bad enough when overconfident underachievers shout about this from the wrong peak of the Dunning-Kruger curve without AI adding its recycled opinion to the noise, especially since its “opinion” is constructed from the noise.

The best search engine for relevant academic papers is NASA ADS; putting the same text in the abstract box returns many hits that I’m not gonna wade through. (A well-structured ADS search doesn’t read so casually; apparently the same still applies to Google.)

NEIU debate: dark matter or modified gravity

As promised, the folks at NEIU have posted the video of my discussion with Scott Dodelson last week, so here you go:

I am in the midst of writing a related post on cosmic tensions, so hopefully I can post that soon as well.

Dark Matter or Modified Gravity? A virtual panel discussion

This is a quick post to announce that on Monday, April 7 there will be a virtual panel discussion about dark matter and MOND involving Scott Dodelson and myself. It will be moderated by Orin Harris at Northeastern Illinois University starting at 3pm US Central time*. I asked Orin if I should advertise it more widely, and he said yes – apparently their Zoom setup has a capacity for a thousand attendees.

See their website for further details. If you wish to attend, you need to register in advance.

*That’s 4PM EDT to me, which is when I’m usually ready for a nap.

Things I don’t understand in modified dynamics (it’s cosmology)

I’ve been busy, and a bit exhausted, since the long series of posts on structure formation in the early universe. The thing I like about MOND is that it helps me understand – and successfully predict – the dynamics of galaxies. Specific galaxies that are real objects: one can observe this particular galaxy and predict that it should have this rotation speed or velocity dispersion. In contrast, LCDM simulations can only make statistical statements about populations of galaxy-like numerical abstractions; they can never be equated to real-universe objects. Worse, they obfuscate rather than illuminate. In MOND, the observed centripetal acceleration follows directly from that predicted by the observed distribution of stars and gas. In simulations, this fundamental observation is left unaddressed, and we are left grasping at straws trying to comprehend how the observed kinematics follow from an invisible, massive dark matter halo that starts with the NFW form but somehow gets redistributed just so by inadequately modeled feedback processes.

Simply put, I do not understand galaxy dynamics in terms of dark matter, and not for want of trying. There are plenty of people who claim to do so, but they appear to be fooling themselves. Nevertheless, what I don’t like about MOND is the same thing that they don’t like about MOND, which is that I don’t understand the basics of cosmology with it.

Specifically, what I don’t understand about cosmology in modified dynamics is the expansion history and the geometry. That’s a lot, but not everything. The early universe is fine: the expanding universe went through an early hot phase that bequeathed us with the relic radiation field and the abundances of the light elements through big bang nucleosynthesis. There’s nothing about MOND that contradicts that, and arguably MOND is in better agreement with BBN than LCDM, there being no tension with the lithium abundance – this tension was not present in the 1990s, and was only imposed by the need to fit the amplitude of the second peak in the CMB.

But we’re still missing some basics that are well understood in the standard cosmology, and which are in good agreement with many (if not all) of the observations that lead us to LCDM. So I understand the reluctance to admit that maybe we don’t know as much about the universe as we think we do. Indeed, it provokes strong emotional reactions.

*Screenshot from* Dr. Strangelove paraphrasing Major Kong (original quote at top).

So, what might the expansion history be in MOND? I don’t know. There are some obvious things to consider, but I don’t find them satisfactory.

The Age of the Universe

Before I address the expansion history, I want to highlight some observations that pertain to the age of the universe. These provide some context that informs my thinking on the subject, and why I think LCDM hits pretty close to the mark in some important respects, like the time-redshift relation. That’s not to say I think we need to slavishly obey every detail of the LCDM expansion history when constructing other theories, but it does get some things right that need to be respected in any such effort.

One big thing I think we should respect is the set of constraints on the age of the universe. The universe can’t be younger than the objects in it. It could of course be older, but it doesn’t appear to be much older, as there are multiple, independent lines of evidence that all point to pretty much the same age.

Expansion Age: The first basic is that if the universe is expanding, it has a finite age. You can imagine running the expansion in reverse, looking back in time to when the universe was progressively smaller, until you reach an incomprehensibly dense initial phase. That was a very long time ago, to be sure, but not infinitely long ago.

To put an exact number on the age of the universe, we need to know its detailed expansion history. That is something LCDM provides that MOND does not pretend to do. Setting aside theory, a good ball park age is the Hubble time, which is the inverse of the Hubble constant. This is how long it takes for a linearly expanding, “coasting” universe to get where it is today. For the measured H₀ = 73 km/s/Mpc, the Hubble time is 13.4 Gyr. Keep that number in mind for later. This expansion age is the metric against which to compare the ages of measured objects, as discussed below.
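The Hubble time quoted above is just a unit conversion, sketched here for anyone who wants to check it. The constants are the standard kilometers per megaparsec and seconds per Julian year; the function name is an illustrative choice.

```python
KM_PER_MPC = 3.0856775814913673e19          # kilometers in one megaparsec
SECONDS_PER_GYR = 1e9 * 365.25 * 24 * 3600  # seconds in 10^9 Julian years


def hubble_time_gyr(h0_km_s_mpc):
    """Hubble time 1/H0 in Gyr, for H0 given in km/s/Mpc."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC  # H0 in units of 1/s
    return 1.0 / h0_per_second / SECONDS_PER_GYR


print(f"Hubble time for H0 = 73: {hubble_time_gyr(73.0):.1f} Gyr")
```

A handy shortcut that follows from the same constants: the Hubble time in Gyr is about 978 divided by H₀ in km/s/Mpc.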

Globular Clusters: The most famous of age constraints is provided by the ancient stars in globular clusters. One of the great accomplishments of 20th century astrophysics is a masterful understanding of the physics of stars as giant nuclear fusion reactors. This allows us to understand how stars of different mass and composition evolve. That, in turn, allows us to put an age on the stars in clusters. Globulars are the oldest of clusters, with a mean age of 13.5 Gyr (Valcin et al. 2021). Other estimates are similar, though I note that the age determinations depend on the distance scale, so keeping them rigorously separate from Hubble constant determinations has historically been a challenge. The covariance of age and distance renders the meaning of error bars rather suspect, but to give a flavor, the globular cluster M92 is estimated to have an age of 13.80±0.75 Gyr (Jiaqi et al. 2023).

Though globular clusters are the most famous in this regard, there are other constraints on the age of the contents of the universe.

White dwarfs: White dwarfs are the remnants of dead stars that were never massive enough to have exploded as supernovae. The over/under line for that is about 8 solar masses; the oldest white dwarfs will be the remnants of the first stars that formed just below this threshold. Such stars don’t take long to evolve, around 100 Myr. That’s small compared to the age of the universe, so the first white dwarfs have just been cooling off ever since their progenitors burned out.

As the remnants of the incredibly hot cores of former stars, white dwarfs start off hot but cool quickly by radiating into space. The timescale to cool off can be crudely estimated from first principles just from the Stefan-Boltzmann law. As with so many situations in astrophysics, some detailed radiative transfer calculations are necessary to get the answer right in detail. But the ballpark of the back-of-the-envelope answer is not much different from the detailed calculation, giving some confidence in the procedure: we have a good idea of how long it takes white dwarfs to cool.

Since white dwarfs are not generating new energy but simply radiating into space, their luminosity fades over time as their surface temperature declines. This predicts that there will be a sharp drop in the numbers of white dwarfs corresponding to the oldest such objects: there simply hasn’t been enough time to cool further. The observational challenge then becomes finding the faint edge of the luminosity function for these intrinsically faint sources.

Despite the obvious challenges, people have done it, and after great effort, have found the expected edge. Translating that into an age, we get 12.5+1.4/-3.5 Gyr (Munn et al. 2017). This seems to hold up well now that we have Gaia data, which finds J1312-4728 to be the oldest known white dwarf at 12.41±0.22 Gyr (Torres et al. 2021). To get to the age of the universe, one does have to account for the time it takes to make a white dwarf in the first place, which is of order a Gyr or less, depending on the progenitor and when it formed in the early universe. This is pretty consistent with the ages of globular clusters, but comes from different physics: radiative cooling is the dominant effect rather than the hydrogen fusion budget of main sequence stars.

Radiochronometers: Some elements decay radioactively, so measuring their isotopic abundances provides a clock. Carbon-14 is a famous example: with a half-life of 5,730 years, its decay provides a great way to date the remains of prehistoric camp sites and bones. That’s great over some tens of thousands of years, but we need something with a half-life of order the age of the universe to constrain that. One such isotope is ²³²Thorium, with a half life of 14.05 Gyr.

Making this measurement requires that we first find stars that are both ancient and metal poor but with detectable Thorium and Europium (the latter providing a stable reference). Then one has to obtain a high quality spectrum with which to do an abundance analysis. This is all hard work, but there are some examples known.

Sneden’s star, CS 22892-052, fits the bill. Long story short, the measured Th/Eu ratio gives an age of 12.8±3 Gyr (Sneden et al. 2003). A similar result of ~13 Gyr (Frebel & Kratz 2009) is obtained from ²³⁸U (this “stable” isotope of uranium has a half-life of 4.5 Gyr, as opposed to the kind that can be provoked into exploding, ²³⁵U, which has a half-life of 700 Myr). While the search for the first stars and the secrets they may reveal is ongoing, the ages for individual stars estimated from radioactive decay are consistent with the ages of the oldest globular clusters indicated by stellar evolution.
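As a rough illustration of how such a clock works, here is a minimal sketch that applies the standard exponential decay law to the half-lives quoted above. It is not the abundance analysis of Sneden et al.: the initial Th/Eu production ratio is a hypothetical placeholder (in practice it has to come from nucleosynthesis models), and the function names are my own.

```python
import math

# Half-lives quoted in the text, in Gyr.
HALF_LIFE_GYR = {"Th232": 14.05, "U238": 4.5, "U235": 0.7}

def fraction_remaining(age_gyr, half_life_gyr):
    """Fraction of the original nuclei left after age_gyr."""
    return 0.5 ** (age_gyr / half_life_gyr)

def decay_age(ratio_now, ratio_initial, half_life_gyr):
    """Age implied by an unstable/stable abundance ratio.

    ratio_initial is an ASSUMED production ratio, not a measured quantity.
    """
    return half_life_gyr / math.log(2.0) * math.log(ratio_initial / ratio_now)

# How much of each isotope survives 12.8 Gyr?
for name, t_half in HALF_LIFE_GYR.items():
    print(name, round(fraction_remaining(12.8, t_half), 4))

# Hypothetical round trip: decay a placeholder ratio for 12.8 Gyr, then invert.
initial = 0.45  # placeholder production ratio, for illustration only
observed = initial * fraction_remaining(12.8, HALF_LIFE_GYR["Th232"])
print(round(decay_age(observed, initial, HALF_LIFE_GYR["Th232"]), 2))
```

Roughly half of the thorium survives over that span, which is what makes it a usable clock on cosmic timescales, while ²³⁵U is essentially gone.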

Interstellar dust grains: The age of the solar system (4.56 Gyr) is well known from the analysis of isotopic abundances in meteorites. In addition to tracing the oldest material in the solar system, sometimes it is possible to identify dust grains of interstellar origin. One can do the same sort of analysis, and do the sum: how long did it take the star that made those elements to evolve, return them to the interstellar medium, get mixed in with the solar nebula, and lurk about in space until plunging to the ground as a meteorite that gets picked up by some scientifically-inclined human. This exercise has been done by Nittler et al. (2008), who estimate a total age of 13.7±1.3 Gyr.

Taken in sum, all these different age indicators point to a similar, consistent age between 13 and 14 billion years. It might be 12, but not lower, nor is there reason to think it would be much higher: 15 is right out. I say that flippantly because I couldn’t resist the Monty Python reference, but the point is serious: you could in principle have a much older universe, but then why are all the oldest things pretty much the same age? Why would the universe sit around doing nothing for billions of years then suddenly decide to make lots of stars all at once? The more obvious interpretation is that the age of the universe is indeed in the ballpark of 13.something Gyr.

Expansion history

The expansion history in the standard FLRW universe is governed by the Friedmann equation, which we can write* as

H²(z) = H₀² [Ω_m(1+z)³+Ω_k(1+z)²+Ω_Λ]

where z is the redshift, H(z) is the Hubble parameter, H₀ is its current value, and the various Ω are the mass-energy density of stuff relative to the critical density: the mass density Ω_m, the geometry Ω_k, and the cosmological constant Ω_Λ. I’ve neglected radiation for clarity. One can make up other stuff X and add a term for it as Ω_X which will have an associated (1+z) term that depends on the equation of state of X. For our purposes, both normal matter and non-baryonic cold dark matter (CDM) share the same equation of state (cold meaning non-relativistic motions, which in turn means rest-mass density but negligible pressure), so both contribute to the mass density Ω_m = Ω_b+Ω_CDM.

Note that since H(z=0)=H₀, the various Ω’s have to sum to unity. Thus a cosmology is geometrically flat with the curvature term Ω_k = 0 if Ω_m+Ω_Λ = 1. Vanilla LCDM has Ω_m = 0.3 and Ω_Λ = 0.7. As a community, we’ve become very sure of this, but that the Friedmann equation is sufficient to describe the expansion history of the universe is an assumption based on (1) General Relativity providing a complete description, and (2) the cosmological principle (homogeneity and isotropy) holding. These seem like incredibly reasonable assumptions, but let’s bear in mind that we only know directly about 5% of the sum of Ω’s, the baryons. Ω_CDM = 0.25 and Ω_Λ = 0.7 are effectively fudge factors we need to make things work out given the stated assumptions. LCDM is viable if and only if cold dark matter actually exists.
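For readers who like to poke at equations, the Friedmann equation above can be sketched in a few lines. This is only an illustration under the same assumptions as the text (radiation neglected); the function name and argument names are mine, and the curvature term is set by the requirement that the Ω’s sum to unity.

```python
import math

def hubble_parameter(z, h0, omega_m, omega_lambda):
    """H(z) in the same units as h0, neglecting radiation as in the text.

    The curvature term is fixed by requiring the Omegas to sum to unity.
    """
    omega_k = 1.0 - omega_m - omega_lambda
    e2 = omega_m * (1 + z) ** 3 + omega_k * (1 + z) ** 2 + omega_lambda
    return h0 * math.sqrt(e2)

# Vanilla LCDM is flat, so omega_k = 0 and H(z=0) recovers H0.
for z in (0.0, 1.0, 2.5, 9.5):
    print(z, round(hubble_parameter(z, 70.0, 0.3, 0.7), 1))
```

Setting both density arguments to zero gives an empty universe, for which H(z) is simply H₀(1+z).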

Gravity is an attractive force, so the mass term Ω_m acts to retard the expansion. Early on, we expected this to be the dominant term due to the (1+z)³ dependence. In the long-presumed⁺ absence of a cosmological constant, cosmology was the search for two numbers: once H₀ and Ω_m are specified, the entire expansion history is known. Such a universe can only decelerate, so only the region below the straight line in the graph below is accessible; an expansion history like the red one representing LCDM should be impossible. That lots of different data seemed to want this is what led us kicking and screaming to rehabilitate the cosmological constant, which acts as a form of anti-gravity to accelerate an expansion that ought to be decelerating.

*The expansion factor maps how the universe has grown over time; it corresponds to 1/(1+z) in redshift so that z → ∞ as t → 0.* The “coasting” limit of an empty universe *(H₀ = 73, Ω_m = Ω_Λ = 0)* that expands linearly is shown as the straight line. The *red line* is the expansion history of vanilla LCDM (H₀ = 70, Ω_m = 0.3, Ω_Λ = 0.7).

The over/under between acceleration/deceleration of the cosmic expansion rate is the coasting universe. This is the conceptually useful limit of a completely empty universe with Ω_m = Ω_Λ = 0. It expands at a steady rate that neither accelerates nor decelerates. The Hubble time is exactly equal to the age of such a universe, i.e., 13.4 Gyr for H₀ = 73.

LCDM has a more complicated expansion history. The mass density dominates early on, so there is an early phase of deceleration – the red curve bends to the right. At late times, the cosmological constant begins to dominate, reversing the deceleration and transforming it into an acceleration. The inflection point when it switches from decelerating to accelerating is not too far in the past, which is a curious coincidence given that the entire future of such a universe will be spent accelerating towards the exponential expansion of the de Sitter limit. Why do we live anywhen close to this special time?

Lots of ink has been spilled on this subject, and the answer seems to boil down to the anthropic principle. I find this lame and won’t entertain it further. I do, however, want to point out a related strange coincidence: the current age of vanilla LCDM (13.5 Gyr) is the same as that of a coasting universe with the locally measured Hubble constant (13.4 Gyr). Why should these very different models be so close in age? LCDM decelerates, then accelerates; there’s only one moment in the expansion history of LCDM when the age is equal to the Hubble time, and we happen to be living just then.
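The two ages quoted here are easy to check. The sketch below integrates the Friedmann equation numerically with a simple midpoint rule; it neglects radiation as in the text, uses the standard conversion that 1/H₀ is about 977.8 Gyr divided by H₀ in km/s/Mpc, and the function name and step count are my own choices.

```python
import math

def age_gyr(h0, omega_m, omega_lambda, a_end=1.0, steps=100000):
    """Cosmic age at expansion factor a_end: t = integral of da / (a H(a)).

    Midpoint rule in a; radiation is neglected, as in the text.
    """
    omega_k = 1.0 - omega_m - omega_lambda
    da = a_end / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        # a * H(a) / H0 = sqrt(Om / a + Ok + OL * a^2)
        total += da / math.sqrt(omega_m / a + omega_k + omega_lambda * a * a)
    return total * 977.8 / h0  # convert from units of 1/H0 to Gyr

print(round(age_gyr(73.0, 0.0, 0.0), 1))   # coasting universe
print(round(age_gyr(70.0, 0.3, 0.7), 1))   # vanilla LCDM
```

For the coasting case the integrand is constant, so the age is just the Hubble time; for LCDM the early deceleration and late acceleration nearly cancel, which is the coincidence described above.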

This coincidence problem holds for any viable set of LCDM parameters, as they all have nearly the same age. Planck LCDM has an age of 13.7 Gyr, still basically the same as the Hubble time for the locally measured Hubble constant. The lower Planck Hubble value is balanced by a larger amount of early-time deceleration. The universe reaches its current point after 13.something Gyr in all of these models. That’s in good agreement with the ages of the oldest observed stars, which is encouraging, but it does nothing to help us resolve the Hubble tension, much less constrain alternative cosmologies.

Cosmic expansion in MOND

There is no equivalent to the Friedmann equation in MOND. This is not satisfactory. As an extension of Newtonian theory, MOND doesn’t claim to encompass cosmic phenomena^$ – hence the search for a deeper underlying theory. Lacking this, what can we try?

Felten (1984) tried to derive an equivalent to the Friedmann equation using the same trick that can be used with Newtonian theory to recover the expansion dynamics in the absence of a cosmological constant. This did not work. The result was unsatisfactory^& for application to the whole universe because the presence of a₀ in the equations makes the result scale-dependent. So how big the universe is matters in a way that it does not in the standard cosmology; there’s no way to generalize it to describe the whole enchilada.

In retrospect, what Felten had really obtained was a solution for the evolution of a top-hat over-density: the dynamics of a spherical region embedded in an expanding universe. This result is the basis for the successful prediction of early structure formation in MOND. But once again it only tells us about the dynamics of an object within the universe, not the universe itself.

In the absence of a complete theory, one makes an ansatz to proceed. If there is a grander theory that encompasses both General Relativity and MOND, then it must approach both in the appropriate limit, so an obvious ansatz to make is that the entire universe obeys the conventional Friedmann equation while the dynamics of smaller regions in the low acceleration regime obey MOND. Both Bob Sanders and I independently adopted this approach, and explicitly showed that it was consistent with the constraints that were known at the time. The first obvious guess for the mass density of such a cosmology is Ω_m = Ω_b = 0.04. (This was the high end of BBN estimates at the time, so back then we also considered lower values.) The expansion history of this low density, baryon-only universe is shown as the blue line below:

*As above, but with the addition of a low density, baryon-dominated, no-CDM universe (H₀ = 73, Ω_m = Ω_b = 0.04, Ω_Λ = 0; blue line).*

As before, there is not much to choose between these models in terms of age. The small but non-zero mass density does cause some early deceleration before the model approaches the coasting limit, so the current age is a bit lower: 12.6 Gyr. This is on the small side, but not problematically so, or even particularly concerning given the history of the subject. (I’m old enough to remember when we were pretty sure that globular clusters were 18 Gyr old.)

The time-redshift relation for the no-CDM, baryon-only universe is somewhat different from that of LCDM. If we adopt it, then we find that MOND-driven structure forms at somewhat higher redshift than with the LCDM time-redshift relation. The benchmark time of 500 Myr for L* galaxy formation is reached at z = 15 rather than z = 9.5 as in LCDM. This isn’t a huge difference, but it does mean that an L* galaxy could in principle appear even earlier than so far seen. I’ve stuck with LCDM as the more conservative estimate of the time-redshift relation, but the plain fact is we don’t really know what the universe is doing at those early times, or if the ansatz we’ve made holds well enough to do this. Surely it must fail at some point, and it seems likely that we’re past that point.
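To make the comparison of time-redshift relations concrete, here is a sketch using the textbook closed-form ages for a flat matter-plus-Lambda universe and for an open, matter-only universe, inverted by bisection. The parameter values are those of the two models in the figures; the function names are mine, radiation is neglected, and nothing here is specific to MOND beyond adopting the no-CDM proxy.

```python
import math

def hubble_time_gyr(h0):
    """1/H0 in Gyr for H0 in km/s/Mpc."""
    return 977.8 / h0

def age_flat_lcdm(z, h0, omega_m):
    """Closed-form age at redshift z for a flat matter + Lambda universe."""
    omega_l = 1.0 - omega_m
    x = math.sqrt(omega_l / omega_m) * (1 + z) ** -1.5
    return hubble_time_gyr(h0) * (2.0 / 3.0) / math.sqrt(omega_l) * math.asinh(x)

def age_open_matter(z, h0, omega_m):
    """Closed-form age at redshift z for an open, matter-only universe."""
    a = 1.0 / (1 + z)
    eta = math.acosh(1.0 + 2.0 * a * (1.0 - omega_m) / omega_m)
    prefactor = omega_m / (2.0 * (1.0 - omega_m) ** 1.5)
    return hubble_time_gyr(h0) * prefactor * (math.sinh(eta) - eta)

def redshift_at_age(target_gyr, age_of_z, z_hi=1000.0):
    """Bisection on z; relies on age decreasing monotonically with z."""
    z_lo = 0.0
    for _ in range(100):
        z_mid = 0.5 * (z_lo + z_hi)
        if age_of_z(z_mid) > target_gyr:
            z_lo = z_mid
        else:
            z_hi = z_mid
    return 0.5 * (z_lo + z_hi)

def lcdm(z):
    return age_flat_lcdm(z, 70.0, 0.3)

def no_cdm(z):
    return age_open_matter(z, 73.0, 0.04)

print(round(no_cdm(0.0), 1))                 # present age of the no-CDM proxy
print(round(redshift_at_age(0.5, lcdm), 1))  # redshift at 500 Myr, LCDM
print(round(redshift_at_age(0.5, no_cdm), 1))  # redshift at 500 Myr, no-CDM
```

The two redshifts returned for the 500 Myr benchmark land close to the z = 9.5 and z = 15 quoted above; small differences reflect rounding in the adopted parameters.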

There is a bigger problem with the no-CDM model above. Even if it is close to the right expansion history, it has a very large negative curvature. The geometry is nowhere close to the flat Robertson-Walker metric indicated by the angular diameter distance to the surface of last scattering (the CMB).

Geometry

Much of cosmology is obsessed with geometry, so I will not attempt to do the subject justice. Each set of FLRW parameters has a specific geometry that comes hand in hand with its expansion history. The most sensitive probe we have of the geometry is the CMB. The a priori prediction of LCDM was that its flat geometry required the first acoustic peak to have a maximum near one degree on the sky. That’s exactly what we observe.

**Fig. 45** from Famaey & McGaugh (2012): The acoustic power spectrum of the cosmic microwave background as observed by WMAP [229] together with the a priori predictions of ΛCDM (red line) and no-CDM (blue line) as they existed in 1999 [265] prior to observation of the acoustic peaks. ΛCDM correctly predicted the position of the first peak (the geometry is very nearly flat) but over-predicted the amplitude of both the second and third peak. The most favorable a priori case is shown; other plausible ΛCDM parameters [468] predicted an even larger second peak. The most important parameter adjustment necessary to obtain an a posteriori fit is an increase in the baryon density Ω_b, above what had previously been expected from BBN. In contrast, the no-CDM model *ansatz* made as a proxy for MOND successfully predicted the correct amplitude ratio of the first to second peak with no parameter adjustment [268, 269]. The no-CDM model was subsequently shown to under-predict the amplitude of the third peak [442], so no model can explain these data without post-hoc adjustment.

In contrast, no-CDM made the correct prediction for the first-to-second peak amplitude ratio, but it is entirely ambivalent about the geometry. FLRW cosmology and MOND dynamics care about incommensurate things in the CMB data. That said, the naive prediction of the baryon-only model outlined above is that the first peak should occur around where the third peak is observed. That is obviously wrong.

Since the geometry is not a fundamental prediction of MOND, the position of the first peak is easily fit by invoking the same fudge factor used to fit it conventionally: the cosmological constant. We need a larger Ω_Λ = 0.96, but so what? This parameter merely encodes our ignorance: we make no pretense to understand it, let alone vesting deep meaning in it. It is one of the things that a deeper theory must explain, and can be considered as a clue in its development.

So instead of a baryon-only universe, our FLRW proxy becomes a Lambda-baryon universe. That fits the geometry, and for an optical depth to the surface of last scattering of τ = 0.17, matches the amplitude of the CMB power spectrum and correctly predicts the cosmic dawn signal that EDGES claimed to detect. Sounds good, right? Well, not entirely. It doesn’t fit the CMB data at L > 600, but I expected to only get so far with the no-CDM ansatz, so it doesn’t bother me that you need a better underlying theory to fit the entire CMB. Worse, to my mind, is that the Lambda-baryon proxy universe is much, much older than everything in it: 22 Gyr instead of 13.something.

*As above, but now with the addition of a low density, Lambda-dominated universe (H₀ = 73, Ω_m = Ω_b = 0.04, Ω_Λ = 0.96; dashed line).*

This just don’t seem right. Or even close to right. Like, not even pointing in a direction that might lead to something that had a hope of being right.

Moreover, we have a weird tension between the baryon-only proxy and the Lambda-baryon proxy cosmology. The baryon-only proxy has a plausible expansion history but an unacceptable geometry. The Lambda-baryon proxy has a plausible geometry but an implausible expansion history. Technically, yes, it is OK for the universe to be much older than all of its contents, but it doesn’t make much sense. Why would the universe do nothing for 8 or 9 Gyr, then burst into a sudden frenzy of activity? It’s as if Genesis read “for the first 6 Gyr, God was a complete slacker and did nothing. In the seventh Gyr, He tried to pull an all-nighter only to discover it took a long time to build cosmic structure. Then He said ‘Screw it’ and fudged Creation with MOND.”

In the beginning the Universe was created.
This has made a lot of people very angry and been widely regarded as a bad move.
Douglas Adams, The Restaurant at the End of the Universe

So we can have a plausible geometry or we can have a plausible expansion history with a proxy FLRW model, but not both. That’s unpleasant, but not tragic: we know this approach has to fail somehow. But I had hoped for FLRW to be a more coherent first approximation to the underlying theory, whatever it may be. If there is such a theory, then both General Relativity and MOND are its limits in their respective regimes. As such, FLRW ought to be a good approximation to the underlying entity up to some point. That we have to invoke both non-baryonic dark matter and a cosmological constant is a hint that we’ve crossed that point. But I would have hoped that we crossed it in a more coherent fashion. Instead, we seem to get a little of this for the expansion history and a little of that for the geometry.

I really don’t know what the solution is here, or even if there is one. At least I’m not fooling myself into presuming it must work out.

*There are other ways to write the Friedmann equation, but this is a useful form here. For the mathematically keen, the Hubble parameter is the time derivative of the expansion factor normalized by the expansion factor, which in terms of redshift is

H(z) = -(dz/dt)/(1+z).

This quantity evolves, leading us to expect evolution in Milgrom’s constant if we associate it with the numerical coincidence

2π a₀ = cH₀

If the Hubble parameter evolves, as it appears to do, it would seem to follow that so should a₀(z) ~ H(z) – otherwise the coincidence is just that: a coincidence that applies only now. There is, at present, no persuasive evidence that a₀ evolves with redshift.

A similar order-of-magnitude association can be made with the cosmological constant,

2π a₀ = c²Λ^1/2

so conceivably the MOND acceleration scale appears as the result of vacuum effects. It is a matter of judgement whether these numerical coincidences are mere coincidences or profound clues towards a deeper theory. That the proportionality constant is very nearly 2π is certainly intriguing, but the constancy of any of these parameters (including Newton’s G) depends on how they emerge from the deeper theory.
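A quick numerical check of these two coincidences, as a sketch only. The value a₀ = 1.2 × 10⁻¹⁰ m/s² is the commonly quoted one and is an assumption here (it is not stated in this post), as is deriving Λ from the vanilla LCDM parameters via Λ = 3Ω_Λ H₀²/c².

```python
import math

C = 2.998e8            # speed of light, m/s
MPC_IN_M = 3.0857e22   # metres per megaparsec
A0 = 1.2e-10           # ASSUMED Milgrom constant, m/s^2 (commonly quoted value)

def hubble_si(h0_km_s_mpc):
    """Convert H0 from km/s/Mpc to 1/s."""
    return h0_km_s_mpc * 1.0e3 / MPC_IN_M

def lambda_si(h0_km_s_mpc, omega_lambda):
    """Cosmological constant in 1/m^2 from Lambda = 3 Omega_Lambda H0^2 / c^2."""
    return 3.0 * omega_lambda * hubble_si(h0_km_s_mpc) ** 2 / C ** 2

# Ratios that would equal 2*pi if the coincidences were exact.
ratio_hubble = C * hubble_si(73.0) / A0
ratio_lambda = C ** 2 * math.sqrt(lambda_si(70.0, 0.7)) / A0
print(round(ratio_hubble, 2), round(ratio_lambda, 2), round(2 * math.pi, 2))
```

Both ratios come out within a few tens of percent of 2π for these inputs, with the Hubble version the closer of the two.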

⁺In January 2019, I was attending a workshop at Princeton when I had a chance encounter with Jim Peebles. He was not attending the workshop, but happened to be walking across campus at the same time I was. We got to talking, and he affirmed my recollection of just how incredibly unpopular the cosmological constant used to be. Unprompted, he went on to make the analogy of how similar that seemed to how unpopular MOND is now.

Peebles was awarded a long-overdue Nobel Prize later that year.

^$This is one of the things that makes it tricky to compare LCDM and MOND. MOND is a theory of dynamics in the limit of low acceleration. It makes no pretense to be a cosmological theory. LCDM starts as a cosmological theory, but it also makes predictions about the dynamics of systems within it (or at least the dark matter halos in which visible galaxies are presumed to form). So if one starts by putting on a cosmology hat, there is nothing to talk about: LCDM is the only game in town. But from the perspective of dynamics, it’s the other way around, with LCDM repeatedly failing to satisfactorily explain, much less anticipate, phenomena that MOND predicted correctly in advance.

^&An intriguing thing about Felten’s MOND universe is that it eventually recollapses irrespective of the mass density. There is no critical value of Ω_m, hence no coincidence problem. MOND is strong enough to eventually reverse the expansion of the universe, it just takes a very long time to do so, depending on the density.

I’m surprised this aspect of the issue was overlooked. The coincidence problem (then mostly called the flatness problem) obsessed people at the time, so much so that its solution by Cosmic Inflation led to its widespread acceptance. That only works if Ω_m = 1; LCDM makes the coincidence worse. I guess the timing was off, as Inflation had already captured the community’s imagination by that time, likely making it hard to recognize that MOND was a more natural solution. We’d already accepted the craziness that was Inflation and dark matter; MOND craziness was a bridge too far.

I guess. I’m not quite that old; I was still an undergraduate at the time. I did hear about Inflation then, in glowing terms, but not a thing about MOND.

Kinematics suggest large masses for high redshift galaxies

This is what I hope will be the final installment in a series of posts describing the results published in McGaugh et al. (2024). I started by discussing the timescale for galaxy formation in LCDM and MOND which leads to different and distinct predictions. I then discussed the observations that constrain the growth of stellar mass over cosmic time and the related observation of stellar populations that are mature for the age of the universe. I then put on an LCDM hat to try to figure out ways to wriggle out of the obvious conclusion that galaxies grew too massive too fast. Exploring all the arguments that will be made is the hardest part, not because they are difficult to anticipate, but because there are so many* options to consider. This leads to many pages of minutiae that no one ever seems to read⁺, so one of the options I’ve discussed (e.g., super-efficient star formation) will likely emerge as the standard picture even if it comes pre-debunked.

The emphasis so far has been on the evolution of the stellar masses of galaxies because that is observationally most accessible. That gives us the opportunity to wriggle, because what we really want to measure to test LCDM is the growth of [dark] mass. This is well-predicted but invisible, so we can always play games to relate light to mass.

Mass assembly in LCDM from the IllustrisTNG50 simulation. The dark matter mass assembles hierarchically in the merger tree depicted at left; the size of the circles illustrates the dark matter halo mass. The corresponding stellar mass of the largest progenitor is shown at right as the red band. This does not keep pace with the apparent assembly of stellar mass (data points), but what is the underlying mass really doing?

Galaxy Kinematics

What we really want to know is the underlying mass. It is reasonable to expect that the light traces this mass, but is there another way to assess it? Yes: kinematics. The orbital speeds of objects in galaxies trace the total potential, including the dark matter. So, how massive were early galaxies? How does that evolve with redshift?

The rotation curve of NGC 6946 traced by stars at small radii and gas farther out. This is a typical flat rotation curve (data points) that exceeds what can be explained by the observed baryonic mass (red line deduced from the stars and gas pictured at right), leading to the inference of dark matter.

The rotation curve for NGC 6946 shows a number of well-established characteristics for nearby galaxies, including the dominance of baryons at small radii in high surface brightness galaxies and the famous flat outer portion of the rotation curve. Even when stars contribute as much mass as allowed by the inner rotation curve (“maximum disk”), there is a need for something extra further out (i.e., dark matter or MOND). In the case of dark matter, the amplitude of flat rotation is typically interpreted as being indicative^& of halo mass.

So far, the rotation curves of high redshift galaxies look very much like those of low redshift galaxies. There are some fast rotators at high redshift as well. Here is an example observed by Neeleman et al. (2020), who measure a flat rotation speed of 272 km/s for DLA0817g at z = 4.26. That’s more massive than either the Milky Way (~200 km/s) or Andromeda (~230 km/s), if not quite as big as local heavyweight champion UGC 2885 (300 km/s). DLA0817g looks to be a disk galaxy that formed early and is sedately rotating only 1.4 Gyr after the Big Bang. It is already massive at this time: not at all the little nuggets we expect from the CDM merger tree above.

**Fig. 1** from Neeleman et al. (2020): the velocity field (left) and position-velocity diagram (right) of DLA0817g. The velocity field looks like that of a rotating disk, and the raw *position-velocity diagram* shows motions of ~200 km/s on either side of the center. When corrected for inclination, the flat rotation speed is 272 km/s, corresponding to a massive galaxy near the top of the Tully-Fisher relation.

This is anecdotal, of course, but there are a good number of similar cases that are already known. For example, the kinematics of ALESS 073.1 at z ≈ 5 indicate the presence of a massive stellar bulge as well as a rapidly rotating disk (Lelli et al. 2021). A similar case has been observed at z ≈ 6 (Tripodi et al. 2023). These kinematic observations indicate the presence of mature, massive disk galaxies well before they were expected to be in place (Pillepich et al. 2019; Wardlow 2021). The high rotation speeds observed in early disk galaxies sometimes exceed 250 (Neeleman et al. 2020) or even 300 km s⁻¹ (Nestor Shachar et al. 2023; Wang et al. 2024), comparable to the most massive local spirals (Noordermeer et al. 2007; Di Teodoro et al. 2021, 2023). That such rapidly rotating galaxies exist at high redshift indicates that there is a lot of mass present, not just light. We can’t just tweak the mass-to-light ratio of the stars to explain the photometry and also explain the kinematics.

In a seminal galaxy formation paper, Mo, Mao, & White (1998) predicted that “present-day disks were assembled recently (at z ≤ 1).” Today, we see that spiral galaxies are ubiquitous in JWST images up to z ∼ 6 (Ferreira et al. 2022, 2023; Kuhn et al. 2024). The early appearance of massive, dynamically cold (Di Teodoro et al. 2016; Lelli et al. 2018, 2023; Rizzo et al. 2023) disks in the first few billion years after the Big Bang contradicts the natural prediction of ΛCDM. Early disks are expected to be small and dynamically hot (Dekel & Burkert 2014; Zolotov et al. 2015; Krumholz et al. 2018; Pillepich et al. 2019), but they are observed to be massive and dynamically cold. (Hot or cold in this context means a high or low amplitude of the velocity dispersion relative to the rotation speed; the modern Milky Way is cold with σ ~ 20 km/s and V_c ~ 200 km/s.) Understanding the stability and longevity of dynamically cold spiral disks is foundational to the problem.

Kinematic Scaling Relations

Beyond anecdotal cases, we can check on kinematic scaling relations like Tully–Fisher. These are expected to emerge late and evolve significantly with redshift in LCDM (e.g., Glowacki et al. 2021). In MOND, the normalization of the baryonic Tully–Fisher relation is set by a₀, so is immutable for all time if a₀ is constant. Let’s see what the data say:

**Figure 9** from McGaugh et al. (2024): The baryonic Tully–Fisher (left) and dark matter fraction–surface brightness (right) relations. Local galaxy data (circles) are from Lelli et al. (2019; left) and Lelli et al. (2016; right). Higher-redshift data (squares) are from Nestor Shachar et al. (2023) in bins with equal numbers of galaxies color coded by redshift: 0.6 < z < 1.22 (blue), 1.22 < z < 2.14 (green), and 2.14 < z < 2.53 (red). Open squares with error bars illustrate the typical uncertainties. The relations known at low redshift also appear at higher redshift with no clear indication of evolution over a lookback time up to 11 Gyr.

Not much to see: the data from Nestor Shachar et al. (2023) show no clear indication of evolution. The same can be said for the dark matter fraction-surface brightness relation. (Glad to see that being plotted after I pointed it out.) The local relations are coincident with those at higher redshift for both relations within any sober assessment of the uncertainties – exactly what we measure and how matters at this level, and I’m not going to attempt to disentangle all that here. Neither am I about to attempt to assess the consistency (or lack thereof) with either LCDM or MOND; the data simply aren’t good enough for that yet. It is also not clear to me that everyone agrees on what LCDM predicts.

What I can do is check empirically how much evolution there is within the 100-galaxy data set of Nestor Shachar et al. (2023). To do that, I fit a line to their data (the left panel above) and measure the residuals: for a given rotation speed, how far is each galaxy from the expected mass? To compare this with the stellar masses discussed previously, I normalize those residuals to the same M_*^* = 9 x 10¹⁰ M_☉. If there is no evolution, the data will scatter around a constant value as a function of redshift:

This figure reproduces the stellar mass-redshift data for L* galaxies (black points) and the monolithic (purple line) and LCDM (red and green lines) models discussed previously. The blue squares illustrate deviations of the data of Nestor Shachar et al. (2023) from the baryonic Tully-Fisher relation (dashed line, normalized to the same mass as the monolithic model). There is no indication of evolution in the baryonic Tully-Fisher relation, which was apparently established within the first few billion years after the Big Bang (z = 2.5 corresponds to a cosmic age of about 2.6 Gyr). The data are consistent with a monolithic galaxy formation model in which all the mass had been assembled into a single object early on.

The data scatter around a constant value as a function of redshift: there is no perceptible evolution.
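The residual exercise described above can be sketched as follows. This is an illustration only: the data are synthetic, ordinary least squares in log-log space is my choice (the fitting method actually used is not specified here), and the function names are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def normalized_masses(speeds_km_s, masses_msun, reference_mass=9.0e10):
    """Shift each galaxy's Tully-Fisher residual onto a common reference mass."""
    log_v = [math.log10(v) for v in speeds_km_s]
    log_m = [math.log10(m) for m in masses_msun]
    slope, intercept = fit_line(log_v, log_m)
    residuals = [lm - (slope * lv + intercept) for lv, lm in zip(log_v, log_m)]
    return [reference_mass * 10.0 ** r for r in residuals]

# SYNTHETIC galaxies for illustration only (not the published sample):
# a V^4 relation with hand-picked scatter factors.
speeds = [120.0, 160.0, 200.0, 250.0, 300.0]
scatter = [1.2, 0.8, 1.0, 1.25, 0.85]
masses = [50.0 * v ** 4 * s for v, s in zip(speeds, scatter)]

for v, m in zip(speeds, normalized_masses(speeds, masses)):
    print(v, f"{m:.2e}")
```

Plotting these normalized masses against each galaxy’s redshift is then the test: a trend would indicate evolution, while scatter about a constant would not.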

The kinematic data for rotating galaxies tell much the same story as the photometric data for galaxies in clusters. They are both consistent with a monolithic model that gathered together the bulk of the baryonic mass early on, and evolved as an island universe for most of the history of the cosmos. There is no hint of the decline in mass with redshift predicted by the LCDM simulations. Moreover, the kinematics trace mass, not just light. So while I am careful to consider the options for LCDM, I don’t know how we’re gonna get out of this one.

Empirically, it is an important observation that there is no apparent evolution in the baryonic Tully-Fisher relation out to z ~ 2.5. That’s a lookback time of ~11 Gyr, so most of cosmic history. That means that whatever physics sets the relation did so early. If the physics is MOND, this absence of evolution implies that a₀ is constant. There is some wiggle room in that given all the uncertainties, but this already excludes the picture in which a₀ evolves with the expansion rate through the coincidence a₀ ~ cH₀. That much evolution would be readily perceptible if H(z) evolves as it appears to do. In contrast, the coincidence a₀ ~ c²Λ^1/2 remains interesting since the cosmological constant is constant. Perhaps this is just a coincidence, or perhaps it is a hint that the anomalous acceleration of the expansion of the universe is somehow connected with the anomalous acceleration in galaxy dynamics.
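To get a feel for how much evolution the a₀ ~ cH₀ picture would imply, here is a sketch. It assumes the MOND form of the baryonic Tully-Fisher relation, M = V⁴/(G a₀), the commonly quoted a₀ = 1.2 × 10⁻¹⁰ m/s², and a hypothetical a₀ that scales in proportion to H(z) for vanilla LCDM; none of these specifics are taken from the paper.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
A0 = 1.2e-10         # ASSUMED local value of Milgrom's constant, m/s^2
MSUN = 1.989e30      # kg

def btfr_mass_msun(v_flat_km_s, a0=A0):
    """Baryonic mass from the MOND Tully-Fisher relation M = V^4 / (G a0)."""
    v = v_flat_km_s * 1.0e3
    return v ** 4 / (G * a0) / MSUN

def hubble_ratio(z, omega_m=0.3, omega_lambda=0.7):
    """H(z)/H0 for vanilla LCDM (flat, radiation neglected)."""
    return math.sqrt(omega_m * (1 + z) ** 3 + omega_lambda)

for z in (0.0, 1.0, 2.5):
    a0_z = A0 * hubble_ratio(z)  # hypothetical a0 that tracks H(z)
    shift_dex = math.log10(btfr_mass_msun(200.0, a0_z) / btfr_mass_msun(200.0))
    print(z, round(hubble_ratio(z), 2), round(shift_dex, 2))
```

Under these assumptions the normalization at fixed rotation speed would drop by more than half a dex by z = 2.5, which is the sort of shift that should be visible in data like those above.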

Though I see no clear evidence for evolution in Tully-Fisher to date, it remains early days. For example, a very recent paper by Amvrosiadis et al. (2025) does show a hint of evolution in the sense of an offset in the normalization of the baryonic Tully-Fisher relation. This isn’t very significant, being different by less than 2σ; and again we find ourselves in a situation where we need to take a hard look at all the assumptions and population modeling and velocity measurements just to see if we’re talking about the same quantities before we even begin to assess consistency or the lack thereof. Nevertheless, it is an intriguing result. There is also another interesting anecdotal case: one of their highest redshift objects, ALESS 071.1 at z = 3.7, is also the most massive in the sample, with an estimated stellar mass of 2 x 10¹² M_☉. That is a crazy large number, comparable to or maybe larger than the entire dark matter halo of the Milky Way. It falls off the top of any of the graphs of stellar mass we discussed before. If correct, this one galaxy is an enormous problem for LCDM regardless of any other consideration. It is of course possible that this case will turn out to be wrong for some reason, so it remains early days for kinematics at high redshift.

Cluster Kinematics

It is even earlier days for cluster kinematics. First we have to find them, which was the focus of Jay Franck’s thesis. Once identified, we have to estimate their masses with the available data, which may or may not be up to the task. And of course we have to figure out what theory predicts.

LCDM makes a clear prediction for the growth of cluster mass. This works out OK at low redshift, in the sense that the cluster X-ray mass function is in good agreement with LCDM. Where the theory struggles is in the proclivity for the most massive clusters to appear sooner in cosmic history than anticipated. Like individual galaxies, they appear too big too soon. This trend persisted in Jay’s analysis, which identified candidate protoclusters at higher redshifts than expected. It also measured velocity dispersions that were consistently higher than found in simulations. That is, when Jay applied the search algorithm he used on the data to mock data from the Millennium simulation, the structures identified there had velocity dispersions on average a factor of two lower than seen in the data. That’s a big difference in terms of mass.

**Figure 11** from McGaugh et al. (2024): Measured velocity dispersions of protocluster candidates (Franck & McGaugh 2016a, 2016b) as a function of redshift. Point size grows with the assessed probability that the identified overdensities correspond to a real structure: all objects are shown as small points, candidates with P > 50% are shown as light blue midsize points, and the large dark blue points meet this criterion and additionally have at least 10 spectroscopically confirmed members. The MOND mass for an equilibrium system in the low-acceleration regime is noted at right; these are comparable to cluster masses at low redshift.

At this juncture, there is no way to know if the protocluster candidates Jay identified are or will become bound structures. We made some probability estimates that can be summed up as “some are probably real, but some probably are not.” The relative probability is illustrated by the size of the points in the plot above; the big blue points are the most likely to be real clusters, having at least ten galaxies at the same place on the sky at the same redshift, all with spectroscopically measured redshifts. Here the spectra are critical; photometric redshifts typically are not accurate enough to indicate that galaxies that happen to be nearby to each other on the sky are also that close in redshift space.

The net upshot is that there are at least some good candidate clusters at high redshift, and these have higher velocity dispersions than expected in LCDM. I did the exercise of working out what the equivalent mass in MOND would be, and it is about the same as what we find for clusters at low redshift. This estimate assumes dynamical equilibrium, which is very far from guaranteed. But the time at which these structures appear is consistent with the timescale for cluster formation in MOND (a couple Gyr; z ~ 3), so maybe? Certainly there shouldn’t be lots of massive clusters in LCDM at z ~ 3.
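For the curious, the MOND mass exercise can be sketched in a few lines. This assumes the standard deep-MOND estimator for an isolated, isotropic system in equilibrium, M = 81σ⁴/(4Ga₀), with a₀ = 1.2 × 10⁻¹⁰ m/s². The numerical coefficient used in the paper may differ, and the velocity dispersions below are hypothetical round numbers, not measurements.

```python
G = 4.30091e-6          # Newton's constant in kpc (km/s)^2 / Msun
KPC_IN_M = 3.0857e19    # metres per kiloparsec
A0_SI = 1.2e-10         # assumed MOND acceleration scale in m/s^2

# Convert a0 to (km/s)^2 / kpc: (m/s)^2 -> (km/s)^2 is 1e-6, per m -> per kpc is KPC_IN_M
A0 = A0_SI * 1e-6 * KPC_IN_M   # about 3700 (km/s)^2 / kpc


def mond_mass(sigma_kms):
    """Deep-MOND equilibrium mass in Msun for a line-of-sight dispersion in km/s.

    Assumes isotropy, isolation, and dynamical equilibrium (none guaranteed).
    """
    return 81.0 * sigma_kms**4 / (4.0 * G * A0)


# Hypothetical dispersions, chosen only to show the steep sigma^4 dependence
for sigma in (300.0, 500.0, 800.0):
    print(f"sigma = {sigma:.0f} km/s -> M ~ {mond_mass(sigma):.1e} Msun")
```

Because the mass goes as the fourth power of the velocity dispersion, modest errors in σ translate into large errors in mass, which is one more reason to treat any such estimate with caution.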

Kinematic Takeaways

While it remains early days for kinematic observations at high redshift, so far these data do nothing to contradict the obvious interpretation of the photometric data. There are mature, dynamically cold, fast-rotating spiral galaxies in the early universe that were predicted not to be there by LCDM. Moreover, kinematics traces mass, not just light, so all the wriggling we might do to explain the latter doesn’t help with the former. The most obvious interpretation of the kinematic data to date is the same as that for the photometric data: galaxies formed early and grew massive quickly, as predicted a priori by MOND.

*The papers I write that cover both theories always seem to wind up lopsided in favor of LCDM in terms of the bulk of their content. That happens because it takes many pages to discuss all the ins and outs. In contrast, MOND just gets it right the first time, so that section is short: there’s not much more to say than “Yep, that’s what it predicted.”

⁺I’ve not yet heard any criticisms of our paper directly. The criticisms that I’ve heard second or third hand so far almost all fall in the category of things we explicitly discussed. That’s a pretty clear tell that the person leveling the critique hasn’t bothered to read it. I don’t expect everyone to agree with our take on this or that, but a competent critic would at least evince awareness that we had addressed their concern, even if not to their satisfaction. We rarely seem to reach that level: it is much easier to libel and slander than to engage with the issues.

The one complaint I’ve heard so far that doesn’t fall in the category of things-we-already-discussed is that we didn’t do hydrodynamic simulations of star formation in molecular gas. That is a red herring. To predict the growth of stellar mass, all we need is a prescription for assembling mass and converting baryons into stars; this is essentially a bookkeeping exercise that can be done analytically. If this were a serious concern, it should be noted that most cosmological hydro-simulations also fail to meet this standard: they don’t resolve star formation, so they typically adopt some semi-empirical (i.e., data-informed) bookkeeping prescription for this “subgrid physics.”

Though I have not myself attempted to numerically simulate galaxy formation in MOND, Sanders (2008) did. More recently, Eappen et al. (2022) have done so, including molecular gas and feedback^$ and everything. They find a star formation history compatible with the analytic models we discuss in our paper.

^$Related detail: Eappen et al. find that different feedback schemes make little difference to the end result. The deus ex machina invoked to solve all problems in LCDM is largely irrelevant in MOND. There’s a good physical reason for this: gravity in MOND is sourced by what you see; how it came to have its observed distribution is irrelevant. If 90% of the baryons are swept entirely out of the galaxy by some intense galactic wind, then they’re gone BYE BYE and don’t matter any more. In contrast, that is one of the scenarios sometimes invoked to form cores in dark matter halos that are initially cuspy: the departure of all those baryons perturbs the orbits of the dark matter particles and rearranges the structure of the halo. While that might work to alter halo structure, how it results in MOND-like phenomenology has never been satisfactorily explained. Mostly that is not seen as even necessary; converting cusp to core is close enough!

^&Though we typically associate the observed outer velocity with halo mass, an important caveat is that the radius also matters: M ~ RV², and most data for high redshift galaxies do not extend very far out in radius. Nevertheless, it takes a lot of mass to make rotation speeds of order 200 km/s within a few kpc, so it hardly matters if this is or is not representative of the dark matter halo: if it is all stars, then the kinematics directly corroborate the interpretation of the photometric data that the stellar mass is large. If it is representative of the dark matter halo, then we expect the halo radius to scale with the halo velocity (R₂₀₀ ~ V₂₀₀) so M₂₀₀ ~ V₂₀₀³ and again it appears that there is too much mass in place too early.
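The arithmetic in this footnote can be sketched as follows. It assumes the simple Newtonian estimate M = RV²/G for a roughly spherical mass distribution, and takes “a few kpc” to mean 3 kpc purely for illustration; the function names are mine.

```python
G = 4.30091e-6  # Newton's constant in kpc (km/s)^2 / Msun


def enclosed_mass(radius_kpc, speed_kms):
    """Newtonian mass in Msun enclosed within radius_kpc for circular speed speed_kms.

    Assumes a roughly spherical mass distribution, M = R V^2 / G.
    """
    return radius_kpc * speed_kms**2 / G


def halo_mass_ratio(velocity_ratio):
    """Halo mass ratio for a given velocity ratio, assuming R200 ~ V200 so M200 ~ V200^3."""
    return velocity_ratio**3


# Illustrative case: 200 km/s at an assumed radius of 3 kpc
print(f"M(<3 kpc) ~ {enclosed_mass(3.0, 200.0):.1e} Msun")

# Doubling the halo velocity implies eight times the halo mass
print(f"M200 ratio for 2x velocity: {halo_mass_ratio(2.0):.0f}")
```

With these assumptions the enclosed mass comes out at a few times 10¹⁰ M_☉ within just 3 kpc, which is the sense in which it takes a lot of mass to reach such rotation speeds at small radii.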

Triton Station

A Blog About the Science and Sociology of Cosmology and Dark Matter
