A personal recollection of how we learned to stop worrying and love the Lambda

There is a tendency when teaching science to oversimplify its history for the sake of getting on with the science. Knowing how the science came to be isn’t necessary to learn it. But doing science requires a proper understanding of the process by which it came to be.

The story taught to cosmology students seems to have become: we didn’t believe in the cosmological constant (Λ), then in 1998 the Type Ia supernovae (SN) monitoring campaigns detected accelerated expansion, then all of a sudden we did believe in Λ. The actual history was, of course, rather more involved – to the point where this oversimplification verges on disingenuous. There were many observational indications of Λ that were essential in paving the way.

Modern cosmology starts in the early 20th century with the recognition that the universe should be expanding or contracting – a theoretical inevitability of General Relativity that Einstein initially tried to dodge by inventing the cosmological constant – and is in fact expanding, as observationally established by Hubble and Slipher and many others since. The Big Bang was largely considered settled truth after the discovery of the cosmic microwave background (CMB) in 1964.

The CMB held a puzzle, as it was quickly shown to be too smooth. The early universe was both isotropic and homogeneous. Too homogeneous. We couldn’t detect the density variations that could grow into galaxies and other immense structures. Though such density variations are now well measured as temperature fluctuations that are statistically well described by the acoustic power spectrum, the starting point was that these fluctuations were a disappointing no-show. We should have been able to see them much sooner, unless something really weird was going on…

That something weird was non-baryonic cold dark matter (CDM). For structure to grow, it needed the helping hand of the gravity of some unseen substance. Normal matter did not suffice. The most elegant cosmology, the Einstein-de Sitter universe, had a mass density Ωm = 1. But the measured abundances of the light elements were only consistent with the calculations of big bang nucleosynthesis if normal matter amounted to only 5% of Ωm = 1. This, plus the need to grow structure, led to the weird but seemingly unavoidable inference that the universe must be full of invisible dark matter. This dark matter needed to be some slow moving, massive particle that neither interacts with light nor resides within the menagerie of particles present in the Standard Model of Particle Physics.

CDM and early universe Inflation were established in the 1980s. Inflation gave a mechanism that drove the mass density to exactly one (elegant!), and CDM gave us hope for enough mass to get to that value. Together, they gave us the Standard CDM (SCDM) paradigm with Ωm = 1.000 and H0 = 50 km/s/Mpc.

I was there when SCDM failed.

It is hard to overstate the fervor with which the SCDM paradigm was believed. Inflation required that the mass density be exactly one; Ωm < 1 was inconceivable. For an Einstein-de Sitter universe to be old enough to contain the oldest stars, the Hubble constant had to be the lower of the two values (50 or 100) commonly discussed at that time. That meant that H0 > 50 was Right Out. We didn’t even discuss Λ. Λ was Unmentionable. Unclean.

SCDM was Known, Khaleesi.


Λ had attained unmentionable status in part because of its origin as Einstein’s greatest blunder, and in part through its association with the debunked Steady State model. But serious mention of it creeps back into the literature by 1990. The first time I personally heard Λ mentioned as a serious scientific possibility was by Yoshii at a conference in 1993. Yoshii based his argument on a classic cosmological test, N(m) – the number of galaxies as a function of how faint they appeared. The deeper you look, the more you see, in a way that depends on the intrinsic luminosity of galaxies, and how they fill space. Look deep enough, and you begin to trace the geometry of the cosmos.

At this time, one of the serious problems confronting the field was the faint blue galaxies problem. There were so many faint galaxies on the sky, it was incredibly difficult to explain them all. Yoshii made a simple argument. To get so many galaxies, we needed a big volume. The only way to do that in the context of the Robertson-Walker metric that describes the geometry of the universe is if we have a large cosmological constant, Λ. He was arguing for ΛCDM five years before the SN results.
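To see the geometric leverage at work, here is a minimal sketch (with illustrative parameters, not Yoshii’s actual calculation) comparing the comoving volume out to a given redshift in an Einstein-de Sitter universe with that in a flat model dominated by Λ:

```python
# Sketch: comoving volume out to redshift z in flat FLRW models.
# Illustrative parameters; not Yoshii's published calculation.
import numpy as np
from scipy.integrate import quad

H0 = 70.0            # Hubble constant, km/s/Mpc (assumed for illustration)
C = 299792.458       # speed of light, km/s
D_H = C / H0         # Hubble distance, Mpc

def E(z, Om, OL):
    """Dimensionless expansion rate for a flat FLRW model."""
    return np.sqrt(Om * (1 + z)**3 + OL)

def V_comoving(z, Om, OL):
    """Comoving volume (Mpc^3) of a sphere out to redshift z."""
    D_C = D_H * quad(lambda zp: 1.0 / E(zp, Om, OL), 0.0, z)[0]
    return 4.0 * np.pi * D_C**3 / 3.0

z = 3.0
ratio = V_comoving(z, 0.3, 0.7) / V_comoving(z, 1.0, 0.0)
print(f"Volume ratio (Lambda / EdS) at z = {z}: {ratio:.1f}")
# A large Lambda roughly triples the volume available to host faint galaxies.
```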

Lambda? We don’t need no stinking Lambda!

Yoshii was shouted down. NO! Galaxies evolve! We don’t need no stinking Λ! In retrospect, Yoshii & Peterson (1995) looks like a good detection of Λ. Perhaps Yoshii & Peterson also deserve a Nobel prize?

Indeed, there were many hints that Λ (or at least low Ωm) was needed, e.g., the baryon catastrophe in clusters, the power spectrum of IRAS galaxies, the early appearance of bound structures, the statistics of gravitational lenses, and so on. Certainly by the mid-90s it was clear that we were not going to make it to Ωm = 1. Inflation was threatened – it requires Ωm = 1 – or at least a flat geometry: Ωm + ΩΛ = 1.

SCDM was in crisis.

A very influential 1995 paper by Ostriker & Steinhardt did a lot to launch ΛCDM. I was impressed by the breadth of data Ostriker & Steinhardt discussed, all of which demanded low Ωm. I thought the case for Λ was less compelling, as it hinged on the age problem in a way that might also have been solved, at that time, by simply having an open universe (low Ωm with no Λ). This would ruin Inflation, but I wasn’t bothered by that. I expect they were. Regardless, they definitely made the case for ΛCDM three years before the supernovae results. Their arguments were accepted by almost everyone who was paying attention, including myself. I heard Ostriker give a talk around this time during which he was asked “what cosmology are you assuming?” to which he replied “the right one.” Called the “concordance” cosmology by Ostriker & Steinhardt, ΛCDM had already achieved the status of most-favored cosmology by the mid-90s.

A simplified version of the diagram of Ostriker & Steinhardt (1995) illustrating just a few of the constraints they discussed. Direct measurements of the expansion rate, mass density, and ages of the oldest stars excluded SCDM, instead converging on a narrow window – what we now call ΛCDM.

Ostriker & Steinhardt neglected to mention an important prediction of Λ: not only should the universe expand, but that expansion rate should accelerate! In 1995, that sounded completely absurd. People had looked for such an effect, and claimed not to see it. So I wrote a brief note pointing out the predicted acceleration of the expansion rate. I meant it in a bad way: how crazy would it be if the expansion of the universe was accelerating?! This was an obvious and inevitable consequence of ΛCDM that was largely being swept under the rug at that time.
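For concreteness, this is just the textbook deceleration parameter for a universe containing matter and Λ:

```latex
q_0 \;=\; \frac{\Omega_m}{2} \;-\; \Omega_\Lambda ,
\qquad\text{so}\qquad
\Omega_\Lambda > \frac{\Omega_m}{2} \;\Longrightarrow\; \ddot{a} > 0 .
```

Any flat model with Ωm ≈ 0.3 and ΩΛ ≈ 0.7 has q0 ≈ −0.55: the expansion must accelerate.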

I mean[t], surely we could live with Ωm < 1 but no Λ. Can’t we all just get along? Not really, as it turned out. I remember Mike Turner pushing the SN people very hard in Aspen in 1997 to Admit Λ. He had an obvious bias: as an Inflationary cosmologist, he had spent the previous decade castigating observers for repeatedly finding Ωm < 1. That’s too little mass, you fools! Inflation demands Ωm = 1.000! Look harder!

By 1997, Turner had, like many cosmologists, finally wrapped his head around the fact that we weren’t going to find enough mass for Ωm = 1. This was a huge problem for Inflation. The only possible solution, albeit an ugly one, was if Λ made up the difference. So there he was at Aspen, pressuring the people who observed supernovae to Admit Λ. One, in particular, was Richard Ellis, a great and accomplished astronomer who had led the charge in shouting down Yoshii. They didn’t yet have enough data to Admit Λ. Not.Yet.

By 1998, there were many more high redshift SNIa. Enough to see Λ. This time, after the long series of results only partially described above, we were intellectually prepared to accept it – unlike in 1993. Had the SN experiments been conducted five years earlier, and obtained exactly the same result, they would not have been awarded the Nobel prize. They would instead have been dismissed as a trick of astrophysics: the universe evolves, metallicity was lower at earlier times, so SN were different then from now and could not be used as standard candles. This sounds silly now, as we’ve figured out how to calibrate for intrinsic variations in the luminosities of Type Ia SN, but that is absolutely how we would have reacted in 1993, and no amount of improvements in the method would have convinced us. This is exactly what we did with faint galaxy counts: galaxies evolve; you can’t hope to understand that well enough to constrain cosmology. Do you ever hear them cited as evidence for Λ?

Great as the supernovae experiments to measure the metric genuinely were, they were not a discovery so much as a confirmation of what cosmologists had already decided to believe. There was no singular discovery that changed the way we all thought. There was a steady drip, drip, drip of results pointing towards Λ all through the ’90s – the age problem in which the oldest stars appeared to be older than the universe in which they reside, the early appearance of massive clusters and galaxies, the power spectrum of galaxies from redshift surveys that preceded Sloan, the statistics of gravitational lenses, and the repeated measurement of 1/4 < Ωm < 1/3 in a large variety of independent ways – just to name a few. By the mid-90’s, SCDM was dead. We just refused to bury it until we could accept ΛCDM as a replacement. That was what the Type Ia SN results really provided: a fresh and dramatic reason to accept the accelerated expansion that we’d already come to terms with privately but had kept hidden in the closet.

Note that the acoustic power spectrum of temperature fluctuations in the cosmic microwave background (as opposed to the mere existence of the highly uniform CMB) plays no role in this history. That’s because temperature fluctuations hadn’t yet been measured beyond their rudimentary detection by COBE. COBE demonstrated that temperature fluctuations did indeed exist (finally!) as they must, but precious little beyond that. Eventually, after the settling of much dust, COBE was recognized as one of many reasons why Ωm ≠ 1, but it was neither the most clear nor the most convincing reason at that time. Now, in the 21st century, the acoustic power spectrum provides a great way to constrain what all the parameters of ΛCDM have to be, but it was a bit player in ΛCDM’s development. The water there was carried by traditional observational cosmology using general purpose optical telescopes in a great variety of different ways, combined with a deep astrophysical understanding of how stars, galaxies, quasars and the whole menagerie of objects found in the sky work – all the vast knowledge incorporated in textbooks like those by Harrison, by Peebles, and by Peacock, knowledge that often seems to be lacking in scientists trained in the post-WMAP era.

Despite being a late arrival, the CMB power spectrum measured in 2000 by Boomerang and in 2003 by WMAP did one important new thing to corroborate the ΛCDM picture. The supernovae data didn’t detect accelerated expansion so much as exclude the deceleration we had nominally expected. The data were also roughly consistent with a coasting universe (neither accelerating nor decelerating); the case for acceleration only became clear when we assumed that the geometry of the universe was flat (Ωm + ΩΛ = 1). That didn’t have to work out, so it was a great success of the paradigm when the location of the first peak of the power spectrum appeared in exactly the right place for a flat FLRW geometry.
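Schematically (a cartoon of the joint constraint, not the actual likelihood analysis), the supernovae at moderate redshift constrain roughly the deceleration q0, and the first peak adds the flatness condition:

```latex
q_0 \approx \frac{\Omega_m}{2} - \Omega_\Lambda ,
\qquad
\Omega_m + \Omega_\Lambda = 1
\quad\Longrightarrow\quad
\Omega_\Lambda = \frac{1 - 2\,q_0}{3} ,
```

so once flatness is imposed, even a marginal detection of q0 < 0 forces ΩΛ > 1/3 and a correspondingly sub-critical Ωm.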

The consistency of these data has given ΛCDM an air of invincibility among cosmologists. But a modern reconstruction of the Ostriker & Steinhardt diagram leaves no room remaining – hence the tension between H0 = 73 measured directly and H0 = 67 from multiparameter CMB fits.

Constraints from the acoustic power spectrum of the CMB overplotted on the direct measurements from the plot above. Initially in great consistency with those measurements, the best fit CMB values have steadily wandered away from the most-favored region of parameter space that established ΛCDM in the first place. This is most apparent in the tension with H0.

In cosmology, we are accustomed to having to find our way through apparently conflicting data. The difference between an expansion rate of 67 and 73 seems trivial given that the field was long riven – in living memory – by the dispute between 50 and 100. This gives rise to the expectation that the current difference is just a matter of some subtle systematic error somewhere. That may well be correct. But it is also conceivable that FLRW is inadequate to describe the universe, and we have been driven to the objectively bizarre parameters of ΛCDM because it happens to be the best approximation that can be obtained to what is really going on when we insist on approximating it with FLRW.

Though a logical possibility, that last sentence will likely drive many cosmologists to reach for their torches and pitchforks. Before killing the messenger, we should remember that we once endowed SCDM with the same absolute certainty we now attribute to ΛCDM. I was there, 3,000 internet years ago, when SCDM failed. There is nothing so sacred in ΛCDM that it can’t suffer the same fate, as has every single cosmology ever devised by humanity.

Today, we still lack definitive knowledge of either dark matter or dark energy. These add up to 95% of the mass-energy of the universe according to ΛCDM. These dark materials must exist.

It is Known, Khaleesi.

It Must Be So. But which Must?

In the last post, I noted some of the sociological overtones underpinning attitudes about dark matter and modified gravity theories. I didn’t get as far as the more scientifically interesting part, which illustrates a common form of reasoning in physics.

About modified gravity theories, Bertone & Tait state

“the only way these theories can be reconciled with observations is by effectively, and very precisely, mimicking the behavior of cold dark matter on cosmological scales.”

Leaving aside just which observations need to be mimicked so precisely (I expect they mean power spectrum; perhaps they consider this to be so obvious that it need not be stated), this kind of reasoning is both common and powerful – and frequently correct. Indeed, this is exactly the attitude I expressed in my review a few years ago for the Canadian Journal of Physics, quoted in the image above. I get it. There are lots of positive things to be said for the standard cosmology.

The upshot of this reasoning is, in effect, that “cosmology works so well that non-baryonic dark matter must exist.” I have sympathy for this attitude, but I also remember many examples in the history of cosmology where it has gone badly wrong. There was a time, not so long ago, when the matter density had to be the critical value, and the Hubble constant had to be 50 km/s/Mpc. By and large, it is the same community that insisted on those falsehoods with great intensity that continues to insist on conventionally conceived cold dark matter with similarly fundamentalist insistence.

I think it is an overstatement to say that the successes of cosmology (as we presently perceive them) prove the existence of dark matter. A more conservative statement is that the ΛCDM cosmology is correct if, and only if, dark matter exists. But does it? That’s a separate question, which is why laboratory searches are so important – including null results. It was, after all, the null result of Michelson & Morley that ultimately put an end to the previous version of an invisible aetherial medium, and sparked a revolution in physics.

Here I point out that the same reasoning asserted by Bertone & Tait as a slam dunk in favor of dark matter can just as accurately be asserted in favor of MOND. To directly paraphrase the above statement:

“the only way ΛCDM can be reconciled with observations is by effectively, and very precisely, mimicking the behavior of MOND on galactic scales.”

This is a terrible problem for dark matter. Even if it were true, as is often asserted, that MOND only fits rotation curves, this would still be tantamount to a falsification of dark matter by the same reasoning applied by Bertone & Tait.

Let’s look at just one example, NGC 1560:


The rotation curve of NGC 1560 (points) together with the Newtonian expectation (black line) and the MOND fit (blue line). Data from Begeman et al. (1991) and Gentile et al. (2010).

MOND fits this rotation curve in excruciating detail. It provides just the right amount of boost over the Newtonian expectation, which varies from galaxy to galaxy. Features in the baryon distribution are reflected in the rotation curve. That is required in MOND, but makes no sense in dark matter, where the excess velocity over the Newtonian expectation is attributed to a dynamically hot, dominant, quasi-spherical dark matter halo. Such entities cannot support the features commonly seen in thin, dynamically cold disks. Even if they could, there is no reason that features in the dominant dark matter halo should align with those in the disk: a sphere isn’t a disk. In short, it is impossible to explain this with dark matter – to the extent that anything is ever impossible for the invisible.

NGC 1560 is a famous case because it has such an obvious feature. It is common to dismiss this as some non-equilibrium fluke that should simply be ignored. That is always a dodgy path to tread, but might be OK if it were only this galaxy. But similar effects are seen over and over again, to the point that they earned an empirical moniker: Renzo’s Rule. Renzo’s rule is known to every serious student of rotation curves, but has not informed the development of most dark matter theory. Ignoring this information is like leaving money on the table.

MOND fits not just NGC 1560, but very nearly* every galaxy we measure. It does so with excruciatingly little freedom. The only physical fit parameter is the stellar mass-to-light ratio. The gas fraction of NGC 1560 is 75%, so M*/L plays little role. We understand enough about stellar populations to have an idea what to expect; MOND fits return mass-to-light ratios that compare well with the normalization, color dependence, and band-pass dependent scatter expected from stellar population synthesis models.
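To make the one-parameter character of these fits concrete, here is a minimal sketch of a MOND rotation curve model. The “simple” interpolation function is one common choice, and the radii and velocities are invented toy inputs, not the NGC 1560 data:

```python
# Sketch of a one-parameter MOND rotation curve model using the "simple"
# interpolation function mu(x) = x/(1+x). Toy inputs, not real data.
import numpy as np

A0 = 1.2e-10          # Milgrom's acceleration constant, m/s^2
KPC = 3.086e19        # meters per kiloparsec

def nu_simple(y):
    """Inverse of mu(x) = x/(1+x): g_obs = nu(g_N/a0) * g_N."""
    return 0.5 * (1.0 + np.sqrt(1.0 + 4.0 / y))

def v_mond(r_kpc, v_gas, v_star, ml_star):
    """MOND rotation speed (km/s) from Newtonian baryonic contributions.
    The stellar mass-to-light ratio ml_star is the ONLY free parameter."""
    v_bar_sq = v_gas**2 + ml_star * v_star**2       # baryons, (km/s)^2
    g_N = v_bar_sq * 1.0e6 / (r_kpc * KPC)          # Newtonian accel., m/s^2
    g_obs = nu_simple(g_N / A0) * g_N               # the MOND boost
    return np.sqrt(g_obs * r_kpc * KPC) / 1.0e3     # back to km/s

# Toy gas-rich dwarf: the gas term dominates, so ml_star barely matters.
r = np.array([1.0, 2.0, 4.0, 6.0, 8.0])            # kpc
v_gas = np.array([20.0, 30.0, 35.0, 36.0, 35.0])   # km/s
v_star = np.array([15.0, 18.0, 16.0, 14.0, 12.0])  # km/s
print(np.round(v_mond(r, v_gas, v_star, ml_star=0.5), 1))
```

Every wiggle in the baryonic inputs propagates directly into the prediction, which is why features in the baryon distribution must appear in the rotation curve.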

The mass-to-light ratio from MOND fits (points) in the blue (left panel) and near-infrared (right panel) pass-bands plotted against galaxy color (blue to the left, red to the right). From the perspective of stellar populations, one expects more scatter and a steeper color dependence in the blue band, as observed. The lines are stellar population models from Bell et al. (2003). These are completely independent, and have not been fit to the data in any way. One could hardly hope for better astrophysical agreement.


One can also fit rotation curve data with dark matter halos. These require a minimum of three parameters, compared to the one of MOND. In addition to M*/L, one also needs at least two parameters to describe the dark matter halo of each galaxy – typically some characteristic mass and radius. In practice, one finds that such fits are horribly degenerate: one cannot cleanly constrain all three parameters, much less recover a sensible distribution of M*/L. One cannot construct the plot above simply by asking the data what it wants, as one can with MOND.

The “disk-halo degeneracy” in dark matter halo fits to rotation curves has been much discussed in the literature. Obsessed over, dismissed, revived, and ultimately ignored without satisfactory understanding. Well, duh. This approach uses three parameters per galaxy when it takes only one to describe the data. Degeneracy between the excess fit parameters is inevitable.
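For contrast, here is a sketch of the competing dark matter model (standard NFW conventions; again invented toy code, not anyone’s actual fit machinery). Even after tying the virial radius to V200, each galaxy costs M*/L plus two halo parameters:

```python
# Sketch of the three-parameter dark matter alternative: M*/L plus an NFW
# halo described by V200 and concentration c. Standard NFW conventions.
import numpy as np

H0_KPC = 70.0 / 1000.0   # Hubble constant in km/s per kpc (assumed)

def v_nfw(r_kpc, v200, c):
    """Circular velocity (km/s) of an NFW halo; R200 follows from V200."""
    r200 = v200 / (10.0 * H0_KPC)          # kpc, since V200 = 10 H0 R200
    x = r_kpc / r200
    shape = np.log(1.0 + c * x) - c * x / (1.0 + c * x)
    norm = np.log(1.0 + c) - c / (1.0 + c)
    return v200 * np.sqrt(shape / (norm * x))

def v_total(r_kpc, v_gas, v_star, ml_star, v200, c):
    """Three free parameters per galaxy: ml_star, v200, and c."""
    return np.sqrt(v_gas**2 + ml_star * v_star**2
                   + v_nfw(r_kpc, v200, c)**2)
```

With two extra knobs per galaxy, many different (ml_star, v200, c) combinations produce nearly identical total curves: the degeneracy follows directly from the parameter count.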

From a probabilistic perspective, there is a huge volume of viable parameter space that could (and should) be occupied by galaxies composed of dark matter halos plus luminous galaxies. Two identical dark matter halos might host very different luminous galaxies, so would have rotation curves that differed with the baryonic component. Two similar looking galaxies might reside in rather different dark matter halos, again having rotation curves that differ.

The probabilistic volume in MOND is much smaller. Absolutely tiny by comparison. There is exactly one and only one thing each rotation curve can do: what the particular distribution of baryons in each galaxy says it should do. This is what we observe in Nature.

The only way ΛCDM can be reconciled with observations is by effectively, and very precisely, mimicking the behavior of MOND on galactic scales. There is a vast volume of parameter space that the rotation curves of galaxies could, in principle, inhabit. The naive expectation was exponential disks in NFW halos. Real galaxies don’t look like that. They look like MOND. Magically, out of the vast parameter space available to galaxies in the dark matter picture, they only ever pick the tiny sub-volume that very precisely mimics MOND.

The ratio of probabilities is huge. So many dark matter models are possible (and have been mooted over the years) that it is indefinably huge. The odds of observing MOND-like phenomenology in a ΛCDM universe are practically zero. This amounts to a practical falsification of dark matter.

I’ve never said dark matter is falsified, because I don’t think it is a falsifiable concept. It is like epicycles – you can always fudge it in some way. But at a practical level, it was falsified a long time ago.

That is not to say MOND has to be right. That would be falling into the same logical trap that says ΛCDM has to be right. Obviously, both have virtues that must be incorporated into whatever the final answer may be. There are some efforts in this direction, but by and large this is not how science is being conducted at present. The standard script is to privilege those data that conform most closely to our confirmation bias, and pour scorn on any contradictory narrative.

In my assessment, the probability of ultimate success through ignoring inconvenient data is practically zero. Unfortunately, that is the course upon which much of the field is currently set.


*There are of course exceptions: no data are perfect, so even the right theory will get it wrong once in a while. The goof rate for MOND fits is about what I expect: rare, but more frequent for lower quality data. Misfits are sufficiently rare that to obsess over them is to refuse to see the forest for a few outlying trees.

Here’s a residual plot of MOND fits. See the peak at right? That’s the forest. See the tiny tail to one side? That’s an outlying tree.

Residuals of MOND rotation curve fits from Famaey & McGaugh (2012).

The arrogance of ignorance

A colleague points out to me a recent preprint by Bertone & Tait titled A New Era in the Quest for Dark Matter. Most of the narrative is a conventionalist response to the failure of experimental dark matter searches, posing a legitimate question in this context. Where do we take it from here?

From Bloom County by Berkley Breathed.

There is one brief paragraph mentioning and dismissing the possibility that what we call the dark matter problem might instead be some form of new dynamics. This is completely pro forma, and I wouldn’t have given it a second thought had my colleague not griped to me about it. It contains the following gem:

“The success of these efforts however remained limited at most to rotation curves of galaxies, and it is today clear that the only way these theories can be reconciled with observations is by effectively, and very precisely, mimicking the behavior of cold dark matter on cosmological scales.”

This is enormously revealing about the sociological attitudes in the field. Specifically, the attitude common among particle physicists who work on dark matter. Now, that’s a lot of people, and there are many individual exceptions to the general attitude I’m about to describe. But these are exceedingly common themes, so let’s break it down.

There are two distinct issues packed into this one sentence. The first amounts to the oft-repeated talking point, “MOND fits rotation curves but does nothing else.”

“…these efforts however remained limited at most to rotation curves of galaxies…”

(emphasis added.) This is simply incorrect.

Nevertheless, this sentiment has been asserted so many times by so many otherwise reasonable and eminent scientists that the innocent bystander may be forgiven for thinking there is some truth to this statement. There is not. Indeed, it is a perfect example of the echo chamber effect – someone said it without checking their facts, then someone else repeated it, and so on until everyone knows this to be true. Everyone says so!

To be sure, I shared the same concern initially. Difference is, I did the fact checking. It surprised the bejeepers out of me to find that the vast majority of observations that we usually ascribe to dark matter could just as well be explained by MOND. Often better, and with less effort. This is not to say that MOND is always better, of course. But there is so much more to it that I’m not going to review it yet again here.

I’m shocked – SHOCKED – to find MOND going on in this universe!

If only there were some way for scientists to communicate. In writing. Preserved in archival journals. Reviews, even…

A Tale of Two Paradigms: the Mutual Incommensurability of ΛCDM and MOND

The Third Law of Galactic Rotation

Modified Newtonian Dynamics (MOND): Observational Phenomenology and Relativistic Extensions

Modified Newtonian Dynamics as an Alternative to Dark Matter

Testing the Hypothesis of Modified Dynamics with Low Surface Brightness Galaxies and Other Evidence

Testing the Dark Matter Hypothesis with Low Surface Brightness Galaxies and Other Evidence

or if those are too intimidating, review talks at conferences

Extended Theories of Gravity

The Trials and Tribulations of Modern Cosmology: The Good, the Bad, and the Just Plain Ugly

Observational Constraints on the Acceleration Discrepancy Problem

Dynamical Constraints on Disk Galaxy Formation

How Galaxies Don’t Form

ETC.

The world has many experts on dark matter. I am one of them. It has rather fewer experts on MOND. I happen to be one of those, because I made the effort to learn about it. Being an expert on dark matter does not make one an expert on MOND – it’s painful to realize you’re at the wrong peak in the Dunning-Kruger curve. Becoming an expert is hard and time consuming, so I appreciate why many people don’t want to invest their time that way – MOND is a fringe idea, after all. Or so I thought, until I bothered to learn about it. The more I learned, the more I realized it could not so easily be dismissed.

But MOND can easily be dismissed if you remain ignorant of it! This attitude is what I call the arrogance of ignorance. Many scientists who are experts on dark matter don’t know what MOND is really, or what all it does and does not do successfully. They don’t need to! (arrogance). It can’t possibly be true! (ignorance).

The result is a profoundly unscientific status quo. If you hear someone assert something to the effect that “MOND fits rotation curves and nothing else” then you know they simply don’t know what they’re talking about.

I haven’t yet broken down the second part of the statement above, but I’ve probably outraged enough people for one post. That’s OK – the shoe deserves to be on the other foot. Being outraged is what it is to be an astronomer listening to particle physicists opine about dark matter. Their general attitude is that astronomers can’t possibly have anything to teach them about the subject. Never mind that 100% of the evidence is astronomical in nature, and will remain so until we get a laboratory detection. Good luck with that.

Yes, Virginia, there is a Dark Matter

Virginia, your little friends are wrong. They have been affected by the skepticism of a skeptical age. They do not believe except they see. They think that nothing can be which is not comprehensible by their little minds. All minds, Virginia, whether they be men’s or children’s, are little. In this great universe of ours man is a mere insect, an ant, in his intellect, as compared with the boundless world about him, as measured by the intelligence capable of grasping the whole of truth and knowledge.

Yes, Virginia, there is a Dark Matter. It exists as certainly as squarks and sleptons and Higgsinos exist, and you know that they abound and give to your life its highest beauty and joy. Alas! how dreary would be the world if there were no Dark Matter. It would be as dreary as if there were no supersymmetry. There would be no childlike faith then, no papers, no grants to make tolerable this existence. We should have no enjoyment, except in observation and experiment. The eternal light with which childhood fills the world would be extinguished.

Not believe in Dark Matter! You might as well not believe in Dark Energy! You might get the DOE to hire men to watch in all the underground laboratories to catch Dark Matter, but even if they did not see Dark Matter coming down, what would that prove? Nobody sees Dark Matter, but that is no sign that there is no Dark Matter. The most real things in the world are those that neither children nor men can see. Did you ever see fairies dancing on the lawn? Of course not, but that’s no proof that they are not there. Nobody can conceive or imagine all the wonders there are unseen and unseeable in the world.

You may tear apart the baby’s rattle and see what makes the noise inside, but there is a veil covering the unseen world which not the best experiment, nor even the united efforts of all the keenest experiments ever conducted, could tear apart. Only faith, fancy, poetry, love, romance, can push aside that curtain and view and picture the supernal beauty and glory beyond. Is it all real? Ah, Virginia, in all this world there is nothing else real and abiding.

No Dark Matter! Thank God! It exists, and it exists forever. A thousand years from now, Virginia, nay, ten times ten thousand years from now, it will continue to make glad the coffers of science.

Paraphrased from the famous letter Yes, Virginia, there is a Santa Claus.

The kids are all right, but they can’t interpret a graph

I have not posted here in a while. This is mostly due to the fact that I have a job that is both engaging and demanding. I started this blog as a way to blow off steam, but I realized this mostly meant ranting about those fools at the academy! of whom there are indeed plenty. These are reality based rants, but I’ve got better things to do.

As it happens, I’ve come down with a bug that keeps me at home but leaves just enough energy to read and type, but little else. This is an excellent recipe for inciting a rant. Reading the Washington Post article on delayed gratification in children brings it on.

It is not really the article that gets me, let alone the scholarly paper on which it is based. I have not read the latter, and have no intention of doing so. I hope its author has thought through the interpretation better than is implied by what I see in the WaPo article. That is easy for me to believe; my own experience is that what academics say to the press has little to do with what eventually appears in the press – sometimes even inverting its meaning outright. (At one point I was quoted as saying that dark matter experimentalists should give up, when what I had said was that it was important to pursue these experiments to their logical conclusion, but that we also needed to think about what would constitute a logical conclusion if dark matter remains undetected.)

So I am at pains to say that my ire is not directed at the published academic article. In this case it isn’t even directed at the article in the WaPo, regardless of whether it is a fair representation of the academic work or not. My ire is directed entirely at the interpretation of a single graph, which I am going to eviscerate.

The graph in question shows the delay time measured in psychology experiments over the years. It is an attempt to measure self-control in children. When presented with a marshmallow but told they may have two marshmallows if they wait for it, how long can they hold out? This delayed gratification is thought to be a measure of self-control that correlates positively with all manner of subsequent development. Which may indeed be true. But what can we learn from this particular graph?

[Graph: measured delay times plotted against the date of each experiment.]

The graph plots the time delay measured from different experiments against the date of the experiment. Every point (plotted as a marshmallow – cute! I don’t object to that) represents an average over many children tested at that time. Apparently they have been “corrected” to account for the age of the children (one gets better at delayed gratification as one matures) which is certainly necessary, but it also raises a flag. How was the correction made? Such details can matter.

However, my primary concern is more basic. Do the data, as shown, actually demonstrate a trend?

To answer this question for yourself, the first thing you have to be able to do is mentally remove the line. That big black bold line that so nicely connects the dots. Perhaps it is a legitimate statistical fit of some sort. Or perhaps it is boldface to [mis]guide the eye. Doesn’t matter. Ignore it. Look at the data.

The first thing I notice about the data is the outliers – in this case, 3 points at very high delay times. These do not follow the advertised trend, or any trend. Indeed, they seem in no way related to the other data. It is as if a different experiment had been conducted.

When confronted with outlying data, one has a couple of choices. If we accept that these data are correct and from the same experiment, then there is no trend: the time of delayed gratification could be pretty much anything from a minute to half an hour. However, the rest of the data do clump together, so the other option is that these outliers are not really representing the same thing as the rest of the data, and should be ignored, or at least treated with less weight.

The outliers may be the most striking part of the data set, but they are usually the least important. There are all sorts of statistical measures by which to deal with them. I do not know which, if any, have been applied. There are no error bars, no boxes representing quartiles or some other percentage spanned by the data each point represents. Just marshmallows. Now I’m a little grumpy about the cutesy marshmallows. All marshmallows are portrayed as equal, but are some marshmallows more equal than others? This graph provides no information on this critical point.

In the absence of any knowledge about the accuracy of each marshmallow, one is forced to use one’s brain. This is called judgement. This can be good or bad. It is possible to train the brain to be a good judge of these things – a skill that seems to be in decline these days.

What I see in the data are several clumps of points (disregarding the outliers). In the past decade there are over a dozen points all clumped together around an average of 8 minutes. That seems like a pretty consistent measure of the delayed gratification of the current generation of children.

Before 2007, the data are more sparse. There are half a dozen points on either side of 1997. These have a similar average of 7 or 8 minutes.

Before that, there are very few data. What there is goes back to the sixties. One could choose to see that as two clumps of three points, or one clump of six points. If one does the latter, the mean is around 5 minutes. So we had a “trend” of 5 minutes circa 1970, 7 minutes circa 1997, and 8 minutes circa 2010. That is an increase over time, but it is also a tiny trend – much less persuasive than the heavy solid line in the graph implies.

If we treat the two clumps of three separately – as I think we should, since they sit well apart from each other – then we have to choose which to believe. They aren’t consistent. The delay time in 1968 looks to have an average of two minutes; in 1970 it looks to be 8 minutes. So which is it?

According to the line in the graph, we should believe the 1968 data and not the 1970 data. That is, the 1968 data fall nicely on the line, while the 1970 data fall well off it. In percentage terms, the 1970 data are as far from the trend as the highest 2010 point that we rejected as an outlier.

When fitting a line, the slope can be strongly influenced by the points at its ends. In this case, the earliest and the latest data. The latest data seem pretty consistent, but the earliest data are split. So the slope depends entirely on which clump of three early points you choose to believe.

If we choose to believe the 1970 clump, then the “trend” becomes 8 minutes in 1970, 7 minutes in 1997, 8 minutes in 2010. Which is to say, no trend at all. Try disregarding the first three (1968) points and draw your own line on this graph. Without them, it is pretty flat. In the absence of error bars and credible statistics, I would conclude that there is no meaningful trend present in the data at all. Maybe a formal fit gives a non-zero slope, but I find it hard to believe it is meaningfully non-zero.
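A toy version of that exercise, with invented numbers standing in for the plotted clumps (not the study’s actual data), makes the point starkly:

```python
# Toy illustration: the fitted slope lives or dies with the 1968 clump.
# Numbers invented to mimic the clumps described above; not real data.
import numpy as np

years = np.array([1968, 1968, 1968, 1970, 1970, 1970,
                  1996, 1997, 1998, 2009, 2010, 2011])
delay = np.array([2.0, 2.5, 3.0, 8.0, 7.5, 8.5,
                  7.0, 8.0, 7.5, 8.0, 8.5, 7.5])      # minutes

slope_all = np.polyfit(years, delay, 1)[0]
keep = years != 1968                                   # drop the 1968 clump
slope_cut = np.polyfit(years[keep], delay[keep], 1)[0]

print(f"slope with 1968 points:    {slope_all * 10:+.2f} min/decade")
print(f"slope without 1968 points: {slope_cut * 10:+.2f} min/decade")
```

With the 1968 clump included, the fit returns a healthy positive slope; drop those three points and the slope is consistent with zero.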

None of this happens in a vacuum. Let’s step back and apply some external knowledge. Have people changed over the 5 decades of my life?

The contention of the WaPo article is that they have. Specifically, contrary to the perception that iPhones and video games have created a generation with a cripplingly short attention span (congrats if you made it this far!), in fact the data show the opposite. The ability of children to delay gratification has improved over the time these experiments have been conducted.

What does the claimed trend imply? If we take it literally, then extrapolating back in time, the delay time goes to zero around 1917. People in the past must have been completely incapable of delaying gratification for even an instant. This was a power our species only developed in the past century.

I hope that sounds implausible. If there is no trend, which is what the data actually show, then children a half century ago were much the same as children a generation ago are much the same as the children of today. So the more conservative interpretation of the graph would be that human nature is rather invariant, at least as indicated by the measure of delayed gratification in children.

Sadly, null results are dull. There may well be a published study reporting no trend, but it doesn’t get picked up by the Washington Post. Imagine the headline: “Children today are much the same as they’ve always been!” Who’s gonna click on that? In this fashion, even reputable news sources contribute to the scourge of misleading science and fake news that currently pollutes our public discourse.

They expect results!

This sort of over-interpretation of weak trends is rife in many fields. My own, for example. This is why I’m good at spotting them. Fortunately, screwing up in Astronomy seldom threatens life and limb.

Then there is Medicine. My mother was a medical librarian; I occasionally browsed their journals when waiting for her at work. Graphs for the efficacy of treatments that looked like the marshmallow graph were very common. Which is to say, no effect was in evidence, but it was often portrayed as a positive trend. They seem to be getting better lately (which is to say, at some point in the not distant past some medical researchers were exposed to basic statistics), but there is an obvious pressure to provide a treatment, even if the effect of the available course of treatment is tiny. Couple that to the aggressive marketing of drugs in the US, and it would not surprise me if many drugs have been prescribed based on efficacy trends weaker than seen in the marshmallow graph. See! There is a line with a positive slope! It must be doing some good!

Another problem with data interpretation lies in the corrections applied. In the case of marshmallows, one must correct for the age of the subject: an eight year old can usually hold out longer than a toddler. No doubt there are other corrections. The way these are usually made is to fit some sort of function to whatever trend is seen with age in a particular experiment. While that trend may be real, it also has scatter (I’ve known eight year olds who couldn’t out-wait a toddler), which makes it dodgy to apply. Do all experiments see the same trend? Is it safe to apply the same correction to all of them? Worse, it is often necessary to extrapolate these corrections beyond where they are constrained by data. This is known to be dangerous, as the correction can become overlarge upon extrapolation.

It would not surprise me if the abnormally low points around 1968 were over-corrected in some way. But then, it was the sixties. Children may not have changed much since then, but the practice of psychology certainly has. Let’s consider the implications that has for comparing 1968 data to 2017 data.

The sixties were a good time for psychological research. The field had grown enormously since the time of Freud and was widely respected. However, this was also the time when many experimental psychologists thought psychotropic drugs were a good idea. Influential people praised the virtues of LSD.

My father was a grad student in psychology in the sixties. He worked with swans. One group of hatchlings imprinted on him. When they grew up, they thought they should mate with people – that’s what their mom looked like, after all. So they’d make aggressive displays towards any person (they could not distinguish human gender) who ventured too close.

He related the anecdote of a colleague who became interested in the effect of LSD on animals. The field was so respected at the time that this chap was able to talk the local zoo into letting him inject an elephant with LSD. What could go wrong?

Perhaps you’ve heard the expression “That would have killed a horse! Fortunately, you’re not a horse.” Well, the fellow in question figured elephants were a lot bigger than people. So he scaled up the dose by the ratio of body mass. Not, say, the ratio of brain size, or whatever aspect of the metabolism deals with LSD.

That’s enough LSD to kill an elephant.

Sad as that was for the elephant, who is reputed to have been struck dead pretty much instantly – no tripping rampage preceded its demise – my point here is that these were the same people conducting the experiments in 1968. Standards were a little different. The difference seen in the graph may have more to do with differences in the field than with differences in the subjects.

That is not to say we should simply disregard old data. The date on which an observation is made has no bearing on its reliability. The practice of the field at that time does.

The 1968 delay times are absurdly low. All three are under four minutes. Such low delay times are not reproduced in any of the subsequent experiments. They would be more credible if the same result were even occasionally reproduced. It ain’t.

Another way to look at this is that there should be a comparable number of outliers on either side of the correct trend. That isn’t necessarily true – sometimes systematic errors push in a single direction – but in the absence of knowledge of such effects, one would expect outliers on both the high side and the low side.

In the marshmallow graph, with the trend as drawn, there are lots of outliers on the high side. There are none on the low side. [By outlier, I mean points well away from the trend, not just scattered a little to one side or the other.]

If instead we draw a flat line at 7 or 8 minutes, then there are three outliers on both sides. The three very high points, and the three very low points, which happen to occur around 1968. It is entirely because the three outliers on the low side happen at the earliest time that we get even the hint of a trend. Spread them out, and they would immediately be dismissed as outliers – which is probably what they are. Without them, there is no significant trend. This would be the more conservative interpretation of the marshmallow graph.

Perhaps those kids in 1968 were different in other ways. The experiments were presumably conducted in psychology departments on university campuses in the late sixties. It was OK to smoke inside back then, and not everybody restricted themselves to tobacco in those days. Who knows how much second hand marijuana smoke was inhaled just getting to the test site? I jest, but the 1968 numbers might just measure the impact on delayed gratification when the subject gets the munchies.

Marshmallows.


Solution Aversion

I have had the misfortune to encounter many terms for psychological dysfunction in many venues. Cognitive dissonance, confirmation bias, the Dunning-Kruger effect – I have witnessed them all, all too often, both in the context of science and elsewhere. Those of us who are trained as scientists are still human: though we fancy ourselves immune, we are still subject to the same cognitive foibles as everyone else. Generally our training only suffices to get us past the oft-repeated ones.

Solution aversion is the knee-jerk reaction we have to deny the legitimacy of a problem when we don’t like the solution admitting said problem would entail. An obvious example in the modern era is climate change. People who deny the existence of this problem are usually averse to its solution.

Let me give an example from my own experience. To give some context requires some circuitous story-telling. We’ll start with climate change, but eventually get to cosmology.

Recently I encountered a lot of yakking on social media about an encounter between Bill Nye (the science guy) and Will Happer in a dispute about climate change. The basic gist of most of the posts was that of people (mostly scientists, mostly young enough to have watched Bill Nye growing up) cheering on Nye as he “eviscerated” Happer’s denialism. I did not watch any of the exchange, so I cannot evaluate the relative merits of their arguments. However, there is a more important issue at stake here: credibility.

Bill Nye has done wonderful work promoting science. Younger scientists often seem to revere him as a sort of Mr. Rogers of science. Which is great. But he is a science-themed entertainer, not an actual scientist. His show demonstrates basic, well known phenomena at a really, well, juvenile level. That’s a good thing – it clearly helped motivate a lot of talented people to become scientists. But recapitulating well-known results is very different from doing the cutting edge science that establishes new results that will become the fodder of future textbooks.

Will Happer is a serious scientist. He has made numerous fundamental contributions to physics. For example, he pointed out that the sodium layer in the upper atmosphere could be excited by a laser to create artificial guide stars for adaptive optics, enabling ground-based telescopes to achieve resolutions comparable to that of the Hubble space telescope. I suspect his work for the JASON advisory group led to the implementation of adaptive optics on Air Force telescopes long before us astronomers were doing it. (This is speculation on my part: I wouldn’t know; it’s classified.)

My point is that, contrary to the wishful thinking on social media, Nye has no more standing to debate Happer than Mickey Mouse has to debate Einstein. Nye, like Mickey Mouse, is an entertainer. Einstein is a scientist. If you think that comparison is extreme, that’s because there aren’t many famous scientists whose name I can expect everyone to know. A better analogy might be comparing Jon Hirschtick (a successful mechanical engineer, Nye’s field) to I.I. Rabi (a prominent atomic physicist like Happer), but you’re less likely to know who those people are. Most serious scientists do not cultivate public fame, and the modern examples I can think of all gave up doing real science for the limelight of their roles as science entertainers.

Another important contribution Happer made was to the study and technology of spin polarized nuclei. If you place an alkali element and a noble gas together in vapor, they may form weak van der Waals molecules. An alkali is basically a noble gas with a spare electron, so the two can become loosely bound, sharing the unwanted electron between them. It turns out – as Happer found and explained – that the wavefunction of the spare electron overlaps with the nucleus of the noble gas atom. By spin polarizing the electron through the well known process of optical pumping with a laser, it is possible to transfer the spin polarization to the nucleus. In this way, one can create large quantities of polarized nuclei, an amazing feat. This has found use in medical imaging technology. Noble gases are chemically inert, so safe to inhale. By doing so, one can light up lung tissue that is otherwise invisible to MRI and other imaging technologies.

I know this because I worked on it with Happer in the mid-80s. I was a first year graduate student in physics at Princeton where he was a professor. I did not appreciate the importance of what we were doing at the time. Will was a nice guy, but he was also my boss and though I respected him I did not much like him. I was a high-strung, highly stressed, 21 year old graduate student displaced from friends and familiar settings, so he may not have liked me much, or simply despaired of me amounting to anything. Mostly I blame the toxic arrogance of the physics department we were both in – Princeton is very much the Slytherin of science schools.

In this environment, there weren’t many opportunities for unguarded conversations. I do vividly recall some of the few that happened. In one instance, we had heard a talk about the potential for industrial activity to add enough carbon dioxide to the atmosphere to cause an imbalance in the climate. This was 1986, and it was the first I had heard of what is now commonly referred to as climate change. I was skeptical, and asked Will’s opinion. I was surprised by the sudden vehemence of his reaction:

“We can’t turn off the wheels of industry, and go back to living like cavemen.”

I hadn’t suggested any such thing. I don’t even recall expressing support for the speaker’s contention. In retrospect, this is a crystal clear example of solution aversion in action. Will is a brilliant guy. He leapt ahead of the problem at hand to see the solution being a future he did not want. Rejecting that unacceptable solution became intimately tied, psychologically, to the problem itself. This attitude has persisted to the present day, and Happer is now known as one of the most prominent scientists who is also a climate change denier.

Being brilliant never makes us foolproof against being wrong. If anything, it sets us up for making mistakes of enormous magnitude.

There is a difference between the problem and the solution. Before we debate the solution, we must first agree on the problem. That should, ideally, be done dispassionately and without reference to the solutions that might stem from it. Only after we agree on the problem can we hope to find a fitting solution.

In the case of climate change, it might be that we decide the problem is not so large as to require drastic action. Or we might hope that we can gradually wean ourselves away from fossil fuels. That is easier said than done, as many people do not seem to appreciate the magnitude of the energy budget that needs replacing. But does that mean we shouldn’t even try? That seems to be the psychological result of solution aversion.

Either way, we have to agree and accept that there is a problem before we can legitimately decide what to do about it. Which brings me back to cosmology. I did promise you a circuitous bit of story-telling.

Happer’s is just the first example I encountered of a brilliant person coming to a dubious conclusion because of solution aversion. I have had many colleagues who work on cosmology and galaxy formation say straight out to me that they would only consider MOND “as a last resort.” This is a glaring, if understandable, example of solution aversion. We don’t like MOND, so we’re only willing to consider it when all other options have failed.

I hope it is obvious from the above that this attitude is not a healthy one in science. In cosmology, it is doubly bad. Just when, exactly, do we reach the last resort?

We’ve already accepted that the universe is full of dark matter, some invisible form of mass that interacts gravitationally but not otherwise, has no place in the ridiculously well tested Standard Model of particle physics, and has yet to leave a single shred of credible evidence in dozens of super-sensitive laboratory experiments. On top of that, we’ve accepted that there is also a distinct dark energy that acts like antigravity to drive the apparent acceleration of the expansion rate of the universe, conserving energy by the magic trick of a sign error in the equation of state that any earlier generation of physicists would have immediately rejected as obviously unphysical. In accepting these dark denizens of cosmology we have granted ourselves essentially infinite freedom to fine-tune any solution that strikes our fancy. Just what could possibly constitute the last resort of that?

When you have a supercomputer, every problem looks like a simulation in need of more parameters.

Being a brilliant scientist never precludes one from being wrong. At best, it lengthens the odds. All too often, it leads to a dangerous hubris: we’re so convinced by, and enamored of, our elaborate and beautiful theories that we see only the successes and turn a blind eye to the failures, or in true partisan fashion, try to paint them as successes. We can’t have a sensible discussion about what might be right until we’re willing to admit – seriously, deep-down-in-our-souls admit – that maybe ΛCDM is wrong.

I fear the field has gone beyond that, and is fissioning into multiple, distinct branches of science that use the same words to mean different things. Already “dark matter” means something different to particle physicists and astronomers, though they don’t usually realize it. Soon our languages may become unrecognizable dialects to one another; already communication across disciplinary boundaries is strained. I think Kuhn noted something about different scientists not recognizing what other scientists were doing as science, nor regarding the same evidence in the same way. Certainly we’ve got that far already, as successful predictions of the “other” theory are dismissed as so much fake news in a world unhinged from reality.

Critical Examination of the Impossible

It has been proposal season for the Hubble Space Telescope, so many astronomers have been busy with that. I am no exception. Talking to others, it is clear that there remain many more excellent Hubble projects than available observing time.

So I haven’t written here for a bit, and I have other tasks to get on with. I did get requests for a report on the last conference I went to, Beyond WIMPs: from Theory to Detection. They have posted video from the talks, so anyone who is interested may watch.

I think this is the worst talk I’ve given in 20 years. Maybe more. Made the classic mistake of trying to give the talk the organizers asked for rather than the one I wanted to give. Conference organizers mean well, but they usually only have a vague idea of what they imagine you’ll say. You should always ignore that and say what you think is important.

When speaking or writing, there are three rules: audience, audience, audience. I was unclear what the audience would be when I wrote the talk, and it turns out there were at least four identifiably distinct audiences in attendance. There were skeptics – particle physicists who were concerned with the state of their field and that of cosmology, there were the faithful – particle physicists who were not in the least concerned about this state of affairs, there were the innocent – grad students with little to no background in astronomy, and there were experts – astroparticle physicists who have a deep but rather narrow knowledge of relevant astronomical data. I don’t think it would have been possible to address the assigned topic (a “Critical Examination of the Existence of Dark Matter”) in a way that satisfied all of these distinct audiences, and certainly not in the time allotted (or even in an entire semester).

It is tempting to give an interruption-by-interruption breakdown of the sociology, but you may judge that for yourselves. The one thing I got right was what I said at the outset: Attitude Matters. You can see that on display throughout.

IMG_5460
This comic has been hanging on a colleague’s door for decades.

In science as in all matters, if you come to a problem sure that you already know the answer, you will leave with that conviction. No data nor argument will shake your faith. Only you can open your own mind.

Hubble constant redux

There is a new article in Science on the expansion rate of the universe, very much along the lines of my recent post. It is a good read that I recommend. It includes some of the human elements that influence the science.

When I started this blog, I recalled my experience in the ’80s moving from a theory-infused institution to a more observationally and empirically oriented one. At that time, the theory-infused cosmologists assured us that Sandage had to be correct: H0 = 50. As a young student, I bought into this. Big time. I had no reason not to; I was very certain of the transmitted lore. The reasons to believe it then seemed every bit as convincing as the reasons to believe ΛCDM today. When I encountered people actually making the measurement, like Greg Bothun, they said “looks to be about 80.”

This caused me a lot of cognitive dissonance. This couldn’t be true. The universe would be too young (at most ∼12 Gyr) to contain the oldest stars (thought to be ∼18 Gyr at that time). Worse, there was no way to reconcile this with Inflation, which demanded Ωm = 1. The large deceleration of the expansion caused by high Ωm greatly exacerbated the age problem (only ∼8 Gyr accounting for deceleration). Reconciling the age problem with Ωm = 1 was hard enough without raising the Hubble constant.
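
To spell out the arithmetic: a coasting (empty) universe has age 1/H0, and the deceleration of an Einstein-de Sitter universe reduces that to 2/(3H0). A quick sketch of the numbers quoted above, using only these textbook formulas:

```python
# Age of the universe for H0 = 80 km/s/Mpc under two textbook limits.
# 1/H0 in Gyr: one Mpc is 3.086e19 km and one Gyr is 3.156e16 s,
# so 1/H0 [Gyr] = 977.8 / H0 [km/s/Mpc].
H0 = 80.0                            # km/s/Mpc, the "looks to be about 80" value
hubble_time = 977.8 / H0             # coasting-universe age, in Gyr
eds_age = (2.0 / 3.0) * hubble_time  # Einstein-de Sitter (Omega_m = 1) age

print(f"1/H0            = {hubble_time:.1f} Gyr")  # ~12.2 Gyr
print(f"EdS age 2/(3H0) = {eds_age:.1f} Gyr")      # ~8.1 Gyr: far younger than
                                                   # the oldest stars (~18 Gyr
                                                   # as then estimated)
```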

Presented with this dissonant information, I did what most of us humans do: I ignored it. Some of my first work involved computing the luminosity function of quasars. With the huge distance scale of H0 = 50, I remember noticing how more distant quasars got progressively brighter. By a lot. Yes, they’re the most luminous things in the early universe. But they weren’t just outshining a galaxy’s worth of stars; they were outshining a galaxy of galaxies.

That was a clue that the metric I was assuming was very wrong. And indeed, since that time, every number of cosmological significance that I was assured in confident tones by Great Men that I Had to Believe has changed by far more than its formal uncertainty. In struggling with this, I’ve learned not to be so presumptuous in my beliefs. The universe is there for us to explore and discover. We inevitably err when we try to dictate how it Must Be.

The amplitude of the discrepancy in the Hubble constant is smaller now, but the same attitudes are playing out. Individual attitudes vary, of course, but there are many in the cosmological community who take the attitude that the Planck data give H0 = 67.8 so that is the right number. All other data are irrelevant; or at best flawed until brought into concordance with the right number.

It is Known, Khaleesi. 

Often these are the same people who assured us we had to believe Ωm = 1 and H0 = 50 back in the day. This continues the tradition of arrogance about how things must be. This attitude remains rampant in cosmology, and is absorbed by new generations of students just as it was by me. They’re very certain of the transmitted lore. I’ve even been trolled by some who seem particularly eager to repeat the mistakes of the past.

From hard experience, I would advocate a little humility. Yes, Virginia, there is a real tension in the Hubble constant. And yes, it remains quite possible that essential elements of our cosmology may prove to be wrong. I personally have no doubt about the empirical pillars of the Big Bang – cosmic expansion, Big Bang Nucleosynthesis, and the primordial nature of the Cosmic Microwave Background. But Dark Matter and Dark Energy may well turn out to be mere proxies for some deeper cosmic truth. If that is so, we will never recognize it if we proceed with the attitude that ΛCDM is Known, Khaleesi.

Ode to Vera

Vera Rubin passed away a few weeks ago. This was not surprising: she had lived a long, positive, and fruitful life, but had faced the usual health problems of those of us who make it to the upper 80s. Though news of her death was not surprising, it was deeply saddening. It affected me more than I had anticipated, even armed with the intellectual awareness that the inevitable must be approaching. It saddens me again now trying to write this, which must inevitably be an inadequate tribute.

In the days after Vera Rubin passed away, I received a number of inquiries from the press asking me to comment on her life and work for their various programs. I did not respond. I guess I understand the need to recognize and remark on the passing of a great scientist and human being, and I’m glad the press did in fact acknowledge her many accomplishments. But I wondered if, by responding, I would be providing a tribute to Vera, or merely feeding the needs of the never-ending hyperactive news cycle. Both, I guess. At any rate, I did not feel it was my place to comment. It did not seem right to air my voice where hers would never be heard again.

I knew Vera reasonably well, but there are plenty who knew her better and were her colleagues over a longer period of time. Also, at the back of my mind, I was a tiny bit afraid that no matter what I said, someone would read into it some sort of personal scientific agenda. My reticence did not preclude other scientists who knew her considerably less well from doing exactly that. Perhaps it is unavoidable: to speak of others, one must still use one’s own voice, and that inevitably is colored by our own perspective. I mention this because many of the things recently written about Vera do not do justice to her scientific opinions as I know them from conversations with her. This is important, because Vera was all about the science.

One thing I distinctly remember her saying to me – and I’m sure she repeated this advice to many other junior scientists – was that you had to do science because you had a need to Know. It was not something to be done for awards or professional advancement; you could not expect any sort of acknowledgement and would likely be disappointed if you did. You had to do it because you wanted to find out how things work, to have even a brief moment when you felt like you understood some tiny fraction of the wonders of the universe.

Despite this attitude, Vera was very well rewarded for her science. It came late in her career – she did devote a lot of energy to raising a large family; she and her husband Bob Rubin were true life partners in the ideal sense of the term: family came first, and they always supported each other. It was deeply saddening when Bob passed, and another blow to science when their daughter Judy passed away all too early. We all die, sometimes sooner rather than later, but few of us take it well.

Professionally, Vera was all about the science. Work was like breathing. Something you just did; doing it was its own reward. Vera always seemed to take great joy in it. Success, in terms of awards, came late, but it did come, and in many prestigious forms – membership in the National Academy of Sciences, the Gold Medal of the Royal Astronomical Society, and the National Medal of Science, to name a few of her well-deserved honors. Much has been made of the fact that this list does not include a Nobel Prize, but I never heard Vera express disappointment about that, or even aspiration to it. Quite the contrary: like most modest people, she didn’t seem to consider herself an appropriate candidate. I think part of the reason for this was that she self-identified as an astronomer, not as a physicist (as some publications misreport). That distinction is worthy of an entire post so I’ll leave it for now.

Astronomer though she was, her work certainly had an outsized impact on physics. I have written before as to why she was deserving of a Nobel Prize, if for slightly different reasons than others give. But I do not believe that she died in any way disappointed by the lack of a Nobel Prize. It was not her nature to fret about such things.

Nevertheless, Vera was an obvious scientist to recognize with a Nobel Prize. No knowledgeable scientist would have disputed her as a choice. And yet the history of the physics Nobel Prize is incredibly lacking in female laureates. Only two women have been recognized in the entire history of the award: Marie Curie (1903) and Maria Goeppert-Mayer (1963). It is hard to avoid the conclusion that the awarding of the prize is inherently sexist. Based on two data points, it has become more sexist over time, as there is a longer gap between now and the last award to a woman (63 years) than between the two awards (60 years).

Why should gender play any role in the search for knowledge? Or the recognition of discoveries made in that search? And yet women scientists face antiquated attitudes and absurd barriers all the time. Not just in the past. Now.

Vera was always a strong advocate of women in science. She has been an inspiration to many. A Nobel prize awarded to Vera Rubin would have been great for her, yes, but the greater tragedy of this missed opportunity is what it would have meant to all the women who are scientists now and who will be in the future.

Well, those are meta-issues raised by Vera’s passing. I don’t think it is inappropriate to raise them, because these were issues dear to her heart. I know the world is a better place for her efforts. But I hadn’t intended to go off on meta-tangents. Vera was a very real, warm, positive human being. So what I had meant to do was recollect a few personal anecdotes. These seem so inadequate: brief snippets in a long and expansive life. Worse, they are my memories, so I can’t see how to avoid making it at least somewhat about me when it should be entirely about her. Still. Here are a few of the memories I have of her.

I first met Vera in 1985 on Kitt Peak. In retrospect I can’t imagine a more appropriate setting. But at the time it was only my second observing run, and I had no clue as to what was normal or particularly who Vera Rubin was. She was just another astronomer at the dinner table before a night of observing.

A very curious astronomer. She kindly asked what I was working on, and followed up with a series of perceptive questions. She really wanted to know. Others have remarked on her ability to make junior people feel important, and she could indeed do that. But I don’t think she tried, in particular. She was just genuinely curious.

At the time, I was a senior about to graduate from MIT. I had to beg permission to take some finals late so I could attend this observing run. My advisor, X-ray astronomer George Whipple Clark, kindly bragged about how I had actually got my thesis in on time (most students took advantage of a default one-week grace period) in order to travel to Kitt Peak. Vera, ever curious, asked about my thesis, what galaxies were involved, how the data were obtained… all had been from a run the semester before. As this became clear, Vera got this bemused look and asked “What kind of thesis can be written from a single observing run?” “A senior thesis!” I volunteered: undergraduate observers were rare on the mountain in those days; up till that point I think she had assumed I was a grad student.

I encountered Vera occasionally over the following years, but only in passing. In 1995, she offered me a Carnegie fellowship at DTM. This was a reprieve in a tight job market. As it happened, we were both visiting the Kapteyn Institute, and Renzo Sancisi had invited us both to dinner, so she took the opportunity to explain that their initial hire had moved on to a faculty position so the fellowship was open again. She managed to do this without making me feel like an also-ran. I had recently become interested in MOND, and here was the queen of dark matter offering me a job I desperately needed. It seemed right to warn her, so I did: would she have a problem with a postdoc who worked on MOND? She was visibly shocked, but only for an instant. “Of course not,” she said. “As a Carnegie Fellow, you can work on whatever you want.”

Vera was very supportive throughout my time at DTM, and afterwards. We had many positive scientific interactions, but we didn’t really work together then. I tried to get her interested in the rotation curves of low surface brightness galaxies, but she had a full plate. It wasn’t until a couple of years after I left DTM that we started collaborating.

fig3
Figure made by Vera Rubin from her measurements of the rotation curves of low surface brightness galaxies. Published in McGaugh, Rubin, & de Blok (2001).

Vera loved to measure. The reason I chose the picture featured at top is that it shows her doing what she loved. By the time we collaborated, she had moved on to using a computer to measure line positions for velocities. But that is what she loved to do. She did all the measurements for the rotation curves we measured, like the ones shown above. As the junior person, I had expected to do all that work, but she wanted to do it. Then she handed it on to me to write up, with no expectation of credit. It was like she was working for me as a postdoc. Vera Rubin was an awesome postdoc!

She also loved to observe. Mostly that was a typically positive, fruitful experience. But she did have an intense edge that rarely peeked out. One night on Las Campanas, the telescope broke. This is not unusual, and we took it in stride. For a half hour or so. Then Vera started calmly but assertively asking the staff why we were not yet back up and working. Something was very wrong, and it involved calling in extra technicians who led us into the mechanical bowels of the du Pont telescope, replete with steel cables and unidentifiable steam-punk looking artifacts. Vera watched them like a hawk. She never said a negative word. But she silently, intently watched them. Tension mounted; time slowed to a crawl until it seemed I could feel, like a hard rain, the impact of every photon that we weren’t collecting. She wanted those photons. She never said a negative word, but I’m sure the staff felt a wall of pressure that I was keenly aware of merely standing in its proximity. Perhaps like a field mouse under a raptor’s scrutiny.

Vera was not normally like that, but every good observer has in her that urgency to get on sky. This was the only time I saw it come out. Other typical instrumental snafus she bore in stride. This one took too long. But it did get fixed, and we were back on sky, and it was as if there had never been a problem in the world.

Ultimately, Vera loved the science. She was one of the most intrinsically curious souls I ever met. She wanted to know, to find out what was going on up there. But she was also content with what the universe chose to share, reveling in the little discoveries as much as the big ones. Why does the Hα emission extend so far out in UGC 2885? What is the kinematic major axis of DDO 154, anyway? Let’s put the slit in a few different positions and work it out. She kept a cheat sheet taped on her desk for how the rotation curve changed if the position angle were missed – which never happened, because she prepared so carefully for observing runs. She was both thorough and extremely good at what she did.

Vera was very positive about the discoveries of others. Like all good astronomers, she had a good BS detector. But she very rarely said a negative word. Rarely, not never. She was not a fan of Chandrasekhar, who was the editor of the ApJ when she submitted her dissertation paper there. Her advisor, Gamow, had posed the question to her: is there a length scale in the sky? Her answer would, in the modern parlance, be called the correlation length of galaxies. Chandrasekhar declined to consider publishing this work, explaining in a letter that he had a student working on the topic, and that she should wait for the right answer. The clear implication was that this was a man’s job, and the work of a woman was not to be trusted. Ultimately her work was published in the proceedings of the National Academy, of which Gamow was a member. He had predicted that this is how Chandrasekhar would behave, afterwards sending her a postcard saying only “Told you so.”

On another occasion, in the mid-90s when “standard” CDM meant SCDM with Ωm = 1, not ΛCDM, she confided to me in hushed tones that the dark matter had to be baryonic. Other eminent dynamicists have said the same thing to me at times, always in the same hushed tones, lest the cosmologists overhear. As well they might. To my ears this was an absurdity, and I know well the derision it would bring. What about Big Bang Nucleosynthesis? This was the only time I recall hearing Vera scoff. “If I told the theorists today that I could prove Ωm = 1, tomorrow they would explain that away.”

I was unconvinced. But it made clear to me that I put a lot of faith in Big Bang Nucleosynthesis, and this need not be true for all intelligent scientists. Vera – and the others I allude to, who still live so I won’t name – had good reasons for her assertion. She had already recognized that there was a connection between the baryon distribution and the dynamics of galaxies, and that this made a lot more sense if the dark and luminous component were closely related – for example, if the dark matter – or at least some important fraction of it in galaxies – were itself baryonic. Even if we believe in Big Bang Nucleosynthesis, we’re still missing a lot of baryons.

The proper interpretation of this evidence is still debated today. What I learned from this was to be more open to the possibility that things I thought I knew for sure might turn out to be wrong. After all, that pretty much sums up the history of cosmology.

It was widely reported that Vera discovered dark matter or “proved” or “confirmed” its existence. I don’t think Vera would agree with this assessment, nor would many of her colleagues at DTM. I know this because we talked about it. A lot.

To my mind, what Vera discovered is both more specific and more profound than the dark matter paradigm it helped to create. What she discovered observationally is that rotation curves are very nearly flat, and continue to be so to indefinitely large radius. Over and over again, for every galaxy in the sky. It is a law of nature for galaxies, akin to Kepler’s laws for planets. Dark matter is an inference, a subsidiary result. It is just one possible interpretation, a subset of amazing and seemingly unlikely possibilities opened up by her discovery.

The discovery itself is amazing enough without conflating it with dark matter or MOND or any other flavor of interpretation of which the reader might be fond. Like many great discoveries, it has many parents. I would give a lot of credit to Albert Bosma, but there are also others who had early results, like Mort Roberts and Seth Shostak. But it was Vera whose persistence overcame the knee-jerk conservatism of cosmologists like Sandage, who she said dismissed her early flat rotation curve of M31 (obtained in collaboration with Roberts) as “the effect of looking at a bright galaxy.” “What does that even mean?” she asked me rhetorically. She also recalled Jim Gunn gasping “But… that would mean most of the mass is dark!” Indeed. It takes time to wrap our heads around these things. She obtained rotation curve after rotation curve – well in excess of a hundred – to ensure we realized we had to do so.

Vera realized the interpretation was never as settled as the data. Her attitude (and that of many of us, including myself) is nicely summarized by her exchange with Tohline at the end of her 1982 talk at IAU 100. One starts with the most conservative – or at least, least outrageous – possibility, which at that time was a mere factor of two in hidden mass, which could easily have been baryonic. Yet much more recently, at the last conference I attended with her (in 2009), she reminded the audience (to some visible consternation) that it was still “early days” for dark matter, and we should not be surprised to be surprised – up to, and including, how gravity works.

At this juncture, I expect some readers will accuse me of what I warned about above: using this for my own agenda. I have found it is impossible to avoid having an agenda imputed to me by people who don’t like what they imagine my agenda to be, whether they imagine right or not – usually not. But I can’t not say these things if I want to set the record straight – these were Vera’s words. She remained concerned all along that it might be gravity to blame rather than dark matter. Not convinced, nor even giving either the benefit of the doubt. There was, and remains, so much to figure out.

“Early days.”

I suppose, in the telling, it is often more interesting to relate matters of conflict and disagreement than feelings of goodwill. In that regard, some of the above anecdotes are atypical: Vera was a very positive person. It just isn’t compelling to relate episodes like her gushing praise for Rodrigo Ibata’s discovery of the Sagittarius dwarf satellite galaxy. I probably only remember that myself because I had, like Rodrigo, encountered considerable difficulty in convincing some at Cambridge that there could be lots of undiscovered low surface brightness galaxies out there, even in the Local Group. Some of these same people now seem to take for granted that there are a lot more in the Local Group than I find plausible.

I have been fortunate in my life to have known many talented scientists. I have met many people from many nations, most of them warm, wonderful human beings. Vera was the best of the best, both as a scientist and as a human being. The world is a better place for having had her in it, for a time.

Crater 2: the Bullet Cluster of LCDM

Recently I have been complaining about the low standards to which science has sunk. It has become normal to be surprised by an observation, express doubt about the data, blame the observers, slowly let it sink in, bicker and argue for a while, construct an unsatisfactory model that sort-of, kind-of explains the surprising data but not really, call it natural, then pretend like that’s what we expected all along. This has been going on for so long that younger scientists might be forgiven if they think this is how science is supposed to work. It is not.

At the root of the scientific method is hypothesis testing through prediction and subsequent observation. Ideally, the prediction comes before the experiment. The highest standard is a prediction made before the fact in ignorance of the ultimate result. This is incontrovertibly superior to post-hoc fits and hand-waving explanations: it is how we’re supposed to avoid playing favorites.

I predicted the velocity dispersion of Crater 2 in advance of the observation, for both ΛCDM and MOND. The prediction for MOND is reasonably straightforward. That for ΛCDM is fraught. There is no agreed method by which to do this, and it may be that the real prediction is that this sort of thing is not possible to predict.

The reason it is difficult to predict the velocity dispersions of specific, individual dwarf satellite galaxies in ΛCDM is that the stellar mass-halo mass relation must be strongly non-linear to reconcile the steep mass function of dark matter sub-halos with their small observed numbers. This is closely related to the M*-Mhalo relation found by abundance matching. The consequence is that the luminosity of dwarf satellites can change a lot for tiny changes in halo mass.

apj374168f11_lr
Fig. 11 from Tollerud et al. (2011, ApJ, 726, 108). The width of the bands illustrates the minimal scatter expected between dark halo and measurable properties. A dwarf of a given luminosity could reside in dark halos differing by two decades in mass, with a corresponding effect on the velocity dispersion.

Long story short, the nominal expectation for ΛCDM is a lot of scatter. Photometrically identical dwarfs can live in halos with very different velocity dispersions. The trend between mass, luminosity, and velocity dispersion is so weak that it might barely be perceptible. The photometric data should not be predictive of the velocity dispersion.

It is hard to get even a ballpark answer that doesn’t make reference to other measurements. Empirically, there is some correlation between size and velocity dispersion. This “predicts” σ = 17 km/s. That is not a true theoretical prediction; it is just the application of data to anticipate other data.

Abundance matching relations provide a highly uncertain estimate. The first time I tried to do this, I got unphysical answers (σ = 0.1 km/s, which is less than the stars alone would cause without dark matter – about 0.5 km/s). The application of abundance matching requires extrapolation of fits to data at high mass to very low mass. Extrapolating the M*-Mhalo relation over many decades in mass is very sensitive to the low mass slope of the fitted relation, so it depends on which one you pick.
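
To see how touchy this extrapolation is, here is a minimal sketch using the double power-law form of Moster et al. (2013). The parameter values are illustrative placeholders – not the fits behind any of the numbers quoted here – and the point is only how much the inferred halo mass swings with the assumed low-mass slope:

```python
import math

def mstar(mh, N=0.0351, M1=10**11.59, beta=1.376, gamma=0.608):
    """Stellar mass-halo mass relation, Moster et al. (2013) functional
    form; parameter values are illustrative, not a published fit."""
    x = mh / M1
    return 2.0 * N * mh / (x**-beta + x**gamma)

def halo_mass(ms_target, beta, lo=1e7, hi=1e13):
    """Invert mstar(mh) for a given low-mass slope by bisection in log mass."""
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if mstar(mid, beta=beta) < ms_target:
            lo = mid
        else:
            hi = mid
    return mid

# A Crater 2-like stellar mass (~few x 10^5 Msun, assuming M/L ~ 2)
ms = 3e5
for beta in (1.0, 1.376, 1.7):   # plausible range of low-mass slopes
    print(f"beta = {beta}: Mhalo ~ {halo_mass(ms, beta):.2e} Msun")

# In the low-mass limit mstar ~ Mh^(1+beta), so the relation is very steep:
# half a decade in halo mass spans more than a decade in stellar mass, and
# the inferred halo mass shifts by factors of several from the slope alone.
```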

he-chose-poorly

Since my first pick did not work, let’s go with the value suggested to me by James Bullock: σ = 11 km/s. That is the mid-value (the blue lines in the figure above); the true value could easily scatter higher or lower. Very hard to predict with any precision. But given the luminosity and size of Crater 2, we expect numbers like 11 or 17 km/s.

The measured velocity dispersion is σ = 2.7 ± 0.3 km/s.

This is incredibly low. Shockingly so, considering the enormous size of the system (1 kpc half light radius). The NFW halos predicted by ΛCDM don’t do that.

To illustrate how far off this is, I have adapted this figure from Boylan-Kolchin et al. (2012).

mbkplusdwarfswcraterii
Fig. 1 of MNRAS, 422, 1203 illustrating the “too big to fail” problem: observed dwarfs have lower velocity dispersions than sub-halos that must exist and should host similar or even more luminous dwarfs that apparently do not exist. I have had to extend the range of the original graph to lower velocities in order to include Crater 2.

Basically, NFW halos, including the sub-halos imagined to host dwarf satellite galaxies, have rotation curves that rise rapidly and stay high in proportion to the cube root of the halo mass. This property makes it very challenging to explain a low velocity at a large radius: exactly the properties observed in Crater 2.
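
To make that concrete, here is a rough sketch of the NFW circular velocity at Crater 2’s half-light radius for a sub-halo in the commonly inferred mass range. The halo mass and concentration are illustrative round numbers, not a fit to anything:

```python
import math

G = 4.301e-6      # Newton's constant in kpc (km/s)^2 / Msun
RHO_CRIT = 136.0  # critical density in Msun / kpc^3, for H0 ~ 70 km/s/Mpc

def v_circ_nfw(r_kpc, m200, c):
    """Circular velocity of an NFW halo of mass M200 and concentration c."""
    r200 = (3.0 * m200 / (4.0 * math.pi * 200.0 * RHO_CRIT)) ** (1.0 / 3.0)
    rs = r200 / c
    mu = lambda x: math.log(1.0 + x) - x / (1.0 + x)  # NFW enclosed-mass shape
    m_enclosed = m200 * mu(r_kpc / rs) / mu(c)
    return math.sqrt(G * m_enclosed / r_kpc)

# Illustrative sub-halo: M200 = 1e9 Msun with concentration c = 15
v = v_circ_nfw(1.0, 1e9, 15.0)
print(f"V_c(1 kpc) ~ {v:.0f} km/s; sigma ~ V_c/sqrt(3) ~ {v / 3**0.5:.0f} km/s")
# ~17 km/s circular, ~10 km/s dispersion: far above the observed 2.7 km/s.
```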

Let’s not fail to appreciate how extremely wrong this is. The original version of the graph above stopped at 5 km/s. It didn’t extend to lower values because they were absurd. There was no reason to imagine that this would be possible. Indeed, the point of their paper was that the observed dwarf velocity dispersions were already too low. To get to lower velocity, you need an absurdly low mass sub-halo – around 10⁷ M☉. In contrast, the usual inference of masses for sub-halos containing dwarfs of similar luminosity is around 10⁹ M☉ to 10¹⁰ M☉. So the low observed velocity dispersion – especially at such a large radius – seems nigh on impossible.

More generally, there is no way in ΛCDM to predict the velocity dispersions of particular individual dwarfs. There is too much intrinsic scatter in the highly non-linear relation between luminosity and halo mass. Given the photometry, all we can say is “somewhere in this ballpark.” Making an object-specific prediction is impossible.

Except that it is possible. I did it. In advance.

The predicted velocity dispersion is σ = 2.1 +0.9/-0.6 km/s.

I’m an equal opportunity scientist. In addition to ΛCDM, I also considered MOND. The successful prediction is that of MOND. (The quoted uncertainty reflects the uncertainty in the stellar mass-to-light ratio.) The difference is that MOND makes a specific prediction for every individual object. And it comes true. Again.

MOND is a funny theory. The amplitude of the mass discrepancy it induces depends on how low the acceleration of a system is. If Crater 2 were off by itself in the middle of intergalactic space, MOND would predict it should have a velocity dispersion of about 4 km/s.
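
The isolated case is easy to check. For a spherical, isotropic system deep in the MOND regime, the virial relation gives σ⁴ = (4/81) G M a₀ (McGaugh & Milgrom 2013). A minimal sketch, assuming a ballpark luminosity of 1.6×10⁵ L☉ and stellar M/L ≈ 2 for Crater 2 – stand-in values, not necessarily the exact inputs of the published prediction:

```python
G = 4.301e-6   # kpc (km/s)^2 / Msun
A0 = 3700.0    # MOND scale a0 = 1.2e-10 m/s^2, converted to (km/s)^2 / kpc

def sigma_mond_isolated(m_star):
    """Deep-MOND velocity dispersion of an isolated spherical system:
    sigma^4 = (4/81) * G * M * a0."""
    return ((4.0 / 81.0) * G * m_star * A0) ** 0.25

m_star = 2.0 * 1.6e5   # ballpark stellar mass: M/L ~ 2 times L ~ 1.6e5 Lsun
print(f"isolated sigma ~ {sigma_mond_isolated(m_star):.1f} km/s")  # ~4 km/s
# The external field of the Milky Way pushes this down to ~2 km/s,
# which is the actual prediction for Crater 2.
```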

But Crater 2 is not isolated. It is close enough to the Milky Way that there is an additional, external acceleration imposed by the Milky Way. The net result is that the acceleration isn’t quite as low as it would be were Crater 2 all by its lonesome. Consequently, the predicted velocity dispersion is a measly 2 km/s. As observed.

In MOND, this is called the External Field Effect (EFE). Theoretically, the EFE is rather disturbing, as it breaks the Strong Equivalence Principle. In particular, Local Position Invariance in gravitational experiments is violated: the velocity dispersion of a dwarf satellite depends on whether it is isolated from its host or not. Weak equivalence (the universality of free fall) and the Einstein Equivalence Principle (which excludes gravitational experiments) may still hold.

We identified several pairs of photometrically identical dwarfs around Andromeda. Some are subject to the EFE while others are not. We see the predicted effect of the EFE: isolated dwarfs have higher velocity dispersions than their twins afflicted by the EFE.

If it is just a matter of sub-halo mass, the current location of the dwarf should not matter. The velocity dispersion certainly should not depend on the bizarre MOND criterion for whether a dwarf is affected by the EFE or not. It isn’t a simple distance-dependency. It depends on the ratio of internal to external acceleration. A relatively dense dwarf might still behave as an isolated system close to its host, while a really diffuse one might be affected by the EFE even when very remote.
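
As a crude illustration of this bookkeeping (not the full MOND calculation), one can compare a Newtonian estimate of the internal acceleration of the stars to the external field of the host. The numbers are rough, and the second dwarf is entirely hypothetical:

```python
G = 4.301e-6   # kpc (km/s)^2 / Msun
A0 = 3700.0    # a0 in (km/s)^2 / kpc

def regime(m_star, r_half, d_host, v_host=200.0):
    """Crude EFE bookkeeping: compare internal and external accelerations.
    v_host is the host's circular speed in km/s (~200 for the Milky Way)."""
    g_int = G * m_star / r_half**2   # Newtonian field of the stars alone
    g_ext = v_host**2 / d_host       # external field of the host
    if g_int > A0:
        return "Newtonian"
    return "EFE-dominated" if g_ext > g_int else "isolated MOND"

# Crater 2: very diffuse (r_half ~ 1 kpc) yet remote (D ~ 120 kpc)
print(regime(3.2e5, 1.0, 120.0))   # EFE-dominated despite the distance

# Hypothetical denser dwarf: closer in, but internally dominated
print(regime(1e7, 0.15, 50.0))     # isolated MOND despite the proximity
```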

When Crater 2 was first discovered, I ground through the math and tweeted the prediction. I didn’t want to write a paper for just one object. However, I eventually did so because I realized that Crater 2 is important as an extreme example of a dwarf so diffuse that it is affected by the EFE despite being very remote (120 kpc from the Milky Way). This is not easy to reproduce any other way. Indeed, MOND with the EFE is the only way that I am aware of whereby it is possible to predict, in advance, the velocity dispersion of this particular dwarf.

If I put my ΛCDM hat back on, it gives me pause that any method can make this prediction. As discussed above, this shouldn’t be possible. There is too much intrinsic scatter in the halo mass-luminosity relation.

If we cook up an explanation for the radial acceleration relation, we still can’t make this prediction. The RAR fit we obtained empirically predicts 4 km/s. This is indistinguishable from MOND for isolated objects. But the RAR itself is just an empirical law – it provides no reason to expect deviations, nor how to predict them. MOND does both, does it right, and has done so before, repeatedly. In contrast, the acceleration of Crater 2 is below the minimum allowed in ΛCDM according to Navarro et al.
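
For reference, the RAR is well summarized by the fitting function g_obs = g_bar/(1 − e^(−√(g_bar/g†))) with g† ≈ 1.2×10⁻¹⁰ m/s² (McGaugh, Lelli, & Schombert 2016). Feeding it the same ballpark Crater 2 numbers as above, with a very crude spherical dispersion estimator, indeed returns the isolated 4 km/s – and nothing in the empirical relation itself supplies the EFE correction:

```python
import math

G = 4.301e-6    # kpc (km/s)^2 / Msun
GDAG = 3700.0   # g-dagger = 1.2e-10 m/s^2 in (km/s)^2 / kpc

def g_rar(g_bar):
    """RAR fitting function of McGaugh, Lelli, & Schombert (2016)."""
    return g_bar / (1.0 - math.exp(-math.sqrt(g_bar / GDAG)))

m_star, r_half = 3.2e5, 1.0              # ballpark values, as above
g_bar = G * (m_star / 2.0) / r_half**2   # baryonic g at the half-light radius
g_obs = g_rar(g_bar)
sigma = math.sqrt(g_obs * r_half / 3.0)  # crude isotropic estimator
print(f"sigma ~ {sigma:.1f} km/s")       # ~4 km/s, with no EFE correction
```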

For these reasons I consider Crater 2 to be the bullet cluster of ΛCDM. Just as the bullet cluster seems like a straight-up contradiction to MOND, so too does Crater 2 for ΛCDM. It is something ΛCDM really can’t do. The difference is that you can just look at the bullet cluster. With Crater 2 you actually have to understand MOND as well as ΛCDM, and think it through.

So what can we do to save ΛCDM?

Whatever it takes, per usual.

One possibility is that Crater 2 may represent the “bright” tip of the extremely low surface brightness “stealth” fossils predicted by Bovill & Ricotti. Their predictions are encouraging for getting the size and surface brightness in the right ballpark. But I see no reason in this context to expect such a low velocity dispersion. They anticipate dispersions consistent with the ΛCDM discussion above, and correspondingly high mass-to-light ratios that are greater than observed for Crater 2 (M/L ≈ 10⁴ rather than ~50).

A plausible suggestion I heard was from James Bullock. While noting that reionization should preclude the existence of galaxies in halos below 5 km/s, as we need for Crater 2, he suggested that tidal stripping could reduce an initially larger sub-halo to this point. I am dubious about this, as my impression from the simulations of Penarrubia was that the outer regions of the sub-halo were stripped first while leaving the inner regions (where the NFW cusp predicts high velocity dispersions) largely intact until near complete dissolution. In this context, it is important to bear in mind that the low velocity dispersion of Crater 2 is observed at large radii (1 kpc, not tens of pc). Still, I can imagine ways in which this might be made to work in this particular case, depending on its orbit. Tony Sohn has an HST program to measure the proper motion; this should constrain whether the object has ever passed close enough to the center of the Milky Way to have been tidally disrupted.

Joss Bland-Hawthorn pointed out to me that he has made simulations suggesting a halo with a mass as low as 10⁷ M☉ could make stars before reionization and retain them. This contradicts much of the conventional wisdom outlined above because they find a much lower (and in my opinion, more realistic) efficiency for supernova feedback than assumed in most other simulations. If this is correct (as it may well be!) then it might explain Crater 2, but it would wreck all the feedback-based explanations given for all sorts of other things in ΛCDM, like the missing satellite problem and the cusp-core problem. We can’t have it both ways.

maxresdefault
Without super-efficient supernova feedback, the Local Group would be filled with a million billion ultrafaint dwarf galaxies!

I’m sure people will come up with other clever ideas. These will inevitably be ad hoc suggestions cooked up in response to a previously inconceivable situation. This will ring hollow to me until we explain why MOND can predict anything right at all.

In the case of Crater 2, it isn’t just a matter of retrospectively explaining the radial acceleration relation. One also has to explain why exceptions to the RAR occur following the very specific, bizarre, and unique EFE formulation of MOND. If I could do that, I would have done so a long time ago.

No matter what we come up with, the best we can hope to do is a post facto explanation of something that MOND predicted correctly in advance. Can that be satisfactory?