Tully-Fisher from gravitational lensing

Last time, we discussed the remarkable result that gravitational lensing extends the original remarkable result of flat rotation curves much farther out, as far as the data credibly probe. This corroborates and extends the result of Brouwer et al. They did a thorough job, but one thing they did not consider was Tully-Fisher. If the circular speed inferred from gravitational lensing remains constant, does this flat velocity fall on the same Tully-Fisher relation that is seen in kinematic data?

We set out to answer this question. Along the way, we did three new things: 1. Dr. Mistele derived an improved method for doing the lensing analysis, extending the radial range over which the data were credible. 2. He explored the criteria by which galaxies were judged to be isolated, finding a morphological type dependence on how far out one had to exclude. 3. We reanalyzed the stellar masses of the KiDS sample to be consistent with those we used when analyzing the kinematic data of SPARC galaxies. The first two are connected, as how far out we can trust the data depends on how well we can define a clean sample of isolated galaxies. The third resolved an apparent offset between early type galaxies (ETGs, aka ellipticals) and late type galaxies (LTGs, aka spirals) seen by Brouwer et al. That appears to be an artifact of stellar population modeling, as I suspected when I first discussed their result. We don’t need to do any fitting of the mass-to-light ratio; the apparent offset between types disappears when we use the same population models for both kinematic and lensing data.

I could write a lot about each of these, but most of it is the stuff of technical details that would be dull to many people. If you’re into that sort of thing, go and read the long science paper which is where such details reside. Here I just want to describe the Tully-Fisher result. Spoiler alert: it is the same as that from kinematics.

First off, I’m talking strictly about the Baryonic Tully-Fisher relation: the scaling between baryonic mass and the flat rotation speed. To address this, we bin the lensing data by mass. The mass of each bin is well defined by the average of the many thousands of galaxies within the bin. By far the dominant uncertainty is the systematic in stellar mass caused by stellar population modeling. We went through this with a fine-toothed comb, and I’m confident we have an internally self-consistent result. That doesn’t preclude it being wrong in an absolute sense – such is the nature of astronomy – but we can at least make a straight comparison between kinematic and lensing data using the same best-effort stellar mass estimates.

For the velocity, we estimate the average effective rotation curve for each mass bin from the lensing data. We also split the data into morphological types to look for differences. The statistics go down when one divvies up the data like this, so the uncertainties go up, but there are enough KiDS galaxies to define four mass bins. Here are their inferred rotation curves:

Figure 1 from Mistele et al. (2024). Circular velocities implied by weak lensing for four baryonic mass bins (most to least massive from the top row to the bottom) for the whole sample (left column), for LTGs (middle column), and for ETGs (right column). The lowest ETG mass bin is not shown because it contains too few lenses. Instead we show results for lenses with spectroscopic redshifts from GAMA, without splitting by mass or type due to the small sample size (gray and white symbols). For comparison, we also show results for KiDS without splitting by mass or type (small yellow symbols). Open symbols at small radii indicate where lenses are not yet effective point masses. Light-colored symbols (not-outlined) at large radii indicate data points that may still be reliable but where the isolation criterion is less certain. The error bars show the statistical errors. Horizontal lines and the corresponding shaded regions indicate the inferred Vflat values and uncertainties that we use for the BTFR. The extent of the horizontal lines indicates the radial range we consider when calculating Vflat.

Note that the average over all KiDS data shown in the lower right bin is the data shown in the press release image in the previous post, but the x-axis is logarithmic here. The GAMA data in that bin provide an important cross-check, as these galaxies have spectroscopic redshifts. They give the same answer as the larger KiDS sample, which relies on photometric redshifts. We need the larger sample to consider finer bins in mass, which is the rest of the plot.

Another thing to note here is that all the data in all the bins are consistent with remaining flat. There are some hints of a turn down at very large radii, particularly for LTGs in the second and third row, but these are not statistically significant, and only happen where the data start to become untrustworthy. Where exactly that happens is a judgement call.

Let’s take a closer look, with a comparison to radio data:

Figure 2 from Mistele et al. (2024). The circular velocities from weak lensing (circles) compared with those from gas kinematics (diamonds). The individual galaxies illustrated here have among the most extended 21 cm rotation curves in their mass bins; the lensing data continue to much larger radii still. The error bars show the statistical error, while the gray band indicates the systematic uncertainty in the radial accelerations. Symbol colors are as in Figure 1. Open symbols at large radii indicate where lenses are not sufficiently isolated. The solid green lines indicate the circular velocities of NFW halos and baryonic point masses appropriate for each mass bin. Green crosses indicate each NFW halo’s virial radius. The light green band adds a qualitative estimate of a two-halo term contribution to the NFW halo, which may become important at large radii in case our isolation criterion is imperfect there.

Again we see that the lensing data, averaged over many galaxies, extend much further out than the rotation curve of any one individual. The x-axis is again logarithmic, so the lensing data go way further out. They trace to 1 Mpc, which is crazy far beyond the observed ends of the most extended individual galaxies. A more conservative limit is the 300 kpc estimated by Brouwer et al. Surely we can go further than that, but how much further remains a judgement call.

What should we expect? The green lines show the rotation curve we’d expect for a galaxy in an NFW halo with parameters specified by the stellar mass-halo mass relation of Kravtsov et al. (2018). Not all such relations agree well with kinematic data; this is the case that agrees most closely. We have intentionally cherry-picked the relation that makes LCDM look best. And it does look good up to a point, for example in the top two mass bins out to the virial radius of the halo (green crosses). Beyond that, not so much, and not at all for the two lower mass bins. The data extend far enough out that we should see the predicted decline. We do not.
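To make the expected decline concrete, here is a minimal sketch of the circular speed of a baryonic point mass sitting in an NFW halo. It is not the calculation in the paper: the masses, the concentration, the Hubble constant, and the "200 times critical density" definition of the virial radius are all assumptions chosen for illustration, and the NFW profile is simply extrapolated beyond the virial radius.

```python
import numpy as np

G = 4.30091e-6   # Newton's constant in kpc (km/s)^2 / Msun
H0 = 0.07        # assumed Hubble constant in km/s/kpc (70 km/s/Mpc)

def r200_kpc(m200):
    """Radius enclosing 200x the critical density (one common 'virial' definition)."""
    return (G * m200 / (100.0 * H0**2)) ** (1.0 / 3.0)

def nfw_enclosed_mass(r, m200, c):
    """NFW mass within radius r (kpc), normalized so that M(r200) = m200."""
    rs = r200_kpc(m200) / c
    mu = lambda x: np.log1p(x) - x / (1.0 + x)
    return m200 * mu(r / rs) / mu(c)

def v_circ(r, m_baryon, m200, c):
    """Circular speed (km/s) of a baryonic point mass plus an NFW halo."""
    return np.sqrt(G * (m_baryon + nfw_enclosed_mass(r, m200, c)) / r)

# Hypothetical numbers for illustration only (not taken from Kravtsov et al.)
m_baryon, m200, c = 1e11, 3e12, 8.0
radii = np.array([50.0, 100.0, 300.0, 600.0, 1000.0])
for ri, vi in zip(radii, v_circ(radii, m_baryon, m200, c)):
    print(f"R = {ri:6.0f} kpc   V_c = {vi:5.1f} km/s")
```

Whatever reasonable numbers one plugs in, the speed falls off beyond the virial radius. That decline is the feature the lensing data should show and do not.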

The green line only represents the expected halo of the primary galaxy. When one gets so far out, one has to worry about all the other stuff out there. We’ve selected galaxies to be isolated, so there isn’t much that is luminous. But we can only exclude down to some sensitivity limit; there might be lots of tiny dwarf galaxies whose mass adds up and starts to affect the result. And of course there can be completely invisible dark matter. The green band attempts to account for this extra stuff in the so-called 2-halo term. This is hard to do, but we’ve made our best estimate based on the LCDM power spectrum. I’m sure the 2-halo term can be adjusted, but the shape is wrong. It will take some fine-tuning to get an effectively flat rotation curve out of the 1-halo + 2-halo terms. They don’t naturally do that.

Something that is easy to do is define a flat value of the rotation speed. That’s just the average over the lensing data. We exclude the points at R < 50 kpc, as the assumption of a spherical mass that we make in the lensing analysis isn’t really valid at those comparatively small scales. We tried averaging over a bunch of different ranges, all of which gave pretty much the same answer. For illustration, we show two cases: a conservative one that only uses the data at R < 300 kpc, and another that goes out to 1 Mpc. Having measured Vflat over these ranges, we can plot Tully-Fisher:

Figure 3 from Mistele et al. (2024). The baryonic Tully–Fisher relation implied by weak lensing for the entire sample (yellow symbols, left column) and for ETGs and LTGs separately (red and blue symbols, right column). The Vflat values are weighted averages of the Vc values shown in Figure 1 for 50 kpc < R < 300 kpc (first row) and 50 kpc < R < 1000 kpc (second row). Vertical error bars represent a 0.1 dex systematic uncertainty on M*/L. For comparison, we also show the best fit to the kinematic data from Lelli et al. (2019; solid gray line) and the corresponding binned kinematic data (white diamonds).
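For readers who want to see what a weighted average over a radial range looks like in practice, here is a minimal sketch. It assumes inverse-variance weights and independent errors between radial bins, neither of which is stated here (see the paper for the actual procedure), and the data points are made up.

```python
import numpy as np

def flat_velocity(r_kpc, v_c, v_err, r_min=50.0, r_max=300.0):
    """Inverse-variance weighted mean of V_c over r_min < R < r_max.

    Assumes independent errors between radial bins (an illustrative
    simplification; real lensing bins can be correlated).
    """
    r_kpc, v_c, v_err = map(np.asarray, (r_kpc, v_c, v_err))
    mask = (r_kpc > r_min) & (r_kpc < r_max)
    weights = 1.0 / v_err[mask] ** 2
    v_flat = np.sum(weights * v_c[mask]) / np.sum(weights)
    v_flat_err = np.sqrt(1.0 / np.sum(weights))
    return v_flat, v_flat_err

# Made-up data points, purely for illustration
r = [30, 60, 100, 200, 400, 800]          # kpc
v = [180, 200, 205, 195, 210, 190]        # km/s
e = [10, 10, 10, 10, 20, 40]              # km/s

v300, e300 = flat_velocity(r, v, e, 50.0, 300.0)     # conservative range
v1000, e1000 = flat_velocity(r, v, e, 50.0, 1000.0)  # extended range
print(f"50-300 kpc:  Vflat = {v300:.1f} +/- {e300:.1f} km/s")
print(f"50-1000 kpc: Vflat = {v1000:.1f} +/- {e1000:.1f} km/s")
```

Because the outer points carry larger errors, they get little weight. That is one reason the two ranges can give nearly the same answer.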

Lo and behold, we find the same Baryonic Tully-Fisher relation from lensing data as we find with kinematics. This does not surprise me, but it didn’t have to be true. It shouldn’t be true in LCDM: if we can measure out to the virial radius, we should see some indication of a decline in velocity. We have and we don’t.

We also see no statistically significant separation between ETGs and LTGs. This is important, as a theory like MOND predicts that there should be no morphology dependence: only the baryonic mass matters. Brouwer et al. did see an indication of such a split, but it was small compared to the uncertainty in stellar population models. We don’t see it when we use our own stellar mass estimates. This is particularly true in the more conservative (300 kpc) case. There is a hint of a segregation when we average out to 1000 kpc, but the statistics say this isn’t significant. Since the lowest mass bin is most affected, I suspect this is a hint that the isolation criterion is failing first for the smallest galaxies. That makes sense, as the sensitivity limit on interlopers makes the lowest mass bin most susceptible to having its signal inappropriately boosted. It also makes sense that ETGs would be affected first, as ETGs are known to be more clustered than LTGs. It is really hard to define an isolated sample of ETGs, as discussed at length by Mistele et al.

The lensing data corroborate previous kinematic work. Rotation curves are flat. The amplitude of the flat rotation speed correlates with baryonic mass as Mb ∝ Vf^4. The radial acceleration relation extends to very low accelerations. These are all predictions of MOND. Moreover they are unique predictions: predictions made a priori by MOND and only by MOND. Dark matter models so far provide no satisfactory explanation*.
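As a quick numerical illustration of the fourth-power scaling: in MOND the flat speed of an isolated galaxy follows Vf^4 = G Mb a0. The sketch below evaluates that relation, assuming the commonly quoted a0 ≈ 1.2 × 10^-10 m/s^2. That value, and the masses in the loop, are my illustrative choices, not numbers taken from the fits in the paper.

```python
G_SI = 6.674e-11      # m^3 kg^-1 s^-2
M_SUN = 1.989e30      # kg
A0 = 1.2e-10          # m/s^2, assumed value of the MOND acceleration scale

def mond_flat_speed_kms(m_baryon_msun, a0=A0):
    """Flat rotation speed in km/s from Vf^4 = G * Mb * a0."""
    v4 = G_SI * m_baryon_msun * M_SUN * a0
    return v4 ** 0.25 / 1e3

for mb in (1e9, 1e10, 1e11):
    print(f"Mb = {mb:.0e} Msun  ->  Vf = {mond_flat_speed_kms(mb):.0f} km/s")
```

A factor of 16 in baryonic mass corresponds to a factor of 2 in flat speed, with no dependence on morphological type, size, or anything else.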

That hasn’t prevented people from overlooking these basic facts in order to get to the apparent if statistically meaningless difference between ETGs and LTGs. Never mind the successes! The slight offset between ETGs and LTGs falsifies MOND! Seriously: other scientists have already made this argument to me while completely eliding the successes of MOND. It’s a case of refusing to see the forest for a tree that’s a little away from the others.

I think I said something about how this would happen when I first wrote about Brouwer et al.’s lensing result. Ah yes, here it is:

MOND predicted this behavior well in advance of the observation, so one would have to bend over backwards, rub one’s belly, and simultaneously punch oneself in the face to portray this as anything short of a fantastic success of MOND.

I say that because I’m sure people will line up to punch themselves in the face in exactly this fashion.

And so it has come to pass. Sometimes human behavior is as predictable as galaxy dynamics.


*There are many claims to explain limited portions of these results, but none are satisfactory. There is no LCDM model that matches the entire dynamic range of the radial acceleration relation. See, for example, Fig. 5 of Brouwer et al. (reproduced below), which shows the MICE and BAHAMAS simulations. Neither extends into the regime that is well-constrained by kinematic data; there is no reason to think they would successfully do so and good reason to think otherwise. MICE comes nowhere close to this regime and has no baryonic physics that would allow it to even address this question. BAHAMAS comes close but appears to turn away from the kinematic data before it gets there. We’ve built our own LCDM models; they don’t work either. We can make them come close, but only over a limited dynamic range, not over the full span of the data. It isn’t good enough to only explain a limited range of the data. One has to explain the full range, and the only theory that does that so far is MOND.

Fig. 5 from Brouwer et al. showing the radial acceleration relation inferred from the MICE (red band) and BAHAMAS (orange band) simulations. Note also that in our assessment of stellar masses, the lower acceleration points translate a bit to the right.

Rotation curves: still flat after a million light-years

That rotation curves become flat at large radii is one of the most famous results in extragalactic astronomy. This had been established by Vera Rubin and her collaborators by the late 1970s. There were a few earlier anecdotal cases to this effect, but these seemed like mild curiosities until Rubin showed that the same thing was true over and over again for a hundred spiral galaxies. Flat rotation curves took on the air of a de facto natural law and precipitated the modern dark matter paradigm.

Optical and radio data

Rotation curves shouldn’t be flat. If what we saw was what we got, the rotation curve would reach a peak within the light distribution and decline further out. Perhaps an illustration is in order:

The rotation curve (data points, left) of NGC 6946 (right). The red line shows the expected rotation curve for the detected normal matter, which includes both the stars (yellow, from 2MASS) and atomic gas (blue, from THINGS). This provides a good description of the inner rotation curve but falls short further out. The excess observed rotation leads to the need for dark matter or MOND. Also noted is the extent of the rotation curve measured optically to the effective edge of the stars (Daigle et al. 2006; Epinat et al. 2008) and that measured with radio interferometric observations of the gas (Boomsma et al. 2008).

An obvious question is how far out rotation curves remain flat. In the rotation curves traced with optical observations by Rubin et al., the discrepancy was clear but modest – typically a factor of two in mass. It was possible to imagine that the mass-to-light ratios of stars increased with radius in a systematic way, bending the red line above to match the data out to the edge of the stars. This seemed unlikely, but neither did it seem like a huge ask.

Once one gets to the edge of the stellar distribution, most of the mass has been encompassed, and the rotation curve really should start to decline. Increasing the mass-to-light ratio of the stars ceases to be an option once we run out of stars*. Fortunately, the atomic gas typically extends to larger radii, so provides a tracer further out. Albert Bosma pursued this until there were again enough examples to establish that yes, flat rotation curves were the rule. They extended much further out, well beyond where the mass of the observed stars and gas could explain the data.
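The expected decline is just Kepler: once essentially all the mass is enclosed, the circular speed should fall as the inverse square root of radius. A minimal sketch, with a hypothetical enclosed mass chosen only for illustration:

```python
import math

G = 4.30091e-6  # Newton's constant in kpc (km/s)^2 / Msun

def keplerian_speed(r_kpc, m_enclosed_msun):
    """Circular speed (km/s) at radius r when all the mass lies inside r."""
    return math.sqrt(G * m_enclosed_msun / r_kpc)

m = 1e11  # hypothetical total mass in stars and gas, in solar masses
for r in (10, 20, 40, 80):
    print(f"R = {r:3d} kpc   V = {keplerian_speed(r, m):5.1f} km/s")
```

Quadrupling the radius should halve the speed. Observed rotation curves do nothing of the sort.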

How much further out? It depends on the galaxy. A convenient metric is the scale length of the disk, which is a measure of the extent of the light distribution. Some galaxies are bigger than others. The peak of the contribution of the stars to the rotation curve occurs around 2.2 scale lengths. The rotation curve of NGC 6946 extends to about 7 scale lengths, far enough to make the discrepancy clear. For a long time, the record holder was NGC 2403, with a rotation curve that remains flat for 20 scale lengths.

Twenty scale lengths is a long way out. It is observations like this that demanded dark matter halos that are much larger than the galaxies they contain. They also posed a puzzle, since we were still nowhere near finding the edge of the mass distribution. Rotation curves seemed to persist in being flat indefinitely.

Results from gravitational lensing

Weak gravitational lensing provides a statistical technique to probe the gravitational potential of galaxies. Brouwer et al. did pioneering work with data from the KiDS survey, and found that the radial acceleration relation extended to much lower accelerations than probed by the types of kinematic data discussed above. That implies that rotation curves remain flat way far out. How far?

Postdoc Tobias Mistele worked out an elegant technique to improve the analysis of lensing data. His analysis corroborates the findings of Brouwer et al. It also provides the opportunity to push further out.

Weak gravitational lensing is a subtle effect – so subtle that one must coadd thousands of galaxies to get a signal. Beyond that, the limiting effect on the result is how isolated the galaxies are. Lensing is sensitive to all mass; if you go far enough out you start to run into other galaxies whose mass contributes to the signal. So one key is to identify isolated galaxies, and restrict the sample to them. KiDS is large enough to do this. Indeed, Mistele was able to show that while neighbors+ were a definite concern for elliptical galaxies, they were much less of a problem for spirals. Consequently, we can trace the implied rotation curve way far out.

How far out? In a new paper, Mistele shows that rotation curves continue way far out. Way way way far out. I mean, damn.

The average rotation curve of isolated galaxies (blue points) inferred from KiDS gravitational lensing data. This remains flat well beyond a million light-years with no end in sight. The width of the figure is the distance between the Milky Way and Andromeda. For comparison, the rotation curve of a single galaxy, UGC 6614, is shown in red. An image of the galaxy is shown to scale centered at the origin. UGC 6614 was selected for this illustration because it has a comparable rotation speed to the KiDS average and because it is one of the largest galaxies known: the red points are already a very extended rotation curve. Image credit: Mistele, Lelli, & McGaugh 2024.

Optical rotation curves typically extend to the edge of the stellar disk. That’s about 8 kpc in the example of NGC 6946 given above. Radio observations of the atomic gas of that galaxy extend to 17 kpc. That fits within the first two tick marks on the graph with the lensing rotation curve.

UGC 6614 is a massive galaxy with a very extended low surface brightness disk. Its rotation curve is traced by radio data to over 60 kpc. It is one of the most extended individual rotation curves known. The statistical lensing data push this out by a factor of ten, and more, with no end in sight. The flat rotation curves found by Rubin and Bosma and everyone else appear to persist indefinitely.
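Since the figure is described in light-years while the numbers in the text are in kiloparsecs, a small conversion sketch may help keep the scales straight. The only input is the standard conversion of about 3.26 light-years per parsec; the radii in the loop are the ones quoted in the text plus a round 1000 kpc.

```python
LY_PER_PC = 3.26156  # light-years per parsec

def kpc_to_million_ly(kpc):
    """Convert kiloparsecs to millions of light-years."""
    return kpc * 1e3 * LY_PER_PC / 1e6

for label, kpc in [("NGC 6946 optical", 8), ("NGC 6946 radio", 17),
                   ("UGC 6614 radio", 60), ("round 1 Mpc", 1000)]:
    print(f"{label:18s} {kpc:5d} kpc = {kpc_to_million_ly(kpc):.3f} million ly")
```

A million light-years is a bit over 300 kpc, so even the very extended rotation curve of UGC 6614 reaches only about a fifth of that.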

So what does it mean? First, flat rotation curves really are a law of nature, in the same sense as Kepler’s laws of planetary motion. Galaxies don’t obey those planetary rules; they have their own set of rules. This is what nature does.

In terms of dark matter halos, the extent of isolated galaxy rotation curves is surprisingly large. Just as we come to the edge of the stellar disk, then the gas disk, we should eventually hit the edge of the dark matter halo. In principle we can imagine this to be arbitrarily large, but in practice there are other galaxies in the universe so this cannot go on forever.

In the context of LCDM, we now have a pretty good idea of how extended halos should be from abundance matching. A galaxy of the mass of UGC 6614 should live in a halo with a virial radius of about 300 kpc or less. There is some uncertainty in this, of course, but we really should have hit the edge with the lensing data. There should be some sign of it, but we see none.
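To get a feel for where a figure like 300 kpc comes from, here is a rough sketch of the radius within which a halo's mean density is 200 times the critical density. That is one common definition of the virial radius, and it is an assumption on my part, as are the Hubble constant and the halo masses. The abundance matching step that assigns a halo mass to a galaxy is not reproduced here.

```python
G = 4.30091e-6   # Newton's constant in kpc (km/s)^2 / Msun
H0 = 0.07        # assumed Hubble constant in km/s/kpc (70 km/s/Mpc)

def r200_kpc(m200_msun):
    """Radius enclosing a mean density of 200x critical: M200 = 100 H0^2 r^3 / G."""
    return (G * m200_msun / (100.0 * H0**2)) ** (1.0 / 3.0)

# Hypothetical halo masses, for illustration only
for m200 in (1e11, 1e12, 3e12):
    print(f"M200 = {m200:.0e} Msun  ->  r200 = {r200_kpc(m200):.0f} kpc")
```

The radius grows only as the cube root of the mass, so pushing the edge of the halo out to 1 Mpc would take a halo dozens of times more massive.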

One complication is the so-called 2-halo term. In addition to the primary dark matter halo that hosts a galaxy, when you get very far out, you run into other halos. Isolated galaxies are selected to avoid this to the extent possible, but eventually there will be some extra mass that causes extra lensing signal that would cause an overestimate of the rotation speed. I’ll forgo a detailed discussion of this for now (see Mistele et al. if you’re eager), but the bottom line is that it would require some unnatural fine-tuning for the 1+2 halo terms to add up to such flat rotation curves. There ought to be a perceptible feature in the transition from the primary halo to the surrounding environment. We don’t see that.

In the context of MOND, a flat rotation curve that persists indefinitely is completely natural. That’s what an isolated galaxy should do. Even in MOND there should be an environmental effect: the mass of everything else in the universe should impose an external field effect that eventually limits the extent of the rotation curve. How this transition happens depends on the density of other galaxies; by selecting isolated galaxies this effect is put off as much as possible. Hopefully it will be detected as the data improve from projects like Euclid.

The primary prediction of MOND is an indefinitely extended rotation curve; the external field effect is a subtle detail. Yet again, that is what we see: MOND gets it right without really trying, and in a way that makes little sense in terms of dark matter. Sometimes I wish MOND had never been invented so we could claim to have discovered something profoundly new, or at least discuss the empirical result without concern that the data would get confused with the theory. MOND predictions keep being corroborated, yet the community persists in ignoring its implications, even in terms of dark matter. It’s gotta be telling us something.

We have a press release about this result, so perhaps you will see it kicking around your news feed.


*We could, of course, invoke dark stars, but that’s just an invisible horse of a different color.

+There is a well known correlation between morphology and density such that elliptical galaxies tend to live in the densest environments. This means that they are more likely to have neighbors that interfere with the lensing measurement, so it comes as no surprise that identifying isolated ellipticals with a clean lensing signal is more challenging than finding isolated spirals. Isolated ellipticals do exist so it is possible, but one has to be very restrictive with the sample.

Updated WIMP Exclusion Diagram

This is an update to a post from a few years ago, which itself was an update to a webpage I wrote in 2008, with many updates in between. At that time, the goalposts for detecting WIMPs had already moved repeatedly. I felt some need then to write down a brief synopsis of the history of a beloved hypothesis (including by myself) that had obviously failed as the goalposts were in motion again. That was sixteen years ago.

It is important to remember where we started from, which is now ancient history lost in the myths of time to most who are now working in the field. Indeed, when I search for mention of the WIMP miracle, the theoretical argument that launched a thousand underground detection experiments, little comes up: this essential element of the field has been memory-holed after its failure. I suppose that’s to be expected, as the same thing happened with the decay of the B0 meson: once heralded as the “golden test” for supersymmetry, it simply stopped getting mentioned after it didn’t work out.

The original expectation for WIMPs was a particle of mass around 100 GeV/c^2 with an interaction cross-section of about 10^-39 cm^2. While I remember this, it is getting rare to find this statement, so let me quote a particle physicist:

“The most appealing possibility – a weak scale dark matter particle interacting with matter via Z-boson exchange – leads to the cross section of order 10^-39 cm^2

14 April 2011 Resonaances

To translate a little bit, the Z-boson is a carrier of the weak nuclear force (as photons are for electromagnetism), so this envisions an otherwise normal interaction that involves a new particle, the WIMP. The weak force is, well, weak, so the interaction probability is small, as quantified by the tiny cross section of 10^-39 cm^2. That makes such interactions rare, but particle physicists are talented at detecting such phenomena. It helps to have a lot of target material in your detector in a place that is well-shielded from background interference, hence all the giant underground WIMP experiments. Consequently, to continue the quote above,
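To see why target mass matters, here is a deliberately crude sketch of the scattering rate: number density of WIMPs times their speed times the cross section times the number of target nucleons. The local dark matter density (0.3 GeV/cm^3) and speed (220 km/s) are conventional assumed values, not numbers from this post. The estimate ignores nuclear coherence, form factors, and detector thresholds, so only the scaling with cross section should be taken seriously.

```python
RHO_DM = 0.3                # GeV/cm^3, assumed local dark matter density
V_DM = 2.2e7                # cm/s, assumed typical WIMP speed (220 km/s)
NUCLEONS_PER_KG = 6.022e26  # roughly 1 kg divided by the nucleon mass
SECONDS_PER_YEAR = 3.156e7

def events_per_kg_year(sigma_cm2, m_wimp_gev=100.0):
    """Naive per-nucleon scattering rate per kg of target per year."""
    number_density = RHO_DM / m_wimp_gev   # WIMPs per cm^3
    flux = number_density * V_DM           # WIMPs per cm^2 per second
    return flux * sigma_cm2 * NUCLEONS_PER_KG * SECONDS_PER_YEAR

for sigma in (1e-39, 1e-44, 1e-47):
    print(f"sigma = {sigma:.0e} cm^2  ->  ~{events_per_kg_year(sigma):.1e} events/kg/yr")
```

The rate is linear in the cross section, so every order of magnitude the limits drop demands an order of magnitude more exposure to stay in the game.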

“the cross section of order 10^-39 cm^2 … was excluded back in the 80s by the first round of dark matter experiments.”

And so the goalposts were set in motion. There were many steps along this path, so I’ll highlight only one, circa 2008. To complete the quote from Resonaances,

“There exists another natural possibility for WIMP dark matter: a particle interacting via Higgs boson exchange. This would lead to the cross section in the 10^-42 – 10^-46 cm^2 ballpark (depending on the Higgs mass and on the coupling of dark matter to the Higgs).”

So the interaction via the Z-boson had been excluded, but one can have other interactions, this one via the Higgs (which had not quite yet been detected: discovery was in 2012; the Resonaances quote is from 2011. Since then, the Higgs might be said to be “too normal” to make room for any of this.) The possibility of Higgs exchange leads to the blue-green predicted region of Trotta et al. (2008) in the exclusion diagram shown below. If one looks for such plots in the literature, one finds a natural tendency for their upper limits to migrate downwards along with the limits they portray. I thought it might be instructive to update the plot to show the full range of progress:

The interaction cross section as a function of WIMP mass. The original expectation of 10^-39 cm^2 is at top. Gray areas are regions that were experimentally excluded by 2008 (before the blue-green prediction) and by 2022, which is the most recent update as of this writing. The most sensitive limit is 10^-47 cm^2, eight orders of magnitude below the original prediction.

I call out the 2008 threshold because we had a conference here at CWRU in 2009 (while I was at the University of Maryland) at which the Trotta et al. prediction was presented. I had already become skeptical of the moving goalposts, so I wondered how much of the probability density was in the tail to low cross-section. A low-likelihood tail seems a lot more probable once the head is lopped off! I made this point at the time, and asked how important the tail was. The answer was about 2% of the probability. The speaker went on to express the usual overconfidence that WIMPs would be detected in the more likely region (marked by an X in the blue region with the handy arrow pointing to it).

The experimentalists have done a fabulous job in increasing the sensitivity of their experiments so that they can see to ever lower interaction cross section. Had WIMPs existed as predicted initially, or subsequently, they would have been detected by now. These experiments have succeeded in failing quite brilliantly. I had long before shown that the astronomical data did not add up for any flavor of dark matter. Maybe WIMPs don’t live in this universe?

While we’d be happy to detect dark matter anywhere in parameter space, the WIMP does have sweet spots: first 10^-39 cm^2, then 10^-44 cm^2. Now that those are gone, what’s next? From the particle physics perspective, I’ve heard it said that the next logical expectation for the cross-section is around 10^-48 cm^2. This apparently follows from “two-loop corrections.” I have only a vague idea of what that means, but in my practical experience it translates to “a difficult-to-compute effect so exotic that it likely has no bearing on reality, except maybe in the sixth place of decimals.”

More generally, this continual moving of the cross section goalpost is what I meant back in 2008 by the scientific version of the express elevator to hell. It just keeps going down, and can do so forever. I keep warning my colleagues about these things, and they keep not heeding the warnings. Being a scientific Cassandra is getting old.

The problem with pushing detection limits to still lower cross-sections like 10^-48 cm^2 is that the universe is indeed full of weakly interacting particles with at least a little bit of mass: neutrinos. These are not as massive as WIMPs, and should not be confused with them: neutrinos are Standard Model particles that are known to exist and to have a very small mass (< 1 eV) while WIMPs are expected to be hundreds of GeV and require entirely new physics beyond the Standard Model. I shouldn’t need to say this, but WIMPs and neutrinos are very different beasts. However, they do both have mass and interact weakly, so I’ve noticed that some of the more rabid advocates of dark matter mix these two in order to claim that we know weakly interacting dark matter exists. That much is technically true, but in technical parlance it is also some bold bullshit. Hmmm, actually, I think it is worse than ordinary bullshit. It is willful scientific disinformation that intentionally sows confusion by conflating the unconfirmed existence of WIMPs with the known existence of neutrinos in order to lend an air of certainty to a failed hypothesis.

WIMP experimental limits (via Hamdan 2021) with the expected neutrino background in orange. Once this sensitivity is reached, any WIMP signal becomes obscured by the neutrino background.

Meanwhile, experimental progress proceeds apace. The coming generation of WIMP detectors should be sensitive to the solar and atmospheric neutrino background. That is astrophysically interesting, as it can probe nuclear reactions in the sun and, in principle, those in every supernova that has ever exploded. This has bugger all to do with dark matter. However, since that’s what people are looking for, what they built these detectors to find, and they’re completely convinced dark matter exists, and a Nobel prize awaits whoever detects it first, I expect that the first neutrino detections will be misinterpreted as WIMP detections. There will be much arguing between groups, claims and counterclaims, and after a few years it will be recognized that these coming detections are neutrinos not WIMPs. First there will probably be many over-hyped claims that mislead the public into thinking dark matter has been detected.

But there I go being a scientific Cassandra again.