this might be the greatest postscript of all time
Thank you very much! Much of the credit should go to Will Wang, the twitter user who uncovered that WIPO complaint!
oh but also incredible work ben. huge props to you for this excellent post.
Great post! As an applied economist I routinely look at the abstracts of NBER working papers, and I recall that I read with some attention the one by Toner-Rodgers.
I also skimmed through the paper itself, and I had three short thoughts (two of them red flags):
1) too many researchers at this company;
2) too much work for one author (and I did not know he was that early-stage in his career);
3) I was envious of stuff done by MIT students (I did my PhD at LSE, which is not bad at all, but still...). Now I am less envious.
To be fair, I do know some absolutely cracked MIT students who probably could have done this work if the data were actually real. But ya, the number of researchers in the study stood out to me. If there were a company that hired >1k researchers on materials discovery alone, I feel like everyone I know would have been clamoring to get hired there.
I call humblebrag.
MIT has now deleted Toner-Rodgers from their PhD student listing
https://economics.mit.edu/people/phd-students/aidan-toner-rodgers
I disagree. Toner-Rodgers didn't fabricate this data. The AI that wrote the paper for him fabricated the data. It seems to fall well within the type of hallucination AIs generate when asked to produce data, especially when given the hypothesis the prompter wants confirmed, up front.
I suspect he had virtually nothing to do with the entire paper. He told an AI (probably ChatGPT) what he wanted and then tried to publish the result as if it was his own work.
Hmmm, I mean, it's quite likely that he used AI assistance extensively, but I think that the AI capabilities in mid-2024, when he would have written this, were not nearly capable enough to generate the entire paper without substantial assistance on his part (and they probably still aren't). It looks like he is quite adept at utilizing generative AI, which ironically supports his fake findings on the most competent researchers having more to gain from AI use.
I think it depends a little bit on what we mean by "entirely" and "substantial assistance." You're right that you can't just tell an AI "write me a paper about X" and expect to get anything decent. But you could do it in chunks.
A while back (a long while ago; sorry, I don't remember enough to find a link) I read a paper that described a method for producing longer works. It was something like: ask the AI to do an outline for a paper, then ask it to develop a prompt for each section of the outline, then feed it each of those prompts one at a time and merge the results.
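Something like this sketch, I'd guess (a minimal, hypothetical reconstruction assuming the OpenAI Python client; the model name, topic, and prompts are all placeholders I made up, not anything from the actual paper):

```python
# Hypothetical sketch of the outline-then-expand method described above.
# Assumes the OpenAI Python client; model, topic, and prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

topic = "AI and scientific discovery"  # invented topic, for illustration only

# Step 1: ask for an outline, one section title per line.
outline = ask(f"Write an outline for a paper about {topic}, one section title per line.")
sections = [line.strip() for line in outline.splitlines() if line.strip()]

# Step 2: ask the model to write a prompt for each section of the outline.
prompts = [ask(f"Write a detailed writing prompt for a paper section titled: {s}")
           for s in sections]

# Step 3: feed each prompt back one at a time and merge the results.
draft = "\n\n".join(ask(p) for p in prompts)
print(draft)
```

Note that each call here sees only its own prompt, not the other sections' output.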
I seem to recall that when I tried it out, the results were not usable. The sections were uneven and there was a lot of repetition. So, yes, it would take a lot of work to smooth all that over. But I haven't read the paper under discussion. It might also be uneven and repetitive. Don't know!
This makes sense, because my first reaction was: jeepers, that amount of effort… why not just do real work?
The “no real effort” explanation pairs better with the outcome.
It seems very unlikely that a large company employing ~1,000 materials scientists would allow the release of any data about its employees' productivity or its research results.
The 1996 Sokal Hoax, revisited https://www.journals.uchicago.edu/doi/abs/10.1086/449049
Yeah that leaps to mind for sure. Helps when journal editors slow down and find some outside expertise. But also, MIT doctoral advisors have some ‘splaining to do.
https://econospeak.blogspot.com/2025/05/artificial-intelligence-creates-more.html
Autor was never good at BS detection.
Great photo. The expression on the dog's face speaks for itself...
True about Autor, but you really have no idea why the lump of labor fallacy is a fallacy.
80 citations say otherwise. Not all agree with me, of course.
https://scholar.google.ca/citations?view_op=view_citation&hl=en&user=k4xobtAAAAAJ&citation_for_view=k4xobtAAAAAJ:u5HHmVD_uO8C
Great analysis! Having all 1,018 researchers working on all four areas is a huge red flag. There's very little overlap between polymer scientists and metallurgists.
I gave the preprint to Claude 3.7 Sonnet with extended thinking, and no matter the prompts (including my notes on Stuart Ritchie's book Science Fictions), it didn't recognize any red flags unless I started really steering it (giving contextual info about it being done by a first-year grad student).
It's unlikely to diss its own work.
No, that's not how it works. And as Ben said, AI in mid-2024 wasn't capable of generating a paper of this quality (and I doubt that current AI would do particularly well at it).
r/whooosh called.
That only strengthens the point that the results were based on other results, which are available online. Had they been different, I bet Claude would've called attention to it.
This is like if Icarus, rather than flying too close to the sun upon beeswax wings, launched himself towards the sun in a Saturn V rocket
"Toner-Rodgers submitted his paper to The Quarterly Journal of Economics, the top econ journal in the world."
Argh, AER.
I mean, I hate SCImago rankings as much as the next person (and I'm actually currently working on a side project for a better journal-ranking system), but apparently QJE clears for now:
https://www.scimagojr.com/journalrank.php?category=2002
Fair!
I think it's likely that the author of the paper is, by now, a "former first year Ph.D." student.
It would be truly funny if someone at one of these companies had received an impertinent request from this student, asking for data for a class project, and the person fed falsified data to him just to screw with him.
Was the fraudulent stuff generated with AI too? Or was AI just the hot theme chosen for the fraud?
I think it's almost certain that he extensively used generative AI in writing this manuscript (and seemed to have used it quite adeptly, tbh), but I think this fraud required a lot of creativity on his part, and that the data was probably faked with pretty deliberate instructions to the AI, rather than wholesale invented by the AI.
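For what it's worth, deliberately faking that kind of dataset takes only a few lines once you know what result you want baked in. A purely illustrative sketch (all numbers invented; no connection to the actual files):

```python
# Purely illustrative: simulating researcher-productivity data with a
# built-in "finding" (top researchers gain the most from the AI tool).
# Everything here is invented; it is not the paper's data or method.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1018  # the paper's implausible researcher count

ability = rng.normal(0.0, 1.0, n)    # latent skill
treated = rng.integers(0, 2, n)      # "randomly assigned" the AI tool
# Bigger boost for the top decile, so the desired heterogeneity shows up.
boost = 0.2 + 0.5 * (ability > np.quantile(ability, 0.9))

discoveries = rng.poisson(np.exp(0.5 * ability + boost * treated))

df = pd.DataFrame({"ability": ability, "treated": treated,
                   "discoveries": discoveries})
print(df.groupby("treated")["discoveries"].mean())  # the "effect" appears on cue
```

The hard part of the fraud wouldn't be generating numbers like these; it would be, as you say, the creativity of dressing them up in plausible institutional detail.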
I’ll just point out that Nature’s journalistic coverage of this preprint at the time *did* request comment from at least one materials scientist, Robert Palgrave, who didn’t raise the critiques noted here to the journalist.
But, as noted in the post, that very materials scientist raised other concerns at the time in a Twitter thread:
https://x.com/robert_palgrave/status/1856273403595915397
Right, but his concerns were about the methods. They didn’t lead him to doubt the paper’s veracity.
His thread at the time concludes: “A fascinating paper, and clearly a huge amount of work. Very interesting and impressive how seemingly one student managed to conduct such a wide ranging study at what must be a major company.”
Not sure why you’re trying to die on this hill. The veracity of some of the results and claims is doubted in that thread, including whether anything useful was found. And the author, Ben, specifically points that out above as well:
“And when the piece originally came out, he had an orthogonal, but also very valid set of reasons for being skeptical of the work (mostly due to the difficulty in defining the “novelty” of materials).”
Oh, I'm pointing this out because to my reading, this blog suggests that had a materials scientist read the preprint, they'd have spotted it was likely fraudulent.
("Probably a materials scientist who read the paper realized this was fraudulent but wasn’t able to get that view quickly to the economists who were actually reading and discussing the paper", the blog says).
The implication (to my reading) is that the likeliness of fraud would have been readily apparent to the real subject-matter experts here. And if journalists had only asked those real experts, the mess would have come to light earlier.
I'm responding by noting that this suggestion doesn't really have a basis in fact. Materials scientists *did* read the preprint - and *were* consulted by journalists - and they didn't spot it was likely fraudulent. Sure, they expressed useful concern about the work's methods and conclusions. Subject-matter experts always critique methods in preprints. But there was not a whisper about the possibility of fraud, the idea that literally the whole thing was fake.
In hindsight, red flags are there. But - as the blog says - hindsight is 20/20. It genuinely isn't so clear ahead of time. And I hope MIT is clearer about what happened in this affair, as it's still frustratingly opaque.
(By the way, I'd expect that a journal editor in any subsequent peer review process would have asked the author to confidentially send over details about the firm involved and get confirmation from the firm ... but perhaps economics journals wouldn't have done that).
I assume he wrote that beforehand, given his later statement (& taking him at his word), referring to Palgrave’s more recent post:
“(I promise I read his thread after writing the bulk of this blog post)”
It’s also quite possible Palgrave *did* have such suspicions, as he did initially make explicit note of several of the same observations re: the surprising meta-aspects: the company size, when the study started relative to recent AI developments, and that a student was somehow working alone on the dataset with the company.
However, it’s quite a difficult task to ask someone who’s simply been asked for comment on the science to indicate the paper is outright fraudulent. You would have to be very sure before even *suggesting* such a thing in a public forum, since, if you’re wrong, the damage done can be disastrous and to some degree irreversible. Just look at how Ben here, even after knowing the results were fraudulent, still hedged on the data themselves.
I actually do wonder if this would have been uncovered in peer review at an Econ journal. Without a domain expert, it would be difficult to detect many of these problems, and with emails from “Corning Research,” he could have gotten past concerns re: the company and his access to the data.
My guess is that he ended up being just a little too ambitious for his own good. The results were initially so compelling as to receive national coverage, which opened the work up to many subject-matter experts, at least some of whom likely shared the same concerns as Ben and Palgrave and therefore probably reached out to MIT.
Had he instead made more modest claims that didn’t attract so much attention, this likely would have slid past review without too much difficulty. This suggests there are probably many such fraudulent works out there, made by people who kept their results and claims below the “OMG” level, if you will, and who therefore didn’t receive the same level of acute scrutiny from many independent researchers.
great article
by the way, commenting on arXiv papers is possible through alphaXiv, e.g.
https://www.alphaxiv.org/abs/2412.17866
Very well written. I had seen the results circulating, and a few articles talking about them, when the paper was first released.
I believe part of the reason the paper seemed valid is that Google’s work on antibiotic discovery was showing real benefits, so some carry-over in credibility could be taking place. https://www.thebrighterside.news/post/google-ai-solves-a-decade-long-superbug-mystery-in-just-two-days/
> I also think that if comments were enabled on arxiv preprints, this could have led to a much more rapid conclusion to the fraud.
For machine learning, we actually have comments at hf.co/papers (https://huggingface.co/papers/date/2025-05-16), and since this just mirrors arXiv, it could also be used for economics and other scientific domains :)