12 Comments
Waste-Time Continuum:

AI is best when used as a tool to help you think, instead of doing the thinking for you. I only started using AI in March, and it has helped me organize and work through my ideas immensely. I think best through conversation. People get bored, or aren’t in the right head space to listen to me go on and on about something I want to work out. AI “converses” with me until I’m the one getting tired. That is the value of this tool for me. I think some are just framing the technology incorrectly.

Alistair Penbroke:

You're being quite generous! Sounds like this paper is heavily p-hacked, has all the usual flaws of pop psychology, cannot prove anything interesting by design and is written by people with a clear agenda. Just another day in academia!

Squid:

So they found that using an LLM to write reduces effort, and possibly that typing is mentally effort-intensive. I'd be curious to see how typing prewritten content compares to typing an original essay.

Daniel J:

Great article looking into all of these details. I haven't looked into it nearly as deeply, but yeah, I don't think you can draw much from this paper that wasn't common sense, and definitely not the dramatic claims some are making.

I feel like LLMs have hugely increased my ability to learn and understand too, but I also think it is something to be careful about since it can be hard to judge that for ourselves. I think having some regular deep thinking time not using LLMs is probably a good safety measure (I think chess is perfect for this luckily!). But totally agree - it's a very powerful tool which can be used for either good or ill, and that's as true in education as anywhere else.

I wonder if AI making it so easy to cheat will force schools to realign themselves toward more useful and engaging work, because students will just cheat otherwise. It's a hope at least!

Justin:

New here, via Trung. Ty!

Zinbiel:

Good write-up. I haven't read the paper, so take this with a grain of salt...

I report EEGs as part of my job, and I have lectured on EEG basics to medical students, but I do not consider myself an expert. From my partial knowledge, I am deeply suspicious about any conclusions drawn on the cognitive value of frequency correlations across brain regions.

EEGs do not pick up the activities of single neurons; they pick up synchronised activity in large populations. Neurons engaged in piecemeal information processing are unlikely to be doing the Mexican wave with other neurons; they have more important stuff to do.

For instance, the dominant rhythm from the occipital lobes, where vision is processed, is usually known as the alpha rhythm. It is usually prominent in a relaxed subject whose eyes are closed, and it abates when the eyes are open. That is, it is a negative indicator of actual visual cognition. I think of it as what the occipital lobes do when they are idling. If some other region happened to have activity at the same frequency, that would not prove that clever visual cognition was taking place, or even that the eyes were open.

Maybe frequency correlations tell us something about binding... But it's still more like reading tea leaves than doing science.

I remember that stadium metaphor, too; it is spot on.

Not only does this study tell us nothing much about any residual cognitive effects, given a design committed to task-specific effects; it is also not at all clear to me that the authors can judge the inherent cognitive value of the different EEG properties being watched. The subjects were engaged in different activities, as you note, so their EEGs were different, but that's about all that can be said.

Drawing conclusions about the risks of using AI from such an exercise smacks of pseudoscience (probably with a large measure of p-hacking thrown in). At best, this sort of work could generate hypotheses; from your description it casts the statistical net too wide to prove any individual hypothesis.

That the study has generated excitement in the lay press is very typical. Few people seem interested in showing the necessary level of scepticism, so ambiguous results are channelled uncritically into science-themed clickbait.

allora:

They need to incent the LLM users to write a 'good' paper. Segment the groups the same, but the best paper gets $250 and the top three get $100 each. The rest are just paid for their time.

I use LLMs to write legal memoranda, and I genuinely feel like they don't save me much time, but they let me go deeper into more cases. I feel like I'm 'thinking' and engaging in a similar flow state as in a non-LLM essay, rather than just getting the LLM to regurgitate. But I'm extra motivated to do good work because I have a boss reviewing it, and I'd be especially embarrassed if part of their feedback was 'it looks like an LLM did this.'

Replicating some sense of the motivation to write a quality essay is important!

Dan Elton:

So FWER analysis > FDR analysis > Bonferroni correction ? Also, how should one learn about these things?

Swen Werner:

Great article, but I think you are way too polite to draw the only conclusion the data permits: the results are statistical noise rendered as narrative.

When you run thousands of significance tests across EEG channels without strict pre-registration or strong correction (such as FWER control), your findings are not discoveries; they are artifacts of fishing. FDR correction allows for false positives by design. If your experiment involves 1024 electrode pairs and runs up to 1000 rmANOVAs per session, you're not uncovering neural dynamics. That approach is biased and compromised. Publishing this kind of result borders on misconduct. If we cannot distinguish signal from noise, we should not pretend we have anything other than speculation. That's not enough for science.
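To make the multiple-comparisons point concrete, here is a minimal sketch (my own illustration, not from the paper or the post): run 1000 tests on pure noise and compare how many "discoveries" survive no correction, Bonferroni (an FWER method), and Benjamini-Hochberg (an FDR method). Under the null, p-values are uniform, so uncorrected testing alone yields roughly 50 hits out of 1000 at alpha = 0.05.

```python
import random

random.seed(42)
m = 1000                                      # number of tests, all true nulls
pvals = [random.random() for _ in range(m)]   # null p-values are Uniform(0, 1)
alpha = 0.05

# No correction: expect roughly alpha * m false positives (~50 here).
uncorrected = sum(p < alpha for p in pvals)

# Bonferroni (controls FWER): each test is held to alpha / m.
bonferroni = sum(p < alpha / m for p in pvals)

# Benjamini-Hochberg step-up (controls FDR): sort p-values and find the
# largest rank k with p_(k) <= alpha * k / m; reject the k smallest.
ranked = sorted(pvals)
k = 0
for i, p in enumerate(ranked, start=1):
    if p <= alpha * i / m:
        k = i
bh = k

print(uncorrected, bonferroni, bh)
```

Since every null here is true, anything the uncorrected count flags is a false positive, while both corrections should report at or near zero. The post's criticism is that FDR, unlike FWER control, tolerates a controlled fraction of false positives among whatever it does flag, which matters when thousands of electrode-pair tests are run.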
