Greta Tuckute

Email concatenate 'gretatu' with '@mit.edu'
GitHub gretatuckute
Scholar Greta Tuckute
Twitter/X @GretaTuckute
CV

Hi, I am Greta. Thank you for visiting my page. I am an incoming fifth-and-final-year PhD candidate at the Department of Brain and Cognitive Sciences at MIT working with Dr. Ev Fedorenko. I am very grateful to be supported by the K. Lisa Yang Integrative Computational Neuroscience (ICoN) Center. I completed my BSc and MSc degrees at KU/DTU with coursework and research at MIT/CALTECH/Hokkaido University. I work at the intersection of neuroscience, artificial intelligence (AI), and cognitive science. I study how language is processed in the biological brain, often by using artificial neural network models as tools or computational hypotheses. When I am not doing science, I enjoy photography, tennis, high altitudes, mornings, and magic realism books.


Below is a subset of updates on ongoing/recently finished projects and collaborations:

MIT Technology Review's 35 Innovators Under 35

TR35 Social Card
September 2024 I am incredibly humbled to have been named one of MIT Technology Review's 35 Innovators Under 35 for 2024. My recognition is in the category of artificial intelligence, awarded for my work in leveraging artificial neural network models to obtain a more precise understanding of how the human brain processes sounds and sentences. By developing these quantitatively accurate models of brain function, I believe that we will not only be able to answer scientific questions that were previously out of reach but also create a foundation for innovative treatments for individuals with auditory/language impairments -- such as using these accurate models as components in novel cochlear implants or brain-machine-interfaces.
I am deeply grateful for the support from my mentors (especially Ev Fedorenko, Josh McDermott, and Nancy Kanwisher) and collaborators. I thank the McGovern Institute for Brain Research at MIT for nominating me, and the MIT Technology Review for covering my work.

Conference for Cognitive Computational Neuroscience (CCN) 2024

  Poster presentation at CCN 2024.
August 2024 I have been fortunate to take part in organizing this year's Conference for Cognitive Computational Neuroscience (CCN) 2024 (as a co-chair of the local/trainee organizing committee) as well as presenting four posters on ongoing work:

Tuesday, August 6, 4:15 - 6:15 pm, poster A42. Temporal regions are the epicenter of language processing in the human brain. Greta Tuckute, Aalok Sathe, Jingyuan Selena She, Evelina Fedorenko.
Thursday, August 8, 1:30 - 3:30 pm, poster B136. Hippocampal versus cortical language processing: Similar selectivity profile, difference in tuning axes. Elizabeth J. Lee, Idan A. Blank, Evelina Fedorenko*, Greta Tuckute*.
Friday, August 9, 11:15 am - 1:15 pm, poster C57. Brain-like language processing via a shallow untrained multihead attention network. Badr AlKhamissi, Greta Tuckute, Antoine Bosselut*, Martin Schrimpf*.
Friday, August 9, 11:15 am - 1:15 pm, poster C55. Brain-like functional organization in Topographic Transformer models of language processing. Taha O. Binhuraib, Greta Tuckute, Nicholas Blauch.

Please stop by to say hi!

Localizing the human language network in ~3.5 minutes using speeded reading

We investigated brain activations during language processing, for both standard and speeded reading. The brain regions involved in language are qualitatively (and quantitatively) similar across both conditions.
July 2024 Excited to share our fast fMRI localizer for the language network and our investigations into how the human brain processes reading at high speeds. This paper was co-led with Elizabeth J. Lee (an undergraduate/master's student I have been fortunate to work with), and with Aalok Sathe and Ev Fedorenko.

Functional "localizers" aim to isolate a (set of) brain regions that support a cognitive process (such as face processing, or language processing) in individual participants. It is challenging to rely on brain anatomy alone, because individuals have different brains. Hence, localizers are a powerful way to isolate a cognitive process in the brain, but one concern is that they take time away from the critical experiment. One ongoing effort in cognitive neuroscience is to make localizers as efficient as possible. This is what we do here.
We present a speeded version of a widely used localizer (Fedorenko et al., 2010) for the language network (for a review on the language network, see Fedorenko et al., 2024). This localizer uses a reading task contrasting sentences and sequences of nonwords. However, our localizer is faster as we present each word/nonword for 200ms instead of 450ms.
In the paper, we first investigated whether the speeded localizer version can reliably localize language-responsive areas in individual participants (n=24). The brief answer is yes. First, we looked at the voxel-level activation maps and found that the speeded localizer elicited activation topographies highly similar to those of the standard localizer (Fedorenko et al., 2010). Next, we investigated the BOLD response magnitudes and found that the speeded localizer elicited a robust response to sentences, with an even greater sentences > nonwords effect size than the standard localizer. Second, we asked whether increased processing difficulty due to speeded reading affects neural responses in the language network and/or elsewhere in the brain. We focused on the Multiple Demand (MD) network, which is sensitive to cognitive effort across various domains and, in some cases, to effortful language comprehension. It is an open question how the MD network (and other brain regions) contributes to language processing during different kinds of effortful linguistic processing. We found that the MD network responded ~43% more strongly during speeded sentence reading vs. standard sentence reading (compared to a ~16% increase in the language network's response to speeded reading). Moreover, unlike in the standard localizer, we did not observe a nonwords > sentences effect in the speeded localizer. Hence, speeded reading was more effortful than slower reading, and this cost loaded largely onto the domain-general Multiple Demand (MD) network (only minimally manifesting in the language network and not affecting the ability to localize language areas).
We hope that this language localizer will be useful to researchers when time is of the essence (this localizer was developed when we were trying to cram hundreds of sentences into a single fMRI session a few years ago). It should work well with any proficient reader.

The paper can be found here: Tuckute*, G., Lee*, E.J., Sathe, A., Fedorenko, E. (2024): A 3.5-minute-long reading-based fMRI localizer for the language network, bioRxiv, doi: https://doi.org/10.1101/2024.07.02.601683. The associated code to run the speeded language localizer (and analyses) can be found here.

Functional localization of a language network in large language models

The illustration shows our approach for localizing a language network in artificial language models (using methods from cognitive neuroscience).
June 2024 In this work, we propose and validate a method for functionally localizing a language network in artificial large language models (LLMs). This work is led by Badr AlKhamissi, and jointly supervised by Antoine Bosselut and Martin Schrimpf.

The language network in the human brain is localized by contrasting language processing vs. processing of a perceptually similar condition that lacks linguistic meaning/structure (e.g., lists of non-words or muffled speech; Fedorenko et al., 2010). However, prior work that has investigated LLMs' similarity to the human language network has not followed any similar procedure. Rather, prior work has used units from a full layer of the model or units across the model. We here asked: can we localize language units in LLMs, and do these units reproduce empirical findings about the human language system?
We localized language units in LLMs by contrasting Sentences (S) > Non-words (N) and found that these units reproduce empirical findings such as a higher response to Sentences than to unconnected words (W), Jabberwocky (J), and non-words in held-out materials (as in several studies, e.g., Fedorenko et al., 2010). Moreover, we found that these LLM language units discriminate more reliably between lexical than syntactic differences (as found in Fedorenko et al., 2012). So, these localized LLM units mirror response profiles in the human language system, but are these units also functionally useful?
We trained a downstream language decoder using the localized LLM units in an untrained network, and showed improved next-word prediction performance compared to two control models without localization. Thus, localized, untrained LLM units benefit language modeling, demonstrating that these units have some functionally-relevant properties that also mimic responses in the human brain. More work is ongoing to understand this further.
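As a rough illustration of the localization step described above (not the code used in the paper), the sketch below contrasts unit activations for sentences versus non-words and keeps the most selective units. Everything here is an assumption made for illustration: the activations are synthetic, Welch's t-statistic is one possible choice of contrast, and the names `welch_t`, `localize_units`, and `top_fraction` are hypothetical.

```python
import numpy as np

def welch_t(a, b):
    """Welch's t-statistic per unit; a and b have shape (n_stimuli, n_units)."""
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / a.shape[0] + b.var(axis=0, ddof=1) / b.shape[0])
    return mean_diff / se

def localize_units(sentence_acts, nonword_acts, top_fraction=0.01):
    """Return indices of the units with the largest sentences > non-words contrast."""
    t = welch_t(sentence_acts, nonword_acts)
    k = max(1, int(round(top_fraction * t.size)))
    return np.argsort(t)[::-1][:k]

# Synthetic stand-in for model activations (assumption: no real LLM is queried here).
rng = np.random.default_rng(0)
n_stimuli, n_units = 200, 500
language_like = np.arange(10)  # units we make sentence-selective by construction

sentence_acts = rng.normal(size=(n_stimuli, n_units))
sentence_acts[:, language_like] += 2.0
nonword_acts = rng.normal(size=(n_stimuli, n_units))

selected = localize_units(sentence_acts, nonword_acts, top_fraction=0.02)

# Check the selected units on held-out stimuli, mirroring the held-out evaluation idea.
held_sentences = rng.normal(size=(50, n_units))
held_sentences[:, language_like] += 2.0
held_nonwords = rng.normal(size=(50, n_units))
held_out_effect = held_sentences[:, selected].mean() - held_nonwords[:, selected].mean()
print(sorted(selected.tolist()), round(float(held_out_effect), 2))
```

Selecting units on one set of stimuli and measuring the effect on another avoids circularity: a unit picked for a large contrast will trivially show that contrast on the same data.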

The paper can be found here: AlKhamissi, B., Tuckute, G., Bosselut*, A., Schrimpf*, M. (2024): Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network, arXiv, doi: https://doi.org/10.48550/arXiv.2406.15109

Brain-like topographic organization in Transformer language models

The left plot shows a Transformer layer (keys) from our topographical Transformer BERT model ("Topoformer"), with the colors denoting the selectivity to sentences about animate versus inanimate content. The right plot shows the same layer, but in a BERT model without topographical constraints.
March 2024 Large language models (LLMs) fundamentally differ from the human language system. One notable difference is that biological brains are spatially organized, while current LLMs are not. In this work, led by Taha BinHuraib, and with Nick Blauch, we attack this issue, and propose a topographic LLM ("Topoformer").

The Topoformer model uses a novel form of self-attention, converting standard neuron vectors into 2D spatial grids. Spatial variants of querying and reweighting operations give rise to topographic organization of representations within the self-attention layer. We first trained a toy Topoformer model on a supervised sentiment task (rating movie reviews as being positive or negative), and we found that the model learned selectivity for positive versus negative sentiment in the layers of the Topoformer model. Next, we scaled up our approach, and trained a BERT Transformer model with the topographical motifs, on masked word prediction. The performance of the Topoformer-BERT model was similar to that of a non-topographical BERT across several linguistic benchmarks. Hence, topography does not impede function (nor enhance it). Finally, we asked whether the topographic organization in the Topoformer-BERT model corresponded to that of the human language system. We found that the language responses in the human brain were topographic ("smooth"/"patchy"), and critically, that dimensions between Topoformer-BERT and the brain could be aligned.
This work provides a critical step towards developing more biologically plausible LLMs.

The paper can be found here: BinHuraib, T.O.A, Tuckute, G., Blauch, N. (2024): Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting, ICLR Re-Align Workshop. The project website and code can be found here.

Review on language in brains, minds, and machines

In this review, we examined how properties of language models affect alignment with brain representations during language processing. To better distill patterns, we categorized language model properties into three broad groups, as illustrated in the boxes.
April 2024 Thrilled to share our Annual Review of Neuroscience paper on language in brains, minds, and machines with Nancy Kanwisher and Ev Fedorenko. Broadly, we survey the insights that artificial language models (LMs) provide on the question of how language is represented and processed in the human brain. The LM-neuroscience field is moving very fast, but I hope that this review will serve as a useful timestamp for contextualizing the findings from this field.

In this review paper, we first ask, what are we trying to model? We discuss the human language system as a well-defined target for modeling efforts aimed at understanding the representations and computations underlying language (also see Fedorenko, Ivanova, Regev, 2024). Second, we discuss what we actually want from computational models of language: There is a trade-off between parsimony (i.e., intuitive-level understanding; many classes of former models) and predictive accuracy (i.e., models that accurately explain behavioral/neural data). LMs are data-driven, stimulus-computable and have high accuracy, albeit at the expense of parsimony. However, we argue that if we use them in controlled experimental settings, we can make meaningful inferences despite their lack of inherent interpretability.
Next, we lay out arguments as to why, a priori, we would or would not expect LMs to share similarities with the human language system. In brief, we would expect LMs to share similarities with the human language system because i) the two systems show similar behavioral language performance (formal linguistic competence, e.g., Mahowald, Ivanova et al., 2023), ii) both systems are highly modulated and/or shaped by prediction, iii) both systems are sensitive to multiple levels of linguistic structure, and iv) ‘small’ LMs that do language well struggle with other tasks (e.g., simple arithmetic), suggesting that near-human language ability does not entail near-human reasoning ability. Conversely, LMs are very different from humans in their learning procedure, access to long-range contexts of prior input, and of course, hardware.
We then describe evidence that LMs represent linguistic information similarly enough to humans to enable relatively accurate brain encoding and decoding, and then turn to the question: Which properties of LMs enable them to capture human responses to language? LMs vary along many dimensions, which we separated into 3 categories: model architecture, model behavior and model training. A few take-aways are: For architecture, contextualized LMs provide a big improvement in brain-LM alignment over decontextualized semantic models. However, within contextualized LMs, many architectures fit brain data well. Larger LLMs predict brain data better, but become worse at predicting human language behavior. For behavior, in line with the idea that neural representations are shaped by behavioral demands, LMs’ ability to predict linguistic input is positively correlated with brain alignment. It remains unknown whether prediction per se, or other factors such as representational generality, is the key factor underlying alignment. Lastly, for training, even LMs trained on a developmentally plausible amount of training data align with brain data. Semantic properties of training data seem to matter more for brain-LM alignment than morphosyntactic ones. Fine-tuning can increase brain-LM alignment (task- and model-dependent). The final part of the paper discusses the general use of LMs as in silico language networks, and we describe challenges and future directions associated with the use of LMs in understanding more about language in the mind and brain.

The paper can be found here: Tuckute, G., Kanwisher, N., Fedorenko, E. (2024): Language in Brains, Minds, and Machines Annual Review of Neuroscience 47:277-301, doi: https://doi.org/10.1146/annurev-neuro-120623-101142.

Review on how to optimally use neuroscience data and experiments to build better models of the brain

January 2024 Kohitij Kar and I organized a Generative Adversarial Collaboration (GAC) workshop during the Cognitive Computational Neuroscience (CCN) conference in 2022 (along with Dawn Finzi, Eshed Margalit, Jacob Yates, Joel Zylberberg, Alona Fyshe, SueYeon Chung, Ev Fedorenko, Niko Kriegeskorte, Kalanit Grill-Spector). Our primary goal was to discuss how neuroscientific data (behavior and neural recordings) can be used to develop better models of processes in the brain. We focused on visual and language processing.

We synthesized our workshop take-aways into a review/opinion piece ("How to optimize neuroscience data utilization and experiment design for advancing primate visual and linguistic brain models?"), which broadly tackles two questions: i) How should we use neuroscientific data in model development (raw experimental data vs. qualitative insights)? and ii) How should we collect experimental data for model development (model-free vs. model-based)? We discuss the pros and cons associated with each stance, and moreover, we review how neuroscience data has traditionally been leveraged to advance model-building within the domains of vision and language. Finally, we discuss directions for both experimenters and model developers in the quest to advance artificial models of brain activity and behavior.

The paper can be found here: Tuckute, G., Finzi, D., Margalit, E., Zylberberg, J., Chung, S.Y., Fyshe, A., Fedorenko, E., Kriegeskorte, N., Yates, J., Grill-Spector, K., Kar, K. (2024): How to optimize neuroscience data utilization and experiment design for advancing primate visual and linguistic brain models? arXiv, doi: https://doi.org/10.48550/arXiv.2401.03376.

Driving and suppressing the human language network using large language models

Using large language models, we identified sentences to either drive or suppress responses in the human language network. From this diverse set of sentences (that we, as experimenters, could not have come up with ourselves), we found that the language network responds strongly to sentences with unusual grammar and/or meaning (here, sentence grammaticality ratings shown on the x-axis, and brain response magnitude on the y-axis).
January 2024 Excited to share that our work on driving and suppressing brain responses in the language network is published in Nature Human Behaviour (work with Aalok Sathe, Shashank Srikant, Maya Taliaferro, Mingye Wang, Martin Schrimpf, Kendrick Kay, and Ev Fedorenko).

In this work, we asked two main questions: i) Can we leverage GPT language models to drive or suppress brain responses in the human language network? ii) What is the “preferred” stimulus of the language network, and why?
To answer these questions, we recorded brain data while participants read 1,000 linguistically diverse sentences using fMRI. We fit an encoding model to predict the left hemisphere language network’s response to an arbitrary sentence from GPT embeddings. Next, we used our encoding model to identify new sentences to activate the language network maximally ("drive sentences") or minimally ("suppress sentences") by searching across a massive number of sentences (~1.8 million). We showed that these model-selected new sentences indeed drive and suppress the activity of human language areas in new individuals. Hence, we had trained an encoding model to generate predictions about the magnitude of activation in the language network for the new drive/suppress sentences and then closed the loop, so to speak, by collecting brain data for the new sentences from new participants, effectively “controlling” their brain activity.
Second, we wanted to understand what kinds of sentences the language network responds to the most. This general approach is rooted in the work by Hubel and Wiesel in the late 1950s and 1960s, who discovered what kinds of visual input (e.g., specific orientations of light) would make neurons in the visual cortex of cats and monkeys fire the most. In our study, we characterized all of our sentences (spanning the full range of brain responses from low to high) using 11 linguistic properties. We discovered that sentences with a certain degree of unusual grammar and/or meaning elicit the highest activity in the language network in the form of an inverted U-shape – for instance, sentences like “I’m progressive and you fall right.” would elicit the highest responses. Conversely, sentences that were highly normal/frequent (e.g., "We were sitting on the couch.") or highly unusual, like a string of words (e.g., "LG'll obj you back suggested Git.") would elicit quite low responses. In other words, the language network responds strongly to sentences that are “normal” (language-like) enough to engage it, but unusual enough to tax it. This means that specific areas of your brain – the language network – will respond selectively to linguistic input that aligns with the statistics of the language and will work really hard to make sense of the words and how they go together.
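To make the inverted-U idea concrete, here is a minimal sketch that fits a quadratic to response-versus-rating data and checks that the curve peaks inside the rating range. This is an illustration only and not the analysis from the paper: the data are simulated, and the 1-7 rating scale, the peak location, and the name `fit_quadratic` are all assumptions.

```python
import numpy as np

def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    a, b, c = np.polyfit(x, y, deg=2)
    return a, b, c

# Simulated data (assumption): ratings on a hypothetical 1-7 scale, with the
# response constructed to peak at an intermediate rating of 4.
rng = np.random.default_rng(0)
rating = rng.uniform(1, 7, size=300)
response = -0.2 * (rating - 4.0) ** 2 + 1.0 + rng.normal(scale=0.1, size=300)

a, b, c = fit_quadratic(rating, response)
peak = -b / (2 * a)  # vertex of the fitted parabola

# An inverted U needs a negative quadratic term and a peak inside the observed range.
is_inverted_u = bool(a < 0 and rating.min() < peak < rating.max())
print(is_inverted_u, round(float(peak), 2))
```

A negative quadratic coefficient alone is not enough: if the vertex falls outside the observed ratings, the relationship is monotonic over the data, which is why the sketch also checks the peak location.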

The paper can be found here: Tuckute, G., Sathe, A., Srikant, S., Taliaferro, M., Wang, M., Schrimpf, M., Kay, K., Fedorenko, E. (2024): Driving and suppressing the human language network using large language models, Nature Human Behaviour, doi: https://doi.org/10.1038/s41562-023-01783-7.
I also wrote a blog post about this work for Springer Nature's "Behind the paper" series.

Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions

Auditory deep neural network models trained in the presence of background noise exhibit higher model-brain similarity than the models trained without background noise (hashed bars).
December 2023 Our work on models of the auditory cortex is now published in PLoS Biology (work with Jenelle Feather*, Dana Boebinger, and Josh McDermott). The paper presents an extensive analysis of audio neural networks as models of auditory processing in the brain.

We evaluated brain-model correspondence for 19 auditory deep neural networks (DNNs) (9 publicly available models, 10 models trained by us spanning four tasks) on two fMRI datasets (n=8, n=20) and using two different evaluation metrics (via regression and representational similarity analysis, RSA). We found that most DNNs (but not all) showed greater brain-model similarity than a traditional filter-bank model. Results were highly consistent between datasets and evaluation metrics. Most trained DNNs exhibited correspondence with the hierarchical organization of the auditory cortex, with earlier DNN stages best matching primary auditory cortex and later stages best matching non-primary cortex. Two other findings are worth highlighting: i) A model’s training data significantly influences its similarity to brain responses: models trained to perform word/speaker recognition in the presence of background noise are much better at accounting for auditory cortex responses than models trained in quiet. ii) A model trained on multiple tasks was the best model of the auditory cortex overall, and also accounted for neural tuning to speech and music. Training on particular tasks improved predictions for specific types of tuning, with the MultiTask model getting “best of all worlds”.
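For readers unfamiliar with RSA, the following is a minimal sketch of the general technique, not the pipeline from the paper. It builds a representational dissimilarity matrix (RDM) for a model stage and for a set of voxels, then rank-correlates the two. All data are simulated, the dissimilarity measure (1 minus Pearson correlation) is one common choice among several, and the function names are hypothetical.

```python
import numpy as np

def rdm(responses):
    """Dissimilarity matrix (1 - Pearson r) between stimuli; rows are stimuli."""
    return 1.0 - np.corrcoef(responses)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries as a flat vector."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def spearman(x, y):
    """Spearman correlation via ranks (simplification: assumes no tied values)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]

def rsa_score(model_activations, brain_responses):
    """Rank correlation between the model RDM and the brain RDM."""
    return spearman(upper_triangle(rdm(model_activations)),
                    upper_triangle(rdm(brain_responses)))

# Simulated data (assumption): a model stage and voxels that share latent structure,
# plus an unrelated control representation.
rng = np.random.default_rng(0)
n_sounds = 40
latent = rng.normal(size=(n_sounds, 8))
model_stage = latent @ rng.normal(size=(8, 64))
voxels = latent @ rng.normal(size=(8, 100)) + 0.1 * rng.normal(size=(n_sounds, 100))
unrelated = rng.normal(size=(n_sounds, 64))

related_score = rsa_score(model_stage, voxels)
unrelated_score = rsa_score(unrelated, voxels)
print(round(float(related_score), 2), round(float(unrelated_score), 2))
```

Because RSA compares stimulus-by-stimulus geometry rather than individual features, it needs no fitted mapping between model units and voxels, which makes it a useful complement to regression-based metrics.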

Finally, our work demonstrates that if the core goal is to obtain the most quantitatively accurate model of the auditory system, machine learning models move us closer to this goal. However, our findings also highlight the explanatory gap that remains (predictions of all models are well below the noise ceiling), as well as the need for better model-brain evaluation metrics and finer-grained neural recordings to better distinguish models.

The paper can be found here: Tuckute, G.*, Feather*, J., Boebinger, D., McDermott, J. (2023): Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions, PLoS Biology 21(12), doi: https://doi.org/10.1371/journal.pbio.3002366.

Investigating brain-model similarity using information-theoretic compression

The x-axis shows the mean squared reconstruction error (MSE) of BERT representations: larger values indicate less complex representations. The y-axis shows representational similarity between the posterior temporal language region and the reconstructed model representations across 5 participants.
December 2023 Mycal Tucker and I will be presenting a poster at NeurIPS 2023 (UniReps Workshop; Dec 15, 3-5pm CT) on leveraging information theoretic tools to investigate similarity between brains and large language models (LLMs).

The motivation for our work stems from the observation that languages are efficient compressors -- we convey information via lossy representations (words) (e.g., Zaslavsky et al., 2018). Hence, a good model of human language processing should compress linguistic representations in ways similar to the human brain. LLMs are today’s most accurate models of human language processing; however, it is unknown whether LLM representations contain similar amounts of information as the human brain. We took an initial stab at this question, and asked i) Do brain and LLM representations contain similar amounts of information? and ii) Can we leverage an information bottleneck approach to generate compressed representations of brain activity and better unify the representational spaces of humans and LLMs during language processing?
Our preliminary results show that compressing brain activity in frontal language regions improves alignment with LLMs, suggesting that frontal regions in the human brain encode information that LLMs do not. There was no benefit to compressing temporal brain regions, suggesting that these regions might align better with the information present in current LLMs. Broadly, our work establishes the use of information theoretic tools as a "dial" to modify representational complexity and to better unify representational spaces, including those from biological and artificial language processing.

The extended abstract can be found here: Tucker*, M. & Tuckute*, G., Increasing Brain-LLM Alignment via Information-Theoretic Compression, 37th Conference on Neural Information Processing Systems (NeurIPS 2023), UniReps Workshop.

Young Scientist Award at VCCA2023

June 2023 Very honored and thrilled to have received the Young Scientist Award at the Computational Audiology (VCCA) conference 2023! Thanks so much to the Computational Audiology community & conference organizers! I gave a talk on "Driving and suppressing the human language network using large language models" and took part in a panel discussion on large language models and chatbots. Highlights from the jury were: “Great clarity of explanations for very complex research. Excellent flow of presentation and live speaker (!). Scientific contribution, novelty and interdisciplinarity were outstanding and gave a view into the future. Great contributions to special session panel discussion, too.”. Thanks again!

Lexical semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network

We performed systematic manipulations to the sentences for which brain data was extracted and evaluated the resulting LLM-brain similarity to ask why LLMs map well onto brain data.
May 2023 Many studies have now established that large language models (LLMs) are predictive of human brain responses during language processing. Excited about our work that asks why, and which aspects of the linguistic stimulus contribute to LLM-brain similarity. Co-led with Carina Kauf*, and in collaboration with Roger Levy, Jacob Andreas, and Ev Fedorenko.

To answer these questions, we used an fMRI dataset of brain responses to n=627 diverse English sentences (Pereira et al., 2018) and systematically manipulated the stimuli for which LLM representations were extracted. We then fitted a standard encoding model to predict brain activity from the LLM representations of these perturbed stimuli, and evaluated the resulting predictivity performance.
Critically, we found that the lexical-semantic content of the sentence rather than the sentence’s syntactic form is primarily responsible for the LLM-to-brain similarity. This means that manipulations that remove the content words or alter the meaning of the sentence decrease brain predictivity. Conversely, manipulations that remove function words or perturb the syntactic structure of the sentence (e.g., by shuffling the word order) do not lead to large decreases in brain predictivity. We show that stimulus manipulations that adversely affect brain predictivity have two interpretable causes: i) they lead to more divergent representations in the LLM’s embedding space (relative to the representations of the original stimuli), and ii) they decrease the LLM’s ability to predict upcoming tokens in those stimuli. So, stimuli that are less predictable on average lead to larger decreases in brain predictivity performance.
In summary, the critical result, that lexical-semantic content is the main contributor to the similarity between ANN representations and neural ones, aligns with the idea that the goal of the human language system is to extract meaning from linguistic strings. Finally, this work highlights the strength of systematic experimental manipulations for evaluating how close we are to accurate and generalizable models of the human language network.
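The kinds of stimulus manipulations mentioned above (shuffling word order, removing function words, removing content words) can be sketched in a few lines. This is an illustrative toy, not the perturbation code from the paper: the function-word list is a deliberately tiny hypothetical stand-in, the helper names are made up, and punctuation handling is ignored.

```python
import random

# Hypothetical, deliberately tiny function-word list, for illustration only.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "is", "was"}

def shuffle_words(sentence, seed=0):
    """Perturb syntax while keeping the same words: shuffle the word order."""
    words = sentence.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

def keep_content_words(sentence):
    """Remove function words, keeping lexical-semantic content."""
    return " ".join(w for w in sentence.split() if w.lower() not in FUNCTION_WORDS)

def keep_function_words(sentence):
    """Remove content words, keeping only function words."""
    return " ".join(w for w in sentence.split() if w.lower() in FUNCTION_WORDS)

original = "The dog chased the ball in the garden"
perturbed = {
    "shuffled": shuffle_words(original),
    "content_only": keep_content_words(original),
    "function_only": keep_function_words(original),
}
for name, text in perturbed.items():
    print(f"{name}: {text}")
```

Each perturbed version would then be passed through the language model in place of the original sentence, so that any drop in brain predictivity can be attributed to the information the manipulation removed.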

The preprint can be found here: Kauf*, C., Tuckute*, G., Levy, R., Andreas, J., Fedorenko, E. (2023): Lexical semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network, bioRxiv 2023.05.05.539646; doi: https://www.biorxiv.org/content/10.1101/2023.05.05.539646v1. The paper will appear in the "Cognitive computational neuroscience of language" special issue of Neurobiology of Language.

Driving and suppressing the human language network using large language models

GPT2-XL can drive (red bars) or suppress (blue bars) brain responses in the language network of new individuals in a model-based, ‘closed-loop’ manner.
April 2023 Very excited to share our work on leveraging the predictive power of large language models to drive and suppress brain responses in the human language network. In collaboration with Aalok Sathe, Shashank Srikant, Maya Taliaferro, Mingye (Christina) Wang, Martin Schrimpf, Kendrick Kay, and Ev Fedorenko.

Transformer language models are today's most accurate models of language processing in the brain (e.g., Schrimpf et al., 2021; Goldstein et al., 2022; Caucheteux et al., 2023). However, there has been no attempt to test whether LLMs can causally control language responses in the brain. Recent work in visual neuroscience has shown that artificial neural network models of image recognition can causally intervene on the non-human primate visual system by generating visual stimuli that modulate activity in different regions of the ventral visual pathway. In this work, we ask whether similar model-based control is feasible for the higher-level cognitive domain of language: can we leverage the predictive power of LLMs to identify new stimuli to maximally drive or suppress brain responses in the language network of new individuals?

We developed an encoding model of the left hemisphere language network in the human brain with the goal of identifying new sentences that would activate the language network to a maximal or minimal extent. To do so, we recorded brain data while participants read 1,000 linguistically diverse sentences using fMRI. We fitted an encoding model to predict the language network’s response to an arbitrary sentence from GPT2-XL embeddings. Using our encoding model, we identified novel sentences to activate the language network maximally (drive sentences) or minimally (suppress sentences). We searched across ~1.8M sentences to identify these novel drive and suppress sentences and recorded brain data in new individuals using two different fMRI designs (event-related vs blocked). We found that the model-selected sentences indeed drive and suppress brain responses in new individuals as predicted, demonstrating non-invasive control of language responses in the brain.
Finally, why do certain sentences elicit higher responses than others? We took advantage of the broad distribution of model-selected linguistic input and obtained 11 features to characterize our experimental sentences: surprisal and 10 behavioral norms (collected across n=3,600 participants) reflecting 5 broad aspects of linguistic input (form/meaning, content, emotion, imageability, perceived frequency). We found that sentences that are surprising, fall in the middle of the grammaticality and plausibility range, and are hard to visualize elicit the strongest responses in the language network.
In summary, we demonstrate that it is possible to drive activity in the language network up or down with model (GPT2-XL)-selected stimuli in new individuals. Hence, we establish LLMs as causal tools to investigate language in the mind and brain, with implications for future basic-science investigations and clinical research.
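A minimal sketch of the drive/suppress selection logic, not the pipeline used in the paper. Everything here is simulated: random vectors stand in for GPT2-XL sentence embeddings, a noisy linear readout stands in for the measured language network response, and ridge regression is an assumed choice of encoding model. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: random vectors play the role of sentence embeddings
# and a noisy linear readout plays the role of the measured brain response.
n_train, n_pool, n_dim = 1000, 5000, 64
true_w = rng.normal(size=n_dim)
X_train = rng.normal(size=(n_train, n_dim))
y_train = X_train @ true_w + rng.normal(scale=0.5, size=n_train)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X'X + alpha*I)^-1 X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def select_drive_suppress(w, X_pool, k=10):
    """Return indices of the k highest- and k lowest-predicted candidates."""
    predicted = X_pool @ w
    order = np.argsort(predicted)        # ascending predicted response
    return order[-k:][::-1], order[:k]   # (drive, suppress)


w = fit_ridge(X_train, y_train)
X_pool = rng.normal(size=(n_pool, n_dim))  # simulated candidate "sentences"
drive_idx, suppress_idx = select_drive_suppress(w, X_pool, k=10)

# Only possible in simulation: compare the selection to the noiseless response.
true_pool = X_pool @ true_w
print(true_pool[drive_idx].mean(), true_pool[suppress_idx].mean())
```

In the real study the check in the last two lines is replaced by the actual experiment: the selected sentences are shown to new participants and their brain responses are measured.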

The preprint can be found here: Tuckute, G., Sathe, A., Srikant, S., Taliaferro, M., Wang, M., Schrimpf, M., Kay, K., Fedorenko, E. (2023): Driving and suppressing the human language network using large language models, bioRxiv 2023.04.16.537080; doi: https://doi.org/10.1101/2023.04.16.537080.

Cortical registration using anatomy and function

March 2023 We released a preprint on how to align brains using anatomy and function (to appear at MIDL 2023). This project is led by Jian (Andrew) Li, and in collaboration with Ev Fedorenko, Brian L. Edlow, Bruce Fischl*, and Adrian V. Dalca*.

Brains have complex geometric anatomy and differ a lot across individuals. Aligning a brain to another individual or to a common atlas space is a challenging task. Traditionally, this challenge has been solved using registration of anatomical folding patterns of the cortex. However, it is known that many functional regions in the brain do not exhibit a consistent mapping onto macro-anatomical landmarks. Hence, we might miss out on crucial information in aligning brains if we do not take function into account. In this work, we propose JOSA (Joint Spherical registration and Atlas building), which models anatomical and functional differences when aligning brains and building cortical atlases. As a proof of concept, we used functional data from a well-validated language localizer task (e.g., Lipkin et al., 2022) for a set of 150 participants, jointly aligning the language contrast (sentences versus strings of non-words) with cortical folding patterns, demonstrating better alignment in both folding patterns and function compared to two existing methods.
Upon publication, the method will be integrated as a part of VoxelMorph and/or FreeSurfer.

The preprint can be found here: Li, A., Tuckute, G., Fedorenko, E., Edlow, B.L., Fischl, B., Dalca, A.V. (2023): Joint cortical registration of geometry and function using semi-supervised learning, arXiv:2303.01592; doi: https://doi.org/10.48550/arXiv.2303.01592.

The scatter plot shows that DNN models where a speech-related task was part of the training regime match cortical speech-selective responses better than networks not trained on speech tasks.

Preprint on deep neural network models of the auditory system is out

September 2022 We released the preprint of our work on how deep neural networks (DNNs) for audio can account for brain responses in the human auditory cortex. This project is co-led with Jenelle Feather, and in collaboration with Dana Boebinger and Josh McDermott.

We evaluated brain-model correspondence for 19 DNNs (9 publicly available models, 10 models trained by us spanning four tasks) on two fMRI datasets (n=8, n=20) and using two different evaluation metrics (via regression and representational similarity analysis, RSA). We make the following five main claims: 1) Most DNN models (but not all!) outperformed traditional models of the auditory cortex. Results were highly consistent between datasets and evaluation metrics. The overall best DNN model was trained on multiple tasks (word, speaker, environmental sound recognition). 2) This brain-DNN similarity was strictly dependent on task-optimization. DNNs with permuted weights (which destroys the structure learned during model training) performed below the baseline model. 3) Most DNNs exhibited systematic correspondence with the hierarchical organization of the auditory cortex, with earlier DNN stages best matching primary auditory cortex and later stages best matching non-primary cortex. This was not true for permuted networks. 4) The task a DNN model is trained on influences its match to the brain, with e.g., speech-trained models best matching cortical speech responses (scatter plot on the right). 5) Finally, in light of recent discussion suggesting that the dimensionality of a model’s representation correlates with regression-based brain predictions, we evaluated how the effective dimensionality (ED) of each network stage correlated with both the regression and RSA metrics. There was a modest correlation between ED and brain-model similarity, but it was significantly lower than the correlation between the two datasets or between the two similarity measures. Thus ED does not seem to explain most of the variance across DNNs in our datasets.
Overall, we demonstrate that many, but not all, DNN models account for responses in the human auditory cortex with hierarchical stage-region correspondence, and provide some hints of how to improve brain-model matches for future models.
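For readers unfamiliar with effective dimensionality, here is a small sketch of one common definition, the participation ratio of the covariance eigenvalues. This is an assumption for illustration: the text above does not state which ED formula was used, and the activation matrices below are simulated rather than taken from any DNN.

```python
import numpy as np


def participation_ratio(eigenvalues):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    lam = np.clip(np.asarray(eigenvalues, dtype=float), 0.0, None)
    return lam.sum() ** 2 / (lam ** 2).sum()


def effective_dimensionality(activations):
    """ED of an (n_stimuli, n_units) activation matrix via its covariance spectrum."""
    centered = activations - activations.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (centered.shape[0] - 1)
    return participation_ratio(np.linalg.eigvalsh(cov))


rng = np.random.default_rng(0)
# Simulated "network stages": one with a single direction of variance,
# one with variance spread roughly evenly over 50 units.
rank_one = np.outer(rng.normal(size=200), rng.normal(size=50))
isotropic = rng.normal(size=(5000, 50))
print(effective_dimensionality(rank_one))   # close to 1
print(effective_dimensionality(isotropic))  # close to, but below, 50
```

Under this definition, ED is 1 when all variance lies along one direction and equals the number of units when variance is spread perfectly evenly.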

The preprint can be found here: Tuckute, G.*, Feather, J.*, Boebinger, D., McDermott, J. (2022): Many but not all deep neural network audio models capture brain responses and exhibit hierarchical region correspondence, bioRxiv 2022.09.06.506680; doi: https://doi.org/10.1101/2022.09.06.506680.

The LanA Language Atlas is published

August 2022 Our probabilistic language atlas, LanA, is now published in Nature Scientific Data and can be openly accessed here! We also have a website http://evlabwebapps.mit.edu/langatlas/ that contains easy access to data download, visualizations, and additional information.
In brief, the LanA language atlas provides the probability that any location in the brain (volume/surface) is language-selective. The atlas was derived from >800 individuals based on functional localization (a contrast between processing of sentences and a linguistically/acoustically degraded condition, such as non-word strings).
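The idea of a probabilistic atlas can be sketched in a few lines. This is a toy illustration on simulated data, not the LanA code: the rule of keeping each individual's top 10% of voxels is an assumed thresholding choice, and the function names are hypothetical.

```python
import numpy as np


def binarize_top_fraction(contrast_map, fraction=0.10):
    """Mark the voxels with the highest contrast values as 'language-selective'."""
    threshold = np.quantile(contrast_map, 1.0 - fraction)
    return contrast_map >= threshold


def probabilistic_atlas(contrast_maps, fraction=0.10):
    """Fraction of individuals whose binarized map includes each voxel."""
    binary = np.stack([binarize_top_fraction(m, fraction) for m in contrast_maps])
    return binary.mean(axis=0)


rng = np.random.default_rng(0)
n_subjects, n_voxels = 50, 1000
# Simulated contrast maps: voxels 0-99 carry a stronger sentences > non-words effect.
maps = rng.normal(size=(n_subjects, n_voxels))
maps[:, :100] += 3.0
atlas = probabilistic_atlas(maps)
print(atlas[:100].mean(), atlas[100:].mean())
```

Each atlas value is the proportion of simulated individuals for whom that voxel survived thresholding, so it can be read directly as a probability between 0 and 1.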

Citation: Benjamin Lipkin, Greta Tuckute, Josef Affourtit, Hannah Small, Zachary Mineroff, Hope Kean, Olessia Jouravlev, Lara Rakocevic, Brianna Pritchett, Matthew Siegelman, Caitlyn Hoeflin, Alvincé Pongos, Idan Blank, Melissa Kline Shruhl, Anna Ivanova, Steven Shannon, Aalok Sathe, Malte Hoffmann, Alfonso Nieto-Castañón, Evelina Fedorenko (2022): LanA (Language Atlas): Probabilistic atlas for the language network based on precision fMRI data from >800 individuals. Sci Data 9, 529; doi: https://doi.org/10.1038/s41597-022-01645-3.

The coordinate system demonstrates some of the adversarial axes that we will focus on during our GAC workshop. Each dot is a speaker's opinion on a set of questions, e.g., to which extent we are still in the dark ages of neuroscience and more work needs to be conducted before we start collecting data at a grain leveraged for building artificial neural networks of brain activity and behavior.

Conference on Cognitive Computational Neuroscience 2022

August 2022 Excited to take part in the Conference on Cognitive Computational Neuroscience (CCN) this year where I am co-organizing a Generative Adversarial Collaboration (GAC) as well as presenting a poster.

The GAC workshop takes place Friday August 26 (1.30-4.15pm PT) and aims to tackle how we can optimally use neuroscience data to guide the next generation of brain models. Current use of data is often limited to post-hoc model evaluation or vague ‘inspirations’ for model development. Here, we ask: Can we use neuroscience data more efficiently for model development? Is it even the right time in neuroscience to do this? How much data is enough? What type of data should we collect?
The GAC team (and speakers) include Ko Kar (York University, MIT), Joel Zylberberg (York University), SueYeon Chung (NYU), Alona Fyshe (University of Alberta), Ev Fedorenko (MIT), Konrad Kording (University of Pennsylvania), Nikolaus Kriegeskorte (Columbia University), Jacob Yates (UC Berkeley), and Kalanit Grill-Spector (Stanford University).
I will be giving a talk on how to optimize data collection for model development within language. Specifically, I will try to answer why many existing neuroscience datasets within language are not ideal for model development – and I will provide ideas for ways forward.

I will be presenting a poster on Friday August 26 (7.30-9.30pm PT) and our work (with Jenelle Feather*, Dana Boebinger, and Josh McDermott) is on how several auditory networks with diverse architectures trained for diverse tasks capture human brain responses to natural sounds. The poster will focus on how robust our findings are to the model evaluation metric of interest (regression versus representational similarity analysis) as well as how our findings might be affected by latent variables such as effective dimensionality of network activations.

We investigated why certain words are more memorable than others. For instance, number of synonyms (x-axis) correlates negatively with the word recognition performance of a word (y-axis): words with many synonyms are more forgettable, possibly because any of the synonyms of a word could have generated the relevant meaning in semantic memory (see the preprint for 12 other predictors).

Intrinsically memorable words have unique associations with their meanings

July 2022 This project is the result of a big joint effort with Kyle Mahowald (co-lead), Phillip Isola, Aude Oliva, Edward Gibson, and Ev Fedorenko.

PINEAPPLE, LIGHT, HAPPY, AVALANCHE, BURDEN

Some of these words are consistently remembered better than others. Why is that? In this project, we provide a simple Bayesian account and show that it explains >80% of variance in word memorability.
Building on past work that suggested that words are encoded by their meanings, we hypothesize that words that uniquely pick out a meaning in semantic memory (i.e., unambiguous words with no/few synonyms) are more memorable. We evaluated our account in two behavioral experiments (each with >600 participants and 2,222 target words), similar to past work on image memorability. Participants viewed a sequence of words and pressed a button whenever they encountered a repeat (critical memory repeats occurred 91-109 words apart).
Key findings: 1) Words are as memorable as images. In our experiments, the hit rate was ~68% and the false alarm rate was ~10% which is on par with images (e.g., Isola et al., 2011 CVPR). There does not appear to be a memory advantage for images compared to words. 2) Certain words are consistently remembered better than others across participants – so although individuals differ in their exposure to the amount and kinds of linguistic information across their lifetimes, memorability is largely an intrinsic word property. 3) Critically, most memorable words have a one-to-one relationship with their meaning (such as PINEAPPLE or AVALANCHE). They uniquely pick out a particular meaning in semantic memory, in contrast to ambiguous words (e.g., LIGHT which could mean a fixture in a house, the opposite of heavy, cigarette lighter, etc.) or words with many synonyms (e.g., HAPPY with synonyms CHEERFUL, JOYFUL, GLAD, etc.). Number of synonyms was a more important predictor than number of meanings.
Given that our critical predictors (number of synonyms and meanings) can be estimated from language corpora, this simple account provides a scalable model that can make predictions about memorability of newly encountered words in any language where large corpora are available. Memorability can be used to answer cool questions about how the mind and brain prioritize and organize information during semantic memory encoding. Understanding which words lead to longer-lasting memory traces can be leveraged to enable more effective information sharing.
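The hit rate and false alarm rate reported above follow the standard definitions for a repeat-detection task. A toy sketch with a made-up ten-trial sequence (the data and the function name are hypothetical, not from the experiments):

```python
def hit_and_false_alarm_rate(is_repeat, pressed):
    """Hit rate: presses on repeats / repeats.
    False alarm rate: presses on first presentations / first presentations."""
    hits = sum(1 for r, p in zip(is_repeat, pressed) if r and p)
    false_alarms = sum(1 for r, p in zip(is_repeat, pressed) if not r and p)
    n_repeats = sum(is_repeat)
    n_new = len(is_repeat) - n_repeats
    return hits / n_repeats, false_alarms / n_new


# Hypothetical mini-sequence: True in is_repeat marks a trial showing a repeated word,
# True in pressed marks a trial on which the participant pressed the button.
is_repeat = [False, False, False, True, False, True, True, False, False, True]
pressed = [False, True, False, True, False, True, False, False, False, True]
hit_rate, fa_rate = hit_and_false_alarm_rate(is_repeat, pressed)
print(hit_rate, fa_rate)
```

A word's memorability can then be summarized from the hit rate on its repeat trials, pooled across the participants who saw it.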

The preprint can be found here: Tuckute, G.*, Mahowald, K.*, Isola, P., Oliva, A., Gibson, E., Fedorenko, E. (2022). Intrinsically memorable words have unique associations with their meanings, PsyArXiv, doi: https://doi.org/10.31234/osf.io/p6kv9. (This is a revival of a project that got started back in 2011 and we are excited to share a new and improved version of the manuscript, along with the data and analysis scripts.)

We present SentSpace: a framework for streamlined evaluation of text using cognitively motivated linguistic features. This makes it possible to compare text from, e.g., artificial language models and humans, as demonstrated above.

SentSpace: Large-scale benchmarking and evaluation of text using cognitively motivated lexical, syntactic, and semantic features

June 2022 SentSpace would not exist without Aalok Sathe* (co-lead), Mingye (Christina) Wang, Harley Yoder, Cory Shain and Ev Fedorenko.
Imagine that you want to quantify a sentence using a large set of interpretable features. Maybe you are interested in obtaining features that relate to the sentiment of the sentence, maybe features that are known to cause language processing difficulty (such as frequency or age of acquisition). With SentSpace, we introduce such a system: we enable streamlined evaluation of any textual input. SentSpace characterizes textual input using diverse lexical, syntactic, and semantic features derived from corpora and psycholinguistic experiments. These features fall into two main domains (sentence spaces, hence the name): lexical and contextual. Lexical features operate on individual lexical items (words) and include features such as concreteness, age of acquisition, lexical decision latency, and contextual diversity. Because several properties of a sentence cannot be attributed to individual words, the contextual module quantifies a sentence as a whole. This module includes features such as syntactic storage and integration cost, center embedding depth, and sentiment. Hence, SentSpace provides an interpretable sentence embedding with features that have been shown to affect language processing.
SentSpace allows for quantification and comparison of different types of text and can be useful for answering questions like: How does text generated by an artificial language model compare to that of humans? How do utterances produced by neurotypicals compare to those of individuals with communication disorders? What psycholinguistic information do high-dimensional vector representations from artificial language models capture?

Aalok and I will be demonstrating the current (first!) version of SentSpace at NAACL 2022 in Seattle July 10-15 (System Demonstration poster session July 12). We would love feedback, so please don't hesitate to reach out! The proceedings paper can be found here: Tuckute, G.*, Sathe, A., Wang, M., Yoder, H., Shain, C., and Fedorenko, E (2022). SentSpace: Large-Scale Benchmarking and Evaluation of Text using Cognitively Motivated Lexical, Syntactic, and Semantic Features. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations. Association for Computational Linguistics.
The SentSpace Python package can be accessed at sentspace.github.io/sentspace and the hosted frontend website at sentspace.github.io/hosted.

A surface map of the probabilistic language atlas (LanA). Lighter color indicates higher probability of that part of the brain being language-selective. Evidently, the language network falls (mostly) within the left hemisphere frontal and temporal lobes.

LanA (Language Atlas): A probabilistic atlas for the language network based on fMRI data from >800 individuals

March 2022 This work is a massive effort (14yrs of data collection!) in collaboration with Benjamin Lipkin (lead), Ev Fedorenko, and a bunch of brilliant current/former lab members of EvLab.
Given any location in the brain, what is the probability of that particular location being selective to language? We present a probabilistic language atlas (LanA) that allows us to answer exactly this question. For any 3D pixel (voxel/vertex) in the volumetric or surface brain coordinate spaces, how likely is that pixel to fall within the language network? The atlas was obtained from >800 individuals based on functional localization (a contrast between processing of sentences and a linguistically/acoustically degraded condition, such as non-word strings). Thus, among these ~800 individuals, we provide a group average map that makes it possible to quantify and visualize where ‘the average’ language network resides.
Examples of use cases of LanA are: 1) A common reference frame for analyzing group-level activation peaks from past/future fMRI studies, 2) Lesion locations in individual brains, 3) Electrode location in intracranial ECoG/SEEG investigations, 4) Functional mapping during brain surgery when fMRI is not possible, and others (please see paper introduction). The atlas will be made publicly available (along with individual contrast/significance maps and demographic data) upon publication.

The preprint can be found here: Benjamin Lipkin, Greta Tuckute, Josef Affourtit, Hannah Small, Zachary Mineroff, Hope Kean, Olessia Jouravlev, Lara Rakocevic, Brianna Pritchett, Matthew Siegelman, Caitlyn Hoeflin, Alvincé Pongos, Idan Blank, Melissa Kline Shruhl, Anna Ivanova, Steven Shannon, Aalok Sathe, Malte Hoffmann, Alfonso Nieto-Castañón, Evelina Fedorenko (2022): LanA (Language Atlas): A probabilistic atlas for the language network based on fMRI data from >800 individuals. bioRxiv 2022.03.06.483177; doi: https://doi.org/10.1101/2022.03.06.483177.

A surface map showing which layer of VGGish (model trained for environmental sound classification) best predicts each vertex of the surface (aggregated over n=20 participants). Earlier layers of the model in green, later layers in red. The overlaid label marks the primary auditory cortex. A relationship between the model layer hierarchy and the auditory cortex is present.

Hierarchical layer-region correspondence of deep neural networks for audition

February 2022 This work is a great collaboration with Jenelle Feather*, Dana Boebinger, and Josh McDermott.
An overarching aim of neuroscience is to build quantitatively accurate computational models of sensory systems. Deep neural networks provide such candidate models. To consider these neural networks as serious candidate models, they must at least 1) Perform a task that is relevant to the real world, 2) Be predictive of brain data, and 3) Be mappable (meaning that earlier layers of the network map onto earlier parts of the cortical hierarchy in the brain, and later layers onto later parts).
Such models are relatively well explored within vision (convolutional neural networks trained for image classification) (e.g., Yamins et al., 2014), but less explored in audition. Kell et al. (2018) showed that a particular neural network architecture was predictive of brain responses and had a degree of correspondence between model stages and brain regions. However, it is unclear whether these results generalize to other neural network models. In our work, we evaluated brain-model correspondence for publicly available audio neural network models along with in-house models trained on five different tasks. We used two independent datasets (Norman-Haignere et al., 2015, n=8; Boebinger et al., 2021, n=20) of participants listening to natural sounds in the fMRI scanner. Most tested models were more predictive of brain responses than traditional spectrotemporal models of auditory cortex, and exhibited a nice relationship between the model layer hierarchy and the cortical hierarchy in the human brain. However, this was not true for all tested models: not all state-of-the-art models were either predictive or mappable. This work helps us understand which parameters are necessary to yield a computationally accurate model of the human auditory cortex and substantiates our knowledge of the hierarchical organization of the auditory cortex.

I will be discussing these findings and other aspects of the work at Cosyne 2022 in Lisbon, Portugal, March 17-20 (poster session 2).

The neural architecture of language: Integrative modeling converges on predictive processing

November 2021 Our paper on artificial neural networks (ANNs) as models of language comprehension is now out in PNAS and it received some nice coverage, for instance by Scientific American. I want to emphasize two points from this paper: 1) We show that better-performing language models (based on next-word prediction) also match the brain better. Critically, we did not find this link for performance on other linguistic benchmarks (GLUE), suggesting that a drive to predict future inputs may shape human language processing. Thus, both the human language system and successful ANNs seem to be optimized for predictivity to efficiently extract meaning. 2) Model architecture alone (initialization weights) can reliably predict brain activity, possibly suggesting that these untrained representational spaces already provide enough structure to constrain and predict a given input.
I think these two points open up multiple exciting research questions: Given that better-performing models are more brain-like, how can we engineer more brain-inspired models? Most state-of-the-art language models are inefficient (requiring billions of parameters and training samples resulting in massive energy expenditure), not robust (can be fooled by adversarial input), and not very interpretable (making it challenging to localize causes of success/unwanted capabilities). How can we exploit principles from the human brain that allow us to process language efficiently and robustly? Can we modularize or constrain language model representations using human data? In which scenarios do interpretability and performance go hand in hand? Lastly, which human and ANN benchmarks could be most meaningful to evaluate some of the aforementioned questions?

The paper can be found here: Schrimpf, M., Blank, I.*, Tuckute, G.*, Kauf, C.*, Hosseini, E. A., Kanwisher, N., Tenenbaum^, J., Fedorenko^, E (2021): The neural architecture of language: Integrative modeling converges on predictive processing, PNAS Vol. 118, Issue 45; doi: https://doi.org/10.1073/pnas.2105646118.

Can we use transformer models to drive language regions in the brain?

July 2021 I gave an informal 'poster' presentation at the Boston/Cambridge CogSci 2021 meet-up on exploiting transformer language models to drive regions in the human brain. I presented ideas and preliminary data on whether and how that is feasible, and if so, what we can learn from it. Thanks for the great discussions! This is ongoing work with Mingye Wang, Elizabeth Lee, Martin Schrimpf, Noga Zaslavsky, and Ev Fedorenko. More soon!

We investigated a woman living without her left temporal lobe, most likely as a result of pre/perinatal stroke.

Frontal language areas do not emerge in the absence of temporal language areas

May 2021 This work is a joint effort and brilliant collaboration with Alexander Paunov, Hope Kean, Hannah Small, Zachary Mineroff, Idan Blank, and Ev Fedorenko. High-level language processing is supported by a left-lateralized fronto-temporal brain network. In this work, we investigated whether frontal language areas emerge in the absence of temporal language areas. To do so, we examined language processing in the brain of an individual (EG) born without a left temporal lobe. We used fMRI methods to establish that the right hemisphere language network is similar to the left hemisphere language network in controls. However, the critical question was whether EG’s intact left lateral frontal lobe contained language-responsive areas. We found no reliable response to language in EG’s intact left frontal lobe, suggesting that the existence of temporal language areas appears to be a prerequisite for the emergence of language areas in the frontal lobe.

The paper can be found here: Tuckute, G., Paunov, A., Kean, H., Small, H., Mineroff, Z., Blank, I., and Fedorenko, E. (2021): Frontal language areas do not emerge in the absence of temporal language areas: A case study of an individual born without a left temporal lobe, bioRxiv 2021.05.28.446230; doi: https://doi.org/10.1101/2021.05.28.446230.

We link behavioral task performance to neural EEG states (effect only significant in the neurofeedback group and not controls).

Real-time decoding of visual attention using closed-loop EEG neurofeedback

March 2021 Happy to share that my MSc thesis work from DTU is now published (with Sofie T. Hansen, Troels W. Kjaer and Lars K. Hansen).
Neurofeedback is a powerful tool for linking neural states to behavior. In this project, we asked i) whether we can decode covert states of visual attention using a closed-loop EEG system, and ii) whether a single neurofeedback training session can improve sustained attention abilities. We implemented an attention training paradigm designed by DeBettencourt et al. (2015) in EEG. In a double-blinded design, we trained twenty-two participants on the attention paradigm within a single neurofeedback session with behavioral pretraining and posttraining sessions.
We demonstrate that we are able to decode covert visual attention in real time. First of all, we report a mean classifier decoding error rate of 34.3% (chance = 50%). Second, we link this decoding performance to behavioral states, and show that within the neurofeedback group, there was a greater level of task-relevant attentional information decoded in the participant's brain before making a correct behavioral response than before an incorrect response (not evident in the control group; interaction p=7.23e−4). This indicates that we were able to achieve a meaningful measure of subjective attentional state in real time and control participants' behavior during the neurofeedback session. Lastly, we do not provide conclusive evidence whether a single neurofeedback session per se provided lasting effects in sustained attention abilities.

The paper can be found here: Tuckute, G., Hansen, S.T., Kjaer, Troels W., Hansen, L. K. (2021): Real-Time Decoding of Attentional States Using Closed-Loop EEG Neurofeedback, Neural Computation Vol. 33, Issue 4; doi: https://doi.org/10.1162/neco_a_01363.
A video of the neurofeedback system is available here. The code and sample data for the neurofeedback framework are available on GitHub.

Correlation of connectivity among brain networks and phantom limb sensation. We show that individuals with a low degree of phantom sensation (i.e. low neuroprosthetic controllability) have a strong connectivity among visual and sensorimotor networks, possibly as a compensatory mechanism.

Biological closed-loop feedback preserves proprioceptive sensorimotor signaling

December 2020 This work is a great collaboration with Shriya Srinivasan (lead), Jasmine Zou, Samantha Gutierrez-Arango, Hyungeun Song, Robert L. Barry, and Hugh Herr.
The brain undergoes marked changes in function after limb loss and amputation. In this work, we investigate individuals with a traditional lower limb amputation, no amputation and a novel amputation procedure that preserves physiological central-peripheral signaling mechanisms. We demonstrate that the proprioceptive signaling enabled by the novel amputation procedure restores sensorimotor feedback in the brain. We investigate changes in functional connectivity in the brain, and show that the lack of proprioceptive feedback results in a strong coupling between visual and sensorimotor networks. This suggests a heavy reliance on visual information when no sensory feedback is available, possibly as a compensatory mechanism. Conclusively, we demonstrate that closed-loop proprioceptive feedback can enable desired neuroplastic changes toward improved neuroprosthetic capability.

The paper can be found here: Srinivasan, S. S., Tuckute, G., Zou, J., Gutierrez-Arango, S., Song, H., Barry, R. L., Herr, H (2020): AMI Amputation Preserves Proprioceptive Sensorimotor Neurophysiology, Science Translational Medicine, Vol. 12, Issue 573, doi: 10.1126/scitranslmed.abc5926.

ANNs as models of language processing in the brain

October 2020 I gave a workshop talk at the Center for Cognitive and Behavioral Brain Imaging (CCBBI) at The Ohio State University on artificial neural networks as models of language processing. Part of the talk was based on the work by Schrimpf et al., 2020, while another part focused on the methodological considerations of comparing neural network models to brain representations. The talk can be found on OnNeuro.

Linguistic and Conceptual Processing are Dissociated During Sentence Comprehension

September 2020 This work is a great collaboration with Cory Shain, Idan A. Blank, Mingye Wang, and Ev Fedorenko.
The human mind stores a vast array of linguistic knowledge, including word meanings, word frequencies and co-occurrence patterns as well as syntactic constructions. These different kinds of knowledge have to be efficiently accessed during incremental language comprehension. In this work, we ask how dissociable the memory stores and processing mechanisms of these different types of knowledge are. Moreover, do different types of knowledge representations and processing rely on language-specific networks in the human brain, domain-general networks, or both? To address these questions, we used representational similarity analysis (RSA) to relate linguistic knowledge and processing to neural data.
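For readers new to RSA, a minimal sketch of the basic computation on simulated data. The dissimilarity measure (1 minus Pearson correlation) and the Spearman comparison of RDMs are common defaults assumed here for illustration; the specific choices made in this project are not stated above, and all names and data are hypothetical.

```python
import numpy as np


def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between condition patterns."""
    return 1.0 - np.corrcoef(patterns)


def rank(values):
    """Ranks 0..n-1 (assumes no ties, which holds for continuous data)."""
    ranks = np.empty(len(values))
    ranks[np.argsort(values)] = np.arange(len(values))
    return ranks


def rsa_score(patterns_a, patterns_b):
    """Spearman correlation between the upper triangles of two RDMs."""
    upper = np.triu_indices(patterns_a.shape[0], k=1)
    a, b = rdm(patterns_a)[upper], rdm(patterns_b)[upper]
    return np.corrcoef(rank(a), rank(b))[0, 1]


rng = np.random.default_rng(0)
# Simulated data: 20 conditions described by 100 model features, a "brain"
# that is a noisy linear readout of those features in 300 voxels, and an
# unrelated set of voxel patterns as a control.
model = rng.normal(size=(20, 100))
brain = model @ rng.normal(size=(100, 300)) + rng.normal(size=(20, 300))
unrelated = rng.normal(size=(20, 300))
print(rsa_score(model, brain), rsa_score(model, unrelated))
```

Because RSA compares dissimilarity structure rather than raw features, the two representations can have different dimensionalities (here 100 features versus 300 voxels).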

I will be presenting this ongoing work (poster) at SNL 2020 in October. Poster Session: A, Board #: 29, Wednesday, October 21, 12:00 pm PDT.

Left panel: Methodology of brain to ANN comparisons. Right panel: Brain predictivity correlates with computational accounts of predictive processing (next-word prediction).

Artificial neural networks accurately predict language processing in the brain

July 2020 This work is a great collaboration with Martin Schrimpf (lead), Idan A. Blank, Carina Kauf, Eghbal A. Hosseini, supervised by Nancy Kanwisher, Josh Tenenbaum and Ev Fedorenko.
In recent years, great progress has been made in modeling sensory systems with artificial neural networks (ANNs) to provide mechanistic accounts of brain processing. In this work, we investigate if we can exploit ANNs to inform us about higher level cognitive functions in the human brain – specifically, language processing. Here, we ask which language models best capture human neural (fMRI/ECoG) and behavioral responses. Moreover, we investigate how this links to computational accounts of predictive processing. Lastly, we examine the contributions of intrinsic model network architecture in brain predictivity. We tested 43 state-of-the-art language models spanning a diverse set of embedding, recurrent, and transformer models. In brief, certain transformer families (GPT2) demonstrate consistently high predictivity across all neural datasets investigated. These models’ performance on neural data correlates with language modeling performance (next-word prediction), but not with performance on the General Language Understanding Evaluation (GLUE) benchmarks, suggesting that a drive to predict future inputs may shape human language processing. Thus, both the human language system and successful ANNs seem to be optimized for predictivity to efficiently extract meaning. Lastly, model architecture alone (random weights, no training) can reliably predict brain activity, possibly suggesting that these untrained representational spaces already provide enough structure to constrain and predict a given input, analogous to evolutionary-based optimization.

The pre-print can be found here: Schrimpf, M., Blank, I., Tuckute, G., Kauf, C., Hosseini, E. A., Kanwisher, N., Tenenbaum, J., Fedorenko, E (2020): Artificial Neural Networks Accurately Predict Language Processing in the Brain, bioRxiv 2020.06.26.174482; doi: https://doi.org/10.1101/2020.06.26.174482.

Martin Schrimpf will also be presenting this work (slide) at SNL 2020 in October (SNL 2020 Merit Award Honorable Mention).