| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Linearly mapping from image to text space | J Merullo, L Castricato, C Eickhoff, E Pavlick | arXiv preprint arXiv:2209.15162 | 101 | 2022 |
| Does CLIP bind concepts? Probing compositionality in large image models | M Lewis, NV Nayak, P Yu, Q Yu, J Merullo, SH Bach, E Pavlick | arXiv preprint arXiv:2212.10537 | 50 | 2022 |
| Circuit component reuse across tasks in transformer language models | J Merullo, C Eickhoff, E Pavlick | arXiv preprint arXiv:2310.08744 | 45 | 2023 |
| Characterizing mechanisms for factual recall in language models | Q Yu, J Merullo, E Pavlick | arXiv preprint arXiv:2310.15910 | 36 | 2023 |
| Language models implement simple word2vec-style vector arithmetic | J Merullo, C Eickhoff, E Pavlick | arXiv preprint arXiv:2305.16130 | 35* | 2023 |
| Investigating sports commentator bias within a large corpus of American football broadcasts | J Merullo, L Yeh, A Handler, A Grissom II, B O'Connor, M Iyyer | arXiv preprint arXiv:1909.03343 | 22 | 2019 |
| Does CLIP Bind Concepts? Probing Compositionality in Large Image Models | M Lewis, NV Nayak, P Yu, Q Yu, J Merullo, SH Bach, E Pavlick | | 6 | 2023 |
| Talking Heads: Understanding Inter-layer Communication in Transformer Language Models | J Merullo, C Eickhoff, E Pavlick | arXiv preprint arXiv:2406.09519 | 5 | 2024 |
| ezCoref: Towards unifying annotation guidelines for coreference resolution | A Gupta, M Karpinska, W Zhao, K Krishna, J Merullo, L Yeh, M Iyyer, ... | arXiv preprint arXiv:2210.07188 | 5 | 2022 |
| Axiomatic causal interventions for reverse engineering relevance computation in neural retrieval models | C Chen, J Merullo, C Eickhoff | Proceedings of the 47th International ACM SIGIR Conference on Research and … | 4 | 2024 |
| Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting | S Anand, MA Lepori, J Merullo, E Pavlick | arXiv preprint arXiv:2406.00053 | 4 | 2024 |
| Pretraining on interactions for learning grounded affordance representations | J Merullo, D Ebert, C Eickhoff, E Pavlick | arXiv preprint arXiv:2207.02272 | 4 | 2022 |
| $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources | A Khandelwal, T Yun, NV Nayak, J Merullo, SH Bach, C Sun, E Pavlick | arXiv preprint arXiv:2410.23261 | 2 | 2024 |
| Transformer mechanisms mimic frontostriatal gating operations when trained on human working memory tasks | A Traylor, J Merullo, MJ Frank, E Pavlick | arXiv preprint arXiv:2402.08211 | 2 | 2024 |
| ACQuA: Arrhythmia Classification with Quasi-Attractors | W Rudman, J Merullo, L Mercurio, C Eickhoff | medRxiv 2022.08.31.22279436 | 1* | 2022 |
| On Linear Representations and Pretraining Data Frequency in Language Models | J Merullo, NA Smith, S Wiegreffe, Y Elazar | The Thirteenth International Conference on Learning Representations | | 2025 |