Jacob Mitchell Springer
[email hidden] · sprin.xyz · github.com/jakespringer · Google Scholar
Education
Ph.D., Machine Learning · Aug 2022 – Present
Carnegie Mellon University, Pittsburgh, PA
Advisor: Aditi Raghunathan. Supported by the NSF Graduate Research Fellowship.
B.A., Mathematics and Computer Science · Aug 2017 – May 2022
Swarthmore College, Swarthmore, PA
Publications
* In submission.
- J. Springer, M. Advani, L. Aichberger, A. Bradley, E. Malach, O. Saremi, S. Williamson, P. Nakkiran, E. Littwin, A. Raghunathan. Annotations mitigate post-training mode collapse. ICML, 2026.
- I. Watts, C. Li, S. Goyal, J. Springer, A. Raghunathan. Sharpness-aware pretraining mitigates catastrophic forgetting. ICML, 2026. Oral, ICBINB Workshop @ ICLR 2026
- L. Feng, G. R. Ghosal, J. Springer, Z. Zhong, A. Raghunathan. Mix early, forget less: data mixing during pretraining builds resistance to forgetting. In submission, 2025.*
- A. Kulkarni, J. Springer, A. Subramonian, S. Swayamdipta. Disentangling geometry, performance, and training in language models. ICML, 2026. Spotlight
- J. Springer, S. Goyal, K. Wen, T. Kumar, X. Yue, S. Malladi, G. Neubig, A. Raghunathan. Overtrained language models are harder to fine-tune. ICML, 2025. Outstanding Paper, SCOPE Workshop @ ICLR 2025; Entropic Paper Award, ICBINB Workshop @ ICLR 2025
- J. Springer, S. Kotha, D. Fried, G. Neubig, A. Raghunathan. Repetition improves language model embeddings. ICLR, 2025.
- T. Kim, J. Springer, A. Raghunathan, M. Sap. Mitigating bias in RAG: controlling the embedder. ACL Findings, 2025.
- J. Springer, V. Adlakha, S. Reddy, A. Raghunathan, M. Mosbach. Understanding the influence of synthetic data for text embedders. ACL Findings, 2025.
- J. Springer, V. Nagarajan, A. Raghunathan. Sharpness-aware minimization enhances feature quality via balanced learning. ICLR, 2024.
- S. Kotha, J. Springer, A. Raghunathan. Understanding catastrophic forgetting in language models via implicit inference. ICLR, 2024.
- H. T. Jones, J. Springer, G. T. Kenyon, J. Moore. If you've trained one you've trained them all: inter-architecture similarity increases with robustness. UAI, 2022. Oral presentation
- J. Springer, M. Mitchell, G. T. Kenyon. A little robustness goes a long way: leveraging robust features for targeted transfer attacks. NeurIPS, 2021.
- J. Springer, B. M. Reinstadler, U.-M. O'Reilly. STRATA: simple, gradient-free attacks for models of code. Workshop on Adversarial Learning Methods @ KDD, 2021.
- J. Springer, G. T. Kenyon. It's hard for neural networks to learn the Game of Life. IJCNN, 2021.
- J. Springer, M. Mitchell, G. T. Kenyon. Adversarial perturbations are not so weird: entanglement of robust and non-robust features in neural network classifiers. Preprint, 2021.
- D. A. Wang, C. M. S. Strauss, J. Springer, A. Thresher, H. Pritchard, G. T. Kenyon. Sparse MP4. IEEE SSIAI, 2020.
- J. Springer, C. S. Strauss, A. M. Thresher, E. Kim, G. T. Kenyon. Classifiers based on deep sparse coding architectures are robust to deep learning transferable examples. Preprint, 2018.
- J. Springer, W. Feng. Teaching with angr: a symbolic execution curriculum and CTF. USENIX Workshop on Advances in Security Education, 2018.
Research Experience
Research Intern · Jun – Sep 2025
Apple Machine Learning Research, Cupertino, CA
Advised by Etai Littwin.
Graduate Research Assistant · Aug 2022 – Present
Carnegie Mellon University, Pittsburgh, PA
Advised by Aditi Raghunathan.
Research Assistant, ML & Computational Neuroscience · Jan – Jul 2022
Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Advised by Anthony Zador.
Research Assistant, ML & Computational Neuroscience · Jun 2018 – Dec 2021
Los Alamos National Laboratory, Los Alamos, NM
Advised by Garrett Kenyon.
Research Intern · Jun – Aug 2020
MIT, Cambridge, MA
Advised by Una-May O'Reilly.
Research Intern, Computer Security Education · Jun – Aug 2017
Portland State University, Portland, OR
Advised by Wu-chang Feng.
Invited Talks
- Don't (Just) Minimize Your Pretraining Loss. ML Foundations Seminar, Apr 2026.
- Overtrained Language Models Are Harder to Fine-Tune. Waymo, Aug 2025.
- Echo Embeddings & Overtrained Language Models Are Harder to Fine-Tune. Summer of Data Seminar, DatologyAI, Jun 2025.
- Overtrained Language Models Are Harder to Fine-Tune. Translate Reading Group, Google, Jun 2025.
- Overtrained Language Models Are Harder to Fine-Tune. FLAME Center Seminar, Carnegie Mellon University, Apr 2025.
- Repetition Improves Language Model Embeddings. Foundation and Language Model (FLAME) Seminar, Carnegie Mellon University, Mar 2024.
- What Can Adversarial Examples Tell Us About Similarities Between Neural Networks? Pacific Northwest Seminar on Topology, Algebra, and Geometry in Data Science, University of Washington, Feb 2023.
Awards
| Outstanding Paper, SCOPE Workshop @ ICLR | 2025 |
| Entropic Paper Award, ICBINB Workshop @ ICLR | 2025 |
| Hertz Fellowship, Finalist | 2023 |
| NSF Graduate Research Fellowship | 2022 |
| Barry M. Goldwater Scholarship | 2020 |
| National Merit Scholarship, Finalist | 2017 |