Andrew Drozdov

Research Scientist @ Databricks

PhD @ UMass Amherst CICS
Thesis: Unlocking Natural Language Generalization with Adaptive Retrieval-based Methods
Advisors: Andrew McCallum, Mohit Iyyer
TA: CS 685, CS 696DS
Organizer: Data Science Tea

MS @ NYU CS
Mentors: Samuel Bowman, Kyunghyun Cho

Previously at Google and IBM.

I have reviewed 100+ papers at top AI/IR/NLP conferences and supervised the review of many more as AC and SAC.

Say hello: andrew.drozdov@databricks.com

Research

I’m broadly interested in neural network-related topics including training, inference, in-context learning, knowledge distillation, and evaluation. Most of my work has been in natural language processing and information retrieval. I’m particularly excited about the emerging field of generative retrieval.

Publications

Drowning in Documents: Consequences of Scaling Reranker Inference

Mathew Jacob, Erik Lindgren, Matei Zaharia, Michael Carbin, Omar Khattab, and Andrew Drozdov

arXiv 2024

arXiv Bib

@article{jacobs2024drowning,
  bibtex_show = {true},
  selected = {true},
  category = {main},
  title = {{Drowning in Documents}: Consequences of Scaling Reranker Inference},
  author = {Jacob, Mathew and Lindgren, Erik and Zaharia, Matei and Carbin, Michael and Khattab, Omar and Drozdov, Andrew},
  journal = {arXiv},
  year = {2024},
  arxiv = {https://arxiv.org/abs/2411.11767}
}

Tutorial
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities

Fernando Diaz, Andrew Drozdov, To Eun Kim, Alireza Salemi, and Hamed Zamani

SIGIR-AP 2024

Bib HTML

@article{diaz2024remal,
  bibtex_show = {true},
  selected = {true},
  category = {main},
  abbr = {Tutorial},
  title = {{Retrieval-Enhanced Machine Learning}: Synthesis and Opportunities},
  author = {Diaz, Fernando and Drozdov, Andrew and Kim, To Eun and Salemi, Alireza and Zamani, Hamed},
  journal = {SIGIR-AP},
  year = {2024},
  html = {https://dl.acm.org/doi/abs/10.1145/3673791.3698439}
}

Thesis

Unlocking Natural Language Generalization with Adaptive Retrieval-based Methods

Andrew Drozdov

2024

Bib HTML

@article{drozdov2024thesis,
  bibtex_show = {true},
  selected = {true},
  category = {main},
  abbr = {Thesis},
  title = {Unlocking Natural Language Generalization with Adaptive Retrieval-based Methods},
  author = {Drozdov, Andrew},
  year = {2024},
  html = {https://scholarworks.umass.edu/entities/publication/04520d37-6227-4441-9ec8-e91495271ce8}
}

ReDMM: Retrieval Driven Memory Manager

Andrew Drozdov, Andrew McCallum, and Mohit Iyyer

2024

Bib

@article{drozdov2024redmm,
  bibtex_show = {true},
  selected = {true},
  category = {main},
  title = {{ReDMM}: Retrieval Driven Memory Manager},
  author = {Drozdov, Andrew and McCallum, Andrew and Iyyer, Mohit},
  year = {2024}
}

Long Paper
Multistage collaborative knowledge distillation from a large language model for semi-supervised sequence generation

Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md Arafat Sultan, Jay Yoon Lee, Mohit Iyyer, and Andrew McCallum

In ACL 2024

arXiv Bib

@inproceedings{zhao2024multistage,
  bibtex_show = {true},
  selected = {true},
  category = {main},
  abbr = {Long Paper},
  title = {Multistage collaborative knowledge distillation from a large language model for semi-supervised sequence generation},
  author = {Zhao, Jiachen and Zhao, Wenlong and Drozdov, Andrew and Rozonoyer, Benjamin and Sultan, Md Arafat and Lee, Jay Yoon and Iyyer, Mohit and McCallum, Andrew},
  booktitle = {ACL},
  year = {2024},
  arxiv = {https://aclanthology.org/2024.acl-long.766/}
}

Short Paper
PaRaDe: Passage Ranking using Demonstrations with LLMs

Andrew Drozdov, Honglei Zhuang, Zhuyun Dai, Zhen Qin, Razieh Rahimi, Xuanhui Wang, Dana Alon, Mohit Iyyer, Andrew McCallum, Donald Metzler, and Kai Hui

In EMNLP (Findings) 2023

arXiv Bib

@inproceedings{drozdov2023parade,
  bibtex_show = {true},
  selected = {true},
  abbr = {Short Paper},
  category = {main},
  title = {{PaRaDe}: Passage Ranking using Demonstrations with LLMs},
  author = {Drozdov, Andrew and Zhuang, Honglei and Dai, Zhuyun and Qin, Zhen and Rahimi, Razieh and Wang, Xuanhui and Alon, Dana and Iyyer, Mohit and McCallum, Andrew and Metzler, Donald and Hui, Kai},
  year = {2023},
  booktitle = {EMNLP (Findings)},
  arxiv = {https://arxiv.org/abs/2310.14408}
}

Long Paper
kNN-LM Does Not Improve Open-ended Text Generation

Shufan Wang, Yixiao Song, Andrew Drozdov, Aparna Garimella, Varun Manjunatha, and Mohit Iyyer

In EMNLP 2023

arXiv Bib

@inproceedings{wang2023knnlm,
  bibtex_show = {true},
  selected = {true},
  abbr = {Long Paper},
  category = {main},
  title = {{kNN-LM} Does Not Improve Open-ended Text Generation},
  author = {Wang, Shufan and Song, Yixiao and Drozdov, Andrew and Garimella, Aparna and Manjunatha, Varun and Iyyer, Mohit},
  year = {2023},
  booktitle = {EMNLP},
  arxiv = {https://arxiv.org/abs/2305.14625}
}

Long Paper Poster
Compositional Semantic Parsing with Large Language Models

Andrew Drozdov, Nathanael Schärli, Ekin Akyürek, Nathan Scales, Xinying Song, Xinyun Chen, Olivier Bousquet, and Denny Zhou

In ICLR 2023

arXiv Bib

@inproceedings{drozdov2022compositional,
  bibtex_show = {true},
  selected = {true},
  abbr = {Long Paper},
  category = {main},
  title = {Compositional Semantic Parsing with Large Language Models},
  author = {Drozdov, Andrew and Schärli, Nathanael and Akyürek, Ekin and Scales, Nathan and Song, Xinying and Chen, Xinyun and Bousquet, Olivier and Zhou, Denny},
  year = {2023},
  booktitle = {ICLR},
  award = {Poster},
  arxiv = {https://arxiv.org/abs/2209.15003}
}

Long Paper Findings
You can’t pick your neighbors, or can you? When and how to rely on retrieval in the kNN-LM

Andrew Drozdov, Shufan Wang, Razieh Rahimi, Andrew McCallum, Hamed Zamani, and Mohit Iyyer

In EMNLP (Findings) 2022

Bib PDF

@inproceedings{drozdov2022knnlm,
  bibtex_show = {true},
  selected = {true},
  abbr = {Long Paper},
  category = {main},
  title = {You can't pick your neighbors, or can you? {W}hen and how to rely on retrieval in the {kNN-LM}},
  author = {Drozdov, Andrew and Wang, Shufan and Rahimi, Razieh and McCallum, Andrew and Zamani, Hamed and Iyyer, Mohit},
  booktitle = {EMNLP (Findings)},
  award = {Findings},
  year = {2022},
  pdf = {https://mrdrozdov.github.io/knnlm_retrieval_quality.pdf}
}

Long Paper Poster
Inducing and Using Alignments for Transition-based AMR Parsing

Andrew Drozdov, Jiawei Zhou, Radu Florian, Andrew McCallum, Tahira Naseem, Yoon Kim, and Ramon Fernandez Astudillo

In NAACL 2022

arXiv Bib Code

@inproceedings{drozdov2022amralign,
  bibtex_show = {true},
  selected = {true},
  abbr = {Long Paper},
  award = {Poster},
  category = {main},
  title = {Inducing and Using Alignments for Transition-based AMR Parsing},
  author = {Drozdov, Andrew and Zhou, Jiawei and Florian, Radu and McCallum, Andrew and Naseem, Tahira and Kim, Yoon and Astudillo, Ramon Fernandez},
  booktitle = {NAACL},
  year = {2022},
  arxiv = {https://arxiv.org/abs/2205.01464},
  code = {https://github.com/IBM/transition-amr-parser}
}

Long Paper Poster
Improved Latent Tree Induction with Distant Supervision via Span Constraints

Zhiyang Xu, Andrew Drozdov, Jay Yoon Lee, Tim O’Gorman, Subendhu Rongali, Dylan Finkbeiner, Shilpa Suresh, Mohit Iyyer, and Andrew McCallum

In EMNLP 2021

arXiv Bib Code

@inproceedings{diora2021distantdiora,
  bibtex_show = {true},
  selected = {true},
  abbr = {Long Paper},
  award = {Poster},
  category = {main},
  title = {Improved Latent Tree Induction with Distant Supervision via Span Constraints},
  author = {Xu, Zhiyang and Drozdov, Andrew and Lee, Jay Yoon and O{'}Gorman, Tim and Rongali, Subendhu and Finkbeiner, Dylan and Suresh, Shilpa and Iyyer, Mohit and McCallum, Andrew},
  booktitle = {EMNLP},
  year = {2021},
  arxiv = {https://arxiv.org/abs/2109.05112},
  code = {https://github.com/iesl/distantly-supervised-diora}
}

Long Paper Poster
Unsupervised Parsing with S-DIORA: Single Tree Encoding for Deep Inside-Outside Recursive Autoencoders

Andrew Drozdov, Subendhu Rongali, Yi-Pei Chen, Tim O’Gorman, Mohit Iyyer, and Andrew McCallum

In EMNLP 2020

Bib PDF

@inproceedings{drozdov2020sdiora,
  bibtex_show = {true},
  selected = {true},
  abbr = {Long Paper},
  award = {Poster},
  category = {main},
  title = {Unsupervised Parsing with {S-DIORA}: Single Tree Encoding for Deep Inside-Outside Recursive Autoencoders},
  author = {Drozdov, Andrew and Rongali, Subendhu and Chen, Yi-Pei and O{'}Gorman, Tim and Iyyer, Mohit and McCallum, Andrew},
  booktitle = {EMNLP},
  pdf = {https://aclanthology.org/2020.emnlp-main.392/},
  year = {2020}
}

Long Paper Oral

The Impact of Preprint Servers in the Formation of Novel Ideas

Swarup Satish, Zonghai Yao, Andrew Drozdov, and Boris Veytsman

In EMNLP (Workshop on Scholarly Document Processing) 2020

arXiv Bib Code Workshop

@inproceedings{emnl2020biorxivimpact,
  bibtex_show = {true},
  selected = {true},
  category = {workshop},
  abbr = {Long Paper},
  award = {Oral},
  arxiv = {https://www.biorxiv.org/content/10.1101/2020.10.08.330696v1},
  code = {https://github.com/seasonyao/BiorXivImpact},
  title = {The Impact of Preprint Servers in the Formation of Novel Ideas},
  author = {Satish, Swarup and Yao, Zonghai and Drozdov, Andrew and Veytsman, Boris},
  booktitle = {EMNLP (Workshop on Scholarly Document Processing)},
  workshop = {https://ornlcda.github.io/SDProc/},
  year = {2020}
}

Short Paper Poster
Unsupervised Labeled Parsing with Deep Inside-Outside Recursive Auto-Encoders

Andrew Drozdov, Patrick Verga, Yi-Pei Chen, Mohit Iyyer, and Andrew McCallum

In EMNLP 2019

Bib PDF Code

@inproceedings{drozdov2019dioralabeled,
  bibtex_show = {true},
  selected = {true},
  abbr = {Short Paper},
  award = {Poster},
  category = {main},
  author = {Drozdov, Andrew and Verga, Patrick and Chen, Yi-Pei and Iyyer, Mohit and McCallum, Andrew},
  title = {Unsupervised Labeled Parsing with Deep Inside-Outside Recursive Auto-Encoders},
  booktitle = {EMNLP},
  pdf = {https://aclanthology.org/D19-1161/},
  code = {https://github.com/iesl/diora},
  year = {2019}
}

Long Paper Oral
Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders

Andrew Drozdov, Patrick Verga, Mohit Yadav, Mohit Iyyer, and Andrew McCallum

In NAACL 2019

arXiv Bib Code

@inproceedings{drozdov2019diora,
  bibtex_show = {true},
  selected = {true},
  abbr = {Long Paper},
  award = {Oral},
  category = {main},
  author = {Drozdov, Andrew and Verga, Patrick and Yadav, Mohit and Iyyer, Mohit and McCallum, Andrew},
  title = {Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders},
  booktitle = {NAACL},
  arxiv = {https://arxiv.org/abs/1904.02142},
  code = {https://github.com/iesl/diora},
  year = {2019}
}

Journal Oral

Do latent tree learning models identify meaningful structure in sentences?

Adina Williams, Andrew Drozdov, and Samuel R. Bowman

TACL 2018

arXiv Bib Code

@article{Williams2018DoLT,
  bibtex_show = {true},
  selected = {true},
  category = {main},
  abbr = {Journal},
  award = {Oral},
  title = {Do latent tree learning models identify meaningful structure in sentences?},
  author = {Williams, Adina and Drozdov, Andrew and Bowman, Samuel R.},
  journal = {TACL},
  year = {2018},
  volume = {6},
  code = {https://github.com/nyu-mll/spinn},
  arxiv = {https://arxiv.org/abs/1709.01121},
  pages = {253-267}
}

Long Paper Poster
Emergent Communication in a Multi-Modal, Multi-Step Referential Game

Katrina Evtimova, Andrew Drozdov, Douwe Kiela, and Kyunghyun Cho

In ICLR 2018

arXiv Bib Code

@inproceedings{iclr2018emergent,
  bibtex_show = {true},
  selected = {true},
  category = {main},
  abbr = {Long Paper},
  award = {Poster},
  arxiv = {https://arxiv.org/abs/1705.10369},
  title = {Emergent Communication in a Multi-Modal, Multi-Step Referential Game},
  author = {Evtimova, Katrina and Drozdov, Andrew and Kiela, Douwe and Cho, Kyunghyun},
  booktitle = {ICLR},
  year = {2018},
  code = {https://github.com/nyu-dl/MultimodalGame}
}

Ext. Abstract Poster

The Coadaptation Problem when Learning How and What to Compose

Andrew Drozdov and Samuel R. Bowman

In ACL (Workshop on Representation Learning for NLP) 2017

Bib PDF Workshop

@inproceedings{acl2017coadapt,
  bibtex_show = {true},
  selected = {true},
  category = {workshop},
  abbr = {Ext. Abstract},
  award = {Poster},
  pdf = {https://github.com/nyu-mll/spinn/blob/master/writing/acl17-latex/acl2017.pdf},
  title = {The Coadaptation Problem when Learning How and What to Compose},
  author = {Drozdov, Andrew and Bowman, Samuel R.},
  booktitle = {ACL (Workshop on Representation Learning for NLP)},
  workshop = {https://sites.google.com/site/repl4nlp2017/},
  year = {2017}
}