NLP seminar: Retrieving Texts based on Abstract Descriptions


Event details

Date 21.11.2023
Hour 11:00-12:00
Speaker Shauli Ravfogel (Bar-Ilan University)  
Location Online
Category Conferences - Seminars
Event Language English
Shauli Ravfogel, from Bar-Ilan University, is presenting his most recent work on Retrieving Texts based on Abstract Descriptions. 
You can join in BC 04 or online.

This talk aims to connect two research areas: instruction models and retrieval-based models.

While instruction-tuned Large Language Models (LLMs) excel at extracting information from text, they are not suitable for semantic retrieval. Similarity search over embedding vectors allows indexing and querying large collections, but the similarity reflected in the embeddings is sub-optimal for many use cases. We identify the task of retrieving sentences based on abstract descriptions of their content.
We demonstrate the inadequacy of current text embeddings for this task and propose an alternative model that performs significantly better when used in standard nearest neighbor search. The model is trained on positive and negative pairs sourced by prompting a large language model (LLM). While it is easy to source the training material from an LLM, the retrieval task cannot be performed by the LLM directly. This demonstrates that data from LLMs can be used not only for distilling specialized models that are more efficient than the original LLM, but also for creating new capabilities not immediately possible with the original model.
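
To make the setup in the abstract concrete, the following is a minimal sketch, not the speaker's implementation. It shows nearest neighbor search over sentence embeddings with cosine similarity, plus one common way to learn from positive and negative pairs (a triplet-style hinge loss, which is an assumption here; the abstract does not name the training objective). The function names, the toy three-dimensional vectors, and the margin value are all hypothetical and stand in for the output of a real embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(query_vec, index, k=1):
    """Return the k sentences whose embeddings are most similar to the query.

    `index` is a list of (sentence, embedding) pairs. This is a brute-force
    scan; a real system would use an approximate nearest neighbor index.
    """
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [sentence for sentence, _ in ranked[:k]]


def triplet_loss(description, positive, negative, margin=0.5):
    """Hinge loss over one (description, positive, negative) triple.

    Assumed objective for illustration: it is zero once the description is
    closer to the matching sentence than to the non-matching one by `margin`.
    """
    return max(0.0, margin - cosine(description, positive) + cosine(description, negative))


# Toy embeddings (hypothetical values, not produced by any real model).
index = [
    ("The company reported a sharp drop in quarterly revenue.", [0.9, 0.1, 0.0]),
    ("The river flooded several villages overnight.", [0.0, 0.2, 0.9]),
]

# Stands in for the embedding of an abstract description such as
# "a business performing poorly".
description_vec = [0.8, 0.3, 0.1]

top = nearest_neighbors(description_vec, index, k=1)
loss = triplet_loss(description_vec, index[0][1], index[1][1])
```

In this sketch the description embedding already sits closer to the matching sentence, so the loss for that triple is zero; training would only adjust the encoder on triples where the margin is violated.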

Practical information

  • Informed public
  • Free




Natural Language Processing, Large language models, Instruction-tuning, Retrieval
