Talks

Enumeration Theory through the Lens of Database Challenges

June 1st, 2026

PODS 2026, Bengaluru, India

Computational tasks that require producing many answers arise throughout data management systems: SQL engines return all qualifying tuples; search engines retrieve all relevant documents, albeit ranked; graph databases are tasked with producing all paths connecting two nodes; query optimizers routinely explore a vast space of candidate plans; and so on. Abstractly, these tasks are formalized as enumeration problems, solved by enumeration algorithms, and studied through the lens of enumeration complexity. This talk reflects on the evolution of enumeration in database research. It will begin with a brief survey of classic complexity-theoretic notions and algorithmic techniques that play a central role in enumeration across many domains. It will then review past results on enumeration problems arising in various areas of database systems. After that, it will delve deeper into the evolution of ideas and results on answer enumeration in relational queries. It will conclude with challenges for future directions.

presentation

Facets of Probabilistic Databases

March 30th, 2020

EDBT/ICDT 2020 Joint Conference

Probabilistic databases are commonly known in the form of the tuple-independent model, where the validity of every tuple is an independent random event. Conceptually, the notion is more general, as a probabilistic database refers to any probability distribution over ordinary databases. A central computational problem is that of marginal inference for database queries: what is the probability that a given tuple is a query answer? In this talk, I will discuss recent developments in several research directions that, collectively, position probabilistic databases as the common and natural foundation of various challenges at the core of data analytics. Examples include reasoning about uncertain preferences from conventional distributions such as the Mallows model, data cleaning and repairing in probabilistic paradigms such as the HoloClean system, and the explanation of query answers through concepts from cooperative game theory such as the Shapley value and the Banzhaf Power Index. While these challenges manifest different facets of probabilistic databases, I will show how they interrelate and, moreover, how they relate to the basic theory of inference over tuple-independent databases. (video)

presentation

Logical Formalisms for Information Extraction

September 2nd, 2019

EDBT Summer School 2019, Lyon, Saint Germain au Mont D'Or, France

The abundance and availability of valuable textual resources position text analytics as a standard component in data-driven workflows. To facilitate the incorporation of such resources, a core operation is the extraction of structured data from text, a classic task known as Information Extraction (IE). The lecture will begin with a short overview of the algorithmic concepts and techniques used for performing IE tasks, including declarative frameworks that provide abstractions and infrastructures for programming IE. The lecture will then focus on the concept of a "document spanner" that models an IE program as a function that takes as input a text document and produces a relation of spans (intervals in the document) over a predefined schema. For example, a well-studied language for expressing spanners is that of the "regular" spanners: relational algebra over regular expressions with capture variables. The lecture will cover recent advances in the theory of document spanners, including their expressive power and computational complexity, aspects of incompleteness and inconsistency, integration with structured databases, and compilation into parallel executions over document fragments. Finally, the lecture will list relevant open problems and future directions, including aspects of uncertainty and explainability. (video)

presentation

Some Clique Enumerations in Database Management

Nov 5, 2018

Pisa, Italy

Talk given at WEPA 2018, an international forum for researchers in the area of design, analysis, experimental evaluation and engineering of algorithms for enumeration problems.

presentation

Database Uncertainty for Computational Social Choice

Sep 15, 2018

Bayreuth, Germany

Talk given at Colloquium Logicum 2018 in Bayreuth University.

presentation

Probabilistic Database Repairing

February 14, 2018

Eilat, Israel

This talk has been given in the MoDaS Workshop in Eilat, Israel, and discusses probabilistic notions of database repairs.

presentation