Logical Formalisms for Information Extraction
September 2, 2019
EDBT Summer School 2019, Lyon, Saint Germain au Mont D'Or, France
The abundance and availability of valuable textual resources position text analytics as a standard component in data-driven workflows. To facilitate the incorporation of such resources, a core operation is the extraction of structured data from text, a classic task known as Information Extraction (IE). The lecture will begin with a short overview of the algorithmic concepts and techniques used for performing IE tasks, including declarative frameworks that provide abstractions and infrastructures for programming IE. The lecture will then focus on the concept of a "document spanner" that models an IE program as a function that takes as input a text document and produces a relation of spans (intervals in the document) over a predefined schema. For example, a well-studied language for expressing spanners is that of the "regular" spanners: relational algebra over regular expressions with capture variables. The lecture will cover recent advances in the theory of document spanners, including their expressive power and computational complexity, aspects of incompleteness and inconsistency, integration with structured databases, and compilation into parallel executions over document fragments. Finally, the lecture will list relevant open problems and future directions, including aspects of uncertainty and explainability.  
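
To make the notion of a document spanner concrete, here is a minimal sketch in Python, assuming that named groups of the standard `re` module can stand in for capture variables. The names `Span` and `extract`, the sample document, and the pattern are hypothetical illustrations and are not taken from the lecture. The sketch also simplifies the semantics: `re.finditer` reports only non-overlapping leftmost matches, whereas a regular spanner, as studied in the literature, yields a tuple for every possible match, and spans here use Python's 0-based, end-exclusive offsets.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    """An interval of the document: start inclusive, end exclusive."""
    start: int
    end: int


def extract(pattern: str, document: str):
    """Map a document to a relation of spans over the pattern's variables.

    Simplification: only the non-overlapping matches found by re.finditer
    are returned, not every possible match.
    """
    regex = re.compile(pattern)
    # The schema is the list of capture variables, in order of appearance.
    schema = sorted(regex.groupindex, key=regex.groupindex.get)
    relation = set()
    for match in regex.finditer(document):
        relation.add(tuple(Span(*match.span(var)) for var in schema))
    return schema, relation


# Hypothetical input: names followed by e-mail addresses.
document = "Ann ann@x.org, Bob bob@y.org"
pattern = r"(?P<name>[A-Z][a-z]+) (?P<email>[a-z]+@[a-z]+\.[a-z]+)"

schema, relation = extract(pattern, document)
for row in sorted(relation):
    # Each row assigns one span to each variable of the schema.
    print({var: document[s.start:s.end] for var, s in zip(schema, row)})
```

Because the output is an ordinary relation over the schema `(name, email)`, relational operators such as projection, union, and join can then be applied to it, which is the idea behind relational algebra over regular expressions with capture variables.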
Some Clique Enumerations in Database Management
November 5, 2018
Pisa, Italy
Talk given at WEPA 2018, an international forum for researchers in the area of design, analysis, experimental evaluation and engineering of algorithms for enumeration problems.  
Database Uncertainty for Computational Social Choice
September 15, 2018
Bayreuth, Germany
Talk given at Colloquium Logicum 2018 at the University of Bayreuth.  
Probabilistic Database Repairing
February 14, 2018
Eilat, Israel
This talk was given at the MoDaS Workshop in Eilat, Israel, and discusses probabilistic notions of database repairs.