Efficient reasoning in PAC semantics
Machine learning is often employed as one step in a larger application, for example to perform information extraction or data mining. The rules obtained by such learning are then used as inputs to a further analysis. Because those inputs were learned, however, the conclusions of this analysis can only be guaranteed, in theory, to satisfy the same standard as the inputs: that of "PAC semantics" (Valiant, 2000). In this talk, we explore the benefits of treating the problems of learning and logical reasoning together, capturing a relatively general family of such applications. Briefly, the benefit is that we can simultaneously (1) handle incomplete information, (2) use rules that are a reasonable but imperfect fit for the data, and (3) reach conclusions more efficiently than known algorithms for reasoning from the rules alone. More precisely, we consider the problem of testing the validity of a "query" formula (hypothesis) against incomplete data. We present algorithms for soundly verifying such queries that succeed whenever there exist learnable rules sufficient to complete a simple proof of the query, thereby matching what can be achieved by such a two-stage process of learning followed by reasoning.
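To make the setting concrete, here is a minimal Python sketch, not the algorithm presented in the talk, of the simplest sound test one can run in this model: given masked, partially observed examples, accept a CNF query only if it is observed ("witnessed") to be true on a sufficiently large fraction of them. The CNF encoding, the dictionary convention for partial examples, and the acceptance threshold are illustrative assumptions; the talk's algorithms additionally search for learnable rules that complete a proof of the query.

```python
# Illustrative sketch only: a sound validity test for a CNF "query" against
# masked (partially observed) examples, in the spirit of PAC semantics.
# The representations and threshold rule here are assumptions for illustration.

from typing import Dict, List

Clause = List[int]                # a clause is a list of signed variable indices, e.g. [1, -3]
CNF = List[Clause]                # a CNF query is a list of clauses
PartialExample = Dict[int, bool]  # variable index -> observed value (masked variables absent)


def witnessed_true(query: CNF, example: PartialExample) -> bool:
    """Return True iff every clause contains a literal that is *observed* to be true.

    This check is conservative: a clause may hold in the underlying complete
    assignment without being witnessed, so the test is sound but not complete.
    """
    for clause in query:
        if not any(
            (lit > 0 and example.get(abs(lit)) is True)
            or (lit < 0 and example.get(abs(lit)) is False)
            for lit in clause
        ):
            return False
    return True


def accept_query(query: CNF, examples: List[PartialExample],
                 epsilon: float, gamma: float) -> bool:
    """Accept the query if it is witnessed true on at least a 1 - epsilon + gamma
    fraction of the examples.  With enough examples, acceptance implies the query
    is (1 - epsilon)-valid with high probability; queries whose truth is never
    directly witnessed may be rejected.
    """
    hits = sum(witnessed_true(query, ex) for ex in examples)
    return hits / len(examples) >= 1.0 - epsilon + gamma


if __name__ == "__main__":
    # Toy demo: query (x1 OR x2) AND (NOT x3); some variables are masked.
    query = [[1, 2], [-3]]
    examples = [
        {1: True, 3: False},            # witnessed true
        {2: True, 3: False, 1: False},  # witnessed true
        {1: True},                      # x3 unobserved -> not witnessed
    ]
    print(accept_query(query, examples, epsilon=0.4, gamma=0.05))
```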