BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:Quality Data Acquisition for Machine Learning
DTSTART:20190827T110000
DTEND:20190827T130000
DTSTAMP:20260406T111856Z
UID:5924781b7e4cbf4a3769603dd12cbea342cceb96b69c493f0fb95cc8
CATEGORIES:Conferences - Seminars
DESCRIPTION:Adam Richardson\nEDIC candidacy exam\nExam president: Prof. Ma
 rtin Jaggi\nThesis advisor: Prof. Boi Faltings\nCo-examiner: Prof. Volkan 
 Cevher\n\nAbstract\nModern machine learning has seen tremendous growth in 
 recent years\, largely due to an abundance of data used to train complex l
 earning models. As these models become more integral to daily life\, we fi
 nd an increasing need for such data. However\, not much thought has been g
 iven to how to ensure that this data has the right statistical properties 
 to produce a high quality model. In particular\, we are concerned with how
  to incentivize self-interested agents to report quality data in a crowdsour
 cing context. We build on the idea of the Peer Prediction mechanism presen
 ted in [Peer Truth Serum: Incentives for Crowdsourcing Measurements and Op
 inions]\, which incentivizes truthful reporting of a distribution of obser
 vations under certain conditions. We observe that in the context of machin
 e learning this problem has additional structure. We are not simply concer
 ned with a distribution of observations\, rather\, we are concerned with t
 he ability to predict a mapping within that distribution of observations. 
 Yang et al. attempt to address this problem in [Optimum Statistical Estima
 tion with Strategic Data Sources] under some strong assumptions. We propos
 ed a mechanism for linear regression learning based on the notion of influ
 ence defined in [Understanding Black-box Predictions via Influence Functio
 ns].\n\nIn prior work\, we have shown that our influence mechanism induces
  truthful reporting under more relaxed assumptions than [Yang et al.]. H
 owever\, in order to strengthen our findings\, we wish to show that our me
 chanism can be generalized to non-linear models\, and we wish to strengthe
 n our game-theoretic guarantees. We also wish to apply our mechanism in th
 e context of federated learning. This would involve extending the mechanism
  so that it is privacy-preserving with respect to the data\, or re-examini
 ng the federated learning pipeline in order to construct a privacy-preserv
 ing mechanism.\n\nBackground papers\nPeer Truth Serum: Incentives for Crow
 dsourcing M
 easurements and Opinions\, by Faltings\, B.\, et al.\nOptimum Statistical
  Estimation with Strategic Data Sources\, by Cai\, Y.\, et al.\nUnderstand
 ing Black-box Predictions via Influence Functions\, by Koh\, P. W.\, and L
 iang\, P.
LOCATION:INR 212 https://plan.epfl.ch/?room=INR%20212
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
