Learning Program Properties From Big Code For Software Fault Localization

Event details
Date | 18.07.2016 |
Hour | 13:00 - 15:00 |
Speaker | David Aksun |
Location | |
Category | Conferences - Seminars |
EDIC Candidacy Exam
Exam President: Prof. Viktor Kuncak
Thesis Director: Prof. James Larus
Co-examiner: Prof. Katerina Argyraki
Background papers:
Finding latent code errors via machine learning over program executions, by Y. Brun, M. D. Ernst.
Predicting Program Properties from "Big Code", by V. Raychev et al.
Automatic Patch Generation by Learning Correct Code, by F. Long, M. Rinard.
Abstract
Automated fault localization systems are essential for identifying the locations of software faults. Most automated tools require explicit program specifications, such as test suites or formal specifications. Instead, we can infer specifications for fault localization from bug fixes, which can be collected from large open-source repositories (big code).
Recently, tools based on probabilistic models of code trained on big code have proven scalable and effective in areas such as code completion, fault finding, code beautification, translation between programming languages, and mining code patterns. In our work, we use these probabilistic models to infer specifications from bug fixes and to locate program statements that are likely to need repair.
In this research proposal, we investigate a machine learning formulation of software fault localization based on bug fixes. We examine an expressive, scalable probabilistic model of code that can learn program properties from complex dependencies. Then, we present a program repair system that ranks candidate patches based on a model of correct code. Finally, we propose building software fault localization tools on probabilistic models of code learned from bug fixes.
Practical information
- General public
- Free
Contact
- Cecilia Chapuis EDIC