On the Impact of Refactoring Operations on Code Naturalness

Bin Lin, Csaba Nagy, Gabriele Bavota, Michele Lanza. SANER 2019

language model, refactoring

Recent studies have demonstrated that software is natural, that is, its source code is highly repetitive and predictable like human languages. Also, previous studies suggested the existence of a relationship between code quality and its naturalness, presenting empirical evidence showing that buggy code is “less natural” than non-buggy code. We conjecture that this quality-naturalness relationship could be exploited to support refactoring activities (e.g., to locate source code areas in need of refactoring). We perform a first step in this direction by analyzing whether refactoring can improve the naturalness of code. We use state-of-the-art tools to mine a large dataset of refactoring operations performed in open source systems. Then, we investigate the impact of different types of refactoring operations on the naturalness of the impacted code. We found that (i) code refactoring does not necessarily increase the naturalness of the refactored code; and (ii) the impact on the code naturalness strongly depends on the type of refactoring operations.
