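Papers on Code Recommendation
In IDE development, features such as code completion and code recommendation have long been highlights that users look forward to. The recent popularity of TabNine vividly reflects this demand.
The papers take a wide variety of approaches. Many borrow the latest techniques from natural language processing, while others use traditional machine learning. Some mine programming history, some mine discussions on websites, and some mine example code. Some focus on how to recommend APIs, and others target specific languages such as Python and JavaScript.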
Code Completion with Neural Attention and Pointer Networks, 2018
Jian Li
https://arxiv.org/pdf/1711.09573
This paper builds on the RNN + attention approach, adding attention over the AST tree structure and a pointer network to better capture locality.
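To make the pointer idea concrete, here is a minimal sketch of how a copy distribution over recent context tokens can be blended with a global vocabulary distribution. It is a simplification under stated assumptions, not the paper's implementation: the function name `pointer_mixture`, the fixed `gate` value, and the hand-written scores are all hypothetical, whereas in a trained model the scores and the gate would come from the network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_mixture(vocab, vocab_logits, context_tokens, attn_scores, gate):
    """Blend a vocabulary distribution with a copy distribution.

    gate (hypothetical, in [0, 1]) is the weight given to the global
    vocabulary; the remaining 1 - gate goes to copying from context.
    """
    vocab_probs = softmax(vocab_logits)
    copy_probs = softmax(attn_scores)

    # Start from the gated vocabulary distribution.
    mixed = {tok: gate * p for tok, p in zip(vocab, vocab_probs)}

    # Add copy mass; repeated context tokens accumulate probability.
    for tok, p in zip(context_tokens, copy_probs):
        mixed[tok] = mixed.get(tok, 0.0) + (1.0 - gate) * p
    return mixed


# Illustrative inputs only: "my_local_var" is out of vocabulary,
# but it appears twice in the local context window.
vocab = ["foo", "bar", "<unk>"]
context = ["my_local_var", "foo", "my_local_var"]
dist = pointer_mixture(vocab, [1.0, 0.5, 2.0], context, [2.0, 0.1, 2.0], gate=0.3)
best = max(dist, key=dist.get)
print(best)
```

With these made-up numbers the locally repeated identifier wins even though it is not in the vocabulary, which is the locality effect the pointer component is meant to provide.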
Intelligent Code Completion with Bayesian Networks, 2015
http://www.st.informatik.tu-darmstadt.de/artifacts/pbn/proksch-2015-Intelligent-Code-Completion-with-Bayesian-Networks.pdf
Compared with the first paper, this one leans more toward traditional methods.
Context-Sensitive Code Completion, 2018
Muhammad Asaduzzaman
This is the longest one, a PhD thesis.
Evaluating the Evaluations of Code Recommender Systems: A Reality Check, 2016
Sebastian Proksch
https://sarahnadi.org/resources/pubs/Proksch_ASE16.pdf
This one is a survey.
The next two papers mine the history of coding activity:
Enriching In-IDE Process Information with Fine-Grained Source Code History, 2016
Sebastian Proksch
http://www.st.informatik.tu-darmstadt.de/artifacts/caret/preprint.pdf
Predicting Source Code Changes by Mining Change History, 2004
Annie T.T. Ying
http://trese.cs.utwente.nl/publications/files/04692004-tse-mine-change.pdf
The next two papers focus on API recommendation:
On Evaluating Recommender Systems for API Usages, 2008
Marcel Bruch
https://sites.google.com/site/rsseresearch/rsse2008/p16-bruch.pdf
How Can I Use This Method?, 2015
Laura Moreno
http://www.cs.colostate.edu/~malref82/07194634.pdf
The next paper tries mining a website:
Mining StackOverflow to Turn the IDE into a Self-Confident Programming Prompter, 2014
Luca Ponzanelli
https://people.lu.usi.ch/bavotg/papers/msr2014_Prompter.pdf
The next paper mines examples:
Learning from Examples to Improve Code Completion Systems, 2009
Marcel Bruch
https://hal.archives-ouvertes.fr/hal-01575348/file/Learning-from-Examples-to-Improve-Code-Completion-Systems.pdf
Recommendation Systems for Software Engineering, 2010
Martin P. Robillard
https://www.cs.mcgill.ca/~martin/papers/rsse-c1.pdf
MyMediaLite: A Free Recommender System Library, 2011
Zeno Gantner
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.6242&rep=rep1&type=pdf
The next paper shares experience by sharing navigation data:
Easing Program Comprehension by Sharing Navigation Data, 2005
Robert DeLine
http://www.academia.edu/download/43905359/Easing_Program_Comprehension_by_Sharing_20160319-5093-e5fh11.pdf
The next paper focuses on Python:
Learning Python Code Suggestion with a Sparse Pointer Network, 2017
Avishkar Bhoopchand
https://arxiv.org/pdf/1611.08307
Learning Natural Coding Conventions, 2014
Miltiadis Allamanis
https://arxiv.org/pdf/1402.4182
Suggesting Accurate Method and Class Names, 2015
M. Allamanis
http://www.research.ed.ac.uk/portal/files/23088913/accurate_method_and_class.pdf
A Survey of Machine Learning for Big Code and Naturalness, 2018
Miltiadis Allamanis
https://arxiv.org/pdf/1709.06182
Program Analysis
Learning a Classifier for False Positive Error Reports Emitted by Static Code Analysis Tools, 2017
Ugur Koc
https://www.cs.tufts.edu/~jfoster/papers/mapl17.pdf
A Convolutional Attention Network for Extreme Summarization of Source Code, 2016
Miltiadis Allamanis
http://www.jmlr.org/proceedings/papers/v48/allamanis16.pdf
Software Defect Prediction via Convolutional Neural Network, 2017
Jian Li
https://www.cse.cuhk.edu.hk/lyu/_media/conference/jianli_qrs17.pdf
Search
Deep API Learning, 2017
Xiaodong Gu
https://arxiv.org/pdf/1605.08535
Deep Code Search, 2018
Xiaodong Gu
https://www.researchgate.net/profile/Hongyu_Zhang28/publication/325732005_Deep_code_search/links/5b29dcfb4585150c633faa57/Deep-code-search.pdf
Code Generation
A Grammar-Based Structural CNN Decoder for Code Generation, 2018
Zeyu Sun
https://m.aaai.org/ojs/index.php/AAAI/article/view/4686/4564
code2seq: Generating Sequences from Structured Representations of Code, 2019
Uri Alon
https://arxiv.org/pdf/1808.01400
Language Models
PHOG: Probabilistic Model for Code, 2016
Pavol Bielik
http://www.jmlr.org/proceedings/papers/v48/bielik16.pdf
Graph-based Statistical Language Model for Code, 2015
Anh Tuan Nguyen
http://home.engineering.iastate.edu/~anhnt/Research/Files/ICSE15_Gralan.pdf
Natural Language Models for Predicting Programming Comments, 2013
Dana Movshovitz-Attias
https://www.aclweb.org/anthology/P13-2007
A deep language model for software code, 2016
Hoa Khanh Dam
https://arxiv.org/pdf/1608.02715
Translation
Using Machine Translation for Converting Python 2 to Python 3 Code, 2015
Karan Aggarwal
https://peerj.com/preprints/1459.pdf
Code Repair
SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair, 2019
Zimin Chen
https://arxiv.org/pdf/1901.01808
DeepFix: Fixing Common C Language Errors by Deep Learning, 2017
Rahul Gupta
https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/download/14603/13921
Background
Information Needs in Collocated Software Development Teams, 2007
Andrew J. Ko
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/icse07_ko.pdf
Who Should Fix This Bug?, 2006
John Anvik
http://www.st.cs.uni-saarland.de/edu/softmine2007/Projects/p361-anvik.pdf
Maintaining Mental Models: A Study of Developer Work Habits, 2006
Thomas D. LaToza
http://plg.math.uwaterloo.ca/~migod/846/papers/icse06-venolia.pdf