Transformer-Assisted LLM Source Code Summarisation
Neural Source Code Summarisation (NSCS) aims to generate natural language summaries of source code to improve developer and maintainer understanding of code.
Many solutions to this problem use small transformer models, designed to be run locally on a workstation. Transformer-generated summaries often score well across many NLG metrics but fail to consistently produce clear and understandable natural language.
Conversely, the ability of Large Language Models (LLMs) to generate clear and understandable natural language presents an exciting solution to this problem. LLMs have become more widely available and workstation hardware more capable in recent years, so some LLMs can now be run on developers' workstations.
However, LLM summaries of code often differ greatly from developer-written summaries and frequently miss key words and phrases, resulting in low scores across NLG metrics.
We show how combining these two methods, by including transformer-generated summaries in LLM prompts, may enable LLMs to create better source code summaries.
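As an illustration only, the combination described above might be sketched as a prompt template that embeds a transformer-generated draft summary alongside the code. The function name `build_prompt`, the prompt wording, and the example inputs are all assumptions made for this sketch and are not taken from the authors' work; no specific transformer model or LLM is implied.

```python
def build_prompt(source_code: str, draft_summary: str) -> str:
    """Assemble an LLM prompt that embeds a transformer-generated draft.

    Hypothetical helper: the wording below is an assumed template,
    not the prompt used by the authors.
    """
    return (
        "You are summarising source code for developers.\n"
        "A smaller model produced this draft summary, which may contain "
        "useful key words but may read poorly:\n"
        f"Draft: {draft_summary.strip()}\n\n"
        "Source code:\n"
        f"{source_code.strip()}\n\n"
        "Write one clear sentence summarising the code, reusing key terms "
        "from the draft where they are accurate."
    )


if __name__ == "__main__":
    # Illustrative inputs; the draft stands in for a small transformer's output.
    code = "def add(a, b):\n    return a + b"
    draft = "adds two numbers"
    print(build_prompt(code, draft))
```

The resulting string would then be passed to whichever LLM is available locally. The intent of the template is that the draft supplies the key words valued by NLG metrics while the LLM supplies fluent phrasing.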
- Jesse Phillips
- Prof. Tracy Hall
- Dr. Mo El-Haj