The state of LLMSE

Project/Code Generation

Terminal Tools

sgpt

Documentation generation

reading existing documentation, including inline documentation
finding outdated documentation (the comment says x, the function does x, y and z)
using a ubiquitous language for comments (one LLM run says this is a dependency, another says it is optional)
usable across multiple repositories, at project level.
must have a chat interface (even better if integrated with Slack or Google Meet)
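To make the "outdated documentation" point concrete, here is a minimal sketch of a purely static check, with no LLM involved. It flags Python functions whose docstring never mentions one of their parameters. The function name `undocumented_params` and the sample code are hypothetical, and this heuristic is only a cheap first filter, not what any of the tools below actually do.

```python
import ast
import re
from typing import Dict, List


def undocumented_params(source: str) -> Dict[str, List[str]]:
    """Map each function name to the parameters its docstring never mentions."""
    report: Dict[str, List[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        doc = ast.get_docstring(node)
        if doc is None:
            continue  # no docstring, so nothing can be stale
        args = node.args
        names = [a.arg for a in args.posonlyargs + args.args + args.kwonlyargs]
        missing = [
            name
            for name in names
            if name not in ("self", "cls")
            # whole-word match, so "id" does not match inside "width"
            and not re.search(rf"\b{re.escape(name)}\b", doc)
        ]
        if missing:
            report[node.name] = missing
    return report


SAMPLE = '''
def resize(image, width, height, keep_ratio=True):
    """Resize image to the given width."""
    return image
'''

print(undocumented_params(SAMPLE))  # {'resize': ['height', 'keep_ratio']}
```

A check like this only catches the signature drifting away from the comment; the "does x, y and z" case still needs something that reads the function body.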

Autodoc

I liked starting with this one because I believe this is what people would have built in the first phase. It is a simple implementation of a documentation generator: each file is passed to an LLM API provider, and the response is parsed to generate the documentation, along with a thoughtful comment.
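The file-by-file approach can be sketched roughly as follows. This is not Autodoc's actual code: `document_repo`, the `.docs` output folder, and `fake_describe` (a stand-in for the real LLM call) are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def document_repo(
    root: Path, out_dir: Path, describe: Callable[[str, str], str]
) -> List[Path]:
    """Write one Markdown file per source file, mirroring the source tree."""
    written = []
    for src in sorted(root.rglob("*.py")):
        rel = src.relative_to(root)
        target = out_dir / rel.with_suffix(".md")
        target.parent.mkdir(parents=True, exist_ok=True)
        # Each file is described in isolation: no knowledge of other files.
        target.write_text(describe(rel.as_posix(), src.read_text()))
        written.append(target)
    return written


def fake_describe(path: str, code: str) -> str:
    """Hypothetical stand-in for the LLM API call."""
    return f"# {path}\n\n{len(code.splitlines())} lines of code.\n"


def demo() -> List[str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "repo"
        (root / "pkg").mkdir(parents=True)
        (root / "a.py").write_text("x = 1\n")
        (root / "pkg" / "b.py").write_text("def f():\n    return 2\n")
        out = Path(tmp) / ".docs"
        docs = document_repo(root, out, fake_describe)
        return [d.relative_to(out).as_posix() for d in docs]


print(demo())  # ['a.md', 'pkg/b.md']
```

The sketch makes the limitation visible: `describe` only ever sees one file at a time.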

Nice Parts:

cost estimation: however inaccurate it is, it is a good starting point.
follows the Linux dotfile convention to store the generated Markdown files.
generated documents contain a meaningful Q&A section, mostly for beginners.
also generates some usage examples, which are a good starting point for the user.
generates a summary file for each package.
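A rough cost estimate of this kind can be sketched in a few lines. Everything numeric here is an assumption: the "about four characters per token" rule is only a common rule of thumb for English text, and the price is a made-up placeholder, not any provider's real rate.

```python
import math
from typing import Iterable, Tuple


def estimate_cost(
    texts: Iterable[str],
    usd_per_1k_tokens: float,
    chars_per_token: float = 4.0,  # rough rule of thumb, not a real tokenizer
) -> Tuple[int, float]:
    """Return (estimated input tokens, estimated cost in USD)."""
    tokens = sum(math.ceil(len(text) / chars_per_token) for text in texts)
    return tokens, tokens / 1000 * usd_per_1k_tokens


files = ["a" * 4000, "b" * 2000]  # stand-ins for file contents
tokens, cost = estimate_cost(files, usd_per_1k_tokens=0.01)  # hypothetical price
print(f"{tokens} tokens, about ${cost:.4f}")
```

This covers input tokens only; the generated documentation is billed too, which is one reason such estimates come out inaccurate.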

Could be better:

The overall documentation structure could be better; it simply follows a one-to-one file <> markdown association.
Navigation is missing; it is hard to move between files.

The biggest problem with this approach is that it does not scale: sending every file to the API provider is not a good idea. Also, if the LLM cannot link knowledge across files, packages, and the libraries used, what is the actual improvement here?

Doc-Comments-AI

This nice Python application targets in-file documentation, and it is a good start for lazy programmers. It uses the LLM API to generate comments for the functions, classes, and variables in the code.

Nice Parts:

easy to use: just run the script and it will generate comments for you.
has a guided mode to go through functions one by one.
checks the git status of the files.
skips functions that are already commented.
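Skipping already-commented functions is easy to picture for Python sources. The sketch below uses the standard `ast` module to list only the functions that have no docstring; it is an illustration under that assumption, not the tool's own implementation, and `functions_needing_comments` is a hypothetical name.

```python
import ast
from typing import List


def functions_needing_comments(source: str) -> List[str]:
    """Return the names of functions that have no docstring yet."""
    tree = ast.parse(source)
    return sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and ast.get_docstring(node) is None
    )


SAMPLE = '''
def documented(x):
    """Double x."""
    return x * 2

def bare(x):
    return x + 1
'''

print(functions_needing_comments(SAMPLE))  # ['bare']
```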

Could be better:

Skips functions that are already commented, so it does not integrate with the existing comments.
Does not generate comments that are linked, so the context is lost. It would be interesting to find out which are the most important methods a given method calls or is called by, and why.
Does not generate comments for the classes, only for the functions.
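The "calls / is called by" information mentioned above can be approximated statically before any LLM is involved. The following is a minimal sketch for a single Python module, assuming plain `name(...)` calls only; method calls, imports, and dynamic dispatch are ignored, and `call_graph` is a hypothetical helper.

```python
import ast
from typing import Dict, Set, Tuple


def call_graph(source: str) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Return (calls, called_by) for functions defined in one module."""
    tree = ast.parse(source)
    funcs = {
        node.name: node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    calls: Dict[str, Set[str]] = {name: set() for name in funcs}
    called_by: Dict[str, Set[str]] = {name: set() for name in funcs}
    for name, node in funcs.items():
        for sub in ast.walk(node):
            # Only direct calls to functions defined in this module count.
            if (
                isinstance(sub, ast.Call)
                and isinstance(sub.func, ast.Name)
                and sub.func.id in funcs
            ):
                calls[name].add(sub.func.id)
                called_by[sub.func.id].add(name)
    return calls, called_by


SAMPLE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''

calls, called_by = call_graph(SAMPLE)
print(sorted(calls["main"]))      # ['load', 'parse']
print(sorted(called_by["load"]))  # ['main']
```

Feeding these two maps into the prompt would at least let a generated comment say where a function sits in the module; the "why" still has to come from somewhere else.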

This tool could be useful when you have a large codebase and a checkstyle rule that enforces comments, but you are too lazy to write them. However, I believe this problem (generating basic comments for functions) is already solved by IDEs, and the comments IDEs generate are more meaningful than the ones generated by the LLM. Plus, if the comments are meant to be read by actual senior engineers, they are not going to be happy with LLM-generated comments: these explain what the code does, which is not hard to understand from the code itself, but not why the code is written that way, which is the reason comments exist in the first place.

RepoAgent

This is the best application I have seen so far; it integrates a couple of nice ideas. The emphasis on pre-commit is a bit too much in my opinion, but it is a good start on how git should be integrated into any documentation tool.
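As a sketch of what git integration could mean in practice, the function below maps a list of changed files to the Markdown documents that would need regenerating. In a real pre-commit hook the list would come from something like `git diff --cached --name-only`; here it is hard-coded. The `docs` folder layout and the name `docs_to_refresh` are assumptions for illustration, not RepoAgent's actual mechanism.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple


def docs_to_refresh(
    changed_files: Iterable[str],
    doc_root: str = "docs",  # hypothetical output folder
    suffixes: Tuple[str, ...] = (".py",),
) -> List[str]:
    """Map changed source files to the doc files that mirror them."""
    targets = []
    for name in changed_files:
        path = PurePosixPath(name)
        if path.suffix in suffixes:
            targets.append(str(PurePosixPath(doc_root) / path.with_suffix(".md")))
    return targets


staged = ["src/app.py", "README.md", "src/util/io.py"]
print(docs_to_refresh(staged))  # ['docs/src/app.md', 'docs/src/util/io.md']
```

Regenerating only what changed also keeps the API bill proportional to the size of the commit rather than the size of the repository.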

Still, the generated documents lack integration. LLMs have too short a memory to retain the context of the whole project.

Test Generation

Further Reading

Who will eat the software world

Ok Advice for Good People Leading Average Teams

Vanity reading list 2024