Code formatting is an opinionated beast. It always has been a matter of taste, and it always will be a matter of taste. This is the reason, why professional formatting tools, such as Eclipse JDT, offer a gazillion number of options. Which is still not sufficient enough. After all, you can override them inline with tag-comments to make the formatter shut up. Can't we do better than that? What if we could use machine learning techniques to detect the preferred code style that was use in a codebase so far? Turns out, we can.
The Antlr Codebuff project (https://github.com/antlr/codebuff) offers a generic formatter for pretty much any given language. As long as a grammar file exists, existing source can be analyzed to learn about the rules that have been applied while writing the code. Those can than be used to pretty print newly written code. No configuration required. And existing sources will stay as nicely formatted as they are. In the end, the primary purpose of code formatting is not to re-arrange all the keywords, but to make the source layout consistent.
In this talk, we will demonstrate the usage of the codebuff project and how it can be used to format the sources of your repo in a consistent way. We'll also show some other gems that have been revealed when toying around with the technology.