I used to post articles related to this topic here, but at some point I thought it better to dedicate this website exclusively to its original purpose, which is to publish my literary work. I'll make an exception for this little program I wrote in C, since it's kinda related.
If you're a *nix user, you're surely used to editing plain text in your preferred text editor. Popular full-featured editors like Vim or GNU Emacs come with built-in paragraph formatting functions, but with older, basic ones like BSD nvi you have to use external tools like fmt(1), which you can also call from the editor through key bindings. Anyway, as I explain in the top comment of the code itself, none of the existing tools for formatting plain text paragraphs fully satisfied me, which is why I wrote my own.
Those familiar with Unix-like systems know well what troff is: a very versatile, tag-like text formatting language which, among other things, has been used to typeset many well-known published books. To edit my novels I used groff, the GNU version. Well, the code I'm posting here, apart from improving some features present in other fmt versions, also brings an innovation that makes it easier (and safer) to work with troff files.
Download (fmtroff.c)
Tested on OpenBSD and Linux. I hope you'll find it useful.
Changelog
- Nov 7, 2023 (bug): Fixed an error introduced by the latest changes.
- Oct 8, 2023: I rewrote the code again, this time using libc and wide char functions.
- Oct 1, 2023: I replaced VLAs with pointers in most functions.
- Sep 8, 2023: Except for ‘new line’ and ‘tab’ (\n and \t), fmtroff strips all control characters. Today, reading an interesting thread on the groff mailing list, I learned that in roff you can use the ‘leader’ or ‘SOH’ character in tables and indices. So I modified the code so that, in troff mode, fmtroff lets that character through.
- Aug 24, 2023 (bug): Today I discovered that it's necessary to allocate memory before entering the loop that reads the file (or input string). Without this, calling fmtroff from a text editor (nvi or vim) on an empty file produces a segfault.
- Jun 28, 2023: I slightly improved the "recognize and skip some initialisms" conditional; it now ignores leading quotes and brackets and also takes non-ASCII uppercase letters into account.
- Apr 25, 2023 (bug): After years without using iso-latin, I didn't realize until yesterday, while using OpenBSD wscons, that iso-latin characters made fmtroff hang. It took a minimal change in a conditional to fix the bug, which means that the utf-8 limitation now applies only to multibyte characters.
- Apr 6, 2023 (bug): When in troff mode or when the ‘-n’ option is not used, treat lines starting with ‘'’ (a single quote) the same way as those starting with a period, since the former can also be troff macro lines.
- Apr 5, 2023: Don't add extra space after an ASCII ellipsis when it opens the line.
- Mar 28, 2023: New option ‘-l’ to allow sentences to begin with a lower case letter (useful with man pages, where beginning a sentence with a command name is common).
- Mar 25, 2023 (bug): Ignore a word when its number of characters is greater than the established column width.
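
To illustrate two of the entries above (control character stripping and the detection of troff macro lines), here is a minimal sketch in C using wide char functions. It is not taken from fmtroff.c: the names `is_request_line`, `keep_char` and `strip_controls` are hypothetical, and so is the `troff_mode` flag, which only stands in for whatever the real program uses to track its mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: true when a line starts with '.' or '\'',
 * the two characters that can open a troff request or macro line. */
static bool is_request_line(const wchar_t *line)
{
    return line[0] == L'.' || line[0] == L'\'';
}

/* Hypothetical helper: decide whether a character survives filtering.
 * Newline and tab are always kept; SOH (the roff leader character)
 * is kept only in troff mode; other control characters are dropped. */
static bool keep_char(wchar_t c, bool troff_mode)
{
    if (c == L'\n' || c == L'\t')
        return true;
    if (c == L'\001')
        return troff_mode;
    return !iswcntrl((wint_t)c);
}

/* Copy src into dst without the filtered characters and return the
 * new length. dst must have room for wcslen(src) + 1 wide chars. */
static size_t strip_controls(wchar_t *dst, const wchar_t *src, bool troff_mode)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);
    for (; *src != L'\0'; src++)
        if (keep_char(*src, troff_mode))
            dst[n++] = *src;
    dst[n] = L'\0';
    return n;
}
```

A caller would run `strip_controls` on each input line and check `is_request_line` before reflowing it, so that macro lines are passed through untouched. Again, this is only one way to do it under the assumptions stated above.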