A new version of fmt(1)

As the title says, fmtroff is another version of the old fmt(1) found on most Unix-like systems, a utility that formats paragraphs of plain text.

The name I chose may be confusing, so the first thing to clarify is that fmtroff is NOT a roff parser. I chose a different name to avoid suggesting that it replaces the other versions. The suffix ‘roff’ came about because my version, apart from improving some features present in other versions, also brings some innovations that make working with roff files easier and more reliable (I used groff, the GNU version, to edit my novels).

Download source file (fmtroff.c)

Tested on OpenBSD and Linux. I hope you'll find it useful.

Documentation

To compile it, just run:

$ cc fmtroff.c -o fmtroff

Maybe I'll write a man page some day. For now, for those already familiar with fmt, the help that fmtroff prints (option ‘-h’) is enough to catch the subtle differences:

$ fmtroff -h
Usage: fmtroff [-bhlnp] [-w width] [file ...]
  -b   break sentences with a new line
  -h   print this help
  -m   try to skip mail headers and quoted text
  -n   format also lines beginning with a dot character
  -o   lowercase letters can begin a sentence (you may need this with
         man pages)
  -p   indent the whole paragraph copying the first line indentation
  -w   set maximum line width (default 72 columns)

Important differences that you can't deduce from the help above:

Except for indentation (of the first line only, or of the entire paragraph when using the ‘-p’ option; see the second item), fmtroff always collapses spaces, leaving one between words and two between sentences. In other versions, this behavior is an option and is not enabled by default.
With the ‘-p’ option, the whole paragraph is indented with an exact copy of the first line's indentation string, respecting spaces, tabs and their order.
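
The collapsing rule can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the code in fmtroff.c: collapse_line() is a hypothetical name, only ‘.’, ‘!’ and ‘?’ are treated as sentence ends, and the caller must supply an output buffer of at least 2 * wcslen(in) + 1 wide characters.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not taken from fmtroff.c): copy `in` to `out`,
 * keeping the leading indentation untouched and collapsing every later
 * run of blanks to one space, or to two after '.', '!' or '?'.
 * Trailing blanks are dropped.
 */
static void collapse_line(const wchar_t *in, wchar_t *out)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);

    /* Copy the indentation exactly: spaces, tabs and their order. */
    while (iswblank((wint_t)in[i]))
        out[o++] = in[i++];

    while (in[i] != L'\0') {
        if (iswblank((wint_t)in[i])) {
            /* Skip the whole run of blanks. */
            while (iswblank((wint_t)in[i]))
                i++;
            if (in[i] == L'\0')
                break;
            /* Two spaces after a sentence end, one otherwise. */
            if (o > 0 && wcschr(L".!?", out[o - 1]) != NULL)
                out[o++] = L' ';
            out[o++] = L' ';
        } else {
            out[o++] = in[i++];
        }
    }
    out[o] = L'\0';
}
```

Note the wide-character test iswblank() rather than isblank(): the input is handled as wchar_t, so the byte-oriented function is not appropriate here. A real formatter would also have to recognize abbreviations and closing quotes before deciding that a period ends a sentence, which this sketch deliberately leaves out.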

To call it from Vi or Vim, add a line like the following to your ~/.nexrc or ~/.vimrc:

map v !}fmtroff -np^M

Changelog

Mar 3, 2026. Fixed the column count in wrap(): it started from zero instead of one.
Sep 1, 2025. I had missed detecting the preceding space in the third abbreviation detection case (is_initial()).
Aug 20, 2025. Ingo Schwarze, a member of the OpenBSD team, pointed out to me that a constant defined as 0x2e has type int (see C11 6.4.4.1, Integer constants). I was casting int to wchar_t, which didn't make sense. The proper notation for defining such data as character constants is L'.' (see C11 6.4.4.4, Character constants).
Aug 19, 2025. I restored the change of Aug 17. The problem was not with the compilers but a mistake on my part: I forgot to null-terminate the constant arrays. By definition, an array that doesn't end in a null character is not a string! This fed garbage to functions such as wcsspn(), which returned wrong values. Thanks to Otto Moerbeek from the OpenBSD team for pointing this out to me.
Aug 18, 2025. Unfortunately, I had to revert the previous change. The change introduced a bug under OpenBSD that is not reproducible under Linux. In the entry from June 5, I mentioned that compiling on Linux resulted in different behavior for the executable compared to doing so on OpenBSD. Today, I installed GCC (GNU cc) on OpenBSD and confirmed that GCC interprets the code differently than Clang does.
Aug 17, 2025. I moved all the abbreviation detection stuff to a separate function to tidy up the code a bit.
Aug 10, 2025. Reverted change that generated a memory error under Linux.
Aug 8, 2025. More improvements in abbreviations recognition.
Aug 2, 2025. Recognize abbreviations even if they are enclosed in quotation marks or parentheses.
Aug 2, 2025. Minor change in abbreviation recognition. Not only a capital letter, but a lowercase letter preceded by a space (or a quote character) and followed by a period is also likely an abbreviation.
June 5, 2025. Lately, using Linux, I found two bugs not reproducible under OpenBSD:

In the cquote_count() function (formerly left_quote()), not limiting the countdown loop caused a segmentation fault when a quotation character followed by spaces was encountered at the beginning of the line.
I noticed that on Linux, for the wcsspn() function to return the expected values, the arrays used as arguments have to be declared as constants (in this case end_of_sentence[], oquote[] and cquote[].) To achieve the same effect on OpenBSD, they have to be declared as static (as they were.) I guess this is due to differences between how the compilers gcc and clang interpret the code.

June 5, 2025. I renamed the left_quote() function to something more appropriate: cquote_count(). I also simplified the names of the cl_quote[] and op_quote[] arrays, now cquote[] and oquote[] respectively.
Jan 28, 2025. Removed two abbreviations I'd added by mistake.
Jan 6, 2025. In "troff" mode, in addition to lines starting with a period or apostrophe, also ignore those starting with a backslash.
Oct 26, 2024. Recognize the backslash '\' as a begin-of-sentence character. This allows the use of roff and LaTeX tags at the beginning of a sentence (e.g. \fI or \emph{).
Oct 1, 2024. I decided to revert the changes made on August 10th and 22nd since they complicated the code just to cover isolated cases.
Sep 26, 2024. Yesterday I noticed that when running fmtroff on Chinese text many characters were deleted, shortening the string. This was happening on Linux; as I normally use OpenBSD (and I don't usually write in Chinese either ;-)), I hadn't noticed it. After some research, I realized that the cause was in collapse_whitespace(), where I was using isblank() instead of iswblank(). I thought that in this case it was not necessary to use the wide-character version of the function, but I was wrong.
Aug 22, 2024. Before saving the variable described in the Aug 10 entry, check that the word contains only letters, to make sure that it is a person's name.
Aug 10, 2024. Save in a variable whether the current word is capitalized, to recognize whether the next one is an initialism.
May 9, 2024. I simplified the code a bit.
May 7, 2024. Added the ‘-m’ option: when used, fmtroff skips mail headers and quoted text (I added this option just to be consistent with the original use of fmt(1)).
May 7, 2024. Changed the option ‘-t’ to ‘-b’ and the option ‘-l’ to ‘-o’ for compatibility with OpenBSD fmt(1).
May 6, 2024. Now fmtroff skips nested code (tbl(1), pic(1), eqn(1).) Thanks to Victor <vico at tuta dot io> for pointing it out to me.
Nov 7, 2023 (bug). Fixed an error created with the last changes.
Oct 8, 2023. I rewrote the code again, this time using libc and wide-character functions.
Oct 1, 2023. I replaced VLAs with pointers in most functions.
Sep 8, 2023. Except for ‘new line’ and ‘tab’ (\n and \t), fmtroff strips all control characters. Today, reading an interesting thread on the groff mailing list, I learned that in roff you can use the ‘leader’ or ‘SOH’ character in tables and indices. So I modified the code so that, in troff mode, fmtroff lets that character through.
Aug 24, 2023 (bug). Today I discovered that it's necessary to allocate memory before entering the loop that reads the file (or input string). Without this, calling fmtroff from a text editor (nvi or vim) on an empty file produces a segfault.
Jun 28, 2023. I improved the "recognize and skip some initialisms" conditional a bit: it now ignores leading quotes and brackets and also takes non-ASCII uppercase letters into account.
Apr 25, 2023 (bug). After years without using iso-latin, I didn't realize until yesterday, while using OpenBSD wscons, that iso-latin characters made fmtroff hang. A minimal change in a conditional fixed the bug; this means that the utf-8 limitation now applies only to multibyte characters.
Apr 6, 2023 (bug). When in troff mode or when the ‘-n’ option is not used, treat lines starting with ‘'’ (single quotes) in the same way as those starting with a period, as the former can also be troff macro lines.
Apr 5, 2023. Don't add extra space after an ASCII ellipsis when it opens the line.
Mar 28, 2023. New option ‘-l’ to allow sentences to begin with a lowercase letter (useful with man pages, where beginning a sentence with a command name is common).
Mar 25, 2023 (bug). Ignore a word when its number of characters is greater than the established column width.
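
The lesson from the Aug 19 and Aug 20, 2025 entries can be illustrated with a small C sketch. It is not an excerpt from fmtroff.c: the array names echo the changelog, but their contents and the ends_sentence() helper are assumptions made up for the example. The point is that the sets passed to wcsspn() and wcscspn() are written with wide character constants (L'.') and end with L'\0', so they are real wide strings.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Illustrative sets (contents are assumptions).  Each element is a wide
 * character constant, and each array ends with L'\0'; without that
 * terminator the array is not a string and wcsspn() would read past it.
 */
static const wchar_t end_of_sentence[] = { L'.', L'!', L'?', L'\0' };
static const wchar_t cquote[] = { L'"', L'\'', L')', L']', L'\0' };

/*
 * Hypothetical helper: return 1 if `word` ends with one or more
 * end-of-sentence marks, optionally followed by closing quotes or
 * brackets; return 0 otherwise.  Abbreviations are not considered.
 */
static int ends_sentence(const wchar_t *word)
{
    const wchar_t *p = word;

    assert(word != NULL);

    for (;;) {
        /* Jump to the next end-of-sentence mark, if any. */
        p += wcscspn(p, end_of_sentence);
        if (*p == L'\0')
            return 0;
        /* Skip the run of marks (at least one, e.g. "?!"). */
        p += wcsspn(p, end_of_sentence);
        /* Only closing quotes may remain up to the end of the word. */
        if (p[wcsspn(p, cquote)] == L'\0')
            return 1;
    }
}
```

With these definitions, ends_sentence(L"done.\")") and ends_sentence(L"what?!") return 1, while ends_sentence(L"e.g") returns 0. Writing the sets as string literals, for example L".!?", would add the terminator automatically and is probably the safer habit.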
