Markup languages for writing

When I started writing for the web several years ago, content management systems didn’t commonly include WYSIWYG editors. Typically I wrote posts in standard HTML, and the editors at best offered automatic insertion of paragraph tags and JavaScript buttons to insert text styling cues such as bold and italics.

I was never a fan of mixing HTML tags with my text while writing. I felt like they were too verbose and mostly got in the way of the writing process. But they were a necessary evil, so I lived with them.

The first foray into structured markup

Eventually I learned about two markup languages, Textile and Markdown, designed to make writing for the web a more natural process. They had many similarities, and both were cleaner than writing plain HTML. I spent a while looking over the documentation for both, and eventually settled on using Textile for my web writing. It seemed straightforward and I think I preferred the visual unambiguity of the language.

I used Textile for a long time in my web writing, and got pretty used to it. Eventually, though, the editor in WordPress, my blogging platform at the time, became “good enough”, and I stopped writing in plain text and started trusting the CMS to handle things. For a long time after this I didn’t use any markup languages. Web writing was done in a WYSIWYG editor in WordPress, and notes were taken in unstructured plain text in Notational Velocity or in basic Microsoft Word documents.

Coming back to text

Over the last year or two, though, I’ve moved away from WordPress, reduced the amount of unstructured text I write, and become sick of MS Word. The proliferation of “distraction free” writing environments on Mac OS X and iOS caught my attention, and I started taking a lot of notes on my iPhone.

It started to seem obvious that plain text files were the way to go, at least for the writing process itself. It reminded me of my days using LaTeX. A plain text file could be edited anywhere by almost any program, and could act as a sort of “source code” for the written word. It also seemed obvious that some kind of organizational structure for the text documents was needed.

But unlike my first foray into this space, I no longer saw a choice between two viable candidates. I went straight to Markdown like it was the right answer all along.

Choosing a language the second time around

I’ve been thinking about why I chose Markdown over Textile, or reStructuredText or LaTeX for that matter. I think it came down to a few basics that added up to such a compelling choice that the decision seemed obvious.

  1. Simplicity of the syntax
  2. Readability of a formatted document
  3. Lack of a specific “intent” of the document’s final presentation format (HTML, PDF, etc.) baked into the syntax
  4. Wide support in text editing environments
  5. Wide support for conversion

LaTeX was my first real exposure to writing in structured plain text. It’s a good writing format that largely stays out of the way, but it has always felt like its native output intent was the printed page (usually via a PDF). It also feels intended for complex and heavily referenced academic documents.

Because of this, a typical LaTeX document has a large preamble filled with document setup boilerplate code. LaTeX is fantastic for large reports, but never felt like the right format for shorter, less formal documents. The right format for writing should have no required boilerplate to begin or end the document.

reStructuredText is closer to the mark, but I felt like it was still a little too formal for writing. It seemed like more of a documentation writing format than I was looking for, and the plain text source never felt easy enough to read to me.

Textile was the language I had used in the past, but in reviewing it now I realize that it was probably the wrong choice. While easy to read, it is very much a language intended for conversion to HTML. It uses HTML concepts for most of its tags, and mostly just simplifies the syntax to the point that there are no distracting HTML tags everywhere. Paragraphs are marked, where needed, by “p.” at the start of the block. Block quotes are marked by “bq.”, and headings by “h1.” through “h6.”. A lot of the syntax therefore takes itself directly from standard HTML terms, just in a prettier format.
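To make that one-to-one mapping concrete, here is a minimal Python sketch that handles only the three block markers mentioned above. The function name `textile_block_to_html` is my own invention for illustration, and a real Textile processor does far more (inline styles, links, lists, attributes); this is an assumption-laden toy, not how any actual converter is implemented.

```python
import re
from html import escape

# Hypothetical pattern for illustration: a block marker (p, bq, h1-h6),
# a period, whitespace, then the block's text.
BLOCK_SIGNATURE = re.compile(r"^(p|bq|h[1-6])\.\s+(.*)$", re.DOTALL)


def textile_block_to_html(block: str) -> str:
    """Convert one Textile-style block, supporting only p., bq., and h1.-h6."""
    match = BLOCK_SIGNATURE.match(block)
    if match is None:
        # Blocks without a marker are treated as ordinary paragraphs.
        return f"<p>{escape(block)}</p>"
    signature, text = match.groups()
    text = escape(text)
    if signature == "bq":
        return f"<blockquote><p>{text}</p></blockquote>"
    # For p and h1-h6 the marker is literally the HTML tag name.
    return f"<{signature}>{text}</{signature}>"


source = (
    "h1. Markup languages\n\n"
    "bq. Plain text is source code for prose.\n\n"
    "An unmarked paragraph."
)
# Blocks are separated by blank lines.
print("\n".join(textile_block_to_html(b) for b in source.split("\n\n")))
```

Notice that for paragraphs and headings the marker can be dropped straight into the angle brackets unchanged, which is the sense in which the syntax feels like HTML in a prettier format.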

I realize now, though, that I’m not “writing for the web”. The web is where these words are now, but the intent of the markup I use shouldn’t be “HTML”. It should be meaning-based, and read easily in the plain text it started in. Of all the markup languages I looked at, Markdown is the most like basic writing, with the least amount of output intent built into the syntax. It doesn’t feel like HTML made pretty.

What Markdown feels like to me is the type of writing I used on old-school mailing lists. On the mailing lists everyone was expected to write in plain text, and basic conventions developed for marking up text with this limitation in mind. The markup wasn’t intended to be converted but read as-is. Markdown feels like these conventions, expanded to allow for conversion to other presentation formats. It’s easy to write and easy to read.

In addition to the natural (by convention at least) feel of the language, Markdown has by far the widest support in text editors. Of course any text editor can work with the plain text, but basic syntax-aware display enhances the writing process without getting in the way: bold or centered headings; underlined, italicized, or bolded emphasis; lists that are indented like lists; and block quotes that look like block quotes. Call it what-you-see-is-what-you-mean, to steal a phrase from the developers of LyX.

I use it for writing, including this essay. I use it for formal note-taking, converting to HTML or other formats for display and sharing. I use it for quick, informal notes.

Combined with the SmartyPants language extension for smart quotes and automatic generation of en and em dashes and ellipsis marks (included with many Markdown-to-HTML converters), Markdown allows me to write simple, easy-to-read plain text documents that can be converted into well-formed HTML by an automatic processor, but can also be easily written, read, and edited with a text editor.
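For a rough idea of what that kind of typographic cleanup involves, here is a simplified Python sketch. It is not the actual SmartyPants algorithm, which handles many more edge cases; the function name `smarten` is hypothetical, and I am assuming the convention where two hyphens become an en dash and three become an em dash.

```python
import re


def smarten(text: str) -> str:
    """Apply a small, simplified subset of SmartyPants-style substitutions."""
    # Replace the longest dash run first so "---" is not read as "--" plus "-".
    text = text.replace("---", "\u2014")  # em dash (assumed convention)
    text = text.replace("--", "\u2013")   # en dash (assumed convention)
    text = text.replace("...", "\u2026")  # ellipsis

    # A double quote at the start of the text, or after whitespace or an
    # opening bracket, is treated as an opening quote; the rest are closing.
    text = re.sub(r'(^|[\s(\[{])"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")

    # Same rule for single quotes; leftovers cover closing quotes and
    # apostrophes inside words such as "it's".
    text = re.sub(r"(^|[\s(\[{])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")
    return text


print(smarten('"Plain text" -- it\'s easy to read...'))
```

The point is that the source stays plain ASCII that is comfortable to type and read, while the typographic niceties are left to the conversion step.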

— Steve

Posted on 27 July 2012