How To Have Comments In Hugo Without An External Service

This blog is powered, or generated, by Hugo. Like all static site generators, Hugo doesn’t support commenting out of the box but relies on external services, like Disqus. Although JavaScript is pretty much ubiquitous these days, I personally always browse the web with JavaScript disabled and therefore prefer my own sites to work without it. (The comment form on this site is intended to be used with JavaScript enabled, but it works even if JavaScript is disabled, albeit not as gracefully.)

I have solved this problem by writing a Go script that processes POSTed comments and turns each comment into a JSON file that is written to disk in a specific location. Those files are then read using Hugo’s readDir function and looped through. In the loop each of them is processed by the getJSON function. The decoded comment object is then used to display the comment.

The nice thing about this setup is that if the sources folder is being watched, adding a comment triggers a site rebuild, so comments appear in real time. Another nice thing is that since each comment is an individual file, moderating them is relatively easy. It would be trivial to add a feature that prefixes each new comment file with a dot and then displays only those comments that don’t have that dot in their filename, creating a rudimentary approval system.

But let’s get down to details.

What You Need

There are three parts to this system:

A form that is used to post the comments
A script that processes each comment as it is posted
A template that displays saved comments

You also need to host your Hugo site somewhere where you can run Go or some other server-side language. If you are limited to strictly static files, this is not going to work.

We’ll take a look at the template and the form first.

The Template

This is the structure of my Hugo source folder:

content/
  |--post/
  |    |--my-first-post.md
  |    |--my-second-post.md
  |    |--my-third-post.md
comments/
  |--post/
       |--my-first-post/
       |    |--comment-1.json
       |    |--comment-2.json
       |    |--comment-3.json
       |    |--comment-4.json
       |--my-second-post/
            |--comment-1.json
            |--comment-2.json
            |--comment-3.json
            |--comment-4.json

The script that processes comments saves each one into the comments/ folder, creating a subfolder for each page. Each comment is saved as an individual JSON file.

The actual files are named like 2016-06-03-194837-john-doe-this-is-the.json, and inside they look like this:

{
  "name":"John Doe",
  "emailMd5":"8eb1b522f60d11fa897de1dc6351b7e8",
  "emailMd5Salted":"a2075be64b31fda2c3c6e5d25923494a",
  "website":"http://www.example.com",
  "avatarType":"gravatar",
  "ipv4Address":"127.0.0.1:64093",
  "pageId":"my-third-post",
  "body":"This is the comment body. \u0026lt;span\u0026gt;HTML is escaped.\u0026lt;/span\u0026gt;",
  "timestamp":"2016-06-05T17:10:33+03:00"
}

In the earlier versions I stored the email address as plain text and used it as part of the filename, but I decided it was an unnecessary privacy issue, because I don’t need the address for anything. The addresses are still not stored securely in any sense of the word, but the MD5 hash of an address is already out there for anyone who has posted a comment to a blog that uses Gravatar for images. Also, the hashes are not exposed if the commenter chooses not to use an avatar.
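For reference, this is roughly how such hashes could be computed in Go. Gravatar expects the MD5 hash of the trimmed, lowercased address, which is what emailHash does. How the salted variant is built is not described here, so saltedEmailHash and its salt parameter are purely an assumption for illustration.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

// emailHash returns the hex MD5 of the trimmed, lowercased address,
// which is the form Gravatar uses in its avatar URLs.
func emailHash(email string) string {
	normalized := strings.ToLower(strings.TrimSpace(email))
	sum := md5.Sum([]byte(normalized))
	return hex.EncodeToString(sum[:])
}

// saltedEmailHash prepends a secret salt before hashing. This is a
// hypothetical scheme, not necessarily the one the actual script uses.
func saltedEmailHash(email, salt string) string {
	normalized := strings.ToLower(strings.TrimSpace(email))
	sum := md5.Sum([]byte(salt + normalized))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Both lines print the same hash, because the address is normalized first.
	fmt.Println(emailHash(" John.Doe@Example.com "))
	fmt.Println(emailHash("john.doe@example.com"))
	fmt.Println(saltedEmailHash("john.doe@example.com", "some-secret-salt"))
}
```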

The IP address is not currently showing the correct address, because I’m proxying the requests through nginx.

Anyway, having that structure, we can process the comments using the following template:

{{ $dir := .File.BaseFileName | printf "%s%s" .Dir | printf "%s%s" "comments/" }}
{{ $files := readDir $dir }}
{{ range $files }}
  {{ $comment := getJSON $dir "/" .Name }}
  <article class="comment">
    <footer class="comment__meta">
      {{ if eq $comment.avatarType "gravatar" }}
      <div class="comment__avatar-wrap"><img class="comment__avatar-img" src="https://www.gravatar.com/avatar/{{ $comment.emailMd5 }}?d=identicon"></div>
      {{ else }}
      <div class="comment__avatar-wrap"><img class="comment__avatar-img" src="/images/avatars/monkey.png"></div>
      {{ end }}
      <div class="comment__name">{{ $comment.name }}</div>
      <time class="comment__time" datetime="{{ $comment.timestamp }}">{{ dateFormat "2006-01-02 15:04" $comment.timestamp }}</time>
    </footer>
    <div class="comment__body">
      <p>{{ $comment.body | markdownify }}</p>
    </div>
  </article>
{{ end }}

readDir() reads the contents of a directory relative to the Hugo source folder, giving us a list of the JSON files. We then loop through the files and call getJSON() for each one, getting the data inside of it. After that it is a simple matter of choosing what we want to display.

Note that since I’m using markdownify with the comment body it needs to be escaped before it’s saved, because markdownify expects safe HTML and doesn’t do any escaping.

The Form

There is nothing special about the form. I have this in the comments template below the code that generates the comments. Most class attributes have been removed to make the snippet more concise.

<form class="comment-form" id="comment-form" action="/comment" method="POST">
  <input type="hidden" name="last_name">
  <input type="hidden" name="content_type" value="{{ .File.Ext }}">
  <input type="hidden" name="page_id" value="{{ .Dir }}{{ .File.BaseFileName }}">
  <div><label for="c-name">Name*</label><input id="c-name" type="text" name="name"></div>
  <div><label for="c-email">Email*</label> <input id="c-email" type="email" name="email"></div>
  <div>
    <label for="c-avatar">Avatar*</label>
    <select id="c-avatar" name="avatar_type">
      <option value="gravatar">Gravatar</option>
      <option value="libravatar">Libravatar</option>
      <option value="adorable">Adorable Avatar</option>
      <option value="none">None</option>
    </select>
  </div>
  <div><label for="c-website">Website</label> <input id="c-website" type="url" name="website"></div>
  <div><label for="c-body">Comment*</label><textarea id="c-body" name="body"></textarea></div>
  <div id="comment-form-message"></div>
  <div><button type="submit">Send</button></div>
</form>

last_name is there to bait spammers: it is hidden from human visitors, so a comment that arrives with a value in it most likely comes from a bot. content_type is the extension of the content file. page_id is the directory and the base filename of the content file, without the extension. content_type and page_id are used a) to verify that the page exists and b) to figure out which page the comment should be associated with. Having them out in the open is also a potential security risk and needs to be attended to in the script that processes the comment.

Just for completeness’ sake, here is also the JavaScript code that sends the form to the Go script:

var form = document.getElementById( "comment-form" );
if ( form ) {
  var msgArea = document.getElementById( "comment-form-message" );
  form.addEventListener( "submit", function( e ){
    var req = new XMLHttpRequest();
    // Clear the message and the classes left over from a previous attempt
    msgArea.className = "";
    msgArea.textContent = "";
    req.onload = function(){
      if ( req.status === 200 ) {
        msgArea.textContent =
          "Thank you for the comment! It should be visible after you refresh the page.";
        msgArea.classList.add( "message" );
        msgArea.classList.add( "message--success" );
        form.reset();
      } else {
        msgArea.classList.add( "message" );
        msgArea.classList.add( "message--error" );
        // req.response is null if the server did not return valid JSON
        msgArea.textContent = req.response ? req.response.message : "Sending the comment failed.";
      }
    };
    // Each input and the textarea has the class ".comment-form__field"
    var fields = document.querySelectorAll( ".comment-form__field" );
    var values = [];
    for ( var i = 0, j = fields.length; i < j; i++ ) {
      values.push( fields[i].name + "=" + encodeURIComponent( fields[i].value ) );
    }
    var payload = values.join( "&" );
    req.open( "POST", "//saimiri.io/comment", true );
    req.responseType = "json";
    req.setRequestHeader( "Content-type", "application/x-www-form-urlencoded" );
    req.setRequestHeader( "X-Requested-With", "XMLHttpRequest" );
    req.send( payload );
    e.preventDefault();
  } );
}

The Script

In a way, this is the least important piece of the puzzle, because a) it does nothing really special and b) it can be substituted with any similar script written in any language. I used Go simply because Hugo is written in it. Also, I had never written anything in Go before and wanted to see what it’s like.

The script I’m using is available at GitHub. It’s not really production ready, but it works. It has a few shortcomings and security vulnerabilities, but I’m tweaking and refactoring it whenever I have free time. If you plan to use it, there are some things you need to be aware of:

It’s a proof of concept put together in a couple of days with an “I’ll fix that later” mentality.
It’s not idiomatic Go. My first Go project, right?
It’s very picky about what the comment can contain. The name field, for example, can’t contain dots or accent characters.
It does not, however, check the length of any of the fields, so some malicious person could use it to dump huge comment files into your system.
The naming of the JSON files needs some fine-tuning when it comes to accent characters (they are simply skipped, and normalization would be better).
In its current state it’s not very configurable, but I’m working on that, too.
I have no idea how this works with HTTPS.

So How Do I Use This Thing?

See the README at the GitHub repository. (work in progress, sorry)

Alternative Approaches To Associating A Comment With A Page

I came up with some alternative ways to handle the comment-page association, because I was a bit squeamish about exposing the filesystem to the outside world.

Use A Separate Hash Map To Associate Comments With Pages

For a slightly more security-oriented approach (though maybe only through obscurity), we could use the MD5 hash of the content filename instead of the actual filename. To do that, we would need to use the uniqueId of the content file in the form and also generate a hash map of the content files.

Generating the hash map is simple. Here is a Bash script that generates a JSON file that contains an object literal. The object has each file hash as a property (or key) and the filename as that property’s value.

#!/bin/bash
find hugo/ -name "*.md" | (while read fname; do
        fn=${fname##*/}
        md5="$(printf '%s' "$fn" | md5sum | cut -d ' ' -f 1)"
        json="$json,\"$md5\": \"$fn\""
done
json=${json:1}
echo "{$json}" > md5.json)

Then all we would need to do is to compare the hash that is posted alongside the comment with the hashes in the map and see if there is a match.

The downside with this approach is the need to generate the hash map in the first place, of course. We could automate it, but it’s still an extra step and a potential source of bugs.

Create Comment Folders Manually

Instead of creating comment folders automatically, we could create them manually. Then instead of checking if the content file exists we would check if the comment folder exists. In effect, if the comment folder is missing or not correctly named, comments are disabled.

The obvious downside is that if we want comments enabled by default, we need to remember to add the folder every time we publish a new page, or we need to add more intelligence to our automatic deployment scripts.

The upside is that we can be pretty sure that only the pages that we specify will have comments.

Use Hash To Name The Comment Folder

Instead of using the content filename to name the comment folder, use the hash.

This requires the least wiring and effort, but it makes it extremely annoying to manage comment files manually. When the folders have names like fce90cbcae55af554acb24245e217ce5 and we have dozens or hundreds of pages, finding that one comment is going to take some time. The naming scheme helps a little and we can always search the file contents, but it’s still more work than I’m comfortable with. Plus it looks horrible.

Summa Summarum

This is probably not the best way to add comments to your Hugo sites, but personally I’m pretty satisfied with it. There is still a lot of room for improvement and I have mentally prepared myself for something to go horribly wrong if my posts ever start to get a significant number of comments. But the principle is solid, so I’m quite confident it will work out OK.

Anyway, in the coming days and weeks I’ll be tweaking the program and getting some of the kinks out of it. I’ll probably also write a PHP version of it, in case someone finds it useful.

If you have any suggestions on how to improve it, please let me know.
