Testing Jekyll

There are a bunch of different types of tests you can run. In this example, we’ll do a style check with RuboCop, and then we’ll use a tool called html-proofer to verify that all of the links in our blog work. I hate running into broken links when I’m looking for something, so html-proofer is a great tool for catching any broken links you may be posting. It is especially useful for larger and older sites, because pages on the internet move or disappear over time.

Style tests (Ruby)

The most common tool for checking Ruby style is RuboCop. The rules are updated on a pretty frequent basis, so don’t be surprised if your code passes style tests this week but fails next week, even without any changes. RuboCop is only meant to test Ruby code. If you’ve been following all of the steps of setting up a basic Jekyll site from the first post to now, the only files that will get checked for style will be your Rakefile and your Gemfile.

You will only have one additional dependency when working with RuboCop. Add the following to your Gemfile:

gem 'rubocop'

Run bundle to install RuboCop. To set up a RuboCop rake task, you only need to add two lines to your Rakefile:

require 'rubocop/rake_task'

RuboCop::RakeTask.new

With your updated Rakefile, run the style checks with bundle exec rake rubocop:

$ bundle exec rake rubocop
Running RuboCop...
Inspecting 2 files
..

2 files inspected, no offenses detected

To give you an example of what a failed style check looks like:

$ bundle exec rake rubocop
Running RuboCop...
Inspecting 2 files
C.

Offenses:

Gemfile:6:7: C: Prefer single-quoted strings when you don't need string interpolation or special symbols.
  gem "rubocop"
      ^^^^^^^^^

2 files inspected, 1 offense detected
RuboCop failed!

When a style check fails, it will exit with a non-zero exit code, which makes it perfect for CI.
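To make the exit code point concrete, here is a minimal sketch of how a CI step could decide pass or fail from a command's exit status. The helper name check_passed? is my own, and the two commands are stand-ins that exit with 0 and 1 so the sketch runs without RuboCop installed.

```ruby
require 'open3'
require 'rbconfig'

# Run a command and return true only when it exits with status 0,
# which is how a CI system decides whether a step passed.
def check_passed?(*command)
  _output, status = Open3.capture2e(*command)
  status.success?
end

# Stand-ins for a passing and a failing check. In a real pipeline the
# command would be something like: bundle exec rake rubocop
passing = [RbConfig.ruby, '-e', 'exit 0']
failing = [RbConfig.ruby, '-e', 'exit 1']

puts "passing check ok? #{check_passed?(*passing)}" # => true
puts "failing check ok? #{check_passed?(*failing)}" # => false
```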

html-proofer setup

As mentioned before, this is a great tool for finding bad links. It tests every link by connecting to it and performing a GET. If the response is anything other than a 200, the test fails.
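The idea can be sketched with nothing but Ruby's standard library. This is not html-proofer's actual implementation, just an illustration of the rule described above. The method names fetch_status and link_ok? are hypothetical, and returning 0 when no response arrives is my own convention.

```ruby
require 'net/http'
require 'uri'

# Return the HTTP status code for a GET of the URL, or 0 when no
# response arrives (connection refused, DNS failure, timeout).
def fetch_status(url, timeout: 5)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https',
                  open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(uri.request_uri).code.to_i
  end
rescue SocketError, SystemCallError, Timeout::Error
  0
end

# Mirror the rule described above: only a 200 counts as a pass.
def link_ok?(status)
  status == 200
end

# Classify a few sample status codes without touching the network.
[200, 403, 0].each do |status|
  puts "#{status}: #{link_ok?(status) ? 'ok' : 'failed'}"
end
```

Calling link_ok?(fetch_status(url)) for each link on a page gives a rough version of the check.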

The only dependency here is html-proofer. Add the following to your Gemfile:

gem 'html-proofer'

Run bundle and add the following to your Rakefile:

require 'html-proofer'
require 'jekyll'

task :build do
  puts 'Building site...'
  Jekyll::Commands::Build.process(profile: true, future: true)
end

task :html_proofer do
  Rake::Task['build'].invoke
  puts 'Running html proofer...'
  HTMLProofer.check_directory('./_site').run
end

The Rake::Task['build'].invoke statement will call the build task if it hasn’t already been called. To run the task, run bundle exec rake html_proofer. With the default setup, I have a few issues. See the following output:
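If the "only if it hasn’t already been called" behavior is new to you, this small sketch shows it outside of a Rakefile. It uses stub tasks that just record their names in RUN_LOG, a constant I made up for the illustration, instead of building the site.

```ruby
require 'rake'

Rake::Task.clear # start from an empty task list so the sketch is repeatable
RUN_LOG = []

# Stub tasks that only record that they ran.
Rake::Task.define_task(:build) { RUN_LOG << :build }
Rake::Task.define_task(:html_proofer) do
  Rake::Task['build'].invoke
  RUN_LOG << :html_proofer
end

Rake::Task['build'].invoke        # runs the build action
Rake::Task['html_proofer'].invoke # build was already invoked, so it is skipped

puts RUN_LOG.inspect # => [:build, :html_proofer]
```

The build action shows up only once in the log, even though it was invoked twice.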

$ bundle exec rake html_proofer
Building site...
Configuration file: /Users/brint/src/spock.rocks/_config.yml
            Source: /Users/brint/src/spock.rocks
       Destination: /Users/brint/src/spock.rocks/_site
 Incremental build: disabled. Enable with --incremental
      Generating...

Filename                                                 | Count |  Bytes |  Time
---------------------------------------------------------+-------+--------+------
_layouts/default.html                                    |     3 | 22.14K | 0.010
sitemap.xml                                              |     1 |  0.57K | 0.008
_includes/footer.html                                    |     3 |  8.28K | 0.004
_includes/head.html                                      |     3 |  1.86K | 0.003
_includes/header.html                                    |     3 |  3.13K | 0.001
feed.xml                                                 |     1 | 10.34K | 0.001
_includes/icon-github.html                               |     3 |  3.05K | 0.001
_includes/icon-twitter.html                              |     3 |  2.66K | 0.001
_includes/comments.html                                  |     3 |  0.00K | 0.001
index.html                                               |     1 |  0.38K | 0.001
_layouts/post.html                                       |     1 |  7.67K | 0.000
_posts/2016-03-16-starting-with-jekyll.markdown          |     1 |  5.14K | 0.000
robots.txt                                               |     1 |  0.04K | 0.000
_includes/icon-twitter.svg                               |     3 |  2.31K | 0.000
_layouts/page.html                                       |     1 |  0.32K | 0.000
_includes/icon-github.svg                                |     3 |  2.71K | 0.000
_posts/2016-03-16-starting-with-jekyll.markdown/#excerpt |     1 |  0.55K | 0.000
css/main.scss                                            |     1 |  0.96K | 0.000
about.md                                                 |     1 |  0.15K | 0.000

                    done in 0.248 seconds.
 Auto-regeneration: disabled. Use --watch to enable.
Running html proofer...
Running ["ScriptCheck", "LinkCheck", "ImageCheck"] on ["./_site"] on *.html...


Checking 22 external links...
Ran on 3 files!


- ./_site/about/index.html
  *  External link https://spock.rocks/about/ failed: 403 No error
  *  linking to internal hash # that does not exist (line 28)
- ./_site/index.html
  *  linking to internal hash # that does not exist (line 28)
- ./_site/tech/2016/03/16/starting-with-jekyll.html
  *  External link http://127.0.0.1:4000/ failed: got a time out (response code 0)
  *  linking to internal hash # that does not exist (line 27)
rake aborted!
HTML-Proofer found 5 failures!
/Users/brint/src/spock.rocks/Rakefile:30:in `block in <top (required)>'
Tasks: TOP => html_proofer
(See full trace by running task with --trace)

Since this site is being run with CloudFront in mind, I’ve moved /about to /about/index.html because CloudFront only allows for a default document in the root path of your origin. For all additional paths, you have to explicitly reference index pages.

So let’s fix this. Open _includes/head.html and look for this line:

<link rel="canonical" href="{{ page.url | replace:'index.html','' | prepend: site.baseurl | prepend: site.url }}">

We need to remove replace:'index.html','' so it now looks like this:

<link rel="canonical" href="{{ page.url | prepend: site.baseurl | prepend: site.url }}">
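To see why that one filter matters, here is a rough Ruby equivalent of the Liquid filter chain. The canonical_url helper is hypothetical, and I’m assuming the page URL is /about/index.html, the site url is https://spock.rocks, and baseurl is empty.

```ruby
# Build the canonical URL the way the Liquid filter chain does:
# each `prepend` puts its argument in front of the string so far.
def canonical_url(page_url, site_url:, baseurl: '', strip_index: false)
  path = strip_index ? page_url.gsub('index.html', '') : page_url
  "#{site_url}#{baseurl}#{path}"
end

site = 'https://spock.rocks' # assumed value of `url` in _config.yml

# With the replace filter, the canonical link points at the bare directory.
puts canonical_url('/about/index.html', site_url: site, strip_index: true)
# => https://spock.rocks/about/

# Without it, the canonical link names the index page explicitly.
puts canonical_url('/about/index.html', site_url: site)
# => https://spock.rocks/about/index.html
```

The first URL is the one that came back with a 403 in the output above.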

Run the test again, and the error regarding /about goes away. Note that at this point, it’s still validating this link against my production site. The linking to internal hash # that does not exist error still needs to be handled. In this case, it’s okay to ignore it, so we’ll update our rake task using the html-proofer configuration docs to set this up:

task :html_proofer do
  Rake::Task['build'].invoke
  puts 'Running html proofer...'
  HTMLProofer.check_directory('./_site', allow_hash_href: true).run
end

Now we run bundle exec rake html_proofer again, and we still have one error:

Running html proofer...
Running ["ScriptCheck", "LinkCheck", "ImageCheck"] on ["./_site"] on *.html...


Checking 22 external links...
Ran on 3 files!


- ./_site/tech/2016/03/16/starting-with-jekyll.html
  *  External link http://127.0.0.1:4000/ failed: got a time out (response code 0)
rake aborted!
HTML-Proofer found 1 failure!
/Users/brint/src/spock.rocks/Rakefile:30:in `block in <top (required)>'
Tasks: TOP => html_proofer
(See full trace by running task with --trace)

This is trying to do an external site check on a link in a post. In this case, the link is a reference for someone following along with the post, and I wouldn’t expect a server to be running at that address in most cases. I consider this link okay. There are two ways to ignore this error:

  1. Set up a global ignore in the task for that URL
  2. Add an additional attribute to that one link to make html-proofer ignore it.

In this case, I’m going with the second option. I don’t like to ignore things globally unless it’s absolutely necessary. To fix this link, I’m going to update my post and add a data-proofer-ignore attribute to the link. The link will now look like this:

[127.0.0.1:4000/](http://127.0.0.1:4000/){:data-proofer-ignore=''}
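As a rough illustration of what the attribute does, this sketch collects the links that would still be checked. It is a simplification and not how html-proofer parses HTML. The links_to_check helper and the example.com link are made up, and I’m assuming kramdown renders the attribute as data-proofer-ignore="".

```ruby
# Sample HTML: one ordinary link and one marked to be skipped.
SAMPLE_HTML = <<~HTML
  <a href="https://example.com/">An ordinary link</a>
  <a href="http://127.0.0.1:4000/" data-proofer-ignore="">127.0.0.1:4000/</a>
HTML

# Collect hrefs from anchor tags that do not carry data-proofer-ignore.
def links_to_check(html)
  html.scan(/<a\b[^>]*>/)
      .reject { |tag| tag.include?('data-proofer-ignore') }
      .map { |tag| tag[/href="([^"]*)"/, 1] }
      .compact
end

puts links_to_check(SAMPLE_HTML).inspect # => ["https://example.com/"]
```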

Let’s run the test again:

Running html proofer...
Running ["LinkCheck", "ImageCheck", "ScriptCheck"] on ["./_site"] on *.html...


Checking 21 external links...
Ran on 3 files!


HTML-Proofer finished successfully.

And we have what we wanted, a successful test!

The external link checking is prone to failure. If you’re adding a few posts and one of your new posts references another unpublished post, the external link validation will fail. We still want to validate external links, so we’re going to set up our html_proofer task to ignore our own site when validating external links by using the url_ignore option. url_ignore takes a list of regular expressions that it applies to every link it looks at.

We need to add require 'uri' to our Rakefile, along with a few methods to help us pull information out of the site configuration. We could statically define the site name, but as I’ve talked about in previous posts, that means you’re repeating yourself and you end up with multiple sources of truth. It takes a few extra steps to do it this way, but it makes long-term management simpler. Rather than listing each of the small changes, here’s a full Rakefile that shows everything we’ve created up to this point, including the addition of colorize, which also needs to be added to your Gemfile:

# coding: utf-8
require 'colorize'
require 'html-proofer'
require 'jekyll'
require 'rubocop/rake_task'
require 'uri'

# Configuration Options
config_file = '_config.yml' # Name of Jekyll config file

# Do not touch below this line
RuboCop::RakeTask.new

# Extend string to allow for bold text.
class String
  def bold
    "\033[1m#{self}\033[0m"
  end
end

# Rake Jekyll tasks
task :build do
  puts 'Building site...'.yellow.bold
  Jekyll::Commands::Build.process(profile: true)
end

task :clean do
  puts 'Cleaning up _site...'.yellow.bold
  Jekyll::Commands::Clean.process({})
end

task :html_proofer do
  Rake::Task['build'].invoke
  host_regex = Regexp.new(Regexp.escape(site_domain(config_file)))
  puts 'Running html proofer...'.yellow.bold
  HTMLProofer.check_directory('./_site', allow_hash_href: true,
                                         url_ignore: [host_regex]).run
end

# Misc Methods
def site_domain(config_file)
  URI(fetch_jekyll_config(config_file)['url']).host
end

def fetch_jekyll_config(config_file)
  site = Jekyll::Configuration.new
  site.read_config_file(config_file)
end
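If you want to see what the host pattern does without running a full build, here is a standalone sketch. The helpers host_pattern and split_links are hypothetical, the YAML string stands in for _config.yml, and escaping the host with Regexp.escape is a precaution I added so the dot only matches a literal dot.

```ruby
require 'uri'
require 'yaml'

# Stand-in for the contents of _config.yml (values are assumptions).
config = YAML.safe_load(<<~CONFIG)
  title: Example
  url: "https://spock.rocks"
CONFIG

# Build a pattern that matches any link containing the site's host.
def host_pattern(site_url)
  Regexp.new(Regexp.escape(URI(site_url).host))
end

# Split links into [ignored, checked] based on a list of ignore patterns.
def split_links(links, ignore)
  links.partition { |link| ignore.any? { |pattern| link.match?(pattern) } }
end

links = ['https://spock.rocks/about/index.html', 'https://example.com/']
ignored, checked = split_links(links, [host_pattern(config['url'])])

puts "ignored: #{ignored.inspect}" # => ["https://spock.rocks/about/index.html"]
puts "checked: #{checked.inspect}" # => ["https://example.com/"]
```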

At this point, you can now run two commands to check for styling and validate external URLs.

  • Style: bundle exec rake rubocop
  • Validate URLs: bundle exec rake html_proofer
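If you would rather have a single command for CI, one option is an aggregate task that lists both checks as prerequisites. This is a sketch with stub tasks. The task name test and the CHECKS_RUN constant are my own inventions. In the real Rakefile, the equivalent would be one line: task test: %i[rubocop html_proofer].

```ruby
require 'rake'

Rake::Task.clear # start from an empty task list so the sketch is repeatable
CHECKS_RUN = []

# Stub tasks standing in for the real rubocop and html_proofer tasks.
Rake::Task.define_task(:rubocop) { CHECKS_RUN << :rubocop }
Rake::Task.define_task(:html_proofer) { CHECKS_RUN << :html_proofer }

# Hypothetical aggregate task: it has no action of its own and
# simply runs its prerequisites in the order listed.
Rake::Task.define_task(test: %i[rubocop html_proofer])

Rake::Task['test'].invoke
puts CHECKS_RUN.inspect # => [:rubocop, :html_proofer]
```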