Keeping an Octopress Site Clean

Octopress uses rsync to deploy to the web host. Keeping the files on the host tidy can be a bit of a worry if you’re like me[1].

Imagine this scenario: you rename or delete[2] an Octopress post, then generate and deploy. Your site is updated to reflect the changes, but the old files are still on the host, where they will likely keep receiving search engine hits.

The Octopress Rakefile has the rsync_delete option, which makes rsync delete any file at the destination that doesn’t exist in the source.

That’s fine to use, unless you have non-Octopress files at the destination. I run my Octopress blog in the root of my site and have other directories there that, for obvious reasons, I don’t keep in my Octopress source directory, so rsync --delete would delete them.

Update Jan 8th 2013: After a brief “doh” moment I realized that Octopress already supports this, as clearly documented. I was aware of the rsync-exclude file, but in my mind it only kept the excluded local files from being uploaded to the destination. As the documentation says, if the rsync_delete option is true, rsync will also not delete the files listed in rsync-exclude on the destination.

However, if you want to use the full abilities of rsync’s filter rules, the rest of this article still stands and has been modified to accommodate better exclude filter rules.

Rsync Filter Rules

I found the solution in rsync’s filter options. I’ve used --include and --exclude often, but after some RTFMing I learned that those parameters are just a subset of the actual filter rules.

First of all, set it up[3] in your Rakefile:

rsync_delete = true
rsync_args   = "--filter='merge rsync-filter'"

Next, create a file named rsync-filter in your Octopress root directory. This is where you state which files you want to keep untouched on the destination host.

P dir1/
P dir2/
P file.html

There are plenty of other options for the filter, but P (protect) is what we’re after here: it protects the matching files and folders from deletion, meaning rsync will leave them alone on the destination when --delete is used. The rsync man page says this about --delete:

This tells rsync to delete extraneous files from the receiving side (ones that aren’t on the sending side), but only for the directories that are being synchronized. (…) Files that are excluded from transfer are also excluded from being deleted unless you use the --delete-excluded option or mark the rules as only matching on the sending side (…)
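To make the effect of the P rules concrete, here is a deliberately simplified Ruby model of what --delete would remove. It is only a sketch and not how rsync is implemented: the function name files_to_delete and the sample paths are made up for illustration, and the rules are treated as plain paths anchored at the site root, whereas real rsync patterns support wildcards and can match at any depth.

```ruby
# Simplified model of `rsync --delete` combined with protect (P) rules.
# Assumption: every rule is a literal path relative to the site root;
# a trailing slash means "this directory and everything inside it".
def files_to_delete(source, destination, protected_rules)
  # Files on the destination that no longer exist in the source.
  extraneous = destination - source

  # Drop everything covered by a protect rule.
  extraneous.reject do |path|
    protected_rules.any? do |rule|
      rule.end_with?("/") ? path.start_with?(rule) : path == rule
    end
  end
end

# Hypothetical site contents, for illustration only.
source      = ["index.html", "blog/new-post/index.html"]
destination = ["index.html", "blog/old-post/index.html",
               "dir1/photo.jpg", "dir2/notes.txt", "file.html"]
rules       = ["dir1/", "dir2/", "file.html"]

puts files_to_delete(source, destination, rules)
# prints: blog/old-post/index.html
```

Only the stale post is removed. The protected directories and file.html survive even though they don’t exist in the source.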

Danger! High Voltage!

While testing this out, add --dry-run as well to check that it actually works as intended. Otherwise you might nuke your whole site! That’s not fun.
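One way to try a dry run without touching the deploy task is to build the command by hand first. The helper below is a minimal sketch, not the code in the Octopress Rakefile: the name rsync_command, its parameters and the example host and path are all assumptions, and only standard rsync flags are used (-avz, --dry-run, --delete, --filter).

```ruby
# Hypothetical helper: builds an rsync command line as a string so it can
# be inspected before anything is run. Not part of Octopress.
def rsync_command(source:, target:, filter_file: "rsync-filter",
                  delete: true, dry_run: true)
  parts = ["rsync", "-avz"]
  parts << "--dry-run" if dry_run   # report what would happen, change nothing
  parts << "--delete" if delete     # remove extraneous files on the destination
  parts << "--filter='merge #{filter_file}'"
  parts << "#{source}/"             # trailing slash: sync the directory contents
  parts << target
  parts.join(" ")
end

# Example host and path are placeholders.
puts rsync_command(source: "public", target: "user@example.com:/var/www/site")
```

With -v and --dry-run, rsync lists each file it would remove on a line starting with “deleting”. If anything you wanted to keep shows up there, fix rsync-filter before deploying for real.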

Footnotes

  1. That is, slightly obsessive-compulsive.

  2. Which is something I would be very wary of doing because cool URIs don’t change.

  3. You might not have this option in your Rakefile yet – it’s a new addition pushed by yours truly. Pull a fresh Octopress to get it.
