Backups are important

Your laptop will die. Your hard drive will fail. You will lose your USB key. You’ll have a power cut halfway through typing up your essay. Accidents happen, and making sure you don’t lose work - especially assignments and essays - is important.

Any important work you do should be saved in multiple places. At a minimum, keep two copies on physically separate devices. Preferably, keep a third copy - that way, if one copy is destroyed or inaccessible, you still have two copies, and are still protected from a second device failing.

Geographical separation is also useful, and makes it much harder to lose work - this doesn’t need to mean keeping copies of your work in another country, but you should avoid keeping multiple copies in the same place. For example, if you have one copy on a laptop and another on a USB key, you may want to consider not keeping them in the same backpack - if the backpack was stolen, you would lose both copies of the work.
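The habit of keeping a second copy is easy to script. A minimal sketch - all paths here are temporary stand-ins so the example runs anywhere; in practice the source would be your documents folder and the destination a USB key or network drive:

```shell
set -e
# Copy a folder of work to a second location.
# These paths are stand-ins for e.g. ~/Documents/essays and a USB key.
src=$(mktemp -d)/essays
dest=$(mktemp -d)/essays-backup
mkdir -p "$src"
printf 'essay draft\n' > "$src/essay.txt"   # stand-in for real work
mkdir -p "$dest"
cp -r "$src"/. "$dest"/
ls "$dest"
```

Run from cron or a login script, even something this simple beats having no second copy at all.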

Work in progress

As well as making sure you have multiple copies of your files, you should make sure that you save changes to your files while you are working on them. Most editors will have a feature that automatically saves a copy of your work every few minutes - make sure you enable it. You don’t need to make a full backup every time you write a sentence, but making sure your work is saved regularly will save you from losing all the work you did that day when your word processor crashes or your laptop runs out of battery.

Online file storage

Various online services allow you to synchronise files and folders between online storage and multiple computers. This is one of the easiest ways to ensure you have backup copies of your work, and is extremely useful when working on something from multiple computers. You can also access your files through a web browser or mobile application, so you’re almost always able to access them.

Online storage services generally have a free plan, with paid plans available for additional features or extra space. This post doesn’t cover their paid features, and unless noted each service has applications for Linux, OSX and Windows, as well as Android and iOS phones.

  • Dropbox provides 2GB of storage (with extra space for using certain features or recruiting people), keeps a 30 day history of changes to files, and has applications for almost any device or operating system.
  • Google Drive provides 15GB of storage, and integrates with a set of office software that you can use from your browser or phone (word processing, spreadsheets, presentations). It’s missing a Linux application.

If you have existing services or devices you want a service to work with, Amazon and Apple also provide their own storage offerings: Amazon Cloud Drive provides 5GB of space, though doesn’t have much to show off aside from being available on Amazon Fire devices. iCloud is part of iOS and OSX; you need an Apple device to use it, and there’s no Android client.

Version Control

Git can be used to incrementally record changes to your work and push or pull those changes to and from other repositories. Mercurial is a simpler and more user-friendly alternative to Git, but both need you to spend some time learning how version control systems work and how to use the tools they provide. For Computer Science students, basic knowledge of Git and version control is essential, and it’s worth knowing at least one version control system well, as it’s almost essential for any software project. Various hosting services for Git allow you to store and publish repositories online.
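The record-a-change cycle can be sketched as follows - the file name, commit message and user details are examples, and the repository lives in a temporary directory so the sketch runs anywhere:

```shell
set -e
# Create a repository and record one change to a piece of work.
repo=$(mktemp -d)/essay
git init --quiet "$repo"
cd "$repo"
echo "first draft" > essay.txt
git add essay.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "First draft of essay"
git log --oneline
```

Each commit is a snapshot you can return to, which is what makes version control so useful as a safety net on top of backups.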

  • GitHub is the most popular, and is used for many open source projects. You can publish an unlimited number of public repositories, but private repositories either require the Student Developer Pack (which provides 2 private GitHub repositories along with many other goodies), or a paid plan.
  • BitBucket provides unlimited public and private repositories. It’s similar to GitHub, but provides some different features and is aimed at companies and teams rather than individuals.

You can use Git and Mercurial on your own server with nothing more than SSH access, though there are also applications that will provide an interface similar to GitHub or BitBucket. I’d suggest looking at GitLab or Gitorious if you plan on doing this.
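The SSH-only setup works because the server just needs to hold a bare repository. A sketch, using a local temporary path to stand in for a `user@server:path` remote:

```shell
set -e
# A bare repository acts as a remote you can push to; on a real server
# it would live at e.g. user@host:repos/project.git (example path).
remote=$(mktemp -d)/project.git
git init --quiet --bare "$remote"

# A working repository with one commit (names are examples).
work=$(mktemp -d)/project
git init --quiet "$work"
cd "$work"
echo "notes" > notes.txt
git add notes.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"

# Add the bare repository as a remote and push to it.
git remote add origin "$remote"
git push --quiet origin HEAD
```

On a real server the only change is the remote path; Git runs the server side of the push over the SSH connection itself.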


In summary:

  • Remember to regularly save work in progress.
  • Your work doesn’t exist until there are at least two copies.
  • It doesn’t count if those copies are in the same physical location.

Installing Pandoc on Windows

Download the pandoc Windows installer from here, then install it globally using this command (using the path to the version of pandoc you downloaded):

msiexec /i pandoc-1.12.3.msi ALLUSERS=1

Install MiKTeX from here.

Run these commands to check that pandoc and MiKTeX are installed:

"C:\Program Files (x86)\Pandoc\pandoc.exe" --version
"C:\Program Files (x86)\MiKTeX 2.9\miktex\bin\pdflatex.exe" --version

Building an RPM for Supervisor

Start with a skeleton specfile, which can be generated using rpmdev-newspec:

rpmdev-newspec supervisor.spec

The example specfile used to write this post can be found here.

Package metadata

The basic package metadata can be extracted from Supervisor’s existing package metadata and its website:

Name: supervisor
Version: 3.0
Release: 1
Summary: A system for controlling process state under UNIX

Group: System Environment/Daemons

%description
Supervisor is a client/server system that allows its users to control a number
of processes on UNIX-like operating systems.

Defining requirements

Supervisor requires Python 2.4 or above (excluding Python 3.0), Setuptools (a packaging library), and meld3 (a library used inside Supervisor). In CentOS, these are packaged as python, python-setuptools and python-meld3, and so are added to the package requirements using their package names.

Requires: python > 2.4
Requires: python < 3
Requires: python-meld3 >= 0.6.5
Requires: python-setuptools


Downloading the sources

Supervisor is developed on GitHub, which provides source tarballs for repositories.

In the best case, the source URL could be defined in the specfile, and spectool could be used to download the tarball into the SOURCES folder:

    spectool --get-files --sourcedir supervisor.spec

Unfortunately, GitHub’s tarball URLs use a series of redirects that result in a different name for the file, and so the files downloaded by spectool don’t match the filenames rpmbuild expects.

Instead, I’ve used a Makefile that downloads the tarball and ensures the filename is correct, which also makes it easy to checksum the sources before using them. The Makefile used to build the package is described later in the article. For now, the specfile only has to know the name of the downloaded tarball:

Source0: supervisor-%{version}.tar.gz

Building from source

The package is built across three major steps: prep, build and install.

The preparation step is simple, and does nothing but extract the source tarball. The -n option gives the name of the directory the tarball extracts to, though %{name}-%{version} is the default anyway:

%prep
%setup -q -n %{name}-%{version}

The build step uses setuptools to build the package, and the install step calls it again to install the package into the ‘build root’:

%build
%{__python} setup.py build

%install
%{__python} setup.py install --skip-build --root $RPM_BUILD_ROOT

Finally, the specfile has to define the files that will be included in the package:

%files
%defattr(-,root,root,-)
%doc CHANGES.txt COPYRIGHT.txt LICENSES.txt PLUGINS.rst README.rst
%{_bindir}/echo_supervisord_conf
%{_bindir}/pidproxy
%{_bindir}/supervisorctl
%{_bindir}/supervisord
%{python_sitelib}/*

This declares that the files should be owned by root, that Supervisor’s documentation should be placed in an automatic documentation directory, that each of the scripts Supervisor provides should be included, and that all of the Python modules that were installed should be included.

This should result in a directory tree like this:

└── usr
    ├── bin
    │   ├── echo_supervisord_conf
    │   ├── pidproxy
    │   ├── supervisorctl
    │   └── supervisord
    ├── lib
    │   └── python2.6
    │       └── site-packages
    │           ├── supervisor
    │           ├── supervisor-3.0-py2.6.egg-info
    │           └── supervisor-3.0-py2.6-nspkg.pth
    └── share
        └── doc
            └── supervisor-3.0
                ├── CHANGES.txt
                ├── COPYRIGHT.txt
                ├── LICENSES.txt
                ├── PLUGINS.rst
                └── README.rst

Including configuration

As well as Supervisor itself, including distribution-specific information is useful, and is one of the primary reasons to build an RPM for it. Firstly, an init script to manage the service is needed, and that requires default configuration so that Supervisor can be started.

Two additional sources are used, containing an init script and a Supervisor configuration file.

Source0: supervisor-%{version}.tar.gz
Source1: supervisor.init
Source2: supervisor.conf

The Supervisor configuration used is as simple as possible, only overriding the options needed to run Supervisor as a system service (by default, most of Supervisor’s files are placed in /tmp):

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[include]
files = /etc/supervisord.d/*.conf
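With that include line in place, each service then gets its own file in the included directory. A hypothetical example - the program name and command are made up for illustration:

```ini
; /etc/supervisord.d/myapp.conf - 'myapp' and its command are examples
[program:myapp]
command=/usr/bin/myapp --foreground
autostart=true
autorestart=true
```

This keeps the packaged supervisord.conf untouched while still letting each host define the processes it runs.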

These files are added to the install step, along with the directories they need:

mkdir -p %{buildroot}/%{_initrddir}
install -p -m 755 %{SOURCE1} %{buildroot}/%{_initrddir}/supervisord
mkdir -p %{buildroot}/%{_sysconfdir}
mkdir -p %{buildroot}/%{_sysconfdir}/supervisord.d
install -p -m 644 %{SOURCE2} %{buildroot}/%{_sysconfdir}/supervisord.conf
mkdir -p %{buildroot}/%{_localstatedir}/log/%{name}
%{__python} setup.py install --skip-build --root $RPM_BUILD_ROOT

Finally, the package needs to be configured to add the supervisord service to chkconfig using the post-install step:

%post
/sbin/chkconfig --add %{name}d || :

And to then remove it in the pre-uninstall step:

%preun
if [ $1 = 0 ]; then
    /sbin/service supervisord stop >/dev/null 2>&1 || :
    /sbin/chkconfig --del %{name}d || :
fi

The $1 = 0 test checks that the package is being removed entirely, rather than upgraded.

With these changes, the specfile should be complete, and the directory tree should now include files in /etc and /var:

├── etc
│   ├── rc.d
│   │   └── init.d
│   │       └── supervisord
│   ├── supervisord.conf
│   └── supervisord.d
├── usr
│   ├── bin
│   │   ├── echo_supervisord_conf
│   │   ├── pidproxy
│   │   ├── supervisorctl
│   │   └── supervisord
│   ├── lib
│   │   └── python2.6
│   │       └── site-packages
│   │           ├── supervisor
│   │           ├── supervisor-3.0-py2.6.egg-info
│   │           └── supervisor-3.0-py2.6-nspkg.pth
│   └── share
│       └── doc
│           └── supervisor-3.0
└── var
    └── log
        └── supervisor

Automating the build with make

Since rpmbuild requires all sources to be placed in ~/rpmbuild/SOURCES, I’ve used a Makefile to automate collecting the sources and building the RPM:


specfile=supervisor.spec
version=$(shell grep Version ${specfile} | awk '{ print $$2 }')
release=$(shell grep Release ${specfile} | awk '{ print $$2 }')

package=supervisor-${version}-${release}.x86_64.rpm
rpmbuild_package=${HOME}/rpmbuild/RPMS/x86_64/${package}
tarball=supervisor-${version}.tar.gz
tarball_url=https://github.com/Supervisor/supervisor/archive/${version}.tar.gz

${package}: ${rpmbuild_package}
    cp ${rpmbuild_package} ${package}

${rpmbuild_package}: sources
    rpmbuild -ba ${specfile}

sources: ${tarball}
    spectool --list-files ${specfile} | awk '{ print $$2 }' | xargs -i cp "{}" ${HOME}/rpmbuild/SOURCES/

${tarball}:
    wget ${tarball_url} --output-document ${tarball}
    md5sum -c sources

.PHONY: sources

The version and release are read from the specfile, then used to work out the package name. spectool is used to list the required sources and copy them into rpmbuild’s SOURCES directory, and wget is used to fetch the source tarball, due to the previously mentioned issues with redirects.

A simple filesystem watcher

watch-fs is a small, simple tool that watches a directory and runs commands when files change.

Lots of similar tools already exist - nodemon, my own spotter, inotifywait. Each of them has been too complex to use quickly, or, in the case of nodemon, is geared towards a specific use case.

watch-fs aims to work as simply as possible out of the box, with any non-essential features being optional.

watch-fs "echo File changed"

In this example, watch-fs will echo File changed any time a file changes in the current directory. It will ignore further file changes for a short delay once the command has finished, so that editors (and commands) that change multiple files don’t cause the command to run several times.

Commands can use the name and path of the file that changed:

watch-fs "echo File '{name}' changed at path '{path}'"

A few arguments control optional features:

watch-fs --first --clear --verbose "rspec"
watch-fs -fcv "rspec"
  • --first runs the command before waiting for changes.
  • --clear runs clear before running the command, which keeps the terminal free of output from multiple runs of the command.
  • --verbose prints the command before running it, which is useful for separating the output or showing what was run when using --clear.

Advanced Tox usage

This article explains the Tox configuration used by several of my Python projects, which both tests the module on multiple versions of Python and runs testing-related tools for style guides, static code checking and coverage reports.

The tox.ini used here is taken from Supermann, and the source can be found here.

Running pytest to find and run tests

py.test is used to discover, run and write tests. It uses tox.ini as a configuration file by default, and the two tools work very well in combination. The first two sections tell Tox to use the default Python 2.6 and 2.7 environments, to install pytest and mock as test dependencies, and to use py.test to run the module’s tests.


[tox]
envlist = py26,py27

[testenv]
deps =
    pytest
    mock
commands=py.test supermann

[pytest]
addopts=-qq --strict --tb=short

Running flake8 to check code

flake8 wraps several Python tools for static checking of code - pep8 for style, and pyflakes for static error checking. This configures it to use tox.ini for its configuration - reducing the number of configuration files in the repo.

[testenv:flake8]
deps = flake8
commands=flake8 --config=tox.ini supermann


Running coverage to show untouched code

Coverage is by far the most verbose tool configured in my tox.ini, but the HTML reports it generates are very useful for working out which parts of my code aren’t being run by my tests. This does several new things with Tox: it runs two commands (one to generate the data, and another to render an HTML report), and uses the dependencies from the existing testenv configuration.

Coverage itself is told to use tox.ini for its configuration, and is configured to ignore various parts of the codebase (the tests folder, and lines of code that should never run). The coverage files are placed in .tox to keep the repository tidy while working.

[testenv:coverage]
deps =
    coverage
    {[testenv]deps}
commands =
    coverage run --rcfile tox.ini --source supermann -m py.test
    coverage html --rcfile tox.ini


[run]
data_file = .tox/coverage

[report]
exclude_lines =
    def __repr__
    raise NotImplementedError
    class NullHandler

[html]
directory = .tox/coverage_html
title=Supermann coverage report