James Cooke

Why isn’t the UK in DST yet?!

2024-03-28T00:00:00+00:00

Warning: I particularly hate Daylight Saving Time (DST), so this post is tainted with negativity. However, I work hard to prepare myself and my family for the biannual time change, so this post explores how I set my mental model up for the clocks going forwards in the UK.

It’s 28th March as I write this from the UK.

The UK should have moved to DST according to my mental model. This should have happened last weekend.

I even did some family sleep time prepping last weekend. I even wrongly tooted about it and then corrected myself.

All of this has happened because my mental model about DST in the UK is WRONG 🤦.

🇺🇸 USA and 🇨🇦 Canada

Canada is a country that I like - some of our friends and family live there and I’ve visited multiple times. I also like the USA - I’ve worked for USA companies and have visited multiple times.

All this means (I think) I’m well connected to which timezone the USA and Canada are in on any particular day. So when they move to DST (which happens before the UK and Europe’s change), we get a brief couple of weeks where:

We get more overlap with family and friends in Canada because their summer time is closer by 1 hour to GMT.
As an NHL ice hockey follower, most games start at midnight UK time, but for this period, games start an hour earlier.

Let’s call this period the “1 hour closer overlap”.

🧠 Mental model

My mental model is based off this “1 hour closer overlap”. In my head this overlap lasts two weeks - I don’t know why - I’ve just programmed that as a “fact”.

Therefore, my mental model for:

When does the UK apply DST and switch from GMT?

Is…

Two weeks after North America changes to DST.

This can be WRONG (sometimes).

This year, 2024, is one of those wrong years.

⚠️ The problem

I didn’t realise how ingrained this two week duration of overlap was in me, so let’s check facts and find out if I need an upgrade.

First fact

The UK’s annual switch to DST happens on the last Sunday in March:

[In the UK] BST begins at 01:00 GMT every year on the last Sunday of March

Second fact

North America’s annual switch to DST happens on the second Sunday in March:

In the U.S., daylight saving time starts on the second Sunday in March

🚨 Calendar Siren! 🚨

For any March that contains five Sundays, the UK’s last Sunday is the fifth one rather than the fourth, so the “1 hour closer overlap” will be three weeks long, not two.
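As a quick sanity check of that claim, here is a minimal sketch using only the standard library. The helper names march_sundays and overlap_days are made up for illustration, and it assumes the two rules quoted above (second Sunday for the USA, last Sunday for the UK).

```python
import calendar
import datetime


def march_sundays(year):
    """Return every Sunday in March of the given year."""
    # monthcalendar() returns weeks as lists of day numbers (Monday first),
    # with 0 standing in for days that belong to another month.
    weeks = calendar.monthcalendar(year, 3)
    return [
        datetime.date(year, 3, week[calendar.SUNDAY])
        for week in weeks
        if week[calendar.SUNDAY] != 0
    ]


def overlap_days(year):
    """Days between the USA's second Sunday and the UK's last Sunday."""
    sundays = march_sundays(year)
    usa_start = sundays[1]   # second Sunday of March
    uk_start = sundays[-1]   # last Sunday of March
    return (uk_start - usa_start).days


for year in (2023, 2024):
    print(year, len(march_sundays(year)), "Sundays,", overlap_days(year), "days")
```

A March with four Sundays gives a 14 day gap, and a March with five gives 21.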

📅 Checking future overlaps

In order to have a look at what the ‘normal’ overlap looks like, I’m going to brute force the calculation of how many days the “1 hour closer overlap” lasts for each year from 2014 to 2033 inclusive.

I’m going to use Python with the rrule() from dateutil and stuff the data inside a Pandas DataFrame.

In [1]: import datetime

In [2]: import pandas as pd

In [3]: from dateutil.rrule import rrule, YEARLY, SU

In [4]: # Build UK and USA start dates using rrule
   ...: dst_df = pd.DataFrame(data={
   ...:     'UK DST Start': rrule(
   ...:         YEARLY,
   ...:         dtstart=datetime.date(2014, 1, 1),
   ...:         count=20,
   ...:         bymonth=3,
   ...:         byweekday=SU(-1)),  # Last Sunday of the month
   ...:     'USA DST Start': rrule(
   ...:         YEARLY,
   ...:         dtstart=datetime.date(2014, 1, 1),
   ...:         count=20,
   ...:         bymonth=3,
   ...:         byweekday=SU(2)),  # Second Sunday of the month
   ...:     }
   ...: )

In [5]: dst_df['Weeks overlap'] = dst_df['UK DST Start'] - dst_df['USA DST Start']

In [6]: dst_df[' '] = dst_df.apply(lambda r: "👈 You are here" if r['UK DST Start'].year == 2024 else '',
   ...:  axis='columns')

In [7]: dst_df
Out[7]:
   UK DST Start USA DST Start Weeks overlap
0    2014-03-30    2014-03-09       21 days
1    2015-03-29    2015-03-08       21 days
2    2016-03-27    2016-03-13       14 days
3    2017-03-26    2017-03-12       14 days
4    2018-03-25    2018-03-11       14 days
5    2019-03-31    2019-03-10       21 days
6    2020-03-29    2020-03-08       21 days
7    2021-03-28    2021-03-14       14 days
8    2022-03-27    2022-03-13       14 days
9    2023-03-26    2023-03-12       14 days
10   2024-03-31    2024-03-10       21 days  👈 You are here
11   2025-03-30    2025-03-09       21 days
12   2026-03-29    2026-03-08       21 days
13   2027-03-28    2027-03-14       14 days
14   2028-03-26    2028-03-12       14 days
15   2029-03-25    2029-03-11       14 days
16   2030-03-31    2030-03-10       21 days
17   2031-03-30    2031-03-09       21 days
18   2032-03-28    2032-03-14       14 days
19   2033-03-27    2033-03-13       14 days

Wow - so there are many more 3 week overlaps than I thought!

👀 Review

Looking back, only two of the last eight years have had a three week “1 hour closer overlap”.

Let’s be honest, I’m not sure how I missed the three week overlap in 2019.

But for the 2020 one, although I was working for a US company, we were in COVID lock-down and child[0] was only a few weeks old so I was on parental leave.

However, looking forward, 2024 marks the start of a 3 year run of three week overlaps!

🧠 A new mental model

So my conclusion is that my previous mental model was pretty poor. It’s time for a new one. Here we go:

When March starts on a Friday, Saturday or Sunday then there will be 3 weeks of “1 hour closer overlap” with North America, otherwise it’s 2.

Let’s confirm that by averaging the twenty years in the DataFrame above:

# Add a 'first' column which contains the first day of March for each year
In [23]: dst_df['first'] = dst_df.apply(
    ...:     lambda r: datetime.date(r['UK DST Start'].year, 3, 1).strftime('%a'),
    ...:     axis='columns',
    ...: )

In [24]: dst_df.groupby('first')['Weeks overlap'].mean()
Out[24]:
first
Fri   21 days
Mon   14 days
Sat   21 days
Sun   21 days
Thu   14 days
Tue   14 days
Wed   14 days
Name: Weeks overlap, dtype: timedelta64[ns]

That looks right - so that’s my new mental model sorted.
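For a wider check than twenty years, the rule can be tested without pandas. This is only a sketch: the function names are hypothetical, and it assumes that today's rules (second Sunday in North America, last Sunday in the UK) stay unchanged through 2100, which nobody can promise.

```python
import datetime


def nth_sunday_of_march(year, n):
    """Return the n-th Sunday of March (n=1 is the first Sunday)."""
    first = datetime.date(year, 3, 1)
    # weekday(): Monday is 0 and Sunday is 6
    days_to_sunday = (6 - first.weekday()) % 7
    return first + datetime.timedelta(days=days_to_sunday + 7 * (n - 1))


def last_sunday_of_march(year):
    """Walk back from 31st March to the nearest Sunday."""
    last = datetime.date(year, 3, 31)
    return last - datetime.timedelta(days=(last.weekday() + 1) % 7)


def actual_weeks(year):
    gap = last_sunday_of_march(year) - nth_sunday_of_march(year, 2)
    return gap.days // 7


def predicted_weeks(year):
    """The new mental model: 3 weeks if March starts on Fri, Sat or Sun."""
    return 3 if datetime.date(year, 3, 1).weekday() in (4, 5, 6) else 2


mismatches = [
    year for year in range(2007, 2101)
    if actual_weeks(year) != predicted_weeks(year)
]
print("Years where the rule fails:", mismatches)
```

An empty list means the "Friday, Saturday or Sunday" rule holds for every year checked.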

Now I just need to remember to check on what NHL ice hockey games will be happening between 9th and 30th March 2025 when the schedule is published - that’s going to be three weeks of “easier to watch” matches! 😊

Pipx’s upgrade is shallow, let’s go deeper

2024-03-07T00:00:00+00:00

pipx has been managing my Python tools for almost a year.

But those tools are getting stale - new versions are out - I need to upgrade.

💪 Let’s upgrade this

One of my favourite and most used Python tools installed in pipx is Frogmouth. While working on some documentation, I think I’ve spotted a bug in some Markdown rendering. So before I report the bug, let’s ensure I’ve got the latest version.

Upgrading “Is Easy ™️”. Just use pipx upgrade:

pipx upgrade frogmouth

We get a spinner, and then:

frogmouth is already at latest version 0.9.2 (location: /home/james/.local/pipx/venvs/frogmouth)

Success! Nothing to do, end of blog post.

…

🔎 Let’s check

Frogmouth is using Textual and rich under the hood - so if I want to make sure I’ve got the latest Markdown code, I need to ensure they’ve been upgraded too.

Let’s ask pip to tell us all versions of packages in the frogmouth virtual environment:

pipx runpip frogmouth list

Package            Version
------------------ ---------
anyio              3.7.1
certifi            2023.7.22
frogmouth          0.9.2        👈 Here's Frogmouth at the latest version
h11                0.14.0
httpcore           0.17.3
httpx              0.24.1
idna               3.4
importlib-metadata 6.8.0
linkify-it-py      2.0.2
markdown-it-py     3.0.0
mdit-py-plugins    0.4.0
mdurl              0.1.2
pip                24.0
pkg_resources      0.0.0
Pygments           2.16.1
rich               13.5.2       👈 rich is at 13.7.1 on PyPI
setuptools         69.1.1
sniffio            1.3.0
textual            0.43.2       👈 Textual is at 0.52.1 on PyPI
typing_extensions  4.7.1
uc-micro-py        1.0.2
wheel              0.42.0
xdg                6.0.0
zipp               3.16.2

Uh-oh - rich and Textual didn’t get updated by doing pipx upgrade.

🤔 This kinda makes sense

When we have a virtual environment for a project and we run pip install --upgrade, it just upgrades the package we request. It only upgrades dependencies if they no longer satisfy the requirements of the newly upgraded package. This is called the “only-if-needed” strategy and is documented in the pip User Guide.

But, given I’m a pip-tools addict, I rarely call pip directly. Usually I blow away all of a project’s requirements, rebuild them with pip-compile and then install all the new freshness with pip-sync.

How can I get this “everything new” behaviour with pipx? I think there are two options…

Option 1: Tell pip to be eager

Also listed in the pip User Guide is the “eager” option which:

upgrades all dependencies regardless of whether they still satisfy the new parent requirements.

This sounds like what I’m looking for.

And, luckily, pipx upgrade --help shows us just what we need:

--pip-args PIP_ARGS   Arbitrary pip arguments to pass directly to pip install/upgrade commands

Let’s try it by passing --upgrade-strategy=eager:

pipx upgrade --pip-args=--upgrade-strategy=eager frogmouth

This, unfortunately, gives very little output regarding the packages being updated. So let’s check them again with pip list (this time just grepping for ‘rich’ and ‘textual’):

pipx runpip frogmouth list | grep -E '^rich|^textual'

rich               13.7.1   🎉 Yay - upgraded to latest.
textual            0.43.2   😞 boo - not upgraded to latest.
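Rather than grepping for individual packages, the before and after listings can be compared in one go. The following is a minimal sketch, assuming the two-column table layout that pip list prints above; parse_pip_list and version_changes are hypothetical helper names, and the sample text is trimmed from the listings in this post.

```python
def parse_pip_list(text):
    """Turn the two-column `pip list` table into a {name: version} dict."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Package Version" header and the dashed rule
        if len(fields) < 2 or fields[0] == "Package" or set(fields[0]) == {"-"}:
            continue
        versions[fields[0]] = fields[1]
    return versions


def version_changes(before, after):
    """Return {name: (old, new)} for packages whose version changed."""
    return {
        name: (before[name], after[name])
        for name in sorted(before.keys() & after.keys())
        if before[name] != after[name]
    }


BEFORE = """\
Package            Version
------------------ ---------
rich               13.5.2
textual            0.43.2
"""

AFTER = """\
Package            Version
------------------ ---------
rich               13.7.1
textual            0.43.2
"""

print(version_changes(parse_pip_list(BEFORE), parse_pip_list(AFTER)))
```

With the sample data only rich is reported as changed, which matches what the grep showed.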

😬 Textual ain’t gunna upgrade

After “some” digging, it turns out that Textual isn’t going to upgrade when installing / upgrading Frogmouth. That’s because Frogmouth has a caret requirement in its pyproject.toml file which restricts Textual from being upgraded beyond 0.43.
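To see why a caret requirement bites so hard on zero versioned software, here is a rough sketch of caret semantics as described in Poetry's documentation. The exact constraint in Frogmouth's pyproject.toml is not quoted in this post, so "^0.43" below is an assumed example, and caret_bounds and allows are hypothetical helpers that only handle plain numeric versions with up to three parts.

```python
def parse(version):
    """Parse '0.43.2' into (0, 43, 2), padding to three components."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def caret_bounds(spec):
    """Return (lower, upper) for a caret spec such as '^0.43'."""
    parts = [int(p) for p in spec.lstrip("^").split(".")]
    lower = tuple(parts + [0] * (3 - len(parts)))
    # The upper bound bumps the left-most non-zero component. If every
    # given component is zero, the last given component is bumped instead.
    for index, value in enumerate(parts):
        if value != 0:
            break
    else:
        index = len(parts) - 1
    upper = tuple(parts[:index] + [parts[index] + 1] + [0] * (2 - index))
    return lower, upper


def allows(spec, version):
    """True if `version` satisfies lower <= version < upper."""
    lower, upper = caret_bounds(spec)
    return lower <= parse(version) < upper


print(caret_bounds("^0.43"))        # assumed constraint, for illustration
print(allows("^0.43", "0.43.2"))    # the version installed above
print(allows("^0.43", "0.52.1"))    # the latest Textual on PyPI at the time
print(caret_bounds("^1.2.3"))       # for comparison with a 1.x package
```

Under these assumptions a 0.x caret pin only admits patch releases, whereas the same pin on a 1.x package would admit every minor release up to 2.0.0.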

I only discovered this after pulling out pip-tools and running a clean compile of the current Frogmouth requirements and diffing them to the output of pipx runpip frogmouth list.

Personally, I think this kind of pinning is frustrating, especially in zero versioned software. If something breaks I can apply any pins required to get them to work - I don’t need the upstream maintainer to do it for me. That just creates slowness and unnecessary confusion.

Anyway - back to the upgrades…

Option 2: Hit it with a reinstall

There is another way. That’s to ask pipx to do a reinstallation of the software. As per pipx reinstall --help:

Package is uninstalled, then installed with pipx install PACKAGE with the same options used in the original install of PACKAGE.

Warning: this is a bit of a lie. The --python option is not kept when doing reinstall. But, this does allow for new versions of Python to be used after reinstalling.

Given that I’m not using the default Python version for pipx installs, I always have to pass in my preferred Python:

pipx reinstall frogmouth --python=python3.12

uninstalled frogmouth! ✨ 🌟 ✨
  installed package frogmouth 0.9.2, installed using Python 3.12.2
  These apps are now globally available
    - frogmouth
done! ✨ 🌟 ✨

And rich and Textual got to the same versions as before with “eager”:

pipx runpip frogmouth list | grep -E '^rich|^textual'

rich               13.7.1
textual            0.43.2

Which is best?

My guess is you should use what you think is best for your workflow.

I’m aggressive with my upgrading, so I’m happy with the pipx reinstall route. This also may give cleaner virtual environments since we shouldn’t get any hanging dependencies in the scenario that a package stops using a particular dependency.

Also, during my experimentation, I accidentally installed a package off PyPI called “eager” 🤦. Luckily it didn’t run and the source doesn’t look malicious to my trusting eye. But it’s this kind of mistake that’s nicely cleaned up every time the virtual environment is recreated with reinstall. 😅

Missing tiny data breaks pipeline

2024-02-18T00:00:00+00:00

This week, during our monthly reporting run, two major label licensing reports failed validation. This is unexpected because usually all reports are generated and validate just fine.

It turned out a row of advertising revenue was missed for the United States Minor Outlying Islands (UMI).

That missed row was worth just £ 0.0003. 🙀

👌 This is tiny tiny data

At work (Mixcloud) we generate usage reports for major labels on a monthly basis. The pipeline:

identifies, reports and pays royalties out on tens of millions of tracks, played by millions of Mixcloud creators, and owned by hundreds of thousands of different artists and songwriters. Via Mixcloud blog

This missing row was “tiny” by many definitions:

It was a tiny territory that I had to look up on Wikipedia. Turns out the population is about 300 people.
It was a tiny amount of revenue that would get rounded out of existence at payout time. It would literally make zero change to the total payout for the month to any label.

We often use a 0.1 % sense check definition of edge cases when working out what bugs and issues to put effort against, and by every definition, this missing row was less than 0.1 % of all sorts of monthly factors.

🔥 But the pipeline failed

A long time ago, I realised that we needed to validate the reports generated before they were sent to partners. So we built a post-process validation system. This checks the generated reports from the client perspective, providing row-wise, file-wise and batch-wise validation.

One of these checks ensures that advertising revenue is reported in GBP £. However, because we had a missing row for the United States Minor Outlying Islands (UMI), the reported advertising-based usage row became USD $ and failed validation.

Under the hood, this happened because we have a LEFT JOIN between revenue and usage which wasn’t populated on the revenue side because the UMI row was missing.
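The shape of the failure can be reproduced in miniature. This sketch uses Python's built-in sqlite3 with made-up table names, columns and figures; it is not our production schema, and it only shows the general mechanism of a LEFT JOIN leaving the revenue side empty and a row-wise currency check catching it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical tables and values, purely for illustration
    CREATE TABLE usage (territory TEXT, plays INTEGER);
    CREATE TABLE revenue (territory TEXT, currency TEXT, amount REAL);
    INSERT INTO usage VALUES ('GBR', 1000), ('UMI', 1);
    -- Note: no revenue row for UMI
    INSERT INTO revenue VALUES ('GBR', 'GBP', 12.5);
    """
)

rows = conn.execute(
    """
    SELECT u.territory, u.plays, r.currency, r.amount
    FROM usage AS u
    LEFT JOIN revenue AS r ON r.territory = u.territory
    ORDER BY u.territory
    """
).fetchall()


def invalid_rows(report_rows):
    """Row-wise check: every row must report its revenue in GBP."""
    return [row for row in report_rows if row[2] != "GBP"]


for row in rows:
    print(row)
print("Failed validation:", invalid_rows(rows))
```

The UMI usage row survives the join, but its currency and amount come back as NULL, so whatever fills in those blanks downstream is no longer guaranteed to be GBP.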

🛑 When there’s a validation failure, everything stops

When the generated reports with $ 0 amounts of advertising revenue hit our validators they fail for the partners whose reports contain enough detail to see that revenue and currency information. Even though this was just two partners, when we receive those validation errors in the pipeline, the monthly production stops.

We keep the generated reports, but work to find out the cause of the error and assess how many generated reports are tainted.

🔧 Fix and regenerate

This time the error was, as discussed, tiny. And the fix was pretty tiny too. We generated an extra row of revenue for UMI worth £ 0.0001 and spliced it back into our monthly source data snapshots.

Then we reran all partners that receive reports on Mixcloud’s ad-funded usage and our ops colleagues got our monthly production process back up to speed.

🤔 Is this kind of behaviour a “good” thing?

After this incident, I’m left wondering if it’s OK that our pipeline is halted by a missing row worth less than a penny that wouldn’t affect monthly payouts.

This is good

On the “good” side, we could say:

All the main sources of error are stable, it’s just the tiny edge cases that are failing.

In addition, these failures are so rare that we are often surprised when things fail. Plus, it’s good that we have the validation in place that finds these kinds of errors and reports them.

This is bad

On the other hand, we could say:

The pipelines are so fragile that a tiny missing piece of revenue allocated to a user in a territory can bring down a monthly reporting run.

There also seems to be some truth in this.

Probably the LEFT JOIN in our revenue pipeline that caused the USD row to appear is not robust enough. And as we’ve dug more into the error later in the week, my colleague Tim might have found a scenario that we would never be able to prevent without strengthening this revenue query’s SQL.

⭐ Turn the bad into good

What I realised is that the failure is a gift in disguise - it’s helped us to see a flaw in the pipeline that’s so often hidden by aggregation. Instead of resting on our laurels, we have an opportunity to improve the robustness and accuracy of our revenue pipeline, plus a new test case to add to our test suite.

As a result of this error, we’re also planning to adjust the source of the missing row. This is currently a manual monthly process, but we’ve seen that it might be better incorporated into our pipeline directly, which we think will give more stability.

So, if you happen to be that Mixcloud user in the United States Minor Outlying Islands who listened in January - thanks so much. Your unusual pattern of listening really helped us out. 😊

🙏 Thanks to Duncan and Dan for proof reading and suggestions.

hledger failure messages are better than Ledger’s

2023-08-29T00:00:00+01:00

For any new plain text accounting project I always recommend using hledger over Ledger.

The main reason is errors and failures are better reported and rendered with hledger, so let’s look at an example - failed balance assertions.

An erroneous assertion

Given a journal file with a single transaction, which contains an error:

2023/08/29 Some person
    Assets:Current         $ 100 = $ 75.73
    Income

The error is that the balance of the Current Account is asserted as $ 75.73 after the transaction, when it’s really $ 100.

Ledger output

Running Ledger, here’s the version:

ledger --version

Ledger 3.1.3-20190331, the command-line accounting tool

Copyright (c) 2003-2019, John Wiegley.  All rights reserved.

This program is made available under the terms of the BSD Public License.
See LICENSE file included with the distribution for details and disclaimer.

Now, let’s ask for a balance - this will check the transaction and complain about the incorrect balance assertion:

ledger -f ledger.dat bal

While parsing file "/tmp/ledger.dat", line 2:
While parsing posting:
  Assets:Current         $ 100 = $ 75.73
                                 ^^^^^^^
Error: Balance assertion off by $ -24.27 (expected to see $ 100)

I’ve always found the “off by” amount confusing and find I don’t know if the asserted balance is too low or high.

hledger’s failed assertion

Just confirming my hledger version:

hledger --version

hledger 1.28, linux-x86_64

Now let’s run the same balance report with hledger:

hledger -f ledger.dat bal

hledger: Error: /tmp/ledger.dat:2:34:
  | 2023-08-29 Some person
2 |     Assets:Current           $ 100 = $ 75.73
  |                                    ^^^^^^^^^
  |     Income                  $ -100

This balance assertion failed.
In account:    Assets:Current
and commodity: $
this balance was asserted:     75.73
but the calculated balance is: 100
a difference of:               -24.27

Consider viewing this account's calculated balances to troubleshoot. Eg:

hledger reg 'Assets:Current$' cur:'\$' -I  # -f FILE

For me, this output is:

Much clearer. It helpfully shows the failing transaction in the error message.
Easier to understand: The asserted balance is compared to the computed balance.
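Going by the numbers in the hledger output above, the "difference" appears to be the asserted balance minus the calculated one. Here is a small sketch of that arithmetic using Decimal; explain_assertion is a hypothetical helper, not part of either tool.

```python
from decimal import Decimal


def explain_assertion(asserted, calculated):
    """Describe a failed balance assertion in plain words."""
    difference = asserted - calculated
    if difference < 0:
        direction = "too low"
    elif difference > 0:
        direction = "too high"
    else:
        direction = "correct"
    return difference, direction


difference, direction = explain_assertion(Decimal("75.73"), Decimal("100"))
print(f"difference: {difference}, asserted balance is {direction}")
```

Seen this way, a negative difference means the asserted balance is lower than the calculated one, which is the detail I find hard to read from Ledger's "off by" message.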

Conclusion

Given that plain text accounting is hard enough to work with at the best of times, I would always go for a tool that helps me out the most with the complexity. Right now, that means I’d take hledger over Ledger.

An Ode to pipx

2023-07-26T21:00:00+01:00

Oh pipx, how I love thee… 🎵

Using pipx means I can have Python packages installed and executable on my path much more easily than in the past. That’s changed my personal and work development experience for the better. Here’s how…

Before pipx

When I wanted to make a Python package (like IPython) available on the command line in my Linux environment, I would get hacky… Using virtualenv and boilerplate bash scripts I would manage package installs, and then wrap them in a script to make them available on my PATH.

As an example, to make IPython runnable on the command line I would:

Create an IPython directory in my user’s opt dir: ~/opt/ipython.
Build a virtual environment inside it.
Activate the virtual environment and install IPython there with pip.
Add a wrapper executable script called ipython which was then callable on my shell’s PATH.

That script looked like:

#!/bin/bash

set -eo pipefail

# Forward any command-line arguments on to the real IPython executable
~/opt/ipython/venv/bin/ipython "$@"

A side note about Python environments: The main reason for using virtual environments for these projects and tools is to keep my Ubuntu global Python environment clean: Not all Python installed packages can or should just be thrown in there. Separation is important, and sometimes required, not least because each package may have conflicting package requirements and may not be able to be installed together.

Disadvantages of these hacks

There were a growing number of issues with the hacky approach above - not least the problems with managing the resulting stack of venv and wrappers as the number of Python tools I wanted on my path grew.

Yes - these could be handled with Ansible (I like to build and manage my machines with Ansible), but there always seems to be a lag between the time I “need” a new thing on my command line, and when I manage to get it wired into Ansible correctly.

Upgrades also became hard - where were all those manually managed tools? Which ones should I update?

A small, but niggling, disadvantage of using the wrapper script to run local private tools: I found that it was hard to keep “development” and “production” separate. I’d rarely re-create the private code repository so I could run a version on shell PATH separate from the development directory. No, instead, the wrapper script would call the development directory directly. Often when I was trying to do small fixes or improvements, I would accidentally break my tool, or make it unusable in some way. Annoying when you’re trying to update some accounts and the bank account parsing tool is crashing because you’re half way through updating it.

Switching to pipx

I installed pipx into my user site-packages on my Ubuntu machine as per the instructions.

python3 -m pip install --user pipx

Then, installing IPython was as simple as:

pipx install ipython

Everything just worked and IPython was installed successfully. pipx even warned me that there was a previous executable on my path (my previous crappy wrapper script).

A better dev life

Now I use pipx to install packages, manage their virtual environments and expose their entry points on my shell’s PATH.

🙅 Gone are the wrapper scripts and manually built virtual environments.
🙅 Gone are the multiple directories of Python apps, some in ~/opt some in ~/active (my usual working path). Along with their Make recipes for managing virtual environments and upgrades.
🙅 Gone is the need for orchestration scripts and Make recipes to “know” the particular directory and virtual environment a package is installed in. pipx can upgrade everything with pipx upgrade-all.

✅ Public packages

I now install all my favourite, regularly used, public packages with pipx so they’re available all the time on the command line.

My favourite public packages currently installed are:

devpi-server to allow Tox to install packages without having Pip call PyPI.
flit for packaging.
frogmouth - my new favourite Markdown tool.
hledger-utils for helping with our family accounts.

✅ Personal private packages

I’ve got baggage - and it lives in private repositories: A suite of personal tools I’ve built up over the years used for all sorts of tasks, from filing downloads into correct directories, to managing my work time, to bookkeeping our family accounts.

With pipx these are now executable from anywhere in my shell, with none of the previous overhead and boilerplate mentioned above.

These personal private packages are a little harder for me to get into pipx, but only because I’m lazy - if you’ve done your proper packaging, then you’re probably already set.

I’ve got a follow-up post about making your private packages installable with pipx which I’ll publish soon.

Next steps

Some things I’m not sure about yet.

Private packages from private repositories

My current pipx install workflow for private packages depends on having them cloned to a local directory, and then calling pipx install [path] to install from there.

I would like it if I could install my private packages directly from their private GitLab repository without manually cloning first - I’m pretty sure pipx can do this, I’ve just not hacked around enough with the invocation.

This improvement would mean that I would just use a pipx install of my private packages, and that means more cleanliness in my development environment - no need to keep directories around in order to provide runnable Python code any more.

Managing all this with Ansible

As I mentioned, I usually build and manage my machines with Ansible. I need to invest some time in catching my Ansible playbooks up with my current machine states, and the Ansible pipx module in Ansible Galaxy looks particularly helpful.

🙏 Thanks

Thanks for reading.

Thanks to Brian and Michael’s coverage of Julia Evans’s “Some blogging myths” post… For “nagging” bloggers that it doesn’t have to be perfect - just write the thing and put it out there.

Thanks to Fosstodon folks for tooting the new and interesting things, that, in turn, inspire me to try out these things and get them working for myself.

Pytest’s cache and gitignore

2022-12-19T15:00:00+00:00

This post is about sanity checking. It was written at the end of 2019, but not published until the end of 2022. The underlying change to Pytest’s cache directories was made in 3.8.1, released at the end of 2018.

TL;DR 🥱

You can check any path, real or imaginary, with git check-ignore to see if Git will ignore it or not.
Pytest prevents its cache directory .pytest_cache from getting into Git repositories by adding a .gitignore file inside them.

The (long) story 📜

While working on a project using Pytest, pytest --lf was not selecting all possible tests.

The --lf flag tells Pytest to run the tests that failed in the last run and those test IDs are stored in Pytest’s cache.
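To peek at what --lf will select, the cache can be read directly. This is a minimal sketch, assuming the failed test IDs live in a JSON file at .pytest_cache/v/cache/lastfailed, which is where current Pytest versions keep them; last_failed_ids is a hypothetical helper name.

```python
import json
from pathlib import Path


def last_failed_ids(project_root):
    """Return the test IDs that Pytest recorded as failing last run."""
    cache_file = Path(project_root) / ".pytest_cache" / "v" / "cache" / "lastfailed"
    if not cache_file.exists():
        # No cache yet, or nothing failed in the last run
        return []
    # The file is a JSON object mapping each failed test ID to true
    return sorted(json.loads(cache_file.read_text()))


if __name__ == "__main__":
    for test_id in last_failed_ids("."):
        print(test_id)
```

Running it from the project root lists the node IDs, which makes it easier to spot a test that is missing from the selection.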

To ensure that I started from a clean place, I went to clean out the .pytest_cache directory. But while I was looking at that directory, I had a mild panic - I had completely forgotten to add it to the project’s .gitignore file!

Had I accidentally committed the .pytest_cache dir?!

Was this why pytest --lf was being strange?!

Not in Git

Firstly, I was able to reassure myself that I’d not accidentally committed the cache directory: git log can accept a path, so when git log -- .pytest_cache came back empty, this was reassuring. It was not committed to the repo!

However, there was no entry for .pytest_cache in .gitignore.

I usually populate the .gitignore for Python projects by lifting the lines that I want from the GitHub gitignore repo, but I’d forgotten to copy over the line for .pytest_cache.

Why is the .pytest_cache directory being ignored by Git if I’ve not written a pattern for it into .gitignore?

Checking ignored files

My guess was one of the existing patterns in .gitignore might be matching the .pytest_cache path. To check this I went through deleting lines from the file until it was empty. But even with an empty ignore file, .pytest_cache still did not get picked up by Git!

Then I went and found that there is a super-helpful git check-ignore command. You can read some of the background of this command on Stack Overflow. This can be used to check what Git ignore thinks of a path.

So now I can call:

git check-ignore -v .pytest_cache/

And get back:

.pytest_cache/.gitignore:2:*    .pytest_cache/

This means:

There is a file .pytest_cache/.gitignore.
Line 2 of that file is *.
This rule is being applied to .pytest_cache/.

So - Pytest creates its own .gitignore file in the cache to prevent it being included! Phew, what a journey! 😪
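The same trick works for any tool that writes a cache directory. Here is a sketch of the idea, not Pytest's actual implementation: protect_from_git is a hypothetical function and the wording of the comment line is my own. The only part that matters is the * pattern on line 2, matching the check-ignore output above.

```python
from pathlib import Path


def protect_from_git(directory):
    """Create `directory` with a .gitignore that ignores everything in it."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    ignore_file = directory / ".gitignore"
    if not ignore_file.exists():
        # Line 1 is a comment, line 2 is the catch-all pattern.
        # The * also matches the .gitignore file itself.
        ignore_file.write_text("# Created automatically.\n*\n")
    return ignore_file


if __name__ == "__main__":
    created = protect_from_git(".my_tool_cache")  # hypothetical cache dir
    print(created.read_text())
```

Because the pattern lives inside the directory, the project's own .gitignore never needs to mention it.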

A bit more investigation

So now we have an opportunity to learn a little bit about Pytest…

From some searching, I found that the inclusion of a .gitignore file in Pytest’s cache directories was a feature:

Introduced in Pull #3982: Ignore pytest cache.
To solve Issue #3286: .pytest_cache is showing up in projects git repos.

Previously, Pytest had renamed its cache directory from .cache to .pytest_cache. As a result, on projects where maintainers hadn’t updated their ignore files, the new cache directories had been committed by accident.

In looking at the Pytest team’s response, what’s interesting to me is the trade-off between:

Pytest developers do nothing. Let Pytest users update their .gitignore files or other SCM ignore methods, or…
Pytest developers take some action. Prevent the folder being added to SCM systems or some other fix.

In the discussion on the Issue, this comment shows the idea of a .pytest_cache/.gitignore file coming into being:

another devious idea - if we add a .gitignore with the content * then the folder is protected as well and people dont need to track manually

But all decisions have consequences.

Less might be more

For me I would prefer to follow the Zen of Python:

Explicit is better than implicit.

I would vote for: Let Pytest users update their ignore mechanisms.

This would mean:

Pytest SCM users learn that .pytest_cache exists and add it to their .gitignore or similar.
Confusion is avoided because no directories are unexpectedly ignored by Git. (Confusion as you can see in my case above and also in this issue.)
Other side effects do not occur, like the ones mentioned in the issue above regarding Debian packaging or search.

To the wider open source issue, I think that projects that do less will last better than projects that do too much. I would generally take trade-offs where less is done rather than more.

Reflection 2022

Much of this post was written in 2019. Since then much has happened and my confusion has lessened.

If you ask “did the Pytest team do the right thing by adding .gitignore to the newly named .pytest_cache directories?”, then my answer is yes.

It seems to have been a successful strategy and is even used by mypy with a hat-tip to Ronny Pfannschmidt’s original comment suggesting the idea.

While editing this post, I found two quotes from Ronny that I’ll end with:

we would be more than happy to have a better way (like xdg)

but lets be realistic here - the added .gitignore protects beginner uses from a very common mistake, that’s why its there

its a practical solution to a practical problem and has a interference component

…

from my pov its an absolutely acceptable tradeoff to prevent a lot of developer pain by inflicting a extra step on package maintainers

Nice one Pytest team for looking after new developers! 🙌

Migrating Open Source projects on Travis CI to fix GitHub API limit problems

2020-04-23T23:00:00+01:00

Previously I wrote that Travis dot org has been exhausting its GitHub API rate limit. Test results for projects built on Travis dot org (travis-ci.org) have not been reliably reported back to GitHub. This leaves commits on GitHub in a pending yellow status and pull requests blocked.

The solution is for open source maintainers to migrate their projects from Travis dot org to Travis dot com (travis-ci.com). This solves the API rate limit problem because Travis dot com uses GitHub Apps, whereas Travis dot org uses a GitHub integration.

With GitHub Apps each install of the app gets its own API quota. So with the Travis dot com GitHub app installed in your GitHub user or organisation, the 5,000 requests per hour API limit applies to just your install of the app, not globally for all Travis dot com calls to GitHub. As a small-time open source developer, there are no realistic future scenarios where my install of the app will reach 5k requests per hour.
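To get a feel for how much of a quota is in use, GitHub reports it in response headers. This is a rough sketch that only interprets a headers mapping, assuming the x-ratelimit-limit, x-ratelimit-remaining and x-ratelimit-reset headers that GitHub's REST API returns; the sample values are invented and summarise_rate_limit is a hypothetical helper.

```python
import datetime


def summarise_rate_limit(headers):
    """Summarise GitHub rate limit headers from any HTTP response."""
    # Header names are case-insensitive, so normalise them first
    lowered = {key.lower(): value for key, value in headers.items()}
    limit = int(lowered["x-ratelimit-limit"])
    remaining = int(lowered["x-ratelimit-remaining"])
    # The reset value is a UTC timestamp in seconds since the epoch
    resets_at = datetime.datetime.fromtimestamp(
        int(lowered["x-ratelimit-reset"]), tz=datetime.timezone.utc
    )
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,
        "resets_at": resets_at,
    }


# Invented sample values, for illustration only
sample = {
    "X-RateLimit-Limit": "5000",
    "X-RateLimit-Remaining": "4987",
    "X-RateLimit-Reset": "1587682800",
}
print(summarise_rate_limit(sample))
```

For a small project the "used" figure should stay tiny compared with the 5,000 per hour available to a single install.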

Key migration points

The migration documentation on Travis is pretty comprehensive, but watch out for these gotchas:

Make sure you “Sign up for the beta” of migration in your Travis dot org account.

Without this your existing repositories will not appear in your new Travis dot com account.
If you have required checks in the branch protection rules of your GitHub project repository, these need to be switched over.

You will need to trigger a build on Travis dot com for these new checks to appear as options.
Remember to change any build badges on your README from dot org to dot com.
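The badge change can be scripted if you have several READMEs to update. This is a minimal sketch, assuming your badges use the common https://travis-ci.org/owner/repo form; the function name migrate_badges is my own and not part of any Travis tooling.

```python
import re

# Assumption: badge and build links look like
# https://travis-ci.org/<owner>/<repo>... (optionally with an "api." prefix).
TRAVIS_ORG_URL = re.compile(r"(https://(?:api\.)?travis-ci)\.org/")


def migrate_badges(readme_text):
    """Return README text with Travis dot org URLs pointed at dot com."""
    return TRAVIS_ORG_URL.sub(r"\1.com/", readme_text)
```

Read the README, pass its text through migrate_badges and write the result back. URLs on other domains are left untouched.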

A trade off

With GitHub Apps, results of checks are kept in the Checks Framework. This means that when you click “details” of a Travis dot com check, you will be shown GitHub’s page for this check (here’s an example). With Travis dot org, by contrast, clicking on the “details” link for a check took you straight to Travis dot org.

Here’s how GitHub advertises this benefit:

Once you migrate your project, Travis will be one click further away. Therefore you are more likely to stay on GitHub while nursing a pull request or checking on a build.

While I’m sure many people consider this an improvement, I’m not a fan of the GitHub checks system. I prefer the old system because:

It was easier and more reliable to visit the external build system’s site. As we’ve seen with this whole issue, communication across GitHub’s boundary can be unreliable.
I prefer Travis’s interface for showing build information, not GitHub’s static checks page.

Finally

Thanks to MK at Travis for the help with migration.

I’m glad that it was possible to find a way to continue to use Travis on my open source projects.

Happy building!

Travis hitting GitHub’s API limits for Open Source projects

2020-04-02T23:00:00+01:00

Note: A newer post Migrating Open Source projects on Travis CI to fix GitHub API limit problems has information on how to fix the problems described below.

Last week, GitHub’s Dependabot created a pull request with a fix to a vulnerability found in the development dependencies of one of my FOSS projects. This was a bump to Mozilla’s bleach, a project that GitHub states is used by more than 61,000 other projects.

Flake8-AAA’s repository is wired into Travis CI to provide automated execution of its test suites across all supported versions of Python. Better still, because Flake8-AAA is an open source public repository, Travis provides the computing power to run these tests for free. I’ve always found Travis reliable and stable, so it’s a requirement that pull requests have a “green” Travis build before merging into Flake8-AAA’s master branch.

Unreported build status

However, when I checked on the Dependabot Pull Request, GitHub was still waiting for the status of its Travis build to be reported.

You can see that the “Merge pull request” box is greyed out because the required Travis build has not completed yet according to GitHub.

But here’s the build at Travis - both green and done within 3 minutes of Dependabot opening the PR at GitHub, so the call from Travis to GitHub to report the build status on the commit failed for some reason.

Debugging

Sometimes webhook and API calls to GitHub fail - I’ve seen this with both personal and work projects. Often the simplest solution is to retrigger the build in some way. At first I tried to get a follow up build to work by:

Creating a new commit on the branch with updated requirements and pushing that to the branch.
Amending the existing commit and pushing with --force.
Creating and pushing a new branch with an update to all requirements.

All of these strategies had the same effect - a new build was triggered on Travis and that build was green, but it was not reported to GitHub. So it looked like all API calls were failing from Travis to GitHub.

Next, while checking the GitHub status page and Travis status page, I found this status update on the Travis site:

In light of that status message, I tried installing the Travis app integration, but had no success getting it to link to Flake8-AAA.

The message says:

Please write to support@travis-ci.com if you encounter any similar problems.

So I emailed.

Reply from Travis Support

Here’s the full text of the reply from Travis support:

MK (Travis CI)

Mar 31, 15:38 EDT

Hello ,

Thanks for your patience on this issue.

We want to provide some visibility into the issues we are facing, the effects on our infrastructure and efforts made so far to restore normalcy.

We recently started hitting API rate limits for Github calls and on March 25, 2020, we contacted Github to ask for increases and are awaiting their feedback in this regard.

On the Travis CI end, we have made improvements on how our code accesses the Github API, which has led to improvements, albeit minimal.

While we occasionally hit API limits, it’s important to note that we haven’t hit these kinds of limits before now. In the interim, the best course of action would be to retry the action you wanted to perform.

For next steps,

We are following up with Github via various channels to get the requested API rate limit increased.

In addition, we are looking for more avenues to remove invalid/unnecessary Github API calls in our codebase to ensure we stay under the limit and avoid disruptions like this.

We are coordinating internally to ensure customers are up-to-date on progress made so far.

We know how critical our platform is to your business and our goal is to provide the best experience for our customers. In line with this, we extend our sincere apologies for inconveniences this is causing.

Thank you and we will provide periodic updates as we have more.

Firstly, thanks to Travis support for this helpful message - it’s pretty unusual for a service that offers a free tier to be open and responsive to messages from freeloading users like myself.

Secondly, I assumed that Travis would not be opposed to publishing the text of the email since it should help other developers in my situation.

In response to the mail itself:

My understanding is that this issue mainly affects open source projects on Travis dot org.
This message makes no mention of migrating to the Travis dot com GitHub Apps integration, so I assume that it wouldn’t work for Flake8-AAA or other open source projects.
The mail states:

In the interim, the best course of action would be to retry the action you wanted to perform.

Unfortunately I’ve had no success with this yet, but will continue to try.

Update: Since writing this post I have successfully migrated projects to Travis dot com. My next post has a list of items to remember when migrating.

Although I’m happy with the Travis response so far, I’m worried about what this means about the future of GitHub.

Thoughts on the GitHub ecosystem

I was not part of the “mass exodus” from GitHub in 2018 after Microsoft completed its purchase of the platform. At the time I thought that this could only be good for the site. Now I’m reconsidering, especially in the light of the situation above. Let me explain why…

GitHub wants Actions to replace Travis

GitHub Actions is what GitHub calls its “world-class CI/CD” system. CI/CD has been supported by Actions since August 2019 and is free for open source projects - GitHub has “embraced” CI/CD.

Travis dot org is now a competitor to GitHub rather than the helpful addition to the ecosystem it was before.

Also the existence of CI/CD in Actions means that GitHub can allow the degradation of other CI/CD integrations because it’s able to offer a “better” replacement - use Actions instead. My guess would be that GitHub intends Actions to replace all CI/CD building on GitHub for open source projects.

GitHub wants developers to stay on GitHub

In the final paragraph of the GitHub blog post above, Nat Friedman states:

Our vision is to serve every developer on the planet, by being the best place to build software.

Building software includes CI/CD and GitHub’s vision means that every developer that needs a CI/CD function would stay on GitHub while “building software”, not traverse external systems like Travis, Circle CI or Codeship.

GitHub can make it harder for CI/CD integrations to keep up

Since GitHub (and therefore Microsoft) acquired Dependabot in 2019, GitHub now has a tool which it can use to generate a larger number of builds on CI/CD services integrated with its platform like Travis. This will have the knock-on effect of making it harder for those CI/CD services to keep within their API rate limits and more expensive to run because they will need to buy more computing power from AWS and/or Google to run builds.

Best of all for GitHub, they can put this pressure on others while maintaining the guise of making “dependency upgrades easy”. Now GitHub automatically creates a pull request for any project owned by an account with security alerts enabled when it finds a relevant security vulnerability alert.

In the case of the pull request above that started this post, that was a vulnerability in bleach. As I mentioned, this is a project used by over 60k projects on GitHub. So when a security advisory on bleach occurs, Dependabot creates a pull request on each affected repository with security alerts enabled, and each of those pull requests will then be built by a CI/CD system where the repository has one wired in. For an external CI/CD system like Travis, that flood of builds requires a large volume of computing resources and GitHub API calls.

The GitHub rate limit documentation currently states a quota of 5,000 requests per hour. If each CI/CD build requires 2 API calls (one to say “in progress” and one to post the result), then once 2,500 builds are completed in an hour the quota will be exhausted. If 4% of the 61,000 repositories that depend on bleach are using Travis for builds, then a single bump to the bleach release would trigger around 2,440 builds and use up almost all of a 5,000 request quota - and that’s before any “normal” human-driven regular build activity is taken into consideration.
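The arithmetic above can be checked with a quick sketch. The constants are the figures quoted in this post, the two calls per build is the same assumption as above, and the function names are mine and purely illustrative.

```python
HOURLY_QUOTA = 5_000          # requests per hour, from GitHub's rate limit docs
CALLS_PER_BUILD = 2           # assumed: one "in progress" call, one result call
DEPENDENT_PROJECTS = 61_000   # projects GitHub lists as depending on bleach


def builds_until_exhausted(quota=HOURLY_QUOTA, calls_per_build=CALLS_PER_BUILD):
    """Number of builds per hour that uses up the whole quota."""
    return quota // calls_per_build


def calls_for_percent(percent, projects=DEPENDENT_PROJECTS,
                      calls_per_build=CALLS_PER_BUILD):
    """API calls made if `percent` of dependent projects each build once."""
    builds = projects * percent // 100
    return builds * calls_per_build
```

With these numbers, 4% of dependents comes to 2,440 builds and 4,880 calls, which leaves only 120 requests for everything else in that hour. At 5% the quota is already overshot.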

Now I’m pretty sure that Travis has an hourly quota that’s greater than 5,000 requests per hour, probably granted to them when GitHub saw them as augmenting the GitHub ecosystem, but when the Travis email above stated:

We are following up with Github via various channels to get the requested API rate limit increased.

… why would GitHub bump this now?

Instead, GitHub can leave Travis in an awkward situation: choose to throttle builds and get reliable status calls back to the GitHub API, or make open source projects have a less reliable and smooth experience when status update API calls are dropped. Either option makes GitHub Actions look “better” as a CI/CD solution - a win for GitHub.

Finally, hope

I hope that my thoughts on the GitHub ecosystem above are overly negative and that these issues with Travis are not the start of an “extinguish” strategy by GitHub towards external CI/CD systems (see Embrace, extend, extinguish).

I hope I’m completely wrong and that GitHub open up their API limits to Travis so that open source projects like Flake8-AAA can still use it for reliable CI/CD. But if things don’t go well then I’m certainly more ready to join the GitHub exodus, just 18 months behind the curve.

Thanks Travis CI for all the builds, I hope we have many more to come!

It’s good to extract

2018-04-21T19:00:00+01:00

Last week we released version 1 of pysyncgateway - a Python package for communicating with Couchbase’s Sync Gateway via its REST API.

But this “new” library was not created from scratch. It consists mainly of code extracted from my employer’s Django based API server repository. That API server is now around 4,000 lines of code and tests smaller, and installs pysyncgateway as a package during deployment.

Both the process of extraction and the final result have been really helpful - this post covers some of the benefits that we have found so far.

Better separation of concerns

The boundary between the new library and the server code makes it much easier to reason about where responsibilities start and end.

Originally the Sync Gateway communication code was tightly knitted with our Django API server:

It used Django settings for establishing URLs of the Sync Gateway instance in test and production.
It provided test cases to our server’s old Unittest test suite. Those test cases created test Databases, Users and Documents on the Sync Gateway for each test - tearing them down afterwards.
It manipulated the statistical data retrieved from Sync Gateway and posted it to our statsd instance. Again Django’s settings were used for configuration.

In extracting the library, these responsibilities have been cleaned out and clarified:

Communication with Sync Gateway’s API from Python - Responsibility of pysyncgateway library. All calls made to Sync Gateway are the responsibility of the library.
Testing and mitigating any strange behaviours of the Sync Gateway API - Responsibility of pysyncgateway. The library’s code is the place to pin and mitigate any strange behaviours that are found.
Integration of Sync Gateway’s objects (User, Document, Database) into the API server and Django - Responsibility of API server code. The server code remains responsible for managing its own test conditions.
Synchronisation of Django’s User object with Sync Gateway’s User objects - Responsibility of API server. The library is oblivious to the application that is using it - in the same way that the requests library is oblivious to the fact that it is being used by pysyncgateway to communicate with Sync Gateway.

Improved efficiency of development and test

While working on the library code, I’ve found that testing has been much more efficient.

In terms of time, a single test run as part of a build on Circle CI takes around 10s, whereas in our API server test suite it was taking 40s and was mixed in with a much longer (~20 minute) test suite.

The dedicated library repository also means that when I have questions about how Sync Gateway behaves in certain situations, the library is the place to explore that behaviour and ensure that the library code is fulfilling its main responsibility - communicating as best it can with any Sync Gateway instance.

Document all the things

The documentation built by sphinx and hosted on Read The Docs is great. I’ve found it much better than reading docs via a code editor or ipython and end up using the RTD site as the main point of reference.

Luckily, docstrings were already in place in much of the code before the extraction, but they were mixed in with API project specific information that could not be published. Again, the clarity of responsibilities meant that we could clean up much of the docs to make them ready to be published.

Still a monolith, but with packaging benefits

Our server code remains a single monolith - it’s one installed blob of code on one server. The Sync Gateway code was extracted into a library, not a service.

However, now that the Sync Gateway code is installed from PyPI via pip-sync, this provides the additional abstraction that we can select the version of the library that will be installed.

This means we will have more flexibility to improve the library to work with the latest version of Sync Gateway 2 (it’s currently only tested with 1.5) and also Python 3. We can upgrade the library, make breaking changes if required and bump versions without touching the server monolith at all.

Finally

The extraction of pysyncgateway has worked out well for us and so I’m preparing to extract our next library - a simple object orientated layer that we use to communicate with Nextcloud.

There will be quite a bit of time invested to extract the code, but my expectation is that the test benefits will be great. Not only will we get to remove library code that takes around 6 minutes to test, but also we will gain the library’s test suite as a dedicated area to test the nuanced edge cases of Nextcloud’s API.

Happy code extraction!

AAA Part 2: Extracting Arrange code to make fixtures

2017-08-07T00:00:00+01:00

In this post I will describe how code in tests’ Arrange blocks can become over-complicated, break the AAA pattern and benefit from extraction.

Background

This post is Part 2 of a series on the Arrange Act Assert pattern for Python developers. See Part 1 for an introduction to the pattern and outline of its constituent parts.
When I mention “code extraction” I’m primarily referring to the Extract Method [1] of refactoring. Kent Beck’s book “Test Driven Development: By Example” really turned me on to the value in eliminating duplicated code between tests and between tests and the SUT [2].
I’m using pytest in this example which means that fixtures are marked with the @pytest.fixture decorator. If you’re using unittest then you could extract the set up code into the TestCase.setUp method.
If you can, perform Extract Method while your test suite is GREEN [3]. This means that you can be more assured that your refactoring has worked without errors.
During my work I often build permission systems that manage access to resources such as files, accounts, projects, etc, based on the connection between Users and those resources. The example test below is from one of those projects. I often use Simpsons and Futurama characters in tests because I think it makes it easier to visualise the test conditions when characters are used that other programmers may be familiar with already.
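For unittest users, the setUp extraction mentioned above might look like the following minimal sketch. The class name and the toy list are hypothetical and only there to show where the shared Arrange code moves to.

```python
import unittest


class TestGreekList(unittest.TestCase):
    def setUp(self):
        # Arrange code shared by every test in this class.
        self.greek = ['alpha', 'beta', 'gamma', 'delta']

    def test_reverse(self):
        result = self.greek.reverse()

        self.assertIsNone(result)
        self.assertEqual(self.greek, ['delta', 'gamma', 'beta', 'alpha'])

    def test_pop(self):
        result = self.greek.pop()

        self.assertEqual(result, 'delta')
        self.assertEqual(self.greek, ['alpha', 'beta', 'gamma'])
```

Because setUp runs before each test method, every test gets a fresh list and the test bodies are left with only their Act and Assert blocks.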

The problem

I’ve found that this problem, which I call “Complicated Setup”, occurs as a test suite grows and the complexity of the tests on the outside of the code increases.

Tests will often need to combine a number of objects in increasingly complex states to build the SUT [2]. As a result, additional assertions are required before the Act block to ensure that the test conditions are correctly established. The problem with these additional assertions is that they break the AAA pattern because there should be no assertions in the Arrange block.

# Warning - this test does *not* fit the AAA pattern because it has
# assertions in the Arrange block.

def test_owner_invite_admin():
    """
    Leela can invite Bender to an additional Project, Fry is notified

    ----------------+---------------+-----------
     Account Role   | Project Role  | Name
    ----------------+---------------+-----------
     Owner          | -             | Leela
     Admin          | -             | Fry
     Viewer         | Admin         | Bender
    ----------------+---------------+-----------
    """
    # LEELA (and account)
    account = AccountFactory(owner__first_name='Leela')
    account_document = AccountDocument(account, default_database)
    account_document.get_or_create()
    leela = account.owner
    new_project = leela.create_project('new_project')
    # FRY
    admin_membership = AccountMembershipFactory(
        account=account,
        permission='AA',
        person__first_name='Fry',
    )
    fry = admin_membership.person
    # BENDER
    project_data = ProjectMembershipFactory(
        account=account,
        person__first_name='Bender',
        role='admin',
    )
    project_couchbase = project_data['project']
    bender = project_data['person']
    # Check
    assert len(bender.accounts) == 1            # <
    assert bender.accounts[0].owner == leela    # < Assertions in Arrange
    assert len(bender.projects) == 1            # <
    assert bender.projects[0] != new_project    # <
    assert len(fry.messages) == 0               # <

    result = leela.new_project.invite(bender)

    assert result is True
    assert len(fry.messages) == 1

Tests on the arrangement of the SUT will often be informed by the tests that are about to be carried out on it in the Act. Here I want to ensure that Fry is notified with a new message so it is important that after Arrange Fry has no messages waiting. But adding these assertions before the Act section means breaking AAA, and this is a smell that the test has grown too complex and should be cut down.

It is possible to use Extract Method to create a fixture that solves this issue and returns the test to pure AAA pattern. I’ve used a simplified example to illustrate how to solve this below. I’ve imagined a SUT class that must be called with some arrangement functions like arrange_a, arrange_b, etc.

If the example does not load for you, you can view it on speakerdeck.
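As a rough text version of that process, here is a minimal sketch using pytest. The SUT class with arrange_a and arrange_b follows the description above, but its internals, the act method and the build_sut helper are all hypothetical and exist only to make the example runnable.

```python
import pytest


class SUT:
    """Hypothetical system under test that records how it was arranged."""

    def __init__(self):
        self.steps = []

    def arrange_a(self):
        self.steps.append('a')

    def arrange_b(self):
        self.steps.append('b')

    def act(self):
        return len(self.steps)


def build_sut():
    """Arrange code moved out of the test body with Extract Method."""
    sut = SUT()
    sut.arrange_a()
    sut.arrange_b()
    return sut


@pytest.fixture
def sut():
    return build_sut()


def test_fixture(sut):
    """
    Fixture provides a SUT with both arrangement steps applied
    """
    result = sut

    assert result.steps == ['a', 'b']


def test_act(sut):
    """
    SUT reports the number of arrangement steps applied
    """
    result = sut.act()

    assert result == 2
```

The assertions that used to sit in the Arrange block now live in test_fixture, so test_act is left with a plain Arrange, Act, Assert shape.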

Now applying this process to the Futurama account test above I get the following fixture with its own dedicated test and a much simpler test for the invite behaviour.

@pytest.fixture
def account_members():
    """
    Returns:
        tuple:
            User: Leela - Account owner.
            User: Fry - Admin.
            User: Bender - Project admin.

    ----------------+---------------+-----------
     Account Role   | Project Role  | Name
    ----------------+---------------+-----------
     Owner          | -             | Leela
     Admin          | -             | Fry
     Viewer         | Admin         | Bender
    ----------------+---------------+-----------
    """
    # LEELA (and account)
    account = AccountFactory(owner__first_name='Leela')
    account_document = AccountDocument(account, default_database)
    account_document.get_or_create()
    leela = account.owner
    new_project = leela.create_project('new_project')
    # FRY
    admin_membership = AccountMembershipFactory(
        account=account,
        permission='AA',
        person__first_name='Fry',
    )
    fry = admin_membership.person
    # BENDER
    project_data = ProjectMembershipFactory(
        account=account,
        person__first_name='Bender',
        role='admin',
    )
    project_couchbase = project_data['project']
    bender = project_data['person']
    return leela, fry, bender

def test_account_members(account_members):
    """
    Fry has no pending messages and Bender is a member of the Account
    """
    result = account_members

    assert len(result) == 3
    leela, fry, bender = result
    assert len(bender.accounts) == 1
    assert bender.accounts[0].owner == leela
    assert len(bender.projects) == 1
    assert bender.projects[0] != leela.new_project
    assert len(fry.messages) == 0

def test_owner_invite_admin(account_members):
    """
    Leela can invite Bender to an additional Project, Fry is notified
    """
    leela, fry, bender = account_members

    result = leela.new_project.invite(bender)

    assert result is True
    assert len(fry.messages) == 1

Even though this example is long-winded, I hope you can see that the extraction of the set up code into its own fixture has simplified the tests and brought the code back into conformity with the AAA pattern.

Benefits of extraction

The result of the extraction process is a pair of tests with a single fixture. The tests fit the AAA pattern that I advocated in Part 1 of this series and the resulting code’s structure has a number of advantages for the future of the test suite:

Continued development on the fixture can happen using TDD [4] by adding new requirements to test_fixture() and then expanding the fixture to get back to GREEN.
The resulting fixture can be reused really easily. Permutations of different actions on a particular SUT can be easily tested without having to depend on our power of copy and paste and without creating more duplicated code.
If a situation arises in the future where the arrangement of the SUT needs to change in the fixture, all the tests that use it might fail. However, the payoff for the additional failure of the fixture’s dedicated tests is that there is the opportunity to fix the problem in one place - the extracted code in the fixture.

On top of that, the fix can be performed using TDD because the fixture is already extracted and under test - a potential double win.

In this way the test suite remains dynamic, clear and able to adapt with the software it’s testing.

Should all fixtures have their own tests?

I’m often asked whether I think test fixtures should be tested. My answer is: “It depends”.

When the fixture was arrived at via “Complicated setup” then my answer is “yes”. As we’ve seen, the test_fixture() test remains to pin the fixture’s behaviour and assert that the SUT is in the expected state.

When the fixture has been extracted because of “Setup duplication” [5] there will be a fixture created that does not have its own explicit test. Instead, the fixture is tested implicitly by the two tests but does not have a dedicated test of its own.

For me this is an “OK” situation and if it turns out that the fixture should be adjusted then a fixture test can be created to facilitate that change under the usual RED, GREEN, REFACTOR cycle.

flake8-aaa

Check out flake8-aaa - a Flake8 plugin that makes it easier to write tests that follow the Arrange Act Assert pattern.

Happy testing!

Tiny glossary

[1]	Extract Method is a refactoring step defined here.

[2]	System Under Test. I’ve used this to mean the Unit under test; there is no implication around the size of the “system” or “unit”.

[3]	GREEN is the name for the state when all tests in your suite pass.

[4]	Test Driven Development.

[5]	Setup duplication: My name for the situation where there are large chunks of Arrange code duplicated between tests. This topic warrants a follow-up post.

Arrange Act Assert pattern for Python developers

2017-07-06T23:00:00+01:00

This is the first of two posts exploring the Arrange Act Assert pattern and how to apply it to Python tests. It presents a recognisable and reusable test template following the Arrange Act Assert pattern of testing. In addition, I aim to present strategies for test writing and refactoring which I’ve developed over the last couple of years, both on my own projects and within teams.

In this first part I will introduce the Arrange Act Assert pattern and discuss its constituent parts.

What is Arrange Act Assert?

The “Arrange-Act-Assert” (also AAA and 3A) pattern of testing was observed and named by Bill Wake in 2001. I first came across it in Kent Beck’s book “Test Driven Development: By Example” and I spoke about it at PyConUK 2016.

The pattern focuses each test on a single action. The advantage of this focus is that it clearly separates the arrangement of the System Under Test (SUT) and the assertions that are made on it after the action.

On multiple projects I’ve worked on I’ve experienced organised and “clean” code in the main codebase, but disorganisation and inconsistency in the test suite. However, when AAA is applied, I’ve found that it unifies and clarifies the structure of tests, which makes the test suite much more understandable and manageable.

TL;DR: The shape of an AAA test

Here is a test that I was working on recently that follows the AAA pattern. I’ve extracted it from Vim and blocked out the code with the colour that Vim assigns.

Hopefully in this rough image you will see three sections to the test separated by an empty line:

First there is the test definition, docstring and Arrangement.
Empty line.
In the middle, there is a single line of code - this is the most important part: The Act.
Empty line.
Finally there are the Assertions. You can see that the Assert block code lines all start with the orange / brown colour - that is because the Python keyword assert is marked with this colour in Vim with my current configuration.

While working on test suites that employ this pattern, I’ve found it easier to understand each test. My eye has definitely got used to the test “shape”. Want to know what is being tested? Just look at the single line above the assertion block.

Follow this pattern across your tests and your suite will be much improved.

Background

I’ll now go into detail on each of these parts using Pytest and a toy test example - a simple happy-path test for Python’s builtin list.reverse function.

I’ve made the following assumptions:

We all love PEP 8, so we want tests to pass flake8 linting.
PEP 20, The Zen of Python, is also something we work towards - I will use some of its “mantras” when I justify some of the suggestions in this guide.
Simplicity trumps performance. We want a test suite that is easy to maintain and manage and can pay for that with some performance loss. I’ve assumed this is a reasonable trade off because the tests are run much less frequently than the SUT in production.

This post is only an introduction to the AAA pattern. Where certain topics will be covered in more detail in future posts in this series, I have marked them with a footnote.

Definition

The definition of the test function.

Example

def test_reverse():

Guidelines

Name your function something descriptive because the function name will be shown when the test fails in Pytest output.
Good test method names can make docstrings redundant in simple tests (thanks Adam!).

Docstring

An optional short single line statement about the behaviour under test.

Example

"""
list.reverse inverts the order of items in a list, in place
"""

Guidelines

Docstrings are not part of the AAA pattern. Consider if your test needs one or if you are best to omit it for simplicity.

If you do include a docstring, then I recommend that you:

Follow the existing Docstring style of your project so that the tests are consistent with the code base you are testing.
Keep the language positive - state clearly what the expected behaviour is. Positive docstrings read similar to:

X does Y when Z

Or…

Given Z, then X does Y
Be cautious when using any uncertain language in the docstring and follow the mantra “Explicit is better than implicit” (PEP 20).

Words like “should” and “if” introduce uncertainty. For example:

X should do Y if Z

In this case the reader could be left with questions. Is X doing it right at the moment? Is this a TODO note? Is this a test for an expected failure?

In a similar vein, avoid the future tense.

X will do Y when Z

Again, this reads like a TODO.

Arrange

The block of code that sets up the conditions for the test action.

Example

There’s not much work to do in this example to build a list, so the arrangement block is just one line.

greek = ['alpha', 'beta', 'gamma', 'delta']

Guidelines

Use a single block of code with no empty lines.
Do not use assert in the Arrange block. If you need to make an assertion about your arrangement, then this is a smell that your arrangement is too complicated and should be extracted to a fixture or setup function and tested in its own right.
Only capture values in Arrange that are non-deterministic and will not be available after the action.
The arrange section should not require comments. If you have a large arrangement in your tests which is complex enough to require detailed comments then consider:
- Extracting the comments into a multi-line docstring.
- Extracting the arrangement code into a fixture and testing that the fixture is establishing the expected conditions as previously mentioned.

Act

The line of code where the Action is taken on the SUT.

Example

result = greek.reverse()

Guidelines

Start every Action line with result =.

This makes it easier to distinguish test actions and means you can avoid the hardest job in programming: naming. When every result is called result, then you do not need to waste brain power wondering if it should be item = or response = etc. An added benefit is that you can find test actions easily with a tool like grep.
Even when there is no result from the action, capture it with result = and then assert result is None. In this way, the SUT’s behaviour is pinned.
If you struggle to write a single line action, then consider extracting some of that code into your arrangement.
The action can be wrapped in with ... raises for expected exceptions. In this case your action will be two lines surrounded by empty lines.
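Here is a minimal sketch of that last guideline, using pytest.raises around the action. The test itself is a toy of my own (list.pop on an empty list) rather than anything from a real suite.

```python
import pytest


def test_pop_empty():
    """
    list.pop raises IndexError when the list is empty
    """
    empty = []

    with pytest.raises(IndexError) as excinfo:
        empty.pop()

    assert 'pop from empty list' in str(excinfo.value)
```

The Act is now two lines, but it still sits on its own between empty lines, so the test keeps the same recognisable shape.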

Assert

The block of code that performs the assertions on the state of the SUT after the action.

Example

assert result is None
assert greek == ['delta', 'gamma', 'beta', 'alpha']

Guidelines

Use a single block of code with no empty lines.
First test result, then side effects.
Limit the actions that you make in this block. Ideally, no actions should happen, but that is not always possible.
Use simple blocks of assertions. If you find that you are repeatedly writing the same code to extract information from the SUT and perform assertions on it, then consider extracting an assertion helper.
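An extracted assertion helper could look something like this. It is only a sketch for the toy example in this post, and the name assert_reversed is hypothetical.

```python
def assert_reversed(original, result):
    """Assert that `result` holds the items of `original` in reverse order."""
    assert len(result) == len(original), 'lengths differ'
    assert result == original[::-1], f'{result!r} is not {original!r} reversed'


def test_reverse_greek():
    """
    list.reverse inverts the order of items in a list, in place
    """
    original = ['alpha', 'beta', 'gamma', 'delta']
    greek = list(original)

    result = greek.reverse()

    assert result is None
    assert_reversed(original, greek)
```

The helper keeps the Assert block short and gives every test that uses it the same failure messages.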

The final test

Here’s the example test in full:

def test_reverse():
    """
    list.reverse inverts the order of items in a list, in place
    """
    greek = ['alpha', 'beta', 'gamma', 'delta']

    result = greek.reverse()

    assert result is None
    assert greek == ['delta', 'gamma', 'beta', 'alpha']

flake8-aaa

Check out flake8-aaa - a Flake8 plugin that makes it easier to write tests that follow the Arrange Act Assert pattern outlined above.

Thanks

I hope that this introduction has been helpful and you will return for part 2: AAA Part 2: Extracting Arrange code to make fixtures.

Thanks to Adam for reviewing this post and his helpful feedback.

Thanks for reading and happy testing!

Comparing Django Q Objects in Python 3 with pytest

2017-05-30T22:00:00+01:00

Background

In a previous post I wrote about comparing Django’s Q object instances. The original code was Python 2 with unittest and was due for an update.

The previous issue with comparing Django’s Q objects remains the same:

Django’s Q object does not implement __cmp__ and neither does Node which it extends (Node is in the django.utils.tree module).

Unfortunately, that means that comparison of Q objects that are equal fails.

A simple Python 3 solution

The following is a Python 3.6 assertion helper for use with pytest that uses the original strategy of comparing the string versions of the Q objects.

from django.db.models import Q


def assert_q_equal(left, right):
    """
    Test two Q objects for equality. Matching is not commutative.

    Args:
        left (Q)
        right (Q)

    Raises:
        AssertionError: When -
            * `left` or `right` are not an instance of `Q`
            * `left` and `right` are not considered equal.
    """
    assert isinstance(left, Q), f'{left.__class__} is not subclass of Q'
    assert isinstance(right, Q), f'{right.__class__} is not subclass of Q'
    assert str(left) == str(right), f'Q{left} != Q{right}'

This time the helper is just a function rather than a mixin for unittest.TestCase.

isinstance is used for comparison so that any instance of a class derived from Q can also be matched. The assertions have secondary expressions in the form of f-strings to give helpful output without raising a custom assertion.

When two Q instances do not match, pytest shows the following output:

______________________ test_neq_multi_not_commutative ______________________
test_assert_q_equal.py:83: in test_neq_multi_not_commutative
    assert_q_equal(q_a, q_b)
test_assert_q_equal.py:22: in assert_q_equal
    assert str(left) == str(right), f'Q{left} != Q{right}'
E   AssertionError: Q(AND: ('speed', 12), ('direction', 'north')) != Q(AND: ('direction', 'north'), ('speed', 12))
E   assert "(AND: ('spee...n', 'north'))" == "(AND: ('direc...'speed', 12))"
E     - (AND: ('speed', 12), ('direction', 'north'))
E     + (AND: ('direction', 'north'), ('speed', 12))
==================== 1 failed, 7 passed in 0.07 seconds ====================

The important thing is to adjust your assertion helpers to best fit the needs of your test suite and team.

My Vim setup for Python development

2017-01-04T14:30:00+00:00

Below is a list of my Vim plug-ins and configurations.

My goal has been to make Vim more useful for (primarily) Python development. This post refers to Vim 7, because I have not yet updated to Vim 8. I’m using vim-plug to manage my packages, so mentions of packages below will use the Plug command.

All the commands and configuration below come from my vimrc file. You’ll find that I don’t have a large number of plug-ins or configuration lines compared to other more famous Vim users (cough) Drew (cough). That is a direct result of Kris Jenkins’s Bare Bones Navigation Vim talk at Vim London. The main result of which has been that I have always run a very simple Vim setup.

My relatively recent use of FZF and Ctags, listed below, is a direct result of attending the most recent Vim London meetup and, if you’re in the London area, I fully recommend joining and attending. Every meetup I attend, my Vim-fu improves.

Specific Python config

The following are my .vimrc lines for handling Python.

When Vim searches for a file (for example with gf), also try names with a .py suffix:

set suffixesadd=.py

Ignore pyc files when expanding wildcards:

set wildignore=*.pyc

Don’t show pyc in file lists:

let g:netrw_list_hide= '.*\.pyc$'

Keep “Pythonic” tabs using 4 white spaces:

set autoindent nosmartindent    " auto/smart indent
set smarttab
set expandtab                   " expand tabs to spaces
set shiftwidth=4
set softtabstop=4

I really get frustrated with tabs that look like white spaces, so I ensure they are visible by telling Vim to show all tabs as little arrows ▷. This line also ensures that end of lines are shown with a negation sign ¬ :

set listchars=eol:¬,tab:▷\ ,

A classic “Python tell” in Vim is the 79th or 80th character highlight:

set colorcolumn=80              " Show the 80th char column.
highlight ColorColumn ctermbg=5

FZF

My greatest recent revelation has been the integration of FZF to provide “quick” fuzzy searching. Most frequently I search for files in the current git repository, open buffers and tags.

Install FZF and get it working on your machine, then add it to your Vim setup using fzf.vim:

Plug 'junegunn/fzf', { 'dir': '~/.fzf', 'do': './install --all' }
Plug 'junegunn/fzf.vim'

I’ve mapped my most common FZF searches to leader commands:

imap <c-x><c-o> <plug>(fzf-complete-line)
map <leader>b :Buffers<cr>
map <leader>f :Files<cr>
map <leader>g :GFiles<cr>
map <leader>t :Tags<cr>

Keeping FZF’s line completion on CTRL-x CTRL-o means that I can keep access to Vim’s line completion which is bound to CTRL-x CTRL-l by default.

Ag results integration with FZF is next on my list. For now, I’m still using Ag results on the command line.

Ctags

I was definitely slow to get on the Ctags bandwagon, only adding them to my workflow in the last couple of months, but along with FZF, they have been a revelation. I’ve been using Exuberant Ctags as my index generator.

TPope has published a neat trick of stashing the ctags script inside the .git folder, outlined in his blog post here. My version of the script is inside my git hooks configuration and works in combination with my ctags config.

As mentioned above, I have used <leader>t to trigger an FZF-powered search of tags:

map <leader>t :Tags<cr>

“Jump to definition under cursor” is still the default CTRL-] which, together with “previous tag” on CTRL-t, makes it really easy to traverse code.

Visual selection

The smartpairs plugin is fantastic for selecting text inside brackets, braces and parentheses and is excellent for all languages I work with, not just Python:

Plug 'gorkunov/smartpairs.vim'

Linting

In general, I’ve used external programs to provide linting of my Python code and so I run Vim with the current project’s virtualenv active.

With Isort installed in the current environment, sort the imports of the current file with <leader>i or call the :Isort command on a range of lines:

map <leader>i :Isort<cr>
command! -range=% Isort :<line1>,<line2>! isort -

With flake8 installed in the current environment, lint the current file with F7 as provided by Vincent Driessen’s vim-flake8:

Plug 'nvie/vim-flake8'

Happy Vimming!

:xa

A successful pip-tools workflow for managing Python package requirements

2016-11-13T23:00:00+00:00

In this post I present the pip-tools workflow I’ve been using over a number of projects to manage multiple inherited requirements files. At its core is a GNU Make Makefile to provide recipes for managing requirements and specifying the dependencies between the requirements files.

If you are not aware of the excellent pip-tools package, it provides two commands: pip-compile and pip-sync. In this post I will be focusing on using pip-compile to compile .in files consisting of top level requirements.

pip-compile consults the PyPI index for each top level package required, looking up the package versions available, outputting a specific list of pinned packages in a .txt file. This extra layer of abstraction (.in files containing top level requirements rather than just outputting .txt files with pip freeze) is very helpful for managing requirements, but does create some complications which mean that a solid workflow is essential for stable package management.

Update 13/10/2019

This post has been updated to use -c constraint files rather than -r recursive inclusion. More info at the end of the post.

Keep requirements files in their own folder

In order to preserve sanity, I keep my project requirements in their own folder directly inside the project.

$ cd project
$ ls requirements/
base.in  base.txt  Makefile  test.in  test.txt

During this post, I’ll use this simple example with one set of “base” requirements and one set of “test” requirements.

Store `.in` and `.txt` files in version control

Both .in and .txt files are tracked in the project’s revision control system, for example git. This allows for shipping of the compiled .txt files for installation, but more importantly, it presents the opportunity to check the diff of .txt files when upgrading packages.

I also tend to keep .in files sorted alphabetically.

Set `.in` files to depend on `.txt` files

In the example project there are base.in and test.in requirements files:

base.in compiles to base.txt
test.in compiles to test.txt

I want the test requirements to be compiled to respect the versions of the base packages so that they can be installed without disrupting those selected versions. Therefore I set test.in to be constrained by the base.txt compiled requirements with -c:

base.in contains:

project-packages

test.in contains:

-c base.txt

test-packages

Use a Makefile for common tasks

On each project that has multiple requirements files, I use a Makefile and place it in the requirements folder.

.PHONY: all check clean

objects = $(wildcard *.in)
outputs := $(objects:.in=.txt)

all: $(outputs)

%.txt: %.in
    pip-compile -v --output-file $@ $<

test.txt: base.txt

check:
    @which pip-compile > /dev/null

clean: check
    - rm *.txt

Here is a similar file in a current project.

Note 1: This is an updated version of the Makefile that I have been using. There are no clean or check recipes.

Note 2: make requires recipes to be indented by tabs, so if you want to copy this file then it could be helpful to pull the raw file from Github rather than copying and pasting from this page which does not show tab characters.

Let’s go over the key functionality provided by this Makefile:

First two definitions:
```
objects = $(wildcard *.in)
```
objects is a list containing every .in file in the requirements folder.
```
outputs := $(objects:.in=.txt)
```
outputs is also a list, made of one .txt filename for each .in file in the objects list. The .txt files do not need to exist yet; this list tells make what they should be called.
A recipe called all to build all .txt files:
```
all: $(outputs)
```
The all recipe has no commands of its own - it solely depends on all the .txt files in the outputs list being built. In order to fulfil this recipe, make will attempt to build every .txt file in the outputs list.
Up until now, make does not know how to build a .txt file, so here we give it a recipe:
```
%.txt: %.in
    pip-compile -v --output-file $@ $<
```
The first line tells make that any .txt file depends on the .in file with the same name. make will check the date stamp on the two files and compare them - if the .txt file is older than the .in file or does not exist, then make will build it.

The next line tells make the command to use to perform the build - it is the pip-compile command with the following flags:
- -v means pip-compile will give verbose output. I find this helpful for general watchfulness, but you may prefer to remove it.
- --output-file $@ means “send the output to the target of the recipe”, which is the .txt file we’ve asked to be made. For example when invoking make base.txt, then --output-file base.txt will be passed.
- $< at the end is the corresponding .in input file. Make matches the names using the % sign in the recipe, so it knows to build base.txt from base.in.
Now we tell make about the dependency between the requirements files.
```
test.txt: base.txt
```
This creates a dependency chain. This is an additional recipe for test.txt which tells make that it depends on base.txt. That means that if make is asked to build test.txt, then it should be updated if test.in or base.txt have been updated.

If base.in is updated, then make knows that it will need to recompile base.txt in order to make test.txt. We can see that here:
```
$ touch base.in       # Update timestamp on base.in
$ make -n test.txt    # What commands will be run to build test.txt
pip-compile -v --output-file base.txt base.in
pip-compile -v --output-file test.txt test.in
```
This is exactly what we want for requirements constraining. If the requirements in our base have changed, then we want our test file to be recompiled too because of the -c base.txt line we added to the test.in file.

Of course, this is a trivial example, but I have used multiple lines of dependency in Makefiles to manage multiple levels of inheritance in requirements files.
Finally, a recipe to help us update requirements.
```
check:
    @which pip-compile > /dev/null

clean: check
    - rm *.txt
```
The check recipe will fail if pip-tools is not installed.

The clean recipe will remove all the .txt files if the check recipe is successful. This makes it harder to accidentally delete your requirements files without pip-tools already installed to be able to build them again.
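
The way make decides what to rebuild can be sketched in a few lines of Python. This is a simplified model and not how make is implemented: the RULES dict mirrors the two recipes above, plan is a made-up function name, and the timestamps are invented integers rather than real file modification times.

```python
RULES = {
    "base.txt": ["base.in"],
    "test.txt": ["test.in", "base.txt"],
}


def plan(target, mtimes, rules=RULES):
    """Return the targets that would be rebuilt, in build order."""
    rebuilt = []

    def visit(name):
        """Return True when `name` needs to be rebuilt."""
        if name not in rules:
            return False  # a source file such as base.in: nothing to build
        if name in rebuilt:
            return True  # already planned via another dependency
        # Visit every prerequisite first so they are planned before `name`.
        changed = [visit(dep) for dep in rules[name]]
        missing = name not in mtimes
        stale = not missing and any(
            mtimes.get(dep, 0) > mtimes[name] for dep in rules[name]
        )
        if missing or stale or any(changed):
            rebuilt.append(name)
            return True
        return False

    visit(target)
    return rebuilt


# base.in was touched (timestamp 3) after both .txt files were compiled.
mtimes = {"base.in": 3, "base.txt": 1, "test.in": 0, "test.txt": 2}
to_build = plan("test.txt", mtimes)
```

Here to_build holds base.txt followed by test.txt, the same order as the make -n output shown above.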

I’ve explained what the Makefile above does, but not how or when you would use it. So let’s continue with some common workflow actions.

Build one or more requirements files

To update all requirements use the default all recipe.

$ make all

To update a particular file, ask for it by name:

$ make test.txt

If make tells you that a file is up-to-date but you want to force it to be rebuilt you should touch the .in file.

$ make base.txt
make: 'base.txt' is up to date.
$ touch base.in
$ make base.txt
pip-compile -v --output-file base.txt base.in
...

Add a dependency

To add a dependency, locate the appropriate .in file and add the new package name there. The version number is only required if a particular version of the library is required. The latest version will be chosen by default when compiling.

$ cat >> base.in
ipython
$ make all

Update a package

In order to update a single top level package version, remove its lines from the compiled corresponding .txt files. I tend to be quite “aggressive” with this and remove every package that the top level package depended on using sed with a pattern match.

Given that I want to update ipython and it is not pinned in my .in file:

$ sed '/ipython/d' -i *.txt
$ make all

There is no command for this removal built into the Makefile, but potentially there could be. Ideally, it could be provided as extra functionality by pip-tools. Beware that package names often contain each other’s names as substrings, so the pattern can match more lines than intended. If in doubt, review your diff and remove lines from your .txt files manually.

The call to make all will reevaluate the latest version for packages that do not have corresponding lines in the .txt file and they will be updated as required.
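
To illustrate the substring problem, here is a minimal Python sketch that removes only the pin whose package name matches exactly. It assumes pins are written as name==version at the start of a line. drop_pin and the version numbers are made up for illustration and are not part of pip-tools.

```python
import re


def drop_pin(lines, package):
    """Return `lines` without the pin for exactly `package`."""
    # Anchor at the start of the line and require '==' straight after the
    # name, so 'ipython' does not also match 'ipython-genutils'.
    pinned = re.compile(rf"^{re.escape(package)}==", re.IGNORECASE)
    return [line for line in lines if not pinned.match(line)]


compiled = [
    "ipython==5.1.0",
    "ipython-genutils==0.1.0   # via traitlets",
    "traitlets==4.3.1          # via ipython",
]

remaining = drop_pin(compiled, "ipython")
```

Only the first line is removed here, whereas sed '/ipython/d' would delete all three because each of them contains the string ipython. Which behaviour you want depends on how aggressive you want the update to be.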

Update all requirements

A full update of all requirements to the latest version (including updating all packages that are not pinned in the .in file with a particular version number) can be achieved with:

$ make clean all

The clean recipe will clean out all *.txt files if you have pip-tools installed. Then the all recipe will rebuild them all in dependency order.

Finally

A tip for working with Makefiles. If you want to see what commands will be run by a recipe, you can use the -n flag and inspect the commands that were planned:

$ make -n all

Happy requirements packing!

Update 21/11/2016

For more information on the advantages and disadvantages of setting recursive requirements to point at .in files or .txt files please see this Issue on the pip-tools repository.

In particular, my comment illustrates how development requirements can become out of sync with base requirements when .in files are used in recursion which does not happen when .txt files are used. It’s for this reason, that I continue to recommend pointing at .txt files with -r.

Update 30/06/2017

See also this comment on GitHub from Devin Fee for a Makefile which:

… corrects the annoyance -e file:///Users/dfee/code/zebra -> -e ., making the file useful for users who don’t develop / deploy from your directory.

Update 13/10/2019

This pull request updated the pip-tools README to include info on creating layered requirements files using -c constraints. The use of -c came up in this long running issue discussing layered requirements and how to ensure that each layer is compatible with the others.

I think that -c constraints are much better than -r inclusion and have updated the post to reflect that.

Django Factory Audit

2016-10-12T10:00:00+01:00

Factories build instances of models, primarily for testing, but can be used anywhere. In this post I present my attempts at grading different factory libraries available for Django based on the validity of the instances that they create.

The Problem

You’re a much better programmer than me - I’m a bad programmer. I make mistakes, lots of them. I often forget to validate my instances before saving. In order for my tests to be more dependable and solid, I’d like whatever object factory I use to look after me by validating the generated instance before it reaches the database.

This is the way that I’ve been looking at Django model instances - for a single model we can imagine sets of instances which might look like:

Where:

ε: The universal set of all possible instances of this model.
D: The set of instances that the database would consider valid. It’s interesting to note that this set might change as code is developed, tested and run on machines that have different database versions or use different databases altogether.
V: The set of valid instances which pass full_clean. This is a subset of D.

The main issue is the set D/V of instances. These are all the instances which can be saved into the database but are considered invalid by Django.

When factories create instances that reside in this D/V set, they create instability in the system:

If a user attempts to edit that instance through the Django Admin system, then they may not be able to save their changes without fixing a list of invalid fields.
Test suites will be executed using model instances that should not be created during the “normal” lifetime of the application.

Possible solutions

We could argue that the way in which Django does not call full_clean before it writes instances to the database is the root of the problem - I’ve previously written and spoken about this. However, this is more a “condition of the environment” and therefore something that we need to manage, rather than fix.

Also, look at it another way. Any factory that integrates with Django can inspect the target model and immediately find the constraints on each field. Therefore with Django, factory libraries have all the information they need to build a strategy for creating valid data. On top of that, Django provides the full_clean function so any generated data can also be checked for validity before it’s sent to the database. Why should we have to re-code the constraints already created for our models into our factories? This looks like duplication of work.

So let’s explore how different factory libraries deal with this problem of instances in the D/V set - the white of the fried egg in the diagram.

Factory libraries

The following factory libraries have been explored:

Factory Djoy is my factory library. It’s a thin wrapper around Factory Boy which does the hard work. I’ve indented it because it’s really more a version of Factory Boy than a standalone factory library.

Test conditions

The code used to test the factory libraries is available in the Factory Audit repository. For each factory library, two factories have been created in a default state, one targeting each of the test Models:

ItemFactory: to create and save instances of plant.models.Item, a test model defined in the ‘plant’ app.
```
class Item(models.Model):
    """
    Single Item with one required field 'name'
    """
    name = models.CharField(max_length=1, unique=True)
```
This example has been taken from the Factory_Djoy README but with a reduced length of name down to one character to more easily force name collisions.
UserFactory: to create and save instances of the default django.contrib.auth User Model.

The goal is that each factory should reliably generate 10 valid instances of each model.

Wherever possible I’ve been explicit and imported the target model, rather than referring to it by name as some factories allow.

Gradings

Each factory library has been graded based on how its default configuration behaves when used with the Item and User models.

The gradings are based on the definition of “valid”. Valid instances are ones which will pass Django’s full_clean and not raise a ValidationError. For example, using the ItemFactory a generated item passes validation with:

item = ItemFactory()
item.full_clean()

The gradings are:

RED - Factory creates invalid instances of the model and saves them to the database. These are instances in the D/V set.
YELLOW - Factory raises an exception and does not save any invalid instances. Preferably this would be a ValidationError, but I’ve also allowed IntegrityError and RuntimeError here.
GREEN - Factory creates multiple valid instances with no invalid instances created or skipped. Running factory n times generates n valid instances.

The tests on each of the factories have been written to pass when the factory behaves to the expected grade. For example, the test on Factory Djoy’s ItemFactory expects that it raises ValidationError each time it’s used and is therefore YELLOW grade.

Results

Original results

Library	ItemFactory	UserFactory
Django Fakery	RED	YELLOW
Factory Boy	RED	RED
Factory Djoy	YELLOW	GREEN
Hypothesis[django]	RED	RED
Mixer	GREEN	GREEN
Model Mommy	YELLOW	GREEN

Update

Thanks to Piotr and Adam who pointed out some issues with my grading system.

Adam pointed out that collisions are still collisions, even if they are unlikely. Therefore, even if factories are employing fantastic strategies for generating valid data, their generated instances should still be run through full_clean before save.

I agree with this opinion and think that calling full_clean on every instance creates the opportunity for two benefits, over and above asserting that every instance is valid:

If a factory raises ValidationError with information on what failed it will always be helpful to the developer who is fixing the broken test run.
If invalid data is found, this would create an opportunity for a factory to adjust failing fields so that valid data can be saved and the test run will not be interrupted.

I’ve added a “Uses full_clean” field to evaluate each factory and capture this information.

Piotr pointed out that the results of the grading are inconclusive since I don’t agree with the results. For example, in the original results Mixer is the only library that has GREEN GREEN and therefore we would assume that it is the best of the factories tested. However, that’s not the case, since I found it hard to use and its exception bubbling was also intrusive.

I’ve added the “Ease of use” grading to capture this information based on my experience working with each factory.

New gradings

Uses full_clean:
- RED - No instance of full_clean in the factory code base.
- YELLOW - Factory code base includes full_clean in the test suite only.
- GREEN - Factory tests every generated instance with full_clean.
Ease of use:
- RED - Do not bother trying. Too difficult to use.
- YELLOW - Some pain may be experienced. You might struggle to install, need to adjust your workflow, packages, etc.
- GREEN - Easy to install. Clean API.

Updated results

Results with additional “Uses full_clean” and “Ease of use” information:

Library	ItemFactory	UserFactory	Uses `full_clean`	Ease of use
Django Fakery	RED	YELLOW	RED	GREEN
Factory Boy	RED	RED	RED	GREEN
Factory Djoy	YELLOW	GREEN	GREEN	GREEN
Hypothesis[django]	RED	RED	YELLOW	GREEN
Mixer	GREEN	GREEN	RED	YELLOW
Model Mommy	YELLOW	GREEN	RED	GREEN

Notes about each library

Grading each library was often harder than I thought it would be because many don’t fall into one grading or another. Where that has happened I’ve noted it below.

Django Fakery

ItemFactory RED

Unfortunately, Django Fakery does not recognise that only one character is allowed for the Item model’s name field. It uses Latin words from a generator which are saved by the default SQLite database and are invalid because they are too long.
UserFactory YELLOW

In order to create User instances Django Fakery also uses the Latin generator which collides often. This means that IntegrityError is raised when collisions occur, but any Users created are valid.

Factory Boy

ItemFactory RED

Creates invalid instance of Item which has no name and saves it.
UserFactory RED

Creates User with invalid username and password fields and saves it.

Factory Boy has no automatic strategies used for default factories and so it fails this test hard. If the library was extended to call full_clean for generated instances before saving then it could be upgraded to YELLOW.

Factory Djoy

ItemFactory YELLOW

Calls full_clean on the Item instance created by Factory Boy which it wraps. This raises ValidationError and the Item is not saved.
UserFactory GREEN

Creates valid instances using a simple strategy. Unique usernames are generated via Faker Factory which is already a requirement of Factory Boy. full_clean is called on the generated instance to catch any collisions in the strategy and, on collision, a new name is generated and retried.

Factory Djoy contains only one simple strategy for creating Users. It has no inspection ability to create strategies of its own based on Models.

Hypothesis[django]

ItemFactory RED
UserFactory RED

Hypothesis’s Django extra does not reliably create instances of either model because its example function does not reliably generate valid data. In the case that an invalid example is generated, it is skipped and the previous example is used.

Interestingly, Hypothesis creates User instances that Django considers to have invalid email addresses.
Uses full_clean YELLOW

Hypothesis’s code base currently includes a single instance of full_clean. This is in its test suite to assert that instances built are valid. However, it doesn’t call full_clean on generated instances during its normal operation.

Mixer

ItemFactory GREEN

Mixer appears to inspect the Item model and generates a very limited strategy for generating names. Unfortunately it runs out of instances around 23, even though there are hundreds of characters available.
UserFactory GREEN
Ease of use YELLOW

Mixer helpfully raises RuntimeError if a strategy can’t generate a valid instance. However, it echoes this to standard out, which is annoying and really confused me when I was first using it because exceptions appear on the terminal even though all tests have passed.

It uses an old version of Fake Factory which meant that its tests had to be extracted into a second test run that occurs after a pip-sync has taken place. I found this the hardest factory library to work with.

Model Mommy

ItemFactory YELLOW

There is no method used to create unique values so there are collisions when there are a small number of possible values. Items that are created are valid.
UserFactory GREEN

Model Mommy’s random text strategy works here for username and the random strings are unlikely to collide.

Model Mommy depends on its strategies to create valid data and does not call full_clean, meaning that IntegrityError can be raised when collisions occur. It could be argued that it should be downgraded to YELLOW because IntegrityError is raised.

And the winner is?

What is the best factory to use? This is a really hard question.

These factory libraries generally consist of two parts and different libraries do each part well.

Control / API: Personally I really like the Factory Boy API and how it interfaces with Django. I’m happy with the Factory Djoy library because it provides the certainty of calling full_clean for every created instance on top of the Factory Boy API.
Data strategy: I’m excited by Hypothesis and its ability to generate test data.

My current advice is to use Factory Djoy, or wrap your favourite factory in a call to full_clean.

Yes, there is a performance overhead to calling full_clean but my opinion is that eliminating the D/V set of invalid instances is worth the effort and makes the test suite “fundamentally simpler” [1].
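
A minimal sketch of wrapping instance creation in a call to full_clean follows. To keep it runnable without a Django project, Item and ValidationError are stand-ins written for this illustration and create_valid is a hypothetical helper name. With a real model, full_clean and save are the methods Django already provides.

```python
class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""


class Item:
    """Stand-in for the Item model: `name` is one character and unique."""

    saved = []  # plays the role of the database table

    def __init__(self, name=""):
        self.name = name

    def full_clean(self):
        if len(self.name) != 1:
            raise ValidationError("name must be exactly one character")
        if any(item.name == self.name for item in Item.saved):
            raise ValidationError("name must be unique")

    def save(self):
        Item.saved.append(self)


def create_valid(model, **fields):
    """Build an instance, validate it, and only then save it."""
    instance = model(**fields)
    instance.full_clean()  # raises before anything reaches the database
    instance.save()
    return instance


item = create_valid(Item, name="a")

try:
    create_valid(Item, name="too long")
except ValidationError:
    rejected = True
else:
    rejected = False
```

Because validation happens before save, the invalid instance never lands in the D/V set: it is rejected instead of stored.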

My future thinking is that if Hypothesis can improve its interface to Django it could be the winner.

Resources

Factory audit repository: Contains the research work - factories and tests for each factory library. Pull requests very welcome - especially if they add a new factory library or fix a test.
Slides: From my presentation of these results at the London Django October meetup.
Video: Available via the Skills Matter website.
Thanks to Adam for pointing out the collisions issue which you can hear in the video around 20 minutes in. Even if collisions are unlikely, they can still be a problem. Check out his Factory Boy post.
The 15th October update to the post is visible as a Pull Request on the blog’s repository.

Happy fabricating!

[1]	“Taking a few percent hit, going a little slower, in order to do something that’s just fundamentally simpler”: Brandon Rhodes, Python and the Glories of the UNIX Tradition, PyCon UK 2016.

Cleaner unit testing with the Arrange Act Assert pattern

2016-09-18T20:00:00+01:00

At PyConUK 2016 I spoke about the Arrange Act Assert pattern and how it can help clean up unit tests.

Note: A newer post Arrange Act Assert pattern for Python developers is available. It has much clearer examples and guidelines for using AAA than the video and slides below.

Original proposal

PyConUK ask that we provide an explanation of why we think that attendees will be interested in our talk. This was my original proposal’s reasoning.

This talk focuses on developers that practise TDD, or want to use it more in their coding.

My assumption is that our community feels a lot of pain from testing. I’ve heard fellow developers talk about the difficulty with managing complicated test suites; issues with reading and understanding others’ tests; and struggles when updating others’ tests. I hope that the PyConUK attendees will have felt some of this pain and be interested in a talk that demonstrates the use of a pattern that can (hopefully) mitigate some of it and help us all to be “cleaner” testers.

Although I’ve marked “moderately experienced”, I think that my talk would have a broad appeal: to those who are new to testing and would like a “template” to follow, and to those who are expert, because of the discussion about when to DRY out tests and how to assert that our test refactors are safe.

Slides and video

The video of the talk does not capture much of the screen, so the slides are posted here too.

Resources

PEP08 and PEP20, The Zen of Python.
Kent Beck: Test Driven Development: By Example - a great book which references the AAA pattern (page 97).
Google-style docstrings: In addition to using this style in my AAA tests, I’ve started to add a Trusts section to indicate which other tests are trusted by any particular test and why.
Bill Wake’s post about AAA: Bill Wake is cited by Kent Beck as having coined the term 3A.
Extract Method: I’ve used extract method as defined by Martin Fowler. See also Extract Variable.

Update August 2018: Check out flake8-aaa - a Flake8 plugin that makes it easier to write tests that follow the Arrange Act Assert pattern.

Finally

Thanks again to Carles for introducing me to the AAA pattern.

Python unittest: assertTrue is truthy, assertFalse is falsy

2016-05-12T14:00:00+01:00

In this post, I explore the differences between the unittest boolean assert methods assertTrue and assertFalse and the assertIs identity assertion.

Definitions

Here’s what the unittest module documentation currently notes about assertTrue and assertFalse, with the appropriate code highlighted:

assertTrue(expr, msg=None)

assertFalse(expr, msg=None)
Test that expr is true (or false).

Note that this is equivalent to
bool(expr) is True
and not to
expr is True
(use assertIs(expr, True) for the latter).

Mozilla Developer Network defines truthy as:

A value that translates to true when evaluated in a Boolean context.

In Python this is equivalent to:

bool(expr) is True

Which exactly matches what assertTrue is testing for.

Therefore the documentation already indicates assertTrue is truthy and assertFalse is falsy. These assertion methods are creating a bool from the received value and then evaluating it. It also suggests that we really shouldn’t use assertTrue or assertFalse for very much at all.
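
To make the difference concrete, here is a minimal sketch that applies both checks to a few values. The candidates list is my own choice for illustration; any truthy values would show the same thing.

```python
# Truthy values pass a bool() check, but only the True singleton
# passes an identity check.
candidates = ['True', 1, [0], True]

for expr in candidates:
    truthy = bool(expr) is True   # what assertTrue checks
    identical = expr is True      # what assertIs(expr, True) checks
    print(repr(expr), truthy, identical)
```

Every candidate is truthy, but only the last one is identical to True.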

What does this mean in practice?

Let’s use a very simple example - a function called always_true that returns True. We’ll write the tests for it and then make changes to the code and see how the tests perform.

Starting with the tests, we’ll have two tests. One is “loose”, using assertTrue to test for a truthy value. The other is “strict” using assertIs as recommended by the documentation:

import unittest

from func import always_true


class TestAlwaysTrue(unittest.TestCase):

    def test_assertTrue(self):
        """
        always_true returns a truthy value
        """
        result = always_true()

        self.assertTrue(result)

    def test_assertIs(self):
        """
        always_true returns True
        """
        result = always_true()

        self.assertIs(result, True)

Here’s the code for our simple function in func.py:

def always_true():
    """
    I'm always True.

    Returns:
        bool: True
    """
    return True

When run, everything passes:

always_true returns True ... ok
always_true returns a truthy value ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.004s

OK

Happy days!

Now, “someone” changes always_true to the following:

def always_true():
    """
    I'm always True.

    Returns:
        bool: True
    """
    return 'True'

Instead of returning True (boolean), it’s now returning string 'True'. (Of course this “someone” hasn’t updated the docstring - we’ll raise a ticket later.)

This time the result is not so happy:

always_true returns True ... FAIL
always_true returns a truthy value ... ok

======================================================================
FAIL: always_true returns True
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/assertttt/test.py", line 22, in test_is_true
    self.assertIs(result, True)
AssertionError: 'True' is not True

----------------------------------------------------------------------
Ran 2 tests in 0.004s

FAILED (failures=1)

Only one test failed! This means assertTrue gave us a false positive: it passed when it shouldn’t have. It’s lucky we wrote the second test with assertIs.

Therefore, just as we learned from the manual, to keep the functionality of always_true pinned tightly, the stricter assertIs should be used rather than assertTrue.

Use assertion helpers

Writing out assertIs to test for True and False values is not too lengthy. However, if you have a project in which you often need to check that values are exactly True or exactly False, then you can make yourself the assertIsTrue and assertIsFalse assertion helpers.

This doesn’t save a particularly large amount of code, but it does improve readability in my opinion.

def assertIsTrue(self, value):
    self.assertIs(value, True)

def assertIsFalse(self, value):
    self.assertIs(value, False)
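
One way to wire these helpers into a test suite is a small mixin. This is a sketch only: the names StrictBoolMixin and TestStrictBool are hypothetical and chosen for illustration.

```python
import unittest


class StrictBoolMixin:
    """Hypothetical mixin: strict identity assertions for booleans."""

    def assertIsTrue(self, value):
        self.assertIs(value, True)

    def assertIsFalse(self, value):
        self.assertIs(value, False)


class TestStrictBool(StrictBoolMixin, unittest.TestCase):

    def test_true(self):
        self.assertIsTrue(1 == 1)

    def test_false(self):
        self.assertIsFalse(1 == 2)

    def test_string_is_rejected(self):
        # 'True' is truthy, but it is not the True singleton
        with self.assertRaises(AssertionError):
            self.assertIsTrue('True')
```

Placing the mixin before unittest.TestCase in the bases keeps the helpers available on every test class that needs them.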

Summary

In general, my recommendation is to keep tests as tight as possible. If you mean to test for the exact value True or False, then follow the documentation and use assertIs. Do not use assertTrue or assertFalse unless you really have to.

If you are looking at a function that can return various types, for example, sometimes bool sometimes int, then consider refactoring. This is a code smell and in Python, that False value for an error would probably be better raised as an exception.

In addition, if you really need to assert the return value from a function under test is truthy, there might be a second code smell - is your code correctly encapsulated? If assertTrue and assertFalse are asserting that function return values will trigger if statements correctly, then it might be worth sense-checking you’ve encapsulated everything you intended in the appropriate place. Maybe those if statements should be encapsulated within the function under test.

Happy testing!

API Documentation and the Communication Illusion

2016-05-04T12:00:00+01:00

In this post I talk quite a lot about the “Communication Illusion” - a quote often misattributed to George Bernard Shaw:

The single biggest problem in communication is the illusion that it has taken place.

Now let’s talk about documenting APIs…

API documentation and my current workflow

When I’m building an API, I usually have a client in mind that will be consuming its resources. That means a developer or team of developers will be writing code against the API that I build and so my current workflow for updating or creating new API services focuses on providing API documentation.

It works like this:

Manually document the endpoint using REST API documentation templates. Highlight any side effects, requirements and illustrate required data and responses.

Alternatively, if there is a change to be made to the service, then the documentation edit can be provided as an SCM diff which can be easily reviewed.
Ask for sign off on the API documentation from the API consumer development team. Incorporate any feedback and repeat until the team is happy.
Build out new endpoint using test cases written from the documentation.
Once complete, export server responses back from the test suite into the documentation. Note any changes that were required to the signed off document and flag them to the consumer developers during the release process, if not beforehand.

Past experience of Communications Illusions

In conversations with other developers about my process it’s been called “tedious”. I can agree that it seems like extra effort at first, however, I believe that it pays off in the long run and that belief is based on my past experience at my old web development business, Fublo.

At Fublo, we often produced brochureware sites on top of a simple CMS. When starting a new project, conversation within our team would often happen like this:

We know the solution to the problem, so instead of drawing out the design, let’s just jump straight into HTML and code it up. This will save time.

Inevitably this didn’t save time at all. We would often have to burn much of our initial work because we couldn’t fit the final client requirements into it, or rework major structural elements because they turned out to be unsuitable.

This inefficiency is the result of three illusions:

The illusion with ourselves: When we thought that we knew the solution and that it would be suitable and workable, this was just an illusion. Our minds tricked us into thinking that we knew the answer but we didn’t.
The Communications Illusion within the team: This was in full effect within our small team. We found that what we described as the problem to each other would fit with our own distorted view and we would only find out that our opinions were not the same some significant time after starting the build - sometimes too late to make changes.
The Communications Illusion with the client: We also didn’t often fully understand the client need, even though we thought we did at the outset. Sometimes there would be omissions in the brief or misunderstandings about how content would be provided, presented or managed.

Those three illusions sum up to one big Communications Illusion - we believe that we, our team and our clients all have tallying opinions and ideas, when there is no proof that that is the case.

Pretty quickly, but probably not as quickly as we would have liked, we realised that the “tedious” way of producing flat designs before starting coding was often the most efficient, even though it felt like it wasn’t. It provided the quickest route to a communication feedback loop, within ourselves, our team and with our clients and that destroys the Communication Illusion. On top of that it meant that we were not making any accidental, hard to refactor, architectural decisions on the fly without all the required information.

Applying this to API design

So it’s my opinion that producing a flat API document without touching a line of code or writing a single test is the most efficient way of destroying the potential Communications Illusions when creating APIs.

This means that providing static API documentation first is the most efficient path to a place where API producers and consumers have a joint shared opinion about how the API will operate and perform.

Tooling wishlist

On the flip side of those benefits in the long term, there is still a part of me that finds some of my manual process a little slow. My inner need for efficiency feels unfulfilled every time I copy and paste an API response into a document.

In an ideal world I would have tools that would:

Extract from the API documentation a set of tests that could be run against the built API to assert that the documentation features were all adhered to.
Extract the response payloads from the server responses made to these tests, so that they could be added back into, or matched with, the API document that they came from.

However, even if those tools did exist, the starting point would still be some form of static documentation that describes what the API does.

I’ve seen that there are tools like Apiary which might make this process easier, but I would want to hear some positive experience and read a case study before I committed to using such a service for client work.

Final thoughts

So, I hope in this post I’ve been able to convince you that it’s most efficient to document your API before you build or change it. I hope that, like me, you find that that efficiency comes from the increase in communication that the documentation creates as a side-effect, destroying the Communication Illusion that can ruin a project build.

Unfortunately, I’ve not found a way to ensure that the consumer programmers of my APIs read and understand what’s in the documentation before coding starts. That’s a second layer of Communication Illusion that I’ll maybe get to tackle another day.

Happily, I still agree with this (old, now deleted) Tweet that I posted more than 18 months ago:

Building an API… All that matters is the docs.

October 3, 2014

…and in fact, after working on more API builds and writing this post, I believe it’s even more true than before.

Happy API building!

Read comments on Hacker News

Comparing Django Q Objects

2016-03-28T12:00:00+01:00

Background

Note: A newer version of this post exists with an assertion helper for Python 3 and pytest. Read on for Python 2 and unittest and general background on Q objects…

When programmatically building complex queries in Django ORM, it’s helpful to be able to test the resulting Q object instances against each other.

However, Django’s Q object does not implement __cmp__ and neither does Node which it extends (Node is in the django.utils.tree module).

Unfortunately, that means that comparison of Q objects that are equal fails.

>>> from django.db.models import Q
>>> a = Q(thing='value')
>>> b = Q(thing='value')
>>> assert a == b
Traceback (most recent call last):
...
AssertionError

This means that writing unit tests that assert that correct Q objects have been created is hard.

A simple solution

Q objects generate great Unicode representations of themselves:

>>> a = Q(place='Residential') & Q(people__gt=5)
>>> unicode(a)
u"(AND: ('place', 'Residential'), ('people__gt', 5))"

In addition, it is “good” testing practice to write assertion helpers whenever a test suite has complicated assertions to make frequently. This provides an opportunity to DRY out test code and expand on any error messages that are raised on failure.

Therefore a really simple solution is an assertion helper that would compare Q objects by:

Asserting that left and right sides are both instances of Q.
Asserting that the Unicode for the left and right sides are identical.

So here’s a mixin containing the assertion helper. It can be added to any class that extends unittest.TestCase (such as Django’s default TestCase):

from django.db.models import Q


class QTestMixin(object):

    def assertQEqual(self, left, right):
        """
        Assert `Q` objects are equal by ensuring that their
        unicode outputs are equal (crappy but good enough)
        """
        self.assertIsInstance(left, Q)
        self.assertIsInstance(right, Q)
        left_u = unicode(left)
        right_u = unicode(right)
        self.assertEqual(left_u, right_u)

The disadvantage of this method is that it is simplistic and doesn’t recognise all the Q objects that are logically identical (see below). However, the advantage is that it provides rich diffs on failure:

class TestFail(TestCase, QTestMixin):

    def test_unhappy(self):
        """
        Two Q objects are not the same
        """
        a = Q(place='Residential')
        b = Q(place='Palace')
        self.assertQEqual(a, b)

Gives output:

AssertionError: u"(AND: ('place', 'Residential'))" != u"(AND: ('place', 'Palace'))"
- (AND: ('place', 'Residential'))
?                  ^^^^^^^^^
+ (AND: ('place', 'Palace'))
?                  ^  +++

Which can be very helpful when trying to track down errors.

See this updated post for a version of this assertion helper for Python 3 with pytest.

The perfect world: Predicate Logic

Since Q objects represent the logic of SQL WHERE clauses they are therefore Python representations of predicates. In an ideal world the predicate logic rules of equality could be used to compare Q objects and this would be built directly into Q.__cmp__.

This would mean that:

# WARNING MAGIC IMAGINARY CODE!

# Commutative would work
>>> a = Q(x=1) | Q(x=2)
>>> b = Q(x=2) | Q(x=1)
>>> a == b
True

# Double negation would work
>>> a = Q(x=1)
>>> b = ~~Q(x=1)
>>> a == b
True

# Negation on expression would work
>>> a = ~(Q(x=1) & Q(x=2))
>>> b = ~Q(x=1) | ~Q(x=2)
>>> a == b
True

# END IMAGINATION SECTION

This is probably never going to be implemented in Django, because it would be functionality only used (as far as I can see) for testing. In addition, without a special implementation for rendering Q object diffs, it would be hard to understand the source of errors when mismatches occur.
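
Outside of Django, the three equivalences above can at least be checked by brute force with a truth table. The sketch below uses plain Python predicates, not Q objects. The equivalent helper is hypothetical, p and q stand in for the conditions x=1 and x=2 as independent atoms, and SQL’s NULL handling is ignored.

```python
from itertools import product


def equivalent(left, right, names):
    """Return True if both predicates agree for every True/False assignment."""
    for values in product([False, True], repeat=len(names)):
        env = dict(zip(names, values))
        if left(env) != right(env):
            return False
    return True


commutative = equivalent(
    lambda e: e['p'] or e['q'],
    lambda e: e['q'] or e['p'],
    ['p', 'q'],
)
double_negation = equivalent(
    lambda e: e['p'],
    lambda e: not (not e['p']),
    ['p'],
)
negation_on_expression = equivalent(
    lambda e: not (e['p'] and e['q']),
    lambda e: (not e['p']) or (not e['q']),
    ['p', 'q'],
)
# A pair that should NOT match, to show the check can fail
and_versus_or = equivalent(
    lambda e: e['p'] and e['q'],
    lambda e: e['p'] or e['q'],
    ['p', 'q'],
)
```

The first three results are True and the last is False. The number of rows doubles with each extra atom, so this only suits small expressions.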

Django’s model save vs full_clean

2015-11-15T17:00:00+00:00

Screwing up data

At a previous talk I gave on “Things I wish I’d known about Django” there was this slide:

What?

I’ve made some experimental code in a small Django clean vs save project. It has a few models and a single test file which is readable and passes.

The main take-aways are:

Creating an instance of a Model and calling save on that instance does not call full_clean. Therefore it’s possible for invalid data to enter your database if you don’t manually call the full_clean function before saving.
Object managers’ default create function also doesn’t call full_clean.

Personally I find this jarring.

Given that the developer is the customer of Django I think it conflicts with the principle of least astonishment.

Why is it like this?

The Django documentation of Validating Objects is quoted in Django ticket #13100 as saying:

Note that full_clean() will NOT be called automatically when you call your model’s save() method. You’ll need to call it manually if you want to run model validation outside of a ModelForm. (This is for backwards compatibility.)

Ahh “backwards compatibility”?!

It appears that phrase only lived for four months back in 2010.

I haven’t been able to find any more specific reasons that it was added or removed.

What next?

More warnings I guess:

Consider if you ever want to be able to call save without full_clean. If the answer is ‘no’, then explore how you’ll wrap your models in business logic or extend them in some way that implements this (with tests of course). A quick search of the internet will show you some Django plugins that adjust this behaviour.
Remember that you can ruin your database when migrating if you don’t call full_clean after the migration has changed a model but before saving. All migrations should be tested to ensure that they can roll-back if full_clean raises a ValidationError during migration.
Check out posts like Why I sort of dislike Django. It mentions things like backwards compatibility and the save function.

Finally

Thanks to PXG for sharing the “Why I sort of dislike Django” post and discussing Django project structures with me.

Thanks for reading!

Irregular Vim

2015-08-16T21:00:00+01:00

My frustrations with Vim arise when it makes actions that are unexpected. At Vim London I presented some of my “pet misbehaviours” - these are the ones that affect my regular use of Vim.

Background

If you’re new to Vim, one of its key features is that it’s a modal editor. As a result, to quote a quote from a previous talk:

The “Zen” of vi is that you’re speaking a language.

So what happens when a language has many irregularities and frequently broken rules? It becomes hard to learn. For example the English language is hard because:

… although there are rules, there are lots of exceptions to those rules.

My fear is that if Vim is hard to learn it will be overlooked by new users and it will cease to exist in the future. I think we should all be working on the maxim that Drew has put on the Vim London meetup page:

Use Vim better, make Vim better.

Pet misbehaviours

Here are the five behaviours looked at in this talk, each one linked to its section in the slides on GitHub.

Linewise motions always include the start and end position

Except when the end of the motion is in column 1.
Change is equivalent to Delete Insert

Except when motion is w.
Pasting from registers is easily repeatable

Except when in visual modes.
Incrementing number after cursor is predictable

Except when the number starts with a 0.
CTRL-O goes back to old cursor position

Except when in visual modes.

Testing process

Automated and predictable testing is an important part of how I work and so I attempted to use a repeatable process for testing each of the behaviours.

Outline the assumption about Vim. Highlight the docs (where available) that make the statement or assertion.
Do some small tests of this assertion. Does it work as expected? How do we feel about the behaviour?
Update the assertion with any exceptions and look at any reasons for those exceptions.

Content

Video on vimeo (thanks Drew).

Slides are on GitHub.

Learnings

One of my annoyances that started this journey arose when attempting to delete everything up to but not including a character. As you’ll see at the end of the talk, I learned a new movement command t (thanks Audience!).

t is like f but not inclusive. From the help file (:help t):

Till before [count]’th occurrence of {char} to the right. The cursor is placed on the character left of {char} inclusive. {char} can be entered like with the f command.

This exactly solves the problem that started my exploration of Vim’s irregularities. It’s a humbling experience when you talk for 20 minutes about Vim commands and still learn a ‘basic’ one at the end of the talk. I think that this is a reminder to me that Vim is deep.

Future

I would like to improve these misbehaviours and make them more regular. My hope is that, if this could be achieved, it would make Vim’s interface even better and also easier to learn.

The main thing for me going forwards is to use Neovim, a project that is open to improving how Vim works. Here’s a great post about why Neovim is better than Vim - thanks Geoff.

From there I will check out how many of these irregularities can be improved with code changes because having to have a vimrc file that resets Vim to ‘regular’ behaviour by turning off things like octal numerical increments seems horrible and repellent to new users.

We can do better.

Thanks

Drew for asking me to talk and providing the cw example.
Kris for inspiring me at my first Vim London meetup with Barebones Vim navigation. This showed me so much about Vim that I didn’t know and also that you can do a high quality presentation from Vim. (I hope that one day I’ll be able to meet your standard Kris).

And thanks to you for reading!

Things I wish I’d known about Django

2015-07-18T20:00:00+01:00

This month at The London Django meetup I gave a talk about some of the things I wish I’d known about Django (before I started).

Thanks

My thanks to Piotr, who I mentioned in the talk. For years he told me to check out Django because “it does things right”. Only when I’d truly broken myself with PHP did I take his advice and I haven’t looked back.

Django opened the door for me to:

Python which has become my backend language of choice.
Test Driven Development which has become a way of life.

More Power

When I’m developing for web and using Django, I think it’s possible for me to keep more features live with fewer errors than was previously possible for me with PHP based tools. I put this down to two main things:

The application of tests in my Django projects. I’m particularly happy with the test suite provided by Django. Where I’ve needed to rewire it or patch it, it’s responded well.
The structure that Django provides, without enforcing how projects are built.

Summary

I wish I’d jumped into Django sooner. If you’re thinking about transitioning to it, see what you can do today. I think it pays back in the long run.

Thanks for reading.

A water pouring problem sketched in Python

2015-01-09T10:00:00+00:00

The problem

At the end of last year, I came across the following water pouring problem because something similar had come up in a friend’s functional programming interview:

There are three glasses on the table - 3, 5, and 8 oz. The first two are empty, the last contains 8 oz of water. By pouring water from one glass to another make at least one of them contain exactly 4 oz of water.

Source: A. Bogomolny, 3 Glasses Puzzle from Interactive Mathematics Miscellany and Puzzles, Accessed 09 January 2015

A solution using search, not algebra

At first I started to explore the problem by looking at it algebraically. What are the differences between each cup? How can those differences be summed together to give the required remainder of 4?

However this didn’t yield anything helpful. Instead, I started looking at the solution states. What do the cups have to look like for the puzzle to be solved?

Two success states

There are at least two success states. One where the 5 oz cup contains 4 oz of water and the other where the 8 oz cup contains 4 oz of water. The other two cups must contain the remaining 4 oz of water. This is the notation I’ve used for these two states, where x + y = 4:

[<Cup x/3>, <Cup 4/5>, <Cup y/8>]

And:

[<Cup x/3>, <Cup y/5>, <Cup 4/8>]

The search problem

So taking the second state, and assuming that the first cup is full, then the question becomes:

How do we get from the start state to the end state?

[<Cup 0/3>, <Cup 0/5>, <Cup 8/8>]
...
TODO: Search in here for a path
...
[<Cup 3/3>, <Cup 1/5>, <Cup 4/8>]

This is really helpful. It turns an algebra problem into a search problem. Computers are good at doing search. We can write code for this.

A Python 3 sketch

I’ve written some Python to solve this problem using a type of depth first tree traversal and tree generation strategy.

The code repository is available on GitHub. The README contains the installation and operating instructions.

These are some of the features of the code:

Cup and Game classes

In this code, the Cup class represents a cup in the problem. Each Cup has a certain capacity and contents. The benefit of using a Cup class as a data type is that it can check that it is not holding more than its capacity of water, nor a negative amount of water.

The Game class is more complex as it represents a single state in the tree. Each Game state has three main properties:

cups - The three Cups that make up this Game state.
parent - The Game that came before this one in the search. The starting state will have this as None, but all the rest will have a parent.
children - Each Game will have some or no child Game states stored in a list. These are the valid states that can be made by pouring some or all of one Cup’s contents into another Cup in the Game that haven’t already been seen during the search.

The Game’s children property makes the Game class a recursive data structure because it can contain other instances of Games. This opens the door to the recursive search described below.

Again, using this Game type is really helpful because I’ve been able to write tested supporting functions like __eq__ (which tests if two Game states are logically the same). The most important function in Game is is_solvable, which implements the search.
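
To illustrate the kind of checks described above, here is a minimal Cup sketch. It is not the code from the repository. The constructor signature and error messages are my assumptions, and only the repr format follows the notation used in this post.

```python
class Cup:
    """Illustrative cup with capacity checks (not the repository's class)."""

    def __init__(self, capacity, contents=0):
        if contents < 0:
            raise ValueError('contents can not be negative')
        if contents > capacity:
            raise ValueError('contents can not exceed capacity')
        self.capacity = capacity
        self.contents = contents

    def __repr__(self):
        return '<Cup {}/{}>'.format(self.contents, self.capacity)

    def __eq__(self, other):
        if not isinstance(other, Cup):
            return NotImplemented
        return (self.capacity, self.contents) == (other.capacity, other.contents)


start = [Cup(3), Cup(5), Cup(8, 8)]
print(start)  # [<Cup 0/3>, <Cup 0/5>, <Cup 8/8>]
```

Because invalid cups can’t be constructed, any code that builds new states can rely on every Cup it sees being valid.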

Recursive search

As mentioned above, the Game.is_solvable function implements the tree search, so here it is in full, comments removed.

def is_solvable(self):
    if self.is_goal():
        self.print_trace()
        return True

    if self.make_children() == 0:
        return False

    return self.solvable_child()

There are two base cases to this recursive function.

self.is_goal() : Goal has been reached. This Game contains a Cup that has 4 oz of water, success, a goal state has been found! Return True and print a trace of how the algorithm got here.
self.make_children() == 0 : There are no child states. This Game can not generate any new states that don’t exist in the tree already, so this state is a fail, return False.

When neither of those two base cases is met, then this state is on a “success path” if one of its children “is solvable”. The recursive case is that the Game.solvable_child helper function is then used to call Game.is_solvable on each of the child Games.

Here is the helper function without comments:

def solvable_child(self):
    for child in self.children:
        if child.is_solvable():
            return True

    return False

There are two “interesting” features of this function:

It operates like a short circuited OR reduction. This means that as soon as a solvable child is found, it stops searching and returns True.
It has been split out from Game.is_solvable to assist with unit testing.

This short circuiting feature is important. I wasn’t able to get it to work in a reduce statement on the Game.children, so instead I wrote it out explicitly as a for-loop.
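
An alternative that keeps the short circuit without an explicit loop is the built-in any with a generator expression. The sketch below uses a hypothetical Node stand-in rather than the real Game class, and records which nodes were actually checked.

```python
class Node:
    """Stand-in for a Game state: `solvable` is fixed up front for the demo."""

    calls = []  # records the order in which nodes are checked

    def __init__(self, name, solvable):
        self.name = name
        self.solvable = solvable

    def is_solvable(self):
        Node.calls.append(self.name)
        return self.solvable


children = [Node('a', False), Node('b', True), Node('c', True)]

# any() stops consuming the generator at the first True result
found = any(child.is_solvable() for child in children)
print(found, Node.calls)  # True ['a', 'b']
```

Node 'c' is never checked. By contrast, functools.reduce always walks the whole list, which is one reason it is an awkward fit here.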

Duplicate search

When generating new Games by pouring water from Cup to Cup, only new Game states are added as children of any particular Game. This prevents duplication of Games and ensures that the search will terminate once all different possible states have been generated at the very latest.

The Game.has_game function implements this duplicate search using a recursive depth first tree search.

As much functional style as possible

Originally I intended to write this sketch with as much functional style code as possible. However, there were certainly some functions where this was not possible without some serious hacking, and so I chose to keep those functions as simple and testable as possible.

I’d love to have the time to come back and construct a similar sketch for this problem in Haskell.

Possible improvements and follow up ideas

Apart from a fully functional rewrite, there are a couple of ways that I could see to improve the sketch. Even though it doesn’t run slowly, there are certainly some optimisations that could be made, plus some follow up ideas.

Save time by checking Cups contents when pouring

When generating child Game states by pouring from one cup to another, the system does not care if a Cup has water to give or if the recipient is full. It does the pour and then eliminates the new state because it’s a duplicate of its parent.

Instead, time could be saved by improving the pouring function so that pours only generate new Game states when there is water to give and the destination cup has space for that water.
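
A guarded pour could look something like the sketch below. It assumes states are plain tuples of contents rather than the Cup and Game classes, and the pour function name and signature are hypothetical.

```python
def pour(state, capacities, src, dst):
    """Return the state after pouring src into dst, or None if nothing moves.

    state and capacities are tuples of equal length; src and dst are indexes.
    """
    if src == dst:
        return None
    space = capacities[dst] - state[dst]
    amount = min(state[src], space)
    if amount == 0:
        # Source is empty or destination is full: no new state to generate
        return None
    new_state = list(state)
    new_state[src] -= amount
    new_state[dst] += amount
    return tuple(new_state)


capacities = (3, 5, 8)
print(pour((0, 0, 8), capacities, 2, 1))  # (0, 5, 3)
print(pour((0, 0, 8), capacities, 0, 1))  # None, the 3 oz cup is empty
```

Returning None for a pour that moves nothing means the caller never has to build a state only to discard it as a duplicate of its parent.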

Improve the network anti-duplication search

Searching the existing Game states to ensure that the same state hasn’t already been created first runs to the top of the Game tree, then searches downwards.

Most Game states will be duplicates of a Game that’s either their parent or one Game state away from them. This means there’s an advantage, especially when running bigger problem searches, to search nearest Games first.

Create a `goal` variable

The code could be improved to accept a goal value for the amount of water that should be in a Cup for success to be achieved.
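
As a rough idea of what a goal parameter might look like, here is a compact, self-contained solver. It differs from the code described in this post in two ways: it searches breadth first rather than depth first, and it uses tuples instead of the Cup and Game classes. The solve function and its arguments are hypothetical.

```python
from collections import deque


def solve(capacities, start, goal):
    """Breadth first search for a state where any cup holds `goal`.

    Returns the list of states from start to the goal state, or None.
    """
    parents = {start: None}  # doubles as the "already seen" set
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if goal in state:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for src in range(len(state)):
            for dst in range(len(state)):
                if src == dst:
                    continue
                amount = min(state[src], capacities[dst] - state[dst])
                if amount == 0:
                    continue  # nothing to pour, or no space to receive it
                child = list(state)
                child[src] -= amount
                child[dst] += amount
                child = tuple(child)
                if child not in parents:
                    parents[child] = state
                    queue.append(child)
    return None


path = solve((3, 5, 8), (0, 0, 8), goal=4)
for state in path:
    print(state)
```

Changing goal, or passing more cups in capacities and start, is then enough to try other puzzles. An unsolvable puzzle returns None once every reachable state has been seen.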

Search for bigger solvable problems

Going meta, it would be interesting now to use this code to search for a nice big complicated water pouring problem. What’s the largest number of Cups and steps to success that can be found?

Django Contexts and get

2014-11-17T20:00:00+00:00

Background

If you know me, then you know that I’m an avid tester. It could even be argued that I test too extensively as part of my day-to-day development, but that’s a post for another day.

On a recent project, a particular view started failing with the error:

AttributeError: 'ContextList' object has no attribute 'get'

I wasn’t happy with just changing the tests to work again, so I dug down into why they started failing.

TL;DR

To get a value from a Context object returned by the Django Test Client, it’s better to use the [] operator than the get method.

For example:

# In a test, after doing
response = self.client.get(reverse('home'))

# ... then it's better to use [] to test the context
self.assertEqual(response.context['name'], 'Homer')

# ...than to use get
self.assertEqual(response.context.get('name'), 'Homer')

Reason

It turns out that the problem was to do with a developer on the project changing how the template for the view was generated. They changed a view that was using a single template, to a couple of templates using Django’s template inheritance and the extends template tag.

This then affects how Django’s test client returns the Context object for inspection.

To test this I prepared the following test:

from django.core.urlresolvers import reverse
from django.test import TestCase


class Tests(TestCase):

    def test_get(self):
        response = self.client.get(reverse('home'))
        self.assertEqual(response.context.get('name'), 'Homer')

    def test_operator(self):
        response = self.client.get(reverse('home'))
        self.assertEqual(response.context['name'], 'Homer')

The home view was just a simple template renderer:

from django.shortcuts import render


def home(request):
    return render(request, 'home.html', {'name': 'Homer', })

Simple works

When the ‘home.html’ template is a simple template with no inheritance (it can be completely empty), then both tests pass.

‘home.html’ template code:

<p>Hello World</p>

Test run:

./manage.py test
Creating test database for alias 'default'...
..
----------------------------------------------------------------------
Ran 2 tests in 0.027s

OK
Destroying test database for alias 'default'...

Template inheritance fails with get

Now adjust ‘home.html’ to extend another template ‘base.html’ which has arbitrary contents.

New ‘home.html’ template code:

{% extends 'base.html' %}
<p>Hello World</p>

Test run:

./manage.py test
Creating test database for alias 'default'...
E.
======================================================================
ERROR: test_get (mini.tests.Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/james/active/mini/mmm/mini/tests.py", line 9, in test_get
      self.assertEqual(response.context.get('name'), 'Homer')
AttributeError: 'ContextList' object has no attribute 'get'

----------------------------------------------------------------------
Ran 2 tests in 0.029s

FAILED (errors=1)
Destroying test database for alias 'default'...

So the test_get case, which was using get, failed.
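Why does [] survive the change while get does not? Once more than one template is rendered, the test client hands back a list-like wrapper around the individual contexts rather than a single Context. The sketch below is a simplified stand-in written for illustration, not Django's actual source: the class name `SketchContextList` is hypothetical, and it assumes the wrapper is a list subclass that only teaches [] how to look up string keys.

```python
class SketchContextList(list):
    """Hypothetical stand-in for the list of contexts the test client returns."""

    def __getitem__(self, key):
        # String keys are searched for in each rendered template's context
        if isinstance(key, str):
            for subcontext in self:
                if key in subcontext:
                    return subcontext[key]
            raise KeyError(key)
        # Integer keys and slices behave like a normal list
        return super().__getitem__(key)


# One dict per rendered template: home.html and the base.html it extends
contexts = SketchContextList([{'name': 'Homer'}, {'name': 'Homer'}])

value = contexts['name']             # works via the custom __getitem__
has_get = hasattr(contexts, 'get')   # a plain list has no get method
```

Because the stand-in inherits from list rather than dict, calling `contexts.get('name')` on it raises the same kind of AttributeError as the test run above.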

Conclusion

It’s definitely more robust to use item access [] on Context objects returned by the Django Test Client where possible when checking values passed through to templating.

Thanks for reading.

Current state of Codeship

2014-10-12T18:00:00+01:00

Background

In September I was introduced to Codeship at a presentation by Paul Love. These are some of the learnings I have from using their hosted service primarily for Continuous Integration for the last couple of weeks on a client project.

This post isn’t about advantages and disadvantages of CI or Continuous Delivery, but more focused on Codeship’s offering currently, compared to other CI services I’ve used over the last 6 months or so.

TL;DR

Codeship is relatively new and working hard to stabilise the system while providing documentation and service.

The 100 free builds they provide per month, along with the ability to integrate with Bitbucket, mean that you can run private projects on CI for free, but watch out for their lack of stability.

Positives

Good value

Codeship currently offers 100 builds a month for free on private repos that include Bitbucket and GitHub - this is GOOD. It’s hard to find hosted CI systems that will work with non-GitHub repository hosting.

Good support

Codeship’s support team know their stuff.

I’ve had multiple discussions with Codeship’s support via email and Twitter and I’ve found that they get back to you with timely email replies and suggestions about why things are breaking. Great for a free service!

Good speed

Builds are QUICK! So quick that I thought they weren’t using a real DB and wrote a test to prove that they were listening to the settings I’d pushed specifically for Codeship.

Negatives

Unfortunately the good freeness above comes with some disadvantages.

False negatives

Some of my builds have received false negatives, meaning they have failed when they shouldn’t have. This is annoying but manageable and can often be solved by re-running the build.

For example, I’ve had a couple of builds happen on an instance where the database wasn’t available when the tests started running, so everything went RED. Slack then went RED. Client team start worrying.

Codeship’s solution for this was “Add sleep 3 to make the tests wait 3 seconds before running”. This works but why isn’t Codeship’s instance build process checking this before starting the run?
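An alternative to a fixed sleep is to poll until the database port accepts connections. This is a minimal sketch and not something Codeship provides: the function name `wait_for_port` and the host and port in the usage comment are assumptions chosen for illustration (3306 is MySQL's default port).

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not ready yet: wait a little and try again
            time.sleep(interval)
    return False


# Hypothetical usage before starting the test run:
#     if not wait_for_port('127.0.0.1', 3306):
#         raise SystemExit('database never became available')
```

Unlike sleep 3, this only waits as long as it needs to, and it fails loudly if the database never comes up.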

False positives

Builds have received false positives. They have passed when they shouldn’t have - and this is far worse than a false negative.

This week a build on Codeship had git submodule init fail, but the build didn’t go red. The same condition happened last week and the build did fail. While I’m writing this, Codeship are looking into the problem.

Scarce documentation

It’s not very clear how to work with teams, amongst other topics. Team members can’t see how many remaining free builds are available each month so they can’t see if they’re using up the allowance of free builds quickly or not.

Artefacts need special attention

It’s hard to get build artefacts off Codeship’s build instances since the system burns the instance once the build is complete regardless of result. This means if you don’t push artefacts off the server yourself with a script, they’re gone.

Compare with CircleCI - they keep build artefacts in a special folder available for further processing, downloading, or access after the build is complete.

Test commands are in Codeship not repo

A minor gripe I have with the Codeship system is that the test setup and run commands are stored in the configuration for the project on their site. With Travis for example, the server runs the commands it finds in .travis.yml.

The benefit of putting the test commands in the repo is that they can be easily coordinated with git commits - if you want to change how something is run, you can do it all in the code, commit and push.

The Codeship way means copy-and-pasting run commands up onto their website and then pushing new code to be tested with those settings. It makes it hard to prove which branch was run with which settings, or if things were changed to make builds pass.

Summary

Most concerning are the false negatives and positives on build. It’s very important that a dev team can trust their CI/CD service 100%. These are the results for the current Django project I’m building on Codeship:

\begin{equation*} \frac{1\ \text{False Negative} + 1\ \text{False Positive}}{37\ \text{Total Builds}} \approx 5\%\ \text{Incorrect Results} \end{equation*}

These are the false builds that I’m aware of - there might be some that have gone unnoticed. For me, a figure of 95% success makes Codeship ‘just’ stable enough for work, but it’s free and works with Bitbucket and that’s a massive positive. I would be disappointed if we were on a paid account and receiving the same instability.

For the future, if they can stabilise the builds and document the system then they could become the go-to CI service for teams on Bitbucket.

Project Background

I’m running 125 tests in around 15s on a Python (2.7) Django (1.7) project that makes integrated API calls to Dropbox, sits on top of MySQL, runs coverage and flake8.

Thanks for reading.

Flat designs to website specification - a checklist

2014-03-29T16:00:00+00:00

Isolated design has problems

Since many web projects are approached from the visual aspect, often the seen elements are designed first. This can be fine if it’s integrated with well thought out feedback from developers, but can create more work for the project if it’s completed and signed off in isolation.

Results in more development work when a design has serious development implications compared to a slightly different solution that would also have been acceptable.
Work estimates will be less accurate since the flat designs just start to scratch the surface of the development required - they are not a full specification.
Can lead to frustration within the design team as they are asked to redesign elements during the production process that they thought were already signed off.
Potentially leads to uncertainty with the client if ‘signed off’ designs are presented again for sign off with changes that were not previously foreseen.

How to use this checklist

Next time you see a software project being discussed just via flat designs, let your alarm bells ring. Open up conversation about the features on this list to break down the isolation that the design team is operating in.

For Developers

This basic list can start a journey of specification exploration. Start to ask questions about all these features before you agree on a specification or timeline since some of these items can become heavy or project-affecting.

This list is very back-end focused, but hopefully can be helpful for front-enders too.

For Website owners

If you’re being asked to sign off a project on flat designs alone, then it might be beneficial to check that the team you’re working with have these aspects of the development on their radar. They might not have the answer to them yet, but should have a plan to find them.

For Designers

You are often stuck in the middle of the process. Continue to involve your development team: they will be able to point out things that will require more effort to build before your client signs off the visuals, saving the project work in the long term.

You can help to ensure you get good value feedback by asking them questions about items on this checklist - ensure they’re not lulled into a false sense of “it’ll be easy” by your fantastic design work!

The Checklist

The following details are often missing from flat web designs, but should be provided in a full specification.

Page titles - Standard and often missed by designers that use a generic image of a web browser frame to wrap their designs. Loved by content managers and search engines. Do they have a format, and can they be auto-generated? Does the content database need an extra field?
Hidden HTML data - What about all the data in the HTML <head>? Meta description, icon, Facebook data? Also, for each image, what will the <img> alt text be?
URLs - Also missed by designers when using generic browser frames, what are the URLs of each page being shown? Remember to check the URLs of pages that have pagination.
Data field requirements - Designs often show the ‘best case’ for content, but ensuring good data is entering the database is essential for a successful web project. What are the limits - shortest names? Longest ones? Are spaces allowed? Should content be trimmed? Emails should be validated, but on what level? Semantically, or with a request to a DNS or mail server?
Form fields, validation and error messages - If you’re looking at a design that shows a web form, are there error messages in the design? Expanding on the requirements for the data above, what will happen to the form when invalid data is entered? How will fields be flagged for errors? Which data elements will be sent back to the form and which will be cleaned out? Are there any fields (like address) that need to be localised?
User sign up requirements - If the site will be accepting registrations, what will users need to provide to register? Email address? User name (how long)? Are there any blocked words in user names (like ‘admin’, the project brand name or profanities)? Password (how long)? Are there any password strength requirements? How will users reset their passwords?
Transactional emails - What emails will need to be sent by the system? What is their content and design? Can users manage these notifications?
Security - Will the site have any functions that will protect the data of its users? For example, will the user login page throttle access on multiple incorrect passwords? Will HTTPS be required?
Private data - Since flat designs will show the public end user view of a project, what data is hidden from the user but essential for the project? Latest login dates? Number of logins? IP address of last visit or registration? Banned, active, subscriber flags? Active or dormant flags on content?
Click and hover behaviours - What will happen when elements like links are hovered? Are there any menu functions that are hidden behind clicks? Are there any titles to be shown when the user hovers an item?
Error pages - What is the design for the 404 page? What about 503 and any other error pages? Will there need to be a holding page when the site is being updated and is offline?
Analytics configuration - How should the analytics be configured to track behaviours on the site? Is it required? Will a simple configuration suffice or will there need to be funnels and/or events configured? Analytics can be complex enough to require as much work as the original build out of a project, so ensuring that the specification is defined and covers the business needs early is a benefit.
Translation requirements - Will any of the content in the designs require translation? This also affects the items mentioned above in the checklist. Remember that any image elements that have been prepared that contain text will need to be generated in each target language - will each of those translated texts fit within the design?
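To show how the questions under "Data field requirements" and "User sign up requirements" turn into explicit decisions, here is a minimal sketch of a user name check. Every rule in it is an assumption made up for illustration (the length limits, the blocked words, the choice to trim whitespace and the function name `validate_username`); a real specification would supply its own answers.

```python
BLOCKED_WORDS = {'admin', 'root'}   # hypothetical blocked list
MIN_LENGTH, MAX_LENGTH = 3, 30      # hypothetical limits


def validate_username(raw):
    """Return (cleaned_name, errors) for a submitted user name."""
    name = raw.strip()  # decision: surrounding whitespace is trimmed
    errors = []
    if not MIN_LENGTH <= len(name) <= MAX_LENGTH:
        errors.append('length must be between 3 and 30 characters')
    if ' ' in name:
        errors.append('spaces are not allowed')
    if any(word in name.lower() for word in BLOCKED_WORDS):
        errors.append('contains a blocked word')
    return name, errors
```

Each branch corresponds to a question that a flat design cannot answer on its own, and each error string is a message that the form design would need to display somewhere.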

Feedback and thoughts

I hope that the list above helps someone who’s working through the design of a site. The projects I’ve worked on where developers and stakeholders were involved early in the design stages have always been successful.

If there are items you think should be added you can contribute on GitHub.

Thanks for reading.

Seinfeld method and coding

2014-01-31T16:00:00+00:00

This presentation focuses on using the Seinfeld method “Don’t break the chain” to get going with personal projects. Those could be all types of things from learning the piano, a new programming language or achieving a personal goal.

I believe that the Seinfeld method can help break down some of the blockers that we experience when procrastinating, by forcing us to refactor large, unmeasurable and daunting tasks into mini-tasks which are the opposite - achievable, simple and regular. It also helps us to refocus on continual small steps rather than the big picture.

I’m especially keen on how the regular measurement of time spent on a project can give insight, and so I’ve started combining Seinfeld with Pomodoro Technique.

The most important thing is to “make it work for you” - there are all sorts of ways that these techniques can be used to push a project forward or help you to achieve your goal.

Hope that’s helpful!

Thanks to Victor at ustwo London for asking me to talk at their Tech Thursday.

Read more on Seinfeld Technique and Pomodoro Technique. I’m currently using tomatoist as my online pomodoro timer.

Update 11/05/2018: I’ve been using a local install of this HTML Pomodoro timer for the last couple of years now. I feed the JSON report that it creates into a couple of processors that count my hours and measure my efficiency. This means that I can ensure that I’m working the right number of hours per week, but also taking a good number of breaks.

Python generators and yield

2013-12-14T16:00:00+00:00

It started with an interview

Last week in an interview for a Django developer job, I was asked:

thing = (x**2 for x in xrange(10))

What is the type of thing?

Although I was able to identify that the type is dependent on the () around the list-comprehension-like-construction, I didn’t know the exact type that thing would be.

The answer is a generator.

This post shows some of the functionalities of generators and how they can be used in Python control flow.

Generator expressions

Generators can be created with generator expressions. A generator expression is a bit like a list comprehension. A list comprehension uses square brackets []. In Python…

>>> [x**2 for x in range(10)]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

A generator expression is a shortcut that allows generators to be created easily with a similar syntax - this time it’s using parentheses ().

>>> (x**2 for x in range(10))

<generator object <genexpr> at 0x2fa5eb0>

Generators are iterators

Generators “provide a convenient way to implement the iterator protocol”.

In Python, an iterator provides two key methods, __iter__ and next, so a generator itself must provide these two methods:

>>> my_gen = (x**2 for x in range(10))
>>> my_gen.__iter__()
<generator object <genexpr> at 0x293c3c0>

__iter__ is there and returns the generator, now for next…

>>> my_gen.next()
0
>>> my_gen.next()
1

Therefore next works. We can keep hitting until…

>>> my_gen.next()
81
>>> my_gen.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-19-b28d59f370d8> in <module>()
----> 1 my_gen.next()

StopIteration:

A StopIteration is raised - so the generator does everything we’d expect it to by the iterator protocol.
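The sessions above are from Python 2. On Python 3 the method is named __next__ and is usually called through the built-in next(). A minimal sketch of the same walk-through, assuming Python 3 and a shorter range so that it runs out quickly:

```python
my_gen = (x**2 for x in range(3))

# iter() calls __iter__, which returns the generator itself
same_object = iter(my_gen) is my_gen

# next() calls __next__ and produces one value at a time
values = [next(my_gen), next(my_gen), next(my_gen)]

# Once the values run out, StopIteration is raised
try:
    next(my_gen)
    exhausted = False
except StopIteration:
    exhausted = True
```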

Building a generator with yield

Although it’s not clear from the example above, a generator is able to relinquish control and return a value - while saving its state. It then allows the control to pass back to the structure that called it, until it’s called again, picking up where it left off.

This allows for loops over sets of values to be programmed, without the full list of values being calculated first. A generator can be used so that next is called before each iteration required.

In this way, only the values required for each iteration need to be computed.
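A small sketch can make the laziness visible. It assumes Python 3, and the helper `square` and the list `calls` are hypothetical names that exist only to record which values were actually computed.

```python
from itertools import islice

calls = []


def square(n):
    # Record every value that really gets computed
    calls.append(n)
    return n ** 2


# Describes a million squares, but computes none of them yet
lazy_squares = (square(n) for n in range(10**6))
computed_before = len(calls)

# Ask for just the first three values
first_three = list(islice(lazy_squares, 3))
```

After this runs, `calls` holds only the three inputs that were asked for; the remaining values are never calculated unless something requests them.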

The yield keyword - simple example

Adding yield to a function allows for generators to be constructed ‘manually’.

At its very simplest, a function could be written just to generate a single value. However, to show that a generator can return to its previous state when called again, let’s make one with two values. For example…

def two_things():
    yield 1
    yield 'hi'

Now we can make an instance of the generator.

>>> my_things = two_things()
>>> my_things
<generator object two_things at 0x31d0960>

And we can ask for the next value.

>>> my_things.next()
1

Now when we call next again, our generator continues from the state of the last yield.

>>> my_things.next()
'hi'

So you see how different values can be returned, one after the other.

And after that second thing, the generator now raises a StopIteration, since it has no further values to return.

Since a generator implements the iterator protocol, it can be used in a for statement and therefore in a list comprehension. This makes for a convenient way to check the values of a limited generator like this one.

>>> [x for x in two_things()]
[1, 'hi']

More complex example with yield

So let’s write Fibonacci as a generator. I’m going to start with doctests to create the definition of the function, then put the code at the end.

What I like about the doctests in this example is that in 3 fib is tested with next, but in 4 it’s tested using a list comprehension.

def fib(last):
    """

    1.  Creates a generator
    >>> type(fib(0))
    <type 'generator'>

    2.  fib(0) just generates 0th value (1)
    >>> zero_fib = fib(0)
    >>> zero_fib.next()
    1
    >>> zero_fib.next()
    Traceback (most recent call last):
    ...
    StopIteration

    3.  fib(1) creates a generator that creates 0th (1) and 1st (1) values of
        fib seq
    >>> one_fib = fib(1)
    >>> one_fib.next()
    1
    >>> one_fib.next()
    1
    >>> one_fib.next()
    Traceback (most recent call last):
    ...
    StopIteration

    4.  fib(10) generates the first 11 fibonacci numbers (0th to 10th)
    >>> [x for x in fib(10)]
    [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

    """
    result = 1
    x = 0
    a = 1
    b = 0

    while x <= last:
        yield result

        result = a + b
        b = a
        a = result
        x += 1
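For comparison, here is a more compact sketch of the same generator using tuple assignment. It is an alternative written for illustration, assuming Python 3, and the name `fib_compact` is hypothetical.

```python
def fib_compact(last):
    """Yield the 0th to the last-th Fibonacci numbers, starting 1, 1."""
    a, b = 1, 1
    for _ in range(last + 1):
        yield a
        # Advance the pair in one step, so no temporary variable is needed
        a, b = b, a + b


first_eleven = list(fib_compact(10))
```

The state that the generator saves between calls is just the pair (a, b) and the loop counter, which is why the body can stay this small.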

That’s all - have fun with generators!

Git: To squash or not to squash?

2013-11-19T11:00:00+00:00

It started with a Tweet

Over the weekend I spotted a tweet from Oliver…

To squash features into develop, or not to squash features into develop?
— Oliver Caldwell (@OliverCaldwell) November 15, 2013

And I jumped straight in with…

@OliverCaldwell Squash, but keep detailed commit messages. Unless you have a particular use-case / reason not to.

November 17, 2013

Then, as part of our following conversation, I drew a picture:

This is how I see it. Better to keep the direct route rather than the “how we got here”.

November 18, 2013

But…

It’s about more than just squashing

What I realised while writing this post and experimenting with git is that the issue is not as simple as “Squash? Yes / No?”

Variables to consider include:

How you record your commit messages on your squashed commit. This affects the impact of history loss - good commit messages and/or external ticketing / dev tracking mean it’s less important.
Whether you push your feature branches for other developers, or between your dev boxes, to share. Do you need to keep the shared history between machines?
The velocity of your project. How long do you need to keep history for? Do bugs show up regularly?

TL;DR Simple project. Squash = Yes

For a simple project with no sharing between devs required and regular releases, squashing features seems like a good idea if you:

Keep detailed commit messages when you squash.
Use git rebase to squash your features’ commits into a candidate branch and merge that in to dev or master depending on your SCM strategy.
Only push your squashed features to keep origin clean and easy to understand.
Keep your feature branches if you want. But, if you delete them git will keep your commits in the reflog for 30 days by default.

Keeping a detailed history

One of the issues that Oliver raised was about losing history.

@jamesfublo I suppose so. Squashing just feels like you're killing off that fine grained history, like when was that two line change made.
— Oliver Caldwell (@OliverCaldwell) November 18, 2013

So, since I advocate squashing and branch deletion, I’m therefore suggesting that the reflog is used to recover detailed history in the local repository if required.

So let’s explore how much history is actually kept…

From the docs:

Reflog is a mechanism to record when the tip of branches are updated.

This means that…

Every commit that every branch in your local repository has ever pointed to is kept in the reflog.

And this even includes branch switching…

HEAD reflog records branch switching as well.

Sounds very warm and cozy, BUT there are conditions, so let’s do a practical experiment with a test repository.

Experiment: Squashing with rebase and keeping history

Make a repository with an initial commit.

$ git init

Create a README.md file and put a line of text into it and commit - this commit is called A.

$ cat > README.md
First line of readme file
^C
$ git add README.md
$ git commit

Current git tree status:

A   <-master

Work on feature

In a new branch, we create a feature to update the README with two new lines and to delete the first line.

$ git checkout -b feature-a

# First feature commit (B)
$ cat >> README.md
Add a second line
^C
$ git add README.md
$ git commit

# Second feature commit (C)
$ cat >> README.md
Add a third line
^C
$ git add README.md
$ git commit

# Third feature commit (D)
$ vim README.md
# Remove first line and save
$ git add README.md
$ git commit

Current git tree status:

A   <-master
 \
  B--C--D   <-feature-a

Check progress in reflog

Checkout master.

$ git checkout master

Let’s check the reflog.

$ git reflog

8e48d1d HEAD@{0}: checkout: moving from feature-a to master
262057a HEAD@{1}: commit: D: Remove first line
9efbf73 HEAD@{2}: commit: C: Add a third line
f2503d5 HEAD@{3}: commit: B: Add a second line
8e48d1d HEAD@{4}: checkout: moving from master to feature-a
8e48d1d HEAD@{5}: commit (initial): Make readme

Newest stuff pops out first:

HEAD@{0} - Checkout from feature-a to master is recorded.
HEAD@{1} to HEAD@{3} - our feature-a commits (D, C and B).
HEAD@{4} - Checkout of the feature-a branch.
HEAD@{5} - Initial commit.

Squash commits into candidate branch

feature-a is ready to bring into master. Let’s first clean up our history by doing an interactive rebase. We use a candidate branch for this work because it’s a nice safety net which can help with testing.

$ git checkout feature-a
$ git checkout -b feature-a-candidate

Current git tree status:

A   <-master
 \
  B--C--D   <-feature-a <-feature-a-candidate

$ git rebase --interactive master

Let’s squash our three commits into one.

pick f2503d5 B: Add a second line
squash 9efbf73 C: Add a third line
squash 262057a D: Remove first line

And now we merge together the three commits, describing the activity that took place. We keep the messages so that history is clean, but informative. We also include a reference to the ticket we are working against:

Updating README.md as per #ticket

* Add a second line
* Add a third line
* Remove first line

Check reflog again:

$ git reflog

d0445b2 HEAD@{0}: rebase -i (finish): returning to refs/heads/feature-a-candidat
d0445b2 HEAD@{1}: rebase -i (squash): Updating README.md as per #ticket
362b6ef HEAD@{2}: rebase -i (squash): # This is a combination of 2 commits.
f2503d5 HEAD@{3}: checkout: moving from feature-a-candidate to f2503d5
262057a HEAD@{4}: checkout: moving from feature-a to feature-a-candidate

The reflog shows us that there is a new commit d0445b2, we’ll call this E. This is the commit that results from the rebase and leaves the tree looking like:

A   <-master
|\
| B--C--D   <-feature-a
\
 \
  E   <-feature-a-candidate

This is a good stage to test everything, to check that your tests are what you expect them to be, and to ensure that no information has been lost.

Merge onto master

The new commit E is the patch for our feature which we now merge onto master.

$ git checkout master
$ git merge feature-a-candidate

Updating 8e48d1d..d0445b2
Fast-forward
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

The tree:

A--E   <-master <-feature-a-candidate
 \
  B--C--D   <-feature-a

Push

At this stage the feature would usually be pushed to a branch on origin.

$ git push origin master

Note that we’ve only shared the squashed E commit, not B, C or D in the feature-a branch.

Cleanup

We can then clean up our working branches. First the candidate.

$ git branch -d feature-a-candidate

This leaves us with a tree like:

A--E   <-master
 \
  B--C--D   <-feature-a

Keeping history

As Oliver noted, the feature-a branch can just be kept by the developer in their local repository to preserve the full history - that is certainly an option.

@jamesfublo I suppose you can still keep the unsquashed branches in the repository. I never used to squash, but I might start.
— Oliver Caldwell (@OliverCaldwell) November 18, 2013

However, I prefer a clean working repository so I like to delete the feature-a branch.

Clean up the feature branch

When deleting the feature-a branch git requires the -D flag to force the deletion. git does not work out that E is equal to B, C and D combined, so thinks that history could be lost.

$ git branch -D feature-a

Deleted branch feature-a (was 262057a)

This leaves a tree like:

A--E   <-master
 \
  B--C--D

B, C and D are now hanging commits.

Check reflog.

$ git reflog

This is a part of it:

...
262057a HEAD@{12}: commit: D: Remove first line
9efbf73 HEAD@{13}: commit: C: Add a third line
f2503d5 HEAD@{14}: commit: B: Add a second line
...

The development commits from the feature development are still available and could be checked out into detached HEAD state and inspected, played with, rebranched. Let’s try that.

$ git checkout 262057a

Now play and explore as much as you want.

When you’re ready, move back to master.

$ git checkout master

And git warns us that we’ve left behind our hanging commits:

Warning: you are leaving 3 commits behind, not connected to
any of your branches:

  262057a D: Remove first line
  9efbf73 C: Add a third line
  f2503d5 B: Add a second line

If you want to keep them by creating a new branch, this may be a good time
to do so with:

 git branch new_branch_name 262057a
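The experiment can also be replayed in a throwaway repository from a script. This is a minimal sketch under several assumptions: it needs Python 3 and a git executable on the PATH, it makes only two feature commits (B and C) to stay short, and it squashes with git reset --soft followed by a single commit instead of an interactive rebase, because the interactive rebase needs an editor. The helper names `git`, `commit_line` and `squash_experiment` are hypothetical.

```python
import os
import shutil
import subprocess
import tempfile

# Fixed identity so commits work without any global git configuration
GIT_ENV = dict(
    os.environ,
    GIT_AUTHOR_NAME='Demo', GIT_AUTHOR_EMAIL='demo@example.com',
    GIT_COMMITTER_NAME='Demo', GIT_COMMITTER_EMAIL='demo@example.com',
)


def git(repo, *args):
    """Run one git command inside repo and return its trimmed output."""
    result = subprocess.run(
        ['git', '-c', 'commit.gpgsign=false'] + list(args),
        cwd=repo, env=GIT_ENV, check=True,
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.strip()


def commit_line(repo, line, message):
    """Append a line to README.md, commit it and return the new commit id."""
    with open(os.path.join(repo, 'README.md'), 'a') as readme:
        readme.write(line + '\n')
    git(repo, 'add', 'README.md')
    git(repo, 'commit', '-m', message)
    return git(repo, 'rev-parse', 'HEAD')


def squash_experiment():
    """Return (feature_commits, reflog_commits), or None if git is missing."""
    if shutil.which('git') is None:
        return None
    repo = tempfile.mkdtemp()
    try:
        git(repo, 'init')
        # Name the first branch master whatever the local default is
        git(repo, 'symbolic-ref', 'HEAD', 'refs/heads/master')
        commit_line(repo, 'First line of readme file', 'A: Make readme')
        git(repo, 'checkout', '-b', 'feature-a')
        feature = [
            commit_line(repo, 'Add a second line', 'B: Add a second line'),
            commit_line(repo, 'Add a third line', 'C: Add a third line'),
        ]
        # Squash B and C into a single commit E on a candidate branch
        git(repo, 'checkout', '-b', 'feature-a-candidate')
        git(repo, 'reset', '--soft', 'master')
        git(repo, 'commit', '-m', 'Updating README.md as per #ticket')
        git(repo, 'checkout', 'master')
        git(repo, 'merge', 'feature-a-candidate')
        git(repo, 'branch', '-D', 'feature-a')
        # Every commit HEAD has pointed to, newest first
        reflog = git(repo, 'log', '-g', '--format=%H').splitlines()
        return feature, reflog
    finally:
        shutil.rmtree(repo, ignore_errors=True)


result = squash_experiment()
```

When git is available, both feature commit ids should still appear in the reflog list even though no branch points at them any more.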

How long are hanging commits kept?

But how long will these unreachable commits hang around for?

We can decide!

Hanging commits are removed from the local repository by garbage collection, known as gc, or by manual removal.

There are various settings which gc will use to determine which commits should be cleaned before the repository is repacked.

gc.reflogExpireUnreachable tells gc how long hanging commits should be left in the repository. Default value is 30 days. Adjust this to a value that you feel comfortable with. You can make that setting on any of the normal levels - global, system or local.

Hey - you want to keep all history in the reflog for ever? Here’s a setting:

[gc]
    reflogExpire = never
    reflogExpireUnreachable = never

I’m happy with the 30 day default myself!

For a more detailed explanation, check out the Configuration section of the git-gc man page.

A manual clean

Just for experimentation, I tried to clean the repository of the B, C and D hanging commits. This was challenging because my default settings prevented reflog and gc from performing the clean; however, I found this SO answer helpful.

$ git reflog expire --all --expire-unreachable=0
$ git repack -A -d

Repacking occurred. Now check reflog.

$ git reflog

d0445b2 HEAD@{0}: merge feature-a-candidate: Fast-forward
8e48d1d HEAD@{1}: checkout: moving from feature-a-candidate to master
d0445b2 HEAD@{2}: rebase -i (finish): returning to refs/heads/feature-a-candidat
d0445b2 HEAD@{3}: checkout: moving from master to feature-a
8e48d1d HEAD@{4}: commit (initial): Make readme

There are now only two commits in the repository:

8e48d1d - Initial commit A @ 1 and 4.
d0445b2 - Feature commit E made by the rebase @ 0, 2 and 3

The cleaned repository now looks like:

A--E   <-master

So fresh and so clean!

Summary

At the end of the day, the dev team (even if that’s just you on a weekend project) decides how best to keep history and share features.

My general solution is for:

Squashed single-commit features.
Detailed commit messages created at squash-time.
Devs keep more history locally, either with branches or in a long-life reflog.
Devs back up their repositories and don’t rely on origin.

Remember there can be a full 30 day history (or longer depending on the gc.reflogExpireUnreachable setting) in the local repo which hasn’t been pushed to origin. It’s this history that could save your bacon one day - so consider backing it up!

Happy source code management!

Update 23/08/2018

See also this comment on GitHub from Curt J. Sampson with some great points about when not to squash. One helpful excerpt:

I think of a set of commits I’m proposing for master branch as a story I’m telling to the other developers. Make the story as clear as possible, divided up into reasonably small chunks where you can do so. This will make other developers love, rather than hate, reviewing your code.

Thanks Curt - spread the love!

Update 06/01/2019

The Twitter account that I used in my conversations with Oliver above has been deleted. I’ve replaced the links to tweets with the original content.

vi-nature everywhere - lightning talk

2013-11-01T20:00:00+00:00

vi-nature, the ‘language’ of vim. It’s the reason that vim works so well for me. However, it does take some learning, and even after many months of use, I’d say I’ve only just scratched the surface.

So if we’re investing so much time and energy in learning this language, then why not apply it to more tasks than just editing files?

In this five minute lightning talk I gave at Vim London this week, I delved into some of the benefits and issues with using vi-nature for more than just editing.

The feedback after the talk was great - here are my take-aways:

Check out uzbl - it provides an interface layer that can be programmed to different keybindings. Thanks Nestor.
Check out Awesome Window Manager because it’s completely operational without a mouse. Thanks Nestor.
Write a blog post about ‘vi-nature’ because there’s not much about it on the web - Yes I will do this, thanks for the suggestion Max.
Check out Mac OSX’s slate because it creates a programmable keyboard interface for window management. Thanks David.

Lots to follow up on and hopefully some ways to take vi-nature to more places.

Thanks for reading!

Things to remember about decorators

2013-10-22T20:00:00+01:00

After an interview question about Python decorators which I stumbled over, I promised myself that I would improve my knowledge of this metaprogramming technique.

These are my notes to myself on decorators - maybe they’ll be helpful to someone else who’s improving their knowledge of decorators too.

A decorator is pure Pythonic syntactic sugar.
A decorator is a Python callable that receives the decorated function and returns a new function in its place.

For example, if there is a decorator called my_decorator and we want to decorate my_func then…
```
@my_decorator
def my_func():
    """some stuff"""
    ...
    return
```
Is equivalent to writing.
```
def my_func():
    """some stuff"""
    ...
    return
my_func = my_decorator(my_func)
```
The decorator callable is executed at load time, not at execution time. Here is an example of a silly decorator that prints “Hello World” when the Python file is loaded - there is nothing else in the file.

hello.py
```
def say_hello(func):
    print('Hello World')
    return func

@say_hello
def nothing():
    # Do nothing just return
    return
```
Run it on the command line, and “Hello World” appears when the nothing function is decorated.
```
$ python hello.py
Hello World
```
When writing a decorator, remember to patch over the docstring of the wrapped function. This can be done by accessing the passed function’s __doc__ attribute. Failing to do so will prevent doctest from testing the decorated function.
```
def my_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    # Pass through the doc string
    wrapper.__doc__ = func.__doc__
    return wrapper
```
Update: this is actually far better done with the wraps decorator from the functools module, which sets the __name__ and __doc__ attributes to what we’d expect them to be.
```
from functools import wraps

def my_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper
```
Found on Improve your Python.
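As a quick check of what wraps changes, here is a minimal sketch comparing a decorator with and without it. The names plain_decorator, wrapping_decorator, first and second are made up for this illustration.

```python
from functools import wraps


def plain_decorator(func):
    # No wraps: the returned function keeps the wrapper's own metadata
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def wrapping_decorator(func):
    # wraps copies __name__, __doc__ and friends from func onto wrapper
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain_decorator
def first():
    """Docstring of first."""


@wrapping_decorator
def second():
    """Docstring of second."""


print(first.__name__, first.__doc__)    # wrapper None
print(second.__name__, second.__doc__)  # second Docstring of second.
```

Without wraps the decorated function reports itself as wrapper and loses its docstring, which is why doctest can no longer find its examples.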

When unit testing decorators, one strategy can be to manually call the decorator on a mocked object and inspect how it interacts with it.

Here’s a caching function which is used to speed up the Fibonacci series.

```
def cache(func):
    # Keep a dict of values returned already
    vals = {}

    def wrapper(x):
        # dict.has_key() only exists in Python 2; `in` works everywhere
        if x not in vals:
            vals[x] = func(x)
        return vals[x]

    wrapper.__doc__ = func.__doc__

    return wrapper
```

Now use the cache function as a decorator.

```
@cache
def fib(x):
    """Fibonacci series

    >>> fib(1)
    1
    >>> fib(2)
    2

    """
    if x < 0:
        raise ValueError('Must be greater than or equal to 0')
    elif x == 0:
        return 1
    elif x == 1:
        return 1
    else:
        return fib(x - 1) + fib(x - 2)
```

And here’s a unittest that asserts that the cache function only allows calls through when there is no value saved in the vals dict.

```
import unittest
from mock import Mock

class TestCacheDecorator(unittest.TestCase):

    def test_cache(self):
        my_fn = Mock(name='my_fn')
        my_fn.return_value = 'hi'

        wrapped = cache(my_fn)
        # First call gives a call count of 1
        self.assertEqual(wrapped(3), 'hi')
        self.assertEqual(my_fn.call_count, 1)

        # Second call keeps the call count at 1 - the cached value is used
        self.assertEqual(wrapped(3), 'hi')
        self.assertEqual(my_fn.call_count, 1)

        # Subsequent call with a new value increases the call count
        self.assertEqual(wrapped(7), 'hi')
        self.assertEqual(my_fn.call_count, 2)
```

Make sure the scope of variables used in decorators is correct. Simeon Franklin has written an interesting article about decorators and scope.

If in doubt, extend any tests to test a second decorated function and ensure that the two functions do not affect each other.

Below is a test that aims to check that cache dictionaries are not shared between instances of the cache decorator. It is appended to the test_cache test above.

```
# Check that the vals dict isn't shared with other decorated functions
my_other_fn = Mock(name='other fn')
my_other_fn.return_value = 'other hi'
# Create other wrapped function
other_wrapped = cache(my_other_fn)
self.assertEqual(other_wrapped(7), 'other hi')
self.assertEqual(my_other_fn.call_count, 1)
# The original function has not been additionally called, its
# call count remains 2
self.assertEqual(my_fn.call_count, 2)
```
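To illustrate why scope matters here, this is a small sketch of the bug the test guards against. The names leaky_cache, scoped_cache and shared_vals are invented for the example and are not part of the code above. The only difference between the two decorators is where the dict lives.

```python
# Module-level dict: every function decorated with leaky_cache shares it
shared_vals = {}


def leaky_cache(func):
    def wrapper(x):
        if x not in shared_vals:
            shared_vals[x] = func(x)
        return shared_vals[x]
    return wrapper


def scoped_cache(func):
    # Created once per decoration, so each decorated function gets its own
    vals = {}

    def wrapper(x):
        if x not in vals:
            vals[x] = func(x)
        return vals[x]
    return wrapper


@leaky_cache
def double(x):
    return x * 2


@leaky_cache
def triple(x):
    return x * 3


@scoped_cache
def safe_double(x):
    return x * 2


@scoped_cache
def safe_triple(x):
    return x * 3


print(double(5))       # 10
print(triple(5))       # 10 - wrong, served from double's cached value
print(safe_double(5))  # 10
print(safe_triple(5))  # 15
```

A test that only exercises one decorated function would pass for both versions, which is why testing a second decorated function is worth the extra lines.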

All suggested tips on decorators very welcome - thanks for reading!

Calculating your day rate for spare time freelance work

2013-07-14T20:00:00+01:00

So you want to do some freelance work and you’re not sure how much to charge your new client. The most important thing is to not underestimate your value - it frustrates me so much when I hear of a talented coder selling themselves short.

The calculation

When you work for yourself, it’ll be like your day job, except you keep the profit and take the additional time and cost overheads.

This calculation has worked well for me in the past, so I’m sharing it here. It’s so simple. I hope it can work for you.

\begin{equation*} \text{Your Day Rate} = \frac{2 \times \text{Your Annual Salary}}{252 - \text{Number Of Days Holiday}} \end{equation*}

Where:

‘2’ is my freelance multiplier.
‘252’ is the number of working days in a year - an estimate.
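
As a worked example, the calculation can be sketched in a few lines of Python. The function name day_rate and the figures used (a 40,000 salary and 25 days of holiday) are made up for illustration, so plug in your own numbers.

```python
def day_rate(annual_salary, days_holiday, multiplier=2, working_days=252):
    """Day rate from the formula above.

    multiplier is the freelance multiplier and working_days is the
    estimated number of working days in a year.
    """
    # float() keeps the division exact on Python 2 as well as Python 3
    return float(multiplier) * annual_salary / (working_days - days_holiday)


# Hypothetical figures: 40,000 salary with 25 days of holiday
print(round(day_rate(40000, 25), 2))  # 352.42
```

Both the multiplier and the working day estimate are keyword arguments, so they are easy to adjust.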

“Why is this based on my current salary?”

I assume you will be doing some work similar to your day job. This means that you can use your usual salary as a base unit for calculating your day rate.

If you don’t have a day job and all your income will be from self-employment, then I would guess that you have an idea of what your employed market value would be in the kind of business you’ll be selling your services to.

“Why is the multiplier 2?”

Remember, the value of the work that you provide a company is greater than the amount that you are paid:

In the UK, your employer pays an employment tax - Employers’ National Insurance Contributions.
Your employer pays overheads as part of your employment which you might not be exposed to. The cost of your equipment, heating and lighting your work space, insuring you at work, payroll costs… All these overheads mount up and you will be taking these on when you’re working freelance.
Any successful business must sell the goods or services at a profit. Therefore, if you’re contributing code to a project, then the future or immediate value of your contribution should be greater than your input for a business to be making a profit from you.

You can play with this number of course, but a factor of 2 has worked for me in the past.

One assumption is that you’re going to do this new project in your free time, probably on weekends and evenings. For someone on an hourly wage this would usually earn an overtime rate - double time or time and a half.

If you’re in the UK, more free time will be taken up managing a tax-return, paying HMRC for additional National Insurance Contributions, invoicing and keeping records. You need to ensure that this time is covered in some way by the income from your freelance work.

“What if my new client is too poor / too rich to afford £X?”

Of course, you’re perfectly allowed to adjust this if you want to give away some of your work at less than the market rate. Remember, your employer is already paying something along the lines of what you’ve just calculated for your time. Carefully consider how much you should adjust that for someone else.

In my previous businesses I’ve charged all clients the same basic rate for the simple reason that it’s easier on the books and my brain.

In the end

As you work more freelance jobs you’ll get a feel for what’s suitable and what’s not.

I hope this has been helpful.

Good luck!

Pyramid London talk - A testing strategy for Pyramid Applications

2013-06-16T21:00:00+01:00

Pyramid London meetup returned in June to Skills Matter. This time I spoke about testing strategies for Pyramid applications.

As outlined in the slides below, my current testing framework builds up with doctests, through unit and integration tests to functional / behaviour driven testing on the outside of the application. Hopefully my very basic “drawn on Google Docs” diagram of the Pyramid Framework illustrates how each of the testing methods fits within the framework.

I would like to have been able to talk more about Behaviour Driven Development and testing with Behave, which I’m enjoying at the moment, but maybe that’s for another presentation. Again, putting together this presentation was really helpful - it helped me to reflect on the methods we’re using at the moment, and how I might be able to improve and progress the level of test driven development in my daily work.

Video is available via the SkillsMatter site.

Many thanks to Armin Ronacher for his talk on SQLAlchemy at the same Pyramid meetup - the video is also online at SkillsMatter. As well as the technical details and some hints for things to check out with SQLA, I found Armin’s thoughts on how the Pyramid community might improve on how we introduce new developers to Pyramid and SQLAlchemy very helpful. I hope I might be able to contribute to that some time in the future. Hopefully we’ll see more people at the next Pyramid Meetup which may include a talk on using Celery with Pyramid.

Pyramid London talk - Pyramid Router

2013-05-08T23:00:00+01:00

Our first Pyramid London meetup was kindly hosted at Skills Matter, who have posted the video of my talk on their page for the meetup.

All the code I demonstrated is on GitHub in the pyramid-london-talk repository - please note that the traversal code is in the traversal branch, not in a separate project.

I learned loads from preparing the demonstration code and chatting to everyone that attended - so thanks and hope to see you at the next meetup in June!

Reincublog Django app

2013-04-28T13:00:00+01:00

Reincublog is a Django app that I was asked to code as part of the recruitment process at Reincubate. It’s a weekend of glue code which I was set to see if I am a competent Django programmer. I’m not sure that it’s the kind of test that I show my best at - I’m more of an algorithm guy.

However, after living on GitHub for a few months, the repo has picked up a couple of stars, and because I like to keep a super clean GitHub account, I’ve decided to clean it out of my account.

So, from today the Reincublog code will live on shonenada’s GitHub - hopefully it can grow and be useful.

Migrating from Django 1.4 to 1.5 - Lessons learned

2013-03-29T19:00:00+00:00

From Ryan Kaskel’s talk at Django London in November last year, I guessed that upgrading the Action Guide code from Django 1.4 to 1.5 might create some issues with users (user models have changed in Django 1.5 to allow more customisation).

However, as it turns out, the main problems were with settings and urls - the users were fine. My main take-aways were:

Url formats have changed - now need quotes

The Django team had already updated the url tag to accept the path parameter as a string, but the old syntax was still allowed. 1.4 allowed both types of syntax, the team having provided {% load url from future %} for those that wanted to update their templates to the new syntax.

Here’s the warning from the URL tag documentation.

This was a reasonably easy change to implement - some search and replace and all url tags can be easily hunted down and changed.
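
For anyone who prefers to script the search and replace, here is a rough sketch of one way to do it. The helper quote_url_names is hypothetical and not part of Django. It assumes every unquoted name after {% url is a literal url name in the old syntax, so templates that already use {% load url from future %} with a context variable would need checking by hand.

```python
import re

# Matches "{% url name" where the name is not already quoted
URL_TAG = re.compile(r"""\{%\s*url\s+(?!['"])([\w.:-]+)""")


def quote_url_names(template_source):
    """Wrap unquoted url names in single quotes, leaving the arguments alone."""
    return URL_TAG.sub(lambda m: "{% url '" + m.group(1) + "'", template_source)


print(quote_url_names("{% url home %}"))               # {% url 'home' %}
print(quote_url_names("{% url blog:post post.id %}"))  # {% url 'blog:post' post.id %}
print(quote_url_names("{% url 'home' %}"))             # {% url 'home' %}
```

It only touches the first token after url, so positional and keyword arguments pass through unchanged.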

Read up on the settings - no ALLOWED_HOSTS makes 500s

This was the real killer.

There is a new ALLOWED_HOSTS setting in 1.5 that is required to get Django up and running in non-debug mode.

The worst thing about the implementation of this new setting is that I couldn’t get a single bit of debugging output out of it through wsgi on WebFaction - just a 500 error on every page load when I took the site out of debug mode.

I was so confused that I posted this question on StackOverflow, thinking the problem was url warnings being shown as errors and halting the wsgi. In the end, just adding ALLOWED_HOSTS fixed everything up great.
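
For reference, a minimal settings fragment looks something like the sketch below. The example.com host names are placeholders, so swap in the domains your site is actually served from.

```python
# settings.py (fragment) - the host names are placeholders
DEBUG = False

# With DEBUG off, Django only serves requests whose Host header matches
# one of these entries
ALLOWED_HOSTS = [
    "www.example.com",
    ".example.com",  # a leading dot matches example.com and its subdomains
]
```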

My main problem was that I scanned the docs, tested the migration on localhost in dev mode, and just expected everything to deploy. With Captain Hindsight, I’d have RTFMed much harder before deploying - a lesson for the future.

Apart from that, everything works really well. Have fun!

Pelican Svbtle theme tweaks

2013-02-21T19:40:00+00:00

My first experiments with Pelican to run this blog have been good - it’s a great way to publish static pages quickly and I find it much easier to manage than Octopress.

It’s built on a version of the Pelican-svbtle theme. There were some problems with the theme in its current form, so I’ve forked CNBorn’s already adjusted version and cleaned out some of the LESS and templates - my fork is on GitHub.

However, this theme isn’t going to stay. Paul has been working on some flat HTML based on Bootstrap to make a new clean theme. Once that’s stable, I’ll plug in some Pelican / Jinja2 tags and hopefully this site will have a new clean theme soon.

jsFiddle documentation update

2013-01-25T19:40:00+00:00

Updated documentation for jsFiddle, merged by Piotr.

Includes a new tutorial - but images are already out of date!

jsFiddle is such a great tool and my goal for the tutorial was to create a simple introduction which first time students would be able to understand and execute.

Setting up this homepage with Pelican

2013-01-20T17:25:00+00:00

This page has been through a lot in the last ten years.

Since starting work at Quibly, I’ve had a lot more time to code and it’s exactly what I wanted, hopefully it’ll continue. The result of that is that I’ve got more to write about… The code that I develop at work, fixes I make to open source libraries and general things I learn, primarily about Python and web - hopefully all valuable and worth sharing.

I’m experimenting with Pelican - a static blog generator written in Python. It’s excellent and noticeably easier than Jekyll - probably because I’m much more clued up in Python than Ruby. I’m lazy, so I’m hosting the outputted static files in the gh-pages branch of the blog’s repository to take advantage of GitHub Pages’ free hosting features - thanks GitHub!

In addition, I found this article by David Fischer very helpful, particularly the suggestion of adding the CNAME copy command to the Makefile to get GitHub Pages’ one configuration requirement and ghp-import working nicely together. Plus David pointed out that Pelican already has a github target in the Makefile, which I hadn’t noticed and is now what I use to push articles live.

All in all - great and simple.

Got the Stack Overflow tumbleweed badge for Mako filters question

2013-01-20T16:40:00+00:00

Last week I posted a question on Stack Overflow - “Mako template filter ordering” - this week it earned the Tumbleweed badge.

It’s always a little concerning when libraries and toolkits you’re using in a project have forums and message boards that are a little too quiet - is there a bad smell? Is there something bad I don’t know about this tech? Even worse is when you look around those quiet forums (or tags in Stack Overflow) you find comments like this about the library you’re being asked to use:

You should listen to a Stack Overflow moderator who has 93K points at time of writing, right?

Meanwhile… I haven’t found the reason for the template filter ordering being strange - and I still think that the h filter is putting itself last in the mako render order, but now I’ve got a work around, I’m going back to post it.

Maybe 10 more people will see it before Easter - it might even help someone.

Password cases and test fixes on pyramid_simpleauth

2012-11-30T12:00:00+00:00

At Quibly we’re using Pyramid at the centre of a Python framework. Providing user functionality is the pyramid_simpleauth library.

While writing integration tests before we put the site live, I found that my test users were not able to authenticate with their testing passwords (usually just a simple string like ‘Password’). Digging inside the simpleauth library, I found that some fixes were necessary to how cases are handled by the lib - and I also fixed some doctests while I was at it.

These changes have all been merged now and the library has rolled up a version.

Django-mailchimp compatibility with v1.3 API

2012-09-25T07:14:00+01:00

For a Fublo project with Neuxpower, we had to communicate with Mailchimp via their API. On Django one of the best libraries for this is django-mailchimp.

However, in its previous state django-mailchimp wasn’t able to specify a send_welcome parameter, which lets Mailchimp know whether it should send out a list welcome message when a new user subscribes. For the project, we were managing the signup explicitly with Neuxpower’s code, so no welcome message was required. Mailchimp’s default for sending was True, meaning that Neuxpower’s new customers would get hit with a double welcome message… Not desirable.

This small change is now merged in with the library, which has rolled up to a ‘v1.3’ status as there is no backward compatibility.

Fixing exception in django-menu

2012-05-05T19:40:00+01:00

django-menu is a nice simple library for building very simple menus. However, when a site was loaded for the first time and the menu structure had not yet been configured, it threw a DoesNotExist exception.

This tiny pull request simply wrapped the call to the menu in a try/except so that new sites using django-menu won’t fall over on first load.
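
The shape of the fix is roughly the sketch below. It is not django-menu’s actual code: the DoesNotExist class is a stand-in for the Django model exception, and load_menu and safe_menu_items are hypothetical names used only to show the try/except pattern.

```python
class DoesNotExist(Exception):
    """Stand-in for a Django model's DoesNotExist exception."""


def load_menu(menus, name):
    """Hypothetical lookup that raises when the menu is not configured."""
    try:
        return menus[name]
    except KeyError:
        raise DoesNotExist(name)


def safe_menu_items(menus, name):
    """Return the menu's items, or an empty list if it does not exist yet."""
    try:
        return load_menu(menus, name)
    except DoesNotExist:
        # A brand new site has no menus configured - render nothing
        return []


print(safe_menu_items({}, "main"))                            # []
print(safe_menu_items({"main": ["Home", "About"]}, "main"))  # ['Home', 'About']
```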