The Lumber Room

"Consign them to dust and damp by way of preserving them"

Posts Tagged ‘visualisation

Matplotlib tutorial

The “standard” way to plot data used to be gnuplot, but it’s time to start using matplotlib which looks better and easier to use. For one thing, it’s a Python library, and you have the full power of a programming language available when you’re plotting, and a full-featured plotting library available when you’re programming, which is very convenient: I no longer find it necessary to use a horrible combination of gnuplot, programs, and shell scripts. For another, it’s free software, unlike gnuplot which is (unrelated to GNU and) distributed under a (slightly) restrictive license. Also, the plots look great. (Plus, I’m told that matplotlib will be familiar to MATLAB users, but as an assiduous non-user of MATLAB, I can’t comment on that.)

matplotlib is simple to start using, but its documentation makes this fact far from clear. The documentation is fine if you’re already an expert and want to draw dolphins swimming in glass bubbles, but it’s barely useful to a complete beginner. So what follows is a short tutorial. After this, you (that is, I) should be able to look at the gallery and get useful information, and perhaps even the documentation will make a bit of sense. (If matplotlib isn’t already present on your system, the easiest way to install it, if you have enough bandwidth and disk space to download a couple of gigabytes and you’re associated with an academic installation, is to get the Enthought Python distribution. Otherwise, sudo easy_install matplotlib should do it.)

The most common thing you want to do is plot a bunch of (x,y) values:

import matplotlib.pyplot as plot
xs = [2, 3, 5, 7, 11]
ys = [4, 9, 5, 9, 1]
plot.plot(xs, ys)

p^2 mod 10
Or, for a less arbitrary example:

import matplotlib.pyplot as plot
import math

xs = [0.01*x for x in range(1000)] #That's 0 to 10 in steps of 0.01
ys = [math.sin(x) for x in xs]
plot.plot(xs, ys)

That is all. And you have a nice-looking sine curve:

If you want to plot two curves, you do it the natural way:

import matplotlib.pyplot as plot
import math

xs = [0.01*x for x in range(1000)]
ys = [math.sin(x) for x in xs]
zs = [math.cos(x) for x in xs]

plot.plot(xs, ys)
plot.plot(xs, zs)


It automatically chooses a different colour for the second curve.

Perhaps you don’t find it so nice-looking. Maybe you want the y-axis to have the same scale as the x-axis. Maybe you want to label the x and y axes. Maybe you want to title the plot. (“The curves sin x and cos x”?) Maybe you want the sine curve to be red for some reason, and the cosine curve to be dotted. And have a legend for which curve is which. All these can be done. In that order —

import matplotlib.pyplot as plot
import math

xs = [0.01*x for x in range(1000)]
ys = [math.sin(x) for x in xs]
zs = [math.cos(x) for x in xs]

plot.title(r'The curves $\sin x$ and $\cos x$')
plot.plot(xs, ys, label=r'$\sin$', color='red')
plot.plot(xs, zs, ':', label=r'$\cos$')
plot.legend(loc='upper right')


Observe that we specified the plot style as “:”, for dotted. We can also use ‘o’ for circles, ‘^’ for triangles, and so on. We can also prefix a colour, e.g. ‘bo’ for blue circles, ‘rx’ for red crosses and so on (the default is ‘b-‘, as we saw above). See the documentation for plot.

Also, Matplotlib has some TeX support. :-) It parses a subset of TeX syntax (be sure to use r before the string, so that you don’t have to escape backslashes), and you can also use

plot.rc('text', usetex=True)

to make it actually use (La)TeX to generate text.

My only major annoyance with the default settings is that the y-axis label is vertical, so one has to tilt one’s head to read it. This is easy to fix:

plot.ylabel('$y$', rotation='horizontal')

You can also place text at a certain position, or turn a grid on:

plot.text(math.pi/2, 1, 'top')

You can also annotate text with arrows.

You can have multiple figures in the same image, but it’s all rather stateful (or complicated) and so far I haven’t needed it.

Instead of saving the figure, you can use
and get an interactive window in which you can zoom and so on, but I find it more convenient to just save to image.

You can customize defaults in the matplotlibrc file — fonts, line widths, colours, resolution…

Other plots: Matplotlib can also do histographs, erorr bars, scatter plots, log plots, polarplots, bar charts and yes, pie charts, but these don’t seem to be well-documented.

Animation: Don’t know yet, but it seems that if you want to make a “movie”, the recommended way is to save a bunch of png images and use mencoder/ImageMagick on them.

That takes care of the most common things you (that is, I) might want to do. The details are in the documentation. (Also see: cookbook, screenshots, gallery.)

Edit: If you have a file which is just several lines of (x,y) values in two columns (e.g. one you may have been using as input to gnuplot), this function may help:

def x_and_y(filename):
    xs = []; ys = []
    for l in open(filename).readlines():
        x, y = [int(s) for s in l.split()]
    return xs, ys

Edit [2010-05-10]: To plot values indexed by dates, use the following.

    fig = plot.figure(figsize=(80,10))
    plot.plot_date(dates, values, '-', marker='.', label='something')

The first line is because I don’t know how else to set the figure size. The last is so that the dates are rotated and made sparser, so as to not overlap.

More generally, Matplotlib seems to have a model of figures inside plots and axes inside them and so on; it would be good to understand this model.

Edit [2013-03-15]: Found another good-looking tutorial here:

[2013-04-08: Comments closed because this post was getting a lot of spam; sorry.]


Written by S

Sun, 2010-03-07 at 23:41:37

Probably the most interesting blog I’ve seen

leave a comment »

I’m afraid to look at it, because I expect I’ll get tempted into spending hours and hours reading all the old posts: Strange Maps

Written by S

Mon, 2007-10-15 at 19:31:17