The Lumber Room

"Consign them to dust and damp by way of preserving them"

Archive for the ‘public’ Category

Matplotlib tutorial

The “standard” way to plot data used to be gnuplot, but it’s time to start using matplotlib which looks better and easier to use. For one thing, it’s a Python library, and you have the full power of a programming language available when you’re plotting, and a full-featured plotting library available when you’re programming, which is very convenient: I no longer find it necessary to use a horrible combination of gnuplot, programs, and shell scripts. For another, it’s free software, unlike gnuplot which is (unrelated to GNU and) distributed under a (slightly) restrictive license. Also, the plots look great. (Plus, I’m told that matplotlib will be familiar to MATLAB users, but as an assiduous non-user of MATLAB, I can’t comment on that.)

matplotlib is simple to start using, but its documentation makes this fact far from clear. The documentation is fine if you’re already an expert and want to draw dolphins swimming in glass bubbles, but it’s barely useful to a complete beginner. So what follows is a short tutorial. After this, you (that is, I) should be able to look at the gallery and get useful information, and perhaps even the documentation will make a bit of sense. (If matplotlib isn’t already present on your system, the easiest way to install it, if you have enough bandwidth and disk space to download a couple of gigabytes and you’re associated with an academic installation, is to get the Enthought Python distribution. Otherwise, sudo easy_install matplotlib should do it.)

The most common thing you want to do is plot a bunch of (x,y) values:

import matplotlib.pyplot as plot
xs = [2, 3, 5, 7, 11]
ys = [4, 9, 5, 9, 1]
plot.plot(xs, ys)

p^2 mod 10
Or, for a less arbitrary example:

import matplotlib.pyplot as plot
import math

xs = [0.01*x for x in range(1000)] #That's 0 to 10 in steps of 0.01
ys = [math.sin(x) for x in xs]
plot.plot(xs, ys)

That is all. And you have a nice-looking sine curve:

If you want to plot two curves, you do it the natural way:

import matplotlib.pyplot as plot
import math

xs = [0.01*x for x in range(1000)]
ys = [math.sin(x) for x in xs]
zs = [math.cos(x) for x in xs]

plot.plot(xs, ys)
plot.plot(xs, zs)


It automatically chooses a different colour for the second curve.

Perhaps you don’t find it so nice-looking. Maybe you want the y-axis to have the same scale as the x-axis. Maybe you want to label the x and y axes. Maybe you want to title the plot. (“The curves sin x and cos x”?) Maybe you want the sine curve to be red for some reason, and the cosine curve to be dotted. And have a legend for which curve is which. All these can be done. In that order —

import matplotlib.pyplot as plot
import math

xs = [0.01*x for x in range(1000)]
ys = [math.sin(x) for x in xs]
zs = [math.cos(x) for x in xs]

plot.title(r'The curves $\sin x$ and $\cos x$')
plot.plot(xs, ys, label=r'$\sin$', color='red')
plot.plot(xs, zs, ':', label=r'$\cos$')
plot.legend(loc='upper right')


Observe that we specified the plot style as “:”, for dotted. We can also use ‘o’ for circles, ‘^’ for triangles, and so on. We can also prefix a colour, e.g. ‘bo’ for blue circles, ‘rx’ for red crosses and so on (the default is ‘b-‘, as we saw above). See the documentation for plot.

Also, Matplotlib has some TeX support. :-) It parses a subset of TeX syntax (be sure to use r before the string, so that you don’t have to escape backslashes), and you can also use

plot.rc('text', usetex=True)

to make it actually use (La)TeX to generate text.

My only major annoyance with the default settings is that the y-axis label is vertical, so one has to tilt one’s head to read it. This is easy to fix:

plot.ylabel('$y$', rotation='horizontal')

You can also place text at a certain position, or turn a grid on:

plot.text(math.pi/2, 1, 'top')

You can also annotate text with arrows.

You can have multiple figures in the same image, but it’s all rather stateful (or complicated) and so far I haven’t needed it.

Instead of saving the figure, you can use
and get an interactive window in which you can zoom and so on, but I find it more convenient to just save to image.

You can customize defaults in the matplotlibrc file — fonts, line widths, colours, resolution…

Other plots: Matplotlib can also do histographs, erorr bars, scatter plots, log plots, polarplots, bar charts and yes, pie charts, but these don’t seem to be well-documented.

Animation: Don’t know yet, but it seems that if you want to make a “movie”, the recommended way is to save a bunch of png images and use mencoder/ImageMagick on them.

That takes care of the most common things you (that is, I) might want to do. The details are in the documentation. (Also see: cookbook, screenshots, gallery.)

Edit: If you have a file which is just several lines of (x,y) values in two columns (e.g. one you may have been using as input to gnuplot), this function may help:

def x_and_y(filename):
    xs = []; ys = []
    for l in open(filename).readlines():
        x, y = [int(s) for s in l.split()]
    return xs, ys

Edit [2010-05-10]: To plot values indexed by dates, use the following.

    fig = plot.figure(figsize=(80,10))
    plot.plot_date(dates, values, '-', marker='.', label='something')

The first line is because I don’t know how else to set the figure size. The last is so that the dates are rotated and made sparser, so as to not overlap.

More generally, Matplotlib seems to have a model of figures inside plots and axes inside them and so on; it would be good to understand this model.

Edit [2013-03-15]: Found another good-looking tutorial here:

[2013-04-08: Comments closed because this post was getting a lot of spam; sorry.]


Written by S

Sun, 2010-03-07 at 23:41:37

Dan Brown parody

with 8 comments

Dan Brown is a hilariously bad writer. The Da Vinci Code was an outrageously successful book.
So it was only inevitable that in addition to all the delicious criticism of Dan Brown’s writing,1 there would also be a number of parodies of his books published, and indeed there have been several.2 While looking for something in the library, I found The Da Vinci Cod: A Fishy Parody by “Don Brine” (real name Adam Roberts) and quickly proceeded to borrow it and read it. It was a good two hours spent, which is more than can be said for Dan Brown’s books themselves. Although the author is a professor of literature at London University, the book manages to remain true to the awful writing and plot of the original. I heartily recommend reading the book if you come across it; for a taste of what it’s like, some excerpts follow. You can also see parts of the book at Google Books.
Read the rest of this entry »

Written by S

Sun, 2008-12-28 at 23:45:27

Lattice points visible from the origin

with 16 comments

[A test of LaTeX-to-Wordpress conversion. Bugs remain, point them out. Original PDF]

Here is a problem I love. It is simple to state, and it has a solution that is not trivial, but is easy to understand. The solution also goes through some beautiful parts, so I can promise it’s worth reading :-)

[The solution is not mine. Also, the question is just an excuse for the ideas in the solution. :P]

Question. Suppose you are standing on an infinite grid in the plane. You can see infinitely in all directions, but you cannot see through grid points: a point is hidden from view if some other grid point lies in your line of sight. What fraction of the grid points can you see?

Let us first imagine that we are standing at the origin, and that the grid is that of the lattice (integer) points.

The blue points are visible; the grey points are not

The blue points are visible; the grey points are not

Read the rest of this entry »

Written by S

Fri, 2008-11-07 at 15:00:30

Posted in public

Tagged with , ,

A theorem by Euler on partitions

with 3 comments

There are so many beautiful facts in mathematics that are not as well known as they deserve to be. [Just today I couldn’t find an online reference for the Robinson-Schensted(-Knuth) correspondence.] Some efforts in the direction of communicating these to an online audience have been collected in the Carnival of Mathematics (41 so far) posts on various blogs. Another good idea I recently found is Theorem of the Day. [EtA: Found at Theorem of the Day: RSK correspondence.]

Anyway… today I attended a lecture on Euler by William Dunham (great talk!) where after a brief biography of Euler and a tour through some of his results, he showed Euler’s proof of a particular theorem about partitions. It will be familiar to anyone who has read about integer partitions, but it deserves to be familiar to everyone! So here is a sketch of the proof, for your reading pleasure. [It doesn’t do justice to present only a “sketch”, but I have assignments due tomorrowtoday… feel free to take it and turn it into some form worthy of the content. :-)]

Now when I say “a theorem by Euler”, it is very ambiguous which one I mean, and even with the “on partitions” restriction, there are several. So let me state it:

The number of partitions of a number into odd parts is the same as the number of partitions of the number into distinct parts.

[If I remember correctly what was narrated in the talk, one of the Bernoullis wrote to Euler asking him a question about partitions, and within a few days (think of the postal system then) Euler replied with a proof of the theorem, along with a apology for the delay caused by the poor eyesight he had recently been suffering from. This is an outline of his proof, ask me about the details.]

Let D(n) denote the number of partitions into distinct parts, and O(n) denote the number of partitions into odd parts. Then we have:
\displaystyle \sum_{n\ge0}D(n)x^n
\displaystyle = (1+x)(1+x^2)(1+x^3)(1+x^4)(1+x^5)\dots
\displaystyle = \frac{1-x^2}{1-x}\frac{1-x^4}{1-x^2}\frac{1-x^6}{1-x^3}\frac{1-x^8}{1-x^4}\frac{1-x^{10}}{1-x^5}\dots
\displaystyle = \frac{1}{(1-x)(1-x^3)(1-x^5)\dots}
\displaystyle = (1+x+x^{1+1}+\dots)(1+x^3+x^{3+3}+\dots)(1+x^5+x^{5+5}+\dots)
\displaystyle = \sum_{n\ge0}O(n)x^n
which proves the theorem.
The first equality, which gives the generating function for the D(n)s, can be seen as follows: when you “expand” and write out (1+x)(1+x^2)(1+x^3)(1+x^4)(1+x^5)\dots, you get 1 + x + x^2 + (x^3+xx^2) + (x^4+xx^3) + (x^5+xx^4+x^2x^3) + \dots, where the coefficient of any term x^n is exactly the number of ways of writing x^n as a product of distinct factors of the form x^k, which is D(n). Similarly, the last equality, about the generating function for the O(n)s, is because the coefficient of any term x^n in (1+x+x^{1+1}+\dots)(1+x^3+x^{3+3}+\dots)(1+x^5+x^{5+5}+\dots) is exactly the number of ways of writing n as a product of (not necessarily distinct) x^k for odd k.

The result might not be terribly important, but this idea was the starting point for proving several observations about partitions, and the method has led to the discovery of several facts and is more widely applicable than you think!

If you want a combinatorial proof by bijection (“For Entertainment Only”) here is an outline of one:
Given a partition of n into odd parts, say
n = a_11 + a_33 + a_55 + \dots,
write each a_i “in binary”, by which I mean as a sum of distinct powers of two. So you have
n =  (2^{b_{11}}+2^{b_{12}}+...)1 + (2^{b_{31}}+2^{b_{32}}+...)3 + \dots.
Now just get rid of the brackets, and note that all terms are distinct.
E.g. for 19 = 5+3+3+3+1+1+1+1+1,
i.e. 19 = (1)5 + (2+1)3 + (4+1)1, you get the partition
19 = 5 + 6 + 3 + 4 + 1.
[Exercise: Why are all parts distinct?]

Conversely, given a partition into distinct parts, we can separate the “even part” from the “odd part” in each term, so for example with 19 = 5+6+3+4+1, we write
19 = 5 + (2)3 + 3 + (4)1 + 1, and collect the coefficients of each odd number, so 19= 5 + (2+1)3 + (4+1)1, which was our original odd partition.
[Exercise: Why can’t we do the same thing starting with any partition?]

[Question: Is the bijective proof related to the algebraic one?]

Written by S

Wed, 2008-10-15 at 06:26:39

Posted in public

Tagged with , ,