Mark Jason Dominus tweeted and later blogged about this puzzle:

From the four numbers [6, 6, 5, 2], using only the binary operations [+, -, *, /], form the number 17.

When he tweeted the first time, I thought about it a little bit (while walking from my desk to the restroom or something like that), but forgot about it pretty soon and didn’t give it much further thought. When he posted again, I gave it another serious try, failed, and so gave up and wrote a computer program.

This is what I thought this time.

Any expression is formed as a binary tree. For example, 28 = 6 + (2 * (5 + 6)) is formed as this binary tree (TODO make a proper diagram with DOT or something):

    +
    ├── 6
    └── *
        ├── 2
        └── +
            ├── 5
            └── 6

And 8 = (2 + 6) / (6 - 5) is this binary tree:

    /
    ├── +
    │   ├── 2
    │   └── 6
    └── -
        ├── 6
        └── 5

Alternatively, any expression is built up from the 4 given numbers [a, b, c, d] as follows:

Take any two of the numbers and perform any operation on them, and replace the two numbers with the result. Then repeat, until you have only one number, which is the final result.

Thus the above two expressions 28 = 6 + (2 * (5 + 6)) and 8 = (2 + 6) / (6 - 5) can be formed, respectively, as:

- Start with [6, 6, 5, 2]. Replace (5, 6) with 5+6=11 to get [6, 11, 2]. Replace (11, 2) with 11*2=22 to get [6, 22]. Replace (6, 22) with 6+22=28, and that’s your result.
- Start with [6, 6, 5, 2]. Replace (2, 6) with 2+6=8 to get [8, 6, 5]. Replace (6, 5) with 6-5=1 to get [8, 1]. Replace (8, 1) with 8/1=8 and that’s your result.
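The two reductions above can be replayed mechanically. This is only a minimal sketch, written to run on Python 3; the helper name `replace_pair` is made up for illustration and is not part of the program described later.

```python
import operator

def replace_pair(nums, x, y, op):
    """Return a new list in which one copy each of x and y is replaced by op(x, y)."""
    rest = list(nums)
    rest.remove(x)  # list.remove drops only the first occurrence
    rest.remove(y)
    return rest + [op(x, y)]

# 28 = 6 + (2 * (5 + 6))
step = replace_pair([6, 6, 5, 2], 5, 6, operator.add)  # leaves 6, 2 and 11
step = replace_pair(step, 11, 2, operator.mul)         # leaves 6 and 22
first = replace_pair(step, 6, 22, operator.add)        # a single value

# 8 = (2 + 6) / (6 - 5)
step = replace_pair([6, 6, 5, 2], 2, 6, operator.add)  # leaves 6, 5 and 8
step = replace_pair(step, 6, 5, operator.sub)          # leaves 8 and 1
second = replace_pair(step, 8, 1, operator.truediv)    # a single value
```

Each call shortens the list by one, so three calls always take four numbers down to one.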

So my idea was to generate all possible such expressions out of [6, 6, 5, 2], and see if 17 was one of them. (I suspected it may be possible by doing divisions and going via non-integers, but couldn’t see how.)

(In hindsight it seems odd that my first attempt was to answer *whether* 17 could be generated, rather than *how:* I guess at this point, despite the author’s assurance that there are no underhanded tricks involved, I still wanted to test whether 17 could be generated in this usual way, if only to ensure that my understanding of the puzzle was correct.)

I happened to do most of this in the IPython shell. So I can pull up the IPython history SQLite file and actually see what I did.

This section is a long painful read on stupid bugs that I typically make (and how something that should have taken just a minute or two took much longer), so you may want to skip to the next section. (Search for “version 1 of the program”.)

Here is my session. Remember, my idea was to start with the single list [6, 6, 5, 2] and iteratively generate:

- all possible lists (triples) possible, after performing one binary operation
- all possible lists (pairs) possible, after performing another binary operation
- all possible lists (of single values) possible, after performing three binary operations

I started with the list of one single possibility:

    % ipython
    Python 2.7.9 (v2.7.9:648dcafa7e5f, Dec 10 2014, 10:10:46)
    Type "copyright", "credits" or "license" for more information.

    IPython 1.1.0 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]: vals = [6, 6, 5, 2]

    In [2]: vals.sort()

    In [3]: vals
    Out[3]: [2, 5, 6, 6]

    In [4]: poss = [vals]

    In [5]: poss
    Out[5]: [[2, 5, 6, 6]]

Now for actually picking up pairs and performing operations on them.

I was expecting to do it on the very next line, but it took considerably longer.

I’m documenting it to show the kinds of bugs and things that can go wrong and slow you down — skip to the next section if not interested.

I opened up a different shell and figured out that `operator.truediv` was the division operator. (`operator.div`, when given two integers in Python 2, does floor division and discards the fractional part.)
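A quick way to see the difference between the two kinds of division is sketched below. It is written for Python 3, where `operator.div` no longer exists, so `operator.floordiv` stands in for what Python 2's `operator.div` does with two integers.

```python
import operator

# True division keeps the fractional part, which this puzzle needs.
exact = operator.truediv(5, 6)

# Floor division drops the fractional part; this is the integer
# behaviour of Python 2's operator.div (and of `/` on two ints there).
floored = operator.floordiv(5, 6)

# Floor division rounds toward negative infinity, not toward zero.
floored_negative = operator.floordiv(-7, 2)
```

With floor division, 5/6 collapses to 0 and an intermediate value like 5/6 + 2 can never appear.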

Anyway, continuing in the real shell: for every list of numbers `l` in `poss` (e.g. `l` is `[6, 6, 5, 2]`), take every pair and every operation, replace the pair with the result, and put the list back together. (Note: where I entered multi-line input, I show just the input below rather than what it looked like in the IPython shell, because the shell's version is confusing to read or copy-paste.)

    for l in poss:
        for a in range(len(l)):
            for b in range(a + 1, len(l)):
                for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                    v = op(l[a], l[b])
                    for x in range(len(l)):
                        if x not in [a, b]:
                            nl.append(l[x])
                            nl.append(v)
                    nl.sort()
                    poss.add(tuple(nl))

    NameError: name 'operator' is not defined

(So import it and redo the same thing… also noticed I hadn’t defined my new list `nl`.)

    import operator
    for l in poss:
        for a in range(len(l)):
            for b in range(a + 1, len(l)):
                for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                    v = op(l[a], l[b])
                    nl = []
                    for x in range(len(l)):
                        if x not in [a, b]:
                            nl.append(l[x])
                            nl.append(v)
                    nl.sort()
                    poss.add(tuple(nl))

Still buggy, as the new value ‘v’ should be appended to nl only once:

    for l in poss:
        for a in range(len(l)):
            for b in range(a + 1, len(l)):
                for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                    v = op(l[a], l[b])
                    nl = []
                    for x in range(len(l)):
                        if x not in [a, b]:
                            nl.append(l[x])
                    nl.append(v)
                    nl.sort()
                    poss.add(tuple(nl))

    AttributeError: 'list' object has no attribute 'add'

Oh crap, to `.add`, I need a `set`, not a `list`. And now in trying to get that set, I become a bumbling clown:

    In [10]: poss = set([2, 2, 5, 6])

    In [11]: poss
    Out[11]: {2, 5, 6}

    In [12]: poss = set()

    In [13]: poss = set((2, 2, 5, 6))

    In [14]: poss
    Out[14]: {2, 5, 6}

    In [15]: poss = set(((2, 2, 5, 6)))

    In [16]: poss
    Out[16]: {2, 5, 6}

    In [17]: poss = set((((2, 2, 5, 6))))

    In [18]: poss
    Out[18]: {2, 5, 6}

    In [19]: t = (2, 2, 5, 6)

    In [20]: poss = set((t))

    In [21]: poss
    Out[21]: {2, 5, 6}

    In [22]: poss = set([t])

    In [23]: poss
    Out[23]: {(2, 2, 5, 6)}

Finally!
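The failed attempts come down to one rule: `set(x)` iterates over `x`, and extra parentheses around a tuple are only grouping, so they never add a level of nesting. A small sketch (variable names are illustrative) of the forms that do and do not produce a set containing the tuple itself:

```python
t = (2, 2, 5, 6)

# (t) is just t, so set() iterates over the four numbers.
from_grouping = set((t))

# A one-element container holding the tuple: set() iterates over the container.
from_list = set([t])
from_one_tuple = set((t,))  # the trailing comma is what makes a 1-tuple
from_literal = {t}          # set literal, available since Python 2.7
```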

    for l in poss:
        for a in range(len(l)):
            for b in range(a + 1, len(l)):
                for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                    v = op(l[a], l[b])
                    nl = []
                    for x in range(len(l)):
                        if x not in [a, b]:
                            nl.append(l[x])
                    nl.append(v)
                    nl.sort()
                    poss.add(tuple(nl))

    RuntimeError: Set changed size during iteration

Oh what’s happened?

    In [25]: poss
    Out[25]: {(-4, 2, 5), (-3, 2, 6), (-1, 2, 2), (0, 5, 6), (0.3333333333333333, 2, 5), (0.4, 2, 6),
              (0.8333333333333334, 2, 2), (1.0, 5, 6), (2, 2, 5, 6), (2, 2, 11), (2, 2, 30), (2, 5, 8),
              (2, 5, 12), (2, 6, 7), (2, 6, 10), (4, 5, 6)}

It seems to have formed new triples all right, but looks like we can’t modify a set while iterating through it (makes sense). So we need to put the resulting triples into a new set. Oh well, time to start over and redo:
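The difference between growing a set in place and collecting into a fresh one can be reproduced in isolation. This is a sketch with made-up function names, and the error behaviour is what CPython does when a set's size changes between iteration steps.

```python
def grow_in_place(items):
    """Try to add to the set while looping over it (the mistake above)."""
    try:
        for x in items:
            items.add(x + 100)  # changes the set's size mid-iteration
    except RuntimeError as err:
        return str(err)
    return None

def grow_into_new(items):
    """Collect results in a separate set and leave the input untouched."""
    result = set()
    for x in items:
        result.add(x + 100)
    return result

error_message = grow_in_place({1, 2, 3})
original = {1, 2, 3}
shifted = grow_into_new(original)
```

Note that the in-place version still leaves some new elements behind before it fails, which is why `poss` above ended up holding triples alongside the original 4-tuple.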

    In [26]: poss = {}

    In [27]: poss = set([t])

    In [28]: poss
    Out[28]: {(2, 2, 5, 6)}

    In [29]: newposs = set()

    for l in poss:
        for a in range(len(l)):
            for b in range(a + 1, len(l)):
                for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                    v = op(l[a], l[b])
                    nl = []
                    for x in range(len(l)):
                        if x not in [a, b]:
                            nl.append(l[x])
                    nl.append(v)
                    nl.sort()
                    newposs.add(tuple(nl))

    In [31]: poss
    Out[31]: {(2, 2, 5, 6)}

    In [32]: newposs
    Out[32]: {(-4, 2, 5), (-3, 2, 6), (-1, 2, 2), (0, 5, 6), (0.3333333333333333, 2, 5), (0.4, 2, 6),
              (0.8333333333333334, 2, 2), (1.0, 5, 6), (2, 2, 11), (2, 2, 30), (2, 5, 8), (2, 5, 12),
              (2, 6, 7), (2, 6, 10), (4, 5, 6)}

Success! The first step is done: we’ve successfully taken pairs and replaced with result of an operation, to get 3-tuples of numbers now.

So we can iterate by calling the same code as earlier, except that the value of `poss` should be this result “newposs” now.

    In [33]: poss = newposs

    for l in poss:
        for a in range(len(l)):
            for b in range(a + 1, len(l)):
                for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                    v = op(l[a], l[b])
                    nl = []
                    for x in range(len(l)):
                        if x not in [a, b]:
                            nl.append(l[x])
                    nl.append(v)
                    nl.sort()
                    newposs.add(tuple(nl))

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-34-a4bfac256b9a> in <module>()
    ----> 1 for l in poss:
          2     for a in range(len(l)):
          3         for b in range(a + 1, len(l)):
          4             for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
          5                 v = op(l[a], l[b])

    RuntimeError: Set changed size during iteration

This is what happens when you’re using different languages and cannot switch between their different mental models: `poss = newposs` does not copy (as it would in, say, C++), it just makes both of them point to the same value. So modifying `newposs` is the same as modifying `poss`, the set we are iterating over.
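A tiny sketch of that aliasing, with illustrative variable names. A shallow copy via `set(...)` is enough to decouple the two names here, because the elements are immutable tuples of numbers; `copy.deepcopy` also works.

```python
newposs = {(2, 5, 8), (4, 5, 6)}

alias = newposs          # a second name for the same set object
snapshot = set(newposs)  # a new set holding the same elements

newposs.add((0, 5, 6))

alias_sees_change = (0, 5, 6) in alias
snapshot_sees_change = (0, 5, 6) in snapshot
```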

Let’s try again, with proper copying:

    In [35]: poss = set([t])

    In [36]:
    for l in poss:
        for a in range(len(l)):
            for b in range(a + 1, len(l)):
                for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                    v = op(l[a], l[b])
                    nl = []
                    for x in range(len(l)):
                        if x not in [a, b]:
                            nl.append(l[x])
                    nl.append(v)
                    nl.sort()
                    newposs.add(tuple(nl))

    In [37]: import copy

    In [38]: poss = copy.deepcopy(newposs)

    In [39]: newposs = set()

    In [40]:
    for l in poss:
        for a in range(len(l)):
            for b in range(a + 1, len(l)):
                for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                    v = op(l[a], l[b])
                    nl = []
                    for x in range(len(l)):
                        if x not in [a, b]:
                            nl.append(l[x])
                    nl.append(v)
                    nl.sort()
                    newposs.add(tuple(nl))

    In [41]: newposs
    Out[41]: {(-40,), (-30,), (-28,), (-28, 2), ...[snip]... (10, 12), (15,), (19,), (20,), (26,), (44,), (54,), (56,), (84,)}

(This is buggy—has both single values and pairs—because I cleared and re-set poss, but didn’t re-initialize newposs.)

At this point, with the basic iteration working, I moved from the terminal to actually typing it in a text editor:

    import operator

    def iterate(poss):
        newposs = set()
        for l in poss:
            for a in range(len(l)):
                for b in range(a + 1, len(l)):
                    for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                        if op == operator.truediv and l[b] == 0: continue
                        v = op(l[a], l[b])
                        nl = [v]
                        for x in range(len(l)):
                            if x not in [a, b]:
                                nl.append(l[x])
                        newposs.add(tuple(sorted(nl)))
        return newposs

    t = (2, 5, 6, 6)
    poss = set([t])
    print 'Start'
    print list(sorted(poss))
    print 'One'
    poss = iterate(poss)
    print list(sorted(poss))
    print 'two'
    poss = iterate(poss)
    print list(sorted(poss))
    print 'three'
    poss = iterate(poss)
    print list(sorted(poss))

Yes, there are copy-pasted pairs of lines. I also changed `print poss` to `print list(sorted(poss))`, because otherwise Python doesn’t print the set in sorted order.

Also the line

if op == operator.truediv and l[b] == 0: continue

was added later: I originally had written

if op == operator.truediv and b == 0: continue

and spent several minutes debugging it.

(Note BTW that even in this tiny program, issues were caused by mutation (trying to add to `poss`), and the sort-of-functional-programming approach of `iterate(poss)` creating and returning a new set, without mutating the original, is what made it possible to get rid of the remaining bugs.)

Anyway, this is version 1 of the program, and it printed the following output:

Start [(2, 5, 6, 6)] One [(-4, 5, 6), (-3, 6, 6), (-1, 2, 6), (0, 2, 5), (0.3333333333333333, 5, 6), (0.4, 6, 6), (0.8333333333333334, 2, 6), (1.0, 2, 5), (2, 5, 12), (2, 5, 36), (2, 6, 11), (2, 6, 30), (5, 6, 8), (5, 6, 12), (6, 6, 7), (6, 6, 10)] two [(-34, 5), (-31, 2), (-28, 6), (-24, 2), (-24, 5), (-20, 6), (-18, 6), (-10, 5), (-9, 6), (-7, 2), (-7, 6), (-6, 2), (-6, 5), (-5.666666666666667, 5), (-5.6, 6), (-5.166666666666667, 2), (-5, 2), (-4.666666666666667, 6), (-4, -1), (-4, 0.8333333333333334), (-4.0, 2), (-4, 6), (-4, 11), (-4, 30), (-3, 0), (-3, 1.0), (-3, 6), (-3, 12), (-3, 36), (-2, 5), (-2, 6), (-1.1666666666666665, 6), (-1, 0.3333333333333333), (-1.0, 5), (-1, 6), (-1, 8), (-1, 12), (-0.8, 6), (-0.6666666666666666, 5), (-0.5, 6), (-0.16666666666666666, 2), (0, 0.4), (0, 2), (0, 5), (0, 7), (0, 10), (0.05555555555555555, 5), (0.06666666666666667, 6), (0.1388888888888889, 2), (0.16666666666666666, 5), (0.18181818181818182, 6), (0.2, 2), (0.3333333333333333, 0.8333333333333334), (0.3333333333333333, 11), (0.3333333333333333, 30), (0.4, 1.0), (0.4, 12), (0.4, 36), (0.4166666666666667, 2), (0.4166666666666667, 6), (0.5, 5), (0.5454545454545454, 2), (0.6, 6), (0.625, 6), (0.75, 5), (0.8333333333333334, 8), (0.8333333333333334, 12), (0.8571428571428571, 6), (1, 6), (1.0, 7), (1.0, 10), (1.6666666666666665, 6), (1.6666666666666667, 6), (2, 5), (2, 6.0), (2, 6.833333333333333), (2, 17), (2, 36), (2, 41), (2, 60), (2, 66), (2, 180), (2.4000000000000004, 6), (2.8333333333333335, 6), (3.0, 5), (3, 6), (5, 6.333333333333333), (5, 14), (5, 18), (5, 24), (5, 38), (5, 48), (5, 72), (5.333333333333333, 6), (6, 6.4), (6, 13), (6, 16), (6, 17), (6, 22), (6, 32), (6, 40), (6, 42), (6, 60), (7, 12), (7, 36), (8, 11), (8, 30), (10, 12), (10, 36), (11, 12), (12, 30)] three [(-178,), (-170,), (-168,), (-120,), (-108,), (-67,), (-64,), (-62,), (-58,), (-54,), (-50,), (-48,), (-44,), (-43,), (-42,), (-39,), (-36,), (-35.6,), (-34,), (-33.599999999999994,), (-33,), (-30,), 
(-29.666666666666668,), (-29,), (-28.333333333333336,), (-28.0,), (-26,), (-24,), (-22,), (-19,), (-18,), (-16,), (-15.5,), (-15,), (-14,), (-13,), (-12.0,), (-11.6,), (-11.166666666666666,), (-11,), (-10.666666666666668,), (-10.666666666666666,), (-10.333333333333334,), (-10,), (-9,), (-8,), (-7.166666666666667,), (-7.166666666666666,), (-7,), (-6.999999999999999,), (-6.8,), (-6.5,), (-6.0,), (-5.933333333333334,), (-5.818181818181818,), (-5.666666666666667,), (-5.583333333333333,), (-5.4,), (-5.375,), (-5.142857142857143,), (-5,), (-4.944444444444445,), (-4.833333333333333,), (-4.800000000000001,), (-4.8,), (-4.666666666666667,), (-4.5,), (-4.333333333333334,), (-4.333333333333333,), (-4.25,), (-4.0,), (-3.5999999999999996,), (-3.5,), (-3.3333333333333335,), (-3.333333333333333,), (-3.166666666666667,), (-3.1666666666666665,), (-3.0,), (-2.5833333333333335,), (-2.5,), (-2.1666666666666665,), (-2.0,), (-1.8611111111111112,), (-1.8,), (-1.5833333333333333,), (-1.5,), (-1.4545454545454546,), (-1.3333333333333333,), (-1.333333333333333,), (-1.2,), (-1.1666666666666667,), (-1.1333333333333333,), (-1,), (-0.9333333333333332,), (-0.7777777777777778,), (-0.666666666666667,), (-0.6666666666666667,), (-0.6666666666666666,), (-0.6,), (-0.5,), (-0.40000000000000036,), (-0.4,), (-0.36363636363636365,), (-0.3333333333333333,), (-0.25,), (-0.2,), (-0.19444444444444442,), (-0.16666666666666666,), (-0.13333333333333333,), (-0.125,), (-0.08333333333333333,), (0,), (0.01111111111111111,), (0.011111111111111112,), (0.0303030303030303,), (0.030303030303030304,), (0.03333333333333333,), (0.04878048780487805,), (0.05555555555555555,), (0.06944444444444445,), (0.09999999999999999,), (0.1,), (0.10416666666666667,), (0.11764705882352941,), (0.13157894736842105,), (0.14285714285714285,), (0.15,), (0.16666666666666666,), (0.1875,), (0.19444444444444445,), (0.20833333333333334,), (0.26666666666666666,), (0.2727272727272727,), (0.27777777777777773,), (0.2777777777777778,), 
(0.29268292682926833,), (0.3333333333333333,), (0.35294117647058826,), (0.35714285714285715,), (0.375,), (0.39999999999999997,), (0.4,), (0.4000000000000001,), (0.40000000000000036,), (0.46153846153846156,), (0.47222222222222227,), (0.5,), (0.5833333333333334,), (0.6,), (0.7272727272727273,), (0.7894736842105263,), (0.8333333333333333,), (0.8333333333333334,), (0.8888888888888888,), (0.9166666666666666,), (0.9375,), (1.0909090909090908,), (1.1666666666666667,), (1.333333333333333,), (1.4,), (1.8333333333333333,), (2,), (2.138888888888889,), (2.2,), (2.4166666666666665,), (2.5,), (2.5454545454545454,), (3,), (3.5999999999999996,), (3.6666666666666665,), (3.75,), (4.0,), (4.333333333333333,), (4.800000000000001,), (4.833333333333334,), (5,), (5.055555555555555,), (5.142857142857142,), (5.166666666666667,), (5.2,), (5.5,), (5.75,), (6,), (6.066666666666666,), (6.181818181818182,), (6.416666666666667,), (6.6,), (6.625,), (6.666666666666667,), (6.857142857142857,), (7,), (7.666666666666666,), (7.666666666666667,), (8.0,), (8.4,), (8.833333333333332,), (8.833333333333334,), (9,), (10.0,), (11.0,), (11.333333333333332,), (11.333333333333334,), (12.0,), (12.4,), (12.833333333333334,), (13.666666666666666,), (14.4,), (14.400000000000002,), (15.0,), (17.0,), (18,), (19,), (22,), (23,), (26,), (28,), (29,), (30.333333333333332,), (31.666666666666664,), (32.0,), (33,), (34,), (36.4,), (38,), (38.400000000000006,), (42,), (43,), (46,), (48,), (53,), (62,), (66,), (68,), (70,), (72,), (77,), (78,), (82,), (84,), (88,), (90,), (96,), (102,), (120,), (132,), (182,), (190,), (192,), (240,), (252,), (360,)]

See the 17.0 there on the last line (might have to scroll horizontally)? Now I knew that 17 was actually achievable. To find out how, I made it print the op as well, i.e. after

v = op(l[a], l[b])

I (hackily) added the line

if v == 17: print op, l[a], l[b]

and this is the version committed on GitHub. It prints

<built-in function mul> 2.83333333333 6

after “three”, so I know that the top level of our answer to the puzzle is the product of 2.833… and 6.

I was going to dig deeper to find out how 2.83333333333 arose, but at a glance I noticed (0.8333333333333334, 2, 6) as one of the triples output after “one”, and with that, it’s immediately obvious both how 2.83333333333 is formed, and how 0.8333333333333334 is itself formed from 5 and 6, so I know the answer now:

(5/6 + 2) * 6 = 17
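The check `v == 17` above worked even though 5/6 is not exactly representable as a float. A sketch of confirming the answer with exact rational arithmetic, using `fractions.Fraction` (which the later versions of the program also use):

```python
from fractions import Fraction

# (5/6 + 2) * 6 evaluated with exact rationals
exact = (Fraction(5, 6) + 2) * 6

# The same value by distributing the 6: 6 * 5/6 + 6 * 2 = 5 + 12
expanded = Fraction(5, 6) * 6 + 2 * 6
```

With `Fraction` values, equality tests do not depend on how rounding happens to fall.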

The puzzle is solved! From beginning (starting the IPython shell) to end, this whole thing took about 30 to 40 minutes.

But wait: 24 is not in the output. I actually did a Google search for [24 puzzle 6 6 5 2] or something like that and found this page which gives (5-2)*6+6. That didn’t show up because I only took `op(l[a], l[b])` where `a < b`, so it would have taken `2 - 5` but not `5 - 2`. Fixed that bug and also added a display of the entire expression in this commit. Note that with this change, instead of keeping distinct *values*, we now keep distinct *expressions*. And now it did print solutions for 24:
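The effect of restricting to `a < b` can be seen on subtraction alone. In this sketch, `itertools.combinations` plays the role of the index pairs with `a < b` and `itertools.permutations` the role of all ordered pairs of distinct positions; it is an illustration, not the code from the commit.

```python
from itertools import combinations, permutations

nums = (2, 5, 6, 6)

# Each unordered pair once, in list order: 2 - 5 is tried, 5 - 2 is not.
unordered = {x - y for x, y in combinations(nums, 2)}

# Every ordered pair of distinct positions: both x - y and y - x are tried.
ordered = {x - y for x, y in permutations(nums, 2)}
```

Because the list is sorted, the first set contains no positive differences at all, so 5 - 2 = 3 is never available as an intermediate value.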

    ((24, '(((2 * 5) - 6) * 6)'),)
    ((24, '(((5 * 2) - 6) * 6)'),)
    ((24, '(((5 - 2) * 6) + 6)'),)
    ((24, '((6 * (5 - 2)) + 6)'),)
    ((24, '(6 * ((2 * 5) - 6))'),)
    ((24, '(6 * ((5 * 2) - 6))'),)
    ((24, '(6 + ((5 - 2) * 6))'),)
    ((24, '(6 + (6 * (5 - 2)))'),)
    ((24, '(6 - ((2 - 5) * 6))'),)
    ((24, '(6 - (6 * (2 - 5)))'),)

… and similarly for 17 it prints multiple solutions:

    ((17.0, '(((5 / 6) + 2) * 6)'),)
    ((17.0, '((2 + (5 / 6)) * 6)'),)
    ((17.0, '(6 * ((5 / 6) + 2))'),)
    ((17.0, '(6 * (2 + (5 / 6)))'),)

Should we really treat these as different expressions? For example the first two solutions above correspond to the expression trees:

    *                  *
    ├── +              ├── +
    │   ├── /          │   ├── 2
    │   │   ├── 5      │   └── /
    │   │   └── 6      │       ├── 5
    │   └── 2          │       └── 6
    └── 6              └── 6

which are essentially the same tree, if you don’t care about the order of children.

I was able to reduce dupes a bit by not keeping both a+b and b+a, and similarly not both a*b and b*a. This change brings the output down to

    ((17.0, '(((5 / 6) + 2) * 6)'),)
    ...
    ((24, '(((2 * 5) - 6) * 6)'),)
    ((24, '(6 + ((5 - 2) * 6))'),)
    ((24, '(6 - ((2 - 5) * 6))'),)

Now note that there is only one solution for 17 (down from four), and the three expressions for 24 actually correspond to distinct trees.

There are still three levels of dupes:

    ((360, '((2 * 5) * (6 * 6))'),)
    ((360, '((2 * 6) * (5 * 6))'),)
    ((360, '(2 * (5 * (6 * 6)))'),)
    ((360, '(2 * (6 * (5 * 6)))'),)
    ((360, '(5 * (2 * (6 * 6)))'),)
    ((360, '(5 * (6 * (2 * 6)))'),)
    ((360, '(6 * (2 * (5 * 6)))'),)
    ((360, '(6 * (5 * (2 * 6)))'),)
    ((360, '(6 * (6 * (2 * 5)))'),)

— all of these are different expression trees but “morally” the same.

    ((90.0, '((5 * (6 * 6)) / 2)'),)
    ((90.0, '((5 * 6) / (2 / 6))'),)
    ((90.0, '((5 / 2) * (6 * 6))'),)
    ((90.0, '((6 * (5 * 6)) / 2)'),)
    ((90.0, '((6 * 6) / (2 / 5))'),)
    ((90.0, '((6 / 2) * (5 * 6))'),)
    ((90.0, '(5 * ((6 * 6) / 2))'),)
    ((90.0, '(5 * ((6 / 2) * 6))'),)
    ((90.0, '(5 * (6 / (2 / 6)))'),)
    ((90.0, '(5 / ((2 / 6) / 6))'),)
    ((90.0, '(5 / (2 / (6 * 6)))'),)
    ((90.0, '(6 * ((5 * 6) / 2))'),)
    ((90.0, '(6 * ((5 / 2) * 6))'),)
    ((90.0, '(6 * ((6 / 2) * 5))'),)
    ((90.0, '(6 * (5 / (2 / 6)))'),)
    ((90.0, '(6 * (6 / (2 / 5)))'),)
    ((90.0, '(6 / ((2 / 5) / 6))'),)
    ((90.0, '(6 / ((2 / 6) / 5))'),)
    ((90.0, '(6 / (2 / (5 * 6)))'),)

— These involve different operations, but you’re basically keeping 5, 6, 6, in the numerator and 2 in the denominator.

Finally, in

    ((24, '(((2 * 5) - 6) * 6)'),)
    ((24, '(6 + ((5 - 2) * 6))'),)
    ((24, '(6 - ((2 - 5) * 6))'),)

… it could be argued that the last two are equivalent, as would be 6 + 18 and 6 - (-18).

(In this section I’m talking to myself even more than usual, so the vague language may be even harder to understand… ignore until I come back and rewrite this.)

So let’s rethink the number of possible trees out of (a, b, c, d).

If the root node is a `+` (i.e. an addition operation), and one of the children is also a `+`, then the tree can be “rotated”: for any binary operation `op(c,d)`, we have

(a + b) + op(c, d)

= a + (b + op(c, d))

= b + (a + op(c, d))

always, even though all three are distinct binary trees.

Our program based on keeping inequivalent binary trees of expressions will necessarily distinguish them instead of treating them as identical.

So let’s have trees that need not be binary, e.g. `a + b + c + d` would be a single `+` operation with 4 children. (To cast this expression in terms of the original problem we would have to phrase this multi-valent addition operation using binary operations, which we can do in multiple ways: the point is that this denotes all those equivalent expressions.)

(* I guess at this point I would have benefited from trying to formalize what I meant by “equivalent”: I mean something like: two expressions (with numbers replaced by symbols) are equivalent if for any values of (a, b, c, d) the expressions always have the same value.)
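That notion of equivalence can at least be spot-checked numerically. The sketch below compares two expressions on random rational inputs; the function name and parameters are invented for illustration, and agreement on every trial is evidence of equivalence rather than a proof.

```python
import random
from fractions import Fraction

def probably_equivalent(f, g, trials=50, seed=0):
    """Compare f and g on random rational inputs (a, b, c, d).

    Returns False as soon as the two values differ.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c, d = (Fraction(rng.randint(1, 1000), rng.randint(1, 1000))
                      for _ in range(4))
        try:
            if f(a, b, c, d) != g(a, b, c, d):
                return False
        except ZeroDivisionError:
            continue  # skip inputs where either expression is undefined
    return True

same = probably_equivalent(lambda a, b, c, d: (a - b) * (c - d),
                           lambda a, b, c, d: (b - a) * (d - c))
different = probably_equivalent(lambda a, b, c, d: a - b,
                                lambda a, b, c, d: b - a)
```

Exact `Fraction` arithmetic matters here: with floats, two genuinely equivalent expressions could differ in the last bit and be reported as different.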

Further, note that `a + b - c = a - c + b` etc., and further it’s also equal to `a - (c - b)`. So we can say `+` and `-` are on the same level, with at most one `-` sign needed. For example, `a - b + c - d` can be written `a + c - b - d` or `a + c - (b + d)`. To put it differently, an expression with addition or subtraction at the top level can be written as two sets of expressions, with everything in the first set taken positively, and everything in the second set taken negatively (the second set possibly empty). Similarly, multiplication and division at the same level. And we don’t allow an addition or subtraction to be a child of an addition or subtraction (the “rotation” issue from earlier), and similarly multiplication and division.
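The "two sets" view of an add/sub chain can be sketched as a normal form. This is only an illustration with symbols as strings; the function name `normal_form` is hypothetical.

```python
def normal_form(signed_terms):
    """Collapse a flat chain of + and - into (positive side, negative side).

    `signed_terms` is a list of (sign, name) pairs with sign '+' or '-'.
    Each side is returned as a sorted tuple, so the order in which the
    terms were written does not matter.
    """
    plus = sorted(name for sign, name in signed_terms if sign == '+')
    minus = sorted(name for sign, name in signed_terms if sign == '-')
    return (tuple(plus), tuple(minus))

# a - b + c - d
first = normal_form([('+', 'a'), ('-', 'b'), ('+', 'c'), ('-', 'd')])

# a + c - b - d, which is also a + c - (b + d)
second = normal_form([('+', 'a'), ('+', 'c'), ('-', 'b'), ('-', 'd')])
```

Both spellings reduce to the same pair of sides, which is exactly the sense in which they count as one expression.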

This means the possible trees are:

    (with top level + or -)
    2 children: a ± muldiv(b, c, d) and similarly with b, c, d as the leaf,
                muldiv(a, b) ± muldiv(c, d) and similarly three others,
    3 children: a ± b ± muldiv(c, d)
    4 children: a ± b ± c ± d

For the other case, of identifying `6 - ((2-5)*6)` and `6 + ((5-2)*6)`, we could adopt the convention that we’ll never put a negative number on the negative side: `(...)-(-x) = (...)+x` — we could adopt this convention if we could prove that if `-x` is achievable, then so is `x`. The problem is that this is not always the case: e.g. one of the given numbers could be negative. (The problem only allows the binary operation of subtraction, so we can’t negate a number willy-nilly: the unary operation of negation is not part of the problem.) However, if the expression is a multiplication or division, and one of the factors is a subtraction (or is otherwise additively invertible), then we could avoid putting it on the negative side.

[The other alternative is to only perform subtractions in a canonical order. There might be a way to make this work, but I wasn’t able to quickly tell.]

Similarly for multiplication and division.

This gives us the following algorithm, similar to the previous.

- Each expression has a value, a type (add/sub, mul/div, or atom), a flag saying whether it can be additively inverted, a flag saying whether it can be multiplicatively inverted, and its actual structure.

Given a list of numbers (or, in general, expressions), repeatedly:

- Decide on an operation, either add/sub or mul/div
- If add/sub, then
    - among the non-add/sub expressions, pick a nonempty subset for the additive side, and a subset for the negative side.
    - Form the new expression. Its value and type and structure are obvious. Its multiplicative-inverse flag is False, and its additive-inverse flag is True if either:
        - there's a nonempty subtraction side, or
        - all the elements have the additive-inverse flag True.
    - Replace the selected expressions with this new expression.
- (Similarly if mul/div.)

Repeat until there's only one number left.

I wrote this program and fixed some bugs (with the old version testing this), and it prints

`360=2 * 5 * 6 * 6` exactly once now. It also prints only the two really distinct (inequivalent) solutions for 24:

    24=((2 * 5) - 6) * 6
    24=6 + ((5 - 2) * 6)

Here’s a version of the program without the duplicate removal (it still gets only one expression for 360, so at least it solves the first two levels of dupes):

    from fractions import Fraction
    import operator

    def product(factors):
        return reduce(operator.mul, factors, 1)

    ADD_SUB = 'add/sub'
    MUL_DIV = 'mul/div'
    ATOM = 'atom'

    class Expression(object):
        def __init__(self, op_type, args_l, args_r, value=None):
            self.op_type = op_type
            if op_type in [ADD_SUB, MUL_DIV]:
                self.args_l = args_l
                self.args_r = args_r
                self.value = self.compute_value()
            else:
                self.value = value

        def compute_value(self):
            if self.op_type == ADD_SUB:
                return sum(e.value for e in self.args_l) - sum(e.value for e in self.args_r)
            elif self.op_type == MUL_DIV:
                return product(e.value for e in self.args_l) / product(e.value for e in self.args_r)
            else:
                raise TypeError('No need to compute value of an atom.')

        def str_expr(self):
            if self.op_type == ATOM:
                return str(self.value)
            else:
                symbol = {ADD_SUB: ' + ', MUL_DIV: ' * '}[self.op_type]
                inverse = {ADD_SUB: ' - ', MUL_DIV: ' / '}[self.op_type]
                lhs = symbol.join([('%s' if e.op_type == ATOM else '(%s)') % e.str_expr() for e in self.args_l])
                rhs = symbol.join([('%s' if e.op_type == ATOM else '(%s)') % e.str_expr() for e in self.args_r])
                if not self.args_r:
                    return '%s' % lhs
                if len(self.args_r) > 1:
                    rhs = '(%s)' % rhs
                return '%s%s%s' % (lhs, inverse, rhs)

        def __str__(self):
            return '%s = %s' % (self.value, self.str_expr())

        def __eq__(self, other):
            return str(self) == str(other)

        def __cmp__(self, other):
            if self.value != other.value:
                return cmp(self.value, other.value)
            return cmp(str(self), str(other))

        # For use in a set
        def __hash__(self):
            return hash(str(self))

    def three_subsets(l):
        """Yields all ways of partitioning l into three subsets."""
        if len(l) == 0:
            yield ([], [], [])
            return
        for (a, b, c) in three_subsets(l[:-1]):
            last = l[-1]
            yield (a + [last], b, c)
            yield (a, b + [last], c)
            yield (a, b, c + [last])

    def iterate(poss):
        new_poss = set()
        for l in poss:
            if len(l) == 1:
                new_poss.add(l)
                continue  # Nothing further to do here
            for operation in [ADD_SUB, MUL_DIV]:
                for (candidates_l, candidates_r, others) in three_subsets(l):
                    if not candidates_l: continue
                    if len(candidates_l) == 1 and len(candidates_r) == 0: continue
                    # Cannot have an ADD_SUB parent of an ADD_SUB, etc.
                    if any(e.op_type == operation for e in candidates_l + candidates_r): continue
                    # Avoid dividing by zero
                    if operation == MUL_DIV and any(e.value == 0 for e in candidates_r): continue
                    new_e = Expression(operation, candidates_l, candidates_r)
                    new_l = tuple(sorted(others + [new_e]))
                    new_poss.add(new_l)
        return new_poss

    def atom(value):
        return Expression(ATOM, None, None, Fraction(value))

    start = (atom(2), atom(5), atom(6), atom(6))
    poss = set([start])   # four expressions
    poss = iterate(poss)  # at most three (in each possibility)
    poss = iterate(poss)  # at most two
    poss = iterate(poss)  # at most one
    for t in sorted(poss):
        assert len(t) == 1
        print t[0]

With [6, 6, 5, 2], it (the linked version with duplicate-removal, i.e. it avoids putting negative values on the right side of a subtraction) prints 656 distinct expressions, for 380 different values: like 24 here, many values can be formed in multiple truly different ways.

At this point, I tried to print all expressions with a different set of values (instead of [6, 6, 5, 2]) for which such “coincidences” would be unlikely, picking the far-apart numbers 2, 21, 430, 8507. It gave me 1170 distinct values, with 1260 different expressions. I hacked it further to ignore the values and operate on just symbols, and tried it with 1, 2, and 3 symbols as well (for which the counts were 1, 6, and 68 expressions respectively). I then looked up this sequence [1, 6, 68, 1260] on OEIS and it was not there.

I was feeling good about myself at having possibly discovered something new (new to OEIS at least), but then I realized the 1170 was probably the correct number of distinct expressions. (I was able to confirm this to my satisfaction by manually looking at some of the dupes: they were dupes like `(a-b)*(c-d) = (b-a)*(d-c)`.)

And lo and behold, the sequence [1, 6, 68, 1170] *is* on OEIS: OEIS A140606.

It pointed to a Chinese forum where the problem was posed and solved, and another where it was extended. After downloading the C program by user mathe on bbs.emath.ac.cn (= Zhao Hui Du?) and trying to understand it, one idea I have to address this `(a-b)/(c-d) = (b-a)/(d-c)` problem is to count negations as simply a doubling.

That is, we keep only canonical forms (say a-b and c-d, never b-a or d-c), and in any expression, simply record that the negation is possible.

Thus any mul-div like xyz or xy/z is negatable if at least one of the factors is negatable. For an add-sub, we need a little more care.
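The negatability rule can be stated compactly on a toy tree representation. This sketch is condensed from the `create_negation` logic in the program below, with plain tuples standing in for its `Expression` class; the tuple layout and the function name are illustrative assumptions, and atoms are treated as never negatable, as in that program.

```python
def is_negatable(node):
    """node is ('atom', name) or (kind, left_args, right_args), kind 'add' or 'mul'.

    left_args are added (or multiplied); right_args are subtracted (or divided by).
    """
    kind = node[0]
    if kind == 'atom':
        return False
    _, left, right = node
    if kind == 'add':
        # x - y flips to y - x; otherwise some added term must itself be negatable.
        return bool(right) or any(is_negatable(e) for e in left)
    if kind == 'mul':
        # Negating any one factor, in numerator or denominator, negates the whole.
        return any(is_negatable(e) for e in left + right)
    raise ValueError('unknown node kind: %r' % (kind,))

a, b, c, d = (('atom', name) for name in 'abcd')
a_minus_b = ('add', [a], [b])
a_plus_b = ('add', [a, b], [])
product_with_difference = ('mul', [a_minus_b, c], [])
plain_product = ('mul', [a, b], [c])
sum_with_negatable_term = ('add', [d, ('mul', [a_minus_b], [c])], [])
```

The add/sub case is where the "little more care" comes in: a pure sum such as `a + b` is not negatable, but `d + (a - b) / c` is, because one of its terms can be flipped and moved to the front.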

This needs explanation, and explaining this was supposed to be the main point of this post, but for now I’ll just give the program. It prints precisely the distinct expressions (verified for n=4, n=5, etc).

    from fractions import Fraction
    import operator

    def product(factors):
        return reduce(operator.mul, factors, 1)

    ADD_SUB = 'add/sub'
    MUL_DIV = 'mul/div'
    ATOM = 'atom'

    class Expression(object):
        def __init__(self, op_type, args_l, args_r, value=None, is_negation=False):
            self.op_type = op_type
            if op_type in [ADD_SUB, MUL_DIV]:
                self.args_l = args_l
                self.args_r = args_r
                self.value = self.compute_value()
            else:
                self.value = value
            self.negation = None if is_negation else self.create_negation()

        def create_negation(self):
            """Given an Expression `self`, returns its negation if it is negatable. Does not mutate self."""
            if self.op_type == ADD_SUB:
                if self.args_r:
                    # x - y -> y - x
                    return Expression(self.op_type, self.args_r, self.args_l, is_negation=True)
                elif any(e.negation for e in self.args_l):
                    # x + (-y) + z -> y - (x + z)
                    first = None
                    rest = []
                    for e in self.args_l:
                        if first is None and e.negation:
                            first = e.negation
                        else:
                            rest.append(e)
                    assert first
                    return Expression(self.op_type, [first], rest, is_negation=True)
            elif self.op_type == MUL_DIV and any(e.negation for e in self.args_l + self.args_r):
                new_args_l = []
                new_args_r = []
                negated_yet = False
                for e in self.args_l:
                    if not negated_yet and e.negation:
                        new_args_l.append(e.negation)
                        negated_yet = True
                    else:
                        new_args_l.append(e)
                for e in self.args_r:
                    if not negated_yet and e.negation:
                        new_args_r.append(e.negation)
                        negated_yet = True
                    else:
                        new_args_r.append(e)
                assert(negated_yet)
                return Expression(self.op_type, new_args_l, new_args_r, is_negation=True)
            return None

        def compute_value(self):
            if self.op_type == ADD_SUB:
                return sum(e.value for e in self.args_l) - sum(e.value for e in self.args_r)
            elif self.op_type == MUL_DIV:
                return product(e.value for e in self.args_l) / product(e.value for e in self.args_r)
            else:
                raise TypeError('No need to compute value of an atom.')

        def str_expr(self):
            if self.op_type == ATOM:
                return str(self.value)
            else:
                symbol = {ADD_SUB: ' + ', MUL_DIV: ' * '}[self.op_type]
                inverse = {ADD_SUB: ' - ', MUL_DIV: ' / '}[self.op_type]
                lhs = symbol.join([('%s' if e.op_type == ATOM else '(%s)') % e.str_expr() for e in self.args_l])
                rhs = symbol.join([('%s' if e.op_type == ATOM else '(%s)') % e.str_expr() for e in self.args_r])
                if not self.args_r:
                    return '%s' % lhs
                if len(self.args_r) > 1:
                    rhs = '(%s)' % rhs
                return '%s%s%s' % (lhs, inverse, rhs)

        def __str__(self):
            return self.str_expr()

        def __eq__(self, other):
            return str(self) == str(other)

        def __cmp__(self, other):
            if self.value != other.value:
                return cmp(self.value, other.value)
            return cmp(str(self), str(other))

        # For use in a set
        def __hash__(self):
            return hash(str(self))

    def atom(value):
        return Expression(ATOM, None, None, Fraction(value))

    def three_subsets(l):
        """Yields all ways of partitioning l into three subsets."""
        if len(l) == 0:
            yield ([], [], [])
            return
        for (a, b, c) in three_subsets(l[:-1]):
            last = l[-1]
            yield (a + [last], b, c)
            yield (a, b + [last], c)
            yield (a, b, c + [last])

    def iterate_expressions(poss):
        """Given a set of lists of expressions, generates a new set of lists of expressions."""
        new_poss = set()
        for l in poss:
            if len(l) == 1:
                new_poss.add(l)
                continue  # Nothing further to do here
            for operation in [ADD_SUB, MUL_DIV]:
                # Cannot have an ADD_SUB parent of an ADD_SUB, or a MUL_DIV parent of a MUL_DIV
                candidates = [e for e in l if e.op_type != operation]
                non_candidates = [e for e in l if e.op_type == operation]
                for (candidates_l, candidates_r, others) in three_subsets(candidates):
                    if not candidates_l: continue
                    if len(candidates_l) == 1 and len(candidates_r) == 0: continue
                    # Avoid dividing by zero
                    if operation == MUL_DIV and any(e.value == 0 for e in candidates_r): continue
                    # To avoid dupes, we keep only one of each pair of negatives: never both e1 - e2 and e2 - e1.
                    if operation == ADD_SUB and candidates_r and candidates_l < candidates_r: continue
                    new_e = Expression(operation, candidates_l, candidates_r)
                    new_l = tuple(sorted(non_candidates + others + [new_e]))
                    new_poss.add(new_l)
        return new_poss

    def all_expressions(values):
        start = tuple(atom(v) for v in values)
        poss = set([start])
        for _ in range(len(start) - 1):
            poss = iterate_expressions(poss)
        # Include the left-out negations as well.
        actual_poss = set()
        for t in sorted(poss):
            assert len(t) == 1
            e = t[0]
            actual_poss.add(e)
            if e.negation:
                actual_poss.add(e.negation)
                assert e.negation.value == -e.value
        print 'Without negations:', len(poss), 'Including negations:', len(actual_poss), 'Distinct values:', len(set(e.value for e in actual_poss))
        return actual_poss

    # Old version of the program, for comparison
    def iterate_values(poss):
        newposs = set()
        for l in poss:
            for a in range(len(l)):
                for b in range(len(l)):
                    if b == a: continue
                    for op in [operator.add, operator.sub, operator.mul, operator.truediv]:
                        if op == operator.truediv and l[b] == 0: continue
                        v = op(l[a], l[b])
                        nl = [v] + [l[x] for x in range(len(l)) if x not in [a, b]]
                        newposs.add(tuple(sorted(nl)))
        return newposs

    def all_values(values):
        start = tuple(Fraction(v) for v in values)
        poss = set([start])
        for _ in range(len(start) - 1):
            poss = iterate_values(poss)
        values = set()
        for t in poss:
            assert len(t) == 1
            values.add(t[0])
        # Print counts
        print '(Old) number of values:', len(poss), '=', len(values)
        return values

    def compare(values):
        print values
        expressions = all_expressions(values)
        values_new = set(t.value for t in expressions)
        values_old = all_values(values)
        for v in sorted(values_old):
            if v not in values_new:
                print 'Only old: ', v
        for v in sorted(values_new):
            if v not in values_old:
                print 'Only new: ', v
        assert values_old == values_new

    compare([6, 6, 5, 2])
    compare([2, 4, 5, 6])
    compare([2, 21, 430, 8607])

]]>

This is a very hard question. Understanding is an individual and internal matter that is hard to be fully aware of, hard to understand and often hard to communicate. We can only touch on it lightly here.

People have very different ways of understanding particular pieces of mathematics. To illustrate this, it is best to take an example that practicing mathematicians understand in multiple ways, but that we see our students struggling with. The derivative of a function fits well. The derivative can be thought of as:

- Infinitesimal: the ratio of the infinitesimal change in the value of a function to the infinitesimal change in a function.
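- Symbolic: the derivative of $x^n$ is $nx^{n-1}$, the derivative of $\sin(x)$ is $\cos(x)$, the derivative of $f \circ g$ is $f' \circ g \ast g'$, etc.
- Logical: $f'(x) = d$ if and only if for every $\epsilon$ there is a $\delta$ such that when $0 < |\Delta x| < \delta$, $\left|\frac{f(x + \Delta x) - f(x)}{\Delta x} - d\right| < \epsilon$.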
- Geometric: the derivative is the slope of a line tangent to the graph of the function, if the graph has a tangent.
- Rate: the instantaneous speed of $f(t)$, when $t$ is time.
- Approximation: The derivative of a function is the best linear approximation to the function near a point.
- Microscopic: The derivative of a function is the limit of what you get by looking at it under a microscope of higher and higher power.

This is a list of different ways of *thinking about* or *conceiving of* the derivative, rather than a list of different *logical definitions*. Unless great efforts are made to maintain the tone and flavor of the original human insights, the differences start to evaporate as soon as the mental concepts are translated into precise, formal and explicit definitions.

I can remember absorbing each of these concepts as something new and interesting, and spending a good deal of mental time and effort digesting and practicing with each, reconciling it with the others. I also remember coming back to revisit these different concepts later with added meaning and understanding.

The list continues; there is no reason for it ever to stop. A sample entry further down the list may help illustrate this. We may think we know all there is to say about a certain subject, but new insights are around the corner. Furthermore, one person’s clear mental image is another person’s intimidation:

- The derivative of a real-valued function $f$ in a domain $D$ is the Lagrangian section of the cotangent bundle $T^*(D)$ that gives the connection form for the unique flat connection on the trivial $\mathbf{R}$-bundle $D \times \mathbf{R}$ for which the graph of $f$ is parallel.

These differences are not just a curiosity. Human thinking and understanding do not work on a single track, like a computer with a single central processing unit. Our brains and minds seem to be organized into a variety of separate, powerful facilities. These facilities work together loosely, “talking” to each other at high levels rather than at low levels of organization.

This has been extended on the MathOverflow question Different ways of thinking about the derivative, where you can find even more ways of thinking about the derivative. (Two of the interesting pointers are to this discussion on the n-Category Café, and to the book Calculus Unlimited by Marsden and Weinstein, which does calculus using a “method of exhaustion” that does not involve limits. Its definition of the derivative is also mentioned at the earlier link, as *that notion of the derivative closest to [the idea of Eudoxus and Archimedes] of “the tangent line touches the curve, and in the space between the line and the curve, no other straight line can be interposed”, or “the line which touches the curve only once”*; this counts as another important way of thinking about the derivative.)

It has been best extended by Terence Tao, who in an October 2009 blog post on Grothendieck’s definition of a group gave several ways of thinking about a group:

In his wonderful article “On proof and progress in mathematics“, Bill Thurston describes (among many other topics) how one’s understanding of given concept in mathematics (such as that of the derivative) can be vastly enriched by viewing it simultaneously from many subtly different perspectives; in the case of the derivative, he gives seven standard such perspectives (infinitesimal, symbolic, logical, geometric, rate, approximation, microscopic) and then mentions a much later perspective in the sequence (as describing a flat connection for a graph).

One can of course do something similar for many other fundamental notions in mathematics. For instance, the notion of a group can be thought of in a number of (closely related) ways, such as the following:

- Motivating examples: A group is an abstraction of the operations of addition/subtraction or multiplication/division in arithmetic or linear algebra, or of composition/inversion of transformations.
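- Universal algebraic: A group is a set $G$ with an identity element $e$, a unary inverse operation $\cdot^{-1}: G \to G$, and a binary multiplication operation $\cdot: G \times G \to G$ obeying the relations (or axioms) $e \cdot x = x \cdot e = x$, $x \cdot x^{-1} = x^{-1} \cdot x = e$, $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ for all $x, y, z \in G$.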
- Symmetric: A group is all the ways in which one can transform a space to itself while preserving some object or structure on this space.
- Representation theoretic: A group is identifiable with a collection of transformations on a space which is closed under composition and inverse, and contains the identity transformation.
- Presentation theoretic: A group can be generated by a collection of generators subject to some number of relations.
- Topological: A group is the fundamental group $\pi_1(X)$ of a connected topological space $X$.
- Dynamic: A group represents the passage of time (or of some other variable(s) of motion or action) on a (reversible) dynamical system.
- Category theoretic: A group is a category with one object, in which all morphisms have inverses.
- Quantum: A group is the classical limit of a quantum group.
etc.

One can view a large part of group theory (and related subjects, such as representation theory) as exploring the interconnections between various of these perspectives. As one’s understanding of the subject matures, many of these formerly distinct perspectives slowly merge into a single unified perspective.

From a recent talk by Ezra Getzler, I learned a more sophisticated perspective on a group, somewhat analogous to Thurston’s example of a sophisticated perspective on a derivative (and coincidentally, flat connections play a central role in both):

- Sheaf theoretic: A group is identifiable with a (set-valued) sheaf on the category of simplicial complexes such that the morphisms associated to collapses of $d$-simplices are bijective for $d > 1$ (and merely surjective for $d \leq 1$).

The rest of the post elaborates on this understanding.

Again in a Google Buzz post on Jun 9, 2010, Tao posted the following:

Bill Thurston’s “On proof and progress in mathematics” has many nice observations about the nature and practice of modern mathematics. One of them is that for any fundamental concept in mathematics, there is usually no “best” way to define or think about that concept, but instead there is often a family of interrelated and overlapping, but distinct, perspectives on that concept, each of which conveying its own useful intuition and generalisations; often, the combination of all of these perspectives is far greater than the sum of the parts. Thurston illustrates this with the concept of differentiation, to which he lists seven basic perspectives and one more advanced perspective, and hints at dozens more.

But even the most basic of mathematical concepts admit this multiplicity of interpretation and perspective. Consider for instance the operation of addition, that takes two numbers x and y and forms their sum x+y. There are many such ways to interpret this operation:

1. (Disjoint union) x+y is the “size” of the disjoint union X u Y of an object X of size x, and an object Y of size y. (Size is, of course, another concept with many different interpretations: cardinality, volume, mass, length, measure, etc.)

2. (Concatenation) x+y is the size of the object formed by concatenating an object X of size x with an object Y of size y (or by appending Y to X).

3. (Iteration) x+y is formed from x by incrementing it y times.

4. (Superposition) x+y is the “strength” of the superposition of a force (or field, intensity, etc.) of strength x with a force of strength y.

5. (Translation action) x+y is the translation of x by y.

5a. (Translation representation) x+y is the amount of translation or displacement incurred by composing a translation by x with a translation by y.

6. (Algebraic) + is a binary operation on numbers that give it the structure of an additive group (or monoid), with 0 being the additive identity and 1 being the generator of the natural numbers or integers.

7. (Logical) +, when combined with the other basic arithmetic operations, are a family of structures on numbers that obey a set of axioms such as the Peano axioms.

8. (Algorithmic) x+y is the output of the long addition algorithm that takes x and y as input.

9. etc.

These perspectives are all closely related to each other; this is why we are willing to give them all the common name of “addition”, and the common symbol of “+”. Nevertheless there are some slight differences between each perspective. For instance, addition of cardinals is based on perspective 1, while addition of ordinals is based on perspective 2. This distinction becomes apparent once one considers infinite cardinals or ordinals: for instance, in cardinal arithmetic, aleph_0 = 1+ aleph_0 = aleph_0 + 1 = aleph_0 + aleph_0, whereas in ordinal arithmetic, omega = 1+omega < omega+1 < omega + omega.

Transitioning from one perspective to another is often a necessary first conceptual step when the time comes to generalise the concept. As a child, addition of natural numbers is usually taught initially by using perspective 1 or 3, but to generalise to addition of integers, one must first switch to a perspective such as 4, 5, or 5a; similar conceptual shifts are needed when one then turns to addition of rationals, real numbers, complex numbers, residue classes, functions, matrices, elements of abstract additive groups, nonstandard number systems, etc. Eventually, one internalises all of the perspectives (and their inter-relationships) simultaneously, and then becomes comfortable with the addition concept in a very broad set of contexts; but it can be more of a struggle to do so when one has grasped only a subset of the possible ways of thinking about addition.

In many situations, the various perspectives of a concept are either completely equivalent to each other, or close enough to equivalent that one can safely “abuse notation” by identifying them together. But occasionally, one of the equivalences breaks down, and then it becomes useful to maintain a careful distinction between two perspectives that are almost, but not quite, compatible. Consider for instance the following ways of interpreting the operation of exponentiation x^y of two numbers x, y:

1. (Combinatorial) x^y is the number of ways to make y independent choices, each of which chooses from x alternatives.

2. (Set theoretic) x^y is the size of the space of functions from a set Y of size y to a set X of size x.

3. (Geometric) x^y is the volume (or measure) of a y-dimensional cube (or hypercube) whose sidelength is x.

4. (Iteration) x^y is the operation of starting at 1 and multiplying by x y times.

5. (Homomorphism) y → x^y is the continuous homomorphism from the domain of y (with the additive group structure) to the range of x^y (with the multiplicative structure) that maps 1 to x.

6. (Algebraic) ^ is the operation that obeys the laws of exponentiation in algebra.

7. (Log-exponential) x^y is exp( y log x ). (This raises the question of how to interpret exp and log, and again there are multiple perspectives for each…)

8. (Complex-analytic) Complex exponentiation is the analytic continuation of real exponentiation.

9. (Computational) x^y is whatever my calculator or computer outputs when it is asked to evaluate x^y.

10. etc.

Again, these interpretations are usually compatible with each other, but there are some key exceptions. For instance, the quantity 0^0 would be equal to zero [ed: I think this should be one. -S] using some of these interpretations, but would be undefined in others. The quantity 4^{1/2} would be equal to 2 in some interpretations, be undefined in others, and be equal to the multivalued expression +-2 (or to depend on a choice of branch) in yet further interpretations. And quantities such as i^i are sufficiently problematic that it is usually best to try to avoid exponentiation of one arbitrary complex number by another arbitrary complex number unless one knows exactly what one is doing. In such situations, it is best not to think about a single, one-size-fits-all notion of a concept such as exponentiation, but instead be aware of the context one is in (e.g. is one raising a complex number to an integer power? A positive real to a complex power? A complex number to a fractional power? etc.) and to know which interpretations are most natural for that context, as this will help protect against making errors when manipulating expressions involving exponentiation.

It is also quite instructive to build one’s own list of interpretations for various basic concepts, analogously to those above (or Thurston’s example). Some good examples of concepts to try this on include “multiplication”, “integration”, “function”, “measure”, “solution”, “space”, “size”, “distance”, “curvature”, “number”, “convergence”, “probability” or “smoothness”. See also my blog post below in which the concept of a “group” is considered.

I plan to collect more such “different ways of thinking about the same (mathematical) thing” in this post, as I encounter them.

]]>

This was a journal that ran from 1866 to 1920, and some issues are available online. “The Benares College” in its title is what was the first college in the city (established 1791), later renamed the Government Sanskrit College, Varanasi, and now the Sampurnanand Sanskrit University.

There are some interesting things in there. From a cursory look, it’s mainly editions of Sanskrit works (Kavya, Mimamsa, Sankhya, Nyaya, Vedanta, Vyakarana, etc.) and translations of some, along with the occasional harsh review of a recent work (printed anonymously of course), but also contains, among other things, (partial?) translations into Sanskrit of John Locke’s *An Essay Concerning Human Understanding* and Bishop Berkeley’s *A Treatise Concerning the Principles of Human Knowledge.* Also some hilarious (and quite valid) complaints about miscommunication between English Orientalists and traditional pandits, with their different education systems and different notions of what topics are simple and what are advanced.

The journal’s motto:

श्रीमद्विजयिनीदेवीपाठशालोदयोदितः । प्राच्यप्रतीच्यवाक्पूर्वापरपक्षद्वयान्वितः ॥

अङ्करश्मिः स्फुटयतु काशीविद्यासुधानिधिः । प्राचीनार्यजनप्रज्ञाविलासकुमुदोत्करान् ॥

The metadata is terrible: there’s only an index of sorts at the end of the whole volume; each issue of the journal carries no table of contents (or if it did, they have been ripped out when binding each (June to May) year’s issues into volumes). Authorship information is scarce. Some translations have been abandoned. (I arrived at this journal looking at Volume 9 where an English translation of Kedārabhaṭṭa’s *Vṛtta-ratnākara* is begun, carried into three chapters (published in alternate issues), left with a “to be continued” as usual, except there’s no mention of it in succeeding issues.) Still, a lot of interesting stuff in there.

Among the British contributors/editors of the journal were Ralph T. H. Griffith (who translated the Ramayana into English verse: there are advertisements for the translation in these volumes) and James R. Ballantyne (previously encountered as the author of *Iṅglaṇḍīya-bhāṣā-vyākaraṇam*, a book on English grammar written in Sanskrit; he seems to have been an ardent promoter of Christianity, but also an enthusiastic worker for more dialogue between the pandits and the Western scholars), each of whom served as the principal of the college. (Later principals of the college include Ganganath Jha and Gopinath Kaviraj.) Among the Indian contributors to the journal are Vitthala Shastri (who in 1852 appears to have written a Sanskrit commentary on Francis Bacon’s *Novum Organum*; I think it’s this, but see also the preface of this book for context), Bapudeva Sastri, and others. Probably the contributors were all faculty of the college; consider the 1853 list of faculty here. (Also note the relative salaries!)

Had previously encountered a mention of this magazine in this book (post).

The issues I could find—and I searched quite thoroughly I think—are below. Preferably, someone needs to download from Google Books and re-upload to the Internet Archive, as books on Google Books have an occasional tendency to disappear (or get locked US-only).

https://books.google.com/books?id=Z71EAAAAcAAJ 1866 Vol 1 (1 – 12)

https://books.google.com/books?id=ESgJAAAAQAAJ 1866 vol 1 (1 – 12)

https://books.google.com/books?id=Sr8IAAAAQAAJ 1866 Vol 1 (1 – 12)

https://books.google.com/books?id=JAspAAAAYAAJ 1866 vol 1-3 (1 – 36)

https://books.google.com/books?id=Y78IAAAAQAAJ 1867 Vol 2 (13 – 24)

https://books.google.com/books?id=JigJAAAAQAAJ 1867 Vol 2 (13 – 24)

https://books.google.com/books?id=cL1EAAAAcAAJ 1867 Vol 2 (13 – 24)

https://books.google.com/books?id=g78IAAAAQAAJ 1868 Vol 3 (25 – 36)

https://books.google.com/books?id=eL1EAAAAcAAJ 1868 Vol 3 (25 – 36)

https://books.google.com/books?id=OSgJAAAAQAAJ 1868 Vol 3 (25 – 36)

https://books.google.com/books?id=m78IAAAAQAAJ 1869 vol 4 (37 – 48)

https://books.google.com/books?id=WygJAAAAQAAJ 1869 Vol 4 (37 – 48)

https://books.google.com/books?id=g71EAAAAcAAJ 1869 vol 4 (37 – 48)

https://books.google.com/books?id=vr8IAAAAQAAJ 1870 vol 5 (49 – 60)

https://books.google.com/books?id=eCgJAAAAQAAJ 1870 vol 5 (49 – 60)

https://books.google.com/books?id=24dSAAAAcAAJ 1870 vol 5 (49 – 60)

https://books.google.com/books?id=0b8IAAAAQAAJ 1871 Vol 6 (61 – 72)

https://books.google.com/books?id=nigJAAAAQAAJ 1871 vol 6 (61 – 72)

https://books.google.com/books?id=5YdSAAAAcAAJ 1871 vol 6 (61 – 72)

https://books.google.com/books?id=878IAAAAQAAJ 1872 Vol 7 (73 – 84)

https://books.google.com/books?id=uCgJAAAAQAAJ 1872 Vol 7 (73 – 84)

https://books.google.com/books?id=TrZUAAAAcAAJ 1872 vol 7 (73 – 84)

https://books.google.com/books?id=6ygJAAAAQAAJ 1873 vol 8 (85 – 96)

https://books.google.com/books?id=ASkJAAAAQAAJ 1874 vol 9 (97 – 108)

https://books.google.com/books?id=KMAIAAAAQAAJ 1874 vol 9 (97 – 108)

https://books.google.com/books?id=ICkJAAAAQAAJ 1875 Vol 10 (109 – 120)

https://books.google.com/books?id=CcAIAAAAQAAJ 1875 vol 10 (109 – 120)

[New series]

https://books.google.com/books?id=jHxFAQAAIAAJ 1876 vol 1

https://books.google.com/books?id=A_lSAAAAYAAJ 1877 vol 2

https://books.google.com/books?id=M31FAQAAIAAJ 1877 vol 2

https://books.google.com/books?id=ZQspAAAAYAAJ 1877 vol 2

https://books.google.com/books?id=rgspAAAAYAAJ 1879 vol 3

https://books.google.com/books?id=w31FAQAAIAAJ 1879 vol 3

https://books.google.com/books?id=EA0pAAAAYAAJ 1882 Vol 4

https://books.google.com/books?id=Pn5FAQAAIAAJ 1882 vol 4

https://books.google.com/books?id=gzoJAAAAQAAJ 1882 vol 4

https://books.google.com/books?id=XA0pAAAAYAAJ 1883 Vol 5

https://books.google.com/books?id=jikJAAAAQAAJ 1883 Vol 5

https://books.google.com/books?id=3X5FAQAAIAAJ 1883 vol 5

https://books.google.com/books?id=zSkJAAAAQAAJ 1884 vol 6

https://books.google.com/books?id=vQ0pAAAAYAAJ 1885 vol 7

https://books.google.com/books?id=Oi8JAAAAQAAJ 1885 vol 7

https://books.google.com/books?id=FQ4pAAAAYAAJ 1886 Vol 8

https://books.google.com/books?id=JwopAAAAYAAJ 1887 vol 9

https://books.google.com/books?id=fQ4pAAAAYAAJ 1888 Vol 10

https://books.google.com/books?id=gAopAAAAYAAJ 1890 vol 12

https://books.google.com/books?id=2wopAAAAYAAJ 1891 vol 13

https://books.google.com/books?id=LwspAAAAYAAJ 1892 vol 14

https://books.google.com/books?id=pAspAAAAYAAJ 1895 Vol 17

https://books.google.com/books?id=wdc9AQAAMAAJ 1895 vol 17

https://books.google.com/books?id=BAwpAAAAYAAJ 1896 Vol 18

https://books.google.com/books?id=UwwpAAAAYAAJ 1897 vol 19

https://books.google.com/books?id=1wwpAAAAYAAJ 1898 Vol 20

https://books.google.com/books?id=1g0pAAAAYAAJ 1899 Vol 21

https://books.google.com/books?id=iBNAAQAAMAAJ 1899 Vol 21

https://books.google.com/books?id=Xg4pAAAAYAAJ 1900 Vol 22

https://books.google.com/books?id=MhApAAAAYAAJ 1901 Vol 23

https://books.google.com/books?id=4w4pAAAAYAAJ 1902 Vol 24

https://books.google.com/books?id=Tw8pAAAAYAAJ 1904 Vol 25

https://books.google.com/books?id=vw8pAAAAYAAJ 1905 Vol 27

https://books.google.com/books?id=iBApAAAAYAAJ 1907 vol 29

https://books.google.com/books?id=bQ0pAAAAYAAJ 1908 Vol 30

https://books.google.com/books?id=Bv1SAAAAYAAJ 1908 Vol 30

https://books.google.com/books?id=LNA9AQAAMAAJ 1911 Vol 33 Snippet View

https://books.google.com/books?id=ctA9AQAAMAAJ 1912 Vol 34 Snippet View

https://books.google.com/books?id=3dA9AQAAMAAJ 1913 Vol 35 Snippet View

https://books.google.com/books?id=a9E9AQAAMAAJ 1916 Vol 38 Snippet View

https://books.google.com/books?id=N9E9AQAAMAAJ 1916 Vol 37 Snippet View

]]>

The formula

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots$$

(reminded via this post), a special case at $x = 1$ of

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots,$$

was found by Leibniz in 1673, while he was trying to find the area (“quadrature”) of a circle, and he had as prior work the ideas of Pascal on infinitesimal triangles, and that of Mercator on the area of the hyperbola with its infinite series for $\log(1+x)$. This was Leibniz’s first big mathematical work, before his more general ideas on calculus.

Leibniz did not know that this series had already been discovered earlier in 1671 by the short-lived mathematician James Gregory in Scotland. Gregory too had encountered Mercator’s infinite series $\log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots$, and was working on different goals: he was trying to invert logarithmic and trigonometric functions.

Neither of them knew that the series had already been found two centuries earlier by Mādhava (1340–1425) in India (as known through the quotations of Nīlakaṇṭha c.1500), working in a completely different mathematical culture whose goals and practices were very different. The logarithm function doesn’t seem to have been known, let alone an infinite series for it, though a calculus of finite differences for interpolation for trigonometric functions seems to have been ahead of Europe by centuries (starting all the way back with Āryabhaṭa in c. 500 and more clearly stated by Bhāskara II in 1150). Using a different approach (based on the arc of a circle) and geometric series and sums-of-powers, Mādhava (or the mathematicians of the Kerala tradition) arrived at the same formula.

[The above is based on *The Discovery of the Series Formula for π by Leibniz, Gregory and Nilakantha* by Ranjan Roy (1991).]

This startling universality of mathematics across different cultures is what David Mumford remarks on, in *Why I am a Platonist*:

As Littlewood said to Hardy, the Greek mathematicians spoke a language modern mathematicians can understand, they were not clever schoolboys but were “fellows of a different college”. They were working and thinking the same way as Hardy and Littlewood. There is nothing whatsoever that needs to be adjusted to compensate for their living in a different time and place, in a different culture, with a different language and education from us. We are all understanding the same abstract mathematical set of ideas and seeing the same relationships.

The same thought was also expressed by *Mean Girls*:

]]>

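**1. Alphabet**

Suppose we have an alphabet $\mathcal{A}$ of size $m$. Its generating function (using the variable $z$ to mark length) is simply $A(z) = mz$, as $\mathcal{A}$ contains $m$ elements of length $1$ each.

**2. Words**

Let $\mathcal{W}$ denote the class of all words over the alphabet $\mathcal{A}$. There are many ways to find the generating function $W(z)$ for $\mathcal{W}$.

**2.1.**

We have

$$\mathcal{W} = \{\epsilon\} + \mathcal{A} + \mathcal{A}\mathcal{A} + \mathcal{A}\mathcal{A}\mathcal{A} + \dots$$

so its generating function is

$$W(z) = 1 + A(z) + A(z)^2 + A(z)^3 + \dots = \frac{1}{1 - A(z)} = \frac{1}{1 - mz}.$$

**2.2.**

To put it differently, in the symbolic framework, we have $\mathcal{W} = \mathrm{SEQ}(\mathcal{A})$, so the generating function for $\mathcal{W}$ is

$$W(z) = \frac{1}{1 - A(z)} = \frac{1}{1 - mz}.$$

**2.3.**

We could have arrived at this with direct counting: the number of words of length $n$ is $m^n$ as there are $m$ choices for each of the $n$ letters, so the generating function is

$$W(z) = \sum_{n \ge 0} m^n z^n = \frac{1}{1 - mz}.$$

**3. Smirnov words**

Next, let $\mathcal{S}$ denote the class of Smirnov words over the alphabet $\mathcal{A}$, defined as words in which no two consecutive letters are identical. (That is, words $w_1 w_2 \dots w_n$ in which $w_i \in \mathcal{A}$ for all $i$, and $w_i \ne w_{i+1}$ for any $1 \le i < n$.) Again, we can find the generating function $S(z)$ for $\mathcal{S}$ in different ways.

**3.1.**

For any word in $\mathcal{W}$, by “collapsing” all runs of each letter, we get a Smirnov word. To put it differently, any word in $\mathcal{W}$ can be obtained from a Smirnov word by “expanding” each letter into a nonempty sequence of that letter. This observation (see *Analytic Combinatorics*, pp. 204–205) lets us relate the generating functions of $\mathcal{W}$ and $\mathcal{S}$ as

$$W(z) = S\left(\frac{z}{1 - z}\right)$$

which implicitly gives the generating function $S(z)$: we have

$$S(z) = W\left(\frac{z}{1 + z}\right) = \frac{1}{1 - m\frac{z}{1+z}} = \frac{1 + z}{1 - (m-1)z}.$$

**3.2.**

Alternatively, consider in an arbitrary word the first occurrence of a pair of repeated letters. Either this doesn’t happen at all (the word is a Smirnov word), or else, if it happens at position $i$ so that $w_i = w_{i+1}$, then the part of the word up to position $i$ is a nonempty Smirnov word, the letter at position $i+1$ is the same as the previous letter, and everything after $i+1$ is an arbitrary word. Writing $\mathcal{Z}$ for that single forced letter, this gives

$$\mathcal{W} = \mathcal{S} + (\mathcal{S} \setminus \{\epsilon\}) \times \mathcal{Z} \times \mathcal{W}$$

or in terms of generating functions

$$W(z) = S(z) + (S(z) - 1)\, z\, W(z)$$

giving

$$S(z) = \frac{W(z)(1 + z)}{1 + zW(z)} = \frac{1 + z}{1 - (m-1)z}.$$

**3.3.**

A minor variant is to again pick an arbitrary word and consider its first pair of repeated letters, happening (if it does) at positions $i$ and $i+1$, but this time consider the prefix up to $i-1$: either it is empty, or the pair of letters is different from the last letter of the prefix, giving us the decomposition

$$\mathcal{W} = \mathcal{S} + m\mathcal{Z}^2 \times \mathcal{W} + (\mathcal{S} \setminus \{\epsilon\}) \times (m-1)\mathcal{Z}^2 \times \mathcal{W}$$

and corresponding generating function

$$W(z) = S(z) + m z^2 W(z) + (S(z) - 1)(m-1) z^2 W(z)$$

so

$$S(z) = \frac{W(z)(1 - z^2)}{1 + (m-1) z^2 W(z)} = \frac{(1 - z)(1 + z)}{1 - mz + (m-1)z^2} = \frac{(1 - z)(1 + z)}{(1 - z)(1 - (m-1)z)}$$

which is the same as before after we cancel the $(1 - z)$ factors.

**3.4.**

We could have arrived at this result with direct counting. For $n \ge 1$, for a Smirnov word of length $n$, we have $m$ choices for the first letter, and for each of the other letters, as they must not be the same as the previous letter, we have $m - 1$ choices. This gives the number of Smirnov words of length $n$ as $m(m-1)^{n-1}$ for $n \ge 1$, and so the generating function for Smirnov words is

$$S(z) = 1 + \sum_{n \ge 1} m(m-1)^{n-1} z^n = 1 + \frac{mz}{1 - (m-1)z}$$

again giving

$$S(z) = \frac{1 + z}{1 - (m-1)z}.$$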
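The direct count of Smirnov words can be sanity-checked by brute force for small alphabets. This is only a sketch in Python 3: the function names are made up for illustration, and the closed form $m(m-1)^{n-1}$ for $n \ge 1$ (with $m$ the alphabet size and $n$ the word length) is the count assumed by the comparison.

```python
from itertools import product


def count_smirnov(m, n):
    """Count words of length n over an m-letter alphabet with no two
    equal adjacent letters, by enumerating all m**n words."""
    return sum(
        all(w[i] != w[i + 1] for i in range(n - 1))
        for w in product(range(m), repeat=n)
    )


def smirnov_formula(m, n):
    """Assumed closed form: 1 for the empty word, else m * (m-1)**(n-1)."""
    return 1 if n == 0 else m * (m - 1) ** (n - 1)


# Compare the two on a few small cases.
for m in (2, 3, 4):
    for n in range(7):
        assert count_smirnov(m, n) == smirnov_formula(m, n)
```

The enumeration is exponential in `n`, so it is only useful as a check on small cases, not as a way of computing the counts.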

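**4. Words with bounded runs**

We can now generalize. Let $\mathcal{S}_k$ denote the class of words in which no letter occurs more than $k$ times consecutively. ($\mathcal{S}_1 = \mathcal{S}$.) We can find the generating function $S_k(z)$ for $\mathcal{S}_k$.

**4.1.**

To get a word in $\mathcal{S}_k$ we can take a Smirnov word and replace each letter with a nonempty sequence of up to $k$ occurrences of that letter. This gives:

$$S_k(z) = S(z + z^2 + \dots + z^k) = S\left(\frac{z(1 - z^k)}{1 - z}\right)$$

so

$$S_k(z) = \frac{1 - z^{k+1}}{1 - mz + (m-1)z^{k+1}}.$$

**4.2.**

Pick any arbitrary word, and consider its first occurrence of a run of $k+1$ letters. Either such a run does not exist (which means the word we picked is in $\mathcal{S}_k$), or it occurs right at the beginning ($m$ possibilities, one for each letter in the alphabet), or, if it occurs starting at position $i > 1$, then the part of the word up to position $i-1$ (the “prefix”) is a nonempty word in $\mathcal{S}_k$, positions $i$ to $i+k$ are $k+1$ occurrences of any of the $m-1$ letters other than the last letter of the prefix, and what follows is an arbitrary word. Writing $\mathcal{Z}$ for a single letter position, this gives

$$\mathcal{W} = \mathcal{S}_k + m\mathcal{Z}^{k+1} \times \mathcal{W} + (\mathcal{S}_k \setminus \{\epsilon\}) \times (m-1)\mathcal{Z}^{k+1} \times \mathcal{W}$$

or in terms of generating functions

$$W(z) = S_k(z) + m z^{k+1} W(z) + (S_k(z) - 1)(m-1) z^{k+1} W(z)$$

so

$$S_k(z) = \frac{W(z)(1 - z^{k+1})}{1 + (m-1) z^{k+1} W(z)}$$

giving

$$S_k(z) = \frac{1 - z^{k+1}}{1 - mz + (m-1)z^{k+1}}.$$

**4.3.**

Arriving at this via direct counting seems hard.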
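Even without a direct counting argument, the generating function can be checked numerically for small cases. The sketch below (Python 3, with made-up function names) assumes the closed form $(1 - z^{k+1}) / (1 - mz + (m-1)z^{k+1})$, where $m$ is the alphabet size and $k$ the longest run allowed. It expands that rational function into a power series and compares the coefficients against a brute-force enumeration.

```python
from itertools import groupby, product


def count_bounded_runs(m, k, n):
    """Count words of length n over an m-letter alphabet in which no
    letter occurs more than k times consecutively (brute force)."""
    return sum(
        all(len(list(run)) <= k for _, run in groupby(w))
        for w in product(range(m), repeat=n)
    )


def series_coefficients(m, k, n_max):
    """Coefficients of (1 - z^(k+1)) / (1 - m z + (m-1) z^(k+1)) up to
    z^n_max, from the recurrence c_n = N_n + m c_(n-1) - (m-1) c_(n-k-1),
    where N_n are the coefficients of the numerator."""
    coeffs = []
    for n in range(n_max + 1):
        c = (1 if n == 0 else 0) - (1 if n == k + 1 else 0)
        if n >= 1:
            c += m * coeffs[n - 1]
        if n >= k + 1:
            c -= (m - 1) * coeffs[n - k - 1]
        coeffs.append(c)
    return coeffs


# The two computations should agree term by term.
for m, k in [(2, 1), (2, 2), (3, 2), (3, 3)]:
    expected = series_coefficients(m, k, 7)
    assert [count_bounded_runs(m, k, n) for n in range(8)] == expected
```

The recurrence comes from multiplying the series by the denominator and matching coefficients, so it also gives a cheap way to compute the counts for lengths where enumeration is out of reach.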

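**5. Words that stop at a long run**

Now consider words in which we “stop” as soon as we see $k$ consecutive identical letters. Let the class of such words be denoted $\mathcal{U}$ (not writing $\mathcal{U}_k$ to keep the notation simple). As before, we can find its generating function $U(z)$ in multiple ways.

**5.1.**

We get any word in $\mathcal{U}$ by either immediately seeing a run of length $k$ and stopping, or by starting with a nonempty prefix in $\mathcal{S}_{k-1}$, and then stopping with a run of $k$ identical letters different from the last letter of the prefix. Thus, writing $\mathcal{Z}$ for a single letter position, we have

$$\mathcal{U} = m\mathcal{Z}^k + (\mathcal{S}_{k-1} \setminus \{\epsilon\}) \times (m-1)\mathcal{Z}^k$$

and

$$U(z) = m z^k + (S_{k-1}(z) - 1)(m-1) z^k$$

which gives

$$U(z) = z^k \left(1 + (m-1) S_{k-1}(z)\right) = \frac{m z^k (1 - z)}{1 - mz + (m-1)z^k}.$$

**5.2.**

Alternatively, we can decompose any word by looking for its first run of $k$ identical letters. Either it doesn’t occur at all (the word we picked is in $\mathcal{S}_{k-1}$), or the part of the word until the end of the run belongs to $\mathcal{U}$ and the rest is an arbitrary word, so

$$\mathcal{W} = \mathcal{S}_{k-1} + \mathcal{U} \times \mathcal{W}$$

and

$$W(z) = S_{k-1}(z) + U(z) W(z)$$

so

$$U(z) = 1 - \frac{S_{k-1}(z)}{W(z)} = \frac{m z^k (1 - z)}{1 - mz + (m-1)z^k}.$$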

**6. Probability**

Finally we arrive at the motivation: suppose we keep appending a random letter from the alphabet, until we encounter the same letter $k$ times consecutively. What can we say about the length of the word thus generated? As all sequences of letters are equally likely, the probability of seeing any string of length $n$ is $1/m^n$. So in the above generating function $U(z) = \sum_n U_n z^n$, the probability of our word having length $n$ is $U_n / m^n$, and the probability generating function is therefore $P(z) = \sum_n \frac{U_n}{m^n} z^n$. This can be got by replacing $z$ with $z/m$ in the expression for $U(z)$: we have

$$P(z) = U\left(\frac{z}{m}\right) = \frac{z^k (m - z)}{m^k (1 - z) + (m-1) z^k}.$$

In principle, this probability generating function tells us everything about the distribution of the length of the word. For example, its expected length is

$$P'(1) = \frac{m^k - 1}{m - 1}.$$

(See this question on Quora for other powerful ways of finding this expected value directly.)
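A quick simulation makes the expected length concrete. This is a rough Python 3 sketch with made-up function names; it assumes an alphabet of $m \ge 2$ letters, stopping at the first run of $k$ identical letters, and the closed form $(m^k - 1)/(m - 1)$ for the mean.

```python
import random


def stopping_length(m, k, rng):
    """Append uniformly random letters from an m-letter alphabet until
    the same letter has appeared k times in a row; return the length."""
    length = 0
    run = 0
    last = None
    while run < k:
        letter = rng.randrange(m)
        run = run + 1 if letter == last else 1
        last = letter
        length += 1
    return length


def expected_length(m, k):
    """Assumed closed form for the mean stopping length (needs m >= 2)."""
    return (m ** k - 1) / (m - 1)


rng = random.Random(0)  # fixed seed so that runs are repeatable
trials = 20000
mean = sum(stopping_length(2, 3, rng) for _ in range(trials)) / trials
print(mean, expected_length(2, 3))
```

For a fair coin and $k = 3$ the closed form gives 7, and the simulated mean should land close to that, though never exactly on it.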

We can also find its variance, as

$$P''(1) + P'(1) - P'(1)^2 = \frac{m^{2k} - (2k-1)(m-1)m^k - m}{(m-1)^2}.$$

This variance is really too large to be useful, so what we would really like is the shape of the distribution… to be continued.

]]>

So, a data-URI looks something like the following:

data:image/png;base64,[and a stream of base64 characters here]

The part after the comma is literally the contents of the file (image or whatever), encoded in base64, so all you need to do is run `base64 --decode` on that part.

For example, with the whole data URL copied to the clipboard, I can do:

pbpaste | sed -e 's#data:image/png;base64,##' | base64 --decode > out.png

to get it into a png file.
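The same thing can be done without the shell. Here is a minimal Python 3 sketch using only the standard `base64` module; the function name `decode_data_uri` is made up for illustration, and it assumes the URI is base64-encoded (it does not handle percent-encoded data URIs).

```python
import base64


def decode_data_uri(uri):
    """Split a base64 data URI into (media_type, raw_bytes)."""
    header, _, payload = uri.partition(',')
    if not header.startswith('data:') or not header.endswith(';base64'):
        raise ValueError('not a base64-encoded data URI')
    media_type = header[len('data:'):-len(';base64')]
    return media_type, base64.b64decode(payload)


# Round trip on a tiny example payload.
example = 'data:text/plain;base64,' + base64.b64encode(b'hello').decode('ascii')
media_type, data = decode_data_uri(example)
print(media_type, data)
```

For an image, the returned bytes can be written straight to a file opened in binary mode.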

]]>

I just wanted to show what the sky looks like over the course of a week.

On a Mac with Stellarium installed, I ran the following

/Applications/Stellarium.app/Contents/MacOS/stellarium --startup-script stellarium.ssc

with the following `stellarium.ssc`:

    // -*- mode: javascript -*-
    core.clear('natural'); // "atmosphere, landscape, no lines, labels or markers"
    core.wait(5);
    core.setObserverLocation('Ujjain, India');
    core.setDate('1986-08-15T05:30:00', 'utc');
    core.wait(5);
    for (var i = 0; i < 2 * 24 * 7; i += 1) {
        core.setDate('+30 minutes');
        core.wait(0.5);
        core.screenshot('uj');
        core.wait(0.5);
    }
    core.wait(10);
    core.quitStellarium();

It took a while (some 10–15 minutes) and created those 336 images in `~/Pictures/Stellarium/uj*`, occupying a total size of about 550 MB. This seems a start, but ImageMagick etc. seem to choke on creating a GIF from such large data.

Giving up for now; would like to come back in future and figure out something better, that results in a smaller GIF.

]]>

Just dumping some links here for now:

Feynman on “cargo cult science”: http://www.lhup.edu/~DSIMANEK/cargocul.htm

Feynman on “what is science”: http://www.fotuva.org/feynman/what_is_science.html

Ioannidis, Why Most Published Research Findings Are False: http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124

On how subtle it is: http://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control/

Reinhart, http://www.statisticsdonewrong.com

http://www.nytimes.com/2012/04/17/science/rise-in-scientific-journal-retractions-prompts-calls-for-reform.html?pagewanted=all

https://fivethirtyeight.com/features/science-isnt-broken/

Another well argued summary: http://www.firstthings.com/article/2016/05/scientific-regress

]]>

"""Module to demonstrate an error."""

import random

def zero_or_one():
    """Returns either 0 or 1 with equal probability."""
    return random.choice([0, 1])

def main():
    """Function to demonstrate an error."""
    if zero_or_one():
        first = 42
    for _ in range(zero_or_one()):
        second = 42
    print first, second

if __name__ == '__main__':
    main()

Note that line 15 uses the variables `first` and `second`, which are defined only if `zero_or_one()` returned 1 both times.

(Condensed from a real bug where, because of additional indentation, a variable assignment happened as the last line inside a loop, instead of the first line after it.)

I know of three tools that are popular: pychecker, pyflakes, and pylint. None of them says a single thing about this program. It is questionable whether code like the above is ever (and if so, how often) what the programmer intended.

The first of these, pychecker, is not on pip, and requires you to download a file from Sourceforge and run “python setup.py install”. Anyway, this is the output from the three programs:


```
/tmp% pychecker test.py
Processing module test (test.py)...
Warnings...
None
/tmp% pyflakes test.py
/tmp% pylint test.py
No config file found, using default configuration
Report
======
12 statements analysed.
Statistics by type
------------------
+---------+-------+-----------+-----------+------------+---------+
|type |number |old number |difference |%documented |%badname |
+=========+=======+===========+===========+============+=========+
|module |1 |1 |= |100.00 |0.00 |
+---------+-------+-----------+-----------+------------+---------+
|class |0 |0 |= |0 |0 |
+---------+-------+-----------+-----------+------------+---------+
|method |0 |0 |= |0 |0 |
+---------+-------+-----------+-----------+------------+---------+
|function |2 |2 |= |100.00 |0.00 |
+---------+-------+-----------+-----------+------------+---------+
Raw metrics
-----------
+----------+-------+------+---------+-----------+
|type |number |% |previous |difference |
+==========+=======+======+=========+===========+
|code |11 |73.33 |11 |= |
+----------+-------+------+---------+-----------+
|docstring |3 |20.00 |3 |= |
+----------+-------+------+---------+-----------+
|comment |0 |0.00 |0 |= |
+----------+-------+------+---------+-----------+
|empty |1 |6.67 |1 |= |
+----------+-------+------+---------+-----------+
Duplication
-----------
+-------------------------+------+---------+-----------+
| |now |previous |difference |
+=========================+======+=========+===========+
|nb duplicated lines |0 |0 |= |
+-------------------------+------+---------+-----------+
|percent duplicated lines |0.000 |0.000 |= |
+-------------------------+------+---------+-----------+
Messages by category
--------------------
+-----------+-------+---------+-----------+
|type |number |previous |difference |
+===========+=======+=========+===========+
|convention |0 |0 |= |
+-----------+-------+---------+-----------+
|refactor |0 |0 |= |
+-----------+-------+---------+-----------+
|warning |0 |0 |= |
+-----------+-------+---------+-----------+
|error |0 |0 |= |
+-----------+-------+---------+-----------+
Global evaluation
-----------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
/tmp%
```


Tagged: python ]]>

ये नाम केचिदिह नः प्रथयन्त्यवज्ञां

जानन्ति ते किमपि तान्प्रति नैष यत्नः ।

उत्पत्स्यते तु मम कोऽपि समानधर्मा

कालो ह्ययं निरवधिर्विपुला च पृथ्वी ॥

ye nāma kecit iha naḥ prathayanti avajñām

jānanti te kim api tān prati na eṣa yatnaḥ |

utpatsyate tu mama ko api samāna-dharmā

kālo hi ayaṃ niravadhiḥ vipulā ca pṛthvī ||

Those who deride or ignore my work —

let them know: my efforts are not for them.

There will come along someone who shares my spirit:

the world is vast, and time endless.

This verse has become a favourite of many. It appears already in the first known anthology of Sanskrit verses (*subhāṣita*-collection), Vidyākara’s *Subhāṣita-ratna-koṣa*. (It’s numbered 1731 (= 50.34) in the edition by Kosambi and Gokhale, and translated by Ingalls.) Ingalls writes and translates (1965):

Of special interest are the verses of Dharmakīrti and Bhavabhūti, two of India’s most original writers, which speak of the scorn and lack of understanding which the writings of those authors found among contemporaries. To such disappointment Dharmakīrti replies with bitterness (1726, 1729), Bhavabhūti with the unreasoning hope of a romantic (1731). If the souls of men could enjoy their posthumous fame the one would now see his works admired even far beyond India, the other would see his romantic hope fulfilled.

Those who scorn me in this world

have doubtless special wisdom,

so my writings are not made for them;

but are rather with the thought that some day will be born,

since time is endless and the world is wide,

one whose nature is the same as mine.

A translation of this verse is also included in A. N. D. Haksar’s *A Treasury of Sanskrit Poetry in English Translation* (2002):

The Proud Poet

Are there any around who mock my verses?

They ought to know I don’t write for them.

Someone somewhere sometime will understand.

Time has no end. The world is big.

— translated by V. N. Misra, L. Nathan and S. Vatsyayan [The Indian Poetic Tradition, 1993]

Andrew Schelling has written of it in *Manoa*, Volume 25, Issue 2, 2013:

Critics scoff

at my work

and declare their contempt—

no doubt they’ve got

their own little wisdom.

I write nothing for them.

But because time is

endless and our planet

vast, I write these

poems for a person

who will one day be born

with my sort of heart.

“Criticism is for poets as ornithology is for the birds,” wrote John Cage. Bhavabhūti has scant doubt that future generations will honor his work. The reader who will arise, utpalsyate [sic], is somebody of the same faith, heart, or discipline, samānadharmā.

Just now also found it on the internet, here (2014) (misattributed to Bhartṛhari):

There are those who

treat my work with

studied indifference.

Maybe they know something,

but I’m not writing for them.

Someone will come around

who feels the way I do.

Time, after all, is limitless,

and fortune spreads far.

Finally, in Sadāsvada, written by my friend Mohan with some comments from me, this was included in one of our very first posts (2012):

In his play Mālatīmādhava, he makes a point that deserves to be the leading light of anyone wishing to do something of value and ~~is~~ put off by discouragement. Standing beside the words attributed to Gandhi (“First they ignore you, then they laugh at you, then they fight you, then you win.”) and Teddy Roosevelt (“It is not the critic who counts…”), Bhavabhūti’s confidence in the future stands resplendent:

“They who disparage my work should know that it’s not for them that I did it. One day, there will arise someone who will truly know me: this world is vast, and time infinite.”

It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again, because there is no effort without error and shortcoming; but who does actually strive to do the deeds; who knows great enthusiasms, the great devotions; who spends himself in a worthy cause; who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly, so that his place shall never be with those cold and timid souls who neither know victory nor defeat.

[Note on the text: in Vidyākara’s compilation the verse ends with “विपुला च लक्ष्मीः” (vipulā ca lakṣmīḥ) instead of “विपुला च पृथ्वी” (vipulā ca pṛthvī), but the actual source work Mālatī-mādhava has the latter, as do all quotations of this verse elsewhere (e.g. the काव्यप्रकाशः of Mammaṭa, the Sahityadarpana of Viśvanātha Kavirāja, the रसार्णवसुधाकरः of श्रीसिंहभूपाल), and that is what Ingalls uses: “For *lakṣmīḥ*, which utterly destroys the line, read *pṛthvi* with the printed texts of *Māl*.” Actually, most quotations have “utpatsyate ‘sti” in place of “utpatsyate tu”: “either will be born, or already exists…”.]

Tagged: quotes, sanskrit, sanskrit literature, sanskrit translation, writing ]]>

But what about *printing* them?

Just because the number stored internally is not 0.1 but the closest approximation to it (say as 0.100000001490116119384765625) doesn’t mean it should be printed as a long string, when “0.1” means exactly the same number.

This has been a solved problem since 1990.
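As a small illustration (a sketch in Python 3, not part of any of the papers linked below): the long decimal string quoted above is the exact value of the single-precision float nearest to 0.1, and Python's `repr` already prints the shortest string that reads back as the same double. The helper name `as_float32` is made up for this example.

```python
import struct
from decimal import Decimal


def as_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


if __name__ == "__main__":
    stored = as_float32(0.1)
    # Decimal(float) converts exactly, exposing every digit that is stored.
    print(Decimal(stored))
    # repr() prints the shortest decimal string that round-trips.
    print(repr(0.1))
    print(float(repr(0.1)) == 0.1)
```

The first line printed is the exact stored value; the second shows that the short form "0.1" is enough to identify the double uniquely.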

TODO: Write rest of this post.

Bryan O’Sullivan (of Real World Haskell fame):

http://www.serpentine.com/blog/2011/06/29/here-be-dragons-advances-in-problems-you-didnt-even-know-you-had/

Steele & White paper *How to Print Floating-Point Numbers Accurately*: https://lists.nongnu.org/archive/html/gcl-devel/2012-10/pdfkieTlklRzN.pdf

Their retrospective: http://grouper.ieee.org/groups/754/email/pdfq3pavhBfih.pdf

Burger & Dybvig:

http://www.cs.indiana.edu/~dyb/pubs/FP-Printing-PLDI96.pdf

Florian Loitsch:

http://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf

Russ Cox: http://research.swtch.com/ftoa

http://www.ryanjuckett.com/programming/printing-floating-point-numbers/

https://labix.org/python-nicefloat

http://people.csail.mit.edu/jaffer/III/EZFPRW

Edit (Thanks Soma!): *Printing Floating-Point Numbers: A Faster, Always Correct Method* from POPL’16. Revised to Printing Floating-Point Numbers: An Always Correct Method (see github).

]]>

One of Kālidāsa’s famous similes is in the following verse from the Raghuvaṃśa, in the context of describing the *svayaṃvara* of Indumatī. The various hopeful suitors of the princess, all kings from different regions, are lined up as she passes them one by one, her friend doing the introductions.

संचारिणी दीपशिखेव रात्रौ यम् यम् व्यतीयाय पतिंवरा सा । नरेन्द्रमार्गाट्ट इव प्रपेदे विवर्णभावम् स स भूमिपालः ॥ ६-६७

saṁcāriṇī dīpa-śikheva rātrau

yam yam vyatīyāya patiṁvarā sā |

narendra-mārga-aṭṭa iva prapede

vivarṇa-bhāvam sa sa bhūmipālaḥ || 6-67

Only today did I discover a decent translation into English. It’s by John Brough (1975/6):

As if a walking lamp-flame in the night

On the king’s highway, flanked with houses tall,

She moved, and lit each prince with hopeful light,

And, passing on, let each to darkness fall.

Every other translation I have seen really falls short. Witness the misunderstandings, and the killing of all feeling.

Here is Ryder (1904), who is usually good:

And every prince rejected while she sought

A husband, darkly frowned, as turrets, bright

One moment with the flame from torches caught,

Frown gloomily again and sink in night.

The idea is there, but requires too much effort to understand.

This is P. de Lacy Johnstone (1902):

Now as the Maid went by, each suitor-King,

Lit for a moment by her dazzling eyes,

Like wayside tower by passing lamp, sank back

In deepest gloom. …

Every king, whom Indumati passed by while choosing her husband, assumed a pale look as the houses on a high way are covered with darkness in the absence of lamps.

Whatsoever king the maiden intent on choosing her husband passed by, like the flame of a moving lamp at night, that same king turned pale, just as a mansion situate on the highway, is shrouded in darkness when left behind (by a moving light).

67. pati.m varA sA= husband, selector, she – she who has come to select her husband, indumati; rAtrau sa.mcAriNI dIpa shikha iva= in night, moving, lamp’s, [glittering] flame, as with; ya.m ya.m= whom, whom; [bhUmi pAlam= king, whomever]; vyatIyAya= passed by; saH saH bhUmipAlaH= he, he, king – such and such a king; narendra mArga= on king’s, way; aTTa= a turret, or a balustrade; iva= like; vi+varNa bhAva.m= without, colour, aspect – they bore a colourless aspect; prapede= [that king] obtained – that king became colourless, he drew blank.

Princess indumati who came to choose her husband then moved like the glittering flame of a lamp on a king’s way, and whichever prince she left behind was suffused with pallor just like a turret or balustrade on the king’s way will be shrouded in darkness and becomes dim when left behind by the moving light on the king’s way. [6-67]

And this is representative of the average quality of Sanskrit-to-English translations, and how much beauty is lost.

Tagged: sanskrit translation ]]>

(solutions to this equation are called Pythagorean triples).

Centuries later, in 1637, Fermat made a conjecture (now called Fermat’s Last Theorem, not because he uttered it in his dying breath, but because it was the last one to be proved — in ~1995) that

$$a^n + b^n = c^n$$

has no positive integer solutions for $n > 2$. In other words, his conjecture was that none of the following equations has a solution:

$$a^3 + b^3 = c^3$$

$$a^4 + b^4 = c^4$$

$$a^5 + b^5 = c^5$$

… and so on. An nth power cannot be partitioned into two nth powers.

About a century later, Euler proved the $n = 3$ case of Fermat’s conjecture, but generalized it in a different direction: he conjectured in 1769 that an nth power cannot be partitioned into fewer than n nth powers, namely

$$a_1^n + a_2^n + \dots + a_k^n = b^n$$

has no solutions with $1 < k < n$. So his conjecture was that (among others) none of the following equations has a solution:

$$a^3 + b^3 = c^3$$

$$a^4 + b^4 + c^4 = d^4$$

$$a^5 + b^5 + c^5 + d^5 = e^5$$

… and so on.

This conjecture stood for about two centuries, until abruptly it was found to be false, by Lander and Parkin who in 1966 simply did a direct search on the fastest (super)computer at the time, and found this counterexample:

$$27^5 + 84^5 + 110^5 + 133^5 = 144^5$$

(It is still one of only three examples known, according to Wikipedia.)
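The Lander–Parkin counterexample, 27^5 + 84^5 + 110^5 + 133^5 = 144^5, is easy to check directly, since Python integers have arbitrary precision. A minimal sketch (the function name `is_counterexample` is made up here):

```python
def is_counterexample(terms, total, power=5):
    """Check whether the given terms, each raised to `power`, sum to total**power."""
    return sum(t ** power for t in terms) == total ** power


if __name__ == "__main__":
    print(is_counterexample((27, 84, 110, 133), 144))
```

Checking a candidate is trivial; the hard part, which the rest of this post is about, is finding it.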

Now, how might you find this solution on a computer today?

In his wonderful (as always) post at bit-player, Brian Hayes showed the following code:

import itertools as it

def four_fifths(n):
    '''Return smallest positive integers ((a,b,c,d),e) such that
       a^5 + b^5 + c^5 + d^5 = e^5; if no such tuple exists
       with e < n, return the string 'Failed'.'''
    fifths = [x**5 for x in range(n)]
    combos = it.combinations_with_replacement(range(1,n), 4)
    while True:
        try:
            cc = combos.next()
            cc_sum = sum([fifths[i] for i in cc])
            if cc_sum in fifths:
                return(cc, fifths.index(cc_sum))
        except StopIteration:
            return('Failed')

to which, if you add (say) `print four_fifths(150)` and run it, it returns the correct answer fairly quickly: in about 47 seconds on my laptop.

The `if cc_sum in fifths:` line inside the loop is an $O(n)$ cost each time it’s run (membership testing on a list is a linear scan), so with a simple improvement to the code (using a set instead) and rewriting it a bit, we can write the following full program:

import itertools

def find_counterexample(n):
    fifth_powers = [x**5 for x in range(n)]
    fifth_powers_set = set(fifth_powers)
    for xs in itertools.combinations_with_replacement(range(1, n), 4):
        xs_sum = sum([fifth_powers[i] for i in xs])
        if xs_sum in fifth_powers_set:
            return (xs, fifth_powers.index(xs_sum))
    return 'Failed'

print find_counterexample(150)

which finishes in about 8.5 seconds.

Great!

But there’s something unsatisfying about this solution, which is that it assumes there’s a solution with all four numbers on the LHS less than 150. After all, changing the function invocation to `find_counterexample(145)` makes it run a second faster even, but how could we know to do that without already knowing the solution? Besides, we don’t have a fixed 8- or 10-second budget; what we’d *really* like is a program that keeps searching till it finds a solution or we abort it (or it runs out of memory or something), with no other fixed termination condition.

The above program used the given “n” as an upper bound to generate the combinations of 4 numbers; is there a way to generate all combinations when we don’t know an upper bound on them?

Yes! One of the things I learned from Knuth volume 4 is that if you simply write down each combination in descending order and order them lexicographically, the combinations you get for each upper bound are a prefix of the list of the next bigger one, i.e., for any upper bound, all the combinations form a prefix of the same infinite list, which starts as follows (line breaks for clarity):

1111,
2111, 2211, 2221, 2222,
3111, 3211, 3221, 3222, 3311, 3321, 3322, 3331, 3332, 3333,
4111, ...
...
9541, 9542, 9543, 9544, 9551, ... 9555, 9611, ...

There doesn’t seem to be a library function in Python to generate these though, so we can write our own. If we stare at the above list, we can figure out how to generate the next combination from a given one:

- Walk backwards from the end, till you reach the beginning or find an element that’s less than the previous one.
- Increase that element, set all the following elements to 1s, and continue.

We could write, say, the following code for it:

def all_combinations(r):
    xs = [1] * r
    while True:
        yield xs
        for i in range(r - 1, 0, -1):
            if xs[i] < xs[i - 1]:
                break
        else:
            i = 0
        xs[i] += 1
        xs[i + 1:] = [1] * (r - i - 1)

(The `else` block on a `for` loop is an interesting Python feature: it is executed if the loop wasn’t terminated with `break`.) We could even hard-code the `r=4` case, as we’ll see later below.
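The prefix property claimed above can be checked mechanically. The following is a sketch in Python 3 (unlike the Python 2 code in the rest of this post), with two deliberate changes that are my own: the generator yields a fresh tuple rather than the mutable list, and the helper names `bounded_combinations` and `unbounded_prefix` are made up for the check. There are C(n + r - 1, r) combinations with replacement of r numbers from 1..n, so that many items are taken from the unbounded generator.

```python
import itertools
from math import comb


def all_combinations(r):
    """Yield every non-increasing r-tuple of positive integers, in lexicographic order."""
    xs = [1] * r
    while True:
        yield tuple(xs)  # a copy, so callers can safely store it
        for i in range(r - 1, 0, -1):
            if xs[i] < xs[i - 1]:
                break
        else:
            i = 0
        xs[i] += 1
        xs[i + 1:] = [1] * (r - i - 1)


def bounded_combinations(n, r):
    """All combinations from 1..n, each written in descending order, sorted."""
    combos = itertools.combinations_with_replacement(range(1, n + 1), r)
    return sorted(tuple(sorted(c, reverse=True)) for c in combos)


def unbounded_prefix(n, r):
    """The first C(n + r - 1, r) items of the unbounded list."""
    return list(itertools.islice(all_combinations(r), comb(n + r - 1, r)))


if __name__ == "__main__":
    for n in range(1, 8):
        assert unbounded_prefix(n, 4) == bounded_combinations(n, 4)
    print(unbounded_prefix(2, 4))
```

For each upper bound tried, the bounded list is exactly a prefix of the unbounded one, which is what lets the search run without knowing a bound in advance.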

For testing whether a given number is a fifth power, we can no longer simply look it up in a fixed precomputed set. We can do a binary search instead:

def is_fifth_power(n):
    assert n > 0
    lo = 0
    hi = n
    # Invariant: lo^5 < n <= hi^5
    while hi - lo > 1:
        mid = lo + (hi - lo) // 2  # integer division
        if mid ** 5 < n:
            lo = mid
        else:
            hi = mid
    return hi ** 5 == n

but it turns out that this is slower than one based on looking up in a growing set (as below).

Putting everything together, we can write the following (very C-like) code:

largest_known_fifth_power = (0, 0)
known_fifth_powers = set()
def is_fifth_power(n):
    global largest_known_fifth_power
    while n > largest_known_fifth_power[0]:
        m = largest_known_fifth_power[1] + 1
        m5 = m ** 5
        largest_known_fifth_power = (m5, m)
        known_fifth_powers.add(m5)
    return n in known_fifth_powers

def fournums_with_replacement():
    (x0, x1, x2, x3) = (1, 1, 1, 1)
    while True:
        yield (x0, x1, x2, x3)
        if x3 < x2:
            x3 += 1
            continue
        x3 = 1
        if x2 < x1:
            x2 += 1
            continue
        x2 = 1
        if x1 < x0:
            x1 += 1
            continue
        x1 = 1
        x0 += 1
        continue

if __name__ == '__main__':
    tried = 0
    for get in fournums_with_replacement():
        tried += 1
        if (tried % 1000000 == 0):
            print tried, 'Trying:', get
        rhs = get[0]**5 + get[1]**5 + get[2]**5 + get[3]**5
        if is_fifth_power(rhs):
            print 'Found:', get, rhs
            break

which is both longer and slower (takes about 20 seconds) than the original program, but at least we have the satisfaction that it doesn’t depend on any externally known upper bound.

I originally started writing this post because I wanted to describe some experiments I did with profiling, but it’s late and I’m sleepy so I’ll just mention it.

python -m cProfile euler_conjecture.py

will print relevant output in the terminal:

26916504 function calls in 26.991 seconds

Ordered by: standard name

  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       1   18.555   18.555   26.991   26.991 euler_conjecture.py:1(<module>)
13458164    4.145    0.000    4.145    0.000 euler_conjecture.py:12(fournums_with_replacement)
13458163    4.292    0.000    4.292    0.000 euler_conjecture.py:3(is_fifth_power)
     175    0.000    0.000    0.000    0.000 {method 'add' of 'set' objects}
       1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Another way to view the same thing is to write the profile output to a file and read it with `cprofilev`:

python -m cProfile -o euler_profile.out euler_conjecture.py
cprofilev euler_profile.out

and visit http://localhost:4000 to view it.

Of course, simply translating this code to C++ makes it run much faster:

#include <array>
#include <iostream>
#include <map>
#include <utility>

typedef long long Int;

constexpr Int fifth_power(Int x) { return x * x * x * x * x; }

std::map<Int, int> known_fifth_powers = {{0, 0}};

bool is_fifth_power(Int n) {
  while (n > known_fifth_powers.rbegin()->first) {
    int m = known_fifth_powers.rbegin()->second + 1;
    known_fifth_powers[fifth_power(m)] = m;
  }
  return known_fifth_powers.count(n);
}

std::array<Int, 4> four_nums() {
  static std::array<Int, 4> x = {1, 1, 1, 0};
  int i = 3;
  while (i > 0 && x[i] == x[i - 1]) --i;
  x[i] += 1;
  while (++i < 4) x[i] = 1;
  return x;
}

std::ostream& operator<<(std::ostream& os, std::array<Int, 4> x) {
  os << "(" << x[0] << ", " << x[1] << ", " << x[2] << ", " << x[3] << ")";
  return os;
}

int main() {
  while (true) {
    std::array<Int, 4> get = four_nums();
    Int rhs = fifth_power(get[0]) + fifth_power(get[1]) +
              fifth_power(get[2]) + fifth_power(get[3]);
    if (is_fifth_power(rhs)) {
      std::cout << "Found: " << get << " " << known_fifth_powers[rhs] << std::endl;
      break;
    }
  }
}

and

```
clang++ -std=c++11 euler_conjecture.cc && time ./a.out
```

runs in 2.43s, or 0.36s if compiled with `-O2`.

But I don’t have a satisfactory answer to how to make our Python program which takes 20 seconds as fast as the 8.5-second known-upper-bound version.

Edit [2015-05-08]: I wrote some benchmarking code to compare all the different “combination” functions.

import itertools

# Copied from the Python documentation
def itertools_equivalent(iterable, r):
    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        return
    indices = [0] * r
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)

# Above function, specialized to first argument being range(1, n)
def itertools_equivalent_specialized(n, r):
    indices = [1] * r
    yield indices
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield indices

# Function to generate all combinations of 4 elements
def all_combinations_pythonic(r):
    xs = [1] * r
    while True:
        yield xs
        for i in range(r - 1, 0, -1):
            if xs[i] < xs[i - 1]:
                break
        else:
            i = 0
        xs[i] += 1
        xs[i + 1:] = [1] * (r - i - 1)

# Above function, written in a more explicit C-like way
def all_combinations_clike(r):
    xs = [1] * r
    while True:
        yield xs
        i = r - 1
        while i > 0 and xs[i] == xs[i - 1]:
            i -= 1
        xs[i] += 1
        while i < r - 1:
            i += 1
            xs[i] = 1

# Above two functions, specialized to r = 4, using tuple over list.
def fournums():
    (x0, x1, x2, x3) = (1, 1, 1, 1)
    while True:
        yield (x0, x1, x2, x3)
        if x3 < x2:
            x3 += 1
            continue
        x3 = 1
        if x2 < x1:
            x2 += 1
            continue
        x2 = 1
        if x1 < x0:
            x1 += 1
            continue
        x1 = 1
        x0 += 1
        continue

# Benchmarks for all functions defined above (and the library function)
def benchmark_itertools(n):
    for xs in itertools.combinations_with_replacement(range(1, n), 4):
        if xs[0] >= n:
            break

def benchmark_itertools_try(n):
    combinations = itertools.combinations_with_replacement(range(1, n), 4)
    while True:
        try:
            xs = combinations.next()
            if xs[0] >= n:
                break
        except StopIteration:
            return

def benchmark_itertools_equivalent(n):
    for xs in itertools_equivalent(range(1, n), 4):
        if xs[0] >= n:
            break

def benchmark_itertools_equivalent_specialized(n):
    for xs in itertools_equivalent_specialized(n, 4):
        if xs[0] >= n:
            break

def benchmark_all_combinations_pythonic(n):
    for xs in all_combinations_pythonic(4):
        if xs[0] >= n:
            break

def benchmark_all_combinations_clike(n):
    for xs in all_combinations_clike(4):
        if xs[0] >= n:
            break

def benchmark_fournums(n):
    for xs in fournums():
        if xs[0] >= n:
            break

if __name__ == '__main__':
    benchmark_itertools(150)
    benchmark_itertools_try(150)
    benchmark_itertools_equivalent(150)
    benchmark_itertools_equivalent_specialized(150)
    benchmark_all_combinations_pythonic(150)
    benchmark_all_combinations_clike(150)
    benchmark_fournums(150)

As you can see, inside each benchmarking function I used the same statement: one that causes the `all_combinations` variants to terminate, and has no effect for the other combination functions.

When run with

`python -m cProfile benchmark_combinations.py`

the results include:

  2.817 benchmark_combinations.py:80(benchmark_itertools)
  8.583 benchmark_combinations.py:84(benchmark_itertools_try)
126.980 benchmark_combinations.py:93(benchmark_itertools_equivalent)
 46.635 benchmark_combinations.py:97(benchmark_itertools_equivalent_specialized)
 44.032 benchmark_combinations.py:101(benchmark_all_combinations_pythonic)
 18.049 benchmark_combinations.py:105(benchmark_all_combinations_clike)
 10.923 benchmark_combinations.py:109(benchmark_fournums)

Lessons:

- Calling `itertools.combinations_with_replacement` is by far the fastest, taking about 2.7 seconds. It turns out that it’s written in C, so this would be hard to beat. (Still, writing it in a `try` block is seriously bad.)
- The “equivalent” Python code from the itertools documentation (`benchmark_itertools_equivalent`) is about 50x slower.
- Gets slightly better when specialized to numbers.
- Simply generating *all* combinations without an upper bound is actually faster.
- It can be made even faster by writing it in a more C-like way.
- The tuples version with the loop unrolled manually is rather fast when seen in this light, less than 4x slower than the library version.

Tagged: mathematics, programming ]]>

One of the lessons from functional programming is to encode as much information as possible into the types. Almost all programmers understand to some extent that types are helpful: they know not to store everything as `void*` (in C/C++) or as `Object` (in Java). They even know not to use say `double` for all numbers, or `string` for everything (as shell scripts / Tcl do). (This is pejoratively called “stringly typed”.)

A corollary, from slide 19 here, is that (when your type system supports richer types) a boolean type is almost always the wrong choice, as it carries too little information.

The name “Boolean blindness” for this seems to have been coined by Dan Licata when he taught a course at CMU as a PhD student.

From here (blog post by Robert Harper, his advisor at CMU):

There is no *information* carried by a Boolean beyond its value, and that’s the rub. As Conor McBride puts it, to make use of a Boolean you have to know its *provenance* so that you can know what it *means*.

[…]

Keeping track of this information (or attempting to recover it using any number of program analysis techniques) is notoriously difficult. The only thing you can do with a bit is to branch on it, and pretty soon you’re lost in a thicket of if-then-else’s, and you lose track of what’s what.

[…]The problem is computing the bit in the first place. Having done so, you have blinded yourself by reducing the information you have at hand to a bit, and then trying to recover that information later by remembering the provenance of that bit. An illustrative example was considered in my article on equality:

`fun plus x y = if x=Z then y else S(plus (pred x) y)`

Here we’ve crushed the information we have about x down to one bit, then branched on it, then were forced to recover the information we lost to justify the call to pred, which typically cannot recover the *fact* that its argument is non-zero and must check for it to be sure. What a mess! Instead, we should write

`fun plus x y = case x of Z => y | S(x') => S(plus x' y)`

No Boolean necessary, and the code is improved as well! In particular, we obtain the predecessor *en passant*, and have no need to keep track of the provenance of any bit.

Some commenter there says

To the best of my knowledge, Ted Codd was the first to point out, in his relational model, that there is no place for Boolean data types in entity modeling. It is a basic design principle to avoid characterizing data in terms of Boolean values, since there is usually some other piece of information you are forgetting to model, and once you slam a Boolean into your overall data model, it becomes very hard to version towards a more exact model (information loss).

An example from Hacker News (on an unrelated post):

Basically, the idea is that when you branch on a conditional, information is gained. This information may be represented in the type system and used by the compiler to verify safety, or it can be ignored. If it is ignored, the language is said to have “boolean blindness”.

Example:`if (ptr == NULL) { ... a ... } else { ... b ... }`

In branch a and branch b, different invariants about ptr hold. But the language/compiler are not verifying any of these invariants.

Instead, consider:

`data Maybe a = Nothing | Just a`

This defines a type “Maybe”, parameterized by a type variable “a”, and it defines two “data constructors” that can make a Maybe value: “Nothing” which contains nothing, and “Just” which contains a value of type “a” inside it.

This is known as a “sum” type, because the possible values of the type are the sum of all data constructor values.

We could still use this sum data-type in a boolean-blind way:

if isJust mx
  then .. use fromJust mx .. -- UNSAFE!
  else .. can't use content of mx ..

However, using pattern-matching, we can use it in a safe way. Assume “mx” is of type “Maybe a”:

case mx of
  Nothing -> ... There is no value of type "a" in our scope
  Just x -> ... "x" of type "a" is now available in our scope!

So when we branch on the two possible cases for the “mx” type, we gain new type information that gets put into our scope.

“Maybe” is of course a simple example, but this is also applicable for any sum type at all.
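The same contrast can be sketched in Python, where `Optional[int]` loosely plays the role of `Maybe`. This is only an illustration under my own assumptions: the function names (`looks_like_int`, `parse_int`, `double_blind`, `double_with_evidence`) are made up, and Python itself will not enforce the `None` check (a separate type checker might).

```python
from typing import Optional


def looks_like_int(text: str) -> bool:
    """Boolean-blind check: the caller learns only True or False."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def parse_int(text: str) -> Optional[int]:
    """Return the parsed value itself (the evidence), or None on failure."""
    try:
        return int(text)
    except ValueError:
        return None


def double_blind(text: str) -> Optional[int]:
    # The boolean says "it would parse", but the parsed value was thrown
    # away, so the work has to be redone (and could in principle disagree).
    if looks_like_int(text):
        return 2 * int(text)
    return None


def double_with_evidence(text: str) -> Optional[int]:
    # Branching on the result puts the value itself in scope.
    value = parse_int(text)
    if value is None:
        return None
    return 2 * value


if __name__ == "__main__":
    print(double_blind("21"), double_with_evidence("21"))
    print(double_blind("x"), double_with_evidence("x"))
```

Both versions compute the same thing; the difference is that in the second, the branch hands over the value along with the verdict.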

Another from notes of someone called HXA7241:

A nullable pointer has two ‘parts’: null, and all-other-values. These can be got at with a simple if-else conditional: `if p is not null then ... else ... end`. And there is still nothing wrong with that. The problem arrives when you want to handle the parts – and you lack a good type system. What do you do with the non-null pointer value? You just have to put it back into a pointer again – a *nullable* pointer: which is what you started with. So where has what you did with the test been captured? Nowhere. When you handle that intended part elsewhere you have to do the test again and again.

A Reddit discussion: Don Stewart says

Not strongly typed, *richly* typed. Where evidence isn’t thrown away. Agda is about the closest thing to this we have.

There’s also a good (and unintentionally revealing) example there by user `munificent`. Consider Harper’s example, which in more explicit Haskell could be:

```
data Num = Zero | Succ Num
plus :: Num -> Num -> Num
plus Zero y = y
plus (Succ x) y = Succ (plus x y)
```

We might write it in Java as the following:

```
```
interface Num {}
class Zero implements Num {}
class Succ implements Num {
  public Succ(Num pred) {
    this.pred = pred;
  }
  public final Num pred;
}
Num plus(Num x, Num y) {
  if (x instanceof Succ) { // (1)
    Succ xAsSucc = (Succ)x; // (2)
    return new Succ(plus(xAsSucc.pred, y));
  } else {
    return y;
  }
}
```


Here instanceof returns a boolean (comment (1)), but doesn’t carry with it any information about what that boolean represents (namely that x is an instance of Succ) so when we get to the next line (2) we’re forced to do an unsafe cast. As programmers, we know it’s safe from context, but the compiler doesn’t.

There *is* a way in Java of avoiding this situation (where the programmer has context the compiler doesn’t):

```
interface Num {
    Num plus(Num other);
}

class Zero implements Num {
    public Num plus(Num other) {
        return other;
    }
}

class Succ implements Num {
    public Succ(Num pred) {
        this.pred = pred;
    }
    public Num plus(Num other) {
        // Succ(x) + other = Succ(x + other)
        return new Succ(pred.plus(other));
    }
    public final Num pred;
}
```

But see what has happened (by user `aaronia`):

There’s a rub though — in your first code snippet, “plus” was written as a non-member function; anyone can write it as a stand alone library. It was modular. But, as you demonstrated, it had the redundant type check.

However, your second code snippet has lost this modularity. You had to add new methods to the Num data type.

I think this is the pitfall of Object-Oriented Programming (which is like “Pants-Oriented Clothing”): objects can talk about themselves, but it’s harder to talk about objects.

If you want to write a function that does similar things for a bunch of types, you have to write similar function definitions in all those different classes. These function definitions do not stay together, so there is no way of knowing or ensuring that they are similar.
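As an aside, newer Java narrows the gap that the `instanceof` example shows. Here is a minimal sketch, assuming Java 16 or later (pattern matching for `instanceof`); the class name `PeanoDemo` and the helper `toInt` are made up for illustration and are not from the Reddit thread:

```java
class PeanoDemo {
    interface Num {}

    static final class Zero implements Num {}

    static final class Succ implements Num {
        final Num pred;
        Succ(Num pred) { this.pred = pred; }
    }

    // plus stays a stand-alone function, as in the first snippet.
    static Num plus(Num x, Num y) {
        // The test and the binding happen together: inside the branch,
        // s is known to the compiler to be a Succ, so no cast is needed.
        if (x instanceof Succ s) {
            return new Succ(plus(s.pred, y));
        }
        return y;
    }

    // Hypothetical helper: count the Succ wrappers to get an int back.
    static int toInt(Num n) {
        int count = 0;
        while (n instanceof Succ s) {
            count++;
            n = s.pred;
        }
        return count;
    }

    public static void main(String[] args) {
        Num two = new Succ(new Succ(new Zero()));
        Num three = new Succ(two);
        System.out.println(toInt(plus(two, three))); // prints 5
    }
}
```

Here the evidence from the test at (1) is kept rather than thrown away, and `plus` did not have to move into the `Num` classes. The compiler still does not check in this sketch that every kind of `Num` has been handled.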

Tagged: programming

Here is a handout from this course (taught by, AFAICT, Jeffrey S. Leon) that helps me understand the Java people.

Tagged: programming

Most programming languages include a “remainder” or “modulo” function, and also an integer division (“quotient”) function. Given two integers $a$ and $b$, let’s call the results of these functions $r$ and $q$ respectively.

For positive $a$ and $b$, it is clear what $q$ and $r$ should be: $q$ is the largest integer such that $qb \le a$, and $r = a - qb$ is the remainder, which therefore satisfies $0 \le r < b$.

What should we do when, as frequently happens, $a$ is negative, or (as less frequently happens) $b$ is negative?

For negative $a$ and positive $b$, there are two choices when $a$ lies between two multiples of $b$ (i.e. $kb < a < (k+1)b$ for some integer $k$):

(1) Set $q$ to the *lesser* value, so that $0 \le r < b$ continues to hold, or

(2) Set $q$ to the *greater* (and therefore smaller in magnitude) value.

There are very good reasons why (1) is preferable to (2): it ensures that the function $a \bmod b$ is never negative no matter what the value of $a$, so that, for example, two integers $a$ and $a'$ are congruent modulo $b$ exactly when $a \bmod b = a' \bmod b$.

And indeed that is what the more recent programming languages do. There is a table on Wikipedia: C, C++, Java, Go(!), OCaml(!), PHP, all have the “bad” behaviour, while Maple, Mathematica, Microsoft Excel, Perl, Python, Ruby have the “good” behaviour. Some languages have separate functions for both behaviours (e.g. Haskell has `quotRem` and `divMod` functions, similarly Common Lisp, Fortran, MATLAB).

There’s also the question of what to do when $b$ is negative, which turns out not to matter much (as long as it’s consistent with the above). One definition is to continue to have $q$ be the lesser value, and the other is to continue to insist that $r \ge 0$. Both are fine, though sometimes the latter is nicer.
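To make the conventions concrete, here is a small Java sketch. The operators `/` and `%` truncate towards zero (the “bad” behaviour), while the standard library’s `Math.floorDiv` and `Math.floorMod` take the lesser quotient. The class name `ModDemo` and the helpers `euclidMod` and `euclidDiv` are my own illustrative names for the “remainder is never negative” convention; they are not library functions.

```java
class ModDemo {
    // Remainder that is never negative: always in the range [0, |b|).
    // (Ignores the overflow corner case b == Integer.MIN_VALUE.)
    static int euclidMod(int a, int b) {
        int r = a % b;                       // has the sign of a, or is 0
        return r < 0 ? r + Math.abs(b) : r;
    }

    // The matching quotient, so that a == q * b + r holds.
    static int euclidDiv(int a, int b) {
        return (a - euclidMod(a, b)) / b;
    }

    public static void main(String[] args) {
        int a = -7, b = 3;
        // Truncated division: choice (2) above.
        System.out.println(a / b + " " + a % b);                              // -2 -1
        // Floored division: choice (1) above.
        System.out.println(Math.floorDiv(a, b) + " " + Math.floorMod(a, b));  // -3 2
        // With a negative divisor the two "good" definitions differ:
        System.out.println(Math.floorDiv(a, -b) + " " + Math.floorMod(a, -b)); // 2 -1
        System.out.println(euclidDiv(a, -b) + " " + euclidMod(a, -b));         // 3 2
    }
}
```

In every line of output the identity “dividend = quotient × divisor + remainder” holds; the conventions differ only in which quotient they pick.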

These are elaborated in The Euclidean Definition of the Functions div and mod by Raymond T. Boute, ACM Transactions on Programming Languages and Systems, Vol 14, No. 2, April 1992.

The verse that opens the Pūrva-pīṭhikā of Daṇḍin’s Daśakumāracarita plays on this imagination, and on the word daṇḍa / daṇḍin. Here’s the verse (in Sragdharā metre of pattern GGGGLGG—LLLLLLG—GLGGLGG):

May the leg of Trivikrama,

pole for the parasol that is the universe,

stem of the lotus that is Brahma’s seat,

mast of the ship that is the earth,

rod of the streaming banner that is the river of the Gods,

axle-rod around which the zodiac turns,

pillar of victory over the three worlds,

rod of death for the enemies of the Gods,

favour you with blessings.

brahmāṇḍa-cchatradaṇḍaḥ śata-dhṛti-bhavan’-âmbhoruho nāla-daṇḍaḥ

kṣoṇī-nau-kūpa-daṇḍaḥ kṣarad-amara-sarit-paṭṭikā-ketu-daṇḍaḥ /

jyotiścakr’-âkṣa-daṇḍas tribhuvana-vijaya-stambha-daṇḍo ‘ṅghri-daṇḍaḥ

śreyas traivikramas te vitaratu vibudha-dveṣiṇāṃ kāla-daṇḍaḥ //

ब्रह्माण्डच्छत्रदण्डः शतधृतिभवनाम्भोरुहो नालदण्डः क्षोणीनौकूपदण्डः क्षरदमरसरित्पट्टिकाकेतुदण्डः । ज्योतिश्चक्राक्षदण्डस्त्रिभुवनविजयस्तम्भदण्डोऽङ्घ्रिदण्डः श्रेयस्त्रैविक्रमस्ते वितरतु विबुधद्वेषिणां कालदण्डः ॥

[The Mānasataraṃgiṇī-kāra, agreeing with Santillana and von Dechend, the authors of Hamlet’s Mill, considers the “pole” or “axis” motif central to the conception of Vishnu (e.g. *matsya*‘s horn, Mount Meru as the rod on *kūrma*, *nṛsiṃha* from the pillar, etc.: see here), and so sees much more depth in this poem, holding that Daṇḍin was remembering this old motif.]

The translation above is mildly modified from that of Isabelle Onians in her translation (“What Ten Young Men Did”) of the Daśa-kumāra-carita, published by the Clay Sanskrit Library:

Pole for the parasol-shell that is Brahma’s cosmic egg,

Stem for Brahma’s lotus seat,

Mast for the ship that is the earth,

Rod for the banner that is the rushing immortal river Ganges,

Axle rod for the rotating zodiac,

Pillar of victory over the three worlds—

May Vishnu’s leg favor you with blessings—

Staff that is the leg of him who as Trivikrama reclaimed those three worlds in three steps,

Rod of time, death itself, for the demon enemies of the gods.

Ryder, in his translation (“The Ten Princes”), takes some liberties and manages verse in couplets:

May everlasting joy be thine,

Conferred by Vishnu’s foot divine,

Which, when it trod the devils flat,

Became the staff of this and that:

The staff around which is unfurled,

The sunshade of the living world;

The flagstaff for the silken gleam

Of sacred Ganges’ deathless stream;

The mast of earth’s far-driven ship,

Round which the stars (as axis) dip;

The lotus stalk of Brahma’s shrine;

The fulcrumed staff of life divine.

For another verse that fully gets into this “filling the universe” spirit, see *The dance of the bhairava* on manasa-taramgini.

Tagged: dandin, poem, sanskrit, sanskrit literature, sanskrit translation, verse

This is an answer not to the part about whether it is easier to learn German after Sanskrit (I don’t know), but rather, a few more assorted points re. “What similarities exist between the two languages”, or even more generally, “Why would people make such a claim?”

As Cerberus [another user] noted, most of these claims come from people whose familiarity, outside of Indian languages, is with mainly English, and perhaps a bit of French (or rarely, Spanish or Italian). So even though many similarities noted between Sanskrit and German are in fact those shared by many members of the Indo-European family, the claim just means that *among the few languages considered,* German’s similarities are remarkable.

[My background: I have a reasonable familiarity with Sanskrit; not so much with German. For impressions about German I’ll rely on the Wikipedia articles, and, (don’t lynch me) Mark Twain’s humorous essay The Awful German Language — of course I know it’s unfair and not a work of linguistics, but as examples of what the average English speaker might find unusual in German, it is a useful document.]

With that said, some similarities:

German apparently has four cases; Sanskrit has eight cases (traditional Sanskrit grammar counts seven, not counting the vocative as distinct). As Cerberus [another user] notes, “Sanskrit and German have several functional cases, whereas French/Spanish/Italian/Portuguese/Dutch/English/etc. do not. Those are the languages one might be inclined to compare Sanskrit with”.

Although English does have short compound words (like *bluebird*, *horseshoe*, *paperback* or *pickpocket*), German has a reputation for long compound words. (Twain complains that the average German sentence “is built mainly of compound words constructed by the writer on the spot, and not to be found in any dictionary — six or seven words compacted into one, without joint or seam — that is, without hyphens”.) He mentions *Stadtverordnetenversammlungen* and *Generalstaatsverordnetenversammlungen*; Wikipedia mentions Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz and Donaudampfschiffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft. But these are **nothing** compared to the words one routinely finds in ornate Sanskrit prose. See for example this post. Sanskrit, like German, allows compounds of arbitrary length, and compounds made of four or five words are routinely found in even the most common Sanskrit texts.

It appears that German verbs tend to come later in the sentence than English speakers are comfortable with. I notice questions on this SE showing that German has V2 word order, not SOV. However, many English speakers seem to find late verbs in German worth remarking on. One of my favourite sentences from Hofstadter goes

“The proverbial German phenomenon of the “verb-at-the-end”, about which droll tales of absentminded professors who would begin a sentence, ramble on for an entire lecture, and then finish up by rattling off a string of verbs by which their audience, for whom the stack had long since lost its coherence, would be totally nonplussed, are told, is an excellent example of linguistic pushing and popping.”

Twain too, says “the reader is left to flounder through to the remote verb” and gives the analogy of

“But when he, upon the street, the (in-satin-and-silk-covered-now-very-unconstrained-after-the-newest-fashioned-dressed) government counselor’s wife met,”

and also

“In the daybeforeyesterdayshortlyaftereleveno’clock Night, the inthistownstandingtavern called `The Wagoner’ was downburnt. When the fire to the onthedownburninghouseresting Stork’s Nest reached, flew the parent Storks away. But when the bytheraging, firesurrounded Nest itself caught Fire, straightway plunged the quickreturning Mother-stork into the Flames and died, her Wings over her young ones outspread.”

Well, this is *exactly* typical Sanskrit writing. Those sentences might have been translated verbatim from a Sanskrit text. Sanskrit technically has *free* word order (i.e., words can be put in any order), and this is made much use of in verse, but in prose, usage tends to be SOV.

Of Sanskrit’s greatest prose work, *Kādambarī*, someone named Albrecht Weber wrote in 1853 that in it,

“the verb is kept back to the second, third, fourth, nay, once to the sixth page, and all the interval is filled with epithets and epithets to these epithets: moreover these epithets frequently consist of compounds extending over more than one line; in short, Bāṇa’s prose is an Indian wood, where all progress is rendered impossible by the undergrowth until the traveller cuts out a path for himself, and where, even then, he has to reckon with malicious wild beasts in the shape of unknown words that affright him.” (“…ein wahrer indischer Wald…”)

(This is unfair criticism: personally, I have been lately reading the Kādambarī with the help of friends more experienced in Sanskrit, and I must say the style is truly enjoyable.) Now, the fact that this was a *German* Indologist writing for the *Journal of the German Oriental Society* somewhat goes against the claim of Sanskrit and German being similar. But one could say: for someone familiar with Sanskrit’s long compounds and late verbs that even Germans find difficult, the same features in German will pose little difficulty.

In Sanskrit, as it appears to be in German, an adjective takes the gender, case, and number of whatever it is describing. (Twain: “would rather decline two drinks than one German adjective”)

By and large, noun genders are arbitrary in Sanskrit as well. Twain notes that in German “a tree is male, its buds are female, its leaves are neuter; horses are sexless, dogs are male, cats are female — tomcats included, of course; a person’s mouth, neck, bosom, elbows, fingers, nails, feet, and body are of the male sex, and his head is male or neuter according to the word selected to signify it, and **not** according to the sex of the individual who wears it — for in Germany all the women wear either male heads or sexless ones; a person’s nose, lips, shoulders, breast, hands, and toes are of the female sex; and his hair, ears, eyes, chin, legs, knees, heart, and conscience haven’t any sex at all”. (He goes on to write a “Tale of the Fishwife and its Sad Fate.”) It does not seem quite so bad in Sanskrit, but yes, gender of words needs to be learned. (In Sanskrit there exists a word for “wife” in each of the three genders.) However this is a feature common to *many* languages (including, say, languages like Hindi or French that have only two genders) so I shouldn’t list it among similarities.

This is something quite trivial, and linguists often don’t even consider orthography a part of the language proper, but spelling seems to be a pretty big deal to Indians learning other languages. The writing systems of most Indian languages are phonetic, in the sense that the spelling deterministically reflects the pronunciation and vice-versa. There are no silent letters, and no wondering about how a word spelled in a particular way is pronounced. Indian learners of English often complain about the ad-hoc inconsistent spelling of English; it seems a bigger deal than it should be. From this point of view, the claim that for German, “After one short lesson in the alphabet, the student can tell how any German word is pronounced without having to ask” means that that aspect of German is easier to learn.

This is extremely subjective and will be controversial, and perhaps I will seem biased, but to me, in Sanskrit, it seems possible to pick words whose sounds match the desired feeling, better than in other languages. I have seen people who knew many languages say the same thing, and also Western translators from Sanskrit etc., so it is interesting for me to see Twain make a similar remark about German. Anyway, this is subjective, so I’ll not dwell on this much.

There are of course many differences too; e.g. Sanskrit does not have articles (*the*, etc.) unlike German. It also has very few prepositions (only a few, like “without”, “with”, “before”), as the work of prepositions like “to”, “from”, or “by” is handled by case. The difficulty of German prepositions does not seem to be present in Sanskrit.

Some alleged difficulties of learning German, such as cases, long compounds, and word order, are present to a far greater extent in Sanskrit, so in principle someone who knows Sanskrit may be able to pick them up more easily than someone trying to learn German without this knowledge. However, this may not be saying anything more than that knowing one language helps you learn others.

Tagged: german, linguistics, sanskrit

Roughly, the amazing discovery of Gregory Galperin is this: When a ball of mass $M$ collides with one of mass $m$, propelling it towards a wall, the number of collisions (assuming standard physics idealisms) is $\lfloor \pi \sqrt{M/m} \rfloor$, so by taking $M/m = 100^n$, we can get the first $n+1$ digits of $\pi$. Note that this number of collisions is an entirely deterministic quantity; there’s no probability involved!
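The claim is easy to try numerically, since only the velocities matter for counting collisions. Here is a rough sketch in Java, assuming perfectly elastic collisions, a light ball of mass 1 at rest between the wall and the heavy ball, and ordinary double-precision arithmetic (so it should only be trusted for small mass ratios). The class and method names are made up for illustration.

```java
class PiCollisions {
    // Counts ball-ball and ball-wall collisions when a heavy ball of mass
    // bigM rolls towards a resting ball of mass 1 that sits in front of a wall.
    // The wall is on the left, so negative velocity means "towards the wall".
    static long countCollisions(double bigM) {
        double m = 1.0;
        double vBig = -1.0;   // heavy ball starts moving towards the wall
        double vSmall = 0.0;  // light ball starts at rest
        long count = 0;
        while (true) {
            if (vBig < vSmall) {
                // The balls are approaching each other: elastic collision.
                double newBig = ((bigM - m) * vBig + 2 * m * vSmall) / (bigM + m);
                double newSmall = ((m - bigM) * vSmall + 2 * bigM * vBig) / (bigM + m);
                vBig = newBig;
                vSmall = newSmall;
                count++;
            } else if (vSmall < 0) {
                // The light ball is heading for the wall: it bounces back.
                vSmall = -vSmall;
                count++;
            } else {
                // Both balls move away from the wall and the heavy one is
                // at least as fast, so nothing can collide any more.
                return count;
            }
        }
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 3; n++) {
            // Mass ratio 100^n; prints 3, 31, 314, 3141 on successive lines.
            System.out.println(countCollisions(Math.pow(100, n)));
        }
    }
}
```

The loop alternates between the two kinds of collision because a ball-ball collision always leaves the balls separating, and a wall bounce always leaves the light ball moving away from the wall.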

Here’s a video demonstrating the fact (the blue ball is the heavier one):

The NYT post says how this discovery came about:

Dr. Galperin’s approach was also geometric but very different (using an unfolding geodesic), building on prior related insights. Dr. Galperin, who studied under well-known Russian mathematician Andrei Kolmogorov, had recently written (with Yakov Sinai) extensively on ball collisions, realized just before a talk in 1995 that a plot of the ball positions of a pair of colliding balls could be used to determine pi. (When he mentioned this insight in the talk, no one in the audience believed him.) This finding was ultimately published as “Playing Pool With Pi” in a 2003 issue of Regular and Chaotic Dynamics.

The paper, *Playing Pool With π (The number π from a billiard point of view)* is very readable. The post has, despite a “solution” section, essentially no explanation, but the two comments by Dave in the comments section explain it clearly. And a reader sent in a cleaned-up version of that too: here, by Benjamin Wearn who teaches physics at Fieldston School.

Now someone needs to make a simulation / animation graphing the two balls in phase space of momentum. :-)

I’d done something a while ago, to illustrate The Orbit of the Moon around the Sun is Convex!, here. Probably need to re-learn all that JavaScript stuff, to make one for this. Leaving this post here as a placeholder.

Or maybe someone has done it already?

Imagine that you find yourself going to see a performance of “Romeo and Juliet.” You are in the right mood for the play, no mundane worries preoccupy your mind, you have agreeable company, and the theatre, the stage, the director and the actors are all excellent—capable of doing justice to a great play. Your seat in the theatre is comfortable and gives an unobstructed view.

The play begins and you find yourself drawn into the world Shakespeare is sketching. The involvement deepens to an immersion where the ordinary, everyday world dims and fades from the center of attention, you begin to understand and even share the feelings of the characters on stage—under ideal conditions you might reach a stage where you begin to participate in some strange way in the love being evoked.

Now, if at that moment you were to ask yourself: “Whose love is this?” a paradox arises.

It cannot be Romeo’s love for Juliet, nor Juliet’s love for Romeo, for they are fictional characters. It cannot be the actors’, for in reality they may despise one another. It cannot be your own love, for you cannot love a fictional character and know nothing about the actors’ real personalities (they are veiled by the role they assume), and, for the same reasons, it cannot be the actors’ love for either you or the fictional characters. So it is a peculiar, almost abstract love without immediate referent or context.

A Sanskrit aesthete would explain to you that you are at that moment “relishing” (āsvādana) your own “fundamental emotional state” (sthāyi-bhāva) called “passion” (rati) which has been “decontextualised” (sādhāraṇīkṛta) by the operation of “sympathetic resonance” (hṛdayasaṃvāda) and heightened to become transformed into an “aesthetic sentiment” (rasa) called the “erotic sentiment” (śṛṅgāra).

This “aesthetic sentiment” is a paradoxical and ephemeral thing that can be evoked by the play but is not exactly caused by it, for many spectators may have felt nothing at all during the same performance. You yourself, seeing it again next week, under the same circumstances, might experience nothing. It is, moreover, something that cannot be adequately explained through analytic terms, the only proof for its existence is its direct, personal experience.

[…]

It is, moreover, a blissful experience. The fact that sensitive readers often weep while reading poetry does not mean that they are suffering, rather the tenderness of the work has succeeded in melting the contraction of their minds or hearts.

The non-ordinary nature of such aesthetic sentiments makes it possible for the spectator or reader to derive a pleasurable experience even from what in ordinary life would be causes of grief.

The Indian scholarly tradition has a lot more, including some very thoughtful deliberation and perceptive observation, but it seems better to start a discussion of *rasa* with an example like this than with the technical details.

[Another good start may be via film. See for instance:

How to Watch a Hindi Film: The Example of *Kuch Kuch Hota Hai* by Sam Joshi, published in *Education About Asia*, Volume 9, Number 1 (Spring 2004).

and perhaps (and if you have a lot of time):

Is There an Indian Way of Filmmaking? by Philip Lutgendorf, published in International Journal of Hindu Studies, Vol. 10, No. 3 (Dec., 2006), pp. 227-256.

Previously on this blog: On songs in Bollywood]

Tagged: aesthetics, appreciation, drama, emotion, rasa