The Lumber Room

"Consign them to dust and damp by way of preserving them"

Some playing with Python

with 2 comments

A long time ago, Diophantus (sort of) discussed integer solutions to the equation

\displaystyle  x^2 + y^2 = z^2

(solutions to this equation are called Pythagorean triples).

Centuries later, in 1637, Fermat made a conjecture (now called Fermat’s Last Theorem, not because he uttered it in his dying breath, but because it was the last one to be proved — in ~1995) that

\displaystyle x^n + y^n = z^n

has no positive integer solutions for n \ge 3. In other words, his conjecture was that none of the following equations has a solution:

\displaystyle x^3 + y^3 = z^3

\displaystyle x^4 + y^4 = z^4

\displaystyle x^5 + y^5 = z^5

\displaystyle x^6 + y^6 = z^6

… and so on. An nth power cannot be partitioned into two nth powers.

About a century later, Euler proved the n = 3 case of Fermat’s conjecture, but generalized it in a different direction: he conjectured in 1769 that an nth power cannot be partitioned into fewer than n nth powers, namely

\displaystyle z^n = \sum_{i = 1}^k x_i^n

has no solutions with k < n. So his conjecture was that (among others) none of the following equations has a solution:

\displaystyle z^3 = a^3 + b^3

\displaystyle z^4 = a^4 + b^4 + c^4

\displaystyle z^5 = a^5 + b^5 + c^5 + d^5

\displaystyle z^6 = a^6 + b^6 + c^6 + d^6 + e^6

… and so on.

This conjecture stood for about two centuries, until abruptly it was found to be false, by Lander and Parkin who in 1966 simply did a direct search on the fastest (super)computer at the time, and found this counterexample:

\displaystyle 27^5 + 84^5 + 110^5 + 133^5 = 144^5

(It is still one of only three examples known, according to Wikipedia.)
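As a quick sanity check (mine, not from the original posts), the counterexample is easy to verify in Python:

print(27**5 + 84**5 + 110**5 + 133**5)  # 61917364224
print(144**5)                           # 61917364224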

Now, how might you find this solution on a computer today?

In his wonderful (as always) post at bit-player, Brian Hayes showed the following code:

import itertools as it

def four_fifths(n):
    '''Return smallest positive integers ((a,b,c,d),e) such that
       a^5 + b^5 + c^5 + d^5 = e^5; if no such tuple exists
       with e < n, return the string 'Failed'.'''
    fifths = [x**5 for x in range(n)]
    combos = it.combinations_with_replacement(range(1,n), 4)
    while True:
        try:
            cc = next(combos)
            cc_sum = sum([fifths[i] for i in cc])
            if cc_sum in fifths:
                return(cc, fifths.index(cc_sum))
        except StopIteration:
            return('Failed')

to which, if you add (say) print four_fifths(150) and run it, it returns the correct answer fairly quickly: in about 47 seconds on my laptop.

The if cc_sum in fifths: line inside the loop is an O(n) cost each time it’s run, so with a simple improvement to the code (using a set instead) and rewriting it a bit, we can write the following full program:

import itertools

def find_counterexample(n):
  fifth_powers = [x**5 for x in range(n)]
  fifth_powers_set = set(fifth_powers)
  for xs in itertools.combinations_with_replacement(range(1, n), 4):
    xs_sum = sum([fifth_powers[i] for i in xs])
    if xs_sum in fifth_powers_set:
      return (xs, fifth_powers.index(xs_sum))
  return 'Failed'

print find_counterexample(150)

which finishes in about 8.5 seconds.


But there’s something unsatisfying about this solution: it assumes there’s a solution with all four numbers on the LHS less than 150. After all, changing the function invocation to find_counterexample(145) makes it run even a second faster, but how could we know to do that without already knowing the solution? Besides, we don’t have a fixed 8- or 10-second budget; what we’d really like is a program that keeps searching till it finds a solution or we abort it (or it runs out of memory or something), with no other fixed termination condition.

The above program used the given “n” as an upper bound to generate the combinations of 4 numbers; is there a way to generate all combinations when we don’t know an upper bound on them?

Yes! One of the things I learned from Knuth volume 4 is that if you simply write down each combination in descending order and order them lexicographically, the combinations you get for each upper bound are a prefix of the list of the next bigger one, i.e., for any upper bound, all the combinations form a prefix of the same infinite list, which starts as follows (line breaks for clarity):

2111, 2211, 2221, 2222,
3111, 3211, 3221, 3222, 3311, 3321, 3322, 3331, 3332, 3333,
4111, ...
     ... 9541, 9542, 9543, 9544, 9551, ... 9555, 9611, ...
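(A quick way to convince yourself of this prefix property — my own check, not from Knuth or the post — is to generate the bounded lists with the library function and compare:)

import itertools

def descending_combos(n, r=4):
    # Each combination of 1..n-1 written in descending order, then the
    # whole list sorted lexicographically.
    return sorted(tuple(sorted(c, reverse=True))
                  for c in itertools.combinations_with_replacement(range(1, n), r))

small, big = descending_combos(4), descending_combos(5)
print(big[:len(small)] == small)  # True: the n=4 list is a prefix of the n=5 list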

There doesn’t seem to be a library function in Python to generate these though, so we can write our own. If we stare at the above list, we can figure out how to generate the next combination from a given one:

  1. Walk backwards from the end, till you reach the beginning or find an element that’s less than the previous one.
  2. Increase that element, set all the following elements to 1s, and continue.

We could write, say, the following code for it:

def all_combinations(r):
  xs = [1] * r
  while True:
    yield xs
    for i in range(r - 1, 0, -1):
      if xs[i] < xs[i - 1]:
        break
    else:
      i = 0
    xs[i] += 1
    xs[i + 1:] = [1] * (r - i - 1)

(The else block on a for loop is an interesting Python feature: it is executed if the loop wasn’t terminated with break.) We could even hard-code the r=4 case, as we’ll see later below.
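(A tiny standalone demonstration of for–else, separate from the program above:)

for i in range(3):
    if i == 10:
        break
else:
    print('printed: the loop ran to completion without break')

for i in range(3):
    if i == 1:
        break
else:
    print('not printed: the loop was terminated by break')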

For testing whether a given number is a fifth power, we can no longer simply lookup in a fixed precomputed set. We can do a binary search instead:

def is_fifth_power(n):
  assert n > 0
  lo = 0
  hi = n
  # Invariant: lo^5 < n <= hi^5
  while hi - lo > 1:
    mid = lo + (hi - lo) / 2
    if mid ** 5 < n:
      lo = mid
    else:
      hi = mid
  return hi ** 5 == n

but it turns out that this is slower than one based on looking up in a growing set (as below).

Putting everything together, we can write the following (very C-like) code:

largest_known_fifth_power = (0, 0)
known_fifth_powers = set()
def is_fifth_power(n):
  global largest_known_fifth_power
  while n > largest_known_fifth_power[0]:
    m = largest_known_fifth_power[1] + 1
    m5 = m ** 5
    known_fifth_powers.add(m5)
    largest_known_fifth_power = (m5, m)
  return n in known_fifth_powers

def fournums_with_replacement():
  (x0, x1, x2, x3) = (1, 1, 1, 1)
  while True:
    yield (x0, x1, x2, x3)
    if x3 < x2:
      x3 += 1
      continue
    x3 = 1
    if x2 < x1:
      x2 += 1
      continue
    x2 = 1
    if x1 < x0:
      x1 += 1
      continue
    x1 = 1
    x0 += 1

if __name__ == '__main__':
  tried = 0
  for get in fournums_with_replacement():
    tried += 1
    if (tried % 1000000 == 0):
      print tried, 'Trying:', get
    rhs = get[0]**5 + get[1]**5 + get[2]**5 + get[3]**5
    if is_fifth_power(rhs):
      print 'Found:', get, rhs

which is both longer and slower (takes about 20 seconds) than the original program, but at least we have the satisfaction that it doesn’t depend on any externally known upper bound.

I originally started writing this post because I wanted to describe some experiments I did with profiling, but it’s late and I’m sleepy so I’ll just mention it.

python -m cProfile

will print relevant output in the terminal:

         26916504 function calls in 26.991 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   18.555   18.555   26.991   26.991
 13458164    4.145    0.000    4.145    0.000
 13458163    4.292    0.000    4.292    0.000
      175    0.000    0.000    0.000    0.000 {method 'add' of 'set' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Another way to view the same thing is to write the profile output to a file and read it with cprofilev:

python -m cProfile -o euler_profile.out
cprofilev euler_profile.out

and visit http://localhost:4000 to view it.
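Alternatively — a small sketch of my own, not from the original post — the standard-library pstats module can read the same file:

import pstats

stats = pstats.Stats('euler_profile.out')
stats.sort_stats('cumulative').print_stats(10)  # the 10 most expensive entries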

Of course, simply translating this code to C++ makes it run much faster:

#include <array>
#include <iostream>
#include <map>
#include <utility>

typedef long long Int;
constexpr Int fifth_power(Int x) { return x * x * x * x * x; }

std::map<Int, int> known_fifth_powers = {{0, 0}};
bool is_fifth_power(Int n) {
  while (n > known_fifth_powers.rbegin()->first) {
    int m = known_fifth_powers.rbegin()->second + 1;
    known_fifth_powers[fifth_power(m)] = m;
  }
  return known_fifth_powers.count(n);
}

std::array<Int, 4> four_nums() {
  static std::array<Int, 4> x = {1, 1, 1, 0};
  int i = 3;
  while (i > 0 && x[i] == x[i - 1]) --i;
  x[i] += 1;
  while (++i < 4) x[i] = 1;
  return x;
}

std::ostream& operator<<(std::ostream& os, std::array<Int, 4> x) {
  os << "(" << x[0] << ", " << x[1] << ", " << x[2] << ", " << x[3] << ")";
  return os;
}

int main() {
  while (true) {
    std::array<Int, 4> get = four_nums();
    Int rhs = fifth_power(get[0]) + fifth_power(get[1]) + fifth_power(get[2]) + fifth_power(get[3]);
    if (is_fifth_power(rhs)) {
      std::cout << "Found: " << get << " " << known_fifth_powers[rhs] << std::endl;
    }
  }
}


clang++ -std=c++11 && time ./a.out

runs in 2.43s, or 0.36s if compiled with -O2.

But I don’t have a satisfactory answer to how to make our Python program which takes 20 seconds as fast as the 8.5-second known-upper-bound version.

Edit [2015-05-08]: I wrote some benchmarking code to compare all the different “combination” functions.

import itertools

# Copied from the Python documentation
def itertools_equivalent(iterable, r):
    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        return
    indices = [0] * r
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)

# Above function, specialized to first argument being range(1, n)
def itertools_equivalent_specialized(n, r):
  indices = [1] * r
  yield indices
  while True:
    for i in reversed(range(r)):
      if indices[i] != n - 1:
        break
    else:
      return
    indices[i:] = [indices[i] + 1] * (r - i)
    yield indices

# Function to generate all combinations of 4 elements
def all_combinations_pythonic(r):
  xs = [1] * r
  while True:
    yield xs
    for i in range(r - 1, 0, -1):
      if xs[i] < xs[i - 1]:
        break
    else:
      i = 0
    xs[i] += 1
    xs[i + 1:] = [1] * (r - i - 1)

# Above function, written in a more explicit C-like way
def all_combinations_clike(r):
  xs = [1] * r
  while True:
    yield xs
    i = r - 1
    while i > 0 and xs[i] == xs[i - 1]:
      i -= 1
    xs[i] += 1
    while i < r - 1:
      i += 1
      xs[i] = 1

# Above two functions, specialized to r = 4, using tuple over list.
def fournums():
  (x0, x1, x2, x3) = (1, 1, 1, 1)
  while True:
    yield (x0, x1, x2, x3)
    if x3 < x2:
      x3 += 1
      continue
    x3 = 1
    if x2 < x1:
      x2 += 1
      continue
    x2 = 1
    if x1 < x0:
      x1 += 1
      continue
    x1 = 1
    x0 += 1

# Benchmarks for all functions defined above (and the library function)
def benchmark_itertools(n):
  for xs in itertools.combinations_with_replacement(range(1, n), 4):
    if xs[0] >= n:
      break

def benchmark_itertools_try(n):
  combinations = itertools.combinations_with_replacement(range(1, n), 4)
  while True:
    try:
      xs = next(combinations)
      if xs[0] >= n:
        break
    except StopIteration:
      break

def benchmark_itertools_equivalent(n):
  for xs in itertools_equivalent(range(1, n), 4):
    if xs[0] >= n:
      break

def benchmark_itertools_equivalent_specialized(n):
  for xs in itertools_equivalent_specialized(n, 4):
    if xs[0] >= n:
      break

def benchmark_all_combinations_pythonic(n):
  for xs in all_combinations_pythonic(4):
    if xs[0] >= n:
      break

def benchmark_all_combinations_clike(n):
  for xs in all_combinations_clike(4):
    if xs[0] >= n:
      break

def benchmark_fournums(n):
  for xs in fournums():
    if xs[0] >= n:
      break

if __name__ == '__main__':
  n = 150  # the exact upper bound used for the benchmarks isn't shown in the post; 150 matches the earlier experiments
  benchmark_itertools(n)
  benchmark_itertools_try(n)
  benchmark_itertools_equivalent(n)
  benchmark_itertools_equivalent_specialized(n)
  benchmark_all_combinations_pythonic(n)
  benchmark_all_combinations_clike(n)
  benchmark_fournums(n)

As you can see, the statement I chose inside each benchmarking function is the one that causes the unbounded generators (like all_combinations) to terminate at the upper bound, and it has no effect on the other combination functions.
When run with

python -m cProfile

the results include:



  • Calling itertools.combinations_with_replacement is by far the fastest, taking about 2.7 seconds. It turns out that it’s written in C, so this would be hard to beat. (Still, writing it in a try block is seriously bad.)
  • The “equivalent” Python code from the itertools documentation (benchmark_itertools_equivalent) is about 50x slower.
  • It gets slightly faster when specialized to numbers (benchmark_itertools_equivalent_specialized).
  • Simply generating all combinations without an upper bound is actually faster.
  • It can be made even faster by writing it in a more C-like way.
  • The tuples version with the loop unrolled manually is rather fast when seen in this light, less than 4x slower than the library version.

Written by S

Sun, 2015-02-08 at 00:03:38

Boolean blindness

leave a comment »

(Just digesting the first page of Google search results.)

One of the lessons from functional programming is to encode as much information as possible into the types. Almost all programmers understand to some extent that types are helpful: they know not to store everything as void* (in C/C++) or as Object (in Java). They even know not to use, say, double for all numbers, or string for everything (as shell scripts / Tcl do). (This is pejoratively called “stringly typed”.)

A corollary, from slide 19 here, is that (when your type system supports richer types) a boolean type is almost always the wrong choice, as it carries too little information.
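As a rough Python illustration of the same point (my own sketch, with made-up names):

db = {'ada': {'admin': True}}

# Boolean-blind: the caller learns only *that* the key is present,
# and must redo the lookup to recover the value.
def has_user(table, name):
    return name in table

if has_user(db, 'ada'):
    print(db['ada'])   # a second lookup, justified only "from context"

# More informative: return the evidence itself (the value, or None),
# so the branch carries the information with it.
user = db.get('ada')
if user is not None:
    print(user)        # no second lookup; user is in scope here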

The name “Boolean blindness” for this seems to have been coined by Dan Licata when he taught a course at CMU as a PhD student.

From here (blog post by Robert Harper, his advisor at CMU):

There is no information carried by a Boolean beyond its value, and that’s the rub. As Conor McBride puts it, to make use of a Boolean you have to know its provenance so that you can know what it means.
Keeping track of this information (or attempting to recover it using any number of program analysis techniques) is notoriously difficult. The only thing you can do with a bit is to branch on it, and pretty soon you’re lost in a thicket of if-then-else’s, and you lose track of what’s what.

The problem is computing the bit in the first place. Having done so, you have blinded yourself by reducing the information you have at hand to a bit, and then trying to recover that information later by remembering the provenance of that bit. An illustrative example was considered in my article on equality:

fun plus x y = if x=Z then y else S(plus (pred x) y)

Here we’ve crushed the information we have about x down to one bit, then branched on it, then were forced to recover the information we lost to justify the call to pred, which typically cannot recover the fact that its argument is non-zero and must check for it to be sure. What a mess! Instead, we should write

fun plus x y = case x of Z => y | S(x') => S(plus x' y)

No Boolean necessary, and the code is improved as well! In particular, we obtain the predecessor en passant, and have no need to keep track of the provenance of any bit.

Some commenter there says

To the best of my knowledge, Ted Codd was the first to point out, in his relational model, that there is no place for Boolean data types in entity modeling. It is a basic design principle to avoid characterizing data in terms of Boolean values, since there is usually some other piece of information you are forgetting to model, and once you slam a Boolean into your overall data model, it becomes very hard to version towards a more exact model (information loss).

An example from Hacker News (on an unrelated post):

Basically, the idea is that when you branch on a conditional, information is gained. This information may be represented in the type system and used by the compiler to verify safety, or it can be ignored. If it is ignored, the language is said to have “boolean blindness”.

  if (ptr == NULL) {
    ... a ...
  } else {
    ... b ...
  }
In branch a and branch b, different invariants about ptr hold. But the language/compiler are not verifying any of these invariants.
Instead, consider:

  data Maybe a = Nothing | Just a

This defines a type “Maybe”, parameterized by a type variable “a”, and it defines two “data constructors” that can make a Maybe value: “Nothing” which contains nothing, and “Just” which contains a value of type “a” inside it.
This is known as a “sum” type, because the possible values of the type are the sum of all data constructor values.
We could still use this sum data-type in a boolean-blind way:

  if isJust mx then
    .. use fromJust mx .. -- UNSAFE!
  else
    .. can't use content of mx ..

However, using pattern-matching, we can use it in a safe way. Assume “mx” is of type “Maybe a”:

  case mx of
    Nothing -> ... There is no value of type "a" in our scope
    Just x -> ... "x" of type "a" is now available in our scope!

So when we branch on the two possible cases for the “mx” type, we gain new type information that gets put into our scope.
“Maybe” is of course a simple example, but this is also applicable for any sum type at all.

Another from notes of someone called HXA7241:

A nullable pointer has two ‘parts’: null, and all-other-values. These can be got at with a simple if-else conditional: if p is not null then ... else ... end. And there is still nothing wrong with that. The problem arrives when you want to handle the parts – and you lack a good type system. What do you do with the non-null pointer value? You just have to put it back into a pointer again – a nullable pointer: which is what you started with. So where has what you did with the test been captured? Nowhere. When you handle that intended part elsewhere you have to do the test again and again.

A Reddit discussion: Don Stewart says

Not strongly typed, richly typed. Where evidence isn’t thrown away. Agda is about the closest thing to this we have.

Relation to “object-oriented programming”

There’s also a good (and unintentionally revealing) example there by user munificent. Consider Harper’s example, which in more explicit Haskell could be:

data Num = Zero | Succ Num

plus :: Num -> Num -> Num
plus Zero     y = y
plus (Succ x) y = Succ (plus x y)

We might write it in Java as the following:

interface Num {}
class Zero implements Num {}
class Succ implements Num {
  public Succ(Num pred) {
    this.pred = pred;
  }
  public final Num pred;
}

Num plus(Num x, Num y) {
  if (x instanceof Succ) {      // (1)
    Succ xAsSucc = (Succ)x;       // (2)
    return new Succ(plus(xAsSucc.pred, y));
  } else {
    return y;
  }
}

Here instanceof returns a boolean (comment (1)), but doesn’t carry with it any information about what that boolean represents (namely that x is an instance of Succ) so when we get to the next line (2) we’re forced to do an unsafe cast. As programmers, we know it’s safe from context, but the compiler doesn’t.

There is a way in Java of avoiding this situation (where the programmer has context the compiler doesn’t):

interface Num {
  Num plus(Num other);
}

class Zero implements Num {
  public Num plus(Num other) {
    return other;
  }
}

class Succ implements Num {
  public Succ(Num pred) {
    this.pred = pred;
  }

  public Num plus(Num other) {
    return new Succ(pred.plus(other));
  }

  public final Num pred;
}

But see what has happened (by user aaronia):

There’s a rub though — in your first code snippet, “plus” was written as a non-member function; anyone can write it as a stand alone library. It was modular. But, as you demonstrated, it had the redundant type check.
However, your second code snippet has lost this modularity. You had to add new methods to the Num data type.

I think this is the pitfall of Object-Oriented Programming (which is like “Pants-Oriented Clothing”): objects can talk about themselves, but it’s harder to talk about objects.

If you want to write a function that does similar things for a bunch of types, you have to write similar function definitions in all those different classes. These function definitions do not stay together, so there is no way of knowing or ensuring that they are similar.

Written by S

Sat, 2015-01-31 at 14:55:00

Posted in computer science


C++ vs Java terminology glossary

leave a comment »

E.g. there is no such thing as a “method” in C++. As far as the C++ standard is concerned, functions inside classes are always called “member functions”.

Here is a handout from this course (taught by, AFAICT, Jeffrey S. Leon) that helps me understand the Java people.


Written by S

Sat, 2015-01-31 at 11:53:06

Posted in programming


The “most natural” definition of the div and mod functions

with one comment

[incomplete: must add examples and more discussion]

Most programming languages include a “remainder” or “modulo” function, and also an integer division (“quotient”) function. Given two integers D and d, let’s call the results of these functions r and q respectively.

For positive D and d, it is clear what q and r should be: q = \left\lfloor D/d \right\rfloor is the largest integer n such that nd \le D, and r = D - qd is the remainder which therefore satisfies 0 \le r < d.

What should we do when, as frequently happens, D is negative, or (as less frequently happens) d is negative?

For negative D and positive d, there are two choices when D lies between two multiples of d (i.e. q_1 d < D < q_2 d):

(1) Set q to the lesser value, so that 0 \le r < d continues to hold, or
(2) Set q to the greater (and therefore smaller in magnitude) value.

There are very good reasons why (1) is preferable to (2): it ensures that the function a \bmod m is always positive no matter what the value of a, so that, for example, a \equiv b \pmod m \iff (a \bmod m) = (b \bmod m).

And indeed that is what the more recent programming languages do. There is a table on Wikipedia: C, C++, Java, Go(!), OCaml(!), PHP, all have the “bad” behaviour, while Maple, Mathematica, Microsoft Excel, Perl, Python, Ruby have the “good” behaviour. Some languages have separate functions for both behaviours (e.g. Haskell has quotRem and divMod functions, similarly Common Lisp, Fortran, MATLAB).
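For instance (a quick Python illustration, not from the original article):

import math

print(-7 // 3)           # -3: floored quotient, the "good" behaviour
print(-7 % 3)            # 2: the remainder is nonnegative, so (-7 % 3) == (2 % 3)
print(divmod(-7, 3))     # (-3, 2)
print(math.fmod(-7, 3))  # -1.0: fmod follows the C convention (truncate towards zero)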

There’s also the question of what to do when d is negative, which turns out not to matter much (as long as it’s consistent with the above). One definition is to continue to have q be the lesser value, and the other is to continue to insist that r \ge 0. Both are fine, though sometimes the latter is nicer.

These are elaborated in The Euclidean Definition of the Functions div and mod by Raymond T. Boute, ACM Transactions on Programming Languages and Systems, Vol 14, No. 2, April 1992.

Written by S

Mon, 2015-01-12 at 09:30:35


with 2 comments

Viṣṇu, appearing before Bali as Vāmana, transformed into Trivikrama, filling the universe, covering all the earth and the heavens in two steps.

The verse that opens the Pūrva-pīṭhikā of Daṇḍin’s Daśakumāracarita plays on this imagination, and on the word daṇḍa / daṇḍin. Here’s the verse (in Sragdharā metre of pattern GGGGLGG—LLLLLLG—GLGGLGG):

May the leg of Trivikrama,
   pole for the parasol that is the universe,
   stem of the lotus that is Brahma’s seat,
   mast of the ship that is the earth,
   rod of the streaming banner that is the river of the Gods,
   axle-rod around which the zodiac turns,
   pillar of victory over the three worlds,
   rod of death for the enemies of the Gods,
favour you with blessings.

brahmāṇḍa-cchatradaṇḍaḥ śata-dhṛti-bhavan’-âmbhoruho nāla-daṇḍaḥ
kṣoṇī-nau-kūpa-daṇḍaḥ kṣarad-amara-sarit-paṭṭikā-ketu-daṇḍaḥ /
jyotiścakr’-âkṣa-daṇḍas tribhuvana-vijaya-stambha-daṇḍo ‘ṅghri-daṇḍaḥ
śreyas traivikramas te vitaratu vibudha-dveṣiṇāṃ kāla-daṇḍaḥ //

ब्रह्माण्डच्छत्रदण्डः शतधृतिभवनाम्भोरुहो नालदण्डः
क्षोणीनौकूपदण्डः क्षरदमरसरित्पट्टिकाकेतुदण्डः ।
ज्योतिश्चक्राक्षदण्डस्त्रिभुवनविजयस्तम्भदण्डोऽङ्घ्रिदण्डः
श्रेयस्त्रैविक्रमस्ते वितरतु विबुधद्वेषिणां कालदण्डः ॥

[The Mānasataraṃgiṇī-kāra, agreeing with Santillana and von Dechend the authors of Hamlet’s Mill, considers the “pole” or “axis” motif central to the conception of Vishnu (e.g. matsya‘s horn, Mount Meru as the rod on kūrma, nṛsiṃha from the pillar, etc.: see here), sees much more depth in this poem, and that Daṇḍin was remembering this old motif.]

The translation above is mildly modified from that of Isabelle Onians in her translation (“What Ten Young Men Did”) of the Daśa-kumāra-carita, published by the Clay Sanskrit Library:

Pole for the parasol-shell that is Brahma’s cosmic egg,
Stem for Brahma’s lotus seat,
Mast for the ship that is the earth,
Rod for the banner that is the rushing immortal river Ganges,
Axle rod for the rotating zodiac,
Pillar of victory over the three worlds—
May Vishnu’s leg favor you with blessings—
Staff that is the leg of him who as Trivikrama reclaimed those three worlds in three steps,
Rod of time, death itself, for the demon enemies of the gods.

Ryder, in his translation (“The Ten Princes”), takes some liberties and manages verse in couplets:

May everlasting joy be thine,
Conferred by Vishnu’s foot divine,

Which, when it trod the devils flat,
Became the staff of this and that:

The staff around which is unfurled,
The sunshade of the living world;

The flagstaff for the silken gleam
Of sacred Ganges’ deathless stream;

The mast of earth’s far-driven ship,
Round which the stars (as axis) dip;

The lotus stalk of Brahma’s shrine;
The fulcrumed staff of life divine.

For another verse that fully gets into this “filling the universe” spirit, see The dance of the bhairava on manasa-taramgini.

Written by S

Sun, 2014-08-17 at 23:36:42

Sanskrit and German

with 2 comments

[Originally posted as an answer to a question by user Manishearth, who asked: “I’ve heard many times that learning German is easier for those who speak Sanskrit, and vice versa. Is there any linguistic basis for this? What similarities exist between the two languages that may be able to explain this?”]

This is an answer not to the part about whether it is easier to learn German after Sanskrit (I don’t know), but rather, a few more assorted points re. “What similarities exist between the two languages”, or even more generally, “Why would people make such a claim?”

As Cerberus [another user] noted, most of these claims come from people whose familiarity, outside of Indian languages, is mainly with English, and perhaps a bit of French (or rarely, Spanish or Italian). So even though many similarities noted between Sanskrit and German are in fact those shared by many members of the Indo-European family, the claim just means that among the few languages considered, German’s similarities are remarkable.

[My background: I have a reasonable familiarity with Sanskrit; not so much with German. For impressions about German I’ll rely on the Wikipedia articles and (don’t lynch me) Mark Twain’s humorous essay The Awful German Language — of course I know it’s unfair and not a work of linguistics, but as an example of what the average English speaker might find unusual in German, it is a useful document.]

With that said, some similarities:


Cases

German apparently has four cases; Sanskrit has eight cases (traditional Sanskrit grammar counts seven, not counting the vocative as distinct). As Cerberus [another user] notes, “Sanskrit and German have several functional cases, whereas French/Spanish/Italian/Portuguese/Dutch/English/etc. do not. Those are the languages one might be inclined to compare Sanskrit with”.

Compound words

Although English does have short compound words (like bluebird, horseshoe, paperback, or pickpocket), German has a reputation for long compound words. (Twain complains that the average German sentence “is built mainly of compound words constructed by the writer on the spot, and not to be found in any dictionary — six or seven words compacted into one, without joint or seam — that is, without hyphens”.) He mentions Stadtverordnetenversammlungen and Generalstaatsverordnetenversammlungen; Wikipedia mentions Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz and Donaudampfschiffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft. But these are nothing compared to the words one routinely finds in ornate Sanskrit prose. See for example this post. Sanskrit, like German, allows compounds of arbitrary length, and compounds made of four or five words are routinely found in even the most common Sanskrit texts.

Verb appearing late

It appears that German verbs tend to come later in the sentence than English speakers are comfortable with. I notice questions on this SE showing that German has V2 word order, not SOV. However, many English speakers seem to find late verbs in German worth remarking on. One of my favourite sentences from Hofstadter goes

“The proverbial German phenomenon of the “verb-at-the-end”, about which droll tales of absentminded professors who would begin a sentence, ramble on for an entire lecture, and then finish up by rattling off a string of verbs by which their audience, for whom the stack had long since lost its coherence, would be totally nonplussed, are told, is an excellent example of linguistic pushing and popping.”

Twain too, says “the reader is left to flounder through to the remote verb” and gives the analogy of

“But when he, upon the street, the (in-satin-and-silk-covered-now-very-unconstrained-after-the-newest-fashioned-dressed) government counselor’s wife met,”

and also

“In the daybeforeyesterdayshortlyaftereleveno’clock Night, the inthistownstandingtavern called `The Wagoner’ was downburnt. When the fire to the onthedownburninghouseresting Stork’s Nest reached, flew the parent Storks away. But when the bytheraging, firesurrounded Nest itself caught Fire, straightway plunged the quickreturning Mother-stork into the Flames and died, her Wings over her young ones outspread.”

Well, this is exactly typical Sanskrit writing. Those sentences might have been translated verbatim from a Sanskrit text. Sanskrit technically has free word order (i.e., words can be put in any order), and this is made much use of in verse, but in prose, usage tends to be SOV.

Of Sanskrit’s greatest prose work, Kādambarī, someone named Albrecht Weber wrote in 1853 that in it,

“the verb is kept back to the second, third, fourth, nay, once to the sixth page, and all the interval is filled with epithets and epithets to these epithets: moreover these epithets frequently consist of compounds extending over more than one line; in short, Bāṇa’s prose is an Indian wood, where all progress is rendered impossible by the undergrowth until the traveller cuts out a path for himself, and where, even then, he has to reckon with malicious wild beasts in the shape of unknown words that affright him.” (“…ein wahrer indischer Wald…”)

(This is unfair criticism: personally, I have been lately reading the Kādambarī with the help of friends more experienced in Sanskrit, and I must say the style is truly enjoyable.) Now, the fact that this was a German Indologist writing for the Journal of the German Oriental Society somewhat goes against the claim of Sanskrit and German being similar. But one could say: for someone familiar with Sanskrit’s long compounds and late verbs that even Germans find difficult, the same features in German will pose little difficulty.

Adjectives decline like nouns

In Sanskrit, as it appears to be in German, an adjective takes the gender, case, and number of whatever it is describing. (Twain: “would rather decline two drinks than one German adjective”)

Gender of nouns has to be learned

By and large, it is so in Sanskrit as well. Twain notes that in German “a tree is male, its buds are female, its leaves are neuter; horses are sexless, dogs are male, cats are female — tomcats included, of course; a person’s mouth, neck, bosom, elbows, fingers, nails, feet, and body are of the male sex, and his head is male or neuter according to the word selected to signify it, and not according to the sex of the individual who wears it — for in Germany all the women have either male heads or sexless ones; a person’s nose, lips, shoulders, breast, hands, and toes are of the female sex; and his hair, ears, eyes, chin, legs, knees, heart, and conscience haven’t any sex at all”. (He goes on to write a “Tale of the Fishwife and its Sad Fate.”) It does not seem quite so bad in Sanskrit, but yes, gender of words needs to be learned. (In Sanskrit there exists a word for “wife” in each of the three genders.) However this is a feature common to many languages (including, say, languages like Hindi or French that have only two genders) so I shouldn’t list it among similarities.


Spelling

This is something quite trivial, and linguists often don’t even consider orthography a part of the language proper, but spelling seems to be a pretty big deal to Indians learning other languages. The writing systems of most Indian languages are phonetic, in the sense that the spelling deterministically reflects the pronunciation and vice versa. There are no silent letters, no wondering how a word spelled in a particular way is pronounced. Indian learners of English often complain about the ad-hoc inconsistent spelling of English; it seems a bigger deal than it should be. From this point of view, the claim that for German, “After one short lesson in the alphabet, the student can tell how any German word is pronounced without having to ask”, means that this aspect of German is easier to learn.

The harmony of sound and sense

This is extremely subjective and will be controversial, and perhaps I will seem biased, but to me, in Sanskrit, it seems possible to pick words whose sounds match the desired feeling, better than in other languages. I have seen people who knew many languages say the same thing, and also Western translators from Sanskrit etc., so it is interesting for me to see Twain make a similar remark about German. Anyway, this is subjective, so I’ll not dwell on this much.


Differences

There are of course many; e.g. Sanskrit does not have articles (the, etc.) unlike German. It also has very few prepositions (only a few, like “without”, “with”, “before”), as the work of prepositions like “to”, “from”, or “by” is handled by case. The difficulty of German prepositions does not seem to be present in Sanskrit.

TL;DR version

Some alleged difficulties of learning German, such as cases, long compounds, and word order, are present to a far greater extent in Sanskrit, so in principle someone who knows Sanskrit may be able to pick them up more easily than someone trying to learn German without this knowledge. However, this may not be saying anything more than that knowing one language helps you learn others.

Written by S

Tue, 2014-07-01 at 20:25:02

Posted in language, sanskrit


Colliding balls approximate pi

leave a comment »

Found via G+, a new physical experiment that approximates \pi, like Buffon’s needle problem: The Pi Machine.

Roughly, the amazing discovery of Gregory Galperin is this: when a ball of mass M collides with one of mass m, propelling it towards a wall, the number of collisions (assuming standard physics idealisms) is \lfloor \pi\sqrt{M/m} \rfloor, so by taking M/m = 10^{2n}, we can get the first n+1 digits of \pi. Note that this number of collisions is an entirely deterministic quantity; no probability is involved!
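(Here is a rough simulation sketch of my own, not from any of the linked posts, that just counts the collisions for a given mass ratio; floating point is accurate enough for these ratios:)

def count_collisions(mass_ratio):
    M, m = float(mass_ratio), 1.0  # heavy ball, light ball (the light one is nearer the wall)
    V, v = -1.0, 0.0               # velocities; negative means "towards the wall"
    collisions = 0
    while True:
        # Elastic ball-ball collision.
        V, v = ((M - m) * V + 2 * m * v) / (M + m), ((m - M) * v + 2 * M * V) / (M + m)
        collisions += 1
        # The light ball bounces off the wall if it is heading towards it.
        if v < 0:
            v = -v
            collisions += 1
        # Done when both balls move away from the wall and can never meet again.
        if 0 <= v <= V:
            break
    return collisions

print(count_collisions(100))    # 31
print(count_collisions(10000))  # 314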

Here’s a video demonstrating the fact for M/m = 100 (the blue ball is the heavier one):

The NYT post says how this discovery came about:

Dr. Galperin’s approach was also geometric but very different (using an unfolding geodesic), building on prior related insights. Dr. Galperin, who studied under the well-known Russian mathematician Andrei Kolmogorov and had recently written (with Yakov Sinai) extensively on ball collisions, realized just before a talk in 1995 that a plot of the positions of a pair of colliding balls could be used to determine pi. (When he mentioned this insight in the talk, no one in the audience believed him.) This finding was ultimately published as “Playing Pool With Pi” in a 2003 issue of Regular and Chaotic Dynamics.

The paper, Playing Pool With π (The number π from a billiard point of view) is very readable. The post has, despite a “solution” section, essentially no explanation, but the two comments by Dave in the comments section explain it clearly. And a reader sent in a cleaned-up version of that too: here, by Benjamin Wearn who teaches physics at Fieldston School.

Now someone needs to make a simulation / animation graphing the two balls in phase space of momentum. :-)

I’d done something a while ago, to illustrate The Orbit of the Moon around the Sun is Convex!, here. Probably need to re-learn all that JavaScript stuff, to make one for this. Leaving this post here as a placeholder.

Or maybe someone has done it already?

Written by S

Mon, 2014-06-23 at 23:03:18

Posted in mathematics, unfinished

