The Lumber Room

"Consign them to dust and damp by way of preserving them"

Kannada dictionary online

with 13 comments

The absence of a Kannada dictionary online, in contrast to the situation for Sanskrit, has been a source of pain for a while. Mohan pointed me to the one at kannadakasturi.com, with the warning that it is very slow. He also found that the Internet Archive has a scanned copy of the Kittel dictionary: A Kannada-English school-dictionary: chiefly based on the labours of the Rev. Dr. F. Kittel, by the Rev. J. Bucher (1899). This is actually a fairly good dictionary and could serve most common purposes. Until someone digitizes it and puts it online (this version at least is out of copyright), we will have to make do with looking up words in this scanned copy.

To make it easier to find the right page, below is an “index” to the dictionary. Look down the second column of the table to find the approximate position of the word you want, then click on the corresponding link in the first column. The links open the two-page view and successive entries are at most 10 pages apart, so you should be able to find any word with a click and at most 3 page flips.

http://archive.org/stream/kannadaenglishsc00buchrich#page/n13/mode/2up a
http://archive.org/stream/kannadaenglishsc00buchrich#page/10/mode/2up        aDasatte
http://archive.org/stream/kannadaenglishsc00buchrich#page/20/mode/2up        annu
http://archive.org/stream/kannadaenglishsc00buchrich#page/30/mode/2up        artha
http://archive.org/stream/kannadaenglishsc00buchrich#page/38/mode/2up A
http://archive.org/stream/kannadaenglishsc00buchrich#page/48/mode/2up i
http://archive.org/stream/kannadaenglishsc00buchrich#page/56/mode/2up I
http://archive.org/stream/kannadaenglishsc00buchrich#page/58/mode/2up u
http://archive.org/stream/kannadaenglishsc00buchrich#page/68/mode/2up        ura
http://archive.org/stream/kannadaenglishsc00buchrich#page/70/mode/2up U, R
http://archive.org/stream/kannadaenglishsc00buchrich#page/72/mode/2up RR, lR, lRR, e
http://archive.org/stream/kannadaenglishsc00buchrich#page/76/mode/2up E
http://archive.org/stream/kannadaenglishsc00buchrich#page/78/mode/2up ai
http://archive.org/stream/kannadaenglishsc00buchrich#page/80/mode/2up o
http://archive.org/stream/kannadaenglishsc00buchrich#page/84/mode/2up O
http://archive.org/stream/kannadaenglishsc00buchrich#page/86/mode/2up au
http://archive.org/stream/kannadaenglishsc00buchrich#page/88/mode/2up M H
http://archive.org/stream/kannadaenglishsc00buchrich#page/88/mode/2up k
http://archive.org/stream/kannadaenglishsc00buchrich#page/98/mode/2up        kampu
http://archive.org/stream/kannadaenglishsc00buchrich#page/108/mode/2up        kAruNya
http://archive.org/stream/kannadaenglishsc00buchrich#page/118/mode/2up        kusaku
http://archive.org/stream/kannadaenglishsc00buchrich#page/128/mode/2up        kollAra
http://archive.org/stream/kannadaenglishsc00buchrich#page/132/mode/2up kh
http://archive.org/stream/kannadaenglishsc00buchrich#page/134/mode/2up g
http://archive.org/stream/kannadaenglishsc00buchrich#page/144/mode/2up        gillA
http://archive.org/stream/kannadaenglishsc00buchrich#page/154/mode/2up gh, G, c
http://archive.org/stream/kannadaenglishsc00buchrich#page/164/mode/2up        citra
http://archive.org/stream/kannadaenglishsc00buchrich#page/168/mode/2up ch, j
http://archive.org/stream/kannadaenglishsc00buchrich#page/178/mode/2up        jIva
http://archive.org/stream/kannadaenglishsc00buchrich#page/180/mode/2up jh, J, T
http://archive.org/stream/kannadaenglishsc00buchrich#page/182/mode/2up Th, D
http://archive.org/stream/kannadaenglishsc00buchrich#page/184/mode/2up Dh, N
http://archive.org/stream/kannadaenglishsc00buchrich#page/186/mode/2up t
http://archive.org/stream/kannadaenglishsc00buchrich#page/196/mode/2up        tAmbUla
http://archive.org/stream/kannadaenglishsc00buchrich#page/206/mode/2up        tETu
http://archive.org/stream/kannadaenglishsc00buchrich#page/210/mode/2up th, d
http://archive.org/stream/kannadaenglishsc00buchrich#page/220/mode/2up        dumuku
http://archive.org/stream/kannadaenglishsc00buchrich#page/226/mode/2up dh
http://archive.org/stream/kannadaenglishsc00buchrich#page/230/mode/2up n
http://archive.org/stream/kannadaenglishsc00buchrich#page/240/mode/2up        nikAya
http://archive.org/stream/kannadaenglishsc00buchrich#page/250/mode/2up        neTTage
http://archive.org/stream/kannadaenglishsc00buchrich#page/252/mode/2up p
http://archive.org/stream/kannadaenglishsc00buchrich#page/262/mode/2up        parihAsa
http://archive.org/stream/kannadaenglishsc00buchrich#page/272/mode/2up        punarnava
http://archive.org/stream/kannadaenglishsc00buchrich#page/282/mode/2up        prabOdha
http://archive.org/stream/kannadaenglishsc00buchrich#page/286/mode/2up ph
http://archive.org/stream/kannadaenglishsc00buchrich#page/288/mode/2up b
http://archive.org/stream/kannadaenglishsc00buchrich#page/298/mode/2up        bANali
http://archive.org/stream/kannadaenglishsc00buchrich#page/308/mode/2up        bese
http://archive.org/stream/kannadaenglishsc00buchrich#page/312/mode/2up bh
http://archive.org/stream/kannadaenglishsc00buchrich#page/318/mode/2up m
http://archive.org/stream/kannadaenglishsc00buchrich#page/328/mode/2up        marasuttu
http://archive.org/stream/kannadaenglishsc00buchrich#page/338/mode/2up        mIru
http://archive.org/stream/kannadaenglishsc00buchrich#page/348/mode/2up        mEle
http://archive.org/stream/kannadaenglishsc00buchrich#page/352/mode/2up y
http://archive.org/stream/kannadaenglishsc00buchrich#page/354/mode/2up r
http://archive.org/stream/kannadaenglishsc00buchrich#page/362/mode/2up [?]
http://archive.org/stream/kannadaenglishsc00buchrich#page/364/mode/2up l
http://archive.org/stream/kannadaenglishsc00buchrich#page/370/mode/2up v
http://archive.org/stream/kannadaenglishsc00buchrich#page/380/mode/2up        vidyamAna
http://archive.org/stream/kannadaenglishsc00buchrich#page/390/mode/2up z
http://archive.org/stream/kannadaenglishsc00buchrich#page/398/mode/2up S
http://archive.org/stream/kannadaenglishsc00buchrich#page/400/mode/2up s
http://archive.org/stream/kannadaenglishsc00buchrich#page/410/mode/2up        sambALisu
http://archive.org/stream/kannadaenglishsc00buchrich#page/420/mode/2up        siddhAnta
http://archive.org/stream/kannadaenglishsc00buchrich#page/430/mode/2up        sthAyi
http://archive.org/stream/kannadaenglishsc00buchrich#page/432/mode/2up h
http://archive.org/stream/kannadaenglishsc00buchrich#page/442/mode/2up        hiDi
http://archive.org/stream/kannadaenglishsc00buchrich#page/452/mode/2up        hore
http://archive.org/stream/kannadaenglishsc00buchrich#page/454/mode/2up L, [?]
Note: Pages 262–3 are missing, so from that point on, printed page = (number in link) + 2.
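
The page arithmetic in the note can be written out as a small Python sketch. It assumes that "pages 262–3" means printed page numbers, that the two numberings agree before the gap, and that the offset of 2 holds for every later page. The names link_number and page_url are made up for illustration, and the unnumbered front matter (such as the n13 link) is not handled.

```python
BASE = "http://archive.org/stream/kannadaenglishsc00buchrich"

def link_number(printed_page):
    """Map a printed page number to the number used in the archive.org link."""
    if printed_page in (262, 263):
        raise ValueError("printed pages 262-3 are missing")
    # Assumption: before the gap the numberings agree; after it the link lags by 2.
    return printed_page if printed_page < 262 else printed_page - 2

def page_url(printed_page):
    """Build the two-page-view link for a printed page number."""
    return f"{BASE}#page/{link_number(printed_page)}/mode/2up"

print(page_url(300))
```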

Future work:

  • Extend this data to all pages in the dictionary (around 454/2 = 227 two-page spreads)
  • Write a web interface where you can type a word/prefix and be taken to the exact page

Feel free to take it up.


Written by S

Mon, 2012-04-30 at 00:26:10

13 Responses


  1. Nice work.
    a. Would OCRing the document be a better idea?
    b. Also, Kittel’s work is dated. Setting aside the copyright issues for a while, a better idea would be to do the same for the massive 8-volume nighantu written by Venkatasubbayya and others.
    c. Meanwhile, would it be better to put this project on a more collaborative platform like Google Code?
    Some links that may help:
    1. http://www.baraha.com/kannada/index.php
    2. http://www.shabdkosh.com/kn/
    3. http://www.kannadaocr.com/

    ತಾಳೆಗರಿ

    Mon, 2012-04-30 at 19:10:48

    • Thanks.

      a. Yes, OCR would be great! The downside is that OCR is unreliable enough that one has to basically examine and verify the entire document (even 95% accuracy, which I doubt has been achieved for Kannada, would mean a mistake every 20 words on average), so it needs to be taken up by someone who has more perseverance and would welcome the opportunity to read an entire dictionary. :-)

      b. Yes, bigger modern dictionaries would be great. But I think copyright is still a big deal, and it would be hard to get something like G. Venkatasubbayya’s dictionary online. Also, this Kittel dictionary is not bad at all; it is significantly better than other dictionaries of the same size. My friend estimated that it has about 20,000 words, and I find that the words I have needed to look up are more likely to be in this dictionary than in another one we have at home. Of course it may be that I’ve been more likely to look up old words. :-)

      c. Definitely agree on putting it somewhere so others can contribute… will do so soon.

      Thanks a lot for your links to the other dictionaries; they’re really great! I don’t know how the Baraha website managed to get those dictionaries online!

      S

      Mon, 2012-04-30 at 19:38:04

      • Dear shreevatsa,
        Responses in order:

        a. Yes, after the OCR process we would still need to go through and verify the entire dictionary. We would have to delete each word that was recognized wrongly and add the correct word in its place. Still, OCRing will bring down the amount of time significantly.

        b. OK, let’s start with the public-domain Kittel dictionary. If we are lucky enough to get our hands on the nighantu’s code (which I suspect is written using Nudi), we could easily use it, and further use Hallimane’s ASCII-to-Unicode code if required. (https://github.com/aravindavk/ascii2unicode)
        ____
        What kind of ASCII format are you using in the second column of the above table? (It looks close to Baraha to me.)
        (You may contact me by email too.)

        ತಾಳೆಗರಿ

        Mon, 2012-04-30 at 21:44:54

        • Hello,

          I’m sorry, putting this on GitHub doesn’t seem to be happening; do feel free to take the stuff in this post and do whatever is useful with it.

          S

          Fri, 2012-08-03 at 01:01:38

    • One more addition here: https://kn.wiktionary.org/wiki/
      I primarily use this for app/site translation in Kannada.

  2. Case study time! For a recent blog post, I was looking up the following words: gaDaNe, ghaasi, sRjisu (amazing, it had the nuanced secondary meaning, “to let go”, that KV implies!), ANe, TaMka, hoge (v), kENa, more (v), neravi, bEge and kILANe. I could find EVERY one of them except kILANe (which I think is a later, erratic substitution anyway; what was Kumaravyasa doing talking about ‘aaNes’ as currency, did they not come because of the British Annas?), and that too within 1-2 clicks. This + MW can get me through KVB, and I could hardly ask more from a dictionary!

    My only complaint is that on the website with the default two-page view the font is too small to read properly, and the vertical scrolling is a bit clunky there. This is of course solved by downloading the PDF.

    pmm3

    Wed, 2012-05-02 at 04:24:46

    • “My only complaint is that on the website with the default two-page view the font is too small to read properly, and the vertical scrolling is a bit clunky there.”

      Pulling unprocessed images from archive.org is always a pain. I am thinking of storing the processed images and loading the required page via JavaScript.

      • That’s a good idea; I was thinking of doing the same too.

        S

        Fri, 2012-08-03 at 15:58:39

  3. Oh, this is useful. If you run OCR on it and it works, please let me know – it would be great to have a Kannada lexicon. Unfortunately, I can only speak and understand Kannada and can’t read or write it, so I wouldn’t be able to verify OCR output, but if there’s any other way I can help, please let me know! Very interested in building Speech/NLP systems for Indian languages. ssitaram at cmu edu

    Sunayana

    Sat, 2012-05-19 at 02:44:43

  4. Can you tell me how to download the first dictionary?

    ini

    Mon, 2014-08-25 at 21:26:53

    • Which one do you mean by “the first dictionary”? Some of these are only available online (cannot be downloaded)… for the one on archive.org you can download the whole book of course.

      S

      Mon, 2014-08-25 at 21:28:56

