The Lumber Room

"Consign them to dust and damp by way of preserving them"

Posts Tagged ‘google’

First thoughts on Google Wave

with 6 comments

Just saw the demo for Google Wave. It’s impressive and ambitious. It’s hard to describe, but it’s a collaborative real-time thing (think Google Docs for everything) that can work like email, IM, blogs, forums, whatever you want — and can be embedded into, or integrates with, apparently everything: Orkut, Blogger, Google Maps, Google Code (the bug tracker), Twitter, etc. (They’ve already fulfilled the annoying-word requirement, by creating “twave”.)

They say it’s a “product, platform and protocol”.

I can see myself using this. (And thinking of the privacy implications (or the having-your-data-out-there-in-the-cloud-somewhere implications), it’s bloody scary.)

They’ve got pretty amazing sync. Search results and messages get updated in real time character-by-character, and the latter seems to make people cheer as if they’ve never seen good old talk.

Finally someone had the “playback” idea I have been trying to propose for years. (I was calling it the “undo bar” or “edit history bar”, or more recently “Time Machine for Emacs”, but whatever.) You can “play back” the edit history of a document (“wave”), seeing what changes each person made and in what order, and when the “wave” is a chess game, you can play back the chess game. Perfect.

They variously say it will be open-sourced, or that “a lion’s share of the code” will be open-sourced, but let’s hold off believing that until we see it. It’s extensible, so you can add your plugins to it. It’s a protocol, so you can write your own implementations of it. It’s a platform, so you can run it on your own servers. Now someone add a LaTeX compiler to it, and collaborative work with LaTeX will finally be possible.

If you have 80 minutes to spare, here’s the video, or an article at TechCrunch.

Written by S

Fri, 2009-05-29 at 21:35:55 +05:30

Posted in Uncategorized


Google and inflection

with 3 comments

It is a generally useful feature that Google tries to “Do What I Mean” instead of taking our queries literally to “Do What I Say”, but sometimes it’s annoying.

For example, searching for [sarah palin trigonometry] includes results that do not contain the word ‘trigonometry’ at all. Fortunately, searching for [sarah palin "trigonometry"] works (which might contradict intuition that putting quotes around single words should not matter).

I have seen Google do this many times (return results which do not contain the words searched for), but can’t recall other examples right now… can you?

Written by S

Thu, 2008-10-23 at 10:27:33 +05:30

Posted in Uncategorized


Reverse-engineering Gmail: Initial remarks

with 11 comments

For the last week and a bit, I have been trying to do a particular something with Gmail. (Specifically, get at the Message-ID headers of messages.) This has been mostly a failure, but that’s not so surprising, as I had little experience with “all this web stuff”: JavaScript, AJAX, DOM, browser incompatibilities, Firebug, Greasemonkey… round up the usual buzzwords. I have learnt a bit, though, and thought it might help others starting in a similar situation. (And there’s also the hope that someone might actually find this and help me!)

The story so far
Gmail was launched in April 2004. Since then, it has been through many changes, the latest around October 2007 when there came to our inboxes a “Newer version”, also sometimes called “Gmail 2”. (Note that officially Gmail is still in Beta; it hasn’t even released a 1.0!)
When Gmail was released, the set of practices that go by the name of “AJAX” was still new and unfamiliar; it has been refined and is better understood now. (And it turns out to require neither asynchrony nor JavaScript nor XML.)

Johnvey Hwang reverse-engineered much of Gmail’s original version, and even made a “Gmail API” out of it. It no longer works of course, and the site is often down too, but it’s available on the Wayback Machine, and the section documenting “the Gmail engine and protocol” is still worth a read, if only for its glimpse into the labyrinthine ways in which Ajax applications can work. He turned it (in May 2005) into a SourceForge project (“Gmail API”), last updated in June 2005. The associated Google Group (“Gmail Agent API”) is also largely defunct, and indicates that the API, or whatever came of it, has not worked since the changes of October 2007 at any rate.

My goal
At this point, I might as well reveal what I want to do: I want to make it easy to get the “Message-ID:” header of messages in Gmail. (I like to read email in Gmail but not to send from it, so one way to reply to a specific message would be to get the Message-ID and ask my other mail client to reply to the message with that Message-ID.) In the current interface, the (only) way of getting it is to click on the pulldown menu next to “Reply”, and click on “Show original”. This will open up a page that contains the raw text of the message with all its headers, and “Message-ID:” is always one of them. Since I use Firefox, I’ve been trying to make this easier with a Greasemonkey script.

Trap-patching the P() function
As Greasemonkey scripts for Gmail go, much useful information comes from Mihai Parparita, who wrote many Greasemonkey scripts for Gmail. Quoting from here:

As others have documented, Gmail receives data from the server in form of JavaScript snippets. Looking at the top of any conversation list’s source, we can see that the D() function that receives data in turns calls a function P() in the frame where all the JavaScript resides. Since all data must pass through this global P() function, we can use Greasemonkey to hook into it. This is similar to the trap patching way of extending Classic Mac OS. Specifically, the Greasemonkey script gets a hold of the current P() function and replaces it with a version that first records relevant data in an internal array, and then calls the original function (so that Gmail operations are not affected).

Clever. This same information is also documented at Greasespot wiki, with a few remarks on what different parameters to P() mean. Alas, it no longer works, because Gmail changed their functions around and renamed all of them, so there is no P() function anymore, and I can’t find what the new equivalent is, or if there is one.
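The trap-patching idea itself is ordinary function wrapping, and does not depend on Gmail. A minimal sketch of the pattern follows, written in Python rather than the JavaScript a Greasemonkey script would use. The `page` object and its `P` function are made-up stand-ins for Gmail's frame and its old global function; nothing here talks to Gmail.

```python
import types


def install_trap(namespace, name, log):
    """Replace namespace.<name> with a wrapper that records each call's
    positional arguments in `log`, then forwards to the original function."""
    original = getattr(namespace, name)

    def wrapper(*args, **kwargs):
        log.append(args)                   # record the data first
        return original(*args, **kwargs)   # then behave exactly as before

    setattr(namespace, name, wrapper)
    return original                        # handle for undoing the patch


# Hypothetical stand-in for the frame that holds the global P() function.
page = types.SimpleNamespace(P=lambda *data: len(data))

captured = []
install_trap(page, "P", captured)
result = page.P("ds", 1, 2)   # the caller still gets the original's return value
```

The wrapper returns whatever the original returns, which is what keeps the patched application working while the script listens in.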

Changes of October 2007
Gmail made certain changes in October 2007, including introducing a “newer version”, but also changing the “older version” that is still available: so it’s not really the older version. As far as Greasemonkey scripts go, another change came in January 2008, when they made all the JavaScript load in a separate iframe. So “unsafeWindow” in a Greasemonkey script now refers to this iframe (which is the first frame, frames[0], in the window, and can also be got as top.js). So any scripts written in September 2007 or earlier are certainly useless now.

A lesson from all this is that Gmail will always be a moving target, and one must consider whether it’s worth chasing it.

Gmail’s Greasemonkey “API”:
Sometime in November 2007 or so, after the latest changes, Google even released a basic Greasemonkey API for Gmail, which lets you do a few things, like adding things to the pane at the left. It is too limited for what I need, but it works very well for what it is meant for, and is also very well documented by Mark Pilgrim, with his usual “Dive Into” excellence. The documentation is comprehensive, accurate, well-illustrated and to-the-point; it just happens that the API doesn’t provide what I need.

Some observations
Back to what I’m trying to do. Currently, the actions in the menu next to “Reply”, namely “Reply to all”, “Forward”, “Filter messages like this”, … “Show original” etc., do not actually appear in the DOM multiple times, once for each message. Instead, each of these actions corresponds to exactly one node in the DOM, like these:

<div act="27" style="padding-left: 19px;" class="SAQJzb" id=":t6">Filter messages like this</div>
<div id=":t8" class="R10Zdd" act="29" style="padding-left: 19px;">Add to Contacts list</div>
<div id=":tc" class="SAQJzb" act="32" style="padding-left: 19px;">Show original</div>

etc. The IDs change, and the class name also seems to randomly change between “SAQJzb” and “R10Zdd”; the only constant between the action and the node is the “act” attribute. “Show original” is always act=32. So when you click on the down-arrow button next to Reply, this menu comes up, and when you click on something in the menu, it somehow uses the information about where this menu came up and what you clicked, to find out which message to act on.
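Since only the “act” attribute is stable, any lookup has to key on it and ignore the id and class. A small sketch of that lookup, run in Python over the three nodes quoted above (a Greasemonkey script would do the equivalent against the live DOM; the class name `ActFinder` is just for illustration):

```python
from html.parser import HTMLParser


class ActFinder(HTMLParser):
    """Collect the id of every <div> whose act attribute equals `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.ids = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Match on act only; id and class are not stable.
        if tag == "div" and attributes.get("act") == self.wanted:
            self.ids.append(attributes.get("id"))


menu_html = """
<div act="27" style="padding-left: 19px;" class="SAQJzb" id=":t6">Filter messages like this</div>
<div id=":t8" class="R10Zdd" act="29" style="padding-left: 19px;">Add to Contacts list</div>
<div id=":tc" class="SAQJzb" act="32" style="padding-left: 19px;">Show original</div>
"""

finder = ActFinder("32")      # 32 is "Show original" in the markup above
finder.feed(menu_html)
show_original_ids = finder.ids
```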

This means that simply simulating a click on the node (initMouseEvent, etc…) does not work; we also have to somehow give it the information on what message to act on. How to do this is one thing I’m trying to find out.

The other way involves the fact that Gmail also has its own “ID” for each message. When you are looking at a thread (“conversation”) that contains a single message, it is the same as what is in the URL, e.g. if the URL is something like https://mail.google.com/mail/#inbox/11c177beaf88ffe6, Gmail’s ID of the message is 11c177beaf88ffe6. But when you’re looking at a thread containing more than one message, the ID in the URL is just that of any of the messages in the thread (usually the first one, but you can use the ID of a different message in the URL and it will show the same thread). And when you click on the “Show original” link, the URL is something like https://mail.google.com/mail/?ui=2&ik=1234567890&view=om&th=11c177beaf88ffe6 where 1234567890 is a constant (probably depending on the user) and “om” probably stands for “original message”, and the “th” parameter is the ID of the message. So if I can somehow find a way of getting the ID of messages (like the trap-patching P() method, except that it should work for the current version), then it is possible to get the Message-ID headers of messages too.
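The URL shapes described above are regular enough to handle mechanically. A sketch in Python, assuming only the two URL patterns quoted in this post (the function names are made up, and actually fetching the page would additionally need a logged-in session, which is left out):

```python
from email import message_from_string
from urllib.parse import urlencode, urlsplit


def id_from_url(url):
    """Return the ID after the last '/' of the URL fragment, e.g. '#inbox/<id>'."""
    fragment = urlsplit(url).fragment
    return fragment.rsplit("/", 1)[-1]


def show_original_url(ik, message_id):
    """Build a "Show original" URL in the shape observed above."""
    query = urlencode({"ui": 2, "ik": ik, "view": "om", "th": message_id})
    return "https://mail.google.com/mail/?" + query


def message_id_header(raw_message):
    """Pull the Message-ID header out of the raw text of a message."""
    return message_from_string(raw_message)["Message-ID"]


gmail_id = id_from_url("https://mail.google.com/mail/#inbox/11c177beaf88ffe6")
original_url = show_original_url("1234567890", gmail_id)

# A made-up raw message, standing in for the "Show original" page contents.
sample = "Message-ID: <abc123@example.com>\nSubject: hello\n\nbody text\n"
header = message_id_header(sample)
```

This only works directly for single-message threads; for longer threads the ID in the address bar may belong to a different message, as noted above.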

Neither has worked out yet, but I’m trying…
(And I have more to say, but will post when things actually work.)

Written by S

Sun, 2008-08-31 at 18:45:57 +05:30

Web phrase occurrences

with 3 comments

Quick post while I get back to work. Someone please help me here…

There are two things I mainly use Google for:

  1. Searching for pages related to a particular something. This is the most common, and intended, use of Google.
  2. Searching for all occurrences of a particular phrase, or more generally a pattern. This might be to compare numbers and compile statistics, or to find what context the phrase is most often used in, or find what are the most common phrases using that pattern.

For example, I just thought of the “My dad can beat up your dad” phrase, and searched Google for “my * can beat up your *”. (Click on link, and see results for yourselves.)

Someone should already have developed a tool/library for using Google (or any other search tool) for doing this, right? Why haven’t I found it yet? Maybe I should contact the “X is the new Y” people… Tell me if you’ve found such a tool.

——————————————

X is the new Y:
Original(?) diagram,
Updates, Updates on updates,
Wikipedia.

Written by S

Sun, 2008-02-17 at 15:51:48 +05:30

Google Talk and scalability


A talk here by a Google Talk guy:

I didn’t pay close attention, but some things caught my notice.

Google Talk started as a standalone application and became embeddable in Gmail, Orkut, iGoogle (the personalised homepage), usable from cellphones, and so on. This is no mean feat, and shows that modularity and reusability are not unattainable ideals.

It also has important lessons in scalability. Questions like “how many IM messages do you deliver?” or “how many users do you have?” might be relevant from the perspective of the product’s success, but they are not the right measure from an engineering perspective. Most of the packets on the network are presence packets, and this is the number of users × the number of buddies they have, which does not grow linearly with the number of users (think integration into Orkut).
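A back-of-the-envelope version of that arithmetic, with invented numbers chosen only to show the shape of the growth (the talk gave no such figures):

```python
def presence_fanout(buddy_counts):
    """Presence notifications sent if every user changes status once:
    each change is delivered to each of that user's buddies."""
    return sum(buddy_counts)


# Hypothetical populations, not figures from the talk.
small = presence_fanout([10] * 1_000)      # 1,000 users with 10 buddies each
large = presence_fanout([100] * 10_000)    # 10x the users, each with 10x the buddies
growth = large // small                    # how much the traffic grew
```

Ten times the users produced a hundred times the presence traffic here, because joining a bigger network also lengthens everyone's buddy list.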

Before deploying into Orkut, they did real-life load testing with a “backend launch” — Orkut started fetching presence status from Google Talk several weeks before launch (starting slowly from 1% of Orkut page views), without showing anything in the UI. With enough confidence and some bugs fixed, the integration was finally made visible. They did something similar with Gmail.
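The talk did not say how the 1% of page views was chosen. One common way to do this kind of gradual ramp-up, shown here purely as an assumption, is to hash some request key into 100 buckets and enable the hidden fetch for the first few buckets:

```python
import zlib


def in_rollout(request_key, percent):
    """Deterministically place a key into one of 100 buckets and report
    whether it falls inside the first `percent` buckets."""
    bucket = zlib.crc32(request_key.encode("utf-8")) % 100
    return bucket < percent


# Hypothetical page-view keys; raising `percent` only ever adds keys.
keys = [f"view-{i}" for i in range(1000)]
at_one_percent = [k for k in keys if in_rollout(k, 1)]
at_fifty_percent = [k for k in keys if in_rollout(k, 50)]
```

Because the bucket depends only on the key, anything included at 1% stays included as the percentage is raised, so the load grows without churning which requests are affected.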

Sharding and re-sharding: Different users are allocated to different servers, and this can be changed easily too.
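As a toy illustration of what “allocated to servers, and easy to change” can look like (this is an assumed design, not the one from the talk, and the server names are made up): a default placement computed from the user name, plus a table of explicit moves.

```python
import zlib


class ShardMap:
    """Toy user-to-server table: a default placement plus explicit overrides."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.overrides = {}            # users that have been moved explicitly

    def server_for(self, user):
        if user in self.overrides:
            return self.overrides[user]
        index = zlib.crc32(user.encode("utf-8")) % len(self.servers)
        return self.servers[index]

    def move(self, user, server):
        """Re-shard a single user without touching anyone else."""
        self.overrides[user] = server


shards = ShardMap(["talk-a", "talk-b", "talk-c"])
before = shards.server_for("alice")
shards.move("alice", "talk-c")
after = shards.server_for("alice")
```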

Modularity etc: Different parts (like Orkut and Gmail) know very little about each other, and interact using the same interface that the rest of the world uses, so one can be changed easily without affecting the other.

Not afraid of going low-level (TCP, epoll kernel calls, etc!)

Written by S

Fri, 2008-01-25 at 00:04:02 +05:30

Posted in Uncategorized


“What I need…

with 3 comments

… is a movie that is an hour and a half of River Tam beating up dinosaurs.”

Randall Munroe at Google:

Look at 21:30: What’s Knuth doing at Google?!

Update [thanks Arpith]: An account of Randall Munroe’s visit to Google by Ellen Spertus who invited him, and whom Knuth mentions in his question.

Written by S

Fri, 2007-12-21 at 03:13:57 +05:30

Posted in Uncategorized


Google Calendar bug


Google Calendar is one of the best things ever written. Its features are useful, its UI is brilliant, and its “quick add” feature alone is worth raving about (and I have). I keep all scheduled events on Google Calendar, even my timetable — creating recurring events (like a seminar series) is very easy. (Aside: I’ve never used a calendar for a todo list…)

Random usability comments follow; please don’t read beyond this point.

Google Calendar has several “views”: “Agenda” shows all your events as a list ordered by time (and date, of course), and the “Day”, “Week”, “Month” views show a day, week, month at a time respectively. There is also a “Custom” view which can be set to several durations, from “Next 2 days” to “Next 4 weeks”. (Actually the menu ought to call the options “2 days”, “4 weeks” etc., because these views can be moved to other periods just like any others, but it’s possible that “Next 3 days” in the menu is less confusing than “3 days”.) If you haven’t used Google Calendar, see this blog post for screenshots. (Aside: Found some useful tips here (mostly what I’ve already been doing).)

I use “Next 2 weeks”, because “1 week” is vertical (events are shown in boxes according to their size, intersecting ones intersect, etc… this is a nice feature, but it is distracting to see it except when you specifically want it), and “1 month” shows too few events per day (because I put in my timetable and seminars, and subscribe to several calendars, I sometimes have 15 events a day, most of which won’t fit). “Next 2 weeks” fits about 11 events per day, and is a big enough interval for scheduling most events (usually from email I get), so it’s perfect.

Here’s evidence of a thoughtful, well-designed UI: What do you think happens when you switch from one view to another? (It takes just a click, BTW, not going to some other “Settings” window and changing it, or even pulling up a menu.)

This is what happens: If you switch to a bigger duration (such as from “Week” to “Month”), it simply shows the period the view you were looking at was in. (Doesn’t reset to the default view for that duration, which is what bad UI would do.) If you switch to a smaller duration, it picks the first period of that duration in the view you were currently looking at (nice!), except if — and this is what distinguishes good UI from the mediocre — today was in the current view. Because if the view is “month”, and it’s the current month, chances are that you’re actually looking at today, and when you switch to “week” you want the current week, not the first week of the month. For other months, it makes sense to switch to the first week (anything else would seem less “logical”). This is what Google Calendar does.
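The rule for switching to a smaller duration is compact enough to write down. What follows is a rough sketch of the behaviour as described above, not Google Calendar's actual code; the function names are invented, and weeks are assumed to start on Monday (the real product makes that a setting).

```python
from datetime import date, timedelta


def anchor_for_smaller_view(view_start, view_end, today):
    """Pick the date that the new, smaller view should contain: today if
    today lies inside the current view, else the first day of that view."""
    if view_start <= today <= view_end:
        return today
    return view_start


def week_containing(day):
    """Return (monday, sunday) of the week containing `day`."""
    monday = day - timedelta(days=day.weekday())
    return monday, monday + timedelta(days=6)


today = date(2007, 11, 15)   # an arbitrary "today" for the example

# Month view on the current month, then switch to "Week".
this_month = week_containing(
    anchor_for_smaller_view(date(2007, 11, 1), date(2007, 11, 30), today))

# Month view on some other month, then switch to "Week".
other_month = week_containing(
    anchor_for_smaller_view(date(2007, 10, 1), date(2007, 10, 31), today))
```

With these inputs the first switch lands on the week of 12 to 18 November (the one containing today), and the second on the week of 1 to 7 October (the first week of the month being viewed).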

Except — and this is the bug — it doesn’t work when I’m in the custom view. Or at least, my custom view of “2 weeks” (and “3 weeks” and “4 weeks” — I didn’t try the others because I’ll only know the difference on special days of the week, and Thursday is not one of them.) If I’m looking at today in the “Next 2 weeks” view and I switch to the “Day” view, it shows me the first day in my 2-week-period, which is some confusing day I don’t want. Yeah, I know I have to only click on the “Today” button each time, and even all of those times put together it’s not really worth my going to all the trouble of writing this, but the point is that it violates the Rule of Least Surprise (also called the Principle of Least Astonishment), and it annoys me.

This ought to be fixed, but of course, like most other closed software development, it is hard to find a human to speak to. At least they have a “Contact Us” web form….

Written by S

Thu, 2007-11-01 at 14:16:12 +05:30

Does RMS have a Gmail account?

with one comment

Random funny image I remember saving from ages ago:

[Image: screenshot-gmail-rms]

Google wants me to “Invite Richard to Gmail”.

Written by S

Tue, 2007-10-30 at 07:38:02 +05:30

Posted in funny


Gmail has IMAP!

with 3 comments

Finally. Many thought this would never happen.

And, as is usual with Free software, it seems to be the handiwork of someone scratching an itch.

Notes:

  • IMAP folders are Gmail labels. Gmail labels show up as folders in your client, and moving a message to a folder in your client simply adds that label in Gmail.
  • In particular, be careful creating folders, and avoid making a mess. Try reusing the default Gmail labels: Set your client’s drafts folder to “[Gmail]/Drafts”.
  • Messages with multiple labels appear in each of those folders. So there is some duplication at the client end, of course, but this is unavoidable; the price you pay for forcing a tagging philosophy on software that has different beliefs.
  • Conversely, if you want to apply multiple labels to a message through your client, you can use the “poor man’s tagging” that has always been possible — copy the message to each of those folders.
  • If you delete a message from a “folder” (other than “[Gmail]/Trash” and “[Gmail]/Spam”), Gmail only removes that label. The message is still present in “All Mail”. To actually delete it, move it to “[Gmail]/Trash”. What happens if you delete email from “All Mail”?
  • Recommended IMAP client settings: Don’t save sent messages on the server; any mail sent through Gmail’s SMTP is automatically copied to the “[Gmail]/Sent Mail” folder.
  • In general, actions sync neatly; see the full table.
  • IMAP and POP work with messages, so if you move only one message from a thread to a folder, only that one will get that label, but the Gmail web interface will show the entire conversation with that label. Note that this is only a display thing — it’s not that opening Gmail will give all the messages the label, and when you reopen your client suddenly things are different. (I need to actually check this.)
  • You still have Gmail’s amazing server-side spam filtering.
  • Some things don’t work.
  • Some other things are alleged not to work that I don’t even understand.
  • Everything.
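The folder-is-a-label behaviour in the notes above can be captured in a toy model. This is only an in-memory illustration of those notes, not an IMAP client and not how Gmail is implemented; the class and message names are invented.

```python
class LabelStore:
    """Toy model of the notes above: a "folder" is just a label on a message."""

    def __init__(self):
        self.labels = {}                     # message id -> set of labels

    def add(self, message, *labels):
        self.labels.setdefault(message, set()).update(labels)

    def folder(self, label):
        """Messages an IMAP client would list in the folder named `label`."""
        return sorted(m for m, ls in self.labels.items() if label in ls)

    def all_mail(self):
        return sorted(self.labels)

    def copy_to(self, message, label):
        """Copying to a folder is the "poor man's tagging": it adds a label."""
        self.labels[message].add(label)

    def delete_from(self, message, label):
        """Deleting from an ordinary folder only removes that label."""
        self.labels[message].discard(label)


store = LabelStore()
store.add("m1", "work")
store.copy_to("m1", "urgent")      # m1 now shows up in two folders
store.delete_from("m1", "work")    # gone from "work", still in "All Mail"
```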

They got everything in order, made all those pages, and turned on IMAP without making any advance announcement…

Written by S

Fri, 2007-10-26 at 04:29:14 +05:30

Using Gmail with mutt, the minimal way (IMAP update)

with 61 comments

As Gmail has IMAP access, it is fairly trivial to get it working with mutt. First, if you’re on Ubuntu/Debian, run sudo apt-get install openssl mutt to get mutt if you don’t already have it. Then, just put the following lines into your ~/.muttrc:

# IMAP login (reading mail)
set imap_user = "username@gmail.com"
set imap_pass = "password"

# SMTP (sending mail)
set smtp_url = "smtp://username@smtp.gmail.com:587/"
set smtp_pass = "password"
set from = "username@gmail.com"
set realname = "Your Real Name"

# Mailbox locations; "+" is relative to $folder
set folder = "imaps://imap.gmail.com:993"
set spoolfile = "+INBOX"
set postponed = "+[Gmail]/Drafts"

# Local caches and saved certificates
set header_cache = ~/.mutt/cache/headers
set message_cachedir = ~/.mutt/cache/bodies
set certificate_file = ~/.mutt/certificates

# Don't move read messages out of the inbox
set move = no

Make sure your ~/.muttrc isn’t world-readable; it contains your password. (Alternatively, you can leave the password lines out and mutt will prompt you for the password each time.) Also, if you copy-paste from the above, make sure that you have only “normal” quotes, not the “smart quotes” which WordPress might have inserted into this post.
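Both warnings can be checked mechanically. A small sketch, assuming a POSIX system; the helper name `muttrc_problems` is made up, and it is stricter than “world-readable” in that it also flags group-readable files. The demonstration runs on a throwaway file, not on a real ~/.muttrc.

```python
import os
import stat
import tempfile

SMART_QUOTES = "\u201c\u201d\u2018\u2019"   # curly double and single quotes


def muttrc_problems(path):
    """Return a list of human-readable problems found in the file at `path`."""
    problems = []
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRGRP | stat.S_IROTH):
        problems.append("readable by group or others")
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if any(quote in line for quote in SMART_QUOTES):
                problems.append(f"line {number}: smart quotes")
    return problems


# Demonstration: a temporary file with loose permissions and curly quotes.
with tempfile.NamedTemporaryFile("w", suffix=".muttrc", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("set imap_user = \u201cusername@gmail.com\u201d\n")
os.chmod(tmp.name, 0o644)
found = muttrc_problems(tmp.name)
os.remove(tmp.name)
```

Running chmod 600 on the file and retyping the quotes would clear both findings.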

[Other things I have:

set sort = 'threads'
set sort_aux = 'last-date-received'
set imap_check_subscribed

ignore "Authentication-Results:"
ignore "DomainKey-Signature:"
ignore "DKIM-Signature:"
hdr_order Date From To Cc

I did not include these above, to justify the "minimal" :)]

Things work just as you would expect.
One thing to note is that the full headers will still contain the hostname of the computer you send messages from. I have not figured out a way of hiding this, and perhaps it shouldn’t be possible.

The End

If for some reason you want to use POP, read on. And tell me why you would want to use POP. The rest of the post is an old version, which I had written before Gmail supported IMAP.

Old Stuff
There is a guide here, which is the first Google result on searching for the keywords Gmail, mutt and Ubuntu in any order, but I would advise against it: it does too much unnecessary stuff using too many unnecessary programs (okay if you don’t care), and involves putting your username and password in a world-readable file (not okay).

There is a guide here, but that site seems down, and so I guess it’s likely to be down again (a DynDNS domain; could be someone’s house), so putting a (fuller) guide here:

First, run sudo apt-get install openssl mutt

Next, in /etc/ssmtp/ssmtp.conf, put
mailhub=smtp.gmail.com:465
UseTLS=YES

Everything else seems to be optional.

Next, create a shell script with the contents
#!/bin/sh
# Quote "$@" so that arguments containing spaces reach ssmtp intact
/usr/sbin/ssmtp -au "gmail-address" -ap "password" "$@"
and put it somewhere in your path (~/bin/gmailout, say) and make it executable (chmod u+x ~/bin/gmailout, I mean) and make sure only you can read it! (chmod og-r ~/bin/gmailout).

Now in ~/.muttrc, put
set pop_host="pops://username:password@pop.gmail.com:995"
set pop_last
unset pop_delete #Just makes mutt not ask, GMail uses config option
set sendmail="~/bin/gmailout"
set write_bcc=no #Important; sSMTP makes bcc non-blind otherwise

and you’re set (remember to make this world-unreadable too: chmod og-rw ~/.muttrc)

You can start mutt, and hit “G” (uppercase G) whenever you want to fetch mail. You can also put exec fetch-mail in ~/.muttrc to have it happen whenever you start mutt, but I find that irritating.

Problems with POP: Not that everything is perfect. I can’t have other mail-transport-agents like sendmail or postfix installed alongside ssmtp. I can’t figure out how to get my crontab reports sent to root, but they do go into ~/dead.letter :D
Also, with mutt I had the habit of adding a my_hdr bcc: my-email-address so that the mail I send is threaded along with the mail I receive (yaay, like Gmail), but somehow there seems to be simply no way of getting Gmail to give me, through POP, those messages I send using an external client. It’s a quirk [bug!] in the way Gmail implements POP. This I’ve fixed by setting mutt’s fcc to /var/mail/my-username, my mail folder. (Of course, if I were in the habit of moving mail to my mbox, I could fcc to mbox too.)
Apart from that, it works fine!

Written by S

Tue, 2007-07-31 at 02:16:03 +05:30
