Tag Archives: google

Scripting Google Voice in Python

In the interest of getting some of the fragments of code I’ve written off my hard disk and out where someone might find them useful, I’ve decided to start dumping them into git repos with some very minimal documentation. Minimal enough that I actually kick them out there.

The first is a simple Python library to access Google Voice’s calling and texting functionality:

% python                  
Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from googlevoice import *
>>> gv = GoogleVoice('googleaccount', 'googlepassword')
>>> gv.sms('+1415MYNUMBER', 'this is a test')
>>> gv.call('+1415MYNUMBER', '18003569377')

The library depends on BeautifulSoup. It’s based on a bunch of work that I found on the internet. I don’t remember exactly whose techniques I ended up using, but there’s a bunch of example code out there, just not much that was simple to use and in Python.

A brighter future for mobile applications?

Since the Chrome OS announcement the other day I’ve been thinking more about what a world with rich enough web APIs to support all general purpose applications might look like. I’m not sure that it’ll happen, but it sounds like Google is putting their weight behind it and they’ve been successful in the past at moving our industry in new directions (remember the world before GMail and Google Maps?).

A richer set of standard web APIs might form the basis for a cross-manufacturer mobile platform. The Palm WebOS stack already kind of looks like Chrome OS (though with local HTML+JS apps rather than remote ones) and the original iPhone application model was exactly what Chrome OS proposes. The limitations that forced Apple to create a native developer platform are exactly the ones that Chrome OS plans to address.

Of course Google’s own mobile platform is decidedly non-web, and Apple’s much larger volume of applications discourages it from supporting standard APIs. The handset manufacturers, OS developers and carriers are all making a ton of money selling applications in a model that’s reminiscent of pre-web software distribution. The only real winners from a move to a web model for mobile applications would be the users.

Google Chrome OS

If I were building an OS today, I’d be building what Google just announced.

Like most heavy technology users I’ve been moving steadily toward hosted web applications over the past few years. I don’t use Evolution or mutt anymore; I use GMail. I don’t organize my photos on my laptop or use my own hosted Gallery; I use Flickr. I’ve never been a big office application user, but when I’m forced to open a Powerpoint deck, edit an Excel file or print out a Word document, I do it using Google Docs.

I’ve also spent the past four or five years working on blurring the line between what’s on your desktop and what’s online. At Flock I worked to synchronize your bookmarks to online services and between machines, to integrate personalized web search into your desktop workflow and to make publishing media from your devices as easy as publishing text from your keyboard. At Songbird we developed APIs to allow web apps to interact with your desktop media player and APIs to let your desktop media player access content from the web. At Rdio I worked on similar things from a slightly different angle, though I don’t think I can talk about them yet.

I’m really excited that Google has the balls (and the skills) to go all out: to commit to offering web applications enough APIs to provide the same functionality and user experience as desktop applications. This isn’t the first time this has been attempted, but I think this time it just might work. Just a couple of years ago, when the iPhone launched and Apple announced that the only way to write applications was to write web applications, users and developers rebelled. The iPhone browser wasn’t capable enough. Google has taken the right approach by committing to improving the web platform to support whatever APIs are needed before shipping the product.

I’ll never be running Chrome OS; I rely on too many specialized applications. But I am looking forward to when Flickr can pull photos right off my camera and GMail’s offline features are widely tested enough to actually work right. Much of the innovation in Chrome OS will benefit us all.

Not solving the wrong problem

I like a great deal of what Google does for the open web. They sponsor standards work, they are working on an open source browser, they are building documentation on the state of the web for web developers. It’s all really great. Today they posted what they called A Proposal For Making AJAX Crawlable. It seems like a great idea. More and more of the web isn’t reached by users clicking on a conventional <a href="http://… link but by executing JavaScript that dynamically loads content off of the server. It’s somewhere between really hard and impossible for web crawlers to fully and correctly index sites that work that way without the sites’ developers taking crawlers into account.

Google’s proposal is to define a convention for URLs that carry state information in the fragment, and a convention for retrieving the canonical, indexable contents of an URL with such a fragment. First, let me dismiss out of hand the suggestion that you make a headless browser available over HTTP to render your AJAX pages to HTML. If it’s so easy for HtmlUnit to render your AJAX to HTML, surely Google can do it. And offering HtmlUnit as a web service on your server doesn’t sound that secure or scalable to me.

The bigger question is this: if your solution requires the server to be able to serve the correct HTML for any state, would you come up with the same solution as Google? There is a simple, straight-forward solution that works today and is used on sites all over the internet. If the content you serve includes the static, non-AJAX URLs in anchor hrefs but uses JS click handlers to do AJAX loads, then crawlers can scrape all of your pages, users of modern browsers get the full shiny experience, and users on old mobile browsers that don’t support JS get a working site for free!

To do this you can either make your AJAX templates include onclick handlers or you can write a simple piece of JS to do the right thing when any link is clicked on. A contrived example using jQuery might look like:

      $(function() {
        $('body').click(function(event) {
          var href = $(event.target).closest('a').attr('href');
          // ignore clicks that aren't on links
          if (!href) return;
          // don't try to AJAX absolute URLs
          if (href.match('https?://')) return;
          // don't let the normal browser navigation operate
          event.preventDefault();
          // based on the clicked link's href, decide what AJAX URL to load
          $('#ajaxframe').load('/load-fragment', {path: href});
          // update the URL bar
          document.location.hash = href;
        });
      });

This will intercept clicks on relative anchor tags and let your page JS do its AJAX magic. It doesn’t require special conventions. If you build your site this way you’ll probably find that the state in your URL fragments is the relative URL for the page on your site, so http://www.example.com/random/page and http://www.example.com/#/random/page have the same meaning. That turns out to be a pretty good convention. After all, aren’t our URLs supposed to refer to resources anyway?
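The only other piece you might want is restoring state when someone arrives at a fragment URL directly. A minimal sketch, reusing the same contrived assumptions as above (the '#ajaxframe' element and the '/load-fragment' endpoint):

      $(function() {
        // if the page was loaded with a fragment like #/random/page,
        // restore that state through the same endpoint the click handler uses
        var href = document.location.hash.replace(/^#/, '');
        if (href) {
          $('#ajaxframe').load('/load-fragment', {path: href});
        }
      });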

Mozilla and WebKit, browser platform wars.

This post began as a comment on Matthew Gertner’s blog post The Browser Platform Wars. It’s a rant, not an article, so don’t take it personally.

In my experience (8 years building Mozilla-based products and playing with WebKit since it was first released as WebCore in 2003) there are a few clear technical and social differences that can make WebKit a more attractive platform for developers than Mozilla. There are plenty of reasons that Firefox is a better product than Safari (I definitely prefer Firefox over Safari on my Mac), but that’s a different story.

The scale and complexity of the Mozilla codebase is daunting. Mozilla advocates will say that’s because Mozilla provides more functionality, but the reality is that even if you don’t want all that functionality you still have to dig through and around it to get your work done. Much of the Mozilla platform is poorly documented, poorly understood and incomplete (the C++/JS binding security stuff was the most recent example I’ve looked at), while WebKit is smaller, simpler and newer. It uses common C++ idioms instead of proprietary systems like XPCOM.

The scale of the Mozilla organization is also daunting. Mozilla’s web presence is vast and is filled with inaccurate, outdated content. Their goals are vague and mostly irrelevant to developers. By contrast WebKit’s web site is simple and straight-forward. Its audience is developers; it sets out goals that matter and make sense to developers; and it explains clearly the process for participating in and contributing to the project.

WebKit is designed for embedding. Within Apple there are several customers for the WebKit library – Desktop Safari, iPhone Safari, Dashboard, AppKit and more. Since WebKit already serves a variety of purposes it’s likely to work for other applications which third party developers will want to build. By comparison the Mozilla platform really only has one first-class customer – Firefox.

The WebKit community has welcomed non-employee contributors. They’ve even welcomed contributors who work for Apple’s competitors. There are WebKit reviewers from Google, Nokia and the open source community. By comparison, Songbird and Flock don’t have any Mozilla committers or reviewers who weren’t previously Mozilla Corporation employees even though they are two of the largest non-MoCo platform customers.

Perhaps I’m short-sighted, but I don’t see a clear path forward for Mozilla in competing with WebKit as a platform for web content display. Mozilla’s long history has left them with a large, complicated codebase that’s not getting smaller. The rapid growth and defensive attitude of the organization (probably brought on by the Netscape / IE wars) has left it without a culture that welcomes friendly competition. I think that Mozilla’s focus on the product above the platform is the right decision for them. I’m just glad we have an alternative web content platform.

A Different Model For Web Services Authorization

In my last post I set out to describe how easy it is to extract private keys from desktop software. As I was concluding I stumbled on an alternative approach that might be more secure in some circumstances. I didn’t really go into details, so here’s an expansion of the idea.

Current API authentication mechanisms, including Flickr Auth, OAuth, Yahoo BBAuth and Google AuthSub, work by allowing users to grant an application the right to act for the user on the site. Some implementations allow the user to grant access to only a subset of the functionality — Flickr lets users grant just read access, Google lets users grant access to just one class of data (for example Google Docs but not Google Calendar). But still, the user doesn’t know which specific operations they’re allowing to be carried out. This wouldn’t matter so much if the user could trust the application they’re granting access to, but there’s no way to guarantee that completely.

Another approach might be to give third-party applications the ability to submit sets of changes to be applied to a user’s account, but require the user to approve them through the service provider’s web site before they’re applied to the user’s real data. For example, a Flickr upload might look like this:

  • Client makes API request flickr.transaction.create(total_files=20, total_data=20M)
  • Flickr responds with transaction id X and a maximum lifetime for the transaction
  • Flickr responds with an URL to direct the user’s browser to, where they can:
    • associate the transaction with an account, and
    • preview, permit and commit the changes
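
Sketched as code, the client half of that hypothetical flow might look something like this. The endpoint, the staging call and the response fields are all made up for illustration; only flickr.transaction.create comes from the example above.

// hypothetical client: declare the changes, stage them against a transaction,
// then hand the user a URL where the provider previews, permits and commits them
const API = 'https://api.example.com/rest'; // placeholder, not a real endpoint

async function uploadWithApproval(photos) {
  // 1. declare the intended changes up front
  const res = await fetch(API + '?method=flickr.transaction.create' +
                          '&total_files=' + photos.length + '&total_data=20M');
  const { transactionId, lifetime, approvalUrl } = await res.json();

  // 2. stage each upload against the transaction; nothing touches the
  //    user's real account yet (this staging endpoint is also made up)
  for (const photo of photos) {
    await fetch(API + '/transactions/' + transactionId + '/files', {
      method: 'POST',
      body: photo,
    });
  }

  // 3. send the user to the provider's own site to associate the transaction
  //    with their account, preview the changes, and permit or reject them
  console.log('Approve these uploads (within ' + lifetime + ') at: ' + approvalUrl);
}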

To a user this isn’t going to look or feel too different from the current Flickr upload scenario, where on completion they’re presented with a set of uploaded photos and offered the opportunity to make metadata changes.

The key advantage here, from a security standpoint, is that users approve actions, not applications. The challenge of clearly expressing a set of changes to a user is not small, but unlike the challenge of identifying and trusting applications, it’s solvable.

This model probably only makes sense for write-oriented sessions, and for infrequent, large write sessions (such as Flickr uploads) rather than frequent, small writes (like Twitter posting).

These ideas aren’t well thought out in my head; they’re fresh from me trying to come up with a workable model for API authentication and authorization that doesn’t depend on trusting the identity of clients. I’d really welcome feedback and ideas on how to take this forward, and feedback from implementers on the feasibility of adopting a model like this.

Twitter Translation

My friend Britt mentioned today that he was about to launch twitter.jp. How exciting! But I don’t understand Japanese. If only I could easily translate all those tweets in languages I don’t understand.

I played around with Google’s new AJAX Translation API before and I wondered how hard it would be to use that from a GreaseMonkey script. The answer: hard. I’m not sure what the exact problem was, but every way I tried to include Google’s APIs in the pages I was manipulating (including creating my own iframe and using document.write) failed. In the end I used a static proxy HTML file (hosted in one of my Amazon S3 buckets for cheap efficiency) with some sneaky cross-site communication: the request goes over in location.hash, the response comes back in window.name.
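
For anyone curious how the trick fits together, here’s a rough sketch of the page side. The proxy URL, the JSON envelope and the polling interval are illustrative assumptions, not the actual twitlator code:

// ask the S3-hosted proxy to translate by passing the request in the
// fragment, then poll until the reply shows up in the iframe's window.name
var PROXY_URL = 'http://example-bucket.s3.amazonaws.com/proxy.html'; // placeholder

function translateViaProxy(text, callback) {
  var iframe = document.createElement('iframe');
  iframe.style.display = 'none';
  iframe.src = PROXY_URL + '#' + encodeURIComponent(JSON.stringify({ text: text }));
  document.body.appendChild(iframe);

  var poll = setInterval(function () {
    try {
      // only readable once the proxy has navigated back to our origin
      var name = iframe.contentWindow.name;
      if (name) {
        clearInterval(poll);
        document.body.removeChild(iframe);
        callback(JSON.parse(name).translation);
      }
    } catch (e) { /* still cross-origin; keep polling */ }
  }, 100);
}

The proxy page does the other half: it reads the request out of location.hash, calls Google’s translation API, stashes the serialized result in window.name, and then navigates to a page on the caller’s origin so the parent frame is allowed to read it.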

My script is now up on userscripts.org: twitlator.user.js

To use it, simply go to a page listing a bunch of tweets, like the public timeline, and find a tweet in a language you don’t understand. For example:

Click the translate! link and several moments later you’ve got:

Yep. People aren’t posting anything interesting in Japanese either.

I’m pretty sure my script will only work on Firefox 3. I’m using getElementsByClassName which I think wasn’t introduced until Firefox 3. Why aren’t you running it already?

Google AJAX APIs outside the browser

Google just announced their new Language API this morning. Unfortunately their API is another one of their AJAX APIs – the kind designed to be used from JavaScript in web pages. These APIs are pretty cool for building client-side web applications – I used their AJAX Feeds API on my home page – but I had some ideas for server software that could use a translation API.

I remembered John Resig’s hack from a few months back, getting enough of the DOM in Rhino to run some of the jQuery unit tests. I pulled that down, wrote the bits of DOM that were missing, and now I’ve got a Java and JavaScript environment for accessing Google’s AJAX APIs. Apart from creating stubs for some methods that get called, the main functionality I had to implement was turning the Google APIs’ asynchronous script loading into the Rhino shell’s load(url) calls. They use <script src="... and document.createElement("script"), but both are pretty easy to catch. The upshot of this is that everything is synchronous, which subtly changes a lot of the interface. For example, my Language API example looks like this:

load('ajax-environment.js');
load('http://www.google.com/jsapi');
google.load('language', '1');
google.language.translate("Hello world", "en", "es", 
  function(result) { 
    print(result.translation);
  });

It of course prints: Hola mundo.
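
I won’t paste ajax-environment.js here, but the interception is conceptually simple. A rough sketch, assuming the env.js document and elements are plain Rhino objects you can monkey-patch (the real code may hook different points):

// catch document.write('<script src="...">')
var _write = document.write;
document.write = function(html) {
  var m = /<script[^>]*src="([^"]+)"/i.exec(html);
  if (m) { load(m[1]); return; }
  _write.call(document, html);
};

// catch document.createElement('script') followed by setting .src
var _createElement = document.createElement;
document.createElement = function(tagName) {
  var el = _createElement.call(document, tagName);
  if (tagName.toLowerCase() === 'script') {
    // Rhino supports __defineSetter__, so fetch the script synchronously
    // the moment the loader assigns its src
    el.__defineSetter__('src', function(url) { load(url); });
  }
  return el;
};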

I’ve put the source up on github. Have a play, tell me what you think.

Out with the old, in with the goo(gle)

Some time ago I reworked my home page to feature content from various other sites I post to (blogs, flickr, delicious) by using some JSON tricks to pull in their feeds. I blogged about how to do this with Feedburner’s JSON API, so that my actual page was just static HTML and all the work was done client-side.

Last week I decided to revisit this using Google’s new AJAX feeds API. Feedburner‘s API never seemed to be well supported (it came out of a hackathon) and it forced me to serialize my requests. In the process I neatened up a bunch of the code.

Google’s API is pretty straight-forward. It uses a loader model that is similar to Dojo‘s dojo.require, so you load the main Google library:

<script src="http://www.google.com/jsapi?key=YOURAPIKEY"
  type="text/javascript"></script>

and then ask it to load their feed library:

google.load('feeds', '1');

They have a handy way of setting a callback to be called when the libraries are loaded:

google.setOnLoadCallback(function () { /* just add code */ });

By putting all three of these together we have a straight-forward way to execute code at the right time.
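
Put together, a minimal version of the pattern looks something like this. The feed URL and target element id are placeholders, not the actual code from my home page:

google.load('feeds', '1');

google.setOnLoadCallback(function () {
  // placeholder feed URL and target element id
  var feed = new google.feeds.Feed('http://feeds.example.com/myblog');
  feed.setNumEntries(5);
  feed.load(function (result) {
    if (result.error) return;
    var target = document.getElementById('feed-target');
    for (var i = 0; i < result.feed.entries.length; i++) {
      var entry = result.feed.entries[i];
      var link = document.createElement('a');
      link.href = entry.link;
      link.appendChild(document.createTextNode(entry.title));
      target.appendChild(link);
    }
  });
});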

I refactored the code that inserts the feed data into the page a lot. I fleshed out the concept of input filters from simply filtering the title to filtering the whole item objects. This allows for a more flexible transformation from the information presented in the RSS feeds to the information I want to present to visitors to my page. In practice I only used it to remove my name from Twitter updates. Instead of hard-coding the DOM node creation like I did in the previous version of the code, I moved to a theme model. The theme function takes a feed entry and returns a DOM node to append to the target DOM node.
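
The hooks themselves are tiny: a filter takes an entry and returns a (possibly modified) entry, and a theme takes an entry and returns the node to append. Hypothetical versions, with a placeholder username, might look like:

// filter: strip my username (a placeholder here) from the front of tweets
function twitterFilter(entry) {
  entry.title = entry.title.replace(/^myusername:\s*/, '');
  return entry;
}

// theme: turn a feed entry into the DOM node that gets appended to the target
function itemTheme(entry) {
  var item = document.createElement('li');
  var link = document.createElement('a');
  link.href = entry.link;
  link.appendChild(document.createTextNode(entry.title));
  item.appendChild(link);
  return item;
}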

The flexibility of Google’s API let me abandon my separate code path for displaying my Flickr images. Previously I used Flickr’s own JSON feed API, but since Google’s feed API supports returning RSS extensions I used Flickr’s MediaRSS-compliant feeds to show thumbnails and links. They even provide a handy cross-browser implementation of getElementsByTagNameNS (google.feeds.getElementsByTagNameNS) for us to use.

I’m tempted to write a client-side implementation of Jeremy Keith‘s lifestream thing using this API.

Take a look at the code running on my home page or check out the script.