Social media in the Sahara desert

My wife and I just finished a week long camel trek in eastern Morocco with Berber nomads. While our hosts had no formal education, no running water, no grid electricity (just a little solar), no flush toilets and no floors in their homes, no land lines and no computers they did have mobile phones. Pretty much everyone seemed to have a low end (Series 40) Nokia. Their lack of education didn’t stop them texting madly. Perhaps more interesting was that they used their mobiles both as music players and for playing what we’d call viral videos. I’m not sure how they get content on their phones, probably an hour away at the super cheap internet cafes of Rissani. At Danger one of the key ideas that differentiated us from the Blackberry and later iPhone was that we were a standalone appliance, not a peripheral for your existing computer. We’ve seen some failure in this model recently but I think it’s ultimately a worthy goal.

Posted in Default | Tagged , , , | 1 Comment

Not solving the wrong problem

I like a great deal of what Google does for the open web. They sponsor standards work, they are working on an open source browser, they are building documentation on the state of the web for web developers. It’s all really great. Today they posted what they called A Proposal For Making AJAX Crawlable. It seems like a great idea. More and more of the web isn’t reached by users clicking on a conventional <a href=”http://… link but by executing JavaScript that dynamically loads content off of the server. It’s somewhere between really hard and impossible for web crawlers to fully and correctly index sites that work that way without the sites’ developers taking crawlers into account.

Google’s proposal is to define a convention for URLs that contain state information in the anchor and to define a convention for retrieving the canonical, indexable contents of the an URL with such an anchor tag. First let me dismiss the suggestion that you make a headless browser available over HTTP to render your AJAX pages to HTML out of hand. If it’s so easy for HtmlUnit to render your AJAX to HTML, surely Google can do it. And basically offering HtmlUnit as a web service on your server doesn’t sound that secure or scalable to me.

The bigger question is that if your solution requires the server to be able to serve the correct HTML for any state, would you come up with the same solution as Google? There is a simple, straight-forward solution that works today and is used on sites all over the internet. If the content you serve includes the static, non AJAX URLs in anchor HREFs but uses JS click handlers to do AJAX loads then crawlers can scrape all of your pages, users of modern browsers get the full shiny experience and users on old mobile browsers that don’t support JS get to work for free!

To do this you can either make your AJAX templates include onclick handlers or you can write a simple piece of JS to do the right thing when any link is clicked on. A contrived example using jQuery might look like:

      $(function(event) {
        $('body').click(function(event) {
          var href = $(event.target).attr('href');
          // don't try to AJAX absolute URLs
          if (href.match('https?://')) return;
          // don't let the normal browser navigation operate
          event.preventDefault()
          // based on event.target.href, decide what AJAX URL to load.
          $('#ajaxframe').load('/load-fragment', {path: href});
          // update the URL bar
          document.location.hash=href;
        });
      });

This will intercept clicks on relative anchor tags and let your page JS do its AJAX magic. It doesn’t require special conventions. If you build your site this way you’ll probably find that the state that is in your URL fragments is a the relative URL for the page on your site. So http://www.example.com/random/page and http://www.example.com/#/random/page have the same meaning. That turns out to be a pretty good convention. After all, aren’t our URLs supposed to refer to resources anyway?

Posted in Default | Tagged , , , | Leave a comment

An fnmatch implementation using finite state machines and LLVM

For my amusement (and I guess education) I decided to implement a regular expression language on top of LLVM using a Ken Thompson style finite state machine algorithm. Instead of implementing classic POSIX regular expressions I chose to implement something closer to POSIX fnmatch expressions for a couple of reasons. The fnmatch language is simpler to parse than regular expressions and regexes as they are commonly understood and used are not true regular expressions and can’t be expressed as finite state machines.

My last experiment with LLVM and fnmatch was built in C++ but this time I chose Python. I’d been prototyping in Python and after I found the llvm-py module I couldn’t bring myself to port it all to C++. I spent several days trying to work out the right incantations of STL to represent the Python structures I’d chosen correctly and efficiently in C++ before just embracing Python as the implementation language.

nfa.py
Following Thompson’s technique (as explained by Russ Cox) I first converted the fnmatch rule (based on SUSv3 documentation) to non-deterministic finite automaton (NFA) form. Building the NFA from the fnmatch pattern string is straight-forward. Except for bracket expressions each character in the string becomes one node in the NFA. Each bracket expression becomes one node.

nfa

NFA for *.txt

dfa.py
Transforming the NFA to deterministic finite automaton (DFA) form is a little trickier, but Russ Cox’s explanation of the technique made it pretty straight-forward. Each DFA node maps to one or more NFA nodes so that for a given input string there is only one DFA node that would be reached.

Russ Cox’s documentation doesn’t cover character classes (ie: wildcards and bracket expressions) and I found that it was a bit tricky to represent and track these when converting from NFA to DFA form. Eventually I came up with a CharacterSet class that represents a set of matching characters, either by inclusion (tracking which characters are in the set) or exclusion (tracking which characters are not in the set). This was handy for representing bracket expressions but invaluable for storing the transitions between DFA states. The CharacterSet class stores a set of characters and a boolean to remember if it’s tracking inclusion or exclusion. I built the set operations I needed on top of that including containment, equality, union, difference and intersection.

The part of NFA to DFA transformation that I found trickiest was determining the set of DFA nodes that would be reached from each of the NFA nodes associated with a DFA node. For each NFA node there are a set of descendants that a particular character set maps to. In the NFA the character sets can overlap, but in a DFA we must make sure all of the character sets are disjoint – so that the state machine is deterministic. Additionally, since each DFA node is associated with multiple NFA nodes we need to work out which set of NFA descendants can be reached by the same set of characters and should be treated as a single DFA node.

I wrote a function distinctCharacterSets that for a set of CharacterSets return a set of disjoint CharacterSets where each input CharacterSet can be expressed as the union of one or more output CharacterSets, the union of the input CharacterSets is equal to the union of the output CharacterSets and there are no empty CharacterSets. On top of that I built a function to turn the list of descendants from all of the NFA nodes associated with a DFA node into DFA descendants.

I didn’t see this approach discussed in simple descriptions of the Thompson regular expression method and the implementations I tried reading were optimized beyond clarity but unless I’m missing an obvious alternative I’m sure similar structures and algorithms are used in finite automaton based regular expression engines.

DFA for *.txt

DFA for *.txt

compiler.py
While both the DFA and NFA can be interpreted the whole point of the exercise for me was to compile the expression to native code. I found the llvm-py module, a set of Python bindings for LLVM. It’s fairly good but incomplete. I’ve made some patches and have them up on Launchpad.net.

I generate an LLVM function for each DFA with basic blocks for each state. At each state an LLVM switch operation jumps to the next state, to a block that returns true if the input string ends on a terminal state or to a block that returns false if there is no next state to match the input character. Instead of calling llvm-py’s bindings for the LLVM optimizer I chose to call out to the command-line tool. It’s easier and it seems fast enough. The generated code can be JITed or compiled statically to native code.

The generated code looks decent. There’s definitely room for improvement, for example a * at the end of a pattern shouldn’t require us to walk the whole string. The LLVM optimizer doesn’t have much chance of catching code like that but it should be easy to catch things like that.

test.py
The implementation works, often. After building a tool to compare the results of Python’s built-in fnmatch, the NFA, DFA and LLVM implementations I found that after several compiles there are often problems. The problems manifest as an incorrect result, a segmentation fault or a Python error. I’m not sure if these are manifestations of the same problem and I’m not sure if the problem is a bug in LLVM, llvm-py or a mistake in my use of llvm-py.

So, the theory works nicely, but the implementation leaves something to be desired. Hopefully the failures are easy to track down and easy to overcome. There is also very weird performance, but I’ll discuss that in a later post. If you want to take a look at it the source is on github.

Posted in Default | Tagged , , , , | Leave a comment

Python generator fun

I know Python’s iterators and generators aren’t that new anymore, but at heart I’m still a Python 1.5 programmer. I’ve come to iterators and generators in Python from my experience in JavaScript.

This morning I wanted to generate a list of names (for nodes in my NFA/DFA pattern matching code) that looked like: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, AA, AB, AC, AD, AE, AF, AG, AH, AI, AJ, AK, AL, AM, AN, AO, AP, AQ, AR, AS, AT, AU, AV, AW, AX, AY, AZ, BA, BB, BC, BD, etc… A couple of minutes of fiddling I came up with:

def name_generator():
  from string import uppercase
  from itertools import chain
  for n in chain([''], name_generator()):
    for c in uppercase:
      yield n+c

It think the code is pretty simple, pretty readable, and pretty efficient. It’s not rocket science, but I feel that iterators and generators have paid off well for Python.

Posted in Default | Tagged | 1 Comment

Implementing text pattern matching languages in LLVM

We use pattern matching languages all day long. From shell filename matching rules (fnmatch) in our shells and shell utilities like find and locate to regular expression matching in programming languages, configuration files and shell utilities like grep and sed. These have typically been implemented by parsing the pattern into data structures and walking those data structures as input is processed. Obviously, these work fairly well – we couldn’t live without them, but can we do better? I’m thinking particularly about patterns that are matched against many times like the input to GNU locate (matched against every path on your system) or the routing table of your web application (matched for every request)?

Pattern matching languages are programming languages. If we were looking to speed up long-running conventional programs an obvious approach would be to use a virtual machine with a JIT to end up with efficient native code at the cost of slower startup.

I’ve been experimenting with this and the LLVM compiler infrastructure project. LLVM is a set of libraries and UNIX utilities that provide an assembler, optimizer, interpreter, and native compiler for a high level SSA intermediate form. So far I’ve implemented a simplified subset of the POSIX fnmatch function’s functionality as a proof of concept. It’s pretty hacky, but it’s up on github.

The performance isn’t great, it’s 20% slower than GLIBC in my trivial testcase, but so far all I’ve got is a really naiive implementation. I’m going to implement an NFA / DFA based algorithm which should be more efficient, and easier extend to full regular expressions. Hopefully it’ll look more useful then.

Posted in Default | Tagged , , | 1 Comment

Understanding the OAuth vulnerability

Last night’s OAuth Security Advisory 2009.1 was a little light on the details. The blog post wasn’t much better. I was peripherally involved in the OAuth spec development and I couldn’t work out what the advisory meant without a bunch of thinking and spec reading so I thought I’d try to explain it in simpler terms here.

For my example I’ll use the real service Twitter and a theoretical service Twitten that lets users post to to Twitter in LOL-speak and authenticates via OAuth. Alice and Bob will be my attacker and victim.

Alice’s normal authentication process goes like this:

  1. Alice loads twitten.com/login
  2. Twitten creates a regular HTTP session for Alice
  3. Twitten asks Twitter for an unauthorized token
  4. Twitten redirects Alice to an URL on the Twitter servers that will allow her to authorize the token
  5. Alice clicks OK to authorize the token
  6. Twitter redirects Alice back to Twitten
  7. Twitten exchanges its unauthorized token for an access token (associated with Alice’s account) with Twitter and stores it in Alice’s session
  8. Alice makes inane posts on Twitter via Twitten

The vulnerability here is in step 4. If instead of going to the authorization URL Alice convinces Bob to go there and authorize Twitten she can gain access to his account. Like this:

  1. Alice loads twitten.com/login
  2. Twitten creates a regular HTTP session for Alice
  3. Twitten asks Twitter for an unauthorized token
  4. Twitten tells Alice what URL to go to to authorize the token, but she doesn’t go there
  5. Alice tells Bob, “if you love Twitter and kittens try out Twitten – go to http://twitter.com/oauth/authorize/….” (the authorization URL from Twitten)
  6. Bob loads the authorization URL with his Twitter credentials and authorizes the token
  7. Twitten requests Twitter to exchange the unauthorized token for an access token (associated with Bob’s Twitter account) and stores it in Alice’s session
  8. Alice goes to twitten.com and posts “OMG PWNd” to Bob’s twitter account

I’m not really sure how to address this issue. It’s fundamentally hard to establish trust between three parties over insecure communications. Hopefully more experienced people than me will come up with clever answers.

Update: changed wording to match Eran’s suggestion, his blog post on the subject is excellent reading.

Posted in Default | Tagged , | 6 Comments

Restarting an AIR application

For reasons too complicated and secret to go into here I’m writing an Adobe AIR application that needs to restart itself occasionally. I didn’t find any clear documents describing how to do this but after some reverse engineering and experimentation, here’s what I came up with.

The air.swf movie that’s used by web pages to install and launch AIR applications calls some internal, undocumented APIs to do this, and so can you!

namespace {
  import adobe.utils.ProductManager;
  import mx.core.Application;
  public class Restart {
    public static function restart() : void {
      // request that a new instance of the application be launched
      new ProductManager('airappinstaller').launch('-launch ' +
        Application.application.nativeApplication.applicationID + ' ' +
        Application.application.nativeApplication.publisherID);
      // exit the current instance
      Application.application.nativeApplication.exit(0);
    }
  }
}

I don’t know if or when this will break but it’s pretty straight-forward, once you work it out. It feels to me like there might be some kind of race condition, but it seems to work alright. Oh also, I’ve only tried this on Mac.

Posted in Default | Tagged , | Leave a comment

Mozilla and WebKit, browser platform wars.

This post began as a comment on Matthew Gertner’s blog post The Browser Platform Wars. It’s a rant not an article, don’t take it personally.

In my experience (8 years building Mozilla based products and playing with WebKit since it was first released as WebCore in 2003) there are a few clear technical and social differences that can make WebKit a more attractive platform for developers than Mozilla. There are plenty of reasons that Firefox is a better product than Safari (I definitely prefer Firefox over Safari on my Mac), but that’s a different story.

The scale and complexity of the Mozilla codebase is daunting. Mozilla advocates will say that that’s because Mozilla provides more functionality, but the reality is that even if you don’t want all that functionality you still have to dig through and around it to get your work done. Much of the Mozilla platform is poorly documented, poorly understood and incomplete (the C++/JS binding security stuff was the most recent example I’ve looked at) while WebKit is smaller, simpler and newer. They use common c++ idioms instead of proprietary systems like XPCOM.

The scale of the Mozilla organization is also daunting. Mozilla’s web presence is vast and is filled with inaccurate, outdated content. Their goals are vague and mostly irrelevant to developers. By contrast WebKit’s web site is simple and straight-forward. Its audience is developers, it sets out goals that matter and make sense to developers, it explains clearly the process for participating and contributing in the project.

WebKit is designed for embedding. Within Apple there are several customers for the WebKit library – Desktop Safari, iPhone Safari, Dashboard, AppKit and more. Since WebKit already serves a variety of purposes it’s likely to work for other applications which third party developers will want to build. By comparison the Mozilla platform really only has one first-class customer – Firefox.

The WebKit community has welcomed non-employee contributors. They’ve even welcomed contributors who work for Apple’s competitors. There are WebKit reviewers from Google, Nokia and the open source community. By comparison, Songbird and Flock don’t have any Mozilla committers or reviewers who weren’t previously Mozilla Corporation employees even though they are two of the largest non-MoCo platform customers.

Perhaps I’m short-sighted, but I don’t see a clear path forward for Mozilla in competing with WebKit as a platform for web content display. The long history of Mozilla have left them with a large, complicated codebase that’s not getting smaller. The rapid growth and defensive attitude of the organization (probably brought on by the Netscape / IE wars) has left it without a culture that welcomes friendly competition. I think that Mozilla’s focus on the product above the platform is the right decision for them. I’m just glad we have an alternative web content platform.

Posted in Default | Tagged , , , , , , | 8 Comments

Three Months of ActionScript

I’ve been working largely in ActionScript 3 for the past three months. After spending four years working primarily in JavaScript I didn’t expect to encounter too many problems with her cousin. That expectation was pretty well borne out.

I was particularly excited at the chance to play with ActionScript’s optional strict typing since that has been under consideration for future versions of JavaScript / ECMAScript. AS3’s strict typing hasn’t felt optional to me because I’ve been largely using the Flex Builder IDE and it excitedly warns me every time I fail to specify a type on a variable or function signature, so I put them everywhere. Instead of implementing encapsulation using conventions like I do in JS I use AS3’s Interfaces. This felt good at first but I quickly realized that I couldn’t do the kinds of tricks that I’d become accustomed to in JS. I can’t create a typed object in a function and return it without defining a class for it in its own file. This has made the higher level structure of my software far closer to Java than to JavaScript.

On the other hand, inside my functions I’m pretty much writing JavaScript. All of the standard JS built-in objects (Object, Array, Math, etc) behave like modern JS implementations. I use anonymous functions to manipulate and filter arrays. I use Objects and Arrays as my primary data structures.

At this point I’m pretty used to AS3. I know how to design my code and I can be pretty productive, not needing to rewrite everything too many times. I’d like to see Adobe mix the Java-style classes and interfaces up with the JavaScript habit of declaring anonymous functions and objects as you go. I’d be thrilled if I could write something like:

var x : IMyInterface = new IMyInterface {
method: function() : void { ... },
field: 42
};

I see that they’re extending the language to support limited generics support so hopefully that won’t be the end of it. I’m looking forward to experimenting more on the platform.

Posted in Default | Tagged , , , , | Leave a comment

A Different Model For Web Services Authorization

In my last post I set out to describe how easy it is to extract private keys from desktop software. As I was concluding I stumbled on an alternative approach that might be more secure in some circumstances. I didn’t really go into details, so here’s an expansion of the idea.

Current API authentication mechanisms including Flickr Auth, OAuth, Yahoo BBAuth and Google AuthSub work by allowing users to grant an application the right to act for the user on the site. Some implementations allow the user to grant access to only a subset of the functionality — Flickr lets users grant just read access, Google lets users grant access to just one class of data (for example Google Docs but not Google Calendar). But still, the user doesn’t know which specific operations they’re allowing to be carried out. This wouldn’t matter so much if the user can trust the application that they’re granting access to, but there’s no way to provide for that completely.

Another approach might be to give third-party applications the ability to submit sets of changes to be applied to a user’s account, but require the user to approve them through the service provider’s web site before they’re applied to the the user’s real data. For example, a Flickr upload might look like this:

  • Client makes API request flickr.transaction.create(total_files=20, total_data=20M)
  • Flickr responds with transaction id X and a maximum lifetime for the transaction
  • Flickr responds with an URL to direct the user’s browser to to:
    • associate the transaction with an account, and
    • preview, permit and commit the changes

To a user this isn’t going to look or feel to different from the current Flickr upload scenario where on completion they’re presented with a set of uploaded photos and offered the opportunity to make metadata changes.

The key advantage here from a security standpoint is that users approve actions not applications. The challenge of expressing a set of changes to a user clearly is not small but unlike the challenge of identifying and trusting applications it’s solvable.

This model probably only makes sense for write-oriented sessions, and for infrequent, large write sessions (such as Flickr uploads) rather than frequent, small writes (like Twitter posting).

These ideas aren’t well thought out in my head, they’re fresh from me trying to come up with a workable model for API authentication and authorization that doesn’t depend on trusting the identity of clients. I’d really welcome feedback and ideas on how to take this forward and feedback implementers on the feasibility of adopting a model like this.

Posted in Default | Tagged , , , | 8 Comments