<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Software and Opinions &#187; Ian McKellar</title>
	<atom:link href="http://ianloic.com/author/ian/feed/" rel="self" type="application/rss+xml" />
	<link>http://ianloic.com</link>
	<description>from Ian McKellar</description>
	<lastBuildDate>Thu, 19 Nov 2009 22:05:43 +0000</lastBuildDate>
	
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Users&#8217; names and usernames</title>
		<link>http://ianloic.com/2009/11/19/users-names-and-usernames/</link>
		<comments>http://ianloic.com/2009/11/19/users-names-and-usernames/#comments</comments>
		<pubDate>Thu, 19 Nov 2009 22:05:43 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=155</guid>
		<description><![CDATA[A few years ago my friend Jack built a cute little application. It was a text message multiplexer. You could send it a text message and it would send that message to all of your friends. You signed up using your phone number and gave it your name. It was somewhere between addictive and annoying [...]]]></description>
			<content:encoded><![CDATA[<p>A few years ago my friend <a href="http://en.wikipedia.org/wiki/Jack_Dorsey">Jack</a> built a cute little <a href="http://twttr.com/">application</a>. It was a text message multiplexer. You could send it a text message and it would send that message to all of your friends. You signed up using your phone number and gave it your name. It was somewhere between addictive and annoying but completely social, since basically all of the users were our friends. We mostly used it as a free-form <a href="http://en.wikipedia.org/wiki/Dodgeball_(service)">Dodgeball</a>, to work out when friends were out at bars, and inevitably they added the ability to send a message directly to a contact. There were no usernames so twttr would cleverly work out who you meant based on the first name you supplied and your contacts list. This never worked right, so they added usernames and now I&#8217;m <a href="http://twitter.com/ian">@ian</a>. Unfortunately then the whole @reply thing happened and people <em>do</em> just use their friends&#8217; first names. Look at <a href="http://search.twitter.com/search?q=@ian">how many people use @ian</a> &#8211; most of them are not talking to me, but to other Ians.</p>
<p>Facebook resisted giving people anything other than a free-form name and a numeric user ID for the longest time. They finally gave in and let people pick vanity URLs but still refuse to make that URL useful for anything but getting to your profile. When they added @ support they popped up a UI to autocomplete friends&#8217; names from your contacts list. It works really well, but it depends on a rich message composition UI, something that&#8217;s not possible on the simple mobile devices that Twitter was targeting.</p>
<p>I wonder how the next site will approach this.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/11/19/users-names-and-usernames/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Chrome OS</title>
		<link>http://ianloic.com/2009/11/19/google-chrome-os/</link>
		<comments>http://ianloic.com/2009/11/19/google-chrome-os/#comments</comments>
		<pubDate>Thu, 19 Nov 2009 21:43:44 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[browsers]]></category>
		<category><![CDATA[chrome]]></category>
		<category><![CDATA[danger]]></category>
		<category><![CDATA[flock]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[os]]></category>
		<category><![CDATA[rdio]]></category>
		<category><![CDATA[songbird]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=153</guid>
		<description><![CDATA[If I was building an OS today I&#8217;d be building what Google just announced.
Like most heavy technology users I&#8217;ve been moving heavily toward hosted web applications over the past few years. I don&#8217;t use Evolution or mutt anymore, I use Gmail. I don&#8217;t organize my photos on my laptop and use my own hosted Gallery, [...]]]></description>
			<content:encoded><![CDATA[<p>If I was building an OS today I&#8217;d be building what Google just announced.</p>
<p>Like most heavy technology users I&#8217;ve been moving heavily toward hosted web applications over the past few years. I don&#8217;t use Evolution or mutt anymore, I use Gmail. I don&#8217;t organize my photos on my laptop and use my own hosted Gallery, I use Flickr. I&#8217;ve never been a big office application user, but when I&#8217;m forced to open a PowerPoint deck, edit an Excel file or print out a Word document, I do it using Google Docs.</p>
<p>I&#8217;ve also spent the past four or five years working on blurring the line between what&#8217;s on your desktop and what&#8217;s online. At Flock I worked to synchronize your bookmarks to online services and between machines, to integrate personalized web search into your desktop workflow and to make publishing media from your devices as easy as publishing text from your keyboard. At Songbird we developed APIs to allow web apps to interact with your desktop media player and APIs to let your desktop media player access content from the web. At Rdio I worked on similar things from a slightly different angle, though I don&#8217;t think I can talk about them yet.</p>
<p>I&#8217;m really excited that Google has the balls (and the skills) to go all out. To commit to offering enough APIs to web applications to allow them to provide the same functionality and user experience as desktop applications would. This isn&#8217;t the first time that this has been attempted, but I think this time it just might work. Just a couple of years ago when the iPhone launched and Apple announced that the only way to write applications was to write web applications, users and developers rebelled. The iPhone browser wasn&#8217;t capable enough. Google has taken the right approach by committing to improving the web platform to support whatever APIs are needed before shipping the product.</p>
<p>I&#8217;ll never be running Chrome OS. I rely on too many specialized applications, but I <em>am</em> looking forward to when Flickr can pull photos right off my camera and Gmail&#8217;s offline features are widely tested enough to actually work right. Much of the innovation in Chrome OS will benefit us all.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/11/19/google-chrome-os/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Social media in the Sahara desert</title>
		<link>http://ianloic.com/2009/10/24/social-media-in-the-sahara-desert/</link>
		<comments>http://ianloic.com/2009/10/24/social-media-in-the-sahara-desert/#comments</comments>
		<pubDate>Sat, 24 Oct 2009 18:33:44 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[danger]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[socialmedia]]></category>
		<category><![CDATA[travel]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=151</guid>
		<description><![CDATA[My wife and I just finished a week long camel trek in eastern Morocco with Berber nomads. While our hosts had no formal education, no running water, no grid electricity (just a little solar), no flush toilets and no floors in their homes, no land lines and no computers they did have mobile phones. Pretty much everyone seemed [...]]]></description>
			<content:encoded><![CDATA[<p>My wife and I just finished a week-long camel trek in eastern Morocco with Berber nomads. While our hosts had no formal education, no running water, no grid electricity (just a little solar), no flush toilets, no floors in their homes, no land lines and no computers, they did have mobile phones. Pretty much everyone seemed to have a low-end (Series 40) Nokia. Their lack of education didn&#8217;t stop them texting madly. Perhaps more interesting was that they used their mobiles both as music players and for playing what we&#8217;d call viral videos. I&#8217;m not sure how they get content on their phones, probably an hour away at the super cheap internet cafes of Rissani. At Danger one of the key ideas that differentiated us from the BlackBerry and later the iPhone was that we were a standalone appliance, not a peripheral for your existing computer. We&#8217;ve seen some failure in this model recently but I think it&#8217;s ultimately a worthy goal.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/10/24/social-media-in-the-sahara-desert/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Not solving the wrong problem</title>
		<link>http://ianloic.com/2009/10/08/not-solving-the-wrong-problem/</link>
		<comments>http://ianloic.com/2009/10/08/not-solving-the-wrong-problem/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 23:40:29 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[ajax]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[jquery]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=147</guid>
		<description><![CDATA[I like a great deal of what Google does for the open web. They sponsor standards work, they are working on an open source browser, they are building documentation on the state of the web for web developers. It&#8217;s all really great. Today they posted what they called A Proposal For Making AJAX Crawlable. It [...]]]></description>
			<content:encoded><![CDATA[<p>I like a great deal of what Google does for the open web. They <a href="http://en.wikipedia.org/wiki/Ian_Hickson">sponsor standards work</a>, they are working on an <a href="http://code.google.com/chromium/">open source browser</a>, they are <a href="http://code.google.com/doctype/">building documentation</a> on the state of the web for web developers. It&#8217;s all really great. Today they posted what they called <a href="http://googlewebmastercentral.blogspot.com/2009/10/proposal-for-making-ajax-crawlable.html">A Proposal For Making AJAX Crawlable</a>. It seems like a great idea. More and more of the web isn&#8217;t reached by users clicking on a conventional &lt;a href="http://&#8230;"&gt; link but by executing JavaScript that dynamically loads content off of the server. It&#8217;s somewhere between really hard and impossible for web crawlers to fully and correctly index sites that work that way without the sites&#8217; developers taking crawlers into account.</p>
<p>Google&#8217;s proposal is to define a convention for URLs that carry state information in the fragment (the part after the #) and a convention for retrieving the canonical, indexable contents of a URL with such a fragment. First let me dismiss out of hand the suggestion that you make a headless browser available over HTTP to render your AJAX pages to HTML. If it&#8217;s so easy for HtmlUnit to render your AJAX to HTML, surely Google can do it. And basically offering HtmlUnit as a web service on your server doesn&#8217;t sound that secure or scalable to me.</p>
<p>The bigger question is: if your solution requires the server to be able to serve the correct HTML for any state, would you come up with the same solution as Google? There is a simple, straight-forward solution that works today and is used on sites all over the internet. If the content you serve includes the static, non-AJAX URLs in anchor HREFs but uses JS click handlers to do AJAX loads then crawlers can scrape all of your pages, users of modern browsers get the full shiny experience and users on old mobile browsers that don&#8217;t support JS get to work for free!</p>
<p>To do this you can either make your AJAX templates include onclick handlers or you can write a simple piece of JS to do the right thing when any link is clicked on. A contrived example using jQuery might look like:</p>
<pre class="prettyprint">      $(function() {
        $('body').click(function(event) {
          // find the link that was clicked, even via a nested element
          var href = $(event.target).closest('a').attr('href');
          // ignore clicks that are not on links
          if (!href) return;
          // don't try to AJAX absolute URLs
          if (/^https?:\/\//.test(href)) return;
          // don't let the normal browser navigation operate
          event.preventDefault();
          // based on the link's href, decide what AJAX URL to load.
          $('#ajaxframe').load('/load-fragment', {path: href});
          // update the URL bar
          document.location.hash = href;
        });
      });</pre>
<p>This will intercept clicks on relative anchor tags and let your page JS do its AJAX magic. It doesn&#8217;t require special conventions. If you build your site this way you&#8217;ll probably find that the state that is in your URL fragments is the relative URL for the page on your site. So http://www.example.com/random/page and http://www.example.com/#/random/page have the same meaning. That turns out to be a pretty good convention. After all, aren&#8217;t our URLs <a href="http://en.wikipedia.org/wiki/Representational_State_Transfer">supposed to</a> refer to resources anyway?</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/10/08/not-solving-the-wrong-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An fnmatch implementation using finite state machines and LLVM</title>
		<link>http://ianloic.com/2009/07/15/an-fnmatch-implementation-using-finite-state-machines-and-llvm/</link>
		<comments>http://ianloic.com/2009/07/15/an-fnmatch-implementation-using-finite-state-machines-and-llvm/#comments</comments>
		<pubDate>Wed, 15 Jul 2009 20:41:37 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[fnmatch]]></category>
		<category><![CDATA[llvm]]></category>
		<category><![CDATA[llvm-py]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[regex]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=137</guid>
		<description><![CDATA[For my amusement (and I guess education) I decided to implement a regular expression language on top of LLVM using a Ken Thompson style finite state machine algorithm. Instead of implementing classic POSIX regular expressions I chose to implement something closer to POSIX fnmatch expressions for a couple of reasons. The fnmatch language is simpler [...]]]></description>
			<content:encoded><![CDATA[<p>For my amusement (and I guess education) I decided to implement a regular expression language on top of LLVM using a Ken Thompson style finite state machine algorithm. Instead of implementing classic POSIX regular expressions I chose to implement something closer to POSIX fnmatch expressions for a couple of reasons. The fnmatch language is simpler to parse than regular expressions, and regexes as they are commonly understood and used (with features like backreferences) are not true regular expressions and can&#8217;t be expressed as finite state machines.</p>
<p>My last experiment with LLVM and fnmatch was built in C++ but this time I chose Python. I&#8217;d been prototyping in Python and after I found the llvm-py module I couldn&#8217;t bring myself to port it all to C++. I spent several days trying to work out the right incantations of STL to represent the Python structures I&#8217;d chosen correctly and efficiently in C++ before just embracing Python as the implementation language.</p>
<p><strong>nfa.py</strong><br />
Following Thompson&#8217;s technique (<a href="http://swtch.com/~rsc/regexp/regexp1.html">as explained by Russ Cox</a>) I first converted the fnmatch rule (based on <a href="http://www.unix.org/single_unix_specification/">SUSv3</a> documentation) to non-deterministic finite automaton (NFA) form. Building the NFA from the fnmatch pattern string is straight-forward. Except for bracket expressions each character in the string becomes one node in the NFA. Each bracket expression becomes one node.</p>
<div id="attachment_138" class="wp-caption aligncenter" style="width: 310px"><a title="NFA for *.txt" rel="lightbox" href="http://ianloic.com/wp-content/uploads/2009/07/nfa.png"><img class="size-medium wp-image-138" title="NFA" src="http://ianloic.com/wp-content/uploads/2009/07/nfa-300x54.png" alt="nfa" width="300" height="54" /></a><p class="wp-caption-text">NFA for *.txt</p></div>
<p><strong>dfa.py</strong><br />
Transforming the NFA to deterministic finite automaton (DFA) form is a little trickier, but Russ Cox&#8217;s explanation of the technique made it pretty straight-forward. Each DFA node maps to one or more NFA nodes so that for a given input string there is only one DFA node that would be reached.</p>
<p>Russ Cox&#8217;s documentation doesn&#8217;t cover character classes (i.e. wildcards and bracket expressions) and I found that it was a bit tricky to represent and track these when converting from NFA to DFA form. Eventually I came up with a CharacterSet class that represents a set of matching characters, either by inclusion (tracking which characters are in the set) or exclusion (tracking which characters are <em>not</em> in the set). This was handy for representing bracket expressions but invaluable for storing the transitions between DFA states. The CharacterSet class stores a set of characters and a boolean to remember if it&#8217;s tracking inclusion or exclusion. I built the set operations I needed on top of that, including containment, equality, union, difference and intersection.</p>
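<p>As a rough illustration of the inclusion/exclusion idea, here is a minimal sketch of such a class. It is not the code from dfa.py: the method names and layout are assumptions made for this example, and it leans on De Morgan&#8217;s laws so that only intersection needs to handle the inclusion/exclusion cases explicitly.</p>

```python
class CharacterSet(object):
    """A set of characters stored by inclusion or by exclusion.

    Hypothetical sketch: names and layout are illustrative only and are
    not taken from the real dfa.py.
    """

    def __init__(self, chars="", inclusive=True):
        self.chars = frozenset(chars)
        self.inclusive = inclusive

    def __contains__(self, c):
        # inclusive sets match the listed characters, exclusive sets the rest
        return (c in self.chars) == self.inclusive

    def __eq__(self, other):
        return (self.chars, self.inclusive) == (other.chars, other.inclusive)

    def __hash__(self):
        return hash((self.chars, self.inclusive))

    def complement(self):
        return CharacterSet(self.chars, not self.inclusive)

    def intersection(self, other):
        if self.inclusive and other.inclusive:
            return CharacterSet(self.chars.intersection(other.chars))
        if self.inclusive:
            # listed characters minus the ones the other set excludes
            return CharacterSet(self.chars.difference(other.chars))
        if other.inclusive:
            return CharacterSet(other.chars.difference(self.chars))
        # both exclusive: exclude everything that either one excludes
        return CharacterSet(self.chars.union(other.chars), False)

    def union(self, other):
        # De Morgan: A or B == not (not A and not B)
        return self.complement().intersection(other.complement()).complement()

    def difference(self, other):
        # A minus B == A and not B
        return self.intersection(other.complement())

    def is_empty(self):
        return self.inclusive and not self.chars


wildcard = CharacterSet("", inclusive=False)     # any character, like "?"
not_abc = CharacterSet("abc", inclusive=False)   # like the bracket "[!abc]"
assert "z" in not_abc and "a" not in not_abc
assert wildcard.difference(not_abc) == CharacterSet("abc")
```

<p>Storing the complement as a flag means a wildcard costs nothing to represent, and the alphabet never has to be enumerated.</p>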
<p>The part of NFA to DFA transformation that I found trickiest was determining the set of DFA nodes that would be reached from each of the NFA nodes associated with a DFA node. For each NFA node there are a set of descendants that a particular character set maps to. In the NFA the character sets can overlap, but in a DFA we must make sure all of the character sets are disjoint &#8211; so that the state machine is deterministic. Additionally, since each DFA node is associated with multiple NFA nodes we need to work out which set of NFA descendants can be reached by the same set of characters and should be treated as a single DFA node.</p>
<p>I wrote a function <code>distinctCharacterSets</code> that, given a set of CharacterSets, returns a set of disjoint CharacterSets where each input CharacterSet can be expressed as the union of one or more output CharacterSets, the union of the input CharacterSets is equal to the union of the output CharacterSets and there are no empty CharacterSets. On top of that I built a function to turn the list of descendants from all of the NFA nodes associated with a DFA node into DFA descendants.</p>
<p>I didn&#8217;t see this approach discussed in simple descriptions of the Thompson regular expression method, and the implementations I tried reading were optimized beyond clarity, but unless I&#8217;m missing an obvious alternative I&#8217;m sure similar structures and algorithms are used in finite automaton based regular expression engines.</p>
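<p>To make those three properties concrete, here is a simplified sketch of the splitting step. It is an assumption-laden stand-in rather than the real function: it works on plain Python <code>frozenset</code>s of explicit characters, so it does not handle sets stored by exclusion, and the name <code>distinct_character_sets</code> is made up for this example.</p>

```python
def distinct_character_sets(sets):
    """Split possibly-overlapping sets into disjoint, non-empty pieces.

    Simplified, hypothetical version: inputs are plain sets of characters,
    not the inclusion/exclusion CharacterSet objects described in the post.
    """
    pieces = []  # invariant: the pieces are pairwise disjoint and non-empty
    for s in sets:
        remainder = frozenset(s)
        next_pieces = []
        for p in pieces:
            both = p.intersection(remainder)    # part of p also in s
            only_p = p.difference(remainder)    # part of p not in s
            if both:
                next_pieces.append(both)
            if only_p:
                next_pieces.append(only_p)
            # whatever p covered no longer needs a piece of its own
            remainder = remainder.difference(p)
        if remainder:
            next_pieces.append(remainder)       # part of s seen for the first time
        pieces = next_pieces
    return set(pieces)


result = distinct_character_sets([set("abc"), set("bcd")])
assert result == {frozenset("a"), frozenset("bc"), frozenset("d")}
```

<p>Here <code>abc</code> is the union of the pieces <code>a</code> and <code>bc</code>, <code>bcd</code> is the union of <code>bc</code> and <code>d</code>, and no piece is empty.</p>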
<div id="attachment_143" class="wp-caption aligncenter" style="width: 310px"><a title="DFA for *.txt" rel="lightbox" href="http://ianloic.com/wp-content/uploads/2009/07/dfa.png"><img class="size-medium wp-image-143" title="DFA" src="http://ianloic.com/wp-content/uploads/2009/07/dfa-300x144.png" alt="DFA for *.txt" width="300" height="144" /></a><p class="wp-caption-text">DFA for *.txt</p></div>
<p><strong>compiler.py</strong><br />
While both the DFA and NFA can be interpreted, the whole point of the exercise for me was to compile the expression to native code. I found the llvm-py module, a set of Python bindings for LLVM. It&#8217;s fairly good but incomplete. I&#8217;ve made some patches and have them up on Launchpad.net.</p>
<p>I generate an LLVM function for each DFA with basic blocks for each state. At each state an LLVM <code>switch</code> operation jumps to the next state, to a block that returns true if the input string ends on a terminal state or to a block that returns false if there is no next state to match the input character. Instead of calling llvm-py&#8217;s bindings for the LLVM optimizer I chose to call out to the command-line tool. It&#8217;s easier and it seems fast enough. The generated code can be JITed or compiled statically to native code.</p>
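<p>The control flow of such a function can be sketched in plain Python. The DFA below is hand-built for the pattern <code>a*b</code> purely as an illustration; it is not output from the compiler. Each dictionary plays the role of one basic block&#8217;s <code>switch</code>, with <code>DEFAULT</code> standing in for the switch&#8217;s default edge.</p>

```python
# Hand-built DFA for the fnmatch pattern "a*b" (an illustrative assumption,
# not output from the real compiler). Each state maps specific characters
# to a next state; DEFAULT covers every other character.
DEFAULT = None
TRANSITIONS = {
    0: {"a": 1},               # must start with "a"; anything else fails
    1: {"b": 2, DEFAULT: 1},   # inside the "*": keep going until a "b"
    2: {"b": 2, DEFAULT: 1},   # just saw "b": a match if the input ends here
}
ACCEPTING = {2}


def dfa_match(text):
    state = 0
    for c in text:
        edges = TRANSITIONS[state]
        if c in edges:
            state = edges[c]
        elif DEFAULT in edges:
            state = edges[DEFAULT]
        else:
            return False           # no next state for this character
    return state in ACCEPTING      # did the input end on a terminal state?


assert dfa_match("ab") and dfa_match("axxb") and dfa_match("abb")
assert not dfa_match("a") and not dfa_match("abc") and not dfa_match("b")
```

<p>In the compiled version the dictionary lookups disappear: each state is a block of native code and each transition is a jump.</p>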
<p>The generated code looks decent. There&#8217;s definitely room for improvement, for example a <code>*</code> at the end of a pattern shouldn&#8217;t require us to walk the whole string. The LLVM optimizer doesn&#8217;t have much chance of catching code like that, but it should be easy to catch cases like that when generating the code.</p>
<p style="text-align: left;"><strong>test.py</strong><br />
The implementation works, often. After building a tool to compare the results of Python&#8217;s built-in fnmatch, the NFA, DFA and LLVM implementations I found that after several compiles there are often problems. The problems manifest as an incorrect result, a segmentation fault or a Python error. I&#8217;m not sure if these are manifestations of the same problem and I&#8217;m not sure if the problem is a bug in LLVM, llvm-py or a mistake in my use of llvm-py.</p>
<p>So, the theory works nicely, but the implementation leaves something to be desired. Hopefully the failures are easy to track down and easy to overcome. There is also very weird performance, but I&#8217;ll discuss that in a later post. If you want to take a look at it the <a href="http://github.com/ianloic/llvm-fnmatch">source is on github</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/07/15/an-fnmatch-implementation-using-finite-state-machines-and-llvm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Python generator fun</title>
		<link>http://ianloic.com/2009/06/24/python-generator-fun/</link>
		<comments>http://ianloic.com/2009/06/24/python-generator-fun/#comments</comments>
		<pubDate>Wed, 24 Jun 2009 15:42:15 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=127</guid>
		<description><![CDATA[I know Python&#8217;s iterators and generators aren&#8217;t that new anymore, but at heart I&#8217;m still a Python 1.5 programmer. I&#8217;ve come to iterators and generators in Python from my experience in JavaScript.
This morning I wanted to generate a list of names (for nodes in my NFA/DFA pattern matching code) that looked like: A, B, C, [...]]]></description>
			<content:encoded><![CDATA[<p>I know Python&#8217;s iterators and generators aren&#8217;t that new anymore, but at heart I&#8217;m still a Python 1.5 programmer. I&#8217;ve come to iterators and generators in Python from my experience in JavaScript.</p>
<p>This morning I wanted to generate a list of names (for nodes in my NFA/DFA pattern matching code) that looked like: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, AA, AB, AC, AD, AE, AF, AG, AH, AI, AJ, AK, AL, AM, AN, AO, AP, AQ, AR, AS, AT, AU, AV, AW, AX, AY, AZ, BA, BB, BC, BD, etc&#8230; After a couple of minutes of fiddling I came up with:</p>
<pre class="prettyprint">def name_generator():
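  # ascii_uppercase replaces string.uppercase, which Python 3 removed
  from string import ascii_uppercase
  from itertools import chain
  # prefixes: '' first, then every name this generator itself produces
  for n in chain([''], name_generator()):
    for c in ascii_uppercase:
      yield n+c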
</pre>
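<p>Since the generator never finishes, it has to be sliced rather than turned into a list directly. A small usage sketch, with the generator repeated (using <code>ascii_uppercase</code>, which Python 3 requires) so that the snippet runs on its own:</p>

```python
from itertools import chain, islice
from string import ascii_uppercase


def name_generator():
    # the generator from the post, repeated so this snippet is self-contained
    for n in chain([''], name_generator()):
        for c in ascii_uppercase:
            yield n + c


# take only as many names as are needed; the generator itself is infinite
first = list(islice(name_generator(), 28))
print(first[:3], first[25:])   # ['A', 'B', 'C'] ['Z', 'AA', 'AB']
```

<p>Each extra letter of name length adds one more nested generator, which is fine for a handful of node names.</p>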
<p>I think the code is pretty simple, pretty readable, and pretty efficient. It&#8217;s not rocket science, but I feel that iterators and generators have paid off well for Python.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/06/24/python-generator-fun/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Implementing text pattern matching languages in LLVM</title>
		<link>http://ianloic.com/2009/06/05/implementing-text-pattern-matching-languages-in-llvm/</link>
		<comments>http://ianloic.com/2009/06/05/implementing-text-pattern-matching-languages-in-llvm/#comments</comments>
		<pubDate>Fri, 05 Jun 2009 20:07:39 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[fnmatch]]></category>
		<category><![CDATA[llvm]]></category>
		<category><![CDATA[regex]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=123</guid>
		<description><![CDATA[We use pattern matching languages all day long. From shell filename matching rules (fnmatch) in our shells and shell utilities like find and locate to regular expression matching in programming languages, configuration files and shell utilities like grep and sed. These have typically been implemented by parsing the pattern into data structures and walking those [...]]]></description>
			<content:encoded><![CDATA[<p>We use pattern matching languages all day long, from shell filename matching rules (<a href="http://www.opengroup.org/onlinepubs/9699919799/functions/fnmatch.html">fnmatch</a>) in our shells and shell utilities like find and locate to <a href="http://en.wikipedia.org/wiki/Regular_expression">regular expression</a> matching in programming languages, configuration files and shell utilities like <a href="http://en.wikipedia.org/wiki/Grep">grep</a> and <a href="http://en.wikipedia.org/wiki/Sed">sed</a>. These have typically been implemented by parsing the pattern into data structures and walking those data structures as input is processed. Obviously, these work fairly well &#8211; we couldn&#8217;t live without them, but can we do better? I&#8217;m thinking particularly about patterns that are matched against many times, like the input to <a href="http://en.wikipedia.org/wiki/GNU_locate">GNU locate</a> (matched against every path on your system) or the routing table of your web application (matched for every request).</p>
<p>Pattern matching languages are programming languages. If we were looking to speed up long-running conventional programs an obvious approach would be to use a virtual machine with a JIT to end up with efficient native code at the cost of slower startup.</p>
<p>I&#8217;ve been experimenting with this and the <a href="http://www.llvm.org/">LLVM</a> compiler infrastructure project. LLVM is a set of libraries and UNIX utilities that provide an assembler, optimizer, interpreter, and native compiler for a high level <a href="http://en.wikipedia.org/wiki/SSA_(compilers)">SSA</a> intermediate form. So far I&#8217;ve implemented a simplified subset of the POSIX fnmatch function&#8217;s functionality as a proof of concept. It&#8217;s pretty hacky, but it&#8217;s <a href="http://github.com/ianloic/llvm-fnmatch/tree/master">up on github</a>.</p>
<p>The performance isn&#8217;t great, it&#8217;s 20% slower than <a href="http://en.wikipedia.org/wiki/GNU_Libc">GLIBC</a> in my trivial testcase, but so far all I&#8217;ve got is a really naive implementation. I&#8217;m going to implement an <a href="http://en.wikipedia.org/wiki/Nondeterministic_finite_state_machine">NFA</a> / <a href="http://en.wikipedia.org/wiki/Deterministic_finite_state_machine">DFA</a> based algorithm which should be more efficient, and easier to extend to full regular expressions. Hopefully it&#8217;ll look more useful then.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/06/05/implementing-text-pattern-matching-languages-in-llvm/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Understanding the OAuth vulnerability</title>
		<link>http://ianloic.com/2009/04/23/understanding-the-oauth-vulnerability/</link>
		<comments>http://ianloic.com/2009/04/23/understanding-the-oauth-vulnerability/#comments</comments>
		<pubDate>Thu, 23 Apr 2009 18:04:15 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[oauth]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=115</guid>
		<description><![CDATA[Last night&#8217;s OAuth Security Advisory 2009.1 was a little light on the details. The blog post wasn&#8217;t much better. I was peripherally involved in the OAuth spec development and I couldn&#8217;t work out what the advisory meant without a bunch of thinking and spec reading so I thought I&#8217;d try to explain it in simpler [...]]]></description>
			<content:encoded><![CDATA[<p>Last night&#8217;s <a href="http://oauth.net/advisories/2009-1">OAuth Security Advisory 2009.1</a> was a little light on the details. The <a href="http://blog.oauth.net/2009/04/22/acknowledgement-of-the-oauth-security-issue/">blog post</a> wasn&#8217;t much better. I was peripherally involved in the OAuth spec development and I couldn&#8217;t work out what the advisory meant without a bunch of thinking and spec reading so I thought I&#8217;d try to explain it in simpler terms here.</p>
<p>For my example I&#8217;ll use the real service Twitter and a theoretical service Twitten that lets users post to Twitter in LOL-speak and authenticates via OAuth. Alice and Bob will be my attacker and victim.</p>
<p>Alice&#8217;s normal authentication process goes like this:</p>
<ol>
<li>Alice loads twitten.com/login</li>
<li>Twitten creates a regular HTTP session for Alice</li>
<li>Twitten asks Twitter for an unauthorized token</li>
<li>Twitten redirects Alice to a URL on the Twitter servers that will allow her to authorize the token</li>
<li>Alice clicks OK to authorize the token</li>
<li>Twitter redirects Alice back to Twitten</li>
<li>Twitten exchanges its unauthorized token for an access token (associated with Alice&#8217;s account) with Twitter and stores it in Alice&#8217;s session</li>
<li>Alice makes inane posts on Twitter via Twitten</li>
</ol>
<p>The vulnerability here is in step 4. If instead of going to the authorization URL Alice convinces Bob to go there and authorize Twitten she can gain access to his account. Like this:</p>
<ol>
<li>Alice loads twitten.com/login</li>
<li>Twitten creates a regular HTTP session for Alice</li>
<li>Twitten asks Twitter for an unauthorized token</li>
<li>Twitten tells Alice what URL to go to to authorize the token, but she doesn&#8217;t go there</li>
<li>Alice tells Bob, &#8220;if you love Twitter and kittens try out Twitten &#8211; go to http://twitter.com/oauth/authorize/&#8230;.&#8221; (the authorization URL from Twitten)</li>
<li>Bob loads the authorization URL with his Twitter credentials and authorizes the token</li>
<li>Twitten requests Twitter to exchange the unauthorized token for an access token (associated with Bob’s Twitter account) and stores it in Alice’s session</li>
<li>Alice goes to twitten.com and posts &#8220;OMG PWNd&#8221; to Bob&#8217;s twitter account</li>
</ol>
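<p>The attack steps above can be sketched as a toy simulation. Nothing here is a real OAuth library or Twitter API: the <code>Provider</code> and <code>Consumer</code> classes and their methods are invented for this example, and signatures, secrets and HTTP redirects are left out entirely. It only shows that the request token is tied to whoever started the flow, not to whoever approved it.</p>

```python
import uuid


class Provider(object):
    """Toy stand-in for the Twitter side of the flow (hypothetical API)."""

    def __init__(self):
        self.authorized_by = {}   # request token -> user who approved it

    def request_token(self):
        token = uuid.uuid4().hex
        self.authorized_by[token] = None
        return token

    def authorize(self, token, user):
        # whoever loads the authorization URL approves the token
        self.authorized_by[token] = user

    def access_token_owner(self, token):
        return self.authorized_by[token]


class Consumer(object):
    """Toy stand-in for Twitten: request tokens live in HTTP sessions."""

    def __init__(self, provider):
        self.provider = provider
        self.sessions = {}        # session id -> token and linked account

    def login(self, session_id):
        token = self.provider.request_token()
        self.sessions[session_id] = {"token": token, "account": None}
        return token              # would be embedded in the authorization URL

    def finish(self, session_id):
        session = self.sessions[session_id]
        session["account"] = self.provider.access_token_owner(session["token"])
        return session["account"]


twitter = Provider()
twitten = Consumer(twitter)
token = twitten.login("alice-session")    # steps 1-4: Alice starts the flow
twitter.authorize(token, "bob")           # steps 5-6: Bob approves her token
print(twitten.finish("alice-session"))    # steps 7-8: prints "bob"
```

<p>Alice&#8217;s session ends up holding access to Bob&#8217;s account because neither side checks that the person who approved the token is the person whose session requested it.</p>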
<p>I&#8217;m not really sure how to address this issue. It&#8217;s fundamentally hard to establish trust between three parties over insecure communications. Hopefully more experienced people than me will come up with clever answers.</p>
<p><strong>Update:</strong> changed wording to match Eran&#8217;s suggestion, his <a href="http://www.hueniverse.com/hueniverse/2009/04/explaining-the-oauth-session-fixation-attack.html">blog post</a> on the subject is excellent reading.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/04/23/understanding-the-oauth-vulnerability/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Restarting an AIR application</title>
		<link>http://ianloic.com/2009/03/11/restarting-an-air-application/</link>
		<comments>http://ianloic.com/2009/03/11/restarting-an-air-application/#comments</comments>
		<pubDate>Thu, 12 Mar 2009 00:40:45 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[actionscript]]></category>
		<category><![CDATA[adobeair]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=111</guid>
		<description><![CDATA[For reasons too complicated and secret to go into here I&#8217;m writing an Adobe AIR application that needs to restart itself occasionally. I didn&#8217;t find any clear documents describing how to do this but after some reverse engineering and experimentation, here&#8217;s what I came up with.
The air.swf movie that&#8217;s used by web pages to install [...]]]></description>
			<content:encoded><![CDATA[<p>For reasons too complicated and secret to go into here, I&#8217;m writing an Adobe AIR application that needs to restart itself occasionally. I didn&#8217;t find any clear documentation describing how to do this, but after some reverse engineering and experimentation, here&#8217;s what I came up with.</p>
<p>The <em>air.swf</em> movie that&#8217;s used by web pages to install and launch AIR applications calls some internal, undocumented APIs to do this, and so can you!</p>
<pre class="prettyprint">
package {
  import adobe.utils.ProductManager;
  import mx.core.Application;
  public class Restart {
    public static function restart() : void {
      // request that a new instance of the application be launched
      new ProductManager('airappinstaller').launch('-launch ' +
        Application.application.nativeApplication.applicationID + ' ' +
        Application.application.nativeApplication.publisherID);
      // exit the current instance
      Application.application.nativeApplication.exit(0);
    }
  }
}</pre>
<p>I don&#8217;t know if or when this will break but it&#8217;s pretty straight-forward, once you work it out. It feels to me like there might be some kind of race condition, but it seems to work alright. Oh also, I&#8217;ve only tried this on Mac.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/03/11/restarting-an-air-application/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mozilla and WebKit, browser platform wars.</title>
		<link>http://ianloic.com/2009/03/04/mozilla-and-webkit-browser-platform-wars/</link>
		<comments>http://ianloic.com/2009/03/04/mozilla-and-webkit-browser-platform-wars/#comments</comments>
		<pubDate>Thu, 05 Mar 2009 01:53:33 +0000</pubDate>
		<dc:creator>Ian McKellar</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[apple]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[nokia]]></category>
		<category><![CDATA[rant]]></category>
		<category><![CDATA[safari]]></category>
		<category><![CDATA[webkit]]></category>

		<guid isPermaLink="false">http://ianloic.com/?p=109</guid>
		<description><![CDATA[This post began as a comment on Matthew Gertner&#8217;s blog post The Browser Platform Wars. It&#8217;s a rant not an article, don&#8217;t take it personally.

In my experience (8 years building Mozilla based products and playing with WebKit since it was first released as WebCore in 2003) there are a few clear technical and social differences [...]]]></description>
			<content:encoded><![CDATA[<p><em>This post began as a comment on Matthew Gertner&#8217;s blog post <a href="http://browsing.justdiscourse.com/2009/03/04/the-browser-platform-wars/">The Browser Platform Wars</a>. It&#8217;s a rant not an article, don&#8217;t take it personally.<br />
</em></p>
<p>In my experience (8 years building <a href="http://www.hiptop.com/">Mozilla</a> <a href="http://www.flock.com/">based</a> <a href="http://getsongbird.com/">products</a> and playing with WebKit since it was <a href="http://lists.kde.org/?l=kfm-devel&amp;m=104197092318639&amp;w=2">first released</a> as WebCore in 2003) there are a few clear technical and social differences that can make <a href="http://www.webkit.org/">WebKit</a> a more attractive platform for developers than <a href="http://www.mozilla.org/">Mozilla</a>. There are plenty of reasons that <a href="http://www.mozilla.com/firefox/">Firefox</a> is a better <em>product</em> than <a href="http://www.apple.com/safari/">Safari</a> (I definitely prefer Firefox over Safari on my Mac), but that&#8217;s a different story.</p>
<p>The scale and complexity of the Mozilla codebase is daunting. Mozilla advocates will say that that&#8217;s because Mozilla provides more functionality, but the reality is that even if you don&#8217;t want all that functionality you still have to dig through and around it to get your work done. Much of the Mozilla platform is poorly documented, poorly understood and incomplete (the C++/JS binding security stuff was the most recent example I&#8217;ve looked at) while WebKit is smaller, simpler and newer. It uses common C++ idioms instead of proprietary systems like XPCOM.</p>
<p>The scale of the Mozilla organization is also daunting. Mozilla&#8217;s web presence is vast and is filled with inaccurate, outdated content. Their <a href="http://www.mozilla.org/about/manifesto.en.html">goals</a> are vague and mostly irrelevant to developers. By contrast, WebKit&#8217;s web site is simple and straightforward. Its audience is developers: it sets out <a href="http://webkit.org/projects/goals.html">goals</a> that matter and make sense to developers, and it <a href="http://webkit.org/coding/contributing.html">explains clearly</a> the process for participating in and contributing to the project.</p>
<p>WebKit is designed for embedding. Within Apple there are several customers for the WebKit library &#8211; Desktop Safari, iPhone Safari, Dashboard, AppKit and more. Since WebKit already serves a variety of purposes, it&#8217;s likely to work for other applications that third-party developers will want to build. By comparison, the Mozilla platform really only has one first-class customer &#8211; Firefox.</p>
<p>The WebKit community has welcomed non-employee contributors. They&#8217;ve even welcomed contributors who work for Apple&#8217;s competitors. There are WebKit reviewers from <a href="http://webkit.org/blog/354/darin-fisher-is-a-webkit-reviewer/">Google</a>, <a href="http://webkit.org/blog/300/tor-arne-vestb%C3%B8-is-a-webkit-reviewer/">Nokia</a> and the open source community. By comparison, Songbird and Flock don&#8217;t have any Mozilla committers or reviewers who weren&#8217;t previously Mozilla Corporation employees even though they are two of the largest non-MoCo platform customers.</p>
<p>Perhaps I&#8217;m short-sighted, but I don&#8217;t see a clear path forward for Mozilla in competing with WebKit as a platform for web content display. The long history of Mozilla has left it with a large, complicated codebase that&#8217;s not getting smaller. The rapid growth and defensive attitude of the organization (probably brought on by the Netscape / IE wars) has left it without a culture that welcomes friendly competition. I think that Mozilla&#8217;s focus on the product above the platform is the right decision for them. I&#8217;m just glad we have an alternative web content platform.</p>
]]></content:encoded>
			<wfw:commentRss>http://ianloic.com/2009/03/04/mozilla-and-webkit-browser-platform-wars/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>
