Calculating Actual Build Dependencies

During development, when you make a change, your build system should rebuild every output affected by that change and no more. Incorrect dependencies lead to building too much (wasting time) or, worse, building too little (wasting far more time). Unfortunately, keeping the dependencies specified in your build configuration up to date is time-consuming and error-prone.

Tom Tromey invented a technique called auto-dependency generation, whereby GCC outputs a Makefile-formatted file as it compiles sources, describing the inputs and outputs of that compilation step. GNU Make will include these generated files if they exist. The idea has since been adopted by other compilers and other build systems.

Unfortunately not all tools can describe what they do in terms of Makefile rules – they’re too busy doing whatever they’re supposed to do. For a long time I’ve wanted to build something that would observe file I/O and generate dependency rule files based on what was observed. I read a bunch of the GNU Make and Ninja source code, I researched loopback filesystems, I learned about various system-level profiling technologies.

Then the other day I wrote a short shell script: strace-deps

It uses strace to trace the system calls that a process (and its children) makes, then looks at all the openat invocations (since that appears to be the syscall a modern libc uses to open files) to find the files that were written and the files that were read. It excludes uninteresting paths (/tmp/, /dev/, /var/, /usr/, etc.) and then generates a dependency file describing what happened.
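The heart of that is just text processing over the strace log. As a rough sketch of the idea (in Python purely for illustration — the actual script is shell, and the log lines and filter list here are assumptions):

```python
import re

# A synthetic `strace -f -e trace=openat` log: pid, syscall, path, flags.
TRACE = """\
12345 openat(AT_FDCWD, "foo_tb.v", O_RDONLY) = 3
12345 openat(AT_FDCWD, "/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 4
12345 openat(AT_FDCWD, "foo_tb", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 5
"""

OPENAT = re.compile(r'openat\([^,]+, "([^"]+)", ([A-Z_|]+)')
BORING = ("/tmp/", "/dev/", "/var/", "/usr/")  # not interesting as dependencies

reads, writes = set(), set()
for m in OPENAT.finditer(TRACE):
    path, flags = m.groups()
    if path.startswith(BORING):
        continue
    # Files opened for writing are outputs; everything else is an input.
    (writes if "O_WRONLY" in flags or "O_RDWR" in flags else reads).add(path)

# Emit a Make-style dependency file: each output depends on every input.
depfile = "".join(f"{out}: {' '.join(sorted(reads))}\n" for out in sorted(writes))
print(depfile)  # foo_tb: foo_tb.v
```

Running this over the sample log above yields a one-line dependency file mapping the written file to the read file.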

It needs to be explicitly invoked and the dependency file needs to be explicitly specified. My rule for iverilog compilation of Verilog testbenches looks like:
%_tb: %_tb.v
	strace-deps $@.d iverilog -o $@ $<
So for a testbench foo_tb.v when generating foo_tb with iverilog it will observe all the file I/O and generate foo_tb.d based on what strace-deps sees.

This isn’t perfect but it’s definitely improved my development experience.

Injustices and Anachronisms

A couple of weeks ago I was thinking happily about the fact that my kids will never know a time before marriage equality in the country that they’re growing up in. The US Supreme Court struck down bans on same-sex partners marrying less than a month before they were born. There was some talk of constitutional amendments to restore the discrimination in the aftermath of the ruling but that’s largely subsided.

Outside of the United States this is still a live issue. My kids are Australian and Swiss too. After a problematic plebiscite Australia recognized marriage equality in 2017. Switzerland doesn’t allow same-sex marriage but polls show heavy support and I expect the law will change before my kids are aware.

My kids will probably grow up seeing “other”, “backwards”, “bad places” that are stuck in the past. I don’t expect Saudi Arabia or Uganda to make this kind of progress before my kids are aware of the legal and cultural differences around the world. They’ll probably see that there’s something wrong with those societies. I hope they don’t think that the problem lies in the people there or the religions they follow, but that there’s some social block preventing their inevitable progress towards modernity and civility.

Then I realized that this is how I feel about capital punishment in the United States.


Growing up in Australia capital punishment was a part of history lessons not civics lessons. We learned that Ronald Ryan was the last man hanged. In 1967 the country was changing: there were protests against the war in Vietnam, metric currency and measures were being adopted, feminism and indigenous rights movements were becoming visible. The end of capital punishment was a part of that – part of the country growing up and becoming a modern civilized democracy.

Now I’ve found myself living in a country where capital punishment is seen as an injustice by many, as racist, ineffective and expensive by others. But for me it’s more than that – it’s an anachronism. It’s as far in the past and as settled a question as segregated lunch counters. Just thinking about it turns my stomach.


As a child I didn’t think that restricting marriage to being between men and women was an injustice – I just didn’t think about it. As I grew up and found gay friends and family members struggling with discrimination, it became pretty clear pretty quickly that restricting people’s right to marry and have a family based on their sexuality was both nonsensical and damaging to the lives of people I cared about.

Will my children have to learn that capital punishment is an injustice?

Infinite localhost tunnels

When I’m playing with web development I typically run a local web server. This is fine until I want to test on a mobile device or show it to someone else. There are a bunch of off-the-shelf options out there, but they either cost money or don’t offer a stable URL. I don’t feel like paying money since I already have a server I’m paying for and my need is very sporadic. Stable URLs are more of a problem because I often want to integrate with 3rd party APIs which are authorized by URL.

The simple solution is to use ssh port forwarding to forward a localhost port on my web server to a localhost port on my laptop, and then an Nginx proxy_pass rule to forward a specific named virtual host to the server-side port. This means either using a single, fixed name for everything I might want to tunnel – which started causing problems once I was playing with service workers – or having to edit and reload the web server configuration and provision a new TLS certificate for each new name.

Approach

I’ve settled on a new approach that’s simple and scalable using wildcard certificates and a fairly new, infrequently used feature of ssh port forwarding: UNIX domain sockets.

Normally when we think about port forwarding we’re forwarding TCP ports to TCP ports. Passing -R 8000:localhost:1234 to ssh will forward port 8000 on the server to port 1234 on the local machine. If instead of a numeric port we pass a path then ssh will forward a UNIX domain socket. This allows me to have a textual rather than numeric namespace for ports I’m forwarding.

Conveniently, Nginx’s proxy_pass directive also allows forwarding to a UNIX domain socket using the special syntax: proxy_pass http://unix:/path/to/socket:;. Wiring the two together we can forward from an Nginx wildcard server to an SSH port forward based on the server name.

A couple of challenges came up getting this to actually work. First of all, the UNIX domain sockets created by ssh are given very conservative permissions and, in spite of promising-sounding ssh options, these aren’t configurable. At least in my server configuration Nginx wasn’t able to connect to those sockets, so I had to follow creating the socket with a chmod. Secondly, the sockets aren’t reliably cleaned up, so before creating the forward I had to explicitly remove any stale socket that might be there.

I wasn’t entirely confident that a specially crafted URL couldn’t be used to access parts of the file-system beyond the tunnel sockets so I applied a fairly conservative filter to the allowed host names in the Nginx config.

Server setup

First I set up a wildcard sub-domain on my web server and provisioned a Let’s Encrypt wildcard cert. Depending on how your DNS is set up that can be varying levels of tricky, and it’s not really relevant to this story. I also configured http requests to be redirected to https because it’s 2019.

Then I updated the domain’s nginx.conf to proxy_pass to a UNIX domain socket in /tmp. It looks like:

server {
        server_name "~^(?<name>[[:alnum:]_-]+)\.tunnel\.mydomain\.com$";
        root /home/user/tunnels/site;
        error_page 502 /error.html;
        location /error.html {
            internal;
        }
        location / {
                proxy_pass http://unix:/tmp/tunnel-$host:;
        }
        listen 443 ssl; # managed by Certbot
        ssl_certificate /etc/letsencrypt/live/tunnel.mydomain.com/fullchain.pem; # managed by Certbot
        ssl_certificate_key /etc/letsencrypt/live/tunnel.mydomain.com/privkey.pem; # managed by Certbot
        include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
        ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
server {
        listen 80;
        server_name *.tunnel.mydomain.com;
        location / {
                return         301 https://$host$request_uri;
        }
}

I also created a simple error page, error.html, that gives basic usage instructions in case I forget them.
I think this will make /error.html on all forwarded hosts behave weirdly, but I haven’t found the need to address this yet.

Client setup

On the client side I wrote a little script that I put in ~/bin/tunnel:

#!/bin/sh

SSH_CONNECTION=user@server.mydomain.com
DOMAIN=tunnel.mydomain.com

set -eu

if [ $# -ne 2 ]
then
  echo "$0 <name> <port>"
  echo "To establish a tunnel from https://name.$DOMAIN/ to http://localhost:port/"
  exit 1
fi
TUNNEL_HOSTNAME="$1.$DOMAIN"
TUNNEL_SOCKET="/tmp/tunnel-$TUNNEL_HOSTNAME"
TUNNEL_PORT=$2

# remove the old tunnel socket if any
ssh "$SSH_CONNECTION" -o RequestTTY=no rm -f "$TUNNEL_SOCKET"

# connect the tunnel
ssh "$SSH_CONNECTION" -f -N -R "$TUNNEL_SOCKET:localhost:$TUNNEL_PORT"

# fix the permissions on the tunnel
ssh "$SSH_CONNECTION" -o RequestTTY=no chmod a+rw "$TUNNEL_SOCKET"

echo "Connect to: https://$TUNNEL_HOSTNAME/"

I can use this by invoking tunnel mysideproject 1234 and then loading https://mysideproject.tunnel.mydomain.com/ from any device.

The only real annoyance is that the tunnel won’t automatically reconnect after being disconnected. I could solve this with some slightly cleverer scripting but I’ve never felt the need – disconnects only tend to happen when I’ve closed my laptop.

Generating Noise Textures

My current play-with-random-web-tech side-project is a solitaire game. Well, actually it’s a framework for implementing solitaire games, because I’m a programmer and a side project is the right place to exercise my desires for needless abstraction. I want a nice felt-looking background texture and the way to do that is to add some noise to an image. Adding some subtle noise to an image is a really common approach to achieving a nice visual aesthetic – as I understand it – I mentioned I’m a programmer.

My project ships as a single HTML file. I’m developing it in TypeScript with LESS styling, but I can bundle it all up as a single file to scp to a server. I’m not using any external images – Unicode includes plenty of handy symbols and the only other one I needed (a Mahjong bamboo symbol) was just a tiny snippet of SVG. Including a noise texture as an external image or inline as a data: URI is kind of gross. Noise textures by definition don’t compress well.

I was able to build something pretty nice by combining HTML Canvas’ raw pixel data access and Web Crypto’s getRandomValues(). The window.crypto.getRandomValues() call fills a typed array with random data. It’s actually generated by a PRNG seeded with a little entropy, so it’s fast and won’t drain your hard-earned entropy. Canvas’ createImageData() and putImageData() allow you to work with canvas pixels as typed arrays.

The only catch is that getRandomValues() will only fill 64KiB of data per call. Canvas image data is 32bpp RGBA, so all you can generate trivially (without repeated calls to getRandomValues()) is a 128×128 texture. When I apply this subtly I don’t notice the repetition.
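The texture size falls straight out of that limit. A quick sanity check of the arithmetic (in Python, purely for illustration):

```python
# getRandomValues() rejects requests larger than 65536 bytes (64 KiB).
cap_bytes = 64 * 1024
bytes_per_pixel = 4                    # canvas ImageData is 32bpp RGBA
pixels = cap_bytes // bytes_per_pixel  # pixels that fit in one call
side = int(pixels ** 0.5)              # biggest square texture: 128 x 128
print(side)  # 128
```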

The completed example is on JSFiddle: https://jsfiddle.net/ianloic/2uzhvg8h/

Reusing Passwords

I have a confession. Sometimes I reuse passwords. Not for anything that “matters”, but I’ve ended up using a couple of passwords a lot of times. And inevitably some of those sites get hacked. But where did I use them?

Chrome remembers all my passwords but unfortunately doesn’t seem to offer a straightforward API to get at them. Conveniently it does sync my passwords to my computer’s password store, and there’s an API for that.

I wrote a little script and I’ve been going through generating unique passwords for all the unimportant sites and turning two-factor authentication on where it’s offered. The Secret Service database (and Chrome) seem to sometimes end up with multiple entries for a single site, and Chrome doesn’t seem to sync my updates immediately, but I’ve found this a helpful start.

Partisan Divide

There’s an American election on Tuesday. Whatever the outcome of the races, the partisan polarization is disturbing. Roughly 40% of the electorate considers each presidential candidate to be unqualified to even run. I don’t have the right to vote but my perspective is just as absolute. I’m right, but I’d say that wouldn’t I.

Each party chose a candidate that was dismissed out of hand by half the country as even a valid choice to offer. The core of the debate has not been over policy or even really a vision of the future of the country, but of the fatal personal flaws of the other candidate.

Where do we go from here? If America elects Trump on Tuesday then I and half the country won’t just feel defeated and disappointed, worried about the next four years and the country our children will inherit, but we’ll be skeptical of the president-elect’s eligibility to hold the office to which he was democratically elected. If Clinton prevails the other half of the country will feel the same way.

Perhaps more disturbingly, support for the candidates breaks heavily along gender, ethnicity and class lines. Whoever wins, whole communities will not only feel unrepresented in the White House, they’ll feel that its occupant is illegitimate. And for all their various skills, neither candidate has demonstrated skill in uniting the nation.

Tyranny is mostly pleasant

In my life I’ve been lucky enough to visit some brutal military dictatorships. From Suharto’s Indonesia as a child to Mubarak’s Egypt more recently, they’ve been really pleasant to visit. My impression is that most citizens of these countries were fairly unencumbered by the political system they lived under. Some people had a terrible time – Islamists in Egypt, Timorese and Papuans in Indonesia were tortured and murdered, their languages and beliefs suppressed. But for most people this wasn’t an issue.

Thinking about technology we have a similar situation. Most people are happy to rely on a proprietary operating system like MacOS or Windows because even though it takes away some freedoms, for them these freedoms aren’t as important on a day-to-day basis as the convenience that the platform provides.

Even though they’ve done a terrible job of protecting women and marginalized minorities from abuse, Twitter is a really convenient conversation platform for me. Facebook too with its real names policy excludes many people from honest, safe expression, but for a white cis man like me it’s really convenient.

On the other hand I run Linux on my personal computers because software freedom is morally important to me and the practical benefits for my fringe use case (programming) are significant.

If we’re going to build and promote Free technology – both FLOSS and a decentralized web, we need to accept that the pure principle of freedom isn’t enough to kick-start change. Enough people need to suffer enough discomfort to trigger a revolution.

Decentralized Web: My Thought Experiment

I’m at the Decentralized Web Summit today and it’s all very interesting. There are some big picture ideas of how the future should be. There are all sorts of interesting disparate technologies filling all kinds of holes. But I have a thought experiment that I’ve used to understand where we need to go and what we need to build to get there.

Uber. Or Lyft or AirBnB, or even Etsy.

This new sharing economy supposedly shakes up traditional businesses by harnessing the distributed power of the internet, but when you ignore shiny apps these businesses look a lot like traditional rent-seeking middlemen.

It feels like a bug that we are making new businesses that look like such old businesses. Ride sharing shouldn’t need a middleman. Prospective passengers and drivers should be able to discover each other, agree on a transaction, go for a ride and then make payment.

We have a lot of the pieces already, but reputation is the big challenge I see in all this. Even centralized systems struggle with reputation, and in a decentralized system we don’t have a good way to know whether we should trust that a driver is competent or that a guest won’t trash my apartment. I don’t know how to solve this, but I sure hope someone is thinking about it.

Low Fidelity Abstraction

It’s only through abstraction that we’re able to build the complex software systems that we do today. Hiding unimportant details from developers lets us work more efficiently and most importantly it allows us to devote more of our brain to the higher-level problems we’re trying to solve for our users.

As an obvious example, if you’re implementing a simple arithmetic function in assembly language you have to expend a lot of your brain power tracking which registers are used for what, how the CPU will schedule the instructions and so on, while in a high-level language you just worry about whether you’ve picked the right algorithm and let the compiler worry about communicating it correctly and efficiently to the processor.

Lo-fi

More abstraction isn’t necessarily good though. If your abstractions hide important details then the cognitive burden on developers is increased (as they keep track of important information not expressed in the API) or their software will be worse (if they ignore those details). This can take many forms, but generally it makes things that can be expensive feel free by overloading programming language constructs. Here are some examples…

Getters and Setters

Getters and setters can implicitly introduce unexpected, important side effects. Code that looks like:

foo.bar = x;
y = foo.baz;

is familiar to programmers. We think we know what it means and it looks cheap. It looks like we’re writing a value to a memory location in the foo structure and reading out of another. In a language that supports getters and setters that may be what’s happening, or much more may be happening under the hood. Some common unexpected things that happen with getters and setters are:

  • unexpected mutation – getting or setting a field changes some other aspect of an object. For example, does setting person.fullName update the person.firstName and person.lastName fields?
  • lack of idempotency – reading the same field of an object repeatedly shouldn’t change its value, but with a getter it can. It’s even often convenient to have a nextId getter that returns an incrementing id or something.
  • lack of symmetry – if you write to a field does the same value come out when you immediately read from it? Some getters or setters clean up data – useful, but unexpected.
  • slow performance – setting a field on a struct is just about the cheapest thing you can do in high-level code. Calling a getter or setter can do just about anything. Expensive field validation for setters, expensive computation for getters, and, even worse, reading from or writing to a database are unexpected yet common.
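
A couple of these pitfalls are easy to make concrete. A contrived sketch, using Python properties to stand in for getters and setters in general (the class and field names are made up for illustration):

```python
class Person:
    def __init__(self):
        self._first, self._last = "", ""
        self._ids = 0

    @property
    def full_name(self):
        return f"{self._first} {self._last}"

    @full_name.setter
    def full_name(self, value):
        # Unexpected mutation: one innocent-looking store rewrites two fields.
        self._first, self._last = value.split(" ", 1)

    @property
    def next_id(self):
        # Lack of idempotency: merely reading the "field" changes its value.
        self._ids += 1
        return self._ids

p = Person()
p.full_name = "Ada Lovelace"  # looks like a cheap memory write
print(p.full_name)            # Ada Lovelace
print(p.next_id, p.next_id)   # 1 2
```

Both statements at the bottom look like plain field access, which is exactly the problem.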

Getters and setters are really useful to API designers. They allow us to present a simple interface to our consumers but they introduce the risk of hiding details that will impact them or their users.

Automatic Memory Management

Automatic memory management is one of the great steps forward for programmer productivity. Managing memory with malloc and free is difficult to get right, often inefficient (because we err on the side of simple predictability) and the source of many bugs. But automatic memory management introduces its own issues.

It’s not so much that garbage collection is slow, but that it makes allocation look free. The more allocation that occurs, the more memory is used and the more garbage needs to be collected. The performance price of additional allocations isn’t paid by the code that’s doing the allocating but by the whole application.

APIs’ memory behavior is hidden from callers, making it unclear what their cost will be. Worse, in weirder automatic memory management systems like Automatic Reference Counting in modern versions of Objective-C, it’s not clear whether APIs will retain objects passed to them or returned from them – often even to the implementers of the API.

IPC and RPC

It’s appealing to hide inter-process communication and remote procedure calls behind interfaces that look like local method calls. Local method calls are cheap and reliable; they don’t leak user data, don’t have semantics that change over time, and so on. IPC and RPC offer none of those guarantees, and fall short of them to radically different extents. When we make calling remote services feel the same as calling local methods we remove the chore of using a library but force developers to carry the burden of the subtle but significant differences in our meagre brains.

But…

But I like abstraction. It can be incredibly valuable to hide unimportant details but incredibly costly to hide important ones. In practice, instead of being aware of the costs hidden by an abstraction and taking them into account, developers will ignore them. We’re a lazy people, and easily distracted.

When designing an API step back and think about how developers will use your API. Make cheap things easy but expensive things obviously expensive.

HTTP/2 on nginx on Debian

I run my web site off a Debian server on GCE. I like tinkering with the configuration. I hear that HTTP/2 is the new hot thing, and that’s going to mean supporting ALPN, which means upgrading to OpenSSL 1.0.2 and nginx 1.9.5 or newer. But those aren’t supported in Debian 8.

I used apt pinning to bring in versions of nginx and OpenSSL from testing into my jessie server. I first added sources for testing by creating a file /etc/apt/sources.list.d/testing.list:

deb http://ftp.us.debian.org/debian testing main non-free contrib

Then I configured my pin priorities by creating /etc/apt/preferences with:

Package: *
Pin: release a=stable
Pin-Priority: 700

Package: *
Pin: release a=testing
Pin-Priority: 650

Package: *
Pin: release a=unstable
Pin-Priority: 600

After an apt-get update I could install the version of nginx from testing, bringing in the appropriate version of OpenSSL: apt-get -t=testing install nginx-full

Then it was just a matter of changing:

listen 443 ssl;

to:

listen 443 ssl http2;

wherever I wanted it.

Now it looks like I’m serving over HTTP/2. Not that it makes a whole lot of obvious difference yet.