Calculating Actual Build Dependencies

During development, when you make a change your build system should rebuild every output affected by that change and nothing more. Incorrect dependencies lead to building too much (wasting time) or, worse, building too little (wasting far more time). Unfortunately, keeping dependencies correctly specified in your build configuration is time-consuming and error-prone.

Tom Tromey invented a technique called auto-dependency generation: when GCC compiles a source file it can also output a Makefile-formatted file describing the inputs and outputs of that step. GNU Make will include these generated files if they exist. The idea has since been adopted by other compilers and other build systems.
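
The classic pattern looks something like this (a minimal sketch, not pulled from any particular project):

# -MMD writes foo.d next to foo.o as a side effect of compiling foo.c;
# -MP adds phony targets so deleted headers don't break the build.
%.o: %.c
	$(CC) -MMD -MP -c $< -o $@

# Include whatever dependency files exist; silently skip missing ones.
-include $(wildcard *.d)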

Unfortunately not all tools can describe what they do in terms of Makefile rules – they're too busy doing whatever they're supposed to do. For a long time I've wanted to build something that would observe file I/O and generate dependency rule files based on what was observed. I read a bunch of the GNU Make and Ninja source code, researched loopback filesystems, and learned about various system-level profiling technologies.

Then the other day I wrote a short shell script: strace-deps

It uses strace to trace the system calls that a process (and its children) makes, then looks at all the openat invocations (since that appears to be the syscall a modern libc uses to open files) to find the files that were written and the files that were read. It excludes uninteresting paths (/tmp/, /dev/, /var/, /usr/, etc.) and then generates a dependency file describing what happened.
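
Something along these lines (a sketch of the approach, not the script itself; the strace parsing here is simplified):

#!/bin/sh
# Usage: strace-deps <depfile> <command> [args...]
set -u
depfile="$1"; shift
trace=$(mktemp)

# Trace openat calls in the command and all of its children.
strace -f -e trace=openat -o "$trace" "$@"

# Extract path and flags from each successful openat, dropping boring paths.
opens=$(grep -v ' = -1 ' "$trace" \
  | sed -n 's/.*openat([^,]*, "\([^"]*\)", \([^),]*\).*/\1 \2/p' \
  | grep -Ev '^(/tmp/|/dev/|/var/|/usr/|/etc/|/proc/|/sys/)')

# Files opened for writing are outputs; everything else is an input.
outputs=$(echo "$opens" | grep -E 'O_WRONLY|O_RDWR' | cut -d' ' -f1 | sort -u)
inputs=$(echo "$opens" | grep -vE 'O_WRONLY|O_RDWR' | cut -d' ' -f1 | sort -u)

# Emit a Makefile rule: outputs depend on inputs.
printf '%s: %s\n' "$(echo $outputs)" "$(echo $inputs)" > "$depfile"
rm -f "$trace"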

It needs to be explicitly invoked and the dependency file needs to be explicitly specified. My rule for iverilog compilation of Verilog testbenches looks like:

%_tb: %_tb.v
	strace-deps $@.d iverilog -o $@ $<

So for a testbench foo_tb.v, when generating foo_tb with iverilog, strace-deps observes all the file I/O and generates foo_tb.d based on what it sees.
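
For Make to actually use those generated files they need to be included, just as with GCC's auto-dependencies (assuming the naming convention above):

-include $(wildcard *_tb.d)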

This isn’t perfect but it’s definitely improved my development experience.

Infinite localhost tunnels

When I’m playing with web development I typically run a local web server. This is fine until I want to test on a mobile device or show it to someone else. There are a bunch of off-the-shelf options out there, but they either cost money or don’t offer a stable URL. I don’t feel like paying since I already have a server I’m paying for and my need is very sporadic. Stable URLs are the bigger problem, because I often want to integrate with third-party APIs that authorize by URL.

The simple solution is to use ssh port forwarding to forward a localhost port on my web server to a localhost port on my laptop, plus an Nginx proxy_pass rule to forward a specific named virtual host to the server-side port. This means either using a single, fixed name for everything I might want to tunnel (which started causing problems once I was playing with service workers) or editing and reloading the web server configuration and provisioning a new TLS certificate for each new name.

Approach

I’ve settled on a new approach that’s simple and scalable using wildcard certificates and a fairly new, infrequently used feature of ssh port forwarding: UNIX domain sockets.

Normally when we think about port forwarding we’re forwarding TCP ports to TCP ports. Passing -R 8000:localhost:1234 to ssh will forward port 8000 on the server to port 1234 on the local machine. If instead of a numeric port we pass a path, ssh will forward a UNIX domain socket. This gives me a textual rather than numeric namespace for the ports I’m forwarding.
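
For example (names here are placeholders):

# TCP: port 8000 on the server forwards to port 1234 on the laptop.
ssh -R 8000:localhost:1234 user@server.mydomain.com

# UNIX domain socket: a named socket path on the server forwards to port 1234.
ssh -R /tmp/tunnel-example:localhost:1234 user@server.mydomain.com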

Conveniently, Nginx’s proxy_pass directive also allows forwarding to a UNIX domain socket using the special syntax: proxy_pass http://unix:/path/to/socket:;. Wiring the two together we can forward from an Nginx wildcard server to an SSH port forward based on the server name.

A couple of challenges came up getting this to actually work. First of all, the UNIX domain sockets created by ssh are given very conservative permissions, and in spite of promising-sounding ssh options these aren’t configurable. At least in my server configuration Nginx wasn’t able to connect to those sockets, so I had to follow up creating the socket with changing the permissions. Secondly, the sockets aren’t reliably cleaned up, so before creating the forward I had to explicitly remove any old socket that might be there.

I wasn’t entirely confident that a specially crafted URL couldn’t be used to access parts of the filesystem beyond the tunnel sockets, so I applied a fairly conservative filter to the allowed host names in the Nginx config.

Server setup

First I set up a wildcard sub-domain on my web server and provisioned a Let’s Encrypt wildcard cert. Depending on how your DNS is set up that can be varying levels of tricky, and it’s not really relevant to this story. I also configured http requests to be redirected to https because it’s 2019.
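
For reference, one way to provision such a cert (wildcards require the DNS-01 challenge; if your DNS provider has a Certbot plugin you can automate the TXT records):

certbot certonly --manual --preferred-challenges dns \
    -d 'tunnel.mydomain.com' -d '*.tunnel.mydomain.com'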

Then I updated the domain’s nginx.conf to proxy_pass to a UNIX domain socket in /tmp. It looks like:

server {
        server_name "~^(?<name>[[:alnum:]_-]+)\.tunnel\.mydomain\.com$";
        root /home/user/tunnels/site;
        error_page 502 /error.html;
        location /error.html {
                internal;
        }
        location / {
                proxy_pass http://unix:/tmp/tunnel-$host:;
        }
        listen 443 ssl; # managed by Certbot
        ssl_certificate /etc/letsencrypt/live/tunnel.mydomain.com/fullchain.pem; # managed by Certbot
        ssl_certificate_key /etc/letsencrypt/live/tunnel.mydomain.com/privkey.pem; # managed by Certbot
        include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
        ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
server {
        listen 80;
        server_name *.tunnel.mydomain.com;
        location / {
                return 301 https://$host$request_uri;
        }
}

I also created a simple error page that gives basic usage instructions if I forget them: error.html. I think this will make /error.html on all forwarded hosts behave weirdly, but I haven’t found the need to address this yet.

Client setup

On the client side I wrote a little script that I put in ~/bin/tunnel:

#!/bin/sh

SSH_CONNECTION=user@server.mydomain.com
DOMAIN=tunnel.mydomain.com

set -eu

if [ $# -ne 2 ]
then
  echo "$0 <name> <port>"
  echo "To establish a tunnel from https://name.$DOMAIN/ to http://localhost:port/"
  exit 1
fi
TUNNEL_HOSTNAME="$1.$DOMAIN"
TUNNEL_SOCKET="/tmp/tunnel-$TUNNEL_HOSTNAME"
TUNNEL_PORT=$2

# remove the old tunnel socket if any
ssh $SSH_CONNECTION -o RequestTTY=no rm -f "$TUNNEL_SOCKET"

# connect the tunnel
ssh $SSH_CONNECTION -f -N -R "$TUNNEL_SOCKET:localhost:$TUNNEL_PORT"

# fix the permissions on the tunnel
ssh $SSH_CONNECTION -o RequestTTY=no chmod a+rw "$TUNNEL_SOCKET"

echo "Connect to: https://$TUNNEL_HOSTNAME/"

I can use this by invoking tunnel mysideproject 1234 and then loading https://mysideproject.tunnel.mydomain.com/ from any device.

The only real annoyance is that the tunnel won’t automatically reconnect after being disconnected. I could solve this with some slightly cleverer scripting but I’ve never felt the need – disconnects only tend to happen when I’ve closed my laptop.

Generating Noise Textures

My current play-with-random-web-tech side project is a solitaire game. Well, actually it’s a framework for implementing solitaire games, because I’m a programmer and a side project is the right place to exercise my desire for needless abstraction. I want a nice felt-looking background texture, and the way to get that is to add some noise to an image. Adding some subtle noise to an image is a really common approach to achieving a nice visual aesthetic – as I understand it, anyway; I mentioned I’m a programmer.

My project ships as a single HTML file. I’m developing it in TypeScript with LESS styling, but then I can bundle it all up as a single file to scp to a server. I’m not using any external images – Unicode includes plenty of handy symbols, and the only other one I needed (a Mahjong bamboo symbol) was only a tiny snippet of SVG. Including a noise texture as an external image, or inline as a data: URI, is kind of gross: noise textures by definition don’t compress well.

I was able to build something pretty nice by combining HTML Canvas’ raw pixel data access with Web Crypto’s getRandomValues(). The window.crypto.getRandomValues() call fills a typed array with random data. It’s actually generated by a PRNG seeded with a little entropy, so it’s fast and won’t waste all your hard-earned entropy. Canvas’ createImageData() and putImageData() let you work with canvas pixels as typed arrays.

The only catch is that getRandomValues() will only fill 64KiB of data per call. Canvas image data is 32bpp RGBA, so the biggest texture you can generate trivially (without repeated calls to getRandomValues()) is 128×128. When I use this subtly I don’t notice the repetition.
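
Put together, the core looks something like this (a sketch; the canvas element and how you composite the noise over your background are up to you):

const canvas = document.getElementById('noise');  // assumed <canvas> element
const ctx = canvas.getContext('2d');

// 128 × 128 pixels × 4 bytes (RGBA) = 65536 bytes, exactly the
// getRandomValues() per-call limit.
const size = 128;
const image = ctx.createImageData(size, size);
window.crypto.getRandomValues(image.data);

// Force full alpha so every pixel is opaque; subtlety comes from drawing
// the noise over the background at low opacity.
for (let i = 3; i < image.data.length; i += 4) {
  image.data[i] = 255;
}
ctx.putImageData(image, 0, 0);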

The completed example is on JSFiddle: https://jsfiddle.net/ianloic/2uzhvg8h/

Creepy Flashing Heads

For a Halloween decoration project of hers, Sharon picked up some cheap styrofoam heads from a store that had previously used them to display wigs. I got her to pick up a couple more for me to use. After some discussion and ideas, Aaron and I decided it would be creepy to have lights flashing on them in a random fashion. This sounded like exactly the project I wanted to use my TI MSP430 Launchpad for. I don’t have very much microcontroller experience – pretty much just making lights flash on or off – so that’s perfect.

I managed to get a MSP430 development environment up and running on Linux pretty easily. It was just:

apt-get install mspdebug gcc-msp430
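
Compiling and flashing are one-liners too (the file names here are made up, and I’m assuming the Launchpad’s rf2500 interface and a G2553 chip):

msp430-gcc -mmcu=msp430g2553 -Os -o heads.elf heads.c
mspdebug rf2500 'prog heads.elf'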

The code is pretty simple: two digital I/O pins drive the LEDs, and a watchdog timer wakes up every 32ms to decide whether each LED should be turned on, turned off, or left alone. In the end much of the code is overkill for such a simple application, but it does look kind of cool.
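
The shape of it is roughly this (a sketch rather than my exact code; the register names assume a G2-series chip and the interrupt syntax is msp430-gcc’s):

#include <msp430.h>

// Tiny linear congruential PRNG; the ISR uses it to pick each LED's fate.
static unsigned int seed = 1;
static unsigned int rnd(void) {
  seed = seed * 31421 + 6927;
  return seed;
}

int main(void) {
  WDTCTL = WDT_MDLY_32;               // watchdog as a ~32ms interval timer
  IE1 |= WDTIE;                       // enable the watchdog interrupt
  P1DIR |= BIT0 | BIT6;               // two LED pins as outputs
  __bis_SR_register(LPM0_bits | GIE); // sleep; the ISR does all the work
  return 0;
}

// Every ~32ms, give each LED a 1-in-4 chance of turning on and a 1-in-4
// chance of turning off; otherwise leave it alone.
void __attribute__((interrupt(WDT_VECTOR))) watchdog_isr(void) {
  unsigned int r = rnd();
  if ((r & 3) == 0) P1OUT |= BIT0;
  else if ((r & 3) == 1) P1OUT &= ~BIT0;
  r >>= 2;
  if ((r & 3) == 0) P1OUT |= BIT6;
  else if ((r & 3) == 1) P1OUT &= ~BIT6;
}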

Simply logging JavaScript calls

When debugging complicated JavaScript, one thing I find myself constantly doing is using console.log() to print out which functions are being called in which order. JavaScript is single-threaded and event-driven, so that order is often not entirely clear.

Traditionally I’ve done something like this:

function foo(bar) {
  console.log('foo('+bar+')');
}

but last night I came up with something better. It’s probably not completely portable (Function.caller is non-standard and won’t work in strict mode) but it seems to work fine in recent Chrome / Safari / Firefox, which is really all I’m going to be using for debugging anyway:

function logCall() {
  console.log(logCall.caller.name + '(' +
    Array.prototype.slice.call(logCall.caller.arguments)
    .map(JSON.stringify).join(', ') + 
    ')');
}

Just add logCall() to the start of functions and the function call (including serialized arguments) will be logged to the console. Easy and foolproof.
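
For example, the function above becomes:

function foo(bar) {
  logCall();
  // ... the rest of foo ...
}

foo('baz');  // logs: foo("baz")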