Low Fidelity Abstraction

It’s only through abstraction that we’re able to build the complex software systems that we do today. Hiding unimportant details from developers lets us work more efficiently and most importantly it allows us to devote more of our brain to the higher-level problems we’re trying to solve for our users.

As an obvious example, if you’re implementing a simple arithmetic function in assembly language you have to expend a lot of your brain power to track which registers are used for what, how the CPU will schedule the instructions and so on, while in a high level language you just worry about if you’ve picked the right algorithm and let the compiler worry about communicating it correctly and efficiently to the processor.

Lo-fi

More abstraction isn’t necessarily good though. If your abstractions hide important details then the cognitive burden on developers is increased (as they keep track of important information not expressed in the API) or their software will be worse (if they ignore those details). This can take many forms, but generally it makes things that can be expensive feel free by overloading programming language constructs. Here are some examples…

Getters and Setters

Getters and setters can implicitly introduce unexpected, important side effects. Code that looks like:

foo.bar = x;
y = foo.baz;

is familiar to programmers. We think we know what it means and it looks cheap. It looks like we’re writing a value to a memory location in the foo structure and reading out of another. In a language that supports getters and setters that may be what’s happening, or much more may be happening under the hood. Some common unexpected things that happen with getters and setters are:

  • unexpected mutation – getting or setting a field changes some other aspect of an object. For example, does setting person.fullName update the person.firstName and person.lastName fields?
  • lack of idempotency – reading the same field of an object repeatedly shouldn’t change its value, but with a getter it can. It’s even often convenient to have a nextId getter than returns an incrementing id or something.
  • lack of symmetry – if you write to a field does the same value come out when you immediately read from it? Some getters or setters clean up data – useful, but unexpected.
  • slow performance – setting a field on a struct is just about the cheapest thing you can do in high level code. Calling a getter or setter can do just about anything. Expensive field validation for setters, expensive computation for getters, and even worse reading from or writing to a database are unexpected yet common.

Getters and setters are really useful to API designers. They allow us to present a simple interface to our consumers but they introduce the risk of hiding details that will impact them or their users.

Automatic Memory Management

Automatic memory management is one of the great step forwards for programmer productivity. Managing memory with malloc and free is difficult to get right, often inefficient (because we err on the side of simple predictability) and the source of many bugs. Automatic memory management introduces its own issues.

It’s not so much that garbage collection is slow, but it makes allocation look free. The more allocation that occurs the more memory is used and the more garbage needs to be collected. The performance price of additional allocations aren’t paid by the code that’s doing the allocations but by the whole application.

APIs’ memory behavior is hidden from callers making it unclear what their cost will be. Worse, in weirder automatic memory management systems like Automatic Reference Counting in modern versions of Objective-C, it’s not clear if APIs will retain objects passed to them or returned from them – often even to the implementers of the API (for example).

IPC and RPC

It’s appealing to hide inter-process communication and remote procedure calls behind interfaces that look like local method calls. Local method calls are cheap, reliable, don’t leak user data, don’t have semantics that can change over time, and so on. Both IPC and RPC have those issues and can have them to radically different extents.  When we make calling remote services feel the same as calling local methods we remove the chore of using a library but force developers to carry the burden of the subtle but significant differences in our meagre brains.

But…

But I like abstraction. It can be incredibly valuable to hide unimportant details but incredibly costly to hide important ones. In practice, instead of being aware of the costs hidden by an abstraction and taking them into account, developers will ignore them. We’re a lazy people, and easily distracted.

When designing an API step back and think about how developers will use your API. Make cheap things easy but expensive things obviously expensive.


Also published on Medium.