What I Learned at Work this Week: Information Hiding and Leakage
In engineering, there are many questions that can be answered with a quick Google search. What’s the difference between .find() and .filter()? No problem, just look it up. But there are also important, more complex questions that can’t be so easily looked up. Beneath everything we do, these philosophical concepts drive design decisions that have a wide-ranging impact on the use of our software. I’m grateful that my workplace offers me the opportunity to develop an understanding of some of these concepts through a technical book club. What I learned at work this week didn’t come from a ticket that I had to resolve, but instead from a discussion on A Philosophy of Software Design by John Ousterhout. More accurately, it came from my struggles to understand Chapter 5: Information Hiding (and Leakage).
In chapters 1–4, the book describes different design decisions we might see in a codebase, why they occur, and their benefits and drawbacks. Ousterhout stresses reducing complexity and cognitive load through thoughtful planning and deep modules (and likely other strategies that I haven’t read about yet), which should lead to code that’s easier to read, write, and debug. In Chapter 4 of the book, we learn that deep modules provide powerful functionality yet have simple interfaces. To me, this brings to mind the core data structures of Java. I wouldn’t recognize the code behind the initialization of a LinkedList, but I know what to expect when I invoke it:
import java.util.LinkedList;
LinkedList<String> mediumList = new LinkedList<>();
mediumList.add("There are complex operations behind the scenes.");
mediumList.add("But they're hidden from me.");
mediumList.get(1); // => "But they're hidden from me."
Take a look at the Oracle page for LinkedList: there’s a lot in there, but it’s not difficult to use. It’s a deep module with lots of functionality but a simple interface. If we start moving away from those principles, our module becomes shallower. Imagine using a method that looks like this instead:
AddTwoEntriesToALinkedList(String nameOfLinkedList, String firstElement, String secondElement);
It’s an extreme example of a mostly useless module, but many of us have probably coded something a little too specific because it fit our needs in that moment. Shallow modules don’t scale because they have limited functionality and are difficult to use, which increases cognitive load because we’re spending extra time tinkering with our arguments or trying to remember the name of the specific class to reference.
A deep module like LinkedList should remind us of the concept of abstraction (one of the four pillars of Object Oriented Programming). We’re leaving the complicated parts “under the hood” so that we can more easily interact with our module. But Ousterhout doesn’t really reference abstraction when he’s describing deep modules; instead, he says: the most important technique for achieving deep modules is information hiding.
Information Hiding
Information hiding is not a synonym for abstraction, though I personally have found them very difficult to distinguish. Edward V. Berard has a great article that breaks down the confusion between information hiding, abstraction, and encapsulation, but if you’re looking for a short answer, here’s the key quote:
Abstraction can be […] used as a technique for identifying which information should be hidden…Confusion can occur when people fail to distinguish between the hiding of information, and a technique (e.g., abstraction) that is used to help identify which information is to be hidden.
While abstraction is a technique for identifying what to hide, information hiding is the act. And this is where we want to focus, because it is the act of hiding the right information that will help us build deeper modules that are sturdier and easier to use. In his book, Ousterhout gives five examples of information that might be hidden within a module and, though I don’t really understand what “how to implement the TCP network protocol” means, I did notice that all five of his examples started with “How to.” Just like we discussed earlier with LinkedList, the value of this depth is that I, the user, don’t have to know how my module is executing its tasks, especially when they’re complex. Likewise, the rest of my system, which might depend on this module for its own operations, doesn’t have to be involved in these complex operations. If the information is properly hidden, the system’s use of the module doesn’t have to change even if we refactor its logic. The logic behind mediumList.get(1) might change, but we don’t have to change the way we use it.
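To convince myself of that last point, I put together a small sketch. MessageLog is a hypothetical class I made up for illustration (it isn’t from the book); the hidden decision is which collection stores the messages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical module: callers only ever see add() and get().
class MessageLog {
    // Hidden design decision: the collection that stores the messages.
    // Swapping ArrayList for LinkedList here changes nothing for callers.
    private final List<String> messages = new ArrayList<>();

    void add(String message) {
        messages.add(message);
    }

    String get(int index) {
        return messages.get(index);
    }
}

class MessageLogDemo {
    public static void main(String[] args) {
        MessageLog log = new MessageLog();
        log.add("There are complex operations behind the scenes.");
        log.add("But they're hidden from me.");
        System.out.println(log.get(1)); // prints: But they're hidden from me.
    }
}
```

If I later replace new ArrayList<>() with new LinkedList<>(), MessageLogDemo should compile and behave the same way, because the storage choice never appears in the interface.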
Information Leakage
Up to this point, this information might seem elementary or even redundant, which is where information leakage comes in. Ousterhout writes:
…information leakage occurs when a design decision is reflected in multiple modules. This creates a dependency between the modules: any change to that design decision will require changes to all the involved modules.
This makes sense: I don’t want to define how to build a LinkedList in more than one module. It’s inefficient, and there’s a good chance that at some point down the road my two definitions will diverge and I won’t realize it until I’m dealing with a bug in my program.
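Here is a rough sketch of how I picture avoiding that kind of leak, using a smaller design decision than a whole LinkedList. ScoreRecord and its “name:score” text format are assumptions of mine, invented for the example. If a separate writer class and reader class each knew about the separator, the format would be reflected in two modules; keeping both directions in one class means there is only one place to change.

```java
// Hypothetical example: the "name:score" text format is the design decision.
// encode() and decode() live in one class, so changing the separator
// means editing one module rather than a writer and a reader.
final class ScoreRecord {
    private static final String SEPARATOR = ":"; // hidden from callers

    final String name;
    final int score;

    ScoreRecord(String name, int score) {
        this.name = name;
        this.score = score;
    }

    // Turns the record into a single line of text.
    String encode() {
        return name + SEPARATOR + score;
    }

    // Rebuilds a record from a line produced by encode().
    static ScoreRecord decode(String line) {
        int split = line.lastIndexOf(SEPARATOR);
        String name = line.substring(0, split);
        int score = Integer.parseInt(line.substring(split + SEPARATOR.length()));
        return new ScoreRecord(name, score);
    }
}

class ScoreRecordDemo {
    public static void main(String[] args) {
        String line = new ScoreRecord("Ada", 42).encode();
        ScoreRecord copy = ScoreRecord.decode(line);
        System.out.println(copy.name + " scored " + copy.score); // prints: Ada scored 42
    }
}
```

Callers pass around ScoreRecord objects and opaque lines of text; nothing outside the class needs to know that a colon is involved.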
This is the essence of information hiding/leakage: we want our logic to be hidden and encapsulated in one place for the sake of simplicity and consistency. Whether that logic dictates the order we place elements into an array, the keys we use for properties in a constructor, or the data type of a return, minor changes can cause unexpected behavior. And, of course, if information about our logic makes its way into the module’s interface (everything that a developer working in a different module must know in order to use the given module), it has been leaked. Information hiding reduces the responsibility of a module’s invokers, both the programmers and a system’s other modules, which means keeping information requirements out of interfaces to prevent variance and complexity. Ousterhout stresses using default values when possible rather than requiring a variable declaration in an interface, or creating one method that transforms data into its intended form rather than multiple methods that must be used in conjunction (i.e., one that returns a collection and a second that selects from that collection).
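Those last two suggestions were easier for me to absorb once I tried them in code. The Leaderboard class below is hypothetical, and its names and default limit of 3 are my own assumptions. It offers a single top() call instead of one method that returns every score and a second that picks from them, and it supplies a default so the common case needs no argument.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical module; every name here is invented for illustration.
class Leaderboard {
    private static final int DEFAULT_LIMIT = 3; // callers need not choose a limit

    private final List<Integer> scores = new ArrayList<>();

    void record(int score) {
        scores.add(score);
    }

    // One call does the sorting and the selecting, so callers never see
    // (or depend on) the order in which scores are stored.
    List<Integer> top(int limit) {
        List<Integer> sorted = new ArrayList<>(scores);
        sorted.sort(Comparator.reverseOrder());
        int count = Math.min(limit, sorted.size());
        return new ArrayList<>(sorted.subList(0, count));
    }

    // Overload that supplies the default limit.
    List<Integer> top() {
        return top(DEFAULT_LIMIT);
    }
}

class LeaderboardDemo {
    public static void main(String[] args) {
        Leaderboard board = new Leaderboard();
        for (int score : new int[] {10, 50, 30, 40}) {
            board.record(score);
        }
        System.out.println(board.top());  // prints: [50, 40, 30]
        System.out.println(board.top(1)); // prints: [50]
    }
}
```

A leakier version would hand back the raw list and leave every caller to sort and slice it, which spreads the storage-order decision across the system.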
Philosophy in Software Design
I’m a boot camp graduate, which means I took an accelerated route in learning that didn’t afford the opportunity to study much beyond how to read and write working code. The good news is that my boot camp provided a great foundation for becoming a software engineer, but it also taught me that I would never be done learning. We should all strive to find the time to understand the larger concepts that drive our industry. By doing so, we make ourselves better programmers, better teammates, and better leaders.