Ross Esmond

Code, Prose, and Mathematics.


Simplify Early, Complicate Late

Data should be simplified as early in a process as possible, and derived as late in a process as possible. If some subset of a value is inconsequential to a program, and its removal would make further operation on that value simpler, the inconsequential subset should be removed at the first opportunity. Conversely, if a value needs to be rendered into a more complex form to perform some operations, that more complex form should be produced specifically for those operations, and the new form should not be stored for other operations that do not depend on the added data.

An illustrative example

I once worked with a developer who needed to create a Multifactor Authentication (MFA) page on a web app. The user would have a list of phone numbers and emails, and the design of the page involved a “smart option” component for each of SMS (text), voice call, and email verification. The smart option would render as a radio button with a dropdown if there were several numbers or emails, as a radio button with the static text of a phone number or email if the user had only one, and not at all if the user had no phone number or email that could be contacted.

The developer chose to handle the three possible states of the smart options throughout the application's state engine. Both Redux and React housed conditional statements that handled three scenarios (many items, one item, and no items) on every event. We later realized that this design was flawed, since nesting dropdowns inside a radio select created tab-order issues, and we were tasked with updating the application. The new design flattened the options into a single list of radio buttons, but the change proved difficult to implement.

After two weeks of development, I was brought in to speed up the change. I chose to refactor the application first, such that the functionality of the application was left unchanged while the code was improved. I rewrote the code such that only the smart options were aware of their three states. To the rest of the application, the smart options received an array of strings (the numbers or emails) and fired an event if one of the strings was selected. If the array was empty, no event would ever fire, and if the array only had one item, only that event could fire, but no part of the program cared that these scenarios were different, only the smart options. The change, both with the refactor and with the update, took two days. The actual update was maybe four hours.
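The shape of that refactor can be sketched roughly as follows. This is not the original code; the names SmartOptionView, viewFor, and createSmartOption are hypothetical, and the sketch leaves out React entirely. It only shows the boundary: strings go in, a selection event comes out, and the three-way branch lives in one place.

```typescript
// Hypothetical types: only this module knows a smart option has three states.
type SmartOptionView =
  | { kind: "hidden" }
  | { kind: "static"; label: string }
  | { kind: "dropdown"; choices: string[] };

// The single place where "none, one, or many" is decided.
function viewFor(contacts: string[]): SmartOptionView {
  const [first, ...rest] = contacts;
  if (first === undefined) return { kind: "hidden" };
  if (rest.length === 0) return { kind: "static", label: first };
  return { kind: "dropdown", choices: contacts };
}

// What the rest of the application sees: a view to render and a way to select.
interface SmartOption {
  view: SmartOptionView;
  select(index: number): void;
}

function createSmartOption(
  contacts: string[],
  onSelect: (contact: string) => void,
): SmartOption {
  return {
    view: viewFor(contacts),
    select(index: number): void {
      const contact = contacts[index];
      // With an empty array no index is valid, so the event never fires.
      if (contact !== undefined) onSelect(contact);
    },
  };
}
```

The caller passes an array and a callback, and never asks how many items there were. An empty array and a single-item array need no special handling outside viewFor.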

Simplify Early

Simplifying early suggests that data should be parsed for its relevant information and the rest discarded, though this must be done with care. Loss of information can be damaging to software flexibility if that information becomes necessary later in development. For this reason, the parsed data type should be tolerant to extension, referred to as having Information Flexibility.
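As a minimal sketch of parsing at the boundary, suppose the server returns a user record with more detail than the MFA page needs. The payload shape, the verified flag, and the names RawUser, ContactMethods, and parseContactMethods are all assumptions made for illustration, not part of the original application.

```typescript
// Hypothetical raw payload; every field name here is an assumption.
interface RawUser {
  id: string;
  displayName: string;
  phones: { number: string; verified: boolean }[];
  emails: { address: string; verified: boolean }[];
}

// The simplified shape the page works with. It is an object with named
// fields rather than a bare array, so a field can be added later without
// changing the code that already reads `phones` and `emails`.
interface ContactMethods {
  phones: string[];
  emails: string[];
}

// Parse once, at the first opportunity; the rest of the program never
// sees ids, display names, or unverified contacts.
function parseContactMethods(raw: RawUser): ContactMethods {
  return {
    phones: raw.phones.filter((p) => p.verified).map((p) => p.number),
    emails: raw.emails.filter((e) => e.verified).map((e) => e.address),
  };
}
```

Treating "verified" as the test for whether a contact can be used is a guess at what the page would need. Whatever the real test is, it runs once here instead of at every later use.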

Simplification can also involve no loss of information, and simply transform the data into a new representation more conducive to the current process, as with parsing programming languages into an Abstract Syntax Tree. Modern compilers need to be able to report the location of compiler errors on a character-by-character basis, so even white space is often retained in the AST, though it is usually buried deeper than other, more relevant information.

It is also possible that simplification requires no transformation of data at all, but occurs due to the elimination of a possible scenario, as with guard clauses. If the beginning of a function contains a line which executes an early return when an argument is a Nil Value, the rest of the function can assume that that argument can’t be nil, as that scenario has already been eliminated as a possibility.
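A guard clause of this kind might look like the sketch below. The function maskPhone and its masking format are hypothetical; the point is only that the nil check on the first line removes a scenario the rest of the body would otherwise have to handle.

```typescript
// Hypothetical helper: hides all but the last four characters of a phone number.
function maskPhone(phone: string | null | undefined): string {
  // Guard clause: eliminate the nil scenario up front.
  if (phone == null) return "";

  // From here on, the compiler treats `phone` as a plain string,
  // so no further nil checks are needed.
  const lastFour = phone.slice(-4);
  return `***-***-${lastFour}`;
}
```

In TypeScript the type checker follows the same reasoning: after the early return, the type of phone narrows from `string | null | undefined` to `string`.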

Complicate Late

Computations that create new data should be performed just before the data is necessary. Due to the law of Conservation of Information, this data is incapable of holding any new information, meaning that it can always be rederived if lost. Therefore, the data may safely be thrown away once it stops being useful. If there are performance concerns in recomputing the data each time, you may implement memoization or a reactive system to ensure that the data is not actually recomputed. Both of these strategies still keep the complexity of the added data out of most of the system, however, so they still satisfy the rule of complicating late.
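One way to picture this, using the flattened radio list from the earlier example: the application stores only the simple contact lists, and the richer list of radio items is derived right where it is rendered. The names ContactMethods, RadioItem, toRadioItems, and memoizeLast are hypothetical, and the single-entry cache is just one simple form of memoization, not a recommendation over any particular library.

```typescript
// Base data: the only form the application stores.
interface ContactMethods {
  phones: string[];
  emails: string[];
}

// Derived form: needed only by the rendering step.
interface RadioItem {
  channel: "sms" | "voice" | "email";
  contact: string;
}

// Derivation adds no new information, so it can be redone at any time.
function toRadioItems(c: ContactMethods): RadioItem[] {
  return [
    ...c.phones.map((contact): RadioItem => ({ channel: "sms", contact })),
    ...c.phones.map((contact): RadioItem => ({ channel: "voice", contact })),
    ...c.emails.map((contact): RadioItem => ({ channel: "email", contact })),
  ];
}

// Single-entry memoization: reuse the last result when called again
// with the same argument (compared by reference).
function memoizeLast<A, R>(fn: (arg: A) => R): (arg: A) => R {
  let cache: { arg: A; result: R } | undefined;
  return (arg: A): R => {
    if (cache !== undefined && cache.arg === arg) return cache.result;
    const result = fn(arg);
    cache = { arg, result };
    return result;
  };
}

// The renderer calls this; nothing else in the program sees RadioItem.
const radioItemsFor = memoizeLast(toRadioItems);
```

Because the cache compares by reference, this sketch assumes the stored ContactMethods object is replaced rather than mutated when it changes, as is conventional in Redux-style stores.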