Ross Esmond

Code, Prose, and Mathematics.


Information

Information is program data that was collected from outside the program, that is, from the program's environment. Information is distinct from other data in that it cannot be rederived if lost. Information may come from a user, a sensor, or another program. A program that provides information could be a server across a network or another function in the same application, depending on what code is being treated as the program and what code is being treated as the environment. In database design, information is the data worth saving; in distributed systems, it is the data worth synchronizing; and in application design, it is the data worth storing mutably.

Information is independent of representation

Information and program values correspond, in that information is represented as values, but the information and its representation are distinct. If the phone number 7635553503 is represented as the string “(763)555-3503”, the information in the phone number hasn’t changed. Changing a digit, on the other hand, does change the information of the phone number.
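As a minimal sketch of this distinction, the following Python treats the ten digits as the information and the strings as representations of it. The helper name `digits_of` is hypothetical and chosen only for illustration.

```python
def digits_of(phone: str) -> str:
    """Recover the information (the digits) from any representation."""
    return "".join(ch for ch in phone if ch.isdigit())


plain = "7635553503"
formatted = "(763)555-3503"
altered = "(763)555-3504"  # same formatting, one digit changed

# Different representations, same information.
same_information = digits_of(plain) == digits_of(formatted)

# Same representation style, different information.
changed_information = digits_of(formatted) != digits_of(altered)
```

Reformatting leaves `digits_of` unchanged, while editing a digit does not, which is the sense in which the information is independent of the representation.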

Information, then, only changes if it can no longer be derived from the value. If $i \in I$ is some information, and $d \in D$ is some data that represents $i$, then there must exist some $f : D \to I$ such that $f(d) = i$. If $g : D \to E$ is some transformation that takes the data $d \in D$ to some form $e = g(d) \in E$, then $d$ and $e$ represent the same information $i$ if and only if there exists $h : E \to I$ such that $f(d) = h(e)$. This does not require $f$ or $g$ to be injective. Take, for instance, the case of parsing a phone number for digits, such that both (763)555-3503 and 763-555-3503 become 7635553503. If the information is considered to be the digits of the phone number, then this conversion does not change the information of the data, despite not being injective, and therefore not being invertible.
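The condition above can be checked on concrete data. A suitable $h$ with $h(g(d)) = f(d)$ exists exactly when $g$ never merges two values that $f$ tells apart. The sketch below tests that property over a finite list of samples, so a passing result is only evidence, not a proof over all of $D$. The names `preserves_information`, `digits`, and `last_four` are hypothetical.

```python
from itertools import combinations
from typing import Callable, Hashable, Iterable


def preserves_information(
    f: Callable[[str], Hashable],
    g: Callable[[str], Hashable],
    samples: Iterable[str],
) -> bool:
    """Return False if g merges two samples that f distinguishes.

    When that happens, no h with h(g(d)) == f(d) can exist.
    """
    for d1, d2 in combinations(list(samples), 2):
        if g(d1) == g(d2) and f(d1) != f(d2):
            return False
    return True


def digits(phone: str) -> str:
    # f: the information is taken to be the digits of the phone number.
    return "".join(ch for ch in phone if ch.isdigit())


def last_four(phone: str) -> str:
    # A lossy transformation that keeps only the final four digits.
    return digits(phone)[-4:]


samples = ["(763)555-3503", "763-555-3503", "(612)555-3503"]

# digits is not injective (the first two samples collide), yet nothing is lost.
digits_preserve = preserves_information(digits, digits, samples)

# last_four merges numbers with different area codes, so information is lost.
last_four_preserve = preserves_information(digits, last_four, samples)
```

Here `digits_preserve` is true even though two distinct strings map to the same output, while `last_four_preserve` is false because the area code can no longer be recovered.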

This distinction of information from its representation leads to the law of conservation of information.

Conservation of Information: Information cannot be created or changed by a program. It may only be lost.

Sources of Information, and changing Information

One source of information is the real world, which includes user input, sensor readings, and random numbers (the last of which must be included for consistency). These sources are often modeled as mutable values, so that the information source is allowed to change that information. Though the temperature at precisely 9:00 AM cannot change, the current temperature can, and so a value that represents the current temperature may need to be mutated. The new temperature can be considered an update to existing information. In this way, information can change, just not without updates from outside the program.
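One way to picture the temperature example is with an immutable reading and a mutable slot for the latest one. This is only a sketch: the class names `Reading` and `CurrentTemperature`, the `receive` method, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """A temperature at a fixed instant. This fact never changes."""
    taken_at: str   # e.g. "09:00"
    celsius: float


class CurrentTemperature:
    """A mutable slot that only the environment is meant to update."""

    def __init__(self, first: Reading) -> None:
        self.latest = first

    def receive(self, reading: Reading) -> None:
        # The update arrives from outside the program (a hypothetical sensor).
        self.latest = reading


current = CurrentTemperature(Reading("09:00", 18.5))
nine_am = current.latest

# A new reading replaces the current value; the 9:00 reading itself is untouched.
current.receive(Reading("09:05", 19.0))
```

The program never invents a temperature of its own. It only swaps in whatever the environment most recently supplied.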

The other source of information is other programs, though this list is even more expansive. “Program,” in this context, may refer to either a separate application or another section of the same application, like another function. When analyzing information, you must define the boundary of your “system”: the section of code for which all outside data is information. When analyzing a function, all arguments to the function hold information, even if this data did not originate from outside the collective program. When analyzing a stack of microservices, on the other hand, information would need to come from outside the cluster of programs entirely, such as from a user or a service not considered part of the system.

Storage of Information

Since information cannot be salvaged once lost, it must be stored in some durable form. Information is the only data that requires durability, as all other data may be rederived, and can therefore be deleted without loss. In addition, if two chunks of data represent the same information, then only one needs to be retained, as the other may always be derived from it. If the second chunk of data represents a reduction of the first, then retaining only the second chunk would result in a loss of information. This realization is helpful for designing programs that use Minimal Essential State. If a program were to retain only the data that it was given from the outside world, that program could lose all other data and still function.
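A small sketch of this idea: the class below keeps the readings it was given as its only essential state and treats the mean as derived data that can be thrown away at any time. The class `TemperatureLog` and its methods are hypothetical, and `mean` assumes at least one reading has been recorded.

```python
from typing import List, Optional


class TemperatureLog:
    """Stores only what came from outside; everything else is derived."""

    def __init__(self) -> None:
        self.readings: List[float] = []             # information: must be durable
        self._cached_mean: Optional[float] = None   # derived: safe to discard

    def record(self, celsius: float) -> None:
        self.readings.append(celsius)
        self._cached_mean = None  # the derived value is now stale

    def mean(self) -> float:
        # Assumes at least one reading has been recorded.
        if self._cached_mean is None:
            self._cached_mean = sum(self.readings) / len(self.readings)
        return self._cached_mean

    def drop_derived(self) -> None:
        # Simulates losing all non-essential data.
        self._cached_mean = None


log = TemperatureLog()
log.record(18.0)
log.record(20.0)
before = log.mean()
log.drop_derived()
after = log.mean()  # rebuilt from the readings alone
```

The reverse does not hold. Keeping only the mean would be a reduction of the readings, and the individual values could not be recovered from it.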

The ultimate purpose of Information

Programs do not exist to be black holes for information. Information must serve a purpose, or there would be no reason to ingest it, and that purpose must always, ultimately, be framed in terms of the program's effect on its environment. The ways in which information is used closely mirror the sources of information. Information may be displayed to a user, be used in the control of a machine, or be sent to another program. Of course, the data used in these applications is often computed from that information, but all information must serve some downstream purpose in the program's environment, either immediate or eventual. Any data that serves no ultimate purpose, that is, data that will never affect the way the program interacts with its environment, should not be retained as information. This distinction can help to determine what data to keep and what data to dispose of.

Sometimes, however, information serves only a temporary purpose; this is referred to as ephemeral information. Such data alters the way a program ingests information or helps in the program's interaction with its environment, but has no applicability to future operation. For interfacing with a user, this would include the selection state of text, which only serves to help edit the textual information, and the user's scroll position, which only serves to help the user navigate other information. Similarly, there exists ephemeral information for interacting with sensors and other programs. For instance, devices and programs often provide status updates that hold only temporary relevance to your program. This ephemeral information may be stored, utilized, and then disposed of when it is no longer useful.
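In code, this separation might look like keeping ephemeral state in its own structure and leaving it out of whatever gets persisted. The sketch below assumes a hypothetical text editor; the names `Document`, `ViewState`, and `EditorSession` are illustrative only.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Document:
    """Information worth keeping across sessions."""
    text: str = ""


@dataclass
class ViewState:
    """Ephemeral information: useful only while the user is editing."""
    selection: tuple = (0, 0)
    scroll_position: int = 0


@dataclass
class EditorSession:
    document: Document = field(default_factory=Document)
    view: ViewState = field(default_factory=ViewState)

    def save(self) -> str:
        # Only the durable information is serialized; the view state is dropped.
        return json.dumps(asdict(self.document))


session = EditorSession(
    Document("hello"),
    ViewState(selection=(0, 5), scroll_position=120),
)
saved = session.save()

# Restoring yields the same text with a fresh, default view state.
restored = EditorSession(Document(**json.loads(saved)))
```

The selection and scroll position still exist while the session is live, but nothing is lost when they are discarded on save.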