Universal Laws and Architectures (Part II)

This post continues a discussion of a research program to address an essential but (I think) neglected challenge involving “architecture.” If you missed it, be sure to start with Part I, since it provides context for what follows…

Basic layered architectures in theory and practice

If architecture is the most persistent and shared organizational structure of a set of systems and/or a single system over time, then the most fundamental theory of architecture is due to Turing, who first formalized splitting computation into layers of software (SW) running on digital hardware (HW).  (I’m disturbed that Turing has been wildly misread and misinterpreted, but I’ll stick with those parts of his legacy that are clear and consistent within engineering, having to do with computation and complexity.  I’ll avoid Turing tests and morphogenesis for now.)

We now take Turing’s work so much for granted that we sometimes forget how profound it is, but I’ll review the obvious because we want to explore its relevance to future tech networks as well as to bacteria and brains.  Unfortunately, none of this is obvious if you haven’t studied it; it just seems cryptic, and my attempts to restate the basic ideas accessibly have been anything but accessible.  So this needs a lot of help from you.  It won’t matter to most readers, but I need help from the people who know this material and want to find a new way to explain it.

So before sketching Turing, let me briefly review a few practical modern consequences of IT that get us started on a discussion of architecture. What the user of devices (PC, phone, tablet, eyeglasses, etc.) connected to the Internet “sees” is a bewilderingly diverse set of hardware (HW) running a vastly more diverse set of apps. A fairly minimal constraint is imposed by the much less diverse operating systems (OS), which must provide a platform for distributed apps to work.  The OS provides the apps with a virtualized set of computational and other resources with minimal regard to their location in the physical network (unless that location is essential to the app, e.g. a specific camera).

Architecturally, these three app/OS/HW layers can be thought of as forming a “constraints that deconstrain” hourglass shape, where (deconstrained) diverse apps run on diverse HW, but with much less diversity in the more universally shared (but constrained) OS.  Despite its centrality, the OS is often the most cryptic layer, supporting the more obvious and observable apps and HW.  (It is tempting to say that “finding the OS” is central to understanding natural complex systems such as the brain, but at this point such a claim is itself too cryptic to be meaningful.  Hopefully that will change.)  The Internet architecture brilliantly extended the OS to include networks of computers, but with inadequacies in both virtualization and flexibility that have many practical consequences, as we will hopefully expand on in this series.
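A minimal Python sketch may make the hourglass concrete. Everything named here (the `Platform` interface, the two toy backends, the two toy apps) is hypothetical and invented for illustration. The only point is that apps are written against one narrow shared interface, so any app runs on any backend without either side knowing about the other.

```python
from typing import Protocol


class Platform(Protocol):
    """The narrow 'waist' (hypothetical): the only interface apps may use."""

    def read(self, key: str) -> str: ...
    def write(self, key: str, value: str) -> None: ...


class DictStore:
    """One toy 'hardware' backend: storage built on a hash map."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def read(self, key: str) -> str:
        return self._data.get(key, "")

    def write(self, key: str, value: str) -> None:
        self._data[key] = value


class LogStore:
    """A differently built backend: an append-only log, newest entry wins."""

    def __init__(self) -> None:
        self._log: list[tuple[str, str]] = []

    def read(self, key: str) -> str:
        for k, v in reversed(self._log):
            if k == key:
                return v
        return ""

    def write(self, key: str, value: str) -> None:
        self._log.append((key, value))


def counter_app(platform: Platform) -> str:
    """Toy app: increment a stored counter and report it."""
    n = int(platform.read("count") or "0")
    platform.write("count", str(n + 1))
    return platform.read("count")


def greeter_app(platform: Platform) -> str:
    """Toy app: store a name and greet it."""
    platform.write("name", "world")
    return "hello " + platform.read("name")


APPS = [counter_app, greeter_app]
BACKENDS = [DictStore, LogStore]


def run_all() -> list[str]:
    """Run every app on a fresh instance of every backend."""
    return [app(backend()) for backend in BACKENDS for app in APPS]
```

With n apps and m backends, the sketch needs n + m pieces of code rather than n × m ports. The price is that no app can reach past `read` and `write` to exploit anything special about a particular backend, which is the "constrained" part of the waist.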

Minimally put, we have now described a 3-layer architecture of apps/OS/HW, where apps and OS are sublayers of the SW in the original Turing SW/HW layers.  Informally stated, Turing and his followers showed (formally) that there are hard tradeoffs (one example of what I am cautiously calling “laws”) between speed and flexibility, specifically computational speed versus problem generality or algorithm flexibility.  Turing himself studied only decidability, which is an extreme end of the speed/flexibility tradeoffs (though there are notions even more extreme); these tradeoffs have been greatly expanded on since.  Even though central questions remain (e.g. P=NP?), it is clear that if an algorithm must solve a set of problems without error, then as the set becomes larger and more general, the worst-case time the algorithm takes must be at least as long, and is often prohibitively longer.  Put the other direction, for tractable computation, problems must be specialized, often severely, errors allowed, or both.  The theory of computational complexity, which started with Turing, is currently the clearest, most theoretically developed, and practically relevant example of a universal tradeoff (“law”) between speed and flexibility.  I will argue that this tradeoff is a main driver of the need for layered architectures such as SW/HW and apps/OS/HW.  (I apologize to readers not familiar with this subject.)
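As one small illustration of specializing a problem to make it tractable (an example chosen here, not one taken from Turing's work), consider subset sum: does some subset of a list of integers add up to a target? The Python sketch below contrasts an exhaustive solver that accepts any integers with a dynamic-programming solver that is far faster but only valid when all inputs are non-negative and the target is modest. The function names are hypothetical.

```python
from itertools import combinations


def subset_sum_general(values: list[int], target: int) -> bool:
    """Exhaustive search: correct for any integers, but may try all 2**n subsets."""
    return any(
        sum(combo) == target
        for r in range(len(values) + 1)
        for combo in combinations(values, r)
    )


def subset_sum_specialized(values: list[int], target: int) -> bool:
    """Dynamic programming in O(n * target) time.

    Only valid when every value and the target are non-negative integers,
    so the speed is bought by restricting the problem class.
    """
    if target < 0 or any(v < 0 for v in values):
        raise ValueError("specialized solver requires non-negative integers")
    # reachable[s] is True when some subset seen so far sums to s.
    reachable = [True] + [False] * target
    for v in values:
        # Iterate downward so each value is used at most once.
        for s in range(target, v - 1, -1):
            if reachable[s - v]:
                reachable[s] = True
    return reachable[target]
```

Both solvers agree wherever the specialized one applies, but only the general one accepts an input such as `[-7, 3, 5]` with target `-2`. The specialized running time also grows with the size of the target, so it is fast only on a restricted slice of the general problem.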

While Turing showed that SW/HW layering is “universal” in that it does not affect what is decidable, the split can be part of more general flexibility/speed tradeoffs, since specialized hardware (which may not be a universal machine, and might even be analog) can greatly speed up specific tasks.  (And decidability per se is an extreme point and rarely the focus of tradeoffs in modern practice.)  The layering concept can be expanded further, since digital HW is implemented in an analog hardware substrate, and software can be further layered with an operating system (OS) providing a platform for diverse applications (apps).  Each layer virtualizes the resources of the layer below, and speed can be improved by pushing function downward, making it less virtualized, but with a loss of flexibility.  Examples of components (within their layers) that facilitate this trade of flexibility for speed are optimizing and parallelizing compilers, GPUs, FPUs, FPGAs, ASICs, etc.
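A toy sketch of "pushing function downward", under loud assumptions: the stack machine, its four opcodes, and the example polynomial are all invented here, and Python itself is of course a virtualized layer, so this only mimics the structure of the tradeoff rather than measuring it. The flexible version runs any program supplied as data; the hardwired version computes exactly one function with no dispatch overhead.

```python
def run_vm(program: list[tuple[str, float]], x: float) -> float:
    """Flexible layer: a tiny stack machine that runs any program given as data."""
    stack: list[float] = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "load_x":
            stack.append(x)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown op: {op}")
    return stack.pop()


# The polynomial 3*x*x + 2 expressed as data for the stack machine.
# The second tuple field is ignored by every opcode except "push".
POLY_PROGRAM = [
    ("push", 3.0),
    ("load_x", 0.0),
    ("mul", 0.0),
    ("load_x", 0.0),
    ("mul", 0.0),
    ("push", 2.0),
    ("add", 0.0),
]


def poly_hardwired(x: float) -> float:
    """Specialized layer: the same polynomial with the function fixed in code."""
    return 3.0 * x * x + 2.0
```

Changing what the stack machine computes means editing a list, while changing `poly_hardwired` means rewriting and redeploying code. In return, the hardwired version skips the per-instruction dispatch that the machine pays on every step.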

To recap: Speed can be enhanced at the price of both material cost and functional inflexibility, e.g. by implementing in specialized hardware a function previously done in software.  Flexibility can be enhanced, also with costs, by adding software to previously hardware-only solutions.  Radios make good examples.  In, say, cell phones, the radios have an important HW component that is purely distributed and analog, which interfaces with lumped analog and then digital HW components, making a digital/lumped/distributed layering.  The distributed/analog antenna components are unavoidable because of the radio waves that must be received and transmitted, but there is substantial design choice in where function is placed in distributed vs lumped, analog vs digital, and HW vs SW.  It might be worth delving into this in some detail, as there are topics like cognitive radio and real-time beam-forming (where Javad Lavaei worked with Ali Hajimiri to use convex optimization in addition to novel HW) that are beautifully illustrative, but I’ll defer that for now.  The point to take away is that the apps/OS/HW/analog/distributed layering is a rich area for design tradeoffs, and we want an architecture that supports such designs.

The immediate application of this SW/HW layering and theory to biology is limited because “computing and communications” are only relevant to the extent that they support control.  Thus the cell is a cyber-physical system (CPS) where comp/comms sits between sensing and acting, and where action is the only thing that generates behavior on which natural selection can act. Nevertheless, what is relevant to CPS and cells includes the basic notion of layering and virtualization, the tradeoff between speed and flexibility, and the importance of the OS and “constraints that deconstrain” hourglasses.  Ultimately, control theory is central to a complete theory of architecture, but it is less familiar, and here it can be approached from a Turing perspective as an extension of the OS to support apps that involve control of (typically analog) external physical systems through sensors and actuators.
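To make "comp/comms sits between sensing and acting" concrete, here is a minimal sense/compute/act loop in Python. It is only a sketch under stated assumptions: the plant is a pure integrator, the sensor is ideal and noise-free, the controller is a plain proportional law, and the function name and parameters are hypothetical. Nothing here is meant as a model of a real cell.

```python
def simulate(setpoint: float, gain: float, steps: int, x0: float = 0.0) -> list[float]:
    """Discrete-time feedback loop: sense the state, compute a command, act."""
    x = x0
    history = [x]
    for _ in range(steps):
        measurement = x                            # sense (assumed ideal sensor)
        command = gain * (setpoint - measurement)  # compute (proportional law)
        x = x + command                            # act (assumed integrator plant)
        history.append(x)
    return history
```

In this toy model the error obeys e[k+1] = (1 - gain) * e[k], so the loop converges to the setpoint for 0 < gain < 2 and diverges for gain > 2. For example, `simulate(1.0, 0.5, 20)` halves the error at every step. The computation in the middle matters only through the action it produces, which is the sense in which computing here serves control.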

Architecture, tennis, and antibiotic resistance

As further motivation consider two familiar but very different (from each other and from IT) challenges: humans playing a game such as tennis and bacteria acquiring antibiotic resistance.  The features of the layers will be familiar even if much of the mechanism is not, or for humans is not even well understood.  For tennis, much of the basic reflex layer involved in balance, running, stopping, turning, basic swinging, eye tracking (the vestibulo-ocular reflex, VOR) etc is generic and not specific to tennis. A fit athlete will have these reflexes in place, implemented in a highly distributed control system running in parallel in the lower brain, spine, and periphery, and will take them for granted as they learn tennis or do almost anything else.  In contrast, some midlayer skills will include appropriate movements for groundstrokes and serve, and will require substantial tennis-specific training over years.  These skills will start as conscious and quite poorly performed activities but necessarily must become (and remain!) unconscious, automated, and distributed if the skills are to be competitive. Some strategic elements (e.g. which side to hit from and to, whether to charge the net, where to place the serve) will still involve conscious planning, but this too is only possible when the basic hitting skills have become automatic.

A player at almost any level can find others who are better or worse.  Against a worse player, lower layers are adequate and wins come automatically (choking is usually caused by conscious or emotional disruption of the automated skills).  To win against a better player, it may be necessary to use strategic planning, but in the long run, it will require further training to acquire (and automate!) the skills necessary, usually by watching others and coaching, rather than ab initio invention.  Thus for everyone from novice to expert, getting better always involves large amounts of horizontal meme/skill transfer and then “pushing down” in the stack to automate and improve speed at the expense of flexibility, since the tennis-specific skills so acquired will not help in other sports or activities (unlike the generic hardwired athletic capabilities in the lowest layer).

What we have described can be crudely viewed as having 3 layers. (This paragraph is almost pure speculation, but consistent with what I know.  Please correct.) The bottom layer is pure sensing, reflex, and action, is generic to any sensorimotor task and is fast but inflexible, distributed and parallelized.  Note this inflexibility is relative but not absolute. Muscle and bone are the least adaptable parts of this system, but we know they respond to training and can degenerate substantially in the absence of exercise. The top layer is conscious thought, and is more flexible, serial, and slow, but again relatively (it is neither infinitely flexible nor infinitely slow).  These bottom and top layers are observable to us because the first is where all the actions appear externally, and the latter is our internal awareness of these actions.

The middle layer is unconscious but highly trainable.  By most measures the middle layer contains the vast majority of the neural complexity, in terms of volumes or counts of cells, neurons, axons, synapses, etc.  It contains most of the neocortex, and most subcortical brain regions, including the cerebellum.  If the brain has an “OS” it resides primarily in this middle layer, and finding it might be hard but seems ultimately essential.  The lower reflex layer has complexity in the actuators (e.g. muscles and endocrine organs) and anatomy (bones, skin, other tissue) in addition to neural components.

More speculation: This 3 layer architecture has an “hourglass” not in complexity (which bulges in the middle) but in diversity.  We are most different from each other in our physiology and our highest level of conscious thought, and most alike in our shared OS in the midlayers. Across all mammals the lower layer physiology has tremendous diversity, and the top “executive” layer is also presumably very diverse (though we know little about the details of other animals’ conscious thought), but in any case “consciousness” is a thin (and perhaps overrated) veneer on the other layers.  I conjecture that the most shared elements across humans and also across mammals are in the midlayer OS.

To sharpen our research questions, how would we turn these conjectures into engineering design and scientific understanding?  What is the theory and technology needed to formalize this layering, analogous to what Turing and the theorists and engineers who followed have partially done for computing?  What relevance does this have for biology and medicine?

As a related example to motivate this last question, consider bacteria that find themselves in a hospital without the (physical layer) proteins to protect against the antibiotics in use, and lacking the genes to make these proteins, or the time to evolve them.  The only hope (and a common “choice”) is to acquire the necessary genes by HGT (Horizontal Gene Transfer), and fortunately for the bug (and not for us) they are available precisely because they have enabled survival of other bugs despite the antibiotic use. (It is true that the protein had to evolve at some point, but this is likely to have happened billions of years ago, with the gene only recently finding its way to this hospital. It also appears that for every antibiotic we know of, myriad genes exist somewhere in the environment that confer resistance to it.)  But acquiring the gene is not enough by itself, any more than is finding the right coach.  The cell’s OS control systems must be adequately (re)wired so that the gene is properly expressed and thus “pushed down” the protocol stack to the protein reflex layer to be functional.  HGT often includes control elements as well as protein coding regions, facilitating this process.

This HGT story sounds disturbingly Lamarckian, since the organism acquires new capabilities from the environment and passes them on to progeny, and the genes are there precisely because the antibiotics are.  But it is also consistent with Darwin and all but the most extreme conservative advocates of the “modern synthesis” (though like creationists they seem to spread like rabbits, or, say, antibiotic resistance).  Biologists also know far more about how this layered architecture works than they do about the brain, and it appears as though bacteria have a far more perfect (if simpler) architecture than does the brain or the Internet.

These familiar everyday examples of modern IT, playing tennis, and antibiotic resistance illustrate (I claim) universal laws and architectures, which we will explore more deeply next time. What is (I claim) universal and (somewhat unavoidably) taken for granted in both the doing and telling of these stories, as in our use of computers, phones, and the Internet, is the enormous hidden complexity in the layered architectures, mostly in the respective OSes but in the layers generally.  Layered architectures are most effective exactly when they disappear and everything just learns, adapts, evolves, and works, like magic.  Matches are won and antibiotics survived, and the architecture makes it look easy.  It is this hidden complexity that we must both reverse and forward engineer, with both theory and technology.

