Sufficiently Advanced Technology

Hurdles of Innovation

2024-03-09T00:00:00+00:00

If you’re reading this there is a better than average chance that you either are an innovator or will be soon. People who regularly attend conferences, read widely, and follow obscure and longwinded blogs tend to skew venturesome; active information-seekers with diverse social groups and information sources, possess breadth of knowledge, are more willing to explore and adopt new ideas, and are able to cope with higher levels of uncertainty. If you are an innovator then, according to current research, you are part of a vanishingly small demographic that makes up just 2.5% of the total population. The unique cognitive and social diversity of this group offers perspective and insight into ideas, opportunities, possibilities, and solutions ahead of the mainstream.

Perhaps you see the solution to a problem right now. Congratulations, you have an idea–an innovation–that just might make a material difference in the lives of a lot of people. How do you get the other 97.5% of the population to adopt your solution? Will your innovation take off by itself on the merits of its benefits? Unlikely. If the best ideas always won, I wouldn’t be writing this. Even when the idea or innovation has clear and obvious advantages, skeptics must be won over. Old, entrenched habits must be abandoned. Innovations must be evaluated, explored and increasingly accepted until they reach critical mass (only then will they take off). This process can take years or may never succeed at all. Your efficacy as an innovator depends on your ability to optimize and streamline this process, influence your peers, and ultimately effect change in the real world. There is an art–and a science–to the diffusion of innovations.

This feels like a vainglorious boast to actually write but, for as long as I can remember, I have had a mind for solutions. Given most sets of problems, I have long been able to quickly “see” a solution. I suspect it has something to do with my other, multi-decade, career because, in that world, you look at problems and impossibility very differently. This ability has driven my tech career quite far and led me to modest success in the software architecture space but, if I look back at my 25+ year tenure in tech, I see that–good ideas or not, solutions or not–I was rarely effective. It turns out that I spent many years overvaluing my technical skills and blaming others in lieu of a meaningful postmortem. However “smart” I thought I was, I still had a lot to learn.

Our technical skills are just the ante. They get us into the game but all they do is get us into the game. Everything beyond that point depends on everything else we bring to the table.

Of course, I had read Carnegie and Cialdini, two giants in the field of influence and persuasion, as well as “Fearless Change” and The Tipping Point yet my track record in leading technical change remained checkered and uneven. After a particularly disappointing change-effort, I conducted a personal retrospective and determined that, if I was to be consistently successful as an architect and agent for transformational change, deeper study was necessary. I turned to the definitive work, “Diffusion of Innovations” by Everett Rogers and went deep into that rabbit hole.

Rogers’ book is a large and dense textbook; a comprehensive academic study of how ideas and innovations spread in social networks. After two cover-to-cover readthroughs my copy is dogeared and heavily highlighted/annotated. I dug deeper into a number of his citations. I can say the content is excellent however it was not written for me but for current and future students of the theory. What follows is my attempt to distill everything I have learned into an approachable summary for software architects and technical leaders. Although it is far from comprehensive, my goal for this post is to make the process of innovation diffusion and leading change more understandable and actionable. Whether you are architect, a leader, an innovator, or just someone wanting to move up the value chain and be more effective day-to-day, let us explore this valuable topic together.

“Only 15% of one’s financial success is due to one’s technical knowledge and the other 85% is due to human engineering and the ability to lead people

-The Carnegie Institute

How Ideas Spread

In the spirit of Everett Rogers, I will continue the tradition of referring to any kind of change we would like to see take place as an “Innovation;” an idea, practice, tool, object, or technology that has some degree of benefit for potential adopters. Successful innovation has always been a mixed bag. Innovators in all fields and walks of life have struggled with getting good ideas adopted. This has led to close to a century of focused research and several thousand publications and papers on the topic. Despite the diversity of innovations and their contexts throughout history, organizations, and society; their similarities are strikingly consistent. Change is not a technical process, but a social process. Understanding the process, the variables, and becoming more skilled in dealing with people and navigating politics is unavoidable.

Innovations, by definition, are always perceived as new to the masses (even if they are not objectively new). Of course, introducing something new requires invoking some kind of change in people (their process, behavior, beliefs, habits, etc.). Different groups adapt to change at different paces. Despite the promise of an innovation, most respondents will harbor some amount healthy skepticism about whether an innovation is an improvement on what it will replace. Innovators are generally poorly positioned to win-over skeptics. In a social sense, they are often viewed as deviants from the mainstream norms and conventions of the team, organization, or other community. The very cognitive diversity that makes us skew innovative is our downfall.

“The most innovative member of a system is very often perceived as deviant from the social system and is afforded a status of low credibility by the average members of the system. This individual’s role in diffusion (especially in persuading others to adopt the innovation) is therefore very limited.”

-Everett Rogers - Diffusion of Innovations, Fifth Edition

The sphere of influence of a typical innovator is relatively small. An innovation, in the hands of a small number of innovators alone, will not take root. The innovation must be increasingly adopted. In practice, successful innovations spread from the innovator to a small number of early adopters. If the early adopters are sufficiently well-regarded and well connected; the innovation will continue to spread to a larger group, the early majority. It is typically at this point that the innovation will reach a critical mass and diffusion/adoption will accelerate. As the idea spreads and repeated confirmations of success/benefit are increasingly visible, the more skeptical late majority will begin to adopt and finally the most skeptical–the laggards–will adopt the innovation. The innovation diffusion process (when successful) consistently follows an s-shaped curve over time (as illustrated below).

Diffusion and adoption are initially slow but begin to pick up as the innovation spreads to early adopters accelerating rapidly once an early majority begins to form. Achieving that critical mass of 10-20% adoption is key. As you can see from the graph, this is where the change imitative becomes self-sufficient, spreading and accelerating on its own.

Perhaps the best illustration of this is the “Dancing Guy” video and the TED talk that analyses it. From one outsider dancing alone, to a first follower, to a critical mass, to a dance party.

The “S curve” is a generalization of cumulative successful diffusion of an innovation. There are no guarantees that an innovation will go from early adopters to early majority in any given amount of time (or even at all!). In most organizations, there is often a clock on reaching critical mass. Funding can run out; initiatives can be abandoned. There is a particular set of skills and principles necessary to navigate that early space and minimize the time necessary for the innovation to take off.

From Innovator to Change Agent

Assuming you have an innovation–a technical change or improvement–that you hope will diffuse; relying on early adopters to discover and adopt your innovation by chance alone is not a recipe for success. You need to become a change agent. Becoming a change-agent requires navigating a number of hurdles. It really boils down to three things:

The innovation and both its positive and negative attributes
Your ability to identify key individuals in the organization (Specifically Opinion Leaders and Champions)
Your relationships and your ability to influence those key individuals

Take any of these three away and your odds of success drop precipitously. Innovations consistently follow a common development process:

Recognizing a problem or need
Basic and applied research
Formalization
Packaging
Diffusion/Adoption
Consequences

In essence, given some problem, through some amount of research, the innovator develops and formalizes a solution which is then packaged in a way that it can be adopted. At this point, the diffusion process may begin and ultimately the innovation may be rejected, adopted, reinvented, or adopted then later discontinued. Steps one and two may take place out of order. It’s not uncommon for a solution to uncover a previously unrecognized problem or need (although it is equally common to see a solution invent a “need” by brute force rather than uncover one through serendipity–be cautious).

Hurdle #1 Perception of a Need

As innovators, we tend to have a pro-innovation bias. We are often techno-optimists, and our diverse knowledge and perspectives give visibility to many potential innovations. The key to success (and building credibility over time) lies in the ability to connect a potential innovation to a genuine need within an organization. Change and innovation purely for change’s sake is rarely a path to success and adoption. There must be a broad perception of value across both those individuals who have authority and influence to drive change and those who adopt it. The operative word here being perception. If we see value that others don’t, we won’t be successful; shaping perceptions is key.

Organizationally speaking, change equates to risk. Driving change requires communicating that the risk of inaction outweighs the risk of inaction. Human motivation generally falls into two distinct categories:

Toward Pleasure (the change will make something measurably better)
Away From Pain (the change will eliminate a major pain point)

The latter, away from pain, is universally a much stronger motivation.

Let me give you an example admittedly not related to innovation but one that illustrates how these two forces (towards pleasure/away from pain) prompt very different decision processes and how risk of action vs. risk of inaction play in: imagine it is a warm, sunny day and I am sitting happily in the hammock on my patio reading a book. I’m quite content exactly where I am, but then I hear the approaching jingle of the ice cream truck. Admittedly a Choco Taco would make my pleasant afternoon even better but obtaining one requires:

Leaving the comfort of my hammock where I have already done the work of settling in and finding the “sweet spot.”
Finding my wallet
Leaving my house
Flagging down the truck
Purchasing my ice cream
Returning to my house
Re-situating myself in the hammock

I must consider the fact that:

I am already comfortable and happy
I might not find my wallet
I might not have cash
I might not get outside in time

In short, the anticipated pleasure is hypothetical until I actually have my Choco Taco in hand (which is far from a sure thing). I’m going to spend more time considering my options and may decide that taking action is not worth it.

Contrast this with another scenario. I’m in the kitchen making my signature risotto. My priority and perceived need is constantly stirring and adding liquid to the rice to achieve the perfect creamy/starchy texture. I won’t step away from this task for the ice cream truck or anything else… then I placed my hand on the hot stove. The pain is immediate and undeniable. I will instantly recoil away from the surface and, depending on how bad the pain continues to be, take further action to soak my hand in cold water and do anything else necessary to prevent the burn from becoming worse. The pain is my top priority and strong enough for me to abandon my risotto. The risk of inaction clearly outweighs the risk of action; there is no question that change is necessary. In the previous scenario, the decision process is not nearly so cut-and-dried.

Returning to our topic of innovations, we can see the importance that perception of a need and type of benefit confer to the decision/adoption process. Preventative innovations, speculative innovations, technological and process improvements that bring gradual benefit over time can be a tough sell, particularly in the absence of a perceived need (I didn’t even know I wanted a Choco Taco a minute ago). Regrettably, sometimes an innovation cannot be successfully adopted until the pain of the need is first felt. We, of course, know that “an ounce of prevention is worth a pound of cure” so we, as change agents, must become adept at communicating the need. If, some time before the ice cream truck appeared, my mind had somehow become focused on the idea that this sunny day would be even better with a Choco Taco in hand, I would be more apt to take action when I heard the ice cream truck approaching. This suggests that some part of our effort must be spent building awareness of the problem before presenting the solution; what Robert Cialdini calls “Pre-Suasion”. The difficulty of this hurdle is compounded by the fact that different individuals will perceive and respond to different aspects of both the problem and solution. There are multiple “stories” about the problem/need that must be told for each audience.

Hurdle #2 Finding a Solution

In today’s era of technological marvels and miracles, it seems there is a solution for everything. At first glance, this seems easy, but a potential solution must be evaluated holistically. It must not only meet the needs/solve the problems of the organization, but it must also be an innovation within reach of those who must adopt it. As a practicing software architect, I often have what I believe is the optimal solution for a given project, only to realize that this first iteration of the innovation will either involve a much larger scope of change (more risk) or compromising the innovation for compatibility (more risk). The Tailor-Made Architecture Model is an approach to optimize how we design, evaluate, and communicate architecture with this reality in mind. This aspect of the innovation process is often much more challenging than it may first appear, and a would-be innovator must be skilled in navigating this process.

Hurdle #3 Packaging for Adoption

Even once the innovation is well-defined, the potential benefits understood, and the formalized solution optimized for the organization; more work is necessary. Typically, some amount of effort is necessary to “package” the innovation into a form that is ready to be adopted. Often innovators can be overly excited, with a strong desire to unleash the innovation on the world, but it is important to proceed carefully here. Virtually any change invites some disruption, exemplified by the “J curve.”

There is often an expectation and implicit assumption (by both innovators and potential adopters) that the innovation will trace a steady improvement over time but, almost universally, the innovation brings a striking initial disruption that must be overcome. Many promising innovations are abandoned during the j-curve dip. Proper packaging of an innovation is a crucial step. The less potential adopters have to learn and the fewer ingrained behaviors/habits they need to change, the better the odds of success for the innovation.

Packaging is the process of removing as much initial adoption friction as possible. Packaging may involve building a POC, reference implementation, or tooling. It may involve training or coaching/mentoring. There is no one-size-fits-all approach.

Hurdle #4 Reaching Critical Mass

Even if properly packaged, the innovation must find a critical mass of early adopters. You will not succeed if you try to “boil the ocean” and endeavor to convince everyone at once. The vast body of work on diffusion theory consistently shows the universal spread of ideas and subsequent acceleration from the small group that is more open to new ideas to the larger groups that tend to be more skeptical. Regardless of the change agent’s efforts, the early majority’s adoption decision and investment will depend on seeing the success of the early adopters (and so on, up the curve). Focusing communication and persuasion efforts on the right early adopters is a better use of time and energy. Think of early adopters are the local gatekeepers of new ideas and technologies and they bridge the gap between innovators and the larger community within the organization. Early adopters are generally considered a bellwether for the success of the innovation within the organization. Not all early adopters are created equal; you’re specifically looking for “Opinion Leaders” who are respected by their peers and are considered more integrated within their social systems than Innovators yet are not too traditional/resistant to change.

Some time ago, I wrote about the my experiences with the Dvorak keyboard, an optimized and human-centric redesign of the keyboard layout that improves speed, efficiency, and reduces RSI. It solves a problem (yet the perception of the need is not widespread), it has been carefully designed and formalized, and has even been packaged for adoption (every major OS ships with a Dvorak layout option, alternatively designed/labeled keyboards are available, hardwired keyboards are available) yet the innovation has been lingering in the “early adopter” phase of diffusion for close to a century. Every innovation that fails to reach critical mass will either die on the vine or languish in obscurity.

Hurdle #5 Managing the Diffusion Process

Once the diffusion process “takes off” there is this assumption that success is inevitable however, without careful planning and management of this process, the innovation may still fall flat. First, we must accept the very real possibility that the promised benefits don’t materialize. In this case the innovation is typically discontinued or abandoned. Our only real option in this case is to perform a postmortem on the change effort and learn what we can to improve our mental models and process.

We must also be aware of the phenomenon of reinvention. In tackling hurdle #2, we do our best to fit the innovation to the organization but, in most cases, the organization will modify the innovation during adoption. This happens for a number of reasons:

Innovations that are hard to understand are more likely to be re-invented
Reinvention can occur owing to an adopter’s lack of detailed knowledge about the innovation.
An innovation that is a general concept or a tool with many possible applications is more likely to be reinvented - see the web
When an innovation is implemented to solve a wide range of users’ problems, reinvention is more likely to occur.
Reinvention may occur because a change-agent influences its clients to modify or adapt an innovation
Reinvention occurs when an innovation must be adapted to the structure of the organization
Reinvention may be more frequent later in the diffusion process as later adopters profit from the experience gained by earlier adopters.

Although we lose some control of the direction, often innovation reinvention is a positive thing. Reinvention correlates positively with adoption and sustainability, however reinvention can also transform diffusion into dilution. Arguably the widespread diffusion of concepts such as agile and DevOps have led to such a high degree of reinvention that they no longer address the original problems/needs that led to their development and what they have become have increasingly little value in most organizations. A strong, unifying focus on the problem and the why is key to preventing dilution during diffusion.

Understanding the Attributes of an Innovation

Clearing any or all the hurdles identified is not a guarantee of success, nor is reaching critical mass. Even successful widespread adoption is not the finish line. There are no guarantees in the diffusion of innovation, but we can optimize the process to best position ourselves for success. This begins with five key attributes of an innovation that must be explored, understood, and optimized. They are:

Attribute	Definition
Relative Advantage	The degree to which an innovation is perceived as better than the idea it supersedes.
Compatibility	The degree to which an innovation is perceived as being consistent with the existing values, past experience, and needs of potential adopters.
Complexity	The extent of the difficulty or friction adopters experience in attempting to adopt an innovation
Trialability	The degree to which an innovation may be experimented with on a limited basis.
Observability	The degree to which the results of an innovation are visible to others.

Each of these attributes correlates either positively or negatively to the rate or likelihood of adoption and are common to virtually every innovation. Looking at any innovation through the lens of these variables can prove illuminating. Sometimes the innovation is not perceived as advantageous, sometimes the advantages are clear but people are too set in their ways or can’t make time for change. Sometimes the change is just too complex. Let’s look at these in the context of a common example:

Some time ago, industry analyst Gartner made a prediction that by the end of 2019, “90% of organizations who try to adopt the microservices architecture will fail; they will find the paradigm too disruptive.” I believe this is a perfect example of the challenges with the diffusion of innovation. There is this idea that the perceived superiority of the idea will win out and everyone will “jump right on board” however, in reality, adoption of microservices requires many radical changes.

Gartner is not explicit about what is meant by “they will find the paradigm too disruptive” but if we look at this through the lens of Rogers Innovation Diffusion Theory, we can explore the attributes of the innovation:

Relative Advantage

Executed well, microservices promise a great deal of relative advantage in terms of system quality attributes such as agility, scalability, elasticity, fault-tolerance, etc. Additionally, there is much social prestige to be had (often, in fact, this becomes a significant driver). Given industry giants such as Amazon, Netflix, Google, and others adopt microservices; the perception is microservices is a “real” architecture and confers significant bragging rights.

When talking about objective advantage, perception is key; feelings are facts. The degree of objective advantage doesn’t matter so much. It’s all about perception - does the individual perceive the innovation as advantageous? In the case of microservices, the perception of relative advantage in the mind of a would-be innovator is so strong that many pursue this approach as folly. In fact, quite frequently, microservices are a solution to a problem an organization simply doesn’t have (although there are often other factors). There will be members of the organization who are skeptical of this type of change as they see the risks associated with the change and aren’t quite so swayed by the hype, the “true believers” of the perceived advantage often drown out these voices.

Executing this architecture well requires major organization-wide changes, some of which are incredibly disruptive. If the “why”–the relative advantage–is not widely shared in an organization, it can be hard to overcome innovation compatibility and complexity.

Innovation Compatibility

Microservices is a domain-partitioned architecture. In other words, microservice boundaries are defined by business domain and business context boundaries. To put this architecture in practice we must accept what Melvin Conway first postulated in 1967–and has been since demonstrated again and again in studies–that “organizations designing systems tend to produce designs that mirror their own communication structures.” In other words, the technical architecture of a system reflects the social boundaries and interactions within the organization that created it. For most organizations, entire divisions or the entire company must be restructured. This begins with an extremely expensive and lengthy analysis process and ends with major organizational changes. Well established communication lines are massively changed, areas of focus and expertise radically shift, teams might be broken apart and reformed. This is, by definition, a very incompatible change. If the relative advantage is not black-and-white clear to all involved, such an innovation will be abandoned or reinvented (typically the latter).

For developers and teams, several more incompatible changes follow. Decades of convention and “best practices” that have been drilled into their heads are suddenly invalidated. To achieve the goals of independent release cycles and “extreme decoupling,” Microservices play by a very different (and seemingly foreign) set of rules.

Would-be innovators who charge head-first into a microservice strategy rarely consider the innovation compatibility attributes.

Old ideas are the main mental tools that individuals utilize to assess new ideas and give them meaning. Individuals cannot deal with an innovation except on the basis of the familiar.

For the majority of organizations and developers adopting this strategy for the first time, there is very little that is familiar in this innovation. In addition to low compatibility, this innovation is extraordinarily complex.

Innovation Complexity

Microservices are incredibly difficult to implement well. Distributed systems require vastly different mental models and disciplines that do not typically come naturally. Long-held best practices such as “don’t repeat yourself” (DRY) are now anathema to good software. Teams must work independently which often requires interface-first development; it is not easy to begin thinking in terms of abstractions. Teams must also develop new skills surrounding automation and DevOps.

Andrei Taranchenko wrote extensively about the amount of churn, change, and friction microservices can introduce in his blog post “Death by 1000 microservices”.

This compatibility aspect is a significant hurdle to Innovation Diffusion. This also begins to speak to the phenomenon of Innovation Reinvention.

Innovation Reinvention

This is the degree to which an innovation is changed or modified by a user in the process of adoption and implementation. This is another aspect of innovation diffusion that is rarely considered. Once an innovation begins to diffuse, it is rarely within the control of the initial innovator.

Two forces are at play here. First, many adopters want to actively participate in customizing an innovation to fit their unique situation. In other words, they want to put their name on it in a sense. Second, incompatible innovations are often reconciled with the familiar. Remember:

Old ideas are the main mental tools that individuals utilize to assess new ideas and give them meaning. Individuals cannot deal with an innovation except on the basis of the familiar.

The fine-grained and extremely decoupled nature of microservices is rarely compatible with established values and norms of most developers. The more coarse-grained and greatly simplified “mini-services” or “macro-services” that emerge in the “90%” Gartner talks about is a reinvention that emerges in response to the compatibility and complexity aspects of this change. While some innovations provide ample room for reinvention, microservices offer comparatively few dimensions for reinvention that don’t negatively impact other teams and the rest of the system. In short, microservices architecture requires teams optimizing for organization-level goals and optimizing for team-level goals will lead to a “tragedy of the commons” and a failed/discontinued implementation.

To manage innovation reinvention, the innovator/change-agent must think about how this innovation will spread through the organization. How can they scale themselves; how can they identify influential champions and experts who can help steer the early majority during the innovation diffusion process?

Innovation Observability

At a high level, the degree to which the results of an innovation are visible to others. The easier it is to see the results, the more likely they will adopt. When a team or organization sees someone else experiencing success with an innovation, the more likely they will adopt. Observability has another important component here in managing reinvention and avoiding micromanaging the innovation. It is much easier to establish “systems” than to attempt to micromanage innovation diffusion. Observability is key in the setting of feedback loops and improving the odds of a successful innovation.

Innovation Trialability

Microservices are often initially treated as all-or-nothing affairs where the innovation is framed as radical, overnight change. Innovations that can be trialed will be adopted more quickly. Rogers gives the example of farmers adopting new hybrid corn and trying a small amount on limited acreage and he goes on to say:

The trialability of an innovation, as perceived by the members of a social system, is positively related to its rate of adoption.

This reduces the perceived risk of the change and allows for the innovation to progressively become more compatible to the organization and teams.

Notably the innovator/change-agent must consider these aspects up-front and navigate this space carefully. A more constrained “trial” of the architecture, perhaps splitting out one or two microservices from a monolith would provide an opportunity for a more limited and managed trial. Also trialing a more coarse-grained, “mini-service” might provide a path with more relative compatibly and allow architecture and development to iterate on the optimal architecture from a complexity/compatibility standpoint while serving as an observable success for the early majority to observe, follow, and adopt.

Optimizing Innovation Attributes

If our goal is to be successful as innovators and influential as change-agents, we must think about how we can optimize these five attributes central to the success or failure of the innovation. This must take place both as part of the solutioning process and the design process. This is true regardless of how the diffusion decision will be made (i.e. optional, collective, or authority mandate).

My friend and fellow software architect, Daniel Tippie, will frequently point out that there are five elements necessary to effect change in an organization. They are:

Authority
Accountability
Responsibility
Know-How
Will

Rarely will any single individual possess all five. On the spectrum of roles in an organization, from business-focused to technically-focused, those with authority and responsibility will typically occupy the business-side of the spectrum. Know-how and will must exist broadly on the technical side. That leaves us, the innovators and would-be change-agents. We may have know-how and will for change, but that is insufficient; those actually adopting/implementing the innovation will also require know-how and will. Success requires optimizing the innovation attributes from the perspective of both sides of the organization and the respective roles.

Optimizing Relative Advantage

Obviously the higher the perceived relative advantage of the innovation, the higher the likelihood of adoption. The degree of relative advantage may be measured in economic terms, but social prestige factors, convenience, and satisfaction are also important factors. Relative advantage, however, is always in the eye of the beholder. Savvy and adroit change agents will look at relative advantage from multiple angles. When presenting the innovation to potential early adopters and champions, you must be able to succinctly communicate both what the innovation is and why they should care. For technical innovations, I usually summarize this as:

What is the thing?
What does it do?
Why should I care?

The precise framing of this message will vary depending on the audience.

Authority-Driven Decisions

This type of decision typically leads to the fastest rate of diffusion as, generally, far fewer individuals are involved in the actual adoption decision. In this case, the relative advantage of the innovation must be framed in terms the authority (typically a business person) understands and resonates with.

A common error many would-be change-agents make is communicating the value of the innovation in terms of benefit to themselves. e.g. Exclaiming to the business “Hey, if we bought this expensive piece of software, my job would be so much easier!” That may be true, but the business doesn’t necessarily care about making your job easier. Figure out what they do care about and frame the benefits from that perspective. Maybe the consequence of your job getting easier is a higher speed-to-market, or lower defect escape rate, or whatever OKR/Metric they care about. The chances are your job being easier will result in one or more business benefits. You must find these and communicate them.

Authority buy-in and mandate only satisfies three of the five necessary change elements from the Tippie model; know-how and will are still necessary, meaning packaging/training (know-how) and will (relative advantage) must be broadly communicated. For this reason, further relative advantage optimizations must be made for the broader audience and some care given to communication channels.

Collective Decisions

Collective decisions are made by consensus among members of a team or organization. Generally, these types of decisions take longer to reach a point of consensus but the resulting decision is often more sustainable. These types of decisions usually begin with a common awareness of the problem or need and the opportunity for members of the organization to exercise some agency and autonomy in their direction elicits a sense of “ownership” in the final decision which contributes greatly to the element of will.

Generally, an innovator-turned-change-agent will have some idea of a solution in advance, but getting broad-scale acceptance of the innovation can be like herding cats. Optimizing relative advantage in this scenario often begins with a focus on the problem. Before simply pitching your innovation, create an environment where members of the community can contribute their perspectives on the problem/need; let them communicate their pain points. The optimum communication of relative advantage is focused on what matters most to the adopter. Therefore, pay close attention to the who and what of the pain points (particularly from influential members of the group) and use this information to effectively frame relative advantage.

Individual/Optional Decisions

This type of innovation decision involves the most people and is often the slowest to diffuse. This type of innovation decision involves the broadest audience so it can be useful to consider how to segment the various audiences based on common pain-points. Like the collective decision, this positions you as the change agent to speak meaningfully and impactfully about the relative advantage of the innovation. Operating effectively, however, involves the change-agent scaling themselves. It can be useful to identify influential members within each community/segment, tailor the message to them, and empower them to spread the message with their peers.

Given the broad need for know-how and will, even in authority and collective decisions, this skill and practice should be kept in mind and care and thought put into communication channels.

Optimizing Compatibility

A strong belief in the benefits of an innovation often leads would-be innovators to assume that the practices it seeks to replace are so inferior they can be completely dismissed yet history repeatedly shows the folly in this thinking. Adopters can only deal with an innovation within the context and basis of what is already familiar. You must ask yourself how compatible your innovation is with the existing ways of working and existing mental models.

The truly revolutionary ideas (relatively speaking) simply cannot be introduced all at once. As Nikola Tesla’s character in the 2006 film, “The Prestige” says,

“The world only tolerates one change at a time.” -The Prestige

Personally speaking, one of the hardest realities I have had to accept in my career is meaningful change often must take place in stages. On a case-by-case basis, the innovation must be evaluated to determine if the big innovation can be broken down into smaller, more compatible changes that pave the way toward the final innovation.

In 2017 I wrote a conference talk entitled “Persuasion Patterns” which included a case study of a Continuous Delivery transformation within an organization. The full scope and scale of the innovation was far too much to succeed. Ultimately the innovation was reduced into a large number of bite-sized innovations that could incrementally be adopted. Over time, the broader ideas became more familiar, the innovation more compatible, and the innovation ultimately succeeded.

Every innovation must be ruthlessly examined to identify potential areas of flex or deferral. Perfection is the enemy of progress.

Optimizing Complexity

New ideas that are simpler to understand are adopted more rapidly than innovations that require the adopter to develop new skills and understandings. Similar to how we must approach compatibility, ruthless examination must take place to identify where ideas can be simplified in the short term. We must look for every opportunity to reduce adoption complexity.

I find that taking time to create learning guilds, book clubs, or regular lunch-and-learn sessions are an ideal forum to gradually introduce ideas into the organization well-before pitching them as innovations. This creates a foundation of fertile soil for future learning on a topic to take root and blossom.

It is also important to have “skin in the game.” Taking the time to build POCs, tooling, reference implementations can aid greatly. Automation is another avenue to reduce friction points. Finally, consider a training strategy. You may possess the skill of building effective and engaging training but also consider how you can scale yourself.

Optimizing Trialability

To many would-be adopters and champions, change almost universally equates to risk. An innovation that can be evaluated on a trial basis is one with significantly lower risk. The value of presenting a trial should not be underestimated. As a personal example, I have always struggled with my weight to some extent and I am currently down 70lbs (~32kg) from my highest weight. At one point on that downward journey, my doctor recommended increasing caloric and fat intake which was definitely not compatible with my decade or so of diets. Then she said “We’ll do it for six weeks. There’s no amount of damage you can do in six weeks that you can’t undo in that amount of time or less.” It was only then that I agreed. It was the right approach (and I’ve maintained my goal weight for a number of years since).

I once used this same line on a manager when proposing a significant change to our SDLC (“we can undo anything in six weeks”). Resistance was overcome and the change was successful.

Think carefully about the opportunities to trial an innovation. First the limited perceived scope of the risk will win over some who are skeptical, but also consider the trial as part of building your growing body of early adopters. Relatively earlier adopters of an innovation perceive trialability as more important than do later adopters. (since we can see others success later in the adoption curve). Peers are, in essence, a vicarious trial.

Optimizing Observability

“You can’t improve what you don’t measure.” -Peter Drucker

Each adopter group has a different threshold for uncertainty. Early adopters are comfortable with more uncertainty, the late majority is comfortable with significantly less. As the innovation makes its way through the S curve, uncertainty about the innovation should be steadily decreasing. Increasingly, potential adopters should be seeing that:

The innovation is successful
The problem is tractable
The benefits are materializing
This is better than the status quo

The challenge for the change agent is making this data visible. In the past I have attempted to find metrics germane to various groups I wish to influence in the organization. On the business side I have looked at OKRs, financials, and other relevant metrics. For technical groups I have measured a downward trend in out-of-hours support calls, escaped defects, or other pain points.

The key is to figure out what you can measure and show to reduce uncertainty and communicate this.

Relationships and Influence

By now it should be clear that skill in communication, building relationships, and persuasion are critical. How we manage communication, and the extent of our efforts have significant impact on the rate at which the innovation is adopted.

Variables in determining the adoption rate of innovations

To maximize your efforts as a change agent and optimize influence and communication, I recommend reading Influence, The Psychology of Persuasion and Pre-Suasion by Robert Cialdini, along with the classic How to Win Friends and Influence People by Dale Carnegie. These are three valuable books with much to teach. Perhaps I will write a future post distilling their teachings into this topic.

In summary, well before trying to influence anyone, begin to build positive and productive relationships as widely as possible. Be the first to offer support in any relationship and always give more than you ask for. Build good friendships and positive working relationships. Learn to communicate effectively and persuasively.

Wherever possible when communicating with a group or committee, let members of this committee propose their own solutions. Where competing (but inferior) solutions are proposed, ask questions rather than contradict. Allow shortcomings of competing innovations to be discovered by–rather than dictated to–members. Preserve collective ownership of the idea rather than seeking credit.

Conclusion

This post has become significantly larger than I anticipated. There is so much more I want to write but that must wait for a follow-up piece. The takeaway, perhaps, is simply that becoming a change-agent who makes innovation happen is a long game. There are no shortcuts, there are no tricks. It requires a unique set of skills, foresight, and planning to become effective. While we many never master these skills, even bare proficiency will make us more effective in whatever role/title we hold now and in the future. Being able to lead technical change will also result in a much more satisfying career and life.

As a practicing software architect, I have long since learned that, without these skills, I cannot be effective. I find these skills and practice to be of equal importance (if not, even more so) to my hard “architect” skills. The more time I spend in this industry, the more I find this is true in all roles and cases, regardless of title or seniority.

In short, this is a topic I am very passionate about. If you read this far, please know I believe we are kindred spirits and if you ever need advice or assistance, you’re free to call on me anytime.

CD & DevOps for Relational Databases Part II - How

2024-02-21T00:00:00+00:00

In part I we explored the history and goals of DevOps and the Agile movement and the motivations for maturing your DBOps practice. In other words, part I addresses “mindset.” Let’s now look at this in practice and get into the “how” of DbOps, addressing the remaining “skillset” and “toolset” topics. Most crucially this involves not only understanding the tools you’ll be working with but also understanding the database. Poorly planned database changes have the potential to cause extended unwanted downtime and, if they break compatibility with the previous version of your app, can complicate rollbacks. Knowledge of RDBMS internals is crucial to moving “offline-breaking” database changes (database changes that require downtime and break compatability with previous versions of code) to “online-compatible” changes (database changes that do not introduce downtime and can be remain backward/forward compatible with code changes).

Change Management Approaches

The modern CI/CD toolset for databases largely fall into two categories of approach.

Desired State Configuration (DSC)
Database Migrations

Each has their trade-offs that must be evaluated but in summary it really boils down to how much control and visibility you want to delegate. As a (former) DBA, I remain deeply skeptical of DSC - let’s look at why.

Desired State Configuration

Here’s a real-world problem I recently faced. I built and maintain a SaaS CRM that many business rely on daily to run their business. Behind the scenes is a MySQL database that not only powers my application, but holds their data–the lifeblood of their business–in its metaphorical hands.

This started when I received a support ticket notifying me that email messages are being truncated in the history. The culprit was a single emoji character in the body of that message that was causing MySQL to choke. Everything following the emoji was discarded, resulting in some amount of data loss.

The root problem is a disagreement about what UTF-8 means. I understood UTF-8 to mean the entire range of the Unicode charset, but to MySQL it means a subset of the UTF8 charset; specifically MySQL’s implementation of UTF-8 can only address up to three-byte Unicode characters while the emoji are typically 4-byte. I’m not alone in my rage…

" />

The fix is easy enough, I just need to change the character set for that table to be the ‘real’ UTF encoding. The Data-Definition Language (DDL) looks like this:

ALTER TABLE emails_text CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

So I run that script in my dev database, test my code, and prepare to commit and deploy my fixes. If I have a DSC tool, it will generate the change script by comparing the state of my dev database against a snapshot of prod. It will generate a delta script containing each atomic change and write those somewhere in my source tree.

Simple, right? Find the answer on SO, make the change in Dev, and my DB tooling will script the change and execute it as part of my deployment process.

Except, now my customers hate me and they left my platform. If I was an employee I might be reprimanded or fired but since it’s my company I’ll out of business.

The simplicity of this change was a lie. Migrations introduce a little bit more visibility and control into the process.

Migrations

Migrations will sequentially apply individual change scripts that, generally, you write and fall into two subcategories: Code migrations and SQL migrations. As a seasoned SQL developer and former DBA, I have generally been deeply skeptical of ORMs, the SQL they generate is generally garbage compared to what I could write but they’ve gotten significantly better over the years. As a developer, I can’t overlook the convenience and efficiency they bring in certain cases. For better or worse, I now use EntityFramework as my ORM in my CRM. EntityFramework offers a migration capability to scaffold, apply, and rollback changes in my dev environment and pipelines. The underlying process is similar to tools like FlyWay, RoundHouse, and a variety of others.

If I were to use EntityFramework Migrations?

Well, first I need to create a new migration. I would open the developer console and type something like:

dotnet ef migrations add AlterEmailsTextCharset

This will generate a new migration file named something like 20240223123456_AlterEmailsTextCharset.cs in the “Migrations” folder in my project. It will contain an Up() method where I can specify the specific changes I want to apply using the EF domain-specific language (DSL) or as SQL (which makes the most sense in this case.) It will also contain a Down() method to reverse the change if needed. It is in the Up() method where I will add my DDL.

protected override void Up(MigrationBuilder migrationBuilder)
{
  migrationBuilder.Sql("ALTER TABLE emails_text CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;");
}

protected override void Down(MigrationBuilder migrationBuilder)
{
  // If needed, provide a reverse operation here
}

In my pipeline script, I can execute the necessary migrations by calling

dotnet ef database update

This will execute the code in the Up() methods. EntityFramework Migrations manages state in a table called __EFMigrationsHistory and use this history to only apply new changes.

Another popular tool, Flyway, takes a slightly different approach to migrations. The process looks like this:

In your Flyway project directory, create a new SQL migration file. The file should follow the naming convention “V{version}__{description}.sql” where “{version}” is a unique version number and “{description}” is a description of the migration. For example: V1__AlterEmailsTextCharset.sql

Then I will edit this file by opening the newly created SQL migration file and writing my SQL script:

ALTER TABLE emails_text CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Then I would use the Flyway command-line tool or build tool plugin to run the migrations. The flyway configuration would need to be specified, including the connection details and migration location. On the command line tool, it might look like this:

flyway -url=jdbc:mysql://prod-db:3306/your_database -user=${secret} -password=${secret} -locations=filesystem:/path/to/migrations migrate

By adopting migrations, I’ve taken more deliberate control over the change scripts and the order they run in. That said. I’m still out of business (or fired). Why?

DDL - The Rest of the Story

The main table affected by this issue happens to be the single largest table in my database by multiple standard deviations, the emails_text table. The table itself contains about 3,500,000 records occupying about 170 GB of disk space. By today’s standards, it is a small-to-medium sized table but it is still big enough to sink my business.

To apply such a change, the DBMS must first allocate an exclusive lock on the table. This will prevent any other process from reading or writing to the table until the DDL has finished executing. For the purpose of this post, I cloned my production DB and ran this script directly and killed it after 18 hours. I have no idea how long this will actually take to run in my production environment.

If I put this change in my pipeline and deployed to production then, at first, users will notice that certain pages that include email history will stop loading since their reads will be blocked by the exclusive table lock. In the background, a crucial component of the application–sending and receiving emails–will grind to a halt. Read/Write requests will queue up until the connections are saturated and, at this point, my entire application will go offline and remain offline until the DDL finishes executing. My application is architected as a Modular Monolith, meaning there are clear domain-driven module boundaries, but the underlying database server is shared. You may be in a more distributed environment, so that blast radius might be restricted, but the implications remain the same.

Admittedly the gravity of this kind of change may be obvious at design-time, but changes as simple as adding or changing a column can have similar downtime consequences.

Managing Database Changes

We cannot escape the reality that many database changes can break production for long periods of time. The first and most important step is to shift this feedback left. Blue/green and Canary deployment strategies work well for stateless code changes but neither will give you the necessary feedback in time to avoid catastrophe when offline database changes are involved. We need a new, final quality gate to break the build before going to production. Based on your application’s SLA, determine the maximum acceptable downtime (My CRM’s SLA is 99.9% uptime, which means less than 90 seconds of downtime per day is tolerable.)

This quality gate will deploy to a pre-staging environment that contains a clone of the production database. When the DB Migration stage runs, my pipeline monitors the database for extended blocking and will kill the process after 90 seconds which breaks the build. In practice, this will look different on different DBMS platforms but the key thing is the script can run for longer than 90 seconds, it just can’t block for more than 90 seconds.

A broken build on the DB migration stage is my early warning that additional care and skill must go into planning and designing the changes. When you find yourself in this situation, you have two available options:

Option 1 is to pair with–or assign the task to–a DBA. An experienced DBA will not only know the risks of any given change, but they will know the strategies to mitigate this risk. The trade-off for this approach is your overall agility is reduced as the DBA becomes a new bottleneck in the process. There’s a reason many developers jokingly say that what DBA really stands for is “Don’t Bother Asking.”

Option 2 is to level up your DB skills. To be truly agile and effective in our practice of DevOps, we need more than proficiency with the tool (Dev) we need proficiency with the database (Ops). Since you’re still reading, this will be my focus.

Database Change Categories

Database changes fall into a two-dimensional matrix. On one axis in compatibility, on the other is availability.

Compatible Changes

A compatible change is defined as one that can be applied to the database without breaking compatibility between the database and the current or future version of the application. Compatible changes are often–but not exclusively–stateless.

A stateless change is one that doesn’t modify database state. Adding an index, for example, or changing the definition of a stored procedure or UDF are generally stateless changes. They may also be compatible so long as the interface of the UDF or procedure doesn’t change.

A stateful change will modify the data in some way. Adding a column to an existing table (or adding/populating a new table) will technically be stateful, but compatibility will depend. Simply adding a column will not generally affect the existing version of the app (so long as you aren’t using SELECT * FROM... anywhere… and you’re not, right?)

Compatible changes bring two large advantages. First they can be deployed independent of the application code. We’ll talk about why this is useful shortly.

Second, they generally don’t need to be rolled back should the application deployment fail.

Breaking Changes

A breaking change is any change that necessarily requires a code change to be applied in lock-step and one that must be rolled back if the application deployment fails. Breaking changes are typically–but not exclusively–relating to database refactoring.

Perhaps your had a field called Name but later decided it would be better to break that into FirstName and LastName and your script adds the new columns, migrates the data into the new columns, then drops the Name column. Or, perhaps you had a table that contained the columns PrimaryEmail AlternateEmail1 and AlternateEmail2 and you decided that you want to normalize that table and support more than three email addresses by creating a child table, migrating the data, then dropping the old columns.

In either case, the breaking change must be applied, then the code deployed. If the deployment fails both the code and the database changes must be rolled back to return the system to it’s last known good state.

Online & Offline Changes

A change is considered “online” if the change can be applied without impacting availability. An Offline change is one that that will take long enough to violate the your SLA.

Just Enough SQL Internals

Understanding when and why a change will become an offline change requires understanding a few under-the-hood details of how a DBMS works.

Physical Storage/Organization

The fundamental unit of storage for both tables and indexes in SQL databases is the page. Pages occupy a fixed amount of space on disk and in memory whether they are full, empty, or somewhere in between. I/O operations also typically take place on the page level. In the absence of a clustered or primary index, the pages are organized in a heap. Once an index is created, the index (or the table, if the index is a primary/clustered) is structured as a B-Tree with the leaf nodes being the pages themselves representing the data stored and the intermediate nodes supporting fast access to the index keys. At the leaf-level, pages are also double-linked lists to facilitate range-scanning operations.

The key distinctions between a primary/clustered index and a secondary/non-clustered index are:

A primary/clustered index is keyed on a subset of columns but also contains all non-key columns in the row. Effectively this index is the table.
A secondary/non-clustered index contains just the index key and a pointer to the primary/clustered index (or a rowId in a heap)
Because the primary/clustered index is the table itself, there can be only one.

The analogy I like to use for primary/clustered indexes is that of a phone book. It is keyed on last name, first name but contains all data within each page.

Index Overhead

Although indexes can introduce efficiency into record seek operations, they introduce overhead. The most obvious overhead is the need to keep secondary indexes in sync with the table itself. An INSERT or DELETE operation on the table naturally requires similar operations to take place on all related indexes but those secondary operations can become quite expensive, relatively speaking, and introduce long-term performance consequences.

Remember that a page is a fixed size. Should a row need to be inserted in the middle of an already full page the database must perform a Page Split.

A new page must be allocated (this may involve growing files)
50% of the contents of the full page must be moved to the new page
The new data is written to a page
This is repeated for every other affected index

An I/O operation that would normally be a few dozen bytes begins to require several kilobytes of I/O very quickly. Put more directly, page splits introduce a 100x-1000x increase in IO for write operations while reducing concurrency and throughput. The consequences scale quickly. This is the short-term impact.

Long term, page splits degrade performance and resource utilization. Remember that pages occupy a fixed size in memory and on disk; and that I/O is almost always a page-level operation. If you have 10,000,000 half-full, 8k pages on disk that need to be read into memory, then your server did 40GB of extra, random physical I/O to read nothing. That is also (potentially) 40GB of empty space in the buffer pool (server memory). Generally indexes should be carefully designed, monitored, and regularly rebuilt or defragmented to reduce this long-term overhead.

Lock Compatibility

Every operation on a database table requires a lock of some kind. The overhead of a single page-split causes individual transactions to take longer with the database holding locks (typically update or exclusive locks) longer than we would ideally like for optimal concurrency, the typical scope of these locks (page or row-level) limits contention. Contention can further be reduced if a connection opts to perform “dirty reads” if throughput is more important than consistency. Dirty reads, however, still require a lock on a row, table, or page; just one that is “compatible” with other locks on the resource. At a minimum, the “dirty” read query still must allocate a “Schema Stability” lock. In other words, “you can change data while I read, you can delete data while I read, but you can’t change the table while I read.” Schema changes require a “Schema Modify” lock which is incompatible with any other kind of lock. While a SCH-M lock is held on a table, all other queries will be blocked until the lock is released or their connection times out.

While the overhead of a single page-split is rarely palpable, schema changes often involve reorganizing every page in a table which can be incredibly expensive on large tables. Altering a table to change the character set will require rewriting every single page of the table and the indexes, several hundred GB of I/O over a period of hours while holding an exclusive SCH-M lock. What’s worse, attempting to cancel this operation will require a lengthy rollback operation.

Strategies for Reclassifying Changes

The successful adoption of DBOps as part of your DevOps strategy relies on moving as many changes as possible into the Online-Compatible quadrant.

Offline Compatible migrations are low risk but deployments are typically constrained to maintenance windows which reduces agility. Offline breaking changes are the most risky. Not only is availability already a concern, but rollbacks may also be time consuming. These must be avoided unless absolutely necessary. The good news is that, generally, breaking and offline changes can be written as online compatible changes.

Offline Compatible to Online Compatible

The example of my emails_text table is a change that is compatible (my ORM and application doesn’t care if the table stores UTF8 or UTF16mb4 - it’s just text to my application). Any kind of ALTER TABLE statement will first require Schema Modify lock (or equivariant) to be acquired on the table. All other lock types are incompatible meaning any other attempts to access that table will be blocked. Under the hood the storage engine of the DBMS must do a great deal of work to reorganize the data across the pages.

Making this an online change requires we avoid the long-duration Schema Modify lock. There’s no way to specify an ALTER TABLE with nolock but we can redirect our locks elsewhere to avoid contention. Consider this approach:

Make a new table with the appropriate schema

CREATE TABLE `emails_text_migration` (
`email_id` INT UNSIGNED NOT NULL,
`from_addr` varchar(255) DEFAULT NULL,
`reply_to_addr` varchar(255) DEFAULT NULL,
`to_addrs` text,
`cc_addrs` text,
`bcc_addrs` text,
`description` longtext,
`description_html` longtext,
`raw_source` longtext,
`deleted` tinyint(1) DEFAULT '0',
PRIMARY KEY (`email_id`),
KEY `emails_textfromaddr` (`from_addr`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Next we need to populate this new table. Copying hundreds of gigs of data all at once will throttle I/O on the server (while holding at least a shared lock) so we need to think about two things:

How to manage concurrency while copying the data
How to keep these two tables in sync.

--This is probably a bad idea
INSERT INTO emails_text_migration
SELECT * FROM emails_text

For managing concurrency, I find it best to batch the reads/writes. For keeping both tables in sync, we must come up with a strategy. If you want to trigger a DBA, use the word “trigger.” Triggers have a bad reputation both because often they are a hack and because they cause code execution beyond your explicit instructions. That said, they can be useful and this is one of those cases.

My emails_text folder is pretty much a write-only table so a trigger similar to the following would work:

DELIMITER //

CREATE TRIGGER `emails_text_copy_trigger` AFTER INSERT ON `emails_text`
FOR EACH ROW
BEGIN
INSERT IGNORE INTO `emails_text_migration` VALUES (NEW.email_id, NEW.from_addr, NEW.reply_to_addr, NEW.to_addrs, NEW.cc_addrs, NEW.bcc_addrs, NEW.description, NEW.description_html, NEW.raw_source, NEW.deleted);
END; //

DELIMITER ;

Now any new writes will automatically copy over to my new table. Next I want to copy rows from the old table to the new table in batches. Determining optimal batch size requires some experimentation but 1,000-10,000 seems to be a good range to experiment with. My table primary/clustered index key is a monotonically increasing integer which makes this fairly easy, but use whatever key ranges make sense for your purposes - just remember to try to stick to sequential I/O on the primary/clustered index.

-- Get the maximum email_id value from emails_text
SET @max_email_id = (SELECT MAX(email_id) FROM emails_text);

-- Loop to copy rows in batches
SET @start_email_id = 1;
SET @batch_size = 5000;

WHILE @start_email_id <= @max_email_id DO
INSERT IGNORE INTO `emails_text_migration`
SELECT * FROM `emails_text`
WHERE `email_id` BETWEEN @start_email_id AND (@start_email_id + @batch_size - 1);
SET @start_email_id = @start_email_id + @batch_size;
END WHILE;

Although the row range of this script is not guaranteed to include every single row, the trigger is catching records that are written during the migration.

Once the migration is complete, it’s a simple case of swapping out the tables. This is the only point in the process where an application-blocking lock is acquired, but the duration is very short. The migration step would look something like this:

START TRANSACTION;

-- Acquire exclusive locks
LOCK TABLES `emails_text` WRITE, `emails_text_migration` WRITE;

-- Rename tables and validate counts
RENAME TABLE `emails_text` TO `emails_text_old`, `emails_text_migration` TO `emails_text`;

-- Validate counts (optional)
-- You can compare row counts between emails_text and emails_text_old here

-- Drop the old trigger
DROP TRIGGER `emails_text_copy_trigger`;

-- Optinally create a new trigger to copy inserts/updates/deletes to emails_text_old
-- (You need to write this trigger similar to the previous one)

-- Commit the transaction
COMMIT;

-- Release locks
UNLOCK TABLES;

Likewise, a rollback is very straightforward. Just swap the two tables back.

START TRANSACTION;

-- Acquire exclusive locks
LOCK TABLES `emails_text` WRITE, `emails_text_old` WRITE;

-- Rename tables back to their original names
RENAME TABLE `emails_text` TO `emails_text_migration`, `emails_text_old` TO `emails_text`;

-- Drop the new trigger on `emails_text`
-- Recreate the old trigger on `emails_text`

-- Commit the transaction
COMMIT;

-- Release locks
UNLOCK TABLES;

There are some optional steps here, but the key is the ability to make a significant and time consuming change without impacting performance or availability.

Online Breaking to Online Compatible

What about a breaking change, similar to the database refactoring examples above?

perhaps you had a table that contained the columns PrimaryEmail AlternateEmail1 and AlternateEmail2 and you decided that you want to normalize that table and support more than three email addresses by creating a child table, migrating the data, then dropping the old columns.

Again, we can break this process up:

Create the child table
Create the trigger
Migrate the data in batches

At this point, the code that will work with the new tables can be released, but a simple rollback in feasible. At some point in the future, the old columns may be dropped but can be left in place to ensure compatibility in the event of a rollback.

Other Strategies for Phased Changes

Whether I’m building a monolith or a microservice, I like to use the repository pattern. A single class for each domain entity whose single purpose is talking to the database. My repository will handle I/O database I/O and map database entities to the application object model equivariant. This introduces a layer of abstraction between my relational model and my object model. In this way, I can make an online compatible change to the database and deploy this separately and confidently to production without changing application code or behavior. With my new data model in place, I can make a subsequent change and deployment that contains the large changes to the rest of my application.

By breaking these changes down into smaller, bite-sized steps I can reduce my deployment risk and ensure rollback paths every step of the way.

What about NoSQL?

About a decade or so ago, there was a spike in interest in “schemaless” databases. These are useful tools, but not a panacea.

The reality is, whether your application has columns or documents, you have a schema and schema changes must be managed.

Imagine this CRM was built on top of a schemaless database. Perhaps, at one point, address was simply a string, then it became an object.

Then it became an array of objects

No alter tables, right?

Except there’s still as schema - it’s just inconsistent now. We still have to migrate. The problems exist but the options are different

Options:

Leave the schema inconsistent (Thus, writing code that knows how to handle both address and addresses)
Migrate all documents when the schema changes
Migrate on Demand
1. As a record is loaded, make the change
2. Inactive records are never updated

In other words, relational schema or not, we want to think about design and we will have to manage migrations at some point and in one form or another.

Conclusion

DevOps is a valuable practice and, if adopted mindfully, can massively increase overall agility. Agility within databases introduces challenges, but the problems are absolutely tractable. They key is while the tools are valuable their utility is constrained by the skill of the hands that wield them. If you’ve read this far, you already know more about Databases than 90% of developers and you’re better positioned to get faster feedback and remove more bottlenecks in your software delivery process. It’s not always easy but the benefits are vast.

General Best Practices

Practice Continuous Integration
Practice Continuous Delivery
DB Deployment Quality Gate
Break up Big Changes
Avoid Leaky Abstractions
Use Architecture Unit Tests

PostScript: Emoji Support

It took a few weeks of effort and planning, but the change was deployed safely during peak season with zero downtime. The performers are happy.

Remember, the most important and valuable feedback is customer feedback. Investment into pipelines, automation, and quality gates enables me to get working software in the hands of my customers faster and thus feedback on the value of my software faster.

“Our highest priority is early and continuous delivery of valuable software” -The Agile Manifesto

CI/CD, DevOps, and DBOps requires some up-front investment, but sometimes you need to go slower now to go faster later. 10 years into the life of this product and still deploying several times per week with 100% confidence is a good feeling. This agility didn’t come from a framework, or a checklist, or a blog post; it came from a commitment to process improvement and regular action-oriented retrospectives. I’ll say it again: “Don’t do agile; be agile.”

CD & Devops for Relational Databases Part I - Why

2024-02-17T00:00:00+00:00

Everyone seems to want agility, and the DevOps movement has provided tools and practices to enable this. While the tools available today are wonderful, there remain unique challenges when it comes to delivering code changes that also include database changes.

I’ve been building database pipelines on-and-off for over 20 years now. The tools have may have changed but the core principles have not. It would be trivial to write a talk, essay or blog post that was just a tour of syntax and an entertaining summarization of the documentation. You don’t need me to read the docs only to regurgitate them for you into prose or PowerPoint. Instead, I’m going to take a holistic approach that encompasses mindset, skillset, and toolset. What I hope will make this work stand apart from similar content out there is I will be writing from the perspective of a (recovering) DBA and share vital insights that are more relevant than ever.

The Reluctant DBA

I never wanted to become a DBA, that was never a career goal for me. In fact, in high school I took a career aptitude test and one of the top 5 options it spat out was “Database Administrator.” I immediately crossed that off my list because it sounded terrible yet, a mere six years later my career my official title was Database Administrator and that was my job. Although I am not a DBA anymore, the skills I developed have been my ace-in-the-hole for my entire 25 year career and that will continue for the foreseeable future. What I will teach you in this post will be your ace-in-the-hole as you drive more agility in your teams and organizations.

My first job involved building a data-intensive client-server application. Out of convenience we placed a lot of the business logic in the database layer. There are, of course, notable trade-offs to this approach but in this context it was generally a worthwhile trade-off. The two driving factors were as follows:

First, it was more efficient at that scale. Pushing computation closer to the data induced network efficiency; virtually every domain workflow involved making an atomic change to database state that involved inserting, updating, and/or deleting multiple records. It just didn’t make sense to pull all that data to the client, perform a bunch of manipulations to that data, then push it back when a single call to a stored procedure would accomplish the same thing without latency and consistency issues.

Second, it eliminated a lot of day-to-day deployment complexity. With a fat-client installed on dozens or hundreds of computers it was generally easier to patch a stored procedure once than update all of those clients. Client deployment didn’t go away, but they became relatively rare.

The consequence of these facts is I became reasonably skilled with SQL. As a language, SQL is relatively simple; the syntax can be learned in a matter of hours but to be truly effective, the paradigms must be understood as well.

Because I divided my time almost equally between the code and the database, management and administration just became part of my job. It was a time when the stack was simple enough that one could style oneself as a “full stack developer” with sincerity.

One or two jobs later, I found myself in an organization where my initial model no longer fit. We were still building fat-client applications but the original developers and current maintainers were much more comfortable in the code space; the database side was simply a necessary evil. In that organization, I was not only most proficient with the DBMS, but I was also perfectly happy to work in that space. I became the de-facto subject matter expert (SME).

A few years later I found myself emigrating back to the United States at the worst possible time. There was a tectonic shift in the development landscape and the coding skills in the language I had immersed myself in for the past six years had become obsolete. I had a few choices:

Lie on my resume and misrepresent my fluency with the new language
Join a company with a large legacy codebase to replace the developers who fled to pursue more interesting and modern pastures.
Swallow my pride and press the reset button on my career, accepting a junior developer position.

I didn’t like any of the options. I realized that the only current skill I had was my SQL skills. “It’s too bad I can’t get a job as a database developer.” I thought to myself. It turns out there was a great need for this as many organizations had developers who were great with code, but lacked deep SQL skills to round out the necessary skillset to succeed.

The environment I found myself in was radically different. These databases were orders of magnitude larger than those I had previously worked with and operated at a scale and throughput that we would describe today as “web-scale.” Every millisecond mattered and uptime was everything. At one time that organization was an outlier, but this is increasingly becoming the norm in our industry. That DBA job 20 years ago taught me that, to succeed in this space, it was necessary to not only understand the database language, syntax, and paradigms; but also possess sufficient understanding of the internals of the database. It was there that I learned how complex database changes could be–and how to manage that.

This is what I aim to teach you in this two-part series.

DBOps and the Impedance Mismatch

Database and application development have always been at odds, particularly at the intersection of OOP and relational paradigms. The meteoric rise in popularity of ORM tools has succeeded in hiding this problem for a time, but the rise of DevOps and the increasing need for business and product agility is changing the equation. We can no longer paper over the fundamental difference between these two worlds. To effectively practice DevOps, developers must have sufficient working knowledge of both development concerns and operational concerns.

ORMs tell us a comforting lie–that persistence is just an implementation detail that can be safely abstracted away. This abstraction is fine until the fundamental differences between these two worlds rear their ugly heads and bite us in the ass in the form of performance bottlenecks, data consistency issues, or deployment friction. ORMs have been so convincing, however, that there is a tantalizing allure of the fantasy that we are just one more tool or abstraction away from eliminating database change deployment friction.

Difference #1 - Code is stateless, data is stateful

Code is inherently stateless. Containers may be easily started, stopped, and replaced with few side effects. State is generally held in the database. Many database changes, however, are stateful and database changes are often a necessary component of code changes. Despite all the advancements of developer tooling over the two decades since I wrote my first database migration pipelines, the real challenges haven’t gotten much easier. The tools that exist today wish to tell us another comforting lie - that database changes can be managed as easily as code changes but, without the addition of a skillful hand, these tools are simply a loaded gun that will shoot you in the foot when you least expect it. The consequences range from the minor to major–with the major consequences being significant downtime or data loss.

Data > Code

In my career, I’ve done some dumb things. So many, in fact, that it’s difficult to know where to begin. Some 20 years ago (almost to the day) I started a new job at a company and on my first day I was given a tour of the server room. Somehow we ended up behind the racks, and I noticed these structures surrounding the power cable on each server. I (incorrectly) deduced that these things prevented the power cable from being unplugged. “Oh, this is cool” I said, as I clutched the body of the power cable “they must prevent the cord from getting yanked out.” Before my new manager could say anything or stop me, I tested my (faulty) hypothesis by pulling the power cable right out of the server. The server immediately lost power (I guess I was the very first chaos monkey). It was an app server, and for the next several minutes, the apps it hosted were unavailable. It was an inconvenience for the 300 or so employees that were relying on that app’s availability, but within a few minutes everyone continued as though nothing had happened. My boss laughed and it never came up again (except, of course, for the remainder of my tenure where I was regularly reminded of my stupidity “Hey, remember when you unplugged a server on your first day?” Good times.) In short, of all the dumb things I’ve done in my career, this ranks somewhere near the middle.

Arguably, the worst things I’ve even done have resulted in data loss… sometimes crippling data loss. A decade later I was working for a company with a fairly mature CI/CD pipeline for code - but database changes were still run manually. I received an email from my boss’ boss asking me to clean up some spurious data for a particular client. I misread the email and started to delete data from the wrong table; the “master” table. There were cascading deletes in place to maintain referential integrity (my boss liked his referential integrity like he likes his restraining orders… enforced) so each row deleted resulted in hundreds of other rows being deleted across dozens of tables. This delete was going to take all night. My SQL-fu was strong, of course, so I whipped up a quick administrative script that would delete all the related records in batches, then ultimately delete the core records. Now my all-night delete would be completed in about an hour or so; I let it loop and churn. About an hour later I get an email asking if I had completed that delete yet and I replied “It’s still in progress, I expect it will be done in about 20 minutes.” Seconds later I looked up and boss’ boss was standing next to me asking “What the fuck are you doing?!” I was quite proud of my delete script, so I showed him. I ended up deleting almost all of the company’s #1 client’s data. Although we had backups, a great deal of data was gone for good–there was no bringing it back.

I’ve deployed defects to production, I’ve brought down production apps, but it was never more than an inconvenience; data loss, on the other hand, is almost always a disaster.

Difference # 2 - The customer’s data is more valuable than your code.

Embracing the DevOps Mindset

I find DevOps fascinating. Rarely in our industry do we fundamentally rethink how we do things and make difficult changes for the better; generally we just throw more code at the problem. Done right, the practice of DevOps is transformative, but most organizations do just enough to check the box just under “agile” and continue business as usual. You know what I mean by that, right?

For a lot of organizations, some CIO somewhere saw the cover of a magazine that asked is large, intimidating letters “Are you agile?” and this CIO gulped and said “OMG - NO!” The next thing you know, everyone is sent to the two-day training but the only thing that really changes is you get your orders standing up now instead of sitting down. The “Agile Transformation” looks something like this:

“So we’re doing sprints now instead of milestones… But we need to spend some time building a backlog. Why don’t we start with a ‘Sprint 0’ to build that backlog.”
“Ok, we’re starting sprint 1 and actually building this thing.”
“It’s the end of sprint one, I’ve finished all my stories, but there’s no time for testing left in the sprint… Do they roll over? What do I do while I’m waiting for QA? I guess I’ll just start the next stories.”
“Ok, I’m getting defects from last sprint… Do I fix those or continue on the stories I’ve committed to? I’ll schedule those in the next sprint I guess”
“New Plan, we’ll have a stabilization sprint at the end to address all the defects that come in.”
“We finally cut a release - Hand it over to ops, while we have a party!”

Renaming milestone to “sprints” and the Requirements and Analysis phase to “Sprint 0” and the QA phase to “stabilization sprint” does not, an agile team, make. Nothing has changed except the company paid six figures for a worthless certificate.

So now the devs are at the release party and a call comes in after they are all 3-5 beers in. Production is down. The devs are flabbergasted, but shrug and say “it worked on my machine - must be a problem on your side.”

There’s nothing agile about any of that. There are developers in industry who weren’t alive when the agile manifesto was signed and, for many, all they’ve ever known is the worthless bureaucracy marketed as agile. Agile is not SAFe, agile is not Scrum, or any of the other frameworks, processes, or anything else the slick salespeople are pedaling. Agile is the 12 principles, agile is fast feedback. If a framework helps follow these ideals, great! But a framework is not a replacement for the core principles of the agile movement.

“Don’t ‘do’ agile, be agile.”

Agile had it’s genesis in the late 1990s when folks like Kent Beck began building the first unit testing tools. Getting feedback on the quality and correctness of code in seconds from automated tests instead of days and weeks from QA was game-changing. The feedback didn’t stop there, this practice also gave fast-feedback on his thought processes and mental models. Beck once said “It makes problem solving so easy, it felt like I was cheating somehow.”

The ideas grew, blossomed, and intermingled with other forward thinking folks. TDD might give us fast feedback on the code and thought process but we couldn’t get fast feedback on the actual value of software unless we put it in front of a customer.

“Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.”
-Agile Principle #1

The earlier we get feedback, the faster we can course correct, the faster we can deliver valuable software. The side effect of this is as things change, we can respond to market uncertainty and market changes very quickly. TDD was a step in the right direction, but there were many bottlenecks.

Bit by bit we have chipped away at those bottlenecks. Over the next decade or so, we continuously looked for ways to get feedback faster. Automated unit tests tell me I changed behavior as soon as I make the change, static analysis spots errors, broken windows, and vulnerabilities in near real-time, Integration tests give me feedback on breaking or broken dependencies, continuous integration give me daily feedback on clashes and issues with other developer’s work. Build pipeline automation enables this feedback at the team or repository level which, in turn, begins to enable Continuous Delivery… In theory.

The ongoing work is to evaluate every error and failure, posing the question “Is there is any way this failure could have been caught sooner.” Could a prod defect been caught with an integration test? Could an integration test failure been caught with a unit test? If the development process is plotted on a horizontal axis of time, where development time is left, and production is right; how can we keep shifting that feedback further to the left.

Enter DevOps

DevOps has been a natural extension of the shift-left process. How do we get earlier feedback on deployment and environmental issues without actually taking down production in the process?

We had virtualization, and later containers. Self-contained artifacts that behave identically everywhere. To shift the feedback left, the role of operations had to shift left - hence DevOps. Developers became responsible for defining that runtime environment. We got increasingly better tools and built them into our pipeline. If something was going to fail due to environmental issues, we’d get that feedback immediately during development. We could deploy with much more confidence. Moreover, ops was no longer the bottleneck. Dev teams could deploy at any time without manually coordinating with ops, and quickly get the most valuable feedback of all, the customer feedback.

We are now very very good at deploying code. In my pipelines, I can deploy a code change within a matter of minutes. I have a deployment strategy in place that will automatically rollback a faulty or failed deployment. I can deploy code changes confidently with zero downtime. Stateful database changes remain a large problem. There may be a rich tapestry of tooling for DevOps, but DBOps remains a major challenge to this day and one, for which, we cannot simply throw more code at the problem; code may be part of the solution, but it is far from a complete solution.

If you intend to actually practice DevOps, you must know about the Ops part to be effective.

Difference #3 - YOU are now the “reluctant DBA.”

To Be Continued…

Coming up next, we will look at the “how” of DBOps. Automation is a key, but the tools alone won’t save you. To fully adopt this practice, a baseline of knowledge is necessary. Part two will cover “just enough” database internals, popular DBOps tools and strategies, and introduce techniques to manage and avoid disruptive changes.

Stay tuned, and thanks for reading!

Third-Way Web Development Part II - Hypermedia 2.0

2024-01-31T00:00:00+00:00

This post picks up from part I which broadly tells the story of where we are currently and how we got here; ultimately introducing HTMX as the “third way.” It serves as the background and context for this post. If you haven’t already read it, go ahead… I’ll wait.

Now that the background is out of the way, let’s look in depth at HTMX. Rather than the usual quickstart or toy examples, we’ll be looking at this in the context of a real production application already in the wild. We’ll modernize a legacy MVC application, examining the UX without and then with HTMX.

Modernizing Mago

Mago is a SaaS CRM built for professional entertainers. Although it’s life began over a decade ago, it boasts a modern and powerful set of features that many professional entertainers depend on to run their business. It is largely architected as a Modular Monolith implementing the MVC pattern. This architecture provides a balance between simplicity and agility with a clear path to easily decompose into a distributed architecture should changing demands require. On the client-side is some jQuery (which was the style at the time).

For an MVC app, it is reasonably optimized. The backend framework already supports bundling and minification of resources. The dashboard has a raw page weight of 11kb (gzipped).Although this the app uses several UI plugins, they are loaded at runtime in a single minified bundle. The core js weighs in at 369.8kb (gzipped) and the core CSS is 181kb (gzipped). Static resources are available through a CDN fronting the server cluster and a sensible cache policy is already defined (the most efficient network request is the one that doesn’t happen at all). Un-cached, the dashboard total page weight is currently 1.4mb (gzipped) and cached that page weight is 43.2kb(gzipped).

Mago is a good candidate for the hypermedia/htmx model of web app development. Having a single source of truth with respect to state (the server) is more than adequate; duplicating state locally (by way of a state management library like redux or vuex) adds very little benefit given the UX challenges that HTMX address, and greatly simplifies development and maintenance. Client-side code execution is typical of a relatively simple CRUD web-app; UI libraries are used, of course, but there is no need for massive amounts of complex client-side code that a mainstream js framework would organize and optimize. Modernizing this app with HTMX will result in a much lower time-to-value for my customers and, since I’m the sole (part-time) developer/maintainer, this approach keeps maintenance simple and manageable over the life of the product. The trade-offs make sense.

The goal is to achieve that dynamic and responsive user experience that we’ve come to love and expect, without a massive rewrite and the complexity and inevitable tech debt that comes with mainstream js frameworks. jQuery is currently powering asynchronous loading of some elements but implementation is uneven. Some operations will reload individual components and others trigger new navigation or a full-page refresh. Either way, imperatively manipulating the DOM with an ajax library is messy and verbose; HTMX will eliminate that by extending HTML to offer this behavior natively.

Performance and UX

Upon logging in, the performer is presented with a dashboard.

Almost all html resources are actually composite resources. At a minimum, the browser is building a composite of the HTML, CSS rules, and scripts. In addition to those typical embedded resources, we can think of each dashlet on this dashboard as an embedded resource. Mainstream js frameworks adopt the component model, where a page (typically considered a page-level component, or view depending on your conventions) is composed of many components. The components themselves are composed of even more components. This approach maximizes modularity, structure, and reusability. This is a useful approach to modularity on the web and, fortunately, it is very compatible with most MVC frameworks. My backend framework already supports the concept of “views” and “partials” and this application already follows this modular approach without any refactoring prerequisites.

Friend and fellow speaker, Scott Davis, has produced many creative and insightful talks including “The Wrong Kind of Fast” where he argues for evidence-based architecture and questions the conventional wisdom that mainstream frameworks are truly fast (as well as whether they offer the right kind of fast). The image below is from his slide deck.

No matter how we build a web app today, we need to consider both user-perceived performance (first meaningful paint) and total wall-clock performance (time to interactive).

If we begin with a Web 1.0 model (pure HTML/js - no HTMX, no ajax) of the Mago dashboard, this entire page would be composed on the server side before being delivered to the client.

...
 class="row">
     class="col-xs-12 col-sm-7">
         class="page-title txt-color-blueDark">
             class="fa-fw fa fa-home"> 
            Dashboard 
            > My Dashboard
        
    

    @{ Html.RenderAction("FinancialSummary", "Home"); }

...

A number of external calls to the database (and other services) are necessary to compose this page. On my dev machine, the total time necessary to build this page is several seconds. Time to first paint is very high; unacceptably high by today’s standards. However, the time from first paint to interactive is just 266 milliseconds.

Action	Time
Parse HTML	11ms
Parse Scripts	217ms
Render	38 ms
Paint	4ms

This rapid time-to-interactive is a small comfort to the user. The user experience is… not great.

The framework approach is to deliver something very rapidly. On the very first load, a bare skeleton that contains a minimal layout and a loading indicator while the main bundle is downloaded and executed. Depending on the bundle size (see examples illustrated below) this will largely depend on the user’s connection speed.

At this point, the bundle is parsed and executed and the first contentful paint appears. Additional “Loading” placeholders are placed where the content will ultimately be displayed. Meanwhile, in the background, one or more API calls are made to fetch the data. However long the underlying database queries take will ultimately determine the latency here. Notably this bottleneck introduces substantially identical latency as the queries in Web 1.0 Mago. While this approach feels faster, it is similar–if not slower–in total time to interactive when compared to Web 1.0 Mago. Yet the user experience is significantly better.

However, on a second run, the framework will likely leave Mago in the dust. The bundle is already loaded, the code is already compiled and executing, and the data may already be cached locally. The user experience is excellent! Can HTMX compete with this?

Incremental Improvements

HTMX asserts that modern web apps demand more hypermedia controls than just anchor and form tags. HTMX also asserts that web apps demand more surgical precision of what is loaded and when. Within a given app, we should be able to load and parse core content only once (as the framework approach enables). The first HTMX property we will introduce is hx-boost.

HX-Boost

Typically we optimize the loading lifecycle by moving external scripts to the end of the page (or use async/defer properties on script tags). Typically the browser will begin to parse the HTML but pause when it reaches a script tag to load that script before continuing. We can get to first meaningful paint quickly by loading the scripts last. For this example I’m going to buck convention move my scripts to the tag for reasons I’ll explain shortly. In the process, I’m going to include the HTMX library in my page. It can be referenced via CDN:

But for reasons I’m going to be hosting this directly in my application.

My goal is here is to perform a single application bootstrap, similar to the framework approach. Since the content of the tag is common across my application, I don’t want to load/parse CSS and global JavaScript on every navigation. I can accomplish this easily using the hx-boost property like so.

...
 id="main" role="main" hx-boost="true">
...

Adding the hx-boost property to a tag will “boost” all child anchors and form tags to use ajax. This is a form of progressive enhancement that degrades gracefully if JavaScript is not enabled (they’re still normal anchors or forms). What happens under the hood?

The user clicks a local hyperlink
HTMX intercepts the request and performs an ajax GET of the requested resource
HTMX takes the response and parses out the body tag and the title tag.
HTMX replaces the innerHTML of the body with the response, updates the title appropriately
HTMX pushes the full url into the browser history

Of course, there are a number of “smarts” embedded in this approach. A request that should only update part of the page won’t be boosted, external links aren’t boosted, and, of course, this behavior can be selectively enabled or overridden at the form or link level with the properties hx-boost="true" and hx-boost="false" respectively.

This one-liner saves a couple-hundred milliseconds on each request, reducing time to first paint when navigating the application.

You can see a request was boosted from the devtools:

Additionally HTMX will include a number of request headers including Hx-Boosted: true should you need to know a request was boosted on the server side.

Improving UX by Reducing Time-To-First-Paint

We’re going to borrow another trick from the framework playbook by separating data retrieval from the first paint, asynchronously loading content. The magic of the modern UX that began in 2004 continues to be driven by asynchronous interactions. Let’s look at the first data-driven component on the page; the income and pipeline chart.

Currently we’re populating that on the server-side at template render-time.

...
 class="row">
     class="col-xs-12 col-sm-7">
         class="page-title txt-color-blueDark">
             class="fa-fw fa fa-home"> 
            Dashboard 
            > My Dashboard
        
    

    @{ Html.RenderAction("FinancialSummary", "Home"); }

...

On my dev machine, with a remote dev database, that operation takes 1.3 seconds. That’s a significant amount of time where the performer is waiting for anything to happen on their screen. The user-perceived performance will be much higher if this happens asynchronously. It should be noted that this approach breaks the idea of progressive enhancement. If JavaScript is disabled, the user won’t ever see real content (of course, this is also true of any mainstream framework). Keep this consideration in mind if graceful degradation is important to your app. Although mainstream frameworks don’t offer any kind of graceful fallback, it would be trivial to leverage the request headers and optionally perform the appropriate server-side rendering if the Hx-Request or Hx-Boosted headers aren’t present.

For now, let’s replace the RenderAction() with placeholder html:

 id="FinancialSummary" 
     class="col-xs-12 col-sm-5">
     id="sparks" class="">
         class="sparks-info">
             Booked Income 
                 class="txt-color-blue">$Loading...
            
             class="sparkline txt-color-blue hidden-sm">
                100,100,100,100,100,100,100,100,100,100,100,100
            

        
         class="sparks-info">
             Total Pipeline 
                 class="txt-color-greenDark">$Loading...
            
             class="sparkline txt-color-greenDark hidden-sm">
                100,100,100,100,100,100,100,100,100,100,100,100
            

We have now have a meaningful paint without delay, but we want to request the real resource when the page loads. HTMX allows you to turn any element into a hypermedia control by adding properties to the element. Let’s turn the containing div into a hypermedia control with a hypertext reference of /home/financialsummary. We want this hypermedia control to be to be triggered when it loads, and the response targeted to replace just the contents of the div.

First we add the hx-get property to the container div to indicate this element makes a GET request to the given URL. By default HTMX triggers requests based on the click event with some exceptions. By default, input, textarea and select elements trigger on change and form elements trigger on the submit event. There are a number of other triggers we can define, including load. Finally we want the response to be put directly into the div that triggered it but this is the default behavior unless it is overridden.

Our new placeholder code looks like this:

 id="FinancialSummary" 
     class="col-xs-12 col-sm-5" 
     hx-get="/Home/FinancialSummary" 
     hx-trigger="load">
     id="sparks" class="">
         class="sparks-info">
             Booked Income 
                 class="txt-color-blue">$Loading...
            
             class="sparkline txt-color-blue hidden-mobile">
                100,100,100,100,100,100,100,100,100,100,100,100
            

        
         class="sparks-info">
             Total Pipeline 
                 class="txt-color-greenDark">$Loading...
            
             class="sparkline txt-color-greenDark hidden-mobile hidden-md hidden-sm">
                100,100,100,100,100,100,100,100,100,100,100,100
            

The load event fires once the DOM content is loaded, triggering an asynchronous GET request to retrieve the FinancialSummary resource, loading it directly into the div #FinancialSummary. Async partial loading and a better user experience achieved while writing precisely 0 lines of custom JavaScript. This is just a capability baked into HTML now.

Additionally, we get everything induced by the architecture of the web. Recall this architecture has a cache constraint; now that FinancialSummary is a standalone resource that can be requested independently, we can apply a caching policy. Particularly since this info doesn’t change very often. Since this financial summary is displayed all over the app this will massively improve performance across the application. These two changes improve time to first meaningful paint by 1.5s across the entire application.

We can repeat this approach for every dashlet on the dashboard so all dynamic content is loaded asynchronously.

Lazy Loading and Loading Indicators

The notifications panel, the task list, and the Upcoming Automation dashlet all typically load “below the fold” on desktop (and everything but the calendar appears below the fold on mobile). HTMX allows us to lazy load components/embedded resources and this is what we want to do for any potentially off-screen elements. In this case we will include our content skeleton in the dashboard page resource, adding the hx-get property, but our hx-trigger will be (or, at least, include) revealed.

For the FinancialSummary, we made the containing div the hypermedia control that responded not to a user interaction but a page event. Let’s look a slightly different approach applied the Upcoming leads dashlet.

Like the summary, we want a rapid time-to-meaningful-paint so we will continue the pattern of putting placeholder content in the dashboard skeleton that will be asynchronously hydrated. Remember, if progressive enhancement (and graceful degradation) is important, you may want to optionally include page content if HTMX Headers are not present in the request. Alternatively, I will be describing partial swaps later in this post.

I’ve defined a “loading” class to apply to the loading dashlet for the initial loading state:

.widget-body-htmx-loading::before {
    content: url(/content/img/ajax-loader.gif);
    text-align: center;
    font-weight: 700;
    font-size: 16px;
    color: #fff;
    display: block;
    background: rgba(255,255,255,.4);
    height: 100%;
    z-index: 1;
    width: 100%;
    position: absolute
}

.widget-body-htmx-loading:hover {
    cursor: wait !important
}

My Skeleton looks like this:

 class="widget no-padding widget-color-green"
     id="ActiveLeadsWidget" role="widget">
     role="heading">
         class="widget-icon"> 
             class="fa fa-leaf"> 
        
        Active Leads
         class="widget-toolbar" role="menu">

         class="widget-loader htmx-indicator" id="GigsLoading">
             class="fa fa-refresh fa-spin">
        
    
     role="content">
         class="widget-body no-padding" id="ActiveLeads">
             id="LeadsWrapper" class="form-inline no-footer">
                 class="table no-footer" id="AciveLeads" role="grid" aria-describedby="ActiveLeads_info">
                    
                         role="row">
                             rowspan="1" colspan="1">Event
                             class="sorting_disabled" rowspan="1" colspan="1">"Fee"
                             class="sorting_disabled" rowspan="1" colspan="1">
                        
                    
                    
                         class="odd"> valign="top" colspan="3">Loading
                    
                
                 class="dt-toolbar-footer">
                     class="text-danger">Loading...
                

I’ve already overlaid my loading style on the div with id ActiveLeads but we need to fetch this data. If this were a component in a framework, we would hydrate this component using the component lifecycle events (or any of the myriad other component approaches that have come and gone). In the library-driven ajax approach, we might do something like this.

$(document).ready(function() {
    $('#LeadsWidgetContainerDiv').load('/Lead/Active');
});

But part of the mental shift that comes with walking away from the direct DOM manipulation approach is to think first about hypermedia controls. We also want improved Locality of Behavior. This component will have a refresh button as an affordance for the user to independently refresh the content of this widget. This is our hypermedia control. This is our hypermedia control that responds to click events, but HTMX allows us to define a much richer set of interactive possibilities within our markup.

...
Active Leads
 class="widget-toolbar" role="menu">
     href="/Lead/Active" id="ActiveLeadsReloadButton"    
       class="button-icon htmx-refresh-btn txt-color-white" 
       hx-get="/Lead/Active" hx-trigger="click, revealed">
         class="fa fa-refresh">
    

...

Out of the box, an anchor will make a new full-page request based on the href property of the tag, however we’re overriding the behavior of this element with HTMX. It is now an ajax control (assuming HTMX is loaded) but keeping the href attribute allows this element to work without JavaScript (assuming the backend will return just the component/partial for HTMX requests and a full page, inheriting the app _layout if not). I’m specifying multiple events (separated by commas) for this control. (notably, as of v1.9.10 there is a bug in the library that only scans for exact matches on the revealed. I might be able to instead use the intersect but a pull-request for this issue was submitted a week ago and the fix was a single character change.)

Our financial summary swapped the innerHTML of the control/div with the response. This is the default behavior. In this case, it doesn’t make much sense to place the response in the body of that anchor tag, so we need to tell HTMX where to target the response. We use the hx-target attribute with a standard css selector.

...
 href="/Lead/Active" 
   id="ActiveLeadsReloadButton" 
   class="button-icon htmx-refresh-btn txt-color-white" 
   hx-get="/Lead/Active" 
   hx-trigger="click, revealed" 
   hx-target="#ActiveLeads">
     class="fa fa-refresh">

...

This is also a useful approach as the hx-target attribute is inherited by children, so if you use hx-boost at the top level of the body, you might get unexpected results without manually overriding boosted links to include hx-target="body".

Initially the skeleton will render with the loading overlay, but the user receives no subsequent feedback when the component is refreshed. HTMX allows us to specify a loading indicator. The class .htmx-indicator is provided for us, which defaults to transparent but HTMX will apply additional classes during the request rendering the element visible. A spinner/throbber is perfect for this. In the widget menu section, I’ll add an element to act as our indicator.

...
 class="widget-loader htmx-indicator" 
      id="LeadsLoading">
     class="fa fa-refresh fa-spin">

...

And I just need to specify that I want to use this indicator in the HTML using the hx-indicator property.

...
 id="ActiveLeadsReloadButton" 
     class="button-icon htmx-refresh-btn txt-color-white" 
     hx-get="/Lead/Active" 
     hx-target="#ActiveLeads" 
     hx-indicator="#LeadsLoading">
     class="fa fa-refresh">

...

Clicking the refresh button gives the user subtle, but useful feedback on the state of the component. Coming from a direct DOM manipulation mindset or a responsive mindset, it’s both strange yet freeing to step away from the imperative mindset of toggling classes or manually injecting markup into the page. Since this app has historically used jQuery, my mental default was to manually toggle classes. At first, it was it was difficult to let go of this type of approach. It can be hard to relinquish this much control as a developer. Many times in this process I had to stop myself and think “what is the pure hypermedia way to accomplish this?” As I have steadily improved my application’s UX while (so far) only deleting code, I can tell you it’s a good feeling! I can focus more and more on delivering value to my customers without getting bogged down in the implementation details. It’s worth pointing out the goal isn’t to eliminate all JavaScript, but rather use it in a much more deliberate and thoughtful way. I can bring in a service worker, client libraries, or even component frameworks (or use HTMX’s built-in client-side templating) where it makes sense. I simply have a power powerful default/starting place that I can extend with a much more meaningful set of options where appropriate.

Now that I have completed this process for every widget in the dashboard the server is able to generate the skeleton dashboard html page much faster. With the exception of the calendar (which is always at the top of the page) all of our dashlets are now lazy loading when they are visible in the viewport. Our dashboard doesn’t contain any meaningful state, so we may now apply a cache policy on this resource with a long expiration. Navigating to the dashboard from anywhere in the app gives us a time-to-interactive of 187ms with a mere 34.3kb total network transfer on the request.

The main bottlenecks (pausing document parsing to retrieve and load core scripts and css) are skipped. Our parse-to-paint timeline on boosted interactions now looks like this:

Action	Time
Fetch Dashboard HTML	<1ms from cache
Parse HTML	1ms
Parse Scripts	13ms
Render	22 ms
Paint	4ms

The time from initial request to first meaningful paint on a boosted dashboard page load is now 360ms. This is down from multiple seconds using the Web 1.0 approach, all accomplished without writing a single line of JavaScript (in fact, several dozen lines of js have been deleted so far). We’re just extending what HTML and the REST architectural style already gives us. The performance and responsiveness of this app is only going to continue to improve as we continue to leverage the power this library offers.

Trigger Modifiers

On the top of the page is a “recently viewed” menu. Since a user only really cares about the contents of this menu when they’re actually looking at it, rather than asynchrously loading this menu when the page renders, it makes sense to request the body of this menu when it is expanded. Based on what we’ve done so far, this will be easy:

...
 class="context hidden-xs" id="RecentDropdown">
     class="project-selector dropdown-toggle" 
            data-toggle="dropdown" 
            hx-get="/Home/RecentlyViewed" 
            hx-target="#RecentItems" 
            hx-trigger="click" 
            hx-swap="outerHTML">
            Recently Viewed: class="fa fa-angle-down">
    
     class="dropdown-menu" id="RecentItems">
         class="fa fa-refresh fa-spin"> loading...
    

...

The span becomes our hypermedia control, and clicking this element will both expand the dropdown and load its contents. In this case, we want our click to behave differently. Unlike the dashboard widgets, once this menu is loaded the first time, there is no really need to refresh it on a subsequent click. We want to modify this event to only fire once. Our trigger will look like this hx-trigger="click once". Since there is no comma after the trigger event, what follows is a modifier. This is also useful for form submissions where we want to guard against double-submission.

Custom Events & Reactive Components

On the dashboard, events that are currently on the performer’s radar are separated into two categories, Leads and Gigs. Leads are events the performer is in an active sales process. They have not been booked yet. Gigs, on the other hand, are confirmed events the performer is committed to appear and perform at.

On the leads dashlet, each pending event has a kebab menu and one of the available options is to convert the lead into a gig. Selecting this option triggers a POST operation in a hypermedia 1.0

resulting in the entire dashboard being returned in the response. Because this application is boosted, the form post will also be boosted and the entire request will take less than half a second. That said, every dashlet will then need to be reloaded. All we really need to do is remove one from the leads dashlet and add one to the gigs dashlet. We don’t need to re-request the calendar, the task list, the current automation queue, etc. Ideally we want an experience similar to what is offered in the mainstream frameworks, where state changes and any elements dependent on that state will automatically react to that change.

We could take the SPA approach and use some kind of state management library to produce a local copy of state with a complex framework to keep the DOM in sync as a reflection of that state as well as more complexity and lurking edge cases to keep the local state in sync with the actual source of truth (the database on the server). In the case of this application, it makes much more sense to eliminate all of that complexity entirely (as well as close to year of rewrites). The trade-off is an almost imperceptible performance penalty that is not significant in the context of this application and a guarantee that what is on the screen is a true representation of state at all times.

The first thing I want to do is redefine the menu option currently controls this behavior to be another hypermedia control. This control should perform a POST based on the currently defined implementation (although if I were building this application from scratch to be a native hypermedia system, I probably would define this to be a RESTful PATCH operation, but since the legacy app uses an RPC-style approach for the UI, we’ll reuse what we have). HTMX supports the full set of HTTP operations, which affords great power to webdev (in contrast, HTML natively only supports GET and POST). The element would look like this:

...

     class="bookGig" 
            hx-post="/Gig/Book/{Lead.Id}" 
            hx-indicator="#LeadsLoading" 
            hx-target="#ActiveLeadsTable" >
         class="fa fa-calendar-check-o">
        Convert To Gig
    

...

For updating the leads dashlet, there are a couple of interesting options. Naturally I could refactor this endpoint to return a new copy of the component and target the body of the widget. HTMX, however, gives us a great deal of control over the target. If I wanted to be more surgical, I could refactor this endpoint to return an empty response and target the tr closest to the convert button. My target would be hx-target="closest tr". This is a useful trick to keep in your back pocket, but this app uses DataTables, a javascript-driven table component that isn’t observing the underlying table and thus would need to be informed of the change. My instinct is to take the simple approach of refreshing the entire widget body, but it can be useful to know how to listen for HTMX events in javascript so I will show my implementation, then explain listening for htmx events in JavaScript.

I’m going to keep my hypermedia control as implemented above. The problem is, since this lead will be moving to the Gigs dashlet, I also need to refresh this. Following the HTMX way of defining this behavior declaratively rather than imperitively, I want to broadcast an event the Gigs dashlet can trigger on. This will be a custom event that I will be defining in the response headers using the Hx-Trigger header.

My backend controller method looks like this:

public ActionResult Book(Guid id)
{
    var currentGig = LeadService.UpdateStatus(Gig.StatusBooked);
    Request.RequestContext.HttpContext.Response.AddHeader("Hx-Trigger", "GigsUpdated");
    return Active();
}

I can now update my Gigs dashlet load/refresh button to trigger on revealed, click, and gigsUpdated (which will bubble up the DOM so I’m specifing to listen to my custom event from:body):

...
 href="/Gig/Upcoming" 
    id="UpcomingGigsReloadButton" 
    class="button-icon htmx-refresh-btn txt-color-white" 
    hx-get="/Gig/Upcoming" 
    hx-target="#ActiveGigs" 
    hx-indicator="#GigsLoading"
    hx-trigger="revealed, click, gigsUpdated from:body">
         class="fa fa-refresh">

...

I want my components to be reactive to state changes. The difference here is, the source of truth is the server, so rather than duplicating state on the client and creating components that observe that non-authorative state, my components react to events that bubble up as a result of state changes at the source of truth. This is a pattern I will be using all over the application, broadcasting certain state-change events that any components on the page might be interested in. For example, Leads, Gigs, and Tasks can be created from a modal but this model can appear on any page in the app. Rather than relying on callbacks to hunt around and script reload clicks and complex logic to find and optionally reload widgets, any interested widgets that should respond can simply listen for response events keeping everything nice and declarative.

At this point, it occurs to me that these state changes have a side-effect of changing the state of the calendar. New Leads/Gigs should appear, and converted gigs/leads should appear a different color. I certainly could instruct the calendar component to listen to these custom events, but that would require rebuilding the entire calendar (which is drawn by the javascript component fullCalendar). This might be a good candidate for a js listener to listen to these events and update the calendar accordingly. HTMX events are prefixed with htmx: but custom events are not. The code to listen for my custom events and refresh the calendar would look like this:

document.body.addEventListener('gigsUpdated', function (evt) {
    $('#calendar').fullCalendar('refetchEvents');   
});
document.body.addEventListener('leadsUpdated', function (evt) {
    $('#calendar').fullCalendar('refetchEvents');
});        

It’s important to remember that the HX-Trigger header is only processed on HTMX requests. If you (like I) have legacy ajax forms and modals that aren’t driven by HTMX, your events won’t be picked up by your HTMX components. For legacy ajax components, it was necessary to trigger events using the HTMX api. The basic pattern looks like this:

function checkHXTrigger(url, method) {
    $.ajax({
        url: url,
        type: method, // 'POST' or 'GET'
        complete: function(xhr) {
            var headerValue = xhr.getResponseHeader('HX-Trigger');
            if (headerValue) {
                htmx.trigger("body", headerValue);
            }
        }
    });
}

While experimenting with custom events, I found the built-in logging and debugging capabilities useful, particularly htmx.logAll() and monitorEvents(htmx.find("#theElement"));.

Partial Swaps and Pushing History

Most of the components we’ve talked about are little standalone resources that only exist as partials (in .net parliance). Let’s look at an exception; the record detail and edit screens. Here is the contact details page in Mago:

When this page, the dashlets are asynchrously lazy-loaded but clicking the edit icon will load a substantially identical page (including dashlets) but the contact card is replaced with a form. The edit icon is currently a hypermedia 1.0 hyperlink that reloads the entire page and its contents. HTMX will allow us to replace just the contact card dashlet with the edit form.

The problem is, unlike the dashboard dashlets, the edit page is not a partial, it is a full page. Now, I could maintain two copies of this component (the partial, and the full page) and serve one or the other based on different routes or content-negotiation. Even though I’m not totally married to DRY, I don’t relish the idea of maintaining two copies of the same view. I could refactor the edit form to be a partial, but there is still a need to navigate directly to the edit screen (either via hyperlink or browser history). Really I would like to keep only one template, but only inject the relevant part into the dashlet. hx-select is the solution.

Hx-select allows me to specify specific content to be swapped. This is what my edit icon looks like:

...
 href="/Contact/Edit/{contact.id}" 
    class="button-icon btn btn-xs" 
    hx-get="/Contact/Edit/{contact.id}" 
    hx-target="#ContactWidget"
    hx-select="#ContactWidget"
    hx-swap="outerHTML"
    hx-indicator="#ContactLoadingIndicator"
    hx-push-url="true">
         class="fa fa-pencil">

...

Let’s break down what’s happening here:

href - default behavior. No js means the normal hypermedia 1.0 navigation will happen on click
hx-get - this is defining that we’re doing an HTMX driven interactions, overriding the default link behavior
hx-target - target the dashlet container element
hx-select - instructing HTMX to parse out just the fragment of the response we want to swap
hx-swap - we want to include the containing element in the swap
hx-indicator - let the user know something is happening during loading
hx-push-url - by default, these types of events don’t appear in the history, but in this case, I want the full edit URL to be pushed into the browser history so the back button will work properly.

Clicking this link will leave all other elements in place, again providing that responsive UX. There is no need to reload those dashlets over and over again. On the edit form, the submit is an HTMX post with similar targeting and swap properties and will, again, push the URL into the brower history. Likewise the cancel button is identical to the above fragment other than the hx-get property.

This approach trades a little bit of network overhead (a couple of kilobytes) for the simplicity of not maintaining multiple copies of the same form and/or maintaining different routes, or controller if statements and razor @renderpartial() statements. Long term I may refactor to the split the view in to parts and add some conditional logic in the controller that will serve the full page or partial depending on HTMX request headers, but this reduces the scope of my change and associated risk while increasing time-to-value for my customers. It’s a useful trick to keep up your sleeve.

Security Considerations

In many ways, the mainstream framework approach to web development has kept the actual HTML so abstracted away, it can cause complacency. With HTMX, you’re working much closer to the html, response are injected directly into DOM and all scripts go through eval(). Sending down unsanitized user input has always–and will always–be a bad idea. It is important to adopt the best practices of the past to sanitize responses. In most cases you, like I, will be using some kind of backend framework. Mago is built using .net MVC. which has always handled output sanitization at the framework level. There are, however, a few places wher I want to send raw HTML in the template response (e.g. html emails in the history component). In this case, the onus is on me to sanitize this before rendering it in the templates. My application has always had an html sanitizer dependency because this potential vulnerability has always existed and is not unique to HTMX. My history view implements the sanitizer like this:

...
@if (item.HistoryType == HistoryType.Email)
{
    var htmlBody = sanitizer.Sanitize(item.Description ?? "");

        @Html.Raw(htmlBody)
    
}
...

In short, anywhere you’re doing the equilivant of razor’s @Html.Raw() use a reliable and battle-tested sanitizer, but this has always been the case. The truth is, HTMX introduces some new ways to cause mischeif that a sanitizer might not know about. For example, if I had a remote server with no CORS checking, I could inject some poison markup along the lines of .... The solution is simple. For any content I don’t control/trust that I would output with @Html.Raw I simply add the hx-disable attribute in the container along with my raw (but sanitized) html output.

...
@if (item.HistoryType == HistoryType.Email)
{
    var htmlBody = sanitizer.Sanitize(item.Description ?? "");

     hx-disable="true">
        @Html.Raw(htmlBody)
    

}
...                                           

In most cases you will also want to set htmx.config.selfRequestsOnly to true which will only allow AJAX requests to the same domain as the current document.

Conclusions

Teaching an old dog (Mago) new (UX) tricks has been an enlightening experience. Once I was able to get over the hurdle of my existing webdev mental set, I found working with HTMX to be an utterly plesant experience. Furthermore, I was able to take a very incremental approach. Once the dashboard conversion was complete I was able to deploy to production. In other words, I delivered value and improvements to my customers after just a couple of hours.

I will say this experience also provided some illuminating hindsight. When I first wrote Mago, I didn’t know everything I know today. I have a much better understanding of the REST architectural style and would have made significantly different architectural decisions but it was nice that I didn’t have redesign the entire app to get the UX I (and my customers) wanted. Perhaps I will write more in the future on architecting hypermedia systems, but for now I’m happy with this experience and what I learned. If you want a philosophical deep-dive and don’t want to wait on my uneven blog publication schedule, read the collection of HTMX community essays.

In short, to me HTMX is a long overdue breath of fresh air. Do I think HTMX will (or should) replace the mainstream frameworks? Absolutely not. There are many situations where they are absolutely the correct choice. Not only for those situations were a 90s style fat-client app is a better solution but also for those whose core webdev skillset is framework/js ecosystem focused (although if you want a similarly hypermedia-friendly approach but are coming from a typescript/jsx background, check out fresh). I belive HTMX is an excellent choice for modernizing legacy MVC applications, for projects driven by teams who are predominantly backend-developers and for whom the complexity and learning curve of a mainstream framework would be an undue burden, and for projects that crave that modern UX but don’t want the complexity that comes with the framework-centric approach. Most notably, migrating Mago to HTMX took a few days. Rewriting in Angular, Vue, or React would have taken considerably longer and introduced a much higher maintenance burden as these frameworks continue to evolve.

It’s also worth noting that HTMX does play well with many js libraries and frameworks. Despite the stigma, jQuery remains ubiquitous and useful with a large number of UI plugins and component libraries available. A marriage of jQuery and HTMX is a match made in heaven when jQuery handles the UI components and HTMX handles the dynamic interactivity. AlpineJS is worth a look into and VanillaJS is another good option. HTMX promotes hyperscript as an alternative front-end scripting approach. Finally it’s entirely possible to bring in some vue, react, or even angular if it made sense. Just remember (from the official HTMX docs and essays):

I built my first web page using html 2.0 and the past two decades have led me to walk away from most webdev. HTMX has rekindled my love of the web. It’s definitely a technology that should be both on your radar and in your toolkit.

Meet Michael Carducci…

2024-01-27T00:00:00+00:00

A.k.a Michael Cary - and all the other identities and aliases I have held in my life (fun fact, this might be the first time ever those two names have appeared side by side in public!). They’re all masks I’ve worn, idealized versions of myself that I want to present to the world and, ultimately, lies I’ve told myself (and possibly you).

“I’m not defined by what you see, I’m also all of the things you’ll never see.”

-Derek DelGuadio, In and Of Itself

I wrote something today, and felt moved to publish it somewhere. If you are one of the vanishingly few people who actually read my blog, it’s entirely possible you’re interested in getting to know not just my thoughts on tech, but perhaps also interested in getting to know me a little deeper as well.

Read on if you want to see some of the parts of me you have likely never seen. I’m exposing myself for a moment, “warts and all.” If not, no worries, more tech opinions and knowlege are already in the works.

I’ve lived many lives. Most people I have met know me through just one of those lives. Many people have met me here or here, or where those two worlds first collided; here (among countless others).

Those who know the magician are baffled that I remain content with a handful of loyal clients, continuing to polish my craft but only performing at a few dozen events every year. Those who know me from day-jobs past (likely along with some family members) quietly worry about my lack of steady employment. In both cases there is this fear that I am “squandering my gift” (whatever they perceive it to be).

Despite being self-employed for almost a decade, I’m in newly uncharted waters. I can–and have–made a comfortable living as a full-time magician. I can–and have–made a comfortable living as a full-time coder and architect. I can–and have–made a comfortable living as a full-time freelance architect and consultant. Perhaps it is the ongoing effects of a midlife crisis that arrived right on schedule, but I radically changed my approach to all of these things last year.

I’ll be honest, I’m scared. When I first quit my day-job to focus on magic full-time I knew that, worst case, I could land a “square” job in just a week or two. Times have changed and the current tech market continues to be a bloodbath. Jobs seem scarce and mass layoffs continue. I feel like I’m walking a tightrope again but, this time, without a safety net.

I recently confided my anxiety to a friend (who is most acquainted with coder/architect Mike). Having worked together in the past, she said “I know you, you’re smart, you’re capable, there are a million jobs out there you could do.” I countered that (post 2023 radical change) I was “basically unemployable.” There was some back and forth, but that was largely the end of it. This morning, I sat down with a cup of coffee and reflected on traditional W2 employment and this is what I wrote.

(I post this with anxiety. I don’t normally write like this. What will the you, the reader, think of me? Is it too grandiose, is it TMI, does anyone care, will I undermine what I’m trying to do? Is this just fatuous navel-gazing? Here goes anyway… note: there is a small amount of coarse language)

I think I can articulate my position on traditional employment. My whole life I have been a round peg in a world of square holes. Yet, one has to eat, one has to pay the bills. I did what I had to do; compartmentalize myself. Shave off aspects of who I am to fit the hole - it was never a perfect fit and it was one that was incapable of holding for very long, no matter how many coping mechanisms I used to plug the gaps. This left a hollowness that would take decades for me to recognize.

To treat the symptom, I would take cast-off fragments of my self and stick them together; directing them towards some kind of side-hustle or passion project. This gave me the illusion of holism. The comforting lie that I fit just fine in this world as long as I am careful to only show different parts to different people. I belong, providing I take myself apart; some is better than none. Unfortunately, “you can’t serve two masters” yet I was constantly robbing Peter to pay Paul. Most of the leftover pieces were always starving. I was never whole, and no single part, nor their combination, brought any true satisfaction.

Maslow put “belonging” just above the basic survival necessities in his hierarchy of needs. I believed that I could “belong” so long as I never let anyone see my true self—my WHOLE self. I was Michael Carducci sometimes; I was Robert Cary, or Michael Cary, or Bobby at others. I was a walking facade everywhere I went. I would frequently remind myself “Everybody loves Mickey Mouse… Nobody cares about the guy in the suit.”

My mid-life crisis began when, in 2020, I finally admitted defeat on my dream of being a full-time magician (or, let’s call a spade a spade, being “Michael Carducci” the magician full time because that was the side most frequently starving and “the grass is always greener”). I tried to completely remake myself in the mold of my new job. “This is who I really am” I told myself. And when that fell apart I had to realize the truth; that that job was not who I was (and I didn’t like who I became in the process). By that point, I had pounded, cut, sanded, shoehorned, warped, and changed myself so many times I lost my entire identity. My mid-life crisis peaked with that realization. I looked at the scraps and dust that surrounded me and could no longer find anything I recognized. It seemed there was no permutation of a subset of my identities and facets that could belong anywhere. This was compounded with the realization that I didn’t have enough time left on this plane to start from scratch and construct something new.

Then I realized something magical had already happened. I married Kate. I often remind her that “I love you, exactly as you are.” It’s true, but it’s easy, she’s amazing! (Have you met her?!) It took me a very long time to accept that she could love me in the same way… but, somehow, she does. She never wanted to “fix” me, she never needed me to cast off parts of me to fit. She loved and accepted me exactly as I was; this showed me that perhaps I am fine as a round peg. For the first time in my life, I didn’t feel “broken” or “wrong.” I knew I could belong somewhere–I did belong somewhere.

I quit the job and took several months off to undo the cumulative damage wrought by four decades trying to be something I wasn’t. I found me, the whole me. The round peg in a world of square holes. I’m not a coder, I’m not an architect, I’m not a magician, I’m not a skydiver… I defy labels, I contain multitudes, I am a member of many sets and I am the union of all of those classes. Integrating these–living in integrity with myself–has brought me a happiness I have never before known. I don’t want to compromise my identity again for the fleeting illusion of fit and belonging.

Looking at the job market as it stands, I have no doubt there are many Michael-shaped-holes out there, but they don’t know that. They’re all looking for collection of square pegs that will add up to their Michael-shaped hole. In the process, they are filtering for the right individual pieces and not the whole. They pass me by. It’s probably just as well because, under that arrangement, it would be very hard to be more than a cog in the machine ultimately leaving both parties unfulfilled.

I’m reminded of when Dr Richard Feynman decided he wanted to help in the war effort:

” I began to think I ought to make some kind of contribution, too. After I finished up at MIT, a friend of mine from the fraternity, Maurice Meyer, who was in the Army Signal Corps, took me to see a colonel at the Signal Corps offices in New York.

“I’d like to aid my country sir, and since I’m technically minded, maybe there’s a way I could help.”

“Well, you’d better just go up to Plattsburg to boot camp and go through basic training. Then we’ll be able to use you,” the colonel said.

“But isn’t there some way to use my talent more directly?”

“No; this is the way the army is organized. Go through the regular way.”

I went outside and sat in the park to think about it. I thought and thought: Maybe the best way to make a contribution is to go along with their way. But fortunately I thought a little more, and said, “To hell with it! I’ll wait awhile. Maybe something will happen where they can use me more effectively” “

-Richard Feynman - Surely You’re Joking Mr. Feynman

Fuck the “regular way.” I did that for 40 years. It sucks. It’s not me. From now on, I bring my whole self to everything I do. I make magic, love humans, tell stories, have genuine interactions, write code, think deep thoughts, dream impossible dreams, write books, give talks, teach what little wisdom I might have to share, sharpen my unique perspective; OWN my round peg status.

I’m not Richard Feynman but I, too, have gold to give. Eventually the US govt realized they needed to fill a Richard Feynman shaped hole and brought him into the Manhattan project directly.

Perhaps through unabashed authenticity in my books, my talks, my magic, my stories, my random musings, and my simple human interactions… someone will see the whole me and say “there it is, that’s what’s been missing.” And maybe, just maybe, that’s the job where I belong next, or even just a one-off speaking/training/consulting gig somewhere where I can at least give a little gold for a time.

If not, so what? I’ve already found where I belong. I’ve taken off my mask, I’m showing my Shel Silverstienian “blue skin.” I’m finding my tribe and they’re finding me. I now know what it feels like to belong and what it’s like to be truly accepted.

If I continue this path I might be forever broke—but never poor. At least I can die peacefully when my time is up, knowing I finally learned the meaning of “This, above all, to thine own self be true.”

When I say “I’m probably unemployable,” that’s a lie. I am more than capable of presenting the facade that aligns with what a company, or school, or romantic partner is looking for–I’m a black-belt pretender with a lifetime of experience. What I really mean is I am currently not willing to compromise who I am to fit some bullshit job. I say “currently” because I may eventually run out of money or opportunities just as inevitable unforeseen and dire circumstances emerge; but that day is not this day. For now, I’m going to cling to this quixotic quest for as long as I can. Life is too short and true happiness too fleeting.

As I publish with trepidation, I’m comforted by this quote from Planes, Trains, and Automobiles:

“…think what you want about me; I’m not changing. I like.. I like me. My wife likes me. My customers like me. ‘Cause I’m the real article. What you see is what you get.”

-Del Griffith

And now back to our regularly scheduled programming…

Third-Way Web Development Part I - History

2024-01-24T00:00:00+00:00

When I look back on my career in technology, I’ve been seduced, over and over again, by this idea that best practices exist and that I can consider my work “good” so long as I follow those best practices. In some ways it would be wonderful to work in a field where there are absolute “right” and “wrong” answers but I no longer believe software engineering is one of those fields. Every single decision we make has consequences and whether these consequences are positive, negative, or mixed will depend on the context. It’s all just a set of trade-offs, and the key to making good decisions lies in understanding what matters most and evaluating the trade-offs in this context.

There are no best practices, only trade-offs.

-First law of Software Architecture

We’ve somehow gotten into this weird place in web development where the consensus seems to be that an absolute set of best practices exist; where the only meaningful decisions surround which framework to use and how to find the optimal implementation details within that chosen framework. Unfortunately the web development space has become so myopically focused on frameworks, tooling, and Single-Page Applications (SPAs) as the de-facto “best practices” that the trade-offs of these approaches are rarely discussed–or even well-understood up front–and, in many cases, the result is an ocean of bloat and accidental complexity that often could have been avoided entirely.

It’s not that these client-side frameworks are necessarily “bad” in fact they certainly offer a great deal of value in the right place. Rather we need to continue to approach software development more mindfully, deliberately, and contextually. This remains a uniquely human skill that current AI models remain a long way off from materially competing with.

My focus, for this series, is not to tell you how to build your next web app, or suggest you need to rewrite your existing one(s); instead I want to begin a conversation about web development framed in a broader context. I want to explore the trade-offs of framework-centric web development that are typically overlooked and introduce (or re-introduce) some ideas that might not yet be on your radar that may offer meaningful alternatives.

A Brief History of the Web and Web Development

Choosing the framework-centric web-development path is a significant and consequential decision; while there are many benefits to this approach–much can be gained–these benefits inevitably come at a cost. Is the cost justified? Sometimes. It depends.

The nature of trade-offs is that we are necessarily giving things up in exchange for what we gain. I can’t really speak to what we’ve lost without first talking about what we once had and how we got to where we are today.

It Started With Information

(Vannevar Bush)

Vannevar Bush was an engineer and inventor most active in the first half of the 1900s. In 1940, then President Franklin Roosevelt ordered the creation of the National Defense Research Council and named Bush as it’s chairperson. The agency’s stated goal was “to coordinate, supervise, and conduct scientific research on the problems underlying the development, production, and use of mechanisms and devices of warfare.” The 1940s was a period of explosive (no pun intended) growth of scientific information and Bush began to speculate that we had outgrown the existing mechanisms to catalog and organize this information.

“…we can enormously extend the record; yet even in its present bulk we can hardly consult it… There may be millions of fine thoughts, and the account of the experience on which they are based… if the scholar can get at only one a week by diligent search, [their] syntheses are not likely to keep up with the current scene.”

-Vannevar Bush

Bush began to daydream about a radical new way to not only organize information, but connect and cross-link it. This culminated in his visionary 1945 Essay As We May Think where he described a hypothetical machine that could both store and link documents and enable nonlinear navigation of these documents. This essay planted the ideas for what Ted Nelson would later dub “hypertext” less than two decades later.

Hypertext was a revolutionary idea that spread among forward-thinking communities in the coming decades and inspired a young computer scientist named Tim Berners-Lee to build ENQUIRE at CERN in 1980, a hypertext system similar to today’s Wikis.

The Birth of the Web

By 1989, Tim Berners-Lee proposed a global hypertext system that would eventually be known as the World-Wide Web. In 1990, he began constructing the key components of a hypermedia system: A hypermedia (in this case, HTML), a network protocol (HTTP, the hypertext transfer protocol), a server presenting a hypermedia API which responds with hypermedia responses, and a client that can interpret the hypermedia (the first web browser). In August, 1991, the very first web page was published (notably, it’s still out there… and it still works).

The web operated on a handful of key concepts. An information resource (e.g. a document, an image, a midi file, etc.), a URL (originally describing the location of the resource, but as the web made physical location increasingly irrelevant, the URL frequently acts as a mere identifier of the resource), HyperText Markup Language which would allow an author to describe the document in a standard way such that a html client to render the document for humans to view and interact with, while pulling in any additional embedded resources (e.g. images) and offered the first hypermedia control; the anchor tag. The anchor tag had an href property enabling the linking of one resource to another; it’s what made the web “hyper.” These URLs, these links could be referenced, saved/bookmarked, embedded, and shared.

During these formative years, resources on the web were largely static but there were plenty of classes of resources that would need to change independently of publishing cycles; dynamic resources. A notable early example of this phenomenon was a Coke Machine at Carnegie-Mellon University which hosted its own web page allowing students to see if their beverage of choice was in stock (and ice-cold) without first having to walk across the campus. That same year, a draft of the Common Gateway Interface was created to standardize how command-line applications (such as the Perl interpreter or c programs) could be integrated with a web server. The need for input to these applications–not just output–led the specification of HTML 2.0 which included a second hypermedia control, the element along with a set of controls. The web was now truly interactive (at least, in a rudimentary sense). Clicking a link would load a new page in the browser window, and submitting a form would serialize the contents of the input tags, send those to the server, which would ultimately respond with a new html document to load in the browser window. Both interactions would completely replace the current contents of the browser window with a new web resource.

A New Type of Application?

Most network applications of this era followed a relatively simple client/server architecture. The application, along with all it’s constituent business logic, rules, and instructions were compiled into a single “fat” client, installed locally on a PC while state existed in a shared database accessible over the network. This fat-client style application could offer a rich and responsive user experience, taking advantage of an available graphical user interface, dynamically updating only the parts of the user interface that truly needed to change; this approach had a significant drawback in how the application was distributed and updated. In effect, this application needed to be manually installed (or upgraded) on every machine on which it ran. Often a complex process of mastering, disk duplication, and distribution was involved; “ship it” literally meant shipping it. The core ideas of agile, embracing change and rapidly responding to it, weren’t just a fantasy; they were utterly impractical. With the advent of HyperCard several years earlier, we saw the potential of hypermedia applications. The web offered a tantalizing possibility of a new type of thin client, especially by 1995, when client-side scripting first appeared in Netscape Navigator.

At this point, the architecture of the web truly took shape. Roy Fielding, one of the early architects of the web, would later dub the architecture of the web “The REST Architectural Style” defined by six constraints:

The Client/Server Constraint
The Stateless Constraint
The Cache Constraint
The Uniform Interface Constraint
The Layered-System Constraint
The (optional) Code-on-Demand Constraint

Hypermedia is central to the Uniform Interface Constraint, where a server provides a hypermedia representation of a resource’s state along with a set of affordances in the form of hypermedia controls to allow a user to interact with that resource.

The architecture of the web offered unprecedented flexibility and evolvability which enabled experimentation with different ways to build applications on the web. Since hypermedia offered a stable abstraction, many server-side scripting languages and server-runtimes were introduced which were immediately available to all existing web clients. While this opened the door to many new possibilities, the fact that every hypermedia interaction required a post-back and full-page refresh left much to be desired in the overall user experience (particularly on a dial-up connection).

Attempts continued to build rich, client-side user experiences within web applications. One of the early attempts to overcome the limitation of a pull page reload on every request was framesets, first introduced inside Netscape 2.0 and lobbied to standardize this feature in HTML 3.0. In effect, framesets enabled a single browser window to be divided (and, potentially further subdivided) into multiple independent pages which could be targeted and refreshed individually. They weren’t popular and were subsequently removed in HTML 5 (although iFrames remain).

Another approach was to extend html to allow embedding of applications into a web page. Java Applets, ActiveX controls, Flash, Silverlight, and numerous others have come and gone. These approaches used the web as a delivery mechanism, but these apps weren’t native to the web and introduced performance, runtime, and security challenges.

Enter Web 2.0

By the late 1990s, Microsoft held a monopoly in the web browser space and they introduced the XMLHttpRequest object to allow their Outlook web client to perform some basic background tasks. It allowed them to moderately improve the UX of their webmail client and strengthen their monopoly. Outlook was widely used and this feature offered web users a superior user experience… but only in Internet Explorer. The other browser vendors followed suit, opening the door for the next revolution. Ajax and Web 2.0.

(Google Maps dynamic UI in action)

Gmail and Google Maps were among the first web apps to show us the user experience we had been dreaming of for over a decade. Beyond mere background synchronization tasks, these apps were taking responses and rewriting parts of the page on the fly. It was a glorious hack, and not in the pejorative sense; like CMU’s internet Coke machine, or the first webcam created to remotely check if there was any coffee in the Trojan Room coffee pot it was brilliant. The bar was raised and there was no going back.

(the very first webcam)

In the early days, working with XMLHttp objects and using browser native DOM manipulation APIs was not for the faint of heart. These APIs were not yet standardized, and the code quite verbose. A number of new JavaScript libraries popped up providing a browser agnostic API to simplify ajax requests, DOM tree traversal and manipulation, event handling, animations, etc. A new breed of web applications were built using YUI, ext.js, or jQuery (to name but a few). This new paradigm was powerful, but began to erode the foundational ideas of hypermedia.

Declining Hypermedia and a New Ephemerality

By the late 00s, this became an increasingly common sight:

...
     href="#" id="LoadUserPopup">label
...

The anchor above is no longer a hypermedia control, just a user interface element. HTML was beginning a steady slide from hypertext markup language to mere user-interface markup language. Perhaps this was fine, after all the goal was to run and distribute interactive applications over the web. Did we really need hypermedia at this point or was HTML (as UIML) enough?

As user interfaces became increasingly sophisticated, more and more work was being done inside the browser, dynamically generating and manipulating fragments of markup. The interactivity model was moving to the JavaScript libraries with an increasingly short half-life. While the flexibility and evolvability of the core architecture of the web and its supporting standards enabled this evolution, we arguably began to throw the baby out with the bathwater, coupling to very unstable and leaky abstractions.

While the new ajax-powered era of web development promised applications that would feel modern, any projects that “backed the wrong horse” quickly felt positively antiquated. jQuery emerged as the dominant js library of this era. Apps built with YUI, ext.js, or the myriad others that weren’t rewritten for jQuery quickly showed their age and became harder to maintain due to abandoned plugins and declining developer interest.

Also, as powerful as this approach was, the code was still very verbose, lacked structure, was difficult to test, difficult to debug, difficult to organize. We began to reach the limits of even the library-powered ajax approach.

By 2009, client-side html templating started to emerge with libraries like mustache and handlebars but these applications needed better structure and separation of concerns. In under a year, the MVC, MVVM, and observer patterns began to find their way to the client with AngularJS, Backbone.js, and Knockout being among the leading contenders.

The next hypertext domino to fall was the URL. In the quest to eliminate page refreshes at all costs, these new frameworks enabled the entire application to be downloaded into the browser at once and all internal application navigation was handled by custom, client-side code. We broke the back button and had to figure out how to reinvent it. Navigating directly to a “deep link” often resulted in incorrect application states. Again, we added more complexity to bring this basic behavior back. Increasingly navigating to an app would serve almost no html beyond what is necessary to bootstrap the application. We added more complexity in the form of prerendering and server-side rendering, again, to reinvent what we used to get for free.

By it’s very nature, the web was a natively distributed system. A web page was often a composite of many resources which could be sourced from all over the web. These new client-side applications became large, unwieldly monoliths. It would take many years before the microfrontend pattern would emerge to restore yet another feature of the web we had lost.

Opting In to Tech Debt

Applications built in the early ajax era began to suffer to shifting fashions. I personally saw one application rewritten four times in six years (ext.js -> jQuery -> knockout -> angularJS) and worked with many applications that ultimately ended up using most (or all) of these simultaneously as trends would typically shift mid-rewrite. Of course, at this point, not only did the application need to be changed but so, too, did the entire toolchain. Gulp, Grunt, Webpack, SASS, LESS, Babel, Bazel… The list goes on an on. New framework versions often introduced some number of breaking changes, and an increasing amount of most organization’s development budget shifted from new features and innovation to simply keeping up with trends and applying the latest toolchain and runtime additions to fix what always existed until the latest framework “innovation.” Sections of the app dependent on an old version of a library or framework; or built using an abandoned toolchain or library we relegated to a growing mountain of tech debt.

It soon became clear that coupling to any framework and toolchain was simply opting in to tech debt. If not today, than inevitably soon. Of course, the promise of stability was on the horizon.

The consensus in the early 2010s was that the “smart-money” was on a framework that had the backing of the behemoth that is Google. AngularJS got a lot of attention and many web applications were irrevocably coupled to their particular brand of MVC/MVVM. In 2014, however, we learned that AngularJS was already on the sunset track and brand new and completely different framework, confusingly called “Angular” was the future. Oh, and drumroll… there’s no upgrade path. Admittedly that was walked back slightly following the backlash, but there’s no avoiding it… another rewrite was necessary.

To be fair, the world changed on us very quickly, giving us little time to adapt or figure out the best way to proceed. The web dev space contains some of the brightest minds in the industry and many brilliant ideas have emerged during even the most turbulent framework years.

Today we have largely coalesced around React, Angular, and Vue with the best ideas of each (and the past) cross-pollinating to create a powerful set of tools to choose from. Although nobody is hopping from framework to framework quite like they did 15 years ago, these frameworks continue to regularly introduce breaking changes and every single one of them is a ticking time-bomb of tech-debt.

The Web Remains

This timeline, as I have framed it, might be offensive or, at a minimum, seem unfriendly to web framework apologists (assuming they haven’t stopped reading long ago). I write this not because I hate js frameworks (I don’t, they are extremely valuable, important, and certainly have their place). I write this because we have been frogs in the ever-heating water for so long; so preoccupied with keeping our skills current and our code up-to-date and secure that rarely do we have time for such retrospectives.

Ultimately, we cannot deny the fact that there is no turning back from where we are. If you are building almost anything for the web today, there exists a minimum user experience expectation that can’t be escaped. We get this out of the box with any of the major frameworks. This approach has become regarded as the de-facto “best practice” and apparently the only game in town.

It was a long and hard road to get where we are today. As bright and passionate developers over the past 20 years strove to deliver this dynamic UX and transform the web into a viable application platform, the history has been marked by reacting (again, no pun intended).

Google Maps changed the game, and we reacted with ajax libraries that grew into frameworks. Despite what these libraries gave us, manually manipulating the DOM quickly grew unwieldly. We reacted with client-side templating frameworks and reacted again with MVC/MVVM frameworks to tackle the growing complexity of the explosion of client-side code. New client-side paradigms eclipsed the MVC/MVVM patterns and we reacted. The number of lines of code necessary just to build a “hello world” example has ballooned into the millions. Running $tokei node_modules on a brand new create-react-app in January, 2024 shows an astonishing 3,576,754 lines of code. Bundles were getting out of control so we reacted. New build tools, new compiler optimization techniques… even more complexity.

Operating in a reactive space rarely gives time for forethought or planning. There were many unintended consequences (routing, deep linking, state management, server-side rendering, SEO, accessibility, code bloat, dependency management…) along the way that, again, we had to react to.

We are where we are in web development today because the previous generation of tools weren’t cutting it. Yet, as powerful and useful as these frameworks have proven to be, perhaps they are overkill in many cases.

But what’s the alternative? Many would argue the options are either modern frameworks or going back to those vanilla Web 1.0 days of full page refreshes on every action. I guess it works ok for my minimalist, static html, anachronistic blog but it’s certainly not an option for the SaaS CRM I run.

Fortunately, there exists a third way, a “middle path” if you will. A return to power and simplicity of hypermedia systems but designed to meet 21st century UX expectations.

HTMX - The “Third Way”

The third way I present is HTMX, “the newest old way to make web apps.” While mainstream js frameworks have taken the approach of building complex application runtimes that run in a browser, reducing HTML from hypertext markup language to mere user-interface markup language, HTMX takes a refreshingly different hypermedia-first approach by extending HTML to align with the UX demands of modern web apps while avoiding the bloat, complexity, and ephemerality in the process.

My all-time favorite nerdcore rapper, stdout, wrote a fabulous track called “hell.js” that covers first the ecstasy then the agony of modern js frameworks and their complexity ending with the somber line “don’t tell anybody but man I miss jQuery… man I miss jQuery.”

The cynic might be amused that the antidote to “hell.js” that I’m promoting here is… a JavaScript library; but the approach the authors took is materially different and more true to the ideas and architecture of the web than anything I’ve seen before. How is it different? Recall that Roy Fielding defined the REST architectural style–the architecture of the web–by six constraints. The sixth is the optional “Code on Demand” constraint.

“…a client component has access to a set of resources, but not the know-how on how to process them. It sends a request to a remote server for the code representing that know-how, receives that code, and executes it locally. The advantages of code-on-demand include the ability to add features to a deployed client, which provides for improved extensibility and configurability, and better user-perceived performance and efficiency when the code can adapt its actions to the client’s environment and interact with the user locally rather than through remote interactions”

Dr. Roy Fielding - Architectural Styles and the Design of Network-Based Architectures

Long ago we learned that the hypermedia HTML, as currently specified, is insufficient for our modern application needs. In this case, the client requests the code (a 14kb gzipped lib with zero dependencies) that can extend the client’s capabilities (and improve user-perceived performance and efficiency). This is how the web was supposed to work! How is this different from react or any of the other mainstream frameworks? Aren’t they, too, just taking advantage of the code-on-demand constraint? Well, yes but in the process they violate the Uniform Interface constraint. Hypermedia is no longer the engine of application state, it’s an implementation detail stripping away everything hypermedia gave us. It’s UIML, not HTML; different in every respect except syntax.

So we’re violating a constraint, so what? Virtually every system ever created described as RESTful violates one or more of these defining constraints. It’s not about passing some arbitrary purity test, it’s about keeping everything the web already gives us while adding only essential complexity to meet the bar of modern application UX.

Consider that each constraint is chosen to elicit certain system qualities (capabilities, -illities, non-functional quality attributes… whatever you want to call them), violating constraints inevitably degrades or eliminates system qualities. This is fine if those -illities aren’t important but, as the timeline presented above shows, they are almost always important. More code had to be added to fix what got broken as a result of violating that constraint, leading to an increasing amount of complexity and hell.js. In other words, Half of the complexity was figuring out how to best create the dynamic UX of the modern web, and the other half was dealing with the accidental complexity our chosen approach introduced. Beyond what more code (and more dependencies, and more tools, and more moving parts) has been able to patch, one key aspect of the web remains broken; it’s evolvability. JS framework code is going to break orders of magnitude more frequently than HTML (or even extended html) due to a lack of stable abstractions.

What about HTMX breaking changes?

I’m glad you asked! It’s not an issue.

HTMX is declarative, not imperative. HTMX is predicated on the core philosophy of “what if HTML natively included a set of features that better aligned with modern UX needs?” Rather than building yet another js library or framework that you code with, the HTMX simply introduces a set of meticulously planned and well-though-out set of extensions to HTML.

You don’t code with or against it directly, you simply write HTML. HTMX adds new properties to html elements that the browser engine knows how to process by virtue of code on demand. The entire library could be easily rewritten (it’s only 3500 lines) without breaking a single client since all it does is finds and processes the tag attributes as declared rather than requiring direct coupling to the implementation of a framework. Also notably, HTMX implements a zero-clause BSD license to all-but-eliminates the risk of a the library becoming abandonware.

I am currently running V1.9.10 extensively in production however 2.0 is on the horizon. Normally a major version release in the webdev world indicates a major breaking change. I have looked through the v1.x-v2.x upgrade guide and even 1.x - 2.x requires zero changes for almost everyone except for plugin developers and a small set of folks who implemented a niche feature on the 0.x version and did not upgrade when v1.0 was released. Basically this nonsense is a thing of the past:

Tune in for part II where we will see HTMX in action in the context of a real-world, production application. I promise you this: it will not be a panacea. Remember, everything is a trade off. There are no best practices, just what’s best in the context of whatever you’re building next.

Reflection an a Decade and a Half on Dvorak

2023-12-17T00:00:00+00:00

I have now been typing using the Dvorak keyboard layout longer than I have used QWERTY. It’s been over 15 years since I first switched and probably a decade since I moved to the even more niche Programmer Dvorak Layout. As the year winds down (and I am in currently in quarantine) I figure I’d take some of this idle time to write a little bit, and this is where my head is at today. Who knows, there may be some interesting take-aways from this. On the other hand, it may just be 1000-or-so words of “brain droppings” as George Carlin would say.

Origins

We love folklore, especially the apocryphal sort that makes for good, pithy tales of wisdom. The space pen vs pencil urban legend is a great example. Right up there are the tales of the origin of the QWERTY keyboard layout. I have done a fair amount of research on this topic and cannot find any definitive proof that the origins of the layout were to “slow down typists” or to avoid the type hammers from colliding and becoming jammed. Another theory is that the layout was an early attempt at “vendor lock-in” so typists proficient with one brand would be likely to stay with that brand. Either way, we have long known the QWERTY layout is neither efficient nor ergonomically optimal.

Neither Mario, nor Mavis Beacon taught me to type. I learned in the early 90s on IRC and telnet-based talkers. The pace of conversation on these channels meant my fingers had to learn how to keep up… and fast. By the time I first sat in an actual “typing class” in the mid-90s (do such things still exist?) I could already break 100WPM.

By around 2007 I was approaching 20 years of my hands being mostly on keyboards and symptoms of RSI were beginning to set in. I had been vaguely aware of the Dvorak layout for awhile but never seriously looked at it until I saw several people claim that switching had eliminated their RSI issues. I didn’t relish the idea of wearing a wrist brace for the next 40 years, so I took the plunge.

The Innovation

So the story goes, August Dvorak set out to deliberately design a better keyboard layout. His work began with a study of the physiology of the hands and frequency analysis of the English language. The work began with the goal of placing the most frequently used letters on the “home row” of the keyboard. This was further refined by analyzing digram (adjacent pairs of letters - e.g. st th nt etc.) frequency and optimizing the layout such that keystrokes would naturally alternate between hands maximizing typing parallelism where possible, and otherwise organizing common digrams such that they can be typed with a single hand with keystrokes moving from the pinky finger towards the index. If you attempt to drum your fingers on the table, you’ll probably notice that is is easier to drum in that direction (pinky-to-index) than the other way around.

The net result was a keyboard that, in theory, allowed one to type faster, more accurately, and with less fatigue over time. The original Dvorak layout had the numeric characters in non-sequential order but the Dvorak layout we now know today is actually the simplified Dvorak layout.

I’m a geek with a pro-innovation bias, so I chose to adopt the innovation.

Retraining My Brain

By this point, I had well over a decade of muscle memory hardwired into my digits and I knew the switch would not be an easy one. The challenge is compounded by the fact that, although I changed my keyboard map in my OS, the only physical keyboards I had access to all sported qwerty keycaps. I had hoped I could just reposition the keycaps but it turned out that the caps on different rows had slightly different shapes. Hunt-and-peck was not an option to make the switch but this was probably for the better.

Memory is an interesting thing. For whatever reason, our memory is a write-on-read medium, meaning the act of recalling something from memory is what cements that memory in the future. We’re optimized to store that which we remember, and memories that are sufficiently infrequently accessed will eventually be garbage-collected. I needed a reference, however, so I printed out the layout and folded it into a make-shift table tent and placed the reference layout behind my keyboard.

At first the experience was agonizing. Every key I intended to type required me to stop, try to remember where that key was now located on my keyboard, receive the mental equivalent to a cache-miss, and reference my cheat-sheet. This was–and I cannot emphasize this enough–extremely disruptive to my productivity. The act of typing became an endless stream of mental context-switches. The good news was that is was short-lived. By the end of the day, I could throw away my reference. By the end of the week, I could type at a modest 30 wpm. Keyboard shortcuts, it turned out, belonged to a different set of muscle memory that required separate training. I know many a dvorakian who use a custom layout that retains the usual ctrl-c ctrl-v etc. keyboard shortcuts in their original place.

For the next couple of weeks, my typing speed slowly crept up back toward the 60wpm mark and soon I was typing faster than anyone else I knew. It’s just a shame that my brain–and not my fingers–was the bottleneck.

Having spent some time in Dvorak keyboard communities, I soon learned about the programmer Dvorak layout - a set of modifications introduced to optimize the keyboard for those who type code for a living.

Having tackled the truly monumental task of learning the core layout, the switch was largely incremental. With programmer Dvorak, common punctuation is shifted from the bottom row to the top row, special characters ([]{}()=*) are moved into more optimal positions on the numeric row and dominate, the numeric keys are in the shifted position (since, as programmers, one would assume we type parens and operators more often than magic numbers). Also, the numeric keypad is inverted but helpfully includes ABCDEF in the shifted position of 123456 respectively to make it very easy to type in hex values. In some ways, the change to the numeric row was harder.

Looking Back - The Good, The Bad, The Ugly

The Good

Let’s start with the good. I have not had any RSI symptoms in 15 years. My wrists and hands are pain-free even after many long-days at the keyboard. Second, I can type fast, faster than any non-stenographer I know and, unfortunately, faster than I can think.

I also have some geek cred that I can flex from time to time. My ego likes this.

One more good thing - it is very difficult to shoulder surf a password when I am typing. Security through obscurity, I know, but it is enough for someone to move on.

The Bad

At one point I had this fantasy that I would somehow become the keyboard equivalent of bilingual; that I could switch back and forth seamlessly at will. It doesn’t work that way. By the end of my retraining process, my neurons had irrevocably rewired. This also means that I can’t use anyone else’s computer effectively without first monkeying with the system settings (which is generally considered rude). Pairing and mobbing (where a single computer/keyboard is shared between two or more coders) becomes a problem with non-dvorakians. To prevent a lifetime ban in many coding mobs, I have a standing rule that every time I step out of the driver role, if I leave the layout on programmer dvorak I owe everyone involved a dollar per instance. This has cost me a lot of dollars.

You can’t game effectively on dvorak. WASD wields far too much political capitol to be overturned. While some apps allow you to remap your keys, enough have been built assuming qwerty and you just have to switch layouts.

In short, I pretty much have to keep dvorak and qwerty loaded at all times. My OS has a default key combo of ctrl-shift to cycle through layouts but my editors also use a lot of ctrl-shift-_something_ shortcuts that if I hesitate, the keyboard layout will shift on me alh ;fhhdlpt kja; g; ,jak Gzm ktglue (translation: “and suddenly this is what I’m typing.”)

The Ugly

Because I use such a non-standard keyboard, my current home keyboard is a Das Keyboard Ultimate 4 (the one with the all-black keycaps). Since I touch-type anyway, I don’t need the incorrect labels… except when I do.

Occasionally I find myself in a situation where I need that qwerty reference. Perhaps I am remoting into a vm or server that doesn’t have my layout setup yet and I need to type a password. Oh, and by the way, I often have no way to unmask the characters on that password box so I’m probably locking out my account. I’ve experience weird behavior using remote desktop tools as well, where characters double-convert (e.g. I type an H which is in the position of J on a QWERTY keyboard, which gets sent as a keystroke to the J position on the keyboard, which is a C on QWERTY) it’s weird and confusing. Also, I’ve been known to leave hanging remote desktop sessions running which somehow attach to the next person connecting and they can’t type anymore.

Arguably the most irritating fact of the Dvorak layout is the placement of the ‘V’ key… right next to the ‘W’. That might seem innocuous at first, until you try to type ctrl-v (paste) and accidentally type ctrl-w (close window). This happens to me daily.

Conclusions

Dvorak is an interesting case-study of the innovation-diffusion process. It’s interesting, however, because it is really a one-way decision. Discontinuation of the innovation is not a viable option. I’m not sure if this is good or bad. But, like vim and emacs devotees, I’m not sure I would want to go back.

I’m also not entirely sold on programmer dvorak. Supposedly it was designed based on C, Java, Pascal, Lisp, HTML, CSS, and XML. For more terse languages (and regex) you really need the full spectrum of special characters and there just aren’t enough good spots. &%$`^|@ are all in awkward places.

At one point iOS added the dvorak layout although I don’t use it on my phone. It turns out that remains yet another set of muscle memory that would need to be retained. I tried it for about a day and learned that One-thumb typing is a terrible experience on a dvorak keyboard as you’re constantly jumping from side-to-side.

All in all, I’m happy I made the switch as, if nothing else, my wrists don’t hurt anymore and I have some bragging rights. Would I recommend this to others? Only if you really wanted to.

Abstract Style: Space-Based Architecture

2023-06-21T00:00:00+00:00

Many developers strive for elegance efficiency in their code. Imagine a hypothetical developer who takes this too far. Perhaps this developer looks back in shame at the sloppy, inefficient implementations their much less experienced self had written. They were, undoubtedly and objectively, “bad.” Over time, their code grew cleaner and more efficient. They could measure progress in wall-clock time, algorithmic complexity, or other quality metrics. Somewhere, however, the pendulum swung too far. If their “bad” code was slow, and their “better” code is faster, it stands to reason that even faster is even better. They reached a point where they would squeeze out every spare microsecond at all costs. The result was difficult to read, change, and understand; perhaps “the juice was not worth the squeeze.”

Performance can be a valuable system capability, but only in context. Remember the “Mad Potter’s” wisdom on the difference between “a lot” and “enough.” Architecture trade-offs must be considered in many dimensional contexts. Never trade an excess of one capability for a deficit of a more important capability. That said, sometimes performance is among that short-list of business-critical capabilities–perhaps at the very top. What constraints induce high-performance at the architectural level? Read on to learn about the Space-Based Architecture.

This post is part of a series on Tailor-Made Software Architecture, a set of concepts, tools, models, and practices to improve fit and reduce uncertainty in the field of software architecture. Concepts are introduced sequentially and build upon one another. Think of this series as a serially published leanpub architecture book. If you find these ideas useful and want to dive deeper, join me for Next Level Software Architecture Training in Santa Clara, CA March 4-6th for an immersive, live, interactive, hands-on three-day software architecture masterclass.

Introducing the Space-based Architecture Pattern

This architecture takes its name from the concept of tuple space, an implementation of a shared memory space for parallel/distributed computing. It provides a repository of tuples that can be addressed and manipulated concurrently. In this architecture, potential performance bottlenecks are aggressively eliminated which results in a high performance and massively scalable architecture. The database is decoupled from the application by choosing to have each worker node store the entire dataset in memory in the form of replicated data grids and eliminating all direct reads/writes. Writes to the in-memory dataset are synced between worker nodes through some form of middleware which also asynchronously reads and writes to the underlying database. A separate component monitors overall load and scales the number of available workers up or down based on demand.

Although not easy (nor cheap) this abstract style forms the foundation of a highly performant, highly scalable, and highly elastic system architecture which offers a meaningful alternative to attempting to scale a database or adding in caching technologies to a less scalable architecture. Let’s explore what this blackboard/worker model looks like in practice.

The Processing Unit

The processing unit is the component of the system that contains the application logic and performs the business functions (whatever that may be). For practical reasons, the processing unit is generally a fairly fine-grained component with carefully-scoped data and optimal cold-start times. In other words, there may be multiple tuple-spaces depending on the size and scope of the overall system. In addition to the application logic, the processing unit contains the in-memory data grid and replication engine.

The Data Grid

The data grid is the replicated in-memory state of the processing units and is a central concern of this architecture. It is essential that each processing unit contains identical state at all times. The data grid may reside entirely within the processing unit with replication happening asynchronously between processing units but, in some implementations, an external controller is necessary in which case the controller element of the data grid would form part of the virtualized middleware layer.

The Virtualized Middleware Layer

This layer contains components that handle infrastructure concerns that might control some aspects of the data synchronization, request handling and may be off-the-shelf products or custom code.

The Message Grid

The messaging grid is essentially a load balancer in this architecture. The messaging grid manages requests and session state and will determine which processing units are available and distribute requests appropriately.

The Processing Grid

This is an optional component that handle request orchestration should multiple processing units be required to satisfy a given request.

The Deployment Manager

The deployment manager acts as the supervisor in this style. Observing load and capacity and either adding or removing processing units to/from the pool as required.

Data Pumps

Although, in theory, this architecture could indefinitely hold all critical data in volatile memory, we must plan for inevitable cold start/cold restart scenarios. Data Pumps provide eventual consistency between the in-memory datasets and the persistent storage with the database. Data pumps subscribe to the state changes asynchronously broadcast by processing units and then synchronously writes state changes to disk. A data reader is a separate pump responsible for providing initializing state to the processing units in the event of a cold start scenario.

Core Constraints

These capabilities are induced by a number of core constraints as follows:

Fine-Grained Components
High Operational Automation
All Data stored in-memory
Asynchronous State Replication and Database Writes

Constraint: Fine-Grained Components

This constraint states that the overall system is decomposed into fine-grained components. Precise granularity is not prescribed, but generally a single fine-grained component will be scoped to a single bounded context. This medium granularity exists in the continuum between coarse-grained (e.g. n-tier system) and fine-grained components (e.g. microservices).

Elasticity (+3)

Decomposed components are generally lighter-weight and offer reduced start times eliciting improved elasticity compared to monolithic systems or coarse-grained components.

Evolvability (+3.5)

Decomposition into components enables different parts of the system to evolve at different rates. Adding or changing functionality may often be scoped to a single component.

Fault-tolerance (+3.5)

Independent components of any granularity begin to reduce the operational cost of running any given component. It becomes feasible to run multiple instances of a component which reduces the risk of system-wide failures when any individual instance of a component fails. Furthermore, components generally represent only a subset of the overall system functionality which supports implementation decisions that enable the majority of the system to continue functioning.

Deployability (+3)

A system decomposed into compents paves the way for smaller-scoped deployments and partial deployments of the system. Deployments scoped to components, reduces deployment risk and simplifies pipelines.

Scalability (+3)

Smaller components may be individually scaled increasing total available compute where it is most needed.

Testability (+3)

Breaking an application into smaller components reduces the total testing scope and fine-grained components are usually split at the bounded context reducing inter-component testing scope.

Agility (+1.5)

Decomposing a system into independent components of any granularity with thoughtful module boundaries has the potential to improve business agility. Components may be independently evolved and deployed with reduced overall testing scope.

Configurability (+0.5)

Decomposing a system allows components to be individually configured and each component may utilize different framework versions, runtimes, languages, or platforms.

Integration (-2)

Fine-grained components are generally managed by separate teams. More coordinate is required, increasing coordination costs.

Interoperability (-2)

Fine-grained components are usually scoped to the bounded-context, this introduces interoperability challenges when communication spans domain boundaries and ubiquitious languages differ.

Performance (-2)

Fine-grained components generally require more network calls as many common functions increasing potential latency and degrading performance.

Simplicity (-2)

This constraint requires significant investment in infrastructure and pipeline development and typically requires more complex cloud implementation.

Constraint: High Operational Automation

This constraint dictates that the architecture exhibit a high degree of operational automation.

Elasticity (+3)

High operational automation enables more robust and responsive automated scaling of resources up or down as needed.

Performance (+3)

The ease of elasticity improves overall performance as needed, compensating for network overhead in some cases.

Configurability (+2)

A consequence of investment into operational automation is configuration management as part of the deployment/operational toolchain.

Evolvability (+2)

The induced agility and deployability improves overall system evolvability.

Agility (+1.5)

High operational automation correlates with high maturity and automation elsewhere in the build/deployment pipeline. High operational automation also allows for lower risk deployments due to deployment management strategies it enables.

Deployability (+1.5)

High operational automation also allows for lower risk deployments due to deployment management strategies it enables.

Fault-tolerance (+1)

A system with high operational automation provides more scope for managing and mitigating partial failures and recovery.

Testability (+1)

High operational automation correlates with high maturity and automation elsewhere in the build/deployment pipeline which improves system testability.

Scalability (+0.5)

High operational automation offers some capability to scale-up resources on demand.

Cost (-1)

This constraint introduces increased operational cost, development investment, and potential licensing costs for operational middleware.

Simplicity (-1)

This constraint requires significant investment in infrastructure and pipeline development and typically requires more complex cloud implementation.

Constraint: All Data Stored in-memory

This constraint dictates that components store all data in-memory.

Performance (+5)

This constraint eliminates both network I/O and database I/O thus massively improving performance.

Scalability (+5)

By removing the database and external I/O as scalability bottlenecks, scalability becomes effectively infinite.

Deployability (-1)

This constraint reduces deployability as deployments force a system-wide reload of data.

Configurability (-2)

This constraint tends to reduce configurability as data structures are typically fixed.

Fault-tolerance (-1.5)

This constraint introduces new failure conditions that can be difficult to manage. Crashes may result in lost state.

Cost (-3.5)

This constraint introduces high-resource utilization and increased baseline resource requirements. This also typically results in more overall compute resources necessary to run the system at all times.

Testability (-4)

This constraint significantly weakens testability when part of a distributed architecture. Although core logic may be easy to test, various failure conditions such as data collisions and race conditions are extremely difficult to test for at design/build time.

Constraint: Replicated Shared Data Grid

This constraint prescribes the use of a replicated/shared data grid common in space-based architectures, emphasizing data consistency and availability by maintaining multiple replicas of the data. The data grid handles shared state and replication helping ensure that even if one node fails, another can take its place without any data loss, enabling high availability and fault tolerance.

Integration (+3)

This constraint requires well-defined data schemas and integration contracts, improving integration.

Interoperability (+3)

This constraint requires well-defined data schemas and integration contracts, improving overall component interoperability.

Deployability (+0.5)

The replicated, shared data grid operates independantly of any running instances. This improves deployability in some situations supporting incremental deployments rather than full shutdowns and cold starts.

Evolvability (+1)

Evolvability is improved as the shared data grid infrastructure may be extended to additional tuple spaces as well as supporting partial deployments to reduce change costs.

Fault-tolerance (-1)

This constraint introduces new potential failure conditions including, replication errors, collisions, and race conditions that may be difficult to contain and mitigate.

Cost (-3)

The operational costs and licensing costs of products that implement this data grid can be quite high.

Summary

This abstract style offers some unique advantages in terms of linear scalability and high elasticity as well as significant overhead and challenges in terms of managing synchronization.

Abstract Style: Service-Based (a.k.a. ‘miniservices’)

2023-06-17T00:00:00+00:00

In our previous post we looked at a technically partitioned, coarse-grained, distributed architecture. Although that pattern balances capabilities like scalability and elasticity with low cost and simplicity, a domain-partitioned architecture may offer many advantages. For an organization with well-defined domains and domain-aligned teams, this next pattern offers a very balanced and pragmatic distributed architecture that offers comparable capabilities to an architecture like microservices at a fraction of the cost and complexity. In short, a Service-Based architecture offers a highly viable “middle-ground” between monoliths/n-tier architecture and microservices. This architecture can act as a stepping stone between monoliths and more complex topologies although often it will represent an acceptable and desirable end-state for green-field and legacy systems alike. Read on to learn about the Service-Based Architecture

Introducing the Service-based Architecture Pattern

Some architects call this pattern “service-based,” others have adopted the moniker “miniservices” as they are typically much more coarse-grained (less “micro”) than microservices. This pattern typically consists of several domain-centric deployable services (rather than the hundreds or thousands that might exist if the same system were deployed as fine-grained microservices). This is a very common pattern in the wild with some projects arriving styles derived from this pattern by design, many others end up here out of necessity.

“90% of organizations who try to adopt microservices will fail, they will find the paradigm too disruptive and often switch to miniservices” Gartner

At a high-level, this pattern typically consists of an independently deployable user interface, independently deployable coarse-grained services (exposing some kind of API), and a monolithic database.

Like all domain-partitioned architectures, this architecture starts with domain-driven understanding of the problem-space to be supported by the system which informs potential module boundaries. As an architect, you have ultimate control over individual component granularity, although typically module boundaries will initially break along domain boundaries (often referred to as “domain services”). Even with coarse gained components, most changes are scoped to a single domain which accelerates software delivery and improves overall agility.

In the abstract style, a single, monolithic database is shared among services and, although this increases coupling, it eliminates the complexity of splitting databases as well as all but eliminating the need for orchestrating distributed transactions and the overhead application-side joins. Of course additional constraints such as a partitioned shared database or federated database reduce coupling and improve scalability, in many cases this architecture offers “enough” of any given capability for most cases. While database coupling is a deliberate trade-off in this style, some code coupling is also common in the form of shared, versioned code libraries over duplication.

This abstract style doesn’t prescribe how code is structured inside any given domain service. If migrating from a layered monolith, it may be most expedient to extract a vertical slice of the monolith at roughly the domain boundary, creating layered code within each domain service (at least as as starting point). For greenfield development (or if you are decomposing a modular monolith or in the later stage refactoring of a layered monolith) it may be preferable to structure the internals of each domain service similarly to the modular monolith. This approach opens up an axis of future flexibility should it make sense to ultimately decompose the system into microservices or if, long term, the system would benefit from being further decomposed into “multigrain” services (some course-grained, some fine-grained as it makes sense).

Core Constraints

The overall capabilities of this abstract style can be depicted as follows:

These capabilities are induced by a number of core constraints as follows:

Medium-Grained Components
Independent Deployability
Domain Partitioning
Shared Database
Separation of Concerns
API communication

There are many strategies and approaches to API communication, each with various strengths, weaknesses, and trade-offs. We will presume a RPC-style API for simplicity.

Constraint: Medium-Grained Components

This constraint states that the overall system is decomposed into medium-grained components. Generally these components are scoped to a single business domain.

Elasticity (+3)

Decomposed components are generally lighter-weight and offer reduced start times eliciting improved elasticity compared to monolithic systems or coarse-grained components.

Fault-tolerance (+3)

Performance (+2.5)

Medium-grained components generally require fewer network calls as many common functions are co-located locally within the component boundaries reducing latency and improving performance.

Deployability (+2)

Scalability (+2)

Smaller components may be individually scaled increasing total available compute where it is most needed.

Testability (+2)

>Breaking an application into smaller components reduces the total testing scope and medium-grained components are usually split at the domain boundary reducing inter-component testing scope.

Agility (+1.5)

Evolvability (+1.5)

Decomposition into components enables different parts of the system to evolve at different rates. Adding or changing functionality may often be scoped to a single component.

Integration (+1)

Medium-grained components integrate well within the domain boundary, although crossing domains may introduce challenges.

Interoperability (+1)

Decomposing as system into components is a forcing function for integration. Components must be able to communicate and work together, which generally improves this capability at the system level.

Configurability (+0.5)

Decomposing a system allows components to be individually configured and each component may utilize different framework versions, runtimes, languages, or platforms.

Cost (+0.5)

Medium granularity of components reduces total cost of ownership in terms of development time, change time, and bandwidth costs but is offset by design, development, and compute overhead involved in a distributed system.

Workflow (+0.5)

Medium-grained components typically contain many modules and functions grouped by domain which modestly improve the ability for a module to orchestrate workflow behavior within a module boundary.

Constraint: Independent Deployability

This constraint states that the system as a whole need not be delpoyed at one time. Any individual component must be deployable in isolation. This consrtaint cannot co-exist with the monolithic deployment granularity constraint.

Agility (+1.5)

Independent deployability requires some rules around modularity. Clean and stable module boundaries must, therefore, exist (be it in the form of plug-in architecture, event processors connect by async processing channels, independant services, or service layers). These module boundaries in place, combined with independent build and deployment pipelines, significantly improve overall agility. To respond to change, a module can be create or modified and deployed with low overall risk to the rest of the system. Change scope remains constrained and deployment risk reduced increasing overall delivery velocity.

Deployability (+1)

In the same vein of Agility, overall deployability is improved.

Elasticity (+1)

Independent deployability improves component independence and improves elasticity.

Evolvability (+1)

The independant deployability constraint enables different parts of the system to evolve at different rates. Adding or changing functionality is often as simple as a single, scoped, focused deployment.

Scalability (+1)

Independent deployability improves component independence and improves scalability.

Cost (+0.5)

The capability of surgically-precise deployments reduces the overall total cost of ownership of the system. This capability is limited, however, as independant deployability typically requires additional investment in build and deployment infrastructure which incurs a cost penalty. In aggregate, the trade-offs of this constraint are a small net positive.

Simplicity (+0.5)

Although development and management of independant build and deployment pipelines introduce complexity and require some specialized skills, once this infrastructure is in place, the system as a whole is generally easier to maintain and modify. Again, in aggregate, the trade-offs evaluate to a modest net positive.

Constraint: Technical Partitioning

This constraint introduces some structure in terms of how components of the system are organized. In this case components are grouped by their technical categories. As the depiction of the abstract style indicates, usually these are along the line of UI/Presentation, Business logic, persistence, and database but this constraint can apply to both monolithic and distributed topologies.

Generally layers are consider either open or closed. Closed layers abstract any layers below the closed layer - meaning they must act as an intermediary. Open layers are free to be bypassed when it makes sense (the sinkhole antipattern defines a scenario where the layers don’t apply any meaningful changes or validation on the data as it passes through a layer).

Cost (+2)

Testability (+0.5)

Because this constraint add some structure to our software components, testing scope becomes better defined as do interfaces between layers.

Abstraction (+0.5)

Because one layer must interact with another, often this results in better interfaces and abstractions being put in place. As a result, generally this constraint slightly improves abstraction within the system.

Deployability (-1)

Deployability is degraded here, whether this constraint is applied to a monolithic or distributed technically partitioned system. Generally any single change requires modifications to all layers which increases testing and regression-testing scope and reduces velocity while increasing deployment risk.

Configability (-2)

Technical partitioning might introduce tight coupling between different parts of the application. This can make it difficult to change or configure one part without affecting others.

Evolvability (-2)

The large change scope, reduced deployment velocity, and increased change risk also adversely affects evolvability. The risk surface area is much larger than in domain-partitioned systems.

Constraint: Shared Database

This constraint states that the entire application utilizes a common database. While this is often a default of some abstract styles and patterns, it still strengthens and weakens capabilities and should be explicitly noted.

Cost (+1.5)

Generally sharing a single, shared database reduces licensing costs, hosting costs, and reduces development costs. Generally this also reduces data storage redundancy as there is much less need to replicate data to be visible to other application components.

Simplicity (+1)

Administration is simplified by virtue of having a single database to manage. Design is also simplified as all data modeling can be done at the application level rather than domain level.

Deployability (+0.5)

Deployment is generally straightforward as changes to a single database have reduced coordination costs. The improvement is modest, however, as DB changes at this scale can affect availability and introduce risk if schema that other components rely on change.

Configability (-0.5)

Configurability is reduced as any changes must be applied system-wide. A one-size-fits-all approach is generally required under this constraint.

Fault-tolerance (-0.5)

A single database becomes a single point-of-failure. Although most database management systems bring high-availability configuration options, if one database (the only database) is unavailable, the entire system is unavailable.

Scalability (-0.5)

Databases are notoriously difficult to scale. Multiple databases responsible for different parts of the data provide some level of parallelism and increase total capacity, a single database may be limited to scaling up.

Agility (-1)

Database changes potentially require coordinating with all teams and must be regression tested across all components. It can be very difficult to tell which teams are using various tables. Consequently, any change introduces risk which reduces change velocity.

Evolvability (-1)

The high coordination cost and testing scope also degrades evolvability.

Elasticity (-2.5)

As a single, shared resource, the system as a whole becomes less elastic because there is a ceiling to the single database’s capacity.

Constraint: Separation of Concerns

This constraint further narrows the technical partitioning constraint by being more prescriptive around how layer boundaries and modularity are defined. This constraint defines that code is not simply defined by technical area, but also logical concern.

Cost (+2)

Development and maintenance costs are reduced by adding this level of modularity to code. Developers may develop deep domain expertise in business logic (or subset of the business logic) which further reduces cost.

Testability (+1)

This constraint further reduces testing scope for any given change.

Agility (+1)

Agility is improved as the code generally has better boundaries, reduced testing scope, and potentially change scope.

Simplicity (+1)

This constraint generally improves simplicity of development and maintenance of the code. It is well-defined way to develop software, and this constraint improves understandability of the system components as well.

Evolvability (+0.5)

Evolvability is slightly improved as a consequence of the factors detailed above.

Constraint: RPC-API

This constraint states that network communication between components take place through an remote-procedure-call (RPC) application programming interface (API)

Configurability (+1.5)

An API provides configurability through versioned contracts, the potential for more fine-grained control over signatures and payloads, and potentially content-negotiation.

Fault-tolerance (+1)

Networked API calls enable real-time routing to API endpoints. Such calls may often be load balanced and load-balancers may be aware of API instance health. When API interactions are semantically safe and/or idempotent, clients are free to retry failed requests. RPC APIs may lack standard response codes and clients may not know which requests are idempotent hampering failure recovery of some requests.

Integration (+1)

APIs provide a well-defined interface for component integration.

Interoperability (+1)

APIs provide a well-defined interface for component interoperability.

Cost (+0.5)

APIs, even RPC APIs (which are often tightly coupled) standardize on an interface which reduces development and maintenance costs.

Elasticity (+0.5)

Networked API calls may be routed to multiple available endpoints improving potential elasticity.

Performance (+0.5)

RPC endpoints are generally highly specific which enables some optimizations. Although coupling is often increased as a trade-off.

Scalability (+0.5)

Network APIs improve overall system scalability slightly.

Abstraction (-0.5)

The highly-specific nature of RPC endpoints reduces API abstraction.

Deployability (-0.5)

The highly-coupled nature of RPC endpoints often require coordinated changes reducing overall deployability.

Evolvability (-0.5)

The highly-coupled nature of RPC endpoints make changes more difficult.

Agility (-1)

The difficult of changing APIs without frequent client coordination reduces overall agility.

Summary

Overall this pattern offers a broad set of good-to-above-average capabilities without introducing too much cost and complexity. This pattern is useful both as a target or stepping stone as part of an architectural modernization effort or as a design choice for a new system. Although no capability maxes out in terms of scoring, the Tailor-Made approach encourages designing “enough” of a capability into the system as, very often, “too much” or “a lot” can come at a very high cost that often simply isn’t worth paying.

Abstract Style: N-Tier

2023-06-14T00:00:00+00:00

So far in this series on abstract styles we have only looked at monolithic abstract styles and a handful of defining constraints. In reality, however, much–in not most–of the systems we will design as architects will involve some kind of network component. Although many implementations of the layered monolith or modular monolith may be extended through the addition of the client-server constraint (and, typically, some type of API constraint) the next several posts will focus on abstract styles that are, at their core, natively distributed. We will start by examining the N-tier architecture.

Introducing the N-Tier Architecture Pattern

Although older and less exciting than more recent distributed system patterns, the n-tier architecture maintains a meaningful share of distributed system architectures, offering a path of evolution, scalability for existing, technically partitioned client-sever layered monoliths. The n-tier architecture also builds on widely familiar paradigms, offering a reduced learning curve for developers needing to adopt a more distributed architecture. It offers a meaningful alternative to domain-partitioned architectures where the necessary organizational work of Domain-Driven-Design (DDD) and the potential re-orgs that may entail are currently out of reach or otherwise impractical for the organization shipping the software.

This pattern builds on the structure and ideas of a layered monolith and decomposing the logical layers of the monolith into independently deployable/scalable horizontal slices. A horizontal slice could also, in theory, be vertically sliced (but finding the optimal boundary can be tricky without first identifying the natural seams in the domains.) One of the earliest formal definitions of this pattern was published by Wayne Eckerson in 1995 in his publication “Three Tier Client/Server Architecture: Achieving Scalability, Performance, and Efficiency in Client/Server Applications.” Within a few years it became clear that, perhaps, three tiers might be too restrictive/prescriptive and many implementations varied the number of potential layers which afforded the architect more control over component granularity, which in-turn, affects scalability, elasticity, deployability, and agility. Notably, the generalized “n-tier” architecture is not prescriptive on the number of tiers (the size of n). N could, in theory, be as few as one (i.e. a monolith) or two (i.e. a client-server monolith). This creates some semantic ambiguity so, for our purposes, we will assume three or more tiers.

From the top-down, the first tier is the client. This could be a fat-client, a Single-Page Application (SPA) served as static files from a CDN or web server, a PWA, a Mobile App, or any number of others. The client communicates across the network to the server through some kind of API layer. Typically the responsibility of this layer is marshalling incoming requests, handling client authentication, output caching (if an optional cache-constraint is applied), rate-limiting (although it may make sense to delegate some of these to an additional API gateway layer between the client and the API layers). The API layer is generally fairly thin and lightweight which affords faster start times for elasticity.

The next tier is typically some kind of business-logic layer. Again, as a separate and independently deployable component, in theory this component may be independently deployed and scaled (although another intermediary load-balancing layer may be required).

Finally there is a persistence layer. This could be as simple as a database server, or there may be a persistence layer in between.

There is no formal guidance around how many layers are needed, but a general rule of thumb is “only as few as are necessary.” Every hop introduces latency and it is easy to accidentally implement a distributed sinkhole antipattern.

The Sinkhole Antipattern: A layered architecture where each layer is merely passing data up and down through all the layers with no modification.

Layers need not simply be horizontal slices of an existing monolith. Layers may also provide a modern and stable facade encapsulating legacy systems, improving overall interoperability and abstraction.

Core Constraints

The overall capabilities of this abstract style can be depicted as follows:

These capabilities are induced by a number of core constraints as follows:

Coarse-Grained Components
Independent Deployability
Technical Partitioning
Shared Database
Separation of Concerns
Client-Server
Layered System
API communication

There are many strategies and approaches to API communication, each with various strengths, weaknesses, and trade-offs. Given most organizations gravitate towards this particular pattern due to familiar conventions and low learning-curve, we will presume a RPC-style API owing to its similar familiarity and low learning curve.

Constraint: Coarse-Grained Components

This constraint states that the overall system is decomposed into coarse-grained components. Precise granularity is not prescribed, but generally a single coarse-grained component may span multiple domains or busines functions.

Fault-tolerance (+3)

Although hampered somewhat by the larger granularity, decomposed components are generally lighter-weight and offer reduced start times eliciting improved elasticity compared to monolithic systems.

Performance (+3)

Coarse-grained components generally require fewer network calls as many common functions are co-located locally within the component boundaries reducing latency and improving performance.

Elasticity (+2)

Although hampered by the larger granularity, decomposed components are generally lighter-weight and offer reduced start times eliciting improved elasticity compared to monolithic systems.

Testability (+2)

Breaking an application into smaller components reduces the total testing scope. Furthermore, generally components have well-defined API contracts which abstract implementation reducing complexity of integration testing across teams.

Deployability (+1.5)

A system decomposed into compents begins to pave the way for smaller-scoped deployments and partial deployments of the system. Deployments scoped to components, even coarse-grained components, reduces deployment risk and simplifies pipelines.

Scalability (+1.5)

Even at a coarse granularity, components may be individually scaled increasing total available compute where it is most needed.

Agility (+1)

Evolvability (+1)

Decomposition into components enables different parts of the system to evolve at different rates. Adding or changing functionality may often be scoped to a single component. Depending on other factors (module boundaries, coupling, system structure, etc) some changes may span multiple components, hence only a modest improvement on this capability

Integration (+1)

Decomposing as system into components is a forcing function for integration. Components must be able to communicate and work together, which generally improves this capability at the system level.

Interoperability (+1)

Likewise system decomposition is generally a forcing function to decouple components, modestly enabling interoperability across components.

Workflow (+1)

Coarse-grained components typically contain many modules and functions which modestly improve the ability for a module to orchestrate workflow behavior within a module boundary.

Configurability (+0.5)

Decomposing a system allows components to be individually configured and each component may utilize different framework versions, runtimes, languages, or platforms.

Cost (+0.5)

Coarse granularity of components reduces total cost of ownership in terms of development time, change time, and bandwidth costs but is offset by design, development, and compute overhead involved in a distributed system.

Simplicity (-1)

Decomposing a system into independent components of any granularity introduces additional design and operational complexity.

Constraint: Independent Deployability

Agility (+1.5)

Deployability (+1)

In the same vein of Agility, overall deployability is improved.

Elasticity (+1)

Independent deployability improves component independence and improves elasticity.

Evolvability (+1)

Scalability (+1)

Independent deployability improves component independence and improves scalability.

Cost (+0.5)

Simplicity (+0.5)

Constraint: Technical Partitioning

Cost (+2)

Testability (+0.5)

Because this constraint add some structure to our software components, testing scope becomes better defined as do interfaces between layers.

Abstraction (+0.5)

Deployability (-1)

Configability (-2)

Technical partitioning might introduce tight coupling between different parts of the application. This can make it difficult to change or configure one part without affecting others.

Evolvability (-2)

The large change scope, reduced deployment velocity, and increased change risk also adversely affects evolvability. The risk surface area is much larger than in domain-partitioned systems.

Constraint: Shared Database

Cost (+1.5)

Simplicity (+1)

Administration is simplified by virtue of having a single database to manage. Design is also simplified as all data modeling can be done at the application level rather than domain level.

Deployability (+0.5)

Configability (-0.5)

Configurability is reduced as any changes must be applied system-wide. A one-size-fits-all approach is generally required under this constraint.

Fault-tolerance (-0.5)

Scalability (-0.5)

Agility (-1)

Evolvability (-1)

The high coordination cost and testing scope also degrades evolvability.

Elasticity (-2.5)

As a single, shared resource, the system as a whole becomes less elastic because there is a ceiling to the single database’s capacity.

Constraint: Separation of Concerns

Cost (+2)

Testability (+1)

This constraint further reduces testing scope for any given change.

Agility (+1)

Agility is improved as the code generally has better boundaries, reduced testing scope, and potentially change scope.

Simplicity (+1)

Evolvability (+0.5)

Evolvability is slightly improved as a consequence of the factors detailed above.

Constraint: Client-Server

This constraint separates client concerns from backend and data storage concerns.

Evolvability (+1.5)

By separating client concerns from server concerns, both the client and the server may evolve independantly. Furthermore, evolvability in the form of portability is incuded should multiple client implementations be requires (e.g. mobile platform specific clients)

Workflow (+1)

The workflow capability is modestly improved as a client may orchestrate some workflows.

Agility (+0.5)

The independent evolvability of the client and server induce a modest amount of agility. Client-server coupling generally limit this and often coordinated updates are required.

Configurability (+0.5)

Although limited in scope, clients may store user-specific or client-specific configuration independently from the rest of the system.

Deployability (+0.5)

Client and server may be deployed indepentently. Client-server coupling generally limit this and often coordinated updates are required hence this is only a modest improvement.

Elasticity (+0.5)

Decoupling client and server introduces some elasticity capability.

Scalability (+0.5)

Decoupling client and server allows improved independent scalability

Simplicity (+0.5)

The client/server separation of concerns slightly improves overall system simplicity.

Testability (+0.5)

The client/server separation of concerns reduces some testing scopes slightly improving overall testability.

Cost (-0.5)

This constraint slightly increases development, operational, and delpoyment costs.

Performance (-0.5)

Although separating client and server increase total available compute, this is typically offset by this constraint introducing network latency and overhead to most operations. Separating the client and server slightly reduces overall performance of the system.

Fault-tolerance (-1.5)

Typically both the client and the server must be available for normal functioning of a client-server system as well as an available network connection. This dependency reduces the overall fault-tolerance of the system (although this may be improved through additional constraints)

Constraint: Layred System

This constraint allows the overall architecture to be composed of multiple hierarchical layers. Often these are "closed" layers, constraining component behavior such that each component typically cannot "see" beyond the immediate layer. "Open" layers, in contrast, may be bypassed.

Abstraction (+2)

Layers, particularly "closed" layers, improve abstraction as a layer can be used to encapsulate legacy services and to protect new services from legacy clients. A layer facade to a legacy system, or an API layer provides some degree of abstraction within the system.

Evolvability (+2)

Layers may often be evolved indpendently (depending on overall coupling).

Simplicity (+1)

"Closed" layers restrict knowledge of the system to a single layer which limits overall system complexity.

Agility (+0.5)

The ability to abstract legacy systems combined with the innate composibility of layers slightly improve overall agility

Deployability (+0.5)

Decomposing a system into layered components with abstractions in the form of APIs or facades enable each layer to be independently deployed, slightly improving overall deployablity of the system.

Fault-tolerance (+0.5)

Separate layers may modestly improve fault-tolerance when deployed in a high-availablity fashioned.

Interoperability (+0.5)

This constraints potential for component abstraction and encapsulation operates as a forcing function towards improved interoperability. The gain from this constraint is modest but improved by additional constraints that ensure reduced coupling.

Testability (-1)

Although each layer introduces reduced testing scope, end-to-end testing of the system is often required and more difficult under this constraint. Furthermore, layers encapsulating legacy or external systems may be more difficult to test in isolation.

Performance (-1.5)

A disadvantage of layered systems is that they add overhead and latency to the processing of data and communication between layers, reducing performance.

Cost (-2.5)

Decomposing the system into layers increases operational costs requiring more compute instances for each layer and potentially introducing bandwidth costs.

Constraint: RPC-API

This constraint states that network communication between components take place through an remote-procedure-call (RPC) application programming interface (API)

Configurability (+1.5)

An API provides configurability through versioned contracts, the potential for more fine-grained control over signatures and payloads, and potentially content-negotiation.

Fault-tolerance (+1)

Integration (+1)

APIs provide a well-defined interface for component integration.

Interoperability (+1)

APIs provide a well-defined interface for component interoperability.

Cost (+0.5)

APIs, even RPC APIs (which are often tightly coupled) standardize on an interface which reduces development and maintenance costs.

Elasticity (+0.5)

Networked API calls may be routed to multiple available endpoints improving potential elasticity.

Performance (+0.5)

RPC endpoints are generally highly specific which enables some optimizations. Although coupling is often increased as a trade-off.

Scalability (+0.5)

Network APIs improve overall system scalability slightly.

Abstraction (-0.5)

The highly-specific nature of RPC endpoints reduces API abstraction.

Deployability (-0.5)

The highly-coupled nature of RPC endpoints often require coordinated changes reducing overall deployability.

Evolvability (-0.5)

The highly-coupled nature of RPC endpoints make changes more difficult.

Agility (-1)

The difficult of changing APIs without frequent client coordination reduces overall agility.

Summary

This pattern offers a path to improved scalability, elasticity, and evolvability without sacrificing too much cost overhead and complexity for organizations where domain-partitioned patterns are not a good fit or who wish to scale their legacy client-server monolith. Styles derived from this pattern may be adequate for a system’s needs, but this approach may be a stepping stone towards a future architectural state.

Sufficiently Advanced Technology

Hurdles of Innovation

How Ideas Spread

From Innovator to Change Agent

Hurdle #1 Perception of a Need

Hurdle #2 Finding a Solution

Hurdle #3 Packaging for Adoption

Hurdle #4 Reaching Critical Mass

Hurdle #5 Managing the Diffusion Process

Understanding the Attributes of an Innovation

Relative Advantage

Innovation Compatibility

Innovation Complexity

Innovation Reinvention

Innovation Observability

Innovation Trialability

Optimizing Innovation Attributes

Optimizing Relative Advantage

Authority-Driven Decisions

Collective Decisions

Individual/Optional Decisions

Optimizing Compatibility

Optimizing Complexity

Optimizing Trialability

Optimizing Observability

Relationships and Influence

Conclusion

CD & DevOps for Relational Databases Part II - How

Change Management Approaches

Desired State Configuration

Migrations

DDL - The Rest of the Story

Managing Database Changes

Database Change Categories

Compatible Changes

Breaking Changes

Online & Offline Changes

Just Enough SQL Internals

Physical Storage/Organization

Index Overhead

Lock Compatibility

Strategies for Reclassifying Changes

Offline Compatible to Online Compatible

Online Breaking to Online Compatible

Other Strategies for Phased Changes

What about NoSQL?

Conclusion

General Best Practices

PostScript: Emoji Support

CD & Devops for Relational Databases Part I - Why

The Reluctant DBA

DBOps and the Impedance Mismatch

Data > Code

Embracing the DevOps Mindset

Enter DevOps

To Be Continued…

Third-Way Web Development Part II - Hypermedia 2.0

Modernizing Mago

Performance and UX

Incremental Improvements

HX-Boost

Improving UX by Reducing Time-To-First-Paint

Booked Income class="txt-color-blue">$Loading...

Total Pipeline class="txt-color-greenDark">$Loading...

Booked Income class="txt-color-blue">$Loading...

Total Pipeline class="txt-color-greenDark">$Loading...

Lazy Loading and Loading Indicators

Active Leads

Active Leads

Trigger Modifiers

Custom Events & Reactive Components

Partial Swaps and Pushing History

Security Considerations

Conclusions

Meet Michael Carducci…

Third-Way Web Development Part I - History

A Brief History of the Web and Web Development

It Started With Information

The Birth of the Web

A New Type of Application?