This came up in a CI/CD pipeline discussion with Aman Khurana at our customer FedEx a few weeks ago, and the more I think about it, the more I like the phrase. I’m going to call it “CCLC” for short.
In my view, this is a great summary of the important goals of a modern scalable deployment, and what it can mean for management.
Configuration Consistency – to a point. You will probably have necessary environmental differences: one environment might be at a cloud hosting provider and another in your enterprise, so reaching the cloud-hosted system might involve a VPN. Beyond differences like that, the environments should be configured as consistently as possible. Data consistency is an entirely different set of questions that I’ll get to later in the post.
We want Cohesion in the way you deal with the necessary differences between deployments. If one system has a networking setup that requires some detail, but another system doesn’t, try to drive that difference with configuration rather than with separate code or extra dependencies.
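To make that concrete, here is a minimal sketch of what “configuration, not separate code” might look like. Everything in it is an assumption for illustration: the `Deployment` and `VpnSettings` types, the `connection_plan` function and the host names are hypothetical, not something from the customer conversation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VpnSettings:
    gateway: str
    port: int = 443


@dataclass(frozen=True)
class Deployment:
    name: str
    api_host: str
    vpn: Optional[VpnSettings] = None  # None means "reachable directly"


def connection_plan(dep: Deployment) -> List[str]:
    """Return the ordered steps needed to reach a deployment.

    One code path serves every environment; only the data differs.
    """
    steps = []
    if dep.vpn is not None:
        steps.append(f"open VPN tunnel via {dep.vpn.gateway}:{dep.vpn.port}")
    steps.append(f"connect to {dep.api_host}")
    return steps


# Hypothetical environments: one in the enterprise, one at a cloud provider.
ON_PREM = Deployment(name="on-prem", api_host="app.internal.example")
CLOUD = Deployment(
    name="cloud",
    api_host="app.cloud.example",
    vpn=VpnSettings(gateway="vpn.cloud.example"),
)
```

The VPN detail lives in the data for the one environment that needs it, so there is no cloud-only script to forget about when something changes.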
The config pipeline needs to be cohesive in the sense that it all follows a similar kind of thinking. That shows up in how you use the tooling: if you must use multiple tools for your pipeline, at least source the configuration for those tools from one cohesive place, so that you are less likely to miss a change for tool X or Y. It’s far too easy to forget a change when it lives in a totally different place.
Using the same tool for everything would be great – if your pipeline will actually be complete in some predictable amount of time. For example, forcing every step to be Ansible based, when one of the steps in the middle doesn’t provide Ansible kinds of tools or doesn’t fit the Ansible model, is a poor use of everyone’s time. Practically speaking, if you plan with cohesion in mind, you can source the working data for Ansible and whatever other tool you use from the same repository, so that you have fewer chances of making mistakes.
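As a rough sketch of that idea, the snippet below renders configuration for two tools from one shared source. The `SOURCE` dictionary, the host names and the “other tool” JSON layout are all made up for illustration; the only real format assumed is Ansible’s INI-style inventory with a `[group:vars]` section.

```python
import json

# Hypothetical single source of truth, e.g. one file kept in version control.
SOURCE = {
    "app_name": "forum",
    "app_port": 8080,
    "hosts": ["web1.example", "web2.example"],
}


def ansible_inventory(source: dict) -> str:
    """Render an INI-style Ansible inventory from the shared source."""
    lines = ["[web]"]
    lines += source["hosts"]
    lines += ["", "[web:vars]", f"app_port={source['app_port']}"]
    return "\n".join(lines) + "\n"


def other_tool_config(source: dict) -> str:
    """Render JSON for a second, non-Ansible tool from the same source."""
    payload = {
        "service": source["app_name"],
        "targets": [f"{h}:{source['app_port']}" for h in source["hosts"]],
    }
    return json.dumps(payload, indent=2, sort_keys=True)
```

Changing `app_port` once changes it for both tools, which is the whole point: the tools can differ, but the place you edit does not.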
Configuration and cohesion are operationally important, but the big part of the conversation we had was about Coupling. As I define coupling in this context, if performance problems in one particular part of a system can cause issues in another part, I think of those systems as coupled. This is where the conversation with my customer became less about CI/CD and more about scaling.
Coupling comes in many forms. Big individual instances of your important systems (i.e. vertical scaling) couple more and more of your availability to the availability of that single instance, which then needs more and more reliability tooling. Coupling to things like individual VMware hosts or hardware devices is a clear issue.
The alternative is horizontal scaling, and previous posts I’ve done talk about containerization and what it brings to horizontal scaling.
With horizontal scaling, you need to pay even more attention to data consistency, and then that brings up the common scaling question: You could adopt inherent coupling like central databases, but then you have a similar vertical scaling problem. And scaling central databases can be very expensive.
As an aside, I fervently believe that many of the problems we all used to solve 25 years ago with an ORM didn’t really need a central, ACID-compliant database. And that model, DB + ORM, is just about as coupled as you can get, and really only scales via vertical methods.
Eventual consistency can sometimes fix many of the transaction rate problems inherent in centralized systems. Rewriting for eventual consistency can be expensive, but there may be ways around that cost if you plan the user interaction around eventual consistency.
But how do you get CCLC?
To my mind there’s a specific planning exercise you have to do, and it starts at another aspect of consistency: do I really need the data to be consistent up to the millisecond? What bad thing would happen if it weren’t? From whose perspective? Can we manage the user interaction in ways that make expensive data consistency unnecessary?
By way of a clear example of that line of thinking, 20+ years ago, my colleague Mike and I were building a fairly simple online threaded discussion forum. The open source tools we started with had a fully consistent model; it went back to the database many many times for every single request. We were crashing hard at the busiest time of the day. Mike had a rather brilliant insight: when you post to a forum, the first thing you want to do is read your post to make sure you didn’t misspell something.
Because of that, given the length of time it would take to read your new post, you typically wouldn’t want the main forum view until several seconds later. So what we did was send the user to a short ‘this thread only’ view of the forum instead of the full view after posting. This was very much enabled by the fact that the “view” to “post” ratio was heavily on the view side, like 1000+ to 1.
This change to the user workflow allowed us to convert the main forum view to a static page that was independently refreshed every 5 seconds, plus we got the benefits of the Linux built-in disk cache. Effectively, the on-disk version was a cache. And it was eventually consistent with the source of truth – up to the last 5 seconds – but from the user perspective it was perfectly consistent.
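The original code is long gone, so what follows is only a minimal sketch of the pattern, not what Mike and I actually ran. The function names, the `render` callback (standing in for whatever builds the page from the database) and the file layout are assumptions; the 5 second interval is the one from the story.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Callable

REFRESH_SECONDS = 5  # the interval mentioned in the post


def write_snapshot(render: Callable[[], str], target: Path) -> None:
    """Render the main view once and atomically swap it into place."""
    html = render()  # the only step that touches the database
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(html)
    os.replace(tmp_name, target)  # readers never see a half-written file


def serve_main_view(target: Path) -> str:
    """Serve the cached page; no database work happens per request."""
    return target.read_text(encoding="utf-8")


def refresh_forever(render: Callable[[], str], target: Path) -> None:
    """Background loop: rebuild the snapshot independently of page views."""
    while True:
        write_snapshot(render, target)
        time.sleep(REFRESH_SECONDS)
```

The design choice that matters is that the database load now depends on the refresh interval, not on the number of page views: one render every 5 seconds, however many people are reading.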
That change saved us over 1000 db queries per second. Production scaling was now a non-issue, instead of the site dying right in the middle of the busiest part of the day when the db was overwhelmed. It was so fast that people started really ramping up the threaded discussions. And we really only had a single server now serving 300+ web views per second.
But the point is that from the user perspective, it was consistent – because of the 5 second+ delay before they read the main view. So we built eventual consistency using the user’s behaviour.
So my TL;DR: To get to high cohesion, high consistency but low coupling, you start your planning by being very clear on necessary consistency, then use those insights to decide how much coupling you allow. To get to that point, you need a cohesive strategy, especially around configuration storage, to reduce errors. Configuration consistency from single sources of truth like version control systems helps drive cohesion as you get to the operational scaling you need. Lastly, explore eventual consistency by using user perception and behaviour to reduce the transactional nature of the data store and help you drive production scaling.