SolarWinds and the evolving economics of information security

For most of us in all parts of information technology, including at small colleges, December brought gifts we didn’t need, largely in the form of the stunning lapses of security at SolarWinds.

For those who missed it, SolarWinds is a key provider of IT utility software to organizations of all shapes and sizes. They make sure the campus network is running, that databases and servers are working right, that you understand how your applications are doing. Think of them as an auto mechanic combined with an airport’s control tower and you get the right idea: work that is both essential and too often an afterthought.

The bad news was that a bad actor (guess who) managed to break into SolarWinds’ systems and corrupt some of the software the company distributes to tens of thousands of organizations, creating a “backdoor” that this actor could take advantage of. As it happens, only a fairly small percentage of their customers were actually exploited by the suspected nation-state actor, but the backdoor was a risk for thousands of organizations for months.

Needless to say, much of December and January was spent assessing and reviewing, determining if your organization was affected, and then cleaning up the mess if it was. Even if it wasn’t, there was still work to do on the lessons learned from the event, largely around exploits that can affect unified login services, the crown jewels of almost every IT shop.

This event comes as Davidson College is in the midst of a sea change in how we select technology solutions. Within Davidson Technology & Innovation, our strategy has rapidly moved to a partner-first model, where we strongly prefer IT services delivered by third-party partners. Often this is in the form of software as a service (SaaS), where we pay a firm to develop, host and operate a solution end-to-end. In other cases, it’s in the form of paying others for infrastructure services that they run. (Our campus laptops used to send backup data to a stack of hard disks in a data center; now, they back up to the cloud directly.)

Every CIO and chief information security officer, or their organizations, inevitably ask or are asked the same question every time a breach like this comes up: is that strategy the right one in a world where a SolarWinds breach can flow down to thousands of organizations?

In my mind, it’s always a matter of trade-offs between different risks. And at the end of the day, the risks of remaining in a do-it-yourself world are far more extensive and pervasive than relying on a network of partners.

~ ~ ~ ~ ~

As the leader of an IT team at a small college, I know our environment is far less complex than my counterparts’ at larger research universities, where I spent almost twenty years working before coming to Davidson. Our network structure is simpler; our firewalls and defense mechanisms have fewer edge-case requirements; we run a much smaller server environment; we have fewer laptops and desktops.

Yet if anything, the complexity of managing the “DIY” systems we do have continues to grow every year.

An important concept to keep in mind is that an organization must attempt to guard against every technology risk, while an attacker must find only one weakness to get started. A common pathway looks something like this:

Launch ‘phishing’ attacks targeting students, faculty and staff in an organization, trying to find some way to get access to an account from the outside.
Once you have a trusted account, use it to phish others in the environment. It’s much harder to detect one campus account sending malicious emails to another than it is to detect mischief from the outside.
Try to use the accounts you have gained access to in order to get some kind of foothold on a system, perhaps by loading malware like Trojan horse software on a user’s computer.
Once you have a foothold, begin poking around to see what you can find, including looking for what servers and systems you can access and, using easily obtained software, probing various flaws and weaknesses on servers and networks to try to burrow your way in deeper.
If possible, use flaws and exploits to try something called ‘privilege escalation,’ which might mean jumping from a standard user account to one that has administrative privileges on a server or the whole network.
At this point, you can proceed to eavesdrop and sneak out data, if espionage is your goal. Or you can cause mayhem through a ransomware attack, where you try to corrupt the backup files, encrypt key servers, and ask for a six- or seven-figure ransom to let the victims back into their own systems.
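
The asymmetry noted earlier, where defenders must cover every system while an attacker needs only one weakness, can be illustrated with a rough back-of-the-envelope calculation. The sketch below is only that: it assumes, hypothetically, that each system independently has the same small chance of carrying an exploitable gap at a given moment. The function name, the 1% figure and the system counts are illustrative assumptions, not measurements from any real environment.

```python
def chance_of_at_least_one_gap(systems: int, gap_probability: float) -> float:
    """Probability that at least one of `systems` has an exploitable gap.

    Assumes every system has the same, independent chance of a gap,
    which is a simplification made purely for illustration.
    """
    if systems < 0:
        raise ValueError("systems must be non-negative")
    if not 0.0 <= gap_probability <= 1.0:
        raise ValueError("gap_probability must be between 0 and 1")
    # The attacker fails only if every single system is clean.
    chance_all_clean = (1.0 - gap_probability) ** systems
    return 1.0 - chance_all_clean


if __name__ == "__main__":
    # Hypothetical environment sizes, each with an assumed 1% chance of a gap.
    for systems in (10, 100, 500):
        risk = chance_of_at_least_one_gap(systems, 0.01)
        print(f"{systems} systems: {risk:.1%} chance of at least one gap")
```

Under these assumptions, the chance that something is exposed climbs quickly as the number of systems grows, which is one way to read the appeal of running fewer systems yourself.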

T&I, like our peers, is constantly working to defend against the above.

For one thing, we’ve learned over the years how critical it is to keep bad actors from taking control of campus accounts. This is part of why we’ve rolled out stronger password requirements and mandatory two-factor authentication. We also use AI-based and human review of login activity to spot suspicious login circumstances. (This has helped a lot to stop attacks at the earliest opportunity.)

Inevitably, over a long enough time horizon, a persistent attacker will get a foothold. (Tempting as it is for IT people to daydream of all the campus IT systems sitting in a lightless room, inaccessible to anyone who does not walk into that room to apply for college here or enter a student grade or run the week’s payroll or process a gift to the college, we realize that systems have to be actually usable; perfect security is never possible.)

Once they do, then the attacker can poke and prod and look for any weakness.

We keep up on these, too. We track updates from our vendors letting us know when hardware and software have newly discovered flaws (vulnerabilities) that need to be patched.

This happens more than most outside IT would guess. Frankly, it happens more than it did a few years ago, and the pace is not getting better.

True story: in December, we patched a key part of our environment with routine software updates, something we do between each semester and often over extended breaks. Days after we did it, the vendor came out with a new, red-alert level patch for a major risk.

We went back, Jack, as Steely Dan once sang, and did it again.

Every quarter, vendors like Oracle release these patches. Every month, Microsoft does. Adobe and Google and Apple sometimes release them multiple times per month. So do Cisco, Juniper, HP, Dell, VMware and just about every other company in existence.

Not all of these are hair-on-fire risks, but plenty of them are. And my team, like every team in IT, spends a ton of time chasing them down and performing the updates, which have to be done with care to avoid creating that second foothold an attacker needs.

Add to this the time you spend running automated tools to scan your network from inside and out for high-risk vulnerabilities you have missed, or to catch misconfigurations that happen through human error.

~ ~ ~ ~ ~

One of the really frustrating things about the model above is how much time isn’t spent making systems more modern and easy to use, or delivering a better user experience, or helping to digitize and automate work that college staff do so they can spend more time on higher value work.

So much time is purely overhead – patching and updating and mitigating risk to maintain homeostasis, to keep the technology body simply running.

I contrast this work with cases where we are relying on third party vendors for technology solutions.

For instance, take forms and workflow solutions. We are pivoting that work from in-house technologies we bought or built ourselves, to the Kuali Build hosted platform for building online forms and simple web apps. That platform in turn leverages a second platform for system-to-system integration, allowing us to create virtual ‘pipes’ that flow data between key systems.

By leveraging a platform, we are paying the vendors to build and maintain the solution, including all the patching and updates to keep the solution secure. Meanwhile, our staff can actually build forms and apps and workflows that our campus customers need, or can connect systems to exchange data between them securely, and more quickly than ever before.

In most cases, our vendor partners in turn are building their solutions on cloud-scale infrastructure like that delivered through Amazon Web Services, Microsoft Azure, or Google Cloud, which in turn handle much of the low-level patching and security efforts.

This second advantage, the reliance on cloud-hosted infrastructure, is crucial. When you develop solutions from the ground up to run on AWS, Azure or GCP, assuming you have a capable team, you aren’t taking software you built to run in a private data center and simply running it on hosted servers.

You’re actually designing solutions coded to run in and make the greatest use of cloud infrastructure: compute, storage, web, load balancing, serverless, and database functions. All you run is the code base; patching and newly found flaws are handled automatically, usually with multiple layers of redundancy if architectures are built correctly.

Tech folks love the notion of recursion, and there is a certain beauty to the recursion here: leverage solutions that let your staff work in code, running on top of software platforms designed as code, running on infrastructure maintained by large-scale operations teams using code and updated continuously.

No matter how strong a local technology staff is — and my team at T&I is one of the best I’ve ever worked with in my career — there is simply no way to replicate the scale, speed and security that properly-used cloud architectures bring.

~ ~ ~ ~ ~

So where did SolarWinds go wrong? And how do their challenges inform how Davidson and other organizations approach technology procurement?

At a high level, SolarWinds is one of several vendors making the essential on-premises tooling I described earlier. It’s hard to run local servers and networks without the kinds of tools that they and others sell. The reality is, if you maintain servers and applications on-premises, you’re going to have platforms like SolarWinds or their competitors on site.

One reaction, then, is to continue on or accelerate the path that we are on: migrating key applications including sensitive-data systems to modernized platforms.

It certainly is the case that the vendors we pick carry risks. We and our peers vet vendors using a range of security measures, including higher ed-specific cloud vendor assessment tools, and we have walked away from several SaaS applications where the vendor just was not able to deliver security that meets our expectations.

If those vendors are building natively on cloud-scale architectures, it reduces one level of risk — the platform/homeostasis risk I described above with self-hosting.

It still carries the risk that the vendor’s own code base or employee laptops or database is somehow hacked or attacked with ransomware, of course. This happened with one of our major partners, Blackbaud, last summer.

There are two key differences between this type of risk and the risks you carry on-premises, however:

With on-premises systems, a ransomware attacker is likely to encrypt and hijack most or all of your systems, creating much greater business disruption. With SaaS, you are spreading your risk across a much greater number of vendors.
SaaS vendors have to price the risk of their liability into the service, and have strong incentives for disaster recovery to bring services back to normal operations as quickly as possible.

SolarWinds is in this sense simply another encouragement to continue down the pathway we are on to move critical systems to SaaS. We may ask some different questions of vendors as a result of this, but the risk scenario remains the same.

One hesitates to raise the proverbial day-after quarterbacking with SolarWinds. Partly it’s not good form; partly, one learns to be superstitious quickly in IT, and so I will be holding my lucky rabbit’s foot and knocking on the table as I write the next few sentences.

Still, it’s impossible to look at the news coverage of SolarWinds in recent weeks and not see red flags of concern related to the company’s approach to security.

As Bloomberg noted last month, SolarWinds management appears to have not prioritized security. They were warned by an engineer at a company they acquired that SolarWinds “was an incredibly easy target to hack.” Bloomberg alleges that many systems at the company ran on outdated operating systems and were missing security patches, and that there was a culture that prioritized product features over security. Indeed, in an unrelated event in 2019, one of SolarWinds’ server passwords had leaked online.

The password? solarwinds123.

Mind you, state-sponsored actors like the alleged Russian hackers of SolarWinds have skill, persistence and resources. They will find their way into an organization if they really want to. But when a company’s safeguards are too lax, it’s far too easy for such a failure to happen.

SolarWinds’ new CEO wrote a blog post earlier this month outlining changes that the company was going to make in light of the breach, and they’ve brought on two respected veterans to help – ex-Facebook CISO Alex Stamos, and the former head of CISA, Chris Krebs, who was fired by Pres. Trump for validating the integrity of the 2020 elections.

All of the improvements SolarWinds’ CEO outlines seem prudent. A couple (like their need to deploy multi-factor authentication across all remote-access and application access pathways) were eyebrow-raising for a company of their scale. Still, companies that face massive breaches such as this one usually do the right thing and clean up their proverbial acts quickly.

A more sobering lens on the breach comes from Matt Stoller, a writer and scholar focused on economic concentration and monopolies. He makes the provocative claim that SolarWinds’ acts were in keeping with, and perhaps enabled by, the culture of private equity prioritizing immediate profits over underlying investment in the product, including in security.

Private equity acquisitions of software vendors have long been a groan-worthy moment for technology managers, since it tends to be the case that prices go up and product innovation goes down, for many of the non-security related reasons Stoller explains. And as Stoller notes, in many cases security breaches act more like an economic externality such as pollution, pushing costs further down supply chains or to insurers while the firm profits.

~ ~ ~ ~ ~

While the need for on-premises keys-to-the-kingdom solutions from firms like SolarWinds will wane as we move more services to vendor partners, the issues Stoller raises will become more important than ever as we consider how to ensure SaaS firms have values and incentives aligned with what we want from these firms.

Indeed, Stoller’s thesis is likely to lead my scrutiny of partners to veer from the purely technical and the purely security to a new dimension: how much is your firm focused on doing the right things for your customers and your long-term viability? To what extent do you seem motivated by short-term profit instead of long-term sustainability?

Ultimately, we’re going to have to take a closer look at the economic models of companies and their place in an economic food chain among conglomerates and subsidiaries.

It used to be that the largest vendor brought the lowest price and greatest quality, one of the inequality-inducing truisms of the industrial and early Information Age.

Yet in a world where a large enterprise SaaS vendor and a smaller firm like Kuali both can leverage the same underlying cloud scale architecture, the advantage of size and scale begins to change.

Is a 1,000-person SaaS firm better at security than a 50-person shop? Historically, in a world of vendor-managed data centers with unending patching workload, probably.

Using cloud scale architecture? Perhaps not. Both firms have an equal ability to build a secure product, if they have the willingness to.

If cloud is a great equalizer for security, then there is an opportunity for smaller, more focused firms to deliver products that better focus on customer needs, while avoiding cultures of profit and scale at any cost that create the kind of deep risks Stoller writes about.

Obviously, size and security are not linearly inversely correlated. Central to my thesis is that the largest firms — the Microsoft Azures, AWSes, Google Clouds — are incredibly focused on security. Yet to date, they have generally earned that right from their behavior in the market. And since running secure infrastructures is their whole business, and workloads are relatively portable between them over at least a medium term, the incentives are well-aligned.

There are also plenty of SaaS firms that are small because they’re early stage, hungry for customers, and will do anything for a sale. A piece of old Gartner advice sticks with me here: if you can get something unique and value-adding from such a firm, consider them, but recognize they run deep security risks and plan for that in what kind of data you store in them.

No, to me, the opportunity lies mid-scale, when we are looking at the firms who make the vast majority of on-premises and SaaS software.

These firms need to speak to us more about not just what their products can do, but about how they are owned and financed. How they prioritize work. How they build a culture of security, and who maintains it.

And when we see firms that continue to absorb every competitor and new entrant in their field, creating line extension after line extension, CIOs need to be worried about more than pricing pressure: we need to know that the firm we are trusting with our business is trustworthy in return.

For what it’s worth, I think this all portends a huge opportunity for smaller firms that want to engage in real partnerships.

Most of the deals we have done in my time at Davidson where we are happiest are with mid-sized firms with independent ownership and a clear product focus and direction.

At the same time, some of the platforms we struggle the most with are those (I won’t name names here) that have been passed along between PE firms over the years, where talented staff leave and the product stagnates, since it is not of much value to its owners beyond the company’s perpetual bond coupon income stream.

There is a significant opportunity here for firms who want to focus and innovate in the context of an economically sustainable, but responsible business model.

On the other hand, for those firms who want to continue down a pathway that prioritizes maximum revenue for minimal investment, or who seek to dominate a market from all sides to keep customers locked into substandard or insecure solutions?

The power of the cloud is not that large scale wins, but that large scale enables smaller scale organizations to be laser-focused and kick your lazy butts out of an enterprise.

And if you don’t think I and many of my CIO colleagues are scrutinizing our vendor dance cards to decide whom we want to be in business with, you’ll find out at the first non-renewal notice.
