Malone’s moves: a chess analogy

John Malone, one of the eight CEOs mentioned in my post about the Outsiders book, is featured in Cable Cowboy, a book that tries to describe how he built his cable empire and, in the process, compounded the stock of his cable company TCI by an astounding 30.3% per year for over 25 years.

Written by Mark Robichaux, a Wall Street Journal reporter who covered several of Malone’s deals over the years, the book provides some behind-the-scenes color around his myriad cable and content deals.

But it does not, at least to my satisfaction, explain exactly what Malone did that was so different from other CEOs. He was clearly very smart and well educated – he graduated from Yale University with a B.A. in Electrical Engineering and Economics as a Phi Beta Kappa and National Merit scholar, and obtained an M.S. in Electrical Engineering from an NYU program at Bell Labs as well as a Ph.D. in Operations Research from Johns Hopkins.

And the book describes how Malone learned about cable directly from some of its early pioneers. He was clearly good at financial engineering and pioneered many of the techniques used by private equity firms today (such as the aggressive use of debt leverage). He could do this because he was early in realizing that cable revenues were reliably recurring, like those of a utility (but unregulated!), so they could be used to raise a lot of debt inexpensively. He also systematically maximized the tax benefits of financing his cable assets in this manner. I think he may have been somewhat lucky to not get wiped out at various points in his career while operating with a high level of leverage.

But beyond these operational strengths, I think he was particularly good at multiplying value via his deal-making. He was perpetually buying and selling various cable and content assets, but it’s not obvious how all that wheeling and dealing actually created value.

An analogy struck me while reading about some of his deals: the concept of trading small advantages in chess.

Let’s say you start with a pawn sacrifice in the opening to get a move advantage. As time passes, unless one plays forcefully, this temporal advantage can quickly dissipate. So good players often convert it into a positional advantage if the opportunity presents itself. A positional advantage is more structural and hence more robust. Later in the end-game it can, in turn, be converted into another kind of advantage – a passed pawn or perhaps a sacrifice to get an attack on the opponent’s king, etc. Thus there is a constant trading of advantages, from transient to permanent ones, depending upon the board situation. And a skilled player can usually translate this kind of trading of small advantages into a win.

I get the sense that Malone was very good at doing something equivalent in his business deals.

Using his deep knowledge of the cable industry, he could sense when a cable asset became available at an attractive price. He had a good sense of the intrinsic value of the cash-flows of cable assets. He would opportunistically buy such a mispriced asset even if it was not what he ultimately wanted (e.g. not in a region where he was building a roll-up of cable assets). Just like in chess you collect a small advantage when you can, even when it’s not a mating attack on the opposing king. You do that to get something to trade with.

Then he would patiently wait, sometimes for years, before an opportunity came to sell this asset, which would usually have appreciated by then (since he bought it when it was distressed). He would use the cash from this sale to then turn around and buy an asset that he really wanted all along. Or buy back shares in TCI if they were undervalued.

Thus he avoided overpaying for premium assets – the downfall of many of his competitors who were pursuing so-called “strategic M&A” on the advice of their investment bankers.

An example of his patience was evident in how he waited to buy content assets (programming) until he had enough scale from rolling up a bunch of regional cable providers. Once he had enough scale in cable distribution, he was in a strong negotiating position to acquire content assets on favorable terms.

And, since he could distribute the content to more subscribers than his competition, he was able to net more cash flow from his content assets. He would then leverage this additional cash-flow by raising more debt and buying more cable subscribers. And so on. This kind of virtuous cycle (more subscribers -> more content -> more subscribers) with increasing returns to scale can indeed explain compounded returns of 30% per year for more than two decades.
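To get a feel for what that rate of compounding means, here is a minimal Python sketch. It uses the 30.3% per year over 25 years quoted for TCI at the top of this post; the 10% comparison rate is purely my own hypothetical benchmark, not a figure from the book.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Factor by which one dollar grows at a constant annual rate."""
    return (1.0 + annual_rate) ** years


# Figures quoted for TCI in this post: 30.3% per year for 25 years.
tci_multiple = growth_multiple(0.303, 25)

# Hypothetical comparison rate (an assumption, for scale only).
benchmark_multiple = growth_multiple(0.10, 25)

print(f"TCI-style compounding:  {tci_multiple:,.0f}x")
print(f"10% benchmark (assumed): {benchmark_multiple:,.1f}x")
```

Under these assumptions a dollar grows by a factor in the hundreds, versus roughly an order of magnitude at the assumed 10% rate, which is why the rate is so striking.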

He was able to get to this point by systematically trading one advantageous deal into another, like a master chess player, thus multiplying overall value (in other words, by multiplying what economists call gains from trade).

The book ends triumphantly by describing how he crowned his career by finally selling TCI to AT&T, once again opportunistically, when he judged that they were paying an attractive price for it.

And then he is supposed to have retired. 

Except he did not!

I think Malone is still playing this game, only this time in Europe, even in his seventies!

A company he chairs and controls, Liberty Global, is well along the way to owning a cable franchise that dominates Europe. There are significant economies of scale in doing so in such a dense and contiguous geographical area – just think of a cable truck being able to efficiently serve neighboring regions vs. one that services installations scattered all over the map.

And yet Liberty Global also owns some assets in Chile, completely disconnected from Europe! This fact was puzzling me when I was initially analyzing the company until the chess analogy came to mind. I now suspect Malone bought the Chilean cable assets opportunistically, when they were available for cheap, knowing full well that he would trade them later for what he really wanted – assets in the dense areas of Europe.

Indeed, the Liberty Global CEO has recently been talking about selling the Chilean cable assets, and the company is in the process of buying more cable in the Netherlands that is contiguous with its other European cable units.

In another repeat of the TCI playbook, Liberty Global is only now going about acquiring content in Europe. They have begun by making some small investments recently (e.g. a small position in ITV), but clearly waited until they had rolled up enough distribution muscle before they did so. At this point they are already the largest cable company (by number of subscribers) in Europe and thus clearly can get very attractive terms from any content producer there. And just like TCI, they can then monetize the acquired content better than others since they have the largest number of subscribers.

As Yogi Berra said, it’s deja vu all over again.

 

Disclosure: I am long Liberty Global (LBTYA) in various personal and professional portfolios.

 

 

 

 

The Disruption Controversy

There has been a lot of hubbub on social media around a recent New Yorker article by Harvard historian Jill Lepore sharply disputing the famous “disruptive innovation” model of Clayton Christensen, who teaches at the famous business school of the same university. Christensen has responded, with apparent disappointment and anger, in an interview he recently gave to BusinessWeek. Many Silicon Valley VCs seem to have come out in support of Christensen’s theory on Twitter, claiming that they practically live and breathe it in their daily hunt for disruptive startups.

A second, and ostensibly unrelated, meme has also recently gone viral on Twitter. I think this was due to über-VC Marc Andreessen, who devoted one of his famous tweetstorms to an offhand remark by his buddy Jim Barksdale, the former CEO of Netscape:

… there’s only two ways I know of to make money: bundling and unbundling.

I think these two powerful ideas are actually related, and underlying both is that ultimate force of technological change – Moore’s law. I will use this (rather long and meandering) post to try and clarify my own thoughts on this topic since it’s of high interest to me as an investor in technology-related moated companies.

Christensen’s theory of Disruptive Innovation

A provocative analysis of this idea can be found in the work of blogger Ben Thompson. He has illustrated the key idea in one of his distinctive sketches. The path of disruption looks something like the orange line in this:

Adapted from Figure 5-1 in the Innovatorʼs Solution, Christensen, Raynor

The key thing to notice is that products improve more rapidly than consumer needs expand. This means that while the incumbent product may have once been subpar, over time it becomes “too good” for most customers, offering features they don’t need yet charging for them anyways. Meanwhile, the new entrant has an inferior product, but at a much lower price, and as its product improves – again, more rapidly than consumer needs – it begins to peel away customers from the incumbent by virtue of its lower price. Eventually it becomes good enough for nearly all of the consumers, leaving the incumbent high and dry.

In an interesting post on this topic that predated the firestorm caused by Lepore, Thompson suggested that there are actually two types of disruption theories articulated by Christensen over time; he argued that one of them is flawed (emphasis in the following excerpt is mine):

The original theory of disruption, now known as new market disruption, was detailed in Christensen’s seminal paper Disruptive Technologies: Catching the Wave and expanded on in the classic book The Innovator’s Dilemma. Based primarily on a detailed study of the disk drive industry, the theory of new market disruption describes how incumbent companies ignore new technologies that don’t serve the needs of their customers or fit within their existing business models. However, as the new technology, which excels on completely different attributes than the incumbent’s product, continues to mature, it eventually takes over the market.

This remains an incredibly elegant and powerful theory, and I fully subscribe to it. We are, in fact, seeing it in action with Windows – the incumbent – and the iPad and other tablets; new technology that is inferior on attributes that matter to Windows’ best customers, but superior on other attributes that matter to many others.

And:

It is Christensen’s second theory of disruption – low-end disruption – that I believe is flawed … Briefly, an integrated approach wins at the beginning of a new market, because it produces a superior product that customers are willing to pay for. However, as a product category matures, even modular products become “good enough” – customers may know that the integrated product has superior features or specs, but they aren’t willing to pay more, and thus the low-priced providers, who build a product from parts with prices ground down by competition, come to own the market.

Thompson then goes on to argue that Christensen has been badly and repeatedly wrong in predicting the demise of Apple products based on this second, low-end disruption version of his theory. He correctly points out that, in fact, Apple’s hit iPhone at its initial launch was clearly much more expensive than many of the products it has successfully “disrupted”: iPods, dumb cellphones, PDAs, GPS navigators, etc. And the iPhone launched with a rich superset of the features of the disrupted devices rather than a cheaper subset. This was not supposed to happen according to the disruption from the “low-end” theory.

But in a recent post on the success of Chromebooks, Thompson himself seems to be using this low-end version of the disruption hypothesis, contradicting his earlier assertion! Apparently, even the low-end disruption idea does work at times.

So the questions remain: Exactly when does which theory predict successfully? Under what circumstances?

Barksdale-Andreessen theory of Bundling and Unbundling

As HBR’s blog explains, Barksdale, a veteran of IBM, FedEx, and McCaw Cellular, “was brought on a few months after Netscape’s founding to provide adult supervision as its CEO”. He made his bundling and unbundling comment at the end of an investor roadshow in answer to Microsoft’s decision to bundle the Internet Explorer with Windows. In the HBR interview he elaborates:

I had worked for several businesses during my career by that time that had become conglomerates, some fairly large, and then had divested themselves of various businesses. I’m on the board of Time Warner, we have just parsed off our third major part — our original company, Time Inc., which is the publishing arm of Time Warner. We [already had] divested ourselves of Time Warner Cable as well as AOL. So, it’s not uncommon to add a bunch of companies together, much less software products, and then divest yourself of them as the shareholders think they have more value standing alone than standing together. You do it to get your stock price up.

… It’s easier to do in the digital age. It’s easier to bundle and unbundle digital products …

In his tweetstorm, Andreessen gives an example of how this process works:

1/A story of unbundling in the tech industry: 20 years of consumer Internet evolution —

2/One upon a time there was AOL, which was a completely integrated Internet access/information/communication service.

3/Then Yahoo came along and unbundled the information/communication parts like email/IM/sports-scores/stock-quotes from the access service.

4/One of the things you could do on Yahoo was search, then Google came along and unbundled that.

5/You can search for anything on Google, including people; Facebook came along with a much better way to just search for people.

6/Three things you can do on Facebook are messaging, photo sharing, and status updates; therefore Whatsapp, Instagram, and Twitter.

7/And yes, Yo unbundles the creation & existence of a message from the contents of a message, unbundling Whatsapp and Twitter :-).

8/Ev Williams () is the modern genius of this concept–playing out in our industry continuously since the 1950’s.

9/The part people often miss is that you can get extremely powerful second/third order effects at each step with his pattern.

10/The entrepreneurs generally have a pretty good sense of this when they’re doing it, but it doesn’t become clear to others until later.

11/This is a pattern what we love to fund: unbundle X from Y, but then use the liberation of X as leverage to do amazing new things with X.

12/And the howls of press and analyst outrage at the apparent stupidity of each unbundling are very helpful for keeping valuations down :-).

1/The flip side of unbundling: Later on, the unbundlers tend to try to rebundle in the image of whatever they unbundled.

2/So Yahoo adds an ISP (), and Google adds email/IM/sports-scores/stock-quotes.

3/Twitter changes its user profile page to look more like Facebook :-).

4/Sun unbundled DEC with commodity components, then re-bundled into a proprietary computing stack just like DEC w/Solaris, Sparc, etc.

5/Microsoft likewise unbundled DEC minicomputers w/PC OS + tools, then rebundled into DEC-like integrated stack now including hardware (!).

6/Paraphrasing Harvey Dent: “You either die a hero or you live long enough to see yourself become the company you first competed with.”

7/And then sometimes the rebundlers realize what they’re doing and try to reverse course. E.g. Microsoft building apps for iOS & Android.

8/And thus the cycle of life repeats with yet more unbundling :-).

The key driver underneath all this is technology change – the bundles emerge as a consequence:

And so the newspaper bundle, the idea of this slug of news and sports scores and classifieds and stock quotes that arrives once a day was a consequence of the printing plant. Of the metro area printing plant, of the distribution network for newspapers using trucks and newsstands and newspaper vending machines and the famous newspaper delivery boy. That newspaper bundle was based on the distribution technology of a time and place.

When the distribution technology changed with the internet, there was going to be the great unwind, and then the great rebundle, in the form of Google and Facebook and Twitter and all these new bundles.

I think music is a great example of that. It made sense in the LP and CD era to [bundle] eight or 10 or 12 or 15 songs on a disc and press the disc and ship it out and have it sit in storage until somebody came along and bought it.

But, when you have the ability online to download or stream individual tracks, then all of a sudden that bundle just doesn’t make sense. So it [unbundled] into individual MP3s.

And I think now it makes sense that it’s kind of re-bundling into streaming services like Pandora and Spotify.

And the bundling or unbundling of the product actually directly affects the bundling or unbundling of the business:
 
So one of the other things you see happening in music now is actually the music industry getting reconfigured and being split out. There are now companies that are entirely online record labels that have started from scratch. Or there are companies that are entirely focused on merchandise sales. There are companies entirely focused on touring. And the old record labels that are still bundled businesses corresponding to a bundled product offering are struggling to adapt to this new world with lots of new competitors that are effectively unbundled.
 
Andreessen goes on to explain how this pattern helps him identify a promising startup:
 

Often, a key characteristic of large incumbents in any industry is, they have a bundle that is accumulated over time, for the reasons that Jim described [“because it’s an effective growth strategy. Once you try to grow the business, it’s an easier out to stay focused on your core and then add things to it.  And you become a big bundle again”].

And then what we look for is for something to have changed in the underlying technology. The arrival of the Internet was a big one. The arrival of mobile distribution. The arrival of social networks. The arrival of Bitcoin is a current example.

So, we look for something to change in the underlying technology, and then basically say, “Well, you know, gee, if you were to sit down today with a clean sheet of paper, and you knew that the technology was changing, then what would be the proper form of the product, if you were starting from scratch?

That’s the question that’s always the hardest for an incumbent to ask, because that’s the classic innovators dilemma. And that’s the question that’s the easiest for the startup to ask, because the startup literally is somebody sitting down with a clean sheet of paper. All they have is the ability to think from first principles, think from scratch.

I would say we look actively for the pattern of large incumbent, established industry, bundled product or service offering, coupled with underlying technology change, coupled with idea for unbundled product that the customer might prefer, and then of course coupled with an entrepreneur who can actually build a business around that. I think that’s a fairly common pattern.

Andy Grove’s theory of vertical-to-horizontal transition

But even before Christensen and Netscape, way back in the early 1980s, Intel’s legendary former CEO Andy Grove had already commented upon the remarkable transition from vertical silos to horizontal modules that was then completely disrupting the landscape of the computer industry:
 
 
This change is obviously related to both the bundling/unbundling theory and the disruption theory. But I think the key underlying driver in Grove’s model is the all-powerful Moore’s law (the observation that silicon chips roughly doubled in performance at the same price point every couple of years).
 
My Take: Both disruption and bundling/unbundling arise from shifts in the point of scarcity
 
I think both of these phenomena are really just long-term consequences of Moore’s law. Let me explain:
 
  • Every few years, like clockwork, the incessant, exponential rise in price/performance of silicon hardware alters what is scarce and what is abundant in the value chain of any product that depends upon silicon.
  • Economic value shifts to whatever is scarce. The newly abundant becomes a commodity and loses its value.
  • It is precisely this constant shift in the point of scarcity over time that ultimately drives all the disruptive bundling, unbundling and rebundling that goes on in so many industries.
 
A particularly powerful consequence of Moore’s law is that software can do more and more things every few years. In other words, by exploiting the cheaper and more powerful hardware, software “eats the world” in Andreessen’s evocative phrase, encroaching on a wider and wider circle of companies and industries. As pointed out by Grove, this can change vertical value chains into horizontal ones. And then new value chains form around the winners.
 
Here is how I think the process works:
 
  • Every few years, this scarcity shift causes a specific “module” of a value chain bundle to become vastly more valuable than the surrounding pieces.
  • This creates an opportunity for value creation by unbundling. As Andreessen has observed, this can be exploited by a competitor (or entrepreneur) starting from scratch and offering just that valuable point of scarcity as an attractive product offering.
  • This unbundling disrupts the existing value chain incumbents. Some nimble companies may be able to co-opt this process and quickly adapt; those who cannot will decline. The devil is in the details, as Lepore has pointed out. Of course, predicting the winners and losers in this transition cannot be easy, otherwise everyone would be a great investor! It is probably helpful to conduct the mental exercise suggested by Andreessen: what would be the proper form of the product, if you were starting from scratch today? This is useful since some things can be done very differently – perhaps more simply (and hence more cheaply) – today due to the ongoing scarcity shift. For example, high-speed internet and smartphones make digital distribution nearly a free commodity today. Many businesses that have spent a lot of money on distribution infrastructure will find that many parts of their overall value proposition have become a commodity. Imagining building an existing product or service all over again from scratch can cause this fact to pop out in our minds, leading to better predictions.
  • Once the unbundled product has succeeded in capturing its market, it is fairly easy to rebundle more and more features around it to increase its value (as Barksdale has observed).
  • This goes on until Moore’s law changes the game – the point of scarcity – again, causing the disruptive cycle to repeat!
 
I find this a clearer and deeper explanation of the observed disruptions as compared to the simplistic model graphed at the beginning of this article. And I will venture to guess that if we remove all examples that can be better explained by such a (Moore’s law driven) shift in scarcity from the various Christensen case studies, not much will be left that escapes Lepore’s valid criticism. However, I have not actually done this exercise so I cannot be sure about this. Perhaps there will remain some type of disruptions that are better explained by his model – it will be useful to know.
 
Update:
 
A conversation with my friend Alex led to a provocative question: surely there are other drivers of change than just Moore’s law, so why focus so much on that one force?
 
And, to be sure, there are many varieties of improvements that companies go through in improving their price/performance metrics over time. These could include various kinds of process learning (at the individual as well as the team level) as well as efficiencies that come from pumping a larger volume through manufacturing, for example.
 
It seems to me, though, that Moore’s law is perhaps unique in having compounded at such a high rate for many decades – most processes have not done that. Compounding over that long leads to improvements by factors of millions and billions, and most processes just cannot improve that much. So it’s particularly important to be on the right side of Moore’s law; otherwise one risks being disrupted due to its unique ability to shift the point of scarcity every few years, like clockwork.
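The “millions and billions” claim is easy to check with a few lines of Python. This is only a rough sketch: the two-year doubling period follows the “every couple of years” description earlier in this post, and the 5% per year “ordinary process” is a hypothetical rate I picked for contrast.

```python
def doubling_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Cumulative improvement if performance doubles every doubling period."""
    return 2.0 ** (years / doubling_period_years)


def steady_factor(years: int, annual_gain: float) -> float:
    """Cumulative improvement for a process that gains a fixed percentage per year."""
    return (1.0 + annual_gain) ** years


for years in (10, 20, 40, 60):
    moore = doubling_factor(years)
    ordinary = steady_factor(years, 0.05)  # hypothetical 5% per year learning curve
    print(f"{years:>2} years: doubling every 2 years = {moore:,.0f}x, "
          f"5% per year = {ordinary:,.1f}x")
```

With these assumptions, forty years of doubling gives a factor of about a million and sixty years about a billion, while the assumed 5% process improves by less than a factor of twenty over the same sixty years.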
 
Having said that, I think the point about the shift in the point of scarcity remains valid and useful even if there are other drivers of rapid change – Moore’s law is just one of many potential drivers of such change.
 
 
 

Why is this “mate in 3” so hard?

I like chess puzzles, and if you are like me you know that a “mate in 3” can have only a limited number of solutions and can usually be solved within, say, 10 to 15 minutes (master level players will of course be much faster). However, the following puzzle turned out to be much trickier, at least for me (I am only a club level player). Before going on, give it a try:

Black to move and mate in 3 moves:

[Chess diagram: the puzzle position]

If you can’t wait for the answer, you can see it fully described here by Joe Weisenthal, the prolific economics editor of Business Insider. As he says:

“the eminent chess player and commenter Susan Polgar posted on her blog the following:

Black to move and checkmate in 3. Please no computer analysis. This is a very cool checkmate. Try to find it for yourself.

Now chess problems where you’re asked to just mate in 3 moves aren’t typically all that hard, so my curiosity was piqued by the fact that Polgar said not to use a computer. Obviously it couldn’t be that simple if you’re tempted to use a computer to solve a 3 move chess problem.

And it wasn’t! In fact I spent the evening looking at it yesterday without getting it.”

But now I am intrigued – exactly why is this elegant little puzzle so surprisingly hard?

And that I find to be an even more interesting puzzle, one of human psychology. I think a hint can be obtained by trying to solve an entirely different, almost trivial, problem in arithmetic. Try this:

A bat and a ball cost $1.10. The bat costs $1.00 more than the ball.

How much does the ball cost?

If you answered, like most people, 10 cents then you are wrong!

The correct answer, which will be immediately obvious upon reflection, is 5 cents (since $1.05-$0.05 = $1.00).
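The algebra behind this can be spelled out in a few lines of Python. This is just a sketch of the two-equation setup (ball + bat = total, bat = ball + difference); the function name and the use of integer cents are my own choices.

```python
def solve_bat_and_ball(total_cents: int, difference_cents: int):
    """Solve ball + bat = total and bat = ball + difference, in cents."""
    # Substituting the second equation into the first gives 2*ball + difference = total.
    remainder = total_cents - difference_cents
    if remainder < 0 or remainder % 2 != 0:
        raise ValueError("no whole-cent solution for these inputs")
    ball = remainder // 2
    bat = ball + difference_cents
    return bat, ball


bat, ball = solve_bat_and_ball(total_cents=110, difference_cents=100)
print(f"bat = {bat} cents, ball = {ball} cents")

# The intuitive answer fails the first equation: a 10 cent ball implies a 110 cent bat.
intuitive_ball = 10
intuitive_total = intuitive_ball + (intuitive_ball + 100)
print(f"intuitive answer implies a total of {intuitive_total} cents, not 110")
```

The intuitive answer satisfies only one of the two constraints, which is exactly why it feels right at first glance.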

This problem comes from the research of Shane Frederick, a collaborator of Daniel Kahneman, one of the world’s top cognitive psychologists. Frederick’s paper, “Cognitive Reflection and Decision Making” (Journal of Economic Perspectives, Volume 19, Number 4, (2005) pp 25–42) describes what might be happening in such problems.

Here are two more arithmetic problems from the paper:

If it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines to make 100 widgets?

In a lake, there is a patch of lily pads. Every day, the patch doubles in size. If it takes 48 days for the patch to cover the entire lake, how long would it take for the patch to cover half of the lake?

If you said 100 minutes for the widgets or 24 days for the lake, then once again you got the wrong answer (the correct answers are 5 minutes and 47 days). But you are not alone – a surprising majority of students from elite universities, including maths and physics types, get these elementary problems wrong (data in the paper cited above and various follow-up papers).
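Both problems yield to a slow, System 2 style calculation, which can be sketched in Python. The function names are mine, and I use exact fractions only to avoid any floating-point noise.

```python
from fractions import Fraction


def minutes_needed(widgets: int, machines: int, rate_per_machine: Fraction) -> Fraction:
    """Time for a number of machines, each working at the same rate, to make the widgets."""
    return Fraction(widgets) / (machines * rate_per_machine)


# 5 machines make 5 widgets in 5 minutes, so each machine makes 1/5 widget per minute.
rate = Fraction(5, 5 * 5)
widget_minutes = minutes_needed(widgets=100, machines=100, rate_per_machine=rate)


def day_patch_reaches(fraction_of_lake: Fraction, day_full: int = 48) -> int:
    """Walk backwards from the day the lake is full, halving the patch each day."""
    day, coverage = day_full, Fraction(1)
    while coverage / 2 >= fraction_of_lake:
        coverage /= 2
        day -= 1
    return day


half_lake_day = day_patch_reaches(Fraction(1, 2))

print(f"100 machines, 100 widgets: {widget_minutes} minutes")
print(f"lake half covered on day {half_lake_day}")
```

In both cases the trap is linear extrapolation: the machines work in parallel, and the lily pads grow exponentially, so half the lake is covered just one day before the end.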

So what is going on?

Our brains can be thought of as consisting of two separate but interacting systems. As Kahneman explains in his brilliant Nobel lecture:

“The operations of System 1 are fast, automatic, effortless, associative, and difficult to control or modify.

The operations of System 2 are slower, serial, effortful, and deliberately controlled; they are also relatively flexible and potentially rule-governed.”

System 1 can be thought of as the intuitive system and System 2 as the reflective system – what we normally call “thinking”. Obviously, neurons are firing in both cases, but System 1 feels so effortless that most people don’t realize the massive extent of neural processing involved in, say, seeing that we are looking at a chair, since such acts of perception are accomplished by the fast System 1.

It is likely that System 1 has “hardwired” the critical processing that our ancestors needed frequently, like perceiving objects and making very quick (“intuitive”) judgments. It is basically a pattern recognizer. But it can also learn new things after sufficient repetitions – it is where our habits reside.

System 2 is more flexible and algorithmic, like a computer in its style, albeit a very slow one. The overall executive control is also a part of System 2.

As the Frederick paper explains, the three arithmetic problems are

“easy” in the sense that their solution is easily understood when explained, yet reaching the correct answer often requires the suppression of an erroneous answer that springs “impulsively” to mind.

And that is exactly what I guess is going on with the chess puzzle above.

Every chess puzzle lover “knows” that a discovered check pattern is often the heart of many a pretty mating sequence. And sure enough, there is a very seductive discovered (and, indeed, double!) check available on the second move after the obvious (and correct) rook check at first move.

So our intuitive pattern detector jumps to the conclusion that this discovered check just has to be part of any solution. It leads us down the proverbial garden-path and we tend to waste a lot of time on this dead-end.

The solution to the chess puzzle finally emerges only after we have somehow managed to suppress this discovered-check idea. After that happened to me – hours later – I finally focused on the fact that the white king is completely locked in after the rook check on the first move. And after this “aha” moment, finding the two moves by the black bishop that deliver the coup de grâce was not too hard.

Of course a computer would solve this problem in milliseconds, since the search tree is so small. And the computer, in a sense, is all logical System 2 with no intuitive System 1 to mislead it!
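To make the “small search tree” point concrete, here is a sketch of the kind of exhaustive search a computer would run. The puzzle position itself is not reproduced here, so this is not a chess engine: the search is written against two hypothetical callbacks (`legal_moves` and `apply_move`), and I exercise it on a toy take-away game of my own choosing, where a player with no legal move is “mated”. A real mate-in-3 solver would plug a chess move generator into the same loop.

```python
def can_force_mate(state, moves_left, legal_moves, apply_move) -> bool:
    """True if the side to move can leave the opponent without a legal move
    within `moves_left` of its own moves, whatever the opponent replies."""
    if moves_left == 0:
        return False
    for move in legal_moves(state):
        after_move = apply_move(state, move)
        replies = legal_moves(after_move)
        if not replies:
            return True  # opponent has no legal move: "mate"
        # Every possible reply must still lose to a shorter forced sequence.
        if all(
            can_force_mate(apply_move(after_move, reply), moves_left - 1,
                           legal_moves, apply_move)
            for reply in replies
        ):
            return True
    return False


# Toy stand-in game (an assumption, not the chess puzzle): players alternately
# take 1 or 2 stones from a pile; whoever faces an empty pile cannot move.
def legal_takes(pile: int):
    return [take for take in (1, 2) if take <= pile]


def take_stones(pile: int, take: int) -> int:
    return pile - take


print(can_force_mate(7, 3, legal_takes, take_stones))  # forced win within 3 moves?
print(can_force_mate(7, 2, legal_takes, take_stones))  # within only 2 moves?
```

The search has no notion of a “seductive” move: it tries the boring candidates with exactly the same diligence as the flashy ones, which is why it never wanders down the garden path.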

Unlike the arithmetic problem, there has been no research on this chess puzzle to my knowledge, so this explanation of why this simple enough mate is so hard for humans is just my best guess at this point.

Just in case this discussion makes System 1 appear dumb, it is worth keeping in mind that artificial intelligence programs still cannot come anywhere close to being able to perceive patterns that are trivial for humans. System 1 is also the likely source of creative insights, and the two systems together are responsible for all the glories of human achievement.

Incidentally, if you enjoy this sort of stuff, Kahneman’s Thinking, Fast and Slow is a true masterpiece and easily one of the best books I have read in the last few years.