Malone’s moves: a chess analogy

John Malone, one of the eight CEOs mentioned in my post about the Outsiders book, is featured in Cable Cowboy, a book that tries to describe how he built his cable empire and, in the process, compounded the stock of his cable company TCI by an astounding 30.3% per year for over 25 years.

Written by Mark Robichaux, a Wall Street Journal reporter who covered several of Malone’s deals over the years, the book provides some behind-the-scenes color around his myriad cable and content deals.

But it does not, at least to my satisfaction, explain exactly what Malone did that was so different from other CEOs. He was clearly very smart and well educated – he graduated from Yale University with a B.A. in Electrical Engineering and Economics as a Phi Beta Kappa and National Merit scholar, and obtained an M.S. in Electrical Engineering from an NYU program at Bell Labs as well as a Ph.D. in Operations Research from Johns Hopkins.

And the book describes how Malone learned about cable directly from some of its early pioneers. He was clearly good at financial engineering and pioneered many of the techniques used by private equity firms today (such as the aggressive use of debt leverage). He could do this because he was early in realizing that cable revenues were reliably recurring, like those of a utility (but unregulated!), so they could be used to raise a lot of debt inexpensively. He also systematically maximized the tax benefits of financing his cable assets in this manner. I think he may have been somewhat lucky not to get wiped out at various points in his career while operating with a high level of leverage.

But beyond these operational strengths, I think he was particularly good at multiplying value via his deal-making. He was perpetually buying and selling various cable and content assets, but it’s not obvious how all that wheeling and dealing actually creates value.

An analogy struck me while reading about some of his deals: the concept of trading small advantages in chess.

Let’s say you start with a pawn sacrifice in the opening to get a move advantage. As time passes, unless one plays forcefully, this temporal advantage can quickly dissipate. So good players often convert it into a positional advantage if the opportunity presents itself. A positional advantage is more structural and hence more robust. Later, in the end-game, it can in turn be converted into another kind of advantage – a passed pawn or perhaps a sacrifice to get an attack on the opponent’s king, etc. Thus there is a constant trading of advantages, from transient to permanent ones, depending upon the board situation. And a skilled player can usually translate this kind of trading of small advantages into a win.

I get the sense that Malone was very good at doing something equivalent in his business deals.

Using his deep knowledge of the cable industry, he could sense when a cable asset became available at an attractive price. He had a good sense of the intrinsic value of the cash-flows of cable assets. He would opportunistically buy such a mispriced asset even if it was not what he ultimately wanted (e.g. not in a region where he was building a roll-up of cable assets). Just as in chess, you collect a small advantage when you can, even when it’s not a mating attack on the opposing king. You do that to get something to trade with.

Then he would patiently wait, sometimes for years, before an opportunity came to sell this asset, which would usually have appreciated by then (since he bought it when it was distressed). He would use the cash from this sale to then turn around and buy an asset that he really wanted all along. Or buy back shares in TCI if they were undervalued.

Thus he avoided overpaying for premium assets – the downfall of many of his competitors who were pursuing so-called “strategic M&A” on the advice of their investment bankers.

An example of his patience was evident in how he waited to buy content assets (programming) until he had enough scale from rolling up a bunch of regional cable providers. Once he had enough scale in cable distribution, he was in a strong negotiating position to acquire content assets on favorable terms.

And, since he could distribute the content to more subscribers than his competition, he was able to net more cash flow from his content assets. He would then leverage this additional cash-flow by raising more debt and buying more cable subscribers. And so on. This kind of virtuous cycle (more subscribers -> more content -> more subscribers) with increasing returns to scale can indeed explain compounded returns of 30% per year for more than two decades.

He was able to get to this point by systematically trading one advantageous deal into another, like a master chess player, thus multiplying overall value (in other words, by multiplying what economists call gains from trade).

The book triumphantly ends by describing how he crowned his career by finally selling TCI to AT&T, once again opportunistically, when he judged that they were paying an attractive price for it.

And then he is supposed to have retired. 

Except he did not!

I think Malone is still playing this game, only this time in Europe, even in his seventies!

A company he chairs and controls, Liberty Global, is well along the way to owning a cable franchise that dominates Europe. There are significant economies of scale in doing so in such a dense and contiguous geographical area – just think of a cable truck being able to efficiently serve neighboring regions vs. one that services installations scattered all over the map.

And yet Liberty Global also owns some assets in Chile, completely disconnected from Europe! This fact puzzled me when I was initially analyzing the company, until the chess analogy came to mind. I now suspect Malone bought the Chilean cable assets opportunistically, when they were available for cheap, knowing full well that he would trade them later for what he really wanted – assets in the dense areas of Europe.

Indeed, the Liberty CEO has recently been talking about selling the Chilean cable assets and is in the process of buying more cable in the Netherlands that is contiguous with their other European cable units.

In another repeat of the TCI playbook, Liberty Global is only now going about acquiring content in Europe. They have begun by making some small investments recently (e.g. a small position in ITV), but clearly waited until they had rolled up enough distribution muscle before they did so. At this point they are already the largest cable company (by number of subscribers) in Europe and thus clearly can get very attractive terms from any content producer there. And just like TCI, they can then monetize the acquired content better than others since they have the largest number of subscribers.

As Yogi Berra said, it’s déjà vu all over again.


Disclosure: I am long Liberty Global (LBTYA) in various personal and professional portfolios.





The Disruption Controversy

There has been a lot of hubbub on social media around a recent New Yorker article by Harvard historian Jill Lepore sharply disputing the famous “disruptive innovation” model of Clayton Christensen, who teaches at the famous business school of the same university. Christensen has responded, with apparent disappointment and anger, in an interview he recently gave to BusinessWeek. Many Silicon Valley VCs seem to have come out in support of Christensen’s theory on Twitter, claiming that they practically live and breathe it in their daily hunt for disruptive startups.

A second, and ostensibly unrelated, meme has also recently gone viral on Twitter. I think this was due to über-VC Marc Andreessen, who posted one of his famous tweetstorms on an offhand remark by his buddy, the former CEO of Netscape, Jim Barksdale:

… there’s only two ways I know of to make money: bundling and unbundling.

I think these two powerful ideas are actually related, and underlying both is that ultimate force of technological change – Moore’s law. I will use this (rather long and meandering) post to try and clarify my own thoughts on this topic, since it’s of high interest to me as an investor in technology-related moated companies.

Christensen’s theory of Disruptive Innovation

A provocative analysis of this idea can be found in the work of blogger Ben Thompson. He has illustrated the key idea in one of his distinctive sketches. The path of disruption looks something like the orange line in this:

Adapted from Figure 5-1 in the Innovatorʼs Solution, Christensen, Raynor

The key thing to notice is that products improve more rapidly than consumer needs expand. This means that while the incumbent product may have once been subpar, over time it becomes “too good” for most customers, offering features they don’t need yet charging for them anyways. Meanwhile, the new entrant has an inferior product, but at a much lower price, and as its product improves – again, more rapidly than consumer needs – it begins to peel away customers from the incumbent by virtue of its lower price. Eventually it becomes good enough for nearly all of the consumers, leaving the incumbent high and dry.

In an interesting post on this topic that predated the firestorm caused by Lepore, Thompson suggested that there are actually two types of disruption theories articulated by Christensen over time; he argued that one of them is flawed (emphasis in the following excerpt is mine):

The original theory of disruption, now known as new market disruption, was detailed in Christensen’s seminal paper Disruptive Technologies: Catching the Wave and expanded on in the classic book The Innovator’s Dilemma. Based primarily on a detailed study of the disk drive industry, the theory of new market disruption describes how incumbent companies ignore new technologies that don’t serve the needs of their customers or fit within their existing business models. However, as the new technology, which excels on completely different attributes than the incumbent’s product, continues to mature, it eventually takes over the market.

This remains an incredibly elegant and powerful theory, and I fully subscribe to it. We are, in fact, seeing it in action with Windows – the incumbent – and the iPad and other tablets; new technology that is inferior on attributes that matter to Windows’ best customers, but superior on other attributes that matter to many others.


It is Christensen’s second theory of disruption – low-end disruption – that I believe is flawed … Briefly, an integrated approach wins at the beginning of a new market, because it produces a superior product that customers are willing to pay for. However, as a product category matures, even modular products become “good enough” – customers may know that the integrated product has superior features or specs, but they aren’t willing to pay more, and thus the low-priced providers, who build a product from parts with prices ground down by competition, come to own the market.

Thompson then goes on to argue that Christensen has been badly and repeatedly wrong in his prediction of the demise of Apple products based on this second, low-end disruption, version of his theory. He correctly points out that, in fact, Apple’s hit iPhone at its initial launch was clearly much more expensive than many of the products it has successfully “disrupted”: iPods, dumb cellphones, PDAs, GPS navigators, etc. And the iPhone launched with a rich superset of the features of the disrupted devices rather than a cheaper subset. This was not supposed to happen according to the disruption from the “low-end” theory.

But in a recent post on the success of Chromebooks, Thompson himself seems to be using this low-end version of the disruption hypothesis, contradicting his earlier assertion! Apparently, even the low-end disruption idea does work at times.

So the questions remain: Exactly when does which theory predict successfully? Under what circumstances?

Barksdale-Andreessen theory of Bundling and Unbundling

As HBR’s blog explains, Barksdale, a veteran of IBM, FedEx, and McCaw Cellular, “was brought on a few months after Netscape’s founding to provide adult supervision as its CEO”. He made his bundling and unbundling comment at the end of an investor roadshow in answer to Microsoft’s decision to bundle the Internet Explorer with Windows. In the HBR interview he elaborates:

I had worked for several businesses during my career by that time that had become conglomerates, some fairly large, and then had divested themselves of various businesses. I’m on the board of Time Warner, we have just parsed off our third major part — our original company, Time Inc., which is the publishing arm of Time Warner. We [already had] divested ourselves of Time Warner Cable as well as AOL. So, it’s not uncommon to add a bunch of companies together, much less software products, and then divest yourself of them as the shareholders think they have more value standing alone than standing together. You do it to get your stock price up.

… It’s easier to do in the digital age. It’s easier to bundle and unbundle digital products …

In his tweetstorm, Andreessen gives an example of how this process works:

1/A story of unbundling in the tech industry: 20 years of consumer Internet evolution —

2/One upon a time there was AOL, which was a completely integrated Internet access/information/communication service.

3/Then Yahoo came along and unbundled the information/communication parts like email/IM/sports-scores/stock-quotes from the access service.

4/One of the things you could do on Yahoo was search, then Google came along and unbundled that.

5/You can search for anything on Google, including people; Facebook came along with a much better way to just search for people.

6/Three things you can do on Facebook are messaging, photo sharing, and status updates; therefore Whatsapp, Instagram, and Twitter.

7/And yes, Yo unbundles the creation & existence of a message from the contents of a message, unbundling Whatsapp and Twitter :-).

8/Ev Williams () is the modern genius of this concept–playing out in our industry continuously since the 1950’s.

9/The part people often miss is that you can get extremely powerful second/third order effects at each step with his pattern.

10/The entrepreneurs generally have a pretty good sense of this when they’re doing it, but it doesn’t become clear to others until later.

11/This is a pattern what we love to fund: unbundle X from Y, but then use the liberation of X as leverage to do amazing new things with X.

12/And the howls of press and analyst outrage at the apparent stupidity of each unbundling are very helpful for keeping valuations down :-).

1/The flip side of unbundling: Later on, the unbundlers tend to try to rebundle in the image of whatever they unbundled.

2/So Yahoo adds an ISP (), and Google adds email/IM/sports-scores/stock-quotes.

3/Twitter changes its user profile page to look more like Facebook :-).

4/Sun unbundled DEC with commodity components, then re-bundled into a proprietary computing stack just like DEC w/Solaris, Sparc, etc.

5/Microsoft likewise unbundled DEC minicomputers w/PC OS + tools, then rebundled into DEC-like integrated stack now including hardware (!).

6/Paraphrasing Harvey Dent: “You either die a hero or you live long enough to see yourself become the company you first competed with.”

7/And then sometimes the rebundlers realize what they’re doing and try to reverse course. E.g. Microsoft building apps for iOS & Android.

8/And thus the cycle of life repeats with yet more unbundling :-).

The key driver underneath all this is technology change – the bundles emerge as a consequence:

And so the newspaper bundle, the idea of this slug of news and sports scores and classifieds and stock quotes that arrives once a day was a consequence of the printing plant. Of the metro area printing plant, of the distribution network for newspapers using trucks and newsstands and newspaper vending machines and the famous newspaper delivery boy. That newspaper bundle was based on the distribution technology of a time and place.

When the distribution technology changed with the internet, there was going to be the great unwind, and then the great rebundle, in the form of Google and Facebook and Twitter and all these new bundles.

I think music is a great example of that. It made sense in the LP and CD era to [bundle] eight or 10 or 12 or 15 songs on a disc and press the disc and ship it out and have it sit in storage until somebody came along and bought it.

But, when you have the ability online to download or stream individual tracks, then all of a sudden that bundle just doesn’t make sense. So it [unbundled] into individual MP3s.

And I think now it makes sense that it’s kind of re-bundling into streaming services like Pandora and Spotify.

And the bundling or unbundling of the product actually directly affects the bundling or unbundling of the business:

So one of the other things you see happening in music now is actually the music industry getting reconfigured and being split out. There are now companies that are entirely online record labels that have started from scratch. Or there are companies that are entirely focused on merchandise sales. There are companies entirely focused on touring. And the old record labels that are still bundled businesses corresponding to a bundled product offering are struggling to adapt to this new world with lots of new competitors that are effectively unbundled.

Andreessen goes on to explain how this pattern helps him identify a promising startup:

Often, a key characteristic of large incumbents in any industry is, they have a bundle that is accumulated over time, for the reasons that Jim described [“because it’s an effective growth strategy. Once you try to grow the business, it’s an easier out to stay focused on your core and then add things to it.  And you become a big bundle again”].

And then what we look for is for something to have changed in the underlying technology. The arrival of the Internet was a big one. The arrival of mobile distribution. The arrival of social networks. The arrival of Bitcoin is a current example.

So, we look for something to change in the underlying technology, and then basically say, “Well, you know, gee, if you were to sit down today with a clean sheet of paper, and you knew that the technology was changing, then what would be the proper form of the product, if you were starting from scratch?

That’s the question that’s always the hardest for an incumbent to ask, because that’s the classic innovators dilemma. And that’s the question that’s the easiest for the startup to ask, because the startup literally is somebody sitting down with a clean sheet of paper. All they have is the ability to think from first principles, think from scratch.

I would say we look actively for the pattern of large incumbent, established industry, bundled product or service offering, coupled with underlying technology change, coupled with idea for unbundled product that the customer might prefer, and then of course coupled with an entrepreneur who can actually build a business around that. I think that’s a fairly common pattern.

Andy Grove’s theory of vertical-to-horizontal transition

But even before Christensen and Netscape, way back in the early 1980s, Intel’s legendary former CEO Andy Grove had already commented upon the remarkable transition from vertical silos to horizontal modules that was then completely disrupting the landscape of the computer industry.

This change is obviously related to both the bundling/unbundling theory and the disruption theory. But I think the key underlying driver in Grove’s model is the all-powerful Moore’s law (the observation that silicon chips roughly doubled in performance at the same price point every couple of years).

My Take: Both disruption and bundling/unbundling arise from shifts in the point of scarcity

I think both these phenomena are really just long-term consequences of Moore’s law. Let me explain:

  • Every few years, like clockwork, the incessant, exponential rise in price/performance of silicon hardware alters what is scarce and what is abundant in the value chain of any product that depends upon silicon.
  • Economic value shifts to whatever is scarce. The newly abundant becomes a commodity and loses its value.
  • It is precisely this constant shift in the point of scarcity over time that ultimately drives all the disruptive bundling, unbundling and rebundling that goes on in so many industries.

A particularly powerful consequence of Moore’s law is that software can do more and more things every few years. In other words, by exploiting cheaper and more powerful hardware, software “eats the world,” in Andreessen’s evocative phrase, encroaching on a wider and wider circle of companies and industries. As pointed out by Grove, this can change vertical value chains into horizontal ones. And then new value chains form around the winners.

Here is how I think the process works:

  • Every few years, this scarcity shift causes a specific “module” of a value chain bundle to become vastly more valuable than the surrounding pieces.
  • This creates an opportunity for value creation by unbundling. As Andreessen has observed, this can be exploited by a competitor (or entrepreneur) starting from scratch and offering just that valuable point of scarcity as an attractive product offering.
  • This unbundling disrupts the existing value chain incumbents. Some nimble companies may be able to co-opt this process and quickly adapt; those who cannot, decline. The devil is in the details, as Lepore has pointed out. Of course, predicting the winners and losers in this transition cannot be easy; otherwise everyone would be a great investor! It is probably helpful to conduct the mental exercise suggested by Andreessen: what would be the proper form of the product, if you were starting from scratch today? This is useful since some things can be done very differently – perhaps more simply (hence cheaply) – today due to the ongoing scarcity shift. For example, high-speed internet and smartphones make digital distribution nearly a free commodity today. Many businesses that have spent a lot of money on distribution infrastructure will find that many parts of their overall value proposition have become a commodity. Imagining building an existing product or service all over from scratch can cause this fact to pop out in our minds, leading to better predictions.
  • Once the unbundled product has succeeded in capturing its market, it’s fairly easy to rebundle more and more features around it to increase its value (as Barksdale has observed).
  • This goes on until Moore’s law changes the game – the point of scarcity – again, causing the disruptive cycle to repeat!

I find this a clearer and deeper explanation of the observed disruptions as compared to the simplistic model graphed at the beginning of this article. And I will venture to guess that if we remove all examples that can be better explained by such a (Moore’s law driven) shift in scarcity from the various Christensen case studies, not much will be left that escapes Lepore’s valid criticism. However, I have not actually done this exercise so I cannot be sure about this. Perhaps there will remain some types of disruption that are better explained by his model – it will be useful to know.

A conversation with my friend Alex led to a provocative question: surely there are other drivers of change than just Moore’s law, so why focus so much on that one force?

And, to be sure, there are many varieties of improvements that companies go through in improving their price/performance metrics over time. These could include various kinds of process learning (at an individual as well as team level) as well as efficiencies that come from pumping a larger volume through manufacturing, for example.

It seems to me, though, that Moore’s law is perhaps unique in having compounded at such a high rate for many decades – most processes have not done that. Compounding over that long leads to improvements by factors of millions or billions, and most processes just cannot improve that much. So it’s particularly important to be on the right side of Moore’s law; otherwise one risks being disrupted due to its unique ability to shift the point of scarcity every few years, like clockwork.

Having said that, I think the point about the shift in the point of scarcity remains valid and useful even if there are other drivers of rapid change – Moore’s law is just one of many potential drivers of such change.
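
To put rough numbers on "factors of millions or billions", here is a minimal Python sketch. It assumes a clean doubling every two years (the "every couple of years" figure quoted earlier) and, purely for contrast, a hypothetical ordinary process that improves 5% a year; the function names and the 5% figure are illustrative assumptions, not measurements.

```python
def doubling_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Cumulative improvement if performance doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


def steady_factor(years: int, annual_improvement: float) -> float:
    """Cumulative improvement for a process that gets better by a fixed rate each year."""
    return (1.0 + annual_improvement) ** years


for years in (10, 20, 40, 60):
    moore = doubling_factor(years)
    ordinary = steady_factor(years, 0.05)  # hypothetical 5%-a-year process
    print(f"{years:2d} years: doubling every 2 years -> {moore:,.0f}x, "
          f"5% a year -> {ordinary:,.1f}x")
```

Under these assumptions, forty years of doubling gives a factor of about a million and sixty years about a billion, while the 5%-a-year process improves only by single digits over forty years.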

Why is this “mate in 3” so hard?

I like chess puzzles and if you are like me you know that “mate in 3” can have only a limited number of solutions and usually can be solved within, say, 10 to 15 minutes (master level players will of course be much faster). However, the following puzzle turned out to be much more tricky, at least to me (I am only a club level player). Before going on, give it a try:

Black to move and mate in 3 moves:


If you can’t wait for the answer, you can see it fully described here by Joe Weisenthal, the prolific economics editor of Business Insider. As he says:

“the eminent chess player and commenter Susan Polgar posted on her blog the following:

Black to move and checkmate in 3. Please no computer analysis. This is a very cool checkmate. Try to find it for yourself.

Now chess problems where you’re asked to just mate in 3 moves aren’t typically all that hard, so my curiosity was piqued by the fact that Polgar said not to use a computer. Obviously it couldn’t be that simple if you’re tempted to use a computer to solve a 3 move chess problem.

And it wasn’t! In fact I spent the evening looking at it yesterday without getting it.”

But now I am intrigued – exactly why is this elegant little puzzle so surprisingly hard?

And that I find to be an even more interesting puzzle, one of human psychology. I think a hint can be obtained by trying to solve an entirely different, almost trivial, problem in arithmetic. Try this:

A bat and a ball cost $1.10. The bat costs $1.00 more than the ball.

How much does the ball cost?

If you answered, like most people, 10 cents then you are wrong!

The correct answer, which will be immediately obvious upon reflection, is 5 cents (since $1.05-$0.05 = $1.00).
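
For readers who like to check such things mechanically, here is a minimal Python sketch of the algebra. It works in whole cents to avoid floating-point noise; the function name is just illustrative.

```python
def ball_price_cents(total_cents: int, difference_cents: int) -> int:
    """Solve bat + ball = total and bat = ball + difference for the ball.

    Substituting the second equation into the first gives
    2 * ball + difference = total, so ball = (total - difference) / 2.
    """
    return (total_cents - difference_cents) // 2


ball = ball_price_cents(110, 100)
bat = ball + 100
print(ball, bat, ball + bat)  # 5 105 110

# The intuitive answer of 10 cents fails the same check:
intuitive_ball = 10
intuitive_bat = intuitive_ball + 100
print(intuitive_ball + intuitive_bat)  # 120, not 110
```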

This problem comes from the research of Shane Frederick, a collaborator of Daniel Kahneman, one of the world’s top cognitive psychologists. Frederick’s paper, “Cognitive Reflection and Decision Making” (Journal of Economic Perspectives, Volume 19, Number 4, (2005) pp 25–42) describes what might be happening in such problems.

Here are two more arithmetic problems from the paper:

If it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines to make 100 widgets?

In a lake, there is a patch of lily pads. Every day, the patch doubles in size. If it takes 48 days for the patch to cover the entire lake, how long would it take for the patch to cover half of the lake?

If you said 100 minutes for the widgets or 24 days for the lake, then once again you got the wrong answer. But you are not alone – a surprising majority of students from elite universities, including maths and physics types, get these elementary problems wrong (data in the paper cited above and various follow-up papers).
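
The correct answers are 5 minutes and 47 days. A minimal Python sketch that derives both from the problem statements is below; the function names and defaults are illustrative, not taken from the paper.

```python
from fractions import Fraction


def minutes_needed(machines: int, widgets: int,
                   ref_machines: int = 5, ref_minutes: int = 5,
                   ref_widgets: int = 5) -> Fraction:
    """Time for `machines` to make `widgets`, given the reference production run."""
    # Widgets produced per machine per minute (1/5 in the puzzle).
    rate_per_machine = Fraction(ref_widgets, ref_machines * ref_minutes)
    return Fraction(widgets) / (rate_per_machine * machines)


def day_half_covered(day_fully_covered: int) -> int:
    """Walk backwards from full coverage, halving the patch each day."""
    coverage = Fraction(1)
    day = day_fully_covered
    while coverage > Fraction(1, 2):
        coverage /= 2
        day -= 1
    return day


print(minutes_needed(100, 100))  # 5
print(day_half_covered(48))      # 47
```

In both cases the intuitive answer scales the wrong quantity: more machines add output rather than time, and a patch that doubles daily is half-size just one day before it is full.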

So what is going on?

Our brains can be thought of as consisting of two separate but interacting systems. As Kahneman explains in his brilliant Nobel lecture:

“The operations of System 1 are fast, automatic, effortless, associative, and difficult to control or modify.

The operations of System 2 are slower, serial, effortful, and deliberately controlled; they are also relatively flexible and potentially rule-governed.”

System 1 can be thought of as the intuitive system and System 2 as the reflective system – what we normally call “thinking”. Obviously, neurons are firing in both cases, but System 1 feels so effortless that most people don’t realize the massive extent of neural processing involved in, say, seeing that we are looking at a chair, since such acts of perception are accomplished by the fast System 1.

It is likely that System 1 has “hardwired” the critical processing that our ancestors needed frequently, like perceiving objects and making very quick (“intuitive”) judgments. It is basically a pattern recognizer. But it can also learn new things after sufficient repetitions – it is where our habits reside.

System 2 is more flexible and more like an algorithmic computer in its style, albeit a very slow computer. Overall executive control is also a part of System 2.

As the Frederick paper explains, the three arithmetic problems are

“easy” in the sense that their solution is easily understood when explained, yet reaching the correct answer often requires the suppression of an erroneous answer that springs “impulsively” to mind.

And that is exactly what I guess is going on with the chess puzzle above.

Every chess puzzle lover “knows” that a discovered check pattern is often the heart of many a pretty mating sequence. And sure enough, there is a very seductive discovered (and, indeed, double!) check available on the second move after the obvious (and correct) rook check on the first move.

So our intuitive pattern detector jumps to the conclusion that this discovered check just has to be part of any solution. It leads us down the proverbial garden path and we tend to waste a lot of time on this dead end.

The solution to the chess puzzle finally emerges only after we have somehow managed to suppress this discovered check. After that happened to me (hours later, in my case), I finally focused on the fact that the white king is completely locked in after the rook check on the first move. And after this “aha” moment, finding the two moves by the black bishop that deliver the coup de grâce was not too hard.

Of course a computer would solve this problem in milliseconds, since the search tree is so small. And the computer, in a sense, is all logical System 2 with no intuitive System 1 to mislead it!

Unlike the arithmetic problem, there has been no research on this chess puzzle to my knowledge, so this explanation of why this simple enough mate is so hard for humans is just my best guess at this point.

Just in case this discussion makes System 1 appear dumb, it is worth keeping in mind that artificial intelligence programs still cannot come anywhere close to being able to perceive patterns that are trivial for humans. System 1 is also the likely source of creative insights, and the two systems together are responsible for all the glories of human achievement.

Incidentally, if you enjoy this sort of stuff, Kahneman’s Thinking, Fast and Slow is a true masterpiece and easily one of the best books I have read in the last few years.

Why were the Outsider CEOs such great compounders?


Henry Singleton, the co-founder and former CEO of Teledyne, has “the best operating and capital deployment record in American business,” according to Warren Buffett. And yet he did not have an MBA or indeed any educational background in business (he got his Bachelor’s, Master’s and Ph.D. degrees in Electrical Engineering from MIT).

But he does fit an intriguing pattern.

Singleton is featured, along with seven other CEOs, in a book that was the subject of a series of posts last year by one of my favorite bloggers, The Brooklyn Investor: The Outsiders by William Thorndike. The book profiles these eight CEOs, who have delivered some of the highest long-term compounded returns to their shareholders during the last few decades.

So what? Why does it matter if some CEO returns a few percent more in annualized returns?

Imagine you have invested a million dollars each in two different stocks: company A compounds your money at an average of 8% per year (the likely long term market average) whereas company C compounds your money at, say, 15% per year. After 10 years stock A would have slightly more than doubled your one million (to $2.2 million). Not too bad. But your investment in stock C would have quadrupled to $4 million during the same time. Now imagine holding both companies for 10 more years. At that point, your initial $1 million in company A would be worth $4.7 million whereas the same investment in company C would have compounded to $16 million, even though it compounds only a few percent more every year!

Even better, if that initial million was invested in one of the eight “outsider” companies, it would have compounded at roughly 20% per year so that twenty years later it would be worth an astonishing 38 million dollars. The gap between these companies widens with every passing year – to paraphrase Buffett, time is a friend of a good compounder.
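The arithmetic above is easy to check. Here is a minimal sketch, assuming simple once-a-year compounding at a constant rate; the function name future_value is my own, and the 8%, 15% and 20% rates are the illustrative ones used in the text:

```python
def future_value(principal, annual_rate, years):
    """Value of `principal` after compounding once a year at `annual_rate`."""
    return principal * (1 + annual_rate) ** years


# Illustrative rates from the text: market-like 8%, a good compounder at 15%,
# and an "Outsider"-style 20%. Constant rates are a simplifying assumption.
for label, rate in [("Company A", 0.08), ("Company C", 0.15), ("Outsider", 0.20)]:
    after_10 = future_value(1_000_000, rate, 10)
    after_20 = future_value(1_000_000, rate, 20)
    print(f"{label:10s} 10 yrs: ${after_10 / 1e6:5.1f}M   20 yrs: ${after_20 / 1e6:5.1f}M")
```

The gap comes entirely from the exponent: the difference in the annual rate gets raised to the power of the holding period.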

It seems well worth the effort to “crack the code” for what makes a company a great compounder – one that returns, say, 15% or better for a decade or two – versus an average company that returns 8% or so.

The Outsiders book contains some intriguing clues on this topic. It is the result of a research project that screened the records of thousands of CEOs. After conducting well over a thousand in-person interviews, it details eight “best of the best” – outlier CEOs who managed to compound their company’s stocks at truly astonishing rates (I have lightly edited The Brooklyn Investor’s handy list):

  • Tom Murphy (Capital Cities): 19.9% per year over 29 years (vs. 10.1% for S&P 500)
  • Henry Singleton (Teledyne): 20.3% per year over 27 years (vs. 8.0% for S&P 500)
  • Bill Anders (General Dynamics): 23.3% per year over 17 years (vs. 8.9% for S&P 500)
  • John Malone (TCI Cable): 30.3% per year over 25 years (vs. 14.3% for S&P 500)
  • Katharine Graham (Washington Post): 22.3% per year over 22 years (vs. 7.4% for S&P 500)
  • Bill Stiritz (Ralston Purina): 20.0% per year over 19 years (vs. 14.7% for S&P 500)
  • Dick Smith (General Cinema): 16.1% per year over 43 years (vs. 9.0% for S&P 500)
  • Warren Buffett (Berkshire Hathaway): 20.7% per year over 46 years (vs. 9.3% for S&P 500).
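To get a feel for what these rates mean in dollars, the list can be turned into growth multiples. This is a rough sketch, assuming each quoted rate held as a constant annual rate over the full period; the data are copied from the list above, and the helper name growth_multiple is my own:

```python
# (CEO, CEO's annual return, S&P 500 annual return, years), from the list above.
records = [
    ("Tom Murphy",       0.199, 0.101, 29),
    ("Henry Singleton",  0.203, 0.080, 27),
    ("Bill Anders",      0.233, 0.089, 17),
    ("John Malone",      0.303, 0.143, 25),
    ("Katharine Graham", 0.223, 0.074, 22),
    ("Bill Stiritz",     0.200, 0.147, 19),
    ("Dick Smith",       0.161, 0.090, 43),
    ("Warren Buffett",   0.207, 0.093, 46),
]


def growth_multiple(annual_rate, years):
    """How many times one dollar grows at a constant annual rate."""
    return (1 + annual_rate) ** years


for name, ceo_rate, index_rate, years in records:
    ceo = growth_multiple(ceo_rate, years)
    index = growth_multiple(index_rate, years)
    print(f"{name:17s} {ceo:9.0f}x  vs. S&P 500 {index:6.0f}x  over {years} years")
```

Because the published rates are rounded, the multiples this prints are approximations rather than exact historical figures.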

Not surprisingly, Buffett is present in the list, but many of the other seven are not familiar names, even though they all have delivered similarly fantastic returns.

Only one of these other seven (besides Buffett) had the expected MBA whereas four had highly technical backgrounds (besides the electrical engineer Singleton, a second had an advanced degree in Nuclear Engineering, a third is an electrical engineer with a Ph.D. in Operations Research from Johns Hopkins, and a fourth graduated as an engineer from Harvard). None of these four had any formal business education, and they had to learn the CEO’s job from a clean slate. Most of them are described as being introverted and “lacking charisma,” and they preferred to avoid the media limelight, although this might well be completely irrelevant to their performance.

While the book does a good job of describing these eight CEOs, I am not entirely convinced by its attempt to explain why they succeeded as they did. The following are some of the common patterns across these CEOs.

In view of their technical backgrounds, a surprising pattern across the Outsider CEOs is that they all chose to view their job primarily as that of a portfolio manager of their various operating assets, focusing on capital allocation as opposed to running the operations of their company.

Of course, every CEO has to fulfill both essential functions – operating the company as well as allocating the capital generated from operations. It seems most CEOs are neither too interested in nor very good at the capital allocation process. Yet, since excess cash-flow has to be dealt with, one way or another, the majority of a company’s productive resources will soon enough be the result of a CEO’s capital allocation decisions. Default decisions will soon lead to mediocre cash flows in the future, and the compounding rate will rapidly reduce. Perhaps this is why the best compounders are those CEOs who excel at this vital but oft-neglected function.

Most Outsider CEOs had a trusted “right hand” person as their operations chief (often as COO or President) and completely delegated the operational aspects of their job.

This approach allowed the CEOs the time and energy to focus on what they did best, which was capital allocation. Many also structured their company’s operations into independent, self-contained operating units, with only a bare-bones staff at the corporate level. They all preferred to push operating decisions down to the lowest, most local levels in their organizations. Perhaps this unusual pattern of decentralized operations and centralized capital allocation allowed these CEOs to more dispassionately view their operating assets as portfolio constituents, and spin off or sell the operating units as and when the opportunity presented itself, thus concentrating capital around their most efficient, highest-returning units.

All Outsider CEOs have been unusually proactive in shrinking their company’s assets – people as well as capital – and returned excess capital to their shareholders rather than invest it in mediocre units.

It is quite rare for a CEO to sell a division or business without pressure; most are “empire builders” who would rather grow their employee base and revenues. However, since the various operating units of any company usually have varying returns, the overall return can easily get dragged down by its inefficient units as the company gets bigger. Thus it makes sense that only the radically rational CEOs who (1) carefully focused on identifying their best operating units, (2) pared down to concentrate capital on these outperformers, and (3) returned the excess capital, were able to compile the best returns over time.

The Outsiders seem to have the temperament to embrace rather than avoid volatility.

Their cash-flows were often “lumpy” over the short term as a result of taking infrequent but bold capital actions. They tended to focus on the long term growth of cash flow per share rather than managing short-term quarterly earnings expectations to please the street analysts.

They were willing to buy their stock back aggressively (as much as 90% in Singleton’s case!) when they thought it to be significantly undervalued.

Such buybacks can create tremendous returns but only when the shares are indeed undervalued. Notice that these “Outsider” CEOs are “insiders” as far as their own company’s prospects are concerned! Since Mr. Market is prone to irrational fits of optimism and pessimism, a market-savvy CEO can create a lot of value by suitably buying his own company’s shares back when they are under-priced, and using the company’s over-priced shares to buy other businesses. The usual oscillation of stock prices around earnings provides any CEO the opportunity to increase their compounding rate much faster than their underlying operational growth. However, Outsiders may well be some of the very few that possess the capital allocation skills and temperament required to actually benefit from this price pendulum.
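A toy example shows why the price paid matters so much. This is only a sketch under simplifying assumptions: the company’s intrinsic value is known, the cash spent on the buyback is part of that value, and all the numbers are hypothetical:

```python
def value_per_share_after_buyback(intrinsic_value, shares, cash_spent, price):
    """Intrinsic value per remaining share after spending `cash_spent` on buybacks.

    Assumes the cash spent was counted in `intrinsic_value` (hypothetical setup).
    """
    shares_retired = cash_spent / price
    return (intrinsic_value - cash_spent) / (shares - shares_retired)


# Hypothetical company: worth 1,000 with 100 shares, i.e. 10 per share.
for price in (5, 10, 20):
    after = value_per_share_after_buyback(1_000, 100, 200, price)
    print(f"buyback at {price:2d} per share -> {after:.2f} of value per remaining share")
```

Buying at a price equal to intrinsic value leaves value per share unchanged; buying below it shifts value to the remaining shareholders, and buying above it takes value away from them.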

I am less sure of what to make of the readiness of most Outsiders to use high levels of leverage.

They all had unusually strong operating cash-flows enabling them to carry debt. And it allowed them to avoid diluting their outstanding shares when they needed more capital (to buy a company or to invest in growth). Even Buffett, who eschews debt in its explicit form, benefits from the inherent leverage presented by using the negative cost of float enjoyed by his insurance units. Debt may well have boosted the returns of the Outsiders beyond their intrinsic operating rates of return. That they were able to do so for decades suggests that this may not have been as risky as it sounds. However, the possibility remains that they just got lucky and we are looking at the survivors.

Finally, there is the self-reported pattern of Outsider CEOs being opportunistic as opposed to strategic. I don’t know exactly what to make of this characterization. Perhaps it should be framed as a contrast between adaptive and central-planning styles.
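The way debt magnifies returns in both directions can be sketched with the textbook leverage relation. This ignores taxes and assumes a fixed cost of debt; the rates below are hypothetical and not taken from the book:

```python
def return_on_equity(asset_return, cost_of_debt, debt_to_equity):
    """Textbook leverage relation, ignoring taxes:
    ROE = asset return + (asset return - cost of debt) * debt/equity.
    """
    return asset_return + (asset_return - cost_of_debt) * debt_to_equity


# Hypothetical business borrowing at 6% with one dollar of debt per dollar of equity.
for asset_return in (0.12, 0.06, 0.03):
    roe = return_on_equity(asset_return, cost_of_debt=0.06, debt_to_equity=1.0)
    print(f"assets earn {asset_return:.0%} -> equity earns {roe:.0%}")
```

As long as the assets earn more than the debt costs, leverage lifts the return on equity; once they earn less, the same leverage drags it down just as quickly.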

As Jim Barksdale, the former COO of FedEx and CEO of Netscape is fond of saying:

“In a fight between a bear and an alligator, it is the terrain which determines who wins.”

The aphorism neatly captures the Achilles heel of long-term strategic planning: the terrain often changes with time, making any top-down plan likely to stumble. After all, Darwinian evolution is such a successful “designer” of species precisely because it is adaptive, without any grand plan or strategy. The “best-fit” progeny amongst the many produced by an individual organism simply gets an edge in proliferating its genes. After a series of such opportunistic choices, each successful species occupies a finely-tuned niche in its ecology. Similarly, in the context of an economy, an operating unit of a business may have to make many unexpected adaptations before it finds a successful niche (its “moat”). Such non-linearly contingent outcomes would be virtually impossible to arrive at by the usual corporate strategic planning process. Maybe the Outsider approach of viewing their various operating units as a portfolio manager allows them to more objectively select the better “fits” in a given economic terrain, trading away the others in order to concentrate capital around the more successful units. This opportunistic approach may well be superior in quickly finding and developing moated business niches, leading to a powerful chain of economic compounding.

No wonder Dr. Singleton, who was a winner of the Putnam prize in mathematics and a near-master level chess player and, presumably, as capable of strategizing as anyone on the planet, remarks:

“I like to steer the boat each day rather than plan ahead way into the future … we’re subject to a tremendous number of outside influences and the vast majority of them cannot be predicted … So my idea is to stay flexible.”

He used this “flexibility” to buy or sell more than 130 companies for Teledyne, opportunistically using its high-priced stock. And then when the market for such conglomerates collapsed, he turned around and bought back his own now under-priced stock massively – he bought back 90% of it! These shrewd capital allocation moves allowed him to compound his company’s stock at 20.3% per year for 27 years so that each dollar invested in Teledyne turned into 160 dollars in 27 years!

Singleton basically pioneered these types of capital decisions – they were highly unusual at that time, and still are for the most part. Of course he had an unusual board to back him up. Board member Claude Shannon of MIT is one of the two “gods” of the digital age (the other being Alan Turing): he was the first to actually quantify the very idea of information (the “bit” as the unit of information dates to his work). Another legend on Teledyne’s board was Arthur Rock, the VC behind Intel and Apple! Imagine those board discussions…

To be sure, the book does not cover ineffable visionaries like Steve Jobs; perhaps there is no template that fits such one-of-a-kind geniuses; or perhaps his record does not hold up quite so well after mixing in his initial tenure at Apple which was decidedly mixed (compounded averages are diminished rather easily by even a few bad years).
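The point about bad years is worth a quick check. This is a small sketch with made-up return sequences (they are not anyone’s actual record); the helper name compound_annual_rate is my own:

```python
def compound_annual_rate(yearly_returns):
    """Constant annual rate that gives the same end value as the given sequence."""
    growth = 1.0
    for r in yearly_returns:
        growth *= 1 + r
    return growth ** (1 / len(yearly_returns)) - 1


# Hypothetical ten-year records: steady 20%, versus eight good years and two bad ones.
steady = [0.20] * 10
mixed = [0.20] * 8 + [-0.30] * 2

for label, returns in [("steady", steady), ("mixed", mixed)]:
    simple_average = sum(returns) / len(returns)
    print(f"{label:6s} simple average {simple_average:.1%}, "
          f"compounded {compound_annual_rate(returns):.1%}")
```

The compounded rate of the mixed record comes out below its simple average, because each loss shrinks the base on which all later gains are earned.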

Of course we may be looking at the survivor and selection biases that often lurk in such history-based records. The high levels of leverage are a clear red flag that luck may well have played a huge role in these outliers: the risk they took did not materialize even though it could have, so the survivorship bias really does matter.

It also bothers me that Buffett does not quite fit the “Outsider” mold in some ways. For example, he has almost never “shrunk” by divesting operations (with the notable exception of the original mill), and never really bought back Berkshire’s stock (although his publicly stated willingness to buy shares back below 1.2x book value may have done the job for him).

Nevertheless, if you enjoy reading about fundamental investing or interesting business models, I recommend this well-written and thought-provoking book.