Why Amazon Has Bought Whole Foods

(In short: It’s their grocery playground.)

Amazon’s decision last week to buy Whole Foods for $13.7 billion has been met with considerable criticism. Big acquisitions are not part of Amazon’s usual playbook. Amazon has generally built its services patiently over many years, and has relied on acquisitions only for specific technologies or, in rare cases, to buy competitors such as Zappos or Diapers.com.

It looks like a horrendous decision

This is all the more puzzling when one looks at Whole Foods in more detail. For one, for all its growth, it has had tremendous difficulties in recent years, struggling to be profitable, and it is now under attack from activist investors over its falling stock. Amazon will therefore have to somehow change Whole Foods to turn it into a success. Plus, Whole Foods is known for having a unique corporate culture in which the welfare and independence of store employees are emphasized. For instance, employees are allowed to vote on benefits every three years. A caricature of Whole Foods’ culture would be that there is no employee union because the working conditions are so good. This stands in contrast with Amazon’s corporate culture, which is reportedly brutal for low-level employees, and insanely competitive for high-level ones.

So, Amazon has bought, at a massive price, a flailing company in a low-margin, competitive market; it seems to be setting itself up for a massive culture clash and the PR nightmare that could result from it; and, on top of that, some are already talking about the need for antitrust action to block a merger that would consolidate Amazon’s position as a retail behemoth. What on Earth is Jeff Bezos thinking?

Now, I am not here to argue that the merger is guaranteed to work (as The New York Times mentioned, one of Amazon’s strengths is its willingness to fail). But there is a pretty strong strategic case for Amazon to buy a company like Whole Foods, and I am going to lay it out here.

The problem with the grocery industry

The grocery sector has long been seen as an area of future growth for Amazon. It is one of the biggest sectors of the retail industry and, notoriously, it has been pretty impervious to e-commerce. So, for the past few years, Amazon has tried to put its own spin on grocery shopping with a flurry of products. Most notable among them is Amazon Fresh, which allows customers, for a monthly fee, to have produce delivered or ready for pickup. There is also Amazon Prime Now, which lets Amazon Prime members have products delivered within two hours, albeit for a hefty price.

The problem with that strategy is that, essentially, it has not worked out. Groceries are a tough nut to crack, Amazon seems to have discovered. They are ordered quite differently from books or household objects. Immediacy is very important, hence the value of combining physical presence with very fast delivery. Also, perishable products cannot be stored and presented in a way that is even remotely similar to the rest of Amazon’s catalog. Therefore, Amazon has tried to innovate in both domains. It has opened physical stores and food trucks in Seattle. These experiments, although headline-grabbing, didn’t seem to be very scalable. If Amazon wanted to have a physical presence as ubiquitous as its website, it would have to acquire a lot of real estate and build stores there. These operations are notoriously lengthy, difficult and expensive to carry out; at the very least, they are much harder than scaling an online presence.

You could buy Whole Foods for its real estate, but that’s not enough

Whole Foods partially solves that problem. Its physical presence across the US is relatively extensive; there are Whole Foods stores in virtually every major US city. Whole Foods’ real estate alone can be valued in the billions of dollars. So, if Amazon wanted to convert all of the Whole Foods stores into Amazon Groceries, it would have a good head start. This, in itself, is not a sufficient reason to buy Whole Foods. After all, if Amazon really wants to be as gigantic as its $500 billion valuation suggests, it can’t content itself with a grocery brand that occupies only a percent of the market, focuses on organic products, and is itself under financial duress.

So it can’t be just about acquiring a grocer; Amazon will have to change Whole Foods somehow. One could imagine, instinctively, that Amazon would radically transform the Whole Foods stores, rebrand them under the Amazon brand, and change the catalog to make it appealing to every American. And, because this is Amazon, make Whole Foods substantially bigger. But that would probably engender a heavy culture clash. This probably explains why, after the acquisition, Amazon declared that Whole Foods stores would continue to operate as before, and that no job cuts were in store.

Whole Foods is a playground for Amazon

And honestly, there is plenty of reason to believe that. Amazon is probably not thinking of Whole Foods as its endgame, but rather as its playground in the grocery space. You may think of playground as a demeaning word, but it really isn’t. A playground is what Amazon needs to be able to succeed in the grocery space.

The flurry of experiments that Amazon conducted in the past few years has failed because of a kind of chicken-and-egg problem. It is hard to prove that a single grocery experiment is viable without economies of scale, but it is hard to scale that experiment without financial viability. That problem surely existed for previous businesses like books, but those were probably less complicated.

So, Amazon needs scale from the get-go to jumpstart their grocery experiments. That is what Whole Foods offers. It is a grocer that is big without being as big as Walmart. Plus, their culture emphasizes the independence of each location, which makes it easier to run local experiments. Therefore, it seems like Whole Foods was bought to be a guinea pig.

This seems all the more logical because it very much fits Amazon’s structure and history. Amazon is structured around small teams that have a lot of independence. This means that a lot of them are running live experiments that are validated or axed based on their results. This is the corporate equivalent of throwing stuff against the wall and seeing if it sticks. More crucially, a lot of these teams operate as APIs: they are supposed to treat both the rest of Amazon and the external world as customers. This explains, for instance, why the Amazon supply chain is available to both Amazon.com and third-party retailers, or why its web services host both Amazon’s websites and other customers.

If one uses that framework to think about the Whole Foods acquisition, it is simple to see how this will unfold. Amazon will run experiments in countless Whole Foods stores (and potentially opposite experiments in two different stores to compare them), and see what works and what doesn’t. In addition, it will probably reconfigure the back end of Whole Foods to make it more efficient and, crucially, more flexible so that it can be modularized. (This has led some writers such as Ben Thompson to think that Amazon could use that supply chain to deliver produce to other grocers or even third-party restaurants.)

Despite this acquisition, there are still hard questions that Amazon will have to answer. Even if Amazon is comfortable with being a modular company, it is doubtful that it would keep Whole Foods as its only customer-facing operation, especially outside of the US, where Whole Foods is unknown. It seems probable that it would create Amazon-branded stores; what they would look like is probably a mystery to the company itself. (But the experience acquired from running Whole Foods, I think, would give it clues about how to design them.) And the question of cultural compatibility, which has sunk so many mergers before, is still crucial to Amazon’s success.

These issues will still be here next year; hell, they may very well still be here in five years. But, at least, Amazon won’t have to worry anymore about the first step of its grocery business, as it will have a base to start from.

French presidential polls seem too consistent to be true

The French presidential election will be held 10 days from now, and it looks like a nail-biter. The first round, which determines which two candidates qualify for the final runoff (1), seems to have become essentially a four-way race. And we are not talking about four distinct but similar politicians. We are talking about a pro-European centrist candidate (Emmanuel Macron), a neo-conservative who, after being charged with embezzlement of public funds, is now France’s most unpopular politician (François Fillon), a brash but eloquent candidate endorsed by the French Communist Party (Jean-Luc Mélenchon), and, of course, Marine Le Pen. According to the latest polls, all of them are, essentially, within 5 points of each other.

The French media is, understandably, covering the election rather nervously. Even though every poll for the past few weeks has shown Macron and Le Pen with a somewhat comfortable lead over Fillon and Mélenchon, every one of the 6 potential match-ups is brought up by pundits. (This has a lot to do with a massive 15-year-old election upset; I’ll come back to that a little later.) The most bonkers part, however, is that this concern is probably underplayed. Just by taking a look at the general shape of election polls, it is pretty clear that something weird is going on, and that we may be underestimating how truly unpredictable this first round is shaping up to be.

The big red flag in these election polls is that they really, really don’t deviate from each other. That is a pretty easy thing to measure. Polls, because they take measurements on a sample, are inherently imprecise; fortunately, that imprecision is simple to estimate. For this post, I am assuming that the sampling error for one poll, for the first round, is around 2.7 points (2). This is roughly the margin of error one would expect when estimating percentages around 20 points with samples of 1000 people. This is what we are talking about here: the four candidates’ numbers are currently in the range of 17 to 25 points, and almost every poll has a 1000-ish sample.
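As a rough check on that order of magnitude, here is a minimal sketch of the textbook sampling-error formula for a simple random sample. It assumes a share of 20% and 1000 respondents, as above; the function names are only illustrative, and since French pollsters use quota sampling, this is an idealized benchmark rather than a description of how they compute their own margins. Under these assumptions the 95% margin comes out at about 2.5 points, slightly narrower than the 2.7 points assumed in this post.

```python
import math

def standard_error(share: float, sample_size: int) -> float:
    """Standard error of an estimated proportion, in percentage points."""
    return 100 * math.sqrt(share * (1 - share) / sample_size)

def margin_of_error(share: float, sample_size: int, z: float = 1.96) -> float:
    """95% margin of error (z = 1.96 standard errors), in percentage points."""
    return z * standard_error(share, sample_size)

# Assumed inputs: a candidate polling at 20% in a 1000-person sample.
se = standard_error(0.20, 1000)
moe = margin_of_error(0.20, 1000)
print(f"standard error: {se:.2f} points")        # about 1.26
print(f"95% margin of error: {moe:.2f} points")  # about 2.48
```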

If the margin of error due to sampling is 2.7 points, the standard error is about half of that, 1.35 points, which means that we should find results within 1.35 points of the average something like 68% of the time (3). This is what would happen if all the surveys were done with random sampling, and no tinkering on the back end. What happens if you line up the polls together and compare them?

The black dots here represent the different polls. The black line is the moving average (computed with a local regression). The blue interval is this average +/- 1.35 points, which represents the sampling error. We should normally expect to see roughly a third of polls outside that interval. That is obviously not the case. Even more striking is the fact that polls get significantly closer 60 days before the election, after February 25th, at the very moment when you can see a lot of movement in the numbers. For the past three months, basically, there has been virtually no outlier poll for any of the four major candidates. This should not happen in an ideal polling environment, and is quite concerning.

To get more dramatic, I used a chi-square test, the same test that is used to determine whether a die is weighted. Here is what it yields:

  • The odds that Macron’s scores were this consistent by coincidence are 0.001%.
  • The odds that Fillon’s scores were this consistent by coincidence are 0.0003%.
  • The odds that Mélenchon’s scores were this consistent by coincidence are 0.0006%.
  • The odds that Le Pen’s scores were this consistent by coincidence are 0.00000000002%.
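For readers who want to see the mechanics, here is one way such a test could be sketched; it is not necessarily the exact computation behind the numbers above. It assumes independent polls with the same standard error (1.35 points), compares them to a single overall average rather than a moving one, and uses made-up poll scores. The function names and the data are hypothetical.

```python
import math

def chi2_cdf(x: float, df: int) -> float:
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Computed with the series expansion of the regularized lower
    incomplete gamma function P(df/2, x/2).
    """
    if x <= 0:
        return 0.0
    a, half_x = df / 2, x / 2
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= half_x / (a + n)
        total += term
        n += 1
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))

def consistency_probability(polls, std_error):
    """Chance that random samples would agree at least this closely."""
    mean = sum(polls) / len(polls)
    # Squared deviations from the average, scaled by the sampling variance.
    statistic = sum((p - mean) ** 2 for p in polls) / std_error ** 2
    # Left tail: how often pure sampling noise gives a spread this small.
    return chi2_cdf(statistic, len(polls) - 1)

# Hypothetical poll scores for one candidate, in points (not real data).
example_polls = [24.0, 24.5, 23.5, 24.0, 24.5, 23.5, 24.0, 24.0]
probability = consistency_probability(example_polls, std_error=1.35)
print(f"{probability:.4%}")
```

The test looks at the left tail of the distribution: a very small probability means the polls are closer to each other than random sampling alone would usually allow.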

There are two potential explanations for this. Firstly, French pollsters almost uniformly use a method called quota sampling. In other words, they balance their samples to achieve certain ratios that would be, in their mind, representative of the electorate. The consistency of the quotas used by pollsters could be the cause of the consistency of the polls; and, of course, this is a controversial polling method. It was discredited in the US in 1948, after it failed to predict Truman’s re-election. And this is probably a very bad election cycle in which to promote quota sampling. The 2017 election cycle has been defined by its volatility, as well as an unusually high number of undecided voters and a big question mark with regard to turnout. (That, in short, never really happened before (4).) In this disjunctive election cycle, it seems a little bit crazy to claim that the quality of polls can rest on a deep understanding of the electorate and its dynamics.

The second explanation, much less kind to pollsters, is what Americans would call herding. In other words, manipulating poll data, or hiding some results, to prevent outliers (5). I don’t know to what extent this is the case, because it would require a little more analysis, but it definitely does seem likely. That polls seem more and more consistent at the end of the campaign, during periods that displayed a lot of poll movement, looks very suspicious to me. I just don’t buy that pollsters have a better grasp of the electorate now than they had four months ago, especially given that the campaign has been essentially upended a few times since then.

I am all the more suspicious of French pollsters because they have actually screwed up big time before. In 2002, all of them showed Prime Minister Jospin and President Chirac as comfortable front-runners in the first round; they were virtually assured of qualifying for the second round. What happened, in fact, is that the National Front’s Jean-Marie Le Pen came in second, ousting Jospin. (And with an almost 1-point difference!) This was, and still is, a national trauma, as the public did not see it coming. And guess what Le Pen’s pre-election poll numbers looked like? (The red dot is his actual, eventual score.)

It seems like history is repeating itself 15 years later. What does that mean for Election Day? Essentially, fasten your seat belts. The uncertainty around this election is much, much higher than polls might lead us to think. A lot of second-round possibilities, even one between the far right and the far left, are to be considered. And, for once, the political TV circus is actually justifiably hysterical.

(1) The French election uses the runoff voting system. A first-round candidate needs more than 50% of the votes to win outright, or else the top two candidates face off in a second round to reach that majority. And for those who think that this system is a French peculiarity, you might want to think again: it’s actually the voting system used in next Tuesday’s special election in Georgia.

(2) This is also somewhat less than the average error that French polls have historically displayed 30 days before the election. If I were to use that metric, though, it would only strengthen my claims that French polls may be pretty low-quality, and too close to each other.

(3) The standard error is basically half the margin of error. If a pollster says “20%” to you, it really means that the pollster is 95% confident that the true figure falls between 17.3 and 22.7 points (+/- 2.7 points), and 68% confident that it falls between 18.65 and 21.35 points (a range 2.7 points wide).

(4) Turnout for French presidential elections tends to be in the 80s, percentage-wise, which is obviously much higher than for elections in the U.S. This tends to reduce uncertainty with regard to the effect of turnout on elections.

(5) A much more thorough explanation of herding is given in this article, which served as a partial inspiration for this post.

Will People Die Because of the Obamacare Repeal?

There has recently been a flurry of declarations by Democratic politicians, scholars, and pundits that the bill replacing Obamacare is literally deadly. This is a pretty potent argument, and an especially shocking one, as it essentially accuses Republicans of willfully killing Americans.

I want to dwell on one of these statements. In January, Sen. Bernie Sanders said on Twitter that the Obamacare replacement would kill 36,000 people. He was promptly rebuked by the Washington Post, which considered his declaration nonsensical, partially because the details of the Obamacare repeal law were not known at the time. Now that the bills are on the table, and given the additional similar declarations by other Democrats, I figured it would be useful to re-check that claim today, and see if it is still worthy of a rebuttal.

In short, the overall direction of that statement may be true, but the number truly is baseless. There is no proper study to back up that precise number. Indeed, no study has provided a precise answer as to how many deaths would result from a repeal law. Unsurprisingly, Democrats have put forward numbers that differ wildly from 36,000.

The numbers put forward by Democrats are generally recycled from two health policy studies:

  • The first one looks at how the mortality rate evolved in states that expanded Medicaid in the early 2000s vis-à-vis the ones that didn’t. It found that the expansion saved one life for every 455 newly-insured people; extrapolating that to 20 million insurance losses, you find 44,000 deaths annually.
  • The second one looks at how the mortality rate evolved in Massachusetts after the passage of ‘Romneycare’, which was an Obamacare-like system in that state. It found that the law saved one life for every 830 newly-insured adults. This translates to 24,000 or 36,000 deaths nationally if you estimate, respectively, that the repeal will make 20 or 30 million Americans lose health insurance.
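The arithmetic behind these extrapolations is simple enough to sketch. The snippet below assumes, as the extrapolations themselves do, that the studies’ ratios can be applied linearly and in reverse to people losing coverage; the function and constant names are only illustrative.

```python
def implied_deaths(people_losing_insurance: int, insured_per_life_saved: int) -> int:
    """Linear extrapolation: one death per N people who lose coverage."""
    return round(people_losing_insurance / insured_per_life_saved)

MEDICAID_STUDY = 455       # newly insured people per life saved
MASSACHUSETTS_STUDY = 830  # newly insured adults per life saved

for losses in (20_000_000, 30_000_000):
    for label, ratio in (("Medicaid", MEDICAID_STUDY),
                         ("Massachusetts", MASSACHUSETTS_STUDY)):
        deaths = implied_deaths(losses, ratio)
        print(f"{losses:,} losses, {label} ratio: {deaths:,} deaths per year")
```

With 20 million losses, the two ratios give roughly 44,000 and 24,000 deaths per year; with 30 million losses, the Massachusetts ratio gives roughly 36,000.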

The 36,000 number that Sanders quoted probably came from the conclusion of the second study, combined with an extrapolation based on 30 million health insurance losses. The Washington Post was right to assert that Sanders couldn’t know how many people would lose health insurance because of the Obamacare repeal; indeed, he was off by 5 to 8 million. So, even if we apply Sanders’ methodology to the death question, we would find something in the range of 30,000 people.

But even applying Sanders’ methodology is problematic. Translating these studies’ findings to an Obamacare repeal is not easy and amounts, to some extent, to comparing apples to oranges. The obvious caveat is that the two studies are heavily local, especially the one in Massachusetts, which has much better care and much more wealth than other states. Therefore, it is unclear how much the results in one American region can be scaled to the entire country. More broadly, it highlights a thorny problem when it comes to the evaluation of an Obamacare repeal: how many (healthy) people would happily choose to drop their insurance, how many would do it because they can’t afford it, and how many would do it for both reasons? This is rendered all the more complicated by the relative leakiness of the Obamacare mandate, which may mean that the healthy people who would drop health insurance after a repeal have already done so.

More notably, the repeal of Obamacare is not a full repeal. For instance, the bills keep the Obamacare structure, with its private healthcare markets, in place. They also change the distribution of the health insurance subsidies, potentially altering the composition of the group of Americans with health insurance. This means that, even if we could estimate the number of lives spared by Obamacare, we could not assume that it would equal the number of lives taken by an Obamacare repeal.

In conclusion, the assertion that the Republican bills will kill people is essentially speculative, but it probably is right. It is overwhelmingly likely that, if Republicans are able to repeal and/or replace Obamacare, millions more people will be uninsured in the medium run, and a lack of health insurance is correlated with higher mortality.

But to advance a precise number is unjustified. It is unclear how prone to getting sick the population that would forgo health insurance would be. Moreover, the studies that have estimated how many lives were spared by the extension of health insurance in other contexts display varying numbers. Now, all these caveats only make the estimate less reliable. They don’t necessarily mean that the real number is going to be smaller. However, the level of assertiveness of some Democrats with regard to these death estimates is simply not warranted.

In Defense of The Web Inspector

Here’s a funny thing about the Web: sometimes, secrets are hiding in plain sight. Indeed, when you browse a web page, you generally receive a lot of elements from its server. Of course, you generally obtain HTML and CSS markup, as well as JavaScript code, but also a variety of other files like fonts or data sheets. Then the browser combines and interprets all of that data to form the page you are browsing. What you see on the webpage is, then, only the tip of the iceberg; there is generally much more to it, and it’s sitting idle in your computer’s memory.

Often the rest of the iceberg is essentially worthless for journalism purposes. Sometimes, however, it can be crucial to access it. For instance, you could be looking at a visualization and longing to get the dataset behind what you are seeing. Or you might want to remove that stupid overlay sitting between you and the paywalled content. As it happens, more often than you might think, you can circumvent it. (We’ll see how to do this later.)

So, today, I wanted to talk about a tool that allows you to do that; more crucially, if you are reading this now on desktop, it is probably just a shortcut away:

  • If you are on Chrome or Safari on Mac, just trigger the shortcut Cmd+Alt+i.
  • If you are on Chrome on Windows/Linux, just press F12.
  • If you are on Edge/IE, just press F12.
  • If you are on Firefox on Windows/Linux, just trigger Ctrl+Shift+c.
  • If you are on Firefox on Mac, just trigger Cmd+Alt+c.

What you are seeing here is the Web Inspector. Some of you, probably, have heard of it, or used it; most journalists, maybe even the ones that are processing data, are not aware of its existence. A web inspector allows you to understand what is going on with the web page that you are visiting. It generally is organized around the same categories:

  • a console, which broadly is here to detect and notify errors.
  • a storage panel, which displays cookies and other data stored by the website on your computer’s hard drive.
  • a debugger, which is mostly useful for developers who seek to debug their JavaScript code.
  • a timeline, which displays how the page is loading (at what speed? What are the components that take the most time/space/computing power to load?),
  • along with a network panel which shows through which networking mechanisms these elements were loaded.
  • the resource panel, which shows all the elements used to load the page,
  • and the elements (or DOM explorer) panel, which shows how these elements fit together through HTML.

Let’s go back to the two scenarios that I laid out earlier, and use them as examples of how to harness a web inspector for journalistic purposes.

Let’s take, for instance, this applet. Made by a French public broadcaster, it tracks the attendance of local politicians across France. You can search by name or region but, sadly, you can’t directly download all the data. This is all the more disappointing because the website indicates that the dataset was compiled by hand, so you probably can’t find it elsewhere.

Well, with a web inspector, you can. If you open it and click on the network panel (and reload the page), you can see that there is a datas.json file that is being downloaded. (See the red rectangle.) Just click on it, and you can browse the dataset.
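Once the network panel has revealed such a file, you can also pull it into a script for analysis. The snippet below is a minimal sketch using only Python’s standard library; the URL is a placeholder (copy the real address from the network panel), and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Placeholder address: replace it with the URL shown in the network panel.
DATA_URL = "https://example.org/path/to/datas.json"

def load_json(url: str):
    """Download a JSON file and parse it into Python objects."""
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))

def summarize(data) -> str:
    """Describe the top-level shape of the parsed data."""
    if isinstance(data, list):
        return f"list of {len(data)} records"
    if isinstance(data, dict):
        return f"object with keys: {', '.join(sorted(data))}"
    return type(data).__name__

# Usage, once DATA_URL points at the real file:
# print(summarize(load_json(DATA_URL)))
```

Looking at the top-level shape first is a cheap way to decide how to flatten the file into a spreadsheet afterwards.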

Now let’s take a second example. You want to read an article on a paywalled website, say, ForeignPolicy.com. You will probably end up with this:

Now, there is a way to actually read the article in a few clicks. First, open the inspector by right-clicking on the dark part of the page and selecting “Inspect element”.

You should probably obtain a panel with an element of the HTML already selected. You can just remove it by pressing the delete key.

The problem, now, is that scrolling has been deactivated on this website, so you can’t get much further into the article. However, if you inspect one of the article’s paragraphs, the panel will display the part of the HTML file that corresponds to the article’s content. You can then expand every <p> (which is HTML-speak for a paragraph), or right-click and choose “Expand all” on the line above the first paragraph:

And here you have it:


It’s not the most practical way of reading an article, but it’s probably better than no article at all. (And to be clear, I’m all for paying for your content!)

The broader point is this: if you feel stuck on a webpage, if a webpage is somehow blocking you from accessing a deeper level of content, the web inspector may be able to help. It is not bulletproof, but, as we’ve seen here, it can sometimes save your research process.

In short, the web inspector is an underrated tool for journalistic research: it is already installed in every desktop browser, it is a de facto Swiss Army knife for web tinkering, and it is not that well-known. To me, it may become one of the common tools of journalism in the future.