How can a company acquire a big data field, the oil of the digital economy?

Everything in our daily lives is becoming connected: our car, house, oven, stove, washing machine, bicycle, TV. We can even be linked ourselves through a connected bracelet! All these sensors generate a titanic volume of data that doubles every year; this data can be likened to the new oil of the 21st century.

So how do companies succeed in acquiring such a big data field? How can this big data be transformed into smart data? What impact does this have on a company’s business model? And what are the risks?

I’m starting this year with a series of reflections and by sharing some experiences on this subject, beginning, quite naturally, with the first question: how can a considerable mass of data, known as big data, be acquired?

Companies in possession of big data will enjoy a considerable competitive advantage, provided that they are able to make use of it. This phenomenon applies to all areas, from targeted advertising to Industry 4.0, as well as personalised medicine and precision farming.

What strategies can be put in place to create such a big data field? Let’s consider three areas as examples: health care, online advertising and mobility.

1.  Setting up a global genome biobank

Medicine will be transformed by players who manage to obtain big data. Indeed, to offer predictive medicine, and subsequently preventive medicine, the industry needs to train its algorithms with a considerable mass of data. Many players are tackling this market with different strategies. Here are two examples:

The first approach: aggregating data collected by hospitals

University hospitals such as Lausanne University Hospital build up biobanks on the basis of voluntary cooperation by patients. By developing partnerships with many hospitals, a company can assemble a large mass of data. To obtain this data, it offers hospitals solutions for storage, secure data sharing and analysis of genome sequencing results. This is the approach taken by the start-up Sophia Genetics in the canton of Vaud, which has already signed contracts with more than 400 establishments.

The second approach: offering a service for which clients must supply their data

For USD 99.00, the start-up 23andMe invites you to discover your origins over the past 200 years. You may thus discover that your background is, for example, primarily Swedish and Danish, but that you are also of Italian descent. Starting this year, you can also obtain information such as your risk of developing a genetic disease like Parkinson’s, or your predisposition to be thinner or fatter than average. The process is quite simple: the customer orders a kit containing a test tube, places a small amount of saliva in the tube and sends it back. Two months later, at the most, the customer has the results online. By offering this service, 23andMe has so far collected more than one million genomes.

2.  Knowledge of all our actions on the internet

To gain an additional perspective, let’s have a look at a more mature sector, advertising with Google. Google’s business model is centred on data; more than 80% of its income is generated by monetising its big data through the sale of targeted advertisements online. What did it do to access this big data?

1. Access to the flow of data on our computers: Google introduced its free web browser Chrome in 2008. The graph below shows the changes in the market shares of the various web browsers: in 10 years, Google Chrome has captured nearly 80% of the market.


2. Access to the flow of data on our mobile phones: Google bought the start-up Android in 2005, and its operating system is now used by more than 80% of smartphones.

Chrome and Android are thus the interfaces that enable Google to acquire a considerable mass of data on our activities. The Mountain View-based firm has obtained such a large market share that it is now the dominant player. Google’s strategy for accessing data can be summed up in a statement made by Peter Thiel (whose book “Zero to One” I’m currently reading): “competition is for losers, look to build a monopoly”.

3.  Capturing the data of the mobility of the future

Cars are all becoming connected to their environment and they are continuously sending back data. A number of applications are set to appear: for example, if we obtain information on the activation of the windscreen wipers of a large number of cars, we’d be able to map, in real time, the position and movement of storms.
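To make the windscreen wiper idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `WiperReport` record, the 0.1-degree grid and the thresholds are hypothetical and do not describe how any manufacturer actually processes its data. The idea is simply to group the reports received during one time window by map cell and to flag the cells where most cars have their wipers on.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WiperReport:
    """Hypothetical message sent by a connected car during one time window."""
    vehicle_id: str
    lat: float
    lon: float
    wipers_on: bool


def grid_cell(lat, lon, cell_deg=0.1):
    """Map a position to a coarse grid cell (assumed cell size: 0.1 degree)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def rainy_cells(reports, cell_deg=0.1, min_vehicles=3, min_share=0.5):
    """Return the cells where enough distinct cars report active wipers.

    min_vehicles and min_share are illustrative thresholds, not calibrated values.
    """
    seen = defaultdict(set)    # cell -> every vehicle observed in the cell
    active = defaultdict(set)  # cell -> vehicles with wipers switched on
    for report in reports:
        cell = grid_cell(report.lat, report.lon, cell_deg)
        seen[cell].add(report.vehicle_id)
        if report.wipers_on:
            active[cell].add(report.vehicle_id)
    return {
        cell
        for cell, vehicles in seen.items()
        if len(vehicles) >= min_vehicles
        and len(active[cell]) / len(vehicles) >= min_share
    }


# Made-up reports: three cars near Lausanne with wipers on, three near Zurich with wipers off.
reports = [
    WiperReport("car-1", 46.52, 6.63, True),
    WiperReport("car-2", 46.53, 6.64, True),
    WiperReport("car-3", 46.55, 6.66, True),
    WiperReport("car-4", 47.37, 8.54, False),
    WiperReport("car-5", 47.38, 8.55, False),
    WiperReport("car-6", 47.36, 8.53, False),
]
wet = rainy_cells(reports)
```

Running the same function on successive time windows and comparing the flagged cells would give a rough picture of how a storm moves, which is also why the value of such a map grows with the number of cars reporting.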

Three examples of strategies for accessing this big data:

  • Google (again!) wants to get into our cars and offers Android Auto, which, as is the case with Chrome and Android, will enable it to capture our data.
  • Tesla offers a connected, increasingly autonomous vehicle, integrating its own operating system. Tesla has understood that it must maintain control of the data generated by these vehicles.
  • Uber and Lyft offer an innovative mobility service based on a platform that collects all mobility information. Uber is already sharing anonymised data through Uber Movement, with the goal of improving city planning.

In conclusion, I’ve taken three industries as examples, but the challenge of succeeding in accumulating big data is present in all fields. There are many data acquisition strategies, but as the online advertising field shows, the companies that have anticipated this trend will find themselves in a favourable position. There may also be a second question we need to ask ourselves: Could the highly polarised distribution of big data create economic and societal imbalances?

Below, you’ll find my presentation on this subject at the EPFL Swiss Data Day last November:

You can also access my articles in French on the website of the newspaper Le Temps.