The saying “Lies, damned lies, and statistics” was popularized by Mark Twain, who attributed it to the British prime minister Benjamin Disraeli. The actual origin of the phrase isn’t clear and doesn’t really matter – it’s as true today as it was in Twain’s day and far more relevant in a world flooded with statistical analyses derived from reams upon reams of data.
But what does this evergreen phrase have to do with founders and mentoring? Increasingly, in our data-driven world, investors and business partners insist on quantitative evidence of both a new venture’s market opportunity and the size of the problem it is trying to solve. Many founders therefore need to rely on third-party research to buttress their arguments, as they don’t have the resources to do in-depth primary research themselves. Typically the sources of the data they rely upon – government agencies and market research analysts – have far more credibility than the founders do.
I came across a perfect example of the perils of “data risk” for founders in The New York Times article “Digital Divide Is Wider Than We Think, Study Says” by Steve Lohr. It describes how Microsoft researchers studied the actual use of high-speed internet in the U.S. They concluded that 162.8 million people do not use the internet at broadband speeds. However, the Federal Communications Commission (FCC) claims that only 27.7 million Americans lack broadband access!
If you are a company like Microsoft that is interested in profiting by solving the so-called digital divide – the large gap between the many millions of U.S. residents who have broadband access and those who don’t – you have to have the correct data to act as the foundation for your business. Microsoft, of course, has far more resources than are ever available to a startup, but the discrepancies its researchers found between the FCC’s findings and theirs are dramatic. This gap is most striking in rural areas of the U.S. For example, in Ferry County in northeastern Washington, the area highlighted in the Times article, Microsoft estimates that only 2 percent of people use broadband service, versus the 100 percent the federal government says have access to the service.
It turns out that the FCC relies on simplistic surveys of internet service providers (ISPs) that inherently overstate coverage. For example, if one business in an area has broadband service, then the entire area is typically considered to have broadband service available. You can get the full details of how Microsoft generated its data in the article, which I highly recommend, but suffice it to say that Microsoft did not rely solely on ISPs for its data – its researchers performed primary research.
There are several lessons for founders who are relying on government or other sources for data that may be the foundation of their business case.
- Consider the source of the data. In this case one would expect the FCC to have accurate data. But anyone who has been following the politicization of the FCC would realize that the agency is very biased: it’s in the FCC’s interest to show that it is doing a great job of making broadband universally available through its policies that favor ISPs.
- Try to learn the source behind the source. The FCC did not do any primary research into broadband access. It relied entirely on third parties – ISPs – who, like the FCC itself, were biased towards showing universal broadband access.
- Find out the foundational definitions. Nowhere in the Times article is broadband defined! Access speeds can vary tenfold, from 100 Mbps to 1 Gbps service, so a figure for “broadband access” means little until you know which speed threshold it assumes.
- Learn how the data was gathered. The FCC relied on simplistic surveys of ISPs rather than performing rigorous surveys of consumers’ access to broadband. The best data is gathered from primary – firsthand – research.
- Find multiple sources for the data. One way to factor out bias or poor data collection techniques is to find more than one source for the data. Just as it’s a risk for manufacturing companies like Apple to rely on a single vendor, if your venture is relying on just a single source for foundational data you are exposing yourself to the risk that your data may be in the “damned lies” category.
- Verify data sources with your own research. Let’s assume you were looking to start a business to serve a rural area, such as an internet-connected animal tracking service for cattle ranchers. Rather than basing your total addressable market on a single, potentially biased source like the FCC, do your own primary research. By interviewing just a few ranchers you would probably find that most lacked internet service or relied on very slow satellite systems for their internet connectivity. Others might even have to drive to their local library to access the internet. Hard to get real-time location tracking by relying on your local library for broadband access!
- Understand that there is no such thing as “objective data.” No matter how rigorous the statistical methods used, the biases of the researchers will seep into the data. Microsoft has its own bias, as it is trying to convince the FCC to allow it to use the “white space” between TV channels to deliver internet access to rural areas. So it’s in Microsoft’s interest to show very low levels of internet access – just the opposite bias of the FCC (and of big ISPs and broadcasters, both of which are dead set against Microsoft’s initiative).
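To see how much the choice of source can move a market estimate, here is a minimal Python sketch of the arithmetic. It uses the two Ferry County percentages quoted in the Times article; the rancher count and the helper names `tam_range` and `spread_ratio` are my own hypothetical placeholders, not figures or methods from the FCC or Microsoft. Keep in mind that the two sources measure different things (availability versus actual use), so treat the result as a range to investigate, not as a forecast.

```python
# Illustrative only: the rancher count is a made-up placeholder,
# not a figure from the article or from any real data source.

def tam_range(population, share_by_source):
    """Return (low, high) market sizes across the competing source estimates."""
    sizes = [population * share for share in share_by_source.values()]
    return min(sizes), max(sizes)


def spread_ratio(low, high):
    """How many times larger the high estimate is than the low one."""
    return float("inf") if low == 0 else high / low


# Ferry County broadband share according to each source cited in the article.
# The FCC figure is availability; the Microsoft figure is actual use.
broadband_share = {"fcc_availability": 1.00, "microsoft_usage": 0.02}

hypothetical_ranchers = 500  # assumption for illustration only

low, high = tam_range(hypothetical_ranchers, broadband_share)
print(f"Reachable ranchers: between {low:.0f} and {high:.0f}")
print(f"The sources disagree by a factor of {spread_ratio(low, high):.0f}")
```

If the high and low estimates differ by a large factor, as they do here, that is a signal to go and interview customers before putting either number in a pitch deck.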
The bottom line is that secondary market research that has value in determining your total addressable market may be necessary, but it is not sufficient. Supplement that research with your own primary research, as cash-constrained as that research may be. A good question to ask potential customers during your customer discovery process is how big they would estimate your market to be and what data, if any, they rely on themselves. While not all data is lies or even damned lies, it may well be tainted by bias – it’s up to you to understand and account for the biases and assumptions of your market research sources.