Faculty viewpoints: The infinite capabilities of data science
Tom Breur discusses the infinite capabilities of data science, what it can do for tech companies, and some of the potential pitfalls of this discipline.
For the past 20 years, Lecturer Tom Breur has been working at the forefront of the evolving field of data science. Breur currently works for Boston Scientific, leading the Data & Analytics Center of Excellence, working to improve digital health and data-driven decision making. He teaches Applied Data Science in the M.S. in Innovation & Management program.
To learn more, visit Breur’s blog, where he writes about topics such as data science as a profession, using analytics in the workplace, data governance, business intelligence, and much more.
Tufts Gordon Institute recently spoke to Breur about the infinite capabilities of data science, what it can do for tech companies, and some of the potential pitfalls of this discipline.
Q: Why is it important for anyone working in tech and science to understand the significance of data analytics?
A: This field is becoming so prominent and so prevalent that if you have any position in management or innovation in a tech company, you will run into data science. I would say at least 75 percent, if not 90 or 95 percent, of all tech professionals will run into data science in their work within a couple of years.
Q: How has the use of analytics changed the competitive landscape?
A: Over the last 20 years or so, the push has been coming both top-down within corporations and bottom-up from the marketplace. Executive teams are increasingly demanding financial accountability. They want hard evidence to support important strategic decisions. And it’s not just because of scandals like Enron, for example, where the lid came off. It’s becoming the norm across the board that if you want to propose a significant investment or a change in strategy, you’d better back up your plea with hard data. And it’s also coming from the bottom up – 20 years ago, only certain industries had access to lots and lots of data. But now, big data is everywhere. The Internet, of course, has played a huge role in that.
Q: Which industries or verticals have been quickest to embrace the use of data analytics?
A: The earliest companies to join the data bandwagon were those for which capturing data was a native capability, a part of their primary business process – for example, financial services, where everything already happens electronically and transactions have to be 100 percent accurate. The same goes for the telecommunications industry, which already had call detail records. There was a natural inclination in those industries to make sure that all those data were captured very accurately. Hence, they had a head start when it came to leveraging all those data with analytics.
And then with the boom and bust around 2000, companies could grow at Internet pace by running live experiments on their web sites. That’s what’s happening right now, which is interesting in terms of product innovation. A great example, and a company I admire, is Tesla. They start advertising and promoting a new vehicle long before they actually start producing it – they figure out along the way what features consumers want and are willing to pay for. It’s a great example of how new big data technology is leading the way in product development, and also in marketing.
Q: Which science and tech fields are finding ways to embrace data analytics?
A: In genomics, for instance, the innovations have been huge and incredibly valuable. In the past, innovations were happening largely in laboratories. Now, innovations are often happening behind the computer. This is happening in astronomy as well. No longer are the people who are working in the lab the only ones who can innovate and invent.
Pharma companies are also embracing data analytics. When they run early tests on new drugs, they spend billions of dollars, and they may choose to start out with maybe 10,000 candidate compounds to test. Along the way, they narrow those down to the 100 most promising compounds for field testing. Nowadays, rather than spending the money upfront to test the many compounds that may never make it to market, they do it the other way around. They simulate two orders of magnitude more compounds at the very early stage of testing, with the sole purpose of identifying the candidates most likely to make it through testing. That way, they sometimes manage to improve the success rates of clinical trials by a factor of two to five, which is significant for their enormous R&D budgets.
Q: Which industries are still figuring out how to best use big data?
A: In healthcare, where I’m working now, there’s no standardization across electronic patient records. Every hospital has its own system and standard, so they’re not interoperable. In other industries, if different providers didn’t collaborate, nothing would work – imagine if AT&T and Verizon didn’t allow phone calls between their users. It’s one of the reasons the healthcare system is so inefficient and costly – there hasn’t been any driving force to standardize patient records.
The healthcare industry realizes that, and it will get there – but it’s not easy to make those transitions. There is an awareness that these systems need to talk to each other, and that they should default to high-quality data rather than messy data with lots of missing information.
Q: What advice or words of caution do you have as companies embrace big data?
A: A common theme that I hear across all industries is that data are available almost everywhere, except they aren’t necessarily usable yet. And the difference between data being potentially available and data being usable is how successfully companies manage to integrate disparate data sources – that is not straightforward at all.
Secondly, to ensure high-quality data, we need to work with the awareness that data are not just a byproduct, but can be a valuable resource themselves. That awareness is starting to grow. More and more companies are realizing that they need to manage data as an asset. Just as they manage intellectual property and their tangible inventory as assets, data can and should be managed as an asset just the same. Data governance is a relatively new competence that requires attention and new ways of thinking.
Q: What are some of the risks, especially related to security?
A: Remember that data are just data – data are not reality. Because data are so rich and can paint such a compelling story about consumer behavior, you may be tempted to believe what the data show is actually true – but it’s not. It’s a digital reflection of something that’s supposed to have happened. There are risks.
Clearly, there needs to be a completely new level of awareness around security, as the Equifax breach has taught us, among others. I’m afraid the same thing could have happened or might have happened at other companies as well. That awareness is important going forward as we rely more and more on digital processes running autonomously and data being ubiquitously available. We need to get used to different standards of security and authentication or else our world could become a scary place to live in.
In the Faculty Viewpoints series, Tufts Gordon Institute faculty share thoughts on the latest news and trends in leadership, business, technology and entrepreneurship.