James Kobielus is an industry veteran. He serves as IBM's big data evangelist, as senior program director for product marketing in big data analytics, and as editor-in-chief of IBM Data Magazine. Kobielus spearheads IBM's thought leadership activities in Big Data, Hadoop, cognitive computing, enterprise data warehousing, and advanced analytics. He advises IBM's product management and marketing teams on all things big data.
James will be speaking on the topic of Big Data and Cognitive Computing at the PASS Business Analytics Conference in Santa Clara, CA, April 20-22.
Recently, we sat down with James to learn about his views on the analytics industry today.
Tell us your data story: How did you first become interested in working with data, and what path did you take to where you are today?
I’ve been in IT since the mid-1980s, but my professional involvement in the data industry started in 2006, when Current Analysis hired me as their Principal Analyst covering the Data Management industry. From there, I moved to Forrester Research, where I was the analyst covering data warehousing, business intelligence, big data, advanced analytics, and other topics with a data focus. IBM then brought me on board three years ago as their Big Data Evangelist, Senior Program Director for Product Marketing in Big Data Analytics, and Editor-in-Chief of IBM Data Magazine. My deepening role as a thought leader in big data, cognitive computing, and cloud analytics has been the most rewarding phase of my career.
What excites you most in the world of Big Data and advanced analytics?
Where do I start? I evangelize everything in the big data analytics arena. I’m genuinely excited by so much. Perhaps what excites me most is that data and analytics touch every business process and every aspect of our lives as consumers. Consequently, this is far from a dry topic and in fact is rich with applications that drive the world and make people’s lives better. It’s fun and stimulating to evangelize a solution area that is so rich in applications and value. And big data analytics is one of the most innovative areas in the entire high-tech universe. I learn something new every day and get to share my thinking on it far and wide. That’s what I love the most.
What one tool or technology would you tell analytics professionals to put at the top of their to-learn list this year?
Of course, I work for a solution provider and would prefer that the one tool be anything from our diverse portfolio of great products and services. But speaking agnostically, I’d recommend that analytics professionals put cognitive computing at the top of their to-learn list. Cognitive computing is artificial intelligence for the 21st century, leveraging machine-learning algorithms to extract meaningful correlations and other insights directly from unstructured data sources as well as relational data. Machine learning refers to a wide range of statistical algorithms, such as neural networks, that are at the heart of IBM Watson and other modern cloud analytics services. Without cognitive computing, machine learning, and neural networks, organizations would have a tough time tracking customer sentiments on social media, automating face recognition on video streams, and otherwise sifting through the mind-boggling volumes of various data types at high velocities.
Data often surprises us and causes us to rethink assumptions. As a full-time research analyst in your former life, what were some of the more interesting or unexpected findings you discovered through data?
In terms of my professional experience, bear in mind that I myself have never worked as a data scientist or statistical analyst. As an industry analyst, I was a market researcher whose primary findings were qualitative, not quantitative. Like the vast majority of business analysts, I did most of my core analyses in spreadsheets populated by data that I’d gathered myself through vendor and user surveys. My working analyses were always in the “small data” territory familiar to most users of business intelligence applications. In terms of that research, the most interesting findings that I discovered through data were in a benefit-cost-risk analysis of data warehousing (DW) appliances back in 2010. What I found through spreadsheet analysis of my user-survey data is that a typical DW appliance deployment breaks even within a year when the project involves consolidation of two or more DWs or data marts onto a newer, higher-capacity, lower-cost appliance. And that’s purely in cost savings and cost avoidance on the DW infrastructure side. When you factor in productivity improvements on the user side from having a higher-performance DW running faster queries and enabling greater user and workload concurrency, the break-even is even sooner. Bear in mind that I came to these data-driven research conclusions as an independent analyst. I’m now with a solution provider in the DW appliance market, so take what I’m telling you with all appropriate grains of salt.
As a Big Data evangelist, what’s the biggest caution you have for businesses looking to unlock the power and promise of Big Data?
If you’re in the majority of users that only need smaller-scale transaction processing, data mart, and reporting capabilities on relational databases, don’t rush into big data until you have a clear business need. However, that doesn’t mean that you shouldn’t migrate to a newer generation that supports big data as your current DBMS nears the end of its useful life. Big data isn’t just storage and processing capacity; it’s also about having a DBMS platform that doesn’t lock you into capacity constraints when the day comes that you need to expand storage into the terabytes, when you need to move from batch to real-time velocities, and when you need to analyze new unstructured data sources. Having a DBMS platform that can scale up and out (perhaps by adding blades or racks in a DBMS appliance architecture or by purchasing additional on-demand DBMS capacity in a big-data cloud service) gives you peace of mind that you can grow your existing investment cost-effectively when you need to.
On your blog, you recently wrote that with the democratization of data, you don’t have to be a data scientist to add value to your business through analytics. If you were preparing for a career in business or data analytics today, what knowledge and skills would you focus on?
I would focus on problem-solving skills in some practical domain, such as finance, marketing, supply chain management, human resources, and so on. There are many excellent, user-friendly, feature-rich data-science and advanced visualization tools and cloud services on the market, including those from IBM. A prepared analytical mind can master any of them as easily as it learned how to use spreadsheets. But none of them can help you solve business problems if you don’t have the domain knowledge to guide your search for data-driven patterns relevant to whatever business problem you’re trying to solve.
Who is your data hero and why?
I believe in science, and, even in that domain, I don’t focus on individual heroes. Science is the human activity of building and testing interpretive frameworks (aka hypotheses) through controlled observation of empirical data. Every scientist must, of necessity, be a “data scientist,” in the sense that all conclusions must defer to the observational data itself. The scientific establishment is worthy of continued respect if it keeps that dictum (the data is paramount) at the heart of its collective efforts to extract insights from the bewildering complexity of the cosmos. The most heroic feat of the scientific establishment is sifting through fresh data, or even old data, to find heretofore hidden insights. Right now, I’m amazed at the ongoing scientific discovery of distant “exoplanets” in our galaxy, based on minute analysis of the data on the gravitational wobbles of the stars around which those celestial bodies orbit. This is truly a matter of “dark data” being illuminated through ingenious analytical techniques. I find that incredibly heroic from a historical standpoint, ranking right up there with Galileo’s use of the telescope to discover moons orbiting other planets in our solar system. Truth be told, Galileo was more heroic, because he had to suffer official censure for proclaiming this insight. Modern astronomers are universally praised.
Learn more with James Kobielus – catch his session Unlocking Big Data: The Power of Cognitive Computing at the PASS Business Analytics Conference in Santa Clara, CA, April 20-22.