This week we have
been attending a series of events held during Big Data Week. At the London Community Event, we heard a panel discussion on big data that included (amongst others) Hilary Mason, Chief
Scientist of Bit.ly, Doug Cutting, co-founder of the Apache
Hadoop project, and Nick Halstead, CTO and founder of Datasift. At
the same time as Big Data Week, the gaming industry has been attending
GiGse 2012 in San Francisco. During the CEO panel at GiGse, the importance of data was highlighted by
Jim Ryan, co-CEO of bwin.party, who said that bwin.party now has around 70
people in its business information team analysing data and feeding back
into its marketing operation. This is a big team of data analysts, and it shows that bwin.party is taking data very seriously. However, whilst large
organisations have the resources to deploy very large analytics teams to solve big data problems, we believe
that as data volumes continue to increase exponentially, solely adding staff to
process and mine increasingly large data volumes will not be a scalable
solution for any organisation in any industry. This view was confirmed by the expert panel at
the London Community Event. Here is a summary of some of the key discussion themes.
How important is the ‘Big’ in Big Data?
A philosophical explanation of big data centred on being able to look at data
with no preconceived ideas. As well as keeping an open mind, big data is about
being able to join multiple data sets and run analytics across them, rather than taking a siloed approach. Whilst
the panel disagreed on the relative importance of the word ‘big’ in big data, a recurrent message was that today it is much easier and cheaper to store and analyse very large
data sets. For example, large-scale data processing (such as map reduce) on platforms like Amazon
Web Services (AWS) has become commoditised, and these are now considered established
technologies and platforms. With the proliferation of eCommerce, social media and APIs, there is also far more volume and richness of data available to analyse today. If you are interested in reading an example of how the
combination of cloud computing (e.g. AWS) and Hadoop (map reduce) enables big
data processing at scale and at a significantly reduced cost, then we suggest you read this article: Big Pain or Big Profits?
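For readers who have not come across map reduce before, the idea can be illustrated with a small single-machine sketch in Python. This is only an illustration of the pattern, not Hadoop's API and not anything the panel presented: the sample records, the field names and the function names (map_phase, shuffle, reduce_phase, total_stakes_by_game) are all hypothetical. In a real cluster the map and reduce steps would run in parallel across many machines, which is where the scale comes from.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample records: (player_id, game, stake). Invented for illustration.
EVENTS = [
    ("p1", "poker", 10.0),
    ("p2", "bingo", 2.5),
    ("p1", "poker", 5.0),
    ("p3", "bingo", 1.5),
]


def map_phase(event):
    """Emit one (key, value) pair for one input record."""
    _player, game, stake = event
    return (game, stake)


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, reduce(lambda a, b: a + b, values)


def total_stakes_by_game(events):
    """Run map, shuffle and reduce in sequence on a single machine."""
    pairs = map(map_phase, events)
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(total_stakes_by_game(EVENTS))
```

The point of splitting the work this way is that each record is mapped independently and each key is reduced independently, so both steps can be spread over as many machines as the data volume requires.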
What are some of the challenges?
Arguably the biggest challenge is how organisations find
the important nuggets of information. Taking a 'boil the ocean' approach to big data is fraught with difficulty, a theme we examined in our blog on Data Inflation last month. It was stated
that one of the major benefits of big data is the ability to get answers to
questions back quickly, where in the past this could take weeks or months. But this needs access to an increasingly important resource: the data scientist, who combines maths, computing and domain expertise with an open and inquisitive mind. Finding these people is a major headache for most organisations. There was also discussion about academic access
to, and use of, data. It was argued that commercial organisations remain very reluctant to
share information via open research projects, as there is a lack of trust as to
how these data sets will be used and by whom (an example of an open research project within the gaming industry is The Transparency Project). A key infrastructure challenge is that the internet was not designed to process large data sets at low
latency. Unsurprisingly, ensuring compliance with data privacy requirements remains a major focal point.
Should You Care?
If you want to make better decisions, then the answer is yes! The end product of any data or big data project has to be better decision making. In gaming this means supporting decisions across several aspects of the business: game design, game performance, one-to-one marketing and consumer protection, and finance and risk management. So, whilst we can argue about the importance of the word 'Big' in big data, we cannot argue about the increasing relevance of data to managing our businesses today. As the panel concluded, "we are just scratching the surface of what is possible".