Thursday 26 April 2012

Scaling Data to Make Better Decisions

This week we have been attending a series of events held during Big Data Week. At the London Community Event, we heard a panel discussion on big data that included (amongst others) Hilary Mason, Chief Scientist of Bit.ly, Doug Cutting, co-founder of the Apache Hadoop project, and Nick Halstead, CTO and founder of Datasift. At the same time as Big Data Week, the gaming industry has been attending GiGse 2012 in San Francisco. During the CEO panel at GiGse, the importance of data was highlighted by Jim Ryan, co-CEO of bwin.party, who said that bwin.party now has around 70 people in its business information team analysing data and feeding back into its marketing operation. This is a big team of data analysts - bwin.party is taking data very seriously. However, whilst large organisations have the resources to deploy very large analytics teams to solve big data problems, we believe that as data volumes continue to increase exponentially, solely adding staff to process and mine increasingly large data volumes will not be a scalable solution for any organisation in any industry. This view was confirmed by the expert panel at the Big Data Community Event. Here is a summary of some of the key discussion themes.

How important is the ‘Big’ in Big Data?
A philosophical explanation of big data centred on being able to look at data with no preconceived ideas. As well as requiring an open mind, big data is about having the ability to join multiple data sets and run analytics across them, rather than taking a silo approach. Whilst the panel disagreed on the relative importance of the word ‘big’ in big data, a recurrent message was that today it’s much easier and cheaper to store and analyse very large data sets. For example, large scale data processing (i.e. map reduce) on platforms such as Amazon Web Services (AWS) has now become commoditized, and these are now considered established technologies and platforms. And with the proliferation of eCommerce, social media and APIs, there is far greater volume and richness of data available to analyse today. If you are interested in reading an example of how the combination of cloud computing (e.g. AWS) and Hadoop (map reduce) enables big data processing at scale and at a significantly reduced cost, then we suggest you read this article - Big Pain or Big Profits?

What are some of the Challenges?
Arguably the biggest challenge is how organisations find the important nuggets of information. Taking a 'boil the ocean' approach to big data is fraught with challenges, a theme we examined in our blog on Data Inflation last month. It was stated that one of the major benefits of big data is the ability to get answers to questions back quickly, which in the past could take weeks or months - but this needs access to an increasingly important resource - the data scientist - a combination of math, computing, and domain expertise, coupled with an open and inquisitive mind. And finding these people is a major headache for most organisations. There was also discussion about academic access to and use of data. It was argued that commercial organisations remain very reluctant to share information via open research projects, as there is a lack of trust as to how these data sets will be used and by whom (an example of an open research project within the gaming industry is The Transparency Project). A key infrastructure challenge is that the internet was not designed to process large data sets at low latencies. Ensuring compliance with data privacy requirements, unsurprisingly, remains a major focal point.

Should You Care?
If you want to make better decisions then the answer is yes! The end product of any data or big data project has to be focused on better decision making. In gaming this equates to supporting decision making across aspects of the business: game design, game performance, 1-2-1 marketing and consumer protection, and finance and risk management. So whilst we can argue about the importance of the word 'Big' in big data, we cannot argue about the increasing relevance of data to managing our businesses today. As the panel concluded, "we are just scratching the surface of what is possible".

Sunday 15 April 2012

British Gambling Prevalence Survey - Why the basic math should concern the British Government

The 2011 British Gambling Prevalence Survey (BGPS) reported that problem gambling levels as a percentage of the adult population rose from 0.6% to 0.9% between 2007 and 2010. There are two ways to interpret this data. One can contextualise the data by comparing problem gambling prevalence rates in the UK to other countries e.g. UK problem gambling prevalence rates are lower than in other countries, such as Australia and the US, and similar to Germany and Norway. One can also look at the absolute increase and say a 0.3 percentage point increase is a very small increase in the grand scheme of things, and also caveat the results by stipulating they were at the 'margins of statistical significance'.

The other way to interpret these results is to think about them in the context of basic arithmetic. In a YouTube video that has had over 4 million hits, Albert Bartlett explains what he believes is the greatest shortcoming of the human race - Our Inability to Understand The Exponential Function. Let's apply some of this basic math to the BGPS results with a high-level analysis of the headline figures.

The UK problem gambling prevalence increase from 2007 to 2010 equates to around 14.5% per year. The prevalence survey also states that the rate of problem gambling in the UK population remained constant at 0.6% between 1999 and 2007. If we calculate the annual increase over this longer period instead (1999 to 2010), it was around 3.8% per year. If we assume the UK adult population will grow at 0.58% per year and look ahead to 2018, the best case scenario is that the number of UK problem gamblers will have increased by over 40% to around 650,000 adults (greater than 1.2% of the total adult population); the worst case scenario is that the prevalence rate will have tripled to 2.7% (if the trends in the most recent BGPS continue). If we assume the true growth rate is somewhere in between (e.g. the midpoint of the two rates, around 9%), the projected problem gambling prevalence rate will be at 1.6% of the total adult population by 2018. If we also assume that the UK government will not limit gambling supply and that funding for research and education will remain relatively flat (based on the previous 3 years), whichever way you look at it using basic arithmetic, the UK is on course to have the same levels of problem gambling prevalence as those very same countries that today we point to in order to make the current UK rates look relatively low.
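
For readers who want to check the arithmetic themselves, here is a minimal Python sketch of the compound growth calculation behind the figures above. It is our own illustration, not part of the BGPS: the function names are hypothetical, the 8-year horizon (2010 to 2018) is an assumption, and only the headline prevalence rates and the 0.58% population growth quoted above are used as inputs. The in-between case is sensitive to the exact horizon and rounding chosen, so the sketch may not reproduce every quoted figure exactly.

```python
def annual_growth(start, end, years):
    """Compound annual growth rate that takes `start` to `end` over `years` years."""
    return (end / start) ** (1 / years) - 1


def project(value, growth, years):
    """Value after compounding at `growth` per year for `years` years."""
    return value * (1 + growth) ** years


# Headline prevalence rates quoted in the post (percent of UK adults).
RATE_BEFORE = 0.6   # reported for 1999 and 2007
RATE_2010 = 0.9     # reported for 2010

POPULATION_GROWTH = 0.0058   # assumed adult population growth per year
HORIZON = 2018 - 2010        # assumption: project 8 years forward from 2010

# Worst case: all of the increase happened between 2007 and 2010.
fast = annual_growth(RATE_BEFORE, RATE_2010, 2010 - 2007)
# Best case: spread the same increase over 1999 to 2010.
slow = annual_growth(RATE_BEFORE, RATE_2010, 2010 - 1999)
# In-between case: simple midpoint of the two annual rates.
mid = (fast + slow) / 2

for label, growth in [("best", slow), ("midpoint", mid), ("worst", fast)]:
    rate_2018 = project(RATE_2010, growth, HORIZON)
    # The headcount grows with both the prevalence rate and the population.
    headcount_multiplier = project(1.0, growth, HORIZON) * project(
        1.0, POPULATION_GROWTH, HORIZON
    )
    print(
        f"{label:>8}: {growth:.1%} per year, "
        f"{rate_2018:.2f}% of adults in 2018, "
        f"headcount x{headcount_multiplier:.2f} versus 2010"
    )
```

Changing HORIZON or POPULATION_GROWTH shows how quickly small annual percentages compound, which is the point Bartlett makes about the exponential function.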