Saturday, April 12, 2014

Big Data - An introduction (benefits and advantages)

See also: Hadoop Introduction

The amount of data being generated and stored grows every year.
The number of data sources is growing too (smartphones, blogs, social media feeds, etc.).
Much of this data can be used to make better business decisions.




By leveraging big data, businesses can make sense of unstructured and structured data to better understand customer behavior.

What's new?
Customer intelligence and business intelligence have traditionally focused on past data, whereas Big Data can provide forward-looking insights.

Big Data analytics promises to make our world a more intuitive place, and businesses need to develop their data analytics capabilities to capitalize on the trend.



Data can be used to influence purchasing decisions.
It also helps an organization move from being reactive to being proactive.







What does analytics mean?
There are three A's:
  • Analytics on the data - What happened? [Correlation]
  • Attribution of the data - Why did it happen? [Cause]
  • Algorithms built on the two above
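
To make the first "A" concrete, here is a minimal sketch of measuring correlation between two series. The weekly figures (`searches`, `units_sold`) are made up for illustration, and the `pearson` helper is a hypothetical name, not part of any Big Data tool. Note that a high value only says the two series move together (what happened); it says nothing about cause (why it happened).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Made-up weekly figures: searches for a product and units of it sold.
searches = [10, 20, 30, 40, 50]
units_sold = [12, 24, 33, 46, 55]

r = pearson(searches, units_sold)
print(f"correlation between searches and sales: {r:.3f}")
```

A value close to 1 means the two series rise and fall together; a value close to 0 means there is no linear relationship.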

What are the business advantages of this data?
A user's search patterns can help companies predict his or her actions and promote relevant products.
Some examples:
  • Companies can predict the movies you like.
  • Companies can predict the music you like.
  • Your current state of mind, inferred from your searches, can be used to help you find the right products and to nudge you toward buying more products similar to the ones you searched for (used effectively by Amazon).
  • Google can predict weeks in advance which areas could have a flu outbreak if many people in the same locality are searching for information about flu.
  • Hurricane notices are given a week in advance. Companies can use historical data on how people behave before a hurricane to predict which products will sell more, and stock up on those products proactively.
  • Analyzing human behavior helps companies increase revenue. For example, a grocery store may sell more 100 INR wine bottles if it offers three options (50 INR, 100 INR and 500 INR) than if it offers only two (50 INR and 100 INR). The 500 INR bottle may not sell, but it makes the 100 INR bottle look like the sensible choice.
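
The flu example above boils down to counting matching searches per locality and flagging the localities where the count is unusually high. The sketch below is only an illustration of that idea, not how Google does it: the `flag_localities` function, the `search_log` format (pairs of locality and query text), the sample data and the fixed `threshold` are all assumptions.

```python
from collections import Counter

def flag_localities(search_log, keyword, threshold):
    """Return localities with at least `threshold` searches mentioning `keyword`.

    `search_log` is an iterable of (locality, query) pairs (assumed format).
    """
    counts = Counter(
        locality
        for locality, query in search_log
        if keyword in query.lower()
    )
    return sorted(loc for loc, n in counts.items() if n >= threshold)

# Made-up search log for illustration.
search_log = [
    ("Pune", "flu symptoms"),
    ("Pune", "how long does flu last"),
    ("Pune", "Flu medicine for children"),
    ("Mumbai", "cricket scores"),
    ("Mumbai", "flu symptoms"),
    ("Delhi", "train timings"),
]

flagged = flag_localities(search_log, keyword="flu", threshold=3)
print(flagged)
```

A real system would compare each locality's count against its own historical baseline rather than a fixed threshold, since a large city produces more searches of every kind.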

Big Data processes:

Step 1.
Acquire as much data as possible from as many sources as possible.

Step 2.
Store the data in the Hadoop Distributed File System (HDFS).

Step 3.
Browse the data, look for relationships and identify patterns across the data sets.

Step 4.
Use the MapReduce programming model to transform, organize and aggregate the information.
Data mining and analytics tools are then run on this aggregated data (combined with existing data).
The relationships can be visualized in charts and reports to provide value to business users.
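
The map and reduce idea in Step 4 can be sketched in a few lines of plain Python. This is a single-process illustration of the programming model, not the Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `sales_mapper`, `sum_reducer`) and the "product,amount" input format are assumptions made for the example. In Hadoop, the same three phases run in parallel across many machines over data stored in HDFS.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}

def sales_mapper(line):
    """Turn one 'product,amount' line (assumed format) into a (product, amount) pair."""
    product, amount = line.split(",")
    yield product.strip(), int(amount)

def sum_reducer(key, values):
    """Total the amounts seen for one product."""
    return sum(values)

# Made-up sales records, e.g. purchases in the days before a hurricane.
lines = ["torch,3", "water,10", "torch,2", "water,5", "batteries,7"]

totals = reduce_phase(shuffle(map_phase(lines, sales_mapper)), sum_reducer)
print(totals)
```

Because the mapper looks at one record at a time and the reducer looks at one key at a time, both phases can be split across machines without changing the result, which is what makes the model suitable for very large data sets.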

Hadoop addresses challenges created by Big Data, namely:

  • Velocity: Lots of data arriving at high speed
  • Volume: Lots of data being gathered (and the volume keeps growing)
  • Variety: Data comes from varied sources (audio, video, log files) and is often unstructured

We will have more sessions on Hadoop and MapReduce to explain them in detail.

