What is Big Data?
Big data is the term for a collection of data sets so large and complex that it becomes difficult to process them using on-hand database management tools or traditional data processing applications.
- Large-scale data sets
- Complex data sets
How big is Big Data?
Any data that can challenge our current technology in some manner can be considered Big Data:
- Volume
- Speed of generation
Big Data Characteristics (3Vs)
- High volume
- High velocity
- High variety
What is Hadoop?
At Google, MapReduce operations are run on a special file system called the Google File System (GFS) that is highly optimized for this purpose. Doug Cutting and others at Yahoo! reverse-engineered GFS and called their version the Hadoop Distributed File System (HDFS). The software framework that supports HDFS, MapReduce, and other related components is called the Hadoop project, or simply Hadoop. It is open source and distributed by Apache.
Implementation of Big Data
- MapReduce
- Parallel DBMS technologies
What is MapReduce?
MapReduce is a programming model Google has used successfully in processing its "big-data" sets. The programmer supplies:
- A map function
- A reduce function
The framework then automatically parallelizes the computation and handles machine failures.
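The model above can be sketched in a few lines of plain Python. This is an illustrative single-machine word-count example, not Hadoop's actual Java API; the function names (map_fn, reduce_fn, run_mapreduce) are made up for this sketch, and a real framework would additionally partition the input and distribute these calls across machines.

```python
from collections import defaultdict

def map_fn(document):
    # Map: emit a (word, 1) pair for every word in the input document.
    for word in document.split():
        yield (word, 1)

def reduce_fn(word, counts):
    # Reduce: sum all the counts emitted for the same word.
    return (word, sum(counts))

def run_mapreduce(documents):
    # Shuffle: group intermediate values by key, then reduce each group.
    # In a real cluster this grouping happens across machines.
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(run_mapreduce(["big data is big", "data is data"]))
# → {'big': 2, 'data': 3, 'is': 2}
```

Because the map calls are independent of each other (and likewise the reduce calls, once the shuffle has grouped the keys), the framework is free to run them in parallel on many machines, which is exactly where the automatic parallelization comes from.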
MapReduce Advantages
- Automatic parallelization
- Run-time handling of:
  - Data partitioning
  - Task scheduling
  - Machine failures
  - Inter-machine communication
- Completely transparent to the programmer/analyst/user