What is Big Data: Big data comes into the picture when the data has the following characteristics:

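    1. Volume: Huge amounts of data, more than a single machine can store or process

    2. Velocity: Data arrives at high speed and has to be ingested and processed quickly (e.g. Big Billion sale)

    3. Variety: Different types of data (structured / unstructured / images / MP4 video)

    4. Veracity: The data can be noisy or unreliable, so its quality and trustworthiness have to be ensured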


Components of Hadoop: 

HDFS (distributed storage) + MapReduce (batch processing) + YARN (resource management)
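
To give a rough feel for the MapReduce part, here is a minimal sketch of the map / shuffle / reduce idea as a word count in plain Python. It runs on one machine and does not use Hadoop at all; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample lines are made up for illustration. In real Hadoop the same three steps are spread across many nodes, with the input read from HDFS.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines):
    # Run the three phases in sequence on a single machine.
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical sample input; a real job would read its lines from HDFS.
sample_lines = ["big data big sale", "data is big"]
counts = word_count(sample_lines)
print(counts)
```

Each phase depends only on its own input, which is what allows a framework like Hadoop to run many map and reduce tasks in parallel.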


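Hive: Data warehouse tool to query data stored in HDFS using SQL-like queries (HiveQL); it can also query HBase tables
Sqoop: Ingestion of data, mainly bulk transfer between relational databases and Hadoop
Spark: Distributed processing engine, commonly used for cleaning and transforming data
Pig: Previously used for cleaning of data, with scripts written in Pig Latin
HBase: NoSQL column-oriented database that runs on top of HDFS, for fast random reads and writes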
