Hadoop

Case Study of Hadoop

Hadoop is an open-source, Java-based software framework, sponsored by the Apache Software Foundation, for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware.

It provides storage for big data at a reasonable cost. Hadoop processes big data in the same place it is stored: the storage cluster doubles as the compute cluster.

Hadoop Architecture and Components:

Apache Hadoop consists of two major parts:
  1. Hadoop Distributed File System (HDFS)
  2. MapReduce
1. Hadoop Distributed File System:

HDFS is the file system, or storage layer, of Hadoop. It is designed to store and handle very large amounts of data.

When a file is too large for a single machine, it must be partitioned across several machines. File systems that manage storage across a network of machines are called distributed file systems.

An HDFS cluster has two types of nodes operating in a master-worker pattern: one Name Node (the master) and a number of Data Nodes (the workers).

Hadoop keeps data safe by duplicating it across nodes, so the failure of a single node does not cause data loss.
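The partitioning and duplication described above can be sketched in a few lines of Python. This is a toy model, not HDFS code: the function names, the 4-byte block size, and the round-robin placement are assumptions made for illustration (real HDFS uses far larger blocks and a rack-aware placement policy).

```python
# Toy model of HDFS-style block splitting and replication.
# All names and sizes here are illustrative assumptions, not HDFS APIs.

BLOCK_SIZE = 4    # bytes; deliberately tiny so the example is easy to follow
REPLICATION = 3   # number of copies kept of each block


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, data_nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(data_nodes):
        raise ValueError("need at least as many data nodes as replicas")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            data_nodes[(block_id + r) % len(data_nodes)]
            for r in range(replication)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the ids of blocks that still have at least one live replica."""
    return {
        block_id
        for block_id, nodes in placement.items()
        if any(node not in failed_nodes for node in nodes)
    }
```

For example, calling `place_replicas(4, ["dn1", "dn2", "dn3", "dn4"])` and then `readable_blocks(...)` with two failed nodes shows that every block remains readable, because each block has three copies on three different nodes.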

2. MapReduce:

MapReduce is a programming framework. It organizes multiple computers in a cluster to perform a calculation in parallel. It takes care of distributing the work among the computers and combining the results.
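A minimal single-process sketch of the idea, using the classic word-count example, is shown here. It only imitates the map, shuffle, and reduce steps in plain Python; it does not use the Hadoop API, and the function names are hypothetical. In a real cluster each input split would be mapped on a different machine.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: combine all values for one key into a single result."""
    return word, sum(counts)


def word_count(splits):
    """Run map over every split, then shuffle, then reduce each group."""
    mapped = [
        pair
        for split in splits      # each split stands in for one worker's input
        for line in split
        for pair in map_phase(line)
    ]
    grouped = shuffle(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())
```

Calling `word_count([["big data", "big cluster"], ["data data"]])` treats the two inner lists as the inputs of two workers and merges their partial counts into one dictionary.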

Hadoop works in a master-worker (master-slave) fashion:

1. Master:
The master contains the Name Node and Job Tracker components.
  1. Name Node: It holds information about all the other nodes in the Hadoop cluster, the files in the cluster, the blocks of those files, their locations, etc.
  2. Job Tracker: It keeps track of the individual tasks assigned to each node and coordinates the exchange of information and results.
2. Worker:
Each worker contains the Task Tracker and Data Node components.
  1. Task Tracker: It is responsible for running the tasks assigned to it.
  2. Data Node: It is responsible for holding the data.
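The division of labour between the master and the workers can be illustrated with a small sketch. The classes below are hypothetical stand-ins, not Hadoop classes, and the scheduling rule (give each task to the least-loaded live worker that already holds the block) is a simplifying assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NameNode:
    """Master-side metadata: which blocks make up a file and where they live."""
    # file name -> list of (block_id, [workers holding a copy of the block])
    block_map: dict = field(default_factory=dict)

    def add_file(self, name, blocks):
        self.block_map[name] = blocks

    def locate(self, name):
        return self.block_map[name]


@dataclass
class JobTracker:
    """Master-side coordinator: assigns one task per block and tracks its state."""
    name_node: NameNode
    status: dict = field(default_factory=dict)  # block_id -> (worker, state)

    def schedule(self, file_name, live_workers):
        """Assign each block's task to a live worker that stores that block."""
        load = {worker: 0 for worker in live_workers}
        assignments = {}
        for block_id, holders in self.name_node.locate(file_name):
            candidates = [w for w in holders if w in live_workers]
            if not candidates:
                raise RuntimeError(f"no live worker holds block {block_id}")
            worker = min(candidates, key=load.get)  # least-loaded holder
            load[worker] += 1
            assignments[block_id] = worker
            self.status[block_id] = (worker, "assigned")
        return assignments

    def report_done(self, block_id):
        """Called when a worker's task tracker finishes the task for a block."""
        worker, _ = self.status[block_id]
        self.status[block_id] = (worker, "done")
```

In this sketch the Name Node only answers "where is the data?", while the Job Tracker decides "who runs what?" and records progress; the workers themselves are represented only by their names.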
Other components associated with the Hadoop architecture include Chukwa, Hive, HBase, Mahout, etc.

Characteristics of Hadoop:
  1. Hadoop provides reliable shared storage (HDFS) and an analysis system (MapReduce).
  2. Hadoop is highly scalable. A cluster can contain thousands of servers.
  3. Hadoop works on the principle of write once, read many times.
  4. Hadoop is highly flexible: it can process both structured and unstructured data.
