MapReduce Programming Model
  MapReduce Fundamentals
    Programming Paradigm
      Divide and Conquer Approach
      Functional Programming Concepts
    Core MapReduce Functions
      Map Function
        Input Processing
        Key-Value Pair Generation
        Parallel Execution
      Reduce Function
        Data Aggregation
        Result Summarization
        Sequential Processing
    Key-Value Pair Concept
      Data Representation
      Serialization Requirements
      Custom Data Types
  MapReduce Data Flow
    Input Phase
      InputFormat Role
      InputSplit Creation
      Record Reading
    Map Phase
      Mapper Execution
      Intermediate Output Generation
      Local Storage
    Shuffle and Sort Phase
      Data Partitioning
      Key-Based Sorting
      Data Transfer
      Merge Operations
    Reduce Phase
      Reducer Execution
      Final Output Generation
      Result Writing
    Output Phase
      OutputFormat Role
      Result Storage
  MapReduce Job Components
    Driver Program
      Job Configuration
      Input/Output Specification
      Job Submission
    Mapper Implementation
      Map Method Override
      Setup and Cleanup
      Context Usage
    Reducer Implementation
      Reduce Method Override
      Setup and Cleanup
      Context Usage
    Combiner Function
      Local Aggregation
      Network Traffic Reduction
      Implementation Considerations
  MapReduce Programming
    Word Count Example
      Problem Definition
      Mapper Logic
      Reducer Logic
      Driver Configuration
    Job Packaging and Execution
      Code Compilation
      JAR File Creation
      Job Submission
      Monitoring Execution
  MapReduce Limitations
    Batch Processing Nature
    High Latency
    Programming Complexity
    Iterative Processing Challenges
    Real-Time Processing Limitations
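
The Word Count Example entry above (Mapper Logic, Reducer Logic) and the Shuffle and Sort Phase can be illustrated with a small in-memory sketch. This is not a Hadoop job: it uses only the JDK, runs in a single process, and the class and method names (`WordCountSketch`, `map`, `shuffle`, `reduce`, `run`) are hypothetical names chosen for illustration. A real job would instead extend Hadoop's `Mapper` and `Reducer` classes and be configured by a driver program, as the outline lists.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of the map -> shuffle/sort -> reduce flow.
// Assumption: all names here are illustrative; no Hadoop classes are used.
public class WordCountSketch {

    // Map step: emit a (word, 1) pair for every token in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                out.add(new SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    // Shuffle and sort step: group values by key. TreeMap keeps keys sorted,
    // standing in for the key-based sorting done between map and reduce.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: aggregate all values seen for one key into a single count.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Stand-in for the driver: wires the three steps together sequentially.
    static TreeMap<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        for (String line : lines) {
            intermediate.addAll(map(line));
        }
        TreeMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(intermediate).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(run(List.of("the quick brown fox", "the lazy dog", "The fox")));
    }
}
```

In this sketch each call to `map` depends only on its own line, which is the property that lets a cluster run mappers in parallel, while `reduce` sees every value for a key at once. Summing is associative and commutative, so the same `reduce` logic could in principle also serve as a combiner for local aggregation.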