Hadoop Distributed File System (HDFS)
HDFS Architecture
Master-Slave Design
Centralized Metadata Management
Distributed Data Storage
NameNode Components
Namespace Management
Block Location Tracking
Metadata Operations
FsImage and EditLog
Checkpoint Process
DataNode Components
Block Storage Management
Data Serving Operations
Heartbeat Mechanism
Block Report Generation
Secondary NameNode
Checkpoint Creation
Metadata Backup
Operational Differences from NameNode
HDFS Core Concepts
Block Management
Default Block Size
Block Abstraction Benefits
Block Placement Strategy
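
To make the block abstraction concrete, the arithmetic can be sketched as follows. This is only an illustration: it assumes the 128 MB default block size of Hadoop 2 and later (`dfs.blocksize`) and a replication factor of 3 (`dfs.replication`), and the helper names `split_into_blocks` and `raw_storage_bytes` are hypothetical, not part of any Hadoop API.

```python
from typing import List

MB = 1024 * 1024
BLOCK_SIZE = 128 * MB  # assumed default dfs.blocksize (Hadoop 2 and later)
REPLICATION = 3        # assumed default dfs.replication


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the size in bytes of each block a file of file_size would occupy."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only takes as much space as the data it holds.
        sizes.append(remainder)
    return sizes


def raw_storage_bytes(file_size: int, replication: int = REPLICATION) -> int:
    """Total bytes stored across the cluster once every block is replicated."""
    return file_size * replication


blocks = split_into_blocks(300 * MB)
print([size // MB for size in blocks])      # block sizes in MB
print(raw_storage_bytes(300 * MB) // MB)    # raw storage in MB
```

Under these assumptions a 300 MB file occupies two full blocks plus one 44 MB block, and the final block does not consume a full 128 MB on disk.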
Data Replication
Replication Factor Configuration
Rack Awareness
Network Topology Considerations
Replica Placement Policy
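
The default replica placement policy for a replication factor of 3 can be sketched roughly as below: the first replica goes to the writer's node, the second to a node on a different rack, and the third to a different node on that same remote rack. This is a simplified model, not the NameNode's implementation. It assumes the writer is itself a DataNode, ignores node load and free space, and the `topology` mapping and function names are hypothetical.

```python
import random
from typing import Dict, List, Optional


def rack_of(topology: Dict[str, List[str]], node: str) -> str:
    """Return the rack that contains the given node."""
    for rack, nodes in topology.items():
        if node in nodes:
            return rack
    raise KeyError(node)


def place_replicas(
    topology: Dict[str, List[str]],
    writer: str,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick three DataNodes for one block, following the rack-aware policy."""
    rng = rng or random.Random()
    local_rack = rack_of(topology, writer)
    # Remote racks that can host two replicas on distinct nodes.
    candidates = [
        rack for rack, nodes in topology.items()
        if rack != local_rack and len(nodes) >= 2
    ]
    if not candidates:
        raise ValueError("need a remote rack with at least two nodes")
    remote_rack = rng.choice(candidates)
    second, third = rng.sample(topology[remote_rack], 2)
    return [writer, second, third]


# Hypothetical three-rack cluster.
topology = {
    "/rack1": ["dn1", "dn2"],
    "/rack2": ["dn3", "dn4"],
    "/rack3": ["dn5", "dn6"],
}
print(place_replicas(topology, "dn1"))
```

Placing two replicas on one remote rack keeps cross-rack write traffic to a single transfer while still surviving the loss of an entire rack.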
Fault Tolerance Features
DataNode Failure Detection
Block Recovery Process
NameNode Failure Scenarios
Data Integrity Mechanisms
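
DataNode failure detection relies on missed heartbeats. As a rough sketch, the NameNode treats a DataNode as dead once no heartbeat has arrived for twice the recheck interval plus ten heartbeat intervals. The values below assume the stock defaults (`dfs.heartbeat.interval` of 3 seconds and `dfs.namenode.heartbeat.recheck-interval` of 5 minutes); the function names are hypothetical.

```python
from typing import Optional

HEARTBEAT_INTERVAL_S = 3     # assumed default dfs.heartbeat.interval
RECHECK_INTERVAL_S = 5 * 60  # assumed default dfs.namenode.heartbeat.recheck-interval


def dead_node_timeout_s(
    heartbeat_interval_s: int = HEARTBEAT_INTERVAL_S,
    recheck_interval_s: int = RECHECK_INTERVAL_S,
) -> int:
    """Seconds of silence after which a DataNode is considered dead."""
    return 2 * recheck_interval_s + 10 * heartbeat_interval_s


def is_dead(last_heartbeat_s: float, now_s: float,
            timeout_s: Optional[int] = None) -> bool:
    """True if the node has been silent for longer than the timeout."""
    if timeout_s is None:
        timeout_s = dead_node_timeout_s()
    return now_s - last_heartbeat_s > timeout_s


print(dead_node_timeout_s())  # timeout in seconds under the assumed defaults
```

With those defaults the timeout works out to 630 seconds (10 minutes 30 seconds), after which the blocks held by the silent node are scheduled for re-replication from their surviving replicas.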
Data Integrity
Checksum Calculation
Corruption Detection
Recovery Procedures
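
Checksum calculation and corruption detection can be illustrated with a small sketch. HDFS checksums data in fixed-size chunks (512 bytes by default, `dfs.bytes-per-checksum`) and uses CRC32C by default; the example below substitutes the CRC-32 from Python's `zlib` because CRC32C is not in the standard library, so the checksum values will not match those HDFS stores. The function names are hypothetical.

```python
import zlib
from typing import List

BYTES_PER_CHECKSUM = 512  # assumed default dfs.bytes-per-checksum


def chunk_checksums(data: bytes, chunk_size: int = BYTES_PER_CHECKSUM) -> List[int]:
    """Compute one CRC-32 per chunk (a stand-in for HDFS's CRC32C)."""
    return [
        zlib.crc32(data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]


def find_corrupt_chunks(
    data: bytes, expected: List[int], chunk_size: int = BYTES_PER_CHECKSUM
) -> List[int]:
    """Return the indices of chunks whose checksum no longer matches."""
    actual = chunk_checksums(data, chunk_size)
    if len(actual) != len(expected):
        raise ValueError("data length does not match the stored checksums")
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


original = bytes(range(256)) * 6           # 1536 bytes, i.e. three chunks
stored = chunk_checksums(original)         # checksums recorded at write time

damaged = bytearray(original)
damaged[600] ^= 0xFF                       # flip bits in one byte of the second chunk
print(find_corrupt_chunks(bytes(damaged), stored))
```

In HDFS itself, a client that detects a mismatch on read reports the bad replica to the NameNode and reads the block from another replica, and the NameNode then arranges a fresh copy from a healthy one.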
HDFS Operations
Command Line Interface
Basic File Operations
Directory Management
Permission Management
Administrative Commands
Common HDFS Commands
File Listing (ls)
Directory Creation (mkdir)
File Upload (put)
File Download (get)
File Viewing (cat)
File Deletion (rm)
Disk Usage (du)
Administrative Operations (dfsadmin)
HDFS Java API
FileSystem Class Usage
File Reading Operations
File Writing Operations
Exception Handling
Advanced HDFS Features
HDFS Federation
Multiple Namespace Support
Namespace Isolation
Scalability Benefits
HDFS High Availability
Active-Standby Configuration
Automatic Failover
Manual Failover
Quorum Journal Manager
Shared EditLog Management
ZooKeeper Integration