Hadoop Administration
Cluster Planning
  Hardware Requirements
    CPU Specifications
    Memory Requirements
    Storage Considerations
    Network Infrastructure
  Capacity Planning
    Data Growth Estimation
    Workload Analysis
    Performance Requirements
  Network Design
    Topology Considerations
    Bandwidth Requirements
    Rack Configuration
Installation and Configuration
  Installation Modes
    Standalone Mode
    Pseudo-Distributed Mode
    Fully-Distributed Mode
  Configuration Files
    core-site.xml
    hdfs-site.xml
    yarn-site.xml
    mapred-site.xml
  Environment Setup
    Java Installation
    User Account Configuration
    SSH Key Setup
    Environment Variables
Cluster Operations
  Service Management
    Starting Services
    Stopping Services
    Service Dependencies
  Node Management
    Adding Nodes
    Decommissioning Nodes
    Node Health Monitoring
  Data Management
    HDFS Balancer
    Data Rebalancing
    Quota Management
Monitoring and Maintenance
  Web Interface Monitoring
    NameNode Web UI
    ResourceManager Web UI
    DataNode Monitoring
    NodeManager Monitoring
  Log Management
    Log File Locations
    Log Analysis Techniques
    Error Pattern Recognition
  Performance Tuning
    JVM Configuration
    Memory Tuning
    Garbage Collection Optimization
    Network Optimization
Security Implementation
  Authentication
    Kerberos Setup
    Principal Configuration
    Keytab Management
  Authorization
    HDFS Permissions
    Access Control Lists
    Service-Level Authorization
  Data Protection
    Encryption at Rest
    Encryption in Transit
    Key Management