Cloud Data Management and Analysis
Cloud Data Storage Solutions
Object Storage
Characteristics and Use Cases
Unstructured Data Storage
Scalability and Durability
REST API Access
Metadata Management
Storage Tiers and Lifecycle Policies
Hot, Cool, and Archive Tiers
Automated Data Movement
Cost Optimization Strategies
Retrieval Time Considerations
Key Services
Amazon S3 (Simple Storage Service)
Buckets and Objects
Versioning and Lifecycle Rules
Cross-Region Replication
Event Notifications
Transfer Acceleration
Azure Blob Storage
Containers and Blobs
Access Tiers
Blob Types and Use Cases
Change Feed
Google Cloud Storage (GCS)
Buckets and Classes
Object Lifecycle Management
Uniform Bucket-Level Access
Requester Pays
Block Storage
Characteristics and Use Cases
Persistent Disks for VMs
High-Performance Workloads
File System Creation
Database Storage
Attaching to Virtual Machines
Mounting and Unmounting Volumes
Snapshots and Backups
Encryption Options
Performance Tuning
Key Services
Amazon EBS (Elastic Block Store)
Volume Types
IOPS and Throughput
Multi-Attach Capability
Encryption Features
Azure Disk Storage
Managed and Unmanaged Disks
Disk Types and Performance
Shared Disks
Disk Encryption
GCP Persistent Disk
Standard and SSD Options
Regional Persistent Disks
Snapshot Scheduling
Disk Resizing
File Storage
Characteristics and Use Cases
Shared File Systems
NFS and SMB Protocols
Concurrent Access
POSIX Compliance
Key Services
Amazon EFS (Elastic File System)
Scalability and Performance
Performance Modes
Throughput Modes
Access Points
Azure Files
Integration with Windows and Linux
File Sync Service
Backup and Restore
Premium Performance Tier
GCP Filestore
Performance Tiers
NFS Protocol Support
Backup and Restore
Network Configuration
Cloud Databases
Relational Databases (Managed SQL)
Amazon RDS
Supported Engines
High Availability and Backups
Read Replicas
Parameter Groups
Performance Insights
Azure SQL Database
Elastic Pools
Geo-Replication
Automatic Tuning
Threat Detection
Backup and Restore
Google Cloud SQL
Supported Engines
Automated Backups
High Availability Configuration
Connection Security
NoSQL Databases
Key-Value Stores
Amazon DynamoDB
Global Tables
On-Demand Scaling
DynamoDB Streams
Global Secondary Indexes
Azure Cosmos DB (Core API)
Global Distribution
Consistency Models
Partition Key Design
Request Units
Document Stores
Amazon DocumentDB
MongoDB Compatibility
Cluster Architecture
Backup and Restore
Azure Cosmos DB (MongoDB API)
API Support
Indexing Policies
Change Feed
Wide-Column Stores
Amazon Keyspaces
Apache Cassandra Compatibility
Serverless Scaling
Point-in-Time Recovery
Google Cloud Bigtable
High Throughput Use Cases
Row Key Design
Column Family Structure
Replication Configuration
In-Memory Databases
Amazon ElastiCache
Redis and Memcached Support
Cluster Mode
Backup and Restore
Security Groups
Azure Cache for Redis
Caching Patterns
Clustering
Data Persistence
Geo-Replication
Data Warehousing in the Cloud
Core Concepts of Data Warehousing
OLAP vs. OLTP
Schema Design (Star, Snowflake)
Dimensional Modeling
Fact and Dimension Tables
Slowly Changing Dimensions
MPP (Massively Parallel Processing) Architecture
Distributed Query Execution
Scalability Considerations
Node Types and Roles
Data Distribution Strategies
Key Services
Amazon Redshift
Spectrum for Data Lake Integration
Cluster Architecture
Workload Management
Concurrency Scaling
Materialized Views
Azure Synapse Analytics
Dedicated and Serverless Pools
Data Integration Pipelines
Apache Spark Integration
Security and Compliance
Google BigQuery
Serverless Architecture
Columnar Storage
Nested and Repeated Fields
Partitioning and Clustering
Slot Management
Data Lakes
Core Concepts of a Data Lake
Schema-on-Read
Raw and Processed Zones
Data Lake Architecture Patterns
Metadata Management
Data Lake vs. Data Warehouse
Flexibility vs. Structure
Use Cases Comparison
Cost Considerations
Performance Trade-offs
Building a Data Lake on Object Storage
Data Ingestion and Organization
Metadata Management
Data Cataloging
Access Control and Security
Data Quality Management
Lakehouse Architecture
Combining Data Lake and Warehouse Features
Delta Lake and Open Table Formats
ACID Transactions
Time Travel and Versioning
Unified Analytics Platform