Cloud Data Management and Analysis

  1. Cloud Data Storage Solutions
    1. Object Storage
      1. Characteristics and Use Cases
        1. Unstructured Data Storage
        2. Scalability and Durability
        3. REST API Access
        4. Metadata Management
      2. Storage Tiers and Lifecycle Policies
        1. Hot, Cool, and Archive Tiers
        2. Automated Data Movement
        3. Cost Optimization Strategies
        4. Retrieval Time Considerations
      3. Key Services
        1. Amazon S3 (Simple Storage Service)
          1. Buckets and Objects
          2. Versioning and Lifecycle Rules
          3. Cross-Region Replication
          4. Event Notifications
          5. Transfer Acceleration
        2. Azure Blob Storage
          1. Containers and Blobs
          2. Access Tiers
          3. Blob Types and Use Cases
          4. Change Feed
        3. Google Cloud Storage (GCS)
          1. Buckets and Classes
          2. Object Lifecycle Management
          3. Uniform Bucket-Level Access
          4. Requester Pays
    2. Block Storage
      1. Characteristics and Use Cases
        1. Persistent Disks for VMs
        2. High-Performance Workloads
        3. File System Creation
        4. Database Storage
      2. Attaching to Virtual Machines
        1. Mounting and Unmounting Volumes
        2. Snapshots and Backups
        3. Encryption Options
        4. Performance Tuning
      3. Key Services
        1. Amazon EBS (Elastic Block Store)
          1. Volume Types
          2. IOPS and Throughput
          3. Multi-Attach Capability
          4. Encryption Features
        2. Azure Disk Storage
          1. Managed and Unmanaged Disks
          2. Disk Types and Performance
          3. Shared Disks
          4. Disk Encryption
        3. GCP Persistent Disk
          1. Standard and SSD Options
          2. Regional Persistent Disks
          3. Snapshot Scheduling
          4. Disk Resizing
    3. File Storage
      1. Characteristics and Use Cases
        1. Shared File Systems
        2. NFS and SMB Protocols
        3. Concurrent Access
        4. POSIX Compliance
      2. Key Services
        1. Amazon EFS (Elastic File System)
          1. Scalability and Performance
          2. Performance Modes
          3. Throughput Modes
          4. Access Points
        2. Azure Files
          1. Integration with Windows and Linux
          2. File Sync Service
          3. Backup and Restore
          4. Premium Performance Tier
        3. GCP Filestore
          1. Performance Tiers
          2. NFS Protocol Support
          3. Backup and Restore
          4. Network Configuration
  2. Cloud Databases
    1. Relational Databases (Managed SQL)
      1. Amazon RDS
        1. Supported Engines
        2. High Availability and Backups
        3. Read Replicas
        4. Parameter Groups
        5. Performance Insights
      2. Azure SQL Database
        1. Elastic Pools
        2. Geo-Replication
        3. Automatic Tuning
        4. Threat Detection
        5. Backup and Restore
      3. Google Cloud SQL
        1. Supported Engines
        2. Automated Backups
        3. High Availability Configuration
        4. Connection Security
    2. NoSQL Databases
      1. Key-Value Stores
        1. Amazon DynamoDB
          1. Global Tables
          2. On-Demand Scaling
          3. DynamoDB Streams
          4. Global Secondary Indexes
        2. Azure Cosmos DB (Core API)
          1. Global Distribution
          2. Consistency Models
          3. Partition Key Design
          4. Request Units
      2. Document Stores
        1. Amazon DocumentDB
          1. MongoDB Compatibility
          2. Cluster Architecture
          3. Backup and Restore
        2. Azure Cosmos DB (MongoDB API)
          1. API Support
          2. Indexing Policies
          3. Change Feed
      3. Wide-Column Stores
        1. Amazon Keyspaces
          1. Apache Cassandra Compatibility
          2. Serverless Scaling
          3. Point-in-Time Recovery
        2. Google Cloud Bigtable
          1. High Throughput Use Cases
          2. Row Key Design
          3. Column Family Structure
          4. Replication Configuration
    3. In-Memory Databases
      1. Amazon ElastiCache
        1. Redis and Memcached Support
        2. Cluster Mode
        3. Backup and Restore
        4. Security Groups
      2. Azure Cache for Redis
        1. Caching Patterns
        2. Clustering
        3. Data Persistence
        4. Geo-Replication
  3. Data Warehousing in the Cloud
    1. Core Concepts of Data Warehousing
      1. OLAP vs. OLTP
      2. Schema Design (Star, Snowflake)
      3. Dimensional Modeling
      4. Fact and Dimension Tables
      5. Slowly Changing Dimensions
    2. MPP (Massively Parallel Processing) Architecture
      1. Distributed Query Execution
      2. Scalability Considerations
      3. Node Types and Roles
      4. Data Distribution Strategies
    3. Key Services
      1. Amazon Redshift
        1. Spectrum for Data Lake Integration
        2. Cluster Architecture
        3. Workload Management
        4. Concurrency Scaling
        5. Materialized Views
      2. Azure Synapse Analytics
        1. Dedicated and Serverless Pools
        2. Data Integration Pipelines
        3. Apache Spark Integration
        4. Security and Compliance
      3. Google BigQuery
        1. Serverless Architecture
        2. Columnar Storage
        3. Nested and Repeated Fields
        4. Partitioning and Clustering
        5. Slot Management
  4. Data Lakes
    1. Core Concepts of a Data Lake
      1. Schema-on-Read
      2. Raw and Processed Zones
      3. Data Lake Architecture Patterns
      4. Metadata Management
    2. Data Lake vs. Data Warehouse
      1. Flexibility vs. Structure
      2. Use Cases Comparison
      3. Cost Considerations
      4. Performance Trade-offs
    3. Building a Data Lake on Object Storage
      1. Data Ingestion and Organization
      2. Metadata Management
      3. Data Cataloging
      4. Access Control and Security
      5. Data Quality Management
    4. Lakehouse Architecture
      1. Combining Data Lake and Warehouse Features
      2. Delta Lake and Open Table Formats
      3. ACID Transactions
      4. Time Travel and Versioning
      5. Unified Analytics Platform