• Selecting the storage service
    • Simple Storage Service (S3): One of the first storage services Amazon offered. It is object-level storage.
    • Glacier: Used to archive data that you can wait to retrieve; a standard retrieval takes 3 to 5 hours on average.
    • CloudFront: A content delivery network that caches data, especially web content, close to the user’s location for faster access.
    • Elastic Block Store (EBS): Used with EC2 instances for fast access, because it stores data at the block level rather than the object level like S3.
    • Storage Gateway: A service that connects on-premises software applications with cloud-based storage.
    • Snow Family: A collection of three physical storage devices, all used to migrate large amounts of data from on-premises to AWS.
    • Database: Also a kind of storage; AWS provides many kinds of databases.
    • Block Storage
      • Elastic Block Store (EBS): Block storage works like the hard drives that back a virtual machine. On-premises, the comparable setup is a Storage Area Network (SAN) built with iSCSI or Fibre Channel.
        • Used for persistent (durable) storage with EC2 instances.
        • Data on an EBS volume is immediately consistent because the volume behaves like a local hard drive.
        • Create EBS volumes in the same Availability Zone as the EC2 instance they attach to.
        • EBS volume types
          • Magnetic (HDD): the same as using spinning-disk drives in a local machine. Lower performance, but lower cost.
            • Throughput Optimized HDD = low-cost design for frequently accessed, throughput-intensive workloads
            • Cold HDD = low-cost design for less frequently accessed workloads
          • SSD (solid-state drive): costs more than magnetic
            • General Purpose SSD (default) = balances price and performance for a wide variety of workloads
            • Provisioned IOPS SSD = for mission-critical, low-latency or high-throughput workloads
            • Use EBS-optimized instances; otherwise you pay for volume performance that the instance cannot fully use.
    • File-based Storage
      • Simple Storage Service (S3): S3 is object storage rather than a true file system, but it can fill a role similar to the Network Attached Storage (NAS) used on-premises.
        • The bucket name must be globally unique across all of S3, not just in your account/organization
        • By default an account can create only a limited number of buckets (100 for many years; the quota can be raised), but storage in a bucket is unlimited.
        • Every bucket lives in a Region, so place it in the Region closest to your location.
        • An S3 bucket may look like a hierarchy of folders, but the folder path is really part of the key name.
        • Every object in an S3 bucket gets a URL, so you can access it through the Internet if its permissions allow.
        • S3 was originally eventually consistent: a change could take milliseconds to minutes to synchronize across Availability Zones. Since December 2020, S3 provides strong read-after-write consistency.
        • Works well for hosting a static website
      • S3 features
        • S3 uses prefixes and delimiters. Keys may look like folder paths, but S3 has no real folders. A prefix is just a string of letters or numbers, and a delimiter separates one prefix from the next before the final object name.
          • E.g. “Sales/Plans/[Object key name]”
        • Storage classes: different ways to store data. You can transition objects between classes after a set number of days to cut storage costs. These transitions can be driven by a policy.
        • Types of S3 storage:
          • S3 Standard = $$$$ (most expensive)
          • S3 Infrequent Access (IA) Storage = $$$
          • S3 Reduced Redundancy Storage (RRS) = $$ (a legacy class; AWS now recommends S3 Standard instead)
          • Glacier = $
        • Encryption
          • Server-side encryption = the file is encrypted when it is stored on the AWS server and automatically decrypted when it is accessed. Users don’t need to maintain keys.
          • Client-side encryption = users encrypt data before storing it on the server and decrypt it after downloading it to their computer. Users need to maintain keys.
        • Multi-Factor Authentication (MFA) Delete = users cannot delete objects just by logging in to the AWS account; they need an additional form of authentication.
        • Logging = you can log objects being added, deleted, or changed in a bucket.
        • Event notification = you can be notified of activities performed against S3 objects, for example by email.
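The prefix and delimiter idea from the S3 features above can be sketched without touching AWS at all. The snippet below is only an illustration in plain Python: `list_level` and the sample key names are hypothetical, and it mimics how a flat set of keys can be shown as one "folder" level at a time, not how S3 is implemented.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Mimic an S3-style listing of one 'folder' level over a flat key list."""
    objects = []          # keys that sit directly under the prefix
    common_prefixes = []  # rolled-up 'subfolders' under the prefix
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to and including the next delimiter is a common prefix.
            sub = prefix + rest.split(delimiter, 1)[0] + delimiter
            if sub not in common_prefixes:
                common_prefixes.append(sub)
        else:
            objects.append(key)
    return objects, common_prefixes


# Hypothetical keys; the "folders" exist only as part of each key name.
keys = [
    "Sales/Plans/q1.pdf",
    "Sales/Plans/q2.pdf",
    "Sales/summary.txt",
    "readme.txt",
]
top_objects, top_prefixes = list_level(keys)
sales_objects, sales_prefixes = list_level(keys, prefix="Sales/")
print(top_objects, top_prefixes)
print(sales_objects, sales_prefixes)
```

Listing with the prefix "Sales/" returns the object directly under it plus "Sales/Plans/" as a grouped prefix, which is why a bucket can look like a folder tree even though every key is a flat string.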
      • Elastic File System (EFS): it is like NAS within the cloud, for the cloud
        • It can be shared among different instances
        • The file system has a hierarchical structure
        • Can be accessed through the NFSv4 protocol
        • Can be accessed by many instances at once (unlike an EBS volume, which is normally bound to one instance)
        • Only supported with Linux instances, not Windows.
      • FSx: FSx for Windows File Server eliminates the need to build a full Windows Server and join it to Active Directory (AD) yourself
        • Unlike EFS, which uses NFS, it is accessed over the Server Message Block (SMB) protocol
        • It runs on Windows Server
        • Amazon FSx for Lustre is built on the Lustre high-performance file system
    • Glacier Storage Overview
      • It is used for archival data storage
      • It is one of the S3 storage classes, so it integrates with S3.
      • Storage Gateway can connect to Glacier
      • Costs less per GB/month
      • Three retrieval options
        • Expedited = 1-5 minutes
        • Standard = 3-5 hours
        • Bulk = 5-12 hours
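The policy-based transitions between storage classes described above, ending in Glacier, can be written down as a lifecycle rule. The sketch below only builds the rule as a Python dictionary. The helper name `build_lifecycle_rule`, the prefix, and the 30 and 90 day values are illustrative assumptions, not recommendations. The dictionary is meant to follow the shape that boto3’s `put_bucket_lifecycle_configuration` call takes as its `LifecycleConfiguration` argument; the snippet itself never contacts AWS.

```python
import json


def build_lifecycle_rule(prefix, ia_after_days, glacier_after_days):
    """Build one lifecycle rule: Standard -> Infrequent Access -> Glacier."""
    if not 0 < ia_after_days < glacier_after_days:
        raise ValueError("expected 0 < ia_after_days < glacier_after_days")
    # Hypothetical naming scheme for the rule ID, derived from the prefix.
    rule_id = "archive-" + (prefix.strip("/").replace("/", "-") or "all")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},  # rule applies to keys under this prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }


# Assumed values for illustration only.
config = {"Rules": [build_lifecycle_rule("Sales/Plans/", 30, 90)]}
print(json.dumps(config, indent=2))
```

Keeping the rule as plain data makes it easy to review before it is applied to a bucket.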

Note: There are three key items to keep in mind before choosing a kind of storage. The first is size: determine how much data your organization needs to store. The second is performance, meaning how fast you need to access the data. If you access data frequently, store it in S3 or EBS; if it can wait 3 to 5 hours, store it in Glacier. The third is cost: the easier the data is to access, the more it costs.
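As a rough illustration of the note, the decision can be reduced to a tiny helper. `suggest_storage` and its 5-hour threshold (the upper end of a standard Glacier retrieval) are assumptions made for this sketch; a real choice would also weigh data size and actual pricing.

```python
def suggest_storage(max_wait_hours, needs_block_device=False):
    """Suggest a storage service from how long a retrieval may take."""
    if max_wait_hours < 0:
        raise ValueError("max_wait_hours must be non-negative")
    if needs_block_device:
        # Data used as a disk by an EC2 instance belongs on EBS.
        return "EBS"
    if max_wait_hours >= 5:
        # The data can wait out a standard Glacier retrieval, so take the cheaper tier.
        return "Glacier"
    # Anything needed sooner stays in S3.
    return "S3"


print(suggest_storage(0))
print(suggest_storage(24))
print(suggest_storage(0, needs_block_device=True))
```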

Getting Data into S3

  • API (Application Programming Interface): a developer designs an application, and the application uses the API to talk directly to S3 and place objects in S3 buckets.
  • AWS Direct Connect: it creates a dedicated private network connection from the organization to AWS, rather than a VPN over the public Internet.
  • Storage Gateway: it differs from Direct Connect in that it synchronizes the organization’s on-premises data to S3 buckets. It can also reach the S3 buckets over a VPN connection.
  • Kinesis Data Firehose: it is used to stream large amounts of analytical data into S3 buckets.
  • Transfer Acceleration: it uses CloudFront edge locations. Wherever in the world the user’s data originates, it optimizes the route to your central S3 bucket. It costs extra in exchange for faster uploads over the Internet.
  • Snow Family: it migrates data from on-premises to an AWS location, much like moving files between devices on an external hard drive, but at a much larger scale.
    • Snowball = Petabyte-scale
    • Snowball Edge = 100 TB local storage
      • It differs in that the organization can run compute instances on the device while migrating data to AWS.
    • Snowmobile = Exabyte-scale
      • A large semi-truck comes to your location, like a data center on wheels, and AWS staff come with it. Each truck can store 100 petabytes, so 10 trailers hold an exabyte.
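The arithmetic behind the Snow Family is easy to check. The sketch below uses the 100 PB per Snowmobile figure from the list above and decimal units (1 EB = 1000 PB, 1 TB = 10^12 bytes). The link speed and utilization are hypothetical inputs, included only to show why shipping hardware can beat uploading over a network.

```python
import math

PB_PER_SNOWMOBILE = 100  # capacity per truck, as stated in the list above


def snowmobiles_needed(total_pb):
    """Number of 100 PB trailers needed to carry total_pb petabytes."""
    return math.ceil(total_pb / PB_PER_SNOWMOBILE)


def network_transfer_days(total_tb, link_gbps, utilization=0.8):
    """Days needed to upload total_tb terabytes over a link of link_gbps."""
    bits = total_tb * 1e12 * 8                    # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 86400                        # seconds -> days


print(snowmobiles_needed(1000))  # one exabyte expressed in petabytes
print(round(network_transfer_days(100, 1, utilization=1.0), 2))
```

Under these assumptions, even 100 TB over a fully used 1 Gbps link takes more than nine days, which is the kind of gap the Snow devices are meant to close.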

