- (Topic 3)
A company is building a data analysis platform on AWS by using AWS Lake Formation. The platform will ingest data from different sources such as Amazon S3 and Amazon RDS. The company needs a secure solution to prevent access to portions of the data that contain sensitive information.
Correct Answer:
B
This option is the most efficient because it uses data filters, which are specifications that restrict access to certain data in query results returned by engines integrated with Lake Formation. Data filters can implement row-level security and cell-level security, techniques that prevent access to the portions of the data that contain sensitive information. Data filters are applied when granting Lake Formation permissions on a Data Catalog table and can use PartiQL expressions to filter rows based on conditions. This solution therefore meets the requirement of preventing access to portions of the data that contain sensitive information.
Option A is less suitable because an IAM role with permissions to access Lake Formation tables grants access to data in Lake Formation through IAM policies, but it does not provide a way to prevent access to specific portions of the data that contain sensitive information.
Option C is less suitable because using an AWS Lambda function to remove sensitive information before Lake Formation ingests the data is a form of data cleansing or transformation with serverless functions. It could require significant changes to application code and logic, and it could also result in data loss or inconsistency.
Option D is less suitable for the same reasons: a Lambda function that periodically queries and removes sensitive information from Lake Formation tables also performs data cleansing or transformation with serverless functions, could require significant changes to application code and logic, and could result in data loss or inconsistency.
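As an illustration, a data filter can be created programmatically. The sketch below uses the boto3 create_data_cells_filter API to define a filter that hides a sensitive column and restricts rows; the account ID, database, table, and column names are hypothetical.

import boto3

lakeformation = boto3.client("lakeformation")

# A minimal sketch; the catalog ID, database, table, and column names
# are hypothetical. The row filter is a PartiQL predicate, and
# ColumnWildcard with ExcludedColumnNames hides the sensitive column.
lakeformation.create_data_cells_filter(
    TableData={
        "TableCatalogId": "111122223333",  # AWS account ID that owns the catalog
        "DatabaseName": "analytics_db",
        "TableName": "customer_orders",
        "Name": "no-pii-us-only",
        "RowFilter": {"FilterExpression": "region = 'us'"},  # row-level security
        "ColumnWildcard": {"ExcludedColumnNames": ["ssn"]},  # cell-level security
    }
)

The filter is then referenced when granting SELECT permissions on the table, so principals who receive the grant see only the filtered rows and columns.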
- (Topic 4)
A company has a financial application that produces reports. The reports average 50 KB in size and are stored in Amazon S3. The reports are frequently accessed during the first week after production and must be stored for several years. The reports must be retrievable within 6 hours.
Which solution meets these requirements MOST cost-effectively?
Correct Answer:
A
To store and retrieve reports that are frequently accessed during the first week and must be kept for several years, S3 Standard combined with S3 Glacier is a suitable solution. S3 Standard offers high durability, availability, and performance for frequently accessed data, while S3 Glacier offers secure and durable storage for long-term data archiving at a low cost. An S3 Lifecycle rule can transition the reports from S3 Standard to S3 Glacier after 7 days, which reduces storage costs. S3 Glacier standard retrievals typically complete within 3-5 hours, meeting the 6-hour retrieval requirement.
References:
✑ Storage Classes
✑ Object Lifecycle Management
✑ Retrieving Archived Objects from Amazon S3 Glacier
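The 7-day transition can be expressed as a lifecycle rule. Below is a minimal boto3 sketch using put_bucket_lifecycle_configuration; the bucket name and prefix are hypothetical.

import boto3

s3 = boto3.client("s3")

# A minimal sketch of the 7-day transition rule; the bucket name and
# prefix are hypothetical. GLACIER is the S3 Glacier Flexible Retrieval
# storage class, whose standard retrievals complete within hours.
s3.put_bucket_lifecycle_configuration(
    Bucket="financial-reports-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "reports-to-glacier-after-7-days",
                "Filter": {"Prefix": "reports/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 7, "StorageClass": "GLACIER"}],
            }
        ]
    },
)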
- (Topic 3)
A company wants to use high performance computing (HPC) infrastructure on AWS for financial risk modeling. The company's HPC workloads run on Linux. Each HPC workflow runs on hundreds of Amazon EC2 Spot Instances, is short-lived, and generates thousands of output files that are ultimately stored in persistent storage for analytics and long-term future use.
The company seeks a cloud storage solution that permits the copying of on-premises data to long-term persistent storage so that the data is available for processing by all EC2 instances. The solution must also include a high-performance file system that is integrated with that persistent storage to read and write datasets and output files.
Which combination of AWS services meets these requirements?
Correct Answer:
A
https://aws.amazon.com/fsx/lustre/
Amazon FSx for Lustre is a fully managed service that provides cost-effective, high-performance, scalable storage for compute workloads. Many workloads, such as machine learning, high performance computing (HPC), video rendering, and financial simulations, depend on compute instances accessing the same set of data through high-performance shared storage.
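FSx for Lustre can be linked to an S3 bucket so that S3 acts as the persistent store behind the file system. Below is a hedged boto3 sketch of creating such a file system; the bucket name, subnet ID, and capacity values are hypothetical.

import boto3

fsx = boto3.client("fsx")

# A minimal sketch, assuming a hypothetical S3 bucket and subnet ID.
# ImportPath/ExportPath link the Lustre file system to the S3 bucket so
# EC2 instances read input datasets and write output files through the
# high-performance file system, with S3 as the persistent storage.
fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=1200,  # GiB; minimum for the SCRATCH_2 deployment type
    SubnetIds=["subnet-0abc1234def567890"],
    LustreConfiguration={
        "DeploymentType": "SCRATCH_2",  # suited to short-lived HPC jobs
        "ImportPath": "s3://hpc-datasets-bucket",
        "ExportPath": "s3://hpc-datasets-bucket/results",
    },
)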
- (Topic 4)
A financial services company wants to shut down two data centers and migrate more than 100 TB of data to AWS. The data has an intricate directory structure with millions of small files stored in deep hierarchies of subfolders. Most of the data is unstructured, and the company's file storage consists of SMB-based storage types from multiple vendors. The company does not want to change its applications to access the data after migration.
What should a solutions architect do to meet these requirements with the LEAST operational overhead?
Correct Answer:
C
AWS DataSync is a data transfer service that simplifies, automates, and accelerates moving data between on-premises storage systems and AWS storage services over the internet or AWS Direct Connect. AWS DataSync can transfer data to Amazon FSx for Windows File Server, a fully managed file system that is accessible over the industry-standard Server Message Block (SMB) protocol. Amazon FSx for Windows File Server is built on Windows Server and delivers a wide range of administrative features such as user quotas, end-user file restore, and Microsoft Active Directory (AD) integration. This solution meets the requirements of the question because:
✑ It can migrate more than 100 TB of data to AWS within a reasonable time frame, as AWS DataSync is optimized for high-speed and efficient data transfer.
✑ It can preserve the intricate directory structure and the millions of small files stored in deep hierarchies of subfolders, as AWS DataSync handles complex file structures and metadata such as file names, permissions, and timestamps.
✑ It avoids changing the applications that access the data after migration, as Amazon FSx for Windows File Server supports the same SMB protocol and Windows Server features that the company's on-premises file storage uses.
✑ It reduces operational overhead, as AWS DataSync and Amazon FSx for Windows File Server are fully managed services that handle setting up, configuring, and maintaining the data transfer and the file system (a sketch of this DataSync setup follows the list).
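A minimal boto3 sketch of the migration task described above; the hostname, ARNs, and credentials are hypothetical placeholders, and a DataSync agent deployed on premises is assumed.

import boto3

datasync = boto3.client("datasync")

# Source: an on-premises SMB share, reached through the DataSync agent.
source = datasync.create_location_smb(
    ServerHostname="fileserver.corp.example.com",
    Subdirectory="/finance",
    User="svc-datasync",
    Password="EXAMPLE-PASSWORD",
    AgentArns=["arn:aws:datasync:us-east-1:111122223333:agent/agent-0example"],
)

# Destination: the Amazon FSx for Windows File Server file system.
destination = datasync.create_location_fsx_windows(
    FsxFilesystemArn="arn:aws:fsx:us-east-1:111122223333:file-system/fs-0example",
    SecurityGroupArns=["arn:aws:ec2:us-east-1:111122223333:security-group/sg-0example"],
    User="Admin",
    Password="EXAMPLE-PASSWORD",
)

# The task copies files along with their metadata (names, permissions,
# timestamps), preserving the deep directory hierarchy.
task = datasync.create_task(
    SourceLocationArn=source["LocationArn"],
    DestinationLocationArn=destination["LocationArn"],
    Name="smb-to-fsx-migration",
)
datasync.start_task_execution(TaskArn=task["TaskArn"])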
- (Topic 3)
A company sells datasets to customers who do research in artificial intelligence and machine learning (AI/ML). The datasets are large, formatted files that are stored in an Amazon S3 bucket in the us-east-1 Region. The company hosts a web application that the customers use to purchase access to a given dataset. The web application is deployed on multiple Amazon EC2 instances behind an Application Load Balancer. After a purchase is made, customers receive an S3 signed URL that allows access to the files.
The customers are distributed across North America and Europe. The company wants to reduce the cost that is associated with data transfers and wants to maintain or improve performance.
What should a solutions architect do to meet these requirements?
Correct Answer:
B
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/PrivateContent.html
Per the referenced documentation, serving the files through an Amazon CloudFront distribution with signed URLs keeps the content private while caching it at edge locations close to the customers in North America and Europe, which maintains or improves performance. Data transfer from Amazon S3 to CloudFront is free, so delivering the datasets through CloudFront also reduces data transfer costs.
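Assuming the chosen option replaces the S3 signed URLs issued after purchase with CloudFront signed URLs, the application could generate them with botocore's CloudFrontSigner. A minimal sketch; the key-pair ID, distribution domain, object path, and key file path are hypothetical.

import datetime

from botocore.signers import CloudFrontSigner
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding


def rsa_signer(message):
    # Signs the CloudFront policy with the private key that matches the
    # public key registered with the distribution's trusted key group.
    with open("private_key.pem", "rb") as key_file:
        private_key = serialization.load_pem_private_key(key_file.read(), password=None)
    return private_key.sign(message, padding.PKCS1v15(), hashes.SHA1())


signer = CloudFrontSigner("K2JCJMDEHXQW5F", rsa_signer)  # hypothetical key-pair ID
signed_url = signer.generate_presigned_url(
    "https://d1234example.cloudfront.net/datasets/sample.parquet",
    date_less_than=datetime.datetime.utcnow() + datetime.timedelta(hours=1),
)
print(signed_url)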