Question-12: You are working with a financial company that receives daily data feeds from companies such as Bloomberg, Yahoo Finance, and Markit. Under the contracts, this data can be used only for technical and historical data analysis, predictions, and machine learning. You receive around 5 GB of data daily and already hold 5 TB. It has been decided to host this data on AWS S3, and the agreement signed with the data vendors reflects this. You now need to make sure that nobody other than members of the Data Science and Machine Learning team can access this data. Which of the following is a suitable solution for this requirement?
- You would be encrypting the entire S3 bucket.
- You would be using server side as well as client-side data encryption.
- You would be using an HSM (Hardware Security Module).
- You would be creating an access policy for each data lake S3 bucket, and using access control lists and bucket policies to control the resources at the bucket level.
Exp: The main point of this question is controlling access to the S3 bucket so that only permitted users can access it. This can be resolved using IAM (Identity and Access Management).
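Encryption, whether server-side, client-side, or applied as a bucket-wide default, protects the data at rest but does not by itself decide who is authorized to access it, so the encryption options do not meet the requirement. An HSM is used for managing encryption keys and is not needed here.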
You can manage access to your Amazon S3 resources using access policy options. By default, all Amazon S3 resources (buckets, objects, and related subresources) are private: only the resource owner, the AWS account that created them, can access the resources. The resource owner can then grant access permissions to others by writing an access policy. Amazon S3 access policy options are broadly categorized as resource-based policies and user policies. Access policies that are attached to resources are referred to as resource-based policies. Examples of resource-based policies include bucket policies and access control lists (ACLs). Access policies that are attached to users in an account are called user policies. Typically, a combination of resource-based and user policies is used to manage permissions to S3 buckets, objects, and other resources.
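As a rough illustration of a resource-based policy, the sketch below builds a bucket policy document that grants read access to a set of principals. The function name `build_bucket_policy`, the bucket name `example-market-data`, and the role ARN are hypothetical placeholders, not values from the question; the sketch only constructs the JSON and does not call AWS.

```python
import json


def build_bucket_policy(bucket_name, team_principal_arns):
    """Return a bucket (resource-based) policy granting read access to the given principals.

    Both arguments are illustrative assumptions; substitute real values.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    principals = {"AWS": list(team_principal_arns)}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "AllowTeamListBucket",
                "Effect": "Allow",
                "Principal": principals,
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "AllowTeamGetObject",
                "Effect": "Allow",
                "Principal": principals,
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name and role ARN, for illustration only.
    policy = build_bucket_policy(
        "example-market-data",
        ["arn:aws:iam::111122223333:role/DataScienceRole"],
    )
    print(json.dumps(policy, indent=2))
```

Because a resource-based policy is attached to the bucket, it has to name a `Principal`; that element is what separates it from a user policy.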
For most data lake environments, we recommend using user policies, so that permissions to access data assets can also be tied to user roles and permissions for the data processing and analytics services and tools that your data lake users will use. User policies are associated with the AWS Identity and Access Management (IAM) service, which allows you to securely control access to AWS services and resources. With IAM, you can create IAM users, groups, and roles in accounts and then attach access policies to them that grant access to AWS resources, including Amazon S3.
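A minimal sketch of the user-policy approach follows, assuming the Data Science and Machine Learning team is modeled as an IAM group. The function name `build_team_user_policy` and the bucket name `example-market-data` are hypothetical. The document is attached to an IAM identity, so it carries no `Principal` element.

```python
import json


def build_team_user_policy(bucket_name):
    """Return an identity-based (user) policy allowing read access to one bucket.

    The bucket name is an illustrative assumption; substitute a real value.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Let team members list the bucket contents.
                "Sid": "ListDataLakeBucket",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
            },
            {
                # Let team members read the objects.
                "Sid": "ReadDataLakeObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, for illustration only.
    print(json.dumps(build_team_user_policy("example-market-data"), indent=2))
```

One way to apply such a document would be to serialize it with `json.dumps` and attach it to the team's IAM group, for example through the boto3 IAM client's `put_group_policy` call. Users outside the group then have no access, since S3 resources are private by default.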