Billing and Cost Management

User-Defined Cost Allocation Tags

User-defined tags are tags that you define, create, and apply to resources. After you have created and applied them, you can activate them in the Billing and Cost Management console for cost allocation tracking.

The detailed steps are:

  1. Log in to the AWS Management Console of the new account
  2. Use the Tag Editor to create the new user-defined tags
  3. Use the Cost Allocation Tag manager in the payer account to mark the tags as cost allocation tags

Certificate Manager (ACM)

When you request a public certificate, AWS Certificate Manager (ACM) generates a public/private key pair.

Application Load Balancer (ALB)

If an application runs on EC2 instances in an Auto Scaling Group behind an Application Load Balancer (ALB), and the company wants the application to scale based on the number of incoming requests, SysOps needs to

  • Use a target tracking scaling policy based on ALB’s RequestCountPerTarget metric
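As a rough sketch, the parameters for such a policy could be assembled as below. The function name, policy name and sample identifiers are hypothetical; the dictionary keys follow the Auto Scaling PutScalingPolicy request shape, and nothing here calls AWS.

```python
def target_tracking_policy(asg_name, resource_label, requests_per_target):
    """Build keyword arguments for an Auto Scaling PutScalingPolicy request."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-requests-per-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                # Format: app/<lb-name>/<lb-id>/targetgroup/<tg-name>/<tg-id>
                "ResourceLabel": resource_label,
            },
            # Average number of requests each target should receive
            "TargetValue": float(requests_per_target),
        },
    }


policy = target_tracking_policy(
    "web-asg",  # hypothetical Auto Scaling Group name
    "app/my-alb/0123456789abcdef/targetgroup/my-tg/fedcba9876543210",  # hypothetical label
    1000,
)
print(policy["TargetTrackingConfiguration"])
```

The resulting dictionary could be passed to an SDK call such as boto3's `put_scaling_policy(**policy)`, assuming credentials and the named resources exist.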


A company has deployed its infrastructure using AWS CloudFormation. Recently, the company made manual changes to the infrastructure. A SysOps Administrator is tasked with determining what was changed and updating the CloudFormation template:

  • Use drift detection on the CloudFormation stack. Use the output to update the CloudFormation template and redeploy the stack
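A minimal sketch of reading drift detection output, assuming the results were already fetched (for example with the DescribeStackResourceDrifts API). The sample records and resource names are made up for illustration; only the key names mirror the API response.

```python
def drifted_resources(drifts):
    """Return (logical id, status) for resources that no longer match the template."""
    changed = {"MODIFIED", "DELETED"}
    return [
        (d["LogicalResourceId"], d["StackResourceDriftStatus"])
        for d in drifts
        if d["StackResourceDriftStatus"] in changed
    ]


# Hypothetical drift results, shaped like entries of "StackResourceDrifts"
sample_drifts = [
    {"LogicalResourceId": "WebSecurityGroup", "StackResourceDriftStatus": "MODIFIED"},
    {"LogicalResourceId": "AppBucket", "StackResourceDriftStatus": "IN_SYNC"},
    {"LogicalResourceId": "OldQueue", "StackResourceDriftStatus": "DELETED"},
]
print(drifted_resources(sample_drifts))
```

The resources returned here are the ones whose template definitions need to be updated before redeploying the stack.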


A stack set lets you create stacks in AWS accounts across regions by using a single AWS CloudFormation template. All the resources included in each stack are defined by the stack set’s AWS CloudFormation template.


If a third-party company uses federation to authenticate users and grant AWS permissions, SysOps can look up the federated identity username in CloudTrail.

Management Events Logging

Management events logging provides visibility into management operations that are performed on resources in your AWS account.

Data Events Logging

Data events logging provides visibility into the resource operations performed on or within a resource. These are also known as data plane operations. Data events are often high-volume activities.


  • S3 object-level API activity, e.g. GetObject, DeleteObject, PutObject API operations
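As an illustration, an event selector that turns on S3 object-level logging for one bucket might look like the sketch below. The helper and bucket name are hypothetical; the keys follow the CloudTrail PutEventSelectors request shape.

```python
def s3_data_event_selector(bucket_name):
    """Build one CloudTrail event selector that logs S3 object-level events."""
    return {
        "ReadWriteType": "All",           # log both read and write operations
        "IncludeManagementEvents": True,  # keep management events as well
        "DataResources": [
            {
                "Type": "AWS::S3::Object",
                # The trailing slash covers every object in the bucket
                "Values": [f"arn:aws:s3:::{bucket_name}/"],
            }
        ],
    }


selector = s3_data_event_selector("example-audit-bucket")  # hypothetical bucket
print(selector["DataResources"][0]["Values"])
```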


To publish memory metrics every minute, SysOps needs to

  1. Enable detailed monitoring on the instance in Amazon CloudWatch
  2. Publish the memory metrics using Amazon CloudWatch Agent
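For step 2, a CloudWatch Agent configuration that collects memory usage once a minute could be generated roughly as follows. This is a sketch: the helper name is hypothetical, and only the `metrics` section of the agent configuration is shown.

```python
import json


def memory_metrics_config(interval_seconds=60):
    """Build the metrics section of a CloudWatch Agent configuration file."""
    return {
        "metrics": {
            # Tag each data point with the instance that produced it
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                }
            },
        }
    }


config = memory_metrics_config()
print(json.dumps(config, indent=2))
```

The printed JSON would be saved as the agent configuration file on the instance; the exact file location depends on how the agent is installed.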


If an EC2 instance has stopped responding and the system status checks are failing, SysOps needs to

  • Stop and then start the EC2 instance so that it can be launched on a new host

Reboot vs Stop/Start

  • Rebooting an EC2 instance keeps it on the same physical host machine
  • Stopping and then starting an EC2 instance usually moves it to a new physical host machine

Auto Scaling Group (ASG)

By default, an Auto Scaling Group determines the health status of EC2 instances using EC2 status checks.

If we want to analyse the unhealthy instances before termination, we can use EC2 Auto Scaling Group Lifecycle Hook to pause instance termination after the instance has been removed from service.
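A sketch of the lifecycle hook parameters, assuming a hypothetical hook name and a one-hour inspection window. The keys follow the Auto Scaling PutLifecycleHook request shape; no AWS call is made here.

```python
def termination_hook(asg_name, timeout_seconds=3600):
    """Build keyword arguments for an Auto Scaling PutLifecycleHook request."""
    return {
        "LifecycleHookName": "inspect-before-terminate",  # hypothetical name
        "AutoScalingGroupName": asg_name,
        # Pause instances as they enter the terminating state
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING",
        # How long the instance waits before the default result applies
        "HeartbeatTimeout": timeout_seconds,
        # Proceed with termination once the timeout expires
        "DefaultResult": "CONTINUE",
    }


hook = termination_hook("web-asg")  # hypothetical Auto Scaling Group name
print(hook)
```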

Operating System

Patching the operating system of EC2 instances is the customer's responsibility.


To resolve alerts of high CPU utilization from a Memcached-based ElastiCache cluster, SysOps can

  • add additional worker nodes to the ElastiCache cluster
  • scale up to a larger cache node type (an EC2 Auto Scaling Group cannot be attached to an ElastiCache cluster)

If the eviction count metric is high whilst other components are normal, SysOps needs to

  • Scale the ElastiCache cluster by adding additional nodes

Elastic Load Balancer (ELB)

An ELB is a software-based load balancer which can be set up and configured in front of a collection of AWS EC2 instances. The ELB serves as a single entry point for consumers of the EC2 instances and distributes incoming traffic across all machines available to receive requests.

If the security team wants to track application requests by the originating IP and the EC2 instance that processes the request, a SysOps Admin can use Elastic Load Balancing access logs to provide that information.
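To show where that information sits in a log entry, here is a small sketch that parses one Application Load Balancer access log line. The sample line is shortened and illustrative, and the helper name is hypothetical; the field positions assume the standard space-delimited ALB access log layout.

```python
import shlex


def parse_alb_entry(line):
    """Extract the client IP, target address and ELB status code from one entry."""
    fields = shlex.split(line)  # respects the quoted request and user-agent fields
    client_ip = fields[3].rsplit(":", 1)[0]  # drop the client port
    return {
        "client_ip": client_ip,
        "target": fields[4],           # ip:port of the target that served the request
        "elb_status_code": fields[8],  # HTTP status returned by the load balancer
    }


sample = (
    "http 2018-07-02T22:23:00.186641Z app/my-loadbalancer/50dc6c495c0c9188 "
    "192.168.131.39:2817 10.0.0.1:80 0.000 0.001 0.000 200 200 34 366 "
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.46.0" - -'
)
entry = parse_alb_entry(sample)
print(entry)
```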


IAM Role

To securely access credentials stored in AWS Systems Manager Parameter Store, SysOps can create an IAM Role for the EC2 instances and grant the role permission to read the Systems Manager parameters.
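A minimal sketch of the read-only policy such a role might carry. The region, account ID and parameter path are placeholders, and the helper is hypothetical; the actions listed are the Systems Manager read actions for parameters.

```python
import json


def parameter_read_policy(region, account_id, path_prefix):
    """Build an IAM policy that allows reading parameters under one path."""
    prefix = path_prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ssm:GetParameter",
                    "ssm:GetParameters",
                    "ssm:GetParametersByPath",
                ],
                "Resource": f"arn:aws:ssm:{region}:{account_id}:parameter/{prefix}/*",
            }
        ],
    }


read_policy = parameter_read_policy("us-east-1", "123456789012", "/myapp/prod/")
print(json.dumps(read_policy, indent=2))
```

Parameters stored as SecureString would also need decrypt permission on the KMS key that encrypts them.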

To access the AWS Management Console with Security Assertion Markup Language (SAML), SysOps can map the role attribute to an AWS role. The AWS role is assigned IAM policies that govern access to AWS resources.


If you are running a serverless application in AWS Lambda and there is an expected traffic increase, SysOps needs to ensure the concurrency limit for the Lambda function is higher than the expected number of simultaneous function executions.
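The expected number of simultaneous executions can be estimated as request rate multiplied by average duration. The sketch below applies that estimate; the function names and traffic figures are illustrative assumptions.

```python
import math


def required_concurrency(requests_per_second, avg_duration_seconds):
    """Estimate concurrent executions as request rate times average duration."""
    return math.ceil(requests_per_second * avg_duration_seconds)


def limit_is_sufficient(concurrency_limit, requests_per_second, avg_duration_seconds):
    """Check whether a concurrency limit covers the estimated demand."""
    return concurrency_limit >= required_concurrency(requests_per_second, avg_duration_seconds)


# 400 requests per second at 0.5 seconds each needs about 200 concurrent executions
print(required_concurrency(400, 0.5))
print(limit_is_sufficient(1000, 3000, 0.5))
```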


If your NAT instance shows high latency as the network grows, SysOps needs to replace the NAT Instance with a NAT Gateway.

Comparison between old NAT Instance and NAT Gateway


  • NAT Instance is a generic Amazon Linux AMI that is configured to perform NAT
  • NAT Gateway is software that is optimised for handling NAT traffic


  • NAT Instance bandwidth depends on the EC2 instance type
  • NAT Gateway can scale up to 45 Gbps (more recent AWS documentation lists 100 Gbps)


If the security team finds that some employees have been using individual AWS accounts that are not under the control of the company, SysOps needs to

  • Send each existing account an invitation from the central organisation

Service Control Policies (SCPs)

AWS Organizations helps you centrally govern your environment and use Service Control Policies (SCPs) to set permission guardrails with the fine-grained control of the AWS IAM policy language.

SysOps can set up notifications for whenever combined billing for all AWS accounts within a company exceeds a certain threshold. To achieve that, SysOps needs to

  1. Set up AWS Organizations and enable Consolidated Billing
  2. In the Payer Account
    • Enable Billing Alerts in the Billing and Cost Management console
    • Set up a billing alarm in Amazon CloudWatch
    • Publish an SNS message when the alarm triggers
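The alarm from step 2 could be described roughly as below. The alarm name, threshold and SNS topic ARN are placeholders; the keys follow the CloudWatch PutMetricAlarm request shape. Billing metrics are published in the us-east-1 region, so the alarm would be created there.

```python
def billing_alarm(threshold_usd, sns_topic_arn):
    """Build keyword arguments for a CloudWatch PutMetricAlarm request."""
    return {
        "AlarmName": f"billing-over-{int(threshold_usd)}-usd",  # hypothetical name
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours, in seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        # SNS topic that receives the notification when the alarm fires
        "AlarmActions": [sns_topic_arn],
    }


alarm = billing_alarm(500, "arn:aws:sns:us-east-1:123456789012:billing-alerts")
print(alarm["AlarmName"], alarm["Threshold"])
```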

If the security team discovers that some employees are using AWS services in ways that violate company policies, and a SysOps Administrator needs to prevent all users of an account, including the root user, from performing certain restricted actions, SysOps needs to

  • Apply Service Control Policies (SCPs) to allow approved actions only
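A minimal sketch of an allow-list SCP document. The approved actions are purely illustrative and the helper is hypothetical. An allow list only restricts anything once the default FullAWSAccess policy is no longer attached to the target.

```python
import json


def allow_list_scp(approved_actions):
    """Build an SCP document that allows only the listed actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(approved_actions),
                "Resource": "*",  # Allow statements in an SCP apply to all resources
            }
        ],
    }


scp = allow_list_scp(["s3:*", "ec2:*", "cloudwatch:*"])  # illustrative action list
print(json.dumps(scp, indent=2))
```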

Relational Database Service (RDS)

To ensure minimal downtime of a web application in the event of a database failure, SysOps can modify the DB instance, outside of business hours, to be a Multi-AZ deployment.

To keep a daily backup of the RDS database in a separate security account, SysOps needs to

  1. Create an RDS snapshot with AWS CLI create-db-snapshot command
  2. Share it with the security account
  3. Create a copy of the shared snapshot in the security account
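The three steps map onto three RDS API operations. The sketch below only lists the calls and their parameters in order; all identifiers are placeholders, the copy name suffix is an assumption, and the last call would run with credentials of the security account.

```python
def snapshot_share_plan(db_instance_id, snapshot_id, source_account_id,
                        security_account_id, region):
    """List the RDS operations, in order, with the parameters each one needs."""
    snapshot_arn = f"arn:aws:rds:{region}:{source_account_id}:snapshot:{snapshot_id}"
    return [
        # Step 1: take a manual snapshot in the source account
        ("create_db_snapshot", {
            "DBInstanceIdentifier": db_instance_id,
            "DBSnapshotIdentifier": snapshot_id,
        }),
        # Step 2: grant the security account permission to restore or copy it
        ("modify_db_snapshot_attribute", {
            "DBSnapshotIdentifier": snapshot_id,
            "AttributeName": "restore",
            "ValuesToAdd": [security_account_id],
        }),
        # Step 3: in the security account, copy the shared snapshot by its ARN
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": snapshot_arn,
            "TargetDBSnapshotIdentifier": f"{snapshot_id}-copy",  # hypothetical name
        }),
    ]


plan = snapshot_share_plan("prod-db", "prod-db-daily", "111111111111",
                           "222222222222", "us-east-1")
for operation, params in plan:
    print(operation, params)
```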


Amazon RDS Multi-AZ deployments do not failover automatically in response to database operations, such as

  • long running queries
  • deadlocks
  • database corruption errors

RDS automatically fails over to the standby database in cases such as

  • A storage failure on primary database
  • The database instance type was changed


Aurora is fault-tolerant by design and adding a read replica can increase availability.

Route 53

If your web application has a new version that needs to be rolled out, SysOps can use an Amazon Route 53 weighted routing policy to gradually move traffic from the old version to the new one.
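With weighted routing, each record receives a share of traffic equal to its weight divided by the sum of all weights. The sketch below computes those shares; the record names and weights are illustrative, and the helper does not model the case where every weight is zero.

```python
def traffic_share(weights):
    """Return the fraction of traffic each weighted record receives."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one record needs a non-zero weight")
    return {name: weight / total for name, weight in weights.items()}


# Start by sending a tenth of the traffic to the new version
shares = traffic_share({"old-version": 90, "new-version": 10})
print(shares)
```

Raising the weight of the new record step by step, while lowering the old one, shifts traffic gradually.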

ALIAS Record

An ALIAS record is a virtual host record type, which is used to point one domain name to another one, almost the same as a CNAME. The important difference is that ALIAS can coexist with other records on that name.

CNAME Record

A CNAME record points a name (typically a subdomain) to a destination hostname rather than directly to an IP address. If the IP of the destination hostname changes, you won't need to change your DNS records, because the CNAME follows the destination hostname.


  • A name can have multiple ALIAS records, but only one CNAME record
  • CNAME and ALIAS records must point to a name

Service Catalog

AWS Service Catalog allows IT administrators to create, manage, and distribute catalogs of approved products to end users, who can then access the products they need in a personalized portal.

Storage Gateway

AWS Storage Gateway is a hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage.

Storage Gateway enables you to reduce your on-premises storage footprint and associated costs by leveraging Amazon S3 Cloud Storage.

Systems Manager

If a SysOps Administrator is attempting to use AWS Systems Manager Session Manager to initiate an SSH session with a Linux EC2 instance, and cannot find the target instance in the Session Manager console, the SysOps Administrator needs to

  1. Add Systems Manager permissions to the instance profile
  2. Install the Systems Manager Agent (SSM Agent) on the target instance


Static site Hosting

When using the static site hosting feature of S3, if you receive a 403 Forbidden (Access Denied) error, SysOps needs to add a bucket policy that grants everyone read access to the bucket objects.
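A minimal sketch of such a bucket policy, with a placeholder bucket name and a hypothetical helper. The bucket's Block Public Access settings also have to permit public policies for it to take effect.

```python
import json


def public_read_policy(bucket_name):
    """Build a bucket policy that lets anyone read objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # Applies to every object in the bucket, not the bucket itself
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


site_policy = public_read_policy("example-static-site")  # hypothetical bucket
print(json.dumps(site_policy, indent=2))
```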


If the requirement is to archive data that must be retained for at least 7 years, a SysOps Admin needs to configure

  • AWS S3 Glacier Vault Lock policy

S3 Glacier Vault Lock allows you to easily deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy.
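For illustration, a vault lock policy that denies archive deletion until the retention period has passed might be built as follows. The region, account ID and vault name are placeholders, and the day count ignores leap days as a simplification.

```python
import json


def vault_lock_policy(region, account_id, vault_name, retention_years):
    """Build a vault lock policy that blocks deletes of archives younger than the retention period."""
    retention_days = 365 * retention_years  # simplification: leap days ignored
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-based-on-archive-age",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": f"arn:aws:glacier:{region}:{account_id}:vaults/{vault_name}",
                "Condition": {
                    "NumericLessThan": {"glacier:ArchiveAgeInDays": str(retention_days)}
                },
            }
        ],
    }


lock_policy = vault_lock_policy("us-east-1", "123456789012", "records-vault", 7)
print(json.dumps(lock_policy, indent=2))
```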

Virtual Private Cloud (VPC)

If a developer has connectivity issues with a particular port, SysOps needs to check that

  • The Security Group is correctly configured to allow that port
  • The Network ACL is using the default configuration

If the checks above do not resolve the problem, VPC Flow Logs show whether traffic on that port was accepted or rejected.
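A small sketch of reading flow log records to find rejected traffic on a port. It assumes the default (version 2) space-separated record format; the sample records are illustrative and the helper names are hypothetical.

```python
# Field order of the default flow log record format
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse_flow_record(line):
    """Map one space-separated flow log record onto its field names."""
    return dict(zip(FIELDS, line.split()))


def rejected_on_port(lines, port):
    """Return the records where traffic to the given destination port was rejected."""
    records = (parse_flow_record(line) for line in lines)
    return [r for r in records if r["action"] == "REJECT" and r["dstport"] == str(port)]


sample_records = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]
rejected = rejected_on_port(sample_records, 3389)
print([r["srcaddr"] for r in rejected])
```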

VPC Endpoint

A VPC endpoint enables you to privately connect your VPC to supported AWS services and VPC endpoint services powered by AWS PrivateLink without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection. Endpoints are virtual devices. They allow communication between instances in your VPC and services without imposing availability risks or bandwidth constraints on your network traffic.

Web Application Firewall (WAF)

AWS Web Application Firewall (WAF) is a commonly used solution for protection from XSS and other web application attacks.

If you observe a large number of rogue HTTP requests on an Application Load Balancer, SysOps can use an AWS WAF rate-based rule to block this traffic when it exceeds a defined threshold.
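A rate-based rule of this kind could be described roughly as below. The rule name and limit are illustrative and the helper is hypothetical; the keys follow the AWS WAF (WAFv2) rule structure.

```python
def rate_based_block_rule(name, limit, priority=0):
    """Build a WAFv2 rule that blocks source IPs exceeding the request limit."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": limit,             # request threshold per source IP
                "AggregateKeyType": "IP",   # count requests per originating IP
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


rule = rate_based_block_rule("block-request-floods", 2000)  # illustrative values
print(rule["Statement"])
```

The rule would be added to a web ACL that is associated with the Application Load Balancer.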

If you observe that 404 errors are being returned to one IP address every minute, SysOps should use WAF to block this suspected malicious activity.

General Network


ping operates by sending Internet Control Message Protocol (ICMP) echo request packets to the target host and waiting for an ICMP echo reply.

Common Vulnerabilities and Exposures report (CVE)

To get HTTP layer 7 status code, you can use

  • Application Load Balancer (ALB) access logs
  • CloudFront access logs

Network Address Translation

Network Address Translation (NAT) is a method of remapping an IP address space into another by modifying network address information in the IP header of packets while they are in transit across a traffic routing device.

Internet Gateway

An Internet Gateway is a logical connection between an Amazon VPC and the internet.

  • It is not a physical device
  • Only one Internet Gateway can be associated with a VPC
  • It does not limit the bandwidth of internet connectivity
    • The only limitation on bandwidth is the size of the Amazon EC2 instance, and it applies to all the traffic

If a VPC does not have an Internet Gateway, then the resources in the VPC cannot be accessed from the internet.

NAT vs Internet Gateway

A NAT Gateway does similar things to an Internet Gateway, but with two main differences:

  1. NAT Gateway allows resources in a private subnet to access the internet. Think yum update, external database connections, wget calls, etc.
  2. NAT Gateway only works one way. The internet at large cannot get through your NAT to your private resources unless you explicitly allow it.