How Twilio Improved Control of AWS Root Credentials at Scale

July 18, 2022
Written by
Fabian Lim
Twilion
Reviewed by
Xiao He
Twilion


Twilio was founded in 2008 to bring Twilio Programmable Voice to the market and was originally built with a small number of AWS accounts. Today, Twilio has grown to provide more than 20 services and products that utilize hundreds of AWS accounts supporting the entire ecosystem.

Each AWS account has a root user with full administrative access to its respective account. The exposure or loss of a root user’s credentials is the worst nightmare for cloud security teams, as attackers can hold AWS accounts ransom during a security breach. These root users are considered to be some of the “crown jewels” of Cloud Security, and it is the security team’s – and all of engineering’s – job to secure them.

In this post, I will talk about how we on the Twilio Cloud Security team achieve clarity and control on all of Twilio’s AWS accounts’ root credentials.

How Twilio changed the process to handle AWS root credentials

Twilio started with a small Cloud Security team. We defined the Twilio Cloud Security Standard (TCSS) that guided our cloud security journey. One portion of this standard dictates how we should manage our AWS accounts’ root credentials.

During a recent review of hundreds of AWS root users, we found that the TCSS was neither specific nor clear enough to ensure 100% compliance with the stated requirements. Here are the questions we asked ourselves in the process.

What was the existing AWS Root Credentials Management process?

The TCSS mirrors controls in the AWS CIS Benchmark to achieve our security objectives. There are many controls that apply to the root user, such as:

  • Avoid the use of the root user
  • Ensure no root user access key exists
  • Ensure MFA (Multi-Factor Authentication) is enabled for the root user
  • Ensure hardware MFA is enabled for the root user
  • Ensure a log metric filter and alarm exist for usage of root user

These recommendations for root users in the AWS CIS Benchmark are fairly achievable for teams of all AWS estate sizes. The steps are easily auditable, and can be verified with a simple, automated, event-driven configuration check against new accounts.
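As an illustration of such a configuration check, here is a minimal Python sketch, not Twilio's actual tooling. It assumes the input is the `SummaryMap` dictionary returned by IAM's `GetAccountSummary` API (for example via boto3's `iam.get_account_summary()`), which reports `AccountAccessKeysPresent` and `AccountMFAEnabled` for the root user. The function name `root_user_findings` is hypothetical, and the hardware MFA and log metric filter controls would need separate checks.

```python
from typing import Dict, List


def root_user_findings(summary_map: Dict[str, int]) -> List[str]:
    """Return findings for two root-user controls from an IAM account summary."""
    findings: List[str] = []
    # AccountAccessKeysPresent is 1 when the root user has an access key.
    if summary_map.get("AccountAccessKeysPresent", 0) != 0:
        findings.append("root user access key exists")
    # AccountMFAEnabled is 1 when MFA is enabled for the root user.
    # A missing key is treated as non-compliant.
    if summary_map.get("AccountMFAEnabled", 0) != 1:
        findings.append("root user MFA is not enabled")
    return findings


if __name__ == "__main__":
    # Sample data standing in for iam.get_account_summary()["SummaryMap"].
    sample = {"AccountAccessKeysPresent": 0, "AccountMFAEnabled": 0}
    for finding in root_user_findings(sample):
        print(finding)
```

An event-driven setup could run a check like this whenever a new account is created and raise an alert if the returned list is not empty.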

We implemented a multi-account strategy for every AWS account created at Twilio, even before AWS Organizations existed as an AWS feature. We followed these steps in the AWS CIS Benchmark to ensure our AWS accounts were secured from inception. As long as these CIS controls were implemented, the accounts met the AWS CIS Benchmark and passed audits.

What is the problem with the existing process?

The AWS CIS Benchmark controls state what should be done rather than how it should be done. The benchmark should work hand-in-hand with a clearly defined standard and process (the “how”) for handling high-stakes credentials within the organization.

We had no process to standardize the root contact information on Twilio’s AWS accounts, and there was no formal way to keep track of the relationship between credentials and their custodians.

During a recent review, we found that some of the root credentials were unaccounted for. This was not a big concern at the time, because root credentials can be reset and obtained securely – or so we thought.

How did we regain control of all of the accounts?

Password Reset

In order to reset the AWS root password, we needed access to the registered email address. Since our AWS accounts are registered with @twilio.com email addresses, this was not an issue.

MFA Reset Using Alternative Login Method

In order to reset the AWS root MFA, we needed access to the primary phone number registered on the AWS account as an alternative login method. That’s where things got interesting and manual.

Some accounts had unknown or forgotten personal phone numbers registered as the primary contact. If Billing IAM Permission was activated on the account, we logged in as an IAM principal with aws-portal:ViewAccount and aws-portal:ModifyAccount permissions and changed the primary contact’s phone number. Then, we used the phone number to log in as the alternative login method. We received an automated voice call from AWS, verified a PIN that was shown on-screen, and gained root access within minutes.
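For illustration, the permissions named above could be expressed as an IAM policy document like the one built below. This is a sketch only: the two action names come from this post, while the `Sid` and the function name `billing_contact_policy` are hypothetical. AWS may have changed or retired these action names since the post was written, so check the current billing permissions documentation before using them.

```python
import json


def billing_contact_policy() -> dict:
    """Build an IAM policy allowing a principal to view and modify account settings."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ViewAndModifyAccountContact",  # hypothetical identifier
                "Effect": "Allow",
                # Action names as given in the post.
                "Action": ["aws-portal:ViewAccount", "aws-portal:ModifyAccount"],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(billing_contact_policy(), indent=2))
```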

MFA reset using AWS Web Form

If Billing IAM Permission was deactivated, we used AWS Organizations to set the account’s Alternative Security Contact phone number first. Then, we submitted a case via AWS MFA Support Web Form using this same phone number in the field “Alternate phone number where you can be reached. - optional”.
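One way to set a security contact programmatically is the AWS Account Management API's `PutAlternateContact` operation, exposed in boto3 as `put_alternate_contact` on the `account` client. The sketch below is an assumption about how this step could be scripted, not a description of what we ran. The helper names and all sample values are hypothetical, and calling it for a member account from the management account assumes the organization has the required access enabled.

```python
from typing import Any, Dict


def security_contact_request(
    account_id: str, name: str, title: str, email: str, phone: str
) -> Dict[str, str]:
    """Build the parameters for a PutAlternateContact call (SECURITY contact)."""
    return {
        "AccountId": account_id,  # member account to update
        "AlternateContactType": "SECURITY",
        "Name": name,
        "Title": title,
        "EmailAddress": email,
        "PhoneNumber": phone,
    }


def set_security_contact(client: Any, request: Dict[str, str]) -> None:
    """Send the request with any client exposing put_alternate_contact."""
    client.put_alternate_contact(**request)


if __name__ == "__main__":
    # Placeholder values; real code would pass boto3.client("account") as client.
    request = security_contact_request(
        "111122223333", "Cloud Security", "Security Team",
        "cloud-security@example.com", "+15555550100",
    )
    print(request)
```

Passing the client in as an argument keeps the request-building logic testable without AWS credentials.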

This request has a 4-hour SLA (for enterprise customers), plus up to 24 hours for an engineer to actually verify your details. After receiving the request, an AWS agent will:

  1. Send an email to the registered email address informing of a pending phone call
  2. Call the registered primary phone number, which the person may or may not answer
  3. If the person does not answer or does not deny the request, call the requester’s phone number submitted in the web form
  4. If the requester’s phone number is listed as the alternative contact, send a one-time PIN to the registered email address for the requester to verify
  5. Reset the MFA device

[Figure: AWS verification of root accounts]

This process can take up to 28 hours (the 4-hour SLA plus up to 24 hours for verification), which is obviously not ideal during an emergency.

However, this web form does not guarantee us access, because the agent first calls the registered phone number (unknown to us) before calling the requester (us). If that person denies the request, the MFA reset is denied and so is root access.

When a request is denied, the flow looks like this. An AWS agent will:

  1. Send an email to the registered email address informing of a pending phone call
  2. Call the registered primary phone number where the person denies the request to reset MFA
  3. Call the requester’s phone number submitted in the web form
  4. Inform requester that the request has been denied

[Figure: AWS flow chart for account denial]
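The two flows above can be summarized as a small decision function. This is only a model of the steps as we experienced them, not an official AWS specification. The names `Outcome` and `mfa_reset_outcome` are hypothetical, and the `NOT_VERIFIED` result for a requester who is not the alternative contact, or whose PIN is not verified, is an assumption, since we did not observe those cases.

```python
from enum import Enum


class Outcome(Enum):
    MFA_RESET = "mfa reset"
    DENIED = "denied"
    NOT_VERIFIED = "not verified"  # assumed outcome, not observed


def mfa_reset_outcome(
    primary_denies: bool,
    requester_is_alternative_contact: bool,
    pin_verified: bool,
) -> Outcome:
    """Model the outcome of an MFA reset request made through the web form."""
    # The agent calls the registered primary number first; a denial ends the request.
    if primary_denies:
        return Outcome.DENIED
    # The one-time PIN is only sent if the requester's number is the alternative contact.
    if not requester_is_alternative_contact:
        return Outcome.NOT_VERIFIED
    # The requester must verify the PIN sent to the registered email address.
    if not pin_verified:
        return Outcome.NOT_VERIFIED
    return Outcome.MFA_RESET


# Worst-case duration: 4-hour SLA plus up to 24 hours for verification.
SLA_HOURS = 4
VERIFICATION_HOURS = 24
WORST_CASE_HOURS = SLA_HOURS + VERIFICATION_HOURS

if __name__ == "__main__":
    print(mfa_reset_outcome(False, True, True).value)
    print(WORST_CASE_HOURS)
```

The model makes the weak point visible: whoever answers the registered primary phone number can veto the reset, regardless of what the requester does.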

Luckily for us, this was part of our review process and not a real emergency. We informed all of the contacts and successfully regained access to every account.

What did we improve and learn?

Here are some learnings to consider in the early stages of securing AWS root users:

AWS Root Management Process

We updated the TCSS and supplemented it with a better process to handle all AWS account root users, including setting a standard for the password, MFA, primary contact email, and phone number. Along with this, we documented and tracked the details about the root account in a centralized database and owned this process. We learned that starting this process as early as possible is essential.
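To make the tracking idea concrete, here is a minimal sketch of what a record in such a database might hold and how unaccounted credentials could be detected. The field names, the `RootUserRecord` type, and the `unaccounted_accounts` helper are all hypothetical; they are not our actual schema.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class RootUserRecord:
    """One tracked root user; all field names are illustrative."""
    account_id: str
    root_email: str
    custodian: str          # team or person responsible for the credentials
    mfa_type: str           # e.g. "hardware" or "virtual"
    phone_standardized: bool  # primary contact phone follows the standard


def unaccounted_accounts(
    all_account_ids: Iterable[str], records: Iterable[RootUserRecord]
) -> List[str]:
    """Return account IDs with no record or no named custodian."""
    tracked: Set[str] = {r.account_id for r in records if r.custodian}
    return sorted(a for a in all_account_ids if a not in tracked)


if __name__ == "__main__":
    records = [
        RootUserRecord("111122223333", "aws-a@example.com", "cloud-security", "hardware", True),
        RootUserRecord("444455556666", "aws-b@example.com", "", "virtual", False),
    ]
    print(unaccounted_accounts(["111122223333", "444455556666", "777788889999"], records))
```

Comparing the database against the full list of accounts on a schedule is one way to catch credentials that lose their custodian.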

Contact Management

We improved our contact management and referenced the AWS Organizations best practices for primary phone numbers. We activated Billing IAM Permission so that account contact information can be changed without using the root user, which saved us a lot of time by avoiding the AWS MFA Support Web Form.

We learned that AWS does not reveal or provide any account’s primary contact phone numbers for privacy reasons, so we could not track the relationship between an AWS account and phone number effectively.

At the time of our review, we found no API to retrieve or update primary contact information, so updating the root email and phone number at a later stage was really difficult. It required a lot more manual effort using the web console.

We also learned that the registered phone number of the AWS primary contact is an important piece of information that we had overlooked, because AWS uses it as a method of verification for an MFA reset.

Password Management

We know that the impact of losing these credentials is high. We decided on a centralized approach and became custodians of all the AWS root users. We used a credentials manager to manage credentials (passwords and MFA).

IAM Management

We learned that there are only a limited number of tasks which require the root user, and engineers will have to come to us in order to use it. To round this out, we also applied an AWS Organizations SCP (Service Control Policy) that denies root activities and created alerts to monitor changes on the SCP. This helps to prevent a root user from performing any actions if its credentials are ever used.
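As an illustration, an SCP that denies root activity can be written with a condition on `aws:PrincipalArn`, following the pattern in AWS's published SCP examples. The sketch below builds such a policy document in Python. It is not the exact policy we deployed, and the `Sid` and the function name `deny_root_scp` are hypothetical.

```python
import json


def deny_root_scp() -> dict:
    """Build an SCP that denies all actions to the root user of member accounts."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyRootUser",  # hypothetical identifier
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                # Matches the root user principal of any account the SCP is attached to.
                "Condition": {
                    "StringLike": {"aws:PrincipalArn": "arn:aws:iam::*:root"}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_root_scp(), indent=2))
```

Note that SCPs do not apply to the organization's management account, so its root user needs to be protected by other means.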

What is our measurable goal?

Besides measuring the number of AWS account root credentials we secured, we also significantly decreased the time needed to gain access to AWS root users when responding to an incident. This was also a good time to identify and close abandoned AWS accounts, which reduced our cloud footprint and attack surface area.

Conclusion

While measuring the goal is hard, we know that with each process we better define, we improve Twilio’s cloud security posture. This is all work the Cloud Security Team does in the background that is important for Twilio’s availability and reliability.

With that, our customers can enjoy a more reliable and secure Twilio platform. I hope this post was insightful for everyone on a cloud security team defending their cloud. And for developers writing code for multi-factor authentication, don’t reinvent the wheel! Check out Twilio’s User Authentication & Identity solution that integrates seamlessly with any application to start protecting your customers.

Fabian Lim is a Cloud Security Engineer at Twilio. He loves to automate manual work in security reviews. He also loves building good relationships with anyone who wants to learn about security. Fabian is currently working his way to delete all things unsecured in Twilio, and can be reached at flim[at]twilio.com or on LinkedIn.