Details on Misconfigured Kubernetes NodePorts

July 07, 2021
Written by
Security
Twilion

UPDATE 2021-07-26: Through further investigation, we have updated the cause of the exposure. Our investigation found that the cause of the exposure was a misconfigured Kubernetes network policy. See below for additional details.

Twilio believes that our customers’ trust in us and the security of our products are of paramount importance, and when an event occurs that might threaten that security, we tell you about it. To that end, we want to provide an overview of the impact we experienced from a recently discovered server misconfiguration and how we managed that event.

What happened?

On June 18, 2021, a security researcher responsibly disclosed that they were able to access internal data on several Twilio SendGrid Kubernetes cluster node hosts. Twilio's security team quickly identified and mitigated the misconfiguration that led to the exposure, and started remediation efforts according to our incident response procedures.

Our investigation found that the cause of the exposure was a misconfigured Kubernetes network policy. We further identified that, due to this misconfiguration, a Redis cache cluster had been exposed to the Internet and the data in it could be retrieved without authentication. We validated that the data exposed in that cache included our customers’ private DKIM (DomainKeys Identified Mail) keys. Our investigation determined that this data had been exposed starting on June 14, 2021, and was publicly accessible for four days.

We have thoroughly investigated this incident and, to date, have found no indication that the exposed data was accessed by any unauthorized actors.

How did it happen?

Twilio SendGrid runs a number of large Kubernetes clusters that scale so we can efficiently process email volumes during peak periods. One of these clusters runs the core mail processing services, and we have been actively migrating legacy services to it as part of our normal engineering processes. On June 14, 2021, we migrated two of those legacy services to one of the Kubernetes clusters. One of those services was used to cache DKIM private keys so that email could be signed during processing.

The security researcher who disclosed this exposure through our Bug Bounty Program used an online service that scans for open ports on the Internet. They scanned a set of public IPs associated with SendGrid. The scan revealed that services running on some ports between 30000 and 32767 were listening for connections.

By connecting to those open ports, the researcher found a cache server that required no authentication because of a misconfiguration in the cluster. In the cache, the researcher identified JSON text blocks that contained domains and DKIM keys belonging to some customers, whom we have since notified.
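To illustrate the kind of check described above, here is a minimal Python sketch that tests whether TCP ports in the 30000 to 32767 range accept connections on a host. It is not the tool the researcher used; the function names `is_nodeport` and `open_ports` are hypothetical, and the range constants assume the default Kubernetes NodePort range, which matches the ports mentioned in this post.

```python
import socket

# Assumed default Kubernetes NodePort range (same range cited in the post).
NODEPORT_MIN = 30000
NODEPORT_MAX = 32767


def is_nodeport(port: int) -> bool:
    """Return True if the port falls inside the assumed NodePort range."""
    return NODEPORT_MIN <= port <= NODEPORT_MAX


def open_ports(host: str, ports, timeout: float = 0.5) -> list:
    """Return the ports from `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

A call such as `open_ports("127.0.0.1", range(NODEPORT_MIN, NODEPORT_MAX + 1))` would list any listening ports in that range on your own machine. Only run a check like this against hosts you own or are authorized to test.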

The researcher promptly reached out to us via our Bug Bounty Program, and the report was immediately triaged by our Product Security team. Our Security Incident Response Team was engaged immediately, declared a security incident, and began working on the issue. Our SendGrid Network Operations team blocked Internet access to those exposed ports within an hour of the triage of the incident.

What do you need to do?

The private DKIM keys that were present in the affected cache are used by Twilio SendGrid to digitally sign emails so that mailbox providers can verify they are sent by you. While we are confident that this data has not been accessed by unauthorized third parties, out of an abundance of caution we are automatically rotating DKIM keys for customers where possible. Where keys cannot be rotated automatically, we recommend that customers rotate their own DKIM keys. If you were directly impacted, Twilio SendGrid has already emailed you these instructions.

Determining if your keys will be automatically rotated

If you set up domain authentication using Twilio SendGrid’s automated security option (the default), your keys will automatically be rotated over the next few weeks. We do not expect rotation to have any impact on mail deliverability. If you chose to set up domain authentication using manual security, you will need to rotate the keys yourself via the process described below.

To check if your domain is configured with automatic or manual security, follow the steps in this documentation for each of your domains.

PLEASE NOTE! It is possible that your account might have a combination of domain authentication configurations that use automated and manual security, so be sure to check every domain.

How to rotate manual security keys

While rotating your keys via this process will issue you a new public/private key pair, we recommend that you take this opportunity to change your domain to automated security. You can do this by creating a new domain authentication with the same domain name, but using the default automated security option. If you need to continue using manual security, follow the steps below.

The process for rotating the keys for a domain authentication configuration with manual security involves the following high-level steps:

  1. Create a new domain authentication using the same domain as your current manual security configuration.
  2. Pick a new, unique, previously-unused selector name.
  3. Add the new DNS records presented in the new domain configuration and validate the configuration.
  4. Delete the original manual security domain authentication.

What other actions are we taking?

In addition to proactively rotating exposed DKIM keys, we are putting detectors in place to notify us when similar misconfigurations occur in the future. Additionally, to better identify, prevent, and remediate misconfigurations, we frequently update, evaluate, and tune our security scanning across all Twilio infrastructure, including Twilio SendGrid.

FAQs

What is a Kubernetes cluster?

A Kubernetes cluster is a group of nodes that manage and run containerized applications, specifically Linux-based applications. Kubernetes eliminates the manual process of deploying and scaling containerized applications. It also helps in running applications written in any language.

What is a DKIM key?

DKIM stands for DomainKeys Identified Mail, a standard designed to help email providers stop malicious senders by validating that email really comes from the domain it claims.

DKIM’s function is to verify that the sender of an email is responsible for the domain the email is from and that the email has not been tampered with. For example, if the sender is Twilio SendGrid, DKIM verifies that the email was from sendgrid.com and that the domain is owned by Twilio SendGrid.

DKIM relies on public-key cryptography, which involves a private key and a public key.

The two steps for DKIM are:

  1. The sender stores a private key on their mail servers and uses it to sign each outgoing message.
  2. The receiving server retrieves the public key from the TXT record at dkimselector._domainkey.domain.com and uses it to validate the signature created with the sender’s private key.
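The lookup in step 2 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a full DKIM verifier: it only builds the DNS name and splits a record value into tags, and it performs no DNS query or signature check. The function names and the sample key value are hypothetical.

```python
def dkim_record_name(selector: str, domain: str) -> str:
    """Build the DNS name where the DKIM public key is published."""
    return f"{selector}._domainkey.{domain}"


def parse_dkim_txt(txt: str) -> dict:
    """Split a DKIM TXT record value into its tag=value pairs."""
    tags = {}
    for part in txt.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first "=" only, so base64 padding in the value survives.
        name, _, value = part.partition("=")
        # Simplification: drop all whitespace, which suits a folded base64 "p" tag.
        tags[name.strip()] = "".join(value.split())
    return tags


# Hypothetical record value; the "p" tag would hold the real base64 public key.
record = "v=DKIM1; k=rsa; p=AAAA BBBB=="
name = dkim_record_name("dkimselector", "domain.com")
tags = parse_dkim_txt(record)
```

Here `name` is the record name from step 2, and `tags["p"]` holds the public key that a receiving server would use to check the signature on the message.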

It was the private DKIM keys of the customers that were exposed.

You can find more information about DKIM in our posts here:

Were the keys abused by an external threat agent?

No, our investigation showed no external actor had access to the keys. Our threat intelligence team confirmed that there was no leak of the keys. Our logs indicated no anomalies.

How long were keys exposed?

The keys were exposed for four days. DKIM keys were added to our caching server on June 14, 2021. The misconfiguration was discovered and remediated on June 18, 2021.

Will the rotation of the key impact the delivery of my emails?

For users who configured their domain with automatic security, we intentionally asked you during the setup process to add 2 CNAME records for 2 different DKIM keys (even though we only ever use one of them at a time). This allows us to seamlessly rotate your keys without any impact to email deliverability.

For users who configured manual security, it’s important to follow the steps outlined above as those steps ensure a transition to a new key without impact to email deliverability.
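The two-record setup described above for automated security can be pictured with a small Python model. This is a conceptual sketch only, not Twilio SendGrid’s implementation: the class, the slot names `s1` and `s2`, and the key labels are all hypothetical. The idea it shows is that the standby key is already published, so signing can switch to it without any DNS change on the customer’s side.

```python
from dataclasses import dataclass, field


@dataclass
class TwoSlotKeys:
    """Two published key slots (hypothetical names); only one signs at a time."""
    slots: dict = field(default_factory=lambda: {"s1": "key-A", "s2": "key-B"})
    active: str = "s1"

    def standby(self) -> str:
        """Return the slot that is published but not currently signing."""
        return "s2" if self.active == "s1" else "s1"

    def rotate(self, fresh_key: str) -> str:
        """Switch signing to the standby slot, then refill the retired slot."""
        retired = self.active
        self.active = self.standby()
        # The retired slot keeps its DNS name; only the key behind it changes.
        # A real system would likely wait before replacing it so that mail
        # already in flight can still be verified (assumption, not modeled).
        self.slots[retired] = fresh_key
        return self.active


keys = TwoSlotKeys()
now_signing = keys.rotate("key-C")
```

After the call, signing uses the slot that was previously on standby, and the exposed key in the retired slot has been replaced with a fresh one.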