About the Attack
The Sysdig threat research team recently uncovered a sophisticated global campaign dubbed EMERALDWHALE that stole 15,000+ cloud credentials by exploiting exposed Git config files. The campaign used an array of private tools to abuse misconfigured repositories and extract sensitive information such as cloud credentials and other confidential data. The incident underscores how important it is to secure Git repositories against data breaches and account takeover.
- Attackers used automated tools and scripts to scan for exposed Git repository configuration files, .git directories, and Laravel .env files, paving the way for credential theft.
- Attackers were able to clone 10,000 private repositories and extract credentials from their source code, potentially for use in phishing and spam campaigns.
- Stolen data was stored in a publicly accessible S3 bucket, making it easy for the EMERALDWHALE threat actors to retrieve.
- The stolen credentials belonged to Cloud Service Providers (CSPs), email providers, and other cloud-centric service providers.
Sysdig’s senior research director, Michael Clark, told The Register that threat actors are using the stolen credentials primarily for phishing and spamming purposes. Additionally, the stolen credentials can be sold individually for hundreds of dollars per account.
"There's a lot of value – $500, $600, $700 – to these credentials," Clark explained.
As The Hacker News reported, EMERALDWHALE used a prior victim's Amazon S3 storage bucket to store stolen credentials and scan data. After the credential theft operation was exposed, the S3 bucket was taken down to prevent further misuse.
Git Background
Git is a version control system that is widely used in the software development industry. It lets developers manage software projects and collaborate on them with others. However, its widespread adoption has also made it a prime target for threat actors who exploit vulnerabilities and misconfigurations to steal sensitive data.
Exposed Git config files and directories are attractive targets for data thieves like EMERALDWHALE because they hold rich data, such as commit messages, commit history, email addresses, usernames, passwords, and API keys.
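To illustrate how credentials end up in a Git config file, here is a minimal sketch that scans the text of a .git/config file for remote URLs with embedded usernames or tokens. The function name, the sample config, and the host git.example.com are hypothetical and for illustration only; this is not part of the EMERALDWHALE toolset or the Sysdig analysis.

```python
import re
from urllib.parse import urlsplit

# Matches "url = <value>" lines as they appear in a .git/config file.
URL_LINE = re.compile(r"^\s*url\s*=\s*(\S+)\s*$", re.MULTILINE)

# Hypothetical sample config; the token is a placeholder, not a real secret.
SAMPLE = (
    "[core]\n"
    "\trepositoryformatversion = 0\n"
    '[remote "origin"]\n'
    "\turl = https://deploy-bot:EXAMPLE_TOKEN@git.example.com/acme/app.git\n"
    "\tfetch = +refs/heads/*:refs/remotes/origin/*\n"
    '[remote "mirror"]\n'
    "\turl = git@git.example.com:acme/app.git\n"
)


def find_embedded_credentials(config_text):
    """Return (host, username) pairs for HTTP(S) remotes that embed userinfo."""
    findings = []
    for raw_url in URL_LINE.findall(config_text):
        parts = urlsplit(raw_url)
        # Userinfo in an HTTP(S) remote usually means a password or token
        # is stored in plain text next to the repository.
        if parts.scheme in ("http", "https") and parts.username:
            findings.append((parts.hostname, parts.username))
    return findings


if __name__ == "__main__":
    for host, user in find_embedded_credentials(SAMPLE):
        print(f"credential for user {user!r} embedded in remote on {host}")
```

SSH-style remotes such as the "mirror" entry are not flagged, because they carry no secret in the URL itself.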
The Attack Methodology
BleepingComputer reported, citing Sysdig, that the log data recovered from the S3 bucket showed extensive scanning for credentials. The scanning took place from August to September and targeted servers with exposed repository config files. EMERALDWHALE scanned large portions of the internet; scanning on this scale has become easier with widely available open-source tools such as httpx.
The following diagram shows the outline of the git config breach:
(Image source: Sysdig EMERALDWHALE Report)
Attack Steps:
- Start with an extensive list of IP address ranges as input for the EMERALDWHALE toolset.
- The automated toolset discovers relevant hosts within those IP address ranges.
- Once relevant hosts are identified, the toolset extracts credentials from them.
- Obtained credentials are then validated for their authenticity and functionality.
- Once validated, the stolen tokens are used to clone repositories from compatible Git services, including both private and public repositories.
- The toolset then downloads the exposed repositories and searches them for useful information or vulnerabilities.
- Finally, all results from this process, including stolen credentials and scanned data, are uploaded to a designated S3 bucket for storage and potential further exploitation.
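From a defender's point of view, the discovery step above can be turned into a self-audit: check whether hosts you own and are authorized to test serve the same files the campaign looked for. The sketch below is a hedged illustration using only the Python standard library; the function names and the content heuristics are assumptions, not a description of how the EMERALDWHALE tools work.

```python
import urllib.error
import urllib.request

# Paths named in this article: Git config files and Laravel .env files.
SENSITIVE_PATHS = ("/.git/config", "/.env")


def looks_exposed(path, status, body):
    """Heuristic check: does this HTTP response look like a real leaked file?"""
    if status != 200:
        return False
    if path.endswith("/.git/config"):
        # Genuine Git config files normally contain a [core] section.
        return "[core]" in body
    if path.endswith("/.env"):
        # Assumed heuristic: Laravel-style KEY=value lines with common prefixes.
        prefixes = ("APP_", "DB_", "AWS_")
        return any(
            line.startswith(prefixes) and "=" in line
            for line in body.splitlines()
        )
    return False


def audit_host(base_url, timeout=5.0):
    """Return the sensitive paths that base_url serves publicly.

    Run this only against hosts you own or are authorized to test.
    """
    exposed = []
    for path in SENSITIVE_PATHS:
        url = base_url.rstrip("/") + path
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                status = resp.status
                body = resp.read(4096).decode("utf-8", errors="replace")
        except (urllib.error.URLError, OSError):
            continue  # unreachable or error status: treat as not exposed
        if looks_exposed(path, status, body):
            exposed.append(path)
    return exposed
```

Checking the response body, not just the status code, avoids false positives on servers that return a 200 page for every path.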
Tools used to carry out the attack
Two toolsets were used in the Git config breach, both built from scanning and exploitation scripts that target exposed Git repositories:
- MZR V2 (MIZARU): A collection of Python and shell scripts that accepts a list of IP addresses as input and scans them to find relevant hosts for further exploitation.
- Seyzo-v2: Similar to MZR, this tool is also a collection of scripts used to steal credentials. It includes a script that uses httpx to find exposed Git configuration files and generate a target list.
Both tools are available for purchase on underground marketplaces, often accompanied by complete training courses. For as little as $75, buyers can learn how to use these tools to create spam and phishing campaigns, as shown in the diagram below.
(Image source: Sysdig EMERALDWHALE Report)
Attack Capture Analysis
The attack captured data from many repositories whose Git config files were exposed. Most of them belonged to the major Git services such as GitHub, Bitbucket, and GitLab. Sysdig analyzed around 6,000 GitHub tokens and found that approximately 2,000 of them were valid credentials. The diagram below illustrates these valid credentials as identified through token analysis.
(Image source: Sysdig EMERALDWHALE Report)
Finally, the following diagram shows the subset of credential theft for each repository that was exposed during this operation.
(Image source: Sysdig EMERALDWHALE Report)
Causes of the Git Config Breach
The following are the primary causes of the Git config breach, which could have been avoided had the corresponding safeguards been in place.
- Unrestricted Access: Open public access to .git directories and misconfigured web servers allowed attackers to access the repositories and steal sensitive information.
- Non-Verified Repository Permissions: Permissions for repository access were not properly verified, allowing threat actors to access and steal data from the code. Regularly reviewing and updating permissions and privileges could have prevented this breach.
- No Multi-Factor Authentication (MFA): The stolen cloud credentials were not protected with MFA, which led to exploitation and large-scale data scraping from configuration files.
- Non-Rotating Credentials: Timely audits and credential rotation could have limited the data exposure. Rotating credentials regularly would have shortened the window during which an exposed credential could be used to gain access.
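As a rough illustration of the rotation point above, the following sketch flags credentials that are older than a rotation policy allows. The 90-day limit, the inventory format, and the credential names are assumptions chosen for the example; in practice the creation dates would come from your cloud provider or secrets manager.

```python
from datetime import datetime, timedelta, timezone

# Assumed policy for this example; choose a value that fits your organization.
MAX_AGE = timedelta(days=90)


def credentials_due_for_rotation(inventory, now=None, max_age=MAX_AGE):
    """Return the names of credentials older than max_age, sorted by name.

    inventory maps a credential name to its timezone-aware creation time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return sorted(
        name for name, created in inventory.items() if now - created > max_age
    )


if __name__ == "__main__":
    # Hypothetical inventory for illustration.
    inventory = {
        "ci-deploy-token": datetime(2024, 1, 10, tzinfo=timezone.utc),
        "s3-backup-key": datetime(2024, 9, 1, tzinfo=timezone.utc),
    }
    print(credentials_due_for_rotation(inventory))
```

A check like this can run on a schedule so that stale credentials are caught before they are exposed rather than after.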
Ways to Stay Secure from Malicious Online Activities
The following are some of the proven ways to stay safe and secure from malicious online activities:
- Use Secure Web Gateway (SWG): A secure web gateway helps protect against credential theft. An SWG protects an organization's internet traffic by inspecting and filtering web traffic, filtering URLs, controlling web and application access, and blocking phishing and spam attempts.
- Deploy Enterprise Firewall: An enterprise firewall is designed to protect the devices and networks of an organization against many attack vectors. These highly intelligent firewalls block unauthorized access to the enterprise network and control the traffic by only allowing trusted data and blocking malicious data from entering or leaving the company.
- Regularly Update and Patch Systems: Regularly updating and patching the organization's systems, such as networking devices, firewalls, servers, and critical applications, substantially reduces the attack surface because commonly found vulnerabilities and exploits are fixed.
- Implement Endpoint Detection and Response (EDR) Tools: EDR tools can help detect and respond to cyber-attacks in near real time. They use AI and advanced machine learning to continuously and automatically monitor and analyze endpoint activity such as processes, file execution, user login attempts, and network traffic. This level of protection significantly reduces the chances of data theft.
- Utilize Managed Cyber Security Services (MSS): Managed cyber security services can provide an organization with 24/7 monitoring and threat detection. MSS is a comprehensive suite of security services offered by managed security service providers (MSSPs), which oversee and manage an organization's security posture on its behalf. They act as an extension of the organization's own security team and provide a broad range of security solutions to safeguard the company's assets from cyber threats.
- Enforce Multi-Factor Authentication (MFA) Policies: MFA adds another layer of security on top of strong passwords by requiring users to provide an additional form of identification before they are granted access. The additional factor is often a code with a time-based expiry, so a stolen password alone is not enough, which makes MFA very effective against credential-based attacks.
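To show what "time-based expiry" means in practice, here is a minimal sketch of a time-based one-time password (TOTP) as specified in RFC 6238, using HMAC-SHA-1 and a 30-second step. It is an illustration only, not a production MFA implementation; the function names are chosen for this example, and the secret in the usage lines is the published RFC test value, not a real key.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at=None, step=30, digits=6):
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA-1)."""
    if at is None:
        at = time.time()
    # The code depends on which time window we are in, not the exact second.
    counter = int(at) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): pick 4 bytes based on the last nibble.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret, candidate, at=None, step=30):
    """Check a 6-digit code against the current time window only."""
    return hmac.compare_digest(totp(secret, at, step), candidate)


if __name__ == "__main__":
    secret = b"12345678901234567890"  # RFC 6238 test secret, not a real key
    code = totp(secret, at=59)
    print(verify(secret, code, at=59))   # same 30-second window
    print(verify(secret, code, at=120))  # later window: the code has expired
```

Because each code is valid only for its time window, a code captured by an attacker stops working shortly afterward, unlike a static password.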