A Measure of Motive: How Attackers Weaponize Digital Analytics Tools

Adrian McCabe, Ryan Tomcik, Stephen Clement


Introduction

Digital analytics tools are vital components of the vast domain that is modern cyberspace. From system administrators managing traffic load balancers to marketers and advertisers working to deliver relevant content to their brand’s biggest fan base, tools like link shorteners, location trackers, CAPTCHAs, and digital advertising platforms each play their part in making information universally accessible and useful to all.

However, just as these tools can be used for good, they can also be used for malicious purposes. Mandiant and Google Cloud researchers have witnessed threat actors cleverly repurposing digital analytics and advertising tools to evade detection and amplify the effectiveness of their malicious campaigns.

This blog post dives deep into the threat actor playbook, revealing how these tools can be weaponized by attackers to add malicious data analytics (“malnalytics”) capabilities to their threat campaigns. We’ll expose the surprising effectiveness of these tactics and arm defenders with detection and mitigation strategies for their own environments.

Get Shor.ty

First entering the scene around the year 2000 and steadily gaining in popularity ever since, link shorteners have become a fairly ubiquitous utility for life on the Internet. In addition to the popular link shortening services like bit.ly and rb.gy, large technology companies like Amazon (a.co) and Google (goo.gl) also have (or had, in Google’s case) their own link shortening structures and schemas. In the legitimate advertising and marketing sense, link shorteners are typically used as a mechanism to track things like click-through rates on advertisements, or to reduce the likelihood that a complicated URL with parameterized arguments will get mangled when being shared. However, link shorteners and link shortening services have also been used by threat actors (MITRE ATT&CK Technique T1608.005) to obscure the URLs of malicious landing pages, and Mandiant has observed threat actors using link shorteners to redirect victims during the initial access phase of an attack chain. Some recent examples include: 

  • A link shortener service used by UNC1189 (also known as “MuddyWater”) in spring of 2022 to funnel users to a phishing lure document hosted on a cloud storage provider.

  • A set of SMS phishing campaigns orchestrated by a financially motivated threat actor between spring of 2021 and late 2022, which leveraged link shorteners to funnel users through a nested web of device, location, and browser checks to a set of forms that ultimately attempt to steal credit card information.

  • A malvertising campaign in spring of 2023 that leveraged a link shortener to track click-through data for Dropbox URLs hosting malware payloads. 

Behind the ma.sk

To demonstrate the capabilities of a link shortener service from a threat actor perspective, the service bit.ly will be featured in this blog post. Originally made popular on X (formerly Twitter) around 2008, bit.ly remains a popular link shortening solution. Like most modern software-as-a-service (SaaS) platforms, bit.ly offers multiple subscription levels based around levels of usage and feature availability (Figure 1).

Figure 1: bit.ly subscription page

In an attempt to avoid direct attribution, threat actors may use fake or stolen personal and/or payment information to complete the registration for such a subscription or service. Once the setup process has been completed, attackers can begin to generate shortened links (Figure 2).

Figure 2: bit.ly destination URL configuration

Figure 3: bit.ly customized URL configuration

As part of some bit.ly subscription levels, custom fields can be appended to URLs as parameters to gain further insights into their associated activity (see the “Custom URL parameter name” field and value pair in Figure 4). This feature set is obviously quite beneficial for social media brand influencers, marketers, and advertisers, but attackers can use this functionality to get added insights into their campaign activities.

In this fictitious example, let’s say an attacker intends to use a shortened bit.ly link as part of a larger SMS phishing campaign targeting phone numbers within the “703” area code. When opened, the link will direct users to an attacker-controlled fake payment site enticing the user to pay urgent outstanding invoices.

The attacker can configure parameters (Figure 4) to generate an Urchin Tracking Module (UTM) URL specific to this component of the phishing campaign (Figure 5) for tracking purposes. This bit.ly article contains more information on the legitimate use of these types of URL data fields.

Figure 4: Customized UTM parameter configuration

Figure 5: Parameterized URL structure with UTM fields

Though attackers typically would not have such fields in the URL parameters for their campaign infrastructure as overtly labeled as the example in Figure 5, the effectiveness of leveraging such online marketing integrations and data fields is readily apparent. In this scenario: 

  • Source is a designator for a list of active phone numbers that can receive SMS messages. While the list itself and the infrastructure to send the messages would reside outside of bit.ly, bit.ly can be used to correlate corresponding click-through activity through these URL parameters.
  • Medium is the mechanism by which a victim would be exposed to the link. In this case, “sender_1” would be a way for the attacker to correlate the downstream victim to the phone number in the attacker’s infrastructure that originally sent them the message.
  • Campaign is the aggregated bucket of related activity visible within bit.ly. In bit.ly, an individual campaign can have many different links tied to it, but the associated activity can be tracked concurrently.
  • Term is an optional field that has a legitimate use for mapping search engine keywords or terms to strategically placed bit.ly links by advertisers.
  • Custom URL parameter name – targeting_area_code, 703: This is an entirely customized bit.ly field included for the purposes of this scenario that signifies which area code the attacker will be targeting with this specific link. In this case, the attacker will be targeting residents of the Washington, D.C., metropolitan area in Northern Virginia.
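
From a defender's point of view, the same fields can be pulled back out of an observed link during triage. The following is a minimal sketch using the standard URL API. The destination domain and the utm_source and utm_campaign values are hypothetical placeholders (only “sender_1” and targeting_area_code=703 come from the scenario above), and the utm_ prefixes are assumed from the common UTM naming convention rather than taken from Figure 5.

```javascript
// Extract UTM-style and custom tracking parameters from a URL for triage.
// The URL used below is a hypothetical stand-in for the structure in Figure 5.
function extractTrackingParams(rawUrl) {
  const url = new URL(rawUrl);
  const tracking = {};
  // Copy every query-string parameter into a plain object
  for (const [name, value] of url.searchParams) {
    tracking[name] = value;
  }
  return { host: url.hostname, tracking };
}

const observed =
  "https://fake-payment-site.example/invoice" +
  "?utm_source=sms_list_a&utm_medium=sender_1" +
  "&utm_campaign=invoice_lure&targeting_area_code=703";

const result = extractTrackingParams(observed);
console.log(result.host, result.tracking);
```

Parameters such as a sender identifier or a targeted area code appearing in a link delivered by SMS can be a useful pivot when grouping related messages into a single campaign.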

After these parameters are selected and the bit.ly links are fully configured, attackers can put their links into action. Once a campaign is underway and links are distributed through their medium of choice, attackers can monitor the activity to their shortened links using a dashboard interface (Figure 6).

Figure 6: bit.ly click-through analytics dashboard

Defending Against Attacks Leveraging Link Shorteners

Given the fairly ubiquitous nature of link shorteners, unilaterally blocking them from use within an environment is generally inadvisable as this decision would likely impact both productivity and user experience. Instead, defenders should consider implementing some form of automated analysis around them that has the ability to detect behavioral conditions, such as:

  • If the shortened URL goes to a second/nested shortened URL on different infrastructure
  • If the same shortened URL has appeared multiple times in a short timespan in telemetry data associated with different hosts within an environment
  • If the URL goes directly to an executable or archive file on a cloud-hosting service or a file with a “non-standard” file type (e.g., .REV file)
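
The conditions above can be sketched as simple checks over telemetry that has already been collected. This is an illustrative sketch only: the shortener list, the file extensions, and the record shapes (`chain`, `events`) are assumptions, and the file-type check looks only at the final path rather than confirming that the destination is a cloud-hosting service.

```javascript
// Hypothetical lists; tune both for your own environment.
const SHORTENER_HOSTS = new Set(["bit.ly", "rb.gy", "a.co", "goo.gl"]);
const RISKY_EXTENSIONS = [".exe", ".zip", ".iso", ".rev"];

// chain: ordered list of URLs observed while following a shortened link
function flagRedirectChain(chain) {
  const flags = [];
  if (chain.length === 0) return flags;
  const hosts = chain.map((u) => new URL(u).hostname.toLowerCase());
  // A shortener that redirects to a second shortener on a different host
  for (let i = 1; i < hosts.length; i++) {
    if (
      SHORTENER_HOSTS.has(hosts[i - 1]) &&
      SHORTENER_HOSTS.has(hosts[i]) &&
      hosts[i] !== hosts[i - 1]
    ) {
      flags.push("nested_shortener");
      break;
    }
  }
  // Final destination path ends in an executable, archive, or unusual extension
  const finalPath = new URL(chain[chain.length - 1]).pathname.toLowerCase();
  if (RISKY_EXTENSIONS.some((ext) => finalPath.endsWith(ext))) {
    flags.push("risky_file_type");
  }
  return flags;
}

// events: [{ ts: epoch ms, host: "ws-01", url: "https://bit.ly/abc" }]
// Returns true if shortUrl was seen on at least minHosts distinct hosts
// within any window of windowMs milliseconds.
function seenOnManyHosts(events, shortUrl, windowMs, minHosts) {
  const hits = events
    .filter((e) => e.url === shortUrl)
    .sort((a, b) => a.ts - b.ts);
  for (let i = 0; i < hits.length; i++) {
    const hostsInWindow = new Set();
    for (let j = i; j < hits.length && hits[j].ts - hits[i].ts <= windowMs; j++) {
      hostsInWindow.add(hits[j].host);
    }
    if (hostsInWindow.size >= minHosts) return true;
  }
  return false;
}
```

Resolving the redirect chain itself is left out on purpose, since how that is done safely (sandbox, proxy logs, URL expansion service) depends on the environment.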

Additionally, it’s possible to identify suspicious behavioral patterns in network telemetry that may indicate link shortener abuse. As part of this exercise, we reviewed the network telemetry associated with two simulated attack chains leveraging a bit.ly URL as an Initial Infection Vector (IIV) and identified some viable elements of the traffic around which to potentially build detections or hunting strategies:

Attack Configuration: bit.ly -> Credential Harvesting Page (afakeloginpage[.]xyz)

Network Requests:

  • 00:00:00 – init Client Hello (TLS), bit.ly
  • 00:00:00 – init DNS resolution request, afakeloginpage[.]xyz

Hunting Strategy: In bit.ly’s particular case, there is minimal delay (milliseconds) between the time a host initiates a connection via Client Hello and the time that the host initiates the DNS resolution for its final destination. If any DNS resolution telemetry is evident for a suspicious domain within such close proximity to bit.ly traffic (particularly for domains with non-standard TLDs like “.site,” “.xyz,” “.top,” or “.lol”), consider investigating the activity further.

Attack Configuration: bit.ly -> zip file hosted on Google Drive

Network Requests:

  • 00:00:00 – init Client Hello (TLS), bit.ly
  • 00:00:00 – DNS resolution request, drive.google[.]com
  • 00:00:00 – Client Hello, drive.google[.]com
  • 00:00:00 – DNS resolution request, drive[.]usercontent[.]google[.]com

Hunting Strategy: Similar to the aforementioned example, there is minimal delay (milliseconds) between the time a host initiates a connection via Client Hello for bit.ly and when it attempts to connect to and/or make domain resolutions for the domains drive.google.com and drive.usercontent.google.com. Any occurrence of these three domains being accessed from a given host in quick succession likely means that a remote file was accessed via a bit.ly link, and additional investigation into the associated host may be warranted. This detection approach can also be generalized by looking for the co-occurrence of network requests for a bit.ly URL followed by a domain categorized by a firewall or proxy device as online storage or file sharing.

Table 1: Simulated bit.ly attack telemetry analysis
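
The first hunting strategy in Table 1 can be approximated with a time-window join over network telemetry. The sketch below assumes a hypothetical event shape and a one-second window (the source only states that the delay is on the order of milliseconds). The TLD list is taken from the examples in the table.

```javascript
// TLDs treated as suspicious, taken from the examples in Table 1.
const SUSPICIOUS_TLDS = [".site", ".xyz", ".top", ".lol"];

// events: [{ ts: epoch ms, host: "ws-01",
//            type: "tls_client_hello" | "dns_query", name: "bit.ly" }]
// Returns DNS queries for suspicious TLDs made by the same host within
// windowMs after a TLS Client Hello to bit.ly.
function findShortenerFollowOns(events, windowMs = 1000) {
  const findings = [];
  const hellos = events.filter(
    (e) => e.type === "tls_client_hello" && e.name === "bit.ly"
  );
  for (const hello of hellos) {
    for (const e of events) {
      const deltaMs = e.ts - hello.ts;
      if (
        e.type === "dns_query" &&
        e.host === hello.host &&
        deltaMs >= 0 &&
        deltaMs <= windowMs &&
        SUSPICIOUS_TLDS.some((tld) => e.name.endsWith(tld))
      ) {
        findings.push({ host: e.host, domain: e.name, deltaMs });
      }
    }
  }
  return findings;
}
```

The same join could be pointed at domains categorized as online storage or file sharing to cover the second row of the table, with the category lookup supplied by the firewall or proxy.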

The World in a String: Weaponized IP Geolocation Utilities

IP geolocation utilities can be used legitimately by advertisers and marketers to gauge the geo-dispersed impact of advertising reach and the effectiveness of marketing funnels (albeit with varying levels of granularity and data availability). However, Mandiant has observed IP geolocation utilities used by attackers (MITRE ATT&CK Technique T1614). Some real-world attack patterns that Mandiant has observed leveraging IP geolocation utilities include:

  • Malware payloads connecting to geolocation services for infection tracking purposes upon successful host compromise, such as with the Kraken Ransomware. This allows attackers a window into how fast and how far their campaign is spreading.
  • Malware conditionally performing malicious actions based on IP geolocation data. This functionality allows attackers a level of control around their window of vulnerability and ensures they do not engage in “friendly fire” if their motivations are geo-political in nature, such as indiscriminate nation-state targeting by hacktivists. An example of this technique can be seen in the case of the TURKEYDROP variant of the Adwind malware, which attempts to surgically target systems located in Turkey. 
  • Threat actors placing access restrictions on phishing lure pages and second-stage malware downloads based on IP ranges (a feature of the Caffeine PhaaS platform). This allows attackers a limited defensive mechanism against having their campaign infrastructure identified and mitigated too rapidly.

Though elegantly simple, these capabilities are vital for attackers to gain insights into their active campaigns and to prolong their campaigns’ duration and effectiveness.

How2DoUn2Others

Though there are many examples of IP-based geolocation utilities that have been used by attackers, for illustrative purposes the example shown here will use ip2location.io.

Figure 7: Ip2Location.io subscription page

ip2location.io has a fairly robust feature set (Figure 7) with a free version offering a dedicated API key with respectable limits and upper tier subscriptions offering progressively granular insights into the IP address query results that would be useful to attackers. Using ip2location.io, it is possible to determine things like: 

  • If the connecting entity’s IP address falls within an IP netblock owned by a specific company
  • Currency associated with the locale of the connecting entity
  • If the connecting entity is using a VPN
  • If the connecting entity is using Tor

From an attacker perspective, a primary function of leveraging this type of tooling is integrating it with programmatic actions to both optimize targeting and evade detection. In the following example code snippet, a simple webpage can be configured with JavaScript to perform a lookup using the ip2location API and redirect users to different pages based on their locale or connection type. If the user is connecting from a country outside the United States, it will show them an otherwise innocuous page. If the user is connecting from inside the U.S. and is not using a VPN or Tor (in contrast to some analysis sandbox environments), then they will be directed to a malicious webpage. If they are using a VPN or Tor, they will be shown an error page.

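<script type="module">
// Query the ip2location.io API for the connecting client's IP attributes
let raw_response = await fetch('https://api.ip2location.io/?key=<key>&format=json');
let response_text = await raw_response.text();
let parsed_json = JSON.parse(response_text);

if (parsed_json["country_code"] == "US") {
        // U.S.-based connection: VPN and Tor users are shown an error page
        if (parsed_json["proxy"]["is_tor"] == true || parsed_json["proxy"]["is_vpn"] == true)
                document.location = 'error.html';
        else
                document.location = 'evilpage.html';
}
else
        // Connections from outside the U.S. are shown an innocuous page
        document.location = 'nothingburger.html';
</script>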

Though the previously shown example is configured to simply route connecting users to different pages based on their connection attributes, it also has the potential to be surprisingly effective at thwarting automated analysis tools. This sort of technique is particularly applicable to regional phishing attacks that target specific geo-dispersed companies or campaigns that target users in certain geographic regions.

Defending Against Attacks Leveraging IP Geolocation Utilities

While IP geolocation utilities commonly appear on legitimate websites, it is less likely that such a methodology would be used programmatically by non-browser processes on endpoints, such as individual workstations. This is good news for defenders, as detection and hunting efforts can primarily focus on correlating observed URL-based telemetry data with anomalous events in endpoint telemetry.

For example, a simulated attack script can be seen in the following PowerShell code snippet using the ip2location.io service:

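# Query ip2location.io for the geolocation of the host's public IP address
$Response = Invoke-WebRequest -UseBasicParsing -URI "https://api.ip2location.io/?key=<key>"

# IndexOf returns -1 when the substring is absent, so any other value indicates a U.S.-based IP
if ($Response.Content.IndexOf('"country_code":"US"') -ne -1){
        $EvilScript = 'echo "<raw bytes of evil file to drop on disk>" >> C:\TEMP\out.tmp'
        iex $EvilScript
}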

This command leverages PowerShell to programmatically connect to ip2location.io, determine if the host is connecting via a U.S.-based IP address, and, if so, drop the file “C:\TEMP\out.tmp” to disk.

In network-based telemetry, the User-Agent for the PowerShell Invoke-WebRequest function is clearly identified. Thus, a behavioral network detection for the PowerShell User-Agent connecting to ip2location.io could be created to identify this activity. While this is a fairly narrow detection, the concept can be widened by defenders based on the size of their environment and their level of noise tolerance.
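
As a rough illustration of that detection idea, the sketch below filters proxy log records for a PowerShell User-Agent contacting the geolocation API. The record shape is hypothetical, and the pattern assumes the default Invoke-WebRequest User-Agent, which contains “WindowsPowerShell/<version>” or “PowerShell/<version>”. An attacker can override this value with the -UserAgent parameter, so this is a starting point rather than a complete detection.

```javascript
// Matches the default PowerShell User-Agent token (assumption: not overridden).
const POWERSHELL_UA = /PowerShell\//i;

// Geolocation API hosts to watch; widen this set based on noise tolerance.
const GEO_API_HOSTS = new Set(["api.ip2location.io"]);

// records: [{ host: "ws-01", userAgent: "...", destHost: "api.ip2location.io" }]
function findPowerShellGeoLookups(records) {
  return records.filter(
    (r) =>
      POWERSHELL_UA.test(r.userAgent || "") &&
      GEO_API_HOSTS.has((r.destHost || "").toLowerCase())
  );
}
```

Widening the detection could be as simple as adding other IP geolocation API hosts to the set, at the cost of more results to review.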

Doing the CAPTCHA-cha Slide: Evading Detection with Bot Classification Tools

CAPTCHA, which is short for Completely Automated Public Turing test to tell Computers and Humans Apart, was developed to prevent bots and automated activity from accessing and interacting with web forms and hosted resources. Implementations of CAPTCHA technology, such as Google’s reCAPTCHA or Cloudflare’s Turnstile, are used as a security measure to filter out unwanted bot activity while permitting human users to access websites and interact with forms and other elements of a webpage (e.g., HTML buttons). Traditionally, CAPTCHA security challenges have required users to solve a visual puzzle or perform a brief interactive task. More recent implementations perform passive score-based detection to identify bot activity based on behavioral characteristics.

Figure 8: Evolution of reCAPTCHA

While intended to address the issue of malicious activity, CAPTCHA technology has been co-opted for use by threat actors to evade detection and scanning of their malicious infrastructure and payloads by security tools (MITRE ATT&CK T1633.001). For example, threat actors have been observed using free CAPTCHA services to prevent dynamic access and detonation actions that are typically performed by email security technologies to determine if a URL is malicious. This provides threat actors with the ability to allow human users to access a phishing page while screening out programmatic activity and the usage of data transfer tools like cURL (Figure 9).

Figure 9: CAPTCHA victim flow

Mandiant has tracked UNC5296 abusing Google Sites services as early as January 2024 to host CAPTCHA challenges that redirect users to download a ZIP archive. The ZIP archive contains a malicious LNK file masquerading as a PDF file from a financial institution that, once executed, leads to the deployment of either AZORULT or DANCEFLOOR. Mandiant has also identified FIN11 using CAPTCHA challenges as part of a phishing campaign in June 2020 to deliver the FRIENDSPEAK downloader and MIXLABEL backdoor.

Defending Against Attacks Using Bot Classification Tools

CAPTCHA tools have an extensive, legitimate use on the Internet, which makes it challenging to detect when they’re being used for malicious purposes. CAPTCHA widgets are easily implemented within a website using a few lines of HTML to reference the corresponding JavaScript resource and a unique site key that’s associated with the user who registered the CAPTCHA challenge.

<html>
    <head>
        <title>reCAPTCHA Test</title>
        <script src="https://www.google.com/recaptcha/api.js"></script>
        <script>
        // Callback invoked by reCAPTCHA once the challenge is passed
        function passRedirect() {
            window.location.href = "https://www.youtube.com/watch?v=dQw4w9WgXcQ";
        }
        </script>
    </head>
    <body>
        <div class="g-recaptcha" data-sitekey="<removed>" data-callback="passRedirect"></div>
    </body>
</html>

When the CAPTCHA challenge is implemented within an intermediate webpage, defenders can use the network requests for the CAPTCHA JavaScript API files as potential detection or enrichment opportunities.

CAPTCHA Technology: reCAPTCHA v2

Network Requests:

  • 00:00:00 – screening website accessed
  • 00:00:00 – www.google.com/recaptcha/api.js
  • 00:00:00 – www.gstatic.com/recaptcha/releases/vjbW55W42X033PfTdVf6Ft4q/recaptcha__en.js
  • 00:00:20 – www.google.com/recaptcha/api2/anchor?ar=1&k=<unique reCAPTCHA sitekey>&co=<snip>
  • 00:00:52 – redirection to website after passing CAPTCHA

Detection Strategy: Look for suspicious proxy or firewall events occurring within 1 second of requests for www.google.com and www.gstatic.com, and further refine based on the URI(s) if TLS decryption is available. Potentially include a suspicious proxy or firewall event for the redirection domain occurring within 1 minute of the previous sequence.

CAPTCHA Technology: Cloudflare Turnstile

Network Requests:

  • 00:00:00 – screening website accessed
  • 00:00:00 – challenges.cloudflare.com/turnstile/v0/api.js
  • 00:00:20 – redirection to website after passing CAPTCHA

Detection Strategy: Look for suspicious proxy or firewall events occurring within 1 second of a request for challenges.cloudflare.com.

Table 2: Simulated CAPTCHA telemetry analysis
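
The detection strategies in Table 2 amount to pairing an already-suspicious event with a CAPTCHA API request from the same host inside a short window. The sketch below is one possible shape for that logic. The event fields, including the `suspicious` flag assumed to be set by an upstream proxy or firewall rule, are hypothetical, and for simplicity it treats a request to any one of the listed hosts as sufficient rather than requiring both www.google.com and www.gstatic.com for reCAPTCHA.

```javascript
// Hosts serving the CAPTCHA JavaScript resources listed in Table 2.
const CAPTCHA_API_HOSTS = new Set([
  "www.google.com",
  "www.gstatic.com",
  "challenges.cloudflare.com",
]);

// events: [{ ts: epoch ms, host: "ws-07", destHost: "...", suspicious: bool }]
// Returns "host -> destHost" strings for suspicious events that occurred
// within windowMs (before or after) of a CAPTCHA API request by the same host.
function findCaptchaGatedSites(events, windowMs = 1000) {
  const captchaHits = events.filter((e) => CAPTCHA_API_HOSTS.has(e.destHost));
  const findings = new Set();
  for (const e of events) {
    if (!e.suspicious) continue;
    const nearCaptcha = captchaHits.some(
      (c) => c.host === e.host && Math.abs(c.ts - e.ts) <= windowMs
    );
    if (nearCaptcha) findings.add(`${e.host} -> ${e.destHost}`);
  }
  return [...findings];
}
```

Because www.google.com in particular is requested constantly in most environments, the quality of the upstream `suspicious` signal is what keeps this kind of correlation from becoming noisy.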

A Real Jack-Ads: Stealing What Works for Loopholes and Profit

In practice, marketers have many variables to consider when running an ad campaign. There is the content of the ad itself (e.g., text, video, images), the demographics of the intended audience, the geolocation of where the ad will be displayed, and the time of day it will be displayed, among many other factors. Starting a new ad campaign often requires experimentation and refinement on the part of the marketers to find an ad “formula” that best aligns with the product or service they are trying to advertise. 

To get a head start on the process of digital advertising refinement, marketers can use competitive intelligence tools to see what ads their competitors are running. Depending on the tool, marketers can see keywords tied to their competition’s ads, the websites and applications the ad appeared on, media types associated with the ads (e.g., video, text, images), the landing pages users were shown after they clicked the ad, and many other notable advertising insights. One of the more well-known and robust tools for this is AdBeat. Google and Meta also maintain their own ad repositories. These Search Engine Marketing (SEM) tools can provide insights to threat actors looking to set up malicious or dubious advertising campaigns (MITRE ATT&CK Technique T1583.008), including advertisement geolocation and effective keyword usage to circumvent Google Ads policies (Figure 10 and Figure 11).

Figure 10: Geolocation competitive intel tool functionality mentioned on the blackhatworld.com forum

Figure 11: Recommendation for competitive intel tools to assist with keyword refinement mentioned on the blackhatworld.com forum

Crafting a Malvertising Victim Flow

To illustrate how competitive intelligence tools can be used by threat actors, we’ll explore the steps involved in planning, staging, and executing a malvertising campaign based on a real-life campaign that was investigated and actioned by Google Ads threat researchers. An example of a process threat actors can use to create malvertising campaigns is outlined in Figure 12.

Figure 12: Steps for setting up a malvertising campaign

Copying an Ad That Works: What Can Marketers See?

Much like their legitimate marketer counterparts, a typical starting point for attackers looking to launch a malvertising campaign is deciding which advertising keywords will attract the highest number of potential victims. Using keyword research features available in some Search Engine Marketing tools, a threat actor would be able to see how many users have historically interacted with ads related to specific keywords.

For example, based on the data available within one competitive intelligence tool, in June 2024, an estimated 220,000 clicks originated from relevant ads associated with the keywords “advanced ip scanner” from multiple domains, including two — “ktgotit[.]com” and “advanced-ip-scanner[.]com” — that did not have any associated traffic in June 2024 but have historically been associated with the same keywords. Then, in correlating this data with historical ads featuring the domain ktgotit[.]com, the following ad could be identified as a viable one for mimicking by an attacker*:

Figure 13: Ad snippet, ktgotit[.]com

*Using one of the competitive intelligence tools, SEM data associated with ktgotit[.]com indicates that the ad in Figure 13 may have generated an estimated 3,000 visits at a cost of just under $7,000.

Typically, malicious advertisers employ several techniques in crafting their malicious ad content, including:

  • Consciously avoiding any mention of legitimate brands in their advertising text to avoid keyword flags
  • Creating a landing page with a domain name unrelated to the original product, service, or brand they are seeking to emulate
  • Creating a “fake site” with a fictitious e-commerce brand
  • Using cloaked pages, a technique that uses a combination of connection origination checks, device profiling, and page redirects between an initial landing page URL and its final destination in an attempt to conceal malicious activity

Additional insights like whether other malvertising campaigns are making it past moderation filters with misspellings of an official or legitimate website or using entirely unrelated web domains can also be helpful to attackers. Attackers can use this information to craft a convincing landing page that is shown to the user immediately after the ad is clicked to entice the user to move further into the victim flow.

The Clone Wars

Armed with the strategic insights gathered from competitive intelligence tools, a would-be attacker could confirm that mirroring the victim flow previously used by the ktgotit[.]com malvertising campaign would be an effective strategy to expose their malicious ads to a high number of potential victims for a reasonable price. Thus, the attacker may then decide to:

  • Purchase and configure hosting domains for their landing pages and payloads (the details of which are platform-dependent)
  • Generate and host a landing page (competitive intelligence tools may integrate this in the service)
  • Configure their cloaking page redirects and/or hosted distribution payloads (if applicable)
  • Purchase advertising space for their keywords and deploy their ad (also platform-dependent)

At that point all that would be left for the attacker to do is to watch their analytics traffic and wait for victims!

Epilogue: We Done ktgotit

In the case of the ad directing users to ktgotit[.]com (Figure 13), the malware author used an e-commerce “decoy” page with cloaking to circumvent traditional automated analysis techniques and to conceal the final destination URL serving the malicious content. However, even cloaking mechanisms can be defeated (much to the lament of the Romulans), and in the case of ktgotit[.]com, Google threat researchers were able to determine the final destination URL for the page was hxxps://aadvanced-ip-scanner[.]com.

Figure 14: Landing page linked to ad, ktgotit[.]com

Figure 15: Recreated lure page shown only to connections that successfully pass the verification checks on ktgotit[.]com

In this scenario, the “Free Download” link in Figure 15 led to a download for a malicious archive file named “Advanced_IP_Scanner_v.3.5.2.1.zip” (MD5: “5310d6b73d19592860e81e4e3a5459eb”) from the URL “hxxps://britanniaeat[.]com/wp-includes/Advanced_IP_Scanner_v.3.5.2.1.zip”.

Defending Against Advertising Attacks

Ad networks should aim to respond quickly to new abuse tactics. Once an abuse methodology is known by one threat actor, it will soon become known by many.

For enterprises, a simple and proactive solution is to consider elevating your environment’s default browser security settings for everyday browsing. Most modern browsers seek to balance usability and security in the automated protective measures they enable by default (such as Google Safe Browsing). In some enterprise environments, these can be elevated past default levels without much noticeable impact on overall user experience.

Individual users who click on ads or links in ads should double-check the website address (URL) of the destination to make sure it matches the company or product in the ad and doesn’t contain typos. This is especially important on phones, where the URL bar might be hidden. In the example shown in Figure 13, the URL for the ad was “ktgotit[.]com” and the landing page content matched the domain shown in the ad (i.e., ktgotit). Yet, the benign landing page showed dubiously formatted product details for loosely related products that all purported to be affiliated with different manufacturers, and the malicious page (protected by cloaking mechanisms) did not have a domain that matched the one shown in the ad (Figure 13).
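
The manual check described above (does the destination match the ad, and does it contain typos?) can be approximated programmatically. The following is a minimal sketch using Levenshtein edit distance. The function names, the distance threshold of 2, and the notion of a caller-supplied “expected” domain are assumptions for illustration, and the domains are kept defanged as they appear in this post.

```javascript
// Levenshtein edit distance using a single rolling row.
function editDistance(a, b) {
  const prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = prev[0]; // value from the previous row, previous column
    prev[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const tmp = prev[j];
      prev[j] = Math.min(
        prev[j] + 1, // deletion
        prev[j - 1] + 1, // insertion
        diag + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution or match
      );
      diag = tmp;
    }
  }
  return prev[b.length];
}

// adDomain: domain shown in the ad
// landingDomain: domain the browser ended up on
// expectedDomain: domain the user expects for the product (caller-supplied)
function checkDestination(adDomain, landingDomain, expectedDomain) {
  const distance = editDistance(
    landingDomain.toLowerCase(),
    expectedDomain.toLowerCase()
  );
  return {
    matchesAd: adDomain.toLowerCase() === landingDomain.toLowerCase(),
    // Close to, but not identical to, the expected domain
    lookalike: distance > 0 && distance <= 2,
  };
}

const verdict = checkDestination(
  "ktgotit[.]com",
  "aadvanced-ip-scanner[.]com",
  "advanced-ip-scanner[.]com"
);
console.log(verdict);
```

In this example the final destination neither matches the advertised domain nor the expected one, and it sits a single character away from the latter, which is the pattern a user or an automated check would want to treat as a warning sign.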

Users are also encouraged to double-check URLs prior to downloading files from domains that were sponsored by web advertisements. As demonstrated in Mandiant’s “Opening a Can of Whoop Ads: Detecting and Disrupting a Malvertising Campaign Distributing Backdoors,” users were led to believe the files they were downloading were affiliated with unclaimed funds from the “Treasury Department.”

Google encourages users to report any ads they think may violate their policies or harm users so they can review and take action as needed. This article contains more guidance on how to report ads.

Indicators of Compromise

Filename: Advanced_IP_Scanner_v.3.5.2.1.zip
MD5: 5310d6b73d19592860e81e4e3a5459eb
Description: Malicious archive file

URL: hxxps://ktgotit[.]com
IP Address: 172.67.216[.]166 (Cloudflare Netblock)
Description: Malvertising landing page

URL: hxxps://aadvanced-ip-scanner[.]com
IP Address: 82.221.136[.]1
Description: Cloaked lure page

URL: hxxps://britanniaeat[.]com/wp-includes/Advanced_IP_Scanner_v.3.5.2.1.zip
IP Address: 3.11.24[.]22 (Amazon Netblock)
Description: Malware download URL

Conclusion

In a digital world where every click leaves a trace, the line between data analytics tooling for marketing demographics and malware attack campaign optimization has become dangerously blurred. As the capabilities of legitimate tooling increase, so too will the capabilities of threat actors who choose to use them for nefarious purposes. However, as the practical examples throughout this blog post demonstrate, understanding how attackers use these tools and taking proactive steps to mitigate or eliminate their effects makes a viable and impactful defense achievable.

Special Acknowledgments

Adrian McCabe would like to thank Joseph Flattery for his subject matter expertise on digital marketing tools.

The authors would like to thank Mandiant Advanced Practices for their in-depth review of associated threat indicators.