troubleshooting · Updated 2025

Data Fragmentation: When One Platform Shows as 50+ Sources in GA4

Dozens of utm_source entries for the same email platform? This data fragmentation makes reporting impossible. Here's how it happens.

8 min read

Your GA4 Traffic Acquisition report should have 5-10 main traffic sources.

Instead, you have 237 different source entries.

Mailchimp shows up 73 times with different names:

  • mailchimp-january
  • mailchimp-february
  • mailchimp-spring-campaign
  • mailchimp-newsletter-2024-03-15
  • ...69 more variations

Facebook shows up 54 times:

  • facebook-spring-sale
  • facebook-product-launch
  • facebook-retargeting-q1
  • ...51 more variations

You can't tell which platform performs best. You can't compare trends. You can't make budget decisions.

This is data fragmentation—when one traffic source splits into dozens (or hundreds) of entries because campaign details leak into utm_source.

Here's how it destroys your analytics and how to fix it.

What Is Data Fragmentation?

Data fragmentation = When a single logical entity (like "email" or "facebook") appears as many separate entries in reports.

Healthy GA4 Traffic Acquisition report:

Code
Session source    | Sessions | Conversions
------------------|----------|------------
mailchimp         | 8,200    | 187
facebook          | 6,400    | 134
google            | 12,300   | 289
linkedin          | 3,100    | 78
impact            | 1,200    | 45

5 sources. Clean. Readable. Actionable.

Fragmented GA4 Traffic Acquisition report:

Code
Session source                          | Sessions | Conversions
----------------------------------------|----------|------------
mailchimp_newsletter_jan_2024           | 234      | 5
mailchimp_newsletter_feb_2024           | 198      | 4
mailchimp_promo_spring_sale             | 521      | 14
mailchimp_promo_black_friday            | 789      | 28
mailchimp_welcome_day1                  | 67       | 2
mailchimp_welcome_day7                  | 45       | 1
mailchimp_cart_abandonment              | 203      | 8
...66 more mailchimp entries
facebook_spring_sale_2024               | 412      | 11
facebook_product_launch_video           | 298      | 7
facebook_retargeting_cart_q1            | 189      | 5
...51 more facebook entries

237 sources. Impossible to read. Can't identify top performers. Can't make decisions.

The 5 Types of Data Fragmentation

Type 1: Date-Based Fragmentation

Adding dates to utm_source creates a new entry every time period:

Weekly fragmentation:

Code
utm_source=newsletter-2024-01-08
utm_source=newsletter-2024-01-15
utm_source=newsletter-2024-01-22
utm_source=newsletter-2024-01-29

4 sources for 4 weeks of the same newsletter.

Over 1 year: 52 separate source entries for one email newsletter.

Type 2: Campaign-Name Fragmentation

Adding campaign names to utm_source:

Email campaigns:

Code
utm_source=mailchimp-spring-sale
utm_source=mailchimp-summer-sale
utm_source=mailchimp-fall-sale
utm_source=mailchimp-black-friday
utm_source=mailchimp-cyber-monday
utm_source=mailchimp-year-end-sale

6 campaigns = 6 separate sources (should be 1: "mailchimp")

Type 3: Campaign-Type Fragmentation

Adding email types or campaign categories:

Code
utm_source=newsletter-weekly
utm_source=newsletter-monthly
utm_source=promo-seasonal
utm_source=promo-flash-sale
utm_source=transactional-receipt
utm_source=transactional-shipping
utm_source=automated-cart-abandonment
utm_source=automated-welcome-series

8 sources for what should be 1 platform (e.g., "mailchimp")

Type 4: Combination Fragmentation

Combining multiple variables in utm_source:

Code
utm_source=mailchimp_newsletter_january_2024
utm_source=mailchimp_newsletter_february_2024
utm_source=mailchimp_promo_spring_march_2024
utm_source=mailchimp_promo_summer_june_2024

Each unique combination = new source entry.

Formula: [Platform]_[Type]_[Campaign]_[Date]

Over 1 year with 3 campaign types: 36+ separate sources for one email platform.
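
The growth is multiplicative, which is why the numbers climb so fast. Here is a rough sketch of the arithmetic, assuming one send per campaign type per period. The function name and its inputs are illustrative only.

```python
def fragmented_source_count(campaign_types: int, periods_per_year: int) -> int:
    """Each (type, period) pair produces its own distinct utm_source string."""
    return campaign_types * periods_per_year

# 3 campaign types, each sent monthly, all encoded in utm_source
print(fragmented_source_count(3, 12))  # 36
# 1 weekly newsletter with the send date in utm_source (Type 1)
print(fragmented_source_count(1, 52))  # 52
```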

Type 5: Typo/Inconsistency Fragmentation

Slight variations in utm_source naming:

Code
utm_source=mailchimp
utm_source=mail-chimp (hyphen)
utm_source=MailChimp (capital letters)
utm_source=mail_chimp (underscore)
utm_source=mailchimp-email (added context)
utm_source=mailchip (typo)

6 entries for the same platform due to inconsistent naming.
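
If you want to collapse these variants in an export, one possible approach is to lowercase the value, strip separators, and then snap near-misses to an approved name. This is a minimal sketch: the approved list and the 0.8 similarity cutoff are assumptions to tune against your own data, and values with added context such as mailchimp-email need the prefix handling covered later instead.

```python
import difflib
import re

APPROVED_SOURCES = ["mailchimp", "facebook", "google"]  # hypothetical approved list

def normalize_source(raw: str) -> str:
    """Lowercase, drop separators, then snap near-misses to an approved name."""
    cleaned = re.sub(r"[-_\s]", "", raw.strip().lower())
    if cleaned in APPROVED_SOURCES:
        return cleaned
    # Fuzzy match catches typos such as "mailchip"; the 0.8 cutoff is a guess.
    close = difflib.get_close_matches(cleaned, APPROVED_SOURCES, n=1, cutoff=0.8)
    return close[0] if close else cleaned

variants = ["mailchimp", "mail-chimp", "MailChimp", "mail_chimp", "mailchip"]
print({normalize_source(v) for v in variants})  # {'mailchimp'}
```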

Real Example: 341 Sources for 6 Actual Platforms

Client: E-commerce brand

Actual marketing platforms: 6

  • Mailchimp (email)
  • Klaviyo (email)
  • Facebook Ads
  • Google Ads
  • Impact (affiliates)
  • 2-3 referral partners

Expected GA4 Traffic Acquisition:

Code
Session source    | Sessions
------------------|----------
mailchimp         | 4,200
klaviyo           | 2,100
facebook          | 6,800
google            | 9,200
impact            | 1,400
partner-blog      | 340

What GA4 actually showed: 341 different sources

Mailchimp alone had 89 entries:

Code
mailchimp_newsletter_2024-01-08    | 203
mailchimp_newsletter_2024-01-15    | 189
mailchimp_newsletter_2024-01-22    | 234
...86 more mailchimp entries

Why:

  • Weekly newsletters with dates in utm_source: 52 entries
  • Promotional campaigns with campaign names: 24 entries
  • Automated emails with flow names: 13 entries

Facebook had 76 entries:

Code
facebook_spring_sale_2024          | 521
facebook_spring_sale_retargeting   | 298
facebook_product_launch_feb        | 412
facebook_product_launch_video      | 234
...72 more facebook entries

Impact of fragmentation:

  1. Couldn't answer basic questions:

    • "Which platform drives most traffic?" → Had to manually sum 341 rows
    • "Is email growing month-over-month?" → No consistent source to compare
    • "What's our best traffic source?" → Impossible to rank
  2. Budget decisions on hold:

    • CMO wanted to shift $15,000 from lowest-performing channel to best
    • Couldn't identify either because data was fragmented
  3. Report exports broken:

    • GA4 reports paginated across 35+ pages
    • CSV exports had 341 rows to clean manually
    • Looker Studio dashboards showed 100+ source entries (unreadable)

After consolidating to 6 clean sources:

  • Analysis time: 45 minutes → 5 minutes
  • Identified top channel: Email (56% of conversions)
  • Budget shift: $12,000/month from Display to Email
  • ROI improvement: 34% increase in ROAS over next quarter

Why Data Fragmentation Is Worse Than You Think

Impact 1: Trends Can't Be Compared

Question: "Is email traffic growing?"

With clean utm_source (mailchimp):

Code
Month        | Sessions | Change
-------------|----------|--------
January      | 3,200    | -
February     | 3,800    | +18.8%
March        | 4,100    | +7.9%

Trend clear: Email growing steadily.

With fragmented utm_source:

Code
January
  mailchimp_newsletter_jan           | 234
  mailchimp_promo_new_year           | 521
  mailchimp_welcome_series_jan       | 89
  ...12 more January sources

February
  mailchimp_newsletter_feb           | 198
  mailchimp_promo_valentines         | 612
  mailchimp_welcome_series_feb       | 103
  ...14 more February sources

You can't compare January vs February because the source names don't match.

You'd have to:

  1. Export all data
  2. Manually tag each source as "mailchimp"
  3. Sum by month in Excel
  4. Calculate trends

5 minutes of work becomes 2 hours.
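
If you are stuck with fragmented history, the four manual steps above can be scripted. The sketch below is one way to do it, assuming every fragmented value starts with the platform name followed by an underscore. It uses only the rows shown in the example (the elided "...more sources" rows are left out), so the totals are illustrative rather than real monthly figures.

```python
from collections import defaultdict

# Step 1: exported rows as (month, session source, sessions).
rows = [
    ("January", "mailchimp_newsletter_jan", 234),
    ("January", "mailchimp_promo_new_year", 521),
    ("January", "mailchimp_welcome_series_jan", 89),
    ("February", "mailchimp_newsletter_feb", 198),
    ("February", "mailchimp_promo_valentines", 612),
    ("February", "mailchimp_welcome_series_feb", 103),
]

def platform_of(source: str) -> str:
    """Step 2: tag a source by its prefix (assumes 'platform_' naming)."""
    return source.split("_", 1)[0]

totals = defaultdict(int)
for month, source, sessions in rows:
    if platform_of(source) == "mailchimp":
        totals[month] += sessions  # Step 3: sum by month

months = list(totals)
for prev, cur in zip(months, months[1:]):  # Step 4: month-over-month change
    change = (totals[cur] - totals[prev]) / totals[prev] * 100
    print(f"{prev} -> {cur}: {change:+.1f}%")  # January -> February: +8.2%
```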

Impact 2: Segments and Filters Are Impossible

Goal: Create GA4 segment for "All Email Traffic"

With clean utm_source:

Code
Session source = mailchimp OR sendgrid OR klaviyo

Done. 3 conditions.

With fragmented utm_source:

Code
Session source = mailchimp_newsletter_jan OR
                 mailchimp_newsletter_feb OR
                 mailchimp_promo_spring OR
                 mailchimp_promo_summer OR
                 mailchimp_welcome_day1 OR
                 mailchimp_welcome_day7 OR
                 ...83 more conditions

89 conditions just for Mailchimp.

And every new campaign requires updating the segment.

Impact 3: Automated Reports Break

Many teams set up automated GA4 reports sent weekly or monthly:

With clean sources:

  • Top 5 Traffic Sources report → Always shows the same 5 platforms
  • Email Performance report → Filter utm_source=mailchimp

With fragmented sources:

  • Top 5 Traffic Sources report → Shows random campaign-specific sources
  • Email Performance report → Matches only sessions whose utm_source is exactly mailchimp, so every variation is missed

Automated reports become unreliable.

Impact 4: Cross-Channel Attribution Fails

GA4's attribution models (data-driven, last-click, etc.) track user journeys across channels:

Example journey:

  1. User clicks Facebook ad
  2. Returns via email
  3. Converts via Google search

With clean sources:

Code
facebook → mailchimp → google (clear path)

With fragmented sources:

Code
facebook_spring_sale_2024 → mailchimp_newsletter_mar_15 → google_brand_cpc

GA4 sees these as 3 unique sources in the path. But for analysis, you want to see the journey as: Social → Email → Search.

Fragmented sources make cross-channel path analysis nearly impossible.

Impact 5: Integration Problems

CRM integration: If you sync GA4 source data to your CRM (HubSpot, Salesforce), fragmented sources create:

  • 341 different "Lead Source" values in the CRM
  • Can't report on "Leads from Email" because email has 89 different source names
  • Salesforce reports show 300+ picklist values (unusable)

Data warehouse/BI tools: If you export GA4 data to BigQuery, Snowflake, or Looker, fragmented sources require:

  • Complex SQL CASE statements to group sources
  • Regular expression matching (error-prone)
  • Manual mapping tables that need constant updates

How to Diagnose Data Fragmentation

Test 1: Count Unique Sources (2 minutes)

  1. GA4 → Reports → Acquisition → Traffic acquisition
  2. Primary dimension: Session source
  3. Scroll through pages, counting entries

Benchmark:

  • Healthy: 5-20 unique sources (small business), 20-50 (enterprise)
  • Moderate fragmentation: 50-100 sources
  • Severe fragmentation: 100+ sources
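
Counting by hand gets tedious past a few pages, so here is a rough sketch that counts unique sources in an exported CSV and applies the benchmark bands. It assumes the export has a "Session source" column and that any comment lines at the top of the file have been removed. How the boundary values 50 and 100 are treated is also an assumption.

```python
import csv
import io

def count_unique_sources(csv_text: str, column: str = "Session source") -> int:
    """Count distinct non-empty values in the source column of a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return len({row[column].strip() for row in reader if row[column].strip()})

def fragmentation_level(unique_sources: int) -> str:
    """Apply the benchmark bands; boundary handling (50, 100) is an assumption."""
    if unique_sources >= 100:
        return "severe"
    if unique_sources > 50:
        return "moderate"
    return "healthy"

export = "Session source,Sessions\nmailchimp,8200\nfacebook,6400\ngoogle,12300\n"
n = count_unique_sources(export)
print(n, fragmentation_level(n))  # 3 healthy
```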

Test 2: Pattern Recognition (3 minutes)

  1. GA4 → Explore → Free Form
  2. Dimension: Session source
  3. Metric: Sessions
  4. Sort by sessions (descending)
  5. Export CSV

Look for repeating patterns:

Code
mailchimp_[ANYTHING]    ← 73 rows
facebook_[ANYTHING]     ← 54 rows
google_[ANYTHING]       ← 42 rows

If you see the same platform prefix repeated dozens of times with different suffixes, you have fragmentation.
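
The same pattern check can be run on the exported list instead of by eye. A minimal sketch, assuming the platform name is whatever comes before the first hyphen or underscore (the sample values are taken from earlier in this article):

```python
import re
from collections import Counter

def prefix_counts(sources: list[str]) -> Counter:
    """Count sources per leading token (text before the first '-' or '_')."""
    return Counter(re.split(r"[-_]", s.lower(), maxsplit=1)[0] for s in sources)

sources = [
    "mailchimp_newsletter_jan_2024",
    "mailchimp_promo_spring_sale",
    "mailchimp-january",
    "facebook_spring_sale_2024",
    "facebook-product-launch",
    "google",
]
for prefix, n in prefix_counts(sources).most_common():
    print(f"{prefix}_[ANYTHING]  <- {n} rows")
```

Any prefix with a high count is a candidate for consolidation.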

Test 3: Platform Aggregation (5 minutes)

  1. Export source data from GA4 Explore
  2. In Excel/Sheets:
    • Column A: Session source (as-is from GA4)
    • Column B: Extract platform name (remove dates, campaigns, etc.)
    • Column C: Count rows per platform

Example:

Session source             | Platform  | Count
---------------------------|-----------|------
mailchimp_newsletter_jan   | mailchimp | 1
mailchimp_newsletter_feb   | mailchimp | 2
mailchimp_promo_spring     | mailchimp | 3
...86 more mailchimp rows  | mailchimp | 89

If "Count" exceeds 10 for any platform, you likely have fragmentation.
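
If you would rather skip the spreadsheet, columns B and C can be approximated in a few lines. This sketch assumes you already know which platform names to look for. The platform list and the synthetic sample data are placeholders, and the threshold of 10 comes from the rule of thumb above.

```python
from collections import defaultdict

KNOWN_PLATFORMS = ["mailchimp", "facebook", "google"]  # hypothetical list
THRESHOLD = 10  # rule of thumb from the text above

def platform_for(source: str) -> str:
    """Column B: first known platform name found in the source, else 'other'."""
    lowered = source.lower()
    return next((p for p in KNOWN_PLATFORMS if p in lowered), "other")

def fragmented_platforms(sources):
    """Column C: distinct source strings per platform, kept if over the threshold."""
    variants = defaultdict(set)
    for s in sources:
        variants[platform_for(s)].add(s)
    return {p: len(v) for p, v in variants.items() if len(v) > THRESHOLD}

# Synthetic input: 12 dated mailchimp variants plus two clean sources.
sample = [f"mailchimp_newsletter_2024-{m:02d}" for m in range(1, 13)]
sample += ["facebook", "google"]
print(fragmented_platforms(sample))  # {'mailchimp': 12}
```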

The Fix: Consolidate utm_source to Platform Names

Step 1: Audit Current utm_source Values

Export all utm_source values from GA4 and identify the platform behind each:

Code
Current utm_source              | Actual Platform
--------------------------------|------------------
mailchimp_newsletter_jan        | mailchimp
mailchimp_promo_spring          | mailchimp
sendgrid_transactional_receipt  | sendgrid
facebook_spring_sale_2024       | facebook
linkedin_webinar_q1             | linkedin

Step 2: Define Standard utm_source for Each Platform

Choose ONE utm_source value per platform:

Platform                  | Standard utm_source
--------------------------|--------------------
Mailchimp email platform  | mailchimp
SendGrid email platform   | sendgrid
Klaviyo email platform    | klaviyo
Facebook Ads              | facebook
Google Ads                | google
LinkedIn Ads              | linkedin
Impact affiliate network  | impact

Step 3: Move Campaign Details to utm_campaign

Before (fragmented):

Code
utm_source=mailchimp_newsletter_spring_2024
utm_medium=email
utm_campaign=march

After (consolidated):

Code
utm_source=mailchimp
utm_medium=email
utm_campaign=newsletter-spring-2024-march

All campaign context moves to utm_campaign.
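
For links you already have in templates or spreadsheets, the move can be scripted. The sketch below is one possible approach, assuming the fragmented utm_source is underscore-delimited with the platform as the first token. Hyphenated values such as mailchimp-spring-sale would need their own rule. The function name and example.com URL are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def consolidate_utm(url: str) -> str:
    """Keep only the platform in utm_source; move the rest into utm_campaign."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    source = params.get("utm_source", "")
    platform, _, details = source.partition("_")
    if details:
        params["utm_source"] = platform.lower()
        # Prefix the existing campaign value with the relocated details.
        pieces = [details.replace("_", "-"), params.get("utm_campaign", "")]
        params["utm_campaign"] = "-".join(p for p in pieces if p)
    return urlunsplit(parts._replace(query=urlencode(params)))

before = ("https://example.com/?utm_source=mailchimp_newsletter_spring_2024"
          "&utm_medium=email&utm_campaign=march")
print(consolidate_utm(before))
# https://example.com/?utm_source=mailchimp&utm_medium=email&utm_campaign=newsletter-spring-2024-march
```

A URL whose utm_source is already clean passes through unchanged.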

Step 4: Update All Campaign Templates

Update every tool that generates UTM parameters:

Email platforms:

  • Mailchimp: Settings → Tracking → utm_source default = mailchimp
  • SendGrid: Settings → Tracking → utm_source = sendgrid
  • Klaviyo: Settings → UTM Tracking → utm_source = klaviyo

Social schedulers:

  • Hootsuite: Settings → Link Tracking → utm_source = [platform-token]
  • Buffer: Settings → Default Parameters → utm_source = dynamic platform name

Ad platforms:

  • Google Ads: Tracking template → utm_source=google
  • Facebook Ads: URL parameters → utm_source=facebook


Prevention: 5 Rules to Avoid Fragmentation

Rule 1: Platform Name Only in utm_source

utm_source = platform or vendor name, nothing else:

  • ✅ mailchimp
  • ✅ facebook
  • ✅ impact
  • ❌ mailchimp-spring-campaign
  • ❌ facebook-ads-2024

Rule 2: Campaign Details Go in utm_campaign

All campaign-specific information belongs in utm_campaign:

  • ✅ utm_campaign=newsletter-spring-sale-2024
  • ✅ utm_campaign=facebook-product-launch-video
  • ❌ utm_source=newsletter-spring-sale (wrong parameter)

Rule 3: Use Exact Same Spelling Every Time

Consistency prevents typo-based fragmentation:

  • ✅ Always mailchimp (lowercase, no hyphen)
  • ❌ mailchimp, MailChimp, mail-chimp, mail_chimp (creates 4 sources)

Rule 4: Automate utm_source (Don't Type Manually)

Set utm_source as a default in your tools instead of typing it for every campaign:

  • Email platform default settings
  • Social scheduler templates
  • URL shortener defaults
  • UTM parameter snippet library

Manual typing = typos = fragmentation.
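
For teams that build links in scripts or spreadsheets, a small helper can play the role of the "snippet library". This is a minimal sketch under assumed names: the defaults table and build_url function are hypothetical, and the point is only that utm_source is looked up rather than typed.

```python
from urllib.parse import urlencode

# Hypothetical per-platform defaults: set once, reuse for every campaign.
PLATFORM_DEFAULTS = {
    "mailchimp": {"utm_source": "mailchimp", "utm_medium": "email"},
    "klaviyo": {"utm_source": "klaviyo", "utm_medium": "email"},
}

def build_url(base_url: str, platform: str, campaign: str) -> str:
    """Build a tagged URL; utm_source comes from the defaults, never from typing."""
    if platform not in PLATFORM_DEFAULTS:
        raise ValueError(f"Unknown platform: {platform!r}")
    params = {**PLATFORM_DEFAULTS[platform], "utm_campaign": campaign}
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{urlencode(params)}"

print(build_url("https://example.com/sale", "mailchimp", "promo-spring-sale"))
# https://example.com/sale?utm_source=mailchimp&utm_medium=email&utm_campaign=promo-spring-sale
```

An unrecognized platform name such as "mail-chimp" raises an error instead of silently creating a new source.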

Rule 5: Quarterly Fragmentation Audit

Every 3 months:

  1. Export GA4 source data
  2. Count unique sources
  3. Identify fragmented platforms
  4. Consolidate for future campaigns

FAQ

Can I fix historical fragmented data in GA4?

No. GA4 doesn't allow retroactive data changes. Historical data stays fragmented. Focus on fixing future campaigns. Over 6-12 months, clean data will accumulate and fragmented data will age out of default reporting windows.

What if I need to track different email types (newsletter vs promotional)?

Use utm_campaign or utm_content to differentiate:

Code
Newsletter:
utm_source=mailchimp
utm_medium=email
utm_campaign=newsletter-weekly

Promotional:
utm_source=mailchimp
utm_medium=email
utm_campaign=promo-spring-sale

Both show as "mailchimp" source, but you can filter by campaign name for granular analysis.

How do I aggregate fragmented historical data for reports?

Create a GA4 custom channel group with rules that match each platform's source pattern, or use regex in BigQuery:

BigQuery example:

Sql
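CASE
  -- (?i) makes the match case-insensitive; [-_]? also catches mail-chimp and mail_chimp
  WHEN REGEXP_CONTAINS(source, r'(?i)mail[-_]?chimp') THEN 'mailchimp'
  WHEN REGEXP_CONTAINS(source, r'(?i)facebook') THEN 'facebook'
  ELSE source  -- leave unmatched sources unchanged
END AS consolidated_source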

This groups all mailchimp-* sources as "mailchimp" for reporting.

Does this apply to utm_medium and utm_campaign too?

utm_medium: Should also be consistent (e.g., always email, not newsletter or e-mail)

utm_campaign: Can and should change per campaign—this is where details belong. Fragmentation here is expected and useful.

What if different teams manage different platforms and use different naming?

Create a company-wide UTM naming convention document with approved utm_source values:

Code
Platform          | utm_source Value | Owner
------------------|-----------------|--------
Mailchimp         | mailchimp       | Marketing
SendGrid          | sendgrid        | Product
Facebook Ads      | facebook        | Paid Social Team
Google Ads        | google          | SEM Team

Require all teams to use the approved list.
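
An approved list is easier to enforce when links are checked before they go out. Here is a rough sketch of such a check, using the utm_source values from the table above. The function name and sample URLs are illustrative, and matching is exact and case-sensitive on purpose.

```python
from urllib.parse import parse_qs, urlsplit

APPROVED_SOURCES = {"mailchimp", "sendgrid", "facebook", "google"}

def unapproved_sources(urls):
    """Return (url, utm_source) pairs whose source is not on the approved list."""
    problems = []
    for url in urls:
        values = parse_qs(urlsplit(url).query).get("utm_source", [])
        for value in values:
            if value not in APPROVED_SOURCES:  # exact, case-sensitive match
                problems.append((url, value))
    return problems

urls = [
    "https://example.com/?utm_source=mailchimp&utm_campaign=newsletter-weekly",
    "https://example.com/?utm_source=MailChimp",
    "https://example.com/?utm_source=facebook-spring-sale",
]
for url, value in unapproved_sources(urls):
    print(f"Not approved: {value}")
```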


Related: Campaign Details in Source Rule Documentation