Sudden “Traffic From China” Spikes: What’s Really Going On (And How To Fix It)

In the last few days many SEOs have noticed unusual visit spikes labeled as coming from China. In most cases this isn’t “real growth,” but automated activity: referral spam, headless crawlers, uptime/scanner bots, or aggressive scraping routed via Chinese (or “CN-ASN”) IP ranges/CDNs. Below you’ll find a fast decision tree, GA4 recipes, server-log patterns, and WAF rules to triage and contain it, plus a checklist to keep your data clean going forward.


1) What people are seeing

Across SEO/X threads and private chats, site owners reported:

  • Hourly spikes that don’t match content releases or marketing campaigns

  • Sessions with 0:00 engagement time, 100% bounce rate, and one page per session

  • Countries reporting as China even for sites with no CN audience

  • Referrers that look odd, empty, or repeated

  • User agents that scream headless (or no UA at all)

This typically points to automated traffic, not real users.


2) The 5 most common causes

  1. Referral spam (aka “ghost” spam)
    Fake hits injected into GA4 or requests with forged referrers to get your attention or links.

  2. Headless scanners & uptime bots
    Health checks or scanners pinging your URLs at scale, sometimes via offshore IP pools.

  3. Scrapers & LLM/AI collectors
    Scraping waves that ignore robots.txt or come via unusual ASNs/CDNs.

  4. Mis-attributed geolocation
    Proxies/VPNs/CDNs can map IPs to China even when the origin is elsewhere.

  5. Server-level probing
    Bursts on wp-login.php, xmlrpc.php, sitemap.xml, or feed endpoints that inflate “traffic.”


3) Quick triage: a 10-minute decision tree

A. Is it only in GA4 (but not in server logs)?
→ Likely ghost/referral spam. Apply GA4 filters (below) and mark as spam.

B. Do server logs show real requests with weird UAs/IP ranges?
→ It’s bot traffic. Create WAF rules by ASN/IP, UA, path, and rate-limit.

C. Is the traffic hitting a small URL set (home, feed, sitemap)?
→ Add targeted rules and caching. Consider honeypots to fingerprint offenders.

D. Is engagement zero and referrer empty?
→ Treat as non-human. Segment out for reporting; don’t celebrate.
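
The four checks above can be written down as a small function, which is handy if you want to log each incident the same way. This is only a sketch: the `SpikeEvidence` fields and the `triage` name are made up for illustration, and the booleans are whatever you concluded from GA4 and your logs.

```python
from dataclasses import dataclass


@dataclass
class SpikeEvidence:
    """Hypothetical record of what you observed during the spike."""
    in_ga4: bool             # spike is visible in GA4 reports
    in_server_logs: bool     # matching requests exist in the access logs
    odd_user_agents: bool    # headless, scripted, or empty UAs in the logs
    few_urls_targeted: bool  # hits concentrate on home, feed, sitemap, ...
    zero_engagement: bool    # engagement time is effectively zero
    empty_referrer: bool     # referrer is missing


def triage(e: SpikeEvidence) -> list[str]:
    """Return the suggested actions, in the order of checks A to D."""
    actions = []
    if e.in_ga4 and not e.in_server_logs:
        actions.append("ghost/referral spam: filter in GA4")
    if e.in_server_logs and e.odd_user_agents:
        actions.append("bot traffic: WAF rules by ASN/IP, UA, path + rate-limit")
    if e.few_urls_targeted:
        actions.append("targeted rules + caching on hot URLs; consider a honeypot")
    if e.zero_engagement and e.empty_referrer:
        actions.append("non-human: segment out of reporting")
    return actions
```

More than one branch can fire at once, which is why the function returns a list rather than a single verdict.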


4) GA4: segments, filters & views that actually help

4.1 Build a “Suspicious CN” segment

  • Include: Country = China

  • AND: Engagement time < 3s OR Events/session = 1 OR Screen resolution = (not set)

  • Optional add: Browser = (not set). GA4 has no raw user-agent dimension, so look for Headless/crawler UAs in your server logs instead

Use it to compare Landing Pages, Referrers, and the Tech > Overview report.
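
If you export sessions (for example to CSV or BigQuery) and want to apply the same segment logic offline, it reduces to one predicate. A minimal sketch, assuming each row is a dict with the hypothetical keys `country`, `engagement_seconds`, `events_per_session`, and `screen_resolution`; your export will name these differently.

```python
def is_suspicious_cn(row: dict) -> bool:
    """Country = China AND (short engagement OR single event OR no resolution).

    The dict keys are illustrative, not GA4 field names.
    """
    if row.get("country") != "China":
        return False
    return (
        row.get("engagement_seconds", 0.0) < 3
        or row.get("events_per_session", 0) == 1
        or row.get("screen_resolution", "(not set)") == "(not set)"
    )
```

Note the grouping: the country condition is required, while any one of the three weak signals is enough to flag the row. Missing fields are treated as suspicious by default.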

4.2 Traffic filters (admin)

  • Data Filters → Internal / Developer: exclude your own IPs and uptime services

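  • List unwanted referrals (web data stream → tag settings): add known spam domains you see in GA4. This stops them from being credited as referrals; it does not remove the hits themselves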

  • Custom channel group: GA4’s default channel group can’t be edited, so add a custom channel group with rules that keep weird referrers out of Organic/Social

4.3 Custom exploration: “CN Spike Explorer”

  • Rows: Landing Page + Query String

  • Columns: Hour

  • Filters: Country = China, Engagement time < 3s
    You’ll instantly spot if a single endpoint is under attack.


5) Server-log forensics (Nginx/Apache)

Search patterns that shout “bot wave”:

  • Bursting: many hits per second from the same /24 or ASN

  • Uniform UA: python-requests, curl, Go-http-client, HeadlessChrome, or an empty UA (logged as "-")

  • Path focus: /, /feed, /sitemap.xml, /wp-json/, /xmlrpc.php

  • No assets: HTML requested, but no CSS/JS/image fetches

If you can, enrich IPs with ASN and country to block at the edge.
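
To make the patterns above concrete, here is a rough Python sketch that groups requests by /24 and flags networks that hit hard but never fetch an asset. It assumes the common "combined" access-log format and IPv4 clients; the `min_hits` threshold and the function names are illustrative, so tune them to your traffic.

```python
import re
from collections import Counter, defaultdict
from ipaddress import ip_network

# Combined log format: ip - - [ts] "METHOD path proto" status bytes "ref" "ua"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<ref>[^"]*)" "(?P<ua>[^"]*)"'
)
ASSET_RE = re.compile(r"\.(css|js|png|jpe?g|gif|svg|webp|ico|woff2?)(\?|$)", re.I)


def suspects(lines, min_hits=50):
    """Return (network, hits, most common UA) for /24s with no asset fetches."""
    hits = Counter()
    assets = Counter()
    uas = defaultdict(Counter)
    for line in lines:
        m = LOG_RE.match(line)
        if not m or ":" in m["ip"]:  # skip unparsable lines and IPv6 clients
            continue
        try:
            net = str(ip_network(f"{m['ip']}/24", strict=False))
        except ValueError:  # first field was a hostname, not an IP
            continue
        hits[net] += 1
        if ASSET_RE.search(m["path"]):
            assets[net] += 1
        uas[net][m["ua"]] += 1
    return [
        (net, n, uas[net].most_common(1)[0][0])
        for net, n in hits.most_common()
        if n >= min_hits and assets[net] == 0
    ]
```

Typical use would be `suspects(open("access.log", errors="replace"))`. Real browsers behind a caching CDN can also look asset-free, so treat the output as a shortlist to inspect, not a blocklist.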

One-liners (illustrative):

 
# Top talkers to your homepage (counts the whole file; trim access.log to the last hour first if needed)
awk '$7=="/" {print $1}' access.log | sort | uniq -c | sort -nr | head

# Suspect headless or scripted UAs
grep -Ei "Headless|curl|python|Go-http-client|requests" access.log | head
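
The awk one-liner counts hits across the whole file. If you want a true "last hour" view, the timestamp has to be parsed; a small Python sketch of that, again assuming the combined log format with whitespace-separated fields (the same layout in which awk's `$7` is the path). The `top_talkers` name and its parameters are illustrative.

```python
from collections import Counter
from datetime import datetime, timedelta

TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # e.g. 10/Nov/2025:10:00:00 +0000


def top_talkers(lines, now, path="/", window=timedelta(hours=1), limit=10):
    """Count client IPs requesting `path` within `window` before `now`.

    `now` must be timezone-aware, because log timestamps carry an offset.
    """
    counts = Counter()
    for line in lines:
        f = line.split()
        if len(f) < 7:
            continue
        try:
            # f[3] is "[10/Nov/2025:10:00:00" and f[4] is "+0000]"
            ts = datetime.strptime(f"{f[3][1:]} {f[4][:-1]}", TS_FORMAT)
        except ValueError:
            continue
        if f[6] == path and timedelta(0) <= now - ts <= window:
            counts[f[0]] += 1
    return counts.most_common(limit)
```

Call it with something like `top_talkers(open("access.log"), datetime.now().astimezone())`.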

6) Edge/WAF rules that work (Cloudflare-style examples)

Block bad UAs

 
(http.user_agent contains "HeadlessChrome") or (http.user_agent contains "python-requests") or (http.user_agent contains "curl")

Rate-limit hot paths

 
^/$|^/feed/?$|^/sitemap\.xml$|^/wp-json/|^/xmlrpc\.php$
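
If you don’t have an edge product, or want to see what a rate-limit rule actually does, here is a minimal in-process sketch of a sliding-window limiter applied only to the hot paths. The pattern below is an anchored variant of the path list (so that `/` means the homepage only, not every URL ending in a slash); `max_hits`, `window_s`, and the class name are assumptions, not recommended values.

```python
import re
import time
from collections import defaultdict, deque

# Anchored so "/" matches the homepage only.
HOT_PATHS = re.compile(r"^/$|^/feed/?$|^/sitemap\.xml$|^/wp-json/|^/xmlrpc\.php$")


class SlidingWindowLimiter:
    """Allow at most `max_hits` hot-path requests per client per window."""

    def __init__(self, max_hits=30, window_s=60.0):
        self.max_hits = max_hits
        self.window_s = window_s
        self._hits = defaultdict(deque)  # client -> timestamps of allowed hits

    def allow(self, client, path, now=None):
        if not HOT_PATHS.search(path):
            return True  # only hot paths are limited
        now = time.monotonic() if now is None else now
        q = self._hits[client]
        while q and now - q[0] >= self.window_s:
            q.popleft()  # drop hits that fell out of the window
        if len(q) >= self.max_hits:
            return False
        q.append(now)
        return True
```

The `client` key can be an IP, a /24, or an ASN, depending on how widely the wave is spread. State here lives in one process; behind several workers you would need a shared store.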

ASN / Country throttling (use with care)

  • If Country = CN and the zero-engagement pattern is observed → rate-limit for 7 days rather than blocking outright

  • Maintain an allowlist for good bots (Googlebot, Bingbot via rDNS)
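
The rDNS check for good bots is a two-step lookup: reverse-resolve the IP, confirm the hostname belongs to the crawler’s domain, then forward-resolve that hostname and confirm it maps back to the same IP. A sketch using Python’s standard `socket` module; the suffix list covers Googlebot and Bingbot only, and the injectable `reverse`/`forward` parameters are there purely so the logic can be exercised without network access.

```python
import socket

# Hostname suffixes for Googlebot and Bingbot; extend for other crawlers.
GOOD_BOT_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def is_verified_search_bot(
    ip,
    reverse=socket.gethostbyaddr,
    forward=socket.gethostbyname_ex,
):
    """Forward-confirmed reverse DNS check (IPv4 only in this sketch)."""
    try:
        host = reverse(ip)[0].rstrip(".").lower()
        if not host.endswith(GOOD_BOT_SUFFIXES):
            return False
        # The forward lookup must include the original IP, otherwise
        # anyone controlling their own PTR record could pass.
        return ip in forward(host)[2]
    except OSError:  # covers socket.herror and socket.gaierror
        return False
```

A user agent that says "Googlebot" proves nothing on its own, so run this before allowlisting and cache the result, since DNS lookups on every request would be slow.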

Honeypot trick
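Add a link to a decoy URL (e.g., /__botcheck) in your markup and hide it with CSS so humans never click it; any client that fetches it gets flagged and challenged for 24h. Disallow the decoy in robots.txt so well-behaved crawlers don’t trip it.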
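
The bookkeeping behind the honeypot is small: remember who touched the decoy and until when. A minimal in-memory sketch, assuming the `/__botcheck` path from the example and a 24-hour flag; the class and method names are made up, and in production the state would live at the edge or in a shared cache.

```python
import time

HONEYPOT_PATH = "/__botcheck"
FLAG_TTL_S = 24 * 3600  # challenge flagged clients for 24 hours


class HoneypotFlags:
    """Track clients that fetched the decoy URL."""

    def __init__(self, ttl_s=FLAG_TTL_S):
        self.ttl_s = ttl_s
        self._flagged_until = {}  # client -> expiry timestamp

    def observe(self, client, path, now=None):
        """Call for every request; flags the client if it hit the decoy."""
        now = time.time() if now is None else now
        if path.split("?", 1)[0] == HONEYPOT_PATH:
            self._flagged_until[client] = now + self.ttl_s

    def should_challenge(self, client, now=None):
        now = time.time() if now is None else now
        until = self._flagged_until.get(client)
        if until is None:
            return False
        if now >= until:
            del self._flagged_until[client]  # flag expired
            return False
        return True
```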


7) Keep your reports clean (and truthful)

  • Separate “Human” vs “All traffic” dashboards in Looker Studio, so the business won’t mistake bot surges for growth.

  • Annotate the timeline (e.g., “Bot wave filtered from Nov 9–12”).

  • Use server-side tracking for critical KPIs (conversions); it is harder to spoof than client-side-only tracking.


8) Checklist (copy-paste)

  • GA4 segment “Suspicious CN” created

  • Unwanted referrals updated

  • Internal/dev traffic excluded

  • WAF rules for UA + paths + gentle CN throttling

  • Rate-limit sitemap/feed/home bursts

  • Logs reviewed; top ASNs identified

  • Separate Human-Only dashboard shipped

  • Incident annotated


9) Expert notes — Stefano Galloni

“Don’t overreact with country-wide blocks. Start with rate-limits and path-specific rules; keep first-party analytics trustworthy. If you see repeated scraping, consider decoy feeds, ETag tricks, and token-gated JSON for high-value endpoints.”
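
As one reading of “token-gated JSON,” the page could embed a short-lived signed token that the JSON endpoint then requires, so a scraper hitting the endpoint directly gets nothing. This is a sketch of that idea, not necessarily the scheme Stefano has in mind: the `SECRET`, the token layout, and both function names are assumptions.

```python
import hashlib
import hmac
import time

SECRET = b"replace-with-a-real-secret"  # hypothetical; load from config


def _sign(path, expires):
    msg = f"{path}:{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def issue_token(path, ttl_s=300, now=None):
    """Return "<expiry>.<signature>" bound to one endpoint path."""
    now = int(time.time() if now is None else now)
    expires = now + ttl_s
    return f"{expires}.{_sign(path, expires)}"


def check_token(path, token, now=None):
    """True only for an unexpired token issued for this exact path."""
    now = int(time.time() if now is None else now)
    try:
        expires_s, sig = token.split(".", 1)
        expires = int(expires_s)
    except ValueError:
        return False
    expected = _sign(path, expires)
    # Constant-time comparison avoids leaking the signature via timing.
    return hmac.compare_digest(sig.encode(), expected.encode()) and now < expires
```

A determined scraper can still load the page first to harvest a token, so this raises the cost of scraping rather than preventing it.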


10) FAQ

Is this real users from China?
Usually not. Patterns (zero engagement, single pageview, uniform UA) are classic bot footprints.

Should I block China entirely?
Only as a last resort. Prefer rate-limits + UA/path rules. You might have legitimate CN visitors or crawlers.

Why did GA4 show “organic”?
Misclassified traffic or spoofed referrers. Fix with unwanted referral entries and channel rules.

Will this hurt SEO?
No, provided you don’t block legitimate search bots. Verify Googlebot/Bingbot via reverse DNS before blocking.


CTA / Credits

If you’re seeing something similar and need help tuning filters or WAF rules, ping Stefano Galloni (galloni.net).
Cross-posted on SEOXIM.