In the past two weeks we have been witnessing a new wave of Google Analytics spam: language spam. In this post I’ll outline what it is, how it happens, and how you can protect your GA accounts from it (to some extent).

We first started seeing it on Nov 8, just as voting in the 2016 US presidential election was winding down, and it contained the following message:

Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!

The above text is shown in the dimension usually reserved for language information. Such information is sent automatically to GA by most browsers in the form of short abbreviations, such as “en”, “en-gb”, “en-us”, etc. The language spam is also combined with referral spam, with multiple domains listed as source/medium, including abc.xyz, brateg.xyz, boltalko.xyz, biteg.xyz and others. So it is a two-pronged attack that tries to draw the user’s attention both to the fake referrer domains and to the language report, probably because of its prominent placement on the Google Analytics “report homepage”.

(Update Dec 7, 2016) The spam now arrives with referrer “motherboard.vice.com”, full referrer and page title “motherboard.vice.com/read/this-pro-trump-russian-is-spamming-google-analytics” and language “o-o-8-o-o.com search shell is much better than google!”. Our spam blocking solutions, both manual and automatic, continue to work and would have blocked it if you had them deployed.

(Update Dec 2, 2016) Over time the spam mutated its referral source to legitimate sites such as “addons.mozilla.org”, “webmasters.stackexchange.com”, “thenextweb.com”, and lately “reddit.com”, and it is likely to mutate further. It also used “lifehacĸer . com”, which mimics the legitimate lifehacker.com site but is controlled by the spammer. All our solutions continue to work and block it all.

How Big is secret.ɢoogle.com Spam in Google Analytics?

It’s huge: the number of searches for “secret.ɢoogle.com”, “secret.google.com”, and “Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!” quickly surpassed the searches for “google analytics spam” and “google analytics referral spam” combined.

[Chart: Secret.ɢoogle.com / secret.google.com Vote Trump Search Volume]

The traffic the spammers artificially generate is also notable because it diverges from the usual referral spam we see: it has a kind of “average” bounce rate, registers over 2.5 pages/session, and has a very long average session duration of over 20 minutes.

[Screenshot: Google Analytics Language Spam]

This is different from usual referral spam traffic, which has a very high % of new sessions and consequently a very high bounce rate, with pages/session near 1.00 and average session duration near 0:00:00. This particular type of language spam only registers pageviews on your homepage, so metrics for internal pages should not be affected. Other than that, this wave of language spam traffic in GA masquerades as Firefox 49.0 on Linux, with no Flash version and with industry-average screen resolution and browser size.

Because the message was related to the US elections held on Nov 8, we expected this kind of spam to be short-lived. Boy, were we wrong: it is only growing stronger, as it appears to be catching the attention of many webmasters around the world. Like e-mail spammers, spammers who insert fake traffic into Google Analytics thrive on attention.

How do Spammers Infect GA’s Language Report?

As far as we’ve investigated, this type of spam is no different from other types we’ve seen in the past. There are generally two types of spam bots: bots (computer software) that actually “visit” your site and mimic users browsing it, and bots that don’t visit your site but send “hits” straight to the Google Analytics servers. If you are thinking that there is an easy way out of this, there isn’t, even though language spam is a bit easier to deal with than classical referral spam.

How to Remove Google Analytics Language Spam?
First of all: once data is recorded by GA, you can’t change or edit it, so there is no way for you to permanently erase this traffic from your reports (this is a pain, I know, but that’s how it is). That leaves two things you can do: block spam from coming into your reports, and filter out the spam while reporting by using advanced segments. The first is a permanent change to your Google Analytics views and only works from the moment you apply it onwards. The second is a flexible, retroactive solution, but you need to apply the advanced segment to your reports each time.

When it comes to blocking future spam, there are two ways to approach it: manually or with an automated tool. If you have just a couple of websites, then the manual solution outlined below will likely be enough to keep your Google Analytics stats free of language spam for a good period of time. It won’t help much with referral spam, though.

HERE IS HOW:
Part 1 – Block Spam with a View-level Filter

Setting up a view-level filter is fairly simple, but it should be noted that this is a permanent change going forward, so do be careful when using it, especially if you have little prior experience with view filters. The filter I propose will filter out any traffic (hits) where the language dimension contains 15 or more characters. Since most legitimate language settings sent by browsers are 5-6 characters long, and traffic with 8-9 characters in this field is rare, it should only filter out language spam.
In addition to that, there are symbols which are invalid in the language field but which can be used to construct a domain name (or what looks like one, such as “secret google com”, “secret,google,com”, “secret!google!com”), so we can exclude those as well. The resulting regular expression we’ll use looks like this:

.{15,}|\s[^\s]*\s|\.|,|\!|\/

You need to construct the “Exclude Language Spam” filter with this expression as the filter pattern. Make sure the filter field is set to the “Language Settings” dimension. You need “Edit” access at the “Account” level in Google Analytics in order to set up new filters, so make sure you have that, or you won’t even see the setup. You can use the “Verify Filter” option to see how the filter would affect data from the last few days.

Part 2 – Filter Out Historical Spam Via an Advanced Segment

View-level filters are not retroactive: they only start working on your data from the moment you set them up, so they won’t help with your historical data. For that it’s best to use custom segments to clean language spam out of your past data when preparing reports. Here is a custom segment to filter out secret.ɢoogle.com spam as well as any future Google Analytics language spam. You set it up once and save it. Then you can apply it to most reports in Google Analytics. Bonus tip: if you need this regularly, create a shortcut in GA to a report with the segment applied so you can access it more quickly.

Whether this will be both the first AND the last type of language spam in GA that we’ll see is a bit too early to say, given that we’re in the midst of the current wave. However, by implementing the advice above or by using our automated tool you will be protected from most language spam to come, not just from “secret.google.com…” spam.
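Before applying the Part 1 filter to a live view, it can help to sanity-check the regular expression against a few language values offline. The following is a minimal Python sketch, not part of the GA setup itself. It assumes that GA filter patterns match anywhere in the field (hence `re.search`) and that Python’s `re` module treats this particular pattern the same way GA’s regex engine does. The helper names `is_language_spam` and `drop_spam_rows`, and the `rows` structure, are hypothetical and used only for illustration.

```python
import re

# Regular expression from Part 1, copied as-is.
LANGUAGE_SPAM = re.compile(r".{15,}|\s[^\s]*\s|\.|,|\!|\/")


def is_language_spam(language: str) -> bool:
    """Return True if the language value would be caught by the exclude filter.

    Assumption: the pattern may match anywhere in the value (partial match).
    """
    return LANGUAGE_SPAM.search(language) is not None


def drop_spam_rows(rows):
    """Keep only rows whose 'language' value does not look like spam.

    `rows` is a hypothetical list of dicts, e.g. parsed from a report export.
    """
    return [row for row in rows if not is_language_spam(row["language"])]


if __name__ == "__main__":
    samples = [
        "en",
        "en-us",
        "en-gb",
        "secret,google,com",
        "Secret.ɢoogle.com You are invited! Enter only with this ticket URL. "
        "Copy it. Vote for Trump!",
        "o-o-8-o-o.com search shell is much better than google!",
    ]
    for value in samples:
        verdict = "spam" if is_language_spam(value) else "ok"
        print(f"{verdict:4}  {value}")
```

Running the same check over the language values from your own reports is a cheap way to confirm that no legitimate value gets excluded before the filter goes live.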