
What really happened?
When you see a traffic drop from Google, it does not necessarily mean that it is due to a Penguin demotion. There might be other causes, such as:
Tracking issues
On certain occasions the actual reason for a Google organic traffic drop might be that the Google Analytics tracking pixel is missing from one or more pages.
Actions
Scan site – Crawl the whole site to identify which pages are missing the Google Analytics tag; this can be done with either Screaming Frog or gachecker.com, or scripted as in the sketch after this list.
Referrals – Make sure your site is not showing as a referral traffic source (GA > Acquisition > All Traffic > Referrals).
Top pages – Check the top pages by traffic to find out if any pages have stopped working (GA > Behaviour > Landing Pages).
Hostnames – Identify if any other web site is using your analytics code (caches, translations and other websites).
Alerts – Set up intelligence alerts to get notified when there are big traffic drops (read more here: https://support.google.com/analytics/answer/103302…).
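If you prefer to script the scan yourself, here is a minimal sketch in Python, assuming the requests library and a hypothetical URL list exported from your crawler (the tracking ID is a placeholder):

import requests

# Hypothetical page list; in practice, export the URLs from your crawler.
urls = [
    "http://www.example.com/",
    "http://www.example.com/about/",
]

TRACKING_ID = "UA-12345-1"  # placeholder: use your own GA property ID

for url in urls:
    html = requests.get(url, timeout=10).text
    # Pages carrying the GA snippet embed the tracking ID in the page source.
    if TRACKING_ID not in html:
        print("Missing GA tag:", url)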
Content issues
The content of the site might be of poor quality, or there might be duplicate content issues that have resulted in a Panda or manual penalty.
Duplicate Content
Your site might be suffering from duplicate content. Here is what you need to do to identify any issues:
Check with Google Webmaster Tools and Siteliner, and also search for the following URL patterns using Google to find internal duplicate content issues (duplicate title tags, meta descriptions):
www.site.com and site.com
http:// and https://
dir and dir/
/ and /index.php
/cat/dir/ and /dir/cat/
/cat/dir/id/ and /cat/id/
param_1=12?t_1=34 and /cat_12/dir_34/
site.com and test.site.com
test.site.com and site.com/test/
/?you_id=334
/session_id=344333
Use CopyScape or PlagSpotter for external duplicate content issues.
Make sure all URL parameters are blocked from the search engines so that these pages do not get indexed (see the robots.txt example after this list).
Check for partial duplicates.
Check for inconsistent internal linking (Screaming Frog, DeepCrawl).
Look for any sub-domains.
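As an illustration of the parameter blocking mentioned above, a robots.txt along these lines (the parameter names are only examples taken from the list above) keeps parameter-based duplicates away from the crawlers; Googlebot supports the * wildcard:

User-agent: *
Disallow: /*?session_id=
Disallow: /*?you_id=

For parameters that produce real content, the URL Parameters settings in Google Webmaster Tools or a rel="canonical" tag are usually the safer options, since robots.txt only stops crawling.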
Technical Issues
This is very common after a site migration, due to a disallow directive in the robots.txt, wrong implementation of rel="canonical", severe site performance issues, etc. Check the following:
Proper use of 301s.
Any "bad" redirects.
Redirect chains.
Has the canonical version of the site been specified in Google Webmaster Tools?
Has the canonical version of the page been properly implemented across the site?
Does the site use absolute instead of relative URLs?
Has the canonical version of the site been established through 301s?
Is the robots.txt file blocking any pages?
Has the noindex tag been implemented by mistake on one or more pages? (A sketch for checking this follows this list.)
Check for indexed PDF versions of your content. (site:mydomain filetype:pdf)
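The robots.txt and noindex checks in this list can also be scripted; here is a rough sketch using Python's standard robotparser module and requests (the site and paths are hypothetical):

import requests
from urllib import robotparser

site = "http://www.example.com"  # hypothetical site
pages = ["/", "/category/widgets/"]

# Does robots.txt block Googlebot from key pages?
rp = robotparser.RobotFileParser(site + "/robots.txt")
rp.read()
for path in pages:
    if not rp.can_fetch("Googlebot", site + path):
        print("Blocked by robots.txt:", path)

# Is a noindex directive present in the page source?
for path in pages:
    html = requests.get(site + path, timeout=10).text.lower()
    # Crude string match; verify any hits manually in the meta robots tag.
    if "noindex" in html:
        print("Possible noindex on:", path)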
Outbound linking issues – Many times a site links to spam sites or to websites operating in untrustworthy niches, for example:
Links to low trust web sites
Paid Links
Negative SEO – If you experience a sudden traffic drop, then you might have been a victim of negative SEO. Negative SEO usually refers to the practice of a competitor buying low quality links and pointing them to your web site with the intention of hurting your organic traffic.
Hacking – On several occasions a web site could be hosting spam, malware or viruses as a consequence of being hacked.
Google Updates
To get a better understanding of how the various Google updates affected your organic traffic, it is also recommended to identify all the core dates on which updates took place (official and unofficial).
Google Algorithm Updates Sources
Algoroo
Google Algorithm Change Index
Google Traffic
To isolate the Google traffic you will need to create the following segment:
https://www.google.com/analytics/web/template?uid=CsPptfU_QE-X-Yngg00dVQ
Google Algorithm Updates Tools
There are two online tools that I highly recommend to speed up the process:
Sistrix Updates Tool
Barracuda Penguin Tool


What is Google Penguin?
Google Penguin is an algorithmic update first launched by Google in April 2012 to improve the value of the search results returned to users by targeting web spam (also known as spamdexing or black hat SEO), such as:
Keyword Stuffing
Link spamming
Invisible text
Duplication of content from high-ranking web sites.
Key facts about Penguin
Penguin is an algorithmic update, which means that it is not possible to instantly recover from it.
You can only partially recover from Penguin before Google does a refresh or an update.
Penguin seems to affect keyword rankings more than whole sites.
You DO NOT receive a notification in Google Webmaster Tools if you have been hit by a Penguin update.
You can only submit a reconsideration request when you have received a manual penalty.
The key date is 24 April 2012, so a traffic drop after this date is a strong indication that you have been hit by the Google Penguin algorithmic update.
How to find out if you were hit by Penguin
As Penguin is related mostly to backlinks, it is absolutely necessary to examine the following:
Over-optimised anchor text (externally and internally).
Over-optimised anchor text on low quality web sites.
The dates on which your web site traffic was affected.
Whether you have received any notification in Google Webmaster Tools.
Is it a site-wide drop or does it seem to be keyword-specific?
Steps to Recovery
Step 1 – Match updates to Google Analytics organic traffic

Google Analytics is a very useful tool as it can help you identify if there was any traffic drop after each Penguin update.
April 24, 2012: Penguin 1
May 25, 2012: Penguin 1.2
October 5, 2012: Penguin 1.3
May 22, 2013: Penguin 2.0
October 4, 2013: Penguin 2.1
October 18, 2014: Penguin 3.0
Step 2 – Compare organic traffic before and after the update
Now you need to compare the organic traffic for the two weeks prior to each Penguin update with the two weeks after it. If the drop is higher than 50%, it usually demonstrates clearly that the site has been penalised. For example, 10,000 organic sessions before an update against 4,000 after is a (10,000 − 4,000) / 10,000 = 60% drop, well above that threshold.
Step 3 – Investigating what dropped
Now that you have a clear understanding of which updates affected the web site's organic traffic, you also need to find out what actually dropped.
Step 4 – Which keywords dropped?
Penguin seems to affect web sites more at a keyword level rather than site-wide. Compare the top keywords you are optimising for over the same period you checked your traffic, to see if any keywords were severely affected.
Step 5 – Check keywords visibility
Once you have found which keywords dropped, also log in to Google Webmaster Tools to check each keyword's visibility (only possible for the most recent update, as Webmaster Tools keeps a limited date range). If you would like to do this for all the updates to a web site, you need to use SEMrush or any other tool that was used to track the keywords.
Step 6 – Gather all links
You have now reached the point where you need to gather all links to start the analysis. For this process you will need your backlink profile from the following tools:
Ahrefs
Majestic SEO
Open Site Explorer
Google Webmaster Tools
Backlink Profiler (BLP)
After you have exported all data and removed all duplicates with Excel, start the analysis of the anchor text. What you need to do initially is find the instances of each anchor text by using the following functions:
COUNTIF
Microsoft Excel Definition: Counts the number of cells within a range that meet the given criteria.
Syntax: COUNTIF(range, criteria)
COUNTIF is your go-to function for getting a count of the number of instances of a particular string.
IFERROR
Microsoft Excel Definition: Returns a value that you specify if a formula evaluates to an error; otherwise, it returns the result of the formula. Use IFERROR to trap and handle errors in a formula.
Syntax: IFERROR(value, value_if_error)
IFERROR is really simple and will become an important piece of most of our formulas as things get more complex. IFERROR is your method to turn those pesky #N/A, #VALUE or #DIV/0 messages into something a bit more presentable.
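As a minimal combined example (assuming your de-duplicated anchor texts sit in column A and the anchor you are counting is in C2; the cell references are illustrative):

=IFERROR(COUNTIF($A$2:$A$10000, C2), 0)

This counts how many times the anchor in C2 appears in the list and shows 0 instead of an error message if anything goes wrong.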
Step 7 – Combine data together
Now you need to pull data from Google Analytics for each update (15 days before vs 15 days after) for the top anchor texts, in order to discover whether there was a drop in organic traffic for the keywords that were used as link text pointing back to your web site to improve your rankings (top anchors). Here is what you need to do step by step:
Combine all link sources in Excel.
Keep only: i) Anchor Text, ii) Linking Domains, iii) Links Containing Anchor Text.
De-duplicate data.
Use COUNTIF and IFERROR to find anchor text instances.
Extract data from Google Analytics (pre and post update).
Find the percentage of traffic drop by using the formula (traffic before − traffic after) / traffic before, selecting the columns and cells that represent the data for each date range.
Create a pivot table combining the following information:
The drop.
# of LRDs.
If you are not very familiar with Excel and pivot tables, I recommend downloading the following spreadsheet and using it as a guide, as it can help you save a lot of time.
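If you are more comfortable scripting than working in Excel, the same combination and de-duplication can be sketched with Python's pandas library; the file names and column headers below are illustrative, as each tool exports slightly different headers:

import pandas as pd

# Illustrative exports from Ahrefs, Majestic, OSE and Webmaster Tools.
files = ["ahrefs.csv", "majestic.csv", "ose.csv", "gwt.csv"]
links = pd.concat([pd.read_csv(f) for f in files], ignore_index=True)

# Keep only the needed columns and de-duplicate.
links = links[["anchor_text", "linking_domain"]].drop_duplicates()

# Anchor text instances: the COUNTIF equivalent.
anchor_counts = links["anchor_text"].value_counts()

# Percentage drop per anchor keyword, given pre/post-update GA exports.
ga = pd.read_csv("ga_export.csv")  # columns: keyword, sessions_before, sessions_after
ga["drop"] = (ga["sessions_before"] - ga["sessions_after"]) / ga["sessions_before"]
print(ga.sort_values("drop", ascending=False).head(10))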
Step 8 – Check links using Link Detox
Link Detox is a very powerful tool if used properly, as it combines data from multiple sources. Here is what you need to do:
Create an account here https://www.linkresearchtools.com/.
Go to Link Detox https://www.linkresearchtools.com/toolkit/dtox.php.
Enter Domain to analyse.
Analyse links going to Root Domain.
Activate the NOFOLLOW evaluation.
Select theme of domain from dropdown.
Select whether Google has sent you a manual spam action (Yes, No, Do Not Know).
Upload any links you already have (Ahrefs, Open Site Explorer, Majestic, Google Webmaster Tools).
Upload Disavowed links (if you have disavowed any).
Hit Run Link Detox and wait until the report is ready.
Classify all your anchor texts into the following categories before you start auditing your links:
Money
Brand
Compound (brand + money, e.g. Debenhams toys collection)
Other
Download the report in CSV format and open it with Excel.
Keep only the following columns:
From URL – This is the URL of the page that is linking to your web site.
To URL – This is the page of your web site that the external web site is linking to.
Anchor Text – This is the keyword or keyword phrase used as link text.
Link Status – Whether the link passes link juice to the search engines or not (follow or nofollow).
Link Loc – The location of the link on the page (paragraph, footer, widget etc.). Very useful when you need to remove it.
HTTP-Code – These codes will help you identify the type of error when a page is not loading or responding.
Link Audit Priority – The higher the priority, the more urgent it is to examine the link.
DTOXRISK – How toxic each link is (how bad it is for your web site in terms of organic search).
Sitewide links – A sitewide link is one that appears on most or all of a website's pages (blogroll, footer etc.).
Disavow – Whether Google has been notified through the Disavow Tool that this link is to be ignored.
Power Trust – The Power Trust is a metric used to show how powerful and trusted a page or domain is in the eyes of Google.
Power Trust Domain – The Power Trust metric applied at the domain level.
Rules – Spam link classification (banned domain, link network etc.).
Step 9 – Create additional columns in Excel
Before you start the analysis with Excel based on the Link Detox report, you will need to create the following columns:
Contact Email
Contact Page URL
Removed (Yes, No)
Page Trust (Majestic)
Domain Trust (Majestic)
Niche (use majestic for this)
Page Indexed (double check)
Date of 1st Contact
Date of 2nd Contact
Date of 3rd Contact
Notes
The following are supplementary:
Edu Domains (Majestic)
Domain Toxic Links (OpenLinkProfiler)
Governmental Domains (Majestic)
Page Facebook Shares
Page Facebook Likes
Page Twitter Shares
Page Google+ Likes
Step 10 – Keep only one URL per domain
Sort the data by the domain column, create an additional column after it and paste the following function into the first cell: =IF(B1=B2,"duplicate","unique"), then copy it down the whole column. The sort is needed so that duplicate domains sit next to each other. Next, use the filter control you have applied to view only the unique values.
Step 11 – Exclude non-verified links
Simply use the filter on the anchor text column to exclude the unverified links. These are links that do not exist anymore.
Step 12 – Exclude disavowed links
If you have done a link audit before and have uploaded disavow files, it would be good to exclude the links included in them too, as this will help save precious time. You can review these links separately later.
Step 13 – Banned domains
Now is the time to start reviewing your links. Follow backlinks are always a higher priority, as they violate the Google Webmaster Guidelines directly.
Apply a filter to all cells.
Then apply a filter to view links tagged TOX1 (banned domains).
Use the tag columns to mark whether each domain needs to be removed or not, using a descriptive tag.
Mark any URL or domain that needs to be disavowed so that you can create a file at the end very easily.
Be careful while reviewing, as in certain cases domains might not be indexed for reasons other than being penalised (robots.txt, noindex tag).
Also, several domains might be very authoritative and trustworthy, and there is nothing wrong with them linking to you.
Step 14 – Domains infected with viruses
Apply a filter to all cells.
Then apply a filter to view links tagged TOX2 (virus infected).
If you find a good one, do not remove it; simply contact the webmaster.
Remove only the bad ones.
Double check the domains with one of the following tools:
https://sitecheck.sucuri.net/
http://safeweb.norton.com/
https://www.mywot.com/ (extension)
Step 15 – Audit TOX3 Domains
According to Link Detox Genesis, all these links are classified as highly toxic, so you will need to check them very carefully and remove them if you agree with Link Detox's suggestions.
Step 16 – Double check Google Webmaster Tools backlinks
Pay particular attention to links imported from Google Webmaster Tools during your reviews, as according to John Mueller from Google this should be the primary source of backlinks used to audit your link profile.
How to judge the value of a link
Before deciding to take any action on links that might be toxic, and that could therefore result in your web site receiving a Penguin penalty from Google, you need to devote time to understanding all the data that you have pulled into the spreadsheet, such as:
Domain Trust Flow (Majestic): How respected the domain is on the web. If the domain has high trust in general, this is an indication that Google values it (domains with a Trust Flow below 10 are usually the suspect ones).
Page Trust Flow (Majestic): This metric is similar to Domain Trust Flow but it is applied at a page level.
Domain Power Trust (CEMPER): This metric determines the quality of a website according to its strength and trustworthiness.
There are four types of links:
High Trust and Low Power – Links from highly trusted domains such as Universities or governmental institutions. These links are usually very difficult to get and have a very positive impact on your web site’s credibility.
Low Trust and High Power – These links require further research as they are not necessarily always good.
Low Trust and Low Power – These links do not help much in general as they may come from newly established web sites that might have even been penalised. Review carefully any of these sites before you decide to build any links from them.
High Trust and High Power – This is ideally the kind of links that you are looking to earn.
DTOXRISK: The risk each link carries, based on how harmful Link Detox calculates it might be for your web site (client feedback, observations, linking domains, neighbourhood, internal and external SEO experts, known Google publications etc.)
To get a full understanding, please go to the following page: http://www.linkdetox.com/faq
Link Audit Priority: The higher the priority the more important it is to review each link.
Link Status: Whether a link is follow or no-follow.
Link Location: Where the link is located on the page (header, footer, navigation etc.)
Niche: The niche that the domain falls under (finance, property, computers etc.)
HTTP code: These codes help identify the cause of the problem based on the response sent from the server. (For detailed information please see: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10….)
Page indexed: Whether the page is indexed by Google or not. Please double check, and also use http://indexchecking.com/
On top of all these metrics, you will need to take into consideration how search engines judge the value of each link.
Which links to remove
Prioritise the following types of links, as they violate the Google Webmaster Guidelines directly and are therefore more likely to result in a penalty.
Link networks.
Article submissions.
Directory submissions.
Duplicate content links (e.g. guest blog duplicated over 100s of domains).
Spammy bookmarking sites.
Forum profiles (if done for backlinks).
Malware/hacked sites.
Gambling/Adult sites (if your site is not in the same niche).
Comment links with over-optimised anchor text.
Blog roll links.
Footer links.
Site wide links (in most cases).
Scraper sites.
Any auto-generated links (xRumer forum posts, etc.).
How to remove links
When it comes to removing links, there are several options available.
Contact web masters
Draft a document containing the complete list of links to be removed, and send it to the webmaster in a single, short, well-drafted email.
Adopt one communication channel (email/LinkedIn/Facebook) and switch channels only when the earlier one gets no response.
Be polite to webmasters; they are trying their best to solve your problem! Keep a human touch in your communication: an email referring to the webmaster by name is more likely to get a response and to develop a strong professional relationship.
Email from the domain whose links you wish to have removed, as webmasters are more likely to answer and remove the links.
Be polite, because you are asking for a favour.
Be polite while talking to spammers, because they can blackmail you for money if you act rudely.
If you spammed somebody's site, be polite, admit it and apologise, and make sure you don't repeat it (the webmaster will check).
The disavow tool

If you aren’t able to remove all toxic links, use the disavow tool.
Use the disavow tool only as a last resort, and mostly when you were hit by an algorithmic update; in the case of a manual penalty you will need to show proof to Google that you have done everything possible to remove the bad links.
Make a spreadsheet of the links you tried to remove without using the disavow tool; sort it into removed links, unremoved links and methods of removal.
Focus on unremoved links, and try to sort the data by domain.
You can either disavow a full domain or just one link from a domain. Choose the method wisely.
Manage domains and links separately.
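For reference, the disavow file itself is a plain UTF-8 .txt file with one entry per line; a # starts a comment, and the domain: prefix disavows a whole domain rather than a single URL. A minimal example:

# Webmaster contacted three times, no reply
domain:spammydirectory.example
# Single bad deep link on an otherwise fine site
http://blog.example/some-spammy-page.html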
404 the pages
In several cases, if you cannot get deep links removed at all, you can also change the URL of the page so that all these links point to a 404 page. I personally redirect them to another site that I create specifically for this reason, as I do not like to increase the errors on any web site that I am working on.
A common sense approach
Based on everything that has been said by John Mueller and Matt Cutts from Google, industry experts and my personal experience, I would suggest the following actions:
Review very carefully all links.
Clean up as many links as you can, in particular the ones that you have created yourself (directories, forums, profiles, press releases on poor quality sites, mini sites).
Disavow all toxic links that you could not remove.
Make sure 60% of your anchor text is branded and only 40% focuses on money keywords (20% exact, 20% miscellaneous); a sketch for checking your current distribution follows this list.
Review your niche carefully to understand your top-ranking competitors' link profiles (branded vs non-branded percentage, link types).
Build only quality links, to restore link equity and to build trust with Google.
Grow your brand.
Get media coverage.
Wait until Google reruns the Penguin algorithm and reassesses your site.
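To sanity-check the 60/40 anchor split recommended above, a short sketch like this (assuming you have already classified every anchor into Money, Brand, Compound or Other, as in Step 8) prints your current distribution:

import pandas as pd

# Hypothetical classified anchors; in practice, load your audited spreadsheet.
anchors = pd.DataFrame({
    "anchor":   ["acme", "buy widgets", "acme widgets", "click here"],
    "category": ["Brand", "Money", "Compound", "Other"],
})

# Share of each category, to compare against the 60% branded guideline.
print(anchors["category"].value_counts(normalize=True) * 100)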
I personally try to remove as many links as possible to recover a site from Penguin, even if it is not strictly necessary, simply because I do not want the sites that I work with to be associated with any spammy or low quality sites. Furthermore, I am not 100% convinced that disavowing works without removing any links. Another reason is that sometimes, if I choose to disavow links instead of domains (rarely), I might miss several bad links.
Depending on your time and resources, you will have to decide whether you really wish to clean up the site's backlink profile or to focus only on recovering from Penguin. Link removal campaigns have in general a 5–20% success rate, so for algorithmic updates they are inefficient, but you should always discuss this option with your clients.