October 20, 2025

The Google Content Warehouse Leak: Understanding Google’s E-E-A-T Algorithm

Google’s algorithm has always been shrouded in mystery. However, back in May 2024, a significant amount of Google's documentation was leaked. Over 2,500 pages with information on Google’s ranking systems, including over 14,000 attributes, were exposed. 

Although the documents didn’t go into specifics about the significance of particular components, they did detail the data Google collects and stores for content, links, and user interactions. As a result, these documents provide one of the clearest indicators of what Google actually considers to be important features on a webpage.

The leak provided a rare but invaluable insight into how Google might measure quality, trust, and authority, with many of the attributes in the documents corresponding with things SEOs have speculated for years. 

The authenticity of the documents was debated at first, but Google eventually confirmed that they were genuine. That said, the company cautioned that the documents were partial and out of context. 

“We would caution against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information. We’ve shared extensive information about how Search works and the types of factors that our systems weigh, while also working to protect the integrity of our results from manipulation.”

- Google’s official public statement, 29 May 2024.

The information from the documents can’t be treated as definitive confirmation of Google’s ranking secrets, but it has allowed some industry professionals to create a more structured model of what E-E-A-T (Experience, Expertise, Authoritativeness, and Trustworthiness) could mean in algorithmic terms. And that's exactly what Shaun Anderson, Head of SEO at Hobo Web, has done. 

So, let’s take a look at his findings, and what they mean for your content strategy going forward.

The Analysis 

Here are some of the key takeaways from Shaun's analysis and what they could mean for you:

E-E-A-T Defines the Goal 

E-E-A-T isn’t itself a signal but rather a quality that Google wants to reward. The data implies that Google uses many smaller, measurable data points to estimate E-E-A-T, since an algorithm cannot judge something like trust the way a human can. 

This means E-E-A-T isn’t a simple checklist. You need to build many supporting signals throughout your site. 
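
To picture how many small data points could add up to one estimate, here is a minimal Python sketch. It is purely illustrative: the signal names, values, and weights are all invented for the example, and nothing in the leaked documents reveals how (or whether) Google combines attributes in this way.

```python
# Hypothetical illustration only: every signal name, value, and weight
# below is invented. The leak does not reveal how signals are combined.

def estimate_quality(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Return a weighted average of normalised signals (each between 0 and 1)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * signals.get(name, 0.0) for name in weights)
    return weighted_sum / total_weight


# Made-up weights for four made-up signals.
weights = {"site_authority": 0.4, "content_effort": 0.3, "author_known": 0.2, "https": 0.1}

# Two pages on the same domain: one high-effort with a named author,
# one thin and anonymous.
strong_page = {"site_authority": 0.8, "content_effort": 0.9, "author_known": 1.0, "https": 1.0}
thin_page = {"site_authority": 0.8, "content_effort": 0.1, "author_known": 0.0, "https": 1.0}

print(round(estimate_quality(strong_page, weights), 2))
print(round(estimate_quality(thin_page, weights), 2))
```

The point of the sketch is that no single input decides the result. The thin page scores lower despite sharing the same domain strength, because several smaller signals are missing.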

Site Level Strength is Important 

The documents list site-level attributes such as siteAuthority, siteFocusScore (which measures how focused a site is on certain topics), and hostAge (how old the domain is), which suggests Google assesses a domain as a whole. 

Even if you create a well-optimised, high-quality page, it may not rank well if your domain has very little reputation. 

More Effort = More Reward 

The documents suggest Google uses signals like originalContentScore (the originality of a page’s content) and contentEffort (an estimate of how much human effort went into the content) to gauge experience, and that it rewards high-effort content. 

Churning out generic content or putting in very little effort could result in lower returns. Instead, you need to prove your own expertise on the subject by adding accurate data or personal insight. 

Anonymous Authors Are Weaker 

Having an identifiable author is important, as Google appears to use attributes like isAuthor and Author to identify bylines and store the identified author so that it can track their reputation. 

Creating an author page to sit behind each byline could give more credibility to your content and strengthen its performance in the search results. 

The User’s Needs Must Be Met

Data on ‘good clicks’ versus ‘bad clicks’, how long a user stays on a page, and bounce rate is collected through the Chrome browser and used by systems like ‘Navboost’ to promote or demote pages. 

Getting a click isn’t enough. You need to keep the user engaged on the page so they don’t bounce back to the search results. 

Trust is Technical 

Secure HTTPS, an absence of duplicate content and spammy signals, and a clean site structure all indicate trust. It’s crucial to fix technical issues, such as invalid SSL certificates and missing or incorrect canonical tags, to build trust with Google. 
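
Two of these basics are easy to check yourself. The sketch below is a minimal Python example, using only the standard library, that flags a page URL that isn’t HTTPS and a missing or duplicated canonical tag. The audit_page function and its messages are our own illustration rather than any official tool, and the check assumes you already have the page’s HTML to hand.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def audit_page(url: str, html: str) -> list[str]:
    """Return a list of human-readable issues found for one page."""
    issues = []
    if urlparse(url).scheme != "https":
        issues.append("page is not served over HTTPS")

    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        issues.append("no canonical tag found")
    elif len(finder.canonicals) > 1:
        issues.append("multiple canonical tags found")
    return issues


# Example: an insecure page with no canonical tag.
print(audit_page("http://example.com/blog", "<html><head></head></html>"))
```

A real audit would also confirm that the SSL certificate is valid and that each canonical URL points to the right page, which this sketch doesn’t attempt.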

Attributes That Stand Out 

Many of the attributes in the documents, such as brickAndMortarStrength, are fairly self-explanatory. However, some of the others are a little more interesting…

anchorMismatchDemotion 

One of the most interesting hypotheses is that if anchor text doesn't align with the target page’s topic, the link may be penalised rather than simply ignored. If this is true, link building must rely on relevance as much as authority. 
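
If you want a rough way to sanity-check your own anchor text, the Python sketch below scores how many of the anchor’s words also appear in a description of the target page. This is a crude stand-in that we’ve assumed for illustration. The leak doesn’t explain how a mismatch is actually detected, and the stopword list and function names here are hypothetical.

```python
import re

# Hypothetical, deliberately short list of filler words to ignore.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "your"}


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its meaningful words as a set."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def anchor_relevance(anchor_text: str, target_topic: str) -> float:
    """Share of anchor words that also appear in the target page's topic text."""
    anchor = tokens(anchor_text)
    if not anchor:
        return 0.0
    return len(anchor & tokens(target_topic)) / len(anchor)


topic = "Guide to technical SEO audits"
print(anchor_relevance("technical SEO guide", topic))  # every anchor word matches
print(anchor_relevance("cheap holiday deals", topic))  # no anchor words match
```

Simple word overlap misses synonyms and context, so treat a low score as a prompt to review the link rather than as proof of a problem.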

scaledSelectionTierRank

Before a page is even eligible for ranking, it’s classified into a tier. If you're put in a ‘Landfills’ tier, your chances of ranking could be limited no matter how optimised the page is. This suggests that your domain must be trusted to some extent before ranking can begin. 

contentEffort and originalContentScore 

Google estimates how easy it would be to replicate the content and evaluates how much effort went into producing it in order to estimate experience. Sites that have consistently produced high-effort, original content have built a solid foundation for expertise. 

What Does This Mean For Marketers? 

The data leak doesn’t tell us how strongly a signal influences ranking, but it does offer more insight into the framework of Google’s ranking system. However, this isn’t a definitive guide. Google has not confirmed any of the findings and, for the most part, they remain theories.

That being said, there is plenty of actionable data that you can take away from this.  

Breakthrough Takes Time

While this is just a hypothesis, there are many plausible attributes that tie in with the E-E-A-T pillars. It seems Google is using measurable signals to estimate E-E-A-T, so building a foundation of trust, authority, and expertise before competing for the big keywords is crucial.

Your site’s overall topic focus, author reputation, technical health, and user satisfaction all feed into how Google values your content. Over time, publishing original content by credible authors on a well-structured site that offers a strong user experience should satisfy these signals and lead to better ranking opportunities. 

If you’d like expert help creating content that will get results, then get in touch with us today. 

Tom Brook

Tom has more than 10 years of experience working in copywriting, content strategy and PR. Over the years, he’s led one of the largest copywriting teams in the UK and has worked on a freelance basis for some of the country’s biggest brands.
