Why does Tier 1 have looser thresholds?

Because smaller samples produce wider confidence intervals. With 1,000 rounds, an observed RTP of 96.5% could be consistent with a true RTP of 97% — the deviation is within the expected range of random variation for that sample size. With 100,000 rounds, the same 96.5% would be statistically incompatible with a true 97%. Looser thresholds at Tier 1 are not about being less careful — they are about being mathematically honest about what smaller datasets can and cannot prove.

Can a game skip directly to Gold?

Yes, if the data requirements are met. There is no mandatory waiting period between tiers. If a game already has 50,000+ rounds of three-source data spanning at least 90 days when we first audit it (because it has been operating for years and the data is publicly accessible), it can be classified as Gold on the first audit. The tiers are about data sufficiency, not about how long we have been watching. In practice, most first audits are Tier 1 because collecting three-source data takes time.

Does Tier 3 Gold mean the game is guaranteed fair?

No. Gold means we have high confidence that the game's observable behavior matches its declared parameters based on a large, multi-source dataset. It does not guarantee future behavior — the operator could change parameters tomorrow. It does not cover aspects we cannot measure (internal operational practices, employee conduct, server security). Gold is the highest level of empirical confidence we can provide. It is not a guarantee of fairness in perpetuity.

How long does it take to reach Gold?

It depends on data availability more than on our schedule. The one fixed time requirement is the 90-day minimum observation window for Gold. A high-volume game like Stake Crash, where millions of rounds are played daily and the provably fair system provides complete data, could theoretically reach Gold within months. A lower-volume game or one without provably fair transparency could take a year or more to accumulate sufficient three-source data.

What happens when a game loses a tier?

If new data contradicts a previous classification — for example, if a Verified game's latest data shows RTP deviation that exceeds the Verified threshold — the game is downgraded. The downgrade is published as a new audit report with a clear explanation of what changed. The previous audit report remains available for reference (audit reports are immutable). Downgrades are also accompanied by notification to the operator, who has 30 days to respond before the new classification goes live on the listing pages.

What Makes an Audit Conclusive: Evidence Tiers Explained

Not every audit reaches the same level of confidence. Clash Watchdog AI classifies every audit into one of three evidence tiers — Provisional, Verified, or Gold — based on the amount of data examined, the consistency across sources, and the duration of observation. This article explains the three tiers, what each one means, and why more data buys a tighter threshold.

Why isn't every audit the same strength?

Because the strength of a statistical conclusion depends on the amount of evidence behind it.

An audit based on 1,000 rounds of data from a single source can tell you something — but it cannot tell you much with high confidence. An audit based on 100,000 rounds from three independent sources can tell you a great deal with very high confidence. The mathematics of statistical inference do not allow the same conclusion from both datasets.

Rather than pretending all audits are equal, Clash Watchdog AI explicitly classifies each audit into one of three evidence tiers that communicate exactly how much data is behind the conclusion. This is not common practice in gambling auditing — most regulatory audits produce a binary pass/fail with no indication of the underlying confidence level. We believe this obscures important information.

The tier system serves two audiences:

For players: The tier tells you how seriously to take the verdict. A Tier 1 Provisional whitelist means "we have checked and so far it looks fine, but we do not have enough data to be highly confident." A Tier 3 Gold whitelist means "we have checked extensively and are highly confident." Both are whitelists, but they carry different weight.

For operators: The tier tells you what to expect from the next audit. A Tier 1 classification is a starting point, not a destination. Games that want to be taken seriously by regulators, journalists, and players should aim for Gold — and the path to Gold is transparent.

What is the Provisional tier?

Tier 1 — Provisional is the entry-level classification. It means we have performed an initial audit with limited data and found no disqualifying issues — but the dataset is not large enough to rule out subtle deviations.

Requirements:

Parameter	Threshold
Minimum rounds	1,000–10,000
Data sources required	At least 1 (typically operator data + self-proxy)
RTP deviation tolerance	±2.5% from declared
Distribution test	Chi-squared p > 0.01
Hash verification (if provably fair)	100% of sampled rounds pass

What Provisional means:

A Provisional Whitelist classification means: "Based on the data we have examined, this game's observable behavior is consistent with its declared parameters. The sample is too small to detect deviations below 2.5%. We will continue collecting data."

A Provisional Watchlist classification means: "We have found something worth monitoring — a distribution anomaly, an RTP deviation, or a data-source disagreement — but we do not have enough data to determine whether it is meaningful or within the range of normal variance."

Why the thresholds are wide: With 1,000 rounds, the 95% confidence interval for RTP is roughly ±3% (the exact width depends on the game's payout variance). This means a game with a true RTP of 97% could easily show an observed RTP anywhere from 94% to 100% in a 1,000-round sample, purely from random variation. Setting the threshold much tighter than the confidence interval would produce false positives: fair games flagged as suspicious because of normal variance.
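To make the sample-size argument concrete, here is a minimal sketch of how the interval narrows as rounds accumulate. It uses the normal approximation, and the per-round payout standard deviation is an assumption back-solved from the ±3% figure above, not a measured property of any game.

```python
import math

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def rtp_half_width(n_rounds: int, payout_std: float, z: float = Z_95) -> float:
    """Approximate half-width of the RTP confidence interval (as a fraction of stake)."""
    return z * payout_std / math.sqrt(n_rounds)


# Hypothetical per-round payout standard deviation (in units of stake), chosen so
# that 1,000 rounds gives the roughly +/-3% interval quoted in the text.
ASSUMED_PAYOUT_STD = 0.03 * math.sqrt(1_000) / Z_95

for n in (1_000, 10_000, 50_000, 100_000):
    half_width = rtp_half_width(n, ASSUMED_PAYOUT_STD)
    print(f"{n:>7,} rounds: +/-{half_width * 100:.2f}%")
# Prints +/-3.00%, +/-0.95%, +/-0.42% and +/-0.30% respectively.
```

Under this assumed variance, the half-widths at 10,000 and 50,000 rounds land close to the ±1.0% and ±0.5% tolerances used by the higher tiers. A game with more volatile payouts would need more rounds to reach the same precision.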

What is the Verified tier?

Tier 2 — Verified means we have performed a substantive audit with a meaningful dataset and found the game's behavior to be consistent with its declared parameters within tighter tolerances.

Requirements:

Parameter	Threshold
Minimum rounds	10,000–50,000
Data sources required	At least 2 (must include community or self-proxy)
RTP deviation tolerance	±1.0% from declared
Distribution test	Chi-squared p > 0.05
Hash verification (if provably fair)	100% of sampled rounds pass
Serial correlation test	No significant autocorrelation at any lag 1–20

What Verified means:

A Verified Whitelist means: "Based on a substantial dataset from multiple independent sources, this game's RTP is within 1% of its declared value, its distribution matches the theoretical shape, its rounds show no serial correlation, and its hash chain (if applicable) verifies completely. We have moderate-to-high confidence in this assessment."
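The distribution test in the requirements can be illustrated with a small chi-squared goodness-of-fit sketch. The outcome bins, their theoretical probabilities, and the observed counts below are hypothetical. The p-value helper uses a closed form that only holds when the degrees of freedom are even, so this is an illustration of the idea rather than the audit's actual implementation.

```python
import math


def chi_squared_statistic(observed, expected_probs):
    """Pearson chi-squared statistic for observed bin counts vs. theoretical probabilities."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_probs))


def chi_squared_p_value_even_df(stat: float, df: int) -> float:
    """Exact upper-tail p-value for a chi-squared statistic; valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form only covers positive even degrees of freedom")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (stat/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Hypothetical example: 10,000 rounds sorted into 5 outcome bins (df = 5 - 1 = 4).
expected_probs = [0.5, 0.25, 0.125, 0.0625, 0.0625]
observed = [5100, 2450, 1230, 610, 610]

stat = chi_squared_statistic(observed, expected_probs)
p_value = chi_squared_p_value_even_df(stat, df=len(observed) - 1)
print(f"chi-squared = {stat:.2f}, p = {p_value:.2f}")  # chi-squared = 4.04, p = 0.40
print("passes Verified threshold:", p_value > 0.05)
```

A p-value above the tier's threshold means the observed counts are compatible with the theoretical distribution. It does not prove the distribution is correct, only that this sample gives no evidence against it.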

Verified is the threshold at which we consider an audit result reliable enough to cite in public communications and to use as a basis for listing recommendations.

The two-source requirement: Tier 2 requires at least two independent data sources. This is the level at which single-source manipulation becomes detectable. If operator data and community data agree, or if operator data and self-proxy data agree, the probability that both are being manipulated in the same way drops significantly.
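As a rough illustration of what "sources agree" could mean in practice, the sketch below compares round outcomes from two sources keyed by round ID. The data layout (a dictionary from round ID to outcome multiplier) and the function name are assumptions made for this example only.

```python
def compare_sources(source_a: dict, source_b: dict, tolerance: float = 1e-9) -> dict:
    """Compare per-round outcomes reported by two sources, matched on round ID."""
    shared = source_a.keys() & source_b.keys()
    mismatched = sorted(
        round_id for round_id in shared
        if abs(source_a[round_id] - source_b[round_id]) > tolerance
    )
    return {
        "shared_rounds": len(shared),
        "mismatched_rounds": mismatched,
        "only_in_a": len(source_a) - len(shared),
        "only_in_b": len(source_b) - len(shared),
    }


# Hypothetical data: round ID -> outcome multiplier.
operator_data = {1: 1.5, 2: 2.0, 3: 1.0, 4: 10.2}
community_data = {2: 2.0, 3: 1.0, 4: 9.8, 5: 3.3}

report = compare_sources(operator_data, community_data)
print(report)
# {'shared_rounds': 3, 'mismatched_rounds': [4], 'only_in_a': 1, 'only_in_b': 1}
```

Any mismatched round is a direct contradiction between sources and would warrant investigation. Rounds that appear in only one source reduce the overlap available for cross-checking.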

What is the Gold tier?

Tier 3 — Gold is the highest classification. It represents the maximum empirical confidence our methodology can provide. Gold is the standard we recommend for regulatory citations, academic references, and journalist reporting.

Requirements:

Parameter	Threshold
Minimum rounds	50,000+
Data sources required	All 3 (operator + community + self-proxy)
RTP deviation tolerance	±0.5% from declared
Distribution test	Chi-squared p > 0.05, KS test p > 0.05
Hash verification (if provably fair)	100% of all rounds in sample
Serial correlation test	No significant autocorrelation at any lag 1–100
Rotation analysis (if provably fair)	No significant correlation between rotations and player events
Observation duration	Minimum 90 days of data collection

What Gold means:

A Gold Whitelist means: "Based on an extensive dataset from three independent sources collected over at least 90 days, this game's behavior matches its declared parameters within 0.5% across all measured dimensions. We have high confidence that the game is operating as advertised, and we have found no evidence of systemic manipulation."
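The serial correlation requirement can be sketched as a sample autocorrelation check. This minimal version flags any lag whose autocorrelation falls outside the approximate 95% white-noise band of ±1.96/√n. That band is a common rule of thumb and an assumption here. With 100 lags, a few will exceed it by chance alone, so a real audit would need a multiple-testing correction or a joint test.

```python
import math


def autocorrelations(values, max_lag):
    """Sample autocorrelation of `values` at lags 1..max_lag."""
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denom
        for lag in range(1, max_lag + 1)
    ]


def significant_lags(values, max_lag, z=1.96):
    """Lags whose autocorrelation lies outside the approximate white-noise band."""
    bound = z / math.sqrt(len(values))
    acf = autocorrelations(values, max_lag)
    return [lag for lag, r in enumerate(acf, start=1) if abs(r) > bound]


# A deliberately non-random series: outcomes that strictly alternate.
alternating = [0.0, 1.0] * 500
print(significant_lags(alternating, max_lag=3))  # [1, 2, 3]
```

For a Gold-style check one would call `significant_lags(outcomes, max_lag=100)` on the observed round outcomes and treat an empty result, after correcting for multiple comparisons, as a pass.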

Gold is deliberately difficult to achieve. The three-source requirement, the large sample size, the 90-day observation window, and the tight tolerances all serve the same purpose: making it very expensive for an operator to fake a pass. An operator who wants to manipulate their game while maintaining a Gold classification would need to sustain consistent behavior across 50,000+ rounds, across three independent observation channels, for 90+ days. This is operationally infeasible for any manipulation that produces a meaningful financial benefit.

The rotation analysis requirement: Gold-tier audits of provably fair games must include rotation analysis — testing whether server seed rotations correlate with player events. This test is unique to our methodology and addresses an attack vector that standard provably fair verification cannot detect.
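The article does not specify how rotation analysis is computed, so the following is only one plausible way to test for association, not the methodology's actual procedure. It assumes each round carries two hypothetical flags: whether a server seed rotation occurred just before the round, and whether a player event of interest (for example, an unusually large bet) occurred in it. It then measures association with the phi coefficient and a 1-degree-of-freedom chi-squared test.

```python
import math


def rotation_event_association(rotated, event):
    """Phi coefficient and chi-squared p-value (df = 1) for two per-round boolean flags."""
    a = b = c = d = 0  # cells of the 2x2 contingency table
    for r, e in zip(rotated, event):
        if r and e:
            a += 1      # rotation and event
        elif r:
            b += 1      # rotation only
        elif e:
            c += 1      # event only
        else:
            d += 1      # neither
    n = a + b + c + d
    row1, row0, col1, col0 = a + b, c + d, a + c, b + d
    if 0 in (row1, row0, col1, col0):
        raise ValueError("each flag must be both True and False at least once")
    phi = (a * d - b * c) / math.sqrt(row1 * row0 * col1 * col0)
    chi2 = n * phi * phi
    # Upper tail of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return phi, p_value


# Hypothetical flags in which rotations are unrelated to player events.
rotated = [True, True, False, False] * 25
event = [True, False, True, False] * 25
print(rotation_event_association(rotated, event))  # (0.0, 1.0)
```

A phi near zero with a large p-value is what an honest rotation schedule would be expected to produce. A strong association would suggest rotations are being timed around player behavior.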

Can a game move from Provisional to Gold?

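Yes. Tiers are designed to progress upward: games start at Provisional and move up as more data becomes available and as the data continues to support the declared parameters. The progression is not strictly one-way, because a game can be downgraded if new data contradicts an earlier classification.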

The typical progression:

Provisional → Verified: Accumulate 10,000+ rounds from at least two sources. If the data remains consistent with the declared parameters at the tighter Verified thresholds, upgrade. If the data reveals anomalies that were not visible in the Provisional sample, the game may stay at Provisional or move to the Watchlist.

Verified → Gold: Accumulate 50,000+ rounds from all three sources over 90+ days. Run the full Gold test suite including rotation analysis. If everything passes, upgrade. If rotation analysis reveals suspicious patterns, the game stays at Verified pending investigation.

Downgrade: If new data contradicts a previous classification, the game is downgraded. Downgrades trigger the due process procedure described in MUST_READ §11.2: the operator is notified, given 30 days to respond with counter-evidence, and the operator's response is published alongside the updated audit report.
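The data-sufficiency side of this progression can be summarized in a short sketch. The numbers come from the requirements tables above; the class and function names are hypothetical, the RTP tolerance is assumed to mean absolute percentage points, and the rule about which source types must be present is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierRequirement:
    name: str
    min_rounds: int
    min_sources: int
    min_days: int
    rtp_tolerance: float  # assumed absolute deviation, e.g. 0.005 = 0.5 points


# Ordered from highest tier to lowest so the first match is the best one.
TIERS = [
    TierRequirement("Gold", 50_000, 3, 90, 0.005),
    TierRequirement("Verified", 10_000, 2, 0, 0.010),
    TierRequirement("Provisional", 1_000, 1, 0, 0.025),
]


def highest_eligible_tier(rounds: int, sources: int, days: int) -> Optional[TierRequirement]:
    """Highest tier whose data-sufficiency requirements are met, or None."""
    for tier in TIERS:
        if rounds >= tier.min_rounds and sources >= tier.min_sources and days >= tier.min_days:
            return tier
    return None


def rtp_within_tolerance(declared: float, observed: float, tier: TierRequirement) -> bool:
    """True if the observed RTP is within the tier's tolerance of the declared RTP."""
    return abs(observed - declared) <= tier.rtp_tolerance


# 60,000 three-source rounds collected over only 45 days: not yet eligible for Gold.
tier = highest_eligible_tier(rounds=60_000, sources=3, days=45)
print(tier.name)                                # Verified
print(rtp_within_tolerance(0.97, 0.962, tier))  # True (0.8 points off, limit 1.0)
```

Eligibility is only the first gate. A game that is eligible for a tier still has to pass that tier's statistical tests before it is upgraded.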

What does tier have to do with the Whitelist and Blacklist thresholds?

The tier determines the confidence of the classification, not the classification itself. A game can be Whitelisted at any tier — a Tier 1 Whitelist is a less confident endorsement than a Tier 3 Whitelist, but both mean the game has passed the relevant thresholds for its tier.

Similarly, a game can be Blacklisted at any tier, though the bar for Blacklisting is deliberately higher. We are more cautious about condemning a game than about approving one, because a false Blacklist harms an honest operator. The asymmetry:

Action	Minimum Tier	Rationale
Whitelist	Tier 1+	Low risk of harm from false positive
Watchlist	Any	Watchlist is informational, not punitive
Blacklist	Tier 2+	High risk of harm from false positive; requires stronger evidence

A game cannot be Blacklisted at Tier 1. If Provisional data suggests a problem, the game is placed on the Watchlist and data collection continues until a Tier 2 conclusion is possible.

This asymmetry is a deliberate design choice. We accept the risk that a manipulated game might be Whitelisted at Tier 1 (and caught at Tier 2) in exchange for never Blacklisting an honest game on insufficient evidence. The cost of a false Blacklist — reputational damage to an honest operator — is higher than the cost of a temporary false Whitelist — players using a game that will be caught in the next audit cycle.
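The asymmetry can be expressed as a small rule. This sketch encodes only the minimum-tier table above. The function name is hypothetical, and the real decision procedure involves more than this single check.

```python
# Minimum evidence tier at which each listing action may be issued.
ACTION_MIN_TIER = {"Whitelist": 1, "Watchlist": 1, "Blacklist": 2}


def resolve_action(proposed: str, tier: int) -> str:
    """Return the action actually published for a proposed action at a given tier."""
    if proposed not in ACTION_MIN_TIER:
        raise ValueError(f"unknown action: {proposed!r}")
    if tier not in (1, 2, 3):
        raise ValueError("tier must be 1, 2 or 3")
    if tier < ACTION_MIN_TIER[proposed]:
        # Evidence is too weak for this action: monitor instead and keep collecting data.
        return "Watchlist"
    return proposed


print(resolve_action("Blacklist", tier=1))  # Watchlist
print(resolve_action("Blacklist", tier=2))  # Blacklist
print(resolve_action("Whitelist", tier=1))  # Whitelist
```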

For the full methodology, including the exact statistical tests, confidence levels, and decision procedures, see our methodology page. For which games are at which tier, see our game listings.

