
Chainalysis Publishes Formal Ontology for Blockchain Analytics Data Quality

CryptaCount Editorial · 6 min read

Chainalysis has published a formal ontological framework that defines, for the first time in structured terms, what data quality actually means in blockchain analytics. The paper sets out two distinct evidentiary tiers, assigns each a rigorous standard, and gives compliance professionals, auditors, and courts a shared vocabulary for holding analytics outputs to account. For firms whose AML decisions, audit opinions, and legal submissions rest on on-chain attribution data, this is a development worth reading carefully.


Why a Formal Ontology, and Why Now

Blockchain analytics has grown rapidly without the disciplinary infrastructure that mature forensic sciences take for granted. There are no accreditation bodies, no unified evidentiary standards, and no shared definitions of what it means for an attribution claim to be correct. That gap has consequences.

The problem a loose methodology creates

The paper opens with a striking illustration of the risk. Two analytics tools examining the same deposit address returned completely different conclusions: one categorised the address as connected to a gambling service; the other flagged it as linked to child sexual abuse material. Both were working from statistical pattern matching against transaction frequency and value. Small, regular payments of similar size can look alike when viewed through a narrow statistical lens, regardless of the underlying activity. The difference between those two labels is not a rounding error. One describes a person who placed a bet. The other could destroy a life.
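The collision described above can be sketched in a few lines. The data, the feature set, and the names `narrow_features`, `service_a`, and `service_b` are all hypothetical and are not taken from the paper or from any vendor's tooling; the sketch only shows how two unrelated payment streams can reduce to the same summary when nothing but frequency and value is measured.

```python
from statistics import mean, pstdev


def narrow_features(payments):
    """Summarise (day, amount) payments by count, mean amount,
    amount spread, and mean gap in days between payments."""
    days = sorted(day for day, _ in payments)
    amounts = [amount for _, amount in payments]
    gaps = [later - earlier for earlier, later in zip(days, days[1:])]
    return (
        len(payments),
        round(mean(amounts), 2),
        round(pstdev(amounts), 2),
        round(mean(gaps), 2),
    )


# Hypothetical data: two unrelated services that each receive
# small weekly payments of similar size.
service_a = [(0, 20.0), (7, 22.0), (14, 18.0), (21, 20.0)]
service_b = [(3, 18.0), (10, 20.0), (17, 20.0), (24, 22.0)]

features_a = narrow_features(service_a)
features_b = narrow_features(service_b)

# True here: the two streams are indistinguishable on these features alone.
same_profile = features_a == features_b
```

A classifier that sees only these four numbers has no basis for assigning the two services different labels, and no basis for preferring one label over the other.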

That kind of divergence, according to the paper, is what happens when machine learning outputs are treated as forensic facts without any underlying framework distinguishing what can be proven deterministically from what requires human intelligence tradecraft.

The response: formalising existing standards

Chainalysis describes the ontology not as a new direction but as a codification of the methodology it has applied since its founding. The paper required the company to articulate what it had already been doing, assign formal evidentiary labels to each analytical step, and publish the result for external scrutiny. The stated intent is to give the broader industry a foundation to build on, not just to document internal practice.

The Two-Tier Framework

The ontology draws a hard line between two types of analytical claim, each governed by a different standard of rigour.

Tier one: structural analysis

The first tier covers the question of whether addresses share common control. The standard here is described as deterministic, reproducible, and auditable, with known and documented failure modes. This is the layer that determines, from on-chain evidence alone, whether two addresses belong to the same controlling entity. Claims at this tier are expected to withstand the same scrutiny as any scientific finding: they should be reproducible by a technically capable, adversarial reviewer.
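To make "deterministic, reproducible, and auditable" concrete, here is a minimal sketch of one widely discussed clustering approach, the common-input-ownership heuristic, implemented with a union-find structure. This is an assumption chosen for illustration: the article does not say which heuristics the paper relies on, and the function name `cluster_addresses` is hypothetical. The heuristic also has a well-known failure mode: collaborative transactions such as CoinJoin combine inputs from unrelated parties, so co-spending does not always imply common control.

```python
def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of one transaction.

    `transactions` is a list of input-address lists. The result is a sorted
    list of sorted clusters, so the same data always yields the same output
    regardless of the order in which transactions are processed.
    """
    parent = {}

    def find(addr):
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the lexicographically larger root to the smaller one.
            smaller, larger = sorted((root_a, root_b))
            parent[larger] = smaller

    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # register single-input transactions too
        for other in inputs[1:]:
            union(first, other)

    clusters = {}
    for addr in parent:
        clusters.setdefault(find(addr), []).append(addr)
    return sorted(sorted(members) for members in clusters.values())


txs = [["A", "B"], ["B", "C"], ["D"]]
clusters = cluster_addresses(txs)
```

Because the output depends only on the input set, a reviewer holding the same transactions can rerun the function and compare results line by line, which is the property the first tier asks for.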

Tier two: attribution to a named entity

The second tier covers the attribution of those addresses to a real-world entity, such as an exchange, a sanctioned actor, or a darknet market. This layer necessarily incorporates intelligence tradecraft: human sources, open-source research, and documented reasoning. The ontology applies a structured confidence framework here, requiring source characterisation and explicit documentation of the reasoning chain. It is still rigorous, but it is a different kind of rigour, and the paper is explicit that conflating the two is where errors, and injustices, originate.
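One way to picture the second tier is as a record that cannot be called complete without its sources and reasoning. The sketch below is illustrative only: the class names, fields, and confidence levels are hypothetical and do not reproduce the paper's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Source:
    description: str  # what the source is
    kind: str         # e.g. "human", "open-source", "on-chain"


@dataclass(frozen=True)
class AttributionClaim:
    cluster_id: str         # output of tier-one structural analysis
    entity_name: str        # the named real-world entity
    confidence: Confidence
    sources: tuple = ()     # characterised sources supporting the claim
    reasoning: tuple = ()   # ordered steps of the reasoning chain

    def documentation_gaps(self):
        """List what is missing before the claim can be reviewed."""
        gaps = []
        if not self.sources:
            gaps.append("no characterised source")
        if not self.reasoning:
            gaps.append("no documented reasoning chain")
        return gaps


documented = AttributionClaim(
    cluster_id="cluster-001",
    entity_name="Example Exchange",
    confidence=Confidence.MEDIUM,
    sources=(Source("public deposit address listing", "open-source"),),
    reasoning=("listed address falls inside cluster-001",),
)
bare = AttributionClaim("cluster-002", "Example Market", Confidence.HIGH)
```

Here `documented.documentation_gaps()` returns an empty list, while `bare.documentation_gaps()` reports both missing elements even though the claim asserts high confidence. A stated confidence level without supporting documentation is exactly the kind of claim the framework is meant to expose.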

Firms reviewing on-chain attribution data supplied by any analytics provider should now ask: which tier does this claim sit in, and does the provider's documentation support that classification? That question did not previously have a formal vocabulary attached to it. It does now. The framework also ties into a wider trend: independent reconciliation is raising auditor expectations around on-chain data in general.

Evidentiary Admissibility and External Validation

The paper references two external validation events that underscore the framework's robustness claims.

Federal court admissibility

The methodology was subjected to full judicial scrutiny in federal court and found admissible across every criterion applied. The paper's author, Chainalysis's Chief Scientist, notes that this outcome was never in doubt precisely because the methodology was built to survive adversarial, technically capable review from the outset.

Independent academic validation

Researchers at Delft University, working in collaboration with law enforcement, conducted what the paper describes as the only empirical validation study of attribution accuracy tested against ground truth from seized infrastructure. Chainalysis invited the study and allowed the results to be published. The paper contrasts this with another provider that reportedly attempted to suppress the same study through legal pressure. The contrast is deliberate: transparency under scrutiny is treated as a baseline requirement, not a differentiator.

This posture aligns with the direction compliance infrastructure is moving more broadly. A related article, "Why institutional blockchain compliance now reaches below the smart contract layer", examines how firms such as UBS are demanding verifiable, auditable controls at the protocol level, which is exactly the kind of rigour the ontology is designed to support.

What This Means for Compliance Teams and Auditors

The practical implications are direct. AML compliance teams that act on blockchain analytics outputs, whether to file SARs, restrict accounts, or escalate investigations, are now on notice that a formal evidentiary standard exists against which those outputs can be assessed. Auditors reviewing crypto asset controls should factor the two-tier framework into their procedures. Defence counsel in criminal proceedings already challenge blockchain attribution evidence; this ontology gives prosecutors and investigators a clearer language for defending it, and gives defence teams a clearer basis for challenging providers who cannot demonstrate the same rigour.

The paper is also a direct call on other analytics providers to adopt equivalent standards, or at least to engage with the vocabulary the ontology establishes. Whether the industry responds with competing frameworks, adoption, or silence will itself be informative.

What does the two-tier ontology actually distinguish?

Tier one covers structural analysis: whether addresses share common control, assessed using deterministic, reproducible, auditable on-chain methods. Tier two covers attribution: linking those addresses to a named real-world entity, using a documented confidence framework that incorporates intelligence tradecraft alongside on-chain evidence. The key point is that these tiers require different kinds of rigour, and conflating them produces unreliable conclusions.

Why does this matter for an AML compliance programme?

AML decisions based on blockchain analytics, including account restrictions, SAR filings, and transaction monitoring alerts, carry real consequences for individuals and businesses. If the analytics output does not distinguish between what is proven on-chain and what is inferred from intelligence, the compliance team cannot properly calibrate the confidence level of its decisions. The ontology provides a framework for asking the right questions of any data provider.

How does this affect audit engagements involving crypto assets?

Auditors assessing the adequacy of a firm's AML controls should now consider whether the blockchain analytics tools relied upon operate under a documented evidentiary framework. The existence of the ontology raises the bar: firms that cannot articulate which tier their analytical claims sit in, and on what basis, may face increased scrutiny in audit and regulatory review.

Was the methodology tested in court?

Yes. According to the paper, Chainalysis's methodology was examined in federal court proceedings and found admissible under every criterion applied. The Chief Scientist attributes this to the methodology being designed from the outset to withstand adversarial, technically capable review, not adapted to it after the fact.

Is this ontology binding on other analytics providers?

No. The paper is a published framework, not a regulatory standard or industry rule. Its stated purpose is to establish a vocabulary and invite the industry to build on it. Whether regulators, courts, or professional bodies adopt it as a reference point remains to be seen, but the fact that it exists in formal, published form means it can now be cited in disputes, procurement decisions, and regulatory examinations.

Source: Chainalysis

