Legal Metadata Taxonomies: A Reference Guide to Legal Standards and Examples

Legal metadata taxonomies are essential frameworks for organizing, classifying, and governing legal information. In modern law firms and corporate legal departments, these taxonomies transform unstructured legal documents into structured, searchable, and AI-ready knowledge systems.

This reference guide provides an alphabetized list of established legal metadata taxonomies, from foundational standards like the Library of Congress Subject Headings and the Westlaw Key Number System to emerging frameworks like SALI (Standards Advancement for the Legal Industry) and SKOS-based legal ontologies. Each entry explains what the taxonomy is, what it classifies, and where it is used in legal practice.

High-Stakes Environment of Legal Metadata Management

Unlike metadata in most other industries, legal metadata operates in a high-risk, adversarial environment, where misclassification can result in litigation exposure, regulatory penalties, or breaches of attorney-client privilege, not just operational inefficiency. For law firms, legal departments, and law library professionals building or auditing a metadata strategy, this guide is designed to serve as a practical reference for taxonomy selection, alignment, and governance.

The Role of Taxonomies and Ontologies in Legal Knowledge Systems

Legal metadata taxonomies and ontologies provide structured knowledge models for the legal sector, enabling machines to reason about legal information and supporting digital asset management. Standardized taxonomies and controlled vocabularies make metadata tagging consistent, which supports compliance and information governance and helps convert unstructured data into strategic assets for workflows, AI, and analytics. Implementing and managing these taxonomies is a key responsibility for administrators and their teams, who keep usage consistent for users across groups and departments. Taxonomies underpin practices such as knowledge management, electronic discovery, automation, risk and compliance, and business intelligence.

Metadata is often described as data about data: it provides context and meaning. A well-structured metadata framework improves the search, retrieval, and usability of legal documents. Tagging documents with retention periods and sensitivity classifications also facilitates compliance with legal regulations.
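As a minimal sketch of retention and sensitivity tagging, the record below carries both attributes and derives a disposal review date from them. The field names, the sensitivity label, and the seven-year period are hypothetical illustrations, not values taken from any standard or regulation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocumentMetadata:
    title: str
    sensitivity: str      # hypothetical label, e.g. "privileged" or "public"
    retention_years: int  # hypothetical period set by firm policy
    closed_on: date       # assumed trigger: the date the matter closed


def disposal_review_date(meta: DocumentMetadata) -> date:
    """Return the date the record becomes eligible for disposal review."""
    target_year = meta.closed_on.year + meta.retention_years
    try:
        return meta.closed_on.replace(year=target_year)
    except ValueError:
        # closed_on was 29 February and the target year is not a leap year
        return meta.closed_on.replace(year=target_year, day=28)


engagement_letter = DocumentMetadata(
    title="Engagement letter",
    sensitivity="privileged",
    retention_years=7,
    closed_on=date(2020, 3, 31),
)
print(disposal_review_date(engagement_letter))  # 2027-03-31
```

Storing the retention period as metadata, rather than computing it ad hoc, means the same rule can be audited and applied uniformly across a collection.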

Overcoming Knowledge Management Challenges with Standardized Taxonomies

Legal organizations often struggle with knowledge management because terms and concepts are poorly defined. Adopting standardized taxonomies reduces ambiguity and enhances interoperability. A shared, open-source knowledge metadata model can improve knowledge management further by standardizing data tagging and ensuring a consistent information structure for AI deployment. Within a taxonomy, hierarchical structures organize terms in parent/child (broad-to-narrow) relationships, and clear definitions for each term are crucial for consistent use.
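The parent/child structure described above can be sketched with a simple mapping from each term to its broader term. The terms here are hypothetical examples and are not drawn from any of the taxonomies listed in this guide.

```python
# Each term maps to its broader (parent) term; a top-level term maps to None.
# All terms are hypothetical illustrations.
BROADER = {
    "Intellectual Property": None,
    "Trademark": "Intellectual Property",
    "Patent": "Intellectual Property",
    "Trademark Opposition": "Trademark",
}


def broader_path(term):
    """Return the chain from the top-level term down to the given term."""
    path = []
    while term is not None:
        path.append(term)
        term = BROADER[term]
    return list(reversed(path))


def narrower_terms(term):
    """Return the direct children of a term, sorted alphabetically."""
    return sorted(child for child, parent in BROADER.items() if parent == term)


print(" > ".join(broader_path("Trademark Opposition")))
# Intellectual Property > Trademark > Trademark Opposition
print(narrower_terms("Intellectual Property"))
# ['Patent', 'Trademark']
```

A search for a broad term can use `narrower_terms` to expand the query to its children, which is one common reason to record the hierarchy explicitly.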

How to Use This Guide

The taxonomies listed below fall into several functional categories. Depending on your firm’s practice areas and metadata strategy, different frameworks will be relevant:

  • Subject classification taxonomies (LCSH, Westlaw Key Number, LexisNexis, BIALL) govern how documents are described and retrieved by legal topic.
  • Matter and practice area standards (SALI, PACER, NIEM) standardize how matters, cases, and legal services are coded across systems.
  • Intellectual property taxonomies (IPC, CPC/USPC, Nice Classification, EUIPO, WIPO Lex) are essential for IP practice metadata management.
  • Records management and archival standards (ISO 15489, EAD, Dublin Core, MARC) govern document lifecycle, retention, and preservation metadata.
  • Interoperability and linked data standards (SKOS, XBRL, ECLI, NIEM) enable machine-readable metadata exchange between systems.

Many firms require alignment across multiple categories simultaneously. Soutron’s polyhierarchical thesaurus is designed to accommodate all of these frameworks within a single, unified metadata architecture.

Download our Legal Metadata Taxonomy Reference Guide

Legal Metadata Taxonomy Reference Guide

We’ve compiled every taxonomy in this guide into a print-ready PDF reference — organized by functional category, with a quick-reference table and a full A–Z entry for each framework. Save it, share it with your team, or use it to audit your firm’s current metadata coverage.

Alphabetized Legal Metadata Taxonomies List

AALL Universal Citation Guide

Published by the American Association of Law Libraries, this guide standardizes citation formats across jurisdictions and document types. While primarily a citation standard, it functions as a taxonomy of legal document types and source categories used in authority control and cataloging in law libraries. Consistency in citation and metadata practices supports accurate search and retrieval for users and administrators.

ABA Model Rules of Professional Conduct — Subject Classification

The American Bar Association’s structured classification of professional responsibility topics, used as a controlled vocabulary in legal ethics research databases and law firm compliance libraries. Clear definitions and explanations within this taxonomy ensure shared understanding and effective adoption by teams and groups.

BIALL Classification Scheme

Developed by the British and Irish Association of Law Libraries, this scheme provides a hierarchical subject classification for UK and Irish legal materials, covering common law jurisdictions with particular depth in English statute and case law subject headings. BIALL classification is particularly relevant for UK-headquartered firms and those operating across English, Scottish, Welsh, and Irish jurisdictions, where common law source hierarchies differ from U.S. and civil law frameworks.

Black’s Law Dictionary Controlled Vocabulary

The definitional vocabulary of Black’s Law Dictionary functions as a de facto authority file for legal terminology, widely used by legal information professionals as a source of preferred terms in subject indexing and thesaurus construction. Providing clear definitions and explanations within this taxonomy ensures shared meaning and supports consistency in terminology across teams, enhancing collaboration and the ability to reuse and retrieve legal information efficiently.

CILIP Thesaurus for Graphical Materials — Legal Adaptation

Used in archival and special collections contexts where legal records intersect with visual or documentary materials, adapted from the Chartered Institute of Library and Information Professionals framework. Particularly relevant for legal archives managing photographic evidence, architectural records, maps, and visual exhibits where standard legal subject headings are insufficient.

Cornell Legal Information Institute (LII) Topic Taxonomy

The LII at Cornell Law School maintains a publicly available subject taxonomy organizing U.S. law by topic, statute, regulation, and case category. It is used in legal research portals and as a reference structure for subject heading alignment. As an open-source model, it standardizes practices and supports interoperability in the legal sector, benefiting administrators, teams, and users.

Court Records Metadata Standards — NIEM (National Information Exchange Model)

NIEM is a U.S. federal framework that provides a common vocabulary and data model for information exchange between government agencies, courts, and law enforcement. Its legal domain subset defines standardized metadata fields for case records, party roles, charge classifications, and court event types. The ability to manage taxonomies and implement best practices supports compliance and consistent metadata tagging and classification, facilitating analytics and reporting.

Dublin Core Metadata Initiative — Legal Adaptation (DC-Law)

An adaptation of the Dublin Core 15-element metadata standard specifically mapped to legal document description, used in legal digital libraries and open access legal repositories to ensure interoperability between systems. Assets are classified, described, and organized using metadata attributes, keywords, and tagging, and administrators and users can add metadata to improve digital asset management.

EAD (Encoded Archival Description) — Legal Records Application

EAD is an XML standard for encoding finding aids in archives. Law firm archives, court archives, and government legal record repositories use EAD with legal-specific controlled vocabulary extensions to describe collections of case files, corporate records, and regulatory submissions. Tagging and metadata attributes support digital asset management and facilitate search and retrieval for relevant user groups.

ECLI (European Case Law Identifier)

A standardized identifier and metadata schema adopted by the European Union and member state courts to uniformly describe and cite court decisions across EU jurisdictions. Covers court identity, jurisdiction, year, and case number in a structured, machine-readable format. Standardized taxonomies and clear taxonomy details support advanced search, analytics, and interoperability.
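Because an ECLI is a structured identifier, it can be split into its components programmatically. The sketch below assumes the commonly documented five-part layout (the literal `ECLI`, a two-letter country code, a court code of up to seven characters, a four-digit year, and an ordinal of up to 25 characters); the exact character rules in the regular expression are an assumption and should be checked against the official specification. The names `Ecli` and `parse_ecli` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Ecli(NamedTuple):
    country: str
    court: str
    year: int
    ordinal: str


# Assumed layout: ECLI:<country>:<court>:<year>:<ordinal>
_ECLI_RE = re.compile(
    r"ECLI:(?P<country>[A-Z]{2}):(?P<court>[A-Z][A-Z0-9]{0,6}):"
    r"(?P<year>[0-9]{4}):(?P<ordinal>[A-Z0-9.]{1,25})"
)


def parse_ecli(value: str) -> Optional[Ecli]:
    """Split an ECLI string into its parts, or return None if it does not fit."""
    match = _ECLI_RE.fullmatch(value.strip().upper())
    if match is None:
        return None
    return Ecli(
        match["country"], match["court"], int(match["year"]), match["ordinal"]
    )


# Illustrative identifier in the assumed format.
print(parse_ecli("ECLI:EU:C:2014:317"))
# Ecli(country='EU', court='C', year=2014, ordinal='317')
```

Parsed components can then be stored as separate metadata fields, so that decisions can be filtered by court or year without string matching.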

EUIPO (European Union Intellectual Property Office) Classification System

The EUIPO taxonomy governs the classification of trademarks, designs, and IP rights across EU member states, organized by Nice Classification (goods and services), Vienna Classification (figurative elements), and Locarno Classification (industrial designs). Used as a controlled vocabulary in IP practice metadata schemas, supporting compliance and digital asset management through consistent tagging and attributes.

FindLaw Legal Topic Taxonomy

A hierarchical subject classification used across Thomson Reuters’ consumer and professional legal research properties, organizing legal topics by practice area, document type, and jurisdiction in a manner widely referenced for legal content organization. Taxonomies help target content to relevant groups, programs, or administrative units, ensuring information is pertinent to users’ needs and supporting advanced search and analytics.

GARE (Guidelines for Authority Records and References) — Legal Authority Files

IFLA’s guidelines for the construction of authority records, adapted in legal library contexts to manage name authority files for courts, legislators, law firms, and legal entities — ensuring consistent identification of legal actors across catalog records. Teams and administrators play a key role in managing, reviewing, and approving taxonomy changes, ensuring consistency and relevance for different user groups.

ICD (International Classification of Diseases) — Forensic and Medico-Legal Application

In forensic, personal injury, and workers’ compensation legal practice, ICD diagnostic codes function as a controlled vocabulary for classifying injury types, causes of death, and medical conditions referenced in legal documents and damage calculations. Consistent use of attributes and tagging supports compliance and analytics.

IFLA LRM (Library Reference Model) — Legal Collections Application

The International Federation of Library Associations’ Library Reference Model provides a conceptual framework for describing legal information resources, particularly useful in managing relationships between legal texts, their editions, translations, and commentary. Teams and administrators ensure taxonomy implementation and consistency across collections, supporting users and groups.

IPC (International Patent Classification)

Administered by the World Intellectual Property Organization (WIPO), the IPC provides a hierarchical classification of technology subject areas used globally to classify patent documents. Essential metadata in IP law practice, patent prosecution, and freedom-to-operate analysis. Hierarchical taxonomy details and attributes support advanced search and analytics.

ISO 5127 — Information and Documentation Vocabulary

An ISO standard providing a controlled vocabulary for information and documentation concepts, including legal records management terminology, used as a reference framework in legal information governance and records management policy. Clear definitions and explanations within this taxonomy support best practices and consistency.

ISO 15489 — Records Management Metadata Standard

The international standard for records management provides a metadata framework governing the creation, capture, classification, retention, and disposal of records, including legal records. Widely applied in corporate legal departments and law firm records management programs, it enables the ability to manage taxonomies, implement best practices, and support compliance through consistent metadata tagging and classification. ISO 15489 is the foundational records management standard for corporate legal departments operating under GDPR, CCPA, SEC recordkeeping rules, or any jurisdiction with statutory document retention obligations.

ISO 19475 — Legal Document Exchange Standard

Provides structured metadata requirements for the electronic exchange of legal documents between parties, courts, and regulatory bodies, with defined fields for document type, party identification, and procedural status. Standardized taxonomies and attributes support interoperability and analytics.

LCSH (Library of Congress Subject Headings) — Law

The Law section of the Library of Congress Subject Headings is one of the most comprehensive and widely applied legal taxonomies in existence, covering all areas of U.S. and international law with authorized headings, geographic subdivisions, and form subdivisions. Used in law firm libraries, academic law libraries, and court libraries worldwide as the authoritative subject vocabulary. Advanced search and analytics are supported by standardized metadata and taxonomy details, enhancing the speed and accuracy of document retrieval.

LexisNexis Legal Taxonomy

The proprietary subject classification schema underlying the LexisNexis research platform, organizing case law, statutes, regulations, and secondary sources by practice area, jurisdiction, and legal topic. Used as a reference vocabulary for aligning internal library taxonomies with external research source structures. Consistency in terminology supports collaboration among team members and the ability to reuse and retrieve legal information efficiently.

LII Wex Legal Dictionary and Encyclopedia Taxonomy

Cornell Law School’s Wex legal reference database is organized around a controlled subject taxonomy that defines relationships between legal concepts, doctrines, and terms. Widely used as an authority reference for legal subject indexing. Providing clear definitions and explanations within this taxonomy ensures shared understanding and effective adoption, supporting knowledge management practices.

MARC Relator Codes — Legal Roles

The MARC standard includes a set of relator codes defining the roles of persons and corporate bodies in relation to a document. In legal cataloging, these codes describe roles such as attorney of record, court, plaintiff, defendant, intervenor, and amicus, providing a controlled vocabulary for legal party and role metadata. The use of attributes, keywords, and tagging is important for describing and organizing legal documents and assets.

MeSH (Medical Subject Headings) — Forensic and Legal Medicine Application

The National Library of Medicine’s Medical Subject Headings, while primarily a biomedical taxonomy, is used in legal medicine, forensic pathology, and personal injury practice to provide controlled vocabulary for medical subject matter referenced in legal documents. Consistent tagging and attributes support analytics and compliance.

Nice Classification (NCL)

The international classification of goods and services for trademark registration purposes, administered by the World Intellectual Property Organization (WIPO). Mandatory metadata in trademark prosecution and IP portfolio management. Standardized taxonomies and tagging support digital asset management and compliance.

NIEM Legal Domain

Within the National Information Exchange Model, the Legal domain defines standardized metadata schemas for criminal justice, court case management, and legal process data exchange between U.S. federal, state, and local government systems. The ability to manage taxonomies and implement best practices supports compliance, analytics, and interoperability.

OCLC Dewey Decimal Classification — Law (340s)

The Dewey Decimal Classification system’s Law schedule (340–349) provides a hierarchical numerical taxonomy for organizing legal collections by jurisdiction and subject, used in law firm libraries, court libraries, and public law collections for physical and digital collection organization. Hierarchical structures and taxonomy details support advanced search and analytics.

PACER (Public Access to Court Electronic Records) Metadata Schema

The federal courts’ PACER system uses a defined metadata schema for all filed documents, including case type, docket entry type, party roles, and filing dates. This schema functions as a de facto standard for federal court document metadata. The ability to manage taxonomies, implement best practices, and support compliance through consistent metadata tagging and classification is essential for administrators and users.

SALI (Standards Advancement for the Legal Industry)

SALI is the most significant emerging legal metadata taxonomy specifically designed for the modern legal industry. It provides standardized matter-level codes covering legal services, practice areas, document types, industry sectors, legal entities, and geographic codes. Its Legal Matter Standard Specification (LMSS) is a taxonomy of over 10,000 tags developed to improve legal services classification and enhance interoperability between systems. SALI enables interoperability between law firm practice management systems, e-billing platforms, legal operations tools, and document management systems, and it is rapidly becoming the reference taxonomy for matter coding and legal service classification across enterprise legal management. By providing structured, machine-readable data, SALI also supports AI applications, automation, and scalable legal expertise. Implementation across the legal industry is ongoing, and further development of the standard is anticipated.

SKOS (Simple Knowledge Organization System) — Legal Ontology Applications

A W3C standard for representing controlled vocabularies, thesauri, and classification schemes in a machine-readable format. SKOS is used in legal knowledge graph and ontology projects, including the EU Publications Office’s EuroVoc and the LKIF (Legal Knowledge Interchange Format), to publish and interlink legal taxonomies as linked data. Ontologies define the meaning and relationships of legal content, providing context for automation, AI, and legal analysis. SKOS supports AI applications and scalable legal expertise, and its use in the legal sector continues to expand.
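To illustrate what a taxonomy term looks like when expressed in SKOS, the sketch below emits Turtle for a concept with a preferred label and an optional broader concept. `skos:Concept`, `skos:prefLabel`, and `skos:broader` are standard SKOS terms; the `ex:` namespace, the concept identifiers, and the helper `concept_to_turtle` are hypothetical, and the helper does not escape special characters in labels.

```python
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/legal/> .\n"
)


def concept_to_turtle(concept_id, pref_label, broader_id=None, lang="en"):
    """Render one SKOS concept as a Turtle statement (no label escaping)."""
    lines = [
        f"ex:{concept_id} a skos:Concept ;",
        f'    skos:prefLabel "{pref_label}"@{lang}',
    ]
    if broader_id:
        lines[-1] += " ;"
        lines.append(f"    skos:broader ex:{broader_id}")
    lines[-1] += " ."
    return "\n".join(lines)


document = "\n".join([
    PREFIXES,
    concept_to_turtle("intellectualProperty", "Intellectual property"),
    concept_to_turtle("trademark", "Trademark", broader_id="intellectualProperty"),
])
print(document)
```

In practice an RDF library would normally generate and validate this output; plain string building is used here only to keep the example self-contained.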

Thomson Reuters Practical Law Taxonomy

The subject classification schema used across Thomson Reuters’ Practical Law practice guides and standard documents, organizing legal know-how by practice area, jurisdiction, and document type. Functions as a reference vocabulary for aligning internal knowledge management systems with the Practical Law resource structure. Taxonomies help target content to relevant focus areas and groups, supporting users and administrators in organizing and retrieving information efficiently.

ULAN (Union List of Artist Names) — Legal Entity Authority

Used in arts law, IP, and cultural property practice to provide authority control for the identity of artists and cultural creators referenced in legal documents involving copyright, moral rights, and provenance. Consistent use of attributes and tagging supports digital asset management and compliance.

UN/LOCODE and UN Trade Terms — Trade and Customs Law

United Nations location codes (UN/LOCODE) and trade term classifications, including the Incoterms rules published by the International Chamber of Commerce, function as controlled vocabularies in international trade law, customs compliance, and cross-border transaction metadata. Standardized taxonomies and tagging support compliance and analytics.

UNODC (United Nations Office on Drugs and Crime) Legal Classification

Provides a standardized taxonomy for criminal offense classification, drug scheduling categories, and international crime typologies, used in criminal law practice, international enforcement cooperation, and forensic records management. Metadata tagging creates an audit trail and supports meeting regulatory requirements like GDPR and CCPA.

USPTO Classification System (CPC/USPC)

The United States Patent and Trademark Office uses the Cooperative Patent Classification (CPC), a joint schema with the European Patent Office, as the controlling taxonomy for patent subject matter. Essential metadata in patent prosecution, portfolio management, and freedom-to-operate legal analysis. Hierarchical taxonomy details and attributes support advanced search and analytics.

Westlaw Key Number System

West’s Key Number System is one of the oldest and most comprehensive legal taxonomies in existence, organizing all reported U.S. case law into a hierarchical classification of legal topics and subtopics with numbered headnotes. It has been in continuous development since the 1890s and remains a primary reference taxonomy for U.S. common law subject classification, used directly by legal information professionals to align internal subject headings with case law retrieval structures. Advanced search and analytics are supported by standardized metadata and taxonomy details, enhancing the speed and accuracy of document retrieval.

WIPO Lex Legal Database Classification

The World Intellectual Property Organization’s WIPO Lex database uses a structured taxonomy to classify national and international IP legislation, treaties, and court decisions by type, subject matter, and jurisdiction — used as a reference vocabulary in international IP practice metadata management. Consistency in terminology and taxonomy details supports collaboration, analytics, and compliance.

XBRL (eXtensible Business Reporting Language) — Legal and Regulatory Filing Taxonomy

XBRL provides a standardized taxonomy for structured financial and regulatory disclosure data, used in SEC filings, corporate governance documents, and legal compliance reporting. In legal practice, XBRL taxonomy codes function as metadata fields linking legal documents to specific regulatory disclosure requirements. Standardized metadata and taxonomy details enable effective analytics and reporting across legal organizations, supporting compliance and business intelligence.

Building a Legal Metadata Taxonomy Strategy

No single taxonomy covers the full scope of legal metadata requirements. Most law firms and legal departments operate across multiple frameworks simultaneously — using SALI for matter classification, LCSH or Westlaw KeyNumber for subject indexing, ISO 15489 for records management, and ECLI or PACER schemas for court record metadata, all within the same document management environment.

The challenge is not selecting a taxonomy. It is implementing multiple taxonomies coherently, ensuring that the same document can be classified and retrieved under all of the frameworks that are relevant to it — without creating retrieval failures when attorneys search from different starting points.

This is the problem that polyhierarchical thesaurus architecture solves. Soutron’s thesaurus module allows any number of taxonomy frameworks to coexist within a single metadata environment, with every classification path leading to the same document regardless of which standard the searching attorney is working from. The result is a metadata system that is simultaneously standards-compliant, interoperable with major legal research platforms, and practically usable by attorneys who are not information specialists.
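The general idea of several classification paths leading to the same document can be sketched as an index keyed by scheme and code. This is an illustrative model only, not a description of how Soutron’s thesaurus module is implemented; the class `ClassificationIndex`, the document identifier, and the codes are hypothetical placeholders rather than real SALI, LCSH, or ISO 15489 values.

```python
from collections import defaultdict


class ClassificationIndex:
    """Maps (scheme, code) pairs to the documents classified under them."""

    def __init__(self):
        self._by_code = defaultdict(set)

    def tag(self, doc_id, scheme, code):
        """Classify a document under one code of one scheme."""
        self._by_code[(scheme, code)].add(doc_id)

    def find(self, scheme, code):
        """Return a copy of the set of documents filed under a code."""
        return set(self._by_code.get((scheme, code), set()))


index = ClassificationIndex()
# The same document is classified under three schemes (placeholder codes).
index.tag("doc-001", "SALI", "matter-type/trademark-opposition")
index.tag("doc-001", "LCSH", "subject/trademarks")
index.tag("doc-001", "ISO 15489", "retention/client-file")

# Searching from either starting point reaches the same document.
print(index.find("SALI", "matter-type/trademark-opposition"))  # {'doc-001'}
print(index.find("LCSH", "subject/trademarks"))                # {'doc-001'}
```

Keeping each scheme's codes side by side on the document, instead of translating everything into one master scheme, avoids losing distinctions that only one framework makes.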

To discuss how Soutron’s metadata architecture can be configured around your firm’s specific taxonomy requirements, download our Legal Metadata Taxonomy Reference Guide and request a consultation with our team.