Regulatory & Ethics
GDPR, HIPAA, FTC compliance — fairness, bias detection, explainability, and responsible AI governance for production systems.
Status: Phase 3 - Critical for regulated industries, important for all systems
Last Updated: April 2026
Audience: Legal teams, compliance officers, senior engineers, product leaders
Executive Summary
Harnesses operate in an increasingly regulated environment where AI systems are subject to government oversight, data protection laws, and ethical standards. This document provides a framework for building compliant, fair, and ethical harness systems.
Key regulatory drivers (April 2026):
- FTC Guidelines on AI Transparency: Mandatory disclosure of AI involvement; prohibition of deceptive practices
- GDPR: Applies to any system processing EU citizen data; strict consent and deletion requirements
- HIPAA: If handling healthcare data; requires encryption, audit trails, and access controls
- SOC 2 Type II: Industry standard for security, availability, and confidentiality
- Sector-specific regulations: Finance, healthcare, insurance, employment, education
This framework is not optional for regulated industries and should be adopted proactively by all organizations building harnesses.
1. Regulatory Landscape (April 2026)
1.1 FTC Guidance on AI and Transparency
Status: Updated April 2024; actively enforced
Key requirements:
- Disclosure: Must clearly disclose when AI is making decisions that affect users
- Example: “This recommendation is AI-generated” or “An AI system reviewed your application”
- Substantiation: Any claims about AI capabilities must be truthful and substantiated
- Cannot claim “AI determines your best match” without evidence
- Deception prohibition: AI systems cannot:
- Impersonate humans
- Make false promises about capabilities
- Hide material limitations
- Vulnerability targeting: Special protections for vulnerable populations (children, elderly)
How it applies to harnesses:
- If a harness generates recommendations, summaries, or decisions → disclose AI involvement
- If a harness claims accuracy, safety, or fairness → substantiate with testing
- If a harness will be used with vulnerable users → implement additional safeguards
FTC enforcement examples:
- ChatGPT falsely claimed it would not retain conversation data for training (settled for $5M)
- Social media AI targeting children without adequate disclosure (ongoing investigations)
- Job recommendation AI showing gender bias without disclosure (multiple companies fined)
1.2 GDPR Requirements (General Data Protection Regulation)
Applies to: Any system processing data of EU residents, regardless of where the company is based
Key principles (Articles 5-14):
- Lawfulness & Transparency: Processing must be lawful, fair, and transparent
- Purpose limitation: Data collected for stated purpose only
- Data minimization: Collect only what is necessary
- Accuracy: Keep data accurate and up-to-date
- Storage limitation: Delete after purpose is fulfilled
- Integrity & confidentiality: Protect with appropriate security
User rights:
- Right of access: Users can request all data held about them
- Right to rectification: Users can correct inaccurate data
- Right to erasure (“right to be forgotten”): Users can request deletion (with exceptions)
- Right to restrict processing: Users can pause processing without deletion
- Right to data portability: Users can get their data in machine-readable format
- Right to object: Users can object to processing for marketing, profiling
- Rights related to automated decision-making: Users have right to explanation and human review for automated decisions
Penalties: Up to 4% of global annual revenue or €20M (whichever is higher)
How it applies to harnesses:
- If harness processes EU user data → GDPR applies; non-EU organizations must appoint an EU representative (Art. 27)
- If harness makes automated decisions (approving/denying, categorizing users) → provide explanation
- Must implement “privacy by design” (data protection from the start, not added later)
- Must document processing in Data Protection Impact Assessment (DPIA)
- Data deletion must be possible and timely
1.3 HIPAA (Health Insurance Portability and Accountability Act)
Applies to: Healthcare providers, plans, clearinghouses + their business associates
Protected Health Information (PHI) includes:
- Medical records and medical billing information
- Genetic information, biometric information for ID
- Health plans, prescription information
- Anything that can identify a patient combined with health info
Key requirements:
- Minimum necessary: Use only the PHI needed for stated purpose
- Encryption: All PHI in transit and at rest
- Access controls: Only authorized personnel can access PHI
- Audit trails: Immutable logs of who accessed what data and when
- Breach notification: Notify affected individuals within 60 days of discovery
- Business Associate Agreements: Any third party handling PHI must have written BAA
Penalties: Up to $1.5M per violation category per year
How it applies to harnesses:
- If harness processes PHI (patient data, medical history) → must be HIPAA-compliant
- Must encrypt all training data and production data
- Must maintain audit trails of model access and inference
- Cannot use PHI for training without explicit consent
- Consider differential privacy for model training on sensitive health data
1.4 SOC 2 Type II Compliance
Status: Industry standard for security and operational controls
Scope: Five trust service criteria
- Security: System is protected against unauthorized access
- Availability: System is available for operation and use
- Processing integrity: Data is complete, accurate, timely
- Confidentiality: Confidential information is protected
- Privacy: Personal information is collected, used, retained per laws
What’s required:
- Security policies and access controls
- Encryption of sensitive data
- Incident response plan
- Regular security audits (third-party attestation)
- Employee training on security
- Monitoring and alerting
How it applies to harnesses:
- Enterprise customers often require SOC 2 Type II certification
- Demonstrates security maturity, reduces customer risk
- Requires sustained compliance (audited annually)
1.5 Industry-Specific Regulations
Financial Services (SEC, FINRA)
- Model risk management (model validation, monitoring, documentation)
- Cannot use models for trading recommendations without disclosure
- Must explain model decisions to regulators on request
- Applies to: Investment platforms, robo-advisors, loan underwriting
Employment (EEOC, State Laws)
- AI hiring/promotion tools must not discriminate based on protected attributes
- Must audit for disparate impact (different outcomes for protected groups)
- Applies to: Recruitment systems, performance evaluation tools
Education (FERPA)
- Cannot disclose student education records to third parties
- If using AI on student data, must maintain same confidentiality as paper records
- Applies to: Student assessment, tutoring systems, enrollment systems
Insurance (State Insurance Commissioners)
- Cannot use discriminatory proxies (e.g., zip code as proxy for race)
- Must justify underwriting decisions using legitimate factors
- Applies to: Claims processing, pricing models, risk assessment
Autonomous Systems & Safety (NHTSA, FAA)
- Self-driving vehicles: must maintain safety records, disclose limitations
- Autonomous drones: must have failure modes, human override
- Applies to: Robots, autonomous agents, safety-critical systems
1.6 How Regulations Apply to Harnesses
Risk matrix: Is your harness regulated?
| Harness Type | Regulated Industries | Key Regulations | Risk Level |
|---|---|---|---|
| Data processing agent | All if processing PII | GDPR, CCPA, HIPAA | Medium |
| Recommendation engine | Finance, healthcare, employment | FTC, SEC, EEOC | High |
| Hiring/promotion system | Employment | EEOC, state labor laws | Critical |
| Medical diagnosis assistant | Healthcare | HIPAA, FDA | Critical |
| Financial advisor | Finance | SEC, FINRA | Critical |
| Content moderation agent | All consumer-facing | FTC, Platform rules | Medium |
| Customer support bot | All with PII | GDPR, CCPA, industry-specific | Medium |
| Internal automation tool | Depends on data | Industry-specific | Low-Medium |
Assessment questions:
- Does your harness make consequential decisions about people? (recommendations, approvals, denials, rankings)
- Does it process personal or health data?
- Will it be used in regulated industries?
- Could it affect access to services (finance, employment, housing, education, healthcare)?
- Will it be used with vulnerable populations (children, elderly)?
If yes to any: Your harness is likely regulated. Proceed to compliance sections below.
2. Data Privacy & GDPR Compliance
2.1 What Qualifies as Personal Data
Personal data (GDPR Art. 4): Any information relating to an identified or identifiable person
Includes:
- Direct identifiers: Name, email, phone, ID number
- Pseudonymized data: Data where the individual cannot be identified without additional information, but could be re-identified with reasonable effort
- Genetic data: DNA sequences, family relationships
- Biometric data: Fingerprints, facial recognition, iris scans, voiceprints
- Special categories (require explicit consent):
- Race/ethnicity, political opinions, religious beliefs
- Trade union membership, genetic data, biometric data for ID
- Health data, sexual orientation, criminal convictions
Does NOT include:
- Fully anonymized data (cannot be linked back to individual, even with significant effort)
- Business contact information (corporate email, business address) — if genuinely only for business
- Aggregate statistics (e.g., “70% of users prefer feature X”) — if no individual can be identified
2.2 Data Minimization Principle
Collect only what you need.
Questions to ask:
- Do I need this data to fulfill the stated purpose?
- Could I achieve the goal with less detailed data?
- Could I use aggregated or anonymized data instead?
- What’s the minimum retention period needed?
Examples:
| Purpose | Needed Data | Excessive Data |
|---|---|---|
| Send password reset link | Email address | Email + phone + address |
| Recommend products | Browsing history, preferences | Full browsing history + location + family info |
| Train model on diversity | Demographic attributes | Names, addresses, employer details |
| Verify age (13+) | Birth year | Full date of birth, SSN |
Implementation:
- In database schema: only store minimum fields
- In APIs: accept only necessary parameters
- In model training: exclude unnecessary features
- In logging: redact sensitive fields before writing logs
- In exports: provide only relevant data to third parties
2.3 User Consent Requirements
Lawful basis for processing (GDPR Art. 6) — choose one:
- Consent: User explicitly agrees to processing (must be specific, granular, informed)
- Contract: Processing necessary to fulfill a contract with user
- Legal obligation: Required by law
- Vital interests: Necessary to protect someone’s life
- Public task: Necessary for public authority to perform official duty
- Legitimate interests: Balancing test (your interest vs. user privacy) — requires clear documentation
Consent requirements:
- Specific: “Use my data to train models” is OK; “We may use your data” is not
- Granular: Allow users to consent to email separately from SMS
- Informed: Must explain what data, how it will be used, with whom it’s shared
- Freely given: Cannot be forced (not a condition of service unless necessary)
- Separate from T&Cs: Cannot hide consent in dense terms & conditions
- Documented: Must record that consent was given, when, and for what (a consent-record sketch follows the example flow below)
- Easy to withdraw: Users must be able to revoke consent as easily as they gave it
Special rules for children:
- Below the age of digital consent (13-16, depending on the country): need parental consent, not the child's
- Above the age of digital consent but under 18: can give their own consent, but consider additional safeguards
Example consent flow:
[ ] I consent to storing my email for account notifications
[ ] I consent to using my anonymized browsing data to improve recommendations
[ ] I consent to sharing my data with analytics platform [AnalyticsCorp]
[ ] I consent to using my health data to train wellness models (requires review for fairness)
Learn more about: [Data processing] [Our privacy practices] [Your rights]
[Decline All] [Accept Checked]
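A minimal sketch of what a consent record could look like, assuming an append-only store keyed by user and purpose; the field names here are illustrative, not a required schema:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class ConsentRecord:
    """One row per user per purpose; append a new row on every change, never overwrite."""
    user_id: str                    # Pseudonymized user identifier
    purpose: str                    # e.g. "account_notifications", "recommendation_training"
    granted: bool                   # True = consent given, False = declined or withdrawn
    decided_at: datetime            # When the user made this choice
    consent_text_version: str       # Exact wording shown to the user (versioned)
    collection_method: str          # e.g. "signup_form_v4", "settings_page"
    withdrawn_at: Optional[datetime] = None   # Set when consent is later revoked

def has_valid_consent(records: List[ConsentRecord], user_id: str, purpose: str) -> bool:
    """Check the most recent decision for this user and purpose."""
    relevant = [r for r in records if r.user_id == user_id and r.purpose == purpose]
    if not relevant:
        return False
    latest = max(relevant, key=lambda r: r.decided_at)
    return latest.granted and latest.withdrawn_at is None
```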
2.4 Right to Be Forgotten (Erasure)
What it means: Users can request deletion of their personal data
Exceptions (data that may not be deleted):
- Required by law
- Fulfilling contract with user
- In public interest (e.g., journalist records)
- Exercising freedom of expression
- For legal claims
- Public health in the public interest
Implementation challenge: After training a model on user data, retraining without that user’s data is expensive
Solutions:
- Remove from training set: Retrain model excluding user’s data (expensive but thorough)
- Differential privacy: Train model in way that individual contribution is minimal (probabilistic)
- Federated learning: Never centralize user data; train on devices or secure enclaves
- Layer of abstraction: Train on minimally identifiable features; aggregate before training
Process:
- User requests deletion (email, API, delete account button)
- Verify identity and find all data for that user
- Delete from systems (databases, backups, analytics)
- Delete from archives after backup retention period
- Document deletion for audit trail
- Notify user within 30 days
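A minimal sketch of that process as code, assuming each data store exposes a `delete_user(user_id)` method (a hypothetical interface); the receipt, not the data, is what remains for the audit trail:

```python
import logging
from datetime import datetime, timedelta

logger = logging.getLogger("erasure")

def handle_erasure_request(user_id, stores, identity_verified):
    """Orchestrate a right-to-erasure request across all registered data stores."""
    if not identity_verified:
        raise PermissionError("Verify the requester's identity before deleting anything")

    receipt = {
        "user_id": user_id,    # Pseudonymized ID only; the receipt must not retain PII
        "requested_at": datetime.utcnow().isoformat(),
        "respond_by": (datetime.utcnow() + timedelta(days=30)).isoformat(),
        "deleted_from": [],
    }
    for store in stores:
        deleted_count = store.delete_user(user_id)   # Each store handles its own deletion
        receipt["deleted_from"].append({"store": store.name, "records": deleted_count})
        logger.info("Deleted %d records from %s", deleted_count, store.name)

    # Keep the receipt as audit-trail evidence that the deletion happened
    return receipt
```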
2.5 Data Retention Policies
Principle: Delete data when no longer needed
Recommended retention periods by data type:
| Data Type | Retention Period | Justification |
|---|---|---|
| Account credentials (hashed) | Until account deleted | Needed for authentication |
| Email/contact info | Until account deleted or user requests deletion | Needed for communication |
| Payment info | Until transaction settled + 7 years | Tax and fraud investigation |
| Login audit logs | 90-365 days | Security monitoring, GDPR audit |
| Error logs with PII | 30 days or less | Debugging; minimize PII in logs |
| Health/genetic data | Until purpose fulfilled + 1 year | GDPR special category |
| Consent records | Until consent withdrawn or 5 years | GDPR requirement |
| Model training data | Until model retired | Can be aggregated/anonymized after |
| Backups | 30-90 days (encrypted) | Disaster recovery |
| Deleted account data | Delete within 30 days | GDPR requirement |
Implementation:
- Set expiration dates in database (TTL, automatic deletion)
- Audit trail: log when data was deleted, by whom, why
- Verify deletion: sampling check that deletion actually occurred
- Backup retention: backups are harder to delete; consider encrypted backups with auto-expiration
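A minimal sketch of enforcing those windows with a scheduled job, assuming a DB-API style cursor, a `created_at` column on each table, and an illustrative `retention_audit` table:

```python
from datetime import datetime, timedelta

# Illustrative retention windows in days, mirroring the table above
RETENTION_DAYS = {
    "error_logs_with_pii": 30,
    "login_audit_logs": 365,
    "deleted_account_data": 30,
}

def purge_expired(cursor, table, category):
    """Delete rows older than the retention window and record that the purge ran."""
    cutoff = datetime.utcnow() - timedelta(days=RETENTION_DAYS[category])
    cursor.execute(f"DELETE FROM {table} WHERE created_at < %s", (cutoff,))
    # Log the purge itself so the audit trail shows retention was enforced
    cursor.execute(
        "INSERT INTO retention_audit (table_name, category, cutoff, run_at) "
        "VALUES (%s, %s, %s, %s)",
        (table, category, cutoff, datetime.utcnow()),
    )
```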
2.6 GDPR Implementation Checklist
Assess and document:
- Do we process EU resident data? (If yes, GDPR applies)
- Have we completed a Data Protection Impact Assessment (DPIA)?
- Have we appointed a Data Protection Officer (if required by size/type)?
- Have we appointed an EU representative and identified our lead supervisory authority (where required)?
- Do we have lawful basis documented for each processing activity?
Consent and transparency:
- Privacy policy clearly explains data collection and use
- Consent mechanism is specific, granular, informed, freely given
- Easy option to withdraw consent (in user settings)
- Cookie consent (if applicable) meets GDPR standards
User rights:
- Process to respond to access requests (within 30 days)
- Process to respond to deletion requests
- Process to respond to portability requests (machine-readable format)
- Process to respond to objection requests
- Test that deletion actually works in all systems
Technical safeguards:
- Encryption in transit (HTTPS, TLS)
- Encryption at rest (database encryption, encrypted backups)
- Access controls (who can see what data)
- Audit logging (immutable logs of data access)
- Regular security audits
Training and documentation:
- All staff handling data trained on GDPR
- Data Processing Agreements with any third parties
- Breach notification plan (notify supervisory authority within 72 hours)
- Records of processing: what, why, for how long
3. PII Detection & Protection
3.1 Sensitive Data Types
Personally Identifiable Information (PII): Data that directly identifies or can be used to identify someone
High-risk PII (can cause identity theft, fraud, or serious harm if exposed):
- Government identifiers: Social Security Number, driver’s license, passport, tax ID
- Financial: Credit card, bank account, routing number, PIN
- Biometric: Fingerprints, DNA, facial recognition templates, iris scans
- Health/genetic: Medical records, diagnoses, genetic data, mental health info
- Precise location: GPS coordinates, home address (combined with name)
Medium-risk PII (sensitive but less directly harmful):
- Contact: Full name + phone/email/address, work contact info
- Demographic: Date of birth, race/ethnicity, religious beliefs
- Educational: School/university name, grades, degree
- Professional: Job title, employer, salary, client list
Low-risk PII (generally public):
- First name, general city, public LinkedIn profile
3.2 Detecting Sensitive Data
Automated detection patterns:
# SSN: 123-45-6789 or 123456789
\b\d{3}-?\d{2}-?\d{4}\b
# Credit card: 16-digit number with spaces/dashes
\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b
# Email: standard format
\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
# Phone: various formats
\b(\+1)?[\s.-]?\(?[0-9]{3}\)?[\s.-]?[0-9]{3}[\s.-]?[0-9]{4}\b
# IP address (can be PII in some contexts)
\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b
# Keywords in context
"password": ".*"
"api_key": ".*"
"ssn": ".*"
"credit_card": ".*"
Limitations of patterns:
- Generate false positives (e.g., “123-45-6789” in a fictional example)
- Miss variations (spaces, typos)
- Don’t detect semantic PII (e.g., “my mother was born in 1952” = age + family)
- Don’t detect combinations (e.g., job title + unusual name = identifiable)
Better approach: Combination of patterns + semantic detection + human review
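A minimal sketch that combines the patterns above into a first-pass scanner; as noted, it will produce false positives and miss semantic PII, so treat it as one layer before semantic detection and human review:

```python
import re

# Compiled versions of the patterns above; intentionally broad
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b"),
    "credit_card": re.compile(r"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b"),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "phone": re.compile(r"\b(\+1)?[\s.-]?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def scan_for_pii(text):
    """Return counts of pattern matches by type for a first-pass PII scan."""
    findings = {}
    for name, pattern in PII_PATTERNS.items():
        matches = list(pattern.finditer(text))
        if matches:
            findings[name] = len(matches)
    return findings

# Example
print(scan_for_pii("Contact jane@example.com or call 555-123-4567"))
# {'email': 1, 'phone': 1}
```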
3.3 Redaction Before Logging
Problem: Logs often contain PII by accident (error messages, request bodies, stack traces)
Example: User reports error
ERROR: Failed to process payment
Request: POST /api/payment
card_number: 4532-1234-5678-9123
cvv: 234
amount: 19.99
Stack trace: ... payment.py line 456
Solution: Redaction
ERROR: Failed to process payment
Request: POST /api/payment
card_number: [REDACTED]
cvv: [REDACTED]
amount: [REDACTED]
Stack trace: ... payment.py line 456
Implementation approach:
- Configure logging to redact certain fields
- Redact PII patterns before writing to logs
- Use structured logging (JSON) to identify sensitive fields
- Regular audit of logs for leaked PII
Code example (Python):
import logging
import re

class RedactionFilter(logging.Filter):
    def filter(self, record):
        # Redact SSNs
        record.msg = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[REDACTED-SSN]', str(record.msg))
        # Redact credit card numbers
        record.msg = re.sub(r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b', '[REDACTED-CC]', str(record.msg))
        # Redact specific structured fields if present
        if hasattr(record, 'password'):
            record.password = '[REDACTED]'
        return True

logger = logging.getLogger()
logger.addFilter(RedactionFilter())
3.4 Anonymization for Analytics
Goal: Enable analysis without identifying individuals
Techniques:
- Aggregation: Report statistics, not individuals
  - Instead of: “User [email] clicked [button] 12 times”
  - Use: “Across 1,000 users, [button] was clicked 8.5 times on average”
- Differential privacy: Add noise to protect individuals
  - Train the model so that removing any single user does not substantially change its output
  - Quantify the privacy loss (epsilon parameter)
- K-anonymity: Each record is indistinguishable from at least k-1 others
  - Generalize: ZIP code 90210 → “Los Angeles area”
  - Delete rare values: if only one user has a given postal code, remove the postal code
  - Suppression: replace specific values with ranges
- Data masking: Replace with fake but realistic data
  - SSN 123-45-6789 → 987-65-4321 (fake but same format)
  - Name “Alice Smith” → “Bob Johnson” (not linked to a real person)
- Tokenization: Replace with a random identifier
  - user_id=12345 → token=abc-def-ghi
  - Cannot be linked back to the original unless the mapping key is stored separately
Anonymization effectiveness matrix:
| Technique | Reversal Risk | Analytical Value | Effort |
|---|---|---|---|
| Aggregation | None | Moderate | Low |
| Differential privacy | Very low (provable bound) | Good | High |
| K-anonymity | Moderate (homogeneity attacks) | Moderate | Medium |
| Masking | None (if one-way) | High | Medium |
| Tokenization | Low (if key separate) | High | Low |
Implementation example (Apache Spark):
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, when

spark = SparkSession.builder.appName("Anonymize").getOrCreate()

# Read raw user data
df = spark.read.json("users.json")

# Anonymize: keep coarse age range, remove exact birth date
anonymized = df.select(
    col("user_id").alias("user_token"),   # Keep a pseudonymous ID for analytics joins
    when(col("age") < 18, "0-17")
        .when(col("age") < 25, "18-24")
        .when(col("age") < 65, "25-64")
        .otherwise("65+").alias("age_group"),
    col("country").alias("location"),     # Coarse location only
    col("purchase_total"),
    col("visit_count")
)

anonymized.write.parquet("anonymized_analytics/")
3.5 PII Detection & Protection Checklist
Identify:
- Catalog all data sources (databases, APIs, files, logs)
- Identify what PII each source contains
- Classify by sensitivity (high/medium/low)
- Identify who accesses each data source
Protect:
- Encrypt all high-risk PII at rest and in transit
- Access controls: limit who can see PII
- Redaction: automatic redaction in logs and exports
- Retention: delete when no longer needed
- Backups: encrypted backups with separate key management
Monitor:
- Audit logs: who accessed what data and when
- Anomaly detection: alert on unusual data access patterns
- Regular scans: automated PII detection in logs and exports
- Breach plan: process to notify users if PII is exposed
Test:
- Simulate: mock data that looks like real PII
- Verify redaction: check that logs don’t contain real PII
- Test retention: verify data is deleted after retention period
- Test deletion: verify user deletion removes all traces
4. Fairness & Bias Detection
4.1 What is Bias?
Bias in AI: Systematically treating different groups differently based on protected attributes
Protected attributes (vary by jurisdiction):
- Race/ethnicity
- Color
- Gender/sex (including pregnancy, sexual orientation, gender identity)
- Religion/belief
- Age
- Disability
- National origin
- Marital status (some jurisdictions)
- Genetic information (some jurisdictions)
Types of bias:
- Representation bias: Training data doesn’t represent all groups
  - Example: Resume screening trained on mostly male engineers penalizes female candidates
  - Impact: Model performs worse for underrepresented groups
- Measurement bias: Outcomes measured differently across groups
  - Example: Recidivism model trained on arrests, not actual crimes; over-policed communities have more arrests
  - Impact: Model learns an unfair proxy (over-policing) as a predictor of crime
- Aggregation bias: Applying a one-size-fits-all model to diverse groups
  - Example: Medical model trained on the majority population performs worse for minorities
  - Impact: Dangerous health decisions for underrepresented groups
- Evaluation bias: Testing only on the majority group
  - Example: Facial recognition tested on white men performs poorly on women and people of color
  - Impact: Deployed system fails when it matters most
- Outcome bias: Model perpetuates historical discrimination
  - Example: Hiring model trained on past hiring decisions, which were themselves discriminatory
  - Impact: The model “learns” to discriminate against protected groups
- Proxy bias: Using features that correlate with protected attributes
  - Example: ZIP code as a proxy for race; name as a proxy for gender
  - Impact: Discrimination without explicitly using protected attributes
4.2 Detecting Bias in Outputs
Metrics for fairness (choose appropriate for your context):
- Demographic Parity (Statistical Parity)
  - Definition: Positive outcome rate is equal across groups
  - Formula: P(Y=1 | G=A) = P(Y=1 | G=B) for groups A and B
  - Example: Hiring acceptance rate should be 30% for all genders
  - Limitation: May require rejecting qualified candidates to achieve parity
- Equalized Odds (Equal Opportunity)
  - Definition: True positive rate (TPR) and false positive rate (FPR) are equal across groups
  - Formula: P(Y’=1 | Y=1, G=A) = P(Y’=1 | Y=1, G=B), and likewise for FPR
  - Example: Loan approval system identifies 80% of creditworthy applicants in all racial groups
  - Strength: Focuses on classifier performance, not outcome rates
- Equalized False Negative Rates
  - Definition: False negative rate (FNR) is equal across groups
  - Formula: P(Y’=0 | Y=1, G=A) = P(Y’=0 | Y=1, G=B)
  - Example: Recidivism model misses dangerous individuals at the same rate for all races
  - Strength: Focuses on errors for the positive class
- Predictive Parity
  - Definition: Precision (accuracy of positive predictions) is equal across groups
  - Formula: P(Y=1 | Y’=1, G=A) = P(Y=1 | Y’=1, G=B)
  - Example: When the hiring model predicts “good candidate,” that is accurate 85% of the time for all genders
  - Limitation: Can be hard to measure (requires long-term outcomes)
- Calibration
  - Definition: Predicted probability matches actual outcome probability for all groups
  - Example: If the model says “80% chance of success,” that outcome actually occurs 80% of the time
  - Strength: Enables fair decision thresholds (a per-group calibration check is sketched after this list)
- Individual Fairness
  - Definition: Similar individuals should receive similar treatment
  - Example: Two identical applicants should get the same decision regardless of protected attribute
  - Limitation: Requires defining “similar,” which is subjective
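A minimal sketch of the per-group calibration check referenced above, assuming a pandas DataFrame with a predicted-probability column and a binary outcome column (column names are illustrative):

```python
import numpy as np
import pandas as pd

def calibration_by_group(df, group_col, prob_col="predicted_score",
                         outcome_col="hired", n_bins=10):
    """Compare mean predicted probability to observed outcome rate per group and bin.

    A well-calibrated model has mean_predicted ≈ observed_rate in every bin,
    for every group.
    """
    rows = []
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    for group, gdf in df.groupby(group_col):
        binned = pd.cut(gdf[prob_col], bins, include_lowest=True)
        for interval, bdf in gdf.groupby(binned):
            if len(bdf) == 0:
                continue   # Skip empty probability bins
            rows.append({
                group_col: group,
                "bin": str(interval),
                "n": len(bdf),
                "mean_predicted": bdf[prob_col].mean(),
                "observed_rate": bdf[outcome_col].mean(),
            })
    return pd.DataFrame(rows)

# Usage: calibration_by_group(hiring_df, group_col="gender")
```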
4.3 Testing for Fairness
Test structure:
- Identify relevant protected attributes for your domain
- Segment data by protected attribute
- Evaluate model performance for each group
- Calculate fairness metrics for each group-pair
- Identify disparities and root causes
- Decide acceptable thresholds (e.g., max 5% difference in TPR)
Example: Hiring recommendation system
from sklearn.metrics import confusion_matrix, roc_auc_score
import pandas as pd

# Data: candidates with predictions and outcomes
df = pd.read_csv("hiring_decisions.csv")
# Columns: predicted_score, hired, gender, race, years_exp, ...

def fairness_audit(df, protected_attr='gender'):
    """Audit model for bias across a protected attribute."""
    results = {}
    for group in df[protected_attr].unique():
        group_df = df[df[protected_attr] == group]
        # Calculate metrics for this group
        tn, fp, fn, tp = confusion_matrix(
            group_df['hired'],
            (group_df['predicted_score'] > 0.5).astype(int),
            labels=[0, 1]
        ).ravel()
        tpr = tp / (tp + fn)  # True positive rate
        fpr = fp / (fp + tn)  # False positive rate
        fnr = fn / (fn + tp)  # False negative rate
        precision = tp / (tp + fp) if (tp + fp) > 0 else 0
        results[group] = {
            'size': len(group_df),
            'hiring_rate': group_df['hired'].mean(),
            'tpr': tpr,
            'fpr': fpr,
            'fnr': fnr,
            'precision': precision,
            'auc': roc_auc_score(group_df['hired'], group_df['predicted_score'])
        }
    # Print results
    audit_df = pd.DataFrame(results).T
    print(audit_df)
    # Identify disparities (>5% difference between best and worst group)
    tpr_disparity = audit_df['tpr'].max() - audit_df['tpr'].min()
    fnr_disparity = audit_df['fnr'].max() - audit_df['fnr'].min()
    if tpr_disparity > 0.05:
        print(f"WARNING: TPR disparity detected ({tpr_disparity:.1%})")
    if fnr_disparity > 0.05:
        print(f"WARNING: FNR disparity detected ({fnr_disparity:.1%})")
    return audit_df

# Run audit
fairness_audit(df, protected_attr='gender')
fairness_audit(df, protected_attr='race')
Output example:
size hiring_rate tpr fpr fnr precision auc
male 450 0.45 0.87 0.12 0.13 0.82 0.91
female 350 0.38 0.74 0.18 0.26 0.71 0.84
--------
WARNING: TPR disparity detected (13.0%)
WARNING: FNR disparity detected (13.0%)
4.4 Demographic Parity vs Equalized Odds
When to use each:
| Metric | Use When | Don’t Use When |
|---|---|---|
| Demographic Parity | Hiring/admissions (goal is equal representation) | Loan approval (can’t force equal lending) |
| Equalized Odds | Safety-critical (fraud, recidivism, medical) | Sensitive to base rate differences |
| Equalized FNR | High cost to false negatives (missing criminals, patients) | Equal approval rates more important |
| Calibration | Transparent scoring needed (lending rates) | Raw fairness metrics less important |
Real-world examples:
- Hiring: Both demographic parity AND equalized odds recommended
- Demographic parity: ensure diverse candidate pool
- Equalized odds: ensure model isn’t sabotaging qualified minorities
- Loan approval: Equalized odds alone is NOT sufficient (lenders would be incentivized to approve fewer minority applicants to reduce FPR)
- Use: Demographic parity + monitoring for disparate impact
- Recidivism: Equalized odds critical (don’t want false negatives to vary by race)
- Use: Equal FNR + TPR + regular monitoring
4.5 Mitigation Strategies
Pre-processing (before model training):
- Balanced sampling: Oversample underrepresented groups in training data (see the sampling sketch at the end of this section)
- Data augmentation: Generate synthetic data for underrepresented groups
- Stratified sampling: Ensure train/test splits represent all groups equally
- Proxy removal: Don’t train on protected attributes or strong proxies
In-processing (during model training):
- Fairness constraints: Add penalty if model exhibits bias
- Fairness-aware objective: Maximize accuracy subject to fairness constraint
- Debiased representations: Learn representations that are independent of protected attribute
- Threshold optimization: Adjust decision threshold differently for each group
Post-processing (after model training):
- Threshold adjustment: Different thresholds for different groups to equalize outcomes
- Output adjustment: Flip some predictions to meet fairness targets
- Fairness wrapper: Take model outputs and adjust for fairness
Long-term:
- Diverse training data: Actively collect data from underrepresented groups
- Diverse team: Engineers, domain experts, and affected communities involved in design
- Continuous monitoring: Regular fairness audits in production
- User feedback: Incorporate feedback from affected communities
Example: Threshold adjustment
# Original model predicts probability 0-1
# Adjust thresholds to achieve equal TPR across groups
import numpy as np

tpr_targets = {
    'male': 0.85,
    'female': 0.85,
}

thresholds = {}
for group, target_tpr in tpr_targets.items():
    group_df = df[df['gender'] == group]
    # Scores of applicants who were actually hired (the true-positive pool)
    positive_scores = group_df.loc[group_df['hired'] == 1, 'predicted_score'].values
    # Threshold such that target_tpr of true positives score above it
    thresholds[group] = np.quantile(positive_scores, 1 - target_tpr)

# Apply group-specific thresholds (store as a new column; don't overwrite ground truth)
df['predicted_hire'] = df.apply(
    lambda row: row['predicted_score'] > thresholds[row['gender']], axis=1
)
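Example: Balanced sampling (pre-processing). A minimal sketch of oversampling underrepresented groups before training, assuming a pandas DataFrame; apply this only to training data, never to evaluation data:

```python
import pandas as pd

def oversample_minority_groups(df, group_col, random_state=0):
    """Resample each group (with replacement where needed) up to the largest group's size."""
    target_size = df[group_col].value_counts().max()
    balanced_parts = []
    for _, group_df in df.groupby(group_col):
        balanced_parts.append(
            group_df.sample(
                n=target_size,
                replace=len(group_df) < target_size,   # Sample with replacement only if too small
                random_state=random_state,
            )
        )
    # Concatenate and shuffle so groups are interleaved
    return pd.concat(balanced_parts).sample(frac=1, random_state=random_state)

# Usage: train_df = oversample_minority_groups(train_df, group_col="gender")
```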
4.6 Fairness Testing Framework
Automated fairness testing (code):
import pandas as pd
from sklearn.metrics import confusion_matrix

class FairnessAudit:
    """Framework for testing fairness of a model across protected attributes."""

    def __init__(self, df, protected_attrs, outcome_col, pred_col, pred_threshold=0.5):
        self.df = df
        self.protected_attrs = protected_attrs   # e.g. ['gender', 'race', 'age_group']
        self.outcome_col = outcome_col           # e.g. 'hired' or 'loan_approved'
        self.pred_col = pred_col                 # e.g. 'pred_score' or 'model_prob'
        self.pred_threshold = pred_threshold
        self.results = {}

    def audit(self):
        """Run fairness audit across all protected attributes."""
        for attr in self.protected_attrs:
            self.results[attr] = self._audit_attribute(attr)
        return self.results

    def _audit_attribute(self, attr):
        """Audit a single protected attribute."""
        groups = {}
        for group in self.df[attr].unique():
            group_df = self.df[self.df[attr] == group]
            # Binary predictions
            y_true = group_df[self.outcome_col].values
            y_pred = (group_df[self.pred_col] > self.pred_threshold).astype(int).values
            # Calculate metrics
            tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
            groups[group] = {
                'n': len(group_df),
                'outcome_rate': y_true.mean(),
                'pred_rate': y_pred.mean(),
                'tpr': tp / (tp + fn) if (tp + fn) > 0 else 0,
                'fpr': fp / (fp + tn) if (fp + tn) > 0 else 0,
                'fnr': fn / (fn + tp) if (fn + tp) > 0 else 0,
                'precision': tp / (tp + fp) if (tp + fp) > 0 else 0,
            }
        return groups

    def report(self, threshold=0.05):
        """Generate fairness report with warnings."""
        print("=" * 80)
        print("FAIRNESS AUDIT REPORT")
        print("=" * 80)
        for attr, groups in self.results.items():
            print(f"\n{attr.upper()}")
            print("-" * 80)
            df_report = pd.DataFrame(groups).T
            print(df_report.to_string())
            # Check for disparities
            tpr_disparity = df_report['tpr'].max() - df_report['tpr'].min()
            fpr_disparity = df_report['fpr'].max() - df_report['fpr'].min()
            outcome_disparity = df_report['outcome_rate'].max() - df_report['outcome_rate'].min()
            if tpr_disparity > threshold:
                print(f"  ⚠️ TPR disparity: {tpr_disparity:.1%} (threshold: {threshold:.1%})")
            if fpr_disparity > threshold:
                print(f"  ⚠️ FPR disparity: {fpr_disparity:.1%} (threshold: {threshold:.1%})")
            if outcome_disparity > threshold:
                print(f"  ⚠️ Outcome disparity: {outcome_disparity:.1%} (threshold: {threshold:.1%})")

# Usage
audit = FairnessAudit(
    df=hiring_df,
    protected_attrs=['gender', 'race', 'age_group'],
    outcome_col='hired',
    pred_col='model_score',
    pred_threshold=0.5
)
audit.audit()
audit.report(threshold=0.05)  # Warn if >5% disparity
4.7 Fairness & Bias Checklist
Design:
- Identified protected attributes relevant to your domain
- Reviewed training data for representation of all groups
- Documented potential sources of bias (measurement, historical, aggregation)
- Decided fairness metric appropriate for use case (parity, odds, calibration)
Development:
- Analyzed training data composition by protected attribute
- Tested model performance for each group
- Calculated fairness metrics for each group-pair
- Identified groups where performance differs significantly
- Applied mitigation strategy (pre/in/post-processing)
Testing:
- Fairness test suite integrated into CI/CD
- Regular fairness audits (monthly or quarterly)
- Test coverage includes intersectional groups (e.g., women of color, older women)
- Document why any disparity is acceptable (if it is)
Monitoring:
- Production monitoring: fairness metrics tracked over time
- Alerts for fairness regressions (e.g., TPR disparity increases)
- User feedback: channel for affected communities to report unfairness
- Regular re-auditing: fairness checks continue in production
5. Explainability & Interpretability
5.1 Why It Matters
Regulatory:
- GDPR right to explanation (Art. 22): Users can request explanation of automated decisions
- FTC substantiation: AI claims must be truthful and can’t hide material limitations
- Sector-specific: Finance, healthcare, employment all require explainability
User trust:
- Transparency builds trust; opacity creates suspicion
- Users want to understand why they were recommended, approved, or rejected
- Explainability enables users to correct errors and provide feedback
Debugging & improvement:
- When model fails, explanations help identify root cause
- Fairness analysis: explanations reveal if model relies on proxies for protected attributes
- Feature importance: understand what drives predictions
5.2 How to Provide Explanations
Types of explanations:
-
Global explanation: How does the model work overall?
- “The model weighs: recent activity (40%), historical engagement (30%), user preferences (20%), trending now (10%)”
- “Credit decision factors: income (35%), debt-to-income ratio (30%), payment history (25%), other (10%)”
-
Local explanation: Why this specific decision?
- “Recommendation: Movie A because you watched 5 similar movies, your friends rated it 8.5/10, and algorithm detected similar taste”
- “Loan denied because: debt-to-income ratio 0.65 (max 0.60), no credit history, only 6 months employment history”
-
Counterfactual explanation: What would need to change for different outcome?
- “To get approval, reduce debt by $5,000 or increase income to $65,000+”
- “To get approved: clear past due payment or reduce credit utilization to <30%”
-
Feature importance: Which inputs matter most?
- SHAP (Shapley Additive exPlanations): Contribution of each feature to prediction
- LIME (Local Interpretable Model-agnostic Explanations): Approximate model locally with interpretable model
Example local explanations:
DECISION: Loan Approved ($50,000)
Key Factors:
+ Income: $120,000 (strong positive, top 20%)
+ Payment history: 8 years, 99% on-time (strong positive)
+ Debt-to-income: 0.35 (acceptable, below 0.40 limit)
- Credit utilization: 45% (minor negative, optimal is <30%)
+ Employment: 5 years at current job (strong positive, stability)
Why approved? Strong income, excellent payment history, and low debt relative to income outweigh moderate credit utilization concern.
To improve rate: Reduce credit card balances to <30% of limits.
5.3 Traceability & Audit Trails
Traceability: Ability to trace decision back to data, model version, and reasoning
Why it matters:
- Regulatory: Must explain decisions to regulators
- Debugging: Trace failures to identify root cause
- Fairness: Verify decisions aren’t based on protected attributes
- Legal: Support appeals and disputes
What to log (immutable audit trail):
{
"decision_id": "dec-2024-04-18-12345",
"timestamp": "2024-04-18T14:23:45Z",
"user_id": "user-xyz", // Pseudonymized
"decision_type": "loan_approval",
"decision": "approved",
"model_version": "credit-model-v3.2.1",
"model_hash": "sha256:abc123...",
"input_features": {
"income": 120000,
"debt_to_income": 0.35,
"payment_history_years": 8,
"on_time_payment_pct": 0.99,
"employment_years": 5
},
"model_output": {
"approval_probability": 0.87,
"score": 785,
"threshold": 0.70
},
"explanation": {
"top_factors": [
{"feature": "payment_history_years", "contribution": 0.35},
{"feature": "income", "contribution": 0.30},
{"feature": "on_time_payment_pct", "contribution": 0.25}
]
},
"approved_by": "system", // Or human reviewer
"appeal_available": true,
"appeal_deadline": "2024-05-18T23:59:59Z"
}
5.4 Tool Call Justification
For agent systems: When agent chooses tool, explain why
Example:
User: "Help me plan a 3-day trip to Japan"
Agent reasoning: User is asking for trip planning. I need to:
1. Search for flights [tool: web_search]
2. Find hotels [tool: web_search]
3. Get local attractions [tool: web_search]
4. Build itinerary [reasoning: combine results with LLM]
Tool call 1: web_search(
query: "flights to Japan 3 days from today"
reason: "Find available flights and prices for user's trip"
)
Tool call 2: web_search(
query: "best hotels in Tokyo for budget travelers"
reason: "Find accommodation options matching user's likely preferences (typical 3-day budget trips use Tokyo as hub)"
)
Response: Here's a 3-day Japan itinerary...
Explanation of recommendations: Based on flight availability from your location, hotel ratings, and popular attractions for first-time visitors.
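A minimal sketch of recording that justification programmatically; the wrapper, log structure, and `web_search` tool are illustrative rather than any specific framework's API:

```python
from datetime import datetime

tool_call_log = []

def call_tool(tool_name, tool_fn, reason, **kwargs):
    """Invoke a tool and record what was called, with which arguments, and why."""
    entry = {
        "timestamp": datetime.utcnow().isoformat(),
        "tool": tool_name,
        "arguments": kwargs,
        "reason": reason,   # Human-readable justification, surfaced in audits and explanations
    }
    result = tool_fn(**kwargs)
    entry["result_summary"] = str(result)[:200]   # Truncate to avoid logging large or sensitive payloads
    tool_call_log.append(entry)
    return result

# Usage (web_search is a placeholder for whatever search tool the harness exposes):
# call_tool("web_search", web_search,
#           reason="Find available flights and prices for the user's trip",
#           query="flights to Japan 3 days from today")
```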
5.5 Implementation Patterns
Pattern 1: Feature importance with SHAP
import shap
import xgboost as xgb

# Train model (params and training data assumed to be defined elsewhere)
model = xgb.train(params, data)

# Create explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Explain single prediction
def explain_decision(X_row, feature_names):
    """Generate a human-readable explanation for one prediction."""
    shap_value = explainer.shap_values(X_row)
    base_value = explainer.expected_value
    # Pair each feature with its value and its SHAP contribution
    feature_importance = list(zip(
        feature_names,
        X_row[0],
        shap_value[0]
    ))
    # Sort by absolute contribution, largest first
    feature_importance.sort(key=lambda x: abs(x[2]), reverse=True)
    explanation = f"Base score: {base_value:.2f}\n"
    for feat_name, feat_val, shap_val in feature_importance[:5]:
        direction = "↑" if shap_val > 0 else "↓"
        explanation += f"  {direction} {feat_name}={feat_val:.2f} (impact: {shap_val:+.3f})\n"
    return explanation

# Usage
explanation = explain_decision(X_test[0:1], feature_names)
print(explanation)
Pattern 2: Counterfactual explanation
import numpy as np

def counterfactual_explanation(X_instance, model, target_class, feature_names, feature_max):
    """Find minimal single-feature changes needed for a different outcome.

    feature_max: per-feature upper bounds for the search (same order as feature_names).
    """
    current_prediction = model.predict(X_instance)[0]
    if current_prediction == target_class:
        return "Already predicted target class"
    # Try adjusting each feature independently
    counterfactuals = []
    for i, feature_name in enumerate(feature_names):
        original_value = X_instance[0, i]
        # Sweep upward toward the feature's maximum, looking for a value that flips the prediction
        for new_value in np.linspace(original_value, feature_max[i], 20):
            X_test = X_instance.copy()
            X_test[0, i] = new_value
            if model.predict(X_test)[0] == target_class:
                counterfactuals.append({
                    'feature': feature_name,
                    'change': new_value - original_value,
                    'new_value': new_value
                })
                break
    # Sort by smallest change needed
    counterfactuals.sort(key=lambda x: abs(x['change']))
    explanation = "To achieve target outcome, try:\n"
    for cf in counterfactuals[:3]:
        direction = "increase" if cf['change'] > 0 else "decrease"
        explanation += f"  {direction} {cf['feature']} to {cf['new_value']:.2f}\n"
    return explanation
5.6 Explainability Checklist
Design:
- Identified stakeholders who need explanations (users, regulators, support)
- Determined level of detail needed (simple vs technical)
- Chose explanation approach appropriate for model type
- Planned how to communicate limitations and uncertainty
Implementation:
- Global explanations documented (how model works overall)
- Local explanations generated for each decision
- Feature importance / contribution clearly communicated
- Explanations tested for clarity with actual users
- Explanation generated quickly enough for real-time use
Audit trail:
- All decisions logged with input data, model version, output
- Audit logs immutable and retrievable
- Audit logs include explanation/reasoning
- Can retrieve decision details for any specific case
Monitoring:
- Verify explanations match actual model behavior (no gaming)
- User feedback on explanation quality
- Explanations reviewed in fairness audits
6. Transparency & Disclosure
6.1 When to Disclose AI Involvement
FTC rule: Disclose material facts about AI
Must disclose when:
- System makes consequential decisions (approval/denial, ranking, recommendation)
- Claims about system performance or fairness made
- System replaces human judgment in decision-making
- User would reasonably expect human to make decision
- Users are vulnerable (children, elderly, low digital literacy)
Examples where disclosure required:
- “This job recommendation is AI-generated” ✓
- Loan approval letter: “Your application was reviewed by an AI system” ✓
- Medical diagnosis: “This diagnosis was assisted by AI analysis” ✓
- Resume screening: Disclose that AI filters resumes ✓
- Content moderation: “This content was flagged by AI for review” ✓
Examples where disclosure may not be required (but consider doing it anyway):
- Auto-complete in search box (minor, expected)
- Spellcheck (clearly AI, obvious to user)
- Recommendations in exploratory context (browsing products)
- Internal tools not seen by users
Note: When in doubt, disclose. Transparency builds trust.
6.2 Labeling AI-Generated Content
For content generated by AI (text, images, video, audio):
Minimal label:
⚠️ This content was generated using AI
Better label (with context):
This article was written by an AI system trained on news sources.
Fact-check claims before sharing.
Best practice (specific model + limitations):
🤖 AI-Generated Content
Generated using: GPT-4 language model
Limitations: May contain inaccuracies, outdated information, or bias
Confidence: Medium (base facts verified, analysis not independently checked)
Last verified: April 10, 2024
Feedback: Report errors at [email]
Where to place labels:
- Beginning of content (most noticeable)
- Byline: “By AI Assistant (GPT-4)” instead of human name
- Metadata: Mark as “AI-generated” for discoverability
- If in video: visual indicator + spoken disclosure
- If in social media: Use provided “Labeled” feature; pin comment explaining
6.3 User Expectations
Set expectations early:
- Onboarding: When users first use system, explain what AI is involved
- Settings: Allow users to choose level of AI assistance
- Transparency: Make it easy to see why system made a decision
Example onboarding for recommendation system:
🎯 How our recommendations work
We use AI to learn from:
• What you like and dislike
• What similar users enjoyed
• What's popular right now
Why this matters:
✓ Better recommendations tailored to your taste
⚠️ AI isn't perfect (sometimes will miss or misunderstand)
📊 We track what we get wrong and improve over time
You're in control:
• Can always ignore recommendations
• Can hide inappropriate suggestions
• Can adjust AI preference settings
• Can see why each recommendation was made
Questions? See our [FAQ] or email [support]
Example for decision-making system:
⚖️ How decisions are made
When you apply, your information is reviewed by:
1. Automated screening (AI) — Initial pass/fail based on requirements
2. Human review — Final decision made by human reviewer
3. Appeal process — Can contest decision in writing
The AI screening:
✓ Ensures consistent evaluation
✓ Flags potential bias issues
⚠️ Is not final decision
📊 Is reviewed by humans for fairness
You have rights:
• Right to explanation of why you were approved/denied
• Right to appeal decision
• Right to request human review if denied
Questions? See [Appeals] or email [appeals@company]
6.4 Clear Communication
Principles:
- Accurate: Don’t oversell capabilities
- Honest: Acknowledge limitations and uncertainty
- Simple: Use plain language, not jargon
- Timely: Disclose before user is affected
- Actionable: Tell user what they can do about it
Common mistakes:
- ❌ “Our AI will perfectly match you with your soulmate” → No AI is perfect
- ❌ “Powered by machine learning” with no explanation → Meaningless jargon
- ❌ Hiding AI in small print → Defeats transparency purpose
- ❌ Overstating accuracy without evidence → FTC violation
Good examples:
- ✓ “Based on your viewing history, we think you’d enjoy: [Movie]. Why? You’ve watched 8 similar movies.”
- ✓ “Estimated match: 72% (based on 5 shared interests, similar values about family and career)”
- ✓ “Prediction: 70% likely to succeed in program. This is based on academic performance, test scores, and interviews. Accuracy is 75% (varies by demographic).”
7. Audit Trails & Accountability
7.1 Immutable Logs of Decisions
Purpose: Enable investigation, prove compliance, debug failures
What to log for decision systems:
{
# Identifiers
"request_id": "req-2024-04-18-xyz789", # Unique ID for this decision
"timestamp": "2024-04-18T14:23:45Z", # UTC timestamp
# User & context
"user_id": "user-12345", # Pseudonymized or tokenized
"context": {
"device": "mobile-ios",
"location_country": "US", # Coarse location only
"session_id": "sess-abc123"
},
# Input
"input_features": {
"age_group": "25-34", # Coarse categories
"income_bracket": "100k-150k",
"credit_score": 750,
"historical_purchases": 24
# Does NOT include: name, email, SSN, full address
},
# Processing
"model_version": "recommendation-v2.3.1",
"model_hash": "sha256:abc123...", # Immutable reference
"processing_time_ms": 245,
"model_output": {
"recommendation_id": "item-5432",
"confidence": 0.87,
"reason": "Similar to 8 items you purchased"
},
# Decision
"decision": "recommend",
"decision_maker": "system", # Or "human_reviewer"
"decision_time": "2024-04-18T14:23:46Z",
# Explanation
"explanation": {
"primary_reason": "Item matches your browsing history",
"secondary_reasons": ["Popular with similar users", "On sale this week"],
"factors_against": ["Expensive", "Niche category"]
},
# Compliance
"user_rights_applicable": ["access", "object"],
"appeal_available": true,
"appeal_deadline": "2024-05-18"
}
7.2 Who Made What Decision and When
Accountability structure:
{
"decision_id": "dec-loan-20240418-xyz",
"decision_type": "loan_approval",
"decision": "denied",
# Decision by AI
"ai_stage": {
"timestamp": "2024-04-18T09:00:00Z",
"model": "loan-approval-v5.2",
"recommendation": "denied",
"confidence": 0.72,
"reasoning": "Income < $60k threshold"
},
# Decision by human (if any)
"human_review": {
"timestamp": "2024-04-18T10:15:00Z",
"reviewer_id": "emp-review-543", # Pseudonymized employee ID
"reviewer_department": "loan-review",
"final_decision": "denied",
"override_of_ai": false,
"notes": "Applicant underqualified, low income for loan amount"
},
# Communication
"decision_communicated": {
"timestamp": "2024-04-18T10:30:00Z",
"method": "email",
"user_received": true
},
# Appeal
"appeal": {
"available": true,
"appeal_deadline": "2024-05-18",
"appeal_contact": "[email protected]"
}
}
7.3 Enabling Investigation of Failures
When something goes wrong, audit trails enable questions:
Question: “Why was this user recommended dangerous product?” Answer (from logs):
- Model version: recommendation-v2.1 (known bug with certain user profiles)
- Input features: Similar to user’s past, but missing safety flags
- Feature missing: “Has reported safety concern” (feature removed in v2.1 by mistake)
Question: “Was this decision biased against women?” Answer (from logs):
- Pulled audit logs for past 100 decisions
- Compared approval rates by gender
- Found no statistical disparity (p-value 0.42)
- Model features: No gender input, no strong gender proxies
Question: “Did employee access customer data they shouldn’t have?” Answer (from logs):
- Audit logs show all data access
- Employee accessed customer X on 2024-04-18 at 2:15pm
- Reason code: “Technical support ticket TSK-1234”
- Ticket shows: Customer reported password reset issue (legitimate reason)
7.4 Compliance with Regulations
GDPR compliance:
- Log evidence of consent (when given, what consent, withdrawal)
- Log data deletions (when deleted, what deleted, confirmation)
- Log access requests (who requested, when, what data provided)
HIPAA compliance:
- Log all PHI access (who, when, why, what data)
- Log disclosures (to whom, when, how much)
- Immutable: cannot modify or delete historical logs
FTC compliance:
- Log AI involvement (when, what system, what decision)
- Log explanations provided (what was disclosed to user)
7.5 Retention Periods
How long to keep logs (varies by regulation):
| Log Type | Retention | Reason |
|---|---|---|
| Decision logs | 7 years | GDPR, financial compliance, audit |
| Access logs (PII) | 3 years | GDPR, HIPAA |
| Deletion logs | 5+ years | Prove deletion happened |
| Consent records | Until withdrawn + 5 years | GDPR requirement |
| Security incidents | Indefinite | Legal liability |
| Error logs (no PII) | 90 days | Operational debugging |
Implementation:
- Store logs in immutable storage (write-once)
- Separate system from production (can’t modify)
- Encryption at rest
- Monthly integrity checks (verify no tampering; see the hash-chain sketch below)
- Archive old logs (cold storage, but retrievable)
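One way to implement the write-once and integrity-check items above is a hash chain over append-only log entries; a minimal sketch (record fields are illustrative):

```python
import hashlib
import json

def chain_hash(prev_hash, record):
    """Hash of this record combined with the previous entry's hash."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_record(log, record):
    """Append a record with its chained hash; editing any earlier entry breaks the chain."""
    prev_hash = log[-1]["hash"] if log else "genesis"
    log.append({"record": record, "hash": chain_hash(prev_hash, record)})

def verify_chain(log):
    """Recompute every hash; returns False if any entry was altered, reordered, or removed mid-chain."""
    prev_hash = "genesis"
    for entry in log:
        if entry["hash"] != chain_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True

# Usage
audit_log = []
append_record(audit_log, {"decision_id": "dec-001", "decision": "approved"})
append_record(audit_log, {"decision_id": "dec-002", "decision": "denied"})
assert verify_chain(audit_log)
```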
8. Model Cards & Documentation
8.1 What Should Be Documented
Model Card template:
# Model Card: [Model Name]
## Model Details
- **Model version**: recommendation-v3.2.1
- **Type**: Collaborative filtering + content-based hybrid
- **Date**: April 2024
- **Owners**: [Team name]
- **Documentation date**: April 18, 2024
- **Contact**: [email]
## Intended Use
- **Primary use**: Recommend products to users
- **Users**: E-commerce platform, end users (consumers)
- **Out-of-scope uses**:
- Do NOT use for: Employment decisions, credit scoring, healthcare
- Has not been tested for: Adversarial inputs, non-English languages
- Performance unknown for: Users under 18, users with disabilities
## Factors
- **Relevant factors**:
- User browsing history
- Item features (category, price, rating)
- Seasonal trends
- **Irrelevant factors** (should not influence):
- User demographics (gender, race, age)
- User location
- User reviews (other products)
## Metrics
### Overall Performance
- **Accuracy** (top-10 recommendations contain user-liked item): 68%
- **NDCG@10**: 0.72
- **Coverage** (recommendations span catalog): 82%
- **Diversity** (recommendations span categories): 0.65
### Performance by Group
| Demographic | Accuracy | NDCG | Notes |
|-------------|----------|------|-------|
| **Age 18-24** | 71% | 0.75 | Young users: higher engagement |
| **Age 25-34** | 68% | 0.72 | -- |
| **Age 35-49** | 65% | 0.68 | Fewer interactions, lower signal |
| **Age 50+** | 61% | 0.63 | ⚠️ Smaller cohort, less data |
| **Male** | 70% | 0.73 | -- |
| **Female** | 66% | 0.70 | ⚠️ Historically lower engagement |
### Fairness
- **Recommendation rate parity**: 98% (recommendations shown to all demographics at similar rates)
- **Coverage parity**: 85% (all groups see diverse recommendation categories)
- **Concern**: Older users (50+) have lower model performance due to lower training data volume
## Datasets
- **Training data**: 2023 user interactions (500M events)
- **Data source**: Production logs
- **Cutoff date**: January 2024
- **Train/validation/test split**: 80/10/10
### Data Characteristics
- **Size**: 500M interaction records
- **Time span**: Jan 2023 - Dec 2023
- **User coverage**: 95% of active users
- **Item coverage**: 87% of catalog
### Known Issues
- ⚠️ **Demographic imbalance**: Dataset is 60% male, 40% female (matches user base)
- ⚠️ **Sparse data**: Users 50+ have 3x fewer interactions, leading to weaker recommendations
- ⚠️ **Seasonal bias**: Training data primarily from Q1-Q3 (holiday season underrepresented)
- ⚠️ **Recency bias**: Recent users (joined Dec 2023) have minimal training signal
## Limitations
1. **Cold start**: Cannot recommend to new users (no history)
- Mitigation: Fall back to popularity-based recommendations
2. **Filter bubble**: May reinforce user preferences
- Mitigation: Inject novelty into recommendations
3. **Performance**: Accuracy drops 10% for users 50+ due to sparse data
- Mitigation: Collect more data, use transfer learning
4. **Bias**: May recommend products matching historical preferences, can miss new interests
- Mitigation: Regular audits, user feedback
## Ethical Considerations
- ✓ No protected attributes used directly in model
- ⚠️ Age-based performance differences observed; under investigation
- ✓ Recommendations provide user control (can dislike, hide)
- ✓ Transparent about AI involvement ("AI chose this for you")
## Caveats and Recommendations
- **Bias analysis**: Quarterly bias audits recommended
- **Monitoring**: NDCG and coverage metrics should be monitored weekly
- **Retraining**: Model should be retrained quarterly
- **Human oversight**: Recommendations occasionally reviewed by human (sample: 1% of recommendations)
## Model Card Version History
- **v1.0** (Jan 2024): Initial release
- **v2.0** (Mar 2024): Added diversity factor
- **v3.0** (Apr 2024): Improved cold start
- **v3.2.1** (Apr 2024): Bugfix in NDCG calculation
---
8.2 Limitations and Biases
Examples of limitations to document:
## Known Limitations
### Performance Limitations
- **Accuracy**: 68% (top-10 contains item user later purchased)
- **Coverage**: 82% of catalog recommended (18% of niche items rarely shown)
- **Latency**: 200-300ms per recommendation (real-time)
- **Scalability**: Tested on 5M daily active users; unknown performance at 10M+
### Data Limitations
- **Historical bias**: Training data from 2023; recommendations may be outdated
- **Representation**: 95% of active users in training; 5% new users not represented
- **Temporal**: Data collected Jan-Dec 2023; holiday shopping (Dec) undersampled (only 1/12th of training)
- **Sparse**: Users with <5 interactions (10% of user base) get poor recommendations
### Group-Specific Limitations
| Group | Limitation | Severity |
|-------|-----------|----------|
| **Age 50+** | 7% lower accuracy due to sparse training data | Medium |
| **Women** | 4% lower coverage (some female-specific categories underrecommended) | Low-Medium |
| **New users (< 1 week)** | 30% lower accuracy; recommend using popularity baseline | High |
| **Users with disabilities** | Not tested for accessibility of recommendation explanations | Medium |
### Domain Limitations
- **Not tested for**: Employment decisions, healthcare, safety-critical
- **Not suitable for**: Users with significant accessibility needs (blind users can't see images)
- **Not appropriate for**: Vulnerable populations (children < 13)
### What the Model Doesn't Do
- ❌ Does NOT explain WHY you might like item (just shows similar items)
- ❌ Does NOT consider user preferences expressed in other channels (customer service calls, surveys)
- ❌ Does NOT account for supply chain issues (may recommend out-of-stock items)
- ❌ Does NOT consider ethical concerns (does recommend products from companies with poor labor practices)
8.3 Training Data Characteristics
Document:
- Source and collection method
- Size and composition
- Potential biases or limitations
- Preprocessing applied
## Training Data
### Source
- Collection method: Production logs (user interactions on platform)
- Time period: January 1, 2023 - December 31, 2023
- User consent: All data collected under Terms of Service; users can opt-out of recommendations
### Data Composition
- **Total records**: 500 million interactions
- **Unique users**: 5.2 million
- **Unique items**: 250,000 products
- **Time span**: 12 months
### Demographics of Training Data
| Demographic | Proportion | Notes |
|-------------|-----------|-------|
| **Male** | 58% | Includes some data entry errors |
| **Female** | 39% | May be underrepresented in certain categories |
| **Other/Not disclosed** | 3% | -- |
| **Age 18-24** | 22% | Heavy users; 3x more interactions than 50+ |
| **Age 25-34** | 35% | Core demographic |
| **Age 35-49** | 28% | -- |
| **Age 50+** | 15% | ⚠️ Underrepresented in training |
### Preprocessing
- Removed: Interactions flagged as fraudulent (0.5%)
- Removed: Interactions from bot traffic (1.2%)
- Aggregated: Multiple interactions same user/item on same day
- Anonymized: User IDs hashed; names not included
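The preprocessing steps above are straightforward to automate. A minimal sketch, assuming a pandas DataFrame with hypothetical `is_fraud`, `is_bot`, `user_id`, `item_id`, and `timestamp` columns; the production pipeline and salt management will differ:

```python
# Illustrative preprocessing sketch; column names and salt handling are assumptions.
import hashlib
import os
import pandas as pd

SALT = os.environ.get("ID_HASH_SALT", "rotate-me")  # store the real salt in a secrets manager

def anonymize_user_id(user_id: str) -> str:
    """One-way hash so the training set carries no direct identifier."""
    return hashlib.sha256((SALT + user_id).encode("utf-8")).hexdigest()

def preprocess(interactions: pd.DataFrame) -> pd.DataFrame:
    """Drop flagged rows, hash user IDs, and collapse same-day duplicates."""
    clean = interactions[~interactions["is_fraud"] & ~interactions["is_bot"]].copy()
    clean["user_id"] = clean["user_id"].map(anonymize_user_id)
    clean["date"] = clean["timestamp"].dt.date
    return (clean.groupby(["user_id", "item_id", "date"])
                 .size()
                 .reset_index(name="interaction_count"))
```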
### Known Issues in Training Data
- ⚠️ Gender data: 45% of users didn't disclose; assumed to be male (default)
- ⚠️ Age data: 22% missing; estimated from other signals (purchase behavior, device type)
- ⚠️ Seasonal: Only one holiday season (Nov-Dec) appears in the 12-month window; holiday shopping patterns are underrepresented
- ⚠️ New product bias: New items (added in 2023) have less interaction data
8.4 Performance on Different Subgroups
Document fairness and accuracy across groups:
## Performance Analysis by Subgroup
### Performance by Age Group
Age 18-24:
- Accuracy (top-10): 71%
- NDCG@10: 0.75
- Avg interactions per user: 450
- Sample size: 1.1M users
- Notes: Young users highly engaged; recommendations reliable
Age 25-34:
- Accuracy: 68%
- NDCG@10: 0.72
- Avg interactions: 320
- Sample size: 1.8M users
- Notes: Core demographic; good performance
Age 50+:
- Accuracy: 61%
- NDCG@10: 0.63
- Avg interactions: 95
- Sample size: 0.8M users
- ⚠️ CONCERN: 7% lower accuracy due to sparse data
- Mitigation: Consider overweighting older users in future training
### Performance by Gender
Male users:
- Accuracy: 70%
- NDCG@10: 0.73
- Coverage: 85%
Female users:
- Accuracy: 66%
- NDCG@10: 0.70
- Coverage: 78%
- ⚠️ CONCERN: 4 points lower accuracy and 7 points lower coverage, concentrated in women-specific categories
- Root cause: Historical underrepresentation of female users in the training data (only 39% of users are female)
- Mitigation: Increase sampling of female users in next training cycle
### Intersectional Analysis
Older women (50+ female):
- Accuracy: 58%
- ⚠️ CONCERN: Combines age and gender effects; lowest overall performance
- Recommendation: Flag for manual review
- Mitigation: Dedicated cohort analysis in next audit
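The subgroup numbers above can be produced with a simple grouped aggregation over per-user evaluation records. A minimal sketch, assuming a frame with hypothetical `age_band`, `gender`, `hit_at_10`, and `ndcg_at_10` columns:

```python
# Per-subgroup evaluation sketch; column names and the 5-point flag are assumptions.
import pandas as pd

def subgroup_report(eval_df: pd.DataFrame, by: list) -> pd.DataFrame:
    """Aggregate top-10 hit rate and NDCG@10 per subgroup and flag laggards."""
    report = (eval_df
              .groupby(by)
              .agg(users=("hit_at_10", "size"),
                   accuracy=("hit_at_10", "mean"),
                   ndcg_at_10=("ndcg_at_10", "mean"))
              .reset_index())
    overall = eval_df["hit_at_10"].mean()
    # Flag groups trailing overall accuracy by more than 5 percentage points.
    report["flag"] = report["accuracy"] < overall - 0.05
    return report

# Single-attribute and intersectional views (e.g., age x gender):
# subgroup_report(eval_df, ["age_band"])
# subgroup_report(eval_df, ["age_band", "gender"])
```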
8.5 Intended Use and Misuse
Define appropriate use:
## Appropriate Use
✓ **Product recommendations** to platform users
✓ **Personalization** of user experience
✓ **Trend identification** (aggregated, anonymized)
## Inappropriate Use
❌ **Employment decisions**: Model not evaluated for fair hiring
❌ **Credit/lending decisions**: Model not validated for financial risk
❌ **Healthcare/diagnosis**: Model not tested for medical accuracy
❌ **Advertising to vulnerable populations**: Must not be used to target users under 18
❌ **Surveillance**: Should not be used to track user behavior beyond recommendations
## Context-Specific Considerations
- **Multi-language platforms**: Model trained on English only; performance unknown for non-English users
- **Accessibility**: Model recommendations assume visual access to product images
- **Low-bandwidth**: Model recommendations include images; requires good connectivity
9. Ethical Guidelines
9.1 No Weaponization
Principle: Agents should not facilitate violence, harm, or warfare
Guidelines:
- Do not help design weapons, explosives, or biological agents
- Do not help plan violence or terrorism
- Do not help with surveillance for harmful purposes (e.g., stalking, harassment)
- Do not provide targeting assistance for armed conflict
Exceptions:
- Legitimate self-defense information (general knowledge)
- Educational content about weapons (history, policy)
- Information about security and harm prevention
- Law enforcement / military use (with appropriate restrictions)
Implementation:
- Content filter: Detect requests for weapon design, terrorism planning
- User assessment: Identify high-risk use cases
- Escalation: Contact legal/safety team for ambiguous cases
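A minimal sketch of the filter-and-escalate routing described above, assuming an upstream classifier has already tagged the request with topics; the topic sets are placeholders, not a production safety system:

```python
# Placeholder topic sets; a real system would use a trained classifier and policy review.
BLOCKED_TOPICS = {"weapon design", "explosive synthesis", "terrorism planning"}
ESCALATE_TOPICS = {"security research", "law enforcement use", "military use"}

def route_request(detected_topics: set) -> str:
    """Return 'block', 'escalate', or 'allow' for topics tagged by an upstream classifier."""
    if detected_topics & BLOCKED_TOPICS:
        return "block"
    if detected_topics & ESCALATE_TOPICS:
        return "escalate"  # hand off to the legal/safety team for human review
    return "allow"
```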
9.2 No Deception
Principle: Agents should not impersonate humans or hide their nature
Guidelines:
- Never claim to be human if AI
- Never hide that system is AI-powered
- Disclose limitations and capabilities honestly
- Correct misunderstandings about what you are
Examples of deception (don’t do):
- ❌ Customer service agent that doesn’t disclose it’s AI
- ❌ “I understand your feelings” (AI doesn’t have feelings)
- ❌ Claiming to be specific human (“I’m John from customer service”)
- ❌ Using human photo in avatar
Good examples:
- ✓ “I’m Claude, an AI assistant. I can help with…”
- ✓ “I don’t have feelings, but I understand this is frustrating…”
- ✓ “I can’t make promises, but here’s what I’d recommend…”
9.3 No Manipulation
Principle: Agents should not coerce or manipulate users into actions
Guidelines:
- No dark patterns (deceptive design)
- No pressure tactics (urgency, scarcity without basis)
- No exploitation of vulnerabilities (fear, loneliness, addiction)
- Users should maintain control over their choices
Examples of manipulation (don’t do):
- ❌ “Last chance! Only 2 items left!” (when plenty in stock)
- ❌ Showing “most people bought this” to create social pressure
- ❌ Making unsubscribe 5 clicks while subscribe is 1 click
- ❌ Recommending addictive behavior (excessive purchases, gambling)
Ethical alternatives:
- ✓ “This item is popular right now. [More info]”
- ✓ “Would you like to unsubscribe? [Yes] [Maybe later] [No]”
- ✓ “You’ve ordered frequently this month; consider taking a break?”
9.4 Respect Autonomy
Principle: Users maintain control and can override AI decisions
Guidelines:
- Provide explanations so users understand decisions
- Allow users to reject or override recommendations
- Don’t force AI decisions on users
- Respect user preferences and values
Implementation:
- Always show “why” (explain recommendations)
- Always show “dislike” / “try another” (override)
- Always show settings (adjust behavior)
- Always show history (see what AI chose)
Example: Recommendation system with user control
🎯 Recommended for you: [Product A]
Why: Similar to 8 items you purchased
[Dislike] [Show Different] [More Like This]
⚙️ Recommendation Settings:
[ ] Show popular items
[ ] Show new items
[x] Show deals
[ ] Show recommendations from friends
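One way to make these controls first-class is to carry the explanation, override actions, and AI disclosure in every response the harness returns. A minimal sketch, assuming a JSON API between the harness and the UI; all field names are hypothetical:

```python
# Hypothetical response payload: every recommendation travels with its "why" and overrides.
from dataclasses import dataclass, field, asdict

@dataclass
class Recommendation:
    item_id: str
    reason: str  # always populated: the "why"
    actions: list = field(default_factory=lambda: ["dislike", "show_different", "more_like_this"])

@dataclass
class RecommendationResponse:
    recommendations: list
    ai_disclosure: str = "These recommendations were generated by an AI system."
    settings_url: str = "/settings/recommendations"   # user-adjustable behavior
    history_url: str = "/history/recommendations"     # what the AI chose previously

payload = asdict(RecommendationResponse(
    recommendations=[Recommendation(item_id="A123", reason="Similar to 8 items you purchased")]))
```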
9.5 Organizational Ethical Guidelines
Template (customize for your org):
# Ethical AI Guidelines for [Organization]
## Core Values
1. **Transparency**: We disclose when AI is involved
2. **Fairness**: We test for and mitigate bias
3. **User control**: Users understand and can control AI systems
4. **Safety**: We prioritize safety over convenience
5. **Privacy**: We protect personal information
## Prohibited Use Cases
- Do not use AI for: [surveillance of employees, discriminatory hiring, deceptive marketing, ...]
- Do not deploy AI without: [fairness audit, privacy review, user disclosure, ...]
- Do not train on: [non-consensual data, sensitive health information without permission, ...]
## Approval Process
All new AI systems must be reviewed by:
- [ ] Product team (is this solving a real problem?)
- [ ] Legal (is this compliant with regulations?)
- [ ] Ethics board (does this align with our values?)
- [ ] Privacy team (does this respect user privacy?)
- [ ] Security team (is this secure?)
## Regular Audits
- Fairness audits: quarterly
- Security audits: annually
- Privacy reviews: before major updates
- Ethics review: annually
## Escalation
If any team has concerns, escalate to: [Executive sponsor]
10. Implementation Framework
10.1 Privacy Impact Assessment (PIA)
When to do: Before deploying any system processing personal data
Process:
# Privacy Impact Assessment: [System Name]
## Executive Summary
[1-2 sentences about system and data]
## Data Inventory
- What personal data does system process?
- How is it collected?
- Where is it stored?
- Who has access?
- How long is it retained?
### Sensitive Data Table
| Data Type | Category | Sensitivity | Retention | Purpose |
|-----------|----------|-------------|-----------|---------|
| Email | Contact | Medium | Until account deleted | Account recovery |
| Payment card | Financial | High | Until transaction settled | Billing |
| Browse history | Behavioral | Medium | 90 days | Recommendations |
## Legal Basis
- [ ] User consent
- [ ] Contract with user
- [ ] Legal obligation
- [ ] Vital interests
- [ ] Public task
- [ ] Legitimate interests (with balancing test)
## Risks
### High Risks
1. **Data breach**: Could expose user data to attackers
- Likelihood: Low (enterprise security)
- Impact: Very High (credential theft, fraud)
- Mitigation: Encryption, access controls, monitoring
2. **Unauthorized access**: Employees could access data inappropriately
- Likelihood: Medium (human factor)
- Impact: High (privacy violation)
- Mitigation: Access controls, audit logs, training
### Medium Risks
3. **Data retention**: Keeping data too long
- Likelihood: Medium
- Impact: Medium
- Mitigation: Automated deletion after retention period (a minimal sketch follows this assessment)
### Low Risks
4. **Performance issues**: System downtime
- Likelihood: Low
- Impact: Low
- Mitigation: Redundancy, monitoring
## Safeguards
- [ ] Encryption at rest and in transit
- [ ] Access controls (only authorized users)
- [ ] Audit logging (who accessed what, when)
- [ ] Data minimization (collect only needed)
- [ ] User consent process
- [ ] Deletion process for user requests
- [ ] Regular security audits
- [ ] Data protection training for staff
## Conclusion
- ✓ Low risk with safeguards in place
- Proceed with deployment
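Storage-limitation safeguards such as the retention periods in the data inventory are easiest to defend when deletion is automated. A minimal sketch, assuming a SQL store with a `collected_at` timestamp per table; the table names and retention map are illustrative:

```python
# Illustrative retention enforcement; table names, columns, and periods are assumptions.
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = {
    "browse_history": 90,  # per the data inventory: behavioral data, 90 days
    # emails and payment data are deleted on account closure / settlement instead
}

def purge_expired(conn: sqlite3.Connection) -> dict:
    """Delete rows older than their retention period; return counts for the audit log."""
    deleted = {}
    for table, days in RETENTION_DAYS.items():
        cutoff = datetime.now(timezone.utc) - timedelta(days=days)
        cur = conn.execute(f"DELETE FROM {table} WHERE collected_at < ?", (cutoff.isoformat(),))
        deleted[table] = cur.rowcount
    conn.commit()
    return deleted
```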
10.2 Fairness Audit Process
Schedule: Before launch, then quarterly
Process:
# Fairness Audit: [Model/System Name]
## Scope
- System: [Description]
- Evaluation period: [Dates]
- Protected attributes: [gender, race, age, ...]
## Data
- Evaluation set size: [N] records
- Protected attribute coverage: [%]
- Baseline: [previous version or benchmark]
## Metrics
Choose appropriate fairness metrics (see section 4.2; a minimal automated check is sketched after this template):
- Demographic parity: Pass if <5% difference in outcome rates
- Equalized odds: Pass if <5% difference in true positive and false positive rates
- Calibration: Pass if predicted probabilities match actual rates
## Results
[Present results for each protected attribute]
## Findings
- ✓ Pass: Metrics within acceptable ranges
- ⚠️ Investigate: [Metric] shows [X]% disparity, recommend mitigation
- ❌ Fail: [Metric] violates threshold; halt deployment
## Mitigation Plan (if needed)
1. Root cause: [Why is disparity occurring?]
2. Solution: [Pre/in/post-processing mitigation]
3. Timeline: [When to implement]
4. Re-evaluation: [When to audit again]
## Sign-off
- Product owner: [Name/Date]
- ML engineer: [Name/Date]
- Ethics reviewer: [Name/Date]
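The pass/fail metrics in this template can be computed automatically each audit cycle. A minimal sketch using fairlearn (listed in Appendix B); the `y_true`, `y_pred`, and `sensitive` inputs are assumptions about how the evaluation set is stored:

```python
# Automated portion of the audit; adapt the inputs to your evaluation pipeline.
from fairlearn.metrics import demographic_parity_difference, equalized_odds_difference

THRESHOLD = 0.05  # pass if the disparity is below 5 percentage points (see Metrics above)

def run_fairness_checks(y_true, y_pred, sensitive) -> dict:
    """Compute the headline disparities and compare them to the audit threshold."""
    disparities = {
        "demographic_parity": demographic_parity_difference(
            y_true, y_pred, sensitive_features=sensitive),
        "equalized_odds": equalized_odds_difference(
            y_true, y_pred, sensitive_features=sensitive),
    }
    return {name: {"value": round(value, 4), "pass": value < THRESHOLD}
            for name, value in disparities.items()}

# Run once per protected attribute and record the output in the Results section, e.g.:
# run_fairness_checks(eval_df["purchased"], eval_df["recommended"], eval_df["gender"])
```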
10.3 Ethical Review Board (if applicable)
Purpose: Review high-stakes AI decisions
Structure:
- Members: Product, Legal, Ethics, Engineering, affected community rep
- Frequency: Monthly meetings for new systems; as-needed for urgent issues
- Authority: Can delay or block deployment
Review questions:
- Does this system align with org values?
- Could this system harm vulnerable groups?
- Is there disclosure of AI involvement?
- Have we tested for bias and fairness?
- Is deployment necessary, or is there a less risky alternative?
- Have we considered long-term societal impact?
10.4 Regular Re-evaluation
Schedule:
- Monthly: Fairness metrics (automated)
- Quarterly: Full fairness audit
- Annually: Privacy & security audit, ethics review
- Event-driven: Any user complaint, bias concern, major change
Monitoring dashboard:
Fairness Metrics (Last Updated: Today)
Demographic parity: ✓ Pass (4.2% diff)
Equalized odds: ✓ Pass (3.8% TPR diff)
Coverage: ✓ Pass (all groups represented)
Privacy Metrics (Last Updated: Today)
Data deletion requests: 142 (avg 2 days to delete)
PII detected in logs: 0
Unauthorized access attempts: 0
User Feedback (Last 30 days)
Fairness complaints: 1 (investigating)
Transparency concerns: 3 (addressed in FAQ)
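Behind a dashboard like this there is usually a scheduled job that compares fresh metrics against thresholds and alerts on breaches. A minimal sketch; the metric names, threshold values, and alerting hook are assumptions:

```python
# Scheduled compliance check; wire the alerting hook to your ticketing/paging system.
import logging

THRESHOLDS = {
    "demographic_parity_diff": 0.05,
    "equalized_odds_tpr_diff": 0.05,
    "pii_findings_in_logs": 0,
}

def check_and_alert(metrics: dict) -> list:
    """Return (and log) the names of any metrics that breach their thresholds."""
    breaches = [name for name, limit in THRESHOLDS.items()
                if metrics.get(name, float("inf")) > limit]
    for name in breaches:
        logging.warning("Compliance metric breached: %s=%s (limit %s)",
                        name, metrics.get(name), THRESHOLDS[name])
        # e.g., open a ticket or page the compliance owner here
    return breaches
```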
10.5 Incident Response for Violations
If bias is discovered:
- Assess severity: Is model still accurate for all groups?
- Immediate action: Flag affected users (if any), pause recommendations if severe
- Investigation: Root cause analysis (data, model, features)
- Mitigation: Retrain, adjust thresholds, or roll back
- Communication: Inform affected users, explain what happened and how it’s fixed
- Prevention: Update processes to prevent recurrence
If privacy violation occurs:
- Containment: Stop any further data exposure
- Assessment: How much data, which users affected?
- Notification: Notify the supervisory authority within 72 hours and affected users without undue delay (GDPR); notify affected individuals within 60 days (HIPAA)
- Investigation: Root cause (breach, unauthorized access, bug)
- Remediation: Offer credit monitoring, password reset, etc.
- Prevention: Close security gap, increase monitoring
If deception discovered:
- Immediate action: Stop deceptive practice
- Disclosure: Inform users about what happened
- Correction: Update system to be transparent
- Audit: Check for other deceptive patterns
- Legal review: Assess FTC/regulatory violation risk
11. Compliance Checklist
11.1 GDPR Compliance Items
Before launching:
- Identified lawful basis for data processing
- Completed Data Protection Impact Assessment (DPIA)
- Appointed Data Protection Officer (if required)
- Registered with Data Protection Authority (if required)
- Privacy policy explains data use clearly
- User consent mechanism (if needed) is specific, informed, granular, freely given
- Data minimization: only collecting necessary data
- Encryption in transit (HTTPS) and at rest
- Access controls documented
- Retention policy defined and automated
Ongoing:
- Process for responding to access requests (30 days)
- Process for responding to deletion requests (30 days)
- Process for responding to portability requests
- Breach notification plan
- Data Processing Agreements with third parties
- Annual privacy audit
- Staff training on GDPR
Monitoring:
- Audit logs of data access
- Regular scans for PII in logs
- Alert system for suspicious access patterns
- Backup deletion after retention period
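The "regular scans for PII in logs" item can start as simple regex heuristics. A rough sketch; the patterns are illustrative, so treat hits as candidates for human review rather than definitive findings:

```python
# Heuristic PII scan over log files; extend patterns per your data inventory.
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scan_log_line(line: str) -> list:
    """Return the PII categories that appear to be present in one log line."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(line)]

def scan_log_file(path: str) -> dict:
    counts = {name: 0 for name in PII_PATTERNS}
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            for hit in scan_log_line(line):
                counts[hit] += 1
    return counts
```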
11.2 HIPAA Compliance Items (if applicable)
Security Rule:
- Encryption of all PHI at rest (AES-256 or equivalent)
- Encryption of all PHI in transit (TLS 1.2+)
- Access controls (authentication, authorization)
- Audit logs (who accessed what data, when, why; see the sketch after this list)
- Integrity checks (data not modified)
- Backup and disaster recovery plan
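A minimal sketch of the access audit log, assuming PHI reads go through application code that can be decorated; the storage backend and field names are assumptions (production systems should write to an append-only store):

```python
# Illustrative access-audit decorator; callers must pass user_id and purpose explicitly.
import json
import logging
from datetime import datetime, timezone
from functools import wraps

audit_logger = logging.getLogger("phi_access_audit")

def audited(resource: str):
    """Record who accessed which PHI resource, when, and for what stated purpose."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, user_id: str, purpose: str, **kwargs):
            audit_logger.info(json.dumps({
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "user": user_id,
                "resource": resource,
                "purpose": purpose,
                "action": fn.__name__,
            }))
            return fn(*args, user_id=user_id, purpose=purpose, **kwargs)
        return wrapper
    return decorator

@audited("patient_record")
def read_patient_record(record_id: str, *, user_id: str, purpose: str) -> dict:
    ...  # fetch from the datastore
    return {}
```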
Privacy Rule:
- Minimum necessary principle (collect only needed data)
- Patient authorization before uses beyond treatment, payment, and operations (plus a signed Business Associate Agreement if a third party handles PHI)
- Patient rights (access, amendment, accounting of disclosures)
- Breach notification plan (notify affected individuals within 60 days)
Business Associate Agreement (if using third parties):
- Signed before any PHI is shared
- Specifies what data, how it will be used, safeguards
- Requires subcontractors to also sign BAA
- Includes breach notification provisions
Documentation:
- Privacy policy explaining HIPAA safeguards
- Security audit plan
- Incident response plan
- Training records for all staff handling PHI
11.3 FTC Guidance Alignment
Transparency & Disclosure:
- Clearly disclose when AI is involved in consequential decisions
- Explain material limitations (what the AI can’t do well)
- Don’t claim capabilities without evidence
- Don’t hide material information in fine print
Prohibited Practices:
- Don’t impersonate humans
- Don’t make false claims about accuracy or fairness
- Don’t discriminate based on protected attributes
- Don’t target vulnerable populations without safeguards
Evidence & Testing:
- Claims about accuracy must be substantiated with testing
- Claims about fairness must be supported by fairness audits
- Performance claims should include limitations and caveats
- Regular re-evaluation to maintain claims
11.4 Industry-Specific Requirements
Financial Services (SEC, FINRA):
- Model risk management framework documented
- Model validation report (independent reviewer)
- Performance monitoring (monthly)
- Explainability: can explain decisions to regulators
- Fairness: no discriminatory outcomes
- Disclosure: customers informed of AI involvement
Employment (EEOC):
- Fairness audit: no disparate impact by protected attributes
- Performance audit: works equally well for all demographics
- Validation: proven to predict job performance
- Disclosure: candidates informed of AI screening (if used)
- Appeals: process for candidates to dispute decision
Healthcare (FDA, State Boards):
- Validation study (if clinical use)
- Safety analysis (what harms could occur?)
- Clinical trial results (if significant decisions)
- Adverse event reporting (track problems in production)
- Clear labeling of limitations
Education (FERPA):
- Confidentiality of student records maintained
- Parental consent (if students under 18)
- Data retention limits
- No unauthorized disclosure
11.5 Documentation Requirements
Maintain records of:
- Fairness audits (quarterly results)
- Privacy impact assessments
- Consent records (when given, what consent, withdrawal)
- Data deletion requests (proof of deletion)
- Breach incidents (what, when, who, remediation)
- Model changes (version history, performance impact)
- Bias findings and mitigation steps
- User complaints and resolution
- Training records (staff trained on AI ethics, GDPR, etc.)
- Third-party assessments (security audit, SOC 2, etc.)
Retention period:
- Keep documentation for minimum of 3-7 years
- Indefinite for major incidents or legal disputes
- Follow industry standards (financial: 7 years, healthcare: varies by state)
12. Quick Reference: Is Your Harness Regulated?
Answer these questions:
1. Does your harness make consequential decisions about people? (recommendations, approvals, rankings, categorization)
   - Yes → Must disclose AI involvement
   - No → Continue to Q2
2. Does it process personal or sensitive data? (names, emails, health info, financial data, location, browsing history)
   - Yes → Must comply with GDPR/CCPA/HIPAA (as applicable)
   - No → Continue to Q3
3. Will it be used in regulated industries? (healthcare, finance, employment, insurance, education)
   - Yes → Must comply with industry regulations
   - No → Continue to Q4
4. Could it affect access to services? (job offers, credit, housing, education, healthcare)
   - Yes → Must audit for fairness and discrimination
   - No → If every answer was no, no additional compliance work is required
If ANY answer is yes: Your harness requires compliance work. Start with:
- Privacy Impact Assessment (Sec 10.1)
- Fairness Audit Process (Sec 10.2)
- Relevant compliance checklist (Sec 11)
Appendices
A. Regulatory Timeline (April 2026)
| Date | Regulation | Impact |
|---|---|---|
| May 2018 | GDPR enforcement begins | EU data protection active (ongoing) |
| Jan 2020 | CCPA enforcement begins | California privacy law |
| Feb 2024 | FTC AI Transparency Guidelines | Disclosure required for AI decisions |
| Aug 2024 | EU AI Act enters into force | Phased restrictions on high-risk AI |
| Ongoing | HIPAA | Healthcare data protection |
B. Resources
- FTC Guidance: ftc.gov/ai
- GDPR: gdpr-info.eu
- HIPAA: hhs.gov/hipaa
- NIST AI Risk Management: nist.gov/aigovernance
- Partnership on AI: partnershiponai.org
- Model Cards: research.google/pubs/ModelCards
- Fairness Tools: fairlearn.org, AI Fairness 360 (github.com/Trusted-AI/AIF360)
C. Glossary
- **Bias**: Systematically treating groups differently based on protected attributes
- **Differential Privacy**: Mathematical guarantee that removing an individual’s data doesn’t significantly change model output
- **Equalized Odds**: Fairness metric where true positive and false positive rates are equal across groups
- **Explainability**: Ability to understand and explain model decisions
- **Fairness**: Treating different groups equitably; no discrimination based on protected attributes
- **PII**: Personally Identifiable Information; data that identifies or can identify a person
- **SHAP**: SHapley Additive exPlanations; method for explaining model predictions
- **Transparency**: Openly disclosing how AI systems work and when they’re being used
Document version: 1.0
Last updated: April 2026
Next review: April 2027
Validation Checklist
How do you know you got this right?
Performance Checks
- Privacy Impact Assessment completed in <2 weeks for MVP
- Fairness audit runs in <1 day (automated baseline + human review)
- Data deletion tested: user data removal takes <7 days
- Compliance documentation generated automatically before deployment
Implementation Checks
- Privacy policy written and reviewed by legal
- GDPR/CCPA/HIPAA applicability determined (questionnaire in Section 12)
- Data retention policy defined: what data kept, how long, deletion process
- Fairness audit completed: demographic parity checked on 3+ protected groups
- Model card created: training data, performance, limitations documented
- Explainability mechanism in place: can show why harness made decision
- Data minimization applied: only collect essential data for stated purpose
Integration Checks
- Harness discloses AI involvement to users (transparency)
- Data deletion integrates with persistence layer: database + files cleaned
- Audit trail working: logging who accessed what data and when
- Consent collection: user opts-in before processing personal data
- Error monitoring includes bias detection: alerts fire if accuracy differs materially by group
Common Failure Modes
- Scope creep on data collection: Started minimal, expanded without re-assessment
- Fairness audit shows high disparate impact: Model treats protected groups differently
- No audit trail: Can’t prove data deletion happened or track access
- Disclosure missing: Users unaware AI system making decisions about them
- Compliance checkbox mentality: Audit completed but findings not acted upon
Sign-Off Criteria
- Legal review completed: acceptable risk profile for your jurisdiction
- Fairness audit passed: no group has >20% performance disparity
- Privacy controls tested: data deletion and access controls verified
- Disclosure statement written and displayed to users
- Compliance maintenance plan: how often re-audit? who responsible?
See Also
- Doc 10 (Security & Safety): Data protection and access controls for sensitive data
- Doc 11 (Testing & QA): Fairness testing integrated into automated test suite
- Doc 13 (Cost Management): Regulatory compliance costs (audits, legal, training)