Provenance First
Provenance is the cornerstone of ContextGraph OS. Every piece of data in the system has a traceable origin, ensuring complete auditability and trust.
What is Provenance?
Provenance answers the question: "Where did this data come from?"
In ContextGraph, provenance tracking means:
- Every claim has a documented source
- Every change is recorded in an immutable ledger
- The entire history can be verified cryptographically
- No orphan data exists in the system
The Provenance Ledger
The provenance ledger is an append-only, hash-chained log of all data mutations:
interface ProvenanceEntry {
  id: ProvenanceId;
  hash: string;           // SHA-256 hash of this entry
  previousHash: string;   // Link to previous entry
  type: ProvenanceType;   // claim_created, entity_created, etc.
  subjectId: string;      // What this is about
  data: unknown;          // Entry-specific data
  source: ProvenanceSource;
  timestamp: Timestamp;
  agentId?: AgentId;      // Who created this
}
type ProvenanceType =
  | 'claim_created'
  | 'claim_revoked'
  | 'entity_created'
  | 'entity_updated'
  | 'decision_recorded'
  | 'policy_created'
  | 'execution_logged';
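To make the append-only, hash-chained structure concrete, here is a minimal sketch of how an entry could be linked to its predecessor. It is an illustration under stated assumptions, not ContextGraph's actual implementation: `SketchEntry`, `GENESIS_HASH`, `computeHash`, and `appendEntry` are hypothetical names, and the choice of which fields feed the SHA-256 digest (and how they are serialized) is assumed rather than documented.

```typescript
import { createHash } from "node:crypto";

// Simplified stand-in for ProvenanceEntry. Field names follow the interface
// above; everything else in this sketch is an assumption.
interface SketchEntry {
  hash: string;
  previousHash: string;
  type: string;
  subjectId: string;
  data: unknown;
  timestamp: string;
}

// Hypothetical marker used as previousHash for the very first entry.
const GENESIS_HASH = "0".repeat(64);

// Hash every field except `hash` itself, in a fixed order so the digest is
// reproducible. Note: JSON.stringify keeps object keys in insertion order, so
// a real implementation would want a canonical serialization for `data`.
function computeHash(entry: Omit<SketchEntry, "hash">): string {
  const payload = JSON.stringify([
    entry.previousHash,
    entry.type,
    entry.subjectId,
    entry.data,
    entry.timestamp,
  ]);
  return createHash("sha256").update(payload).digest("hex");
}

// Append-only: returns a new chain with one more entry and never modifies
// the entries that are already there.
function appendEntry(
  chain: readonly SketchEntry[],
  fields: Omit<SketchEntry, "hash" | "previousHash">,
): SketchEntry[] {
  const previousHash =
    chain.length > 0 ? chain[chain.length - 1].hash : GENESIS_HASH;
  const unsigned = { ...fields, previousHash };
  return [...chain, { ...unsigned, hash: computeHash(unsigned) }];
}
```

Because each entry's digest covers `previousHash`, rewriting an older entry would change its hash and leave the following entry pointing at a value that no longer exists.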
Hash Chain Verification
Each entry contains a hash of its contents and a reference to the previous entry's hash, creating a tamper-evident chain: altering any entry changes its hash and breaks the link stored in the entry that follows it.
  Entry 1            Entry 2            Entry 3
┌──────────┐      ┌──────────┐      ┌──────────┐
│ hash: A  │◄─────│ prev: A  │◄─────│ prev: B  │
│ prev: ∅  │      │ hash: B  │      │ hash: C  │
│ data...  │      │ data...  │      │ data...  │
└──────────┘      └──────────┘      └──────────┘
To verify integrity:
const verification = await client.verifyProvenance();

if (verification.value.valid) {
  console.log('Chain integrity verified');
  console.log(`Entries: ${verification.value.entriesVerified}`);
} else {
  console.log('Chain corrupted!');
  console.log(`Broken links: ${verification.value.brokenLinks}`);
  console.log(`Invalid hashes: ${verification.value.invalidHashes}`);
}
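For intuition about what such a verification pass might do, the following sketch walks a chain and counts the two failure modes reported above. It is not the SDK's implementation: `ChainEntry`, `GENESIS_HASH`, `hashEntry`, and `verifyChain` are hypothetical, the hashing scheme is assumed, and treating `brokenLinks` and `invalidHashes` as counts is also an assumption (the SDK may return lists instead).

```typescript
import { createHash } from "node:crypto";

// Reduced entry shape: only the fields needed for chain verification.
interface ChainEntry {
  hash: string;
  previousHash: string;
  data: unknown;
}

// Result shape mirrors the fields used in the example above (assumed to be counts).
interface VerificationResult {
  valid: boolean;
  entriesVerified: number;
  brokenLinks: number;
  invalidHashes: number;
}

// Hypothetical previousHash value for the first entry in the chain.
const GENESIS_HASH = "";

// Assumed hashing scheme: SHA-256 over the previous hash plus the entry data.
function hashEntry(previousHash: string, data: unknown): string {
  return createHash("sha256")
    .update(JSON.stringify([previousHash, data]))
    .digest("hex");
}

function verifyChain(entries: readonly ChainEntry[]): VerificationResult {
  let brokenLinks = 0;
  let invalidHashes = 0;
  let expectedPrevious = GENESIS_HASH;

  for (const entry of entries) {
    // Broken link: this entry does not point at its predecessor's stored hash.
    if (entry.previousHash !== expectedPrevious) brokenLinks++;
    // Invalid hash: the stored hash no longer matches the entry's contents.
    if (hashEntry(entry.previousHash, entry.data) !== entry.hash) invalidHashes++;
    expectedPrevious = entry.hash;
  }

  return {
    valid: brokenLinks === 0 && invalidHashes === 0,
    entriesVerified: entries.length,
    brokenLinks,
    invalidHashes,
  };
}
```

In this sketch, editing an entry's data without touching its stored hash shows up as an invalid hash, while recomputing the hash to hide the edit shows up as a broken link on the next entry.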
Provenance Sources
Every entry has a source describing its origin:
interface ProvenanceSource {
  type: 'agent' | 'user' | 'system' | 'external' | 'inference';
  id: string;
  method?: string;
  metadata?: Record<string, unknown>;
}
// Example sources (annotated so that `type` is checked against the
// ProvenanceSource union instead of widening to string)
const agentSource: ProvenanceSource = {
  type: 'agent',
  id: 'agt_processor',
  method: 'data_extraction'
};

const externalSource: ProvenanceSource = {
  type: 'external',
  id: 'weather_api',
  method: 'GET /current',
  metadata: { apiVersion: '2.0' }
};

const inferenceSource: ProvenanceSource = {
  type: 'inference',
  id: 'reasoning_engine',
  method: 'transitive_closure',
  metadata: { confidence: 0.95 }
};
Automatic Provenance Tracking
When using the SDK, provenance is tracked automatically:
// When you add a claim...
await client.addClaim({
  subjectId: entityId,
  predicate: 'status',
  value: 'active',
});

// A provenance entry is automatically created:
// {
//   type: 'claim_created',
//   subjectId: entityId,
//   data: { predicate: 'status', value: 'active' },
//   source: { type: 'system', id: 'sdk' },
//   hash: '...',
//   previousHash: '...'
// }
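One way to picture automatic tracking is a client whose write methods record a provenance entry as a side effect, so callers cannot store a claim without also logging it. The sketch below is a toy in-memory model, not the SDK's internals: `SketchClient`, `ClaimInput`, and `RecordedEntry` are hypothetical names, and hashing is left out to keep the focus on the coupling between the write and the ledger.

```typescript
// Shape of the entry recorded for each claim, following the commented
// example above (hash fields omitted in this sketch).
interface RecordedEntry {
  type: "claim_created";
  subjectId: string;
  data: { predicate: string; value: unknown };
  source: { type: "system"; id: string };
}

interface ClaimInput {
  subjectId: string;
  predicate: string;
  value: unknown;
}

class SketchClient {
  // Append-only by convention: entries are pushed, never edited or removed.
  readonly ledger: RecordedEntry[] = [];
  readonly claims: ClaimInput[] = [];

  // Storing the claim and recording its provenance happen in the same call,
  // so no claim can exist without a matching ledger entry.
  addClaim(claim: ClaimInput): void {
    this.claims.push(claim);
    this.ledger.push({
      type: "claim_created",
      subjectId: claim.subjectId,
      data: { predicate: claim.predicate, value: claim.value },
      source: { type: "system", id: "sdk" },
    });
  }
}
```

Keeping the ledger write inside `addClaim` is what rules out orphan data in this model: there is no code path that adds a claim and skips the entry.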
Querying Provenance
Get Entry by ID
const entry = await ledger.get(provenanceId);
Query by Subject
const entries = await ledger.query({
  subjectId: entityId,
  limit: 100
});
Query by Type
const claimEntries = await ledger.query({
  type: 'claim_created',
  from: startTimestamp,
  to: endTimestamp
});
Query by Agent
const agentActions = await ledger.query({
  agentId: agentId,
  limit: 50
});
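The query options shown above can plausibly be combined in a single call. As a rough model of how such filters might compose, the sketch below applies them to an in-memory array with AND semantics. This is an assumption for illustration only: `LedgerRecord`, `LedgerQuery`, and `queryEntries` are hypothetical, timestamps are assumed to be epoch milliseconds, the `from`/`to` bounds are assumed inclusive, and the real `ledger.query` may behave differently.

```typescript
// Reduced entry shape with only the queryable fields.
interface LedgerRecord {
  id: string;
  type: string;
  subjectId: string;
  agentId?: string;
  timestamp: number; // assumed: epoch milliseconds
}

// Every option is optional; an omitted option does not filter anything.
interface LedgerQuery {
  subjectId?: string;
  type?: string;
  agentId?: string;
  from?: number;
  to?: number;
  limit?: number;
}

function queryEntries(
  entries: readonly LedgerRecord[],
  q: LedgerQuery,
): LedgerRecord[] {
  // An entry matches only if it satisfies every option that was supplied.
  const matches = entries.filter(
    (e) =>
      (q.subjectId === undefined || e.subjectId === q.subjectId) &&
      (q.type === undefined || e.type === q.type) &&
      (q.agentId === undefined || e.agentId === q.agentId) &&
      (q.from === undefined || e.timestamp >= q.from) &&
      (q.to === undefined || e.timestamp <= q.to),
  );
  // The limit is applied last, after all filters.
  return q.limit === undefined ? matches : matches.slice(0, q.limit);
}
```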
Provenance for Compliance
Provenance records support compliance requirements:
Audit Trails
// Get complete audit trail for an entity
const audit = await client.getAuditTrail({
  entityId: personId,
  format: 'detailed'
});

// Export for compliance review
const report = await compliance.generateAuditReport({
  from: '2024-01-01',
  to: '2024-12-31',
  format: 'pdf'
});
GDPR Support
// Find all data related to a person (data subject)
const subjectData = await compliance.getDataSubjectData(personId);
// Export for data portability
const exportData = await compliance.exportDataSubjectData(personId);
// Right to erasure (with provenance of deletion)
await compliance.deleteDataSubjectData(personId, {
  reason: 'GDPR Article 17 request',
  requestId: 'gdpr_req_123'
});
Best Practices
1. Always Provide Source Context
// Good - includes context
await ledger.record({
  type: 'claim_created',
  source: {
    type: 'external',
    id: 'crm_system',
    method: 'sync',
    metadata: { syncId: 'sync_456' }
  },
  // ...
});

// Avoid - missing context
await ledger.record({
  type: 'claim_created',
  source: { type: 'system', id: 'unknown' },
  // ...
});
2. Use Meaningful Agent IDs
// Good - descriptive agent
const invoiceAgent = await client.createAgent({
  name: 'invoice-processor',
  description: 'Processes incoming invoices from suppliers'
});

// Avoid - generic agent
const genericAgent = await client.createAgent({
  name: 'agent1',
  description: 'Does stuff'
});
3. Verify Regularly
// In your health checks
async function healthCheck() {
  const verification = await client.verifyProvenance();
  if (!verification.value.valid) {
    // alertOps stands in for your own alerting hook
    alertOps('Provenance chain integrity failure');
  }
}
Next Steps
- Temporal Data - Time-aware data management
- Decisions as Data - Decision tracking
- Provenance Package - API reference