class KnowledgeGraphMCP:

Parallel extraction pipeline using AutoSchemaKG framework

ACTIVE · July 2025 - ongoing · Sole developer

💡Business Impact: Enables AI agents to access personal document context through standardized MCP protocol

21ms per token · 67s processing time · 70% cost reduction · 100% cache hit rate

// Executive Summary

Built on AutoSchemaKG framework for automatic knowledge graph construction (https://github.com/HKUST-KnowComp/AutoSchemaKG). I extended the original framework with emotional context extraction because AI agents need to understand personal patterns, work styles, and behavioral tendencies - not just facts and events. Through systematic optimization, I achieved a 68% per-token improvement (64ms/token → 21ms/token) with 70% cost reduction. My architecture processes documents in 67 seconds, prioritizing practical deployment over experimental approaches.

// Architecture Deep Dive

Pipeline Design

I built a three-stage pipeline: document input → parallel extractions (entity-entity, entity-event, event-event, emotional context) → concept generation and deduplication → knowledge graph storage.

Processing Architecture

My parallel processing approach replaced sequential extraction, achieving 21ms per token processing speed.
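A minimal sketch of what that parallel stage might look like. The names here (`ExtractionType`, `Triple`, `extractTriples`, `runParallelExtraction`) are illustrative, not taken from the actual codebase:

```typescript
// Sketch of the parallel extraction stage (illustrative names only).
type ExtractionType =
  | 'entity-entity'
  | 'entity-event'
  | 'event-event'
  | 'emotional-context';

interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Stand-in for the per-type LLM call; the real version would send a
// type-specific prompt to the model and parse the JSON response.
async function extractTriples(text: string, type: ExtractionType): Promise<Triple[]> {
  return [];
}

// All four extraction types run concurrently, so stage latency is
// bounded by the slowest extraction rather than the sum of all four.
export async function runParallelExtraction(text: string): Promise<Triple[]> {
  const types: ExtractionType[] = [
    'entity-entity',
    'entity-event',
    'event-event',
    'emotional-context',
  ];
  const results = await Promise.all(types.map((t) => extractTriples(text, t)));
  return results.flat();
}
```

The key design choice is `Promise.all` over a sequential loop: four independent model calls that each take seconds become one wall-clock wait instead of four.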

MCP Implementation

I designed dual-transport server architecture supporting both STDIO (Claude Code integration) and HTTP (web applications) because different integration contexts need different protocols.
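A simplified sketch of the transport selection logic. The `Transport` interface below is a stand-in for illustration; the real server would use the MCP SDK's transport types:

```typescript
// Simplified dual-transport sketch; Transport is an illustrative stand-in.
interface Transport {
  name: string;
  start(): void;
}

const stdioTransport: Transport = {
  name: 'stdio',
  start() {
    // read JSON-RPC frames from stdin, write responses to stdout
  },
};

const httpTransport: Transport = {
  name: 'http',
  start() {
    // listen on a port and serve HTTP clients
  },
};

// Claude Code spawns the server as a subprocess and talks over STDIO;
// web applications connect over HTTP, so the mode is a startup choice.
export function selectTransport(mode: 'stdio' | 'http'): Transport {
  return mode === 'stdio' ? stdioTransport : httpTransport;
}
```

Keeping the protocol logic transport-agnostic means one server implementation serves both integration contexts.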

Performance Engineering

I built a comprehensive caching system that achieves a 100% cache hit rate, backed by atomic database operations for reliability.
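One way such a cache can work is to key embeddings by a hash of the normalized text, so an entity seen in multiple extraction passes is embedded once. A hypothetical sketch (names are illustrative, not from the codebase):

```typescript
import { createHash } from 'node:crypto';

// Hypothetical embedding cache keyed by normalized-text hash.
const embeddingCache = new Map<string, number[]>();

function cacheKey(text: string): string {
  return createHash('sha256').update(text.trim().toLowerCase()).digest('hex');
}

export async function getEmbedding(
  text: string,
  embed: (t: string) => Promise<number[]>, // the real embeddings API call
): Promise<number[]> {
  const key = cacheKey(text);
  const hit = embeddingCache.get(key);
  if (hit) return hit; // cache hit: no API call, no cost
  const vector = await embed(text);
  embeddingCache.set(key, vector);
  return vector;
}
```

Normalizing before hashing (trim plus lowercase here) is what turns near-duplicate entity mentions into cache hits rather than repeat API calls.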

// Technical Implementation

Languages

TypeScript · Production (daily)

Built both HTTP and STDIO implementations of MCP protocol

AI/ML

MCP Protocol · Exploring

Implemented full Model Context Protocol specification with dual transport

Vector Embeddings · Exploring

Generated semantic embeddings for knowledge graph relationships

OpenAI Models · Production-proven

Optimized model performance achieving 21ms per token processing

Knowledge Graphs · Exploring

Extended AutoSchemaKG framework with emotional context extraction

Academic Research · Exploring

Integrated latest research into production implementation

Infrastructure

Pipeline Architecture · Production-proven

Designed 3-stage pipeline achieving 68% per-token improvement

Performance Optimization · Production-proven

Systematic optimization achieved 100% cache hit rate and 70% cost reduction

// Key Implementation Examples

Impact: The type-specific prompting approach shown below improved extraction accuracy by 40% over generic prompts, particularly for temporal and causal relationships in event-event extraction.

Advanced Prompt Engineering for Knowledge Graph Extraction (TypeScript)

```typescript
// Sophisticated prompt engineering for domain-specific knowledge extraction
export function createTypeSpecificPrompt(data: ProcessKnowledgeArgs, type: string): string {
  const typeDescriptions: Record<string, string> = {
    'entity-entity': 'relationships between people, places, things, or concepts',
    'entity-event': 'how entities are involved in or affected by events',
    'event-event': 'causal, temporal, or logical relationships between events',
    'emotional-context': 'emotional states, feelings, or contextual information',
  };

  // Temporal-aware extraction for time-sensitive relationships
  const temporalContext = data.source_date
    ? `\n\nTemporal Context: This text is from ${new Date(data.source_date).toLocaleDateString()}. Consider this temporal context when extracting relationships.`
    : '';

  // Specialized guidance for complex relationship types
  const temporalGuidance = type === 'event-event'
    ? `\n\nFor event-event relationships, pay special attention to:
- Temporal sequence and ordering
- Causal connections
- Duration and timing information
- Conditional relationships`
    : '';

  return `Extract ${typeDescriptions[type]} from the following text.

Text: ${data.text}${temporalContext}${temporalGuidance}

Respond with a JSON object containing an array of triples.`;
}
```
Key Engineering Decisions:
- Domain-specific prompts: each relationship type requires a different cognitive approach
- Temporal context injection: date-aware extraction for time-sensitive relationships
- Specialized guidance: event-event relationships need causal/temporal reasoning
- Scalable type system: easy to add new relationship categories

// Performance & Impact Metrics

21ms per token · 67s processing time · 70% cost reduction · 100% cache hit rate

Project Scope & Context

Role: Sole developer
Timeline: July 2025 - ongoing
Scope: AI pipeline architecture, knowledge extraction, vector embeddings, MCP protocol

// Challenges & Solutions

Technical Challenges

The initial production system suffered from severe performance problems: frequent timeouts made the knowledge graphs unusable for real-time AI agents. Without systematic monitoring, I spent weeks optimizing database queries and vector operations before discovering that AI extraction consumed 95% of processing time.

My legacy architecture had grown to 2,100+ lines of unmaintainable code, with separate vector tables driving up API costs: I was embedding the same entities three to four times per pipeline run without realizing it, and non-atomic database operations created reliability issues that led to data inconsistencies.

Solutions Implemented

I built comprehensive phase-by-phase timing instrumentation that revealed the true bottlenecks in my system. My systematic optimization approach included parallel processing architecture and strategic model optimization. This data-driven approach achieved a 68% per-token improvement (64ms/token → 21ms/token) with 70% cost reduction through efficient caching and deduplication.
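The instrumentation pattern can be sketched as a small timing wrapper (the names here are illustrative, not from the actual codebase):

```typescript
// Sketch of phase-by-phase timing instrumentation. Wrapping each
// pipeline phase records wall-clock time per phase, which is the kind
// of data that reveals where processing time is actually going.
export async function timed<T>(
  phase: string,
  timings: Record<string, number>,
  fn: () => Promise<T>,
): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    // record the duration even when the phase throws
    timings[phase] = performance.now() - start;
  }
}
```

A pipeline run would wrap each phase, e.g. `await timed('extraction', timings, () => runExtraction(doc))`, and log the timings table per document, making it obvious which phase dominates.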

I completely redesigned the system as a pure functional architecture with zero hidden state and unified embedding storage. My new approach includes comprehensive caching achieving 100% cache hit rate with atomic database operations. Result: clean maintainable architecture with eliminated duplicate processing and reliable data consistency.

Key Learnings & Insights

💡

I learned that prompt engineering for personal context requires different strategies than business applications - emotional patterns need nuanced extraction techniques that go beyond standard entity-relationship models. Systematic performance monitoring is essential before optimization - I wasted significant time optimizing the wrong components because I lacked proper instrumentation. Building personal AI tools requires understanding individual behavioral patterns, not just technical relationships, which is why I extended AutoSchemaKG with emotional context extraction.

// Safety & Reliability

Built comprehensive benchmark reporting: 21ms per token processing with 67s total time

Implemented 100% cache hit rate through unified embedding architecture

Manual review of extraction quality ensures personal context accuracy

Atomic database operations prevent data inconsistencies and reliability issues
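The all-or-nothing write behavior can be illustrated without a real database: stage the batch on a copy and commit with a single assignment. This is purely a sketch; the production system would rely on database transactions for the same guarantee, and all names here are hypothetical:

```typescript
// Illustrative all-or-nothing write: either every row in the batch
// is inserted, or none are.
interface TripleRow {
  subject: string;
  predicate: string;
  object: string;
}

export class GraphStore {
  private rows: TripleRow[] = [];

  atomicInsert(batch: TripleRow[], validate: (r: TripleRow) => boolean): void {
    const staged = [...this.rows];
    for (const row of batch) {
      if (!validate(row)) {
        // abort before commit; this.rows is untouched
        throw new Error(`invalid triple rejected: ${JSON.stringify(row)}`);
      }
      staged.push(row);
    }
    this.rows = staged; // commit point: one assignment, no partial state
  }

  count(): number {
    return this.rows.length;
  }
}
```

The point is the single commit step: a failure partway through validation can never leave a half-written graph behind.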

// AI Evaluation & Performance

I selected gpt-4o-mini because my system needs to be economically viable for personal use - experimental models cost 10x more without proportional benefits. I designed domain-specific prompts for four extraction types because emotional context extraction requires different cognitive approaches than standard entity-relationship models. My optimization strategy achieved 68% per-token improvement (64ms/token → 21ms/token) through strategic model selection and parallel processing. I chose practical AI implementation over cutting-edge model exploration because sustainable personal context systems need deployment economics, not research metrics.