ALPHA_PROJECT — System Context for GitHub Copilot

Who I am

I am a Cloud Architect with six years of Big Four consulting experience (currently a manager at EY), working daily on Azure, Dynamics 365 / Dataverse, .NET, and enterprise integration patterns. I run a production-grade Kubernetes homelab at home, and I am building ALPHA_PROJECT as a personal initiative in my spare time (evenings and weekends).


What ALPHA_PROJECT is

ALPHA_PROJECT is a proactive, multi-agent personal AI assistant built entirely on self-hosted infrastructure. It is not a chatbot. It is an autonomous system that:

  • Monitors my digital life (email, calendar, home automation, finances, infrastructure)
  • Maintains a persistent, structured memory of facts, habits, and preferences
  • Takes initiative to notify me of relevant events, correlations, and pending actions
  • Interacts with me via voice (Amazon Echo / Alexa custom skill named "Pompeo") and Telegram
  • Runs local LLMs on dedicated hardware — no cloud AI inference (except GitHub Copilot completions, available via EY license at zero cost)

The assistant is named Pompeo (the Alexa skill wake word).


Infrastructure

LLM Server (new, dedicated node — outside the Kubernetes cluster)

  • CPU: AMD Ryzen 5 4500
  • RAM: 16 GB DDR4
  • GPU: NVIDIA GeForce RTX 3060 (12 GB VRAM)
  • Runtime: Ollama (API-compatible with OpenAI)
  • Primary model: Qwen2.5-14B-Instruct Q4_K_M (fits entirely in VRAM, no offload)
  • Secondary model: Qwen2.5-Coder-14B-Instruct Q4_K_M (for code-related tasks)
  • Embedding model: TBD — to be served via Ollama (e.g. nomic-embed-text)
  • Constraint: zero RAM offload — all models must fit entirely in 12 GB VRAM

Kubernetes Homelab Cluster

Production-grade self-hosted stack. Key components relevant to ALPHA_PROJECT:

Component Role
n8n Primary orchestrator and workflow engine for all agents
Node-RED Event-driven automation, Home Assistant bridge
Patroni / PostgreSQL Persistent structured memory store
Qdrant Vector store for semantic/episodic memory (to be deployed)
NATS / Redis Streams Message broker between agents (to be chosen and deployed)
Authentik SSO / IAM (OIDC)
Home Assistant IoT hub — device tracking, automations, sensors
MikroTik Network — VLANs, firewall rules, device presence detection
Paperless-ngx Document archive (docs.mt-home.uk)
Actual Budget Personal finance
Mealie Meal planning / recipes
Immich Photo library
Outline Internal wiki / knowledge base
Radarr / Sonarr Media management
Jenkins CI/CD
AdGuard DNS filtering
WireGuard VPN
Minio S3-compatible object storage
Longhorn Distributed block storage
Velero Disaster recovery / backup

External Services (in use)

  • Gmail — primary email
  • Google Calendar — calendar (multiple calendars: Work, Family, Formula 1, WEC, Inter, Birthdays, Tasks, Pulizie, Spazzatura, Festività Italia, Varie)
  • Amazon Echo — voice interface for Pompeo
  • AWS Lambda — bridge between Alexa skill and n8n webhook
  • Telegram — notifications, logging, manual document upload
  • GitHub Copilot (GPT-4.1 via api.githubcopilot.com) — LLM completions at zero cost (EY license)

Internal Services / Custom

  • orchestrator.mt-home.uk — n8n instance
  • docs.mt-home.uk — Paperless-ngx
  • filewizard.home.svc.cluster.local:8000 — custom OCR microservice (async, job-based API)

Architecture Overview

Multi-Agent Design

ALPHA_PROJECT uses specialized agents, each responsible for a specific data domain. All agents are implemented as n8n workflows.

Agent Trigger Responsibility
Mail Agent Cron every 15-30 min Read Gmail, classify emails, extract facts, detect invoices/bills
Finance Agent Triggered by Mail Agent or Telegram Process PDF invoices/bills, archive to Paperless, persist to memory
Calendar Agent Cron + on-demand Read Google Calendar, detect upcoming events, cross-reference with other agents
Infrastructure Agent Cron + alert webhooks Monitor Kubernetes cluster health, disk usage, failed jobs
IoT Agent Event-driven (Home Assistant webhooks) Monitor device presence, home state, learn behavioral patterns
Newsletter Agent Cron morning Digest newsletters, extract relevant articles
Proactive Arbiter Cron (adaptive frequency) + high-priority queue messages Consume agent outputs, correlate, decide what to notify

Message Broker (Blackboard Pattern)

Agents do not call each other directly. They publish observations to a central message queue (NATS JetStream or Redis Streams — TBD). The Proactive Arbiter consumes the queue, batches low-priority messages, and immediately processes high-priority ones.

Message schema (all agents must conform):

{
  "agent": "mail",
  "priority": "low|high",
  "event_type": "new_fact|reminder|alert|behavioral_observation",
  "subject": "brief description",
  "detail": {},
  "source_ref": "optional reference to postgres record or external ID",
  "timestamp": "ISO8601",
  "expires_at": "ISO8601 or null"
}
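A minimal validator for this schema — a sketch of what each agent's final n8n Code node (or a shared helper) could enforce before publishing to the queue; key names follow the draft above, and the helper itself is hypothetical:

```python
from datetime import datetime

REQUIRED_KEYS = {"agent", "priority", "event_type", "subject", "detail", "timestamp"}
VALID_PRIORITIES = {"low", "high"}
VALID_EVENT_TYPES = {"new_fact", "reminder", "alert", "behavioral_observation"}

def validate_message(msg: dict) -> list:
    """Return a list of schema violations (empty list = valid message)."""
    errors = []
    missing = REQUIRED_KEYS - msg.keys()
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if msg.get("priority") not in VALID_PRIORITIES:
        errors.append("priority must be 'low' or 'high'")
    if msg.get("event_type") not in VALID_EVENT_TYPES:
        errors.append("unknown event_type")
    if not isinstance(msg.get("detail"), dict):
        errors.append("detail must be an object")
    try:
        # ISO8601 check; expires_at may be null, so only timestamp is validated here
        datetime.fromisoformat(msg.get("timestamp", ""))
    except ValueError:
        errors.append("timestamp must be ISO8601")
    return errors
```

Rejecting malformed messages at publish time keeps the Arbiter's consumption side simple: it can assume every queued message conforms.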

Memory Architecture

Three layers of persistence:

1. Structured memory — PostgreSQL (Patroni)

Episodic facts, finance records, reminders, behavioral observations. Fast, queryable, expirable.

-- Generic episodic facts
CREATE TABLE memory_facts (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  source TEXT NOT NULL,         -- 'email', 'calendar', 'iot', 'paperless', ...
  category TEXT,                -- 'finance', 'personal', 'work', 'health', ...
  subject TEXT,
  detail JSONB,                 -- flexible per-source payload
  action_required BOOLEAN DEFAULT false,
  action_text TEXT,
  created_at TIMESTAMP DEFAULT now(),
  expires_at TIMESTAMP,         -- facts have a TTL
  qdrant_id UUID                -- FK to vector store
);

-- Finance documents (frequent structured queries)
CREATE TABLE finance_documents (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  paperless_doc_id INT,
  correspondent TEXT,
  amount NUMERIC(10,2),
  currency TEXT DEFAULT 'EUR',
  doc_date DATE,
  doc_type TEXT,
  tags TEXT[],
  created_at TIMESTAMP DEFAULT now()
);

-- Behavioral context (used by IoT agent and Arbiter)
CREATE TABLE behavioral_context (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  event_type TEXT,              -- 'sport_event', 'dog_walk', 'work_session', ...
  start_at TIMESTAMP,
  end_at TIMESTAMP,
  do_not_disturb BOOLEAN DEFAULT false,
  home_presence_expected BOOLEAN,
  notes TEXT
);

2. Semantic memory — Qdrant

Vector embeddings for similarity search. Three collections:

Collection Content
martin_episodes Conversations, episodic facts with timestamp
martin_knowledge Documents, Outline notes, newsletters, knowledge base
martin_preferences Preferences, habits, behavioral patterns

Each Qdrant point includes a metadata payload for pre-filtering (source, date, category, action_required) to avoid full-scan similarity searches.
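A sketch of the corresponding REST search body (Qdrant's `filter.must` conditions applied before vector scoring), assuming the payload keys listed above:

```python
def qdrant_search_body(vector, source=None, category=None,
                       action_required=None, limit=5):
    """Build the body for POST /collections/<name>/points/search
    with payload pre-filter conditions, so similarity search never
    scans points outside the requested source/category."""
    must = []
    if source is not None:
        must.append({"key": "source", "match": {"value": source}})
    if category is not None:
        must.append({"key": "category", "match": {"value": category}})
    if action_required is not None:
        must.append({"key": "action_required", "match": {"value": action_required}})
    body = {"vector": vector, "limit": limit, "with_payload": True}
    if must:
        body["filter"] = {"must": must}
    return body
```

The same structure maps 1:1 onto an n8n HTTP Request node body against the deployed Qdrant instance.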

3. Profile memory — PostgreSQL (static table)

User preferences, fixed facts, communication style. Updated manually or via explicit agent action.

Embedding Strategy

  • Embeddings are generated via Ollama (nomic-embed-text or equivalent) once the LLM server is online
  • During bootstrap phase: embeddings generated via GitHub Copilot (text-embedding-3-small at api.githubcopilot.com/embeddings) — same token acquisition pattern already in use
  • Never embed raw content — always embed LLM-generated summaries + extracted entities
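A tiny sketch of the last rule — the text sent to the embedding model is always the LLM summary plus extracted entities, never the raw body (the helper name is hypothetical):

```python
def embedding_input(summary: str, entities: list) -> str:
    """Compose the embedding input: summary + deduplicated entity list."""
    parts = [summary.strip()]
    if entities:
        parts.append("Entities: " + ", ".join(sorted(set(entities))))
    return "\n".join(parts)
```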

Proactive Notification Logic

The Arbiter runs on an adaptive schedule:

Time slot Frequency Behavior
23:00–07:00 Never Silence
07:00–09:00 Once Morning briefing (calendar, reminders, pending actions)
09:00–19:00 Every 2-3h Only high-priority or correlated events
19:00–22:00 Once Evening recap + next day preview

High-priority queue messages bypass the schedule and trigger immediate notification.

Notification is sent via Amazon Echo / Pompeo (TTS) for voice, and Telegram for logging. Every Arbiter decision (notify / discard / defer) is logged to a dedicated Telegram audit channel.
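The schedule and the bypass rule can be sketched as a single decision function (simplified: in this sketch, daytime low-priority batches are always deferred to the next slot rather than batched every 2-3h):

```python
from datetime import datetime

def arbiter_action(now: datetime, priority: str) -> str:
    """Decide what the Arbiter does with a batch: 'notify', 'defer', or 'silence'."""
    if priority == "high":
        return "notify"          # high-priority bypasses the schedule entirely
    hour = now.hour
    if hour >= 23 or hour < 7:
        return "silence"         # 23:00–07:00: never disturb
    if 7 <= hour < 9 or 19 <= hour < 22:
        return "notify"          # morning briefing / evening recap windows
    return "defer"               # batch for the next scheduled slot
```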

Voice Interface (Pompeo)

  • Amazon Echo → Alexa Custom Skill → AWS Lambda (bridge) → n8n webhook → Ollama (Qwen2.5-14B) → TTS response back to Echo
  • Wake phrase: "Pompeo"
  • Lambda is intentionally thin — it only translates the Alexa request format to the n8n webhook payload and returns the TTS response
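A sketch of that thin Lambda, assuming a `query` slot in the custom intent and a hypothetical webhook path (the real payload shapes live in the skill definition and the n8n workflow):

```python
import json
import urllib.request

N8N_WEBHOOK = "https://orchestrator.mt-home.uk/webhook/pompeo"  # assumed path

def lambda_handler(event, context):
    """Thin bridge: forward the Alexa utterance to n8n, wrap the reply as TTS."""
    utterance = (
        event.get("request", {})
        .get("intent", {})
        .get("slots", {})
        .get("query", {})        # assumed slot name in the custom intent
        .get("value", "")
    )
    req = urllib.request.Request(
        N8N_WEBHOOK,
        data=json.dumps({"text": utterance}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=8) as resp:
        answer = json.load(resp).get("reply", "Non ho capito.")
    return alexa_response(answer)

def alexa_response(text: str) -> dict:
    """Standard Alexa custom-skill response envelope (PlainText TTS)."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }
```

All conversational logic stays in n8n/Ollama; the Lambda only reshapes payloads, which keeps the AWS footprint trivial to maintain.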

Existing n8n Workflows (already in production)

📬 Gmail — Daily Digest [Schedule] (1lIKvVJQIcva30YM)

  • Runs every 3 hours (+ test webhook)
  • Fetches unread emails from the last 3 hours
  • Calls GPT-4.1 (via Copilot) to classify each email: category, sentiment, labels, action_required, whether it has a Paperless-relevant PDF attachment
  • Applies Gmail labels, marks as read, trashes spam
  • If a bill/invoice PDF is detected → triggers the Upload Bolletta webhook
  • Sends a digest report to Telegram

📄 Paperless — Upload Bolletta [Email] (vbzQ3fgUalOPdcOq)

  • Triggered by webhook from Daily Digest (payload includes email_id)
  • Downloads the PDF attachment from Gmail API
  • Fetches Paperless metadata (correspondents, document types, tags, storage paths, similar existing documents)
  • Calls GPT-4.1 to infer Paperless metadata (correspondent, doc type, tags, storage path, filename, date)
  • Uploads PDF to Paperless, polls task status, patches metadata on the created document
  • Sends Telegram confirmation

📄 Paperless — Upload Documento [Telegram] (ZX5rLSETg6Xcymps)

  • Triggered by Telegram bot (user sends a PDF with caption starting with "Documento")
  • Downloads file from Telegram
  • Sends to FileWizard OCR microservice (async job), polls for result
  • Same GPT-4.1 metadata inference pipeline as above
  • Uploads to Paperless (filename = original filename without extension), patches metadata
  • Sends Telegram confirmation with link to document
  • Cleans up FileWizard: deletes processed files, then clears job history

Common pattern across all three: GitHub Copilot token is obtained fresh at each run (GET https://api.github.com/copilot_internal/v2/token), then used for POST https://api.githubcopilot.com/chat/completions with model gpt-4.1.
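In Python terms, the same pattern looks roughly like this (endpoints as documented above; `urllib` used to keep the sketch dependency-free):

```python
import json
import urllib.request

def fetch_copilot_token(github_oauth_token: str) -> str:
    """Exchange the GitHub OAuth token for a short-lived Copilot token —
    fetched fresh on every run, never cached across executions."""
    req = urllib.request.Request(
        "https://api.github.com/copilot_internal/v2/token",
        headers={"Authorization": f"token {github_oauth_token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["token"]

def chat_completion_request(copilot_token: str, messages: list) -> urllib.request.Request:
    """Build the chat/completions request; the caller performs urlopen()."""
    return urllib.request.Request(
        "https://api.githubcopilot.com/chat/completions",
        data=json.dumps({"model": "gpt-4.1", "messages": messages}).encode(),
        headers={
            "Authorization": f"Bearer {copilot_token}",
            "Content-Type": "application/json",
        },
    )
```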

n8n Credentials (IDs)

ID Name Type
qvOikS6IF0H5khr8 Gmail OAuth2 OAuth2
uTXHLqcCJxbOvqN3 Telegram account Telegram API
vBwUxlzKrX3oDHyN GitHub Copilot OAuth Token HTTP Header Auth
uvGjLbrN5yQTQIzv Paperless-NGX API HTTP Header Auth

Coding Conventions

  • n8n workflows: nodes named in Italian, descriptive emoji prefixes on trigger nodes
  • Workflow naming: {icon} {App} — {Azione} {Tipo} [{Sorgente}] (e.g. 📄 Paperless — Upload Documento [Telegram])
  • HTTP nodes: always use predefinedCredentialType for authenticated services already configured in n8n credentials
  • GPT body: use contentType: "raw" + rawContentType: "application/json" + a JSON.stringify({...}) inline expression — never specifyBody: "string"
  • LLM output parsing: always defensive — handle missing choices, malformed JSON, empty responses gracefully
  • Copilot token: always fetched fresh per workflow run, never cached across executions
  • Binary fields: Telegram node file.get with download: true stores binary in field named data (not attachment)
  • Postgres: use UUID primary keys with gen_random_uuid(), JSONB for flexible payloads, always include created_at
  • Qdrant upsert: always include full metadata payload for filtering; use message_id / thread_id / doc_id as logical dedup keys
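The defensive LLM-output-parsing convention, sketched as a standalone helper (the markdown-fence stripping is an assumption about occasional model behavior, not a documented contract):

```python
import json

def parse_llm_json(response: dict, fallback=None) -> dict:
    """Defensively extract a JSON object from a chat/completions response:
    missing choices, empty content, and malformed JSON all return the fallback."""
    fallback = fallback if fallback is not None else {}
    try:
        content = response["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError):
        return fallback
    if not content:
        return fallback
    content = content.strip()
    # Models sometimes wrap JSON in markdown fences — strip them first
    if content.startswith("```"):
        content = content.strip("`\n")
        if content.startswith("json"):
            content = content[4:]
    try:
        parsed = json.loads(content)
    except json.JSONDecodeError:
        return fallback
    return parsed if isinstance(parsed, dict) else fallback
```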

TO-DO

Phase 0 — Infrastructure Bootstrap (prerequisite for everything)

  • Deploy Qdrant on the Kubernetes cluster
    • Create collections: martin_episodes, martin_knowledge, martin_preferences
    • Configure payload indexes on: source, category, date, action_required
  • Run PostgreSQL migrations on Patroni
    • Create tables: memory_facts, finance_documents, behavioral_context
    • Add index on memory_facts(source, category, expires_at)
  • Verify embedding endpoint via Copilot (text-embedding-3-small) as bootstrap fallback
  • Plan migration to local Ollama embedding model once LLM server is online

Phase 1 — Memory Integration into Existing Workflows

  • Daily Digest: after Parse risposta GPT-4.1, add:

    • Postgres INSERT into memory_facts (source=email, category, subject, detail JSONB, action_required, expires_at)
    • Embedding generation (Copilot endpoint) → Qdrant upsert into martin_episodes
    • Thread dedup: use thread_id as logical key, update existing Qdrant point if thread already exists
  • Upload Bolletta + Upload Documento (Telegram): after Paperless - Patch Metadati, add:

    • Postgres INSERT into finance_documents (correspondent, amount, doc_date, doc_type, tags, paperless_doc_id)
    • Postgres INSERT into memory_facts (source=paperless, category=finance, cross-reference)
    • Embedding of OCR text chunks → Qdrant upsert into martin_knowledge

Phase 2 — New Agents

  • Calendar Agent

    • Poll Google Calendar (all relevant calendars)
    • Persist upcoming events to Postgres (memory_facts + behavioral_context for leisure events)
    • Weekly cluster embedding (chunk per week, not per event)
    • Dedup recurring events: embed only first occurrence, store rest in Postgres only
  • Finance Agent (extend beyond Paperless)

    • Read Actual Budget export or API
    • Persist transactions, monthly summaries to finance_documents
    • Trend analysis prompt for periodic financial summary
  • Infrastructure Agent

    • Webhook receiver for Kubernetes/Longhorn/Minio alerts
    • Cron-based cluster health check (disk, pod status, backup freshness)
    • Publishes to message broker with priority: high for critical alerts
  • IoT Agent

    • Home Assistant webhook → Node-RED → n8n
    • Device presence tracking → behavioral_context
    • Pattern recognition via Qdrant similarity on historical episodes (e.g. "Tuesday evening, outside, laptop on")
  • Newsletter Agent

    • Separate Gmail label for newsletters (excluded from Daily Digest main flow)
    • Morning cron: summarize + extract relevant articles → martin_knowledge

Phase 3 — Message Broker + Proactive Arbiter

  • Choose and deploy broker: NATS JetStream (preferred — lightweight, native Kubernetes) or Redis Streams
  • Define final message schema (draft above, to be validated)
  • Implement Proactive Arbiter n8n workflow:
    • Adaptive schedule (morning briefing, midday, evening recap)
    • Consume queue batch → LLM correlation prompt → structured notify/defer/discard output
    • High-priority bypass path
    • All decisions logged to Telegram audit channel
  • Implement correlation logic: detect when 2+ agents report related events (e.g. IoT presence + calendar event + open reminder)

Phase 4 — Voice Interface (Pompeo)

  • Create Alexa Custom Skill ("Pompeo")
  • AWS Lambda bridge (thin translator: Alexa request → n8n webhook → TTS response)
  • n8n webhook handler: receive transcribed text → prepend memory context → Ollama inference → return TTS string
  • TTS response pipeline back to Echo
  • Proactive push: Arbiter → Lambda → Echo notification (Alexa proactive events API)

Phase 5 — Generalization and Backlog

  • OCR on email attachments in Daily Digest: generalize the ingest pipeline to extract text from any PDF attachment (not just bills), using FileWizard OCR — produce richer embeddings and enable full-text retrieval on any emailed document
  • Flusso Cedolino (payslip pipeline):
    • Trigger: Gmail label Lavoro/Cedolino or Telegram upload
    • PDF → FileWizard OCR → GPT-4.1 metadata extraction (month, gross, net, deductions)
    • Paperless upload with tag Cedolino
    • Persist structured data to finance_documents (custom fields for payslip)
    • Trend embedding in martin_knowledge for finance agent queries
  • Behavioral habit modeling: aggregate behavioral_context records over time, generate periodic "habit summary" embeddings in martin_preferences
  • Outline → Qdrant pipeline: sync selected Outline documents into martin_knowledge on edit/publish event
  • Chrome browsing history ingestion (privacy-filtered): evaluate browser extension or local export → embedding pipeline for interest/preference modeling
  • "Posti e persone" graph: structured contact/location model in Postgres, populated from email senders, calendar attendees, Home Assistant presence data
  • Local embedding model: migrate from Copilot text-embedding-3-small to Ollama-served model (e.g. nomic-embed-text) once LLM server is stable