docs: add ALPHA_PROJECT system context
# ALPHA_PROJECT — System Context for GitHub Copilot
## Who I am

I am a Cloud Architect with 6 years of Big4 consulting experience (currently a manager at EY), working daily on Azure, Dynamics 365 / Dataverse, .NET, and enterprise integration patterns. I run a production-grade Kubernetes homelab at home, and I am building ALPHA_PROJECT as a personal initiative in my spare time (evenings and weekends).

---
## What ALPHA_PROJECT is

ALPHA_PROJECT is a **proactive, multi-agent personal AI assistant** built entirely on self-hosted infrastructure. It is not a chatbot. It is an autonomous system that:

- Monitors my digital life (email, calendar, home automation, finances, infrastructure)
- Maintains a persistent, structured memory of facts, habits, and preferences
- Takes initiative to notify me of relevant events, correlations, and pending actions
- Interacts with me via voice (Amazon Echo / Alexa custom skill named **"Pompeo"**) and Telegram
- Runs local LLMs on dedicated hardware — no cloud AI inference (except GitHub Copilot completions, available via EY license at zero cost)

The assistant is named **Pompeo** (the Alexa skill wake word).

---
## Infrastructure

### LLM Server (new, dedicated node — outside the Kubernetes cluster)

- **CPU**: AMD Ryzen 5 4500
- **RAM**: 16 GB DDR4
- **GPU**: NVIDIA GeForce RTX 3060 (12 GB VRAM)
- **Runtime**: Ollama (OpenAI-compatible API)
- **Primary model**: Qwen2.5-14B-Instruct Q4_K_M (fits entirely in VRAM, no offload)
- **Secondary model**: Qwen2.5-Coder-14B-Instruct Q4_K_M (for code-related tasks)
- **Embedding model**: TBD — to be served via Ollama (e.g. `nomic-embed-text`)
- **Constraint**: zero RAM offload — all models must fit entirely in the 12 GB of VRAM
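As a sanity check on that constraint, here is a back-of-envelope sketch. All numbers are assumptions, not measurements: Q4_K_M averages roughly 4.85 bits per weight, and KV cache plus CUDA runtime overhead is budgeted as a flat margin against the 3060's 12 GB.

```javascript
// Rough feasibility check for the zero-offload constraint.
// bitsPerWeight ~4.85 is the usual Q4_K_M average; overheadGiB is a guess
// covering KV cache and CUDA runtime, not a measured value.
function fitsInVram({ paramsB, bitsPerWeight = 4.85, overheadGiB = 2, vramGiB = 12 }) {
  const weightsGiB = (paramsB * 1e9 * bitsPerWeight) / 8 / 1024 ** 3;
  return { weightsGiB: Number(weightsGiB.toFixed(1)), fits: weightsGiB + overheadGiB <= vramGiB };
}

// Qwen2.5-14B-Instruct has ~14.7B parameters: ~8.3 GiB of weights, so it fits.
console.log(fitsInVram({ paramsB: 14.7 }));
```

Running both 14B models at once would not fit; Ollama's model swapping handles that, at the cost of reload latency.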
### Kubernetes Homelab Cluster

Production-grade self-hosted stack. Key components relevant to ALPHA_PROJECT:

| Component | Role |
|---|---|
| **n8n** | Primary orchestrator and workflow engine for all agents |
| **Node-RED** | Event-driven automation, Home Assistant bridge |
| **Patroni / PostgreSQL** | Persistent structured memory store |
| **Qdrant** | Vector store for semantic/episodic memory *(to be deployed)* |
| **NATS / Redis Streams** | Message broker between agents *(to be chosen and deployed)* |
| **Authentik** | SSO / IAM (OIDC) |
| **Home Assistant** | IoT hub — device tracking, automations, sensors |
| **MikroTik** | Network — VLANs, firewall rules, device presence detection |
| **Paperless-ngx** | Document archive (`docs.mt-home.uk`) |
| **Actual Budget** | Personal finance |
| **Mealie** | Meal planning / recipes |
| **Immich** | Photo library |
| **Outline** | Internal wiki / knowledge base |
| **Radarr / Sonarr** | Media management |
| **Jenkins** | CI/CD |
| **AdGuard** | DNS filtering |
| **WireGuard** | VPN |
| **MinIO** | S3-compatible object storage |
| **Longhorn** | Distributed block storage |
| **Velero** | Disaster recovery / backup |
### External Services (in use)

- **Gmail** — primary email
- **Google Calendar** — calendar (multiple calendars: Work, Family, Formula 1, WEC, Inter, Birthdays, Tasks, Pulizie, Spazzatura, Festività Italia, Varie — i.e. cleaning, trash, Italian holidays, miscellaneous)
- **Amazon Echo** — voice interface for Pompeo
- **AWS Lambda** — bridge between the Alexa skill and the n8n webhook
- **Telegram** — notifications, logging, manual document upload
- **GitHub Copilot** (GPT-4.1 via `api.githubcopilot.com`) — LLM completions at zero cost (EY license)

### Internal Services / Custom

- `orchestrator.mt-home.uk` — n8n instance
- `docs.mt-home.uk` — Paperless-ngx
- `filewizard.home.svc.cluster.local:8000` — custom OCR microservice (async, job-based API)

---
## Architecture Overview

### Multi-Agent Design

ALPHA_PROJECT uses specialized agents, each responsible for a specific data domain. All agents are implemented as **n8n workflows**.
| Agent | Trigger | Responsibility |
|---|---|---|
| **Mail Agent** | Cron every 15–30 min | Read Gmail, classify emails, extract facts, detect invoices/bills |
| **Finance Agent** | Triggered by Mail Agent or Telegram | Process PDF invoices/bills, archive to Paperless, persist to memory |
| **Calendar Agent** | Cron + on-demand | Read Google Calendar, detect upcoming events, cross-reference with other agents |
| **Infrastructure Agent** | Cron + alert webhooks | Monitor Kubernetes cluster health, disk usage, failed jobs |
| **IoT Agent** | Event-driven (Home Assistant webhooks) | Monitor device presence, home state, learn behavioral patterns |
| **Newsletter Agent** | Morning cron | Digest newsletters, extract relevant articles |
| **Proactive Arbiter** | Cron (adaptive frequency) + high-priority queue messages | Consume agent outputs, correlate, decide what to notify |
### Message Broker (Blackboard Pattern)

Agents do not call each other directly. They publish observations to a **central message queue** (NATS JetStream or Redis Streams — TBD). The **Proactive Arbiter** consumes the queue, batches low-priority messages, and processes high-priority ones immediately.

Message schema (all agents must conform):

```json
{
  "agent": "mail",
  "priority": "low|high",
  "event_type": "new_fact|reminder|alert|behavioral_observation",
  "subject": "brief description",
  "detail": {},
  "source_ref": "optional reference to postgres record or external ID",
  "timestamp": "ISO8601",
  "expires_at": "ISO8601 or null"
}
```
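Since every agent must conform to this schema, a shared guard is worth having at each publish point. A minimal sketch for an n8n Code node, validating shape only (the broker itself is still TBD):

```javascript
// Reject a queue message that does not conform to the draft schema above.
// Field names come straight from the draft; `source_ref` is optional.
function validateAgentMessage(msg) {
  const errors = [];
  if (typeof msg.agent !== "string" || !msg.agent) errors.push("agent");
  if (!["low", "high"].includes(msg.priority)) errors.push("priority");
  if (!["new_fact", "reminder", "alert", "behavioral_observation"].includes(msg.event_type))
    errors.push("event_type");
  if (typeof msg.subject !== "string") errors.push("subject");
  if (typeof msg.detail !== "object" || msg.detail === null) errors.push("detail");
  if (Number.isNaN(Date.parse(msg.timestamp))) errors.push("timestamp");
  if (msg.expires_at !== null && Number.isNaN(Date.parse(msg.expires_at))) errors.push("expires_at");
  return { ok: errors.length === 0, errors };
}
```

Failing messages can be dropped into a Telegram audit log rather than silently discarded.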
### Memory Architecture

Three layers of persistence:

**1. Structured memory — PostgreSQL (Patroni)**

Episodic facts, finance records, reminders, behavioral observations. Fast, queryable, expirable.
```sql
-- Generic episodic facts
CREATE TABLE memory_facts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    source TEXT NOT NULL,               -- 'email', 'calendar', 'iot', 'paperless', ...
    category TEXT,                      -- 'finance', 'personal', 'work', 'health', ...
    subject TEXT,
    detail JSONB,                       -- flexible per-source payload
    action_required BOOLEAN DEFAULT false,
    action_text TEXT,
    created_at TIMESTAMP DEFAULT now(),
    expires_at TIMESTAMP,               -- facts have a TTL
    qdrant_id UUID                      -- logical link to the vector store (not a SQL FK)
);

-- Finance documents (frequent structured queries)
CREATE TABLE finance_documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    paperless_doc_id INT,
    correspondent TEXT,
    amount NUMERIC(10,2),
    currency TEXT DEFAULT 'EUR',
    doc_date DATE,
    doc_type TEXT,
    tags TEXT[],
    created_at TIMESTAMP DEFAULT now()
);

-- Behavioral context (used by IoT agent and Arbiter)
CREATE TABLE behavioral_context (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    event_type TEXT,                    -- 'sport_event', 'dog_walk', 'work_session', ...
    start_at TIMESTAMP,
    end_at TIMESTAMP,
    do_not_disturb BOOLEAN DEFAULT false,
    home_presence_expected BOOLEAN,
    notes TEXT
);
```
**2. Semantic memory — Qdrant**

Vector embeddings for similarity search. Three collections:

| Collection | Content |
|---|---|
| `martin_episodes` | Conversations, episodic facts with timestamp |
| `martin_knowledge` | Documents, Outline notes, newsletters, knowledge base |
| `martin_preferences` | Preferences, habits, behavioral patterns |

Each Qdrant point includes a metadata payload for pre-filtering (source, date, category, action_required) to avoid full-scan similarity searches.

**3. Profile memory — PostgreSQL (static table)**

User preferences, fixed facts, communication style. Updated manually or via explicit agent action.
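The pre-filtering described above maps directly onto Qdrant's filtered search. A sketch of the request body an agent would send to `POST /collections/martin_episodes/points/search` (the endpoint and `filter.must` shape are standard Qdrant; the field names are the payload fields listed above):

```javascript
// Build a Qdrant search body that filters on payload metadata before
// similarity scoring, so the vector search never scans unrelated sources.
function buildFilteredSearch(vector, { source, category, limit = 5 }) {
  const must = [];
  if (source) must.push({ key: "source", match: { value: source } });
  if (category) must.push({ key: "category", match: { value: category } });
  return { vector, limit, with_payload: true, filter: { must } };
}
```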
### Embedding Strategy

- Embeddings are generated via Ollama (`nomic-embed-text` or equivalent) once the LLM server is online
- During the bootstrap phase, embeddings are generated via GitHub Copilot (`text-embedding-3-small` at `api.githubcopilot.com/embeddings`), using the same token acquisition pattern already in use
- Never embed raw content — always embed **LLM-generated summaries + extracted entities**
### Proactive Notification Logic

The Arbiter runs on an **adaptive schedule**:

| Time slot | Frequency | Behavior |
|---|---|---|
| 23:00–07:00 | Never | Silence |
| 07:00–09:00 | Once | Morning briefing (calendar, reminders, pending actions) |
| 09:00–19:00 | Every 2–3 h | Only high-priority or correlated events |
| 19:00–22:00 | Once | Evening recap + next-day preview |

High-priority queue messages bypass the schedule and trigger immediate notification.

Notifications are sent via **Amazon Echo / Pompeo** (TTS) for voice, and via **Telegram** for logging. Every Arbiter decision (notify / discard / defer) is logged to a dedicated Telegram audit channel.
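The time-slot table reduces to a small pure helper behind the Arbiter's cron triggers. A sketch, with one assumption: the 22:00–23:00 gap is not listed in the table and is treated as quiet time here. High-priority messages skip this check entirely via the bypass path.

```javascript
// Map the local hour to the Arbiter time slots in the table above.
function arbiterSlot(hour) {
  if (hour >= 23 || hour < 7) return "silence";          // night
  if (hour < 9) return "morning_briefing";               // 07:00-09:00
  if (hour < 19) return "daytime_batch";                 // 09:00-19:00, every 2-3 h
  if (hour < 22) return "evening_recap";                 // 19:00-22:00
  return "silence";                                      // 22:00-23:00: assumed quiet
}
```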
### Voice Interface (Pompeo)

- Amazon Echo → **Alexa Custom Skill** → **AWS Lambda** (bridge) → **n8n webhook** → Ollama (Qwen2.5-14B) → TTS response back to the Echo
- Wake phrase: "Pompeo"
- The Lambda is intentionally thin — it only translates the Alexa request format into the n8n webhook payload and returns the TTS response

---
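The thin Lambda above boils down to two pure mappings, sketched here for a Node.js runtime. The Alexa request/response envelope is the standard custom-skill JSON; the webhook payload field names and the `query` slot name are assumptions for illustration:

```javascript
// Alexa custom-skill request -> n8n webhook payload (hypothetical field names).
function toWebhookPayload(alexaEvent) {
  const intent = alexaEvent.request?.intent;
  return {
    session_id: alexaEvent.session?.sessionId,
    intent: intent?.name ?? alexaEvent.request?.type,
    utterance: intent?.slots?.query?.value ?? "",
  };
}

// TTS text from n8n -> minimal Alexa response envelope.
function toAlexaResponse(ttsText) {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text: ttsText },
      shouldEndSession: true,
    },
  };
}
```

Everything else in the Lambda is a single HTTP call to the n8n webhook between these two functions.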
## Existing n8n Workflows (already in production)

### 📬 Gmail — Daily Digest [Schedule] (`1lIKvVJQIcva30YM`)

- Runs every 3 hours (plus a test webhook)
- Fetches unread emails from the last 3 hours
- Calls GPT-4.1 (via Copilot) to classify each email: category, sentiment, labels, action_required, and whether it has a Paperless-relevant PDF attachment
- Applies Gmail labels, marks emails as read, trashes spam
- If a bill/invoice PDF is detected → triggers the **Upload Bolletta** (utility bill) webhook
- Sends a digest report to Telegram
### 📄 Paperless — Upload Bolletta [Email] (`vbzQ3fgUalOPdcOq`)

- Triggered by a webhook from Daily Digest (payload includes `email_id`)
- Downloads the PDF attachment via the Gmail API
- Fetches Paperless metadata (correspondents, document types, tags, storage paths, similar existing documents)
- Calls GPT-4.1 to infer Paperless metadata (correspondent, doc type, tags, storage path, filename, date)
- Uploads the PDF to Paperless, polls the task status, then patches metadata on the created document
- Sends a Telegram confirmation
### 📄 Paperless — Upload Documento [Telegram] (`ZX5rLSETg6Xcymps`)

- Triggered by the Telegram bot (user sends a PDF with a caption starting with "Documento")
- Downloads the file from Telegram
- Sends it to the FileWizard OCR microservice (async job), polls for the result
- Same GPT-4.1 metadata inference pipeline as above
- Uploads to Paperless (filename = original filename without extension), patches metadata
- Sends a Telegram confirmation with a link to the document
- Cleans up FileWizard: deletes processed files, then clears the job history
**Common pattern across all three**: a GitHub Copilot token is obtained fresh on each run (`GET https://api.github.com/copilot_internal/v2/token`), then used for `POST https://api.githubcopilot.com/chat/completions` with model `gpt-4.1`.
### n8n Credentials (IDs)

| ID | Name | Type |
|---|---|---|
| `qvOikS6IF0H5khr8` | Gmail OAuth2 | OAuth2 |
| `uTXHLqcCJxbOvqN3` | Telegram account | Telegram API |
| `vBwUxlzKrX3oDHyN` | GitHub Copilot OAuth Token | HTTP Header Auth |
| `uvGjLbrN5yQTQIzv` | Paperless-NGX API | HTTP Header Auth |

---
## Coding Conventions

- **n8n workflows**: nodes are named in Italian, with descriptive emoji prefixes on trigger nodes
- **Workflow naming**: `{icon} {App} — {Azione} {Tipo} [{Sorgente}]` (i.e. action, type, source; e.g. `📄 Paperless — Upload Documento [Telegram]`)
- **HTTP nodes**: always use `predefinedCredentialType` for authenticated services already configured in n8n credentials
- **GPT request body**: use `contentType: "raw"` + `rawContentType: "application/json"` + an inline `JSON.stringify({...})` expression — never `specifyBody: string`
- **LLM output parsing**: always defensive — handle missing `choices`, malformed JSON, and empty responses gracefully
- **Copilot token**: always fetched fresh per workflow run, never cached across executions
- **Binary fields**: the Telegram node's `file.get` with `download: true` stores the binary in a field named `data` (not `attachment`)
- **Postgres**: use UUID primary keys with `gen_random_uuid()`, JSONB for flexible payloads, and always include `created_at`
- **Qdrant upsert**: always include the full metadata payload for filtering; use `message_id` / `thread_id` / `doc_id` as logical dedup keys

---
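The defensive-parsing convention is worth pinning down as a shared helper. A sketch that tolerates missing `choices`, markdown-fenced output, and empty or malformed content (the fence-stripping regex is an assumption about how the model misbehaves, not a Copilot guarantee):

```javascript
// Defensively extract and parse JSON from a chat-completions response.
function parseLlmJson(response) {
  const raw = response?.choices?.[0]?.message?.content ?? "";
  if (!raw.trim()) return { ok: false, error: "empty response", data: null };
  try {
    // Models sometimes wrap JSON in ```json fences; strip them before parsing.
    const cleaned = raw.trim().replace(/^```(?:json)?\s*|\s*```$/g, "");
    return { ok: true, error: null, data: JSON.parse(cleaned) };
  } catch {
    return { ok: false, error: "malformed JSON", data: raw };
  }
}
```

Every workflow that calls GPT-4.1 can route the `ok: false` branch to the Telegram log instead of failing the run.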
## TO-DO

### Phase 0 — Infrastructure Bootstrap *(prerequisite for everything)*

- [ ] Deploy **Qdrant** on the Kubernetes cluster
  - Create collections: `martin_episodes`, `martin_knowledge`, `martin_preferences`
  - Configure payload indexes on: `source`, `category`, `date`, `action_required`
- [ ] Run **PostgreSQL migrations** on Patroni
  - Create tables: `memory_facts`, `finance_documents`, `behavioral_context`
  - Add an index on `memory_facts(source, category, expires_at)`
- [ ] Verify the embedding endpoint via Copilot (`text-embedding-3-small`) as a bootstrap fallback
- [ ] Plan the migration to a local Ollama embedding model once the LLM server is online

---
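The Qdrant part of Phase 0 can be drafted as plain REST bodies (`PUT /collections/{name}`, then `PUT /collections/{name}/index` per payload field). Vector size 768 matches `nomic-embed-text` and is an assumption until the embedding model is finalized; so is indexing `date` as a keyword rather than a datetime field:

```javascript
// Collection body shared by the three martin_* collections.
const collectionBody = { vectors: { size: 768, distance: "Cosine" } };

// One payload-index request body per filterable field from Phase 0.
const payloadIndexes = ["source", "category", "date", "action_required"].map((field) => ({
  field_name: field,
  field_schema: field === "action_required" ? "bool" : "keyword",
}));
```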
### Phase 1 — Memory Integration into Existing Workflows

- [ ] **Daily Digest**: after `Parse risposta GPT-4.1`, add:
  - Postgres INSERT into `memory_facts` (source=email, category, subject, detail JSONB, action_required, expires_at)
  - Embedding generation (Copilot endpoint) → Qdrant upsert into `martin_episodes`
  - Thread dedup: use `thread_id` as the logical key; update the existing Qdrant point if the thread already exists
- [ ] **Upload Bolletta** + **Upload Documento (Telegram)**: after `Paperless - Patch Metadati`, add:
  - Postgres INSERT into `finance_documents` (correspondent, amount, doc_date, doc_type, tags, paperless_doc_id)
  - Postgres INSERT into `memory_facts` (source=paperless, category=finance, cross-reference)
  - Embedding of OCR text chunks → Qdrant upsert into `martin_knowledge`

---
### Phase 2 — New Agents

- [ ] **Calendar Agent**
  - Poll Google Calendar (all relevant calendars)
  - Persist upcoming events to Postgres (`memory_facts`, plus `behavioral_context` for leisure events)
  - Weekly cluster embedding (one chunk per week, not per event)
  - Dedup recurring events: embed only the first occurrence, store the rest in Postgres only
- [ ] **Finance Agent** (extend beyond Paperless)
  - Read the Actual Budget export or API
  - Persist transactions and monthly summaries to `finance_documents`
  - Trend-analysis prompt for a periodic financial summary
- [ ] **Infrastructure Agent**
  - Webhook receiver for Kubernetes/Longhorn/MinIO alerts
  - Cron-based cluster health check (disk, pod status, backup freshness)
  - Publishes to the message broker with `priority: high` for critical alerts
- [ ] **IoT Agent**
  - Home Assistant webhook → Node-RED → n8n
  - Device presence tracking → `behavioral_context`
  - Pattern recognition via Qdrant similarity on historical episodes (e.g. "Tuesday evening, outside, laptop on")
- [ ] **Newsletter Agent**
  - Separate Gmail label for newsletters (excluded from the Daily Digest main flow)
  - Morning cron: summarize and extract relevant articles → `martin_knowledge`

---
### Phase 3 — Message Broker + Proactive Arbiter

- [ ] Choose and deploy the broker: **NATS JetStream** (preferred — lightweight, Kubernetes-native) or Redis Streams
- [ ] Define the final message schema (drafted above, to be validated)
- [ ] Implement the **Proactive Arbiter** n8n workflow:
  - Adaptive schedule (morning briefing, midday, evening recap)
  - Consume a queue batch → LLM correlation prompt → structured `notify/defer/discard` output
  - High-priority bypass path
  - All decisions logged to the Telegram audit channel
- [ ] Implement **correlation logic**: detect when 2+ agents report related events (e.g. IoT presence + calendar event + open reminder)

---
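A possible first cut of the correlation step, before any LLM is involved: sweep the queued messages in time order and keep only time windows where two or more *different* agents reported something. This is a deliberate simplification of the planned LLM correlation prompt, useful for pre-grouping the bundle the LLM then reasons over; the 2-hour window is an arbitrary assumption:

```javascript
// Group queue messages by time proximity; keep only cross-agent groups.
function correlate(messages, windowMs = 2 * 60 * 60 * 1000) {
  const sorted = [...messages].sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  const groups = [];
  for (const msg of sorted) {
    const last = groups[groups.length - 1];
    if (last && Date.parse(msg.timestamp) - last.start <= windowMs) last.items.push(msg);
    else groups.push({ start: Date.parse(msg.timestamp), items: [msg] });
  }
  // Only windows touched by 2+ distinct agents are candidate correlations.
  return groups.filter((g) => new Set(g.items.map((m) => m.agent)).size >= 2);
}
```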
### Phase 4 — Voice Interface (Pompeo)

- [ ] Create the Alexa Custom Skill ("Pompeo")
- [ ] AWS Lambda bridge (thin translator: Alexa request → n8n webhook → TTS response)
- [ ] n8n webhook handler: receive transcribed text → prepend memory context → Ollama inference → return the TTS string
- [ ] TTS response pipeline back to the Echo
- [ ] Proactive push: Arbiter → Lambda → Echo notification (Alexa Proactive Events API)

---
### Phase 5 — Generalization and Backlog

- [ ] **OCR on email attachments in Daily Digest**: generalize the ingest pipeline to extract text from any PDF attachment (not just bills) using FileWizard OCR — produce richer embeddings and enable full-text retrieval on any emailed document
- [ ] **Flusso Cedolino** (payslip pipeline):
  - Trigger: Gmail label `Lavoro/Cedolino` or Telegram upload
  - PDF → FileWizard OCR → GPT-4.1 metadata extraction (month, gross, net, deductions)
  - Paperless upload with the tag `Cedolino`
  - Persist structured data to `finance_documents` (custom fields for payslips)
  - Trend embedding in `martin_knowledge` for Finance Agent queries
- [ ] Behavioral habit modeling: aggregate `behavioral_context` records over time, generate periodic "habit summary" embeddings in `martin_preferences`
- [ ] Outline → Qdrant pipeline: sync selected Outline documents into `martin_knowledge` on edit/publish events
- [ ] Chrome browsing-history ingestion (privacy-filtered): evaluate a browser extension or local export → embedding pipeline for interest/preference modeling
- [ ] "Posti e persone" (places and people) graph: structured contact/location model in Postgres, populated from email senders, calendar attendees, and Home Assistant presence data
- [ ] Local embedding model: migrate from Copilot `text-embedding-3-small` to an Ollama-served model (e.g. `nomic-embed-text`) once the LLM server is stable