2026-03-21 20:52:03 +01:00
2 changed files with 217 additions and 13 deletions


@@ -4,6 +4,7 @@ All significant changes to the ALPHA_PROJECT project are documented here.
---
## [2026-03-21] AWS Lambda Bridge for Alexa Skill "Pompeo"
Completed the planning and implementation of the AWS Lambda function that acts as a bridge between the Alexa skill "Pompeo" and the n8n backend. This concludes the local development part of "Phase 4 — Voice Interface (Pompeo)".
@@ -357,23 +358,150 @@ All 3 collections are operational (status `green`):
- [x] ~~Run **PostgreSQL migrations** on Patroni~~ ✅ completed in the same session
## [2026-03-21] Jellyfin Playback Agent — Block A completed
## [2026-03-21] Media Agent complete (Block A + B1 + B2) + Calendar Agent fix
### New n8n workflow
### 🎬 Block A — Jellyfin Playback Agent [Webhook] (`AyrKWvboPldzZPsM`) ✅
- **`🎬 Pompeo — Jellyfin Playback [Webhook]`** (`AyrKWvboPldzZPsM`): receives webhooks from Jellyfin (PlaybackStart / PlaybackStop), filters for user `martin` (userId whitelist), and writes to Postgres:
- **PlaybackStart** → INSERT into `behavioral_context` (`event_type=watching_media`, `do_not_disturb=true`, notes with item/device/session_id) + INSERT into `agent_messages` (subject `▶️ <title> (<device>)`)
- **PlaybackStop** → UPDATE on the most recent open row (`end_at=now()`, `do_not_disturb=false`) + INSERT into `agent_messages` (subject `⏹️ ...`)
Real-time webhook on PlaybackStart / PlaybackStop from Jellyfin. Workflow active and verified end-to-end with Ghost in the Shell.
### Bugs fixed (n8n infrastructure)
**Flow:**
```
Jellyfin Plugin → POST /webhook/jellyfin-playback
└→ 🔀 Normalizza Evento (Code — JSON.parse body + field mapping)
└→ 🔀 Start o Stop?
├→ [start] 💾 PG - Apri Sessione + 💬 PG - Msg Start
└→ [stop] 💾 PG - Chiudi Sessione + 💬 PG - Msg Stop
```
- **Webhook path n8n v2**: to register a webhook with a static path via the API, the `webhookId` field must be set as a top-level attribute of the node (not inside `parameters`). Without it, n8n generates the dynamic path `{workflowId}/{nodeName}/{path}`, which the webhook pod does not load correctly in queue mode.
- **SSL Postgres / Patroni**: Postgres credentials created via the API used SSL with `rejectUnauthorized=true` by default, which is incompatible with Patroni's self-signed certificate. Fix: added `NODE_TLS_REJECT_UNAUTHORIZED=0` to the `n8n-app` and `n8n-app-worker` deployments.
- **queryParams Postgres node**: `additionalFields.queryParams` with `$json.*` expressions does not work correctly in n8n v2.5.2. Fix: values inlined in the SQL via n8n expressions `{{ $json.field }}`.
**Postgres:**
- `behavioral_context`: INSERT on start (`do_not_disturb=true`, notes: `{item, device, item_type}`), UPDATE on stop (`end_at=now()`, `do_not_disturb=false`)
- `agent_messages`: subject `▶️ Ghost in the Shell (Chrome - PC)` / `⏹️ Ghost in the Shell (Chrome - PC)`
### Jellyfin configuration
**Verified behavior (real test):**
- PlaybackStart → correct INSERT, `do_not_disturb=true`
- PlaybackStop → fired only when the **player is closed** (not on pause); this is native Jellyfin behavior
- User filter: only `UserId = 42369255a7c64917a28fc26d4c7f8265` (Martin)
- Jellyfin Webhook plugin configured on `http://n8n-app-webhook.automation.svc.cluster.local/webhook/jellyfin-playback` (POST, events: PlaybackStart + PlaybackStop)
**Bug fixes applied during debugging:**
| # | Problem | Cause | Fix |
|---|---|---|---|
| 1 | Webhook 404 in queue mode | `webhookId` was not top-level in the node JSON → n8n generates the dynamic path `{workflowId}/{nodeName}/{path}`, which the webhook pod does not load | `webhookId` set as a top-level field of the node |
| 2 | SSL Patroni: self-signed cert rejected | n8n Postgres node uses `rejectUnauthorized=true` by default; Patroni enforces `hostssl` | `NODE_TLS_REJECT_UNAUTHORIZED=0` on the `n8n-app` and `n8n-app-worker` deployments |
| 3 | `Variable $1 out of range` Postgres | `additionalFields.queryParams` + `$json.*` does not work in the Postgres node v2 of n8n 2.5.2 | Migrated to inline SQL with `{{ $json.field }}` expressions |
| 4 | Code node filters everything out, returning `[]` | Jellyfin sends the body as a **JSON string** (`$json.body = '{"ServerId":...}'`), not an object | Added `JSON.parse($json.body)` in the Code node |
| 5 | Undefined fields | Actual Jellyfin field names: `Name`, `DeviceName`/`Client`, `UserId` (no dashes) | Updated the references in the Code node |
| 6 | PlaybackStop did not arrive | The Jellyfin plugin triggers Stop only when the player is closed, not on pause | Documented as expected behavior |
**n8n API token recovery:** the token was corrupted in the LLM context (summarization). Recovered directly from the n8n DB: `SELECT "apiKey", label FROM user_api_keys;` on the `n8n` database.
---
### 🎬 Block B1 — Media Library Sync [Schedule] (`o3uM1xDLTAKw4D6E`) ✅
Weekly cron (Sunday 03:00). Builds the **memory of Martin's film preferences**.
**Flow:**
```
⏰ Cron domenica 03:00
└→ 🎬 HTTP Radarr (/radarr/api/v3/movie) ──┐
└→ 📺 HTTP Sonarr (/sonarr/api/v3/series) ──┤
🔀 Merge Libreria (Code)
└→ 🔑 Token Copilot
└→ 🤖 GPT-4.1 Analisi
└→ 💾 PG Upsert Preferenze
└→ 🔁 Loop Items
└→ 🔢 Ollama Embed
└→ 🗄️ Qdrant Upsert
```
**Data extracted by GPT:** `top_genres`, `preferred_types`, `library_stats`, `taste_summary`, `notable_patterns`
**Postgres:** `memory_facts` (source=`media_library`, source_ref=`media_preferences_summary`, expires +7d), upsert ON CONFLICT
**Qdrant `media_preferences`** (collection created, 768-dim Cosine):
- Embedding: Ollama `nomic-embed-text` on `"{title} {year} {genres} {type}"`
- Payload: `{title, year, type, genres, status, source, source_id, expires_at (+6 months)}`
- Use for Pompeo: semantic queries such as *"sci-fi movies that Martin likes"*
**Internal endpoints:**
- Radarr: `http://radarr.media.svc.cluster.local:7878/radarr/api/v3/movie?apikey=922d1405ab1147019d98a2997d941765` (23 movies)
- Sonarr: `http://sonarr.media.svc.cluster.local:8989/sonarr/api/v3/series?apikey=22140655993a4ff6bf12314813ec6982`
- Ollama: `http://ollama.ai.svc.cluster.local:11434/api/embeddings`, model `nomic-embed-text` (768-dim, multilingual) ✅ operational
- Qdrant: `http://qdrant.persistence.svc.cluster.local:6333` — api-key: sealed secret `qdrant-api-secret` (`__Montecarlo00!`)
> **Infrastructure note**: Radarr and Sonarr both run in the `mediastack` pod (namespace `media`) but expose separate `ClusterIP` services. From outside the cluster the NodePorts (30878, 30989) were unreachable; from inside they work correctly. Radarr responds on `/radarr/` as its base path (307 redirect without the base path).
---
### 🎞️ Blocco B2 — Jellyfin Watch History Sync [Schedule] (`K07e4PPANXDkmQsr`) ✅
Daily cron (04:00). Builds the **memory of Martin's watch history**.
**Flow:**
```
⏰ Cron ogni giorno 04:00
└→ 🎞️ HTTP Jellyfin (/Users/{id}/Items?Recursive=true&SortBy=DatePlayed&Limit=100)
└→ 🔀 Filtra Visti (PlayCount>0, last 90 days)
└→ ❓ Ha Visti? (IF node)
├→ [no] ⛔ Stop
└→ [sì] 🔑 Token Copilot → 🤖 GPT-4.1 → 🔍 Parse → 💾 PG Upsert
```
**Data extracted by GPT:** `recent_favorites`, `preferred_genres`, `watch_patterns`, `completion_rate`, `notes`
**Postgres:** `memory_facts` (source=`jellyfin`, source_ref=`watch_history_summary`, expires +30d)
**Jellyfin API token for Pompeo:**
- Created via `POST /Auth/Keys?app=Pompeo`, authenticating as `admin` (password `__Montecarlo00!`, local auth, separate from the Authentik SSO)
- Token: `d153606c1ca54574a20d2b40fcf1b02e`
- Martin UserId: `42369255a7c64917a28fc26d4c7f8265` (from the Jellyfin SQLite DB + confirmed by the webhook payloads)
> **Note**: Jellyfin uses Authentik SSO (OIDC) for browser login, but `admin` still has the local auth provider active. The API token is separate from SSO authentication and does not expire.
---
### Fixes to existing workflows
#### 📅 Calendar Agent (`4ZIEGck9n4l5qaDt`) — 2 bug fixes
The workflow failed every 30 minutes with `column "undefined" does not exist`.
**Bug 1 — `🗑️ Cleanup Cancellati`**: when HA has no events in the range (empty response), Parse GPT returns `[{json:{skip:true}}]`. The expression in Cleanup:
```js
.all().map(i => "'" + i.json.uid.replace(/'/g,"''") + "'").join(',')
```
called `.replace()` on `undefined` (uid does not exist on the skip item) → the whole `{{ }}` expression evaluated to JavaScript `undefined` → n8n inserted it **unquoted** into the SQL → PostgreSQL interpreted `undefined` as a column name.
Fix: added `.filter(i => i.json.uid)` before the `.map()` + a `String()` wrapper.
**Bug 2 — `💾 Postgres - Salva Evento`**: the ON CONFLICT UPDATE included `updated_at = NOW()`, but the `updated_at` column does not exist in `memory_facts`. Removed from the DO UPDATE clause.
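The Bug 1 fix can be sketched as a standalone function. This is a minimal illustration, not the workflow's actual expression: `buildUidList` is a hypothetical name, and `items` stands in for the array that n8n's `.all()` returns.

```javascript
// Hypothetical helper mirroring the fixed expression described above.
// `items` has the shape n8n uses: [{ json: {...} }, ...].
function buildUidList(items) {
  return items
    .filter((i) => i.json && i.json.uid) // drop skip items that carry no uid
    .map((i) => "'" + String(i.json.uid).replace(/'/g, "''") + "'") // quote and escape
    .join(",");
}

// A skip-only input now yields an empty string instead of `undefined`.
console.log(buildUidList([{ json: { skip: true } }]));
console.log(buildUidList([{ json: { uid: "a'b" } }, { json: { uid: "c" } }]));
```

Note that an empty list still needs a guard on the SQL side, since `IN ()` is not valid PostgreSQL syntax; how the real workflow handles that case is not shown here.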
---
## [2026-03-21] Paperless Upload — Postgres + Qdrant memory integration
### Changes to the workflow `📄 Paperless — Upload Documento [Multi]` (`GBPFFq8rmbdFrNn9`)
Added a parallel branch that saves to memory after `Paperless - Patch Metadati`:
```
Paperless - Patch Metadati ──┬──> Telegram - Conferma Upload (unchanged)
└──> 🧠 Salva in Memoria ──> 💾 Upsert Memoria
```
**`🧠 Salva in Memoria` (Code):**
- Generates an embedding of the text (`{title}\n\n{OCR excerpt}`) via Ollama `nomic-embed-text` (768 dim)
- Upserts into the Qdrant collection `knowledge` with payload: `user_id`, `source`, `doc_id`, `title`, `category`, `doc_type`, `correspondent`, `created_date`, `tags`
- Prepares the Postgres record with `source_ref=paperless-{doc_id}` and a TTL that varies by document type (90 days for receipts, 180 days for bills, 365 days by default, 730 days for payslips)
**`💾 Upsert Memoria` (Postgres → `mRqzxhSboGscolqI`):**
- `INSERT INTO memory_facts` with `source='paperless'`, dedup via `ON CONFLICT memory_facts_dedup_idx DO UPDATE`
- Also saves `qdrant_id` (UUID of the Qdrant point) for future cross-reference
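The per-document-type TTL rule could look roughly like the sketch below. Only the day counts and the `source_ref` format come from this changelog; the doc-type keys (`receipt`, `bill`, `payslip`) and the function name are assumptions made for illustration.

```javascript
// Day counts are from the changelog; the keys are hypothetical labels.
const TTL_DAYS = new Map([
  ["receipt", 90],
  ["bill", 180],
  ["payslip", 730],
]);
const DEFAULT_TTL_DAYS = 365;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Builds the fields of the Postgres record that depend on the document.
function buildMemoryRecord(doc, now = new Date()) {
  const ttlDays = TTL_DAYS.get(doc.doc_type) ?? DEFAULT_TTL_DAYS;
  return {
    source: "paperless",
    source_ref: `paperless-${doc.doc_id}`,
    expires_at: new Date(now.getTime() + ttlDays * MS_PER_DAY).toISOString(),
  };
}

console.log(buildMemoryRecord({ doc_id: 12, doc_type: "bill" }));
```

Passing `now` as a parameter keeps the expiry calculation deterministic when it is tested outside n8n.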
### Qdrant collections reconfigured
Recreated `knowledge` and `episodes` with `size=768` (nomic-embed-text); they were at 1536 (OpenAI legacy, 0 points).
---


@@ -384,6 +384,62 @@ Imports bank CSV statements (Banca Sella format) into Actual Budget via Telegram
**Common pattern across Paperless + Actual workflows**: GitHub Copilot token is obtained fresh at each run (`GET https://api.github.com/copilot_internal/v2/token`), then used for `POST https://api.githubcopilot.com/chat/completions` with model `gpt-4.1`.
### 🎬 Pompeo — Jellyfin Playback Agent [Webhook] (`AyrKWvboPldzZPsM`) ✅ Active
Webhook-based — triggered in real time by the Jellyfin Webhook plugin on every `PlaybackStart` / `PlaybackStop` event.
- Webhook path: `jellyfin-playback` (via `https://orchestrator.mt-home.uk/webhook/jellyfin-playback`)
- Normalizes the Jellyfin payload (body is a JSON-encoded string → `JSON.parse` required)
- On **PlaybackStart**: INSERT into `behavioral_context` (`event_type=watching_media`, `do_not_disturb=true`, metadata in `notes` JSONB: `item`, `device`, `item_type`)
- On **PlaybackStop**: UPDATE `behavioral_context` → set `end_at=now()`, `do_not_disturb=false`
- Both events write to `agent_messages` (subject: `▶️ {title} ({device})` or `⏹️ …`)
- **User filter**: only processes events for `UserId = 42369255a7c64917a28fc26d4c7f8265` (Martin)
- **Jellyfin stop behavior**: `PlaybackStop` fires only on player close, NOT on pause
Known quirks fixed:
- Jellyfin webhook plugin sends body as `$json.body` (JSON string) — must `JSON.parse()` before reading fields
- Real Jellyfin field names: `Name` (not `ItemName`), `DeviceName` / `Client`, `UserId` (no dashes)
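The normalization step described above might be sketched as follows. This is not the actual Code node: `normalizeJellyfinEvent` is a hypothetical name, and `NotificationType` and `ItemType` are assumed field names for the event kind and item type; `Name`, `DeviceName`, `Client`, and `UserId` are the fields this README names.

```javascript
// Martin's UserId as documented in this README.
const MARTIN_USER_ID = "42369255a7c64917a28fc26d4c7f8265";

// Returns a normalized event, or null when the event should be dropped.
function normalizeJellyfinEvent(item) {
  // The webhook plugin delivers the body as a JSON-encoded string.
  const raw = typeof item.body === "string" ? JSON.parse(item.body) : item.body;
  if (!raw || raw.UserId !== MARTIN_USER_ID) return null; // user filter

  // Assumed field: NotificationType carries the event name.
  const kind =
    raw.NotificationType === "PlaybackStart" ? "start" :
    raw.NotificationType === "PlaybackStop" ? "stop" : null;
  if (!kind) return null;

  return {
    kind,
    item: raw.Name,
    device: raw.DeviceName || raw.Client,
    item_type: raw.ItemType, // assumed field name
  };
}
```

Returning `null` for filtered events keeps the routing decision (start or stop) separate from the filtering, which matches the two-node flow described for this workflow.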
---
### 🎬 Pompeo — Media Library Sync [Schedule] (`o3uM1xDLTAKw4D6E`) ✅ Active
Weekly cron — every **Sunday at 03:00**.
- Fetches all movies from **Radarr** (`/radarr/api/v3/movie`) and all series from **Sonarr** (`/sonarr/api/v3/series`)
- Merges into unified list: `{type, title, year, genres, status: available|missing|monitored|unmonitored, source, source_id}`
- **GPT-4.1 analysis**: extracts `top_genres`, `preferred_types`, `library_stats`, `taste_summary`, `notable_patterns`
- **Postgres upsert**: `memory_facts` (source=`media_library`, source_ref=`media_preferences_summary`, expires +7d)
- **Per-item loop**: for each movie/series → Ollama `nomic-embed-text` (768-dim) → Qdrant upsert into `media_preferences` collection
- Qdrant payload: `{title, year, type, genres, status, source, source_id, expires_at (+6 months)}`
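As an illustration of the per-item step, the sketch below builds the text sent to the embedding model and the Qdrant payload for one merged entry. The changelog gives the embedding input as `"{title} {year} {genres} {type}"`; the genre separator, the function name, and the exact way "+6 months" is computed are assumptions.

```javascript
// Sketch only: `toQdrantPoint` and the ", " genre separator are assumptions.
function toQdrantPoint(entry, now = new Date()) {
  const genres = (entry.genres || []).join(", ");
  const text = `${entry.title} ${entry.year} ${genres} ${entry.type}`;

  // +6 calendar months in UTC; a day such as the 31st may roll into the next month.
  const expiresAt = new Date(now.getTime());
  expiresAt.setUTCMonth(expiresAt.getUTCMonth() + 6);

  return {
    text, // input for the embedding model
    payload: {
      title: entry.title,
      year: entry.year,
      type: entry.type,
      genres: entry.genres || [],
      status: entry.status,
      source: entry.source,
      source_id: entry.source_id,
      expires_at: expiresAt.toISOString(),
    },
  };
}
```

The embedding vector itself and the Qdrant upsert call are left out, since their request shapes are not documented here.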
Internal endpoints used:
- `http://radarr.media.svc.cluster.local:7878/radarr/api/v3/movie?apikey=922d1405ab1147019d98a2997d941765`
- `http://sonarr.media.svc.cluster.local:8989/sonarr/api/v3/series?apikey=22140655993a4ff6bf12314813ec6982`
- `http://ollama.ai.svc.cluster.local:11434/api/embeddings` (model: `nomic-embed-text`)
- `http://qdrant.persistence.svc.cluster.local:6333` (api-key: sealed secret `qdrant-api-secret`)
---
### 🎞️ Pompeo — Jellyfin Watch History Sync [Schedule] (`K07e4PPANXDkmQsr`) ✅ Active
Daily cron — every day at **04:00**.
- Fetches last 100 played/partially-played items for Martin from Jellyfin API
- Endpoint: `/Users/42369255a7c64917a28fc26d4c7f8265/Items?Recursive=true&IncludeItemTypes=Movie,Episode&SortBy=DatePlayed&SortOrder=Descending&Limit=100`
- Auth: `Authorization: MediaBrowser Token="d153606c1ca54574a20d2b40fcf1b02e"` (Pompeo API key)
- Filters items with `PlayCount > 0` and `LastPlayedDate` within 90 days
- **GPT-4.1 analysis**: extracts `recent_favorites`, `preferred_genres`, `watch_patterns`, `completion_rate`, `notes`
- **Postgres upsert**: `memory_facts` (source=`jellyfin`, source_ref=`watch_history_summary`, expires +30d)
- Skips silently if no played items found (IF node guards the LLM call)
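The filter on played items can be sketched as below. It assumes that each Jellyfin item exposes `PlayCount` and `LastPlayedDate` under a `UserData` object; the README does not show the response shape, so treat that nesting and the function name as assumptions.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Keeps items played at least once within the last `windowDays` days.
function filterWatched(items, now = new Date(), windowDays = 90) {
  const cutoff = now.getTime() - windowDays * MS_PER_DAY;
  return items.filter((it) => {
    const ud = it.UserData || {}; // assumed location of the play statistics
    if (!(ud.PlayCount > 0) || !ud.LastPlayedDate) return false;
    return new Date(ud.LastPlayedDate).getTime() >= cutoff;
  });
}
```

An empty result from this filter is the case the IF node guards against, so the LLM call is skipped when nothing was watched in the window.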
Jellyfin credentials:
- API Token (app: Pompeo): `d153606c1ca54574a20d2b40fcf1b02e` — created via `POST /Auth/Keys?app=Pompeo` with admin session
- Martin UserId: `42369255a7c64917a28fc26d4c7f8265` (from Jellyfin SQLite DB / webhook payload)
- Admin user `admin` with password `__Montecarlo00!` (local auth, SSO via Authentik is separate)
---
### 📅 Pompeo — Calendar Agent [Schedule] (`4ZIEGck9n4l5qaDt`) ✅ Active
Runs every morning at 06:30 (and on-demand via manual trigger).
@@ -407,6 +463,15 @@ Calendars: Lavoro, Famiglia, Spazzatura, Pulizie, Formula 1, WEC, Inter, Complea
| `ZIVFNgI3esCKuYXc` | Google Calendar account | Google Calendar OAuth2 (also used for Tasks API) |
| `u0JCseXGnDG5hS9F` | Home Assistant API | HTTP Header Auth (long-lived HA token) |
| `mRqzxhSboGscolqI` | Pompeo — PostgreSQL | Postgres (database: `pompeo`, user: `martin`) |
| `u0JCseXGnDG5hS9F` | Home Assistant API | HTTP Header Auth (long-lived HA token) |
### Qdrant Collections
| Collection | Dimensions | Distance | Model | Used by |
|---|---|---|---|---|
| `media_preferences` | 768 | Cosine | `nomic-embed-text` (Ollama) | Media Library Sync (B1) |
Qdrant API key: sealed secret `qdrant-api-secret` in namespace `persistence` (`__Montecarlo00!`)
---
@@ -437,8 +502,8 @@ Calendars: Lavoro, Famiglia, Spazzatura, Pulizie, Formula 1, WEC, Inter, Complea
- Tables: `user_profile`, `memory_facts` (+ `source_ref` + dedup index), `finance_documents`, `behavioral_context`, `agent_messages`
- Multi-tenancy: `user_id` field on all tables, seeds `martin` + `shared`
- DDL script: `alpha/db/postgres.sql`
- [ ] Verify embedding endpoint via Copilot (`text-embedding-3-small`) as bootstrap fallback
- [ ] Plan migration to local Ollama embedding model once LLM server is online
- [x] Verify embedding endpoint via Copilot (`text-embedding-3-small`) as bootstrap fallback
- [x] ~~Plan migration to local Ollama embedding model once LLM server is online~~ ✅ Active — `nomic-embed-text` via `http://ollama.ai.svc.cluster.local:11434` (768-dim, multilingual)
- [ ] Create `ha_sensor_config` table in Postgres and seed initial sensor patterns
---
@@ -465,6 +530,17 @@ Calendars: Lavoro, Famiglia, Spazzatura, Pulizie, Formula 1, WEC, Inter, Complea
- Telegram daily briefing at 06:30
- **Phase 2**: add weekly Qdrant embedding for semantic retrieval
- [x] ~~**Jellyfin Playback Agent**~~ ✅ 2026-03-21 — `AyrKWvboPldzZPsM`
- Webhook: PlaybackStart → `behavioral_context` INSERT (`do_not_disturb=true`), PlaybackStop → UPDATE (`end_at`, `do_not_disturb=false`)
- `agent_messages` populated with `▶️`/`⏹️` + title + device
- User filter: Martin only (UserId `42369255…`)
- [x] ~~**Media Library Sync (B1)**~~ ✅ 2026-03-21 — `o3uM1xDLTAKw4D6E`
- Weekly (Sunday 03:00): Radarr + Sonarr → GPT-4.1 taste analysis → `memory_facts` + Qdrant `media_preferences` (768-dim, nomic-embed-text)
- [x] ~~**Jellyfin Watch History Sync (B2)**~~ ✅ 2026-03-21 — `K07e4PPANXDkmQsr`
- Daily (04:00): Jellyfin play history (90d window) → GPT-4.1 pattern analysis → `memory_facts`
- [ ] **Finance Agent** (extend beyond Paperless)
- Read Actual Budget export or API
- Persist transactions, monthly summaries to `finance_documents`