Add CLAUDE.md with project documentation

Paweł Chodaczek, 1 month ago
parent
commit
59b2c46bb4
1 changed file with 141 additions and 0 deletions

CLAUDE.md

@@ -0,0 +1,141 @@
+# mic-system
+
+Web application for monitoring and recording audio from 2 INMP441 (I2S) microphones on an RPi Zero 2W.
+
+## Architecture
+
+```
+Hardware:  2x INMP441 (I2S, 24-bit) → Google AIY Voice HAT → RPi Zero 2W (aarch64, 416MB RAM)
+Backend:   Python 3.13, Flask + Flask-SocketIO (eventlet), 4 worker processes
+Frontend:  Vanilla JS + Canvas (waveform), WebSocket (Socket.IO)
+Port:      5000 (HTTP)
+Service:   systemd mic_system.service (user: pch, auto-restart)
+Network:   WiFi 10.0.100.24
+```
+
+## File structure
+
+| File | LOC | Role |
+|------|-----|------|
+| `app.py` | ~115 | Flask server + WebSocket: REST API (status, recordings CRUD), Socket.IO (audio_data stream, commands) |
+| `audio_capture.py` | ~700 | **Heart of the system.** AudioEngine: I2S capture (sounddevice), resampling, DSP pipeline, routing to UI/recorder |
+| `agc.py` | ~110 | Stateful AGC: envelope follower, noise gate, speech gating, limiter |
+| `beamforming.py` | ~45 | Delay-and-sum beamforming: 2 microphones, fractional delay (linear interp) |
+| `recorder.py` | ~160 | Threaded WAV writer: non-blocking queue, int16 conversion |
+| `static/app.js` | ~530 | Frontend: waveforms (Canvas), VU meters, controls, live monitor (WebAudio), recordings list |
+| `templates/index.html` | ~190 | UI: Polish labels, audio controls |
+| `static/style.css` | ~315 | Dark theme, glow effects, responsive |
+| `scripts/setup_rpi.sh` | n/a | Setup of I2S overlay, ALSA, venv, systemd |
+| `scripts/deploy_from_windows.ps1` | n/a | Deployment from Windows over SSH/SCP |
+| `scripts/diag_ws_record.py` | n/a | Diagnostic Socket.IO client: auto-record + JSON stats |
+| `deploy/mic_system.service` | n/a | Systemd unit file |
+
+## DSP Pipeline (audio_capture.py)
+
+```
+I2S 48kHz stereo (int32)
+  → convert to float32 (>>8 for 24-bit MSB in 32-bit frame)
+  → [optional] HPF 75Hz + notch 50Hz (hum removal)
+  → resample to target rate (16k/22k/24k/32k via scipy polyphase)
+  → split: mic1, mic2
+  → [optional] mono_mix = (mic1+mic2)/2
+  → [optional] beamforming:
+      - GCC-PHAT for angle estimation (speech band 300-3400Hz, ProcessPoolExecutor)
+      - delay-and-sum with auto-tracking (smoothing 0.88/0.12)
+      - beam clarity enhancement (high-freq blend 0.22)
+      - presence boost (0.20)
+  → [optional] noise suppression (spectral subtraction, alpha varies by gate state)
+  → [optional] speech gate (VAD-based, hold 850ms, attack 12ms, release 360ms)
+  → [optional] AGC (per-channel: mic1, mic2, beam — AgcProcessor instances)
+  → [optional] limiter (peak clipping at 0.97)
+  → downsample waveform for UI (every Nth sample)
+  → encode PCM16 base64 for live monitor
+  → emit via Socket.IO every ~80ms
+```
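The first and last stages of the pipeline above can be sketched in plain Python. This is a minimal illustration, not the code in `audio_capture.py`: the function names are hypothetical, the input is assumed to be little-endian int32 words with channels interleaved as mic1, mic2, and the real engine most likely works on NumPy arrays rather than lists.

```python
import struct

FULL_SCALE_24 = float(1 << 23)  # magnitude of the most negative 24-bit sample


def i2s_frames_to_float(raw: bytes) -> list:
    """Convert int32 words carrying a 24-bit sample in their top bits to floats in [-1, 1)."""
    count = len(raw) // 4
    words = struct.unpack(f"<{count}i", raw[: count * 4])
    # >> 8 drops the 8 padding bits; Python's shift keeps the sign.
    return [(w >> 8) / FULL_SCALE_24 for w in words]


def split_stereo(samples):
    """De-interleave into (mic1, mic2); the channel order is an assumption."""
    return samples[0::2], samples[1::2]


def limit(samples, ceiling=0.97):
    """Hard-clip peaks at +/- ceiling, as in the limiter stage."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def decimate_for_ui(samples, step):
    """Keep every step-th sample for waveform drawing (no anti-alias filter)."""
    return samples[::step]
```

Taking every Nth sample without filtering is acceptable for a waveform display, but it would alias if the result were played back.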
+
+## Operating modes
+
+1. **Mic1 / Mic2**: single mic mono
+2. **Mono mix**: (L+R)/2
+3. **Beamforming**: delay-and-sum with auto-steering (GCC-PHAT in the speech band)
+4. **HiFi test**: raw 48kHz, no DSP, single mic
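For the beamforming mode, a two-microphone delay-and-sum with a linearly interpolated fractional delay might look like the sketch below. It is not the contents of `beamforming.py`: the function names and the sign convention (a positive delay means mic2 lags mic1) are assumptions, and the angle estimation (GCC-PHAT) and smoothing steps are left out.

```python
import math


def fractional_delay(signal, delay):
    """Delay a signal by a non-negative, possibly fractional, number of samples.

    Linear interpolation between the two neighbouring samples; samples
    before the start of the block are treated as zero.
    """
    out = []
    for n in range(len(signal)):
        pos = n - delay
        i = math.floor(pos)
        frac = pos - i
        a = signal[i] if 0 <= i < len(signal) else 0.0
        b = signal[i + 1] if 0 <= i + 1 < len(signal) else 0.0
        out.append((1.0 - frac) * a + frac * b)
    return out


def delay_and_sum(mic1, mic2, delay):
    """Time-align the two channels, then average them.

    delay > 0: mic2 lags mic1, so mic1 is delayed to match (assumed convention).
    delay < 0: mic1 lags mic2, so mic2 is delayed instead.
    """
    if delay >= 0:
        mic1 = fractional_delay(mic1, delay)
    else:
        mic2 = fractional_delay(mic2, -delay)
    return [(a + b) / 2.0 for a, b in zip(mic1, mic2)]
```

Linear interpolation is cheap, which suits the Zero 2W, but it attenuates high frequencies slightly for delays near half a sample.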
+
+## Recording
+
+- Sources: mic1, mic2, mono_mix, beam, compare_all (3 files simultaneously), hifi_raw
+- Auto-stop after a set duration (seconds)
+- Format: WAV 16-bit, mono
+- Threaded writer (WavRecorder): does not block the audio callback
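The non-blocking writer pattern described above can be sketched with the standard library alone. `QueueWavWriter` is a hypothetical stand-in, not the project's `WavRecorder`; the `max_seconds` cut-off and the 32767 scaling are assumptions about how auto-stop and the int16 conversion could work.

```python
import queue
import struct
import threading
import wave


class QueueWavWriter:
    """Illustrative queue-backed WAV writer (16-bit, mono)."""

    def __init__(self, path, sample_rate, max_seconds=None):
        self._queue = queue.Queue()
        self._max_frames = None if max_seconds is None else int(max_seconds * sample_rate)
        self._frames_written = 0
        self._wav = wave.open(path, "wb")
        self._wav.setnchannels(1)   # mono
        self._wav.setsampwidth(2)   # 16-bit samples
        self._wav.setframerate(sample_rate)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def push(self, samples):
        """Called from the audio callback: copies and enqueues, never touches the disk."""
        self._queue.put_nowait(list(samples))

    def close(self):
        """Flush queued blocks, stop the writer thread, and finalize the WAV header."""
        self._queue.put(None)
        self._thread.join()
        self._wav.close()

    def _drain(self):
        while True:
            block = self._queue.get()
            if block is None:
                return
            if self._max_frames is not None:
                block = block[: self._max_frames - self._frames_written]
            if not block:
                continue  # duration limit reached: drop the remainder
            ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in block]
            self._wav.writeframes(struct.pack(f"<{len(ints)}h", *ints))
            self._frames_written += len(ints)
```

The audio callback only pays for a list copy and a queue insert; disk latency stays on the writer thread.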
+
+## WebSocket Protocol
+
+### Server → Client: `audio_data` (every ~80ms)
+```json
+{
+  "mic1": [float samples...],
+  "mic2": [float samples...],
+  "beam": [float samples...],
+  "mono_mix": [float samples...],
+  "rms_mic1": float,
+  "rms_mic2": float,
+  "rms_beam": float,
+  "rms_mono_mix": float,
+  "recording": bool,
+  "rec_duration": float,
+  "speech_detected": bool,
+  "speech_gate_open": bool,
+  "beam_angle_deg": float,
+  "hifi_mode": bool,
+  "monitor_on": bool,
+  "monitor_source": str,
+  "monitor_chunk_b64": str (PCM16 base64),
+  "monitor_sr": int
+}
+```
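The `monitor_chunk_b64` field carries PCM16 audio as base64 text. A round trip could look like the sketch below; the little-endian byte order, the 32767 scaling, and both function names are assumptions, since the message schema above only states "PCM16 base64".

```python
import base64
import struct


def encode_pcm16_b64(samples):
    """Clip floats to [-1, 1], scale to int16, and base64-encode the bytes."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    raw = struct.pack(f"<{len(ints)}h", *ints)
    return base64.b64encode(raw).decode("ascii")


def decode_pcm16_b64(chunk_b64):
    """Inverse of encode_pcm16_b64: base64 text back to floats in [-1, 1]."""
    raw = base64.b64decode(chunk_b64)
    count = len(raw) // 2
    ints = struct.unpack(f"<{count}h", raw[: count * 2])
    return [v / 32767.0 for v in ints]
```

A browser client would do the equivalent decode before handing the samples to WebAudio at the rate given in `monitor_sr`.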
+
+### Client → Server: `client_message`
+- `{"type": "settings", ...}` — update all audio settings
+- `{"type": "record_start", "source": str, "duration_sec": float}` — start recording
+- `{"type": "record_stop"}` — stop recording
+
+### Server → Client: `server_ack`
+- `{"type": "settings_applied", "settings": {...}}`
+- `{"type": "record_started", "filenames": [...], ...}`
+- `{"type": "record_stopped", "status": {...}}`
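The recording commands above are plain dictionaries sent as `client_message` events. The helpers below only build and check those payloads; they are hypothetical, the set of sources is taken from the recording section of this file, and rejecting non-positive durations is an assumption about what the server expects.

```python
RECORD_SOURCES = {"mic1", "mic2", "mono_mix", "beam", "compare_all", "hifi_raw"}


def record_start(source, duration_sec):
    """Build a record_start payload, rejecting unknown sources early."""
    if source not in RECORD_SOURCES:
        raise ValueError(f"unknown recording source: {source!r}")
    if duration_sec <= 0:
        raise ValueError("duration_sec must be positive")  # assumed constraint
    return {"type": "record_start", "source": source, "duration_sec": float(duration_sec)}


def record_stop():
    """Build a record_stop payload."""
    return {"type": "record_stop"}
```

A Socket.IO client would pass the returned dictionary as the data of a `client_message` event and wait for the matching `server_ack`.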
+
+## REST API
+
+- `GET /` — main page (index.html)
+- `GET /api/status` — audio engine status + settings
+- `GET /api/recordings` — list recordings (JSON array)
+- `GET /api/recordings/<filename>` — download WAV
+- `DELETE /api/recordings/<filename>` — delete recording
+
+## Hardware parameters
+
+- Microphones: INMP441, 24-bit, I2S, arranged on a ⌀6cm circle (slots every 90°)
+- Mic spacing: ~0.0424m (adjacent slots, sin(π/4) × 0.06)
+- Hardware sample rate: 48000 Hz (fixed, Voice HAT)
+- Voice HAT overlay: `googlevoicehat-soundcard`
+- ALSA config: `~/.asoundrc` → hw:0, S32_LE, 48kHz, 2ch
+- RPi Zero 2W: 4 cores @ 1GHz, ~51% CPU for the main process
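The spacing figure follows from the chord length between two slots on the circle, and it bounds the inter-microphone delay the beamformer has to resolve. The short calculation below reproduces it; the speed of sound (343 m/s) and the helper names are assumptions, not values taken from the project.

```python
import math

DIAMETER_M = 0.06            # circle diameter from the list above
SLOT_ANGLE_RAD = math.pi / 2 # adjacent slots are 90 degrees apart
SPEED_OF_SOUND = 343.0       # m/s at roughly 20 °C (assumed)


def chord_length(diameter, angle_rad):
    """Straight-line distance between two points on a circle: d * sin(angle / 2)."""
    return diameter * math.sin(angle_rad / 2.0)


def max_lag_samples(spacing_m, sample_rate):
    """Largest possible inter-mic delay (sound arriving along the mic axis), in samples."""
    return spacing_m / SPEED_OF_SOUND * sample_rate


spacing = chord_length(DIAMETER_M, SLOT_ANGLE_RAD)
print(f"spacing: {spacing:.4f} m")
for rate in (48000, 16000):
    print(f"{rate} Hz: max lag {max_lag_samples(spacing, rate):.2f} samples")
```

Under these assumptions the maximum lag is about 6 samples at 48 kHz and about 2 samples at 16 kHz, which is why sub-sample (fractional) delays matter for steering at the lower target rates.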
+
+## Development
+
+```bash
+# On the RPi
+cd /home/pch/mic_system
+source .venv/bin/activate
+python app.py
+
+# From Windows
+.\scripts\deploy_from_windows.ps1 -Host 10.0.100.24 -User pch
+
+# Diagnostics
+python scripts/diag_ws_record.py --url http://127.0.0.1:5000 --duration 10 --source compare_all
+```
+
+## Git
+
+Repo: https://git.mm.mk/suby/mic-system.git
+Branch: master