# mic-system

Web application for monitoring and recording audio from two INMP441 (I2S) microphones on an RPi Zero 2W.

## Architecture

```
Hardware:  2x INMP441 (I2S, 24-bit) → Google AIY Voice HAT → RPi Zero 2W (aarch64, 416MB RAM)
Backend:   Python 3.13, Flask + Flask-SocketIO (eventlet), 4 worker processes
Frontend:  Vanilla JS + Canvas (waveform), WebSocket (Socket.IO)
Port:      5000 (HTTP)
Service:   systemd mic_system.service (user: pch, auto-restart)
Network:   WiFi 10.0.100.24
```

## File structure

| File | LOC | Role |
|------|-----|------|
| `app.py` | ~115 | Flask server + WebSocket: REST API (status, recordings CRUD), Socket.IO (audio_data stream, commands) |
| `audio_capture.py` | ~700 | **Heart of the system.** AudioEngine: I2S capture (sounddevice), resampling, DSP pipeline, routing to UI/recorder |
| `agc.py` | ~110 | Stateful AGC: envelope follower, noise gate, speech gating, limiter |
| `beamforming.py` | ~45 | Delay-and-sum beamforming: 2 microphones, fractional delay (linear interpolation) |
| `recorder.py` | ~160 | Threaded WAV writer: non-blocking queue, int16 conversion |
| `static/app.js` | ~530 | Frontend: waveforms (Canvas), VU meters, controls, live monitor (WebAudio), recordings list |
| `templates/index.html` | ~190 | UI: Polish labels, audio controls |
| `static/style.css` | ~315 | Dark theme, glow effects, responsive |
| `scripts/setup_rpi.sh` | - | Sets up the I2S overlay, ALSA, venv, systemd |
| `scripts/deploy_from_windows.ps1` | - | Deployment from Windows over SSH/SCP |
| `scripts/diag_ws_record.py` | - | Diagnostic Socket.IO client: auto-record + JSON stats |
| `deploy/mic_system.service` | - | Systemd unit file |

## DSP Pipeline (audio_capture.py)

```
I2S 48kHz stereo (int32)
  → convert to float32 (>>8 for 24-bit MSB in 32-bit frame)
  → [optional] HPF 75Hz + notch 50Hz (hum removal)
  → resample to target rate (16k/22k/24k/32k via scipy polyphase)
  → split: mic1, mic2
  → [optional] mono_mix = (mic1+mic2)/2
  → [optional] beamforming:
      - GCC-PHAT for angle estimation (speech band 300-3400Hz, ProcessPoolExecutor)
      - delay-and-sum with auto-tracking (smoothing 0.88/0.12)
      - beam clarity enhancement (high-freq blend 0.22)
      - presence boost (0.20)
  → [optional] noise suppression (spectral subtraction, alpha varies by gate state)
  → [optional] speech gate (VAD-based, hold 850ms, attack 12ms, release 360ms)
  → [optional] AGC (per-channel: mic1, mic2, beam; AgcProcessor instances)
  → [optional] limiter (peak clipping at 0.97)
  → downsample waveform for UI (every Nth sample)
  → encode PCM16 base64 for live monitor
  → emit via Socket.IO every ~80ms
```

## Operating modes

1. **Mic1 / Mic2**: single mic, mono
2. **Mono mix**: (L+R)/2
3. **Beamforming**: delay-and-sum with auto-steering (GCC-PHAT in the speech band)
4. **HiFi test**: raw 48kHz, no DSP, single mic

## Recording

- Sources: mic1, mic2, mono_mix, beam, compare_all (3 files at once), hifi_raw
- Auto-stop after a set duration (seconds)
- Format: WAV 16-bit, mono
- Threaded writer (WavRecorder): does not block the audio callback

## WebSocket Protocol

### Server → Client: `audio_data` (every ~80ms)

```json
{
  "mic1": [float samples...],
  "mic2": [float samples...],
  "beam": [float samples...],
  "mono_mix": [float samples...],
  "rms_mic1": float,
  "rms_mic2": float,
  "rms_beam": float,
  "rms_mono_mix": float,
  "recording": bool,
  "rec_duration": float,
  "speech_detected": bool,
  "speech_gate_open": bool,
  "beam_angle_deg": float,
  "hifi_mode": bool,
  "monitor_on": bool,
  "monitor_source": str,
  "monitor_chunk_b64": str (PCM16 base64),
  "monitor_sr": int
}
```

### Client → Server: `client_message`

- `{"type": "settings", ...}`: update all audio settings
- `{"type": "record_start", "source": str, "duration_sec": float}`: start recording
- `{"type": "record_stop"}`: stop recording

### Server → Client: `server_ack`

- `{"type": "settings_applied", "settings": {...}}`
- `{"type": "record_started", "filenames": [...], ...}`
- `{"type": "record_stopped", "status": {...}}`

## REST API

- `GET /`: main page (index.html)
- `GET /api/status`: audio engine status + settings
- `GET /api/recordings`: list recordings (JSON array)
- `GET /api/recordings/`: download WAV
- `DELETE /api/recordings/`: delete recording

## Hardware parameters

- Microphones: INMP441, 24-bit, I2S, placed on a ⌀6cm circle (slots every 90°)
- Mic spacing: ~0.0424m (adjacent slots, sin(π/4) × 0.06)
- Hardware sample rate: 48000 Hz (fixed, Voice HAT)
- Voice HAT overlay: `googlevoicehat-soundcard`
- ALSA config: `~/.asoundrc` → hw:0, S32_LE, 48kHz, 2ch
- RPi Zero 2W: 4 cores @ 1GHz, ~51% CPU for the main process

## Development

```bash
# On the RPi
cd /home/pch/mic_system
source .venv/bin/activate
python app.py

# From Windows
.\scripts\deploy_from_windows.ps1 -Host 10.0.100.24 -User pch

# Diagnostics
python scripts/diag_ws_record.py --url http://127.0.0.1:5000 --duration 10 --source compare_all
```

## Git

- Repo: https://git.mm.mk/suby/mic-system.git
- Branch: master
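
## Example: mic spacing and maximum inter-mic delay

The mic spacing quoted in the hardware parameters (~0.0424m for adjacent slots on a ⌀6cm circle) follows from the chord length formula. The sketch below reproduces that number and derives the largest inter-mic delay it implies at two sample rates. The speed of sound (343 m/s) and all names here are assumptions for illustration and are not taken from the project code.

```python
import math

SPEED_OF_SOUND = 343.0     # m/s, assumed value (air at roughly 20 °C)
CIRCLE_DIAMETER = 0.06     # m, from the hardware parameters
SLOT_ANGLE = math.pi / 2   # adjacent slots are 90° apart

# Chord between two points on a circle: 2 * r * sin(angle / 2) = d * sin(angle / 2)
spacing = CIRCLE_DIAMETER * math.sin(SLOT_ANGLE / 2)

# Largest possible inter-mic delay: sound travelling along the axis joining the mics
max_delay_s = spacing / SPEED_OF_SOUND


def max_delay_samples(sample_rate: int) -> float:
    """Maximum inter-mic delay expressed in samples at the given rate."""
    return max_delay_s * sample_rate


if __name__ == "__main__":
    print(f"spacing: {spacing:.4f} m, max delay: {max_delay_s * 1e6:.1f} us")
    for sr in (16000, 48000):
        print(f"{sr} Hz: {max_delay_samples(sr):.2f} samples")
```

At 16 kHz the full steering range spans fewer than two samples in each direction, so whole-sample delays would be too coarse. That is consistent with the fractional delay listed for `beamforming.py`.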
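
## Example: delay-and-sum with a fractional delay

The file table describes `beamforming.py` as delay-and-sum with a fractional delay via linear interpolation. The following is a minimal pure-Python sketch of that technique, not the project's implementation: the function names, the sign convention for `delay`, and the zero padding at the start of the block are all assumptions.

```python
def fractional_delay(signal: list[float], delay: float) -> list[float]:
    """Delay `signal` by a non-negative, possibly fractional, number of samples.

    Uses linear interpolation between the two neighbouring input samples.
    Samples before the start of the block are treated as 0.0.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    whole = int(delay)      # integer part of the delay
    frac = delay - whole    # fractional part in [0, 1)
    out = []
    for n in range(len(signal)):
        i = n - whole       # later of the two input samples being blended
        a = signal[i] if 0 <= i < len(signal) else 0.0
        b = signal[i - 1] if 0 <= i - 1 < len(signal) else 0.0
        out.append((1.0 - frac) * a + frac * b)
    return out


def delay_and_sum(mic1: list[float], mic2: list[float], delay: float) -> list[float]:
    """Align two channels and average them.

    Assumed convention: a positive `delay` (in samples) means the sound reached
    mic2 first, so mic2 is delayed; a negative value delays mic1 instead.
    """
    if delay >= 0:
        mic2 = fractional_delay(mic2, delay)
    else:
        mic1 = fractional_delay(mic1, -delay)
    return [(a + b) / 2 for a, b in zip(mic1, mic2)]
```

For example, `fractional_delay([0, 1, 2, 3, 4], 0.5)` yields a ramp shifted by half a sample. A real-time version would also carry the last samples of the previous block forward instead of padding with zeros.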
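
## Example: decoding `monitor_chunk_b64`

The `audio_data` message carries the live monitor audio as base64-encoded PCM16 in `monitor_chunk_b64`. The sketch below shows how a Python client might decode such a chunk into floats. It assumes little-endian signed 16-bit samples, which the protocol description does not state, and both helper names are hypothetical.

```python
import base64
import struct


def decode_monitor_chunk(chunk_b64: str) -> list[float]:
    """Decode base64 PCM16 into floats in [-1.0, 1.0).

    Assumption: samples are little-endian signed 16-bit integers.
    """
    raw = base64.b64decode(chunk_b64)
    count = len(raw) // 2                      # ignore a trailing odd byte
    ints = struct.unpack(f"<{count}h", raw[: count * 2])
    return [v / 32768.0 for v in ints]


def encode_monitor_chunk(samples: list[float]) -> str:
    """Inverse helper for testing: floats to base64 PCM16, with clipping."""
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    packed = struct.pack(f"<{len(ints)}h", *ints)
    return base64.b64encode(packed).decode("ascii")
```

A round trip through `encode_monitor_chunk` and `decode_monitor_chunk` returns the input to within 16-bit quantization error, which makes it easy to test a client without a running server.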