Web application for monitoring and recording audio from two INMP441 (I2S) microphones on a Raspberry Pi Zero 2W.
Hardware: 2x INMP441 (I2S, 24-bit) → Google AIY Voice HAT → RPi Zero 2W (aarch64, 416MB RAM)
Backend: Python 3.13, Flask + Flask-SocketIO (eventlet), 4 worker processes
Frontend: Vanilla JS + Canvas (waveform), WebSocket (Socket.IO)
Port: 5000 (HTTP)
Service: systemd mic_system.service (user: pch, auto-restart)
Network: WiFi 10.0.100.24
| File | LOC | Role |
|---|---|---|
| app.py | ~115 | Flask server + WebSocket: REST API (status, recordings CRUD), Socket.IO (audio_data stream, commands) |
| audio_capture.py | ~700 | Core of the system. AudioEngine: I2S capture (sounddevice), resampling, DSP pipeline, routing to UI/recorder |
| agc.py | ~110 | Stateful AGC: envelope follower, noise gate, speech gating, limiter |
| beamforming.py | ~45 | Delay-and-sum beamforming: 2 microphones, fractional delay (linear interpolation) |
| recorder.py | ~160 | Threaded WAV writer: non-blocking queue, int16 conversion |
| static/app.js | ~530 | Frontend: waveforms (Canvas), VU meters, controls, live monitor (WebAudio), recordings list |
| templates/index.html | ~190 | UI: Polish labels, audio controls |
| static/style.css | ~315 | Dark theme, glow effects, responsive |
| scripts/setup_rpi.sh | | Setup of I2S overlay, ALSA, venv, systemd |
| scripts/deploy_from_windows.ps1 | | Deployment from Windows over SSH/SCP |
| scripts/diag_ws_record.py | | Diagnostic Socket.IO client: auto-record + JSON stats |
| deploy/mic_system.service | | Systemd unit file |
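
The delay-and-sum step described for beamforming.py can be sketched roughly as follows. This is a minimal pure-Python illustration, not the project's code: the function names `fractional_delay` and `delay_and_sum`, and the sign convention for `delay_samples`, are assumptions made for the example.

```python
import math


def fractional_delay(signal, delay):
    """Delay `signal` by `delay` samples (fractional, >= 0) via linear interpolation.

    Samples that would be read from outside the input are treated as 0.0.
    """
    out = []
    for n in range(len(signal)):
        pos = n - delay              # fractional read position in the input
        i = math.floor(pos)          # integer sample to the left of pos
        frac = pos - i               # distance from that sample, in [0, 1)
        a = signal[i] if 0 <= i < len(signal) else 0.0
        b = signal[i + 1] if 0 <= i + 1 < len(signal) else 0.0
        out.append((1.0 - frac) * a + frac * b)
    return out


def delay_and_sum(mic1, mic2, delay_samples):
    """Align the two channels and average them.

    Assumed convention: delay_samples > 0 means the sound reaches mic1 first,
    so mic1 is delayed; delay_samples < 0 delays mic2 instead.
    """
    if delay_samples >= 0:
        mic1 = fractional_delay(mic1, delay_samples)
    else:
        mic2 = fractional_delay(mic2, -delay_samples)
    return [(a + b) / 2.0 for a, b in zip(mic1, mic2)]
```

With an impulse that hits mic1 two samples before mic2, `delay_and_sum(mic1, mic2, 2)` lines the two impulses up so they add coherently, while sound from other directions stays misaligned and is attenuated by the averaging.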
I2S 48kHz stereo (int32)
→ convert to float32 (>>8 for 24-bit MSB in 32-bit frame)
→ [optional] HPF 75Hz + notch 50Hz (hum removal)
→ resample to target rate (16k/22k/24k/32k via scipy polyphase)
→ split: mic1, mic2
→ [optional] mono_mix = (mic1+mic2)/2
→ [optional] beamforming:
  - GCC-PHAT for angle estimation (speech band 300-3400Hz, ProcessPoolExecutor)
  - delay-and-sum with auto-tracking (smoothing 0.88/0.12)
  - beam clarity enhancement (high-freq blend 0.22)
  - presence boost (0.20)
→ [optional] noise suppression (spectral subtraction, alpha varies by gate state)
→ [optional] speech gate (VAD-based, hold 850ms, attack 12ms, release 360ms)
→ [optional] AGC (per-channel: mic1, mic2, beam — AgcProcessor instances)
→ [optional] limiter (peak clipping at 0.97)
→ downsample waveform for UI (every Nth sample)
→ encode PCM16 base64 for live monitor
→ emit via Socket.IO every ~80ms
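
A few of the simpler stages above (the int32 to float conversion, the channel split, the limiter, and the UI downsampling) can be illustrated with the sketch below. It is written in plain Python for readability; the real engine presumably works on NumPy arrays. All function names, the choice of the left channel as mic1, and the `step` value are assumptions for illustration only.

```python
FULL_SCALE_24 = float(1 << 23)  # magnitude of the most negative 24-bit sample


def int32_to_float(frames):
    """Convert signed 32-bit I2S words to floats in [-1.0, 1.0).

    The 24-bit sample sits in the upper bits of each 32-bit word, so an
    arithmetic shift right by 8 recovers the signed 24-bit value.
    """
    return [(s >> 8) / FULL_SCALE_24 for s in frames]


def deinterleave(samples):
    """Split interleaved stereo into (mic1, mic2); left = mic1 is an assumption."""
    return samples[0::2], samples[1::2]


def limit(samples, ceiling=0.97):
    """Hard limiter: clip peaks to +/- ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def decimate_for_ui(samples, step):
    """Keep every `step`-th sample for waveform drawing (no anti-alias filter)."""
    return samples[::step]
```

Plain every-Nth-sample decimation aliases, which is acceptable for drawing a waveform but is not a substitute for the polyphase resampler used on the audio path.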
audio_data (every ~80 ms):
{
"mic1": [float samples...],
"mic2": [float samples...],
"beam": [float samples...],
"mono_mix": [float samples...],
"rms_mic1": float,
"rms_mic2": float,
"rms_beam": float,
"rms_mono_mix": float,
"recording": bool,
"rec_duration": float,
"speech_detected": bool,
"speech_gate_open": bool,
"beam_angle_deg": float,
"hifi_mode": bool,
"monitor_on": bool,
"monitor_source": str,
"monitor_chunk_b64": str (PCM16 base64),
"monitor_sr": int
}
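
For a non-browser client, the `monitor_chunk_b64` field could be decoded along the lines below. This is a sketch under stated assumptions: the payload is taken to be mono, little-endian signed 16-bit PCM, and the helper names and the 32767/32768 scaling are choices made for the example, not confirmed details of the server.

```python
import base64
import struct


def encode_pcm16_b64(samples):
    """Encode floats in [-1.0, 1.0] as little-endian PCM16, then base64 (assumed layout)."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    raw = struct.pack(f"<{len(ints)}h", *ints)
    return base64.b64encode(raw).decode("ascii")


def decode_pcm16_b64(chunk_b64):
    """Decode a base64 PCM16 chunk back into floats in [-1.0, 1.0)."""
    raw = base64.b64decode(chunk_b64)
    count = len(raw) // 2                      # two bytes per sample
    ints = struct.unpack(f"<{count}h", raw[: count * 2])
    return [i / 32768.0 for i in ints]


def chunk_duration_sec(chunk_b64, monitor_sr):
    """Duration of one chunk, given the `monitor_sr` value from the same message."""
    return len(decode_pcm16_b64(chunk_b64)) / float(monitor_sr)
```

At a `monitor_sr` of 16000, an 80 ms chunk would carry 1280 samples, i.e. 2560 bytes before base64 encoding.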
client_message
- `{"type": "settings", ...}`: update all audio settings
- `{"type": "record_start", "source": str, "duration_sec": float}`: start recording
- `{"type": "record_stop"}`: stop recording

server_ack
- `{"type": "settings_applied", "settings": {...}}`
- `{"type": "record_started", "filenames": [...], ...}`
- `{"type": "record_stopped", "status": {...}}`

- `GET /`: main page (index.html)
- `GET /api/status`: audio engine status + settings
- `GET /api/recordings`: list recordings (JSON array)
- `GET /api/recordings/<filename>`: download WAV
- `DELETE /api/recordings/<filename>`: delete recording

- googlevoicehat-soundcard
- ~/.asoundrc → hw:0, S32_LE, 48kHz, 2ch

# On the RPi
cd /home/pch/mic_system
source .venv/bin/activate
python app.py
# From Windows
.\scripts\deploy_from_windows.ps1 -Host 10.0.100.24 -User pch
# Diagnostics
python scripts/diag_ws_record.py --url http://127.0.0.1:5000 --duration 10 --source compare_all
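
For scripted clients such as the diagnostic tool, the `client_message` payloads and `server_ack` replies listed earlier could be built and interpreted roughly as follows. The helper names are hypothetical, only fields that appear in this document are used, and the `settings` message is left out because its keys are not listed here.

```python
def record_start(source, duration_sec):
    """Build a record_start payload; the fields follow the message shape listed above."""
    if duration_sec <= 0:
        raise ValueError("duration_sec must be positive")
    return {
        "type": "record_start",
        "source": str(source),
        "duration_sec": float(duration_sec),
    }


def record_stop():
    """Build a record_stop payload."""
    return {"type": "record_stop"}


def describe_ack(ack):
    """Turn a server_ack dict into a short human-readable line."""
    kind = ack.get("type")
    if kind == "settings_applied":
        return "settings applied"
    if kind == "record_started":
        return "recording to " + ", ".join(ack.get("filenames", []))
    if kind == "record_stopped":
        return "recording stopped"
    return f"unknown ack type: {kind!r}"
```

With the python-socketio client, such a payload would be sent with something like `sio.emit("client_message", record_start("compare_all", 10))`, assuming `client_message` is the event name and that `compare_all` (the value passed to the diagnostic script) is also accepted as a message `source`.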
Repo: https://git.mm.mk/suby/mic-system.git
Branch: master