ProxyCall

A prototype local AI workflow that attends your online meetings and responds in your cloned voice. Everything runs on Apple Silicon with 24GB RAM.

GitHub

About the project

A local AI agent prototype that attends your online meetings and responds in your cloned voice. No cloud APIs — everything runs locally on a Macbook Pro (tested on 24GB RAM).

It works with any app that outputs audio (Google Meet, Zoom, Teams) by routing the audio through BlackHole. The pipeline captures call audio, transcribes speech in real-time using Voxtral 4B, classifies intent and generates responses using Llama 3.1 8B via Ollama, and finally speaks it back using VoiceBox.

Fitting ASR, LLM, and TTS all into 24GB of unified memory alongside other apps is challenging. To solve this, the orchestrator runs a state machine that juggles GPU memory between components (e.g., ASR pauses while TTS is speaking).

The agent is prepped with a Markdown file before each call that acts as a system prompt, defining your status, positions, and style. For anything not covered, it gracefully defaults to 'let me get back to you on that'. The LLM is also specially tuned to infer meaning from context to handle noisy or imperfect audio transcriptions.

Customize this for your use case? Please reach out to hello@appgambit.com

Key Features

✓
Universal App Support: Works with Google Meet, Zoom, Teams via BlackHole 16ch audio routing.
✓
Complete Local AI Pipeline: Real-time speech transcription with Voxtral 4B, intent classification via Llama 3.1 8B (Ollama), and voice cloning with VoiceBox TTS.
✓
Memory Orchestration: Built-in state machine juggles GPU memory between ASR and TTS components.
✓
Meeting Prep via Markdown: Pre-load the agent with your status updates, positions on topics, and communication style using a simple markdown file.
✓
Graceful Fallbacks: LLM is tuned to handle noisy ASR and defaults to "let me get back to you" for unknown topics rather than hallucinating.
✓
Apple Silicon Native: Inference running entirely on macOS without any cloud APIs.

Tech Stack

PythonOllamaLlama 3.1Voxtral ASRVoicebox TTSBlackHole

Keep Exploring

Want to keep exploring?

Here's another project you can jump into next.

Next project

RepoSense

AI Agent Powered Code Analysis Platform that turns legacy codebases into living documentation.

AIWebAgentic