What is ffmcp?
ffmcp (pronounced "F-F-M-C-P") is a command-line tool inspired by ffmpeg that provides a unified interface for accessing AI services. It allows you to interact with multiple AI providers including OpenAI (GPT-4, GPT-3.5, DALL-E, Whisper), Anthropic (Claude 3.5 Sonnet, Claude 3 Opus), Google Gemini (Gemini 2.0 Flash, Gemini 1.5 Pro), Groq, DeepSeek, Mistral AI, Together AI, Cohere, Perplexity, AI33, and AIMLAPI - all from a single, simple command-line interface.
Just like ffmpeg revolutionized multimedia processing by providing a unified command-line interface for video, audio, and image manipulation, ffmcp brings the same simplicity and power to AI interactions. Whether you need to generate text, analyze images, transcribe audio, create embeddings, build AI agents, or orchestrate multi-agent teams, ffmcp makes it easy to do all of this directly from your terminal.
Key Concept: ffmcp is to AI services what ffmpeg is to multimedia - a powerful, composable command-line tool that unifies multiple providers and capabilities into one simple interface. If you're familiar with ffmpeg, you'll feel right at home with ffmcp.
Features
- 🚀 Unified CLI - Single command-line interface for multiple AI providers
- 🔌 Modular - Easy to add new AI providers
- 📝 Simple - Works like ffmpeg: small, composable commands
- 🔧 Configurable - Manage API keys and settings easily
- 📊 Streaming - Real-time streaming support for responses
- 🎨 Full OpenAI Support - Vision, images, audio, embeddings, and assistants
- 🎙️ TTS/Voiceover - Text-to-speech with ElevenLabs and FishAudio, voice cloning, voice configurations
- 🧠 Brains (Zep/LEANN Memory) - Create brains with Zep (cloud/self-hosted) or LEANN (local, 97% storage savings); store and retrieve chat memory and collections, plus graph relationships with Zep
- 🤖 Agents - Named agents with model, instructions, brain, properties, and actions (web, images, vision, embeddings)
- 👥 Hierarchical Teams - Multi-agent collaboration with orchestrators, nested teams, and shared memory that flows up the hierarchy
Quick Examples
Text Generation
ffmcp generate "Write a haiku about coding"
Image Analysis
ffmcp openai vision "What's in this image?" photo.jpg
Audio Transcription
ffmcp openai transcribe audio.mp3
Brains & Memory (Zep or LEANN)
# Create a Zep brain (cloud/self-hosted)
ffmcp brain create my-zep-brain --backend zep
# Create a LEANN brain (local, no API key needed)
ffmcp brain create my-leann-brain --backend leann
# Use memory (works with both backends)
ffmcp brain memory add --role user --role-type user --content "Who was Octavia Butler?"
ffmcp brain memory get
Agents
ffmcp agent create myagent -p openai -m gpt-4o-mini -i "You are a helpful assistant" --brain my-leann-brain
ffmcp agent use myagent
ffmcp agent run "Plan a 3-day trip to Paris"
Hierarchical Teams
ffmcp agent create ceo -p openai -m gpt-4o-mini -i "You orchestrate teams"
ffmcp agent action enable ceo delegate_to_agent
ffmcp team create org-team -o ceo -m researcher -m writer -b team-brain
ffmcp team run "Create a comprehensive report"
Text-to-Speech
ffmcp voiceover create my-voice --provider elevenlabs --voice-id 21m00Tcm4TlvDq8ikWAM
ffmcp tts "Hello, world!" output.mp3 --voice my-voice
Frequently Asked Questions
What is ffmcp and how does it work?
ffmcp is a command-line tool that provides a unified interface for accessing multiple AI services. Instead of using different tools or APIs for each AI provider, you can use ffmcp to interact with OpenAI, Anthropic, Google Gemini, Groq, DeepSeek, Mistral, Together AI, Cohere, Perplexity, and more - all with the same simple commands.
How is ffmcp different from using AI provider APIs directly?
ffmcp provides a consistent command-line interface across all providers, similar to how ffmpeg provides a unified interface for multimedia processing. You don't need to learn different API formats or SDKs - just use ffmcp commands. It also includes advanced features like agents, teams, memory management, and TTS that work across providers.
Do I need API keys for all providers?
No, you only need API keys for the providers you want to use. You can start with just one provider (like OpenAI) and add more as needed. Each provider's API key is configured separately using the ffmcp config command.
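As a minimal sketch of per-provider configuration, the helper below registers a key only when one is set in the environment. It reuses the documented form `ffmcp config -p openai -k YOUR_KEY`. The function names, the environment variable names, and the provider identifier `anthropic` are assumptions made for illustration, not confirmed ffmcp names.

```shell
# Hypothetical helper: register one API key per provider, skipping unset keys.
# FFMCP_BIN is an illustrative override; set it to "echo" for a dry run.
configure_provider() {
  local provider="$1" key="$2"
  if [ -z "$key" ]; then
    echo "skip $provider (no key set)"
    return 0
  fi
  # Same shape as the documented command: ffmcp config -p openai -k YOUR_KEY
  "${FFMCP_BIN:-ffmcp}" config -p "$provider" -k "$key"
}

configure_all() {
  configure_provider openai "${OPENAI_API_KEY:-}"
  # "anthropic" as a provider identifier is an assumption.
  configure_provider anthropic "${ANTHROPIC_API_KEY:-}"
}
```

Running `FFMCP_BIN=echo configure_all` prints the commands that would be executed (or a skip message) without touching any stored configuration.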
Can I use ffmcp in scripts and automation?
Yes! ffmcp is designed for scripting and automation. You can pipe input/output, use it in bash scripts, integrate it into CI/CD pipelines, and even use it programmatically via the npm package in Node.js/JavaScript projects.
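A batch job might look like the following sketch. It relies only on the documented `ffmcp generate "<prompt>"` form and inlines each file's content into the prompt, so it makes no assumption about how ffmcp reads standard input. The function names, the `.summary.md` suffix, and the `FFMCP_BIN` override are hypothetical.

```shell
# Hypothetical batch helper built on the documented `ffmcp generate "<prompt>"` form.
# FFMCP_BIN is an illustrative override; set it to "echo" for a dry run.
summarize_file() {
  local file="$1"
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    return 1
  fi
  # Inline the file content via command substitution instead of piping it in.
  "${FFMCP_BIN:-ffmcp}" generate "Summarize the following text in one paragraph: $(cat "$file")"
}

summarize_dir() {
  local dir="$1" f
  for f in "$dir"/*.txt; do
    [ -e "$f" ] || continue   # the glob matched nothing
    # Write each summary next to its source file; stop on the first failure.
    summarize_file "$f" > "${f%.txt}.summary.md" || return 1
  done
}
```

Because the text travels as a command-line argument, very large files can exceed the operating system's argument-length limit; this approach suits short notes rather than whole books.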
What are agents and teams in ffmcp?
Agents are named AI assistants with specific models, instructions, and capabilities (like web search, image generation, etc.). Teams allow multiple agents to collaborate on complex tasks, with an orchestrator agent delegating work to team members. This enables sophisticated multi-agent workflows.
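The sketch below strings the agent and team commands from the Quick Examples into one setup function, and also creates the `researcher` and `writer` members and the `team-brain` brain that the team command refers to. Whether ffmcp requires these to exist before `team create` is an assumption, as is the reading of `-b` as the shared brain. The member instructions, the function names, and the `FFMCP_BIN` override are illustrative.

```shell
# FFMCP_BIN is an illustrative override; set it to "echo" to print commands only.
run_ffmcp() { "${FFMCP_BIN:-ffmcp}" "$@"; }

setup_report_team() {
  # Shared brain for the team (LEANN is local and needs no API key).
  run_ffmcp brain create team-brain --backend leann || return 1

  # Member agents; names match the team example, instructions are made up.
  run_ffmcp agent create researcher -p openai -m gpt-4o-mini -i "You research topics and list sources" || return 1
  run_ffmcp agent create writer -p openai -m gpt-4o-mini -i "You turn research notes into clear prose" || return 1

  # Orchestrator that is allowed to delegate work to other agents.
  run_ffmcp agent create ceo -p openai -m gpt-4o-mini -i "You orchestrate teams" || return 1
  run_ffmcp agent action enable ceo delegate_to_agent || return 1

  # -o orchestrator, -m member (repeatable), -b presumably the shared brain.
  run_ffmcp team create org-team -o ceo -m researcher -m writer -b team-brain
}
```

After `setup_report_team` succeeds, `ffmcp team run "Create a comprehensive report"` starts the workflow as in the Hierarchical Teams example.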
What's the difference between Zep and LEANN for memory?
Zep is a cloud/self-hosted memory platform with full features including graph relationships. LEANN is a local vector index that offers 97% storage savings and works without any API keys - perfect for privacy-focused applications. Both backends support memory, collections, and document storage. Choose Zep for cloud features and graphs, or LEANN for local, private operation with minimal storage.
Is ffmcp free to use?
ffmcp itself is free and open-source (MIT License). However, you need API keys from the AI providers you want to use, and those providers charge based on usage. Many providers offer free tiers or credits to get started. If you find ffmcp useful, please consider sponsoring the project to support continued development.
What operating systems does ffmcp support?
ffmcp works on Windows, macOS, and Linux. It requires Python 3.8+ and optionally Node.js 14+ if you want to use the npm package.
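To check the Python requirement before installing, a preflight function along these lines may help. It is a sketch, not part of ffmcp: the function names and the `PYTHON_BIN` override are hypothetical. Versions are compared numerically by component, since a plain string comparison would rank 3.12 below 3.8.

```shell
# Returns 0 when version "$1" is at least "$2"; both must look like major.minor.
version_at_least() {
  local have_major="${1%%.*}" have_minor="${1#*.}"
  local need_major="${2%%.*}" need_minor="${2#*.}"
  # Drop any patch component, e.g. "8.10" becomes "8".
  have_minor="${have_minor%%.*}"
  need_minor="${need_minor%%.*}"
  [ "$have_major" -gt "$need_major" ] ||
    { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

check_python() {
  local py="${PYTHON_BIN:-python3}" ver
  command -v "$py" >/dev/null 2>&1 || { echo "$py not found" >&2; return 1; }
  # Ask the interpreter itself for its major.minor version.
  ver=$("$py" -c 'import sys; print("%d.%d" % sys.version_info[:2])') || return 1
  if version_at_least "$ver" 3.8; then
    echo "Python $ver OK"
  else
    echo "Python $ver is older than 3.8" >&2
    return 1
  fi
}
```

A setup script could then run `check_python && pip install ffmcp`.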
How do I get started with ffmcp?
1. Install via pip (pip install ffmcp) or npm (npm install -g ffmcp).
2. Configure an API key: ffmcp config -p openai -k YOUR_KEY.
3. Start generating: ffmcp generate "Hello, world!".
See the Quick Examples section above for more commands.
Common Use Cases
- Text Generation: Generate content, code, summaries, translations, and more from the command line
- Image Analysis: Analyze images, extract text, understand visual content using vision models
- Audio Processing: Transcribe audio files, translate speech, convert text to speech
- AI Agents: Create specialized AI assistants with specific capabilities and instructions
- Multi-Agent Workflows: Build teams of agents that collaborate on complex tasks
- Memory Management: Store and retrieve conversation history, documents, and knowledge bases with Zep (cloud/self-hosted) or LEANN (local, 97% storage savings)
- Embeddings & Search: Create embeddings for semantic search and similarity matching
- Automation: Integrate AI capabilities into scripts, CI/CD pipelines, and automated workflows
- Development Tools: Use AI for code generation, documentation, testing, and debugging
- Content Creation: Generate images, text, audio, and multimedia content from the terminal