My Favorite AI Stack: Overview

I've been refining a compact, practical AI stack that covers chat, images, video, coding, avatars, writing, analysis, agents and search. My go-to 'overall' assistant is ChatGPT o3 for conversational orchestration and quick queries. For images I lean on GPT-4o for multimodal prompts, Flux for iterative design and Mystic for stylized renders. Video needs are served by Veo 3 and Seedance, with Gemini handling deep video analysis. Coding workflows use Claude Code, Cursor and Windsurf. Avatars and speech come from HeyGen and 11Labs, while Claude Sonnet handles polished writing. Agents like Comet, Genspark and n8n glue things together. Grok 4 and Perplexity round out search.
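If you want to hand this overview to an agent or automation, one option is to write it down as a small lookup table. The sketch below is only an illustration: the category keys and tool names come from the paragraph above, while `STACK`, `tools_for` and `primary_tool` are hypothetical names I made up for the example, and "first listed means primary" is an assumption rather than a rule.

```python
from typing import Optional

# Hypothetical lookup table. Tool names are copied from the overview;
# the dict layout and the helper functions are illustrative assumptions.
STACK = {
    "chat": ["ChatGPT o3"],
    "images": ["GPT-4o", "Flux", "Mystic"],
    "video": ["Veo 3", "Seedance"],
    "analysis": ["Gemini"],
    "coding": ["Claude Code", "Cursor", "Windsurf"],
    "avatars": ["HeyGen", "11Labs"],
    "writing": ["Claude Sonnet"],
    "agents": ["Comet", "Genspark", "n8n"],
    "search": ["Grok 4", "Perplexity"],
}


def tools_for(category: str) -> list[str]:
    """Return the tools listed for a category, or an empty list if unknown."""
    return STACK.get(category.strip().lower(), [])


def primary_tool(category: str) -> Optional[str]:
    """Return the first tool listed for a category (assumed to be the default)."""
    tools = tools_for(category)
    return tools[0] if tools else None


if __name__ == "__main__":
    for name in STACK:
        print(f"{name}: {', '.join(tools_for(name))}")
```

Keeping the mapping in one place makes it easy to swap a tool at the edges without touching whatever automation reads it.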
Image & Creative Tools: GPT-4o, Flux, Mystic

'Editing images via natural language is amazing' is a short line from a reply I've seen, and it reflects my experience. GPT-4o is my first stop for multimodal prompts and quick in-image edits; it interprets context and reference images intelligently. Flux is for iterative compositing and design work, letting teams refine layouts rapidly. Mystic excels at stylization and turning concepts into consistent brand art. Together they compress hours of Photoshop work into minutes while keeping a human in the loop for quality control. Use cases: rapid product mockups, A/B creative variations, marketing assets, and exploratory concepting before handing off to designers.
Video Production & Analysis: Veo 3, Seedance, Gemini

Video production and analysis are split between creative generation and deep inspection. I use Veo 3 and Seedance for content creation: quick scene builds, scripted motion, and automated editing that accelerate short-form marketing and product demos. For video analysis, Gemini shines: it transcribes, summarizes, detects scenes, extracts highlights and provides semantic tags that make large footage searchable. Pairing generative video with analysis means you can iterate on rough cuts and quickly surface the best moments for social or training material. This combo also plays well with agents for automated highlight reels, content moderation, and indexing archives for knowledge management.
Coding & Agents: Claude Code, Cursor, Windsurf, Comet

For coding, my stack blends specialized code LLMs and developer tools. Claude Code handles multi-file reasoning and project-level refactors, while Cursor integrates LLM assistance directly into the editor for rapid iteration. Windsurf is great for scaffolding and API wiring. I also experiment with Gemini CLI for quick, terminal-based code assistance. Agents like Comet, ChatGPT agents, Genspark and n8n glue these capabilities into workflows, orchestrating tests, deployments, and integrations across services. Using agents reduces context switching: the agent runs tests, files issues, or triggers automations so developers focus on design and complex problem solving rather than mundane plumbing.
Avatars, Voice & Writing: HeyGen, 11Labs, Claude Sonnet
Avatars and voice are where generative AI feels closest to human production. I use HeyGen for photoreal or stylized avatars and 11Labs for expressive, natural-sounding TTS. For voice cloning and emotional nuance I test Hume in niche projects. On the writing side, Claude Sonnet has become my favorite for long-form, polished output: I like its tone and structure for documentation and marketing copy. Sonnet's style fits collaborative projects where a consistent voice is important. Combined, these tools produce on-brand tutorials, narrated demos, and realistic spokesperson videos without the full studio overhead. They also reduce scheduling friction and make multilingual deliveries faster and cheaper.
Search, Research & Automation: Grok 4, Perplexity, Zapier

For research, search and automation I keep a lightweight toolkit: ChatGPT Deep Research for longer synthesis tasks, Grok 4 for deep reasoning and occasional voice mode, and Perplexity for quick web-sourced answers. Gemini CLI also appears in my research/code loop for terminal workflows. Automation via Zapier or n8n connects outputs across tools, saving repetitive handoffs between creators, storage, and deployment. Hume shows promise for voice cloning experiments when I need highly specific inflections. The aim is not to use every shiny tool, but to combine a few that interoperate well and let agents or automations handle routine stitching.
Final Thoughts: Workflow, Costs & Experimentation

Practically speaking, the best approach is to standardize a core stack and experiment around its edges. I frequently try new tools (as I told Nathan: 'Will experiment more!') but keep core services stable to avoid version sprawl and integration chaos. The growing list of paid and free tools makes cost math scary (sometimes I'd 'rather not calculate it so I don't panic'), so set budgets and measure impact. Thanks to folks who chimed in with tips; community feedback shortens the learning curve. Keep things modular, automate handoffs with agents, and review quality periodically to avoid tech debt.
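For the cost math, a tiny script can do the adding up so you don't have to look too closely. This is a minimal sketch under loud assumptions: `Subscription`, `summarize`, the tool labels and every number below are placeholders I invented for illustration, not real prices for any tool in this post. Swap in figures from your own invoices.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    name: str
    monthly_cost: float  # placeholder value; replace with your real invoice amount
    core: bool           # True for stable core services, False for experiments


def summarize(subs: list[Subscription], budget: float) -> dict:
    """Split monthly spend into core vs. experimental and compare it to a budget."""
    core_total = sum(s.monthly_cost for s in subs if s.core)
    experimental_total = sum(s.monthly_cost for s in subs if not s.core)
    total = core_total + experimental_total
    return {
        "core": core_total,
        "experimental": experimental_total,
        "total": total,
        "remaining": budget - total,
        "over_budget": total > budget,
    }


if __name__ == "__main__":
    # Made-up example entries; neither the names nor the costs are real.
    subs = [
        Subscription("core-tool-a", 20.0, core=True),
        Subscription("core-tool-b", 20.0, core=True),
        Subscription("experiment-x", 10.0, core=False),
    ]
    report = summarize(subs, budget=60.0)
    for key, value in report.items():
        print(f"{key}: {value}")
```

Separating core from experimental spend matches the "stable core, experiment at the edges" idea: when the total creeps past the budget, the experimental line is the first place to trim.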

