Screencast MCP

MCP server TypeScript Windows-first ffmpeg

A Model Context Protocol server that lets an agent record the screen, take screenshots, "watch" footage by sampling frames into viewable images, and run a small set of ffmpeg edits. It speaks MCP over stdio.

View on GitHub

Prerequisites

ffmpeg and ffprobe must be installed and on PATH (or set via FFMPEG_PATH / FFPROBE_PATH). Screen capture uses gdigrab, which is Windows-only; the watch and edit tools run anywhere ffmpeg runs.

Install

npm install -g @tmhs/screencast-mcp
{
  "mcpServers": {
    "screencast": {
      "command": "npx",
      "args": ["-y", "@tmhs/screencast-mcp"]
    }
  }
}

Tools

ToolWhat it does
start_recordingStart a background recording. Target = full, a monitor, a window by title, or a region. Optional fps and quality preset.
stop_recordingStop a session by id with a graceful quit so the file is finalized, not truncated.
list_sessionsList active and finished recording sessions.
get_sessionInspect a single session by id.
screenshotCapture a single PNG of a target.
sample_framesExtract frames at a fixed fps or at explicit timestamps so the agent can view what happened.
get_media_infoffprobe wrapper: duration, resolution, fps, codecs, format, size.
trimCut a sub-clip by start + end or duration.
concatJoin two or more videos into one.
convertConvert between mp4, gif, and webm.
cropCrop to a pixel rectangle; an off-frame rectangle is rejected.
scaleResize to a width and/or height, keeping aspect when one side is given.
speedChange playback speed by a factor; audio is retempo'd when present.
overlayComposite a logo, watermark, or picture-in-picture, optionally scaled and time-limited.
compressRe-encode smaller with a CRF ladder and an optional width cap.
extract_audioWrite the audio track to its own file (mp3, aac, wav, or copy).
clipExtract one or more frame-accurate sub-segments to separate files.
redact_regionCover declared rectangles (solid box, blur, or pixelate) to hide on-screen secrets. Declared regions only, not automatic detection.
list_audio_devicesList DirectShow audio devices and flag a likely system-audio loopback device for start_recording.
xfade_transitionCrossfade two videos into one with an xfade transition. Inputs are auto-normalized first.
assemble_highlightsStitch two or more clips into one with hard cuts or an xfade transition between each.
title_cardGenerate a standalone title card with centered text on a solid background. Uses a bundled font.
music_bedLay a music track under a video: looped/trimmed, faded, leveled, and mixed with any existing audio.
reframeRe-aspect to 16:9, 9:16, 1:1, or 4:5 with pad (letterbox) or crop (fill).
export_presetEncode a platform-ready file (youtube, instagram_reel, tiktok, x, square) at the right aspect, fps, and bitrate.

Windows notes

Threat model

Screen capture can record anything on screen, including secrets. Capture is always explicit (a tool call, never automatic), output stays on the local filesystem, and this public repo ignores captured media so test recordings cannot be committed by accident. Review frames before sharing a file.