Forge MCP tools

Forge ships a built-in MCP server (id forge) auto-registered in ~/.codex/config.toml on each launch. Codex’s agent calls these tools the same way it calls any other MCP tool. The Forge backend handles the vendor API call, saves the file under <project>/.forge/generated/..., and returns a ready-to-paste forge-link block so the user gets an inline preview card the moment the tool finishes.

+--------+     spawn     +-----------+    TCP loopback     +---------+
| Codex  | ------------> | forge-mcp | ------------------> | Forge   |
| agent  | <-- stdio --- | shim      | <----- JSON ------- | backend |
+--------+     (MCP)     +-----------+                     +---------+

The shim is a tiny Node ESM CLI that Forge extracts to ~/.forge/mcp-shim/ on each launch. Forge writes its loopback port and auth token to ~/.forge/mcp-shim.json; the shim reads that file on spawn, so each Forge boot is self-contained (a stale shim from a previous boot can’t talk to the new server).
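A minimal sketch of the handshake read described above, written in Python for brevity (the real shim is a Node ESM CLI). The key names `port` and `token` are assumptions; the text only says the file holds a loopback port and an auth token.

```python
import json
from typing import NamedTuple


class Handshake(NamedTuple):
    port: int
    token: str


def parse_handshake(raw: str) -> Handshake:
    """Parse the contents of ~/.forge/mcp-shim.json.

    The keys "port" and "token" are hypothetical; check the real file.
    """
    data = json.loads(raw)
    port = int(data["port"])
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    token = str(data["token"])
    if not token:
        raise ValueError("empty auth token")
    return Handshake(port, token)
```

Because the file is rewritten on every boot, a shim that parsed an older copy would hold a port and token the new backend no longer accepts, which is the isolation property the paragraph describes.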

node must be on PATH for the shim to spawn. If it isn’t, Forge logs the skip and the tools simply don’t appear in Codex’s tool list. Install Node.js and restart Forge to enable them.

Generate speech via ElevenLabs.

| Field | Type | Notes |
| --- | --- | --- |
| text | string (required) | The line to speak. |
| voice_id | string | ElevenLabs voice id. Defaults to Rachel (21m00Tcm4TlvDq8ikWAM). |
| label | string | Short human-readable name shown on the chat card. |
| model_id | string | Defaults to eleven_multilingual_v2. |

Saves under <project>/.forge/generated/audio/<timestamp>.mp3. Default open_in for the returned forge-link is audiotrim so the user lands in the trimmer (most generations are too long and need cropping).

Free tier on ElevenLabs covers light voice usage.
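For orientation, here is a sketch of what a call to the speech tool might look like on the wire, assuming the standard MCP JSON-RPC `tools/call` framing and assuming the tool is registered as `elevenlabs_generate_speech`. The helper `speech_call` is hypothetical, and the defaults are spelled out only for illustration; the backend applies them when the fields are omitted.

```python
from typing import Any, Dict

DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"  # Rachel, the documented default
DEFAULT_MODEL_ID = "eleven_multilingual_v2"  # the documented default


def speech_call(text: str, request_id: int = 1, **optional: Any) -> Dict[str, Any]:
    """Build a JSON-RPC `tools/call` request for the speech tool (hypothetical helper)."""
    if not text:
        raise ValueError("text is required")
    arguments: Dict[str, Any] = {
        "text": text,
        "voice_id": DEFAULT_VOICE_ID,
        "model_id": DEFAULT_MODEL_ID,
    }
    arguments.update(optional)  # e.g. label="greeting" or a different voice_id
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "elevenlabs_generate_speech", "arguments": arguments},
    }
```

In practice the agent never builds this by hand; Codex emits the request and the shim forwards it to the Forge backend.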

Generate a short sound effect via ElevenLabs.

| Field | Type | Notes |
| --- | --- | --- |
| prompt | string (required) | Describe the SFX. |
| duration_seconds | number (0.5–22) | Clip length. Default 3. |
| label | string | Short human-readable name. |

Saves under .forge/generated/sfx/. Default open_in is audiotrim.

Free tier.

Pack a list of pre-generated frame images into a single sprite sheet PNG.

| Field | Type | Notes |
| --- | --- | --- |
| frames | { path, label? }[] (1–64) | Absolute or project-relative paths to each frame image. |
| cols | integer | Grid columns. Defaults to ceil(sqrt(N)). |
| rows | integer | Grid rows. Defaults to ceil(N/cols). |
| padding | integer (0–32) | Pixels between cells. Default 0. |
| name | string | Display name in the descriptor. |
| label | string | Short human-readable name on the chat card. |

The agent generates frames first via Codex’s built-in image tool (one prompt at a time, saved paths printed), then calls this once with the saved paths. Cells are sized to fit the largest frame; smaller frames center. Saves the packed PNG plus a JSON descriptor with frame coords and source-path references under .forge/generated/spritesheets/.
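The packing rules above (default grid shape, cells sized to the largest frame, smaller frames centered) can be sketched as follows. Row-major ordering, padding applied only between cells, and rounding the centering offset down are assumptions; the function names are illustrative and not part of Forge.

```python
import math
from typing import List, Optional, Tuple


def grid_shape(n: int, cols: Optional[int] = None, rows: Optional[int] = None) -> Tuple[int, int]:
    """Apply the documented defaults: cols = ceil(sqrt(N)), rows = ceil(N / cols)."""
    if not 1 <= n <= 64:
        raise ValueError("expected 1-64 frames")
    if cols is None:
        cols = math.ceil(math.sqrt(n))
    if rows is None:
        rows = math.ceil(n / cols)
    if cols * rows < n:
        raise ValueError("grid too small for the frame count")
    return cols, rows


def place_frames(
    sizes: List[Tuple[int, int]], padding: int = 0
) -> Tuple[int, int, List[Tuple[int, int]]]:
    """Return (sheet_w, sheet_h, top-left position of each frame)."""
    cols, rows = grid_shape(len(sizes))
    cell_w = max(w for w, _ in sizes)  # cells fit the largest frame
    cell_h = max(h for _, h in sizes)
    positions = []
    for i, (w, h) in enumerate(sizes):
        col, row = i % cols, i // cols  # assumed row-major order
        x = col * (cell_w + padding) + (cell_w - w) // 2  # center smaller frames
        y = row * (cell_h + padding) + (cell_h - h) // 2
        positions.append((x, y))
    sheet_w = cols * cell_w + (cols - 1) * padding
    sheet_h = rows * cell_h + (rows - 1) * padding
    return sheet_w, sheet_h, positions
```

For five frames the defaults give a 3 x 2 grid with one empty cell, which is why passing explicit cols or rows can produce a tighter sheet.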

The returned forge-link points at the JSON descriptor, not the PNG. The card preview reads the descriptor to find and render the PNG; the click opens the sprite sheet builder seeded with all source frames so the user can re-pack with different padding or rearrange.

Capstone of the agent-driven asset chain. Bundles a tilemap with an optional parallax background + character sprite sheet into a single level descriptor.

| Field | Type | Notes |
| --- | --- | --- |
| tilemap_path | string (required) | Path to a .tmj from generate_tilemap (or any Tiled JSON). |
| parallax_path | string | Optional parallax descriptor JSON. |
| character_sheet_path | string | Optional sprite sheet descriptor; frame 0 shows in the preview. |
| character_pos | { x, y } | Pixel position; default ~30% from the left, near the ground. |
| name | string | Display name. |
| label | string | Short human-readable name on the chat card. |

The backend composes a static 320x140 thumbnail PNG (parallax tiled in the back, tilemap scaled to fit the lower 60%, character sprite drawn at character_pos) and saves it alongside the descriptor. The level preview tab loads the same composition into an 800x360 scrollable canvas with parallax depth intact; drag to pan.

Typical chain:

generate_tileset(theme="forest")
→ savedPath A
generate_tilemap(tileset_path=A, width=64, height=20)
→ savedPath B
generate_parallax_layers(layers=[sky, hills, trees, fg])
→ savedPath C
generate_spritesheet(frames=[idle1, idle2, run1, run2])
→ savedPath D
generate_platformer_level(tilemap_path=B, parallax_path=C, character_sheet_path=D)
→ forge-link → user opens the level preview

Save named animation clips against an existing sprite sheet descriptor.

| Field | Type | Notes |
| --- | --- | --- |
| sheet_path | string (required) | Sprite sheet descriptor JSON from generate_spritesheet. |
| clips | { name, frames[], fps? }[] (1–32) | Each clip lists frame indices to cycle and an fps (1–30, default 8). |
| name | string | Display name. |
| label | string | Short human-readable name on the chat card. |

The backend copies the sheet descriptor and adds a clips array, so the animation tab gets the full sheet metadata + clip definitions in one read. Frame indices are 0-based against the sheet’s frames array.
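A rough sketch of that copy-and-extend step, assuming the sheet descriptor is a JSON object with a `frames` array (as stated above) and that invalid indices or fps values are rejected. The function name `attach_clips` and the exact error behavior are illustrative, not the backend's actual code.

```python
import copy
from typing import Any, Dict, List


def attach_clips(sheet: Dict[str, Any], clips: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the sheet descriptor with a validated `clips` array added."""
    if not 1 <= len(clips) <= 32:
        raise ValueError("expected 1-32 clips")
    frame_count = len(sheet["frames"])
    out = copy.deepcopy(sheet)  # the original descriptor is left untouched
    out["clips"] = []
    for clip in clips:
        fps = clip.get("fps", 8)  # documented default
        if not 1 <= fps <= 30:
            raise ValueError(f"fps out of range in clip {clip['name']!r}")
        for index in clip["frames"]:
            if not 0 <= index < frame_count:  # indices are 0-based
                raise ValueError(f"frame {index} not in sheet ({frame_count} frames)")
        out["clips"].append({"name": clip["name"], "frames": list(clip["frames"]), "fps": fps})
    return out
```

Keeping the sheet metadata and the clips in one object is what lets the animation tab load everything in a single read.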

The animation tab cycles each clip live. Click any frame thumbnail to add or remove it from the selected clip; tweak fps with the slider.

Save a branching dialog tree.

| Field | Type | Notes |
| --- | --- | --- |
| nodes | { id, character?, voice_id?, text, choices?[] }[] (1–128) | Each node carries one line of dialog plus optional metadata. |
| default_voice_id | string | Recorded for reference. |
| name | string | Display name. |
| label | string | Short human-readable name on the chat card. |

Per-node voice_id lets multi-character conversations use distinct ElevenLabs voices; nodes that omit it fall back to the project default voice (set in Settings → Integrations → ElevenLabs). The dialog tab opens with every node loaded; clicking Speak on a node calls elevenlabs_generate_speech with the chosen voice id.

choices is an array of target node ids. The save tool validates that every choice resolves to a known node id — dangling choices abort the save with a clear error.
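The dangling-choice check can be sketched like this. The node shape follows the field table; the function names and the error message format are illustrative assumptions.

```python
from typing import Any, Dict, List, Tuple


def find_dangling_choices(nodes: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Return (source id, missing target id) for every choice that points nowhere."""
    known = {node["id"] for node in nodes}
    return [
        (node["id"], target)
        for node in nodes
        for target in node.get("choices", [])
        if target not in known
    ]


def validate_dialog(nodes: List[Dict[str, Any]]) -> None:
    """Raise ValueError if the tree is out of range or has dangling choices."""
    if not 1 <= len(nodes) <= 128:
        raise ValueError("expected 1-128 nodes")
    dangling = find_dangling_choices(nodes)
    if dangling:
        details = ", ".join(f"{src} -> {dst}" for src, dst in dangling)
        raise ValueError(f"dangling choices: {details}")
```

Running a check like this before calling the save tool lets the agent fix broken links up front rather than retrying after an aborted save.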

Stack pre-generated layer images into a parallax scene descriptor + static stacked thumbnail PNG.

| Field | Type | Notes |
| --- | --- | --- |
| layers | { name?, image_path, scroll?, y_offset?, visible? }[] (1–12) | Layers from back to front. |
| viewport_width | integer (64–4096) | Thumbnail / editor preview width. Default 800. |
| viewport_height | integer (64–2048) | Thumbnail / editor preview height. Default 320. |
| name | string | Display name in the descriptor. |
| label | string | Short human-readable name on the chat card. |

Each layer carries a scroll multiplier where 0 is a static back layer (sky doesn’t move) and 1 is camera-locked (foreground moves 1:1 with the camera). Values in between control depth — 0.2 for distant hills, 0.5 for mid trees, etc.
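One way to read the scroll multiplier is as a scale on camera movement: a layer is drawn shifted left by camera_x * scroll. The sketch below assumes that interpretation, plus horizontal tiling to cover the viewport; the function names are illustrative and the editor's actual rendering code may differ.

```python
from typing import List


def layer_shift(camera_x: float, scroll: float) -> float:
    """Pixels the layer has moved on screen for a given camera position."""
    return camera_x * scroll  # 0 = static backdrop, 1 = moves 1:1 with the camera


def tile_origins(
    camera_x: float, scroll: float, layer_width: int, viewport_width: int
) -> List[float]:
    """X positions at which to draw copies of the layer so it covers the viewport."""
    x = -(layer_shift(camera_x, scroll) % layer_width)
    origins = []
    while x < viewport_width:
        origins.append(x)
        x += layer_width
    return origins
```

With the camera at 100 px, a sky layer at scroll 0 stays put, mid trees at 0.5 shift by 50 px, and a foreground at 1 shifts by the full 100 px, which is what produces the depth effect.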

Workflow: agent generates each layer image first via Codex’s built-in image tool, notes the saved paths, then calls this once with the array of layers. Backend validates each path, decodes via the image crate, and composes a static stacked thumbnail (each layer horizontally tiled at scroll=0 + its y_offset) so the chat card has something to render. The full scrolling preview lives in the parallax editor.

The forge-link points at the JSON descriptor. Click → parallax editor opens with all layers loaded; drag the canvas (or the scroll slider in the bottom rail) to see the depth effect.

Persist an agent-authored GLSL fragment shader.

| Field | Type | Notes |
| --- | --- | --- |
| glsl | string (required) | Full GLSL ES 3.0 fragment source. |
| kind | string | Shader stage. Currently only fragment. |
| name | string | Display name in the descriptor. |
| label | string | Short human-readable name on the chat card. |

The agent writes the source itself; this tool just persists it under .forge/generated/shaders/<timestamp>.glsl plus a JSON sidecar with metadata (kind, line count, name). The forge-link points at the .glsl file directly so the inline preview can read GLSL without a descriptor round-trip.

The card runs a live WebGL2 preview at ~30fps with the same uniform set as the shader sandbox: iResolution (vec2 px), iTime (float seconds), iMouse (vec4 px). Compile errors render inline in red so the agent gets immediate feedback if the syntax is wrong. Click the card → shader sandbox opens with the source loaded.

Generate a procedural tileset PNG plus a JSON descriptor.

| Field | Type | Notes |
| --- | --- | --- |
| theme | string (required) | One of forest, dungeon, platformer, cave, scifi. |
| tile_size | integer (8–128) | Pixel size per tile. Default 32. |
| name | string | Display name in the descriptor. |
| label | string | Short human-readable name on the chat card. |

Each theme defines 8 tiles with semantic tags (e.g. forest = grass / grass-dark / dirt-path / stone / water / tree / flower / rock). V1 renders them as solid blocks with a contrasting inner border and center accent dot. That is visually distinct enough to read at editor scale, and a future iteration can replace the renderer without changing the tool surface.

The forge-link points at the JSON descriptor. Click → tilemap editor opens with the tileset image already loaded so the user can paint without picking a file.

Procedurally fill a width × height grid using a curated theme generator.

| Field | Type | Notes |
| --- | --- | --- |
| tileset_path | string (required) | Path to a tileset .json descriptor produced by generate_tileset. |
| width | integer (4–128) (required) | Map width in tiles. |
| height | integer (4–128) (required) | Map height in tiles. |
| theme | string | Generator override. Defaults to the tileset’s theme. |
| seed | integer | RNG seed for reproducible generation. |
| name | string | Display name in the descriptor. |
| label | string | Short human-readable name on the chat card. |

Generators:

  • forest — scatter on grass: dark-grass sprinkles, 1–2 random-walk dirt paths, tree clusters, occasional flowers + rocks.
  • dungeon — BSP-ish: random rooms connected by L-shaped corridors, torches near walls, a chest or two.
  • platformer — ground + grass top in the bottom rows, stone outcrops underground, floating platforms with coins, scattered clouds, occasional spikes.
  • cave — 4 passes of cellular automata over a 45/55 random fill, then sprinkle moss / ore / crystal / rare lava.
  • scifi — dungeon BSP shape with neon strips on floors and energy accents.
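To make the cave generator concrete, here is a sketch of seeded cellular-automata smoothing. It assumes that "45/55" means roughly 45% wall and 55% floor, that out-of-bounds cells count as walls, and that the smoothing rule is the common 4-5 rule; none of these details are confirmed by the text above, and the decoration pass (moss, ore, crystal, lava) is left out.

```python
import random
from typing import List

Grid = List[List[bool]]  # True = wall, False = open floor


def count_wall_neighbors(grid: Grid, x: int, y: int) -> int:
    """Count walls among the 8 neighbors; cells outside the map count as walls."""
    height, width = len(grid), len(grid[0])
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or nx >= width or ny >= height or grid[ny][nx]:
                total += 1
    return total


def smooth(grid: Grid) -> Grid:
    """One automaton pass: walls survive with >= 4 wall neighbors, floor fills at >= 5."""
    out = []
    for y, row in enumerate(grid):
        new_row = []
        for x, is_wall in enumerate(row):
            n = count_wall_neighbors(grid, x, y)
            new_row.append(n >= 5 or (is_wall and n >= 4))
        out.append(new_row)
    return out


def generate_cave(width: int, height: int, seed: int, passes: int = 4) -> Grid:
    """Random 45% wall fill, then `passes` smoothing passes. Same seed, same map."""
    if not (4 <= width <= 128 and 4 <= height <= 128):
        raise ValueError("width and height must be within 4-128")
    rng = random.Random(seed)
    grid = [[rng.random() < 0.45 for _ in range(width)] for _ in range(height)]
    for _ in range(passes):
        grid = smooth(grid)
    return grid
```

Using a private random.Random(seed) instead of the module-level RNG is what makes the output reproducible from the seed alone.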

Output is a Tiled-format .tmj saved under .forge/generated/tilemaps/ with a forge: block recording the generator + seed for reproducibility. Click the forge-link → tilemap editor opens with both the tileset and the painted layer in place; the user can tweak, paint, then re-export through the existing Save flow.

For “build me a level” prompts, chain generate_tileset → generate_tilemap in two MCP calls.

Build a 5-color palette descriptor and save it under .forge/generated/palettes/.

| Field | Type | Notes |
| --- | --- | --- |
| theme | string | One of: sunset, forest, ocean, desert, dungeon, monochrome, neon, pastel, noir, candy, retro, cyberpunk. Ignored when colors is supplied. |
| colors | string[] (1–16) | Explicit #RRGGBB hexes that override theme. |
| name | string | Display name. Defaults to the theme title. |
| label | string | Short human-readable name shown on the chat card. |

Saves a small JSON descriptor ({ name, theme, colors[], createdAt }). The returned forge-link’s open_in defaults to palette. Clicking the card opens a fresh palette tab seeded with the descriptor’s name + first color, so the user can iterate on the harmony from there.
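The precedence between theme and colors can be sketched as below. The function name is illustrative, and rejecting a call that supplies neither field is an assumption; the text above does not say what the real tool does in that case.

```python
import re
from typing import List, Optional

THEMES = {
    "sunset", "forest", "ocean", "desert", "dungeon", "monochrome",
    "neon", "pastel", "noir", "candy", "retro", "cyberpunk",
}
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def choose_source(theme: Optional[str] = None, colors: Optional[List[str]] = None) -> str:
    """Return "colors" when explicit hexes are given (theme is ignored), else "theme"."""
    if colors:
        if len(colors) > 16:
            raise ValueError("at most 16 colors")
        bad = [c for c in colors if not HEX_COLOR.fullmatch(c)]
        if bad:
            raise ValueError(f"not #RRGGBB: {bad}")
        return "colors"
    if theme not in THEMES:
        # Assumed behavior: with no colors, a known theme is required.
        raise ValueError(f"unknown theme: {theme!r}")
    return "theme"
```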

The hybrid model is deliberate: the curated themes are fast wins for “give me a palette” prompts, while explicit colors lets the agent ship a hand-picked set when the user has a specific look in mind.

Generate a music track via ElevenLabs Music.

| Field | Type | Notes |
| --- | --- | --- |
| prompt | string (required) | Describe the track. |
| length_ms | integer (10000–300000) | Track length. Default 30000. |
| label | string | Short human-readable name. |

Saves under .forge/generated/music/. Default open_in is audio (player rather than trimmer; music tends to be used full-length).

Paid tier. Confirm with the user before calling if cost matters.

Every tool returns a JSON object with the same shape:

{
  "savedPath": "C:\\Users\\you\\projects\\game\\.forge\\generated\\sfx\\1730000000000.mp3",
  "projectRelativePath": ".forge/generated/sfx/1730000000000.mp3",
  "byteCount": 28345,
  "forgeLinkBlock": "```forge-link\npath: .forge/generated/sfx/1730000000000.mp3\npreview: audio\nopen_in: audiotrim\nlabel: sfx\n```"
}

The forgeLinkBlock is the value of an MCP text content frame the agent receives. The skill copy tells the agent to drop it verbatim into the chat reply, which produces the inline preview card.
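For tooling that wants to inspect a result rather than paste it, the block can be split back into its fields. This sketch assumes the block is always a forge-link fence wrapping simple `key: value` lines, as in the sample result; the parser is illustrative and not part of Forge.

```python
import json
from typing import Dict

FENCE = "`" * 3  # built at runtime so this snippet contains no literal fence


def parse_forge_link(block: str) -> Dict[str, str]:
    """Split a forge-link block into its key/value fields."""
    lines = block.strip().splitlines()
    if len(lines) < 2 or lines[0].strip() != FENCE + "forge-link" or lines[-1].strip() != FENCE:
        raise ValueError("not a forge-link block")
    fields = {}
    for line in lines[1:-1]:
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def card_fields(result_json: str) -> Dict[str, str]:
    """Pull the forge-link fields out of a tool result JSON string."""
    return parse_forge_link(json.loads(result_json)["forgeLinkBlock"])
```

Splitting on the first colon only keeps values that contain colons intact, such as an absolute Windows path.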

Settings → MCP servers shows the entry as forge. Toggle it off to remove the tools from Codex’s tool list on the next sidecar restart. The underlying loopback keeps running (it’s cheap); it just becomes unreachable from the agent.

The most common error is “ElevenLabs key not configured”. Open Settings → Integrations and add a key. Free tier signups at elevenlabs.io cover voice + SFX usage.