Building Custom Tools and Widgets for ChatGPT: The OpenAI Apps SDK Handbook

OpenAI recently launched the OpenAI Apps SDK, which is currently in preview. The SDK builds on the Model Context Protocol (MCP), an open standard for extending models such as ChatGPT with custom tools and resources. While in preview, it offers early access to features like tool registration and widget serving, enabling developers to create immersive experiences directly within conversations. As large language models continue to expand their role in developer workflows, the Apps SDK provides a robust framework for integrating external functions, managing asynchronous tasks, and delivering interactive interfaces without interrupting the user experience.

This article explores how to apply MCP principles in Node.js to build an advanced ChatGPT App. We draw from a practical example, a “TaleStitch” application that processes story generation requests through backend services and custom displays. The focus remains on core setup, tool definitions, and session handling, with structured code examples for clarity.

Why Use the MCP SDK for ChatGPT Applications?

The MCP SDK’s emphasis on structured inputs and metadata makes it well-suited for scalable applications that require validation, safety hints, and rich outputs. Node.js’s asynchronous nature complements MCP, supporting efficient handling of concurrent sessions and API interactions.

With the SDK’s preview status, developers can experiment with features like streaming transports, though production use may require awaiting full release. Libraries for OpenAI APIs and schema validation further simplify adoption.

Key Concepts of MCP Applied in Node.js

MCP’s primary components (tool specifications, metadata for OpenAI-specific enhancements, and transport mechanisms) adapt seamlessly to Node.js. Key principles include:

Tool Specifications: Define invocable functions with input schemas and behavioral hints.
Metadata: Customize invocation flows and widget rendering for better user engagement.
Transports: Facilitate session persistence and streaming communication.

Building an MCP Workflow in Node.js

We will construct a Node.js application that:

Registers Tools: Defines functions with validated parameters and OpenAI metadata.
Serves Widgets: Provides HTML resources for displaying dynamic content.
Manages Sessions: Uses transports to maintain stateful connections.

Step 1: Setting Up the Node.js Environment

Ensure Node.js is installed. Install the MCP SDK and Zod for input validation.

npm install @modelcontextprotocol/sdk zod

Step 2: Initializing the MCP Server and Registering Tools

Create a reusable MCP server instance. Register tools using a structured specification that includes input schemas, hints, and OpenAI metadata.

const { McpServer } = require("@modelcontextprotocol/sdk/server/mcp.js");
const { z } = require("zod");

const server = new McpServer({
  name: "talestitch-mcp",
  version: "1.0.0"
});

server.registerTool(
  "talestitch.generate_story",
  {
    title: "Generate Story With Talestitch",
    description: "Generates creative stories from a plot outline, with optional length and genre preferences.",
    inputSchema: {
      plot: z.string().min(50).max(1000).describe("Detailed plot outline starting with a hook. Example: A brave explorer stumbles upon an ancient temple guarded by mythical creatures."),
      length: z.enum(["short", "medium", "long"]).optional().describe("Preferred story length."),
      genre: z.string().optional().describe("Preferred genre, e.g. fantasy or mystery.")
    },
    // Behavioral hints live under "annotations" in the MCP SDK
    annotations: {
      readOnlyHint: true,
      destructiveHint: false,
      idempotentHint: false,
      openWorldHint: true
    },
    _meta: {
      "openai/outputTemplate": "ui://widget/widget.html",
      "openai/toolInvocation/invoking": "Asking Talestitch to generate a story for you",
      "openai/toolInvocation/invoked": "Talestitch will generate a creative story based on your plot, length, and genre preferences",
      "openai/widgetAccessible": true
    }
  },
  async ({ plot, length, genre }) => {
    try {
      // Placeholder for processing logic
      return {
        structuredContent: {
          title: "Generated Title",
          story: "<p>Sample story content based on inputs.</p>",
          image: "https://example.com/image.jpg"
        },
        content: [
          { type: "text", text: "Your Story is Ready!" }
        ]
      };
    } catch (error) {
      return {
        structuredContent: { error: error.message },
        content: [{ type: "text", text: error.message }]
      };
    }
  }
);

The specification ensures validated inputs and guides ChatGPT on usage through descriptions and hints. OpenAI metadata enables custom widget rendering and user-friendly messages during invocation.

OpenAI-Specific Metadata:

Within the tool specification’s _meta object, OpenAI provides a set of prefixed keys to customize how ChatGPT handles invocations and outputs, creating a more polished user experience. These are optional but powerful for widget-enabled tools.

  • “openai/outputTemplate”: Specifies a URI (e.g., “ui://widget/widget.html”) for a custom HTML template to render the tool’s structured output. When the tool returns structuredContent, ChatGPT loads this template and populates it dynamically via the openai:set_globals event. This enables rich, interactive displays like story viewers, tying directly to registered resources.
  • “openai/toolInvocation/invoking”: A string message displayed to the user during tool execution (e.g., “Asking Talestitch to generate a story for you”). It appears in real-time as the invocation progresses, providing transparency for potentially long-running tasks.
  • “openai/toolInvocation/invoked”: A follow-up string shown immediately after invocation completes (e.g., “Talestitch will generate a creative story based on your plot, length, and genre preferences”). This sets user expectations and can direct attention to the widget or next steps.
  • “openai/widgetAccessible”: A boolean (true/false) that controls whether the rendered widget may invoke this tool itself, for example via window.openai.callTool from the widget’s JavaScript. Set it to true for interactive widgets that trigger follow-up tool calls; leave it false for widgets that only display output.

Step 3: Adding Session Management with Transports

MCP uses transports to handle communication sessions. The StreamableHTTPServerTransport is particularly useful for HTTP-based streaming, allowing bidirectional data flow between the client (ChatGPT) and server. It supports session IDs for persistence and callbacks for initialization and closure, making it ideal for stateful interactions in preview environments.

const { StreamableHTTPServerTransport } = require("@modelcontextprotocol/sdk/server/streamableHttp.js");
const { randomUUID } = require("node:crypto");

const transports = {}; // In-memory session tracking

const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: () => randomUUID(),
  onsessioninitialized: async (sessionId) => {
    transports[sessionId] = transport;
    console.log(`Session initialized: ${sessionId}`);
  }
});

// The transport exposes an "onclose" callback for cleanup
transport.onclose = () => {
  if (transport.sessionId) {
    delete transports[transport.sessionId];
    console.log(`Session closed: ${transport.sessionId}`);
  }
};

server.connect(transport);

This transport manages unique sessions, preventing duplicates and ensuring cleanups, which is essential for scalable, concurrent use.
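
To serve concurrent users, each incoming request must be routed to the transport that owns its session. The sketch below shows one way a session-aware service (like the MCPService referenced in Step 5) might do this. The names processMcpRequest and the injected createTransport factory are illustrative, not SDK APIs; in a real server, createTransport would construct a StreamableHTTPServerTransport and await server.connect on it.

```javascript
// Hypothetical session-routing sketch. "processMcpRequest" and the
// injected "createTransport" factory are illustrative names, not SDK APIs.
const transports = new Map();

async function processMcpRequest({ body, sessionId }, createTransport) {
  // Known session: reuse the transport that owns it.
  if (sessionId && transports.has(sessionId)) {
    return transports.get(sessionId);
  }
  // No session yet: only an initialize request may open one.
  if (!sessionId && body && body.method === "initialize") {
    const transport = await createTransport();
    // With the real SDK the session id is assigned while the initialize
    // request is handled, so registration normally happens inside the
    // onsessioninitialized callback shown above. Here we register eagerly
    // when the transport already carries an id.
    if (transport.sessionId) {
      transports.set(transport.sessionId, transport);
    }
    return transport;
  }
  throw new Error("Invalid or missing mcp-session-id header");
}
```

Rejecting non-initialize requests that lack a session ID keeps clients from silently creating orphaned sessions.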

Step 4: Implementing Widgets (Advanced)

Register HTML resources as widgets to enhance tool outputs. The server serves the content via a URI, which the metadata links to during tool calls.

const fs = require("node:fs");

const htmlContent = fs.readFileSync("./widget.html", "utf-8");

server.registerResource(
  "talestitch.generate_story",
  "ui://widget/widget.html",
  {
    title: "Story Viewer",
    description: "Interactive display for generated stories"
  },
  async () => ({
    contents: [{
      uri: "ui://widget/widget.html",
      mimeType: "text/html+skybridge",
      text: htmlContent
    }]
  })
);

In the widget file (widget.html), use JavaScript to listen for the “openai:set_globals” event and populate elements from toolOutput.structuredContent. For persistence, call setWidgetState to store data across reloads.

// Event handler for openai:set_globals
window.addEventListener("openai:set_globals", () => {
  const toolOutput = window.openai?.toolOutput?.result?.structuredContent;
  if (toolOutput) {
    // Update title, content, and image from the structured output
    document.getElementById('chapterTitle').innerText = toolOutput.title || 'Untitled Story';
    document.getElementById('chapterContent').innerHTML = toolOutput.story || '<p>No story available.</p>';
    document.getElementById('storyImage').src = toolOutput.image || 'default.jpg';
    // Hide loading, show content
    document.getElementById('heroLoading').style.display = 'none';
    document.getElementById('storyContainer').style.display = 'block';
    // Save state so the story survives widget reloads
    window.openai?.setWidgetState({ title: toolOutput.title, story: toolOutput.story, image: toolOutput.image });
  }
});
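
State saved with setWidgetState can be read back when the widget reloads. Below is a minimal restore sketch, assuming the same element ids as above; restoreWidgetState is a hypothetical helper, and the openai object and document are passed in as parameters so the logic stays testable. In the widget you would call restoreWidgetState(window.openai, document) on load.

```javascript
// Hypothetical restore helper: repopulate the DOM from previously saved
// widget state. "openaiApi" and "doc" are injected for testability; in
// the widget, pass window.openai and document.
function restoreWidgetState(openaiApi, doc) {
  const saved = openaiApi && openaiApi.widgetState;
  if (!saved || !saved.story) return false;
  doc.getElementById('chapterTitle').innerText = saved.title || 'Untitled Story';
  doc.getElementById('chapterContent').innerHTML = saved.story;
  doc.getElementById('storyImage').src = saved.image || 'default.jpg';
  // Skip the loading state entirely when restoring
  doc.getElementById('heroLoading').style.display = 'none';
  doc.getElementById('storyContainer').style.display = 'block';
  return true;
}
```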

Step 5: Exposing the MCP Endpoint with Routes and Controller

To integrate with ChatGPT, expose your MCP server via an HTTP endpoint. Define a POST route for “/mcp” and a controller to handle requests, passing the body and session ID to the MCP service.


const MCPController = {
  process: async (req, res) => {
    try {
      const sessionId = req.headers['mcp-session-id'] || null;
      const transport = await MCPService.process({ body: req.body, sessionId });
      return transport.handleRequest(req, res, req.body);
    } catch (error) {
      console.error(`Error in MCPController.process - ${error.message}`);
      // Serialize the message explicitly; raw Error objects stringify to {}
      return res.status(400).json({ error: error.message });
    }
  }
};
(Image: Talestitch inside ChatGPT)

Step 6: Testing Your MCP Setup Locally

To test your MCP endpoint during development:

  1. Run your Node.js server locally (e.g., on port 3000).
  2. Use ngrok to expose localhost:3000 publicly: ngrok http 3000. Note the ngrok URL (e.g., https://abc123.ngrok.io).
  3. In ChatGPT, go to Settings > Connectors > Advanced > Developer mode.
  4. Paste the ngrok URL + your MCP path (e.g., https://abc123.ngrok.io/mcp) as the connector endpoint.
  5. Trigger a tool call in ChatGPT (e.g., request a story generation). Monitor logs for session initialization and tool invocation.
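
Before wiring up ChatGPT, you can sanity-check the endpoint with a short Node script that sends a JSON-RPC initialize request. The URL, port, and protocolVersion value below are assumptions for illustration; the Accept header lists both content types because a Streamable HTTP server may answer with plain JSON or an SSE stream.

```javascript
// Hypothetical smoke test against a locally running MCP endpoint.
const initializeRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "initialize",
  params: {
    protocolVersion: "2025-03-26",
    capabilities: {},
    clientInfo: { name: "smoke-test", version: "0.0.1" }
  }
};

fetch("http://localhost:3000/mcp", {
  method: "POST",
  headers: {
    "content-type": "application/json",
    // The server may respond with JSON or an SSE stream
    accept: "application/json, text/event-stream"
  },
  body: JSON.stringify(initializeRequest)
})
  .then(async (res) => {
    console.log("status:", res.status);
    console.log("session:", res.headers.get("mcp-session-id"));
    console.log(await res.text());
  })
  .catch((err) => console.error("endpoint not reachable:", err.message));
```

A successful response should include an mcp-session-id header; pass that header back on subsequent requests to reuse the session.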

Demo:

https://x.com/sumitp01/status/1977016269275812336

This setup allows quick iteration: update your server, restart ngrok if needed, and test in ChatGPT’s developer mode for real-time feedback.

Conclusion

While the OpenAI Apps SDK remains in preview, its MCP components provide a solid foundation for extending ChatGPT with custom integrations. In this article, we examined how to:

Register Tools: Use specifications with schemas, hints, and metadata for precise and engaging invocations.

Incorporate Transports: Leverage StreamableHTTPServerTransport for efficient, streaming session management.

Build Widgets: Serve resources that deliver interactive, stateful user interfaces.

By utilizing Node.js’s strengths alongside MCP’s protocol features, developers can construct efficient AI extensions capable of managing intricate interactions. Now is the time to experiment with these preview capabilities. Combining MCP’s tools with creative designs unlocks vast potential for ChatGPT applications.
