A server! A frontend with React! Render dynamic content! Get & set global parameters (tool input, output, etc.)! Call tools from the app!
Full Demo On my GitHub!
OpenAI just released this Apps SDK to build apps for ChatGPT!
Originally, I thought it was going to be another app provider like the App Store or Google Play, but NOOOOOO!
It is just an iFrame rendered within ChatGPT!
Yes, it is like calling something rendered within a WebView of an iOS app an app! (Sounds silly, but that's all it is! Seriously!)
Which means to create an App for ChatGPT,
Five Steps!
- Create a regular MCP server with some tools that return structured content
- Create some UI/UX. We will be using Vite with React here and bundle those, but you can also just write some HTML, JS, and CSS directly!
- Register our UI components (actually just some HTML) as MCP server resources
- Modify tools to return component UI, in addition to structured content, via embedded resources
- Start the MCP server and register it as a ChatGPT Connector
Simple! Right?
Let’s check it out! Step by step!
Before We Start
Since we deploy our app by registering our MCP server as a ChatGPT Connector, we need developer mode.
Which means the free plan does not work!
- Make sure to pay OpenAI some money to upgrade the plan! Yes, I just paid 22 (20 + tax) dollars to try this thing out! I will make the most out of it!!!!! 22 dollars!
- Toggle Settings > Apps & Connectors > Advanced > Developer mode in the ChatGPT client
Project Structure
First of all, as you might have realized from the steps above, we need both a server and a frontend. Here is my folder structure.
.
├── server            # MCP Server
│   ├── package.json
│   ├── src
│   └── tsconfig.json
└── web               # Bundled UI
    ├── dist          # Build output
    ├── package.json
    ├── src
    └── tsconfig.json
I have left out a couple of config files for the web folder because they depend on the framework you want to use.
Basic MCP Server
The MCP server is the foundation of the Apps SDK integration, responsible for the following.
- List tools: exposes tools that the model can call, including their JSON Schema input and output contracts and optional annotations.
- Call tools: executes the call_tool request from the model (ChatGPT) and returns structured content the model can parse.
- Return components (UI): if we want ChatGPT to render some UI for a tool call (which we do in this article), the tool needs to point to an embedded resource that represents the interface to render (inline) in the ChatGPT client.
The transport protocol COULD be Server-Sent Events, but Streamable HTTP is indeed recommended!
Let's start with an MCP server with the basic tool functionality, i.e., returning structured content; we will add the resource-related parts after we have created our frontend UI!
In this article, please allow me to assume that you are familiar with MCP, especially the Streamable HTTP transport. If you need a little catch-up, please feel free to check out some of my previous articles!
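As a quick refresher, a Streamable HTTP tool call is just JSON-RPC 2.0 over an HTTP POST. For the tool we are about to build, the request body would look roughly like this (values are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_pokemon",
    "arguments": { "name": "pikachu" }
  }
}
```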
Set up
Let’s start with adding the following dependencies after running npm init -y
in the server folder!
npm install @types/express typescript @types/node --save-dev
npm install express @modelcontextprotocol/sdk
I will be leaving out the full package.json
and tsconfig.json
here, but you can just grab those from my GitHub!
MCP Server
As I said, we are starting with a regular MCP server exposing a simple get_pokemon
tool that returns detailed information for a specific pokemon!
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js'
import express, { Request, Response } from 'express'
import { z } from 'zod'
const PORT = 3000
/****************************/
/******** MCP Server ********/
/****************************/
const server = new McpServer({
name: 'pokemon-server',
version: '1.0.0',
}, {
capabilities: {},
})
// Add list pokemons tool
server.registerTool(
'get_pokemon',
{
title: 'Get Pokemon',
description: 'Get detailed information of a pokemon.',
inputSchema: { name: z.string({ description: "The name of the pokemon to get detail for." }) },
outputSchema: {
result: z.object({
name: z.string({ description: "Pokemon name." }),
id: z.number({ description: "Pokemon index id." }).int(),
height: z.number({ description: "Pokemon height." }).int(),
weight: z.number({ description: "Pokemon weight." }).int(),
types: z.array(
z.object({
slot: z.number().int(),
type: z.object({
name: z.string({ description: "type name." }),
url: z.string({ description: "URL to get type detail." }).url(),
})
}).passthrough()
),
sprites: z.object({
back_default: z.string({ description: "URL to get type back_default image." }).url().nullable(),
back_female: z.string({ description: "URL to get type back_female image." }).url().nullable(),
back_shiny: z.string({ description: "URL to get type back_shiny image." }).url().nullable(),
back_shiny_female: z.string({ description: "URL to get type back_shiny_female image." }).url().nullable(),
front_default: z.string({ description: "URL to get type front_default image." }).url().nullable(),
front_female: z.string({ description: "URL to get type front_female image." }).url().nullable(),
front_shiny: z.string({ description: "URL to get type front_shiny image." }).url().nullable(),
front_shiny_female: z.string({ description: "URL to get type front_shiny_female image." }).url().nullable(),
}, { description: "URLs to get pokmeon images." }).passthrough()
}).passthrough()
}
},
async ({ name }) => {
if (name.length == 0) {
throw new Error("Pokemon name cannot be empty.")
}
const response = await fetch(`https://pokeapi.co/api/v2/pokemon/${name}`)
if (!response.ok) {
throw new Error(`HTTP error. status: ${response.status}`)
}
const json = await response.json()
const structuredContent = {
result: json
}
return {
content: [
{ type: 'text', text: JSON.stringify(structuredContent) },
],
structuredContent: structuredContent
}
}
)
/********************************/
/******** Express Server ********/
/********************************/
// Set up Express and HTTP transport
const app = express()
app.use(express.json())
app.post('/mcp', async (req: Request, res: Response) => {
// Create a new transport for each request to prevent request ID collisions
const transport = new StreamableHTTPServerTransport({
// stateless mode
// for stateful mode:
// (https://levelup.gitconnected.com/mcp-server-and-client-with-sse-the-new-streamable-http-d860850d9d9d)
// 1. use sessionIdGenerator: () => randomUUID()
// 2. save the generated ID: const sessionId = transport.sessionId and the corresponding transport
// 3. try retrieve the session id with req.header["mcp-session-id"] for incoming request
// 4. If session id is defined and there is an existing transport, use the transport instead of creating a new one.
sessionIdGenerator: undefined,
// to use Streamable HTTP instead of SSE
enableJsonResponse: true
})
res.on('close', () => {
transport.close()
})
await server.connect(transport)
await transport.handleRequest(req, res, req.body)
})
app.listen(PORT, () => {
console.log(`Pokemon MCP Server running on http://localhost:${PORT}/mcp`)
}).on('error', error => {
console.error('Server error:', error)
process.exit(1)
})
Two points here!
- It is really important to define an outputSchema if we are planning on using that information within our component UI (and we are), so that what we return can be verified.
- The text content should be the JSON.stringified version of the structuredContent. The reason for this actually has something to do with the UI parts as well, so please let me leave it out for now and come back to it in a couple of seconds!
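A tiny hypothetical helper (my own, not part of the SDK) can capture this mirroring pattern, so the text content never drifts out of sync with the structuredContent:

```typescript
// Hypothetical helper: build a tool result whose text content mirrors
// structuredContent, as recommended above.
function toToolResult<T extends Record<string, unknown>>(structuredContent: T) {
  return {
    content: [{ type: 'text' as const, text: JSON.stringify(structuredContent) }],
    structuredContent,
  }
}
```

The tool handler above could then simply end with `return toToolResult({ result: json })`.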
Confirm Schema
Before we move on to the frontend stuff, let's quickly confirm that what we are returning as the structuredContent indeed conforms to what we have declared for the outputSchema!
If you have never had a chance to use the officially provided MCP Inspector, I strongly recommend it!
- Start the express server. If you are using the package.json from my GitHub, npm run dev.
- Run npx @modelcontextprotocol/inspector. This will automatically start a local server for inspection and open the following page in the browser.
Choose Streamable HTTP for Transport Type, Enter the URL, and Connect!
Under the Tools tab, choose List Tools, select the get_pokemon
tool, enter a name, and confirm that the Tool Result is Success!
If our structuredContent
does not match our outputSchema
, this will tell us exactly where the problem occurs!
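If you also want a cheap runtime sanity check later on the client side, a hand-rolled type guard might look like this (purely illustrative; the real validation comes from the outputSchema):

```typescript
// Minimal hand-rolled guard (a sketch, not part of the SDK): checks that a
// value has the { result: { name, id } } shape our outputSchema promises.
function looksLikeValidOutput(value: unknown): boolean {
  if (typeof value !== 'object' || value === null) return false
  const result = (value as Record<string, unknown>).result
  if (typeof result !== 'object' || result === null) return false
  const r = result as Record<string, unknown>
  return typeof r.name === 'string' && Number.isInteger(r.id)
}
```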
Confirmed?
Time to React!
Window.OpenAI API
I am sorry, but I just lied!
We are not ready for some UI yet!
Because!
Before that, I would like to take a second to look at this window.openai
.
This is the bridge between our frontend and ChatGPT: we use it to read agent-related state such as the tool input and output, persist component state, trigger server actions, and more!
Let’s take a look at what we can do with it one by one!
Starting with the definition!
Oh, by the way, please copy and paste the code in this section (some of it is provided by OpenAI on their GitHub, some is written by me wrapping around the provided pieces) into your web folder! Or at least the parts you need!
Yes! OpenAI is planning on publishing a public package for it, but NOT YET!
window.openai Definition
Starting with how this window.openai
is defined, because what we have for the rest of this section is just how to make use of this OpenAiGlobals
!
/**
* Global oai object injected by the web sandbox for communicating with chatgpt host page.
*/
declare global {
interface Window {
openai: API & OpenAiGlobals;
}
interface WindowEventMap {
[SET_GLOBALS_EVENT_TYPE]: SetGlobalsEvent;
}
}
export type OpenAiGlobals<
ToolInput = UnknownObject,
ToolOutput = UnknownObject,
ToolResponseMetadata = UnknownObject,
WidgetState = UnknownObject
> = {
// visuals
theme: Theme;
userAgent: UserAgent;
locale: string;
// layout
maxHeight: number;
displayMode: DisplayMode;
safeArea: SafeArea;
// state
toolInput: ToolInput;
toolOutput: ToolOutput | null;
toolResponseMetadata: ToolResponseMetadata | null;
widgetState: WidgetState | null;
setWidgetState: (state: WidgetState) => Promise<void>;
};
// currently copied from types.ts in chatgpt/web-sandbox.
// Will eventually use a public package.
type API = {
callTool: CallTool;
sendFollowUpMessage: (args: { prompt: string }) => Promise<void>;
openExternal(payload: { href: string }): void;
// Layout controls
requestDisplayMode: RequestDisplayMode;
};
export type UnknownObject = Record<string, unknown>;
export type Theme = "light" | "dark";
export type SafeAreaInsets = {
top: number;
bottom: number;
left: number;
right: number;
};
export type SafeArea = {
insets: SafeAreaInsets;
};
export type DeviceType = "mobile" | "tablet" | "desktop" | "unknown";
export type UserAgent = {
device: { type: DeviceType };
capabilities: {
hover: boolean;
touch: boolean;
};
};
/** Display mode */
export type DisplayMode = "pip" | "inline" | "fullscreen";
export type RequestDisplayMode = (args: { mode: DisplayMode }) => Promise<{
/**
* The granted display mode. The host may reject the request.
* For mobile, PiP is always coerced to fullscreen.
*/
mode: DisplayMode;
}>;
export type CallToolResponse = {
// result: the string (text) content return by the tool using { type: 'text', text: JSON.stringify(structuredContent) },
result: string;
};
/** Calling APIs */
export type CallTool = (
name: string,
args: Record<string, unknown>
) => Promise<CallToolResponse>;
/** Extra events */
export const SET_GLOBALS_EVENT_TYPE = "openai:set_globals";
export class SetGlobalsEvent extends CustomEvent<{
globals: Partial<OpenAiGlobals>;
}> {
readonly type = SET_GLOBALS_EVENT_TYPE;
}
This type of structure, defining a global object on window, should look pretty familiar if you have ever created a custom plugin provider (a JavaScript API) for a web app for other developers to call! (If you have not but are interested in creating one, please feel free to check out my article on Create Custom Plugin Provider (Javascript API) for Web App!)
Now, here is a little note I would like to make on this CallToolResponse type. In the provided definition, it only contains a result string corresponding to the content returned by the tool using { type: 'text', text: JSON.stringify(structuredContent) }, which is why I have recommended setting the text content to be the JSON.stringified version of the structuredContent. But actually, all the following keys will be included.
- content: the same content array we returned from the server
- structuredContent: again, that same structuredContent conforming to our outputSchema
- isError: like the name suggests, a boolean indicating whether the tool call succeeded or not
- meta: the _meta returned from the server tool call. We will take a more detailed look at this in a couple of seconds when registering resources, but this actually corresponds to the toolResponseMetadata in the OpenAiGlobals that we can use to pass data that should not influence the model's reasoning but is needed for rendering UIs, like the full set of locations that backs a dropdown.
- _meta: you'd think this would be the field containing the _meta returned from the server? Not based on my testing! I am not sure if it is a bug or not, but I have to be honest here: I have NO clue what this field is for, because it is always null for me!
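Since callTool only hands us that stringified result, a hypothetical parsing helper (my own, not from the SDK) keeps the JSON.parse in one place:

```typescript
// Hypothetical helper: recover the structured payload from the stringified
// `result` field of a CallToolResponse. Returns null on malformed JSON.
function parseToolResult<T>(result: string): T | null {
  try {
    return JSON.parse(result) as T
  } catch {
    return null
  }
}
```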
useOpenAiGlobal
This is a hook provided (code-wise, not package-wise yet…) by OpenAI that listens for the host's openai:set_globals events and lets React components subscribe to a single global value.
export function useOpenAiGlobal<K extends keyof OpenAiGlobals>(
key: K
): OpenAiGlobals[K] {
return useSyncExternalStore(
(onChange) => {
const handleSetGlobal = (event: SetGlobalsEvent) => {
const value = event.detail.globals[key];
if (value === undefined) {
return;
}
onChange();
};
window.addEventListener(SET_GLOBALS_EVENT_TYPE, handleSetGlobal, {
passive: true,
});
return () => {
window.removeEventListener(SET_GLOBALS_EVENT_TYPE, handleSetGlobal);
};
},
() => window.openai[key]
);
}
For example, we can use it to read tool input, output, and metadata.
export function useToolInput() {
return useOpenAiGlobal('toolInput')
}
export function useToolOutput() {
return useOpenAiGlobal('toolOutput')
}
export function useToolResponseMetadata() {
return useOpenAiGlobal('toolResponseMetadata')
}
Or we can add a little typing for our own use case here!
import { useOpenAiGlobal } from "./use-openai-global";
import { z } from 'zod'
export function useToolInput(): InputSchema | null {
const input = useOpenAiGlobal('toolInput')
if (input === null) {
return null
}
try {
return InputSchema.parse(input)
} catch (error) {
console.error(error)
return null
}
}
export const InputSchema = z.object({ name: z.string().nonempty() })
export type InputSchema = z.infer<typeof InputSchema>
import { useOpenAiGlobal } from "./use-openai-global";
import { z } from 'zod'
export function useToolOutput(): StructuredOutput | null {
const output = useOpenAiGlobal('toolOutput')
if (output === null) {
return null
}
try {
return StructuredOutput.parse(output)
} catch (error) {
console.error(error)
return null
}
}
export const StructuredOutput = z.object({
result: z.object({
name: z.string(),
id: z.number().int(),
height: z.number().int(),
weight: z.number().int(),
types: z.array(
z.object({
slot: z.number().int(),
type: z.object({
name: z.string(),
url: z.url(),
})
}).loose()
),
sprites: z.object({
back_default: z.url().nullable(),
back_female: z.url().nullable(),
back_shiny: z.url().nullable(),
back_shiny_female: z.url().nullable(),
front_default: z.url().nullable(),
front_female: z.url().nullable(),
front_shiny: z.url().nullable(),
front_shiny_female: z.url().nullable(),
}).loose()
}).loose()
})
export type StructuredOutput = z.infer<typeof StructuredOutput>
(You might wonder why my zod notation looks so different from the server side's. That's because the zod version on the server is fairly old, pinned by @modelcontextprotocol/sdk…)
setOpenAiGlobal
This is my own name for it, because OpenAI didn't mention a single word about it! I don't know why, but I decided to write it myself!
And unfortunately, it is not as simple as just setting the key on that window.openai object!
I mean, it is, but in that case our useOpenAiGlobal hook will never be invoked, which means our UI will never be updated!
export function setOpenAIGlobal<K extends keyof OpenAiGlobals>(
key: K,
value: OpenAiGlobals[K] | null
) {
if (window.openai !== null && window.openai !== undefined) {
window.openai[key] = value as any
const event = new SetGlobalsEvent(SET_GLOBALS_EVENT_TYPE, {
detail: {
globals: {
[key]: value
}
}
})
window.dispatchEvent(event)
}
}
Yes! Dispatching that SetGlobalsEvent
manually is the key!
Because, as we can see from the useOpenAiGlobal
definition, that’s when the onChange
triggered!
Widget State
Widget state can be used for persisting component state, as well as for exposing context to ChatGPT. Anything we pass to setWidgetState will be shown to the model and hydrated into window.openai.widgetState.
Here is the helper hook provided to keep host-persisted widget state aligned with local React state.
import { useCallback, useEffect, useState, type SetStateAction } from "react";
import { useOpenAiGlobal } from "./use-openai-global";
import type { UnknownObject } from "./types";
export function useWidgetState<T extends UnknownObject>(
defaultState: T | (() => T)
): readonly [T, (state: SetStateAction<T>) => void];
export function useWidgetState<T extends UnknownObject>(
defaultState?: T | (() => T | null) | null
): readonly [T | null, (state: SetStateAction<T | null>) => void];
export function useWidgetState<T extends UnknownObject>(
defaultState?: T | (() => T | null) | null
): readonly [T | null, (state: SetStateAction<T | null>) => void] {
const widgetStateFromWindow = useOpenAiGlobal("widgetState") as T;
const [widgetState, _setWidgetState] = useState<T | null>(() => {
if (widgetStateFromWindow != null) {
return widgetStateFromWindow;
}
return typeof defaultState === "function"
? defaultState()
: defaultState ?? null;
});
useEffect(() => {
_setWidgetState(widgetStateFromWindow);
}, [widgetStateFromWindow]);
const setWidgetState = useCallback(
(state: SetStateAction<T | null>) => {
_setWidgetState((prevState) => {
const newState = typeof state === "function" ? state(prevState) : state;
if (newState != null) {
window.openai.setWidgetState(newState);
}
return newState;
});
},
[window.openai.setWidgetState]
);
return [widgetState, setWidgetState] as const;
}
Note that currently everything passed to setWidgetState is shown to the model. For the best performance, it's recommended to keep this payload small and not to exceed 4k tokens.
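One way to honor that budget, sketched with hypothetical names (nothing here comes from the SDK), is to trim the full tool output down to the few fields worth persisting:

```typescript
// Hypothetical sketch: the full PokeAPI payload is huge, so persist only a
// tiny summary as widget state; the model sees everything we persist.
type PokemonOutput = { result: { name: string; id: number; [key: string]: unknown } }

function toCompactWidgetState(output: PokemonOutput) {
  return { name: output.result.name, id: output.result.id }
}
```

We could then call `setWidgetState(toCompactWidgetState(toolOutput))` instead of persisting the whole response.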
Direct MCP Tool Calls
We can use this window.openai.callTool
to make direct MCP tool calls. This can be useful for refreshing data, fetching pokemon information for another pokemon in our example, and more!
export async function getPokemon(name: string) {
const toolInput: InputSchema = { name }
setOpenAIGlobal("toolInput", toolInput)
setOpenAIGlobal("toolOutput", null)
const response = await window.openai?.callTool("get_pokemon", { name: name });
if (!response) return
if ("structuredContent" in response) {
setOpenAIGlobal("toolOutput", response["structuredContent"] as any)
} else {
const jsonResult = JSON.parse(response.result)
setOpenAIGlobal("toolOutput", jsonResult)
}
}
Two notes here.
- window.openai.callTool will not automatically update any states (any key defined in OpenAiGlobals) for us. We have to do it manually!
- The tool we are calling here needs to be marked as invocable by the component, as we will do in a couple of seconds while registering resources and embedding them!
Send follow up conversations
We can use window.openai.sendFollowUpMessage to insert a message into the conversation as if the user had asked it.
await window.openai?.sendFollowUpMessage({
prompt: "some user message.",
});
Request Alternate Layouts
For example, if our component UI needs more space, such as for maps or tables, we can request it using window.openai.requestDisplayMode.
await window.openai?.requestDisplayMode({ mode: "fullscreen" })
Use host-backed navigation
Skybridge (the sandbox runtime) mirrors the iframe’s history into ChatGPT’s UI, which means we can use standard routing APIs such as React Router and the host will keep navigation controls in sync with our component.
To set up the routing with BrowserRouter
export default function PizzaListRouter() {
return (
<BrowserRouter>
<Routes>
<Route path="/" element={<PizzaListApp />}>
<Route path="place/:placeId" element={<PizzaListApp />} />
</Route>
</Routes>
</BrowserRouter>
);
}
And to perform programmatic navigation, we can just use the useNavigate
hook.
const navigate = useNavigate();
function openDetails(placeId: string) {
navigate(`place/${placeId}`, { replace: false });
}
function closeDetails() {
navigate("..", { replace: true });
}
End of window.openai
!
Finally, some UI!
Component UI/UX
As I have mentioned, I will be using Vite with React here, styling with Tailwind, but you can choose whatever you like!
What is important here is that the entry file should mount a component into a root element (which is how React works) and read initial data from window.openai.toolOutput or persisted state.
Code for UI
This is not the important part, so let me just dump some code on you using those hooks we had above: reading tool outputs, watching for changes, and calling some tools ourselves!
main.tsx
import { createRoot } from 'react-dom/client'
import App from './App.tsx'
import './index.css'
createRoot(document.getElementById('container')!).render(<App />)
App.tsx
import { useMemo, useState } from 'react'
import { InputSchema, useToolInput } from './helpers/use-tool-input'
import { StructuredOutput, useToolOutput } from './helpers/use-tool-output'
import PokemonCard from './Card'
import useEmblaCarousel from "embla-carousel-react"
import { getPokemon } from './helpers/fetch-pokemon'
const recommended = ["pikachu", "bulbasaur", "charmander", "squirtle"]
function App() {
const toolInput: InputSchema | null = useToolInput()
const toolOutput: StructuredOutput | null = useToolOutput()
const [emblaRef, _emblaApi] = useEmblaCarousel({
align: "center",
loop: false,
containScroll: "trimSnaps",
slidesToScroll: "auto",
dragFree: false,
})
const [name, setName] = useState("")
const [error, setError] = useState("")
const isLoading: boolean = useMemo(() => {
return toolOutput === null && error === ""
}, [toolOutput, error])
const sprites: { title: string, url: string | null }[] | null = useMemo(() => {
const sprites = toolOutput?.result.sprites
if (!sprites) {
return null
}
const array = Object.keys(sprites).map(function (title) {
let url: string | null = sprites[title] as string | null
return {
title: title,
url: url
}
})
return array
}, [toolOutput])
async function getPokemonHelper(name: string) {
setError("")
try {
await getPokemon(name)
} catch (error) {
console.error(error)
setError("Oops, something went wrong! Please try again later!")
}
}
return (
<div className='flex flex-col gap-4 bg-amber-100 border rounded-md border-amber-700 text-black p-4' >
<div className='flex flex-col gap-2'>
<h2 className="font-semibold">Recommended Pokemons</h2>
<div className="antialiased relative w-full flex flex-row gap-2">
{recommended.map((pokemon) => (
<button
className="border rounded-md border-black bg-white py-1 px-2 disabled:opacity-50 disabled:cursor-not-allowed"
disabled={isLoading}
onClick={async () => await getPokemonHelper(pokemon)}
key={pokemon}
>{pokemon == "pikachu" ? `⭐ ${pokemon} ⭐` : pokemon}</button>
))}
</div>
</div>
<div className='flex flex-row gap-2 items-center'>
<h2 className="font-semibold">Search</h2>
<input onChange={(event) => setName(event.target.value)} className='py-1 px-2 border rounded-md' />
<button
className="border rounded-md border-black bg-white py-1 px-2 disabled:opacity-50 disabled:cursor-not-allowed"
disabled={isLoading}
onClick={async () => {
await getPokemonHelper(name)
setName("")
}}>Go</button>
</div>
<div className='flex flex-col gap-2'>
<h2 className="font-semibold">Pokemon: {toolInput?.name}</h2>
{
toolOutput === null ?
error !== "" ? <div className="text-red-400">{error}</div> : <div className='text-gray-400'>Loading...</div> :
<div className='flex flex-col gap-1 text-sm'>
<p>Id: {toolOutput.result.id}</p>
<p>Height: {toolOutput.result.height}</p>
<p>Weight: {toolOutput.result.weight}</p>
<p>Type: {toolOutput.result.types.map((t) => t.type.name).join(", ")}</p>
</div>
}
{
(sprites !== null && sprites.length > 0) ?
<div className="overflow-hidden" ref={emblaRef}>
<div className="flex flex-row gap-2 max-sm:mx-5 items-stretch">
{sprites?.map((sprite) => (
<PokemonCard key={sprite.title} url={sprite.url} title={sprite.title} />
))}
</div>
</div>
: null
}
</div>
</div>
)
}
export default App
Card.tsx
export type PokemonCardProps = {
title: string,
url: string | null
}
export default function PokemonCard({ title, url }: PokemonCardProps) {
if (url === null || url === undefined || typeof (url) !== "string" || url.length === 0) return null
return (
<div className="min-w-[160px] select-none max-w-[160px] w-[40vw] sm:w-[160px] self-stretch flex flex-col">
<div className="w-full">
<img
src={url}
alt={title}
className="w-full aspect-square rounded-2xl object-cover ring ring-black/5 shadow-[1px_2px_6px_rgba(0,0,0,0.06)]"
/>
</div>
<div className="mt-3 flex flex-col flex-auto">
<div className="text-base font-medium truncate line-clamp-1">{title}</div>
</div>
</div>
)
}
Test UI Locally
Obviously, we are not embedded in ChatGPT yet, so we don't have that window.openai available. But what if we want to test it locally?
Inject some values ourselves!
If you are using Vite
, you should already have an index.html
created at the root folder. Update it to something like the following. For the result
, just copy and paste one of the API responses from PokeAPI; for example, this one for pikachu.
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/vite.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>pokemon-widget</title>
<script>
window.openai = {
"toolOutput": {
"result": {...}
},
"toolInput": { "name": "pikachu" }
}
</script>
</head>
<body>
<div id="container"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>
We can then run npm run dev
to check it out!
Bundle
Now that we are done writing our React component, it is time to build it into a single JavaScript/CSS module that the server can render inline, with an HTML file as the entry point.
What do I mean by that?
Or what do we actually need here so that we can register it as an MCP server resource?
Since ChatGPT renders our UI in an iframe using the skybridge sandbox runtime, we have to provide
An HTML resource whose
mimeType
istext/html+skybridge
and whose body loads the compiled JS/CSS bundle.
Yes, that means we cannot just run tsc -b && vite build and use the index.html generated in dist, because it references assets by hashed paths such as /assets/index-D-XA_-fY.css. (You could use absolute URLs though.)
I mean, technically speaking, we COULD take the generated JS and CSS and compose the HTML (which is just a string) on the server side, but as we can see here, those file names are hashed, so we would have to change some code every time we rebuild!
So!
To build it, here are a couple of options we have.
Directly running the esbuild command:
esbuild src/main.tsx --bundle --format=esm --outfile=dist/index.js
This will bundle the JS and CSS for us. We can then create the HTML on our server side like the following.
const JS_PATH = join(__dirname, "..", "..", "web/dist/index.js")
const JS_FILE = readFileSync(JS_PATH, "utf8");
const CSS_PATH = join(__dirname, "..", "..", "web/dist/index.css")
const CSS_FILE = readFileSync(CSS_PATH, "utf8");
const HTML = [
"<!doctype html>",
"<html>",
`<head><style>${CSS_FILE}</style></head>`,
"<body>",
` <script type="module">${JS_FILE}</script>`,
` <div id="container"></div>`,
"</body>",
"</html>",
].join("\n");
Custom Build script to run Vite build
Build script: build.mts
import { build, type InlineConfig, type Plugin } from "vite"
import react from "@vitejs/plugin-react"
import fg from "fast-glob"
import path from "path"
import fs from "fs"
import tailwindcss from "@tailwindcss/vite"
const htmlName = "index.html"
const htmlRootName = "container"
const entryFileName = "src/main.tsx"
const outDir = "dist"
const PER_ENTRY_CSS_GLOB = "**/*.{css,pcss,scss,sass}"
const PER_ENTRY_CSS_IGNORE = "**/*.module.*".split(",").map((s) => s.trim())
const GLOBAL_CSS_LIST = [path.resolve("src/index.css")]
function wrapEntryPlugin(
virtualId: string,
entryFile: string,
cssPaths: string[]
): Plugin {
return {
name: `virtual-entry-wrapper:${entryFile}`,
resolveId(id) {
if (id === virtualId) return id
},
load(id) {
if (id !== virtualId) {
return null
}
const cssImports = cssPaths
.map((css) => `import ${JSON.stringify(css)}`)
.join("\n")
return `
${cssImports}
export * from ${JSON.stringify(entryFile)}
import * as __entry from ${JSON.stringify(entryFile)}
export default (__entry.default ?? __entry.App)
import ${JSON.stringify(entryFile)}
`
},
}
}
fs.rmSync(outDir, { recursive: true, force: true })
fs.mkdirSync(outDir, { recursive: true })
const builtName = path.basename(path.dirname(entryFileName))
const entryAbs = path.resolve(entryFileName)
const entryDir = path.dirname(entryAbs)
// Global CSS (Tailwind, etc.), only include those that exist
const globalCss = GLOBAL_CSS_LIST.filter((p) => fs.existsSync(p))
const perEntryCss = fg.sync(PER_ENTRY_CSS_GLOB, {
cwd: entryDir,
absolute: true,
dot: false,
ignore: PER_ENTRY_CSS_IGNORE,
}).filter((p) => !globalCss.includes(p))
// Final CSS list (global first for predictable cascade)
const cssToInclude = [...globalCss, ...perEntryCss].filter((p) =>
fs.existsSync(p)
)
const virtualId = `\0virtual-entry:${entryAbs}`
const createConfig = (): InlineConfig => ({
plugins: [
wrapEntryPlugin(virtualId, entryAbs, cssToInclude),
tailwindcss(),
react(),
{
name: "remove-manual-chunks",
outputOptions(options) {
if ("manualChunks" in options) {
delete (options as any).manualChunks
}
return options
},
},
],
esbuild: {
jsx: "automatic",
jsxImportSource: "react",
target: "es2022",
},
build: {
target: "es2022",
outDir,
emptyOutDir: false,
chunkSizeWarningLimit: 2000,
minify: "esbuild",
cssCodeSplit: false,
rollupOptions: {
input: virtualId,
output: {
format: "es",
entryFileNames: `${builtName}.js`,
inlineDynamicImports: true,
assetFileNames: (info) => {
const name = info.names[0]
const modified = (name || "").endsWith(".css")
? `${name}`
: `[name]-[hash][extname]`
return modified
}
},
preserveEntrySignatures: "allow-extension",
treeshake: true,
},
},
})
console.log(`Building ${builtName} (react)`)
await build(createConfig())
console.log(`Built ${builtName}`)
const htmlPath = path.join(outDir, htmlName)
// css get renamed sometimes
const cssPaths = fg.sync(`${outDir}/**/*.css`)
const jsPaths = fg.sync(`${outDir}/**/*.js`)
let cssBlock: string = ""
for (const cssPath of cssPaths) {
const css = fs.existsSync(cssPath)
? fs.readFileSync(cssPath, { encoding: "utf8" })
: ""
cssBlock += css ? `\n <style>\n${css}\n </style>\n` : ""
}
let jsBlock: string = ""
for (const jsPath of jsPaths) {
const js = fs.existsSync(jsPath)
? fs.readFileSync(jsPath, { encoding: "utf8" })
: ""
jsBlock += js ? `\n <script type="module">\n${js}\n </script>` : ""
}
const html = [
"<!doctype html>",
"<html>",
`<head>${cssBlock}</head>`,
"<body>",
` <div id="${htmlRootName}"></div>${jsBlock}`,
"</body>",
"</html>",
].join("\n")
fs.writeFileSync(htmlPath, html, { encoding: "utf8" })
console.log(`${htmlPath} (generated)`)
To build it, we can then run tsx ./build.mts. This will also generate the HTML for us, so we can just read it in on our server side directly.
const HTML_PATH = join(__dirname, "..", "..", "web/dist/index.html")
const HTML = readFileSync(HTML_PATH, "utf8");
Since I am using Vite here, I went for this approach, but it actually uses esbuild under the hood anyway!
Regardless of which approach you take, make sure the id used in the HTML is the same as the one we have mounted our React App onto! In my case, container!
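To guard against that mount id silently drifting between the build script and the React entry, a tiny sanity check can help. This is just a sketch; MOUNT_ID and hasMountPoint are hypothetical helper names, assuming the root div id is container as above.

```typescript
// Sanity-check sketch: verify the generated HTML actually contains the
// mount point our React entry expects ("container" in this project).
// MOUNT_ID and hasMountPoint are hypothetical helper names.
const MOUNT_ID = "container"

function hasMountPoint(html: string, id: string = MOUNT_ID): boolean {
  return html.includes(`id="${id}"`)
}

console.log(hasMountPoint(`<div id="container"></div>`)) // → true
```

You could run this right after writing the generated HTML, failing the build early instead of shipping a widget that renders nothing.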
By the way, if you are running into Error: Cannot find module '@rollup/rollup-darwin-arm64' while building, simply clean everything and reinstall!
rm -rf node_modules
rm package-lock.json
npm install
Register & Use UI Components on MCP Server
Last bit before we can deploy our ChatGPT App!
Two things we need to do here!
- Register our UI components (that HTML) as an MCP Server Resource.
- Modify tools to return components UI in addition to structured content with embedded resource
First thing first, let’s register our resource!
const VERSION = "1.0.0"
const BASE_RESOURCE_URI = "ui://widget/pokemon-board.html"
const RESOURCE_URL = `${BASE_RESOURCE_URI}?version=${VERSION}`
const RESOURCE_MIME_TYPE = "text/html+skybridge"
const HTML_PATH = join(__dirname, "..", "..", "web/dist/index.html")
const HTML = readFileSync(HTML_PATH, "utf8")
server.registerResource(
"pokemon-widget",
RESOURCE_URL,
{},
async () => ({
contents: [
{
uri: RESOURCE_URL,
mimeType: RESOURCE_MIME_TYPE,
text: HTML,
},
],
})
)
Three things here.
- mimeType: text/html+skybridge for the sandbox runtime.
- Resource URI: I have included a version here because ChatGPT caches templates aggressively, so unique URIs help us prevent stale assets from loading.
- No inline data assignment: we are not setting anything related to window.openai! The host, i.e. ChatGPT, will inject the data for us!
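By the way, instead of hand-bumping VERSION on every release, one option is to derive it from the built HTML itself, so the resource URI changes automatically whenever the bundle changes. A sketch (contentVersion is a hypothetical helper):

```typescript
import { createHash } from "node:crypto"

// Derive a short, deterministic version string from the built HTML so the
// resource URI (and therefore ChatGPT's template cache) is busted
// automatically on every rebuild. contentVersion is a hypothetical helper.
function contentVersion(html: string): string {
  return createHash("sha256").update(html).digest("hex").slice(0, 8)
}

// Usage sketch, mirroring the constants above:
// const RESOURCE_URL = `${BASE_RESOURCE_URI}?version=${contentVersion(HTML)}`
```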
We can then link our tool to the template by setting _meta["openai/outputTemplate"]
to the same resource URI.
// Add list pokemons tool
server.registerTool(
'get_pokemon',
{
// ...
_meta: {
"openai/outputTemplate": RESOURCE_URL,
"openai/toolInvocation/invoking": "Invoking...",
"openai/toolInvocation/invoked": "Invoked!",
      // Allow component-initiated tool access: https://developers.openai.com/apps-sdk/build/mcp-server#allow-component-initiated-tool-access
"openai/widgetAccessible": true
},
// ...
},
async ({ name }) => {
    if (name.length === 0) {
throw new Error("Pokemon name cannot be empty.")
}
const response = await fetch(`https://pokeapi.co/api/v2/pokemon/${name}`)
if (!response.ok) {
throw new Error(`HTTP error. status: ${response.status}`)
}
const json = await response.json()
const structuredContent = {
result: json
}
return {
// ...
// The _meta property/parameter is reserved by MCP to allow clients and servers to attach additional metadata to their interactions.
// This allows us to define Arbitrary JSON passed only to the component.
// Use it for data that should not influence the model’s reasoning, like the full set of locations that backs a dropdown.
      // _meta is never shown to the model.
// _meta: {
// "key": "value"
// }
}
}
)
Here I have also set openai/widgetAccessible to true so that our tool can be invoked from our frontend! There are also other optional _meta fields that let us declare properties such as security schemes.
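On the frontend side, component-initiated calls go through the injected window.openai object and its callTool method. Here is a minimal, hedged sketch; fetchPokemonFromWidget is a hypothetical wrapper, and outside the ChatGPT sandbox there is no injected window.openai, so it simply bails out.

```typescript
// Sketch of component-initiated tool access. Inside the ChatGPT sandbox,
// the host injects window.openai; callTool lets the widget invoke tools
// that declare "openai/widgetAccessible": true. Outside the sandbox
// (plain Node, tests) the global is missing, so we return null instead.
type OpenAiGlobal = {
  callTool: (name: string, args: Record<string, unknown>) => Promise<unknown>
}

async function fetchPokemonFromWidget(name: string): Promise<unknown | null> {
  const openai = (globalThis as unknown as { openai?: OpenAiGlobal }).openai
  if (!openai) return null // not running inside the ChatGPT sandbox
  return openai.callTool("get_pokemon", { name })
}
```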
And as I have mentioned a little bit above, for the data our tool returns, in addition to structuredContent and content, we can also use this _meta field to pass data to the component without exposing it to the model.
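Put together, the shape of a full tool result looks roughly like this. A sketch with made-up Pokemon data: content is for the model, structuredContent hydrates both the model and the component, and _meta reaches only the component.

```typescript
// Sketch of a complete tool result payload (hypothetical data).
// content           -> shown to the model
// structuredContent -> available to both the model and the component
// _meta             -> delivered only to the component, never to the model
const toolResult = {
  content: [{ type: "text" as const, text: "Found pikachu!" }],
  structuredContent: { result: { name: "pikachu", id: 25 } },
  _meta: { fullMoveList: ["thunderbolt", "quick-attack", "iron-tail"] },
}

console.log(Object.keys(toolResult)) // → [ 'content', 'structuredContent', '_meta' ]
```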
Deploy & Connect!
Yes, we are here!
To connect our MCP server to ChatGPT, we need a stable HTTPS endpoint!
Here are some of the common deployment options.
- Managed containers: Fly.io, Render, or Railway for quick spin-up and automatic TLS.
- Cloud serverless: Lambda, Google Cloud Run, or Azure Container Apps if we need scale-to-zero. Whichever you choose, be aware that cold starts can interrupt streaming HTTP!
- Kubernetes
But!
For testing, a local server with ngrok is more than enough!
So!
If your server is not started yet, get it running and run ngrok http 3000, assuming you are listening on port 3000!
You should see something like the following. That Forwarding HTTPS URL is what we will be using!
Connect From ChatGPT
Now, let’s head to ChatGPT Console to create our connector for our little MCP Server!
Settings > Apps & Connectors > Create
Enter a name and a description you like.
For the MCP server URL, make sure to include the /mcp path, or whatever path you have for handling MCP POST requests! Here I will not have any authentication involved, but of course, you can authenticate users with OAuth 2.0!
Choose Create!
If the connection succeeds we will see a list of the tools our server advertises. If it fails, use the Testing guide to debug with MCP Inspector or the API Playground.
Call!
We could just prompt ChatGPT as we always do (I don’t really use AI, so it’s not what I always do... but!), but we can also enforce it by clicking the + button near the message composer and selecting the tool.
Let’s go!
Yeah!!!
We can also test it with other clients
- API Playground: visit https://platform.openai.com/playground, open Tools → Add → MCP Server, and paste the same HTTPS endpoint. This is useful when we want raw request/response logs.
- Mobile clients: once the connector is linked on web, it is available on the ChatGPT mobile apps as well. Automatically! And we can use it to test mobile layouts if our component has custom controls.
Also, if we decide to make any code modifications:
- Rebuild the component bundle (npm run build) if it is the frontend.
- Restart the MCP server.
- Refresh the connector in ChatGPT settings to pull the latest metadata.
Thank you for reading!
That’s it for this article!
Again, feel free to grab everything from my GitHub!
I don’t know about you, but while I was making this, I started wondering if I could just turn any web service into a ChatGPT App by simply throwing in another iframe within it!
I guess if your service supports CORS to allow that, it might work, but obviously, most do not!
Why can’t we just read in the URL content and register it directly with the server as the resource? Because it is probably not bundled, or is using relative JavaScript or CSS paths for rendering!
Anyway!
Happy App making!