refactor-image-gen-mcp-schema
Upstream image models may still return base64, but the MCP tool should not pass that base64 back to clients by default.
The connector decodes upstream base64 into bytes, writes those bytes to a private R2 bucket, and returns an MCP
resource_link with an expiring HTTPS gateway URL.
- /mcp/image-gen with existing JWT/API-key middleware.
- userId and optional apiKeyId.
- resource_link.uri.

| In this PR | Out of this PR |
|---|---|
| Convert upstream base64 outputs into R2-backed artifacts. | Video generation or broader model-selection changes. |
| Return resource_link and metadata-only assets[]. | Default inline base64 delivery. |
| Add a private R2 binding and a gateway download route. | Raw R2 presigned URLs as user-facing links. |
| Use signed URL tokens plus R2 metadata for minimal artifact state. | D1/KV artifact metadata tables unless refresh/list/audit becomes necessary. |
The standard MCP result is content plus optional structured content. Inline media blocks carry bytes; linked resources do not.
interface CallToolResult {
  content: ContentBlock[];
  structuredContent?: { [key: string]: unknown };
  isError?: boolean;
  _meta?: { [key: string]: unknown };
}
| Content block | Media behavior | Use here? |
|---|---|---|
| image | Requires base64 data. | Only explicit inline mode later. |
| audio | Requires base64 data. | Only explicit inline mode later. |
| resource | Embedded binary resources require base64 blob. | No, same context-size issue. |
| resource_link | References media by URI without embedding bytes. | Yes, default artifact contract. |
The output stays small: text plus linked resources in content, with normalized metadata in structuredContent.assets.
{
"content": [
{
"type": "text",
"text": "Generated 1 image with openai/gpt-image-2."
},
{
"type": "resource_link",
"name": "generated-image-1",
"title": "Generated image 1",
"uri": "https://gateway.example.com/artifacts/art_123?token=...",
"mimeType": "image/png",
"size": 1234567
}
],
"structuredContent": {
"model": "openai/gpt-image-2",
"assets": [
{
"id": "art_123",
"kind": "image",
"mimeType": "image/png",
"size": 1234567,
"width": 1024,
"height": 1024,
"uri": "https://gateway.example.com/artifacts/art_123?token=...",
"expiresAt": "2026-05-13T20:00:00Z"
}
]
}
}
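As a rough sketch of how the example result above could be assembled from stored artifact metadata: the names `ArtifactAsset`, `ImageGenResult`, and `buildImageGenResult` are hypothetical and not part of this PR, and the field set simply mirrors the JSON example.

```typescript
// Hypothetical shape of one stored artifact; fields mirror the JSON example.
interface ArtifactAsset {
  id: string;
  kind: "image";
  mimeType: string;
  size: number;
  width: number;
  height: number;
  uri: string;
  expiresAt: string;
}

type ContentBlock =
  | { type: "text"; text: string }
  | {
      type: "resource_link";
      name: string;
      title: string;
      uri: string;
      mimeType: string;
      size: number;
    };

interface ImageGenResult {
  content: ContentBlock[];
  structuredContent: { model: string; assets: ArtifactAsset[] };
}

// Build the default success result: one text block, one resource_link per
// asset, and metadata-only assets[]. No base64 field is ever attached.
function buildImageGenResult(model: string, assets: ArtifactAsset[]): ImageGenResult {
  const noun = assets.length === 1 ? "image" : "images";
  const links: ContentBlock[] = assets.map((asset, i) => ({
    type: "resource_link",
    name: `generated-image-${i + 1}`,
    title: `Generated image ${i + 1}`,
    uri: asset.uri,
    mimeType: asset.mimeType,
    size: asset.size,
  }));
  return {
    content: [
      { type: "text", text: `Generated ${assets.length} ${noun} with ${model}.` },
      ...links,
    ],
    structuredContent: { model, assets },
  };
}
```

Keeping the builder free of any byte-carrying parameter makes it hard to leak base64 into the default output by accident.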
Keep structuredContent.images[] briefly as metadata-only records. Do not include data in the default success output.
There is no filesystem step in the Worker. Decode upstream base64 in memory, write the bytes to R2, then discard the base64.
upstream b64_json
-> decode to Uint8Array / ArrayBuffer
-> IMAGE_ARTIFACTS.put(r2Key, bytes, { httpMetadata, customMetadata })
-> return resource_link + assets[] metadata
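The flow above might look roughly like the following. This is a sketch under assumptions: `ArtifactBucket` is a hand-written structural subset of an R2 binding (only the `put` call with `httpMetadata` and `customMetadata`), and `decodeBase64` and `storeUpstreamImage` are hypothetical helper names. Which fields go into `customMetadata` is also an assumption.

```typescript
// Structural subset of an R2 bucket binding: only the put() call used here.
interface ArtifactBucket {
  put(
    key: string,
    value: Uint8Array,
    options: {
      httpMetadata: { contentType: string };
      customMetadata: Record<string, string>;
    },
  ): Promise<unknown>;
}

// Decode standard base64 (as found in b64_json) into raw bytes, in memory.
function decodeBase64(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical helper: write one upstream image to the bucket and return its
// size in bytes. The base64 string is not retained after this call returns.
async function storeUpstreamImage(
  bucket: ArtifactBucket,
  r2Key: string,
  b64Json: string,
  mimeType: string,
  owner: { userId: string; apiKeyId?: string },
): Promise<number> {
  const bytes = decodeBase64(b64Json);
  const customMetadata: Record<string, string> = { userId: owner.userId };
  if (owner.apiKeyId) {
    customMetadata.apiKeyId = owner.apiKeyId;
  }
  await bucket.put(r2Key, bytes, {
    httpMetadata: { contentType: mimeType },
    customMetadata,
  });
  return bytes.byteLength;
}
```

In the Worker, `bucket` would presumably be the IMAGE_ARTIFACTS binding; in tests, an in-memory object with a `put` method is enough.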
artifacts/{yyyy}/{mm}/{dd}/{artifactId}/{index}.{extension}
Keys must not contain prompts, provider filenames, user-controlled path fragments, bucket names, or account IDs.
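One way to enforce the key rules is to build keys only from server-generated parts. The sketch below assumes artifact IDs look like `art_123` (the `art_` prefix and allowed characters are inferred from the examples, not specified), and that the extension is derived from an allow-listed MIME type. `buildArtifactKey` and the MIME map are hypothetical.

```typescript
// Assumed ID shape, based on examples such as "art_123". Restricting the
// character set means an ID can never contain "/" or "..".
const ARTIFACT_ID_PATTERN = /^art_[A-Za-z0-9]+$/;

// Hypothetical allow-list: the extension comes from the MIME type, never from
// a provider filename or any user-controlled string.
const EXTENSION_BY_MIME = new Map<string, string>([
  ["image/png", "png"],
  ["image/jpeg", "jpg"],
  ["image/webp", "webp"],
]);

// Build artifacts/{yyyy}/{mm}/{dd}/{artifactId}/{index}.{extension} using UTC.
function buildArtifactKey(
  artifactId: string,
  index: number,
  mimeType: string,
  now: Date,
): string {
  if (!ARTIFACT_ID_PATTERN.test(artifactId)) {
    throw new Error("artifactId must be an opaque server-generated ID");
  }
  if (!Number.isInteger(index) || index < 0) {
    throw new Error("index must be a non-negative integer");
  }
  const extension = EXTENSION_BY_MIME.get(mimeType);
  if (extension === undefined) {
    throw new Error(`unsupported MIME type: ${mimeType}`);
  }
  const yyyy = String(now.getUTCFullYear());
  const mm = String(now.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(now.getUTCDate()).padStart(2, "0");
  return `artifacts/${yyyy}/${mm}/${dd}/${artifactId}/${index}.${extension}`;
}
```

For example, `buildArtifactKey("art_123", 0, "image/png", new Date("2026-05-13T19:00:00Z"))` yields `artifacts/2026/05/13/art_123/0.png`.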
- artifactId, R2 key, kind, MIME type, size, width, height.
- userId and optional apiKeyId.

For this PR, that metadata can be carried in the signed gateway URL token and R2 object metadata. Add a D1/KV artifact table later only if refresh, listing, audit, or revocation requires server-side lookup.
- /mcp/image-gen already runs behind existing identity middleware.
- userId from the Keycloak sub claim.
- userId and optional apiKeyId.
- resource_link.uri is fetched like a normal HTTPS URL.

Returned gateway URLs are bearer credentials. Default URL TTL should be short, such as 15-60 minutes. The R2 object can live longer, but this PR can lean on R2 lifecycle cleanup instead of building a full artifact management subsystem.
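A minimal sketch of a signed, expiring gateway token, assuming an HMAC-SHA256 scheme of the form `payload.signature`. The PR does not specify the token format, so the claim names, `signArtifactToken`, and `verifyArtifactToken` are all hypothetical. It uses Node's `crypto` module so it runs standalone; a Worker would more likely use Web Crypto or the Node compatibility layer.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claims carried in the URL token instead of a D1/KV table.
interface ArtifactTokenClaims {
  artifactId: string;
  r2Key: string;
  userId: string;
  apiKeyId?: string;
  scope: "read";
  exp: number; // expiry as Unix seconds
}

type VerifyResult =
  | { ok: true; claims: ArtifactTokenClaims }
  | { ok: false; code: "artifact_forbidden" | "artifact_url_expired" };

function hmac(secret: string, payload: string): Buffer {
  return createHmac("sha256", secret).update(payload).digest();
}

// Token format (assumed): base64url(JSON claims) + "." + base64url(HMAC).
function signArtifactToken(claims: ArtifactTokenClaims, secret: string): string {
  const payload = Buffer.from(JSON.stringify(claims), "utf8").toString("base64url");
  return `${payload}.${hmac(secret, payload).toString("base64url")}`;
}

// Validate signature, artifact binding, read scope, and expiry, in that order,
// before any R2 read would happen.
function verifyArtifactToken(
  token: string,
  artifactId: string,
  secret: string,
  nowSeconds: number,
): VerifyResult {
  const [payload, signature, ...rest] = token.split(".");
  if (!payload || !signature || rest.length > 0) {
    return { ok: false, code: "artifact_forbidden" };
  }
  const expected = hmac(secret, payload);
  const actual = Buffer.from(signature, "base64url");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    return { ok: false, code: "artifact_forbidden" };
  }
  // The signature matched, so the payload was produced by signArtifactToken.
  const claims = JSON.parse(
    Buffer.from(payload, "base64url").toString("utf8"),
  ) as ArtifactTokenClaims;
  if (claims.artifactId !== artifactId || claims.scope !== "read") {
    return { ok: false, code: "artifact_forbidden" };
  }
  if (claims.exp <= nowSeconds) {
    return { ok: false, code: "artifact_url_expired" };
  }
  return { ok: true, claims };
}
```

With a 15-minute TTL, `exp` would be set to the issue time plus 900 seconds. Because the token itself is the credential, the download route needs no Keycloak bearer header.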
A future refresh tool can use normal Keycloak/API-key identity to issue a fresh URL while the artifact is retained:
{
"name": "get_artifact_url",
"arguments": { "id": "art_123" }
}
- artifact_storage_failed: Generation succeeded, but R2 upload or metadata signing failed. Do not fall back to base64.
- artifact_url_expired: The signed gateway URL token expired.
- artifact_forbidden: The token is invalid for this artifact or scope.
- artifact_not_found: The R2 object is missing or was already cleaned up.
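The error codes above could be modeled as a closed union, with the storage failure surfaced as an MCP tool error. This is only a sketch: the `structuredContent.error` shape and the helper names `toolError` and `storageFailureResult` are assumptions, not something this PR defines.

```typescript
type ArtifactErrorCode =
  | "artifact_storage_failed"
  | "artifact_url_expired"
  | "artifact_forbidden"
  | "artifact_not_found";

interface ToolErrorResult {
  content: Array<{ type: "text"; text: string }>;
  structuredContent: { error: { code: ArtifactErrorCode } }; // assumed shape
  isError: true;
}

// Build a tool error result that carries only a code and a message.
function toolError(code: ArtifactErrorCode, message: string): ToolErrorResult {
  return {
    content: [{ type: "text", text: `${code}: ${message}` }],
    structuredContent: { error: { code } },
    isError: true,
  };
}

// Storage failed after a successful generation: report the error and drop the
// upstream base64 rather than returning it inline.
function storageFailureResult(): ToolErrorResult {
  return toolError(
    "artifact_storage_failed",
    "Generation succeeded, but the artifact could not be stored.",
  );
}
```

Because the error builder takes no image bytes at all, a base64 fallback cannot be added without changing its signature.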
| Area | Assertions |
|---|---|
| Tool output | resource_link blocks exist; no default base64 image.data, audio.data, resource.blob, or assets[].data. |
| R2 write | Decoded bytes, correct content type, opaque key, and expected object metadata are written. |
| Identity | Artifact ownership uses existing Keycloak/API-key middleware values. |
| Download | Gateway token signature, expiry, and read scope are validated before R2 is read. |
| Keycloak boundary | Direct artifact download does not require a Keycloak bearer header and does not expose a Keycloak token in the URL. |
| Failure | Storage failure returns artifact_storage_failed and never falls back to base64. |
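The "Tool output" assertion could be backed by a small checker like the one below. It is a sketch with a hypothetical name, `findInlineMedia`, and a deliberately loose `ToolOutput` type so that it can inspect arbitrary results.

```typescript
interface ToolOutput {
  content: Array<Record<string, unknown>>;
  structuredContent?: { assets?: Array<Record<string, unknown>> };
}

// Return the paths of any base64-carrying fields that must not appear in the
// default output: image.data, audio.data, resource.blob, and assets[].data.
function findInlineMedia(output: ToolOutput): string[] {
  const hits: string[] = [];
  output.content.forEach((block, i) => {
    if ((block.type === "image" || block.type === "audio") && "data" in block) {
      hits.push(`content[${i}].data`);
    }
    const resource = block.resource;
    if (
      block.type === "resource" &&
      typeof resource === "object" &&
      resource !== null &&
      "blob" in resource
    ) {
      hits.push(`content[${i}].resource.blob`);
    }
  });
  const assets = output.structuredContent?.assets ?? [];
  assets.forEach((asset, i) => {
    if ("data" in asset) {
      hits.push(`structuredContent.assets[${i}].data`);
    }
  });
  return hits;
}
```

A test can then assert that `findInlineMedia(result)` is empty for every default success result and for the `artifact_storage_failed` error path.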