Multimodal agentic checkout that maps natural language, voice, photos, and video to catalogue items, enriched with user context and orchestrated across merchants.
From prompt to payment — context enrichment, catalog resolution, and the agentic wallet in action.
Order chicken breasts, basmati rice, coconut milk, curry paste… I also want to make a carbonara this weekend. Plus the usual weekly essentials. Saturday morning delivery.
Text, voice, photo, or video: the checkout API accepts any human input and resolves it to structured catalogue items.
“Add organic milk and free-range eggs to my weekly shop”
input modalities: text, voice, photo, video
prompt-to-cart resolution time
merchant catalogues indexed
item match accuracy with context enrichment
One API adapts to every commerce vertical and input modality.
Resolved 8 items from order history
Matched to Waitrose + Ocado catalogues
Swapped peanut butter → sunflower seed butter
Nut allergy detected from context profile
Split across 2 merchants for best price
Saved £3.40 vs single-merchant
Auto-approved via agentic wallet
£47.00 — under £150 threshold
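The auto-approval step above can be sketched as a simple threshold check. This is a minimal illustration, assuming amounts are held in pence and that the wallet carries a per-order limit. The names `WalletPolicy` and `decideApproval`, and the `requires_confirmation` status, are hypothetical and not taken from the Hyperfold SDK.

```typescript
// Hypothetical types for illustration; not part of the Hyperfold SDK.
interface WalletPolicy {
  autoApproveLimitPence: number; // e.g. 15000 for a £150 threshold
}

// "auto_approved" appears in the sample response; "requires_confirmation" is assumed.
type ApprovalDecision = "auto_approved" | "requires_confirmation";

// Approve automatically only when the order total is strictly under the limit.
function decideApproval(totalPence: number, policy: WalletPolicy): ApprovalDecision {
  return totalPence < policy.autoApproveLimitPence
    ? "auto_approved"
    : "requires_confirmation";
}

const policy: WalletPolicy = { autoApproveLimitPence: 15000 };
const smallOrder = decideApproval(4700, policy); // the £47.00 order from the walkthrough
const largeOrder = decideApproval(18000, policy);
```

Working in integer pence avoids floating-point rounding when totals are compared against the limit.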
{
  "checkout_id": "chk_abc123",
  "input": {
    "type": "text",
    "prompt": "Order my usual weekly shop for Saturday"
  },
  "resolved_items": 8,
  "context_enrichments": 2,
  "merchants": ["waitrose", "ocado"],
  "total": "GBP 47.00",
  "payment": {
    "wallet_id": "wal_7kx9m2",
    "status": "auto_approved"
  }
}
Text, voice, photo, and video inputs are resolved to structured catalogue items via a single API. The same checkout endpoint handles a typed prompt, a voice command in a game, or a camera feed from smart glasses.
User preferences, dietary restrictions, sizes, and order history are automatically applied during item resolution. No manual filtering — the checkout API queries the context profile and enriches every item.
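One way to picture context enrichment is as a pass over the resolved items that swaps anything conflicting with the profile. The sketch below is an assumption-laden illustration: the `ContextProfile`, `ResolvedItem`, and `Enrichment` shapes and the `enrichItems` function are hypothetical, since the real context profile schema is not shown here.

```typescript
// Hypothetical shapes for illustration; the real context profile schema is not documented here.
interface ContextProfile {
  allergens: string[];                   // e.g. ["nuts"]
  substitutions: Record<string, string>; // item name -> preferred safe alternative
}

interface ResolvedItem {
  name: string;
  allergens: string[];
}

interface Enrichment {
  from: string;
  to: string;
  reason: string;
}

function enrichItems(
  items: ResolvedItem[],
  profile: ContextProfile,
): { items: ResolvedItem[]; enrichments: Enrichment[] } {
  const enrichments: Enrichment[] = [];
  const enriched = items.map((item) => {
    const conflict = item.allergens.find((a) => profile.allergens.includes(a));
    const replacement = profile.substitutions[item.name];
    // Items with no conflict, or with no known substitute, pass through unchanged
    // in this sketch; a real system would likely flag the latter for review.
    if (conflict === undefined || replacement === undefined) return item;
    enrichments.push({ from: item.name, to: replacement, reason: `${conflict} allergy` });
    // The replacement is assumed to be free of the conflicting allergen.
    return { name: replacement, allergens: [] };
  });
  return { items: enriched, enrichments };
}

const result = enrichItems(
  [
    { name: "peanut butter", allergens: ["nuts"] },
    { name: "basmati rice", allergens: [] },
  ],
  { allergens: ["nuts"], substitutions: { "peanut butter": "sunflower seed butter" } },
);
```

Returning the list of enrichments alongside the items mirrors the `context_enrichments` count in the sample response and keeps each swap explainable to the user.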
One checkout, many merchants. Items are split by availability and price across merchant catalogues. Each merchant fulfilment is tracked independently with its own status timeline.
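Splitting by availability and price could look something like the greedy sketch below: for each item, take the cheapest in-stock offer and group the picks by merchant. The `Offer` shape, the `splitByMerchant` function, and the merchant names and prices are all made up for illustration. The sketch also ignores delivery fees and minimum basket sizes, which a real optimiser would need to weigh.

```typescript
// Hypothetical offer shape; merchant names and prices below are sample data.
interface Offer {
  merchant: string;
  item: string;
  pricePence: number;
  inStock: boolean;
}

// For each requested item, pick the cheapest in-stock offer and group picks by merchant.
function splitByMerchant(items: string[], offers: Offer[]): Map<string, Offer[]> {
  const baskets = new Map<string, Offer[]>();
  for (const item of items) {
    const candidates = offers
      .filter((o) => o.item === item && o.inStock)
      .sort((a, b) => a.pricePence - b.pricePence);
    const best = candidates[0];
    if (best === undefined) continue; // unavailable everywhere: skipped in this sketch
    const basket = baskets.get(best.merchant) ?? [];
    basket.push(best);
    baskets.set(best.merchant, basket);
  }
  return baskets;
}

const baskets = splitByMerchant(
  ["milk", "eggs"],
  [
    { merchant: "merchant_a", item: "milk", pricePence: 120, inStock: true },
    { merchant: "merchant_b", item: "milk", pricePence: 110, inStock: true },
    { merchant: "merchant_a", item: "eggs", pricePence: 250, inStock: true },
    { merchant: "merchant_b", item: "eggs", pricePence: 230, inStock: false },
  ],
);
```

Here the milk goes to `merchant_b` on price, while the eggs go to `merchant_a` because the cheaper offer is out of stock, giving one basket per merchant that can then be tracked separately.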
Create, resolve, update, and fulfil — clean interfaces that handle multimodal input, context enrichment, and multi-merchant orchestration behind the scenes.
import Hyperfold from "hyperfold";
const hf = new Hyperfold({ apiKey: "hf_live_..." });
const checkout = await hf.checkout.create({
  user_id: "usr_abc123",
  input: {
    type: "text",
    prompt: "Order my usual weekly groceries for Saturday",
  },
  // Apply the user's context profile during item resolution
  context: true,
  options: {
    // Allow the order to be split across merchant catalogues
    multi_merchant: true,
    price_optimise: true,
    delivery_window: "saturday_am",
  },
});
Schema reference for the core Checkout endpoints: parameters, request bodies, and response shapes.
POST /v1/checkout/sessions
Initialise a checkout session from text, voice, or image input. The API resolves items from your catalogue, enriches them with user context, and orchestrates across multiple merchants, returning a priced session ready for payment.
Initialise a new checkout session from any input modality.
The customer initiating the checkout
Prevents duplicate session creation
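To illustrate how duplicate-prevention typically works on retries, the sketch below builds a session-creation request that carries a caller-supplied key. Only the path `POST /v1/checkout/sessions` comes from this page. The `Idempotency-Key` header name, the body fields (borrowed from the SDK example), and the `buildCreateSessionRequest` helper are assumptions, not confirmed API details.

```typescript
// Sketch only: the header name and body fields are assumptions, not confirmed API details.
interface SessionRequest {
  method: "POST";
  path: string;
  headers: Record<string, string>;
  body: string;
}

function buildCreateSessionRequest(
  userId: string,
  prompt: string,
  idempotencyKey: string,
): SessionRequest {
  return {
    method: "POST",
    path: "/v1/checkout/sessions",
    headers: {
      "Content-Type": "application/json",
      // Assumed header name; reuse the same key when retrying the same logical request.
      "Idempotency-Key": idempotencyKey,
    },
    body: JSON.stringify({ user_id: userId, input: { type: "text", prompt } }),
  };
}

// A retry after a timeout sends the same key, so the server can recognise the duplicate.
const first = buildCreateSessionRequest("usr_abc123", "Order my usual weekly shop", "key-001");
const retry = buildCreateSessionRequest("usr_abc123", "Order my usual weekly shop", "key-001");
```

The key is generated once per logical checkout attempt by the caller, not per HTTP call, which is what lets a retried request map back to the original session.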
POST /v1/checkout/sessions/{session_id}/items
Append items to an active session using any input modality. Send a photo of a shelf, a voice note, or catalogue IDs — the API resolves each item with confidence scores and merchant attribution.
Add items to an existing checkout session.
The checkout session to add items to
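A client might use the per-item confidence scores to decide which matches to accept and which to confirm with the user. The following is a minimal sketch under assumed field names: `ResolvedLine`, its `confidence` range of 0 to 1, and `partitionByConfidence` are hypothetical, and the sample values are invented.

```typescript
// Hypothetical response item shape; field names are assumptions for illustration.
interface ResolvedLine {
  catalogueId: string;
  merchant: string;
  confidence: number; // assumed to be in the range 0..1
}

// Split resolved lines into those at or above the threshold and those needing review.
function partitionByConfidence(lines: ResolvedLine[], minConfidence: number) {
  const accepted: ResolvedLine[] = [];
  const needsReview: ResolvedLine[] = [];
  for (const line of lines) {
    (line.confidence >= minConfidence ? accepted : needsReview).push(line);
  }
  return { accepted, needsReview };
}

const partitioned = partitionByConfidence(
  [
    { catalogueId: "sku_001", merchant: "merchant_a", confidence: 0.96 },
    { catalogueId: "sku_002", merchant: "merchant_b", confidence: 0.58 },
  ],
  0.8,
);
```

Low-confidence matches, for example from a blurry shelf photo, can then be surfaced to the user for confirmation instead of being added silently.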
POST /v1/checkout/sessions/{session_id}/place
Submit the session for fulfilment. The API authorises payment via the linked wallet, splits the order across merchants if needed, and returns fulfilment tracking for each shipment.
Finalise the session and place the order.
The session to place
Text, voice, photo, or video: the checkout API resolves any input to structured catalogue items, enriches them with user context, and orchestrates payment across merchants. One call. Any device. Any modality.