Early this morning, news of OpenAI's full acquisition of io dominated the headlines. At the same time, OpenAI quietly shipped another significant announcement: the Responses API, its core API for building agents, now supports remote MCP (Model Context Protocol) servers.
Traditionally, building an agent meant wiring up external services through function calls. Each operation traveled from the model to your backend and then on to the external service, resulting in multiple network hops, high latency, and extra complexity in scaling and management.
Now that the Responses API supports MCP, developers no longer need to hand-wire each function call to a specific service. Instead, they can point the model at one or more MCP servers.
Responses API supports MCP
Since OpenAI released the Responses API, hundreds of thousands of developers have processed trillions of tokens through it and built a wide range of agent applications, including Zencoder's coding agent, Revi's market intelligence agent for private equity and investment banking, and MagicSchool AI's education agent.
To further simplify agent development, the Responses API now supports MCP services, allowing developers to connect their agents to powerful external tools and services with just a few lines of code.
For example, with just 9 lines of code, you can link your agent to the e-commerce platform Shopify.
In the past, we needed to write custom cart_add or create_payment_link wrappers and host our own relay servers. Now everything is handled by pointing the model at a single MCP server.
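The article does not reproduce OpenAI's exact snippet, but a minimal sketch of the pattern looks like the following. The store URL is a placeholder, and the tool fields (server_label, server_url, require_approval) follow the MCP tool schema in OpenAI's documentation:

```python
# Minimal sketch: point the Responses API at a remote MCP server.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    tools=[{
        "type": "mcp",                  # use a remote MCP server as a tool
        "server_label": "shopify",      # name the model uses to refer to the server
        "server_url": "https://example-store.myshopify.com/api/mcp",  # hypothetical endpoint
        "require_approval": "never",    # skip per-call approval prompts for this sketch
    }],
    input="Add the blemish toner pads to my cart and create a payment link.",
)

print(response.output_text)
```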
Connecting the agent to the cloud communications platform Twilio took only 13 lines of code. Previously, you would have had to wire two separate tool calls into your backend and assemble the final SMS payload yourself.
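The same pattern extends to authenticated servers: per OpenAI's documentation, a headers field on the MCP tool is forwarded to the server with each call. A hedged sketch, with a hypothetical Twilio MCP endpoint and header value:

```python
# Sketch: an authenticated remote MCP server. The endpoint and key
# below are illustrative assumptions, not Twilio's documented values.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    tools=[{
        "type": "mcp",
        "server_label": "twilio",
        "server_url": "https://mcp.twilio.example/sms",           # hypothetical endpoint
        "headers": {"Authorization": "Bearer <TWILIO_API_KEY>"},  # forwarded per call, not stored
    }],
    input="Send our opening hours to +1-555-0100 by SMS.",
)

print(response.output_text)
```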
Another benefit of MCP support is centralized tool management, which lets agents call external services efficiently. We can use the allowed_tools parameter to control precisely which tools an agent can access, avoiding unnecessary tool calls and context bloat and shortening response times.
For example, when handling user queries, the agent can select the most appropriate tool to call based on preset rules, rather than blindly attempting all possible tools.
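A sketch of allowed_tools in practice; the tool names here are hypothetical examples of what a store's MCP server might expose:

```python
# Sketch: expose only a whitelist of the server's tools to the model.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    tools=[{
        "type": "mcp",
        "server_label": "shopify",
        "server_url": "https://example-store.myshopify.com/api/mcp",  # hypothetical
        "allowed_tools": ["search_products", "add_to_cart"],  # everything else stays invisible
    }],
    input="Find a vitamin C serum under $30 and add it to my cart.",
)
```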
When using MCP, fine-grained permission control also keeps the agent secure. For example, you can restrict the agent to specific tools, or require explicit approval before a tool is called. This mechanism both prevents agents from misusing tools and protects the external services they touch.
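OpenAI's documentation describes this approval flow through dedicated mcp_approval_request and mcp_approval_response items; a sketch of handling it (server URL hypothetical, field names per those docs):

```python
# Sketch: require explicit approval before each MCP tool call.
from openai import OpenAI

client = OpenAI()

mcp_tool = {
    "type": "mcp",
    "server_label": "shopify",
    "server_url": "https://example-store.myshopify.com/api/mcp",  # hypothetical
    "require_approval": "always",  # every tool call must be approved first
}

first = client.responses.create(
    model="gpt-4.1",
    tools=[mcp_tool],
    input="Create a payment link for my cart.",
)

# The model pauses and emits an approval request instead of calling the tool.
for item in first.output:
    if item.type == "mcp_approval_request":
        followup = client.responses.create(
            model="gpt-4.1",
            tools=[mcp_tool],
            previous_response_id=first.id,
            input=[{
                "type": "mcp_approval_response",
                "approval_request_id": item.id,
                "approve": True,  # inspect item.name / item.arguments before approving
            }],
        )
```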
Authorization keys and server URLs are passed with each API call rather than stored, which keeps authentication and authorization secure and prevents sensitive credentials from leaking into response objects.
Furthermore, MCP supports dynamic tool-list import and caching. When an agent first connects to an MCP server, it imports the server's tool list and caches it in the model's context. Subsequent calls reuse the cached list instead of fetching it from the server again, reducing latency and improving response speed.
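In the Responses API this caching falls out of chaining calls with previous_response_id: the tool list imported on the first call (surfaced as an mcp_list_tools output item) stays in the model's context, so it is not refetched. A sketch, with the same hypothetical server as above:

```python
# Sketch: reuse the cached tool list across turns by chaining responses.
from openai import OpenAI

client = OpenAI()

mcp_tool = {
    "type": "mcp",
    "server_label": "shopify",
    "server_url": "https://example-store.myshopify.com/api/mcp",  # hypothetical
    "require_approval": "never",
}

first = client.responses.create(
    model="gpt-4.1", tools=[mcp_tool], input="What products do you sell?"
)

# Second turn: the tool list is read from context, not refetched from the server.
second = client.responses.create(
    model="gpt-4.1",
    tools=[mcp_tool],
    previous_response_id=first.id,
    input="Add the first one to my cart.",
)
```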
Of course, MCP support brings many other benefits not listed here. Interested readers can try it themselves or attend the "AIGC Open Community" offline MCP public class for a live demonstration.
Other new features in the Responses API
In addition to MCP support, OpenAI has also made significant updates to the image generation, Code Interpreter, and file search tools within the Responses API, further enhancing agent capabilities.
Image Generation: Developers can now call OpenAI's latest image generation models (such as gpt-image-1) directly within the Responses API as a tool. The tool supports streaming, so developers can see previews while an image is being generated, and multi-turn editing, so they can refine an image step by step.
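A sketch of the image tool with streamed previews; the partial_images setting and event names follow OpenAI's image-streaming documentation, so treat the exact shapes as assumptions if your SDK version differs:

```python
# Sketch: image generation as a Responses API tool with streamed previews.
import base64
from openai import OpenAI

client = OpenAI()

stream = client.responses.create(
    model="gpt-4.1",
    tools=[{"type": "image_generation", "partial_images": 2}],  # stream 2 intermediate previews
    input="Draw a watercolor fox in a forest.",
    stream=True,
)

for event in stream:
    if event.type == "response.image_generation_call.partial_image":
        # Each partial image arrives base64-encoded; write it out as a preview.
        with open(f"preview_{event.partial_image_index}.png", "wb") as f:
            f.write(base64.b64decode(event.partial_image_b64))
```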
Code Interpreter: The Code Interpreter tool is now available in the Responses API, supporting data analysis, complex math and coding problems, and even helping models deeply understand and manipulate images. For example, on mathematical problems the model can use Code Interpreter to run code and work out the answer, which significantly improves performance.
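A minimal sketch of enabling the tool; the "auto" container setting follows OpenAI's Code Interpreter documentation:

```python
# Sketch: let the model run Python via the Code Interpreter tool.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
    input="What is the 1000th prime number? Compute it by running code.",
)

print(response.output_text)
```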
File Search: The file search tool has been enhanced, allowing developers to pull relevant chunks from documents into the model's context based on the user's query. The tool now also supports searching across multiple vector stores and attribute filtering with arrays.
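A sketch of a multi-store, filtered search; the vector store IDs and the region attribute are hypothetical:

```python
# Sketch: file search across two vector stores with an attribute filter.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_contracts", "vs_policies"],  # search both stores
        "filters": {"type": "eq", "key": "region", "value": "emea"},  # attribute filter
    }],
    input="Summarize our EMEA data-retention policy.",
)

print(response.output_text)
```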
At the same time, OpenAI has also introduced new functionalities to the Responses API.
Background Mode: For tasks that take longer to process, developers can use background mode to start them asynchronously without worrying about timeouts or dropped connections. They can poll the task to check for completion, or attach to its stream of events when needed.
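A sketch of the polling pattern with background=True; the model choice and prompt here are arbitrary:

```python
# Sketch: kick off a long-running task in background mode and poll it.
import time
from openai import OpenAI

client = OpenAI()

job = client.responses.create(
    model="o3",
    input="Write a detailed competitive analysis of the MCP ecosystem.",
    background=True,  # returns immediately instead of holding the connection open
)

# Poll until the task leaves the queued / in-progress states.
while job.status in ("queued", "in_progress"):
    time.sleep(5)
    job = client.responses.retrieve(job.id)

print(job.output_text)
```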
Reasoning Summaries: The Responses API can now generate concise natural-language summaries of the model's internal chain of thought. This makes it easier for developers to debug, audit, and build better end-user experiences.
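A sketch of requesting summaries, assuming a reasoning model (e.g., o4-mini) and the reasoning.summary setting from OpenAI's documentation:

```python
# Sketch: ask for a natural-language summary of the model's reasoning.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="o4-mini",
    reasoning={"effort": "medium", "summary": "auto"},  # request reasoning summaries
    input="How many r's are in 'strawberry'?",
)

# Summaries arrive as dedicated reasoning items alongside the answer.
for item in response.output:
    if item.type == "reasoning":
        for part in item.summary:
            print(part.text)
```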
Encrypted Reasoning Items: Customers eligible for Zero Data Retention (ZDR) can reuse reasoning items across API requests without any reasoning being stored on OpenAI's servers. This improves intelligence while reducing token usage, cost, and latency.
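A sketch of the ZDR pattern, assuming the include=["reasoning.encrypted_content"] option from OpenAI's documentation: with store=False, the reasoning comes back encrypted in the response, and the caller passes it into the next request:

```python
# Sketch: reuse reasoning across requests without server-side storage.
from openai import OpenAI

client = OpenAI()

first = client.responses.create(
    model="o4-mini",
    input="Plan a three-step migration off our legacy queue.",
    store=False,                              # nothing is retained server-side
    include=["reasoning.encrypted_content"],  # return the reasoning, encrypted
)

# Feed the encrypted reasoning items back in so the model keeps its context.
second = client.responses.create(
    model="o4-mini",
    input=first.output + [{"role": "user", "content": "Now estimate the timeline."}],
    store=False,
    include=["reasoning.encrypted_content"],
)
```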
Article content sourced from OpenAI. Please contact us for removal if there is any infringement.
END