Generate responses with the Responses API

10 minutes

Tip

See the Text and images tab for more details!

The OpenAI Responses API combines capabilities from two previously separate APIs (Chat Completions and Assistants) into a unified experience. It provides stateful, multi-turn response generation, which makes it well suited to conversational AI applications. You can access the Responses API through an OpenAI-compatible client by using either the Foundry SDK or the OpenAI SDK.

Understanding the Responses API

The Responses API offers several advantages over traditional chat completions:

Stateful conversations: Maintains conversation context across multiple turns
Unified experience: Combines the chat completions and Assistants API patterns
Foundry direct models: Works with models hosted directly in Microsoft Foundry, not only Azure OpenAI models
Simple integration: Access through the OpenAI-compatible client

Note

The Responses API is the recommended approach for generating AI responses in Microsoft Foundry applications. It replaces the older chat completions API for most scenarios.

Generating a simple response

With an OpenAI-compatible client, you can generate responses by using the responses.create() method:

# Generate a response using the OpenAI-compatible client
response = openai_client.responses.create(
    model="gpt-4.1",  # Your model deployment name
    input="What is Microsoft Foundry?"
)

# Display the response
print(response.output_text)

The input parameter accepts a text string containing the prompt. The model generates a response based on this input.

Understanding the response structure

The response object contains several useful properties:

output_text: The generated text response
id: A unique identifier for this response
status: The response status (for example, "completed")
usage: Token usage information (input, output, and total tokens)
model: The model used to generate the response

You can access these properties to handle responses effectively:

response = openai_client.responses.create(
    model="gpt-4.1",
    input="Explain machine learning in simple terms."
)

print(f"Response: {response.output_text}")
print(f"Response ID: {response.id}")
print(f"Tokens used: {response.usage.total_tokens}")
print(f"Status: {response.status}")

Adding instructions

In addition to the user input, you can provide instructions (often called a system prompt) to guide the model's behavior:

response = openai_client.responses.create(
    model="gpt-4.1",
    instructions="You are a helpful AI assistant that answers questions clearly and concisely.",
    input="Explain neural networks."
)

print(response.output_text)

Controlling response generation

You can control response generation by using additional parameters:

response = openai_client.responses.create(
    model="gpt-4.1",
    instructions="You are a helpful AI assistant that answers questions clearly and concisely.",
    input="Write a creative story about AI.",
    temperature=0.8,  # Higher temperature for more creativity
    max_output_tokens=200  # Limit response length
)

print(response.output_text)

temperature: Controls randomness (0.0-2.0). Higher values make the output more creative and varied
max_output_tokens: Limits the maximum number of tokens in the response
top_p: An alternative to temperature for controlling randomness
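As a minimal sketch of how these parameters might be assembled before a call, the helper below collects only the settings you actually set and checks the temperature range listed above. The function name build_generation_params is hypothetical, and the 0.0-1.0 range for top_p is an assumption that is not stated in this unit.

```python
def build_generation_params(temperature=None, top_p=None, max_output_tokens=None):
    """Return a dict of optional generation settings, skipping unset values."""
    params = {}
    if temperature is not None:
        # Range taken from the parameter description above
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        params["temperature"] = temperature
    if top_p is not None:
        # Assumed range; not stated in this unit
        if not 0.0 <= top_p <= 1.0:
            raise ValueError("top_p must be between 0.0 and 1.0")
        params["top_p"] = top_p
    if max_output_tokens is not None:
        if max_output_tokens < 1:
            raise ValueError("max_output_tokens must be a positive integer")
        params["max_output_tokens"] = max_output_tokens
    return params


creative = build_generation_params(temperature=0.8, max_output_tokens=200)
print(creative)  # {'temperature': 0.8, 'max_output_tokens': 200}
```

The resulting dictionary could then be unpacked into a request, for example openai_client.responses.create(model="gpt-4.1", input="...", **creative).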

Working with Foundry direct models

When you use the Foundry SDK or the AzureOpenAI client to connect to a project endpoint, the Responses API works with both Azure OpenAI models and Foundry direct models (such as Microsoft Phi, DeepSeek, or other models hosted directly in Microsoft Foundry):

# Using a Foundry direct model
response = openai_client.responses.create(
    model="microsoft-phi-4",  # Example Foundry direct model
    instructions="You are a helpful AI assistant that answers questions clearly and concisely.",
    input="What are the benefits of small language models?"
)

print(response.output_text)

Creating conversational experiences

For more complex conversational scenarios, you can provide system instructions and build multi-turn conversations:

# First turn in the conversation
response1 = openai_client.responses.create(
    model="gpt-4.1",
    instructions="You are a helpful AI assistant that explains technology concepts clearly.",
    input="What is machine learning?"
)

print("Assistant:", response1.output_text)

# Continue the conversation
response2 = openai_client.responses.create(
    model="gpt-4.1",
    instructions="You are a helpful AI assistant that explains technology concepts clearly.",
    input="Can you give me an example?",
    previous_response_id=response1.id
)

print("Assistant:", response2.output_text)

In practice, the implementation is more likely to be built as a loop in which the user enters messages interactively, based on each response received from the model:

# Track responses
last_response_id = None

# Loop until the user wants to quit
print("Assistant: Enter a prompt (or type 'quit' to exit)")
while True:
    input_text = input('\nYou: ')
    if input_text.lower() == "quit":
        print("Assistant: Goodbye!")
        break

    # Get a response
    response = openai_client.responses.create(
        model="gpt-4.1",  # Your model deployment name
        instructions="You are a helpful AI assistant that explains technology concepts clearly.",
        input=input_text,
        previous_response_id=last_response_id  # None on the first turn
    )
    assistant_text = response.output_text
    print("\nAssistant:", assistant_text)
    last_response_id = response.id

The output from this example looks similar to this:

Assistant: Enter a prompt (or type 'quit' to exit)

You: What is machine learning?

Assistant: Machine learning is a type of artificial intelligence (AI) that enables computers to learn from data and improve their performance over time without being explicitly programmed. It involves training algorithms on large datasets to recognize patterns, make predictions, or take actions based on those patterns. This allows machines to become more accurate and efficient in their tasks as they are exposed to more data.

You: Can you give me an example?

Assistant: Certainly! Let's look at a simple example of supervised learning—predicting house prices based on features like size, location, and number of rooms.
Imagine you want to build a machine learning model that can predict the price of a house based on various factors.
...
    { the example provided in the model response may be extensive}
...

You: quit

Assistant: Goodbye!

When the user enters new input on each turn, the data sent to the model includes the instructions system message, the user's input, and the previous response received from the model. In this way, the new input is grounded in the context provided by the response that the model generated for the previous input.

Alternative: Chaining conversations manually

You can manage conversations manually by building the message history yourself. This approach gives you more control over the context that is included:

try:
    # Start with initial message
    conversation_history = [
        {
            "type": "message",
            "role": "user",
            "content": "What is machine learning?"
        }
    ]

    # First response
    response1 = openai_client.responses.create(
        model="gpt-4.1",
        input=conversation_history
    )

    print("Assistant:", response1.output_text)

    # Add assistant response to history
    conversation_history += response1.output

    # Add new user message
    conversation_history.append({
        "type": "message",
        "role": "user", 
        "content": "Can you give me an example?"
    })

    # Second response with full history
    response2 = openai_client.responses.create(
        model="gpt-4.1",
        input=conversation_history
    )

    print("Assistant:", response2.output_text)

except Exception as ex:
    print(f"Error: {ex}")

This manual approach is useful when you need to:

Customize which messages are included in the context
Implement conversation trimming to manage token limits
Store and restore conversation history from a database
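As an illustration of conversation trimming, the sketch below keeps only the most recent messages that fit within a token budget. It assumes the history holds plain dictionaries with string content (unlike the SDK output items appended in the example above), and it uses a rough four-characters-per-token heuristic rather than a real tokenizer. Both function names are hypothetical.

```python
def estimate_tokens(text):
    """Very rough estimate: assumes about 4 characters per token."""
    return max(1, len(text) // 4)


def trim_history(history, max_tokens):
    """Keep the most recent messages whose estimated total fits in max_tokens.

    The newest message is always kept, even if it alone exceeds the budget.
    """
    kept = []
    total = 0
    # Walk backwards so the most recent messages are considered first
    for message in reversed(history):
        cost = estimate_tokens(message["content"])
        if kept and total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    kept.reverse()  # Restore chronological order
    return kept


history = [
    {"type": "message", "role": "user", "content": "a" * 40},       # ~10 tokens
    {"type": "message", "role": "assistant", "content": "b" * 80},  # ~20 tokens
    {"type": "message", "role": "user", "content": "c" * 20},       # ~5 tokens
]

trimmed = trim_history(history, max_tokens=25)
print([m["role"] for m in trimmed])  # ['assistant', 'user']
```

The trimmed list could then be passed as the input argument in place of the full history. A production version would more likely count tokens with the tokenizer that matches the deployed model.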

Retrieving specific previous responses

The Responses API maintains a response history, which lets you retrieve previous responses:

try:
    # Retrieve a previous response
    response_id = "resp_67cb61fa3a448190bcf2c42d96f0d1a8"  # Example ID
    previous_response = openai_client.responses.retrieve(response_id)

    print(f"Previous response: {previous_response.output_text}")

except Exception as ex:
    print(f"Error: {ex}")

Context window considerations

The previous_response_id parameter links responses together, maintaining conversation context across multiple API calls.

Note that retaining conversation history can increase token usage. For a single run, the active context window can include:

System instructions (instructions, safety rules)
Your current prompt
Conversation history (previous user and assistant messages)
Tool schemas (functions, OpenAPI specifications, MCP tools, and so on)
Tool outputs (search results, code interpreter output, files)
Retrieved memory or documents (from memory stores, RAG, file search)

All of these are combined, tokenized, and sent together to the model on every request. The SDK helps you manage state, but it doesn't automatically make token usage cheaper.
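One way to keep an eye on this growth is to total the usage reported with each response. The sketch below assumes the usage object exposes input_tokens and output_tokens alongside the total_tokens field shown earlier. The UsageTracker class is hypothetical, and the figures are made-up stand-ins so that the example runs without calling a model.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTracker:
    """Accumulates token usage across the turns of a conversation."""
    input_tokens: int = 0
    output_tokens: int = 0
    turns: int = 0

    def record(self, response):
        # Assumed field names on response.usage
        self.input_tokens += response.usage.input_tokens
        self.output_tokens += response.usage.output_tokens
        self.turns += 1

    @property
    def total_tokens(self):
        return self.input_tokens + self.output_tokens


# Illustrative stand-ins: input tokens grow each turn as history accumulates
fake_responses = [
    SimpleNamespace(usage=SimpleNamespace(input_tokens=20, output_tokens=50)),
    SimpleNamespace(usage=SimpleNamespace(input_tokens=80, output_tokens=60)),
    SimpleNamespace(usage=SimpleNamespace(input_tokens=150, output_tokens=40)),
]

tracker = UsageTracker()
for response in fake_responses:
    tracker.record(response)

print(tracker.turns, tracker.input_tokens, tracker.output_tokens, tracker.total_tokens)
# prints 3 250 150 400
```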

Creating responsive chat applications

Model responses can take some time to generate, depending on factors such as the model used, the size of the context window, and the size of the prompt. Users may become frustrated if the application appears to "freeze" while waiting for a response, so it's important to consider application responsiveness in your implementation.

Streaming responses

For long responses, you can use streaming to receive output incrementally, so that the user sees partial responses as the output becomes available:

stream = openai_client.responses.create(
    model="gpt-4.1",
    input="Write a short story about a robot learning to paint.",
    stream=True
)

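for event in stream:
    # Print only the text deltas rather than the raw event objects
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)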

If you're tracking conversation history while streaming, you can get the response ID when the stream completes, like this:

stream = openai_client.responses.create(
    model="gpt-4.1",
    input="Write a short story about a robot learning to paint.",
    stream=True
)
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="")
    elif event.type == "response.completed":
        response_id = event.response.id
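The same event handling can be factored into a reusable function that returns both the full text and the response ID. The sketch below is one possible shape: collect_stream and its on_delta callback are hypothetical names, and the fake events only mimic the two event types used above so that the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(events, on_delta=None):
    """Consume stream events; return (full_text, response_id)."""
    text_parts = []
    response_id = None
    for event in events:
        if event.type == "response.output_text.delta":
            text_parts.append(event.delta)
            if on_delta is not None:
                on_delta(event.delta)  # For example, print the chunk immediately
        elif event.type == "response.completed":
            response_id = event.response.id
        # Any other event types are ignored in this sketch
    return "".join(text_parts), response_id


# Offline stand-ins for the events a real stream would yield
fake_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Hello"),
    SimpleNamespace(type="response.output_text.delta", delta=", world"),
    SimpleNamespace(type="response.completed", response=SimpleNamespace(id="resp_demo")),
]

text, response_id = collect_stream(fake_events)
print(text, response_id)  # Hello, world resp_demo
```

With a real stream, the returned response_id could be passed as previous_response_id on the next turn.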

Asynchronous usage

For high-performance applications, you can use an asynchronous client that lets you make non-blocking API calls. Asynchronous usage is ideal for long-running requests, or when you want to handle multiple requests concurrently without blocking your application. To use it, import AsyncOpenAI instead of OpenAI, and use await with each API call:

import asyncio
from openai import AsyncOpenAI

# token_provider is not defined in this snippet; supply your own key or credential
client = AsyncOpenAI(
    base_url="https://<resource-name>.openai.azure.com/openai/v1/",
    api_key=token_provider,
)

async def main():
    response = await client.responses.create(
        model="gpt-4.1",
        input="Explain quantum computing briefly."
    )
    print(response.output_text)

asyncio.run(main())

Asynchronous streaming works the same way:

async def stream_response():
    stream = await client.responses.create(
        model="gpt-4.1",
        input="Write a haiku about coding.",
        stream=True
    )

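    async for event in stream:
        # Print only the text deltas rather than the raw event objects
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)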

asyncio.run(stream_response())
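To illustrate handling several requests at the same time, the sketch below sends a batch of prompts concurrently with asyncio.gather. The ask_many function is a hypothetical helper, and the stub client stands in for AsyncOpenAI so the example runs without a network connection. With a real asynchronous client, the same function could be called with that client instead.

```python
import asyncio
from types import SimpleNamespace


async def ask_many(client, model, prompts):
    """Send all prompts concurrently and return the replies in prompt order."""
    tasks = [client.responses.create(model=model, input=p) for p in prompts]
    # gather preserves the order of the awaitables it is given
    responses = await asyncio.gather(*tasks)
    return [r.output_text for r in responses]


class _StubAsyncResponses:
    """Hypothetical offline stand-in for an async client's responses attribute."""

    async def create(self, **kwargs):
        await asyncio.sleep(0.01)  # Simulate network latency
        return SimpleNamespace(output_text=f"echo: {kwargs['input']}")


stub = SimpleNamespace(responses=_StubAsyncResponses())
replies = asyncio.run(ask_many(stub, "gpt-4.1", ["one", "two"]))
print(replies)  # ['echo: one', 'echo: two']
```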

By using the Responses API through the Microsoft Foundry SDK, you can build sophisticated conversational AI applications that maintain context, support multiple model types, and provide a responsive user experience.
