How to search multiple URLs?

Diego Sousa 5 Reputation points
2023-11-29T13:36:51.4033333+00:00

Dears,

I'm testing the Azure platform with Openai and I would like to know how I add a URL data source, for example, I would like the AI response data source to be 4 URLs.

In the playground I can add just one and if I need to ask a question comparing 2 products in two URLs, the AI cannot identify

thanks

Azure AI Search
Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
1,342 questions
Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
4,083 questions
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. ajkuma 28,036 Reputation points Microsoft Employee Moderator
    2023-12-08T07:06:06.6966667+00:00

    Adding to Azar's response.

    @Diego Sousa, Currently, we don't support 4 URLs at a time for ingestion.

    If you wish, you may share your feedback on our Uservoice -  All of the feedback, you share in these forums will be monitored and reviewed by the Microsoft engineering teams responsible for building Azure. Additionally, users with a similar request can up-vote your post and add their comments.

    I have also relayed this feedback internally to our product team.

    Just sharing an approach, if you are comfortable, you can try, but it’s more of a makeshift solution.

    Upload 4 URLs separately with URL as the data source in the blob storage and then use blob storage as a data source for inferencing.

    To make this scenario work, you would need to make four separate URL ingestion requests. Then locate the newly created containers in the blob storage, combine the contents of these containers into a single container, and then use the blob storage scenario in the studio to index the entire content.

    Kindly note that each URL ingestion call will create a separate search index, resulting in 4 indexes being created for each of the URL ingestion requests. These can be deleted later if desired.


     If the answer helped (pointed, you in the right direction) > please click Accept Answer.

    1 person found this answer helpful.
    0 comments No comments

  2. Azar 29,520 Reputation points MVP Volunteer Moderator
    2023-11-29T13:41:11.43+00:00

    I guess OpenAI does not directly support the integration of multiple URLs as data sources in a single API call. However, you can work around this limitation by concatenating the text content from multiple URLs into a single input and sending that as a prompt to the OpenAI API.

    Look at this sample code below

    import openai
    
    # Your OpenAI API key
    api_key = "YOUR_API_KEY"
    openai.api_key = api_key
    
    # Example URLs
    url1 = "https://example.com/product1"
    url2 = "https://example.com/product2"
    
    # Fetch text content from the URLs (you can use a web scraping library for this)
    text_from_url1 = "Text content from URL 1..."
    text_from_url2 = "Text content from URL 2..."
    
    # Concatenate the text content
    combined_text = f"Compare product information from {url1} and {url2}. {text_from_url1} {text_from_url2}"
    
    # Call the OpenAI API
    response = openai.Completion.create(
        engine="text-davinci-002",  # or use the engine you prefer
        prompt=combined_text,
        max_tokens=150
    )
    
    # Extract and print the generated answer
    answer = response['choices'][0]['text']
    print(answer)
    
    
    

    it might not work well if the content from the URLs is extensive or if the combined text exceeds the model's token limit so this is juz a work around I'm suggesting hope this helps.

    If this helps kindly accept the answer thanks much.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.