Azure AI Studio - How do I use HTML files within an index and cite the URL rather than the file name?

Anonymous
2024-09-17T15:30:44.5633333+00:00

Hi all,

I'm using Azure AI Studio. I'm trying to create an index from a few HTML files that I've downloaded from the source website. It's working great within my prompt flow, but with one issue: when the AI outputs the citations showing where the info came from, it uses the HTML file name. How can I make it use the website URL instead?

Example

Q: What is Harry Potter?

A: Harry Potter is a series of seven fantasy novels written by British author J. K. Rowling. (en.wikipedia.org_wiki_Harry_Potter.html)

What I want

Q: What is Harry Potter?

A: Harry Potter is a series of seven fantasy novels written by British author J. K. Rowling. (https://en.wikipedia.org/wiki/Harry_Potter)
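
To make the desired rewrite concrete, here is a minimal sketch of it as a post-processing step in Python (for example, in a Python node at the end of a prompt flow). It is not an Azure AI Studio feature. The `SOURCE_URLS` table, the `replace_citations` name, and the assumption that citations appear in parentheses and end in `.html` are all hypothetical and only illustrate the mapping. An explicit lookup table is used because the file name cannot be reliably converted back: underscores stand for both `/` and `_` in the original URL.

```python
import re

# Hypothetical lookup table, filled in by hand when the pages are downloaded.
SOURCE_URLS = {
    "en.wikipedia.org_wiki_Harry_Potter.html": "https://en.wikipedia.org/wiki/Harry_Potter",
}

# Assumed citation shape: a parenthesised file name ending in .html.
CITATION = re.compile(r"\(([^()\s]+\.html)\)")


def replace_citations(answer: str, urls: dict[str, str] = SOURCE_URLS) -> str:
    """Swap known file-name citations for their source URLs.

    File names that are not in the table are left unchanged.
    """
    def swap(match: re.Match) -> str:
        name = match.group(1)
        return f"({urls.get(name, name)})"

    return CITATION.sub(swap, answer)


if __name__ == "__main__":
    text = (
        "Harry Potter is a series of seven fantasy novels written by "
        "British author J. K. Rowling. (en.wikipedia.org_wiki_Harry_Potter.html)"
    )
    print(replace_citations(text))
```

The table has to be kept in step with the downloaded files, which is manageable for a handful of pages.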

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. AshokPeddakotla-MSFT 35,971 Reputation points Moderator
    2024-09-19T10:18:32.1933333+00:00

    Steven Parry, apologies for the confusion, and thanks for sharing the additional information.

    This is by design. I tried with two web URLs and saw the same results. There is no way to fetch the direct URLs.

    Once you have added the URL/web address for data ingestion, the web pages from your URL are fetched and saved to Azure Blob Storage in a container named webpage-<index name>. Each URL is saved into a different container within the account. The files are then indexed into an Azure AI Search index, which is used for retrieval when you're chatting with the model.

    Please see below for more clarity.

    User's image

    So when we query the data, it is fetched from these indexes, and the results are displayed as HTML pages.

    User's image

    User's image

    Please see Azure OpenAI On Your Data for more details.

    Do let me know if you have any further queries.
