Extract entities with the ai.extract function
The ai.extract function uses generative AI to scan input text and extract specific types of information designated by labels you choose (for example, locations or names), all with a single line of code.
AI functions turbocharge data engineering by putting the power of Fabric's built-in large language models into your hands. To learn more, visit this overview article.
Important
This feature is in preview, for use in the Fabric 1.3 runtime and higher.
- Review the prerequisites in this overview article, including the library installations that are temporarily required to use AI functions.
- By default, AI functions are currently powered by the gpt-3.5-turbo (0125) model. To learn more about billing and consumption rates, visit this article.
- Although the underlying model can handle several languages, most of the AI functions are optimized for use on English-language texts.
- During the initial rollout of AI functions, users are temporarily limited to 1,000 requests per minute with Fabric's built-in AI endpoint.
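The request limit above can be turned into a rough planning estimate. The following is a minimal sketch, not part of the AI functions library: the helper name `min_rate_limit_windows` is hypothetical, and it assumes that each input row results in one request, which this article does not confirm.

```python
import math

# Limit stated above for the initial rollout of AI functions.
REQUESTS_PER_MINUTE = 1_000


def min_rate_limit_windows(row_count: int, requests_per_row: int = 1) -> int:
    """Return how many one-minute windows a workload needs under the limit.

    Assumption (not confirmed by the article): every input row issues
    `requests_per_row` requests, one by default.
    """
    if row_count < 0 or requests_per_row < 1:
        raise ValueError("row_count must be >= 0 and requests_per_row >= 1")
    total_requests = row_count * requests_per_row
    return math.ceil(total_requests / REQUESTS_PER_MINUTE)


print(min_rate_limit_windows(25_000))  # 25
print(min_rate_limit_windows(1_500))   # 2
```

An estimate like this only gives a lower bound on the number of windows; model latency and any retries would add to the actual run time.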
Use ai.extract with pandas
The ai.extract function extends the pandas Series class. Call the function on a pandas DataFrame text column to extract custom entity types from each row of input.
Unlike other AI functions, ai.extract returns a pandas DataFrame instead of a Series, with a separate column for each specified entity type that contains the extracted values for each input row.
Syntax
df_entities = df["text"].ai.extract("entity1", "entity2", "entity3")
Parameters
Name | Description |
---|---|
labels (Required) | One or more strings representing the set of entity types to be extracted from the input text values. |
Returns
The function returns a pandas DataFrame with a column for each specified entity type. The column or columns contain the entities extracted for each row of input text. If the function identifies more than one match for a given entity, it returns only one of those matches. If no match is found, the result is null.
Example
# This code uses AI. Always review output for mistakes.
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/
import pandas as pd  # The AI functions setup from the prerequisites is assumed.
df = pd.DataFrame([
    "MJ Lee lives in Tucson, AZ, and works as a software engineer for Microsoft.",
    "Kris Turner, a nurse at NYU Langone, is a resident of Jersey City, New Jersey."
], columns=["descriptions"])
df_entities = df["descriptions"].ai.extract("name", "profession", "city")
display(df_entities)
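Because a label with no match yields null, it can help to check the result before using it downstream. The following is a minimal sketch of one way to do that with standard pandas calls. The `df_entities` frame here is a hand-written stand-in with illustrative values, not real model output, and the sketch assumes that the returned DataFrame shares its row index with the input.

```python
import pandas as pd

# Input text column.
df = pd.DataFrame({"descriptions": [
    "MJ Lee lives in Tucson, AZ, and works as a software engineer for Microsoft.",
    "Kris Turner, a nurse at NYU Langone, is a resident of Jersey City, New Jersey.",
]})

# Hand-written stand-in for the DataFrame that ai.extract would return.
# These values are illustrative only; the None simulates a label for
# which no match was found.
df_entities = pd.DataFrame({
    "name": ["MJ Lee", "Kris Turner"],
    "profession": ["software engineer", None],
    "city": ["Tucson", "Jersey City"],
})

# Place the extracted columns next to the source text.
# Assumption: both frames share the same row index.
combined = pd.concat([df, df_entities], axis=1)

# Count missing values for each label.
null_counts = df_entities.isna().sum()

# Keep only the rows where at least one label has no match.
incomplete = combined[df_entities.isna().any(axis=1)]

print(null_counts)
print(incomplete)
```

Rows that end up in `incomplete` are candidates for manual review or for a second pass with differently worded labels.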
Use ai.extract with PySpark
The ai.extract function is also available for Spark DataFrames. You must specify the name of an existing input column as a parameter, along with a list of entity types to extract from each row of text.
The function returns a new DataFrame, with a separate column for each specified entity type that contains extracted values for each input row.
Syntax
df.ai.extract(labels=["entity1", "entity2", "entity3"], input_col="text")
Parameters
Name | Description |
---|---|
labels (Required) | An array of strings that represents the set of entity types to be extracted from the text values in the input column. |
input_col (Required) | A string that contains the name of an existing column with input text values to be scanned for the custom entities. |
error_col (Optional) | A string that contains the name of a new column to store any OpenAI errors that result from processing each input text row. If this parameter isn't set, a default name is generated for the error column. If an input row has no errors, the value in this column is null. |
Returns
The function returns a Spark DataFrame with a new column for each specified entity type. The column or columns contain the entities extracted for each row of input text. If the function identifies more than one match for a given entity, it returns only one of those matches. If no match is found, the result is null.
Example
# This code uses AI. Always review output for mistakes.
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/
df = spark.createDataFrame([
    ("MJ Lee lives in Tucson, AZ, and works as a software engineer for Microsoft.",),
    ("Kris Turner, a nurse at NYU Langone, is a resident of Jersey City, New Jersey.",)
], ["descriptions"])
df_entities = df.ai.extract(labels=["name", "profession", "city"], input_col="descriptions")
display(df_entities)
Related content
- Calculate similarity with ai.similarity.
- Categorize text with ai.classify.
- Detect sentiment with ai.analyze_sentiment.
- Fix grammar with ai.fix_grammar.
- Summarize text with ai.summarize.
- Translate text with ai.translate.
- Answer custom user prompts with ai.generate_response.
- Learn more about the full set of AI functions here.
- Learn how to customize the configuration of AI functions here.
- Did we miss a feature you need? Suggest it on the Fabric Ideas forum.