Seeking Efficient Methods for Handling Interconnected SQL Tables for Predictive Query Responses

KT 170 Reputation points
2024-08-05T10:07:03.2733333+00:00

Hi All,

We want to implement a feature in our application where users can ask about the consequences of certain actions and receive answers based on logic stored across multiple tables in their SQL database, which contains over 50 interconnected tables.

Initially, we considered consolidating the tables into a single view to use Azure AI search, which allows us to connect only one index. However, this approach has proven problematic and inefficient. Are there better methods for integrating such a large, interconnected database to support this functionality?

Would using a vector database pose similar challenges, and are there other options we should consider given that Azure CosmosDB's vector database is still in preview? Should Elasticsearch be a better option?

Azure AI Search
Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
961 questions
Azure Cosmos DB
Azure Cosmos DB
An Azure NoSQL database service for app development.
1,615 questions
Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
2,954 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Amira Bedhiafi 23,486 Reputation points
    2024-08-27T21:18:55.6033333+00:00

    What I understood from what you described is you are trying to avoid managing complex relationships and dependencies among the data.

    I would go for a graph database for handling complex relationships, possibly in conjunction with other tools like Azure Synapse Analytics or Elasticsearch, depending on the nature of the queries and the data you are working with.

    1. Should you consolidate tables into a single view?

    While consolidating tables into a single view may seem like a straightforward approach, it can lead to inefficiencies, especially when dealing with a large number of tables and complex relationships. This approach may result in slow query performance, increased complexity in maintaining the view, and difficulties in scaling as the database grows. Therefore, while it might work for smaller datasets, it's not the best solution for a larger, interconnected database.

    2. Would a vector database be appropriate?

    Vector databases, which are designed to store and search vector embeddings, can be very effective in handling unstructured data, such as text, images, and other high-dimensional data points. However, for your use case, which involves structured data across interconnected SQL tables, a vector database might not be the most suitable option. The primary challenge is that vector databases excel in similarity searches, but they do not natively support complex relational queries that your SQL database requires.

    3. Is Elasticsearch a better option?

    Elasticsearch could be a viable option, especially if you are dealing with large datasets and need advanced search capabilities. It is designed for fast search and data analytics on large volumes of data. However, Elasticsearch is primarily optimized for text-based search and may require significant customization to handle the complex relationships in your SQL tables. You would likely need to use Elasticsearch in conjunction with another system or implement custom logic to handle the relational aspects of your data.

    4. What are other options to consider?

    • Use a Graph Database: A graph database (for example Neo4j) could be highly effective for managing interconnected data. Graph databases are designed to handle complex relationships and can easily model and query the connections between different entities in your data. This could make it easier to predict the consequences of certain actions based on the relationships stored in your database.
      • Hybrid Approach: Consider a hybrid approach where you use a combination of SQL for structured queries and another system (like a graph database or Elasticsearch) for handling specific types of queries that require understanding relationships or performing text-based searches. This allows you to leverage the strengths of different systems for different parts of your data.
      • Azure Synapse Analytics: If you're working within the Azure ecosystem, Azure Synapse Analytics could be another option. It allows for big data and data warehousing capabilities and can integrate with both SQL and non-SQL data sources. Synapse might enable you to build a more scalable solution that supports both complex queries and predictive analytics.

    5. Should you use Azure AI Search or Cosmos DB?

    • Azure AI Search: While Azure AI Search is powerful for searching across large text datasets, it may not be the best fit for handling interconnected SQL tables due to the complexity of relational data. It might be more effective if used in conjunction with another system designed for handling relational data.
    • Azure Cosmos DB: Cosmos DB, particularly its native support for multiple data models, including graph and key-value store models, could be useful. However, if you are specifically looking at vector databases within Cosmos DB, you might face limitations since the feature is still in preview. Cosmos DB's flexibility, though, could be advantageous if you plan to use multiple data models to handle different parts of your queries.
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.