Create a stored procedure that checks a status, validates, then update status?

cataster 1 Reputation point
2020-09-08T19:20:56.803+00:00

I have the following "Handshake" table:

CREATE TABLE [dbo].[Handshake](  
    [Report Year] [varchar](100) NULL,  
    [Status] [varchar](100) NULL,  
    [Update Time] [datetime] NULL,  
    [Process Time] [datetime] NULL  
) ON [PRIMARY]  
GO  

Some data here:

Report Year Status  Update Time            Process Time  
2020 1+2    Loaded  2020-08-25 13:10:19.073 NULL  
2020 6+8    Loaded  2020-08-25 13:11:30.733 NULL  
2020 1+2    Loaded  2020-08-25 13:11:47.257 NULL  

I have another table, "Fact", that is used to insert the load of these above "Report Years" every time and and a record is inserted into the Handshake table when the back-end process starts to load/updates status from "WIP" to "Loaded" upon load completion, as you can see from the sample Handshake data above.

After the status is "Loaded", I go in manually every time and run the following queries against "Fact" Table to validate the data against other tables.

SELECT COUNT(*) as 'Report Year Count' FROM Fact WHERE [Report Year] NOT IN ( SELECT DISTINCT [Report Year] FROM [Report Year] )  
SELECT COUNT(*) as 'COS Count' FROM Fact WHERE [Country Code] NOT IN ( SELECT DISTINCT [Country Code] FROM [COS] )  
SELECT COUNT(*) as 'Franchise Count' FROM Fact WHERE [Style Code] NOT IN ( SELECT DISTINCT [Style Code] FROM [Franchise] )  
SELECT COUNT(*) as 'Product Count' FROM Fact WHERE [Style Code] NOT IN ( SELECT DISTINCT [Style Code] FROM [Product] )  

If the count is 0 for the above queries, then i dont check what the rejects are for those tables.
if the count is > 0, i run a similar query to identify the rejects.
For example, if Franchise and Product return a count of 1063

23250-image.png

Then I run the following relevant query(s) to identify the missing rejects in those tables:

SELECT DISTINCT [Style Code] as 'Franchise - Style Code' FROM Fact WHERE [Style Code] NOT IN ( SELECT DISTINCT [Style Code] FROM [Franchise] )  
SELECT DISTINCT [Style Code] as 'Product - Style Code' FROM Fact WHERE [Style Code] NOT IN ( SELECT DISTINCT [Style Code] FROM [Product] )  

23316-image.png

If there are "rejects" (inconsistent data), i send an email to relevant users with what the rejects where so that they can update the tables (i.e. Franchise, COS, Product, Report Year) to include the missing rejects before i proceed with the next step (processing the database).

Instead of continuously performing this validation task manually, I'd like to write a stored procedure that will be triggered by some form of script/job (irrelevant to the scope of this thread for now) that will check the Handshake table for the latest "Loaded" status, run the validation queries, and if there are rejects, update the status for that record in the Handshake table from "Loaded" -> "Rejects". Otherwise, if there are no rejects, update the status to "Good" for no rejects found.

If there are rejects, i'd like to save the results to a scalar variable (i.e. cmd) so that i can extract them later into a csv file of sorts

This is my pseudocode:

CheckForRejects()  
    DECLARE @cmd NVARCHAR(1000)  
BEGIN  
    --check Handshake Status.   
    IF Status='Loaded' THEN --RUN Validation Queries  
        SELECT COUNT(*) as 'Report Year Count' FROM Fact WHERE [Report Year] NOT IN ( SELECT DISTINCT [Report Year] FROM [Report Year] )  
        SELECT COUNT(*) as 'COS Count' FROM Fact WHERE [Country Code] NOT IN ( SELECT DISTINCT [Country Code] FROM [COS] )  
        SELECT COUNT(*) as 'Franchise Count' FROM Fact WHERE [Style Code] NOT IN ( SELECT DISTINCT [Style Code] FROM [Franchise] )  
        SELECT COUNT(*) as 'Product Count' FROM Fact WHERE [Style Code] NOT IN ( SELECT DISTINCT [Style Code] FROM [Product] )  
          
        case   
            when 'Report Year Count' > 0 THEN  
                SET Status='Rejects'  
                SET @cmd = SELECT DISTINCT [Report Year] as 'Report Year' FROM Fact WHERE [Report Year] NOT IN ( SELECT DISTINCT [Report Year] FROM [Report Year] )  
            when 'COS Count' > 0 THEN   
                SET Status='Rejects'  
                SET @cmd = SELECT DISTINCT [Country Code] as 'COS - Country Code' FROM Fact WHERE [Country Code] NOT IN ( SELECT DISTINCT [Country Code] FROM [COS] )  
            when 'Franchise Count' > 0 THEN  
                SET Status='Rejects'  
                SET @cmd = SELECT DISTINCT [Style Code] as 'Franchise - Style Code' FROM Fact WHERE [Style Code] NOT IN ( SELECT DISTINCT [Style Code] FROM [Franchise] )  
            when 'Product Count' > 0 THEN  
                SET Status='Rejects'  
                SET @cmd = SELECT DISTINCT [Style Code] as 'Product - Style Code' FROM Fact WHERE [Style Code] NOT IN ( SELECT DISTINCT [Style Code] FROM [Product] )  
            ELSE SET Status='Good' -- No Rejects, continue to processing DB  
END  
SELECT @cmd As Rejects  
Developer technologies | Transact-SQL
SQL Server | Other
{count} votes

1 answer

Sort by: Most helpful
  1. Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator
    2020-09-08T21:24:49.807+00:00

    Let's start with the Handshake table. I note that all columns are nullable. That looks a little odd. Maybe it makes sense, but what would it mean that the Report Year is NULL? That the status is NULL? That all columns are NULL? Generally, when you consider to make a column nullable, you should have an understanding of what NULL actually means for that column. If you don't really see what it would mean, you should probably not make the column NOT NULL.

    Another problem is that the table does not have a primary key. This makes it difficult if you want to update the table as you talk about. Without keys, how would you know which rows to update?

    As for the process, I would write the stored procedures so that I have a temp table for each of these four where I insert the missing values. Then I would check with IF EXISTS if there is any data in tables. How you would then go an update the table, I don't know, because I don't see the connection between Fact and Handshake.

    Finally, I like to warn you for NOT IN, which I see that you use frequently. Keep in mind that if you have a query like:

    SELECT ... FROM tbl WHERE col NOT IN (SELECT somecol FROM othertbl)

    this query will not return a single row, if the subquery returns a single NULL value. You can avoid this by adding WHERE somecol IS NOT NULL to the subquery, but better in my opinion is to use NOT EXISTS:

    SELECT ... 
    FROM   tbl
    WHERE NOT EXISTS (SELECT *
                      FROM othertbl
                      WHERE tbl.col = othertbl.somecol)
    
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.