Why One Row Did Not Insert?

JAL 571 Reputation points
2022-09-04T01:00:00.663+00:00

About five years ago our company created an app to record the findings of a human evaluation.

A person examines a device, checks it against each criterion onscreen, and then marks it "Pass" or "Fail". Each criterion is one row in our DB.

The app hasn't changed much in the last five years. About 1.3 million evaluations done so far. About 50 million criterion-rows.

Typically it's a two-person process:

  • Evaluator-1
  • Evaluator-2 (a senior evaluator).

Evaluator-2 doesn't start fresh. Rather the rows of Evalutator-1 are first cloned (but marked as Evaluator-2).

After five years of success, suddenly we are seeing an occasional failure in the cloning process, where Evaluator-2 is mysteriously missing exactly one row. The cloning is done in a stored procedure that looks roughly like this:

Insert Into dbo.Workspace (PassOrFail, EvaluationID, Iteration)
Select PassOrFail, EvaluationID, 'Evaluator-2'
From dbo.Workspace where EvaluationID = @EvalID and Iteration = 'Evaluator-1'
-- The same SP ends with the following validation to verify that all rows got cloned:
Declare @count1 int = (Select count() From dbo.Workspace where EvaluationID = @EvalID and Iteration = 'Evaluation-1');
Declare @count2 int = (Select count(
) From dbo.Workspace where EvaluationID = @EvalID and Iteration = 'Evaluation-2');
If @count1 <> @count2 Throw 51000, N'Error in cloning', 1

  • That last step (the validation-by-count) has been in place for five years, and has never thrown an exception - until now - and it is always exactly ONE ROW that is missing. If we delete the cloned rows and re-run the app, all rows clone properly.

We tried to brainstorm on this - maybe another app is deleting one cloned row? But we can't find anything in our system that would do this.

Frankly, we're fresh out of theories. Any ideas, anyone?

Note:

  • The INSERT statement is not wrapped in a try-catch, nor in an explicit tran.
  • Seems xact_abort is Off. I suppose we should turn it on?
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,552 questions
{count} votes

6 answers

Sort by: Most helpful
  1. JAL 571 Reputation points
    2022-09-04T14:13:53.593+00:00

    "But if you are doing an INSERT and then a count, and you want to be sure that nothing changes, you need to have the either the serializable ro the snapshot isolation level. The former protects you against phantom reads, that is a row is inserted while the operation is running and before the SELECT COUNT runs. With SNAPSHOT you read from a version store to get the state of the database when the transaction started."

    • A moment ago, I just posted some additional info regarding our numerous validations, our identity-column sequencing, and the interim period waiting for Evaluator-2 to assign the evaluation to himself. Given all these factors, I don't think isolation-level anomalies such as phantom rows are a factor here.
    0 comments No comments