How to sum chosen digits from nvarchar column?

Decompressor 81 Reputation points
2021-02-21T12:26:53.503+00:00

I need sum digits for every row and result has to be: Id result 1 48 2 158 3 371 I have sql server 2011

CREATE TABLE [dbo].[Goods] ( [c31] NVARCHAR (MAX) NULL, [Id] INT IDENTITY (1, 1) NOT NULL );
 insert Goods values('1. Black film -14rolls,red film 12rolls and blue film 22rolls 2.Some other things'), ('1.Dogs feed 15KG and 16KG; there are 125KG in paper pack, 2KG in plastic 2.Some data'), ('1. Film -151 rolls type 1 and 220 rolls type 2 2. Some data')

Id. result

  1. 48
  2. 158
  3. 371
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,584 questions
{count} votes

Accepted answer
  1. Yitzhak Khabinsky 25,481 Reputation points
    2021-02-21T17:42:01.937+00:00

    @Decompressor ,

    Please try the following method.
    It will work starting from SQL Server 2008 onwards.

    It is using XML, XQuery, and its FLWOR expression.

    The algorithm is relatively simple:

    1. Cleanup input string via multiple REPLACE() function calls. (Big thanks goes to Viorel).
    2. Conversion to XML data type via CAST( ... AS XML).
    3. Filtering out XML nodes that hold non-integer data type value via FLWOR expression.
    4. Sum remaining XML nodes with INT data type via XQuery sum() function.
    5. Output the final result.

    SQL

    DECLARE @Goods TABLE(ID INT IDENTITY PRIMARY KEY, c31 NVARCHAR(MAX) NULL);  
    INSERT @Goods VALUES  
    ('1. Black film -14rolls,red film 12rolls and blue film 22rolls 2.Some other things'),  
    ('1.Dogs feed 15KG and 16KG; there are 125KG in paper pack, 2KG in plastic 2.Some data'),  
    ('1. Film -151 rolls type 1 and 220 rolls type 2 2. Some data');  
      
    DECLARE @separator CHAR(1) = SPACE(1);  
      
    SELECT ID  
     , CAST('<root><r>' +   
     REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(c31,  
     '-', @separator),   
     'rolls', @separator),   
     'KG', @separator),   
     'type 1', @separator),  
     'type 2', @separator), @separator, '</r><r>') + '</r></root>' AS XML)  
     .query('<root>  
            {  
            for $x in /root/r  
            return if (xs:int($x) instance of xs:int) then $x   
     else () (: filter out non-integers :)  
            }  
            </root>')  
     .query('sum(/root/r/text())').value('.', 'INT') AS [Result]  
    FROM @Goods  
    

    Output

    +----+-----+  
    | ID | Sum |  
    +----+-----+  
    |  1 |  48 |  
    |  2 | 158 |  
    |  3 | 371 |  
    +----+-----+  
    
    1 person found this answer helpful.
    0 comments No comments

3 additional answers

Sort by: Most helpful
  1. Erland Sommarskog 104.2K Reputation points MVP
    2021-02-21T12:50:43.277+00:00

    Apparently some digits count, and some do not.

    In order to pursue this, you first need to have clear rules for which numbers that count and which does not.

    Then you need a good tool for this and that is regular expressions. For which there is no native support in SQL Server. You can get access to regular expression by calling a CLR routine (and starting with SQL Server 2017, by forking out to a Python script), but you are probably better of by exporting the data to a file to fine-tune a script in Perl, Powershell, Python or what you fancy. Because this is nothing you get will get working in the first example.

    And absolutely not if you insist on doing in T-SQL.


  2. Viorel 114K Reputation points
    2021-02-21T16:05:11.47+00:00

    To see how inelegant a possible solution looks even in modern SQL, evaluate an attempt for SQL Server 2019:

    ;
    with E as
    (
        select Id,
            replace(replace(replace(replace(replace(replace(c31,
                '.', '*'), 
                '-', ' '), 
                'rolls', ' '), 
                'KG', ' '), 
                'type 1', ' '),
                'type 2', ' ') as txt
        from Goods
    )
    select Id, sum(num) as [sum]
    from E
    cross apply (select try_cast(value as int) from string_split(txt, ' ')) dt(num)
    group by Id
    
    /*
    
    Id    sum
    1   48
    2  158
    3  371
    
    */
    

    It only supports the cases that are shown in your sample data.

    0 comments No comments

  3. EchoLiu-MSFT 14,576 Reputation points
    2021-02-22T06:14:25.18+00:00

    Hi @Decompressor

    As other experts have said, for your sample data, TSQL can achieve the output you expect, but for your millions of rows of data, TSQL is obviously not the right choice.

    If you have any question, please feel free to let me know.

    Regards
    Echo


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.