Excel Power Query - Find record based on partial string

cmd41par 21 Reputation points
2020-12-22T18:13:23.59+00:00

I have a list of banking transactions with a longDescription column. Somewhere in the longDescription is a
string that is contained my Categories table in the column StrMatch.

sample transactions.longDescription "Bill Pay Verizon....". "PAYPAL INST...GOOGLE"

sample categories.StrMatch could be "Verizon", "Google"...

the sql i would execute for each row in transactions would be something like the below. I am
trying to find the StrMatch that the longDescription contains and then put the StrMatch in a transactions column.

select * from Categories where CONTAINS(StrMatch, @longDescription)

I have hard-coded the list for now but i would think there is a better way. M-code sample below...

AddStrMatch = Table.AddColumn(AddYYYYMM, "StrMatch", 

else if Text.Contains(Text.Upper([Column5]), "GOOGLE") then "GOOGLE"
else if Text.Contains(Text.Upper([Column5]), "VERIZON") then "VERIZON"
...

thx.

Community Center | Not monitored
0 comments No comments
{count} votes

Accepted answer
  1. Lz._ 9,016 Reputation points
    2020-12-24T08:39:22.76+00:00

    Hi @cmd41par

    Difficult to say for sure with only 2 truncated transactions.longDescriptions but - if your product has it - you could do it with function Table.FuzzyJoin/Table.FuzzyNestedJoin

    Otherwise, assuming data in Table1 and StrMatch in Table2:

    51007-demo.png

    the following will return the 1st match in the StrMatch list:

    let  
        fxMatch = (String as text) =>  
            List.First(  
                List.RemoveNulls(  
                    List.Generate(  
                        ()=>[i = 0, out = if Text.Contains(String, Categories{i}, Comparer.OrdinalIgnoreCase) then Categories{i} else null],  
                        each [i] < NbOfCategories,  
                        each  
                            [  
                                i = [i] + (if [out] <> null then NbOfCategories else 1),  
                                out = if Text.Contains(String, Categories{i}, Comparer.OrdinalIgnoreCase) then Categories{i} else null  
                            ],  
                        each [out]  
                    )  
                )  
            ),  
      
        Source = Table1,  
        Categories = List.Buffer(Table2[StrMatch]),  
        NbOfCategories = List.Count(Categories),  
        Categorized = Table.AddColumn(Source, "Category", each fxMatch([LongDescription]), type text)  
    in  
        Categorized  
    

    Sample with both options available here

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. cmd41par 21 Reputation points
    2020-12-25T13:39:05.847+00:00

    This is great - thank you.

    I got the Fuzzyjoin to work but I was very interested in how to do the loops.

    Your answer with the function iterating over the Categories is just what I was looking for. Now I have to study it and understand each step !

    PS - Next time I will do a better job with the data.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.