Excel Power Query - Find record based on partial string

cmd41par 21 Reputation points
2020-12-22T18:13:23.59+00:00

I have a list of banking transactions with a longDescription column. Somewhere in the longDescription is a
string that is contained my Categories table in the column StrMatch.

sample transactions.longDescription "Bill Pay Verizon....". "PAYPAL INST...GOOGLE"

sample categories.StrMatch could be "Verizon", "Google"...

the sql i would execute for each row in transactions would be something like the below. I am
trying to find the StrMatch that the longDescription contains and then put the StrMatch in a transactions column.

select * from Categories where CONTAINS(StrMatch, @longDescription)

I have hard-coded the list for now but i would think there is a better way. M-code sample below...

AddStrMatch = Table.AddColumn(AddYYYYMM, "StrMatch", 

else if Text.Contains(Text.Upper([Column5]), "GOOGLE") then "GOOGLE"
else if Text.Contains(Text.Upper([Column5]), "VERIZON") then "VERIZON"
...

thx.

Community Center | Not monitored
0 comments No comments
{count} votes

Answer accepted by question author
  1. Lz._ 38,106 Reputation points Volunteer Moderator
    2020-12-24T08:39:22.76+00:00

    Hi @cmd41par

    Difficult to say for sure with only 2 truncated transactions.longDescriptions but - if your product has it - you could do it with function Table.FuzzyJoin/Table.FuzzyNestedJoin

    Otherwise, assuming data in Table1 and StrMatch in Table2:

    51007-demo.png

    the following will return the 1st match in the StrMatch list:

    let  
        fxMatch = (String as text) =>  
            List.First(  
                List.RemoveNulls(  
                    List.Generate(  
                        ()=>[i = 0, out = if Text.Contains(String, Categories{i}, Comparer.OrdinalIgnoreCase) then Categories{i} else null],  
                        each [i] < NbOfCategories,  
                        each  
                            [  
                                i = [i] + (if [out] <> null then NbOfCategories else 1),  
                                out = if Text.Contains(String, Categories{i}, Comparer.OrdinalIgnoreCase) then Categories{i} else null  
                            ],  
                        each [out]  
                    )  
                )  
            ),  
      
        Source = Table1,  
        Categories = List.Buffer(Table2[StrMatch]),  
        NbOfCategories = List.Count(Categories),  
        Categorized = Table.AddColumn(Source, "Category", each fxMatch([LongDescription]), type text)  
    in  
        Categorized  
    

    Sample with both options available here

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. cmd41par 21 Reputation points
    2020-12-25T13:39:05.847+00:00

    This is great - thank you.

    I got the Fuzzyjoin to work but I was very interested in how to do the loops.

    Your answer with the function iterating over the Categories is just what I was looking for. Now I have to study it and understand each step !

    PS - Next time I will do a better job with the data.


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.