PowerQuery Table.FuzzyJoin SimilarityColumnName index out of range

Erik Bohn 21 Reputation points Microsoft Employee
2021-06-25T13:10:17.693+00:00

Hi All,

When using Table.FuzzyJoin and specifying the optional argument SimilarityColumnName, I get:

Unexpected error: Index was outside the bounds of the array.
Details:
    Microsoft.Mashup.Evaluator.Interface.ErrorException: Index was outside the bounds of the array. ---> System.IndexOutOfRangeException: Index was outside the bounds of the array. ---> System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at Microsoft.Mashup.Engine1.Runtime.TableValue.Microsoft.Mashup.Engine.Interface.ITableValue.ColumnIdentity(Int32 index)
   at Microsoft.Mashup.Evaluator.ArrayHelpers.NewArray[T](Int32 count, Func`2 getter)
   at Microsoft.Mashup.Evaluator.ITableSourceSerializationExtensions.WriteITableSource(BinaryWriter writer, ITableSource tableSource)
   at Microsoft.Mashup.Evaluator.BinarySerializer.Serialize(Action`1 serializer)
   at Microsoft.Mashup.Evaluator.Interface.BufferedMessage.Prepare()
   at Microsoft.Mashup.Evaluator.ChannelMessenger.PostWithoutFlowControl(MessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.RemotePreviewValueSource.<>c__DisplayClass0_0.<RunStub>b__0()
   at Microsoft.Mashup.Evaluator.EvaluationHost.ReportExceptions(IHostTrace trace, IEngineHost engineHost, IMessageChannel channel, Action action)
   --- End of inner exception stack trace ---
   at Microsoft.Mashup.Evaluator.EvaluationHost.<>c__DisplayClass11_0.<TryReportException>b__1()
   at Microsoft.Mashup.Common.SafeExceptions.IgnoreSafeExceptions(IEngineHost host, IHostTrace trace, Action action)
   at Microsoft.Mashup.Evaluator.EvaluationHost.TryReportException(IHostTrace trace, IEngineHost engineHost, IMessageChannel channel, Exception exception)
   at Microsoft.Mashup.Evaluator.EvaluationHost.ReportExceptions(IHostTrace trace, IEngineHost engineHost, IMessageChannel channel, Action action)
   at Microsoft.Mashup.Evaluator.RemotePreviewValueSource.RunStub(IEngineHost engineHost, IMessageChannel channel, Func`1 getPreviewValueSource)
   at Microsoft.Mashup.Evaluator.RemoteDocumentEvaluator.Service.<>c__DisplayClass12_1`1.<OnBeginGetResult>b__0()
   at Microsoft.Mashup.Evaluator.EvaluationHost.ReportExceptions(IHostTrace trace, IEngineHost engineHost, IMessageChannel channel, Action action)
   at Microsoft.Mashup.Evaluator.RemoteDocumentEvaluator.Service.OnBeginGetResult[T](IMessageChannel channel, BeginGetResultMessage message, Action`1 action)
   at Microsoft.Mashup.Evaluator.RemoteDocumentEvaluator.Service.OnBeginGetPreviewValueSource(IMessageChannel channel, BeginGetPreviewValueSourceMessage message)
   at Microsoft.Mashup.Evaluator.MessageHandlers.TryDispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.ChannelMessenger.ChannelMessageHandlers.TryDispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.MessageHandlers.Dispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.ChannelMessenger.OnMessageWithUnknownChannel(IMessageChannel baseChannel, MessageWithUnknownChannel messageWithUnknownChannel)
   at Microsoft.Mashup.Evaluator.MessageHandlers.TryDispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.ChannelMessenger.ChannelMessageHandlers.TryDispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.MessageHandlers.Dispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.EvaluationHost.Run()
   at Microsoft.Mashup.Container.EvaluationContainerMain.Run(Object args)
   at Microsoft.Mashup.Evaluator.SafeThread2.<>c__DisplayClass9_0.<CreateAction>b__0(Object o)
   at Microsoft.Mashup.Container.EvaluationContainerMain.SafeRun(String[] args)
   at Microsoft.Mashup.Container.EvaluationContainerMain.Main(String[] args)
   --- End of inner exception stack trace ---
   at Microsoft.Mashup.Evaluator.EvaluationHost.OnException(IEngineHost engineHost, IMessageChannel channel, ExceptionMessage message)
   at Microsoft.Mashup.Evaluator.MessageHandlers.TryDispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.MessageHandlers.Dispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.ChannelMessenger.ChannelMessageHandlers.TryDispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.MessageHandlers.Dispatch(IMessageChannel channel, Message message)
   at Microsoft.Mashup.Evaluator.Interface.IMessageChannelExtensions.WaitFor[T](IMessageChannel channel)
   at Microsoft.Mashup.Evaluator.RemotePreviewValueSource.PreviewValueSource.WaitFor(Func`1 condition, Boolean disposing)
   at Microsoft.Mashup.Evaluator.RemotePreviewValueSource.PreviewValueSource.get_TableSource()
   at Microsoft.Mashup.Evaluator.Interface.TracingPreviewValueSource.get_TableSource()
   at Microsoft.Mashup.Host.Document.Analysis.PackageDocumentAnalysisInfo.PackagePartitionAnalysisInfo.SetPreviewValue(EvaluationResult2`1 result, Func`1 getStaleSince, Func`1 getSampled)

If I omit the SimilarityColumnName argument, the matching works. Is there any way to get a column showing the similarity score that was evaluated for each match?

Original M Query:

let

    t1=
     let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSspPUorVAdNJlRBWJlQkMQnBjwUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type)
    in
     #"Added Index",

    t2=
        let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSspPUorVAdNJlWBWVmJeqgKMlZdaAmHmZ+QpxcYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Column1", "ColumnMatch"}}),
    #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Indexy", 0, 1, Int64.Type)
    in
        #"Added Index",

    Source = Table.FuzzyJoin(t1, "Column1", t2, "ColumnMatch", 1, [ConcurrentRequests=null, Culture=null, IgnoreCase=null, IgnoreSpace=null, NumberOfMatches=5, SimilarityColumnName="sim", Threshold=0.8, TransformationTable=null])
in
    Source
Community Center | Not monitored

1 answer

  1. Lz._ 9,016 Reputation points
    2021-06-25T18:00:39.517+00:00

    Hi @Erik Bohn

    I can reproduce the issue with just that option on Excel 365 x64, version 2105, build 14026.20308:

    let  
        t1 = Table.FromList({"bob","bobby","bib","bab","bib"}, null,  
            type table [Column1=text]  
        ),  
        t2 = Table.FromList({"bob","bobby","janne","jannette","john"}, null,  
            type table [cMatch=text]  
        ),  
        Source = Table.FuzzyJoin(  
            t1,"Column1",  
            t2,"cMatch",  
            JoinKind.Inner,  
            [SimilarityColumnName="ABC"]  
        )  
    in  
        Source  
    

    I suggest you send this as a Frown and update your initial post with your product version and build (the dev team monitors the forum on a regular basis).

    In the meantime it seems you can use Table.FuzzyNestedJoin. The following works here:

    ...  
        Source = Table.FuzzyNestedJoin(  
            t1,"Column1",  
            t2,"cMatch",  
            "Foo", JoinKind.Inner,  
            [NumberOfMatches=5, SimilarityColumnName="ABC"]  
        ),  
        ExpandedFoo = Table.ExpandTableColumn(Source, "Foo", {"cMatch", "ABC"})  
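
    For convenience, here is the workaround assembled into one complete query. It is only a sketch that combines the two snippets above (same sample tables, same "Foo" and "ABC" names); it has not been checked on builds other than the one mentioned, so treat it as a starting point rather than a guaranteed fix.

    ```
    let
        // Sample tables taken from the repro above
        t1 = Table.FromList({"bob","bobby","bib","bab","bib"}, null,
            type table [Column1=text]
        ),
        t2 = Table.FromList({"bob","bobby","janne","jannette","john"}, null,
            type table [cMatch=text]
        ),
        // Nested fuzzy join: matches from t2 land in a table column named "Foo",
        // and each nested row carries the similarity score in column "ABC"
        Source = Table.FuzzyNestedJoin(
            t1,"Column1",
            t2,"cMatch",
            "Foo", JoinKind.Inner,
            [NumberOfMatches=5, SimilarityColumnName="ABC"]
        ),
        // Flatten the nested tables so the matched value and its score sit next to Column1
        ExpandedFoo = Table.ExpandTableColumn(Source, "Foo", {"cMatch", "ABC"})
    in
        ExpandedFoo
    ```

    To adapt this to the query in the question, swap in your own t1 and t2, use "ColumnMatch" instead of "cMatch", add Threshold=0.8 to the options record, and list the t2 columns you want to keep (for example "Indexy") in the expand step.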
    

    Hope this helps a bit
