How to merge lists with inconsistent names in Excel via vlookup or other built-in tools?

Anonymous
2023-10-18T19:19:20+00:00

Hello, I am having trouble using vlookup to merge textual data on 2 different sheets where the names are slightly different. For example, it has trouble finding David v Dave, Chris v Christopher, or Adam John Smith vs Adam Smith. The primary list I want to check against tends to have the longer versions of names.

The overall project is to take two different reports from two different enterprise-level systems, one for funding and one for training. I want to see which funded personnel have done their mandatory training. Currently, I am using Power Query to pull the reports, clean the data, and insert them into one worksheet. I then use vlookup on a third tab to merge the lists and show yes/no whether they have taken the training. Since the data comes from two different systems, there is no employee ID or other unifying key aside from their names.

My current solution is to use Power Query to cut off names with 3+ parts from the funding list with longer names via the Extract Text Before Delimiter tool. This has helped remove a lot of false negatives, but obviously risks creating false positives. I then use the following formula to identify whether or not they have been trained:

=IFERROR(IF(VLOOKUP("*"&A2, 'CITI RCR training list'!$A$2:$B$5029,1,FALSE) = A2, "Yes"), "No")

I have tried using "*"&A2, "*"&A2&"*", and A2&"*". None of these options capture 100% of mismatched names. I have given up entirely on vlookup TRUE approximate match because it doesn't seem to work well with strings.

I am using Office Professional Plus 2019, Version 1808. Build 10402.20023 Click to Run. This version does not seem to have the Fuzzy Match merge option for Power Query, so that is off the table. I also cannot using the Fuzzy Lookup Add-in because this project needs to run on multiple computers via a network drive, and I cannot install the add-in everywhere.

The goal is to be able to hand this off to someone with little Excel knowledge and have them do little to no manual work to get the final report. My current set up allows them to simply paste in new report data into the source files and then hit refresh on the main sheet. I need to keep it at that level of simplicity once it is running. To be clear, the system works as is, but I need to improve the name matching and capture as many mismatched names as possible.

Is there a more clever way to improve name matching in Excel beyond adding the vlookup wildcard characters? Maybe additional data cleaning steps I didn't think of?

Thank you!

Here is an example of the final product so far for reference:

Microsoft 365 and Office | Excel | For business | Windows

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Anonymous
    2023-10-18T19:56:29+00:00

    Data quality means that the data values should be exact.

    I would create a sorted list of unique names from each of your sheets, then place them side by side and insert blanks into one or delete values from the other to get the data to align if needed. For example

    Bernie Abraham
    Fred Bernard
    Sam Edward
    Frederick
    Patrick
    Samuel

    Becomes either

    Abraham
    Bernie Bernard
    Edward
    Fred Frederick
    Patrick
    Sam Samuel

    OR

    Bernie Bernard
    Fred Frederick
    Sam Samuel

    Then I would use a column with a vlookup formula against the thorter names to find the correct longer names.

    Then copy the formula results and overwrite the shorter names.

    0 comments No comments