I need to Consolidate similar/misspelled Names

Question

I need to Consolidate similar/misspelled Names

Anonymous

We often times get files we need to analize and there will be multiple names for what is really one name. Example being the name should read "All Side Materials" but there will be mutiple fields that read "All Side Mat.", "All Side Materials Co", Al Side Materials". These slight mispellings or variations cause the data to skew when put into a pivot table sorted by the name.

Currently I use the filter and look for the similar names, check all that apply, then unify the name. This process is very time consuming when there are 100+ customers with 10+ variations of each name.

Question being: is the a quicker way to unify the variations of names in a column?

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

0 comments

3 answers

Answer 1

We often times get files we need to analize and there will be multiple names for what is really one name. Example being the name should read "All Side Materials" but there will be mutiple fields that read "All Side Mat.", "All Side Materials Co", Al Side Materials". These slight mispellings or variations cause the data to skew when put into a pivot table sorted by the name.

Currently I use the filter and look for the similar names, check all that apply, then unify the name. This process is very time consuming when there are 100+ customers with 10+ variations of each name.

Question being: is the a quicker way to unify the variations of names in a column?

Sounds like you should use Data Validation as Bernard mentioned, and nip the problem at the start. :>)

>> the name should read "All Side Materials"

I'm not sure if this would help, but suppose you had a list of good names you wanted to use.

One of a few programming options might be a function called Edit Distance.

http://en.wikipedia.org/wiki/Edit\_distance

The idea would be to search your list and pick out names that fell within a certain threshold.

If we had names like:

{"All Side Materials","All Side Mat.","All Side Materials Co","Al Side Materials","Building Materials","Plumbing Stuff"}

Here's a table:

1 All Side Materials , Al Side Materials

3 All Side Materials , All Side Materials Co

6 All Side Materials , All Side Mat.

7 All Side Materials , Building Materials

14 All Side Materials, Plumbing Stuff

The idea is that the first comparison returns 1,

and the last comparison returns 14.

It's not perfect for what you are doing, but you could use a value of say <=6 to start.

= = = = = = = = = =

HTH :>)

~Dana

Answer 2

Anonymous

Are they using a spreadsheet that you supplied? If so, update the spreadsheet using data validation so they get a drop down box and can choose from a list. If they type something that is not on the list, you can give an error or warning to the user.

Brad

0 comments

Answer 3

Sounds like this is an ongoing exercise.

Could you make a list of knwn and possible 'errors', put them into a table and use Vlookup to get the correct result?

If the source is a Excel file, could this have Data Validation added to it to avoid errors in the first place?

best wishes

Share via

I need to Consolidate similar/misspelled Names

3 answers