Parsing copy/paste data with multiple spaces

Question

Parsing copy/paste data with multiple spaces

Anonymous

I have some data that is emailed to me in a 50-ish page PDF that has four (4) columns of structured tabular data in several pages. The data takes up about 4 pages of the PDF so parsing the entire PDF as a single-set of data isn't practical. My current approach is to parse the data after it's copy/pasted from the PDF. When I copy/paste I can structure the data into a single column into worksheet "RawData". Hence, with four (4) columns of 50-100 rows in each column I can get 200-400 rows of Excel content that is pasted into column A and only column A of my "RawData" worksheet.

Uniquely, the data has several spaces as separators. That is, from what I've worked with each "part" of the data that I need to parse has a space at the END of THAT data except at the very end of the row of data which is denoted by an end-of-line, CRLF, carriage return/line-feed. The data is consistent within each row but not fully consistent of ALL rows. See below for examples!

Here's an example of the data inColumn A of my worksheet "RawData" and my parsing which works fine for the first four (4) columns that I've done.

41 48 32 B B 35.8 FRes 4.8 BEN 22.70 +1.11 –9 4.1m 8o

My parsing command is as follows with the results immediately below it:

=IF(RawData!$A19="","",TRIM(MID(RawData!$A19,FIND(" ",RawData!$A19,LEN($A19)+1+LEN($B19)+1+LEN($C19)),LEN($A19)+LEN($B19)+LEN($C19)-1)))

Columns:

A B C D

41 48 32 B B

What I want is to put each subset of the data into a separate column so that I can sort, calculate, process etc....!! That is, the structure that I want is to put each subset into its own unique column. A 'subset' of the data is separated by blanks or spaces.

Second example:

.. .. 38 .. B– 9.9 Frnk 11FTF 8.54 +.07 –33 85 ..k

Hence, ideally, I need this output structure:

Columns:

A B C D E F G H I J K L M N

41 48 32 B B 35.8 Fres 4.8 BEN 22.70 +1.11 -9 4.1m 8o

.. .. 38 .. B- 0.0 Frnk 11 FTF 8.54 +0.07 -33 85 ..k

You'll notice that Column I generally has alphabetic data and Column H has numbers. The original raw data for my first example has a space between the 4.8 and BEN but the second example does not. That is, the raw data for the second example has 11FTF together - as far as I can tell from the copy/paste of the data. I could probably parse from the left for the Columns A through G and parse from the right for Columns N through I and look for the numeric vs alphabetic differences in the data. Column G can be 20 characters +/- at times as it describes what the ROW is about.

Thirdly, sometimes the data comes in withoutanything FOR Column H. Here's an example:

87 34 90 BA– NH FreHC FXHE 18.75 +1.06 +176 286 45k

Hence, ideally, I need this output structure:

Columns:

A B C D E F G H I J K L M N

87 34 90 B A- NH FreHC FXHE 18.75 +1.06 +176 286 45k

Is there a better way to parse other than the MID function? Is the parsing even do-able?

These are just three (3) examples and it appears that these three (3) are the main if not all the various types of rows and examples.

Note: I'm not afraid of parsing by steps. That is do the main parsing into a ParsedWkSht1 and then parse the exceptions into a ParsedWkSht2 etc.

Thanks!!

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

Answer accepted by question author

0 additional answers

Answer 1

Hi, I'm Elise, an independent advisor and I'd be happy to help with your issue.

I think you best solution would be to use the text to column wizard, located in the data tab.

Use the delimited option and select space as the delimiter.

Let me know if that helps or if you need further assistance?

Kind Regards,

Elise

Answer 2

Hi Total

While Text to columnsmight be a way to solve this issue, probably some VBA code could also be a way to find a solution to your problem.

But It would be very difficult for us to give the right solution without your data to play with.

Could you send us a link to a copy of your "RawData" worksheet so we could find a solution from our side?

Here some videos that might help you

https://www.youtube.com/watch?v=QyZ6IMkln2U

https://www.youtube.com/watch?v=K80pDxv1134

Using Text to columns with your examples

Obviously it requires further manipulation to suit your requirements.

Regards

Jeovany

Share via

Parsing copy/paste data with multiple spaces

0 additional answers