Parsing copy/paste data with multiple spaces

Anonymous
2020-06-07T13:41:39+00:00

I have some data that is emailed to me in a 50-ish page PDF that has four (4) columns of structured tabular data in several pages.  The data takes up about 4 pages of the PDF so parsing the entire PDF as a single-set of data isn't practical.  My current approach is to parse the data after it's copy/pasted from the PDF. When I copy/paste I can structure the data into a single column into worksheet "RawData".  Hence, with four (4) columns of 50-100 rows in each column I can get 200-400 rows of Excel content that is pasted into column A and only column A of my "RawData" worksheet.

Uniquely, the data has several spaces as separators.  That is, from what I've worked with each "part" of the data that I need to parse has a space at the END of THAT data except at the very end of the row of data which is denoted by an end-of-line, CRLF, carriage return/line-feed.  The data is consistent within each row but not fully consistent of ALL rows.  See below for examples!

Here's an example of the data inColumn A of my worksheet "RawData" and my parsing which works fine for the first four (4) columns that I've done.

41 48 32 B B 35.8 FRes 4.8 BEN 22.70 +1.11 –9 4.1m 8o

My parsing command is as follows with the results immediately below it:

=IF(RawData!$A19="","",TRIM(MID(RawData!$A19,FIND(" ",RawData!$A19,LEN($A19)+1+LEN($B19)+1+LEN($C19)),LEN($A19)+LEN($B19)+LEN($C19)-1)))

Columns:

A       B         C        D

41     48        32      B B

What I want is to put each subset of the data into a separate column so that I can sort, calculate, process etc....!!  That is, the structure that I want is to put each subset into its own unique column.  A 'subset' of the data is separated by blanks or spaces.

Second example:

.. .. 38 .. B– 9.9 Frnk 11FTF 8.54 +.07 –33 85 ..k

Hence, ideally, I need this output structure:

Columns:

A       B        C         D      E       F      G      H       I        J         K           L     M        N

41     48       32       B       B      35.8  Fres  4.8   BEN   22.70  +1.11   -9   4.1m   8o

..        ..         38       ..        B-     0.0   Frnk  11   FTF    8.54    +0.07   -33  85      ..k

You'll notice that Column I generally has alphabetic data and Column H has numbers.  The original raw data for my first example has a space between the 4.8 and BEN but the second example does not.  That is, the raw data for the second example has 11FTF together - as far as I can tell from the copy/paste of the data.  I could probably parse from the left for the Columns A through G and parse from the right for Columns N through I and look for the numeric vs alphabetic differences in the data.  Column G can be 20 characters +/- at times as it describes what the ROW is about.

Thirdly, sometimes the data comes in withoutanything FOR Column H.  Here's an example:

87 34 90 BA– NH FreHC FXHE 18.75 +1.06 +176 286 45k

Hence, ideally, I need this output structure:

Columns:

A       B        C         D      E       F        G          H         I            J           K        L      M        N

87     34      90        B       A-     NH    FreHC              FXHE  18.75  +1.06  +176     286  45k

Is there a better way to parse other than the MID function?  Is the parsing even do-able?

These are just three (3) examples and it appears that these three (3) are the main if not all the various types of rows and examples.

Note: I'm not afraid of parsing by steps.  That is do the main parsing into a ParsedWkSht1 and then parse the exceptions into a ParsedWkSht2 etc.

Thanks!!

Microsoft 365 and Office | Excel | For home | Windows

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

0 comments No comments
{count} vote
Answer accepted by question author
  1. Anonymous
    2020-06-07T16:07:53+00:00

    Hi, I'm Elise, an independent advisor and I'd be happy to help with your issue.

    I think you best solution would be to use the text to column wizard, located in the data tab.

    Use the delimited option and select space as the delimiter.

    Let me know if that helps or if you need further assistance?

    Kind Regards,

    Elise

    2 people found this answer helpful.
    0 comments No comments
Answer accepted by question author
  1. Anonymous
    2020-06-07T15:18:28+00:00

    Hi Total

    While Text to columnsmight be a way to solve this issue, probably some VBA code could also be a way to find a solution to your problem.

    But It would be very difficult for us to give the right solution without your data to play with.

    Could you send us a link to a copy of your "RawData" worksheet so we could find a solution from our side?

    Here some videos that might help you

    https://www.youtube.com/watch?v=QyZ6IMkln2U

    https://www.youtube.com/watch?v=K80pDxv1134

    Using Text to columns with your examples

    Obviously it requires further manipulation to suit your requirements.

    Regards

    Jeovany

    1 person found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful