Transform Indentations into columns

Maxence C 21 Reputation points
2021-01-08T10:02:49.75+00:00

Hello everyone,
Here's my problem (I want to solve it with SSIS):
I receive a flat file of the following form:
Food
(4 spaces)Sweet food
(4 spaces)(4 spaces)Fruits
(4 spaces)(4 spaces)(4 spaces)Apple
(4 spaces)(4 spaces)(4 spaces)Strawberry
(4 spaces)(4 spaces)(4 spaces)...
(4 spaces)(4 spaces)Cakes
(4 spaces)(4 spaces)(4 spaces)KitKat
(4 spaces)(4 spaces)(4 spaces)Bounty
(4 spaces)(4 spaces)(4 spaces)...
(4 spaces)Salty food
(4 spaces)(4 spaces)Vegetables
(4 spaces)(4 spaces)(4 spaces)Carrot
(4 spaces)(4 spaces)(4 spaces)Bean
(4 spaces)(4 spaces)(4 spaces)...

As you can see, it is the indentations that are important.

I would like to build a flat file of the following form :

Label;Level 1;Level 2; Level 3
Apple;Fruit;Sweet food;Food
Strawberry;Fruit;Sweet food;Food
KitKat;Cakes;Sweet food;Food
Bounty;Cakes;Sweet food;Food
Carrots;Vegetables;Salty food;Food
Beans;Vegetables;Salty food;Food

How can I do this? What components should I use?

SQL Server Integration Services
SQL Server Integration Services
A Microsoft platform for building enterprise-level data integration and data transformations solutions.
2,702 questions
0 comments No comments
{count} votes

Accepted answer
  1. Monalv-MSFT 5,926 Reputation points
    2021-01-15T02:45:32.457+00:00

    Hi @Maxence C ,

    1.Please import the flat file into the SQL table;
    a. Do not tick "Column names in the first data row".
    56867-ffcm-general.png
    b. Set the column name as Category(Your desired column name).
    56868-ffcm-advanced.png
    c. Preview data.
    56935-flatfile-sou.png
    d. Create new table in OLEDB Destination:

    CREATE TABLE [Categories] (  
       Id int Identity(1,1),  
       Category varchar(50)  
    )  
    

    56945-oledb-des.png
    e. Mapping in Des and ignore Id.
    56826-mapping-ignoreid.png

    2.Please use T-SQL to get the expected result in SSMS.(It is very convenient to use T-SQL to process data.)
    a.

    select * from Categories;  
    

    56855-metadata.png
    b.

    ;with cte as (  
    select Id,case when LEN(Category)-LEN(ltrim(Category))=0 then Category end  [Level 3]  
    ,case when LEN(Category)-LEN(ltrim(Category))=4 then ltrim(Category) else NULL end  [Level 2]  
    ,case when LEN(Category)-LEN(ltrim(Category))=8 then ltrim(Category) else NULL end  [Level 1]  
    ,case when LEN(Category)-LEN(ltrim(Category))=12 then ltrim(Category) else NULL end  [Label]  
     from Categories)  
     ,cte1 as (  
    select *,  
    max(case when [Level 3] is not null then Id end) over (order by Id) as maxid,  
    max(case when [Level 2] is not null then Id end) over (order by Id) as maxid1,  
    max(case when [Level 1] is not null then Id end) over (order by Id) as maxid2,  
    max(case when [Label] is not null then Id end) over (order by Id) as maxid3  
    from cte)  
    ,cte2 as (  
    select   
    coalesce([Label], max([Label]) over (partition by maxid,maxid1,maxid2,maxid3)) as [Label],  
    coalesce([Level 1], max([Level 1]) over (partition by maxid,maxid1,maxid2)) as [Level 1],  
    coalesce([Level 2], max([Level 2]) over (partition by maxid,maxid1)) as [Level 2],  
    coalesce([Level 3], max([Level 3]) over (partition by maxid)) as [Level 3]  
    from cte1)  
    select * from cte2 where [Label] is not null  
    go  
    

    56926-outputdata.png

    Best Regards,
    Mona


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    1 person found this answer helpful.
    0 comments No comments

3 additional answers

Sort by: Most helpful
  1. Monalv-MSFT 5,926 Reputation points
    2021-01-11T11:03:09.227+00:00

    Hi @Maxence C ,

    We can use Conditional Split Transformation, Union All Transformation and Derived Column Transformation in ssis package.

    Please refer to the following expressions and pictures:

    ConditionalSplit1:  
    FINDSTRING([Column 0],"            ",1) == 1  
      
    FINDSTRING([Column 0],"        ",1) == 1 && SUBSTRING([Column 0],9,10) != " "  
      
    FINDSTRING([Column 0],"    ",1) == 1 && SUBSTRING([Column 0],5,6) != " "  
      
    FINDSTRING([Column 0],"    ",1) == 0  
      
    DerivedColumn1:  
    REPLACE(Label,"            ","")  
      
    REPLACE([Level 1],"        ","")  
      
    REPLACE([Level 2],"    ","")  
      
    DerivedColumn2:  
    LEN(Label) > 0 ? (Label == "Apple" || Label == "Strawberry" ? "Fruit" : (Label == "KitKat" || Label == "Bounty" ? "Cakes" : "Vegetables")) : NULL(DT_WSTR,50)  
      
    DerivedColumn3:  
    LEN([Level 1]) > 0 ? ([Level 1] == "Vegetables" ? "Salty food" : "Sweet food") : NULL(DT_WSTR,50)  
      
    DerivedColumn4:  
    LEN(Label) > 0 || LEN([Level 1]) > 0 || LEN([Level 2]) > 0 || LEN([Level 3]) > 0 ? "Food" : NULL(DT_WSTR,50)  
      
    ConditionalSplit2:  
    ISNULL(Label) ==  FALSE   
    

    55442-df.png
    55402-flatfilesource.png
    55443-cs1.png
    55451-ua.png
    55403-dec1.png
    55353-cs2.png
    55354-desiredoutput.png

    Best Regards,
    Mona

    ----------

    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    1 person found this answer helpful.

  2. Monalv-MSFT 5,926 Reputation points
    2021-01-13T07:59:49.233+00:00

    Hi @Maxence C ,

    We can use ? : (Conditional) (SSIS Expression) in Derived Column 2-4 .

    DerivedColumn2:
    LEN(Label) > 0 ? (Label == "Apple" || Label == "Strawberry" ? "Fruit" : (Label == "KitKat" || Label == "Bounty" ? "Cakes" : "Vegetables")) : NULL(DT_WSTR,50)
    56078-derivedcolumn2.png

    DerivedColumn3:
    LEN([Level 1]) > 0 ? ([Level 1] == "Vegetables" ? "Salty food" : "Sweet food") : NULL(DT_WSTR,50)
    56123-derivedcolumn3.png

    DerivedColumn4:
    LEN(Label) > 0 || LEN([Level 1]) > 0 || LEN([Level 2]) > 0 || LEN([Level 3]) > 0 ? "Food" : NULL(DT_WSTR,50)
    56181-derivedcolumn4.png

    Best Regards,
    Mona

    0 comments No comments

  3. Maxence C 21 Reputation points
    2021-01-13T15:32:33.987+00:00

    Hi Monalv-msft,
    Thank you again for your response.
    I had already understood this. What I ask is to be able to generalize. For example, if I add values in the levels, I would like it to work the same way without having to modify the derived columns (in reality, I have many more levels and values per level):
    Food
    (4 spaces)Sweet food
    (4 spaces)(4 spaces)Fruit
    (4 spaces)(4 spaces)(4 spaces)Apple
    (4 spaces)(4 spaces)(4 spaces)Strawberry
    (4 spaces)(4 spaces)(4 spaces)...
    (4 spaces)(4 spaces)Cakes
    (4 spaces)(4 spaces)(4 spaces)KitKat
    (4 spaces)(4 spaces)(4 spaces)Bounty
    (4 spaces)(4 spaces)(4 spaces)...
    (4 spaces)Salty food
    (4 spaces)(4 spaces)Vegetables
    (4 spaces)(4 spaces)(4 spaces)Carrot
    (4 spaces)(4 spaces)(4 spaces)Bean
    (4 spaces)(4 spaces)(4 spaces)...
    (4 spaces)(4 spaces)Meat
    (4 spaces)(4 spaces)(4 spaces)Beef
    (4 spaces)(4 spaces)(4 spaces)Pork
    (4 spaces)(4 spaces)(4 spaces)...
    Clothing
    (4 spaces)Children's clothing
    (4 spaces)(4 spaces)Winter clothing
    (4 spaces)(4 spaces)(4 spaces)Ski jacket
    (4 spaces)(4 spaces)(4 spaces)Ski pants
    (4 spaces)(4 spaces)(4 spaces)...

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.