How to Remove Characters from Fields

Carlton Patterson 741 Reputation points
2022-10-08T19:42:35.32+00:00

A community member called Tom Cooper helped me with code to remove https:// from fields.

I was wondering if someone could help me build on the code to remove some forward slash characters from an http address.

For example, some fields may include a forward slash as follows www.parliament.uk/lords or www.intermed.de/ see image

248702-forwardslash.png

I would like some help modifying the code below such that www.parliament.uk/lords returns www.parliament.uk and www.intermed.de/ returns www.intermed.de

SELECT DISTINCT  
  homepage_url  
   ,SubString(websiteurl,   
     Case When CharIndex('://', websiteurl) = 0 Then 1 Else CharIndex('://', websiteurl)+ 3 End, Len(websiteurl)) as websiteurl  
FROM dbo.account  
INNER JOIN dbo.CRM2CBURL_Lookup L  
  ON Id = [Key]  
INNER JOIN dbo.organizations  
  ON CB_URL_KEY = cb_url  

You don't have to worry too much about the joins. I just need the code to remove the forward slash (and anything after the forward slash)

Sample Data

CREATE TABLE #tmpTable (  
    homepage_url nvarchar(150),  
    websiteurl varchar(100))  
  
INSERT #tmpTable VALUES  
(N'http://www.opisnet.com','www.opisnet.com'),  
(N'http://www.ebookers.com','www.ebookers.com'),  
(N'http://navico.com','navico.com'),  
(N'https://zeb-consulting.com','www.zeb.de'),  
(N'http://www.hobbycraft.co.uk','www.hobbycraft.co.uk'),  
(N'http://www.parliament.uk/lords','www.parliament.uk/lords'),  
(N'https://www.intermed.de/','www.intermed.de/'),  
(N'http://www.iac.com','www.iac.com'),  
(N'https://www.esteve.es','www.esteve.com/'),  
(N'http://www.kornferry.com/','www.kornferry.com')  
  
SELECT * FROM #tmpTable  
  
Developer technologies Transact-SQL
0 comments No comments
{count} votes

Accepted answer
  1. Viorel 122.5K Reputation points
    2022-10-08T20:35:14.527+00:00

    Check a query that processes both of "://" and "/":

    SELECT *,  
       substring(websiteurl, b, d - b)  
    FROM #tmpTable  
    cross apply (values (charindex('://', websiteurl))) a(a)  
    cross apply (values (iif(a = 0, 1, a + 3))) b(b)  
    cross apply (values (charindex('/', websiteurl, b))) c(c)  
    cross apply (values (iif(c = 0, len(websiteurl) + 1, c))) d(d)  
    

    If it works, it can be integrated with your real tables.


1 additional answer

Sort by: Most helpful
  1. Yitzhak Khabinsky 26,586 Reputation points
    2022-10-09T00:20:15.16+00:00

    Hi @Carlton Patterson ,

    Please try the following solution.
    It is using tokenization instead of parsing and searching.

    You need all the time 3rd token from the URL.
    That's why the XPath expression /root/r[3] gets it right away.

    SQL

    -- DDL and sample data population, start  
    DECLARE @tbl TABLE (homepage_url NVARCHAR(150));  
    INSERT @tbl  VALUES  
    (N'http://www.opisnet.com'),  
    (N'http://www.ebookers.com'),  
    (N'http://navico.com'),  
    (N'https://zeb-consulting.com'),  
    (N'http://www.hobbycraft.co.uk'),  
    (N'http://www.parliament.uk/lords'),  
    (N'https://www.intermed.de/'),  
    (N'http://www.iac.com'),  
    (N'https://www.esteve.es'),  
    (N'http://www.kornferry.com/');  
    -- DDL and sample data population, end  
      
    DECLARE @separator CHAR(1) = '/';  
      
    SELECT t.*   
    	, c.value('(/root/r[3]/text())[1]', 'VARCHAR(150)') AS websiteurl  
    FROM @tbl AS t  
    CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' +   
          REPLACE(homepage_url, @separator, ']]></r><r><![CDATA[') +   
          ']]></r></root>' AS XML)) AS t1(c);  
    

    Output

    +--------------------------------+----------------------+  
    |          homepage_url          |      websiteurl      |  
    +--------------------------------+----------------------+  
    | http://www.opisnet.com         | www.opisnet.com      |  
    | http://www.ebookers.com        | www.ebookers.com     |  
    | http://navico.com              | navico.com           |  
    | https://zeb-consulting.com     | zeb-consulting.com   |  
    | http://www.hobbycraft.co.uk    | www.hobbycraft.co.uk |  
    | http://www.parliament.uk/lords | www.parliament.uk    |  
    | https://www.intermed.de/       | www.intermed.de      |  
    | http://www.iac.com             | www.iac.com          |  
    | https://www.esteve.es          | www.esteve.es        |  
    | http://www.kornferry.com/      | www.kornferry.com    |  
    +--------------------------------+----------------------+  
    
    1 person found this answer helpful.

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.