Regular expression for an exact date in multiple formats

dani shamir 81 Reputation points
2020-05-05T10:31:21.75+00:00

I have a large file with URL strings such as:

http://tg24.sky.it/mondo/2020/05/01/corea-nord-kim-riappare.html

http://tg24.sky.it/mondo/01/05/2020/corea-nord-kim-riappare.html
http://tg24.sky.it/mondo/2020/04/30/corea-nord-kim-riappare.html

http://tg24.sky.it/mondo/04/30/2020/corea-nord-kim-riappare.html

I need to extract only the URLs that contain the date 01-05-2020, in whatever format it arrives, with or without separators.

I have written the following regexp:

^.*/?0?(1|5|(?:20)?20)[/-]0?(1|5|(?:20)?20)[/-]0?(1|5|(?:20)?20)/?.*$

It works, but it also finds false positives such as:

XXXX/5/5/5/YYYYY

So I understand that I need to enhance it so that each component is used only once: if the first field is MM, the second may only be DD or YYYY, and the third may only be whichever one is left.
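A minimal sketch of that "each component exactly once" idea, in Python rather than as one large regexp. It assumes the three fields are always separated by "/" or "-" (as in the regexp above) and that a bare "20" stands for the year 2020. The names DATE_FIELDS, TARGET, normalize and has_target_date are made up for this illustration. Instead of listing every ordering in the pattern, it captures three numeric fields and checks that, taken together, they are exactly 1, 5 and 2020:

```python
import re

# Three numeric fields separated by "/" or "-", not glued to other digits.
# The lookahead lets finditer try every starting position, so overlapping
# candidates such as "7/2020/05/01" are all examined.
DATE_FIELDS = re.compile(
    r"(?<!\d)(?=(\d{1,4})[/-](\d{1,4})[/-](\d{1,4})(?!\d))"
)

# Day, month and year of 01-05-2020, order-independent.
TARGET = sorted([1, 5, 2020])


def normalize(field: str) -> int:
    """Drop leading zeros; treat a two-digit year 20 as 2020 (assumption)."""
    value = int(field)
    return 2020 if value == 20 else value


def has_target_date(url: str) -> bool:
    """True if some run of three fields is a permutation of 1, 5, 2020."""
    for match in DATE_FIELDS.finditer(url):
        fields = sorted(normalize(f) for f in match.groups())
        if fields == TARGET:
            return True
    return False


if __name__ == "__main__":
    urls = [
        "http://tg24.sky.it/mondo/2020/05/01/corea-nord-kim-riappare.html",
        "http://tg24.sky.it/mondo/01/05/2020/corea-nord-kim-riappare.html",
        "http://tg24.sky.it/mondo/2020/04/30/corea-nord-kim-riappare.html",
        "XXXX/5/5/5/YYYYY",
    ]
    for url in urls:
        print(has_target_date(url), url)
```

Because the comparison is on the sorted fields, 5/5/5 is rejected: it does not contain one each of day, month and year. Field order alone cannot tell day from month, so 2020/01/05 is accepted as well; if that matters, the allowed orderings have to be restricted explicitly.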

Any thoughts on how to do it?

Thanks,

Dani

Microsoft Entra

1 answer

  1. Saurabh Sharma 23,751 Reputation points Microsoft Employee
    2020-05-29T18:42:51.003+00:00

    Hi,

    Q&A currently supports the products listed over here https://learn.microsoft.com/en-us/answers/products (more to be added later on).

    You might want to reach out to the experts on Stack Overflow.

    (Please don't forget to accept helpful replies as answer)
