Hello and welcome to Microsoft Q&A @Rohit Sawane
The ask as you have stated is how to dynamically get the sheet names from an xlsx in Data Factory. I suspect the real goal is just to get the data from all the sheets, when you don't know the quantity or names of sheets beforehand.
I can help you with the latter, more than the former.
As far as I am aware, there isn't an out-of-box feature for getting the sheet names at this time in Data Factory. I tried with Get Metadata, that did not help. But then I wondered, how does the multiple-choice selector in the Dataset get the sheet names? So I dug into the browser to investigate. There is a microservice endpoint called enumerateItems. However it is not intended for end-user use, and looks more complicated than the alternative solution I discuss below.
If I can assume, what you really care about is the ability to iterate through all your sheets, and not the sheet name specifically, there is a simpler solution.
Iterating by sheet ordinal is simpler than by sheet name. I noticed when I did Lookup or Get Metadata on sheet number 4 of a book with only 3 sheets, the error message told me how many sheets the book had. This we can use.
By deliberately failing an activity we can find out how many sheets and work that.
So, first step. I by the sound of things, you have already parameterized the Excel dataset for dynamic file name and dynamic page name. Let's swap that over to dynamic page number.
Now lets set up that activity to fail (and doesn't that sound wrong!). I think Lookup activity ran faster, but Get Metadata has simpler error message, so I am using Get Metadata here. In the field list, choose "structure". We choose structure because that actually needs to look at the sheet, as opposed to Exists which only check for the presence of the file. Let us pass a suitably large sheet number, say 99. Also there is the file name.
Now, we want to get the error, and stop if there is no error (if someone has 100 sheet excel book!). We connect the Get Metadata to Set Variable by red on-fail dependency. To make things easier, I split the task of error-to-list-of-sheet-numbers into parts. This first part extracts the portion of the error message we care about, the 0..2)
from the rest. I do this by setting string type variable I am calling "message". The expression is:
@split(activity('Get pages by failing').error.message,'(')[2]
After this we have another Set Variable activity, this one to an array type variable I am naming "sheets". In this, several things will happen. I will extract the second number from the message, and then use it in a range function to make the array [0,1,2]
. There is a lot to unpack in this expression
@range(0,add(1,
int(substring(variables('message'),
3,
sub(length(variables('message')),4)))))
range(0,2)
yields [0,1]
. This is because the second value is number of elements, so we actually want to add 1 to the number.
The number we want starts on the third character 0..N
, so that is where 3 comes from.
The number of sheets could be a single digit , or double digit, or theoretically, more. So to determine how many characters I want to grab, depends upon the length of string. There are 4 characters we do not care about. 0
,..
,)
So we subtract 4 from the length of the string.
Finally, to the ForEach, we feed the array variable "sheets" we made in the last step.
... actually on further thought, the range always starts with 0, so this could have been easier if I split on ..
rather than (
. Less substring arithmetic that way.
Hope this will help. Please let us know if any further queries.
------------------------------
- Please don't forget to click on or upvote button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
- Want a reminder to come back and check responses? Here is how to subscribe to a notification
- If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators