Azure Data Lake U-SQL March 9 2017 Updates: Deprecations turn into errors, PIVOT/UNPIVOT, cross ADLS account U-SQL catalog sharing, nuget packages and more!
Following a period of mainly internal service updates since general availability, we shipped several new U-SQL features in last week's release. These updates are now available in all regions, including the new Europe North region.
Here are the March 9 2017 Updates for Azure Data Lake U-SQL and Developer Tooling!
The main takeaway is that we continue deprecating items that changed during the preview phase and introduce many new capabilities, including PIVOT/UNPIVOT, more catalog sharing, and much more!
Thanks to all of you who continue to volunteer to test the new version of the more scalable file set. Please contact us if you want to try it and help us validate it or want to explore the new flexible-schema feature preview for TVF parameters.
Here is the list of topics with links to the detailed release notes:
- Pending and Upcoming Deprecations
- Breaking Changes
- Major U-SQL Bug Fixes, Performance and Scale Improvements
- Major performance improvement when running jobs near the ADLS read/write throttling limits
- All Unicode whitespace characters are now recognized as whitespace by the U-SQL parser
- Improved data-size dependent selection for default numbers of HASH DISTRIBUTION buckets
- Support for USE statements inside U-SQL code object bodies coming from an external account
- Error reporting for expressions with unbalanced parentheses is greatly improved
- U-SQL's assembly object aliasing (USING statement) now works again in EXTRACT expressions
- The C# checked expression is not a constant-foldable expression
- Two previously mentioned issues with quoted identifiers got fixed
- U-SQL Preview Features
- New U-SQL capabilities
- Catalogs can be shared among ADLA accounts even across different primary ADLS accounts
- U-SQL added PIVOT/UNPIVOT support
- U-SQL's VALUES row/rowset constructor supports 1 million constant values
- U-SQL's CROSS/OUTER APPLY adds support for VALUES expression and C# expressions of types IEnumerable, KeyValuePair, IEnumerable
- U-SQL added file modification check in the compiler for catalog managed files
- U-SQL table types from a different database may now be referenced using 3-part naming
- U-SQL user-defined types (UDTs) can now be used in U-SQL variables
- Azure Data Lake Tools for Visual Studio New Capabilities
- Data View now is available in Data Lake Tools for Visual Studio
- Improved failed vertex debug experience for code behind .cs file
- The Azure Data Lake U-SQL SDK is now available at Nuget.org
- Expiration information for files now is exposed in Store Explorer
- Better representation of Metadata Operations in Job View
- Old versions for the Azure Data Lake Tools for Visual Studio are now archived
If you want to use the above preview capabilities, please request access by contacting us.
To get access to the new syntactic features and new tool capabilities in your local environment, you will need to refresh your ADL Tools. Otherwise you will not be able to use them during local runs, and submission to the cluster will give you syntax warnings for the new language features.
You can find more details with examples in the March 9 2017 release notes (or by clicking on the items above) on our GitHub site, where you can also find our previous release notes.
Comments
- Anonymous
April 26, 2017
Hi Jon, to answer here too: the SSIS team is working on SSIS integration, and I have forwarded the user requests to them. Other alternatives are Azure Data Factory, PowerShell scripts using the Azure SDK, or writing your own orchestration using one of the ADL SDKs (Python, Node.js, .NET, Java).
- Anonymous