You can do this easily with SSIS.
Please see:
https://social.technet.microsoft.com/wiki/contents/articles/3172.split-a-flat-text-file-into-multiple-flat-text-files-using-ssis.aspx
https://www.youtube.com/watch?v=b_32_zS1tSE
This browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
I'm trying to split a large CSV into smaller chunks (1M rows or less) using SQL since you can't open/save files with more rows using Excel.
I've figured out how to add row numbers using the ROW_NUMBER command (Select ROW_NUMBER() over(order by <column name> asc) as row#), but I cannot figure out how to only show results for records 1 through 999,999 so I can export into Excel and repeat for 1,000,000 through the rest of the original file.
How can I show results between 2 values for the row number column I just created? When I attempt WHERE row# between 1 and 999999, I get "invalid column name row#."
You can do this easily with SSIS.
Please see:
https://social.technet.microsoft.com/wiki/contents/articles/3172.split-a-flat-text-file-into-multiple-flat-text-files-using-ssis.aspx
https://www.youtube.com/watch?v=b_32_zS1tSE
This happens because a query is logically evaluated in the order FROM-WHERE-GROUP BY-HAVING-SELECT-ORDER BY. So what you define in the SELECT list cannot be referred to in the WHERE clause.
But you can easily address with a Common Table Expression (CTE) to restructure your query:
; WITH CTE AS (
SELECT ... row# = row_number()....
FROM tbl
)
SELECT ...
FROM CTE
WHERE row# BETWEEN 1 AND 999999
I get "invalid column name row#."
As Erland answered, it is the SQL query processing order reason. You could use Subquery or CTE to handle this issue.
Also, you can have a try on OFFSET ... FETCH NEXT, check this:
--1~1000000
SELECT <column list>
FROM <table name>
ORDER BY <column name> ASC OFFSET 0 ROWS FETCH NEXT 1000000 ROWS ONLY;
--1000001~2000000
SELECT <column list>
FROM <table name>
ORDER BY <column name> ASC OFFSET 1000000 ROWS FETCH NEXT 1000000 ROWS ONLY;
Refer to this blog for more details: SQL Server OFFSET FETCH
Best regards,
LiHong
If the answer is the right solution, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".
Note: Please follow the steps in our Documentation to enable e-mail notifications if you want to receive the related email notification for this thread.