Azure Data Lake Gen2 - Use Case Advice
I am collecting weather data (history and forecast) from a third-party web service. Since there will be a lot of data and it will not be accessed frequently, I was planning to use Azure Data Lake Storage Gen2 (built on Blob Storage) and store the data as JSON files. My thinking is that this will be cheaper than an Azure SQL database.
I have read that it is best to have larger files in Data Lake. The amount of data collected each hour is relatively small, so I was thinking of having one file per month. But this means that each hour, when I collect data, I need to add it to the current month's file. What is the best way to do this? Should I read the file, add the new data to the data from the file, and then overwrite the file? That seems easiest, but also inefficient. Is there a better way, i.e. a way to append? Or should I just live with having smaller files and create a new file each hour?
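For what it's worth, ADLS Gen2 does support true appends (unlike flat block blobs), so the read-modify-overwrite cycle described above can be avoided. A minimal sketch of the hourly-append pattern, assuming the `azure-storage-file-datalake` SDK's `DataLakeFileClient` and a hypothetical `weather/YYYY/MM.jsonl` naming scheme (one JSON object per line, so appending never requires re-parsing the file):

```python
import json
from datetime import datetime

def monthly_path(ts: datetime) -> str:
    # One file per month, e.g. "weather/2024/05.jsonl" (naming is illustrative).
    return f"weather/{ts:%Y}/{ts:%m}.jsonl"

def append_record(file_client, record: dict) -> None:
    """Append one hourly record as a JSON line.

    `file_client` is assumed to be an already-created
    azure.storage.filedatalake.DataLakeFileClient for the month's file.
    ADLS Gen2 appends in place, so the existing contents are never
    re-read or rewritten.
    """
    line = (json.dumps(record) + "\n").encode("utf-8")
    # Find the current end of the file, append there, then commit.
    offset = file_client.get_file_properties().size
    file_client.append_data(line, offset=offset, length=len(line))
    file_client.flush_data(offset + len(line))
```

The JSON Lines layout matters here: because each record is a self-contained line, an append is valid without reading the existing data, whereas a single JSON array would force the rewrite you are trying to avoid.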
And, is this even an appropriate use case for Data Lake?