We recently moved our infrastructure to serverless and we are happy with the move. The system is performing and scaling pretty well.
However, we are continuously facing random function execution timeouts. The majority of our function runs take anywhere between 1 to 10 seconds to complete. Those that are failing take around 3 minutes before they timeout. Looking at our dependency graph, we could not spot where those 3 minutes are spent.
Is there a way for us to find out why these functions take 3 minutes to execute before they time out?
PS. We are not looking at increasing the function timeouts as they are supposed to complete within 10 seconds.