From this design perspective, an API Gateway is meant to be an aggregation layer that reduces the number of requests coming in from clients. The same gateway could also be used for inter-service requests, but that is not recommended since it increases the load on that one gateway instance. This is a valid concern, but it is easily mitigated by running multiple instances of the gateway: public and internal instances, for example.
Now coming to Service Meshes: they aim to provide most of the features of an API Gateway, tailored for service-to-service communication, and they come in different kinds.
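To make the aggregation idea concrete, here is a minimal sketch in Python. The two backend functions (`get_profile`, `get_orders`) and the `dashboard` endpoint are hypothetical and simulated in-process; a real gateway would make network calls and would usually be configured rather than hand-written.

```python
import asyncio

# Hypothetical backend services, simulated in-process for illustration.
async def get_profile(user_id: str) -> dict:
    await asyncio.sleep(0)  # stands in for a network call
    return {"id": user_id, "name": "example"}

async def get_orders(user_id: str) -> list:
    await asyncio.sleep(0)  # stands in for a network call
    return [{"order_id": 1, "user_id": user_id}]

async def dashboard(user_id: str) -> dict:
    """Gateway-style aggregate: one client request fans out to two backends."""
    profile, orders = await asyncio.gather(
        get_profile(user_id), get_orders(user_id)
    )
    return {"profile": profile, "orders": orders}

if __name__ == "__main__":
    print(asyncio.run(dashboard("u1")))
```

The client makes a single call to `dashboard` instead of one call per backend, which is the request reduction the gateway is there to provide.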
- Proxy Pattern
These could simply be API Gateways themselves, acting as a proxy between services. They are usually simpler solutions, and open-source options like Traefik Mesh follow this pattern. The obvious downside is the extra hop and the potential bottleneck (which can be avoided by scaling the proxy), but this can make sense for a simpler setup where not many microservices (and not much traffic) are involved.
- Sidecar Pattern
This is how most of the other service meshes work (like Istio, Linkerd, etc.). Here a sidecar (container or process) runs alongside each microservice and handles all network traffic coming into and going out of the service. This actually adds two hops per call, one through each side's sidecar, but these are localhost hops. The sidecar implements the functionality you would expect, such as retries, authentication (like mTLS), etc. These service meshes are usually more complex to set up and demand more compute for all the extra proxy/sidecar containers, but they pay off in the benefits they bring.
Considering all of the above, the sidecar pattern is the more enticing one, but it is something you will have to decide on, taking into account your requirements and the investment required to set it up.
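As an illustration of the kind of behavior a sidecar takes out of application code, here is a rough sketch of retry with exponential backoff. Everything in it (`TransientError`, `call_with_retries`, `make_flaky_service`) is hypothetical; real meshes implement this inside the proxy and expose it through configuration, not through Python code like this.

```python
import time

class TransientError(Exception):
    """Hypothetical error type standing in for a failed network call."""

def call_with_retries(send, max_attempts=3, base_delay=0.01):
    """Call send(); on TransientError, retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the failure to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))

def make_flaky_service(failures):
    """Return a fake upstream that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def send():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("upstream unavailable")
        return f"ok after {state['calls']} calls"

    return send

if __name__ == "__main__":
    print(call_with_retries(make_flaky_service(2)))
```

With a sidecar in place, the service itself would just make a plain call to its local proxy and the retry loop above would live outside the application.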
Doubling back to your questions,
So you are saying it is it okay for Service 1 call going back to API Gateway to call Service 2?
Yes. It is not a problem, apart from the extra network hop and the need to be aware of the capacity required for your scenario. APIM has a nice developer portal where internal developers can try out the different APIs while they are integrating them into their apps.
Also, since APIM exposes Swagger specs for these APIs, you could use them to generate SDKs for simpler integration. This is something sidecar-style service meshes do not provide; instead, developers have to write the Swagger specs themselves or author SDKs by hand if required.
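To give a feel for why a published spec helps, here is a small sketch of what an SDK generator does at its core: walk the spec and turn each operation into something callable. The `SPEC` document, the host name, and both helper functions are made up for illustration, and the layout assumes an OpenAPI 3 style document (Swagger 2 uses `host`/`basePath` instead of `servers`). In practice you would use an existing generator rather than write this yourself.

```python
# Hypothetical, heavily trimmed OpenAPI 3 style document.
# A real one would be exported from the gateway.
SPEC = {
    "servers": [{"url": "https://gateway.example.internal"}],
    "paths": {
        "/orders/{orderId}": {"get": {"operationId": "getOrder"}},
        "/orders": {"post": {"operationId": "createOrder"}},
    },
}

def list_operations(spec):
    """Map each operationId to its (HTTP method, URL template)."""
    base = spec["servers"][0]["url"]
    ops = {}
    for path, methods in spec["paths"].items():
        for method, op in methods.items():
            ops[op["operationId"]] = (method.upper(), base + path)
    return ops

def build_request(spec, operation_id, **path_params):
    """Resolve an operation into a concrete (method, URL) pair."""
    method, url_template = list_operations(spec)[operation_id]
    return method, url_template.format(**path_params)

if __name__ == "__main__":
    print(build_request(SPEC, "getOrder", orderId=42))
```

A consuming service only needs the operation name and its parameters; the paths and methods come from the spec the gateway already publishes.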
when would you chose service mesh for service-service control vs APIM gateway?
I hope the description above helps. In the end, it depends on your requirements.
Personally, I'd go with a service mesh for large projects and a simpler solution for smaller projects (unless I could deploy them into an existing K8s cluster that has a service mesh ready to use).
Also, if we use API layers design approach, do you see all internal service - service going through APIM gateway?
That depends on the pattern you choose from the ones above. In the case of APIM, yes.
I just wonder whether Azure APIM is even designed for inter service communication with in the network. If so, would that be an anti pattern of Gateway Pattern? Is this officially supported feature of APIM to act as a Service Mesh?
The Gateway Pattern is for client-to-service communication. APIM can be used to expose APIs that are consumed by client applications or by other services; it doesn't really matter to APIM which one is calling.
I don't see any reason why it should not work as a proxy between services. Additionally, APIM supports calling Dapr services and Service Fabric apps.
Is this a valid concern with Azure APIM? I noticed it uses EventHub underneath as there are some capacity metrics.
Yes. You would scale the gateway as required to handle the expected load. As for Event Hub, I believe it's just used for logging.