Troubleshoot service hooks
TFS 2018
Use this article for general troubleshooting guidance and answers to frequently asked questions (FAQs).
The Service Hooks page in the web access admin shows your recent activity (last 14 days) for each subscription, and whether a subscription is enabled, disabled, or restricted.
You can access the history of a subscription, including detailed request and response data, which is useful for debugging a problematic service or subscription.
To view the activity and status of your subscriptions, go to the Service Hooks page.
To view detailed activity for a subscription, including full request, response, and event payload data, select a subscription in the table and select History.
Failures from a Service Hooks notification are grouped into the following categories:
- Terminal Failures
- Transient Failures
- Enduring Failures
The only Terminal Failure is HTTP status code 410 (Gone). When a subscription sees a Terminal Failure, it's automatically disabled, regardless of its prior status.
When a subscription sees a Transient Failure, it attempts to resend the notification up to eight times, with an increasing delay between each attempt. Transient Failures include the following codes:
- 408 (Request Timeout)
- 502 (Bad Gateway)
- 503 (Service Unavailable)
- 504 (Gateway Timeout)
Retry # | Wait time |
---|---|
Before retry 1 | wait ~1 second |
Before retry 2 | wait ~2 seconds (total delay of 3 seconds) |
Before retry 3 | wait ~4 seconds (total delay of 7 seconds) |
Before retry 4 | wait ~8 seconds (total delay of 15 seconds) |
Before retry 5 | wait ~16 seconds (total delay of 31 seconds) |
Before retry 6 | wait ~32 seconds (total delay of 63 seconds) |
Before retry 7 | wait ~60 seconds (max backoff time, total delay of 123 seconds) |
Before retry 8 | wait ~60 seconds (max backoff time, total delay of 183 seconds) |
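
The wait times in the table follow a doubling pattern that is capped at 60 seconds. The following Python sketch reproduces the table under that reading. The function and constant names are hypothetical and aren't part of any service hooks API, and the actual service applies approximate (~) delays.

```python
from itertools import accumulate

# Values taken from the table above; names are illustrative only.
MAX_BACKOFF_SECONDS = 60
MAX_TRANSIENT_RETRIES = 8


def transient_retry_delays(max_retries=MAX_TRANSIENT_RETRIES, cap=MAX_BACKOFF_SECONDS):
    """Return the approximate wait, in seconds, before each retry."""
    # The delay doubles with each retry (1, 2, 4, ...) until it reaches the cap.
    return [min(2 ** (retry - 1), cap) for retry in range(1, max_retries + 1)]


delays = transient_retry_delays()
totals = list(accumulate(delays))  # running total delay after each wait

for retry, (delay, total) in enumerate(zip(delays, totals), start=1):
    print(f"Before retry {retry}: wait ~{delay} s (total delay of {total} s)")
```

A consumer that needs more than about three minutes to recover from a transient error should expect the notification to be treated as an Enduring Failure.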
If the notification exhausts all of its retries and continues to see a Transient Failure for each attempt, the subscription stops trying to send the notification, and treats the notification as if it saw an Enduring Failure.
Enduring Failures include all other HTTP failure codes, such as 404 (Not Found) and 500 (Internal Server Error).
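
The three categories can be summarized as a small lookup. The following Python sketch is illustrative only: the function name is hypothetical, and treating every status code of 400 or higher as a failure is an assumption made for this example, not something the service documents.

```python
# Status codes named in this article.
TERMINAL_CODES = {410}
TRANSIENT_CODES = {408, 502, 503, 504}


def classify_failure(status_code):
    """Map an HTTP status code to a failure category, or None if it isn't a failure."""
    if status_code in TERMINAL_CODES:
        return "terminal"    # subscription is disabled immediately
    if status_code in TRANSIENT_CODES:
        return "transient"   # notification is retried with backoff
    if status_code >= 400:   # assumption: 4xx and 5xx count as failures
        return "enduring"    # subscription is placed on probation
    return None


for code in (200, 404, 408, 410, 500, 503):
    print(code, classify_failure(code))
```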
When a subscription sees an Enduring Failure, it's placed on probation.
While on probation, a subscription is limited in the number of notifications it can send. If the subscription continues to hit Enduring Failures, then it gets increasingly limited, and eventually disabled. If the subscription receives a successful response while on probation, it gets restored to a fully enabled state.
While a subscription is on probation, any new events are lost. Once a retry succeeds, the subscription is enabled and events are published again.
Retry # | Wait time |
---|---|
Before retry 1 | wait ~20 minutes |
Before retry 2 | wait ~40 minutes (total probation time of 1 hour) |
Before retry 3 | wait ~1 hour 20 minutes (total probation time of 2.33 hours) |
Before retry 4 | wait ~2 hours 40 minutes (total probation time of 5 hours) |
Before retry 5 | wait ~5 hours 20 minutes (total probation time of 10.33 hours) |
Before retry 6 | wait ~10 hours 40 minutes (total probation time of 21 hours) |
Before retry 7 | wait ~15 hours (max backoff time, total probation time of 36 hours) |
If notifying the consumer still fails after seven retries, the subscription status is set to DisabledBySystem.
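
The probation table follows the same doubling pattern as the transient retries, starting at 20 minutes and capped at 15 hours. The following Python sketch reproduces the table under that reading. The names are hypothetical, and the real delays are approximate.

```python
from itertools import accumulate

# Values taken from the probation table above; names are illustrative only.
PROBATION_BASE_MINUTES = 20
PROBATION_MAX_BACKOFF_MINUTES = 15 * 60
PROBATION_MAX_RETRIES = 7


def probation_delays_minutes():
    """Return the approximate wait, in minutes, before each probation retry."""
    return [
        min(PROBATION_BASE_MINUTES * 2 ** (retry - 1), PROBATION_MAX_BACKOFF_MINUTES)
        for retry in range(1, PROBATION_MAX_RETRIES + 1)
    ]


probation_delays = probation_delays_minutes()
probation_totals = list(accumulate(probation_delays))  # cumulative probation time

for retry, (delay, total) in enumerate(zip(probation_delays, probation_totals), start=1):
    print(f"Before retry {retry}: wait ~{delay} min (total probation time of {total / 60:.2f} h)")
```

Because new events are lost during probation, a consumer that fails with an Enduring Failure can miss up to roughly 36 hours of events before the subscription is disabled.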
A: The payload limit is 2 MB. Larger payloads degrade performance and reliability. As a best practice, service hooks should limit the payload to 2 MB or less.
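
One way to guard against oversized payloads is to measure the serialized size before sending. The following Python sketch is a minimal illustration: the function name is hypothetical, the payload is assumed to be JSON encoded as UTF-8, and reading 2 MB as 2 × 1024 × 1024 bytes is an assumption because the article doesn't specify the exact byte count.

```python
import json

# Assumption: "2 MB" is interpreted as 2 * 1024 * 1024 bytes.
PAYLOAD_LIMIT_BYTES = 2 * 1024 * 1024


def payload_within_limit(payload, limit=PAYLOAD_LIMIT_BYTES):
    """Return True if the JSON-serialized payload fits within the limit."""
    encoded = json.dumps(payload).encode("utf-8")
    return len(encoded) <= limit


small = {"eventType": "example", "message": "hello"}
large = {"blob": "x" * (3 * 1024 * 1024)}

print(payload_within_limit(small))
print(payload_within_limit(large))
```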
A: A subscription becomes restricted if too many failures occur. Enabled (restricted) is the same as being on probation.
A: A subscription is automatically disabled after a series of consecutive failures over a prolonged period, or when a terminal failure is encountered. Transient failure types are retried several times before being declared a failure. Enduring failure types aren't retried. The following are examples of each type of failure.
- Transient: 408 (Request Timeout), 502 (Bad Gateway), 503 (Service Unavailable), 504 (Gateway Timeout)
- Terminal: 410 (Gone)
- Enduring: All failures that aren't transient or terminal
A: The user who created the subscription is no longer a member of the team.
A: Check the following items:
- Confirm the subscription is enabled
- Confirm the subscription settings are correct (both event filters and actions)
- Look at the history, especially if there are failures
Q: Can I grant a regular project user the ability to view and manage service hook subscriptions for a project?
A: By default, only project administrators have these permissions. To grant them to other users directly, you can use the command line tool or the Security REST API.
A: Yes, use REST APIs.