Cirrus Insight Webhooks Lifecycle
The webhook lifecycle consists of three states: active, disabled, and validating. Understanding these states is crucial for effectively managing and using webhooks in your Cirrus Insight integration.
Active State
When a webhook is in the active state, Cirrus Insight is actively sending events to the specified endpoint. The endpoint should consistently respond with a 200 OK status code to acknowledge the receipt of the events. This state indicates that the webhook is functioning properly and events are being successfully delivered to the integration. It is important to ensure that the endpoint is properly configured to handle and process the incoming events during this state.
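As a rough illustration of an endpoint that acknowledges deliveries with 200 OK, the sketch below uses only Python's standard library. The names `acknowledge`, `pending_events`, `WebhookHandler`, and `serve`, the port, and the in-memory queue are all hypothetical choices made for this example. They are not part of Cirrus Insight, and the sketch does not verify the event signature.

```python
import json
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory queue standing in for a real background job system (assumption).
pending_events = queue.Queue()


def acknowledge(raw_body: bytes) -> int:
    """Return the HTTP status code to send back for one delivery."""
    try:
        event = json.loads(raw_body)
    except ValueError:
        # Any status other than 200 is treated as a failed delivery.
        return 400
    # Defer slow processing so the 200 response goes out promptly.
    pending_events.put(event)
    return 200


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status = acknowledge(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the example server; the port is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. Queuing the event and replying immediately is one possible design; it keeps slow processing from delaying the acknowledgement.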
Disabled State
When a webhook is disabled, Cirrus Insight has stopped sending events to the specified endpoint. This can occur if the user has manually disabled the endpoint or if the endpoint has been failing to respond for an extended period of time. In the disabled state, no events will be delivered to the integration. It is important to investigate and resolve any issues that caused the webhook to be disabled to ensure the continuous flow of events. Review the logs in the developer dashboard. Detailed information on each delivery attempt, including retries, appears in the dashboard for the most recent 7-day period. The request and response data shown in the dashboard for failing deliveries may be helpful in identifying and resolving issues.
Validating State
An endpoint enters the validating state when it is activated from the developer dashboard. Upon entering this state, Cirrus Insight will send a signed developer.webhook.test event to the endpoint. If the endpoint responds with 200 OK, it will then be placed into the active state. If the test delivery fails, Cirrus Insight will retry the test event every few hours for 24 hours. If the endpoint fails to respond successfully in this period, it will be placed back into the disabled state, and the developer can re-activate it after resolving any endpoint issues.
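The validation rules above can be modeled as a small state function. This is only a reader's sketch of the behavior described in this section, not Cirrus Insight's implementation; `WebhookState`, `resolve_validation`, and the `hours_elapsed` parameter are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Optional


class WebhookState(Enum):
    DISABLED = "disabled"
    VALIDATING = "validating"
    ACTIVE = "active"


# The test event is retried for 24 hours, per the description above.
VALIDATION_WINDOW_HOURS = 24


def resolve_validation(status_code: Optional[int], hours_elapsed: float) -> WebhookState:
    """State of an endpoint after one test-event attempt.

    status_code is the endpoint's HTTP status, or None if it did not respond.
    hours_elapsed is the time since the endpoint entered validation.
    """
    if status_code == 200:
        return WebhookState.ACTIVE
    if hours_elapsed >= VALIDATION_WINDOW_HOURS:
        return WebhookState.DISABLED
    # Still inside the retry window: remain in validation.
    return WebhookState.VALIDATING
```

For example, a 500 response six hours in leaves the endpoint in the validating state, while a missing response at the 24-hour mark returns it to disabled.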
Failure Processing
Cirrus Insight handles failed webhooks with a retry mechanism. After the initial send, Cirrus Insight will retry a webhook up to 3 times, for a maximum of 4 delivery attempts. Each attempt is identified by the deliveryAttempt counter and the deliveryTimestamp on each send. The deliveryAttempt counter is zero-based, meaning the initial send has a deliveryAttempt of 0 and the retries are numbered 1 through 3.
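A receiver can use the counter to tell an initial send from a retry. The sketch below assumes that deliveryAttempt arrives as an integer and that zero-based means the initial send carries the value 0; `MAX_RETRIES`, `describe_attempt`, and `is_final_attempt` are hypothetical helper names.

```python
# Up to 3 retries follow the initial send, per the description above.
MAX_RETRIES = 3


def describe_attempt(delivery_attempt: int) -> str:
    """Human-readable label for a zero-based deliveryAttempt value."""
    if delivery_attempt == 0:
        return "initial send"
    return f"retry {delivery_attempt} of {MAX_RETRIES}"


def is_final_attempt(delivery_attempt: int) -> bool:
    """True when no further retries are expected after this attempt."""
    return delivery_attempt >= MAX_RETRIES
```

A handler might log `describe_attempt(...)` with each delivery and raise an alert when `is_final_attempt(...)` is true, since a failure at that point means the event will not be sent again.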
Response Codes
The only HTTP status code Cirrus Insight accepts from an endpoint is 200 OK. Cirrus Insight will treat any other HTTP status as a failure and will retry delivery of the event model. Ensure your webhook endpoint returns 200 OK to avoid re-delivery attempts.
If a webhook fails to be delivered after 4 consecutive attempts, Cirrus Insight will no longer retry that event model. This ensures that failed events do not cause an indefinite loop of retries. Your endpoint should gracefully handle receiving the same event model more than once and avoid duplicate processing. The event model object has a unique eventId field that can be used for duplicate detection and handling.
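Duplicate detection on eventId could look like the following minimal sketch. It keeps seen IDs in an in-memory set, which is an assumption made for brevity; a real integration would more likely use a durable store so that IDs survive restarts. `EventDeduplicator` and `should_process` are hypothetical names.

```python
class EventDeduplicator:
    """Tracks eventId values so that each event is processed only once."""

    def __init__(self) -> None:
        self._seen = set()

    def should_process(self, event: dict) -> bool:
        """Return True the first time an eventId is seen, False afterwards."""
        event_id = event["eventId"]
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True
```

The endpoint should still return 200 OK for a duplicate; only the downstream processing is skipped.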
To help with debugging, logs are available in the developer dashboard. These logs provide detailed information on each delivery attempt, including retries. Reviewing the logs can be helpful in identifying and resolving any issues causing the webhook failures. The logs contain both the request sent and the response (if any) that was returned from the endpoint.
Monitor logs
Remember to monitor the logs and investigate any recurring failures to ensure the continuous flow of events in your Cirrus Insight integration. Logs are available for your webhook endpoint in the Cirrus Insight Developer dashboard.
Chronic Failures
In order to ensure the reliability and performance of the webhook system, Cirrus Insight implements an automatic disabling mechanism for failing endpoints. Endpoints which consistently fail for 72 hours are automatically changed to the disabled state.
After the initial 24 hours of all delivery attempts failing, the endpoint will be placed on probation. During this probation period, the admin will be alerted by email about the failing endpoint and the need for investigation and resolution. Cirrus Insight will continue to deliver event models to the endpoint during this time, and if the problem is resolved, the system will remove the probation and continue to operate as normal.
If the failing endpoint continues to fail for an additional 48 hours, reaching a total of 72 hours of failing delivery, the endpoint will be automatically disabled. Once disabled, no further events will be sent to the endpoint. Again, the admin will be notified by email about the automatic disabling.
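The 24-hour and 72-hour thresholds can be expressed as a small timeline check, which may be useful for your own monitoring. This sketch only mirrors the thresholds described above and is not Cirrus Insight's implementation; `endpoint_status` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

PROBATION_AFTER = timedelta(hours=24)  # all deliveries failing for 24 hours
DISABLE_AFTER = timedelta(hours=72)    # 24 hours plus an additional 48 hours


def endpoint_status(failing_since: Optional[datetime], now: datetime) -> str:
    """Classify an endpoint by how long every delivery has been failing.

    failing_since is the time of the first failure in the current unbroken
    run of failures, or None if the most recent delivery succeeded.
    """
    if failing_since is None:
        return "active"
    failing_for = now - failing_since
    if failing_for >= DISABLE_AFTER:
        return "disabled"
    if failing_for >= PROBATION_AFTER:
        return "probation"
    return "active"
```

A successful delivery resets `failing_since` to `None`, which matches the description of probation being lifted once the problem is resolved.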
It is crucial for the admin to promptly investigate and resolve any issues causing the failing delivery to avoid disruptions in the integration. Monitoring the logs in the developer dashboard and reviewing the request and response data can be helpful in identifying and resolving the issues.
After resolving the issues, remember to re-activate the endpoint to restore the continuous flow of events in your Cirrus Insight integration. A re-activated endpoint first enters the validating state and must respond successfully to a developer.webhooks.test event, just as when it was first configured. After a successful response, the webhook endpoint will be fully re-activated.