SSE Connection Best Practices
This article pertains to: Inform (v2)
SSE connections are one-way by nature: the server sends and the client receives. The client has no way to acknowledge events or the connection itself, and the server is not listening for any such confirmation. As a result, Validic cannot determine what data the client has ingested or whether a connection is still being maintained. Maintaining a connection to the streaming API for continuous data ingestion is therefore the client's responsibility.
The first recommendation is that if you don’t expect data for an extended period of time (for example, during maintenance, testing, or expected periods of inactivity), you should disconnect your stream and reconnect when the period of inactivity is over. Validic will maintain a customer's checkpoint in a stream for 7 days after a disconnect. If the reconnection happens within that 7-day window, there will be no data loss.
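A client that disconnects on purpose may want to record when it did so, so that it can tell whether the 7-day checkpoint window is still open before reconnecting. The following is a minimal sketch of that check; the function name and the idea of storing the disconnect time on the client side are assumptions for illustration, not part of the Validic API.

```python
from datetime import datetime, timedelta, timezone

# Checkpoint retention stated in this article: 7 days after a disconnect.
CHECKPOINT_RETENTION = timedelta(days=7)


def checkpoint_still_valid(disconnected_at: datetime, now: datetime) -> bool:
    """Return True if a reconnect at `now` falls inside the retention window.

    Hypothetical helper: the exact 7-day boundary is treated as expired,
    which is the conservative choice when the precise cutoff is unknown.
    """
    return now - disconnected_at < CHECKPOINT_RETENTION


if __name__ == "__main__":
    disconnected_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
    reconnect_at = disconnected_at + timedelta(days=3)
    print(checkpoint_still_valid(disconnected_at, reconnect_at))
```

Using timezone-aware timestamps for both values avoids errors when the client host changes time zone or observes daylight saving time.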
The second recommendation concerns long-running connections, which can reach a state where the stream goes “stale”. So, why do connections "go stale"? TCP/IP, and the Internet built on it, are designed to work despite disruption. Network outages, reroutes, and other hiccups occur all the time. The longer a connection stays open, the higher the likelihood that such a hiccup will impact it. Unfortunately, some use cases require connections that are both long-running and one-way, such as an SSE stream. In this scenario, the client has to be responsible for determining the health of the connection. Validic has no way of knowing whether the client actually received data, and therefore can't determine whether there is an issue on the connection.
This is where POKE events come in handy. Whenever no data events or connection events are coming through the stream, you will see a POKE event in your stream every five seconds.
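To make use of POKE events, a client first has to read events off the stream and tell them apart. The sketch below parses standard SSE framing (an `event:` field, one or more `data:` fields, and a blank line ending each event) from any iterable of text lines. The function name `iter_sse_events` is hypothetical, and the event names `poke` and `data` in the example are assumptions; check the names your stream actually emits.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from SSE-formatted text lines.

    Unlike a browser EventSource, this also yields named events that
    carry no data, so a bare keep-alive event is still reported.
    """
    event_name = "message"  # SSE default when no event: field is present
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts or event_name != "message":
                yield event_name, "\n".join(data_parts)
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line, carries no event
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event_name = value
            elif field == "data":
                data_parts.append(value)


if __name__ == "__main__":
    # Assumed sample payloads, for illustration only.
    sample = ["event: poke\n", "data: {}\n", "\n",
              "event: data\n", "data: {\"id\": \"abc\"}\n", "\n"]
    for name, data in iter_sse_events(sample):
        print(name, data)
```

Every event this generator yields, whatever its name, is evidence that the connection is alive, which is what matters for detecting a stale stream.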
The client should be aware when no data, connection, or POKE events are coming through. This is the real value of the POKE events. We expect them every 5 seconds, so in theory, going more than 5 seconds without a POKE event (or a data or connection event) should be alarming. But don't move too quickly. Occasionally a POKE is delayed, though it is unlikely to go missing entirely. So, wait 1 minute. If no POKE, data, or connection events come in during that time, you have a stale stream. In this scenario, the recommendation is to take the following actions:
Disconnect the stale stream
Create a new stream with the same resource_filter and event_type_filter as the previous stream. Set the start_date to today's date. That will replay the day’s events so that you don’t drop any data.
Make sure you have logic in place to recognize what is already written to your DB, what is a new event, and what is an update event, so that no data is duplicated and all updates are captured.
Be aware that when you use a start_date on a stream, behind the scenes the stream will run through all the data from the last 30 days. You should expect to see a long period of POKE events until the stream reaches the day you entered as the start_date. This is expected behavior and shouldn’t be a cause for concern.
Sit back and rely on the logic that detected the ‘stale’ stream to catch any future scenario where no POKE, data, or connection events come through for 1 minute. If it occurs again, repeat the steps above.
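The steps above can be sketched in two small pieces: a monitor that flags a stream as stale after 1 minute of silence, and a classifier that sorts replayed events into new, update, and duplicate. This is a minimal sketch under stated assumptions, not a definitive implementation: the class and function names are hypothetical, and so are the ideas that each record carries a unique identifier and that an unchanged payload means a duplicate.

```python
import time
from typing import Callable, Dict

STALE_AFTER_SECONDS = 60.0  # the 1-minute wait recommended in this article


class StaleStreamMonitor:
    """Tracks how long the stream has been silent (hypothetical helper)."""

    def __init__(self, stale_after: float = STALE_AFTER_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._stale_after = stale_after
        self._clock = clock  # injectable so the logic can be tested
        self._last_event_at = clock()

    def record_event(self) -> None:
        """Call for every POKE, data, or connection event received."""
        self._last_event_at = self._clock()

    def seconds_silent(self) -> float:
        return self._clock() - self._last_event_at

    def is_stale(self) -> bool:
        return self.seconds_silent() >= self._stale_after


def classify_replayed_event(stored: Dict[str, dict], record_id: str,
                            payload: dict) -> str:
    """Return 'new', 'update', or 'duplicate' for a replayed record.

    Assumes records are keyed by a unique identifier and that an
    identical payload means the record was already written.
    """
    if record_id not in stored:
        return "new"
    if stored[record_id] == payload:
        return "duplicate"
    return "update"


if __name__ == "__main__":
    monitor = StaleStreamMonitor()
    monitor.record_event()
    print(monitor.is_stale())
    print(classify_replayed_event({}, "abc", {"steps": 100}))
```

A read that blocks on a dead connection never returns, so `is_stale()` should be checked from a timer or a separate thread, or the read itself should use a timeout. When the monitor reports a stale stream, the client disconnects, creates the replacement stream as described above, and passes each replayed record through the classifier before writing to its database.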