A retention period should be applied to your data, so that sessions in the past are removed from the feed, while respecting the RPDE invariants.
A CDN such as CloudFlare is recommended to allow your RPDE endpoint to scale to millions of inbound requests.
To minimise the total number of items within RPDE feeds, it is recommended that data publishers apply a retention period, especially for Slot data items.
The Realtime Paged Data Exchange specification states that:
If any record is added to the list or updated it must remain in the list in perpetuity while it is in an "updated" state, or remain in the list for at least 7 days from the point in time at which it transitioned to a "deleted" state
To implement a retention period, records representing events that occur in the past should be pruned by first setting their state to "deleted", and then, after 7 days, removing them from the feed. This may be implemented via a regular CRON job, for example.
The high-volume proposal for the RPDE specification is currently widely adopted, and hence it is recommended that Slot feeds that have a particularly high volume of small payload items wait 2 days before removing "deleted" items from the feed, in place of the specified 7 days.
A CDN is simple to configure and requires only a small amount of additional code within the RPDE endpoint.
Note that in order for the CDN to be effective, your application should not implement the optional limit parameter specified in the RPDE specification.
The following CDN configuration options are recommended:
Configure the CDN to "Pass through and respect cache TTL headers" instead of overriding them
In the application, vary the cache headers for RPDE pages as follows:
For all pages which contain greater than zero items, return a TTL of "60 minutes":
Cache-Control: public, max-age=3600
For the last page, which contains zero items, return a TTL of "8 seconds":
Cache-Control: public, max-age=8
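The header rule above reduces to a single decision on the number of items in the page. The sketch below shows that decision in isolation; the function name `cache_control_for_page` is hypothetical, and wiring the returned value into the response depends on the web framework in use.

```python
def cache_control_for_page(item_count: int) -> str:
    """Return the Cache-Control value for an RPDE page with `item_count` items."""
    if item_count < 0:
        raise ValueError("item_count cannot be negative")
    # Pages with items are cached for 60 minutes; the empty last page
    # is cached for only 8 seconds.
    max_age = 3600 if item_count > 0 else 8
    return f"public, max-age={max_age}"
```

For example, `cache_control_for_page(0)` gives `public, max-age=8` for the last page, and any page with at least one item gets `public, max-age=3600`.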
The settings in the CDN Configuration section will create the behaviour described in this worked example automatically:
In this scenario, 200 data consumers are tracking the RPDE feed by polling at the end of the list (the last page).
Although each data consumer can choose a polling frequency arbitrarily, that frequency is not relevant to the calculations here, as it is the settings of the cache header that dictate the load on the origin server. It should also be noted that during normal operation the number of data consumers also does not impact the load on the origin server, and that 200 is used illustratively.
When the last page is requested, the first consumer would request the live page (creating one hit on the origin server), and the subsequent 199 data consumers would receive a cached version.
For a feed whose source data is not being updated, the origin server receives one hit every 8 seconds and returns an empty list of items each time, and an identical next URL.
Hence the maximum load during "Sleep" mode is 8 requests/minute.
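The "Sleep" mode figure can be checked with a small simulation. This is a deliberately simplified model and not a description of how any particular CDN works internally: it assumes a single shared cache in front of the origin, whole-second timing, and consumers whose polling schedules are staggered by one second each. The function name `origin_hits` and its parameters are illustrative.

```python
def origin_hits(consumers: int, poll_interval: int, ttl: int, duration: int) -> int:
    """Count origin requests when `consumers` poll one URL through a TTL cache."""
    hits = 0
    cached_until = None  # time (in seconds) at which the cached copy expires
    for second in range(duration):
        for consumer in range(consumers):
            # Each consumer polls every `poll_interval` seconds, on a staggered schedule.
            if (second + consumer) % poll_interval != 0:
                continue
            if cached_until is None or second >= cached_until:
                # Cache miss: the request reaches the origin and refills the cache.
                hits += 1
                cached_until = second + ttl
            # Otherwise the request is answered from the cache.
    return hits
```

Under these assumptions, `origin_hits(200, 5, 8, 60)` and `origin_hits(2000, 5, 8, 60)` both return 8: the origin load over one minute is set by the 8-second TTL, not by the number of data consumers.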
When an update to source data occurs, one of the 8-second interval requests will render a list of items and a new next URL. This items list is rendered once by the origin server, and the subsequent 199 data consumers would receive a cached version.
All 200 data consumers will follow the same next URL, and again the first request will be cached for the other 199 data consumers.
Hence the maximum load during "Live" mode is bounded by the response time of the "last" page, as the CDN will queue the requests from other data consumers waiting for this page. If the origin server is under high general load from other services, and the response time of the last page is increased, then the queue waits. This avoids a large number of data consumers adversely affecting the origin server performance during times of peak general load.
In order for CloudFlare to respect your cache control headers, there are five simple steps to follow:
After you've set up CloudFlare as your DNS provider, check requests are being routed through CloudFlare by enabling the orange cloud button:
Use the wildcards to ensure the rule covers all your feeds, for example:
The page rule should have the following configuration:
Cache Level: Everything
Origin Cache Control: On
SSL: Flexible (if you do not have SSL configured on your own server)
On the Caching configuration page, ensure that the following is set:
Browser Cache Expiration: Respect Existing Headers
Ensure that your web application or web server infrastructure does not set any cookies on the feed pages (for example load balancer affinity cookies), as these will prevent CloudFlare from caching pages.
Inspect the headers returned by your page to see if CloudFlare is successfully caching your feed.
A successfully cached page will return the following header:
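The header inspection can also be scripted. The sketch below assumes that CloudFlare reports its cache result in the `CF-Cache-Status` response header, with `HIT` indicating a response served from cache, and it additionally flags any `Set-Cookie` header, since cookies on feed pages prevent caching as noted above. The function name `cloudflare_cache_report` is hypothetical, and the function takes an already-fetched mapping of header names to values so that it makes no network calls itself.

```python
def cloudflare_cache_report(headers):
    """Summarise cache-related response headers from a feed page."""
    # Header names are case-insensitive, so normalise them before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    status = normalised.get("cf-cache-status", "").strip().upper()
    return {
        "served_from_cache": status == "HIT",
        "cache_status": status or None,
        "sets_cookie": "set-cookie" in normalised,
    }
```

A first request to a page will often not be a cache hit, so it is worth requesting the same URL twice and checking the headers of the second response.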
The following articles will help you dive deeper into this in case you have any issues: