This month we have experienced rapid and unpredictable growth in our traffic visibility services. This has placed increased demand on our cloud services and has affected the recording of historical data for some customers. We apologise for any inconvenience this may have caused you. We are scaling out our cloud services and adding new regions, starting with Sydney, Australia. Read on for details of how we’re dealing with the scale challenges and what we’re putting in place for future growth in functionality and scale.
As a result of this unexpected growth, we are handling numbers of probes and volumes of data that we did not anticipate until next year. Some of our back-end services have not kept up with demand and, as a result, some accounts have experienced gaps in their historical visibility reports. To make matters worse, the mechanism we have in place to recover from this situation compounded the problem: probes automatically attempt to resend any results that fail to be processed, which put even more load on services that were already overloaded.
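To illustrate why automatic resends compound an overload, here is a minimal sketch of a toy model. It is not the actual probe or back-end logic: the function name, the per-tick model, and all of the numbers are assumptions made for illustration. Each tick, probes send a fixed number of new results, and any result the service could not process is resent on the next tick.

```python
def simulate_resend_load(new_per_tick, capacity_per_tick, ticks):
    """Return the load offered to the service on each tick.

    Hypothetical model: results that are not processed in one tick
    are resent in full on the next tick, on top of the new results.
    """
    backlog = 0
    offered = []
    for _ in range(ticks):
        load = new_per_tick + backlog          # new results plus resends
        offered.append(load)
        processed = min(load, capacity_per_tick)
        backlog = load - processed             # unprocessed results are resent
    return offered


# Demand slightly above capacity: offered load keeps growing.
print(simulate_resend_load(new_per_tick=120, capacity_per_tick=100, ticks=5))
# Demand below capacity: offered load stays flat.
print(simulate_resend_load(new_per_tick=80, capacity_per_tick=100, ticks=5))
```

In this model, a service that is only slightly under capacity sees its offered load rise every tick, because the resends accumulate on top of the new results. Once capacity exceeds demand, the backlog drains and the load settles again.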
About 12 months ago we began making changes to the product that would allow us to scale by an order of magnitude while also allowing us to deploy cloud servers into different geographies. This work is almost complete and we are about to launch a new region in Sydney, with other geographies to follow. This represents a major step forward in the development of our cloud architecture, which was designed for scale from day one. It allows us to run any number of front-end servers, back-end micro-services and database servers, depending on load requirements. So as the number of connected probes and the amount of data we ingest increase, we can scale out horizontally as required. Unfortunately, this work was not ready in time, because our forecasting had indicated that it would not be required until next year.
In order to address the immediate issue, we released some of the above changes ahead of time:
- Load balancing some of our back-end micro-services across multiple servers.
- Increasing the capacity (CPU and RAM) of our core servers.
- Offloading compression and SSL functions to dedicated micro-services.
In the medium term we will be rolling out the full suite of changes:
- A new cloud region based in Sydney.
- A fully containerised back-end.
- Load balancing all of our back-end micro-services across multiple servers.
- The ability to easily scale out by simply adding more servers.
In the long term there are additional changes that will allow our platform to auto-scale by deploying additional servers automatically, as required.
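As a rough illustration of the auto-scaling idea, the sketch below works out how many servers a given load would need. It is an assumption-laden example, not a description of the platform: the function name, the headroom factor and the capacity figures are all hypothetical.

```python
import math


def servers_needed(load, per_server_capacity, headroom=1.0, min_servers=1):
    """Return the number of servers required to handle `load`.

    Hypothetical sizing rule: divide the load (scaled by a headroom
    factor) by what one server can handle, round up, and never go
    below a minimum number of servers.
    """
    if per_server_capacity <= 0:
        raise ValueError("per_server_capacity must be positive")
    required = math.ceil(load * headroom / per_server_capacity)
    return max(min_servers, required)


# 950 results per second at 100 per server needs 10 servers.
print(servers_needed(load=950, per_server_capacity=100))
# With 50% headroom (exactly representable as 1.5), the same load needs 15.
print(servers_needed(load=950, per_server_capacity=100, headroom=1.5))
```

An auto-scaler built on a rule like this would compare the result with the number of servers currently running and add or remove servers to close the gap.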
As with any unexpected problem, we've learnt a great deal, and we have made many positive changes to the product and our processes to ensure that problems like these don't occur again. Once again, we apologise for the inconvenience and thank you for your support.