We are an agile software development company, and agile is great for a “moving target”. We plan, work and implement changes in small batches, and ongoing refactoring is just the nature of what we do.
We recently added some functionality to one of our Java products, which utilises Apache Camel and ActiveMQ, and its traffic increased as well. The product has been in production for years now, with a defect rate very close to zero. Soon after we deployed the new code, our monitoring system triggered alerts about an unusually high number of TCP connections in the TIME_WAIT state on the server where the new code was running. We began troubleshooting and found that they were all ActiveMQ connections to our broker. Our developers immediately confirmed that
“there was no change on the ActiveMQ connection manager side.”
Well, it turned out that this was exactly the problem.