High-Volume Action Logging

What if your app just does too much to log?

One thing we kept stumbling over through the years was keeping track of what our high-volume servers were doing under a real production load. For example, one product we worked on processed between 40 and 60 million requests every day (which averages out to roughly 460 to 700 requests per second). We wanted information about hourly averages, how our load ebbed and flowed through the day, busy versus slow times, how long each request took to process, things like that.

When you are handling really high request volumes, you can't use mechanisms like our Performance Monitoring to track them. Even with our optimized database structures, 40 million entries a day would bog things down in no time.

So what to do?

We eventually came up with the concept of handling action "batches". Basically, we queue up the requests locally for a period of time, say a minute, and instead of logging the actions individually we log the batch totals for that minute. By logging the number of actions handled during the time period as well as the length of the time period itself (the number of seconds), we get a really good view into what's happening without much overhead at all. Magic.
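To make the batching idea concrete, here is a minimal sketch in Python. It is not our actual implementation: the class name `ActionBatcher`, the `sink` callback, and the injectable `clock` are all hypothetical names chosen for illustration. It also assumes a single thread, so a real server would need a lock around the counter.

```python
import time


class ActionBatcher:
    """Counts actions in memory and emits one summary record per window.

    Hypothetical illustration: names and record layout are assumptions.
    """

    def __init__(self, window_seconds=60.0, sink=print, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.sink = sink      # called with one dict per finished batch
        self.clock = clock    # injectable so the class can be tested
        self.window_start = clock()
        self.count = 0

    def record(self):
        """Register one handled action, closing the previous batch if due."""
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            self.flush(now)
        self.count += 1

    def flush(self, now=None):
        """Emit the current batch (if it has any actions) and start a new one."""
        now = self.clock() if now is None else now
        elapsed = now - self.window_start
        if self.count:
            # Log the real elapsed seconds, not the nominal window length,
            # because a batch is only closed when the next action arrives.
            self.sink({"actions": self.count, "seconds": elapsed})
        self.window_start = now
        self.count = 0
```

Because the batch is closed lazily, the recorded number of seconds can be longer than the nominal window, which is exactly why the duration is stored next to the count: dividing one by the other still gives an accurate rate.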

After we got that in place, we also added the ability to track the timings of the actions. By keeping track of the minimum timing, the maximum timing, and the total of all timings (which, divided by the action count, gives the average), we get a really good picture of how long things take to run. We also get the benefit of spotting potential performance issues proactively, so we can address them before they become a problem.
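The timing side can be sketched the same way. Again, this is only an illustration under assumed names (`TimingBatch`, `add`, `average`), not the code we run: each batch keeps a count, a running total, and the smallest and largest duration seen, so nothing per-request has to be stored.

```python
from dataclasses import dataclass


@dataclass
class TimingBatch:
    """Running timing statistics for one batch (hypothetical sketch)."""

    count: int = 0
    total: float = 0.0
    shortest: float = float("inf")  # stays inf until the first action arrives
    longest: float = 0.0

    def add(self, duration):
        """Fold one action's duration (any consistent unit) into the batch."""
        self.count += 1
        self.total += duration
        self.shortest = min(self.shortest, duration)
        self.longest = max(self.longest, duration)

    @property
    def average(self):
        """Mean duration, derived from the total and the count."""
        return self.total / self.count if self.count else 0.0
```

Four numbers per batch are enough to chart the average alongside the best and worst case. A widening gap between the average and the longest timing is the kind of early warning sign this approach is meant to surface.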
