I received an email today that reminded me of a topic I wanted to cover in my Running A ColdFusion Shop series.
Subject: Mail Queues Need Attention
Body: Hoth Spool could be backed up.
|query – Top 1 of 1 Rows|
| 04/17/2007 05:43:54 PM | C:\CFusionMX\Mail\Spool | Mail22186.cfmail | 719 |
To unblock spool:
Drain stop node (wlbs drainstop)
Restart ColdFusion Service.
Restore node (wlbs start)
Make sure before doing this that the node is actually backed up.
One or two messages in the queue does not a blockage make.
But depending on the number of items and length of their stay, you can make the determination.
What this email reminded me to say was: DRY isn’t just for code.
We had this problem a while back. Every once in a while, the mail queues on a random one of our ColdFusion servers would back up. The mail would remain stuck until the server was restarted. Developers and users started getting pissed about their application email being delayed.
The short-term solution was to check the queues every once in a while. Once the problem stopped occurring, we stopped checking… until it happened again. We looked for a hotfix, to no avail. We did this a couple of times, and each time we got burned when we stopped being vigilant.
Finally I said “Screw it; I’m scripting a solution to it.” I started checking to see if files were in the spool directory of all of our servers. Then I had to make sure that the files had sat there for a little bit instead of just having been written there. Finally I had to send an alert with the needed fix.
No big deal: it’s not advanced programming, and it’s not even particularly good code.
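For anyone who wants to try the same thing, here is a rough sketch of those three steps in Python. It is not the script I actually run; the function names, the ten-minute threshold, and the alert wording are all assumptions made up for illustration, and actually mailing the alert is left to whatever mechanism you already have.

```python
import time
from pathlib import Path


def find_stale_files(spool_dir, min_age_seconds, now=None):
    """Return (path, age_in_seconds) for files that have sat in spool_dir too long.

    Hypothetical helper: a file only counts once it is at least
    min_age_seconds old, so mail that was just written is ignored.
    """
    now = time.time() if now is None else now
    stale = []
    for entry in Path(spool_dir).iterdir():
        if not entry.is_file():
            continue
        age = now - entry.stat().st_mtime
        if age >= min_age_seconds:
            stale.append((entry, age))
    # Oldest files first, since they are the best evidence of a blockage.
    return sorted(stale, key=lambda item: item[1], reverse=True)


def build_alert(server_name, stale):
    """Build the alert text, or return None when nothing looks stuck."""
    if not stale:
        return None
    lines = [f"{server_name} spool could be backed up."]
    for path, age in stale:
        lines.append(f"{path.name} has been waiting {int(age // 60)} minutes")
    lines.append(
        "To unblock spool: wlbs drainstop, restart the ColdFusion service, wlbs start."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed values: point this at each server's spool directory.
    stale_files = find_stale_files(".", min_age_seconds=600)
    alert = build_alert("ExampleServer", stale_files)
    if alert:
        print(alert)
```

Run something like this from a scheduled task against each server's spool directory, and send the text out whenever `build_alert` returns something other than `None`.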
The point here is that it has been running in a scheduled task since I wrote it in 2005 (and restarted it earlier this year). We had this problem again today, and nobody outside of my team noticed.
So what’s the point of this besides a little bragging? Don’t do things by hand that you can automate and forget about. In other words, DRY isn’t just for code.
4 thoughts on “Running a ColdFusion Shop Part 4”
Hoth? As in “ice planet of”? First technical term that I’ve understood in your blog in months.
Yeah, you can program a workaround, but it would be nice if they fixed the problem. I’ve had this issue for years, and basically solved it in the same fashion.
It’s like the CFApplication bug where the tag blows up when only one of the cfid/cftoken cookie pairs is present. Been there for I don’t know how many versions, and I fully expect it to be there when Scorpio ships.
From time to time the mail spooler internal service will crash out. The only way to fix it is to restart CF.
I had one of our VB devs write a little file checker utility. I plug in all my UNC paths and set a timer. If it checks the path and finds files so many times in so many minutes, it will pop a little alert on my screen. It also checks the undelivr folder for files.
@Jeff – We are nothing if not geeks here.
@Michael – From what I’ve seen, it’s very hard to troubleshoot. It seems to happen when our mail servers have problems. The connection breaks and does not get picked back up.
@Tom – Wow, same principle. We do a similar thing for undelivr as well.