I was talking with TeraTech’s server tuning consultant Mike Brunt about ColdFusion tuning and clustering which we did for a client to improve stability and performance. I thought others might be interested in what we did with server metrics, threads, logs and garbage collection in the Java Virtual Machine (JVM). We edited two files, as follows…
1/ {JRun4_root}servers\default\SERVER-INF\jrun.xml
We edited the following lines Lines 126 through 130, we uncommented this section (it is sometimes already uncommented) made sure it was set to true
<!– This Service provides metrics information –>
<!– ================================================================== –>
<service class=”coldfusion.server.jrun4.metrics.MetricsServiceAdapter” name=”MetricsService”>
<attribute name=”bindToJNDI”>true</attribute>
</service>
We then edited the section from line 148 through 151 as follows…
<!– You may also need to uncomment MetricsService if you want metrics enabled –>
<attribute name=”metricsEnabled”>true</attribute>
<attribute name=”metricsLogFrequency”>60</attribute>
<attribute name=”metricsFormat”>Web threads (busy/total): {busyTh}/{totalTh} Sessions: {sessions} Total Memory={totalMemory} Free={freeMemory}</attribute>
We changed <attribute name=”metricsEnabled”>true</attribute> to true which turns on Metrics Logging we made no more changes to this section but it is good to note what they do. <attribute name=”metricsLogFrequency”>60</attribute> sets the frequency of the metrics logging to 60 second intervals, that can be changed, in my experience, 60 seconds is typically a good time interval.
The line below that determines what is displayed; {busyTh}/{totalTh} will show all busy threads and total threads in use. The busy threads value shows you how many threads are actually in use when this snapshot was taken and the total the actual busy threads along with threads in every other state. There is a good write up of different thread states here – http://www.bpurcell.org/blog/index.cfm?mode=entry&entry=934. The total thread number shows all threads in every state, as a very broad rule of thumb, the greater the difference between busy and total the more efficiently the application is running.
Sessions: {sessions} shows the number of Java-J2EE sessions running which is enabled by turning on “Use J2EE Session Variables” in the Memory Variables section of CF Admin. This will show you how many J2EE sessions are running every 60 seconds. This is an indication of activity and this number will rise and fall as user sessions begin and end.
Total Memory={totalMemory} Free={freeMemory} is fairly self explanatory and shows the memory state in 60 second intervals, ideally we do not want to see free memory drop too low, I would get uncomfortable if it drops below 75 MB or 10% of the total. Also, we want to see any used memory being released as quickly as possible.
We next edited the way logs are generated and created a split out of the logs by adding {log.level} to line 155
<attribute name=”filename”>{jrun.rootdir}/logs/{jrun.server.name}-{log.level}-event.log</attribute>
This last change enables more targeted logs which will be written to {JRun4_root}\logs.
I recommend making these changes on all Production CFMX 6.1 systems and leaving them on, the information produced can be invaluable for troubleshooting.
2/ {JRun4_root}\bin\jvm.config
This files is the main configuration file for the JVM and is loaded at start time. After observing application behavior metrics from the logging changes above we applied the following changes to jvm.config…
-Xms768m -Xmx768m – This sets the start and maximum memory allotted to the heap at 768MB. I typically set this number as high as is needed by the metrics logging and verbose garbage collection information. In practical terms, on 32 bit Windows, it cannot be set any higher than 1.4GB.
-XX:PermSize=128m -XX:MaxPermSize=128m – Here we added a start size for the permanent generation which I where the classes for JRun-CF are stored. We did this because we observed all of this memory space being fully consumed many times, in the verbose garbage collection log.
-Dsun.rmi.dgc.client.gcInterval=600000 -Dsun.rmi.dgc.server.gcInterval=600000 – These arguments generate an explicit Full Garbage Collection (Full GC) at 10 minute intervals. We applied these because we observed Full GC's taking place at one minute intervals and in some cases at 30 second intervals. This is far too often as when there is a Full GC, everything else on the server stops.
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc -Xloggc:teratechgc.log – This argument set turns on logging of all garbage collections in a verbose manner and this is where we got our information for the memory settings along with the metrics information. We turned this logging off after making the changes and checking that all was OK.