JPROBE
As any developer who has built Java applications will tell you, memory and performance issues can be extremely difficult, frustrating and, consequently, time-consuming to detect, diagnose and ultimately resolve.
DSI employs Quest Software’s market-leading Java performance tuning toolset, JProbe Suite, to help our development and test teams identify performance, memory, threading and code coverage issues down to the line of code.
The JProbe Suite contains three tools to assist in your investigations:
JProbe Profiler - To identify method and line level performance bottlenecks
JProbe Memory Debugger - To investigate memory leaks and excessive garbage collection
JProbe Coverage - To measure code coverage after testing
DSI has used JProbe Profiler and the Memory Debugger on both highly scalable real-time applications and high-throughput, high-performance batch applications. Typically, load testing of these types of applications reveals performance bottlenecks or memory issues. Once an issue has been identified, we re-run the load tests in a JProbe Profiler-enabled environment. This allows us to establish a baseline from which a report can be generated. The report highlights method hotspots, indicating the number of method calls, the cumulative time spent in each method and the explicit method time. A graphical representation of the method call stack is also available, displaying the ‘flow’ of the application in an easy-to-understand, colour-coded manner. The report is analysed by a lead developer/application architect who has an expert understanding of how the application should behave. Problematic code is then identified, down to the line of code if necessary. A code fix is applied and the load test is re-run. We then compare the baseline report against the report generated by the new, improved code. Any change in code performance is displayed as a plus or minus percentage difference from the baseline. This process continues until all SLAs are met.
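The baseline comparison described above boils down to simple arithmetic. The following is a minimal sketch of that calculation only, not JProbe's own report format or API: the class name BaselineComparison, its methods and the sample method names and timings are all hypothetical and chosen purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of JProbe): compares per-method timings from two runs. */
final class BaselineComparison {

    /** Percentage change relative to the baseline; a negative value means the new run is faster. */
    static double percentChange(double baselineMs, double currentMs) {
        if (baselineMs <= 0) {
            throw new IllegalArgumentException("baseline time must be positive");
        }
        return (currentMs - baselineMs) / baselineMs * 100.0;
    }

    /** Returns the percentage change for every method present in both runs, sorted by name. */
    static Map<String, Double> compare(Map<String, Double> baseline, Map<String, Double> current) {
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Double> entry : baseline.entrySet()) {
            Double now = current.get(entry.getKey());
            if (now != null) {
                result.put(entry.getKey(), percentChange(entry.getValue(), now));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up cumulative times in milliseconds, for illustration only.
        Map<String, Double> baseline = new LinkedHashMap<>();
        baseline.put("OrderDao.load", 200.0);
        baseline.put("DateUtil.parse", 80.0);

        Map<String, Double> current = new LinkedHashMap<>();
        current.put("OrderDao.load", 150.0);
        current.put("DateUtil.parse", 100.0);

        compare(baseline, current)
                .forEach((method, pct) -> System.out.printf("%-16s %+.1f%%%n", method, pct));
    }
}
```

A method that gets slower after a fix shows up with a positive percentage, which is usually the signal to revisit that change before the next load test.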
We have used JProbe Profiler to identify and resolve the following types of issues:
- Inefficient database connection pooling
- Inefficient use of the Collections API
- Inefficient use of the Calendar object, which was eventually replaced with the open-source Joda-Time library
- Inefficient looping
- Redundant code
- Inefficient object cycling
- Inefficient IO
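To give a flavour of what one of these issues can look like, here is a small, hypothetical example of inefficient use of the Collections API. It is not the code we actually found; the class and method names are invented for illustration. The slow version performs a linear scan for every lookup, which is the kind of pattern that tends to appear as a method hotspot with a very high call count.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of a Collections API hotspot and one possible fix. */
final class CollectionsHotspot {

    /** Slow: List.contains scans the list linearly for every candidate. */
    static int countKnownSlow(List<String> known, List<String> candidates) {
        int hits = 0;
        for (String candidate : candidates) {
            if (known.contains(candidate)) {
                hits++;
            }
        }
        return hits;
    }

    /** Faster: copy the list into a HashSet once, then do hash-based lookups. */
    static int countKnownFast(List<String> known, List<String> candidates) {
        Set<String> knownSet = new HashSet<>(known);
        int hits = 0;
        for (String candidate : candidates) {
            if (knownSet.contains(candidate)) {
                hits++;
            }
        }
        return hits;
    }
}
```

Both methods return the same count; only the cost of each lookup differs. Whether such a change is worthwhile depends on the collection sizes seen under load, which is exactly what the profiler report tells you.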
Thankfully, we have been called upon to use JProbe Memory Debugger only once. During load testing it was observed that the performance of a batch application was decreasing over time, and on observing the memory graph generated by the profiler, it was clear that the issue was memory related. The graph indicated that memory usage was increasing exponentially, thus causing excessive garbage collection. After running the application load test through the Memory Debugger and analysing the resulting report in depth, we found the culprit. The application was using multiple levels of caching: Hibernate query caching and two levels of caching implemented with OSCache. The application in question was a batch application in which each ‘job’ was unique. The report indicated that the cache object maintained by the Hibernate query cache contained ‘hard references’ to each and every query (String object) of a certain type invoked by the application. The object graph maintained by this cache got ‘deeper’ and consumed more memory during the lifetime of a batch run. We disabled Hibernate query caching, re-ran the load test, and the memory issue disappeared. This issue would have been nearly impossible to identify without a tool like JProbe Memory Debugger.
As a side effect of this, we started thinking about the caching approach used in the application. We soon realised that we did not in fact need the two levels of object caching provided by OSCache. The batch application ‘inherited’ a set of services that were written for a real-time system. The nature of the real-time application required caching, but this was not the case with the batch application. Object caching was switched off at the ‘inherited’ service level, saving further memory. Lower memory consumption generally means improved performance.
Tools like JProbe Profiler and Memory Debugger do help you fix any immediate problems that you may be having, but they also really get you thinking about your code. You learn that not all solutions are applicable in all cases: ‘horses for courses’, so to speak.
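The failure mode behind the query cache problem can be reproduced in a few lines. The sketch below is a simplified model under stated assumptions, not Hibernate's or OSCache's actual implementation: a plain HashMap stands in for a cache that keeps hard references, and every name (QueryCacheSketch, BoundedCache, runBatch) is hypothetical. Our fix was simply to disable the cache; the size-bounded LRU map is a different technique, shown only for contrast.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of a cache fed with unique keys; not Hibernate or OSCache code. */
final class QueryCacheSketch {

    /** Size-bounded LRU map: the least recently used entry is evicted once the limit is exceeded. */
    static final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        BoundedCache(int maxEntries) {
            super(16, 0.75f, true); // true = order entries by access, giving LRU eviction
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
        }
    }

    /** Simulates a batch run in which every job issues a unique query; returns the final cache size. */
    static int runBatch(Map<String, Object> cache, int jobs) {
        for (int i = 0; i < jobs; i++) {
            String query = "select * from job where id = " + i; // unique per job, so never a cache hit
            cache.put(query, new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        Map<String, Object> unbounded = new HashMap<>();
        Map<String, Object> bounded = new BoundedCache<>(100);
        System.out.println("unbounded entries: " + runBatch(unbounded, 1000));
        System.out.println("bounded entries:   " + runBatch(bounded, 1000));
    }
}
```

Because every key is unique, neither cache ever produces a hit. The unbounded map retains every entry for the whole run, while the bounded one merely caps the waste, which is why switching the cache off was the right answer for this batch workload.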
PERFORMASURE
At DSI we use Quest Software’s PerformaSure software during the production implementation phase of a project. A lead developer/application architect goes on-site to the customer, installs the PerformaSure software in their production environment and monitors application performance during the rollout of the application.
Depending on the application, we generally recommend a two-week rollout period before the ‘go-live’ date. Our customized licensing agreement allows us to run PerformaSure in any type of production environment, regardless of the OS, CPU count, number of application servers in the cluster or number of physical machines. This allows our team of experts to observe the application under a combination of virtual and real load in a production environment. Reports can be generated for both business and technical management. The fact that these reports, and the detail contained therein, can be generated gives both business and technical management a strong sense of confidence in the robustness and performance of an application.
The PerformaSure agent can run passively (almost 0% overhead on the server(s)) in the production environment. If and when a potential issue arises, our PerformaSure experts can activate the system agent from a remote workstation. The agent then begins collecting various metrics and relays this information to the PerformaSure analytical system, the Nexus. When the system agent is active, it adds less than 5% overhead on the server(s). The metrics relayed to the Nexus can then be analysed by a user via the Workstation. The reports displayed by the Workstation can present specific information on a specific technology in the technology stack, e.g. JNDI, Servlet, EJB, MDB, Apache Struts, Taglib etc. The ability to have this level of information at your fingertips in a production environment is priceless. It gives the customer a sense of security and gives the development team a ‘third eye’ that allows them to react to issues if and when they happen.
We generally leave PerformaSure running in the production environment for one month, so that even when we are not on site we can activate the system agent remotely and connect to the Nexus to assist in problem identification and resolution. We feel that this gives real meaning to any warranty that we give with our software.
- Jay