Thread Dump: Performance management and Monitoring strategy for SOA

The following is a performance management and monitoring strategy that an SOA-based solution architecture could adopt. Please share your thoughts.

Performance Management

- Continual performance evaluation (in transactions per second, TPS) of the various services, using stress scenarios such as the following:

Simple strategy: Runs the specified number of requests with a specified delay between runs, giving the server some "breathing space".
Fixed rate strategy: Guarantees a fixed number of requests in a given time period.
Variable load strategy: Varies the load over the course of the test. The key variable load strategies are:

Variance: Varies the number of requests over time in a “sawtooth” manner as configured; set the Interval to the desired period and the Variance to how much the number of requests should decrease and increase.
Burst: Designed specifically for recovery testing, this takes variance to its extreme; it does nothing for the configured Burst Delay, then runs the configured number of requests for the “Burst Duration”, and then goes back to sleep.
Thread: Changes the number of requests linearly from one level to another over the run of the stress test. Its main purpose is to identify the level at which certain statistics change or events occur, for example the request count at which the maximum TPS or BPS is reached, or the request count at which functional testing errors start occurring.
Grid: Lets you configure the relative change in the number of requests over time. It is mainly used for more advanced scenario and recovery testing, where you need to see how the services behave under varying loads and load changes.
Script: The ultimate customization option; the script you specify is called regularly and should return the desired number of requests at that point in time.
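
To make the variable load strategies above more concrete, here is a minimal Java sketch of how a Script-style callback might compute the desired number of requests at a given moment. The class name, method names, parameters and formulas are all my own assumptions for illustration; they are not the API of any of the load testing tools, and a real tool may shape the load differently.

```java
/** Hypothetical load-profile helpers; names and formulas are illustrative, not a tool's API. */
final class LoadProfiles {
    private LoadProfiles() {}

    /**
     * Sawtooth ("variance") profile: the request count ramps linearly from
     * base - variance up towards base + variance over one interval, then resets.
     */
    static int variance(int base, int variance, long intervalMs, long elapsedMs) {
        double phase = (double) (elapsedMs % intervalMs) / intervalMs; // 0.0 inclusive to 1.0 exclusive
        long offset = Math.round((2.0 * phase - 1.0) * variance);
        return (int) Math.max(0, base + offset); // never ask for a negative request count
    }

    /** Burst profile: idle for burstDelayMs, then full load for burstDurationMs, repeating. */
    static int burst(int requests, long burstDelayMs, long burstDurationMs, long elapsedMs) {
        long position = elapsedMs % (burstDelayMs + burstDurationMs);
        return position < burstDelayMs ? 0 : requests;
    }

    /** Linear ramp from startCount to endCount over the whole run, clamped at both ends. */
    static int ramp(int startCount, int endCount, long runMs, long elapsedMs) {
        double fraction = Math.min(1.0, Math.max(0.0, (double) elapsedMs / runMs));
        return (int) Math.round(startCount + fraction * (endCount - startCount));
    }
}
```

A test driver would call one of these methods on every tick with the elapsed time and adjust the number of in-flight requests to the returned value.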

- Tools such as SoapUI, The Grinder, or Apache JMeter could be used to implement the above stress strategies.

- Java code instrumentation, using a custom logging framework, to collect timings for the various I/O (disk and network) tasks in any API.

- A distributed agent-collector architecture for storing performance metrics on a cluster of centralized log servers. Log aggregation could use a framework such as Flume (Cloudera) or Scribe (Facebook), or NIO-based socket server implementations on the centralized log servers.

- Integration with a monitoring tool such as Nagios for active monitoring of service performance, degradation, etc.

- Web services to expose the performance metrics to internal and external consumers.

- Use of the performance metrics data for business intelligence. An ETL tool would transfer the data from the log servers to the BI server.
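
As an illustration of the Java code instrumentation item above, here is a minimal sketch of a timing wrapper that records how long an I/O task takes and writes one metric line per call. It uses java.util.logging only as a stand-in for the custom logging framework; the class name, the logger name "perf" and the key=value line format are assumptions of mine, not an established convention.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

/** Hypothetical timing wrapper; class, logger, and field names are illustrative. */
final class TimedCall {
    private static final Logger LOG = Logger.getLogger("perf");

    private TimedCall() {}

    /** Runs the task, logs how long it took, and returns the task's result. */
    static <T> T time(String operation, Supplier<T> task) {
        long start = System.nanoTime(); // monotonic clock, suited to measuring durations
        boolean ok = false;
        try {
            T result = task.get();
            ok = true;
            return result;
        } finally {
            // Runs on success and on failure, so failed calls are timed as well.
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            LOG.info(formatMetric(operation, elapsedMicros, ok));
        }
    }

    /** Builds one key=value metric line that a log agent could ship to a collector. */
    static String formatMetric(String operation, long elapsedMicros, boolean ok) {
        return "perf op=" + operation
                + " elapsed_us=" + elapsedMicros
                + " status=" + (ok ? "ok" : "error");
    }
}
```

A disk or network call would then be wrapped as TimedCall.time("disk.read", () -> ...). Since Supplier cannot throw checked exceptions, code that throws IOException would need to wrap it, or the sketch could be changed to take a Callable instead.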

Monitoring

- Performance metrics monitoring based on the log messages arriving at the centralized log servers. Log aggregation could use a framework such as Flume (Cloudera) or Scribe (Facebook), or NIO-based socket server implementations on the centralized log servers.

- Service and server configuration monitoring to track configuration changes.

- Custom alerts that send notifications for scenarios such as configuration changes and service degradation detected from the performance metrics.

- Web services to expose the monitoring information to internal and external consumers.

- Integration with a monitoring tool such as Nagios for active monitoring of service performance, degradation, etc.

- Use of the monitoring information for business intelligence. An ETL tool would transfer the data from the log servers to the BI server.
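
To sketch how the service degradation alerts and the Nagios integration above might fit together, the following Java example maps a measured latency to a Nagios plugin status. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the performance data after the "|" follow the Nagios plugin convention; the class name, the service name and the thresholds are hypothetical, and where the latency value comes from (for example the aggregated metrics on the log servers) is left open.

```java
import java.util.Locale;

/** Hypothetical latency check; thresholds and service names are illustrative. */
final class NagiosLatencyCheck {
    // Exit codes defined by the Nagios plugin convention.
    static final int OK = 0;
    static final int WARNING = 1;
    static final int CRITICAL = 2;
    static final int UNKNOWN = 3;

    private static final String[] LABELS = {"OK", "WARNING", "CRITICAL", "UNKNOWN"};

    private NagiosLatencyCheck() {}

    /** Classifies a latency measurement against warning and critical thresholds. */
    static int status(double latencyMs, double warnMs, double critMs) {
        if (Double.isNaN(latencyMs) || latencyMs < 0) {
            return UNKNOWN; // no usable measurement
        }
        if (latencyMs >= critMs) {
            return CRITICAL;
        }
        if (latencyMs >= warnMs) {
            return WARNING;
        }
        return OK;
    }

    /** Builds the plugin output line: readable text, then "|" and performance data. */
    static String statusLine(String service, double latencyMs, double warnMs, double critMs) {
        int code = status(latencyMs, warnMs, critMs);
        return String.format(Locale.ROOT,
                "%s %s - latency %.1f ms | latency=%.1fms;%.1f;%.1f",
                service, LABELS[code], latencyMs, latencyMs, warnMs, critMs);
    }
}
```

A command-line wrapper would print the result of statusLine and then call System.exit with the value returned by status, so that Nagios picks up both the text and the state.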

Monday, January 17, 2011