This document describes the critical performance factors of the Polarion platform, its scalability pitfalls and limitations, and recommendations for capacity planning of your production environment. It does so based on a few scenarios that are representative of our install base.
System Configuration Landscape
Polarion is a web-based application. Clients interface with it through a standard web browser. As a result, Polarion can be accessed via LAN, WAN (e.g. company-internal inter-office networks), the Internet, or a VPN.
Polarion is built on top of a growing number of open source frameworks and APIs. All of these components have their own performance and scalability characteristics, only some of which are relevant to Polarion, because of how these components are used by the Polarion platform. Some factors may not even come into play except in very large, or very heavily stressed environments.
Apache HTTP Server
Apache is the de facto web server solution, rivaled only by IIS on Windows platforms. It is extremely robust, and powers some of the largest websites on the Internet.
Scalability and Performance
There are no known issues related to Apache as a performance bottleneck for Polarion.
The Polarion installation procedure installs and configures Apache for you. It is recommended that you don’t deviate from this configuration unless you have a very specific reason to, and only after consulting with your Polarion Support Team, to avoid unnecessary performance degradation or system downtime.
Subversion
Since its initial launch in 2004, Subversion[i] has quickly become the de facto solution for version control in organizations of all sizes around the world. Polarion uses Subversion to store almost all of its configuration and data[ii].
Scalability and Performance
One of Subversion’s key strengths is also its primary performance bottleneck. Subversion uses a strict transactional commit model, so a repository can process only one commit transaction at a time. If the repository is experiencing heavy write traffic, write requests are queued and processed one at a time.
That being said, Polarion change sets are typically in the range of kilobytes, which is far from heavy lifting for Subversion.
Subversion has no proven upper limit as far as repository size is concerned. The Apache Software Foundation uses a single repository for all of its projects (including Subversion itself, as well as the Apache HTTP Server project), which consists of well over 900k revisions[iii].
The key to commit performance is to limit the amount of processing done in the various commit hooks on the Subversion repository itself. Any non-trivial automation should run out of process, separately from the commit itself, whenever and wherever possible.
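As a minimal sketch of keeping hook work out of the commit path, a post-commit hook could do nothing more than record the new revision in a spool directory and exit, leaving the real automation to a separate worker process. The spool location, the `HOOK_SPOOL_DIR` variable, the `enqueue_job` helper and the job file naming are all hypothetical and are not part of Polarion or Subversion; the only thing taken from Subversion is that a post-commit hook receives the repository path and the revision number as its first two arguments.

```python
#!/usr/bin/env python3
"""Post-commit hook sketch: record the commit and return immediately."""
import json
import os
import sys
import tempfile
import time

# Hypothetical spool location; a separate worker process would poll it.
SPOOL_DIR = os.environ.get("HOOK_SPOOL_DIR", "/var/spool/svn-hook-jobs")


def enqueue_job(repos_path, revision, spool_dir=SPOOL_DIR):
    """Write one small JSON job file atomically and return its path."""
    os.makedirs(spool_dir, exist_ok=True)
    job = {
        "repos": repos_path,
        "revision": int(revision),
        "queued_at": time.time(),
    }
    # Write to a temporary file first, then rename it, so that the
    # worker never picks up a partially written job file.
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(job, handle)
    final_path = os.path.join(spool_dir, "r%d.json" % job["revision"])
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Subversion passes the repository path and the new revision
    # number as the first two arguments of a post-commit hook.
    enqueue_job(sys.argv[1], sys.argv[2])
```

The hook itself only performs one small file write, so the time it adds to each commit stays roughly constant regardless of how expensive the downstream automation is.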
Subversion requires ultra-fast access to its file system. Anything other than physically attached storage is strongly discouraged.
Lucene
Polarion stores all of its (XML) data (and configuration) in Subversion. Without a database backend, we still have to be able to query, quickly retrieve, and filter data. Lucene[iv] is a powerful indexing framework, and Polarion uses it to implement what we affectionately refer to as “The Index”, which the Polarion platform uses for all of its read operations.
Scalability and Performance
Lucene has built-in mechanisms to balance fast in-memory storage with on-disk overflow to limit the memory footprint of larger data sets. Beyond that, there is no known upper limit to what Lucene can handle. It scales in an almost perfectly linear fashion when increasing the number of concurrent requests.
The critical performance factor for Lucene is the number of returned results. To mitigate this, Polarion makes heavy use of lazy loading whenever possible.
Users are encouraged to narrow the scope of their queries to extract more relevant results.
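To illustrate why lazy loading limits the cost of large result sets, the following generic sketch fetches results one page at a time and only when the consumer asks for more. It does not use Polarion's or Lucene's API; `lazy_results`, `fetch_page` and the page size are hypothetical names chosen for this example.

```python
from itertools import islice


def lazy_results(fetch_page, page_size=50):
    """Yield results one by one, fetching a new page only when needed.

    `fetch_page(offset, limit)` is a hypothetical callable that returns
    a list of at most `limit` results starting at `offset`.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            # A short page means there is nothing left to fetch.
            return
        offset += len(page)


if __name__ == "__main__":
    # In-memory stand-in for an index holding 10,000 matching items.
    items = ["WI-%d" % n for n in range(10000)]
    requested_offsets = []

    def fetch_page(offset, limit):
        requested_offsets.append(offset)
        return items[offset:offset + limit]

    # Only the items actually consumed cause pages to be fetched.
    first_ten = list(islice(lazy_results(fetch_page), 10))
    print(first_ten, "pages fetched:", len(requested_offsets))
```

A narrower query helps in the same way: fewer matching results means fewer pages ever need to be loaded.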
Scalability and Performance
Work items contained in a Document are stored separately from other work items in a project, and are treated differently because of the added constraints of the Document concept. As a result, documents with fewer than 5,000 work items are well supported, while exceeding that number may cause significant performance degradation.
No single Document should contain more than 5,000 work items.
Overall Performance and Scalability
This topic needs to be looked at from two separate yet related perspectives: load and volume of data.
When looking at Polarion as a whole, you will notice it scales almost perfectly linearly on dual-core CPU platforms. When moving to 8 CPU cores, the application essentially scales infinitely, for all practical intents and purposes.
The graph shows that, with 100 concurrent users actively working with the server, 80% of save operations are served in less than 4 seconds on a single Intel i7 CPU platform.
* If 100 users each change a work item every 5 minutes, there is a statistically computed 80% probability that fewer than 5 save operations run in parallel.
As long as Polarion’s data remains within the parameters set out earlier in this document, scalability of volume is limited by memory (RAM) consumption, and manifests itself as a largely linear relationship between the number of projects in one repository and Polarion’s overall memory footprint.
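An estimate of how many save operations overlap can be sketched with a simple binomial model. This is only an illustration under stated assumptions: users are assumed to save independently, and the save duration of 4 seconds is an assumed input. The text does not say how its own figure was derived, so the value this sketch produces need not match it. The function name `prob_fewer_than` is hypothetical.

```python
from math import comb


def prob_fewer_than(k, users, busy_fraction):
    """Probability that fewer than `k` of `users` are saving at once.

    Assumes each user is independently in the middle of a save with
    probability `busy_fraction` (a binomial model).
    """
    return sum(
        comb(users, i) * busy_fraction ** i * (1 - busy_fraction) ** (users - i)
        for i in range(k)
    )


if __name__ == "__main__":
    users = 100              # from the text
    interval_s = 5 * 60      # one change per user every 5 minutes (from the text)
    save_duration_s = 4.0    # ASSUMPTION: how long a single save takes

    # Fraction of time a single user spends inside a save operation.
    busy_fraction = save_duration_s / interval_s
    print(prob_fewer_than(5, users, busy_fraction))
```

Changing `save_duration_s` or the interval shows how sensitive the overlap probability is to slower saves or more active users.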
Reference Customer Installations
The following examples show a range of customer installations. They illustrate the kinds of numbers you may need to consider when planning the scalability of your own installation.
15,000 work items
25,000 work items
14,000 work items
External Factors and Recommendations
Like most other rich web-based applications, Polarion caches a lot of dynamic content in the browser. As a result, memory consumption of the browser process can balloon over time. Polarion recommends that you close your browser after using Polarion, to keep this from becoming a problem.
Polarion can be heavily dependent on disk operations, especially as the server scales to where a growing portion of the index is serialized to the file system.
Performance of the index can be sensitive to disk fragmentation. No operating system is immune to this. Most Linux distributions offer the option of using the ext3 file system, which has features to prevent meaningful fragmentation altogether. In all other cases, regularly scheduled defragmentation is highly recommended.
Subversion requires local or as-fast-as-local file system access to the repository. We strongly recommend either an internal drive, or attached storage (fiber-optic connection). If network-attached storage (NAS) must be used, the length, speed and stability of the network path between server and storage are absolutely critical.
Use of Solid-State Disks (SSD) should be considered carefully. Relatively low-priced devices suffer from degradation of write performance over time. We recommend using only SSDs with proven performance for Polarion, since Polarion and Subversion make a lot of small writes to the disk.
Real-time virus scanning can cripple file system performance like nothing else. We recommend you exclude the Subversion repository file structure and all on-disk index (Lucene) data from being scanned, and schedule any scans you feel are needed overnight.
Windows is generally slower than Linux in all relevant areas[i]. Beyond ease of installation, there is no recommendation as to a particular flavor of Linux.
Selecting a 64-bit rather than a 32-bit operating system allows for more memory to be assigned to the server process. Even if this is not an immediate concern, it makes for much easier scalability, as replacing a 32-bit with a 64-bit operating system down the road is going to be an intrusive exercise.
For production environments, we recommend a 64-bit Linux system with at least 4 GB of RAM.
Make sure that the operating system has enough file handles available for Polarion. Since Windows has a reasonably good default, this is mostly a Linux-specific issue. The Polarion process should have access to 32K file handles for stable performance.
The Polarion client interface relies heavily on quick, short bursts of communication with the server. Network latency is a major factor in client performance degradation. For this reason, the network round-trip time (ping) between client and server should ideally be no more than 150 ms.
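On Linux, the limit that applies to a process can be inspected with Python's standard `resource` module. The sketch below reads "32K" as 32,768, which is an assumption, and it reports the limit of the process running the script, not of the Polarion process itself, so it would need to be run as the same user and in the same environment as the server. The helper names are hypothetical.

```python
import resource

# ASSUMPTION: "32K" in the text is read as 32 * 1024 = 32,768 handles.
REQUIRED_HANDLES = 32 * 1024


def file_handle_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_enough_handles(soft_limit, required=REQUIRED_HANDLES):
    """True if the soft limit is unlimited or at least `required`."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


if __name__ == "__main__":
    soft, hard = file_handle_limits()
    print("soft limit:", soft, "hard limit:", hard)
    if not has_enough_handles(soft):
        print("Soft limit is below", REQUIRED_HANDLES, "file handles.")
```

The soft limit is the one that takes effect; it can be raised up to the hard limit without administrator privileges.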
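Where ICMP ping is blocked, the round trip can be approximated from a client machine by timing a TCP connection to the server's web port. This is a rough sketch: the connect time includes name resolution and connection setup overhead, so it is an upper-bound estimate rather than a true ping. The function names are hypothetical, and the host and port to test are whatever your own installation uses.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Time a single TCP connection attempt, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def median_round_trip_ms(host, port, samples=5):
    """Median of several connection timings, to smooth out outliers."""
    times = sorted(tcp_connect_ms(host, port) for _ in range(samples))
    return times[len(times) // 2]


def within_guideline(round_trip_ms, limit_ms=150.0):
    """Compare a measurement with the 150 ms guideline from the text."""
    return round_trip_ms <= limit_ms
```

For example, `within_guideline(median_round_trip_ms(host, 443))` could be run from a representative remote office, with `host` set to the name of your Polarion server.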
Virtual environments come at a performance cost, since hardware components such as memory, network, graphics and even storage, are emulated by software. As a result, any application that runs in a virtual machine (VM) will perform worse compared with running the same application on dedicated actual hardware with the same specification.
Polarion recommends that you only run Polarion on virtual machines running Linux. This is largely due to Windows being a slower, larger-footprint operating system to begin with, a fact that only gets amplified by adding virtualization to the mix.