
Top 5 Core Performance Tips for AIX Administrators

As an AIX administrator, I’m often asked which areas are the most important to examine. In my view, performance is at the top of any list. In this article we’ll discuss five tips and areas you should put at the top of your performance list.

Create a Plan for Continuous Improvement
 
The most important component of performance has nothing to do with kernel parameters. It’s about having a plan, a methodology, for continuous performance improvement. Here is my five-part plan:
  1. Establish a baseline
    Before tuning a system, it is imperative to establish a baseline. The baseline is a snapshot of what the system looks like when it is first put into production, while it is performing at levels the business considers acceptable. This baseline should capture not only performance statistics but also the actual configuration of your system (amount of memory, CPU, and disk).
  2. Stress test and monitor
    Monitor your system continuously, starting when you first deploy it, not only when it is broken. You should have a historical record of the system. You should also get to know vmstat and nmon, my two favorite utilities. vmstat is more no-nonsense and nmon is prettier. You should also download the nmon analyzer, which provides you with nice graphs to show management.
  3. Identify bottlenecks
  4. Tune
    Make sure you make only one change at a time. If you put in more than one change, it will be difficult to know which change made the system run better — or worse!
  5. Repeat 
    Stress test, monitor, identify bottlenecks, tune. . . . 
Finally — before you tune anything in production — implement the fixes in your development or test environments. If you’re an enterprise shop, there is no excuse not to have these environments.
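
The baseline step can be sketched in a few lines of Python. This is only an illustration of comparing current readings against a saved baseline. The metric names and the 20 percent tolerance are hypothetical and would need to be replaced with whatever you actually collect from vmstat or nmon.

```python
def compare_to_baseline(baseline, current, tolerance=0.20):
    """Return the metrics whose current value drifts more than
    `tolerance` (as a fraction) away from the baseline value.

    The 20 percent default is an assumed threshold, not a recommendation.
    """
    drifted = {}
    for name, base_value in baseline.items():
        if name not in current:
            continue  # metric was not collected this time
        now = current[name]
        if base_value == 0:
            # Relative change is undefined, so flag any nonzero reading.
            changed = now != 0
        else:
            changed = abs(now - base_value) / abs(base_value) > tolerance
        if changed:
            drifted[name] = (base_value, now)
    return drifted


# Hypothetical metric names and made-up numbers, for illustration only.
baseline = {"cpu_user_pct": 40.0, "io_wait_pct": 5.0, "paging_space_used_pct": 10.0}
current = {"cpu_user_pct": 42.0, "io_wait_pct": 25.0, "paging_space_used_pct": 10.0}

for name, (before, now) in compare_to_baseline(baseline, current).items():
    print(f"{name}: baseline {before}, now {now}")
```

With these made-up numbers only io_wait_pct is reported, which is the kind of signal that tells you where to start looking for a bottleneck.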
 
Don’t Underarchitect Your RAM Requirements
 
Many Unix administrators like to fight database administrators (DBAs) on recommended memory requirements for applications. Be careful. While I’m not suggesting that you give away the farm and follow every application or database requirement to the letter, there are many things you can do to ensure your systems have sufficient RAM. These include the way you architect your systems and set up your LPARs. Also look at some of the newer PowerVM features, particularly Active Memory Sharing and, on POWER7, Active Memory Expansion.
 
Active Memory Sharing (AMS) was introduced in 2009. First available on POWER6, it allows LPARs to share RAM, similar to how users have been able to share and micropartition CPUs. This lets the POWER hypervisor give an LPAR more memory without a DLPAR operation. AMS makes it possible to redistribute idle memory that other LPARs are not using to the LPARs that need it. This empowers customers to optimize their RAM configuration and ensures that resources do not sit idle while other LPARs are in dire need of assistance.
 
Active Memory Expansion (AME) is a new technology for expanding a system’s effective memory capacity. It employs memory compression technology to transparently compress in-memory data, allowing more data to be placed into memory and thus expanding the memory capacity of POWER7 systems. Utilizing Active Memory Expansion can improve system utilization and increase a system’s throughput. 
 
What about paging space? I’m not too big on providing tons of paging space. As a UNIX administrator you should be able to monitor how much paging space is used on a daily basis: lsps -a, vmstat and svmon are some utilities you should use frequently. All applications work differently. If you frequently see paging space utilization above 50 percent, allocate more. If your paging space is always 1 percent utilized, providing 32GB of paging space on a system with 32GB of RAM is not smart. Monitor and increase where necessary. What is the rule of thumb? While I’m not big on rules of thumb, generally speaking, if a system has less than 4GB of RAM, I usually like to create a one-to-one ratio of paging space to RAM. If it has 8GB or higher, I set my paging space to as little as half the size of RAM.
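
The rule of thumb above can be written down as a small Python sketch. Two things here are assumptions rather than anything stated in the text: the rule says nothing about systems with 4GB to 8GB of RAM, so the sketch holds paging space flat at 4GB in that range, and "frequently" is interpreted as more than half of the collected samples.

```python
def suggested_paging_space_gb(ram_gb):
    """Starting point for paging space size, following the rule of thumb:
    1:1 below 4GB of RAM, half of RAM at 8GB and above."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    if ram_gb < 4:
        return float(ram_gb)   # one-to-one ratio
    if ram_gb >= 8:
        return ram_gb / 2      # as little as half the size of RAM
    # ASSUMPTION: the 4GB to 8GB range is not covered by the rule of thumb.
    # Holding at 4GB joins the two stated cases without a jump.
    return 4.0


def paging_space_needs_increase(used_pct_samples, threshold_pct=50.0,
                                frequent_fraction=0.5):
    """True if utilization exceeds threshold_pct in more than
    frequent_fraction of the samples (an assumed reading of 'frequently')."""
    if not used_pct_samples:
        return False
    over = sum(1 for pct in used_pct_samples if pct > threshold_pct)
    return over / len(used_pct_samples) > frequent_fraction


print(suggested_paging_space_gb(2))    # small system: 1:1
print(suggested_paging_space_gb(32))   # large system: half of RAM
print(paging_space_needs_increase([60, 70, 40]))
```

The utilization samples would come from your own daily monitoring, for example the percent-used figure that lsps -a reports.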
 
Understand the Difference Between Working and Persistent Segments
 
Working segments use computational memory. They are used while your processes are actually working on computing information. These working segments are temporary (transitory) and exist only until a process terminates or the page is stolen. They have no real permanent disk storage location. When a process terminates, both the physical and paging spaces are released in many cases. Persistent segments use file memory and have a permanent storage location on disk. Data files or executable programs are mapped to persistent segments rather than to working segments. The data files can reside in file systems such as Journaled File System (JFS), Enhanced Journaled File System (JFS2), or Network File System (NFS). These files remain in memory until the file system is unmounted, a page is stolen, or a file is unlinked. After a data file is copied into RAM, VMM controls when these pages are overwritten or used to store other data.
 
We definitely want the Virtual Memory Manager to favor working storage, meaning we don’t want AIX to page working storage. What we really want is for the system to favor the caching that the database and application use, rather than the OS. The way to do this is to set the vmo command’s maxperm parameter to a high enough value while also making certain that the lru_file_repage parameter is set correctly.  Let’s define these parameters:
  • minperm% — The point below which the page stealer algorithm will steal file or computational pages, regardless of repaging rates.
  • maxperm% — The point above which the page stealer will steal only file pages.
  • maxclient% — The maximum percentage of RAM that can be used to cache client pages.
  • lru_file_repage — Setting this value to 0 (off) allows AIX to free only file cache memory (provided numperm is greater than minperm and VMM can steal enough memory to satisfy demand), virtually guaranteeing that working storage remains in memory.
The most important vmo settings are minperm and maxperm. Setting these parameters appropriately will ensure that your system is tuned to favor either computational memory or file memory. Our old approach to tuning minperm and maxperm was to set maxperm to a low number, around 20, which is much lower than the default value, and to set minperm to less than or equal to 10. This is how we normally would have tuned a database server.
 
The new approach is to set maxperm to a very high value — higher than its default (80) — and to make sure lru_file_repage is set to 0. IBM introduced the lru_file_repage parameter in AIX 5.2 with ML4 and in AIX 5.3 with ML1. The lru_file_repage value indicates whether the VMM repage counts should be considered and what type of memory should be stolen. The default setting is 1 (it becomes 0 in AIX 6.1), so we need to change it to 0 to have the VMM steal file pages rather than computational pages. This technique solves the old problem of having to limit JFS2 file cache to guarantee memory for applications such as Oracle.
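
To make the interaction of these parameters easier to follow, here is a simplified Python model of which page types the page stealer may take. It is a sketch built from the definitions above, not the real VMM algorithm. In particular, the rule used when lru_file_repage is 1 and numperm sits between minperm and maxperm (computational pages become eligible when file pages are repaging at the higher rate) is an assumption about how the repage counts are compared, and the function and argument names are hypothetical.

```python
def pages_eligible_for_stealing(numperm_pct, minperm_pct, maxperm_pct,
                                lru_file_repage,
                                file_repage_rate=0, comp_repage_rate=0):
    """Simplified model: return the set of page types that may be stolen."""
    if numperm_pct < minperm_pct:
        # Below minperm: file or computational pages, regardless of repaging.
        return {"file", "computational"}
    if numperm_pct > maxperm_pct:
        # Above maxperm: only file pages.
        return {"file"}
    if lru_file_repage == 0:
        # Repage counts ignored: only file cache is freed.
        return {"file"}
    # lru_file_repage == 1: repage counts are considered.
    # ASSUMPTION: computational pages become eligible when file pages
    # are repaging at a higher rate than computational pages.
    if file_repage_rate > comp_repage_rate:
        return {"file", "computational"}
    return {"file"}


# Same memory picture, old default versus new setting (made-up numbers).
common = dict(numperm_pct=50, minperm_pct=3, maxperm_pct=90,
              file_repage_rate=10, comp_repage_rate=1)
print(sorted(pages_eligible_for_stealing(lru_file_repage=1, **common)))
print(sorted(pages_eligible_for_stealing(lru_file_repage=0, **common)))
```

In this model, setting lru_file_repage to 0 keeps working storage out of reach of the page stealer as long as numperm stays above minperm, which is the behavior the new approach relies on.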
 
Watch Your Runtime
 
On a typical system, your runnable processes (the run queue) should outnumber your blocked processes. Commands such as w, nmon, and vmstat can help you determine whether this is so. If this ratio is not what it should be, it is symptomatic of other issues. Check the time you spend waiting on I/O. When investigating performance problems, more often than not I’ve found that the problem is in I/O rather than CPU or memory. Oftentimes your I/O subsystem just cannot keep up with the demands that people are making on your system.
 
 
In vmstat output, when the blocked processes (the b column) outnumber the runnable processes (the r column), the waiting on I/O (wa) column will be too high.
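
The check described here can be automated with a short Python sketch. The sample text is made up for illustration and is not captured from a real system. The column layout is assumed to match the classic AIX vmstat header, with r and b as the first two columns and wa present in the header; shared-processor LPARs print extra columns, which this sketch locates wa by name to tolerate.

```python
# Made-up sample in the classic vmstat layout, for illustration only.
SAMPLE_VMSTAT = """\
kthr    memory              page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  4 412345  2310   0   0   0   0    0   0 310 1820 420  8  6 31 55
 2  5 412390  2287   0   0   0   0    0   0 325 1911 433  9  5 28 58
"""


def blocked_exceeds_runnable(vmstat_text):
    """Return (r, b, wa) for every sample where blocked processes
    outnumber runnable ones."""
    rows = [line.split() for line in vmstat_text.splitlines() if line.strip()]
    # The column header is the row that starts with "r b".
    header_idx = next(i for i, cols in enumerate(rows) if cols[:2] == ["r", "b"])
    header = rows[header_idx]
    wa_col = header.index("wa")
    flagged = []
    for cols in rows[header_idx + 1:]:
        # Skip anything that is not a plain numeric data row.
        if len(cols) != len(header) or not all(c.isdigit() for c in cols):
            continue
        r, b, wa = int(cols[0]), int(cols[1]), int(cols[wa_col])
        if b > r:
            flagged.append((r, b, wa))
    return flagged


for r, b, wa in blocked_exceeds_runnable(SAMPLE_VMSTAT):
    print(f"r={r} b={b} wa={wa}%: more blocked than runnable, check I/O")
```

Both sample rows are flagged, since in each one b is larger than r while wa is above 50 percent.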
 
Consider the Architecture of Your Partitions
 
How is your partition architected? What are the entitled capacity, the number of virtual processors, and the SMT setting? Are you using uncapped partitions? If not, why not? This feature will let you take advantage of unused CPU cycles. How are jobs being run and when? Is cron being utilized? Can you run jobs in the middle of the night as opposed to busy times during the day? Oftentimes bottlenecks that appear CPU related really are memory or I/O bound. Take the time to understand what the problem really is. Don’t make the mistake of just throwing more iron at the problem. Increasing the number of CPUs per partition is not usually the answer, and it comes at a price.

While we are on CPUs, the smtctl command (introduced in AIX 5.3) displays simultaneous multithreading (SMT) information. SMT, part of PowerVM, IBM’s hypervisor-based virtualization, provides for two threads of execution per virtual processor. System performance usually increases about 30 percent when SMT is enabled, so you almost always want to enable this functionality. SMT is best suited for multithreaded, I/O-intensive applications. It is not a good fit for numerically intensive workloads. The POWER7 architecture has dynamic threading in TurboCore mode. It’s a new type of intelligent threading technology, which dynamically switches the processor-threading mode to deliver either the highest per-thread performance or maximum application throughput, depending upon your application’s requirements.
 
Finally, take the time to understand recent changes in AIX 6.1 if you’ve moved from AIX 5.x. In addition to fixing many of the older default parameters, here are some notable changes.
  • Restricted Tunables
    In AIX 6.1, IBM now classifies many tunables as “restricted” to discourage junior administrators from changing parameters it deems critical.
  • I/O Pacing
    In AIX 6.1 I/O pacing is turned on by default. It’s a mechanism that lets you limit the number of pending I/O requests to a file, thereby preventing disk I/O-intensive processes (usually in the form of large sequential writes) from exhausting the CPU.
  • Asynchronous I/O (AIO)
    There are no more AIO devices in the ODM. AIX 6.1 no longer provides the aioo command (what a short life span), and the AIO tunables are now managed only with ioo. Two new parameters have also been added to ioo: aio_active and posix_aio_active. With no more AIO servers running, less pinned memory + fewer processes = greater performance.
  • netcdctrl
    This facility manages the new network caching daemon, which was introduced to improve performance when resolving names through the Domain Name System (DNS). You can start this daemon from the System Resource Controller (SRC). Its main configuration file is /etc/netcd.conf.
You should also know that on the heels of AIX 6.1, AIX 7 is coming very soon. Watch for future articles on the capabilities of AIX 7.
 
 
 
Ken Milberg is the President and Managing Consultant of PowerTCO, a New York-based IBM Business Partner. He is also a technical editor and writer for IBM Systems Magazine, Power Systems Edition and has written dozens of technical articles for IBM developerWorks. Ken is the author of Driving the Power of AIX, a guide to performance tuning on IBM POWER-based servers, published in 2009. Ken is also a technology writer and site expert for techtarget.com and provides Linux technical information and support at searchopensource.com. Ken holds a B.S. in Computer and Information Science as well as an M.S. in Technology Management from the University of Maryland. He has consulted with many Global Fortune 500 companies and is a PMI-certified Project Management Professional (PMP) and an IBM Certified Advanced Technical Expert (CATE) on AIX and IBM Power systems. He also holds certifications in Solaris and HP-UX.
 
 

 


    © Penton Media, Inc.