Timing Supplement
Introduction
VectorZ is designed to deliver both superior accuracy and exceptional performance. The tables in this document give best-case execution times, in microseconds, for several vector lengths.
What is “best-case” performance?
Best-case performance is achieved when all instructions and data are in cache. Each function is run several times, each run is timed with the processor’s time base register, and the minimum time is reported. Taking the minimum excludes the effects of operating system activity (interrupts, context switches, concurrent I/O) that may occur during a test. It also excludes the cache misses and loads that occur during the first iteration, before the caches are “warmed.”
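The measurement procedure described above can be sketched as follows. This is a minimal illustration in Python, not the harness used to produce the timing tables: `best_case_ns`, its `repeats` parameter, and the `workload` function are hypothetical names, and `time.perf_counter_ns` stands in for the processor’s time base register.

```python
import time


def best_case_ns(func, repeats=100):
    """Return the minimum elapsed time, in nanoseconds, over `repeats` calls.

    The minimum discards runs that were slowed by interrupts, context
    switches, or cold caches, which is the "best-case" idea described above.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    best = None
    for _ in range(repeats):
        start = time.perf_counter_ns()  # stand-in for the time base register
        func()
        elapsed = time.perf_counter_ns() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    # Hypothetical workload: a 1024-element multiply-accumulate.
    a = [float(i) for i in range(1024)]
    b = [2.0] * 1024

    def workload():
        return sum(x * y for x, y in zip(a, b))

    print(best_case_ns(workload) / 1000.0, "microseconds")
```

The first pass through the loop typically pays for cache misses, so it is rarely the one that supplies the minimum. Later passes run with warm caches and set the reported value.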
Why measure best-case timing?
Best-case timing measures algorithm efficiency as a function of the processor core’s performance. Most library vendors provide best-case timing. Real applications will not achieve best-case performance all the time: concurrent I/O, memory bus speed and efficiency, operating system activity, interrupts, context switches, and cache misses all compete for processor resources within an application. However, best-case timing data provides a “level playing field” on which all libraries can be compared.
Performance Factors
Applications can improve performance by observing the following guidelines:
- Choose the size of vectors and vector operations based on the L1 data cache. For most processors, the L1 data cache holds 32 Kbytes, enough for 8K real values or 4K complex values. If vectors are much smaller than this (say, 10 to 20 elements), the overhead of calling vector functions may limit performance. If vectors are too large to fit within the L1 cache, cache misses will occur, and cache reload activity will limit performance.
- Align vector data. This is done automatically when applications use vectors created by vNew. However, operations like viewSubvector can result in unaligned data. Unaligned vectors take slightly more time to process.
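The cache-sizing guideline comes down to simple arithmetic, sketched here under stated assumptions. The element sizes (4-byte real, 8-byte complex) are inferred from the 8K and 4K figures above rather than stated by the library. The helper names and the `operands` parameter are hypothetical. `operands` reflects the general point that an operation with several input and output vectors needs all of them in the cache at once.

```python
L1_DATA_CACHE_BYTES = 32 * 1024  # 32 Kbytes, the typical size cited above
REAL_BYTES = 4     # assumption: single-precision real element
COMPLEX_BYTES = 8  # assumption: two single-precision parts per element


def elements_per_cache(element_bytes, cache_bytes=L1_DATA_CACHE_BYTES, operands=1):
    """Largest vector length at which `operands` vectors fit in the cache together."""
    return cache_bytes // (element_bytes * operands)


def blocks(total_length, block_length):
    """Yield (start, length) pairs covering a long vector in cache-sized pieces."""
    for start in range(0, total_length, block_length):
        yield start, min(block_length, total_length - start)


if __name__ == "__main__":
    print(elements_per_cache(REAL_BYTES))              # 8192 real values
    print(elements_per_cache(COMPLEX_BYTES))           # 4096 complex values
    print(elements_per_cache(REAL_BYTES, operands=3))  # 2730 per vector for c = a + b
    print(list(blocks(10000, 4096)))
```

An application that processes a 10,000-element complex vector could, for example, walk it in the three blocks printed by the last line, so that each block stays resident in the L1 data cache while it is being processed.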
