The IVP DSP has a unique instruction set tuned for imaging and video pixel processing, which gives it more than 16x the 16-bit pixel-operation throughput of a typical host CPU with single-issue vector instructions. Beyond this raw instruction throughput advantage over host CPUs, the imaging-specific compound instructions supported by IVP give it 10 to 20x higher peak performance and much higher energy efficiency. IVP’s rich instruction set has more than 300 imaging, video and vision-oriented vector operations, each of which applies to 32 or more 16-bit pixels per cycle.
“As mobile camera usage grows, so grows the demand for advanced video and imaging features, which must be offloaded from the host processor for the best performance and the longest possible battery life,” stated Will Strauss, president of Forward Concepts and a leading DSP analyst. “Given the pace of innovation in image and video processing, the new IVP core should help Tensilica’s customers get efficient chips implementing proprietary algorithms to market much faster and with the added benefit of lowering the cost of changing those algorithms.”
“Consumers want advanced imaging functions like HDR, but the shot-to-shot time with the current technology is several seconds, which is way too long. Users want it to work 50x faster. We can give consumers the instant-on, high-quality image and video capture they want,” stated Chris Rowen, Tensilica’s founder and CTO. “The IVP architecture supports very high-quality image and video capture using advanced single-frame and multi-frame processing, supporting increasing sensor resolutions. It is ideal for tomorrow’s exciting new products.”
Efficient Processor-based Architecture
Tensilica’s IVP is based on a 4-way VLIW (very long instruction word) architecture that delivers high parallelism intermixed with code-compact instructions, with a 32-way vector SIMD (single instruction, multiple data) datapath. The architecture includes an integrated DMA (direct memory access) transfer engine with up to 10 GBytes/second of throughput and local memory throughput of 1024 bits per cycle (sixty-four 16-bit pixels/cycle) to keep up with rapidly rising resolution and frame rate requirements. The IVP also features many imaging-specific operations to accelerate 8-, 16- and 32-bit pixel data types and video operation patterns.
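The throughput figures above can be put in perspective with a little arithmetic. The sketch below is a back-of-the-envelope calculation only: the 1080p30 stream of 16-bit pixels is chosen as an illustrative workload, "10 GBytes/second" is assumed to mean 10^10 bytes per second, and the variable names are hypothetical. It says nothing about sustained rates on real hardware.

```python
# Figures quoted in the text.
LOCAL_MEM_BITS_PER_CYCLE = 1024          # local memory throughput per cycle
PIXEL_BITS = 16                          # 16-bit pixels
DMA_PEAK_BYTES_PER_S = 10_000_000_000    # "up to 10 GBytes/second" (assumed decimal GB)

# Illustrative workload (an assumption, not a quoted benchmark): 1080p at 30 fps.
WIDTH, HEIGHT, FPS = 1920, 1080, 30

# Pixels the local memory can deliver each cycle.
pixels_per_cycle = LOCAL_MEM_BITS_PER_CYCLE // PIXEL_BITS

# Bytes per second needed to move the stream once.
stream_bytes_per_s = WIDTH * HEIGHT * FPS * (PIXEL_BITS // 8)

# How many times the whole stream could cross the DMA engine at its peak rate.
passes_at_peak = DMA_PEAK_BYTES_PER_S / stream_bytes_per_s

print(pixels_per_cycle)            # 64
print(stream_bytes_per_s)          # 124416000
print(round(passes_at_peak, 1))    # 80.4
```

Under these assumptions, the peak DMA rate corresponds to roughly 80 transfers of every pixel in a 16-bit 1080p30 stream, which gives multi-frame algorithms headroom for repeated reads and writes.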
The IVP is extremely power efficient. As an example, for an IVP implemented with an automated synthesis and place-and-route flow in a 28nm HPM process with regular Vt, a 32-bit integral image computation on 16-bit pixel data at 1080p30 consumes 10.8 mW. The integral image function is commonly used in applications such as face and object detection and gesture recognition.
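For readers unfamiliar with the integral image (also called a summed-area table), the following is a minimal reference sketch in plain Python. It is not IVP code and uses no Tensilica API; the function names are hypothetical. Python integers do not overflow, so the sketch sidesteps the question of how a 32-bit accumulator is kept in range on real hardware.

```python
def integral_image(img):
    """Return the summed-area table of a 2-D list of pixel values.

    The table has one extra leading row and column of zeros, so that
    table[y][x] holds the sum of all pixels above and to the left of (y, x).
    """
    height = len(img)
    width = len(img[0]) if height else 0
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0  # running sum of the current row
        for x in range(width):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
table = integral_image(img)
print(table[3][3])                   # 45, the sum of the whole image
print(box_sum(table, 1, 1, 3, 3))    # 28, i.e. 5 + 6 + 8 + 9
```

Once the table is built, any rectangular sum costs four lookups regardless of the rectangle's size, which is why detectors that evaluate many box features rely on it.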
IVP’s high performance is demonstrated by complex algorithm kernels such as motion search and normalized cross-correlation, commonly used in high-precision block and feature matching and optical flow. For a smart motion search on 16-bit data over a 1920x1080 frame with 256x16 pixel search range and 9x3 pixel block size, IVP can achieve a rate of 142 sums of absolute differences per cycle. In addition, a normalized cross-correlation function on 16-bit pixel data with 32-bit accuracy achieves 1 million 8x8 blocks per second.
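The two kernels named above can be illustrated with a plain Python reference sketch. It is not the "smart" motion search from the benchmark: an exhaustive full search is swapped in because it is the simplest correct baseline. The zero-mean form of normalized cross-correlation and the use of floating point are likewise assumptions, as are all function names.

```python
import math


def sad(cur, ref, cy, cx, ry, rx, bh, bw):
    """Sum of absolute differences between a bh x bw block of cur at (cy, cx)
    and a block of ref at (ry, rx)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(bh) for i in range(bw))


def full_search(cur, ref, by, bx, bh, bw, range_y, range_x):
    """Exhaustive motion search; returns (best_sad, dy, dx) or None."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-range_y, range_y + 1):
        for dx in range(-range_x, range_x + 1):
            ry, rx = by + dy, bx + dx
            # Skip candidates that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + bh > height or rx + bw > width:
                continue
            cost = sad(cur, ref, by, bx, ry, rx, bh, bw)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size blocks.
    Returns a value in [-1, 1], or 0.0 if either block is flat."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    mean_a = sum(fa) / len(fa)
    mean_b = sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa)
                    * sum((y - mean_b) ** 2 for y in fb))
    return num / den if den else 0.0


# A 2x2 pattern sits at (2, 2) in the current frame and at (3, 4) in the reference.
pattern = [[100, 50], [25, 75]]
cur = [[0] * 8 for _ in range(8)]
ref = [[0] * 8 for _ in range(8)]
for j in range(2):
    for i in range(2):
        cur[2 + j][2 + i] = pattern[j][i]
        ref[3 + j][4 + i] = pattern[j][i]

print(full_search(cur, ref, 2, 2, 2, 2, 2, 2))   # (0, 1, 2)
```

SAD is cheap and suits coarse search, while normalized cross-correlation is insensitive to brightness gain and offset, which is one reason it is preferred for high-precision matching.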
Many companies have proprietary imaging and computer vision algorithms, which can be implemented on the IVP because it uses the same C programming model as all Tensilica DPUs. Tensilica has also created a partner network to make pre-ported, efficient third-party imaging software available. Initial partner companies porting advanced imaging suites to the IVP DPU include Almalence, Irida Labs, Dream Chip Technologies, and Morpho, Inc.
Tensilica’s state-of-the-art toolset enables easy programming of proprietary algorithms for higher performance and differentiation. “We were impressed with the ease of porting and optimizing our application to Tensilica’s IVP,” stated Eugene Panich, CEO of Almalence. “Tensilica’s compiler helped us achieve high performance and is among the best we’ve ever used. We were also impressed with the quality of their entire toolset.”
“We were excited to partner with Tensilica to offer our embedded vision applications since the IVP offers so much performance compared to other platforms,” said Vassilis Tsagaris, CEO of Irida Labs. “The power efficiency we’ve seen for our video stabilizer, for example, makes IVP the perfect offload engine for imaging.”
Customizable for Differentiation
Tensilica’s IVP DPU can be further customized using Tensilica’s patented processor-generation system. The DPU creation process is fully automated and supported by a matching software tool chain, which includes an optimized compiler, linker, assembler and debugger, plus a fast instruction set simulator.
Early-access lead customers took delivery of IVP last year, and the IVP DPU is available for broad licensing now.