When designing a system on a chip (SoC) that employs one or more embedded processor cores, designers face a choice of processors that continues to expand. At last month’s Design West conference in San Jose, Calif., designers were presented with many processor options. Leading the pack is ARM, whose broad array of cores spans the performance spectrum from the Cortex-M0 at the low end to the 64-bit Cortex-A57 at the high end. Although ARM’s cores dominate some SoC market segments, they aren’t the only game in town. EDA tool suppliers Synopsys and Cadence have acquired core suppliers ARC and Tensilica, respectively, and Imagination Technologies recently acquired MIPS. As a result, the number of independent processor-core IP providers dropped considerably, but not for long.
One newcomer to the U.S. market, Andes Technology, has crafted multiple synthesizable processor-core families (the N7, N8, N9, N10, N12, and N13) that offer 32-bit cores with gate counts that start at just 12K gates (for the N7). For applications that don’t require legacy compatibility, these cores can challenge ARM and other vendors in embedded designs. Based on a proprietary instruction-set architecture (ISA), the N7 family cores can deliver about 1.19 MIPS/MHz, which is about 20 percent higher than the ARM Cortex-M0. Additionally, the cores consume about 30 percent less power than the M0 at the same performance level. The low-gate-count core, referred to as the Hummingbird, also requires little chip real estate: less than 0.04 mm² when fabricated in a 90 nm process. With optional features such as a prefetch buffer that can serve as a small instruction cache, the core can deliver up to 1.45 DMIPS/MHz, but that higher performance raises the gate count to nearly 30K gates.
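The trade-off between the minimal and the fully optioned N7 can be worked through with a few lines of arithmetic. The sketch below uses only the figures quoted above (1.19 per MHz at about 12K gates, 1.45 per MHz at about 30K gates). It assumes the two performance figures are in comparable units, and the 50 MHz clock and all names in the code are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreConfig:
    """One core configuration, described by the figures quoted in the article."""
    name: str
    perf_per_mhz: float  # throughput per MHz (assumed comparable across configs)
    kgates: float        # approximate gate count, in thousands of gates

    def throughput(self, clock_mhz: float) -> float:
        """Throughput at a given clock, assuming it scales linearly with MHz."""
        return self.perf_per_mhz * clock_mhz

    def perf_per_kgate(self) -> float:
        """Per-MHz throughput delivered per thousand gates."""
        return self.perf_per_mhz / self.kgates


# Figures from the article; the configuration names are hypothetical labels.
base = CoreConfig("N7 minimal", perf_per_mhz=1.19, kgates=12.0)
full = CoreConfig("N7 with options", perf_per_mhz=1.45, kgates=30.0)

CLOCK_MHZ = 50.0  # hypothetical clock, not taken from the article

perf_gain = full.perf_per_mhz / base.perf_per_mhz - 1.0  # relative speedup
area_growth = full.kgates / base.kgates - 1.0            # relative gate growth

for cfg in (base, full):
    print(f"{cfg.name}: {cfg.throughput(CLOCK_MHZ):.1f} at {CLOCK_MHZ:.0f} MHz, "
          f"{cfg.perf_per_kgate():.3f} per MHz per kgate")
print(f"performance gain: {perf_gain:.1%}, gate-count growth: {area_growth:.0%}")
```

Under these assumptions, the optional features buy roughly a fifth more throughput per MHz for about two and a half times the gates, so the minimal configuration stays well ahead on performance per gate.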
Figure 1: One of the higher-end processor cores from Andes Technology is the N12. It contains an eight-stage pipeline with dynamic branch prediction, a memory-management unit, and instruction and data caches.
The ISA consists of a mix of 16- and 32-bit instructions. On the N7, those instructions execute on a simple two-stage pipeline. At the high end, the N12 and N13 series implement the ISA on an eight-stage pipeline and pack a memory-management unit, instruction and data caches, and dynamic branch prediction (Figure 1). Programming tools and a good compiler make the proprietary ISA a non-issue and allow designers to use familiar tools such as GCC and Linux. The Hummingbird core targets applications such as Bluetooth, Internet of Things (IoT)/machine-to-machine communications, touchscreen controllers, and other embedded applications. Its licensing fees are considerably lower than what ARM charges for its M0 core, which helps keep down the cost of the SoC. The higher-performance cores take on performance-sensitive applications such as embedded Linux systems.
Figure 2: Between the commercial CPUs and a dedicated fixed-function solution is the ASIP (application-specific instruction-set processor), a block of customer-defined intellectual property (left). Tools from Target Compiler Technologies allow designers to craft the IP block and incorporate it in an ASIC, thus enabling them to significantly improve the power efficiency as well as the performance of their ASIC solution (right).
Taking a different approach to crafting an embedded processor core, Target Compiler Technologies offers tools that let designers define everything from their own optimized processor cores to a complete multicore application-specific SoC. To let designers craft their own application-specific instruction-set processors (ASIPs), the company’s IP Designer tools support architectural exploration, SDK generation (C compiler, instruction-set simulator, debugger, etc.), and RTL generation. Once the IP blocks are defined, the MP Designer tools for multicore ASIC design handle code parallelization, communication and synchronization, and multicore platform generation (Figure 2).
Figure 3: A single-tile xCore processor SoC platform from XMOS can emulate up to eight “logical” processors and has areas set aside that designers can use to customize the I/O and bus interface/communications channel. The platform chips from XMOS can contain 1, 2, or 4 physical processor tiles (up to 32 logical processors) and can clock at up to 500 MHz.
Somewhere between a dedicated processor core and a fully definable multicore platform sits the configurable processor SoC platform developed by XMOS. The company offers a partially predefined multiprocessor platform that contains 1, 2, or 4 processor “tiles,” with each tile able to run up to eight threads (or eight logical processors), plus basic support blocks such as SRAM, PLLs, timing (schedulers, timers, clocks), security (one-time-programmable ROM), and a JTAG debug port (Figure 3). The remainder of the platform consists of configurable sections into which designers can drop special IP blocks from the XMOS library or their own proprietary interface/special-function logic IP, which connects to the platform’s I/O ports and X-Connect interface channels/links.
Each processor tile can deliver up to 500 MIPS of compute power when running at 500 MHz. The logical processors (threads) share processing resources and memory in the tile, but each has its own register file and gets a guaranteed slice of the tile processor’s compute power (up to 125 MIPS at 500 MHz). The high performance of the processor tiles allows the xCore to take on many applications in consumer and audio systems, automotive systems, industrial control, and display/imaging systems.
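The per-thread arithmetic behind those numbers can be sketched as follows. This is a minimal model, not a description of the XMOS scheduler: it assumes the active threads share the tile's 500 MIPS equally, and that the 125 MIPS figure acts as a per-thread ceiling equal to one quarter of the tile's throughput. The function name and its parameters are hypothetical.

```python
def per_thread_mips(active_threads: int,
                    clock_mhz: float = 500.0,
                    cap_fraction: float = 0.25,
                    max_threads: int = 8) -> float:
    """Estimate the compute share of one logical processor on a tile.

    Assumptions (not confirmed by the article):
      * the tile retires one instruction per clock, so 500 MHz gives 500 MIPS;
      * active threads share that throughput equally;
      * no single thread exceeds cap_fraction of the tile (125 MIPS at 500 MHz).
    """
    if not 1 <= active_threads <= max_threads:
        raise ValueError(f"active_threads must be between 1 and {max_threads}")
    tile_mips = clock_mhz                     # 500 MIPS at 500 MHz, per the article
    cap = tile_mips * cap_fraction            # per-thread ceiling
    equal_share = tile_mips / active_threads  # fair split among active threads
    return min(cap, equal_share)


for n in (1, 4, 8):
    print(f"{n} active thread(s): {per_thread_mips(n):.1f} MIPS each")
```

Under this model a thread sees 125 MIPS whenever four or fewer threads are active, and the share falls to 62.5 MIPS each when all eight logical processors run, which keeps the total within the tile's 500 MIPS.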
Semiconductor Technology Editor