Emulation and cross-development for PowerPC
Avoid the cost of new hardware
Level: Introductory
Hollis Blanchard (hollisb@us.ibm.com), Software Developer, IBM
18 Jan 2005
This article introduces PowerPC emulation and cross-compiling for developers without access to real hardware. It is intended for developers familiar with computer architecture who own an x86-based workstation but are interested in experimenting with PowerPC.
Some developers may not have access to a PowerPC® Linux™ system to play with (although you can buy one for less than US$200 at the time of this writing). For the curious x86 Linux user, emulation is a convenient and inexpensive alternative. There are at least three open source PowerPC emulators available, two of which are quite new.
Some emulators, particularly those used by processor developers, are cycle-accurate, meaning that a particular instruction in a given context will take exactly as many cycles to run as it would on real hardware. These emulators emulate not just the instruction set, but also the internal pipelines and caches of the processor. They are particularly useful during development before real silicon exists, and they can also yield more insight into performance bottlenecks than can be gleaned from hardware performance counters. However, these emulators have some severe limitations. Because they document so much intellectual property and hardware tricks, their internals are almost never free for examination or modification. Instead, the processor designer will make binaries available, sometimes for no cost, often for a very restricted range of hosts. Another problem for higher-level software developers is that because they emulate large amounts of processor internals, they are very slow. Finally, they may not be as accurate as real hardware. For reasons of speed or complexity, even a cycle-accurate emulator can omit cache or IO emulation, yielding skewed results. They're probably pretty close for most situations, but the fact remains that an emulator is only emulating the hardware, and its behavior can diverge.
None of the emulators discussed here are cycle-accurate. In fact, they probably aren't even fully behavior-accurate. (When an emulator's behavior diverges from the hardware's, it's called a bug, and will usually end up being squashed... eventually.)
Emulating user mode
One very convenient feature for the casual developer is user-mode emulation. If an emulator emulates only the processor and IO (such as a network device), a Linux kernel would need to be booted (and emulated) first, then the emulated application on top of that. That's certainly important for more serious work, but it's much more convenient for simple experimentation to avoid dealing with kernels entirely. If the emulator can emulate not just the processor but also the operating system kernel, that makes it much easier to run little programs that don't depend on many kernel services, such as those that only need to use the write and exit system calls.
Ordinarily, when an emulator encounters a PowerPC system call instruction, it emulates the exception by storing the address of the next instruction (where execution will later resume) into the SRR0 register, setting some architecture-defined bits in SRR1, and transferring control to physical address 0xC00. (Some PowerPC variants allow more control over this behavior, but this is the traditional PowerPC model.) The emulated kernel has its system call exception handler at 0xC00, just like on hardware, and so the kernel takes control of the processor.
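To make the full-system case concrete, here is a minimal sketch in C of what an emulator might do when it decodes a system call instruction. The ppc_cpu structure and the take_syscall_exception function are hypothetical names invented for this illustration and are not taken from any of the emulators discussed here. The handling of SRR1 and the MSR is deliberately simplified: real hardware copies only selected MSR bits into SRR1 and clears more bits than are shown.

```c
#include <stdint.h>

#define SC_VECTOR 0x00000C00u  /* system call exception vector */
#define MSR_EE    0x00008000u  /* external interrupts enabled */
#define MSR_PR    0x00004000u  /* problem state (user mode) */
#define MSR_IR    0x00000020u  /* instruction address translation */
#define MSR_DR    0x00000010u  /* data address translation */

/* Hypothetical, heavily simplified CPU state for illustration only. */
typedef struct {
    uint32_t pc;    /* address of the instruction being executed */
    uint32_t msr;   /* machine state register */
    uint32_t srr0;  /* save/restore register 0 */
    uint32_t srr1;  /* save/restore register 1 */
} ppc_cpu;

/* Called when the emulator decodes a system call instruction at cpu->pc. */
void take_syscall_exception(ppc_cpu *cpu)
{
    /* Execution resumes after the system call instruction on return. */
    cpu->srr0 = cpu->pc + 4;

    /* Simplification: save the whole MSR rather than selected bits. */
    cpu->srr1 = cpu->msr;

    /* Enter supervisor mode with interrupts and translation disabled. */
    cpu->msr &= ~(MSR_EE | MSR_PR | MSR_IR | MSR_DR);

    /* Continue at the handler that the emulated kernel installed. */
    cpu->pc = SC_VECTOR;
}
```

From this point on, the emulator simply keeps executing instructions; it is the emulated kernel's code at 0xC00 that decides what the system call means.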
When an emulator supporting user-mode emulation encounters a system call instruction, on the other hand, it does not transfer control to the emulated exception handler; instead it interprets the system call itself. The easiest examples are system calls like read and write: these can be almost directly converted into real system calls made by the emulator. The glue layer to translate between emulated system calls made by the emulated application and real system calls made by the emulator may have other functionality, such as logging all system calls made by the emulated application.
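The glue layer described above can be sketched in a few lines of C. This is an illustration under stated assumptions, not the code of any real emulator: guest_state, emulate_syscall, and the flat mem array standing in for guest memory are all hypothetical. The sketch relies on the PowerPC Linux convention that the system call number is passed in r0 and the arguments in r3 onward, with the result returned in r3. Error reporting is simplified to storing -1 in r3, whereas a real PowerPC Linux kernel also signals errors through a condition register bit.

```c
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* PowerPC Linux system call numbers used in this sketch. */
#define PPC_NR_exit  1
#define PPC_NR_write 4

#define GUEST_MEM_SIZE 4096u

/* Hypothetical guest state: registers plus a flat block of guest memory. */
typedef struct {
    uint32_t gpr[32];              /* r0 = syscall number, r3.. = arguments */
    uint8_t  mem[GUEST_MEM_SIZE];  /* stand-in for the guest address space */
    int      exited;               /* set once the guest calls exit */
    int      exit_status;
} guest_state;

/* Interpret one guest system call; optionally log it to 'log'. */
void emulate_syscall(guest_state *g, FILE *log)
{
    uint32_t nr = g->gpr[0];

    switch (nr) {
    case PPC_NR_write: {
        uint32_t fd = g->gpr[3], addr = g->gpr[4], len = g->gpr[5];
        if (log)
            fprintf(log, "write(%u, 0x%x, %u)\n",
                    (unsigned)fd, (unsigned)addr, (unsigned)len);
        /* Refuse buffers that fall outside the guest memory block. */
        if (addr > GUEST_MEM_SIZE || len > GUEST_MEM_SIZE - addr) {
            g->gpr[3] = (uint32_t)-1;
            break;
        }
        /* Translate the guest pointer and make the real host call. */
        ssize_t n = write((int)fd, &g->mem[addr], len);
        g->gpr[3] = (uint32_t)n;
        break;
    }
    case PPC_NR_exit:
        if (log)
            fprintf(log, "exit(%u)\n", (unsigned)g->gpr[3]);
        g->exited = 1;
        g->exit_status = (int)g->gpr[3];
        break;
    default:
        if (log)
            fprintf(log, "unimplemented syscall %u\n", (unsigned)nr);
        g->gpr[3] = (uint32_t)-1;
        break;
    }
}
```

Passing stderr as the log argument gives a crude trace of every system call the emulated program makes, which is the kind of extra functionality the glue layer can offer for free.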
In addition to bypassing the complexity of building a kernel to emulate and a file system image to boot into, and configuring a virtual network device for IO, this shortcut also speeds up emulation, as the reams of kernel instructions that would have run to handle the system call -- from the exception handler through the VFS and the device driver -- are bypassed. However, it should be clear that not running the kernel inside the emulator means the overall behavior could be quite different indeed. In the worst case, a bug in the emulator's system call glue could make it seem as though the emulated application is buggy, even though it would run perfectly on a real kernel. This worst case remains pretty rare, though, and these tools are generally production-ready.
Qemu, which is relatively new, uses dynamic translation like a Java Just In Time (JIT) compiler to achieve good performance; in this case, good performance is about 4x to 10x slower than native hardware, depending on the benchmark. It supports a few different hosts and targets, but all we'll worry about is x86 host and PowerPC target, which fortunately is one of the supported configurations. Qemu also supports a remote GDB (GNU Debugger) connection, which is very valuable for debugging. Unfortunately, qemu does not support GDB connections in user-mode emulation, only in full-system mode. Qemu does not support AltiVec™ vector-processing instructions.
PearPC is another new emulator that can use JIT dynamic translation, but only on an x86 host with a PowerPC target -- however, that environment is the goal of this article. Its performance isn't as good as qemu's, being roughly 15x slower than the host system. Unfortunately, PearPC does not support a user environment, so a kernel and basic file system would be needed as well (Linux, Darwin, and Mac OS X are currently supported). PearPC does not support a GDB connection, nor yet does it support AltiVec vector-processing instructions (although the developers plan to add them in a future release).
PSIM (PowerPC simulator) is the granddaddy of PowerPC emulation: it was written in 1994 and assisted in some of the initial ports of Linux and NetBSD to the then-new PowerPC architecture. PSIM was integrated with the GDB sources, and amazingly, although it hasn't seen development since 1996, it still builds and works. Being integrated with GDB, PSIM also supports GDB connections, including in user mode. Because it predates AltiVec, PSIM does not support AltiVec vector-processing instructions.
Choosing an emulator
For the reasons discussed above, this article uses qemu; the same basic issues apply with the others, but qemu is the simplest to build for the purposes of this article. Download and extract the latest qemu tarball (see Resources), then:
Listing 1: Building qemu
|
This will produce ./ppc-user/qemu-ppc, which will be used later to execute PowerPC binaries.
Cross-compiling
The second key ingredient in cross-development is a cross-compiler. A cross-compiler is a compiler that runs on one architecture but produces binary code for another. This is very convenient if the deployment system is significantly underpowered relative to the development system, as is usually the case in embedded system development. A cross-compiler does not overwrite the system's native compiler or interact with it in any way.
Building a GNU cross-compiler can be pretty easy depending on the architectures involved, but sometimes build breaks do happen. It can also require several stages of builds to get all the right components built for each other in the right way. To remove the guesswork and automate the process, Dan Kegel has developed a very useful build script called crosstool.
Download and extract the latest version of crosstool (see Resources). Then:
Listing 2: Building crosstool
|
That will run for a while, and when it finishes, binutils, GCC, and glibc will be installed for cross-compiling in /opt/crosstool. Have a look at the directory structure there, and consider adding it to the PATH environment variable to save typing later.
Hello, world
Now that an emulator and cross-compiler have been built, it is time to put them together and test the new environment. Put the following source into hello.c:
Listing 3: A strangely familiar program
|
For now, use static linking to avoid worrying about how to install PowerPC shared libraries on the x86 host system. To produce a 32-bit PowerPC ELF executable named "hello", run the following:
Listing 4: Cross-compiling with GCC
|
To verify that it is the expected format, you can use this command:
Listing 5: Checking file type
|
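For readers curious about what such a check involves, the same properties can be read straight from the ELF header. The following is a minimal sketch, assuming the first 20 bytes of the file have already been read into a buffer; the function name is_ppc32_elf is hypothetical. It tests the ELF magic number, the 32-bit class, big-endian data encoding, and the PowerPC machine type (EM_PPC, value 20).

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf holds the start of a 32-bit big-endian PowerPC ELF
 * file, and 0 otherwise. Only the first 20 bytes are examined. */
int is_ppc32_elf(const uint8_t *buf, size_t len)
{
    if (len < 20)
        return 0;

    /* e_ident[0..3]: the ELF magic number. */
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;

    /* e_ident[4] (EI_CLASS): 1 means 32-bit objects (ELFCLASS32). */
    if (buf[4] != 1)
        return 0;

    /* e_ident[5] (EI_DATA): 2 means big-endian (ELFDATA2MSB). */
    if (buf[5] != 2)
        return 0;

    /* e_machine is a 16-bit field at offset 18, stored big-endian here. */
    uint16_t machine = (uint16_t)((buf[18] << 8) | buf[19]);

    return machine == 20;  /* EM_PPC */
}
```

A 64-bit PowerPC executable would fail the class test, and an x86 executable would fail both the byte-order and machine tests.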
And finally, run the executable under qemu:
Listing 6: Running an executable under qemu
|
"Hello, world." should be output to the terminal.
What now?
Now you know you can build C code into PowerPC executables and run them. You can also experiment with the simple assembly example given in the "Introduction to PowerPC Assembly" article, which is listed in Resources. (Note that although you could use the cross-assembler directly, it's a lot easier to continue to use the compiler instead.) Once you're satisfied with that, you can move on to bigger and more interesting examples, perhaps including shared libraries (read the qemu documentation -- which is also listed in Resources -- for help with that).
Although crosstool can produce ppc64 toolchains just as easily, there is unfortunately no open source emulator for 64-bit PowerPC, so you would need real hardware to experiment. Of course, ppc32 executables run just as well on ppc64 hardware (but the reverse is not true).
Conclusion
An emulator will never be as fast as native hardware; the biggest reason functionality is implemented in hardware is speed. An emulator will also never be as accurate as real hardware, especially when the hardware itself could contain errata that can be triggered by subtle timing interactions of internal components. However, an emulator can be very valuable for development and even general-purpose computing. Virtual PC, a commercial emulator, is used by a large number of Macintosh® owners to run Windows® applications. It may not be as fast as hardware, but it's cheaper and easier to maintain. When developing low-level operating system code, an emulator can provide that needed glimpse into the system's state to reveal a hardware-crippling bug. In fact, during hardware development, an emulator might be the only development platform available!
The emulators above have been and are being used for operating system development, which proves some measure of robustness. But don't let that stop you from trying them out just to experience having 32 general-purpose registers, or from going out of your way to try to support a PowerPC user of software you've written. With an unbeatable price tag and convenient environment, what do you have to lose?
Resources
- PearPC is maintained on SourceForge. See also the PearPC documentation.
- You can get qemu from the qemu home page. See also the QEMU CPU Emulator User Documentation.
- The PSIM model of the PowerPC Architecture is written in extended ANSI-C and hosted at Red Hat. It too has ample documentation.
- Downloads and documentation for crosstool may be found on the project home page.
- Once you are up and running, you can play around with some of the code from Introduction to assembly on the PowerPC (developerWorks, July 2002).
- Did you know you can get a PowerPC Linux kit for as little as US$200? We're talking, of course, about the Kuro (about which more later). See also this review of the Kuro from Penguin PPC.
- Can't wait for 64 bits? There is no need to -- here is some information about 64-bit PowerPC to get you started (developerWorks, October 2004).
- If you're emulating a whole system, these performance tools will help you get something done (developerWorks, June 2004).
- You can learn more about Just-in-time compilation from Wikipedia.
- See also A developer's guide to Linux emulators and how they operate (developerWorks, December 2004) and Emulate legacy operating systems on Linux: From CP/M to OpenVMS, Linux does it all (developerWorks, June 2003).
- Have experience you'd be willing to share with Power Architecture zone readers? Article submissions on all aspects of Power Architecture technology from authors inside and outside IBM are welcomed. Check out the Power Architecture author FAQ to learn more.
- Have a question or comment on this story, or on Power Architecture technology in general? Post it in the Power Architecture technical forum or send in a letter to the editors.
- All things Power are chronicled in the developerWorks Power Architecture editors' blog, which is just one of many developerWorks blogs.
- Find more articles and resources on Power Architecture technology and all things related in the developerWorks Power Architecture technology content area.
- Download a Power Architecture Pack to demo a SoC in a simulated environment, or just to explore the fully licensed version of Power Architecture technology. This and other fine Power Architecture-related downloads are listed in the developerWorks Power Architecture technology content area's downloads section.
About the author
Hollis Blanchard started learning about the PowerPC architecture and the Linux kernel in 1998. He works in the IBM Linux Technology Center, where he's developed for embedded PowerPC, pSeries servers, and x86 systems. He's also one of the core contributors to penguinppc.org.