# =========== Header ===========
# File:             Virtual Functions and Page Fault Mechanism
# Project:          (Newton Bowels)
# Written by:       Paul Guyot (pguyot@kallisys.net)
#
# Created on:       06/29/2001
# Internal version: 2
# Tab size:         4 spaces
#
# Copyright:        © 2001-2002 by Paul Guyot.
#                   All rights reserved worldwide.
# ===========

# =========== Change History ===========
# 03/12/2002    v2  [PG]    Finished the article.
# 06/29/2001    v1  [PG]    Creation of the file
# ===========

Abstract
========

This note explains the problem with virtual functions when using the -autoCopy option and describes a hack that lets you use virtual functions nevertheless when they are really needed. It also describes some basic points of the NewtonOS virtual memory and page fault mechanism.

Platforms
=========

This applies to 2.x Newtons at least (and very probably to any Newton).

Introduction
============

Packages on the Newton are stored in compressed format. When they are active, they are mapped in memory at 0x6XXXXXXX addresses (apparently in domain 7, which is between 0x60000000 and 0x67FFFFFF). When the processor tries to access an address in this range, either a RAM page is mapped there with decompressed package content, or no page is there and the page fault mechanism reads the package, uncompresses it, relocates it to this zone and resumes the execution of the previous task.

Sometimes code must be kept out of that zone, especially if it could be called by the page fault mechanism itself (for example if you are designing a driver for memory cards). To achieve this, you can simply ask the operating system to copy your code into RAM and lock it there. This is the role of the -autoCopy option of the packer tool, which sets the appropriate flag in the part.

This works nicely as long as you don't use virtual functions. Indeed, when the system copies the code to RAM, it doesn't relocate it to its new location. Virtual functions are accessed via a virtual table which is filled with relative branches.
But the constructor of a class with virtual functions accesses the virtual table through its address stored in the code (and updated by the relocation mechanism) instead of computing it relative to the current PC.

This is basically why the "DDK Introduction" manual reads (page 3-8):

> Don't Define and Use Code that Has Virtual Methods
> --------------------------------------------------
> An important implication of the fact that p-classes need position-independent code is that you cannot
> use code that derives from classes with virtual methods. In particular, most Newton ROM classes have
> virtual methods, so you generally cannot derive from classes in the Newton ROM. (You can instantiate
> those classes, though.) In addition, you can't use some methods of ROM base classes because they
> require that you supply an instance of a derived class from a base class that has virtual methods.

I haven't found out why this could be a problem except with the autoCopy flag. If you know of another case, please contact me.

A code sample
=============

The following sample shows more precisely what happens. This is the constructor of a simple class with virtual functions; it just calls the inherited class constructor and a non-virtual function of this class.

// ------------------------------------------------------------------------ //
//  * TATAPSSManager( void )
// ------------------------------------------------------------------------ //
// Unique constructor.
// Initialize the EventHandler.

TATAPSSManager::TATAPSSManager( void )
    :
        TAEventHandler()
{
    // Setup the class of events I want to hear about.
    (void) Init( kAECardServerID );
}

This will be coded the following way by the compiler:

        MOV     r12, sp                         ; save sp (r12 is lost)
        STMDB   sp!, { r4, r11, r12, lr, pc }   ; save other registers on the stack
        SUB     r11, r12, #4                    ; r11 is sp - 4 (skip pc on the stack, used when
                                                ; registers will be restored)
        MOVS    r4, r0                          ; move r0 (this) to r4, and test it
        BNE     LBL1                            ; if it's not null, the object is built in a
                                                ; previously allocated zone, go to LBL1
        MOV     r0, #0x14                       ; move 0x14 (the size of our object) to r0
        BL      __nw__FUi                       ; call operator new(unsigned int) to allocate the
                                                ; object
        MOVS    r4, r0                          ; save the result to r4
        BEQ     LBL2                            ; if allocation failed, skip initialization
LBL1:   MOV     r0, r4                          ; this line is redundant, we always have r0 == r4
                                                ; at this point
        BL      __ct__14TAEventHandlerFv        ; call inherited class constructor, here
                                                ; TAEventHandler::TAEventHandler()
        LDR     r0, LBL3                        ; load the address of the virtual table stored
                                                ; below into r0
        STR     r0, [r4, #0]                    ; store it to *this
        MOV     r0, r4                          ; r0 = this
        LDR     r2, LBL4                        ; r2 = 'newt'
        LDR     r1, LBL5                        ; r1 = 'cdsv'
        BL      Init__14TAEventHandlerFUlT1     ; call TAEventHandler::Init( 'cdsv', 'newt' );
                                                ; ('newt' is the default value for the second
                                                ; argument)
LBL2:   MOV     r0, r4                          ; the result is this
        LDMDB   r11, { r4, r11, r13, pc }       ; restore registers and return (by setting pc to lr)
LBL3:   DCD     0                               ; this is where the linker will put relocation
                                                ; information and the OS will put
                                                ; &__VTABLE__14TATAPSSManager
LBL4:   DCD     0x6E657774                      ; 'newt'
LBL5:   DCD     0x63647376                      ; 'cdsv'

When the package is activated, the OS will put a 0x6XXXXXXX address at LBL3, and this address is used even if the constructor is called from the copy in RAM. This means that when a virtual function of this object is called, the code executed will be in the package virtual space and not in the RAM copy, which is the one that is page fault safe.

The easy solution is to avoid virtual functions.

P-Classes: a method to avoid virtual functions
==============================================

There are several ways to avoid virtual functions.
One is Newton specific and it's the one we'll consider here. NewtonOS has P-Classes, which are like classes with virtual functions but without some of their disadvantages, especially the dependence on the order of methods. See the "DDK Introduction" for more details.

You can create your own Interface/Implementation classes, and you can create several implementations to reproduce polymorphism.

A dirty hack to fix the problem
===============================

The other method is the following hack. In our constructor, we call an assembly function which redirects the virtual table pointer to the copy in RAM. The assembly function is included.

Our constructor becomes:

// ------------------------------------------------------------------------ //
//  * TATAPSSManager( void )
// ------------------------------------------------------------------------ //
// Unique constructor.
// Initialize the EventHandler.

TATAPSSManager::TATAPSSManager( void )
    :
        TAEventHandler()
{
    // Setup the class of events I want to hear about.
    (void) Init( kAECardServerID );

    // Hack virtual functions table because of page fault mechanism.
    RelocVTable( &__VTABLE__14TATAPSSManager );
}

RelocVTable is a C macro which expands to:

#define RelocVTable( inVTablePtr ) RelocVTableHack( this, (ULong) &RelocVTableHack, inVTablePtr )

The trick here is to pass to RelocVTableHack the pointer to itself, but in the virtual package space (any pointer to a function you can make with & is a pointer in the virtual package space).

The VTable's symbol is mangled. It's __VTABLE__ followed by the length of the class name and the class name. The pointer to it is also in the virtual package space.

RelocVTableHack replaces the pointer to the VTABLE in the package space with the pointer to the copy of the VTABLE in the RAM copy. You need to call it in the constructor of every concrete class.

Attached Files
==============

Attached files are RelocHack.h and RelocHack.a for the C declarations and the assembler code.
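
The assembler source is not reproduced in this note. As a rough illustration only, the address arithmetic behind the hack can be sketched in C++. This assumes that RelocVTableHack compares the address it is really running at with the package-space address it receives, and applies the same offset to the virtual table pointer. The names RelocatedVTable, VTableSymbol and SelfCheck and all the addresses are hypothetical; the real routine in RelocHack.a may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical model of the fix-up (an assumption, not the attached code).
//   runtimeSelf : address RelocVTableHack is really executing at (RAM copy)
//   packageSelf : &RelocVTableHack as seen from C++ (package virtual space)
//   packageVTbl : &__VTABLE__... as seen from C++ (package virtual space)
inline std::uint32_t RelocatedVTable(std::uint32_t runtimeSelf,
                                     std::uint32_t packageSelf,
                                     std::uint32_t packageVTbl)
{
    // Offset between the RAM copy and the package mapping. Unsigned
    // arithmetic wraps modulo 2^32, so the direction does not matter.
    const std::uint32_t delta =
        static_cast<std::uint32_t>(runtimeSelf - packageSelf);
    return static_cast<std::uint32_t>(packageVTbl + delta);
}

// Builds the mangled virtual table symbol as described in this note:
// "__VTABLE__" + length of the class name + class name.
inline std::string VTableSymbol(const std::string& className)
{
    return "__VTABLE__" + std::to_string(className.size()) + className;
}

// Example with made-up addresses (not real Newton values).
inline void SelfCheck()
{
    // The function is linked at 0x60001000 and the table sits 0x1340 bytes
    // further on; the RAM copy of the function runs at 0x0C101000.
    assert(RelocatedVTable(0x0C101000u, 0x60001000u, 0x60002340u)
           == 0x0C102340u);
    assert(VTableSymbol("TATAPSSManager") == "__VTABLE__14TATAPSSManager");
}
```

On the Newton, the resulting address would then be stored at offset 0 of the object, which is where the compiled constructor stores the virtual table pointer (STR r0, [r4, #0]).
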
Note that the assembly functions in RelocHack.a are provided with C++-mangled names. They can easily be adapted to C names. However, what are virtual methods in C?

History
=======

I ran into the problem of virtual functions with the ATA project. I planned to use virtual functions everywhere (for example to switch between ATA modes), when I realized that the code called was in the virtual space and that this locked up the Newton (because the page fault mechanism called the ATA code, which called the page fault mechanism, hence the lock).

Then I avoided virtual functions everywhere, until the moment when I needed an event handler that had to be page fault safe. TAEventHandlers have virtual methods that subclasses need to override.

Finally, I'm back to virtual methods everywhere using this hack, because in the end it's much easier in a lot of cases.

## ========================================== ##
## Do not use the blue keys on this terminal. ##
## ========================================== ##