</p> <p> One extra feature of the interface between the decode unit and the prefetch unit, though, is the speculative fetch; the prefetch unit is assumed to have some form of local read-ahead capability, and the decode unit may request that an address be speculatively fetched into the local read-ahead in case a branch is to be taken or a return from a function call is to be executed. </p> <p> The prefetch unit will actually have three lines of data, each one with an address; effectively this is a fully associative cache. The prefetch unit will automatically read ahead as sequential instructions are presented to the decode unit, and will also fill a line on a speculative request and (of course) on a demand fetch (i.e. a non-speculative address request). </p> <?php page_section("Decode", "Decode stage"); ?> <p> The decode stage of the GIP contains two decoders; the first is a native 16-bit instruction decoder, the second is an ARM 32-bit instruction emulator. These decoders effectively run in parallel; the GIP will be in one of three operating modes (native 16-bit, ARM 32-bit emulation, or idle) and the internal instructions and prefetch decodes are taken from the decoder for the current mode. </p> <p> The decode unit talks to the prefetch stage, the register file read stage, and the local scheduler; the prefetch stage interface is discussed above, and that leaves the register file read stage and the local scheduler. </p> <p> The decode unit talks to the scheduler by taking schedule requests from the scheduler and responding with an acknowledgement. The decode unit may block
<tr> <th><a href="alu/index.php">ALU</a></th> <td>ALU capabilities, outline implementation, and use for ARM emulation</td> </tr> <tr> <th><a href="arm_emulation/index.php">ARM emulation</a></th> <td>Details on the mechanisms used for ARM emulation</td> </tr> <tr> <th><a href="microkernel/index.php">Microkernel</a></th> <td>Capabilities of the microkernel and its use in ARM emulation</td> </tr> </table> <?php page_section("construction", "Construction"); ?> It should be noted that each stage of the pipeline registers its inputs immediately (possibly with clock enabled flops), and produces combinatorial outputs. There are no combinatorial input-to-output paths in the pipeline stages. <?php page_ep(); include "{$toplevel}web_assist/web_footer.php"; ?>
<?php include "web_locals.php"; include "{$toplevel}web_assist/web_globals.php"; include "local_header.php"; ?> <?php page_section("multiply", "Multiply instructions"); ?> <p> ARM multiply instructions are basically plain multiplies or multiply with accumulate. They are emulated with three different instructions: INIT, IMLA, IMLB. For simplicity a multiply instruction always takes 16 cycles; early termination is not supported. INIT[CC]A Rm, Rn/0 IMLACPA Acc, Rs IMLBCPA Acc, #0 14 times IMLBCPA[S] Acc, #0 -> Rd </p> <?php arm_emulation_table_start(); arm_emulation_table_instruction("MUL[CC] Rd, Rm, Rs", "\n INIT[CC]A Rm, #0<br>\n IMLACPA Acc, Rs<br>\n IMLBCPA Acc, #0 <i>(14 times)</i><br>\n IMLBCPA[F] Acc, #0 -> Rd", "No restrictions", ""); arm_emulation_table_instruction("MLA[CC] Rd, Rm, Rs, Rn", "\n INIT[CC]A Rm, Rn<br>\n IMLACPA Acc, Rs<br>\n IMLBCPA Acc, #0 <i>(14 times)</i><br>\n IMLBCPA[F] Acc, #0 -> Rd", "No restrictions", ""); arm_emulation_table_instruction("MUL[CC]S Rd, Rm, Rs", "\n INIT[CC]A Rm, #0<br>\n IMLACPA Acc, Rs<br>\n IMLBCPA Acc, #0 <i>(14 times)</i><br>\n IMLBCPAS[F] Acc, #0 -> Rd", "No restrictions", "<em>Differs from ARM - V is corrupted</em>"); arm_emulation_table_instruction("MLA[CC]S Rd, Rm, Rs, Rn", "\n INIT[CC]A Rm, Rn<br>\n IMLACPA Acc, Rs<br>\n IMLBCPA Acc, #0 <i>(14 times)</i><br>\n IMLBCPAS[F] Acc, #0 -> Rd", "No restrictions", "<em>Differs from ARM - V is corrupted</em>");
<?php page_section("notes", "Notes on instructions"); ?> Coprocessors are accessed through the postbus, which is a register set. <p> There is a set of registers marked as 'special' which allow control of the sticky/unsticky flags, the condition-passed shadow size, endianness, and unaligned accesses. <p> The scheduler is also accessed as a set of registers, to allow for control of the threads. This provides mechanisms for moving between ARM and native mode. <?php page_section("missing", "Missing instructions"); ?> Missing: <ul> <li>Memory commands (prefetch, flush, and such like) <li>Coprocessor/memory DMA commands <li>Zero-overhead loops <li>Repeat count - can be set by writes to registers, but a decode is wanted </ul>
<tr> <th>O <td>Offset <td>0=> offset of 1/2/4 (depending on access size), 1=> use SHF as the offset </tr> <tr> <th>s <td>Stack access <td>1=> use stack locality for caching access, 0=> use default locality for caching access </tr> </table> <?php page_section("encoding", "Encoding"); ?> <table border=1 class=data> <tr> <th>Mnemonic</th> <th>Class</th> <th>Subclass</th> <th>Opts</th> <th>CC</th> <th>Rd</th> <th>A</th> <th>F</th> </tr> <tr>
arm_emulation_table_instruction("LDR[CC] PC, [Rn, #+/-imm]!", "IADD[CC]A/ISUB[CC]A Rn, #imm -> Rn<br>ILDRCP[S]F #0 (ACC) -> PC", $rn_not_pc, "{$repl_rn}<br>"); arm_emulation_table_instruction("LDR[CC] Rd, [Rn, +/-Rm]!", "IADD[CC]A/ISUB[CC]A Rn, Rm -> Rn<br>ILDRCP[S] #0 (ACC) -> Rd", $rd_rn_not_pc, "{$repl_rn_rm}<br>"); arm_emulation_table_instruction("LDR[CC] PC, [Rn, +/-Rm]!", "IADD[CC]A/ISUB[CC]A Rn, Rm -> Rn<br>ILDRCP[S]F #0 (ACC) -> PC", $rn_not_pc, "{$repl_rn_rm}<br>"); arm_emulation_table_instruction("LDR[CC] Rd, [Rn, +/-Rm, SHF #imm]!", "ILSL Rm, #imm<br>IADD[CC]A/ISUB[CC]A Rn, SHF -> Rn<br>ILDRCP[S] #0 (ACC) -> Rd", $rd_rn_not_pc, "{$repl_rn_rm}<br>"); arm_emulation_table_instruction("LDR[CC] PC, [Rn, +/-Rm, SHF #imm]!", "ILSL Rm, #imm<br>IADD[CC]A/ISUB[CC]A Rn, SHF -> Rn<br>ILDRCP[S]F #0 (ACC) -> PC", $rn_not_pc, "{$repl_rn_rm}<br>"); arm_emulation_table_instruction("LDR[CC] Rd, [Rn], #+/-imm", "ILDR[CC]A[S] #0 (Rn), #+/-imm -> Rd<br>MOVCP ACC -> Rn", $rd_rn_not_pc, "{$repl_rn}<br>"); arm_emulation_table_instruction("LDR[CC] PC, [Rn], #+/-imm", "ILDR[CC]A[S] #0 (Rn), #+/-imm -> PC<br>MOVCPF ACC -> Rn", $rn_not_pc, "{$repl_rn}<br>"); arm_emulation_table_instruction("LDR[CC] Rd, [Rn], +/-Rm", "ILDR[CC]A[S] #0 (Rn), +/-Rm -> Rd<br>MOVCP ACC -> Rn", $rd_rn_not_pc, "{$repl_rn_rm}<br>"); arm_emulation_table_instruction("LDR[CC] PC, [Rn], +/-Rm", "ILDR[CC]A[S] #0 (Rn), +/-Rm -> PC<br>MOVCPF ACC -> Rn", $rn_not_pc, "{$repl_rn_rm}<br>"); arm_emulation_table_instruction("LDR[CC] Rd, [Rn], +/-Rm, SHF #imm", "ILSL Rm, #imm<br>ILDR[CC]A[S] #0 (Rn), +/-SHF -> Rd<br>MOVCP ACC -> Rn", $rd_rn_not_pc, "{$repl_rn_rm}<br>"); arm_emulation_table_instruction("LDR[CC] PC, [Rn], +/-Rm, SHF #imm", "ILSL Rm, #imm<br>ILDR[CC]A[S] #0 (Rn), +/-SHF -> PC<br>MOVCPF ACC -> Rn", $rn_not_pc, "{$repl_rn_rm}<br>"); arm_emulation_table_end(); ?> <?php page_section("stores", "Stores"); arm_emulation_table_start(); arm_emulation_table_instruction("STR[CC] Rd, [Rn, #0]", "ISTR[CC][S] #0 (Rn) <- 
Rd", "", $repl_rn); arm_emulation_table_instruction("STR[CC] Rd, [Rn, #0]!", "ISTR[CC][S] #0 (Rn) <- Rd", "", $repl_rn); arm_emulation_table_instruction("STR[CC] Rd, [Rn], #0", "ISTR[CC][S] #0 (Rn) <- Rd", "", $repl_rn); arm_emulation_table_instruction("STR[CC] Rd, [Rn, #+/-imm]", "IADD[CC]AC/ISUB[CC]AC Rn, #imm<br>ISTRCP[S] #0 (ACC, +/-SHF) <- Rd", "", $repl_rn); arm_emulation_table_instruction("STR[CC] Rd, [Rn, +/-Rm]", "IADD[CC]AC/ISUB[CC]AC Rn, Rm<br>ISTRCP[S] #0 (ACC, +/-SHF) <- Rd", "", $repl_rn_rm); arm_emulation_table_instruction("STR[CC] Rd, [Rn, +/-Rm, SHF #imm]", "ISHF[CC] Rm, #imm<br>ISTRCPA[S] #0 (Rn, +/-SHF) <- Rd", "", $repl_rn_rm); arm_emulation_table_instruction("STR[CC] Rd, [Rn, #+/-imm]!", "IADDAC/ISUBAC Rn, #imm<br>ISTR[CC]A[S] #0 (ACC, +SHF) <- Rd -> Rn", "", $repl_rn); arm_emulation_table_instruction("STR[CC] Rd, [Rn, +/-Rm]!", "IADDAC/ISUBAC Rn, Rm<br>ISTR[CC]A[S] #0 (ACC, +SHF) <- Rd -> Rn", "", $repl_rn_rm); arm_emulation_table_instruction("STR[CC] Rd, [Rn, +/-Rm, SHF #imm]!", "ILSL Rm, #imm<br>ISTR[CC]A[S] #0 (Rn, +/-SHF) <- Rd -> Rn", "", $repl_rn_rm); arm_emulation_table_instruction("STR[CC] Rd, [Rn], #+/-imm", "ISTR[CC][S] #0 (Rn) <- Rd<br>IADDCPA/ISUBCPA Rn, #imm -> Rn", "", $repl_rn); arm_emulation_table_instruction("STR[CC] Rd, [Rn], +/-Rm", "ISTR[CC][S] #0 (Rn) <- Rd<br>IADDCPA/ISUBCPA Rn, Rm -> Rn", "", $repl_rn_rm); arm_emulation_table_instruction("STR[CC] Rd, [Rn], +/-Rm, SHF #imm", "ISHF[CC] Rm, #imm<br>ISTRCPA[S] #0 (Rn), +/-SHF <- Rd -> Rn", "", $repl_rn_rm); arm_emulation_table_end(); ?>
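<p> As a cross-check on the table above, the ARM-level behaviour that each group of internal instructions has to reproduce can be sketched as follows. This is a minimal model of the standard ARM offset, pre-indexed and post-indexed addressing modes only; the function name <code>str_address</code> and the mode strings are assumptions made for the example, and the sketch says nothing about how the GIP pipeline itself is implemented. </p>

```python
MASK32 = 0xFFFFFFFF  # ARM registers are 32 bits wide


def str_address(rn, offset, mode):
    """Return (access_address, new_rn) for an ARM STR.

    mode is a hypothetical label for the addressing form:
      'offset' : STR Rd, [Rn, off]   (Rn unchanged)
      'pre'    : STR Rd, [Rn, off]!  (Rn updated before the access)
      'post'   : STR Rd, [Rn], off   (Rn updated after the access)
    offset may be negative to model the '-' forms.
    """
    if mode == "offset":
        return (rn + offset) & MASK32, rn
    if mode == "pre":
        address = (rn + offset) & MASK32
        return address, address
    if mode == "post":
        return rn, (rn + offset) & MASK32
    raise ValueError("unknown addressing mode: " + mode)
```

<p> In the table, the first value corresponds to the address presented by the ISTR internal instruction and the second to the value written back to Rn, where a writeback occurs. </p>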
item to note is the instruction set is a 2-register instruction set, with separated memory and operation instructions, much like simple RISC processors. </p> <p> One difference, though, is that the native instruction set utilizes an extension capability to provide for 3-register instructions, as well as access to the fuller flavor of internal instructions supported by the GIP pipeline. </p> <?php page_section("summary", "Summary"); ?> <p> This documentation starts with <a href="overview.php">the overview</a>; pages then give <a href="encoding.php">details of the encodings</a>, and <a href="examples.php">some examples of use</a>. </p> <?php page_ep(); include "{$toplevel}web_assist/web_footer.php"; ?>
</p> <ol> <li>Addition and subtraction</li> <li>AND, OR, NOT, XOR</li> <li>Left shift by arbitrary amount</li> <li>Logical right shift by arbitrary amount</li> <li>Arithmetic right shift by arbitrary amount</li> <li>Rotate by arbitrary amount</li> <li>Two-bit step of multiply-accumulate</li> <li>Single bit step of divide</li> <li>XOR and zero-bit count (top down)</li> </ol> <?php page_section("use", "Operation use"); ?> <p> This section describes how some of the more complex ALU operations can be used. </p> <?php page_subsection("multiplication", "Multiplication"); ?> <p>A multiply instruction can be implemented with Booth's algorithm calculating two bits of result at a time. Say a calculation of the form r=x*y+z needs to be performed. The basic multiply step of 2 bits can be
<?php page_section("accumulator", "Use of the accumulator"); ?> <p> The accumulator is used for two purposes in the ARM emulation system. Firstly it is used to create intermediate results for instructions, such as addresses of loads and stores, without requiring an actual register. Secondly it is used for optimization of data processing, in effect as a forwarding path within the ALU. </p> <?php page_section("alu_forwarding", "ALU forwarding"); ?> <p> The accumulator in the internal GIP pipeline is utilized to enhance the performance of ARM emulation. Simply put, it maintains a local copy of a single ARM register, and the register number that it contains is tracked by the ARM emulation hardware unit. Then, instead of issuing an internal instruction which reads that register from the register file, the emulation unit may issue one which uses the accumulator. Note that the value in the accumulator is very volatile: it may be used by just a few instructions, and will be corrupted by many instructions, so the net usage is not likely to be more than 20% of instructions. Moreover, it achieves a performance benefit only in back-to-back instructions, as with more spacing the register value is likely to be ready in the register file path, and the 'forwarding' path through the accumulator does not save cycles.
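<p> The register tracking described above can be sketched in software. This is a minimal illustrative model, not the GIP hardware: the class <code>AccumulatorTracker</code>, its method names, and the 'ACC'/'Rn' operand strings are all assumptions made for the example. </p>

```python
class AccumulatorTracker:
    """Hypothetical model of the decoder's record of which ARM register ACC mirrors."""

    def __init__(self):
        # ARM register number currently mirrored in ACC, or None if ACC
        # holds no register copy (e.g. it holds a scratch address).
        self.held_register = None

    def source_for(self, reg):
        """Pick the operand source for ARM register number reg."""
        return "ACC" if self.held_register == reg else f"R{reg}"

    def record_write(self, dest_reg, writes_acc):
        """Update the tracking after an internal instruction is issued.

        dest_reg   -- ARM register written by the instruction, or None
        writes_acc -- True if the instruction also writes the accumulator
        """
        if writes_acc:
            # ACC now mirrors dest_reg (or nothing, if there is no dest_reg).
            self.held_register = dest_reg
        elif dest_reg == self.held_register:
            # The register changed without ACC being updated: the copy is stale.
            self.held_register = None
```

<p> In this sketch an instruction that leaves a scratch value in the accumulator (for instance a load address) is recorded with <code>dest_reg=None</code>, which simply clears the tracking. </p>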
<?php include "web_locals.php"; include "{$toplevel}web_assist/web_globals.php"; site_set_location("gip.instruction_set"); $page_title = "GIP Documentation"; include "{$toplevel}web_assist/web_header.php"; page_header("GIP Native Instruction Set"); page_sp(); ?> <?php page_section("examples", "Examples"); ?> <?php page_ep(); include "{$toplevel}web_assist/web_footer.php"; ?>
next thread. It will start with the round-robin thread number given. <p> Also the scheduler has a bit indicating whether it is running preemptively or cooperatively. <p> The scheduler keeps track of the currently running thread. <p> The scheduler also determines what thread to run next, and registers that. <?php page_section("operations", "Operations"); ?> <dl> <dt>Clear flag <dd> ISCHDSEMBIC[CC][A] Rm/Imm [-> Rd] <br> This uses the semaphore flags in the mode currently specified for the system, reads them, performs a BIC operation with the specified operand, and can put the result in Rd. To clear a particular semaphore based on a register
The ALU generates a 'condition passed' indication from either current flags or from the stored flags. An ALU and data shifter instruction can store results and flags conditionally on this value, if desired. <br> The flags may be configured as 'sticky'; that is they may be set if desired due to an operation, but not cleared </p> <?php page_section("operands", "Operands to registers"); ?> <p>The ALU contains four potential operand sources: input register A, input register B, ALU accumulator ACC, and shifter result SHF. </p> <?php page_section("further_details", "Further details"); ?> <p>The <a href="operations.php">operations</a> document describes the operational capabilities required of the ALU and shifter, and discusses how these requirements are derived from the ARM emulation. The <a href="dataflow.php">dataflow</a> document takes this information in another form, presenting details as to how the data flows for each operation and what the shifter and ALU do, and how that ties back to ARM emulation. The <a href="implementation.php">implementation</a> document then gives details on the module and its implementation. </p> <?php page_ep(); include "{$toplevel}web_assist/web_footer.php"; ?>
</table> <?php page_section("modes", "Modes"); ?> The microkernel has a concept of which 'mode' the ARM is running in, but it is primitive. It has a flag that it keeps in memory (or register?) that is 1 for SVC mode, 0 for user mode. <?php page_section("flags", "Flags"); ?> The microkernel maintains a single flag for interrupt enable. This flag is held in the scheduler, and may cause the microkernel to be scheduled when it changes from 0 (interrupts disabled) to 1 (interrupts enabled). <?php page_section("primitives", "Primitives"); ?> The microkernel supports the following primitives: <dl> <dt>SWI invoked (r10-r14 = user mode register values, r15 = calling PC+8, r16 = instruction) <dd>The highest-priority primitive; it is invoked generally from a hardware decode, and the microkernel expects r0 to r15 to be the requesting arguments and r16 to be the instruction that caused the invocation, so that it may be decoded from a table for despatch. The microkernel will preserve r0 to r15 in a fixed region of memory, restart the ARM with r14 equal to the given r15, and use a despatch table for the start value based on r16. It will also mark itself as in 'SVC' mode, and clear the interrupts enabled flag. <dt>Enter USER mode at address (r10-r14 = user mode register values, r15 = restart PC, r16 = instruction) <dd>This is invoked by ARM code explicitly, and is effectively a kind of SWI. It takes a single argument, that of the address to start the ARM at. The microkernel will clear its 'SVC' mode flag, and restart the ARM mode code thread at the value of r15. <dt>Handle interrupt
<?php include "web_locals.php"; include "{$toplevel}web_assist/web_globals.php"; site_set_location("gip.alu.implementation"); $page_title = "GIP Documentation"; include "{$toplevel}web_assist/web_header.php"; page_header("GIP ALU and Shifter Implementation"); page_sp(); ?> This documentation matches version 1.9 of the gip_alu source code <?php page_section("module_definition", "Module definition"); ?> <p> The ALU stage, then, requires the following controls and input data, which can be described as 'ports' to the ALU stage </p> <?php page_subsection("inputs", "Inputs"); ?> <table border=1> <tr> <th>Port</th> <th>Type</th> <th>Details</th> </tr>
The basic operation of a hardware interrupt thread is: <ol> <li>Wait for stuff on a register or something from an external device (or other GIP) <li>Get stuff from that register; if it has enough, or a packet, or something then notify the microkernel that there is something to do by setting a bit in the interrupt status register and by setting the hardware interrupt pending semaphore. <li>Return, but keep giving it stuff. </ol> </p> <?php page_section("thread_priorities", "Thread priorities"); ?> <p> Hardware threads should be equal top priority, nonpreemptable. Microkernel thread should be second priority, preemptable by hardware threads. ARM thread should be lowest priority, preemptable by any of the above. </p> <?php page_ep(); include "{$toplevel}web_assist/web_footer.php";
<?php include "web_locals.php"; include "{$toplevel}web_assist/web_globals.php"; include "local_header.php"; ?> <?php page_section("flushing", "Pipeline flushing"); ?> <p> Pipeline flushing is used when an ARM instruction stream is being emulated where the predicted flow of instructions turns out to be incorrect, and a different flow is required. A pipeline flush must always be accompanied, in ARM emulation, with a corresponding write to the program counter from the end of the pipeline. The order of the write to the PC and the flush may vary depending on the contents of the pipeline (in particular a load may complete after a following ALU operation with flush), so the decoder must keep track of the flush and write, and pair them up. <p> The actual operations that may invoke a flush are as follows: <table class=data border=1> <tr>
If carry is one then return that as result, else return Op2 </td> <td> </td> </tr> </table> <p> The arithmetic unit produces a carry (carry out of the adder), overflow (from the adder), a zero flag indicating its result is zero, and a negative flag indicating the top bit of its result is set. </p> <?php page_section("Result values", "Result values"); ?> <p> Conditional execution may block execution; no effects will occur if a conditional operation is performed and its condition is not met. <br> With that in mind: <ol> <li> The ALU result is the result of the logical or arithmetic operation performed: if a logical operation was performed then the result comes from the logic unit (as do the N and Z flags; V is unchanged; C may come from the shifter's last carry out); if an arithmetic operation was performed then the result comes from the arithmetic unit (as do N, Z, V, C). Note that for a shifter result to be seen it must be moved through the logical or arithmetic path in a second instruction, as it is not muxed through to the output. </li> <li> The shifter result is always written to the SHF register on execution
site_set_location("company.people"); $page_title = "Embisi Inc. People"; include "{$toplevel}web_assist/web_header.php"; page_header("Embisi People"); page_sp(); ?> Embisi is headed up by <a href="gavin_stark">Gavin J Stark</a> and <a href="john_croft">John Croft</a>. <?php page_section("gavin", "<a href=gavin_stark>CEO - Gavin J Stark</a>"); ?> Gavin Stark is the CEO of Embisi Inc. Gavin was previously an architect for Network Processors at Intel, which he joined through its acquisition of Basis Communications, where he was CTO. Gavin has a PhD and BA from Cambridge University, England. <?php page_section("john", "<a href=john_croft>Software - John Croft</a>"); ?> John Croft is the embedded software lead for Embisi, building on years of experience in embedded computing and operating system design and support. John previously worked at Cisco, Calista (acquired by Cisco), and Madge Networks. John holds a BA from Cambridge University, England. <?php page_ep(); include "{$toplevel}web_assist/web_footer.php"; ?>
handled. <?php code_format("gip", "code/system_mode_no_interrupt_pending.s"); ?> <?php page_section("system_mode_interrupts", "System mode with interrupts pending"); ?> This is the code that runs when the microkernel was idling and either a hardware or software interrupt occurs. It examines the source of the interrupt, clearing the indication atomically. It then despatches to the correct routines until all the interrupt sources are handled. <?php code_format("gip", "code/system_mode_interrupt_pending.s"); ?> <?php page_section("swi_entry", "SWI entry code"); ?> <?php code_format("gip", "code/swi_entry.s"); ?> <?php page_ep(); include "{$toplevel}web_assist/web_footer.php";
<?php include "web_locals.php"; include "{$toplevel}web_assist/web_globals.php"; site_set_location($site_location); $page_title = "GIP Documentation"; include "{$toplevel}web_assist/web_header.php"; page_header("ARM Emulation Microkernel"); page_sp(); ?> <?php page_section("communication", "Communication primitives"); ?> These communication primitives manipulate two basic structures: the ARM mode code thread, and a local SRAM data block that contains the ARM 'state', basically the banked registers (user mode stack pointer (r13) and link register (r14), system stack pointer, interrupted flags (zcnv), and interrupted PC (r15)). <p> For the interthread management to work the registers R0-R15 must be the same for the microkernel and the ARM mode code thread, and they cannot be affected by any of the other threads running on the GIP. <table border=1> <tr><th>State</th><th>Name</th><th>Description</th></tr> <tr><th>User R13</th>
signal_output("SrcAck", "n", "medium", "Is taking presented data; combinatorial (one per source)"); signal_output("TgtType", "2*n", "medium", "Type of transaction word"); signal_output("TgtData", "32", "medium", "Data for transaction word (to all n targets)"); signal_end(); signal_list("Ports on complex registered postbus router"); signal_input("n SrcType", "2*n", "late", "Type of transaction word (one per source)"); signal_input("n SrcData", "32*n", "late", "Data for transaction word (one per source)"); signal_input("m TgtAck", "m", "late", "Can take presented data"); signal_output("SrcAck", "n", "early", "Is taking presented data; combinatorial (one per source)"); signal_output("TgtType", "2*m", "early", "Type of transaction word"); signal_output("TgtData", "32*m", "early", "Data for transaction word"); signal_end(); ?> <?php page_section("simple_implementation", "Simple implementation"); ?> A simple implementation of a postbus router takes a small number of sources, say 4, and distributes data to a small number of targets, say 8. The simple implementation has a state machine, and it multiplexes its incoming 4 sets of source data together to present to all 8 targets. <p> <?php code_format("cdl", "cdl/simple.cdl"); ?> <?php
occur, whether they are forwards or backwards. </p> <p> Conditional branches with link are emulated by inserting two internal instructions in to the pipeline; one with the reverse condition of the instruction with flush to force a branch to PC+8-4 if the condition is not met, and the other with the current condition to set the link register on a correctly predicted branch. The program counter is also updated with the branch target address. </p> <?php page_section("emulation_details", "Emulation details"); arm_emulation_table_start(); arm_emulation_table_instruction("B {offset}", "", "Guaranteed branch", "Changes PC to PC+8+offset"); arm_emulation_table_instruction("B[CC] {offset}", "", "CC will be met<br>Guaranteed branch", "Changes PC to PC+8+offset"); arm_emulation_table_instruction("B[CC] {negative offset}", "SUB{!CC}F PC, #4 -> PC", "CC may not be met<br>Predicted branch", "Changes PC to PC+8+offset<br>If mispredicted then instruction will execute and reset PC to the correct path"); arm_emulation_table_instruction("B[CC] {positive offset}", "MOV{CC}F #target -> PC", "CC may not be met<br>Unpredicted branch", "If mispredicted then instruction will execute and set PC to branch target"); arm_emulation_table_instruction("BL {offset}", "SUB PC, #4 -> R14", "Guaranteed branch with link", "Changes PC to PC+8+offset"); arm_emulation_table_instruction("BL[CC] {offset}", "SUB PC, #4 -> R14", "CC will be met<br>Guaranteed branch with link", "Changes PC to PC+8+offset"); arm_emulation_table_instruction("BL[CC] {offset}", "SUB{!CC}F PC, #4 -> PC<br>SUB PC, #4 -> R14", "CC may not be met<br>Conditional branch with link", "Changes PC to PC+8+offset<br>If mispredicted then instruction will execute and reset PC to the correct path, and R14 will not be written"); arm_emulation_table_end(); page_ep(); include "{$toplevel}web_assist/web_footer.php"; ?>
<li>Ability to do a thunking call (multicycle decode, single instruction) <li>Ability to do a SWI (couple of moves, deschedule, assert event) (multicycle decode, single instruction) <li>Ability to make return from interrupt happen... How? <li>Force enable of hardware interrupts (macro) <li>Restore interrupt enable (macro) <li>Disable interrupts, returning previous state (macro) </ul> <?php page_section("thunking_libraries", "Thunking libraries"); ?> We can use r17 or some other register to contain a base address of dynamic library thunking table assists; the dynamic mapping of registers to support this in this particular way is patentable. <p> The best method is to have a small table of static data pointers, whose base address is in r17, indexed by local library number, and a global table of entry points for functions in the libraries, whose base is in r18, indexed by global entry point number. We can have one instruction that loads 'r12' with 'r17, #...' and pc with 'r18, #entryptr<<2' - we can use a quarter of the SWI instruction decode.
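<p> The two-table lookup described above can be sketched as follows. This is only an illustration under stated assumptions: memory is modelled as a dictionary from byte address to word, the function name <code>thunk_call</code> is hypothetical, and the word scaling of the library index is assumed, since the text leaves the r17 offset unspecified; only the <code>&lt;&lt;2</code> scaling of the entry point index comes from the text. </p>

```python
WORD_BYTES = 4  # assumed size of one table entry


def thunk_call(memory, r17, r18, library_number, entry_number):
    """Return the (r12, pc) pair loaded by the single thunking instruction.

    memory         -- dict mapping byte address to 32-bit word (a stand-in for RAM)
    r17            -- base address of the static data pointer table
    r18            -- base address of the global entry point table
    library_number -- local library number (index scaling is an assumption)
    entry_number   -- global entry point number, scaled by <<2 as in the text
    """
    r12 = memory[r17 + library_number * WORD_BYTES]  # static data pointer
    pc = memory[r18 + (entry_number << 2)]           # function entry point
    return r12, pc
```

<p> Keeping the data pointers and the entry points in separate tables means one global entry point table can be shared while each library keeps its own static data pointer. </p>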
<?php page_section("arm_emulation", "ARM emulation"); ?> <p> The ARM instruction classes in general are as follows: </p> <table border=1> <tr> <th>Class</th> <th>Description</th> <th>ALU note</th> </tr> <tr> <th>Data processing</th> <td>ALU, Shift, and combined instructions</td> <td>Varied emulation issues; see below</td> </tr> <tr> <th>Multiply</th> <td>Multiply and multiply accumulate</td> <td>Utilizes INIT and MULST</td> </tr> <tr> <th>Single Data Swap</th> <td><i>Not supported</i></td>
<?php include "web_locals.php"; include "{$toplevel}web_assist/web_globals.php"; site_set_location($site_location); $page_title = "GIP Documentation"; include "{$toplevel}web_assist/web_header.php"; page_header("ARM Emulation Microkernel"); page_sp(); ?> <?php page_section("linux", "Linux operation"); ?> <p> Linux supports a preempting kernel. In this model hardware interrupts can occur at any point in time, including during system calls; system calls may then be preempted by the interrupt routines. <p> It is worth examining the aim of the kernel architecture to determine how to manage hardware interrupts and system calls from a processor emulation perspective. This document does this in more detail in the sections below, but a summary is given first. <p> <?php page_subsection("summary", "Summary of operation"); ?>
<ul> <li>This page includes a very basic summary of the microkernel operation. <li>The basic requirements of the microkernel are derived from the way the <a href="linux.php">ARM port of Linux uses processor modes</a> <li>The operation of the microkernel is <a href="outline_operation.php">described in outline</a> <li>The operation of the microkernel is <a href="detailed_operation.php">described in detail</a> <li>The <a href="communication.php">communication primitives</a> required of the microkernel supply the mechanisms for interaction between ARM code and the microkernel. </ul> <?php page_section("overview", "Overview"); ?> <p> The microkernel gives a GIP the capability to support a full-blown OS with hardware interrupts. <p> The basic implementation requires three classes of thread: <ul> <li>Microkernel thread (one instance) <li>ARM mode code thread (one instance for all user and supervisor code)
<?php include "web_locals.php"; include "{$toplevel}web_assist/web_globals.php"; include "local_header.php"; ?> <?php page_section("data_processing", "Data processing instructions"); ?> <p> ARM data processing instructions can perform a shift and ALU operation in a single instruction, but the bulk of the instructions are just ALU operations. The internal instruction set does not support a single instruction to perform both a shift and an ALU operation, so pairs of internal instructions must be used for those ARM instructions. This leads to two distinct classes of emulated data processing instructions: <dl> <dt> Unshifted instructions <dd> ALU{cc} Rd, Rn, Rm <br> ALU{cc} Rd, Rn, #imm <p> These instructions do not require a shift operation, and so are emulated with a single internal instruction. There are a few subclasses of these instructions; comparisons differ from instructions that write to registers, and some instructions may set the flags while others do not.
<?php include "web_locals.php"; include "{$toplevel}web_assist/web_globals.php"; include "local_header.php"; ?> <?php page_section("prefetch", "Prefetching in ARM emulation"); ?> <p> The ARM emulation mode is designed to emulate ARM instructions at around 1.5 clocks per instruction for data processing, loads and stores, with obviously higher CPI for bulk transfers and multiplies. It is important at these rates to keep the instruction pipeline fed. This is particularly important as the distance to the main ROM on a GIP system is many cycles (about ten), and there is no level 1 cache. The prefetch unit performs speculative fetch of the 'next' instruction line to be used; this means that when the instruction at address 'n' is executed the prefetch unit will speculatively fetch 'n' plus one line, and as each line is 8 instructions this means 'n'+8. With 10 cycles of latency, instruction 'n'+8 should be ready in about 10 cycles, whereas at 1.5 CPI it will be needed in about 12 cycles (8 instructions at 1.5 clocks each), so all is well. However, when a branch is taken there will be a long penalty; unconditional branches will see a 10 cycle penalty, for example. Compare this to the ARM, though, where the branch is not detected until the execute stage (i.e. 2 cycles later), so the penalty is slightly less, but not considerably. The worst effect, though, is on returning from a branch, as this cannot be concretely
<li>Can we be a PCMCIA target with the above pins? </ul> <p> In its simplest form two endpoints can be tied together back-to-back; that is, their respective outgoing interfaces can be wired without logic to each other's incoming interfaces. <p> <?php page_section("baud_rate_generator", "Baud rate generators"); ?> The baud rate generators are individually configurable, and run from one of six potential clock sources: <ul> <li> No clock (low power) <li> Internal clock <li> I/O clock pin 0