</p>

<p>
One extra feature of the interface between the decode unit and the prefetch unit, though, is the speculative fetch: the prefetch unit
is assumed to have some form of local read-ahead capability, and the decode unit may request that an address be speculatively fetched into
the local read-ahead in case a branch is taken or a return from a function call is executed.
</p>

<p>
The prefetch unit will actually have three lines of data, each one with an address; effectively this is a fully associative cache. The prefetch unit will automatically read ahead as sequential instructions are presented to the decode unit; it also fills a line on a speculative request and (of course) on
a demand fetch (i.e. a non-speculative address request).
</p>
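<p>
The three-line behaviour described above can be sketched as a small behavioural model. This is only an illustration, not the GIP implementation: the class and method names are hypothetical, addresses are counted in instructions, the line size of 8 instructions and the least-recently-used replacement policy are assumptions.
</p>

```python
from dataclasses import dataclass
from typing import List, Optional

LINE_SIZE = 8  # assumption: instructions per line, addresses counted in instructions


@dataclass
class Line:
    address: Optional[int] = None  # line-aligned address tag; None while empty
    last_used: int = 0             # timestamp used for the assumed LRU policy


class PrefetchBuffer:
    """Hypothetical model: a few tagged lines, any line may hold any address."""

    def __init__(self, num_lines: int = 3) -> None:
        self.lines: List[Line] = [Line() for _ in range(num_lines)]
        self.clock = 0

    @staticmethod
    def line_address(address: int) -> int:
        return address - (address % LINE_SIZE)

    def lookup(self, line_address: int) -> Optional[Line]:
        # Fully associative: compare the tag of every line.
        for line in self.lines:
            if line.address == line_address:
                return line
        return None

    def fill(self, line_address: int) -> None:
        self.clock += 1
        line = self.lookup(line_address)
        if line is None:
            # Prefer an empty line, otherwise evict the least recently used one.
            line = min(self.lines, key=lambda l: (l.address is not None, l.last_used))
            line.address = line_address
        line.last_used = self.clock

    def demand_fetch(self, address: int) -> bool:
        """Non-speculative request; returns True if the line was already held."""
        line_address = self.line_address(address)
        hit = self.lookup(line_address) is not None
        self.fill(line_address)
        self.fill(line_address + LINE_SIZE)  # automatic sequential read-ahead
        return hit

    def speculative_fetch(self, address: int) -> None:
        """Request from the decode unit ahead of a possible branch or return."""
        self.fill(self.line_address(address))
```

<p>
In this model a speculative request for a branch target occupies one line, so a later demand fetch of that target hits without waiting for memory, at the cost of evicting the oldest line.
</p>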


<?php 
page_section("Decode", "Decode stage");
?>

<p>
The decode stage of the GIP contains two decoders; the first is a native 16-bit instruction decoder, the second is an ARM 32-bit instruction emulator.
These decoders effectively run in parallel; the GIP will be in one of three operating modes (native 16-bit, ARM 32-bit emulation, or idle), and the
internal instructions and prefetch decodes from the decoder for the current mode are the ones used.
</p>

<p>
The decode unit talks to the prefetch stage, the register file read stage, and the local scheduler; the prefetch stage interface is discussed above,
and that leaves the register file read stage and the local scheduler.
</p>

<p>
The decode unit talks to the scheduler by taking schedule requests from the scheduler and responding with an acknowledgement. The decode unit may block
<tr>
<th><a href="alu/index.php">ALU</a></th>
<td>ALU capabilities, outline implementation, and use for ARM emulation</td>
</tr>

<tr>
<th><a href="arm_emulation/index.php">ARM emulation</a></th>
<td>Details on the mechanisms used for ARM emulation</td>
</tr>

<tr>
<th><a href="microkernel/index.php">Microkernel</a></th>
<td>Capabilities of the microkernel and its use in ARM emulation</td>
</tr>

</table>


<?php 
page_section("construction", "Construction");
?>

It should be noted that each stage of the pipeline registers its inputs immediately (possibly with clock-enabled flops), and produces combinatorial outputs. There are no combinatorial input-to-output paths in the pipeline stages.

<?php 
page_ep();
include "{$toplevel}web_assist/web_footer.php";
?>

<?php

include "web_locals.php";
include "{$toplevel}web_assist/web_globals.php";
include "local_header.php";
?>

<?php 
page_section("multiply", "Multiply instructions");
?>

<p>

ARM multiply instructions are either plain multiplies or multiplies
with accumulate. They are emulated with three different internal instructions:
INIT, IMLA, and IMLB. For simplicity a multiply instruction always takes
16 cycles; early termination is not supported.
<br>
<br>
INIT[CC]A Rm, Rn/0<br>
IMLACPA Acc, Rs<br>
IMLBCPA Acc, #0 <i>(14 times)</i><br>
IMLBCPA[S] Acc, #0 -> Rd

</p>
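<p>
As a rough illustration of why a fixed number of steps suffices, the sketch below performs a 32-bit multiply-accumulate by consuming two bits of the multiplier per step, for sixteen steps in total. It is a plain shift-and-add model and makes no claim about the exact semantics of INIT, IMLA or IMLB (which may, for instance, use Booth recoding); the function name and the mapping in the comments are assumptions.
</p>

```python
MASK32 = 0xFFFFFFFF


def multiply_accumulate(rm: int, rs: int, rn: int = 0, steps: int = 16) -> int:
    """Return (rm * rs + rn) modulo 2**32 using two multiplier bits per step."""
    acc = rn & MASK32            # assumed role of INIT: accumulator starts at Rn (0 for MUL)
    multiplicand = rm & MASK32
    multiplier = rs & MASK32
    for _ in range(steps):       # assumed to correspond to the IMLA/IMLB steps
        acc = (acc + multiplicand * (multiplier & 3)) & MASK32
        multiplicand = (multiplicand << 2) & MASK32
        multiplier >>= 2
    return acc
```

<p>
Because every step is always executed, the cost is independent of the operand values, which matches the statement that early termination is not supported.
</p>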

<?php 
arm_emulation_table_start();
arm_emulation_table_instruction("MUL[CC] Rd, Rm, Rs", "\n    INIT[CC]A Rm, #0<br>\n    IMLACPA Acc, Rs<br>\n    IMLBCPA Acc, #0 <i>(14 times)</i><br>\n    IMLBCPA[F] Acc, #0 -> Rd", "No restrictions", "");
arm_emulation_table_instruction("MLA[CC] Rd, Rm, Rs, Rn", "\n    INIT[CC]A Rm, Rn<br>\n    IMLACPA Acc, Rs<br>\n    IMLBCPA Acc, #0 <i>(14 times)</i><br>\n    IMLBCPA[F] Acc, #0 -> Rd", "No restrictions", "");
arm_emulation_table_instruction("MUL[CC]S Rd, Rm, Rs", "\n    INIT[CC]A Rm, #0<br>\n    IMLACPA Acc, Rs<br>\n    IMLBCPA Acc, #0 <i>(14 times)</i><br>\n    IMLBCPAS[F] Acc, #0 -> Rd", "No restrictions", "<em>Differs from ARM - V is corrupted</em>");
arm_emulation_table_instruction("MLA[CC]S Rd, Rm, Rs, Rn", "\n    INIT[CC]A Rm, Rn<br>\n    IMLACPA Acc, Rs<br>\n    IMLBCPA Acc, #0 <i>(14 times)</i><br>\n    IMLBCPAS[F] Acc, #0 -> Rd", "No restrictions", "<em>Differs from ARM - V is corrupted</em>");
<?php 
page_section("notes", "Notes on instructions");
?>

Coprocessors are accessed through the postbus, which is a register set.

<p>

A set of registers marked as 'special' allows control of the sticky/unsticky flags, the condition-passed shadow size, endianness, and unaligned accesses.

<p>

The scheduler is also accessed as a set of registers, to allow for control of the threads. This provides mechanisms for moving to and from ARM and native mode.

<?php 
page_section("missing", "Missing instructions");
?>

Missing:

<ul>

<li>Memory commands (prefetch, flush, and such like)

<li>Coprocessor/memory DMA commands

<li>Zero-overhead loops

<li>Repeat count - can be set by writes to registers, but a dedicated instruction decode is wanted

</ul>
<tr>
<th>O
<td>Offset
<td>0=> offset of 1/2/4 (depending on access size), 1=> use SHF as the offset
</tr>

<tr>
<th>s
<td>Stack access
<td>1=> use stack locality for caching access, 0=> use default locality for caching access
</tr>

</table>

<?php 
page_section("encoding", "Encoding");
?>

<table border=1 class=data>
<tr>
<th>Mnemonic</th>
<th>Class</th>
<th>Subclass</th>
<th>Opts</th>
<th>CC</th>
<th>Rd</th>
<th>A</th>
<th>F</th>
</tr>

<tr>
arm_emulation_table_instruction("LDR[CC] PC, [Rn, #+/-imm]!", "IADD[CC]A/ISUB[CC]A Rn, #imm -> Rn<br>ILDRCP[S]F #0 (ACC) -> PC", $rn_not_pc, "{$repl_rn}<br>");
arm_emulation_table_instruction("LDR[CC] Rd, [Rn, +/-Rm]!", "IADD[CC]A/ISUB[CC]A Rn, Rm -> Rn<br>ILDRCP[S] #0 (ACC) -> Rd", $rd_rn_not_pc, "{$repl_rn_rm}<br>");
arm_emulation_table_instruction("LDR[CC] PC, [Rn, +/-Rm]!", "IADD[CC]A/ISUB[CC]A Rn, Rm -> Rn<br>ILDRCP[S]F #0 (ACC) -> PC", $rn_not_pc, "{$repl_rn_rm}<br>");
arm_emulation_table_instruction("LDR[CC] Rd, [Rn, +/-Rm, SHF #imm]!", "ILSL Rm, #imm<br>IADD[CC]A/ISUB[CC]A Rn, SHF -> Rn<br>ILDRCP[S] #0 (ACC) -> Rd", $rd_rn_not_pc, "{$repl_rn_rm}<br>");
arm_emulation_table_instruction("LDR[CC] PC, [Rn, +/-Rm, SHF #imm]!", "ILSL Rm, #imm<br>IADD[CC]A/ISUB[CC]A Rn, SHF -> Rn<br>ILDRCP[S]F #0 (ACC) -> PC", $rn_not_pc, "{$repl_rn_rm}<br>");
arm_emulation_table_instruction("LDR[CC] Rd, [Rn], #+/-imm", "ILDR[CC]A[S] #0 (Rn), #+/-imm -> Rd<br>MOVCP ACC -> Rn", $rd_rn_not_pc, "{$repl_rn}<br>");
arm_emulation_table_instruction("LDR[CC] PC, [Rn], #+/-imm", "ILDR[CC]A[S] #0 (Rn), #+/-imm -> PC<br>MOVCPF ACC -> Rn", $rn_not_pc, "{$repl_rn}<br>");
arm_emulation_table_instruction("LDR[CC] Rd, [Rn], +/-Rm", "ILDR[CC]A[S] #0 (Rn), +/-Rm -> Rd<br>MOVCP ACC -> Rn", $rd_rn_not_pc, "{$repl_rn_rm}<br>");
arm_emulation_table_instruction("LDR[CC] PC, [Rn], +/-Rm", "ILDR[CC]A[S] #0 (Rn), +/-Rm -> PC<br>MOVCPF ACC -> Rn", $rn_not_pc, "{$repl_rn_rm}<br>");
arm_emulation_table_instruction("LDR[CC] Rd, [Rn], +/-Rm, SHF #imm", "ILSL Rm, #imm<br>ILDR[CC]A[S] #0 (Rn), +/-SHF -> Rd<br>MOVCP ACC -> Rn", $rd_rn_not_pc, "{$repl_rn_rm}<br>");
arm_emulation_table_instruction("LDR[CC] PC, [Rn], +/-Rm, SHF #imm", "ILSL Rm, #imm<br>ILDR[CC]A[S] #0 (Rn), +/-SHF -> PC<br>MOVCPF ACC -> Rn", $rn_not_pc, "{$repl_rn_rm}<br>");
arm_emulation_table_end();
?>

<?php 
page_section("stores", "Stores");
arm_emulation_table_start();
arm_emulation_table_instruction("STR[CC] Rd, [Rn, #0]", "ISTR[CC][S] #0 (Rn) <- Rd", "", $repl_rn);
arm_emulation_table_instruction("STR[CC] Rd, [Rn, #0]!", "ISTR[CC][S] #0 (Rn) <- Rd", "", $repl_rn);
arm_emulation_table_instruction("STR[CC] Rd, [Rn], #0", "ISTR[CC][S] #0 (Rn) <- Rd", "", $repl_rn);
arm_emulation_table_instruction("STR[CC] Rd, [Rn, #+/-imm]", "IADD[CC]AC/ISUB[CC]AC Rn, #imm<br>ISTRCP[S] #0 (ACC, +/-SHF) <- Rd", "", $repl_rn);
arm_emulation_table_instruction("STR[CC] Rd, [Rn, +/-Rm]", "IADD[CC]AC/ISUB[CC]AC Rn, Rm<br>ISTRCP[S] #0 (ACC, +/-SHF) <- Rd", "", $repl_rn_rm);
arm_emulation_table_instruction("STR[CC] Rd, [Rn, +/-Rm, SHF #imm]", "ISHF[CC] Rm, #imm<br>ISTRCPA[S] #0 (Rn, +/-SHF) <- Rd", "", $repl_rn_rm);
arm_emulation_table_instruction("STR[CC] Rd, [Rn, #+/-imm]!", "IADDAC/ISUBAC Rn, #imm<br>ISTR[CC]A[S] #0 (ACC, +SHF) <- Rd -> Rn", "", $repl_rn);
arm_emulation_table_instruction("STR[CC] Rd, [Rn, +/-Rm]!", "IADDAC/ISUBAC Rn, Rm<br>ISTR[CC]A[S] #0 (ACC, +SHF) <- Rd -> Rn", "", $repl_rn_rm);
arm_emulation_table_instruction("STR[CC] Rd, [Rn, +/-Rm, SHF #imm]!", "ILSL Rm, #imm<br>ISTR[CC]A[S] #0 (Rn, +/-SHF) <- Rd -> Rn", "", $repl_rn_rm);
arm_emulation_table_instruction("STR[CC] Rd, [Rn], #+/-imm", "ISTR[CC][S] #0 (Rn) <- Rd<br>IADDCPA/ISUBCPA Rn, #imm -> Rn", "", $repl_rn);
arm_emulation_table_instruction("STR[CC] Rd, [Rn], +/-Rm", "ISTR[CC][S] #0 (Rn) <- Rd<br>IADDCPA/ISUBCPA Rn, Rm -> Rn", "", $repl_rn_rm);
arm_emulation_table_instruction("STR[CC] Rd, [Rn], +/-Rm, SHF #imm", "ISHF[CC] Rm, #imm<br>ISTRCPA[S] #0 (Rn), +/-SHF <- Rd -> Rn", "", $repl_rn_rm);
arm_emulation_table_end();
?>
item to note is the instruction set is a 2-register instruction set,
with separated memory and operation instructions, much like simple
RISC processors.

</p>

<p>

One difference, though, is that the native instruction set utilizes an
extension capability to provide for 3-register instructions, as well
as access to the fuller flavor of internal instructions supported by
the GIP pipeline.

</p>

<?php 
page_section("summary", "Summary");
?>

<p>

This documentation starts with <a href="overview.php">the overview</a>; pages then give <a href="encoding.php">details of the encodings</a>, and <a href="examples.php">some examples of use</a>.

</p>

<?php 
page_ep();
include "{$toplevel}web_assist/web_footer.php";
?>

</p>

<ol>
<li>Addition and subtraction</li>
<li>AND, OR, NOT, XOR</li>
<li>Left shift by arbitrary amount</li>
<li>Logical right shift by arbitrary amount</li>
<li>Arithmetic right shift by arbitrary amount</li>
<li>Rotate by arbitrary amount</li>
<li>Two-bit step of multiply-accumulate</li>
<li>Single bit step of divide</li>
<li>XOR and zero-bit count (top down)</li>
</ol>

<?php 
page_section("use", "Operation use");
?>

<p>

This section describes how some of the more complex ALU operations can be used.

</p>

<?php 
page_subsection("multiplication", "Multiplication");
?>

<p>A multiply instruction can be implemented with Booth's algorithm, calculating 
two bits of result at a time. Say a calculation of the form
r=x*y+z needs to be performed. The basic multiply step of 2 bits can be
<?php 
page_section("accumulator", "Use of the accumulator");
?>

<p>

The accumulator is used for two purposes in the ARM emulation system. Firstly it
is used to create intermediate results for instructions, such as addresses of loads and stores, 
without requiring an actual register. Secondly it is used for optimization of data processing,
in effect as a forwarding path within the ALU.

</p>

<?php 
page_section("alu_forwarding", "ALU forwarding");
?>

<p>

The accumulator in the internal GIP pipeline is utilized to enhance the
performance of ARM emulation. Simply put, it maintains a local copy of
a single ARM register, and the register number that it contains is tracked
by the ARM emulation hardware unit. Then, instead of issuing an internal instruction which
names that register in a register field, the internal instruction may use the accumulator
instead. Note that the value in the accumulator is very volatile, and may be used by
just a few instructions, and will be corrupted by many instructions, so the net
usage is not likely to be more than 20% of instructions. However, it may only
achieve a performance benefit in back-to-back instructions, as
with more spacing the register value is likely to be ready in the register file path, and the
'forwarding' path through the accumulator does not save cycles.
<?php

include "web_locals.php";
include "{$toplevel}web_assist/web_globals.php";
site_set_location("gip.instruction_set");
$page_title = "GIP Documentation";
include "{$toplevel}web_assist/web_header.php";
page_header("GIP Native Instruction Set");
page_sp();
?>

<?php 
page_section("examples", "Examples");
?>

<?php 
page_ep();
include "{$toplevel}web_assist/web_footer.php";
?>

next thread. It will start with the round-robin thread number given.

<p>

Also the scheduler has a bit indicating whether it is running preemptively or cooperatively.

<p>

The scheduler maintains the knowledge of the currently running thread.

<p>

The scheduler also determines what thread to run next, and registers that.

<?php 
page_section("operations", "Operations");
?>

<dl>

<dt>Clear flag

<dd>

ISCHDSEMBIC[CC][A] Rm/Imm [-> Rd]

<br>

This uses the semaphore flags in the mode currently specified for the
system, reads them, performs a BIC operation with the specified
operand, and can put the result in Rd. To clear a particular semaphore based on a register
The ALU generates a 'condition passed' indication from either current flags or from the stored flags. 
An ALU and data shifter instruction can store results and flags conditionally on this value, if desired.

<br>
The flags may be configured as 'sticky'; that is, they may be set by an operation
if desired, but not cleared.

</p>
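<p>
The sticky behaviour can be illustrated with a small sketch. The function name and the use of a 4-bit mask for the flags are assumptions made for the example only; the actual flag storage is not described here.
</p>

```python
def update_flags(current: int, produced: int, sticky: bool) -> int:
    """Combine stored flags with the flags produced by an operation.

    Flags are held as a bit mask (assumed layout, one bit per flag).
    In sticky mode an operation can set a flag but never clear one.
    """
    if sticky:
        return current | produced
    return produced
```

<p>
With sticky flags a sequence of operations can be run and the flags tested once at the end to see whether any operation in the sequence set them.
</p>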

<?php 
page_section("operands", "Operands to registers");
?>

<p>The ALU contains four potential operand sources: input register A, input register B, ALU accumulator ACC, and shifter result SHF.
</p>

<?php 
page_section("further_details", "Further details");
?>

<p>The <a href="operations.php">operations</a> document describes the operational capabilities required of the ALU and shifter, and discusses
how these requirements are derived from the ARM emulation. The <a href="dataflow.php">dataflow</a> document takes this information in another form, presenting
details as to how the data flows for each operation and what the shifter and ALU do, and how that ties back to ARM emulation. The <a href="implementation.php">implementation</a> document then gives details on the module and its implementation.
</p>

<?php 
page_ep();
include "{$toplevel}web_assist/web_footer.php";
?>

</table>

<?php 
page_section("modes", "Modes");
?>

The microkernel has a concept of which 'mode' the ARM is running in, but it is primitive. It has a flag that it keeps in memory (or register?) that is 1 for SVC mode, 0 for user mode.

<?php 
page_section("flags", "Flags");
?>

The microkernel maintains a single flag for interrupt enable. This is a flag which is in the scheduler, which may cause the microkernel to be scheduled when it changes from 0 (interrupts disabled) to 1 (interrupts enabled).

<?php 
page_section("primitives", "Primitives");
?>

The microkernel supports the following primitives:

<dl>

<dt>SWI invoked (r10-r14 = user mode register values, r15 = calling PC+8, r16 = instruction)

<dd>The highest-priority primitive; it is invoked generally from a hardware decode, and the microkernel expects r0 to r15 to be the requesting arguments and r16 to be the instruction that caused the invocation, so that it may be decoded from a table for despatch. The microkernel will preserve r0 to r15 in a fixed region of memory, and restart the ARM with r14 equal to the given r15, and use a despatch table for the start value based on r16. It will also mark itself as in 'SVC' mode, and clear the interrupts enabled flag.

<dt>Enter USER mode at address (r10-r14 = user mode register values, r15 = restart PC, r16 = instruction)

<dd>This is invoked by ARM code explicitly, and is effectively a kind of SWI. It takes a single argument, that of the address to start the ARM at. The microkernel will clear its 'SVC' mode flag, and restart the ARM mode code thread at the value of r15.

<dt>Handle interrupt
<?php

include "web_locals.php";
include "{$toplevel}web_assist/web_globals.php";
site_set_location("gip.alu.implementation");
$page_title = "GIP Documentation";
include "{$toplevel}web_assist/web_header.php";
page_header("GIP ALU and Shifter Implementation");
page_sp();
?>

This documentation matches version 1.9 of the gip_alu source code.

<?php 
page_section("module_definition", "Module definition");
?>

<p>
The ALU stage, then, requires the following controls and input data, which can be described as 'ports' to the ALU stage:
</p>

<?php 
page_subsection("inputs", "Inputs");
?>

<table border=1>
<tr>
<th>Port</th>
<th>Type</th>
<th>Details</th>
</tr>
The basic operation of a hardware interrupt thread is:

<ol>
<li>Wait for stuff on a register or something from an external device (or
other GIP)

<li>Get stuff from that register; if it has enough, or a packet, or something
then notify the microkernel that there is something to do by setting
a bit in the interrupt status register and by setting the hardware interrupt pending semaphore.

<li>Return, but keep giving it stuff.

</ol>


</p>

<?php 
page_section("thread_priorities", "Thread priorities");
?>

<p>
Hardware threads should be equal top priority, nonpreemptable.
Microkernel thread should be second priority, preemptable by hardware
threads.
ARM thread should be lowest priority, preemptable by any of the above.
</p>
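<p>
The priority rules above can be captured in a few lines. The sketch below is illustrative only: the names are hypothetical, and it assumes that a thread is preempted only by a strictly higher-priority class, which makes hardware threads nonpreemptable because nothing ranks above them.
</p>

```python
from enum import IntEnum
from typing import Iterable, Optional


class ThreadClass(IntEnum):
    # Lower value means higher priority.
    HARDWARE = 0
    MICROKERNEL = 1
    ARM = 2


def can_preempt(candidate: ThreadClass, running: ThreadClass) -> bool:
    """A candidate preempts only a thread of strictly lower priority."""
    return candidate < running


def pick_next(running: Optional[ThreadClass],
              ready: Iterable[ThreadClass]) -> Optional[ThreadClass]:
    """Return the thread class that should hold the pipeline next."""
    ready = list(ready)
    if not ready:
        return running
    best = min(ready)
    if running is None or can_preempt(best, running):
        return best
    return running
```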

<?php 
page_ep();
include "{$toplevel}web_assist/web_footer.php";
<?php

include "web_locals.php";
include "{$toplevel}web_assist/web_globals.php";
include "local_header.php";
?>

<?php 
page_section("flushing", "Pipeline flushing");
?>

<p>

Pipeline flushing is used when an ARM instruction stream is being
emulated where the predicted flow of instructions turns out to be
incorrect, and a different flow is required. A pipeline flush must
always be accompanied, in ARM emulation, with a corresponding write to
the program counter from the end of the pipeline. The order of the
write to the PC and the flush may vary depending on the contents of
the pipeline (in particular a load may complete after a following ALU
operation with flush), so the decoder must keep track of the flush and
write, and pair them up.

<p>

The actual operations that may invoke a flush are as follows:

<table class=data border=1>

<tr>
If carry is one then return that as result, else return Op2
</td>
<td>
&nbsp;
</td>
</tr>

</table>

<p>
The arithmetic unit produces a carry (carry out of the adder),
overflow (from the adder), a zero flag indicating its result is zero, and a negative flag indicating the top bit of its result is set.
</p>
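<p>
For reference, the four flags of a 32-bit add can be derived as in the following sketch. It is a generic model of an adder with carry in, not a description of the gip_alu source; the function name and the dictionary used for the flags are chosen for the example.
</p>

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a: int, b: int, carry_in: int = 0):
    """Return (result, flags) for a 32-bit add with carry in."""
    a &= MASK32
    b &= MASK32
    total = a + b + carry_in
    result = total & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    flags = {
        "C": bool(total >> 32),                          # carry out of the adder
        "V": sign_a == sign_b and sign_r != sign_a,      # signed overflow
        "Z": result == 0,                                # result is zero
        "N": bool(sign_r),                               # top bit of result set
    }
    return result, flags
```

<p>
A subtraction a - b can be modelled with the same function as add_with_flags(a, ~b &amp; MASK32, 1).
</p>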

<?php 
page_section("Result values", "Result values");
?>

<p>
Conditional execution may block execution; no effects will occur if a conditional operation is performed and its condition is not met.
<br>
With that in mind:
<ol>
<li>
The ALU result is the result of the logical or arithmetic operation performed: if a logical operation was performed then
the result comes from the logic unit (as do the N and Z flags; V is unchanged; C may come from the shifter's last carry out);
if an arithmetic operation was performed then the result comes from the arithmetic unit (as do N, Z, V, C). Note that for the shifter result
to be seen it must be moved through the logical or arithmetic path in a second instruction, as it is not muxed through to the output.
</li>
<li>
The shifter result is always written to the SHF register on execution
site_set_location("company.people");
$page_title = "Embisi Inc. People";
include "{$toplevel}web_assist/web_header.php";
page_header("Embisi People");
page_sp();
?>

Embisi is headed up by <a href="gavin_stark">Gavin J Stark</a> and <a href="john_croft">John Croft</a>.

<?php 
page_section("gavin", "<a href=gavin_stark>CEO - Gavin J Stark</a>");
?>

Gavin Stark is the CEO of Embisi Inc. Gavin was previously an architect for Network Processors at Intel, which he joined through its acquisition of Basis Communications, where he was CTO. Gavin has a PhD and BA from Cambridge University, England.

<?php 
page_section("john", "<a href=john_croft>Software - John Croft</a>");
?>

John Croft is the embedded software lead for Embisi, building on years
of experience in embedded computing and operating system design and
support. John's previous experience is at Cisco, Calista (acquired by
Cisco), and Madge Networks. John holds a BA from Cambridge University,
England.

<?php 
page_ep();
include "{$toplevel}web_assist/web_footer.php";
?>

handled.

<?php 
code_format("gip", "code/system_mode_no_interrupt_pending.s");
?>

<?php 
page_section("system_mode_interrupts", "System mode with interrupts pending");
?>

This is the code that occurs when the microkernel was idling, and when
either a hardware or software interrupt occurs.  It examines the
source of the interrupt, clearing the indication atomically. It then
despatches to the correct routines until all the interrupt sources are
handled.

<?php 
code_format("gip", "code/system_mode_interrupt_pending.s");
?>

<?php 
page_section("swi_entry", "SWI entry code");
?>

<?php 
code_format("gip", "code/swi_entry.s");
?>

<?php 
page_ep();
include "{$toplevel}web_assist/web_footer.php";
<?php

include "web_locals.php";
include "{$toplevel}web_assist/web_globals.php";
site_set_location($site_location);
$page_title = "GIP Documentation";
include "{$toplevel}web_assist/web_header.php";
page_header("ARM Emulation Microkernel");
page_sp();
?>

<?php 
page_section("communication", "Communication primitives");
?>

These communication primitives manipulate two basic structures: the
ARM mode code thread, and a local SRAM data block that contains the
ARM 'state', basically the banked registers (user mode
stack pointer (r13) and link register (r14), system stack pointer, interrupted flags (zcnv), and interrupted PC (r15)).

<p>

For the interthread management to work the registers R0-R15 must be
the same for the microkernel and the ARM mode code thread, and they
cannot be affected by any of the other threads running on the GIP.

<table border=1>

<tr><th>State</th><th>Name</th><th>Description</th></tr>

<tr><th>User R13</th>
signal_output("SrcAck", "n", "medium", "Is taking presented data; combinatorial (one per source)");
signal_output("TgtType", "2*n", "medium", "Type of transaction word");
signal_output("TgtData", "32", "medium", "Data for transaction word (to all n targets)");
signal_end();
signal_list("Ports on complex registered postbus router");
signal_input("n SrcType", "2*n", "late", "Type of transaction word (one per source)");
signal_input("n SrcData", "32*n", "late", "Data for transaction word (one per source)");
signal_input("m TgtAck", "m", "late", "Can take presented data");
signal_output("SrcAck", "n", "early", "Is taking presented data; combinatorial (one per source)");
signal_output("TgtType", "2*m", "early", "Type of transaction word");
signal_output("TgtData", "32*m", "early", "Data for transaction word");
signal_end();
?>

<?php 
page_section("simple_implementation", "Simple implementation");
?>

A simple implementation of a postbus router takes a small number of
sources, say 4, and distributes data to a small
number of targets, say 8. The simple
implementation has a state machine, and it multiplexes its incoming
4 sets of source data together to present to all 8 targets.

<p>

<?php 
code_format("cdl", "cdl/simple.cdl");
?>

<?php 
occur, whether they are forwards or backwards.

</p>

<p>

Conditional branches with link are emulated by inserting two internal
instructions in to the pipeline; one with the reverse condition of the
instruction with flush to force a branch to PC+8-4 if the condition is
not met, and the other with the current condition to set the link
register on a correctly predicted branch. The program counter is also
updated with the branch target address.

</p>
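<p>
The selection of internal instructions for a branch can be sketched as a small function. It simply reproduces the rows of the emulation details table as strings and is not the decoder's actual logic; the function name, its parameters, and the treatment of a zero offset as a forward branch are assumptions.
</p>

```python
# Each ARM condition paired with its logical inverse.
INVERSE = {
    "EQ": "NE", "NE": "EQ", "CS": "CC", "CC": "CS", "MI": "PL", "PL": "MI",
    "VS": "VC", "VC": "VS", "HI": "LS", "LS": "HI", "GE": "LT", "LT": "GE",
    "GT": "LE", "LE": "GT",
}


def emulate_branch(offset, cond=None, link=False, cond_known_met=False):
    """Return (internal_instructions, fetch_follows_branch).

    cond is None for an unconditional branch; cond_known_met marks a
    condition the decoder already knows will pass.
    """
    if cond is None or cond_known_met:
        # Guaranteed branch: only the link register write, if any, is needed.
        return (["SUB PC, #4 -> R14"] if link else []), True
    if link:
        # Conditional branch with link: flush back on the inverse condition,
        # plus the link register write.
        return [f"SUB{INVERSE[cond]}F PC, #4 -> PC", "SUB PC, #4 -> R14"], True
    if offset < 0:
        # Backward branch is predicted taken; undo it if the condition fails.
        return [f"SUB{INVERSE[cond]}F PC, #4 -> PC"], True
    # Forward branch is not predicted; take it only if the condition passes.
    return [f"MOV{cond}F #target -> PC"], False
```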

<?php 
page_section("emulation_details", "Emulation details");
arm_emulation_table_start();
arm_emulation_table_instruction("B {offset}", "", "Guaranteed branch", "Changes PC to PC+8+offset");
arm_emulation_table_instruction("B[CC] {offset}", "", "CC will be met<br>Guaranteed branch", "Changes PC to PC+8+offset");
arm_emulation_table_instruction("B[CC] {negative offset}", "SUB{!CC}F PC, #4 -> PC", "CC may not be met<br>Predicted branch", "Changes PC to PC+8+offset<br>If mispredicted then instruction will execute and reset PC to the correct path");
arm_emulation_table_instruction("B[CC] {positive offset}", "MOV{CC}F #target -> PC", "CC may not be met<br>Unpredicted branch", "If mispredicted then instruction will execute and set PC to branch target");
arm_emulation_table_instruction("BL {offset}", "SUB PC, #4 -> R14", "Guaranteed branch with link", "Changes PC to PC+8+offset");
arm_emulation_table_instruction("BL[CC] {offset}", "SUB PC, #4 -> R14", "CC will be met<br>Guaranteed branch with link", "Changes PC to PC+8+offset");
arm_emulation_table_instruction("BL[CC] {offset}", "SUB{!CC}F PC, #4 -> PC<br>SUB PC, #4 -> R14", "CC may not be met<br>Conditional branch with link", "Changes PC to PC+8+offset<br>If mispredicted then instruction will execute and reset PC to the correct path, and R14 will not be written");
arm_emulation_table_end();
page_ep();
include "{$toplevel}web_assist/web_footer.php";
?>

<li>Ability to do a thunking call (multicycle decode, single instruction)

<li>Ability to do a SWI (couple of moves, deschedule, assert event) (multicycle decode, single instruction)

<li>Ability to make return from interrupt happen... How?

<li>Force enable of hardware interrupts (macro)

<li>Restore interrupt enable (macro)

<li>Disable interrupts, returning previous state (macro)

</ul>

<?php 
page_section("thunking_libraries", "Thunking libraries");
?>

We can use r17 or some other register to contain a base address of
dynamic library thunking table assists; the dynamic mapping of
registers to support this in this particular way is patentable.

<p>

Best method is to have a small table of static data pointers whose
base address is in r17 indexed by local library number, and a global
table of entry points for functions in the libraries indexed by global
entry point number whose base is in r18.
We can have one instruction that loads 'r12' with 'r17, #...' and pc
with 'r18, #entryptr<<2' - we can use a quarter of the SWI instruction
decode.
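<p>

The effect of such a thunking instruction can be modelled as two table lookups. The sketch below is only an illustration under stated assumptions: memory is a word-addressed dictionary, the function and parameter names are hypothetical, and the offset into the static data table is passed in by the caller because its encoding is not given above.

```python
def thunk_call(regs: dict, memory: dict, static_offset: int, entry_number: int) -> dict:
    """Model a single thunking call and return the updated register set.

    regs must contain 'r17' (static data table base) and 'r18' (global entry
    point table base); memory maps byte addresses to 32-bit words.
    """
    updated = dict(regs)
    # r12 receives the library's static data pointer from the r17 table.
    updated["r12"] = memory[regs["r17"] + static_offset]
    # pc receives the entry point; entries are one word (4 bytes) apart.
    updated["pc"] = memory[regs["r18"] + (entry_number << 2)]
    return updated
```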
<?php

page_section("arm_emulation", "ARM emulation");
?>

<p>
The ARM instruction classes in general are as follows:
</p>

<table border=1>
<tr>
<th>Class</th>
<th>Description</th>
<th>ALU note</th>
</tr>

<tr>
<th>Data processing</th>
<td>ALU, Shift, and combined instructions</td>
<td>Varied emulation issues; see below</td>
</tr>

<tr>
<th>Multiply</th>
<td>Multiply and multiply accumulate</td>
<td>Utilizes INIT and MULST</td>
</tr>

<tr>
<th>Single Data Swap</th>
<td><i>Not supported</i></td>
<?php

include "web_locals.php";
include "{$toplevel}web_assist/web_globals.php";
site_set_location($site_location);
$page_title = "GIP Documentation";
include "{$toplevel}web_assist/web_header.php";
page_header("ARM Emulation Microkernel");
page_sp();
?>

<?php 
page_section("linux", "Linux operation");
?>

<p>

Linux supports a preempting kernel. In this model hardware interrupts
can occur at any point in time, including during system calls; system calls may then be preempted by the interrupt routines.

<p>

It is worth examining the aim of the kernel architecture to determine
how to manage hardware interrupts and system calls from a processor
emulation perspective. This document does so in more detail in the sections below, but the operation is summarized first.

<p>

<?php 
page_subsection("summary", "Summary of operation");
?>
<ul>

<li>This page includes a very basic summary of the microkernel operation.

<li>The basic requirements of the microkernel are derived from the way the <a href="linux.php">ARM port of Linux uses processor modes</a> 

<li>The operation of the microkernel is <a href="outline_operation.php">described in outline</a>

<li>The operation of the microkernel is <a href="detailed_operation.php">described in detail</a>

<li>The <a href="communication.php">communication primitives</a> required of the microkernel supply the mechanisms for interaction between ARM code and the microkernel.


<?php 
page_section("overview", "Overview");
?>

<p>

The microkernel provides the capability of a GIP to support a full-blown OS with hardware interrupts.

<p>

The basic implementation requires three classes of thread:

<ul>

<li>Microkernel thread (one instance)

<li>ARM mode code thread (one instance for all user and supervisor code)
<?php

include "web_locals.php";
include "{$toplevel}web_assist/web_globals.php";
include "local_header.php";
?>

<?php 
page_section("data_processing", "Data processing instructions");
?>

<p>

ARM data processing instructions can perform a shift and ALU operation in a single instruction, but the bulk
of the instructions are just ALU operations. The internal instruction set does not support a single instruction
to perform both a shift and an ALU operation, so pairs of internal instructions must be used for those ARM instructions. This
leads to two distinct classes of emulated data processing instructions:

<dl>

<dt>
Unshifted instructions

<dd>
ALU{cc} Rd, Rn, Rm
<br>
ALU{cc} Rd, Rn, #imm
<p>
These instructions do not require a shift operation, and so are emulated with a single internal instruction.
There are a few subclasses of these instructions; comparisons differ from instructions that write to registers,
and some instructions may set the flags while others do not.
<?php

include "web_locals.php";
include "{$toplevel}web_assist/web_globals.php";
include "local_header.php";
?>

<?php 
page_section("prefetch", "Prefetching in ARM emulation");
?>

<p>

The ARM emulation mode is designed to emulate ARM instructions at
around 1.5 clocks per instruction for data processing, loads and
stores, with obviously higher CPI for bulk transfers and
multiplies. It is important at these rates to keep the instruction
pipeline fed. This is particularly important as the distance to the
main ROM on a GIP system is many cycles (about ten), and there is no
level 1 cache. The prefetch unit performs speculative fetch of the
'next' instruction line to be used; this means that when instruction
at address 'n' is executed the prefetch unit will speculatively fetch
'n' plus one line, and as each line is 8 instructions this means
'n'+8. With 10 cycles of latency, instruction 'n'+8 should be ready in
about 10 cycles, whereas at 1.5 CPI it will be needed in about 12
cycles, so all is well. However, when a branch is taken there will be
a long penalty; unconditional branches will see a 10 cycle penalty,
for example. Compare this to the ARM, though, where the branch is not
detected until the execute stage (i.e. 2 cycles later); there the
penalty is slightly less, but not considerably so. The worst effect,
though, is on returning from a branch, as this cannot be concretely
<li>Can we be a PCMCIA target with the above pins?


</ul>


<p>

In its simplest form two endpoints can be tied together back-to-back;
that is, their respective outgoing interfaces can be wired without
logic to each other's incoming interfaces.

<p>

<?php 
page_section("baud_rate_generator", "Baud rate generators");
?>

The baud rate generators are individually configurable, and run from one of six potential clock sources:

<ul>

<li>
No clock (low power)

<li>
Internal clock

<li>
I/O clock pin 0