What Is A Shadow Register

Shadow Register

Interrupts

Lucio Di Jasio, in Programming 16-Bit PIC Microcontrollers in C (Second Edition), 2012

Notes for the Assembly Experts

The _ISRFAST macro can be used to declare a function as an interrupt service routine and to further specify that it will use an additional and convenient feature of the PIC24 architecture: a set of four shadow registers. By allowing the processor to automatically save the content of the first four working registers (W0–W3, i.e., the most frequently used ones) and most of the content of the SR register in special reserved locations, without requiring the use of the stack, the shadow registers provide the fastest possible interrupt response time. Naturally, since there is only one set of such registers, their use is limited to applications where only one interrupt will be serviced at any given time. This does not limit us to using only one interrupt in the entire application, but rather to using _ISRFAST only in applications that have all interrupts at the same priority level or, if multiple levels are in use, to reserving the _ISRFAST option for the interrupt service routines with the highest priority level.
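To see why a single shadow set restricts _ISRFAST to one priority level, here is a minimal Python sketch (a toy model, not PIC24 code). The class Pic24Model and the helpers fast_isr, clobber, run_single, and run_nested are hypothetical names invented for this illustration. The methods push_s and pop_s loosely mimic the PUSH.S and POP.S instructions, and the model saves the whole SR for simplicity.

```python
class Pic24Model:
    """Toy model of the context touched by a fast ISR (not cycle-accurate)."""

    def __init__(self):
        self.w = [0] * 16      # working registers W0..W15
        self.sr = 0            # status register (modelled as one integer)
        self.shadow = None     # the single shadow set: (W0..W3, SR)

    def push_s(self):
        # Save W0-W3 and SR, overwriting whatever the shadow set held before.
        self.shadow = (tuple(self.w[:4]), self.sr)

    def pop_s(self):
        # Restore W0-W3 and SR from the shadow set.
        regs, sr = self.shadow
        self.w[:4] = regs
        self.sr = sr


def fast_isr(cpu, body):
    """Run `body` as a fast ISR: shadow save, handler body, shadow restore."""
    cpu.push_s()
    body(cpu)
    cpu.pop_s()


def clobber(cpu):
    """Handler body that overwrites W0-W3 and SR."""
    cpu.w[:4] = [0xAAAA] * 4
    cpu.sr = 0xFF


def run_single():
    """One fast ISR: the interrupted context is restored intact."""
    cpu = Pic24Model()
    cpu.w[:4] = [1, 2, 3, 4]
    fast_isr(cpu, clobber)
    return cpu.w[:4]


def run_nested():
    """A fast ISR preempted by another fast ISR: the shadow set is reused."""
    cpu = Pic24Model()
    cpu.w[:4] = [1, 2, 3, 4]

    def outer(c):
        c.w[:4] = [10, 20, 30, 40]   # outer handler's own working values
        fast_isr(c, clobber)         # second save overwrites the shadow set

    fast_isr(cpu, outer)
    return cpu.w[:4]
```

In this toy model run_single() returns the original W0–W3 values, whereas run_nested() returns the outer handler's values, because the second save overwrote the only shadow copy of the interrupted context.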

URL:

https://www.sciencedirect.com/science/article/pii/B9781856178709000051

Intel® architecture bring-up

Contributed by Alexey Veselyi, Denis Vladimirov, in Software and System Development using Virtual Platforms, 2015

Compatibility Hardware Level for Legacy Drivers

Many hardware devices are designed for register-level compatibility with previous generations. They simply add additional capabilities, which are not mandatory to use by default. Even if the register map has changed, devices can provide register aliases or shadow registers to be backward-compatible with legacy (previous-generation) drivers. Even if new functions are not yet supported, the first step for post-BIOS setup is to check that the BIOS does not have regression errors and that those devices are still working with the operating system. Ensuring that backward compatibility works is very important for the Intel® Architecture ecosystem.

New devices can be easily modified to be supported by already working legacy drivers of the current operating system. The idea is to replace the PCI configuration device ID register of the new device with a value from the previous-generation platform, for example: dev->pci_config_device_id=0x1234. This makes an old driver identify the new hardware as old hardware, so it will use the new hardware device in the same way it would use the old hardware device.

The same approach of changing some registers can be used as a common technique for checking that the device is still working even though the driver has not yet been updated. For example, some device capability registers can be changed to their values from previous-generation platforms.

The general limitation of this approach is that the device ID (or any other register) might change after the reset routine (platform-based or device-based). Simics provides a capability to break on such events, giving the ability to change the registers every time a reset happens.
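The device ID override and its reset limitation can be sketched in a few lines of Python. This is an illustrative model only and does not use the Simics API: PciDeviceModel, reset_hooks, legacy_driver_probe, patch_id, and the native ID 0x5678 are hypothetical names and values, and only the value 0x1234 comes from the example in the text.

```python
LEGACY_DRIVER_IDS = {0x1234}   # ID recognised by the (hypothetical) legacy driver


class PciDeviceModel:
    """Hypothetical device model with a patchable device ID register."""

    def __init__(self, native_device_id):
        self.native_device_id = native_device_id
        self.pci_config_device_id = native_device_id
        self.reset_hooks = []   # callbacks run after every reset

    def reset(self):
        # A reset restores the native ID, discarding any manual patch,
        # and then runs the registered hooks.
        self.pci_config_device_id = self.native_device_id
        for hook in self.reset_hooks:
            hook(self)


def legacy_driver_probe(dev):
    """The old driver binds only to IDs it already knows."""
    return dev.pci_config_device_id in LEGACY_DRIVER_IDS


def patch_id(dev):
    """Make the new device report the previous-generation ID."""
    dev.pci_config_device_id = 0x1234


def demo():
    dev = PciDeviceModel(native_device_id=0x5678)
    results = [legacy_driver_probe(dev)]        # unpatched device
    patch_id(dev)
    results.append(legacy_driver_probe(dev))    # patched once by hand
    dev.reset()
    results.append(legacy_driver_probe(dev))    # patch lost on reset
    dev.reset_hooks.append(patch_id)
    dev.reset()
    results.append(legacy_driver_probe(dev))    # patch reapplied by the hook
    return results
```

Here demo() returns [False, True, False, True]: the manual patch is lost on reset, and registering the patch as a reset hook plays the role of breaking on the reset event and rewriting the register each time.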

Common examples of devices with such "compatibility" capabilities are network interface controllers (NICs), SATA controller devices, and different PCH legacy devices. The hardware-driven creation of a legacy shadow register layer to support previous driver versions, as well as Simics capabilities of manual register overloading, is often helpful for early OS bring-up. It is very useful to be able to "patch" such registers at runtime without having to update or rebuild the model.

URL:

https://www.sciencedirect.com/science/article/pii/B978012800725900010X

Register-Level Communication in Speculative Chip Multiprocessors

Milan B. Radulović, ... Veljko M. Milutinović, in Advances in Computers, 2014

3.1.1.6 Pinot

3.1.1.6.1 General Data

Pinot is a multithreading architecture that also supports control and data speculation. Its parallelizing tool identifies threads and can extract parallelism over a broad range of granularities without modifying the source code [23].

3.1.1.6.2 Architecture Details

Pinot consists of four processing elements (PEs) with their private L1 caches, connected by a shared bus to the memory system. The register files are distributed, and the transfer of register values between them is performed through a unidirectional ring. The ring is also used to transfer the start address of the spawned thread. The block diagram of a PE in Pinot is shown in Fig. 1.11. The light-grey-shaded areas enclosed with dashed lines represent the speculation support logic: the decoder logic for parallelizing instructions (fork, propagate, and cancel instructions are added to the existing instruction set architecture, ISA), the thread spawn control, the shadow register files, and the versioning cache for speculative multithreading. The shadow register files allow the preparation of live-in register values for the next thread to overlap with the execution of the current thread.

3.1.1.6.3 Register Communication

The communication of register values in Pinot is producer-initiated, since a consumer thread uses a register value that has been produced by its predecessors. The register values can be transferred without synchronization both at and after spawn time. The register numbers for the values that need to be transferred to the consumer thread are found in a register-list mask inside the fork instruction. In the case of register value transfer at spawn time, a sequencer inspects the register-list mask, fetches the required register values, and sends them via the ring to the consumer thread. When the register values are transferred after spawn time, the requested register values are sent to the consumer thread when they are produced.

A bit field in the fork instruction selects the propagation mode of the register values. If its value is after = 1 (propagation always), the register values found in the register-list mask are transferred to the consumer thread both at and after spawn time. However, if its value is after = 0 (no propagation), they are transferred only at spawn time. To avoid RAW violations, Pinot uses the propagate instructions to disable the transfer of register values labeled with identical register numbers [23].
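The two propagation modes can be illustrated with a small behavioural sketch in Python. It assumes a single producer and a single consumer and leaves out the ring, the sequencer, and the propagate instruction. ProducerThread, its methods, and demo are hypothetical names for this illustration, not part of the Pinot ISA.

```python
class ProducerThread:
    """Toy producer that forwards register values to one spawned consumer."""

    def __init__(self, regs):
        self.regs = dict(regs)     # register number -> value
        self.consumer = None       # consumer's live-in registers after fork
        self.mask = frozenset()    # register-list mask from the fork
        self.after = 0             # propagation mode bit from the fork

    def fork(self, mask, after):
        """Spawn the consumer and transfer the masked registers at spawn time."""
        self.mask = frozenset(mask)
        self.after = after
        self.consumer = {r: self.regs[r] for r in sorted(self.mask)}
        return self.consumer

    def write(self, reg, value):
        """Produce a value; forward it after spawn only if after == 1."""
        self.regs[reg] = value
        if self.consumer is not None and self.after == 1 and reg in self.mask:
            self.consumer[reg] = value


def demo(after):
    producer = ProducerThread({1: 10, 2: 20, 3: 30})
    consumer = producer.fork(mask={1, 2}, after=after)
    producer.write(2, 99)   # register in the mask, produced after spawn
    producer.write(3, 77)   # register outside the mask, never transferred
    return consumer
```

With after = 1 the consumer ends up with {1: 10, 2: 99}; with after = 0 it keeps the spawn-time values {1: 10, 2: 20}.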

URL:

https://www.sciencedirect.com/science/article/pii/B9780124202320000015

Hardware-Software Prototyping from LOTOS

LUIS SÁNCHEZ FERNÁNDEZ, ... WOLFGANG ROSENSTIEL, in Readings in Hardware/Software Co-Design, 2002

5 Prototyping Environment

This section describes the prototyping architecture we have used to build a prototype from a LOTOS specification and the associated design environment.

5.1 Hardware Modules

For prototyping we used the WEAVER environment designed by FZI Karlsruhe. This environment is designed especially for prototyping of entire hardware/software systems [15]. It is a modular and extensible system that can cope with high gate complexity. The different types of modules allow both the integration of standard or predefined components and adaptation to the needs of the application.

WEAVER uses a hardwired regular interconnection scheme. Thus, fewer signals have to be routed through programmable devices, which results in better performance. Nevertheless, it will not always be possible to avoid routing signals through the FPGAs. Bus modules are provided for the interconnection of modules. These offer 88-bit wide buses.

The basic module carries four Xilinx FPGAs for the configurable logic. On each side of the square base module a connector with 96 pins is located. Each FPGA is connected to one of these connectors. Also, every FPGA has a 75-bit link to two of its neighbours. A control unit is located on the basic module. It handles the programming and readback of particular FPGAs. A separate bus leads to the control unit of each base module in the system, which is programmed serially via this separate bus. The programming data is annotated with address information for the basic module and the particular FPGA on the basic module. In that way the control unit determines whether the programming data on the bus is relevant to its own basic module. If so, it forwards the programming data to the FPGA for which it is intended. Readback of configuration information and shadow registers is done in the same way.
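The address-based forwarding performed by each control unit can be sketched as follows. This is a minimal Python illustration under assumed conventions: the frame layout (module address, FPGA index, payload) and the names ControlUnit, on_bus, and broadcast are hypothetical, since the text does not specify the actual bus format.

```python
class ControlUnit:
    """Toy control unit of one basic module carrying four FPGAs."""

    def __init__(self, module_addr, num_fpgas=4):
        self.module_addr = module_addr
        # Configuration data received so far, one buffer per FPGA.
        self.fpgas = [bytearray() for _ in range(num_fpgas)]

    def on_bus(self, module_addr, fpga_index, data):
        """Accept the frame only if it is addressed to this basic module."""
        if module_addr != self.module_addr:
            return False                     # not ours: ignore the frame
        self.fpgas[fpga_index].extend(data)  # forward to the addressed FPGA
        return True


def broadcast(units, module_addr, fpga_index, data):
    """Every control unit sees the same frame on the shared programming bus."""
    return [unit.on_bus(module_addr, fpga_index, data) for unit in units]
```

For two modules with addresses 0 and 1, broadcast(units, 1, 2, b"\x01\x02") is accepted only by the second control unit, which appends the payload to the buffer of its FPGA 2.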

A RAM module with 4 MB of static RAM can be added in order to provide for the storage of global or local data. It can be plugged into a bus module, so that several modules may have access to it, or it can be connected directly to a base module. In the latter case this module has exclusive access to the memory. The requests from other modules must be routed through the FPGAs of the directly connected base module. This is time-consuming. The bus module makes it possible to plug modules together in a bus-oriented way. With a bus on each side of the base module, this architecture can be used to build multiprocessing systems with arbitrary structure.

To be able to integrate standard processors, these must be located on their own modules, which must meet the connection conventions of the other modules. For the work described in this paper, we used an evaluation board for the Hyperstone E1 32-bit processor [10]. A board carrying a PowerPC processor is also available.

Using these modules, arbitrary structures can be built. The structure depends on the application which is to be prototyped. Thus, a running system contains all modules needed, but not more. That is important in order to reduce overhead and to keep the price low. The basic module and a more complex structure are depicted in Figure 7. This picture shows an architecture in top view and in side view. The architecture is built in three dimensions and consists of four basic modules, a RAM module and an I/O module. On the right-hand side is a tower of three basic modules which are connected via three bus modules. On the left-hand side is another basic module with a local RAM module and an I/O module. The gross number of gates in this example is about 400K, which corresponds to approx. 120K usable gates. Thus, it is an example which would be sufficient for many applications.

5.2 Supporting Software

The supporting software can be split into synthesis software and software that supports the handling of the environment. For hardware synthesis, we rely on commercial software like Synopsys or any other tool for hardware synthesis, and the Xilinx tool suite. For large circuits, netlist partitioning software was developed. This tool allows for the partitioning of a large netlist onto a given device interconnection structure [28].

Software development is done in the particular software development environments provided by the processor vendors. In the case of the Hyperstone processor, software development is done with PC-based cross-development tools including a source-level debugger which allows for the debugging of programs running on the target processor.

The WEAVER environment includes a tool which provides a user interface to the accessible functionality: selection of particular boards and FPGAs on the architecture, and programming and readback of selected devices. The user has control over the circuit clock, which can be interrupted and selected from a set of different sources including clock divisors. The user can also set, enable and disable the RESET signal by software means.

URL:

https://www.sciencedirect.com/science/article/pii/B9781558607026500557

Hardware Trojans

Swarup Bhunia, Mark Tehranipoor, in Hardware Security, 2019

5.7.2 Design-for-Trust

As described in the previous section, detecting a quiet, low-overhead hardware Trojan is still very challenging with existing techniques. A potentially more effective way is to plan for the Trojan problem in the design phase through design-for-trust. These methodologies are classified into four classes according to their objectives.

5.7.2.1 Facilitate Detection

Functional Test: Triggering a Trojan from inputs and observing the Trojan effect from outputs are difficult due to the stealthy nature of Trojans. A large number of low-controllable and low-observable nets in a design significantly hinder the possibility of activating a Trojan. Salmani et al. [42] and Zhou et al. [43] attempted to increase controllability and observability of nodes by inserting test points into the circuit. Another approach proposed to multiplex the two outputs of a DFF, Q and $\overline{Q}$, through a two-to-one multiplexer, and select either of them. This extends the state space of the design and increases the possibility of exciting/propagating the Trojan effects to circuit outputs, making them detectable [18]. These approaches are beneficial not only to functional-test-based detection techniques but also to side-channel-based methods that need partial activation of Trojan circuitry.

Side-Channel Signal Analysis: A number of design methods have been developed to increase the sensitivity of side-channel-based detection approaches. The amount of current a Trojan can draw could be so small that it can be submerged into an envelope of noise and process variation effects; therefore, it may be undetectable by conventional measurement equipment. However, Trojan-detection capability can be greatly enhanced by measuring currents locally, and from multiple power ports/pads. Figure 5.10 shows the current (charge) integration methodology for detecting hardware Trojans presented in [44]. Salmani and Tehranipoor [45] proposed to minimize background side-channel signals by localizing switching activities within one region, while minimizing them in other regions through a scan-cell reordering technique. Additionally, some newly developed structures or sensors are implemented in the circuit to provide a higher detection sensitivity compared to conventional measurements. Ring oscillator (RO) structures [46], shadow registers [47], and delay elements [48] on a set of selected short paths are inserted for path delay measurements. RO sensors [49] and transient current sensors [50,51] are able to improve sensitivity to voltage and current fluctuations caused by Trojans, respectively. Likewise, integration of process variation sensors [52,53] can calibrate the model or measurement, and minimize the noise induced by manufacturing variations.
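The text does not describe how the shadow registers of [47] measure path delay, so the following Python sketch rests on an assumption: the shadow register samples the same data as the main register, but on a clock edge that arrives earlier by an adjustable skew, and the skew is swept until the shadow value stops matching. The function names, the integer picosecond units, and the numbers in the usage note are all hypothetical.

```python
def shadow_matches(path_delay, period, skew):
    """True if the data is already stable at the early (shadow) clock edge."""
    return path_delay <= period - skew


def measure_delay(path_delay, period, step):
    """Sweep the skew upward and estimate the path delay.

    Returns period minus the largest skew at which the shadow register still
    captured the correct value, i.e. an upper bound on the delay with a
    resolution of `step`. Returns None if the path fails even at zero skew.
    All quantities are integers in the same time unit (for example ps).
    """
    last_ok = None
    for skew in range(0, period, step):
        if shadow_matches(path_delay, period, skew):
            last_ok = skew
        else:
            break
    return None if last_ok is None else period - last_ok
```

With a 1000 ps clock and 50 ps steps, a 620 ps path is estimated at 650 ps, while the same path with 80 ps of extra load is estimated at 700 ps. A shift of this kind relative to a reference measurement is what a delay-based detection scheme would look for.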

Runtime Monitoring: As triggering all types and sizes of Trojans during pre-silicon and post-silicon tests is very difficult, runtime monitoring of critical computations can significantly increase the level of trust with respect to hardware Trojan attacks. These runtime monitoring approaches can utilize existing or supplemental on-chip structures to monitor chip behaviors [10,54] or operating conditions, such as transient power [50,55] and temperature [25]. They can disable the chip upon detection of any abnormalities or bypass it to allow reliable operation, albeit with some performance overhead. Jin et al. [56] present a design of an on-chip analog neural network that can be trained to distinguish trusted from untrusted circuit functionality based on measurements obtained via on-chip measurement acquisition sensors.

5.7.2.2 Prevent Trojan Insertion

These techniques consist of preventive mechanisms that try to thwart hardware Trojan insertion by attackers. To insert targeted Trojans, attackers typically need to understand the function of the design first. Attackers who are not in the design house usually identify circuit functionality by reverse engineering.

Logic Obfuscation: Logic obfuscation attempts to hide the genuine functionality and implementation of a design by inserting built-in locking mechanisms into the original design. The locking circuits become transparent, and the correct function appears, only when a correct key is applied. The increased complexity of identifying the genuine functionality without knowing the right input vectors can lower the ability of attackers to insert a targeted Trojan. For combinational logic obfuscation, XOR/XNOR gates could be introduced at certain locations in a design [57]. In sequential logic obfuscation, additional states are introduced in a finite state machine to conceal its functional states [19]. In addition, some techniques proposed to insert reconfigurable logic for logic obfuscation [58,59]. The design is functional when the reconfigurable circuits are correctly programmed by the design house or end-user.
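A small Python sketch can make the combinational case concrete. The three-input function, the positions of the key gates, and the names original, locked, and mismatches are made up for this illustration; they are not taken from [57].

```python
from itertools import product


def original(a, b, c):
    """Unprotected function: (a AND b) OR c."""
    return (a & b) | c


def locked(a, b, c, k0, k1):
    """Same function with two key gates; the correct key is (k0, k1) = (0, 1)."""
    n = (a & b) ^ k0        # XOR key gate on an internal net, transparent if k0 == 0
    out = n | c
    return 1 - (out ^ k1)   # XNOR key gate on the output, transparent if k1 == 1


def mismatches(k0, k1):
    """Count input patterns for which the locked circuit gives a wrong output."""
    return sum(
        locked(a, b, c, k0, k1) != original(a, b, c)
        for a, b, c in product((0, 1), repeat=3)
    )
```

With the correct key the locked circuit matches the original on all eight input patterns, while each of the three wrong keys corrupts the output for at least half of them. Mixing XOR and XNOR key gates means that the gate type alone does not reveal the value of the corresponding key bit.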

Camouflaging: Camouflaging is a layout-level obfuscation technique that creates indistinguishable layouts for different gates by adding dummy contacts and faking connections between the layers within a camouflaged logic gate [60,61] (shown in Fig. 5.11). The camouflaging technique can hinder attackers from extracting a correct gate-level netlist of a circuit from the layout through imaging different layers; in that way, the original design is protected from insertion of targeted Trojans. Additionally, Bi et al. [62] utilized a similar dummy contact approach and developed a set of camouflaging cells based on polarity-controllable SiNW FETs.

Functional Filler Cell: Since layout design tools are typically conservative in placement, they cannot fill 100% of the area with regular standard cells in a design. The unused spaces are usually filled with filler cells or decap cells that do not have any functionality. Filler cells are usually used during engineering change order (ECO) for improving debug and yield, whereas decaps are used to manage peak current in the chip, particularly in areas where instantaneous power is quite significant. Thus, the most covert way for attackers to insert Trojans in a circuit layout is replacing filler cells, and to some degree decaps, because removing these nonfunctional cells has the smallest impact on electrical parameters. The built-in self-authentication (BISA) approach fills all white spaces with functional filler cells during layout design [63]. The inserted cells are then connected automatically to form a combinational circuit that can be tested. A failure during later testing denotes that a functional filler has been replaced by a Trojan. Figure 5.12 shows the general BISA insertion flow. The white rectangles in Fig. 5.12 are the conventional ASIC design flow, whereas the dark rectangles represent the additional steps required for BISA. These steps are as follows: (1) preprocessing (gathering detailed information about the standard cell library), (2) unused space identification, (3) BISA cell placement, and (4) BISA cell routing.

5.7.2.3 Trustworthy Computing

The third class of design for trust is trustworthy computing on untrusted components. The difference between runtime monitoring and trustworthy computing is that trustworthy computing is tolerant to Trojan attacks by design. Trojan detection and recovery at runtime, acting as the last line of defense, is necessary, especially for mission-critical applications. Some approaches employ a distributed software scheduling protocol to achieve a Trojan-activation-tolerant trustworthy computing system in a multicore processor [64,65]. Concurrent Error Detection (CED) techniques can be adapted to detect malicious outputs generated by Trojans [66,67]. In addition, Reece et al. [68] and Rajendran et al. [67] proposed to use a diverse set of 3PIP vendors to prevent Trojan effects. The technique in [68] verifies the integrity of a design via comparison of multiple 3PIPs with another untrusted design performing a similar function. Rajendran et al. [67] utilize operation-to-3PIP-to-vendor allocation constraints to prevent collusion between 3PIPs from the same vendor.

For the design-for-trust (DFT) techniques that require circuitry to be added during the front-end design phase, the potential area and performance overheads are the primary concerns for designers. As the size of a circuit increases, the growing number of quiet (low controllability/observability) nets/gates will increase the complexity of processing and produce a large time/area overhead. Thus, the DFT techniques for facilitating detection are still difficult to apply to a large design that contains millions of gates. In addition, the preventive DFT techniques need to insert additional gates (logic obfuscation) or modify the original standard cells (camouflaging), which could degrade chip performance significantly and affect their acceptability in high-end circuits. The functional filler cells also increase power leakage.

5.7.2.4 Split-Manufacturing for Hardware Trust

Split manufacturing has been proposed recently as an approach to enable use of state-of-the-art semiconductor foundries while minimizing the risks to an IC design [69]. Split manufacturing divides a design into Front End of Line (FEOL) and Back End of Line (BEOL) portions for fabrication by different foundries. An untrusted foundry performs FEOL manufacturing (higher-cost), then ships wafers to a trusted foundry for BEOL fabrication (lower-cost; shown in Fig. 5.13). The untrusted foundry does not have access to the layers in BEOL and, thus, cannot identify the "safe" places within a circuit to insert Trojans.

Existing split manufacturing processes rely on either 2D integration [70–72], 2.5D integration [73], or 3D integration [74]. The 2.5D integration first splits a design into two chips fabricated by the untrusted foundry and then inserts a silicon interposer containing interchip connections between the chip and package substrate [73]. Therefore, a portion of the interconnections could be hidden in the interposer that is fabricated in the trusted foundry. In essence, it is a variant of 2D integration for split manufacturing. During 3D integration, a design is split into two tiers fabricated by different foundries. One tier is stacked on top of the other, and the upper tiers are connected with vertical interconnects called TSVs. Given the manufacturing barriers to 3D in industry, 2D- and 2.5D-based split manufacturing techniques are more realistic today. Vaidyanathan et al. [75] demonstrate the feasibility of split fabrication after Metal 1 (M1) on test chips and evaluated the chip performance. Although the split after M1 attempts to hide all intercell interconnections and can obfuscate the design effectively, it leads to high manufacturing costs. Additionally, several design techniques have been proposed to enhance a design's security with split manufacturing. Imeson et al. [76] present a k-security metric to select necessary wires to be lifted to a trusted tier (BEOL) to ensure security when the split is at a higher layer. However, lifting a large number of wires in the original design introduces large timing and power overhead and significantly impacts chip performance. An obfuscated BISA (OBISA) technique can insert dummy circuits into the original design to further obfuscate the design with split manufacturing [77].

URL:

https://www.sciencedirect.com/science/article/pii/B9780128124772000101