Wednesday, May 17, 2023

05-16-2023-2352 - hardware reset, index register, barrel processor, protection ring, kernel, register windows, page, virtual, asynchronous circuit, static versus dynamic logic, dynamic digital logic, clock signal, central processing unit, overclocking, memory indirect, addressing mode, orthogonal instruction set, vector, array, vector/array, IBM, ARM, operand, machine, machine language, addressing modes, loop, failure, control, alternate, alt, delete, task manager, frozen, ms-dos, windows, reboots, reboot, avionics, pc speaker, power on self test, powers, power-on self test, power-on self-test, post, pre-boot sequence, device, power cycle, power-on process, firmware, off, on, instrument, tool, supply, materials, procedure, scale, quality, quantity, resolution, density, equipment, etc. (draft)

A hardware reset or hard reset of a computer system is a hardware operation that re-initializes the core hardware components of the system, thus ending all current software operations in the system. This is typically, but not always, followed by booting of the system into firmware that re-initializes the rest of the system, and restarts the operating system.

Hardware resets are an essential part of the power-on process, but may also be triggered without power cycling the system: by direct user intervention via a physical reset button, by watchdog timers, or by software intervention that, as its last action, activates the hardware reset line (e.g., in a fatal error where the computer crashes).

User-initiated hard resets can be used to reset the device if the software hangs, crashes, or is otherwise unresponsive. However, data may become corrupted if this occurs.[1] Generally, a hard reset is initiated by pressing a dedicated reset button, or by holding a combination of buttons on some mobile devices.[2][3] Some devices have no dedicated reset button; instead, the user holds the power button to cut power and can then turn the device back on.[4] On some systems (e.g., the PlayStation 2 video game console), pressing and releasing the power button initiates a hard reset, and holding the button turns the system off.

Hardware reset in 80x86 IBM PC

The 8086 microprocessor provides a RESET pin that is used to perform the hardware reset. When a HIGH is applied to the pin, the CPU immediately stops and sets the major registers to these values:

Register                    Value
CS (Code Segment)           0xFFFF
DS (Data Segment)           0x0000
ES (Extra Data Segment)     0x0000
SS (Stack Segment)          0x0000
IP (Instruction Pointer)    0x0000

The CPU uses the values of CS and IP registers to find the location of the next instruction to execute. Location of next instruction is calculated using this simple equation:

Location of next instruction = (CS<<4) + (IP)

This implies that after the hardware reset, the CPU will start execution at the physical address 0xFFFF0. In IBM PC compatible computers, this address maps to BIOS ROM. The memory word at 0xFFFF0 usually contains a JMP instruction that redirects the CPU to the initialization code of the BIOS. This JMP instruction is the very first instruction executed after the reset.[5]
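The address calculation above can be checked with a few lines of Python. This is a minimal sketch: the function name `real_mode_physical_address` is made up for illustration, and the 20-bit mask models the 8086's 20 address lines.

```python
def real_mode_physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for an 8086 segment:offset pair."""
    # Shift the 16-bit segment left by 4 bits (multiply by 16), add the
    # 16-bit offset, and wrap the result to 20 address bits.
    return ((segment << 4) + offset) & 0xFFFFF


# Register values after reset, taken from the table above.
reset_address = real_mode_physical_address(0xFFFF, 0x0000)
print(hex(reset_address))  # 0xffff0
```

Only 16 bytes lie between 0xFFFF0 and the top of the 1 MB address space, which is why the code at that location is normally just a jump.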

Hardware reset in later x86 CPUs

Later x86 processors reset the CS and IP registers similarly; see Reset vector.

References


  1. Fredman, Josh. "Can a Forced Shutdown Ruin My Computer?". smallbusiness.chron.com. Retrieved 2019-12-13.

  2. "How to Hard Reset or Reboot any Android phone or tablet". trendblog.net. 2015-07-20. Retrieved 2019-12-13.

  3. "How to Force Restart the iPhone X When It's Acting Up". Gadget Hacks. Retrieved 2019-12-13.

  4. "What is a Reset Button?". www.computerhope.com. Retrieved 2019-12-13.

  5. The 80x86 IBM PC and Compatible Computers (Volumes I & II (4th Edition)), By Mohamed Ali Mazidi and Janice Gillispie Mazidi, Section 9.1, Page 241.


     https://en.wikipedia.org/wiki/Hardware_reset

    A power-on reset (PoR, POR) generator is a microcontroller or microprocessor peripheral that generates a reset signal when power is applied to the device. It ensures that the device starts operating in a known state.

    PoR generator

    In VLSI devices, the power-on reset (PoR) is an electronic device incorporated into the integrated circuit that detects the power applied to the chip and generates a reset impulse that goes to the entire circuit placing it into a known state.

    A simple PoR uses the charging of a capacitor, in series with a resistor, to measure a time period during which the rest of the circuit is held in a reset state. A Schmitt trigger may be used to deassert the reset signal cleanly, once the rising voltage of the RC network passes the threshold voltage of the Schmitt trigger. The resistor and capacitor values should be determined so that the charging of the RC network takes long enough that the supply voltage will have stabilised by the time the threshold is reached.

    One of the issues with using an RC network to generate the PoR pulse is the sensitivity of the R and C values to the power-supply ramp characteristics. When the power-supply ramp is rapid, the R and C values can be chosen so that the time to reach the switching threshold of the Schmitt trigger yields a sufficiently long reset pulse. When the power-supply ramp itself is slow, the RC network tends to charge up along with the ramp. So by the time the input Schmitt stage is fully powered up and ready, the input voltage from the RC network may already have crossed the Schmitt trigger point. This means that there might not be a reset pulse supplied to the core of the VLSI.
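For the fast-ramp case, the length of the reset pulse can be estimated from the standard RC charging curve, V(t) = Vdd * (1 - exp(-t / RC)). The sketch below assumes an ideal, instantaneous supply step, which is exactly the assumption that breaks down for the slow ramps described above. The function name and the component values are hypothetical.

```python
import math


def rc_reset_delay(r_ohms: float, c_farads: float,
                   v_supply: float, v_threshold: float) -> float:
    """Seconds until an RC node charging toward v_supply reaches v_threshold.

    Assumes the supply steps to v_supply instantly at t = 0.
    """
    if not 0 < v_threshold < v_supply:
        raise ValueError("threshold must lie strictly between 0 and the supply")
    # Solve v_threshold = v_supply * (1 - exp(-t / (R * C))) for t.
    return -r_ohms * c_farads * math.log(1.0 - v_threshold / v_supply)


# Hypothetical values: 100 kOhm, 1 uF, 3.3 V supply, threshold at 70% of supply.
delay = rc_reset_delay(100e3, 1e-6, 3.3, 0.7 * 3.3)
print(f"reset held for about {delay * 1000:.0f} ms")
```

With these values the delay is a little over one RC time constant (100 ms), so the supply must settle well within that window for the reset to be useful.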

    Power-on reset on IBM mainframes

    On an IBM mainframe, a power-on reset (POR) is a sequence of actions that the processor performs either due to a POR request from the operator or as part of turning on power. The operator requests a POR for configuration changes that cannot be recognized by a simple System Reset.


     https://en.wikipedia.org/wiki/Power-on_reset

    In computing, the reset vector is the default location a central processing unit will go to find the first instruction it will execute after a reset. The reset vector is a pointer or address, where the CPU should always begin as soon as it is able to execute instructions. The address is in a section of non-volatile memory initialized to contain instructions to start the operation of the CPU, as the first step in the process of booting the system containing the CPU.[citation needed]

    Processors

    • The reset vector for the 8086 processor is at physical address FFFF0h (16 bytes below 1 MB). The value of the CS register at reset is FFFFh and the value of the IP register at reset is 0000h to form the segmented address FFFFh:0000h, which maps to physical address FFFF0h.[1]
    • The reset vector for the 80286 processor is at physical address FFFFF0h (16 bytes below 16 MB). The value of the CS register at reset is F000h with the descriptor base set to FF0000h and the value of the IP register at reset is FFF0h to form the segmented address FF000h:FFF0h, which maps to physical address FFFFF0h in real mode.[2] This was changed to allow sufficient space to switch to protected mode without modifying the CS register.[3]
    • The reset vector for the 80386 and later x86 processors is physical address FFFFFFF0h (16 bytes below 4 GB). The value of the selector portion of the CS register at reset is F000h, the value of the base portion of the CS register is FFFF0000h, and the value of the IP register at reset is FFF0h[4] to form the segmented address FFFF0000h:FFF0h, which maps to the physical address FFFFFFF0h in real mode.[5][6]
    • The reset vector for PowerPC/Power ISA processors is at an effective address of 0x00000100 for 32-bit processors and 0x0000000000000100 for 64-bit processors.
    • For m68k architecture processors, address 0x0 holds the initial value of the interrupt stack pointer (not really a reset vector; it is used to initialize the stack pointer after reset) and address 0x4 holds the initial program counter (the reset vector proper).[7]
    • The reset vector for SPARC version 8 processors is at an address of 0x00;[8] the reset vector for SPARC version 9 processors is at an address of 0x20 for power-on reset, 0x40 for watchdog reset, 0x60 for externally initiated reset, and 0x80 for software-initiated reset.[9]
    • The reset vector for MIPS32 processors is at virtual address 0xBFC00000,[10] which is located in the last 4 Mbytes of the KSEG1 non-cacheable region of memory.[11] The core enters kernel mode both at reset and when an exception is recognized, and is hence able to map the virtual address to a physical address.[12]
    • The reset vector for the ARM family of processors is address 0x0[13] or 0xFFFF0000. During normal execution RAM is re-mapped to this location to improve performance, compared to the original ROM-based vector table.[14]
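The three x86 entries in the list above follow one pattern: the first fetch address is the code-segment base plus IP, and it lands 16 bytes below the top of the address space. A minimal sketch that checks this arithmetic, using the values quoted in the list (the table and function names are made up for illustration):

```python
# (code-segment base at reset, IP at reset, number of address bits)
X86_RESET_STATE = {
    "8086":  (0xFFFF << 4, 0x0000, 20),  # base is CS shifted left by 4
    "80286": (0xFF0000,    0xFFF0, 24),  # base comes from the descriptor
    "80386": (0xFFFF0000,  0xFFF0, 32),  # base portion of CS
}


def reset_vector(cpu: str) -> int:
    """Physical address of the first instruction fetched after reset."""
    base, ip, _bits = X86_RESET_STATE[cpu]
    return base + ip


for cpu, (_base, _ip, bits) in X86_RESET_STATE.items():
    top_of_memory = 1 << bits
    print(cpu, hex(reset_vector(cpu)), top_of_memory - reset_vector(cpu))
```

Each line of output ends in 16, the distance from the reset vector to the top of that processor's address space.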

    References


  1. "iAPX 86,88 User's Manual" (PDF). Intel. 1981. System Reset, p. 2-29, table 2-4. Retrieved April 15, 2018.

  2. "AMD 80286 Datasheet" (PDF). AMD. 1985. p. 13. the 286 begins execution in real mode with the instruction at physical location FFFFF0H.

  3. "iAPX 286 Programmer's Reference Manual" (PDF). Intel. 1983. Appendix D, iAPX 86/88 Software Compatibility Considerations, p. D-2. Retrieved April 15, 2018. After reset, CS:IP = F000:FFF0 on the iAPX 286. This change was made to allow sufficient code space to enter protected mode without reloading CS.

  4. "80386 Programmer's Reference Manual" (PDF). Intel. 1990. Section 10.1 Processor State After Reset, pages 10-1 - 10.3.

  5. "80386 Programmer's Reference Manual" (PDF). Intel. 1990. Section 10.2.3 First Instruction, p. 10-4. Retrieved November 3, 2013. Execution begins with the instruction addressed by the initial contents of the CS and IP registers. To allow the initialization software to be placed in a ROM at the top of the address space, the high 12 bits of addresses issued for the code segment are set, until the first instruction which loads the CS register, such as a far jump or call. As a result, instruction fetching begins from address 0FFFFFFF0H.

  6. "Intel® 64 and IA-32 Architectures Software Developer's Manual" (PDF). Intel. May 2012. Section 9.1.4 First Instruction Executed, p. 2611. Retrieved August 23, 2012. The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0h. This address is 16 bytes below the processor's uppermost physical address. The EPROM containing the software-initialization code must be located at this address.

  7. Labrosse, Jean J. (2008). Embedded Software. Newnes. ISBN 9780750685832.

  8. The SPARC Architecture Manual, Version 8. SPARC International. p. 75.

  9. The SPARC Architecture Manual, Version 9. SPARC International. pp. 109–112.

  10. "MIPS32 Architecture For Programmers; Vol III: The MIPS32 Privileged Resource Architecture" (PDF). MIPS Technologies.

  11. Noergaard, Tammy (2005-02-28). Embedded Systems Architecture: A Comprehensive Guide for Engineers and Programmers. Elsevier. ISBN 9780080491240.

  12. "MIPS32 M4K Processor Core Software User's Manual" (PDF). cdn2.imgtec.com. August 29, 2008. Archived from the original (PDF) on 2017-08-26.

  13. "5.9.1. Vector Table and Reset". Cortex-M3 Technical Reference Manual. Retrieved 2017-11-10.

  14. "Boot sequence for an ARM based embedded system -2 - DM". www.embeddedrelated.com. Retrieved 2017-11-10.


     https://en.wikipedia.org/wiki/Reset_vector

    A power-on self-test (POST) is a process performed by firmware or software routines immediately after a computer or other digital electronic device is powered on.[1]

    This article mainly deals with POSTs on personal computers, but many other embedded systems such as those in major appliances, avionics, communications, or medical equipment also have self-test routines which are automatically invoked at power-on.[2]

    The results of the POST may be displayed on a panel that is part of the device, output to an external device, or stored for future retrieval by a diagnostic tool. Since a self-test might detect that the system's usual human-readable display is non-functional, an indicator lamp or a speaker may be provided to show error codes as a sequence of flashes or beeps. In addition to running tests, the POST process may also set the initial state of the device from firmware.

    In the case of a computer, the POST routines are part of a device's pre-boot sequence; if they complete successfully, the bootstrap loader code is invoked to load an operating system.

    https://en.wikipedia.org/wiki/Power-on_self-test

    https://en.wikipedia.org/wiki/PC_speaker

     

    https://en.wikipedia.org/wiki/Avionics

     

    Control-Alt-Delete (often abbreviated to Ctrl+Alt+Del and sometimes called the "three-finger salute" or "Security Keys")[1][2] is a computer keyboard command on IBM PC compatible computers, invoked by pressing the Delete key while holding the Control and Alt keys: Ctrl+Alt+Delete. The function of the key combination differs depending on the context, but it generally interrupts or facilitates interrupting a function. For instance, in a pre-boot environment (before an operating system starts)[3][4][5] or in MS-DOS, Windows 3.0 and earlier versions of Windows, or OS/2, the key combination reboots the computer. Starting with Windows 95, the key combination invokes a task manager or security-related component that facilitates ending a Windows session or killing a frozen application.

    https://en.wikipedia.org/wiki/Control-Alt-Delete


    Index register display on an IBM 7094 mainframe from the early 1960s.

    An index register in a computer's CPU is a processor register (or an assigned memory location)[1] used for pointing to operand addresses during the run of a program. It is useful for stepping through strings and arrays. It can also be used for holding loop iterations and counters. In some architectures it is used for reading and writing blocks of memory. Depending on the architecture, it may be a dedicated index register or a general-purpose register.[2] Some instruction sets allow more than one index register to be used; in that case additional instruction fields may specify which index registers to use.[3]

    Generally, the contents of an index register are added to (in some cases subtracted from) an immediate address (that can be part of the instruction itself or held in another register) to form the "effective" address of the actual data (operand). Special instructions are typically provided to test the index register and, if the test fails, increment the index register by an immediate constant and branch, typically to the start of the loop. While processors that allow an instruction to specify multiple index registers normally add the contents together, IBM had a line of computers in which the contents were ORed together.[4]

    Index registers have proved useful for doing vector/array operations and in commercial data processing for navigating from field to field within records. In both uses, index registers substantially reduced the amount of memory used and increased execution speed.

    https://en.wikipedia.org/wiki/Index_register

     

    Addressing modes are an aspect of the instruction set architecture in most central processing unit (CPU) designs. The various addressing modes that are defined in a given instruction set architecture define how the machine language instructions in that architecture identify the operand(s) of each instruction. An addressing mode specifies how to calculate the effective memory address of an operand by using information held in registers and/or constants contained within a machine instruction or elsewhere.

    In computer programming, addressing modes are primarily of interest to those who write in assembly languages and to compiler writers. For a related concept, see orthogonal instruction set, which deals with the ability of any instruction to use any addressing mode.

    https://en.wikipedia.org/wiki/Addressing_mode#Memory_indirect

    History

    In early computers without any form of indirect addressing, array operations had to be performed by modifying the instruction address, which required several additional program steps and used up more computer memory,[5] a scarce resource in computer installations of the early era (as well as in early microcomputers two decades later).

    Index registers, commonly known as B-lines in early British computers, as B-registers on some machines and as X-registers[a] on others, were first used in the British Manchester Mark 1 computer, in 1949. In general, index registers became a standard part of computers during the technology's second generation, roughly 1954–1966. Most[b] machines in the IBM 700/7000 mainframe series had them, starting with the IBM 704 in 1954, though they were optional on some smaller machines such as the IBM 650 and IBM 1401.

    Early "small machines" with index registers include the AN/USQ-17, around 1960, and the 9 series of real-time computers from Scientific Data Systems, from the early 1960s.

    The 1962 UNIVAC 1107 has 15 X-registers, four of which were also A-registers.

    The 1964 GE-635 has 8 dedicated X-registers; however, it also allows indexing by the instruction counter or by either half of the A or Q register.

    The Digital Equipment Corporation (DEC) PDP-6, introduced in 1964, and the IBM System/360, announced in 1964, do not include dedicated index registers; instead, they have general-purpose registers (called "accumulators" in the PDP-6) that can contain either numerical values or addresses. The memory address of an operand is, in the PDP-6, the sum of the contents of a general-purpose register and an 18-bit offset and, on the System/360, the sum of the contents of two general-purpose registers and a 12-bit offset.[6][7] The compatible PDP-10 line of successors to the PDP-6, and the IBM System/370 and later compatible successors to the System/360, including the current z/Architecture, work in the same fashion.

    The 1969 Data General Nova and its successor, the Eclipse, and the 1970 DEC PDP-11 minicomputers also provided general-purpose registers (called "accumulators" in the Nova and Eclipse), rather than separate accumulators and index registers, as did their Eclipse MV and VAX 32-bit superminicomputer successors. In the PDP-11 and VAX, all registers could be used when calculating the memory address of an operand; in the Nova, Eclipse, and Eclipse MV, only registers 2 and 3 could be used.[8][9][10]

    The 1971 CDC STAR-100 has a register file of 256 64-bit registers, 9 of which are reserved. Unlike most computers, the STAR-100 instructions only have register fields and operand fields, so the registers serve more as pointer registers than as traditional index registers.

    While the Intel 8080 allowed indirect addressing via a register, the first microprocessor with a true index register appears to have been the 1974 Motorola 6800.

    In 1975, the 8-bit MOS Technology 6502 processor had two index registers 'X' and 'Y'.[11]

    In 1978, the Intel 8086, the first x86 processor, had eight 16-bit registers, referred to as "general-purpose", all of which can be used as integer data registers in most operations; four of them, 'SI' (source index), 'DI' (destination index), 'BX' (base), and 'BP' (base pointer), can also be used when computing the memory address of an operand, which is the sum of one of those registers and a displacement, or the sum of one of 'BX' or 'BP', one of 'SI' or 'DI', and a displacement.[12] The 1979 Intel 8088, and the 16-bit Intel 80186, Intel 80188, and Intel 80286 successors work the same. In 1985, the i386, a 32-bit successor to those processors, introducing the IA-32 32-bit version of the x86 architecture, extended the eight 16-bit registers to 32 bits, with "E" added to the beginning of the register name; in IA-32, the memory address of an operand is the sum of one of those eight registers, one of seven of those registers (the stack pointer is not allowed as the second register here) multiplied by a power of 2 between 1 and 8, and a displacement.[13]: 3-11–3-12, 3-22–3-23  The Advanced Micro Devices Opteron, the first model of which was released in 2003, introduced x86-64, the 64-bit version of the x86 instruction set; in x86-64, the general-purpose registers were extended to 64 bits, and eight additional general-purpose registers were added; the memory address of an operand is the sum of two of those 16 registers and a displacement.[14][13]: 3–12, 3–24
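The IA-32 rule described above (a base register, plus an index register multiplied by 1, 2, 4, or 8, plus a displacement) can be written out as a small calculation. This is only an illustrative sketch: the function name is hypothetical, register contents are passed in as plain integers, and the result is wrapped to 32 bits.

```python
def ia32_effective_address(base: int, index: int, scale: int,
                           displacement: int) -> int:
    """Compute base + index * scale + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be a power of 2 between 1 and 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Address of element 3 in an array of 4-byte words that starts 8 bytes
# past the address held in the base register (all values hypothetical).
address = ia32_effective_address(base=0x1000, index=3, scale=4, displacement=8)
print(hex(address))  # 0x1014
```

The scale factor is what lets the index register count elements rather than bytes, so the same loop counter can step through arrays of 1-, 2-, 4-, or 8-byte items.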

    The reduced instruction set computing (RISC) instruction sets introduced in the 1980s and 1990s all provide general-purpose registers that can contain either numerical values or address values. In most of those instruction sets, there are 32 general-purpose registers (in some of those instruction sets, the value of one of those registers is hardwired to zero) that could be used to calculate the operand address; they did not have dedicated index registers. In the 32-bit version of the ARM architecture, first developed in 1985, there are only 16 registers designated as "general-purpose registers", but only 13 of them can be used for all purposes, with register R15 containing the program counter. The memory address of a load or store instruction is the sum of any of the 16 registers and either a displacement or another of the registers with the exception of R15 (possibly shifted left for scaling).[15] In the 64-bit version of the ARM architecture, there are 31 64-bit general-purpose registers plus a stack pointer and a zero register; the memory address of a load or store instruction is the sum of any of the 31 registers and either a displacement or another of the registers.[16]

    Examples

    Here is a simple example of index register use in assembly language pseudo-code that sums a 100 entry array of 4-byte words:

       Clear_accumulator
       Load_index 400,index2  //load 4*array size into index register 2 (index2)
    loop_start : Add_word_to_accumulator array_start,index2   //Add to AC the word at the address (array_start + index2)
       Branch_and_decrement_if_index_not_zero loop_start,4,index2   //loop decrementing by 4 until index register is zero
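The same loop can be simulated in Python to make the index arithmetic concrete. This is a sketch under stated assumptions: memory is modeled as a byte string of 100 little-endian words holding 1 to 100, and the index is decremented before each load so that the offsets visited are 396 down to 0. That ordering is one reading of the pseudo-instructions, chosen so that every word is added exactly once.

```python
import struct

WORD_SIZE = 4
ARRAY_LEN = 100

# Hypothetical memory image: 100 little-endian 4-byte words holding 1..100.
memory = struct.pack("<100I", *range(1, 101))
array_start = 0  # byte address of the first word

accumulator = 0                   # Clear_accumulator
index2 = WORD_SIZE * ARRAY_LEN    # Load_index 400,index2
while index2 != 0:
    index2 -= WORD_SIZE           # step the index register down by one word
    # Effective address = array_start + contents of the index register.
    (word,) = struct.unpack_from("<I", memory, array_start + index2)
    accumulator += word

print(accumulator)  # 5050
```

The instruction that adds a word never changes; only the index register does, which is the saving over modifying the instruction's address field on every pass.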
    

    Notes


  a. The term X-registers was also used for accumulators on, e.g., the CDC 6600.

  b. The 702, 705 and 7080 did not have index registers.

    References


  • "Instructions: Index Words" (PDF). IBM 7070-7074 Principles of Operation (PDF). IBM. 1962. p. 11. GA22-7003-6.

  • "What Is an Index Register? (with picture)". EasyTechJunkie. Retrieved 2022-07-24.

  • IBM 709 Reference Manual, Form A22-6501-0, 1958, p. 12

  • IBM 7094 Principles of Operation (PDF). Fifth Edition. IBM. October 21, 1966. A22-6703-4.

  • IBM 1401 Reference manual, Form A24-1403-4, 1960, p. 77

  • Programmed Data Processor-6 Handbook (PDF). Digital Equipment Corporation. August 1964. pp. 20–22.

  • IBM System/360 Principles of Operation (PDF) (Eighth ed.). IBM. September 1968. pp. 8, 12–14. A22-6821-7.

  • Programmer's Reference Manual, Nova Line Computers (PDF). Data General. January 1976. pp. I-1, II-7.

  • Programmer's Reference Manual, Eclipse Line Computers (PDF). Data General. March 1975. pp. 1–1, 2–6.

  • ECLIPSE 32-Bit Systems Principles of Operation (PDF). Data General. August 1984. pp. 1–2.

  • "Registers - 6502 Assembly". www.6502.buss.hk. Retrieved 2022-07-24.

  • "The 8086 Family User's Manual" (PDF). Intel Corporation. October 1979. pp. 2–6, 2–68. Archived (PDF) from the original on April 4, 2018. Retrieved March 28, 2018.

  • Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture. Intel Corporation. March 2018. Chapter 3. Archived from the original on January 26, 2012. Retrieved March 19, 2014.

  • AMD64 Architecture Programmer's Manual Volume 1: Application Programming (PDF). Advanced Micro Devices. October 2020. pp. 3, 16.

  • ARM Architecture Reference Manual. Arm. 2005. pp. A2-6, A3-21.

     https://en.wikipedia.org/wiki/Index_register

     

    In digital electronics, especially computing, hardware registers are circuits typically composed of flip flops, often with many characteristics similar to memory, such as:[citation needed]

    • The ability to read or write multiple bits at a time, and
    • Using an address to select a particular register in a manner similar to a memory address.

    Their distinguishing characteristic, however, is that they also have special hardware-related functions beyond those of ordinary memory. So, depending on the point of view, hardware registers are like memory with additional hardware-related functions; or, memory circuits are like hardware registers that just store data.[citation needed]

    Hardware registers are used in the interface between software and peripherals. Software writes them to send information to the device, and reads them to get information from the device. Some hardware devices also include registers that are not visible to software, for their internal use.

    Depending on their complexity, modern hardware devices can have many registers. Standard integrated circuits typically document their externally-exposed registers as part of their electronic component datasheet.

    Functionality

    Typical uses of hardware registers include:

    • configuration and start-up of certain features, especially during initialization
    • buffer storage e.g. video memory for graphics cards
    • input/output (I/O) of different kinds
    • status reporting such as whether a certain event has occurred in the hardware unit, for example a modem status register or a line status register.[1]

    Accessing a hardware register in "peripheral units" (computer hardware outside the CPU) involves accessing its memory-mapped I/O address or port-mapped I/O address with a "load" or "store" instruction issued by the processor. Hardware registers are addressed in words, but sometimes use only a few bits of the word read from, or written to, the register.

    Commercial design tools simplify and automate memory-mapped register specification and code generation for hardware, firmware, hardware verification, testing and documentation.

    Registers can be read/write, read-only or write-only.

    Write-only registers are generally avoided. They are suitable for registers that cause a transient action when written but store no persistent data to be read, such as a 'reset a peripheral' register. They may be the only option in designs that cannot afford gates for the relatively large logic circuit and signal routing needed for register data readback, such as the Atari 2600 games console's TIA chip. However, write-only registers make debugging more difficult[2] and lead to the read-modify-write problem, so read/write registers are preferred. On PCs, write-only registers made it difficult for the Advanced Configuration and Power Interface (ACPI) to determine the device's state when entering sleep mode in order to restore that state when exiting sleep mode.[3]
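The read-modify-write problem can be illustrated with a small simulation. Changing one bit of a register normally means reading it, modifying the bit, and writing it back, which is impossible when the register cannot be read. A common workaround, which is not described in the text above and is shown here only as an assumption-laden sketch, is for software to keep a "shadow" copy of the last value written. Both class names are hypothetical.

```python
class WriteOnlyRegister:
    """Hypothetical 8-bit write-only control register."""

    def __init__(self) -> None:
        self._hardware_value = 0  # seen by the device, not by software

    def write(self, value: int) -> None:
        self._hardware_value = value & 0xFF

    def read(self) -> int:
        raise PermissionError("register is write-only")


class ShadowedRegister:
    """Keeps a software copy so single bits can be changed without a read."""

    def __init__(self, register: WriteOnlyRegister) -> None:
        self._register = register
        self.shadow = 0  # assumed to match the register's state at start-up

    def set_bits(self, mask: int) -> None:
        self.shadow |= mask & 0xFF
        self._register.write(self.shadow)

    def clear_bits(self, mask: int) -> None:
        self.shadow &= ~mask & 0xFF
        self._register.write(self.shadow)


control = ShadowedRegister(WriteOnlyRegister())
control.set_bits(0x01)
control.set_bits(0x80)
print(hex(control.shadow))  # 0x81
```

The shadow is only as good as the assumption that nothing else writes the register, which is part of why read/write registers are preferred.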

    Register varieties

    The hardware registers inside a central processing unit (CPU) are called processor registers.

    Strobe registers have the same interface as normal hardware registers, but instead of storing data, they trigger an action each time they are written to (or, in rare cases, read from). They are a means of signaling.

    Registers are normally measured by the number of bits they can hold, for example, an "8-bit register" or a "32-bit register".

    Designers can implement registers in a wide variety of ways.

    In addition to the "programmer-visible" registers that can be read and written with software, many chips have internal microarchitectural registers that are used for state machines and pipelining; for example, registered memory.

    Standards

    SPIRIT IP-XACT and DITA SIDSC XML define standard XML formats for memory-mapped registers.[4][5][6]

    References


  • Bose, Sanjay K. (2007). Hardware And Software Of Personal Computers. New Age International. p. 54. ISBN 9788122403039. Retrieved 2012-09-10. Once the INS 8250 has been properly initialized, we should make proper use of the Modem Status register (MSR), Line Status register (LSR) and the Interrupt Identification register (IIR) for controlling the device during actual operation.

  • Microsoft MVP: "If every hardware engineer just understood that... write-only registers make debugging almost impossible". http://www.microsoft.com/whdc/resources/MVP/xtremeMVP_hw.mspx#ETB

  • Microsoft "Guidelines for Bus and Device Specifications"

  • "blog entry on IP-XACT format". Archived from the original on 2009-03-09. Retrieved 2009-03-17.

  • IP-XACT Schema... see component XSD

     https://en.wikipedia.org/wiki/Hardware_register

    A barrel processor is a CPU that switches between threads of execution on every cycle. This CPU design technique is also known as "interleaved" or "fine-grained" temporal multithreading. Unlike simultaneous multithreading in modern superscalar architectures, it generally does not allow execution of multiple instructions in one cycle.

    As in preemptive multitasking, each thread of execution is assigned its own program counter and other hardware registers (each thread's architectural state). A barrel processor can guarantee that each thread will execute one instruction every n cycles, unlike a preemptive multitasking machine, which typically runs one thread of execution for tens of millions of cycles while all other threads wait their turn.

    A technique called C-slowing can automatically generate a corresponding barrel processor design from a single-tasking processor design. An n-way barrel processor generated this way acts much like n separate multiprocessing copies of the original single-tasking processor, each one running at roughly 1/n the original speed.[citation needed]
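The "one instruction every n cycles" guarantee can be seen in a toy simulation of the issue order. This is a minimal sketch, not a model of any real machine: the function name is made up, instructions are just labels, and a thread that has run out of instructions leaves its slot idle instead of giving it to another thread.

```python
def barrel_schedule(programs, cycles):
    """Issue at most one instruction per cycle, rotating through threads."""
    thread_count = len(programs)
    program_counters = [0] * thread_count  # one program counter per thread
    trace = []
    for cycle in range(cycles):
        thread = cycle % thread_count      # fixed rotation, like a barrel
        program = programs[thread]
        pc = program_counters[thread]
        if pc < len(program):
            trace.append((cycle, thread, program[pc]))
            program_counters[thread] = pc + 1
        else:
            trace.append((cycle, thread, None))  # slot stays reserved: idle
    return trace


programs = [["a0", "a1"], ["b0"], ["c0", "c1"]]
for cycle, thread, instruction in barrel_schedule(programs, 6):
    print(cycle, thread, instruction)
```

Thread 0 issues on cycles 0 and 3 regardless of what the other threads do, which is the property that makes the timing of each thread predictable.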

    History

    One of the earliest examples of a barrel processor was the I/O processing system in the CDC 6000 series supercomputers. These executed one instruction (or a portion of an instruction) from each of 10 different virtual processors (called peripheral processors) before returning to the first processor.[1] The CDC 6000 series article notes that "The peripheral processors are collectively implemented as a barrel processor. Each executes routines independently of the others. They are a loose predecessor of bus mastering or direct memory access."

    One motivation for barrel processors was to reduce hardware costs. In the case of the CDC 6x00 PPUs, the digital logic of the processor was much faster than the core memory, so rather than having ten separate processors, there are ten separate core memory units for the PPUs, but they all share the single set of processor logic.

    Another example is the Honeywell 800, which had 8 groups of registers, allowing up to 8 concurrent programs. After each instruction, the processor would (in most cases) switch to the next active program in sequence.[2]

    Barrel processors have also been used as large-scale central processors. The Tera MTA (1988) was a large-scale barrel processor design with 128 threads per core.[3][4] The MTA architecture has seen continued development in successive products, such as the Cray Urika-GD, originally introduced in 2012 (as the YarcData uRiKA) and targeted at data-mining applications.[5]

    Barrel processors are also found in embedded systems, where they are particularly useful for their deterministic real-time thread performance.

    An example is the XMOS XCore XS1 (2007), a four-stage barrel processor with eight threads per core. (Newer processors from XMOS also have the same type of architecture.) The XS1 is found in Ethernet, USB, audio, and control devices, and other applications where I/O performance is critical. When the XS1 is programmed in the 'XC' language, software controlled direct memory access may be implemented.

    Barrel processors have also been used in specialized devices such as the eight-thread Ubicom IP3023 network I/O processor (2004). Some 8-bit microcontrollers by Padauk Technology feature barrel processors with up to 8 threads per core.

    Comparison with single-threaded processors

    Advantages

    A single-tasking processor sits idle, doing no useful work, whenever a cache miss or pipeline stall occurs. Advantages of barrel processors over single-tasking processors include:

    • The ability to do useful work on the other threads while the stalled thread is waiting.
    • Designing an n-way barrel processor with an n-deep pipeline is much simpler than designing a single-tasking processor because a barrel processor never has a pipeline stall and doesn't need feed-forward circuits.
    • For real-time applications, a barrel processor can guarantee that a "real-time" thread can execute with precise timing, no matter what happens to the other threads, even if some other thread locks up in an infinite loop or is continuously interrupted by hardware interrupts.
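Two of the points above, the fixed timing of each thread and the absence of forwarding logic, can be illustrated with a small timing model. The model is a deliberate simplification and its assumptions are mine rather than the source's: one instruction issues per cycle in strict rotation, an instruction issued in cycle t occupies the pipeline through cycle t + depth - 1, and its result is usable from the following cycle. The function names are hypothetical.

```python
def issue_slot(thread_id, k, num_threads):
    """Cycle in which thread `thread_id` issues its k-th instruction (k starts at 0).

    The answer depends only on the thread's own position in the rotation,
    never on what the other threads are doing.
    """
    return k * num_threads + thread_id


def needs_forwarding(num_threads, pipeline_depth):
    """True if a thread could issue again before its previous instruction has finished.

    Under the simplified model, an instruction issued in cycle t finishes in
    cycle t + pipeline_depth - 1, and the same thread issues again in cycle
    t + num_threads. With num_threads >= pipeline_depth the earlier result is
    always complete, so no feed-forward path is required.
    """
    return num_threads < pipeline_depth


if __name__ == "__main__":
    print(issue_slot(thread_id=2, k=3, num_threads=8))
    print(needs_forwarding(num_threads=4, pipeline_depth=4))
    print(needs_forwarding(num_threads=2, pipeline_depth=4))
```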

    Disadvantages

    There are a few disadvantages to barrel processors.

    • The state of each thread must be kept on-chip, typically in registers, to avoid costly off-chip context switches. This requires a large number of registers compared to typical processors.
    • Either all threads must share the same cache, which slows overall system performance, or there must be one unit of cache for each execution thread, which can significantly increase the transistor count and thus the cost of such a CPU. However, in hard real-time embedded systems where barrel processors are often found, memory access costs are typically calculated assuming worst-case cache behavior, so this is a minor concern.[citation needed] Some barrel processors such as the XMOS XS1 do not have a cache at all.

    References

     

    1. CDC Cyber 170 Computer Systems; Models 720, 730, 750, and 760; Model 176 (Level B); CPU Instruction Set; PPU Instruction Set. Archived 2016-03-03 at the Wayback Machine. See page 2-44 for an illustration of the rotating "barrel".

    2. Honeywell 800 Programmers' Reference Manual (PDF). 1960. p. 17.

    3. "Archived copy". Archived from the original on 2012-02-22. Retrieved 2012-08-11.

    4. "Cray History". Archived from the original on 2014-07-12. Retrieved 2014-08-19.

    5. "Cray's YarcData division launches new big data graph appliance" (Press release). Seattle, WA and Santa Clara, CA: Cray Inc. February 29, 2012. Archived from the original on 2017-03-18. Retrieved 2017-08-24.

    External links

     https://en.wikipedia.org/wiki/Barrel_processor

     

    Intel's Nehalem microarchitecture contains multiple AGUs behind the CPU's reservation station.

    The address generation unit (AGU), sometimes also called address computation unit (ACU),[1] is an execution unit inside central processing units (CPUs) that calculates addresses used by the CPU to access main memory. By having address calculations handled by separate circuitry that operates in parallel with the rest of the CPU, the number of CPU cycles required for executing various machine instructions can be reduced, bringing performance improvements.[2][3]

    While performing various operations, CPUs need to calculate memory addresses required for fetching data from the memory; for example, in-memory positions of array elements must be calculated before the CPU can fetch the data from actual memory locations. Those address-generation calculations involve different integer arithmetic operations, such as addition, subtraction, modulo operations, or bit shifts. Often, calculating a memory address involves more than one general-purpose machine instruction, which do not necessarily decode and execute quickly. By incorporating an AGU into a CPU design, together with introducing specialized instructions that use the AGU, various address-generation calculations can be offloaded from the rest of the CPU, and can often be executed quickly in a single CPU cycle.[2][3]
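As a rough illustration of the arithmetic involved, the sketch below computes array-element addresses using the operations named above (addition, shifts, and modulo). The function names, the base-plus-scaled-index form, and the circular-buffer variant are assumptions chosen for the example; they do not describe the operations of any specific AGU.

```python
def element_address(base, index, element_size, displacement=0):
    """Address of array[index] using a multiply: base + index * size + displacement."""
    return base + index * element_size + displacement


def element_address_shift(base, index, log2_size, displacement=0):
    """Same calculation when the element size is a power of two, using a shift."""
    return base + (index << log2_size) + displacement


def circular_address(base, index, length, element_size):
    """Address inside a circular buffer: the index wraps around with a modulo."""
    return base + (index % length) * element_size


if __name__ == "__main__":
    print(hex(element_address(0x1000, 5, 4)))
    print(hex(element_address_shift(0x1000, 5, 2)))
    print(hex(circular_address(0x2000, 9, 8, 2)))
```

Done with general-purpose instructions, each of these takes a multiply or shift followed by one or two additions, which is the kind of multi-instruction sequence a dedicated unit can collapse into one step.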

    Capabilities of an AGU depend on a particular CPU and its architecture. Thus, some AGUs implement and expose more address-calculation operations, while some also include more advanced specialized instructions that can operate on multiple operands at a time.[2][3] Furthermore, some CPU architectures include multiple AGUs so more than one address-calculation operation can be executed simultaneously, bringing further performance improvements by capitalizing on the superscalar nature of advanced CPU designs. For example, Intel incorporates multiple AGUs into its Sandy Bridge and Haswell microarchitectures, which increase bandwidth of the CPU memory subsystem by allowing multiple memory-access instructions to be executed in parallel.[4][5][6]

    References


    1. Cornelis Van Berkel; Patrick Meuwissen (January 12, 2006). "Address generation unit for a processor (US 2006010255 A1 patent application)". google.com. Retrieved December 8, 2014.

    2. "Chapter 4: Address Generation Unit (DSP56300 Family Manual)" (PDF). ecee.colorado.edu. September 16, 1999. Retrieved December 8, 2014.

    3. Darek Mihocka (December 27, 2000). "Pentium 4: Round 1 – Intel blows the lead". emulators.com. Retrieved December 8, 2014.

    4. David Kanter (September 25, 2010). "Intel's Sandy Bridge Microarchitecture: Memory Subsystem". realworldtech.com. Retrieved December 8, 2014.

    5. David Kanter (November 13, 2012). "Intel's Haswell CPU Microarchitecture: Haswell Memory Hierarchy". realworldtech.com. Retrieved December 8, 2014.

    6. Per Hammarlund (August 2013). "Fourth-Generation Intel Core Processor, codenamed Haswell" (PDF). hotchips.org. p. 25. Retrieved December 8, 2014.

    External links

     https://en.wikipedia.org/wiki/Address_generation_unit

    In computing, autonomous peripheral operation is a hardware feature found in some microcontroller architectures to off-load certain tasks into embedded autonomous peripherals in order to minimize latencies and improve throughput in hard real-time applications as well as to save energy in ultra-low-power designs. 

    https://en.wikipedia.org/wiki/Autonomous_peripheral_operation

    In computer architecture, frequency scaling (also known as frequency ramping) is the technique of increasing a processor's frequency so as to enhance the performance of the system containing the processor in question. Frequency ramping was the dominant force in commodity processor performance increases from the mid-1980s until roughly the end of 2004.

    The effect of processor frequency on computer speed can be seen by looking at the equation for computer program runtime:

    $$\text{Runtime} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}$$

    where instructions per program is the total number of instructions executed in a given program, cycles per instruction is a program-dependent, architecture-dependent average value, and time per cycle is by definition the inverse of processor frequency.[1] An increase in frequency thus decreases runtime.
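A short worked example makes the relationship concrete: runtime is the product of instructions per program, cycles per instruction, and time per cycle, where time per cycle is one over the frequency. The instruction count, the cycles-per-instruction value, and the two frequencies below are made-up numbers chosen only for illustration, and the function name is hypothetical.

```python
def runtime_seconds(instructions, cycles_per_instruction, frequency_hz):
    """Runtime = instructions x cycles per instruction x time per cycle.

    Time per cycle is 1 / frequency, so the product becomes a division.
    """
    return instructions * cycles_per_instruction / frequency_hz


if __name__ == "__main__":
    # Hypothetical program: 2 billion instructions at an average of 1.5 cycles each.
    for freq in (2e9, 3e9):
        print(f"{freq / 1e9:.0f} GHz: {runtime_seconds(2e9, 1.5, freq):.2f} s")
```

With the other two factors held fixed, raising the frequency from 2 GHz to 3 GHz cuts the runtime by a third in this example.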

    However, power consumption in a chip is given by the equation

    $$P = C \times V^2 \times F$$

    where P is power consumption, C is the capacitance being switched per clock cycle, V is voltage, and F is the processor frequency (cycles per second).[2] Increases in frequency thus increase the amount of power used in a processor. Increasing processor power consumption led ultimately to Intel's May 2004 cancellation of its Tejas and Jayhawk processors, which is generally cited as the end of frequency scaling as the dominant computer architecture paradigm.[3]
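The same kind of worked example applies to power, taking the usual dynamic-power relation in which power is capacitance times voltage squared times frequency. The capacitance, voltage, and frequency values below are illustrative assumptions, not measurements of any real processor, and the function name is hypothetical.

```python
def dynamic_power_watts(capacitance_farads, voltage_volts, frequency_hz):
    """P = C x V^2 x F, with C the capacitance switched per clock cycle."""
    return capacitance_farads * voltage_volts ** 2 * frequency_hz


if __name__ == "__main__":
    # Hypothetical chip: 1 nF switched per cycle at 1.2 V.
    base = dynamic_power_watts(1e-9, 1.2, 2e9)
    faster = dynamic_power_watts(1e-9, 1.2, 3e9)
    print(f"2 GHz: {base:.2f} W, 3 GHz: {faster:.2f} W, ratio {faster / base:.2f}")
```

At a fixed voltage, power grows in direct proportion to frequency; the squared voltage term means any voltage increase compounds that growth.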

    Moore's Law was[4] still in effect when frequency scaling ended. Despite power issues, transistor densities were still doubling every 18 to 24 months. With the end of frequency scaling, the new transistors (no longer needed to support higher frequencies) are used to add extra hardware, such as additional cores, to facilitate parallel computing, a technique sometimes referred to as parallel scaling.

    The end of frequency scaling as the dominant cause of processor performance gains has caused an industry-wide shift to parallel computing in the form of multicore processors.

    References


    1. John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach. 3rd edition, 2002. Morgan Kaufmann, ISBN 1-55860-724-2. Page 43.

    2. J. M. Rabaey. Digital Integrated Circuits. Prentice Hall, 1996.

    3. Laurie J. Flynn. Intel Halts Development of 2 New Microprocessors. New York Times, May 8, 2004.

     https://en.wikipedia.org/wiki/Frequency_scaling

     In a computer instruction set architecture (ISA), an execute instruction is a machine language instruction which treats data as a machine instruction and executes it.

    It can be considered a fourth mode of instruction sequencing after ordinary sequential execution, branching, and interrupting.[1] Since it is an instruction that operates on other instructions like the repeat instruction, it has also been classified as a meta-instruction.[2]

    Computer models

    Many computer families introduced in the 1950s and 1960s include execute instructions: the IBM 709[1] and IBM 7090 (op code mnemonic: XEC),[3] the IBM 7030 Stretch (EX, EXIC),[4][1] the PDP-1/-4/-9/-15 (XCT),[5][6] the UNIVAC 1100/2200 (EXRI),[7] the CDC 924 (XEC),[8] the PDP-6/-10 (XCT), the IBM System/360 (EX),[9] the GE-600/Honeywell 6000 (XEC, XED),[10] and the SDS-9xx (EXU).[11][12]

    Fewer 1970s designs include execute instructions: the Nuclear Data 812 minicomputer (1971) (XCT),[13] the HP 3000 (1972) (XEQ),[14] and the Texas Instruments TI-990 (1975)[15] and its microprocessor version, the TMS9900 (1976) (X).[16] An execute instruction was proposed for the PDP-11 in 1970,[17] but never implemented for it[18] or its successor, the VAX.[19]

    Modern instruction sets do not include execute instructions because they interfere with pipelining, prefetching, and other optimizations.[citation needed]

    Semantics

    The instruction to be executed, the target instruction, may be in a register or fetched from memory. Some architectures allow the target instruction to itself be an execute instruction; others do not.

    The target instruction is executed as if it were in the memory location of the execute instruction. If, for example, it is a subroutine call instruction, execution is transferred to the subroutine, with the return location being the location after the execute instruction. However, some architectures implement variants of the execute instruction which inhibit branches.[1]
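The return-address behavior described above can be sketched with a toy accumulator machine. Everything here is hypothetical: the opcode names (ADD, CALL, RET, XEC, HALT), the tuple encoding, and the decision to reject nested execute instructions are assumptions for the example and do not model any of the real machines listed earlier.

```python
def run(memory, start=0, max_steps=1000):
    """Run a toy accumulator machine and return the accumulator at HALT."""
    acc, pc, return_stack = 0, start, []
    for _ in range(max_steps):
        instr = memory[pc]
        if instr[0] == "XEC":
            # Fetch the target from another address, but leave pc pointing
            # at the XEC itself: the target runs "as if" it were located here.
            instr = memory[instr[1]]
            if instr[0] == "XEC":
                raise ValueError("nested XEC is not allowed in this toy machine")
        op = instr[0]
        if op == "HALT":
            return acc
        if op == "ADD":
            acc += instr[1]
            pc += 1
        elif op == "CALL":
            return_stack.append(pc + 1)  # return to the location after the XEC
            pc = instr[1]
        elif op == "RET":
            pc = return_stack.pop()
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit reached")


PROGRAM = [
    ("ADD", 1),    # 0
    ("XEC", 5),    # 1: execute the instruction stored at address 5
    ("ADD", 100),  # 2: reached after the subroutine returns
    ("HALT",),     # 3
    ("HALT",),     # 4: unused
    ("CALL", 6),   # 5: target instruction, treated as data by the main flow
    ("ADD", 10),   # 6: subroutine body
    ("RET",),      # 7
]

if __name__ == "__main__":
    print(run(PROGRAM))
```

The subroutine called through the execute instruction returns to address 2, the location after the XEC, rather than to the location after the stored CALL at address 5.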

    The System/360 supports variable-length target instructions. It also supports modifying the target instruction before executing it. The target instruction must start on an even-numbered byte.[9]

    The GE-600 series supports execution of two-instruction sequences, which must be doubleword-aligned.[10]

    Some architectures support an execute instruction which operates in a different protection and address relocation mode. For example, the ITS PDP-10 paging device supports a privileged-mode XCTR 'execute relocated' instruction which allows memory reads, writes, or both to use the user-mode page mappings.[20] Similarly, the KL10 variant of the PDP-10 supports the privileged instruction PXCT 'previous context XCT'.[21]

    The execute instruction can cause several problems when one execute instruction points to another one and so on:

    • the processor may be uninterruptible for multiple clock cycles if the execute instruction cannot be interrupted in the middle of execution;
    • similarly, the processor may go into an infinite loop if the series of execute instructions is circular and uninterruptible;
    • if the execute instructions are on different swap pages, all of the pages need to be swapped in for the instruction to complete, which can cause thrashing.

    Similar issues arise with multilevel indirect addressing modes.
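One way to picture the chaining problems listed above is a resolver that follows a chain of execute instructions while guarding against circular or overly long chains. This is a hypothetical sketch only: the tuple encoding, the `max_depth` limit, and the function name are assumptions, and real hardware may handle these cases quite differently.

```python
def resolve_target(memory, address, max_depth=8):
    """Follow XEC -> XEC -> ... and return (final_instruction, addresses_touched).

    Raises RuntimeError on a circular chain or after more than `max_depth` hops.
    """
    touched = [address]
    instr = memory[address]
    while instr[0] == "XEC":
        address = instr[1]
        if address in touched:
            raise RuntimeError(f"circular execute chain via address {address}")
        if len(touched) > max_depth:
            raise RuntimeError("execute chain too long")
        touched.append(address)
        instr = memory[address]
    return instr, touched


if __name__ == "__main__":
    memory = {0: ("XEC", 10), 10: ("XEC", 20), 20: ("ADD", 1)}
    instr, touched = resolve_target(memory, 0)
    pages = {addr // 16 for addr in touched}  # hypothetical 16-word pages
    print(instr, touched, sorted(pages))
```

The list of touched addresses also shows which pages must be resident at the same time for the chain to complete, which is the source of the thrashing risk.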

    Applications

    The execute instruction has several applications:[1]

    • Functioning as a single-instruction subroutine without the usual overhead of subroutine calls; that instruction may call a full subroutine if necessary.[1]
    • Late binding
      • Implementation of call by name and other thunks.[1]
      • A table of execute targets may be used for dynamic dispatch of the methods or virtual functions of an object or class, especially when the method or function may often be implementable as a single instruction.[18]
      • An execute target may contain a hook for adding functionality or for debugging; it is normally initialized as a NOP which may be overridden dynamically.
      • An execute target may change between a fast version of an operation and a fully traced version.[22][23][24]
    • Tracing, monitoring, and emulation
      • This may maintain a pseudo-program counter, leaving the normal program counter unchanged.[1]
    • Executing dynamically generated code, especially when memory protection prevents executable code from being writable.
    • Emulating self-modifying code, especially when it must be reentrant or read-only.[17]
    • In the IBM System/360, the execute instruction can modify bits 8-15 of the target instruction, effectively turning an instruction with a fixed argument (e.g., a length field) into an instruction with a variable argument.
    • Privileged-mode execute instructions as on the KL10 are used by operating system kernels to execute operations such as block copies within the virtual space of user processes.
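The System/360 item above, modifying bits 8-15 of the target, can be sketched as a byte-level operation. The details beyond what the list states are assumptions on my part: that the modification is a bitwise OR with the low-order byte of a register, that naming register 0 means no modification, and that the sample target is an MVC instruction (opcode 0xD2) whose second byte is a length field. The function name and example operands are hypothetical.

```python
def ex_effective_instruction(target: bytes, r1_number: int, r1_value: int) -> bytes:
    """Return the instruction actually executed, leaving the stored target unchanged.

    Bits 8-15 are the second byte of the instruction. Assumption: they are
    ORed with the low-order byte of register R1, unless R1 is register 0.
    """
    if r1_number == 0:
        return target
    modifier = r1_value & 0xFF
    return bytes([target[0], target[1] | modifier]) + target[2:]


if __name__ == "__main__":
    # Hypothetical MVC with a zero length field; the register supplies the length code.
    stored = bytes.fromhex("D20010002000")
    executed = ex_effective_instruction(stored, r1_number=1, r1_value=0x0F)
    print(stored.hex().upper(), "->", executed.hex().upper())
```

Because the stored instruction is never rewritten, the same target can be reused with a different length on every execution, which is how a fixed-argument instruction behaves like one with a variable argument.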

    Notes


    1. Brooks, F.P. (March 1960). "The execute operations—a fourth mode of instruction sequencing". Communications of the ACM. 3 (3): 168–170. doi:10.1145/367149.367168. S2CID 37725430.

    2. Rossman, George E. (December 1975). "A Course of Study in Computer Hardware Architecture". IEEE Computer. 8 (12): 44–63. doi:10.1109/C-M.1975.218835. S2CID 977792. p. 50.

    3. Reference Manual, IBM 7090 Data Processing System (PDF). IBM. March 1962. p. 36.

    4. Reference Manual, 7030 Data Processing System (PDF). IBM. August 1961. p. 50.

    5. Programmed Data Processor-1 Manual (PDF). Digital Equipment Corporation. 1961. p. 14.

    6. Supnik, Bob. "Architectural Evolution in DEC's 18b Computers" (PDF). p. 8 (page numbers not shown).

    7. Univac 1107 Central Computer (PDF). November 1961. p. 12-1.

    8. Control Data 924 Computer Reference Manual (PDF). October 1962. p. 2-41.

    9. IBM System/360 Principles of Operation (PDF). IBM. 1964. p. 65. A22-6821-0.

    10. GE-635 System Manual (PDF). General Electric Computer Department. July 1964. p. A-5.

    11. SDS 92 Computer. Scientific Data Systems. June 1965. p. 2-6.

    12. SDS 940 Theory of Operation (PDF). Scientific Data Systems. March 1967. p. 2-12. SDS-98-01-26A.

    13. Principles of Programming the ND812 Computer (PDF). Nuclear Data, Inc. 1971. p. 4-4.

    14. HP 3000 Computer System: Machine Instruction Set Reference Manual (PDF). Hewlett-Packard. 1980. p. 2-31.

    15. 990 Computer Family Systems Handbook (PDF). Texas Instruments. p. 3-28.

    16. TMS 9900 Microprocessor Data Manual (PDF). Texas Instruments. December 1976. p. 24.

    17. van de Goor, Ad (September 21, 1970). "The Execute Instruction" (PDF). PDP-11/40 Technical Memorandum 18.

    18. PDP11 Processor Handbook: PDP11/04/34a/44/60/70 (PDF). Digital Equipment Corporation. 1979.

    19. VAX MACRO and Instruction Set Reference Manual (PDF). Compaq Computer Corporation. April 2001. AA-PS6GD-TE.

    20. Holloway, J. (February 20, 1970). "Hardware Memo 2 - PDP-10 Paging Device" (PDF). MIT AI Lab. p. 11.

    21. DECsystem-10, DECSYSTEM-20 Processor Reference Manual (PDF). Digital Equipment Corporation. June 1982. p. 2-63. AA-H391A-TK, AD-H391A-T1.

    22. Gabriel, Richard P. (August 1985). Performance and Evaluation of Lisp Systems (PDF). p. 32. ISBN 9780262070935.

    23. Pitman, Kent M. "PURE". The Revised Maclisp Manual, Sunday Morning Edition.

    24. Moon, David A. (April 1974). Maclisp Reference Manual (PDF). Revision 0. p. 181.
