Device IO

Anonymous

Just as the processor reads from and writes to memory to execute a process, it also needs to read from and write to I/O devices to exchange information with the user environment. The purpose of an I/O device is to process the data and present it in a human-comprehensible format through the device terminals. From the CPU perspective, a read/write to a device is no different from a read/write to a memory location, because both tasks essentially transport data from one location to another. However, the system views the data transfer task from the perspective of utilising system resources and treats memory R/W differently from device I/O.

The processor instruction to memory consists of a read/write argument referenced to a processor register followed by a memory address in hexadecimal format. Read is loading data to a register from a memory address. Write is storing data from a register to a memory address.

For example:

LOAD REG1, 0xB001D006;

is a memory-read instruction which loads the processor register REG1 with the contents of the memory location specified by the hexadecimal address 0xB001D006.

STORE REG1, 0xB001D006;

is a memory-write instruction which stores the contents of the processor register REG1 into the memory location specified by the hexadecimal address 0xB001D006.

**The CPU uses two control signals, IO/M and R/W, for memory read/write operations.** With IO/M=0, the CPU reserves the system bus for communicating with the memory; with IO/M=1, the CPU reserves the system bus for communicating with the I/O devices.

R/W=1 signals a LOAD and R/W=0 signals a STORE. The operation is a memory read/write instruction when IO/M=0 and a device input/output instruction when IO/M=1. The CPU puts the memory address on the address bus for both reads and writes, and puts the processor register data on the data bus only for a write operation.

For memory LOAD, the CPU generates IO/M=0 and R/W=1. The memory controller refers to the address on the address bus, copies the data from that memory location and puts it on the data bus for the CPU to load the data into one of its registers.

For memory STORE, the CPU generates IO/M=0 and R/W=0. The memory controller unloads the data from the data bus and stores it into the memory location as indicated by the address on the address bus.
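The four combinations of the two control signals can be tabulated with a small sketch. This is only an illustrative model of the description above; the function name `decode_bus_cycle` and the string labels are assumptions made for the example, not part of any real CPU interface.

```python
from typing import Tuple


def decode_bus_cycle(io_m: int, r_w: int) -> Tuple[str, str]:
    """Translate the IO/M and R/W signal levels into (target, operation)."""
    if io_m not in (0, 1) or r_w not in (0, 1):
        raise ValueError("control signals must be 0 or 1")
    # IO/M selects whether the bus cycle is for memory or for an I/O device.
    target = "device" if io_m == 1 else "memory"
    # R/W selects the direction: 1 loads into a register, 0 stores from it.
    operation = "LOAD" if r_w == 1 else "STORE"
    return target, operation


# Enumerate all four combinations of the two signals.
for io_m in (0, 1):
    for r_w in (0, 1):
        print(f"IO/M={io_m} R/W={r_w} ->", decode_bus_cycle(io_m, r_w))
```

Only the STORE cases would also place register data on the data bus; the sketch leaves the data path out to keep the signal decoding in focus.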

The data transfer process between a CPU and I/O device can be processor controlled by Programmed I/O or device controlled as in Direct Memory Access (DMA).

Programmed I/O

Programmed I/O follows the same principle as memory read/write, and indeed uses the same commands in memory-mapped IO systems, where the hex address refers to a device port instead of a memory location. However, the processor communicates with the memory via the address bus in sync with a clock, whereas it communicates with the device ports via the I/O bus through mutual notification and acknowledge control signals, because the internal electrical timings of devices cannot sync with a semiconductor clock. A clock ensures maximum data transfer speed when the communicating entities are pure semiconductors, which have fixed switching times of their own.

A device port has two interfaces: on one side it connects to the CPU through the system bus, and on the other side it connects to the device through the device controller. The CPU issues an instruction to a port through the I/O bus to send or receive data through the data bus. The controller is responsible for decoding the instruction: it first identifies the device from the address part of the instruction and then performs the data input/output operation in the internal mechanics of the device. For this, the port address is hard-coded in the controller hardware. The ports of all devices monitor the I/O bus, and a port unlocks when the address on the I/O bus matches its own port address.

Just as with memory read/write, device input/output is relative to a processor register. An input device (e.g. keyboard and mouse) has an input port and an output device (e.g. audio/video terminal) has an output port. Devices capable of both input and output (e.g. storage and network cards) have an input port as well as an output port. The same port (address) can serve as both an input and an output port; the two roles are distinguished by the data input/output part of the instruction and supported through complementary hardware logic in the device controller. This is the reason a device port is called an I/O port. A device can also have multiple OUT ports if it has to receive multiple OUT instructions from the processor as parameters for the desired device output.

A processor requests data from a device by issuing a LOAD / IN instruction on the I/O bus, which is recognised by the target IN port. The port controller retrieves the data from the device hardware, encodes it in the processor format and puts it on the data bus and sends a DATA-READY signal to the processor, which alerts the processor to load the data from the data bus into one of its registers. Conversely, the processor dispatches data from one of its registers to a device by putting the data on the data bus and issuing a STORE / OUT instruction on the I/O bus which is recognised by the target OUT port. The port controller unloads the data from the data bus and sends this to the device terminal in its native operational format.

The I/O instructions are actually polymorphic; the same instruction decodes to a different I/O action intrinsic to the working of the device. For example, Video-out and Disk-out both work on the same OUT instruction but accomplish different results. An OUT instruction to the video controller sends data from the processor to the video terminal, whereas an OUT instruction to the hard disc stores the data from the processor on the hard disc.
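The polymorphic behaviour of OUT can be sketched with two toy controllers that respond to the same call in different ways. Everything here is hypothetical: the class names, the port addresses 0x10 and 0x20, and the `out` helper are made up for illustration and do not correspond to real hardware.

```python
class VideoController:
    """Hypothetical video port: OUT appends a character to the display."""

    def __init__(self):
        self.screen = ""

    def out(self, value: int) -> None:
        self.screen += chr(value)


class DiskController:
    """Hypothetical disk port: OUT appends a byte to the stored data."""

    def __init__(self):
        self.stored = bytearray()

    def out(self, value: int) -> None:
        self.stored.append(value)


video, disk = VideoController(), DiskController()
# Port addresses are made up for this sketch.
io_bus = {0x10: video, 0x20: disk}


def out(port: int, value: int) -> None:
    """Deliver the same OUT instruction to whichever port matches the address."""
    io_bus[port].out(value)


for byte in b"Hi":
    out(0x10, byte)  # same instruction, displayed by the video controller
    out(0x20, byte)  # same instruction, stored by the disk controller

print(video.screen)
print(bytes(disk.stored))
```

The dictionary lookup stands in for the address match that each port performs on the I/O bus; the processor side issues one kind of instruction and the controller decides what it means.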

The I/O bus could be a dedicated set of I/O lines, as in **Port Mapped IO (PMIO)**, or it can be the same address bus whose address space is split into a memory address space and a device address space, as in Memory Mapped IO (MMIO). The RAM monitors the address bus and the device data ports monitor the I/O bus, and each recognises when it is addressed by the processor.

In an actual implementation of 32-bit Port Mapped IO, the IO/M control signal is used to orient 16 of the 32 address lines as the I/O bus. When the IO/M line is enabled (=1), the 16 address lines act as a device I/O bus, and when it is disabled (=0), all 32 address lines comprise the memory address bus. The 32-bit memory address space is mutually exclusive with the 16-bit I/O address space, meaning that the two address spaces cannot be active at the same point in time. Thus the instructions to access memory via the 32-bit address bus are different from the instructions to access a device via the 16-bit I/O bus. Device I/O is limited to two instructions, IN and OUT, providing simple load and store operations between CPU registers and device I/O ports. PMIO is based on the principle of time sharing of the address bus.
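A minimal sketch of the port-mapped arrangement, assuming a toy machine with a 32-bit memory space and a separate 16-bit port space. The class `PortMappedMachine` and its method names are hypothetical; they only mirror the LOAD/STORE versus IN/OUT split described above.

```python
class PortMappedMachine:
    """Toy machine with separate memory and I/O address spaces."""

    MEM_LIMIT = 1 << 32   # 32-bit memory address space
    PORT_LIMIT = 1 << 16  # 16-bit I/O address space

    def __init__(self):
        self.memory = {}  # sparse memory: address -> value
        self.ports = {}   # sparse port space: port -> value

    def store(self, address: int, value: int) -> None:
        """Memory write (IO/M = 0): all 32 address lines are used."""
        if not 0 <= address < self.MEM_LIMIT:
            raise ValueError("memory address out of range")
        self.memory[address] = value

    def load(self, address: int) -> int:
        """Memory read (IO/M = 0)."""
        if not 0 <= address < self.MEM_LIMIT:
            raise ValueError("memory address out of range")
        return self.memory.get(address, 0)

    def out(self, port: int, value: int) -> None:
        """Device write (IO/M = 1): only 16 address lines are used."""
        if not 0 <= port < self.PORT_LIMIT:
            raise ValueError("port address out of range")
        self.ports[port] = value

    def in_(self, port: int) -> int:
        """Device read (IO/M = 1)."""
        if not 0 <= port < self.PORT_LIMIT:
            raise ValueError("port address out of range")
        return self.ports.get(port, 0)


machine = PortMappedMachine()
machine.store(0x0060, 111)  # memory location 0x0060
machine.out(0x0060, 222)    # port 0x0060, a different address space
print(machine.load(0x0060), machine.in_(0x0060))
```

The same number, 0x0060, names two unrelated locations because the instruction (and hence the IO/M signal) decides which address space is meant.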

In 32-bit Memory Mapped IO, a 64 KB device address space is carved out of the processor’s address space of 2^32 bytes = 4 GB. Reads and writes to those special 65536 addresses (bytes) are interpreted as device I/O operations, while reads and writes to all other addresses are interpreted as memory operations. Thus the address part of the instruction resolves it to a memory access or an I/O device access. The same instructions that are used to read/write memory are also used to perform input/output with devices. The memory instructions doubling for device I/O are more versatile, allowing data transfer from device to register, register to device, and device to memory (Direct Memory Access). MMIO is based on the principle of space sharing of the address bus.
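The address-based decision in MMIO can be sketched as a simple range check. The location of the 64 KB device window is an assumption for this example (placed at the top of the 4 GB space as `DEVICE_BASE = 0xFFFF0000`); the text does not say where the window sits, and real systems place it wherever the platform defines.

```python
ADDRESS_SPACE = 1 << 32    # 4 GB processor address space
DEVICE_SIZE = 64 * 1024    # 64 KB device window (65536 addresses)
DEVICE_BASE = 0xFFFF0000   # hypothetical placement of the window


def resolve(address: int):
    """Classify an address as a memory access or a device access."""
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("address outside the 32-bit address space")
    if DEVICE_BASE <= address < DEVICE_BASE + DEVICE_SIZE:
        # Inside the window: report the offset as the device register number.
        return "device", address - DEVICE_BASE
    return "memory", address


print(resolve(0xB001D006))  # an ordinary memory address
print(resolve(0xFFFF0010))  # an address inside the assumed device window
```

The same LOAD or STORE instruction reaches either target; only the address decides which one responds.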

PMIO is actually a hardware implementation in the CPU design, whereas MMIO is a software implementation in the OS design. PMIO was a necessity in 16-bit systems with a scarce 64 KB address space, where reserving space for device I/O by MMIO was a constraint in some configurations. MMIO was less of a problem in 32-bit systems and is indeed advantageous in 64-bit systems, where the processor address space is practically unlimited. Nonetheless, PMIO could still be the preferred choice in systems where a dedicated I/O bus is required which cannot be time shared or space shared with the address bus. Whatever the choice, the system bus structure has to be designed and built accordingly.

Interrupt

The operating system uses the device driver program to request data from a device or dispatch data to a device. A command in the driver program translates to a sequence of low-level instructions in the processor’s assembly language to communicate with the I/O port of a device. Unlike normal I/O, where the processor requests data from a device, there are occasions when a device may have to initiate the request for data input to the processor. For example, a keyboard must notify the CPU when it needs to send data to an application. A device may also need to notify the CPU when it is free to communicate, in response to a CPU request for communication. Otherwise the CPU would have to check the status of the device at intervals, a process called polling, which degrades CPU performance. In these situations, the device notifies the processor asynchronously, by sending a special code called the Interrupt Request number (IRQ) to the processor via the I/O bus and raising the interrupt flag in the status register of the CPU. The processor refers to the interrupt vector table (IVT) in a reserved memory space, which holds the mapping of the IRQs to the address (vector) of the interrupt service routine (ISR) or interrupt handler. The purpose of the ISR is to provide a routine (much like a subroutine) specific to the type and relative importance of the interrupt and addressed to the device input port to resolve the interrupt condition. Depending on the priority of the interrupt, the processor either suspends its currently executing program, saving its state, and invokes the ISR to complete the device input request, or keeps the request pending until it completes its current activities. On completion of the ISR, the processor clears the interrupt flag and resumes operation of the program which was suspended. Interrupt is programmed I/O with communication initiated by the device.
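The lookup-and-dispatch sequence can be sketched as a table of handlers indexed by IRQ number. This is a simplified model: the IRQ numbers, handler names and the dictionary used as CPU state are all hypothetical, and interrupt priority is left out.

```python
def keyboard_isr(cpu):
    """Hypothetical handler: read the pending key from the keyboard port."""
    cpu["log"].append("keyboard input serviced")


def disk_isr(cpu):
    """Hypothetical handler: acknowledge a completed disk transfer."""
    cpu["log"].append("disk transfer acknowledged")


# Interrupt vector table: IRQ number -> interrupt service routine.
# The IRQ numbers are illustrative only.
IVT = {1: keyboard_isr, 14: disk_isr}


def service_interrupt(cpu, irq):
    """Model one interrupt: save state, run the ISR, restore state."""
    cpu["interrupt_flag"] = True   # raised by the device
    saved_pc = cpu["pc"]           # save the state of the running program
    IVT[irq](cpu)                  # look up the vector and invoke the ISR
    cpu["interrupt_flag"] = False  # cleared on completion of the ISR
    cpu["pc"] = saved_pc           # resume the suspended program


cpu = {"pc": 0x4000, "interrupt_flag": False, "log": []}
service_interrupt(cpu, 1)
print(cpu["log"], hex(cpu["pc"]))
```

A fuller model would compare the IRQ's priority with the current activity before dispatching, which is the deferral case described above.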

Bus Architecture

The computer architecture demarcates the communication strategy in terms of speed, thereby defining two bridges, one for high-speed data transfer and another for low-speed data transfer. The Northbridge chipset acts as an interface for communication between the CPU and the high-speed device controllers such as the memory, clock, integrated VGA or a graphics card on an Accelerated Graphics Port (AGP) or PCIe slot, and Gigabit Ethernet, all attached directly to the motherboard. The CPU and these high-speed controllers connect to the Northbridge by the Front Side Bus (FSB), whose track length is minimised so that it can serve as an ultra-fast channel for data transport. The CPU internally connects to its L1 and L2 cache by the Back Side Bus (BSB), which works independently of the FSB and operates at the speed of the processor, thus delivering the peak speed available in the system. In comparison, the rated speed of the FSB is less than half the speed of the BSB. The Northbridge chip is now being integrated with the CPU for further performance enhancement.

The Southbridge chipset acts as an interface for communication between the CPU and the slower I/O device controllers. The devices connect by data cables to the external I/O ports on the motherboard, such as the USB, SATA, Ethernet, 3.5mm audio and legacy (PS/2, serial, parallel) ports, and these ports connect to the CPU by the I/O Bus (IOB). The PCI slot is an internal I/O port which can be loaded with an optional add-on LAN / modem / sound / storage controller card.

The bridges serve as a gateway for the CPU to select either the FSB or the IOB for memory/device communication. Instruction execution is fastest from the processor cache via the BSB, followed by RAM via the FSB, and slowest through I/O devices via the IOB. The FSB and IOB are actually different implementations of the same system bus, comprising the address lines, data lines and control lines connecting to the CPU pins. This implies that the Northbridge and the Southbridge are themselves connected to one another, and this path is used for direct communication between the high-speed and low-speed devices.

Disk I/O by DMA

The hard disc space is addressed using Logical Block Addressing (LBA), in which the storage media is logically organised into sectors of 512 bytes. Each sector has a base address which is linear and continuous (LBA0, LBA1, LBA2 … LBA4294967295), and its data is accessed as a block and not as individual bytes.

In a 32-bit system, 2^32 = 4294967296 addresses are possible, which allows 4294967296 * 512 bytes = 2 TB of storage space to be addressed.
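The capacity figure can be checked with a few lines of arithmetic. The variable names are chosen for this example only; the result is exactly 2 TB when a terabyte is counted in binary units (2^40 bytes).

```python
SECTOR_SIZE = 512   # bytes per sector
ADDRESS_BITS = 32   # width of the sector address

sector_count = 2 ** ADDRESS_BITS          # number of addressable sectors
last_lba = sector_count - 1               # sectors are numbered from LBA0
capacity_bytes = sector_count * SECTOR_SIZE

print(sector_count)             # 4294967296
print(last_lba)                 # 4294967295
print(capacity_bytes)           # 2199023255552
print(capacity_bytes // 2**40)  # 2
```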

A processor read/write instruction to a hard disc can address a Disk I/O port but cannot address the sectors in a hard disc. The solution is to send a read instruction addressed to a sector directly to the Disk I/O port, which is accomplished by the Direct Memory Access (DMA) I/O method. In general, DMA enables the transfer of large blocks of data between an I/O device and memory, where the processor only initiates the data transfer and is not involved in the transfer itself. This ensures that the processor is not unnecessarily tied down in the data transfer task between a device and memory, as in Programmed I/O, but remains free to do other useful tasks while the data transfer is in progress. The processor must relinquish control of the system bus once a DMA controller is delegated the task of transferring data from/to memory. On completing the data transfer, the DMA controller sends an interrupt signal to the processor. The interrupt alerts the processor to resume normal operation by reclaiming control of the system bus.

For Disk I/O, a DMA controller is made to interface with a hard disc, where the DMA controller serves as an I/O device to the processor but acts as a processor when communicating with the hard disc and memory. The processor communicates with the DMA controller via the I/O bus just like any other I/O device, and the DMA controller communicates with the Disk controller I/O port via internal lines designed into the interface. This staggered addressing arrangement permits the DMA controller to send read/write instructions to the Disk I/O port via the internal lines without having to address or decode the port, thereby freeing the address part of the instruction to reference a sector in the hard disc instead. The hard disc controller at the I/O port is responsible for decoding the instruction and sending/receiving the sector data to/from the DMA controller over the data bus.

The DMA controller stores the PMIO/MMIO read-write instructions of the processor in a ROM to communicate with the memory and hard disc. The DMA controller has multiple registers acting as OUT ports, which are written by the processor to define the data source address, the data destination address, the size of the data to be transferred and the read/write operation that needs to be performed. The DMA controller uses these parameters to frame an I/O instruction pair to either read from the hard disc and write to memory, or read from the memory and write to the hard disc. Read/write to the hard disc by DMA is I/O access but does not involve the I/O bus, whereas read/write to memory is memory access via the address bus. Both PMIO and MMIO can be used for DMA access, although MMIO is a simpler approach.

For example, when data is to be read from the hard disc and written to memory, the processor, under instruction of the disc driver program, initialises the DMA controller registers with the sector address (LBA), the number of sectors to read and the virtual memory page address, and asserts the memory-write and disk-read bits. The virtual memory address is an index into the page table array, which serves as a memory pointer referencing the PTE containing the physical memory page address.

The DMA controller generates a disk-read instruction [e.g. LOAD REG2, 0x51726;] and sends this to the Disk IN port via its internal interface. The REG2 part of the instruction specifies the DMA buffer register (standing in for a processor register) as the data destination, and the hex address 0x51726 represents an LBA sector as the data source. The hard disc controller at the IN port performs the read operation on the sector and sends the data contained in the sector to the DMA buffer register via the data bus. Remember that once the DMA process is initiated, the data bus is owned by the DMA controller.

The DMA controller then issues a virtual memory write instruction [e.g. STORE REG2, @0xBF2G79DA;] where the virtual memory address is a pointer address (indicated by the @ prefix) to the physical memory address serving as the data destination. This instruction sends the data from the DMA buffer register (REG2) through the data bus to the physical memory location. The DMA controller repeats the instruction pair in a loop (the number of sectors to read being the loop control variable) until all the requested sectors are read from the hard disc and written consecutively to the virtual memory. Eight sectors of 512 bytes each fill one 4 KB memory page. Finally, the DMA controller sends an interrupt signal to the processor when the data transfer of all the sectors is complete.
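The read-then-write loop can be sketched as a small simulation. This is a simplification under stated assumptions: the disk is modelled as a dictionary of 512-byte sectors, memory as a flat `bytearray` addressed by a plain offset (no page table lookup), and the function name `dma_disk_to_memory` and its parameters are hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector


def dma_disk_to_memory(disk, memory, lba, sector_count, dest):
    """Copy sector_count sectors starting at lba into memory at offset dest."""
    for i in range(sector_count):       # sector count is the loop control variable
        buffer = disk[lba + i]          # disk-read into the DMA buffer register
        start = dest + i * SECTOR_SIZE
        memory[start:start + SECTOR_SIZE] = buffer  # memory-write from the buffer
    return "IRQ"                        # completion interrupt to the processor


# Toy disk: every byte of sector n holds the value n, so copies are easy to check.
disk = {lba: bytes([lba]) * SECTOR_SIZE for lba in range(16)}
memory = bytearray(8192)  # two 4 KB pages

# Read eight sectors (LBA4..LBA11) into the second 4 KB page.
signal = dma_disk_to_memory(disk, memory, lba=4, sector_count=8, dest=4096)
print(signal, memory[4096], memory[8191])
```

Eight iterations fill exactly one 4 KB page, matching the sector arithmetic above; the reverse direction would swap the source and destination inside the loop.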

The same principle applies when data is to be transferred from memory to hard disc, the difference being, the data source address is a virtual memory location, the data destination address is a hard disc LBA and the size of data to be transferred is a count of the virtual page numbers. The resulting DMA operation is a memory read (load) instruction, followed by a disk store instruction in a continuous loop controlled by the page count. The loop executes till all the memory pages are copied from the memory to the hard disc.

Individual I/O devices can have their own DMA controller or a single DMA controller can be configured to interface more than one I/O device. Conflict can arise when more than one DMA tries to contend for the system bus which demands resolution through a Bus arbitration mechanism. This requires a more detailed study of the Direct Memory Access (DMA).

Regards,

Sushovon Sinha

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

Answer 1

Anonymous

There are multiple aspects that have to be pondered upon, which unfortunately cannot be addressed from this forum. Please get in touch with a systems specialist.

Regards.

Answer 2

First, great article. I recently noticed that IRQ 86 was listed 2 times, with one of them blank, and the status was OK. Also, one resource for devices was completed but again not a named device. Is it safe to post a copy showing all the numbers of the resource files from System Information? And can you tell me what "Un-allowed DMA capable bus/device(s)" means? I have 2 ! marks on unknown drivers and can't figure out what to load for these. I loaded the PCI/VEN Encryption Decryption into the System Firmware and so far it has not been rejected, but I don't see an improvement. In fact, my system is slower and I can't do anything because the TPM is not supported, but it gets rid of the !. If you could tell me what is safe to show for assistance, I would appreciate that. Thanks.
