What is the architecture of Exadata, Oracle's flagship engineered system? It is not simple, but at its core it is built from commodity servers installed in a rack and running Oracle Enterprise Linux. It is worth mentioning that they are connected by a high-speed, low-latency InfiniBand network, which I will explain shortly. But let me start from scratch. Let's look at the general diagram:
First of all, we need to distinguish the database server layer from the intelligent storage layer. Database nodes are plain Oracle Enterprise Linux servers with Grid Infrastructure (Clusterware + ASM) preinstalled. On top of that we find RAC or single-instance databases. The smallest Exadata variant (Eighth Rack) has 2 db nodes.
The storage layer has a minimum of 3 storage nodes. Storage nodes, also called cell servers, come with Oracle Enterprise Linux preinstalled as well. To be clear, this storage is not just a bay of disks with some (software) controller on top of it. It is truly intelligent storage, where the combined work of the CellSRV, MS (Management Server) and RS (Restart Server) processes provides one of the most efficient ways to handle requests from the db servers! How exactly does it work? First we need to understand that CellSRV understands the structure of database blocks. With features such as Offloading (also called Smart Scan), CellSRV can limit the I/O by intelligently working out which data really needs to be read from disk and sent back. Please see the example in the diagram below:
More precisely, this means that only a small portion of the data (the relevant rows and columns rather than whole blocks) is transferred back over the InfiniBand network for the DB node with ASM to consume. It also means that db nodes can focus on compute rather than I/O!
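To make the idea more concrete, here is a minimal Python sketch of the offloading principle. It is only a toy model and not how CellSRV is implemented: the `Row` type, the block layout and the function names are all made up for illustration. It simply compares how many rows cross the "network" when the db node does the filtering versus when the storage side filters and projects first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    # Hypothetical table layout, invented for this illustration.
    order_id: int
    customer: str
    amount: float
    notes: str  # wide column the example query never asks for


def make_blocks(n_rows, rows_per_block=4):
    """Build a toy table and split it into fixed-size 'blocks'."""
    rows = [Row(i, f"cust{i % 10}", i * 10.0, "x" * 200) for i in range(n_rows)]
    return [rows[i:i + rows_per_block] for i in range(0, n_rows, rows_per_block)]


def conventional_scan(blocks, predicate, columns):
    """Storage ships every row of every block; the db node filters and projects."""
    shipped = [row for block in blocks for row in block]  # all of this crosses the network
    result = [tuple(getattr(r, c) for c in columns) for r in shipped if predicate(r)]
    return result, len(shipped)


def smart_scan(blocks, predicate, columns):
    """Storage filters and projects; only matching rows and columns are shipped."""
    shipped = [
        tuple(getattr(r, c) for c in columns)
        for block in blocks
        for r in block
        if predicate(r)
    ]
    return shipped, len(shipped)


def big_orders(row):
    # Stands in for a WHERE clause such as: amount > 900
    return row.amount > 900


blocks = make_blocks(100)
wanted = ("order_id", "amount")  # stands in for the SELECT list

conv_result, conv_shipped = conventional_scan(blocks, big_orders, wanted)
smart_result, smart_shipped = smart_scan(blocks, big_orders, wanted)

print("rows shipped, conventional:", conv_shipped)
print("rows shipped, smart scan:  ", smart_shipped)
```

Both functions return the same query result. The only difference is where the filtering happens, and therefore how much data has to travel between the two layers.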
In this discussion we cannot skip the fact that Exadata is equipped with InfiniBand technology, which is used for Clusterware interconnect traffic, RAC Cache Fusion traffic and of course for inter-layer transfers (db nodes vs cell nodes). Oracle has gone even further, and the Linux kernel has been enriched with support for a stack of protocols: RDS, RDMA and iDB. RDS stands for Reliable Datagram Sockets. Oracle's zero-copy implementation of it is called ZDP (Zero-loss Zero-copy Datagram Protocol), and it runs over InfiniBand. RDMA (Remote Direct Memory Access) is direct access from the memory of one computer to the memory of another without involving either one's operating system. Such transfers require no work to be done by CPUs, caches, or context switches, and they continue in parallel with other system operations. This is quite useful in a massively parallel processing environment. On top of RDS we have iDB (Intelligent Database protocol). Oracle Exadata uses iDB to transfer data between the database nodes and the storage cell nodes.
That is all for today. Next time I will explain how Cell Server disks are organized. See you soon. 🙂