Oracle Database is a relational DBMS that has extended the relational model to an object-relational model, which makes it possible to store complex business models in a relational database.
Database Schema: A database schema is a collection of logical data structures.
     Table: A table represents a real-world entity and can have integrity constraints on its columns.
Data Access: Oracle uses Structured Query Language (SQL) for data access. It also provides PL/SQL, a procedural extension to SQL that makes it possible to store application logic in the database itself.
Transaction Management: Oracle RDBMS supports multiuser concurrency through transactions. Data Concurrency: Oracle enforces statement-level and transaction-level read consistency. This avoids the dirty read problem: depending on the level of consistency, the DBMS guarantees that the data returned by a single query or by multiple queries is consistent and committed. An Oracle database is a set of physical structures such as files, and a single physical database can host multiple logical databases that applications interact with.
Oracle RDBMS has a multitenant architecture. Sharding Architecture: Data is partitioned horizontally across multiple physical Oracle databases, which is useful for OLTP applications. After sharding, every database (shard) has a dedicated server and its own resources (CPU, flash, disk, and memory), and together the shards make up a single logical database.
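The idea of horizontal partitioning across independent databases can be sketched in a few lines of Python. This is only an illustration: the `ShardedTable` class, the CRC32-modulo routing rule, and the shard count are assumptions made for the example and do not describe how Oracle assigns rows to shards.

```python
import zlib

class ShardedTable:
    """Hypothetical router: one logical table spread over several shards."""

    def __init__(self, shard_count):
        # Each dict stands in for an independent physical database.
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, sharding_key):
        # A stable hash so the same key always routes to the same shard.
        return zlib.crc32(str(sharding_key).encode()) % len(self.shards)

    def insert(self, sharding_key, row):
        self.shards[self.shard_for(sharding_key)][sharding_key] = row

    def get(self, sharding_key):
        # Only the one shard that owns the key is consulted.
        return self.shards[self.shard_for(sharding_key)].get(sharding_key)

customers = ShardedTable(shard_count=3)
for customer_id in range(100):
    customers.insert(customer_id, {"id": customer_id})
```

A lookup such as `customers.get(42)` touches a single shard, which is why this layout suits OLTP workloads where most requests carry the sharding key.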
Database Storage Structures:
Physical Storage Structures: files that store data on disk
• Data files: store the data for logical database structures
• Control files: contain metadata about the physical structure of the database, such as file locations
• Online redo log files: consist of redo entries that record changes made to data
Logical Storage Structures:
• Data blocks: the smallest unit of storage; each data block corresponds to a specific number of bytes on disk
• Extents: a specific number of logically contiguous data blocks obtained in a single allocation
• Segments: the set of extents allocated for an object such as a table or index
• Tablespaces: logical containers for segments
Oracle RDBMS Processes: client processes, background processes, and server processes
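The logical storage hierarchy (data blocks, extents, segments, tablespaces) can be pictured as nested containers. The sketch below is illustrative only and is not Oracle's on-disk format: the class names, the 8192-byte block size, and the 8-block extent size are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List

BLOCK_SIZE = 8192        # assumed block size in bytes (illustrative)
BLOCKS_PER_EXTENT = 8    # assumed extent size in blocks (illustrative)

@dataclass
class Extent:
    """A run of logically contiguous data blocks obtained in one allocation."""
    first_block: int
    block_count: int = BLOCKS_PER_EXTENT

@dataclass
class Segment:
    """All extents allocated to one object such as a table or an index."""
    name: str
    extents: List[Extent] = field(default_factory=list)

    def size_bytes(self) -> int:
        return sum(e.block_count for e in self.extents) * BLOCK_SIZE

@dataclass
class Tablespace:
    """A logical container for segments."""
    name: str
    segments: List[Segment] = field(default_factory=list)
    next_free_block: int = 0

    def allocate_extent(self, segment: Segment) -> Extent:
        # Hand the segment the next run of contiguous blocks.
        extent = Extent(first_block=self.next_free_block)
        self.next_free_block += extent.block_count
        segment.extents.append(extent)
        return extent

users = Tablespace("USERS")
orders = Segment("ORDERS")
users.segments.append(orders)
users.allocate_extent(orders)
users.allocate_extent(orders)
```

After two allocations the hypothetical `ORDERS` segment owns two extents of eight blocks each, so `orders.size_bytes()` reports sixteen blocks' worth of space.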
In 1977, Larry Ellison, Robert Miner, and Ed Oates founded Software Development Laboratories, which was hired by the United States Central Intelligence Agency (CIA) in order to write a new database system based upon SQL. This system came to be known as Oracle. The company changed its name to Relational Software, Inc. in 1979 and then to Oracle Systems Corporation in 1982.
Oracle follows the industry-accepted standards for SQL. Oracle SQL has many extensions to standard SQL language to provide additional statements. Oracle RDBMS processes SQL statements using a query optimizer which generates execution plans based on access paths and statistics.
Optimization: The optimizer generates many of the possible ways to process a query, assigns a cost to each step, and chooses the plan with the lowest cost. The main components are:
• Query Transformer: rewrites the query into an equivalent form that can yield a better execution plan
• Estimator: estimates the cost of a particular execution plan
• Plan Generator: generates the different possible plans, including sub-plans for nested queries
The optimizer also has an adaptive query optimization feature, which changes plans based on statistics collected during statement execution. The plan search uses a dynamic programming approach. Optimizer statistics describe the details of data storage and distribution. They include table statistics, column statistics, index statistics, and system statistics.
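The cost-based choice described here can be sketched as "estimate a cost for each candidate plan, then take the minimum". The cost formulas, the `TableStats` fields, and the two access paths below are simplifying assumptions for illustration; they are not Oracle's cost model.

```python
from dataclasses import dataclass

@dataclass
class TableStats:
    rows: int            # number of rows in the table
    blocks: int          # number of data blocks the table occupies
    index_height: int    # levels of the index on the filtered column

def candidate_plans(stats: TableStats, selectivity: float) -> dict:
    """Return an estimated cost (in block reads) for each access path.

    The formulas are deliberately simplistic stand-ins for a real cost model.
    """
    matching_rows = stats.rows * selectivity
    return {
        # Read every block of the table once.
        "full table scan": float(stats.blocks),
        # Walk the index, then fetch one block per matching row.
        "index range scan": stats.index_height + matching_rows,
    }

def choose_plan(stats: TableStats, selectivity: float) -> str:
    """Pick the access path with the lowest estimated cost."""
    costs = candidate_plans(stats, selectivity)
    return min(costs, key=costs.get)

stats = TableStats(rows=1_000_000, blocks=10_000, index_height=3)
selective = choose_plan(stats, selectivity=0.0001)   # about 100 matching rows
unselective = choose_plan(stats, selectivity=0.5)    # about 500,000 matching rows
```

With these made-up numbers the index wins for the highly selective predicate and the full scan wins for the unselective one, which is why accurate statistics matter to the estimator.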
Oracle supports the read committed and serializable isolation levels, defaulting to the former. There is an additional mode available, "read only", which is not part of the SQL standard.
read committed: The default isolation level. Each query executed by a transaction sees only the data committed before that query began, so it never reads changes that other transactions commit while the query is running. Although this provides read consistency within a query (a row, when reread, is the same as before unless another transaction has committed a change to it), a conflicting write can still occur when two concurrent transactions try to modify the same item.
serializable isolation: The transaction sees only the changes committed before the transaction began, not just before the current query. Any commit to an item by a concurrent transaction, say t2, is therefore not reflected when t1 rereads the item, which gives transaction-level read consistency. Oracle allows a serializable transaction to modify a row only if the changes made to it by other transactions were already committed before this transaction started. Otherwise, it generates an error.
read-only isolation level: Similar to serializable isolation, but the transaction is not permitted to modify data. Read consistency is achieved by reconstructing the data from undo data.
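The difference between the levels comes down to when the read snapshot is taken: per query for read committed, once at transaction start for serializable and read-only. The following sketch illustrates that rule with a toy versioned row. The `VersionedRow` and `Transaction` classes and the integer commit numbers are assumptions for the example, not Oracle's internal structures.

```python
class VersionedRow:
    """Keeps every committed version of one value, tagged with a commit number."""

    def __init__(self, initial):
        self.versions = [(0, initial)]   # list of (commit_number, value)

    def commit(self, commit_number, value):
        self.versions.append((commit_number, value))

    def read_as_of(self, snapshot):
        # Return the newest version committed at or before the snapshot.
        visible = [v for n, v in self.versions if n <= snapshot]
        return visible[-1]

class Transaction:
    def __init__(self, clock, level):
        self.clock = clock       # callable returning the current commit number
        self.level = level       # "read committed" or "serializable"
        self.start = clock()     # snapshot taken when the transaction begins

    def query(self, row):
        # Read committed takes a fresh snapshot per query; serializable
        # reuses the snapshot taken at transaction start.
        snapshot = self.clock() if self.level == "read committed" else self.start
        return row.read_as_of(snapshot)

current = [0]
clock = lambda: current[0]
balance = VersionedRow(100)

t1_rc = Transaction(clock, "read committed")
t1_ser = Transaction(clock, "serializable")

current[0] = 1
balance.commit(1, 250)     # a concurrent transaction t2 commits a new value

rc_value = t1_rc.query(balance)     # sees t2's committed change
ser_value = t1_ser.query(balance)   # still sees the value as of transaction start
```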
Naïve (Page-Level), Naïve (Record-Level), Bit Packing / Mostly Encoding
Oracle supports compression at multiple levels within the data, including by row, block, and index. It also supports network compression designed to reduce bandwidth usage and increase network throughput.
Table compression:
• Basic table compression: intended for bulk operations
• Advanced row compression: intended for OLTP applications
The compressed rows are stored in a row-major format, where all columns of a particular row are stored together. The information needed to re-create the uncompressed data from the compressed data is stored in the data block itself.
Hybrid Columnar Compression: stores the values of the same column together for a group of rows, so data is stored as a combination of row and columnar storage.
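To make the "information needed to re-create the data is stored in the block itself" point concrete, here is a small sketch of block-local dictionary compression that keeps rows in row-major order. It is a generic illustration of the idea; the function names and the dictionary layout are assumptions and not Oracle's block format.

```python
def compress_block(rows):
    """Replace repeated column values in a block with references to a symbol table."""
    symbols = []                 # distinct values, stored once per block
    position = {}                # value -> index in symbols
    compressed = []
    for row in rows:
        encoded = []
        for value in row:
            if value not in position:
                position[value] = len(symbols)
                symbols.append(value)
            encoded.append(position[value])
        # Rows stay in row-major order: all columns of a row are kept together.
        compressed.append(tuple(encoded))
    # The symbol table travels with the rows, so the block is self-contained.
    return {"symbols": symbols, "rows": compressed}

def decompress_block(block):
    """Rebuild the original rows using only what is stored in the block."""
    symbols = block["symbols"]
    return [tuple(symbols[i] for i in row) for row in block["rows"]]

rows = [
    ("Smith", "SALES", "NY"),
    ("Jones", "SALES", "NY"),
    ("Brown", "SALES", "LA"),
]
block = compress_block(rows)
```

Repeated values such as "SALES" are stored once in the block's symbol table, and `decompress_block(block)` needs nothing outside the block to restore the rows.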
An index is an additional data structure associated with a table or table cluster that speeds up access to rows. Primary keys and unique keys already have indexes. (A key is the expression or set of columns on which the index is built.)
Oracle provides composite indexes, which are built on multiple columns of a given table. Oracle also allows multiple indexes on the same set of columns, provided that they differ in index type, partitioning scheme, or uniqueness property.
Oracle provides B+ Tree indexing and various other index types. The RDBMS automatically maintains indexes so that they reflect the data changes made to the tables. Index Scan: the DB retrieves a row by traversing the index. The basic principle is that if a SQL query needs only the indexed columns, the DB reads the values directly from the index; if it also needs non-indexed columns, the DB uses the rowids found in the index to fetch the rows from the table. Oracle provides:
• Full index scan: the DB reads the entire index in order
• Fast full index scan: the DB reads the data in the index without accessing the table
• Index range scan: an ordered index scan where one or more columns are specified in conditions
• Index unique scan: similar to an index range scan, but with zero or one rowid associated with the index key
• Index skip scan: uses logical subindexes of a composite index
Variations of B+ Tree indexes:
• Reverse key indexes
• Ascending and descending indexes
Other indexes provided:
• Bitmap index: the DB stores a bitmap for each index key, and each index key stores pointers to multiple rows
• Function-based index: either a B+ Tree or a bitmap index that stores the computed value of a function or expression, which may involve multiple columns
• Application domain index: customized for a specific application
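The bitmap index idea (one bitmap per key, one bit per row) can be sketched with Python integers used as bit sets. This is a simplified illustration; the function names and the sample `customers` data are made up, and real bitmap indexes are compressed and map bits to rowids rather than list positions.

```python
from collections import defaultdict

def build_bitmap_index(rows, column):
    """Map each distinct key to an integer whose bit i is set when row i has that key."""
    bitmaps = defaultdict(int)
    for row_number, row in enumerate(rows):
        bitmaps[row[column]] |= 1 << row_number
    return dict(bitmaps)

def matching_rows(bitmap):
    """Turn a bitmap back into the list of row numbers it covers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

customers = [
    {"name": "Ann",  "region": "EAST", "status": "gold"},
    {"name": "Bob",  "region": "WEST", "status": "gold"},
    {"name": "Cara", "region": "EAST", "status": "basic"},
    {"name": "Dev",  "region": "EAST", "status": "gold"},
]
by_region = build_bitmap_index(customers, "region")
by_status = build_bitmap_index(customers, "status")

# region = 'EAST' AND status = 'gold' becomes a bitwise AND of two bitmaps.
east_gold = matching_rows(by_region["EAST"] & by_status["gold"])
```

Combining predicates with bitwise operations before touching any rows is what makes this structure attractive for low-cardinality columns.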
Multi-version Concurrency Control (MVCC)
Oracle employs MVCC for concurrency control. Data Concurrency and Consistency: Oracle maintains data consistency through a multi-version consistency model combined with locks and transactions. Depending on the isolation level, it can prevent dirty reads, non-repeatable reads, and phantom reads. It provides the three isolation levels described above:
• read committed (the default): each query sees only the data committed before that query began
• serializable: the transaction sees only the changes committed before the transaction began, and modifying a row that another transaction changed and committed after that point generates an error
• read-only: similar to serializable, but the transaction is not permitted to modify data; read consistency is achieved by reconstructing the data from undo data
Locking Mechanism: Locks prevent incorrect updates of shared data by concurrent transactions. Oracle has two lock modes (shared and exclusive) that provide data consistency, concurrency, and integrity. Oracle's rules for taking locks:
• A row is locked only while it is being modified by a writer.
• When a row is being written by writer A, concurrent writer B on the same row is blocked.
• A reader never blocks a writer.
• A writer never blocks a reader.
Oracle automatically uses the lowest applicable level of restrictiveness so that as much data as possible remains accessible to others. Oracle also performs lock conversion to maintain consistency. It never escalates locks. Lock escalation promotes many locks held at one level of granularity (for example, rows) to a single lock at a higher level (for example, the table) in order to reduce the number of locks; Oracle avoids it because escalation increases the likelihood of deadlocks. Locks are released automatically when the transaction no longer needs them. When a deadlock occurs, Oracle resolves it by rolling back one of the statements involved, which releases one set of the conflicting locks.
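The four locking rules can be expressed as a tiny lock table in which only writers take row locks. The `RowLockTable` class and its method names are hypothetical and ignore waiting queues, table locks, and deadlock detection; the sketch only shows who blocks whom.

```python
class RowLockTable:
    """Tracks which transaction holds the exclusive lock on each row."""

    def __init__(self):
        self.owner = {}          # row id -> transaction id holding the row lock

    def try_write(self, txn, row_id):
        # A writer acquires the row lock; a second writer on the same row must wait.
        holder = self.owner.get(row_id)
        if holder is not None and holder != txn:
            return False         # blocked by the current writer
        self.owner[row_id] = txn
        return True

    def try_read(self, txn, row_id):
        # Readers take no row lock, so they neither wait for nor block writers.
        return True

    def end_transaction(self, txn):
        # Commit or rollback releases every row lock the transaction holds.
        self.owner = {r: t for r, t in self.owner.items() if t != txn}

locks = RowLockTable()
a_writes = locks.try_write("A", row_id=7)    # A locks row 7
b_writes = locks.try_write("B", row_id=7)    # B is blocked by A
b_reads = locks.try_read("B", row_id=7)      # reading is never blocked
locks.end_transaction("A")
b_retries = locks.try_write("B", row_id=7)   # the lock is free again
```

Readers can proceed without locks because, under multi-version read consistency, they are served an older committed version of the row instead of waiting for the writer.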
Types of locks provided by Oracle:
• DML Locks
  • Row Locks (TX)
  • Table Locks (TM): Row Share (RS), Row Exclusive (RX), Share (S), Share Row Exclusive (SRX), Exclusive (X)
• DDL Locks
• System Locks: Latches, Mutexes, Internal Locks
Relational, Key/Value, Document / XML, Graph
Oracle was originally designed as a relational DBMS. It now also supports a variety of data models for storage. Schema objects: Each user account in Oracle RDBMS owns a single schema, which can contain many data structures called schema objects, such as tables, indexes, partitions, and views. Oracle tracks schema object dependencies so that dependent objects stay up to date.
Table Overview: Oracle RDBMS has relational tables and object tables. Relational tables are stored as:
• Heap-organized table: rows are not stored in any particular order
• Index-organized table: rows are ordered according to the primary key values
• External table: only the metadata is stored in the DB, while the data is stored outside it
A table can contain different types of columns, such as:
• Virtual columns: columns that do not consume disk space
• Invisible columns: columns whose values are viewable only when the column is specified by name
Table Storage: Table data is held in a data segment in a tablespace. By default a table is heap-organized, so its rows are unordered. When a row is inserted, the first available free space in the data segment is used.
Row Storage: Rows are stored in data blocks, each as a single row piece where possible. In a heap-organized table, each row has a unique rowid that maps to the physical location of the row. In a table cluster, rows from different tables that are stored in the same data block can have the same rowid. Rowids are used by indexes such as B-tree indexes to provide fast access to a row in a single I/O.
Table Clusters: A table cluster is a group of tables that share columns and store related data in the same data blocks, so a single data block can contain rows from multiple tables. This reduces disk I/O for joins. The cluster key is the column (or set of columns) that the tables have in common. For example, student and class tables may share a student_id column. In Oracle RDBMS, rows with the same cluster key value (e.g., student_id = 20) are physically stored together.
Indexed Clusters: An indexed cluster uses a B+-tree index on the cluster key, which must be created before any rows are inserted. The index is used to locate the data. In other words, the RDBMS stores the rows in the heap and uses the index to locate them.
Hash Cluster: Oracle RDBMS also allows creating a hash cluster, which locates rows by applying a hash function to the cluster key instead of looking the key up in a separate index. The resulting hash value identifies the data block to retrieve. Oracle also provides the sorted hash cluster, which stores rows so that they can be returned efficiently in sorted order. When collisions fill a block, Oracle links the full block to a new overflow block, and retrieving rows from the overflow block then takes two I/Os.
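The hash cluster lookup and its overflow behaviour can be sketched as a hash table whose buckets are chains of fixed-capacity blocks. The `HashCluster` class, the two-row block capacity, and the modulo hash are assumptions made for the example, not Oracle's implementation.

```python
BLOCK_CAPACITY = 2   # assumed rows per block, kept tiny for illustration

class HashCluster:
    def __init__(self, hash_keys):
        self.hash_keys = hash_keys
        # Each bucket is a chain of blocks: the primary block plus overflow blocks.
        self.buckets = [[[]] for _ in range(hash_keys)]

    def _bucket(self, cluster_key):
        # The hash value of the cluster key selects the bucket directly.
        return self.buckets[hash(cluster_key) % self.hash_keys]

    def insert(self, cluster_key, row):
        chain = self._bucket(cluster_key)
        if len(chain[-1]) >= BLOCK_CAPACITY:
            chain.append([])              # link a new overflow block
        chain[-1].append((cluster_key, row))

    def lookup(self, cluster_key):
        """Return matching rows and the number of blocks read to find them."""
        chain = self._bucket(cluster_key)
        rows = [r for block in chain for k, r in block if k == cluster_key]
        return rows, len(chain)

cluster = HashCluster(hash_keys=4)
cluster.insert(20, "student 20")
cluster.insert(24, "student 24")   # collides with key 20 (same bucket)
cluster.insert(28, "student 28")   # bucket block is full, goes to overflow
cluster.insert(21, "student 21")   # different bucket, no collision
```

Here `cluster.lookup(21)` reads one block, while `cluster.lookup(28)` has to follow the chain into the overflow block and reads two.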
Attribute-Clustered Tables: Oracle allows a heap-organized table to store data in close proximity on disk based on user-specified clustering directives (on a single table or multiple tables). Directives:
• Clustering by linear order: divides rows into ranges based on user-specified attributes in a given order
• Clustering by interleaved order: used for dimensional hierarchies
This reduces table scans and thus I/O and CPU cost.
Zone Maps: A zone is a set of contiguous data blocks. A zone map is an access structure that divides the data blocks into zones and stores the minimum and maximum values of the relevant columns for each zone. Oracle compares the predicate values in a SQL query with these ranges to determine which zones to read and which to skip. Oracle creates zone maps automatically when clustering is specified on the clustering columns.
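Zone pruning reduces to an interval-overlap test between the query predicate and each zone's minimum and maximum. The sketch below assumes one value per block and a fixed number of blocks per zone; the function names and the sample data are illustrative only.

```python
def build_zone_map(values, blocks_per_zone):
    """Record (first_index, last_index, min, max) for each zone of consecutive values."""
    zones = []
    for start in range(0, len(values), blocks_per_zone):
        chunk = values[start:start + blocks_per_zone]
        zones.append((start, start + len(chunk) - 1, min(chunk), max(chunk)))
    return zones

def zones_to_scan(zones, low, high):
    """Keep only zones whose [min, max] range can overlap the predicate low <= x <= high."""
    return [z for z in zones if z[3] >= low and z[2] <= high]

# One value per "block" for simplicity; clustered data keeps each zone's range narrow.
order_dates = [1, 2, 3, 4, 10, 11, 12, 13, 20, 21, 22, 23]
zone_map = build_zone_map(order_dates, blocks_per_zone=4)
to_read = zones_to_scan(zone_map, low=11, high=12)
```

Because the sample data is clustered on the filtered column, only one of the three zones overlaps the predicate and the other two are skipped without being read.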
http://www.oracle.com/us/products/database/
https://docs.oracle.com/en/database/oracle/oracle-database/index.html
Oracle
1977
C, C++, COBOL, Java, PHP, PL/SQL, Python, R, SQL, Visual Basic
AIX, HP-UX, Linux, Solaris, Windows