Concurrency

=Concurrency=

Definition
Database concurrency controls ensure that **transactions** //(a database transaction is a logical unit of database operations which are executed as a whole to process user requests for retrieving data or updating the database, for example a customer making an online booking for seats for a threate production//) occur in an ordered fashion. The main job of these controls is to protect transactions issued by different users/applications from the effects of each other. They must preserve the four characteristics of database transactions: atomicity, isolation, consistency and durability.

Description
Concurrency, in computing refers to when multiple path of execution (threads or processes) are running at the same time. This means that there are multiple processes running at the same time. In the case study, it is possible that many people will want to book/reserve seats in multuple theatres. The system must allow for muitple users to access data in the database and allow for a mulitple transactions to occur at the same time or overlapping hence concurrently.

If transactions are executed //serially//, i.e., sequentially with no overlap in time, no transaction concurrency exists. However, if concurrent transactions with interleaving operations are allowed in an uncontrolled manner, some unexpected, undesirable result may occur. Here are some typical examples:
 * 1) The lost update problem: A second transaction writes a second value of a data on top of a first value written by a first concurrent transaction, and the first value is lost to other transactions running concurrently which need, by their precedence, to read the first value. The transactions that have read the wrong value end with incorrect results. For example, two customers, Adam and Aki, wanted to book seats for the same production on the same night. Adam selects 4 seats in row F (F1, F2, F3 and F4) and a few milliseconds later, Aki wants to reserve seats F3 and F4. If Aki goes ahead and books F3 and F4 and then seconds later Adam also decides to book his seats, the system should tell Adam that those seats are no longer available. However if the system is not set up correctly, then Adam would book his seats and overwrite the data for F3 and F4. Aki will lose out as there will be no record in the seats table (it will show that Adam has the tickets) but Aki may be charged for the tickets in the accounts table.
 * 2) The dirty read problem: Transactions read a value written by a transaction that has been later aborted. This value disappears from the database upon abort, and should not have been read by any transaction ("dirty read"). The reading transactions end with incorrect results. In this case, if Aki has reserved F3 and F4 (and this is updated in the seats table) and then aborts his transaction - for example, his payment may be declined, then the seats F3 and F4 need to be made available again so that Adam can purchase them. If this is not the case, then seats F3 and F4 would appear to be reserved and Adam cannot purchase these seats (and no one else either).
 * 3) The incorrect summary problem: While one transaction takes a summary over the values of all the instances of a repeated data-item, a second transaction updates some instances of that data-item. The resulting summary does not reflect a correct result for any (usually needed for correctness) precedence order between the two transactions (if one is executed before the other), but rather some random result, depending on the timing of the updates, and whether certain update results have been included in the summary or not. In this case, Adam may want to purchase 10 seats for a production that has free-seating. He looks to see how many seats are available and a number 15 is given. Meanwhile, Aki reserves 7 seats leaving only 8 seats left. Adam tries to purchase the 10 seats...can he?

Most high-performance transactional systems need to run transactions concurrently to meet their performance requirements. Thus, without concurrency control such systems can neither provide correct results nor maintain their databases consistent.

=== Concurrency control mechanisms (this needs to be edited - simplified and made relevant to the case study) === The main categories of concurrency control mechanisms are:
 * **[|Optimistic]** - Delay the checking of whether a transaction meets the isolation and other integrity rules (e.g., [|serializability] and [|recoverability]) until its end, without blocking any of its (read, write) operations ("...and be optimistic about the rules being met..."), and then abort a transaction to prevent the violation, if the desired rules are to be violated upon its commit. An aborted transaction is immediately restarted and re-executed, which incurs an obvious overhead (versus executing it to the end only once). If not too many transactions are aborted, then being optimistic is usually a good strategy.
 * **Pessimistic** - Block an operation of a transaction, if it may cause violation of the rules, until the possibility of violation disappears. Blocking operations is typically involved with performance reduction.
 * **Semi-optimistic** - Block operations in some situations, if they may cause violation of some rules, and do not block in other situations while delaying rules checking (if needed) to transaction's end, as done with optimistic.

Different categories provide different performance, i.e., different average transaction completion rates (//throughput//), depending on transaction types mix, computing level of parallelism, and other factors. If selection and knowledge about trade-offs are available, then category and method should be chosen to provide the highest performance. The mutual blocking between two transactions (where each one blocks the other) or more results in a [|deadlock], where the transactions involved are stalled and cannot reach completion. Most non-optimistic mechanisms (with blocking) are prone to deadlocks which are resolved by an intentional abort of a stalled transaction (which releases the other transactions in that deadlock), and its immediate restart and re-execution. The likelihood of a deadlock is typically low. Both blocking, deadlocks, and aborts result in performance reduction, and hence the trade-offs between the categories. ==== [[|edit]] Methods ==== Many methods for concurrency control exist. Most of them can be implemented within either main category above. The major methods,[|[][|1][[[|http://en.wikipedia.org/wiki/Concurrency_control#cite_note-Bern2009-0]|]]] which have each many variants, and in some cases may overlap or be combined, are: Other major concurrency control types that are utilized in conjunction with the methods above include: The most common mechanism type in database systems since their early days in the 1970s has been //[|Strong strict Two-phase locking]// (SS2PL; also called //Rigorous scheduling// or //Rigorous 2PL//) which is a special case (variant) of both [|Two-phase locking] (2PL) and [|Commitment ordering] (CO). It is pessimistic. In spite of its long name (for historical reasons) the idea of the **SS2PL** mechanism is simple: "Release all locks applied by a transaction only after the transaction has ended." SS2PL (or Rigorousness) is also the name of the set of all schedules that can be generated by this mechanism, i.e., these are SS2PL (or Rigorous) schedules, have the SS2PL (or Rigorousness) property.
 * 1) Locking (e.g., **[|Two-phase locking]** - 2PL) - Controlling access to data by [|locks] assigned to the data. Access of a transaction to a data item (database object) locked by another transaction may be blocked (depending on lock type and access operation type) until lock release.
 * 2) **Serialization [|graph checking]** (also called Serializability, or Conflict, or Precedence graph checking) - Checking for [|cycles] in the schedule's [|graph] and breaking them by aborts.
 * 3) **[|Timestamp ordering]** (TO) - Assigning timestamps to transactions, and controlling or checking access to data by timestamp order.
 * 4) **[|Commitment ordering]** (or Commit ordering; CO) - Controlling or checking transactions' chronological order of commit events to be compatible with their respective [|precedence order].
 * **[|Multiversion concurrency control]** (MVCC) - Increasing concurrency and performance by generating a new version of a database object each time the object is written, and allowing transactions' read operations of several last relevant versions (of each object) depending on scheduling method.
 * **[|Index concurrency control]** - Synchronizing access operations to [|indexes], rather than to user data. Specialized methods provide substantial performance gains.
 * **Private workspace model** (**Deferred update**) - Each transaction maintains a private workspace for its accessed data, and its changed data become visible outside the transaction only upon its commit (e.g., [|Weikum and Vossen 2001]). This model provides a different concurrency control behavior with benefits in many cases.

References and Resources:
Database Transaction, [], 11 January 2012 Concurrency Control, [], 11 January 2012

Resources:
[] (This is a heavy-duty item).