Subscribe by Email


Thursday, September 11, 2014

Database access: What is Multi - version concurrency control?

A common concurrency control method used by the database management systems is the multiversion concurrency control often abbreviated as MVCC or MCC. The method helps in regulating the probelm causing situation of concurrent access to the database. Plus this concurrency control method is used for implementing transactional memory in the programming languages. When a read operation and another write operation is being carried out on the same piece of data item in a database, the database is likely to appear as inconsistent to the user. Using such half – written piece of data further is dangerous and can lead to system failure.
Concurrency control methods provide the simplest solution to this problem. These make sure that no read operation is carried out until the write operation is complete. In this case, the data item is said to have been locked by the write operation. But this traditional method is quite slow, with processes waiting for read access having to wait. So we use a different approach to concurrency control as provided by MVCC. At a given instant of time, a snapshot of the database is shown to each connected user.
Changes made by a write operation won’t be visible to the other database users until it is done completely. In this approach when an item has to be updated, the old data is not overwritten with the new data. Instead the old data is marked as obsolete and the new data is added to the database. Thus, the database has multiple versions of the same data but out of them only one is up to date. With this method the users are able to access the data that they were reading when they began the operation. Now it doesn’t matter whether the data was modified or deleted by some other thread. Though this method avoids the overhead generated by system in filling memory holes, it does require the system to periodically inspect the database and remove the obsolete data.
In case of the document – oriented databases, this method allows optimization of these documents by storing them on to a contiguous memory section of the disk. If any update is made to the document, it is simply rewritten. Thus, having multiple pieces of the documented maintained through links is not required. One can have point in time views in MVCC at a certain consistency. Any read operations carried out under MVCC make use of either transaction ID or a timestamp for determining what state the database is in. this avoids the need for lock management or  reading the transactions by isolating the write operations.
Only the future version of the data is affected by the write operation whereas the transaction ID on which the read operation is being carried out remains consistent because it’s a later transaction ID on which the transaction ID is working. The transactional consistency is achieved by MVCC by means of increasing IDs or timestamps. Another advantage of this concurrency control method is that no transaction has to wait for an object since it maintains many versions of the same data object. Each version is labeled with a timestamp or an ID. But, MVCC fails too at certain points. First of all true snapshot isolation cannot be achieved by MVCC. In some cases read – read anomalies and skew write anomalies also surface. There are two solutions to these anomalies namely:

- Serializable snapshot isolation and
- Precisely serializable snapshot isolation

But these solutions work at the cost of transaction abortion. Some databases which use MVCC are:
- ArangoDB
- Bigdata
- CouchDB
- Cloudant
- HBase
- Altibase
- IBM DB2
- Ingres
- Netezza


No comments:

Facebook activity