Friday, April 9, 2010

CouchDB 101 - 1: What is CouchDB?

CouchDB Discovery

It is my tradional to visit www.linuxtoday.com daily to get updated as to what is happening. So as I was scrolling, NoSQL poped up and I was interested as to what that is. As There was another aricle that really cought my attention, CouchDB basics for PHP developers. So the journey began.

What is CouchDB?

Apache CouchDB is a document-oriented database that can be queried and indexed in a MapReduce fashion using JavaScript. CouchDB also offers incremental replication with bi-directional conflict detection and resolution.

CouchDB provides a RESTful JSON API than can be accessed from any environment that allows HTTP requests. There are myriad third-party client libraries that make this even easier from your programming language of choice. CouchDB’s built in Web administration console speaks directly to the database using HTTP requests issued from your browser.

Key Characteristics

Documents

A CouchDB document is an object that consists of named fields. Field values may be strings, numbers, dates, or even ordered lists and associative maps. A CouchDB database is a flat collection of these documents. Each document is identified by a unique ID.

Example
{
"firstName": "Emanuel",
"lastName": "Feruzi",
"email": [
"emanuel.feruzi@trilabs.co.tz",
"feruzi@gmail.com"
],
"web": "http://www.joelennon.ie"
}

In the example above, the firstName, lastName and web are string values, where as email contains a list of emails addresses.

Schema-Free

Unlike SQL databases which are designed to store and report on highly structured, interrelated data, CouchDB is designed to store and report on large amounts of semi-structured, document oriented data. CouchDB greatly simplifies the development of document oriented applications, which make up the bulk of collaborative web applications.

In an SQL database, as needs evolve the schema and storage of the existing data must be updated. This often causes problems as new needs arise that simply weren’t anticipated in the initial database designs, and makes distributed “upgrades” a problem for every host that needs to go through a schema update.

With CouchDB, no schema is enforced, so new document types with new meaning can be safely added alongside the old. The view engine, using Javascript, is designed to easily handle new document types and disparate but similar documents.

Views

To address this problem of adding structure back to semi-structured data, CouchDB integrates a view model using Javascript for description. Views are the method of aggregating and reporting on the documents in a database, and are built on-demand to aggregate, join and report on database documents. Views are built dynamically and don’t affect the underlying document, you can have as many different view representations of the same data as you like.

Distributed

CouchDB is a peer based distributed database system. Any number of CouchDB hosts (servers and offline-clients) can have independent “replica copies” of the same database, where applications have full database interactivity (query, add, edit, delete). When back online or on a schedule, database changes are replicated bi-directionally.
CouchDB has built-in conflict detection and management and the replication process is incremental and fast, copying only documents and individual fields changed since the previous replication. Most applications require no special planning to take advantage of distributed updates and replication.
Unlike cumbersome attempts to bolt distributed features on top of the same legacy models and databases, it is the result of careful ground-up design, engineering and integration. The document, view, security and replication models, the special purpose query language, the efficient and robust disk layout are all carefully integrated for a reliable and efficient system.

Next: CouchDB 101 - 2: The First Encounter - Install and Get Started

No comments:

Post a Comment

Afrigator