Data Glossary


    What is NoSQL (different types)

    Piotr Kononow - Dataedo Team 2018-10-04 2018-10-08

    NoSQL is a class of non-relational (non-SQL) databases that use data models other than predefined tables and columns. This class consists of many different types of databases and approaches to data storage and manipulation.

    What is SQL

    NoSQL stands in opposition to relational (SQL) databases, which were the de facto standard for a few decades. SQL databases store data in predefined tables that consist of columns with strict data types. Tables can have unique constraints (no two rows in a table may share the same value), check constraints (validation of data within a single row) or foreign key constraints (relationships between data in different tables).
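The three constraint types above can be tried out with a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns and the sample values are hypothetical and chosen only for illustration; other database engines may report violations differently.

```python
import sqlite3

# In-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE customers (
    customer_no   INTEGER PRIMARY KEY,
    email         TEXT NOT NULL UNIQUE,                    -- unique constraint
    address_state TEXT CHECK (length(address_state) = 2)   -- check constraint
);
CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    customer_no INTEGER NOT NULL REFERENCES customers (customer_no)  -- foreign key
);
""")

conn.execute("INSERT INTO customers VALUES (1, 'john@example.com', 'CA')")


def violates(sql):
    """Return True if the database rejects the statement because of a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Same email as customer 1: rejected by the unique constraint.
duplicate_email = violates("INSERT INTO customers VALUES (2, 'john@example.com', 'NY')")
# State code is not two characters long: rejected by the check constraint.
bad_state = violates("INSERT INTO customers VALUES (3, 'jane@example.com', 'California')")
# Customer 99 does not exist: rejected by the foreign key constraint.
orphan_order = violates("INSERT INTO orders VALUES (1, 99)")
# Customer 1 exists, so this order is accepted.
valid_order = violates("INSERT INTO orders VALUES (2, 1)")
```

In this sketch the first three statements are rejected and only the last one is stored, which is the kind of integrity checking that many NoSQL databases leave to the application.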

    [Figure: sample relational data model]

    SQL is the standard language for defining, manipulating and querying data in relational databases.

    Basic SQL query that lists customers from California:

    select customer_no, first_name, last_name, last_purchase
    from customers
    where address_state = 'CA'
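
To see a query like this one return rows, here is a small sketch that runs it against an in-memory SQLite database from Python. The table definition and the two sample customers are made up for illustration; only the column names come from the query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table definition matching the columns used by the query.
conn.execute("""
    CREATE TABLE customers (
        customer_no   INTEGER PRIMARY KEY,
        first_name    TEXT,
        last_name     TEXT,
        last_purchase TEXT,
        address_state TEXT
    )
""")

# Invented sample rows, one in California and one in New York.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        (1, "John", "Doe", "2018-09-30", "CA"),
        (2, "Jane", "Roe", "2018-10-01", "NY"),
    ],
)

# Same query as in the text, with the state passed as a parameter.
rows = conn.execute(
    """
    select customer_no, first_name, last_name, last_purchase
    from customers
    where address_state = ?
    """,
    ("CA",),
).fetchall()
```

Passing the state as a parameter instead of embedding it in the SQL string is a common habit that avoids quoting problems.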
    

    Why are relational databases not enough?

    1. Flexibility of schema - relational databases have fixed schemas, which means that they can hold only objects (in tables) and attributes (in columns) that were predefined by a database administrator (schema on write).
    2. Agility in development - having to define a schema before writing any data slows development down. In NoSQL databases you can just write the data and worry about its format when you need to read it (schema on read), which makes development more agile.
    3. Size of data - many NoSQL databases were built with large-scale data in mind and can store huge numbers of rows or objects, at the cost of weaker data integrity guarantees.
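The difference between schema on write and schema on read can be sketched in a few lines of Python. This is only an illustration: SQLite stands in for a relational database, a plain list of JSON strings stands in for a NoSQL store, and the `nickname` field and the `read_first_name` helper are hypothetical.

```python
import json
import sqlite3

# Schema on write: the table must declare its columns before data arrives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_no INTEGER, first_name TEXT)")

try:
    # 'nickname' was never declared, so the write itself is rejected.
    conn.execute("INSERT INTO customers (customer_no, nickname) VALUES (1, 'JD')")
    write_rejected = False
except sqlite3.OperationalError:
    write_rejected = True

# Schema on read: store whatever arrives and interpret it when reading.
raw_records = [
    json.dumps({"customer_no": 1, "first_name": "John"}),
    json.dumps({"customer_no": 2, "nickname": "JD"}),  # different shape, still accepted
]


def read_first_name(raw):
    """Apply the expected structure at read time; a missing field becomes None."""
    return json.loads(raw).get("first_name")


first_names = [read_first_name(raw) for raw in raw_records]
```

The flexible version accepts both records, but the reading code now has to cope with the missing field, which is the trade-off behind the points above.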

    Types of NoSQL databases

    Document databases

    Document databases store data as JSON (or JSON-like) documents, which are hierarchical sets of key-value pairs. Documents can be complex and contain sub-documents and lists.

    This is a sample document representing a customer:

    {
      "id": "1",
      "name": {
        "firstName": "John",
        "lastName": "Doe"
      },
      "address": {
        "street": "Lombard street",
        "city": "San Francisco",
        "state": "CA",
        "country": "US"
      }
    }
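
As a rough sketch of how an application works with such a document, the snippet below parses the customer document with Python's standard `json` module and reads nested values. No document database is involved here, and the `phones` list is a hypothetical field added to show that documents can hold lists.

```python
import json

# The customer document from the text, parsed into nested Python dictionaries.
document = json.loads("""
{
  "id": "1",
  "name": {"firstName": "John", "lastName": "Doe"},
  "address": {
    "street": "Lombard street",
    "city": "San Francisco",
    "state": "CA",
    "country": "US"
  }
}
""")

# Nested sub-documents are reached by following keys.
full_name = f'{document["name"]["firstName"]} {document["name"]["lastName"]}'
state = document["address"]["state"]

# Documents can also hold lists; 'phones' is a hypothetical field added here.
document["phones"] = ["555-0100", "555-0101"]
serialized = json.dumps(document)
```

Adding the new field required no schema change, which is what makes this model flexible.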
    

    Popular document databases:

    1. MongoDB
    2. DynamoDB
    3. Couchbase
    4. CouchDB

    Key-value store

    Key-value stores are the most basic NoSQL databases. They store data as simple key-value pairs (in the simplest case both are plain strings), so you can store a value under a given label and later retrieve it by that label.

    An example from Redis:

    > set customer1 JohnDoe
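
The idea behind a key-value store can be sketched with a toy in-memory class in Python. This is not a Redis client and says nothing about how Redis is implemented; the `KeyValueStore` class and its methods are hypothetical and only mirror the set and get operations.

```python
class KeyValueStore:
    """Toy in-memory stand-in for a key-value store (not a Redis client)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Store the value under the key, overwriting any previous value.
        self._data[key] = value

    def get(self, key):
        # Return None when the key has never been set.
        return self._data.get(key)

    def delete(self, key):
        # Return True if the key existed and was removed.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.set("customer1", "JohnDoe")
value = store.get("customer1")
missing = store.get("customer2")
```

The store knows nothing about the structure of the value, so any lookup other than by key has to be handled by the application.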
    

    Popular key-value stores:

    1. Redis
    2. Berkeley DB

    Graph databases

    Graph databases represent data as networks built from nodes and relationships.
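
A minimal sketch of the node-and-relationship model, written in plain Python rather than in any graph database's query language. The `Graph` class, the node names and the `KNOWS` and `BOUGHT` relationship types are all hypothetical.

```python
from collections import defaultdict


class Graph:
    """Toy property graph: nodes with properties, typed directed relationships."""

    def __init__(self):
        self.nodes = {}                 # node id -> properties
        self.edges = defaultdict(list)  # node id -> [(relationship type, target id)]

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_relationship(self, source, rel_type, target):
        self.edges[source].append((rel_type, target))

    def neighbors(self, source, rel_type):
        # Follow only relationships of the requested type.
        return [target for rel, target in self.edges[source] if rel == rel_type]


graph = Graph()
graph.add_node("john", label="Customer", name="John Doe")
graph.add_node("jane", label="Customer", name="Jane Roe")
graph.add_node("laptop", label="Product", name="Laptop")
graph.add_relationship("john", "KNOWS", "jane")
graph.add_relationship("john", "BOUGHT", "laptop")
graph.add_relationship("jane", "BOUGHT", "laptop")

bought_by_john = graph.neighbors("john", "BOUGHT")

# Two-hop traversal: what did the people John knows buy?
friend_purchases = [
    product
    for friend in graph.neighbors("john", "KNOWS")
    for product in graph.neighbors(friend, "BOUGHT")
]
```

Queries that hop across several relationships like this are the kind of access pattern graph databases are designed around.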

    Popular graph databases:

    1. Neo4j
    2. Giraph

    Wide column store

    Wide column stores resemble relational databases in that they store data in tables, but rows in the same table do not have to share the same columns, and data is grouped physically by column family rather than by complete row. Data in wide column databases is sparse, meaning empty columns don't take up space, which allows for hundreds, thousands or even millions of columns in a table.
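
Sparseness can be illustrated with a toy Python structure in which each row keeps only the columns it actually has. This is a sketch of the data model only, not of how Cassandra, HBase or Bigtable store data on disk; the `WideColumnTable` class and the column names are hypothetical.

```python
class WideColumnTable:
    """Toy sparse table: each row stores only the columns it actually has."""

    def __init__(self):
        self._rows = {}  # row key -> {column name: value}

    def put(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column):
        # A column that was never written for this row simply is not there.
        return self._rows.get(row_key, {}).get(column)

    def stored_cells(self):
        # Count only the cells that hold a value; empty columns cost nothing.
        return sum(len(columns) for columns in self._rows.values())


table = WideColumnTable()
table.put("customer1", "name", "John Doe")
table.put("customer1", "purchase:2018-10-04", "laptop")
table.put("customer2", "name", "Jane Roe")

cells = table.stored_cells()
```

Because only written cells are stored, a column such as `purchase:2018-10-04` can exist for one row without taking any space in the others.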

    Popular wide column stores:

    1. Apache Cassandra
    2. Apache HBase
    3. Google Bigtable