An Introductory Guide to Databases

This guide will cover the basics of what every aspiring developer should know about databases:
- Databases and database software
- Relational databases and SQL
- SQL tutorials
- Non-relational databases and NoSQL
- Choosing the right database for a project

Databases and database software:

A database is a collection of data that is organized so that it can be easily accessed, managed, and updated.

A DBMS (Database Management System) is the software used to create and manage your database.

Widely-used DBMSs include:

Oracle             MySQL            SQL Server             PostgreSQL (aka Postgres)

MongoDB       IBM Db2          Microsoft Access   Redis

SQLite            Cassandra        MariaDB                 Neo4j

Additionally, cloud database services are available from most major tech companies (Microsoft, Google and Amazon all have both free and paid cloud storage services).

Most, but not all, of these DBMSs are relational and use SQL. SQL stands for Structured Query Language and is the standard language used by relational database management systems (RDBMSs) to manage and access the data. Versions of SQL used by different RDBMSs may vary slightly, but in general most RDBMSs are pretty similar to each other.

Relational databases and databases that use SQL are not strictly the same although there is a huge amount of overlap. While pretty much all relational databases use SQL, some NoSQL (i.e. non-relational) databases can also use SQL for some purposes. Thus, while NoSQL originally meant "no SQL", many sources now refer to it as meaning "not only SQL", and use "NoSQL" interchangeably with "non-relational".

Don't worry if you're confused. This kind of thing happens all the time in the tech industry. Just keep reading.

Relational databases and SQL:

Relational databases are what most people think of when they think of databases. Relational databases use tables to organize data and use SQL to access the data. Here is an example of a database table:

Relational databases consist of one or more tables, and each table contains any number of records. A record contains data on a single object, and each record is one row in the table. In the table above, each record describes a geographic area (like a province/county/district).

This is an example of structured data, which is what relational databases are best used for. Note that each district has a name, country, continent, population, and sqkm_admin. Furthermore, the name, country, and continent is always text, while the population and sqkm_admin is always a number. So all records contain the exact same types of data.

Contrast this with if you were trying to pick a vacation destination, and had a stack of travel brochures. Some brochures might have pictures of beaches and hotels while others might have lists of cultural events and shows. You would have different kinds of information for each district, and most information wouldn't be easy to describe using a number or a few words. This is known as unstructured data, which relational databases are not good for.

You can read more about structured and unstructured data here:
https://www.datamation.com/big-data/structured-vs-unstructured-data.html

And here's an article about a standard set of properties know as ACID that pretty much all RDBMSs have, and that you should have basic knowledge of:
https://database.guide/what-is-acid-in-databases/

SQL Tutorials:

Despite the fact that most of the data out there is unstructured, structured data is easier to actually get useful information from, thus relational databases are still by far the most widely used type of database. And as previously stated, even NoSQL databases can use SQL for some tasks. Because of that, I highly recommend picking one of the following SQL tutorials and putting in a few hours to learn some basic SQL.

https://sqlbolt.com/
Bare bones, concise, gets right to the point, is useful as a basic reference as well.

https://www.w3schools.com/sql/default.asp
Provides a lot of diagrams and examples, and useful as an in-depth reference.

https://www.khanacademy.org/computing/computer-programming/sql
Video examples and simple exercises, explains a lot beyond just writing queries.

https://www.codecademy.com/learn/learn-sql
A tutorial based around guided hands-on work - interactive lessons, projects, quizzes.

Quick Review:
After going through the tutorial, everything on the first page of this list should be familiar:
https://www.kdnuggets.com/2016/07/database-key-terms-explained.html

Non-relational databases and NoSQL:

Unlike relational databases, NoSQL databases use a variety of methods to organize data. Database.guide does a great job of outlining the generally agreed upon 4 categories of NoSQL databases. These 5 short articles will give you a solid understanding of the basics of NoSQL:
https://database.guide/nosql-database-types/
https://database.guide/what-is-a-key-value-database/
https://database.guide/what-is-a-document-store-database/
https://database.guide/what-is-a-column-store-database/
https://database.guide/what-is-a-graph-database/

Choosing the right database for a project:

First, let's review what you learned so far. If you've gone through this guide up to this point, you should be able to read and understand these articles, which discuss the pros and cons of different database types:

https://dzone.com/articles/the-types-of-modern-databases
https://www.infoworld.com/article/3240644/what-is-nosql-nosql-databases-explained.html

Now, let's look at some basic considerations you may have when picking a database for a student project, which may be different from the considerations a company has when picking a database. You'll probably be less concerned with things like scalability and security (unless security is included in the focus of your project), and more concerned with things like cost (free, please), how easily you can connect your database to the other technologies you're using, and where/how to host your database.

Here's a video where the narrator discusses some of the issues you might consider when choosing a database. The information he gives is excellent, though he only discusses 4 databases and is more focused on CAP properties (the video will explain what they are) than anything else.
https://www.youtube.com/watch?v=v5e_PasMdXc

And here's a different tech professional who bases his choice mostly on how his data is best organized:
https://arcentry.com/blog/choosing-a-database-in-2018/

I'm going to add one more thing to consider if you're doing this for a portfolio project: how widely used the database is. Since one of your main goals will be to gain experience with databases, you should use one that a potential employer is likely to care about.

To that end, here's a massive list of DBMSs, ranked by popularity, with in-depth information about all of them:
https://db-engines.com/en/ranking
For an RDBMS, generally speaking your best bets are any of the top 4, possibly MariaDB, or Hive if you're working with Hadoop.

And if you decide to go with a NoSQL database, here's one more article that might help:
https://www.improgrammer.net/most-popular-nosql-database/

You can find a full list of all the pages on the Handbook here..... 

Create your website for free! This website was made with Webnode. Create your own for free today! Get started