Read SQL databases in R

Data Manipulation in R Import and export data
Read SQL databases in R

The DBI package is an interface to communicate R and relational database management systems. This package can connect to MySQL, PostgreSQL, SQLite, MariaDB and other databases through their own drivers. In this tutorial you will learn how to connect to different data bases, how to perform queries and how to write and remove tables.

SQL database connections

The dbConnect function from DBI creates a connection to a data base through a driver and the necessary authentication parameters, such as the user name, password, host, port and database name or database file name. In the following subsections we will review how to connect to the most common databases.

MySQL

To connect to a MySQL database you have to specify the MySQL driver from RMySQL and specify your authentication parameters.

library(DBI)

db <- dbConnect(RMySQL::MySQL(),
                user = "your_user_name",
                host = "your_host",
                password = "your_secret_password",
                dbname = "your_database_name",
                port = "your port")

PostgreSQL

In order to connect to a PostgreSQL data base you will need to use RPostgreSQL::PostgreSQL() and your authentication parameters.

library(DBI)

db <- dbConnect(RPostgreSQL::PostgreSQL(),
                user = "your_user_name",
                host = "your_host",
                password = "your_secret_password",
                dbname = "your_database_name",
                port = "your port")

MariaDB

A connection to MariaDB can be done through the RMariaDB::MariaDB() driver. You can connect to a remote database with your credentials or setting a group in your .my.cnf, mariadb.cnf or my.ini file (the default MariaDB option file).

library(DBI)

db <- dbConnect(RMariaDB::MariaDB(), group = "my-db")

# Or:
db <- dbConnect(RMariaDB::MariaDB(),
                host = "your-db.your-company.com",
                user = "your_user_name",
                password = "your_secret_password")

SQLite

The last connection we will review is to SQLite through RSQLite::SQLite(). In this scenario you just need to specify the name of your database.

library(DBI)

db <- dbConnect(RSQLite::SQLite(), "your-db.sqlite")

# In-memory database
# db <- dbConnect(RSQLite::SQLite(), ":memory:")

You can check the connections with dbListConnections(dvr). Unfortunately this is a deprecated function and not all drivers implement this method and you will have to keep track of your connections by yourself.

Read tables

In this section we are going to use the following SQLite sample database named db that contains two tables. You can use the following code to create it under your working directory:

# Sample data
df <- data.frame(x = 1:26, LETTERS)

db <- dbConnect(RSQLite::SQLite(), dbname = 'sample_db.sqlite')

# Sample tables
dbWriteTable(conn = db, name = "first_table", value = df)
dbWriteTable(conn = db, name = "second_table", value = mtcars)

List available tables

You can list the available tables with the dbListTables function.

dbListTables(db)
"first_table"  "second_table"

As stated before, this database has two tables named "first_table" and "second_table".

List fields on a table

You can also list the fields of each of the tables with dbListFields.

dbListFields(db, "first_table")
"x"       "LETTERS"

For instance, the first table contains two fields named "x" and "LETTERS".

Read specific tables with dbReadTable

Once you know the names of the table of your databases you can read them with the dbReadTable function. In the following example we are reading the table named "first_table".

df <- dbReadTable(db, "first_table")

df
    x LETTERS
1   1       A
2   2       B
3   3       C
4   4       D
5   5       E
...

Send queries

It is possible to execute SQL queries to the database with the dbSendQuery function. Then, you will need to use the dbFetch function to fetch the desired records of the query. Finally, you will need to free the resources with dbClearResult.

In the following example we are extracting 5 rows of the field named x from the table named first_table:

# Select the variable "x" from "first_table"
query <- dbSendQuery(db, "SELECT x FROM first_table")

# First 5 rows
n5 <- dbFetch(query, n = 5)

n5

# Clear
dbClearResult(query)

You can learn more about SQL and SQL queries in the following site: https://sqlearning.com/

Write and remove tables

The dbWriteTable and dbRemoveTable functions allow to add or remove tables from the database. If you want to add a new table named third_table to the database you can do the following:

df <- iris[1:10, ]

dbWriteTable(db, "third_table", df)

Now you can check that the table was added to the database with dbListTables.

dbListTables(db)
"first_table"  "second_table" "third_table"

Lastly, consider that you want to remove the second table. For that purpose you can type the following:

dbRemoveTable(db, "second_table")

Close a connection

When you finish your work you can close your connection to the database with the dbDisconnect function. If you were working with a temporary database it will be deleted.

dbDisconnect(db)

You can also close all connections listing all connections with dbListConnections and closing all at once following the example below. This is only available for some drivers as dbListConnections is a deprecated function.

# Specify the driver
dvr <- RMySQL::MySQL()

lapply(dbListConnections(dvr), dbDisconnect)