Skip to content

dbWriteTable() much slower than RMySQL, especially slow on Mac #125

@anguspmitchell

Description

@anguspmitchell

RMySQL is orders of magnitude faster than RMariaDB, especially for writes. Tested on both OSX and Amazon Linux. Reading/writing from an AWS RDS database, MySQL v5.6.39. Also Amazon Linux is much faster than Mac, although I guess that could be a product of network speed. That is also a concern because we do some development on Macs. The reason I don't just switch to RMySQL is that RMariaDB handles unicode characters better than RMySQL, which is another important consideration.

Script

There is a DBI::dbWriteTable() and DBI::dbConnect() in addition to RMariaDB::dbWriteTable() and RMariaDB::dbConnect(). To avoid any namespacing confusion, I ran everything with every combination of DBI, RMySQL, and RMariaDB.

Before running I uninstalled and reinstalled R packages.

I also installed mariadb-devel and mysql-devel on Linux and mariadb-connector-c and mysql-connector-c on Mac.

# detach("package:RMySQL", unload=T)
# detach("package:RMariaDB", unload=T)
# detach("package:DBI", unload=T)
# remove.packages(c("DBI", "RMySQL", "RMariaDB"))
# install.packages(c("DBI", "RMySQL", "RMariaDB"))

# Amazon AMI
# sudo yum install mariadb-devel
# sudo yum install mysql-devel

# OSX
# brew install mariadb-connector-c
# brew install mysql-connector-c

dbString <- "xxxxx.us-west-2.rds.amazonaws.com"
maria_mariaConn <- RMariaDB::dbConnect(RMariaDB::MariaDB(),
                                       host = dbString,
                                       user="xxx", password="xxx", dbname="xxx")
mysql_mysqlConn <- RMySQL::dbConnect(RMySQL::MySQL(),
                                     host = dbString,
                                     user="xxx", password="xxx", dbname="xxx")
maria_dbiConn <- DBI::dbConnect(RMariaDB::MariaDB(),
                                host = dbString,
                                user="xxx", password="xxx", dbname="xxx")
mysql_dbiConn <- DBI::dbConnect(RMySQL::MySQL(),
                                host = dbString,
                                user="xxx", password="xxx", dbname="xxx")


# Query Maria - Maria Function / Maria Conn
startTime = Sys.time()
rows_MariaFunc_MariaConn = RMariaDB::dbGetQuery(maria_mariaConn, "Select * From TestTable")
endTime = Sys.time()
queryTime_MariaFunc_MariaConn = (endTime - startTime)

# Query Maria - Maria Function / DBI Connection
startTime = Sys.time()
rows_MariaFunc_MariaDBIConn = RMariaDB::dbGetQuery(maria_dbiConn, "Select * From TestTable")
endTime = Sys.time()
queryTime_MariaFunc_MariaDBIConn = (endTime - startTime)

# Query Maria - DBI Function / Maria Connection
startTime = Sys.time()
rows_DBIFunc_MariaConn = DBI::dbGetQuery(maria_mariaConn, "Select * From TestTable")
endTime = Sys.time()
queryTime_DBIFunc_MariaConn = (endTime - startTime)

# Query Maria - DBI Function / DBI Connection
startTime = Sys.time()
rows_DBIFunc_MariaDBIConn = DBI::dbGetQuery(maria_dbiConn, "Select * From TestTable")
endTime = Sys.time()
queryTime_DBIFunc_MariaDBIConn = (endTime - startTime)

# Query MySQL - MySQL conn
startTime = Sys.time()
rows_MySQLConn = DBI::dbGetQuery(mysql_mysqlConn, "Select * From TestTable")
endTime = Sys.time()
queryTime_MySQLConn = (endTime - startTime)

# Query MySQL - DBI conn
startTime = Sys.time()
rows_MySQLDBIConn = DBI::dbGetQuery(mysql_dbiConn, "Select * From TestTable")
endTime = Sys.time()
queryTime_MySQLDBIConn = (endTime - startTime)




# Cut down rows to make it run faster
rowsMySQL1000 = rows_MySQLConn[1:1000,]
rowsMaria1000 = rows_MariaFunc_MariaConn[1:1000,]

# Write Maria - Maria Func / Maria Conn
startTime = Sys.time()
RMariaDB::dbWriteTable(maria_mariaConn, "TestTable2", rowsMaria1000, row.names=FALSE, append=TRUE)
endTime = Sys.time()
writeTime_MariaFunc_MariaConn = (endTime - startTime)

# Write Maria - Maria Func / DBI Conn
startTime = Sys.time()
RMariaDB::dbWriteTable(maria_dbiConn, "TestTable2", rowsMaria1000, row.names=FALSE, append=TRUE)
endTime = Sys.time()
writeTime_MariaFunc_MariaDBIConn = (endTime - startTime)

# Write Maria - DBI Func / Maria Conn
startTime = Sys.time()
DBI::dbWriteTable(maria_mariaConn, "TestTable2", rowsMaria1000, row.names=FALSE, append=TRUE)
endTime = Sys.time()
writeTime_DBIFunc_MariaConn = (endTime - startTime)

# Write Maria - DBI Func / DBI Conn
startTime = Sys.time()
DBI::dbWriteTable(maria_dbiConn, "TestTable2", rowsMaria1000, row.names=FALSE, append=TRUE)
endTime = Sys.time()
writeTime_DBIFunc_MariaDBIConn = (endTime - startTime)


# Write MySQL - MySQL Func / MySQL Conn
startTime = Sys.time()
RMySQL::dbWriteTable(mysql_mysqlConn, "TestTable2", rowsMySQL1000, row.names=FALSE, append=TRUE)
endTime = Sys.time()
writeTime_MySQLFunc_MySQLConn = (endTime - startTime)

# Write MySQL - MySQL Func / DBI Conn
startTime = Sys.time()
RMySQL::dbWriteTable(mysql_dbiConn, "TestTable2", rowsMySQL1000, row.names=FALSE, append=TRUE)
endTime = Sys.time()
writeTime_MySQLFunc_MySQLDBIConn = (endTime - startTime)

# Write MySQL - DBI Func / MySQL Conn
startTime = Sys.time()
DBI::dbWriteTable(mysql_mysqlConn, "TestTable2", rowsMySQL1000, row.names=FALSE, append=TRUE)
endTime = Sys.time()
writeTime_DBIFunc_MySQLConn = (endTime - startTime)

# Write MySQL - DBI Func / DBI Conn
startTime = Sys.time()
DBI::dbWriteTable(mysql_dbiConn, "TestTable2", rowsMySQL1000, row.names=FALSE, append=TRUE)
endTime = Sys.time()
writeTime_DBIFunc_MySQLDBIConn = (endTime - startTime)

print(paste("queryTime_MariaFunc_MariaConn", as.numeric(queryTime_MariaFunc_MariaConn, units="secs")))
print(paste("queryTime_MariaFunc_DBIConn", as.numeric(queryTime_MariaFunc_MariaDBIConn, units="secs")))
print(paste("queryTime_DBIFunc_MariaConn", as.numeric(queryTime_DBIFunc_MariaConn, units="secs")))
print(paste("queryTime_DBIFunc_MariaDBIConn", as.numeric(queryTime_DBIFunc_MariaDBIConn, units="secs")))

print(paste("queryTime_MySQLConn", as.numeric(queryTime_MySQLConn, units="secs")))
print(paste("queryTime_MySQLDBIConn", as.numeric(queryTime_MySQLDBIConn, units="secs")))


print(paste("writeTime_MariaFunc_MariaConn", as.numeric(writeTime_MariaFunc_MariaConn, units="secs")))
print(paste("writeTime_MariaFunc_MariaDBIConn", as.numeric(writeTime_MariaFunc_MariaDBIConn, units="secs")))
print(paste("writeTime_DBIFunc_MariaConn", as.numeric(writeTime_DBIFunc_MariaConn, units="secs")))
print(paste("writeTime_DBIFunc_MariaDBIConn", as.numeric(writeTime_DBIFunc_MariaDBIConn, units="secs")))

print(paste("writeTime_MySQLFunc_MySQLConn", as.numeric(writeTime_MySQLFunc_MySQLConn, units="secs")))
print(paste("writeTime_MySQLFunc_MySQLDBIConn", as.numeric(writeTime_MySQLFunc_MySQLDBIConn, units="secs")))
print(paste("writeTime_DBIFunc_MySQLConn", as.numeric(writeTime_DBIFunc_MySQLConn, units="secs")))
print(paste("writeTime_DBIFunc_MySQLDBIConn", as.numeric(writeTime_DBIFunc_MySQLDBIConn, units="secs")))


print(paste("RMariaDB", packageVersion("RMariaDB")))
print(paste("RMariaDB", packageVersion("RMySQL")))
print(paste("R.Version Platform", R.Version()$platform))
print(paste("R.Version String", R.Version()$version.string))

Amazon Linux Output

RMySQL queries are 1.2x to 3x faster than RMariaDB
RMySQL writes are 10x to 25x faster than RMariaDB

[1] "queryTime_MariaFunc_MariaConn 0.721820831298828"
[1] "queryTime_MariaFunc_DBIConn 0.30603814125061"
[1] "queryTime_DBIFunc_MariaConn 0.382649660110474"
[1] "queryTime_DBIFunc_MariaDBIConn 0.300202369689941"
[1] "queryTime_MySQLConn 0.241434335708618"
[1] "queryTime_MySQLDBIConn 0.263653039932251"
[1] "writeTime_MariaFunc_MariaConn 1.173743724823"
[1] "writeTime_MariaFunc_MariaDBIConn 1.25841975212097"
[1] "writeTime_DBIFunc_MariaConn 1.1358757019043"
[1] "writeTime_DBIFunc_MariaDBIConn 1.13124299049377"
[1] "writeTime_MySQLFunc_MySQLConn 0.0569257736206055"
[1] "writeTime_MySQLFunc_MySQLDBIConn 0.113566160202026"
[1] "writeTime_DBIFunc_MySQLConn 0.0538115501403809"
[1] "writeTime_DBIFunc_MySQLDBIConn 0.0567092895507812"
[1] "RMariaDB 1.0.6"
[1] "RMariaDB 0.10.15"
[1] "R.Version Platform x86_64-koji-linux-gnu"
[1] "R.Version String R version 3.4.3 (2017-11-30)"
$ yum list | grep maria
mariadb-devel.x86_64                   1:5.5.60-1.amzn2              @amzn2-core
mariadb-libs.x86_64                    1:5.5.60-1.amzn2              installed  
mariadb.x86_64                         1:5.5.60-1.amzn2              amzn2-core 
mariadb-bench.x86_64                   1:5.5.60-1.amzn2              amzn2-core 
mariadb-embedded.x86_64                1:5.5.60-1.amzn2              amzn2-core 
mariadb-embedded-devel.x86_64          1:5.5.60-1.amzn2              amzn2-core 
mariadb-libs.i686                      1:5.5.60-1.amzn2              amzn2-core 
mariadb-server.x86_64                  1:5.5.60-1.amzn2              amzn2-core 
mariadb-test.x86_64                    1:5.5.60-1.amzn2              amzn2-core
$ yum list | grep mysql
apr-util-mysql.x86_64                  1.6.1-5.amzn2.0.2             amzn2-core 
dovecot-mysql.x86_64                   1:2.2.10-8.amzn2.0.2          amzn2-core 
freeradius-mysql.x86_64                3.0.13-9.amzn2                amzn2-core 
libdbi-dbd-mysql.x86_64                0.8.3-16.amzn2.0.1            amzn2-core 
mysql-connector-java.noarch            1:5.1.25-3.amzn2              amzn2-core 
mysql-connector-odbc.x86_64            5.2.5-7.amzn2                 amzn2-core 
pcp-pmda-mysql.x86_64                  3.12.2-5.amzn2                amzn2-core 
php-mysql.x86_64                       5.4.16-43.amzn2               amzn2-core 
php-mysqlnd.x86_64                     5.4.16-45.amzn2.0.6           amzn2-core 
qt-mysql.i686                          1:4.8.5-15.amzn2.0.3          amzn2-core 
qt-mysql.x86_64                        1:4.8.5-15.amzn2.0.3          amzn2-core 
qt5-qtbase-mysql.i686                  5.9.2-3.amzn2.0.1             amzn2-core 
qt5-qtbase-mysql.x86_64                5.9.2-3.amzn2.0.1             amzn2-core 
redland-mysql.x86_64                   1.0.16-6.amzn2.0.1            amzn2-core 
rsyslog-mysql.x86_64                   8.24.0-16.amzn2.6.1           amzn2-core

OSX Output

RMySQL queries are similar to RMariaDB
RMySQL writes are 100x to 200x faster than RMariaDB

[1] "queryTime_MariaFunc_MariaConn 26.3695220947266"
[1] "queryTime_MariaFunc_DBIConn 47.9590289592743"
[1] "queryTime_DBIFunc_MariaConn 13.6953809261322"
[1] "queryTime_DBIFunc_MariaDBIConn 10.8443579673767"
[1] "queryTime_MySQLConn 16.7168970108032"
[1] "queryTime_MySQLDBIConn 29.732666015625"
[1] "writeTime_MariaFunc_MariaConn 106.188654899597"
[1] "writeTime_MariaFunc_MariaDBIConn 103.038119077682"
[1] "writeTime_DBIFunc_MariaConn 100.028841018677"
[1] "writeTime_DBIFunc_MariaDBIConn 98.1016211509705"
[1] "writeTime_MySQLFunc_MySQLConn 1.18589186668396"
[1] "writeTime_MySQLFunc_MySQLDBIConn 0.954946041107178"
[1] "writeTime_DBIFunc_MySQLConn 0.477998971939087"
[1] "writeTime_DBIFunc_MySQLDBIConn 0.491441965103149"
[1] "RMariaDB 1.0.6"
[1] "RMariaDB 0.10.16"
[1] "R.Version Platform x86_64-apple-darwin15.6.0"
[1] "R.Version String R version 3.5.0 (2018-04-23)"
-> brew list --versions | grep mysql
mysql-connector-c 6.1.11
-> brew list --versions | grep mariadb
mariadb-connector-c 3.0.8
-> pip freeze | grep mysql
mysql-connector-python==2.0.4
-> pip freeze | grep mariadb

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions