This repo contains code to analyze the performance of strategies that attempt to copy other mutual funds based on their reported holdings.
It includes the scripts needed to set up a database and perform the analysis. The PostgreSQL database is set up using bash scripts, and the analysis is performed using R scripts.
Data is downloaded from the WRDS website. The process is described in detail in getting-data.html.
Note that much more data is downloaded than is actually needed for the analysis; however, the scripts work only with this extra data. The data can also be found here; however, this link is encrypted because sharing this data is probably prohibited. Here is the PostgreSQL dump, which contains enough data for the actual construction of copycat returns and the performance analysis.
The bond index can be downloaded from here for free. It should be saved as ~/data/raw/fred-bond-index-1989-12-2016-05.csv.
The database is set up in PostgreSQL 9.5.1 under the Windows 10 operating system. The bash script is called create-database.sh and is intended to run in the Cygwin environment with the working directory set to the project directory. It performs the following tasks:
- Creates a user
- Creates a database
- Creates schemas in the database
- Creates tables
- Populates tables with data from .csv files downloaded from WRDS.
- Manipulates the data, including creating a "clean" schema that contains data for more convenient linking of the CRSP and Thomson Reuters databases.
The script will skip already accomplished steps automatically.
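How create-database.sh detects completed steps is not shown here. As a minimal sketch of one way to make a setup step skippable, the hypothetical helper below runs an action only when a check command fails. The helper name run_unless, the database name copycats, and the psql check in the comment are assumptions for illustration, not taken from the script.

```shell
#!/usr/bin/env bash
# run_unless CHECK ACTION  (hypothetical helper, not part of the repo)
# Runs ACTION only when CHECK exits with a non-zero status, so a step
# whose result already exists is skipped on later runs.
run_unless() {
    local check="$1" action="$2"
    if eval "$check" >/dev/null 2>&1; then
        echo "skip: $action"
    else
        echo "run:  $action"
        eval "$action"
    fi
}

# Example use, assuming a database named "copycats" (an assumed name):
# run_unless \
#   "psql -U postgres -tAc \"SELECT 1 FROM pg_database WHERE datname='copycats'\" | grep -q 1" \
#   "createdb -U postgres copycats"
```

The check and the action are kept as separate commands so that each step can define "already done" in its own way, for example by querying the catalog for a role, a database, or a table.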
The batch files might require additional manual configuration before they are run:
- Set PostgreSQL bin folder
- Add 7-zip folder.
- Set usernames, passwords, etc for database.
- SQL files in the ~/sql/ directory whose names start with copy or import require changing the directory of the raw data.
Overall, the minimum a user needs to do to set up the database is install PostgreSQL and 7-zip and change the paths described above.
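Changing the raw data directory in the SQL files can be done by hand or scripted. The sketch below is one possible way to do it, assuming the files contain the directory as a literal path with forward slashes; the function name set_raw_dir and the example paths are hypothetical.

```shell
#!/usr/bin/env bash
# set_raw_dir SQL_DIR OLD_DIR NEW_DIR  (hypothetical helper, not part of the repo)
# Replaces OLD_DIR with NEW_DIR in every copy*.sql and import*.sql in SQL_DIR.
# Caveat: OLD_DIR is treated as a sed regular expression, and neither path
# may contain the characters '|' or '&'.
set_raw_dir() {
    local sql_dir="$1" old_dir="$2" new_dir="$3" f tmp
    for f in "$sql_dir"/copy*.sql "$sql_dir"/import*.sql; do
        [ -e "$f" ] || continue   # the glob matched no files
        tmp=$(mktemp)
        # '|' is the sed delimiter so that slashes in paths need no escaping
        if sed "s|$old_dir|$new_dir|g" "$f" > "$tmp"; then
            cat "$tmp" > "$f"     # overwrite in place, keeping file permissions
        fi
        rm -f "$tmp"
    done
}

# Example call with made-up paths:
# set_raw_dir sql "C:/old/location/data/raw" "D:/projects/copycats/data/raw"
```

Writing through a temporary file avoids relying on the in-place option of sed, which differs between sed implementations.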
The R scripts are divided into logical chunks stored in separate numbered files and are intended to be run in numerical order (though one might not want to run them all in one go):
- 01-load-packages.R: this file loads (and downloads if necessary) the packages required for the analysis.
- 02-connect-to-db.R: this file sets up connections to the database.
- 03-export-cusips.R: this file creates a file with the list of stocks for which information should be downloaded from WRDS. This might be tricky, as it should be run somewhere in the middle of the create-database.sh script; therefore, the simplest way is to just download the file from the repo, and create-database.sh will work.
- 04-copy-performance-functions.R: this file contains functions used in 04-copy-performance.R.
- 04-copy-performance.R: this file constructs the performance of copycat funds.
- 05-analysis-functions.R: this file contains functions used in 05-analysis.R.
- 05-analysis.R: this file performs the analysis of copycat performance.
The following analysis is performed (in 05-analysis.R):
- We calculate means of various indicators such as gross, after trading costs and net returns of original (primitive) funds and copycats. Respective t-statistics and p-values are also reported.
- We calculate means of various indicators such as gross, after trading costs and net returns of original (primitive) funds and copycats for each year. For these yearly means, the respective t-statistics and p-values are not reported.
- We sort funds into deciles based on various indicators using data for the last 12 months. We repeat this sorting once every 3 months. Then we compare various performance indicators for each decile and for the bottom-minus-top decile.
- We sort funds into deciles based on various indicators using data for the last 12 months. We repeat this sorting once every 3 months. Then we calculate Carhart's alphas of various performance indicators for each decile and for the bottom-minus-top decile.