Update R packages for macOS and Windows.
romanhaa committed Sep 25, 2019
1 parent b600382 commit 12bc20f
Showing 9,263 changed files with 1,391,962 additions and 26,718 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
SchemaGuidelines
================

We want our schemas to be as portable as possible. In particular they should
be compatible with SQLite, MySQL, PostgreSQL and Oracle.
All the current schemas (i.e. the *_DB.sql files in schemas_0.9/ and
schemas_1.0/) have been successfully tested (i.e. imported) with SQLite
(3.4.1), MySQL+InnoDB (5.0.26) and PostgreSQL (8.1.9) on a 64-bit openSUSE
10.2 system. They have not been tested on Oracle yet.
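As a rough illustration of what "imported" means for the SQLite case, the sketch below runs a schema script against a fresh in-memory database and lists the tables that result. It is a minimal sketch, not the project's actual test procedure: the helper names import_schema and list_tables are hypothetical, and the schema text is a short excerpt of the ARABIDOPSISCHIP_DB schema from this commit rather than a full *_DB.sql file.

```python
import sqlite3

# Excerpt of the ARABIDOPSISCHIP_DB schema; a real check would read a *_DB.sql file.
SCHEMA_SQL = """
CREATE TABLE genes (
  id INTEGER PRIMARY KEY,
  gene_id CHAR(9) NOT NULL UNIQUE
);
CREATE TABLE probes (
  id INTEGER NULL,
  probe_id VARCHAR(80) NOT NULL,
  is_multiple SMALLINT NOT NULL,
  FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE INDEX Fprobes ON probes (id);
"""


def import_schema(schema_sql):
    """Hypothetical helper: run a schema script in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    # executescript raises sqlite3.Error if SQLite rejects any statement.
    conn.executescript(schema_sql)
    return conn


def list_tables(conn):
    """Hypothetical helper: names of the user tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


conn = import_schema(SCHEMA_SQL)
print(list_tables(conn))  # prints ['genes', 'probes']
```

A successful import only shows that SQLite accepts the statements. It says nothing about MySQL, PostgreSQL or Oracle, which need their own import runs.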

o All of the *_DB.sql files must define a "metadata" table (with cols "name"
and "value") and all the *.sqlite files using one of these schemas must
have a 'DBSCHEMA' and a 'DBSCHEMAVERSION' entry in their "metadata" table.

o Make explicit use of the NULL or NOT NULL constraint on every column (except
on PRIMARY KEY cols that are implicitly NOT NULL).

o Centralize all the data type definitions in the DataTypes.txt file and use
them consistently across all the *_DB.sql files.

o Prefer the CHAR(n) or VARCHAR(n) types over the non-standard TEXT type
for character columns. Note that n must be <= 255 for compatibility with
MySQL. Don't define an INDEX (or a UNIQUE constraint, which implicitly
creates an INDEX) on a TEXT column since this is not portable.
Also don't define an INDEX on a CHAR(n) or VARCHAR(n) column where n is
large (i.e. > 80) since this is not portable either. Doing so when n <= 80
should be safe though.

o Always use a character type for "external" (aka "real world") ids even for
ids like Entrez Gene IDs, PubMed IDs or OMIM IDs that are in fact integers.

o Put PRIMARY KEY and UNIQUE definitions "in line" at the end of the column
definition (after the SQL type and, for UNIQUE, after the NULL/NOT NULL
constraint).

o Make the PRIMARY KEY column the first column in the CREATE TABLE statement.

o Put "regular" (i.e. non PRIMARY KEY, non UNIQUE) INDEX definitions all
together at the end of the *_DB.sql file.

o Make sure that referenced tables are created before referencing tables.

o Make the FOREIGN KEYs portable. This means that:
- The syntax must be portable. Putting
FOREIGN KEY (col) REFERENCES table (col)
inside the CREATE TABLE statement but at the end of it (after all the
column definitions) is compatible with SQLite (which parses the clause
but does not enforce it in the version tested here), MySQL+InnoDB and
PostgreSQL.
- There must be NOT NULL and UNIQUE constraints on the referenced column
(typically a PRIMARY KEY).
- The referencing and referenced columns must have the same type.

o Prefix the names of the columns that store "internal" ids with an
underscore.
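Several of the guidelines above can be seen together in a small schema, and the "same type" rule for FOREIGN KEYs can be checked mechanically. The sketch below is illustrative only: the tables and the helpers column_types and mismatched_foreign_keys are hypothetical and not part of the schemas in this commit, and the check relies on SQLite's PRAGMA table_info and PRAGMA foreign_key_list, so it only inspects what SQLite recorded from the declarations.

```python
import sqlite3

# Hypothetical schema: "metadata" table, explicit NULL/NOT NULL, character type
# for the external PubMed ID, PRIMARY KEY column first, referenced table created
# first, FOREIGN KEY at the end of CREATE TABLE, regular INDEX at the end,
# underscore prefix on the internal id column.
SCHEMA_SQL = """
CREATE TABLE metadata (
  name VARCHAR(80) PRIMARY KEY,
  value VARCHAR(255) NULL
);
CREATE TABLE genes (
  _id INTEGER PRIMARY KEY,
  gene_id VARCHAR(10) NOT NULL UNIQUE
);
CREATE TABLE pubmed (
  _id INTEGER NOT NULL,
  pubmed_id VARCHAR(10) NOT NULL,
  FOREIGN KEY (_id) REFERENCES genes (_id)
);
CREATE INDEX Fpubmed ON pubmed (_id);
"""


def column_types(conn, table):
    """Map column name -> declared type; table must be a trusted identifier."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}


def mismatched_foreign_keys(conn, table):
    """List foreign keys whose two columns have different declared types."""
    local = column_types(conn, table)
    problems = []
    # PRAGMA foreign_key_list rows are (id, seq, table, from, to, ...).
    for fk in conn.execute(f"PRAGMA foreign_key_list({table})").fetchall():
        ref_table, from_col, to_col = fk[2], fk[3], fk[4]
        ref_type = column_types(conn, ref_table).get(to_col)
        if local[from_col] != ref_type:
            problems.append((from_col, local[from_col], ref_table, to_col, ref_type))
    return problems


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA_SQL)
print(mismatched_foreign_keys(conn, "pubmed"))  # prints []
```

The comparison is on the declared type strings, so INTEGER versus INT would be reported as a mismatch. That strictness seems in keeping with the guideline on using the centralized data type definitions consistently, but it is a design choice of this sketch.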

@@ -0,0 +1,16 @@
This directory contains version 0.9 of the full set of schemas used in the
sqlite-based annotation data packages.

Version 0.9 is our target for the BioC 2.1 release, i.e. it is the version
that we plan to use in the data packages that will be released with
Bioconductor 2.1 (AnnotationDbi 1.0.0).

Version 0.9 has been successfully tested (i.e. imported) with SQLite
(3.4.1), MySQL+InnoDB (5.0.26) and PostgreSQL (8.1.9) on a 64-bit openSUSE
10.2 system. It has not been tested on Oracle yet.

All the *.sqlite files using one of the 0.9 schemas must set DBSCHEMAVERSION
to 0.9 in their "metadata" table.

See the DataTypes.txt file for all the data types used across the 0.9 schemas.
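The metadata requirement above lends itself to a simple check. The following is a minimal sketch under stated assumptions: check_metadata is a hypothetical helper, not an AnnotationDbi function, and the in-memory database stands in for a real *.sqlite file.

```python
import sqlite3


def check_metadata(conn, expected_version="0.9"):
    """Hypothetical check: return DBSCHEMA if both required entries are present
    and DBSCHEMAVERSION matches expected_version, else raise ValueError."""
    entries = dict(conn.execute("SELECT name, value FROM metadata"))
    missing = [key for key in ("DBSCHEMA", "DBSCHEMAVERSION") if key not in entries]
    if missing:
        raise ValueError("missing metadata entries: " + ", ".join(missing))
    if entries["DBSCHEMAVERSION"] != expected_version:
        raise ValueError("unexpected DBSCHEMAVERSION: " + str(entries["DBSCHEMAVERSION"]))
    return entries["DBSCHEMA"]


# Stand-in for a data package's *.sqlite file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (name VARCHAR(80) PRIMARY KEY, value VARCHAR(255))")
conn.executemany(
    "INSERT INTO metadata (name, value) VALUES (?, ?)",
    [("DBSCHEMA", "ARABIDOPSISCHIP_DB"), ("DBSCHEMAVERSION", "0.9")],
)
print(check_metadata(conn))  # prints ARABIDOPSISCHIP_DB
```

To run the same check on a file, pass its path to sqlite3.connect instead of ":memory:".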

@@ -0,0 +1,134 @@
--
-- ARABIDOPSISCHIP_DB schema
-- =========================
--

-- The "genes" table is the central table.
CREATE TABLE genes (
id INTEGER PRIMARY KEY,
gene_id CHAR(9) NOT NULL UNIQUE -- AGI locus ID
);

-- Data linked to the "genes" table.
CREATE TABLE probes (
id INTEGER NULL, -- REFERENCES genes
probe_id VARCHAR(80) NOT NULL, -- manufacturer ID
is_multiple SMALLINT NOT NULL, -- a silly and useless field
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE aracyc (
id INTEGER NOT NULL, -- REFERENCES genes
pathway_name VARCHAR(255) NOT NULL, -- AraCyc pathway name
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE chromosome_locations (
id INTEGER NOT NULL, -- REFERENCES genes
chromosome CHAR(1) NOT NULL, -- Arabidopsis chromosome
start_location INTEGER NOT NULL,
FOREIGN KEY (id) REFERENCES genes (id)
);
-- Table NOT used!
CREATE TABLE ec (
id INTEGER NOT NULL,
ec_number VARCHAR(13) NOT NULL,
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE enzyme (
id INTEGER NOT NULL, -- REFERENCES genes
ec_name VARCHAR(255) NOT NULL, -- EC name
FOREIGN KEY (id) REFERENCES genes (id)
);
-- Note that the "gene_info" table differs from other schemas:
-- o no UNIQUE constraint on col "id"
-- o no NOT NULL constraints on cols "gene_name" and "symbol"
-- o one additional col "chromosome"
CREATE TABLE gene_info (
id INTEGER NOT NULL, -- REFERENCES genes
gene_name VARCHAR(255) NULL, -- gene name
symbol VARCHAR(80) NULL, -- gene symbol
chromosome CHAR(1) NULL, -- Arabidopsis chromosome
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE go_bp (
id INTEGER NOT NULL, -- REFERENCES genes
go_id CHAR(10) NOT NULL, -- GO ID
evidence CHAR(3) NOT NULL, -- GO evidence code
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE go_bp_all (
id INTEGER NOT NULL, -- REFERENCES genes
go_id CHAR(10) NOT NULL, -- GO ID
evidence CHAR(3) NOT NULL, -- GO evidence code
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE go_cc (
id INTEGER NOT NULL, -- REFERENCES genes
go_id CHAR(10) NOT NULL, -- GO ID
evidence CHAR(3) NOT NULL, -- GO evidence code
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE go_cc_all (
id INTEGER NOT NULL, -- REFERENCES genes
go_id CHAR(10) NOT NULL, -- GO ID
evidence CHAR(3) NOT NULL, -- GO evidence code
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE go_mf (
id INTEGER NOT NULL, -- REFERENCES genes
go_id CHAR(10) NOT NULL, -- GO ID
evidence CHAR(3) NOT NULL, -- GO evidence code
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE go_mf_all (
id INTEGER NOT NULL, -- REFERENCES genes
go_id CHAR(10) NOT NULL, -- GO ID
evidence CHAR(3) NOT NULL, -- GO evidence code
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE kegg (
id INTEGER NOT NULL, -- REFERENCES genes
kegg_id CHAR(5) NOT NULL, -- KEGG pathway short ID
FOREIGN KEY (id) REFERENCES genes (id)
);
CREATE TABLE pubmed (
id INTEGER NOT NULL, -- REFERENCES genes
pubmed_id VARCHAR(10) NOT NULL, -- PubMed ID
FOREIGN KEY (id) REFERENCES genes (id)
);

-- Metadata tables.
CREATE TABLE metadata (
name VARCHAR(80) PRIMARY KEY,
value VARCHAR(255)
);
CREATE TABLE qcdata (
map_name VARCHAR(80) PRIMARY KEY,
count INTEGER NOT NULL
);
CREATE TABLE map_metadata (
map_name VARCHAR(80) NOT NULL,
source_name VARCHAR(80) NOT NULL,
source_url VARCHAR(255) NOT NULL,
source_date VARCHAR(20) NOT NULL
);

-- Explicit index creation on the referencing column of all the foreign keys.
-- Note that MySQL+InnoDB creates those indexes automatically, but SQLite and
-- PostgreSQL do not, so they are defined explicitly here.
CREATE INDEX Fprobes ON probes (id);
CREATE INDEX Faracyc ON aracyc (id);
CREATE INDEX Fchromosome_locations ON chromosome_locations (id);
CREATE INDEX Fec ON ec (id);
CREATE INDEX Fenzyme ON enzyme (id);
CREATE INDEX Fgene_info ON gene_info (id);
CREATE INDEX Fgo_bp ON go_bp (id);
CREATE INDEX Fgo_bp_all ON go_bp_all (id);
CREATE INDEX Fgo_cc ON go_cc (id);
CREATE INDEX Fgo_cc_all ON go_cc_all (id);
CREATE INDEX Fgo_mf ON go_mf (id);
CREATE INDEX Fgo_mf_all ON go_mf_all (id);
CREATE INDEX Fkegg ON kegg (id);
CREATE INDEX Fpubmed ON pubmed (id);

-- Other indexes.
CREATE INDEX Lprobes ON probes (probe_id);
