Commit 27c6281

Added information about file structure to the formatting.txt file.
1 parent 650dc39 commit 27c6281

File tree

2 files changed

+40
-1
lines changed

.idea/misc.xml

Lines changed: 3 additions & 0 deletions

formatting.txt

Lines changed: 37 additions & 1 deletion
@@ -82,4 +82,40 @@ list of variable names, and the "Graph Edges" section is followed by an
8282
optionally numbered list of edges in form X --> Y (for a DAG). Such ground truth
8383
graphs will be in files with the suffix ".ground.truth.graph.txt", thus:
8484

85-
mydata.ground.truth.graph.txt
85+
mydata.ground.truth.graph.txt
86+
87+
6. A certain directory structure has been used, which seems fortuitous.
88+
First, the highest level directories have been sorted into 'real' and 'simulated';
89+
'real' contains all real datasets; 'simulated' all simulated datasets. One
90+
important difference is that for simulated data, ground truth is known for
91+
certain, whereas for real data, some rationale is needed to infer ground
92+
truth: time order, expertise, or, probably best, a basis in
93+
actual experimental results.
94+
95+
Inside each of these directories, each example dataset has its own subdirectory,
96+
which in turn contains 'data', 'ground.truth', and 'images'. Ground truth
97+
is not always known, so the 'ground.truth' directory may be empty. If a data set
98+
99+
mydata.xxx.txt
100+
101+
is given in the data directory, and
102+
103+
mydata.knowledge.txt
104+
105+
or
106+
107+
mydata.ground.truth.graph.txt
108+
109+
is also given, these are the ground truth files for that dataset. If ground truth is
110+
given in some other format, it will have a different filename from these.
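
The naming convention above can be sketched in Python as follows. This is only an illustration, not part of the repository: the function names are hypothetical, and it assumes that the dataset name is the part of the file name before the first dot and that ground truth files sit in the 'ground.truth' subdirectory of the example.

```python
from pathlib import Path
from typing import List

# Suffixes that the text names for ground truth files.
GROUND_TRUTH_SUFFIXES = (".knowledge.txt", ".ground.truth.graph.txt")


def dataset_prefix(data_file_name: str) -> str:
    """Return the part of a data file name before the first dot.

    Assumption: the dataset name itself contains no dots, so
    'mydata.xxx.txt' has the prefix 'mydata'.
    """
    return data_file_name.split(".", 1)[0]


def ground_truth_names(data_file_name: str) -> List[str]:
    """List the ground truth file names that would match a data file."""
    prefix = dataset_prefix(data_file_name)
    return [prefix + suffix for suffix in GROUND_TRUTH_SUFFIXES]


def find_ground_truth(example_dir: Path, data_file_name: str) -> List[Path]:
    """Return the matching ground truth files that actually exist.

    Assumption: they live in the 'ground.truth' subdirectory of the
    example directory. An empty list means none were found.
    """
    gt_dir = example_dir / "ground.truth"
    candidates = [gt_dir / name for name in ground_truth_names(data_file_name)]
    return [path for path in candidates if path.is_file()]
```

An empty result does not mean that no ground truth exists, since ground truth in some other format will have a different filename.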
111+
112+
The 'images' directory contains, for instance, plot matrices of the data or
113+
other images.
114+
115+
Also, for each example, a 'readme.txt' file is given with a reference
116+
back to the original source of the data and, where it is hard to
117+
retrieve from that reference, information about the variables in the dataset.
118+
As this repository is not meant to be a replacement for the original data
119+
sources but rather a consistent formatting of the data from those original
120+
data sources, extensive notes are not usually included here, and the user
121+
is encouraged to explore the original sources.
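
As a rough illustration of the layout described in item 6, the following Python sketch walks a local copy of the repository and reports what each example contains. The function name and the returned dictionary keys are hypothetical, and it assumes that 'readme.txt' sits directly in each example directory.

```python
from pathlib import Path
from typing import Dict, List

TOP_LEVEL = ("real", "simulated")            # highest level directories
SUBDIRS = ("data", "ground.truth", "images")  # subdirectories of each example


def summarize_examples(repo_root: Path) -> List[Dict[str, object]]:
    """Describe each example directory under 'real' and 'simulated'."""
    summaries: List[Dict[str, object]] = []
    for kind in TOP_LEVEL:
        kind_dir = repo_root / kind
        if not kind_dir.is_dir():
            continue  # tolerate a partial checkout
        for example_dir in sorted(p for p in kind_dir.iterdir() if p.is_dir()):
            entry: Dict[str, object] = {
                "kind": kind,
                "example": example_dir.name,
                # Assumption: readme.txt is at the top of the example directory.
                "has_readme": (example_dir / "readme.txt").is_file(),
            }
            for sub in SUBDIRS:
                sub_dir = example_dir / sub
                # A missing or empty subdirectory yields an empty list,
                # which is expected for 'ground.truth' in some examples.
                entry[sub] = (
                    sorted(f.name for f in sub_dir.iterdir() if f.is_file())
                    if sub_dir.is_dir()
                    else []
                )
            summaries.append(entry)
    return summaries
```

Calling `summarize_examples(Path("."))` from the repository root would return one dictionary per example, with file names listed under the keys 'data', 'ground.truth', and 'images'.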
