Commit

ordinal
vincentarelbundock committed Jan 6, 2023
1 parent d55a790 commit a9c0a8b
Showing 13 changed files with 2,638 additions and 3 deletions.
1 change: 1 addition & 0 deletions DESCRIPTION
@@ -63,6 +63,7 @@ Imports:
nlme,
nycflights13,
openintro,
ordinal,
palmerpenguins,
plm,
plyr,
5 changes: 2 additions & 3 deletions README.md
@@ -1,7 +1,7 @@

# What is this?

-`Rdatasets` is a collection of 1937 datasets which were originally
+`Rdatasets` is a collection of 1941 datasets which were originally
distributed alongside the statistical software environment `R` and some
of its add-on packages. The goal is to make these data more broadly
accessible for teaching and statistical software development.
@@ -76,13 +76,12 @@ Rscript scrape.R
cd doc
../documentation.sh
cd ..
-Rscript -e "rmarkdown::render("README.Rmd")"
+Rscript -e "rmarkdown::render('README.Rmd')"
```

Second, commit to master.

``` bash
-cd ..
git add .
git commit
```
15 changes: 15 additions & 0 deletions csv/ordinal/income.csv
@@ -0,0 +1,15 @@
"","year","pct","income"
"1","1960",6.5,"0-3"
"2","1960",8.2,"3-5"
"3","1960",11.3,"5-7"
"4","1960",23.5,"7-10"
"5","1960",15.6,"10-12"
"6","1960",12.7,"12-15"
"7","1960",22.2,"15+"
"8","1970",4.3,"0-3"
"9","1970",6,"3-5"
"10","1970",7.7,"5-7"
"11","1970",13.2,"7-10"
"12","1970",10.5,"10-12"
"13","1970",16.3,"12-15"
"14","1970",42.1,"15+"
1,848 changes: 1,848 additions & 0 deletions csv/ordinal/soup.csv

Large diffs are not rendered by default.

73 changes: 73 additions & 0 deletions csv/ordinal/wine.csv
@@ -0,0 +1,73 @@
"","response","rating","temp","contact","bottle","judge"
"1",36,"2","cold","no","1","1"
"2",48,"3","cold","no","2","1"
"3",47,"3","cold","yes","3","1"
"4",67,"4","cold","yes","4","1"
"5",77,"4","warm","no","5","1"
"6",60,"4","warm","no","6","1"
"7",83,"5","warm","yes","7","1"
"8",90,"5","warm","yes","8","1"
"9",17,"1","cold","no","1","2"
"10",22,"2","cold","no","2","2"
"11",14,"1","cold","yes","3","2"
"12",50,"3","cold","yes","4","2"
"13",30,"2","warm","no","5","2"
"14",51,"3","warm","no","6","2"
"15",90,"5","warm","yes","7","2"
"16",70,"4","warm","yes","8","2"
"17",36,"2","cold","no","1","3"
"18",50,"3","cold","no","2","3"
"19",42,"3","cold","yes","3","3"
"20",23,"2","cold","yes","4","3"
"21",80,"5","warm","no","5","3"
"22",81,"5","warm","no","6","3"
"23",73,"4","warm","yes","7","3"
"24",62,"4","warm","yes","8","3"
"25",46,"3","cold","no","1","4"
"26",27,"2","cold","no","2","4"
"27",48,"3","cold","yes","3","4"
"28",32,"2","cold","yes","4","4"
"29",57,"3","warm","no","5","4"
"30",37,"2","warm","no","6","4"
"31",84,"5","warm","yes","7","4"
"32",58,"3","warm","yes","8","4"
"33",26,"2","cold","no","1","5"
"34",45,"3","cold","no","2","5"
"35",61,"4","cold","yes","3","5"
"36",41,"3","cold","yes","4","5"
"37",48,"3","warm","no","5","5"
"38",41,"3","warm","no","6","5"
"39",58,"3","warm","yes","7","5"
"40",55,"3","warm","yes","8","5"
"41",46,"3","cold","no","1","6"
"42",30,"2","cold","no","2","6"
"43",54,"3","cold","yes","3","6"
"44",37,"2","cold","yes","4","6"
"45",32,"2","warm","no","5","6"
"46",60,"4","warm","no","6","6"
"47",88,"5","warm","yes","7","6"
"48",73,"4","warm","yes","8","6"
"49",13,"1","cold","no","1","7"
"50",19,"1","cold","no","2","7"
"51",31,"2","cold","yes","3","7"
"52",29,"2","cold","yes","4","7"
"53",22,"2","warm","no","5","7"
"54",43,"3","warm","no","6","7"
"55",32,"2","warm","yes","7","7"
"56",49,"3","warm","yes","8","7"
"57",25,"2","cold","no","1","8"
"58",32,"2","cold","no","2","8"
"59",39,"2","cold","yes","3","8"
"60",40,"3","cold","yes","4","8"
"61",51,"3","warm","no","5","8"
"62",45,"3","warm","no","6","8"
"63",42,"3","warm","yes","7","8"
"64",67,"4","warm","yes","8","8"
"65",12,"1","cold","no","1","9"
"66",29,"2","cold","no","2","9"
"67",47,"3","cold","yes","3","9"
"68",28,"2","cold","yes","4","9"
"69",47,"3","warm","no","5","9"
"70",38,"2","warm","no","6","9"
"71",72,"4","warm","yes","7","9"
"72",65,"4","warm","yes","8","9"
3 changes: 3 additions & 0 deletions datasets.csv
@@ -1333,6 +1333,9 @@
"openintro","yawn","Contagiousness of yawning",50,2,2,0,2,0,0,"https://vincentarelbundock.github.io/Rdatasets/csv/openintro/yawn.csv","https://vincentarelbundock.github.io/Rdatasets/doc/openintro/yawn.html"
"openintro","yrbss","Youth Risk Behavior Surveillance System (YRBSS)",13583,13,2,8,0,0,5,"https://vincentarelbundock.github.io/Rdatasets/csv/openintro/yrbss.csv","https://vincentarelbundock.github.io/Rdatasets/doc/openintro/yrbss.html"
"openintro","yrbss_samp","Sample of Youth Risk Behavior Surveillance System (YRBSS)",100,13,2,8,0,0,5,"https://vincentarelbundock.github.io/Rdatasets/csv/openintro/yrbss_samp.csv","https://vincentarelbundock.github.io/Rdatasets/doc/openintro/yrbss_samp.html"
"ordinal","income","Income distribution (percentages) in the Northeast US",14,3,1,0,2,0,1,"https://vincentarelbundock.github.io/Rdatasets/csv/ordinal/income.csv","https://vincentarelbundock.github.io/Rdatasets/doc/ordinal/income.html"
"ordinal","soup","Discrimination study of packet soup",1847,12,4,0,12,0,0,"https://vincentarelbundock.github.io/Rdatasets/csv/ordinal/soup.csv","https://vincentarelbundock.github.io/Rdatasets/doc/ordinal/soup.html"
"ordinal","wine","Bitterness of wine",72,6,2,0,5,0,1,"https://vincentarelbundock.github.io/Rdatasets/csv/ordinal/wine.csv","https://vincentarelbundock.github.io/Rdatasets/doc/ordinal/wine.html"
"palmerpenguins","penguins","Size measurements for adult foraging penguins near Palmer Station, Antarctica",344,8,1,0,3,0,5,"https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv","https://vincentarelbundock.github.io/Rdatasets/doc/palmerpenguins/penguins.html"
"plm","Cigar","Cigarette Consumption",1380,9,0,0,0,0,9,"https://vincentarelbundock.github.io/Rdatasets/csv/plm/Cigar.csv","https://vincentarelbundock.github.io/Rdatasets/doc/plm/Cigar.html"
"plm","Crime","Crime in North Carolina",630,44,1,0,2,0,42,"https://vincentarelbundock.github.io/Rdatasets/csv/plm/Crime.csv","https://vincentarelbundock.github.io/Rdatasets/doc/plm/Crime.html"
78 changes: 78 additions & 0 deletions datasets.html
@@ -34710,6 +34710,84 @@
<td class=cellinside><a href='https://vincentarelbundock.github.io/Rdatasets/doc/openintro/yrbss_samp.html'> DOC </a>
</td></tr>

<tr>
<td class=cellinside>ordinal
</td>
<td class=cellinside>income
</td>
<td class=cellinside>Income distribution (percentages) in the Northeast US
</td>
<td class=cellinside> 14
</td>
<td class=cellinside> 3
</td>
<td class=cellinside> 1
</td>
<td class=cellinside> 0
</td>
<td class=cellinside> 2
</td>
<td class=cellinside> 0
</td>
<td class=cellinside> 1
</td>
<td class=cellinside><a href='https://vincentarelbundock.github.io/Rdatasets/csv/ordinal/income.csv'> CSV </a>
</td>
<td class=cellinside><a href='https://vincentarelbundock.github.io/Rdatasets/doc/ordinal/income.html'> DOC </a>
</td></tr>

<tr>
<td class=cellinside>ordinal
</td>
<td class=cellinside>soup
</td>
<td class=cellinside>Discrimination study of packet soup
</td>
<td class=cellinside> 1847
</td>
<td class=cellinside> 12
</td>
<td class=cellinside> 4
</td>
<td class=cellinside> 0
</td>
<td class=cellinside>12
</td>
<td class=cellinside> 0
</td>
<td class=cellinside> 0
</td>
<td class=cellinside><a href='https://vincentarelbundock.github.io/Rdatasets/csv/ordinal/soup.csv'> CSV </a>
</td>
<td class=cellinside><a href='https://vincentarelbundock.github.io/Rdatasets/doc/ordinal/soup.html'> DOC </a>
</td></tr>

<tr>
<td class=cellinside>ordinal
</td>
<td class=cellinside>wine
</td>
<td class=cellinside>Bitterness of wine
</td>
<td class=cellinside> 72
</td>
<td class=cellinside> 6
</td>
<td class=cellinside> 2
</td>
<td class=cellinside> 0
</td>
<td class=cellinside> 5
</td>
<td class=cellinside> 0
</td>
<td class=cellinside> 1
</td>
<td class=cellinside><a href='https://vincentarelbundock.github.io/Rdatasets/csv/ordinal/wine.csv'> CSV </a>
</td>
<td class=cellinside><a href='https://vincentarelbundock.github.io/Rdatasets/doc/ordinal/wine.html'> DOC </a>
</td></tr>

<tr>
<td class=cellinside>palmerpenguins
</td>
103 changes: 103 additions & 0 deletions doc/ordinal/income.html
@@ -0,0 +1,103 @@
<!DOCTYPE html><html><head><title>R: Income distribution (percentages) in the Northeast US</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css">
<script type="text/javascript">
const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"};
function processMathHTML() {
var l = document.getElementsByClassName('reqn');
for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); }
return;
}</script>
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js"
onload="processMathHTML();"></script>
<link rel="stylesheet" type="text/css" href="R.css" />
</head><body><div class="container">

<table style="width: 100%;"><tr><td>income</td><td style="text-align: right;">R Documentation</td></tr></table>

<h2>
Income distribution (percentages) in the Northeast US
</h2>

<h3>Description</h3>

<p>Income distribution (percentages) in the Northeast US in 1960 and 1970,
adapted from McCullagh (1980).
</p>


<h3>Usage</h3>

<pre><code class='language-R'>income
</code></pre>


<h3>Format</h3>


<dl>
<dt><code>year</code></dt><dd>
<p>year.
</p>
</dd>
<dt><code>pct</code></dt><dd>
<p>percentage of population in income class per year.
</p>
</dd>
<dt><code>income</code></dt><dd>
<p>income groups. The unit is thousands of constant (1973) US dollars.
</p>
</dd>
</dl>



<h3>Source</h3>

<p>Data are adapted from McCullagh (1980).
</p>


<h3>References</h3>

<p>McCullagh, P. (1980) Regression Models for Ordinal Data. <em>Journal
of the Royal Statistical Society. Series B (Methodological)</em>,
Vol. 42, No. 2, pp. 109-142.
</p>


<h3>Examples</h3>

<pre><code class='language-R'>
print(income)

## Convenient table:
(tab &lt;- xtabs(pct ~ year + income, income))

## small rounding error in 1970:
rowSums(tab)

## compare link functions via the log-likelihood:
links &lt;- c("logit", "probit", "cloglog", "loglog", "cauchit")
sapply(links, function(link) {
  clm(income ~ year, data=income, weights=pct, link=link)$logLik })
## a heavy tailed (cauchy) or left skew (cloglog) latent distribution
## is fitting best.

## The data are defined as:
income.levels &lt;- c(0, 3, 5, 7, 10, 12, 15)
income &lt;- paste(income.levels, c(rep("-", 6), "+"),
                c(income.levels[-1], ""), sep = "")
income &lt;-
  data.frame(year=factor(rep(c("1960", "1970"), each = 7)),
             pct = c(6.5, 8.2, 11.3, 23.5, 15.6, 12.7, 22.2,
                     4.3, 6, 7.7, 13.2, 10.5, 16.3, 42.1),
             income=factor(rep(income, 2), ordered=TRUE,
                           levels=income))

</code></pre>


</div>
</body></html>
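The documentation above builds `income` as an ordered factor, but `csv/ordinal/income.csv` stores the income groups as plain strings, and alphabetical sorting would put "10-12" before "3-5". The R sketch below embeds the 14 rows from that file so that it runs offline, restores the ordering, and reproduces the yearly totals mentioned in the examples. The names `income_csv`, `income_levels`, and `totals` are illustrative and not part of the repository.

```r
# Contents of csv/ordinal/income.csv, copied from the diff in this commit.
income_csv <- '"","year","pct","income"
"1","1960",6.5,"0-3"
"2","1960",8.2,"3-5"
"3","1960",11.3,"5-7"
"4","1960",23.5,"7-10"
"5","1960",15.6,"10-12"
"6","1960",12.7,"12-15"
"7","1960",22.2,"15+"
"8","1970",4.3,"0-3"
"9","1970",6,"3-5"
"10","1970",7.7,"5-7"
"11","1970",13.2,"7-10"
"12","1970",10.5,"10-12"
"13","1970",16.3,"12-15"
"14","1970",42.1,"15+"'

# The unnamed first column holds the row names.
income <- read.csv(text = income_csv, row.names = 1, stringsAsFactors = FALSE)

# Restore the ordering of the income groups explicitly, lowest to highest.
income_levels <- c("0-3", "3-5", "5-7", "7-10", "10-12", "12-15", "15+")
income$income <- factor(income$income, levels = income_levels, ordered = TRUE)
income$year   <- factor(income$year)

# Percentages per year: 1960 sums to 100.0 and 1970 to 100.1,
# the small rounding error noted in the documentation examples.
totals <- tapply(income$pct, income$year, sum)
print(round(totals, 1))
```

Passing the levels explicitly avoids depending on the sort order of the strings, which is why the sketch does not rely on the default behavior of `factor()`.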
77 changes: 77 additions & 0 deletions doc/ordinal/rst/income.rst
@@ -0,0 +1,77 @@
.. container::

   ====== ===============
   income R Documentation
   ====== ===============

   .. rubric:: Income distribution (percentages) in the Northeast US
      :name: income-distribution-percentages-in-the-northeast-us

   .. rubric:: Description
      :name: description

   Income distribution (percentages) in the Northeast US in 1960 and
   1970, adapted from McCullagh (1980).

   .. rubric:: Usage
      :name: usage

   ::

      income

   .. rubric:: Format
      :name: format

   ``year``
      year.

   ``pct``
      percentage of population in income class per year.

   ``income``
      income groups. The unit is thousands of constant (1973) US
      dollars.

   .. rubric:: Source
      :name: source

   Data are adapted from McCullagh (1980).

   .. rubric:: References
      :name: references

   McCullagh, P. (1980) Regression Models for Ordinal Data. *Journal of
   the Royal Statistical Society. Series B (Methodological)*, Vol. 42,
   No. 2, pp. 109-142.

   .. rubric:: Examples
      :name: examples

   ::

      print(income)

      ## Convenient table:
      (tab <- xtabs(pct ~ year + income, income))

      ## small rounding error in 1970:
      rowSums(tab)

      ## compare link functions via the log-likelihood:
      links <- c("logit", "probit", "cloglog", "loglog", "cauchit")
      sapply(links, function(link) {
        clm(income ~ year, data=income, weights=pct, link=link)$logLik })
      ## a heavy tailed (cauchy) or left skew (cloglog) latent distribution
      ## is fitting best.

      ## The data are defined as:
      income.levels <- c(0, 3, 5, 7, 10, 12, 15)
      income <- paste(income.levels, c(rep("-", 6), "+"),
                      c(income.levels[-1], ""), sep = "")
      income <-
        data.frame(year=factor(rep(c("1960", "1970"), each = 7)),
                   pct = c(6.5, 8.2, 11.3, 23.5, 15.6, 12.7, 22.2,
                           4.3, 6, 7.7, 13.2, 10.5, 16.3, 42.1),
                   income=factor(rep(income, 2), ordered=TRUE,
                                 levels=income))