Commit b0b79f4

Update to "Introduction to DHS Data Analysis" project
1 parent 11fc9e7 commit b0b79f4

12 files changed: +319 -0 lines changed

Intro_DHSdata_Analysis/7_SpecialTopics/6a.Multilevel_Weights/India/EA average size_Urban.xlsx renamed to Intro_DHSdata_Analysis/7_SpecialTopics/Multilevel_Weights/India/EA average size_Urban.xlsx

File renamed without changes.

Intro_DHSdata_Analysis/7_SpecialTopics/6a.Multilevel_Weights/India/India_NFHS4_compile_rural_HHsize.do renamed to Intro_DHSdata_Analysis/7_SpecialTopics/Multilevel_Weights/India/India_NFHS4_compile_rural_HHsize.do

File renamed without changes.

Intro_DHSdata_Analysis/7_SpecialTopics/6a.Multilevel_Weights/India/Readme.MD renamed to Intro_DHSdata_Analysis/7_SpecialTopics/Multilevel_Weights/India/Readme.MD

File renamed without changes.

Intro_DHSdata_Analysis/7_SpecialTopics/6a.Multilevel_Weights/Level-Weights_approximate.do renamed to Intro_DHSdata_Analysis/7_SpecialTopics/Multilevel_Weights/Level-Weights_approximate.do

File renamed without changes.

Intro_DHSdata_Analysis/7_SpecialTopics/6a.Multilevel_Weights/Readme.MD renamed to Intro_DHSdata_Analysis/7_SpecialTopics/Multilevel_Weights/Readme.MD

File renamed without changes.
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
/*******************************************************************************************************************************
Program: HouseholdStructure.do
Purpose: Construct household structure variable
Author: Tom Pullum
Date last modified: Jan 17, 2023 by Tom Pullum
*******************************************************************************************************************************/

* using Philippines 2022 DHS survey as an example
use "PHPR81FL.DTA", clear

* Criteria for household typology
* household head is male (hv101=1, hv104=1)
* household head is female (hv101=1, hv104=2)
* no spouse present (no one in hh with hv101=2)
* at least one unmarried child present (hv101=3, hv105<=17, hv115=0)

gen hhhead_male =0
gen hhhead_female=0
gen spouse=0
gen child=0

replace hhhead_male =1 if hv101==1 & hv104==1
replace hhhead_female=1 if hv101==1 & hv104==2
replace spouse=1 if hv101==2
replace child=1 if hv101==3 & hv105<=17 & hv115==0

egen nhhhead_male =total(hhhead_male), by(hv024 hv001 hv002)
egen nhhhead_female=total(hhhead_female), by(hv024 hv001 hv002)
egen nspouse =total(spouse), by(hv024 hv001 hv002)
egen nchild =total(child), by(hv024 hv001 hv002)

gen hhtype=.
replace hhtype=1 if nhhhead_male==1 & nspouse>=1 & nchild>0
replace hhtype=2 if nhhhead_male==1 & nspouse>=1 & nchild==0
replace hhtype=3 if nhhhead_male==1 & nspouse==0 & nchild>0
replace hhtype=4 if nhhhead_male==1 & nspouse==0 & nchild==0
replace hhtype=5 if nhhhead_female==1 & nspouse>=1 & nchild>0
replace hhtype=6 if nhhhead_female==1 & nspouse>=1 & nchild==0
replace hhtype=7 if nhhhead_female==1 & nspouse==0 & nchild>0
replace hhtype=8 if nhhhead_female==1 & nspouse==0 & nchild==0
replace hhtype=9 if hhtype==.

label define hhtype 1 "Male head with spouse and children" 2 "Male head with spouse, no children" 3 "Male head, no spouse, and children" 4 "Male head, no spouse, no children" 5 "Female head with spouse and children" 6 "Female head with spouse, no children" 7 "Female head, no spouse, and children" 8 "Female head, no spouse, no children" 9 "Other"
label values hhtype hhtype

tab hhtype if hv101==1 [iweight=hv005/1000000]
Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
/*******************************************************************************************************************************
Program: PregOutcomes.do
Purpose: Calculate pregnancy outcomes (Births, Miscarriages, Abortions, Stillbirths)
Author: Tom Pullum
Date last modified: Feb 9, 2023 by Tom Pullum
July 18, 2023 by Shireen Assaf to adapt code and add more notes for this project

Description:
* Pregnancy outcomes referred to generically as BMAS, for Births, Miscarriages, Abortions, Stillbirths
* Stata program to calculate the months pregnant (P) preceding BMAS codes.
* Illustrate with PKIR71FL.dta (Pakistan 2017-18 DHS) and all pregnancies in vcal_6.
* Sequence them as in the table in the final report (Births, Stillbirths, Miscarriages, Abortions)
* An interval is censored if it is so close to the beginning of the calendar (the early months) that
* it is impossible to be sure of the number of preceding months of P's
If the data are perfect, the month of conception will be the first month with a "P". However, the woman may not know when she actually became pregnant. It can happen that an abortion or miscarriage in the data is not preceded by ANY months with P. DHS would usually say that the duration of the pregnancy is one plus the number of preceding months of "P".

The pregnancy outcome variable produced is "type" and can be tabulated by covariates.
This program is different from the perinatal mortality code, CM_PMR.do, found in the DHS Program Code Share Library.
https://github.com/DHSProgram/DHS-Indicators-Stata/tree/master/Chap08_CM
CM_PMR.do calculates stillbirths and live births as well as perinatal mortality but excludes miscarriages and abortions.

***************************************************************************/

* drops any saved programs.
program drop _all

program calc_interval_length

* Construct a file with a separate record for each BMAS
* We look for strings that end in B, C, A, or S and are preceded (chronologically) by P's
* Allow for possible codes C, A that are not preceded by P

* mbi: months as months before interview. mbi=col-v018
* cmc: months in century month codes. cmc=v017+80-col

* Because there is a reshape of the data, the data should be reduced to only the variables needed.
* For the program the following variables are needed: v001 v002 v003 v005 v017 v018 vcal_6

* However, if there is interest to tabulate the pregnancy outcome by specific background variables, then they should be included as well. For instance, v024, v025, v106, and v190 are included below.
keep v001 v002 v003 v005 v017 v018 v024 v025 v106 v190 vcal_6

* Calculate the number of P's that immediately precede the BMAS, allowing for 0 to 11
* Index the event by col, not cmc or mbi, and allow col to go from 1 to 79

* To reduce to events and intervals that are entirely included in the 60 months before the interview,
* in the loop below replace "80" with "60+v018"

quietly forvalues lcol=1/80 {
  gen type_`lcol'=.
  replace type_`lcol'=1 if substr(vcal_6,`lcol',1)=="B"
  replace type_`lcol'=2 if substr(vcal_6,`lcol',1)=="S"
  replace type_`lcol'=3 if substr(vcal_6,`lcol',1)=="C"
  replace type_`lcol'=4 if substr(vcal_6,`lcol',1)=="A"
  gen interval_`lcol'= .
  replace interval_`lcol'= 0 if substr(vcal_6,`lcol'+1,1)~="P" & `lcol'+1<=80
  replace interval_`lcol'= 1 if substr(vcal_6,`lcol'+1,1) =="P" & substr(vcal_6,`lcol'+2,1)~="P" & `lcol'+2<=80
  replace interval_`lcol'= 2 if substr(vcal_6,`lcol'+1,2) =="PP" & substr(vcal_6,`lcol'+3,1)~="P" & `lcol'+3<=80
  replace interval_`lcol'= 3 if substr(vcal_6,`lcol'+1,3) =="PPP" & substr(vcal_6,`lcol'+4,1)~="P" & `lcol'+4<=80
  replace interval_`lcol'= 4 if substr(vcal_6,`lcol'+1,4) =="PPPP" & substr(vcal_6,`lcol'+5,1)~="P" & `lcol'+5<=80
  replace interval_`lcol'= 5 if substr(vcal_6,`lcol'+1,5) =="PPPPP" & substr(vcal_6,`lcol'+6,1)~="P" & `lcol'+6<=80
  replace interval_`lcol'= 6 if substr(vcal_6,`lcol'+1,6) =="PPPPPP" & substr(vcal_6,`lcol'+7,1)~="P" & `lcol'+7<=80
  replace interval_`lcol'= 7 if substr(vcal_6,`lcol'+1,7) =="PPPPPPP" & substr(vcal_6,`lcol'+8,1)~="P" & `lcol'+8<=80
  replace interval_`lcol'= 8 if substr(vcal_6,`lcol'+1,8) =="PPPPPPPP" & substr(vcal_6,`lcol'+9,1)~="P" & `lcol'+9<=80
  replace interval_`lcol'= 9 if substr(vcal_6,`lcol'+1,9) =="PPPPPPPPP" & substr(vcal_6,`lcol'+10,1)~="P" & `lcol'+10<=80
  replace interval_`lcol'=10 if substr(vcal_6,`lcol'+1,10)=="PPPPPPPPPP" & substr(vcal_6,`lcol'+11,1)~="P" & `lcol'+11<=80
  replace interval_`lcol'=11 if substr(vcal_6,`lcol'+1,11)=="PPPPPPPPPPP" & substr(vcal_6,`lcol'+12,1)~="P" & `lcol'+12<=80
}

reshape long type_ interval_, i(v001 v002 v003) j(col)
rename *_ *

drop if type==.
label variable type "Type of pregnancy outcome"
label define type 1 "Live birth" 2 "Stillbirth" 3 "Miscarriage" 4 "Abortion"
label values type type

label variable interval "Preceding months of P"
replace interval=99 if interval==.
label define interval 99 "Censored"
label values interval interval

*tab interval type
*tab interval type [iweight=v005/1000000]

tab type [iweight=v005/1000000]

save months_pregnant.dta, replace

end

***************************************************************************
***************************************************************************
***************************************************************************
***************************************************************************
* Execution begins here

* Specify your workspace, the data should be saved here as well unless you specify a different path for the data.
* cd ...

********************************
use "PKIR71FL.DTA" , clear
* In this survey the BMAS are in vcal_6; C is the symbol for Miscarriage
********************************

* run the program
calc_interval_length
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@

# Purpose: Special indicators

This section contains code for constructing special indicators and measures that may not be found in the DHS Code Share Library or other repositories.

In no particular order, the following code is available:

### HouseholdStructure.do

Constructs a household structure variable using the PR file.
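
A minimal usage sketch, assuming HouseholdStructure.do has just been run so that `hhtype` and the PR-file variables are still in memory. The names `nhead` and `hh_tag` are introduced here for illustration only and are not created by the program.

```stata
* Household-level distribution of hhtype by urban/rural residence (hv025),
* using one record per household (the head's record)
tab hhtype hv025 if hv101==1 [iweight=hv005/1000000], col

* Illustrative check: count the heads recorded in each household.
* Households with zero or several heads may end up in "Other" or be
* harder to interpret under the typology.
egen nhead  = total(hv101==1), by(hv024 hv001 hv002)
egen hh_tag = tag(hv024 hv001 hv002)
tab nhead if hh_tag==1
```

The second tabulation is unweighted on purpose: it is a data check on the household roster, not an estimate.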

### SexRatio.do

Calculates the sex ratio, defined here as males per 100 females and as females per 1000 males, and saves the results as scalars.
The program also contains code to calculate sex ratios by district for each state using the India 2019-2021 survey.
This code could be adapted for other covariates and surveys.
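
The program obtains both ratios from the intercept of an intercept-only logit, which is the log odds of a birth being male. The arithmetic can be sketched with a made-up proportion; `p_male` and its value are hypothetical, not a survey estimate.

```stata
* Hypothetical weighted proportion of births that are male
scalar p_male = 0.512

* Log odds of a male birth: what an intercept-only logit estimates
scalar b = ln(p_male/(1-p_male))

* exp(b) is males per female; exp(-b) is females per male
display "Males per 100 females:  " %6.1f 100*exp(b)      // about 104.9
display "Females per 1000 males: " %6.1f 1000*exp(-b)    // about 953.1
```

Because the two ratios are reciprocals of each other, transforming the confidence limits with `exp(-b)` swaps which limit is the lower one and which is the upper one.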

### PregOutcomes.do

The program reshapes the IR file to produce a variable "type" that represents the pregnancy outcomes (live births, stillbirths, miscarriages, and abortions) among pregnancies in the 5 years preceding the survey.
Please read the notes provided throughout the file for more details and instructions.
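
A short usage sketch, assuming PregOutcomes.do has been run and has saved `months_pregnant.dta` in the working directory with the background variables it keeps by default (`v024`, `v025`, `v106`, `v190`). The variable `duration` is introduced here for illustration; it follows the convention in the program notes that pregnancy duration is one plus the number of preceding months of "P".

```stata
use months_pregnant.dta, clear

* Weighted distribution of pregnancy outcomes by urban/rural residence (v025)
tab v025 type [iweight=v005/1000000], row

* Pregnancy duration in months: preceding months of P plus one,
* left missing for censored intervals (interval==99)
gen duration = interval + 1 if interval != 99
tab duration type [iweight=v005/1000000]
```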
Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
/*******************************************************************************************************************************
Program: SexRatio.do
Purpose: Construct sex ratio
Author: Tom Pullum
Date last modified: Feb 6, 2023 by Tom Pullum

Description:
The sex ratio can be defined in different ways. In biology it is the proportion of individuals who are female. In demography it is usually the number of males per 100 females. An alternative is the number of females per 1000 males. The following Stata program calculates males per 100 females and females per 1000 males and saves the results as scalars. In the labels, P, L, and U are the point estimate, the lower end of the 95% confidence interval, and the upper end of the 95% confidence interval, respectively.
*******************************************************************************************************************************/

*** Calculate the sex ratio at birth for births in the past 5 years using the KR file ***

* using the India 2019-2021 DHS survey as an example but any survey can be used.
use "IAKR7EFL.DTA" , clear

egen cluster_ID=group(v024 v001)
svyset cluster_ID [pweight=v005], strata(v023) singleunit(centered)

tab b4 [iweight=v005/1000000]

* Sex ratio defined as males per 100 females
gen m_per_100f=0
replace m_per_100f=1 if b4==1

svy: logit m_per_100f
matrix T=r(table)
matrix list T

scalar P_m_per_100f=100*exp(T[1,1])
scalar L_m_per_100f=100*exp(T[5,1])
scalar U_m_per_100f=100*exp(T[6,1])

* Sex ratio defined as females per 1000 males
gen f_per_1000m=0
replace f_per_1000m=1 if b4==2

svy: logit f_per_1000m
matrix T=r(table)
matrix list T

scalar P_f_per_1000m=1000*exp(T[1,1])
scalar L_f_per_1000m=1000*exp(T[5,1])
scalar U_f_per_1000m=1000*exp(T[6,1])

scalar list

****************************************************************
*** Sex ratio by covariates ***

* The code below was developed for India but can be used for other surveys.
* India is a special case because it has districts and states.

* This program produces the sex ratio at birth for districts in the India NFHS-5 survey and saves a Stata data file for each state.
* Can be adapted for subpopulations in any DHS survey.

* Two definitions of the sex ratio are used.

* The time interval is the past five years, all births in the KR file, except those
* in the month of interview.

* The program provides the lower and upper ends of 95% confidence intervals.
* The intervals are wide.
* The estimates use svy, including subpop (within the state).
* Bayesian procedures would reduce the confidence intervals and move the estimates toward the state value

* The results are saved into the workfile and then the workfile is reduced to just the saved results.
* A separate data file will be produced for each state with the estimates.

* clear is needed because the data in memory were modified above
use "IAKR7EFL.DTA", clear
* keep the variables of interest, below the India country-specific variable sdist for the districts is included
keep v001 v002 v003 v005 v008 v023 v024 v025 sdist b4

* Construct binary variable m_per_f
gen m_per_f=0
replace m_per_f=1 if b4==1

save IAKR7Etemp.dta, replace

levelsof v024, local(lstates)

foreach ls of local lstates {
  use IAKR7Etemp.dta, clear
  keep if v024==`ls'

  * Trick: use this file to save the results
  gen vstate=`ls'
  gen vdist=.
  gen vb=.
  gen vL=.
  gen vU=.
  gen vcases=.

  svyset v001 [pweight=v005], strata(v023) singleunit(centered)

  * First do the state estimate
  svy: logit m_per_f
  matrix T=r(table)
  replace vb=T[1,1] if _n==1
  replace vL=T[5,1] if _n==1
  replace vU=T[6,1] if _n==1
  replace vcases=e(N) if _n==1

  * Now loop through all the districts in this state
  scalar sline=2
  levelsof sdist, local(ldistricts)
  quietly foreach ld of local ldistricts {

    * Construct a variable for subpop to select the district
    gen select_dist=1 if sdist==`ld'

    svy, subpop(select_dist): logit m_per_f
    matrix T=r(table)
    replace vdist=`ld' if _n==sline
    replace vb=T[1,1] if _n==sline
    replace vL=T[5,1] if _n==sline
    replace vU=T[6,1] if _n==sline
    replace vcases=e(N) if _n==sline
    drop select_dist
    scalar sline=sline+1
  }

  * Finished with a state; save the results for this state in a data file
  drop if vb==.
  keep vstate vdist vb vL vU vcases

  rename v* *

  * Re-attach the labels for state and district; must confirm the label names
  label values state V024
  label values dist SDIST

  * Sex ratio defined as males per 100 females
  gen P_m_per_100f=100*exp(b)
  gen L_m_per_100f=100*exp(L)
  gen U_m_per_100f=100*exp(U)

  * Sex ratio defined as females per 1000 males
  gen P_f_per_1000m=1000*exp(-b)
  gen L_f_per_1000m=1000*exp(-L)
  gen U_f_per_1000m=1000*exp(-U)

  save results_`ls'.dta, replace
}

format P_* L_* U_* %6.1f
list, table clean noobs
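
* --- Illustrative sketch, not part of the original commit ---
* The loop above leaves one results_<state>.dta file per state in the working
* directory. Assuming no other files match the pattern results_*.dta, they can
* be stacked into a single dataset. The local macro name "files" and the
* output file name results_all.dta are hypothetical choices made here.
local files : dir "." files "results_*.dta"
clear
foreach f of local files {
  append using "`f'"
}
format P_* L_* U_* %6.1f
save results_all.dta, replace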

Intro_DHSdata_Analysis/7_SpecialTopics/6b.Survival_Analysis/Readme.md renamed to Intro_DHSdata_Analysis/7_SpecialTopics/Survival_Analysis/Readme.md

File renamed without changes.
