README_en.Rmd

---
output: github_document
---

Para a versão em português, clique no escudo abaixo:
<!-- badges: start -->
[![pt-br](https://img.shields.io/badge/lang-pt--br-blue.svg)](https://github.com/datazoompuc/datazoom_social_Stata/blob/English-READ.ME/README.md)

<a href="https://github.com/datazoompuc/datazoom_social_Stata"><img src="https://raw.githubusercontent.com/datazoompuc/datazoom_social_stata/master/logo.jpg" align="left" width="100" hspace="10" vspace="6"></a>

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# datazoom_social

<!-- badges: start -->
![Languages](https://img.shields.io/github/languages/count/datazoompuc/datazoom_social_Stata?style=flat)
![Commits](https://img.shields.io/github/commit-activity/y/datazoompuc/datazoom_social_Stata?style=flat)
![Open Issues](https://img.shields.io/github/issues-raw/datazoompuc/datazoom_social_Stata?style=flat)
![Closed Issues](https://img.shields.io/github/issues-closed-raw/datazoompuc/datazoom_social_Stata?style=flat)
![Files](https://img.shields.io/github/directory-file-count/datazoompuc/datazoom_social_Stata?style=flat)
![Followers](https://img.shields.io/github/followers/datazoompuc?style=flat)
<!-- badges: end -->

The `datazoom_social` package reads and processes microdata from IBGE surveys. We read all IBGE household surveys into Stata format, as well as making different Census instances compatible, generating individual identification for the Continuous PNAD, and much more.

## Installation <a name="instalacao"></a>

Enter the code below in the Stata command line to download and install the latest version of the package

```
net install datazoom_social, from("https://raw.githubusercontent.com/datazoompuc/datazoom_social_stata/master/") force
```

## Usage

All of our functions can be used through interactive dialog boxes. To access them, type
```
db datazoom_social
```

The buttons below link to the function help files. They are best viewed on Stata, with command `help function`.

|       |        |       |       |
|:-----:|:------:|:-----:|:-----:|
| <a href = "Censo/datazoom_censo_en.sthlp"> <kbd> <br> &nbsp;&nbsp; <font size = 3> Censo </font>  &nbsp;&nbsp; <br><br> </kbd> </a> <br> <br> <small> Demographic Census </small> <br> <small> 1970 to 2010 </small> | <a href = "ECINF/datazoom_ecinf_en.sthlp"> <kbd> <br> &nbsp;&nbsp; <font size = 3> ECINF </font> &nbsp;&nbsp; <br><br> </kbd> </a> <br><br> <small> Urban Informal Economy </small> <br> <small> 1997 and 2003 </small> | <a href = "PME/datazoom_pme_en.sthlp"> <kbd> <br> &nbsp;&nbsp; <font size = 3> PME </font> &nbsp;&nbsp; <br><br> </kbd> </a> <br><br> <small> Monthly Employment Survey </small> <br> <small> 1990 to 2015 </small> | <a href = "PNAD/datazoom_pnad_en.sthlp"> <kbd> <br> &nbsp;&nbsp; <font size = 3> PNAD </font> &nbsp;&nbsp; <br><br> </kbd> </a> <br><br> <small> Old PNAD </small> <br> <small> 2001 to 2015 </small> |
| <a href = "PNAD_Continua/Trimestral/datazoom_pnadcontinua_en.sthlp"> <kbd> <br> <font size = 3> PNAD Contínua </font> <br><br> </kbd> </a> <br><br> <small> Continuous PNAD </small> <br> <small> 2012 to 2023 </small> | <a href = "PNAD_Covid/datazoom_pnad_covid_en.sthlp"> <kbd> <br> &nbsp; <font size = 3> PNAD Covid </font> &nbsp; <br><br> </kbd> </a> <br><br> <small> PNAD Covid </small> <br> <small> 2020 </small> | <a href = "PNS/datazoom_pns_en.sthlp"> <kbd> <br> &nbsp;&nbsp; <font size = 3> PNS </font> &nbsp;&nbsp; <br><br> </kbd> </a> <br><br> <small> National Health Survey </small> <br> <small> 2013 and 2019 </small> | <a href = "POF/datazoom_pof_en.sthlp"> <kbd> <br> &nbsp;&nbsp; <font size = 3> POF </font> &nbsp;&nbsp; <br><br> </kbd> </a> <br><br> <small> Consumer Expenditure Survey </small> <br> <small> 1995 to 2018 </small> |

<a href = "#credits">![Static Badge](https://img.shields.io/badge/Credits%20-%20PUC%20Rio%20Department%20of%20Economics%20-%20blue) </a> <a href = "#credits"> ![Static Badge](https://img.shields.io/badge/Citation%20-%20green) </a>

## Auxiliary Programs (Dictionaries)

Most of the package programs encounter original data stored in *.txt* format, which requires dictionaries – *.dct* format in Stata – to be read. The result is a volume of dictionaries that exceeds the 100-file limit allowed for a Stata package to be installed. Therefore, individual dictionaries are compressed into a single *.dta* file, read within each program. Both functions are defined in the file *read_compdct.ado*.

The first program defined in this file is `write_compdct`, which can be used as follows: after running the *.ado* file to define the function, simply use the code:

    write_compdct, folder("/folder with dictionaries") saving("/path/dict.dta")

The function then reads all *.dct* files present in the folder and combines them into the *dict.dta* file, with each dictionary identified by a variable with its name.

To transform this compressed file back into the original dictionary, we reccomend using the `read_compdct` program:

    read_compdct, compdct("dict.dta") dict_name("original_dict") out("extracted_dict.dct")

which extracts the *original_dict* from the *dict.dta* file and saves it as *extracted_dict.dct*. 
As an example, see the use of this function in the `datazoom_pnadcontinua` program:

    tempfile dic // Temporary file where the extracted .dct will be saved

    findfile dict.dta // Finds the dict.dta file saved by the package installation
                      // in the /ado/ folder and stores the path to it in the r(fn) 
                      //macro.

    read_compdct, compdct("`r(fn)'") dict_name("pnadcontinua`lang'") out("`dic'")
      // Reads the compacted dict.dta dictionary, extracts the pnadcontinua 
      // dictionary (or pnadcontinua_en, `lang` is empty or "_en"), and saves the 
      // final file in the tempfile dic, which is used to read the data.

For our internal organization, each folder corresponding to a program stores the dictionaries in the */dct/* sub-folder. All these dictionaries are also stored together in the */dct/* folder directly, which is used to generate the *dict.dta* file using `write_compdct`. Note that no *.dct* files are actually listed in the *datazoom\_social.pkg* file, and therefore, they are not installed on the user's computer. Only the *dict.dta* file is sent.

The automated do-file `atualizacao_dict.do` is used to update `dict.dta`.

## Credits

[Data Zoom](https://www.econ.puc-rio.br/datazoom/)is developed by a team at the PUC-Rio Department of Economics.

To cite package `datazoom_social`, use:

> Data Zoom (2023). Data Zoom: Simplifying Access To Brazilian Microdata.  
> https://www.econ.puc-rio.br/datazoom/english/index.html

Or in BibTeX format:

``` 
@Unpublished{DataZoom2023,
	author = {Data Zoom},
	title = {Data Zoom: Simplifying Access To Brazilian Microdata},
	url = {https://www.econ.puc-rio.br/datazoom/english/index.html},
	year = {2023},
}

```