asjadnaqvi/stata-streamplot

streamplot


Installation | Syntax | Citation guidelines | Examples | Feedback | Change log


streamplot v1.82

(10 Jun 2024)

This package provides the ability to generate stream plots in Stata. It is based on the Streamplot Guide (December 2020).

Installation

The package can be installed via SSC or GitHub. The GitHub version might be more recent due to bug fixes and feature updates, and may contain syntax improvements and changes in default values. See the version numbers below. The GitHub version is eventually published on SSC.

SSC (v1.82):

ssc install streamplot, replace

GitHub (v1.82):

net install streamplot, from("https://raw.githubusercontent.com/asjadnaqvi/stata-streamplot/main/installation/") replace
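
Since the SSC and GitHub versions can differ, it may help to check which version is installed locally. A minimal check, assuming the package is already installed, uses Stata's built-in which command, which reports the location of the ado-file and any version comment lines it contains:

```stata
* Show where streamplot.ado is installed and its version header, if present
which streamplot
```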

The palettes and colrspace packages are required to run this command:

ssc install palettes, replace
ssc install colrspace, replace

If you want to make a clean figure, then it is advisable to load a clean scheme. There are several available, and I personally use the following:

ssc install schemepack, replace
set scheme white_tableau  

I also prefer narrow fonts in figures with long labels. You can change this as follows:

graph set window fontface "Arial Narrow"

Syntax

The syntax for the latest version is as follows:

streamplot y x [if] [in], by(varname) 
            [ palette(str) smooth(num) labcond(str) offset(num) alpha(num) droplow yreverse cat(varname) recenter(top|mid|bot) 
               lcolor(str) lwidth(str) labsize(num) labcolor(color|palette) percent format(str) nolabel area
               tline tlcolor(str) tlwidth(str) tlpattern(str) yline(str)
               xlabel(str) xtitle(str) ytitle(str) title(str) subtitle(str) note(str) 
               ysize(num) xsize(num) scheme(str) aspect(str) name(str) saving(str)
            ]

See the help file help streamplot for details.

The most basic use is as follows:

streamplot y x, by(varname)

where y is the variable we want to plot, and x is usually the time dimension. The by() variable splits the data into groups, which also determine the colors. The colors can be modified using the palette(name) option. Here any scheme supported by colorpalette (from the palettes package) can be used.
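
To make the expected data layout concrete, here is a minimal sketch on simulated data. It assumes the command expects long-form data (one row per group per time period), as in the example datasets used below. The variable names y, t, and group, the value label grp, and the simulated values are all hypothetical and only serve as an illustration:

```stata
* Hypothetical toy data in long form: three groups observed over 20 periods
clear
set seed 1234
set obs 60

* Rows 1-20 belong to group 1, rows 21-40 to group 2, rows 41-60 to group 3
gen group = ceil(_n / 20)

* Time index running from 1 to 20 within each group
bysort group: gen t = _n

* Simulated positive values with a different level per group
gen y = 10 + 5 * group + 3 * runiform()

* Value labels for the groups (assumed to be used as stream labels)
label define grp 1 "A" 2 "B" 3 "C"
label values group grp

streamplot y t, by(group)
```

Each group here has the same number of observations over the same time points, which keeps the sketch simple.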

Citation guidelines

Software packages take countless hours of programming, testing, and bug fixing. If you use this package, then a citation would be highly appreciated. Suggested citations:

in BibTeX

@software{streamplot,
   author = {Naqvi, Asjad},
   title = {Stata package ``streamplot''},
   url = {https://github.com/asjadnaqvi/stata-streamplot},
   version = {1.82},
   date = {2024-06-10}
}

or simple text

Naqvi, A. (2024). Stata package "streamplot" version 1.82. Release date 10 June 2024. https://github.com/asjadnaqvi/stata-streamplot.

or see SSC citation (updated once a new version is submitted)

Examples

Set up the data:

clear
set scheme white_tableau
graph set window fontface "Arial Narrow"

use "https://github.com/asjadnaqvi/stata-streamplot/blob/main/data/streamdata.dta?raw=true", clear

We can generate basic graphs as follows:

streamplot new_cases date, by(region) 

streamplot new_cases date if date > 22400, by(region) smooth(6)

Recenter the graphs to top or bottom:

streamplot new_cases date if date > 22400, by(region) smooth(6) recenter(bot)

streamplot new_cases date if date > 22400, by(region) smooth(6) recenter(top)

streamplot new_cases date if date > 22400, by(region) smooth(6) ///
	labcond(20000) labsize(1.8) lc(black) lw(0.04)

streamplot new_cases date if date > 22400, by(region) smooth(6) ///
	labcond(20000) labsize(1.8) lc(black) lw(0.04) format(%12.0fc) offset(20)

streamplot new_cases date if date > 22400, by(region) smooth(6) palette(CET D11) ///
	labcond(2) labsize(1.8) lc(black) lw(0.04)  percent format(%3.2f) offset(20) labcolor(red)

streamplot new_cases date if date > 22400, by(region) smooth(6) palette(CET C6, reverse) ///
	labcond(1) labsize(1.8) lc(black) lw(0.04)  percent format(%3.2f) offset(20) labcolor(palette)

qui summ date if date > 22400

local xmin = `r(min)'
local xmax = `r(max)'

streamplot new_cases date if date > 22400, by(region) smooth(6) palette(CET D02)  ///
	title("My Stata stream plot") ///
	subtitle("Subtitle here") note("Note here") ///
	labcond(20000) labsize(1.5) lc(white) lw(0.08) ///
	xlabel(`xmin'(20)`xmax', angle(90)) xtitle("")

Or use a custom graph scheme:

streamplot new_cases date if date > 22600, by(region) smooth(6)  palette(CET CBD1)  ///
	title("My Stata stream plot", size(6)) subtitle("with colorblind-friendly colors", size(4))  ///
	labcond(20000) labsize(2) lc(black) lw(0.03) offset(25) xtitle("") ///
	scheme(neon) 

where the dark background neon scheme is loaded from the schemepack suite.

v1.6 updates

Test the yreverse option:

streamplot new_cases date if date > 22400, by(region) smooth(6) ///
	labcond(20000) labsize(1.8) lc(black) lw(0.04) format(%12.0fc) offset(20) yrev

Test the region split option. First let's define a variable:

gen ns = .
replace ns = 2 if inlist(region, 1, 2, 5, 6, 7, 8)
replace ns = 1 if inlist(region, 3, 4, 9, 10, 11, 12, 13)

lab de ns 2 "North" 1 "South"
lab val ns ns

tab region ns

And plot it:

streamplot new_cases date if date > 22400, by(region) smooth(6) cat(ns) palette(CET D02) labcond(20000)

We can use the new variable itself in the by() option:

streamplot new_cases date if date > 22400, cat(ns) by(ns) smooth(6) 

v1.7 updates

Get the data:

use "https://github.com/asjadnaqvi/stata-streamplot/blob/main/data/wbgdpdata.dta?raw=true", clear

drop if year < 1990
gen splitvar = category!="M"
streamplot value_real year if countrycode=="TSA", by(category) smooth(2) xsize(2) ysize(1)

streamplot value_real year if countrycode=="TSA", by(category) cat(splitvar) smooth(2)  xsize(2) ysize(1) 

streamplot value_real year if countrycode=="TSA", by(category) cat(splitvar) smooth(2) palette(tab Green-Orange-Teal) ///
	yline(0) xsize(2) ysize(1) 

streamplot value_real year if countrycode=="TSA", by(category) cat(splitvar) smooth(2) palette(tab Green-Orange-Teal) ///
	yline(0) xsize(2) ysize(1) tline 

streamplot value_real year if countrycode=="TSA", by(category) cat(splitvar) smooth(2) palette(tab Nuriel Stone) ///
	yline(0) xsize(2) ysize(1) tline tlc(white) tlw(0.8) tlp(dash)	

streamplot value_real year if countrycode=="TSA", by(category) cat(splitvar) smooth(2) palette(tab Green-Orange-Teal) ///
	yline(0) xsize(2) ysize(1) tline tlc(black) tlw(0.5) tlp(dash)	

streamplot value_real year if countrycode=="TSA", by(category) cat(splitvar) smooth(2) palette(tab Green-Orange-Teal) ///
	yline(0) xsize(2) ysize(1) tline tlc(black) tlw(0.5) tlp(dash) xtitle("") ///
	xlabel(1990(2)2022, angle(90)) labsize(2.2) offset(8) 	///
	title("{fontface Arial Bold:GDP Expenditures in South Asia (Constant 2015 USD billions)}")	///
	note("World Bank Open Data.", size(2))

v1.8 updates

streamplot value_real year if countrycode=="TSA", by(category) cat(splitvar) smooth(2) palette(tab Green-Orange-Teal) ///
	yline(0) xsize(2) ysize(1) tline tlc(black) tlw(0.5) tlp(dash) xtitle("") xline(2020) ///
	xlabel(1990(2)2022, angle(90)) labsize(2.2) offset(8) labprop 	///
	title("{fontface Arial Bold:GDP Expenditures in South Asia (Constant 2015 USD billions)}")	///
	note("World Bank Open Data.", size(2))

streamplot value_real year  if countrycode=="TSA", by(category) smooth(2) palette(tab Green-Orange-Teal) ///
	xsize(2) ysize(1) xtitle("") ///
	xlabel(, angle(90)) labsize(2.2) offset(8) recenter(bottom)  labprop  	///
	title("{fontface Arial Bold:GDP Expenditures in South Asia (Constant 2015 USD billions)}")	///
	note("World Bank Open Data.", size(2)) 

streamplot value_real year  if countrycode=="TSA", by(category) smooth(2) palette(tab Green-Orange-Teal) ///
	xsize(2) ysize(1) xtitle("")  ///
	xlabel(, angle(90)) labsize(2.2) offset(8) recenter(bottom) labprop labcolor(palette)  	///
	title("{fontface Arial Bold:GDP Expenditures in South Asia (Constant 2015 USD billions)}")	///
	note("World Bank Open Data.", size(2)) 

v1.81 stacked area graph

streamplot value_real year  if countrycode=="TSA", by(category) smooth(0) area recenter(bottom)  ///
	xsize(2) ysize(1) xtitle("") palette(tab Green-Orange-Teal)  ///
	xlabel(, angle(90)) labsize(2.2) offset(8)   	///
	title("{fontface Arial Bold:GDP Expenditures in South Asia (Constant 2015 USD billions)}")	///
	note("World Bank Open Data.", size(2))  

Feedback

Please open an issue to report errors, feature enhancements, and/or other requests.

Change log

v1.82 (10 Jun 2024)

  • Added wrap() option for label wrapping.
  • Minor code fixes.

v1.81 (30 Apr 2024)

  • Added area option to allow stacked area graphs.

v1.8 (25 Apr 2024)

  • Added labprop and labscale() options to allow easy label scaling.
  • Added share and percent as substitutes.
  • Major code rework to optimize the speed of the graph generation.
  • Generic twoway options added.

v1.7 (01 Apr 2024)

  • Added trendline options: tline, tlcolor(), tlpattern(), tlwidth().
  • Added additional checks for plotting data.
  • Better handling of missing values and categories.

v1.61 (15 Jan 2024)

  • Fixed issues with locals.
  • Changed ylabcolor() and ylabsize() to labcolor() and labsize() respectively.

v1.6 (15 Oct 2023)

  • Major update with the cat() option added to compare top versus bottom streams.
  • Option yreverse fixed.
  • Option nolab fixed.
  • Several internal routines rewritten and cleaned up.
  • The option percent() is now defined in the 0-100 (or higher) range, changed from the 0-1 range.

v1.52 (25 Aug 2023)

  • Support for aspect(), saving(), xscale(), and graphregion() added.

v1.51 (28 May 2023)

  • Cleaned up labcond() to align it with other packages.
  • offset() changed to percentages to align it with other packages.
  • Minor code cleanups, updates to defaults, and help file.

v1.5 (20 Nov 2022)

  • Option to recenter the graphs added.
  • Improved the precision of the calculations.

v1.4 (08 Nov 2022)

  • Major code cleanup.
  • The command now does error checks on the number of observations.
  • The command now correctly deals with the sequence of variables.
  • Additional colorpalette options added.
  • Several fixes to the help file.

v1.3 (20 Jun 2022)

  • ado distribution date added.
  • ylabel color, format, and percentages added (Thanks to Marc Kaulisch who suggested and contributed to these options).
  • Fixes to variable precision.
  • y-label color fixed. Labels can either take on a named color, or they can be assigned the same colors as the color palette.

v1.2 (06 Jun 2022)

  • Fixes to value labels not passing through to graphs (Thanks to Marc Kaulisch).
  • Several graph options modified to passthru for better integration with twoway options.
  • Smoothing parameter adjusted.
  • Error checks added. If there are too few observations per group, the command will abort.

v1.1 (08 Apr 2022)

  • Public release. Several options and features added.

v1.0 (06 Aug 2021)

  • Beta version