Using loops to run (and export) many regressions / Usando loops para correr (y exportar) múltiples regresiones

español

The use of  loops becomes essential when needing to perform repetitive calculations. Looping has many advantages, for example, when needing to do corrections in all the calculations specifications. So here are some interesting features that you would like to do when implementing a loop to run many regressions, and export their outputs:

  • Choose the appropriate method for the regression according to the type of dependent variable. For instance,  you might want to estimate the model using OLS (regress) when the dependent variable is continuous and or a probit or a logit model when it is discrete (a dummy variable).
  • Progressively add explanatory variables to the model and export all the output in a single table. This can be done using outreg2 ‘s  replace and append options, but if you want instead to write a single command line inside a loop you will have to make the appropriate changes.

So assume that you want to estimate a number of econometric models that are quite similar in terms of the explanatory variables that are incorporated, but differ between them in terms of the dependent variables, for example :

Model 1: outcome1=b1*x1+b2*x2+b3*X3+b4*X4+e
Model 2: outcome2=b1*x1+b2*x2+b3*X3+b4*X4+e

In addition you also want to progressively add sets of explanatory variables. So for instance you want to estimate:

Model 1: outcome1=b1*x1+b2*x2+e
Model 2: outcome1=b1*x1+b2*x2+b3*X3+b4*X4+e
Model 3: outcome2=b1*x1+b2*x2+e
Model 4: outcome2=b1*x1+b2*x2+b3*X3+b4*X4+e

Of course the code should work for n number of explanatory variables and k number of explanatory vars. Let’s do it.


What I am going to do is to first define the sets of explanatory variables using separate globals, for instance:

global CONTROLGROUP1 "x1 x2"
global CONTROLGROUP2 "x3 x4"

* And add the outcome variables into a global

global OUTCOMES "outcomevar1 outcomevar2 "

/*Start by looping on the types of explanatory variables, in this case we have discrete and continuous variables
for each discrete variable we are running a probit model, and for each ordinal variable we will run an ordinal probit model
*/
global TIPO "disc ord"
foreach tipo of global TIPO {

/*Here is an interesting feature: using vartyp to separate the outcome variables into groups of discrete and ordinal variables.
In this case, the first loop will assign all discrete variables to the
global OUTCOMEBYTYPE, the second loop will assign the ordinal variables to the global OUTCOMEBYTYPE and so on.
*/
	vartyp $OUTCOMES, list(`tipo')
	global OUTCOMEBYTYPE "`r(varlist)'"

/*Now assign the appropiate regression methodology to each type of explanatory variable*/

	if "`tipo'"=="disc" {
	local method "dprobit"
	}
	else {
	local method "oprobit"
	}

/*Start the loop of outcomes to be regressed*/
	foreach outcome of global OUTCOMEBYTYPE {

		/*We are not using replace and append when exporting output to tables, so we need to erase the output file
		if already exists*/
		capture confirm file reg_`outcome'.txt
					if !_rc {
							erase reg_`outcome'.txt
					}
					else {
					*no action when file not found
					}

		/*Start the loop that accumulates explanatory variables. Using while instead of foreach will allow adding more
		sets of explanatory variables later*/
		global i = 1
		global CONTROLS "$CONTROLGROUP1"

		while $i < 4 {
		/*Finally, the lines for running the regression and exporting the output. It is important to use capture when doing this
		since if any of the regression ends in an error (for example due to no observations), the loop will continue*/

		    capture noisily xi: `method' `outcome' $CONTROLS if respondiente==1 [fweight=pesosf]
			capture noisily outreg2 using reg_`outcome'.txt, append bdec(6) e(all)

		*The following lines do the trick to add the explanatory sets of variables
		global i = $i + 1
		global CONTROLS "$CONTROLS ${CONTROLGROUP$i}"

		*Close explanatory variables loop
		}
			/* Use this space to add other alternative specifications*/

		macro drop CONTROLS
		macro drop i

	*Close outcomes loop
	}
*Close type loop
}

Advertisements

Importing text files into Excel Part II / Importar archivos de texto al Excel Parte II

español

For those who find useful to transfer the output of their statistics and regressions to Excel, here is another macro that might be useful.

As in the last case, imagine that you are dealing with many tables of statistics and regressions that you have computed with Stata, and you will find useful to take them all to Excel. Such a thing might be useful for visualization, comparing statistics (robustness checks), formatting the tables for presentations or publications, elaborating further graphics, and so on.

Building on the macro presented in the previous post, this time I built another one to deal with importing multiple files simultaneously.

To see how it works, imagine, as an example, that you have 5 key variables that you are analysing and for each of them you have produced 4 tables (corresponding, for example to different estimation methods) and exported each respectively to a text file. That gives a total of 20 tables to be imported from Excel.

First, it might be useful to organize the text files information in a table in Excel as shown in the table below:

Using the information in the previous table, the multipletextload macro will :

i. Generate a new sheet for each variable (one for each row in the reference table) andname the sheet with the name of the corresponding variable.
ii. Paste the contents of all the tables corresponding to a same variable (in columns) within a same sheet, one table next to the right to each other.
iii. In case the macro do not find a specific file (might be the case that you did not want to apply a method for a given variable), will skip it and jump to the next file to import.

The macro uses the macro “Textload” that I have posted in the previous post, so you will have to generate the former one in order to run this one.

In order to run it, just select the contents of the table (in my example first select cells A2:E6 and then run the macro)


Sub multipletextload()
  Dim rgFiles As Range
  Dim i As Long, j As Long

  Dim sName As String
  Dim fName As String
  Dim strWork As String

 'Dim LastRow As Long
 Dim LastCol As Long
  strWork = ThisWorkbook.Name
  Set rgFiles = Selection
'Set rgFiles = Range("B2:F4")

  For i = 1 To rgFiles.Rows.Count
    'Row i Column 1 has the name of the variable. generate and name new sheet
    sName = rgFiles.Cells(i, 1).Value
    'sName = ActiveCell.Value
    Sheets.Add(After:=ActiveSheet).Name = sName
    'Add After:=ActiveSheet,
    For j = 2 To rgFiles.Columns.Count
    ' Row i Columns 2 onwards have the names of the files to import
    fName = rgFiles.Cells(i, j).Value
    'If there is no name in the cell, not do anything
    If fName = "" Then
    Else

      If j = 2 Then
      ' Only the file in Column 2 will be imported into cell  (1,1)
        Worksheets(sName).Cells(1, 1).Activate
        ActiveCell.FormulaR1C1 = fName
        Application.Run "Textload"
'        Application.Run "'reg results_vtest.xlsm'!Textload"
      Else
      ' The remaining will be imported to cell location (1,LastCol)
        Worksheets(sName).Cells(1, 1).Activate
        Set rgLast = Worksheets(sName).Range("A1").SpecialCells(xlCellTypeLastCell)
        'LastRow = rgLast.Row
        LastCol = rgLast.Column
        Worksheets(sName).Range("A1").Cells(1, LastCol).FormulaR1C1 = fName
        Worksheets(sName).Range("A1").Cells(1, LastCol).Activate
        Application.Run "Textload"
      End If
    End If

    Next j
  Next i

End Sub