Stata: How to select a random number from a list.

I want to randomly select a single value of a variable in Stata in order to e.g. delete it in a simulation/bootstrap.

Stata offers simple ways of creating subsamples from a dataset.

With replacement, use:

bsample

Without replacement, use:

sample
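
As a quick illustration of the two commands, here is a minimal sketch. It assumes the auto example dataset that ships with Stata; the sample size of 10 and the local names n_with and n_without are arbitrary choices made for this example.

```stata
sysuse auto, clear          // example dataset shipped with Stata
set seed 7492001            // for replicability

preserve
bsample 10                  // draw 10 observations WITH replacement
local n_with = _N
restore

preserve
sample 10, count            // draw 10 observations WITHOUT replacement
local n_without = _N
restore

display "bsample kept `n_with' observations, sample kept `n_without'"
```

Both commands replace the data in memory, which is why each draw is wrapped in preserve/restore here.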

However, it seems a bit more involved to get a random selection from a list of numbers not in the dataset.

This is what I tried in order to pick a number at random from a list of numbers. It is undoubtedly too cumbersome.

1) Create a list of random numbers that has as many entries as the list you want to select from

foreach num of local listtoselectfrom { 
 local randomnumber = runiform()
 local rnumbers = "`rnumbers'`randomnumber' "
}

2) Sort the list of random numbers

local selection : list sort rnumbers

3) Select the lowest random number

local posofselected = word("`selection'",1)

4) Find the position of this lowest random number in the unsorted list of random numbers

local posinselectionlocal : list posof "`posofselected'" in rnumbers 

5) Use this position to select at random ONE item from the list you want to select from:

local randomitem : word `posinselectionlocal' of `listtoselectfrom'

Example

set seed 7492001 //For replicability.

local listtoselectfrom = " 1 3 4 5 9 10"

foreach num of local listtoselectfrom { 
	local randomnumber = runiform()
	local rnumbers = "`rnumbers'`randomnumber' "
}
local selection : list sort rnumbers
local posofselected = word("`selection'",1) 
local posinselectionlocal : list posof "`posofselected'" in rnumbers 
local randomitem : word `posinselectionlocal' of `listtoselectfrom'
dis "This is the selected number: `randomitem'"
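
For a single draw, a shorter route may be enough: count the entries in the list, draw a random position between 1 and that count, and take the word at that position. The following is only a sketch of that idea, not a replacement for the procedure above. The local names n and pos are introduced here for illustration.

```stata
set seed 7492001                              // for replicability

local listtoselectfrom "1 3 4 5 9 10"

* Number of entries in the list
local n : word count `listtoselectfrom'

* Random integer position in 1..n (floor()+1 also handles a draw of exactly 0)
local pos = floor(runiform()*`n') + 1

* Pick the entry at that position
local randomitem : word `pos' of `listtoselectfrom'

display "This is the selected number: `randomitem'"
```

Each position has the same probability of being drawn, so this should be equivalent in distribution to sorting a list of random numbers and taking the first one.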

P.S. More than one number
To select more than one number, take more than one number from the sorted random number list in step 3), then loop over their positions using steps 4) and 5).
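
One possible way to draw several items without replacement is sketched below. It repeatedly draws a random position and removes the chosen item from the remaining list with the macro list operator "-". It assumes the entries of the list are distinct. The locals k, remaining, chosen, item, n and pos are names made up for this example.

```stata
set seed 7492001                              // for replicability

local listtoselectfrom "1 3 4 5 9 10"
local k = 3                                   // hypothetical: number of items to draw

local remaining `listtoselectfrom'            // items still available
local chosen                                  // items drawn so far (starts empty)

forvalues draw = 1/`k' {
    local n : word count `remaining'
    local pos = floor(runiform()*`n') + 1     // random position in 1..n
    local item : word `pos' of `remaining'
    local chosen `chosen' `item'
    local remaining : list remaining - item   // drop the drawn item
}

display "These are the selected numbers: `chosen'"
```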

Reference:
The procedure described above was inspired by the following two posts:
1) http://whatthestats.wordpress.com/2012/08/07/stata-selecting-random-samples/

2) http://blog.stata.com/2012/07/18/using-statas-random-number-generators-part-1/

Merge for Mata – Combining two matrices using a column of id values. (Stata merge for mata)

Occasionally I need to combine two matrices of different dimensions in Stata.

(E.g. I estimated svy: proportions over a number of countries for a number of variables and wanted to have one data set containing all results)

In this Stata-list post, the suggestion is either to do this using Stata’s [merge] command or to import the matrices into Mata and loop over all elements. It took me quite a while to get this to work, and any feedback would be very welcome.


Example


I want to merge “matm” and “matu” into a matrix “final” using column 1 as id variable.

matm
     1 2
 +----------+
 1 | 1 33 |
 2 | 2 22 |
 3 | 3 11 |
 +----------+
matu
     1 2
 +---------+
 1 | 3 2 |
 2 | 1 2 |
 3 | 4 2 |
 4 | 5 2 |
 +---------+
final
     1 2 3
 +----------------+
 1 | 1 33 2 |
 2 | 2 22 . |
 3 | 3 11 2 |
 4 | 4 .  2 |
 5 | 5 .  2 |
 +----------------+

The code below defines a function that does this in Mata.

mata: matamatrixmerge("matm","matu","columwithID1","columwithID2")

This function generates the “final” matrix if it is supplied with:

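  • matm: the name of the merging Stata matrix
  • matu: the name of the using Stata matrix
  • columwithID1/2: the names of two 1x1 Stata matrices holding the column number of the id column in matm and matu respectively (the function reads them with st_matrix()).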

Stata code:


*Define mata function to merge

* – The function expands the two matrices supplied (merging/using matrix)
* to achieve conformability.
* – The matrices are merged using a single id-variable column.
* [CAUTION:] each row needs to be uniquely identified.
* [Output:] a merged Stata matrix called “final”.

mata //enter Mata

/*Define mata function and its arguments*/

mata drop matamatrixmerge()
function matamatrixmerge(string scalar matm, string scalar matu,string scalar columwithID1, string scalar columwithID2)
{

/*Import Stata matrices into Mata*/

m = st_matrix(matm) //Similar to "merge" there is one merging matrix and
u = st_matrix(matu) // a using matrix - hence matu.
posIDm = st_matrix(columwithID1)
posIDu = st_matrix(columwithID2)

/*Make vector containing all unique values of the ID var*/

idvalues = uniqrows(m[1..rows(m),posIDm]\u[1..rows(u),posIDu])
idvaluesMIN = idvalues[1,1]
idvaluesMAX = idvalues[rows(idvalues),1]

/*Merging:
– The following loop goes over all values of the ID var and
checks these against those supplied in the first matrix.
– If the value does not appear in the supplied matrix,
a row of missing values is added.
– This is done for both matrices supplied.
Result: Two matrices of same dimension which can be row-joined.*/

/*Merge Matrix*/
/*Note: the loops run over the rows of idvalues rather than over the id values
themselves, so the ids do not have to be the consecutive integers 1,2,...,N.*/
fm = J(rows(idvalues),cols(m),.) //matrix collecting expanded mat.
i = 1
while (i<=rows(idvalues)) {
    j = 1
    while (j<=rows(m)) {
        if (idvalues[i,1] == m[j,posIDm]) {
            fm[i,1..cols(m)] = m[j,] //copy the matching row
            break
        }
        ++j
    }
    ++i
}
/*Using Matrix*/
fu = J(rows(idvalues),cols(u),.) //matrix collecting expanded mat.
i = 1
while (i<=rows(idvalues)) {
    j = 1
    while (j<=rows(u)) {
        if (idvalues[i,1] == u[j,posIDu]) {
            fu[i,1..cols(u)] = u[j,] //copy the matching row
            break
        }
        ++j
    }
    ++i
}

/*Assembles merged matrix*/

f = idvalues,fm,fu //the idvalues column is added and the now conformable matrices are joined.

/*Delete columns with duplicated id values */

selectionvectorm = J(1,cols(fm),1)
selectionvectorm[1,posIDm] =.
selectionvectoru = J(1,cols(fu),1)
selectionvectoru[1,posIDu] =.
selectionvector = 1,selectionvectorm, selectionvectoru
f = selectionvector\f
final = select(f, f[1,]:!=.)
final = final[2..rows(final),]

/*Export final matrix from Mata to Stata*/

st_matrix("final",final) //Returns matrix to Stata.
} //end of function definition

/*Save the function in current folder.*/

mata mosave matamatrixmerge(), replace 
end //exit Mata
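
A usage sketch, assuming the function definition above has already been run in the current session. It rebuilds the two example matrices from this post and stores the id column positions in two 1x1 Stata matrices, since the function reads those arguments with st_matrix(). The matrix names are the ones used in the call shown earlier.

```stata
* Example matrices from the post; column 1 holds the id values
matrix matm = (1, 33 \ 2, 22 \ 3, 11)
matrix matu = (3, 2 \ 1, 2 \ 4, 2 \ 5, 2)

* 1x1 matrices holding the position of the id column in matm and matu
matrix columwithID1 = (1)
matrix columwithID2 = (1)

* Call the Mata function; it writes the result to the Stata matrix "final"
mata: matamatrixmerge("matm", "matu", "columwithID1", "columwithID2")

matrix list final
```

If everything works as intended, the listing should correspond to the "final" matrix shown in the example, with missing values where an id appears in only one of the two matrices.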