Friday, July 22, 2005

Using Arrays in Excel VBA

I love to use arrays in VBA for manipulating large chunks of data in a very quick time.
I often need to take output from a mainframe system like SAP or Peoplesoft or MIMS. Output data is usually always in the same layout, thought the content may be different. Sometimes it comes out in print (PRT) format and this contains page headers every page break. Compiling this data into a database in Excel (columns and rows of fields and records) is necessary in order to work with it for reporting and charting.
I used to do this before I knew about arrays by recording in VBA the actions I needed to perform on the raw data amd then editing it so it could work every time.
However I found where I needed to manipulate sometimnes several thousand lines and lots of columns of data that these macro's became very slow.

Now I use arrays instead. The main reason that arrays are quicker is that all your manipulations can be performed using memory rather than on the spreadhseet itself.

For example to take a set of data that has every second line blank and delete the blank rows can be done numerous ways using the spreadsheet method but they are all slow especially once you have over 10,000 rows of data.

Sub delete_rows_slowly()
'macreo written by NEUROTech 22 July 2005
Range("A1").Select 'number of rows of data
For i = 1 To 50
If ActiveCell = "" Then
ActiveCell.EntireRow.Delete
End If
ActiveCell.Offset(1, 0).Activate
Next i
End Sub

The quick way in an array is as follows:

Option Base 1
Sub delete_rows_quickly()
'macreo written by NEUROTech 22 July 2005
Dim myrange1(), myrange2()
Dim lastcell
Dim i, x

Range("A50").Select 'identify last cell
lastcell = ActiveCell.Row
Range("A1:A50").Select 'number of rows of data
ReDim myrange1(lastcell, 1), myrange2(lastcell, 1)
myrange1() = Selection
x = 1
For i = 1 To lastcell
If myrange1(i, 1) <> "" Then
myrange2(x, 1) = myrange1(i, 1)
x = x + 1
End If
Next i
Range("B1").Select
Range(ActiveCell, ActiveCell.Offset(x - 2, 0)).Select
Selection = myrange2()
End Sub

Note, the "Option Base 1" statement must appear at the top of the module above all procedures. It will affect all arrays in the module. It makes the starting variable in an array number 1 not zero.