Wednesday, July 25, 2007

Using Excel documents as a data source in ASP.Net

This is something that I mentioned last blog, finally I've gotten around to posting it, on a side note, I'm also trying out the new 'blogging' feature in Word 2007, go to the office button, click new, and start a new blog post, it'll run through setting it up, and then away you go.. hopefully it works!

The scenario I had, was that we have a spreadsheet setup for collecting payments for the 'bar tab' we have for our social committee, who sell chocolates, chips, beer etc at a low cost. This is tallied up on a spreadsheet, and an email is sent around with amounts. I thought it would be good to have this data on each individual users intranet homepage. So I set about making a web part for it.

We didn't want to change the spreadsheet to an ASP form, as it was working fine as it is, and I didn't have the time to code one and teach the social committee treasurer how to use it. and I didn't have the time to code one and teach the social committee treasurer how to use it. So I thought, surely there is a way to extract the data out of an excel spreadsheet, and sure enough there is.

First off, I setup my cache as mentioned in my previous post 'using cache to determine a time for content reloading' so it only queries the excel spreadsheet once every 12 hours, so if the cache is still there, it doesn't get the new data, if not then it gets new data and adds something to cache. I decided that instead of having the data in cache, which could be deleted by IIS, and would require regular expressions to query properly, I would store it in a SQL table. So effectively I was synchronizing a couple of columns out of an excel spreadsheet with a SQL table.

I setup my SQL table, and then set about talking to the excel sheet. I stored it under the a folder on the server, and just gave the social committee people permissions to access and edit it – which probably isn't the most secure approach, but it was fine for what I am doing, and fairly idiot proof.

To talk to the table, you set up an 'OleDbConnection', I setup mine as follows:

Dim DBConnection = New OleDbConnection( _

"Provider=Microsoft.Jet.OLEDB.4.0;" & _

"Data Source=" & Server.MapPath("~/SocCom/collection.xls") & ";" & _

"Extended Properties=""Excel 8.0;HDR=Yes""")

There I've got my data source as a mapped path to the xls file, and had its properties as an excel file. It connects using the Microsoft jet engine – similar to Access.

From then, I added a Try/Catch statement to catch errors which occur when someone is editing the file while it is trying to be accessed.

Try

DBConnection.Open()

Dim SQLString As String = "SELECT [Name], [Balance] FROM [Collection Summary$] WHERE [Name] is not null and [Balance] is not null"

Dim DBCommand = New OleDbCommand(SQLString, DBConnection)

Dim DBReader As Data.IDataReader = DBCommand.ExecuteReader()

There I've opened the connection, and selected the columns with 'Name' and 'Balance' in the topmost cell – they need to be enclosed in square brackets. The From part of the SQL statement has [Collection Summary$] as that is the sheet name, the $ sign is required at the end (though my sheet name is only 'Collection Summary').

Also take note that I have added a where statement that does not show Null values of each column, this is important, as if you do not do this, then it will take more than 5 seconds to do the query, and will return a whole heap of null values, as it queries all the values in the spreadsheet, null cell or not.

That will return like a normal SQL query, you could bind it to a data grid, or whatever you like, in my case, I first deleted all the old values from my SQL table (I could have updated them – however employees leave and join, and spelling mistakes are corrected etc, so its better just to start a fresh each time) then did a 'While DBReader.Read' to loop through the values, writing them to variables, and inserting each to the database as it went.

At the end of that loop, I closed the DBReader, and DBConnection. Then added my catch statement:

Catch ex As System.Data.OleDb.OleDbException

Finally End Try

ReadFromDB()

The exception was the System.Data.OleDB.OleDbException – which I tested was when the spreadsheet was already in use. As you can see I did nothing on the event of an error – as it doesn't matter if you can't read the data, I just didn't want the page to spit out an error – I wanted it to continue, which it does, and goes to the next line of code which is a call to my 'ReadFromDB()' sub procedure, which queries the database to get the data from the SQL table. So if there is an error, it goes to the ReadFromDB Sub, if not, it still goes there.

It then reads from that and prints my data. Done!

Edit: As a side note on my blogging with Word experiment, it worked great, however the formatted text I copied from Visual Studio 2005 added extra spaces, which is because it shows up as huge text in my editing menu, so each line break is size 72 or whatever it is. Other than that, worked brilliantly and was very easy.

1 comment:

rajesh said...

it helped me....thanks