Nov
20
2008
I have been building several data processing facilities for the project I am currently working on.
Basically processing large volumes of groundwater elevation data and groundwater quality data. The facilities require different file formats, MS Excel, MS Access, and Tab-Delimited flat files. I have had a very interesting time since it isn't just a clean transfer of data. There are layers upon layers of validation which are required to accept the data to do things like handle duplicates in the files, records which already exist in the database, valid lookup field values, etc...
I'll tell you, working with Excel files can prove to be a daunting task from my experience. I will get into that in a different post and share the pain, but for now, I would just like to share some helpful code to assist you guys in dealing with this, at times, painful experience. Maybe I can ease the pain for some of you.
Here is a simple one... Sometimes you may want to find out what worksheet names are within an Excel file. Using the System.Data.OleDb facilities makes this very easy;
This is a simple method which is fairly straight forward to understand. The method accepts a full file path to the Excel file and opens it up via an OleDbConnection object. This object has a method called GetOleDbSchemaTable, which in this instance, is returning the tables/ worksheet names that are in the Excel file. Pretty easy, nothing complicated...
private const string ExcelConnectionString = @"Provider=Microsoft.Jet.OleDb.4.0;
data source={0};
Extended Properties='Excel 8.0;
HDR=Yes;
IMEX=1;'";
private static IList<string> GetWorkSheetNames(string filePath)
{
var sheets = new List<string>();
try
{
using (var oleConn = new OleDbConnection(string.Format(ExcelConnectionString, filePath)))
{
// open the connecton
oleConn.Open();
DataTable dbSchema = oleConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (dbSchema == null || dbSchema.Rows.Count < 1)
{
throw new Exception("Worksheets not found");
}
// loop worksheet names
foreach (DataRow s in dbSchema.Rows)
{
sheets.Add(s["TABLE_NAME"].ToString());
}
}
}
catch (OleDbException oex)
{
throw new Exception(string.Format("Bad file path? {0}", filePath), oex);
}
return sheets;
}
Hopefully this will help some of you to work towards a higher level of pain when working with Excel files...
Enjoy!