Showing posts with label Interop. Show all posts
Showing posts with label Interop. Show all posts

Saturday, 7 January 2017

Automate Excel Operations -2 - Deal with excel using Microsoft Interop


This is the second post of the series - "Automate Excel Operations".
If you wish to take look at previous article, here is the list -
  1. Read Excel Data into DataTable
In this post, we will create some of the functions to deal with excel.
But before jumping to that, We will create a class "InteropHelper.cs". This class will act as a wrapper to interact with actual Interop DLL provided by Microsoft.



Add some Interop properties in this class. This will help us to dispose of all the objects related to Interop DLL. It is very important to dispose the Interop objects properly or else we can get into some serious trouble. We will come to the dispose part a few moment later. These properties are just the one we needed for this post. We can add as many as we require.


And here comes the most important part IMO. Disposing of the objects. If we don't dispose of them properly, there will be an excel.exe running each time we open an excel file with Interop.So do whatever you want, don't forgot to dispose of all the properties you have declared in your code.


Below is the definition of ReleaseObject(). This is how we dispose of ComObject



So let's go with the first one.
1. CheckIfSheetExists - This function allows us to check if the desired exists in the excel file before doing any other operation. This function helps us to avoid the non-informational "HResult" exceptions getting thrown from Interop DLL, I will post the exact exception message once I get one. :-)


2. GetFirstSheetName - In many cases, while reading data from excel, if the user hasn't passed sheet name in function, or first sheet name is not fixed, Use this function to get the name before reading the excel.


3. GetLastRowColumnIndex - To get the value of last row/column where data is present in excel sheet. Note following points -
  • This function does not take into consideration hidden row/column in the sheet. 
  • We remove the filter before calculating the row/column to consider all the data.
  • Blank cells(with no data) are also taken into consideration. (Sometimes, we accidentally add blank rows/columns. For us, there is no data but excel consider these cells as actual data). Try removing unwanted cells if the count is not as expected.


4. DeleteSheets - To delete the sheets in excel file. Keep in mind that we can't just delete all the sheets in the excel. There should be at least one sheet available. That's why we have "forAll" and "defaultSheetName" parameters. Check description in the image below -


5. SetSheetVisibility - To show/hide the sheets in excel file. Keep in mind that we can't just hide all the sheets in the excel. There should be at least one sheet visible. That's why we have "forAll" and "defaultSheetName" parameters. Check description in the image below -


Tuesday, 27 December 2016

Automate Excel Operations -1 - Read Excel Data into DataTable


Disclaimer - Neither Microsoft nor I advise to automate MS Office operations using Interop DLL

But then we don't have much of a choice when it comes to automating such operations. At least, I am not aware of any such tool at the time of writing this post.

Note - Before we proceed any further, let me tell you, from here on, whenever I say "Data", I am referring to tabular data in excel sheet until stated otherwise.

I am planning to write multiple posts and will try to cover one topic in each post. So let's start with the very first point - How to read data from excel file.

Let's start by creating an empty solution "FokatRnd" and adding an empty class project "CustomInteropHelper" to it. Now add a new class "ExcelHelper.cs" to this. To read data from excel file, we don't need any special tool. "System.Data.OleDb"  namespace will solve our purpose easily.

Declare a connection string that will be passed to OleDbConnection class to create a connection.

public const string EXCEL_CONNECTIONSTRING = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0}; Extended Properties=\"Excel 12.0 Xml;HDR={1};IMEX=1\";";
As you might have noticed, we are using 2 placeholders in the above connection string.
1. Data Source={0} ; - Fully qualified path of the excel file.
2. HDR={1} ; - This boolean value(Yes/No) will inform OleDbDataAdapter whether it should treat first row as table header or not.

Next, we will use OleDbDataAdapter to fill the DataTable object with results. DataTable is part of "System.Data" namespace so we don't need to include any other namespace for now. We need to pass the select query into this OleDbDataAdapter object.

string selectQuery = "select * from [SheetNameHere$]";

We need to pass the sheet name of the excel file from which we want to read the data. "*" is used to read all the columns. The query syntax is pretty much the same as SQL query. Take a look at the final piece of code that will read tabular data from specified excel sheet.

public static DataTable LoadDataFromExcel(string fileName, string sheetName,bool considerFirstRowHeader = true)
{
    string connection = string.Format(EXCEL_CONNECTIONSTRING, fileName, considerFirstRowHeader ? "Yes" : "No");

    DataTable objDataTable = new DataTable(sheetName);

    using (OleDbConnection objOleDbConnection = new OleDbConnection(connection))
    {
        OleDbDataAdapter dataAdapter = new OleDbDataAdapter();
        objOleDbConnection.Open();
        string selectQuery = "select * from [SheetName$]";
        dataAdapter = new OleDbDataAdapter(selectQuery, objOleDbConnection);
        dataAdapter.Fill(objDataTable);
        dataAdapter.Dispose();
    }

    return objDataTable;
}
And, that's it. This function returns a DataTable which contains all the data in excelsheet. We can use this tabular data as per our requirement. Go ahead, Give it a try and as always, feel free to improve this article by providing your valuable suggestions and comments.