Showing posts with label vba. Show all posts
Showing posts with label vba. Show all posts

Sunday, 19 October 2014

Extract All sheets inside a folder to individual Excel File


It was supposed to be a peaceful Friday night in office. As I was about to leave the office on time, I was bombed with a very stupid task. Actually, We had some 8-10 files related to some important data and each Excel file contained 10-15 sheets. The task was to extract all those sheets into individual excel file. For me, to do this manually was like digging the well with spoon. It would have taken hours to complete the task. So I thought of trying hands on VBA. It was pretty easy and quick to do this with VBA and within half hour, I was done with my code and task to Extract sheet to files. (Note that I have written the code from scratch. I found below mentioned code segments on some sites and adjusted it to meet my goal.) Lets see how we did that.

First, I tried to extract the excel sheets for the single Excel file.(Please note that i have written this code. Copied it from this location)
Sub Splitbook()

Dim xPath As String

'Pick current excel path
xPath = Application.ActiveWorkbook.Path

Application.ScreenUpdating = False
Application.DisplayAlerts = False

'Loop through all sheets
For Each xWs In ThisWorkbook.Sheets
    xWs.Copy
    
    'Save excel file with sheet name
    Application.ActiveWorkbook.SaveAs Filename:=xPath & "\" & xWs.Name & ".xlsx"
    Application.ActiveWorkbook.Close False
Next

Application.DisplayAlerts = True
Application.ScreenUpdating = True

End Sub


Well, that works pretty straight. It extracted all the sheets into files(naming sheet name as file name) Now, the next challenge was to loop through all the files in a folder, process each file and loop through its sheets. For that, I created a new Macro-supported file (.xlsm) file. Pressed [Alt] + [F11]. This opens up the code window. In that, go to Menu -> Insert -> Module and pasted these lines -
Sub SplitAllBook()

Dim fso As Object
Dim xPath As String
Dim wbk As Workbook
Dim Filename As String

Application.ScreenUpdating = False
Application.DisplayAlerts = False

xPath = "YOUR_SOURCE_FOLDER_NAME_HERE"

'Pick only ".xlsx" files. In case you want to process older version of files, please change the extension to .xls
Filename = Dir(xPath & "*.xlsx")

'Loop through all files
Do While Len(Filename) > 0  'IF NEXT FILE EXISTS THEN

    Set wbk = Workbooks.Open(Filename:=xPath & Filename)

    'Use fso to remove the extension from file name. We used this name to Name excel file.
    Set fso = CreateObject("scripting.filesystemobject")
    Dim tempFileName As String
    tempFileName = fso.GetBaseName(Filename)
    
    'Loop through all sheets
    For Each xWs In wbk.Sheets

        xWs.Copy
        'I have used naming convention as - OriginalFileName_SheetName.xlsx to avoid name conflict. You can put any logic you want.
        Application.ActiveWorkbook.SaveAs Filename:="YOUR_DESTINATION_FOLDER_NAME_HERE\" & tempFileName & "_" & xWs.Name & ".xlsx"
        Application.ActiveWorkbook.Close False

    Next

    wbk.Close True
    Filename = Dir

Loop

Application.DisplayAlerts = True
Application.ScreenUpdating = True

End Sub

With that, we are done. Run you macro and you should get all the sheets created as new excel files in the destination folder.

Tip -
 
I faced a couple of problems while running the code.
1. Unique Name for files - Figure out a way to provide a unique name to your file or else the macro will throw the error.

2. Folder Path - Define your folder path correctly with all the "\" etc wherever required. In case a problem occurs, a quick debug will solve it easily.

Sample Code

Thank you.

Friday, 11 January 2013

Extract OLE Objects from MS Access to Disk


Problem Statement - 
Fetch the documents(images, .pdf, .doc etc.) from the MS Access(OLE Object) and store it on disk.
Initially, i thought it would be quite easy. Being a C# developer, i quickly began with the VS2010. I  already had the MS Access DB which has a table with several columns. But my focus was on only few such as
  • ID - Primary Key - Int
  • Name - text
  • Structure - OLEObject (Contains images)
  • Certificate - OLEObject (Contains PDFs)
Within half an hour, i had all the information in the DataTable with OLEObject as byte[]. Now, the point was how would i know the format of the images in the Structure columns. Because if we view the Access, it simply displays the type of document ( Bitmap Image for .bmp, Microsoft Word Document for .doc etc.), no file or extension name. Moreover , for other types, it just display the text as "Package".
To start with, i assumed that System.Drawing.Image
might have the detailed information  once i convert the byte[] to MemoryStream and assign this to Image Object.
byte[] bPhoto;//Assign it with data.

int id;//Primary key - Used as the name for the image here.

/*We have also hard-coded the extension as .png for now.*/

 using (MemoryStream stream = new MemoryStream(bPhoto)){
       System.Drawing.Image tempImage = System.Drawing.Image.FromStream(stream);
       string FName = tempPathPhoto + Path.DirectorySeparatorChar + Convert.ToString(id) + ".png";
       tempImage.Save(FName);
  }
As soon as i executed the code to debug and analyze the tempImage data, it threw an exception on this line -
System.Drawing.Image tempImage = System.Drawing.Image.FromStream(stream);
and the exception was - Parameter is not valid.  
I googled it and found various links to get rid of that. But, those solutions didn't worked for me. So i tried a couple of other ways to write the files such as - 

FileStream file = new FileStream(tempPathPhoto + Path.DirectorySeparatorChar + Convert.ToString(id) + ".GIF", FileMode.Create, System.IO.FileAccess.Write);
file.Write(bPhoto, 0, bPhoto.Length);
file.Close();
and
File.WriteAllBytes(tempPathPhoto + Path.DirectorySeparatorChar + Convert.ToString(id) + extension, bPhoto);
Using the above specified ways, i was able to write the file on my disk. Hmmm.. That wasn't tough.    But these files didn't open as expected. The Image viewer displayed it as the crashed image and the PDF were displayed as corrupted as well. I was on Zero again. 

After googling with some new keywords, i landed on this page. I found more than 50 blogs and sites pointing to this link. The first thing I (and most probably whoever visited) noticed was that the site is out of support. But don't worry, the download link and the project works just fine. Run the Access page and fill all the fields accordingly and viola!! It extracted the files perfectly. Nice work Mr. Leban.

All my .bmp,.pdf files were there in the correct place. But wait!! What about the other file format. Those who had banged their head on wall might have noticed that when we insert the .bmp file, the field value says "Bitmap File Format" but when we insert other type of images, it display "package". When you try to extract the data with "package" name, the code runs smoothly but the files were created with ".pak" extension and that too didn't opened up correctly. So, Is the issue with the images uploaded?? - I don't think so as the Access form displays the images perfectly.  So, what is the issue??

Lets see the code. I went through almost all the lines of my interest especially the fGetContentsStream(). Going through the  "package" case, Mr. Leban mentioned all the bits he has added to the "lPos" variable  so as to get the correct data from the file. I don't know if this is the case only with me because thousands of people have used his code and if there is some issue there, someone must have pointed it out. Anyways, i analyzed some of the files that were getting displayed as "package" and found that before extracting the file information( such as File Name, Full Path etc.), the lPos value should be set to "84".
So i reinitialize the lPos just before the following line of code in fGetContentsStream() "Package" case - 

'Reset the lPos
lPos = 84

' Package original File Name
Do While bCurValue <> 0
    bCurValue = arrayOLE(lPos)
    FileNamePackage = FileNamePackage & Chr(bCurValue)
    lPos = lPos + 1
Loop
This is it. Run the form again and it will extract all the OLE Object named as "package". All hail Mr. Leban.

One more issue, i faced was with MS Office 2007 Docs. For ex- .doc file would be extracted fine but the .docx file be displayed as corrupted when you try to open it.
I tried some alterations and after a while of head-banging, i got this code working - 

Case "Word.Do"
' MS Word document

sExt = "doc"
If sClassName = "Word.Document.12" Then
FileExtension = "docx"
sStreamName = "Package" ' This is important
Else: FileExtension = "doc"
sStreamName = "WordDocument"
End If
And
Select Case sStreamName

Case "CONTENTS", "Package" ' Package added by Shadab
' Contents stream in Root Storage
' Call our function in the StrStorage DLL
    lLen = GetContentsStream(arrayOLE(0), UBound(arrayOLE) - lOffSet, sStreamName)

Case "Ole10Native"
' "Ole10Native" stream in sub storage named "Embedding 1" Of Root Storage
' Call our function in the StrStorage DLL
    lLen = GetContentsStreamChild(arrayOLE(0), UBound(arrayOLE) - lOffSet, sStreamName)

Case Else
' Need to fix this to allow for other stream names other than Office Docs.
' Extract Office doc
    lLen = ExtractOfficeDocument(arrayOLE(0), UBound(arrayOLE) - lOffSet, sStreamName)

End Select

I was able to extract .docx files now. Unfortunately, the same didn't work for the other files such as .xlsx (Excel2007). Not that much important as of now but I am still looking for the workaround for this problem. Any suggestions???

Thanks
S. Shafique
Fokat RnD Team Member