Import Multiple XML files into Access

So the boss emails me and asks, hey, can you import 2,900 XML files into this attached Access database? Of course I say sure thing, no sweat. I knew I was not the first person in the world who needed to import XML files into Access, so I was confident I could find something on Google to help.

The first thought I had was just throwing the data into a SQL Server database (boss mentioned SQL Server was a fine destination too) using an SSIS package. I know SSIS has a sweet Foreach Loop container. Optimistically I fired up the Visual Studio designer, thinking I would actually be able to figure this SSIS thing out pretty quick.

Wrong. Man, I do have a desire to learn SSIS (well, I at least tell myself I do), but every time I sit down to actually do it, I get flustered and think of an easier way to perform the task in a VBScript, DTS package, etc.

I went back to Google thinking there had to be some freeware out there to help with the import. The closest I found was a shareware multi-file XML importer. However, the trial version only let you work with one file at a time. No thanks.

After uninstalling the shareware, I stumbled on this article from TechNet. My first thought: this looks way too easy to actually work.

Set objAccess = CreateObject("Access.Application")
objAccess.OpenCurrentDatabase "C:\Scripts\Test.mdb"
objAccess.ImportXML "c:\scripts\test.xml", acAppendData

That's it??? Really, is it that simple? I fired up Notepad++ and quickly modified the script to line up with my file paths, double-clicked the resulting VBS file, and poof, like magic, the XML files were importing. So then I just wrapped a Files collection loop around it and whamo, I am watching wscript.exe do the rest. So simple.

Final Script:

Const acAppendData = 2  ' Access enum value; plain VBScript does not know Access constants

Set objAccess = CreateObject("Access.Application")
objAccess.OpenCurrentDatabase "c:\Scripts\db3.mdb"

' Loop over every file in the folder and import it
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.GetFolder("C:\XML")
Set fc = f.Files
For Each f1 in fc
    objAccess.ImportXML f1.Path, acAppendData
Next

objAccess.CloseCurrentDatabase
objAccess.Quit

Backup script with datestamp

Below is a script you can use to back up a database with a date and time stamp in the file name.

DECLARE @backdate varchar(20)
SET @backdate = (SELECT REPLACE(REPLACE(CONVERT(varchar(100), GETDATE(), 100), ':', '_'), ' ', '_'))
EXEC ('BACKUP DATABASE [DatabaseName] TO DISK = N''c:\Program Files\Microsoft SQL Server\MSSQL\BACKUP\databaseName_' + @backdate + '.bak'' WITH INIT, NOUNLOAD, NAME = N''Database Name backup'', SKIP, STATS = 10, FORMAT')

The date and time stamp will look like this: Dec_14_2009_12_00PM. You can append it to the file name if you need to keep multiple copies of your .bak file.
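
As a side note, if you want the backup files to sort chronologically in the folder, here is a sketch using different CONVERT styles (my variation, not part of the original script):

-- Sortable stamp like 20091214_120000 instead of Dec_14_2009_12_00PM
DECLARE @backdate varchar(20)
SET @backdate = CONVERT(varchar(8), GETDATE(), 112) + '_' + REPLACE(CONVERT(varchar(8), GETDATE(), 108), ':', '')

Plug that into the same EXEC statement above and the .bak files will line up in date order.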

Kill all user processes

Here is a cursor to kill all the spids for a database you specify.

DECLARE @spid int
DECLARE @sql varchar(500)
DECLARE spid_cursor CURSOR FOR
select spid from sysprocesses where spid > 50 and DB_NAME(dbid) = 'Your Database Here' -- spid > 50 skips system processes
OPEN spid_cursor
FETCH NEXT FROM spid_cursor INTO @spid
WHILE @@FETCH_STATUS = 0
BEGIN
set @sql = ('KILL ' + CAST(@spid as varchar(20)))
exec (@sql)
FETCH NEXT FROM spid_cursor INTO @spid
END
CLOSE spid_cursor
DEALLOCATE spid_cursor

You specify which database to target in the WHERE clause: DB_NAME(dbid) = 'Your Database Here'.
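
If you are only killing the connections to get exclusive access (say, before a restore), a one-liner sketch that does the same job is below. This is my alternative, not part of the original cursor, and the ROLLBACK IMMEDIATE will roll back any open transactions:

-- Kick everyone out of the database and roll back their open transactions
ALTER DATABASE [YourDatabase] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
-- ...do your restore or maintenance, then let users back in
ALTER DATABASE [YourDatabase] SET MULTI_USER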

Automate DBCC CHECKDB scan

Update 10/30/2014 – I wrote this original article a long, long, long time ago. Geez, it's terrible. If you are not using Ola Hallengren's scripts for this kind of stuff, you should check them out: https://ola.hallengren.com/

Original Article

Nothing like that sinking feeling when you have corruption in a production database. Even with a bulletproof backup strategy, database corruption can cause data loss if it goes undetected. One of the best ways to combat corruption is to know about it ASAP.

This article will not discuss the nitty-gritty of DBCC CHECKDB or database corruption; that may be a future article. Below are the steps I take to monitor for database corruption. This solution is not very elaborate, but it gets the job done. Let me know how to make it better.

First, create a job on the server you wish to run the CHECKDB scan on, using this script:

Set Nocount on
Declare @dbname varchar(100)
Declare @osql varchar(1000)

Declare db Cursor For -- Cursor that holds the database names; add a WHERE clause to exclude any you do not want scanned
Select name from master.dbo.sysdatabases

Open db
Fetch Next from db into @dbname
While @@Fetch_status = 0
Begin
Set @osql = 'EXEC master.dbo.xp_cmdshell ''osql -E -Q"DBCC CHECKDB(' + @dbname + ')" -oE:\scripts\checkdb\logs\' + @dbname + '.log'''
EXEC (@osql) -- Execute the osql statement
Fetch Next from db into @dbname
End
Close db
Deallocate db

Make sure you change the log file location to the same path you will parse in the VBScript step below, i.e., change E:\scripts\checkdb\logs\ to wherever you want the DBCC logs dumped.
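
For reference, with a hypothetical database named Pubs, the dynamic SQL above builds a command that looks roughly like this:

EXEC master.dbo.xp_cmdshell 'osql -E -Q"DBCC CHECKDB(Pubs)" -oE:\scripts\checkdb\logs\Pubs.log'

Each database gets its own .log file in that folder, which is what the VBScript below parses.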

Next, create a scheduled task to execute this VBScript.

'Change these to work in your environment
logLocation = "E:\scripts\checkdb\logs"
attachLocation = "E:\scripts\checkdb\"
fromAddress = "YOURSERVER@mail.com"
emailServer = "mail.whatever.com"
youremail = "dude@whatever.com"

set fso = CreateObject("Scripting.FileSystemObject")
'Open the output log in attachLocation so it matches the attachment path used below
set outputLog = fso.OpenTextFile(attachLocation & "checkDBScan.log",2,True)
FailFlag = 0

Set f = fso.GetFolder(logLocation)
Set fc = f.Files
if fc.Count = 0 then
Call SendCDOEmail("**** WARNING **** CheckDB No Logs", "No CheckDB logs are there. This may be a problem.", youremail)
Wscript.Quit
End if

For Each f1 in fc
if f1.DateLastModified < Now - 1 then
Call SendCDOEmail("**** WARNING **** Log Date Problem!!!", "The dates did not check out on the CHECKDB Script. Connect to the server and find out if the SQL job failed.", youremail)
WScript.Quit
end if
Next
'-------------------------------------------------------
'--    PARSE and LOOP
'--
'--    The following For statement opens each checkdb
'--    log and parses the contents, looking for 8000 msgs.
'--    If we find an 8000 message we set the fail
'--    flag to 1, indicating we have some corruption
'--    somewhere.
'--
'-------------------------------------------------------
Set f = fso.GetFolder(logLocation)
Set fc = f.Files

For Each f1 in fc

set logfile = fso.OpenTextFile(f1.Path, 1, True)

while not logfile.AtEndOfStream

lineread = logfile.ReadLine

if left(lineread, 5) = "Msg 8" then
FailFlag = 1
outputLog.WriteLine(Now)
outputLog.WriteLine("We have a serious problem here:")
outputLog.WriteLine("      " & f1.Name & "::::   " & lineread)
lineread = logfile.ReadLine
outputLog.WriteLine("      " & f1.Name & "::::   " & lineread)
lineread = logfile.ReadLine
outputLog.WriteLine("      " & f1.Name & "::::   " & lineread)
lineread = logfile.ReadLine
outputLog.WriteLine("      " & f1.Name & "::::   " & lineread)
outputLog.WriteLine
outputLog.WriteLine

end if
wend

logfile.Close
Next

outputLog.Close

if FailFlag = 1 then
Call SendCDOEmailAttach("***** FAILURE ******* CHECK DB Problem!!!", "The attached log file details a serious CHECKDB error. Connect to the server and investigate ASAP.", youremail, attachLocation & "checkDBScan.log")
end if

'-------------------------------------------------------
'--    EMAIL Subs
'--
'--    These subroutines send the email. They take in
'--    some parameters to get the job done.
'--
'-------------------------------------------------------

Sub SendCDOEmailAttach(sSubject, sMessage, sToAddress, sAttachment)

Set myMail = CreateObject("CDO.Message")
myMail.Subject = sSubject
myMail.From = fromAddress
myMail.To = sToAddress
myMail.TextBody = sMessage
myMail.AddAttachment sAttachment
myMail.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/sendusing") = 2
'Name or IP of remote SMTP server
myMail.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/smtpserver") = emailServer
'Server port
myMail.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/smtpserverport") = 25
myMail.Configuration.Fields.Update
myMail.Send
set myMail = nothing

End Sub

Sub SendCDOEmail(sSubject, sMessage, sToAddress)

Set myMail = CreateObject("CDO.Message")
myMail.Subject = sSubject
myMail.From = fromAddress
myMail.To = sToAddress
myMail.TextBody = sMessage
myMail.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/sendusing") = 2
'Name or IP of remote SMTP server
myMail.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/smtpserver") = emailServer
'Server port
myMail.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/smtpserverport") = 25
myMail.Configuration.Fields.Update
myMail.Send
set myMail = nothing

End Sub

At the top of the script there is a list of variables; change these to fit your environment. Make sure you run the scheduled task after the DBCC CHECKDB job has finished. The script parses the CHECKDB logs looking for lines starting with "Msg 8", since the 8000-level error codes indicate CHECKDB found a problem. The script then fires off an email to let you know there is an issue.
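
One tweak worth considering (my suggestion, not part of the original job): run CHECKDB with NO_INFOMSGS so the informational chatter stays out of the logs, and anything that does show up is worth looking at:

-- Only error messages land in the output, which makes parsing much easier
DBCC CHECKDB (YourDatabaseName) WITH NO_INFOMSGS, ALL_ERRORMSGS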

Here is a sample corrupted database you can play with: broken (rename this to a .bak or whatever). Just restore the .bak file to a SQL instance and run a CHECKDB against it. You will see some 8000 error messages. You can point the above VBScript at the DBCC CHECKDB log and test that the script works and properly notifies you.

Again, the best tool in combating corruption is knowing about it as quickly as possible. I have read articles that say running a CHECKDB daily is excessive. However, say you get database corruption on Monday but do not run the CHECKDB job until Sunday. You find out Sunday you have corruption. Depending on the corruption, if your only option is to restore the db, you may lose data for the entire week because you have been backing up a corrupt database all week.

If you are running a CHECKDB weekly, then at least do yourself a favor and check the logs each week, or set up some kind of script to parse the logs and let you know if errors were encountered. Speaking from experience: I had a production database go corrupt and did not know about it for 15 days. By that time we did not have a good backup to restore to (tape rotations), so I had to run a DBCC CHECKDB WITH REPAIR_ALLOW_DATA_LOSS…yeah, that is as bad as it sounds. Fortunately the DB was repaired (it took 12 hours of downtime).


Sysprocesses Tracker

In developing some server/application baselines, it is good to know who and what connects to the database server. The sysprocesses system table contains some good data for tracking down what is going on across your SQL Servers.

I set out to create a VBScript that would grab the results of sysprocesses from all my SQL Servers, on demand or on a schedule.

The first step was to create a table to collect the data, so I hopped on my test server and created the following table:

CREATE TABLE [dbo].[sysprocessTracker](
[spid] [int] NULL,
[status] [varchar](250) NULL,
[runid] [int] NULL,
[nt_username] [varchar](250) NULL,
[hostname] [varchar](250) NULL,
[blocked] [varchar](50) NULL,
[dbid] [varchar](50) NULL,
[cmd] [varchar](200) NULL,
[cpu] [int] NULL,
[physical_io] [int] NULL,
[memusage] [int] NULL,
[last_batch] [varchar](100) NULL,
[program_name] [varchar](200) NULL,
[nt_domain] [varchar](200) NULL,
[net_library] [varchar](200) NULL,
[login_time] [datetime] NULL,
[lastwaittype] [varchar](200) NULL,
[servername] [varchar](200) NULL,
[capturetime] [datetime] NULL,
[loginame] [varchar](100) NULL
) ON [PRIMARY]

The table represents the basic output from the master.dbo.sysprocesses table with a few tweaks.

  1. I added a column called servername. I set it at runtime in the script's server loop so I can later query by server name.
  2. Capture time. I added this column so I know when the master.dbo.sysprocesses results were captured.

I referenced a server loop in step 1. To run the script against all my servers, I set up a little text file with a list of server names. When I add or remove a server, I update the list.

Format of the Text File:

SERVER1
SERVER2
TESTSERVER1

After the setup was complete I wrote the script to go out and execute the master.dbo.sysprocesses query.

set fso = CreateObject("Scripting.FileSystemObject")
set f = fso.OpenTextFile("C:\scripts\sysprocessTracker\servers.txt",1,True)
set counterFile = fso.OpenTextFile("C:\scripts\sysprocessTracker\runcounter.txt",1,True)
set logfile = fso.OpenTextFile("C:\Scripts\sysprocessTracker\logfile.txt",2,True)

counter = counterFile.ReadLine

counterFile.Close

'Repository connection - point servername/dbname at the server and database holding sysprocessTracker
dbs = "server=servername;DRIVER={SQL Server};database=dbname;Trusted_Connection=yes"
Set db = createobject("adodb.connection")
db.ConnectionString = dbs
db.Open

while not f.AtEndOfStream

lineArray = Split(f.ReadLine, "|", -1)
serverName = lineArray(0)

ssql = "insert into sysprocessTracker (servername,capturetime,runid,spid,status,nt_username,hostname,blocked,dbid,cmd,cpu,physical_io,memusage,last_batch,program_name,nt_domain,net_library,login_time,lastwaittype,loginame) SELECT '" & serverName & "',GETDATE(), " & counter & ",spid,RTRIM(status),RTRIM(nt_username),RTRIM(hostname),RTRIM(blocked),dbid,RTRIM(cmd),cpu,physical_io,memusage,last_batch,RTRIM(program_name),RTRIM(nt_domain),RTRIM(net_library),login_time,RTRIM(lastwaittype),RTRIM(loginame) FROM OPENROWSET('SQLOLEDB','Server=" & serverName & ";Trusted_Connection=Yes;Database=Master','select spid,sp.status,nt_username,hostname,blocked,sd.name as dbid,cmd,cpu,physical_io,memusage,last_batch,program_name,nt_domain,net_library,login_time,lastwaittype,loginame from master..sysprocesses sp, master..sysdatabases sd where sp.dbid = sd.dbid') "

On Error Resume Next
db.Execute(ssql)
If Err.Number <> 0 Then
logfile.WriteLine("error connecting to: " & serverName & " -- " & Now)
Err.Clear
End if

wend

set counterFile = fso.OpenTextFile("C:\scripts\sysprocessTracker\runcounter.txt",2,True)
newcount = counter + 1
counterFile.WriteLine(newcount)
counterFile.Close

ssql = "select count(*) from sysprocessTracker where capturetime < GETDATE() - 2"
set rs = db.Execute(ssql)
if not rs.EOF then
logfile.WriteLine("Rows Deleted: " & rs(0))
end if

'This keeps the table cleaned up. Remove it if you want more than 2 days of history.
ssql = "delete from sysprocessTracker where capturetime < GETDATE() - 2"
db.Execute(ssql)

db.Close
logfile.Close


Update:

The script loops through my server text file and inserts the results of the sysprocesses call into a table. The most difficult part was working out the INSERT based on the results of the OPENROWSET command; that took a few rounds but worked out well. Initially I tried creating a temp table, selecting it into a recordset, then parsing the recordset to build the insert. I had some problems with temp table scope, so I went this route after some Google searching. The single insert seems a lot more efficient and works well.
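
One gotcha worth noting (my addition; it applies if the repository server is SQL Server 2005 or later): ad hoc OPENROWSET access is disabled by default, so you may need to enable it first:

-- OPENROWSET/OPENDATASOURCE ad hoc access is off by default on SQL 2005+
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;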

Update 2:

Added a run counter to the script to make it easier to track specific runs. I started off using the time stamp, but soon realized the timestamp resolves down to the millisecond, and at times the query of sysprocesses spanned a few milliseconds. That made it tough to group the captures, so now I just use the runid column.
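
For example (my sketch, against the table above), grouping a capture by runid instead of the raw capture time:

-- One row per capture run; runid groups rows even when a capture spans milliseconds
select runid, count(spid) as ConnectedUsers, avg(cpu) as AvgCPU, avg(physical_io) as AvgDiskIO
from sysprocessTracker
where servername = 'SERVER1'
group by runid
order by runid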

What to do with the info? Some sample queries:

--How many users connected, avg CPU and disk IO for each capture
select count(spid) ConnectedUsers, avg(cpu) AvgCPU, avg(physical_io) AvgDiskIO, capturetime from sysprocessTracker where servername = 'SERVER1' group by servername, capturetime

--Which command is executed most often, which db is used the most, who is logged in, what programs are used the most, etc.
select count(spid) ConnectedUsers, avg(cpu) AvgCPU, avg(physical_io) AvgDiskIO from sysprocessTracker where servername = 'SERVER1' group by servername
select count(cmd), cmd from sysprocessTracker where servername = 'SERVER1' group by cmd
select count(dbid), dbid from sysprocessTracker where servername = 'SERVER1' group by dbid
select count(nt_username), nt_username, capturetime from sysprocessTracker where servername = 'SERVER1' group by capturetime, nt_username
select count(program_name), program_name, capturetime from sysprocessTracker where servername = 'SERVER1' group by capturetime, program_name

SQL Job Tracker

I am not big on buying off-the-shelf software to monitor SQL for two main reasons:

  1. Not much flexibility.
  2. You can learn much more by developing scripts, websites, and such yourself.

Now I understand that in some circumstances it is nice to buy off-the-shelf products. It is nice having a piece of software someone else supports, and sometimes buying is necessary due to time constraints and project loads.

One custom process I have running is a SQL job checker. It reports on all the SQL jobs that have failed. Even if you only have a handful of servers, it is nice to have a single place to monitor all your jobs across multiple servers.

Now, I am still old skool when it comes to DTS vs. SSIS. DTS is simple and works. I am sure SSIS would get the job done just as well if not better; I am just slow to make the jump. I created a DTS package on one server that points to all the other servers, and created the following table:

CREATE TABLE [dbo].[SQLJobTracker](
[servername] [varchar](20) NULL,
[jobname] [varchar](75) NULL,
[message] [varchar](200) NULL,
[rundate] [datetime] NULL,
[stepid] [int] NULL,
[capturedate] [datetime] NULL CONSTRAINT [DF_SQLJobTracker_capturedate] DEFAULT (getdate())
) ON [PRIMARY]

To keep the job tracker table clean, the first step in the DTS package runs a delete command to remove the old rows:

delete from SQLJobTracker where capturedate < Getdate() - 2

Then on each source in the DTS package I use the following query to extract the data and dump it into the table above:

-- Run this against msdb on each source server
select @@SERVERNAME ServerName, sj.name, sjh.message,
cast(dateadd(ss,
cast(substring(cast(run_duration + 1000000 as char(7)),6,2) as int),
-- Add Run Duration Minutes
dateadd(mi,
cast(substring(cast(run_duration + 1000000 as char(7)),4,2) as int),
-- Add Run Duration Hours
dateadd(hh,
cast(substring(cast(run_duration + 1000000 as char(7)),2,2) as int),
-- Add Start Time Seconds
dateadd(ss,
cast(substring(cast(run_time + 1000000 as char(7)),6,2) as int),
-- Add Start Time Minutes
dateadd(mi,
cast(substring(cast(run_time + 1000000 as char(7)),4,2) as int),
-- Add Start Time Hours
dateadd(hh,
cast(substring(cast(run_time + 1000000 as char(7)),2,2) as int),
convert(datetime, cast(run_date as char(8))))
))))) as char(19)) rundate, sjh.step_id
from sysjobhistory sjh, sysjobs sj
where sj.job_id = sjh.job_id
and sjh.run_status = 0 -- 0 = failed
and convert(datetime, cast(run_date as char(8))) > Getdate() - 2
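
As an aside, SQL Server 2005 and later ship an undocumented helper function in msdb that does this date math for you. Here is a sketch of the equivalent failed-jobs query (my assumption, since the original targets older versions; note agent_datetime returns the job start time, so you would still add run_duration if you want the completion time like the query above):

-- msdb.dbo.agent_datetime() converts the run_date/run_time integers to a datetime
select @@SERVERNAME ServerName, sj.name, sjh.message,
msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time) as rundate, sjh.step_id
from msdb.dbo.sysjobhistory sjh
join msdb.dbo.sysjobs sj on sj.job_id = sjh.job_id
where sjh.run_status = 0
and msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time) > getdate() - 2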

I like to keep two days' worth of logs in the table so I can see a job fail one day and succeed the next; that is just how my mind works. Now, one of the drawbacks of DTS is the lack of control over step precedence, i.e., which steps happen in what order. SSIS does a much better job with this. The limitation was glaring in DTS when trying to make the generate-report step run last, after all the extracts. So as a workaround I just created another DTS package to extract the data into a CSV and email me the report:

select * from sqljobtracker order by rundate

So the end result is a daily email with a list of the jobs that failed. Each day you get a quick snapshot of the problem jobs. I am sure there are more elegant ways to make this happen, but this method took about 10 minutes of work and works out pretty well.