String Search in Scratch

When I originally wrote the custom string blocks for Scratch, I posted a discussion on the ScratchEd site to let people know about them. I received a message a couple of weeks ago about adding a block that allows a string to be searched for a specific string (sub string). This is certainly a useful addition to the library of blocks and one I wanted to add when I had some free time.

Having found myself with the free time this morning, I have implemented a new custom block called FindString. FindString allows you to specify a string to be searched, a string to find and finally the search start character position within the string to be searched. The block is used as follows:

FindStringLooking at the example above, when executed, the block will search the string I like programming in Scratch for the string in starting at the character position 1 within our string to search. The block returns the start position of the string to find in a variable called startsAt. Executing the example will set startsAt to 16, as in starts at character position 16 in our string to search, if the string is not found, startsAt will be set to zero.

This block stops running after it has found the first occurrence of the string to search for. I will explain at the end how to search for more than one occurrence of a string.

How Does It Work?

So, how does the block work? Well, its actually very simple to follow. The image below shows the program code for the block.

Find String BlockWhen I first started writing the block, my first version used two loops (nested loops). The block worked fine, but after looking at the finished block, I realised that it could be re-written to use just one loop, making it a little easier to follow. So, lets break down the block above into logical steps to understand whats going on.

1. As we are doing a charcter by character comparison of the two strings, we need two variables to act as ‘pointers’ into our string to be searched and the string we are searching for. stringPos will hold the current character position in our string to be searched and subPos will hold the character position in the string we are searching for.

2. We initialise the startsAt variable to zero. Remember this variable holds the start position of our string we are searching for when the block has completed.

3. We now loop over the whole of the string to search for character by character, doing the following:

4. We check if the character at position stringPos in our string to be searched is the same as the character at position subPos in our string we are searching for. If they are, we first check if subPos is set to 1, if it is, we want to set startsAt to the value of stringPos, as this could possibly be the start location of our string to search for.

5. We then add 1 to both stringPos and subPos.

6. We next check to see if subPos (remember this is the character position in our string to search for) is equal to the length of our string to search for plus 1 character. If this is the case, we have actually found our string, so we can stop the block executing any further.

7. You may find what has been said in Step 6 confusing, but when we check if the characters from the two strings match in Step 4, if they don’t match, we set subPos back to 1 and add 1 to stringPos. We set subPos back to 1, as we want to continue searching the rest of the string should only a few of the characters of our string to search for has been found.

Its actually quite tricky to clearly explain in words what the block is doing, the best way to understand it is to look at it in Scratch, it really won’t be that hard to follow.

How Can I Search More Than Once Occurrence?

I did say at the beginning of this post that the block will only search for the first occurrence of a string, but what if you want to search for more than one occurrence of the same string? Well its very simple, all we need to do is keep calling the FindString block with a new start position. The easiest way is to do this in a loop and keep calling the block until startsAt equals zero. We will change the start position of each call to the FindString block by the position the last occurrence was found plus 1 character, this ensures we don’t find the same occurrence twice.

I have written a simple example below, that inserts all the positions of the occurrences in a list. The string we are going to search is “My cat likes to eat cat food. He is a happy cat” (I don’t have a cat by the way!). We will search for all the occurrences of the word cat. Shown below is the Scratch code to do this.

Occurrence CodeOnce this code has been run, the Occurances list will look as follows:

Occurrence ListAs you see, searching for multiple occurrences of string is quite a simple process. I hope you find the addition of this block useful and feel free to use it in any way you like.

Advertisements

Catching and Storing Exceptions in SSIS Script Tasks

I use SSIS (SQL Server Integration Services) very often to write data extraction and migration packages. More recently I have been using SSIS to write an interface to update a third party system through calling web services. The SSIS package runs as a scheduled task and queries a database table (call this the interface table) which holds records that have been updated in the source system, that also need updating in the third party system. By using the supplied web services, I am ensuring that all business logic is maintained when updating the third party system, I am also ensuring that no support agreements are invalidated as I am not performing direct updates to the database.

When the package runs, I need to maintain an audit trail of which records were successfully updated and which were not. There are 3 ways in which a record will not be updated, these are:

  1. The record from the source system did not exist in the third party system (This should be a rare occurrance as the source system is updated overnight from the third party system).
  2. Whilst programmatically the call to the web service was successful, the record could not be updated as the web service returned an unsuccessful response.
  3. An exception was thrown in the SSIS package.

Within the SSIS package, I am using a Script Task to perform all the calls to the web services and process the response. The first two items in the list above are handled fairly easily by examining the response coming back from the web service call and updating the status of the record in the interface table.

But what happens if there is an exception thrown in the Script Task in the SSIS package? Well, I would find out eventually that there was an issue, when the customer tells me there is and I have to go and start digging around to find out why the package failed. If I looked in the interface table and there was a status next to the record, I could inform them why the update did not occur. If exceptions are not handled properly, there will be no status information in the interface table saying that an exception has been thrown and I then have to start looking at the job history in SQL Server to find out when and why this happened.

So, how do we handle this in our Script Task? Catching exceptions is easy enough in code and originally I wrote the following code:

try
{
    .. code that could throw an exception ..
    Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception e)
{
    Dts.Events.FireError(-1, "Task Name", e.message, String.Empty, 0);
    Dts.TaskResult = (int)ScriptResults.Failure;  
}

Within the Error Event Handler in the package, I added a simple Execute SQL Task, that used the contents of the System::ErrorDescription variable to update the status message.  Forcing the Script Task to throw an exception (by changing the URL of the web service end-point), the exception information was written back to the interface table, unfortunately all I ended up with as a status was “error in script task” – not exactly useful!! I know there was an error, but I want to know what it was!!

After some further digging and a lot of research, I found a solution to the problem. To overcome the problem, you need to create a package variable that is going to hold the exception message and you need to set this in your catch block. Normally when we work with variables in a Script Task we have to populate the ReadOnlyVariables and/or the ReadWriteVariables lists in the task configuration. For this solution to work, we do not add our package variable to the ReadWriteVariables list.

The code I have ended up with that works is as follows.

try
{
    .. program code that could throw and exception ..
    Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception e)
{
    Variables exceptionVariable = null;
    Dts.VariableDispenser.LockOneForWrite("User::ScriptException", 
                                          ref exceptionVariable);
    exceptionVariable["User::ScriptException"].Value = e.Message;
    exceptionVariable.Unlock();

    Dts.Events.FireError(-1, "Task Name", e.Message, String.Empty, 0);
    Dts.TaskResult = (int)ScriptResults.Failure;
}

Looking at the code above, we have a package variable called ScriptException (remember do not add this to the ReadWriteVariables list!), I then modified my Execute SQL Task for the Error Event Handler to work with the User::ScriptException package variable instead of the System::ErrorDescription variable. When forcing the code to throw an exception, looking at the interface table, the full exception message is now held.

Whilst I don’t want my code to throw exceptions and I always strive for my code to be exception neutral, should the code throw an exception, I can now at least start by looking at the interface table and gain a good understanding of where things went wrong.