String Search in Scratch

When I originally wrote the custom string blocks for Scratch, I posted a discussion on the ScratchEd site to let people know about them. I received a message a couple of weeks ago about adding a block that allows a string to be searched for a specific string (sub string). This is certainly a useful addition to the library of blocks and one I wanted to add when I had some free time.

Having found myself with the free time this morning, I have implemented a new custom block called FindString. FindString allows you to specify a string to be searched, a string to find and finally the search start character position within the string to be searched. The block is used as follows:

[Image: FindString block]

Looking at the example above, when executed, the block will search the string I like programming in Scratch for the string in, starting at character position 1 within our string to search. The block returns the start position of the string to find in a variable called startsAt. Executing the example will set startsAt to 16, because in first appears at character position 16 of our string to search (inside the word programming). If the string is not found, startsAt will be set to zero.

This block stops running after it has found the first occurrence of the string to search for. I will explain at the end how to search for more than one occurrence of a string.

How Does It Work?

So, how does the block work? Well, it’s actually very simple to follow. The image below shows the program code for the block.

[Image: Find String Block]

When I first started writing the block, my first version used two nested loops. The block worked fine, but after looking at the finished block, I realised that it could be re-written to use just one loop, making it a little easier to follow. So, let’s break down the block above into logical steps to understand what’s going on.

1. As we are doing a character by character comparison of the two strings, we need two variables to act as ‘pointers’ into our string to be searched and the string we are searching for. stringPos will hold the current character position in our string to be searched and subPos will hold the character position in the string we are searching for.

2. We initialise the startsAt variable to zero. Remember this variable holds the start position of our string we are searching for when the block has completed.

3. We now loop over the whole of the string to be searched, character by character, doing the following:

4. We check if the character at position stringPos in our string to be searched is the same as the character at position subPos in our string we are searching for. If they are, we first check if subPos is set to 1. If it is, we set startsAt to the value of stringPos, as this could possibly be the start location of our string to search for.

5. We then add 1 to both stringPos and subPos.

6. We next check to see if subPos (remember this is the character position in our string to search for) is equal to the length of our string to search for plus 1 character. If this is the case, we have actually found our string, so we can stop the block executing any further.

7. You may find Step 6 confusing until you know what happens when the characters compared in Step 4 don’t match. In that case we set subPos back to 1 and add 1 to stringPos. We set subPos back to 1 because we want to continue searching the rest of the string should only a few of the characters of our string to search for have been found.

It’s actually quite tricky to clearly explain in words what the block is doing. The best way to understand it is to look at it in Scratch; it really won’t be that hard to follow.
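For anyone who prefers text to blocks, the steps above can be sketched in C#. This is only a rough sketch and not a translation of the Scratch project. The names FindString, stringPos, subPos and startsAt come from the post, but three things are assumptions of the sketch: the comparison ignores case (to mimic how Scratch 2 compares letters), startsAt is reset to zero on a mismatch, and after a failed partial match the search rewinds to the character after the candidate start so that overlapping matches are not skipped.

```csharp
using System;

// The example from the post. Positions are 1-based, as in Scratch.
Console.WriteLine(FindString("I like programming in Scratch", "in", 1));  // 16
Console.WriteLine(FindString("I like programming in Scratch", "xyz", 1)); // 0

// Returns the 1-based start position of 'find' within 'source', searching
// from the 1-based position 'start', or 0 when 'find' is not present.
int FindString(string source, string find, int start)
{
    if (find.Length == 0 || start < 1) return 0;

    int stringPos = start; // pointer into the string to be searched
    int subPos = 1;        // pointer into the string we are searching for
    int startsAt = 0;      // candidate start position of the match

    while (stringPos <= source.Length)
    {
        // Assumption: compare without regard to case, as Scratch 2 does.
        bool same = char.ToUpperInvariant(source[stringPos - 1]) ==
                    char.ToUpperInvariant(find[subPos - 1]);

        if (same)
        {
            if (subPos == 1) startsAt = stringPos; // possible start of a match
            stringPos++;
            subPos++;
            if (subPos == find.Length + 1) return startsAt; // every character matched
        }
        else
        {
            // Assumption: after a failed partial match, rewind to the
            // character after the candidate start; otherwise move on by one.
            stringPos = (subPos > 1) ? startsAt + 1 : stringPos + 1;
            subPos = 1;
            startsAt = 0;
        }
    }

    return 0; // not found
}
```

The rewind step is what lets a search for aab inside aaab succeed; without it the second a would be skipped after the first attempt fails.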

How Can I Search For More Than One Occurrence?

I did say at the beginning of this post that the block will only search for the first occurrence of a string, but what if you want to search for more than one occurrence of the same string? Well, it’s very simple: all we need to do is keep calling the FindString block with a new start position. The easiest way is to do this in a loop and keep calling the block until startsAt equals zero. For each call to the FindString block, we set the start position to the position at which the last occurrence was found plus 1 character. This ensures we don’t find the same occurrence twice.

I have written a simple example below, that inserts all the positions of the occurrences in a list. The string we are going to search is “My cat likes to eat cat food. He is a happy cat” (I don’t have a cat by the way!). We will search for all the occurrences of the word cat. Shown below is the Scratch code to do this.

[Image: Occurrence Code]

Once this code has been run, the Occurances list will look as follows:

[Image: Occurrence List]

As you can see, searching for multiple occurrences of a string is quite a simple process. I hope you find the addition of this block useful and feel free to use it in any way you like.
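The same loop can be sketched in C#. To keep the example short, the FindString function here is a hypothetical stand-in built on the .NET String.IndexOf method rather than the block described above. It keeps the same convention of 1-based positions and a return value of zero when nothing is found. The variable names are assumptions of the sketch.

```csharp
using System;
using System.Collections.Generic;

string sentence = "My cat likes to eat cat food. He is a happy cat";
List<int> occurrences = new List<int>();

// Keep searching, each time starting one character after the last match,
// until FindString reports zero (not found).
int startsAt = FindString(sentence, "cat", 1);
while (startsAt != 0)
{
    occurrences.Add(startsAt);
    startsAt = FindString(sentence, "cat", startsAt + 1);
}

Console.WriteLine(string.Join(", ", occurrences)); // 4, 21, 45

// Stand-in for the FindString block: 1-based positions, 0 when not found.
int FindString(string source, string find, int start)
{
    if (find.Length == 0 || start < 1 || start > source.Length) return 0;
    return source.IndexOf(find, start - 1, StringComparison.OrdinalIgnoreCase) + 1;
}
```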

More Scratch String Custom Blocks

On September 22nd I blogged about a set of custom string blocks I implemented for Scratch. These have been very well received and I am glad people are finding them useful. I did say that I was going to add some more custom blocks to manipulate strings, and I am pleased to say that I have now added four more blocks to the Scratch project. I have now implemented the following blocks:

LTRIM

LTRIM allows you to trim any leading spaces from a string (Left Trim). Using an example, consider the string “     Scratch” (quotes used to show the spaces). You can pass this string into the LTRIM block and it will remove all the leading spaces, giving you the string “Scratch”. You use the block as follows:

[Image: LTrim Block]

The trimmed string is stored in the variable result.

RTRIM

RTRIM allows you to trim any trailing spaces from a string (Right Trim). Again using an example where the string is in quotes, if you call the block with the string “Scratch    ” you will receive a string back with the trailing spaces removed. The block is used as follows:

[Image: RTrim Block]

The trimmed string is stored in the variable result.

TRIM

TRIM is a combination of LTRIM and RTRIM and will remove any leading and trailing spaces. If you look at the block in Scratch, it shows a good example of being able to reuse existing blocks. All TRIM does is call the LTRIM and RTRIM blocks. As an example, if you call the block with the string “    Scratch    ” all of the leading and trailing spaces are removed. You use the block as follows:

[Image: Trim Block]

As with the previous two blocks, the trimmed string is returned in the variable result.
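As a rough illustration of the three blocks, here is a small C# sketch. How LTRIM and RTRIM work inside the Scratch project is not described in this post, so the loops below are an assumption. The only detail taken from the post is that Trim does nothing more than call the other two.

```csharp
using System;

Console.WriteLine("[" + LTrim("     Scratch") + "]");   // [Scratch]
Console.WriteLine("[" + RTrim("Scratch    ") + "]");    // [Scratch]
Console.WriteLine("[" + Trim("    Scratch    ") + "]"); // [Scratch]

// Skips past any leading spaces and keeps the rest of the string.
string LTrim(string str)
{
    int first = 0;
    while (first < str.Length && str[first] == ' ') first++;
    return str.Substring(first);
}

// Works backwards from the end, dropping any trailing spaces.
string RTrim(string str)
{
    int last = str.Length;
    while (last > 0 && str[last - 1] == ' ') last--;
    return str.Substring(0, last);
}

// Reuses the two functions above, just as the TRIM block reuses LTRIM and RTRIM.
string Trim(string str)
{
    return RTrim(LTrim(str));
}
```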

SPLIT

SPLIT is an extremely useful block. It is built into the programming languages I use on a daily basis and I use it often. SPLIT allows you to split a string at a certain character. A simple example of this is splitting a sentence into individual words. If we pass in the string “I like programming in Scratch” and tell it to split on spaces, the SPLIT block will split this into individual strings as follows:

I
like
programming
in
Scratch

I very often have to write programs that work with comma separated values (CSV). In my programs, I would read a line of data from a file that could look like “phil curnow,11/04/1972,41”. This just shows a simple line with my name, date of birth and age. Now, using my language of choice, which is C#, if I wanted to display each of these values on screen, I could write some program code that looks like this:

string str = "phil curnow,11/04/1972,41";

foreach (string s in str.Split(','))
    Console.WriteLine(s);

This would print each of the three items on a separate line on the screen. All I am saying in the program is that I want to split the string wherever there is a comma.

So how can we implement this in Scratch? Well, this is where lists come into play. If we pass a string into our custom block and tell it to split on a space, each of the ‘sub-strings’ can be added to a list, and this is exactly how I have implemented the block.

To call our custom block, we need to supply two parameters: the string we want to split and the character we want to split on. So, looking at the example below, we will pass in the string I like programming in Scratch. The second parameter, although it looks blank, is a space.

[Image: Split Block]

All our sub-strings are added to a list called SplitList and once the block has been executed, SplitList looks as follows:

[Image: SplitList]

Using my example of the CSV list, we could pass the string phil curnow,11/04/1972,41 into the block as follows:

[Image: Split Block CSV]

And our SplitList list would then contain:

[Image: CSV Split List]

As you can see, SPLIT is a very powerful and, above all, useful block to have.
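To give a feel for how a split can be built by hand when there is no built-in Split to call, here is a C# sketch that walks the string one character at a time and adds each sub-string to a list. The internals of the Scratch block are not shown in this post, so treat the loop as one possible approach and the names current and splitList as assumptions.

```csharp
using System;
using System.Collections.Generic;

foreach (string item in Split("phil curnow,11/04/1972,41", ','))
    Console.WriteLine(item);

// Builds up the current sub-string character by character and adds it to
// the list each time the split character is met.
List<string> Split(string str, char splitOn)
{
    List<string> splitList = new List<string>();
    string current = "";

    foreach (char c in str)
    {
        if (c == splitOn)
        {
            splitList.Add(current);
            current = "";
        }
        else
        {
            current += c;
        }
    }

    splitList.Add(current); // whatever follows the last split character
    return splitList;
}
```

Running this prints the name, date of birth and age on three separate lines, matching the CSV example above.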

Do you want to see any more string blocks?

We now have several very useful custom blocks to manipulate strings in our Scratch projects. Are there any other blocks you would like to see for manipulating strings? If you can think of any, leave a comment on here or drop me a tweet and, time permitting, I will see if I can implement them for you.

How the Scratch Custom Upper and Lower case blocks work

Last week I wrote a post about the custom string handling blocks I have written for Scratch. These have been very well received and I have had quite a few positive tweets back about them. I’m glad people are finding them useful!

Two of the blocks I have written allow you to convert a string to either upper or lower case. Now, Scratch 2 is case insensitive when dealing with strings or characters. By this I mean that if you perform a comparison of A = a, Scratch will evaluate these to be the same. To clarify this, let’s look at the block of code below.

[Image: Character Comparison]

If you run this block, you will see that the variable result is set to 1. This is what I mean when I say case insensitive: we may consider that A is not the same as a, but Scratch treats them as equal. So what does this have to do with converting our string to upper or lower case? Well, I’ll cover this a little later and you will see how this case insensitivity can help us!
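For contrast, here is how the same comparison behaves in C#. An ordinary comparison is case sensitive, so you have to ask for a case-insensitive one explicitly. The second line is only a rough stand-in for what Scratch does and is not a claim about how Scratch is implemented.

```csharp
using System;

// An ordinary C# comparison is case sensitive.
Console.WriteLine("A" == "a"); // False

// Asking for a case-insensitive comparison gives the Scratch-like answer.
Console.WriteLine(string.Equals("A", "a", StringComparison.OrdinalIgnoreCase)); // True
```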

ASCII Codes

If I was asked to write a very basic case changing method (the equivalent of a block in Scratch) in, say, C# (which is my preferred language), I may well consider looking at ASCII codes for characters. This is ignoring the fact that there are already two methods available in the .NET Framework to convert between upper and lower case! What is ASCII? Well, I’m not going into the depths of explaining this here, but the Wikipedia entry for it at http://en.wikipedia.org/wiki/ASCII will give you a full overview. ASCII codes can be our friend here for converting between cases: if you consider that the ASCII code for A is 65 (decimal) and the ASCII code for a is 97 (decimal), you can see there is a difference in the code values of 32. This works for B/b, C/c, etc.

So a method to convert upper case to lower case could loop over all the characters in a string and check the ASCII value of each one. If the value is between 65 and 90 (A to Z), we can consider it to be an upper case letter; if we then add 32 to the ASCII value, this will give us the lower case version. We range check the value to ensure we are working with an upper case letter, otherwise the resulting string would look a little strange.

The small C# method below is something that will convert upper to lower case. Now at this point any C# programmer looking at the method would probably pick it to pieces and to be honest I can too. If I was going to write a version for production code it would be vastly different to this. This method has been written to explain my points above and nothing more.

string ToLowerCase(string str)
{
    string result = String.Empty;

    for (int loop = 0; loop < str.Length; loop++)
    {
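        // ASCII codes 65 to 90 are the upper case letters A to Z
        if ((int)str[loop] >= 65 && (int)str[loop] <= 90)
            result += Convert.ToChar((int)str[loop] + 32); // add 32 to get the lower case letter
        else
            result += str[loop]; // anything else is copied unchanged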
    }

    return result;
}

Passing the string abcd&ABCD to this method would return the string abcd&abcd. Great! We have a working conversion routine. So let’s go and implement this in Scratch.

Our Scratch Version

Right, so now we know we can work with the ASCII codes for letters, we just convert what we have above to the corresponding Scratch code, right? Well, no we don’t, sadly. Scratch does not give us the ability to work with the ASCII codes for letters, so this method will not work. Great! So how can we do this then? Well, remember I explained at the beginning of this post that Scratch is case insensitive and that it can be our friend? This is where it becomes our best friend!

When I was implementing my string blocks, I really did think that case conversion had to be included, so with a little thought I figured there must be a way of doing this. The trick with solving any problem like this is to have a look at what we have to work with in Scratch. Whilst the C# method is very nice, we have to throw most of it away and think again. My eureka moment came when I thought about lists. My thinking here when converting to lower case was as follows:

1. Have a list that contains all the lower case letters a through to z
2. Loop through every character in our string and see if it exists in our list
3. If it exists, use the character found in the list (which will be the lower case character)
4. If it is not found in the list, use the original character.

So, putting this together, I generated a list called LowerCase which looks like this:

[Image: Lowercase List]

This method will work even if we pass in lower case letters: we will still get our character from the list, which will be lower case. So, my Scratch ToLowerCase block looks like this:

[Image: ToLowerCase Block]

Our block starts by initialising a couple of variables. result will hold our converted string and charCount is used by the repeat block to loop through every character in our string. I have then implemented another custom block called GetLowerCaseLetter. Splitting out code like this into another block makes the code more readable and also gives us yet another block that allows us to do single character conversion. Our GetLowerCaseLetter block looks in our list and finds the lower case letter we need. After calling this block, our lower case letter is held in the variable letterResult so we simply add this to our result variable. We then add 1 to the character count and go around the loop again. This keeps going until we have worked with every character in our string.

The GetLowerCaseLetter Block

This is where most of the work is done. Let’s have a look at this block.

[Image: GetLowerCaseLetter Block]

Again at the beginning we initialise a couple of variables. loop is used to loop through our LowerCase list. letterResult is used to return our converted character. We are then going to loop through the list a maximum of 26 times (remember a to z!). When we are in the loop, we check if the letter we have passed to the block, e.g. A, is in our list. So using a simple if block we are essentially saying if A = a then set letterResult to the item found in our list, which will be a. As we don’t want to go around the loop again, we stop this script. At this point letterResult holds a, which is the lower case version of what we passed in. If we had passed in a, it would have found this and letterResult would hold a.

If our block has got as far as the end of the loop, we have another if block that says if letterResult is empty, we must have passed in a character other than a letter of the alphabet, so just simply return it. This ensures anything like %&*() is still kept in the converted string.

So putting this to the test, if we use the block like this:

[Image: Call ToLowerCase]

The string we will get back is abcd&abcd, which is the lower case version of what we passed in!
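The list-based idea can also be written out in C#, which may help if you want to trace it in text form. This is a sketch of the approach described above and not an export of the Scratch blocks. The case-insensitive string.Equals call stands in for Scratch’s own comparison, and building the lowerCase array from a single string is simply a convenience of the sketch.

```csharp
using System;

Console.WriteLine(ToLowerCase("abcd&ABCD")); // abcd&abcd

// Looks the character up in a list of a to z using a case-insensitive
// comparison, so A and a both find the list item a.
string GetLowerCaseLetter(string letter)
{
    string[] lowerCase = "a b c d e f g h i j k l m n o p q r s t u v w x y z".Split(' ');

    for (int loop = 0; loop < lowerCase.Length; loop++)
    {
        if (string.Equals(letter, lowerCase[loop], StringComparison.OrdinalIgnoreCase))
            return lowerCase[loop]; // use the character from the list
    }

    return letter; // not a letter of the alphabet, so keep it as it is
}

// Converts every character in turn and joins the results together.
string ToLowerCase(string str)
{
    string result = String.Empty;

    for (int charCount = 0; charCount < str.Length; charCount++)
        result += GetLowerCaseLetter(str[charCount].ToString());

    return result;
}
```

Swapping the list for one that holds A to Z would give the upper case version with no other changes.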

Converting to Upper Case

Converting to upper case is exactly the same, we just use a list that contains upper case A to Z. If you look at the ToUpperCase block in the Scratch project, you will see it is exactly the same as its lower case version. The GetUpperCaseLetter block again is exactly the same as its lower case version.

So, again as you can see, whilst our Scratch version of converting to upper and lower case is very different to what we may do in other programming languages, if you sit back and think about the problem and consider what you have to work with in Scratch, there is generally always a way of implementing what you want to do.

Scratch Custom String Blocks

Having the ability to manipulate strings is one of the most powerful facilities you can have in a programming language. In my day to day coding I manipulate strings an awful lot. You may want to be able to get specific characters in a string, convert the whole string to uppercase characters or lowercase characters. Many years ago, one of the first programming languages I learnt was BASIC (http://en.wikipedia.org/wiki/BASIC). BASIC has several functions for allowing you to manipulate strings, for example:

LEFT$(string, n)

LEFT$ allows you to return the leftmost n characters from your string. For example, if you use the command PRINT LEFT$("Scratch", 3) the letters Scr would be printed to the screen. The n parameter tells the function how many characters starting at the beginning of the string you want.

RIGHT$(string, n)

Similar to LEFT$, RIGHT$ allows you to return the rightmost n characters from your string. So something like PRINT RIGHT$("Scratch", 3) would print tch to the screen. The n parameter tells the function how many characters starting at the end of the string and working backwards you want.

MID$(string, start, n)

MID$ is quite a powerful function. It allows you to get a certain number of characters, but starting at any point in the string. So using a command like PRINT MID$("Scratch Programming", 9, 7) would print Program to the screen. The start parameter tells the function where you want to start in the string and the n parameter tells the function how many characters you want to return from the starting point.

UPPER$ and LOWER$

Some other functions available are UPPER$ and LOWER$. These return either an uppercase or lowercase version of your string. So something like PRINT UPPER$("scratch") would print SCRATCH and PRINT LOWER$("SCRATCH") would print scratch. Again, very useful!

How can we do this in Scratch?

Out of the box, Scratch does not have these kinds of blocks built in. As we know, Scratch 2 allows us to write our own custom blocks, so we could write our own versions of these. What I have done is build versions of the functions explained above as custom blocks that can be used in your Scratch programs.

I have made these available at http://scratch.mit.edu/projects/12402145/. If you look inside the project, all the custom blocks are defined on the stage, with an example of how you would use each block next to the block definition. I will give an explanation below on how you would use each custom block.

Please Note: The resulting string from each of these blocks is stored in a variable called ‘result’.

Scratch LEFT$

This has been implemented as a custom block called Left. You would use the block like this:

[Image: Scratch Left Block]

The first parameter is the string or the variable holding the string you want to work with. The second parameter is the number of characters from the left you want. The result is stored in a variable called result. Looking at the example above, after calling this block, the result variable would hold Scr.

Scratch RIGHT$

This has been implemented as a custom block called Right. You would use the block like this:

[Image: Scratch Right Block]

The first parameter is the string or variable holding the string you want to work with. The second parameter is the number of characters from the right that you want. The result is stored in a variable called result. Looking at the example above, after calling this block, the result variable would hold tch.

Scratch MID$

This has been implemented as a custom block called Mid. You would use the block like this:

[Image: Scratch Mid Block]

We have three parameters that we have to supply to this block. The first parameter is the string or variable holding the string you want to work with. The second parameter is where you want to start in the string. The final parameter is the number of characters you want to get from the string. The result is stored in a variable called result. So, looking at the example call above, we can see that we will be working with the string Scratch Programming and we want to start returning characters from the 9th character in the string and that we want to return 7 characters. Starting at the 9th character (P) we will get 7 characters, which will give us the string Program.
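If it helps to see the three blocks side by side in a text language, here is a C# sketch of Left, Right and Mid built on the .NET Substring method. It mirrors the behaviour described above, including the 1-based start position for Mid. What happens when n is larger than the string (here it is simply capped) is an assumption of the sketch, since the post does not cover that case.

```csharp
using System;

Console.WriteLine(Left("Scratch", 3));               // Scr
Console.WriteLine(Right("Scratch", 3));              // tch
Console.WriteLine(Mid("Scratch Programming", 9, 7)); // Program

// The leftmost n characters.
string Left(string str, int n)
{
    n = Math.Max(0, Math.Min(n, str.Length)); // assumption: cap n to the string length
    return str.Substring(0, n);
}

// The rightmost n characters.
string Right(string str, int n)
{
    n = Math.Max(0, Math.Min(n, str.Length)); // assumption: cap n to the string length
    return str.Substring(str.Length - n);
}

// n characters beginning at the 1-based position 'start'.
string Mid(string str, int start, int n)
{
    if (start < 1 || start > str.Length || n <= 0) return String.Empty;
    int available = str.Length - (start - 1);
    return str.Substring(start - 1, Math.Min(n, available));
}
```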

Scratch UPPER$ and LOWER$

I have also implemented blocks that allow strings to be converted to upper or lower case. These blocks are called ToUpperCase and ToLowerCase. As with all the other blocks, the result is stored in a variable called result. You would use the blocks like this:

[Image: Scratch ToUpperCase Block]
[Image: Scratch ToLowerCase Block]

Calling the ToUpperCase block on the string shown would convert it to SCRATCH and calling the ToLowerCase block on the string shown would convert it to scratch. If you supplied a string such as ScRaTcH to either of these blocks, it will still work as expected.

I am going to write a further blog post on how the ToUpperCase and ToLowerCase blocks work, because Scratch is a little different to most languages when comparing upper and lower case characters. The code for the other blocks is fairly easy to follow and figure out what’s going on.

Any More Blocks?

I will be implementing more string manipulation blocks over a period of time. The blocks I want to implement next are:

  – SPLIT(string, char)
  – LTRIM(string)
  – RTRIM(string)
  – TRIM(string)

SPLIT enables you to split one string into several strings, splitting the string at a specific character. For example if you supplied a string I Program In Scratch and told the block to split on the space character, you would end up with 4 strings containing:

  I
  Program
  In
  Scratch

Each of these strings would be held in a list.

LTRIM and RTRIM remove all spaces to either the left or the right of a string. So if you had a string “    Scratch” (quotes used to show the spaces) and passed the string to the LTRIM block, it would remove all the leading spaces. Similarly, if you had the string “Scratch   ” and passed the string to the RTRIM block, it would remove all the trailing spaces. TRIM combines LTRIM and RTRIM and will remove all leading and trailing spaces.

As these blocks are added, I will add a post to this blog and will also add a comment in the Scratch forums and on the ScratchEd web site.

Use and Share!

I do hope that you find these blocks useful and feel free to use and modify them in any way you would like. Share them around with your fellow Scratchers.