Conversion of Expressions from Infix to Postfix Notation in C# – Part 1

Conversion of Expressions from Infix to Postfix Notation in C# – Part 3 Custom Functions

Originally in the second part, I was going to add the ability to both evaluate expressions with the unary + and – and also build in support for functions. Due to time pressures I am going to have to save the addition of functions for part 3, but this will be added! Part 2 is going to concentrate on building in support for unary + and -.

## What are Unary Operators

For an in-depth explanation you can read the Wikipedia article on this. You can see from the article that there are more unary operations than + and -, but these are the operators we are going to concentrate on supporting in our expression evaluator. Quite simply put, the expression -5 expresses negative 5. The expression **2 * -5 = -10**. There is the concept of a unary +, so you could have the expression **2 * +5 = 10**, but actually the unary + is not needed, because the 5 is assumed to be positive.

## Why Explicitly Support This in Code

So why are we having to explicitly build in support for this in our parser? If you look back at Part 1, the Shunting Yard algorithm for converting infix to postfix is explained. As it stands, the unary operators are going to confuse the code that converts the expression. If you consider the infix expression **2 * 5**, this converts to a postfix expression of **25*** and using our code to evaluate this causes no problems at all. If you add in the unary – our (**2 * -5**) our code converts this to a postfix expression of **2*5-** and when we try to evaluate this, our code is going to get very confused, so we need to do something to the code that converts the expression from infix to postfix to stop the confusion.

So what can we do? I have read a couple of articles to see how people have got around this. The first piece of good news and really this is obvious, we can ignore the unary + as it serves no real purpose. We first need to think about how we decide if an operator is a unary operator and there are a some simple rules we can follow, these are:

- If the + or – is the first character in the infix expression, its assumed to be unary. E.g.
**-5 * 2** - If the + or – comes after another operator, its assumed to be unary. E.g.
**5 * -2** - If the + or – comes after an
*opening*parenthesis, its assumed to be unary. E.g.**1 + (-5 * 2)**

So that’s our rules, what can we do in code. Well a lot of responses on forums have said to find out if the operator is unary and if its a + ignore it and if its a – replace it with another character such as # or ~. That’s fine and would overcome the issue, but we then need to really play around with our code, both the infix to postfix converter and evaluation code and I want to avoid this if possible as the code seems to work well, is there an alternative? Well, yes I think there is, lets see how we can convert a positive number to a negative using simple maths. To convert 5 to -5 I would normally say *‘multiply by -1’*, **5 * -1 = -5**, but that doesn’t help us as there is a unary – in the expression and that’s what we are trying to replace!! So we could always do **0 – 5 = -5**, that gives us -5 quite nicely and if you think about our expression, gives us a clue as to what we can do. Lets look at our three rules above, could we not do the following and derive the same answers?

- -5 * 2 = (0 – 5) *2
- 5 * -2 = 5 * (0 – 2)
- 1 + (-5 * 2) = 1 + ( (0 – 5) * 2 )

So our solution would involve a slight code change, which would be to applied to our infix expression *before* it is converted to postfix.

- Look for a unary +, if its found, remove it from the infix expression as its not needed.
- Look for a unary -, if its found, insert a
**(**and a**0**before the unary – and a**)**after the operand the unary minus applies to.

We would then have a valid infix expression with the unary operators removed but produces the same result when evaluated, but more importantly, we have an infix expression that still works with our postfix conversion code.

## The Code!

Enough theory, lets put this into code! We’ve said that we need to do the above before the expression is converted to postfix, so we are either going to build the code into our InfixToPostfix method, or write a method to work on the expression but call it from the InfixToPostfix method. I’m going to follow the route of writing a method and call it.

One of the first things our InfixToPostfix method does is to split an infix expression and produce a list of the expressions component parts, this makes adding and removing items from the expression nice and easy. Our original InfixToPostfix method had a LINQ expression that took our infix expression as a string, applied a regular expression to it and looped around the list. As I want to pass the expression converted to a list into our new method, I have refactored the method slightly to now do the following:

string pattern = @"(?<=[-+*/(),^<>=&])(?=.)|(?<=.)(?=[-+*/(),^<>=&])"; expression = expression.Replace(" ", ""); Regex regExPattern = new Regex(pattern); List<string> expr = new List<string>(regExPattern.Split(expression)); expr.RemoveAll(s => String.IsNullOrEmpty(s.Trim())); //parse our expression and fix unary + and -ParseUnary(ref expr);expr.ForEach(s =>

I’ve highlighted in bold our call to a method called ParseUnary which applies our rules we have derived above. This is a very simple method and looks as follows:

private static void ParseUnary(ref List<string> expr) { for(int i = 0; i < expr.Count; i++) { //we have a minus //if its a unary, then the following happens: //1. At the beginning: -1 * 2 = ( 0 - 1 ) * 2 //2. After an opening parenthesis: 1 + ( -2 * 3 ) = 1 + ( ( 0 - 2 ) * 3 ) //3. After an operator: 1 + -2 = 1 + ( 0 - 2 ) if (expr[i] == "-") { //is it at the beginning of the expression OR //it follows and opening parenthesis OR operator> //if so, its a unary minus if (i == 0 || expr[i - 1] == "(" || operators.ContainsKey(expr[i - 1])) { expr.Insert(i, "("); expr.Insert(i + 1, "0"); expr.Insert(i + 4, ")"); } } //we have a + else if(expr[i]=="+") { //is it at the beginning of the expression //OR it follows an opening parenthesis OR operator? //if so, we can drop it, as it serves no purpose if (i == 0 || expr[i - 1] == "(" || operators.ContainsKey(expr[i - 1])) expr.RemoveAt(i); } } }

As you can see, we iterate over the list and apply our rules for checking if a + or a – is a unary operator. As we have said + can be ignored, so if we find one, we simply remove it from the list. If we find a unary -, we insert our leading ( and 0 before the unary – and then insert our closing parenthesis after the operand the unary – applied to. As we are passing the list that contains the expression by reference, we don’t need to return anything from our ParseUnary method. This style is personal choice.

And that’s it! We’ve added the new method and made the changes to the existing method to implement handling the unary + and -, so lets see this working with the expression **1 + (-5 * 2)**. This should give us the answer -9. Evaluate -5 * 2 first to give us -10 and then add 1 to it gives us -9.

As you can see from the postfix expression in the image above, the (0-5) as been applied and when evaluated, has given us the correct answer of -9.

I have made available a new version of the Visual Studio solution that implements the new additions above and you can download this from here.