Conversion of Expressions from Infix to Postfix Notation in C# – Part 2 Unary Operators

Conversion of Expressions from Infix to Postfix Notation in C# – Part 3 Custom Functions

If you have a need in your applications to evaluate mathematical expressions and you search the term ‘convert infix to postfix’, you will find a huge amount of information and code out there that will get you to where you need to be, some with explanations of how the code is working, others with just the program code. This post will hopefully fulfill both criteria, by giving an explanation of what the code is doing and also provide the Visual Studio Solution (VS2013) I have produced.

## This is all out there already, why do this?

Good question!! Over the past 12 months I have been pretty much working solidly on my day job (the one that pays the bills!) and haven’t been able to spend much of my spare time sitting back and writing code for pleasure. With all the travel I have had, the time spent at home has been with the family. I’ve found myself with 2 weeks off over the Christmas period and this has been an ideal time to spend some time coding for pleasure. I’ve got a tonne of code I could be writing for work, but I’ve needed to switch off from that totally (not easy!!). For some reason the idea of writing some code to do expression evaluation popped into my head and this was an ideal little coding project that I could dip in and out of over the couple of weeks off.

## What do the conversion from Infix to Postfix?

If you are reading this post, you more than likely already know the difference between infix and postfix notation. For a good description of these two notations (and for that matter prefix), follow the link here, its a very good brief description of the three notations.

With infix expressions, writing an evaluator can be a fairly complicated task, you need to cater for parenthesized expressions (potentially levels deep) and this can become potentially complicated to code (recursion anyone?). For example, with the expression **( (A * B) + (C / D) )** your evaluator will need to take into account all the parenthesized parts in order to derive the correct answer. Throw into the mix that you may also want to use functions, e.g. sin, cos, and it suddenly becomes not so much fun to write.

## Postfix (Reverse Polish Notation)

Postfix notation specifies that all operators follow all of their operands and all parenthesis are removed. Looking at our infix expression **( (A * B) + (C / D) )** its posfix form would be **AB*CD/+**. To write code to evaluate this becomes an awful lot easier. If our code utilises a stack, we can do the following:

infix:** ( ( 5 * 5) + (4 / 2) )**

postfix expression: **55*42/+**

for each char in postfix expression { if char is operator { rhs = Pop(); lhs = Pop(); switch(char) { case '*': Push(lhs*rhs); break; case '/': Push(lhs/rhs); break; ... all other operators ... } } else Push(char); //push non-operator value to the stack. }

So, working this through using our pseudo-code above, we get the following:

- push 5 to the stack
- push 5 to the stack
- Char * is an operator, so pop values from the stack into lhs and rhs and evaluate 5 * 5 and push result 25 to the stack
- Push 4 to the stack
- Push 2 to the stack
- Char / is an operator, so pop values from the stack into lhs and rhs and evaluate 4 / 2 and push result 2 to the stack
- Char + is an operator, so pop values from the stack into lhs and rhs and evaluate 25 + 2 and push result 27 to the stack
**END as postfix expression has been evaluated**

## Converting from Infix Please!!

Having read everything above, it all looks wonderful, but how to we get from infix to postfix and is it difficult? Fortunately this is not a difficult process and there is an algorithm for this called the Shunting Yard Algorithm. This is the algorithm we are going to implement. My code does not support functions or unary + and -. In part 2 of this series I will implement support for functions and unary + and -.

I’m not going to put the Shunting Yard Algorithm in this post, as I have linked to it above, so we can dive into the code that performs the conversion.

## Operator Precedence and Associativity

The algorithm needs to take into account operator precedence and associativity when it constructs the postfix version of the expression. The wikipedia article gives this in the form of a table and this is what we will replicate in our code. We will implement this as a generic dictionary Dictionary<string,int[]) where the string will be the operator symbol and the integer array will hold the precedence and a value of 1 or 0 for either left or right association. The definition for this is as follows:

private static Dictionary<string, int[]> operators = new Dictionary<string, int[]>(); //populate operators //int format is {precedence, association -> 0=Left 1=Right} operators.Add("^", new int[] { 5, 1 }); operators.Add("*", new int[] { 4, 0 }); operators.Add("/", new int[] { 4, 0 }); operators.Add("+", new int[] { 3, 0 }); operators.Add("-", new int[] { 3, 0 }); operators.Add("<", new int[] { 2, 0 }); operators.Add(">", new int[] { 2, 0 }); operators.Add("=", new int[] { 2, 0 }); operators.Add("&", new int[] { 1, 0 });

Our implementation of the algorithm also takes into account the operators < > = and &, so the precedence values have been modified compared to the Wikipedia article. Should you wish to add more operators you must make sure you get the precedence of the operator dictionary absolutely correct.

## Starting the Conversion

We then need to split our infix expression into its component parts so that we can parse it more easily. I’m using a regular expression to do this, as follows:

string pattern = @"(?<=[-+*/(),^<>=&])(?=.)|(?<=.)(?=[-+*/(),^<>=&])"; expression = expression.Replace(" ", ""); Regex regExPattern = new Regex(pattern); regExPattern.Split(expression).Where(s => !String.IsNullOrEmpty(s.Trim())).ToList().ForEach(s => {

I don’t profess to be an expert with regular expressions and there is a small side effect with our expression in that it does produce empty elements in our list when we perform the split. This I think is to do with grouping within the regex and I will address this, but for the time being, I’m removing these in the LINQ expression when I start to iterate over the elements of the expression.

Compared to the algorithm explanation, I’m handling the order I do things slightly differently. I’m first checking if the element of the infix expression is an operator, if so, we process it, if not we go on to handle opening and closing parenthesis. If the element is neither of these things, its an operand and I’m simply adding it to the output queue. The order I have done things is just personal choice, it just seemed more logical to me.

//is our token one of our defined operators //deal with precedence and association if (operators.ContainsKey(s)) { //while the stack is not empty and the top of the stack is not an ( while (stack.Count > 0 && stack.Peek() != "(") { if ((GetAssociation(s) == 0 && GetPrecedence(s) <= GetPrecedence(stack.Peek())) || (GetAssociation(s) == 1 && GetPrecedence(s) < GetPrecedence(stack.Peek())) ) queue.Enqueue(stack.Pop()); else break; } //push operator onto the stack stack.Push(s); } //handle opening parenthesis //simply push this on the stack else if (s == "(") { stack.Push(s); } //handle closing parenthesis //pop all operators off the stack until the matching //opening parenthesis is found and then discard the //opening parenthesis else if (s == ")") { while (stack.Count != 0 && stack.Peek() != "(") queue.Enqueue(stack.Pop()); //forget the ( stack.Pop(); } else //none of the above so queue it queue.Enqueue(s); }); //pop off the rest of the stack while (stack.Count != 0) queue.Enqueue(stack.Pop()); return queue;

When we enter the loop to process the expression elements, we are first checking if the element is a valid operator, if it is, we are checking to see if there are any operators on the stack. If there are, we perform the precedence and association check for our current operator against what is on the top of the stack and pop off the operator on the stack and add to the output queue if required, pushing our currently selected operator element onto the stack. If we don’t have any operators on the stack, we simply push our operator onto the stack.

If our current operator in the expression is an opening parenthesis, we simply push this onto the stack and that’s all.

If our currently operator in the expression is a closing parenthesis, we pop all operators off the stack and add them to the output queue *up to the point of the matching opening parenthesis*. We then pop off and forget about the opening parenthesis, in other words, we don’t add it to the output queue.

Finally if the element in our expression is neither an operator or an opening or closing parenthesis, we assume that its an operand and we add it to the output queue.

When all expression elements have been processed within the loop, we check to see if there any operators left on the stack, if there are, we simply pop them off the stack and add them to the output queue.

And that’s it!! We then just return the queue as a return parameter from the method. My decision to use a Queue within the method is just personal choice and may or may not be a good decision, quite probably a generic list would have been better.

There are a couple of additional methods used that obtain the precedence and association values and these are shown below.

//Get the precedence of the operator passed in private static int GetPrecedence(string s) { return operators[s][0]; } //Get the association of the operator passed in private static int GetAssociation(string s) { return operators[s][1]; }

To use these methods, we simply pass in the operator and the associated precedence or association value is returned.

## The Output

I have wrapped all my code into a class called Evaluate and with this in mind, the code I can use to convert an infix expression to postfix and display the result is:

string exp = "( (5 * 5) + (4 / 2) )"; Console.WriteLine("Infix Expression: {0}", exp); Console.Write("Postfix Expression: "); Queue<string> q = Evaluate.InfixToPostFix(exp); foreach (string s in q) Console.Write(s);

The output on the console looks as follows:

Evaluating the Postfix Expression

So now we have the expression converted to postfix form, we need to evaluate it, after all this is the whole purpose of this post! At the beginning of this post, I gave the algorithm for evaluating a postfix expression, so there is no need to go over this again. The method to evaluate is as follows:

public static double CalculateExpression(Queue<String> postfix) { Stack<double> stack = new Stack<double>(); postfix.ToList<String>().ForEach(token => { if (operators.ContainsKey(token)) { if (stack.Count > 1) { double rhs = stack.Pop(); double lhs = stack.Pop(); switch (token) { case "+": stack.Push(lhs + rhs); break; case "-": stack.Push(lhs - rhs); break; case "*": stack.Push(lhs * rhs); break; case "/": stack.Push(lhs / rhs); break; case "^": stack.Push(Math.Pow(lhs, rhs)); break; case "=": stack.Push(Convert.ToDouble(lhs == rhs)); break; case "<": stack.Push(Convert.ToDouble(lhs < rhs)); break; case ">": stack.Push(Convert.ToDouble(lhs > rhs)); break; case "&": stack.Push(Convert.ToDouble((int)lhs & (int)rhs)); break; } } } else stack.Push(Convert.ToDouble(token)); }); //everything has been evaluated and our result is on the stack //so return it return stack.Pop(); }

To call this method, we simply pass in what was returned from our method that converts infix to postfix, for example:

string exp = "( (5 * 5) + (4 / 2) )"; Console.WriteLine("Infix Expression: {0}", exp); Console.Write("Postfix Expression: "); Queue<string> q = Evaluate.InfixToPostFix(exp); foreach (string s in q) Console.Write(s);Console.WriteLine("\nThe expression evaluates to: {0}", Evaluate.CalculateExpression(q));

The output on the console then looks as follows:

As you can see, the expression is evaluated and returns the answer of 27, which in my book means the evaluation has worked!

## Other Expression Types

As I have added the operators < > and & and = you can build and evaluation some more complex expressions, such as:

- (2*2) = (1+3)
**which would return the value 1 meaning true** - (2*2) < 1
**which would return the value 0 meaning false**

## Conclusion

Whilst this has been a fairly long post, we have covered a lot of ground, both theory and program code. I hope you find this post useful and feel free to leave a comment or contact me via twitter (@pcurnow).

In part 2 of this series, we will look at adding the ability to evaluate functions and give the ability to add your own functions to be evaluated.

You can find a full Visual Studio 2013 solution that contains a full working version of the conversion and evaluation routines described in this post here.