Basic Instincts

Lambda Expressions

Timothy Ng

Contents

What are Lambda Expressions?
Lambda Expressions as Callbacks
Why Add Lambda Expressions?
Type Inference
Code Generation under the Hood
Lambda Expressions and Variable Lifting
Make the Most of Lambda Expressions

Lambda expressions, new in Visual Basic® 2008, are a handy addition to any programmer's toolbox. They are callable entities that are defined within a function, and they're first-class citizens; you can return a lambda expression from a function and you can pass lambda expressions to other functions. Lambda expressions were added to Visual Basic 2008, formerly code-named "Orcas," in order to support Language Integrated Queries (LINQ), which adds data programmability to Visual Basic (more on that later). As you use lambda expressions, you will begin to see the power and flexibility they promote. I invite you to sample the basic concepts of lambda expressions, explore their benefits, and witness how to use them to write more expressive programs.

What are Lambda Expressions?

The following is an example of a basic lambda expression definition. It defines doubleIt as a lambda expression that takes an integer and returns an integer. The lambda expression effectively takes the input, multiplies it by 2, and then returns the result.

Dim doubleIt As Func(Of Integer, Integer) = _
    Function(x As Integer) x * 2

The Func type is also new in this release (it ships with the .NET Framework 3.5 alongside Visual Basic 2008); it is essentially a delegate that has the return type specified as the last generic parameter and allows up to four arguments to be supplied as the leading generic parameters (there are actually several Func delegates, each of which accepts a specific number of parameters). The Func delegate type is defined in the assembly System.Core.dll in the System namespace. All new projects created with Visual Basic will automatically have a reference to System.Core.dll, so you can take advantage of the Func type immediately.

The following code shows the various overloads of Func:

Dim f0 As Func(Of Boolean)
Dim f1 As Func(Of Integer, Boolean)
Dim f4 As Func(Of Integer, Integer, Integer, Integer, Boolean)

Here, f0 is a delegate that returns a Boolean, f1 is a delegate that takes an integer and returns a Boolean, and f4 is a delegate that takes four integer arguments and returns a Boolean. The key point to note is that a lambda expression is typed as a delegate; that is, it is a callable entity, just like the delegates in Visual Basic 2005.
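As a minimal sketch of how these three delegate shapes can be filled in, the following module assigns a lambda expression to each one and invokes it. The module name and the lambda bodies are illustrative assumptions, not part of the original listing.

```vb
Imports System

Module FuncOverloadsSketch
    Sub Main()
        ' No arguments, returns Boolean
        Dim f0 As Func(Of Boolean) = Function() True

        ' One Integer argument, returns Boolean
        Dim f1 As Func(Of Integer, Boolean) = Function(n As Integer) n > 0

        ' Four Integer arguments, returns Boolean
        Dim f4 As Func(Of Integer, Integer, Integer, Integer, Boolean) = _
            Function(a As Integer, b As Integer, c As Integer, d As Integer) _
                a + b + c + d > 10

        Console.WriteLine(f0())            ' True
        Console.WriteLine(f1(-3))          ' False
        Console.WriteLine(f4(1, 2, 3, 4))  ' False (the sum is exactly 10)
    End Sub
End Module
```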

On the right-hand side of the assignment in the first code snippet, you can see the new lambda expression syntax. It starts with the keyword Function, which is then followed by an argument list and a single expression.

In that earlier example, the lambda expression takes one argument, x, which is an integer. Notice, however, that there is no return statement. That's because the body of the lambda is a single expression whose value is the result, and the Visual Basic compiler already knows the result type from that expression, so a return statement is superfluous. In this case, since x is an integer, x * 2 is also an integer, so the result of the lambda expression is an integer.

The magic of a lambda expression is that you can simply invoke it like a normal delegate, as you see here:

Dim doubleIt As Func(Of Integer, Integer) = _
    Function(x As Integer) x * 2
Dim z = doubleIt(20)

If you run this code, you will see that the value stored in z is 40. Essentially, you have created a lambda expression that doubles any integer value that you throw at it.

Let's consider a more complex example—a lambda expression factory:

Dim mult As Func(Of Integer, Func(Of Integer, Integer)) = _
    Function(x As Integer) Function(y As Integer) x * y

The lambda expression mult is fairly complex; it takes as input one integer argument and returns another lambda expression that takes as input one integer argument and returns an integer. The syntax is a little tricky, but using the following line continuations and formatting can help indicate the exact nesting structure:

Dim mult As Func(Of Integer, Func(Of Integer, Integer)) = _
    Function(x As Integer) _
        Function(y As Integer) x * y

That should be a little clearer; the outer lambda expression contains another lambda expression, which the compiler uses as the return value. The inner lambda expression's signature matches the Func(Of Integer, Integer) delegate signature in the return type of the outer lambda expression, so the compiler compiles the statement with no errors.

How would you use such a lambda expression?

Dim mult_10 = mult(10)
Dim r = mult_10(4)

The first line here defines mult_10 to be mult(10). Since mult(10) returns a lambda that takes an argument and multiplies it by 10, the type of mult_10 is Func(Of Integer, Integer). The second line calls mult_10 with the value 4, so the value of r is 40 and the type of r is Integer.

Essentially, mult is a lambda expression factory. It returns lambda expressions that are customized by the first argument. You may have noticed that the inner lambda expression references the parameter of the outer lambda expression, but the lifetime of the inner lambda expression exceeds the lifetime of the outer one. I will discuss this variable lifting later.

Lambda Expressions as Callbacks

Since lambda expressions are simply delegates, you can use them anywhere you use delegates. Consider the following method, which takes a delegate and calls the delegate for each element in a list:

Delegate Function ShouldProcess(Of T)(element As T) As Boolean

Sub ProcessList(Of T)( _
        elements As List(Of T), shouldProcess As ShouldProcess(Of T))
    For Each elem in elements
        If shouldProcess(elem) Then
            ' Do some processing here
        End If
    Next
End Sub

This is a fairly standard application of delegates; the method ProcessList will loop through each of the elements of the list, check whether it should process the item, and then do some standard processing.

To use this in Visual Basic 2005, you would have to define a function in your class or module that has the same signature as the delegate, and then pass the address of the function to the ProcessList procedure, like so (note the _PrivateShouldProcess function and the AddressOf expression):

Class Person
    Public age As Integer
End Class

Function _PrivateShouldProcess(person As Person) As Boolean
    Return person.age > 50
End Function

Sub DoIt()
    Dim list As New List(Of Person)
    ' Obtain list of Person from a database, for example
    ProcessList(list, AddressOf _PrivateShouldProcess)
End Sub

This is troublesome at best; you often have to dig through the code documentation to figure out what signature the delegate imposes, and then you have to match it exactly. Furthermore, if you need to call ProcessList with many different shouldProcess functions, you pollute your code with lots of little private functions.

Let's take a look at how you can call this function with lambda expressions:

Class Person
    Public age As Integer
End Class

Sub DoIt()
    Dim list As New List(Of Person)
    ' Obtain list of Person from a database, for example
    ProcessList(list, Function(person As Person) person.age > 50)
End Sub    

I love the elegance and simplicity of lambda expressions. There's no need to create your own function to perform the processing logic. The delegate is defined at the point where it is used, which is much better than defining it in a private method somewhere else, where it loses locality with the method that uses it.

I'm sure you can see how lambda expressions are powerful, convenient, and can make your code easier to read and maintain. More advanced features such as type inference add even more power.

One limitation to note is that a lambda expression is exactly that—a single expression. In Visual Basic 2008, you can only have a single expression in a lambda expression. Further on in this column, I will show you a new ternary operator introduced in Visual Basic 2008 that will allow you to construct conditional expressions, but the current feature will not support arbitrary statements in a lambda expression.

But before exploring the deeper concepts behind lambda expressions, I'll discuss why lambda expressions were introduced in the first place.

Why Add Lambda Expressions?

To support LINQ queries, a few features needed to be added to Visual Basic; among them were lambda expressions. Assume you have the following query statement in Visual Basic:

Dim q = From p In Process.GetProcesses() _
        Where p.PriorityClass = ProcessPriorityClass.High _
        Select p

A lot of work goes on under the hood to get this query statement to compile. At a high level, the compiled query iterates through the collection returned by Process.GetProcesses, applies the Where filter to it, and returns the processes that match the filter in the Where clause.

Notice that there is a Visual Basic expression inside the Where clause: p.PriorityClass = ProcessPriorityClass.High. To execute this filter, the compiler creates a lambda expression for the Where filter and applies it to each element in the process list:

Dim q = Process.GetProcesses().Where( _
            Function(p) p.PriorityClass = ProcessPriorityClass.High)

Essentially, the lambda expression provides a shorthand for the compiler to emit methods and assign them to delegates; this is all done for you. The benefit you get with a lambda expression that you don't get from a delegate/function combination is that the compiler performs automatic type inference on the lambda arguments. In the example above, the argument p's type is inferred by the usage; in this case, the Where argument defines the lambda expression type, and the type for the argument lambda expression is inferred by the compiler. The type inference features supported by the compiler are a powerful addition to Visual Basic. Let's see what they can do for you.
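To make the correspondence concrete, here is a small sketch that runs the same filter once in query syntax and once as an explicit call to Where with a lambda expression, then compares the results. It uses an integer array instead of the process list so that the output does not depend on the machine; the array contents and the names numbers, fromQuery, and fromLambda are assumptions made for illustration. Option Infer On is stated explicitly because the sketch relies on inferred local types.

```vb
Option Infer On
Imports System
Imports System.Linq

Module QueryTranslationSketch
    Sub Main()
        Dim numbers() As Integer = {1, 5, 8, 12, 20}

        ' Query syntax: the Where clause contains an ordinary expression
        Dim fromQuery = From n In numbers Where n > 6 Select n

        ' Method syntax: the same filter written as a lambda expression;
        ' the type of n is inferred as Integer from the Where signature
        Dim fromLambda = numbers.Where(Function(n) n > 6)

        ' Both forms yield 8, 12, 20
        Console.WriteLine(fromQuery.SequenceEqual(fromLambda))  ' True
    End Sub
End Module
```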

Type Inference

The introduction of powerful type inference mechanisms means that you don't need to worry about figuring out the type of each variable. Furthermore, type inference enables scenarios that are otherwise impossible. Let's look at three different ways types are inferred when using lambda expressions.

Inferring the Lambda Expression Argument Types This scenario is really handy if you have a delegate type you want to assign a lambda to and you don't want to fully specify the arguments:

Dim lambda As Func(Of Integer, Integer) = Function(x) x * x

In the example, the lambda variable is typed as a Func(Of Integer, Integer). This is a delegate that takes one integer argument and returns an integer. As a result, the compiler automatically infers the lambda argument x to be an integer, and the return value of the lambda to be an integer.

You also benefit from type inference of lambda expression arguments when you are calling a method that takes a delegate. Consider a modification of an earlier example:

Delegate Function ShouldProcess(Of T)(element As T) As Boolean

Sub ProcessList(Of T)( _
        elements As List(Of T), shouldProcess As ShouldProcess(Of T))
    ' Method body removed for brevity
End Sub

In this case, the ProcessList procedure takes a delegate as its second argument, so you can pass it a lambda expression.

You can call the procedure like this:

Sub DoIt()
    Dim list As New List(Of Person)
    ' fill or obtain elements in list
    ProcessList(list, Function(a) a.age > 50)
End Sub

Notice that I didn't specify the type of the lambda argument as I did earlier, yet the compiler infers it to be Person. How does something like this happen? Well, there are actually several levels of type inference in this example.

First the compiler sees that ProcessList is a generic procedure that takes as input List(Of T) and ShouldProcess(Of T). In the call to ProcessList, the compiler sees that list is the first argument, and that list is a List(Of Person). Since the second argument offers no hints as to what the type T is, the compiler decides that T is of type Person. Next, it infers that the generic argument for ShouldProcess(Of T) is Person and so it infers the second argument to be ShouldProcess(Of Person). Finally, since the lambda expression did not supply a type for its argument, the compiler knows the argument type based on the delegate signature of ShouldProcess(Of Person), and it infers the type of the parameter (a) to be Person. This is an extremely powerful model for type inference; you do not need to know the type of the delegate arguments when you construct the lambda and, in fact, I recommend that you let the compiler do the hard work for you. Fortunately, however, you do get IntelliSense® and tooltips that help indicate what the inferred type is, so you can be just as productive, if not even more so.

Inferring the Result Type This scenario is really handy if you do not have a delegate type and want the compiler to synthesize one for you. It's a feature available only in Visual Basic:

Dim lambda = Function(x As Integer) x * x

In the example, the lambda expression is fully typed (the lambda argument x is of type integer, and the compiler infers that the return value is an integer, since integer * integer = integer). However, the lambda variable has no type. Therefore, the compiler will synthesize an anonymous delegate that matches the lambda expression shape and assign that delegate type to the lambda.

This is a great feature because it means you can create lambda expressions on the fly without having to statically construct their delegate types. For example, how often have you been in the situation in which you have a condition that you need to apply to a set of variables, and you need to do it in several places as in Figure 1? There have been quite a few times where I've been coding and I ran into a situation like this. Normally I would factor that out so that I could do the condition check in one place rather than scattered throughout the function.

Figure 1 Repeated Conditional Checks

Class Motorcycle
    Public color As String
    Public CC As Integer
    Public weight As Integer
End Class

Sub PrintReport(motorcycle As Motorcycle)
    If motorcycle.color = "Red" And motorcycle.CC = 600 And _
       motorcycle.weight > 300 And motorcycle.weight < 400 Then
       ' do something here
    End If

    ' do something here

    If motorcycle.color = "Red" And motorcycle.CC = 600 And _
       motorcycle.weight > 300 And motorcycle.weight < 400 Then
       ' do something here
    End If
End Sub

But there are times when the check is used only in this function and nowhere else; I don't like polluting my class with random helper functions used only to support this function. Doing so negatively affects maintainability—what if someone else calls this function and I need to make a change? It also may cause name pollution—I find classes with lots of private methods really hard to follow. And it could make IntelliSense less useful since there are more and more entries in the IntelliSense list. In addition, locality of logic is impacted. If I make a separate private method, I'd like it to be physically close to the method that uses it. With many people working in the same code base, it can be hard to maintain this locality over the long run. Using lambda expressions and having the compiler automatically create delegate classes addresses these issues, as in Figure 2.

Figure 2 Lambda Expressions and Delegates

Sub PrintReport(motorcycle As Motorcycle)
    Dim check = Function(m As Motorcycle) m.color = "Red" And _
                                          m.CC = 600 And _
                                          m.weight > 300 And _
                                          m.weight < 400
    If check(motorcycle) Then
        ' do something here
    End If

    ' do something here

    If check(motorcycle) Then
        ' do something here
    End If
End Sub

I have factored out the logic to check some conditions on the Motorcycle class—not into a private method where there are disadvantages, but into a lambda expression where the compiler will automatically create a delegate type—that is hidden—and hook up all the necessary work so that I can call the lambda expression as though it were a method.

I love this approach because it puts the logic close to the implementation (within the method body), it's factored (only one copy), and the compiler takes care of much of the maintenance. This works well because you can build an arbitrarily complex expression as the body for the lambda expression.

Late Binding—Inferring Object In this scenario, neither the lambda variable nor the lambda expression is typed:

Dim lambda = Function(x) x * x

Here, an anonymous delegate is also generated for you by the compiler, but the lambda's types are System.Object. This means that late binding is enabled in this scenario when Option Strict is off.

This is a really nice scenario for those that rely on late binding. Lambda expressions fully support late-bound operations so, in the above example, as long as the * operator is defined on the types you give to the lambda, it will work:

Dim a = lambda(10)
Dim b = lambda(CDec(10))
Dim c = lambda("This will throw an exception because " & _
               "strings don't support the * operator")

As you can see in the examples above, as long as the runtime type has a * operator, everything works out well. Lambda expressions of this nature fit very nicely with the late binding model in Visual Basic.

Code Generation under the Hood

Now that I've explored lambda expressions, let's look at the kind of code the compiler generates. Consider the earlier example:

Sub TestLambda()
    Dim doubleIt As Func(Of Integer, Integer) = _
        Function(x As Integer) x * 2
    Console.WriteLine(doubleIt(10))
End Sub

You know that Func is a delegate and delegates are just pointers to functions, so how does the compiler get this magic to work? In this case, the compiler emits a new function for you and sets up the delegate so that it points to the new function:

Function $GeneratedFunction$(x As Integer) As Integer
    Return x * 2
End Function

Sub TestLambda()
    Dim doubleIt As Func(Of Integer, Integer) = _
        AddressOf $GeneratedFunction$
    Console.WriteLine(doubleIt(10))
End Sub

The compiler essentially takes the lambda expression and creates a new function with its contents, and changes the assignment statement so the lambda expression will take the address of the generated function. In this case the function is generated in the same parent that contains the method that uses the lambda expression. If TestLambda is defined on a class C, then the generated function will be defined on C. Note that the generated function is not callable and is marked private.

Lambda Expressions and Variable Lifting

In the previous examples, the lambda expression bodies referred to variables that were passed into the lambda. However, the power of lambda expressions comes to fruition with variable lifting. As I hinted earlier, the compiler "performs some magic" in certain scenarios. Before exploring these scenarios, though, let's get some basic concepts down from the branch of mathematics called lambda calculus—lambda expressions are based on a similar concept.

The fundamental concept in lambda calculus is that a function can have free variables or bound variables. Free variables are those that are defined in the containing method (its locals and parameters). Bound variables are those that are defined in the lambda signature or are members of the class containing the lambda, including base classes.

It is very important to make the distinction between bound and free variables in your lambda expressions because they affect the semantics of the lambda expression, the code that is generated, and, ultimately, the correctness of your program. Here is an example of a lambda expression that contains bound and free variables:

Dim y As Integer = 10
Dim addTen As Func(Of Integer, Integer) = Function(ByVal x) x + y

Here, x is considered a bound variable inside the lambda since it is a formal parameter to the lambda expression, and y is considered a free variable since it is a variable that belongs to the containing method of the lambda expression.

Remember that after you define a lambda expression, it's treated like a delegate type (you can return the lambda expression from a method, for example). Consider this:

Function MakeLambda() As Func(Of Integer, Integer)
    Dim y As Integer = 10
    Dim addTen As Func(Of Integer, Integer) = Function(ByVal x) x + y
    Return addTen
End Function

Sub UseLambda()
    Dim addTen = MakeLambda()
    Console.WriteLine(addTen(5))
End Sub

This code will print "15" to the console when UseLambda is called. But, you might ask yourself, how does it work? The function MakeLambda defines y as a local variable, and the lambda is using y, but the lambda returns out of the function MakeLambda. The function UseLambda gets the lambda from MakeLambda and executes the lambda, and it seems like somehow the variable y is "remembered" in the lambda.

Is this some sort of behind-the-scenes trick? The lifetime of y is the execution of MakeLambda. By the time we execute the lambda returned from MakeLambda, MakeLambda has already returned and its stack space should have been released. But y was defined on the stack and, somehow, it got "stuck" with the lambda.

This stickiness is the magic commonly referred to as variable lifting. In this case, the variable y is called a lifted variable. And, as you can see, lifted variables are powerful: the compiler does a lot of leg work for you to capture the state of variables and preserve them outside of their normal lifetimes.

More formally, when the compiler encounters a lambda expression that has free variables, it will lift the free variables into a class called a closure. The closure's lifetime exists beyond the lifetime of the free variables hoisted in it. The compiler rewrites the variable access in the method to access the one within the closure instance.

First, let's walk through the MakeLambda example again:

Function MakeLambda() As Func(Of Integer, Integer)
    Dim y As Integer = 10
    Dim addTen As Func(Of Integer, Integer) = Function(ByVal x) x + y
    Return addTen
End Function

As we analyzed before, x is bound to the parameter of the lambda but y is a free variable. The compiler detects this and proceeds to create a closure class that captures the free variables, as well as the definition of the lambda expression:

Public Class _Closure$__1
    Public y As Integer
    Public Function _Lambda$__1(ByVal x As Integer) As Integer
        Return x + Me.y
    End Function
End Class

You can see that the closure class captures the variable y and stores it as a field. The free variable is thereby converted to a bound variable on the closure class.

The compiler also rewrites the method that contains the lambda expression to look like this:

Function MakeLambda() As Func(Of Integer, Integer)
    Dim Closure As New _Closure$__1
    Closure.y = 10
    Return AddressOf Closure._Lambda$__1
End Function

Now you can see how the compiler creates the closure variable, rewrites the local variable y that is lifted into the closure variable, initializes the variable, and simply returns the address of the lambda expression that is written into the closure class.

It's important to note that the compiler only lifts free variables in the lambda expression. The state of the variable is captured in the closure, which exists as long as the lambda expression exists.
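One consequence of this rewriting is that every call to the containing method creates its own closure instance. The sketch below illustrates that with a hypothetical MakeAdder function (not from the original listings) whose parameter, amount, is a free variable of the returned lambda: two calls produce two lambdas that each remember their own value.

```vb
Option Infer On
Imports System

Module ClosureSketch
    ' amount is a free variable of the lambda, so it is lifted into a closure
    Function MakeAdder(ByVal amount As Integer) As Func(Of Integer, Integer)
        Return Function(x As Integer) x + amount
    End Function

    Sub Main()
        Dim addTen = MakeAdder(10)       ' closure instance holding amount = 10
        Dim addHundred = MakeAdder(100)  ' separate instance holding amount = 100

        Console.WriteLine(addTen(5))      ' 15
        Console.WriteLine(addHundred(5))  ' 105
    End Sub
End Module
```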

Let's look at another example:

Sub Test()
    Dim y As Integer = 10
    Dim Lambda As Func(Of Integer, Integer) = Function(ByVal x) x + y
    y = 20
    Console.WriteLine(Lambda(5))
End Sub

What value is displayed when you execute this function? If you said 25, you were spot on. Why 25? Well, the compiler rewrites every use of the free variable y to refer to the closure's copy, like so:

Sub Test()
    Dim Closure As New $CLOSURE_Compiler_Generated_Name$
    Closure.y = 10
    Dim Lambda As Func(Of Integer, Integer) = AddressOf Closure.Lambda_1
    Closure.y = 20
    Console.WriteLine(Lambda(5))
End Sub

As you can see, by the time the lambda expression gets executed, the value of y has been changed to 20 and thus, when the lambda expression gets executed, it returns 5 + 20.

This is really important when it comes to loops. Because the free variables are captured into a single closure, you may see unexpected behavior if, for example, you spawn a thread that uses the lambda expression with a captured variable that changes:

Sub Test()
    For I = 1 To 5
        StartThread(Function() I + 10)
    Next
End Sub

In this example, suppose StartThread creates a new thread and prints the result of the lambda expression to the console. Since I is captured into the closure, it's possible that by the time the thread executes the lambda expression, the for loop has modified the value of I. In that case, the program may not print 11, 12, 13, 14, and 15 as expected. Instead, you have to scope the captured variable inside the for loop:

Sub Test()
    For I = 1 To 5
        Dim x = I
        StartThread(Function() x + 10)
    Next
End Sub

This code will now capture the value of x in the closure and the program will print 11, 12, 13, 14, and 15 as expected. It is extremely important to know which variables are lifted, when the lambda expressions will be executed, and when the lifted variables may change, so that you can be sure your program executes correctly in all circumstances.
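The same effect can be observed deterministically, without threads, by collecting the lambdas in a list and invoking them only after the loop has finished. This is a sketch under that assumption; the list names are hypothetical, and the compiler may warn about using the iteration variable in a lambda expression, which is exactly the pitfall being shown.

```vb
Option Infer On
Imports System
Imports System.Collections.Generic

Module LoopCaptureSketch
    Sub Main()
        Dim usingLoopVariable As New List(Of Func(Of Integer))
        Dim usingLocalCopy As New List(Of Func(Of Integer))

        For i As Integer = 1 To 3
            ' Every lambda added here refers to the one lifted loop variable i
            usingLoopVariable.Add(Function() i + 10)

            ' x is declared inside the loop body, so each iteration gets its own copy
            Dim x = i
            usingLocalCopy.Add(Function() x + 10)
        Next

        ' The loop has finished and i is now 4, so this prints 14 three times
        For Each f As Func(Of Integer) In usingLoopVariable
            Console.WriteLine(f())
        Next

        ' Each lambda kept its own x, so this prints 11, 12, 13
        For Each f As Func(Of Integer) In usingLocalCopy
            Console.WriteLine(f())
        Next
    End Sub
End Module
```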

Make the Most of Lambda Expressions

In Visual Basic 2008, you can only supply one expression as the body of the lambda, but a new ternary If operator is also introduced that allows you to write simple, fully typed conditional expressions:

Dim x = If(condition, 10, 20)

The If keyword is similar to the IIf function call, except that the If keyword is fully type safe. This means that in the above example, the compiler deduces that both branches of the If keyword return an integer, and so it applies type inference rules and decides that the type of x is Integer. Using IIf, x will be of type Object.
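The difference in static typing shows up most clearly under Option Strict On. In the following sketch (variable names are illustrative assumptions), the result of If can be used directly in integer arithmetic, while the Object returned by IIf needs an explicit conversion before it can be stored in an Integer.

```vb
Option Strict On
Option Infer On
Imports System
Imports Microsoft.VisualBasic

Module IfOperatorSketch
    Sub Main()
        Dim condition As Boolean = True

        ' Both branches are Integer, so viaIf is inferred as Integer
        Dim viaIf = If(condition, 10, 20)
        Dim doubled As Integer = viaIf * 2

        ' IIf returns Object; an explicit conversion is required here
        Dim viaIIf As Integer = CInt(IIf(condition, 10, 20))

        Console.WriteLine(doubled)  ' 20
        Console.WriteLine(viaIIf)   ' 10
    End Sub
End Module
```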

You can use the If keyword in a lambda expression:

Dim x = Function(c As Customer) _
    If(c.Age >= 18, c.Address, c.Parent.Address)

In this example, assume there is a Customer class whose definition includes an Address property that represents, as you'd expect, the current address of the customer. The lambda expression makes use of the ternary expression to apply a condition on the input argument; if the customer is 18 years or older, it returns his address. Otherwise, it returns the parent's address.

Now type inference kicks in, and the compiler determines the return type of the lambda expression to be Address. It then creates a delegate type for x (as discussed previously), where the delegate type takes as input a Customer and returns an Address.

If you're interested in learning more about lambda expressions, you can subscribe to my blog (blogs.msdn.com/timng) where I will be discussing lambda expressions (and other Visual Basic 2008 language features) in more detail.

Send your questions and comments to instinct@microsoft.com.

Timothy Ng is a Software Development Engineer on the Visual Basic compiler team at Microsoft. For the upcoming release of Visual Studio 2008 he has worked on several features, including type inference, friend assemblies, expression trees, and anonymous types. You can contact Tim at timng@microsoft.com.