Performance Considerations for Entity Framework Applications

This topic describes performance characteristics of the ADO.NET Entity Framework and provides some considerations to help improve the performance of Entity Framework applications.

Stages of Query Execution

In order to better understand the performance of queries in the Entity Framework, it is helpful to understand the operations that occur when a query executes against an Entity Data Model (EDM) and returns data as objects. The following table describes this series of operations.

Operation Relative Cost Frequency Comments

Loading metadata

Moderate

Once in each application domain.

EDM model and mapping metadata is loaded into a MetadataWorkspace. This metadata is cached globally and is available to other instances of ObjectContext in the same application domain. For more information, see Metadata Workspace Overview.

Opening the database connection

Moderate1

As needed.

Because an open connection to the database consumes a valuable resource, Object Services opens and closes the database connection only as needed. You can also explicitly open the connection. For more information, see Managing Connections in Object Services (Entity Framework).

Generating views

High

Once in each application domain. (Can be pre-generated.)

Before the Entity Framework can execute a query against an EDM or save changes to the data source, it must generate a set of local query views to access the database. Because of the high cost of generating these views, you can pre-generate the views and add them to the project at design-time. For more information, see How to: Pre-Generate Views to Improve Query Performance (Entity Framework).

Preparing the query

Moderate2

Once for each unique query.

Includes the costs to compose the query command, generate a command tree based on the EDM metadata, and define the shape of the returned data. Because Entity SQL query commands are cached, later executions of the same query take even less time. You can also use compiled LINQ queries to reduce this cost in later executions. For more information, see Compiled Queries (LINQ to Entities). For general information about LINQ query execution, see LINQ to Entities Flow of Execution.

Executing the query

Low2

Once for each query.

The cost of executing the command against the data source by using the ADO.NET data provider. Because most data sources cache query plans, later executions of the same query may take even less time.

Loading and validating types

Low3

Once for each ObjectContext instance.

Types are loaded and validated against the types that the EDM defines.

Tracking

Low3

Once for each object that a query returns. 4

An EntityKey is generated for each tracked object that the query returns. This key is used to create an ObjectStateEntry in the ObjectStateManager. The MergeOption that is used to execute the query defines the tracking behavior. If the query uses the NoTracking merge option or if the object already exists in the ObjectContext and the query uses the AppendOnly or PreserveChanges merge options, this stage does not affect performance. For more information, see Change Tracking and Identity Resolution (Entity Framework).

Materializing the objects

Moderate3

Once for each object that a query returns. 4

The process of reading the returned DbDataReader object and creating objects and setting property values that are based on the values in each instance of the DbDataRecord class. If the object already exists in the ObjectContext and the query uses the AppendOnly or PreserveChanges merge options, this stage does not affect performance. For more information, see Change Tracking and Identity Resolution (Entity Framework).

1 When a data source provider implements connection pooling, the cost of opening a connection is distributed across the pool. The .NET Provider for SQL Server supports connection pooling.

2 Cost increases with increased query complexity.

3 Total cost increases proportional to the number of objects returned by the query.

4 This overhead is not required for EntityClient queries because EntityClient queries return an EntityDataReader instead of objects. For more information, see EntityClient Provider for the Entity Framework.

Additional Considerations

The following are other considerations that may affect performance of Entity Framework applications.

Query Execution

Because queries can be resource intensive, consider at what point in your code and on what computer a query is executed.

Deferred versus immediate execution.

When you create an ObjectQuery or LINQ query, the query may not be executed immediately. Query execution is deferred until the results are needed, such as during a foreach (C#) or For Each (Visual Basic) enumeration or when it is assigned to fill a List collection. Query execution begins immediately when you call the Execute method on an ObjectQuery or when you call a LINQ method that returns a singleton query, such as First or Any. For more information, see Object Queries (Entity Framework) and Query Execution (LINQ to Entities).

Client-side execution of LINQ queries.

Although the execution of a LINQ query occurs on the computer that hosts the data source, some parts of a LINQ query may be evaluated on the client computer. For more information, see the Store Execution section of Query Execution (LINQ to Entities).

Query and Mapping Complexity

The complexity of individual queries and of the mapping in the entity model will have a significant effect on query performance.

Mapping complexity

Models that are more complex than a simple one-to-one mapping between entities in the conceptual model and tables in the storage model generate more complex commands than models that have a one-to-one mapping.

Query complexity

Queries that require a large number of joins in the commands that are executed against the data source or that return a large amount of data may affect performance in the following ways:

  • Queries against an EDM that seem simple may result in the execution of more complex queries against the data source. This can occur because the Entity Framework translates a query against an EDM into an equivalent query against the data source. When a single entity set in the conceptual model maps to more than one table in the data source, or when a relationship between entities is mapped to a join table, the query command executed against the data source query may require one or more joins.

    Note

    Use the ToTraceString method of the ObjectQuery or EntityCommand classes to view the commands that are executed against the data source for a given query. For more information, see How to: View the Store Commands (Entity Framework).

  • Nested Entity SQL queries may create joins on the server and can return a large number of rows.

    The following is an example of a nested query in a projection clause:

    SELECT c, (SELECT c, (SELECT c FROM AdventureWorksModel.Vendor AS c  ) As Inner2 
        FROM AdventureWorksModel.JobCandidate AS c  ) As Inner1 
        FROM AdventureWorksModel.EmployeeDepartmentHistory AS c
    

    In addition, such queries cause the query pipeline to generate a single query with duplication of objects across nested queries. Because of this, a single column may be duplicated multiple times. On some databases, including SQL Server, this can cause the TempDB table to grow very large, which can decrease server performance. Care should be taken when you execute nested queries.

  • Any queries that return a large amount of data can cause decreased performance if the client is performing operations that consume resources in a way that is proportional to the size of the result set. In such cases, you should consider limiting the amount of data returned by the query. For more information, see How to: Page Through Query Results (Entity Framework).

Any commands automatically generated by the Entity Framework may be more complex than similar commands written explicitly by a database developer. If you need explicit control over the commands executed against your data source, consider defining a mapping to a table-valued function or stored procedure. For more information, see Stored Procedure Support (Entity Framework).

Relationships

For optimal query performance, you must define relationships between entities both as associations in the entity model and as logical relationships in the data source.

Query Paths

By default, when you execute an ObjectQuery, related objects are not returned (although objects that represent the relationships themselves are). You can load related objects in one of two ways: either by setting the query path before the ObjectQuery is executed, or by calling the Load method on the navigation property that the object exposes. When you consider which option to use, be aware that there is a tradeoff between the number of requests against the database and the amount of data returned in a single query. For more information, see Shaping Query Results (Entity Framework).

Using query paths

Query paths define the graph of objects that a query returns. When you define a query path, only a single request against the database is required to return all objects that the path defines. Using query paths can result in complex commands being executed against the data source from seemingly simple object queries. This occurs because one or more joins are required to return related objects in a single query. This complexity is greater in queries against a complex entity model, such as an entity with inheritance or a path that includes many-to-many relationships.

Note

Use the ToTraceString method to see the command that will be generated by an ObjectQuery. For more information, see How to: View the Store Commands (Entity Framework).

When a query path includes too many related objects or the objects contain too much row data, the data source might be unable to complete the query. This occurs if the query requires intermediate temporary storage that exceeds the capabilities of the data source. When this occurs, you can reduce the complexity of the data source query by explicitly loading related objects.

You can explicitly load related objects by calling the Load method on a navigation property that returns an EntityCollection or EntityReference. Explicitly loading objects requires a round-trip to the database every time Load is called.

Note

if you call Load while looping through a collection of returned objects, such as when you use the foreach statement (For Each in Visual Basic), the data source-specific provider must support multiple active results sets on a single connection. For a SQL Server database, you must specify a value of MultipleActiveResultSets = true in the provider connection string.

Although explicitly loading related objects will reduce the number of joins and reduced the amount of redundant data, Load requires repeated connections to the database, which can become costly when explicitly loading a large number of objects.

Saving Changes

When you call the SaveChanges method on an ObjectContext, a separate create, update, or delete command is generated for every added, updated, or deleted object in the context. These commands are executed on the data source in a single transaction. As with queries, the performance of create, update, and delete operations depends on the complexity of the mapping in the EDM.

Distributed Transactions

Operations in an explicit transaction that require resources that are managed by the distributed transaction coordinator (DTC) will be much more expensive than a similar operation that does not require the DTC. Promotion to the DTC will occur in the following situations:

  • An explicit transaction with an operation against a SQL Server 2000 database or other data source that always promote explicit transactions to the DTC.

  • An explicit transaction with an operation against SQL Server 2005 when the connection is managed by Object Services. This occurs because SQL Server 2005 promotes to a DTC whenever a connection is closed and reopened within a single transaction, which is the default behavior of Object Services. This DTC promotion does not occur when using SQL Server 2008. To avoid this promotion when using SQL Server 2005, you must explicitly open and close the connection within the transaction. For more information, see Managing Connections in Object Services (Entity Framework).

An explicit transaction is used when one or more operations are executed inside a System.Transactions transaction. For more information, see Managing Transactions in Object Services (Entity Framework).

Strategies for Improving Performance

You can improve the overall performance of queries in the Entity Framework by using the following strategies.

Pre-generate views.

Generating views based on an entity model is a significant cost the first time that an application executes a query. Use the EdmGen.exe utility to pre-generate views as a Visual Basic or C# code file that can be added to the project during design. Pre-generated views are validated at runtime to ensure that they are consistent with the current version of the specified entity model. For more information, see How to: Pre-Generate Views to Improve Query Performance (Entity Framework).

Use compiled LINQ queries.

When you have an application that executes structurally similar queries many times in the Entity Framework, you can frequently increase performance by compiling the query one time and executing it several times with different parameters. For example, an application might have to retrieve all the customers in a particular city; the city is specified at runtime by the user in a form. LINQ to Entities supports using compiled queries for this purpose.

For more information, see Compiled Queries (LINQ to Entities).

Return the correct amount of data.

In some scenarios, specifying a query path using the Include method is much faster because it requires fewer round trips to the database. However, in other scenarios, additional round trips to the database to load related objects may be faster because the simpler queries with fewer joins result in less redundancy of data. Because of this, we recommend that you test the performance of various ways to retrieve related objects. For more information, see Shaping Query Results (Entity Framework).

To avoid returning too much data in a single query, consider paging the results of the query into more manageable groups. For more information, see How to: Page Through Query Results (Entity Framework).

Limit the scope of the ObjectContext.

In most cases, you should create an ObjectContext instance within a using statement (Using…End Using in Visual Basic). This can increase performance by ensuring that the resources associated with the object context are disposed automatically when the code exits the statement block. However, when controls are bound to objects managed by the object context, the ObjectContext instance should be maintained as long as the binding is needed and disposed of manually. For more information, see Managing Resources in Object Services (Entity Framework).

Consider opening the database connection manually.

When your application executes a series of object queries or frequently calls SaveChanges to persist create, update, and delete operations to the data source, Object Services must continuously open and close the connection to the data source. In these situations, consider manually opening the connection at the start of these operations and either closing or disposing of the connection when the operations are complete. For more information, see Managing Connections in Object Services (Entity Framework).

Consider using the NoTracking merge option for queries.

There is a cost required to track returned objects in the object context. Detecting changes to objects and ensuring that multiple requests for the same logical entity return the same object instance requires that objects be attached to an ObjectContext instance. If you do not plan to make updates or deletes to objects and do not require identity management , consider using the NoTracking merge options when you execute queries.

Implement IEntityWithKey for custom data classes.

Object Services enables you to define your own custom data classes for use with an EDM. Although it is not a requirement for using custom data classes, implementing the IEntityWithKey interface can help improve query performance because Object Services uses the EntityKey property that the IEntityWithKey interface defines to more efficiently retrieve an object from the object state manager. For more information, see Implementing Custom Data Class Interfaces (Entity Framework).

Performance Data

Some performance data for the Entity Framework is published in the following posts on the ADO.NET team blog:

See Also

Concepts

Security Considerations (Entity Framework)
Migration Considerations (Entity Framework)
Deployment Considerations (Entity Framework)

Other Resources

Programming Guide (Entity Framework)