我正在寻找插入实体框架的最快方法。

我之所以问这个问题,是因为您有一个活动的TransactionScope,并且插入量很大(4000+)。它可能会持续超过10分钟(事务的默认超时),这将导致事务不完整。


当前回答

SqlBulkCopy速度极快

这是我的实现:

// at some point in my calling code, I will call:
var myDataTable = CreateMyDataTable();
myDataTable.Rows.Add(Guid.NewGuid,tableHeaderId,theName,theValue); // e.g. - need this call for each row to insert

var efConnectionString = ConfigurationManager.ConnectionStrings["MyWebConfigEfConnection"].ConnectionString;
var efConnectionStringBuilder = new EntityConnectionStringBuilder(efConnectionString);
var connectionString = efConnectionStringBuilder.ProviderConnectionString;
BulkInsert(connectionString, myDataTable);

private DataTable CreateMyDataTable()
{
    var myDataTable = new DataTable { TableName = "MyTable"};
// this table has an identity column - don't need to specify that
    myDataTable.Columns.Add("MyTableRecordGuid", typeof(Guid));
    myDataTable.Columns.Add("MyTableHeaderId", typeof(int));
    myDataTable.Columns.Add("ColumnName", typeof(string));
    myDataTable.Columns.Add("ColumnValue", typeof(string));
    return myDataTable;
}

private void BulkInsert(string connectionString, DataTable dataTable)
{
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        SqlTransaction transaction = null;
        try
        {
            transaction = connection.BeginTransaction();

            using (var sqlBulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, transaction))
            {
                sqlBulkCopy.DestinationTableName = dataTable.TableName;
                foreach (DataColumn column in dataTable.Columns) {
                    sqlBulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                }

                sqlBulkCopy.WriteToServer(dataTable);
            }
            transaction.Commit();
        }
        catch (Exception)
        {
            transaction?.Rollback();
            throw;
        }
    }
}

其他回答

这种组合可以很好地提高速度。

context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;

您是否尝试过通过后台工作人员或任务插入?

在我的例子中,我插入了7760个寄存器,分布在182个具有外键关系的不同表中(通过NavigationProperties)。

没有这项任务,花了2分半钟。在一个Task(Task.Factory.StartNew(…))中,花费了15秒。

我只在将所有实体添加到上下文之后才执行SaveChanges()。(确保数据完整性)

TL;博士我知道这是一个老帖子,但我已经实施了一个解决方案,从其中一个提议开始,扩展它并解决其中的一些问题;此外,我还阅读了所提出的其他解决方案,与这些方案相比,我似乎提出了一种更适合原始问题中提出的要求的解决方案。

在这个解决方案中,我扩展了Slauma的方法,我认为它非常适合原始问题中提出的情况,即使用实体框架和事务范围对数据库进行昂贵的写入操作。

在Slauma的解决方案中,这只是一个草稿,只是用来了解​​EF的速度与实施批量插入的策略-存在以下问题:

交易超时(默认情况下,1分钟可通过代码延长至最多10分钟);复制宽度等于事务结束时使用的提交大小的第一个数据块(这个问题很奇怪,可以通过变通方法解决)。

我还报告了一个例子,其中包括几个从属实体的上下文插入,从而扩展了Slauma提出的案例研究。

我能够验证的性能是10K记录/分钟,在数据库中插入200K宽的记录块,每个记录块大约1KB。速度是恒定的,性能没有下降,测试需要大约20分钟才能成功运行。

详细的解决方案

主持在示例存储库类中插入的批量插入操作的方法:

abstract class SomeRepository { 

    protected MyDbContext myDbContextRef;

    public void ImportData<TChild, TFather>(List<TChild> entities, TFather entityFather)
            where TChild : class, IEntityChild
            where TFather : class, IEntityFather
    {

        using (var scope = MyDbContext.CreateTransactionScope())
        {

            MyDbContext context = null;
            try
            {
                context = new MyDbContext(myDbContextRef.ConnectionString);

                context.Configuration.AutoDetectChangesEnabled = false;

                entityFather.BulkInsertResult = false;
                var fileEntity = context.Set<TFather>().Add(entityFather);
                context.SaveChanges();

                int count = 0;

                //avoids an issue with recreating context: EF duplicates the first commit block of data at the end of transaction!!
                context = MyDbContext.AddToContext<TChild>(context, null, 0, 1, true);

                foreach (var entityToInsert in entities)
                {
                    ++count;
                    entityToInsert.EntityFatherRefId = fileEntity.Id;
                    context = MyDbContext.AddToContext<TChild>(context, entityToInsert, count, 100, true);
                }

                entityFather.BulkInsertResult = true;
                context.Set<TFather>().Add(fileEntity);
                context.Entry<TFather>(fileEntity).State = EntityState.Modified;

                context.SaveChanges();
            }
            finally
            {
                if (context != null)
                    context.Dispose();
            }

            scope.Complete();
        }

    }

}

仅用于示例目的的接口:

public interface IEntityChild {

    //some properties ...

    int EntityFatherRefId { get; set; }

}

public interface IEntityFather {

    int Id { get; set; }
    bool BulkInsertResult { get; set; }
}

db上下文中,我将解决方案的各个元素实现为静态方法:

public class MyDbContext : DbContext
{

    public string ConnectionString { get; set; }


    public MyDbContext(string nameOrConnectionString)
    : base(nameOrConnectionString)
    {
        Database.SetInitializer<MyDbContext>(null);
        ConnectionString = Database.Connection.ConnectionString;
    }


    /// <summary>
    /// Creates a TransactionScope raising timeout transaction to 30 minutes
    /// </summary>
    /// <param name="_isolationLevel"></param>
    /// <param name="timeout"></param>
    /// <remarks>
    /// It is possible to set isolation-level and timeout to different values. Pay close attention managing these 2 transactions working parameters.
    /// <para>Default TransactionScope values for isolation-level and timeout are the following:</para>
    /// <para>Default isolation-level is "Serializable"</para>
    /// <para>Default timeout ranges between 1 minute (default value if not specified a timeout) to max 10 minute (if not changed by code or updating max-timeout machine.config value)</para>
    /// </remarks>
    public static TransactionScope CreateTransactionScope(IsolationLevel _isolationLevel = IsolationLevel.Serializable, TimeSpan? timeout = null)
    {
        SetTransactionManagerField("_cachedMaxTimeout", true);
        SetTransactionManagerField("_maximumTimeout", timeout ?? TimeSpan.FromMinutes(30));

        var transactionOptions = new TransactionOptions();
        transactionOptions.IsolationLevel = _isolationLevel;
        transactionOptions.Timeout = TransactionManager.MaximumTimeout;
        return new TransactionScope(TransactionScopeOption.Required, transactionOptions);
    }

    private static void SetTransactionManagerField(string fieldName, object value)
    {
        typeof(TransactionManager).GetField(fieldName, BindingFlags.NonPublic | BindingFlags.Static).SetValue(null, value);
    }


    /// <summary>
    /// Adds a generic entity to a given context allowing commit on large block of data and improving performance to support db bulk-insert operations based on Entity Framework
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="context"></param>
    /// <param name="entity"></param>
    /// <param name="count"></param>
    /// <param name="commitCount">defines the block of data size</param>
    /// <param name="recreateContext"></param>
    /// <returns></returns>
    public static MyDbContext AddToContext<T>(MyDbContext context, T entity, int count, int commitCount, bool recreateContext) where T : class
    {
        if (entity != null)
            context.Set<T>().Add(entity);

        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                var contextConnectionString = context.ConnectionString;
                context.Dispose();
                context = new MyDbContext(contextConnectionString);
                context.Configuration.AutoDetectChangesEnabled = false;
            }
        }

        return context;
    }
}

您应该考虑为此使用System.Data.SqlClient.SqlBulkCopy。这是文档,当然还有很多在线教程。

抱歉,我知道您正在寻找一个简单的答案来让EF做您想做的事情,但批量操作并不是ORM真正的用途。

尝试使用存储过程来获取要插入的数据的XML。