MyISAM vs InnoDB

我正在从事一个涉及大量数据库写入的项目(70%的插入和30%的读取)。这个比率还包括我认为是一个读一个写的更新。读取可能是脏的(例如，在读取时我不需要100%准确的信息)。该任务每小时将处理超过100万个数据库事务。

我在网上读了一堆关于MyISAM和InnoDB之间区别的东西，对于我将用于这个任务的特定数据库/表来说，MyISAM似乎是显而易见的选择。从我看来，InnoDB在需要事务时是很好的，因为它支持行级锁。

有人有这种负载(或更高)的经验吗?MyISAM是正确的选择吗?

当前回答

对于这样的读写比率，我猜InnoDB会表现得更好。既然您可以接受脏读，那么您可以(如果您负担得起)复制到一个从服务器，并让您的所有读都到从服务器。另外，考虑批量插入，而不是一次插入一条记录。

2010-07-05 15:51:58

其他回答

我曾经在一个使用MySQL的大容量系统上工作过，我也尝试过MyISAM和InnoDB。

我发现MyISAM中的表级锁定对我们的工作负载造成了严重的性能问题，这听起来与您的工作负载类似。不幸的是，我还发现在InnoDB下的性能也比我希望的要差。

最后，我通过分割数据解决了争用问题，这样插入就进入了一个“热”表，而选择从不查询热表。

这也允许删除(数据是时间敏感的，我们只保留X天的价值)发生在“陈旧”的表上，这些表同样不会被选择查询触及。InnoDB在批量删除方面的性能似乎很差，所以如果你打算清除数据，你可能想要以这样一种方式来构造它，即旧数据在一个陈旧的表中，可以简单地删除而不是对其进行删除。

当然，我不知道你的应用程序是什么，但希望这能让你对MyISAM和InnoDB的一些问题有一些了解。

2008-09-16 21:57:00

如果使用MyISAM，则每小时不会执行任何事务，除非将每个DML语句视为一个事务(在任何情况下，在崩溃时都不是持久的或原子的)。

因此我认为你必须使用InnoDB。

每秒300个交易听起来很多。如果您绝对需要这些事务在电源故障时保持持久，请确保您的I/O子系统能够轻松地处理每秒这么多的写操作。您至少需要一个带有电池缓存的RAID控制器。

如果你可以降低一点持久性，你可以使用InnoDB，将innodb_flush_log_at_trx_commit设置为0或2(参见文档)，你可以提高性能。

有许多补丁可以从谷歌和其他补丁中提高并发性——如果没有它们仍然不能获得足够的性能，这些补丁可能会引起您的兴趣。

2008-09-16 21:34:54

为了增加广泛的选择，这里涵盖了两个发动机之间的机械差异，我提出了一个经验速度比较研究。

就纯粹的速度而言，MyISAM并不总是比InnoDB快，但根据我的经验，在pure READ工作环境中，MyISAM往往快2.0-2.5倍。显然，这并不适用于所有环境——正如其他人所写的那样，MyISAM缺少事务和外键之类的东西。

我在下面做了一些基准测试——我使用python进行循环，使用timeit库进行时间比较。出于兴趣，我还包括了内存引擎，这提供了最好的性能，尽管它只适用于较小的表(当您超过MySQL内存限制时，您会不断遇到表'tbl'已满)。我研究的四种选择类型是:

香草选择计数有条件的选择索引和非索引子选择

首先，我使用以下SQL创建了三个表

CREATE TABLE
    data_interrogation.test_table_myisam
    (
        index_col BIGINT NOT NULL AUTO_INCREMENT,
        value1 DOUBLE,
        value2 DOUBLE,
        value3 DOUBLE,
        value4 DOUBLE,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8

在第二和第三个表中用“MyISAM”替换“InnoDB”和“memory”。

1)香草选择

查询:SELECT * FROM tbl WHERE index_col = xx

结果:画

它们的速度基本上是相同的，并且正如预期的那样，与要选择的列数成线性关系。InnoDB似乎比MyISAM快一点，但这真的是微不足道的。

代码:

import timeit
import MySQLdb
import MySQLdb.cursors
import random
from random import randint

db = MySQLdb.connect(host="...", user="...", passwd="...", db="...", cursorclass=MySQLdb.cursors.DictCursor)
cur = db.cursor()

lengthOfTable = 100000

# Fill up the tables with random data
for x in xrange(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

    cur.execute(insertString)
    cur.execute(insertString2)
    cur.execute(insertString3)

db.commit()

# Define a function to pull a certain number of records from these tables
def selectRandomRecords(testTable,numberOfRecords):

    for x in xrange(numberOfRecords):
        rand1 = randint(0,lengthOfTable)

        selectString = "SELECT * FROM " + testTable + " WHERE index_col = " + str(rand1)
        cur.execute(selectString)

setupString = "from __main__ import selectRandomRecords"

# Test time taken using timeit
myisam_times = []
innodb_times = []
memory_times = []

for theLength in [3,10,30,100,300,1000,3000,10000]:

    innodb_times.append( timeit.timeit('selectRandomRecords("test_table_innodb",' + str(theLength) + ')', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('selectRandomRecords("test_table_myisam",' + str(theLength) + ')', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('selectRandomRecords("test_table_memory",' + str(theLength) + ')', number=100, setup=setupString) )

2)计算

查询:SELECT count(*) FROM tbl

结果:MyISAM获胜

这个说明了MyISAM和InnoDB之间的一个很大的不同——MyISAM(和内存)跟踪表中的记录数量，所以这个事务是快速的，O(1)。在我调查的范围内，InnoDB计数所需的时间随着表的大小超线性增加。我怀疑在实践中观察到的许多MyISAM查询的加速都是由于类似的效果。

代码:

myisam_times = []
innodb_times = []
memory_times = []

# Define a function to count the records
def countRecords(testTable):

    selectString = "SELECT count(*) FROM " + testTable
    cur.execute(selectString)

setupString = "from __main__ import countRecords"

# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE test_table_innodb"
    truncateString2 = "TRUNCATE test_table_myisam"
    truncateString3 = "TRUNCATE test_table_memory"

    cur.execute(truncateString)
    cur.execute(truncateString2)
    cur.execute(truncateString3)

    for x in xrange(theLength):
        rand1 = random.random()
        rand2 = random.random()
        rand3 = random.random()
        rand4 = random.random()

        insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)
        cur.execute(insertString3)

    db.commit()

    # Count and time the query
    innodb_times.append( timeit.timeit('countRecords("test_table_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('countRecords("test_table_myisam")', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('countRecords("test_table_memory")', number=100, setup=setupString) )

3)有条件选择

查询:SELECT * FROM tbl WHERE value1<0.5 AND value2<0.5 AND value3<0.5 AND value4<0.5

结果:MyISAM获胜

在这里，MyISAM和内存的性能大致相同，对于更大的表，它比InnoDB高出50%左右。在这类查询中，MyISAM的好处似乎得到了最大化。

代码:

myisam_times = []
innodb_times = []
memory_times = []

# Define a function to perform conditional selects
def conditionalSelect(testTable):
    selectString = "SELECT * FROM " + testTable + " WHERE value1 < 0.5 AND value2 < 0.5 AND value3 < 0.5 AND value4 < 0.5"
    cur.execute(selectString)

setupString = "from __main__ import conditionalSelect"

# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE test_table_innodb"
    truncateString2 = "TRUNCATE test_table_myisam"
    truncateString3 = "TRUNCATE test_table_memory"

    cur.execute(truncateString)
    cur.execute(truncateString2)
    cur.execute(truncateString3)

    for x in xrange(theLength):
        rand1 = random.random()
        rand2 = random.random()
        rand3 = random.random()
        rand4 = random.random()

        insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)
        cur.execute(insertString3)

    db.commit()

    # Count and time the query
    innodb_times.append( timeit.timeit('conditionalSelect("test_table_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('conditionalSelect("test_table_myisam")', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('conditionalSelect("test_table_memory")', number=100, setup=setupString) )

4)子

结果:InnoDB胜出

对于这个查询，我为子选择创建了一组额外的表。每个都是简单的两列bigint，一列有主键索引，另一列没有任何索引。由于表的大小很大，我没有测试内存引擎。SQL表创建命令为

CREATE TABLE
    subselect_myisam
    (
        index_col bigint NOT NULL,
        non_index_col bigint,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8;

在第二个表中，'MyISAM'再次替换'InnoDB'。

在这个查询中，我将选择表的大小保留为1000000，而是改变子选择列的大小。

在这一点上，InnoDB很容易获胜。在我们得到一个合理的大小表后，两个引擎都线性缩放子选择的大小。索引加快了MyISAM命令的速度，但有趣的是，它对InnoDB的速度几乎没有影响。 subSelect.png

代码:

myisam_times = []
innodb_times = []
myisam_times_2 = []
innodb_times_2 = []

def subSelectRecordsIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)

setupString = "from __main__ import subSelectRecordsIndexed"

def subSelectRecordsNotIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT non_index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)

setupString2 = "from __main__ import subSelectRecordsNotIndexed"

# Truncate the old tables, and re-fill with 1000000 records
truncateString = "TRUNCATE test_table_innodb"
truncateString2 = "TRUNCATE test_table_myisam"

cur.execute(truncateString)
cur.execute(truncateString2)

lengthOfTable = 1000000

# Fill up the tables with random data
for x in xrange(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

    cur.execute(insertString)
    cur.execute(insertString2)

for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE subselect_innodb"
    truncateString2 = "TRUNCATE subselect_myisam"

    cur.execute(truncateString)
    cur.execute(truncateString2)

    # For each length, empty the table and re-fill it with random data
    rand_sample = sorted(random.sample(xrange(lengthOfTable), theLength))
    rand_sample_2 = random.sample(xrange(lengthOfTable), theLength)

    for (the_value_1,the_value_2) in zip(rand_sample,rand_sample_2):
        insertString = "INSERT INTO subselect_innodb (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"
        insertString2 = "INSERT INTO subselect_myisam (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)

    db.commit()

    # Finally, time the queries
    innodb_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString) )
        
    innodb_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString2) )
    myisam_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString2) )

我认为所有这些的关键信息是，如果你真的关心速度，你需要对你正在执行的查询进行基准测试，而不是假设哪个引擎更适合。

2015-06-11 09:15:31

根据我的经验，MyISAM是一个更好的选择，只要你不做delete、update、大量的单个INSERT、事务和全文索引。顺便说一句，CHECK TABLE太可怕了。随着表的行数越来越老，你不知道它什么时候会结束。

2009-01-06 00:14:42

InnoDB offers:

ACID transactions
row-level locking
foreign key constraints
automatic crash recovery
table compression (read/write)
spatial data types (no spatial indexes)

在InnoDB中，一行中除TEXT和BLOB外的所有数据最多占用8000字节。InnoDB没有全文索引。在InnoDB中，COUNT(*)s(当WHERE, GROUP BY或JOIN不使用时)执行速度比MyISAM慢，因为行数没有存储在内部。InnoDB将数据和索引存储在一个文件中。InnoDB使用缓冲池来缓存数据和索引。

MyISAM提供：

fast COUNT(*)s (when WHERE, GROUP BY, or JOIN is not used)
full text indexing
smaller disk footprint
very high table compression (read only)
spatial data types and indexes (R-tree)

MyISAM has table-level locking, but no row-level locking. No transactions. No automatic crash recovery, but it does offer repair table functionality. No foreign key constraints. MyISAM tables are generally more compact in size on disk when compared to InnoDB tables. MyISAM tables could be further highly reduced in size by compressing with myisampack if needed, but become read-only. MyISAM stores indexes in one file and data in another. MyISAM uses key buffers for caching indexes and leaves the data caching management to the operating system.

总的来说，我推荐InnoDB用于大多数用途，MyISAM仅用于特殊用途。InnoDB现在是MySQL新版本的默认引擎。

2013-05-28 07:03:17

MyISAM vs InnoDB

推荐文章

最新文章

标签