到目前为止,我已经避免了测试多线程代码的噩梦,因为它似乎是一个太大的雷区。我想知道人们是如何测试依赖于线程的代码以获得成功执行的,或者人们是如何测试那些仅在两个线程以给定方式交互时才会出现的问题的?

对于今天的程序员来说,这似乎是一个非常关键的问题,恕我直言,将我们的知识集中在这个问题上是很有用的。


当前回答

For J2E code, I've used SilkPerformer, LoadRunner and JMeter for concurrency testing of threads. They all do the same thing. Basically, they give you a relatively simple interface for administrating their version of the proxy server, required, in order to analyze the TCP/IP data stream, and simulate multiple users making simultaneous requests to your app server. The proxy server can give you the ability to do things like analyze the requests made, by presenting the whole page and URL sent to the server, as well as the response from the server, after processing the request.

您可以在不安全的http模式下找到一些错误,在这种模式下,您至少可以分析正在发送的表单数据,并为每个用户系统地更改表单数据。但真正的测试是在https(安全套接字层)中运行。然后,您还必须有系统地修改会话和cookie数据,这可能有点复杂。

在测试并发性时,我发现的最好的错误是,当我发现开发人员在登录时依赖Java垃圾收集来关闭登录时建立的到LDAP服务器的连接请求。这导致用户暴露在其他用户的会话中,当试图分析服务器瘫痪时发生了什么,几乎每隔几秒钟就能完成一次事务时,结果非常令人困惑。

In the end, you or someone will probably have to buckle down and analyze the code for blunders like the one I just mentioned. And an open discussion across departments, like the one that occurred, when we unfolded the problem described above, are most useful. But these tools are the best solution to testing multi-threaded code. JMeter is open source. SilkPerformer and LoadRunner are proprietary. If you really want to know whether your app is thread safe, that's how the big boys do it. I've done this for very large companies professionally, so I'm not guessing. I'm speaking from personal experience.

提醒一句:理解这些工具确实需要一些时间。这不是简单地安装软件并启动GUI的问题,除非您已经接触过多线程编程。我试图确定需要理解的3个关键领域(表单、会话和cookie数据),希望至少从理解这些主题开始,可以帮助您集中精力快速获得结果,而不必通读整个文档。

其他回答

一个简单的测试模式可以用于一些(不是所有!)用例,就是多次重复相同的测试。例如,假设你有一个方法:

def process(input):
    # Spawns several threads to do the job
    # ...
    return output

创建一堆测试:

process(input1) -> expect to return output1
process(input2) -> expect to return output2
...

现在将每个测试运行多次。

如果流程的实现包含一个微小的错误(例如死锁、竞态条件等),出现的概率为0.1%,那么运行1000次测试,则该错误至少出现一次的概率为64%。运行测试10000次,得到>99%的概率。

我做过很多这样的事,的确很糟糕。

一些建议:

GroboUtils for running multiple test threads alphaWorks ConTest to instrument classes to cause interleavings to vary between iterations Create a throwable field and check it in tearDown (see Listing 1). If you catch a bad exception in another thread, just assign it to throwable. I created the utils class in Listing 2 and have found it invaluable, especially waitForVerify and waitForCondition, which will greatly increase the performance of your tests. Make good use of AtomicBoolean in your tests. It is thread safe, and you'll often need a final reference type to store values from callback classes and suchlike. See example in Listing 3. Make sure to always give your test a timeout (e.g., @Test(timeout=60*1000)), as concurrency tests can sometimes hang forever when they're broken.

清单1:

@After
public void tearDown() {
    if ( throwable != null )
        throw throwable;
}

清单2:

import static org.junit.Assert.fail;
import java.io.File;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Random;
import org.apache.commons.collections.Closure;
import org.apache.commons.collections.Predicate;
import org.apache.commons.lang.time.StopWatch;
import org.easymock.EasyMock;
import org.easymock.classextension.internal.ClassExtensionHelper;
import static org.easymock.classextension.EasyMock.*;

import ca.digitalrapids.io.DRFileUtils;

/**
 * Various utilities for testing
 */
public abstract class DRTestUtils
{
    static private Random random = new Random();

/** Calls {@link #waitForCondition(Integer, Integer, Predicate, String)} with
 * default max wait and check period values.
 */
static public void waitForCondition(Predicate predicate, String errorMessage) 
    throws Throwable
{
    waitForCondition(null, null, predicate, errorMessage);
}

/** Blocks until a condition is true, throwing an {@link AssertionError} if
 * it does not become true during a given max time.
 * @param maxWait_ms max time to wait for true condition. Optional; defaults
 * to 30 * 1000 ms (30 seconds).
 * @param checkPeriod_ms period at which to try the condition. Optional; defaults
 * to 100 ms.
 * @param predicate the condition
 * @param errorMessage message use in the {@link AssertionError}
 * @throws Throwable on {@link AssertionError} or any other exception/error
 */
static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms, 
    Predicate predicate, String errorMessage) throws Throwable 
{
    waitForCondition(maxWait_ms, checkPeriod_ms, predicate, new Closure() {
        public void execute(Object errorMessage)
        {
            fail((String)errorMessage);
        }
    }, errorMessage);
}

/** Blocks until a condition is true, running a closure if
 * it does not become true during a given max time.
 * @param maxWait_ms max time to wait for true condition. Optional; defaults
 * to 30 * 1000 ms (30 seconds).
 * @param checkPeriod_ms period at which to try the condition. Optional; defaults
 * to 100 ms.
 * @param predicate the condition
 * @param closure closure to run
 * @param argument argument for closure
 * @throws Throwable on {@link AssertionError} or any other exception/error
 */
static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms, 
    Predicate predicate, Closure closure, Object argument) throws Throwable 
{
    if ( maxWait_ms == null )
        maxWait_ms = 30 * 1000;
    if ( checkPeriod_ms == null )
        checkPeriod_ms = 100;
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    while ( !predicate.evaluate(null) ) {
        Thread.sleep(checkPeriod_ms);
        if ( stopWatch.getTime() > maxWait_ms ) {
            closure.execute(argument);
        }
    }
}

/** Calls {@link #waitForVerify(Integer, Object)} with <code>null</code>
 * for {@code maxWait_ms}
 */
static public void waitForVerify(Object easyMockProxy)
    throws Throwable
{
    waitForVerify(null, easyMockProxy);
}

/** Repeatedly calls {@link EasyMock#verify(Object[])} until it succeeds, or a
 * max wait time has elapsed.
 * @param maxWait_ms Max wait time. <code>null</code> defaults to 30s.
 * @param easyMockProxy Proxy to call verify on
 * @throws Throwable
 */
static public void waitForVerify(Integer maxWait_ms, Object easyMockProxy)
    throws Throwable
{
    if ( maxWait_ms == null )
        maxWait_ms = 30 * 1000;
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    for(;;) {
        try
        {
            verify(easyMockProxy);
            break;
        }
        catch (AssertionError e)
        {
            if ( stopWatch.getTime() > maxWait_ms )
                throw e;
            Thread.sleep(100);
        }
    }
}

/** Returns a path to a directory in the temp dir with the name of the given
 * class. This is useful for temporary test files.
 * @param aClass test class for which to create dir
 * @return the path
 */
static public String getTestDirPathForTestClass(Object object) 
{

    String filename = object instanceof Class ? 
        ((Class)object).getName() :
        object.getClass().getName();
    return DRFileUtils.getTempDir() + File.separator + 
        filename;
}

static public byte[] createRandomByteArray(int bytesLength)
{
    byte[] sourceBytes = new byte[bytesLength];
    random.nextBytes(sourceBytes);
    return sourceBytes;
}

/** Returns <code>true</code> if the given object is an EasyMock mock object 
 */
static public boolean isEasyMockMock(Object object) {
    try {
        InvocationHandler invocationHandler = Proxy
                .getInvocationHandler(object);
        return invocationHandler.getClass().getName().contains("easymock");
    } catch (IllegalArgumentException e) {
        return false;
    }
}
}

清单3:

@Test
public void testSomething() {
    final AtomicBoolean called = new AtomicBoolean(false);
    subject.setCallback(new SomeCallback() {
        public void callback(Object arg) {
            // check arg here
            called.set(true);
        }
    });
    subject.run();
    assertTrue(called.get());
}

它并不完美,但我用c#写了这个帮助程序:

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

namespace Proto.Promises.Tests.Threading
{
    public class ThreadHelper
    {
        public static readonly int multiThreadCount = Environment.ProcessorCount * 100;
        private static readonly int[] offsets = new int[] { 0, 10, 100, 1000 };

        private readonly Stack<Task> _executingTasks = new Stack<Task>(multiThreadCount);
        private readonly Barrier _barrier = new Barrier(1);
        private int _currentParticipants = 0;
        private readonly TimeSpan _timeout;

        public ThreadHelper() : this(TimeSpan.FromSeconds(10)) { } // 10 second timeout should be enough for most cases.

        public ThreadHelper(TimeSpan timeout)
        {
            _timeout = timeout;
        }

        /// <summary>
        /// Execute the action multiple times in parallel threads.
        /// </summary>
        public void ExecuteMultiActionParallel(Action action)
        {
            for (int i = 0; i < multiThreadCount; ++i)
            {
                AddParallelAction(action);
            }
            ExecutePendingParallelActions();
        }

        /// <summary>
        /// Execute the action once in a separate thread.
        /// </summary>
        public void ExecuteSingleAction(Action action)
        {
            AddParallelAction(action);
            ExecutePendingParallelActions();
        }

        /// <summary>
        /// Add an action to be run in parallel.
        /// </summary>
        public void AddParallelAction(Action action)
        {
            var taskSource = new TaskCompletionSource<bool>();
            lock (_executingTasks)
            {
                ++_currentParticipants;
                _barrier.AddParticipant();
                _executingTasks.Push(taskSource.Task);
            }
            new Thread(() =>
            {
                try
                {
                    _barrier.SignalAndWait(); // Try to make actions run in lock-step to increase likelihood of breaking race conditions.
                    action.Invoke();
                    taskSource.SetResult(true);
                }
                catch (Exception e)
                {
                    taskSource.SetException(e);
                }
            }).Start();
        }

        /// <summary>
        /// Runs the pending actions in parallel, attempting to run them in lock-step.
        /// </summary>
        public void ExecutePendingParallelActions()
        {
            Task[] tasks;
            lock (_executingTasks)
            {
                _barrier.SignalAndWait();
                _barrier.RemoveParticipants(_currentParticipants);
                _currentParticipants = 0;
                tasks = _executingTasks.ToArray();
                _executingTasks.Clear();
            }
            try
            {
                if (!Task.WaitAll(tasks, _timeout))
                {
                    throw new TimeoutException($"Action(s) timed out after {_timeout}, there may be a deadlock.");
                }
            }
            catch (AggregateException e)
            {
                // Only throw one exception instead of aggregate to try to avoid overloading the test error output.
                throw e.Flatten().InnerException;
            }
        }

        /// <summary>
        /// Run each action in parallel multiple times with differing offsets for each run.
        /// <para/>The number of runs is 4^actions.Length, so be careful if you don't want the test to run too long.
        /// </summary>
        /// <param name="expandToProcessorCount">If true, copies each action on additional threads up to the processor count. This can help test more without increasing the time it takes to complete.
        /// <para/>Example: 2 actions with 6 processors, runs each action 3 times in parallel.</param>
        /// <param name="setup">The action to run before each parallel run.</param>
        /// <param name="teardown">The action to run after each parallel run.</param>
        /// <param name="actions">The actions to run in parallel.</param>
        public void ExecuteParallelActionsWithOffsets(bool expandToProcessorCount, Action setup, Action teardown, params Action[] actions)
        {
            setup += () => { };
            teardown += () => { };
            int actionCount = actions.Length;
            int expandCount = expandToProcessorCount ? Math.Max(Environment.ProcessorCount / actionCount, 1) : 1;
            foreach (var combo in GenerateCombinations(offsets, actionCount))
            {
                setup.Invoke();
                for (int k = 0; k < expandCount; ++k)
                {
                    for (int i = 0; i < actionCount; ++i)
                    {
                        int offset = combo[i];
                        Action action = actions[i];
                        AddParallelAction(() =>
                        {
                            for (int j = offset; j > 0; --j) { } // Just spin in a loop for the offset.
                            action.Invoke();
                        });
                    }
                }
                ExecutePendingParallelActions();
                teardown.Invoke();
            }
        }

        // Input: [1, 2, 3], 3
        // Ouput: [
        //          [1, 1, 1],
        //          [2, 1, 1],
        //          [3, 1, 1],
        //          [1, 2, 1],
        //          [2, 2, 1],
        //          [3, 2, 1],
        //          [1, 3, 1],
        //          [2, 3, 1],
        //          [3, 3, 1],
        //          [1, 1, 2],
        //          [2, 1, 2],
        //          [3, 1, 2],
        //          [1, 2, 2],
        //          [2, 2, 2],
        //          [3, 2, 2],
        //          [1, 3, 2],
        //          [2, 3, 2],
        //          [3, 3, 2],
        //          [1, 1, 3],
        //          [2, 1, 3],
        //          [3, 1, 3],
        //          [1, 2, 3],
        //          [2, 2, 3],
        //          [3, 2, 3],
        //          [1, 3, 3],
        //          [2, 3, 3],
        //          [3, 3, 3]
        //        ]
        private static IEnumerable<int[]> GenerateCombinations(int[] options, int count)
        {
            int[] indexTracker = new int[count];
            int[] combo = new int[count];
            for (int i = 0; i < count; ++i)
            {
                combo[i] = options[0];
            }
            // Same algorithm as picking a combination lock.
            int rollovers = 0;
            while (rollovers < count)
            {
                yield return combo; // No need to duplicate the array since we're just reading it.
                for (int i = 0; i < count; ++i)
                {
                    int index = ++indexTracker[i];
                    if (index == options.Length)
                    {
                        indexTracker[i] = 0;
                        combo[i] = options[0];
                        if (i == rollovers)
                        {
                            ++rollovers;
                        }
                    }
                    else
                    {
                        combo[i] = options[index];
                        break;
                    }
                }
            }
        }
    }
}

使用示例:

[Test]
public void DeferredMayBeBeResolvedAndPromiseAwaitedConcurrently_void0()
{
    Promise.Deferred deferred = default(Promise.Deferred);
    Promise promise = default(Promise);

    int invokedCount = 0;

    var threadHelper = new ThreadHelper();
    threadHelper.ExecuteParallelActionsWithOffsets(false,
        // Setup
        () =>
        {
            invokedCount = 0;
            deferred = Promise.NewDeferred();
            promise = deferred.Promise;
        },
        // Teardown
        () => Assert.AreEqual(1, invokedCount),
        // Parallel Actions
        () => deferred.Resolve(),
        () => promise.Then(() => { Interlocked.Increment(ref invokedCount); }).Forget()
    );
}

我曾经有过测试线程代码的不幸任务,这绝对是我写过的最难的测试。

在编写测试时,我使用委托和事件的组合。基本上,它都是关于使用PropertyNotifyChanged事件和WaitCallback或某种轮询的ConditionalWaiter。

我不确定这是否是最好的方法,但它对我来说是有效的。

并发是内存模型、硬件、缓存和代码之间复杂的相互作用。在Java的情况下,至少这样的测试主要由jcstress部分解决。众所周知,该库的创建者是许多JVM、GC和Java并发特性的作者。

但是即使是这个库也需要对Java内存模型规范有很好的了解,这样我们才能确切地知道我们在测试什么。但我认为这项工作的重点是微基准测试。不是庞大的业务应用。