Thread pool synchronizing problem? [JIRA: RIAK-1912] #141

Closed
@agilesoftware

Description

In submit (eleveldb_thread_pool::submit):

    else if (!FindWaitingThread(item))
    {
        // no waiting threads, put on backlog queue
        lock();
        eleveldb::inc_and_fetch(&work_queue_atomic);
        work_queue.push_back(item);
        unlock();

        // to address race condition, thread might be waiting now
        FindWaitingThread(NULL);

        perf()->Inc(leveldb::ePerfElevelQueued);
        ret_flag=true;
    }   // if

It first checks whether there is a waiting thread to hand the item to directly (FindWaitingThread calls pthread_cond_broadcast); if there is none, it adds the item to work_queue.

On the other hand, in eleveldb_write_thread_worker (threading.cc), when a worker thread cannot find a work item, it waits on its condition variable.

    pthread_mutex_lock(&tdata.m_Mutex);
    tdata.m_DirectWork=NULL; // safety

    // only wait if we are really sure no work pending
    if (0==h.work_queue_atomic)
    {
        // yes, thread going to wait. set available now.
        tdata.m_Available=1;
        pthread_cond_wait(&tdata.m_Condition, &tdata.m_Mutex);
    }    // if

    tdata.m_Available=0;    // safety
    submission=(eleveldb::WorkTask *)tdata.m_DirectWork; // NULL is valid
    tdata.m_DirectWork=NULL;// safety

    pthread_mutex_unlock(&tdata.m_Mutex);

After checking that the work queue is empty (work_queue_atomic==0), the worker sets m_Available to 1.

The problem is this: if a worker thread has checked that the work queue is empty but has not yet set m_Available, submit may fail to find a waiting thread and add the item to the queue instead. The second FindWaitingThread(NULL) does not help, because it may also run within that same window. The worker then starts waiting and will wait forever, even though the queue holds an item.

In fact, there is a second window: if the worker has set m_Available but has not yet started waiting on its condition variable, and submit sets its m_DirectWork and broadcasts the condition variable in that gap, the worker will miss the broadcast and, once it starts waiting, will wait forever too.
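For comparison, here is a minimal sketch of the usual way to rule out this kind of lost wakeup. It is not eleveldb's code or a proposed patch: MiniPool, run_demo and all member names are hypothetical, and it uses std::mutex and std::condition_variable rather than raw pthreads. The "is there work?" check and the decision to sleep happen under the same mutex that submit holds while enqueueing, and the wait re-checks its predicate, so a submit can never land between a worker's check and its wait.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical miniature pool, for illustration only (not eleveldb's design).
class MiniPool {
public:
    explicit MiniPool(std::size_t workers) {
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { worker_loop(); });
    }

    ~MiniPool() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (auto& t : threads_) t.join();
    }

    void submit(int item) {
        {
            // The queue is modified under the same mutex the workers
            // hold while they decide whether to sleep.
            std::lock_guard<std::mutex> guard(mutex_);
            queue_.push_back(item);
        }
        wake_.notify_one();
    }

    // Blocks until `count` items have been processed, then returns their sum.
    long wait_for(std::size_t count) {
        std::unique_lock<std::mutex> guard(mutex_);
        done_.wait(guard, [&] { return processed_ >= count; });
        return sum_;
    }

private:
    void worker_loop() {
        std::unique_lock<std::mutex> guard(mutex_);
        for (;;) {
            // Predicate form: the condition is checked with the mutex held
            // and re-checked after every wakeup, so no notification is lost.
            wake_.wait(guard, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping and fully drained
            int item = queue_.front();
            queue_.pop_front();
            guard.unlock();
            // Real work would run here, outside the lock.
            guard.lock();
            sum_ += item;
            ++processed_;
            done_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;  // signals workers: work or shutdown
    std::condition_variable done_;  // signals wait_for: progress made
    std::deque<int> queue_;
    std::size_t processed_ = 0;
    long sum_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};

// Hypothetical driver: submits 1..items and returns the sum the workers computed.
long run_demo(std::size_t workers, int items) {
    assert(workers > 0 && items >= 0);
    MiniPool pool(workers);
    for (int i = 1; i <= items; ++i) pool.submit(i);
    return pool.wait_for(static_cast<std::size_t>(items));
}
```

Calling run_demo(4, 1000) submits 1000 items to four workers and returns only after all of them have been processed. The trade-off against the code quoted above is that every submit takes the shared mutex, which the direct hand-off through FindWaitingThread appears designed to avoid.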
