Skip to content

Distributed querying

Sam Burdick edited this page Jun 23, 2018 · 1 revision

Overview

The purpose of this page is to describe the distributed querying process as well as the necessary structures for doing so.

DBMS

The DBMS passes a query message to the master process in this form. Point queries look like

P:vid

while range queries are of the form

R:R_1[&R_2[&...]]

where R_i is a range of vector IDs.

Master

Point queries

Point queries are resolved with a single remote procedure call to the machine containing the vector id. The master process passes the result to the DBMS.

Range queries

  1. Split query message based on &, obtaining the set R = {R_1, R_2, ..., R_n}. In Pythonese:
res := []
for r in R:
    sublist := []
    for vec in r:
        res.append((consistent_hash(vec), vec))
    sublist.sort()
    res.append(sublist)

where consistent_hash maps vector IDs to slave IP addresses. Since we handle the query one vector/machine at a time, by handling the query in machine order we're minimizing the number of RPCS that actually need to be made. 2. In a random, round-robin, or bandwidth-considerate way, delegate a new COORDINATOR slave and pass it res. 3. If the coordinator crashes, repeat 2 with a new COORDINATOR.

Slave

There are 2 types of slaves, coordinators and handlers.

COORDINATOR

The coordinator takes the list of lists (machine, vec_id) tuples, res, from the master.

/* Globals */
failed_slaves
vector_array
mutex_lock

function query(res):
    failed_slaves := []
    num_handlers := len(res)
    vector_array := word[num_handlers]
    for list in res:
        Spawn new thread of the handler_call function, passing global_vector_array index and res
    Join all threads
    if any slaves failed: return failed_slaves
    v := 0xffffffffffffffff
    for v[i] in vector_array:
        v &= v[i]
    return v


function handler_call(index, machine_vec_list):
    client := clnt_create(machine_vec_list[0][0], ...)
    result := handle_query_rpc([machine_vec_list, 0x0], client)
    if timeout or result.failed_machine != NULL:
        lock(mutex_lock)
        failed_slaves += result
        unlock(mutex_lock)
    /* otherwise, it succeeded */
    vector_array[index] := result.vector

HANDLER

function handle_query_rpc([current_machine_vec, result], ...):
    while current_machine_vec.machine = this machine:
        result |= current_machine_vec.vec
        current_machine_vec = current_machine_vec->next
    if current_machine_vec = NULL:
        return result in new query_return_value
    /* otherwise, pass on to the next client */
    client := clnt_create(current_machine_vec, ...)
    result := handle_query_rpc([current_machine_vec, result], client)
    if timeout or result.failed_machine != NULL:
        return current_machine_vec.name in new query_return_value
    return result

Structures

A lot of the pseudocode above used Python-esque structures, but in C/RPC we need well-defined structs.

Machine-vector list (input to handler)

struct machine_vec_node {
    char *machine_addr;
    vec_t vector;
    struct machine_vec_node *next;
};

Query result (returned by handler)

If the query failed, failed_machine denotes the machine that failed. Otherwise it is NULL and vector is the result of the query.

struct query_return_value {
    vec_t vector;
    char *failed_machine;
}
Clone this wiki locally