Yu Xia edited this page Apr 25, 2018 · 17 revisions

Prerequisites

Node.js >= v0.10

We assume you're already familiar with Node.js.

In addition, you should have some understanding of promise-based async programming; see bluebird for details.

Some samples require node >= v0.12 with the --harmony option. Although memdb itself does not require this, we strongly recommend using generators in client code. See how-to-generators.

Redis

You should have Redis installed.

MongoDB

You should have MongoDB installed.

Understanding MongoDB helps you get started quickly, since many concepts in memdb are quite similar.

Architecture

[Figure: architecture diagram]

The chart above is a typical architecture of a memdb server cluster.

Shard

A shard is a node (the unit for scaling) in the MemDB cluster.

Each shard holds part of the data in local memory (just like a cache). Data that has been accessed stays in local memory until another shard requests it (or until memory runs low).

When the requested data is already in this shard, a fast in-memory operation is performed. Otherwise, the shard first syncs the data from the shard that currently holds it (or from backend storage if no shard owns the data).

To maximize performance, always access the same data from the same shard if possible. (If there are indexed fields, access docs that have the same index value from the same shard.)

Global Backend Storage (MongoDB)

Central persistent storage. Data in all shards is eventually persisted to backend storage.

Currently MongoDB is the only supported backend engine.

Global Locking (Redis)

Used internally for the global locking mechanism, based on Redis.

Shard Replica (Redis)

Each shard uses one Redis instance for data replication. You can also add more Redis replication to the slave.

All committed data can be restored after shard failure.

Frontend Server

The frontend server is not part of MemDB, and it must be implemented properly.

The frontend server connects directly to user clients; it can be a web server, a game server connector, or anything similar. Its job is load balancing and routing. Remember, MemDB performs best when you always access the same data from the same shard, so the goal of the routing algorithm is to make sure (if possible) that API calls accessing the same data are always routed to the same backend server. (In a typical game server scenario, using areaId as the routing key is a good option.)
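
The routing goal above can be sketched as a deterministic mapping from a routing key to a shard id. This is only a minimal sketch and not part of MemDB: pickShard, the simple string hash, and the shard ids are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper, not part of MemDB: map a routing key (for example
// an areaId) to a shard id so the same key always lands on the same shard.
function pickShard(routingKey, shardIds) {
    var key = String(routingKey);
    var hash = 0;
    for (var i = 0; i < key.length; i++) {
        // Simple unsigned 32-bit string hash; any stable hash would do.
        hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
    }
    return shardIds[hash % shardIds.length];
}

var shardIds = ['s1', 's2']; // illustrative shard ids
var shardForArea = pickShard('area-42', shardIds);
```

Every request carrying the same routing key is sent to the same shard, which is the property the routing algorithm needs. Note that changing the shard list remaps most keys with this simple modulo scheme.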

Database Concepts

Document

A document is like MongoDB's document or MySQL's row.

A document is just JSON, so data in a document must be JSON serializable. Number, String, Boolean, Array and Dict are supported; other types are not.

Document size is limited to 1MB after JSON serialization. Large documents are not recommended.
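
The two constraints above (JSON serializable, at most 1MB after serialization) can be checked on the client before a write. The following is a sketch under assumptions, not a MemDB API: checkDocument and MAX_DOC_BYTES are hypothetical names, and "1MB" is assumed to mean 1,048,576 bytes of UTF-8 encoded JSON.

```javascript
var util = require('util');

// Assumption: "1MB" means 1024 * 1024 bytes of UTF-8 encoded JSON.
var MAX_DOC_BYTES = 1024 * 1024;

// Hypothetical pre-write check, not a MemDB API.
function checkDocument(doc) {
    var json;
    try {
        json = JSON.stringify(doc);
    } catch (err) {
        // For example a circular reference.
        return {ok: false, reason: 'not JSON serializable: ' + err.message};
    }
    // A JSON round trip changes or drops values JSON cannot represent
    // (undefined, functions, Date objects, NaN, class instances).
    if (json === undefined || !util.isDeepStrictEqual(JSON.parse(json), doc)) {
        return {ok: false, reason: 'contains values that JSON cannot represent'};
    }
    var bytes = Buffer.byteLength(json, 'utf8');
    if (bytes > MAX_DOC_BYTES) {
        return {ok: false, reason: 'too large: ' + bytes + ' bytes'};
    }
    return {ok: true, bytes: bytes};
}

var result = checkDocument({name: 'rain', level: 3, tags: ['a', 'b']});
```

Here result.ok is true, while a document holding a Date object or a string longer than the size limit is rejected with a reason.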

Collection

A collection of documents, like MongoDB's collection or MySQL's table.

Connection

A connection to a shard, like a traditional database connection.

Due to node's async nature, a connection is not concurrency safe. DO NOT share a connection between API handlers that may run concurrently; we suggest you use AutoConnection instead.

var conn = yield memdb.connect({host : '127.0.0.1', port: 31017});
yield conn.collection('player').find(id);

AutoConnection

AutoConnection manages a pool of connections for each shard, executes each transaction on the specified shard, and automatically commits when the transaction completes or rolls back on failure.

AutoConnection uses domain to determine which transaction scope you're currently in. You must ensure the code in the transaction scope is domain safe.

Put each API request handler in one transaction, so that the request is processed in one connection and guarded by a transaction.

var autoconn = yield memdb.autoConnect({
    shards : { // Specify all shards
        s1 : {host : '127.0.0.1', port : 31017},
        s2 : {host : '127.0.0.1', port : 31018},
    }
});

// Start one transaction in shard s1
yield autoconn.transaction(Promise.coroutine(function*(){
    // *** This is the transaction scope ***
    // Make sure the domain does not change inside this scope

    yield autoconn.collection('player').insert({...});
    // more queries

    // ***************
}), 's1');

// Start another transaction in shard s2
yield autoconn.transaction(func, 's2');

Concurrency/Locking/Transaction

Locking is per document.

A connection holds the lock on any doc it queries from memdb using find/insert/remove/update (except findReadOnly).

All locks held by a connection are released after commit or rollback.

All changes (since the last commit) are not visible to other connections until they are committed.

All changes (since the last commit) are discarded after rollback or when a connection is closed without commit.

If any error occurs, the entire transaction is rolled back immediately.
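
To make the locking and visibility rules above concrete, here is a toy in-memory model. It is not MemDB's implementation and does not use the memdb API: ToyStore and ToyConnection are hypothetical, everything is synchronous, and a conflicting lock request simply throws, which is a simplification made for this sketch.

```javascript
// Toy in-memory model of the rules above; NOT MemDB's implementation.
function ToyStore() {
    this.docs = {};   // committed documents by id
    this.locks = {};  // doc id -> connection holding the lock
}
ToyStore.prototype.connect = function () {
    return new ToyConnection(this);
};

function ToyConnection(store) {
    this.store = store;
    this.pending = {}; // uncommitted changes by doc id
}
// Take the document lock, or fail if another connection holds it.
ToyConnection.prototype._lock = function (id) {
    var holder = this.store.locks[id];
    if (holder && holder !== this) {
        throw new Error('doc ' + id + ' is locked by another connection');
    }
    this.store.locks[id] = this;
};
ToyConnection.prototype.find = function (id) {
    this._lock(id);
    return this.pending.hasOwnProperty(id) ? this.pending[id] : this.store.docs[id];
};
ToyConnection.prototype.findReadOnly = function (id) {
    return this.store.docs[id]; // committed value, no lock taken
};
ToyConnection.prototype.update = function (id, doc) {
    this._lock(id);
    this.pending[id] = doc;
};
// Release every lock held by this connection and drop pending changes.
ToyConnection.prototype._release = function () {
    for (var id in this.store.locks) {
        if (this.store.locks[id] === this) { delete this.store.locks[id]; }
    }
    this.pending = {};
};
ToyConnection.prototype.commit = function () {
    for (var id in this.pending) { this.store.docs[id] = this.pending[id]; }
    this._release();
};
ToyConnection.prototype.rollback = function () {
    this._release();
};

var store = new ToyStore();
var a = store.connect();
var b = store.connect();

a.update('p1', {name: 'rain'});
var seenByB = b.findReadOnly('p1'); // undefined: a has not committed yet
a.commit();                         // publishes the change, releases a's locks
var afterCommit = b.find('p1');     // b takes the lock and sees a's change
```

The usage at the end walks through the rules in order: an uncommitted change is invisible to another connection, commit publishes it and releases the locks, and the next find takes the lock again.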

Query and index

Memdb uses a simple hash-based index which supports basic querying. Queries inside MemDB are fully CUID transaction guarded:

  • Each query must be a key-value pair; queries that require value comparison (sort, $gt) are not supported (you can query the backend MongoDB for this instead). Compound indexes and unique indexes are supported.
  • The keys used in a query must be configured as an index; a query that does not use an index is treated as an error.
  • Indexes are defined in the database config file memdb.conf.js. An index cannot be changed while any shard is running.
  • The number of docs that share the same index value is limited, so an index on a 'boolean' or 'enum' field is discouraged and can cause performance issues.
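
The restrictions above follow from how a hash-based index works, which the toy below illustrates. It is for illustration only and is not MemDB's implementation or config format: ToyHashIndex and its methods are hypothetical names.

```javascript
// Toy hash index, for illustration only; NOT MemDB's implementation.
function ToyHashIndex(keys) {
    this.keys = keys;   // indexed field names, e.g. ['areaId']
    this.buckets = {};  // serialized index value -> list of doc ids
}
// Build the bucket name from the indexed field values of a doc or query.
ToyHashIndex.prototype._bucket = function (obj) {
    return JSON.stringify(this.keys.map(function (k) { return obj[k]; }));
};
ToyHashIndex.prototype.add = function (id, doc) {
    var bucket = this._bucket(doc);
    (this.buckets[bucket] = this.buckets[bucket] || []).push(id);
};
// Only exact key-value queries on exactly the indexed keys can be answered.
ToyHashIndex.prototype.find = function (query) {
    var queryKeys = Object.keys(query).sort();
    if (JSON.stringify(queryKeys) !== JSON.stringify(this.keys.slice().sort())) {
        throw new Error('query does not match index keys ' + this.keys.join(','));
    }
    queryKeys.forEach(function (k) {
        if (query[k] !== null && typeof query[k] === 'object') {
            // Operators such as $gt need ordered data, which a hash lacks.
            throw new Error('only equality on plain values is supported');
        }
    });
    return this.buckets[this._bucket(query)] || [];
};

var byArea = new ToyHashIndex(['areaId']);
byArea.add('p1', {areaId: 1, name: 'rain'});
byArea.add('p2', {areaId: 1, name: 'snow'});
byArea.add('p3', {areaId: 2, name: 'wind'});
var inAreaOne = byArea.find({areaId: 1}); // ['p1', 'p2']
```

Each distinct index value maps to one bucket, so a lookup is a single hash access, but there is no ordering to support sort or $gt. A field with only a few distinct values (a boolean or an enum) puts most docs into a handful of very large buckets, which is why such indexes are discouraged.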

For more complicated queries, you can query and index directly in the backend MongoDB. This leverages the full power of the MongoDB query system, with the following restrictions:

  • Querying the backend MongoDB is NOT transaction guarded, and the data may not be up to date. (To reduce the delay, lower the persistentDelay option value, at the cost of performance.)
  • DO NOT write directly to the backend MongoDB unless all shards are offline.

Index update

To update an index in a non-empty database, first shut down all shards, then run memdbindex to rebuild the index.

Mdbgoose

Mdbgoose is the 'mongoose' for memdb; it is actually modified from mongoose. You can leverage most of the power of mongoose for object modeling, just use it as you would use mongoose!

Performance optimization

  • Access the same document from the same shard if possible. If there are indexed fields, access documents that have the same index value from the same shard.
  • Use findReadOnly instead of find if you only read the document (that is, you neither modify it nor make modifications that depend on it).

In a production environment, make sure to:

  • Set log level to WARN
  • Disable promise longStackTraces
  • Turn off lineDebug in log4js.json

Scaling and High availability

Scaling

MemDB can be scaled simply by adding more shards. You need to stop the cluster when modifying the memdb config.

The backend storage and the locking service are global services; when the system is huge, they may become bottlenecks. Thankfully both MongoDB and Redis support horizontal scaling.

Mongodb scaling

You can use MongoDB sharding. Just use _id as the shard key for every collection.

Redis scaling

Please refer to redis partitioning; twemproxy is recommended.

High availability

MemDB uses Redis as the shard data replica; all committed data can be restored after shard failure.

Recover from MemDB shard failure

Currently we do not support automatic failover. This is the recovery guide:

  • If the memdb process is down, simply restart the shard process. You can set up a monitor service to do this.
  • If the host VM is unrecoverable, make sure the failed VM is disconnected from the network before starting the failed shards on a new VM. If the new VM has a different IP, you have to stop the cluster and modify the memdb config.

Mongodb HA

ReplicaSet provides high availability through automatic failover.

Redis HA

Redis Replication can be used for data replication.

Further reading

Please read the API reference carefully for detailed usage.