no5ix/realtime-server

A lightweight game server engine

Key Features

  • ECS-based business layer: write business logic by inheriting from the base entity class and the base component class.
  • Integrated service management based on etcd: service registration, TTL, service discovery, load balancing, load reporting, and the Watch mechanism.
  • RPC framework based on msgpack: supports calling a peer directly by IP address, and calling remote virtual entities/components directly in combination with ECS.
  • Coroutine support in the business layer based on asyncio: an RPC can be called and its return value obtained directly, as in result = await rpc_call().
    • A coroutine pool is implemented and wrapped in a concise decorator so the business layer can use it easily.
  • Support for TCP and RUDP
  • Asynchronous HTTP microservice framework based on Sanic, convenient for developing public services:
    • An authentication module based on JWT.
    • A data persistence module based on Redis.
    • An ODM module based on Umongo.
  • Hot-update reload module:
    • Full update: stable.
    • Incremental update: faster, convenient for day-to-day development.
  • Asynchronous TimedRotating log module:
    • Automatically rolls over log files according to date and time.
    • Supports callbacks of coroutine objects.
    • Colors output by log level so entries are easy to scan.
    • Prints the stack and locals when reporting a traceback.
    • Provides file jump support in PyCharm for log levels above warning.
  • Timer module with a 1:N model: avoids the error-prone case of one timer overwriting another that uses the same key.
    • A key can be reused without overwriting the timers already registered under it. Note that cancel_timer cancels all timers with that key at once.
  • Enhanced JSON parser: It supports comments, automatic removal of commas, and variable macros.
  • Data persistence module based on MongoDB
  • Client-side simulation and automated testing support
  • Gate server in front of the lobby server: responsible for data compression/decompression, encryption/decryption, and authentication.
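
The 1:N timer behaviour described in the list above can be sketched as follows. This is a minimal illustration built on asyncio's call_later, not the engine's actual timer module: the TimerHub class and add_timer method are hypothetical names, and only cancel_timer is a name taken from this README.

```python
import asyncio
from collections import defaultdict


class TimerHub:
    """Hypothetical 1:N timer registry: one key may own many timers."""

    def __init__(self):
        self._timers = defaultdict(list)  # key -> list of asyncio.TimerHandle

    def add_timer(self, key, delay, callback, *args):
        # Reusing a key appends a timer instead of replacing the previous one.
        loop = asyncio.get_running_loop()
        handle = loop.call_later(delay, callback, *args)
        self._timers[key].append(handle)
        return handle

    def cancel_timer(self, key):
        # Cancels every timer registered under the key and returns how many
        # handles were registered (already-fired ones are not pruned here).
        handles = self._timers.pop(key, [])
        for handle in handles:
            handle.cancel()
        return len(handles)


async def demo():
    hub = TimerHub()
    fired = []
    hub.add_timer("buff", 0.01, fired.append, "first")
    hub.add_timer("buff", 0.02, fired.append, "second")  # same key, both kept
    hub.add_timer("regen", 0.01, fired.append, "regen")
    cancelled = hub.cancel_timer("buff")  # cancels both "buff" timers at once
    await asyncio.sleep(0.05)
    return cancelled, fired
```

Running asyncio.run(demo()) cancels both "buff" timers with a single call while the "regen" timer still fires.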

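The result = await rpc_call() style listed under Key Features usually comes down to pairing each request id with a Future that the reply resolves. The sketch below shows that pattern under stated assumptions: RpcClient, on_reply, and the in-process fake_server are hypothetical, and json stands in for msgpack so the example needs only the standard library. It is not the engine's real RPC API.

```python
import asyncio
import itertools
import json


class RpcClient:
    """Hypothetical client that matches replies to requests by id."""

    def __init__(self, send):
        self._send = send      # callable taking bytes (transport stand-in)
        self._pending = {}     # request id -> Future awaiting the reply
        self._ids = itertools.count(1)

    async def rpc_call(self, method, *args, timeout=1.0):
        req_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[req_id] = future
        payload = {"id": req_id, "method": method, "args": args}
        self._send(json.dumps(payload).encode())
        try:
            # Suspend this coroutine until on_reply resolves the future.
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(req_id, None)

    def on_reply(self, data):
        # Called by the transport when a reply arrives.
        msg = json.loads(data)
        future = self._pending.get(msg["id"])
        if future is not None and not future.done():
            future.set_result(msg["result"])


async def demo():
    handlers = {"add": lambda a, b: a + b}
    loop = asyncio.get_running_loop()

    def fake_server(data):
        # In-process "server": runs the handler, replies on the next loop tick.
        req = json.loads(data)
        result = handlers[req["method"]](*req["args"])
        reply = json.dumps({"id": req["id"], "result": result}).encode()
        loop.call_soon(client.on_reply, reply)

    client = RpcClient(fake_server)
    return await client.rpc_call("add", 2, 3)
```

Keeping the pending futures in a dictionary keyed by request id lets many calls be in flight on one connection, and the timeout prevents a lost reply from blocking the caller forever.
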
Architecture Diagram

The architecture diagram is generated automatically from PlantUML source.

Python Version

Python 3.8.8

Q&A

Answers to common questions are collected here, so please don't flood the group chat with them:

  • Where can I find the old version?
    • Do you mean the C++ version shown in the demo image below? It's in the master branch.
  • Why choose Python for the new version instead of C++? Or why not make C++ the bottom layer and Python the upper layer?
    • Using C++ as the bottom layer with Python calling into it from above is possible. However, the current goal is to encourage more developers to contribute, and the barrier to entry for C++ is significantly higher than for Python.
    • If there are enough users in the future, we will consider rewriting the bottom layer in C++ and packaging it as a .pyd or .so for the upper layer to call.
    • For most games, the performance of Python is sufficient.
    • This server engine aims to serve a broad base of developers and enable rapid development. Python has a wide audience and is easy to learn.
  • What's the significance of lobby_gate? Why doesn't the client connect directly to the lobby?
    • Generally, when latency is not critical, the client accesses the lobby through the gate. The gate proxies and forwards network traffic between the client and the game server. It also handles encryption/decryption and compression/decompression, which reduces the load on the lobby.
    • Message processing in the lobby can be simpler because it does not have to deal with I/O multiplexing: all messages arrive over a single TCP connection (single gateway) or a fixed number of TCP connections (multiple gateways).
    • After the client starts, it connects to the gateway, which then connects to the scene server and forwards all message packets. When switching scenes, the connection between the client and the gateway stays up, and the gateway connects to the new scene. Since the gateway and the game server are usually on the same local area network, disconnections become much less of a problem.
    • In a multi-process online game architecture, the game server processes not only client messages but also messages from other processes, such as account services and GM tools. For security reasons, client messages must be isolated and restricted.
    • Game servers are frequently launched or merged, and hardware failures sometimes force a migration. If the game server serves clients directly, clients usually have to log out and log in again whenever the service changes. With a gateway, when a server address changes internally, only the gateway needs to be updated and reconnected, and the client may not even notice.
  • What is RUDP?
    • It stands for reliable UDP.
    • Currently, this server engine implements it in Python, based on the standard KCP algorithm.
    • There are plans to modify KCP in the future to improve performance further. Stay tuned:
      • Add a dynamic redundant packet mechanism based on SRTT monitoring.
      • Simplify the packet header (for example, some fields in the packet header do not need 32 bits and can be changed to 16 bits).
      • Introduce a dupack mechanism rather than sending only IKCP_CMD_ACK packets.
        • Make full use of the fields in the packet header. For example, when the number of ACKs to be sent in the ACK list array is less than or equal to 3, use the len, rdc_len, and sn in the packet header to represent the sequence numbers of the received packets.
        • Redundant ACK mechanism: When the number of ACKs to be sent in the ACK list array is greater than 3, the body will contain all the ACKs in the merged ACK list and the previously sent ACKs until the packet size reaches the MSS.
      • ...
  • What about etcd?
    • It implements service registration, service discovery, load balancing, and load reporting through its TTL and Watch mechanisms.
    • It is convenient for implementing distributed locks.
    • Interaction with etcd uses its v2 HTTP API. Because this interaction is not latency-sensitive, gRPC is not used, which keeps things simple.
    • Then why choose etcd instead of ZooKeeper?
      • The two have different goals. etcd aims to be a highly available key/value store, used mainly for shared configuration and service discovery; ZooKeeper aims to be an efficient and reliable coordination system.
      • The interfaces differ. etcd exposes an HTTP + JSON API that can be used with nothing more than curl, which makes it easy for every host in the cluster to access; ZooKeeper is based on TCP and requires a dedicated client.
      • The functionality is quite similar. Both etcd and ZooKeeper provide key-value storage, cluster queue synchronization, and the ability to watch a key for changes in its value.
      • Deployment is also similar: both run as a cluster and can support thousands of nodes. etcd is written in Go and can be deployed by installing the compiled binary; ZooKeeper is written in Java and depends on the JDK, so the JDK has to be deployed first.
      • Implementation languages: Go is almost as efficient as C, and it was designed with multi-threading and inter-process communication in mind, so it performs very well in small clusters; Java needs more code for the same work and performs only moderately in small clusters. At large scale, however, once multi-threading is optimized, its performance is not much different from Go's.
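
The etcd answers above (registration with a TTL, load reporting, and load balancing over the v2 HTTP API) can be sketched as below. This is a rough illustration, not the engine's code: the /services/<name>/<addr> key layout, the JSON value carrying a load field, and both function names are assumptions. The v2 keys API itself accepts a form-encoded value and ttl on PUT and returns child entries under node.nodes on a directory GET.

```python
import json
import urllib.parse
import urllib.request

ETCD = "http://127.0.0.1:2379"  # assumed local etcd endpoint


def build_register_request(service, addr, load, ttl=10):
    """Build (not send) a v2 PUT that registers addr with a TTL.

    Re-sending the same request before the TTL expires both refreshes the
    registration and reports the latest load.
    """
    url = f"{ETCD}/v2/keys/services/{service}/{addr}"  # hypothetical key layout
    body = urllib.parse.urlencode(
        {"value": json.dumps({"load": load}), "ttl": ttl}
    )
    return urllib.request.Request(url, data=body.encode(), method="PUT")


def pick_least_loaded(listing):
    """Pick the address with the lowest reported load from a v2 directory GET."""
    nodes = listing.get("node", {}).get("nodes", [])
    if not nodes:
        return None
    best = min(nodes, key=lambda n: json.loads(n["value"])["load"])
    return best["key"].rsplit("/", 1)[-1]
```

Against a running etcd that serves the v2 API, the request would be sent with urllib.request.urlopen(build_register_request("lobby", "10.0.0.1:9000", load=3)). A server that stops refreshing its key disappears from the directory once the TTL expires, which is what makes discovery self-cleaning.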

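To make the header-slimming idea from the RUDP answer concrete, the sketch below packs the standard 24-byte KCP segment header with Python's struct module and compares it with a variant whose len field is 16 bits. The slim layout is only an illustration of the planned direction: which fields would actually be narrowed is not specified in this README, and pack_header/unpack_header are hypothetical helpers.

```python
import struct

# Standard KCP segment header, little-endian, 24 bytes:
# conv(u32) cmd(u8) frg(u8) wnd(u16) ts(u32) sn(u32) una(u32) len(u32)
KCP_HEADER = struct.Struct("<IBBHIIII")

# Hypothetical slimmer variant: len narrowed to 16 bits, on the assumption
# that a segment payload never exceeds 65535 bytes.
SLIM_HEADER = struct.Struct("<IBBHIIIH")

IKCP_CMD_PUSH = 81  # data segment
IKCP_CMD_ACK = 82   # acknowledgement segment


def pack_header(layout, conv, cmd, frg, wnd, ts, sn, una, length):
    """Serialize the eight header fields with the given layout."""
    return layout.pack(conv, cmd, frg, wnd, ts, sn, una, length)


def unpack_header(layout, data):
    """Return (conv, cmd, frg, wnd, ts, sn, una, length) from raw bytes."""
    return layout.unpack_from(data)


fields = (0x11223344, IKCP_CMD_PUSH, 0, 128, 1000, 7, 5, 512)
standard = pack_header(KCP_HEADER, *fields)
slim = pack_header(SLIM_HEADER, *fields)
saved = len(standard) - len(slim)  # bytes saved per segment by the narrower len
```

Narrowing a single field saves two bytes per segment; the saving matters mostly for small packets, where the header is a large share of the datagram.
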
To-Do List

  • Bring Python into Unity via Python.NET for hot updates or business logic, so that front-end and back-end code can be unified.
  • Develop supporting tools for flame graphs.
  • Create tools for importing and exporting configuration tables.
  • ...

QQ Group

If you like the project, give it a star and join QQ group 496687140 to find more documentation and join the discussion.