MHServerEmu Progress Report - December 2023

Happy holidays everyone! This month and year are coming to an end, and it is time for another MHServerEmu progress report.

Game Database Update

In December there has been significant progress on reimplementing the game database. If you have not yet read last month’s report, I suggest you go and do that right now, because what is coming is very much a continuation of that post.

The most significant milestone we have been able to achieve this month is restoring the original hierarchy of prototypes. This allows us to understand relationships between various pieces of data and make use of them.

For example, MHServerEmu has a command that allows you to look up costumes, !lookup costume. Previously it was implemented by iterating through all data files and checking every single file path for a certain pattern. This approach has always been a temporary solution: it is slow and it relies on files being consistently named (spoiler: they are not). What we can do instead now is just get a collection of all prototypes that reference the costume blueprint. This reduces pattern matching complexity and the number of iterations we have to do, and removes the reliance on naming. To put it in simpler terms, it is much faster and more convenient.
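To make the difference more concrete, here is a minimal sketch in Python. It is not MHServerEmu code (the server is written in C#), and the file paths, blueprint names, and function names are all made up for illustration. It only contrasts scanning every path for a pattern with looking up a prebuilt index of which prototypes reference which blueprint.

```python
from collections import defaultdict

# Hypothetical sample data: (file path, referenced blueprint) pairs.
# Real paths and blueprint names in the game data will differ.
PROTOTYPES = [
    ("Entity/Items/Costumes/IronMan/Modern.prototype", "Costume"),
    ("Entity/Items/Costumes/Storm_Classic.prototype", "Costume"),
    ("Entity/Items/Misc/OldOutfitThor.prototype", "Costume"),  # inconsistent naming
    ("Entity/Characters/Avatars/IronMan.prototype", "Avatar"),
]


def lookup_by_path(pattern):
    """Old approach: check every single path against a naming pattern."""
    return [path for path, _ in PROTOTYPES if pattern in path]


def build_blueprint_index(prototypes):
    """Group prototype paths by the blueprint they reference (done once)."""
    index = defaultdict(list)
    for path, blueprint in prototypes:
        index[blueprint].append(path)
    return index


BLUEPRINT_INDEX = build_blueprint_index(PROTOTYPES)


def lookup_by_blueprint(blueprint):
    """New approach: one dictionary lookup, no reliance on file names."""
    return list(BLUEPRINT_INDEX.get(blueprint, []))
```

In this toy data set the path-based lookup misses the inconsistently named costume, while the blueprint-based lookup finds all three.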

The prototype hierarchy is also going to be essential for implementing the property system, which is the foundation of all game entities, such as player avatars and enemies. We are going to go into more detail on this topic in a future report, once it is actually up and running.

Another thing this development has allowed us to do is make some charts. Exciting, am I right?

First we have a chart of the top 30 most used prototype classes. These use standard inheritance, so there is some overlap here (e.g. all world entity prototypes are also entity prototypes).

Top 30 Most Used Prototype Classes

Next are the top 30 most referenced blueprints. These function more like composition: entities at their core are collections of properties, such as their level or maximum health, and what properties an entity is supposed to have is defined by referencing individual property blueprints.

Top 30 Most Referenced Blueprints

However, while restoring the hierarchy is an important step, this is just one piece of the puzzle. Since there is a lot of interdependency between pieces of data, everything has to be loaded in the right order, or else things are going to get very messy very fast. So there has also been ongoing work on restoring various underlying systems that are going to allow us to do just that.

A good example of such a system is on-demand prototype loading. The naive approach we used to take was to load all 93144 prototypes on startup in the order they are stored. Besides loading in the wrong order, this also loads a lot of unnecessary data, including prototypes for cut and unfinished content, as well as debug prototypes that are disabled in shipping versions of the game. Although optimization is not really a high priority for us right now, it is still good to keep it in mind, especially when it comes as a side effect of doing something else.

So we have reverse engineered the solution Gazillion had: store all data in memory compressed using the LZ4 algorithm and decompress and deserialize only what is requested. By doing it this way we not only gain control over what is loaded and in what order, but also get some convenient savings on initial memory usage and initialization time. And this is despite the fact that we now spend additional memory and time on initializing and storing the hierarchy cache!
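The general idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the class and method names are hypothetical, JSON stands in for the real prototype format, and zlib stands in for LZ4, because LZ4 is not part of the Python standard library.

```python
import json
import zlib  # stand-in for LZ4, which is not in the Python standard library


class LazyPrototypeStore:
    """Keeps all records compressed in memory and deserializes on request."""

    def __init__(self, raw_records):
        # Every record is stored compressed; nothing is deserialized yet.
        self._compressed = {
            proto_id: zlib.compress(json.dumps(record).encode("utf-8"))
            for proto_id, record in raw_records.items()
        }
        self._loaded = {}

    def get(self, proto_id):
        """Decompress and deserialize a record the first time it is requested."""
        if proto_id not in self._loaded:
            data = zlib.decompress(self._compressed[proto_id])
            self._loaded[proto_id] = json.loads(data.decode("utf-8"))
        return self._loaded[proto_id]

    def preload(self, ordered_ids):
        """Optional frontloading: load everything up front in a chosen order."""
        for proto_id in ordered_ids:
            self.get(proto_id)

    @property
    def loaded_count(self):
        return len(self._loaded)
```

Nothing is deserialized until the first call to get(), and the caller decides both what gets loaded and in which order.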

MHServerEmu Initial Memory Usage

MHServerEmu Game Database Initialization Time

This is going to be especially useful for playing solo offline on lower end hardware. For potential larger scale servers this may cause hitching if too much new data is requested at once, so we have also implemented an option, present in the original game as well, to frontload all data in the correct order.

As we implement additional functionality, these numbers are going to gradually go back up. However, considering the amount of unused data I mentioned previously, even after loading all the relevant data it may end up using less memory than loading everything without storing a compressed copy.

Despite the great progress we have had with the game database this month, there is still work to be done. As I write this, we are approaching a point where we will be able to implement proper deserialization for Calligraphy prototypes, which is the biggest roadblock for working on game systems. When that happens, we are going to see more things happening in-game.

Region Generation Update

Meanwhile, as I have been working mostly on the game database, Alex continues his efforts on reimplementing the procedural region generation. If you have not done so yet, you can check his progress in the experimental branch.

We are lucky to have most of the logic for procedural generation in the client. However, implementing it still proved to be a very laborious process due to how intricate and occasionally buggy the original code is. Once all the pieces are done and put together, this, along with the game database, is going to make all regions in the game explorable and, where applicable, randomly generated, as they should be.

To keep everything as authentic as possible and avoid future replication issues, we try to stick closely to the disassembled original code. This includes reimplementing even the basic building blocks that are usually already taken care of by various existing game engines these days.

One example of such a building block is the quadtree, a data structure commonly used in video games for optimization. For instance, if you were to do collision detection without any tricks, you would have to check every single object against every other object, so the number of checks would grow quadratically with the number of objects in your game (the dreaded O(n²) algorithm, which gets out of hand very quickly).
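To put some numbers on this, here is a tiny Python sketch (the function names are just for illustration) that counts the pairs a brute-force check has to visit. For n objects this is n(n-1)/2 pairs, which grows on the order of n².

```python
from itertools import combinations


def brute_force_checks(object_count):
    """Count the pairs visited when every object is checked against every other."""
    return sum(1 for _ in combinations(range(object_count), 2))


def pair_formula(object_count):
    """Closed form for the same count: n * (n - 1) / 2."""
    return object_count * (object_count - 1) // 2


for n in (10, 100, 1000):
    print(n, "objects:", brute_force_checks(n), "checks")
# 10 objects: 45 checks
# 100 objects: 4950 checks
# 1000 objects: 499500 checks
```

Ten times as many objects means roughly a hundred times as many checks.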

But when you think about it, it does not make any sense to check for collisions against faraway objects. So it is common to divide space into sections, and do checks within each section separately. This is where quadtrees come in: they are used to divide space into four sections (quads), but they are also trees, so each section can have subsections of its own. This may be a little difficult to grasp in writing, so here is a visualization of collision detection using quadtrees I found on YouTube.
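The subdivision idea itself can also be sketched in code. What follows is a generic textbook-style point quadtree in Python, not the Marvel Heroes implementation: the class name, the capacity parameter, and the minimum quad size are all assumptions made for this example.

```python
class Quadtree:
    """Point quadtree covering the square [x, x + size) x [y, y + size)."""

    def __init__(self, x, y, size, capacity=4):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity  # points a quad holds before it splits
        self.points = []
        self.children = None      # four sub-quads once this quad splits

    def contains(self, px, py):
        return (self.x <= px < self.x + self.size
                and self.y <= py < self.y + self.size)

    def insert(self, px, py):
        """Insert a point; returns False if it lies outside this quad."""
        if not self.contains(px, py):
            return False
        if self.children is None:
            # Quads of size <= 1 never split, so duplicates cannot recurse forever.
            if len(self.points) < self.capacity or self.size <= 1:
                self.points.append((px, py))
                return True
            self._split()
        # any() stops at the first child that accepts the point.
        return any(child.insert(px, py) for child in self.children)

    def _split(self):
        """Create four sub-quads and move the stored points into them."""
        half = self.size / 2
        self.children = [
            Quadtree(self.x, self.y, half, self.capacity),
            Quadtree(self.x + half, self.y, half, self.capacity),
            Quadtree(self.x, self.y + half, half, self.capacity),
            Quadtree(self.x + half, self.y + half, half, self.capacity),
        ]
        old_points, self.points = self.points, []
        for px, py in old_points:
            self.insert(px, py)

    def query(self, qx, qy, qsize):
        """Return all points inside the square [qx, qx + qsize) x [qy, qy + qsize)."""
        # Skip this entire quad (and its subtree) if it does not overlap the query.
        if (qx >= self.x + self.size or qx + qsize <= self.x
                or qy >= self.y + self.size or qy + qsize <= self.y):
            return []
        found = [(px, py) for px, py in self.points
                 if qx <= px < qx + qsize and qy <= py < qy + qsize]
        if self.children is not None:
            for child in self.children:
                found.extend(child.query(qx, qy, qsize))
        return found
```

A collision check for one object then only needs to query the area around it, and whole quads that do not overlap that area are skipped without looking at their contents.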

Although the Marvel Heroes game client is built using Unreal Engine 3, the game logic actually running everything is all custom Gazillion code. In a way this resembles the semi-recently released Diablo II Resurrected, where the original game is running underneath a new graphical engine. Except in the case of Marvel Heroes the “original game” does not have any graphics of its own.

So when we reimplement things such as quadtrees, we have to figure out and take into account all the original quirks. For example, the Marvel Heroes implementation of this data structure uses what is known as an intrusive circular linked list for storing its elements: each element is linked to elements before and after it, these links are stored within the element itself rather than externally (which is why it is intrusive), and the link to the next element at the end of the list points back to the first element (circular).
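For readers who have not come across this structure before, here is a minimal Python sketch of the concept. It is a generic illustration with made-up names, not a reconstruction of the original code.

```python
class Element:
    """An element that carries its own links, which is what makes the list intrusive."""

    def __init__(self, name):
        self.name = name
        self.prev = self  # an element outside of any list links to itself
        self.next = self


class IntrusiveCircularList:
    def __init__(self):
        self.head = None

    def append(self, element):
        """Link the element in just before the head, i.e. at the end of the list."""
        if self.head is None:
            element.prev = element.next = element
            self.head = element
            return
        tail = self.head.prev
        tail.next = element
        element.prev = tail
        element.next = self.head  # the last element points back to the first
        self.head.prev = element

    def remove(self, element):
        """Unlink an element. Assumes the element is currently in this list."""
        if element.next is element:
            self.head = None  # it was the only element
        else:
            element.prev.next = element.next
            element.next.prev = element.prev
            if self.head is element:
                self.head = element.next
        element.prev = element.next = element

    def __iter__(self):
        node = self.head
        if node is None:
            return
        while True:
            yield node
            node = node.next
            if node is self.head:  # wrapped around, so we are done
                break
```

Because the links live inside the elements, removing an element needs no search through the list and no separate node objects.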

This is just one example of the work that goes into restoring this system. The good news is, region generation is also steadily approaching a testable state, and once everything is ready to combine it with the game database implementation, we are going to have a lot of fun.

Core System Improvements

In December I have also taken some time to do a pass on some of the foundational systems of MHServerEmu that could have used some extra work.

Previously we used a TCP server implementation ported from a Diablo III Beta server emulator that was in development in 2011-2012. This is the part of the server that handles client connections, as well as receiving and sending data. While this implementation does get the job done, it never got the chance to get polished, and it was built on an obsolete design pattern known as the Asynchronous Programming Model (APM). So this month I have written a new implementation based on the newer and hipper Task-based Asynchronous Pattern (TAP), which is hopefully going to be cleaner and more maintainable.

Another piece ported from the Diablo III server is our logging system. I was mostly happy with it; however, one issue it had was that it was synchronous, meaning that while something was being logged nothing else could be happening on the caller thread. This was an issue for logging large quantities of messages from the main game thread, which is supposed to update on a 50 millisecond cycle: instead of the game running, everything had to wait until messages finished printing in the console window. This problem is gone: log messages are now enqueued synchronously, allowing us to get accurate timestamps for each message, but the output happens asynchronously as a separate task. So now the impact of logging on game performance is just about as low as it can get without compromises.
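The pattern looks roughly like this in Python. Treat it as a sketch of the idea rather than our actual logger: the class name and the sink parameter are hypothetical, and a background thread stands in for the separate task.

```python
import queue
import threading
import time


class AsyncLogger:
    """Timestamps and enqueues on the caller thread, writes on a background thread."""

    def __init__(self, sink=print):
        self._queue = queue.Queue()
        self._sink = sink  # where formatted lines go (the console by default)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, message):
        # The timestamp is taken here, at enqueue time, so it stays accurate
        # even if the actual output happens later.
        self._queue.put((time.time(), message))

    def _drain(self):
        # Runs on the background thread: slow output never blocks the caller.
        while True:
            item = self._queue.get()
            if item is None:  # sentinel pushed by close()
                break
            timestamp, message = item
            self._sink(f"[{timestamp:.3f}] {message}")

    def close(self):
        """Flush everything that is still queued and stop the worker."""
        self._queue.put(None)
        self._worker.join()
```

The caller only pays for taking a timestamp and putting an item on a queue, and messages still come out in the order they were logged.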

Finally, as a test for doing Calligraphy prototype deserialization later on, I have improved our configuration system using the reflection capabilities of C#. This is not something that is going to affect the end user, but it has made adding and removing various server settings easier.
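To give an idea of what reflection buys us here, below is a rough Python analogue that uses dataclass introspection in place of C# reflection. The setting names, default values, and section name are invented for the example and are not actual MHServerEmu settings.

```python
import configparser
from dataclasses import dataclass, fields


@dataclass
class ServerConfig:
    # Hypothetical settings; adding a new one only takes one more line here.
    bind_ip: str = "127.0.0.1"
    port: int = 8080
    enable_timestamps: bool = True


def load_section(config_cls, parser, section):
    """Fill a config object by walking its declared fields via introspection."""
    config = config_cls()
    if not parser.has_section(section):
        return config  # keep the defaults
    for field in fields(config_cls):
        if not parser.has_option(section, field.name):
            continue
        default = getattr(config, field.name)
        # Pick a parser based on the type of the default value.
        # bool has to be checked before int because bool is a subclass of int.
        if isinstance(default, bool):
            value = parser.getboolean(section, field.name)
        elif isinstance(default, int):
            value = parser.getint(section, field.name)
        else:
            value = parser.get(section, field.name)
        setattr(config, field.name, value)
    return config
```

The loader never mentions individual settings by name, so adding or removing a setting means touching only the class definition.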

The end goal is to have something that is not just capable of running the game, but also maintainable and reasonably fast, so expect the work on this front to continue.

Miscellaneous Updates

Here are some additional points I would like to highlight without dedicating entire sections to them:

  • We now have a tool for unpacking and packing .sip files. This is the first step for modding game data files. You can grab the source code for it here.

  • The work on documenting various versions of the game mentioned in the previous report continues. There is now a new repository for documenting the evolution of the game’s network protocol.


That’s it for today. A lot has happened this year, and even more exciting stuff awaits us in the next. See you next time!