Spring break – development log #170

Mjeno · February 17, 2019, 7:58pm

It’s Martin speaking. Spring hit Germany this weekend! Out of the blue (or rather, the grey), we have clear skies, sunny weather and crowds of people flooding parks and promenades. Those crowds include Mjeno, which is why I have the honor of writing the intro to this week’s devlog!

You can find the full issue of the development log here: link

Kodos · February 17, 2019, 9:24pm

Instead of spending time writing code for every specific metric, why isn’t every single action already logged/monitored in an easily-exportable format?

Failing that, why not just dump records of everything into a database of some form? All of the game stats and statistics mentioned in dev blogs and chats can be implemented with a tiny number of SQL queries, for example, or simple analysis with any stats package.

molp · February 18, 2019, 6:16am

There are limits due to the sheer amount of actions. Logging everything is not feasible.

We are using an event sourcing architecture to store the game’s state. This means that we already store every event of the game in the database. The events are stored per entity (think company, player, corporation, …). So statistics that encompass multiple entities cannot simply be queried from the database.

While the write side of our architecture is incredibly simple, the read side is more complicated.
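To illustrate the point about per-entity event streams, here is a minimal sketch of event sourcing. It is not the game's actual code: the `Event` fields, the `EventStore` class, and the "deposit"/"withdraw" event types are all hypothetical names made up for the example. It shows that appending an event is trivial, while reading state means replaying one entity's stream, and a statistic across entities means replaying every stream.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape, assumed for illustration only.
    entity_id: str
    kind: str      # "deposit" or "withdraw"
    amount: int


class EventStore:
    """In-memory stand-in for an append-only store keyed by entity."""

    def __init__(self):
        self._streams = defaultdict(list)

    def append(self, event):
        # Write side: just append to the entity's own stream.
        self._streams[event.entity_id].append(event)

    def stream(self, entity_id):
        # Return a copy of one entity's events, in insertion order.
        return list(self._streams.get(entity_id, []))

    def entity_ids(self):
        return list(self._streams)


def balance(events):
    """Read side: rebuild one entity's state by folding over its events."""
    total = 0
    for e in events:
        total += e.amount if e.kind == "deposit" else -e.amount
    return total


store = EventStore()
store.append(Event("company-a", "deposit", 100))
store.append(Event("company-a", "withdraw", 30))
store.append(Event("company-b", "deposit", 50))

# State of a single entity: replay only that entity's stream.
balance_a = balance(store.stream("company-a"))

# A statistic spanning entities: every stream has to be replayed.
total_balance = sum(balance(store.stream(eid)) for eid in store.entity_ids())
```

In this sketch the cross-entity total costs one full replay per entity, which is the kind of read-side work a plain query against the stored events cannot do on its own.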

Kodos · February 18, 2019, 10:34pm

I’m sorry that you’re using an architecture that makes moderately difficult things easier, but that makes easy things effectively impossible.

As a thought experiment, if you take the top 20 stats about the game that you wished you had, how hard would it be to express them as SQL queries or using R/etc, assuming that the game’s current state and/or history was stored in a queryable form? It’s hard to overestimate the value of being able to perform ad-hoc queries and analysis over your data.

If the current architecture is really causing so many problems, have you considered post-processing the event data stored in the “database” (actually serialized opaque event blobs stored in an append-only row storage system, based on previous dev comments) to extract a subset of data into an easily queryable form? That might give you the flexibility to experiment with ad hoc queries and analysis at a very low incremental cost.

Having no easy way to query/introspect on the game’s state and history means that abuse detection is effectively impossible, which I assume you’ve already noticed given how much abuse has been ignored.

I’m amazed and perplexed that a game with a few dozen users can generate so much data as to make logging infeasible. If there were a few tens of millions of simultaneous users, maybe. How many terabytes per day of data is the game generating, and why? The costs for running the game must be very high–I’d have expected a game this small and with so few users would run just fine on a Raspberry Pi or equivalent. Are there specific gameplay patterns that players are using that are causing problems so far beyond what was apparently expected? Will any of the planned features on the roadmap improve the situation?

molp · February 19, 2019, 6:40am

Actually, the architecture works really well for us! We have far fewer technical issues than we expected we’d have.

Yes, we are working on that.

Right now the game could run on a single server, no problem. With the experience from our other game we decided to try another architecture, since scaling SQL databases is incredibly hard. With our vision in mind (one game world, no restarts) we will reach the limits of SQL-based architectures sooner rather than later.
At the moment it is possible to simply duplicate all the generated events and put them into an easily queryable database, but some processing, from the list of events to the complete state of an entity, is still necessary. Later on (more players, more events) this will not be feasible. Therefore we do some processing within the game and only export what we need.
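
As a rough sketch of what "duplicating the events into a queryable database" could look like, the snippet below copies a few events into an in-memory SQLite table and derives per-entity state with one query. The table layout, the event types, and the sample data are assumptions for illustration, not the game's real schema or export format.

```python
import sqlite3

# Hypothetical exported events: (entity_id, kind, amount).
events = [
    ("company-a", "deposit", 100),
    ("company-a", "withdraw", 30),
    ("company-b", "deposit", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (entity_id TEXT, kind TEXT, amount INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)

# The "processing from list of events to state" step, expressed in SQL:
# fold each entity's events into a single balance.
rows = conn.execute(
    """
    SELECT entity_id,
           SUM(CASE WHEN kind = 'deposit' THEN amount ELSE -amount END) AS balance
    FROM events
    GROUP BY entity_id
    ORDER BY entity_id
    """
).fetchall()

for entity_id, entity_balance in rows:
    print(entity_id, entity_balance)
```

Note that the query still replays every event on each run, so this approach gets more expensive as the event count grows, which matches the concern about feasibility later on.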