Weekend project: Destiny 2 account tracker (feat. improved metrics infrastructure)


Following the theme set by the previous post, I have continued my pursuit of improved infrastructure. Not that any of it is really anything special yet, just more services with almost-default configs. But the idea is that these services will form some kind of stable core for many other services to follow, and hopefully evolve over time to become even more dependable. At this point they just need to “be out there” and usable so that I can test new ideas.

One of these ideas is about finally upgrading how I collect timeseries data. Over the years I’ve had several tiny data collection projects, each implementing the storage of the data in a different way. I’ve reinvented that wheel so many times already, and it is about time to stop. Or at least try to do it a bit less :p Also, in the previous stage I installed MongoDB, so for this stage I thought it was time to also install a relational database, and PostgreSQL has been my absolute favourite on that front for a while now.

Meanwhile, after doing a tiny bit of research on storing timeseries data, I found TimescaleDB. And what a coincidence, it is a PostgreSQL extension! I think we’ll be BFFs! That is, once it supports PostgreSQL 12... I wanted to install the latest version of Postgres so that I get to enjoy whatever new features it has, but mostly so that I could avoid a version upgrade from 11 to 12, had I chosen the older version. Not that it would likely have been a big problem. Anyway, the data can easily be stored in a format TimescaleDB would expect, and it shouldn’t balloon to sizes that absolutely require the acceleration structures TimescaleDB provides before the extension is updated for 12. Rather, this smaller dataset should be perfectly usable with just a plain PostgreSQL server. Upgrading should then be just a matter of installing the extension and running a few commands. Avoiding one upgrade by performing another, oh well…
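
For future reference, the upgrade should (in theory) be roughly the following; the table name here is just a placeholder for whatever the metrics library ends up creating:

-- Install the extension into the existing database and convert the
-- timeseries table into a hypertable, keeping the already-stored rows.
CREATE EXTENSION IF NOT EXISTS timescaledb;
SELECT create_hypertable('metric_values', 'time', migrate_data => TRUE);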

In the far past I’ve used both custom binary files and text files containing lines of JSON to store timeseries data like hardware or room temperatures. More recently I’ve used SQLite databases to keep track of stored energy and items on a modded Minecraft server (Draconic Evolution Energy Core + AE2, an OpenComputers Lua script, and a Dockerized TCP host (there was not enough RAM in the OC computer to serialize a full JSON in-memory)). I should try to add some pictures if I happen to find them…

For visualizing the data in the past I’ve used either generated Excel sheets or generated JavaScript files with whatever visualization library I found. Not very nice.

* * *

But let’s get to the point, as there was a reason I wanted to improve data collection this time: I finally got around to checking out the Destiny 2 API in more depth, and built a proof of concept of an account tracker.

For those poor souls who don’t know what Destiny 2 is: it is a relatively multifaceted MMOFPS, and I’ve been playing it since I got it from a Humble Monthly (quite a lot :3). As with any MMO, there is a lot to do, with an almost endless amount of both short- and long-term goals. It made sense to build a tracker of some sort so that I could feel even more pride and accomplishment for completing them, and, in the case of some specific goals, maybe even see which strategy works best and which doesn’t. The API provides near-realtime statistics on a great many things, and it would be nice to be able to visualize everything in real time as well.

To accomplish this I needed multiple things: authentication, getting the data, storing the data, and lastly visualizing it.

Authentication in the API is handled via OAuth, so I needed to register my application on Bungie’s API console and set up a redirection URL for my app. After this I could generate a login link pointing to the authorize endpoint of Bungie’s API. After login, Bungie redirects back to my application with a code in the query string. This code can then be posted, form-url-encoded, to Bungie’s token endpoint, which requires basic authentication with the app’s client id and secret. After all this the reply contains an access token (valid for one hour) and a refresh token that can be used to get a new access token (valid for a few months, but reset on larger patches). The access token can then be used to call the API for that specific account. This would probably be a great opportunity to opensource some of the code…
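
Roughly, the code-for-token exchange looks something like the following. This is a simplified sketch rather than the exact code I run, and the endpoint URL and field names are from my reading of Bungie’s OAuth documentation:

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

// Exchange the code from the redirect for an access token + refresh token.
static async Task<string> ExchangeCodeAsync(string clientId, string clientSecret, string code)
{
    using var http = new HttpClient();

    // Basic authentication with the app's client id and secret
    var basic = Convert.ToBase64String(Encoding.UTF8.GetBytes($"{clientId}:{clientSecret}"));
    http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", basic);

    // The code is posted as form-url-encoded
    var form = new FormUrlEncodedContent(new Dictionary<string, string>
    {
        ["grant_type"] = "authorization_code",
        ["code"] = code,
    });

    var reply = await http.PostAsync("https://www.bungie.net/platform/app/oauth/token/", form);
    reply.EnsureSuccessStatusCode();

    // The JSON reply contains access_token, expires_in, refresh_token and refresh_expires_in
    return await reply.Content.ReadAsStringAsync();
}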

Speaking of which, there already exist some open-source libraries for using the API! I didn’t look into them yet, as I was most unsure about how the authentication would work. I guess I should take a look now.

The process of figuring out how the authentication works involved quite a bit of stumbling in the dark. The documentation wasn’t equally clear at every step, although at least it did exist. On the other hand I’d never really used OAuth before, so there was quite a bit of learning to do.
This also presented one nice opportunity to put all this infrastructure I’m building to good use! The OAuth flow includes the concept of the application’s redirection URL, but in the case of a script there really isn’t any kind of permanent address for it. So what do? I haven’t implemented it yet, but I think a nice solution would be to create a single serverless endpoint for passing the code forward. While I haven’t yet talked about it, I’m planning on using NATS (a pub-sub broker with optional durability) for routing and balancing many kinds of internal traffic. In this case an app could listen to a topic like /reply/well-known/oauth-randomstatehere. When the remote OAuth implementation redirects back to the serverless endpoint, the endpoint publishes the code to that topic, and the app receives it. All this without the app needing a dedicated endpoint of its own! It seems that someone really thought things through when designing OAuth. And as a bonus, that code is short-lived and must only be used once, so it can be safely logged as part of traffic analysis.
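
On the script side the idea would look roughly like this. Again, not implemented yet, so this is just a sketch using the NATS.Client package; the subject name, server address and timeout are made up for the example:

using System;
using System.Text;
using NATS.Client;

// A per-login random state doubles as the reply topic suffix.
var state = Guid.NewGuid().ToString("N");
var topic = "reply.well-known.oauth-" + state;

using var nats = new ConnectionFactory().CreateConnection("nats://nats.internal:4222");
using var sub = nats.SubscribeSync(topic);

// Hand out the login link containing `state`, then block until the serverless
// endpoint publishes the code from the redirect (or give up after two minutes).
var msg = sub.NextMessage(120_000);
var code = Encoding.UTF8.GetString(msg.Data);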

Reading the game data is just a matter of sending some API requests with the access token from earlier and then parsing the results. At the moment I am only utilizing a fraction of what the API has to offer, so I can’t really tell much. Right now this means the profile components API with components 104, 202 and 900. This returns the status of account-wide quests and “combat record” counters, which can be used to track weapon catalyst progression. I’m reducing this data to key-value pairs: each objective has an int64 key called “objectiveHash”, and another int64 as the value. The same goes for the combat record data. At the moment I’m using a LINQPad script that I start when I start playing, but in the future I’d like to move this into a microservice. That service could ideally poll some API endpoint to see whether I’m online in the game, and only then call the more expensive API methods. Not that it would probably be a problem, but I’d like to be nice.
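
The polling call itself is along these lines; the membership type and id are placeholders, and I’ve left the actual JSON traversal out since it is mostly just digging through the component structure:

using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading.Tasks;

static async Task<Dictionary<long, long>> ReadObjectivesAsync(
    HttpClient http, string apiKey, string accessToken, int membershipType, long membershipId)
{
    http.DefaultRequestHeaders.Clear();
    http.DefaultRequestHeaders.Add("X-API-Key", apiKey);
    http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);

    // Profile components 104, 202 and 900, as mentioned above
    var url = $"https://www.bungie.net/Platform/Destiny2/{membershipType}/Profile/{membershipId}/"
            + "?components=104,202,900";
    var json = await http.GetStringAsync(url);

    // Reduce the response to objectiveHash -> progress pairs.
    var values = new Dictionary<long, long>();
    using var doc = JsonDocument.Parse(json);
    // ... walk doc.RootElement through the requested components and collect the pairs ...
    return values;
}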

Data is saved to the PostgreSQL database. I wrote a small shared library abstracting the metrics database queries (and another for general database stuff), so writing the values is now very simple. The shared library could be used for writing other data too, like the temperatures and energy amounts I mentioned above. I should probably add better error handling, so that a lost connection could be retried automatically without intervention from the code using the library. But anyway, here is how it is used:

var worker = new PsqlWorker(dbConfig);         // general database library
var client = new MetricsGenericClient(worker); // metrics library
var last_progress = /* fetch the previously stored value via the client */;
// ...
var id = await client.GetOrCreateMetricCachedAsync("destiny2.test." + objectiveHash); // result is cached in-memory after the first call
if (progress != last_progress) // compress the data by dropping unchanged values
{
    await client.SaveMetricAsync(id, progress);
    last_progress = progress;
}

Visualizing the data was next. I have been jealously eyeing Grafana dashboards for a long time, but never had the time to set something up. There was one instance a few years ago with Tracker3 where I stumbled around a bit with Netdata and Prometheus, but that didn’t really stick. Now I did some quick research on Grafana, and everything became clear.

Grafana is just a tool for visualizing data stored elsewhere. It supports multiple data sources, each with slightly different use cases. I’m still not exactly sure what kind of aggregation optimizations are possible when viewing larger datasets at once, but I kind of just accepted that it doesn’t matter, especially since most of the time I’d be viewing the most recent data. What I also had to accept was that Grafana doesn’t automagically create the pretty dashboards for me, and that I’d have to put in some effort there. But not too much: adding a graph is just a matter of writing a relatively simple SQL query and slapping the time macro into the SELECT clause. And then the graph just appears. For visualizing the number of total kills with a weapon, this is about as complicated as it gets. For counters displaying the current value it likewise was just a matter of writing the SQL query with ORDER BY time DESC LIMIT 1.
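
For the curious, a panel query against the PostgreSQL data source ends up looking roughly like this; the table and column names below are made up for illustration, since the real ones come from the metrics library:

-- Time series panel: progress over time for one tracked objective
SELECT
  $__time(v.time),
  v.value AS "kills"
FROM metric_values v
JOIN metrics m ON m.id = v.metric_id
WHERE m.name = 'destiny2.test.1234567890'
  AND $__timeFilter(v.time)
ORDER BY v.time;

-- Single-stat panel: just the latest stored value
SELECT v.value
FROM metric_values v
JOIN metrics m ON m.id = v.metric_id
WHERE m.name = 'destiny2.test.1234567890'
ORDER BY v.time DESC
LIMIT 1;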

And while I was at it, I also added a metric for the duration of the API calls. I also remembered that Grafana supports annotations, which can likewise be saved to Postgres. And the dashboard started to really look like something! Currently there’s one graph for my “favourite” things, and then another that just visualizes everything that is changing.


And why stop there? I also installed Telegraf for collecting system metrics such as CPU and RAM utilization and ping times. I went with the simplest approach of installing InfluxDB for this data, as there were some ready-made dashboards for that combination. More services, more numbers, more believable stack :S

* * *

That’s it. No fancy conclusions. See you next time. I’ve been using this system for only a week or two now, so maybe in the future I’ll have some kind of deeper analysis to give. Maybe. And maybe I’ll get to refine the account tracker a bit more, so that I could consider maybe (again, maybe) opensourcing it.

PS. These posts are probably not very helpful if you are trying to set up something like this yourself. Well, there’s a reason: these are blog posts, not tutorials. I don’t want to claim to know so much that I’d dare to create a tutorial. Although… some tutorials out there are very bad; I’m sure I could do better than those.
