Advice for modeling a database for multiple games

I have multiple games I want to store data for. The data for each player is stored in a single document with a UUID to identify it. The two options I have thought of so far are:

  1. Have one collection per game.
  2. Have one collection for all games.

The problem with option 1 is that I do not know how I would get all stats for all games per player while having good performance. Sending a query to each collection simultaneously to get all their stats seems wrong, although I think this is the best option so far.

For option 2, I would not be able to create indexes on the fields that need to be indexed, because the number of games and the fields per game could be expanded at any time. There would also be a very large number of read and write operations on a single collection, which seems very wrong.

Does anyone have any advice on this?

Hi @Henrik ,

Having a collection per game doesn’t sound right. Having too many collections and indexes is a known anti-pattern, and we should avoid it as much as possible.

MongoDB locking is per document, so writing similar objects to the same collection should not interfere with concurrency as long as you hit different documents.

Additionally, as your database grows and you want to shard the deployment, sharding a single collection makes more sense than sharding lots of small ones.

You might consider a hybrid solution if the single collection doesn’t work, such as one collection per month or per 6 months for all games.
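
A minimal sketch of that time-bucketed variant, assuming one collection per half-year shared by all games. The `stats` prefix, the function name, and the naming scheme are made up for illustration; only the naming logic is shown, with no driver calls.

```python
from datetime import date


def bucket_collection_name(day: date, months_per_bucket: int = 6, prefix: str = "stats") -> str:
    """Return the name of the time-bucketed collection that holds data for `day`.

    Hypothetical naming scheme: <prefix>_<year>_<bucket index within the year>.
    """
    if months_per_bucket < 1 or 12 % months_per_bucket != 0:
        raise ValueError("months_per_bucket must divide 12")
    # Months 1..12 are mapped onto buckets 1..(12 / months_per_bucket).
    bucket = (day.month - 1) // months_per_bucket + 1
    return f"{prefix}_{day.year}_{bucket:02d}"


print(bucket_collection_name(date(2023, 3, 14)))    # stats_2023_01
print(bucket_collection_name(date(2023, 9, 1)))     # stats_2023_02
print(bucket_collection_name(date(2023, 9, 1), 1))  # stats_2023_09
```

Reads that span several periods would then need to query more than one bucket, which is the trade-off of this layout.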

Now, if you have a large number of attributes that you need to index for search, you may consider the attribute pattern for each game.
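
As a rough illustration of the attribute pattern, the sketch below turns game-specific stats into an array of key/value pairs so that one compound index can cover every attribute. The field names (`playerId`, `game`, `attrs`, `k`, `v`) and the sample stats are assumptions for the example, not a prescribed schema; the code only builds plain dictionaries and does not talk to a database.

```python
from typing import Any


def to_attribute_pattern(player_id: str, game: str, stats: dict[str, Any]) -> dict[str, Any]:
    """Build one document per player and game, with game-specific stats as k/v pairs."""
    return {
        "playerId": player_id,
        "game": game,
        "attrs": [{"k": name, "v": value} for name, value in sorted(stats.items())],
    }


# One compound index covers every attribute, whatever games are added later.
# With PyMongo, this list could be passed to Collection.create_index().
ATTR_INDEX_KEYS = [("attrs.k", 1), ("attrs.v", 1)]


def attr_filter(name: str, minimum: Any) -> dict[str, Any]:
    """Build a filter for documents where attribute `name` is at least `minimum`."""
    # $elemMatch makes both conditions apply to the same array element.
    return {"attrs": {"$elemMatch": {"k": name, "v": {"$gte": minimum}}}}


# Hypothetical player and stats, for illustration only.
doc = to_attribute_pattern("player-uuid", "SpeedDrill", {"Laps": 7, "Crashes": 2})
print(doc["attrs"])
print(attr_filter("Laps", 5))
```

New games or new stats then add array elements rather than new top-level fields, so no new index is needed for them.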

I suggest reading the following articles:

https://www.mongodb.com/article/schema-design-anti-pattern-summary/

https://www.mongodb.com/article/mongodb-schema-design-best-practices/

Best
Pavel

Hi @Pavel_Duchovny,

Thanks for the good references for Henrik’s problem. I have a somewhat similar one, and did some research on patterns.

I have a follow-up question and thought it would be OK to continue in this thread; feel free to move it or ask me to create a new topic.

I’m tracking users’ progress as follows. A user has a few paths to complete. Each path has a few stages to complete, which are basically different games.

The data is used to give feedback to the user and also for analytical purposes, and is exported to OLAP (BQ). I was also thinking about keeping each game’s data in its own collection, but Pavel advised against that approach. There would be fewer than fifty different games.

I was thinking of something like the following, but started wondering whether all the stage data should be stored as key-value pairs. Also, if I want more detailed gameplay data for analytical purposes at some point, maybe that data should go in its own collection, to avoid hitting the 16 MB document size limit and because the app does not access it?


"Paths":
{
  "PathOrder": ["Path1", "Path3", "Path4", "Path2"],
  "Path1":
  {
    "Stage1":
    {
      "GameName": "SpeedDrill",
      "PlayTime": 92,
      "FirstPlayTime": 168123123,
      "LastPlayTime": 168124567,
      "SpecData":
      [
        {"key": "Crashes", "value": 2},
        {"key": "AvgSpeed", "value": 65},
        {"key": "Laps", "value": 7}
      ],
      "completed": false,
      "attempts": 1
    },
    "Stage2":
    {
      "GameName": "AccuracyDrill",
      ...
    }
  },
  "Path2":
  ...
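
A rough sketch of the split asked about above, assuming the detailed gameplay data sits in a hypothetical `Events` list on each stage record. The function name, the `Events` field, the event fields, and the reference fields (`playerId`, `path`, `stage`) are all made up for illustration; the code only reshapes dictionaries and makes no database calls.

```python
from typing import Any


def split_stage(player_id: str, path: str, stage: str, record: dict[str, Any]):
    """Split one stage record into an app-facing summary and analytics-only events.

    `Events` is a hypothetical field holding detailed gameplay data; the other
    field names follow the document sketch above.
    """
    summary = {k: v for k, v in record.items() if k != "Events"}
    ref = {"playerId": player_id, "path": path, "stage": stage}
    # Each event becomes its own small document, so the player document
    # does not grow with every play session.
    events = [{**ref, **event} for event in record.get("Events", [])]
    return summary, events


# Hypothetical stage record, for illustration only.
record = {
    "GameName": "SpeedDrill",
    "attempts": 1,
    "Events": [{"t": 3, "type": "crash"}, {"t": 9, "type": "lap"}],
}
summary, events = split_stage("player-uuid", "Path1", "Stage1", record)
print(summary)
print(events[0])
```

The summary would stay in the player document that the app reads, while the event documents could go to a separate collection used only for the analytics export.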

I flagged the first post as spam rather than the last one.

I do not see the option to undo my mistake.

  1. Consider Hybrid Approach: Create a separate collection for player data and a collection for each game. Store player-related data in the player collection and game-specific data in respective game collections. This allows for efficient retrieval of player stats while maintaining game-specific details.
  2. Use Indexing Wisely: For the player collection, focus on indexing fields crucial for player-related queries. For game collections, consider indexing game-specific fields. This balances performance without excessive indexing on a single collection.
  3. Optimize Query Patterns: Design your queries to minimize the number of database hits. Utilize MongoDB’s aggregation framework to efficiently retrieve relevant data across multiple collections, reducing the need for simultaneous queries.
  4. Keep Collections Lean: Avoid unnecessary duplication of data across collections. Store shared player information in the player collection and only game-specific details in game collections. This helps in managing data consistency and reduces storage overhead.
  5. Monitor and Scale: Regularly monitor database performance and adjust your approach based on evolving needs. If performance issues arise, consider sharding, caching, or other scaling strategies to ensure efficient handling of growing data volumes.

This looks a lot like a ChatGPT-created post. A lot of words, nothing really specific. Maybe you can elaborate.

I do not really understand the hybrid part of point 1. Player info into a player collection and game info in a game collection.

Could you provide examples of game-specific fields?

As for point 3, how could we use MongoDB’s aggregation framework to retrieve relevant data across multiple collections?

By the way I flagged your other posts as spam and I started to follow you to catch your next attempt faster.