Social Media Following Implementation using Mnesia?

Abdelghani · August 20, 2022, 3:54pm

Hello everybody, I hope you are all fine and in good health.
Consider a social media database that we want to implement via Mnesia (key->value table). Each record represents a person, and each person is followed by other persons. How can we implement this?
1. By a record field followers, which is a list. I don’t think this is a good idea, because the list may contain millions of followers.
2. By another table of type bag with 2 fields: followed key and follower key. But then we depart from the Mnesia principle, and this is a sort of relational database.

juhlig · August 20, 2022, 6:28pm

I have never used Mnesia, so this is just my two cents, take it with a grain of salt.

Personally, I would always go with (2), as it also allows you to easily query for “who is person X following”, on top of “who is following person Y”. You will need this when a person is deleted: if you went with (1), that person would have to be removed from the follower lists of all other persons, which would be quite expensive.

Also, just because you are using a non-relational database does not mean that you should avoid relations at all costs. You just have to model them differently.

Abdelghani · August 20, 2022, 10:53pm

But won’t you lose all the key-value database benefits, like scalability and concurrent dirty access?

juhlig · August 22, 2022, 11:50am

Hm, I’m not sure what you mean. But maybe we are talking about different things here… You mentioned the “Mnesia principle” in your opening post, can you briefly explain to me what you mean by that?

Anyway. A relation as I outlined would only exist in the application using the Mnesia database. As far as Mnesia is concerned, there are just two tables, one with person records and one with 2-tuples of terms (keys in the person table). That the terms in the tuples are keys in the person table is only known to your application, Mnesia doesn’t care. So as far as Mnesia is concerned, a database like that scales just as well as if the tables inside had nothing at all to do with each other.
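
A minimal sketch of what those two tables could look like, assuming RAM-only tables on a single node. The module name, record names, and field names are hypothetical and chosen only for illustration. The secondary index on the follower field is an extra assumption, added so that the table can also be searched by follower.

```erlang
-module(follow_schema).
-export([init/0]).

%% Hypothetical records; the names and fields are assumptions.
-record(person, {id, name}).
%% One object per "Follower follows Followed" pair.
-record(follows, {followed, follower}).

init() ->
    %% Without a disc schema, Mnesia starts with a RAM-only schema.
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(person,
        [{attributes, record_info(fields, person)}]),
    %% A bag table may hold many objects under the same key (the followed
    %% person). To Mnesia, both fields are just terms.
    {atomic, ok} = mnesia:create_table(follows,
        [{type, bag},
         {attributes, record_info(fields, follows)},
         {index, [follower]}]),
    ok.
```

Calling `follow_schema:init()` a second time in the same node fails on the `{atomic, ok}` match, because the tables already exist. A real application would handle that case.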

What may be a bother is that this way, you have to generate keys for the persons to use in the follower table. And if you change a person’s key or delete the person, you have to provide code in your application that also updates the follower table. And so on.

About dirty operations, you can still use them, but you have to be careful and decide for yourself in the code that uses the database when it is safe to use them and when not.
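
As a rough illustration of dirty access against a two-field bag table, here is a sketch with a hypothetical module, table, and record. Whether skipping transactions is acceptable here is an assumption that each application has to check for itself.

```erlang
-module(follow_dirty).
-export([init/0, follow/2, followers/1]).

%% Hypothetical record for the who-follows-who table.
-record(follows, {followed, follower}).

init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(follows,
        [{type, bag}, {attributes, record_info(fields, follows)}]),
    ok.

%% A single independent write: no read-modify-write cycle is involved,
%% so there is no earlier read whose result could be stale.
follow(Follower, Followed) ->
    mnesia:dirty_write(#follows{followed = Followed, follower = Follower}).

%% A dirty read takes no locks; it returns whatever is stored right now.
followers(Followed) ->
    [F || #follows{follower = F} <- mnesia:dirty_read(follows, Followed)].
```

By contrast, reading a record, changing a field, and writing it back with dirty operations can lose updates when two processes do it at the same time, which is the kind of case where a transaction is the safer choice.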

juhlig · August 22, 2022, 12:01pm

By the way, maybe you should change the title of this post to be a bit more specific to Mnesia and more general as far as the problem is concerned, so more people will chime in. I’m pretty interested in hearing other (more Mnesia-experienced) people’s opinions on the topic as well, but I suspect that most people will read “Social Media… Implementation” as something like “How do I build yet another Twitter clone?”, and won’t bother reading it at all.

Abdelghani · August 22, 2022, 3:44pm

Thank you @juhlig for your reply, but I think using another table with the person’s key as a foreign key will also produce a list when reading data from it. For example, mnesia:read({Tab, Key}) or mnesia:dirty_read({Tab, Key}) will return a list with the millions of followers, so why not directly use a list of followers as a record field? In other terms, can an Erlang list contain millions of elements without problems?

juhlig · August 23, 2022, 9:07am

That depends on what would be a problem for your use case, I think. That is, it depends on what you want to do and how often. AFAIK, there is no problem in just having lists with millions of elements (as long as you have enough memory available on your machine to store them all), but working with such a list may be a problem.

(By the way, I think what you really want there is not a list (where an element may be present multiple times) but a set (where an element may be present only once). I mean, a person either follows another person or they don’t; it makes no sense for a person to follow another person multiple times.)
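
The difference between a list and a set can be sketched with the standard `lists` and `ordsets` modules. The module name and the follower atoms are made up for the example, and `ordsets` is only one of several set implementations that could be used.

```erlang
-module(followers_set).
-export([demo/0]).

demo() ->
    %% A plain list happily stores the same follower twice.
    L0 = [alice, bob],
    L1 = [alice | L0],                    % [alice, alice, bob]
    %% lists:delete/2 removes only the first matching element.
    L2 = lists:delete(alice, L1),         % [alice, bob]

    %% An ordset keeps each element at most once.
    S0 = ordsets:from_list([alice, bob]),
    S1 = ordsets:add_element(alice, S0),  % still [alice, bob]
    S2 = ordsets:del_element(alice, S1),  % [bob]
    {L2, S1, S2}.
```

With the plain list, alice is still a follower after one "unfollow". With the ordset, she is gone.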

Anyway, I realize that a social media following implementation has many facets to it, with many design choices to be made.

On the one hand, what comes to mind is what needs to be done when a person A starts or stops following another person B, for example. With approach (1), A needs to be added to or removed from the list (set?) of followers of B. To do that, you need to read B, change the contents of the followers field, then write it back. With approach (2), you just need to insert or delete an object in the who-follows-who table. So here, approach (2) is arguably better.
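
The follow and unfollow steps for both approaches could be sketched like this. Everything here is an assumption made for illustration: the module and record names, the followers field being kept as an ordset, and RAM-only tables on one node.

```erlang
-module(follow_ops).
-export([init/0, follow_v1/2, unfollow_v1/2, follow_v2/2, unfollow_v2/2]).

%% Hypothetical records. Approach (1) keeps followers inside the person
%% record; approach (2) keeps one object per pair in a bag table.
-record(person, {id, followers = []}).
-record(follows, {followed, follower}).

init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(person,
        [{attributes, record_info(fields, person)}]),
    {atomic, ok} = mnesia:create_table(follows,
        [{type, bag}, {attributes, record_info(fields, follows)}]),
    ok.

%% Approach (1): read B, change the followers field, write B back.
follow_v1(A, B) ->
    update_followers(B, fun(Fs) -> ordsets:add_element(A, Fs) end).

unfollow_v1(A, B) ->
    update_followers(B, fun(Fs) -> ordsets:del_element(A, Fs) end).

update_followers(B, Fun) ->
    mnesia:transaction(fun() ->
        %% The write lock keeps concurrent updates of B from interleaving.
        %% The transaction aborts if B does not exist.
        [P = #person{followers = Fs}] = mnesia:read(person, B, write),
        mnesia:write(P#person{followers = Fun(Fs)})
    end).

%% Approach (2): insert or delete a single object; B's record is untouched.
follow_v2(A, B) ->
    mnesia:transaction(fun() ->
        mnesia:write(#follows{followed = B, follower = A})
    end).

unfollow_v2(A, B) ->
    mnesia:transaction(fun() ->
        mnesia:delete_object(#follows{followed = B, follower = A})
    end).
```

In approach (1) the whole followers field is read and written back on every change, so the cost grows with the number of followers. In approach (2) only one small object is touched.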

On the other hand, what needs to be done when person C posts something and all the followers (D, E, F, …) should be informed? In approach (1), you just read C, get the list of followers, and can work with them. In approach (2), you need to query the who-follows-who table to collect (i.e., generate) the list of followers to work with. So here, approach (1) is a bit better.
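
Collecting the followers in both approaches might look like the sketch below. The module, records, and function names are hypothetical, and the index on the follower field is an added assumption that makes the reverse query ("who is X following") possible in approach (2).

```erlang
-module(follow_read).
-export([init/0, followers_v1/1, followers_v2/1, following_v2/1]).

%% Hypothetical records for the two approaches.
-record(person, {id, followers = []}).
-record(follows, {followed, follower}).

init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(person,
        [{attributes, record_info(fields, person)}]),
    {atomic, ok} = mnesia:create_table(follows,
        [{type, bag},
         {attributes, record_info(fields, follows)},
         {index, [follower]}]),
    ok.

%% Approach (1): the followers come along with C's own record.
followers_v1(C) ->
    mnesia:transaction(fun() ->
        case mnesia:read(person, C) of
            [#person{followers = Fs}] -> Fs;
            [] -> []
        end
    end).

%% Approach (2): reading key C from the bag table returns one object per
%% follower; the follower keys are extracted from those objects.
followers_v2(C) ->
    mnesia:transaction(fun() ->
        [F || #follows{follower = F} <- mnesia:read(follows, C)]
    end).

%% Approach (2), reverse direction, served by the secondary index.
following_v2(X) ->
    mnesia:transaction(fun() ->
        [B || #follows{followed = B}
                  <- mnesia:index_read(follows, X, #follows.follower)]
    end).
```

Each function returns `{atomic, List}` on success. Either way the caller ends up holding a list as long as the number of followers, so for very large followings the result would probably have to be processed in pieces.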

On yet another hand, what needs to be done when a person G is being deleted entirely? After deletion, G should not appear in any follower (persons following G) or following (persons G follows) “relationship”. With both approaches, G needs to be deleted from the person table. With approach (1), the follower “relations” get deleted implicitly with the G record, as they are stored there. However, all other persons in the database need to be checked and G removed from their follower lists if present. With approach (2), all following/follower relationships can be removed by deleting the entries that contain G (as follower or being followed) from the who-follows-who table.
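
For approach (2), deleting a person together with all of their relationships could be sketched as follows. The module and record names are hypothetical, and the sketch assumes a secondary index on the follower field so that the entries where G is the follower can be found without scanning the whole table.

```erlang
-module(follow_delete).
-export([init/0, delete_person/1]).

%% Hypothetical records; the names and fields are assumptions.
-record(person, {id, name}).
-record(follows, {followed, follower}).

init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(person,
        [{attributes, record_info(fields, person)}]),
    {atomic, ok} = mnesia:create_table(follows,
        [{type, bag},
         {attributes, record_info(fields, follows)},
         {index, [follower]}]),
    ok.

delete_person(G) ->
    mnesia:transaction(fun() ->
        %% All entries where G is being followed share the key G.
        ok = mnesia:delete(follows, G, write),
        %% Entries where G is the follower are found through the index
        %% and removed one by one.
        lists:foreach(fun(Rec) -> ok = mnesia:delete_object(Rec) end,
                      mnesia:index_read(follows, G, #follows.follower)),
        %% Finally remove the person record itself.
        mnesia:delete(person, G, write)
    end).
```

Because everything runs in one transaction, other processes should not observe a state in which G is gone from the person table but still appears in the who-follows-who table.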

So, what I’m trying to say is, there are many things you need to consider when designing your database. What happens often, what happens rarely? What needs to be done quickly, what can take a while or be delayed? Etc, etc.

(Sorry for the very long post. And sorry, I guess it is not of much help to you, either. There probably is no simple, single, one-fits-all answer to this…)

Abdelghani · August 23, 2022, 3:13pm

Wow, don’t be sorry please, you have described the problem from scratch, so thank you so much for your help, and I am sorry for taking your time.
As you said, resolving this problem is not as simple as it appears. In fact, I think implementation (1) is good because it takes less memory (in implementation (2) you have to store the followed key in every record, whereas in (1) each list is a field of just one record).
Unfollowing someone is simple and needs just lists:delete, which is efficient enough for that (no problem if it takes a couple of seconds).
Deleting a person really is a problem, because it requires searching the entire database and deleting every trace of that person, but I think we have plenty of time to do that, and I think this is what happens in most social media networks. (I see something like that in Facebook Messenger: when you want to message a person who has already deactivated their account, they still appear as a normal user with photo and info, and I only get an error when I try to send the message.)
Thank you again @juhlig

juhlig · August 24, 2022, 12:15pm

You’re welcome