How does Twitter's Recommendation Algorithm work?

Twitter's recommendation algorithm is now public. That's huge for content creators, developers and all social network users. That's why I think it's important that as many people as possible understand how it works, not only engineers and data scientists. Here's my friendly summary.

As social media becomes an increasingly integral part of our daily lives, it’s essential to understand how algorithms shape our experiences on these platforms. Twitter is no exception, with its recommendation algorithm serving as the backbone of its For You timeline (feed). In this article, we’ll explore the key things you need to know about how Twitter’s algorithm distills the roughly 500 million Tweets posted daily down to a handful of top Tweets that ultimately show up on your device.

Twitter’s recommendation system is composed of many interconnected services and jobs that work together to deliver the best of what’s happening in the world right now.

If you want to access the source code, here's a link to Elon's Tweet announcing the release:

Twitter recommendation source code now available to all on GitHub https://t.co/9ozsyZANwa
— Elon Musk (@elonmusk) March 31, 2023

How does Twitter choose Tweets?

The foundation of Twitter’s recommendations is a set of core models and features that extract latent information from Tweet, user, and engagement data. These models aim to answer important questions about the Twitter network, such as, “What is the probability you will interact with another user in the future?” or, “What are the communities on Twitter and what are trending Tweets within them?” Answering these questions accurately enables Twitter to deliver more relevant recommendations.

The recommendation pipeline is made up of three main stages that consume these features:

Fetch the best Tweets from different recommendation sources in a process called candidate sourcing.
Rank each Tweet using a machine learning model.
Apply heuristics and filters, such as filtering out Tweets from users you’ve blocked, NSFW content, and Tweets you’ve already seen.
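To make the three stages concrete, here is a minimal sketch of how they could chain together. It is only an illustration: the Tweet class, the function names and the scoring function are all made up for this example and are not taken from Twitter's code.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical, simplified Tweet record used only in this sketch.
    tweet_id: int
    author: str
    score: float = 0.0


def source_candidates(pool, limit=1500):
    # Stage 1 (candidate sourcing): here we simply keep the first `limit` Tweets.
    return pool[:limit]


def rank(candidates, score_fn):
    # Stage 2 (ranking): score every candidate, then sort from best to worst.
    for tweet in candidates:
        tweet.score = score_fn(tweet)
    return sorted(candidates, key=lambda t: t.score, reverse=True)


def apply_filters(ranked, blocked_authors, seen_ids):
    # Stage 3 (heuristics and filters): drop blocked authors and already-seen Tweets.
    return [
        t for t in ranked
        if t.author not in blocked_authors and t.tweet_id not in seen_ids
    ]


def build_timeline(pool, score_fn, blocked_authors, seen_ids):
    # Chain the three stages in order.
    candidates = source_candidates(pool)
    ranked = rank(candidates, score_fn)
    return apply_filters(ranked, blocked_authors, seen_ids)
```

In the real system each stage is a set of separate services, but the order of operations is the same: fetch, score, then filter.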

The diagram below illustrates the major components used to construct a timeline:

[Figure: Twitter Recommendation Algorithm - Main Stages]

Let’s explore the key parts of this system, roughly in the order they’d be called during a single timeline request, starting with retrieving candidates from Candidate Sources.

Candidate Sources

Twitter has several Candidate Sources that it uses to retrieve recent and relevant Tweets for a user. For each request, it attempts to extract the best 1500 Tweets from a pool of hundreds of millions through these sources. Twitter finds candidates from people you follow (In-Network) and from people you don’t follow (Out-of-Network). Today, the For You timeline consists of 50% In-Network Tweets and 50% Out-of-Network Tweets on average, though this may vary from user to user.

In-Network Source

The In-Network source is the largest candidate source and aims to deliver the most relevant, recent Tweets from users you follow. It efficiently ranks Tweets of those you follow based on their relevance using a logistic regression model. The top Tweets are then sent to the next stage.

The most important component in ranking In-Network Tweets is Real Graph. Real Graph is a model which predicts the likelihood of engagement between two users. The higher the Real Graph score between you and the author of the Tweet, the more of their Tweets Twitter will include.
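If you have never seen a logistic regression, here is a rough sketch of the idea: a weighted sum of features is squashed into a value between 0 and 1. The feature names, weights and bias below are invented for illustration; the real model's inputs and values are not described in this article.

```python
import math


def sigmoid(x):
    # Map any real number to a value strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(-x))


def in_network_score(features, weights, bias):
    # Logistic regression: weighted sum of the features, then a sigmoid.
    z = bias + sum(weights[name] * value for name, value in features.items())
    return sigmoid(z)


# Hypothetical weights: the Real Graph score counts the most in this toy model.
WEIGHTS = {"real_graph_score": 3.0, "recency": 1.0}
BIAS = -2.0

close_author = in_network_score({"real_graph_score": 0.9, "recency": 0.5}, WEIGHTS, BIAS)
distant_author = in_network_score({"real_graph_score": 0.1, "recency": 0.5}, WEIGHTS, BIAS)
```

With these made-up numbers, a Tweet from an author you interact with a lot ends up with a higher score than an equally recent Tweet from an author you rarely interact with.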

Out-of-Network Sources

Finding relevant Tweets outside of a user’s network is a trickier problem. Twitter takes two approaches to addressing this: social graph and embedding spaces.

The social graph is a term used to describe the network of connections between users on a social media platform like Twitter. Essentially, it's a map of the relationships between users, based on who follows whom, who engages with whose content, and so on.

In the context of Twitter's recommendation algorithm, the social graph is used to help identify relevant content for users. By analyzing the engagements (likes, retweets, replies, etc.) of people you follow or those with similar interests, Twitter can estimate what you would find relevant and generate candidate Tweets based on those engagements.

In other words, the social graph helps Twitter understand the relationships between users and their interests, and use that information to suggest content that users may be interested in.

Twitter traverses the graph of engagements and follows to answer questions such as:

What Tweets did the people I follow recently engage with?
Who likes similar Tweets to me, and what else have they recently liked?
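The first question can be sketched as a simple graph walk: look at the people you follow, collect what they liked, and count how often each Tweet comes up. The data layout (plain dictionaries) and the function name are assumptions made for this example, not Twitter's actual implementation.

```python
from collections import Counter


def candidates_from_follows(user, follows, likes):
    # follows: user -> list of followed users (hypothetical layout)
    # likes:   user -> set of liked Tweet ids (hypothetical layout)
    counts = Counter()
    for followed in follows.get(user, []):
        for tweet_id in likes.get(followed, set()):
            counts[tweet_id] += 1

    # Skip Tweets the user has already liked.
    already_liked = likes.get(user, set())

    # Most-liked first; ties are broken by Tweet id to keep the order stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tweet_id for tweet_id, _ in ordered if tweet_id not in already_liked]


follows = {"me": ["ann", "bob"]}
likes = {"ann": {101, 102}, "bob": {102, 103}, "me": {103}}
suggestions = candidates_from_follows("me", follows, likes)  # [102, 101]
```

Tweet 102 comes first because two of the followed accounts liked it, and Tweet 103 is skipped because "me" already liked it.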

Embedding Spaces

Embedding space approaches aim to answer a more general question about content similarity: What Tweets and Users are similar to my interests?

Embedding spaces are a way of representing data (in this case, users and tweets) as numerical values, or "embeddings". Essentially, an embedding is a way of transforming the data into a format that can be more easily analyzed by a computer.

In the context of Twitter's recommendation algorithm, embedding spaces are used to determine which tweets and users are similar to your interests. By generating numerical representations of users' interests and tweets' content, Twitter can calculate the similarity between any two users, tweets, or user-tweet pairs in this embedding space. The idea is that if two users have similar embeddings, they are likely to have similar interests, and if a tweet's embedding is similar to a user's embedding, the tweet is more likely to be relevant to that user.
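One common way to measure how close two embeddings are is cosine similarity. The sketch below uses it on tiny, made-up three-dimensional vectors; the article does not say which similarity measure Twitter uses, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Convention chosen for this sketch: no signal, no similarity.
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; each position could stand for a topic.
user = [0.9, 0.1, 0.0]
tweet_a = [0.8, 0.2, 0.0]
tweet_b = [0.0, 0.1, 0.9]

similarity_a = cosine_similarity(user, tweet_a)
similarity_b = cosine_similarity(user, tweet_b)
```

Here tweet_a points in almost the same direction as the user vector, so it would be considered the more relevant of the two.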

One of Twitter's most useful embedding spaces is called SimClusters. SimClusters discovers communities, each anchored by a cluster of influential users, using a custom matrix factorization algorithm.

By embedding tweets into these communities and calculating their similarity to users' interests, Twitter can recommend content that is more likely to be relevant to users. The more that users from a community like a Tweet, the more that Tweet will be associated with that community.
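The last sentence can be pictured with a toy example: each like adds the liker's community memberships to the Tweet's embedding. The community names, membership weights and function below are invented for illustration and leave out the matrix factorization step entirely.

```python
from collections import defaultdict


def tweet_embedding(likers, user_communities):
    # Sum the community memberships of every user who liked the Tweet.
    embedding = defaultdict(float)
    for user in likers:
        for community, weight in user_communities.get(user, {}).items():
            embedding[community] += weight
    return dict(embedding)


# Hypothetical memberships: how strongly each user belongs to each community.
user_communities = {
    "ann": {"news": 1.0},
    "bob": {"news": 0.5, "pop": 0.5},
    "cat": {"pop": 1.0},
}

embedding = tweet_embedding(["ann", "bob"], user_communities)
```

After likes from ann and bob, the Tweet is mostly associated with the "news" community (1.5) and only a little with "pop" (0.5).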

Ranking

The goal of the For You timeline is to serve you relevant Tweets. At this point in the pipeline, there are ~1500 candidates that may be relevant. Scoring directly predicts the relevance of each candidate Tweet and is the primary signal for ranking Tweets on your timeline. At this stage, all candidates are treated equally, without regard for which candidate source they originated from.

Ranking is achieved with a ~48M parameter neural network that is continuously trained on Tweet interactions to optimize for positive engagement (e.g. Likes, Retweets, and Replies). This ranking mechanism takes into account thousands of features and outputs ten labels to give each Tweet a score, where each label represents the probability of an engagement. Tweets are then ranked by these scores.
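One simple way to turn several predicted probabilities into a single score is a weighted sum. The sketch below assumes that approach, uses only three labels instead of ten, and the weights are made up; it stands in for the output of the neural network rather than reproducing it.

```python
def combined_score(probabilities, weights):
    # Weighted sum of the predicted engagement probabilities.
    return sum(weights.get(label, 0.0) * p for label, p in probabilities.items())


def rank_tweets(predictions, weights):
    # predictions: Tweet id -> {label: predicted probability}
    return sorted(
        predictions,
        key=lambda tweet_id: combined_score(predictions[tweet_id], weights),
        reverse=True,
    )


# Hypothetical weights: a reply is treated as a stronger signal than a like.
HYPOTHETICAL_WEIGHTS = {"like": 1.0, "retweet": 2.0, "reply": 3.0}

predictions = {
    1: {"like": 0.5, "retweet": 0.1, "reply": 0.0},
    2: {"like": 0.1, "retweet": 0.1, "reply": 0.2},
}
ranking = rank_tweets(predictions, HYPOTHETICAL_WEIGHTS)  # [2, 1]
```

Tweet 2 wins even though it is less likely to be liked, because in this toy setup a probable reply weighs more than a probable like.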

Heuristics, Filters, and Product Features

After the ranking stage, Twitter applies heuristics and filters to implement various product features. These features work together to create a balanced and diverse feed.

Some examples:

Visibility Filtering: Filter out Tweets based on their content and your preferences. For instance, remove Tweets from accounts you block or mute.
Author Diversity: Avoid too many consecutive Tweets from a single author.
Content Balance: Ensure a fair balance of In-Network and Out-of-Network Tweets.
Feedback-based Fatigue: Lower the score of certain Tweets if the viewer has provided negative feedback around it.
Social Proof: Exclude Out-of-Network Tweets without a second degree connection to the Tweet as a quality safeguard. In other words, ensure someone you follow engaged with the Tweet or follows the Tweet’s author.
Conversations: Provide more context to a Reply by threading it together with the original Tweet.
Edited Tweets: Determine if the Tweets currently on a device are stale, and send instructions to replace them with the edited versions.
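As an example of such a heuristic, Author Diversity could be approximated by shrinking the score of each additional Tweet from the same author and then re-sorting. This is only a sketch of the idea: the decay factor and the function name are assumptions, not values from Twitter's code.

```python
def apply_author_diversity(ranked, decay=0.5):
    # ranked: list of (tweet_id, author, score), already sorted by score descending.
    tweets_seen_per_author = {}
    rescored = []
    for tweet_id, author, score in ranked:
        n = tweets_seen_per_author.get(author, 0)
        # Each additional Tweet from the same author is multiplied by `decay` once more.
        rescored.append((tweet_id, author, score * (decay ** n)))
        tweets_seen_per_author[author] = n + 1
    return sorted(rescored, key=lambda t: t[2], reverse=True)


ranked = [(1, "a", 1.0), (2, "a", 0.9), (3, "b", 0.8), (4, "a", 0.7)]
diversified = apply_author_diversity(ranked)
order = [tweet_id for tweet_id, _, _ in diversified]  # [1, 3, 2, 4]
```

Author "a" no longer holds the top two positions: the Tweet from author "b" moves up to second place.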

Mixing and Serving

The last step in the process is mixing and serving. At this point, Home Mixer has a set of Tweets ready to send to your device. The system blends these Tweets with non-Tweet content like Ads, Follow Recommendations, and Onboarding prompts, and returns the result to your device to display.

The pipeline above runs approximately 5 billion times per day and completes in under 1.5 seconds on average. A single pipeline execution requires 220 seconds of CPU time, nearly 150x the latency you perceive on the app.
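A quick back-of-the-envelope check of these figures, using only the numbers quoted above (the per-second average is my own derived estimate and assumes requests are spread evenly over the day):

```python
cpu_seconds_per_request = 220
latency_seconds = 1.5
requests_per_day = 5_000_000_000
seconds_per_day = 24 * 60 * 60

# CPU time compared with the latency you perceive: about 147, i.e. "nearly 150x".
cpu_to_latency_ratio = cpu_seconds_per_request / latency_seconds

# Average number of timeline requests per second: roughly 58,000.
requests_per_second = requests_per_day / seconds_per_day
```

Since 220 seconds of work fit into about 1.5 seconds of waiting, the work for a single request has to be spread across many machines running in parallel.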

Twitter has released the code powering its recommendations to provide full transparency to users about how its systems work. The company is also working on several features to provide greater transparency within the app.

Conclusion

Understanding how Twitter’s algorithm works can help users make more informed decisions about their social media use. With this knowledge, users can take control of their Twitter experience and ensure that the content they see aligns with their interests and values.