Modeling information dissemination on Twitter in formal Networking terms

Information geeks all over are relying on Twitter to keep them informed about everything that’s relevant to them. Was there an earthquake? What do people think of Google’s Privacy changes? What are the most interesting facts about the Facebook S-1? Where did everyone go for dinner last night?

Assuming you follow people relevant to your interests, and given enough time, someone will push what you need to know right into your Twitter stream.

How does this happen? After all, we check Twitter only a few times a day, and only consume tweets originating around that time. Why don’t we miss out on major news events or interesting information?

Twitter’s product conventions and user dynamics make it more than just a glorified soapbox for the masses – I think it can in fact be analyzed as a formal information dissemination network, with clear parallels to traditional Computer Networks.

Parallels with Computer Networks

The Universe keeps you informed…

Structurally, a Twitter stream is just a list of tweets from the people one follows, sorted by creation time. Conceptually, however, we could view it as a multiplexed stream of information about various topics, with each person one follows contributing to several topics simultaneously, and to different degrees.

For instance, one set of the people being followed might be tweeting about SES, another commenting on a football match that’s in progress, while yet another could be discussing the Facebook S-1. Of course, the people in these sets might very well overlap, with each such person participating in multiple conversations at about the same time. All of these tweets intersperse, and a picture of various topics emerges simultaneously.
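
To make the multiplexing idea concrete, here is a minimal Python sketch that splits one time-sorted stream back into per-topic streams. The accounts, the tweets and especially the topic labels are made up for illustration; real tweets carry no such label, so in practice the topic would have to be inferred somehow.

```python
from collections import defaultdict

# Hypothetical timeline, already sorted by creation time:
# (time, author, topic, text). The topic field is an assumption;
# a real tweet has no explicit topic label.
timeline = [
    ("09:01", "alice", "football", "Great goal!"),
    ("09:02", "bob", "facebook-s1", "Revenue numbers are out"),
    ("09:03", "alice", "facebook-s1", "Interesting risk factors"),
    ("09:04", "carol", "football", "Half-time already"),
]


def demultiplex(tweets):
    """Group a single interleaved stream into one stream per topic."""
    streams = defaultdict(list)
    for time, author, topic, text in tweets:
        streams[topic].append((time, author, text))
    return dict(streams)


streams = demultiplex(timeline)
for topic, items in streams.items():
    print(topic, [author for _, author, _ in items])
```

Note that the made-up account alice shows up in both topic streams, which is the overlap described above.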

In some sense, the Universe is conspiring to inform and educate, even if the people one follows couldn’t care less about our intellectual growth.

…in Tweet-sized chunks that resemble IP packets…

Due to the 140-character constraint, tweeters comment on topics in tweet-sized chunks. While they might sometimes continue the same thought over multiple tweets, generally, each tweet stands alone. Each of these tweets can be considered a separate informational entity as it flows through the Twitter network, being shared, retweeted and commented on.

Just like in IP networks, each tweet is expendable, since the underlying network is understood to be lossy. In other words, any given tweet could potentially be viewed by no one, and that’s perfectly consistent with the expectations of the network.

…via people acting like repeaters and routers…

As people interact with tweets, they act like repeaters and routers. When you retweet, for example, you’re essentially rebroadcasting an informational ‘packet’ from one ‘subnet’ (the followers of the person you’re retweeting) to a different ‘subnet’: the one made up of all your followers.
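
As a rough sketch of this rebroadcasting, the snippet below treats each account’s follower set as a ‘subnet’ and computes who a tweet can reach once some accounts retweet it. The follower graph and account names are invented for illustration, and the model assumes every follower could in principle see every tweet, which the lossy network described earlier does not guarantee.

```python
# Hypothetical follower graph: account -> set of accounts following it.
followers = {
    "origin": {"ann", "ben"},
    "ben": {"ann", "cat", "dan"},
}


def audience(author, retweeters, followers):
    """Union of the author's 'subnet' and each retweeter's 'subnet'."""
    reached = set(followers.get(author, set()))
    for account in retweeters:
        reached |= followers.get(account, set())
    return reached


before = audience("origin", [], followers)
after = audience("origin", ["ben"], followers)
print(sorted(before))  # audience with no retweets
print(sorted(after))   # audience after ben retweets
```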

In addition, since retweets don’t have to happen right at the instant of the initial tweet, a retweet extends the effective lifetime of the initial tweet. (The parallel with IP’s TTL, i.e., ‘time-to-live’, is a loose one: an IP router decrements the TTL at each hop, whereas a retweet behaves more like a fresh rebroadcast that restarts the clock.) For instance, a tweet might be posted at 9:05am. It gets retweeted by someone who checked Twitter at 10:30am. Now, this retweet might be seen by someone who first checks Twitter at 11:35am that day, even if they missed the initial tweet.
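
The timing example above can be sketched in a few lines of Python. The model is my own simplification: a reader who checks Twitter sees only what was posted in the preceding `window` minutes, and the 90-minute default is an arbitrary assumption, not a measured figure.

```python
def minutes(hhmm):
    """Convert an 'H:MM' clock time into minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def is_seen(post_times, check_time, window=90):
    """True if any copy (tweet or retweet) was posted within `window`
    minutes before the reader checks. The window size is an assumption."""
    check = minutes(check_time)
    return any(0 <= check - minutes(t) <= window for t in post_times)


# Original tweet at 9:05, reader first checks at 11:35.
without_retweet = is_seen(["9:05"], "11:35")
# Same tweet, plus a retweet at 10:30.
with_retweet = is_seen(["9:05", "10:30"], "11:35")
print(without_retweet, with_retweet)
```

Under this assumed window, the original tweet alone is 150 minutes old and gets missed, while the 10:30 retweet is 65 minutes old and gets seen.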

The more people you follow who share your interests, the more likely this routing effect becomes, so connecting to relevant routers (active Twitter users within relevant domains) is critical.

…while handling nodes that are down…

If you follow a large enough number of people with related interests, there might be enough critical mass for interesting information to flow, even if some part of the network is sleeping, on a flight, or busy in meetings!

In other words, there would be packet loss, but if there are enough nodes in the network, information flow would not be significantly impeded.
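
A back-of-the-envelope way to see why redundancy helps: suppose each of the `n` relevant people you follow independently relays a given piece of news with probability `p`. Both numbers, and the independence assumption itself, are mine and almost certainly too simple for real social behavior, but the shape of the result is the point.

```python
def delivery_probability(n, p):
    """Chance that at least one of n independent nodes relays the news,
    when each relays with probability p (independence is an assumption)."""
    return 1 - (1 - p) ** n


for n in (1, 5, 20):
    print(n, round(delivery_probability(n, 0.1), 3))
```

Even when each individual node is unreliable (here a 10% chance of relaying), twenty such nodes together deliver the news with a probability of roughly 0.88 under these assumptions.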

…for reasonably reliable delivery.

This is still a pretty lossy channel, but as information dissemination architectures go, it seems to me to be a fairly robust, organically evolving system. It certainly seems to be working very well for me!

Twitter doesn’t have to be real-time for this to work

Interestingly, in this formulation, Twitter’s being real-time is incidental, just as IP’s being ‘real-time’ is incidental to TCP/IP. In fact, the existence of ‘IP over Avian Carriers’ (RFC 1149) shows that real-time-ness isn’t critical to the design of networking protocols; that’s just a consequence of using an underlying electromagnetic substrate where bits flow pretty fast.

More analysis needed

I might be imagining things, but there seems to be enough meat in here for a PhD or two. I just don’t know if it will be in the Computer Science department, or the Sociology department!

Or, I might be way off base. Please comment below!

Amit
