Saturday, September 24, 2011

Just how much data must TA 1 mine?

Here's a quote from the initial introduction to TA 1 technologies:
"TA1 performers will develop automated and semi-automated operator support tools and techniques for the systematic and methodical use of social media at data scale and in a timely fashion"
I have been working on TA 2 test systems with just 5k users, and finding 185k posts per fake year at a posting rate p of 0.1 posts/day per user, so I wondered what the real world has in store for SMiSC. That is, what is "social media at data scale and in a timely fashion"?
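
As a quick sanity check on that test-system figure, here is a minimal sketch of the arithmetic. It assumes a 365-day fake year and that every user posts at the same average rate p; both are assumptions for illustration, and the function name is made up for this example. The result is an expected value, so a simulated count such as 185k can reasonably land a little above or below it.

```python
# Back-of-the-envelope check of the test-system post volume.
USERS = 5_000                  # "just 5k users"
POSTS_PER_USER_PER_DAY = 0.1   # posting rate p
DAYS_PER_YEAR = 365            # assumption: a fake year has 365 days


def expected_posts_per_year(users, rate_per_day, days=DAYS_PER_YEAR):
    """Expected number of posts if every user posts at the same average rate."""
    return users * rate_per_day * days


expected = expected_posts_per_year(USERS, POSTS_PER_USER_PER_DAY)
print(f"{expected:,.0f} expected posts per fake year")
```

That comes to roughly 182,500 posts, in the same ballpark as the 185k observed.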

Well, here is "data scale" as reported in "By The Numbers: Twitter Vs. Facebook Vs. Google Buzz":
Updates/Posts
  • Facebook status updates: 700 per second
  • Twitter tweets: 600 per second
  • Buzz posts: 55 per second
That adds up to 1,355 updates per second, all of which must be discriminated, categorized, aggregated, and reported on. A "timely fashion" implies that it is okay to be "behind" by some time, but eventually the system must process everything. I figure the maximum allowed delay is set so that a report on any new or significant meme arrives within our leaders' decision-making cycle, so that leaders cannot be outfoxed by a rapidly spreading strategic message.
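
To put those rates in perspective, the sketch below totals them, converts the total to a daily volume, and models what being "behind" means. The catch-up model is an assumption for illustration, not anything from the TA 1 description: it treats arrivals and processing as constant rates, and `catch_up_seconds` is a hypothetical helper name.

```python
# Per-second update rates quoted above.
RATES_PER_SECOND = {
    "Facebook status updates": 700,
    "Twitter tweets": 600,
    "Buzz posts": 55,
}
SECONDS_PER_DAY = 24 * 60 * 60

total_per_second = sum(RATES_PER_SECOND.values())
total_per_day = total_per_second * SECONDS_PER_DAY


def catch_up_seconds(behind_seconds, arrival_rate, processing_rate):
    """Time to clear a backlog while new updates keep arriving.

    Assumes constant rates. If processing is not strictly faster than
    arrival, the system never catches up and infinity is returned.
    """
    if processing_rate <= arrival_rate:
        return float("inf")
    backlog = behind_seconds * arrival_rate
    return backlog / (processing_rate - arrival_rate)


print(f"{total_per_second:,} updates per second")
print(f"{total_per_day:,} updates per day")
# Example: one hour behind, with twice the arrival rate available for processing.
print(catch_up_seconds(3600, total_per_second, 2 * total_per_second), "seconds to catch up")
```

Under this simple model, "eventually process everything" only holds if sustained throughput exceeds 1,355 updates per second; any spare capacity beyond that determines how quickly a delay shrinks.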

Yikes.

Here are current numbers for Facebook alone: FB stats
Twitter doesn't seem to have a similar page.
Couldn't find one for Google+ either.
