Earlier today TechCrunch posted an item about Facebook servers exposing raw PHP code, and the blogosphere echo chamber has been passing it around since, telling a more negative story each time.
There are two important things that need to be addressed. First: no matter how sexy a theory about a disgruntled employee or a cunning attacker may sound, the story posted by Brandee in the TechCrunch comments is somewhat duller - sometimes .php files on an Apache server end up being served raw instead of being interpreted by PHP.
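To make "served raw" concrete, here is a minimal sketch of how one might spot it from the outside. It is not anything Facebook or TechCrunch used; the function name looks_like_raw_php and the choice of markers are my own assumptions. The idea is simply that an interpreted page normally has no PHP opening tags left in it, because PHP consumes them before the response goes out.

```python
def looks_like_raw_php(body: str) -> bool:
    """Return True if a response body still contains PHP opening tags.

    Hypothetical helper for illustration only. When PHP interprets a
    file, tags like "<?php" are normally consumed and never reach the
    browser, so finding one suggests the file was served as plain text.
    """
    markers = ("<?php", "<?=")  # assumed markers; short "<?" tags are ignored
    lowered = body.lower()
    return any(marker in lowered for marker in markers)


if __name__ == "__main__":
    print(looks_like_raw_php("<?php echo 'hello'; ?>"))
    print(looks_like_raw_php("<html><body>hello</body></html>"))
```

A page that deliberately shows PHP code to readers would usually escape the angle brackets, so it would not trip this check, but treat the result as a hint rather than proof.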
Second: source code is not user data. Not to go into Web Page Building 101 here (the course might be available at a friendly local community college), but data is stored in databases, which are accessed by some code (PHP in this case), and the results are displayed to the user. What’s displayed is always visible to the user (View Source in your browser). The code is sometimes open (WordPress, Joomla, Drupal) and sometimes not (pretty much any non-standard Web site out there). The database is always locked down from outside peeks, unless developers do something stupid, like leaving the username and password in the PHP code while also allowing outside access to the database. Generally speaking, even if I have all the source code for a certain Web site, I still can’t take a peek at the data.
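One common way to avoid the "password in the code" mistake is to keep credentials out of the source entirely and read them from the environment at startup. The sketch below shows the idea in Python; the setting names DB_HOST, DB_USER and DB_PASSWORD and the function load_db_config are made up for illustration and say nothing about how any particular site, Facebook included, is configured.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical setting names, chosen only for this example.
REQUIRED_KEYS = ("DB_HOST", "DB_USER", "DB_PASSWORD")


def load_db_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read database settings from the environment, not from source code."""
    source = os.environ if env is None else env
    # Fail loudly if anything is missing rather than falling back to a
    # default password that would have to live in the code.
    missing = [key for key in REQUIRED_KEYS if not source.get(key)]
    if missing:
        raise RuntimeError("missing database settings: " + ", ".join(missing))
    return {
        "host": source["DB_HOST"],
        "user": source["DB_USER"],
        "password": source["DB_PASSWORD"],
    }
```

With this arrangement, a leaked source file shows only the names of the settings, not their values.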
But most of you didn’t come here for the lesson in basic Web building. Judging by the title, you wanted to get Facebook source. The more the better. So here it is.
- Facebook Thrift - developed, supported and actually used by Facebook, this is a set of libraries and code generators to allow for maximum throughput data transfers between a client and a server. If you’ve got some server that speaks C++ or Java, and some client that speaks Python or PHP, you can have those two living in perfect harmony, with clients issuing requests in whatever language they prefer, and servers responding with data structures in their preferred language. Read the whitepaper here or join the group here. And guess what, you can download the source.
- Memcached - originally written by the guys who created LiveJournal, this “high-performance, distributed memory object caching system” is quite popular inside Facebook, as evidenced in this mailing list posting by our engineer Steve Grimm. You can naturally get the source of that, too, to add it to your Facebook source collection.
- phpsh - another product written by Facebook engineers and used throughout the company. Ever wished PHP had an interactive shell, just like the one you get when you download Python? Facebook’s phpsh is written (get this) in Python, but offers some of the best interactive shell features to a PHP developer. Ever need to execute a single function just to see what the output will be? Just type the function name with parameters and see it run. Curious to see where a certain function lives? Just do d function_name to get the definition of that function together with its location in the codebase. e function_name opens up emacs, and gets you to the exact location of that function in the code. It’s downloadable here with source available.
- Facebook toolbar for Firefox is also open source, since that’s the way Firefox extensions are distributed. Ever wanted to build a Firefox toolbar of your own incorporating some features of Facebook into it? By installing the toolbar, you get the sources for it placed in your Firefox extensions directory.
- Facebook’s APC - what would you give for a copy of Facebook’s APC configuration? Don’t answer yet, as Facebook engineer Brian Shire provides it for free in the APC@Facebook talk he has given at PHP conferences. It covers optimal configuration and the trade-offs one needs to consider when optimizing a large number of servers running PHP.
- Facebook’s PHP client for Facebook platform - granted, it would be weird if the company did not open source that, but nevertheless, if you ever wanted to see samples of PHP code and run them against Facebook servers, this is your best bet. A Java client is available from Facebook as well. The rest of the client libraries are unofficial, which doesn’t mean they’re not good, just that they’re written and supported by someone else.
- And finally, the PHP scripting language. Not developed by Facebook, but actively used, with some contributions to the codebase as well. In fact, a quick search of the mailing lists will show you what those contributions are. PHP is downloadable, with source, naturally, available to anyone who cares to peruse it.
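If the memcached entry in the list above sounds abstract, the usual pattern built on top of such a cache is easy to sketch: look in the cache first, and only go to the slow data source on a miss. The Python below uses a plain dictionary as a stand-in for a memcached server, and the class name CacheAside and its loader argument are my own invention. It illustrates the general pattern only, not how Facebook actually uses memcached.

```python
from typing import Callable, Dict


class CacheAside:
    """Toy cache-aside wrapper; a dict stands in for a memcached server."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader             # slow lookup, e.g. a database query
        self._cache: Dict[str, str] = {}  # assumption: in-process, never expires
        self.misses = 0

    def get(self, key: str) -> str:
        # Fast path: the value is already cached.
        if key in self._cache:
            return self._cache[key]
        # Slow path: fetch from the backing source, then remember it.
        self.misses += 1
        value = self._loader(key)
        self._cache[key] = value
        return value
```

A real deployment would also need expiry and invalidation when the underlying data changes, which this sketch leaves out on purpose.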
Hopefully this will satiate any hunger for Facebook code, and when you feel comfortable with everything described above (or maybe none of that was news to you), feel free to drop me a line with a resume attached, if you so desire. The name is alex, and what follows after the @ should probably be obvious.
Alex, my quick post wasn’t at all intended to be a negative telling of the story of facebook’s servers’ exposing raw code. Quite the opposite, the idea was to reveal that, though the stakes are undeniably high, nothing untoward happened. Sure, my post’s title was meant to be eye-catching, but the point was to mimic the echo chamber–and give it a mild dose of derision. In view of the fact that my post recommends Brandee’s TechCrunch comments, changes the subject only to wish that there were some structural way we could verify it, and then ends with nothing so negative as “So far, so good…,” I hope my other readers don’t take it to be earnestly sensationalist. I regret that I wasn’t clearer.
Hey, Josh,
Oh, yeah, I didn’t read your post as negative at all, sorry that I linked in that context.