Archives for the PHP category

24 Web site performance tips

The Yahoo! Developer Network blog had an entry by Stoyan Stefanov with his presentation from the PHP Quebec conference. A few points to take away, in case you don’t feel like going through the 76-slide presentation:

  1. An extra 100 ms of page rendering time costs Amazon about 1% in sales. An extra 500 ms leads to 20% less traffic for Google.
  2. Make fewer HTTP requests - combine CSS and JS files into single downloads. Minify both JS and CSS.
  3. Combine images into CSS sprites.
  4. Bring static content closer to the users. That usually means CDNs like Akamai or Limelight, but sometimes a co-location facility or data center in a foreign country is the only option.
  5. Static content should have Expires: headers set far into the future, so that it’s never re-requested.
  6. Dynamic content should have a Cache-Control: header.
  7. Offer content gzip’ed.
  8. Stoyan claims nothing will be rendered in the browser till the last piece of CSS has been served, and therefore it’s critical to send CSS as early in the process as possible. I happen to have a document with CSS declared at the very end, and disagree with this statement - at least the content seems to render OK without CSS, and then self-corrects when CSS finally loads.
  9. Move the scripts all the way to the bottom to avoid the download block - Stoyan’s example shows placing the javascript includes right before </body> and </html>, although it’s possible to place them even further down (well, you’d break XHTML purity, I suppose, if you declare your documents to be XHTML).
  10. Avoid CSS expressions.
  11. Consider placing the minified CSS and JS files on separate servers to fight browser’s default pipelining settings - not everybody has FasterFox or tweaked pipeline settings.
  12. For super-popular pages consider inlining JS for fewer HTTP requests.
  13. Even though placing content on external servers with different domains will help you with HTTP pipelining, don’t go crazy with various domains - they all require DNS lookups.
  14. Every 301 redirect is a wasted HTTP request.
  15. For busy backend servers consider PHP’s flush().
  16. Use GET over POST any time you have a choice.
  17. Analyze your cookies - a large number of them could substantially increase the number of TCP packets.
  18. For faster JavaScript and DOM parsing, reduce the number of DOM elements.
  19. document.getElementsByTagName('*').length will give you the total number of elements. Look at those abusive <div>s.
  20. Any missing JS file is a significant performance penalty - the browser will parse the 404 page you generate, trying to see if it has valid <script>s.
  21. Optimize your PNGs - check out pngcrush, pngoptimizer
  22. Optimize JPEGs - jpegtran
  23. Make sure you have favicon.ico - generating those 404s will be expensive, plus once you have it, it’s cache-able.
  24. Toolkits for measuring page loads: AOL PageTest, the FiddlerTool HTTP debugging proxy, the IBM Page Detailer instrumentation tool, YSlow, and Firebug are suggested in the presentation. My personal addition to the list is Charles, which was recommended by a colleague.
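
Tips 5 and 6 could look roughly like this in PHP. This is only a sketch: cacheHeaders() and sendCacheHeaders() are made-up helper names, and the one-year lifetime and the exact Cache-Control directives are my assumptions, not values taken from the presentation.

```php
<?php
// Sketch only: helper names, lifetime and directives are assumptions.

/**
 * Builds the caching header lines for a response.
 * $now is passed in so the result is predictable; $lifetime is in seconds.
 */
function cacheHeaders(bool $isStatic, int $now, int $lifetime = 31536000): array
{
    if ($isStatic) {
        return [
            // HTTP dates are always expressed in GMT.
            'Expires: ' . gmdate('D, d M Y H:i:s', $now + $lifetime) . ' GMT',
            'Cache-Control: public, max-age=' . $lifetime,
        ];
    }

    // Dynamic pages: let the browser keep a copy but revalidate it every time.
    return ['Cache-Control: private, max-age=0, must-revalidate'];
}

/** Sends the headers; must be called before any output is written. */
function sendCacheHeaders(bool $isStatic): void
{
    foreach (cacheHeaders($isStatic, time()) as $line) {
        header($line);
    }
}
```

For tip 7, calling ob_start('ob_gzhandler') at the top of a script is one way to gzip the output, provided the zlib extension is loaded.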

And here’s the whole presentation, although it’s not possible to follow links from Slideshare slides.

SlideShare

register_shutdown_function possible use cases

Eirik Hoem on his blog provides an overview of PHP’s register_shutdown_function, and suggests using it for the cases when for whatever reason your Web page ran out of memory, fatal’ed, and you don’t want to display a blank page to the users.

register_shutdown_function is also useful for command-line PHP scripts. Pretty frequently your script has to do some task like parsing a large XML file, and the test examples used when it was originally written did not account for the XML file possibly being huge. Therefore your script dies at something like 23% completion, and you’re left with 23% of the XML file parsed. Not ideal, but a quick duct-tape-style fix would be to use register_shutdown_function to make a system() call, to which you pass the script itself.

If you happen to keep track of which line you’re on while parsing, you can pass the line number as the first parameter to your own script, and make it start off after that 23% mark, or wherever it died. The script then needs to be launched with 0 passed as the first parameter. It will run out of memory, die, launch register_shutdown_function, which will launch another copy of the script (while successfully shutting down the original process) with a new line number, which will repeat the process.

Again, this is a duct tape approach to PHP memory consumption issues while working with large data sets.
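
Here is a minimal sketch of that duct tape. All the function names are hypothetical, the input is read line by line with fgets() rather than through a real XML parser, and it assumes the fatal error comes from memory piling up over many lines, since a single line that always kills the script would relaunch forever.

```php
<?php
// Sketch only: every function name here is hypothetical.

/** True when error_get_last() reports an error that killed the script. */
function isFatalError(?array $error): bool
{
    return $error !== null
        && in_array($error['type'], [E_ERROR, E_CORE_ERROR, E_COMPILE_ERROR], true);
}

/** Command line that reruns $script, telling it to resume after line $line. */
function resumeCommand(string $script, int $line): string
{
    return escapeshellarg(PHP_BINARY) . ' ' . escapeshellarg($script) . ' ' . $line;
}

/** Arranges for the script to relaunch itself if it dies with a fatal error. */
function relaunchOnFatal(string $script): void
{
    register_shutdown_function(function () use ($script) {
        if (isFatalError(error_get_last())) {
            // system() blocks until the relaunched copy has finished.
            system(resumeCommand($script, $GLOBALS['lastLine']));
        }
    });
}

/** Processes $file line by line, skipping the first $startLine lines. */
function processFrom(string $file, int $startLine, callable $handleLine): void
{
    // Progress marker read by the shutdown handler.
    $GLOBALS['lastLine'] = $startLine;

    $handle = fopen($file, 'r');
    for ($n = 1; ($line = fgets($handle)) !== false; $n++) {
        if ($n > $startLine) {
            $handleLine($line);
            $GLOBALS['lastLine'] = $n;
        }
    }
    fclose($handle);
}

// Possible wiring at the bottom of the script, first launched with 0
// (data.txt and handleLine are placeholders):
//   relaunchOnFatal(__FILE__);
//   processFrom('data.txt', (int) ($argv[1] ?? 0), 'handleLine');
```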

PHP contest from PHParchitect.com

The guys at PHParchitect are running a PHP contest for the smallest, fastest, most efficient command-line PHP script. The seemingly simple link parser task is probably very tricky, and the task itself is somewhat poorly specced out, as several things are not clear:

  1. Their example lists the href enclosed in <link rel="stylesheet" type="text/css" href="/css/c7y.css" id="Main C7Y CSS"></link>. So is that a valid link? Does anything in an href qualify as a link?
  2. Does a JavaScript window.open qualify as a link?
  3. What about <a href="http://www.yahoo.com" onclick="window.location=http://www.google.com">link</a> or any similar shenanigans? What qualifies as a link there?
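
To show how much the answer to the first question matters, here is a rough sketch using PHP’s DOM extension. extractHrefs() is a hypothetical helper of mine, not anything from the contest, and it makes no attempt to be small or fast. The same page yields a different list of links depending on which XPath query you decide is the right one.

```php
<?php
// Sketch only: extractHrefs() is a hypothetical helper, not contest code.

/** Returns the href values of all nodes matched by $xpathQuery. */
function extractHrefs(string $html, string $xpathQuery): array
{
    // Collect parser complaints quietly; real-world HTML is rarely clean.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ((new DOMXPath($doc))->query($xpathQuery) as $node) {
        $hrefs[] = $node->getAttribute('href');
    }
    return $hrefs;
}

$page = '<html><head><link rel="stylesheet" type="text/css" href="/css/c7y.css"></head>'
      . '<body><a href="http://www.yahoo.com">link</a></body></html>';

$anchorsOnly = extractHrefs($page, '//a[@href]'); // only <a> elements count
$anyHref     = extractHrefs($page, '//*[@href]'); // anything with an href counts
```

With the first query the stylesheet is not a link, with the second it is. Neither query sees what the onclick handler does.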

__DIR__ in PHP 5.3

Lars Strojny says that a new magic constant __DIR__ is coming to PHP 5.3. __DIR__ will refer to the directory of the current script file. It’s useful for those include and include_once directives where it’s preferable to use absolute paths to avoid searching through the include path.
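
A small sketch of how that might be used. localPath() and the lib/helpers.php file are made up for illustration, and the type declarations need a PHP version much newer than 5.3.

```php
<?php
// Sketch only: localPath() and lib/helpers.php are hypothetical.

/** Turns a path relative to this file's directory into an absolute one. */
function localPath(string $relative): string
{
    return __DIR__ . DIRECTORY_SEPARATOR . ltrim($relative, '/\\');
}

// Before PHP 5.3 the same directory had to be spelled dirname(__FILE__).
$sameDirectory = (__DIR__ === dirname(__FILE__));

// Typical use, skipping the include path lookup entirely:
//   require_once localPath('lib/helpers.php');
```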

Facebook source leaked - get it all here

Earlier today TechCrunch posted an item regarding Facebook servers exposing raw PHP code, with blogosphere echo chamber making its rounds, telling a more negative story each time around.

There are two important things that need to be addressed. First. No matter how sexy a theory about disgruntled employee or cunning attacker may sound, the story posted by Brandee in TechCrunch comments is somewhat duller - sometimes those .php files end up being served raw, not interpreted by PHP, on an Apache server.

Second. Source code is not user data. Not to go into Web Page Building 101 here (the course might be available at a local friendly community college), but data is stored in the databases, which are then accessed by some code (PHP in this case), and displayed to the user. What’s displayed is always visible to the user (View Source in your browser), the code is sometimes open (Wordpress, Joomla, Drupal) and sometimes not (pretty much any non-standard Web site out there), while DB is always locked down from outside peeks, unless you have developers do some stupid things, like leave username and password in the PHP code, and allow outside access. Generally speaking, even if I have all the source code for a certain Web site, it’s still impossible for me to take a peek at the data.

But most of you didn’t come here for the lesson in basic Web building. Judging by the title, you wanted to get Facebook source. The more the better. So here it is.

  1. Facebook Thrift - developed, supported and actually used by Facebook, this is a set of libraries and code generators to allow for maximum throughput data transfers between a client and a server. If you’ve got some server that speaks C++ or Java, and some client that speaks Python or PHP, you can have those two living in perfect harmony, clients issuing the client requests in whatever language they prefer, and servers responding back with the data structures in their preferred language. Read the whitepaper here or join the group here. And guess what, you can download the source.
  2. Memcached - originally written by the guys who created LiveJournal, this “high-performance, distributed memory object caching system” is quite popular inside Facebook, as evidenced in this mailing list posting by our engineer Steve Grimm. You can naturally get the source of that, too, to add it to your Facebook source collection.
  3. phpsh - another product written by Facebook engineers and used throughout the company. Ever wished PHP had an interactive shell, just like the one you get when you download Python? Facebook’s phpsh is written (get this) in Python, but offers some of the best interactive shell features to a PHP developer. Ever need to execute a single function just to see what the output will be? Just type the function name with parameters and see it run. Curious to see where a certain function lives? Just do d function_name to get the definition of that function together with its location in the codebase. e function_name opens up emacs, and gets you to the exact location of that function in the code. It’s downloadable here with source available.
  4. Facebook toolbar for Firefox is also open source, since that’s the way Firefox extensions are distributed. Ever wanted to build a Firefox toolbar of your own incorporating some features of Facebook into it? By installing the toolbar, you get the sources for it placed in your Firefox extensions directory.
  5. Facebook’s APC - what would you give for a copy of Facebook’s APC configuration? Don’t answer yet, as Facebook engineer Brian Shire provides it for free in his APC@Facebook talk he’s given at PHP conferences. It talks about optimal configuration and trade-offs one needs to consider when optimizing a large number of servers running PHP.
  6. Facebook’s PHP client for Facebook platform - granted, it would be weird if the company did not open source that, but nevertheless, if you ever wanted to see samples of PHP code and run them against Facebook servers, this is your best bet. Java client is available from Facebook as well, with the rest of the client code being unofficial, which doesn’t mean it’s not good, it’s just written and supported by someone else.
  7. And finally, PHP scripting language. Not developed by Facebook, but actively used with some contributions to the codebase as well. In fact, a quick search around mailing list area lets you know what those contributions are. PHP is downloadable, with source, naturally, available to anyone who cares to peruse it.

Hopefully this will satiate any hunger for Facebook code, and when you feel very comfortable with everything described above (or maybe none of that was news to you), feel free to drop me a line with a resume attached, if you so desire. The name is alex, what follows after @ should probably be obvious.

Removing curly quotes from WordPress 2.1.*

A while ago I wrote a post on preventing the curlification of the quotes on WordPress. This is useful when you run something like a code-sharing site, and each string variable gets encoded with unusable fancy quotes. mhinze.com recently shared the solution for removing curly quotes on WordPress 2.1, but it actually might do a bit more than you expect.

In your WordPress folder, go to the wp-includes folder and find a file called formatting.php. Lines 20 and 21 look like:


$static_characters = array_merge(
    array('---', ' -- ', '--', 'xn&#8211;', '...', '``', '\'s', '\'\'', ' (tm)'), $cockney);
$static_replacements = array_merge(
    array('&#8212;', ' &#8212; ', '&#8211;', 'xn--', '&#8230;', '&#8220;', '&#8217;s', '&#8221;', ' &#8482;'), $cockneyreplace);

The arrays contain characters to find and characters to replace them with, in one-to-one correspondence. As you can see, a triple dash and a double dash with spaces get replaced by an em dash, a bare double dash by an en dash, and three dots by an ellipsis (&#8230;).

The element in question is '\'\'' - two backslash-escaped (since PHP requires it) single quotes, replaced by a closing curly double quote (&#8221;). Remove '\'\'' from the static_characters array and &#8221; from the static_replacements array, and you’re good to go - you will still keep your ellipsis, your em dashes, and your trademark symbols.

Tip 1: this will still leave the fancy quotes for backticks (those ` characters to the left of 1 on your keyboard) and the fancy apostrophe in the possessive ’s, like “This is Alex’s site”. If you want those removed as well, remove the elements '``' and '\'s' from static_characters and the corresponding values '&#8220;' and '&#8217;s' from static_replacements.

Tip 2: Want some other character replaced automatically? Then insert the text to find into static_characters, such as (c), and insert its replacement at the same position in static_replacements, such as &copy;
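
To see the effect without touching WordPress, here is a stripped-down sketch. The arrays are shortened copies of the ones above with the quote entries removed and a (c) entry added, texturizeSketch() is a made-up name, and I am assuming the two arrays end up in a plain str_replace() call.

```php
<?php
// Sketch only: shortened arrays and a hypothetical function name.
$static_characters   = array('---', ' -- ', '--', '...', ' (tm)', '(c)');
$static_replacements = array('&#8212;', ' &#8212; ', '&#8211;', '&#8230;', ' &#8482;', '&copy;');

/** Replaces each entry of $find with the entry at the same position in $replace. */
function texturizeSketch($text, array $find, array $replace)
{
    // Entries are applied in array order, so '---' is handled before '--'.
    return str_replace($find, $replace, $text);
}

$result = texturizeSketch("echo ''hi''... a---b (c)", $static_characters, $static_replacements);
// $result is: echo ''hi''&#8230; a&#8212;b &copy;
```

The doubled single quotes come out untouched, while the dots, the dashes and the (c) are still converted.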

PHP + MySQL links of the day

Patrick Galbraith from Grazr describes MySQL multi-master replication, where two nodes replicate each other’s updates. It is useful when you’re running the product out of multiple data centers and can’t predict where the writes will occur, i.e. both are hot MySQL servers.

IBM DeveloperWorks introduces us to XCache on PHP:

XCache is a relative newcomer, but many sites are reporting good results with it. In addition, it is easy to build, install, and configure because it’s implemented as a PHP extension. Recompiling Apache and PHP isn’t required. This article is based on XCache V1.2.0. It reliably supports PHP V4.3.11 to V4.4.4, PHP V5.1.x to V5.2.x, and early versions of PHP V6. (XCache doesn’t support PHP V5.0.x.) XCache works with mod_php and FastCGI, but not with the Common Gateway Interface (CGI) or the command-line PHP interpreter. The XCache source code builds on a variety of systems, including FreeBSD, Sun Solaris, Linux®, and (as shown here) on Mac OS X. XCache can be built on Microsoft® Windows®, as well, using the Cygwin UNIX® emulation environment or Visual C. You can build XCache for Cygwin or for native Win32. The latter target is compatible with the official Win32 release of PHP.

VIM tips for PHP developers

Fresh from a VIM talk, I was curious to see Andrei Zmievski post the VIM script files from his VIM for PHP programmers presentation. It’s not one of those 3-page presentations ending with “Read the VIM manual” either, it’s a 77-page guide to optimizing one’s VIM experience when writing PHP.

Some PHP links for the day

PHPBuilder runs a chapter from an APress book on PEAR. The chapter is dedicated to using PEAR authentication modules. Pretty much any site you build nowadays allows the users to register, choose a password, validate a password, send a password out when it’s forgotten, etc.

You use the Auth package to authenticate users in your site. Out of the box, it supports many different ways of authenticating users, including storage in a database, in files, or even by using SOAP calls. You can even write a custom container object that allows you to write your own method to authenticate users.

Donald McArthur over at NewsForge creates a generic reusable PHP calendar template that could be used on any site requiring inputting dates:

My design goals were to create a PHP page that would take as input a querystring value in the form of a Unix epoch number that would represent the beginning moment of a particular date. (I chose the Unix epoch number, which represents the number of seconds that have transpired since the start of January 1, 1970, as that was the data my database SQL statement used as a SELECT criteria.) The script would determine the month and year of that value, and create an array holding a Unix epoch number for the beginning moment for each day in that month. The script would then output HTML to display a calendar, with each date a hyperlink back to the original PHP page, with the associated querystring value for that date.
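
The core of that design fits in a few lines. This is my own sketch, not Donald McArthur’s code: the function names and the date querystring parameter are made up, and everything is computed in UTC to keep the numbers predictable.

```php
<?php
// Sketch only: not the article's code; names and the "date" parameter are assumptions.

/** Maps each day of the month containing $epoch to the epoch of its first second (UTC). */
function monthDayEpochs(int $epoch): array
{
    $year  = (int) gmdate('Y', $epoch);
    $month = (int) gmdate('n', $epoch);
    $days  = (int) gmdate('t', $epoch); // number of days in that month

    $epochs = [];
    for ($day = 1; $day <= $days; $day++) {
        $epochs[$day] = gmmktime(0, 0, 0, $month, $day, $year);
    }
    return $epochs;
}

/** Renders every day of the month as a link back to $page. */
function calendarLinks(int $epoch, string $page): string
{
    $links = [];
    foreach (monthDayEpochs($epoch) as $day => $start) {
        $href    = htmlspecialchars($page . '?date=' . $start);
        $links[] = '<a href="' . $href . '">' . $day . '</a>';
    }
    return implode(' ', $links);
}
```

A real calendar would wrap the links in a table with one row per week, but the array of per-day epoch values is the part the quoted description hinges on.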

Playing with ASP.net AJAX with PHP

Microsoft rebranded Atlas as ASP.NET AJAX and made the reusable components available as a public release. I only skimmed the news headlines, since Active Server Pages dot net Asynchronous JavaScript and XML did not hold any promise of being too useful on the LAMP stack, but now some libraries have been released on CodePlex that allow PHP to interact with Microsoft’s reusable components:

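// Load the library's base service class.
require_once '../../dist/MSAjaxService.php';

// A service with a single SayHello method.
class HelloService extends MSAjaxService
{
    function SayHello($name)
    {
        return "Hello, " . $name . "!";
    }
}

// Create the service and let it process the incoming request.
$h = new HelloService();
$h->ProcessRequest();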