Archives for the Silicon Valley category

Cuill getting heavy with indexing

Might be coincidental, but looking through the logs of a few sites I host, I noticed Cuill, a new search engine with supposedly faster indexing methods, going through quite a few pages:

38.99.13.123 - - [25/Dec/2007:17:21:10 -0800] "GET /page.php HTTP/1.0" 200 17123 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)"

Looks like there are a few other folks perplexed with the intensity of Cuill indexing (it wasn’t anything to stress about in my case, but was pretty noticeable).
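If you want to check how heavily the crawler is hitting your own site, a rough sketch along these lines would tally its requests per page. It assumes your server writes the Apache "combined" log format (the one the log line above is in); the function name, the second sample line and its user agent are made up for illustration.

```python
import re
from collections import Counter

# Apache "combined" log format: host, identity, user, [time], "request",
# status, bytes, "referer", "user agent".
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_crawler_hits(lines, marker="Twiceler"):
    """Count requests per path whose user agent contains `marker`."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None or marker not in match.group("agent"):
            continue  # unparseable line, or not the crawler we care about
        parts = match.group("request").split()
        # A request line looks like "GET /page.php HTTP/1.0"; take the path.
        path = parts[1] if len(parts) >= 2 else "?"
        hits[path] += 1
    return hits


# The first line is the one from my logs; the second is a hypothetical
# non-crawler request, included only to show that it gets skipped.
sample = [
    '38.99.13.123 - - [25/Dec/2007:17:21:10 -0800] "GET /page.php HTTP/1.0" '
    '200 17123 "-" '
    '"Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)"',
    '10.0.0.1 - - [25/Dec/2007:17:21:11 -0800] "GET /index.php HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (hypothetical browser)"',
]

print(count_crawler_hits(sample))
```

To run it against a real log, pass an open file object instead of the sample list, since the function only needs something it can iterate over line by line.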

Web 2.0 event at Plug and Play Tech Center

Ramu Yalamanchi of Hi5

I went to my first Plug and Play Tech Center Web 2.0 event (a mouthful to say) tonight. Sunnyvale’s Plug and Play Tech Center is a project by Amidzad Partners, and if you don’t know who they are, but always wondered how downtown Palo Alto manages to have so many rug stores, The New York Times had the answer a few months back. The center’s Web 2.0 event is monthly, and features a handful of startup presentations and a keynote speech. The keynote tonight was from Ramu Yalamanchi, founder and CEO of Hi5. Ramu is a serial entrepreneur and shared his common-sense advice on starting and running a business. He seemed to favor business models that do not require raising tons of money right away, and starting his newest venture in 2004 taught him to concentrate on profitability right from the start.

As for the presenting startups, the topic for the night was widgets and applications embeddable in social networks. TripWiser presented their social traveling Facebook application Going Places, which allows you to specify places you’ve been to and places you want to go, and to see what your friends are up to as far as travel activities. Minekey presented their Facebook application iThink, and announced a promotional $300 giveaway in order to drive up usage of their app. iThink allows the user to agree or disagree with controversial statements (“Women are better drivers than men”, “Angelina Jolie is over-rated”), providing a clearer picture of who your friends are, and what they think.

The College Freeway presents

The College Freeway allows its users to log in using their Facebook IDs, but is a destination site. A few people in the audience seemed to think that this required a special deal with Facebook, but allowing Facebook logins on third-party sites had been in the API long before the Facebook Platform was released. The College Freeway allows students to upload class notes for their college, and considers itself a service to the universities and professors, just like OpenCourseWare is. Pollection is another company that presented a Facebook application that’s already pretty successful - Polls.

There were two event startups - imThere and MadeIt. The first one focuses on mobile event interaction; the other one is all about creating easy full-blown Web sites for specific events, where people can add photos, videos, audio, etc. for an event and meet new people via events they attend.

Two companies helping application/widget developers to reach markets - gigya offering interfaces for embedding practically any JavaScript/Flash widget on a social networking or startpage site, and Social URL aggregating social networking profiles into one place, and providing some services on top of aggregation.

InnerCircle.cc solves the problem of e-mailing some content or photos over and over to the same people. It allows you to set up a special e-mail address on their server, and then forwards each message sent to that address to a number of recipients. iPling pitched itself as the first iPhone-only company. It allows users to set their current mood and express their thoughts anonymously, and then charges for anonymous SMS. So theoretically if you’re at a party and looking for a date, you can set your status to “Seeking date”, and someone else could send you an anonymous SMS expressing their interest.

Overall, interesting businesses and interesting business models. It’s worth noting that even nowadays, in the heyday of the Internet advertising market, many startups are thinking about alternative monetization methods, such as subscriptions and add-on services.

Facebook source leaked - get it all here

Earlier today TechCrunch posted an item regarding Facebook servers exposing raw PHP code, and the story has been making the rounds of the blogosphere echo chamber, getting more negative each time around.

There are two important things that need to be addressed. First. No matter how sexy a theory about a disgruntled employee or a cunning attacker may sound, the story posted by Brandee in the TechCrunch comments is somewhat duller - sometimes those .php files end up being served raw, not interpreted by PHP, on an Apache server.

Second. Source code is not user data. Not to go into Web Page Building 101 here (the course might be available at a local friendly community college), but data is stored in the databases, which are then accessed by some code (PHP in this case), and displayed to the user. What’s displayed is always visible to the user (View Source in your browser), the code is sometimes open (WordPress, Joomla, Drupal) and sometimes not (pretty much any non-standard Web site out there), while the DB is always locked down from outside peeks, unless you have developers do some stupid things, like leave the username and password in the PHP code, and allow outside access. Generally speaking, even if I have all the source code for a certain Web site, it’s still impossible for me to take a peek at the data.

But most of you didn’t come here for the lesson in basic Web building. Judging by the title, you wanted to get Facebook source. The more the better. So here it is.

  1. Facebook Thrift - developed, supported and actually used by Facebook, this is a set of libraries and code generators to allow for maximum throughput data transfers between a client and a server. If you’ve got some server that speaks C++ or Java, and some client that speaks Python or PHP, you can have those two living in perfect harmony, clients issuing the client requests in whatever language they prefer, and servers responding back with the data structures in their preferred language. Read the whitepaper here or join the group here. And guess what, you can download the source.
  2. Memcached - originally written by the guys who created LiveJournal, this “high-performance, distributed memory object caching system” is quite popular inside Facebook, as evidenced in this mailing list posting by our engineer Steve Grimm. You can naturally get the source of that, too, to add it to your Facebook source collection.
  3. phpsh - another product written by Facebook engineers and used throughout the company. Ever wished PHP had an interactive shell, just like the one you get when you download Python? Facebook’s phpsh is written (get this) in Python, but offers some of the best interactive shell features to a PHP developer. Ever need to execute a single function just to see what the output will be? Just type the function name with parameters and see it run. Curious to see where a certain function lives? Just do d function_name to get the definition of that function together with its location in the codebase. e function_name opens up emacs, and gets you to the exact location of that function in the code. It’s downloadable here with source available.
  4. Facebook toolbar for Firefox is also open source, since that’s the way Firefox extensions are distributed. Ever wanted to build a Firefox toolbar of your own incorporating some features of Facebook into it? By installing the toolbar, you get the sources for it placed in your Firefox extensions directory.
  5. Facebook’s APC - what would you give for a copy of Facebook’s APC configuration? Don’t answer yet, as Facebook engineer Brian Shire provides it for free in his APC@Facebook talk he’s given at PHP conferences. It talks about optimal configuration and trade-offs one needs to consider when optimizing a large number of servers running PHP.
  6. Facebook’s PHP client for Facebook platform - granted, it would be weird if the company did not open source that, but nevertheless, if you ever wanted to see samples of PHP code and run them against Facebook servers, this is your best bet. Java client is available from Facebook as well, with the rest of the client code being unofficial, which doesn’t mean it’s not good, it’s just written and supported by someone else.
  7. And finally, the PHP scripting language. Not developed by Facebook, but actively used, with some contributions to the codebase as well. In fact, a quick search around the mailing list area lets you know what those contributions are. PHP is downloadable, with source, naturally, available to anyone who cares to peruse it.

Hopefully this will satiate any hunger for Facebook code, and when you feel very comfortable with everything described above (or maybe none of that was news to you), feel free to drop me a line with a resume attached, if you so desire. The name is alex, what follows after the @ should probably be obvious.

Google Answers relaunches, so far in Russia only

Google Answers launched today for the Russian market, as announced on the Google Russia blog by a product manager from Mountain View, so it looks like expansion to other languages is only a matter of time. The service is not entirely unlike a competing answers service from another Internet company in Silicon Valley, but does have some Google-specific features, such as tagging and prominence of search (do a view image to see the full screenshot).

Front page of Google Answers

Every new user to the system starts off with 100 points, and can spend those points asking a question. The cost of the question can be 10, 20, 30, 50, 80 or 100 points. A daily login to the site will earn you 5 points, every answer to a question will earn you 2 points, and every rating for a specific answer will earn you 1 point.

One can also specify the number of days before the question is considered closed. The values are 1, 2, 3, 5, 10, 20, 30, with the default of 5 suggested. The best answer gets all the points paid by the user who asked that question, so there’s motivation in answering higher-priced questions first. If in the process the answer gets a high rating from other users, the author of the answer gets additional points. If the answer is “dugg down”, the author of the answer can lose points.
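The point rules described above can be sketched as a small ledger. This is only my reading of the rules, not Google's implementation: the class and function names are made up, and since the size of the rating bonus and the "dugg down" penalty isn't stated anywhere, the sketch leaves those two out.

```python
# Allowed values as described in the post.
QUESTION_COSTS = (10, 20, 30, 50, 80, 100)
CLOSE_AFTER_DAYS = (1, 2, 3, 5, 10, 20, 30)


class Account:
    """Hypothetical per-user point ledger."""

    def __init__(self):
        self.points = 100  # every new user starts with 100 points

    def daily_login(self):
        self.points += 5

    def post_answer(self):
        self.points += 2

    def rate_answer(self):
        self.points += 1

    def ask(self, cost, days=5):
        """Spend `cost` points on a question that closes after `days` days."""
        if cost not in QUESTION_COSTS:
            raise ValueError(f"cost must be one of {QUESTION_COSTS}")
        if days not in CLOSE_AFTER_DAYS:
            raise ValueError(f"days must be one of {CLOSE_AFTER_DAYS}")
        if cost > self.points:
            raise ValueError("not enough points")
        self.points -= cost
        return {"bounty": cost, "days": days}


def award_best_answer(question, answerer):
    """The best answer collects everything the asker paid."""
    answerer.points += question["bounty"]
    question["bounty"] = 0


asker, helper = Account(), Account()
asker.daily_login()            # asker: 100 + 5
question = asker.ask(30)       # asker: 105 - 30, closes in the default 5 days
helper.post_answer()           # helper: 100 + 2
award_best_answer(question, helper)
print(asker.points, helper.points)  # 75 132
```

The 30-point question nets the helper fifteen times what a plain answer earns, which is the incentive to chase the pricier questions first.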

Google Answers - ask a question

The sidebar links allow you to browse the questions you’ve asked, the answers you’ve submitted, the tags you’ve subscribed to (Google will probably call them labels in English UI), and the starred Q&A.

Google Answers - answer a question

Ostrich burger

A two-hour drive from Silicon Valley and an hour away from Yosemite National Park, right at the intersection of Highways 140 and 49, there’s a Happy Burger Diner in Mariposa, CA. For $4.99 extra you can have any burger made with an ostrich patty, which I enjoyed tremendously, even though ostrich meat is a bit drier than any meat I am used to.

Ostrich burger

SearchSIG on personal search

Spock, ZoomInfo, and Wink presented at tonight’s SearchSIG. The event was hosted by Google on its Mountain View campus, and moderated by Michael Arrington of TechCrunch. Each startup presented its own vision of personal search: Spock collects all sorts of personal information from public Internet sites, ZoomInfo crawls various directories and corporate sites in order to create a business-oriented people directory, and Wink also parses all sorts of public sites in order to aggregate a single profile, which they then allow the user to own.

If you like Arrington as a TechCrunch writer, you’d definitely love him as a panel moderator. He’s not confrontational, but he definitely cuts through marketing bs in order to get his questions answered. All participants kinda stumbled around monetization, but then agreed that currently they maintain somewhere between $1.50 and $2.00 CPM. ZoomInfo also sells premium subscriptions to some of the business-related information, and currently is profitable.
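For a sense of scale, CPM is the price per thousand ad impressions, so the quoted range is easy to translate into revenue. A tiny sketch follows; the one million impressions figure is purely hypothetical and not something any of the panelists quoted.

```python
def ad_revenue(impressions, cpm_dollars):
    """Revenue in dollars: CPM is the price per 1,000 impressions."""
    return impressions / 1000 * cpm_dollars


# Hypothetical traffic figure, chosen only to make the arithmetic easy.
impressions = 1_000_000
low = ad_revenue(impressions, 1.50)
high = ad_revenue(impressions, 2.00)
print(f"${low:,.0f} to ${high:,.0f}")  # $1,500 to $2,000
```

In other words, at these rates every million ad impressions brings in somewhere between $1,500 and $2,000.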

Not too many people are sure how to monetize people search - you generally can show some contextual ads on the celebrity profiles, since you roughly know what the visitors are looking for, but a search for a relatively generic name (your former coworker, classmate, etc.) is hardly monetizable. The Wink demo was a particularly good illustration of this: as the CEO was browsing the site, the only contextual ads that would show up on the right were dating ads served by Google AdSense.

At the same time, many agreed the opportunity is there - anywhere between 1.5 billion and 2 billion searches a month are for people. If you saw Dustin’s slide from Facebook tech tasting, you know that Facebook alone generates 600 million searches a month (with the actual share of people searches not being disclosed). Spock seemed to think it’s going to be great to allow people to tag other people, and Arrington pressed them on the scenario where someone would be tagged as “pedophile” or “unethical”, at which point the CEO did a little arm-waving, referring to the “community process”. I want to see that tested when thousands of diggers get a chance to tag anybody employed by the RIAA/MPAA, or thousands of slashdotters get to tag an employee of SCO or Microsoft. It didn’t look like anybody had any good idea on dealing with tag spam, malicious tagging, or misrepresentation by claiming someone’s profile on Wink or Spock.

Overall, looks like the industry is at a fairly early stage, with more questions than answers. Pressured by big search engines from one end, and social networks from the other, people search engines need to come up with some winning value proposition that makes customers either reach for their credit cards, or spend more time on the engines themselves, consuming some ads meanwhile.

DoubleClick CEO on ad market size

In today’s Wall Street Journal David Rosenblatt, CEO of DoubleClick, responds to some questions regarding Google’s acquisition of his company, and quotes some interesting stats from the online advertising market:

There are somewhere between half a million and a million search advertisers in the market, there are probably only a couple to five thousand graphical advertisers and probably less than a hundred video advertisers. There is no reason for that imbalance to exist. So one of our goals is to increase efficiencies with which people buy and sell video advertising and democratize access to the process in the same way that Google has democratized access to the search market.

There’s also an important quote on data collection practices, as right after the acquisition announcement I saw a huge increase in the number of articles speculating on Google tapping into the browsing habits of DoubleClick banner recipients:

Ad-serving information collected by DoubleClick — and this is a really important point — has always been the property of our clients, not us. And so a change of ownership of DoubleClick will not change the terms of those contracts…so we are very comfortable with our current policy.

Which effectively means that cookies collected by DoubleClick on behalf of the advertising clients will remain the property of the advertising clients, and will not be combined with search data to further profile Internet users.

NeuroSky to capture, interpret brain activity

Associated Press profiles NeuroSky, a company that started selling a brain activity sensor and an algorithm library to analyze it. The current application is better video games, where a golfer incapable of concentrating on the game would make an inferior move, or a scared Grand Theft Auto player would lose the precision in his aim. An EE Times article from 2005 says the company hired the top neuroscience experts from Moscow and licensed their inventions in order to produce a device that is capable of recognizing and interpreting brain activity.

Earlier this year some German scientists used brain surveillance techniques to determine whether the test participants had decided to add or subtract a number, and reached a 71% accuracy rate.

D-Wave’s quantum computer demoed

Canadian company D-Wave Systems is getting some technology press buzz after successfully demonstrating its quantum computer, which the company plans to rent out. Scientific American has more of a technical description of how the quantum computer works as well as possible areas of application: “The quantum computer was given three problems to solve: searching for molecular structures that match a target molecule, creating a complicated seating plan, and filling in Sudoku puzzles.” There are also some videos from the demo.

Who really makes money off online video

Today Associated Press ran an article on how Cisco Systems benefited from the online video explosion and apparently is either considering new product lines or vastly improving the existing ones to accommodate the YouTube phenomenon:

Charlie Giancarlo, Cisco senior vice president and chief development officer, said the push also is forcing Cisco to change the way its products are built to deliver high-quality video, which puts more strain on the network than voice or data transmissions.

What’s interesting here is that the flagship online video site, YouTube, has never made a dime of profit, and depending on the server bills and advertiser saturation, might never get into the black. Same goes for Google Video and other large-scale video projects, whose bandwidth-to-ads margins are, one could imagine, pretty slim. At the same time there are these “hidden” players of the industry, who benefit tremendously, since their stuff sells and sells fast. Cisco is one of them; another hidden flagship of the online video industry is Akamai, whose share price going from $21.88 to $54.64 over 12 months looks pretty nice to any investor who got in at the right time.

So if you judge the online video industry by the money that YouTube and Google Video made, it’s really insignificant. If you judge by the sales that Cisco and Akamai made, all of a sudden it’s a multi-billion-dollar industry.

Same goes for some other pairings, like e-commerce and FedEx. If you track e-commerce trends by the sales and margins of Amazon.com, Staples.com or other players, you might think that the margins are tough and price competition is enormous, therefore being in e-commerce is more of a curse than a solid business model. But both successful e-commerce shops and unsuccessful ones ultimately turn to their FedEx guy for shipping, and a 5-year curve of FDX follows US e-commerce volume more closely than that of AMZN or other significant pure-e-commerce players.