2/13/2008

Interview with MSN Web Search Team

The recent launch of MSN's own, unique web search technology has catapulted Microsoft into direct competition with Yahoo!, Google, AskJeeves and the smaller web search engines like Gigablast & Clusty. The following interview was conducted over the course of 3 weeks through email questions and answers between myself and the entire MSN search team (with Zac Rivera of Maloney & Fox as the PR contact).

In the initial interview, I asked 11 questions relating to the development of the MSN Search project, some issues that are currently important in the world of web search and a couple questions on the future of MSN Search. I followed up with another 5 questions and received answers to all of these as well.


#1: The Creation of MSN Search

Rand: When did MSN first decide to develop its own search engine and what was the catalyst? Was this an internal decision that had been brewing for a long time or something that was influenced externally?

MSN Search: MSN has operated a web search service since 1998 following a 3rd party model. In January of 2003 the decision was made to build our own technology from the ground up in order to allow us to innovate more quickly and easily and to meet customer and merchant demands. The growth in the search marketplace was also an incentive for us to build our own engine. Finally, Microsoft has a long heritage of search technology productization (Windows NT, Windows, Office, Sharepoint) and research, so in a way it was past due time for us to get into the search engine game.

Follow-Up
Rand: MSN has recently had some news about a paid search service, similar to Yahoo’s Search Marketing Solutions (formerly Overture) and Google’s Adwords. Will this new product be search based only, or include a content advertising component? What differences, if any, will MSN’s service have from the competition – why should search marketers be excited about this product?

MSN Search: The new pilot MSN paid-search solution, built on the MSN adCenter platform, is in the early stages of development. Yahoo will continue to provide MSN with its paid search solution as they have and continue to be an important strategic partner for us. Additional details on MSN adCenter and the paid search solution will be provided in the coming months.

Our vision is to utilize the power of Microsoft software to enable advertisers to meet their brand and direct needs in a new way through a complete suite of advertising technology, where a complete end to end campaign is managed via one system. We’ve heard two complaints from agencies and marketers about the offerings available today to manage online advertising. First, there isn’t one place they can go to manage all of their online campaigns, which makes campaign management difficult and tedious. Second, there is a current lack of business intelligence available to advertisers online. We are developing MSN adCenter to address these needs by delivering a one-stop shop where advertisers can manage all their campaigns across MSN globally as well as access audience intelligence.

#2: Community Input

Rand: When you released MSN in beta, did you receive a lot of input from the web community? Were those recommendations able to help you in the development process? Can you think of anything in particular that stands out as being a terrific idea from a user that was eventually implemented?

MSN Search: We actually launched a Technology Preview of the engine in July of 2004. This was simply just the bare bones engine with no UI layered on it. This allowed us to connect early with web masters and other tech enthusiasts to make sure our crawler was doing what we intended it to do. The more public beta of the service launched in November of 2004 and we received a lot of feedback from consumers on the types of instant answers they would like to see. This will be a focus for us on-going, helping answer questions, not just providing links.

Follow-Up
Rand: When you say “instant answers” are you talking about human-entered data in response to a question, or just more human-like machine responses – natural language answers? Is this an area that MSN intends to develop, or do you see MSN as staying focused on serving web pages as results?

MSN Search: MSN Search provides more than 1.5 million factual, instant answers to specific questions in the Encarta (No. 1 best-selling encyclopedia brand) database – such as the Longest River in the World, the third highest mountain, the population of a country, or the birth or death dates of people. The queries do not have to be entered as a question necessarily. MSN Search’s technology can isolate the key parts of a query and compare that against Encarta’s structured database. Certainly some natural language programming is part of this. The answers are delivered at the top of the search results, called out by an orange “carrot.”
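Rand's Note: the matching step described above can be pictured with a small sketch. This is only an illustration under my own assumptions, not MSN's implementation: the `FACTS` table, the `STOPWORDS` list and both function names are hypothetical, and a real system would use far richer query parsing than dropping filler words.

```python
import re

# Hypothetical stand-in for a structured fact database (not Encarta's schema).
FACTS = {
    frozenset({"longest", "river", "world"}): "Nile",
    frozenset({"third", "highest", "mountain"}): "Kangchenjunga",
}

# Filler words ignored when isolating the key parts of a query (illustrative).
STOPWORDS = {"what", "is", "the", "in", "of", "a", "an", "which"}


def key_terms(query):
    """Lowercase the query, split it into words, and drop filler words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return frozenset(w for w in words if w not in STOPWORDS)


def instant_answer(query):
    """Return the stored answer whose key terms match the query, else None."""
    return FACTS.get(key_terms(query))
```

Because only the key terms are compared, "What is the longest river in the world?" and "longest river world" resolve to the same entry, which matches the team's point that the query does not have to be phrased as a question.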

#3: MSN Search & Hardware

Rand: Can you tell us something about the hardware infrastructure required for the new search engine? Are you making use of large clusters of inexpensive machines as some of your rivals have done, or did you opt for a different methodology?

MSN Search: MSN Search has the benefit of being part of Microsoft and is able to leverage the incredible infrastructure assets across the company. Specifics on the infrastructure are not something we discuss publicly.

#4: Crawling

Rand: Obviously, MSN Search had to start crawling and indexing from the ground up, can you tell us what that process was like? Many in the webmaster community feel that MSN was able to crawl and index exceptionally quickly, but that they’ve recently slowed down, is that the case? Was there a “ramp-up” time where more resources were devoted to finding web documents and indexing them?

MSN Search: We started very small…indexing 25 some docs and now are north of five billion documents. We have a focused effort on crawling more frequently so we can keep our index as up to date as possible. There is obviously a balance here of hitting index sites too much, and now that the index is built we can be more selective.

#5: Corpus Size

Rand: Can you tell me how big MSN’s repository of indexed pages is, or tell me what neighborhood it’s in? What is the current rate of growth of the index – 5 million pages per week, or 50 million? I understand if this is confidential, but it would be great to know if it’s over or under a particular number.

MSN Search: To be honest it varies on what has been updated online. We index daily, weekly, and monthly.

Rand's Note: in #4 - the team mentions that the index is "north of five billion documents."

#6: Competing against Google

Rand: How do you feel about competing against Google’s brand? Is there concern that tackling Google’s user base is more difficult because of the consumer branding that they’ve been able to achieve?

MSN Search: There are a number of great competitors in the Search space. We recognize we are underdogs today – we’re in the game now in a solid third place, but we have a way to go still. Building on the MSN brand makes sense for us because of all the other MSN services we’re able to tap into. Beyond brand, though, our main focus by far is on providing a unique experience that gives customers precisely what they are looking for. Succeeding on quality is paramount; we need to do that first and foremost.

Follow-Up
Rand: With the MSN brand, what have you generally found to be the most positive elements that users appreciate about the technology & functionality? What are some specific weaknesses that you’re currently working to upgrade?

MSN Search: The connectivity of the MSN brand is an asset, connecting people to what is important to them. For example with the MSN Toolbar, you can have immediate access to other MSN properties that millions of people use every day, whether it is Instant Messenger, or Hotmail and Spaces. With MSN.com we have optimized it for Search, making it easier for consumers to find what they’re looking for. In general MSN wants to connect consumers to the people and information that is important to them; Search and our other properties are means to that end. As far as upgrades, we’re always upgrading and making changes based on consumer feedback. Delivering a meaningful user experience is key and will offer an incentive to people to return. The old marketing adage still holds true: “surprise and delight.”

#7: Quality Testing & Spam

Rand: When you do quality testing for the MSN Search results, what are some of the biggest problems you encounter? What specific kinds of spam or manipulation are particularly aggrieving to you?

MSN Search: Spam has been one of our biggest challenges and we are working with other search engines such as Google and Yahoo to help combat spam. One example of this is the work with nofollow. Another major challenge is listening to and incorporating all of our users’ feedback on the quality of our results. We receive thousands of pieces of feedback every day. This data is extremely valuable and in order to leverage it we have built processes and systems that help us analyze all of this data.
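Rand's Note: nofollow is the `rel="nofollow"` value a site can put on a link to tell engines not to count it as an endorsement. As a rough sketch of how a crawler might separate such links, here is a short example using Python's standard `html.parser`. The class and function names are my own, and nothing here reflects how MSN's crawler actually handles the attribute.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects link targets, keeping rel="nofollow" links in a separate list."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollow) lists of hrefs found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.followed, parser.nofollow
```

A ranking system could then leave the second list out of any link-based reputation calculation, which is the effect blog owners want for links posted in comment spam.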

Follow-Up
Rand: What is your position on individual websites or webmasters who use spam techniques to manipulate the search results? Is MSN’s policy to take action against the spammers and their websites or simply to attempt large scale solutions to negate the effects of the spam? Which do you feel is a bigger focus or do you actively engage in both?

MSN Search: There are two approaches that we take. First, we explicitly disallow certain techniques that are often used by spammers – stuffing pages with a lot of keywords, doorway pages. If we identify people using these techniques we will remove their pages from the index. Second, but not distinct, we have an algorithmic spam detection tool that looks for pages that are low value or likely the result of being spammed.
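Rand's Note: the team did not say how its detection tool works. Purely to illustrate the keyword stuffing idea, here is a naive check that flags text where a single word makes up an outsized share of all words. The function name, the 30% threshold and the 20-word minimum are arbitrary assumptions of mine; real spam detection relies on many more signals.

```python
import re
from collections import Counter


def looks_stuffed(text, threshold=0.3, min_words=20):
    """Flag text whose most frequent word exceeds `threshold` of all words.

    Texts shorter than `min_words` are never flagged, since word shares
    in very short snippets are too noisy to mean anything.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < min_words:
        return False
    _, count = Counter(words).most_common(1)[0]
    return count / len(words) > threshold
```

A check this simple is easy to evade and would misfire on some legitimate pages, so it is best read as a picture of the concept rather than a usable filter.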

#8: Working for MSN Search

Rand: If a person were interested in working for the MSN Search team, what academic or business endeavors and goals would you recommend to them? Where does most of your recruiting come from?

MSN Search: In general Microsoft is fanatical about hiring smart people who are passionate about technology and have great drive to get the right things done. Our team members come from a broad range of academic backgrounds, with the heaviest focus on mathematics, computer science, computer engineering, data warehousing and data mining, interaction design, and related research fields. We also have a number of business-focused people, typically with MBA backgrounds. We recruit from many sources across academia and the software industry, with most hires in North America, but also many from around the world.

#9: The People Behind MSN Search

Rand: How many people are currently employed by MSN’s search team? Is it a very close-knit team, or geographically disparate? Can I expect to see you at the company picnic in North Bend this year?

MSN Search: We have an awesome team, some of the best talent from across Microsoft resides in the Search halls. You might laugh at this, but… the easiest way to think about MSN Search is, “small scrappy startup”. The team is moderately sized by Microsoft standards, and we are growing fast, always looking for talented people to join the group. The core MSN Search team works on internet search, the toolbar, and desktop search, and is based in Redmond and the San Francisco bay area. We also partner with many other teams like Microsoft Research, with the end result that search is truly a company-wide effort.

#10: Non-Search Technologies

Rand: What other technologies do you find exciting or enjoyable on the web (besides search)? What are some favorite websites of the search team members?

MSN Search:
News
A lot of us are news junkies and as a result we use Newsbot a lot. http://newsbot.msnbc.msn.com

Archives
Very often we are interested in the history of a site and how it has changed over time. Since our data only goes back a couple of years we like to tap into the Internet Archive at www.archive.org.

Popularity
When you work on search you become obsessed with reputation based ranking systems. It is always interesting to check other sites that measure domain popularity. Alexa.com is a great example of this.

#11: Future Developments

Rand: What new developments at MSN Search are you most excited about? What kinds of releases should we anticipate in the next 6-12 months?

MSN Search: Keep your eyes on us…now that we have the platform, the foundation, we can innovate very quickly. The last two years was a lot of hard work, now we can have some fun and introduce a lot of unique features. Search is still early, so there is a lot more to come.

Follow-Up
Rand: MSN Search currently operates in several verticals, are there additional markets you plan to expand to – blogs or news feeds? Is MSN Search interested in the recent explosion in popularity of folksonomies and tagging?

MSN Search: Local, blogs, mobile, APIs and RSS are all important next areas of development for us and anyone serious about Search.
