Introduction to Sphinx Search

How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.

Sphinx stands for SQL Phrase Index. It is a free, open-source, powerful and easy-to-use full-text search engine that comes with APIs for PHP and other languages. Sphinx is used by some very large sites, such as craigslist.org and netlog.com.

Its key features include

  • high indexing speed (up to 10 MB/sec on modern CPUs)
  • high search speed (the average query takes under 0.1 sec on 2-4 GB text collections)
  • high scalability (up to 100 GB of text, up to 100 M documents on a single CPU)
  • supports distributed searching (since v.0.9.6)
  • supports MySQL natively (MyISAM and InnoDB tables are both supported)
  • supports phrase searching
  • supports phrase proximity ranking, providing good relevance
  • supports English and Russian stemming
  • supports any number of document fields (weights can be changed on the fly)
  • supports document groups
  • supports stopwords
  • supports different search modes (“match all”, “match phrase” and “match any” as of v.0.9.5)
  • pure-PHP search client API (i.e. no module compiling required)
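
To make the three search modes in the list above a little more concrete, here is a toy sketch in plain Python. It is not Sphinx code and does not use the Sphinx API; the function names (`tokenize`, `match_all`, `match_any`, `match_phrase`) are made up for illustration, and the tokenization is a simplifying assumption. It only shows roughly what each mode means for a single document.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplification)."""
    return re.findall(r"\w+", text.lower())


def match_all(query, document):
    """'Match all': every query word must occur somewhere in the document."""
    query_words = tokenize(query)
    doc_words = set(tokenize(document))
    return bool(query_words) and all(w in doc_words for w in query_words)


def match_any(query, document):
    """'Match any': at least one query word must occur in the document."""
    doc_words = set(tokenize(document))
    return any(w in doc_words for w in tokenize(query))


def match_phrase(query, document):
    """'Match phrase': the query words must occur consecutively, in order."""
    query_words = tokenize(query)
    doc_words = tokenize(document)
    n = len(query_words)
    if n == 0:
        return False
    # Slide a window of length n over the document tokens.
    return any(doc_words[i:i + n] == query_words
               for i in range(len(doc_words) - n + 1))


if __name__ == "__main__":
    doc = "Sphinx is a fast full-text search engine"
    print(match_all("search sphinx", doc))
    print(match_any("sphinx lucene", doc))
    print(match_phrase("search engine", doc))
    print(match_phrase("engine search", doc))
```

A real engine answers these questions from a prebuilt index rather than by scanning each document, which is where the speed figures above come from.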

Sphinx is one of the tools I have most enjoyed working with. I love its speed, stability and cool features. For more, see the following links.

In later posts we will look at how to use Sphinx.
