Discussion of improvements of site search here on Cotonti Headquarters

#1 2012-07-05 07:41

There have been quite a few complaints that search results on this site are often irrelevant and contain strange entries like "...".

For now I've enabled Google search results on the same page, so you can use it and compare the relevance of the results. Please test both searches and tell us how useful each of them is.

Later we will decide whether to improve the Find module further, build a new search module using Sphinx, or switch to Google's search on this site.

Other improvement ideas are welcome too.

#2 2012-07-05 16:42

#3 2012-07-06 08:06

What can I say: I searched for "Файловый архив в Siena" ("File archive in Siena") to find my topic. Google: first position in the results. The standard search can't find the topic at all.

#4 2012-07-06 10:15

I'm not surprised Google comes up with better results. They have spent millions tweaking their crawlers and search engine. Find is a two-month project, intended to provide an alternative to the old search plugin using a search method that is more efficient for large sites (index-based vs. full-text). The index-based method seems not to work so well, especially with languages written in Cyrillic. It could probably be improved a lot, but I doubt it's worth the effort (using Sphider would be a better choice in this case).
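To make the index-based vs. full-text distinction concrete, here is a minimal sketch in Python. It is not Find's actual code: the sample titles, the function names and the naive substring scan standing in for the full-text approach are all assumptions made for illustration. It also shows one way to tokenize Cyrillic text, relying on the fact that `\w` in Python 3 string patterns matches Unicode word characters.

```python
import re
from collections import defaultdict

# Hypothetical sample data; these titles are made up for illustration.
DOCS = {
    1: "Файловый архив в Siena",
    2: "File archive plugin for Genoa",
    3: "Search improvements in Siena",
}

# For str patterns in Python 3, \w is Unicode-aware, so Cyrillic words
# are tokenized the same way as Latin ones.
TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Split text into case-folded word tokens."""
    return [t.casefold() for t in TOKEN_RE.findall(text)]


def build_index(docs):
    """Index-based approach: map each token to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search_index(index, query):
    """Return ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


def search_scan(docs, query):
    """Naive stand-in for full-text search: scan every document for the phrase."""
    needle = query.casefold()
    return {doc_id for doc_id, text in docs.items() if needle in text.casefold()}


INDEX = build_index(DOCS)
print(search_index(INDEX, "Файловый архив в Siena"))
print(search_scan(DOCS, "архив"))
```

The index lookup touches only the postings of the query tokens, while the scan reads every document on every query, which is why an index scales better on large sites. The quality of the index still depends on the tokenizer: an ASCII-only tokenizer would silently drop Cyrillic words from it.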

I'd be fine with using either Google search or Sphinx on Cotonti. Google would be the easy choice, but creating a Sphinx module for Cotonti will allow large sites to have a reliable and free search engine without the Google branding. Sites not on a VPS are probably small enough to use the search plugin.

Here are our options:

  • Google Custom Search: Good results; not customizable (Google branding)
  • Sphinx: Good results; customizable; requires VPS / dedicated hosting
  • Sphider: Probably pretty good results; customizable; works on shared hosting; project is no longer maintained
  • Find: Poor results; very customizable; tight integration with Cotonti; works on shared hosting

My initial idea was to modify Find so it uses Sphider. That should improve its search results a lot. The downside is that Sphider is no longer maintained, so we're on our own with that. I think this is the best option for medium-sized websites running on shared hosting. For this site, Sphinx is probably the best option, as we'd like to be able to customize the results (both visually and functionally, e.g. search in a specific site area).

#5 2012-07-06 12:16

We're quite limited in human resources. E.g. Sphinx is indeed the best solution for sites running on a VPS (like this one), but it requires such site owners to invest in the development of a module with sufficient features and reliability. I assume it's quite a similar story with Sphider.

The main problem with Google is that you can't customize it and you can't use it for private site sections. I made a plugin for the commercial version of Google Site Search for Genoa, and it is pretty customizable (you can make your own TPLs and you can make it search in specific site sections), but the prices start at $100/yr (for just 20k requests per year).

