
MySQL: What is the 50-percent threshold?

Question by Sledge | 2012-06-08 at 10:21

In connection with MySQL's full-text search, I have often heard that a so-called 50 percent threshold is applied. The problem is that this term is only ever mentioned, but never explained or defined.

So, can someone explain this 50% threshold to me?

Best Answer

The 50 percent threshold in MySQL's full-text search is meant to produce better search results. It means that every word occurring in at least 50 percent of the records is treated as too common and is ignored in a natural-language search. This applies to full-text indexes on MyISAM tables.

The idea is the following: words like "then" or "have" appear in almost every text, so searching for them does not improve the search result significantly. Quite the opposite: such a search would return almost every record in the database.

Therefore, MySQL focuses on the significant words that actually distinguish the individual records from one another.

This is problematic, of course, when we have very few records, because relevant words can then occur (by chance) in half or even all of the available records. So, the MySQL full-text search with this threshold really only makes sense with a large number of records in the database.
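To make the rule a bit more tangible, here is a small Python sketch of the idea. It is only a toy model and not how MySQL implements it internally: the function name `significant_words`, the sample rows and the simple word splitting are all made up for illustration.

```python
# Toy model of the 50% threshold; NOT MySQL's actual implementation.
import re


def significant_words(rows, threshold=0.5):
    """Return the words that occur in fewer than `threshold` of the rows."""
    doc_freq = {}
    for text in rows:
        # Count every word only once per row (document frequency).
        for word in set(re.findall(r"[a-z]+", text.lower())):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    limit = threshold * len(rows)
    # Words found in at least half of the rows are dropped.
    return {word for word, count in doc_freq.items() if count < limit}


rows = [
    "we have a new database server",
    "they have fixed the search index",
    "you have to rebuild the index",
    "backups have finished",
]
kept = significant_words(rows)  # "have", "the" and "index" are dropped

# With only two rows, every word is in at least half of them,
# so nothing is left to search for.
tiny = significant_words(["mysql search", "mysql index"])
```

In this model, "have" is dropped because it appears in all four rows, and "index" is dropped as well because it appears in exactly half of them. The two-row example shows the small-table problem: there, no word at all stays below the threshold.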

If we would like to avoid the 50% threshold, we can use MySQL's full-text search "IN BOOLEAN MODE" instead. Here, the 50 percent threshold is not applied, and we can even specify more precisely which words must occur and which must not.
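In MySQL itself, such a query looks something like MATCH (title, body) AGAINST ('+have -index' IN BOOLEAN MODE), where the column names title and body are just assumed for the example. The following Python sketch mimics only the two operators "+" (word must occur) and "-" (word must not occur). It is a simplified illustration with a made-up function name, not MySQL's real matching logic, and it ignores ranking, stopwords and all other operators.

```python
# Toy model of "+word" / "-word" in boolean mode; NOT MySQL's real matcher.
import re


def boolean_match(query, text):
    """Check one text against a query made of +required, -excluded and plain words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    required, excluded, optional = [], [], []
    for term in query.lower().split():
        if term.startswith("+"):
            required.append(term[1:])
        elif term.startswith("-"):
            excluded.append(term[1:])
        else:
            optional.append(term)
    # Any excluded word rules the text out.
    if any(word in words for word in excluded):
        return False
    # If there are required words, all of them must be present.
    if required:
        return all(word in words for word in required)
    # Otherwise at least one plain word has to be present.
    return any(word in words for word in optional)


rows = [
    "they have fixed the search index",
    "backups have finished",
]
hits = [row for row in rows if boolean_match("+have -index", row)]
```

Note that there is no frequency check anywhere in this sketch: "have" occurs in every row and still matches, which is the difference from the 50% threshold described above.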
2012-06-08 at 15:25
