MySQL: What is the 50-percent threshold?
Question by Sledge | 2012-06-08 at 10:21
Regarding MySQL's full-text search, I have often heard that a so-called 50 percent threshold is applied. The problem is that this term is only ever mentioned, but never explained or defined.
So, can someone explain this 50% threshold to me?
The 50 percent threshold in the MySQL full-text search is meant to produce better search results. It means that every word occurring in at least 50 percent of the records is considered too common and does not match in a normal (natural language) search.
The idea is the following: if we search for words like "then" or "have", which appear in almost every text, the search result does not improve significantly. Quite the opposite: searching for those words would return almost all records from the database.
Therefore, MySQL concentrates on the significant words that actually distinguish the individual records from each other.
This is problematic, of course, when we have very few records, because some relevant words can then occur (by chance) in half of the available records or more, and a search for them finds nothing. So, the MySQL full-text search only really makes sense with a large number of records in the database.
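To make the arithmetic concrete, here is a small Python sketch of the rule. It is only a toy model and not MySQL's actual implementation: the function names and the sample rows are made up for illustration, and things like stopword lists and the minimum word length are ignored.

```python
from collections import Counter

def words_over_threshold(rows, threshold=0.5):
    """Return the words that occur in at least `threshold` of the rows."""
    doc_freq = Counter()
    for text in rows:
        # Count each word once per row (document frequency).
        doc_freq.update(set(text.lower().split()))
    return {w for w, n in doc_freq.items() if n / len(rows) >= threshold}

def natural_language_search(rows, term, threshold=0.5):
    """Toy search: rows containing `term`, unless `term` is too common."""
    term = term.lower()
    if term in words_over_threshold(rows, threshold):
        return []  # the word is too common, so it matches nothing
    return [text for text in rows if term in text.lower().split()]

# Hypothetical sample data: only three rows, so "mysql" is in 2 of 3.
rows = [
    "mysql tutorial for beginners",
    "mysql index tuning",
    "postgres replication guide",
]

print(natural_language_search(rows, "mysql"))     # 2 of 3 rows: over the threshold
print(natural_language_search(rows, "postgres"))  # 1 of 3 rows: below the threshold
```

With only three rows, the relevant word "mysql" already sits above 50 percent and the search for it comes back empty, which is exactly the small-table problem described above.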
If we would like to avoid the 50% threshold, we can run the MySQL full-text search "IN BOOLEAN MODE". In this mode, the 50 percent threshold is not applied, and we can even specify more precisely which words must occur and which must not.
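In MySQL itself, such a query is written as MATCH (...) AGAINST ('+mysql -tuning' IN BOOLEAN MODE), where a leading + marks a word that must occur and a leading - a word that must not. The following Python sketch only imitates these two operators on plain word sets, to show that no threshold is involved. It is a simplified model with made-up names and sample rows, and it leaves out the other Boolean operators and the relevance ranking.

```python
def boolean_search(rows, query):
    """Toy model of the `+word` (required) and `-word` (excluded) operators."""
    required, excluded, optional = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+"):
            required.add(token[1:])
        elif token.startswith("-"):
            excluded.add(token[1:])
        else:
            optional.add(token)

    hits = []
    for text in rows:
        words = set(text.lower().split())
        # Skip rows that miss a required word or contain an excluded one.
        if not (required <= words) or (excluded & words):
            continue
        # Without required words, at least one optional word must be present.
        if not required and not (optional & words):
            continue
        hits.append(text)
    return hits

# Hypothetical sample data, same as in a very small table.
rows = [
    "mysql tutorial for beginners",
    "mysql index tuning",
    "postgres replication guide",
]

print(boolean_search(rows, "mysql"))           # both "mysql" rows, no threshold
print(boolean_search(rows, "+mysql -tuning"))  # only the tutorial row
```

Here the word "mysql" is found although it occurs in more than half of the rows, because no frequency check is applied.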
2012-06-08 at 15:25