How to Create a PHP Word Popularity Script
Posted on November 7, 2008. Filed under PHP.
This function calculates the density of each word in a text. Since many words are very short (if, or, is, it, etc.), I've added a filter that ignores any word shorter than a minimum number of characters ($min_word_char). You can also pass an array of words that you want to leave out of the ranking. Here's the function (I'll explain how it works below):
<?php
function calculate_word_popularity($string, $min_word_char = 2, $exclude_words = array())
{
    $string = strip_tags($string);

    // Total number of words in the text; this is the base for the percentages.
    $initial_words_array = str_word_count($string, 1);
    $total_words = count($initial_words_array);

    // Strip excluded words. preg_quote() keeps any regex metacharacters literal.
    $new_string = $string;
    foreach ($exclude_words as $filter_word)
    {
        $new_string = preg_replace('/\b' . preg_quote($filter_word, '/') . '\b/i', '', $new_string);
    }

    // Keep only words that are at least $min_word_char characters long.
    $words_array = str_word_count($new_string, 1);
    $words_array = array_filter($words_array, function ($word) use ($min_word_char) {
        return strlen($word) >= $min_word_char;
    });

    // Lower-case before de-duplicating: the match below is case-insensitive,
    // so "The" and "the" would otherwise appear as two rows with the same count.
    $unique_words_array = array_unique(array_map('strtolower', $words_array));

    $popularity = array();
    foreach ($unique_words_array as $word)
    {
        preg_match_all('/\b' . preg_quote($word, '/') . '\b/i', $string, $out);
        $count = count($out[0]);
        $popularity[] = array(
            'word'    => $word,
            'count'   => $count,
            'percent' => number_format(($count * 100) / $total_words, 2) . '%',
        );
    }

    // Sort by count, ascending. Swap $a and $b to list the most popular words first.
    // A closure is used so the function can be called more than once per request.
    usort($popularity, function ($a, $b) {
        return $a['count'] - $b['count'];
    });

    return $popularity;
}
?>
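The function above runs one regular expression per unique word. For longer texts, a single pass with array_count_values() may be simpler. The sketch below is only one possible variant, not a drop-in replacement: the name word_popularity_single_pass() is hypothetical, it assumes PHP 5.3 or later (for closures), it lower-cases the text before counting, and it sorts the most frequent words first.

```php
<?php
// Hypothetical single-pass variant. The name and the return shape are
// assumptions chosen to mirror calculate_word_popularity().
function word_popularity_single_pass($string, $min_word_char = 2, $exclude_words = array())
{
    // Lower-case everything so "The" and "the" count as one word.
    $words = str_word_count(strtolower(strip_tags($string)), 1);
    $total_words = count($words);
    if ($total_words === 0)
    {
        return array(); // nothing to count
    }

    $exclude = array_map('strtolower', $exclude_words);

    // Tally every word once instead of running a regex per unique word.
    $counts = array_count_values($words);

    $popularity = array();
    foreach ($counts as $word => $count)
    {
        $word = (string) $word;
        if (strlen($word) < $min_word_char || in_array($word, $exclude, true))
        {
            continue; // too short, or explicitly excluded
        }
        $popularity[] = array(
            'word'    => $word,
            'count'   => $count,
            'percent' => number_format(($count * 100) / $total_words, 2) . '%',
        );
    }

    // Most frequent words first.
    usort($popularity, function ($a, $b) {
        return $b['count'] - $a['count'];
    });

    return $popularity;
}

// Example call: ignore words shorter than 3 characters and the word "too".
$text = '<p>The cat sat on the mat. The dog sat too.</p>';
$result = word_popularity_single_pass($text, 3, array('too'));
print_r($result);
```

With this sample text there are 10 words in total, so the first entry should be "the" with a count of 3 (30.00%), followed by "sat" with a count of 2 (20.00%). The order of the remaining words, which all occur once, depends on the PHP version, because usort() is only guaranteed to be stable from PHP 8.0.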
