Sentence Normalizer

Derek Jones edited this page Jul 5, 2012 · 3 revisions

This is a simple function for normalizing sentences. It:

1. Collapses repeated question marks, exclamation marks and periods into a single character.
2. Capitalizes the first letter of each sentence and lowercases the rest.
3. Splits sentences not only on "." but also on "?" and "!".
4. Adds a space after the punctuation that ends each sentence.
5. Retains newlines.

Removed from the original function: handling of "¡" and "¿" in languages such as Spanish, and handling of the HTML-entity versions of these symbols.

Credit: A modified sentenceNormalizer by gregomm

The original script can be found here: http://php.net/manual/en/function.ucwords.php

The CodeIgniter Form Validation class allows for custom validation functions (callbacks). Learn more about this in CodeIgniter's documentation on Form Validation: http://codeigniter.com/user_guide/libraries/form_validation.html#callbacks. To implement the Sentence Normalizer, place the following function in the controller that uses the Form Validation class.

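function sentence_normalizer($str) {

    // Collapse runs of "!", "?" and "." into a single character.
    $str = preg_replace(array('/[!]+/', '/[?]+/', '/[.]+/'),
                        array('!', '?', '.'), $str);

    // Split on sentence delimiters and newlines, keeping each delimiter
    // as its own array element.
    $parts = preg_split('/(!|\.|\?|\n)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
    $newtext = array();

    foreach ($parts as $part) {
        $text = trim($part, ' ');

        // Skip empty fragments. A strict comparison is used because
        // empty() would also discard a fragment consisting of "0".
        if ($text === '') {
            continue;
        }

        if ($text === "\n") {
            // Retain newlines unchanged.
            $newtext[] = $text;
        } elseif (in_array($text, array('!', '?', '.'), true)) {
            // Follow closing punctuation with a single space.
            $newtext[] = $text . ' ';
        } else {
            // Sentence text: lowercase it, then capitalize the first letter.
            $newtext[] = ucfirst(strtolower($text));
        }
    }

    return implode('', $newtext);
}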
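
To see what the two regular-expression steps hand to the loop, the sketch below runs them on a sample string. The sample input and the variable names `$sample`, `$collapsed` and `$parts` are illustrative assumptions and are not part of the function above.

```php
<?php
// Illustrative input; any string works.
$sample = "hello world!!! how are you??\nfine...";

// Step 1: collapse repeated "!", "?" and "." into one character each.
$collapsed = preg_replace(
    array('/[!]+/', '/[?]+/', '/[.]+/'),
    array('!', '?', '.'),
    $sample
);
// $collapsed is now "hello world! how are you?\nfine."

// Step 2: split on the delimiters. PREG_SPLIT_DELIM_CAPTURE keeps each
// delimiter as its own array element so it can be re-attached later.
$parts = preg_split('/(!|\.|\?|\n)/', $collapsed, -1, PREG_SPLIT_DELIM_CAPTURE);

// Empty strings appear where two delimiters are adjacent ("?" then "\n")
// or where the text ends with a delimiter.
var_export($parts);
```

The empty strings in `$parts` are why the loop has to skip empty fragments before deciding whether an element is sentence text, punctuation or a newline.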

Then add the callback's name, prefixed with callback_, to the field's validation rules (callback_sentence_normalizer). For example:

$this->form_validation->set_rules('about', 'About', 'required|xss_clean|prep_for_form|strip_tags|callback_sentence_normalizer');

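The above code sets validation rules for a form field named "about". The rules include several standard CodeIgniter form validation rules, followed by "callback_sentence_normalizer", which calls the sentence normalizer function. Because the function returns a string rather than a boolean, the Form Validation class uses the returned string as the field's new value.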
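
The role of the callback_ prefix can be illustrated with a small stand-alone sketch. This is a simplified model of the idea and not CodeIgniter's actual implementation: `DemoController`, `run_rules()` and the shortened method body are hypothetical stand-ins, and the built-in rules are skipped entirely.

```php
<?php
// Hypothetical stand-in for a controller. The method body is a placeholder,
// not the full normalizer shown earlier.
class DemoController
{
    public function sentence_normalizer($str)
    {
        return ucfirst(strtolower(trim($str)));
    }
}

// Simplified illustration of rule dispatch, NOT CodeIgniter's real code:
// rules prefixed with "callback_" are looked up as controller methods.
function run_rules($controller, $rules, $value)
{
    foreach (explode('|', $rules) as $rule) {
        if (strpos($rule, 'callback_') === 0) {
            $method = substr($rule, strlen('callback_'));
            $result = $controller->$method($value);
            // A boolean result means pass/fail; anything else replaces the value.
            if (!is_bool($result)) {
                $value = $result;
            }
        }
        // Built-in rules (required, strip_tags, ...) are omitted in this sketch.
    }
    return $value;
}

$normalized = run_rules(
    new DemoController(),
    'required|callback_sentence_normalizer',
    '  hELLO there '
);
echo $normalized; // prints "Hello there"
```

Placing the callback last in the rule string, as in the example above, means it receives the value after the earlier rules have already cleaned it.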

That's it!