PHP: trim a string and add a delimiter without breaking words

/** 
 * Trim a string and add a delimiter without breaking words
 * 
 * @param string $string string to trim 
 * @param integer $width character count to trim
 * @param string  $delim delimiter. Default: '...' 
 * @return string processed string 
**/ 
function trim_string($string, $width, $delim='...') {
  // Remove any newline and carriage return characters
  $string = str_replace("\n", '', $string);
  $string = str_replace("\r", '', $string);
 
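  $len = strlen($string);
  if ($len <= $width) {
    return $string;
  }

  // Take the first $width characters, then extend lazily to the next word boundary
  if (!preg_match('/(.{' . (int) $width . '}.*?)\b/', $string, $matches)) {
    // No word boundary after $width characters (e.g. trailing punctuation only)
    return $string;
  }

  // The match covers the whole string: nothing was trimmed, so add no delimiter
  if (rtrim($matches[1]) === $string) {
    return $string;
  }
  return rtrim($matches[1]) . $delim;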
}
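
To see how the pattern behaves, here is a small sketch that runs the same regular expression on a sample sentence. The sample text and the width of 12 are assumptions chosen for illustration. The lazy .*? extends the match to the end of the current word, so the result can be longer than $width plus the delimiter.

```php
<?php
// Same pattern as in trim_string(); sample text and width are illustrative
$text  = 'The quick brown fox jumps';
$width = 12;

// The first 12 characters are "The quick br"; the lazy part finishes the word "brown"
preg_match('/(.{' . $width . '}.*?)\b/', $text, $matches);
$result = rtrim($matches[1]) . '...';

echo $result, "\n";          // The quick brown...
echo strlen($result), "\n";  // 18, not 12 + 3 = 15
```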

Shortens a UTF-8 encoded string without breaking words.

<?php
/**
 * Shortens a UTF-8 encoded string without breaking words.
 * Be sure to save your source file as UTF-8.
 *
 * @link   http://wordpress.stackexchange.com/q/11085/11089#11089
 * @param  string $string     string to shorten
 * @param  int    $max_chars  maximal length in characters
 * @param  string $append     replacement for truncated words.
 * @return string
 */
function utf8_truncate( $string, $max_chars = 200, $append = "\xC2\xA0…" )
{
    $string = strip_tags( $string );
    $string = html_entity_decode( $string, ENT_QUOTES, 'utf-8' );
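    // trim() works on bytes, so a mask with multi-byte characters can cut into
    // other UTF-8 characters. Strip the same set with a /u regex instead
    // (assumes valid UTF-8 input). \x{00A0} is the no-break space,
    // \x{2013} and \x{2014} are the en and em dash.
    $chars  = '[\n\r\t .\-;,\x{00A0}\x{2013}\x{2014}]+';
    $string = preg_replace( '/^' . $chars . '|' . $chars . '$/u', '', $string );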
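    // utf8_decode() is deprecated as of PHP 8.2; where mbstring is available,
    // mb_strlen( $string, 'utf-8' ) gives the same count for valid UTF-8.
    $length = strlen( utf8_decode( $string ) );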
 
    // Nothing to do.
    if ( $length <= $max_chars )
    {
        return $string;
    }
 
    // mb_substr() is in /wp-includes/compat.php as a fallback if
    // your PHP installation doesn’t have it.
    $string = mb_substr( $string, 0, $max_chars, 'utf-8' );
 
    // No white space. One long word or Chinese/Korean/Japanese text.
    if ( FALSE === strpos( $string, ' ' ) )
    {
        return $string . $append;
    }
 
    // Avoid breaks within words. Find the last white space.
    if ( extension_loaded( 'mbstring' ) )
    {
        // The third argument is the offset; the encoding comes fourth.
        $pos   = mb_strrpos( $string, ' ', 0, 'utf-8' );
        $short = mb_substr( $string, 0, $pos, 'utf-8' );
    }
    else
    {
        // Workaround. May be slow on long strings.
        $words = explode( ' ', $string );
        // Drop the last word.
        array_pop( $words );
        $short = implode( ' ', $words );
    }
 
    return $short . $append;
}
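
The reason for counting characters rather than bytes can be shown with a short sketch. It assumes the mbstring extension is loaded, and the sample word is made up for illustration (it is written with byte escapes so the result does not depend on the file encoding).

```php
<?php
// "Füße" written as UTF-8 bytes: ü is \xC3\xBC, ß is \xC3\x9F
$word = "F\xC3\xBC\xC3\x9Fe";

echo strlen($word), "\n";              // 6 (bytes)
echo mb_strlen($word, 'UTF-8'), "\n";  // 4 (characters)

// A byte-based cut can stop in the middle of a character ...
$bytes = substr($word, 0, 2);
var_dump(mb_check_encoding($bytes, 'UTF-8'));  // bool(false)

// ... while a character-based cut keeps the string valid.
$chars = mb_substr($word, 0, 2, 'UTF-8');
var_dump(mb_check_encoding($chars, 'UTF-8'));  // bool(true)
```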

One thought on “PHP: trim a string and add a delimiter without breaking words”

  1. Ooof… nice script, but trim_string() will break on Unicode strings because of the use of strlen, str_replace and some other ASCII-assuming functions.

    Also, if it would normally break at position 80 and a long word starts at position 79, say ‘deinstitutionalization’, your line is 79+22+3=104 chars long instead of 79+3=82.
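
One possible way to address both points is sketched below. It is not the post's code: the function name trim_string_strict is hypothetical, it assumes the mbstring extension and UTF-8 input, and it cuts at the last space at or before $width instead of extending to the end of the word. A single word longer than $width is cut hard at the limit.

```php
<?php
/**
 * Hypothetical variant of trim_string(): multi-byte safe, and the result
 * never exceeds $width characters plus the delimiter.
 */
function trim_string_strict($string, $width, $delim = '...') {
    // "\n" and "\r" are single ASCII bytes, so this is safe for UTF-8
    $string = str_replace(array("\n", "\r"), '', $string);

    if (mb_strlen($string, 'UTF-8') <= $width) {
        return $string;
    }

    // Take one extra character so a space right at the limit still counts
    $cut = mb_substr($string, 0, $width + 1, 'UTF-8');
    $pos = mb_strrpos($cut, ' ', 0, 'UTF-8');

    if ($pos === false) {
        // No space at all: one long word, cut hard at $width
        $cut = mb_substr($string, 0, $width, 'UTF-8');
    } else {
        // Drop the partial last word
        $cut = mb_substr($cut, 0, $pos, 'UTF-8');
    }

    return rtrim($cut) . $delim;
}

echo trim_string_strict('The quick brown fox jumps', 12), "\n"; // The quick...
echo trim_string_strict('deinstitutionalization', 5), "\n";     // deins...
```

The trade-off is that the output may be noticeably shorter than $width when the last word before the limit is long.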
