Finding whether a UTF-8 encoded string is palindrome

Finding whether a UTF-8 encoded string is palindrome

Created:27 Jul 2017 13:49:31 , in  Maths

When recently entertaining myself with recursion in PHP I have come across a problem which asked to find whether a given word was a character-unit palindrome or not.

I guess, a word of explanation on character-unit palindromes is needed in this place now. Basically, they are words which read the same backwards and forwards. Here are two quick examples: abba, stats.

In order to make both the problem more interesting and my solution to it more useful, my code tests UTF-8 encoded strings. UTF-8 character encoding is de facto a standard encoding for the vast majority (90%) of websites on the web now.

Finding palindromic words recursively

To find out whether a word is palindrome I wrote a tiny PHP class called SwwwPalindrome. The class when given a string representing a word splits the string up into an array of code points ( each built of one through four 8-bit long code units ) which next get compared recursively. During each recursive call exactly two code points are compared against one another. The comparison starts from the innermost pair and continues till reaching the outermost pair, at which point items with the array indices 0 and n-1 get compared.

If all tested pairs have consisted of two identical items, the string is a palindrome. Otherwise, it is not.

Here is the PHP code:


/*
  Class: SwwwPalindrome 
  Description: Check whether a string is character-unit palindrome or not 
  Author: Sylwester Wojnowski
  WWW: wojnowski.net.pl

  methods:
    is_palindrome
      return value: bool  
*/

final class SwwwPalindrome{
  private $chars = null;
  private $middle = 0;
  private $palindrome = true;

  public static function is_palindrome($string){
    $instance = new SwwwPalindrome($string);
    return $instance -> get();
  }

  public function __construct($string){
    // deal with empty string
    $this -> chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    $len = count($this -> chars);
    $this -> even = $len % 2 === 0;
    $this -> middle = ($this -> even) ? floor(($len + 1) / 2) : floor(($len)/2);
  }
  
  public function get(){
    $this -> check();
    return $this -> palindrome;
  }

  private function check($n = null){

    if(!$this -> palindrome){
      return;
    }

    $n = ( $n === null ) ? 0 : $n;  
    $left = null;
    $right = $this -> middle + $n;

    // even number of items in the array
    if($this -> even){
      $left = $this -> middle -1 - $n;
      if( $left < 0){
        return;
      }
      $this -> palindrome = ($this -> chars[$left] === $this -> chars[$right]);
    // odd number of items in the array
    }else{
      $left = $this -> middle - $n;
      if( $left < 0){
        return;
      }
      $this -> palindrome = ($this -> chars[$left] === $this -> chars[$right]); 
    }

    $this -> check($n + 1);    
  }
}

Use examples

SwwwPalindrome has static method "is_palindrome", which returns true if a given string represents a words which is palindrome. It returns false otherwise.

Here are some examples of use:

 
SwwwPalindrome::is_palindrome('abba');
=> true

SwwwPalindrome::is_palindrome('abc');
=> false

SwwwPalindrome::is_palindrome('śćaćś');
=> true

SwwwPalindrome::is_palindrome('śóćś');
=> false

SwwwPalindrome treats empty string as a palindrome.

Final thoughts

SwwwPalindrome has been designed to work on strings representing words, rather than sentences. Hence it is sensitive to blanks and punctuation in the string it receives. A good first step to make the class recognize palindromic sentences would be removal of all blanks and punctuation from the array built by preg_split function before the recursive comparison is applied.

SwwwPalindrome has a few useful bits of code in it. Among other things it shows how to split a UTF-8 encoded string properly, find the middle and traverse array in the opposite directions recursively. I hope you'll will find one or two of these tricks somehow useful.

This post was updated on 27 Jul 2017 14:53:53

Tags:  php ,  recursion 


Author, Copyright and citation

Author

Sylwester Wojnowski

Author of the above article, Sylwester Wojnowski, is sWWW admin and owner.He enjoys doing Maths and studying algorithms, writing code in scripting and command languages, Thrash Metal music and playing electric guitar.

Copyrights

©Copyright, 2017 Sylwester Wojnowski. This article may not be reproduced or published as a whole or in parts without permission from the author. If you share it, please give author credit and do not remove embedded links.

Computer code, if present in the article, is excluded from the above and licensed under GPLv3.

Citation

Cite this article as:

Wojnowski, Sylwester. "Finding whether a UTF-8 encoded string is palindrome." From sWWW - Code For The Web . https://wojnowski.net.pl//main/index/finding-whether-a-utf-8-encoded-string-is-palindrome