How to extract content between two delimiters in PHP

Posted on August 29, 2008, under PHP 

Hi,

Here’s a function that is useful when you need to extract the content between two delimiters, for instance when a robot connects to a page and you only need one part of what it downloads.

<?php
/*
Credits: Bit Repository
URL: http://www.bitrepository.com/web-programming/php/extracting-content-between-two-delimiters.html
*/

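function extract_unit($string, $start, $end)
{
    // Position of the first (case-insensitive) occurrence of $start
    $pos = stripos($string, $start);

    // Everything from $start to the end of the string
    $str = substr($string, $pos);

    // The same text with $start itself removed
    $str_two = substr($str, strlen($start));

    // Position of the first (case-insensitive) occurrence of $end in what is left
    $second_pos = stripos($str_two, $end);

    // The content between the two delimiters
    $str_three = substr($str_two, 0, $second_pos);

    $unit = trim($str_three); // remove leading and trailing whitespace

    return $unit;
}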

// Usage example:

$text = 'PHP is an acronym for "PHP: Hypertext Preprocessor".';

$unit = extract_unit($text, 'an', 'for');

// Outputs: acronym
echo $unit;
?>
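To connect this with the robot scenario mentioned at the top, here is a small sketch that pulls the title out of a page. The $html string is made up for illustration; a real robot would download the page source first, which is not shown here. The function body is copied from above so that the block runs on its own.

```php
<?php
// Same function as above, repeated so this example is self-contained.
function extract_unit($string, $start, $end)
{
    $pos = stripos($string, $start);
    $str = substr($string, $pos);
    $str_two = substr($str, strlen($start));
    $second_pos = stripos($str_two, $end);
    $str_three = substr($str_two, 0, $second_pos);

    return trim($str_three);
}

// Hypothetical page source (an assumption, not fetched from anywhere).
$html = '<html><head><title> Example Page </title></head>'
      . '<body><p>Hello</p></body></html>';

// Outputs: Example Page
echo extract_unit($html, '<title>', '</title>');
```

Because stripos() ignores case, the same call would also match a page that writes the tags as <TITLE> and </TITLE>.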

How does it work?

First, we use stripos() to find the position of the first occurrence of $start (the needle) in $string (the haystack). The search is case-insensitive and positions are counted from 0. In our example, ‘an’ begins at position 7, because 7 characters precede it.

$pos = stripos($string, $start);
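As a quick check of what stripos() hands back, the following sketch runs it on the example sentence. The needles ‘php’ and ‘xyz’ are extra test values chosen for illustration; they are not part of the original snippet.

```php
<?php
$text = 'PHP is an acronym for "PHP: Hypertext Preprocessor".';

var_dump(stripos($text, 'an'));   // int(7)
var_dump(stripos($text, 'AN'));   // int(7), the search ignores case
var_dump(stripos($text, 'php'));  // int(0), a match at the very start
var_dump(stripos($text, 'xyz'));  // bool(false), no match

// 0 and false are both "falsy", so compare with === to tell them apart.
$pos = stripos($text, 'php');
echo ($pos === false) ? 'not found' : "found at position $pos";
```

The last two lines matter whenever a delimiter may sit at the very beginning of the string: a loose test such as if (!$pos) would treat position 0 as "not found".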

Next, we use this position to take everything in $string from character $pos to the end:

an acronym for “PHP: Hypertext Preprocessor”.

$str = substr($string, $pos);

Remove ‘an’ from the start of the newly created string:

acronym for “PHP: Hypertext Preprocessor”.

$str_two = substr($str, strlen($start));

Determine the number of characters from the beginning of $str_two until ‘for’ (9 in this case):

$second_pos = stripos($str_two, $end);

Now use this number to get the content from the beginning of the string until ‘for’:

$str_three = substr($str_two, 0, $second_pos);

$str_three is now equal to ‘ acronym ’. Finally, let’s strip the whitespace from the beginning and end of the string:

acronym

$unit = trim($str_three); // remove whitespaces
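The steps above assume that both delimiters are present. If $start is missing, stripos() returns false, substr() appears to treat that as offset 0, and the function can return unrelated text from the beginning of the string. Below is one possible variant that checks for this. The name extract_unit_checked and the choice to return null on failure are my own assumptions, not part of the original snippet. It also uses the optional third argument of stripos() (the offset to start searching from) instead of building intermediate strings.

```php
<?php
// Hypothetical variant of extract_unit(); returns null when a delimiter is missing.
function extract_unit_checked($string, $start, $end)
{
    $start_pos = stripos($string, $start);
    if ($start_pos === false) {
        return null; // opening delimiter not found
    }

    // First character after the opening delimiter
    $content_pos = $start_pos + strlen($start);

    // Look for the closing delimiter only after the opening one
    $end_pos = stripos($string, $end, $content_pos);
    if ($end_pos === false) {
        return null; // closing delimiter not found
    }

    return trim(substr($string, $content_pos, $end_pos - $content_pos));
}

$text = 'PHP is an acronym for "PHP: Hypertext Preprocessor".';

var_dump(extract_unit_checked($text, 'an', 'for'));   // string(7) "acronym"
var_dump(extract_unit_checked($text, 'an', 'xyz'));   // NULL
var_dump(extract_unit_checked($text, 'xyz', 'for'));  // NULL
```

Returning null rather than an empty string lets the caller tell "delimiter missing" apart from "nothing between the delimiters".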

If you have any comments or suggestions regarding this snippet, please post them.
