Extract URL(s) from Link(s) with PHP

Posted on September 4, 2008. Filed under PHP.

This script extracts URLs from links. It reads the content of each HREF attribute and ignores non-URL values such as “javascript: openWindow()”.

Using Regular Expressions

<?php
/*
Credits: Bit Repository
URL: http://www.bitrepository.com/
*/

$url = 'http://www.php.net/';

// Fetch page
$string = FetchPage($url);

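// Regex that extracts the URLs from links:
// group 1 is the quote character, group 2 is the HREF value.
// The negative lookahead (?!...) skips "javascript:" values.

$links_regex = '~<a\b[^>]*?\shref\s*=\s*'.

'(["\'])(?!\s*javascript:)(.*?)\1~is';

preg_match_all($links_regex, $string, $out, PREG_PATTERN_ORDER);

// $out[2] holds the URLs; escape them so the browser shows them as text
echo "<pre>"; echo htmlspecialchars(print_r($out[2], true)); echo "</pre>";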

function FetchPage($path)
{
    // Opening a URL with fopen() requires allow_url_fopen to be enabled
    $file = fopen($path, "r");

    if (!$file)
    {
        exit("There was a connection error!");
    }

    $data = '';

    while (!feof($file))
    {
        // Read the data from the file / URL one line (or chunk) at a time

        $data .= fgets($file, 1024);
    }

    fclose($file);

    return $data;
}
?>
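
To try the idea without a network connection, here is a minimal sketch that runs the extraction on an inline HTML string. The helper name extract_link_urls() and the sample markup are made up for illustration. The pattern uses a negative lookahead to skip "javascript:" values, which is one possible way to express that rule, and it only handles quoted HREF values.

```php
<?php
// Hypothetical helper (not part of the original script): returns the HREF
// values of all <a> tags in $html, skipping "javascript:" pseudo-URLs.
function extract_link_urls($html)
{
    // Group 1 captures the opening quote, group 2 the HREF value.
    $regex = '~<a\b[^>]*?\shref\s*=\s*(["\'])(?!\s*javascript:)(.*?)\1~is';

    preg_match_all($regex, $html, $matches, PREG_PATTERN_ORDER);

    return $matches[2];
}

// Made-up sample markup for illustration
$sample = '<p><a href="http://www.php.net/docs.php">Docs</a> '
        . "<a class='nav' href='/downloads.php'>Downloads</a> "
        . '<a href="javascript: openWindow()">Popup</a></p>';

$urls = extract_link_urls($sample);

print_r($urls);
```

With this sample, $urls should contain the two real links (the absolute URL and the relative path) and leave out the "javascript:" entry. Regular expressions are fragile on messy real-world HTML, so treat this as a quick extraction aid and not a full parser.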

One Reply to "Extract URL(s) from Link(s) with PHP"

  1. Thank you. Will give it a try.
