Extract URL(s) from Link(s)
This script extracts URLs from links. It reads the value of each HREF attribute and ignores non-URL values such as “javascript: openWindow()”.
Using Regular Expressions
<?php
/*
Credits: Bit Repository
URL: http://www.bitrepository.com/
*/
$url = 'http://www.php.net/';
// Fetch page
$string = FetchPage($url);
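// Regex that extracts the URLs from links. "~" is the delimiter so "/" needs
// no escaping; (?!\s*javascript:) skips hrefs such as "javascript: openWindow()".
// The captured URLs end up in $out[1].
$links_regex = '~<a\s[^>]*'.
'href=["\'](?!\s*javascript:)([^"\']*)["\']~i';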
preg_match_all($links_regex, $string, $out, PREG_PATTERN_ORDER);
echo "<pre>"; print_r($out); echo "</pre>";
function FetchPage($path)
{
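    // Opening a URL with fopen() requires allow_url_fopen to be enabled
    $file = fopen($path, "r");
    if (!$file)
    {
        exit("There was a connection error!");
    }
    $data = '';
    // Read line by line until fgets() reports end of file or an error
    while (($line = fgets($file, 1024)) !== false)
    {
        $data .= $line;
    }
    fclose($file);
    return $data;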
}
?>
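To try the pattern without fetching a live page, the matching step can be run on a small inline string. The following is a minimal sketch, not the original author's code: the helper name extract_link_urls and the sample markup are made up for illustration, and the pattern assumes quoted HREF values. It uses "~" as the regex delimiter and a negative lookahead to reject "javascript:" values.

```php
<?php
// Hypothetical helper (not part of the original script): returns the href
// values of <a> tags in $html, skipping "javascript:" pseudo-links.
function extract_link_urls(string $html): array
{
    // "~" is the delimiter, so "/" needs no escaping inside the pattern.
    // (?!\s*javascript:) is a negative lookahead that rejects script hrefs.
    $pattern = '~<a\s[^>]*href=["\'](?!\s*javascript:)([^"\']*)["\']~i';
    preg_match_all($pattern, $html, $matches);
    return $matches[1]; // contents of the first capture group
}

// Sample markup, made up for this illustration
$html = '<a href="http://www.php.net/docs.php">Docs</a> '
      . '<a class="x" href=\'/downloads.php\'>Downloads</a> '
      . '<a href="javascript: openWindow()">Popup</a>';

print_r(extract_link_urls($html));
```

For the sample string this prints the two real URLs and leaves out the "javascript:" link. A regex like this is only a rough fit for HTML; unquoted attribute values, for instance, are not matched.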
The archive is made using WinZip 12.0. If you're having problems unzipping it, consider using WinRar, WinAce or similar software to extract the files from the archive.


Thank you. Will give it a try.