Extract URL(s) from Link(s) with PHP
Posted on September 4, 2008, filed under PHP
This script extracts URLs from the links in an HTML page. It reads the value of each HREF attribute and skips non-URL values such as “javascript: openWindow()”.
Using Regular Expressions
<?php
/*
Credits: Bit Repository
URL: http://www.bitrepository.com/
*/
$url = 'http://www.php.net/';
// Fetch page
$string = FetchPage($url);
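// Regex that extracts the URLs from links. "#" is used as the delimiter
// because "/" occurs in HTML, and the negative lookahead (?!javascript:)
// skips javascript: pseudo-links.
$links_regex = '#<a\s[^>]*href=["\']'.
'(?!javascript:)([^"\']*)["\']#i';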
preg_match_all($links_regex, $string, $out, PREG_PATTERN_ORDER);
echo "<pre>"; print_r($out); echo "</pre>";
function FetchPage($path)
{
    // Opening a URL with fopen() requires allow_url_fopen to be enabled
    $file = fopen($path, "r");
    if (!$file)
    {
        exit("There was a connection error!");
    }
    $data = '';
    while (!feof($file))
    {
        // Read the data from the file / URL, up to 1023 bytes per call
        $data .= fgets($file, 1024);
    }
    fclose($file);
    return $data;
}
?>
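To try the idea without fetching a remote page, here is a minimal sketch that runs a link-matching regex against an inline HTML string. The helper name extract_link_urls and the sample markup are made up for illustration and are not part of the original script. The pattern assumes quoted HREF values and uses a negative lookahead to skip “javascript:” links.

```php
<?php
// Hypothetical helper (not from the original post): returns the href values
// of <a> tags found in $html, skipping javascript: pseudo-links.
function extract_link_urls(string $html): array
{
    // "#" is the delimiter; (?!javascript:) rejects javascript: values,
    // and the "i" modifier makes the whole match case-insensitive.
    $pattern = '#<a\s[^>]*href=["\'](?!javascript:)([^"\']*)["\']#i';
    preg_match_all($pattern, $html, $matches, PREG_PATTERN_ORDER);
    return $matches[1]; // first capture group: the URLs
}

// Made-up sample markup: two real links and one javascript: link
$sample = '<p><a href="http://www.php.net/">PHP</a> '
        . '<a class="nav" href=\'/docs.php\'>Docs</a> '
        . '<a href="javascript: openWindow()">Popup</a></p>';

$urls = extract_link_urls($sample);
print_r($urls);
```

With PREG_PATTERN_ORDER, index 0 of the result holds the full matches and index 1 holds the first capture group, which is why the helper returns $matches[1]. A regex like this is only an approximation of HTML parsing: unquoted HREF values, for example, are not matched.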
- September 4, 2008
- article by Gabriel C.
- 1 comment

One Reply to "Extract URL(s) from Link(s) with PHP"
September 5, 2008 at 4:38 AM
Thank you. Will give it a try.