PHP regex find and replace url attributes in DOM
I currently have the following code:
// loop here
foreach ($doc['a'] as $link) {
    $href = pq($link)->attr('href');
    if (preg_match($url, $href)) {
        // delete the matched string, append the custom URL to the href attr
    } else {
        // prepend the custom URL to the href attr
    }
}
// end loop
Basically, I've fetched an external page via cURL. I need to add my own custom URL to each href link in the DOM. I need to check via regex whether each href attribute has a base URL, e.g. www.domain.com/mainpage.html/subpage.html
If yes, replace the www.domain.com part with my custom URL.
If not, prepend my custom URL to the relative URL.
My question is: what regex syntax should I use, and which PHP function? Is preg_replace() the proper function for this?
Cheers
You should use PHP's built-in functions as opposed to regex whenever possible, because the authors of those functions have considered the edge cases (or have read the really long RFC for URLs that details all of those cases). In this case, use parse_url() and http_build_url() (note that the latter function needs the PECL HTTP extension, which can be installed by following the docs page for the HTTP package):
$href = 'http://www.domain.com/mainpage.html/subpage.html';
$parts = parse_url($href);
if ($parts['host'] == 'www.domain.com') {
    $parts['host'] = 'www.yoursite.com';
    $href = http_build_url($parts);
}
echo $href; // 'http://www.yoursite.com/mainpage.html/subpage.html'
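If installing the PECL extension is not an option (and as far as I know, http_build_url() is only provided by the 1.x series of pecl_http), the array from parse_url() can be glued back together by hand. Below is a minimal sketch of that idea, assuming PHP 7 or later. build_url() is a hypothetical helper name, not a PHP built-in, and defaulting to the http scheme is my own assumption.

```php
<?php
// Hypothetical helper (not part of PHP): reassemble the array returned by
// parse_url() into a URL string. A rough stand-in for http_build_url().
function build_url(array $parts): string
{
    $url = '';
    if (isset($parts['host'])) {
        // Assumption: fall back to http when the URL carried no scheme.
        $url .= ($parts['scheme'] ?? 'http') . '://';
        if (isset($parts['user'])) {
            $url .= $parts['user'];
            if (isset($parts['pass'])) {
                $url .= ':' . $parts['pass'];
            }
            $url .= '@';
        }
        $url .= $parts['host'];
        if (isset($parts['port'])) {
            $url .= ':' . $parts['port'];
        }
    }

    $path = $parts['path'] ?? '';
    // Keep host and path separated by exactly one slash.
    if (isset($parts['host']) && $path !== '' && $path[0] !== '/') {
        $path = '/' . $path;
    }
    $url .= $path;

    if (isset($parts['query'])) {
        $url .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $url .= '#' . $parts['fragment'];
    }
    return $url;
}

// Same host swap as above, without the extension.
$parts = parse_url('http://www.domain.com/mainpage.html/subpage.html');
if (($parts['host'] ?? '') === 'www.domain.com') {
    $parts['host'] = 'www.yoursite.com';
}
echo build_url($parts), "\n";
```

The helper only concatenates the pieces it is given. It does not normalise or validate them, so it is a convenience for simple cases rather than a full replacement for the extension.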
An example using your code:
foreach ($doc['a'] as $link) {
    $urlparts = parse_url(pq($link)->attr('href'));
    // replaces the domain if there is one, otherwise prepends the domain
    $urlparts['host'] = 'www.yoursite.com';
    $newurl = http_build_url($urlparts);
    pq($link)->attr('href', $newurl);
}
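One thing to watch with the loop above: it overwrites the host of every link, including links to other sites, and it also touches mailto: links and in-page anchors. If only relative links and links to www.domain.com should be rewritten, as the question describes, the check could look roughly like the sketch below. rewrite_href() is a hypothetical helper, not a PHP or phpQuery function. It assumes PHP 7 or later, resolves relative paths against the site root rather than the page's directory, and drops any port or user info. These are all simplifications on my part.

```php
<?php
// Hypothetical helper: point an href at $newHost if it is relative or
// already on $oldHost; return anything else unchanged.
function rewrite_href(string $href, string $oldHost, string $newHost): string
{
    // Empty hrefs and in-page anchors such as "#top" stay as they are.
    if ($href === '' || $href[0] === '#') {
        return $href;
    }

    $parts = parse_url($href);
    if ($parts === false) {
        return $href; // seriously malformed URL, leave it alone
    }

    // Skip mailto:, javascript:, tel: and other non-web schemes.
    if (isset($parts['scheme'])
        && !in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return $href;
    }

    // Skip links that point at some other site.
    if (isset($parts['host']) && strcasecmp($parts['host'], $oldHost) !== 0) {
        return $href;
    }

    // Assumption: default to http, resolve relative paths from the root.
    $scheme   = $parts['scheme'] ?? 'http';
    $path     = '/' . ltrim($parts['path'] ?? '', '/');
    $query    = isset($parts['query']) ? '?' . $parts['query'] : '';
    $fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';

    return $scheme . '://' . $newHost . $path . $query . $fragment;
}

// Sample hrefs covering the cases from the question plus a few edge cases.
$samples = [
    'http://www.domain.com/mainpage.html/subpage.html',
    'subpage.html',
    '/a/b.html?x=1#top',
    'http://other.example/page',
    'mailto:me@example.com',
    '#top',
];
foreach ($samples as $href) {
    echo $href, ' => ', rewrite_href($href, 'www.domain.com', 'www.yoursite.com'), "\n";
}
```

Inside the phpQuery loop this would replace the parse_url()/http_build_url() pair with a single call, e.g. pq($link)->attr('href', rewrite_href(pq($link)->attr('href'), 'www.domain.com', 'www.yoursite.com')). Note that an href written without a scheme, such as www.domain.com/mainpage.html, is parsed as a plain relative path, so it would be treated as a relative link here.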