我如何检查一个URL是否存在(不是404)在PHP?


当前回答

cURL可以返回HTTP代码,我不认为所有额外的代码是必要的?

function urlExists($url=NULL)
    {
        if($url == NULL) return false;
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_TIMEOUT, 5);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $data = curl_exec($ch);
        $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch); 
        if($httpcode>=200 && $httpcode<300){
            return true;
        } else {
            return false;
        }
    }

其他回答

简单的方法是卷曲(和更快)

<?php
$mylinks="http://site.com/page.html";
$handlerr = curl_init($mylinks);
curl_setopt($handlerr,  CURLOPT_RETURNTRANSFER, TRUE);
$resp = curl_exec($handlerr);
$ht = curl_getinfo($handlerr, CURLINFO_HTTP_CODE);


if ($ht == '404')
     { echo 'OK';}
else { echo 'NO';}

?>

在这里:

$file = 'http://www.example.com/somefile.jpg';
$file_headers = @get_headers($file);
if(!$file_headers || $file_headers[0] == 'HTTP/1.1 404 Not Found') {
    $exists = false;
}
else {
    $exists = true;
}

从这里和上面帖子的正下方,有一个卷曲的解决方案:

function url_exists($url) {
    return curl_init($url) !== false;
}

在检查报头是否有404错误时,需要考虑的一件事是站点不会立即生成404错误。

很多网站会检查PHP/ASP(等等)源代码中是否存在某个页面,然后将您转到404页面。在这些情况下,头基本上是由生成的404头扩展的。在这种情况下,404错误不会出现在头文件的第一行,而是第10行。

$array = get_headers($url);
$string = $array[0];
print_r($string) // would generate:

Array ( 
[0] => HTTP/1.0 301 Moved Permanently 
[1] => Date: Fri, 09 Nov 2018 16:12:29 GMT 
[2] => Server: Apache/2.4.34 (FreeBSD) LibreSSL/2.7.4 PHP/7.0.31 
[3] => X-Powered-By: PHP/7.0.31 
[4] => Set-Cookie: landing=%2Freed-diffuser-fig-pudding-50; path=/; HttpOnly 
[5] => Location: /reed-diffuser-fig-pudding-50/ 
[6] => Content-Length: 0 
[7] => Connection: close 
[8] => Content-Type: text/html; charset=utf-8 
[9] => HTTP/1.0 404 Not Found 
[10] => Date: Fri, 09 Nov 2018 16:12:29 GMT 
[11] => Server: Apache/2.4.34 (FreeBSD) LibreSSL/2.7.4 PHP/7.0.31 
[12] => X-Powered-By: PHP/7.0.31 
[13] => Set-Cookie: landing=%2Freed-diffuser-fig-pudding-50%2F; path=/; HttpOnly 
[14] => Connection: close 
[15] => Content-Type: text/html; charset=utf-8 
) 

这是一个解决方案,只读取源代码的第一个字节…如果file_get_contents失败,返回false…这也适用于远程文件,如图像。

 function urlExists($url)
{
    if (@file_get_contents($url,false,NULL,0,1))
    {
        return true;
    }
    return false;
}
$url = 'http://google.com';
$not_url = 'stp://google.com';

if (@file_get_contents($url)): echo "Found '$url'!";
else: echo "Can't find '$url'.";
endif;
if (@file_get_contents($not_url)): echo "Found '$not_url!";
else: echo "Can't find '$not_url'.";
endif;

// Found 'http://google.com'!Can't find 'stp://google.com'.