我如何检查一个URL是否存在(不是404)在PHP?
当前回答
$url = 'http://google.com';
$not_url = 'stp://google.com';
if (@file_get_contents($url)): echo "Found '$url'!";
else: echo "Can't find '$url'.";
endif;
if (@file_get_contents($not_url)): echo "Found '$not_url!";
else: echo "Can't find '$not_url'.";
endif;
// Found 'http://google.com'!Can't find 'stp://google.com'.
其他回答
我使用这个函数:
/**
* @param $url
* @param array $options
* @return string
* @throws Exception
*/
function checkURL($url, array $options = array()) {
if (empty($url)) {
throw new Exception('URL is empty');
}
// list of HTTP status codes
$httpStatusCodes = array(
100 => 'Continue',
101 => 'Switching Protocols',
102 => 'Processing',
200 => 'OK',
201 => 'Created',
202 => 'Accepted',
203 => 'Non-Authoritative Information',
204 => 'No Content',
205 => 'Reset Content',
206 => 'Partial Content',
207 => 'Multi-Status',
208 => 'Already Reported',
226 => 'IM Used',
300 => 'Multiple Choices',
301 => 'Moved Permanently',
302 => 'Found',
303 => 'See Other',
304 => 'Not Modified',
305 => 'Use Proxy',
306 => 'Switch Proxy',
307 => 'Temporary Redirect',
308 => 'Permanent Redirect',
400 => 'Bad Request',
401 => 'Unauthorized',
402 => 'Payment Required',
403 => 'Forbidden',
404 => 'Not Found',
405 => 'Method Not Allowed',
406 => 'Not Acceptable',
407 => 'Proxy Authentication Required',
408 => 'Request Timeout',
409 => 'Conflict',
410 => 'Gone',
411 => 'Length Required',
412 => 'Precondition Failed',
413 => 'Payload Too Large',
414 => 'Request-URI Too Long',
415 => 'Unsupported Media Type',
416 => 'Requested Range Not Satisfiable',
417 => 'Expectation Failed',
418 => 'I\'m a teapot',
422 => 'Unprocessable Entity',
423 => 'Locked',
424 => 'Failed Dependency',
425 => 'Unordered Collection',
426 => 'Upgrade Required',
428 => 'Precondition Required',
429 => 'Too Many Requests',
431 => 'Request Header Fields Too Large',
449 => 'Retry With',
450 => 'Blocked by Windows Parental Controls',
500 => 'Internal Server Error',
501 => 'Not Implemented',
502 => 'Bad Gateway',
503 => 'Service Unavailable',
504 => 'Gateway Timeout',
505 => 'HTTP Version Not Supported',
506 => 'Variant Also Negotiates',
507 => 'Insufficient Storage',
508 => 'Loop Detected',
509 => 'Bandwidth Limit Exceeded',
510 => 'Not Extended',
511 => 'Network Authentication Required',
599 => 'Network Connect Timeout Error'
);
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
if (isset($options['timeout'])) {
$timeout = (int) $options['timeout'];
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
}
curl_exec($ch);
$returnedStatusCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if (array_key_exists($returnedStatusCode, $httpStatusCodes)) {
return "URL: '{$url}' - Error code: {$returnedStatusCode} - Definition: {$httpStatusCodes[$returnedStatusCode]}";
} else {
return "'{$url}' does not exist";
}
}
function url_exists($url) {
$headers = @get_headers($url);
return (strpos($headers[0],'200')===false)? false:true;
}
在检查报头是否有404错误时,需要考虑的一件事是站点不会立即生成404错误。
很多网站会检查PHP/ASP(等等)源代码中是否存在某个页面,然后将您转到404页面。在这些情况下,头基本上是由生成的404头扩展的。在这种情况下,404错误不会出现在头文件的第一行,而是第10行。
$array = get_headers($url);
$string = $array[0];
print_r($string) // would generate:
Array (
[0] => HTTP/1.0 301 Moved Permanently
[1] => Date: Fri, 09 Nov 2018 16:12:29 GMT
[2] => Server: Apache/2.4.34 (FreeBSD) LibreSSL/2.7.4 PHP/7.0.31
[3] => X-Powered-By: PHP/7.0.31
[4] => Set-Cookie: landing=%2Freed-diffuser-fig-pudding-50; path=/; HttpOnly
[5] => Location: /reed-diffuser-fig-pudding-50/
[6] => Content-Length: 0
[7] => Connection: close
[8] => Content-Type: text/html; charset=utf-8
[9] => HTTP/1.0 404 Not Found
[10] => Date: Fri, 09 Nov 2018 16:12:29 GMT
[11] => Server: Apache/2.4.34 (FreeBSD) LibreSSL/2.7.4 PHP/7.0.31
[12] => X-Powered-By: PHP/7.0.31
[13] => Set-Cookie: landing=%2Freed-diffuser-fig-pudding-50%2F; path=/; HttpOnly
[14] => Connection: close
[15] => Content-Type: text/html; charset=utf-8
)
$url = 'http://google.com';
$not_url = 'stp://google.com';
if (@file_get_contents($url)): echo "Found '$url'!";
else: echo "Can't find '$url'.";
endif;
if (@file_get_contents($not_url)): echo "Found '$not_url!";
else: echo "Can't find '$not_url'.";
endif;
// Found 'http://google.com'!Can't find 'stp://google.com'.
cURL可以返回HTTP代码,我不认为所有额外的代码是必要的?
function urlExists($url=NULL)
{
if($url == NULL) return false;
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if($httpcode>=200 && $httpcode<300){
return true;
} else {
return false;
}
}
推荐文章
- 原则-如何打印出真正的sql,而不仅仅是准备好的语句?
- 如何从关联PHP数组中获得第一项?
- PHP/MySQL插入一行然后获取id
- 我如何排序一个多维数组在PHP
- 如何在PHP中截断字符串最接近于一定数量的字符?
- PHP错误:“zip扩展名和unzip命令都没有,跳过。”
- Nginx提供下载。php文件,而不是执行它们
- Json_encode()转义正斜杠
- 如何在PHP中捕获cURL错误
- 如何要求一个分叉与作曲家?
- 如何在php中创建可选参数?
- 在文本文件中创建或写入/追加
- 为什么PHP的json_encode函数转换UTF-8字符串为十六进制实体?
- 如何从一个查询插入多行使用雄辩/流利
- URL中的“#:~:text=”位置哈希值到底是什么?