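I have an online file (such as http://www.example.com/information.asp) that I need to fetch and save to a directory. I know there are several ways to fetch and read an online file (URL) line by line, but is there a way to simply download and save the file using Java?
Current answer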
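import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

public class DownloadManager {
    // Placeholder: replace with the URL of the file to download.
    static String urls = "[WEBSITE NAME]";

    public static void main(String[] args) throws IOException {
        URL url = verify(urls);
        if (url == null) {
            System.err.println("[SYSTEM/ERROR]: Invalid URL: " + urls);
            return;
        }
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        // Use the last path segment as the local file name (getPath() excludes any query string).
        String filename = url.getPath();
        filename = filename.substring(filename.lastIndexOf('/') + 1);
        File target = new File("C:\\Java2_programiranje/Network/DownloadTest1/Project/Output", filename);
        System.out.println("[SYSTEM/INFO]: Downloading file...");
        // try-with-resources closes both streams even if the copy fails part-way.
        try (InputStream in = connection.getInputStream();
             OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } finally {
            connection.disconnect();
        }
        System.out.println("[SYSTEM/INFO]: File Downloaded!");
    }

    // Returns a URL for an http:// or https:// address, or null if the string is not a usable URL.
    private static URL verify(String url) {
        String lower = url.toLowerCase();
        if (!lower.startsWith("http://") && !lower.startsWith("https://")) {
            return null;
        }
        try {
            return new URL(url);
        } catch (MalformedURLException e) {
            e.printStackTrace();
            return null;
        }
    }
}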
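Other answers
Downloading a file means reading it; one way or another, you have to go through its contents. Instead of reading line by line, you can read raw bytes from the stream in chunks: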
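// Requires: java.io.BufferedInputStream, java.io.FileOutputStream, java.net.URL
// The original snippet left `out` undefined; "information.asp" is an assumed output file name.
try (BufferedInputStream in = new BufferedInputStream(new URL("http://www.website.com/information.asp").openStream());
     FileOutputStream out = new FileOutputStream("information.asp")) {
    byte[] data = new byte[1024];
    int count;
    // read() returns the number of bytes placed in the buffer, or -1 at end of stream.
    while ((count = in.read(data, 0, 1024)) != -1) {
        out.write(data, 0, count);
    }
}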
A simpler approach using the NIO.2 file API (java.nio.file.Files):
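// `target` is a java.nio.file.Path naming the destination file; it is not defined in this snippet.
URL website = new URL("http://www.website.com/information.asp");
try (InputStream in = website.openStream()) {
    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
}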
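To make the snippet above self-contained, here is a minimal sketch that builds the destination Path explicitly and creates its parent directory first. The class name FileCopyDownload, the method name download, and the default output name "information.asp" are illustrative assumptions, not part of the original answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class; the name is illustrative only.
final class FileCopyDownload {

    /** Copies everything readable from source into target and returns the number of bytes written. */
    static long download(URL source, Path target) throws IOException {
        // Make sure the destination directory exists before copying.
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // The stream is closed automatically, even if the copy fails.
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java FileCopyDownload <url> [target-file]");
            return;
        }
        URL source = new URL(args[0]);
        // Assumed default output name when no target is given.
        Path target = Paths.get(args.length > 1 ? args[1] : "information.asp");
        System.out.println("Saved " + download(source, target) + " bytes to " + target);
    }
}
```

Files.copy returns the number of bytes it wrote, which gives a cheap sanity check against a Content-Length value if you have one. Note that openStream() applies no timeouts by default, so a production version would probably open a URLConnection and set connect and read timeouts first.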
Try Java NIO:
URL website = new URL("http://www.website.com/information.asp");
ReadableByteChannel rbc = Channels.newChannel(website.openStream());
FileOutputStream fos = new FileOutputStream("information.html");
fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
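Using transferFrom() is potentially much more efficient than a simple loop that reads from the source channel and writes to this channel. Many operating systems can transfer bytes directly from the source channel into the filesystem cache without actually copying them.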
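Note: the third parameter of transferFrom is the maximum number of bytes to transfer. Integer.MAX_VALUE will transfer at most 2^31 - 1 bytes (just under 2 GiB), while Long.MAX_VALUE allows at most 2^63 - 1 bytes (larger than any file in existence).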
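The FileChannel.transferFrom documentation also states that a single call may transfer fewer bytes than requested, so a more defensive variant calls it in a loop and advances the write position each time. The following is a minimal sketch, assuming a blocking source stream (so a return value of 0 or less is treated as end of stream); the class name ChannelDownload, the method name download, and the 16 MiB chunk size are illustrative choices, not part of the original answer.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper class; the name is illustrative only.
final class ChannelDownload {

    // Assumed chunk size per transferFrom call; any positive value works.
    private static final long CHUNK = 16L * 1024 * 1024;

    /** Copies the URL's content into targetFile and returns the total number of bytes written. */
    static long download(URL source, String targetFile) throws IOException {
        try (ReadableByteChannel rbc = Channels.newChannel(source.openStream());
             FileOutputStream fos = new FileOutputStream(targetFile)) {
            FileChannel out = fos.getChannel();
            long position = 0;
            long transferred;
            // Keep transferring until a call moves no bytes, which for a
            // blocking source is taken to mean end of stream.
            while ((transferred = out.transferFrom(rbc, position, CHUNK)) > 0) {
                position += transferred;
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java ChannelDownload <url> <target-file>");
            return;
        }
        System.out.println("Saved " + download(new URL(args[0]), args[1]) + " bytes");
    }
}
```

The loop costs almost nothing when a single call already transfers everything, and it avoids a silently truncated file in the cases where one call does not.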
This answer is almost identical to the selected answer, with two enhancements: it is wrapped in a method, and it closes the FileOutputStream object:
public static void downloadFileFromURL(String urlString, File destination) {
    try {
        URL website = new URL(urlString);
        // try-with-resources closes the channel and the stream even if the transfer fails.
        try (ReadableByteChannel rbc = Channels.newChannel(website.openStream());
             FileOutputStream fos = new FileOutputStream(destination)) {
            fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}