개요
좋아, 여기있어 이것은 스크립트 형태의 프로그래밍 솔루션입니다.
#!/bin/bash
# NAME: pdflinkextractor
# AUTHOR: Glutanimate (http://askubuntu.com/users/81372/), 2013
# LICENSE: GNU GPL v2
# DEPENDENCIES: wget lynx
# DESCRIPTION: extracts PDF links from websites and dumps them to the stdout and as a textfile
# only works for links pointing to files with the ".pdf" extension
#
# USAGE: pdflinkextractor "www.website.com"
WEBSITE="$1"
echo "Getting link list..."
lynx -cache=0 -dump -listonly "$WEBSITE" | grep ".*\.pdf$" | awk '{print $2}' | tee pdflinks.txt
# OPTIONAL
#
# DOWNLOAD PDF FILES
#
#echo "Downloading..."
#wget -P pdflinkextractor_files/ -i pdflinks.txt
설치
당신이 필요합니다 wget
및 lynx
설치 :
sudo apt-get install wget lynx
용법
스크립트는 .pdf
웹 사이트 의 모든 파일 목록을 가져 와서 명령 행 출력과 작업 디렉토리의 텍스트 파일로 덤프합니다. "선택적" wget
명령 을 주석 처리 하면 스크립트가 모든 파일을 새 디렉토리로 다운로드합니다.
예
$ ./pdflinkextractor http://www.pdfscripting.com/public/Free-Sample-PDF-Files-with-scripts.cfm
Getting link list...
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/JSPopupCalendar.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/ModifySubmit_Example.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/DynamicEmail_XFAForm_V2.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/AcquireMenuItemNames.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/BouncingButton.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/JavaScriptClock.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/Matrix2DOperations.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/RobotArm_3Ddemo2.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/SimpleFormCalculations.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/TheFlyv3_EN4Rdr.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/ImExportAttachSample.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/AcroForm_BasicToggle.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/AcroForm_ToggleButton_Sample.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/AcorXFA_BasicToggle.pdf
http://www.pdfscripting.com/public/FreeStuff/PDFSamples/ConditionalCalcScripts.pdf
Downloading...
--2013-12-24 13:31:25-- http://www.pdfscripting.com/public/FreeStuff/PDFSamples/JSPopupCalendar.pdf
Resolving www.pdfscripting.com (www.pdfscripting.com)... 74.200.211.194
Connecting to www.pdfscripting.com (www.pdfscripting.com)|74.200.211.194|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 176008 (172K) [application/pdf]
Saving to: `/Downloads/pdflinkextractor_files/JSPopupCalendar.pdf'
100%[===========================================================================================================================================================================>] 176.008 120K/s in 1,4s
2013-12-24 13:31:29 (120 KB/s) - `/Downloads/pdflinkextractor_files/JSPopupCalendar.pdf' saved [176008/176008]
...