wget Command Examples

Downloading files with wget

wget command


wget is a free command line utility that is available on most Linux distributions. It supports the "HTTP", "HTTPS" and "FTP" protocols. wget is non-interactive, meaning that it can continue to handle downloads in the background after the user has logged out. It can also recover from network errors: wget will keep retrying until the whole file has been retrieved or its retry limit (see "--tries" in the help output at the end of this page) is reached. If the remote server supports "regetting", wget will instruct the server to continue from where it left off. wget is often used to download packages from online repositories. Below are some simple examples:



Downloading an RPM package


In this example we will use "wget" to download an RPM package to our current directory. The progress of the download is displayed as it runs:



john@john-desktop:~$ cd /tmp
john@john-desktop:/tmp$ wget http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/ipcalc-0.40-10.noarch.rpm
--2013-05-03 20:58:11--  http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/ipcalc-0.40-10.noarch.rpm
Resolving ftp.hosteurope.de (ftp.hosteurope.de)... 80.237.136.138, 2a01:488:10:1::50ed:888a
Connecting to ftp.hosteurope.de (ftp.hosteurope.de)|80.237.136.138|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11316 (11K) [application/x-redhat-package-manager]
Saving to: `ipcalc-0.40-10.noarch.rpm'

100%[======================================>] 11,316      --.-K/s   in 0.002s  

2013-05-03 20:58:12 (4.81 MB/s) - `ipcalc-0.40-10.noarch.rpm' saved [11316/11316]
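
If a download such as the one above is interrupted, the recovery behaviour described in the introduction can also be requested explicitly. The following is only a sketch: "fetch_resumable" and the "WGET" variable are names made up for this example, and the values 5 and 10 are arbitrary. The options themselves ("--continue", "--tries" and "--waitretry") are listed in the "wget --help" output at the end of this page:

```shell
# Hypothetical helper: resume a partial download and retry on errors.
# WGET may be overridden (for example WGET=echo) to preview the command
# without touching the network.
WGET=${WGET:-wget}

fetch_resumable() {
    # --continue      resume getting a partially-downloaded file
    # --tries=5       stop after 5 attempts (arbitrary value for this sketch)
    # --waitretry=10  wait between 1 and 10 seconds between retries
    "$WGET" --continue --tries=5 --waitretry=10 "$1"
}
```

Calling "fetch_resumable" with the URL used above would pick up a partially downloaded copy in the current directory. Whether the transfer can actually be resumed still depends on the remote server supporting it.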

Limiting the speed of wget downloads


On a server shared with other users, it is generally advisable to limit the speed of a large download so that it does not affect them. To do this, we pass the "--limit-rate" parameter along with a maximum download speed. In the example below, we limit our download speed to 50 KB/s (kilobytes per second):



john@john-desktop:/tmp$ wget --limit-rate=50k http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/ipcalc-0.40-10.noarch.rpm
--2013-05-03 21:11:01--  http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/ipcalc-0.40-10.noarch.rpm
Resolving ftp.hosteurope.de (ftp.hosteurope.de)... 80.237.136.138, 2a01:488:10:1::50ed:888a
Connecting to ftp.hosteurope.de (ftp.hosteurope.de)|80.237.136.138|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11316 (11K) [application/x-redhat-package-manager]
Saving to: `ipcalc-0.40-10.noarch.rpm'

100%[======================================>] 11,316      50.0K/s   in 0.2s    

2013-05-03 21:11:01 (50.0 KB/s) - `ipcalc-0.40-10.noarch.rpm' saved [11316/11316]
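
To get a feel for what a limit means for a larger file, the arithmetic can be sketched in the shell. This sketch assumes that the "k" and "m" suffixes denote multiples of 1024 bytes and that the rate is a whole number; "rate_to_bytes" and "estimate_seconds" are hypothetical helper names. The result is a lower bound, since the limit is only a cap on the speed:

```shell
# Convert a rate such as 50k or 2m to bytes per second.
# Assumption: k = 1024 bytes, m = 1024 * 1024 bytes, whole numbers only.
rate_to_bytes() {
    case "$1" in
        *k) echo $(( ${1%k} * 1024 )) ;;
        *m) echo $(( ${1%m} * 1024 * 1024 )) ;;
        *)  echo "$1" ;;
    esac
}

# Estimate the minimum transfer time in whole seconds, rounded up.
estimate_seconds() {
    size_bytes=$1
    rate_bytes=$(rate_to_bytes "$2")
    echo $(( (size_bytes + rate_bytes - 1) / rate_bytes ))
}

# A 14155361-byte file (the size of the john-wordlists package used
# later on this page) at --limit-rate=50k:
estimate_seconds 14155361 50k   # prints 277
```

At 50k, then, a file of that size would take at least four and a half minutes, which is worth knowing before choosing a limit.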


Downloading a large file in the background


To download a large file in the background, pass the "-b" parameter to the wget command. The output that would normally appear on screen is written to a "wget-log" file in the directory from which you started the download:



john@john-desktop:/tmp$ wget -b http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/ipcalc-0.40-10.noarch.rpm
Continuing in background, pid 31001.
Output will be written to `wget-log'.

We can then view the output of the "wget-log" file:



john@john-desktop:/tmp$ cat wget-log
--2013-05-03 21:16:27--  http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/ipcalc-0.40-10.noarch.rpm
Resolving ftp.hosteurope.de (ftp.hosteurope.de)... 80.237.136.138, 2a01:488:10:1::50ed:888a
Connecting to ftp.hosteurope.de (ftp.hosteurope.de)|80.237.136.138|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11316 (11K) [application/x-redhat-package-manager]
Saving to: `ipcalc-0.40-10.noarch.rpm'

     0K .......... .                                          100% 4.67M=0.002s

2013-05-03 21:16:27 (4.67 MB/s) - `ipcalc-0.40-10.noarch.rpm' saved [11316/11316]
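
Because a background download returns control to the shell immediately, a script usually needs some way to tell when it has finished. One rough approach, sketched below, is to look for the "saved [" line that closes the log shown above. "download_finished" is a hypothetical helper, and the wording of the log is not a guaranteed interface and may differ between wget versions, so treat this as an illustration only:

```shell
# Hypothetical helper: succeed if the log contains a completion line
# such as "... saved [11316/11316]".
download_finished() {
    grep -q 'saved \[' "$1"
}

# Demonstration on a sample log, using the last line of the output above.
sample_log=$(mktemp)
echo "2013-05-03 21:16:27 (4.67 MB/s) - \`ipcalc-0.40-10.noarch.rpm' saved [11316/11316]" > "$sample_log"

if download_finished "$sample_log"; then
    echo "download finished"
else
    echo "still downloading"
fi
rm -f "$sample_log"
```

In real use the argument would be the "wget-log" file in the directory where the download was started.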

Downloading multiple files with wget


A very useful feature of wget is its ability to download multiple files in one run. The URLs of the files to be downloaded are stored, one per line, in a plain text file, and the name of that file is passed via the "-i" parameter. In the example below our list is stored in "my_files_to_download.txt":



john@john-desktop:/tmp$ cat my_files_to_download.txt 
http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/john-wordlists-1-7.noarch.rpm
http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/man-pages-2.29-3.noarch.rpm
http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/myspell-welsh-20040425-26.noarch.rpm
http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/suselinux-manual_en-pdf-10.1-19.noarch.rpm

Once we have our list, we can issue "wget -i my_files_to_download.txt":



john@john-desktop:/tmp$ wget -i my_files_to_download.txt 
--2013-05-03 21:29:31--  http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/john-wordlists-1-7.noarch.rpm
Resolving ftp.hosteurope.de (ftp.hosteurope.de)... 80.237.136.138, 2a01:488:10:1::50ed:888a
Connecting to ftp.hosteurope.de (ftp.hosteurope.de)|80.237.136.138|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14155361 (13M) [application/x-redhat-package-manager]
Saving to: `john-wordlists-1-7.noarch.rpm'

100%[======================================>] 14,155,361  1.01M/s   in 12s     

2013-05-03 21:29:43 (1.13 MB/s) - `john-wordlists-1-7.noarch.rpm' saved [14155361/14155361]

--2013-05-03 21:29:43--  http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/man-pages-2.29-3.noarch.rpm
Reusing existing connection to ftp.hosteurope.de:80.
HTTP request sent, awaiting response... 200 OK
Length: 4102717 (3.9M) [application/x-redhat-package-manager]
Saving to: `man-pages-2.29-3.noarch.rpm'

100%[======================================>] 4,102,717    852K/s   in 5.3s    

2013-05-03 21:29:48 (763 KB/s) - `man-pages-2.29-3.noarch.rpm' saved [4102717/4102717]

--2013-05-03 21:29:48--  http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/myspell-welsh-20040425-26.noarch.rpm
Reusing existing connection to ftp.hosteurope.de:80.
HTTP request sent, awaiting response... 200 OK
Length: 540757 (528K) [application/x-redhat-package-manager]
Saving to: `myspell-welsh-20040425-26.noarch.rpm'

100%[======================================>] 540,757     1.36M/s   in 0.4s    

2013-05-03 21:29:48 (1.36 MB/s) - `myspell-welsh-20040425-26.noarch.rpm' saved [540757/540757]

--2013-05-03 21:29:48--  http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch/suselinux-manual_en-pdf-10.1-19.noarch.rpm
Reusing existing connection to ftp.hosteurope.de:80.
HTTP request sent, awaiting response... 200 OK
Length: 14467103 (14M) [application/x-redhat-package-manager]
Saving to: `suselinux-manual_en-pdf-10.1-19.noarch.rpm'

100%[======================================>] 14,467,103  1.04M/s   in 14s     

2013-05-03 21:30:03 (979 KB/s) - `suselinux-manual_en-pdf-10.1-19.noarch.rpm' saved [14467103/14467103]

FINISHED --2013-05-03 21:30:03--
Total wall clock time: 32s
Downloaded: 4 files, 32M in 32s (1016 KB/s)
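
When the files share a common base URL, the list does not have to be typed by hand. A minimal sketch, reusing two of the package names above; the variable names and the choice of a temporary directory for the list are assumptions made for this example:

```shell
# Base URL shared by the packages in the example above.
base_url="http://ftp.hosteurope.de/mirror/ftp.opensuse.org/discontinued/SL-10.1/inst-source/suse/noarch"

# Where to write the list (the location is chosen for this sketch).
list_file="${TMPDIR:-/tmp}/my_files_to_download.txt"

# Start with an empty list, then append one URL per line.
: > "$list_file"
for pkg in john-wordlists-1-7 man-pages-2.29-3; do
    printf '%s/%s.noarch.rpm\n' "$base_url" "$pkg" >> "$list_file"
done

# Show the generated list.
cat "$list_file"
```

The resulting file can then be passed to wget with the "-i" parameter in the same way as the hand-written list.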


Help with wget


As always, if you require further help with a command, you can normally issue "command --help" to display the available parameters. Below is the output of "wget --help":



GNU Wget 1.13.4, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
  -V,  --version           display the version of Wget and exit.
  -h,  --help              print this help.
  -b,  --background        go to background after startup.
  -e,  --execute=COMMAND   execute a `.wgetrc'-style command.

Logging and input file:
  -o,  --output-file=FILE    log messages to FILE.
  -a,  --append-output=FILE  append messages to FILE.
  -d,  --debug               print lots of debugging information.
  -q,  --quiet               quiet (no output).
  -v,  --verbose             be verbose (this is the default).
  -nv, --no-verbose          turn off verboseness, without being quiet.
  -i,  --input-file=FILE     download URLs found in local or external FILE.
  -F,  --force-html          treat input file as HTML.
  -B,  --base=URL            resolves HTML input-file links (-i -F)
                             relative to URL.
       --config=FILE         Specify config file to use.

Download:
  -t,  --tries=NUMBER            set number of retries to NUMBER (0 unlimits).
       --retry-connrefused       retry even if connection is refused.
  -O,  --output-document=FILE    write documents to FILE.
  -nc, --no-clobber              skip downloads that would download to
                                 existing files (overwriting them).
  -c,  --continue                resume getting a partially-downloaded file.
       --progress=TYPE           select progress gauge type.
  -N,  --timestamping            don't re-retrieve files unless newer than
                                 local.
  --no-use-server-timestamps     don't set the local file's timestamp by
                                 the one on the server.
  -S,  --server-response         print server response.
       --spider                  don't download anything.
  -T,  --timeout=SECONDS         set all timeout values to SECONDS.
       --dns-timeout=SECS        set the DNS lookup timeout to SECS.
       --connect-timeout=SECS    set the connect timeout to SECS.
       --read-timeout=SECS       set the read timeout to SECS.
  -w,  --wait=SECONDS            wait SECONDS between retrievals.
       --waitretry=SECONDS       wait 1..SECONDS between retries of a retrieval.
       --random-wait             wait from 0.5*WAIT...1.5*WAIT secs between retrievals.
       --no-proxy                explicitly turn off proxy.
  -Q,  --quota=NUMBER            set retrieval quota to NUMBER.
       --bind-address=ADDRESS    bind to ADDRESS (hostname or IP) on local host.
       --limit-rate=RATE         limit download rate to RATE.
       --no-dns-cache            disable caching DNS lookups.
       --restrict-file-names=OS  restrict chars in file names to ones OS allows.
       --ignore-case             ignore case when matching files/directories.
  -4,  --inet4-only              connect only to IPv4 addresses.
  -6,  --inet6-only              connect only to IPv6 addresses.
       --prefer-family=FAMILY    connect first to addresses of specified family,
                                 one of IPv6, IPv4, or none.
       --user=USER               set both ftp and http user to USER.
       --password=PASS           set both ftp and http password to PASS.
       --ask-password            prompt for passwords.
       --no-iri                  turn off IRI support.
       --local-encoding=ENC      use ENC as the local encoding for IRIs.
       --remote-encoding=ENC     use ENC as the default remote encoding.
       --unlink                  remove file before clobber.

Directories:
  -nd, --no-directories           don't create directories.
  -x,  --force-directories        force creation of directories.
  -nH, --no-host-directories      don't create host directories.
       --protocol-directories     use protocol name in directories.
  -P,  --directory-prefix=PREFIX  save files to PREFIX/...
       --cut-dirs=NUMBER          ignore NUMBER remote directory components.

HTTP options:
       --http-user=USER        set http user to USER.
       --http-password=PASS    set http password to PASS.
       --no-cache              disallow server-cached data.
       --default-page=NAME     Change the default page name (normally
                               this is `index.html'.).
  -E,  --adjust-extension      save HTML/CSS documents with proper extensions.
       --ignore-length         ignore `Content-Length' header field.
       --header=STRING         insert STRING among the headers.
       --max-redirect          maximum redirections allowed per page.
       --proxy-user=USER       set USER as proxy username.
       --proxy-password=PASS   set PASS as proxy password.
       --referer=URL           include `Referer: URL' header in HTTP request.
       --save-headers          save the HTTP headers to file.
  -U,  --user-agent=AGENT      identify as AGENT instead of Wget/VERSION.
       --no-http-keep-alive    disable HTTP keep-alive (persistent connections).
       --no-cookies            don't use cookies.
       --load-cookies=FILE     load cookies from FILE before session.
       --save-cookies=FILE     save cookies to FILE after session.
       --keep-session-cookies  load and save session (non-permanent) cookies.
       --post-data=STRING      use the POST method; send STRING as the data.
       --post-file=FILE        use the POST method; send contents of FILE.
       --content-disposition   honour the Content-Disposition header when
                               choosing local file names (EXPERIMENTAL).
       --auth-no-challenge     send Basic HTTP authentication information
                               without first waiting for the server's
                               challenge.

HTTPS (SSL/TLS) options:
       --secure-protocol=PR     choose secure protocol, one of auto, SSLv2,
                                SSLv3, and TLSv1.
       --no-check-certificate   don't validate the server's certificate.
       --certificate=FILE       client certificate file.
       --certificate-type=TYPE  client certificate type, PEM or DER.
       --private-key=FILE       private key file.
       --private-key-type=TYPE  private key type, PEM or DER.
       --ca-certificate=FILE    file with the bundle of CA's.
       --ca-directory=DIR       directory where hash list of CA's is stored.
       --random-file=FILE       file with random data for seeding the SSL PRNG.
       --egd-file=FILE          file naming the EGD socket with random data.

FTP options:
       --ftp-user=USER         set ftp user to USER.
       --ftp-password=PASS     set ftp password to PASS.
       --no-remove-listing     don't remove `.listing' files.
       --no-glob               turn off FTP file name globbing.
       --no-passive-ftp        disable the "passive" transfer mode.
       --retr-symlinks         when recursing, get linked-to files (not dir).

Recursive download:
  -r,  --recursive          specify recursive download.
  -l,  --level=NUMBER       maximum recursion depth (inf or 0 for infinite).
       --delete-after       delete files locally after downloading them.
  -k,  --convert-links      make links in downloaded HTML or CSS point to
                            local files.
  -K,  --backup-converted   before converting file X, back up as X.orig.
  -m,  --mirror             shortcut for -N -r -l inf --no-remove-listing.
  -p,  --page-requisites    get all images, etc. needed to display HTML page.
       --strict-comments    turn on strict (SGML) handling of HTML comments.

Recursive accept/reject:
  -A,  --accept=LIST               comma-separated list of accepted extensions.
  -R,  --reject=LIST               comma-separated list of rejected extensions.
  -D,  --domains=LIST              comma-separated list of accepted domains.
       --exclude-domains=LIST      comma-separated list of rejected domains.
       --follow-ftp                follow FTP links from HTML documents.
       --follow-tags=LIST          comma-separated list of followed HTML tags.
       --ignore-tags=LIST          comma-separated list of ignored HTML tags.
  -H,  --span-hosts                go to foreign hosts when recursive.
  -L,  --relative                  follow relative links only.
  -I,  --include-directories=LIST  list of allowed directories.
  --trust-server-names             use the name specified by the redirection
                                   url last component.
  -X,  --exclude-directories=LIST  list of excluded directories.
  -np, --no-parent                 don't ascend to the parent directory.
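
The full listing is long, so it is often quicker to filter it. A small sketch, where "wget_option_help" is a hypothetical helper name; it simply pipes the help text through grep:

```shell
# Hypothetical helper: show only the help lines that mention an option.
# -F treats the pattern as a fixed string; -e stops grep from reading a
# pattern that starts with "-" as one of its own options.
wget_option_help() {
    wget --help | grep -F -e "$1"
}

# Example (requires wget to be installed):
#   wget_option_help --limit-rate
```

With the help text shown above, that example call would print only the "--limit-rate=RATE" line.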