Downloading a Bunch of Files From Sourceforge

Suppose you want to download a whole slew of files from Sourceforge, or from any other site that serves downloads the way Sourceforge does.

What I mean by that is: the site presents you with a URL that looks something like this:

http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/lowsound.whatever/download

Notice how the URL ends with a "download" suffix? The filename you want comes just before it: "lowsound.whatever".

Now, let's suppose there are 100 files you want to download. Hey, they're part of the application you downloaded earlier, but for some reason weren't included initially.

Okay, here's what you can do. First, make sure you have a good clipboard manager installed, like Parcellite, the lightweight GTK+ clipboard manager. Yes, this is for Linux. Anyone else, well, as they say, YMMV.

Open a terminal session, create a directory to work in, and change into it with "cd dirname".

Open a file with your favorite text editor (ed or nano) and call it lowsound.urls.

Navigate your web browser to the Sourceforge page you're interested in and locate the files you want to download. Right-click each filename, copy the link, and paste it into your file (lowsound.urls). Be sure to put each URL on its own line.

Save the file when you’re done.
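Your lowsound.urls file should wind up looking something like this (these URLs are made up, following the pattern shown above):

http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/lowsound.whatever/download
http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/highsound.whatever/download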

Assign a shell variable to contain your URL list. Something like: URLS

Now, load the URLs into the variable:

URLS=`cat lowsound.urls`

Yes, that’s the grave accent "`"
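(If you prefer, the modern $( ) form does exactly the same thing:

URLS=$(cat lowsound.urls)

Use whichever you like.)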

Check it:

echo $URLS

You should see a single line of space-separated URLs.
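With the hypothetical two-line file from earlier, that would print:

http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/lowsound.whatever/download http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/highsound.whatever/download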

Now, if you run this command-line Bash script:

for u in $URLS; do wget $(echo $u | tr '\n' ' '; echo "-O" | tr '\n' ' '; echo $u | sed 's/\/download//' | sed 's/.*\///'); done;

all those files will be downloaded, and they'll have the filenames they're supposed to have.
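(As an aside: if you'd rather skip the $URLS variable altogether, here's a rough equivalent that reads the file directly, using basename and the ${u%/download} expansion to do the same stripping the sed commands do below:

while read -r u; do wget -O "$(basename "${u%/download}")" "$u"; done < lowsound.urls

But let's stick with the one-liner above and pick it apart.)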

How it works.

The script builds an argument list from each individual URL contained in $URLS and passes the result to the wget program, which downloads and saves the files.

for u in $URLS – "u" takes on each URL contained in $URLS, one at a time, once per pass through the loop.
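If the loop variable idea is new to you, try this trivial illustration:

for u in one two three; do echo $u; done

It prints "one", "two", and "three", each on its own line; the loop body runs once per word in the list.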

$(echo $u | tr '\n' ' '; echo "-O" | tr '\n' ' '; echo $u | sed 's/\/download//' | sed 's/.*\///');

This chain of piped commands first echoes the individual URL, with the newline at the end converted into a space. It next adds the "-O", with:

echo "-O" | tr '\n' ' '

to the command as an argument for wget, instructing it to save under a specific filename instead of just defaulting to whatever is at the end of the URL string.
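For example, to grab just the one file from the URL at the top of this post, you could type:

wget -O lowsound.whatever http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/lowsound.whatever/download

Without the -O, wget would save the file under the unhelpful name "download".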

Next, the URL is brought up again with:

echo $u

which is piped through sed twice:

| sed 's/\/download//'

which replaces the string "/download" with the empty string, deleting it. Finally, the URL is stripped down to the actual filename by:

| sed 's/.*\///'

leaving us with the actual filename of the source file we’re trying to download.
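You can watch the two sed commands work on our example URL. Running:

echo http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/lowsound.whatever/download | sed 's/\/download//'

prints the URL with the suffix gone, and adding the second sed:

echo http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/lowsound.whatever/download | sed 's/\/download//' | sed 's/.*\///'

prints just:

lowsound.whatever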

The ";" before "done" marks the end of the loop body for this iteration.

Enclosing the set of commands in $( … ) causes the shell to substitute their combined output right there on the command line, so the generated string is passed to wget as its arguments.
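So for our example URL, once the $( … ) is expanded the shell actually runs:

wget http://sourceforge.net/projects/notarealproject/files/subjecta/Sounds/lowsound.whatever/download -O lowsound.whatever

which is exactly the single-file command we saw earlier, only built automatically.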

The "for – do – done" construction causes the command to be run once for each URL contained in $URLS; the loop ends when the list is exhausted.
