chebe | Entries tagged with images

Go to your images. Check out the metadata;
exiv2 image.jpg
exiftool image.jpg

Remove the exif data;
exiftool -all= image.jpg
exiftool -all= *

exiftool keeps a backup of each file it changes, with _original added to the end of the filename. If you don't need them;
rm *_original
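
If you'd rather not have to clean up the backups afterwards, exiftool has an -overwrite_original option that skips creating them. A minimal sketch, assuming your images are .jpg/.jpeg files sitting in one directory; the strip_exif function name is made up for illustration, only the exiftool options are real:

```shell
#!/bin/bash
# strip_exif DIR: remove all metadata from the JPEGs in DIR (default: current dir).
# Hypothetical helper name; the exiftool options are the real part.
strip_exif() {
    local dir="${1:-.}" f
    for f in "$dir"/*.jpg "$dir"/*.jpeg; do
        [ -f "$f" ] || continue          # skip patterns that matched nothing
        # -overwrite_original: rewrite the file in place, no *_original backup
        exiftool -all= -overwrite_original "$f"
    done
}

# Usage: strip_exif ~/Pictures/blog
```

There's no backup with this option, so try it on copies first.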

Say you have a simple blog and are hosting the images yourself. You want to get an SSL cert, but the only options provided by your hosting provider are expensive and aimed at much larger sites/usage. You've heard about Let's Encrypt, but your hosting provider doesn't provide it on your package (e.g. shared hosting). But, they do provide a way for you to install SSL certs yourself.

SSL certs, the DIY way.

First, you will need a Linux machine, and a way to get files onto your website (FTP or otherwise). We will be doing this manually, so we'll need to create a file at the following location (you'll be given specifics later).
/webspace/httpdocs/$site/.well-known/acme-challenge/$file is equivalent to http://$site/.well-known/acme-challenge/$file

Install certbot. I'm on Fedora so;
sudo dnf install certbot python2-certbot-apache

Run it manually;
sudo certbot certonly --manual --preferred-challenges http

You'll get a warning that your IP will be publicly logged. If this bothers you, perhaps wait to run this until you're on a public internet connection, like at a cafe or hackerspace, or even on holidays.

-------------------------------------------------------------------------------
NOTE: The IP of this machine will be publicly logged as having requested this
certificate. If you're running certbot in manual mode on a machine that is not
your server, please ensure you're okay with that.

Are you OK with your IP being logged?
-------------------------------------------------------------------------------
(Y)es/(N)o: Y

After uploading the $file it's a good idea to view it in your browser to make sure it's working.
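
If you'd rather check from the command line than the browser, something like the following should do. It's a sketch, assuming curl is installed and that your host uses the /webspace/httpdocs layout shown earlier; the function names are made up for illustration.

```shell
#!/bin/bash
# Where the challenge file lives on the host (path layout taken from the post).
challenge_path() {
    printf '/webspace/httpdocs/%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# The URL that will be requested for that same file.
challenge_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# check_challenge SITE FILE EXPECTED: succeed only if the served content
# matches the string certbot told you to put in the file.
check_challenge() {
    local got
    got=$(curl -fsS "$(challenge_url "$1" "$2")") || return 1
    [ "$got" = "$3" ]
}

# Usage: check_challenge "$site" "$file" "the-content-certbot-gave-you" && echo ok
```

Only press Enter in certbot once this succeeds.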

At the end you'll (hopefully) get a success message that tells you when the cert will expire, and where the files are on your system, e.g. /etc/letsencrypt/live/$site/fullchain.pem.
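
Let's Encrypt certs are short-lived, and with the manual method nothing renews them for you, so it's worth being able to check the expiry date later. A sketch using openssl (assuming it's installed; the function names are hypothetical, and you may need sudo to read files under /etc/letsencrypt):

```shell
#!/bin/bash
# cert_enddate CERT: print the notAfter date stored in the certificate.
cert_enddate() {
    openssl x509 -enddate -noout -in "$1"
}

# cert_valid_for CERT DAYS: succeed if CERT is still valid DAYS days from now.
cert_valid_for() {
    openssl x509 -checkend "$(( $2 * 86400 ))" -noout -in "$1"
}

# Usage (path pattern from the certbot success message):
#   cert_valid_for /etc/letsencrypt/live/$site/fullchain.pem 30 || echo "time to renew"
```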

With the cert generated now you'll need to manually install it. You'll need /etc/letsencrypt/live/$site/fullchain.pem and /etc/letsencrypt/live/$site/privkey.pem.

Log in to your hosting provider's control panel. Hopefully they'll have instructions. Basically, copy the contents of fullchain.pem and privkey.pem and paste them into the respective certificate and private key fields. Save, wait a few minutes, and that's it. Your site has SSL.

At this point I copied all my images over to httpsdoc, and updated the links in my blog (for the last year, and the header image in Customize Style). Now if you visit my blog directly you shouldn't get any worrying warnings.

You are Good People, right? You just want to get what's yours (albeit in the laziest way possible), right? You don't want to misuse any tools to cause any damage, right? Okay, great, listen up.

A while back I came up with some cmd line calls using wget to back up my LJ Scrapbook. This method stopped working as LJ restructured Scrapbook a few times. Me, I kept using them as free image hosting. But, for some reason, I've been remotivated to get a backup.

They use Flash. Flash does not play nice with, well, anything. So I am left with the not so elegant brute-force approach.

Log in to LJ. Go to your Scrapbook, view a photo, click on the little share icon. Grab your $usernumber.
Format:
http://ic.pics.livejournal.com/$username/$usernumber/$photonumber/$photonumber_original.jpg
Have a look at your newest upload, note the $photonumber.

While logged in, export your cookies.txt (see previous post; basically, find a plugin for your browser).
Plug your $username and $usernumber into the script below, and set $maxphotonumber to a number greater than the $photonumber you noted.

Run script.

#!/bin/bash
# Fill these in from the steps above.
username=your_user_name
usernumber=your_user_number
maxphotonumber=your_max_number
# Try every photo number from 0 up to and including the maximum.
for (( c=0; c<=maxphotonumber; c++ ))
do
	wget --load-cookies cookies.txt -erobots=off -nd -np -r "http://ic.pics.livejournal.com/$username/$usernumber/$c/${c}_original.jpg"
done

The script simply checks every single number between zero and your maximum number. If there exists an image, it will save it, with the same name. It only checks for the _original images. It saves everything in the one directory. It is not optimised, but it should get everything. Most of the flags aren't needed. But I think they show well just how much patience I lost with this whole thing.
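
A possible variation, which I haven't timed against the original: write out the full list of candidate URLs first, then hand the whole list to a single wget. The -nc flag means that if the run gets interrupted you can start it again and it will skip what's already saved. The photo_urls function name is made up for this sketch.

```shell
#!/bin/bash
# photo_urls USERNAME USERNUMBER MAX: print one candidate URL per photo number.
photo_urls() {
    local c=0
    while [ "$c" -le "$3" ]; do
        printf 'http://ic.pics.livejournal.com/%s/%s/%s/%s_original.jpg\n' "$1" "$2" "$c" "$c"
        c=$((c + 1))
    done
}

# Usage: write the list once, then let one wget process work through it.
#   photo_urls your_user_name your_user_number your_max_number > urls.txt
#   wget --load-cookies cookies.txt -nd -nc -i urls.txt
```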

*edit* About 14 hours to get over 1,000 pictures in a 280,000 number range.

*edit2* If you get a few unopenable files, try different format extensions.
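
Rather than guessing extensions by hand, you can ask the file utility what each download really is. A rough sketch, assuming file is installed and that only JPEG, PNG, and GIF turn up; fix_extension is a hypothetical helper name:

```shell
#!/bin/bash
# fix_extension FILE: rename FILE so its extension matches its real image type.
fix_extension() {
    local f="$1" ext
    case "$(file -b --mime-type "$f")" in
        image/jpeg) ext=jpg ;;
        image/png)  ext=png ;;
        image/gif)  ext=gif ;;
        *) return 0 ;;                   # not a recognised image: leave it alone
    esac
    # Rename only when the extension is wrong; -n refuses to overwrite.
    [ "${f##*.}" = "$ext" ] || mv -n "$f" "${f%.*}.$ext"
}

# Usage: for f in *_original.jpg; do fix_extension "$f"; done
```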

Edit: check the comments for more recent methods some other helpful people have found.

NOTE: these are instructions for LJ's old ScrapBook service. If you've been moved to the new one the process is mostly the same, but some changes will be needed. See link in this comment.

Step 1: cookies.txt

My browser of choice is Firefox, which since v3 has used sqlite databases to store cookies, instead of the older method of dumping them into a .txt file. The auto download tool I'm going to use requires the older style cookies file. So I went and found a plugin (Firefox only). You have to restart after installation.

Now, go to LJ and log in. Then Tools > Export Cookies... and choose a save file location.

Note: the next step can be done without the cookies file, but only your publicly visible gallery>files will be saved. The cookies.txt file is a snapshot. You will have to regenerate it next time you want to back up your Scrapbook if you've logged-out/logged-in in the meantime.
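
Before kicking off a long download it's worth a quick check that the exported file actually contains LJ cookies. The old-style cookies.txt (Netscape format) is tab-separated with the domain in the first of seven fields, so a count is easy. This is a sketch: count_cookies is a made-up name, and the #HttpOnly_ prefix handling is there because some exporters write it, which may not apply to yours.

```shell
#!/bin/bash
# count_cookies FILE DOMAIN: count cookie lines in a Netscape-format cookies.txt
# whose domain field ends in DOMAIN.
count_cookies() {
    awk -F '\t' -v d="$2" '
        /^#HttpOnly_/ { sub(/^#HttpOnly_/, "") }   # strip the HttpOnly marker
        /^#/ || NF < 7 { next }                     # comments and malformed lines
        substr($1, length($1) - length(d) + 1) == d { n++ }
        END { print n + 0 }
    ' "$1"
}

# Usage: count_cookies cookies.txt livejournal.com
```

If that prints 0, the export didn't capture your login and wget will only see public images.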

Step 2: go get those files
I'm using wget. It exists on many platforms but I'm running it, and the other programs used later on, under Linux.

(Do you have any Unsorted files? Please see edit note at bottom of post.)

Option A: greedily grab it all

The short flags are;

-nc; no clobber, meaning don't download additional copies of existing files

-np; no parent, this is very IMPORTANT, it means only look in subdirs, don't ascend up into the depths of livejournal.com saving everything along the way

-r; recursive

-o output.file; redirect output to specified log file

Put it all together and you have;
wget --load-cookies cookies.txt -nc -np -r -o log.txt http://pics.livejournal.com/your_user_name

This will run for quite some time, the more pics you have, the more time it will take.

But wget downloads everything, not just pictures, and in ScrapBook's hierarchy (which is not the most useful for humans). So it's going to need sorting, but I'll leave that up to you.
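
One possible starting point for the sorting: walk the tree wget created and copy out only the files that really are images, since the pic files don't necessarily have extensions to go by. A sketch, assuming the file utility is installed; collect_images and the directory names in the usage line are hypothetical.

```shell
#!/bin/bash
# collect_images SRC DEST: copy every file under SRC that is really an image
# into DEST, flattening wget's directory tree.
collect_images() {
    local src="$1" dest="$2" f
    mkdir -p "$dest"
    find "$src" -type f | while IFS= read -r f; do
        case "$(file -b --mime-type "$f")" in
            image/*) cp -n "$f" "$dest/" ;;   # -n: never overwrite a same-named file
        esac
    done
}

# Usage: collect_images pics.livejournal.com sorted_pics
```

Because of -n, if two images in different directories share a name only the first one found is copied, so compare counts afterwards.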

Option B: pick and choose

Part 1; spider

--spider; follow the links as usual, but don't download any files

-nd; no directories, don't create any

wget --load-cookies cookies.txt --spider -nd -np -r -o spider.txt http://pics.livejournal.com/your_user_name

I left this running for a few hours. My Profile page says I have over 400 ScrapBook files (but that only counts public images), using <300MB of storage. And I have a very unreliable network. In the end spider.txt reached 1.6MB and finished with this;
Downloaded: 817 files, 5.1M in 3m 33s (24.3 KB/s)

Part 2; filter and grab

grep 'pics\.livejournal\.com/your_user_name/pic' spider.txt | sed -r 's#^.*(pics\.livejournal\.com/your_user_name/pic/[0-9a-z]{8}).*$#http://\1#' | sort -u > links.txt

Resulting links.txt file was 635 lines/links long.
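
To see what the filter is doing, here it is wrapped in a function that takes the user name as a parameter and reads the log on stdin. The extract_links name is made up, and the sample log line in the usage comment is only my guess at what a spider.txt line looks like; the idea is that everything before and after the 8-character pic ID gets thrown away, and duplicates collapse to one link.

```shell
#!/bin/bash
# extract_links USER: read a wget log on stdin, print unique pic page links.
extract_links() {
    local user="$1"
    # Keep only lines mentioning this user's pic pages, cut each one down
    # to the pic URL (8-character ID), then sort and drop duplicates.
    grep "pics\.livejournal\.com/${user}/pic" |
        sed -E "s#^.*(pics\.livejournal\.com/${user}/pic/[0-9a-z]{8}).*\$#http://\1#" |
        sort -u
}

# Usage: extract_links your_user_name < spider.txt > links.txt
# A line such as (assumed format)
#   --2010-01-01 12:00:00--  http://pics.livejournal.com/alice/pic/0001abcd/s320x240
# would be reduced to the link for pic 0001abcd.
```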

-i input.file; input file of links to visit

wget --load-cookies cookies.txt -i links.txt -np -o dl.txt

dl.txt ending with;
Downloaded: 629 files, 262M in 56m 56s (122 KB/s)
And after everything, 629 images end up on my local hard-drive, weighing in at just 264MB.

If you are getting 'ERROR 403: Forbidden' messages you probably aren't signed in while trying to access protected images. Make sure you are loading your cookies!
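
A quick way to see how many downloads were refused is to count the 403 lines in the log. A tiny sketch; forbidden_count is a hypothetical name, and it simply matches the 'ERROR 403' text quoted above:

```shell
#!/bin/bash
# forbidden_count LOG: number of lines in a wget log that report ERROR 403.
forbidden_count() {
    grep -c 'ERROR 403' "$1"
}

# Usage: forbidden_count dl.txt
```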

When all is said and done you'll have your images, with their LJ names (helpful if you've linked them in journal posts but want to move image hosting), but you'll lose any gallery/directory structure you may have hoped for.

And please don't abuse this tool. Annoying LJ won't help anyone get their images.

*edit* Are you missing the images from the Unsorted gallery? Yeah, me too. Seems that because they aren't linked like the others they get missed.

> Prevention: before you do anything with wget create a gallery and add all the images in Unsorted to it.
> Patch: hopefully you don't have too many Unsorted. Go through them one-by-one copying link location (leave off the trailing /), put them in a new .txt file, and run the last command with the new input file. (Or just save them manually while you're there anyway getting the links.)
If you do have lots, you could try spidering from http://pics.livejournal.com/manage/pics?gal=1, but I haven't done it.

Current Mood: productive


Busy hands | Disquiet mind

Craft and Tech Notebook

Entries tagged with images

Remove exif data from your images in linux

Let's Encrypt and manually generating certs

Backing-up my LJ ScrapBook pics, redux

Backing-up my LJ ScrapBook pics
