Website Screenshots and Thumbnails on Linux (updated) 
May 9th, 2007
For our new project RobotReplay we’ve been looking into displaying data as overlays on images of websites. There is a quick and simple ActiveX component that we could use if we were still on Windows… but we’ve moved on to greener pastures and had to find a new way.
khtml2png gave us exactly what we were looking for. The only issue then was running it headless. After wrestling with Xvfb for a while with no success we tried with a vncserver and it worked just fine.
vncserver :1 -geometry 1024x768 -depth 24
export DISPLAY=:1
khtml2png2 --width 1024 --height 768 http://www.nitobi.com nitobi_image.png
khtml2png got us most of the way, but we still needed to reduce the images from 1024×768 to a more thumbnailish size (the width and height command line options specify the size of the browser window, not of the output image). We wanted to avoid a post-processing step such as running ImageMagick.
Adding your own scaling is as easy as changing a single line (line 328) in the khtml2png.cpp file.
return pix->convertToImage().save(file, format);
becomes
return pix->convertToImage().smoothScale(scaledWidth, scaledHeight, QImage::ScaleMin).save(file, format);
Here scaledWidth and scaledHeight are the dimensions you want to scale to. With QImage::ScaleMin as the third argument, the image keeps its aspect ratio and is made as large as possible while still fitting inside a rectangle of width scaledWidth and height scaledHeight. Substitute QImage::ScaleMax to make the image as small as possible while still covering that rectangle, so that one dimension matches exactly (either the width equals scaledWidth or the height equals scaledHeight) and the other may exceed it.
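To make the difference between the two modes concrete, here is a small Qt-free sketch of the arithmetic. It is meant to mirror the behaviour described above rather than reproduce Qt's implementation: the names `Size`, `ScaleMode` and `scaledSize` are made up for this illustration, and truncating integer division is an assumption.

```cpp
// Illustration only: Size, ScaleMode and scaledSize are hypothetical names,
// not part of Qt or khtml2png.
struct Size {
    int width;
    int height;
};

enum class ScaleMode { Min, Max };

// Scale `src` against the target rectangle `box`, preserving aspect ratio.
// Min: largest size that still fits inside the box.
// Max: smallest size that still covers the box.
// Assumes src.width and src.height are both positive.
Size scaledSize(Size src, Size box, ScaleMode mode) {
    // Width the image would have if its height were set to the box height.
    long long widthAtBoxHeight =
        static_cast<long long>(box.height) * src.width / src.height;

    bool matchHeight = (mode == ScaleMode::Min)
                           ? widthAtBoxHeight <= box.width
                           : widthAtBoxHeight >= box.width;

    if (matchHeight) {
        return {static_cast<int>(widthAtBoxHeight), box.height};
    }

    // Otherwise pin the width to the box and derive the height from it.
    long long heightAtBoxWidth =
        static_cast<long long>(box.width) * src.height / src.width;
    return {box.width, static_cast<int>(heightAtBoxWidth)};
}
```

For a 1024×768 capture and a 120×120 target, `ScaleMode::Min` yields 120×90 (fits inside the square), while `ScaleMode::Max` yields 160×120 (covers the square).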
I added command line switches for scaled-width and scaled-height to our copy of khtml2png. Download the khtml2png-nitobi patch here and patch against khtml2png-2.6.0.
wget http://blogs.nitobi.com/jake/files/khtml2png-nitobi.patch.zip
gunzip khtml2png-nitobi.patch.zip
cd khtml2png-2.6.0
patch < ../khtml2png-nitobi.patch
You’ll then need to build khtml2png from the patched source. Afterwards you can call khtml2png like so:
khtml2png2 --width 1024 --height 768 \
  --scaled-width 120 --scaled-height 90 \
  http://www.nitobi.com nitobi_image.png
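If you want to drive captures from your own program rather than typing the command each time, the invocation can be assembled in code. What follows is only a rough sketch under a few assumptions: `shellQuote`, `CaptureOptions`, `buildCaptureCommand` and `runCapture` are hypothetical helpers written for this post, not part of khtml2png; the --scaled-width and --scaled-height switches only exist in a build with the patch applied; and it expects a POSIX system with an X display (such as the vncserver on :1) already running.

```cpp
#include <cstdlib>
#include <string>

// Wrap a string in single quotes so the shell treats it as one literal argument.
std::string shellQuote(const std::string& s) {
    std::string out = "'";
    for (char c : s) {
        if (c == '\'') {
            out += "'\\''";  // close the quote, emit an escaped quote, reopen
        } else {
            out += c;
        }
    }
    out += "'";
    return out;
}

// Hypothetical option bundle; the defaults match the browser size used in the post.
struct CaptureOptions {
    int width = 1024;      // browser window width
    int height = 768;      // browser window height
    int scaledWidth = 0;   // 0 means: do not pass the scaling switches
    int scaledHeight = 0;
};

// Build the khtml2png2 command line as a single shell string.
std::string buildCaptureCommand(const std::string& url,
                                const std::string& outputFile,
                                const CaptureOptions& opts) {
    std::string cmd = "khtml2png2";
    cmd += " --width " + std::to_string(opts.width);
    cmd += " --height " + std::to_string(opts.height);
    if (opts.scaledWidth > 0 && opts.scaledHeight > 0) {
        // Only understood by the patched build.
        cmd += " --scaled-width " + std::to_string(opts.scaledWidth);
        cmd += " --scaled-height " + std::to_string(opts.scaledHeight);
    }
    cmd += " " + shellQuote(url) + " " + shellQuote(outputFile);
    return cmd;
}

// Point DISPLAY at an already running X server and run the capture.
// Returns whatever std::system reports for the command.
int runCapture(const std::string& display,
               const std::string& url,
               const std::string& outputFile,
               const CaptureOptions& opts) {
    ::setenv("DISPLAY", display.c_str(), 1);  // POSIX; overwrites any existing value
    return std::system(buildCaptureCommand(url, outputFile, opts).c_str());
}
```

Calling `runCapture(":1", "http://www.nitobi.com", "nitobi_image.png", opts)` with `opts.scaledWidth = 120` and `opts.scaledHeight = 90` would run the same command as shown above, with the URL and file name quoted for the shell.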
This entry was posted on Wednesday, May 9th, 2007 at 12:03 pm and is filed under Development, RobotReplay.

May 12th, 2007 at 9:48 am
Nice one Jake!
May 21st, 2007 at 12:44 pm
Any experience with khtmld (the daemon)? And what does headless mean? :) I thought it would mean without using a vncserver or the like. Is that possible?
May 22nd, 2007 at 9:15 pm
Ok I reached this far. But when I run
khtml2png2 --display :1 --width 1024 --height 768 http://www.google.com google.png
I get
kbuildsycoca running…
and then nothing happens. It looks like it hangs. I’d really appreciate if you could help me with this.
May 23rd, 2007 at 6:55 am
Things worked! with khtml2png version 1.x
But my problems continue. I couldn’t get Flash to work with Konqueror :(. I really wanted to be able to capture Flash movies as well.
June 1st, 2007 at 10:42 am
did what you specified and it worked! thanks :)
just wondering - did you understand why using the command khtml2png (i.e. using version 1) works? cos khtml2png2 doesn’t for me.
also how do you run your patch - do you need to be in a specific directory?
thanks ,
June 5th, 2007 at 11:05 am
@Animesh: Headless in this case just meant without a monitor. I wanted to use Xvfb, a virtual frame buffer for X, but couldn’t get it to fly. The overhead of the vncserver is pretty minimal anyhow.
June 5th, 2007 at 11:48 am
@joe: I actually do use khtml2png version 2 - just a typo in my command line above. That’s what you can patch against, the latest source of version 2.
In the source directory of khtml2png, you patch like so:
November 8th, 2007 at 3:20 pm
When I look at the .png files, they look very poor. None of the images display and the whole page isn’t captured. Oftentimes the text is covered up as well. How do I make it work properly?
November 30th, 2007 at 6:44 pm
Jake, really nice job on the scaling modification, works great on 2.6.0 version.
However, the new version (2.6.7) uses a different syntax in its functions, so your patch will not work. Florent Bruneau cleaned up the code really nicely, but changed enough along the way that we can’t apply the patch anymore. :)
I was wondering if you have the time to check the new code and post a new solution. C++ is not my forte.
Thanks for your help.