On digitizing and editing figures

Recently I have observed that if I don't perform a routine task for a few months, I tend to forget the steps needed in the process. I have to relearn the process and that's a waste of time. The intention of this blog post is to provide a list of tips for my use which can also be of use to other Linux users. I like free software. Most of the software on this list can be downloaded from SourceForge or some such place.

 

My job involves a lot of validation of numerical codes. Since I don't do experiments, I have to get my experimental data from published sources. That often involves digitizing two-dimensional plots. In this post I'll lay out the steps that I use for the entire process. In the next post I'll talk about version control and my experience with Subversion.

 

  • Scanning plots

     

    The first step in the process is to create or locate a PDF file containing the plot of interest. If the plots are from recently published papers, then I can usually get a distortion-free PDF version from my library. Otherwise, thanks to the electronic revolution, most of the papers that I request from Interlibrary Loan are delivered in PDF format without much distortion. If I only have the paper in printed form, I use the copier next to my office to scan it to PDF. If the file is in PostScript format then I use epstopdf to convert it into PDF.

     

  • Grabbing plots

     

    The next step is to grab the plot of interest from the PDF file. I use ImageMagick for the process. ImageMagick can easily be installed on Debian using apt-get (or some such method on other distributions). The main executable for ImageMagick is called display. When you run display, a screen containing the ImageMagick logo pops up with a FileSection widget. One of the bottom buttons on the widget says Grab. Click on that. You get a dialog that asks for a time delay and has another Grab button. Move your PDF to a position where the plot of interest is not obscured and click on Grab. Move the cursor over the plot, press the left mouse button, and drag the mouse until the box contains the region of interest. Then let go of the mouse button. Your selection shows in the ImageMagick window. Click the left mouse button inside the image to get the Command buttons. Click on File->Save and save the image in the JPEG format with no compression. Now you have the plot that you want to digitize. It takes much longer to write down the procedure than to actually do it.

     

  • Digitizing plots

     

    I use Engauge to do my digitization. All you have to do is download Engauge and install it somewhere under your home directory. The executable is called digitizer. Then set your paths so that digitizer can be found. Next, run digitizer and you will get a gray window with some buttons and icons near the top. Click on File->Import. You will get a FileChooser dialog. Select the JPEG file containing the plot. The digitizer will try to scan in the segments it finds and highlight them in green on the imported plot. Ignore the green stuff. Instead, click on the icon at the top that shows just the axes (with the origin and the two ends marked with red crosses). Take your mouse down to the origin and click on the point. You will get a dialog that asks for the coordinates. Enter the coordinates given in the plot. Do the same thing for the other two limits of the axes. If the axes are OK then you will get a message saying so. The software corrects for any distortion in the axes at this stage. Then take the mouse to the icon that shows a pair of axes and a blue line connecting two black crosses. Click on that. You are now ready to digitize the curves from your image. Select the points of interest on the curves and a straight line connecting the points will be drawn for you. Once one curve is complete you can click on the Settings->Curves menu and add a New curve to the list. Next choose the new curve (it will be called "Curve 2" by default) in the icon next to the icon you had selected before. Digitize the points for this curve and repeat if necessary. Before saving the data click on Settings->Export Setup. In the dialog box that opens up, select Raw X's and Y's, One curve on each line, Spaces, None and press OK. Then select File->Export and name the data file. I usually call the files .dat.
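
    To make the axis step less of a black box, here is a minimal Python sketch of how three clicked axis points can define a map from pixel positions to plot coordinates. It is not Engauge's actual code. The function name make_axis_transform is made up for illustration, and the sketch assumes linear (not logarithmic) axes. Because the map is built from the three points themselves, a scan that is slightly rotated or sheared is still handled.

```python
def make_axis_transform(pixels, values):
    """Return a function mapping a pixel (px, py) to plot coordinates (x, y).

    pixels: three (px, py) pairs, e.g. the origin and the two axis ends.
    values: the three matching (x, y) values read off the plot.
    Assumes linear axes; hypothetical helper, not part of Engauge.
    """
    (p0x, p0y), (p1x, p1y), (p2x, p2y) = pixels
    (v0x, v0y), (v1x, v1y), (v2x, v2y) = values

    # Pixel-space vectors from the first point to the other two.
    ax, ay = p1x - p0x, p1y - p0y
    bx, by = p2x - p0x, p2y - p0y
    det = ax * by - bx * ay
    if det == 0:
        raise ValueError("the three axis points are collinear")

    def to_graph(px, py):
        dx, dy = px - p0x, py - p0y
        # Solve (dx, dy) = s * a + t * b for s and t (Cramer's rule).
        s = (dx * by - bx * dy) / det
        t = (ax * dy - dx * ay) / det
        # Apply the same weights to the known plot coordinates.
        x = v0x + s * (v1x - v0x) + t * (v2x - v0x)
        y = v0y + s * (v1y - v0y) + t * (v2y - v0y)
        return x, y

    return to_graph


# Example with a slightly skewed scan: the x axis end sits 10 pixels lower
# than the origin and the y axis end 10 pixels to the left of it.
to_graph = make_axis_transform(
    pixels=[(100, 400), (500, 410), (90, 50)],
    values=[(0.0, 0.0), (10.0, 0.0), (0.0, 200.0)],
)
print(to_graph(295, 230))
```

    The pixel (295, 230) lies halfway along both axes in this made-up example, so it should come out near the middle of the plotted range.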

     

  • Checking plots

     

    The next step that I usually need is to check that everything looks OK. I plot the data in Matlab. At that stage I also have some model prediction that I want to compare with the experimental data. I plot the predicted values on the same plot in Matlab. Then, depending on whether the plot is for my own purposes or for a journal paper, I save it as a color EPS file or a black and white EPS file using "print -depsc SomePlot.eps" or "print -deps SomePlot.eps".
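
    The visual check can be backed up with a number. The following is a minimal Python sketch (not part of my Matlab workflow) that interpolates the model curve at each digitized x value and reports the root-mean-square gap. The names interpolate and rms_difference are made up for illustration, and the sketch assumes the model curve has at least two points with strictly increasing x values that cover the digitized range.

```python
from bisect import bisect_right
from math import sqrt


def interpolate(xs, ys, x):
    """Linearly interpolate the curve (xs, ys) at x.

    Assumes xs is strictly increasing and has at least two entries.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the model curve")
    # Index of the right-hand end of the segment that contains x.
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def rms_difference(experiment, model_xs, model_ys):
    """Root-mean-square gap between digitized points and the model curve.

    experiment: list of (x, y) pairs taken from the digitized data.
    """
    gaps = [y - interpolate(model_xs, model_ys, x) for x, y in experiment]
    return sqrt(sum(g * g for g in gaps) / len(gaps))


# Made-up numbers: the model is the line y = 2x sampled at three points.
model_xs = [0.0, 1.0, 2.0]
model_ys = [0.0, 2.0, 4.0]
digitized = [(0.5, 2.0), (1.5, 2.0)]
print(rms_difference(digitized, model_xs, model_ys))
```

    A large value here usually means either a real disagreement with the model or a mistake in the axis coordinates entered during digitization, so it is worth looking at the plot before trusting the number.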

     

  • Editing plots

     

    I usually take care to make sure that the line widths and fonts that I use in Matlab are readable when reduced in size. Even then there are situations when there is some extra information that needs to be added to the plots or some line widths and font sizes need to be increased. I need to be able to edit my EPS files.

     

    I use pstoedit to convert my EPS figures generated by Matlab into the XFig format using the command "pstoedit -f xfig .eps .fig". I then edit the XFig file and add arrows and other such things. Then I save the figure as EPS and convert it into PDF if necessary using epstopdf. The final result usually looks much better than plots generated by Matlab alone.

     

  • Joining PDF files together 

     

    Sometimes I also need to join some of these PDF figures together (without creating a LaTeX document using pdflatex). I do that by using the following pointer from Scott Nesbitt - do it the GhostScript way.
    gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=joinedFile.pdf figure1.pdf figure2.pdf
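
    If the join has to be repeated for many figures, the same GhostScript call can be wrapped in a small script. This is only a sketch in Python: the function names gs_join_command and join_pdfs are made up for illustration, and it assumes that gs is on your path. The options are exactly the ones in the command above.

```python
import subprocess


def gs_join_command(output, inputs):
    """Build the GhostScript argument list that joins inputs into output."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        "gs",
        "-dBATCH",            # exit when all input files are processed
        "-dNOPAUSE",          # do not pause between pages
        "-q",                 # suppress startup messages
        "-sDEVICE=pdfwrite",  # write PDF output
        f"-sOutputFile={output}",
        *inputs,
    ]


def join_pdfs(output, inputs):
    """Run GhostScript; raises CalledProcessError if gs reports failure."""
    subprocess.run(gs_join_command(output, inputs), check=True)


# Print the command instead of running it, so it can be inspected first.
print(" ".join(gs_join_command("joinedFile.pdf", ["figure1.pdf", "figure2.pdf"])))
```

    Passing the arguments as a list avoids shell quoting problems when file names contain spaces.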

     

Comments

vh's picture

Biswajit,

For Windows users, there is a free program called WINDIG which can be used to digitize plots. I use Ubuntu and it comes with a digitizing program called g3data.

V. Hegadekatte, University of Karlsruhe, Germany

Mogadalai Gururajan's picture

Dear Biswajit,

If you have LaTeX installed (which, I am certain you do), the following command also works for concatenating pdf files:

texexec --pdfarrange --result out.pdf in1.pdf in2.pdf in3.pdf …

Here is some more information on manipulating pdf files (in GNU/Linux OS). 

Mark T Fondrk's picture

Thanks for the tips. I like Matlab too, but GNU Octave is free. Or for making plots, my favorite is Gnuplot. Linux users probably have both and Windows users can get both free as part of the Cygwin package.

 -Mark

Thanks for the tips.  I'll try them out.   Please post other tips and time-saving techniques that you might know of (for figures).

Rant: It drives me crazy when I see raster images in papers when vector images are just as easy to create.  The quality of figures in journal papers seems to have deteriorated since the late 1980s.  In the past there were expert draftsmen who drew figures.   They spent quite a bit of time drawing plots and other figures.  Now that most such draftsmen have lost their jobs, the final versions of figures are made by the authors of a paper.  The result is often amateurish and in some cases undecipherable (see for example the Proceedings of the APS Shock Compression Conferences).  Three of the last five papers that I have reviewed had figures that were illegible when printed in black and white.  We need to bring back the professionals :)

Arun K. Subramaniyan's picture

Biswajit,

Since you use Matlab to plot the digitized data, you may like to use a digitizer written in Matlab. I wrote a simple digitizer in Matlab some time ago. It is not as sophisticated as Engauge, but it gets the job done. Do take a look (link) and see if it is useful.

 Arun

Gentlemen: Hope you can help. I need a freeware program which will scan and digitize a bmp or jpg spectral profile and give me (export) a .dat file. I tried Peak Explorer but discovered it's the wrong software: it only works with .dat numeric input. I tried Windig many times but cannot get it to work. It's very cumbersome and I cannot follow the instructions. Is there anything simple with good instructions that you can recommend?

I need this because most spectral libraries provide bmp profiles only, so I need to take the profile and convert it to a .dat file which my processing software Vspec will open and give an image of. Any help greatly appreciated.

Jerry 

 

You might find Dagra useful for digitizing graphs on Windows. It uses Bezier curves, so you can trace quite complicated data without having to click on every point. Very fast. There is a trial available here: http://www.blueleafsoftware.com/Products/Dagra/

Paul.

 
