
Trying to NOT falsify data

March 19, 2008

The vast majority of the data for my paper consists of photographic images. I have five figures and all but one of them are going to be mostly photographs. These images are of my cells and the images are supposed to answer the following question: where is Protein 1 when I do something severe to Protein 2?

Now, these data are exclusively qualitative. So, you would think that there wouldn’t be much of an issue with image processing. As long as I show images that look like what the cells actually do look like through the scope, then I’m fine, right?

Well, there exists in this world a program called Photoshop. With it, you can do any number of things to photographic images. For instance, I can take an image that looks like this:

And turn it into this:

And then, I can crop the image so that it looks like this:

But, should I? And why would I want to?

First of all, is the first image representative of what I see through the microscope? No. There is way too much green background (green that is not in spots) in the cells. That is most definitely not what they look like when I look at them through the scope. So, I can attempt to subtract out the green background. However, I can’t get rid of all of it, because when I look through the scope there is a bit of green background fluorescence (cells naturally have some). On top of that, when I try to get rid of all of it, I end up losing the green spots, which are my data.

Even more concerning, what if that green background is not actually background, but diffuse staining of my protein throughout the cell? That would mean that not all of my protein is found in spots. So, by getting rid of the background, I could be getting rid of data. If my purpose in showing this figure is to say that all of my protein is in these particular green spots and I manipulate the image to show that you only see green in spots, then I am misrepresenting my data.
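To make that worry concrete, here is a minimal sketch in plain Python (not what Photoshop does internally, and not my actual workflow). It assumes the green channel is an 8-bit grid of pixel values, subtracts the same offset from every pixel, and reports what fraction of the originally non-zero pixels end up at zero. The function names, the pixel values, and the offset are all made up for illustration.

```python
def subtract_background(channel, offset):
    """Subtract the same offset from every pixel, clamping at 0."""
    return [[max(p - offset, 0) for p in row] for row in channel]


def fraction_clipped(channel, offset):
    """Fraction of originally non-zero pixels that the subtraction pushes to 0."""
    nonzero = [p for row in channel for p in row if p > 0]
    if not nonzero:
        return 0.0
    clipped = sum(1 for p in nonzero if p <= offset)
    return clipped / len(nonzero)


# Hypothetical 3x4 green channel: dim haze (around 20) plus two bright spots.
green = [
    [20, 22, 18, 21],
    [19, 200, 210, 20],
    [21, 23, 19, 0],
]

offset = 20  # hypothetical background level, chosen by eye
cleaned = subtract_background(green, offset)
lost = fraction_clipped(green, offset)
print(cleaned)
print(f"{lost:.0%} of non-zero pixels were set to zero")
```

If that haze is really diffuse protein rather than background, the clipped fraction is a rough measure of how much signal the "cleanup" threw away, which is a reason to keep the unprocessed original alongside the figure.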

Finally, by cropping the image, am I selecting a subset of images that look like what I want them to look like rather than what the majority of cells actually look like? I have no choice about cropping the image–I can’t possibly include the whole thing in the paper. The first image I showed you isn’t even the entirety of the original image. The original image is 18 x 11 inches in size, whereas my final image for the paper is probably going to be 1.5 inches in size. In order to be able to include more cells, I can resize the image to be 11 inches wide and I have done that with the actual image I’m going to use for my paper. Then, I can crop the image so that it’s only 1.5 inches wide–but only if the cells I include are representative of the majority of the cells I see when I look through the scope.

The Journal of Cell Biology has this to say about image manipulation:

No specific feature within an image may be enhanced, obscured, moved, removed, or introduced. The grouping of images from different parts of the same gel, or from different gels, fields, or exposures must be made explicit by the arrangement of the figure (i.e., using dividing lines) and in the text of the figure legend. If dividing lines are not included, they will be added by our production department, and this may result in production delays. Adjustments of brightness, contrast, or color balance are acceptable if they are applied to the whole image and as long as they do not obscure, eliminate, or misrepresent any information present in the original, including backgrounds. Without any background information, it is not possible to see exactly how much of the original gel is actually shown. Non-linear adjustments (e.g., changes to gamma settings) must be disclosed in the figure legend. All digital images in manuscripts accepted for publication will be scrutinized by our production department for any indication of improper manipulation. Questions raised by the production department will be referred to the Editors, who will request the original data from the authors for comparison to the prepared figures. If the original data cannot be produced, the acceptance of the manuscript may be revoked. Cases of deliberate misrepresentation of data will result in revocation of acceptance, and will be reported to the corresponding author’s home institution or funding agency. [emphasis added]
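The distinction the guidelines draw between whole-image brightness/contrast changes and non-linear ones like gamma can be sketched in a few lines. This is only an illustration under my own assumptions (8-bit pixel values, made-up function names and parameters); it is not the journal's definition or Photoshop's implementation.

```python
def linear_adjust(channel, gain, bias):
    """Linear brightness/contrast: out = gain * p + bias, applied to every pixel."""
    return [[min(255, max(0, round(gain * p + bias))) for p in row] for row in channel]


def gamma_adjust(channel, gamma):
    """Non-linear adjustment: out = 255 * (p / 255) ** gamma, applied to every pixel."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in channel]


# Hypothetical row of pixels: black, dim background, mid-level signal, white.
pixels = [[0, 20, 100, 255]]

print(linear_adjust(pixels, gain=1.2, bias=10))  # every pixel scaled and shifted alike
print(gamma_adjust(pixels, gamma=0.5))           # endpoints fixed, mid-tones pulled up
```

Both functions touch every pixel in the same way, so neither singles out a feature. The gamma version, though, changes dim and bright pixels by different proportions, which is presumably why the journal asks for it to be disclosed in the figure legend.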

It’s somewhat tricky ground. I have to try to get rid of some of the green background in order to make the image true to what I see when I look through the scope, but if I get rid of too much of it then I’m massaging the data. Add to that the fact that I took these images a couple of weeks ago and am only now getting around to processing them, which means that my memory of what things truly looked like through the scope is a little bit fuzzy. Do I really remember it having less green background, or is that what I remember because ideally there would be less green background? Because of this, I’m going to repeat the experiment and process the images immediately after I collect them.

As you can see, it is relatively easy to manipulate your data to the point of falsification, even if you don’t mean to. The vast majority of scientists truly intend to present their data in a way that is truthful, but it is difficult to completely eliminate personal bias. I would love for all of my images to look like that last one, but that’s not reality and to present that image in a paper would be misleading. In some ways, I wish the image collection and processing were being done by someone other than me–someone who is not emotionally attached to the project in the way that I am. On the other hand, then I would have to trust that person to not manipulate the data in a way that is not truthful.
