Kuwait J. Sci. Eng. 34 (2B) pp. 73-85, 2007
Adaptive architecture neural nets for medical image compression
ROBINA ASHRAF, MUHAMMAD AKBAR AND NOMAN JAFRI
College of Signals, National University of Sciences & Technology, Pakistan.
ABSTRACT
In this paper a technique is proposed for medical image compression using neural
network based vector quantizers. There exist hundreds of modalities of medical images
and each modality has hundreds of subclasses for different organs. In such a situation, it
is difficult to generalize a neural network for all modalities. To tackle this problem, and
given prior knowledge that medical images of a single type are similar in nature, we
propose a flag byte which is automatically set from the image size and some other features.
This flag byte is then used to select the size of the net and codebook. The proposed
method leads not only to dynamic neural net architectures but also to an
adaptive selection of codebook sizes. This method yields high compression ratios with
much better quality than existing standards.
Keywords: Image compression; Learning Vector Quantizer; Self Organizing
Feature Maps.
INTRODUCTION
A few years ago radiologists were exclusively using films and view boxes for
their diagnoses. However, the computer revolution has completely changed
medical imaging systems, which are moving towards a film-less environment.
Digital systems are an integral part of CT, MRI, PET, SPECT and ultrasound
imaging, and even non-digital film X-rays are gradually evolving to digitized
imaging. All these digital imaging technologies are rich in data and difficult to
store, transmit and manipulate. Thus, compression has become an indispensable
tool in the use of these technologies.
Existing standards are mostly based on fixed transforms. In JPEG 2000, multiresolution
algorithms including wavelet approaches are somewhat adaptive (ISO/IEC JTC1/SC29/WG1
WG1N1523 1999). Neural network compression uses adaptive techniques (Jiang
1999). Other advantages of NN over JPEG may include robustness under noisy
conditions and simple decoding, while the drawbacks of NN compression include
slow training and moderate compression ratios. According to Jiang (1999), the quality of
the reconstructed image is highly dependent on the training data. This paper introduces a
different approach to NN compression that addresses these problems.
This paper is organized as follows. Section 2 introduces NNVQ (Neural
networks vector quantizer). In section 3 performance evaluation factors are
discussed. The proposed method is then described in section 4. Section 5 reports
experimental results. The paper concludes with a summary that highlights
advantages and disadvantages of the proposed method and some future prospects.
NNVQ (NEURAL NETWORK VECTOR QUANTIZER)
A neural network can be defined as a massively parallel distributed processor
that has a natural propensity for storing experiential knowledge and making it
available for use (Haykin 2001). Neural networks are trained using examples of
the data which the network will encounter. During training, the network forms an
internal representation of the state space so that novel data presented later
will be satisfactorily processed by the network.
Vector quantization can be defined as a mapping $Q$ of the $K$-dimensional
Euclidean space $\mathbb{R}^K$ into a finite subset $Y$, thus $Q: \mathbb{R}^K \to Y$ (Gersho & Gray
1992). Codebook design plays a significant role in the performance of VQ.
Design techniques attempt to produce a codebook that is optimum for a given source
in the sense that the average distortion is kept to a minimum. The most widely
used technique for codebook design is the LBG (Linde-Buzo-Gray) algorithm
(Linde et al. 1980). The LBG algorithm is very sensitive to codebook
initialization. In addition, while LBG converges to a local minimum, it may not
reach a global minimum. Furthermore, it is computationally expensive, since
each iteration requires an exhaustive search through the entire codebook.
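The mapping $Q$ and the exhaustive codebook search can be illustrated with a minimal Python/NumPy sketch. The function names `quantize` and `average_distortion` and the toy two-entry codebook are illustrative assumptions, not part of the original MATLAB work.

```python
import numpy as np

def quantize(vectors, codebook):
    """Map each K-dimensional input vector to the index of its nearest code vector."""
    # Exhaustive search: squared Euclidean distance from every input vector
    # to every code vector, shape (num_vectors, codebook_size).
    diffs = vectors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return np.argmin(dists, axis=1)

def average_distortion(vectors, codebook):
    """Mean squared Euclidean distance between inputs and their code vectors."""
    indices = quantize(vectors, codebook)
    return float(np.mean(np.sum((vectors - codebook[indices]) ** 2, axis=1)))

# Toy example (hypothetical data): a codebook of two 2-dimensional code vectors.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
vectors = np.array([[1.0, 0.0], [9.0, 10.0], [0.0, 1.0]])
indices = quantize(vectors, codebook)
distortion = average_distortion(vectors, codebook)
```

The cost of `quantize` grows with the product of the number of input vectors and the codebook size, which is the exhaustive search cost referred to above.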
In recent research (Laha et al. 2004, Ferguson & Allinson 2004, Asari 2005),
the unsupervised learning neural network referred to as SOFM was shown to
provide good VQ codebooks, leading to better quality reconstructed images
compared to LBG designed codebooks. Other advantages include less sensitivity
to codebook initialization, better rate distortion performance and faster
convergence. The SOFM algorithm computes a set of vectors $(w_1, w_2, \ldots, w_k)$
which are used as code vectors. Kohonen (2001) introduced the concept of
classes ordered in a topological map of features. In many clustering algorithms
the input vector $x$ is classified and only the winning class is modified during each
iteration. In the SOFM algorithm the vector $x$ is used to update not only the
winning class but also its neighboring classes according to the following rule:
\[
x \in c_i \quad \text{if} \quad \|x - w_i\| = \min_j \|x - w_j\|,
\]
\[
w_j(t+1) = w_j(t) + \alpha(t)\,[x - w_j(t)] \quad \text{for } c_j \in N(c_i, t)
\]
and
\[
w_j(t+1) = w_j(t) \quad \text{for } c_j \notin N(c_i, t),
\]
where $x$ is the input vector, $w_i$ is the weight vector for class $i$ and $N(c_i, t)$ is the set of
classes which are in the neighborhood of the winning class $c_i$ at time $t$.
The neighborhood of a class is defined according to some distance measure
on a topological ordering of the classes. Initially the neighborhood may be quite
large, while as training progresses the size of the neighborhood shrinks to eventually
include only one class. Figure 1 shows the scheme of NNVQ, which is a
combination of unsupervised classification for codebook generation and
supervised LVQ as a refining layer.
Fig.1: NNVQ
The learning rate parameter $\alpha$ ($0 < \alpha < 1$) is typically initialized to 0.1 and then
decreased monotonically with each iteration. After a suitable number of
iterations the codebook typically converges and training is terminated. An LVQ
stage is added to fine-tune the code vectors generated by the SOFM with supervised
training. The last step is an entropy coder to further compress the indices.
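The SOFM update rule and the shrinking neighborhood can be sketched in Python/NumPy as follows. This is a simplified illustration under stated assumptions, not the authors' MATLAB implementation: the classes are arranged on a one-dimensional map, the learning rate and neighborhood radius decay linearly, and the LVQ refinement and entropy coder are omitted. The function name `train_sofm` and all parameter names are hypothetical.

```python
import numpy as np

def train_sofm(data, num_codes, epochs=20, alpha0=0.1, seed=0):
    """Train a 1-D self-organizing feature map and return its code vectors."""
    rng = np.random.default_rng(seed)
    # Assumption: code vectors are initialised from randomly chosen training vectors.
    start = rng.choice(len(data), size=num_codes, replace=False)
    weights = data[start].astype(float)
    positions = np.arange(num_codes)        # 1-D topological ordering of classes
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            alpha = alpha0 * (1.0 - frac)   # learning rate decreases monotonically
            # Neighborhood radius shrinks until only the winner is updated.
            radius = int(round((num_codes // 2) * (1.0 - frac)))
            winner = np.argmin(np.sum((weights - x) ** 2, axis=1))
            in_hood = np.abs(positions - winner) <= radius
            # w_j(t+1) = w_j(t) + alpha(t) [x - w_j(t)] for classes in the neighborhood;
            # classes outside the neighborhood are left unchanged.
            weights[in_hood] += alpha * (x - weights[in_hood])
            step += 1
    return weights

# Toy usage with hypothetical data: two clusters of 2-dimensional vectors.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(10.0, 0.5, (30, 2))])
codebook = train_sofm(data, num_codes=4, epochs=5)
```

The returned weight matrix plays the role of the codebook; each row is one code vector.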
PERFORMANCE EVALUATION FACTORS
With lossy compression, the difference between the original and reconstructed
image results in some visible distortion, which may be measured in a number of
ways. Two objective quality measures are MSE (Mean Square Error) and PSNR
(Peak Signal to Noise Ratio):
\[
\mathrm{MSE} = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} (X_{ij} - Y_{ij})^2
\]
where $X$ is the original image and $Y$ is the retrieved image, both of size
$M \times N$. For 256 gray-level images, PSNR is given as:
\[
\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}
\]
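As a quick illustration, both measures can be computed with a few lines of Python/NumPy. The function names `mse` and `psnr` are illustrative, and the peak value of 255 assumes 8-bit (256 gray-level) images as stated above.

```python
import numpy as np

def mse(original, reconstructed):
    """Mean square error between two images of the same size."""
    x = np.asarray(original, dtype=np.float64)
    y = np.asarray(reconstructed, dtype=np.float64)
    return float(np.mean((x - y) ** 2))

def psnr(original, reconstructed, peak=255.0):
    """Peak signal to noise ratio in dB; infinite for identical images."""
    err = mse(original, reconstructed)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

# Toy usage (hypothetical data): every pixel of the reconstruction is off by 5.
original = np.zeros((4, 4), dtype=np.uint8)
reconstructed = np.full((4, 4), 5, dtype=np.uint8)
error = mse(original, reconstructed)
quality = psnr(original, reconstructed)
```

Converting to floating point before subtracting avoids wrap-around when the inputs are unsigned 8-bit arrays.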
Despite their popularity, PSNR and MSE are poor indicators of the subjective
quality of reconstruction (Al-Otum 2003). As a result, perceptually based criteria
may be more appropriate. In this procedure, a number of subjects view a
reconstructed image and rate it on a five-point scale: `bad', `poor', `fair', `good'
and `excellent'. The mean opinion score is simply the average rating assigned by
all the subjects (Mark et al. 2003).
Compression ratio is another performance evaluation factor often mentioned.
Usually, both compression ratio and distortion measure are quoted together.
Many existing standards have symmetric, complex encoders and decoders. In
a broadcast or database retrieval environment, where an image is
coded once and decoded many times, the complexity of the decoder is of great
importance. Medical images, once stored, are decoded many times for diagnosis,
discussion and future reference. For this reason, VQ was chosen for the
proposed method, in which the decoder is only a lookup table.
A common set of training images and common test images from outside the
training set would allow the performance of different algorithms to be
compared. Training images must be representative of the class of images for
which the network is to be used. In this study, the similarity of single organ,
single modality medical images was exploited.
PROPOSED ALGORITHM
Our work aims to store data from a large radiological lab or hospital for
future referral and record keeping. The data come from several image acquisition devices
with different resolutions. In addition, there are numerous commonly offered
radiological tests and image sizes. For different modalities and organs, different
sets of images were obtained. Each set has a unique resolution and size and can be
compressed by a different network (containing a different number of neurons) and
a different codebook size. We propose an automatically detected flag,
determined by the image size, which directs the system to a particular
compression architecture configuration. For codebook design, Kohonen's self
organizing feature map method (Kohonen 2001) is applied. Because it is an unsupervised
method, we cannot calculate the size of the codebooks prior to training. The
weight matrix thus obtained is in fact the required codebook. With prior
knowledge, we devise a net for the maximum image size, which can then be
presented with a flag. The flag decides how many neurons will take part
in the process of compression. For a diagnostic lab or a radiological department
of a hospital, the modalities processed there are known in advance.
If these are for example twenty, then a five-bit flag is enough to define the types.
Once the flags are designated and the neural nets are trained, the compressor can be
used as a real-time device.
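The flag width follows directly from the number of image types to be distinguished. A minimal Python sketch, in which the function name `flag_bits` is an illustrative assumption:

```python
def flag_bits(num_image_types):
    """Smallest number of bits that can label num_image_types distinct types."""
    if num_image_types < 1:
        raise ValueError("at least one image type is required")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2, computed in integers.
    return max(1, (num_image_types - 1).bit_length())

# Twenty types fit in five bits, since 2**5 = 32 >= 20.
bits_for_twenty = flag_bits(20)
```

With five bits, up to 32 image types can be labelled before a sixth bit is needed.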
All work is done in MATLAB (Gonzalez et al. 2004), so these nets are defined
with custom design tools. The indices thus acquired are sent for transmission and
storage. Once the net is trained and the codebook is ready, the codebook is transmitted
to the receiver; for any subsequent use it is assumed that the
receiver knows the codebook. The only overhead is the flag byte, which depends
on how many types of medical images are to be treated. The proposed algorithm
is shown in Figure 2. When an image is input, flag bits are set
using the image size and these act as a selector switch. On the decoder side, the flags
are used only to identify which of the codebooks is to be used for decoding. The
decoding process is only a lookup table.
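The selector role of the flag and the lookup-table decoder can be sketched as follows in Python/NumPy. Everything named here is a hypothetical placeholder: the size-to-flag table, the two random codebooks standing in for trained ones, and the functions `encode` and `decode`. The other features used to set the flag and the entropy coder are not modelled.

```python
import numpy as np

# Hypothetical mapping from image size to flag value (assumption for illustration).
SIZE_TO_FLAG = {(128, 128): 0, (512, 512): 1}

# Placeholder codebooks; in practice these would be the trained weight matrices.
_rng = np.random.default_rng(0)
CODEBOOKS = {
    0: _rng.random((32, 16)),   # 32 code vectors for 4x4 blocks
    1: _rng.random((64, 64)),   # 64 code vectors for 8x8 blocks
}

def encode(image_shape, block_vectors):
    """Set the flag from the image size, then index each block in that codebook."""
    flag = SIZE_TO_FLAG[image_shape]
    codebook = CODEBOOKS[flag]
    diffs = block_vectors[:, None, :] - codebook[None, :, :]
    indices = np.argmin(np.sum(diffs ** 2, axis=2), axis=1)
    return flag, indices

def decode(flag, indices):
    """Decoding is a table lookup in the codebook selected by the flag."""
    return CODEBOOKS[flag][indices]

# Toy usage: blocks that coincide with code vectors 3, 7 and 3 of codebook 0.
blocks = CODEBOOKS[0][[3, 7, 3]]
flag, indices = encode((128, 128), blocks)
restored = decode(flag, indices)
```

Only the flag and the indices need to be stored or transmitted once the receiver holds the codebooks.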
Fig.2: Proposed scheme encoder
The block "NET" in Figure 2 is further explained in Figure 3.
Fig.3: Neural net architecture for compression
$N_{in}$ is the number of neurons at the input, $N_{hidden}$ is the number of neurons in the
hidden layer and $N_{out}$ is the number of neurons at the output layer. $N_{hidden}$ is
the number which decides the compression ratio and codebook size. The hidden
layer consists of a competitive layer with no bias and a linear layer at the
output. The competitive layer learns to classify input vectors, producing a
codebook, and the linear layer transforms the competitive layer's classes into
target classifications defined by the user.
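The link between codebook size and compression ratio can be made concrete with the usual fixed-rate VQ estimate. This is a rough sketch under assumptions not stated in the paper: 8-bit pixels, fixed-length indices, and no entropy coder or flag overhead. The function name is illustrative.

```python
def fixed_rate_compression_ratio(block_side, codebook_size, bits_per_pixel=8):
    """Raw bits per block divided by the bits needed for one codebook index."""
    bits_per_block = block_side * block_side * bits_per_pixel
    # Fixed-length index: ceil(log2(codebook_size)) bits, at least one bit.
    bits_per_index = max(1, (codebook_size - 1).bit_length())
    return bits_per_block / bits_per_index

# A 256-entry codebook on 4x4 blocks versus 8x8 blocks.
ratio_4x4 = fixed_rate_compression_ratio(4, 256)
ratio_8x8 = fixed_rate_compression_ratio(8, 256)
```

Under these assumptions, larger blocks or smaller codebooks raise the ratio, and an entropy coder applied to the indices would raise it further.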
Before training, it is often useful to scale the inputs and targets so that they
always fall within a specified range. For scaling network inputs and targets, we
normalize the mean and standard deviation of the training set so that they have
zero mean and unity standard deviation.
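A minimal Python/NumPy sketch of this scaling is given below. It assumes that each vector element is normalized separately using training set statistics, which the text does not specify; the function names are illustrative.

```python
import numpy as np

def fit_standardizer(train):
    """Per-element mean and standard deviation of the training vectors."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    # Guard against constant elements, which would otherwise divide by zero.
    std = np.where(std == 0, 1.0, std)
    return mean, std

def standardize(data, mean, std):
    """Scale data to zero mean and unit standard deviation."""
    return (data - mean) / std

def unstandardize(data, mean, std):
    """Invert the scaling, e.g. after decoding."""
    return data * std + mean

# Toy usage with hypothetical training vectors.
train = np.array([[0.0, 10.0], [2.0, 30.0], [4.0, 50.0]])
mean, std = fit_standardizer(train)
scaled = standardize(train, mean, std)
```

The same `mean` and `std` would be reused for test data so that training and test vectors share one scale.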
EXPERIMENTS AND RESULTS
Most of the ideas presented in this work are confirmed by extensive
experimental simulations. A group of image samples known to both encoder
and decoder is designated as the training set. We have used seventeen types of
medical images as training data, with forty copies of each for training purposes,
including lung X-ray, cardiac angiogram, retinal image, hand X-ray, brain, foot
and arm MRI, brain, liver and pancreatic CT scans, and ultrasound images for
pregnancy, abdomen and kidney stones. Each image type has a different size, and
flags are designated from the size information of each type. For seventeen modalities we
require a five-bit flag. A block size of 4x4 is used for images smaller
than 256x256 and a block size of 8x8 for images larger than 256x256.
Images are zero padded so that every block has the same number of elements. Figure 4 shows
the training data, while Figure 5 shows the test images. For both figures all
images are resized and processed (Gonzalez & Woods 1993) to fit a specific size.
All images are divided into 4x4 blocks; each block is treated as a vector of 16
elements and preprocessed for training. As the test data are similar to the training
data, the quality of the reconstructed images is close to that of lossless compression.
PSNR and MSE are calculated as quantitative quality measures, and
subjective tests are also carried out. Five radiologists took part in the subjective
tests and their comments were positive. In Figure 6, standard JPEG and the
proposed scheme are compared in terms of PSNR vs. compression ratio. The graph is
plotted for average readings over the training images.
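The block division with zero padding can be sketched as follows in Python/NumPy. The function name `image_to_block_vectors` is an illustrative assumption, and padding is applied at the bottom and right edges, which the text does not specify.

```python
import numpy as np

def image_to_block_vectors(image, block=4):
    """Zero pad a 2-D image and return one row vector per block x block tile."""
    rows, cols = image.shape
    pad_rows = (-rows) % block          # rows needed to reach a multiple of block
    pad_cols = (-cols) % block
    padded = np.pad(image, ((0, pad_rows), (0, pad_cols)), mode="constant")
    height, width = padded.shape
    # Split into tiles, then flatten each tile into a vector of block*block elements.
    tiles = padded.reshape(height // block, block, width // block, block)
    return tiles.swapaxes(1, 2).reshape(-1, block * block)

# Toy usage: a 5x6 image is padded to 8x8 and yields four 16-element vectors.
image = np.arange(30).reshape(5, 6)
vectors = image_to_block_vectors(image, block=4)
```

Each row of `vectors` is one input vector for the net, ordered tile by tile from left to right and top to bottom.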
Fig.4: Training images
Fig.5: Test data images
Figure 6 shows that the performance of the proposed scheme is comparable to that of
JPEG for compression ratios up to 30, and that the scheme outperforms JPEG beyond
a compression ratio of 35, with 0.5-1 dB more PSNR. Experiments were
not carried out for compression ratios above 45, as the retrieved images for both
JPEG and NNVQ suffer perceptual loss that is unacceptable in the case of
medical images.
Fig.6: PSNR vs. Compression ratio
Table 1 shows some of the experimental results. From the results for the test
images we can see that the proposed method has approximately 0.5-1 dB
difference in PSNR for all images at a compression ratio of 40, and its compression
ratios are all better than JPEG for the same PSNR.
The results with similar test images are also of interest. The quality is `good'
for the same organs and `fair' for different organs. However, `fair' is not an
acceptable subjective quality measure for medical images. In Table 1, the last
three images (teeth X-ray, stomach ultrasound and X-ray for leg fracture) are
test data that are not similar to the training data. The results show that for
these images, the PSNR of the proposed method is lower than that of JPEG but the
compression ratio is higher.
Table.1: Comparison of JPEG and NNVQ

                      PSNR (dB) for            Compression ratio
                      compression ratio 40     for PSNR 30 dB
Test images           JPEG       NNVQ          JPEG       NNVQ
Retinal               28.47      30.19         40.33      44.44
X-ray hand            30.84      31.19         42.12      42.5
X-ray lung            29.69      30.35         39.98      40.13
MRI brain             30.52      32.56         40.04      42.32
CT liver              29.43      31.96         39.12      40.48
CT kidney             29.66      30.27         38.34      40.02
CT pancreas           27.44      28.67         38.56      41.34
Pregnancy             29.92      30.31         39.45      42.35
Angiogram             29.16      29.51         40.56      43.5
X-ray foot            30.02      30.88         41.32      43.35
MRI spine             28.45      29.97         39.96      40.78
X-ray arm             30.98      31.45         41.23      43.34
X-ray pelvis          28.56      30.43         40.54      44.67
Cardiac angio         31.02      32.23         38.88      40.23
X-ray teeth           31.33      30.44         37.33      40.38
Ultrasound stomach    29.65      28.97         38.65      40.65
X-ray fracture        29.90      28.68         39.32      40.12
Figure 7 shows reconstructed images of lung X-rays for standard JPEG and the
proposed scheme. These images are for subjective quality comparison, with the same
portion expanded. Figure 7(a) is the output of the proposed scheme, whereas Figure 7(b) is
the JPEG output. The block size for both images is 8x8 and the compression ratio is 40. Figure 7(a)
is comparatively smooth, while blockiness is more prominent in the JPEG output.
(a) NNVQ
(b) JPEG
Fig.7: Comparison of blocky artifacts
CONCLUSION AND FUTURE PROSPECTS
A scheme is presented for medical image compression using a dynamic neural
network architecture and adaptive codebooks. The scheme exploits the
capability of SOFM to generate good codebooks and the refining features of the
LVQ algorithm. The proposed scheme gives higher compression ratios for good
quality reconstructed images.
However, the method is limited by the need for prior knowledge of the medical image modalities to
be processed, and the test data should be similar to the training data. In cases
where the test image is unknown to the net, the results will not satisfy the legal
requirements of medical image diagnosis. To get the full advantage, the test data
must remain similar to the training data throughout. Compression also depends on the block size chosen for
VQ, the codebook size assigned, the number of training images, the number of epochs used
to train the NNVQ and the entropy coder used.
The results may be improved by increasing the number of training images
above the 40 copies used in the present study. In addition, the number of epochs
used to train the NNVQ was 2000. Increasing either parameter can
improve the results, but in either case the time taken to train an NNVQ will also
increase. We have used a Huffman coder as the last step to compress the NNVQ
indices. The Lempel-Ziv coder could be another choice. Large block sizes
and small codebooks enhance compression but degrade quality; therefore a
tradeoff must be observed for these parameters. Although the current results are
of good quality, combination with our previous work (Ashraf & Akbar
2005a,b) could make them lossless.
REFERENCES
Al-Otum, H.M. (2003). Qualitative and quantitative image quality assessment of vector quantization, JPEG and JPEG2000 compressed images. Journal of Electronic Imaging 12(3): 511-521.
Asari, V.K. (2005). Adaptive technique for image compression based on vector quantization using a self-organizing neural network. Journal of Electronic Imaging 14(2): 230-239.
Ashraf, R. & Akbar, M. (2005a). Diagnostically lossless compression of medical images. Presented at the International Conference on Biological and Medical Physics, Al-Ain, UAE.
Ashraf, R. & Akbar, M. (2005b). Absolutely lossless compression of medical images. Presented at the 27th Annual International Conference of the IEEE EMBS, Shanghai, China.
Ferguson, K.L. & Allinson, N.M. (2004). Efficient video compression codebooks using SOM-based vector quantization. Vision, Image and Signal Processing, IEE Proceedings 151(2): 102-108.
Gersho, A. & Gray, R.M. (1992). Vector Quantization and Signal Compression. Kluwer Academic, Boston, MA, USA.
Gonzalez, R.C., Woods, R.E. & Eddins, S.L. (2004). Digital Image Processing Using MATLAB. Pearson Prentice Hall, USA.
Gonzalez, R.C. & Woods, R.E. (1993). Digital Image Processing, 2nd Edition. Addison Wesley, USA.
Haykin, S. (2001). Neural Networks: A Comprehensive Foundation. 2nd Edition. Pearson Education, Germany.
ISO/IEC JTC1/SC29/WG1 WG1N1523 (1999). JPEG 2000 Part I Committee Draft Version 1.0.
Jiang, J. (1999). Image compression with neural networks - A survey. Signal Processing: Image Communication 14: 737-760.
Kohonen, T. (2001). Self-Organizing Maps. Third Extended Edition. Springer, Berlin, Heidelberg.
Mark, M. & Grgic, S. (2003). Picture quality measures in image compression systems. Eurocon 1: 233-236.
Linde, Y., Buzo, A. & Gray, R. (1980). An algorithm for vector quantizer design. IEEE Transactions on Communications COM-28: 84-95.
Laha, A., Pal, N.R. & Chanda, B. (2004). Design of vector quantizer for image compression using self organizing feature map and surface fitting. IEEE Transactions on Image Processing 13(10): 1291-1303.
Submitted : 27/11/2005
Revised : 27/3/2007
Accepted : 21/4/2007