Community: Developer Discussion Boards

#1 text compression - 2004-04-24, 11:05

Join Date: Jan 2004
Posts: 49
Location: Poland
Registered User
I've heard that character data can be encoded with fewer bits than ANSI's 8.
Are there any J2SE API classes that help convert a limited set of symbols (just letters) to e.g. 5- or 6-bit encoded data and, most importantly, is there a way to correctly READ such encoded data in J2ME?
I am planning to squeeze a lot of text into a jar and am looking for ways to compress it. Has anyone already done it this way?

#2 2004-04-24, 14:22

Join Date: Aug 2003
Posts: 232
Location: uk
alex_crowther
Regular Contributor
J2ME doesn't directly support this; you could, however,
write the necessary code yourself.

Doing so will mean writing a custom compressor
(in any language; C is probably easiest), and a J2ME
decompressor to turn the byte stream back into
ANSI text (the latter step being necessary because
the J2ME methods expect 8-bit text).

It's worth noting that text files (like all files) in a .jar
are compressed anyway, because .jars, like zip files,
are a compressed format, so the savings will not be as
big as you might imagine.

You will get the best compression if you limit the
ASCII characters you use. For example, if you choose just the
letters a-z, space and full stop, you have only 28
characters.

Powers of two go (bits -> distinct values):
0->1
1->2
2->4
3->8
4->16
5->32
6->64
7->128
8->256

Hence the smallest number of bits that will hold your 28
characters is 5: 5 bits => 32 different numbers (which
leaves you 32-28=4 spare if you want to add 4 other
characters to your supported list).
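
The lookup in the table above can also be done in code. This is only a small sketch; the class and method names (BitDepth, bitsNeeded) are made up for illustration and are not part of any API.

```java
final class BitDepth {
    // Smallest number of bits whose 2^bits distinct values
    // cover symbolCount symbols (hypothetical helper).
    static int bitsNeeded(int symbolCount) {
        int bits = 0;
        while ((1 << bits) < symbolCount) {
            bits++;
        }
        return bits;
    }

    public static void main(String[] args) {
        int symbols = 28; // a-z, space, full stop
        int bits = bitsNeeded(symbols);
        int spare = (1 << bits) - symbols;
        System.out.println(bits + " bits, " + spare + " spare codes");
    }
}
```

For 28 symbols this gives 5 bits with 4 spare codes, so before the jar's own compression is applied the packed text takes 5/8 of the size of the 8-bit original.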

Now we have to convert the input text into its 5-bit equivalent
and then convert that back as we load it in the midlet.
Bear in mind at this point that our data might be in the
form 5:5:5:5:... etc., but the available APIs in most languages
only deal with bytes, i.e. 8:8:8:8..., so you
will have to use bitwise operators to merge the 5-bit chunks
into 8-bit chunks.

e.g. if 1, 2, 3 are the 5-bit chunks, like this:
[1][1][1][1][1] [2][2][2][2][2] [3][3][3][3][3] .. etc

They must be converted into 8-bit chunks like this:
[1][1][1][1][1][2][2][2] [2][2][3][3][3][3][3][4] .. etc
Note: the first byte is a mixture of 1 and 2, the second
a mixture of 2, 3 and 4, etc.
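
The merging shown in the diagram might be sketched like this in Java. It is one possible way to do it, not the only one: the names FiveBitPacker and pack are invented here, and the choices to write bits most-significant-bit first and to zero-pad the last byte are assumptions of the sketch.

```java
final class FiveBitPacker {
    static final int BITS = 5; // bit depth chosen in the post

    // Packs values (each expected to be 0..31) into bytes,
    // most significant bit first (an assumption of this sketch).
    static byte[] pack(int[] values) {
        byte[] out = new byte[(values.length * BITS + 7) / 8];
        int acc = 0;     // accumulator holding bits not yet written
        int accBits = 0; // how many bits in acc are valid
        int pos = 0;     // next output index
        for (int i = 0; i < values.length; i++) {
            // Shift the pending bits up and OR the new 5-bit chunk in.
            acc = (acc << BITS) | (values[i] & 0x1F);
            accBits += BITS;
            // Emit a byte whenever 8 or more bits are pending.
            while (accBits >= 8) {
                accBits -= 8;
                out[pos++] = (byte) (acc >> accBits);
                acc &= (1 << accBits) - 1; // keep only the unwritten bits
            }
        }
        if (accBits > 0) {
            // Left-align the leftover bits; the rest of the byte is zero padding.
            out[pos] = (byte) (acc << (8 - accBits));
        }
        return out;
    }
}
```

With this bit order, pack(new int[] {1, 2, 3}) produces the two bytes 0x08 and 0x86: 00001 00010 00011 regrouped as 00001000 10000110, the final 0 being padding.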

Pieces of code you need to write:
(1) choose a mapping and bit depth:
0-25 => a-z
26 => space
27 => full stop

(2) write something to convert regular ASCII text into
this mapping, e.g. an array of numbers (in this case from
0-27).

(3) assuming all your numbers now fit in your
bit depth (in this case 5 bits, i.e. values below 32), shift them
into place and bitwise OR them together
to make your 8-bit stream (which contains the 5-bit chunks).

This is the raw data you put in your jar file.

(4) write some J2ME code that will expand your 5bit chunks back into 8bit characters.
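
As a rough sketch of steps (1), (2) and (4), something like the following could work. The names FiveBitText, toCodes and decode are invented for illustration. The decoder assumes the bytes were packed most-significant-bit first, and it assumes the character count is stored or known separately, because the padding in the last byte can otherwise look like an extra character. It sticks to String and StringBuffer so that it should fit the CLDC class library, but it has not been tried on a device.

```java
final class FiveBitText {
    static final int BITS = 5;
    // Step (1): 0-25 => a-z, 26 => space, 27 => full stop.
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz .";

    // Step (2): convert text into an array of codes in the range 0-27.
    static int[] toCodes(String text) {
        int[] codes = new int[text.length()];
        for (int i = 0; i < codes.length; i++) {
            int code = ALPHABET.indexOf(text.charAt(i));
            if (code < 0) {
                throw new IllegalArgumentException(
                        "unsupported character: " + text.charAt(i));
            }
            codes[i] = code;
        }
        return codes;
    }

    // Step (4): expand packed 5-bit chunks back into characters.
    // charCount is the number of characters that were packed.
    // The spare codes 28-31 are not handled in this sketch.
    static String decode(byte[] packed, int charCount) {
        StringBuffer sb = new StringBuffer(charCount);
        int acc = 0;     // bits read from the input but not yet consumed
        int accBits = 0; // how many bits in acc are valid
        int pos = 0;     // next input index
        for (int i = 0; i < charCount; i++) {
            // Pull in another byte when fewer than 5 bits are pending.
            while (accBits < BITS) {
                acc = (acc << 8) | (packed[pos++] & 0xFF);
                accBits += 8;
            }
            accBits -= BITS;
            int code = (acc >> accBits) & 0x1F; // top 5 pending bits
            acc &= (1 << accBits) - 1;          // drop the consumed bits
            sb.append(ALPHABET.charAt(code));
        }
        return sb.toString();
    }
}
```

For example, the two bytes 0x08 and 0x86 with a character count of 3 hold the codes 1, 2, 3 and decode to "bcd".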

Magic!

To do all this, you are going to have to be fairly competent
with bitwise operators, e.g. OR, AND, NOT, shifts etc.
(note: bitwise, not boolean).

Hope that helps.

Alex
Last edited by alex_crowther : 2004-04-24 at 14:29.

#3 2004-04-26, 10:13

Join Date: Jul 2003
Posts: 1,094
Location: Finland, Tampere
doctordwarf
Super Contributor
JARs already use fairly good ZIP compression.
Of course you can still add more compression by storing fewer bits of actual data, etc.
However, I've got the impression that you just want to compress text data. Wouldn't that just be a reimplementation of the ZIP algorithms already applied?

#4 2004-04-26, 14:04

Join Date: Jan 2004
Posts: 49
Location: Poland
Registered User
I guess you're right - my intention was to shrink the text file size by e.g. coding it in a fewer-bit format and then compressing it in the jar, but if the jar compression algorithm uses similar mechanisms, the effect would not be big. But you, doctordwarf, suggest that
>>>"you can still add more compression, by storing less bits of actual data"
If zip compression does its job in ways other than fewer-bit coding (does it?), then using the method alex_crowther explained makes sense. Alex - thanks for your thorough effort.
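
Whether 5-bit packing still pays off after jar compression is easiest to settle by measuring. ZIP's DEFLATE method combines repeated-string matching with Huffman coding, and Huffman coding already gives frequent characters short codes, so part of the gain may overlap. Packing may also hide byte-aligned repeats from the matcher. The J2SE-side sketch below compares sizes using java.util.zip.Deflater. The class name DeflateCompare, its helpers and the sample text are placeholders, and Deflater's default output carries a small zlib wrapper, so the numbers are only close to what a jar tool would store.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class DeflateCompare {
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz .";

    // Packs text into 5-bit codes, most significant bit first.
    static byte[] pack5(String text) {
        byte[] out = new byte[(text.length() * 5 + 7) / 8];
        int acc = 0, accBits = 0, pos = 0;
        for (int i = 0; i < text.length(); i++) {
            int code = ALPHABET.indexOf(text.charAt(i));
            if (code < 0) {
                throw new IllegalArgumentException("unsupported character");
            }
            acc = (acc << 5) | code;
            accBits += 5;
            while (accBits >= 8) {
                accBits -= 8;
                out[pos++] = (byte) (acc >> accBits);
                acc &= (1 << accBits) - 1;
            }
        }
        if (accBits > 0) {
            out[pos] = (byte) (acc << (8 - accBits));
        }
        return out;
    }

    // Number of bytes DEFLATE (zlib-wrapped) produces for the input.
    static int deflatedSize(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        byte[] buf = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        // Placeholder text; substitute the real text you plan to ship.
        String text = "replace this sample with the text you want to put in the jar.";
        byte[] raw = text.getBytes(StandardCharsets.US_ASCII);
        byte[] packed = pack5(text);
        System.out.println("raw:              " + raw.length);
        System.out.println("raw, deflated:    " + deflatedSize(raw));
        System.out.println("packed:           " + packed.length);
        System.out.println("packed, deflated: " + deflatedSize(packed));
    }
}
```

The result depends heavily on the text, so it is worth running this on the real data before committing to a custom format.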