|
I've heard that character data can be encoded with fewer bits than ANSI's 8.
Are there any J2SE API classes that help convert a limited set of symbols (just letters) to e.g. 5- or 6-bit encoded data and, most important, is there a way to correctly READ such encoded data in J2ME? I am planning to squeeze lots of text into a jar and am looking for ways to compress it. Has anyone already done it this way? |
|
J2ME doesn't directly support this; you could, however, write the necessary code yourself. Doing so means writing a custom compressor (any language, C is probably easiest) and a J2ME decompressor to turn the byte stream back into ANSI text (the latter step is necessary because the J2ME methods expect 8-bit text).

It's worth noting that text files (like all files) in a .jar are compressed anyway, because .jars, like zip files, are a compressed format, so the savings will not be as big as you might imagine. Anyway.

You will get the best compression if you limit the ASCII characters you use. For example, if you choose just the letters a-z, space and full stop, you have only 28 characters. Powers of two go:

0->1 1->2 2->4 3->8 4->16 5->32 6->64 7->128 8->256

Hence the smallest number of bits that will hold your 28 characters is 5: 5 bits => 32 different numbers (which leaves you 32-28=4 spare if you want to add 4 other characters to your supported list).

Now we have to convert the input text into its 5-bit equivalent and then convert it back as we load it in the midlet. Bear in mind at this point that our data might be in the form 5:5:5:5:... but the available APIs in most languages only deal with bytes, i.e. 8:8:8:8..., so you will have to use bitwise operators to merge the 5-bit chunks into 8-bit chunks.

E.g. if 1, 2, 3 are the 5-bit chunks, like this:

[1][1][1][1][1] [2][2][2][2][2] [3][3][3][3][3] .. etc

they must be converted into 8-bit chunks like this:

[1][1][1][1][1][2][2][2] [2][2][3][3][3][3][3][4] .. etc

Note: the first byte is a mixture of 1 and 2, the second a mixture of 2, 3 and 4, etc.

Pieces of code you need to write:

(1) Choose a mapping and bit depth: 0-25 => a-z, 26 => space, 27 => full stop.
(2) Write something to convert regular ASCII text into this mapping, e.g. an array of numbers (in this case from 0-27).
(3) Assuming all your numbers now fit in your bit depth (in this case 5 bits, i.e. every number is less than 32), shift each one into position and bitwise OR them together to make your 8-bit stream (which contains the 5-bit chunks). This is the raw data you put in your jar file.
(4) Write some J2ME code that will expand your 5-bit chunks back into 8-bit characters.

Magic! To do all this, you are going to have to be fairly competent with bitwise operators, e.g. OR, AND, NOT and shifts (note: bitwise, not boolean).

Hope that helps.

Alex
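
The four steps above could be sketched roughly as follows. This is only an illustration under some assumptions that are not in the post: the class name FiveBitCodec, the method names pack and unpack, and the most-significant-bit-first layout are all made up for the example, and the number of characters is assumed to be stored separately (otherwise the padding bits in the last byte could decode as an extra 'a'). The unpack side sticks to basic String and array operations so that it has a chance of running on a small J2ME profile.

```java
final class FiveBitCodec {
    // Hypothetical mapping: the index in this string is the 5-bit code
    // (0-25 => a-z, 26 => space, 27 => full stop).
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz .";
    private static final int BITS = 5;

    // Build-time side: packs text into a byte stream of 5-bit chunks,
    // most significant bit first. Unused bits in the last byte stay 0.
    public static byte[] pack(String text) {
        byte[] out = new byte[(text.length() * BITS + 7) / 8];
        int bitPos = 0; // index of the next bit to write in the stream
        for (int i = 0; i < text.length(); i++) {
            int code = ALPHABET.indexOf(text.charAt(i));
            if (code < 0) {
                throw new IllegalArgumentException(
                        "unsupported character: " + text.charAt(i));
            }
            for (int b = BITS - 1; b >= 0; b--) {
                if (((code >> b) & 1) != 0) {
                    // OR the bit into its place inside the current byte
                    out[bitPos >> 3] |= 0x80 >> (bitPos & 7);
                }
                bitPos++;
            }
        }
        return out;
    }

    // Midlet side: expands charCount 5-bit chunks back into characters.
    // Codes 28-31 are unused here and would raise an index exception.
    public static String unpack(byte[] data, int charCount) {
        char[] chars = new char[charCount];
        int bitPos = 0; // index of the next bit to read from the stream
        for (int i = 0; i < charCount; i++) {
            int code = 0;
            for (int b = 0; b < BITS; b++) {
                int bit = (data[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
                code = (code << 1) | bit;
                bitPos++;
            }
            chars[i] = ALPHABET.charAt(code);
        }
        return new String(chars);
    }
}
```

For instance, FiveBitCodec.pack("hello world.") turns 12 characters (60 bits) into 8 bytes, and FiveBitCodec.unpack(packed, 12) gives the original text back.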
Last edited by alex_crowther : 2004-04-24 at 14:29.
|
| alex_crowther |
|
JARs already use fairly good ZIP compression. Of course you can still add more compression, by storing less bits of actual data, etc. However, I get the impression that you just want to compress text data. Wouldn't that just be a reimplementation of the ZIP algorithms already applied? |
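
On the question of whether this would just repeat what ZIP already does: DEFLATE, the method ZIP and JAR files normally use, combines back-references to repeated strings with Huffman coding, and Huffman coding already gives frequent byte values shorter bit codes. So fixed 5-bit packing overlaps with the jar's own compression, and packed data can even deflate less well because the chunks no longer line up on byte boundaries. The safest approach is to measure. Below is a rough build-time sketch using java.util.zip.Deflater from J2SE; the class name DeflateSizeCheck and the method name deflatedSize are invented for the example, and the result is only an approximation of what the jar tool would produce.

```java
import java.util.zip.Deflater;

final class DeflateSizeCheck {
    // Returns how many bytes DEFLATE produces for the given input.
    // Runs on the desktop at build time, not inside the midlet.
    public static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Keep draining compressed output until the stream is complete
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end(); // release the native zlib resources
        return total;
    }
}
```

Calling deflatedSize once on the plain text bytes and once on the bit-packed bytes shows which version would end up smaller after jar compression.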
|
I guess you're right. My intention was to shrink the text file by coding it in a lower-bit format and then letting the jar compress it, but if the jar compression algorithm uses similar mechanisms, the effect would not be big. But you, doctordwarf, suggest that
>>>"you can still add more compression, by storing less bits of actual data" If ZIP compression does its job by means other than lower-bit coding (does it?), then using the method alex_crowther explained makes sense. Alex, thanks for your thorough effort. |