[PATCH] Bug in label encoding - mkgmap-dev - The mkgmap lists


Ronny Klier

7 Feb 2010

11:47 p.m.

I think there is a bug in label encoding in Format6Encoder. For some string lengths the last encoded byte is not stored.

E.g. for the string "10007" the encoded byte buffer looks like this:

[0] [0x86]
[1] [0x8]
[2] [0x20]
[3] [0x9f]
[4] [0xf0]

The number of stored bytes is 4, so the 0xf0 will not show up in the final image file.
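The byte buffer in this report can be reproduced with a small stand-alone sketch. It is not the mkgmap implementation: the class and method names are hypothetical, and the 6-bit code table (space as 0x00, 'A'..'Z' as 0x01..0x1a, '0'..'9' as 0x20..0x29) is an assumption that happens to match the bytes listed above. Codes are packed most significant bit first, followed by the 0x3f end-of-string code.

```java
class Format6Sketch {
    // Assumed 6-bit code table (not taken from the mkgmap source):
    // space -> 0x00, 'A'..'Z' -> 0x01..0x1a, '0'..'9' -> 0x20..0x29.
    static int code6(char c) {
        if (c == ' ') return 0x00;
        if (c >= 'A' && c <= 'Z') return c - 'A' + 1;
        if (c >= '0' && c <= '9') return c - '0' + 0x20;
        throw new IllegalArgumentException("unsupported character: " + c);
    }

    // Packs all characters plus a full 6-bit 0x3f terminator, MSB first,
    // and returns every byte that received at least one bit.
    static byte[] packAll(String s) {
        int symbols = s.length() + 1;                 // characters plus terminator
        byte[] out = new byte[(symbols * 6 + 7) / 8]; // round up to whole bytes
        int bitPos = 0;
        for (int i = 0; i < symbols; i++) {
            int code = (i < s.length()) ? code6(s.charAt(i)) : 0x3f;
            for (int b = 5; b >= 0; b--, bitPos++) {
                if (((code >> b) & 1) != 0) {
                    out[bitPos / 8] |= 0x80 >> (bitPos % 8);
                }
            }
        }
        return out;
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(packAll("10007"))); // prints "86 08 20 9f f0"
    }
}
```

Five characters plus the terminator occupy 36 bits, so the fifth byte carries only the last four bits of the terminator.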

Attachments:

Format6Encoder.patch (text/plain — 361 bytes)


Marko Mäkelä

8 Feb 2010

7:15 a.m.

On Mon, Feb 08, 2010 at 12:47:50AM +0100, Ronny Klier wrote:

> I think there is a bug in label encoding in Format6Encoder. For some string lengths the last encoded byte is not stored.
>
> E.g. for the string "10007" the encoded byte buffer looks like this:
>
> [0] [0x86] [1] [0x8] [2] [0x20] [3] [0x9f] [4] [0xf0]
>
> The number of stored bytes is 4, so the 0xf0 will not show up in the final image file.
>
> Index: Format6Encoder.java
> ===================================================================
> --- Format6Encoder.java (Revision 1541)
> +++ Format6Encoder.java (Arbeitskopie)
> @@ -86,7 +86,7 @@
>
>   buf = put6(buf, off++, 0xff);
>
> - int len = ((off - 1) * 6) / 8 + 1;
> + int len = (int)Math.ceil((off * 6) / 8.0);

You can do this with integer math, using truncating division. Your example was off=6 (5 chars and the end-of-string code), and I suppose we would get len=4 instead of 5:

(6-1)*6 / 8 + 1 = 30/8 + 1 = 3 + 1 = 4    (30/8 = 3.75 truncates to 3)

If you want to round up to full blocks, the normal trick is to add divisor-1 before dividing, like this:

int len = ((off - 1) * 6 + 7) / 8 + 1 = 37/8 + 1 = 4 + 1 = 5    (37/8 = 4.625 truncates to 4)

I don't know if the off-1 and the +1 are correct. An integer version of your formula would also work in this case:

int len = (off * 6 + 7) / 8 = 43/8 = 5    (5.375 truncates to 5)

This formula is clear to me: it converts the "off" 6-bit chars (including the end-of-string code) to the number of required 8-bit octets.

Best regards,
Marko
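The three length formulas discussed so far can be compared side by side. This is only an illustrative sketch: the class and method names are made up here, and `off` is taken to be the number of 6-bit codes written, including the end-of-string code.

```java
class LabelLength {
    // Formula currently in Format6Encoder, per the patch context.
    static int current(int off) {
        return ((off - 1) * 6) / 8 + 1;
    }

    // Formula proposed in the patch, using floating point.
    static int patched(int off) {
        return (int) Math.ceil((off * 6) / 8.0);
    }

    // Integer-only equivalent: add divisor-1 before the truncating division.
    static int ceilDiv(int off) {
        return (off * 6 + 7) / 8;
    }

    public static void main(String[] args) {
        for (int off = 1; off <= 8; off++) {
            System.out.println("off=" + off
                    + " current=" + current(off)
                    + " patched=" + patched(off)
                    + " ceilDiv=" + ceilDiv(off));
        }
    }
}
```

For off=6 this gives 4 for the current formula and 5 for both rounded-up variants, matching the arithmetic above.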


Toby Speight

10:04 a.m.

0> In article <20100208071528.GA11669@x60s>,
0> Marko Mäkelä <URL:mailto:marko.makela@iki.fi> ("Marko") wrote:

Marko> An integer version of your formula would also work in this case:
Marko>
Marko> int len = (off * 6 + 7) / 8 = 43/8 = 5.375 = 5
Marko>
Marko> This formula is clear to me: it will clearly convert the "off"
Marko> 6-bit chars (including the end-of-string code) to the number of
Marko> required 8-bit octets.

I'm with Marko here - this integer version is both computationally efficient and clear in its intent. That's the standard idiom for rounded-up division.


Steve Ratcliffe

9:46 a.m.

On 07/02/10 23:47, Ronny Klier wrote:

> I think there is a bug in label encoding in Format6Encoder. For some string lengths the last encoded byte is not stored.
>
> E.g. for the string "10007" the encoded byte buffer looks like this:
>
> [0] [0x86] [1] [0x8] [2] [0x20] [3] [0x9f] [4] [0xf0]
>
> The number of stored bytes is 4, so the 0xf0 will not show up in the final image file.

I believe the code is correct and the 0xf0 is not required and omitted on purpose.

A 5-character string where each character is encoded as 6 bits requires 30 bits of storage, which is 3.75 bytes. There is then an end-of-string marker, which is written as the 6-bit value 0x3f. However, the end-of-string marker is actually variable length, and as long as the first two bits are written you can drop the rest. So the whole thing fits into 4 bytes, which is what is written.

The code is this way because originally I did not realise that the string terminator was effectively a two-bit quantity, i.e. if the first two bits are both 1 then the string ends and you stop reading and throw away any remaining part of the byte. You could probably write 2 bits and then round the byte count up instead of writing 6 and rounding down.

..Steve
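The two-bit terminator rule can be illustrated with a small reader. This is a sketch under assumptions, not mkgmap code: the names are hypothetical, the code table (space as 0x00, 'A'..'Z' as 0x01..0x1a, '0'..'9' as 0x20..0x29) is assumed, and any 6-bit code whose first two bits are both 1 is treated as the end of the string, as described above.

```java
class Format6DecodeSketch {
    // Reads `count` bits starting at bit index `start`, MSB first.
    static int bits(byte[] data, int start, int count) {
        int v = 0;
        for (int i = start; i < start + count; i++) {
            v = (v << 1) | ((data[i / 8] >> (7 - i % 8)) & 1);
        }
        return v;
    }

    // Inverse of the assumed code table; other codes are not handled here.
    static char char6(int code) {
        if (code == 0x00) return ' ';
        if (code >= 0x01 && code <= 0x1a) return (char) ('A' + code - 1);
        if (code >= 0x20 && code <= 0x29) return (char) ('0' + code - 0x20);
        throw new IllegalArgumentException("unhandled code: " + code);
    }

    // Decodes until a code starts with two 1 bits; later bits are ignored.
    static String decode(byte[] data) {
        StringBuilder sb = new StringBuilder();
        int totalBits = data.length * 8;
        int bitPos = 0;
        while (bitPos + 2 <= totalBits) {
            if (bits(data, bitPos, 2) == 3) {
                return sb.toString();      // terminator: stop reading
            }
            if (bitPos + 6 > totalBits) break;
            sb.append(char6(bits(data, bitPos, 6)));
            bitPos += 6;
        }
        throw new IllegalArgumentException("no terminator found");
    }

    // Bytes needed for n characters plus a 2-bit terminator, rounded up.
    static int storedBytes(int n) {
        return (n * 6 + 2 + 7) / 8;
    }

    public static void main(String[] args) {
        byte[] four = {(byte) 0x86, 0x08, 0x20, (byte) 0x9f};
        System.out.println(decode(four));   // prints "10007"
        System.out.println(storedBytes(5)); // prints 4
    }
}
```

With this reading rule the four stored bytes already decode to "10007": the last two bits of 0x9f are the terminator, so the trailing 0xf0 carries no information.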


Ronny Klier

8:52 p.m.

On 08.02.2010 10:46, Steve Ratcliffe wrote:

> On 07/02/10 23:47, Ronny Klier wrote:
>
> ...
>> I think there is a bug in label encoding in Format6Encoder. For some string lengths the last encoded byte is not stored.
>>
>> E.g. for the string "10007" the encoded byte buffer looks like this:
>>
>> [0] [0x86] [1] [0x8] [2] [0x20] [3] [0x9f] [4] [0xf0]
>>
>> The number of stored bytes is 4, so the 0xf0 will not show up in the final image file.
>
> I believe the code is correct and the 0xf0 is not required and omitted on purpose.
>
> A 5-character string where each character is encoded as 6 bits requires 30 bits of storage, which is 3.75 bytes. There is then an end-of-string marker, which is written as the 6-bit value 0x3f. However, the end-of-string marker is actually variable length, and as long as the first two bits are written you can drop the rest. So the whole thing fits into 4 bytes, which is what is written.
>
> The code is this way because originally I did not realise that the string terminator was effectively a two-bit quantity, i.e. if the first two bits are both 1 then the string ends and you stop reading and throw away any remaining part of the byte. You could probably write 2 bits and then round the byte count up instead of writing 6 and rounding down.
>
> ..Steve

OK, I got this wrong. I thought the label section could be read continuously, with every label ending in 0x3f and the next label starting at the next byte boundary.

