Bug? wcsxfrm causing memory corruption




Greetings--

While fixing the Python test suite on Cygwin I ran across one test
that consistently caused segfaults later in the run, not directly
local to that test.  The test involves wcsxfrm, so that's where I
focused my attention.

The attached test demonstrates the bug.  Given an output buffer of N
wide characters, wcsxfrm causes bytes beyond the end of the
destination buffer to be byte-swapped.  I believe it might actually be
a bug in the underlying LCMapStringW workhorse (this is on Windows 10;
I have not tested other versions).

According to its docs [1], the cchDest argument (size of the
destination buffer) is treated as a *byte* count when using
LCMAP_SORTKEY.  However, when applying the LCMAP_BYTEREV
transformation it seems to treat the output size (in bytes) as a
character count.  So in the example I give, where the output sort key
is 7 bytes (including the null terminator), it swaps *14* bytes: the
7 bytes of the sort key plus the next 7 adjacent bytes.  This is
obviously a problem if the destination buffer is allocated out of some
larger memory pool.
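
To make the arithmetic concrete, here is a small portable model of the
suspected behavior.  It is not Windows code and does not claim to show
how LCMapStringW is implemented; model_byterev and model_bytes_touched
are hypothetical names used only for illustration.  The model swaps
the two bytes of each 16-bit unit, and the point is what happens when
a byte count is passed where a unit count is expected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, NOT Windows code: swap the two bytes of each of
 * `units` consecutive 16-bit units starting at buf. */
static void model_byterev(uint8_t *buf, size_t units) {
    assert(buf != NULL || units == 0);
    for (size_t i = 0; i < units; i++) {
        uint8_t tmp = buf[2 * i];
        buf[2 * i] = buf[2 * i + 1];
        buf[2 * i + 1] = tmp;
    }
}

/* Number of bytes the model touches for a given unit count. */
static size_t model_bytes_touched(size_t units) {
    return 2 * units;
}
```

If the 7-byte sort key length is passed as the unit count, the model
touches 14 bytes, which matches the overrun described above.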

This definitely has to be a bug, right?  Or at least very poorly
documented on MS's part.  A workaround would be either to drop
LCMAP_BYTEREV and swap the bytes manually, or to do the byte swap in a
second call to LCMapStringW with LCMAP_BYTEREV and the correct
character count...
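
A minimal sketch of the manual-swap workaround, assuming the sort key
was first produced with LCMAP_SORTKEY only and that its length in
bytes is known (for example from the LCMapStringW return value).
swap_key_bytes is a hypothetical helper, not part of Cygwin or the
Windows API, and leaving a trailing odd byte unswapped is an
assumption of this sketch rather than documented LCMAP_BYTEREV
behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap adjacent bytes in place, never touching
 * anything at or beyond key + key_len.  A trailing odd byte is left as
 * is (an assumption; the right handling depends on the caller). */
static void swap_key_bytes(uint8_t *key, size_t key_len) {
    assert(key != NULL || key_len == 0);
    for (size_t i = 0; i + 1 < key_len; i += 2) {
        uint8_t tmp = key[i];
        key[i] = key[i + 1];
        key[i + 1] = tmp;
    }
}
```

Because the loop is bounded by the key length in bytes, memory after
the key stays untouched even when the key sits inside a larger
allocation.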

Thanks,
Erik


[1] https://msdn.microsoft.com/en-us/library/windows/desktop/dd318700(v=vs.85).aspx

#include <stdint.h>   /* uint8_t */
#include <stdlib.h>
#include <stdio.h>
#include <locale.h>
#include <wchar.h>
#include <string.h>
#include <windows.h>

#define SIZE 32


void fill_bytes(uint8_t *a, int n) {
    int idx;
    for (idx=0; idx<n; idx++) {
        a[idx] = idx;
    }
}


void print_bytes(uint8_t *a, int n) {
    int idx;
    for (idx=0; idx<n; idx++) {
        printf("0x%02x ", ((uint8_t*)a)[idx]);
        if ((idx + 1) % 8 == 0) printf("\n");
    }
}

int main(void) {
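    wchar_t *a, *b;
    uint8_t *aa;
    size_t ret;
    LCID collate_lcid;
    collate_lcid = 1033;    /* en-US */
    b = L"b";
    /* SIZE is a byte count, not a wide character count */
    a = (wchar_t*) malloc(SIZE);
    if (a == NULL) return 1;
    aa = (uint8_t*) a;      /* byte view of the same buffer */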

    setlocale(LC_ALL, "en_US.UTF-8");

    printf("using wcsxfrm:\n");
    fill_bytes(aa, SIZE);
    printf("before:\n");
    print_bytes(aa, SIZE);
    /* destination is 4 wide characters */
    ret = wcsxfrm(a, b, 4);
    printf("after (%zu):\n", ret);
    print_bytes(aa, SIZE);

    printf("\nusing LCMapStringW directly:\n");
    fill_bytes(aa, SIZE);
    printf("before:\n");
    print_bytes(aa, SIZE);

    /* with LCMAP_SORTKEY the destination size (8) is a byte count */
    ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY | LCMAP_BYTEREV, b, -1, a, 8);
    printf("after (%d):\n", (int) ret);
    print_bytes(aa, SIZE);

    printf("\nwithout LCMAP_BYTEREV:\n");
    fill_bytes(aa, SIZE);
    printf("before:\n");
    print_bytes(aa, SIZE);

    /* same call without the byte swap, for comparison */
    ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY, b, -1, a, 8);
    printf("after (%d):\n", (int) ret);
    print_bytes(aa, SIZE);
    free(a);
    
    return 0;
}