| Subject: | Re: Trouble with character sets |
| To: | cygwin AT cygwin DOT com |
| From: | Brian Inglis <Brian DOT Inglis AT SystematicSw DOT ab DOT ca> |
| Organization: | Systematic Software |
| Message-ID: | <ae1f8133-948a-4497-049b-b8349a138143@SystematicSw.ab.ca> |
| Date: | Mon, 3 Aug 2020 10:31:15 -0600 |
On 2020-08-03 09:36, Michael Shay via Cygwin wrote:
> I'm having a problem with Cygwin 3.1.4, changing the character set on the
> fly. It seems to work with Cygwin applications, but not with Win32
> applications.
> I have a Korn shell script:
> #!/bin/ksh
> OLD_LANG="$LANG"
> OLD_LC_ALL="$LC_ALL"
> echo "locale on entry"
> locale
> echo ""
> export LANG="en_US.CP1252"
> export LC_ALL=en_US.CP1252
> echo "locale changed to"
> locale
> echo ""
> # Default is to run the Win32 program. Input any argument other than 'WIN32'
> # to run '/bin/echo'.
> case $# in
>   0 ) echo "Running WIN32 pgm"
>       ksh -c 'cygtest.exe ZÇ'
>       ;;
>   1 ) echo "Running Cygwin 'echo'"
>       ksh -c '/bin/echo ZÇ'
>       ;;
>   2 ) echo "Running WIN32 pgm"
>       ksh -c 'cygtest.exe ZÇ'
>       echo ""
>       echo "Running Cygwin 'echo'"
>       ksh -c '/bin/echo ZÇ'
>       ;;
>   * ) ;;
> esac
> LC_ALL="$OLD_LC_ALL"
> LANG="$OLD_LANG"
> and a Win32 application (attached file cygtest.cpp)
> I used gdb to see what was happening in child_info_spawn::worker(), when a
> Win32 program is started using:
> rc = CreateProcessW (runpath,         /* image name w/ full path */
>                      cmd.wcs (wcmd),  /* what was passed to exec */
>                      sa,              /* process security attrs */
>                      sa,              /* thread security attrs */
>                      TRUE,            /* inherit handles */
>                      c_flags,
>                      envblock,        /* environment */
>                      NULL,
>                      &si,
>                      &pi);
> Specifically, 'cmd.wcs(wcmd)' invokes:
> wchar_t *wcs (wchar_t *wbuf, size_t n)
> {
>   if (n == 1)
>     wbuf[0] = L'\0';
>   else
>     sys_mbstowcs (wbuf, n, buf);
>   return wbuf;
> }
> and sys_mbstowcs():
> size_t __reg3
> sys_mbstowcs (wchar_t * dst, size_t dlen, const char *src, size_t nms)
> {
>   mbtowc_p f_mbtowc = __MBTOWC;
>   if (f_mbtowc == __ascii_mbtowc)
>     {
>       f_mbtowc = __utf8_mbtowc;  <<<<< this is ALWAYS done, no matter what charset is in use.
>     }
>   return sys_cp_mbstowcs (f_mbtowc, dst, dlen, src, nms);
> }
> Since CP1252 is an 8-bit single-byte character set with characters >=
> 0x80, the '0xc7' character is always translated as '0xc7 0xf0', with the
> '0xf0' byte indicating an invalid character in the string.
> This doesn't seem to happen when e.g. '/bin/echo' is run, although I
> haven't stepped into the code to see what's happening.
> I do not think this is a Cygwin bug, but since the User's Guide says the
> locale and charset can be changed on the fly, I don't know what's going
> awry.
> Any suggestions? If you need more information, I'm happy to provide it.
Try:
$ chcp.com
Active code page: 850
$ chcp.com 65001
Active code page: 65001
$ chcp.com
Active code page: 65001
--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in IEC units and prefixes, physical quantities in SI.]