Created
July 21, 2024 01:46
Lynx patch to ignore xml encoding if charset is specified in Content-Type header
# Lynx patch to ignore xml encoding if charset is specified in Content-Type header
## Problem
Some sites (2chcopipe.com, digital-thread.com) declare an incorrect `encoding` value in their `<?xml ... ?>` declaration.
Lynx trusts that value over the real charset and displays garbled characters.

    Content-Type: text/html; charset=euc-jp
    ...
    <?xml version="1.0" encoding="UTF-8"?>
    ...
    <meta http-equiv="Content-Type" content="text/html; charset=euc-jp" />
    ...
    (euc-jp encoded body text)

## Patch
Ignore the encoding given in the XML declaration when a charset is specified in the Content-Type HTTP response header.
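
The precedence the patch aims for can be sketched as a small standalone C function. This is only an illustration, not Lynx code: `pick_charset` and its parameters are hypothetical names, whereas the actual patch tests `HTAnchor_getUCLYhndl(me->node_anchor, UCT_STAGE_MIME) >= 0` inside `handle_processing_instruction`. The fallback to the XML encoding here assumes the header carried no charset at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Lynx): choose the charset used to
 * decode a document.  A charset from the Content-Type HTTP header wins;
 * the encoding from the <?xml ... ?> declaration is consulted only when
 * the header gave none; otherwise the caller's fallback is returned.
 */
static const char *pick_charset(const char *header_charset,
                                const char *xml_encoding,
                                const char *fallback)
{
    assert(fallback != NULL);

    /* Header charset present: ignore the XML declaration entirely. */
    if (header_charset != NULL && header_charset[0] != '\0')
        return header_charset;

    /* No header charset: the XML declaration is the best hint left. */
    if (xml_encoding != NULL && xml_encoding[0] != '\0')
        return xml_encoding;

    return fallback;
}
```

For the response shown under Problem, `pick_charset("euc-jp", "UTF-8", "iso-8859-1")` yields `"euc-jp"`, so the body is decoded with the charset the server actually sent.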
## TODO
* Treat the XML encoding like a META charset (handle any declared encoding, not only check for utf-8)
--- ../orig/lynx2.9.2/WWW/Library/Implementation/SGML.c 2024-04-12 05:22:19.000000000 +0900
+++ WWW/Library/Implementation/SGML.c 2024-07-20 10:36:02.977162250 +0900
@@ -896,6 +896,8 @@ static void handle_processing_instructio
int flag = me->T.decode_utf8;
me->strict_xml = TRUE;
+ if (HTAnchor_getUCLYhndl(me->node_anchor, UCT_STAGE_MIME) >= 0)
+ return;
/*
* Switch to UTF-8 if the encoding is explicitly "utf-8".
*/