Ticket #441 (new defect)

Opened 12 years ago

Last modified 12 years ago

Unreadable characters in Spanish Pronunciation dictionary

Reported by: kmaclean Owned by: kmaclean
Priority: minor Milestone: WebSite 0.2.1
Component: Web Site Version: Website 0.2
Keywords: Cc:

Description

From this post:

I just took a look at the Spanish pronunciation dictionary. On my computer (Win XP, Firefox) there are some unreadable characters. We have similar problems with the German language when it comes to characters like "ä, ö, ü, ß".

The special characters of the Spanish language may be displayed correctly on your own personal computer. But please keep in mind that other people like me may experience problems with the special characters. Probably you know about this problem. If not, please read this article.

Change History

comment:1 Changed 12 years ago by kmaclean

from svn-book:

Subversion internally handles certain bits of data—for example, property names, pathnames, and log messages—as UTF-8-encoded Unicode.

comment:2 Changed 12 years ago by kmaclean

Subversion Web front-end uses WebDAV on Apache.

Apache httpd.conf contains this line: AddDefaultCharset UTF-8

So all content without a character set defined should be served as UTF-8.
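A minimal sketch of how a UTF-8 label can still produce unreadable characters, assuming the file on disk is stored in a single-byte encoding such as ISO-8859-1. The sample word is hypothetical; it is not taken from the dictionary.

```python
# Hypothetical sample word; the real dictionary entries are not shown in this ticket.
word = "español"

# Bytes as they would be stored in an ISO-8859-1 (Latin-1) file.
latin1_bytes = word.encode("iso-8859-1")

# What a client shows if it trusts a UTF-8 label on those bytes.
# errors="replace" substitutes U+FFFD for each undecodable byte.
shown_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# What the client shows when the label matches the stored encoding.
shown_as_latin1 = latin1_bytes.decode("iso-8859-1")

print(repr(shown_as_utf8))
print(repr(shown_as_latin1))
```

If the served charset and the stored encoding disagree like this, the "ñ" is replaced by a placeholder character, which would match the symptom in the report.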

comment:3 Changed 12 years ago by kmaclean

from Apache 2.2 manual:

This directive specifies a default value for the media type charset parameter (the name of a character encoding) to be added to a response if and only if the response's content-type is either text/plain or text/html. [...]

AddDefaultCharset should only be used when all of the text resources to which it applies are known to be in that character encoding and it is too inconvenient to label their charset individually. One such example is to add the charset parameter to resources containing generated content, such as legacy CGI scripts, that might be vulnerable to cross-site scripting attacks due to user-provided data being included in the output. Note, however, that a better solution is to just fix (or delete) those scripts, since setting a default charset does not protect users that have enabled the "auto-detect character encoding" feature on their browser.
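The rule quoted above can be modelled roughly as follows. This is a simplified illustration, not Apache's implementation; the function name apply_default_charset is made up for this sketch, and it assumes the default is only added when no charset parameter is already present.

```python
def apply_default_charset(content_type: str, default_charset: str = "UTF-8") -> str:
    """Simplified model of the AddDefaultCharset rule (hypothetical helper)."""
    parts = [p.strip() for p in content_type.split(";")]
    media_type = parts[0].lower()
    # Assumption: an explicit charset parameter is left untouched.
    has_charset = any(p.lower().startswith("charset=") for p in parts[1:])
    if media_type in ("text/plain", "text/html") and not has_charset:
        return f"{content_type}; charset={default_charset}"
    return content_type


print(apply_default_charset("text/plain"))
print(apply_default_charset("text/html; charset=ISO-8859-1"))
print(apply_default_charset("image/png"))
```

Note that the rule only labels the response. It does not convert the bytes, so a file stored in another encoding is still sent unchanged.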

comment:4 Changed 12 years ago by kmaclean

Prompts file has the same problem:

comment:5 Changed 12 years ago by kmaclean

Pronunciation dictionary:

  • Trac Version is too big to display in Trac. If you download it, it displays OK in gedit, but only because gedit displays the file in its original encoding.
  • Subversion Version has problems

To determine a document's encoding, download it and open it in Firefox, then look at "View > Character Encoding" to see which encoding it has. Both the prompts and lexicon files show up with ISO-8859-1 character encoding, rather than UTF-8.
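As an alternative to the browser check, a rough script-based sketch is shown below. It is not the method used in this ticket, and the helper names guess_encoding and to_utf8 are made up for illustration. It assumes that anything which is not valid UTF-8 is ISO-8859-1, which is only a guess, because every byte sequence decodes without error as ISO-8859-1.

```python
def guess_encoding(data: bytes) -> str:
    """Return "utf-8" if the bytes decode cleanly as UTF-8, else assume ISO-8859-1."""
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: the only other encoding in play is ISO-8859-1.
        return "iso-8859-1"


def to_utf8(data: bytes) -> bytes:
    """Re-encode the bytes as UTF-8, based on the guessed source encoding."""
    return data.decode(guess_encoding(data)).encode("utf-8")


# Hypothetical sample; a real check would read the downloaded file in binary mode.
sample = "español".encode("iso-8859-1")
print(guess_encoding(sample))
print(to_utf8(sample).decode("utf-8"))
```

Pure ASCII content is reported as UTF-8, which is harmless because ASCII bytes are identical in both encodings.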
