Ticket #426 (closed defect: fixed)

Opened 12 years ago

Last modified 12 years ago

WebGUI Upgrade messed up Russian web page

Reported by: kmaclean Owned by: kmaclean
Priority: major Milestone: WebSite 0.2.1
Component: Web Site Version: Website 0.2
Keywords: Cc:

Description (last modified by kmaclean) (diff)

  • Upgrading WebGUI from release 7.4.20 to 7.5.20 caused some problems with the Russian web page.
  • Changed everything to English while trying to find a workaround.
  • Cut and pasted the Russian page from the Google cache into WebGUI.
  • Now WebGUI displays the page OK when signed in, but when viewing the page as a visitor, the page is all garbled.
  • Cannot duplicate this issue on the WebGUI demo site???

Attachments

read-asVisitor.zip (39.2 KB) - added by kmaclean 12 years ago.
read-asRegisteredUser.zip (74.7 KB) - added by kmaclean 12 years ago.

Change History

comment:1 Changed 12 years ago by kmaclean

  • Description modified (diff)

comment:2 Changed 12 years ago by kmaclean

  • rolling back the Russian changes... so that English version is used while trying to find a fix for this...

comment:3 Changed 12 years ago by kmaclean

Problem is not consistent across the site:

  • The Languages section of the Home, Forum, Dev, Download and About pages displays the Russian word for "Russian" OK.
  • The Listen page has display problems with the Russian word; so does the Read page, where in addition none of the Russian text (except the Speech Submission App) displays properly.

Changed 12 years ago by kmaclean

read-asVisitor.zip

Changed 12 years ago by kmaclean

read-asRegisteredUser.zip

comment:4 Changed 12 years ago by kmaclean

If you log out while displaying the Russian Read page (as a registered user...), you land on this URL: http://www.voxforge.org/ru/read?op=auth;method=logout and everything displays fine.

If you remove the "?op=auth;method=logout", then the page displays incorrectly.

comment:5 Changed 12 years ago by kmaclean

  • Status changed from new to assigned

After backing everything out, the Listen page still displays the English word "Russian" incorrectly, even though it appears OK when signed in or when accessing the page with commands attached to the URL, like this: http://www.voxforge.org/ru/listen?op=auth;method=logout;

If I remove the "?op=auth;method=logout" (after logging out) from the URL the word "Russian" displays all garbled up...???

comment:6 Changed 12 years ago by kmaclean

This is the HTML from the listen page:

<p>
<style type="text/css">
	&amp;lt;!--
		@page { size: 8.5in 11in; margin: 0.79in }
		P { margin-bottom: 0.08in }
	--&amp;gt;</style>
</p>
<div align="center"><span style="font-size: xx-small;">
<a href="/de/listen" title="German">Deutsch</a><br />
<a href="/home/listen" title="English">English</a><br />
<a href="/es/listen" title="Spanish">Espa&ntilde;ol</a></span></div>
<div align="center">
<div align="center"><span style="font-size: xx-small;"><a href="/fr/listen">Fran&ccedil;ais</a></span><br /></div>
<span style="font-size: xx-small;">
<a href="/hb/listen" title="Hebrew">Hebrew</a><br />
<a href="/it/listen" title="Italian">Italiano</a><br />
<a href="/nl/listen" title="Dutch">Nederlands</a><br />
<a href="/pt_br/listen" title="Portuguese-Brazilian">Portugu&ecirc;s</a><br />
<a href="/ru/listen" title="Russian">Russian</a></span>
 <br /></div>
<p>
<a href="/es" title="Spanish"><br /></a></p>
<div align="center">
</div>
<p style="margin-bottom: 0in" align="center"><br />
</p>

comment:7 Changed 12 years ago by kmaclean

Fix to Listen page so that Russian (in English) will display properly:

<div align="center">
<span style="font-size: xx-small;">
<a href="/de/listen" title="German">Deutsch</a><br />
<a href="/home/listen" title="English">English</a><br />
<a href="/es/listen" title="Spanish">Espa&ntilde;ol</a><br />
<a href="/fr/listen">Fran&ccedil;ais</a><br />
<a href="/hb/listen" title="Hebrew">Hebrew</a><br />
<a href="/it/listen" title="Italian">Italiano</a><br />
<a href="/nl/listen" title="Dutch">Nederlands</a><br />
<a href="/pt_br/listen" title="Portuguese-Brazilian">Portugu&ecirc;s</a><br />
<a href="/ru/listen" title="Russian">Russian</a></span>
</div>
<div align="center">
<p>&nbsp;</p>
</div>

Not sure why previous one had two Spanish entries...

comment:8 Changed 12 years ago by kmaclean

Can replicate the original messed-up characters after the upgrade from WebGUI 7.4.21 to 7.5.22 on the dev box (Fedora 9); but if I copy the Google cached version of the web page to the dev WebGUI server, everything looks fine...??? Even after clearing the cache in WebGUI and in the browser.

comment:9 Changed 12 years ago by kmaclean

from gotcha page in WebGUI 7.5.22:

7.5.9
--------------------------------------------------------------------
 * WebGUI 7.5.6 uses a Unicode database connection, but this can cause problems
   with old data stored in an erroneous format.  The 7.5.6 upgrade has been
   adjusted to compensate for this.  If you are upgrading from prior to 7.5.6,
   the data should be repaired automatically.  However, if you had already upgraded
   past 7.5.6, there is no automated way to resolve the differences in the data.
   For information on how to resolve this if you have already upgraded, see
        http://www.webgui.org/bugs/tracker/charset-db-connection

comment:10 Changed 12 years ago by kmaclean

If you create a brand new page layout, and then create an article with Russian text, then it displays OK with a "clean" URL - i.e. http://www.voxforge.org/ru/read

comment:11 Changed 12 years ago by kmaclean

This problem only seems to occur when there is an HTTP proxy wobject on the same page. Tried it on the WebGUI demo site (r7.5.24) and there is no problem.

Therefore, likely need to upgrade to a newer version of WRE or MySQL

(VoxForge WRE version is 0.7.1 - most current WRE version is 0.8.5)

comment:12 Changed 12 years ago by kmaclean

Tried with WebGUI 7.5.24 and 7.6 and the problem is still there. Don't think it is a WRE 0.8.5 issue, since the versions of Perl, MySQL, ... are not that different from those in WRE 0.7.1.

These posts might be related:

Data Form & International labels - says this was fixed in 7.5.22 & 7.6.0

HTTP Proxy & Syndicated Content with international text corrupted

comment:13 Changed 12 years ago by kmaclean

Possible fix (though this is likely another MySQL table config issue...):

sub view {
    my $self = shift;
    if ($self->{_viewVars}{showAdmin} && $self->canEditIfLocked) {
        # under normal circumstances we don't put HTML stuff in our code, but this will make it much easier
        # for end users to work with our templates
        $self->{_viewVars}{"dragger.icon"} = '<div class="dragTrigger dragTriggerWrap">'.$self->session->icon->drag('class="dragTrigger"').'</div>';
        $self->{_viewVars}{"dragger.init"} = '
            <iframe id="dragSubmitter" style="display: none;" src="'.$self->session->url->extras('spacer.gif').'"></iframe>
            <script type="text/javascript">
                dragable_init("'.$self->getUrl("func=setContentPositions;map=").'");
            </script>
            ';
    }
    my $showPerformance = $self->session->errorHandler->canShowPerformanceIndicators();
    my $out = $self->processTemplate($self->{_viewVars},undef,$self->{_viewTemplate});
    my @parts = split("~~~~~",$self->processTemplate($self->{_viewVars},undef,$self->{_viewTemplate}));
    my $output = "";
    foreach my $part (@parts) {
        my ($outputPart, $assetId) = split("~~~",$part,2);
        if ($self->{_viewPrintOverride}) {
            $self->session->output->print($outputPart);
        } else {
            $output .= $outputPart;
        }
        my $asset = $self->{_viewPlaceholder}{$assetId};
        if (defined $asset) {
            my $t = [Time::HiRes::gettimeofday()] if ($showPerformance);
            my $assetOutput = $asset->view;
            $assetOutput .= "Asset:".Time::HiRes::tv_interval($t) if ($showPerformance);
            if ($self->{_viewPrintOverride}) {
                $self->session->output->print($assetOutput);
            } else {
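                # !!!!!! local hack: if the asset output is still undecoded bytes,
                # decode it as UTF-8 before appending it to $output
                if (! utf8::is_utf8($assetOutput)) {
                    utf8::decode($assetOutput);
                }
                # !!!!!!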
                $output .= $assetOutput;
            }
        }
    }
    return $output;
}

comment:14 Changed 12 years ago by kmaclean

That was in the view method for the WebGUI::Asset::Wobject::Layout class.
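
For illustration only (this is Python, not WebGUI code): a minimal sketch of the kind of garbling the utf8::decode call guards against. It assumes the cause is raw UTF-8 bytes being treated as Latin-1 characters when they are appended to text that has already been decoded; that cause has not been confirmed for this site. The function names and the sample word are made up for the example.

```python
# Illustrative sketch, not WebGUI code. Assumption: the garbling comes from
# raw UTF-8 bytes being treated as Latin-1 characters when they are joined
# with text that has already been decoded.


def join_without_decoding(page_text: str, asset_bytes: bytes) -> str:
    """Mimic the bug: each byte is widened to one character (Latin-1)."""
    return page_text + asset_bytes.decode("latin-1")


def join_with_decoding(page_text: str, asset_bytes: bytes) -> str:
    """Mimic the fix: decode the bytes as UTF-8 before concatenating."""
    return page_text + asset_bytes.decode("utf-8")


if __name__ == "__main__":
    word = "Русский"            # sample text only
    raw = word.encode("utf-8")  # what an undecoded asset would hold
    print(join_without_decoding("Language: ", raw))  # garbled
    print(join_with_decoding("Language: ", raw))     # readable
```

In the garbled case every byte of the asset becomes its own character, so the Russian word comes out twice as long as it should be and unreadable.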

comment:15 Changed 12 years ago by kmaclean

Other issues that might be related:

charset problems continue - 13-June-2008

charset of connection to DB - 11-August-2006

comment:17 Changed 12 years ago by kmaclean

tried this:

Add the following to /etc/mysql/my.cnf in the proper sections:

[mysqld]
default-character-set=utf8
character-set-server = utf8
collation-server = utf8_general_ci

[client]
default-character-set=utf8

 

Dump the DB: 

# mysqldump --default-character-set=latin1 --skip-set-charset -u root -p www_voxforge_org > www_voxforge_org.sql

 

Switch char set and collate to UTF8:

# perl -pe 's/latin1_bin/utf8_general_ci/g; s/latin1/utf8/g' www_voxforge_org.sql > www_voxforge_org-utf8.sql
 

 Recreate and populate DB:

mysql --execute="DROP DATABASE www_voxforge_org; CREATE DATABASE www_voxforge_org CHARACTER SET utf8 COLLATE utf8_general_ci;" -u root -p

mysql --default-character-set=utf8 www_voxforge_org < www_voxforge_org-utf8.sql

did not work...
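
For reference, the substitution step in the perl -pe command can be sketched in Python as below. This is only an illustration of what the two replacements do to a line of the dump; switch_charset and the sample lines are hypothetical, not taken from the real dump.

```python
# Illustrative sketch of the two substitutions in the perl -pe command.
# switch_charset is a hypothetical helper name.


def switch_charset(line: str) -> str:
    """Apply the same two replacements, in the same order."""
    # The collation must be rewritten first; otherwise "latin1_bin" would
    # become "utf8_bin" and the first pattern would never match.
    line = line.replace("latin1_bin", "utf8_general_ci")
    return line.replace("latin1", "utf8")


if __name__ == "__main__":
    sample = ") ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_bin;"
    print(switch_charset(sample))
```

Note that the replacement is purely textual, so any row data that happens to contain the string "latin1" would be rewritten as well.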

comment:18 Changed 12 years ago by kmaclean

There are two related problems:

  1. When you have an Article Wobject with Russian text (happens with some Portuguese characters also...), and there is an HTTPProxy Wobject on the same Page Layout, then the Russian text does not display correctly.

There are no character display problems if the Article with Russian text is on a page layout with no HTTPProxy assets on the same page.

The temporary fix for this was mentioned a couple of posts back:

sub view {

    ...

    foreach my $part (@parts) {
        my ($outputPart, $assetId) = split("~~~",$part,2);
        if ($self->{_viewPrintOverride}) {
            $self->session->output->print($outputPart);
        } else {
            $output .= $outputPart;
        }
        my $asset = $self->{_viewPlaceholder}{$assetId};
        if (defined $asset) {
            my $t = [Time::HiRes::gettimeofday()] if ($showPerformance);
            my $assetOutput = $asset->view;
            $assetOutput .= "Asset:".Time::HiRes::tv_interval($t) if ($showPerformance);
            if ($self->{_viewPrintOverride}) {
                $self->session->output->print($assetOutput);
            } else {
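                # !!!!!! local hack: if the asset output is still undecoded bytes,
                # decode it as UTF-8 before appending it to $output
                if (! utf8::is_utf8($assetOutput)) {
                    utf8::decode($assetOutput);
                }
                # !!!!!!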
                $output .= $assetOutput;
            }
        }
    }
    return $output;
}

If you are registered and signed in, you can see the Russian text correctly. If you are signed out and the URL has a "?" followed by some text (i.e. WebGUI thinks you are trying to execute an operation), you can also see the text correctly. If you remove the operation text from the URL (i.e. "?..."), then the Russian text does not display correctly. This seems to be a result of the "slashdot / burst protection hack" in the www_view method in the WebGUI::Asset::Wobject::Layout class:

sub www_view {
    my $self = shift;
    # slashdot / burst protection hack
    if ($self->session->var->get("userId") eq "1" && $self->session->form->param() == 0) { 
        my $check = $self->checkView;
        return $check if (defined $check);
        my $cache = WebGUI::Cache->new($self->session, "view_".$self->getId);
        my $out = $cache->get if defined $cache;
        unless ($out) {
            $self->prepareView;
            $self->session->stow->set("cacheFixOverride", 1);
            $out = $self->processStyle($self->view);
            $cache->set($out, 60);
            $self->session->stow->delete("cacheFixOverride");
        }
        
        ...

        $self->session->http->setLastModified($self->getContentLastModified);
        $self->session->http->sendHeader;   
        $self->session->output->print($out, 1);
        
        return "chunked";   
    }
    $self->{_viewPrintOverride} = 1; # we do this to make it output each asset as it goes, rather than waiting until the end
    return $self->SUPER::www_view;
}

The hack only runs when both conditions hold: $self->session->var->get("userId") eq "1", which corresponds to a Visitor, and $self->session->form->param() == 0, which corresponds to there being no operation code in the URL (i.e. no "?...").
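
As a rough illustration (Python, not WebGUI code), the condition can be sketched as follows. It simplifies the form-parameter check to "the URL has no query string"; takes_cached_path, VISITOR_ID and the registered-user id "42" are hypothetical names and values used only for this example.

```python
# Illustrative sketch, not WebGUI code. Simplifying assumption: "no form
# parameters" is approximated by "the URL has an empty query string".
from urllib.parse import urlsplit

VISITOR_ID = "1"  # mirrors the "1" in the Perl condition


def takes_cached_path(user_id: str, url: str) -> bool:
    """True when a request would enter the burst-protection branch."""
    has_params = urlsplit(url).query != ""
    return user_id == VISITOR_ID and not has_params


if __name__ == "__main__":
    base = "http://www.voxforge.org/ru/read"
    print(takes_cached_path("1", base))                               # visitor, clean URL
    print(takes_cached_path("1", base + "?op=auth;method=logout"))    # visitor, with operation
    print(takes_cached_path("42", base))                              # signed-in user
```

Only the first case goes through the cached branch, which matches the observation that the text is garbled only for a signed-out visitor on a clean URL.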

The fact that the "slashdot / burst protection hack" causes problems on the VoxForge instance of WebGUI and not on the PlainBlack demo instance makes me think this might be a MySQL issue...

comment:19 Changed 12 years ago by kmaclean

  2. The second problem seems to occur when there is an article being *referenced* by a Proxy Macro (not an HTTPProxy wobject... I have not tested whether HTTPProxy causes the same problem to occur...). The erroneous Russian text appears both in the original article being proxied and where the proxy macro appears...

If you are registered and signed in, you can see the Russian text correctly. If you are signed out and have a "?" with some text at the end of your URL (i.e. WebGUI thinks you are trying to execute an operation), you can also see the text correctly. If you remove the operation text from the URL (i.e. "?..."), then the Russian text does not display correctly. The "utf8::decode($assetOutput);" hack does not seem to fix this particular problem... Therefore the entire "slashdot / burst protection hack" in the www_view method in the WebGUI::Asset::Wobject::Layout class needs to be commented out:

sub www_view {
    my $self = shift;
    # slashdot / burst protection hack
    #if ($self->session->var->get("userId") eq "1" && $self->session->form->param() == 0) {
    #    my $check = $self->checkView;
    #    return $check if (defined $check);
    #    my $cache = WebGUI::Cache->new($self->session, "view_".$self->getId);
    #    my $out = $cache->get if defined $cache;
    #    unless ($out) {
    #        $self->prepareView;
    #        $self->session->stow->set("cacheFixOverride", 1);
    #        $out = $self->processStyle($self->view);
    #        $cache->set($out, 60);
    #        $self->session->stow->delete("cacheFixOverride");
    #    }
    #    # keep those ads rotating
    #    while ($out =~ /(\[AD\:([^\]]+)\])/gs) {
    #        my $code = $1;
    #        my $adSpace = WebGUI::AdSpace->newByName($self->session, $2);
    #        my $ad = $adSpace->displayImpression if (defined $adSpace);
    #        $out =~ s/\Q$code/$ad/ges;
    #    }
    #    $self->session->http->setLastModified($self->getContentLastModified);
    #    $self->session->http->sendHeader;
    #    $self->session->output->print($out, 1);
    #
    #    return "chunked";
    #}
    $self->{_viewPrintOverride} = 1; # we do this to make it output each asset as it goes, rather than waiting until the end
    return $self->SUPER::www_view;
}

comment:20 Changed 12 years ago by kmaclean

  • Status changed from assigned to closed
  • Resolution set to fixed

So basically these issues are fixed, but with an ugly hack that will get overwritten at the next WebGUI upgrade.

Closing this issue, and opening another ticket for a later milestone to look at this in more depth (once I get a better understanding of the WebGUI internals).

comment:21 Changed 12 years ago by kmaclean

see #ticket 431

comment:22 Changed 12 years ago by kmaclean

see ticket #431

comment:23 Changed 12 years ago by kmaclean

  • Description modified (diff)