• Verify encoding

    From Michiel van der Vlist@2:280/5555 to Maurice Kinal on Fri Apr 25 10:48:27 2014
    Hello Maurice,

    On Thursday April 24 2014 14:12, you wrote to me:

    MvdV>> there is no way to tell what character encoding is used.

    You mean that you cannot verify encoding.

    No, I can't. Verifying the encoding is only possible in some specific cases or when making assumptions.

    Try this one for a change:

    They call me PA\MMV ...

    I could have easily provided the wrong encoding in the CHRS kludge
    which happens all the time and for sure it would have been grunged

    Yes, I saw your attempts to throw sand in the gearbox.

    just like the Russian site does with LATIN-1 since it uses iconv and
    there is no such thing as LATIN-1

    Yes, there is, LATIN-1 is mentioned in FTS-5003.

    but instead LATIN1.

    LATIN1 is not a character encoding scheme that is "current practise" in Fidonet.

    I posted an example of this flaw in FTSC_PUBLIC to demonstrate the
    flaw along with evidence that it is so.

    The flaw is in the conversion that the webmaster of that site uses to convert Fidonet messages to HTML. It is not Fidonet's problem that he can't do it right.

    MvdV>> Result -> garbage.

    Actually it is fine since the 8 bit codes match perfectly with the
    utf-8 pair (0xc3, 0xb8) with CP850 for those particular codes.

    Yeah, garbage in, garbage out...


    Cheers, Michiel

    --- GoldED+/W32-MINGW 1.1.5-b20110320
    * Origin: http://www.vlist.eu (2:280/5555)
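
The byte pair (0xc3, 0xb8) discussed in the message above can be checked directly. What follows is a minimal Python sketch, assuming the gateway received UTF-8 bytes and decoded them as CP850. The helper name `misread` is hypothetical and not part of any Fidonet tool.

```python
def misread(text: str, actual: str, assumed: str) -> str:
    """Encode text with the actual encoding, then decode it with the assumed one."""
    return text.encode(actual).decode(assumed)


# "ø" occupies two bytes in UTF-8: 0xC3 0xB8.
utf8_bytes = "ø".encode("utf-8")

# Both bytes are assigned characters in CP850, so decoding succeeds
# without any error and simply yields two unrelated characters.
as_cp850 = misread("ø", "utf-8", "cp850")

print(utf8_bytes.hex(" "))
print(as_cp850)
```

Because the decode step raises no error, nothing in the bytes themselves signals that the wrong encoding was assumed.
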
  • From Maurice Kinal@1:153/7001 to Michiel van der Vlist on Fri Apr 25 14:53:04 2014
    Hey Michiel!

    MvdV> Try this one for a change

    Not sure what you are expecting but the message "MSGID: 2:280/5555 35a3d7b0" I am replying to shows here as ascii with an additional 44 unneeded linefeeds. I am guessing some obsoleted DOS-think abandonware created this one or your notepad idea without golded's help to rid it of the additional linefeeds.

    MvdV> LATIN1 is not a character encoding scheme that is "current
    MvdV> practise" in Fidonet.

    Understood. Not really an issue unless of course you are really using iso-8859-1 with the LATIN-1 alias and wonder why the Apache server, or any other glibc based httpd, isn't displaying your characters correctly.

    MvdV> It is not the problem of Fidonet, that he can't do it right.

    It isn't him but the alias for iso-8859-1 instead. Not quite as bad as an issue of fidonet sysops who insist their abandonware is correct, especially given the obsoleted datetime stamp with the nonstandard tzutc 'fix' that is obviously broken. His biggest mistake was to implement the CHRS kludge instead of ignoring it which most webmasters do. ;-)

    MvdV> Yeah, garbage in, garbage out...

    Yep. I prefer, "A Møøse once bit my sister ..."

    Life is good,
    Maurice

    ... Don't cry for me I have vi.
    --- GNU bash, version 4.3.11(1)-release (x86_64-unknown-linux-gnu)
    * Origin: Pointy Stick Society - Ladysmith BC, Canada (1:153/7001.0)
  • From Michiel van der Vlist@2:280/5555 to Maurice Kinal on Fri Apr 25 21:42:58 2014
    Hello Maurice,

    On Friday April 25 2014 14:53, you wrote to me:

    MvdV>> Try this one for a change

    Not sure what you are expecting but the message "MSGID: 2:280/5555 35a3d7b0" I am replying to shows here as ascii

    That is more or less what I expected...

    with an additional 44 unneeded linefeeds. I am guessing some
    obsoleted DOS-think

    Keep on guessing. The point is that you were unable to determine what encoding was used. Just as I expected. You scorched me for being unable to verify the encoding. Now I caught you out on the same deficiency. The difference is that I never claimed it was possible, whereas you did.

    You might have figured it out if you had some relevant knowledge or the help of a Scandinavian HAM. The point is that you could not tell from the message alone.

    You will find the answer in the next message.

    MvdV>> LATIN1 is not a character encoding scheme that is "current
    MvdV>> practise" in Fidonet.

    Understood. Not really an issue unless of course you are really using iso-8859-1 with the LATIN-1 alias and wonder why the Apache server, or
    any other glibc based httpd, isn't displaying your characters
    correctly.

    As I wrote before, it is not my problem.

    MvdV>> It is not the problem of Fidonet, that he can't do it right.

    It isn't him but the alias for iso-8859-1 instead.

    No. If you want to read fidonet messages, you have to play by the fidonet rules. And in fidonet the identifier for iso-8859-1 is LATIN-1.

    Not only in fidonet BTW. The unicode consortium shares this POV.

    Look at the title of this document: http://www.unicode.org/charts/PDF/U0080.pdf

    "C1 controls and LATIN-1 supplement".

    LATIN-1 with a dash between the 'N' and the '1'. Not LATIN1.




    Cheers, Michiel

    --- GoldED+/W32-MINGW 1.1.5-b20110320
    * Origin: http://www.vlist.eu (2:280/5555)
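
The LATIN-1 versus LATIN1 dispute comes down to alias tables: a gateway has to translate the Fidonet identifier into whatever name its conversion library accepts. A minimal Python sketch of that translation step follows. The table `CHRS_TO_CODEC` and the function `codec_for_chrs` are hypothetical names, the table is only an illustrative subset, and the level number in the example value "LATIN-1 2" is an assumption modelled on the "NORWEIG 1" form seen in this thread.

```python
import codecs

# Illustrative subset only: Fidonet CHRS identifier -> a name Python accepts.
CHRS_TO_CODEC = {
    "LATIN-1": "iso-8859-1",
    "CP850": "cp850",
    "UTF-8": "utf-8",
}


def codec_for_chrs(kludge_value: str) -> str:
    """Return Python's canonical codec name for a CHRS value such as 'LATIN-1 2'."""
    # The value is "<identifier> <level>"; only the identifier selects the codec.
    identifier = kludge_value.split()[0].upper()
    return codecs.lookup(CHRS_TO_CODEC[identifier]).name


print(codec_for_chrs("LATIN-1 2"))
# Python happens to accept both spellings as aliases of the same codec.
print(codecs.lookup("latin-1").name == codecs.lookup("latin1").name)
```

An unknown identifier raises `KeyError` here, which is deliberate in this sketch: guessing a fallback encoding would hide exactly the kind of mismatch the thread is arguing about.
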
  • From Michiel van der Vlist@2:280/5555 to Maurice Kinal on Fri Apr 25 22:00:24 2014
    Hello Maurice,


    Try this one for a change:

    They call me PA\MMV ...

    The CHRS kludge should tell you...



    Cheers, Michiel

    --- GoldED+/W32-MINGW 1.1.5-b20110320
    * Origin: http://www.vlist.eu (2:280/5555)
  • From Maurice Kinal@1:153/7001 to Michiel van der Vlist on Fri Apr 25 23:56:39 2014
    Hey Michiel!

    MvdV> That is more or less what I expected

    I already stated that I was only concerned with utf-8 verification and that part has been 100% so far with or without a CHRS kludge. That is all that matters.

    MvdV> I never claimed it was possible, whereas you did.

    I claimed it was possible to verify utf-8 with or without CHRS kludge. The scanning routine successfully determined it wasn't a utf-8 message and that is all it had to do. Given this particular echo called "UTF-8" then it is doing its job admirably. No?

    MvdV> You will find the answer in the next message.

    So will you except in the reply to the next message.

    Life is good,
    Maurice

    ... Don't cry for me I have vi.
    --- GNU bash, version 4.3.11(1)-release (x86_64-unknown-linux-gnu)
    * Origin: Pointy Stick Society - Ladysmith BC, Canada (1:153/7001.0)
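
The scanning routine referred to above is not shown in the thread. A check of this kind can be sketched in Python as follows, assuming the raw message body is available as bytes. The function name `classify` and its three labels are hypothetical.

```python
def classify(body: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'unknown 8-bit' for a raw message body."""
    # Pure 7-bit text is valid UTF-8 too, so report it separately.
    if all(byte < 0x80 for byte in body):
        return "ascii"
    try:
        # Strict decoding rejects any byte sequence that is not well-formed UTF-8.
        body.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return "unknown 8-bit"
    return "utf-8"


print(classify(b"They call me PA\\MMV ..."))
print(classify("Møøse".encode("utf-8")))
print(classify("Møøse".encode("cp850")))
```

Note what the third label does and does not say: the routine can rule UTF-8 out, but it cannot name the encoding that was actually used, and a 7-bit national variant is reported as plain ascii.
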
  • From Maurice Kinal@1:153/7001 to Michiel van der Vlist on Sat Apr 26 00:04:21 2014
    Hey Michiel!

    -={ output of 'iconv -f ISO646-NO' with quote string starts }=-
    MvdV> AREA:UTF-8
    MvdV> @TID: FMail-W32-1.68.1.55-B20140410
    MvdV> @CHRS: NORWEIG 1
    MvdV> @RFC-X-No-Archive: Yes
    MvdV> @PID: FTools-W32-1.67.0.45
    MvdV> @MSGID: 2:280/5555 35adaf80
    MvdV> Hello Maurice,
    MvdV>
    MvdV>
    MvdV> Try this one for a change:
    MvdV>
    MvdV> They call me PAØMMV ...
    MvdV>
    MvdV> The CHRS kludge should tell you...
    MvdV>
    MvdV>
    MvdV>
    MvdV> Cheers, Michiel
    MvdV>
    MvdV> --- GoldED+/W32-MINGW 1.1.5-b20110320
    MvdV> * Origin: http://www.vlist.eu (2:280/5555)
    -={ output of 'iconv -f ISO646-NO' with quote string ends }=-

    Note that it is now utf-8 so the CHRS kludge is in error. ;-)

    Also if I cared about conversions of shoddy fidonet chrs kludges I'd need to alias "NORWEIG 1" to a REAL encoding instead of a bogus one. Personally I don't see the need but if it did matter then I'd be inclined to use powerful tools instead of bogus fidonet ones especially ones that nobody actually uses given all the abandonware currently in use. Bottom line is it doesn't matter ... does it?

    Life is good,
    Maurice

    ... Don't cry for me I have vi.
    --- GNU bash, version 4.3.11(1)-release (x86_64-unknown-linux-gnu)
    * Origin: Pointy Stick Society - Ladysmith BC, Canada (1:153/7001.0)
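
The `iconv -f ISO646-NO` step above can be approximated in Python. This is a minimal sketch, assuming no ISO646-NO codec is available in the standard library and that only the six letter positions matter; other national positions of the set are deliberately left as their ASCII characters. The names `ISO646_NO_LETTERS` and `decode_iso646_no` are hypothetical.

```python
# Letter positions where ISO646-NO differs from ASCII:
# [ \ ] { | }  become  Æ Ø Å æ ø å
ISO646_NO_LETTERS = str.maketrans("[\\]{|}", "ÆØÅæøå")


def decode_iso646_no(body: bytes) -> str:
    """Decode 7-bit ISO646-NO bytes to text (letter positions only)."""
    # The input must be 7-bit clean; anything else raises UnicodeDecodeError.
    return body.decode("ascii").translate(ISO646_NO_LETTERS)


line = decode_iso646_no(b"They call me PA\\MMV ...")
print(line)
print(line.encode("utf-8"))
```

The same bytes read as plain ASCII show a backslash in the callsign, which is why the message looked like ordinary ascii on a system that did not apply the CHRS kludge.
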
  • From Michiel van der Vlist@2:280/5555 to Maurice Kinal on Sun Apr 27 10:50:06 2014
    Hello Maurice,

    On Friday April 25 2014 23:56, you wrote to me:

    MvdV>> That is more or less what I expected

    I already stated that I was only concerned with utf-8 verification and that part has been 100% so far with or without a CHRS kludge. That is
    all that matters.

    No, it is not all that matters. Your character recognition scheme only works if you go by the assumption that it is UTF-8.

    MvdV>> I never claimed it was possible, whereas you did.

    I claimed it was possible to verify utf-8 with or without CHRS kludge.

    I complained that I had no way of knowing WHAT encoding is used when the CHRS kludge is absent.

    The scanning routine successfully determined it wasn't a utf-8 message
    and that is all it had to do. Given this particular echo called
    "UTF-8" then it is doing its job admirably. No?

    No indeed. The world of character encoding in Fidonet is wider than that of ASCII and UTF-8. Just determining whether it is utf-8 or not is not good enough.


    Cheers, Michiel

    --- GoldED+/W32-MINGW 1.1.5-b20110320
    * Origin: http://www.vlist.eu (2:280/5555)
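
The point that "UTF-8 or not" leaves the question open can be illustrated with a short Python sketch. The sample bytes are hypothetical, standing in for a message that arrives without a CHRS kludge, and the function name `candidates` and its default codec list are assumptions made for the example.

```python
def candidates(body: bytes, codecs_to_try=("utf-8", "cp850", "cp437", "iso-8859-1")) -> dict:
    """Return {codec: decoded text} for every codec that decodes body without error."""
    results = {}
    for name in codecs_to_try:
        try:
            results[name] = body.decode(name)
        except UnicodeDecodeError:
            # This codec is ruled out; the others remain possible.
            continue
    return results


raw = b"M\x9b\x9bse"  # hypothetical message bytes, no CHRS kludge present

for name, text in candidates(raw).items():
    print(name, repr(text))
```

UTF-8 is excluded for these bytes, yet several 8-bit code pages still decode them cleanly to different text. Choosing among them takes outside knowledge, such as the CHRS kludge or familiarity with the language of the message.
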