OpenBSD Journal

c2k15: stsp@ on wifi and usb matters, and a peek to the UTF-8 future

Contributed by pitrh on from the dept.

Stefan Sperling (stsp@) may not have landed just yet, but he did file this report from the newly concluded hackathon:

The net80211 wireless code has plenty of comments referring to sections of an old version of the 802.11 standard. I started updating such references in the ieee80211.h header to the 802.11-2012 ("11n") version of the standard, and also added new macros for metadata added in this newer version.

Next I wanted to inspect 11n metadata floating around the hackroom. There were two access points run by us, and plenty of additional ones already present in the venue. I taught tcpdump(8) about 11n metadata in beacons broadcast by access points to see which 11n features they support. Because many 11n features are in fact optional (the lowest required feature set of a client provides only 65Mbit/s, not much more than the 54Mbit/s offered by older standards), inspecting feature flags in beacons is the only way to know what a particular 11n AP can really do.
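As a rough illustration of what inspecting those feature flags involves, the sketch below walks the tagged elements of a beacon and decodes a few bits of the HT Capabilities element (element ID 45). It is not the tcpdump(8) change described above: the names ht_caps_parse, struct ht_summary and the HTCAP_* macros are made up for this example, and the caller is assumed to pass a pointer just past the 12 fixed bytes (timestamp, beacon interval, capability info) at the start of the beacon frame body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not the macros from ieee80211.h. */
#define ELEMID_HT_CAPS   45      /* HT Capabilities element ID */
#define HT_CAPS_LEN      26      /* body length of that element */

/* Selected bits of the 16-bit HT Capabilities Info field. */
#define HTCAP_LDPC       0x0001  /* LDPC coding */
#define HTCAP_CHWIDTH40  0x0002  /* 40MHz channels supported */
#define HTCAP_GREENFIELD 0x0010  /* HT-greenfield format */
#define HTCAP_SGI20      0x0020  /* short guard interval, 20MHz */
#define HTCAP_SGI40      0x0040  /* short guard interval, 40MHz */

struct ht_summary {
	uint16_t caps;        /* HT Capabilities Info, host byte order */
	int      rx_streams;  /* highest spatial stream with any MCS set */
};

/*
 * Walk the (id, length, body) elements in ies[0..len) and summarise
 * the first HT Capabilities element found.
 * Returns 1 on success, 0 if the element is absent or the buffer is
 * truncated.
 */
int
ht_caps_parse(const uint8_t *ies, size_t len, struct ht_summary *out)
{
	size_t off = 0;

	assert(ies != NULL && out != NULL);
	while (len - off >= 2) {
		uint8_t id = ies[off];
		size_t elen = ies[off + 1];
		const uint8_t *body = ies + off + 2;
		int i;

		if (elen > len - off - 2)
			return 0;	/* element runs past the buffer */
		if (id == ELEMID_HT_CAPS && elen >= HT_CAPS_LEN) {
			/* The capabilities field is little-endian. */
			out->caps = (uint16_t)(body[0] | (body[1] << 8));
			/*
			 * The Rx MCS bitmask starts at body[3]; byte i
			 * covers MCS 8i..8i+7, i.e. i+1 spatial streams.
			 */
			out->rx_streams = 0;
			for (i = 0; i < 4; i++)
				if (body[3 + i] != 0)
					out->rx_streams = i + 1;
			return 1;
		}
		off += 2 + elen;
	}
	return 0;
}
```

An AP that sets only MCS 0-7 and none of the optional bits would come out as one stream with caps showing neither HTCAP_CHWIDTH40 nor the short guard interval flags, which is the 65Mbit/s baseline mentioned above.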

I had brought with me a bag full of USB wifi devices I've collected, some of which already work and some not. Some should be working but are somehow broken, either by hardware issues or because of driver bugs. With help from mpi@ I fixed a problem in the urtw(4) driver which brought down the entire system if a device with a slightly damaged connector was attached. We also tried to fix a problem in the uath(4) driver which fails to initialise any of 3 supposedly supported devices thrown at it. We didn't manage to figure this problem out, though.

I also looked into a crash that is caused by a race condition in the run(4) driver. This is not fixed yet because the solution is a bit more involved than expected. I ended up trying to hunt down a similar race in the iwm(4) driver. I have a diff for this that's almost ready, but testing by jca@ revealed my diff breaks suspend/resume for him most of the time. So more work is required and the diff did not get committed.

Partway through the hackathon tedu@ proposed an old diff of his to make our base ls(1) utility display multi-byte characters. This led to a long discussion about how to expand UTF-8 support in base. The conclusion so far indicates that single-byte locales (such as ISO-8859-1 and KOI-8) will be removed from the base OS after the 5.8 release is cut. This simplifies things because the whole system only has to care about a single character encoding. We'll then have a full release cycle to bring UTF-8 support to more base system utilities such as vi(1), ksh(1), and mg(1). To help with this plan I started organizing a UTF-8-focused hackathon for some time later this year.
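To give an idea of the kind of work a utility has to do before it can safely print multi-byte file names, here is a minimal sketch of a strict UTF-8 decoder plus a helper that replaces every invalid byte with '?'. This is an assumption-laden illustration, not tedu@'s diff and not code from base: utf8_decode and utf8_sanitize are hypothetical names, and the sketch deliberately avoids the locale machinery (a real utility would more likely go through the C library's multi-byte functions and wcwidth(3) to work out column widths, and would also have to deal with control characters).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Decode one UTF-8 sequence from s, of which n bytes are available.
 * Returns the number of bytes consumed (1 to 4) and stores the code
 * point in *cp, or returns 0 if the sequence is invalid.
 */
size_t
utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
	/* Smallest code point that legitimately needs 2, 3, 4 bytes. */
	static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
	size_t i, need;
	uint32_t c;

	if (n == 0)
		return 0;
	if (s[0] < 0x80) {
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xe0) == 0xc0) {
		need = 2; c = s[0] & 0x1f;
	} else if ((s[0] & 0xf0) == 0xe0) {
		need = 3; c = s[0] & 0x0f;
	} else if ((s[0] & 0xf8) == 0xf0) {
		need = 4; c = s[0] & 0x07;
	} else
		return 0;	/* stray continuation or invalid lead byte */
	if (n < need)
		return 0;	/* sequence cut short */
	for (i = 1; i < need; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return 0;
		c = (c << 6) | (s[i] & 0x3f);
	}
	/* Reject overlong forms, surrogates and values past U+10FFFF. */
	if (c < min_cp[need] || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
		return 0;
	*cp = c;
	return need;
}

/*
 * Copy the NUL-terminated string in to out (at most outsz bytes
 * including the terminator), replacing each invalid byte with '?'.
 * Returns the number of characters written, not the number of bytes.
 */
size_t
utf8_sanitize(const char *in, char *out, size_t outsz)
{
	const unsigned char *s = (const unsigned char *)in;
	size_t n = strlen(in), o = 0, chars = 0, k;
	uint32_t cp;

	assert(outsz > 0);
	while (n > 0) {
		k = utf8_decode(s, n, &cp);
		if (k == 0) {
			if (o + 1 >= outsz)
				break;
			out[o++] = '?';
			k = 1;		/* skip just the offending byte */
		} else {
			if (o + k >= outsz)
				break;
			memcpy(out + o, s, k);
			o += k;
		}
		s += k;
		n -= k;
		chars++;
	}
	out[o] = '\0';
	return chars;
}
```

The checks for overlong forms and surrogates are the part that is easy to get wrong, which is one reason having only a single encoding to care about simplifies things.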

Thanks for the report and the great work, Stefan! We're looking forward to hearing about the upcoming developments already.



Comments
  1. By Jorden Verwer (94.209.56.203) on

    Wait a minute, did I read that right? Will all single-byte encodings be removed? I obviously don't care about ISO-8859-1 and KOI-8 and other worthless crap encodings, but support for the venerable codepage 437 is essential. It's what real PCs boot up with! I welcome better UTF-8 support, but it would be a real pity if the price would be the loss of support for codepage 437.

    Comments
    1. By Noryungi (noryungi) on

      > Wait a minute, did I read that right? Will all single-byte encodings
      > be removed? I obviously don't care about ISO-8859-1 and KOI-8 and
      > other worthless crap encodings, but support for the venerable codepage
      > 437 is essential. It's what real PCs boot up with! I welcome better
      > UTF-8 support, but it would be a real pity if the price would be the
      > loss of support for codepage 437.

      From what I understand, code page 437 has been mapped onto UTF-8.

      Therefore, UTF-8 encoding should be able to provide some form of emulation for that code page.

      Besides, this is OpenBSD we are talking about. Something tells me UTF-8 will be well implemented. ;-)

      And a good UTF-8 implementation is not something trivial, especially considering these:

      http://www.unicode.org/reports/tr36/
      http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/character-encoding.html

    2. By Anonymous Coward (95.16.168.7) on

      > I welcome better UTF-8 support, but it would be a real pity if the price would be the loss of support for codepage 437.

      CP437 was default codepage up to MS-DOS 3.30. It was replaced by CP850 on MS-DOS 4.0. You should seriously consider upgrading your operating system!

      Comments
      1. By Jorden Verwer (94.209.56.203) on

        > > I welcome better UTF-8 support, but it would be a real pity if the price would be the loss of support for codepage 437.
        >
        > CP437 was default codepage up to MS-DOS 3.30. It was replaced by CP850 on MS-DOS 4.0. You should seriously consider upgrading your operating system!

        Ah yes, codepage 850... The bane of box drawing character lovers around the world.

        Also, you missed my point, because I wasn't referring to the software's default, but to the hardware's. For the original IBM PC (and many of its clones), that default is equivalent to codepage 437.

        Comments
        1. By Stefan Sperling (stsp) on http://stsp.name

          This discussion is confusing me. We don't have locales for the code pages you are talking about.

          Comments
          1. By Jorden Verwer (94.209.56.203) on

            437 has always just worked for me, although indeed it is not explicitly part of the list of locales. What I care about is that it will continue to just work after 5.8. That's all that matters to me and is what I was asking about in the first place.

            Comments
            1. By Stefan Sperling (stsp) on http://stsp.name

              > 437 has always just worked for me, although indeed it is not explicitly part of the list of locales.

              That probably means you've been using the "C" locale (i.e. plain ASCII) which is the fallback used for any locale charset not listed by `locale -m`.
              In which case nothing will change for you. The "C" locale will be supported forever.

              Comments
              1. By Jorden Verwer (94.209.56.203) on

                Ah, that's excellent news, thank you!

                Of course, in retrospect, I should've been able to figure this out by myself. Obviously the C/POSIX locale isn't ever going to be removed...

    3. By Anonymous Coward (82.237.214.167) on

      > Wait a minute, did I read that right? Will all single-byte encodings be removed? I obviously don't care about ISO-8859-1 and KOI-8 and other worthless crap encodings, but support for the venerable codepage 437 is essential. It's what real PCs boot up with! I welcome better UTF-8 support, but it would be a real pity if the price would be the loss of support for codepage 437.
      I'm confused about your post. From what I've seen, the default in OpenBSD text console was always 7-bit ASCII. I don't know if that changed when they added the framebuffer for ATI and Intel, because I don't have those so it's always been running in text mode. Anytime I try to look at an 8-bit ANSI file (like was popular in BBS days) it's mostly garbage that gets cat'd to the terminal. I know this can be remedied in X with the DOS VGA font, or in uxterm with a filter like luit (that can translate the CP 437 characters to Unicode equivalent) but it's never been the default to display the IBM charset from what I've seen. I could be wrong though because I didn't use OpenBSD before 2.8 release.
      The other thing is, you're assuming a lot about hardware architecture. It doesn't really make sense for CP 437 to be the default on a SPARC or other thing that's not x86.
      But if I'm wrong, please let me know what your terminal settings (including locale, etc.) are because I still enjoy the old ANSI art stuff.

  2. By Just Another OpenBSD User (87.126.197.32) on

    Multi-byte characters and better universal encodings are actually expected and needful in user level utilities as understood by users.

  3. By Henrique L. (201.14.247.60) henriqueleng@openmailbox.org on www.henriquelengler.com

    Can't wait to see the utf-8 implementation!
