[openstack-dev] Sprint at Pycon: Port OpenStack to Python 3

John Dennis jdennis at redhat.com
Tue Apr 1 17:39:11 UTC 2014


On 04/01/2014 12:15 PM, Victor Stinner wrote:
> Hi,
> 
> On Tuesday, 1 April 2014, 09:11:52, John Dennis wrote:
>> What are the plans for python-ldap? Only a small part of python-ldap is
>> pure python, are you also planning on tackling the CPython code?
> 
> Oh, python-ldap was just an example, I don't have a concrete plan for each
> dependency. We have been porting dependencies for some weeks, and many
> already have a pending patch or pull request:
> https://wiki.openstack.org/wiki/Python3#Dependencies
> 
> I know the Python C API and I know all the Unicode issues well, so I'm not
> afraid of having to hack python-ldap if it's written in C ;-)
> 
> For your information, I am a Python core developer and I have been fixing
> Unicode issues in Python for 4 years or more :-) I also wrote a free ebook
> "Programming with Unicode":
> 
>    http://unicodebook.readthedocs.org/

Great! It's wonderful to have someone steering the effort who actually
understands the issues. FWIW, I too have been fixing Python Unicode
issues for years, and I use CPython and know it intimately.

Your book is good, I've seen it.

My general observation is that i18n is a lot like security: 95% of
developers don't understand it, don't want to deal with it, and have the
mistaken belief they can postpone addressing it until after the "coding
is done" instead of building it in from the very beginning. Security and
internationalization can't be bolted onto the side as an afterthought;
they have to be designed in from the beginning.

Since developers are not going to learn the issues, what I think is
needed is a small set of do's and don'ts. Follow some simple rules and
you'll be mostly O.K.

My simple rules go like this (I think you would concur):

* Every text string *internal* to your code is unicode.

* You encode/decode at the boundaries. Either an API boundary or an I/O
boundary. You must know and understand which encoding will be used at
the boundary and what the boundary requirements are.

* The use of str() should be banned; it's evil. Use six.text_type instead.
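
A minimal sketch of what these rules could look like in practice, assuming
UTF-8 at both boundaries. The function names (read_name, greet,
write_message) are made up for illustration and are not from any OpenStack
code; the fallback for a missing six is only there so the sketch runs on a
bare Python 3.

```python
try:
    from six import text_type  # unicode on Python 2, str on Python 3
except ImportError:
    # Assumption for this sketch only: without six, we are on Python 3.
    text_type = str


def read_name(raw_bytes, encoding="utf-8"):
    """Input boundary: decode bytes to text exactly once."""
    return raw_bytes.decode(encoding)


def greet(name):
    """Internal code: takes text, returns text, never touches encodings."""
    if not isinstance(name, text_type):
        raise TypeError("internal strings must be text, got %r" % type(name))
    return u"Hello, " + name


def write_message(message, encoding="utf-8"):
    """Output boundary: encode text to bytes exactly once."""
    return message.encode(encoding)


# Bytes arrive from the outside world (here: UTF-8 for "Zo" + e-diaeresis).
wire_in = b"Zo\xc3\xab"
wire_out = write_message(greet(read_name(wire_in)))

# Converting a non-string value to text: text_type() rather than str().
count_text = text_type(42)
```

Passing undecoded bytes to greet() fails immediately with a TypeError
instead of surfacing later as a UnicodeDecodeError somewhere far from the
real mistake.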

O.K., that might be a bit simplistic, but it covers a large percentage of
cases. The downside is that the existing OpenStack code is nowhere near
following even these simple rules.



-- 
John


