[OpenStack-Infra] [gear] Making Gear easier to consume ( less .encode() and .decode() )

James E. Blair corvus at inaugust.com
Tue Jun 21 22:16:05 UTC 2016


Morgan Fainberg <morgan.fainberg at gmail.com> writes:

> As I have been converting Zuul and NodePool to python3, I have had to do a
> bunch of changes around encode() and decode() of strings since gear is
> (properly) an implementation of a protocol that requires binary data
> (rather than text_strings).
>
> What this has highlighted is that gear should be made a bit more friendly
> to use in the python world. We already explicitly assume a utf-8 encoding
> for when things are turned into "binary" when crafting the job object in
> certain cases [1].  I have discussed this with Jim Blair, and we both agree
> that the ability to still reference attributes such as "job.name" (in a
> simple manner that is straight forward), is important.

Thanks for this email -- with this and a little browsing of the gear
code, I think I've absorbed most of the context and am able to process
this.  :)

> Here is the outline of the change I'm proposing:
>
> * The main consumable part of gear (public classes) will convert the
> "string" data we assign ( name[2], unique[3]) into utf-8-encoded bytes via
> @property, @property.setter, and @property.getter for public consumption.

This seems good.  I think we can make the assertion that in the API
these things should always be utf-8 encoded strings, and do what's
needed behind the scenes to encode/decode to bytes.

I think this is the bulk of what we need to do to make gear
user-friendly.

> * Arguments are explicitly supposed to be a binary_blob [4]. I am unsure if
> this should also be automatically converted *or* if it should be the
> inverse, .arguments and .arguments_string ?

This is the thing we can't handle automatically.  At least, we can on
the input side (if the user passes in a string, we could auto-encode but
leave bytes alone -- but on the output side since the protocol is
un-typed, we wouldn't know what to do).  So maybe this is a place where
we should force the user to encode/decode.

Looking at the gear code, I think that was the intent in a lot of
places, but I think we may have gotten ahead of ourselves and written
code assuming that isinstance('string', bytes) is False (which is the
case in python3, but not in python2):

  https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L589

That line of code explains a lot to me.

I think we should stick close to the idea that arguments are binary
blobs.

>From that, it seems to follow that reading job.arguments should get you
a bytes object.

On the one hand, I don't think it's unreasonable to say that users must
encode/decode on that boundary.

On the other hand, I think we could easily have a job.arguments_string
property that decodes that from utf-8 for convenience, so why not?  I
think I'm in favor of this.

If we do that, then I think the question is: do we have gear
automatically encode job arguments (in the Job constructor) if the user
passes in a string?  I tend to favor this as well because it should
almost always do the right thing in both python2 and python3 without
inhibiting usage for either strings or binary data.  Does that sound
reasonable?

> * Internally gear will reference the encoded bits via the underlying
> <name>_binary form, which will allow direct access to a non-"string" form
> of the data itself in the case that there is interesting things that need
> to be handled via binary packing (for example) instead of "stringified"
> versions.

Sounds good.  Maybe we should call it "<name>_bytes" though for clarity?
/bikeshed

> * For compatibility the main @property.setter will handle either
> binary_type or string_type (so we don't break everyone).

Most of these are set in the constructor -- having a setter do that
should make the constructor do the right thing, but in case there are
cases where we need to remove type validation, I think it's worth saying
that our intent is to have the constructor handle both forms as well for
the same reason.

> * The "<name>_binary" will enforce that the data with be a binary_type
> only.
>
>
> I think this can be done in a single release of gear with minimal impact to
> those using it. For what it is worth, it is unlikely that anyone has used
> gear extensively in python3 as of yet because of recent bug fixes that
> addressed py2->py3 compat issues around dict.values() and similar list() ->
> iter() changes.

Agreed.

> See the one question in the above proposal for "arguments".
>
> [1]
> https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L86
> [2]
> https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L2054
> [3]
> https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L2058
> [4]
> https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L2056
>
> Thanks,
> --Morgan
> _______________________________________________
> OpenStack-Infra mailing list
> OpenStack-Infra at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra

Thanks again!

-Jim



More information about the OpenStack-Infra mailing list