Planet Twisted

October 18, 2019

Moshe Zadka

An introduction to zope.interface

This has previously been published on opensource.com.

The Zen of Python is loose enough and contradicts itself enough that you can prove anything from it. Let's meditate upon one of its most famous principles: "Explicit is better than implicit."

One thing that traditionally has been implicit in Python is the expected interface. Functions have been documented to expect a "file-like object" or a "sequence." But what is a file-like object? Does it support .writelines? What about .seek? What is a "sequence"? Does it support step-slicing, such as a[1:10:2]?

Originally, Python's answer was the so-called "duck-typing," taken from the phrase "if it walks like a duck and quacks like a duck, it's probably a duck." In other words, "try it and see," which is about as implicit as you can get.

In order to make those things explicit, you need a way to express expected interfaces. One of the first big systems written in Python was the Zope web framework, and it needed those things desperately to make it obvious what rendering code, for example, expected from a "user-like object."

Enter zope.interface, which was part of Zope but published as a separate Python package. The zope.interface package helps declare what interfaces exist, which objects provide them, and how to query for that information.

Imagine writing a simple 2D game that needs various things to support a "sprite" interface: for example, indicating a bounding box, but also indicating when the object intersects with a box. Unlike in some other languages, in Python it is common practice to expose attributes directly as part of the public interface, instead of implementing getters and setters. The bounding box should be an attribute, not a method.

A method that renders the list of sprites might look like:

def render_sprites(render_surface, sprites):
    """
    sprites should be a list of objects complying with the Sprite interface:
    * An attribute "bounding_box", containing the bounding box.
    * A method called "intersects", that accepts a box and returns
      True or False
    """
    pass # some code that would actually render

The game will have many functions that deal with sprites. In each of them, you would have to specify the expected contract in a docstring.

Additionally, some functions might expect a more sophisticated sprite object, maybe one that has a Z-order. We would have to keep track of which methods expect a Sprite object, and which expect a SpriteWithZ object.

Wouldn't it be nice to be able to make what a sprite is explicit and obvious so that methods could declare "I need a sprite" and have that interface strictly defined? Enter zope.interface.

from zope import interface

class ISprite(interface.Interface):

    bounding_box = interface.Attribute(
        "The bounding box"
    )

    def intersects(box):
        "Does this intersect with a box"

This code looks a bit strange at first glance. The methods do not include self, even though including it is otherwise standard practice, and there is an Attribute declaration. This is how interfaces are declared in zope.interface. It looks strange because most people are not used to strictly declaring interfaces.

The reason for this practice is that the interface shows how the method will be called, not how it is defined. Because interfaces are not superclasses, they can be used to declare data attributes.

One possible implementation of the interface can be with a circular sprite:

import attr
from zope.interface import implementer

@implementer(ISprite)
@attr.s(auto_attribs=True)
class CircleSprite:
    x: float
    y: float
    radius: float

    @property
    def bounding_box(self):
        return (
            self.x - self.radius,
            self.y - self.radius,
            self.x + self.radius,
            self.y + self.radius,
        )

    def intersects(self, box):
        # A simple approximation: treat the box as intersecting the
        # circle if at least one of its corners is inside the circle.
        top_left, bottom_right = box[:2], box[2:]
        for choose_x_from in (top_left, bottom_right):
            for choose_y_from in (top_left, bottom_right):
                x = choose_x_from[0]
                y = choose_y_from[1]
                if (((x - self.x) ** 2 + (y - self.y) ** 2) <=
                        self.radius ** 2):
                    return True
        return False

This explicitly declares that the CircleSprite class implements the interface. It even enables us to verify that the class implements it properly:

from zope.interface import verify

def test_implementation():
    sprite = CircleSprite(x=0, y=0, radius=1)
    verify.verifyObject(ISprite, sprite)

This is something that can be run by pytest, nose, or another test runner, and it will verify that the sprite created complies with the interface. The test is often partial: it will not test anything only mentioned in the documentation, and it will not even test that the methods can be called without exceptions! However, it does check that the right methods and attributes exist. This is a nice addition to the unit test suite and -- at a minimum -- prevents simple misspellings from passing the tests.
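
The package also lets code query at runtime whether an object claims to provide an interface. A minimal sketch, reusing the sprite from the test above:

from zope.interface import providedBy

sprite = CircleSprite(x=0, y=0, radius=1)
print(ISprite.providedBy(sprite))  # True
print(list(providedBy(sprite)))    # a list containing ISprite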

If you have some implicit interfaces in your code, why not document them clearly with zope.interface?

by Moshe Zadka at October 18, 2019 03:00 AM

October 16, 2019

Hynek Schlawack

Sharing Your Labor of Love: PyPI Quick and Dirty

A completely incomplete guide to packaging a Python module and sharing it with the world on PyPI.

by Hynek Schlawack (hs@ox.cx) at October 16, 2019 12:00 AM

October 13, 2019

Glyph Lefkowitz

Mac Python Distribution Post Updated for Catalina and Notarization

I previously wrote a post about shipping a PyGame app to users on macOS. It’s now substantially updated for the new Notarization requirements in Catalina. I hope it’s useful to somebody!

by Glyph at October 13, 2019 09:10 PM

October 07, 2019

Glyph Lefkowitz

The Numbers, They Lie

It’s October, and we’re all getting ready for Halloween, so allow me to me tell you a horror story, in Python:

>>> 0.1 + 0.2 - 0.3
5.551115123125783e-17


Some of you might already be familiar with this chilling tale, but for those who might not have experienced it directly, let me briefly recap.

In Python, the default representation of a number with a decimal point in it is something called an “IEEE 754 double precision binary floating-point number”. This standard achieves a generally useful trade-off between performance and correctness, and is widely implemented in hardware, making it a popular choice for numbers in many programming languages.

However, as our spooky story above indicates, it’s not perfect. 0.1 + 0.2 is very slightly less than 0.3 in this representation, because it is a floating-point representation in base 2.

If you’ve worked professionally with software that manipulates money1, you typically learn this lesson early; it’s quite easy to smash head-first into the problem with binary floating-point the first time you have an item that costs 30 cents and for some reason three dimes doesn’t suffice to cover it.

There are a few different approaches to the problem; one is using integers for everything, and denominating your transactions in cents rather than dollars. A strategy which requires less weird unit-conversion2 is to use the built-in decimal module, which provides a floating-point base-10 representation rather than the standard base-2 one, and which doesn’t have any of these weird glitches surrounding numbers like 0.1.
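
A quick sketch of the difference at the interpreter prompt:

>>> from decimal import Decimal
>>> 0.1 + 0.2 == 0.3
False
>>> Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
True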

This is often where a working programmer’s numerical education ends; don’t use floats, they’re bad, use decimals, they’re good. Indeed, this advice will work well up to a pretty high degree of application complexity. But the story doesn’t end there. Once division gets involved, things can still get weird really fast:

>>> from decimal import Decimal
>>> (Decimal("1") / 7) * 14
Decimal('2.000000000000000000000000001')

The problem is the same: before, we were working with 1/10, a value that doesn’t have a finite (non-repeating) representation in base 2; now we’re working with 1/7, which has the same problem in base 10.

Any time you have a representation of a number which uses digits and a decimal point, no matter the base, you’re going to run into some rational values which do not have an exact representation with a finite number of digits; thus, you’ll drop some digits off the (necessarily finite) end, and end up with a slightly inaccurate representation.

But Python does have a way to maintain symbolic accuracy for arbitrary rational numbers -- the fractions module!

>>> from fractions import Fraction
>>> Fraction(1)/3 + Fraction(2)/3 == 1
True
>>> (Fraction(1)/7) * 14 == 2
True

You can multiply and divide and add and subtract to your heart’s content, and still compare against zero and it’ll always work exactly, giving you the right answers.

So if Python has a “correct” representation, which doesn’t screw up our results under a basic arithmetic operation such as division, why isn’t it the default? We don’t care all that much about performance, right? Python certainly trades off correctness and safety in plenty of other areas.

First of all, while Python’s willing to trade off some storage or CPU efficiency for correctness, precise fractions rapidly consume huge amounts of storage even under very basic algorithms, like consuming gigabytes while just trying to maintain a simple running average over a stream of incoming numbers.
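
To make that concrete, here is a small sketch (the numbers and the helper below are illustrative, not from the original post) of how quickly exact denominators grow while keeping a running average of messy rational samples:

import random
from fractions import Fraction

average = Fraction(0)
for n in range(1, 101):
    # a "measurement" with an arbitrary denominator, as real data tends to have
    sample = Fraction(random.randrange(10**6), random.randrange(1, 10**6))
    # standard incremental running-average update, done exactly
    average += (sample - average) / n

# typically hundreds of digits after just 100 samples
print(len(str(average.denominator)))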

But even more importantly, you’ll notice that I said we could maintain symbolic accuracy for arbitrary rational numbers; but, as it turns out, a whole lot of interesting math you might want to do with a computer involves numbers which are irrational: like π. If you want to use a computer to do it, pretty much all trigonometry3 involves a slightly inaccurate approximation unless you have a literally infinite amount of storage.

As Morpheus put it, “welcome to the desert of the real”.


  1. or any proxy for it, like video-game virtual currency 

  2. and less time saying weird words like “nanodollars” to your co-workers 

  3. or, for that matter, geometry, or anything involving a square root 

by Glyph at October 07, 2019 06:25 AM

October 05, 2019

Glyph Lefkowitz

A Few Bad Apples

I’m a little annoyed at my Apple devices right now.

Time to complain.

“Trust us!” says Apple.

“We’re not like the big, bad Google! We don’t just want to advertise to you all the time! We’re not like Amazon, just trying to sell you stuff! We care about your experience. Magical. Revolutionary. Courageous!”

But I can’t hear them over the sound of my freshly-updated Apple TV — the appliance which exists solely to play Daniel Tiger for our toddler — playing the John Wick 3 trailer at full volume automatically as soon as it turns on.

For the aforementioned toddler.

I should mention that it is playing this trailer while specifically logged in to a profile that knows their birth date1 and also their play history2.


I’m aware of the preferences which control autoplay on the home screen; it’s disabled now. I’m aware that I can put an app other than “TV” in the default spot, so that I can see ads for other stuff, instead of the stuff “TV” shows me ads for.

But the whole point of all this video-on-demand junk was supposed to be that I can watch what I want, when I want — and buying stuff on the iTunes store included the implicit promise of no advertisements.

At least Google lets me search the web without any full-screen magazine-style ads popping up.

Launch the app store to check for new versions?

[image: Apple Arcade ad]

I can’t install my software updates without accidentally seeing HUGE ads for new apps.

Launch iTunes to play my own music?

[image: Apple Music ad]

I can’t play my own, purchased music without accidentally seeing ads for other music — and also Apple’s increasingly thirsty, desperate plea for me to remember that they have a streaming service now. I don’t want it! I know where Spotify is if I wanted such a thing, the whole reason I’m launching iTunes is that I want to buy and own the music!

On my iPhone, I can’t even launch the Settings app to turn off my WiFi without seeing an ad for AppleCare+, right there at the top of the UI, above everything but my iCloud account. I already have AppleCare+; I bought it with the phone! Worse, at some point the ad glitched itself out, and now it’s blank, and when I tap the blank spot where the ad used to be, it just shows me this:

[image: undefined is not an insurance plan]

I just want to use my device, I don’t need ad detritus littering every blank pixel of screen real estate.

Knock it off, Apple.


  1. less than 3 years ago 

  2. Daniel Tiger, Doctor McStuffins, Word World; none of which have super significant audience overlap with the John Wick franchise 

by Glyph at October 05, 2019 06:32 PM

September 24, 2019

Jp Calderone

Tahoe-LAFS on Python 3 - Call for Porters

Hello Pythonistas,

Earlier this year a number of Tahoe-LAFS community members began an effort to port Tahoe-LAFS from Python 2 to Python 3.  Around five people are currently involved in a part-time capacity.  We wish to accelerate the effort to ensure a Python 3-compatible release of Tahoe-LAFS can be made before the end of upstream support for CPython 2.x.

Tahoe-LAFS is a Free and Open system for private, secure, decentralized storage.  It encrypts and distributes your data across multiple servers.  If some of the servers fail or are taken over by an attacker, the entire file store continues to function correctly, preserving your privacy and security.

Foolscap, a dependency of Tahoe-LAFS, is also being ported.  Foolscap is an object-capability-based RPC protocol with flexible serialization.

Some details of the porting effort are available in a milestone on the Tahoe-LAFS trac instance.

To help with this, we are hoping to find one or more people with significant prior Python 3 porting experience and, preferably, some familiarity with Twisted, though in general the Tahoe-LAFS project welcomes contributors of all backgrounds and skill levels.

We would prefer someone to start with us as soon as possible and no later than October 15th. If you are interested in this opportunity, please send us any questions you have, as well as details of your availability and any related work you have done previously (GitHub, LinkedIn links, etc). If you would like to find out more about this opportunity, please contact us at jessielisbetfrance at gmail (dot) com or on IRC in #tahoe-lafs on Freenode.

by Jean-Paul Calderone (noreply@blogger.com) at September 24, 2019 04:59 PM

September 17, 2019

Moshe Zadka

Adding Methods Retroactively

The following post was originally published on OpenSource.com as part of a series on seven libraries that help solve common problems.

Imagine you have a "shapes" library. We have a Circle class, a Square class, etc.

A Circle has a radius, a Square has a side, and maybe Rectangle has height and width. The library already exists: we do not want to change it.

However, we do want to add an area calculation. If this was our library, we would just add an area method, so that we can call shape.area(), and not worry about what the shape is.

While it is possible to reach into a class and add a method, this is a bad idea: nobody expects their class to grow new methods, and things might break in weird ways.

Instead, the singledispatch function in functools can come to our rescue:

from functools import singledispatch

@singledispatch
def get_area(shape):
    raise NotImplementedError("cannot calculate area for unknown shape",
                              shape)

The "base" implementation for the get_area function just fails. This makes sure that if we get a new shape, we will cleanly fail instead of returning a nonsense result.

import math

@get_area.register(Square)
def _get_area_square(shape):
    return shape.side ** 2

@get_area.register(Circle)
def _get_area_circle(shape):
    return math.pi * (shape.radius ** 2)

One nice thing about doing things this way is that if someone else writes a new shape that is intended to play well with our code, they can implement the get_area themselves:

import math

import attr

from area_calculator import get_area

@attr.s(auto_attribs=True, frozen=True)
class Ellipse:
    horizontal_axis: float
    vertical_axis: float

@get_area.register(Ellipse)
def _get_area_ellipse(shape):
    return math.pi * shape.horizontal_axis * shape.vertical_axis

Calling get_area is straightforward:

print(get_area(shape))
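
To make that concrete: assuming the shapes library's Square and Circle accept keyword construction like the Ellipse above (an assumption for this sketch), calling the generic function might look like:

shapes = [Square(side=2), Circle(radius=1), Ellipse(horizontal_axis=2, vertical_axis=1)]
for shape in shapes:
    # each call dispatches to the implementation registered for that type
    print(get_area(shape))  # 4, then pi, then 2*pi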

This means we can change a function that has a long if isinstance()/elif isinstance() chain to work this way, without changing the interface. The next time you are tempted to check isinstance, try using singledispatch!

by Moshe Zadka at September 17, 2019 01:00 AM

September 10, 2019

Itamar Turner-Trauring

What can a software developer do about climate change?

Pines and firs are dying across the Pacific Northwest, fires rage across the Amazon, it’s the hottest it’s ever been in Paris—climate change is impacting the whole planet, and things are not getting any better. You want to do something about climate change, but you’re not sure what.

If you do some research you might encounter an essay by Bret Victor—What can a technologist do about climate change? There’s a whole pile of good ideas in there, and it’s worth reading, but the short version is that you can use technology to “create options for policy-makers.”

Thing is, policy-makers aren’t doing very much.

So this essay isn’t about technology, because technology isn’t the bottleneck right now, it’s about policy and politics what you can do about it. It’s still written for software developers, because that’s who I write for, but also because software developers often have access to two critical catalysts for political change. And it’s written for software developers in the US, because that’s where I live, and because the US is a big part of the problem.

But before I go into what you can do, let me tell you the story of a small success I happened to be involved in, a small step towards a better future.

Infrastructure and the status quo

About a year ago I spent some of my mornings handing out pamphlets to bicycle riders. I looked like an idiot: in order to show I was one of them I wore my bike helmet, which is weirdly shaped and the color of fluorescent yellow snot.

After finding an intersection with plenty of bicycle riders and a long red light that forces them to stop, I would do the following:

  1. When the light turns red, step into the street and hand out the pamphlet.
  2. Keep an eye out for the light changing to green so that I didn’t get run over by moving cars.
  3. Twiddle my thumbs waiting for the next light cycle.

It was boring, and not very glamorous.

I was one of just many volunteers, and besides gathering signatures we also held rallies, had conversations with city councilors and staff, wrote emails, talked at city council meetings—it was a process. The total effort took a couple of years (and I only joined in towards the end)—but in the end we succeeded.

We succeeded in having the council pass a short ordinance, a city-level law in the city of Cambridge, Massachusetts. The ordinance states that whenever a road that was supposed to have protected bike lanes (per the city’s Bike Plan) was rebuilt from scratch, it would have those lanes built by default.

Now, clearly this ordinance isn’t going to solve climate change. In fact, nothing Cambridge does as a city will solve climate change, because there’s only so much impact 100,000 people can have on greenhouse gas emissions.

But while in some ways this ordinance was a tiny victory in a massive war, if we take a step back it’s actually more important than it seems. In particular, this ordinance has three effects:

  1. Locally, safer bike infrastructure means more bicycle riders, and fewer car drivers. That reduces emissions—a little.
  2. Over time, more bicycle riders can kick off a positive feedback cycle, reducing emissions even more.
  3. Most significantly, local initiatives spread to other cities—kicking off these three effects in those other cities.

Let’s examine these effects one by one.

Effect #1: Fewer cars, less emissions

About 43% of the greenhouse gas emissions in Massachusetts are due to transportation; for the US overall it’s 29% (ref). And that means cars.

The reason people in the US mostly drive cars is because all the transportation infrastructure is built for cars. No bike lanes, infrequent, slow and non-existent buses, no trains… Even in cities, where other means of transportation are feasible, the whole built infrastructure sends the very strong message that cars are the only reasonable way to get around.

If we focus on bicycles, our example at hand, the problem is that riding a bicycle can be dangerous—mostly because of all those cars! But if you get rid of the danger and build good infrastructure—dedicated protected bike lanes that separate bicycle riders from those dangerous cars—then bicycle use goes up.

Consider what Copenhagen achieved between 2008 and 2018 (ref):

                                          2008  2018
# of seriously injured cyclists            121    81
% of residents who feel secure cycling      51    77
% who cycle to work/school                  37    49

With safer infrastructure for bicycles, perception of safety goes up, and people bike more and drive less. Similarly, if you have frequent, fast, and reliable buses and trains, people drive less. And that means less carbon emissions.

In Copenhagen the number of kilometers driven by cars was flat or slightly down over those 10 years—whereas in the US, it’s up 6-7% (ref).

Effect #2: A positive feedback loop

The changes in Copenhagen are a result of a plan the city government there adopted in 2011 (ref): they’re the result of a policy action. And the political will was there in part because there were already a huge number of bicycle riders. So it’s a positive feedback loop, and a good one.

Let’s see how this is happening in Cambridge:

  • Cambridge has a slowly growing number of bicycle riders. This means more political support for bike infrastructure—if there’s a group that can mobilize that support!
  • With the ordinance, more roads will have safe infrastructure. For example, one neighborhood previously had a safe route only in one direction; the other direction will be rebuilt with a protected bike lane in 2020.
  • With safer infrastructure, there will be more bicycle riders, and therefore more support by residents for safer infrastructure. Merely having support isn’t enough, of course, and I’ll get back to that later on.

If Copenhagen can reach 50% of residents with a bicycle commute, so can Cambridge—and the ordinance is a good step in that direction.

Effect #3: The idea spreads

The Cambridge ordinance passed in April 2019—and the idea is spreading elsewhere:

  • The California State Assembly is voting on a law with similar provisions (ref), through a parallel push by Calbike.
  • In May 2019 a Washington DC Council member introduced a bill which among other points has the same rebuild requirements as the Cambridge ordinance (ref).
  • The Seattle City Council passed an ordinance, parts of which were literally copy/pasted from the Cambridge ordinance (ref).

All of this is the result of local advocacy—but I’ve no doubt Cambridge’s example helped. It’s always easier to be the second adopter. And the examples from these larger localities will no doubt inspire other groups and cities, spreading the idea even more.

Change requires politics

Bike infrastructure is just an example, not a solution—but there are three takeaways from this story that I’d like to emphasize:

  • If you want to change policy, you need to engage in politics.
  • Politics are easier to impact on the local level.
  • Local policy changes have a cumulative, larger-scale impact.

By politics I don’t just mean having an opinion or voting for a candidate, but rather engaging in the process of how policy decisions are made.

Merely having an opinion doesn’t change anything. For example, two-thirds of Cambridge residents support building more protected bike lanes (ref). But that doesn’t mean that many protected lanes are getting built—the neighboring much smaller city of Somerville is building far more than Cambridge.

The only reason the city polled residents about bike lanes is because, one suspects, all the fuss we’d been making—emails, rallies, meetings, city council policy orders—made the city staff wonder if bike infrastructure really had a lot of public support or not.

Voting results in some change, but not enough. Elected officials and government staff have lots and lots of things to worry about—if they’re not being pressured to focus on a particular issue, it’s likely to fall behind.

What’s more, the candidates you get to vote for have to get on the ballot, and to do that they need money (for advertising, hiring staff, buying supplies). Lacking money, they need volunteer time.

And it’s much easier for a small group of rich people to provide that support to the candidates they want—so by the time you’re voting, you only get to choose between candidates that have been pre-vetted (I highly recommend reading The Golden Rule to understand how this works on a national level).

What you can do: Become an activist

In the end power is social. Power comes from people showing up to meetings, people showing up for rallies, people going door-to-door convincing other people to vote for the right person or support the right initiative, people blocking roads and making a fuss.

And that takes time and money.

So if you want to change policy, you need to engage in politics, with time and money:

  • You can volunteer for candidates’ political campaigns, as early as possible in the process. Too many good candidates get filtered out before they even make the ballot. That doesn’t mean you can just go home after the election—that’s when the real work of legislation starts, which means activism is just as important.
  • You can volunteer with groups either acting on a particular issue (transportation, housing policy) or more broadly on climate change.
  • Also useful is donating money to political campaigns, both candidates and issue-based organizations.

Here are some policies you might be interested in:

  • Transportation policy determines what infrastructure is built—and the current infrastructure favors privately-owned cars over public transportation and bicycles.
  • Zoning laws determine what gets built and where. Denser construction would reduce the need for long trips, and more efficient buildings (ideally net zero carbon) would reduce emissions from heating and cooling.
  • Moving utilities from private to public ownership, so they can focus on the public good and not on profit.
  • Bulk municipal contracts for electricity: this allows for cheaper electricity for all residents, and to have green energy as the default.
  • State-level carbon restrictions or taxes.

Where you should do it: Start local

If you are going to become an activist, the local level is a good starting point.

  • An easier first step: Cambridge has 100,000 residents—city councilors are routinely elected with just 2500 votes. That means impacting policies here is much easier than at a larger scale. Not only does this mean faster results, it also means you’re less likely to get discouraged and give up—you can see the change happening.
  • Direct impact: A significant amount of greenhouse gas emissions in the US are due to causes that are under control of local governments.
  • Wider impact: As in the case of Cambridge’s ordinance, local changes can be adopted elsewhere.

Of course, local organizing is just the starting point for creating change on the global level. But you have to start somewhere. And global change is a lot easier if you have thousands of local organizations supporting it.

It’s a good to be a software developer

Let’s get back to our starting point—you’re paid to write software, you want to do something about climate change. As a software developer you likely have access to the inputs needed to make political campaigns succeed—both candidate-based and issue-based:

  • Money: Software developers tend to get paid pretty well, certainly better than most Americans. Chances are you have some money to spare for political donations.
  • Time: This one is a bit more controversial, but in my experience many programmers can get more free time if they want to.

If you don’t have children or other responsibilities, you can work a 40-hour workweek, leaving you time for other things. Before I got married I worked full-time and went to a local adult education college half-time in the evenings: it was a lot of work, but it was totally doable. Set boundaries at your job, and you’ll have at least some free time for activism.

You can also negotiate a shorter workweek, which is possible in part because software developers are in such demand. I’ve done this, I’ve interviewed people who have done it, I’ve found many random people on the Internet who have done it—it is possible.

If you need help doing it yourself, I’ve written a book to help you negotiate a shorter workweek. If you want to negotiate a shorter workweek so you have time for political activism, you can use the code FIGHTCLIMATECHANGE to get the book for 60% off.

Some common responses

“There will never be the political will to make this happen”

Things do change, for better and for worse, and sometimes unexpectedly. To give a couple of examples:

  • In Ireland, the Catholic Church went from all-powerful to losing badly, most recently with Ireland legalizing abortion.
  • The anti-gay-marriage Defense of Marriage Act was passed by veto-proof majorities of Congress in 1996—and eight years later in 2004 the first legal gay marriage took place right here in Cambridge, MA.

The timelines for gay marriage and cannabis legalization in the US are illuminating: these things didn’t just happen, it was the result of long, sustained activist efforts, much of it at the local level.

Local changes do make a difference.

“Politics is awful and broken”

So are all our software tools, and somehow we manage to get things done!

“I don’t like your policy suggestions, we should do X instead”

No problem, find the local groups that promote your favorite policies and join them.

“The necessary policies will never work because of problem Y”

Same answer: join and help the local groups working on Y.

“It’s too late, the planet is doomed no matter what we do”

Perhaps, but it’s very hard to say. So we’re in Pascal’s Wager territory here: given even a tiny chance there is something we can do, we had better do our best to make it happen.

And even if humanity really is doomed, there’s always the hope that someday a hyperintelligent species of cockroach will inherit the Earth. And when cockroach archaeologists try to reconstruct our history, I would like them to be able to say, loosely translated from their complex pheromone-and-dancing system of communication: “These meatsacks may not have been as good at surviving as us cockroaches—but at least they tried!”

Time to get started

If you find this argument compelling—that policy is driven by power, and that power requires social mobilization—then it’s up to you to take the next step. Find a local group or candidate pushing for a policy you care about, and show up for the next meeting.

And the meeting after that.

And then go to the rally.

And knock on doors.

And make some friends, and make some changes happen.

Some of the work is fun, some of it is boring, but there’s plenty to do—time to get started!



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

September 10, 2019 04:00 AM

September 09, 2019

Ralph Meijer

XMPP Message Attaching, Fastening, References

Services like Twitter and Slack have functionality that attempts to interpret parts of the plain text of tweets or messages as entered by the user. Pieces of the text that look like links, mentions of another user, hash tags, or stock symbols cause additional meta data to be added to the object representing the message, so that receiving clients can mark up those pieces of text in a special way. Twitter calls this meta data Tweet Entities and, for each piece of interpreted text, it includes indices for the start and end of that text, along with additional information depending on the type of entity. A client can then do in-line replacements at the exact character indices, e.g. by making it into a hyperlink. Twitter Entities served as inspiration for XEP-0372: References.

References can be used in two ways. The first is including a reference as a sibling to the body element of a message. The begin and end attributes then point to the indices of the plain text in the body. This would typically be used if the interpretation of the message is done by the sending client.

Alternatively, a service (e.g. a MUC service) could parse incoming messages and send a separate stanza to mark up the original stanza. In this case you need a mechanism for pointing to that other message. There have been two proposals for this, with slightly differing approaches, and in the examples below, I'll use the proto-XEP Message Fastening. While pointing to the stanza ID of the other message, it embeds a reference element in the apply-to element.

Mentioning another user

Let's start out with the example of mentioning another user.

<message from="room@muc.this.example/Kev" type="groupchat">
  <stanza-id id="2019-09-02-1" by="room@muc.this.example"
             xmlns="urn:xmpp:sid:0"/>
  <body>Some rubbish @ralphm</body>
</message>

A client might render this as:

Kev

Some rubbish @ralphm

The MUC service then parses the plain-text message, and finds a reference to my nickname prefixed with an @-sign, and sends a stanza to the room that marks up the message Kev sent to me.

<message from="room@muc.this.example"
         type="groupchat">
  <stanza-id xmlns="urn:xmpp:sid:0"
             id="2019-09-02-2" by="room@muc.this.example"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-1">
    <reference begin="13" end="19" xmlns="urn:example:reference:0">
      <mention jid="room@muc.this.example/ralphm"/>
    </reference>
  </apply-to>
</message>

This stanza declares that it is attached to the previous message by the stanza ID that was included with the original stanza. In its payload, it includes a reference, referring to the characters 13 through 19. It has a mention child pointing to my occupant JID. Alternatively, the room might have linked to my real JID. A client can then alter the presentation of the original message to use the attached mention reference:

Kev

Some rubbish @ralphm

The characters referencing @ralphm are now highlighted, hovering the mention shows a tooltip with my full name, and clicking on it brings you to a page describing me. This information was not present in the stanza, but a client can use the XMPP URI as a key to present additional information. E.g. from the user's contact list, by doing a vCard lookup, etc.


Note:

The current specification for References does not have defined child elements, but instead uses a type attribute and URIs. However, Jonas Wielicki Schäfer provided some valuable feedback, suggesting this idea. By using a dedicated element for the target of the reference, each can have their own attributes, making it more explicit. Also, it is a natural extension point, by including a differently namespaced element instead.


Referring to previous messages

<message from="room@muc.this.example/Ge0rG"
         type="groupchat">
  <stanza-id xmlns="urn:xmpp:sid:0"
             id="2019-09-02-3" by="room@muc.this.example"/>
  <reference begin="0" end="6" xmlns="urn:example:reference:0">
    <mention jid="room@muc.this.example/ralphm"/>
  </reference>
  <reference begin="26" end="32" xmlns="urn:example:reference:0">
    <message id="2019-09-02-1"/>
  </reference>
  <body>@ralphm did you see Kev's message earlier?</body>
</message>

Unlike before, this example does not point to another stanza with apply-to. Instead, Ge0rG's client added references to go along with the plain-text body: one for the mention of me, and one for a reference to an earlier message.

Ge0rG

@ralphm did you see Kev's message earlier?

Emoji Reactions

Instead of reacting with a full message, Slack, like online forum software much earlier, has the ability to attach emoji reactions to messages.

<message from="room@muc.this.example/Kev"
         type="groupchat">
  <stanza-id xmlns="urn:xmpp:sid:0"
            id="2019-09-02-4" by="room@muc.this.example"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-3">
    <reactions xmlns="urn:example:reactions:0">
      <reaction label=":+1:">👍</reaction>
    </reactions>
  </apply-to>
</message>
<message from="room@muc.this.example/ralphm"
         type="groupchat">
  <stanza-id xmlns="urn:xmpp:sid:0"
             id="2019-09-02-6" by="room@muc.this.example"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-3">
    <reactions xmlns="urn:example:reactions:0">
      <reaction label=":parrot:"
                img="cid:b729aec3f521694a35c3fc94d7477b32bc6444ca@bob.xmpp.org"/>
    </reactions>
  </apply-to>
</message>

These two examples show two separate instances of a person reacting to the previous message by Ge0rG. It uses the protocol from Message Reactions, another proto-XEP. However, I expanded on it by introducing two new attributes. The label attribute allows for a textual shorthand that might be typed by a user. Custom emoji can be represented with the img attribute, which points to a XEP-0231: Bits of Binary object.

Ge0rG

@ralphm did you see Kev's message earlier?

👍 2  :parrot: 1

The attached emoji are rendered below the original message, and hovering over them reveals who the respondents were. Here my own reaction is highlighted by a squircle border.

Including a link

<message from="room@muc.this.example/ralphm" type="groupchat">
  <stanza-id id="2019-09-02-7" by="room@muc.this.example"
             xmlns="urn:xmpp:sid:0"/>
  <body>Have you seen https://ralphm.net/blog/2013/10/10/logitech_t630?</body>
</message>
<message from="room@muc.this.example"
         type="groupchat">
  <stanza-id xmlns="urn:xmpp:sid:0"
             id="2019-09-02-8" by="room@muc.this.example"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-7">
    <reference begin="14" end="61" xmlns="urn:example:reference:0">
      <link url="https://ralphm.net/blog/2013/10/10/logitech_t630"/>
    </reference>
  </apply-to>
</message>

Here the MUC service marks up the original message with an explicit link reference. Possibly, the protocol might be extended so that a service can include shortened versions of the URL for display purposes.

ralphm

Have you seen https://ralphm.net/blog/2013/10/10/logitech_t630?

Logitech Ultrathin Touch Mouse

Logitech input devices are my favorite. This tiny bluetooth mouse is a nice portable device for every day use or while traveling.

The client has used the markup to fetch meta data on the URL and presents a summary card below the original message. Alternatively, the MUC service could have done this using XEP-0385: Stateless Inline Media Sharing (SIMS):

<message from="room@muc.this.example"
         type="groupchat">
  <stanza-id xmlns="urn:xmpp:sid:0"
             id="2019-09-02-8" by="room@muc.this.example"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-7">
    <reference begin="14" end="61" xmlns="urn:example:reference:0">
      <link url="https://ralphm.net/blog/2013/10/10/logitech_t630"/>
      <card xmlns="urn:example:card:0">
        <title>Logitech Ultrathin Touch Mouse</title>
        <description>Logitech input devices are my favorite. This tiny bluetooth mouse is a nice portable device for every day use or while traveling.</description>
      </card>
      <media-sharing xmlns='urn:xmpp:sims:1'>
        <file xmlns='urn:xmpp:jingle:apps:file-transfer:5'>
          <media-type>image/jpeg</media-type>
          <name>ultrathin-touch-mouse-t630.jpg</name>
          <size>23458</size>
          <hash xmlns='urn:xmpp:hashes:2' algo='sha3-256'>5TOeoNI9z6rN5f+cQagnCgxitQE0VUgzCMeQ9JqbhWJT/FzPpDTTFCbbo1jWwOsIoo9u0hQk6CPxH4t/dvTN0Q==</hash>
          <thumbnail xmlns='urn:xmpp:thumbs:1' uri='cid:sha1+21ed723481c24efed81f256c8ed11854a8d47eff@bob.xmpp.org' media-type='image/jpeg' width='116' height='128'/>
        </file>
        <sources>
          <reference xmlns='urn:xmpp:reference:0' type='data' uri='https://test.ralphm.net/images/blog/ultrathin-touch-mouse-t630.jpg' />
        </sources>
      </media-sharing>
    </reference>
  </apply-to>
</message>

Editing a previous message

<message from="room@muc.this.example/ralphm" type="groupchat">
  <stanza-id id="2019-09-02-9" by="room@muc.this.example"
             xmlns="urn:xmpp:sid:0"/>
  <body>Some thoughtful reply</body>
</message>
ralphm

Some thoughtful reply

After sending that message, I want to add a bit more information:

<message from="room@muc.this.example/ralphm" type="groupchat">
  <stanza-id id="2019-09-02-10" by="room@muc.this.example"
             xmlns="urn:xmpp:sid:0"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-9">
    <external name='body'/>
    <replace xmlns='urn:example:message-correct:1'/>
  </apply-to>
  <body>Some more thoughtful reply</body>
</message>

Unlike XEP-0308: Last Message Correction, this example uses Fastening to refer to the original message. I would also lift the restriction on correcting just the last message, but allow any previous message to be edited.

ralphm

Some more thoughtful reply

Upon receiving the correction, the client indicates that the message has been edited. Hovering over the marker reveals when the message was changed.

Editing a previous message that had fastened references

<message from="room@muc.this.example/Kev" type="groupchat">
  <stanza-id id="2019-09-02-11" by="room@muc.this.example"
             xmlns="urn:xmpp:sid:0"/>
  <body>A witty response mentioning @ralphm</body>
</message>
<message from="room@muc.this.example"
         type="groupchat">
  <stanza-id xmlns="urn:xmpp:sid:0"
             id="2019-09-02-12" by="room@muc.this.example"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-11">
    <reference begin="28" end="34" xmlns="urn:example:reference:0">
      <mention jid="room@muc.this.example/ralphm"/>
    </reference>
  </apply-to>
</message>
Kev

A witty response mentioning @ralphm

After a bit of consideration, Kev edits his response:

<message from="room@muc.this.example/Kev" type="groupchat">
  <stanza-id id="2019-09-02-13" by="room@muc.this.example"
             xmlns="urn:xmpp:sid:0"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-11">
    <external name='body'/>
    <replace xmlns='urn:example:message-correct:1'/>
  </apply-to>
  <body>A slightly wittier response mentioning @ralphm</body>
</message>
Kev

A slightly wittier response mentioning @ralphm

Upon receiving the correction, the client discards all fastened references. The body text was changed, so the reference indices are stale. The room can then send a new stanza marking up the new text:

<message from="room@muc.this.example"
         type="groupchat">
  <stanza-id xmlns="urn:xmpp:sid:0"
             id="2019-09-02-14" by="room@muc.this.example"/>
  <apply-to xmlns="urn:xmpp:fasten:0"
            id="2019-09-02-11">
    <reference begin="40" end="46" xmlns="urn:example:reference:0">
      <mention jid="room@muc.this.example/ralphm"/>
    </reference>
  </apply-to>
</message>
Kev

A slightly wittier response mentioning @ralphm

Closing notes

  • Fastening should also gain a way to unfasten explicitly. I think that should use the stanza ID of the stanza that included the earlier fastening. This allows for undoing individual emoji reactions.

  • Unfastening should probably not use the proto-XEP on Message Retraction. That is for retracting the entire original message plus all its fastened items, and invalidating all message references pointing to it.
  • It might make sense to have a separate document describing how to handle stanza IDs, so that all specifications could point to it instead of each having their own algorithm. In different contexts, different IDs might be used. The other proposal for attachments, XEP-0367: Message Attaching, has a section (4.1) on this that might be taken as a start.

  • In the discussion leading up to this post, a large part was about how to handle all these things attached/fastened to messages in message archives. This is not trivial, as you likely don't want to store a sequence of stanzas, but of (original) messages. Each of those messages might then have one or more things fastened to it, and upon retrieval, you want these to come along when retrieving a message. Some of these might be collated, like edits. Some might cause summary counts (emoji, simple polls) with the message itself, and require an explicit retrieval of all the reactions, e.g. when hovering the reaction counts.

    Details on message archive handling are food for a later post. I do think that having a single way of attaching/fastening things to messages makes it much easier to come up with a good solution for archive handling.

  • I didn't provide examples for stanza encryption, but discussions on this suggested that stanzas with fastened items would have an empty apply-to, including the id attribute, so that message archives can do rudimentary grouping of fastened items with the original message.

  • I didn't include examples on Chat Markers, as its current semantics are that a marker sent by a recipient applies to a message and all prior messages. This means the marker isn't really tied to a single message. I think this doesn't match the model for Message Fastening.

by ralphm at September 09, 2019 02:37 PM

August 16, 2019

Twisted Matrix Laboratories

Twisted 19.7.0 Released

On behalf of Twisted Matrix Laboratories and our long-suffering release manager Amber Brown, I am honored to announce1 the release of Twisted 19.7.0!

The highlights of this release include:
  • A full description on the PyPI page!  Check it out here: https://pypi.org/project/Twisted/19.7.0/ (and compare to the slightly sad previous version, here: https://pypi.org/project/Twisted/19.2.1/)
  • twisted.test.proto_helpers has been renamed to "twisted.internet.testing"
    • This removes the gross special-case carve-out where it was the only "public" API in a test module, and now the rule is that all test modules are private once again.
  • Conch's SSH server now supports hmac-sha2-512.
  • The XMPP server in Twisted Words will now validate certificates!
  • A nasty data-corruption bug in the IOCP reactor was fixed. If you're doing high-volume I/O on Windows you'll want to upgrade!
  • Twisted Web no longer gives clients a traceback by default, both when you instantiate Site and when you use twist web on the command line.  You can turn this behavior back on for local development with twist web --display-tracebacks (or in code, as sketched just after this list).
  • Several bugfixes and documentation fixes resolving bytes/unicode type confusion in twisted.web.
  • Python 3.4 is no longer supported.
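
In code, re-enabling tracebacks for local development is a one-line attribute change on the site object; a minimal sketch (the bare Resource here is just a placeholder):

from twisted.web.resource import Resource
from twisted.web.server import Site

site = Site(Resource())
site.displayTracebacks = True  # opt back in to tracebacks while debugging locally
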
pip install -U twisted[tls] and enjoy all these enhancements today!

Thanks for using Twisted,

-glyph

1: somewhat belatedly: it came out 10 days ago.  Oops!

by glyph (noreply@blogger.com) at August 16, 2019 06:38 AM

August 08, 2019

Moshe Zadka

Designing Interfaces

One of the items of feedback I got from the article about interface immutability is that it did not give any concrete feedback for how to design interfaces. Given that they are forever, it would be good to have some sort of guidance.

The first item is that you want something that uses the implementation, as well as several distinct implementations. However, this item is too obvious: in almost all cases I have seen in the wild of a bad interface, this guideline was followed.

It was also followed in all cases of a good interface.

I think this guideline is covered well enough that by the time anyone designs a real interface, they understand that. Why am I mentioning this guideline at all, then?

Because it is important context for the guideline that, I think, actually does distinguish good interfaces from bad ones. That guideline is almost identical to the non-criterion above!

The real guideline is: something that uses the implementation, as well as several distinct implementations that do not share a superclass (other than object, or whatever is at the top of the hierarchy).

This simple addition, preventing the implementations from sharing a superclass, is surprisingly powerful. It means each implementation has to implement the "boring" parts by hand. This will immediately cause pressure to avoid "boring" parts, and instead put them in a wrapper, or in the interface user.

Otherwise, the most common failure mode is that the implementations are all basic variants on what is mostly the "big superclass".

In my experience, just the constraint on not having a "helper superclass" puts appropriate pressure on interfaces to be good.
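
To illustrate (a made-up sketch, not an example from the original post): an IStore interface with an in-memory and a directory-backed implementation. They share nothing but object, so any "boring" duplicated code between them has nowhere to hide, which is exactly the pressure the guideline is meant to create.

import os
from zope.interface import Interface, implementer

class IStore(Interface):

    def put(key, value):
        "Store a string value under key"

    def get(key):
        "Return the string value stored under key"

@implementer(IStore)
class MemoryStore:

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

@implementer(IStore)
class DirectoryStore:

    def __init__(self, path):
        self._path = path

    def put(self, key, value):
        with open(os.path.join(self._path, key), "w") as f:
            f.write(value)

    def get(self, key):
        with open(os.path.join(self._path, key)) as f:
            return f.read()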

(Thanks to Tom Most for his encouragement to write this, and the feedback on an earlier draft. Any mistakes that remain are my responsibility.)

by Moshe Zadka at August 08, 2019 05:20 AM

July 13, 2019

Moshe Zadka

Interfaces are forever

(The following talks about zope.interface interfaces, but applies equally well to Java interfaces, Go interfaces, and probably other similar constructs.)

When we write a function, we can sometimes change it in backwards-compatible ways. For example, we can loosen the type of a variable. We can restrict the type of the return value. We can add an optional argument.

We can even have a backwards compatible path to make an argument required. We add an optional argument, and encourage people to change it. Then, in the next version, we make the default value be one that causes a warning. In a version after that, we make the value required. At each point, someone could write a library that worked with at least two consecutive versions.

In a similar way, we can have a path to remove an argument. First make it optional. Then warn when it is passed in. Finally, remove it and make it an error to pass it in.
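
The "warn before removing" step might look like this sketch (frobnicate and its legacy argument are made up for illustration):

import warnings

def frobnicate(value, *, legacy=None):
    if legacy is not None:
        warnings.warn(
            "the 'legacy' argument is deprecated and will be removed",
            DeprecationWarning,
            stacklevel=2,
        )
    return value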

As long as we do not intend to support inheritance, making backwards compatible changes to classes also works. For example, to remove a method we first have a version that warns when you call it, and then remove it in a succeeding version.

However, what changes can we make to an interface?

Assume we have an interface like:

from zope.interface import Interface, implementer

class IFancyFormat(Interface):

    def fancify_int(value: int) -> str:
        pass

It is a perfectly reasonable, if thin, interface. Implementing it seems like fun:

import attr

@implementer(IFancyFormat)
@attr.s(auto_attribs=True)
class FancySuffixer:
    suffix: str

    def fancify_int(self, value: int) -> str:
        return str(value) + self.suffix

Using it also seems like fun:

def dashify_fancy_five(fancifier: IFancyFormat) -> str:
    return f"---{fancifier.fancify_int(5)}---"

These are very different kinds of fun, though! Probably the kind of fun that appeals to different people. The first implementation is in the superfancy open-source library. The second one is in the dash_five open-source library. Such is the beauty of open source: it takes all kinds of people.

We cannot add a method to IFancyFormat: the superfancy library has a unit test that uses verifyObject, which will fail if we add a method. We cannot remove the method fancify_int, since this will break dash_five: the mypy check will fail, since IFancyFormat will not have that method.

Similarly, we cannot make the parameter optional without breaking superfancy, or loosen the return type without breaking dash_five. Once we have published IFancyFormat as an API, it cannot change.

The only way to recover from a bad interface is to create a new interface, IAwesomeFancyFormat. Then write conversion functions from and to IFancyFormat and IAwesomeFancyFormat. Then deprecate using the IFancyFormat interface. Finally, we can remove the interface. Then we can alias IFancyFormat = IAwesomeFancyFormat, and eventually, maybe even deprecate the name IAwesomeFancyFormat.

When publishing interfaces, one must be careful: to a first approximation, they are forever.

(Thanks to Glyph Lefkowitz for his helpful suggestions. Any mistakes or issues that are left are my responsibility.)

by Moshe Zadka at July 13, 2019 05:00 AM

June 14, 2019

Glyph Lefkowitz

Toward a “Kernel Python”

Prompted by Amber Brown’s presentation at the Python Language Summit last month, Christian Heimes has followed up on his own earlier work on slimming down the Python standard library, and created a proper Python Enhancement Proposal PEP 594 for removing obviously obsolete and unmaintained detritus from the standard library.

PEP 594 is great news for Python, and in particular for the maintainers of its standard library, who can now address a reduced surface area. A brief trip through the PEP’s rogues gallery of modules to deprecate or remove1 is illuminating. The python standard library contains plenty of useful modules, but it also hides a veritable necropolis of code, a towering monument to obsolescence, threatening to topple over on its maintainers at any point.

However, I believe the PEP may be approaching the problem from the wrong direction. Currently, the standard library is maintained in tandem with, and by the maintainers of, the CPython python runtime. Large portions of it are simply included in the hope that it might be useful to somebody. In the aforementioned PEP, you can see this logic at work in defense of the colorsys module: why not remove it? “The module is useful to convert CSS colors between coordinate systems. [It] does not impose maintenance overhead on core development.”

There was a time when Internet access was scarce, and maybe it was helpful to pre-load Python with lots of stuff so it could be pre-packaged with the Python binaries on the CD-ROM when you first started learning.

Today, however, the modules you need to convert colors between coordinate systems are only a pip install away. The bigger core interpreter is just more to download before you can get started.

Why Didn’t You Review My PR?

So let’s examine that claim: does a tiny module like colorsys “impose maintenance overhead on core development”?

The core maintainers have enough going on just trying to maintain the huge and ancient C codebase that is CPython itself. As Mariatta put it in her North Bay Python keynote, the most common question that core developers get is “Why haven’t you looked at my PR?” And the answer? It’s easier to not look at PRs when you don’t care about them. This from a talk about what it means to be a core developer!

One might ask whether Twisted has the same problem. Twisted is a big collection of loosely-connected modules too; a sort of standard library for networking. Are clients and servers for SSH, IMAP, HTTP, TLS, et al. all a bit much to try to cram into one package?

I’m compelled to reply: yes. Twisted is monolithic because it dates back to a similar historical period as CPython, where installing stuff was really complicated. So I am both sympathetic and empathetic towards CPython’s plight.

At some point, each sub-project within Twisted should ideally become a separate project with its own repository, CI, website, and of course its own more focused maintainers. We’ve been slowly splitting out projects already, where we can find a natural boundary. Some things that started in Twisted like constantly and incremental have been split out; deferred and filepath are in the process of getting that treatment as well. Other projects absorbed into the org continue to live separately, like klein and treq. As we figure out how to reduce the overhead of setting up and maintaining the CI and release infrastructure for each of them, we’ll do more of this.


But is our monolithic nature the most pressing problem, or even a serious problem, for the project? Let’s quantify it.

As of this writing, Twisted has 5 outstanding un-reviewed pull requests in our review queue. The median time a ticket spends in review is roughly four and a half days.2 The oldest ticket in our queue dates from April 22, which means it’s been less than 2 months since our oldest un-reviewed PR was submitted.

It’s always a struggle to find enough maintainers and enough time to respond to pull requests. Subjectively, it does sometimes feel like “Why won’t you review my pull request?” is a question we do still get all too often. We aren’t always doing this well, but all in all, we’re managing; the queue hovers between 0 at its lowest and 25 or so during a bad month.

By comparison to those numbers, how is core CPython doing?

Looking at CPython’s keyword-based review queue, we can see that there are 429 tickets currently awaiting review. The oldest PR awaiting review hasn’t been touched since February 2, 2018, which means it has been waiting for almost 500 days.

How many are interpreter issues and how many are stdlib issues? Clearly review latency is a problem, but would removing the stdlib even help?

For a quick and highly unscientific estimate, I scanned the first (oldest) page of PRs in the query above. By my subjective assessment, on this page of 25 PRs, 14 were about the standard library, 10 were about the core language or interpreter code; one was a minor documentation issue that didn’t really apply to either. If I can hazard a very rough estimate based on this proportion, somewhere around half of the unreviewed PRs might be in standard library code.


So the first reason the CPython core team needs to stop maintaining the standard library is that they literally don’t have the capacity to maintain it. Or to put it differently: they aren’t maintaining it, and what remains is to admit that and start splitting it out.

It’s true that none of the open PRs on CPython are in colorsys[3]. It does not, in fact, impose maintenance overhead on core development. Core development imposes maintenance overhead on it. If I wanted to update the colorsys module to be more modern - perhaps to have a Color object rather than a collection of free functions, perhaps to support integer color models - I’d likely have to wait 500 days, or more, for a review.
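
To make that hypothetical concrete, here is a minimal sketch of what an object-style wrapper over today’s colorsys free functions might look like. The Color class and its method names are invented purely for illustration; nothing like it has been proposed or reviewed.

import colorsys
from dataclasses import dataclass

@dataclass(frozen=True)
class Color:
    """A hypothetical object-style API on top of colorsys's free functions."""
    red: float
    green: float
    blue: float

    def as_hsv(self):
        # Delegates to the free function that exists in colorsys today.
        return colorsys.rgb_to_hsv(self.red, self.green, self.blue)

    @classmethod
    def from_hsv(cls, hue, saturation, value):
        return cls(*colorsys.hsv_to_rgb(hue, saturation, value))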

As a result, code in the standard library is harder to change, which means its users are less motivated to contribute to it. CPython’s unusually infrequent releases also slow down the development of library code and decrease the usefulness of feedback from users. It’s no accident that almost all of the modules in the standard library have actively maintained alternatives outside of it: it’s not a failure on the part of the stdlib’s maintainers. The whole process is set up to produce stagnation in all but the most frequently used parts of the stdlib, and that’s exactly what it does.

New Environments, New Requirements

Perhaps even more important is that bundling CPython together with the definition of the standard library privileges CPython itself, and the use-cases it supports, above every other implementation of the language.

Podcast after podcast after podcast after keynote tells us that in order to keep succeeding and expanding, Python needs to grow into new areas: particularly web frontends, but also mobile clients, embedded systems, and console games.

These environments require one or both of:

  • a completely different runtime, such as Brython or MicroPython
  • a modified, stripped-down version of the standard library that elides most of it.

In all of these cases, determining which modules have been removed from the standard library is a sticking point. They have to be discovered by a process of trial and error; notably, a process completely different from the standard process for determining dependencies within a Python application. There’s no install_requires declaration you can put in your setup.py that indicates that your library uses a stdlib module that your target Python runtime might leave out due to space constraints.
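
For comparison, here is a rough sketch of how an ordinary third-party dependency is declared today; the project name is a placeholder. The point is that there is no analogous line you could write for a stdlib module like colorsys:

# setup.py: a minimal sketch of declaring dependencies on PyPI packages.
from setuptools import setup

setup(
    name="example-app",        # placeholder name, purely illustrative
    version="0.1",
    install_requires=[
        "requests >= 2.0",     # third-party packages can be declared here...
        # "colorsys",          # ...but there is no way to declare stdlib modules.
    ],
)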

You can have this problem even if all you ever use is the standard Python on your Linux installation. Even server- and desktop-class Linux distributions have the same need for a more minimal core Python package, and so they already chop up the standard library somewhat arbitrarily. This can break the expectations of many Python codebases, and result in bugs where even pip install won’t work.

Take It All Out

How about the suggestion that we should do only a little a day? Although it sounds convincing, don’t be fooled. The reason you never seem to finish is precisely because you tidy a little at a time. [...] The ultimate secret of success is this: If you tidy up in one shot, rather than little by little, you can dramatically change your mind-set.

— Kondō, Marie.
“The Life-Changing Magic of Tidying Up”
(p. 15-16)

While incremental slimming of the standard library is a step in the right direction, incremental change can only get us so far. As Marie Kondō says, when you really want to tidy up, the first step is to take everything out so that you can really see everything, and put back only what you need.

It’s time to thank those modules which do not spark joy and send them on their way.

We need a “kernel” version of Python that contains only the most absolutely minimal library, so that all implementations can agree on a core baseline that gives you a “python”, and applications, even those that want to run on web browsers or microcontrollers, can simply state their additional requirements in terms of requirements.txt.

Now, there are some business environments where adding things to your requirements.txt is a fraught, bureaucratic process, and in those places, a large standard library might seem appealing. But “standard library” is a purely arbitrary boundary that the procurement processes in such places have drawn, and an equally arbitrary line may be easily drawn around a binary distribution.

So it may indeed be useful for some CPython binary distributions — perhaps even the official ones — to still ship with a broader selection of modules from PyPI. Even for the average user, you’d at the very least need enough of the stdlib for pip to bootstrap itself, so it can install the other modules you need for development!

It’s already the case, today, that pip is distributed with Python, but isn’t maintained in the CPython repository. What the default Python binary installer ships with is already a separate question from what is developed in the CPython repo, or what ships in the individual source tarball for the interpreter.

In order to use Linux, you need bootable media with a huge array of additional programs. That doesn’t mean the Linux kernel itself is in one giant repository, where the hundreds of applications you need for a functioning Linux server are all maintained by one team. The Linux kernel project is immensely valuable, but functioning operating systems which use it are built from the combination of the Linux kernel and a wide variety of separately maintained libraries and programs.

Conclusion

The “batteries included” philosophy was a great fit for the time when it was created: a booster rocket to sneak Python into the imagination of the programming public. As the open source and Python packaging ecosystems have matured, however, this strategy has not aged well, and like any booster, we must let it fall back to earth, lest it drag us back down with it.

New Python runtimes, new deployment targets, and new developer audiences all present tremendous opportunities for the Python community to soar ever higher.

But to do it, we need a newer, leaner, unburdened “kernel” Python. We need to dump the whole standard library out on the floor, adding back only the smallest bits that we need, so that we can tell what is truly necessary and what’s just nice to have.

I hope I’ve convinced at least a few of you that we need a kernel Python.

Now: who wants to write the PEP?

🚀

Acknowledgments

Thanks to Jean-Paul Calderone, Donald Stufft, Alex Gaynor, Amber Brown, Ian Cordasco, Jonathan Lange, Augie Fackler, Hynek Schlawack, Pete Fein, Mark Williams, Tom Most, Jeremy Thurgood, and Aaron Gallagher for feedback and corrections on earlier drafts of this post. Any errors of course remain my own.


  1. sunau, xdrlib, and chunk are my personal favorites. 

  2. Yeah, yeah, you got me, the mean is 102 days. 

  3. Well, as it turns out, one is on colorsys, but it’s a documentation fix that Alex Gaynor filed after reviewing a draft of this post so I don’t think it really counts. 

by Glyph at June 14, 2019 04:51 AM

June 06, 2019

Twisted Matrix Laboratories

Twisted 19.2.1 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 19.2.1!

This is a security release, and contains the following changes:
  • All HTTP clients in twisted.web.client now raise a ValueError when called with a method and/or URL that contain invalid characters. This mitigates CVE-2019-12387. Thanks to Alex Brasetvik for reporting this vulnerability.
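
As a rough illustration of the new behaviour (a sketch, assuming the ValueError is raised synchronously when the client method is called, as the wording above suggests), a request whose URI smuggles a CR/LF is now rejected up front instead of going out on the wire:

from twisted.internet import reactor
from twisted.web.client import Agent

agent = Agent(reactor)
try:
    # Sketch: a CRLF embedded in the URI is a header-injection attempt,
    # and is assumed here to be rejected synchronously by request().
    agent.request(b"GET", b"http://example.com/\r\nX-Injected: 1")
except ValueError as exc:
    print("rejected:", exc)
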
It is recommended you update to this release as soon as is practical.

Additional mitigation may be required if Twisted is not your only HTTP client library:
You can find the downloads at <https://pypi.python.org/pypi/Twisted> (or alternatively <http://twistedmatrix.com/trac/wiki/Downloads>). The NEWS file is also available at <https://github.com/twisted/twisted/blob/twisted-19.2.1/NEWS.rst>.

Twisted Regards,
Amber Brown (HawkOwl)

by Anonymous (noreply@blogger.com) at June 06, 2019 02:49 PM

June 03, 2019

Hynek Schlawack

Python in Azure Pipelines, Step by Step

Since the acquisition of Travis CI, the future of their free offering is unclear. Azure Pipelines has a generous free tier, but the examples I found are discouragingly complex and take advantage of features like templating that most projects don’t need. To close that gap, this article shows you how to move a Python project with simple CI needs from Travis CI to Azure Pipelines.

by Hynek Schlawack (hs@ox.cx) at June 03, 2019 09:14 AM

May 28, 2019

Moshe Zadka

Analyzing the Stack Overflow Survey

The Stack Overflow Survey Results for 2019 are in! There is some official analysis, which mentioned some things that mattered to me, and some that did not. I decided to dig into the data and see if I could find some things that would potentially interest my readership.

import csv, collections, itertools
with open("survey_results_public.csv") as fpin:
    reader = csv.DictReader(fpin)
    responses = list(reader)
len(responses)
88883

Wow, almost 90K respondents! This is the sweet spot of "enough to make meaningful generalizations" while still being small enough to analyze with rudimentary tools, not big-data-ware.

pythonistas = [x for x in responses if 'Python' in x['LanguageWorkedWith']]
len(pythonistas)/len(responses)
0.41001091322300104

About 40% of the respondents use Python in some capacity. That is pretty cool! This is one of the things where I wonder if there is bias in the source data. Are people who use Stack Overflow, or respond to surveys for SO, more likely to be the kind of person who uses Python? Or less?

In any case, I am excited! This means my favorite language, for all its issues, is doing well. This is also a good reminder that we need to think about the consequences of our decisions on a big swath of developers we will never ever meet.

opensource = collections.Counter(x['OpenSourcer'] for x in pythonistas)
sorted(opensource.items(), key=lambda x:x[1], reverse=True)
[('Never', 11310),
 ('Less than once per year', 10374),
 ('Less than once a month but more than once per year', 9572),
 ('Once a month or more often', 5187)]
opensource['Once a month or more often']/len(pythonistas)
0.1423318607139917

Python is open source. Almost all important libraries (Django, Pandas, PyTorch, requests) are open source. Many important tools (Jupyter) are open source. Yet the proportion of people who contribute to them with any kind of regular cadence is less than 15%.

general_opensource = collections.Counter(x['OpenSourcer'] for x in responses)
sorted(general_opensource.items(), key=lambda x:x[1], reverse=True)
[('Never', 32295),
 ('Less than once per year', 24972),
 ('Less than once a month but more than once per year', 20561),
 ('Once a month or more often', 11055)]

The Python community does compare well to the general populace, though!

devtype = collections.Counter(itertools.chain.from_iterable(x["DevType"].split(";") for x in pythonistas))
devtype['DevOps specialist']/len(responses)
0.052282213696657406

About 5% of total respondents are my peers: using Python for DevOps. That is pretty exciting! My interest in that is not merely theoretical: my upcoming book targets that crowd.

general_devtype = collections.Counter(itertools.chain.from_iterable(x["DevType"].split(";") for x in responses))
general_devtype['DevOps specialist']/len(responses), devtype['DevOps specialist']/len(pythonistas)
(0.09970410539698255, 0.12751420025793705)

In general, DevOps specialists are 10% of respondents.

devtype['DevOps specialist']/general_devtype['DevOps specialist']
0.524373730534868

Over 50% of DevOps specialists use Python!

def safe_int(x):
    try:
        return int(x)
    except ValueError:
        return -1

intermediate = sum(1 for x in pythonistas if 1<=safe_int(x['YearsCode'])<=5)

My next hush-hush (for now!) project is going to be targeting intermediate Python developers. I wish I could slice by "number of years writing in Python", but this is the best I could do. (I treat "NA" responses as "not intermediate". This is OK, since I prefer to underestimate rather than overestimate.)

intermediate/len(responses)
0.11346376697456206

11%! Not bad.

general_intermediate = sum(1 for x in responses if 1<=safe_int(x['YearsCode'])<=5)
intermediate/len(pythonistas), general_intermediate/len(responses)
(0.27673352907279863, 0.2671264471271222)

It seems like using Python does not change the chances of someone being intermediate by much.

Summary

  • 40% of respondents use Python. Python is kind of a big deal.
  • 5% of respondents use Python for DevOps. This is a lot! DevOps as a profession is less than 10 years old.
  • 11% of respondents are intermediate Python users. My previous book targets this crowd.

(Thanks to Robert Collins and Matthew Broberg for their comments on an earlier draft. Any remaining issues are purely my responsibility.)

by Moshe Zadka at May 28, 2019 05:20 AM

May 16, 2019

Moshe Zadka

Inbox Zero

I am the parent of two young kids. It is easy to sink into random stuff, and not follow up on goals. Strict time management and prioritization means I get to work on open source projects, write programming books and update my blog with a decent cadence. Since a lot of people were asking me how to do it, I wanted to share my methodology. The following is descriptive, not prescriptive.

One thing I am proud of is that the initial draft for the post was written a year ago. I have done my edits for clarity, but found that my description of the process, for the most part, has remained the same. This made me confident that it is time to publish: this process has existed in its current form for at least a year, and I believe almost two years. This is not some fad diet for me: this process has proved its worth.

Glyph has already written at length about how a full Inbox is a sign of misprioritized tasks. Saying "no" is one example (in other words, prioritizing away). But when saying "yes", it is a good idea to know when it can be done, when you should give up (and potentially apologize), and when you should give a heads-up that it is being delayed.

His description, being more high-level, is prescriptive. The follow-up is the process I use, shaped by those general ideas.

The tool I use is TODOist. The first time I tried it, I decided it lacked some necessary features. I still feel this way -- about the free version. The free version is completely unusable. The premium version is perfectly usable.

The salient features of TODOist, that the rest of the explanation depends on, are:

  • Android integration. I use Android on my phone, and depend on good phone support. TODOist has a widget which lets me add a task without waiting for an app-launch. It integrates with Google Assistant -- it is possible to configure all "Note to self" to be new task creations. Finally, it integrates with the "Share" menu, so sharing things can create tasks.
  • E-mail integration: a customized e-mail address which opens a task for each e-mail
  • Browser plugin: add a task without opening the site, as well as "Add website as task" for current page.
  • A task can have arbitrary attachments.

E-mail scan process

I read e-mail "when I get around to it". Usually several times a day. I do have notifications enabled on my phone, so I can easily see if the e-mail is urgent. Otherwise, I just ignore the notification.

When I do go through my e-mail, I follow the rules:

  • If it's obvious there is no task, archive
  • If it's something short, obvious and I have the time, do it and archive. However, if I find in the middle that I am wrong about it being short and obvious, I abort. Usually it is obvious if an e-mail will require a lengthy research project. The most common way of being wrong is when, while responding, I find myself getting too emotional. I have trained myself to consider this as a trigger for aborting.
  • Otherwise, I "Forward" and send it to the TODOist auto-task e-mail -- and then immediately archive. The forwarded message, having literally all the words in the original, is enough information to search for the original in my archive.

Browser

The only "permanently" open tabs in a browser should be "communication" tabs: FB messenger, whatsapp, slack, etc. If any other tab feels like it would be bad to close, I create a task from it. Whenever the tabs become too small to read the titles (Chrome) or start to need scrolling (Firefox), I go through every non-communication tab and either verify that it is OK to close, or create a task from it and then close it.

My usual research task takes several tabs (Python documentation, StackOverflow, GitHub pull requests, tickets and more), so tab accumulation happens naturally, thus triggering the garbage collection process.

Reviewing tasks

Clean triage

This is a daily task, to go to the filter "triage" and clean it out. The filter is defined as "not marked 'time permitting' and does not have a due date". Since tasks come in without marking or due date, this is a filter for tasks that come in. The task is "done" when the filter is empty. Any task that actually needs to get done will get Scheduled with a due date. Note that this due date is not a real "due": it is when I plan to do it. This will get determined based on the task, on my available time, and when other tasks got scheduled.

Otherwise, the task is marked "time permitting". This means, in real terms, that I will probably never get around to it. This is fine -- and it feels nicer than archiving or deleting the task. It allows me to be less FOMO when doing the triage.

Occasionally, an external trigger will rescue a task from the "time permitting" graveyard.

Rebalance

Rebalance means that I do not want to have an empty day, followed by an avalanche day: I'll be as carefree as the grasshopper that day, watch TV and frolic, and then drown in tasks the next.

I look ahead, and if I see a day with less than 5-6 tasks, I will move some tasks forward to get done sooner. I do not worry about the opposite. If there are too many tasks one day, they'll naturally get postponed.

Non-meta Tasks

I treat the due date as an "ETA". I try to do all tasks due a given day on that day. If there is an objective deadline, e.g. a CFP that closes on a date, that deadline will be in human readable form on the task.

If I am too tired, or cannot handle more load, I start rescheduling "today" tasks. This process will take into consideration the "objective" deadlines, if any. It will also take into account the subjective value of the task to me.

Any task that gets postponed "too many times" gets moved to "time permitting".

Dependencies

Humans are social creatures. Some tasks, I cannot do alone. For example, when publishing a blog post, I like to have some trusted people review it. This means that I need their feedback.

When I need something from someone, that's a task. The task is to use that thing. The due date is the date to poke them about the delivery of the thing. Because I try to build in a buffer, I can be nice about it: I am endlessly patient, sending e-mails asking "let me know how it is going".

Some people are also busy. If someone tells me "I'll give it to you in a week", I make a task to ask them about it in a week. If they deliver, they will never know: the task gets done when I get what I need. If not, I'll mention, gently, "hey, it's been a week, wondering if there's an update."

Some people, for good or bad reasons, do not deliver. Then I have the task of deciding what to do about it. Sometimes I'll ask someone else for help. Sometimes I'll do it myself. Sometimes I'll drop it. Whatever it is, it was my explicit decision.

Spoon Management

If there are too many tasks, and I feel overwhelmed, I will start postponing any non-urgent tasks. Sometimes, this means I will postpone everything. If I lack the spoons, I lack the spoons. I do not feel guilt about it.

Summary

Inbox Zero is possible. Not only that: Inbox Zero, I have found, is easy. Doing everything I want to do is not easy. But the meta-process of deciding what I want to do, and what I am going to say "no" to or flake on, is easy.

This leads to less anxiety. I do what I can, and decide that this is enough. I am kind to myself. Be kind to yourself. Go Inbox Zero.

(Thanks to Shae Erisson for his feedback. Any issues that remain are my responsibility.)

by Moshe Zadka at May 16, 2019 04:45 AM

May 15, 2019

Hynek Schlawack

The Price of the Hallway Track

There are many good reasons to not go to every talk possible when attending conferences. However, it has increasingly become hip to boast about avoiding talks – and to encourage others to follow suit. As a speaker, that rubs me the wrong way, and I’ll try to explain why.

by Hynek Schlawack (hs@ox.cx) at May 15, 2019 06:00 PM

Itamar Turner-Trauring

Learning negotiation from Jane Austen

Looking for a job as a software developer can be scary, exhausting, or overwhelming. Where you apply and how you interview impact whether you’ll get a job offer, and how good it will be, so in some sense the whole job search is a form of negotiation.

So how do you learn to make a good impression, to convince people of your worth, to get picked by the job you want? There are many skills to learn, and in this article I’d like to cover one particular subset.

Let us travel to England, some 200 years in the past, and see what we can learn.

Jane Austen, Game Theorist

What does a novelist writing in the early 19th century have to do with getting a programming job?

In his book Jane Austen, Game Theorist, Michael Suk-Young Chwe argues quite convincingly that Austen’s goal in writing her books is to teach strategic thinking: understanding what and why people do what they do, and how to interact with them accordingly, in order to achieve the outcomes you want.

Strategic thinking is a core skill in negotiation: you’re trying to understand what the other side wants (even if they don’t explicitly say it), and to find a way to use that to get what you want. The hiring manager might want someone who both understands their particular technical domain and can help a team grow, whereas you might want a higher salary, or a shorter workweek. Strategic thinking can help you use the one to achieve the other.

Strategic thinking is of course a useful skill for anyone, but why would Jane Austen in particular care about strategic thinking? To answer that we need a little historical context.

The worst job search ever

Imagine you could only get one job your whole life, that leaving your job was impossible, and that you’d be married to your boss. This is the “job search” that Austen faced in her own life, and is one of the main topics covered in her books.

Austen’s own family, and the people she writes about, were part of a very small and elite minority. Even the poorest of the families Austen writes about have at least one servant, for example.

While the men of the English upper classes, if they were not sufficiently wealthy, could and did work—as lawyers, doctors, officers—their wives and daughters for the most part could not. So if they weren’t married and didn’t have sufficient wealth of their own, upper-class women had very few choices—they could live off money from relations, or take on the social status loss of becoming a governess.

Marriage was therefore the presumed path to social status, economic security, and of course it determined who they would live with for the rest of their lives (divorce was basically impossible).

Finding the right husband was very important. And getting that husband—who had all the legal and social authority—to respect their wishes after marriage was just as important. And of course the women who didn’t marry lived at the mercy of the family members who supported them.

And that’s where strategic thinking comes in: it was a critical skill for women in Austen’s class and circumstances.

Learning from Austen

If, as Michael Chwe argues, Austen’s goal with her books is to teach strategic thinking, how can you use them to improve your negotiation skills?

All of Austen’s books are worth reading—excepting the unfortunate Mansfield Park—but for educational purposes Northanger Abbey is a good starting point. Northanger Abbey is the story of Catherine, a naive young woman, and how she becomes less naive and more strategic.

Instead of just reading it as an entertaining novel, you can use it to actively practice your own strategic understanding:

  1. In every social interaction, Catherine has a theory about other people’s motivations, why they’re doing or saying certain things.
  2. Notice the assumptions underlying her theory, and then come up with your alternative theory or explanation for other characters’ actions.
  3. Then, compare both theories as the plot unfolds and you learn more.

Other characters also offer a variety of opportunities to see strategic thinking—or lack of it—in action. Once you’ve gone through the book and experienced the growth of Catherine’s strategic thinking, start practicing those skills in your life.

Why are your coworkers, family, and friends doing what they’re doing? Do they have the same motivations, goals, and expectations that you do? The more you pay attention and compare your assumptions to reality, the more you’ll learn—and the better you’ll do at your next job interview.

Ready to get started? You can get a paper copy from the library, or download a free ebook from Project Gutenberg.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

May 15, 2019 04:00 AM

May 09, 2019

Itamar Turner-Trauring

Part-time software developer jobs don't exist, right?

If you’re tired of working long hours, a part-time—or even just 4 days a week—programming job seems appealing. You’ll still get paid, you’ll still hopefully enjoy your job—but you’ll also have more time for other things in your life.

Hypothetically you could negotiate for more free time, but obviously no company would ever agree to a shorter workweek, right?

And indeed there are plenty of people—on Hacker News especially—who will explain to you in great detail why this can’t be done, that no manager would ever agree to this, that it’s a logical impossibility, a mirage, a delusion, not even worth considering.

But—

The fact is there are quite a few software developers who work less than full-time. And to help convince you, I figured I would share just a few of the examples I know of.

I’ve done it

Personally I’ve worked at three different software jobs at between 28 and 35 hours a week. And before that, when I left my last full-time job, my manager offered to help me find a part-time job there so that I would stay.

People who have read my book have done it

Since I appreciated having a shorter workweek so much, I ended up writing a book about negotiating a 3-day weekend, and a number of people who read my book have successfully done so.

I could share quotes from people who did it, and the sales page above includes just some of them, but you might feel that lacks a little credibility. So let’s move on—

People I’ve interviewed have done it

I also interviewed a number of people for the book, including a guy by the name of Mike who has been working 4 days a week for 15 years now. You can read the full interview with Mike if you want to get his perspective.

But he’s just one person, so let’s move on to the final category: random people on the Internet.

Random people on the Internet have done it

Here’s just a sample:

pushcx on lobste.rs: “I’ve worked part-time for about six years of my career.”

Seitsebb on lobste.rs: “I work four days a week and can recommend it.”

stsp on lobste.rs: “I was fortunate enough to be able to negotiate [Fridays off] while employed and it had a very positive impact on both my work and quality of life in general.”

acflint on dev.to: “I negotiated a 4 day weekend so I could spend time on my side project … and enjoy life more.”

autarch on Hacker News: “As part of my negotiations for my current job, I negotiated a 4-day (32 hour) work week. I take Fridays off and do my own projects and volunteer work.”

Boycy on Hacker News: “I asked my then employer if I could drop to 4 days a week, pro-rata, and was surprised when the answer was yes!”

notacoward on Hacker News: “When I reduced my hours, I was amused to notice that everyone from the VP who approved it down to the person in HR who handled the paperwork said they wished they could do the same. I told them all that they could.”

lubonay on Hacker News: “I worked on a 4-day week for about a year between 2017 and 2018 for a small consultancy company.”

duckworthd on Hacker News: “I’ve been working a 4 day/week schedule for 1.5 years now.”

I could go on, but no doubt this is getting repetitive.

You can do it too

Want to join us and get more time for yourself?

For most programmers, the easiest place to negotiate a 3-day weekend is at your current job.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

May 09, 2019 04:00 AM

April 30, 2019

Itamar Turner-Trauring

The new parent's guide to surviving a programming job

Working as a programmer will keep you busy; parenting a baby is a massive amount of work. Doing both at once isn’t easy!

While it does get better with time, there are ways you can make it calmer, simpler, and easier in the short term. Based on my experience as a new parent and programmer, in the rest of this article I’ll cover:

  1. Mitigating sleep deprivation.
  2. Dealing with limited work hours.
  3. Other random tips.

Sleep deprivation is terrible

Sleep deprivation is awful. It makes you less focused, more irritable and cranky, and in many ways it’s similar to being drunk. When done deliberately, sleep deprivation is literally a form of torture.

If you’re lucky your child will start sleeping through the night after a few months, but (from personal experience) not everyone is so lucky. So here are some ways to deal with lack of sleep.

Be kind

The irritability that results from sleep deprivation is going to impact all of your relationships—with your spouse/partner, your friends, and your coworkers. Keep in mind that you are going to get annoyed more easily, and try to compensate.

Remind yourself that the reason you’re so annoyed by the code you’re reviewing is probably nothing to do with your coworker’s skill, and everything to do with being woken up at 1AM, 3AM, and finally at 5AM.

Compensate for cognitive impairment

Besides being irritable, you are also cognitively impaired—you’re less smart than you usually are. You can compensate for this in a variety of ways:

  • Spend more time planning up front than you usually would; you’re more likely to forget about important details otherwise.
  • Avoid writing complex code, since you’ll have an even harder time than usual keeping it in your head. Figuring out a simpler solution may take longer, but it’s worth it.
  • Keep a “lab notebook”: write down what you’re planning on doing next, what you’ve already done, and status notes. This will help mitigate the memory problems from lack of sleep. It will also help you deal better with interruptions, and to get going at the start of the work day when you’ve already been “awake” for 7 hours and you can’t remember what or why you’re at the office.

Your time is limited

Even if you used to work longer hours (and you really shouldn’t have), you really shouldn’t be working long hours as a new parent. That means:

  • Learning should be done not at home but on the job, which in any case is the best place to learn new skills.
  • You need to learn how to say no to your boss, and how to set boundaries in general.
  • Learn to prioritize. Only the truly most important things should be done first. Everything else will be done next—and if you don’t reach it, that’s OK, it was less important. “But this is almost as important!” Nope, not happening. “It would be really nice…” No.

Other advice

Pumping milk at the office: Pumping milk multiple times a day at the office can be time consuming. If your baby is healthy, I’m told you can pump once in the morning, stick the equipment in the fridge without cleaning it, and then pump a second time later in day. This saves you one cleaning cycle at the office.

(I am not a medical professional, ask your pediatrician first before doing this.)

Working at home: Some babies, I’m told, will just lie there happily babbling to themselves while you work. If you have the other kind of baby, the kind that screams continuously if they’re not held, you might be able to get a little work done at home by putting them in a baby carrier and using a standing desk.

A shorter workweek: Even if you’re not working long hours, a full-time 40-hours-a-week job may still be too much as a new parent. You can often negotiate a shorter workweek at your existing job fairly easily.

What really matters to you?

However efficient you are, having a child is going to take up a whole lot of time. And that means you’re going to have to make some choices about priorities: what things really matter to you? Where do you really want to spend your time?

It’s a personal choice—I am glad I got to work part-time and take care of my kid the rest of the time, but I would hate to take care of a baby full-time. Your preferences may well be different.

But whatever you decide, just remember you need to choose: you can’t do everything.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

April 30, 2019 04:00 AM

April 12, 2019

Itamar Turner-Trauring

Can software engineering be meaningful work?

The world is full of problems—from poverty to climate change—and it seems like software ought to be able to help. And yet your own programming job seems pointless, doing nothing to make things better. Far too many jobs are just about making some rich people just a little bit richer.

So how can you do something meaningful with your life?

There are no easy answers, but here’s a starting point, at least.

Don’t make things worse

Even beyond your moral obligations, working on something you actively find wrong is bad for you:

  • Either you end up hating yourself for doing it.
  • Or, in self-defense you become cynical and embittered, assuming the worst of everyone. This is not pleasant, nor is it an accurate view of the surprisingly varied threads of humanity.

If you find yourself in this situation, you have the opportunity to try to make things a little better, by pushing your organization to change. But you can also just go look for another job elsewhere.

Some jobs are actually good

Of course, most software jobs aren’t evil, but neither are they particularly meaningful. You can help an online store come up with a better recommendation engine, or optimize their marketing funnel, or build a web UI for support staff—but does it really matter that people buy at store A instead of store B?

So it’s worth thinking in detail about what exactly it is you would find meaningful, and seeing if there’s work that matches your criteria. There may not be a huge number of jobs that qualify, but chances are some exist.

If you care about climate change, for example, there are companies building alternative energy systems, working on public transportation planning, and more broadly just making computing more efficient.

Your job needn’t be the center of your life

You may not be able to find such a job, or get such a job. So there’s something to be said for not making your work the center of your life’s existence.

As a programmer you are likely to get paid well, and you can even negotiate a shorter workweek. Given enough free time and no worries about making a living, you have the ability to find meaning outside your work.

  • Make the world a better place, just a little: I’ve been volunteering with a local advocacy group, and the ability to see the direct impact of my work is extremely gratifying.
  • Beauty and nature: Programming as a job can end up leaving you unbalanced as a person—it’s worth seeing the world in other ways as well.
  • Religion: While it makes no sense to me (and apparently never did, even as a very young child), many people find their religion deeply satisfying.
  • Creation for creation’s sake: Many of us become programmers because we want to create things, but having a job means turning to instrumental creation, work that isn’t for its own sake. Try creating something not for its utility, but because you want to.
  • Find people who understand you: Being part of a social group that fundamentally doesn’t match who you are and how you view the world is exhausting and demoralizing. I ended up moving to a whole new country because of this. But if you live in a large city, quite possibly the people who will understand you can be found just down the block.

No easy answers

Unless you want to join a group that will tell you exactly what to think and precisely what to do—and there is certainly no lack of those—meaning is something you need to figure out for yourself.

It’s unlikely that you’ll solve it in one fell swoop, nor is it likely to be a fast process. The best you can do is just get started: a meaningful life isn’t a destination, it’s a journey.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

April 12, 2019 04:00 AM

April 10, 2019

Twisted Matrix Laboratories

Twisted 19.2.0 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 19.2! The highlights of this release are:
  • twisted.web.client.HostnameCachingHTTPSPolicy was added as a new contextFactory option. This reduces the performance overhead for making many TLS connections to the same host.
  • twisted.conch.ssh.keys can now read private keys in the new "openssh-key-v1" format, introduced in OpenSSH 6.5 and made the default in OpenSSH 7.8.
  • The sample code in the "Twisted Web In 60 Seconds" tutorial runs on Python 3.
  • DeferredLock and DeferredSemaphore can be used as asynchronous context managers on Python 3.5+ (see the example after this list).
  • twisted.internet.ssl.CertificateOptions now uses 32 random bytes instead of an MD5 hash for the ssl session identifier context.
  • twisted.python.failure.Failure.getTracebackObject now returns traceback objects whose frames can be passed into traceback.print_stack for better debugging of where the exception came from.
  • Much more! 20+ tickets closed overall.
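
As a quick illustration of the DeferredLock change above, here is a minimal sketch; the coroutine and lock names are made up for the example:

from twisted.internet import defer

lock = defer.DeferredLock()

async def update_shared_state():
    # New in 19.2: on Python 3.5+, the lock can guard a critical section directly.
    async with lock:
        ...  # code that must not run concurrently
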
You can find the downloads at <https://pypi.python.org/pypi/Twisted> (or alternatively <http://twistedmatrix.com/trac/wiki/Downloads>). The NEWS file is also available at <https://github.com/twisted/twisted/blob/twisted-19.2.0/NEWS.rst>.

Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

Twisted Regards,
Amber Brown (HawkOwl)

by Anonymous (noreply@blogger.com) at April 10, 2019 12:35 PM

April 08, 2019

Moshe Zadka

Publishing a Book with Sphinx

A while ago, I decided I wanted to self-publish a book on improving your Python skills. It was supposed to be short, sweet, and fairly inexpensive.

The journey was a success, but had some interesting twists along the way.

From the beginning, I knew what technology I wanted to write the book with: Sphinx. This was because I knew that I could use Sphinx to create something reasonable: I have previously ported my "Calculus 101" book to Sphinx, and I have written other small things in it. Sphinx uses reStructuredText, which I am most familiar with.

I decided I wanted to publish as PDF (for self-printers or others who find it convenient), as browser-ready HTML directory, and as an ePub.

The tox environments I created are: epub builds the ePub, html builds the browser-ready HTML, and pdf builds the PDF.

Initially, the epub environment created a "singlehtml" build, and I used the Calibre command-line utility to transform it into an ePub. This made for a prettier ePub than the one Sphinx creates: it had a much nicer cover, which is what most book reading applications use as an icon. However, that rendered poorly on Books.app (AKA iBooks).

One of the projects I still plan to tackle is how to improve the look of the rendered ePub, and add a custom cover image.

Finally, a script runs all the relevant tox environments, and then packs everything into a zip file. This is the zip file I upload to Gumroad, so that people can buy it.

I have tried to use other sellers, but Gumroad was the one with the easiest store creation. In order to test my store, even before the book was ready, I created a simple "Python cheat-sheet" poster, and put it on my store.

I then asked friends to buy it, as well as trying to do it myself. After it all worked, I refunded all the test-run purchases, of course!

Refunding on Gumroad is a pleasant process, which means that if people buy the book, and are unhappy with it, I am happy to refund their money.

(Thanks to Glyph Lefkowitz for his feedback on an earlier draft. All mistakes that remain are my responsibility.)

by Moshe Zadka at April 08, 2019 07:00 AM

April 03, 2019

Itamar Turner-Trauring

Setting boundaries at your job as a programmer

There’s always another bug, another feature, another deadline. So it’s easy to fall into the trap of taking on too much, saying “yes” one time too many, staying at the office a little later, answering a work email on the weekend…

If you’re not careful you can end up setting unreasonable expectations, and find yourself tethered to your work email and Slack. Your manager will expect you to work weekends, and your teammates will expect you to reply to bug reports in the middle of your vacation.

What you want is the opposite: when you’re at home or on vacation, you should be spending your time however you want, merry and free.

You need to set boundaries, which is what I’ll be discussing in the rest of this article.

Prepping for a new job

Imagine you’re starting a new job in a week, you enjoy programming for fun, and you want to be productive as soon as possible. Personally, I wouldn’t do any advance preparation for a new job: ongoing learning is part of a programmer’s work, and employers ought to budget time for it. But you might choose differently.

If so, it’s tempting to ask for some learning material so you can spend a few days beforehand getting up to speed. But you’re failing to set boundaries if you do that, and they might give you company-specific material, in which case you’re just doing work for free.

Learning general technologies is less of a problem—knowing more technologies is useful in your career in general, and maybe you enjoy programming for fun. So instead of asking for learning material, you can go on your own and learn the technologies you know they use, without telling them you’re doing so.

Work email and Slack

Never set up work email or Slack on your phone or personal computer:

  1. It will tempt you to engage with work in your free time.
  2. When you do engage, you’ll be setting expectations that you’re available to answer questions 24/7.

While you’re at work you’ll always have your computer, so you don’t need access on your phone. If you do need to set up work email on your phone for travel, remove the account when you’re back home.

And if you want to have your work calendar on your phone, you can share it with your personal calendar account; that way you’re sharing only your calendar, nothing else.

Vacations

When you’re on vacation, you’re on vacation: no work allowed. That means you’re not taking your work laptop with you, or turning it on if you’re at home.

A week or so in advance of your vacation, explain to your team that you won’t be online, and that you won’t have access to work files. Figure out what information they might need—documentation, in-progress work you want to hand off, and the like—and write it all down where they can find it.

If you must, give your personal phone number for emergencies: given you lack access to your work credentials and email, the chances of your being called for something unimportant are quite low.

You’re paid for your normal work hours (and that’s it)

A standard workweek in the US is 40 hours a week; elsewhere it can be a little less. Whatever it is, outside those hours you shouldn’t be working, because you’re not being paid for that work. Your evenings, your weekends, your holidays, your vacations—all of these belong to you, not your employer.

If you don’t enforce that boundary between work and non-work, you are sending the message that your time doesn’t belong to you. And if you have a bad manager, they’re going to take advantage of that—or you might end up working long hours out of a misplaced sense of obligation.

So unless you’re dealing with an emergency, you should forget your job exists when your workday ends—and unless you’re on your on-call rotation, you should make sure you’re inaccessible by normal work channels.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

April 03, 2019 04:00 AM

March 30, 2019

Moshe Zadka

A Local LRU Cache

"It is a truth universally acknowledged, that a shared state in possession of mutability, must be in want of a bug." -- with apologies to Jane Austen

As Ms. Austen, and Henrik Eichenhardt, taught us, shared mutable state is the root of all evil.

Yet, the official documentation of functools tells us to write code like:

from functools import lru_cache
import urllib.error
import urllib.request

@lru_cache(maxsize=32)
def get_pep(num):
    'Retrieve text of a Python Enhancement Proposal'
    resource = 'http://www.python.org/dev/peps/pep-%04d/' % num
    try:
        with urllib.request.urlopen(resource) as s:
            return s.read()
    except urllib.error.HTTPError:
        return 'Not Found'

(This code is copied from the official documentation, with the imports it assumes added at the top.)

The decorator, @lru_cache(maxsize=32), is now... module-global mutable state. It doesn't get any more shared, in Python, than module-global: every import of the module will share the object!

We try and pretend like there is no "semantic" difference: the cache is "merely" an optimization. However, very quickly things start falling apart: after all, why would the documentation even tell us how to get back the original function (answer: .__wrapped__) if the cache is so benign?

No, decorating the function with lru_cache is anything but benign! For one, because it is mutable state shared across threads, we have introduced some thread locking, with all the resulting complexity, and occasional surprising performance issues.

Another example of non-benign-ness is that, in the get_pep example, sometimes a transient error, such as a 504, will linger on, making all subsequent requests "fail", until a cache eviction (because an unrelated code path went through several PEPs) causes a retry. These are exactly the kind of bugs which lead to warnings against shared mutable state!

If we want to cache, let us own it explicitly in the using code, and not have a global implementation dictate it. Fortunately, there is a way to properly use the LRU cache.

First, remove the decorator from the implementation:

def get_pep(num):
    'Retrieve text of a Python Enhancement Proposal'
    # Same code as in the official example

Then, in the using code, build a cache:

def analyze_peps():
    cached_get_pep = lru_cache(maxsize=32)(get_pep)
    all_peps, pep_by_type = analyze_index(cached_get_pep(0))
    words1 = get_words_in_peps(cached_get_pep, all_peps)
    words2 = get_words_in_informational(cached_get_pep,
                                        pep_by_type["Informational"])
    do_something(words1, words2)

Notice that in this example, the lifetime of the cache is relatively clear: we create it at the beginning of the function, pass it to the functions we call, and then it goes out of scope and is deleted. (Barring one of those functions sneakily keeping a reference, which would be a bad implementation, and visible when reviewing it.)

This means we do not have to worry about cached failures if the function is retried. If we retry analyze_peps, we know that it will retry retrieving any PEPs, even if those failed before.

If we wanted the cache to persist between invocations of the function, the right solution would be to move it one level up:

def analyze_peps(cached_get_peps):
    # ...

Then it is the caller's responsibility to maintain the cache: once again, we avoid shared mutable state by making the state management explicit.
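
A minimal sketch of what that caller might look like (the surrounding main function is invented for illustration):

from functools import lru_cache

def main():
    # Hypothetical caller: it owns the cache and decides how long it lives.
    cached_get_pep = lru_cache(maxsize=32)(get_pep)
    analyze_peps(cached_get_pep)   # first run populates the cache
    analyze_peps(cached_get_pep)   # a second run reuses it, by the caller's choice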

In this example, based on the official lru_cache documentation, we used a network-based function to show some of the issues with a global cache. Often, lru_cache is used for performance reasons. However, even there, it is easy to create issues: for example, one function feeding uncommon inputs to the LRU-cached function can cause massive cache evictions, with surprising performance impacts!

The lru_cache implementation is great: but using it as a decorator means making the cache global, with all the bad effects. Using it locally is a good use of a great implementation.

(Thanks to Adi Stav, Steve Holden, and James Abel for their feedback on early drafts. Any issues that remain are my responsibility.)

by Moshe Zadka at March 30, 2019 04:30 AM

March 29, 2019

Itamar Turner-Trauring

On learning new technologies: why breadth beats depth

As a programmer you face an ever-growing stream of new technologies: new frameworks, new libraries, new tools, new languages, new paradigms. Keeping up is daunting.

  • How can you find the time to learn how to use every relevant tool?
  • How do you keep your skills up-to-date without using up all your free time?
  • How can you even learn all the huge number of existing technologies?

The answer, of course, is that you can’t learn them all—in depth. What you can do is learn about the tools’ existence, and learn just enough about them to know when it might be worth learning more.

Quite often, spending 5 minutes learning about a new technology will give you 80% of the benefit you’d get from spending 5 days on it.

In the rest of this article I’ll cover:

  1. The cost of unnecessary in-depth learning.
  2. Why breadth of knowledge is so useful.
  3. Some easy ways to gain breadth of knowledge.

The cost of in-depth learning

Having a broad range of tools and techniques to reach for is a valuable skill both at your job and when looking for a new job. But there are different levels of knowledge you can have: you can be an expert, or you can have some basic understanding of what the tool does and why you might use it.

The problem with becoming an expert is that it’s time consuming, and you don’t want to put that level of effort into every new tool you encounter.

  1. Some new technologies will die just as quickly as they were born; there’s no point wasting time on a dead end.
  2. Most technologies just aren’t relevant to your current situation. GitHub keeps recommending I look at a library for analyzing pulsar data, and not being an astrophysicist I’m going to assume I can safely ignore it.
  3. Software changes over time: even if you end up using a new library in a year or two, by that point the API may have changed. Time spent learning the current API would be wasted.

If you try to spend a few days—or even hours—on every intriguing new technology you hear about, you’re going to waste a lot of time.

The alternative: shallow breadth of knowledge

Most of the time you don’t actually need to use new tools and techniques. As long as you know a tool exists you’ll be able to learn more about it when you need to.

For example, there is a tool named Logstash that takes your logs and sends them to a central location. That’s pretty much all you have to remember about it, and it took you just a few seconds to read that previous sentence.

Maybe you’ll never use that information… or maybe one day you’ll need to get logs from a cluster of machines to a centralized location. At that point you’ll remember the name “Logstash”, look it up, and have the motivation to actually go read the documentation and play around with it.

This is also true when it comes to finding a new job. I was once asked in a job interview about the difference between NoSQL and traditional databases. At the time I’d never used MongoDB or any other NoSQL database, but I knew enough to answer satisfactorily. Being able to answer that question told the interviewer I’d be able to use that tool, if necessary, even if I hadn’t done it before.

Gaining breadth of knowledge

Learning about the existence of tools can be a fairly fast process. And since this knowledge will benefit your employer and you don’t need to spend significant time on it, you can acquire it during working hours.

You’re never actually working every single minute of your day, you always have some time when you’re slacking off on the Internet. Perhaps you’re doing so right now! You can use that time to expand your knowledge.

Here are a couple ways you can get pointers to new tools and techniques:

Newsletters

A great way to learn new tools and techniques are weekly email newsletters. There are newsletters on many languages and topics, from DevOps to PostgreSQL: here’s one fairly detailed list of potential newsletters you can sign up for.

Conferences and Meetups (you don’t have to go!)

Another good source of new tools and techniques are conferences and Meetups. Good conferences and Meetups will aim for a broad range of talks, on topics both new and classic.

But you don’t have to go to the conference or Meetup to benefit, or even watch a recording of the talks to learn something. Just skimming talk topics will give you a sense of what the community is talking and thinking about—and if something sounds particularly relevant to your interests you can spend the extra time to learn more.

Of course, if you can convince your employer to send you to a conference that’s even better: you’ll learn more, and you’ll do it on the company’s dime and time.

Your time is valuable—use it accordingly

There are only so many hours in the day, so many days in a year. That means you need to work efficiently, spending your limited time in ways that have the most impact:

  1. Spend an hour a week learning about new tools, just enough to know when they might be useful.
  2. Keep a record of these tools so you can find them when you need to: star them on GitHub, or add them to your bookmarks or note-taking system.
  3. Only spend the extra time and effort needed to gain more in-depth understanding once you actually need to use the tool. And when you do learn a new tool, do it at your job if you can.


Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

March 29, 2019 04:00 AM

March 19, 2019

Itamar Turner-Trauring

You are not a commodity

Recently a reader wrote in with a question:

I’ll be going to [a coding boot camp]. [After I graduate], my current view is to try hard to negotiate for whatever I can and then get better for my second job, but both of those steps are based on the assumption that I understand what an acceptable range for pay, benefits, etc are, and I feel like it’s leaving money (or time) on the table.

I’m not even sure if entry level jobs should be negotiated since they seem to be such a commodity. Do you have any advice for someone standing on the edge of the industry, looking to jump in?

What I told him, and what I’d like to share with you as well, is this:

  • Don’t think of yourself as a commodity—you’re just undermining yourself.
  • Don’t present yourself as a commodity—it’s bad marketing.
  • You are not a commodity—because no one is.

This is perhaps more obvious if you have lots of experience, but it’s just as true for someone looking for their first job.

We all have different strengths, different weaknesses, different experiences and background. So when it comes to finding a job, you should be highlighting your strengths, instead of all the ways you’re the same as everyone else.

In the rest of this article I’ll show just a few of the ways this can be applied by someone who is switching careers into the tech industry; elsewhere I talk more about the more theoretical side of marketing yourself.

Negotiating as a bootcamp graduate

Since employment is a negotiated relationship, negotiation starts not when you’re discussing your salary with a particular company, but long before that when you start looking for a job.

Here are just some of the ways you can improve your negotiating strength.

1. Highlight your previous experience

If you’re going to a coding bootcamp chances are you’ve had previous job experience. Many of those job skills will translate to your new career as a software developer: writing software as an employee isn’t just about writing code.

Whether you worked as a marketer or a carpenter, your resume and interviews should highlight the relevant skills you learned in your previous career. Those skills will make you far more competent than the average college graduate.

This might include people management, project management, writing experience, knowing when to cut corners and when not to, attention to detail, knowing how to manage your time, and so on.

And if you can apply to jobs where your business knowledge is relevant, even better: if you used to work in insurance, you’ll have an easier time getting a programming job at an insurance company.

2. Do your research

Research salaries in advance. There are a number of online salary surveys—e.g. Stack Overflow has one—which should give you some sense of what you should be getting.

Keep in mind that top companies like Google or some of the big name startups use stock and bonuses as a large part of total compensation, and salary doesn’t reflect that. Glassdoor has per-company salary surveys but they often tend to be skewed and lower than actual salaries.

3. Get multiple job offers if you can

Imagine candidate A and candidate B: as far as the hiring manager is concerned they seem identical. However, if candidate B has another job offer already, that is evidence that someone elsewhere has decided they like them. So candidate B is no longer seen as a commodity.

Try to apply to multiple jobs at once, and don’t say “yes” immediately to the first job offer you get. If you can get two offers at the same time, chances are you’ll be able to get better terms from one or the other.

In fact, even just saying “I’m in the middle of interviewing elsewhere” can be helpful.

You are not a commodity, so don’t act like one

Notice how all of the suggestions above aren’t about that final conversation about salary. There’s a reason: by the time you’ve reached that point, most of the decisions about your salary range have already been made. You still need to ask for more, but there’s only a limited upside at that point.

So it’s important to present your unique strengths and capabilities from the start: you’re not a commodity, and you shouldn’t market yourself like one.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

March 19, 2019 04:00 AM

March 05, 2019

Itamar Turner-Trauring

You can't avoid negotiating—but you can make it easier

You’re looking for a new job, or feel like it’s time for a raise, or maybe you just want to set some boundaries with your boss. And that means negotiating, and you hate the whole idea: asking for things is hard, you don’t want to be treated specially, the idea of having the necessary conversation just makes you super-uncomfortable.

And that’s a problem, because you can’t avoid negotiating: employment is a negotiated relationship. From the minute you start looking for a job until you leave for a new one, you are negotiating.

And maybe you didn’t quite realize that, and maybe you didn’t ever ask for what you want, but in that case you’re still negotiating. You’re just negotiating badly.

But once you internalize this idea, negotiation can get easier.

That awkward, scary conversation where you ask for what you want is really just a small fraction of the negotiation. Which means if you do it right, that final conversation can be shorter, more comfortable, and much more likely to succeed.

To see why, let’s take the example of a job search, and see how the final conversation where you discuss your salary is just a small part of the real negotiation.

How your salary is determined

To simplify matters, we will specifically focus just on your salary as a programmer.

Companies tend to have different job titles based on experience, with corresponding ranges of salaries. Your salary is determined first by the prospective job title, and second by the specific salary within that title’s range.

The process goes something like this:

  1. When you send in your resume the HR or recruiting person who reads it puts you into some sort of mental bucket: “this is a senior software engineer.”
  2. The hiring manager reads your resume and refines that initial guess.
  3. The interview process then hardens that job title, and gives the company some sense of how much they want you and therefore where in that title’s salary range to put you.
  4. Finally, you get an offer, and you can push back and try to get more.

That final step, the awkward conversation we tend to think of as the negotiation, is only the end of a long process. By the time you’ve reached it much of your scope for negotiation has been restricted: you’ll have a harder time convincing the company you’re a Software Engineer IV if they’ve decided you’re a Software Engineer II.

Employment is a negotiated relationship

Negotiation isn’t a one-time event, it’s an ongoing part of how you interact with an employer. You start negotiating for your salary, for example, from the day you start applying:

  1. You can choose companies to apply to where your enthusiasm will come across, or where you have highly relevant technical skills.
  2. You can get yourself introduced by a friend on the inside, instead of just sending in your resume.
  3. You can ensure your resume demonstrates your actual level of problem-solving skill. Even if you can identify problems, it’s easy to give the impression you can only solve them if you don’t phrase things right (“I switched us over from VMs to Kubernetes” vs. “I identified hand-built VMs as a problem, investigated, chose Kubernetes, etc.”).
  4. You can interview for multiple jobs at once, so you can use a job offer from company A as independent proof of your value to company B.
  5. You can do well on the technical interview, which correlates with higher salaries.
  6. You can avoid whiteboard puzzles if you tend not to do well on those sorts of interviews.

All of these—and more—are part of the negotiation, long before you get the offer and discuss your salary.

You still need to ask (and negotiation doesn’t stop then)

Yes, you do need to ask for what you want at the end. And yes, that’s going to be scary and awkward and no fun at all. But asking for things is something you can practice in many different contexts, not just job interviews.

But if you treat the whole job interview process as a negotiation, that final conversation will be much easier because the company will really want to hire you—and because they’ll be worried you’ll take that other job offer you mentioned.

You’re not done negotiating when you’ve accepted the offer, though.

When your boss asks you to do something, you don’t have to say yes. In fact, as a good employee it’s your duty not to say yes, but to listen, dig a little deeper, and find the real problem.

Similarly, how many hours you work is not just up to your boss; it’s also about how you push back on unreasonable demands. And again, it’s your duty as a good employee to push back, because work/life balance makes you a better software engineer.

All of which is to say:

  1. You can’t avoid negotiating.
  2. Negotiation is far broader than just that awkward conversation where you make the ask.
  3. Being a good negotiator will make you a far more effective software engineer.


Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

March 05, 2019 05:00 AM

February 14, 2019

Moshe Zadka

Don't Make It Callable

There is a lot of code that overloads the __call__ method. This is the method that "calling" an object activates: something(x, y, z) will call something.__call__(x, y, z) if something is an instance of a class that defines __call__.
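
As a quick illustration of the mechanism (this snippet is my own, not from the original post), calling the instance and calling its __call__ method do the same thing:

class Greeter:

    def __call__(self, name):
        return "hello, " + name

greet = Greeter()
# These two lines are equivalent:
print(greet("world"))           # hello, world
print(greet.__call__("world"))  # hello, world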

At first, like every operator overload, this seems like a nifty idea. And then, like most operator overload cases, we need to ask: why? Why is this better than a named method?

The first use case, accepting callbacks, is done better and more readably with a named method. Let's say that the function interesting_files will call the passed-in callback with names of interesting files.

We can, of course, use __call__:

import random

class PrefixMatcher:

    def __init__(self, prefix):
        self.prefix = prefix
        self.matches = []

    def __call__(self, name):
        if name.startswith(self.prefix):
            self.matches.append(name)

    def random_match(self):
        return random.choice(self.matches)

matcher = PrefixMatcher("prefix")
interesting_files(matcher)
print(matcher.random_match())

But it is more readable, and obvious, if we...don't:

import random

class PrefixMatcher:

    def __init__(self, prefix):
        self.prefix = prefix
        self.matches = []

    def get_name(self, name):
        if name.startswith(self.prefix):
            self.matches.append(name)

    def random_match(self):
        return random.choice(self.matches)

matcher = PrefixMatcher("prefix")
interesting_files(matcher.get_name)
print(matcher.random_match())

We can pass the matcher.get_name method, which is already callable, directly to interesting_files: there is no need to make PrefixMatcher callable by overloading __call__.

If something really is nothing more than a function call with some extra arguments, then either a closure or a partial would be appropriate.

In the example above, the random_match method was added to make sure that the class PrefixMatcher is justified. If it were not there, either of these implementations would be more appropriate:

import random

def prefix_matcher(prefix):
    matches = []
    def callback(name):
        if name.startswith(prefix):
            matches.append(name)
    return callback, matches

matcher, matches = prefix_matcher("prefix")
interesting_files(matcher)
print(random.choice(matches))

This uses a closure to capture some variables and returns a function.

import functools
import random

def prefix_matcher(prefix, matches, name):
    if name.startswith(prefix):
        matches.append(name)

matches = []
matcher = functools.partial(prefix_matcher, "prefix", matches)
interesting_files(matcher)
print(random.choice(matches))

This uses the functools.partial function to construct a function that has some of the arguments "prepared".

There is one important use case for __call__, but it is specialized: it is a powerful tool when constructing a Python-based DSL. Indeed, this is exactly the time when we want to trade away "doing exactly what the operator always does" in favor of "succinct syntax dedicated to the task at hand."

A good example of such a DSL is stan, where the __call__ function is used to construct XML tags with attributes: div(style="color: blue").
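
To see why this reads well in a DSL, here is a minimal sketch of my own (it is not stan's actual implementation): a tag object whose __call__ returns a new tag carrying the given attributes, so an expression like div(style="color: blue") composes naturally.

class Tag:

    def __init__(self, name, attributes=None):
        self.name = name
        self.attributes = attributes or {}

    def __call__(self, **attributes):
        # Return a new tag instead of mutating, so expressions compose.
        return Tag(self.name, {**self.attributes, **attributes})

    def render(self):
        attrs = "".join(
            ' {}="{}"'.format(key, value)
            for key, value in self.attributes.items()
        )
        return "<{0}{1}></{0}>".format(self.name, attrs)

div = Tag("div")
print(div(style="color: blue").render())  # <div style="color: blue"></div>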

In almost every other case, avoid the temptation to make your objects callable. They are not functions, and should not pretend to be.

by Moshe Zadka at February 14, 2019 06:30 AM

January 31, 2019

Itamar Turner-Trauring

Do they have work/life balance? Investigating potential employers with GitHub

When you’re searching for a new programming job you want to avoid companies with long work hours. You can ask about work/life balance during the interview (and unless you’re desperate, you always should ask), but that means wasting time applying and interviewing at companies where you don’t want to work.

What you really want is to only apply to companies that actually allow—or even better, encourage—good work/life balance.

Close reading of the job posting and company hiring pages can sometimes help you do that, but some good companies don’t talk about it, and some bad companies will just lie.

So in this article I’ll explain how you can use GitHub to empirically filter out at least some companies with bad work-life balance.

Let’s look at GitHub profiles

If you go to the profile for a GitHub user (here’s mine) you will see a chart showing contributions over time. The columns are weeks, and each row is a day of the week.

Each square shows contribution on a particular day: the darker the color, the more contributions.

What’s most interesting here are the first and last rows, which are Sundays and Saturdays respectively. As you can see in the image above, this GitHub user doesn’t tend to code much on the weekends: the weekend boxes are mostly blank.

You can also use this UI to see if the company uses GitHub at all. Given this many boxes, the user’s employer probably does use GitHub, but you can also click on a particular box and see the specific contributions. In this case, you would see that on one particular weekday the user contributed to “private repositories”, presumably those for their employer:

On the other hand, if you were to click on the weekend boxes you would see that all the weekend contributions are to open source projects. In short, this user doesn’t code much on the weekend, and when they do it’s on personal open source projects, not work projects.

Generalizing this process will allow you to filter out companies that don’t have work/life balance for developers.

Investigating work/life balance using GitHub

There are two assumptions here:

  • This is a relatively small company; large companies tend to vary more across groups, so you’ll need to adjust the process somewhat to focus on programmers in the group you’re applying to.
  • The company uses GitHub for most of their source code.

Company size you can figure out from the company page or LinkedIn; whether they use GitHub is something we can figure out along the way.

This is not a perfect process, since users can disable showing private repository contributions, or it’s possible the developer has personal private repositories. This is why you want to check as many profiles as possible.

Here’s what you should do:

  1. Find a number of programmers who work for the company. You can do this via the company’s website, and by the company’s page on LinkedIn, which will list employees who have LinkedIn profiles. You can also check for members of the company’s GitHub organization, if they have a public list, but don’t rely on this exclusively since it will likely omit many employees.
  2. For each programmer, use your favorite search engine and search for “$NAME github” to find their GitHub profile. Try to do some sanity checks to make sure it’s the right person, especially if they have a common name: location, organization membership, technologies used, etc..
  3. For each programmer, check if they contribute to private repositories during the week. You can do this by visually seeing if there are lots of green boxes, and by clicking on the Monday-Friday boxes in the timeline and reading the results below. If they mostly don’t, the company might not use GitHub.
  4. If they do use GitHub at work, for each programmer, check if they code on the weekend. You can visually see if they have lots of green boxes on Sundays and Saturdays.
  5. If they do code on the weekend, check if those are work contributions. You can do this by clicking on the weekend boxes and seeing if those are contributions to private repositories.

Additional tips:

  • If the company mostly writes open source code, you can check if the programmers contribute to the relevant open source repositories during the weekend.
  • Look for correlated weekend work dates across different people: if 5 people are working on private repos on the same 4 weekends, that’s probably a sign of crunch time.

By the end of the process you will often have a decent sense of whether most of the programmers are coding on the weekend. Note that this method can only tell you if they don’t have work/life balance—you still can’t know for certain that they do.
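
If you want a rough, automated first pass before doing this by hand, the sketch below (my own illustration, not part of the original process) uses GitHub's REST API to count how much of a user's recent public activity falls on weekends. It only sees public events, so it cannot replace clicking through the contribution graph for private-repository work; it also assumes the third-party requests library is installed and uses a placeholder username.

from collections import Counter
from datetime import datetime

import requests

def weekend_activity(username):
    """Count how many of a user's recent public events fall on weekends."""
    url = "https://api.github.com/users/{}/events/public".format(username)
    events = requests.get(url, timeout=10).json()
    days = Counter()
    for event in events:
        created = datetime.strptime(event["created_at"], "%Y-%m-%dT%H:%M:%SZ")
        # weekday() returns 5 for Saturday and 6 for Sunday.
        days["weekend" if created.weekday() >= 5 else "weekday"] += 1
    return days["weekend"], days["weekend"] + days["weekday"]

weekend, total = weekend_activity("some-programmer")
print("{}/{} recent public events happened on a weekend".format(weekend, total))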

So even if you decide to apply to a company that passed this filter, you will still need to ask questions about work/life balance and listen closely during the interview process.

Always do your research

Before you apply for a job you should always do your research: read their website, read all their job postings, talk to a friend there if you have one… and, as I explained in this article, check their GitHub profiles as well. You’ll still need to look for warning signs during interviews, but research is still worth it.

The more you research, the more you’ll know whether you want to work there. And if you do want to work there, the more you know, the better your interview will go.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

January 31, 2019 05:00 AM

January 25, 2019

Moshe Zadka

Staying Safe with Open Source

A couple of months ago, a successful attack against the Node ecosystem resulted in stealing an undisclosed amount of bitcoins from CoPay wallets.

The technical flow of the attack is well-summarized by the NPM blog post. Quick summary:

  1. nodemon, a popular way to run Node applications, depends on event-stream.
  2. The event-stream maintainer has not had time to maintain it.
  3. right9control asked event-stream maintainer for commit privileges to help maintain it.
  4. right9control added a dependency on a new library, flatmap-stream.
  5. flatmap-stream contained malicious code to steal wallets.

Obfuscation

A number of methods were used to disguise the attack.

The dependency was added in a minor version, and a new version was immediately released. This meant that most projects, which pin to minor versions, would get the update, while it stayed invisible on the main GitHub landing page and the main npm landing page.

The malicious code was only in the minified version of the library that was uploaded to npm.org. The non-minified source code on both GitHub and npm.org, as well as the minified code on GitHub, did not contain the malicious code.

The malicious code was encrypted with a key that used the description of other packages in the dependency tree. That made it impossible to understand the attack without guessing which package decrypts it.

The combination of all those methods meant that the problem remained undetected for two months. It was only luck that detected it: the decryption code was using a deprecated function, and investigating the deprecation message led to the issue being figured out.

This bears thinking about: if the code had been written slightly better, the problem would still be happening now, and nobody would be the wiser. We should not discount the possibility that, right now, someone who followed the same playbook but managed to use AES correctly is attacking some package, and we have no idea.

Solutions and Non-Solutions

I want to discuss some non-solutions in trying to understand how this problem came about.

Better Vetting of Maintainers

It is true, the person who made this commit had an obviously-auto-generated username (<word>-<digit>-<word>) and made few contributions before getting control. But short of meeting people in person, I do not think this would work.

Attackers adapt. Ask for better usernames, and they will generate "<firstname>-<lastname>" names. Are you going to disallow my GitHub username, moshez? Ask for more contributions, and you will get some trivial code uploaded to npm, autogenerated and tweaked a bit to disguise it. Ask for a longer commit history, and they'll send fixes for trivial issues.

Remember that this is a distributed problem, with each lead maintainer having to come up with their own vetting procedure. Otherwise, attackers get usernames through one vetting process, and then use those "vetted" identities to approach other maintainers, who are now sure they can trust them.

In short, this is one of the classical defenses that fails to take into consideration that attackers adapt.

Any Solution That Depends on JavaScript-specific Things

This attack could easily have been executed against PyPI or RubyGems. Any solution that relies on JavaScript's ability to support a least-access-based approach only helps make sure that these attacks go elsewhere.

It's not bad to do it. It just does not solve the root cause.

This also means that "stop relying on minified code" is a non-solution in the world where we encourage Python engineers to upload wheels.

Any Solution That Depends on "Audit Code"

A typical medium-sized JavaScript client app depends on some 2000 packages. Auditing each one, on each update, would make using third-party packages untenable. This means that start-ups playing fast and loose with these rules would gain an advantage over those that do not. Few companies can afford to pay that much for security.

Hell, we knew this was a possibility a few months before the attack was initiated and still nobody did code auditing. Starting now would mostly mean availability bias, which means it would be over as soon as another couple of months go by without a documented attack.

Partial Solution -- Open Source Sustainability

If we could just pay maintainers, they would be slightly more comfortable maintaining packages and less desperate for help. This means that it would become inherently slightly harder to quickly become a new maintainer.

However, it is worthwhile to consider that this still would not solve the subtler "adding a new dependency" attack described earlier: just making a "good" library and getting other libraries to depend on it.

Summary

I do not know how to prevent the "next" attack. Hillel makes the point that a lot of "root causes" will only prevent almost-exact repeats, while failing to address trivial variations. Remember that one trivial variation, avoiding deprecation warnings, would have made this attack much more successful.

I am concerned that, as an industry, we are not discussing this attack a mere two months after discovery and mitigation. We are vulnerable. We will be attacked again. We need to be prepared.

by Moshe Zadka at January 25, 2019 05:00 AM

Itamar Turner-Trauring

A 4-day workweek for programmers, the easy way

You’re dreaming of a programming job with 30 hours a week, a job where you’ll have time for your own projects, your own hobbies. But this sort of job seems practically non-existent—almost no one advertises programming jobs with shorter workweeks.

How do you carve out a job like this, a job with a shorter workweek?

The ideal would be some company or employer where you can just ask for a shorter workweek, without having to apply or interview, and have a pretty good chance of getting a “yes”.

In this post I’ll talk about the easiest way to get what you want: negotiating at your current job.

The value of being an existing employee

As an existing employee, you are much more valuable than an equally experienced programmer who doesn’t work there.

During your time at your employer you have acquired a variety of organization-specific knowledge and skills. It can take quite a while for a new employee to acquire these, and during the ramp-up period they will also take up their colleagues’ time.

Here are just a few of the things that you’ve likely learned during your time as an employee:

  • The existing codebase.
  • Local best practices, idioms, and coding conventions.
  • The business processes at the company.
  • The business domain.
  • The informal knowledge network in the company, i.e. who is the expert in what.

Not only do you have hard-to-replace skills and knowledge, you also have your work record to build on: your manager knows what you can do. Once you’ve been working for a manager for a while they’ll know your skills, and whether they can trust you.

A real-life example

In my own career, being an existing employee has benefited me on multiple occasions:

After a number of years working as a software engineer for one company, I got a bad case of RSI (repetitive strain injury). I could no longer type, which meant I could no longer work as a programmer. But I did stay on as an employee: one of the managers at the company, who I’d worked for in an earlier role, offered me a position as a product manager.

In part this was because the company was run by decent people, who for the most part took care of their employees. But it was also because I had a huge amount of hard-to-replace business and technical knowledge of the company’s product, in a highly complex domain.

I worked as a product manager for a few years, but I never really enjoyed it. And with time my hands recovered, at least partially, potentially allowing me to take up programming again. After my daughter was born, my wife and I decided that I’d become a part-time consultant, and take care of our child the rest of the time, while she continued working full-time.

My manager was upset when I told him I was leaving. I felt bad—but really, if your boss is unhappy when you’re leaving, that’s a good thing.

In fact, my boss offered to help me find a less-than-full-time programming position within the company so I wouldn’t have to leave. I’d already made up my mind to go, but under other circumstances I might have taken him up on the offer.

Notice how I was offered reduced hours, even though companies will never advertise such positions. That’s the value of being an existing employee.

Asking for what you want

Unless you work for a really bad manager—or a really badly managed company—a reasonable manager would much prefer to have your experience for 4 days a week than have to find and train a replacement. That doesn’t mean they’ll be happy if you ask for a shorter workweek: you are likely to get some pushback.

But if you have a strong negotiating position—financial savings, valuable work, the willingness to find another job if necessary—there’s a decent chance they will eventually say “yes”.

Does negotiating seem too daunting, or not something you can do? Plenty of other programmers have done it, even those with no previous negotiation experience.

Much of this article was an excerpt from my book, Negotiate a 3-Day Weekend. It covers everything from negotiation exercises you can do on the job, to a specific script for talking to your boss, to negotiating at new employers if you can’t do it at your current job.

With a little bit of practice, you can get the free time you need.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

January 25, 2019 05:00 AM

January 18, 2019

Itamar Turner-Trauring

Negotiate your salary like a 6-year old

You’re applying for jobs, you’re hoping to get an offer soon—and when you do you’ll have to face the scary issue of negotiation.

If you read a negotiation guide like Valerie Aurora’s HOWTO (and you should!) you’re told you need to negotiate: you need to ask for more. And that’s good advice.

The problem is that asking for more is scary: it feels confrontational, your body will kick in with an unhelpful adrenaline surge, it’s just not comfortable. And honestly, given you only negotiate your salary every few years, that adrenaline surge probably isn’t going to go away.

But I think you can get past that and negotiate anyway—by embracing your inner 6-year old. In particular, a 6-year old negotiating for snacks on a playdate.

Snack time!

Let’s set the scene.

A 6-year old named Emma is visiting her friend Jackson, and now it’s time for a snack. Jackson’s parent—Michael—now needs to feed Emma.

Michael would prefer the children not eat crackers, but he has a problem. Michael has some authority over Jackson since he’s his parent, and some knowledge of what Jackson is willing to eat. So he can say “you’re eating one of these mildly healthy snacks” and that’s the end of it.

But Emma is just visiting: Michael has less authority, less knowledge, and a hungry 6-year old is Bad News. So Michael comes up with an acceptable range of snacks, and tries his best to steer towards the ones he considers healthier.

The conversation goes something like this:

Michael: “Would you like some fruit?”

Emma: blank stare.

Michael: “How about some cheese?”

Emma: shakes her head.

Michael: “Yogurt?”

Emma: shakes her head.

Michael: “Crackers?”

Emma and Jackson: “Yes!”

Michael has committed to feeding Emma something, he doesn’t want her to go hungry—but he doesn’t have the normal leverage he has as a parent. As a result, Emma can just stall until she gets what she wants. Particularly enterprising children will ask for candy (even when they would never get candy at home!), but stalling seems well within the capacity of most 6-year olds.

Salary time!

The dynamic of hiring a new employee is surprisingly similar.

Whoever is making the offer—HR, an internal recruiter, or the hiring manager—has already committed to hiring you. They’ve decided: they had interviews and meetings and they want to get it over with and just get you on board.

So they come up with an acceptable salary range, and offer you the low end of the range. If you accept that, great. And if you say “can you do better?”

Well, they’ve already decided on their acceptable salary range: they’ll just move up within that range. They’re not insulted, they’re used to this. They’re not horrified at a hole in their budget, this is still within their acceptable range.

You went from fruit to crackers, and they can live with that. All you have to do is ask.

Embrace your inner 6-year old

Much of what determines your salary happens before you get the offer, when the decision is made about what salary range to offer you.

You can influence this by the language on your resume, by how you present yourself, how you interview, and by noting you have competing offers. It may not feel like negotiation, but it’s all part of the process—and while it’s a set of skills you can master as an adult, that part is far beyond what your 6-year old self could do.

But the actual conversation about salary? Pretend you’re 6, pretend it’s a snack, and ask for more—chances are you’ll get some delicious crackers.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

January 18, 2019 05:00 AM

January 09, 2019

Moshe Zadka

Checking in JSON

JSON is a useful format. It might not be ideal for hand-editing, but it does have the benefit that it can be hand-edited, and it is easy enough to manipulate programmatically.

For this reason, it is likely that at some point or another, checking in a JSON file into your repository will seem like a good idea. Perhaps it is even beyond your control: some existing technology uses JSON as a configuration file, and the easiest thing is to go with it.

It is useful to still keep the benefit of programmatic manipulation. For example, if the JSON file encodes a list of numbers, and we want to add 1 to every even number, we can do:

import json

with open("myfile.json") as fp:
    content = json.load(fp)
content = [x + 1 if x % 2 == 0 else x for x in content]
with open("myfile.json", "w") as fp:
    json.dump(content, fp)

However, this does cause a problem: presumably, before, the list was formatted in a visually-pleasing way. Having dumped it, now the diff is unreadable -- and hard to audit visually.

One solution is to enforce consistent formatting.

For example, using pytest, we can write the following test:

import json

def test_formatting():
    with open("myfile.json") as fp:
        raw = fp.read()
    content = json.loads(raw)
    redumped = json.dumps(content, indent=4) + "\n"
    assert raw == redumped

Assuming we gate merges to the main branches on passing tests, it is impossible to check in something that breaks the formatting. Automated programs merely need to remember to give the right options to json.dumps. However, what happens when humans make mistakes?

It turns out that Python already has a command-line tool to reformat:

$ python -m json.tool myfile.json > myfile.json.formatted
$ mv myfile.json.formatted myfile.json

A nice test failure will remind the programmer of this trick, so that it is easy to do and check in.
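
Another way to keep automated writers consistent with the test (a small sketch of my own, not from the original post) is to put the canonical dump in one shared helper that both production code and the formatting test use:

import json

def dump_formatted(content, path):
    """Write content as JSON in the repository's canonical format."""
    with open(path, "w") as fp:
        fp.write(json.dumps(content, indent=4) + "\n")

That way, if the canonical format ever changes, there is only one place to update.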

by Moshe Zadka at January 09, 2019 06:00 AM

Itamar Turner-Trauring

Competing with a "Stanford grad just dying to work all nighters on Red Bull"

A reader of my newsletter wrote in, talking about the problem of finding a job with work/life balance in Silicon Valley:

It seems like us software engineers are in a tough spot: companies demand a lot of hard work and long hours, and due to the competitiveness here in Silicon Valley, you have to go along with it (or else there’s some bright young Stanford grad just dying to work all nighters on Red Bull to take your place).

But they throw you aside once the company has become established and they no longer need the “creative” types.

In short, how do you get a job with work/life balance when you’re competing against people willing to work long hours?

All nighters make for bad software

The starting point is realizing that working long hours makes you a much less productive employee, to the point that your total output will actually decrease (see Evan Robinson on crunch time). If you want to become an effective and productive worker, you’re actually much better off working normal hours and having a personal life than working longer hours.

Since working shorter hours makes you more productive, that gives you a starting point for why you should be hired.

Instead of focusing on demonstrative effort by working long hours, you can focus on identifying and solving valuable problems, especially the bigger and longer term problems that require thought and planning to solve.

Picking the right company

Just because you’re a valuable, productive programmer doesn’t mean you’re going to get hired, of course. So next you need to find the right company.

You can imagine three levels of managerial skill:

  • Level 1: These managers have no idea how to recognize effective workers, so they only judge people by hours worked.
  • Level 2: These managers can recognize effective workers, but don’t quite know how to create a productive culture. That means if you choose to work long hours they won’t stop you, however pointless those long hours may be. But they won’t force you to work long hours so long as you’re doing a decent job.
  • Level 3: These managers can recognize effective workers and will encourage a productive culture. Which is to say, they will explicitly discourage working long hours except in emergencies, they will take steps to prevent future emergencies, etc..

When you look for a job you will want to avoid Level 1 managers. However good your work, they will be happy to replace you with someone incompetent so long as they can get more hours out of them. So you’ll be forced to work long hours and work on broken code.

Level 3 managers are ideal, and they do exist. So if you can find a job working for them then you’re all set.

Level 2 managers are probably more common though, and you can get work/life balance working for them—if you set strong boundaries. Since they can recognize actual competence and skills, they won’t judge you during your interview based on how many hours you’re willing to work. You just need to convey your skill and value, and a reasonable amount of dedication to your job.

And once you’ve started work, if you can actually be productive (and if you work 40 hours/week you will be more productive!) they won’t care if you come in at 9 and leave at 5, because they’ll be happy with your work.

Unlike Level 3 managers, however, you need to be explicit about boundaries during the job interview, and even more so after you start. Elsewhere I wrote up some suggestions about how to convey your value, and how to say “no” to your boss.

Employment is a negotiated relationship

To put all this another way: employment is a negotiated relationship. Like it or not, you are negotiating from the moment you start interviewing, while you’re on the job, and until the day you leave.

You are trying to trade valuable work for money, learning opportunities, and whatever else your goals are (you can, for example, negotiate for a shorter workweek). In this case, we’re talking about negotiating for work/life balance:

  1. Level 1 managers you can’t negotiate with, because what they want (long hours) directly conflicts with what you want.
  2. Level 2 managers you can negotiate with, by giving them one of the things they want: valuable work.
  3. Level 3 managers will give you what you want without your having to do anything, because they know it’s in the best interest of everyone.

Of course, even for Level 3 managers you will still need to negotiate other things, like a higher salary.

So how do you get a job with work/life balance? Focus on providing and demonstrating valuable long-term work, avoid bad companies, and make sure you set boundaries from the very start.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

January 09, 2019 05:00 AM

December 12, 2018

Itamar Turner-Trauring

Tests won't make your software correct

Automated tests are immensely useful. Once you’ve started writing tests and seen their value, the idea of writing software without them becomes unimaginable.

But as with any technique, you need to understand its limitations. When it comes to automated testing—unit tests, BDD, end-to-end tests—it’s tempting to think that if your tests pass, your software is correct.

But tests don’t, tests can’t tell you that your software is correct. Let’s see why.

How to write correct software

To implement a feature or bugfix, you go through multiple stages; they might be compressed or elided, but they are always necessary:

  1. Identification: Figure out what the problem is you’re trying to solve.
  2. Solution: Come up with a solution.
  3. Specification: Define a specification, the details of how the solution will be implemented.
  4. Implementation: Implement the specification in code.

Your software might end up incorrect at any of these points:

  1. You might identify the wrong problem.
  2. You might choose the wrong solution.
  3. You might create a specification that doesn’t match the solution.
  4. You might write code that doesn’t match the specification.

Only human judgment can decide correctness

Automated tests are also a form of software, and are just as prone to error. The fact that your automated tests pass doesn’t tell you that your software is correct: you may still have identified the wrong problem, or chosen the wrong solution, and so on.

Even when it comes to ensuring your implementation matches your specification, tests can’t validate correctness on their own. Consider the following test:

def test_addition():
    assert add(2, 2) == 5

From the code’s perspective—the perspective of an automaton with no understanding—the correct answer of 4 is the one that will cause it to fail. But merely by reading that you can tell it’s wrong: you, the human, are key.

Correctness is something only a person can decide.

The value of testing: the process

While passing tests can’t prove correctness, the process of writing tests and making them pass can help make your software correct. That’s because writing the tests involves applying human judgment: What should this test assert? Does it match the specification? Does this actually solve our problem?

When you go through the loop of writing tests, writing code, and checking if tests pass, you continuously apply your judgment: is the code wrong? is the test wrong? did I forget a requirement?

You write the test above, and then reread it, and then say “wait that’s wrong, 2 + 2 = 4”. You fix it, and then maybe you add to your one-off hardcoded tests some additional tests based on core arithmetic principles. Correctness comes from applying the process, not from the artifacts created by the process.
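
As a sketch of what that might look like (my illustration, not from the article), here is the corrected hardcoded case plus a couple of checks derived from arithmetic principles rather than from single examples:

def test_addition():
    assert add(2, 2) == 4

def test_addition_is_commutative():
    assert add(3, 5) == add(5, 3)

def test_zero_is_the_identity():
    assert add(7, 0) == 7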

This may seem like pedantry: what does it matter whether the source of correctness is the tests themselves or the process of writing the tests? But it does matter. Understanding that human judgment is the key to correctness can keep you from thinking that passing tests are enough: you also need other forms of applied human judgment, like code review and manual testing.

(Formal methods augment human judgment with automated means… but that’s another discussion.)

The value of tests: stability

So if correctness comes from writing the tests, not the tests themselves, why do we keep the tests around?

Because tests ensure stability: once we judge the software is correct, the tests can keep the software from changing, and thus reduce the chances of its becoming incorrect. The tests are never enough, because the world can change even if the software isn’t, but stability has its value.

(Stability also has costs if you make the wrong abstraction layer stable…)

Tests are useful, but they’re not sufficient

To recap:

  1. Write automated tests.
  2. Run those tests.
  3. Don’t mistake passing tests for correctness: you will likely need additional processes and techniques to achieve that.


Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

December 12, 2018 05:00 AM

December 09, 2018

Moshe Zadka

Office Hours

If you want to speak to me, 1-on-1, about anything, I want to be able to help. I am a busy person. I have commitments. But I will make the time to talk to you.

Why?

  • I want to help.
  • I think I'll enjoy it. I like talking to people.

What?

I can offer opinions and experience on programming in general, Python, UNIX, the software industry and other topics.

How did you come up with the idea?

I am indebted to Robert Heaton for the idea and encouragement.

Should I...?

Sure! Especially if you have few connections in the industry, and have questions, I can talk to you. I am a fluent speaker of English and Hebrew, so you do need to be able to converse in one of those...

E-mail me!

by Moshe Zadka at December 09, 2018 05:30 AM

December 03, 2018

Itamar Turner-Trauring

'Must be willing to work under pressure' is a warning sign

As a programmer looking for a job, you need to be on the lookout for badly managed companies. Whether it’s malicious exploitation or just plain incompetence, the less time you waste applying for these jobs, the better.

Some warning signs are subtle, but not all. One of the most blatant is a simple phrase: “must be willing to work under pressure.”

The distance between we and you

Let’s take a look at some quotes from real job postings. Can you spot the pattern?

  • “Ability to work under pressure to meet sometimes aggressive deadlines.”
  • “Thick skin, ability to overcome adversity, and keep a level head under pressure.”
  • “Ability to work under pressure and meet tight deadlines.”
  • “Willing to work under pressure” and “working extra hours if necessary.”

If you look at reviews for these companies, many of them mention long working hours, which is not surprising. But if you read carefully there’s more to it than that: it’s not just what they’re saying, it’s also how they’re saying it.

When it comes to talking about the company values, for example, it’s always in the first person: “we are risk-takers, we are thoughtful and careful, we turn lead into gold with a mere touch of our godlike fingers.” But when it comes to pressure it’s always in the second person or third person: it’s always something you need to deal with.

Who is responsible for the pressure? It’s a mysterious mystery of strange mystery.

But of course it’s not. Almost always it’s the employer who is creating the pressure. So let’s switch those job requirements to first person and see how it reads:

  • “We set aggressive deadlines, and we will pressure you to meet them.”
  • “We will say and do things you might find offensive, and we will pressure you.”
  • “We set tight deadlines, and we will pressure you to meet them.”
  • “We will pressure you, and we will make you work long hours.”

That sounds even worse, doesn’t it?

Dysfunctional organizations (that won’t admit it)

When phrased in the first person, all of these statements indicate a dysfunctional organization. They are doing things badly, and maybe also doing bad things.

But it’s not just that they’re dysfunctional: it’s also that they won’t admit it. Thus the use of the second or third person. It’s up to you to deal with this crap, cause they certainly aren’t going to try to fix things. Either:

  1. Whoever wrote the job posting doesn’t realize they’re working for a dysfunctional organization.
  2. Or, they don’t care.
  3. Or, they can’t do anything about it.

None of these are good things. Any of them would be sufficient reason to avoid working for this organization.

Pressure is a choice

Now, I am not saying you shouldn’t take a job involving pressure. Consider the United States Digital Service, for example, which tries to fix and improve critical government software systems.

I’ve heard stories from former USDS employees, and yes, sometimes they do work under a lot of pressure: a critical system affecting thousands or tens of thousands of people goes down, and it has to come back up or else. But when the USDS tries to hire you, they’re upfront about what you’re getting in to, and why you should do it anyway.

They explain that if you join them your job will be “untangling, rewiring and redesigning critical government services.” Notice how “untangling” admits that some things are a mess, but also indicates that your job will be to make things better, not just to passively endure a messed-up situation.

Truth in advertising

There’s no reason why companies couldn’t advertise in the some way. I fondly imagine that someone somewhere has written a job posting that goes like this:

“Our project planning is a mess. We need you, a lead developer/project manager who can make things ship on time. We know you’ll have to say ‘no’ sometimes, and we’re willing to live with that.”

Sadly, I’ve never actually encountered such an ad in the real world.

Instead you’ll be told “you must be able to work under pressure.” Which is just another way of saying that you should find some other, better jobs to apply to.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

December 03, 2018 05:00 AM

November 29, 2018

Hynek Schlawack

Python Application Dependency Management in 2018

We have more ways to manage dependencies in Python applications than ever. But how do they fare in production? Unfortunately this topic turned out to be quite polarizing and was at the center of a lot of heated debates. This is my attempt at an opinionated review through a DevOps lens.

by Hynek Schlawack (hs@ox.cx) at November 29, 2018 05:00 PM