Planet Twisted

October 16, 2018

Jonathan Lange

Notes on test coverage

These are a few quick notes to self, rather than a cogent thesis. I want to get this out while it’s still fresh, and I want to lower my own mental barrier to publishing here.

I’ve been thinking about test coverage recently, inspired by conversations that followed DRMacIver’s recent post.

Here’s my current working hypothesis:

  • every new project should have 100% test coverage
  • every existing project should have a ratchet that enforces increasing coverage
  • “100% coverage” means that every line is either:
    • covered by the test suite
    • or has some markup in code saying that it is explicitly not covered, and why that’s the case
  • these should be enforced in CI

The justification is that “the test of all knowledge is experiment” [0]. While we should absolutely make our code easy to reason about, and prove as much as we can about it, we need to check what it does against actual reality.

Simple testing really can prevent most critical failures. It’s OK to not test some part of your code, but that should be a conscious, local, recorded decision. You have to explicitly opt out of test coverage. The tooling should create a moment where you either write a test, or you turn around and say “hold my beer”.

Switching to this for an existing project can be prohibitively expensive, though, so a ratchet is a good idea. The ratchet should be “lines of uncovered code”, and that should only be allowed to go down. Don’t ratchet on percentages, as that will let people add new lines of uncovered code.
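
As a rough sketch, the ratchet check can be a few lines of Python run in CI; this assumes you can get an uncovered-line count out of your coverage tool, and the file name .coverage-ratchet is an arbitrary choice of mine, not a real convention:

# ratchet.py - minimal coverage ratchet sketch (illustrative, not a real tool).
import sys

def check_ratchet(uncovered, ratchet_file=".coverage-ratchet"):
    # The checked-in file records the best (lowest) count so far.
    with open(ratchet_file) as f:
        previous = int(f.read())
    if uncovered > previous:
        sys.exit("Uncovered lines went up: {} -> {}".format(previous, uncovered))
    if uncovered < previous:
        # The ratchet tightens: record the new, lower count.
        with open(ratchet_file, "w") as f:
            f.write(str(uncovered))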

Naturally, all of this has to be enforced in CI. No one is going to remember to run the coverage tool, and no one is going to remember to check for it during code review. Also, it’s almost always easier to get negative feedback from a robot than a human.

I tagged this post with Haskell, because I think all of this is theoretically possible to achieve on a Haskell project, but requires way too much tooling to set up.

  • hpc is great, but it is not particularly user friendly.
  • Existing code coverage SaaS services don’t support expression-level coverage.
  • hpc has mechanisms for excluding code from coverage, but it’s not by marking up your code
  • hpc has some theoretically correct but pragmatically unfortunate defaults, e.g. it’ll report partial coverage for an otherwise guard, because otherwise never evaluates to False
  • There are no good ratchet tools

As a bit of an experiment, I set up a test coverage ratchet with graphql-api. I wanted both to test out my new enthusiasm for aiming for 100% coverage and to make it easier to review PRs.

The ratchet script is some ad hoc Python, but it’s working. External contributors are actually writing tests, because the computer tells them to do so. I need to think less hard about PRs, because I can look at the tests to see what they actually do. And we are slowly improving our test coverage.

I want to build on this tooling to provide something genuinely good, but I honestly don’t have the budget for it at present. I hope to at least write a good README or user guide that illustrates what I’m aiming for. Don’t hold your breath.

[0] The Feynman Lectures on Physics, Richard Feynman

by Jonathan Lange at October 16, 2018 11:00 PM

October 10, 2018

Itamar Turner-Trauring

The next career step for Senior Software Engineers (that isn't management)

You’ve been working as a programmer for a few years, you’ve been promoted once or twice, and now you’re wondering what’s next. The path until this point was straightforward: you learned how to work on your own, and then you were promoted to Senior Software Engineer or some equivalent job title.

But now there’s no clear path ahead.

Do you become a manager and stop coding?

Do you just learn new technologies, or is that not enough?

What should you be aiming for?

In this post I’d like to present an alternative career progression, an alternative that will give you more autonomy, and more bargaining power. And unlike becoming a manager, it will still allow you to write code.

From coding to solving problems

In the end, your job as a programmer is solving problems, not writing code. Solving problems requires:

  1. Finding and identifying the problem.
  2. Coming up with a solution.
  3. Implementing the solution.

Each of these can be thought of as a skill tree: a set of related skills that can be developed separately and in parallel. In practice, however, you’ll often start in reverse order with the third skill tree, and add the others on one by one as you become more experienced.

Randall Koutnik describes these as job titles of a sort, a career progression: Implementers, Solvers, and Finders.

As an Implementer, you’re an inexperienced programmer, and your tasks are defined by someone else: you just implement small, well-specified chunks of code.

Let’s imagine you work for a company building a website for animal owners. You go to work and get handed a task: “Add a drop-down menu over here listing all iguana diseases, which you can get from the IGUANA_DISEASE table. Selecting a menu item should redirect you to the appropriate page.”

You don’t know why a user is going to be listing iguana diseases, and you don’t have to spend too much time figuring out how to implement it. You just do what you’re told.

As you become more experienced, you become a Solver: you are able to come up with solutions to less well-defined problems.

You get handed a problem: “We need to add a section to the website where pet owners can figure out if their pet is sick.” You figure out what data you have and which APIs you can use, you come up with a UI together with the designer, and then you create an implementation plan. Then you start coding.

Eventually you become a Finder: you begin identifying problems on your own and figuring out their underlying causes.

You go talk to your manager about the iguanas: almost no one owns iguanas, why are they being given equal space on the screen as cats and dogs? Not to mention that writing iguana-specific code seems like a waste of time, shouldn’t you be writing generic code that will work for all animals?

After some discussion you figure out that the website architecture, business logic, and design are going to have to be redone so that you don’t have to write new code every time a new animal is added. If you come up with the right architecture, adding a new animal will take just an hour’s work, so the company can serve many niche animal markets at low cost. Designing and implementing the solution will likely be enough work that you’re going to have to work with the whole team to do it.

The benefits of being a Finder

Many programmers end up as Solvers and don’t quite know what to do next. If management isn’t your thing, becoming a Finder is a great next step, for two reasons: autonomy and productivity.

Koutnik’s main point is that each of these three stages gives you more autonomy. As an Implementer you have very little autonomy, as a Solver you have more, and as a Finder you have lots: you’re given a pile of vague goals and constraints and it’s up to you to figure out what to do. And this can be a lot of fun.

But there’s another benefit: as you move from Implementer to Solver to Finder you become more productive, because you’re doing less unnecessary work.

  • If you’re just implementing a solution someone else specified, then you might be stuck with an inefficient solution.
  • If you’re just coming up with a solution and taking the problem statement at face value, then you might end up solving the wrong problem, when there’s another more fundamental problem that’s causing all the trouble.

The better you are at diagnosing and identifying underlying problems, coming up with solutions, and working with others, the less unnecessary work you’ll do, and the more productive you’ll be.

Leveraging your productivity

If you’re a Finder you’re vastly more productive, which makes you a far more valuable employee. You’re the person who finds the expensive problems, who identifies the roadblocks no one knew were there, who discovers what your customers really wanted.

And that means you have far more negotiating leverage.

So if you want to keep coding, and you still want to progress in your career, start looking for problems. If you pay attention, you’ll find them everywhere.




October 10, 2018 04:00 AM

October 02, 2018

Moshe Zadka

Why No Dry Run?

(Thanks to Brian for his feedback. All mistakes and omissions that remain are mine.)

Some commands have a --dry-run option, which simulates running the command but without taking effect. Sometimes the option exists for speed reasons: just pretending to do something is faster than doing it. However, more often this is because doing it can cause big, possibly detrimental, effects, and it is nice to be able to see what would happen before running the script.

For example, ansible-playbook has the --check option, which will not actually have any effect: it will just report what ansible would have done. This is useful when editing a playbook or changing the configuration.

However, this is the worst possible default. If we have already decided that our command can cause much harm, and one way to mitigate the harm is to run it in a "dry run" mode and have a human check that this makes sense, why is "cause damage" the default?

In SRE/DevOps jobs like mine, many of the utilities I run can cause great harm if used carelessly. They are built to destroy whole environments in one go, to upgrade several services, or to clean out unneeded data. Running one against the wrong database, or the wrong environment, can wreak all kinds of havoc: from disabling a whole team for a day to causing actual financial harm to the company.

For this reason, every tool I write runs in dry run mode by default, and to actually have an effect you must explicitly specify --no-dry-run. This means that my finger accidentally slipping on the enter key just causes something to appear on my screen. After I am satisfied with the command, I up-arrow and add --no-dry-run to the end.

I now do it as a matter of course, even for cases where the stakes are lower. For example, the utility that publishes this blog has a --no-dry-run that publishes the blog. When run without arguments, it renders the blog locally so I can check it for errors.
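
In case a concrete shape helps, here is a minimal sketch of the pattern with argparse (the actions are invented placeholders, not from any of my actual tools):

import argparse

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-dry-run", action="store_true",
                        help="actually perform the destructive actions")
    args = parser.parse_args()

    actions = ["drop table OLD_DATA", "delete stale backups"]  # hypothetical
    for action in actions:
        if args.no_dry_run:
            print("executing:", action)
            # ... actually do it here ...
        else:
            print("would execute:", action)

if __name__ == "__main__":
    main()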

So I really have no excuses... When I write a tool for a serious production system, I always implement a --no-dry-run option, and make dry runs the default. What about you?

by Moshe Zadka at October 02, 2018 07:00 AM

September 27, 2018

Itamar Turner-Trauring

Avoiding burnout: lessons learned from a 19th century philosopher

You’re hard at work writing code: you need to ship a feature on time, or release a whole new product, and you’re pouring all your time and energy into it, your heart and your soul. And then, an uninvited and dangerous question insinuates itself into your consciousness.

If you succeed, if you ship your code, if you release your product, will you be happy? Will all your time and effort be worth it?

And you realize the answer is “no”. And suddenly your work is worthless, your goals are meaningless. You just can’t force yourself to work on something that doesn’t matter.

Why bother? Why work at all?

This is not a new experience. Almost 200 years ago, John Stuart Mill went through this crisis. And being a highly verbose 19th century philosopher, he also wrote a highly detailed explanation of how he managed to overcome what we would call depression or burnout.

And this explanation is useful not just to his 19th century peers, but to us programmers as well.

“Intellectual enjoyments above all”

At the core of Mill’s argument is the idea that rational thought, “analysis” he calls it, is corrosive: “a perpetual worm at the root both of the passions and of the virtues”. He never rejected rational thought, but he concluded that on its own it was insufficient, and potentially dangerous.

Mill’s education had, from an early age, focused him solely on rational analysis. As a young child Mill was taught by his father to understand—not just memorize—Greek, arithmetic, history, mathematics, political economy, far more than even many well-educated adults learned at the time. And since he was taught at home without outside influences, he internalized his father’s ideas prizing intellect over emotions.

In particular, Mill’s father “never varied in rating intellectual enjoyments above all others… For passionate emotions of all sorts, and for everything which has been said or written in exaltation of them, he professed the greatest contempt.” Thus Mill learned to prize rational thought and analysis over other feelings, as many programmers do—until he discovered the cost of focusing on those alone.

“The dissolving influence of analysis”

One day, things went wrong:

I was in a dull state of nerves, such as everybody is occasionally liable to; unsusceptible to enjoyment or pleasurable excitement; one of those moods when what is pleasure at other times, becomes insipid or indifferent…

In this frame of mind it occurred to me to put the question directly to myself: “Suppose that all your objects in life were realized; that all the changes in institutions and opinions which you are looking forward to, could be completely effected at this very instant: would this be a great joy and happiness to you?” And an irrepressible self-consciousness distinctly answered, “No!”

From this point on Mill suffered from depression, for months on end. And being of an analytic frame of mind, he was able to intellectually diagnose his problem.

On the one hand, rational logical thought is immensely useful in understanding the world: “it enables us mentally to separate ideas which have only casually clung together”. But this ability to analyze also has its costs, since “the habit of analysis has a tendency to wear away the feelings”. In particular, analysis tends to “fearfully undermine all desires, and all pleasures”.

Why should this make you happy? You try to analyze it logically, and eventually conclude there is no reason it should—and now you’re no longer happy.

“Find happiness by the way”

Eventually an emotional, touching scene in a book he was reading nudged Mill out of his misery, and when he fully recovered he changed his approach to life in order to prevent a recurrence.

Mill’s first conclusion was that happiness is a side-effect, not a goal you can achieve directly, nor verify directly by rational self-interrogation. Whenever you ask yourself “can I prove that I’m happy?” the self-consciousness involved will make the answer be “no”. Instead of choosing happiness as your goal, you need to focus on some other thing you care about:

Those only are happy (I thought) who have their minds fixed on some object other than their own happiness; on the happiness of others, on the improvement of mankind, even on some art or pursuit, followed not as a means, but as itself an ideal end. Aiming thus at something else, they find happiness by the way.

It’s worth noticing that Mill is suggesting focusing on something you actually care about. If you’re spending your time working on something that is meaningless to you, you will probably have a harder time of it.

“The internal culture of the individual”

Mill’s second conclusion was that logical thought and analysis are not enough on their own. He still believed in the value of “intellectual culture”, but he also aimed to become a more balanced person by “the cultivation of the feelings”. And in particular, he learned the value of “poetry and art as instruments of human culture”.

For example, Mill discovered Wordsworth’s poetry:

These poems addressed themselves powerfully to one of the strongest of my pleasurable susceptibilities, the love of rural objects and natural scenery; to which I had been indebted not only for much of the pleasure of my life, but quite recently for relief from one of my longest relapses into depression….

What made Wordsworth’s poems a medicine for my state of mind, was that they expressed, not mere outward beauty, but states of feeling, and of thought coloured by feeling, under the excitement of beauty. They seemed to be the very culture of the feelings, which I was in quest of. In them I seemed to draw from a Source of inward joy, of sympathetic and imaginative pleasure, which could be shared in by all human beings…

Both nature and art cultivate the feelings, an additional and distinct way of being human beyond logical analysis:

The intensest feeling of the beauty of a cloud lighted by the setting sun, is no hindrance to my knowing that the cloud is vapour of water, subject to all the laws of vapours in a state of suspension…

The practice of happiness

Mill’s advice is not a universal panacea; among other flaws, it starts from a position of immense privilege. But I do think Mill hits on some important points about what it means to be human.

If you wish to put it into practice, here is Mill’s advice, insofar as I can summarize it (I encourage you to go and read his Autobiography on your own):

  1. Aim in your work not for happiness, but for a goal you care about: improving the world, or even just applying and honing a skill you value.
  2. Your work—and the rational thought it entails—will not suffice to make you happy; rational thought on its own will undermine your feelings.
  3. You should therefore also cultivate your feelings: through nature, and through art.



September 27, 2018 04:00 AM

September 26, 2018

Jp Calderone

Asynchronous Object Initialization - Patterns and Antipatterns

I caught Toshio Kuratomi's post about asyncio initialization patterns (or anti-patterns) on Planet Python. This is something I've dealt with a lot over the years using Twisted (one of the sources of inspiration for the asyncio developers).

To recap, Toshio wondered about a pattern involving asynchronous initialization of an instance. He wondered whether it was a good idea to start this work in __init__ and then explicitly wait for it in other methods of the class before performing the distinctive operations required by those other methods. Using asyncio (and using Toshio's example with some omissions for simplicity) this looks something like:


import asyncio

class Microblog:
    def __init__(self, ...):
        loop = asyncio.get_event_loop()
        self.init_future = loop.run_in_executor(None, self._reading_init)

    def _reading_init(self):
        # ... do some initialization work,
        # presumably expensive or otherwise long-running ...

    @asyncio.coroutine
    def sync_latest(self):
        # Don't do anything until initialization is done
        yield from self.init_future
        # ... do some work that depends on that initialization ...

It's quite possible to do something similar to this when using Twisted. It only looks a little bit different:


from twisted.internet.defer import inlineCallbacks
from twisted.internet.threads import deferToThread

class Microblog:
    def __init__(self, ...):
        self.init_deferred = deferToThread(self._reading_init)

    def _reading_init(self):
        # ... do some initialization work,
        # presumably expensive or otherwise long-running ...

    @inlineCallbacks
    def sync_latest(self):
        # Don't do anything until initialization is done
        yield self.init_deferred
        # ... do some work that depends on that initialization ...

Despite the differing names, these two pieces of code basically do the same thing:

  • run _reading_init in a thread from a thread pool
  • whenever sync_latest is called, first suspend its execution until the thread running _reading_init has finished running it

Maintenance costs

One thing this pattern gives you is an incompletely initialized object. If you write m = Microblog() then m refers to an object that's not actually ready to perform all of the operations it supposedly can perform. It's either up to the implementation or the caller to make sure to wait until it is ready. Toshio suggests that each method should do this implicitly (by starting with yield self.init_deferred or the equivalent). This is definitely better than forcing each call-site of a Microblog method to explicitly wait for this event before actually calling the method.

Still, this is a maintenance burden that's going to get old quickly. If you want full test coverage, it means you now need twice as many unit tests (one for the case where a method is called before initialization is complete and another for the case where it is called after). At least. Toshio's _reading_init method actually modifies attributes of self, which means there are potentially many more than just two possible cases. Even if you're not particularly interested in having full automated test coverage (... for some reason ...), you still have to remember to add this yield statement to the beginning of all of Microblog's methods. It's not exactly a ton of work but it's one more thing to remember any time you maintain this code. And it's the kind of mistake that creates a race condition you might not immediately notice - which means you may ship the broken code to clients and get to discover the problem when they start complaining about it.

Diminished flexibility

Another thing this pattern gives you is an object that does things as soon as you create it. Have you ever had a class with a __init__ method that raised an exception as a result of a failing interaction with some other part of the system? Perhaps it did file I/O and got a permission denied error or perhaps it was a socket doing blocking I/O on a network that was clogged and unresponsive. Among other problems, these cases are often difficult to report well because you don't have an object to blame the problem on yet. The asynchronous version is perhaps even worse since a failure in this asynchronous initialization doesn't actually prevent you from getting the instance - it's just another way you can end up with an incompletely initialized object (this time, one that is never going to be completely initialized and use of which is unsafe in difficult to reason-about ways).

Another related problem is that it removes one of your options for controlling the behavior of instances of that class. It's great to be able to control everything a class does just by the values passed in to __init__ but most programmers have probably come across a case where behavior is controlled via an attribute instead. If __init__ starts an operation then instantiating code doesn't have a chance to change the values of any attributes first (except, perhaps, by resorting to setting them on the class - which has global consequences and is generally icky).

Loss of control

A third consequence of this pattern is that instances of classes which employ it are inevitably doing something. It may be that you don't always want the instance to do something. It's certainly fine for a Microblog instance to create a SQLite3 database and initialize a cache directory if the program I'm writing which uses it is actually intent on hosting a blog. It's most likely the case that other useful things can be done with a Microblog instance, though. Toshio's own example includes a post method which doesn't use the SQLite3 database or the cache directory. His code correctly doesn't wait for init_future at the beginning of his post method - but this should leave the reader wondering why we need to create a SQLite3 database if all we want to do is post new entries.

Using this pattern, the SQLite3 database is always created - whether we want to use it or not. There are other reasons you might want a Microblog instance that hasn't initialized a bunch of on-disk state too - one of the most common is unit testing (yes, I said "unit testing" twice in one post!). A very convenient thing for a lot of unit tests, both of Microblog itself and of code that uses Microblog, is to compare instances of the class. How do you know you got a Microblog instance that is configured to use the right cache directory or database type? You most likely want to make some comparisons against it. The ideal way to do this is to be able to instantiate a Microblog instance in your test suite and use its == implementation to compare it against an object given back by some API you've implemented. If creating a Microblog instance always goes off and creates a SQLite3 database then at the very least your test suite is going to be doing a lot of unnecessary work (making it slow) and at worst perhaps the two instances will fight with each other over the same SQLite3 database file (which they must share since they're meant to be instances representing the same state). Another way to look at this is that inextricably embedding the database connection logic into your __init__ method has taken control away from the user. Perhaps they have their own database connection setup logic. Perhaps they want to re-use connections or pass in a fake for testing. Saving a reference to that object on the instance for later use is a separate operation from creating the connection itself. They shouldn't be bound together in __init__ where you have to take them both or give up on using Microblog.

Alternatives

You might notice that these three observations I've made all sound a bit negative. You might conclude that I think this is an antipattern to be avoided. If so, feel free to give yourself a pat on the back at this point.

But if this is an antipattern, is there a pattern to use instead? I think so. I'll try to explain it.

The general idea behind the pattern I'm going to suggest comes in two parts. The first part is that your object should primarily be about representing state and your __init__ method should be about accepting that state from the outside world and storing it away on the instance being initialized for later use. It should always represent complete, internally consistent state - not partial state as asynchronous initialization implies. This means your __init__ methods should mostly look like this:


class Microblog(object):
    def __init__(self, cache_dir, database_connection):
        self.cache_dir = cache_dir
        self.database_connection = database_connection

If you think that looks boring - yes, it does. Boring is a good thing here. Anything exciting your __init__ method does is probably going to be the cause of someone's bad day sooner or later. If you think it looks tedious - yes, it does. Consider using Hynek Schlawack's excellent attrs package (full disclosure - I contributed some ideas to attrs' design and Hynek occasionally says nice things about me (I don't know if he means them, I just know he says them)).

The second part of the idea is an acknowledgement that asynchronous initialization is a reality of programming with asynchronous tools. Fortunately __init__ isn't the only place to put code. Asynchronous factory functions are a great way to wrap up the asynchronous work sometimes necessary before an object can be fully and consistently initialized. Put another way:


import asyncio

class Microblog(object):
    # ... __init__ as above ...

    @classmethod
    @asyncio.coroutine
    def from_database(cls, cache_dir, database_path):
        # ... or make it a free function, not a classmethod, if you prefer
        loop = asyncio.get_event_loop()
        database_connection = yield from loop.run_in_executor(
            None, cls._reading_init)
        return cls(cache_dir, database_connection)

Notice that the setup work for a Microblog instance is still asynchronous but initialization of the Microblog instance is not. There is never a time when a Microblog instance is hanging around partially ready for action. There is setup work and then there is a complete, usable Microblog.
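
For symmetry with the earlier Twisted snippet, a rough sketch of the same factory pattern in Twisted might look like this (my adaptation, not code from the original post; open_database stands in for whatever blocking setup is needed):

from twisted.internet.defer import inlineCallbacks, returnValue
from twisted.internet.threads import deferToThread

class Microblog(object):
    # ... __init__ as above ...

    @classmethod
    @inlineCallbacks
    def from_database(cls, cache_dir, database_path):
        # Run the blocking setup in a thread, then build a fully
        # initialized instance from the result:
        database_connection = yield deferToThread(open_database, database_path)
        returnValue(cls(cache_dir, database_connection))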

This addresses the three observations I made above:

  • Methods of Microblog never need to concern themselves with worries about whether the instance has been completely initialized yet or not.
  • Nothing happens in Microblog.__init__. If Microblog has some methods which depend on instance attributes, any of those attributes can be set after __init__ is done and before those other methods are called. If the from_database constructor proves insufficiently flexible, it's easy to introduce a new constructor that accounts for the new requirements (named constructors mean never having to overload __init__ for different competing purposes again).
  • It's easy to treat a Microblog instance as an inert lump of state. Simply instantiating one (using Microblog(...)) has no side-effects. The special extra operations required if one wants the more convenient constructor are still available - but elsewhere, where they won't get in the way of unit tests and unplanned-for uses.

I hope these points have made a strong case for one of these approaches being an anti-pattern to avoid (in Twisted, in asyncio, or in any other asynchronous programming context) and for the other being a useful pattern that provides convenient, expressive constructors while keeping object initializers unsurprising and maximally useful.

by Jean-Paul Calderone (noreply@blogger.com) at September 26, 2018 11:39 PM

September 21, 2018

Itamar Turner-Trauring

Never use the word "User" in your code

You’re six months into a project when you realize a tiny, simple assumption you made at the start was completely wrong. And now you need to fix the problem while keeping the existing system running—with far more effort than it would’ve taken if you’d just gotten it right in the first place.

Today I’d like to tell you about one common mistake, a single word that will cause you endless trouble. I am speaking, of course, about “users”.

There are two basic problems with this word:

  1. “User” is almost never a good description of your requirements.
  2. “User” encourages a fundamental security design flaw.

The concept “user” is dangerously vague, and you will almost always be better off using more accurate terminology.

You don’t have users

To begin with, no software system actually has “users”. At first glance “user” is a fine description, but once you look a little closer you realize that your business logic actually has more complexity than that.

We’ll consider three examples, starting with an extreme case.

Airline reservation systems don’t have “users”

I once worked on the access control logic for an airline reservation system. Here’s a very partial list of the requirements:

  • Travelers can view their booking through the website if they have the PNR locator.
  • Purchasers can modify the booking through the website if they have the last 4 digits of the credit card number.
  • Travel agents can see and modify bookings made through their agency.
  • Airline check-in agents can see and modify bookings based on their role and airport, given identifying information from the traveler.

And so on and so forth. Some of the basic concepts that map to humans are “Traveler”, “Agent” (the website might also be an agent), and “Purchaser”. The concept of “user” simply wasn’t useful, and we didn’t use the word at all—in many requests, for example, we had to include credentials for both the Traveler and the Agent.

Unix doesn’t have “users”

Let’s take a look at a very different case. Unix (these days known as POSIX) has users: users can log in and run code. That seems fine, right? But let’s take a closer look.

If we actually go through all the things we call users, we have:

  • Human beings who log in via a terminal or graphical UI.
  • System services (like mail or web servers) that also run as “users”, e.g. nginx might run as the httpd user.
  • On servers, there are often administrative accounts shared by multiple humans who SSH in using this “user” (e.g. ubuntu is the default SSH account on AWS VMs running Ubuntu).
  • root, which isn’t quite the same as any of the above.

These are four fairly different concepts, but in POSIX they are all “users”. As we’ll see later on, smashing all these concepts into one vague concept called “user” can lead to many security problems.

But operationally, we don’t even have a way to say “only Alice and Bob can login to the shared admin account” within the boundaries of the POSIX user model.

SaaS providers don’t have “users”

Jeremy Green recently tweeted about the user model in Software-as-a-Service, and that is what first prompted me to write this post. His basic point is that SaaS services virtually always have:

  1. A person at an organization who is paying for the service.
  2. One or more people from that organization who actually use the service, together.

If you combine these into a single “User” at the start, you will be in a world of pain later. You can’t model teams, you can’t model payment for multiple people at once—and now you need to retrofit your system. Now, you could learn this lesson for the SaaS case, and move on with your life.

But this is just a single instance of a broader problem: the concept “User” is too vague. If you start out being suspicious of the word “User”, you are much more likely to end up realizing you actually have two concepts at least: the Team (unit of payment and ownership) and the team Members (who actually use the service).
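
As a toy sketch of that split (the names and fields here are invented for illustration, not a prescription):

from dataclasses import dataclass, field
from typing import List

@dataclass
class Member:
    # A person who actually uses the service.
    name: str
    email: str

@dataclass
class Team:
    # The unit of payment and ownership; the Team, not a Member, is billed.
    name: str
    billing_email: str
    members: List[Member] = field(default_factory=list)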

“Users” as a security problem

The word “users” isn’t just a problem for business logic: it also has severe security consequences. The word “user” is so vague that it conflates two fundamentally different concepts:

  • A human being.
  • Their representation within the software.

To see why this is a problem, let’s say you visit a malicious website which hosts an image that exploits a buffer overflow in your browser. The remote site now controls your browser, and starts uploading all your files to their server. Why can it do that?

Because your browser is running as your operating system “user”, which is presumed to be identical to you, a human being, a very different kind of “user”. You, the user, don’t want to upload those files. The operating system account, also the user, can upload those files, and since your browser is running under your user all its actions are presumed to be what you intended.

This is known as the Confused Deputy Problem. It’s a problem that’s much more likely to be part of your design if you’re using the word “user” to describe two fundamentally different things as being the same.

The value of up-front design

The key to being a productive programmer is getting the same work done with less effort. Using vague terms like “user” to model your software will take huge amounts of time and effort to fix later on. It may seem productive to start coding immediately, but it’s actually just the opposite.

Next time you start a new software project, spend a few hours up-front nailing down your terminology and concepts: you still won’t get it exactly right, but you’ll do a lot better. Your future self will thank you for all the wasteful workaround work you’ve prevented.




September 21, 2018 04:00 AM

September 10, 2018

Itamar Turner-Trauring

Work/life balance and challenging work: you can have both

You want to work on cutting edge technology, you want challenging problems, you want something interesting. Problem is, you also want work/life balance: you don’t want to deal with unrealistic deadlines from management, or pulling all-nighters to fix a bug.

And the problem is that when you ask around, people say you need to work long hours if you want to work on challenging problems. That’s just how it is, they say.

To which I say: bullshit.

You can work on challenging problems and still have work/life balance. In fact, you’ll do much better that way.

My apparently impossible career so far

Just as a counter-example, let me tell you how I’ve spent the past 14 years. Among other things, I’ve worked on:

  • A component of the flight search product that now powers Google Flights (flight search is hard—my team was working on the stuff on slides 44-48).
  • The prototype for what was then cutting edge container storage technology, a prototype that helped my company raise a $12 million Series A—and then we turned it into a production ready distributed system.
  • A crazy/useful Kubernetes local development tool.
  • Most recently, scientific image processing algorithms and processing pipeline.

All of these were hard problems, and interesting problems, and challenging problems, and none of them required working long hours.

Maybe those past 14 years are some sort of statistical aberration, but I rather doubt it. You can, for example, go work on some really tricky distributed systems problems over at Cockroach Labs, and have Fridays off to do whatever you want. (Not a personal endorsement: I know nothing about them other than those two points.)

Long hours have nothing to do with interesting problems

There is no inherent relationship between interesting problems and working long hours. You’re actually much more likely to solve hard problems if you’re well rested, and have plenty of time off to relax and let your brain do its thing off in the background.

The real origin of this connection is a marketing strategy for a certain subset of startups: “Yes, we’ll pay you jack shit and have you work 70 hours a week, but that’s the only way you can work on challenging problems!”

This is nonsense.

The real problem that these companies are trying to solve is “how do I get as much work out of these suckers with as little pay as possible.” It’s an incompetent, self-defeating strategy, but there are enough VCs who think exploitation is a great business model that you’re going to encounter it at least at some startups.

The reality is that working long hours is the result of bad management. Which is to say, it’s completely orthogonal to how interesting the problem is.

You can just as easily find bad management in enterprise companies working on the most pointless and mind-numbingly soul-crushing problems (and failing to implement them well). And because of that bad management you’ll be forced to work long hours, even though the problems aren’t hard.

Luckily, you can also find good management in plenty of organizations, big and small—and some of them are working on hard, challenging problems too.

Avoiding bad workplaces

So how do you avoid exploitative workplaces and find the good ones? By asking some questions up front. You shouldn’t be relying on luck to keep you away from bad jobs; I made that mistake once, but never again.

Long ago I was interviewing for a job in NYC, and I mentioned that I wanted to continue working on open source software in my spare time. Here’s how the rest of the conversation went:

Interviewer: “Well, that’s fine, but… we used to have an employee here who did some non-profit work. We could never tell if their mind was here or on their volunteering, and it didn’t really work out. So we want to make sure you’ll be really focused on your job.”

Me: “Did they do their volunteering during work hours?”

Interviewer: “Oh, no, they only did that on their own time, it was just that they left at 5 o'clock every day.”

At that point I realized that, while I was willing to exchange 40 hours a week for a salary, I was not willing to exchange my whole life. I escaped that company by accident because they were so blatant about it, but you can do better.

Finding the job you want

When you’re interviewing for a job, don’t just ask about the problems they’re working on. You should also be asking about the work environment and work/life balance.

You can do so tactfully and informatively by asking things like “What’s a typical work day like here?” or “How are deadlines determined?” (You can get a good list of questions over at Culture Queries.)

There are companies out there that do interesting work and have work/life balance: do your research, ask the right questions, and you too will be able to find them.




September 10, 2018 04:00 AM

September 04, 2018

Itamar Turner-Trauring

Stabbing yourself with a fork() in a multiprocessing.Pool full of sharks

It’s time for another deep-dive into Python brokenness and the pain that is POSIX system programming, this time with exciting and not very convincing shark-themed metaphors! Most of what you’ll learn isn’t really Python-specific, so stick around regardless and enjoy the sharks.

Let’s set the metaphorical scene: you’re swimming in a pool full of sharks. (The sharks are a metaphor for processes.)

Next, you take a fork. (The fork is a metaphor for fork().)

You stab yourself with the fork. Stab stab stab. Blood starts seeping out, the sharks start circling, and pretty soon you find yourself—dead(locked) in the water!

In this journey through space and time you will encounter:

  • A mysterious failure wherein Python’s multiprocessing.Pool deadlocks, mysteriously.
  • The root of the mystery: fork().
  • A conundrum wherein fork() copying everything is a problem, and fork() not copying everything is also a problem.
  • Some bandaids that won’t stop the bleeding.
  • The solution that will keep your code from being eaten by sharks.

Let’s begin!

Introducing multiprocessing.Pool

Python provides a handy module that allows you to run tasks in a pool of processes, a great way to improve the parallelism of your program. (Note that none of these examples were tested on Windows; I’m focusing on the *nix platform here.)

from multiprocessing import Pool
from os import getpid

def double(i):
    print("I'm process", getpid())
    return i * 2

if __name__ == '__main__':
    with Pool() as pool:
        result = pool.map(double, [1, 2, 3, 4, 5])
        print(result)

If we run this, we get:

I'm process 4942
I'm process 4943
I'm process 4944
I'm process 4942
I'm process 4943
[2, 4, 6, 8, 10]

As you can see, the double() function ran in different processes.

Some code that ought to work, but doesn’t

Unfortunately, while the Pool class is useful, it’s also full of vicious sharks, just waiting for you to make a mistake. For example, the following perfectly reasonable code:

import logging
from threading import Thread
from queue import Queue
from logging.handlers import QueueListener, QueueHandler
from multiprocessing import Pool

def setup_logging():
    # Logs get written to a queue, and then a thread reads
    # from that queue and writes messages to a file:
    _log_queue = Queue()
    QueueListener(
        _log_queue, logging.FileHandler("out.log")).start()
    logging.getLogger().addHandler(QueueHandler(_log_queue))

    # Our parent process is running a thread that
    # logs messages:
    def write_logs():
        while True:
            logging.error("hello, I just did something")
    Thread(target=write_logs).start()

def runs_in_subprocess():
    print("About to log...")
    logging.error("hello, I did something")
    print("...logged")

if __name__ == '__main__':
    setup_logging()

    # Meanwhile, we start a process pool that writes some
    # logs. We do this in a loop to make race condition more
    # likely to be triggered.
    while True:
        with Pool() as pool:
            pool.apply(runs_in_subprocess)

Here’s what the program does:

  1. In the parent process, log messages are routed to a queue, and a thread reads from the queue and writes those messages to a log file.
  2. Another thread writes a continuous stream of log messages.
  3. Finally, we start a process pool, and log a message in one of the child subprocesses.

If we run this program on Linux, we get the following output:

About to log...
...logged
About to log...
...logged
About to log...
<at this point the program freezes>

Why does this program freeze?

How subprocesses are started on POSIX (the standard formerly known as Unix)

To understand what’s going on you need to understand how you start subprocesses on POSIX (which is to say, Linux, BSDs, macOS, and so on).

  1. A copy of the process is created using the fork() system call.
  2. The child process replaces itself with a different program using the execve() system call (or one of its variants, e.g. execl()).

The thing is, there’s nothing preventing you from just doing fork(). For example, here we fork() and then print the current process’ process ID (PID):

from os import fork, getpid

print("I am parent process", getpid())
if fork():
    print("I am the parent process, with PID", getpid())
else:
    print("I am the child process, with PID", getpid())

When we run it:

I am parent process 3619
I am the parent process, with PID 3619
I am the child process, with PID 3620

As you can see both parent (PID 3619) and child (PID 3620) continue to run the same Python code.

Here’s where it gets interesting: fork()-only is how Python creates process pools by default.

The problem with just fork()ing

So OK, Python starts a pool of processes by just doing fork(). This seems convenient: the child process has access to a copy of everything in the parent process’ memory (though the child can’t change anything in the parent anymore). But how exactly is that causing the deadlock we saw?

The cause is two problems with continuing to run code after a fork()-without-execve():

  1. fork() copies everything in memory.
  2. But it doesn’t copy everything.

fork() copies everything in memory

When you do a fork(), it copies everything in memory. That includes any globals you’ve set in imported Python modules.

For example, your logging configuration:

import logging
from multiprocessing import Pool
from os import getpid

def runs_in_subprocess():
    logging.info(
        "I am the child, with PID {}".format(getpid()))

if __name__ == '__main__':
    logging.basicConfig(
        format='GADZOOKS %(message)s', level=logging.DEBUG)

    logging.info(
        "I am the parent, with PID {}".format(getpid()))

    with Pool() as pool:
        pool.apply(runs_in_subprocess)

When we run this program, we get:

GADZOOKS I am the parent, with PID 3884
GADZOOKS I am the child, with PID 3885

Notice how child processes in your pool inherit the parent process’ logging configuration, even if that wasn’t your intention! More broadly, anything you configure on a module level in the parent is inherited by processes in the pool, which can lead to some unexpected behavior.

But fork() doesn’t copy everything

The second problem is that fork() doesn’t actually copy everything. In particular, one thing that fork() doesn’t copy is threads. Any threads running in the parent process do not exist in the child process.

from threading import Thread, enumerate
from os import fork
from time import sleep

# Start a thread:
Thread(target=lambda: sleep(60)).start()

if fork():
    print("The parent process has {} threads".format(
        len(enumerate())))
else:
    print("The child process has {} threads".format(
        len(enumerate())))

When we run this program, we see the thread we started didn’t survive the fork():

The parent process has 2 threads
The child process has 1 threads

The mystery is solved

Here’s why that original program is deadlocking—with their powers combined, the two problems with fork()-only create a bigger, sharkier problem:

  1. Whenever the thread in the parent process writes a log message, it adds it to a Queue. That involves acquiring a lock.
  2. If the fork() happens at the wrong time, the lock is copied in an acquired state.
  3. The child process copies the parent’s logging configuration—including the queue.
  4. Whenever the child process writes a log message, it tries to write it to the queue.
  5. That means acquiring the lock, but the lock is already acquired.
  6. The child process now waits for the lock to be released.
  7. The lock will never be released, because the thread that would release it wasn’t copied over by the fork().

In simplified form:

from os import fork
from threading import Lock

# Lock is acquired in the parent process:
lock = Lock()
lock.acquire()

if fork() == 0:
    # In the child process, try to grab the lock:
    print("Acquiring lock...")
    lock.acquire()
    print("Lock acquired! (This code will never run)")

Band-aids and workarounds

There are some workarounds that could make this a little better.

For module state, the logging library could have its configuration reset when child processes are started by multiprocessing.Pool. However, this doesn’t solve the problem for all the other Python modules and libraries that set some sort of module-level global state. Every single library that does this would need to fix itself to work with multiprocessing.

For threads, locks could be set back to released state when fork() is called (Python has a ticket for this.) Unfortunately this doesn’t solve the problem with locks created by C libraries, it would only address locks created directly by Python. And it doesn’t address the fact that those locks don’t really make sense anymore in the child process, whether or not they’ve been released.
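
For what it’s worth, Python 3.7 added os.register_at_fork(), a hook for exactly this kind of band-aid. Here’s a sketch of resetting a module-level lock in the child (my example; it only helps with locks you know about):

import os
import threading

lock = threading.Lock()

def _reset_lock_in_child():
    # Give the child a fresh, unlocked lock instead of the
    # possibly-acquired copy inherited from the parent:
    global lock
    lock = threading.Lock()

os.register_at_fork(after_in_child=_reset_lock_in_child)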

Luckily, there is a better, easier solution.

The real solution: stop plain fork()ing

In Python 3 the multiprocessing library added new ways of starting subprocesses. One of these does a fork() followed by an execve() of a completely new Python process. That solves our problem, because module state isn’t inherited by child processes: it starts from scratch.

Enabling this alternate configuration requires changing just two lines of code in your program:

from multiprocessing import get_context

def your_func():
    with get_context("spawn").Pool() as pool:
        # ... everything else is unchanged

That’s it: do that and all the problems we’ve been going over won’t affect you. (See the documentation on contexts for details.)
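
Applied to the very first Pool example above, the fix looks like this (same behavior, just an explicit start method):

from multiprocessing import get_context
from os import getpid

def double(i):
    print("I'm process", getpid())
    return i * 2

if __name__ == '__main__':
    # "spawn" starts a fresh Python process instead of fork()ing:
    with get_context("spawn").Pool() as pool:
        result = pool.map(double, [1, 2, 3, 4, 5])
        print(result)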

But this still requires you to do the work. And it requires every Python user who trustingly follows the examples in the documentation to get confused why their program sometimes breaks.

The current default is broken, and in an ideal world Python would document that, or better yet change it to no longer be the default.

Learning more

My explanation here is of course somewhat simplified: for example, there is state other than threads that fork() doesn’t copy. Here are some additional resources:

Stay safe, fellow programmers, and watch out for sharks and bad interactions between threads and processes! 🦈🦑

(Want more stories of software failure? I write a weekly newsletter about 20+ years of my mistakes as a programmer.)

Thanks to Terry Reedy for pointing out the need for if __name__ == '__main__'.




September 04, 2018 04:00 AM

September 03, 2018

Moshe Zadka

Managing Dependencies

(Thanks to Mark Rice for his helpful suggestions. Any mistakes or omissions that remain are my responsibility.)

Some Python projects are designed to be libraries, consumed by other projects. These are most of the things people consider "Python projects": for example, Twisted, Flask, and most other open source tools. However, things like mu are sometimes installed as an end-user artifact. More commonly, many web services are written as deployable Python applications. A good example is the issue tracking project trac.

Projects that are deployed must be deployed with their dependencies, and with the dependencies of those dependencies, and so forth. Moreover, at deployment time, a specific version must be deployed. If a project declares a dependency of flask>=1.0.1, for example, something needs to decide whether to deploy flask 1.0.1 or flask 1.0.2.

For clarity, in this text, we will refer to the declared compatibility statements in something like setup.py (e.g., flask>=1.0.1) as "intent" dependencies, since they document programmer intent. The specific dependencies that are eventually deployed will be referred as the "expressed" dependencies, since they are expressed in the actual deployed artifact (for example, a Docker image).

Usually, "intent" dependencies are defined in setup.py. This does not have to be the case, but it almost always is: since there is usually some "glue" code at the top, keeping everything together, it makes sense to treat it as a library -- albeit, one that sometimes is not uploaded to any package index.

When producing the deployed artifact, we need to decide how to generate the expressed dependencies. There are two competing forces. One is the desire to be current: using the latest version of Django means getting all the latest bug fixes, and means that picking up fixes for future bugs will require jumping across fewer versions. The other is the desire to avoid changes: when deploying a small bug fix, changing all library versions to the newest ones might introduce a lot of change.

For this reason, most projects will check the "artifact" (often called requirements.txt) into source control, produce actual deployed versions from it, and adopt some procedure for updating it.

A similar story can be told about the development dependencies, often defined as extra [dev] dependencies in setup.py, and resulting in a file dev-requirements.txt that is checked into source control. The pressures are a little different, and indeed, sometimes nobody bothers to check in dev-requirements.txt even when checking in requirements.txt, but the basic dynamic is similar.

The worst procedure is probably "when someone remembers to". This is not usually anyone's top priority, and most developers are busy with their regular day-to-day tasks. When an upgrade does become necessary -- for example, because a bug fix is available -- this can mean a lot of disruption. Often the disruption manifests as discovering that upgrading just one library does not work: it now depends on newer libraries, so the entire dependency graph has to be updated, all at once. All the intermediate deprecation warnings that might have appeared over several months have been skipped over, and developers are suddenly faced with several breaking upgrades at once. The size of the change only grows with time, and becomes less and less surmountable, making it less and less likely that it will be done, until it ends in complete bitrot.

Sadly, however, "when someone remembers to" is the default procedure in the absence of any explicit procedure.

Some organizations, having suffered through the disadvantages of "when someone remembers to", decide to go to the other extreme: never checking requirements.txt in at all, and generating it on every artifact build. However, this causes a lot of unnecessary churn: it becomes impossible to fix a small bug without first making sure that the code is compatible with the latest versions of all libraries.

A better way to approach the problem is to have an explicit process of recalculating the expressed dependencies from the intent dependencies. One approach is to manufacture, with some cadence, code change requests that update the requirements.txt. This means they are resolved like all code changes: review, running automated tests, and whatever other local processes are implemented.

Another is to do those on a calendar-based event. This can be anything from a strongly-encouraged "update Monday", where on Monday morning one of a developer's tasks is to generate requirements.txt updates for all projects they are responsible for, to including it as part of a time-based release process: for example, generating it on a cadence that aligns with agile "sprints", as part of releasing the code changes in a particular sprint.

When updating does reveal an incompatibility it needs to be resolved. One way is to update the local code: this certainly is the best thing to do when the problem is that the library changed an API or changed an internal implementation detail that was being used accidentally (...or intentionally). However, sometimes the new version has a bug in it that needs to be fixed. In that case, the intent is now to avoid that version. It is best to express the intent exactly as that: !=<bad version>. This means when an even newer version is released, hopefully fixing the bug, it will be used. If a new version is released without the bug fix, we add another != clause. This is painful, and intentionally so. Either we need to get the bug fixed in the library, stop using the library, or fork it. Since we are falling further and further behind the latest version, this is introducing risk into our code, and the increasing != clauses will indicate this pain: and encourage us to resolve it.
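
To make the != idea concrete, here is a sketch of what the intent dependencies might look like in setup.py (the project and library names are invented):

# setup.py -- "intent" dependencies, including exclusions for
# known-bad releases, as described above.
from setuptools import setup, find_packages

setup(
    name="myservice",  # hypothetical project
    packages=find_packages(),
    install_requires=[
        "flask>=1.0.1",
        "somelib>=2.3,!=2.4.0,!=2.4.1",  # 2.4.0/2.4.1 have a bug we hit
    ],
)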

The most important thing is to choose a specific process for updating the expressed dependencies, clearly document it and consistently follow it. As long as such a process is chosen, documented and followed, it is possible to avoid the bitrot issue.

by Moshe Zadka at September 03, 2018 03:00 AM

August 22, 2018

Itamar Turner-Trauring

Guest Post: How to engineer a raise

You’ve discovered you’re underpaid. Maybe you found out a new hire is making more than you. Or maybe you’ve been doing a great job at work, but your compensation hasn’t changed.

Whatever the reason, you want to get a higher salary.

Now what?

To answer that question, the following guest post by Adrienne Bolger will explain how you can negotiate a raise at your current job. As you’ll see, she’s successfully used these strategies to negotiate 20-30% raises on multiple occasions.

This article will answer some common questions, and explain some useful strategies, to help you—a software engineer—engineer a raise from your employer. I’ll cover:

  1. Researching your worth and options.
  2. Expectation setting.
  3. Strategies that I have used—and helped others use—to ask for a raise.

How much are you “worth”?

At the end of the day, an optimized salary in a more-or-less capitalist market is the highest salary you think you can get that passes the “laugh test.” If you ask for a salary or bonus, and your (theoretical) boss or HR head laughs in your face, then the number is too high.

Note that this number isn’t your laugh test number: many people, out of fear of rejection, shy away from asking for a 25% raise and ask for a more “modest”-sounding 5% raise instead. But sometimes 25% is the right increase! Your number should not be based on fear: it should be based on research.

There are several ways to calculate your “market value” to an employer. To start, take 2 or 3 online salary quizzes to calculate median/mean salaries based on your demographics.

How much could you be worth in the future?

Take the surveys a second time. However, this time, give yourself a small imaginary promotion: 2 years more experience and the next job title you want—Senior Engineer, Engineer II, Software Architect, Engineering Manager, Director, whatever it is. How far away is that yearly salary amount from the first one? A little? A lot?

This is an important number, because the pay market for software engineers is not linear. Check out this graph created by ArsTechnica from the 2017 Stack Overflow salary data.

This graph shows the economics of a very hot job market: people with relatively little experience still make a good living, because their skills are in high demand. However, the median salary for a developer between 15 and 20 years of experience is completely flat. This isn’t the best news for experienced developers who haven’t kept learning (and some languages pay more than others), but for early career professionals, this external market factor is fantastic.

With data to back you up, you can ask for a 20 to 30% raise after only a year or two on the job with a completely straight face. I did it in my own career at the 2 and 4 year marks at the same company, and received the raise both times.

Adjust expectations for your company and industry

If you’ve come to the conclusion you are underpaid because you know what your colleagues earn, then you can skip this step. Otherwise, you have a little more research to do.

Ask your company’s HR department and recruiters: when hiring in general, does your company go for fair market prices, under-market bargains, or above-market talent? Industries like finance pay better than non-profits and civil service organizations whether you are an engineer or an accountant.

The bigger the company, the more likely you are to get standard yearly pay adjustments for things like cost-of-living expenses, but a bigger company is also likely to be more rigid about salary bands for a specific job title. HR at your company may or may not be willing to share the exact high and low range for a job title. If they are not, Glassdoor can provide a decent estimate for similarly sized companies.

When to ask

Again, know your company. Does it have a standard financial cycle, with cost-of-living and raises allocated yearly 1-2 months after reviews are in?

If so, time your “ask” before your formal review by 3-8 weeks. That might be November if your yearly reviews are in December, or it might be January if company yearly performance reviews occur in March, after the fiscal year results from last year are in.

Why do this?

The problem with waiting until a formal review is scheduled is that it ruins plans you can’t see or are not privy to. Even in the best case, where you were getting a raise anyway, the manager giving your review already has a planned number in their head and in their accounting software. Asking a month beforehand gives your boss time to budget your raise into a yearly plan, which is much easier than trying to fight bureaucracy out-of-cycle.

You should not ask for a raise more frequently than every 2 years. If you feel like you have to, then you probably didn’t ask for enough last time. Keep that in mind if you find yourself afraid to ask for as much as you’re worth.

If you are debating between asking for a raise and going job hunting because you feel undervalued, ask for the raise first. I suggest this because job searching is a huge time sink, especially if you don’t really want to change jobs.

You owe it to yourself to proactively seek happiness. If what you really want is more money and to stay at your current company, then give your employer a chance to make you happy. If you ask and are denied, then at least you’ve done all the research into compensation when you go looking.

How to ask

Ask for a raise both in writing and in person.

As email is still considered informal, this is one of those cases where an actual letter—printed out and hand-delivered at a scheduled meeting with your manager—is a good idea. The meeting gives you the chance to explain what you want a little more, but the letter is a written record of what you want that goes to HR, as well as a way to keep yourself from backing out due to nerves or stress.

I once requested a raise from a manager who (unbeknownst to me) was let go 2 weeks later. However, because my raise request was also in writing, I received the raise from my new boss with no confusion after the transfer.

The letter should be 2-3 paragraphs long and:

  • Be addressed and CC’d to your manager and HR at your company.
  • List your current length of service with the company and affirm that you like working there.
  • Detail exactly what you want: a 20% raise? A $5,000 raise? Tuition money for school? More vacation days? Do not leave it ambiguous.
  • Detail why you believe you deserve it, and back it up with available data:
    • Do you have more experience now?
    • Earned a degree?
    • Learned new skills or a new programming language?
    • Has it been 3 years since you got a review because you work at a 20 person startup?
  • The exception to the previous point is if you know you are underpaid because a coworker with the same responsibilities is paid more: it’s enough to say that in general terms. Calling a specific coworker out is unnecessary.
  • List, in 2 sentences or less, any recent accomplishments that were especially impactful. This serves 2 purposes: reminding your boss how awesome you are, but also making it easy on them to justify your (deserved) raise to the people they are accountable to at the next level up in the company.
  • End with a request for a meeting discussing the contents of the letter.
  • Be signed and dated.

The letter (and subsequent meeting) should not:

  • Imply you will leave if you don’t get what you want, even if you are planning on it. Bluffing here is a good way to get asked to leave anyway. Even if you are planning to leave if you don’t receive a raise, threats put people on the defensive.
  • Sound angry or imply you have an ungrateful or deficient manager/employer. Position yourself as asking for something a reasonable person should want to give you. Have the most gentle and peaceful individual you know read your letter to double check tone. If all else fails, try your local Buddhist monk.

The meeting

Once you ask for a raise and a meeting to talk about it, nerves may kick in. Do your homework ahead of time and come in prepared. Bring a copy of your letter and, during the meeting, re-iterate exactly what it is you want and why you deserve it.

It’s fine to be nervous, but do not attempt any weird “Hollywood caricature of a car salesman” negotiating tactics. Don’t be short-sighted; remember that you have to perform your day job with your manager once the meeting is over.

If your employer declines

If you asked for your “laugh test” number and your employer can only meet you halfway or can’t increase your compensation at all, your response should be “Why? And what can I do to change that?”

Be proactive in determining where the problem is. At a big company, if there’s a salary band, you may need a promotion before you can get the raise. If the company isn’t making enough money for raises for anyone, it may be time to discreetly look for another job anyway.

Whether or not you choose to accept a compromise or counteroffer is up to you—but make sure that you can live with your choice, at least short term, because it won’t make sense to ask again for another few months.

And that’s Adrienne’s post. I hope you found it useful: I certainly learned a lot from it.

Of course, reading this article isn’t enough. You still need to go and do the work to get the raise. So why not start today?

  1. Do your research.
  2. Pick the right moment.
  3. Go ask for that raise!



August 22, 2018 04:00 AM

August 16, 2018

Itamar Turner-Trauring

How to say "no" to your boss, your boss's boss, and even the CEO

You’ve got plenty of work to do already, when your boss (or their boss, or the CEO) comes by and asks you to do yet another task. If you take on yet another task you’re going to be working long hours, or delivering your code late, and someone is going to be unhappy.

You don’t want to say no to your boss (let alone the CEO!). You don’t want to say yes and spend your weekend working.

What do you do? How do you keep everyone happy?

What you need is your management to trust your judgment. If they did, you could focus on the important work, the work that really matters. And when you had to say “no”, your boss (or the CEO!) would listen and let you continue on your current task.

To get there, you don’t immediately say “no”, and don’t immediately say “yes”.

Here’s what you do instead:

  1. Start with your organizational and project goals.
  2. Listen and ask questions.
  3. Make a decision.
  4. Communicate your decision in terms of organizational and project goals.

Step 1: Start with your goals

If you want people to listen to you, you need a strong understanding of why you’re doing the work you’re doing.

  • What is your organization trying to achieve?
  • What is your project trying to achieve, and how does that connect to organizational goals?
  • How does your work connect to the project goals?

You should be able to connect your individual action to project success, and connect that to organizational success. For example, “Our goal is to increase recurring revenue, customer churn is too high and it’s decreasing revenue, so I am working on this bug because it’s making our product unusable for at least 7% of users.”

When you’re just starting out as an employee this can be difficult to do, but as you grow in experience you can and should make sure you understand this.

(Starting with your goals is useful in other ways as well, e.g. helping you stay focused).

Step 2: Listen and ask questions

Your lead developer/product manager/team mate/CEO/CTO has just stopped by your desk and given you a new task. No doubt you already have many existing tasks. How should you handle this situation?

To begin with, don’t immediately give an answer:

  • Don’t immediately say “yes”: Unless you happen to have no existing work, any new work you take on will slow down your existing work. Your existing work was chosen for a reason, and may well be more important than this new task.
  • Don’t immediately say “no”: There’s a reason you’re being asked to do this task. By immediately saying “no” you are devaluing the request, and by extension devaluing the person who asked you.

Instead of immediately agreeing or refusing to do the task, take the time to find out why the task needs to be done. Make sure you demonstrate you actually care about the request and are seriously considering it.

That means first, listening to what they have to say.

And second, asking some questions: why does this need to be done? What is the deadline? How important is it to them?

Sometimes the CEO will come by and ask for something they don’t really care about: they only want you to do it if you have the spare time. Sometimes your summer intern will come by and point out a problem that turns out to be a critical production-impacting bug.

You won’t know unless you listen, and ask questions to find out what’s really going on.

Step 3: Decide based on your goals

Is the new task more important to project and organizational goals than your current task? You should probably switch to working on it.

Is the new task less important? You don’t want to do it.

Not sure? Ask more questions.

Still not sure? Talk to your manager about it: “Can I get back to you in a bit about this? I need to talk this over with Jane.”

Step 4: Communicate your decision

Once you’ve made a decision, you need to communicate it in a meaningful, respectful way, and in a way that reflects organizational and project goals.

If you decided to take the task on:

  1. Tell the person asking you that you’ll take it on.
  2. Explain to the people who requested your previous tasks that those tasks will be late. Make sure it’s clear why you took on a new task: “That feature is going to have to wait: it’s fairly low on the priority list, and the CEO asked me to throw together a demo for the sales meeting on Friday.”

If you decided not to take it on:

  1. Explain why you’re not going to do it, in the context of project and organizational goals. “That’s a great feature idea, and I’d love to do it, but this bug is breaking the app for 10% of our customers and so I really need to focus on getting it done.”
  2. Provide an alternative, which can include:
    • Deflection: “Why don’t you talk to the product manager about this?”
    • Queuing: “Why don’t you add it to the backlog, and we can see if we have time to do it next sprint?”
    • Promise: “I’ll do it next, as soon as I’m done with my current task.”
    • Reminder: “Can you remind me again in a couple of weeks?”
    • Different solution: “Your original proposal would take me too long, given the release-blocker backlog, but maybe if we did this other thing instead I could fit it in. It seems like it would get us 80% of the functionality in a fraction of the time–what do you say?”

Becoming a more valuable employee

Saying “no” the right way makes you more valuable, because it ensures you’re working on important tasks.

It also ensures your managers know you’re more valuable, because you’ve communicated that:

  1. You’ve carefully and respectfully considered their request.
  2. You’ve taken existing requests you’re already working on into account.
  3. You’ve made a decision not based on personal whim, but on your best understanding of what is important to your project and organization.

Best of all, saying “no” the right way means no evenings or weekends spent working on tasks that don’t really need doing.




August 16, 2018 04:00 AM

August 10, 2018

Itamar Turner-Trauring

There's always more work to do—but you still don't need to work long hours

Have you ever wished you could reduce your working hours, or even just limit yourself to 40 hours a week, but came up against all the work that just needs doing? There’s always more work to do, always more bugs, always some feature that’s important to someone—

How can you limit yourself to 40 hours a week, let alone a shorter workweek, given all this work?

The answer: by planning ahead. And planning ahead the right way.

The wrong way to plan

I was interviewing for a job at a startup, and my first interviewer was the VP of Engineering. He explained that he’d read my blog posts about the importance of work/life balance, and he just wanted to be upfront about the fact they were working 50-60 hours each week. And this wasn’t a short-term emergency: in fact, they were going to be working long hours for months.

I politely noted that I felt good prioritization and planning could often reduce the need for long hours.

The VP explained the problem: they’d planned all their tasks in detail. But then—to their surprise—an important customer asked for more features, and that blew through their schedule, which is why they needed to work long hours.

I kept my mouth shut and went through the interview process. But I didn’t take the job.

Here’s what’s wrong with this approach:

  1. Important customers asking for more features should not be a surprise. Customers ask for changes, this is how it goes.
  2. More broadly, the original schedule was apparently created with the presumption that everything would go perfectly. In the real world nothing ever goes perfectly.
  3. When it became clear that there was too much work to do, their solution was to work longer hours, even though research suggests that longer hours do not increase output over the long term.

The better way: prioritization and padding

So how do you keep yourself from blowing through your schedule without working long hours?

  1. Prioritize your work.
  2. Leave some padding in your schedule for unexpected events.
  3. Set your deadlines shorter than they need to be.
  4. If you run out of time, drop the least important work.

1. Prioritize your work

Not all work is created equal. By starting with your goals, you can divide tasks into three buckets:

  1. Critical to your project’s success.
  2. Really nice to have—but not critical.
  3. Clearly not necessary.

Start by dropping the third category, and minimizing the second. You’ll have to say “no” sometimes, but if you don’t say “no” you’ll never get anything delivered on time.

2. Leave some padding in your schedule

You need to assume that things will go wrong and you’ll need extra time to do any given task. And you need to assume other important tasks will also become critical; you don’t know which, but this always happens. So never give your estimate as the actual delivery date: always pad it with extra time for unexpected difficulties and unexpected interruptions.

If you think a task will take a day, promise to deliver it in three days.

3. Set shorter deadlines for yourself

Your own internal deadline, the one you don’t communicate to your boss or customer, should be shorter than your estimate. If you think a task will take a day, try to finish it in less time.

Why?

  • You’ll be forced to prioritize even more.
  • With less time to waste on wrong approaches, you’ll be forced to spend more time upfront thinking about the best solution.

4. When you run out of time, drop the less important work

Inevitably things will still go wrong and you’ll find yourself running low on time. Now’s the time to drop all the nice-to-haves, and rethink whether everything you thought was critical really is (quite often, it’s not).

Long hours are the wrong solution

Whenever you feel yourself with too much work to do, go back and apply these principles: underpromise, limit your own time, prioritize ruthlessly. With practice you’ll learn how to deliver the results that really matter—without working long hours.

When you’ve reached that point, you can work a normal 40-hour workweek without worrying. Or even better, you can start thinking about negotiating a 3-day weekend.




August 10, 2018 04:00 AM

August 09, 2018

Hynek Schlawack

Hardening Your Web Server’s SSL Ciphers

There are many wordy articles on configuring your web server’s TLS ciphers. This is not one of them. Instead I will share a configuration which is both compatible enough for today’s needs and scores a straight “A” on Qualys’s SSL Server Test.

by Hynek Schlawack (hs@ox.cx) at August 09, 2018 06:00 PM

August 03, 2018

Moshe Zadka

Tests Should Fail

(Thanks to Avy Faingezicht and Donald Stufft for giving me encouragement and feedback. All mistakes that remain are mine.)

"eyes have they, but they see not" -- Psalms, 135:16

Eyes are expensive to maintain. They require protection from the elements, constant lubrication, behavioral adaptations to protect them and more. However, they give us a benefit. They allow us to see: to detect differences in the environment. Eyes register different signals when looking at an unripe fruit and when looking at a ripe fruit. This allows us to eat the ripe fruit, and wait for the unripe fruit to ripen: to behave differently, in a way that ultimately furthers our goals (eat yummy fruits).

If our eyes did not get different signals that influenced our behavior, they would not be cost effective. Evolution is a harsh mistress, and the eyes would be quickly gone if the signals from them were not valuable.

Writing tests is expensive. It takes time to write them, time to review them, time to modify them as code evolves. A test that never fails is like an eye that cannot see: it always sends the same signal, "eat that fruit!". In order to be valuable, a test must be able to fail, and that failure must modify our behavior.

The only way to be sure that a test can fail is to see it fail. Test-driven-development does it by writing tests that fail before modifying the code. But even when not using TDD, making sure that tests fail is important. Before checking in, break your code. Best of all is to break the code in a way that would be realistic for a maintenance programmer to do. Then run the tests. See them fail. Check it in to the branch, and watch CI fail. Make sure that this CI failure is clearly communicated: something big must be red, and merging should be impossible, or at least require using a clearly visible "override switch".
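As a tiny illustration (the function and test here are made up for the purpose), this is what "see it fail" looks like in practice:

def ripeness(color):
    """The code under test: classify a fruit by its color."""
    return "ripe" if color == "red" else "unripe"

def test_ripeness():
    assert ripeness("red") == "ripe"
    assert ripeness("green") == "unripe"

# Before checking in: break the code on purpose -- say, flip the
# comparison to color != "red" -- run the tests, and watch
# test_ripeness go red. If no realistic change makes it fail, the
# test always sends the same signal: an eye that cannot see.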

If there is no code modification that makes the test fail, or if such a code modification is weird or unrealistic, it is not a good test. If a test failure does not halt the CI with a visible message, it is not a good CI. These are false gods, with eyes that do not see, and mouths that do not speak.

Real tests have failures.

by Moshe Zadka at August 03, 2018 05:30 AM

Thank you, Guido

When I was in my early 20s, I was OK at programming, but I definitely didn't like it. Then, one evening, I read the Python tutorial. That evening changed my mind. I woke up the next morning, like Neo in The Matrix, and knew Python.

I was doing statistics at the time. Python, with Numeric, was a powerful tool. It definitely could do things that SPSS could only dream about. Suddenly, something happened that had never happened before -- I started to enjoy programming.

I had to spend six years in the desert of programming in languages that were not Python, before my workplace, and soon afterwards the world, realized what an amazing tool Python is. I have not had to struggle to find a Python position since.

I started with Python 1.4. I grew up with Python. Now I am...no longer in my 20s, and Python version 3.7 was recently released.

I owe much of my career, many of my friends, and much of my hobby time to that one evening, sitting down and reading the Python tutorial -- and to the man who made the language and wrote the first version of that tutorial, Guido van Rossum.

Python, like all open source projects, like, indeed, all software projects, is not a one man show. A whole team, with changing personnel, works on core Python and its ecosystem. But it was all started by Guido.

As Guido is stepping down to take a less active role in Python's future, I want to offer my eternal gratitude. For my amazing career, for my friends, for my hobby. Thank you, Guido van Rossum. Your contribution to humanity, and to this one human in particular, is hard to overestimate.

by Moshe Zadka at August 03, 2018 04:30 AM

July 29, 2018

Itamar Turner-Trauring

Bad at whiteboard puzzles? You can still get a programming job

Practicing algorithm puzzles stresses you out: just looking at a copy of Cracking the Coding Interview makes you feel nervous.

Interviewing is worse. When you do interview you freeze up: you don’t have IDE error checking and auto-completion, you can’t use a search engine like a real programmer would, there’s a stranger staring you down. You screw up, you make typos, you don’t know what to say, you make a bad impression.

If this happens to you, it’s not your fault! Whiteboard puzzles are a bad way to hire programmers.

They’re not realistic: unless you’re Jeff Goldblum haxoring the alien mothership’s computer just in time for Will Smith to blow up some invaders, you’re probably not coding on a 5-minute deadline.

And the skills they’re testing aren’t used by 95% of programmers 95% of the time. I recently had to do a graph traversal in dependency order—which meant I was all prepared to dig out my algorithms textbook from college. But then I found this library already had a utility called toposort, and vague memories of classes 19 years ago reminded me that this was called a “topological sort”. I didn’t actually have to implement it, but if I had, I would have done it with textbook in hand, over the course of a couple of hours (gotta write tests!).

Unfortunately, many companies still use them, and you need a job. A programming job. What should you do?

Here are some ideas to help you find a job—even if you hate whiteboard puzzles.

1. Interview at companies with a better process

Not all companies do on-the-spot programming puzzles. The last three companies I worked at didn't—one had a take-home exercise that wasn’t about algorithms (a decision I was involved in, and which I now regret because of the burden it puts on people with no free time). The other two just had talking interviews: I talked about myself, they talked about the company, all very relaxed and civilized.

To find such companies:

  1. Here’s one list of 500+ companies that don’t do whiteboard puzzles.
  2. The invaluable Key Values job board also tells you about the interview process at the covered companies (see the column on the right when looking at a particular company).

2. Offer an alternative

If you are interviewing at a company with whiteboard puzzles, you don’t have to accept their process without pushing back. Just like your salary and working hours, the interview process is also something you can negotiate.

If you have some code you’ve written that you’re particularly proud of and have the ability to share, ask the company if you can share it with them in lieu of a whiteboard puzzle. I once made the mistake of only suggesting this during the interview, and the guy who was interviewing me said he would have accepted it if I’d asked earlier. So make sure to suggest this before the day of the interview, so they have time to review the code in advance.

3. Take control of the process

If all else fails and you’re stuck doing a puzzle, there are ways to take control of the process and make a good impression, even if the puzzle is too hard for you. I cover this in more detail in another post.

4. Don’t give up

Finally, remember whiteboard puzzles have nothing to do with actual programming, even when the work you’re doing is algorithmic. They’re a hazing ritual you may be forced to go through, but they in no way reflect on your ability as a programmer.




July 29, 2018 04:00 AM

July 13, 2018

Twisted Matrix Laboratories

Twisted 18.7.0 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 18.7!

The highlights of this release are:
  • better support for async/await coroutines with regard to exception and traceback handling;
  • better support for reporting tracebacks in inlineCallbacks, now showing what you would expect in synchronous-like code;
  • the epoll reactor no longer hard-locks when running out of file descriptors;
  • directory rendering in t.web works on Python 2 again;
  • manhole's colouriser is better at handling Unicode;
  • groundwork for Python 3.7 support. Note that Python 3.7 is currently not a supported platform on any operating system, and may completely fail to install, especially on Windows.
For more information, check the NEWS file (link provided below).

You can find the downloads at <https://pypi.python.org/pypi/Twisted> (or alternatively <http://twistedmatrix.com/trac/wiki/Downloads>). The NEWS file is also available at <https://github.com/twisted/twisted/blob/twisted-18.7.0/NEWS.rst>.

Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

Twisted Regards,
Amber Brown (HawkOwl)

by Amber Brown (noreply@blogger.com) at July 13, 2018 07:17 PM

July 10, 2018

Itamar Turner-Trauring

How to be judged by your output, not your time in the office

If you want to limit your working hours as a programmer you need to keep your boss happy. But what can you do if your boss seems to care more about your time in the office than how much you produce?

Not to mention the comparison to your other coworkers. If they fix ten bugs a week, but you only fix two, your boss might not be happy—even if the bugs you fixed had far more impact, and were much harder to address.

If you’re stuck in this situation it may seem impossible to reduce your working hours. But before you give up and start looking for another job, it’s worth seeing if you can improve the situation where you are.

Two reasons your boss might like long hours

If you want to be judged by your output rather than your hours, you need a manager who cares about the organization’s or team’s goals. There are two possibilities for how your boss is thinking:

Hours as proxy: Your boss cares about achieving goals, and is using hours as a mental shorthand for value produced. If you actually are a valuable worker, and make sure they know it, they won’t notice your working hours, as in this real occurrence:

A programmer I know was having a conversation with their manager when the manager mentioned, in an offhand manner, that the company expected people to work 50 hours a week.

This programmer had always worked 40-45 hours a week, and the manager had never complained or noticed, because the programmer did good work. So the programmer kept their mouth shut, didn’t comment, and kept on working their usual hours.

Hours as goal: Your boss may truly only care about hours in the office, number of bugs fixed, or some other irrelevant measure. Which is to say, they’re incompetent. In this case the suggestion that follows won’t work, unless perhaps you can bypass your boss and reach someone who does care about organizational goals. Usually a job with a different team or organization will serve you better.

Assuming your boss only uses hours as a proxy measure, let’s see what you can do.

Starting with goals

It’s 3PM on a Wednesday, and your boss swings by your desk and asks how things are going. You explain you’re upgrading one of your JavaScript dependencies.

Your boss nods and wanders off, wondering if you’re actually doing anything worthwhile. You have just wasted an opportunity to demonstrate your value.

What’s the alternative? Starting with goals in mind.

Elsewhere I’ve talked about how starting with goals in mind will keep you focused, and is key to making you more productive. Starting with your organizational and team goals in mind can also help you both choose valuable work and explain its value to your boss.

For every task you work on, you should have a clear logical path from the big picture organizational goals, down to your team’s goals, down to your project’s goals, down to why this particular task at this particular time is a good way to advance those goals. If you can’t make that connection, if you can’t explain why you’re doing what you’re doing:

  1. You may not actually be doing anything valuable.
  2. Even if you are, you can’t prove it.

Let’s get back to that JavaScript dependency. If you started with goals in mind you might have decided this wasn’t a particularly useful task to begin with, and worked on something else. Or, perhaps you know exactly why you’re doing it.

In that case, the conversation might go something like this:

“You know how we’ve decided we wanted to increase user retention? Well, it looks like one of the problems is that our site is rendering way too slowly, so half our users bounce before the page finishes loading.

Turns out that font loading is the problem, and this library has a feature to fix that in its latest release. Once I’ve upgraded I should be able to get pages to render in a quarter of the time, and I’m hoping that’ll increase user retention. And I have some other ideas in case that isn’t sufficient.”

Your boss goes away understanding that what you’re doing is valuable. And if they’re anywhere near competent they won’t be thinking about the bug queue, or how many hours they’ve seen you in the office. They’ll be thinking about the good work you’re doing, and how they can tell their boss that the retention problem is being addressed.

Communicating your value based on goals

You can explain your work in this way when asked, but there’s no reason not to do so proactively as well. Once a week, or once a month, you can take stock of what you’ve achieved and send an email to your boss. And you can also keep a copy of this email to update your resume when the time comes to look for a new job.

To recap:

  1. Understand why you’re doing your work.
  2. Choose work that addresses those goals.
  3. Communicate to your boss why your work is helping those goals.

This will shift many managers from an hours mindset—driven by an assumption that your work isn’t producing that much value—to a value mindset. Your boss will know your output is valuable, and as a result won’t require the proxy measure of hours worked.

Learning how to work towards goals is, of course, easier said than done. It took me many years and many mistakes along the way. If you’d like to accelerate your learning, and take advantage of everything I’ve learned working as a programmer over the past 20 years, you can sign up for my Software Clown weekly newsletter. Every week I share a mistake and what you can learn from it.




July 10, 2018 04:00 AM

July 03, 2018

Itamar Turner-Trauring

Five ways to work 35 hours (or less!) a week

You’re tired of working yourself to death. You’ve had enough of the pressure of long hours, and you want not more money but more time: you want a shorter workweek. You want to work 35 hours a week, or 32, or even less.

But in an industry where some companies brag about their 70-hour workweeks, can you find a programming job with a short workweek?

The short answer is that yes, you can work 35 hours or less. I’ve done it at multiple jobs, and I know other programmers who do so as well.

The longer answer is that you have multiple different options, different ways of achieving this goal. Let’s see what they are.

Option #1: Find a job that offers shorter hours

While they are few and far between, some organizations do offer shorter workweeks. For example, over on the excellent Key Values job board you can learn about Monograph, a company that provides a 32-hour workweek by default.

Option #2: Negotiate a custom deal

Just like you can negotiate a higher salary, you can also negotiate a shorter workweek. The best place to do it is at your current job, because you’ve likely got expensive-to-replace knowledge of business logic, organizational procedures, local tech stack, and so on. But you can also negotiate for a shorter workweek at a new job, if you do it right.

If you’re interested in seeing how this is done, read my interview with a programmer who has been working part-time for 15 years.

Option #3: Become a consultant (the right way)

If you’re a consultant and you do it right, you can raise your rates high enough that you don’t need to work full time to make a living. Doing it right is important, though: if your hourly rate is low enough you’re going to have to work long hours.

To learn about some of what it takes, Jonathan Stark’s Value Pricing Bootcamp is one place to start.

Option #4: Start a product business (the right way)

If you’re selling a product that you’ve created, the hours you work don’t map one-to-one to your income. You have the upfront time for creating the product, and ongoing time for marketing and maintaining it, but after that you can sell the same product over and over with a much smaller investment of time.

Consider for example Amy Hoy’s explanation of how bootstrapping a business allowed her to make a living even with a chronic illness.

Option #5: Early retirement

Living below your means is a good idea in general: the more money you have in the bank the easier it’ll be for you to find a better job, for example. But if you don’t want to work at all, over the long term cutting your expenses can help you stop working altogether.

Liberate yourself from the office

One of the pernicious side-effects of the culture of long hours in tech is that even a 40-hour workweek seems impossible. But long hours aren’t necessary: they’re a crutch for bad management. Working shorter hours can actually make you more productive, productive enough that your total output goes up even with shorter hours.

In the short run, if you want to work fewer hours you have to do something about it.

In the long run, there’s no reason why a 32-hour workweek couldn’t be the standard—if we all push for it hard enough.




July 03, 2018 04:00 AM

July 02, 2018

Moshe Zadka

Composition-oriented programming

A common way to expose an API in Python is via inheritance. Though many projects do that, there is a better way.

But first, let's see. How popular is inheritance-as-an-API, anyway?

Let's go to the Twisted website. Right at the center of the screen, at prime real-estate, we see:

What's there? The following is abridged:

from twisted.internet import protocol

class Echo(protocol.Protocol):
    def dataReceived(self, data):
        # Echo any received bytes straight back to the sender.
        self.transport.write(data)

class EchoFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Echo()

(This is part of an example on building an echo-server protocol.)

If you are wondering who came up with this amazing API, it is the same person who is writing the words you are reading. I certainly thought it was an amazing API!

Look at how many smart people agreed with me.

Django takes a page of tutorial to get there, but sure enough:

from django.db import models

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')

class Choice(models.Model):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

Jupyter's echo kernel starts:

from ipykernel.kernelbase import Kernel

class EchoKernel(Kernel):
    implementation = 'Echo'
    implementation_version = '1.0'
    language = 'no-op'

Everyone is doing it. A project I have been a developer on for ~16 years. The most popular Python web library, responsible for who-knows-how-many requests per second in Instagram. A project that won the ACM award (and well deserved, at that).

However, popularity is not everything. This is not a good idea.

Exposing class inheritance as a public interface means committing to a level of backwards compatibility that is unheard of. Even adding private methods or attributes becomes dangerous.

Let's give a toy example:

class Writer:

    _write = lambda self, message: None  # default: discard all output

    def set_output(self, output):
        self._write = output.write

    def write(self, message):
        formatted = self.format(message)
        self._write(formatted)

    def format(self, message):
        raise NotImplementedError("format")

This is a simple writer that, while initially sending everything down a black hole, can be set to write the output to a file-like object. It needs to format the messages, so the proper usage is to subclass and override format (while taking care not to define methods called set_output or _write).

class BufferWriter(Writer):

    _buffer = False

    def format(self, message):
        if self._buffer:
            return 'Buffer: ' + message
        else:
            return 'Message: ' + message

    def switch_buffer(self):
        self._buffer = not self._buffer

The simplest formatting would return the message as is. However, this formatter is slightly less trivial -- it prefixes the message with the word Buffer or Message, depending on an internal variable that can be switched.

Now we can do things like:

>>> import sys
>>> bp = BufferWriter()
>>> bp.set_output(sys.stdout)
>>> bp.write("hello")
Message: hello
>>> bp.switch_buffer()
>>> bp.write("hello")
Buffer: hello

This looks good, so far. Of course, things are never so simple in real life. The writer library, naturally, gets thousands of stars on GitHub. It becomes popular. There's a development community, complete with a discord channel and a mailing list. So naturally, important features get added.

class Writer:

    _buffer = ""

    _write = lambda self, message: None  # default: discard all output

    def set_output(self, output):
        self._write = output.write

    def write(self, message):
        self._buffer += self.format(message)
        if len(self._buffer) > 10:
            self._write(self._buffer)
            self._buffer = ""

    def format(self, message):
        raise NotImplementedError("format")

Turns out people needed to buffer some of the shorter messages. This was a crucial performance improvement that all users were clamoring for, so version 2018.6.1 is highly anticipated.

It breaks, though, the BufferWriter. The symptoms are weird: TypeErrors and other such fun. All because both the superclass and the subclass are competing to access self._buffer.

With enough care, these problems can be avoided. A library which exposes classes for inheritance must add all new private methods or attributes with a double-underscore (__) prefix and, naturally, never ever add any public methods or attributes. Sadly, nobody does that.

So what's the alternative?

from zope import interface

class IFormatter(interface.Interface):

    def format(message):
        """show stuff"""

We define an abstract interface. This interface [1] has only one method -- format.

import attr

@attr.s
class Writer:

    _buffer = ""

    _write = lambda self, message: None  # default: discard all output

    _formatter = attr.ib()

    def set_output(self, output):
        self._write = output.write

    def write(self, message):
        self._buffer += self._formatter.format(message)
        if len(self._buffer) > 10:
            self._write(self._buffer)
            self._buffer = ""

We use the attrs library [2] to define our main functionality: a class that wraps another object, which we expect to be an IFormatter.

We can verify this automatically by instead having the _formatter line say:

from zope.interface import verify

_formatter = attr.ib(validator=lambda instance, attribute, value:
                               verify.verifyObject(IFormatter, value))

Note that this separates the concerns: the "fake method" format has moved to a "fake class" (an interface).

@interface.implementer(IFormatter)
class BufferFormatter:

    _buffer = False

    def format(self, message):
        if self._buffer:
            return 'Buffer: ' + message
        else:
            return 'Message: ' + message

    def switch_buffer(self):
        self._buffer = not self._buffer

Note that now, if we only have the Writer object, there is no way to switch prefixes. Correctly switching prefixes means keeping access to the original object.
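Putting the pieces together, here is a usage sketch (note that attrs strips the leading underscore, so the generated __init__ argument is named formatter, and that we keep a reference to the formatter object precisely so we can still switch prefixes):

import sys

formatter = BufferFormatter()
writer = Writer(formatter=formatter)
writer.set_output(sys.stdout)
writer.write("hello")      # flushed once the internal buffer exceeds 10 chars
formatter.switch_buffer()  # still possible, because we kept the formatter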

If there is a need to "call back" to the original methods, the original object can be passed in to the wrapped object. One advantage is that, being a distinct object, it is obvious one should only call into public methods and only access public variables.

Passing ourselves to a method is, in general, not an ideal practice. What we really should do is pass specific methods or variables directly into the method. But this is funny: when using inheritance, we always effectively pass ourselves to every method. So even this refactoring is a net improvement. When the biggest criticism of a refactoring is "this could now be improved even more", it usually means it is a good idea.

Credits:

  • Thanks to Tom Goren for his feedback -- the original version was more aggressive.
  • Thanks to Glyph Lefkowitz for pushing me to make the example better.
  • Thanks to Augie Fackler and Nathaniel Manista for much of the inspiration.
[1]The zope.interface library is a little like the abc library: both give tools to clarify what methods we expect. However, abc.ABC likes inheritance a little too much. Glyph has a good explanation of the advantages.
[2]attrs makes defining Python classes much less boiler-platey. There's another Glyph post explaining why it is so good.

by Moshe Zadka at July 02, 2018 05:00 AM

June 15, 2018

Itamar Turner-Trauring

Avoiding hour creep: get your work done and still go home at 5PM

You want to work 40 hours a week, you want to head home at 5PM, but—there’s this bug. And you’re stuck, and you really need to fix it, and you’re in the middle so you work just a little longer. Next thing you know you’re leaving work at 6PM.

And before long you’re working 50 hours a week, and then 60 hours a week, and if you stop working overtime it’ll hit your output, and then your manager will have a talk with you about how you really need to put in more effort. So now you’re burning out, and you’re not sure what you can do about it.

But what if you were more productive?

What if you knew how to get your work done on company time, and could spend your own time on whatever you wanted?

You can—with a little time management.

Some caveats

Before we get to the actual techniques you’ll be using, some preliminaries.

First, these techniques will only work if you have a manager who judges you based on results, not hours in the office. Keep in mind that there are many managers who claim they want a 50-hour workweek, but in practice will be happy if you do a good job in just 40. I’m also assuming your company is not in constant crisis mode. If these assumptions are wrong, better time management won’t help: it’s time to find another job.

Second, these techniques are here to help you in day-to-day time management. If production is down, you may need to work longer hours. (And again, if production is down every week, it’s time to find another job.)

Finally, for simplicity’s sake I’m assuming you get in at 9:00AM and want to leave at 5:00PM. Adjust the times below accordingly if you start later in the day.

Taking control over your time

Since your problem is time creep, the solution is hard limits on when you can start new work—together with time allocated to planning so future work is more productive.

Here’s the short version of a schedule that will help you do more in less time:

  1. When you get in to work you read your checkpoint from the previous workday (I’ll explain this in a bit).
  2. Until 3:30PM you work as you normally would.
  3. After 3:30PM you continue on any existing task you’re already working on. If you finish that task you can start new tasks only if you know they will take 15 minutes or less. If you don’t have any suitable tasks you should spend this time planning future work.
  4. At 4:45PM you stop what you’re doing and checkpoint your work.
  5. At 5:00PM you go home.

Let’s delve deeper so you can understand what to do, and why this will help you.

End of day → start of next day: checkpointing

In the last 15 minutes of your day you stop working and checkpoint your work. That is, you write down everything you need to know to get started quickly the next morning when you come to work.

If you’re in the middle of a task, for example, you can check in “XXX” comments in your code with notes on the next changes you were planning to make. If you’re doing planning, you can assign yourself a task and write down as much as possible about how you should implement it.

This has two benefits:

  1. Next morning when you get to work, and even more so after a weekend or vacation, you’ll spend much less time context swapping and trying to remember where you were. Instead, you’ll have clear notes about what to do next.
  2. By planning your work for the next day, you’re setting up your brain to work out the problem in the background, while you’re enjoying your free time. You’re more likely to wake up in the morning with a solution to a hard problem, or have an insight in the shower. For more about this see Rich Hickey’s talk on Hammock Driven-Development.

No new large tasks after 3:30PM

By the time the afternoon rolls by you’ve been working for quite a few hours, and your brain isn’t going to work as well. If you’re in the middle of a task you can keep working on it, but if you finish a task you should stop taking on large new tasks near the end of the day. You’ll do much better starting them the next day, when you’re less tired and have a longer stretch of time to work on them.

How should you spend your time? You can focus on small tasks, like code reviews.

Even more importantly, you can spend your afternoon doing planning:

  • Take vague tasks and write down the details and sub-tasks.
  • Investigate potential solutions.
  • Research new technologies.
  • Try to understand the underlying causes of problems you’re seeing come up again and again.
  • Think about the big picture of what you’re working on.

In the long run planning will make your implementation work faster. And by limiting planning to only part of your day you’re making sure you don’t spend all of your time planning.

Going home at 5:00PM exactly

There’s nothing inherently wrong with spending a few more minutes finishing something past 5:00PM. The problem is that you’re experiencing hour creep—it’s a problem for you specifically. Having a hard and fast rule about when you leave will force you not to stay until 6:00 or 7:00PM.

Plus, sometimes it’s not just a few minutes, sometimes you’ll need more than that to solve the problem. And a task that will take two hours in the evening might take you only 10 minutes in the morning, when you’re well-rested.

In the long run you’ll be more productive by not working long hours.

A recap

Here’s a recap of how you should be spending your day at work:

  • 9:00AM-3:30PM: Start by reading your checkpoint notes from the day before so you can get started immediately, then work normally.
  • 3:30PM-4:45PM: Continue on existing task, if you’re finished then transition to small tasks and planning.
  • 4:45PM-5:00PM: Checkpoint your work, then leave your office.
  • 5:00PM-…: Whatever you want to do.

There’s nothing magic about this particular set of rules, of course. You will likely want to change or customize this plan to fit your own needs and situation.

Nonetheless, since you are suffering from hour creep I suggest following this particular plan for a couple of weeks just so you start getting a sense of the benefits. Once you’ve taken control over your time you can start modifying the rules to suit your needs better.




June 15, 2018 04:00 AM

June 08, 2018

Itamar Turner-Trauring

The true meaning of unit testing

You probably already know what “unit testing” means. So do I.

But—what if our definitions are different? Does unit testing mean:

  • Testing a self-contained unit of code with only in-memory objects involved.
  • Or, does it mean automated testing?

I’ve seen both definitions used quite broadly. For example, the Python standard library has a unittest module intended for generic automated testing.

So we have two different definitions of unit testing: which one is correct?

Not just unit testing

You could argue that your particular definition is the correct one, and that other programmers should just learn to use the right terminology. But this seems to be a broader problem that applies to other forms of testing.

There’s “functional testing”:

  • It might mean black box testing of the specification of the system, as per Wikipedia.
  • At an old job, in contrast, we used the term differently: testing of interactions with external systems outside the control of our own code.

Or “regression testing”:

  • It might mean verifying software continues to perform correctly, again as per Wikipedia.
  • But at another job it meant tests that interacted with our external API.

Why is it so hard to have a consistent meaning for testing terms?

Testing as a magic formula

Imagine you’re a web developer trying to test a HTTP-based interaction with very simple underlying logic. Your thought process might go like this:

  1. “Unit testing is very important, I should unit test this code—that means I should test each function in isolation.”
  2. “But, oh, it’s quite difficult to test each function individually… I’d have to simulate a whole web framework! Not to mention the logic is either framework logic or pretty trivial, and I really want to be testing the external HTTP interaction.”
  3. “Oh, I know, I’ll just write a test that sends an HTTP request and make assertions about the HTTP response.”
  4. “Hooray! I have unit tested my application.”

You go off and share what you’ve learned—and then get scolded for not doing real unit testing, for failing to use the correct magic formula. “This is not unit testing! Where are your mocks? Why are you running a whole web server?”

The problem here is the belief that one particular kind of testing is a magic formula for software quality. “Unit testing is the answer!” “The testing pyramid must be followed!”

When a particular formula proves not quite relevant to our particular project, our practical side kicks in and we tweak the formula until it actually does what we need. The terminology stays the same, however, even as the technique changes. But of course whether or not it’s Authentic Unit Testing™ is irrelevant: what really matters is whether it’s useful testing.

A better terminology

There is no universal criterion for code quality; it can only be judged in the context of a particular project’s goals. Rather than starting with your favorite testing technique, your starting point should be your goals. You can then use your goals to determine, and explain, what kind of testing you need.

For example, imagine you are trying to implement realistic looking ocean waves for a video game. What is the goal of your testing?

“My testing should ensure the waves look real.”

How would you do that? Not with automated tests. You’re going to have to look at the rendered graphics, and then ask some other humans to look at it. If you’re going to name this form of testing you might call it “looks-good-to-a-human testing.”

Or consider that simple web application discussed above. You can call that “external HTTP contract testing.”
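To make that concrete, such a test might look like this (a sketch using the requests library against a hypothetical service and endpoint):

import requests

BASE_URL = "http://localhost:8080"  # hypothetical app under test

def test_create_user_contract():
    # We only assert on the HTTP contract -- status code and response
    # body -- not on any function or class behind the endpoint.
    response = requests.post(BASE_URL + "/users", json={"name": "alice"})
    assert response.status_code == 201
    assert response.json()["name"] == "alice"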

It’s more cumbersome than “unit testing,” “end-to-end testing,” “automated testing,” or “acceptance testing”—but so much more informative. If you told a colleague about it they would know why you were testing, and they’d have a pretty good idea of how you were doing the testing.

Next time you’re thinking or talking about testing, don’t talk about “unit testing” or “end-to-end testing.” Instead, talk about your goals: what the testing is validating or verifying. Eventually you might reach the point of talking about particular testing techniques. But if you start with your goals you are much more likely both to be understood and to reach for the appropriate tools for your task.




June 08, 2018 04:00 AM

June 03, 2018

Itamar Turner-Trauring

Get productive faster at your new job—without working overtime

You’ve just started a new job and you’re not productive yet: you get stuck, you need help, you don’t know enough. You need to learn a new codebase, a new set of conventions, a new set of business problems. You might even need to learn a new programming language.

And so now you feel anxious—

Are you doing well enough?

Are you producing enough code?

How is your manager feeling about your progress?

It’s natural to make yourself feel more comfortable by working overtime. You’re showing your manager that you’re trying, at least, and by working long hours you might get a little bit more done. You don’t want to work overtime in the long run, of course, but you can worry about that in the future.

Unfortunately, working long hours is—as you might suspect—the wrong solution: at best it won’t help, and it might even make your situation worse. Let’s see why overtime isn’t helpful, and then move on to a better solution: a solution that will make you more productive and make you look good to your manager.

Long hours won’t solve your problem

Working overtime might make you feel a little better. Unfortunately it’s also a bad solution in the short run, and a big problem in the long run.

In the short run, you’re not actually going to get more done. Long hours will just tire you out, won’t help you learn any faster, and are pretty much never the solution to producing more (here’s some research if you don’t believe me). Even worse, you might end up giving your manager the wrong impression: you’re working long hours and you’re still not productive yet?

In the long run, you’re setting bad expectations about your work hours. If you have a mediocre manager, let alone a bad one, they will often expect you to keep working those long hours. You need to set boundaries from the start: “here are my work hours, I won’t work more outside of emergencies.”

There’s a better solution: focusing on your real goal, which is learning everything you need to know about your new project.

The real solution: learning with feedback

You have two core problems:

  1. You need to learn a lot, and you don’t necessarily even know what you need to learn.
  2. You can’t demonstrate you’re being productive to your manager the usual way, by fixing bugs or adding features.

You can solve both problems at once with the following process:

  1. Every Friday, with your week’s work still fresh in your mind, write down:
    • Everything you’ve learned that week.
    • What you think you need to learn next.
  2. First thing Monday morning when you get back to work, send an email to your manager with what you wrote Friday, and an additional question: “What is missing from this list? What else do I need to learn?”
  3. Your manager can now provide you with feedback about additional things you need to learn.
  4. When you get stuck and don’t want to ask for help just yet, take a break and go learn something on your list.

If you follow this process:

  • Your manager will know you’re not slacking off.
  • You’ll get feedback about your progress and what to do next.
  • You’ll be better focused on learning the right things first, which will make you productive faster.

And of course, no overtime required.

Want more suggestions for getting started on your best foot? Last time I started a new programming job I created a personal checklist: all the things I should be doing on my first few days at work. If you’d like to read it, you can download it here.




June 03, 2018 04:00 AM

June 02, 2018

Moshe Zadka

Avoiding Private Methods

Assume MyClass._dangerous(self) is a private method. We could have implemented the same functionality without a private method as follows:

  • Define a class InnerClass with the same __init__ as MyClass
  • Define InnerClass.dangerous(self) with the same logic of MyClass._dangerous
  • Make MyClass into a wrapper class over InnerClass, where the wrapped attribute is private.
  • Proxy all necessary work into InnerClass, as in the sketch below.
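
Here is a minimal sketch of that structure, using attrs; the connection attribute and its reset method are hypothetical stand-ins:

import attr

@attr.s
class InnerClass(object):
    # The coherent set of attributes the dangerous logic actually touches.
    _connection = attr.ib()

    def dangerous(self):
        """Public now, so its assumptions and guarantees must be documented."""
        self._connection.reset()

@attr.s
class MyClass(object):
    # The wrapped attribute is private.
    _inner = attr.ib()

    @classmethod
    def from_parameters(cls, connection):
        # Create the InnerClass object here rather than in __init__,
        # so unit tests can construct MyClass with a fake inner object.
        return cls(inner=InnerClass(connection=connection))

    def do_work(self):
        # Proxy the necessary work into InnerClass.
        return self._inner.dangerous()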

This might seem onerous, but consider that now, dangerous is part of the public interface of a class, and would need to be explicitly documented as to its assumptions and guarantees. This documentation would have had to exist in comments around _dangerous anyway -- in order to clarify what its assumptions are, and what invariants it is violating in MyClass -- otherwise, maintaining the code that calls _dangerous would be hard.

Of course, this documentation is almost certain to be missing. The existence of _dangerous itself implies this was an almost mechanical refactoring of similar code into a method, with the excuse of "it's private" used to avoid considering the invariants and interface.

Even if the documentation did exist, now it is possible to unit-test that the documentation is correct. Furthermore, if we use best practices when we define MyClass -- in other words, avoid creating an InnerClass object in the initializer, and only create it in MyClass.from_parameters -- we are also in a good position to unit test MyClass.

This, of course, presented the worst case: the code for _dangerous touches absolutely every data member of MyClass. In real life, the worst case is not often encountered. When we look at a private method as a code smell, and contemplate the best way to refactor it away, it turns out that we often can find a coherent set of attributes that really does make sense as InnerClass on their own merits.

Credit: This is based on an off-handed comment Glyph made in his blog post about attrs. I am also grateful to him for reviewing a draft copy of this post, and making many useful suggestions. All mistakes in interpretation or explanation are mine alone.

by Moshe Zadka at June 02, 2018 04:30 AM

May 20, 2018

Itamar Turner-Trauring

Staying focused, the productive way

Your manager keeps telling you that you’re not getting enough done. So you decide to become more focused, since as everyone knows, to be a productive programmer you need to stay focused. Deep-diving into TV Tropes, chatting with your friends, or reading up on that fancy new web framework might be fun, often even educational, but they won’t get that feature you’re working on out the door.

So you get noise canceling headphones, and only read your email once a day, and use the Pomodoro technique, and become laser-focused on your code—but still, you’re not productive enough. Your colleague across the hall doesn’t write code faster than you, and yet somehow they make more of an impact, they get more things done. You know it, and your manager knows it.

Why?

Because staying focused is not enough to make you productive. In fact, it’s often the other way around: staying focused is a side-effect of what truly makes you productive.

  • To understand why staying focused isn’t enough, we’ll take a detour from programming and go visit my past self: a young soldier being escorted into a military jail.
  • Then, we’ll apply the lesson we learned and see how understanding your goals is key to becoming more productive, and how your goals can help you stay focused.

A short visit to a military jail

Imagine a yard full of dirty gravel, and mixed in with the gravel are tiny twigs, trash, and the like. How long could you spend crawling around looking for this debris before you’d get bored? How long could you stay focused?

Long ago I lived in Israel, and as a Jewish citizen I was required to serve three years in the military. For a variety of reasons, personal and political, I had no interest in becoming a soldier, and so I attempted to avoid conscription by getting a medical discharge for mental health reasons. While on the base I was part of a transients’ unit: we would clean bathrooms and the like while awaiting processing.

As our story unfolds, I was having a very bad day. My attempt to get a discharge was failing, as the military psychiatrist had decided there was nothing wrong with me. And to make things worse, the sergeant who ran the unit wanted me to go off and do some work on the base, and I couldn’t deal with it.

So I said “no”—which is to say, I refused orders, serious business in the military. The sergeant organized a quick trial, and the officer in charge sentenced me to a day in the on-base jail. Perhaps for entertainment, perhaps to reinforce the importance of obeying orders, while I was in the jail my guards ordered me to search for little bits of tiny debris that were mixed in the jailyard’s gravel.

And so I spent quite a while, crawling around on my knees in the rain, working hard at a pointless task. The guards were impressed, and eventually they felt bad enough to give me an umbrella to keep the rain off.

The moral of the story

I started this episode by refusing to do work that had some purpose (washing dishes, or cleaning a bathroom). I ended by working hard at something that was a complete waste of time.

Both choices were good ones, because in both cases I was working towards my goals:

  1. My broadest goal was getting kicked out of the military. Cooperating was doing me no favors: spending some time in jail for refusing orders demonstrated I was not going to be a good soldier.
  2. My secondary goal was minimizing the amount of time I spent in jail. I had met a soldier on base who had spent his time in jail getting in trouble with his guards, so he’d been sentenced to even more time. He ended up spending months on a military prison base. I wanted to be a model prisoner, so I could get out of jail as quickly as possible.

Staying focused and avoiding distractions is all fine and good, so long as the work you’re doing actually helps you achieve your goals. If it’s not, you’re staying focused on the wrong thing. I could have stayed focused by following orders—and that would have been the wrong way to achieve my goal of getting kicked out of the military.

Plus, knowing your goals can help you stay focused. If you don’t care about your task, then you’ll have a hard time focusing. But once you do understand why you’re doing what you do, you’ll have an easier time staying on task, and you’ll have an easier time distinguishing between necessary subtasks and distracting digressions. And that’s why I was able to enthusiastically clean debris from gravel.

This then is the key to achieving your goals, to productivity, and to staying focused: understanding your goals, and then working towards them as best you can.

Applying your goals to staying focused

So how do you use goals to stay focused?

  1. Figure out the goals for your task.
  2. Strengthen your motivation.
  3. Judge each part of your work based on your goals.

1. Discovering your goals

Start with the big picture: why are you working this job? Your goals might include:

  • Money: Getting paid so you can buy food and shelter.
  • Social pressure: You want your coworkers and boss to think well of you.
  • Organizational goals: You believe in what the company is doing.
  • A sense of obligation: You want to help your customers or users.
  • Building and playing: Solving a hard problem is fun.
  • Curiosity: Learning is fun too.

Then focus down on your particular task: why is this task necessary? It may be that to answer this question you’ll need to do more research, talking to the product owner who requested a feature, or the user who reported a bug. This research will, as an added bonus, also help you solve the problem more effectively.

Combine all of these and you will get a list of goals that applies to your particular task. For example, let’s say you’re working on a bug in a flight search engine. Your goals might be:

  1. Money: I work to make money.
  2. Organizational goal: I work here because I think helping people find cheap, convenient flights is worth doing.
  3. Task goal: This bug should be fixed because it prevents users from finding the most convenient flight on certain popular routes.
  4. Fun: This bug involves a challenging C++ problem I enjoy debugging.

2. Strengthening your motivation

Keeping your goals in mind will help you avoid distractions, and the more goals you’re meeting, and the more your various goals point in the same direction, the better you’ll do. If you have weak or contradictory goals then you can try different solutions:

  • If you work for a company whose goals don’t mean much to you, then you’ll have a harder time focusing: consider finding a new job where you’re doing something you care more about.
  • If after enough research you’ve decided your task is pointless, you can either try to push back (mark the bug as WONTFIX, go talk to the product manager), try to add an additional motivation (is this a good opportunity to learn something new?), or just live with the fact that it’ll take you longer to implement.

3. Judging your work

Understanding your goals will not only help you avoid small distractions (noise, TV Tropes), but bigger distractions as well: digressions, seemingly useful tasks that shouldn’t actually be worked on. Specifically, as you go about solving your task you can use your goals to judge whether a new potential subtask is worth doing.

Going back to the example above: if you encounter some interesting C++ language feature while working on the bug, it can be tempting to dive in. But judged by the four goals it will only serve the fourth goal, having fun, and likely won’t further your other goals. So if the bug is urgent then you should probably wait until it’s fixed to play around.

On the other hand, if you’re working on a pointless feature, your sole goal might be “keep my manager happy so I can keep getting paid.” If you have two days to do the task, and it’ll only take two hours to implement it, spending some time getting “distracted” learning a technical skill might help with a different goal: switching to a more interesting position or job.

Start with your goals

Once you know your goals, you can actually know what it takes to be productive, because you’ll know what you’re working towards. And once you know your goals, you can start thinking about how to avoid distractions, because you’ll know you’re doing work that’s worth doing.

Before you start a task, ask yourself: what are my goals? And don’t start coding until you have an answer.




May 20, 2018 04:00 AM

May 18, 2018

Itamar Turner-Trauring

It's time to quit your shitty job

If it’s been months since you had a day where you felt good—

If you hate getting out of bed in the morning because that means you’ll have to go to work—

If your job is tiring you out so much you can’t get through the day without a nap—

It’s time to quit your shitty job. It’s time to quit your shitty job and go someplace better, a job where a good night’s sleep is all you need. A job where you’re valued. A job where people don’t shout at each other, or demean you, or destroy the project you’ve put all your energy into.

But quitting can be difficult: you have a sense of commitment, the fear of change, the indecision about whether your job is really that shitty. So to help you make your decision, and quit in the best possible way, in the rest of this post I will cover:

  1. Identifying a shitty job.
  2. Whether you should quit (spoiler: yes).
  3. Preparations you should make before quitting: legal, bureaucratic, social.
  4. When to quit.
  5. How you should quit.

(Note that some of this will be US-centric, since that’s where I live and what I know best.)

Identifying a shitty job

Shitty jobs can be surprisingly hard to identify.

Sometimes this is because you don’t have a reasonable baseline, or the shittiness has become normalized through exposure. I’ve heard of companies with the following symptoms, for example, and I would consider either grounds for immediately starting a search for a better job:

  • People shouting at each other during meetings on a regular basis.
  • Getting paid late. Money for working hours is the basic contract of employment: if you’re paid late more than once you’re being told that contract isn’t important.

Another reason you might not notice you have a shitty job is a subtle shift over time. A good job slowly gets worse, and your existing relationships and loyalty blind you to the symptoms—for a while, anyway. You might be forced to reconsider due to:

  • Layoffs, especially while the company is still hiring.
  • Managers being hired without being interviewed by their future direct reports.

I could go on with other examples, but there are two core themes here:

  1. Your company doesn’t value its employees.
  2. You don’t trust company management in the aggregate.

Again, this may not always have been the case. You may trust many of your managers, and know that they value you and your coworkers. But things change, and not always for the better: what matters is the way the company is now, and who has power now, not the way it used to be.

Should you quit your shitty job?

Yes.

But you should do so at the right time, and with a little preparation.

When should you quit your shitty job?

Ideally, you should have another job lined up before you quit.

I once had to give notice of quitting unexpectedly, without prior planning. A more observant coworker gave notice the same day, but they had started looking a couple months before, when we had a round of layoffs. So while I spent a couple months not getting paid, they moved straight on to another job. The lesson: it pays to look for early signs of shittiness, so that you can leave in the best possible way.

Once you realize you have a shitty job, you should start interviewing elsewhere. Having an existing job improves your negotiation position, since you always have the implicit alternative offer of staying where you are. Two offers you can play against each other, or being able to say “I’m far along in the interview process with another company,” is even better, but lacking that you need to downplay how shitty your current job is.

You’ll want a break to catch your breath and relax in between jobs: a couple of weeks off is easy to negotiate, and a month shouldn’t be much harder to get.

In practice, your job may be so awful that it leaves you with no time or energy to look for another job. In this case you might be forced to quit without a new job lined up. You can prepare for this by living below your means and saving some money.

Preparing for quitting

Here are some things you should do before quitting any job:

  • Get non-work contact details for all your coworkers.
  • Maximize any benefits you can. When I quit a job with a 401k and donation matches, I maxed those out early in the year. Note that in small enough companies HR might notice when you change 401k contributions.
  • Try to get continued access to your company’s open source projects that you might want to work on after you leave. Often asking is sufficient: I once asked the VP of engineering after I gave notice, and was told I could keep commit access (presumably because I was effectively offering to do work for free).
  • Write down details about your work that can help make your resume look better: specific numbers you improved (sales, performance, costs), and the like. If the company has an overly broad definition of proprietary information you might not be able to put them on your resume—but the company might fold one day, so it’s good to have a reminder of what you did.

At a shitty job you may also need to make copies of some documents: specifically, any emails or other documents where promises are made to you regarding pay, benefits, and so on. Once you’ve been paid what you’re owed and you’ve left your job, you won’t need those anymore and they should be deleted or shredded. But when it’s your last day at work and you’re trying to get the back pay they owe you, you want to make sure you have documentation.

Speaking of back pay, if you work for a company that has an “unlimited vacation” policy, take some vacation before you quit. You’re not going to get paid for those vacation days you haven’t taken. (In general, if a company has “unlimited vacations” I recommend taking lots of vacation throughout the year, since it’s use it or lose it.)

How to quit

It’s a shitty job, and you may be utterly relieved at leaving it, but—you should quit politely. Your management may simply be misguided, or suffering under pressures you don’t understand (VCs in cover-your-ass mode can be quite destructive). Your manager might grow as a person. Your co-workers might end up working with you again.

So just give your notice, with the smallest possible amount of time you have to stay there. You can tell close coworkers why you’re leaving (they probably already know). And on your last day of work just leave, quietly and politely.

For a while you will feel sad: those projects will never get finished. But mostly you will feel relief.

It’s time—

—time to quit your shitty job.

As I mentioned above, I once made the mistake of hanging on when I shouldn’t have, unlike a more clued-in coworker. (You can hear the whole story by signing up for my Software Clown newsletter, where I share 20+ years of my mistakes so that you can avoid them.)

Don’t make my mistake. I had to quit anyway, and without the benefit of advance planning or having a job lined up. Start looking for a new job now, while you’re still able to hold on—your job probably won’t be getting any better.




May 18, 2018 04:00 AM

May 17, 2018

Jonathan Lange

Announcing quay-admin

We use quay.io a fair bit at work—all our internal Docker images are stored there. I like it a lot, but the website makes it really hard to see who can access your repositories.

In particular, if someone ever leaves your organization, you have to click through all of your repositories one at a time to see whether they have been granted access to a repository as an individual, rather than as a member of a team. This might be OK if you have two or three repositories, but not if you have hundreds.

I had some spare time today, so I wrote a tool to help with this. It’s called quay-admin and you can install it now:

$ pip install quayadmin

This will give you a command-line tool called quay-admin that you can run to see which users outside of your organization have access to your repositories.

I originally tried to write it in Go, basing it off my colleague’s excellent quay-exporter project—a tool that turns security vulnerability warnings into Prometheus metrics so you can get alerted. Unfortunately, getting Go to work well with Swagger APIs is a bit fiddly, and I didn’t have that much spare time. So I tried Python, knowing that it has excellent libraries for working with RESTful services.

First cut used requests, which helped me figure out which APIs I needed and how they gave me the data I wanted. Next version used treq, which allowed me to parallelize, which saves precious seconds of my only life.

It’s been an age since I’ve written Twisted code, but it all comes rushing back fairly quickly. I’ve found that I miss certain things from Haskell’s async library, notably mapConcurrently, but they are easy enough to add.
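
For example, a mapConcurrently-style helper can be sketched on top of Twisted’s DeferredSemaphore in a few lines (a sketch, not necessarily what quay-admin actually does):

from twisted.internet import defer

def map_concurrently(f, xs, limit=10):
    # Apply an async function to each item, with at most `limit` calls in flight.
    semaphore = defer.DeferredSemaphore(limit)
    deferreds = [semaphore.run(f, x) for x in xs]
    return defer.gatherResults(deferreds, consumeErrors=True)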

Releasing Python code is way different though. At Glyph’s recommendation, I tried flit, which seems to work OK.

Thanks to dstufft, glyph, dreid, AlexGaynor, wsanchez, and others who patiently answered my questions while I was writing this, and who in some cases wrote much of the actual software I am building on top of.

Thanks also to quay.io for actually publishing their API docs. It genuinely helps.

by Jonathan Lange at May 17, 2018 11:00 PM

Moshe Zadka

PyCon US 2018 Twisted Birds of Feather Open Space Summary

We would like Twisted to support contextvars -- this would allow cross-async libraries, like eliot, to do fancy things.

Klein is almost ready to be used as-is. Glyph has the good branch which adds

  • CSRF protection
  • Forms
  • Sessions
  • Authentication

But it is too big, and we need to break it into reviewable pieces to add it to master.

The other option for a Twisted-native web framework is Cyclone. It is not under heavy development, but this is mostly because it is done and reasonably stable: Duo Security is using it in production.

We are slowly improving the Request object by taking it out of the built-in and reimplementing it externally. Wilfredo is doing it in a side-project.

We talked a little about advanced use cases: How do you use a reactor in a non-main thread? The only marginally documented installSignalHandlers argument does that just fine.
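
A minimal sketch of that pattern might look like this:

import threading
from twisted.internet import reactor

# Signal handlers can only be installed in the main thread, so opt out
# before running the reactor in a background thread.
thread = threading.Thread(
    target=reactor.run, kwargs={"installSignalHandlers": False})
thread.start()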

If you want to spread processing between multiple processes, Ampoule does that. Help is greatly appreciated.

If you want to do weird things with resources, Moshe did something on Twitch this one time.

We made sure everyone knows their help would be appreciated, and gamified it: review tickets and participate on the mailing list.

Remember: the book Expert Twisted is available for pre-orders!

by Moshe Zadka at May 17, 2018 01:50 AM

May 16, 2018

Moshe Zadka

PyCon 2018 US Docker Birds of Feather Open Space Summary

We started out the conversation with talking about writing good Dockerfiles. There is no list of "best practices" yet. Hynek reiterated for us "ship applications, not build environments". Moshe summarized it as "don't put gcc in the deployed image."

We discussed a little bit what we are trying to achieve with better Dockerfiles. Shared base? Reproducible builds?

We talked about some of the challenges for building Docker on CI systems, especially from inside containers.

Docker on air-gapped machines is hard. So many parts assume free access to the internet.

We went on to discuss how to use multistage Dockerfiles. One important bit is what "installable artifact" to move. Some suggested wheels. Moshe suggested Pex. Hynek suggested copying a virtual environment, and Moshe showed an example.

There was some discussion on making small images. The consensus was that Alpine is usually part of the answer.

There was a lot of discussion on the trade-offs between updating too soon, and too late. Some of the techniques to control update times were mentioned:

  • Building everything from source
  • Hashing various inputs into the image tag
  • Using Red Hat Satellite

We talked about GPU containers, for machine learning. Apparently nvidia-docker is still nascent but works.

We talked about how to keep your registry clean. Unfortunately, the consensus is that you will need to build your own tooling.

We discussed what registries people use.

We touched lightly on performance. Docker can use either overlayfs or devicemapper. It's complicated.

Would you run your DB in Docker? Docker is just a packaging format. You can run Postgres in Docker just fine, and mount in the data directory. However, usually people are asking about using Orchestration Frameworks for that.

StatefulSets in K8s are sometimes useful for databases.

If you are running your dev DB in Docker, data is not important. In that case, consider using eatmydata to improve performance.

We all agreed you should never use the system Python for your applications. Then how do you get Python in your Docker image?

  • Use the python:<something> images on Docker Hub
  • Compile it yourself
  • Use PyEnv
  • Use the deadsnakes PPA on Ubuntu

Finally, we discussed the ultimate heresy: running more than one process inside your container. Or is it? Moshe mentioned that anyone running uwsgi or gunicorn is already running a process manager: just one that happens to be part of the WSGI "binary". We mentioned supervisor and NColony for explicit process management.

by Moshe Zadka at May 16, 2018 01:50 AM

May 08, 2018

Itamar Turner-Trauring

Guest Post: Networking for programmers with very little free time

The following guest post is by Moshe Zadka, explaining the importance of networking and how you can do it with minimal time outside work.

A good professional network is a long-term asset. When you’re looking for a new job you can talk to people you know, ask them if their company is hiring, and then have them submit your resume directly to the hiring manager. This will allow you to skip the “resume filter”, and often get you past the phone screen as well.

But even if a professional network is useful, how do you find the time to build it? You probably don’t want to have to get out every evening to a social gathering, and spend hours talking, just to plan for a hypothetical future job.

One way to start building your professional network with little time spent is by focusing on your current job. Over time your colleagues will leave for other jobs, and every former colleague is a potential referral to another company. So you should always make sure you have non-work contact details for your colleagues.

Unfortunately, you can only expand your network so much from attrition at work: if you want a larger network, you will have to do some work. But by making judicious use of your time, and going to the right venues, you can grow and maintain a good professional network while still only spending one or two evenings a month at events.

How networking helped me

In one of the San Francisco Python meetups, I met someone working at PayPal – a company I had no interest in working for. However, we kept in touch. At some point, he moved to a start-up. At another point, I found myself looking for a job, after a company shutdown. Because I had kept in touch with him, it was easy to reach out.

Even though I was on a tight timeline for getting another job, he made sure I was fast-tracked through the process – a pro-forma resume review, and skipping the phone screen. I still had to go through a half-day’s worth of interview panel, but removing the simple filters from my path probably saved a week or more of unemployment, and also let me put pressure on other prospective employers to fast-track me.

Business cards

Business cards sound like an antiquated thing, something you might see on “Mad Men”. However, even with modern smartphones, there is no faster way to share your contact details with someone you’ve just met. For that, a business card’s most important part is your e-mail address.

In all of the opportunities below, give people a business card when the conversation is done. Making your own cards is cheap or even free nowadays; there’s no need to wait for your job to print you one. In any case, you want to make sure you have your personal e-mail on the card, not your work e-mail.

Conferences

If you are already employed, ask your job to send you to relevant conferences. Some places have a budget for “professional development”, others have funds specifically marked “conferences” – or maybe it’s under the recruiting budget. Choose a relevant conference, and remind your manager that sending you to the conference is a form of training: a great investment in employees.

Some companies will only fund your trip if you speak at a conference. Most conferences understand that some people will only come if they speak, and structure their timeline accordingly: you’ll know whether your talk is accepted far enough in advance to get your manager to sign off on sending you. An efficient way to send talk proposals is to recycle – if a talk is declined from one conference, it is fine to send it to another closely related one, although sometimes it will have to be tweaked slightly. If the audiences are sufficiently distinct you can even reuse a talk you’ve already given.

Once you are at a conference, attend birds of feather sessions, and try to sit with new people for conference meals, if those are served. This is a great way to meet more people at the conference. Giving a talk is also a great way to meet more people: you can often meet other speakers, and many people in the audience will want to talk to you afterwards.

Meetups

Many places have tech meetups in the evening. You can probably find time to go to a meetup once a month. If you do go, make sure to make the most of your time – mingle, talk and hand out your business card.

Avoid going to the same meetup month after month – while it is comfortable, it tends to be the same people: your goal should be to expand your network. So once you stop meeting new people, switch to a new meetup.

As with conferences, giving a talk at a meetup is a great way to meet new people. You might even be able to work on your talk at work, if you can pitch it as a recruiting event to your manager.

Engineering blog

If your company has an engineering blog, participate. Find something you have done recently which was interesting or surprising, and write about that experience.

If your company does not have an engineering blog, see if you can make one happen. It helps with recruiting, and helps people develop in their career.

Keeping in touch

Keep in touch with your network. If you come across an article relevant to someone, send it to them with a note “thought it might be interesting”. Often they will already have read it, but will be interested in sharing their thoughts.

Summary

Developing and maintaining a professional network does not need a huge time investment – a little bit goes a long way, if properly allocated. And when the day comes that you need a new job, that small investment will pay off. While you usually can’t get a job just by knowing someone, a network will help you skip past companies’ “resume filters”, and you can have a more streamlined interview process if you have a friend on the inside.


Moshe is a core developer on the Twisted project, and has been using Python for 20 years. He’s just published a book, “from python import better”.




May 08, 2018 04:00 AM

May 04, 2018

Jonathan Lange

Site updates

I am pleased to announce that the recent TLS certificate problems and outages to jml.io have been fully resolved.

Here are some notes on what happened and what I did about it.

Background

jml.io is a statically generated blog that’s hosted on AWS. The HTML pages are generated locally and uploaded to S3 buckets. These buckets are then served by CloudFront, which acts as a CDN. The domain names are managed on Route53.

Before now, these were managed by clicking around in the AWS console. There might have been a time when they were generated by Ansible playbooks, but I can’t find the playbooks anymore.

A couple of weeks ago, the wildcard TLS certificate for jml.io expired. This meant that anyone who browsed to the site with a modern browser got a scary warning saying the site wasn’t trustworthy. And fair enough too!

To get the site working, I needed to get a new certificate and distribute it from AWS. Recalling the steps to do this properly was too hard, and besides I knew of a better way.

Enter Terraform

We use Terraform to manage our AWS infrastructure at my employer, and really quite like it. I personally have some qualms about HCL, its configuration language, that I might write about later, but I like both it and Terraform more than any alternatives I’m aware of.

Because my site was down and because I really don’t have time to do considered maintenance, I decided to migrate the whole thing to Terraform. This would mean that all of my thinking and decisions could be stored in a Git repo, rather than my memory.

I would also use the AWS Certificate Manager to generate and manage the certificates, sparing me the difficulty of purchasing, storing, configuring, and later renewing them myself.

How it happened

I spent the first week snatching the occasional hour here and there figuring out how Terraform worked. While we use it at Weaveworks, I wasn’t the one to set it all up, and editing something built by someone else is very different from being able to build it from scratch.

In particular, I needed to get a feel for the workflow and for how Terraform’s means of abstraction, modules, actually worked.

The next week I started migrating all the “redirect” buckets to Terraform. These are S3 buckets for my old domain names (code.mumak.net, mumak.net, etc.) that now redirect here.

Doing this involved figuring out how to import things into Terraform, how to use modules, and how to edit Terraform state when you’ve refactored something.

It’s quite slow going. The terraform plan step takes quite a while, which means the edit/test loop is a bit of a grind. This really hurts when you are snatching a half-hour before bed here and there to get things done.

During this process, I got a bunch of excellent advice and working, reusable Terraform code from David Reid.

Once I got the redirect buckets incorporated, I moved on to their DNS records. That went fairly smoothly if slowly.

Then I decided to set up CloudFront, ACM, and a Lambda for HSTS all at once. It would have worked great, except that all my stuff was on us-west-2, and all the cool features for integrating with CloudFront are in us-east-1.

So today I had the joy of migrating buckets from one AWS region to another. AWS has no built-in support for this that I know of, so the way you do it is create a new bucket in the new region, copy all the content over, delete the old bucket, then wait a while for eventually consistent data stores and/or batch jobs to do their thing, then create a bucket in the new region with the old name, then copy all the data over again, then delete the temporary bucket.

It’s a real hassle, and AWS’s silly global bucket namespace thing adds an edge of frisson: what if someone steals my name while I’m waiting to retry?

The new setup

Everything’s on AWS, managed by Terraform. Even the Terraform state and lock are kept there. I haven’t set up anything like terradiff yet, but it’s only a matter of time.

The module setup means I’ve got a pretty clear list of what’s a genuine static website and what’s a redirect site. There’s some duplication, but it’s mostly of boilerplate rather than of magic strings.

Going forward, I’m going to use the extra automation provided by Terraform to make publishing to this blog a bit easier for me. I think I can also take some of the stuff I’ve learned and incorporate it into our work infrastructure.

Conclusions

Terraform is great for managing AWS stuff. AWS is a pretty cool way of hosting a static site if you care about TLS certificates (which you should). dreid is awesome for giving me so much useful help at the right time.

by Jonathan Lange at May 04, 2018 11:00 PM

May 02, 2018

Moshe Zadka

Wheels

Announcement: My book, from python import better, has been published. This post is based on one of the chapters from it.

When Python started out, one of the oft-touted benefits was "batteries included!". Gone were the days of searching for which XML parsing library was the best -- just use the one built-in to the standard library. However, the standard library can only hold so much special purpose stuff. Few now remember, but it used to have SGI Audio specific functionality.

These days, one of the biggest benefits of Python is the extensive third-party repository of stuff. This is the Python Package Index (PyPI), formerly known as the "Cheese Shop" after an obscure Monty Python skit. Of course, what else would be available from the Cheese shop than wheels of cheese? But a second pun was hiding behind the term "wheels": those are the things that need no reinvention!

The new PyPI warehouse launched, with new code hosting an unbelievable amount of content: around 140,000 packages at the time of this post (unless I take too long in publishing it, and then who knows how big PyPI will be!)

Nobody can sift through 140K package descriptions, of course. A short-lived attempt to have "Stars" fell victim to allegations of ballot stuffing and moderation, and was quickly removed. Searching on keywords would be useful, but searching without sorting rarely is -- and what would you sort on?

PyPI is not the place to find which libraries are useful. It is the place to find objective truths: which version is the latest, when was it released, what is a project's homepage, etc. Recommendations are best found elsewhere.

The first place I like to start is with the Awesome Python list. However, it is important to note that its contribution guidelines are just "submit a link" and there is no official way to remove a library from the list. Thus, the "awesome" in the name means "someone once thought it was awesome, and cared enough to add it". The list should be treated as mild suggestions. Before using a library, check release history, GitHub health, code quality and other metrics you might care about.

Another useful resource is Planet Python. It is a feed aggregator of various blogs. Many of the blog posts will feature either a recommendation of a particular library, a release announcement, or just discussion which involves using a third-party library. Alongside the written word is the live performance -- PyVideo links to more Python talks than you can shake a stick at, aggregating talks from conferences around the world: again, many of the talks will feature discussion of a particular third-party library.

Last, but not least, the live, interactive, version of PyVideo: Python meetups and conferences. Those are where I discovered some of my favorite libraries.

by Moshe Zadka at May 02, 2018 03:00 PM

May 01, 2018

Itamar Turner-Trauring

A refurbished iPad, the CAP theorem, and a lesson on negotiation

We’ll get to the iPad and CAP theorem soon, but first, let’s talk about negotiation:

  • Your boss hands you a project with an impossible deadline–so you end up working evenings and weekends.
  • You get a job offer that’s lower than you’d like–and you accept it.

And that’s just the way things are, and it’s not like there’s anything you can do, right?

Maybe. But quite possibly there is something you can do.

To explain why, I’d like to share an edifying tale, a story of broken promises and ultimate–albeit minor–triumph. Along the way we’ll take a detour into distributed systems theory, and when we’re done there will even be a moral (hint: it’s about negotiation).

A purchase is made

Once upon a time, at a different job that subsidized such things, my wife purchased an iPad. Years passed, operating systems were upgraded, and over time this iPad became too slow to run some apps, and too old to run others.

It was time to buy a new iPad.

The day Apple released their new 2018 iPad we went to Apple’s online store and purchased a refurbished 2017 model. We got a confirmation email, and looked forward to a tablet that could keep up with many companies’ unwillingness to ship performant code (looking at you, Skype).

The next day Apple sent us another email: our order had been canceled, and our money would be refunded. When we checked the store, the refurbished model was no longer in stock.

Our somewhat cheaper iPad was not coming.

A theory is introduced, with some references to distributed systems

Why did the Apple Store cancel our order? Perhaps it was a bug, but I have another theory: the constraints of the CAP theorem.

Eric Brewer’s CAP theorem states that a distributed data store–a system composed of multiple nodes–can only have two out of three properties:

  1. Consistency: all nodes have same view of the data.
  2. Availability: the system can respond successfully to clients.
  3. Partition-tolerance: if the network between the nodes fails, the system can continue to operate.

Now, the online Apple Store is quite likely to be a distributed system, given the need to scale to many users. And it needs to store data: the available inventory of each item in the store. Given a choice between those three properties, the only two reasonable choices are availability and partition-tolerance.

It’s far better to have a store that is available than to have completely consistent tracking of inventory. There is a cost to this choice, though: every once in a while a large rush of orders will cause inconsistent views of the available inventory.

  • Node A thinks there is one iPad left, and sells it to customer 1.
  • At the very same time, Node B thinks there is one iPad left, and sells it to customer 2.

Because the system can’t enforce consistency, the same iPad is sold to two people. What to do?

One common solution (alas, I can’t find the original paper where I read the idea) is “compensation” or “apology”: out-of-band business processes to repair the mistakes. In this case, a post-processing stage that notices the double-sold iPad and handles it somehow.

How this rare but inevitable mistake is handled is a policy decision, and Apple’s chosen policy is to simply cancel the order–contrary to a guarantee they make on their website.
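
As a toy sketch (with entirely made-up data structures), such a compensation pass might look something like this:

def reconcile_oversold(orders, stock):
    # orders: list of (timestamp, order_id, item) tuples, as merged
    # from all nodes; stock: {item: units actually available}.
    cancellations = []
    for item, available in stock.items():
        item_orders = sorted(o for o in orders if o[2] == item)
        # The oldest orders keep their items; anything beyond the
        # real stock gets canceled, per the chosen policy.
        cancellations.extend(item_orders[available:])
    return cancellations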

A complaint is made

If you go to the Apple Store website’s refurbished section, you will see in small letters at the top that “availability is guaranteed once we receive your full payment.” Given that promise, and the fact we’d gotten a confirmation email for our payment, my wife called up customer service and politely asked why our order was canceled.

The representative went off, and after some delay she indicated that she’d talked to her manager and she’d gotten approval to send us a new iPad instead. So the next week we received a 2018 iPad, while only paying the cost of a refurbished 2017 model.

Success, and easy success at that. My wife didn’t have to complain loudly, point out Apple was in the wrong, or hassle anyone. She just asked.

The topic of negotiation is reintroduced

Apple made a promise (payment == guaranteed delivery), and then violated it. Why? Violating the guarantee on their website was an opening offer in a negotiation.

When you’re negotiating, you need to ask for what you want, or you won’t get it. In practice most people won’t ask and won’t complain, and so it’s in Apple’s interest to start with a low offer: most people will get the email canceling the order, grumble a bit, and re-order something else.

Often how you ask is also important: you need to ask the right way. I once had a $5000 medical expense denied over and over by a health insurance company. Eventually I indicated I wished to file a grievance–a bureaucratic procedure I found on their website–and suddenly they found a “coding error” that had caused the problem and my bill was paid. (My uninformed guess is that grievances get reported to state regulators.)

The same dynamic is part of how many companies take advantage of employees, and the cost you’ll pay for not asking is even higher. When a company gives you a job offer they will have a salary range they are willing to pay, but they will usually not offer their high number. Instead, they will offer you the lowest number possible they think you might accept. Sometimes that number will also be far below market rate.

Many people will simply accept that initial salary offer–I’ve certainly made that mistake–and therefore end up getting paid much less than they could otherwise get. But of course that’s just an initial offer, and simply asking for more will usually result in a higher offer. And you can do even better if you do a little work beforehand.

A summary is presented

Here’s what we’ve learned:

  1. Recognize negotiating situations: you’ve already lost if you don’t realize you’re negotiating.
  2. Ask for what you want, or you won’t get it.
  3. Ask in the right way.

For a useful guide to negotiating your salary and benefits in general, see Valerie Aurora’s negotiating guide. And if it’s your working hours you care about, check out my book: Negotiate a 3-Day Weekend.



May 01, 2018 04:00 AM

April 29, 2018

Twisted Matrix Laboratories

Twisted 18.4.0 Released

On behalf of Twisted Matrix Laboratories, I'm honoured to announce the release of Twisted 18.4.0!

The highlights of the release are:

  • The dropping of Python 3.3 support.
  • Python 3 fixes (notably to trial -j, asyncioreactor, conch, and mail)
  • Python 3 TCP speed improvements (less copying when sending data)
  • Better TLS curve selection support for both old and new OpenSSLs
  • IPv6 fixes for WSGIResource
  • 60+ closed tickets with many fixed bugs!
For more information, check the NEWS file (link provided below).

You can find the downloads at <https://pypi.python.org/pypi/Twisted> (or alternatively <http://twistedmatrix.com/trac/wiki/Downloads>). The NEWS file is also available at <https://github.com/twisted/twisted/blob/twisted-18.4.0/NEWS.rst>.

Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

Twisted Regards,
Amber Brown (HawkOwl)

by Amber Brown (noreply@blogger.com) at April 29, 2018 11:19 AM

April 09, 2018

Itamar Turner-Trauring

Programming as natural ability, and the bandaid of long work hours

Your project deadline is getting closer and closer, and you’re stuck: you don’t know what to do. Your manager won’t help, they just push you to work evenings and weekends–and when it looks like you’re going to fail, they hand the project over to another programmer.

You’ve failed your manager, and there’s a little voice in the back of your head telling you that maybe you’re missing what it takes to be a programmer, maybe you just don’t have the requisite natural ability.

That little voice is lying to you.

It’s the other way around: your manager has failed you, and is compounding the failure by conveying a destructive mindset, what’s known as a fixed mindset. To understand what I’m talking about, let’s take a quick detour into the psychology of education, and then return to those long hours you’ve been working.

Growth mindset vs fixed mindset

Carol Dweck, a professor of psychology at Stanford University, contrasts two mindsets when learning:

  • If you have a growth mindset you assume your abilities can change over time, and that your skills can improve by learning and practice.
  • If you have a fixed mindset you assume your abilities are fixed: some people are naturally more talented than others, and there is no way to change your level of abilities.

According to Dweck’s research, students with a growth mindset do better than students with a fixed mindset. A fixed mindset is self-defeating: it keeps you from learning, and it keeps you from even trying to learn.

In the context of programming this suggests that starting with the attitude that programming is a set of skills, skills that you can learn, will result in a much better learning experience.

But what does this have to do with long working hours?

You can’t work smarter, so you gotta work longer

In an article that was one of the inspirations for this post, Carol Dweck points out that a common failure mode among educators is to praise effort, working harder, instead of praising learning. While they may claim to be encouraging a growth mindset, they are simply perpetuating a fixed mindset.

This failure mode appears in the software world as well. Let’s assume for the moment that programming is a natural ability: just before we’re born, the Angel of Software injects between 0 and 100 milliliters of Programming Juice into our soul. If you’re really lucky, you might even get 110ml!

Now, given that each and every one of us only has a limited amount of Programming Juice, how can you maximize our output? You can’t learn more, so there’s no way to do things more efficiently. You can’t improve your skills, so there’s no way to become more productive.

So what’s left?

All together now: WORK LONGER HOURS!

Working longer ain’t smart

The truth, of course, is that there is no Angel of Software, there is no Programming Juice. Programming is just a bunch of skills. You can learn those skills, and so can most anyone else, given motivation, time, support, and some good teachers. And you can become more and more productive by continuing to learn.

If you believe in fixed talent, if you believe you can’t improve, you won’t try to learn. Long hours will be the only tool left to you.

When faced with a problem: just work longer hours.

When faced with another problem: work even longer.

You’ll work and work and work, and you’ll produce far less than you would have if you’d spent all that time improving your skills. And eventually you’ll hit a problem you can’t solve, and that you will never solve by working longer hours.

A growth mindset will serve you far better. You need to believe that skills can grow, and then you need to actually do the work to learn more and grow your skills.

And when you fail–and you will fail, because we all fail on occasion–take this as another opportunity to learn: look for the patterns and cues you should have spotted. Having learned your lesson, next time you’ll do better.



April 09, 2018 04:00 AM

April 03, 2018

Moshe Zadka

Web Development for the 21st Century

(Thanks to Glyph Lefkowitz for some of the inspiration for this post, and to Mahmoud Hashemi for helpful comments and suggestions. All mistakes and issues that remain are mine alone.)

The Python REPL has always been touted as one of Python's greatest strengths. With Jupyter, Jupyter Lab in its latest incarnation, the REPL has been lifted into the 21st century. It has become the IDE of the future: interactive, great history and logging -- and best of all, you can use it right from your browser, regardless of your platform!

However, we still have 20th century practices for developing web applications. Indeed, the only development is that instead of "CTRL-c, up-arrow, return", we now have "development servers" which are not "production ready" but support auto-reloading -- the equivalent of a robot doing "CTRL-c, up-arrow, return".

Using the REPL to develop web applications is pure bliss. Just like using it to develop more linear code: we write a function, test it ad-hocly, see the results, and tweak.

When we are sufficiently pleased, we can then edit the resulting notebook file into a Python module -- which we import from the next version of the notebook, in order to continue the incremental development. Is such a thing possible?

Let's start by initializing Twisted, since this has to happen early on.

from tornado.platform.twisted import install
reactor = install()

Whew! Can't forget that! Now with this out of the way, let's do the most boring part of every Python program: the imports.

from twisted.web import server
from twisted.internet import endpoints, defer
import klein
import treq

Now, let's start the Klein app. There are several steps involved here.

root = klein.app.resource()

We take the Klein resource object...

site = server.Site(root)

...make a wrapping site...

ep = endpoints.serverFromString(reactor, "tcp:8000")

...create an endpoint...

ep.listen(site)
<Deferred at 0x7f54c5702080 current result: <<class 'twisted.internet.tcp.Port'> of <class 'twisted.web.server.Site'> on 8000>>

and like "Harry met Sally", eventually bring the two together for Klein to respond on port 8000. We have not written any application code, so Klein is currently "empty".

What does that mean?

async def test_slash():
    response = await treq.get('http://localhost:8000')
    content = await response.content()
    return content

This function uses Python 3's async/await features, to use treq (Twisted Requests) and return the result. We can use it as our ad-hoc debugger (but we could also use a web browser -- this is naturally hard to show in a post, though).

defer.ensureDeferred(test_slash()).addBoth(print)
<Deferred at 0x7f54c5532630>
b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>404 Not Found</title>\n<h1>Not Found</h1>\n<p>The requested URL was not found on the server.  If you entered the URL manually please check your spelling and try again.</p>\n'

Ah, yeah. Even / gives a 404 error -- we have literally defined nothing. OK, this is easy to fix:

@klein.route("/")
def something_useful(request):
    return 'Hello world'

Wait, did this work?

defer.ensureDeferred(test_slash()).addBoth(print)
<Deferred at 0x7f54c53d8d30>
b'Hello world'

Yep. But it's not a proper sentence...woops.

@klein.route("/")
def something_useful(request):
    return 'Hello, world!'

Nice. Punctuation. Force. Determination. Other nouns.

Did it change anything?

defer.ensureDeferred(test_slash()).addBoth(print)
<Deferred at 0x7f54c4b9e240>
b'Hello, world!'

Yes.

Incremental web application development. Without an "auto-reloading", "not production grade" server.

We took advantage of several incidental facts: the Jupyter kernel is Tornado-based, and Twisted has both a production-grade web framework and the ability to run on top of the Tornado loop. The combination is powerful.

by Moshe Zadka at April 03, 2018 04:30 AM

March 23, 2018

Itamar Turner-Trauring

You are not your tools

Do you think of yourself as a Python programmer, or a Ruby programmer? Are you a front-end programmer, a back-end programmer? Emacs, vim, Sublime, or Visual Studio? Linux or macOS?

If you think of yourself as a Python programmer, if you identify yourself as an Emacs user, if you know you’re better than those vim-loving Ruby programmers: you’re doing yourself a disservice. You’re a worse programmer for it, and you’re harming your career.

Why? Because you are not your tools, and your tools shouldn’t define your skillset.

Your tools are not your skills

Ask a programmer to list their skills and more often than not you’ll get a list of technologies. But technologies are just a small subset of the skills you need as a programmer.

You need to gather requirements.

You need to debug code.

You need to design user experiences.

You need to build abstractions.

You need to avoid security problems.

And so on and so forth.

None of these skills are technologies. And if you think your only skills are technologies, you won’t notice the skills you don’t have. And if you don’t know what you’re missing, you won’t take the time to learn the skills that can make you truly productive and valuable.

Your tools are not your job

If you define yourself by your tools, you are limiting the range of jobs you can get.

First, because you won’t apply to jobs that use other tools, even when you could do them well.

Second, because you will market yourself based on tools, instead of all the other skills that might get you that job anyway.

(I’ve written elsewhere about how you can get a job with technologies you don’t know).

Your tools are not you

If your identity is tied up with your tools, you won’t listen to people who use different technologies. Some tools are better than others at certain tasks. Some tools are interchangeable. But an expert using a bad tool can often do more than a novice with a good tool.

Spending your time fighting over which tool is better is a waste of your time. Instead, spend your time listening and learning from everyone, whatever tools they use: most skills will transfer just fine.

The technologies you use, the tools you build with, are just that: tools. Learn to use them, and learn to use them well. But always remember that those tools are there to serve you, you are not there to serve your tools.



March 23, 2018 04:00 AM

March 20, 2018

Moshe Zadka

Running Modules

(Thanks to Paul Ganssle for his suggestions and improvements. All mistakes that remain are mine.)

When exposing a Python program as a command-line application, there are several ways to get the Python code to run. The oldest way, and the one people usually learn in tutorials, is to run python some_file.py.

If the file is intended to be usable both as a module and as a command-line program, it will often have

if __name__ == '__main__':
    actually_run_main()

or similar code.

This sometimes has surprising corner-case behavior (for example, the file runs under the name __main__, so importing it later under its real name creates a second, distinct copy of the module). Even worse, some_file.py is looked for in neither $PATH nor sys.path; its location must be given explicitly. It also changes the default Python path: instead of including the current directory, it includes the location of some_file.py.
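
A quick way to observe the path behavior (file and directory names hypothetical):

# show_path.py -- print the first entry of the module search path
import sys
print(sys.path[0])

Running python /elsewhere/show_path.py prints /elsewhere -- the script's directory. Running python -m show_path instead puts the current working directory at the front of sys.path (and show_path must be importable from there).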

The modern, recommended way, of course, is to set entry_points in the setup.py file. When the distribution is installed, a console script is auto-generated and added to the same place the Python interpreter is found. This means we need to think carefully about what else on our $PATH might have the same name, to avoid collisions.
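
For reference, a minimal sketch of such a declaration, with hypothetical package and command names:

from setuptools import setup

setup(
    name="somepackage",
    version="0.1",
    packages=["somepackage"],
    entry_points={
        "console_scripts": [
            # generates a "somecommand" script next to the interpreter
            "somecommand = somepackage.main:actually_run_main",
        ],
    },
)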

There is a third way, which is subtle. When Python sees the -m <some name> option, it will look for a module or a package by that name. If it finds a module, it will run it with __name__ set to "__main__" in order to trigger the path that actually does something -- again leading to some, if not all, of the issues discussed earlier.

However if it finds a package it will run its __main__.py module (still setting __name__ to "__main__") -- not its __init__.py.

This means that at the top of __main__.py we can invert the usual logic:

if __name__ != '__main__':
    raise ImportError("Module intended to be main, not imported",
                      __name__)

from . import main_function
main_function()

This allows running python -m <some package>, but anyone who tries to accidentally import <some package>.__main__ will get an error -- as well they should!
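
Concretely, assuming a hypothetical package named somepackage whose __main__.py begins with the guard above:

# python -m somepackage         -> runs main_function()
# import somepackage            -> fine; __main__.py is never imported
# import somepackage.__main__   -> raises ImportError, as intended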

Among other things, this means we only care about our sys.path, not our $PATH. For example, this will work the same whether the package is installed to the global Python or --user installed.

Finally, if an entrypoint is desired, one can easily be made to run __main__:

import functools
import runpy

import toolz

# Run the package as if by "python -m <some package>", then discard
# run_module's return value so the console script returns None.
entrypoint = toolz.compose(lambda _dummy: None,
                           functools.partial(runpy.run_module,
                                             "<some package>",
                                             run_name='__main__'))

This uses the builtin modules runpy and functools, and the third-party module toolz.

by Moshe Zadka at March 20, 2018 03:30 AM

March 14, 2018

Moshe Zadka

Random Bites of Pi(e)

In today's edition of the Pi Day post, we will imagine we have a pie. (If you lack imagination, go out and get a pie.) (Even if you do not lack imagination, go out and get a pie.)

As is traditional, we got a round pie. Since pies are important, we will base our unit of measure on this pie -- the diameter of the pie will be 1.

Since we had to carry it home, we put it in a box. We are all ecologically minded, of course, so we put it in a box which is square -- with side length 1.

We note something interesting -- the area of the box's bottom is 1 x 1 -- or 1. The radius of the pie is 1/2, so its area is pi * (1/2) ** 2 -- or pi * 0.25. In other words, the pie covers a fraction pi/4 of the box, which is the key to everything that follows.

As we are driving home, the pie on our passenger seat, we start wondering how we can estimate Pi. Luckily, we got some sugar. What if we sprinkled some sugar, and took notes for each grain, whether it was on the pie, or not?

Let's use Python to simulate:

import random
import attr

First, we need a source of randomness. Then, we also want to use the attrs library, because it makes everything more fun.

We make a Point class. Other than the basics, we give it a class method -- a named constructor which will generate a random point on the unit square -- this is where our sugar grain falls.

We also give it a way to calculate distance from another point, using the Pythagorean theorem.

@attr.s(frozen=True)
class Point:
    x = attr.ib()
    y = attr.ib()

    def distance(self, pt):
        return ((self.x - pt.x) ** 2 + (self.y - pt.y) ** 2) ** 0.5

    @classmethod
    def unit_square_random(cls):
        return cls(x=random.random(), y=random.random())

The center of the pie is at 0.5 by 0.5.

center = Point(0.5, 0.5)

A point is inside the pie if it is less than 0.5 away from the center.

def is_in_circle(pt):
    return center.distance(pt) < 0.5

Now we are ready. Even with just 100,000 grains of sugar, we get two decimal places of accuracy.

inside = total = 0
for _ in range(10 ** 5):
    total += 1
    inside += int(is_in_circle(Point.unit_square_random()))
print((inside / total) * 4)
3.14052
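
(A back-of-the-envelope sanity check, not part of the original experiment: each grain is a Bernoulli trial with success probability pi/4, so the standard error of the estimate is 4 * sqrt(p * (1 - p) / n).)

import math

p = math.pi / 4   # probability that a grain lands on the pie
n = 10 ** 5       # grains of sugar
print(4 * math.sqrt(p * (1 - p) / n))   # ~0.0052 -- two decimal places is expected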

by Moshe Zadka at March 14, 2018 03:00 PM

March 09, 2018

Itamar Turner-Trauring

Why you're losing the battle for high-quality software

Every day you go to work and you have to fight. Your boss wants to deliver features as fast as possible, and so you have to either argue for best practice, or work extra hours to do things the right way.

You need to fight for readable and maintainable code. And you need to fight for reusable code. And you need to fight for tests.

And you fight and you fight and you fight, and you keep on losing.

It doesn’t have to be this way. Software development doesn’t have to be a fight. Strange as it may seem, in this game the only winning move is not to play.

Instead of playing “Let’s Write High-Quality Software”, there’s a different and better game you can play: “Let’s Solve The Problem”.

Avoid false dichotomies

The problem with the game of High-Quality Software is that it’s based on a dichotomy, with only two options: low-quality and high-quality. I started this article with another dichotomy: arguing with your boss vs. working longer hours. And a lot of technical discussions quickly devolve into dichotomies, e.g.:

  • Shipping features vs. fixing technical debt.
  • Testing your code vs. coding quickly.

Dichotomies are tempting, and perhaps even built in to the way we understand the world: there’s a left hand and a right hand, a wrong way and a right way. But there’s nothing inherently superior about having just two choices. In fact, all it does is make you a worse engineer, because you’re focusing on arguing instead of focusing on solving the problem.

To combat this tendency, Gerald Weinberg suggests the Rule of Three: always consider at least three solutions to any problem. Let’s start with our first dichotomy: arguing with your boss vs. working longer hours to do things the right way. If there’s a third choice, what might it be?

Stop arguing, start listening

When your boss or colleagues argue for a specific design, instead of telling them why they’re wrong, listen to their reasons. Behind every design proposal is a set of goals, motivations, presumed constraints: those are the things you need to address. If you criticize their proposal based on goals they don’t care about, you’re not going to get anywhere.

So first, listen.

Once you understand why they want what they want:

  1. Consider more than the initial two choices, their proposal and your initial reaction.
  2. Try to find a solution that addresses both your goals.
  3. Explain your thinking in ways that are relevant to their goals, not just yours.

Example scenario: testing

Your boss proposes writing code without unit tests. Why? Because they want customers to be happy. Since customers have been complaining about how long it takes for new features to be delivered, your boss believes this is one way to speed up delivery.

You want to write unit tests. Why? You want code to be maintainable over time.

Merely arguing for unit tests isn’t going to address your boss’ concerns. But if you look past the initial false dichotomy, there are other solutions to consider. For example:

  • Investigate why customers aren’t getting features quickly. Perhaps the bottleneck isn’t due to how fast you’re coding, but elsewhere in the delivery process (you don’t do frequent releases, customers need to upgrade manually… there could be many reasons). Fix that, and you will have time to write unit tests and ship quickly.
  • Figure out places where bugs would be costly to customers, explain those costs to your boss, and propose unit testing only that part of the code.
  • Investigate customer needs in more detail. Perhaps existing customers are complaining about feature delivery, but you’re also losing many customers due to bugs.
  • Suggest using tools that will speed up test writing enough that the additional time won’t bother your boss.
  • Suggest limiting unit test writing to a predetermined amount of time: “this will only add 4 hours to a one week project”.

No doubt there are many more potential solutions to the standoff.

There’s always another solution

There is almost never a single correct solution to any problem, nor a single best solution. You can solve problems by relaxing unnecessary constraints, by focusing on different levels of the situation (organization, process, code), by redefining the problem, and more. Worst comes to worst, you can address many problems by switching jobs; different people like different environments, after all.

So don’t try to win technical arguments, and in fact, don’t treat them as arguments at all. When you disagree with someone, take the time to listen, and then try to come up with a solution that addresses their concerns and yours. And if your first idea doesn’t do that… it’s time to come up with a third, and fourth, and fifth solution.



March 09, 2018 05:00 AM