Planet Twisted

April 12, 2019

Itamar Turner-Trauring

Can software engineering be meaningful work?

The world is full of problems—from poverty to climate change—and it seems like software ought to be able to help. And yet your own programming job seems pointless, doing nothing to make things better. Far too many jobs are about making some rich people a little bit richer.

So how can you do something meaningful with your life?

There are no easy answers, but here’s a starting point, at least.

Don’t make things worse

Even beyond your moral obligations, working on something you actively find wrong is bad for you:

  • Either you end up hating yourself for doing it.
  • Or, in self-defense, you become cynical and embittered, assuming the worst of everyone. This is not pleasant, nor is it an accurate view of the surprisingly varied threads of humanity.

If you find yourself in this situation, you have the opportunity to try to make things a little better, by pushing your organization to change. But you can also just go look for another job elsewhere.

Some jobs are actually good

Of course, most software jobs aren’t evil, but neither are they particularly meaningful. You can help an online store come up with a better recommendation engine, or optimize their marketing funnel, or build a web UI for support staff—but does it really matter that people buy at store A instead of store B?

So it’s worth thinking in detail about what exactly it is you would find meaningful, and seeing if there’s work that matches your criteria. There may not be a huge number of jobs that qualify, but chances are some exist.

If you care about climate change, for example, there are companies building alternative energy systems, working on public transportation planning, and more broadly just making computing more efficient.

Your job needn’t be the center of your life

You may not be able to find such a job, or get such a job. So there’s something to be said for not making your work the center of your existence.

As a programmer you are likely to get paid well, and you can even negotiate a shorter workweek. Given enough free time and no worries about making a living, you have the ability to find meaning outside your work.

  • Make the world a better place, just a little: I’ve been volunteering with a local advocacy group, and the ability to see the direct impact of my work is extremely gratifying.
  • Beauty and nature: Programming as a job can end up leaving you unbalanced as a person—it’s worth seeing the world in other ways as well.
  • Religion: While it has never made sense to me (apparently even as a very young child), many people find their religion deeply satisfying.
  • Creation for creation’s sake: Many of us become programmers because we want to create things, but having a job means turning to instrumental creation, work that isn’t for its own sake. Try creating something not for its utility, but because you want to.
  • Find people who understand you: Being part of a social group that fundamentally doesn’t match who you are and how you view the world is exhausting and demoralizing. I ended up moving to a whole new country because of this. But if you live in a large city, quite possibly the people who will understand you can be found just down the block.

No easy answers

Unless you want to join a group that will tell you exactly what to think and precisely what to do—and there is certainly no lack of those—meaning is something you need to figure out for yourself.

It’s unlikely that you’ll solve it in one fell swoop, nor is it likely to be a fast process. The best you can do is just get started: a meaningful life isn’t a destination, it’s a journey.



Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

April 12, 2019 04:00 AM

April 10, 2019

Twisted Matrix Laboratories

Twisted 19.2.0 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 19.2! The highlights of this release are:
  • twisted.web.client.HostnameCachingHTTPSPolicy was added as a new contextFactory option. This reduces the performance overhead for making many TLS connections to the same host.
  • twisted.conch.ssh.keys can now read private keys in the new "openssh-key-v1" format, introduced in OpenSSH 6.5 and made the default in OpenSSH 7.8.
  • The sample code in the "Twisted Web In 60 Seconds" tutorial runs on Python 3.
  • DeferredLock and DeferredSemaphore can be used as asynchronous context managers on Python 3.5+.
  • twisted.internet.ssl.CertificateOptions now uses 32 random bytes instead of an MD5 hash for the ssl session identifier context.
  • twisted.python.failure.Failure.getTracebackObject now returns traceback objects whose frames can be passed into traceback.print_stack for better debugging of where the exception came from.
  • Much more! 20+ tickets closed overall.
You can find the downloads at <https://pypi.python.org/pypi/Twisted> (or alternatively <http://twistedmatrix.com/trac/wiki/Downloads>). The NEWS file is also available at <https://github.com/twisted/twisted/blob/twisted-19.2.0/NEWS.rst>.

Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

Twisted Regards,
Amber Brown (HawkOwl)

by Amber Brown (noreply@blogger.com) at April 10, 2019 12:35 PM

April 08, 2019

Moshe Zadka

Publishing a Book with Sphinx

A while ago, I decided I wanted to self-publish a book on improving your Python skills. It was supposed to be short, sweet, and fairly inexpensive.

The journey was a success, but had some interesting twists along the way.

From the beginning, I knew what technology I wanted to write the book with: Sphinx. This was because I knew I could use Sphinx to create something reasonable: I had previously ported my "Calculus 101" book to Sphinx, and written other small things in it. Sphinx uses reStructuredText, the markup language I am most familiar with.

I decided I wanted to publish as a PDF (for self-printers or others who find it convenient), as a browser-ready HTML directory, and as an ePub.

The tox environments I created are: epub builds the ePub, html builds the browser-ready HTML, and pdf builds the PDF.
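For illustration, a hypothetical tox.ini along those lines might look like this. The environment names match the post, but the builders, paths, and options are my guesses, not the book's actual configuration:

```ini
[tox]
envlist = html, epub, pdf
skipsdist = True

[testenv]
deps = sphinx

[testenv:html]
commands = sphinx-build -b html source build/html

[testenv:epub]
commands = sphinx-build -b epub source build/epub

[testenv:pdf]
commands = sphinx-build -M latexpdf source build
```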

Initially, the epub environment created a "singlehtml" build, and I used Calibre's command-line utility to transform it into an ePub. This made for a prettier ePub than the one Sphinx creates: it had a much nicer cover, which is what most book-reading applications use as an icon. However, it rendered poorly on Books.app (AKA iBooks).

One of the projects I still plan to tackle is how to improve the look of the rendered ePub, and add a custom cover image.

Finally, a script runs all the relevant tox environments, and then packs everything into a zip file. This is the zip file I upload to Gumroad, so that people can buy it.

I have tried to use other sellers, but Gumroad was the one with the easiest store creation. In order to test my store, even before the book was ready, I created a simple "Python cheat-sheet" poster, and put it on my store.

I then asked friends to buy it, as well as trying to do it myself. After it all worked, I refunded all the test-run purchases, of course!

Refunding on Gumroad is a pleasant process, which means that if people buy the book, and are unhappy with it, I am happy to refund their money.

(Thanks to Glyph Lefkowitz for his feedback on an earlier draft. All mistakes that remain are my responsibility.)

by Moshe Zadka at April 08, 2019 07:00 AM

April 03, 2019

Itamar Turner-Trauring

Setting boundaries at your job as a programmer

There’s always another bug, another feature, another deadline. So it’s easy to fall into the trap of taking on too much, saying “yes” one time too many, staying at the office a little later, answering a work email on the weekend…

If you’re not careful you can end up setting unreasonable expectations, and end up tethered to your work email and Slack. Your manager will expect you to work weekends, and your teammates will expect you to reply to bug reports in the middle of your vacation.

What you want is the opposite: when you’re at home or on vacation, you should be spending your time however you want, merry and free.

You need to set boundaries, which is what I’ll be discussing in the rest of this article.

Prepping for a new job

Imagine you’re starting a new job in a week, you enjoy programming for fun, and you want to be productive as soon as possible. Personally, I wouldn’t do any advance preparation for a new job: ongoing learning is part of a programmer’s work, and employers ought to budget time for it. But you might choose differently.

If so, it’s tempting to ask for some learning material so you can spend a few days beforehand getting up to speed. But you’re failing to set boundaries if you do that, and they might give you company-specific material, in which case you’re just doing work for free.

Learning general technologies is less of a problem—knowing more technologies is useful in your career in general, and maybe you enjoy programming for fun. So instead of asking for learning material, you can go on your own and learn the technologies you know they use, without telling them you’re doing so.

Work email and Slack

Never set up work email or Slack on your phone or personal computer:

  1. It will tempt you to engage with work in your free time.
  2. When you do engage, you’ll be setting expectations that you’re available to answer questions 24/7.

While you’re at work you’ll always have your computer, so you don’t need access on your phone. If you do need to set up work email on your phone for travel, remove the account when you’re back home.

And if you want to have your work calendar on your phone, you can share it with your personal calendar account; that way you’re sharing only your calendar, nothing else.

Vacations

When you’re on vacation, you’re on vacation: no work allowed. That means you’re not taking your work laptop with you, or turning it on if you’re at home.

A week or so in advance of your vacation, explain to your team that you won’t be online, and that you won’t have access to work files. Figure out what information they might need—documentation, in-progress work you want to hand off, and the like—and write it all down where they can find it.

If you must, give your personal phone number for emergencies: given you lack access to your work credentials and email, the chances of your being called for something unimportant are quite low.

You’re paid for your normal work hours (and that’s it)

A standard workweek in the US is 40 hours; elsewhere it can be a little less. Whatever it is, outside those hours you shouldn’t be working, because you’re not being paid for that work. Your evenings, your weekends, your holidays, your vacations—all of these belong to you, not your employer.

If you don’t enforce that boundary between work and non-work, you are sending the message that your time doesn’t belong to you. And if you have a bad manager, they’re going to take advantage of that—or you might end up working long hours out of a misplaced sense of obligation.

So unless you’re dealing with an emergency, you should forget your job exists when your workday ends—and unless you’re on your on-call rotation, you should make sure you’re inaccessible by normal work channels.




April 03, 2019 04:00 AM

March 30, 2019

Moshe Zadka

A Local LRU Cache

"It is a truth universally acknowledged, that a shared state in possession of mutability, must be in want of a bug." -- with apologies to Jane Austen

As Ms. Austen, and Henrik Eichenhardt, taught us, shared mutable state is the root of all evil.

Yet, the official documentation of functools tells us to write code like:

@lru_cache(maxsize=32)
def get_pep(num):
    'Retrieve text of a Python Enhancement Proposal'
    resource = 'http://www.python.org/dev/peps/pep-%04d/' % num
    try:
        with urllib.request.urlopen(resource) as s:
            return s.read()
    except urllib.error.HTTPError:
        return 'Not Found'

(This code is copied from the official documentation, verbatim.)

The decorator, @lru_cache(maxsize=32), is now... module-global mutable state. It doesn't get any more shared, in Python, than module-global: every import of the module will share the object!

We try to pretend that there is no "semantic" difference: the cache is "merely" an optimization. However, very quickly things start falling apart: after all, why would the documentation even tell us how to get back the original function (answer: .__wrapped__) if the cache is so benign?

No, decorating the function with lru_cache is anything but benign! For one, because it is mutable state shared across threads, we have introduced thread locking, with all the resulting complexity, and occasional surprising performance issues.

Another example of non-benign-ness is that, in the get_pep example, sometimes a transient error, such as a 504, will linger on, making all subsequent requests "fail", until a cache eviction (because an unrelated code path went through several PEPs) causes a retry. These are exactly the kind of bugs which lead to warnings against shared mutable state!
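The failure mode is easy to demonstrate with a minimal, self-contained sketch, where the network fetch is replaced by a stand-in that fails exactly once:

```python
from functools import lru_cache

calls = []

@lru_cache(maxsize=32)
def get_pep(num):
    # Stand-in for the network fetch in the official example: the first
    # call simulates a transient server error, returned as a value.
    calls.append(num)
    if len(calls) == 1:
        return 'Not Found'
    return 'PEP text'

first = get_pep(8)   # the transient failure happens here...
second = get_pep(8)  # ...and the cache hands back the stale failure
```

The second call never reaches the function body at all (calls still has one entry), so the error lingers until the entry happens to be evicted.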

If we want to cache, let us own it explicitly in the using code, and not have a global implementation dictate it. Fortunately, there is a way to properly use the LRU cache.

First, remove the decorator from the implementation:

def get_pep(num):
    'Retrieve text of a Python Enhancement Proposal'
    # Same code as in the official example

Then, in the using code, build a cache:

def analyze_peps():
    cached_get_pep = lru_cache(maxsize=32)(get_pep)
    all_peps, pep_by_type = analyze_index(cached_get_pep(0))
    words1 = get_words_in_peps(cached_get_pep, all_peps)
    words2 = get_words_in_informational(cached_get_pep,
                                        pep_by_type["Informational"])
    do_something(words1, words2)

Notice that in this example, the lifetime of the cache is relatively clear: we create it at the beginning of the function, pass it to the functions we call, and then it goes out of scope and is deleted. (Barring one of those functions sneakily keeping a reference, which would be a bad implementation, and visible when reviewing it.)

This means we do not have to worry about cached failures if the function is retried. If we retry analyze_peps, we know that it will retry retrieving any PEPs, even if those failed before.

If we wanted the cache to persist between invocations of the function, the right solution would be to move it one level up:

def analyze_peps(cached_get_pep):
    # ...

Then it is the caller's responsibility to maintain the cache: once again, we avoid shared mutable state by making the state management be explicit.
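A compact sketch of that pattern (with a stand-in get_pep body, since the network code is not needed to show the cache's lifetime):

```python
from functools import lru_cache

def get_pep(num):
    # Stand-in for the real network fetch
    return 'text of PEP %d' % num

def analyze_peps(cached_get_pep):
    # This function neither knows nor cares how (or whether) caching is done
    return [cached_get_pep(n) for n in (1, 8, 1)]

# The caller owns the cache, and decides when it is created and dropped
cached_get_pep = lru_cache(maxsize=32)(get_pep)
results = analyze_peps(cached_get_pep)
info = cached_get_pep.cache_info()  # hits=1, misses=2
```

Because the caller built the wrapper, the caller can also inspect it with cache_info(), clear it with cache_clear(), or simply let it be garbage-collected.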

In this example, based on the official lru_cache documentation, we used a network-based function to show some of the issues with a global cache. Often, lru_cache is used for performance reasons. However, even there, it is easy to create issues: for example, one function using non-common inputs to the LRU-cached functions can cause massive cache evictions, with surprising performance impacts!

The lru_cache implementation is great: but using it as a decorator means making the cache global, with all the bad effects. Using it locally is a good use of a great implementation.

(Thanks to Adi Stav, Steve Holden, and James Abel for their feedback on early drafts. Any issues that remain are my responsibility.)

by Moshe Zadka at March 30, 2019 04:30 AM

March 29, 2019

Itamar Turner-Trauring

On learning new technologies: why breadth beats depth

As a programmer you face an ever-growing stream of new technologies: new frameworks, new libraries, new tools, new languages, new paradigms. Keeping up is daunting.

  • How can you find the time to learn how to use every relevant tool?
  • How do you keep your skills up-to-date without using up all your free time?
  • How can you even learn all the huge number of existing technologies?

The answer, of course, is that you can’t learn them all—in depth. What you can do is learn about the tools’ existence, and learn just enough about them to know when it might be worth learning more.

Quite often, spending 5 minutes learning about a new technology will give you 80% of the benefit you’d get from spending 5 days on it.

In the rest of this article I’ll cover:

  1. The cost of unnecessary in-depth learning.
  2. Why breadth of knowledge is so useful.
  3. Some easy ways to gain breadth of knowledge.

The cost of in-depth learning

Having a broad range of tools and techniques to reach for is a valuable skill both at your job and when looking for a new job. But there are different levels of knowledge you can have: you can be an expert, or you can have some basic understanding of what the tool does and why you might use it.

The problem with becoming an expert is that it’s time consuming, and you don’t want to put that level of effort into every new tool you encounter.

  1. Some new technologies will die just as quickly as they were born; there’s no point wasting time on a dead end.
  2. Most technologies just aren’t relevant to your current situation. GitHub keeps recommending I look at a library for analyzing pulsar data, and not being an astrophysicist I’m going to assume I can safely ignore it.
  3. Software changes over time: even if you end up using a new library in a year or two, by that point the API may have changed. Time spent learning the current API would be wasted.

If you try to spend a few days—or even hours—on every intriguing new technology you hear about, you’re going to waste a lot of time.

The alternative: shallow breadth of knowledge

Most of the time you don’t actually need to use new tools and techniques. As long as you know a tool exists you’ll be able to learn more about it when you need to.

For example, there is a tool named Logstash that takes your logs and sends them to a central location. That’s pretty much all you have to remember about it, and it took you just a few seconds to read that previous sentence.

Maybe you’ll never use that information… or maybe one day you’ll need to get logs from a cluster of machines to a centralized location. At that point you’ll remember the name “Logstash”, look it up, and have the motivation to actually go read the documentation and play around with it.

This is also true when it comes to finding a new job. I was once asked in a job interview about the difference between NoSQL and traditional databases. At the time I’d never used MongoDB or any other NoSQL database, but I knew enough to answer satisfactorily. Being able to answer that question told the interviewer I’d be able to use that tool, if necessary, even if I hadn’t done it before.

Gaining breadth of knowledge

Learning about the existence of tools can be a fairly fast process. And since this knowledge will benefit your employer and you don’t need to spend significant time on it, you can acquire it during working hours.

You’re never actually working every single minute of your day; you always have some time when you’re slacking off on the Internet. Perhaps you’re doing so right now! You can use that time to expand your knowledge.

Here are a couple of ways you can get pointers to new tools and techniques:

Newsletters

Weekly email newsletters are a great way to learn about new tools and techniques. There are newsletters on many languages and topics, from DevOps to PostgreSQL: here’s one fairly detailed list of potential newsletters you can sign up for.

Conferences and Meetups (you don’t have to go!)

Conferences and Meetups are another good source of new tools and techniques. Good conferences and Meetups will aim for a broad range of talks, on topics both new and classic.

But you don’t have to go to the conference or Meetup to benefit, or even watch a recording of the talks to learn something. Just skimming talk topics will give you a sense of what the community is talking and thinking about—and if something sounds particularly relevant to your interests you can spend the extra time to learn more.

Of course, if you can convince your employer to send you to a conference that’s even better: you’ll learn more, and you’ll do it on the company’s dime and time.

Your time is valuable—use it accordingly

There are only so many hours in the day, so many days in a year. That means you need to work efficiently, spending your limited time in ways that have the most impact:

  1. Spend an hour a week learning about new tools, just enough to know when they might be useful.
  2. Keep a record of these tools so you can find them when you need to: star them on GitHub, or add them to your bookmarks or note-taking system.
  3. Only spend the extra time and effort needed to gain more in-depth understanding once you actually need to use the tool. And when you do learn a new tool, do it at your job if you can.



March 29, 2019 04:00 AM

March 19, 2019

Itamar Turner-Trauring

You are not a commodity

Recently a reader wrote in with a question:

I’ll be going to [a coding boot camp]. [After I graduate], my current view is to try hard to negotiate for whatever I can and then get better for my second job, but both of those steps are based on the assumption that I understand what an acceptable range for pay, benefits, etc are, and I feel like it’s leaving money (or time) on the table.

I’m not even sure if entry level jobs should be negotiated since they seem to be such a commodity. Do you have any advice for someone standing on the edge of the industry, looking to jump in?

What I told him, and what I’d like to share with you as well, is this:

  • Don’t think of yourself as a commodity—you’re just undermining yourself.
  • Don’t present yourself as a commodity—it’s bad marketing.
  • You are not a commodity—because no one is.

This is perhaps more obvious if you have lots of experience, but it’s just as true for someone looking for their first job.

We all have different strengths, different weaknesses, different experiences and background. So when it comes to finding a job, you should be highlighting your strengths, instead of all the ways you’re the same as everyone else.

In the rest of this article I’ll show just a few of the ways this can be applied by someone who is switching careers into the tech industry; elsewhere I talk more about the more theoretical side of marketing yourself.

Negotiating as a bootcamp graduate

Since employment is a negotiated relationship, negotiation starts not when you’re discussing your salary with a particular company, but long before that, when you start looking for a job.

Here are just some of the ways you can improve your negotiating strength.

1. Highlight your previous experience

If you’re going to a coding bootcamp chances are you’ve had previous job experience. Many of those job skills will translate to your new career as a software developer: writing software as an employee isn’t just about writing code.

Whether you worked as a marketer or a carpenter, your resume and interviews should highlight the relevant skills you learned in your previous career. Those skills will make you far more competent than the average college graduate.

This might include people management, project management, writing experience, knowing when to cut corners and when not to, attention to detail, knowing how to manage your time, and so on.

And if you can apply to jobs where your business knowledge is relevant, even better: if you used to work in insurance, you’ll have an easier time getting a programming job at an insurance company.

2. Do your research

Research salaries in advance. There are a number of online salary surveys—e.g. Stack Overflow has one—which should give you some sense of what you should be getting.

Keep in mind that top companies like Google and some of the big-name startups use stock and bonuses as a large part of total compensation, and salary alone doesn’t reflect that. Glassdoor has per-company salary surveys, but they tend to skew lower than actual salaries.

3. Get multiple job offers if you can

Imagine candidate A and candidate B: as far as the hiring manager is concerned they seem identical. However, if candidate B has another job offer already, that is evidence that someone elsewhere has decided they like them. So candidate B is no longer seen as a commodity.

Try to apply to multiple jobs at once, and don’t say “yes” immediately to the first job offer you get. If you can get two offers at the same time, chances are you’ll be able to get better terms from one or the other.

In fact, even just saying “I’m in the middle of interviewing elsewhere” can be helpful.

You are not a commodity, so don’t act like one

Notice how all of the suggestions above aren’t about that final conversation about salary. There’s a reason: by the time you’ve reached that point, most of the decisions about your salary range have already been made. You still need to ask for more, but there’s only a limited upside at that point.

So it’s important to present your unique strengths and capabilities from the start: you’re not a commodity, and you shouldn’t market yourself like one.




March 19, 2019 04:00 AM

March 05, 2019

Itamar Turner-Trauring

You can't avoid negotiating—but you can make it easier

You’re looking for a new job, or feel like it’s time for a raise, or maybe you just want to set some boundaries with your boss. And that means negotiating, and you hate the whole idea: asking for things is hard, you don’t want to be treated specially, the idea of having the necessary conversation just makes you super-uncomfortable.

And that’s a problem, because you can’t avoid negotiating: employment is a negotiated relationship. From the minute you start looking for a job until you leave for a new one, you are negotiating.

And maybe you didn’t quite realize that, and maybe you didn’t ever ask for what you want, but in that case you’re still negotiating. You’re just negotiating badly.

But once you internalize this idea, negotiation can get easier.

That awkward, scary conversation where you ask for what you want is really just a small fraction of the negotiation. Which means if you do it right, that final conversation can be shorter, more comfortable, and much more likely to succeed.

To see why, let’s take the example of a job search, and see how the final conversation where you discuss your salary is just a small part of the real negotiation.

How your salary is determined

To simplify matters, we’ll focus just on your salary as a programmer.

Companies tend to have different job titles based on experience, with corresponding ranges of salaries. Your salary is determined first by the prospective job title, and second by the specific salary within that title’s range.

The process goes something like this:

  1. When you send in your resume the HR or recruiting person who reads it puts you into some sort of mental bucket: “this is a senior software engineer.”
  2. The hiring manager reads your resume and refines that initial guess.
  3. The interview process then hardens that job title, and gives the company some sense of how much they want you and therefore where in that title’s salary range to put you.
  4. Finally, you get an offer, and you can push back and try to get more.

That final step, the awkward conversation we tend to think of as the negotiation, is only the end of a long process. By the time you’ve reached it much of your scope for negotiation has been restricted: you’ll have a harder time convincing the company you’re a Software Engineer IV if they’ve decided you’re a Software Engineer II.

Employment is a negotiated relationship

Negotiation isn’t a one-time event, it’s an ongoing part of how you interact with an employer. You start negotiating for your salary, for example, from the day you start applying:

  1. You can choose companies to apply to where your enthusiasm will come across, or where you have highly relevant technical skills.
  2. You can get yourself introduced by a friend on the inside, instead of just sending in your resume.
  3. You can ensure your resume demonstrates your actual level of problem-solving skills. If you don’t phrase things right, it’s easy to give the impression you can only solve problems, not identify them (“I switched us over from VMs to Kubernetes” vs. “I identified hand-built VMs as a problem, investigated solutions, and chose Kubernetes”).
  4. You can interview for multiple jobs at once, so you can use a job offer from company A as independent proof of your value to company B.
  5. You can do well on the technical interview, which correlates with higher salaries.
  6. You can avoid whiteboard puzzles if you tend not to do well on those sorts of interviews.

All of these—and more—are part of the negotiation, long before you get the offer and discuss your salary.

You still need to ask (and negotiation doesn’t stop then)

Yes, you do need to ask for what you want at the end. And yes, that’s going to be scary and awkward and no fun at all. But asking for things is something you can practice in many different contexts, not just job interviews.

But if you treated the whole job interview process as a negotiation, that final conversation will be much easier because the company will really want to hire you—and because they’ll be worried you’ll take that other job offer you mentioned.

You’re not done negotiating when you’ve accepted the offer, though.

When your boss asks you to do something, you don’t have to say yes. In fact, as a good employee it’s your duty not to say yes, but to listen, dig a little deeper, and find the real problem.

Similarly, how many hours you work is not just up to your boss; it’s also about how you push back on unreasonable demands. And again, it’s your duty as a good employee to push back, because work/life balance makes you a better software engineer.

All of which is to say:

  1. You can’t avoid negotiating.
  2. Negotiation is far broader than just that awkward conversation where you make the ask.
  3. Being a good negotiator will make you a far more effective software engineer.



March 05, 2019 05:00 AM

February 14, 2019

Moshe Zadka

Don't Make It Callable

There is a lot of code that overloads the __call__ method. This is the method that "calling" an object activates: something(x, y, z) will call something.__call__(x, y, z) if something is an instance of a Python-defined class.

At first, like every operator overload, this seems like a nifty idea. And then, like most operator overload cases, we need to ask: why? Why is this better than a named method?

The first use case, accepting callbacks, is done better and more readably with a named method. Let's say that the function interesting_files will call the passed-in callback with names of interesting files.

We can, of course, use __call__:

class PrefixMatcher:

    def __init__(self, prefix):
        self.prefix = prefix
        self.matches = []

    def __call__(self, name):
        if name.startswith(self.prefix):
            self.matches.append(name)

    def random_match(self):
        return random.choice(self.matches)

matcher = PrefixMatcher("prefix")
interesting_files(matcher)
print(matcher.random_match())

But it is more readable, and obvious, if we...don't:

class PrefixMatcher:

    def __init__(self, prefix):
        self.prefix = prefix
        self.matches = []

    def get_name(self, name):
        if name.startswith(self.prefix):
            self.matches.append(name)

    def random_match(self):
        return random.choice(self.matches)

matcher = PrefixMatcher("prefix")
interesting_files(matcher.get_name)
print(matcher.random_match())

We can pass the matcher.get_name method, which is already callable, directly to interesting_files: there is no need to make PrefixMatcher callable by overloading __call__.

If something really is nothing more than a function call with some extra arguments, then either a closure or a partial would be appropriate.

In the example above, the random_match method was added to make sure that the class PrefixMatcher is justified. If it were not there, either of these implementations would be more appropriate:

def prefix_matcher(prefix):
    matches = []
    def callback(name):
        if name.startswith(prefix):
            matches.append(name)
    return callback, matches

matcher, matches = prefix_matcher("prefix")
interesting_files(matcher)
print(random.choice(matches))

This uses the function closure to capture some variables and return a function.

def prefix_matcher(prefix, matches, name):
    if name.startswith(prefix):
        matches.append(name)

matches = []
matcher = functools.partial(prefix_matcher, "prefix", matches)
interesting_files(matcher)
print(random.choice(matches))

This uses the functools.partial function to construct a function that has some of the arguments "prepared".

There is one important use case for __call__, but it is specialized: it is a powerful tool when constructing a Python-based DSL. Indeed, this is exactly the time when we want to trade away "doing exactly what the operator always does" in favor of "succinct syntax dedicated to the task at hand."

A good example of such a DSL is stan, where the __call__ function is used to construct XML tags with attributes: div(style="color: blue").
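For illustration, here is a toy sketch of what a DSL along those lines might look like. This is not stan's actual implementation: the Tag class, its method names, and the render method are all invented for this example.

```python
class Tag:
    """A toy XML-builder, loosely inspired by DSLs like stan."""

    def __init__(self, name, attributes=None, children=()):
        self.name = name
        self.attributes = dict(attributes or {})
        self.children = list(children)

    def __call__(self, **attributes):
        # Calling a tag returns a new tag with attributes merged in,
        # giving the div(style="color: blue") syntax.
        merged = {**self.attributes, **attributes}
        return Tag(self.name, merged, self.children)

    def __getitem__(self, child):
        # Indexing adds a child: div(style="...")["hello"]
        return Tag(self.name, self.attributes, self.children + [child])

    def render(self):
        attrs = "".join(f' {k}="{v}"' for k, v in self.attributes.items())
        inner = "".join(
            c.render() if isinstance(c, Tag) else str(c)
            for c in self.children
        )
        return f"<{self.name}{attrs}>{inner}</{self.name}>"


div = Tag("div")
print(div(style="color: blue")["hello"].render())
# <div style="color: blue">hello</div>
```

Here __call__ earns its keep: the goal is a compact syntax that mirrors the XML being built, not an ordinary method call.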

In almost every other case, avoid the temptation to make your objects callable. They are not functions, and should not be pretending.

by Moshe Zadka at February 14, 2019 06:30 AM

January 31, 2019

Itamar Turner-Trauring

Do they have work/life balance? Investigating potential employers with GitHub

When you’re searching for a new programming job you want to avoid companies with long work hours. You can ask about work/life balance during the interview (and unless you’re desperate, you always should ask), but that means wasting time applying and interviewing at companies where you don’t want to work.

What you really want is to only apply to companies that actually allow—or even better, encourage—good work/life balance.

Close reading of the job posting and company hiring pages can sometimes help you do that, but some good companies don’t talk about it, and some bad companies will just lie.

So in this article I’ll explain how you can use GitHub to empirically filter out at least some companies with bad work-life balance.

Let’s look at GitHub profiles

If you go to the profile for a GitHub user (here’s mine) you will see a chart showing contributions over time. The columns are weeks, and each row is a day of the week.

Each square shows contribution on a particular day: the darker the color, the more contributions.

What’s most interesting here are the first and last rows, which are Sundays and Saturdays respectively. As you can see in the image above, this GitHub user doesn’t tend to code much on the weekends: the weekend boxes are mostly blank.

You can also use this UI to see if the company uses GitHub at all. Given this many boxes, the user’s employer probably does use GitHub, but you can also click on a particular box and see the specific contributions. In this case, you would see that on one particular weekday the user contributed to “private repositories”, presumably those for their employer:

On the other hand, if you were to click on the weekend boxes you would see that all the weekend contributions are to open source projects. In short, this user doesn’t code much on the weekend, and when they do it’s on personal open source projects, not work projects.

Generalizing this process will allow you to filter out companies that don’t have work/life balance for developers.

Investigating work/life balance using GitHub

There are two assumptions here:

  • This is a relatively small company; large companies tend to vary more across groups, so you’ll need to adjust the process somewhat to focus on programmers in the group you’re applying to.
  • The company uses GitHub for most of their source code.

Company size you can figure out from the company page or LinkedIn, usage of GitHub will be something we can figure out along the way.

This is not a perfect process, since users can disable showing private repository contributions, or it’s possible the developer has personal private repositories. This is why you want to check as many profiles as possible.

Here’s what you should do:

  1. Find a number of programmers who work for the company. You can do this via the company’s website, and by the company’s page on LinkedIn, which will list employees who have LinkedIn profiles. You can also check for members of the company’s GitHub organization, if they have a public list, but don’t rely on this exclusively since it will likely omit many employees.
  2. For each programmer, use your favorite search engine and search for “$NAME github” to find their GitHub profile. Try to do some sanity checks to make sure it’s the right person, especially if they have a common name: location, organization membership, technologies used, etc..
  3. For each programmer, check if they contribute to private repositories during the week. You can do this by visually checking whether there are lots of green boxes, and by clicking on the Monday-Friday boxes in the timeline and reading the results below. If they mostly don't, the company might not use GitHub.
  4. If they do use GitHub at work, for each programmer, check if they code on the weekend. You can visually see if they have lots of green boxes on Sundays and Saturdays.
  5. If they do code on the weekend, check if those are work contributions. You can do this by clicking on the weekend boxes and seeing if those are contributions to private repositories.

Additional tips:

  • If the company mostly writes open source code, you can check if the programmers contribute to the relevant open source repositories during the weekend.
  • Look for correlated weekend work dates across different people: if 5 people are working on private repos on the same 4 weekends, that’s probably a sign of crunch time.
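Parts of this checking can be automated. Here is a sketch (the function names are mine): it uses GitHub's public events API, which only exposes a user's recent public activity, so it supplements the contribution-graph clicking above rather than replacing it.

```python
import json
from datetime import datetime
from urllib.request import urlopen


def is_weekend(iso_timestamp):
    """True if an ISO-8601 timestamp like '2019-01-05T17:03:22Z'
    falls on a Saturday or Sunday."""
    ts = datetime.strptime(iso_timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return ts.weekday() >= 5  # Monday is 0, so 5 and 6 are the weekend


def weekend_activity(username):
    """Fraction of a user's recent public events that happened on a
    weekend.  Covers only recent public activity, not the private
    repositories that matter most for this investigation."""
    url = f"https://api.github.com/users/{username}/events/public"
    with urlopen(url) as response:
        events = json.load(response)
    if not events:
        return 0.0
    weekend = sum(1 for e in events if is_weekend(e["created_at"]))
    return weekend / len(events)
```

A high ratio across several employees would be a prompt to click through their profiles and see whether that weekend activity is work or personal open source.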

By the end of the process you will often have a decent sense of whether most of the programmers are coding on the weekend. Note that this method can only tell you if they don't have work/life balance; it can't prove that they do.

So even if you decide to apply to a company that passed this filter, you will still need to ask questions about work/life balance and listen closely during the interview process.

Always do your research

Before you apply for a job you should always do your research: read their website, read all their job postings, talk to a friend there if you have one… and, as I explained in this article, check their GitHub profiles as well. You'll still need to look for warning signs during interviews, but research is still worth it.

The more you research, the more you'll know whether you want to work there. And if you do want to work there, the more you know, the better your interview will go.




January 31, 2019 05:00 AM

January 25, 2019

Moshe Zadka

Staying Safe with Open Source

A couple of months ago, a successful attack against the Node ecosystem resulted in stealing an undisclosed amount of bitcoins from CoPay wallets.

The technical flow of the attack is well-summarized by the NPM blog post. Quick summary:

  1. nodemon, a popular way to run Node applications, depends on event-stream.
  2. The event-stream maintainer has not had time to maintain it.
  3. right9control asked event-stream maintainer for commit privileges to help maintain it.
  4. right9control added a dependency on a new library, flatmap-stream.
  5. flatmap-stream contained malicious code to steal wallets.

Obfuscation

A number of methods were done to disguise the attack.

The dependency was added in a minor version, and a new version was immediately released. This meant that most projects, which pin to minor, would get the update, while it stayed invisible on the main GitHub landing page, or the main npm landing page.

The malicious code was only in the minified version of the library that was uploaded to npm.org. The non-minified source code on both GitHub and npm.org, as well as the minified code on GitHub, did not contain the malicious code.

The malicious code was encrypted with a key that used the description of other packages in the dependency tree. That made it impossible to understand the attack without guessing which package decrypts it.

The combination of all those methods meant that the problem remained undetected for two months. It was only luck that detected it: the decryption code was using a deprecated function, and investigating the deprecation message led to the issue being figured out.

This bears thinking about: if the code had been written slightly better, the problem would still be happening now, and nobody would be the wiser. We should not discount the possibility that currently, someone who followed the same playbook but managed to use AES correctly is still attacking some package, and we have no idea.

Solutions and Non-Solutions

I want to discuss some non-solutions in trying to understand how this problem came about.

Better Vetting of Maintainers

It is true that the person who made this commit had an obviously auto-generated username (<word>-<digit>-<word>) and made few contributions before getting control. But short of meeting people in person, I do not think better vetting would work.

Attackers adapt. Ask for better usernames, they will generate "<firstname>-<lastname>" names. Are you going to disallow my GitHub username, moshez? Ask for more contributions, you will get some trivial-code-that's-uploaded-to-npm, autogenerated a bit to disguise it. Ask for longer commit history, they'll send fixes to trivial issues.

Remember that this is a distributed problem, with each lead maintainer having to come up with their own vetting procedure. Otherwise, attackers get usernames through one project's vetting process, and then use those "vetted" identities to spam other maintainers, who now are sure they can trust them.

In short, this is one of the classical defenses that fails to take into considerations that attackers adapt.

Any Solution That Depends on JavaScript-specific Things

This attack could easily have been executed against PyPI or RubyGems. Any solution that relies on JavaScript-specific abilities, such as a least-access-based design, only helps make sure that these attacks go elsewhere.

It's not bad to do it. It just does not solve the root cause.

This also means that "stop relying on minified code" is a non-solution in the world where we encourage Python engineers to upload wheels.

Any Solution That Depends on "Audit Code"

A typical medium-sized JavaScript client app depends on some 2000 packages. Auditing each one, on each update, would make using third-party packages untenable. This means that start-ups playing fast and loose with such rules would gain an advantage over those who do not. Few companies can afford to pay that much for security.

Hell, we knew this was a possibility a few months before the attack was initiated, and still nobody did code auditing. Starting now would mostly be a result of availability bias, which means it would be over as soon as another couple of months go by without a documented attack.

Partial Solution -- Open Source Sustainability

If we could just pay maintainers, they would be slightly more comfortable maintaining packages and less desperate for help. This means that it would become inherently slightly harder to quickly become a new maintainer.

However, it is worthwhile to consider that this still would not solve the subtler "adding a new dependency" attack described earlier: just making a "good" library and getting other libraries to depend on it.

Summary

I do not know how to prevent the "next" attack. Hillel makes the point that a lot of "root causes" will only prevent almost-exact repeats, while failing to address trivial variations. Remember that one trivial variation, avoiding deprecation warnings, would have made this attack much more successful.

I am concerned that, as an industry, we are not discussing this attack a mere two months after discovery and mitigation. We are vulnerable. We will be attacked again. We need to be prepared.

by Moshe Zadka at January 25, 2019 05:00 AM

Itamar Turner-Trauring

A 4-day workweek for programmers, the easy way

You’re dreaming of a programming job with 30 hours a week, a job where you’ll have time for your own projects, your own hobbies. But this sort of job seems practically non-existent—almost no one advertises programming jobs with shorter workweeks.

How do you carve out a job like this, a job with a shorter workweek?

The ideal would be some company or employer where you can just ask for a shorter workweek, without having to apply or interview, and have a pretty good chance of getting a “yes”.

In this post I’ll talk about the easiest way to get what you want: negotiating at your current job.

The value of being an existing employee

As an existing employee you are much more valuable than an equally experienced programmer who doesn’t work there.

During your time at your employer you have acquired a variety of organization-specific knowledge and skills. It can take quite a while for a new employee to acquire these, and during the ramp-up period they will also take up their colleagues’ time.

Here are just a few of the things that you’ve likely learned during your time as an employee:

  • The existing codebase.
  • Local best practices, idioms, and coding conventions.
  • The business processes at the company.
  • The business domain.
  • The informal knowledge network in the company, i.e. who is the expert in what.

Not only do you have hard-to-replace skills and knowledge, you also have your work record to build on: once you’ve been working for a manager for a while they’ll know your skills, and whether they can trust you.

A real-life example

In my own career, being an existing employee has benefited me on multiple occasions:

After a number of years working as a software engineer for one company, I got a bad case of RSI (repetitive strain injury). I could no longer type, which meant I could no longer work as a programmer. But I did stay on as an employee: one of the managers at the company, who I’d worked for in an earlier role, offered me a position as a product manager.

In part this was because the company was run by decent people, who for the most part took care of their employees. But it was also because I had a huge amount of hard-to-replace business and technical knowledge of the company’s product, in a highly complex domain.

I worked as a product manager for a few years, but I never really enjoyed it. And with time my hands recovered, at least partially, potentially allowing me to take up programming again. After my daughter was born, my wife and I decided that I’d become a part-time consultant, and take care of our child the rest of the time, while she continued working full-time.

My manager was upset when I told him I was leaving. I felt bad—but really, if your boss is unhappy when you’re leaving, that’s a good thing.

In fact, my boss offered to help me find a less-than-full-time programming position within the company so I wouldn’t have to leave. I’d already made up my mind to go, but under other circumstances I might have taken him up on the offer.

Notice how I was offered reduced hours, even though companies will never advertise such positions. That’s the value of being an existing employee.

Asking for what you want

Unless you work for a really bad manager—or a really badly managed company—a reasonable manager would much prefer to have your experience for 4 days a week than have to find and train a replacement. That doesn’t mean they’ll be happy if you ask for a shorter workweek: you are likely to get some pushback.

But if you have a strong negotiating position—financial savings, valuable work, the willingness to find another job if necessary—there’s a decent chance they will eventually say “yes”.

Does negotiating seem too daunting, or not something you can do? Plenty of other programmers have done it, even those with no previous negotiation experience.

Much of this article was an excerpt from my book, Negotiate a 3-Day Weekend. It covers everything from negotiation exercises you can do on the job, to a specific script for talking to your boss, to negotiating at new employers if you can’t do it at your current job.

With a little bit of practice, you can get the free time you need.




January 25, 2019 05:00 AM

January 18, 2019

Itamar Turner-Trauring

Negotiate your salary like a 6-year old

You’re applying for jobs, you’re hoping to get an offer soon—and when you do you’ll have to face the scary issue of negotiation.

If you read a negotiation guide like Valerie Aurora’s HOWTO (and you should!) you’re told you need to negotiate: you need to ask for more. And that’s good advice.

The problem is that asking for more is scary: it feels confrontational, your body will kick in with an unhelpful adrenaline surge, it’s just not comfortable. And honestly, given you only negotiate your salary every few years, that adrenaline surge probably isn’t going to go away.

But I think you can get past that and negotiate anyway—by embracing your inner 6-year old. In particular, a 6-year old negotiating for snacks on a playdate.

Snack time!

Let’s set the scene.

A 6-year old named Emma is visiting her friend Jackson, and now it’s time for a snack. Jackson’s parent—Michael—now needs to feed Emma.

Michael would prefer the children not eat crackers, but he has a problem. Over Jackson he has a parent's authority, and he knows what Jackson is willing to eat, so he can say "you're eating one of these mildly healthy snacks" and that's the end of it.

But Emma is just visiting: Michael has less authority, less knowledge, and a hungry 6-year old is Bad News. So Michael comes up with an acceptable range of snacks, and tries his best to steer towards the ones he considers healthier.

The conversation goes something like this:

Michael: “Would you like some fruit?”

Emma: blank stare.

Michael: “How about some cheese?”

Emma: shakes her head.

Michael: “Yogurt?”

Emma: shakes her head.

Michael: “Crackers?”

Emma and Jackson: “Yes!”

Michael has committed to feeding Emma something, he doesn’t want her to go hungry—but he doesn’t have the normal leverage he has as a parent. As a result, Emma can just stall until she gets what she wants. Particularly enterprising children will ask for candy (even when they would never get candy at home!), but stalling seems well within the capacity of most 6-year olds.

Salary time!

The dynamic of hiring a new employee is surprisingly similar.

Whoever is making the offer—HR, an internal recruiter, or the hiring manager—has already committed to hiring you. They’ve decided: they had interviews and meetings and they want to get it over with and just get you on board.

So they come up with an acceptable salary range, and offer you the low end of the range. If you accept that, great. And if you say “can you do better?”

Well, they’ve already decided on their acceptable salary range: they’ll just move up within that range. They’re not insulted, they’re used to this. They’re not horrified at a hole in their budget, this is still within their acceptable range.

You went from fruit to crackers, and they can live with that. All you have to do is ask.

Embrace your inner 6-year old

Much of what determines your salary happens before you get the offer, when the decision is made about what salary range to offer you.

You can influence this by the language on your resume, by how you present yourself, how you interview, and by noting you have competing offers. It may not feel like negotiation, but it’s all part of the process—and while it’s a set of skills you can master as an adult, that part is far beyond what your 6-year old self could do.

But the actual conversation about salary? Pretend you’re 6, pretend it’s a snack, and ask for more—chances are you’ll get some delicious crackers.




January 18, 2019 05:00 AM

January 09, 2019

Moshe Zadka

Checking in JSON

JSON is a useful format. It might not be ideal for hand-editing, but it does have the benefit that it can be hand-edited, and it is easy enough to manipulate programmatically.

For this reason, it is likely that at some point or another, checking in a JSON file into your repository will seem like a good idea. Perhaps it is even beyond your control: some existing technology uses JSON as a configuration file, and the easiest thing is to go with it.

It is useful to still keep the benefit of programmatic manipulation. For example, if the JSON file encodes a list of numbers, and we want to add 1 to every even number, we can do:

import json

with open("myfile.json") as fp:
    content = json.load(fp)
content = [x + 1 if x % 2 == 0 else x for x in content]
with open("myfile.json", "w") as fp:
    json.dump(content, fp)

However, this does cause a problem: presumably, before, the list was formatted in a visually-pleasing way. Having dumped it, now the diff is unreadable -- and hard to audit visually.

One solution is to enforce consistent formatting.

For example, using pytest, we can write the following test:

def test_formatting():
    with open("myfile.json") as fp:
        raw = fp.read()
    content = json.loads(raw)
    redumped = json.dumps(content, indent=4) + "\n"
    assert raw == redumped

Assuming we gate merges to the main branches on passing tests, it is impossible to check in something that breaks the formatting. Automated programs merely need to remember to give the right options to json.dumps. However, what happens when humans make mistakes?

It turns out that Python already has a command-line tool to reformat:

$ python -m json.tool myfile.json > myfile.json.formatted
$ mv myfile.json.formatted myfile.json

A nice test failure will remind the programmer of this trick, so that it is easy to do and check in.
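The same options can also be packaged as a small helper script, so the fixer and the test stay in agreement about what "canonical" means. This is a sketch; the canonical function and the in-place file handling are illustrative:

```python
import json
import sys


def canonical(text):
    """Return the canonical formatting for a JSON document: the same
    json.dumps options the formatting test uses, plus a trailing newline."""
    return json.dumps(json.loads(text), indent=4) + "\n"


def reformat(path):
    """Rewrite a JSON file in place in the canonical format."""
    with open(path) as fp:
        raw = fp.read()
    with open(path, "w") as fp:
        fp.write(canonical(raw))


if __name__ == "__main__":
    for path in sys.argv[1:]:
        reformat(path)
```

Because canonical is idempotent, running the script on an already-formatted file changes nothing, which is exactly the property the test enforces.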

by Moshe Zadka at January 09, 2019 06:00 AM

Itamar Turner-Trauring

Competing with a "Stanford grad just dying to work all nighters on Red Bull"

A reader of my newsletter wrote in, talking about the problem of finding a job with work/life balance in Silicon Valley:

It seems like us software engineers are in a tough spot: companies demand a lot of hard work and long hours, and due to the competitiveness here in Silicon Valley, you have to go along with it (or else there’s some bright young Stanford grad just dying to work all nighters on Red Bull to take your place).

But they throw you aside once the company has become established and they no longer need the “creative” types.

In short, how do you get a job with work/life balance when you’re competing against people willing to work long hours?

All nighters make for bad software

The starting point is realizing that working long hours makes you a much less productive employee, to the point that your total output will actually decrease (see Evan Robinson on crunch time). If you want to become an effective and productive worker, you’re actually much better off working normal hours and having a personal life than working longer hours.

Since working shorter hours makes you more productive, that gives you a starting point for why you should be hired.

Instead of focusing on demonstrative effort by working long hours, you can focus on identifying and solving valuable problems, especially the bigger and longer term problems that require thought and planning to solve.

Picking the right company

Just because you’re a valuable, productive programmer doesn’t mean you’re going to get hired, of course. So next you need to find the right company.

You can imagine three levels of managerial skill:

  • Level 1: These managers have no idea how to recognize effective workers, so they only judge people by hours worked.
  • Level 2: These managers can recognize effective workers, but don’t quite know how to create a productive culture. That means if you choose to work long hours they won’t stop you, however pointless those long hours may be. But they won’t force you to work long hours so long as you’re doing a decent job.
  • Level 3: These managers can recognize effective workers and will encourage a productive culture. Which is to say, they will explicitly discourage working long hours except in emergencies, they will take steps to prevent future emergencies, etc..

When you look for a job you will want to avoid Level 1 managers. However good your work, they will be happy to replace you with someone incompetent so long as they can get more hours out of them. So you’ll be forced to work long hours and work on broken code.

Level 3 managers are ideal, and they do exist. So if you can find a job working for them then you’re all set.

Level 2 managers are probably more common though, and you can get work/life balance working for them—if you set strong boundaries. Since they can recognize actual competence and skills, they won’t judge you during your interview based on how many hours you’re willing to work. You just need to convey your skill and value, and a reasonable amount of dedication to your job.

And once you’ve started work, if you can actually be productive (and if you work 40 hours/week you will be more productive!) they won’t care if you come in at 9 and leave at 5, because they’ll be happy with your work.

Unlike Level 3 managers, however, you need to be explicit about boundaries during the job interview, and even more so after you start. Elsewhere I wrote up some suggestions about how to convey your value, and how to say “no” to your boss.

Employment is a negotiated relationship

To put all this another way: employment is a negotiated relationship. Like it or not, you are negotiating from the moment you start interviewing, while you’re on the job, and until the day you leave.

You are trying to trade valuable work for money, learning opportunities, and whatever else your goals are (you can, for example, negotiate for a shorter workweek). In this case, we’re talking about negotiating for work/life balance:

  1. Level 1 managers you can’t negotiate with, because what they want (long hours) directly conflicts with what you want.
  2. Level 2 managers you can negotiate with, by giving them one of the things they want: valuable work.
  3. Level 3 managers will give you what you want without your having to do anything, because they know it’s in the best interest of everyone.

Of course, even for Level 3 managers you will still need to negotiate other things, like a higher salary.

So how do you get a job with work/life balance? Focus on providing and demonstrating valuable long-term work, avoid bad companies, and make sure you set boundaries from the very start.




January 09, 2019 05:00 AM

December 12, 2018

Itamar Turner-Trauring

Tests won't make your software correct

Automated tests are immensely useful. Once you’ve started writing tests and seen their value, the idea of writing software without them becomes unimaginable.

But as with any technique, you need to understand its limitations. When it comes to automated testing—unit tests, BDD, end-to-end tests—it’s tempting to think that if your tests pass, your software is correct.

But tests don’t, tests can’t tell you that your software is correct. Let’s see why.

How to write correct software

To implement a feature or bugfix, you go through multiple stages; they might be compressed or elided, but they are always necessary:

  1. Identification: Figure out what the problem is you’re trying to solve.
  2. Solution: Come up with a solution.
  3. Specification: Define a specification, the details of how the solution will be implemented.
  4. Implementation: Implement the specification in code.

Your software might end up incorrect at any of these points:

  1. You might identify the wrong problem.
  2. You might choose the wrong solution.
  3. You might create a specification that doesn’t match the solution.
  4. You might write code that doesn’t match the specification.

Only human judgment can decide correctness

Automated tests are also a form of software, and are just as prone to error. The fact that your automated tests pass doesn’t tell you that your software is correct: you may still have identified the wrong problem, or chosen the wrong solution, and so on.

Even when it comes to ensuring your implementation matches your specification, tests can’t validate correctness on their own. Consider the following test:

def test_addition():
    assert add(2, 2) == 5

From the code’s perspective—the perspective of an automaton with no understanding—the correct answer of 4 is the one that will cause it to fail. But merely by reading that you can tell it’s wrong: you, the human, are key.

Correctness is something only a person can decide.

The value of testing: the process

While passing tests can’t prove correctness, the process of writing tests and making them pass can help make your software correct. That’s because writing the tests involves applying human judgment: What should this test assert? Does it match the specification? Does this actually solve our problem?

When you go through the loop of writing tests, writing code, and checking if tests pass, you continuously apply your judgment: is the code wrong? is the test wrong? did I forget a requirement?

You write the test above, and then reread it, and then say “wait, that’s wrong, 2 + 2 = 4”. You fix it, and then maybe you supplement your one-off hardcoded tests with additional tests based on core arithmetic principles. Correctness comes from applying the process, not from the artifacts created by the process.
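Those principle-based tests might look like the following sketch, where both the trivial add implementation and the particular properties checked are purely illustrative:

```python
import random


def add(a, b):
    # The (trivially correct) implementation under test.
    return a + b


def test_addition_properties():
    # Instead of one hardcoded example, check properties that the
    # specification of integer addition demands:
    for _ in range(100):
        a = random.randint(-10**6, 10**6)
        b = random.randint(-10**6, 10**6)
        assert add(a, b) == add(b, a)  # commutativity
        assert add(a, 0) == a          # zero is the identity
        assert add(a, 1) > a           # adding one increases the result


test_addition_properties()
```

Note that human judgment is still doing the real work here: someone had to decide that commutativity and identity are what addition means.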

This may seem like pedantry: what does it matter whether the source of correctness is the tests themselves or the process of writing the tests? But it does matter. Understanding that human judgment is the key to correctness can keep you from thinking that passing tests are enough: you also need other forms of applied human judgment, like code review and manual testing.

(Formal methods augment human judgment with automated means… but that’s another discussion.)

The value of tests: stability

So if correctness comes from writing the tests, not the tests themselves, why do we keep the tests around?

Because tests ensure stability: once we judge the software is correct, the tests can keep the software from changing, and thus reduce the chances of it becoming incorrect. The tests are never enough, because the world can change even if the software doesn’t, but stability has its value.

(Stability also has costs if you make the wrong abstraction layer stable…)

Tests are useful, but they’re not sufficient

To recap:

  1. Write automated tests.
  2. Run those tests.
  3. Don’t mistake passing tests for correctness: you will likely need additional processes and techniques to achieve that.


Struggling with a 40-hour workweek? Too tired by the end of the day to do anything but collapse on the sofa and watch TV?

Learn how you can get a 3-day weekend, every single week.

December 12, 2018 05:00 AM

December 09, 2018

Moshe Zadka

Office Hours

If you want to speak to me, 1-on-1, about anything, I want to be able to help. I am a busy person. I have commitments. But I will make the time to talk to you.

Why?

  • I want to help.
  • I think I'll enjoy it. I like talking to people.

What?

I can offer opinions and experience on programming in general, Python, UNIX, the software industry and other topics.

How did you come up with the idea?

I am indebted to Robert Heaton for the idea and encouragement.

Should I...?

Sure! Especially if you have few connections in the industry, and have questions, I can talk to you. I am a fluent speaker of English and Hebrew, so you do need to be able to converse in one of those...

E-mail me!

by Moshe Zadka at December 09, 2018 05:30 AM

December 03, 2018

Itamar Turner-Trauring

'Must be willing to work under pressure' is a warning sign

As a programmer looking for a job, you need to be on the lookout for badly managed companies. Whether it’s malicious exploitation or just plain incompetence, the less time you waste applying for these jobs, the better.

Some warning signs are subtle, but not all. One of the most blatant is a simple phrase: “must be willing to work under pressure.”

The distance between we and you

Let’s take a look at some quotes from real job postings. Can you spot the pattern?

  • “Ability to work under pressure to meet sometimes aggressive deadlines.”
  • “Thick skin, ability to overcome adversity, and keep a level head under pressure.”
  • “Ability to work under pressure and meet tight deadlines.”
  • “Willing to work under pressure” and “working extra hours if necessary.”

If you look at reviews for these companies, many of them mention long working hours, which is not surprising. But if you read carefully there’s more to it than that: it’s not just what they’re saying, it’s also how they’re saying it.

When it comes to talking about the company values, for example, it’s always in the first person: “we are risk-takers, we are thoughtful and careful, we turn lead into gold with a mere touch of our godlike fingers.” But when it comes to pressure it’s always in the second person or third person: it’s always something you need to deal with.

Who is responsible for the pressure? It’s a mysterious mystery of strange mystery.

But of course it’s not. Almost always it’s the employer who is creating the pressure. So let’s switch those job requirements to first person and see how it reads:

  • “We set aggressive deadlines, and we will pressure you to meet them.”
  • “We will say and do things you might find offensive, and we will pressure you.”
  • “We set tight deadlines, and we will pressure you to meet them.”
  • “We will pressure you, and we will make you work long hours.”

That sounds even worse, doesn’t it?

Dysfunctional organizations (that won’t admit it)

When phrased in the first person, all of these statements indicate a dysfunctional organization. They are doing things badly, and maybe also doing bad things.

But it’s not just that they’re dysfunctional: it’s also that they won’t admit it. Thus the use of the second or third person. It’s up to you to deal with this crap, because they certainly aren’t going to try to fix things. Either:

  1. Whoever wrote the job posting doesn’t realize they’re working for a dysfunctional organization.
  2. Or, they don’t care.
  3. Or, they can’t do anything about it.

None of these are good things. Any of them would be sufficient reason to avoid working for this organization.

Pressure is a choice

Now, I am not saying you shouldn’t take a job involving pressure. Consider the United States Digital Service, for example, which tries to fix and improve critical government software systems.

I’ve heard stories from former USDS employees, and yes, sometimes they do work under a lot of pressure: a critical system affecting thousands or tens of thousands of people goes down, and it has to come back up or else. But when the USDS tries to hire you, they’re upfront about what you’re getting into, and why you should do it anyway.

They explain that if you join them your job will be “untangling, rewiring and redesigning critical government services.” Notice how “untangling” admits that some things are a mess, but also indicates that your job will be to make things better, not just to passively endure a messed-up situation.

Truth in advertising

There’s no reason why companies couldn’t advertise in the some way. I fondly imagine that someone somewhere has written a job posting that goes like this:

“Our project planning is a mess. We need you, a lead developer/project manager who can make things ship on time. We know you’ll have to say ‘no’ sometimes, and we’re willing to live with that.”

Sadly, I’ve never actually encountered such an ad in the real world.

Instead you’ll be told “you must be able to work under pressure.” Which is just another way of saying that you should find some other, better jobs to apply to.




December 03, 2018 05:00 AM

November 29, 2018

Hynek Schlawack

Python Application Dependency Management in 2018

We have more ways to manage dependencies in Python applications than ever. But how do they fare in production? Unfortunately this topic turned out to be quite polarizing and was at the center of a lot of heated debates. This is my attempt at an opinionated review through a DevOps lens.

by Hynek Schlawack (hs@ox.cx) at November 29, 2018 05:00 PM

Moshe Zadka

Common Mistakes about Generational Garbage Collection

(Thanks to Nelson Elhage and Saivickna Raveendran for their feedback on earlier drafts. All mistakes that remain are mine.)

When talking about garbage collection, the notion of "generational collection" comes up. The usual motivation given for generational garbage collection is that "most objects die young". Therefore, we put the objects that survive a collection cycle (and therefore have proven some resistance) in a separate generation that we scan less often.

This is an optimization only if an object that has survived a collection cycle is less likely to become garbage by the next cycle than a newly allocated object is.

In a foundational paper Infant mortality and generational garbage collection, Dr. Baker laid out an argument deceptive in its simplicity.

Dr. Baker asks the question: "Can we model a process where most objects become garbage fast, but generational garbage collection would not improve things?". His answer is: of course. This is exactly the probability distribution of radioactive decay.

If we have a "fast decaying element", say with a half-life of one second, then 50% of the element's atoms decay in one second. However, keeping the atoms that "survived a generation" apart from newly created atoms is unhelpful: in the next second, all remaining atoms still decay with probability 50%.

We can push the probability of "young garbage" as high as we want: a half-life of half a second, a quarter second, or a microsecond. However, that is not going to make generational garbage collection any better than a straightforward mark-and-sweep.

The exponential distribution, which models the time until a radioactive atom decays, has the property that P(will die in one second) might be high, but P(will die in one second|survived an hour) is exactly the same: the past does not give us information about the future. This is called the "memorylessness" of the exponential distribution (the distribution of waiting times in a Poisson process).
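A quick simulation illustrates this memorylessness (a toy sketch, not from Dr. Baker's paper; exponential lifetimes stand in for decay times, with one "collection cycle" per unit of time):

```python
import random

random.seed(0)

# Lifetimes drawn from an exponential distribution, one per "atom".
rate = 1.0  # decay rate per cycle
lifetimes = [random.expovariate(rate) for _ in range(100_000)]

# P(dies within its first cycle) for a fresh object:
p_fresh = sum(1 for t in lifetimes if t < 1.0) / len(lifetimes)

# P(dies within one more cycle | already survived three cycles):
survivors = [t for t in lifetimes if t > 3.0]
p_survivor = sum(1 for t in survivors if t < 4.0) / len(survivors)

# Both probabilities come out close to 1 - e^-1, about 0.63:
# surviving three cycles told us nothing about the object's future,
# so segregating survivors into an "old generation" buys nothing.
print(p_fresh, p_survivor)
```

If object lifetimes really looked like this, a generational collector would scan the old generation less often yet find just as much garbage there, which is exactly Dr. Baker's point.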

When talking about generational garbage collection, and especially if we are making theoretical arguments about its helpfulness, we need to make arguments about the distribution, not about the averages. In other words, we need to make an argument that some kinds of objects hang around for a long time, while others tend to die quickly.

One way to model it is "objects are bimodal": if we model objects as belonging to a mix of two Gaussian distributions, one with a small average and one with a big average, then the motivation for generational collection is clear: if we tune it right, most objects that survive the first cycle belong to the other distribution, and will survive for a few more cycles.

To summarize: please choose your words carefully. "Young objects are more likely to die" is an accurate motivation; "Most objects die young" is not. This goes doubly if you do understand the subtlety: do not assume the people you are talking with have an accurate model of how garbage collection works.

As an aside, some languages decided that generational collection is more trouble than it is worth because the objects that "die young" go through a different allocation style. For example, Go has garbage collection, but it tries to allocate objects on the stack if it can guarantee at compile-time they do not "escape". Because of that, the "first generation" is collected at stack popping time.

CPython has generational garbage collection, but it also has a "zeroth generation" of sorts: when a function returns, all its local variables get a "decref", a decrease in reference count. Those whose reference count drops to zero as a result (often quite a few) get collected immediately.
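This behavior is easy to observe (a minimal sketch; it relies on CPython's reference counting, so other Python implementations may reclaim the object later):

```python
class Tracked:
    """Counts live instances so we can observe when one is reclaimed."""
    live = 0

    def __init__(self):
        Tracked.live += 1

    def __del__(self):
        Tracked.live -= 1

def make_temporary():
    obj = Tracked()  # a local reference, never returned
    assert Tracked.live == 1

make_temporary()
# In CPython, obj's reference count hit zero when the function
# returned, so the object was reclaimed immediately: no collection
# cycle of any generation was needed.
assert Tracked.live == 0
```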

by Moshe Zadka at November 29, 2018 03:00 AM

November 20, 2018

Thomas Vander Stichele

Recursive storytelling for kids

Most mornings I take Phoenix to school, as his school is two blocks away from work.

We take the subway to school, having about a half hour window to leave as the school has a half-hour play window before school really starts, which inevitably gets eaten up by collecting all the things, putting on all the clothes, picking the mode of transportation (no, not the stroller; please take the step so we can go fast), and getting out the door.

At the time we make it out, the subway is usually full of people, as are the cars, so we shuffle in and Phoenix searches for a seat, which is not available, but as long as he gets close enough to a pole and a person who looks like they’d be willing to give up a seat once they pay attention, he seems to get his way more often than not. And sometimes, the person next to them also offers up their seat to me. Which is when the fun begins.

Because, like any parent knows these days, as soon as you sit down next to each other, that one question will come:

“Papa, papa, papa… mag ik jouw telefoon?” (Can I have your phone? – Phoenix and I speak Dutch exclusively to each other. Well, I do to him.)

At which point, as a tired parent in the morning, you have a choice – let them have that Instrument of Brain Decay which even Silicon Valley parents don’t let their toddlers use, or push yourself to make every single subway ride an engaging and entertaining fun-filled program for the rest of eternity.

Or maybe… there is a middle way. Which is how, every morning, Phoenix and I engage in the same routine. I answer: “Natuurlijk mag jij mijn telefoon… als je éérst een verhaaltje vertelt.” (Of course you can have my phone – if you first tell me a story.)

Phoenix furrows his brows, and asks the only logical follow-up question there is – “Welk verhaaltje?” (Which story?)

And I say “Ik wil het verhaaltje horen van het jongetje en zijn vader die met de metro naar school gaan” (I want to hear the story of the little boy and his dad who take the subway to school.)

And he looks at me with big eyes and says, “Dat verhaaltje ken ik niet.” (I don’t know that story)

And I begin to tell the story:

“Er was eens… een jongetje en zijn vader.” (Once upon a time, there was a little boy and his father. Phoenix already knows the first three words of any story.)
“En op een dag… gingen dat jongetje en zijn vader met de metro naar school.” (And one day… the little boy and his father took the subway to school. The way he says “op een dag” whenever he pretends to read a story from a book is so endearing it is now part of our family tradition.)

“Maar toen de jongen en zijn vader op de metro stapten zat de metro vol met mensen. En het jongetje wou zitten, maar er was geen plaats. Tot er een vriendelijke mevrouw opstond en haar plaats gaf aan het jongetje, en het jongetje ging zitten. En toen stond de meneer naast de mevrouw ook recht en de papa ging naast het jongetje zitten.” (But when the little boy and his father got on the subway, it was full of people. And the little boy wanted to sit but there was no room. Until a friendly woman stood up and gave up her seat to the little boy, so the little boy sat down. And then the man next to the woman also stood up and his father sat down next to him.)

“En toen de jongen op de stoel zat, zei het jongetje, Papa papa papa papa papa papa papa…”(And when the boy sat down on the chair, he said Papa papa papa papa papa papa)

“Ja?, zei papa.” (Yes?, said papa.)

“Papa, mag ik jouw telefoon”? (Papa, can I have your phone?)

“Natuurlijk jongen….. als je éérst een verhaaltje vertelt.” (Of course son… if you first tell me a story.)

At which point, the story folds in on itself and recurses, and Phoenix’s eyes light up as he mouths parts of the sentences he already remembers, and joins me in telling the next level of recursion of the story.

I apologize in advance to all the closing parentheses left dangling like the terrible lisp programmer I’ve never given myself the chance to be, but making that train ride be phoneless every single time so far is worth it.


by Thomas at November 20, 2018 02:19 AM

November 12, 2018

Itamar Turner-Trauring

Enthusiasts vs. Pragmatists: two types of programmers and how they fail

Do you love programming for its own sake, or do you do it for the outcomes it enables? Depending on which describes you best, you will face different problems in your career as a software developer.

Enthusiasts code out of love. If you’re an enthusiast you’d write software just for fun, but one day you discovered your hobby could also be your career, and now you get paid to do what you love.

Pragmatists may enjoy coding, but they do it for the outcomes. If you’re a pragmatist, you write software because it’s a good career, or for what it enables you to do and build.

There’s nothing inherently good or bad about either, and this is just a simplification. But understanding your own starting point can help you understand and avoid some of the problems you might encounter in your career.

In this post I will cover:

  1. Why many companies prefer to hire enthusiasts.
  2. The career problems facing enthusiasts, and how they can solve them.
  3. The career problems facing pragmatists, and how they can solve them.

Why companies prefer hiring enthusiasts

Before we move on to specific career problems you might face, it’s worth looking at the bigger picture: the hiring and work environment.

Many companies prefer to hire enthusiast programmers: from the way they screen candidates to the way they advertise jobs, they try to hire people who care about the technology for its own sake. From an employer’s point of view, enthusiasts have a number of advantages:

  1. In a rapidly changing environment, they’re more likely to keep up with the latest technologies. Even better, they’re more likely to do so in their free time, which means the company can spend less on training.
  2. Since they’d write software for free, it’s easier to pay enthusiasts less money.
  3. It’s also easier to get enthusiasts to work long hours.
  4. Finally, since enthusiasts care more about the technical challenge than the goals of the product, they’re less likely to choose their work based on ethical or moral judgments.

But while many companies prefer enthusiasts, this isn’t always in the best interest of either side, as we’ll see next.

The career problems facing enthusiasts

So let’s say you’re an enthusiast. Here are some of the career problems you might face; not everyone will have all these problems, but it’s worth paying attention to see if you’re suffering from one or more of them.

1. Exploitation

As I alluded to above, companies like enthusiasts because they’re worse negotiators.

If you love what you do you’ll accept less money, you’ll work long hours, and you’ll ask fewer questions. This can cause you problems in the long run.

So even if you code for fun, you should still learn how to negotiate, if only out of self-defense.

2. Being less effective as an employee

Matt Dupree has an excellent writeup about why being an enthusiast can make you a worse worker; I don’t want to repeat his well-stated points here. Here are some additional ways in which enthusiasm can make you worse at your job:

  • Shiny Object Syndrome: As an enthusiast it’s easy to choose a trendy technology or technique for your work because you want to play with it, not because it’s actually necessary in your situation. The most egregious example I’ve seen in recent years is microservices, where an organizational pattern designed for products with hundreds of programmers is being applied by teams with just a handful of developers.
  • Writing code instead of solving problems: If you enjoy writing code for its own sake, it’s tempting to write more code just because it’s fun. Productivity as a programmer, however, comes from solving problems with as little work as needed.

3. Work vs. art

Finally, as an enthusiast you might face a constant sense of frustration. As an enthusiast, you want to write software for fun: solve interesting problems, write quality code, fine-tune your work until it’s beautiful.

But a work environment is all about outcomes, not about craft. And that means a constant pressure to compromise your artistic standards, a constant need to work on things that aren’t fun, and a constant need to finish things on time, rather than when you’re ready.

So unless you want to become a pragmatist, you might want to get back more time for yourself, time where you can write code however you like. You could, for example, negotiate a 3-day weekend.

The career problems facing pragmatists

Pragmatists face the opposite set of problems; again, not all pragmatists will have all of these problems, but you should keep your eye out to see if they’re affecting you.

1. It’s harder to find a job

Since many companies actively seek out enthusiasts, finding a job as a pragmatist can be somewhat harder. Here are some things you can do to work around this:

  • Actively seek out companies that talk about work/life balance.
  • When interviewing, play up your enthusiasm for technology beyond what it actually is. After all, you will learn what you need to in order to get the results you want, right?
  • Demonstrate the ways in which pragmatism actually makes you a more valuable employee.

2. You need to actively keep your skills up

Since you don’t care about technology for technology’s sake, it can be easy to let your skills get out of date, especially if you work for a company that doesn’t invest in training. To avoid this, you’ll need to deliberately set aside time to keep your skills current.

3. Pressure to work long hours

Finally, you will often encounter pressure both from management and—indirectly—from enthusiast peers to work long hours. Just remember that working long hours is bad for you and your boss (even if they don’t realize it).

Programmer, know thyself

So are you an enthusiast or a pragmatist?

These are not exclusive categories, nor will they stay frozen with time—these days I’m more of a pragmatist, but I used to be more of an enthusiast—but there is a difference in attitudes. And that difference will lead to different choices, and different problems.

Once you know who you are, you can figure out what you want—and avoid the inevitable obstacles along the way.



By the time you get home from work and finish your chores you’re too tired to do anything but watch TV and then collapse into bed. And weekends are just as bad.

But you could have a 3-day weekend not just on holidays, but every single week: learn how you can do it.

November 12, 2018 05:00 AM

November 09, 2018

Itamar Turner-Trauring

The cold incubator: the VC dream of workerless wealth

Chicken incubators are hot: eggs need heat to thrive. This was a different sort of incubator, a glass enclosure within a VC office. The VCs had the best views, but even we could look down ten stories and see the sun sparkling on the Charles River and reflecting off the buildings of downtown Boston.

It was summer when I joined the startup, but even so the incubator was freezing. The thermostat was skewed by a hot light sensor right below it, and the controls reset every night, so mornings were frigid. I took to wearing sweaters and fingerless gloves; the team at the other side of the incubator had figured out a way to cover the AC vents with cardboard in a way that wasn’t visible to passersby.

But I didn’t have to suffer from the cold for very long. Soon after I joined the startup I unwittingly helped trigger our eviction from the rent-free Eden of the incubator to the harsher world outside.

The fall from grace

Most of the people who worked out of the incubator just used a laptop, or perhaps a monitor. But I like working with a standing desk, with a large Kinesis keyboard, and an external monitor on top.

My desk towered over everyone else’s: it was big and imposing and not very neat. Which is to say, the incubator started looking like a real office, not a room full of identical tables with a few MacBook Airs scattered hither and yon. And standing under the freezing air conditioner vent made my arms hurt, so I had to set up my desk in a way that was visible from the outside of the incubator.

And that view was too much for one of the partners in the VC firm. There were too many of us, we had too many cardboard boxes, my standing desk was just too big: it was time for us to leave the incubator.

The dream of workerless wealth

VCs take money, and (if all goes well) turn it into more money. But the actual reality of the work involved was too unpleasantly messy, too uncouth to be exposed to the sensibilities of wealthy clients.

We had to be banished out of sight, the ever-so-slightly grubby realities of office work hidden away, leaving only the clean cold dream of capital compounding through the genius of canny investors.

This dream—a dream of profit without workers—is the driving force behind many an investment fad:

  • The dream in its purest form: Bitcoin and cryptocurrencies promise wealth pulled from thin air, spinning straw into gold without the involvement of any brutish Rumpelstiltskins.
  • Driverless cars promise fleets of assets without those dirty, messy, expensive drivers; but until they come round, Uber and Lyft’s horde of drivers are safely kept at a distance as independent contractors with five star reviews.
  • Artificial intelligence promises decision making without human involvement, an objective encoding of subjective prejudice that will never feel moral qualms.

And if you work for a VC-funded startup, this dream takes on a nightmarish tinge when it turns to consider you.

Unbanished—for now

The point here is not that VCs want to reduce effort: who wouldn’t want a more efficient world? The dream is not driven by visions of efficiency, it’s about status and aesthetics: doing actual work is ugly, and paying for work is offensive.

Of course, some level of work is always necessary and unavoidable. And so VC firms understand that the startups they fund must hire workers like me and you.

But the cold dream is always there, whispering in the background: these workers take equity, they take cash, they’re grubby. So when times are good hiring has to be done as quickly as possible, but when times are bad the layoffs come just as fast.

And when you are working, you need to work as many hours as humanly possible, not because it’s efficient—it isn't—but because paying for your time is offensive, and so you better damn well work hard. Your work may be necessary, but to the cold dream it’s a necessary—and ugly—evil.




November 09, 2018 05:00 AM

November 07, 2018

Moshe Zadka

The Conference That Was Almost Called "Pythaluma"

As my friend Thursday said in her excellent talk (sadly, not up as of this time), naming things is important. Avoiding in-jokes is, in general, a good idea.

It is with mixed feelings, therefore, that my pun-loving heart reacted to Chris's disclosure that the most common suggestion was to call the conference "Pythaluma". However, he decided to go with the straightforward legible name, "North Bay Python".

North of the city by the bay lies the quiet yet chic city of Petaluma, where North Bay Python takes place. In a gold-rush city turned sleepy wine-country town, a historic cinema turned live-show venue hosted Python enthusiasts in a single-track conference.

Mariatta opened the conference with her gut-wrenching talk about being a core Python developer. "Open source sustainability" might be abstract words, but it is easy to forget that for a language that's somewhere between the first and fifth most important (depending on the metric), there are fewer than a hundred people supporting its core -- and if they stop, the world breaks.

R0ml opened the second day of the conference talking about how:

  • Servers are unethical.
  • Python is the new COBOL.
  • I put a lot of pressure on him before his talk.

Talks are still being uploaded to the YouTube channel, and I have already had our engineering team at work watch Hayley's post-mortem of Jurassic Park.

If you missed all of it, I have two pieces of advice:

  • Watch the videos. Maybe even mine.
  • Sign up to the mailing list so you will not miss next year's.

If you went there, I hope you told me hi. Either way, please say hi next year!

by Moshe Zadka at November 07, 2018 08:00 AM

November 02, 2018

Itamar Turner-Trauring

When and why to clean up your code: now, later, never

You’ve got to meet your deadlines, you’ve got to fix the bug, you’ve got to ship the product.

But you’ve also got to think about the future: every bug you introduce now will have to be fixed later, using up even more time. And all those deprecated APIs, out-of-date dependencies, and old ways of doing things really shouldn’t be there.

So when do you clean up your code?

Do you do it now?

Later?

Never?

In this article I’ll go over a set of heuristics that will help you decide when to apply three kinds of fixes:

  1. Updating dependencies that are out-of-date and usages of deprecated APIs.
  2. Refactoring to fix bad abstractions.
  3. Miscellaneous Cleanups of anything else, from coding standard violations to awkward idioms.

Heuristics by situation

Prototyping

Before you start building something in earnest, you might start with a prototype (or what Extreme Programming calls a “spike”). You’re not going to keep this code, you’re just exploring the problem and solution space to see what you can learn.

Given that you’re going to throw away this code, there’s not much point in Updating or Miscellaneous Cleanups. And if you’re just trying to understand an existing API or technical issue, you won’t be doing much Refactoring either.

On the other hand, if your goal with prototyping is to find the right abstraction, you will be doing lots of Refactoring.

  1. Updating: never.
  2. Refactoring: now if you’re trying to prototype an API or abstraction, otherwise never.
  3. Miscellaneous Cleanups: never.

A new project

When you’re starting a completely new project, the decisions you make will have a strong impact on its maintenance costs going forward.

This is a great opportunity to start with the latest (working) dependencies, situation-specific best practices and maintainable code, and the best abstractions you can come up with. You probably won’t get them completely right, but it’s usually worth spending the time to try to get it as close as possible.

  1. Updating: now.
  2. Refactoring: now.
  3. Miscellaneous Cleanups: now.

An emergency bugfix

You need to get a bug fix to users ASAP. You might see problems along the way, but unless they’re relevant to this particular bug fix, it’s best to put them off until later.

Sometimes that might mean fixing the bug twice: once in a quick hacky way, and a second time after you’ve done the necessary cleanups.

  1. Updating: later.
  2. Refactoring: later.
  3. Miscellaneous Cleanups: later.

New feature or non-urgent bugfix

When you have a project in active development and you’re doing ongoing work, whether features or bug fixes, you have a great opportunity to incrementally clean up your code.

You don’t need to fix everything every time you touch the code. Instead, an ongoing cleanup of code you’re already touching will cumulatively keep your codebase in good shape. See Ron Jeffries’ excellent article for details.

  1. Updating: now, for code you’re touching.
  2. Refactoring: now, for code you’re touching.
  3. Miscellaneous Cleanups: now, for code you’re touching.

A project in maintenance mode

Eventually your project will be finished: not much new development is done, and mostly it just gets slightly tweaked every few months when something breaks or a drop-down menu needs an extra option.

Your goal at this point is to do the minimum work necessary to keep the project going. Refactoring and Miscellaneous Cleanups aren’t necessary, but Updates might be—dependencies can stop working, or need security updates. And jumping your dependencies 5 years ahead is often much harder than incrementally doing 5 dependency updates at yearly intervals.

So whenever you have to fix a bug, you should update the dependencies—ideally to Long Term Support releases, to reduce the need for API usage updates.

  1. Updating: now, ideally to Long Term Support releases.
  2. Refactoring: never.
  3. Miscellaneous Cleanups: never.

Balancing present and future

Software projects tend to be ongoing processes, not one-off efforts. A cleanup now might save you time later—but if you have a deadline to meet now, it might be better to put it off even at the cost of slightly more work later on.

So take this article as only a starting point: as with any heuristic, there will be exceptions to the rule, and you need to be guided by your situation and your goals.




November 02, 2018 04:00 AM

October 24, 2018

Hynek Schlawack

Testing & Packaging

How to ensure that your tests run code that you think they are running, and how to measure your coverage over multiple tox runs (in parallel!).

by Hynek Schlawack (hs@ox.cx) at October 24, 2018 08:00 AM

Itamar Turner-Trauring

No more crunch time: the better way to meet your deadlines

Deadlines are hard to navigate.

On the one hand you risk crashing into the rocks of late delivery, and on the other you risk drowning in a whirlpool of broken software and technical debt.

And if you end up working long hours—if your only solution is crunch time—you risk both burnout and technical debt at the same time. Metaphorically, you risk drowning in a whirlpool full of fire-breathing rocks.

Given too much work and not enough time, you need to say “no” to your boss and negotiate a better set of deliverables—but what exactly should you be aiming for?

  • You could argue for maintainable, well-tested code, and deadlines be damned. But what if the deadline is important?
  • Or you could argue for meeting the deadline no matter what, even if that means shipping untested or buggy code. But doesn’t the code need to work?

This dilemma is a false dichotomy. Quality is situation-specific and feature-specific, and rote answers aren’t enough to decide on an implementation strategy.

The real answer is to prioritize the work that really matters in your particular situation, and jettison the rest until after the deadline. And that means saying “no”: saying no to your boss, saying no to your customer, and saying no to yourself.

No, all the code can’t be perfect (but this part needs to be rock solid).

No, that feature isn’t important for addressing our goals.

No, we can’t just add one small thing. But yes, we will do what really matters.

Let’s consider two examples: the same task, but very different goals and implementations.

Deadline #1: Raise money or lose your job

I once worked at a startup that was starting to run out of money: our existing business was not working. Luckily, the co-founders had come up with a plan. We would pivot to a trendy new area—Docker was just gaining traction—where we could build something no one else could do at the time.

Now, this meant we were going to be building a distributed system, a notoriously difficult and complex task. And we had a deadline to meet: we were going to run a press campaign for our initial release, and that requires a few weeks of lead time. And of course on a broader level we needed to make enough of an impression to get funding before our current money ran out.

How do you build a complex, sophisticated piece of software with a short deadline? You start with your goals.

Start with your goals

Our goal was to demonstrate the key feature that only our software could do. We decided to do that by having users walk through a tutorial that demonstrated our value proposition. If the user was impressed with our tutorial then we’d succeeded; we explicitly told users that the software was not yet usable in the real world.

Based on this operational goal we were able to make a whole slew of simplifications:

  1. A production system would need to support parallel operation, but our tutorial only needed to be used by a single user. So we implemented the distributed system by having a CLI that SSHed into a machine and ran another CLI.
  2. A production system would need to handle errors, but our tutorial could run in a controlled environment. We created a Vagrant config that started two virtual machines, and didn’t bother writing error handling for lost connections, inaccessible machines, and so on.
  3. A production system would need to be upgradeable and maintainable, but our tutorial would only be run once. So we didn’t bother writing unit tests for most of it, and focused instead on manually running through the tutorial.

Now, you might be thinking “but you’re choosing to abandon quality”. But again, none of the things we dropped were relevant to our goal at that time. Spending time on “best practices” that aren’t relevant to your particular situation is a waste of time.

We were very clear in all our documentation and marketing efforts that this was a demonstration only, and that we would be following up later with a production-ready release.

Prioritize features by goals

Even with these simplifications, we ended up dropping features to meet the deadline. How did we decide which features to drop?

We started by implementing the minimal features needed to demonstrate our core value, so when we ran out of time we could drop the remaining, less critical features. We stopped coding before the deadline (a week or a few days before, I believe), and focused just on testing and polishing our documentation.

Dropping features was quite alright: the idea was good enough, the tutorial was compelling enough, and our VP of Marketing was skilled enough, that we were able to raise a $12 million Series A based off that unscalable, unmaintainable piece of software. And after the initial release and publicity push we had time to implement those later features to keep up our momentum.

Deadline #2: Production-ready software

Once we had VC funding we rebuilt the product with the same set of features, but a very different goal: a production-ready product. That meant we needed a good architecture, an installer, error handling, good test coverage, and so on. It took much longer, and required a much bigger team, but that was OK: we had a different goal, and enough funding to allow for a longer deadline (again, based on publicity needs).

Even so, we made sure to choose and prioritize our work based on our goal: getting users and customers for our product. Our initial prototype had involved a peer-to-peer storage transfer mechanism, and making that production ready would have been a large R&D exercise. So in the short term we focused on cloud storage, a much simpler problem.

And we made sure to drop less important features as deadlines approached. We certainly didn’t do a perfect job, e.g. we dropped one feature that was half-implemented. We would have done better not starting it at all, since it was less important. But, we succeeded: the code was maintainable, people started using it, and we didn’t have to rely on crunch time to deliver.

Beyond universal solutions

There is no universal solution that will work everywhere, no easy answers. All you can do is ask “why”: why are we doing this, and why does this matter?

Once you know your goals, you can try and prioritize—and negotiate for—a solution that achieves your goals, meets the deadline, and doesn’t involve long hours:

  1. Drop features that don’t further your goals, and start with the most important features, in case you run out of time for the rest.
  2. Write high-quality code in the places where it matters, and drop “quality” where it doesn’t (what Havoc Pennington calls “professional corner-cutting”, emphasis on professional.)
  3. And if you still have too much work, it’s time to have a discussion with your boss or customer about what tradeoffs they actually want.


There’s always too much work to do: too many features to implement, too many bugs to fix—and working evenings and weekends won’t help.

The real solution is working fewer hours. Learn how you can get a 3-day weekend.

October 24, 2018 04:00 AM

October 16, 2018

Jonathan Lange

Notes on test coverage

These are a few quick notes to self, rather than a cogent thesis. I want to get this out while it’s still fresh, and I want to lower my own mental barrier to publishing here.

I’ve been thinking about test coverage recently, inspired by conversations that followed DRMacIver’s recent post.

Here’s my current working hypothesis:

  • every new project should have 100% test coverage
  • every existing project should have a ratchet that enforces increasing coverage
  • “100% coverage” means that every line is either:
    • covered by the test suite
    • or has some markup in code saying that it is explicitly not covered, and why that’s the case
  • these should be enforced in CI
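
With coverage.py, for example, both the hard floor and the explicit opt-out markup can be expressed directly in configuration. This is a minimal sketch: `fail_under` and `exclude_lines` are real coverage.py options, and `pragma: no cover` is its conventional opt-out marker.

```ini
[report]
# Fail (exit non-zero in CI) if total coverage drops below 100%.
fail_under = 100
# Lines carrying this markup are explicitly, locally opted out.
exclude_lines =
    pragma: no cover
show_missing = True
```

A comment next to each `pragma: no cover` is the natural place to record *why* that line is uncovered.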

The justification is that “the test of all knowledge is experiment” [0]. While we should absolutely make our code easy to reason about, and prove as much as we can about it, we need to check what it does against actual reality.

Simple testing really can prevent most critical failures. It’s OK to not test some part of your code, but that should be a conscious, local, recorded decision. You have to explicitly opt out of test coverage. The tooling should create a moment where you either write a test, or you turn around and say “hold my beer”.

Switching to this for an existing project can be prohibitively expensive, though, so a ratchet is a good idea. The ratchet should be “lines of uncovered code”, and that should only be allowed to go down. Don’t ratchet on percentages, as that will let people add new lines of uncovered code.
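A ratchet like this doesn’t need much machinery. Here’s a minimal Python sketch (the baseline file name and JSON format are made up for illustration): it fails when the count of uncovered lines grows, and records the new, lower count when it shrinks.

```python
import json

RATCHET_FILE = "coverage-ratchet.json"  # hypothetical name for the stored baseline

def check_ratchet(uncovered_lines):
    """Fail if uncovered lines grew; tighten the ratchet if they shrank."""
    try:
        with open(RATCHET_FILE) as f:
            allowed = json.load(f)["uncovered_lines"]
    except FileNotFoundError:
        # First run: the current count becomes the baseline.
        allowed = uncovered_lines
    if uncovered_lines > allowed:
        print("Ratchet failed: %d uncovered lines, only %d allowed."
              % (uncovered_lines, allowed))
        return 1  # non-zero exit status fails the CI job
    # Record the (possibly lower) count so the ratchet only ever tightens.
    with open(RATCHET_FILE, "w") as f:
        json.dump({"uncovered_lines": uncovered_lines}, f)
    return 0
```

CI would run the coverage tool, count the uncovered lines, and call this check; ratcheting on an absolute line count rather than a percentage means new uncovered code always fails, exactly as argued above.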

Naturally, all of this has to be enforced in CI. No one is going to remember to run the coverage tool, and no one is going to remember to check for it during code review. Also, it’s almost always easier to get negative feedback from a robot than a human.

I tagged this post with Haskell, because I think all of this is theoretically possible to achieve on a Haskell project, but requires way too much tooling to set up.

  • hpc is great, but it is not particularly user friendly.
  • Existing code coverage SaaS services don’t support expression-level coverage.
  • hpc has mechanisms for excluding code from coverage, but not by marking up your code
  • hpc has some theoretically correct but pragmatically unfortunate defaults, e.g. it’ll report partial coverage for an otherwise guard, because it’s never run through when otherwise is False
  • There are no good ratchet tools

As a bit of an experiment, I set up a test coverage ratchet with graphql-api. I wanted both to test out my new enthusiasm for aiming for 100% coverage, and I wanted to make it easier to review PRs.

The ratchet script is some ad hoc Python, but it’s working. External contributors are actually writing tests, because the computer tells them to do so. I need to think less hard about PRs, because I can look at the tests to see what they actually do. And we are slowly improving our test coverage.

I want to build on this tooling to provide something genuinely good, but I honestly don’t have the budget for it at present. I hope to at least write a good README or user guide that illustrates what I’m aiming for. Don’t hold your breath.

[0] The Feynman Lectures on Physics, Richard Feynman

by Jonathan Lange at October 16, 2018 11:00 PM

October 10, 2018

Itamar Turner-Trauring

The next career step for Senior Software Engineers (that isn't management)

You’ve been working as a programmer for a few years, you’ve been promoted once or twice, and now you’re wondering what’s next. The path until this point was straightforward: you learned how to work on your own, and then you get promoted to Senior Software Engineer or some equivalent job title.

But now there’s no clear path ahead.

Do you become a manager and stop coding?

Do you just learn new technologies, or is that not enough?

What should you be aiming for?

In this post I’d like to present an alternative career progression, an alternative that will give you more autonomy, and more bargaining power. And unlike becoming a manager, it will still allow you to write code.

From coding to solving problems

In the end, your job as a programmer is solving problems, not writing code. Solving problems requires:

  1. Finding and identifying the problem.
  2. Coming up with a solution.
  3. Implementing the solution.

Each of these can be thought of as a skill tree: a set of related skills that can be developed separately and in parallel. In practice, however, you’ll often start in reverse order with the third skill tree, and add the others one by one as you become more experienced.

Randall Koutnik describes these as job titles of a sort, a career progression: Implementers, Solvers, and Finders.

As an Implementer, you’re an inexperienced programmer, and your tasks are defined by someone else: you just implement small, well-specified chunks of code.

Let’s imagine you work for a company building a website for animal owners. You go to work and get handed a task: “Add a drop-down menu over here listing all iguana diseases, which you can get from the IGUANA_DISEASE table. Selecting a menu item should redirect you to the appropriate page.”

You don’t know why a user is going to be listing iguana diseases, and you don’t have to spend too much time figuring out how to implement it. You just do what you’re told.

As you become more experienced, you become a Solver: you’re able to come up with solutions to less well-defined problems.

You get handed a problem: “We need to add a section to the website where pet owners can figure out if their pet is sick.” You figure out what data you have and which APIs you can use, you come up with a UI together with the designer, and then you create an implementation plan. Then you start coding.

Eventually you become a Finder: you begin identifying problems on your own and figuring out their underlying causes.

You go talk to your manager about the iguanas: almost no one owns iguanas, why are they being given equal space on the screen as cats and dogs? Not to mention that writing iguana-specific code seems like a waste of time, shouldn’t you be writing generic code that will work for all animals?

After some discussion you figure out that the website architecture, business logic, and design are going to have to be redone so that you don’t have to write new code every time a new animal is added. If you come up with the right architecture, adding a new animal will take just an hour’s work, so the company can serve many niche animal markets at low cost. Designing and implementing the solution will likely be enough work that you’re going to have to work with the whole team to do it.

The benefits of being a Finder

Many programmers end up as Solvers and don’t quite know what to do next. If management isn’t your thing, becoming a Finder is a great next step, for two reasons: autonomy and productivity.

Koutnik’s main point is that each of these three stages gives you more autonomy. As an Implementer you have very little autonomy, as a Solver you have more, and as a Finder you have lots: you’re given a pile of vague goals and constraints and it’s up to you to figure out what to do. And this can be a lot of fun.

But there’s another benefit: as you move from Implementer to Solver to Finder you become more productive, because you’re doing less unnecessary work.

  • If you’re just implementing a solution someone else specified, then you might be stuck with an inefficient solution.
  • If you’re just coming up with a solution and taking the problem statement at face value, then you might end up solving the wrong problem, when there’s another more fundamental problem that’s causing all the trouble.

The better you are at diagnosing and identifying underlying problems, coming up with solutions, and working with others, the less unnecessary work you’ll do, and the more productive you’ll be.

Leveraging your productivity

If you’re a Finder you’re vastly more productive, which makes you a far more valuable employee. You’re the person who finds the expensive problems, who identifies the roadblocks no one knew were there, who discovers what your customers really wanted.

And that means you have far more negotiating leverage.

So if you want to keep coding, and you still want to progress in your career, start looking for problems. If you pay attention, you’ll find them everywhere.




October 10, 2018 04:00 AM

October 06, 2018

Moshe Zadka

Why No Dry Run?

(Thanks to Brian for his feedback. All mistakes and omissions that remain are mine.)

Some commands have a --dry-run option, which simulates running the command but without taking effect. Sometimes the option exists for speed reasons: just pretending to do something is faster than doing it. However, more often this is because doing it can cause big, possibly detrimental, effects, and it is nice to be able to see what would happen before running the script.

For example, ansible-playbook has the --check option, which will not actually have any effect: it will just report what ansible would have done. This is useful when editing a playbook or changing the configuration.

However, this is the worst possible default. If we have already decided that our command can cause much harm, and one way to mitigate the harm is to run it in a "dry run" mode and have a human check that this makes sense, why is "cause damage" the default?

As someone in SRE/DevOps jobs, many of the utilities I run can cause great harm without care. They are built to destroy whole environments in one go, or to upgrade several services, or to clean out unneeded data. Running it against the wrong database, or against the wrong environment, can wreak all kinds of havoc: from disabling a whole team for a day to actual financial harm to the company.

For this reason, the default of every tool I write is to run in dry run mode, and when wanting to actually have effect, explicitly specify --no-dry-run. This means that my finger accidentally slipping on the enter key just causes something to appear on my screen. After I am satisfied with the command, I up-arrow and add --no-dry-run to the end.
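
With Python’s argparse this convention takes only a few lines. The tool and its targets below are hypothetical, but the shape is the point: the destructive branch runs only when --no-dry-run is passed explicitly.

```python
import argparse

def main(argv=None):
    parser = argparse.ArgumentParser(description="hypothetical cleanup tool")
    # Dry run is the default; destruction must be requested explicitly.
    parser.add_argument("--no-dry-run", action="store_true",
                        help="actually delete things (default: just report)")
    args = parser.parse_args(argv)
    for target in ["staging-db", "old-logs"]:  # made-up targets
        if args.no_dry_run:
            print("deleting " + target)
            # ... the real, destructive work would go here ...
        else:
            print("dry run: would delete " + target)
    return 0
```

Run with no arguments, it only reports—so a finger slipping on the Enter key costs nothing.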

I now do it as a matter of course, even for cases where the stakes are lower. For example, the utility that publishes this blog has a --no-dry-run that publishes the blog. When run without arguments, it renders the blog locally so I can check it for errors.

So I really have no excuses... When I write a tool for a serious production system, I always implement a --no-dry-run option, and have dry runs by default. What about you?

by Moshe Zadka at October 06, 2018 07:00 AM


September 27, 2018

Itamar Turner-Trauring

Avoiding burnout: lessons learned from a 19th century philosopher

You’re hard at work writing code: you need to ship a feature on time, or release a whole new product, and you’re pouring all your time and energy into it, your heart and your soul. And then, an uninvited and dangerous question insinuates itself into your consciousness.

If you succeed, if you ship your code, if you release your product, will you be happy? Will all your time and effort be worth it?

And you realize the answer is “no”. And suddenly your work is worthless, your goals are meaningless. You just can’t force yourself to work on something that doesn’t matter.

Why bother? Why work at all?

This is not a new experience. Almost 200 years ago, John Stuart Mill went through this crisis. And being a highly verbose 19th century philosopher, he also wrote a highly detailed explanation of how he managed to overcome what we would call depression or burnout.

And this explanation is useful not just to his 19th century peers, but to us programmers as well.

“Intellectual enjoyments above all”

At the core of Mill’s argument is the idea that rational thought, “analysis” he calls it, is corrosive: “a perpetual worm at the root both of the passions and of the virtues”. He never rejected rational thought, but he concluded that on its own it was insufficient, and potentially dangerous.

Mill’s education had, from an early age, focused him solely on rational analysis. As a young child Mill was taught by his father to understand—not just memorize—Greek, arithmetic, history, mathematics, political economy, far more than even many well-educated adults learned at the time. And since he was taught at home without outside influences, he internalized his father’s ideas prizing intellect over emotions.

In particular, Mill’s father “never varied in rating intellectual enjoyments above all others… For passionate emotions of all sorts, and for everything which has been said or written in exaltation of them, he professed the greatest contempt.” Thus Mill learned to prize rational thought and analysis over other feelings, as many programmers do—until he discovered the cost of focusing on those alone.

“The dissolving influence of analysis”

One day, things went wrong:

I was in a dull state of nerves, such as everybody is occasionally liable to; unsusceptible to enjoyment or pleasurable excitement; one of those moods when what is pleasure at other times, becomes insipid or indifferent…

In this frame of mind it occurred to me to put the question directly to myself: “Suppose that all your objects in life were realized; that all the changes in institutions and opinions which you are looking forward to, could be completely effected at this very instant: would this be a great joy and happiness to you?” And an irrepressible self-consciousness distinctly answered, “No!”

From this point on Mill suffered from depression, for months on end. And being of an analytic frame of mind, he was able to intellectually diagnose his problem.

On the one hand, rational logical thought is immensely useful in understanding the world: “it enables us mentally to separate ideas which have only casually clung together”. But this ability to analyze also has its costs, since “the habit of analysis has a tendency to wear away the feelings”. In particular, analysis “fearfully undermine all desires, and all pleasures”.

Why should this make you happy? You try to analyze it logically, and eventually conclude there is no reason it should—and now you’re no longer happy.

“Find happiness by the way”

Eventually an emotional, touching scene in a book he was reading nudged Mill out of his misery, and when he fully recovered he changed his approach to life in order to prevent a recurrence.

Mill’s first conclusion was that happiness is a side-effect, not a goal you can achieve directly, nor verify directly by rational self-interrogation. Whenever you ask yourself “can I prove that I’m happy?” the self-consciousness involved will make the answer be “no”. Instead of choosing happiness as your goal, you need to focus on some other thing you care about:

Those only are happy (I thought) who have their minds fixed on some object other than their own happiness; on the happiness of others, on the improvement of mankind, even on some art or pursuit, followed not as a means, but as itself an ideal end. Aiming thus at something else, they find happiness by the way.

It’s worth noticing that Mill is suggesting focusing on something you actually care about. If you’re spending your time working on something that is meaningless to you, you will probably have a harder time of it.

“The internal culture of the individual”

Mill’s second conclusion was that logical thought and analysis are not enough on their own. He still believed in the value of “intellectual culture”, but he also aimed to become a more balanced person by “the cultivation of the feelings”. And in particular, he learned the value of “poetry and art as instruments of human culture”.

For example, Mill discovered Wordsworth’s poetry:

These poems addressed themselves powerfully to one of the strongest of my pleasurable susceptibilities, the love of rural objects and natural scenery; to which I had been indebted not only for much of the pleasure of my life, but quite recently for relief from one of my longest relapses into depression….

What made Wordsworth’s poems a medicine for my state of mind, was that they expressed, not mere outward beauty, but states of feeling, and of thought coloured by feeling, under the excitement of beauty. They seemed to be the very culture of the feelings, which I was in quest of. In them I seemed to draw from a Source of inward joy, of sympathetic and imaginative pleasure, which could be shared in by all human beings…

Both nature and art cultivate the feelings, an additional and distinct way of being human beyond logical analysis:

The intensest feeling of the beauty of a cloud lighted by the setting sun, is no hindrance to my knowing that the cloud is vapour of water, subject to all the laws of vapours in a state of suspension…

The practice of happiness

Mill’s advice is not a universal panacea; among other flaws, it starts from a position of immense privilege. But I do think Mill hits on some important points about what it means to be human.

If you wish to put it into practice, here is Mill’s advice, insofar as I can summarize it (I encourage you to go and read his Autobiography on your own):

  1. Aim in your work not for happiness, but for a goal you care about: improving the world, or even just applying and honing a skill you value.
  2. Your work—and the rational thought it entails—will not suffice to make you happy; rational thought on its own will undermine your feelings.
  3. You should therefore also cultivate your feelings: through nature, and through art.



September 27, 2018 04:00 AM

September 26, 2018

Jp Calderone

Asynchronous Object Initialization - Patterns and Antipatterns

I caught Toshio Kuratomi's post about asyncio initialization patterns (or anti-patterns) on Planet Python. This is something I've dealt with a lot over the years using Twisted (one of the sources of inspiration for the asyncio developers).

To recap, Toshio wondered about a pattern involving asynchronous initialization of an instance. He wondered whether it was a good idea to start this work in __init__ and then explicitly wait for it in other methods of the class before performing the distinctive operations required by those other methods. Using asyncio (and using Toshio's example with some omissions for simplicity) this looks something like:


class Microblog:
    def __init__(self, ...):
        loop = asyncio.get_event_loop()
        self.init_future = loop.run_in_executor(None, self._reading_init)

    def _reading_init(self):
        # ... do some initialization work,
        # presumably expensive or otherwise long-running ...

    @asyncio.coroutine
    def sync_latest(self):
        # Don't do anything until initialization is done
        yield from self.init_future
        # ... do some work that depends on that initialization ...
It's quite possible to do something similar to this when using Twisted. It only looks a little bit different:


class Microblog:
    def __init__(self, ...):
        self.init_deferred = deferToThread(self._reading_init)

    def _reading_init(self):
        # ... do some initialization work,
        # presumably expensive or otherwise long-running ...

    @inlineCallbacks
    def sync_latest(self):
        # Don't do anything until initialization is done
        yield self.init_deferred
        # ... do some work that depends on that initialization ...

Despite the differing names, these two pieces of code basically do the same thing:

  • run _reading_init in a thread from a thread pool
  • whenever sync_latest is called, first suspend its execution until the thread running _reading_init has finished running it

Maintenance costs

One thing this pattern gives you is an incompletely initialized object. If you write m = Microblog() then m refers to an object that's not actually ready to perform all of the operations it supposedly can perform. It's either up to the implementation or the caller to make sure to wait until it is ready. Toshio suggests that each method should do this implicitly (by starting with yield self.init_deferred or the equivalent). This is definitely better than forcing each call-site of a Microblog method to explicitly wait for this event before actually calling the method.

Still, this is a maintenance burden that's going to get old quickly. If you want full test coverage, it means you now need twice as many unit tests (one for the case where the method is called before initialization is complete and another for the case where it is called after). At least. Toshio's _reading_init method actually modifies attributes of self, which means there are potentially many more than just two possible cases. Even if you're not particularly interested in having full automated test coverage (... for some reason ...), you still have to remember to add this yield statement to the beginning of all of Microblog's methods. It's not exactly a ton of work but it's one more thing to remember any time you maintain this code. And it's the kind of mistake that creates a race condition you might not immediately notice - which means you may ship the broken code to clients and get to discover the problem when they start complaining about it.

Diminished flexibility

Another thing this pattern gives you is an object that does things as soon as you create it. Have you ever had a class with a __init__ method that raised an exception as a result of a failing interaction with some other part of the system? Perhaps it did file I/O and got a permission denied error or perhaps it was a socket doing blocking I/O on a network that was clogged and unresponsive. Among other problems, these cases are often difficult to report well because you don't have an object to blame the problem on yet. The asynchronous version is perhaps even worse since a failure in this asynchronous initialization doesn't actually prevent you from getting the instance - it's just another way you can end up with an incompletely initialized object (this time, one that is never going to be completely initialized and use of which is unsafe in difficult to reason-about ways).

Another related problem is that it removes one of your options for controlling the behavior of instances of that class. It's great to be able to control everything a class does just by the values passed in to __init__ but most programmers have probably come across a case where behavior is controlled via an attribute instead. If __init__ starts an operation then instantiating code doesn't have a chance to change the values of any attributes first (except, perhaps, by resorting to setting them on the class - which has global consequences and is generally icky).

Loss of control

A third consequence of this pattern is that instances of classes which employ it are inevitably doing something. It may be that you don't always want the instance to do something. It's certainly fine for a Microblog instance to create a SQLite3 database and initialize a cache directory if the program using it actually intends to host a blog. It's most likely the case that other useful things can be done with a Microblog instance, though. Toshio's own example includes a post method which doesn't use the SQLite3 database or the cache directory. His code correctly doesn't wait for init_future at the beginning of his post method - but this should leave the reader wondering why we need to create a SQLite3 database if all we want to do is post new entries.

Using this pattern, the SQLite3 database is always created - whether we want to use it or not. There are other reasons you might want a Microblog instance that hasn't initialized a bunch of on-disk state too - one of the most common is unit testing (yes, I said "unit testing" twice in one post!). A very convenient thing for a lot of unit tests, both of Microblog itself and of code that uses Microblog, is to compare instances of the class. How do you know you got a Microblog instance that is configured to use the right cache directory or database type? You most likely want to make some comparisons against it. The ideal way to do this is to be able to instantiate a Microblog instance in your test suite and use its == implementation to compare it against an object given back by some API you've implemented. If creating a Microblog instance always goes off and creates a SQLite3 database then at the very least your test suite is going to be doing a lot of unnecessary work (making it slow) and at worst perhaps the two instances will fight with each other over the same SQLite3 database file (which they must share since they're meant to be instances representing the same state). Another way to look at this is that inextricably embedding the database connection logic into your __init__ method has taken control away from the user. Perhaps they have their own database connection setup logic. Perhaps they want to re-use connections or pass in a fake for testing. Saving a reference to that object on the instance for later use is a separate operation from creating the connection itself. They shouldn't be bound together in __init__, where you have to take them both or give up on using Microblog.

Alternatives

You might notice that these three observations I've made all sound a bit negative. You might conclude that I think this is an antipattern to be avoided. If so, feel free to give yourself a pat on the back at this point.

But if this is an antipattern, is there a pattern to use instead? I think so. I'll try to explain it.

The general idea behind the pattern I'm going to suggest comes in two parts. The first part is that your object should primarily be about representing state and your __init__ method should be about accepting that state from the outside world and storing it away on the instance being initialized for later use. It should always represent complete, internally consistent state - not partial state as asynchronous initialization implies. This means your __init__ methods should mostly look like this:


class Microblog(object):
    def __init__(self, cache_dir, database_connection):
        self.cache_dir = cache_dir
        self.database_connection = database_connection

If you think that looks boring - yes, it does. Boring is a good thing here. Anything exciting your __init__ method does is probably going to be the cause of someone's bad day sooner or later. If you think it looks tedious - yes, it does. Consider using Hynek Schlawack's excellent attrs package (full disclosure - I contributed some ideas to attrs' design and Hynek occasionally says nice things about me (I don't know if he means them, I just know he says them)).

The second part of the idea is an acknowledgement that asynchronous initialization is a reality of programming with asynchronous tools. Fortunately, __init__ isn't the only place to put code. Asynchronous factory functions are a great way to wrap up the asynchronous work that is sometimes necessary before an object can be fully and consistently initialized. Put another way:


class Microblog(object):
    # ... __init__ as above ...

    @classmethod
    @asyncio.coroutine
    def from_database(cls, cache_dir, database_path):
        # ... or make it a free function, not a classmethod, if you prefer
        loop = asyncio.get_event_loop()
        database_connection = yield from loop.run_in_executor(
            None, cls._reading_init)
        return cls(cache_dir, database_connection)

Notice that the setup work for a Microblog instance is still asynchronous but initialization of the Microblog instance is not. There is never a time when a Microblog instance is hanging around partially ready for action. There is setup work and then there is a complete, usable Microblog.
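Sketched in modern async/await syntax, with a lambda standing in for the real blocking connection setup (that stand-in and the paths are invented for illustration), calling the factory looks like this:

```python
import asyncio

class Microblog:
    def __init__(self, cache_dir, database_connection):
        self.cache_dir = cache_dir
        self.database_connection = database_connection

    @classmethod
    async def from_database(cls, cache_dir, database_path):
        loop = asyncio.get_running_loop()
        # A lambda stands in for the real blocking connection setup:
        database_connection = await loop.run_in_executor(
            None, lambda: "connection-to-" + database_path)
        return cls(cache_dir, database_connection)

async def main():
    # Setup work happens here, asynchronously...
    blog = await Microblog.from_database("/tmp/cache", "blog.sqlite3")
    # ...but the instance we get back is already completely initialized.
    return blog

blog = asyncio.run(main())
```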

This addresses the three observations I made above:

  • Methods of Microblog never need to worry about whether the instance has been completely initialized yet or not.
  • Nothing happens in Microblog.__init__. If Microblog has some methods which depend on instance attributes, any of those attributes can be set after __init__ is done and before those other methods are called. If the from_database constructor proves insufficiently flexible, it's easy to introduce a new constructor that accounts for the new requirements (named constructors mean never having to overload __init__ for different competing purposes again).
  • It's easy to treat a Microblog instance as an inert lump of state. Simply instantiating one (using Microblog(...)) has no side-effects. The special extra operations required if one wants the more convenient constructor are still available - but elsewhere, where they won't get in the way of unit tests and unplanned-for uses.
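As a sketch of that last point, a hypothetical extra named constructor (invented here, not from the post) can hand tests an inert instance without touching any on-disk state:

```python
class Microblog:
    def __init__(self, cache_dir, database_connection):
        self.cache_dir = cache_dir
        self.database_connection = database_connection

    def __eq__(self, other):
        return (self.cache_dir == other.cache_dir and
                self.database_connection == other.database_connection)

    @classmethod
    def for_testing(cls, fake_connection=None):
        # A named constructor invented for this sketch: no files, no
        # database - just an inert lump of state for the test suite.
        return cls("/nonexistent/cache", fake_connection)

assert Microblog.for_testing() == Microblog.for_testing()
```

No SQLite3 database is ever created, so two instances can be compared freely and quickly.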

I hope these points have made a strong case for the first of these approaches being an anti-pattern to avoid (in Twisted, in asyncio, or in any other asynchronous programming context) and for the second being a useful pattern that provides convenient, expressive constructors while keeping object initializers unsurprising and maximally useful.

by Jean-Paul Calderone (noreply@blogger.com) at September 26, 2018 11:39 PM

September 21, 2018

Itamar Turner-Trauring

Never use the word "User" in your code

You’re six months into a project when you realize a tiny, simple assumption you made at the start was completely wrong. And now you need to fix the problem while keeping the existing system running—with far more effort than it would’ve taken if you’d just gotten it right in the first place.

Today I’d like to tell you about one common mistake, a single word that will cause you endless trouble. I am speaking, of course, about “users”.

There are two basic problems with this word:

  1. “User” is almost never a good description of your requirements.
  2. “User” encourages a fundamental security design flaw.

The concept “user” is dangerously vague, and you will almost always be better off using more accurate terminology.

You don’t have users

To begin with, no software system actually has “users”. At first glance “user” is a fine description, but once you look a little closer you realize that your business logic actually has more complexity than that.

We’ll consider three examples, starting with an extreme case.

Airline reservation systems don’t have “users”

I once worked on the access control logic for an airline reservation system. Here’s a very partial list of the requirements:

  • Travelers can view their booking through the website if they have the PNR locator.
  • Purchasers can modify the booking through the website if they have the last 4 digits of the credit card number.
  • Travel agents can see and modify bookings made through their agency.
  • Airline check-in agents can see and modify bookings based on their role and airport, given identifying information from the traveler.

And so on and so forth. Some of the basic concepts that map to humans are “Traveler”, “Agent” (the website might also be an agent), and “Purchaser”. The concept of “user” simply wasn’t useful, and we didn’t use the word at all—in many requests, for example, we had to include credentials for both the Traveler and the Agent.
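A minimal sketch of what that can look like in code (all the names here are invented for illustration; the post doesn’t show its actual model): each request carries credentials for the distinct concepts involved, rather than for a single “user”.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TravelerCredentials:
    pnr_locator: str          # the PNR locator the traveler was given

@dataclass
class AgentCredentials:
    agency_id: str
    role: str                 # e.g. a check-in agent at a given airport

@dataclass
class ModifyBookingRequest:
    traveler: Optional[TravelerCredentials]
    agent: Optional[AgentCredentials]

request = ModifyBookingRequest(
    traveler=TravelerCredentials(pnr_locator="ABC123"),
    agent=AgentCredentials(agency_id="agency-42", role="check-in"),
)
```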

Unix doesn’t have “users”

Let’s take a look at a very different case. Unix (these days known as POSIX) has users: users can log in and run code. That seems fine, right? But let’s take a closer look.

If we actually go through all the things we call users, we have:

  • Human beings who log in via a terminal or graphical UI.
  • System services (like mail or web servers) that also run as “users”, e.g. nginx might run as the httpd user.
  • On servers, there are often administrative accounts shared by multiple humans who SSH in using this “user” (e.g. ubuntu is the default SSH account on AWS VMs running Ubuntu).
  • root, which isn’t quite the same as any of the above.

These are four fairly different concepts, but in POSIX they are all “users”. As we’ll see later on, smashing all these concepts into one vague concept called “user” can lead to many security problems.

But operationally, we don’t even have a way to say “only Alice and Bob can login to the shared admin account” within the boundaries of the POSIX user model.

SaaS providers don’t have “users”

Jeremy Green recently tweeted about the user model in Software-as-a-Service, and that is what first prompted me to write this post. His basic point is that SaaS services virtually always have:

  1. A person at an organization who is paying for the service.
  2. One or more people from that organization who actually use the service, together.

If you combine these into a single “User” at the start, you will be in a world of pain later. You can’t model teams, you can’t model payment for multiple people at once—and now you need to retrofit your system. Now, you could learn this lesson for the SaaS case, and move on with your life.

But this is just a single instance of a broader problem: the concept “User” is too vague. If you start out being suspicious of the word “User”, you are much more likely to realize you actually have at least two concepts: the Team (unit of payment and ownership) and the team Members (who actually use the service).
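A hypothetical modeling sketch (names invented for illustration) makes the distinction explicit: the Team is the unit of payment and ownership, and Members are the people who actually use the service.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Member:
    name: str
    email: str

@dataclass
class Team:
    name: str                  # the unit of payment and ownership
    billing_email: str
    members: List[Member] = field(default_factory=list)

team = Team(name="acme", billing_email="billing@acme.example")
team.members.append(Member(name="alice", email="alice@acme.example"))
team.members.append(Member(name="bob", email="bob@acme.example"))
```

A single `User` class would conflate the billing relationship with the usage relationship, which is exactly the retrofit described above.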

“Users” as a security problem

The word “users” isn’t just a problem for business logic: it also has severe security consequences. The word “user” is so vague that it conflates two fundamentally different concepts:

  • A human being.
  • Their representation within the software.

To see why this is a problem, let’s say you visit a malicious website which hosts an image that exploits a buffer overflow in your browser. The remote site now controls your browser, and starts uploading all your files to their server. Why can it do that?

Because your browser is running as your operating system “user”, which is presumed to be identical to you, a human being, a very different kind of “user”. You, the user, don’t want to upload those files. The operating system account, also the user, can upload those files, and since your browser is running under your user all its actions are presumed to be what you intended.

This is known as the Confused Deputy Problem. It’s a problem that’s much more likely to be part of your design if you’re using the word “user” to describe two fundamentally different things as being the same.

The value of up-front design

The key to being a productive programmer is getting the same work done with less effort. Using vague terms like “user” to model your software will take huge amounts of time and effort to fix later on. It may seem productive to start coding immediately, but it’s actually just the opposite.

Next time you start a new software project, spend a few hours up-front nailing down your terminology and concepts: you still won’t get it exactly right, but you’ll do a lot better. Your future self will thank you for all the wasteful workaround work you’ve prevented.



There’s always too much work to do: too many features to implement, too many bugs to fix. And working evenings and weekends won’t help.

The real solution: you need to work fewer hours.

September 21, 2018 04:00 AM

September 10, 2018

Itamar Turner-Trauring

Work/life balance and challenging work: you can have both

You want to work on cutting edge technology, you want challenging problems, you want something interesting. Problem is, you also want work/life balance: you don’t want to deal with unrealistic deadlines from management, or pulling all-nighters to fix a bug.

And the problem is that when you ask around, people tell you that you need to work long hours if you want to work on challenging problems. That’s just how it is, they say.

To which I say: bullshit.

You can work on challenging problems and still have work/life balance. In fact, you’ll do much better that way.

My apparently impossible career so far

Just as a counter-example, let me tell you how I’ve spent the past 14 years. Among other things, I’ve worked on:

  • A component of the flight search product that now powers Google Flights (flight search is hard—my team was working on the stuff on slides 44-48).
  • The prototype for what was then cutting edge container storage technology, a prototype that helped my company raise a $12 million Series A—and then we turned it into a production ready distributed system.
  • A crazy/useful Kubernetes local development tool.
  • Most recently, scientific image processing algorithms and processing pipeline.

All of these were hard problems, and interesting problems, and challenging problems, and none of them required working long hours.

Maybe those past 14 years are some sort of statistical aberration, but I rather doubt it. You can, for example, go work on some really tricky distributed systems problems over at Cockroach Labs, and have Fridays off to do whatever you want. (Not a personal endorsement: I know nothing about them other than those two points.)

Long hours have nothing to do with interesting problems

There is no inherent relationship between interesting problems and working long hours. You’re actually much more likely to solve hard problems if you’re well rested, and have plenty of time off to relax and let your brain do its thing off in the background.

The real origin of this connection is a marketing strategy for a certain subset of startups: “Yes, we’ll pay you jack shit and have you work 70 hours a week, but that’s the only way you can work on challenging problems!”

This is nonsense.

The real problem that these companies are trying to solve is “how do I get as much work out of these suckers with as little pay as possible.” It’s an incompetent, self-defeating strategy, but there are enough VCs who think exploitation is a great business model that you’re going to encounter it in at least some startups.

The reality is that working long hours is the result of bad management. Which is to say, it’s completely orthogonal to how interesting the problem is.

You can just as easily find bad management in enterprise companies working on the most pointless and mind-numbingly soul-crushing problems (and failing to implement them well). And because of that bad management you’ll be forced to work long hours, even though the problems aren’t hard.

Luckily, you can also find good management in plenty of organizations, big and small—and some of them are working on hard, challenging problems too.

Avoiding bad workplaces

So how do you avoid exploitative workplaces and find the good ones? By asking some questions up front. You shouldn’t be relying on luck to keep you away from bad jobs; I made that mistake once, but never again.

Long ago I was interviewing for a job in NYC, and I mentioned that I wanted to continue working on open source software in my spare time. Here’s how the rest of the conversation went:

Interviewer: “Well, that’s fine, but… we used to have an employee here who did some non-profit work. We could never tell if their mind was here or on their volunteering, and it didn’t really work out. So we want to make sure you’ll be really focused on your job.”

Me: “Did they do their volunteering during work hours?”

Interviewer: “Oh, no, they only did that on their own time, it was just that they left at 5 o'clock every day.”

At that point I realized that, while I was willing to exchange 40 hours a week for a salary, I was not willing to exchange my whole life. I escaped that company by accident because they were so blatant about it, but you can do better.

Finding the job you want

When you’re interviewing for a job, don’t just ask about the problems they’re working on. You should also be asking about the work environment and work/life balance.

You can do so tactfully and informatively by asking things like “What’s a typical work day like here?” or “How are deadlines determined?” (You can get a good list of questions over at Culture Queries.)

There are companies out there that do interesting work and have work/life balance: do your research, ask the right questions, and you too will be able to find them.



It’s Friday afternoon. You just can’t write another line of code—but you’re still stuck at the office...

What if every weekend could be a 3-day weekend?

September 10, 2018 04:00 AM

September 04, 2018

Itamar Turner-Trauring

Stabbing yourself with a fork() in a multiprocessing.Pool full of sharks

It’s time for another deep-dive into Python brokenness and the pain that is POSIX system programming, this time with exciting and not very convincing shark-themed metaphors! Most of what you’ll learn isn’t really Python-specific, so stick around regardless and enjoy the sharks.

Let’s set the metaphorical scene: you’re swimming in a pool full of sharks. (The sharks are a metaphor for processes.)

Next, you take a fork. (The fork is a metaphor for fork().)

You stab yourself with the fork. Stab stab stab. Blood starts seeping out, the sharks start circling, and pretty soon you find yourself—dead(locked) in the water!

In this journey through space and time you will encounter:

  • A mysterious failure wherein Python’s multiprocessing.Pool deadlocks, mysteriously.
  • The root of the mystery: fork().
  • A conundrum wherein fork() copying everything is a problem, and fork() not copying everything is also a problem.
  • Some bandaids that won’t stop the bleeding.
  • The solution that will keep your code from being eaten by sharks.

Let’s begin!

Introducing multiprocessing.Pool

Python provides a handy module that allows you to run tasks in a pool of processes, a great way to improve the parallelism of your program. (Note that none of these examples were tested on Windows; I’m focusing on the *nix platform here.)

from multiprocessing import Pool
from os import getpid

def double(i):
    print("I'm process", getpid())
    return i * 2

if __name__ == '__main__':
    with Pool() as pool:
        result = pool.map(double, [1, 2, 3, 4, 5])
        print(result)

If we run this, we get:

I'm process 4942
I'm process 4943
I'm process 4944
I'm process 4942
I'm process 4943
[2, 4, 6, 8, 10]

As you can see, the double() function ran in different processes.

Some code that ought to work, but doesn’t

Unfortunately, while the Pool class is useful, it’s also full of vicious sharks, just waiting for you to make a mistake. For example, the following perfectly reasonable code:

import logging
from threading import Thread
from queue import Queue
from logging.handlers import QueueListener, QueueHandler
from multiprocessing import Pool

def setup_logging():
    # Logs get written to a queue, and then a thread reads
    # from that queue and writes messages to a file:
    _log_queue = Queue()
    QueueListener(
        _log_queue, logging.FileHandler("out.log")).start()
    logging.getLogger().addHandler(QueueHandler(_log_queue))

    # Our parent process is running a thread that
    # logs messages:
    def write_logs():
        while True:
            logging.error("hello, I just did something")
    Thread(target=write_logs).start()

def runs_in_subprocess():
    print("About to log...")
    logging.error("hello, I did something")
    print("...logged")

if __name__ == '__main__':
    setup_logging()

    # Meanwhile, we start a process pool that writes some
    # logs. We do this in a loop to make the race condition more
    # likely to be triggered.
    while True:
        with Pool() as pool:
            pool.apply(runs_in_subprocess)

Here’s what the program does:

  1. In the parent process, log messages are routed to a queue, and a thread reads from the queue and writes those messages to a log file.
  2. Another thread writes a continuous stream of log messages.
  3. Finally, we start a process pool, and log a message in one of the child subprocesses.

If we run this program on Linux, we get the following output:

About to log...
...logged
About to log...
...logged
About to log...
<at this point the program freezes>

Why does this program freeze?

How subprocesses are started on POSIX (the standard formerly known as Unix)

To understand what’s going on you need to understand how you start subprocesses on POSIX (which is to say, Linux, BSDs, macOS, and so on).

  1. A copy of the process is created using the fork() system call.
  2. The child process replaces itself with a different program using the execve() system call (or one of its variants, e.g. execl()).

The thing is, there’s nothing preventing you from just doing fork(). For example, here we fork() and then print the current process’ process ID (PID):

from os import fork, getpid

print("I am parent process", getpid())
if fork():
    print("I am the parent process, with PID", getpid())
else:
    print("I am the child process, with PID", getpid())

When we run it:

I am parent process 3619
I am the parent process, with PID 3619
I am the child process, with PID 3620

As you can see both parent (PID 3619) and child (PID 3620) continue to run the same Python code.

Here’s where it gets interesting: fork()-only is how Python creates process pools by default.

The problem with just fork()ing

So OK, Python starts a pool of processes by just doing fork(). This seems convenient: the child process has access to a copy of everything in the parent process’ memory (though the child can’t change anything in the parent anymore). But how exactly is that causing the deadlock we saw?

The cause is two problems with continuing to run code after a fork()-without-execve():

  1. fork() copies everything in memory.
  2. But it doesn’t copy everything.

fork() copies everything in memory

When you do a fork(), it copies everything in memory. That includes any globals you’ve set in imported Python modules.

For example, your logging configuration:

import logging
from multiprocessing import Pool
from os import getpid

def runs_in_subprocess():
    logging.info(
        "I am the child, with PID {}".format(getpid()))

if __name__ == '__main__':
    logging.basicConfig(
        format='GADZOOKS %(message)s', level=logging.DEBUG)

    logging.info(
        "I am the parent, with PID {}".format(getpid()))

    with Pool() as pool:
        pool.apply(runs_in_subprocess)

When we run this program, we get:

GADZOOKS I am the parent, with PID 3884
GADZOOKS I am the child, with PID 3885

Notice how child processes in your pool inherit the parent process’ logging configuration, even if that wasn’t your intention! More broadly, anything you configure on a module level in the parent is inherited by processes in the pool, which can lead to some unexpected behavior.

But fork() doesn’t copy everything

The second problem is that fork() doesn’t actually copy everything. In particular, one thing that fork() doesn’t copy is threads. Any threads running in the parent process do not exist in the child process.

from threading import Thread, enumerate
from os import fork
from time import sleep

# Start a thread:
Thread(target=lambda: sleep(60)).start()

if fork():
    print("The parent process has {} threads".format(
        len(enumerate())))
else:
    print("The child process has {} threads".format(
        len(enumerate())))

When we run this program, we see the thread we started didn’t survive the fork():

The parent process has 2 threads
The child process has 1 threads

The mystery is solved

Here’s why that original program is deadlocking—with their powers combined, the two problems with fork()-only create a bigger, sharkier problem:

  1. Whenever the thread in the parent process writes a log message, it adds it to a Queue. That involves acquiring a lock.
  2. If the fork() happens at the wrong time, the lock is copied in an acquired state.
  3. The child process copies the parent’s logging configuration—including the queue.
  4. Whenever the child process writes a log message, it tries to write it to the queue.
  5. That means acquiring the lock, but the lock is already acquired.
  6. The child process now waits for the lock to be released.
  7. The lock will never be released, because the thread that would release it wasn’t copied over by the fork().

In simplified form:

from os import fork
from threading import Lock

# Lock is acquired in the parent process:
lock = Lock()
lock.acquire()

if fork() == 0:
    # In the child process, try to grab the lock:
    print("Acquiring lock...")
    lock.acquire()
    print("Lock acquired! (This code will never run)")

Band-aids and workarounds

There are some workarounds that could make this a little better.

For module state, the logging library could have its configuration reset when child processes are started by multiprocessing.Pool. However, this doesn’t solve the problem for all the other Python modules and libraries that set some sort of module-level global state. Every single library that does this would need to fix itself to work with multiprocessing.

For threads, locks could be set back to released state when fork() is called (Python has a ticket for this.) Unfortunately this doesn’t solve the problem with locks created by C libraries, it would only address locks created directly by Python. And it doesn’t address the fact that those locks don’t really make sense anymore in the child process, whether or not they’ve been released.

Luckily, there is a better, easier solution.

The real solution: stop plain fork()ing

In Python 3 the multiprocessing library added new ways of starting subprocesses. One of these does a fork() followed by an execve() of a completely new Python process. That solves our problem, because module state isn’t inherited by child processes: it starts from scratch.

Enabling this alternate configuration requires changing just two lines of code in your program:

from multiprocessing import get_context

def your_func():
    with get_context("spawn").Pool() as pool:
        # ... everything else is unchanged

That’s it: do that and all the problems we’ve been going over won’t affect you. (See the documentation on contexts for details.)

But this still requires you to do the work. And every Python user who trustingly follows the examples in the documentation will be left confused when their program sometimes breaks.

The current default is broken, and in an ideal world Python would document that, or better yet change it to no longer be the default.

Learning more

My explanation here is of course somewhat simplified: for example, there is state other than threads that fork() doesn’t copy. Here are some additional resources:

Stay safe, fellow programmers, and watch out for sharks and bad interactions between threads and processes! 🦈🦑

(Want more stories of software failure? I write a weekly newsletter about 20+ years of my mistakes as a programmer.)

Thanks to Terry Reedy for pointing out the need for if __name__ == '__main__'.




September 04, 2018 04:00 AM

September 03, 2018

Moshe Zadka

Managing Dependencies

(Thanks to Mark Rice for his helpful suggestions. Any mistakes or omissions that remain are my responsibility.)

Some Python projects are designed to be libraries, consumed by other projects. These are most of the things people consider "Python projects": for example, Twisted, Flask, and most other open source tools. However, things like mu are sometimes installed as an end-user artifact. More commonly, many web services are written as deployable Python applications. A good example is the issue tracking project trac.

Projects that are deployed must be deployed with their dependencies, and with the dependencies of those dependencies, and so forth. Moreover, at deployment time, a specific version must be deployed. If a project declares a dependency of flask>=1.0.1, for example, something needs to decide whether to deploy flask 1.0.1 or flask 1.0.2.

For clarity, in this text, we will refer to the declared compatibility statements in something like setup.py (e.g., flask>=1.0.1) as "intent" dependencies, since they document programmer intent. The specific dependencies that are eventually deployed will be referred as the "expressed" dependencies, since they are expressed in the actual deployed artifact (for example, a Docker image).

Usually, "intent" dependencies are defined in setup.py. This does not have to be the case, but it almost always is: since there is usually some "glue" code at the top, keeping everything together, it makes sense to treat it as a library -- albeit, one that sometimes is not uploaded to any package index.

When producing the deployed artifact, we need to decide how to generate the expressed dependencies. There are two competing forces. One is the desire to be current: using the latest version of Django means getting all the latest bug fixes, and means that applying future bug fixes will require jumping across fewer versions. The other is the desire to avoid changes: when deploying a small bug fix, changing all library versions to the newest ones might introduce a lot of change.

For this reason, most projects will check in the "artifact" (often called requirements.txt) into source control, produce actual deployed versions from that, and some procedure to update it.

A similar story can be told about the development dependencies, often defined as extra [dev] dependencies in setup.py, and resulting in a file dev-requirements.txt that is checked into source control. The pressures are a little different, and indeed, sometimes nobody bothers to check in dev-requirements.txt even when checking in requirements.txt, but the basic dynamic is similar.

The worst procedure is probably "when someone remembers to". This is not usually anyone's top priority, and most developers are busy with their regular day-to-day tasks. When an upgrade is necessary for some reason -- for example, because a bug fix is available -- this can mean a lot of disruption. Often this disruption manifests as just upgrading one library not working: it now depends on newer libraries, so the entire dependency graph has to be updated, all at once. All intermediate "deprecation warnings" that might have been there for several months have been skipped over, and developers are suddenly faced with several breaking upgrades, all at once. The size of the change only grows with time, and becomes less and less surmountable, making it less and less likely that it will be done, until it ends in a case of complete bitrot.

Sadly, however, "when someone remembers to" is the default procedure in the absence of any explicit procedure.

Some organizations, having suffered through the disadvantages of "when someone remembers to", decide to go to the other extreme: never checking in requirements.txt at all, and generating it on every artifact build. However, this causes a lot of unnecessary churn. It is impossible to fix a small bug without making sure that the code is compatible with the latest versions of all libraries.

A better way to approach the problem is to have an explicit process of recalculating the expressed dependencies from the intent dependencies. One approach is to manufacture, with some cadence, code change requests that update the requirements.txt. This means they are resolved like all code changes: review, running automated tests, and whatever other local processes are implemented.

Another is to tie the update to a calendar event. This can be anything from a strongly-encouraged "update Monday", where one of a developer's Monday-morning tasks is to generate requirements.txt updates for all the projects they are responsible for, to making it part of a time-based release process: for example, regenerating the file on a cadence that aligns with agile "sprints", as part of releasing the code changes of a particular sprint.
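Whatever the cadence, the recalculation step itself is mechanical: take the loose intent specifiers, look at what is currently available, and pin each dependency to the newest version that satisfies the intent. Here is a toy sketch of that idea (real projects would use a tool such as pip-tools; the intent and index contents below are invented):

```python
def pin(intent, index):
    """Pin each package to the newest available version satisfying a
    minimal ">=" intent. Versions are (major, minor, micro) tuples."""
    return {
        pkg: max(v for v in index[pkg] if v >= minimum)
        for pkg, minimum in intent.items()
    }

# Invented intent and package-index contents, for illustration only.
intent = {"django": (2, 0, 0), "pytz": (2018, 3, 0)}
index = {
    "django": [(1, 11, 0), (2, 0, 0), (2, 1, 3)],
    "pytz": [(2018, 3, 0), (2018, 5, 0)],
}
print(pin(intent, index))  # {'django': (2, 1, 3), 'pytz': (2018, 5, 0)}
```

Because the step is mechanical, it is easy to automate as a scheduled code-change request that then flows through the normal review and test process.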

When updating does reveal an incompatibility, it needs to be resolved. One way is to update the local code: this is certainly the best thing to do when the problem is that the library changed an API, or changed an internal implementation detail that was being used accidentally (...or intentionally). Sometimes, however, the new version has a bug in it that needs to be fixed. In that case, the intent is now to avoid that version, and it is best to express the intent exactly as that: !=<bad version>. This means that when an even newer version is released, hopefully fixing the bug, it will be used. If a new version is released without the bug fix, we add another != clause. This is painful, and intentionally so: either we need to get the bug fixed in the library, stop using the library, or fork it. Since we are falling further and further behind the latest version, we are introducing risk into our code, and the accumulating != clauses make this pain visible and encourage us to resolve it.
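The behavior of such an exclusion can be sketched as follows (all version numbers invented): the newest non-excluded version wins, so a later release that fixes the bug is picked up automatically, with no further change to the intent.

```python
def newest_allowed(available, excluded):
    """Pick the newest version that no "!=" clause rules out."""
    return max(v for v in available if v not in excluded)

available = [(2, 1, 0), (2, 1, 1), (2, 1, 2)]
excluded = {(2, 1, 2)}  # "!=2.1.2": a release with a known bug

print(newest_allowed(available, excluded))  # (2, 1, 1)

# Once 2.1.3 is released with the fix, the same intent selects it:
print(newest_allowed(available + [(2, 1, 3)], excluded))  # (2, 1, 3)
```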

The most important thing is to choose a specific process for updating the expressed dependencies, clearly document it and consistently follow it. As long as such a process is chosen, documented and followed, it is possible to avoid the bitrot issue.

by Moshe Zadka at September 03, 2018 03:00 AM

August 22, 2018

Itamar Turner-Trauring

Guest Post: How to engineer a raise

You’ve discovered you’re underpaid. Maybe you found out a new hire is making more than you. Or maybe you’ve been doing a great job at work, but your compensation hasn’t changed.

Whatever the reason, you want to get a higher salary.

Now what?

To answer that question, the following guest post by Adrienne Bolger will explain how you can negotiate a raise at your current job. As you’ll see, she’s successfully used these strategies to negotiate 20-30% raises on multiple occasions.

This article will answer some common questions, and explain some useful strategies, to help you—a software engineer—engineer a raise from your employer. I’ll cover:

  1. Researching your worth and options.
  2. Expectation setting.
  3. Strategies that I have used—and helped others use—to ask for a raise.

How much are you “worth”?

At the end of the day, an optimized salary in a more-or-less capitalist market is the highest salary you think you can get that passes the “laugh test.” If you ask for a salary or bonus, and your (theoretical) boss or HR head laughs in your face, then the number is too high.

Note that this number isn’t your laugh test number: many people, out of fear of rejection, ask for a more “modest” sounding 5% raise rather than a 25% raise. But sometimes 25% is the right increase! Your number should not be based on fear: it should be based on research.

There are several ways to calculate your “market value” to an employer. To start, take 2 or 3 of the following quizzes to calculate median/mean salaries based on your demographics:

How much could you be worth in the future?

Take the surveys a second time. However, this time, give yourself a small imaginary promotion: 2 years more experience and the next job title you want—Senior Engineer, Engineer II, Software Architect, Engineering Manager, Director, whatever it is. How far away is that yearly salary amount from the first one? A little? A lot?

This is an important number, because the pay market for software engineers is not linear. Check out this graph created by ArsTechnica from the 2017 Stack Overflow salary data.

This graph shows the economics of a very hot job market: people with relatively little experience still make a good living, because their skills are in high demand. However, the median salary for a developer between 15 and 20 years of experience is completely flat. This isn’t the best news for experienced developers who haven’t kept learning (and some languages pay more than others), but for early career professionals, this external market factor is fantastic.

With data to back you up, you can ask for a 20 to 30% raise after only a year or two on the job with a completely straight face. I did it in my own career at the 2 and 4 year marks at the same company, and received the raise both times.

Adjust expectations for your company and industry

If you’ve come to the conclusion you are underpaid because you know what your colleagues earn, then you can skip this step. Otherwise, you have a little more research to do.

Ask your company’s HR department and recruiters: when hiring in general, does your company go for fair market prices, under-market bargains, or above-market talent? Industries like finance pay better than non-profits and civil service organizations whether you are an engineer or an accountant.

The bigger the company, the more likely you are to get standard yearly pay adjustments for things like cost-of-living expenses, but a bigger company is also likely more rigid in salary bands for a specific job title. HR at your company may or may not be willing to share the exact high and low range for a job title. If they are not, Glassdoor can provide a decent estimate for similarly sized companies.

When to ask

Again, know your company. Does it have a standard financial cycle, with cost-of-living and raises allocated yearly 1-2 months after reviews are in?

If so, time your “ask” before your formal review by 3-8 weeks. That might be November if your yearly reviews are in December, or it might be January if company yearly performance reviews occur in March, after the fiscal year results from last year are in.

Why do this?

The problem with waiting until a formal review is scheduled is that it ruins plans you can’t see or aren’t privy to. Even in the best case, where you were getting a raise anyway, the manager giving your review already has a planned number in their head and their accounting software. Asking a month beforehand gives your boss time to budget your raise into a yearly plan, which is much easier than trying to fight bureaucracy out-of-cycle.

You should not ask for a raise more frequently than every 2 years. If you feel like you have to, then you probably didn’t ask for enough last time. Keep that in mind if you find yourself afraid to ask for that much.

If you are debating between asking for a raise and going job hunting because you feel undervalued, ask for the raise first. I suggest this because job searching is a huge time sink, especially if you don’t really want to change jobs.

You owe it to yourself to proactively seek happiness. If what you really want is more money and to stay at your current company, then give your employer a chance to make you happy. If you ask and are denied, then at least you’ve done all the research into compensation when you go looking.

How to ask

Ask for a raise both in writing and in person.

As email is still considered informal, this is one of those cases where an actual letter—printed out and hand-delivered to a scheduled meeting with your manager—is a good idea. The meeting gives you the chance to explain what you want a little more, but the letter is a written record of what you want that goes to HR, as well as a way to keep yourself from backing out due to nerves or stress.

I once requested a raise from a manager who (unbeknownst to me) was let go 2 weeks later. However, because my raise request was also in writing, I received the raise from my new boss with no confusion after the transfer.

The letter should be 2-3 paragraphs long and:

  • Be addressed and CC’d to your manager and HR at your company.
  • List your current length of service with the company and affirm that you like working there.
  • Detail exactly what you want: a 20% raise? A $5,000 raise? Tuition money for school? More vacation days? Do not leave it ambiguous.
  • Detail why you believe you deserve it, and back it up with available data:
    • Do you have more experience now?
    • Earned a degree?
  • Learned new skills or a new programming language?
    • Has it been 3 years since you got a review because you work at a 20 person startup?
  • The exception to the previous point is if you know you are underpaid because a coworker with the same responsibilities is paid more: it’s enough to say that in general terms. Calling a specific coworker out is unnecessary.
  • List, in 2 sentences or less, any recent accomplishments that were especially impactful. This serves 2 purposes: reminding your boss how awesome you are, but also making it easy on them to justify your (deserved) raise to the people they are accountable to at the next level up in the company.
  • End with a request for a meeting discussing the contents of the letter.
  • Be signed and dated.

The letter (and subsequent meeting) should not:

  • Imply you will leave if you don’t get what you want, even if you are planning on it. Bluffing here is a good way to get asked to leave anyway. Even if you are planning to leave if you don’t receive a raise, threats put people on the defensive.
  • Sound angry or imply you have an ungrateful or deficient manager/employer. Position yourself as asking for something a reasonable person should want to give you. Have the most gentle and peaceful individual you know read your letter to double check tone. If all else fails, try your local Buddhist monk.

The meeting

Once you ask for a raise and a meeting to talk about it, nerves may kick in. Do your homework ahead of time and come in prepared. Bring a copy of your letter and, during the meeting, re-iterate exactly what it is you want and why you deserve it.

It’s fine to be nervous, but do not attempt any weird “Hollywood caricature of a car salesman” negotiating tactics. Don’t be short-sighted; remember that you have to perform your day job with your manager once the meeting is over.

If your employer declines

If you asked for your “laugh test” number and your employer can only meet you halfway or can’t increase your compensation at all, your response should be “Why? And what can I do to change that?”

Be proactive in determining where the problem is. At a big company, if there’s a salary band, you may need a promotion before you can get the raise. If the company isn’t making enough money for raises for anyone, it may be time to discreetly look for another job anyway.

Whether or not you choose to accept a compromise or counteroffer is up to you—but make sure that you can live with your choice, at least short term, because it won’t make sense to ask again for another few months.

And that’s Adrienne’s post. I hope you found it useful: I certainly learned a lot from it.

Of course, reading this article isn’t enough. You still need to go and do the work to get the raise. So why not start today?

  1. Do your research.
  2. Pick the right moment.
  3. Go ask for that raise!


It’s Friday afternoon. You just can’t write another line of code—but you’re still stuck at the office...

What if every weekend could be a 3-day weekend?

August 22, 2018 04:00 AM

August 16, 2018

Itamar Turner-Trauring

How to say "no" to your boss, your boss's boss, and even the CEO

You’ve got plenty of work to do already, when your boss (or their boss, or the CEO) comes by and asks you to do yet another task. If you take on yet another task, you’re going to be working long hours or delivering your code late, and someone is going to be unhappy.

You don’t want to say no to your boss (let alone the CEO!). You don’t want to say yes and spend your weekend working.

What do you do? How do you keep everyone happy?

What you need is for your management to trust your judgment. If they did, you could focus on the important work, the work that really matters. And when you had to say “no”, your boss (or the CEO!) would listen and let you continue on your current task.

To get there, you don’t immediately say “no”, and don’t immediately say “yes”.

Here’s what you do instead:

  1. Start with your organizational and project goals.
  2. Listen and ask questions.
  3. Make a decision.
  4. Communicate your decision in terms of organizational and project goals.

Step 1: Start with your goals

If you want people to listen to you, you need a strong understanding of why you’re doing the work you’re doing.

  • What is your organization trying to achieve?
  • What is your project trying to achieve, and how does that connect to organizational goals?
  • How does your work connect to the project goals?

You should be able to connect your individual action to project success, and connect that to organizational success. For example, “Our goal is to increase recurring revenue, customer churn is too high and it’s decreasing revenue, so I am working on this bug because it’s making our product unusable for at least 7% of users.”

When you’re just starting out as an employee this can be difficult to do, but as you grow in experience you can and should make sure you understand this.

(Starting with your goals is useful in other ways as well, e.g. helping you stay focused).

Step 2: Listen and ask questions

Your lead developer/product manager/team mate/CEO/CTO has just stopped by your desk and given you a new task. No doubt you already have many existing tasks. How should you handle this situation?

To begin with, don’t immediately give an answer:

  • Don’t immediately say “yes”: Unless you happen to have no existing work, any new work you take on will slow down your existing work. Your existing work was chosen for a reason, and may well be more important than this new task.
  • Don’t immediately say “no”: There’s a reason you’re being asked to do this task. By immediately saying “no” you are devaluing the request, and by extension devaluing the person who asked you.

Instead of immediately agreeing or refusing to do the task, take the time to find out why the task needs to be done. Make sure you demonstrate that you actually care about the request and are seriously considering it.

That means first, listening to what they have to say.

And second, asking some questions: why does this need to be done? What is the deadline? How important is it to them?

Sometimes the CEO will come by and ask for something they don’t really care about: they only want you to do it if you have the spare time. Sometimes your summer intern will come by and point out a problem that turns out to be a critical production-impacting bug.

You won’t know unless you listen, and ask questions to find out what’s really going on.

Step 3: Decide based on your goals

Is the new task more important to project and organizational goals than your current task? You should probably switch to working on it.

Is the new task less important? You don’t want to do it.

Not sure? Ask more questions.

Still not sure? Talk to your manager about it: “Can I get back to you in a bit about this? I need to talk this over with Jane.”

Step 4: Communicate your decision

Once you’ve made a decision, you need to communicate it in a meaningful, respectful way, and in a way that reflects organizational and project goals.

If you decided to take the task on:

  1. Tell the person asking you that you’ll take it on.
  2. Explain to the people who requested your previous tasks that those tasks will be late. Make sure it’s clear why you took on a new task: “That feature is going to have to wait: it’s fairly low on the priority list, and the CEO asked me to throw together a demo for the sales meeting on Friday.”

If you decided not to take it on:

  1. Explain why you’re not going to do it, in the context of project and organizational goals. “That’s a great feature idea, and I’d love to do it, but this bug is breaking the app for 10% of our customers and so I really need to focus on getting it done.”
  2. Provide an alternative, which can include:
    • Deflection: “Why don’t you talk to the product manager about this?”
    • Queuing: “Why don’t you add it to the backlog, and we can see if we have time to do it next sprint?”
    • Promise: “I’ll do it next, as soon as I’m done with my current task.”
    • Reminder: “Can you remind me again in a couple of weeks?”
    • Different solution: “Your original proposal would take me too long, given the release-blocker backlog, but maybe if we did this other thing instead I could fit it in. It seems like it would get us 80% of the functionality in a fraction of the time–what do you say?”

Becoming a more valuable employee

Saying “no” the right way makes you more valuable, because it ensures you’re working on important tasks.

It also ensures your managers know you’re more valuable, because you’ve communicated that:

  1. You’ve carefully and respectfully considered their request.
  2. You’ve taken existing requests you’re already working on into account.
  3. You’ve made a decision not based on personal whim, but on your best understanding of what is important to your project and organization.

Best of all, saying “no” the right way means no evenings or weekends spent working on tasks that don’t really need doing.


August 16, 2018 04:00 AM

August 10, 2018

Itamar Turner-Trauring

There's always more work to do—but you still don't need to work long hours

Have you ever wished you could reduce your working hours, or even just limit yourself to 40 hours a week, but came up against all the work that just needs doing? There’s always more work to do, always more bugs, always some feature that’s important to someone—

How can you limit yourself to 40 hours a week, let alone a shorter workweek, given all this work?

The answer: by planning ahead. And planning ahead the right way.

The wrong way to plan

I was interviewing for a job at a startup, and my first interviewer was the VP of Engineering. He explained that he’d read my blog posts about the importance of work/life balance, and he just wanted to be upfront about the fact they were working 50-60 hours each week. And this wasn’t a short-term emergency: in fact, they were going to be working long hours for months.

I politely noted that I felt good prioritization and planning could often reduce the need for long hours.

The VP explained the problem: they’d planned all their tasks in detail. But then—to their surprise—an important customer asked for more features, and that blew through their schedule, which is why they needed to work long hours.

I kept my mouth shut and went through the interview process. But I didn’t take the job.

Here’s what’s wrong with this approach:

  1. Important customers asking for more features should not be a surprise. Customers ask for changes; that’s how it goes.
  2. More broadly, the original schedule was apparently created with the presumption that everything would go perfectly. In the real world nothing ever goes perfectly.
  3. When it became clear that there was too much work to do, their solution was to work longer hours, even though research suggests that longer hours do not increase output over the long term.

The better way: prioritization and padding

So how do you keep yourself from blowing through your schedule without working long hours?

  1. Prioritize your work.
  2. Leave some padding in your schedule for unexpected events.
  3. Set your deadlines shorter than they need to be.
  4. If you run out of time, drop the least important work.

1. Prioritize your work

Not all work is created equal. By starting with your goals, you can divide tasks into three buckets:

  1. Critical to your project’s success.
  2. Really nice to have—but not critical.
  3. Clearly not necessary.

Start by dropping the third category, and minimizing the second. You’ll have to say “no” sometimes, but if you don’t say “no” you’ll never get anything delivered on time.

2. Leave some padding in your schedule

You need to assume that things will go wrong and you’ll need extra time to do any given task. And you need to assume other important tasks will also become critical; you don’t know which, but this always happens. So never give your estimate as the actual delivery date: always pad it with extra time for unexpected difficulties and unexpected interruptions.

If you think a task will take a day, promise to deliver it in three days.

3. Set shorter deadlines for yourself

Your own internal deadline, the one you don’t communicate to your boss or customer, should be shorter than your estimate. If you think a task will take a day, try to finish it in less time.

Why?

  • You’ll be forced to prioritize even more.
  • With less time to waste on wrong approaches, you’ll be forced to spend more time upfront thinking about the best solution.

4. When you run out of time, drop the less important work

Inevitably things will still go wrong and you’ll find yourself running low on time. Now’s the time to drop all the nice-to-haves, and rethink whether everything you thought was critical really is (quite often, it’s not).

Long hours are the wrong solution

Whenever you feel yourself with too much work to do, go back and apply these principles: underpromise, limit your own time, prioritize ruthlessly. With practice you’ll learn how to deliver the results that really matter—without working long hours.

When you’ve reached that point, you can work a normal 40-hour workweek without worrying. Or even better, you can start thinking about negotiating a 3-day weekend.


August 10, 2018 04:00 AM