Planet Twisted

January 19, 2017

Itamar Turner-Trauring

Specialist vs. Generalist: which is better for your career?

One of the decisions you'll need to make during the course of your career as a software developer is whether you should become:

  1. A specialist, an expert in a certain subject.
  2. A generalist, able to take on a wide variety of different work.

Miquel Beltran argues that specialization is the path to choose. At the end of his essay he suggests:

Stick to one platform, framework or language and your professional career will be better on the long run.

I think he's both right and wrong. Specialization is a great career move... but I don't think being a generalist is bad for your career either. In fact, you can be both and still have a good career, because there are two distinct areas in which this question plays out.

Getting hired is not the same as getting things done

Getting hired and getting things done are two different tasks, and you need different skills to do each.

When you're trying to get hired you are trying to show why you are the best candidate. That means dealing with the company's attitude towards employees, and the reason they are hiring, and the way they approach their work. It's not about how well you'll do your job, or how good a programmer you are, or any of that: it's just about getting your foot in the door.

Once you're in, once you're an employee or a consultant, what matters is the results you deliver. It doesn't matter if you've only spent a few weeks writing iOS apps, so long as you do a good job writing an iOS app after you've been hired. And if you've spent years as an iOS developer and you fail to deliver, the fact that you specialize in iOS apps isn't going to help you.

Since getting hired and doing the work are separate tasks, that means you need to separate the decision to be a specialist or generalist into two questions: which will help you get hired, and which will make you better at actually doing your job?

Specialization is a marketing technique

If the question is how you should get hired then you are in the realm of marketing, not engineering. Specialization is a marketing technique: it's a way to demonstrate why you should be hired because you are an expert in your specialty.

Because specialization is a marketing technique, it doesn't necessarily need to map to specialization on an engineering level. Let me give some examples from my career.

In 2001 I started contributing to an open source Python project, a networking framework called Twisted. I have used this experience in a variety of ways:

  • In 2004 I got a job offer from a company that was writing Java, because I had recently added multicast support to Twisted and they wanted to use multicast for an internal project. I had a little experience writing Java, but mostly they wanted to hire me because I was a specialist in multicast.
  • I turned that job down, but later that year I got a job at ITA Software, writing networking code in C++. I didn't know any C++... but I knew stuff about networking.
  • When I left ITA I spent a couple years doing Twisted consulting. I was a Twisted specialist.
  • At my latest job I got hired in part because I knew networking protocols... but also because I had experience participating in open source projects.

While all these specializations are related, they are not identical: each job I got involved being a specialist in a different area.

It's not what you can do, it's what you emphasize

Now, you could argue that the reasons I got hired are close enough that I am indeed a specialist: in networking or distributed systems. But consider that earlier in my career I did a number of years of web development. So back in 2004 I could have applied to web development jobs, highlighted that part of my resume, and relegated my open source networking work to a sentence at the end.

You likely have many engineering and "soft" skills available to you. Instead of focusing on one particular skillset ("I am an Android developer") you can focus on some other way you are special. E.g. if you're building a consulting pipeline then maybe it's some business vertical you specialize in, to differentiate yourself from all the other Android developers.

But if you're marketing yourself on a one-off basis, which is certainly the case when you're applying for a job, you can choose a specialty that fits the occasion. Here's how my former colleague Adam Dangoor does it:

Pick one thing from what they talk about that you think is probably the least pitched-to aspect. E.g. if they’re a Python shop everyone will say that they know Python well. But you can spot that e.g. they need help with growing a team and you have experience with that. It could very well be that 10 other candidates do too, but you just say that and you’re the one candidate who can grow a team.

Specialist or Generalist?

So which should you choose, generalist or specialist?

When it comes to engineering skills, or just learning in general, my bias is towards being a generalist. When I went back to school to finish my degree I focused on the humanities and social science; I didn't take a single programming class. You may have different biases than I do.

But engineering skills are fundamentally different from how you market yourself. You can be a generalist in your engineering skills and market yourself as a specialist. In particular, when applying for jobs, you should try to be a specialist in what the company needs.

Sometimes a technical specialty is exactly what they want: you have some set of skills that are hard to find. But often there's a bit more to it than that. They might say they need "an Android expert", but what they really need is someone who can ship things fast.

They're looking for "an Android expert" because they don't want a learning curve. So if you emphasize the times you've delivered projects quickly and on schedule, you might get the job even though another candidate has a couple more years of Android experience than you do.

In short, when it comes to engineering skills I tend towards being a generalist, but that may just be my personal bias. When marketing yourself, be a specialist... but there's nothing keeping you from being a different specialist every time you apply for a new job.

January 19, 2017 05:00 AM

January 18, 2017

Jack Moffitt

Servo Talk at LCA 2017

My talk from Linux.conf.au was just posted, and you can go watch it. In it I cover some of the features of Servo that make it unique and fast, including the constellation and WebRender.

Servo Architecture: Safety & Performance by Jack Moffitt, LCA 2017, Hobart, Australia.

by Jack Moffitt (jack@metajack.im) at January 18, 2017 12:00 AM

January 12, 2017

Jonathan Lange

Announcing grafanalib

Late last year, as part of my work at Weaveworks, I published grafanalib, a Python DSL for building Grafana dashboards.

We use it a lot, and it’s made our dashboards much nicer to maintain. I’ve written a blog post about it that you can find on the Weaveworks blog.

by Jonathan Lange at January 12, 2017 12:00 AM

January 11, 2017

Itamar Turner-Trauring

Your Job is Not Your Life: staying competitive as a developer

Are you worried about keeping your programming skills up-to-date so you can stay employable? Some programmers believe that to succeed you must spend all of your time learning, practicing and improving your craft. How do you fit all that in and still have a life?

In fact, it's quite possible to limit yourself to programming during work hours and still be employable and successful. If you do it right then staying competitive, if it's even necessary, won't require giving up your life for your job.

What does it mean to be "competitive?"

Before moving on to solutions it's worth understanding the problem a little more. The idea of "competitiveness" presumes that every programmer must continually justify their employment, or they will be replaced by some other more qualified developer.

There are shrinking industries where this is the case, but at the moment at least demand for programmers is quite high. Add the fact that hiring new employees is always risky, and worrying about "competitiveness" seems unnecessary. Yes, you need to do well at your job, but I doubt most programmers are at risk of being replaced at a moment's notice.

Instead of worrying about "competitiveness" you should focus on the ability to easily find a new job. For example, there are ways you can improve your chances of finding a new job that have nothing to do with your engineering skills:

  • Living below your means will allow you to save money for a rainy day. You'll have more time to find a job if you need to, and more flexibility in what jobs you can take.
  • Keep in touch with old classmates and former colleagues; people you know are the best way to find a new job. Start a Slack channel for ex-coworkers and hang out. This can also be useful for your engineering skills, as I'll discuss later on.

Moving on to engineering skills, the idea that you need to put in long hours outside of work is based both on the need to become an expert, and on the need to keep up with changing technology. Both can be done on the job.

Becoming an expert

You've probably heard the line about expertise requiring 10,000 hours of practice. The more hours you practice the better, then, right?

In fact many of the original studies were about number of years, not number of hours (10 years in particular). And the kind of practice matters. What you need is "deliberate practice":

... deliberate practice is a highly structured activity, the explicit goal of which is to improve performance. Specific tasks are invented to overcome weaknesses, and performance is carefully monitored to provide cues for ways to improve it further. We claim that deliberate practice requires effort and is not inherently enjoyable.

Putting aside knowledge of particular technologies, the kinds of things you want to become an expert at are problem solving, debugging, reading unknown code, etc. And while you could practice them on your own time, the most realistic forms of practice will be at your job. What you need to do is utilize your work as a form of practice.

How should you practice? The key is to know your own weaknesses, and to get feedback on how you're doing so you can improve. Here are two ways to do that:

  1. Code reviews: a good code reviewer will point out holes in your design, in the ways you've tested your code, in the technology you're using. And doing code reviews will also improve your skills as you consider other people's approaches. A job at an organization with a good code review culture will be valuable to your skills and to your career.
  2. Self-critique: whenever you make a mistake, try to think about what you should have noticed, what mental model would have caught the problem, and how you could have chosen better. Notice the critique is not of the result. The key is to critique the process, so that you do better next time.

I write a weekly newsletter about my many mistakes, and while this is ostensibly for the benefit of the readers I've actually found it has helped me become a more insightful programmer. If you want to learn how to make self-critique more than just an exercise in negativity, I recommend the book The Power of Intuition by Gary Klein.

Learning technical skills

Beyond expertise you also need technical skills: programming languages, frameworks, and so on. You will never be able to keep up with all the changing technologies that are continuously being released. Instead, try the following:

  • Switching jobs: when you're looking for a new job put some weight on organizations that use newer or slightly different technologies than the ones you know. You'll gain a broader view of the tools available than what you'd get at a single company.
  • Building breadth: instead of learning many technologies in depth, focus on breadth. Most tools you'll never use, but the more tools you know of, the more you can reach for... and building breadth takes much less time.
  • Find a community: you'll never know everything. But knowing many programmers with different experiences than you means you have access to all of their knowledge. You can find online forums like Subreddits, IRC, mailing lists and so on. But if you don't feel comfortable with those you can also just hang out on Slack with former coworkers who've moved on to another job.

Your job is not your life

All of the suggestions above shouldn't require much if any time outside of your job. If you enjoy programming and want to do it for fun, by all means do so. But your job shouldn't be the only thing you spend your life on.

If you would like to learn how to get a job that doesn't overwhelm your life, join my free 6-part email course.

Join the course: Getting to a Sane Workweek

Don't let your job take over your life. Join over 720 other programmers on the journey to a saner workweek by taking this free 6-part email course. You'll learn how you can work reasonable hours and still succeed in your career as a programmer.


January 11, 2017 05:00 AM

January 06, 2017

Itamar Turner-Trauring

The fourfold path to software quality

How do you achieve software quality? How do you write software that actually works, software that isn't buggy, software that doesn't result in 4AM wake up calls when things break in production?

There are four different approaches you can take, four paths to the ultimate goal. Which path you choose to take will depend on your personality, skills and the circumstances of your work.

The path of the Yolo Programmer

The first path is that of the Yolo Programmer. As a follower of the famous slogan "You Only Live Once", the Yolo Programmer chooses not to think about software quality. Instead the Yolo Programmer enjoys the pure act of creation; writing code is a joy that would only be diminished by thoughts of maintenance or bugs.

It's easy to look down on the Yolo Programmer, to deride their approach as a foolish attitude only suitable for children. As adults we suppress our playfulness because we worry about the future. But even though the future is important, the joy of creation is still a fundamental part of being human.

When you have the opportunity, when you're creating a prototype or some other code that doesn't need to be maintained, embrace the path of the Yolo Programmer. There's no shame in pure enjoyment.

The path of the Rational Optimizer

In contrast to the Yolo Programmer, the Rational Optimizer is well aware of the costs of bugs and mistakes. Software quality is best approached by counter-balancing two measurable costs: the cost of bugs to users and the business vs. the cost of finding and fixing the bugs.

Since bugs are more expensive the later you catch them, the Rational Optimizer invests in catching bugs as early as possible. And since human effort is expensive, the Rational Optimizer loves tools: software can be written once and used many times. Tools to find bugs are thus an eminently rational way to increase software quality.

David R. MacIver's post The Economics of Software Correctness is a great summary of this approach. And he's built some really wonderful tools: your company should hire him if you need to improve your software's quality.

The path of Mastery

The path of Mastery takes a different attitude, which you can see in the title of Kate Thompson's book Zero Bugs and Program Faster (note that she sent me a free copy, so I may be biased).

Mastery is an attitude, a set of assumptions about how one should write code. It assumes that the code we create can be understood with enough effort. Or, if the code is not understandable, it can and should be simplified until we can understand it.

The path of Mastery is a fundamentally optimistic point of view: we can, if we choose, master our creations. If we can understand our code we can write quality code. We can do so by proving to ourselves that we've covered all the cases, and by learning to structure our code the right way. With the right knowledge, the right skills and the right attitude we can write code with very few bugs, perhaps even zero bugs.

To learn more about this path you should read Thompson's book; it's idiosyncratic, very personal, and full of useful advice. You'll become a better programmer by internalizing her lessons and attitude.

The path of the Software Clown

The final path is the path of the Software Clown. If Mastery is a 1980s movie training montage, the Software Clown is a tragicomedy: all software is broken, failure is inevitable, and nothing ever works right. There is always another banana peel to slip on, and that would be sad if it weren't so funny.

Since the Software Clown is always finding bugs, the Software Clown makes sure they get fixed, even when they're in someone else's software. Since software is always broken, the Software Clown plans for brokenness. For example, if bugs are inevitable then you should make sure users have an easy time reporting them.
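One way to plan for brokenness is a crash handler that tells users where to report bugs. Here's a minimal sketch for a command-line Python application (the bug tracker URL is a placeholder):

import sys
import traceback

def report_crash(exc_type, exc_value, tb):
    # Show the traceback, then point the user at the bug tracker
    # instead of leaving them with a bare stack trace.
    traceback.print_exception(exc_type, exc_value, tb)
    print("This is a bug! Please report it at https://bugs.example.com")

sys.excepthook = report_crash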

Since banana peels are everywhere, the Software Clown learns how to avoid them. You can't avoid everything, and you won't avoid everything, but you can try to avoid as many as possible.

If you'd like to avoid the many mistakes I've made as a software engineer, sign up for my Software Clown newsletter. You'll get the story of one of my mistakes in your inbox every week and how you can avoid making it.

These are the four paths you can take, but remember: there is no one true answer, no one true path. Try to learn them all, and the skills and attitudes that go along with them; you'll become a better programmer and perhaps even a better person.

Avoid my programming mistakes!

Get a weekly email with one of my many software and career mistakes, and how you can avoid it. Here's what readers are saying:

"Are you reading @itamarst's "Software Clown" newsletter? If not, you should be. There's a gem in every issue." - @glyph


January 06, 2017 05:00 AM

January 02, 2017

Itamar Turner-Trauring

When software ecosystems die

How much can you rely on the frameworks, tools and libraries you build your software on? And what can you do to reduce the inherent risk of depending on someone else's software?

Years ago I watched a whole software ecosystem die.

Not the slow decline of a programming language that is losing its users, or a no longer maintained library that has a newer, incompatible replacement. This was perma-death: game over, no resurrection, no second chances.

Here's what happened, and what you can learn from it.

The story of mTropolis

Back in the 1990s the Next Big Thing was multimedia, and in particular multimedia CD-ROMs. The market leader was Macromedia Director, a rather problematic tool.

Macromedia Director started out as an animation tool, using a sequence of frames as its organizing metaphor, which meant using it for hypermedia involved a rather bizarre idiom. Your starting screen would be frame 1 on the timeline, with a redirect to itself on exit, an infinite busy loop. Remember this started as an animation tool, so the default was to continue on to later frames automatically.

When you clicked on a button that took you to a new screen it worked by moving you to another frame, let's say frame 100. Frame 100 would have a "go to frame 100" on exit to make sure you didn't continue on to frame 101, and then 102, etc.

Then in 1995 mTropolis showed up, a newer, better competitor to Director. It was considered by many to be the superior alternative, even in its very first release. It had a much more suitable conceptual model, features that were good enough to be copied by Director, and a loyal fan base.

In 1997 mTropolis was bought by Quark, maker of the QuarkXPress desktop publishing software. A year later in 1998 Quark decided to end development of mTropolis.

mTropolis' users were very upset, of course, so they tried to buy the rights off of Quark and continue development on their own.

The purchase failed. mTropolis died.

Market leader or scrappy competitor?

The story of mTropolis made a strong impression on me as a young programmer: I worked with Director, so I was not affected, but the developers who used mTropolis were dead in the water. All the code they'd built was useless as soon as a new OS release broke mTropolis in even the smallest of ways.

This isn't a unique story, either: spend some time reading Our Incredible Journey. Startups come and go, and software ecosystems die with them.

Professor Beekums has an excellent post about switching costs in software development. He argues that given the choice between an equivalent market leader and a smaller competitor you should choose the latter, so you don't suffer from monopoly pricing.

But what do you do when they're not equivalent, or it's hard to switch? You still need to pick. I would argue that if they're not equivalent, the market leader is much safer. Macromedia was eventually bought by Adobe, and so Director is now Adobe Director. Director was the market leader in 1998, and it's still being developed and still available for purchase, almost 20 years later.

mTropolis may have been better, but mTropolis wasn't the market leader. And mTropolis is dead, and has been for a very long time.

Making the choice

So which do you go for, when you have the choice?

If you're dealing with open source software, much of the problem goes away. Even if the company sponsoring the software shuts down, access to the source code gives you a way to switch off the software gradually.

With Software-as-a-Service you're back in the realm of choosing between monopoly pricing and the chance of the software disappearing. And at least with mTropolis the developers could still use their licensed copies; an online SaaS can shut down at any time.

Personally I'd err on the side of choosing the market leader, but it's hard to give a general answer. Just remember: the proprietary software you rely on today might be gone tomorrow. Be prepared.

January 02, 2017 05:00 AM

December 23, 2016

Ralph Meijer

Changes

For me, Christmas and Jabber/XMPP go together. I started being involved with the Jabber community around the end of 2000. One of the first things that I built was a bot that recorded the availability presence of my online friends, and showed this on a Christmas tree. Every light in the tree represents one contact, and if the user is offline, the light is darkened. As we are nearing Christmas, I put the tree up on the frontpage again, as in many years before.

Over the years, the tooltips gained insight into User Moods and Tunes, first over regular Publish-Subscribe, later enhanced with the Personal Eventing Protocol. A few years later, Jingle was born, and in 2009, stpeter wrote a great specification that solidifies the relationship between Christmas and Jabber/XMPP.

Many things have changed in those 16 years. I've changed jobs quite a few times, most recently switching from the Mailgun team at Rackspace, to an exciting new job at VimpelCom as Chat Expert last April, working on Veon (more on that later). The instant messaging landscape has changed quite a bit, too. While we, unfortunately, still have a lot of different incompatible systems, a lot of progress has been made as well.

XMPP's story is far from over, and as such I am happy and honored to serve as Chair of the XMPP Standards Foundation since last month. As every year, my current focus is making another success of the XMPP Summit and our presence with the Realtime Lounge and Devroom at FOSDEM in Brussels in February. This is always the highlight of the year, with many XMPP enthusiasts, as well as our friends from the wider Realtime Communications community, showing and discussing everything they are working on, ranging from protocol discussions to WebRTC and IoT applications.

Like last year, one of the topics that really excite me is the specification known as Mediated Information eXchange (MIX). MIX takes the good parts of the Multi User Chat (MUC) protocol, which has been the basis of group chat in XMPP for quite a while, and redesigns them on top of XMPP Publish-Subscribe. Modern commercial messaging systems, for business use (e.g. Slack and HipChat), as well as for general use (e.g. WhatsApp, WeChat, Google's offerings), have tried various approaches on the ancient model of multi-party text exchange, adding multi-media and other information sources, e.g. using integrations, bots, and cards.

MIX is the community's attempt to provide a building block that goes beyond the traditional approach of a single stream of information (presence and messages) to a collection of orthogonal information streams in the same context. A room participant can select (manually or automatically by the user agent) which information streams are of interest at that time. E.g. for mobile use or with many participants, exchanging the presence information of all participants can be unneeded or even expensive (in terms of bandwidth or battery use). In MIX, presence is available as a separate stream of information that can be disabled.

Another example is Slack's integrations. You can add streams of information (Tweets, continuous integration build results, or pull requests) to any channel. However, participants have no choice but to receive the resulting messages, intermixed with discussion. The client's notification system doesn't make any distinction between the two, so you either suffer getting alerts for every build, or mute the channel and possibly miss interesting discussion. The way around it is to have separate channels for notifications and discussion, possibly muting the former.

Using MIX, however, a client can be smarter about this. It can offer the user different ways to consume these information streams. E.g. notifications on your builds could be in a side bar. Tweets can be disabled, or modeled as a ticker. And it can be different depending on which of the (concurrent) clients you are connected with. E.g. the desktop or browser-based client has more screen real-estate to show such orthogonal information streams at the same time, while a mobile client might still show the discussion and notifications interleaved.

All-in-all MIX allows for much richer, multi-modal, and more scalable interactions. Some of the other improvements over MUC include persistent participation in channels (much like IRC bouncers, but integrated), better defined multi-device use (including individual addressing), reconnection, and message archiving. I expect the discussions at the XMPP Summit to tie up the loose ends as a prelude to initial implementations.

I am sure that FOSDEM and the XMPP Summit will have many more exciting topics, so I hope to see you there. Until then, Jabber on!

by ralphm at December 23, 2016 01:28 PM

December 20, 2016

Itamar Turner-Trauring

The one technology you need to learn in 2017

If you want to become a better programmer or find a better-paying job, you might wonder if there's a particular technology you should learn in the coming year. Once you learn this technology you will become far more productive, and provide far more value to current or future employers.

There is something you can learn, but it's not a new web framework, or a programming language, or an IDE. In fact, it's not a technology at all.

It's better than that.

Much better.

Why you need to learn this

Imagine you're a contractor for an online pet-supplies store, selling supplies for cats and dogs.

One Monday the owner of the store asks you to add a new section to the website, selling supplies for aardvarks. Being a skilled designer as well as a programmer you can update the website design, then change the database, write some new code... By the end of the week you've launched the new Aardvark section.

Next Monday the owner is back. This week you're adding badgers, so you add another section to the design, update the code to support another pet, and again you're done after a week. The next week is crabs, and so on and so forth until by the middle of the year you've added zebras.

Unfortunately by this point your website has 30+ sections, and almost all of them are for pets no one wants or owns. No one can find the cat and dog supplies amid the noise, sales are plummeting, and the owner of the store can't afford to pay you any more.

Knowing a better web framework or a more suitable programming language would have let you code each week's animal section faster. But coding the wrong thing faster is just as much a waste of time.

Even if you knew every programming language and web framework under the sun you'd still be missing a key skill.

The skill you need to learn

How could this problem have been prevented? Let's get in our time machine and go back to the beginning. Cars drive backwards, raindrops rush up to the clouds, flowers refold, the sun flies from west to east alternating with darkness...

One Monday the owner of the store asks you to add a new section to the website, selling supplies for aardvarks. And you think for a moment, and ask: "Why do you want to do that? Most people don't own aardvarks."

With a bit more probing you find out the reason the store owner wants to add aardvarks. Perhaps the store wants to sell supplies for all animals, in which case you can code generic animal support once, instead of adding a new section every week. Perhaps this is a misguided search engine optimization strategy, in which case you can suggest adding a blog.
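To make "generic animal support" concrete, here's a minimal sketch in Python (the products and page structure are invented for illustration): adding an animal becomes a one-line data change instead of a week of new code.

PRODUCTS = {
    "cat": ["litter box", "scratching post"],
    "dog": ["leash", "chew toy"],
    "aardvark": ["ant farm"],  # a new animal is one line of data
}

def animal_section(animal):
    # Build the page data for one animal's section.
    return {"title": animal.title() + " Supplies",
            "products": PRODUCTS[animal]}

sections = [animal_section(animal) for animal in sorted(PRODUCTS)]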

If you ask and figure out the reason for a task, figure out the goal... that is an immensely useful skill. You can only succeed at a project if you know its goal.

Working towards goals

If you can figure out what your boss really needs, what your client's real problem is, you can actually solve their problem. Otherwise they're the ones who have to solve the problem; you're just a hired hand implementing someone else's solution.

If you can figure out what the real goal is you can make sure you're working towards it. You can avoid the irrelevant work, the nice-to-have work, the attractive but useless work: you can work just on what's needed. You won't be writing code any faster, but you'll ship sooner and code far more effectively.

Technologies are useful and you need to learn them, but in the end they're just tools. Programming languages come and go, web frameworks come and go: what matters is what you choose to build with them.

Don't learn a technology, learn more fundamental skills. Figuring out root causes and why something needs to be done, discovering the real problem, not just the stated one: these skills will serve you this year, and next year and every year after that.

If you have these core skills, you'll be a far more valuable employee. If you also improve your bargaining position and negotiating skills you will be able to find a job on your own terms: a job with work hours you want, a job you love that doesn't own your life.

Join the course: Getting to a Sane Workweek

Don't let your job take over your life. Join over 720 other programmers on the journey to a saner workweek by taking this free 6-part email course. You'll learn how you can work reasonable hours and still succeed in your career as a programmer.


If you would like to learn the skills you need to get a job that doesn't overwhelm your life, join my free 6-part email course.

December 20, 2016 05:00 AM

December 19, 2016

Glyph Lefkowitz

Sourceforge Update

When I wrote my previous post about Sourceforge, things were looking pretty grim for the site; I (rightly, I think) slammed them for some pretty atrocious security practices.

I invited the SourceForge ops team to get in touch about it, and, to their credit, they did. Even better, they didn't ask for me to take down the article, or post some excuse; they said that they knew there were problems and they were working on a long-term plan to address them.

This week I received an update from said ops, saying:

We have converted many of our mirrors over to HTTPS and are actively working on the rest + gathering new ones. The converted ones happen to be our larger mirrors and are prioritized.

We have added support for HTTPS on the project web. New projects will automatically start using it. Old projects can switch over at their convenience as some of them may need to adjust it to properly work. More info here:

https://sourceforge.net/blog/introducing-https-for-project-websites/

Coincidentally, right after I received this email, I installed a macOS update, which means I needed to go back to Sourceforge to grab an update to my boot manager. This time, I didn't have to do any weird tricks to authenticate my download: the HTTPS project page took me to an HTTPS download page, which redirected me to an HTTPS mirror. Success!

(It sounds like there might still be some non-HTTPS mirrors in rotation right now, but I haven't seen one yet in my testing; for now, keep an eye out for that, just in case.)

If you host a project on Sourceforge, please go push that big "Switch to HTTPS" button. And thanks very much to the ops team at Sourceforge for taking these problems seriously and doing the hard work of upgrading their users' security.

by Glyph at December 19, 2016 01:19 AM

December 15, 2016

Itamar Turner-Trauring

Experts, True Believers and Test-Driven Development: how expert advice becomes a religion

If you've encountered test-driven development (TDD), you may have encountered programmers who follow it with almost religious fervor. They will tell you that you must always write unit tests before you write code, no exceptions. If you don't, your code will be condemned to everlasting brokenness, tortured by evil edge cases for all eternity.

This is an example of a common problem in programming: good advice by experts that gets turned into a counter-productive religion. Test-driven development is useful and worth doing... some of the time, but not always. And the experts who came up with it in the first place will be the first to tell you that.

Let's see how expert advice turns into zealotry, and how you can prevent it from happening to you.

Expert advice becomes a religion

The problem with experts is that they suffer from Expert Blind Spot. Because experts understand the subject so well, they have trouble breaking it down into concepts and explaining it in ways that make sense to novices.

Thus the expert may say "always write unit tests before you write your code." What they actually meant is this:

  1. Write unit tests before you write your code.
  2. Unless it's code that isn't amenable to unit testing, e.g. because unit tests aren't informative for this particular kind of code.
  3. Or unless it's code that you're going to throw away immediately.
  4. And technically you can write the test after the code, and then break your code and check the test is failing. But this is annoying and a bit more error prone so as an expert I'm not going to mention that at all.

A True Believer in TDD might start arguing with these claims. But consider that even Extreme Programming, where TDD originates, discusses types of coding where unit tests are unnecessary.

In particular a "spike" in Extreme Programming is an architectural prototype, used to figure out the structure of the code you're going to write. Since it's going to be thrown away you don't need to write unit tests when you write a spike. You can see this visually in this overview of Extreme Programming; the Architectural Spike happens before Iterations, and unit tests are written as part of Iterations.

If you're certain all code is amenable to unit testing, consider these two examples: unit tests aren't very helpful in testing a cryptographically secure random number generator. And if the director of a movie has asked you to write some code to 3D render a "really awesome-looking explosion" you won't benefit much from unit tests either, unless you can write a unit test to determine awesomeness.
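To see why, here's roughly the best a unit test can do against a secure random number generator (a sketch, using Python's os.urandom as a stand-in):

import os

def test_random_bytes():
    data = os.urandom(16)
    assert isinstance(data, bytes)
    assert len(data) == 16
    # Nothing we can assert here distinguishes a secure generator
    # from a subtly biased one; that takes statistical analysis and
    # cryptographic review, not unit tests.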

So if experts know that writing unit tests before code isn't always the right answer, why do they say "always write unit tests before you write your code"? Expert blind spot: they can't imagine anyone would write unit tests when they shouldn't.

To the expert, a prototype and code requiring tests are obviously very different things with different goals. But that isn't so obvious to the novice listener.

The novice listener takes the expert at their literal word, and comes to believe that they must always write unit tests before writing code. The novice is now a True Believer. They tell everyone they know "always write tests before you write code," and they try to do so under all circumstances.

Thus good advice turns into a counter-productive religion, with unit tests often being written when they needn't or shouldn't be.

Avoiding the trap

How can you keep this from happening to you?

If you're an expert, make sure you explain the qualifications to your statements. Don't say "always do X." Instead say "you should do X, because Y, under circumstance A but not circumstance B."

If you're not an expert things get a bit harder. Consider that both the expert and the True Believer are making the same statement: "always write tests before you write code." How can you tell whether the person telling you this is an expert or a True Believer?

You need to take every piece of advice and figure out when and where it does not apply. An expert's assumptions will often be implicit in the topic they're writing about, so that may give you some hints.

If you're having a two-way conversation try to get them to qualify their statement: ask them to come up with the boundary cases, where this advice is no longer applicable. Bring up the cases you've found where their advice seemingly won't work and see what they say.

An expert, given enough prodding, will be able to come up with cases where their advice isn't applicable, and explain the reason why. But a True Believer won't back down, won't agree to any compromise: they know the The Truth, and it is absolute and unchanging.

Programming is a broad field, and what makes good software depends on your circumstances, goals and tools. There is almost never one answer that applies everywhere. Learn from the experts, but never become a True Believer.

December 15, 2016 05:00 AM

Glyph Lefkowitz

Don’t Stop Tweeting On My Account

Shortly after my previous post, my good friend David Reid not-so-subtly subtweeted me for apparently yelling at everyone using a twitter thread to be quiet and stop expressing themselves. He pointed out:

This is the truth. There are, indeed, important, substantial essays being written on Twitter, in the form of threads. If I could direct your attention to one that’s probably a better use of your time than what I have to say here, this is a great example:

Moreover, although the twitter character limit can inhibit the expression of nuance, just having a blog is not a get-out-of-jail-free card for clumsy, hot takes:

I screwed this one up. I’m sorry.


The point I was trying to primarily focus on in that post is that a twitter thread demands a lot of attention, and that publishers exploiting that aspect of the medium in order to direct more attention to themselves1 are leveraging a limited resource2 and thereby externalizing their marketing costs3. Further, this idiom was invented by4, and has extensively been used by people who don’t really need any more attention than they already have.

If you’re an activist trying to draw attention to an important cause, or a writer trying to find your voice, and social media (or twitter threads specifically) has helped you do that, I am not trying to scold you for growing an audience on - and deriving creative energy from - your platform of choice. If you’re leveraging the focus-stealing power of twitter threads to draw attention to serious social issues, maybe you deserve that attention. Maybe in the face of such issues my convenience and comfort and focus are not paramount. And for people who really don’t want that distraction, the ‘unfollow’ button is, obviously, only a click away.

That’s not to say I think that relying on social media exclusively is a good idea for activists; far from it. I think recent political events have shown that a social media platform is often a knife that will turn in your hand. So I would encourage pretty much anyone trying to cultivate an audience to consider getting an independent web presence where you can host more durable and substantive collections of your thoughts, not because I don’t want you to annoy me, but because it gives you a measure of independence, and avoids a potentially destructive monoculture of social media. Given the mechanics of the technology, this is true even if you use a big hosted service for your long-form stuff, like Medium or Blogger; it’s not just about a big company having a hold on your stuff, but about how your work is presented based on the goals of the product presenting it.

However, the exact specifics of such a recommendation are an extremely complex set of topics, and not topics that I’m confident I’ve thought all the way through. There are dozens more problems with twitter threads for following long-form discussions and unintentionally misrepresenting complex points. Maybe they’re really serious, maybe not.

As far as where the long-form stuff should go, there are very good reasons to want to self-host things, and very good reasons why self-hosting is incredibly dangerous, especially for high-profile activists and intellectuals. There are really good reasons to engage with social media platforms and really good reasons to withdraw.

This is why I didn’t want to address this sort of usage of twitter threading; I didn’t want to dive into the sociopolitical implications of the social media ecosystem. At some point, you can expect a far longer post from me about the dynamics of social media, but it is going to take a serious effort to do it justice.


A final thought before I hopefully stop social-media-ing about social media for a while:

One of the criticisms that I received during this conversation, from David as well as others who contacted me privately, is that I’m criticizing Twitter from a level of remove; implying that since I’m not fully engaged with the medium I don’t really have the right (or perhaps the expertise) to be critical of it. I object to that.

In addition to my previously stated reasons for my reduced engagement - which mostly have to do with personal productivity and creative energy - I also have serious reservations about the political structure of social media. There’s a lot that’s good about it, but I think the incentive structures around it may mean that it is, ultimately, a fundamentally corrosive and corrupting force in society. At the very least, a social media platform is a tool which can be corrosive and corrupting and therefore needs to be used thoughtfully and intentionally to minimize the harm that it can do while retaining as many of its benefits as possible.

I don’t have time to fully explore the problems that I’m alluding to now5 but at this point if I wrote something like “social media platforms are slowly destroying liberal democracy”, I’m not even sure if I’d be exaggerating.

When I explain that I have these concerns, I’m often asked the obvious follow-up: if social media is so bad why don’t I just stop using it entirely?

The problem is, social media companies effectively control access to an enormous audience, which is now difficult to reach without their intermediation. I have friends, as we all probably do, that are hard for me to contact via other channels. An individual cannot effectively boycott a communication tool, and I am not even sure yet that “stop using it” is the right way to combat its problems.

So, I’m not going to stop communicating with my friends because I have concerns about the medium they prefer, and I’m also not going to stop thinking or writing about how to articulate and address those concerns. I think I have as much a right as anyone to do that.


  1. ... even if they’re not doing it on purpose ... 

  2. the reader’s attention 

  3. interrupting the reader repeatedly to get them to pay attention rather than posting stuff as a finished work, allowing the reader to make efficient use of their attention 

  4. I’m aware that many people outside of the white male tech nerd demographic - particularly women of color and the LGBTQ community - have made extensive use of the twitter thread for discussing substantive issues. But, as far as my limited research has shown (although the difficulty of doing such research is one of the problems with Twitter), Marc Andreessen was by far the earliest pioneer of the technique and by far its most prominent advocate. I’d be happy for a correction on this point, however. 

  5. The draft in progress, which I've been working on for a month, is already one of the longest posts I’ve ever written and it’s barely half done, if that. 

by Glyph at December 15, 2016 03:02 AM

December 14, 2016

Glyph Lefkowitz

A Blowhard At Large

Update: I've written a brief follow-up to this post to clarify my feelings about other uses of the tweetstorm, or twitter thread, publishing idiom. This post is intended to be specifically critical of its usage as a self-promotional or commercial tool.

I don’t like Tweetstorms™1, or, to turn to a neologism, “manthreading”. They actively annoy me. Stop it. People who do this are almost always blowhards.

Blogs are free. Put your ideas on your blog.

As Eevee rightfully points out, however, if you’re a massive blowhard in your Tweetstorms, you’re likely a massive blowhard on your blog, too. So why care about the usage of Twitter threads vs. Medium posts vs. anything else for expressions of mediocre ideas?

Here’s the difference, and here’s why my problem with them does have something to do with the medium: if you put your dull, obvious thoughts in a blog2, it’s a single entity. I can skim the introduction and then skip it if it’s tedious, plodding, derivative nonsense.3

Tweetstorms™, as with all social media innovations, however, increase engagement. Affordances to read little bits of the storm abound. Ding. Ding. Ding. Little bits of an essay dribble in, interrupting me at suspiciously precisely calibrated 90-second intervals, reminding me that an Important Thought Leader has Something New To Say.


The conceit of a Tweetstorm™ is that they’re in this format because they’re spontaneous. The hottest of hot takes. The supposed reason that it’s valid to interrupt me at 30-second intervals to keep me up to date on tweet 84 of 216 of some irrelevant commentator’s opinion on the recent trend in chamfer widths on aluminum bezels is that they’re thinking those thoughts in real time! It’s an opportunity to engage with the conversation!

But of course, this is a pretense; transparently so, unless you imagine someone could divine the number after the slash without typing it out first.

The “storm” is scripted in advance, edited, and rehearsed like any other media release. It’s interrupting people repeatedly merely to increase their chances of clicking on it, or reading it. And no Tweetstorm author is meaningfully going to “engage” with their readers; they just want to maximize their view metrics.

Even if I cared a tremendous amount about the geopolitics of aluminum chamfer calibration, this is a terrible format to consume those thoughts in. Twitter’s UI is just atrocious for meaningful consideration of ideas. It’s great for pointers to things (like a link to this post!) but actively interferes with trying to follow a thought to its conclusion.

There’s a whole separate blog in there about just how gross pretty much all social-media UI is, and how much it goes out of its way to show you “what you might have missed”, or things that are “relevant to you” or “people you should follow”, instead of just giving you the actual content you requested from their platform. It’s dark patterns all the way down, betraying the user’s intent for those of the advertisers.


My tone here probably implies that I think everyone doing this is being cynically manipulative. That’s possibly the worst part - I don’t think they are. I think everyone involved is just being slightly thoughtless, trying to do the best that they can in their perceived role. Blowhards are blowing, social media is making you be more social and consume more media. All optimizing for our little niche in society. So unfortunately it’s up to us, as readers, to refuse to consume this disrespectful trash, and pipe up about the destructive aspects of communicating that way.

Personally I’m not much affected by this, because I follow hardly anyone4, I don’t have push enabled, and I would definitely unfollow (or possibly block) someone who managed to get retweeted at such great length into my feed. But a lot of people who are a lot worse than I am about managing the demands on their attention get sucked into the vortex that Tweetstorms™ (and related social-media communication habits) generate.

Attention is a precious resource; in many ways it is the only resource that matters for producing creative work.

But of course, there’s a delicate balance - we must use up that same resource to consume those same works. I don’t think anyone should stop talking. But they should mindfully speak in places and ways that are not abusive of their audience.

This post itself might be a waste of your time. Not everything I write is worth reading. Because I respect my readers, I want to give them the opportunity to ignore it.

And that’s why I don’t use Tweetstorms™5.


  1. ™ 

  2. Hi Ned. 

  3. Like, for example, you can do with this blog! 

  4. I subscribe to more RSS feeds than Twitter people by about an order of magnitude, and I heartily suggest you do the same. 

  5. ™ 

by Glyph at December 14, 2016 02:55 AM

December 10, 2016

Moshe Zadka

On Raising Exceptions in Python

There is a lot of Python code in the wild which does something like:

raise SomeException("Could not fraz the buzz: "
                    "{} is less than {}".format(foo, quux))

This is, in general, a bad idea. Exceptions are not program panics. While they sometimes do terminate the program, or the execution thread with a traceback, they are different in that they can be caught. The code that catches the exception will sometimes have a way to recover: for example, maybe it’s not that important for the application to fraz the buzz if foo is 0. In that case, the code would look like:

try:
    some_function()
except SomeException as e:
    if ???

Oh, right. We do not have direct access to foo. If we formatted better, using repr, at least we could tell the difference between 0 and “0”: but we still would have to start parsing the representation out of the string.

Because of this, in general, it is better to raise exceptions like this:

raise SomeException("Could not fraz the buzz: foo is too small", foo, quux)

This way exception handling has a lot of power: it can introspect foo, introspect quux and introspect the string. If the same exception class is raised for more than one reason and we want to verify which, checking string equality, while not ideal, is still better than trying to match string parts or regular expression matching.
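For example, a handler can now recover based on the structured arguments rather than by parsing a message (a minimal sketch, reusing the hypothetical names from above):

try:
    some_function()
except SomeException as e:
    reason, foo, quux = e.args
    if foo == 0:
        pass  # frazzing the buzz doesn't matter when foo is 0
    else:
        raise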

When the exception does reach the user interface, it will not look as nice. But exceptions, in general, should reach the UI only in extreme circumstances: and in those cases, having something that carries as much information as possible is useful for root cause analysis.

by moshez at December 10, 2016 04:42 AM

December 08, 2016

Itamar Turner-Trauring

Don't Get Stuck: 6 ways to get unstuck and code faster

A lot of my time as a programmer, and maybe yours as well, is spent being stuck. My day often goes like this:

  1. Write some code.
  2. Run the tests.
  3. "It failed."
  4. "Why did it fail?"
  5. "I don't know."
  6. "That makes no sense."
  7. "Seriously, what?"
  8. "That's impossible."
  9. "Lets add a print statement here."
  10. "And maybe try poking around with the debugger."
  11. "Oh! That's it!"
  12. "No wait, it isn't."
  13. "Ohhhhhhhh there we go."
  14. Run the tests.
  15. Tests pass.
  16. "Time for snacks!"

Given how much time you can end up wasting in this mode, Kaitlin Duck Sherwood points out that not getting stuck is one of the keys to programmer productivity. Imagine how much time I could save if I skipped steps 5 to 13 in the transcript above.

Not getting stuck will make you far more productive. Here are six ways to keep you from getting stuck:

Recognize when you're stuck

If you don't know you're stuck then you can't do anything about it, so the first step is having some way of measuring progress. Whenever you start a coding task you should have a time estimate in mind.

The time estimates should be short, no more than a few hours or a day, so a bigger project should be broken up into smaller tasks. The estimates don't have to be particularly accurate, they just have to be in the right range: a task you estimate at a few hours should not require days of work.

Given an estimate of 4 hours, say, you can then recognize whether you're actually stuck:

  • If it's 10 minutes in and you have no idea what to do, then that's fine, there's plenty more time.
  • If you're 2 hours in and you haven't produced anything, then it's pretty clear you're stuck and need to make some changes.

Comparing actual time spent to the initial estimate tells you if you're making progress, and working in small chunks ensures you recognize problems quickly.

Ask for help

Now that you've recognized you're stuck, the next thing to do is find a solution. The easiest thing to do is talk to a colleague.

This is helpful in multiple ways:

  • You're forced to restate the problem in a way someone else can understand. Sometimes this is sufficient to help you solve the problem, even before they get to answering you.

In fact, this is so useful that sometimes you don't need a person, and talking to a rubber duck will do. I like debugging out loud for this reason, so occasionally I use a #rubberduck Slack channel so I don't distract my coworkers.

  • Your colleague may have an idea you don't, especially if they're experienced. For example, recently I was utterly confused why Java thought that assertThat(0.0, isEqual(-0.0)) was a failing test; it claimed 0.0 wasn't the same as -0.0.

Eventually I shared my confusion, and someone noted the expression relies on Double.equals(), and then went and found the Double.equals() API documentation. And indeed, the documentation notes that new Double(0.0).equals(new Double(-0.0)) is false even though in Java 0.0 == -0.0 is true, because reasons.

Use or copy an existing solution

If you or your colleague can't find a solution on your own, you can try using someone else's solution. This can range from the copy/paste-from-StackOverflow fallback (but be careful, sometimes StackOverflow code is wrong or insecure) to copying whole designs.

For example, I built a multicast data distribution protocol. This is not a trivial thing to do, so I found a research paper and copied their design and algorithm. Designing such an algorithm would have been far beyond my abilities, but I didn't have to: someone smarter and more knowledgeable had done the hard work.

Find a workaround

Sometimes you're faced with an important bug you can't fix. Working around it may suffice, however, as you can see in this story shared by reader James (Jason) Harrison:

Several years ago, I was working many late nights on a new Wii game that was going to have gesture recognition. The first part of the game activities went as smoothly as could be expected and then we came to a new level where the player was supposed to bring the Wiimote up and then down quickly. This must have tripped on a bug in the system because this gesture could not be reliably recognized.

Replaying recorded motions demonstrated that the problem wasn’t “just” in the data from the Wiimote or in how the player made the motion but in the system. Instead of being deterministic, the system would work then not work. We looked for variables that were not being initialized, data buffers not being cleared, and all state that could possibly leak from previous inputs.

Unfortunately, all of the searching didn’t find the problem in time. So it was decided to reset the recognition system between inputs. While wasteful, it was the only thing that did fix the system and let us ship the milestone to the publisher. Left a comment in to find the problem later. Never did find it. Game was shipped with this fix.

Drop the feature

If you're working on a feature and it's taking forever, maybe it's time to consider dropping it. Can it wait until the next release? Do you actually need it?

A feature usually lands on the requirements list for a reason, it's true. But a surprising number of features are in the nice-to-have category, especially when it's taking far too long to implement them. If other approaches have failed to get you unstuck, find the person in charge and give them the choice of shipping late or dropping the feature.

Redefine the problem

Finally, another approach you can take is redefining the problem. The way you're approaching the problem may be making it needlessly complicated to solve, so revisiting the problem statement can help get you unstuck.

You can redefine the problem by relaxing some of the constraints. For example, maybe you're having a hard time finding a date picker that matches the website design. If the problem statement is "add a usable, good looking, date picker that matches our website style" then you might spend a while looking and not finding any that are quite right.

Often you can redefine the problem, though, to "find a minimal date picker so we can demo the feature and see if anyone cares." With that more limited scope you can probably use one of the options you initially rejected.

You can also redefine the problem by figuring out that the real problem lies elsewhere. Havoc Pennington has a wonderful story about the dangerous "UI team": they will feel their mandate is to build UIs. But software that doesn't have a UI and "just works" is an even better user experience, if you can manage it.

In short, to keep from getting stuck you should:

  1. Break all your work up into small chunks.
  2. Estimate each chunk in advance, and pay attention to your progress against the estimate.
  3. When you recognize you are stuck: ask for help, copy an existing solution, find a workaround, drop the feature or redefine the problem.

I've learned most of this the hard way, over the course of 20 years of being stuck while writing software. If you'd like to avoid the many mistakes I've made as a software engineer during that time, both coding and in my career, sign up for my Software Clown newsletter. You'll get the story of one of my mistakes in your inbox every week and how you can avoid making it.


December 08, 2016 05:00 AM

Moshe Zadka

Moshe’z Messaging Preferences

The assumption here is that you have my phone number. If you don’t have my phone number, and you think that’s an oversight on my part, please send me an e-mail at zadka.moshe@gmail.com and ask for it. If you don’t have my phone number because I don’t know you, I am usually pretty responsive on e-mail.

In order of preference:

by moshez at December 08, 2016 03:03 AM

December 05, 2016

Jp Calderone

Twisted Web in 60 Seconds: HTTP/2

Hello, hello. It's been a long time since the last entry in the "Twisted Web in 60 Seconds" series. If you're new to the series and you like this post, I recommend going back and reading the older posts as well.

In this entry, I'll show you how to enable HTTP/2 for your Twisted Web-based site. HTTP/2 is the latest entry in the HTTP family of protocols. It builds on work from Google and others to address the performance (and other) shortcomings of the older HTTP/1.x protocols in widespread use today.

Twisted implements HTTP/2 support by building on the general-purpose H2 Python library. In fact, all you have to do to have HTTP/2 for your Twisted Web-based site (starting in Twisted 16.3.0) is install the dependencies:

$ pip install twisted[http2]

Your TLS-based site is now available via HTTP/2! A future version of Twisted will likely extend this to non-TLS sites (which requires the Upgrade: h2c handshake) with no further effort on your part.
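
For example, a TLS site served from an endpoint like the following will negotiate HTTP/2 with supporting clients (a minimal sketch; key.pem and cert.pem are placeholder certificate files):

$ twist web --port ssl:443:privateKey=key.pem:certKey=cert.pem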

If you like this post or others in the Twisted Web in 60 Seconds series, let me know with a donation! I'll post another entry in the series when the counter hits zero. Topic suggestions welcome in the comment section.

by Jean-Paul Calderone (noreply@blogger.com) at December 05, 2016 12:00 PM

December 02, 2016

Moshe Zadka

Don’t Mock the UNIX Filesystem

When writing unit tests, it is good to call functions with “mocks” or “fakes” — objects with an equivalent interface but a simple, “fake” implementation. For example, instead of a real socket object, use something that has recv() but returns “hello” the first time, and an empty string the second time. This is great! Instead of testing the vagaries of the other side of a socket connection, you can focus on testing your code — and force your code to handle corner cases, like recv() returning partial messages, that happen rarely on the same host (but not so rarely in more complex network environments).

There is one OS interface which it is wise not to mock — the venerable UNIX file system. Mocking the file system is the classic case of low-ROI effort:

  • It is easy to isolate: if functions get a parameter of “which directory to work inside”, tests can use a per-suite temporary directory (see the sketch below). Directories are cheap to create and destroy.
  • It is reliable: the file system rarely fails — and if it does, your code is likely to get weird crashes anyway.
  • The surface area is enormous: open(), but also os.open, os.mkdir, os.rename, os.mknod, shutil.copytree and others, plus modules calling out to C functions which call out to C’s fopen().

The first two items decrease the Return, since mocking the file system does not make the tests easier to write or the test run more reproducible, while the last one increases the Investment.
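
Here is a minimal sketch of the temporary-directory approach (save_report is a hypothetical function under test):

import os
import tempfile
import unittest

def save_report(directory, name, data):
    # Hypothetical function under test: it takes "which directory to
    # work inside" as a parameter, so tests can point it anywhere.
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write(data)
    return path

class SaveReportTests(unittest.TestCase):
    def test_save_report(self):
        # A real, throwaway directory instead of a mocked file system.
        with tempfile.TemporaryDirectory() as tmpdir:
            path = save_report(tmpdir, "report.txt", "hello")
            with open(path) as f:
                self.assertEqual(f.read(), "hello")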

Do not mock the file system, or it will mock you back.

by moshez at December 02, 2016 05:34 AM

November 30, 2016

Itamar Turner-Trauring

The Not-So-Passionate Programmer: finding a job when you're just a normal person

When reading programming job postings you'll find many companies that want to hire "passionate programmers". If you're just a normal programmer looking for a normal job this can be pretty discouraging.

What if you're not passionate? What if you don't work on side projects, or code in your spare time?

What if you have a sneaking suspicion that "passionate" is a euphemism for "we want you to work large amounts of unpaid overtime"? Can you really find a job where you don't have to be passionate, where you can just do your job and go home?

The truth is that many companies will hire you even if you don't have "passion". Not to mention that "passion" has very little impact on whether you do your job well.

But since companies do ask for "passion" in job postings and sometimes look for it during interviews, here's what you can do about your lack of "passion" when searching for a job.

Searching for a job

The first thing to do is not worry about it too much. Consider some real job postings for passionate programmers:

  • "[Our company] is looking for Java Engineer who is passionate about solving real world business problems to join our team."
  • "We're looking for a senior developer to play a major role in a team of smart, passionate and driven people."
  • "This role is ideal for somebody who is passionate about building great online apps."

They all say "passionate", yes. But these are all posts from very different kinds of companies, with different customers, different technology stacks, and very different cultures (and they're in two different countries). Whoever wrote the job posting at each company probably didn't think very hard about their choice of words, and if pressed each would probably explain "passionate" differently.

It might be a euphemism for working long hours, but it might also just mean they want to hire competent engineers. If the job looks good otherwise, don't think about it too hard: apply and see how it goes.

Interviewing for a job

Eventually you'll get a job interview at a company that wants "passionate" programmers. A job interview has two aspects: the company is interviewing you, and you are interviewing the company.

When the company is interviewing you they want to find out if you're going to do your job. You need to make a good impression... even if insurance premiums, or content management systems, or internal training or whatever the company does won't be putting beating cartoon hearts in your eyes.

  • First, that means you need to take an interest in the company. Before your interview do some research about the company, and then ask questions about the product during the interview.
  • Second, since you can't muster that crazy love for insurance premiums, focus on what you can provide: emphasize your professional pride in your work, your willingness to get things done and do them right.

At the same time that you're trying to sell yourself to the company you should also be trying to figure out if you want to work for them. Among other things, you want to figure out if the word "passionate" is just a codeword for unpaid overtime.

Ask what a typical workday is like, and what a typical workweek is like. Ask how they do project planning, and how they ensure code ships on time.

Finally, you will sometimes discover that the employees who work at the company are passionate about what they do. If this rubs you the wrong way, you might want to find a different company to work for.

If you're OK with it you'll want to make sure you'll be able to fit in. So try to figure out if they're open to other ways of thinking: how they handle conflicts, how they handle diversity of opinion.

On the job

Eventually you will have a job. Most likely you'll just have a normal job, with normal co-workers who are just doing their job too.

But sometimes you will end up somewhere where everyone else is passionate and you are not. So long as your coworkers and management value a diversity of opinion, your lack of passion can actually be a positive.

For example, startups are often full of passion for what they're building. Most startups fail, of course, and so every startup has a story about why they are different, why they won't fail. Given the odds that story will usually turn out to be wrong, but passionate employees will keep on believing, or at least won't be willing to contradict the story in public.

As someone who isn't passionate you can provide the necessary sanity checks: "Sure, it's a great product... but we're not getting customers fast enough. Maybe we should figure out what we can change?"

Similarly, passionate programmers often love playing with new technology. But technology can be a distraction, and writing code is often the wrong solution. As someone who isn't passionate you can ensure the company's goals are actually being met... even if that means using existing, boring software instead of writing something new and exciting.

There's nothing wrong with wanting to go home at the end of the day and stop thinking about work. There are many successful software developers who don't work crazy hours and who don't spend their spare time coding.


If you would like a job that doesn't overwhelm your life, join my free 6-part email course to learn how you can get to a sane workweek.

November 30, 2016 05:00 AM

November 25, 2016

Twisted Matrix Laboratories

Twisted 16.6.0 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 16.6!

The highlights of this release are:
  • The ability to use "python -m twisted" to call the new twist runner (see the example after this list),
  • More reliable tests, from a more reliable implementation of some internals, like IOCP,
  • Fixes for async/await and twisted.internet.defer.ensureDeferred, meaning it's getting closer to prime time,
  • ECDSA support in Conch and ckeygen (which has also been ported to Python 3),
  • Python 3 support for Words' IRC support and twisted.protocols.sip, among some smaller modules,
  • Some HTTP/2 server optimisations,
  • and a few bugfixes to boot!
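
For example, the new runner can be invoked straight through the interpreter; this minimal sketch serves Twisted Web's default page (the port argument is illustrative):

$ python -m twisted web --port tcp:8080
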
For more information, check the NEWS file (link provided below).

You can find the downloads on PyPI (or alternatively our website). The NEWS file is also available on GitHub.

Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

Twisted Regards,
Amber Brown (HawkOwl)

by Amber Brown (noreply@blogger.com) at November 25, 2016 08:06 PM

November 21, 2016

Jack Moffitt

Servo Interview on The Changelog

The Changelog has just published an episode about Servo. It covers the motivations and goals of the project, some aspects of Servo performance and use of the Rust language, and even has a bit about our wonderful community. If you're curious about why Servo exists, how we plan to ship it to real users, or what it was like to use Rust before it was stable, I recommend giving it a listen.

by Jack Moffitt (jack@metajack.im) at November 21, 2016 12:00 AM

November 19, 2016

Moshe Zadka

Belt & Suspenders: Why put a .pex file inside a Docker container?

Recently I have been talking about deploying Python, and some people had the reasonable question: if a .pex file is used for isolating dependencies, and a Docker container is used for isolating dependencies, why use both? Isn’t it redundant?

Why use containers?

I really like glyph’s explanation for containers: they isolate not just the filesystem stack but the processes and the network, giving a lot of the power that UNIX was supposed to give but missed out on. Containers isolate the file system, making it easier for code to write/read files from known locations. For example, its log files will be carefully segregated, and can be moved to arbitrary places by the operator without touching the code.

The other part is that none of the reasonable options packages a Python interpreter, which means a pex file would still have to be tested with multiple Pythons, and perhaps do some checking at start-up that it is using the right interpreter. If PyPy is the right choice, it is the choice the operator would have to make and implement.

Why use pex?

Containers are an easy sell. They are right on the hype train. But if we use containers, what use is pex?

In order to explain, it is worthwhile comparing a correctly built runtime container that is not using pex with one that is (parts that are not relevant have been removed). Without pex:

ADD wheelhouse /wheelhouse
RUN . /appenv/bin/activate; \
    pip install --no-index -f wheelhouse DeployMe

With pex:

COPY twist.pex /

Note that in the first option, we are left with extra gunk in the /wheelhouse directory. Note also that we still have to have pip and virtualenv installed in the runtime container. Pex files bring the double-dutch philosophy to its logical conclusion: do even more of the build on the builder side, do even less of it on the runtime side.

by moshez at November 19, 2016 05:11 AM

November 18, 2016

Itamar Turner-Trauring

How I stopped the RSI pain that almost destroyed my programming career

If it hurts to type you'll have a much harder time working as a programmer. Yes, there's voice recognition, but it's just not the same. So when my wrist and arm pain returned soon after starting a new job I was starting to get a little scared.

The last two times this happened I'd had to take months and then years off from programming before the pain went away. Was my career as a programmer going to take another hit?

And then, while biking to work one day, I realized what was going on. I came up with a way to test my theory, tried it out... and the pain went away. It's quite possible the same solution would have worked all those years ago, too: instead of unhappily working as a product manager for a few years I could have been programming.

But before I tell you what I figured out, here's what I tried first.

Failed solution #1: better hardware, better ergonomics, more breaks

When I first got wrist pain bad enough that I couldn't type I started by getting a better keyboard, the Kinesis Advantage. It's expensive, built like a tank and amazingly well designed: because Ctrl, Alt, Space and Enter are in the thumb area, you don't end up stretching your hands as much.

As an Emacs user this is important; I basically can't use regular keyboards for anything more than a few minutes these days. I own multiple Kinesis keyboards and would be very sad without them. They've definitely solved one particular kind of pain I used to have due to overstretching.

I reconfigured my desk setup to be more ergonomic (these days I do this via a standing desk). And I also started taking typing breaks: half a minute every few minutes, 10 minutes once an hour. That might have helped, or not.

The pain came and went, and eventually it came and stayed.

Failed solution #2: doctor's orders

I went to see a doctor, and she suggested it was some sort of "-itis", a fancy Latin word saying I was in pain and she wasn't quite sure why. She prescribed a non-steroidal anti-inflammatory (high doses of ibuprofen will do the trick) and occupational therapy.

That didn't help either, though the meds dulled the pain when I took them.

Failed solution #3: alternative physical therapy

Next I tried massage, Yoga, Alexander Technique, and Rolfing. I learned that my muscles were tight and sore, and ended up understanding some ways I could improve my posture. A couple of times during the Alexander Technique classes my whole back suddenly relaxed, an amazing experience: I was obviously doing something wrong with my muscles.

What I learned was useful. My hands are often cold, and all those classes helped me eventually discover that if I relaxed my muscles the right way my hands would warm up. Tense muscles were hurting my circulation.

At the time, however, none of it helped.

Giving up

After six months at home not typing I was no better: I was still in pain.

So I went back to work and got a new role, as a Product Analyst, where I needed to type less and could use voice recognition for dictation. I did this for 2 or 3 years, but I was not happy: I missed programming.

Working part time

At some point during this period I read one of Dr. Sarno's books. His theory is that long periods of pain are not due to actual injury, but rather an emotional problem causing e.g. muscles to tense up or reduced blood flow. There are quite a few people who have had their pain go away by reading one of his books and doing some mental exercises.

I decided to give it a try: release emotional stress, go back to programming, and not worry about pain anymore. Since I wasn't sure I could work full time I took on consulting, and later a part time programming job.

It worked! I was able to type again, with no pain for four years.

The pain comes back

Earlier this year I started another job, with more hours but still slightly less than full time. And then the pain returned.

Why was I in pain again? I wasn't working that many more hours, I was still using a standing desk as I had for the past four years. What was going on?

An epiphany: environmental causes

Biking to work one day the epiphany hit me: Dr. Sarno's theory was that suppressed emotional stress caused the pain by tensing muscles or reducing blood flow. And that seemed to be the case for me at least. But emotional stress wasn't the only way I could end up with tense muscles or reduced blood flow.

The new office I was working in was crazy cold, and a couple of weeks earlier I'd moved my desk right under the air conditioning vent. Cold would definitely reduce blood flow. For that matter, caffeine shrinks blood vessels. And during the four years I'd worked part time and pain free I'd been working in a hackerspace with basically no air conditioning.

I started wearing a sweatshirt and hand warmers at work, and I avoided caffeine on the days I went to the office. The pain went away, and so far hasn't come back.

I spent three years unable to work as a programmer, and there's a good chance I could have solved the problem just by wearing warmer clothing.

Lessons learned

If you're suffering from wrist or arm pain:

  1. Start by putting a sweatshirt on: getting warmer may be all you need to solve the problem.
  2. If Emacs key combos are bad for your wrist, consider vi, god-mode, Spacemacs... or the expensive option, a Kinesis Advantage keyboard.
  3. Next, consider improving your posture (standing desks are good for that).
  4. Finally, if you're still in pain after a month or two go read Dr. Sarno's book. (Update: After posting this blog I got a number of emails from people saying "I read that book and my pain quickly went away.")

This may not work for everyone, but I do believe most so-called repetitive strain injury is not actually an injury. If you're in pain, don't give up: you will be able to get back to typing.

By the way, taking so long to figure out why my arms were hurting isn't the only thing I've gotten wrong during my career. So if you want to become a better software engineer, learn how you can avoid my many mistakes as a programmer.

November 18, 2016 05:00 AM

November 16, 2016

Itamar Turner-Trauring

Debugging Software, Sherlock Holmes-style

How many times have you seen software exhibiting completely impossible results? In theory software is completely deterministic, but in practice it often seems capriciously demonic. But all is not lost: the detection methods of Sherlock Holmes can help you discover the hidden order beneath the chaos.

Sherlock Holmes famously stated that "once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth." And what is true for (fictional) crimes is also true for software. The basic process you follow to find a problem is:

  1. Come up with a list of potential causes.
  2. Test each potential cause in isolation, ruling them out one by one.
  3. Whatever cause you can't rule out is the likely cause, even if it seems improbable.

To see how this might work in practice, here's a bug my colleague Phil and I encountered over at my day job, where we're building a microservices architecture.

The Case of the Missing Stats

I was working on a client library, Phil was working on the server. Phil was testing out a new feature where the client would send messages to the server, containing certain statistics. When he ran the client the server did get messages, but the messages only ever had empty lists of stats.

Someone had kidnapped the stats, and we had to find them.

Phil was using the following components, each of which was a potential suspect:

  1. A local server with in-development code.
  2. Python 3.4.
  3. The latest version of the client.
  4. The latest version of the test script.

Eliminating the impossible

Our next step was to isolate each possible cause and falsify it.

Theory #1: the client was broken

The client code had never been used with a real server; perhaps it was buggy? I checked to see if there were unit tests, and there were some that checked for the existence of stats. Maybe the unit tests were broken, though.

We ran the client with Python 3.5 on my computer using the same test script Phil had used and recorded traffic to the production server. Python 3.5 and 3.4 are similar enough that it seemed OK to change that variable at the same time.

The messages sent to the server did include the expected stats. The client was apparently not the problem, nor was the test script.

Theory #2: Python version

We tried Python 2.7, just for kicks; stats were still there.

Theory #3: Phil's computer

Maybe Phil's computer was cursed? Phil gave me an SSH login to his computer, where I set up a new environment and ran the client against the production server using the Python 3.4 installation on his machine.

Once again we saw stats being sent.

Theory #4: the server was broken

You may have noticed that so far we were testing against the production server, and Phil had been testing against his in-development server. The server seemed an unlikely cause, however: the client unilaterally sent messages to the server, so the server version shouldn't have mattered.

However, having eliminated all other causes, that was the next thing to check. We ran the client against Phil's in-development server... and suddenly the stats were missing from the client transmission logs.

We had found the kidnapper. Now we needed to figure out how the crime had been committed.

Recreating the crime

So far we'd assumed that when the client talked to the dev server the messages did not include stats. Now that we could reproduce the problem we noticed that it wasn't that the messages didn't include stats; rather, we were sending fewer messages.

Messages with stats were failing to be sent. A quick check of the logs indicated an encoding error: we were failing to encode messages that had stats, so they were never sent. (We should have checked the logs much much earlier in the process, as it turns out.)

Reading the code suggested the problem: the in-development server was feeding the client bogus data earlier on. When the client tried to send a message to the server that included stats it needed to use some of that bogus data; encoding failed and the message got dropped. If the client sent a message to the server with an empty list of stats the bogus data was not needed, so encoding and sending succeeded.

The server turned out to be the culprit after all, even though it seemed to be the most improbable cause at the outset. Or at least, the first order culprit; a root-cause analysis suggested that some problems in our protocol design were the real cause.

You too can be a scientific software detective

Our debugging process could have been better: we didn't really check only one change at a time, and we neglected the obvious step of checking the logs. But the basic process worked:

  1. Isolate a possible cause.
  2. Falsify it, demonstrating it can't be the real cause.
  3. Repeat until only one cause is left.

Got an impossible bug? Put on your imaginary detective hat, stick an imaginary detective pipe in your mouth, and catch that culprit.

November 16, 2016 05:00 AM

November 13, 2016

Moshe Zadka

Deploying with Twisted: Saying Hello

Too Long: Didn’t Read

The build script builds a Docker container, moshez/sayhello:MY_VERSION.

$ ./build MY_VERSION
$ docker run --rm -it --publish 8080:8080 \
  moshez/sayhello:MY_VERSION --port 8080

There will be a simple application running on port 8080.

If you own the domain name hello.example.com, you can point it at a machine you control and then run on that machine:

$ docker run --rm -it --publish 443:443 \
  moshez/sayhello:MY_VERSION --port le:/srv/www/certs:tcp:443 \
  --empty-file /srv/www/certs/hello.example.com.pem

It will result in the same application running on a secure web site: https://hello.example.com.


All source code is available on GitHub.

Introduction

WSGI has been a successful standard. Very successful. It allows people to write Python applications using many frameworks (Django, Pyramid, Flask and Bottle, to name but a few) and deploy using many different servers (uwsgi, gunicorn and Apache).

Twisted makes a good WSGI container. Like Gunicorn, it is pure Python, simplifying deployment. Like Apache, it sports a production-grade web server that does not need a front end.

Modern web applications tend to be complex beasts. In order to be trusted by users, they need to have TLS support, signed by a trusted CA. They also need to transmit a lot of static resources — images, CSS and JavaScript files, even if all HTML is dynamically generated. Deploying them often requires complicated set-ups.

Containers

Container images allow us to package an application with all of its dependencies. They often create a temptation to use them as the configuration management layer. However, Dockerfile is a challenging language in which to write big parts of the application. People writing WSGI applications probably think Python is a good programming language. The more of the application logic that is in Python, the easier it is for a WSGI-based team to master it.

PEX

Pex is a way to package several Python “distributions” (sometimes informally called “Packages”, the things that are hosted by PyPI) into one file, optionally with an entry-point so that running the file will call a pre-defined function. It can take an explicit list of wheels but can also, as in our example here, take arguments compatible with the ones pip takes. The best practice is to give it a list of wheels, and build the wheels with pip wheel.
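
For instance, building a pex for the sayhello application described below might look something like this (a rough sketch; the entry point is hypothetical, and exact flags vary between pex versions):

$ pip wheel --wheel-dir wheelhouse .
$ pex -f wheelhouse sayhello -e sayhello.wsgi:main -o sayhello.pex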

pkg_resources

The pkg_resources module allows access to files packaged in a distribution in a way that is agnostic to how the distribution was deployed. Specifically, it is possible to install a distribution as a zipped directory, instead of unpacking it into site-packages. The pex format relies on this feature of Python, so consistently using pkg_resources to access data files is important in order not to break pex compatibility.
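
For example, the packaged HTML file might be loaded like this (a sketch; the resource path follows the sayhello layout described later in this post):

import pkg_resources

# Read a data file packaged inside the distribution; this works whether the
# distribution is unpacked into site-packages or kept zipped inside a pex.
index_html = pkg_resources.resource_string("sayhello", "data/index.html")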

Let’s Encrypt

Let’s Encrypt is a free, automated, and open Certificate Authority. It invented the ACME protocol in order to make getting secure certificates a simple operation. txacme is an implementation of an ACME client, i.e., something that asks for certificates, for Twisted applications. It uses the server endpoint plugin mechanism in order to allow any application that builds a listening endpoint to support ACME.

Twist

The twist command-line tool allows running any Twisted service plugin. Service plugins allow us to configure a service using Python, a pretty nifty language, while still allowing specific customizations at the point of use via command line parameters.

Putting it all together

Our setup.py file defines a distribution called sayhello. In it, we have three parts:

  • src/sayhello/wsgi.py: A simple Flask-based WSGI application
  • src/sayhello/data/index.html: an HTML file meant to serve as the root
  • src/twisted/plugins/sayhello.py: A Twist plugin

There is also some build infrastructure:

  • build is a Python script to run the build.
  • build.docker is a Dockerfile designed to build pex files, but not run as a production server.
  • run.docker is a Dockerfile designed for the production container.

Note that build does not push the resulting container to DockerHub.

Credits

Glyph Lefkowitz has inspired me with his blog posts about how to build efficient containers. He has also spoken about how deploying applications should be no more than one file copy.

Tristan Seligmann has written txacme.

Amber “Hawkowl” Brown has written “twist”, which is much better at running Twisted-based services than the older “twistd”.

Of course, all mistakes and problems here are completely my responsibility.

by moshez at November 13, 2016 03:38 PM

November 12, 2016

Glyph Lefkowitz

What are we afraid of?

I’m crying as I write this, and I want you to understand why.

Politics is the mind-killer. I hate talking about it; I hate driving a wedge between myself and someone I might be able to participate in a coalition with, however narrow. But, when you ignore politics for long enough, it doesn't just kill the mind; it goes on to kill the rest of the body, as well as anyone standing nearby. So, sometimes one is really obligated to talk about it.

Today, I am in despair. Donald Trump is an unprecedented catastrophe for American politics, in many ways. I find it likely that I will get into some nasty political arguments with his supporters in the years to come. But hopefully, this post is not one of those arguments. This post is for you, hypothetical Trump supporter. I want you to understand why we1 are not just sad, that we are not just defeated, but that we are in more emotional distress than any election has ever provoked for us. I want you to understand that we are afraid for our safety, and for good reason.

I do not believe I can change your views; don’t @ me to argue, because you certainly can’t change mine. My hope is simply that you can read this and at least understand why a higher level of care and compassion in political discourse than you are used to may now be required. At least soften your tone, and blunt your rhetoric. You already won, and if you rub it in too much, you may be driving people to literally kill themselves.


First let me list the arguments that I’m not making, so you can’t write off my concerns as a repeat of some rhetoric you’ve heard before.

I won’t tell you about how Trump has the support of the American Nazi Party and the Ku Klux Klan; I know that you’ll tell me that he “can’t control who supports him”, and that he denounced2 their support. I won’t tell you about the very real campaign of violence that has been carried out by his supporters in the mere days since his victory; a campaign that has even affected the behavior of children. I know you don’t believe there’s a connection there.

I think these are very real points to be made. But even if I agreed with you completely, that none of this was his fault, that none of this could have been prevented by his campaign, and that in his heart he’s not a hateful racist, I would still be just as scared.


Bear Stearns estimates that there are approximately 20 million illegal immigrants in the United States. Donald Trump’s official position on how to handle this population is mass deportation. He has promised that this will be done “warmly and humanely”, which betrays his total ignorance of how mass resettlements have happened in the past.

By contrast, the total combined number of active and reserve personnel in the United States Armed Forces is a little over 2 million people.

What do you imagine happens when a person is deported? A person who, as an illegal immigrant, very likely gave up everything they have in their home country, and wants to be where they are so badly that they risk arrest every day, just by living where they live? What do you think happens when millions of them return to countries where they have no home, no food, and quite likely no money or access to the resources or support that they had while in the United States?

They die. They die of exposure because they are in poverty and all their possessions were just stripped away and they can no longer feed themselves, or because they were already refugees from political violence in their home country, or because their home country kills them at the border because it is a hostile action to suddenly burden an economy with the shock of millions of displaced (and therefore suddenly poor and unemployed, whether they were before or not) people.

A conflict between 20 million people on one side and 2 million (heavily armed) people on the other is not a “police action”. It cannot be done “warmly and humanely”. At best, such an action could be called a massacre. At worst (and more likely) it would be called a civil war. Individual deportees can be sent home without incident, and many have been, but the victims of a mass deportation will know what is waiting for them on the other side of that train ride. At least some of them won’t go quietly.

It doesn’t matter if this is technically enforcing “existing laws”. It doesn’t matter whether you think these people deserve to be in the country or not. This is just a reality of very, very large numbers.

Let’s say, just for the sake of argument, that the population of immigrants has assimilated so poorly that each one knows only one citizen who will stand up to defend them, once it’s obvious that they will be sent to their deaths. That’s a hypothetical resistance army of 40 million people. Let’s say they are so thoroughly overpowered by the military and police that there are zero casualties on the other side of this. Generously, let’s say that the police and military are incredibly restrained, and do not use unnecessary overwhelming force, and the casualty rate is just 20%; 4 out of 5 people are captured without lethal force, and miraculously nobody else dies in the remaining 16 million who are sent back to their home countries.

That’s 8 million casualties.

6 million Jews died in the Holocaust.


This is why we are afraid. Forget all the troubling things about Trump’s character. Forget the coded racist language, the support of hate groups, and every detail and gaffe that we could quibble over as the usual chum of left/right political struggle in the USA. Forget his deeply concerning relationship with African-Americans, even.

We are afraid because of things that others have said about him, yes. But mainly, we are afraid because, in his own campaign, Trump promised to be 33% worse than Hitler.

I know that there are mechanisms in our democracy to prevent such an atrocity from occurring. But there are also mechanisms to prevent the kind of madman who would propose such a policy from becoming the President, and thus far they’ve all failed.

I’m not all that afraid for myself. I’m not a Muslim. I am a Jew, but despite all the swastikas painted on walls next to Trump’s name and slogans, I don’t think he’s particularly anti-Semitic. Perhaps he will even make a show of punishing anti-Semites, since he has some Jews in his family3.

I don’t even think he’s trying to engineer a massacre; I just know that what he wants to do will cause one. Perhaps, when he sees what is happening as a result of his orders, he will stop. But his character has been so erratic, I honestly have no idea.

I’m not an immigrant, but many in my family are. One of those immigrants is intimately familiar with the use of the word “deportation” as a euphemism for extermination; there’s even a museum about it where she comes from.

Her mother’s name is written in a book there.


In closing, I’d like to share a quote.

The last thing that my great-grandmother said to my grandmother, before she was dragged off to be killed by the Nazis, was this:

Pleure pas, les gens sont bons.

or, in English:

Don’t cry, people are good.

As it turns out, she was right, in a sense; thanks in large part to the help of anonymous strangers, my grandmother managed to escape, and, here I am.


My greatest hope for this upcoming regime change is that I am dramatically catastrophizing; that none of these plans will come to fruition, that the strange story4 I have been told by Trump supporters is in fact true.

But if my fears, if our fears, should come to pass – and the violence already in the streets already is showing that at least some of those fears will – you, my dear conservative, may find yourself at a crossroads. You may see something happening in your state, or your city, or even in your own home. Your children might use a racial slur, or even just tell a joke that you find troubling. You may see someone, even a policeman, beating a Muslim to death. In that moment, you will have a choice: to say something, or not. To be one of the good people, or not.

Please, be one of the good ones.

In the meanwhile, I’m going to try to take great-grandma’s advice.


  1. When I say “we”, I mean, the people that you would call “liberals”, although our politics are often much more complicated than that; the people from “blue states” even though most states are closer to purple than pure blue or pure red; people of color, and immigrants, and yes, Jews. 

  2. Eventually. 

  3. While tacitly allowing continued violence against Muslims, of course. 

  4. “His campaign is really about campaign finance”, “he just said that stuff to get votes, of course he won’t do it”, “they’ll be better off in their home countries”, and a million other justifications. 

by Glyph at November 12, 2016 02:33 AM

November 10, 2016

Itamar Turner-Trauring

Work/Life Balance Will Make You a Better Software Engineer

It's tempting to believe that taking your work home will make you a better software engineer, and that work/life balance will limit your learning.

  • For some software developers programming isn't just a job: it's something to do for fun, sometimes even a reason for being. If you love coding and coding is your job, why not keep working over the weekend? It's more practice of the skills you need.
  • When you don't have the motivation or ability to take work home on the weekends you might feel you're never going to be as good a software engineer as those who do.

But the truth is that if you want to be a good software engineer you shouldn't take your work home.

What makes a good software engineer? The ability to build solutions for hard, complex problems. Here's why spending extra hours on your normal job won't help you do that.

New problems, new solutions

If you have the time and motivation to write software in your free time you could write more software for your job. But that restricts you to a particular kind of problem and limits the solution space you can consider.

If you take your work home you will end up solving the same kinds of problems that you work on during your normal workweek. You'll need to use technologies that meet your employer's business goals, and you'll need to use the same standards of quality your employer expects. But if you take on a personal programming project you'll have no such constraints.

  • If your company has low quality standards, you can learn how to test really well.
  • Or you can write complete hacks just to learn something new.
  • You can use and learn completely different areas of technology.

I once wrote a Python Global Interpreter Lock profiler, using LD_PRELOAD to override the Python process' interactions with operating system locks and the gdb debugger to look at the live program's C stack. It never worked well enough to be truly useful... but building it was very educational.

The additional learning you'll get from working on different projects will make you a better software engineer. But even if you don't have the time or motivation to code at home, fear not: work/life balance can still make you a better software engineer.

Learning other skills

Being a good software engineer isn't just about churning out code. There are many other skills you need, and time spent doing things other than coding can still improve your abilities.

When I was younger and had more free time I spent my evenings studying at a university for a liberal arts degree. Among other things I learned how to write: how to find abstractions that mattered, how to marshal evidence, how to explain complex ideas, how to extract nuances from text I read. This has been immensely useful when working on harder problems, where good abstractions are critical and design documents are a must.

These days I'm spending more of my time with my child, and as a side-effect I'm learning other things. For example, explaining the world to a 4-year-old requires the ability to take complex concepts and simplify them to their essential core.

You need a hammock to solve hard problems

Though additional learning will help you, much of the benefit of work/life balance is that you're not working. Hard problems require downtime, time when you're explicitly not thinking about solutions, time for your brain to sort things out in the background. Rich Hickey, the creator of Clojure, has a great talk on the subject called Hammock Driven Development.

The gist is that hard problems require a lot of research, of alternatives and existing solutions and the problem definition, and then a lot of time letting your intuition sort things out on its own. And that takes time, time when you're not actively thinking about the problem.

At one point I was my kid's primary caregiver when I wasn't at work. I'm not up to Hickey's standard of hard problems, and taking care of an infant and toddler wasn't as restful as a hammock. But I still found that time spent not thinking about work was helpful in solving the hard problems I went back to the next day.

Learning to do more with less

The final benefit of work/life balance is attitude: the way you think about your job. If you work extra hours on your normal job you are training yourself to do your work with more time than necessary. To improve as a software engineer you want to learn how to do your work in less time, which is important if you want to take on bigger, harder projects.

Working a reasonable, limited work week will help focus you on becoming a more productive programmer rather than trying to solve problems the hard, slow way.

Given the choice you shouldn't take your work home with you. If you want to keep coding you should have no trouble finding interesting projects to work on, untrammeled by the requirements of your job. If you can't or won't code in your free time, that's fine too.

But what if that isn't a choice you can make? What if you don't have work/life balance as a software engineer because of pressure from your boss, or constant emergencies at work? In that case you should sign up for my free 6-part email course, which will show you how to get to a saner, shorter workweek.

November 10, 2016 05:00 AM

October 30, 2016

Itamar Turner-Trauring

Maintainable Python Applications: a Guide for Skeptical Java Developers

When you've been writing Java for a while switching to Python can make you a little anxious. Not only are you learning a new language with new idioms and tools, you're also dealing with a language with far less built-in safety. No more type checks, no more clear separation between public and private.

It's much easier to learn Python than Java, it's true, but it's also much easier to write unmaintainable code. Can you really build large scale, robust and maintainable applications in Python? I think you can, if you do it right.

The suggestions below will help get you started on a new Python project, or improve an existing project that you're joining. You'll need to keep up the best practices you've used in Java, and learn new tools that will help you write better Python.

Tools and Best Practices

Python 2 and 3

Before you start a new Python project you have to choose which version of the language to support: Python 3 is not backwards-compatible with Python 2. Python 2 is only barely being maintained, and will be end-of-lifed in 2020, so that leaves you with only two options with long term viability:

  1. A hybrid language, the intersection of Python 2 and Python 3. This requires you to understand the subtleties of the differences between the two languages. The best guide I've seen to writing this hybrid language is on the Python Future website. (See the sketch after this list.)
  2. Python 3 only.
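
For example, a module written in the hybrid language of option 1 typically begins with future imports, so the same syntax and semantics apply under both interpreters (a minimal sketch):

# At the top of every hybrid Python 2/3 module.
from __future__ import absolute_import, division, print_function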

Most popular Python libraries now support Python 3, as do most runtime environments. Unless you need to write a library that will be used by both new and legacy applications it's best to stick to Python 3 only.

However, on OS X you'll need to use Homebrew to install Python 3 (though using Homebrew's Python 2 is also recommended over using the system Python 2). And on Google App Engine you'll need to use the beta Flexible Environment to get Python 3 support.

Static typing

Java enforces types on method parameters, on object attributes, and on variables. To get the equivalent in Python you can use a combination of runtime type checking and static analysis tools.

  • To ensure your classes have the correct types on attributes you can use the attrs library, though it's very useful even if you don't care about type enforcement. This will only do runtime type checking, so you'll need to have decent test coverage. (See the sketch after this list.)
  • For method parameters and variables, the mypy static type checker, combined with the new Python 3 type annotation syntax, will catch many problems. For Python 2 there is a comment-based syntax as well. The clever folks at Zulip have a nice introductory article about mypy.
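
Here is a minimal sketch combining both approaches (the class and function are illustrative):

import attr
from attr.validators import instance_of

@attr.s
class Point(object):
    # attrs checks these types at runtime: Point("1", 2) raises TypeError.
    x = attr.ib(validator=instance_of(int))
    y = attr.ib(validator=instance_of(int))

def scale(point: Point, factor: int) -> Point:
    # mypy checks the annotations statically: passing a string as factor
    # would be flagged before the code ever runs.
    return Point(point.x * factor, point.y * factor)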

Public, private and interfaces

Python lets you do many things Java wouldn't, everything from metaclasses to replacing a method at runtime. But while these more dynamic capabilities can be quite useful, they are best used sparingly. For example, while Python allows you to set random attributes on a passed-in object, usually you shouldn't.

  • As with Java, you typically want to interact with objects using a method-based interface (explicit or implicit), not by randomly mucking with its internals.
  • As with Java code, you want to have a clear separation between public and private parts of your API.
  • And as with Java, you want to be coding to an interface, not to implementation details.

Where Java has explicit and compiler enforced public/private separation, in Python you do this by convention:

  • Private methods and attributes on a class are typically prefixed with an "_" (see the sketch after this list).
  • The public interface of a module is declared using __all__, e.g. __all__ = ["MyClass", "AnotherClass"]. __all__ also controls what gets imported when you do from module import *, but wildcard imports are a bad idea. For more details see the relevant Python documentation.
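
A minimal sketch of both conventions in one module (the names are illustrative):

# mymodule.py
__all__ = ["MyClass"]  # the module's public interface

class MyClass(object):
    def do_work(self):
        # Public method: safe for callers to use.
        return self._load()

    def _load(self):
        # Private by convention: the leading "_" warns callers off.
        return 42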

As for interfaces, if you want to explicitly declare them you can use Python's built-in abstract base classes; not quite the same, but they can be used as pseudo-interfaces. Alternatively, the zope.interface package is more powerful and flexible (and the attrs library mentioned above understands it).
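
For instance, a pseudo-interface using the built-in abc module might look like this (a sketch; the Storage interface is hypothetical):

from abc import ABCMeta, abstractmethod

class Storage(metaclass=ABCMeta):
    """Interface for key/value persistence."""

    @abstractmethod
    def save(self, key, value):
        """Persist a value under the given key."""

    @abstractmethod
    def load(self, key):
        """Return the value stored under the given key."""

# Instantiating Storage directly raises TypeError; concrete subclasses
# must override save() and load() first.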

Tests

Automated tests are important if you want some assurance your code works. Python has a built-in unittest library that is similar to JUnit, but at a minimum you'll want a more powerful test runner.

  • nose is a test runner for the built-in unittest, with many plugins.
  • pytest is a test runner and framework, supporting the built-in unittest library as well as a more succinct style of testing. It also has numerous plugins.
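
For example, a pytest-style test is just a function with plain asserts (a minimal sketch):

# test_math.py -- run with: pytest test_math.py
def test_addition():
    # pytest rewrites the assert to give a useful failure message.
    assert 1 + 1 == 2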

Other useful tools:

  • Hypothesis lets you write a single function that generates hundreds or thousands of test cases for maximal test coverage (see the sketch after this list).
  • To set up isolated test environments tox is useful; it builds on Python's built-in virtualenv.
  • coverage lets you measure code coverage on your test runs. If you have multiple tox environments, here's a tutorial on combining the resulting code coverage.
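
A minimal sketch of a Hypothesis test (the property being checked is illustrative):

from hypothesis import given
import hypothesis.strategies as st

@given(st.integers(), st.integers())
def test_addition_commutes(x, y):
    # Hypothesis generates many (x, y) pairs, hunting for a counterexample.
    assert x + y == y + x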

More static analysis

In addition to mypy, two other lint tools may prove useful:

  • flake8 is quick, catches a few important bugs, and checks for some standard coding style violations.
  • pylint is much more powerful, slower, and generates massive numbers of false positives. As a result far fewer Python projects use it than use flake8. I still recommend using it, but see my blog post on the subject for details on making it usable.

Documentation

You should document your classes and public methods using docstrings. Unless you're using the new type signature syntax you should also document the types of function parameters and results.

Typically Python docstrings are written in reStructuredText format. It's surprisingly difficult to find an example of the standard style, but here's one.
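
As a rough illustration, a reStructuredText-style docstring might look like this (the function and parameters are hypothetical):

def fetch_rows(table, limit=10):
    """
    Fetch rows from the given table.

    :param str table: the name of the table to query
    :param int limit: the maximum number of rows to return
    :return: the matching rows
    :rtype: list
    """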

Sphinx is the standard documentation tool for Python, for both prose and generated API docs. It supports reStructuredText API documentation, but also Google-style docstrings.

Editors

A good Python editor or IDE won't be as powerful as the equivalent Java IDE, but it will make your life easier. All of these will do syntax highlighting, code completion, error highlighting, etc.:

  • If you're used to IntelliJ you can use PyCharm.
  • If you're used to Eclipse you can use PyDev.
  • Elpy is a great Emacs mode for Python.
  • Not certain what your best bet is for vim, but python-mode looks plausible.

Writing maintainable Python

In the end, writing maintainable Python is very much like writing maintainable Java. Python has more flexibility, but also more potential for abuse, so Python expects you to be a responsible adult.

You can choose to write bad code, but if you follow the best practices you learned from Java you won't have to. And the tools I've described above will help catch any mistakes you make along the way.

October 30, 2016 04:00 AM

October 29, 2016

Moshe Zadka

Twisted as Your WSGI Container

Introduction

WSGI is a great standard. It has been amazingly successful. In order to describe how successful it is, let me describe life before WSGI. In the beginning, CGI existed. CGI was just a standard for how a web server can run a process — what environment variables to pass, and so forth. In order to write a web-based application, people would write programs that complied with CGI. At that time, Apache’s only competition was commercial web servers, and CGI allowed you to write applications that ran on both. However, starting a process for each request was slow and wasteful.

For Python applications, people wrote mod_python for Apache. It allowed people to write Python programs that ran inside the Apache process, and directly used Apache’s API to access the HTTP request details. Since Apache was the only server that mattered, that was fine. However, as more servers arrived, a standard was needed. mod_wsgi was originally a way to run the same Django application on many servers. However, as a side effect, it also allowed the second wave of Python web application frameworks (Paste, Flask and more) to have something to run on. In order to make life easier, Python included wsgiref, a module that implemented a single-threaded, single-process blocking web server speaking the WSGI protocol.

Development

Some web frameworks come with their own development web servers that will run their WSGI apps. Some use wsgiref. Almost always those options are carefully documented as “just for development use, do not use in production.” Wouldn’t it be nice to use the same WSGI container in both development and production, eliminating one potential source of bugs that appear only in production?

For ease of use, it should probably be written in Python. Luckily, “twist web --wsgi” is just such a server. In order to showcase how easy it is to use, twist-wsgi shows commands to run Django, Flask, Pyramid and Bottle apps as easily as running each framework’s built-in web server.
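
For example, a Flask application exposed as app in a hypothetical myapp module could be served, in development and production alike, with:

$ twist web --wsgi myapp.app --port tcp:8000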

Production

In production, using the Twisted WSGI container comes with several advantages. Production-grade SSL support using pyOpenSSL and cryptography allows elimination of “SSL terminators”, removing one moving piece from the equation. With third-party extensions like txsni and txacme, it allows modern support for “easy SSL”. The built-in HTTP/2 support, starting with Twisted 16.3, allows better support for parallel requests from modern browsers.

The Twisted web server also has a built-in static file server, allowing the elimination of a “front-end” web server that deals with static files by itself, and passing dynamic requests to the application server.

Twisted is also not limited to web serving. As a full-stack networking framework, it has support for scheduling repeated tasks, running processes and supporting other protocols (for example, a side-channel for online control). Last but not least, in order to integrate all that, the language used is Python. As an example of an integrated solution, the Frankensteinian monster plugin showcases a combo web application combining 4 frameworks, a static file server and a scheduled task updating a file.

While the goal is not to encourage using four web frameworks and a couple of side services in order to greet the user and tell them what time it is, it is nice that, if the need strikes, this can all be integrated into one process in one language, without the need to remember how to spell “every 4 seconds” in cron or how to quote a string in the nginx configuration file.

by moshez at October 29, 2016 03:03 PM

Twisted Matrix Laboratories

Twisted 16.5.0 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 16.5!

The highlights of this release are:

  • Deferred.addTimeout, for timing out your Deferreds (see the example after this list)! (contributed by cyli, reviews by adiroiban, theisencouple, manishtomar, markrwilliams)
  • yield from support for Deferreds, in functions wrapped with twisted.internet.defer.ensureDeferred. This will work in Python 3.4, unlike async/await which is 3.5+ (contributed by hawkowl, reviews by markrwilliams, lukasa).
  • The new asyncio interop reactor, which allows Twisted to run on top of the asyncio event loop. This doesn't include any Deferred-Future interop, but stay tuned! (contributed by itamar and hawkowl, reviews by rodrigc, markrwilliams)
  • twisted.internet.cfreactor is now supported on Python 2.7 and Python 3.5+! This is useful for writing pyobjc or Toga applications. (contributed by hawkowl, reviews by glyph, markrwilliams)
  • twisted.python.constants has been split out into constantly on PyPI, and likewise with twisted.python.versions going into the PyPI package incremental. Twisted now uses these external packages, which will be shared with other projects (like Klein). (contributed by hawkowl, reviews by glyph, markrwilliams)
  • Many new Python 3 modules, including twisted.pair, twisted.python.zippath, twisted.spread.pb, and more parts of Conch! (contributed by rodrigc, hawkowl, glyph, berdario, & others, reviews by acabhishek942, rodrigc, & others)
  • Many bug fixes and cleanups!
  • 260+ closed tickets overall.
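
    A quick sketch of the first two highlights (illustrative only; see the API documentation for exact semantics):

        from twisted.internet import reactor
        from twisted.internet.defer import Deferred, ensureDeferred, succeed

        # Deferred.addTimeout: errback with a TimeoutError if no result arrives
        # within the given number of seconds.
        d = Deferred()
        d.addTimeout(60, reactor)

        # `yield from` on Deferreds (works on Python 3.4), via ensureDeferred:
        def double(x):
            result = yield from succeed(x)
            return result * 2

        ensureDeferred(double(21)).addCallback(print)  # prints 42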

    For more information, check the NEWS file (link provided below).

    You can find the downloads on PyPI (or alternatively our website). The NEWS file is also available on GitHub.

    Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

    Twisted Regards,
    Amber Brown (HawkOwl)

    PS: I wrote a blog post about Twisted's progress in 2016! https://atleastfornow.net/blog/marching-ever-forward/

    by Amber Brown (noreply@blogger.com) at October 29, 2016 07:11 AM

    October 27, 2016

    Glyph Lefkowitz

    What Am Container

    Perhaps you are a software developer.

    Perhaps, as a developer, you have recently become familiar with the term "containers".

    Perhaps you have heard containers described as something like "LXC, but better", "an application-level interface to cgroups" or "like virtual machines, but lightweight", or perhaps (even less usefully), a function call. You've probably heard of "docker"; do you wonder whether a container is the same as, different from, or part of Docker?

    Are you bewildered by the blisteringly fast-paced world of "containers"? Maybe you have no trouble understanding what they are - in fact you might be familiar with half a dozen orchestration systems and container runtimes already - but are frustrated because this seems like a whole lot of work and you just don't see what the point of it all is?

    If so, this article is for you.

    I'd like to lay out exactly what the point of "containers" is, why people are so excited about them, and what makes the ecosystem around them so confusing. Unlike my previous writing on the topic, I'm not going to assume you know anything about the ecosystem in general; just that you have a basic understanding of how UNIX-like operating systems separate processes, files, and networks.1


    At the dawn of time, a computer was a single-tasking machine. Somehow, you'd load your program into main memory, and then you'd turn it on; it would run the program, and (if you're lucky) spit out some output onto paper tape.

    When a program running on such a computer looked around itself, it could "see" the core memory of the computer it was running on, any attached devices, including consoles, printers, teletypes, or (later) networking equipment. This was of course very powerful - the program had full control of everything attached to the computer - but also somewhat limiting.

    This mode of addressing hardware was limiting, because it meant that programs would break the instant you moved them to a new computer. They had to be re-written to accommodate new amounts and types of memory, new sizes and brands of storage, and new types of networks. If a program had to contain within itself the full knowledge of every piece of hardware that it might ever interact with, it would be very expensive indeed.

    Also, if all the resources of a computer were dedicated to one program, then you couldn't run a second program without stomping all over the first one - crashing it by mangling its structures in memory, deleting its data by overwriting its data on disk.

    So, programmers cleverly devised a way of indirecting, or "virtualizing", access to hardware resources. Instead of a program simply addressing all the memory in the whole computer, it got its own little space where it could address its own memory - an address space, if you will. If a program wanted more memory, it would ask a supervising program - what we today call a "kernel" - to give it some more memory. This made programs much simpler: instead of memorizing the address offsets where a particular machine kept its memory, a program would simply begin by saying "hey operating system, give me some memory", and then it would access the memory in its own little virtual area.

    In other words: memory allocation is just virtual RAM.

    Virtualizing memory - i.e. ephemeral storage - wasn't enough; in order to save and transfer data, programs also had to virtualize disk - i.e. persistent storage. Whereas a whole-computer program would just seek to position 0 on the disk and start writing data to it however it pleased, a program writing to a virtualized disk - or, as we might call it today, a "file" - first needed to request a file from the operating system.

    In other words: file systems are just virtual disks.

    Networking was treated in a similar way. Rather than addressing the entire network connection at once, each program could allocate a little slice of the network - a "port". That way a program could, instead of consuming all network traffic destined for the entire machine, ask the operating system to just deliver it all the traffic for, say, port number seven.

    In other words: listening ports are just virtual network cards.
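
    In concrete terms (my example, not the essay's), that is just the familiar sockets API:

        import socket

        s = socket.socket()
        s.bind(("0.0.0.0", 7))  # "deliver me all the traffic for port seven"
        s.listen(5)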


    Getting bored by all this obvious stuff yet? Good. One of the things that frustrates me the most about containers is that they are an incredibly obvious idea that is just a logical continuation of a trend that all programmers are intimately familiar with.


    All of these different virtual resources exist for the same reason: as I said earlier, if two programs need the same resource to function properly, and they both try to use it without coordinating, they'll both break horribly.2

    UNIX-like operating systems more or less virtualize RAM correctly. When one program grabs some RAM, nobody else - modulo super-powered administrative debugging tools - gets to use it without talking to that program. It's extremely clear which memory belongs to which process. If programs want to use shared memory, there is a very specific, opt-in protocol for doing so; it is basically impossible for it to happen by accident.

    However, the abstractions we use for disks (filesystems) and network cards (listening ports and addresses) are significantly more limited. Every program on the computer sees the same file-system. The program itself, and the data the program stores, both live on the same file-system. Every program on the computer can see the same network information, can query everything about it, and can receive arbitrary connections. Permissions can remove certain parts of the filesystem from view (i.e. programs can opt-out) but it is far less clear which program "owns" certain parts of the filesystem; access must be carefully controlled, and sometimes mediated by administrators.

    In particular, the way that UNIX manages filesystems creates an environment where "installing" a program requires manipulating state in the same place (the filesystem) where other programs might require different state. Popular package managers on UNIX-like systems (APT, RPM, and so on) rarely have a way to separate program installation even by convention, let alone by strict enforcement. If you want to do that, you have to re-compile the software with ./configure --prefix to hard-code a new location. And, fundamentally, this is why the package managers don't support installing to a different place: if the program can tell the difference between different installation locations, then it will, because its developers thought it should go in one place on the file system, and why not hard code it? It works on their machine.


    In order to address this shortcoming of the UNIX process model, the concept of "virtualization" became popular. The idea of virtualization is simple: you write a program which emulates an entire computer, with its own storage media and network devices, and then you install an operating system on it. This completely resolves the over-sharing of resources: a process inside a virtual machine is, in a very real sense, running on a different computer than programs running in a different virtual machine on the same physical device.

    However, virtualization is also an extremely heavy-weight blunt instrument. Since virtual machines are running operating systems designed for physical machines, they have tons of redundant hardware-management code, and enormous amounts of operating system data which could be shared with the host; but since it's all in the form of a disk image totally managed by the virtual machine's operating system, the host can't really peek inside to optimize anything. It also makes other kinds of intentional resource sharing very hard: any software to manage the host needs to be installed on the host, since if it is installed on the guest it won't have full access to the host's hardware.

    I hate using the term "heavy-weight" when I'm talking about software - it's often bandied about as a content-free criticism - but the difference in overhead between running a virtual machine and a process is the difference between gigabytes and kilobytes; somewhere between 4-6 orders of magnitude. That's a huge difference.

    This means that you need to treat virtual machines as multi-purpose, since one VM is too big to run just a single small program. Which means you often have to manage them almost as if they were physical hardware.


    When we run a program on a UNIX-like operating system, and by so running it, grant it its very own address space, we call the entity that we just created a "process".

    This is how to understand a "container".

    A "container" is what we get when we run a program and give it not just its own memory, but its own whole virtual filesystem and its own whole virtual network card.

    The metaphor to processes isn't perfect, because a container can contain multiple processes with different memory spaces that share a single filesystem. But this is also where some of the "container ecosystem" fervor begins to creep in - this is why people interested in containers will religiously exhort you to treat a container as a single application, not to run multiple things inside it, not to SSH into it, and so on. This is because the whole point of containers is that they are lightweight - far closer in overhead to the size of a process than that of a virtual machine.

    A process inside a container, if it queries the operating system, will see a computer where only it is running, where it owns the entire filesystem, and where any mounted disks were explicitly put there by the administrator who ran the container. In other words, if it wants to share data with another application, it has to be given the shared data; opt-in, not opt-out, the same way that memory-sharing is opt-in in a UNIX-like system.
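
    For example, with Docker (the image and path names here are hypothetical), both kinds of sharing are explicit, opt-in command-line flags:

        $ # Share exactly one directory and expose exactly one port:
        $ docker run -v /srv/shared-data:/data -p 8080:80 example/myapp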


    So why is this so exciting?

    In a sense, it really is just a lower-overhead way to run a virtual machine, as long as it shares the same kernel. That's not super exciting, by itself.

    The reason that containers are more exciting than processes is the same reason that using a filesystem is more exciting than having to use a whole disk: sharing state always, inevitably, leads to brokenness. Opt-in is better than opt-out.

    When you give a program a whole filesystem to itself, sharing any data explicitly, you eliminate even the possibility that some other program scribbling on a shared area of the filesystem might break it. You don't need package managers any more, only package installers; by removing the other functions of package managers (inventory, removal) they can be radically simplified, and less complexity means less brokenness.

    When you give a program an entire network address to itself, exposing any ports explicitly, you eliminate even the possibility that some rogue program will expose a security hole by listening on a port you weren't expecting. You eliminate the possibility that it might clash with other programs on the same host, hard-coding the same port numbers or auto-discovering the same addresses.


    In addition to the exciting things on the run-time side, containers - or rather, the things you run to get containers, "images"3, present some compelling improvements to the build-time side.

    On Linux and Windows, building a software artifact for distribution to end-users can be quite challenging. It's challenging because it's not clear how to specify that you depend on certain other software being installed; it's not clear what to do if you have conflicting versions of that software that may not be the same as the versions already available on the user's computer. It's not clear where to put things on the filesystem. On Linux, this often just means getting all of your software from your operating system distributor.

    You'll notice I said "Linux and Windows", not the usual big-three desktop platforms (Linux, Windows, Mac), and I didn't say anything about mobile OSes. That's because on macOS, Android, iOS, and Windows Metro, applications already run in their own containers. The rules of macOS containers are a bit weird, and very different from Docker containers, but if you have a Mac you can check out ~/Library/Containers to see the view of the world that the applications you're running can see. iOS looks much the same.

    This is something that doesn't get discussed a lot in the container ecosystem, partially because everyone is developing technology at such a breakneck pace, but in many ways Linux server-side containerization is just a continuation of a trend that started on mainframe operating systems in the 1970s and has already been picked up in full force by mobile operating systems.

    When one builds an image, one is building a picture of the entire filesystem that the container will see, so an image is a complete artifact. By contrast, a package for a Linux package manager is just a fragment of a program, leaving out all of its dependencies, to be integrated later. If an image runs on your machine, it will (except in some extremely unusual circumstances) run on the target machine, because everything it needs to run is fully included.

    Because you build all the software an image requires into the image itself, there are some implications for server management. You no longer need to apply security updates to a machine - they get applied to one application at a time, and they get applied as a normal process of deploying new code. Since there's only one update process, which is "delete the old container, run a new one with a new image", updates can roll out much faster, because you can build an image, run tests for the image with the security updates applied, and be confident that it won't break anything. No more scheduling maintenance windows, or managing reboots (at least for security updates to applications and libraries; kernel updates are a different kettle of fish).
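
    Concretely, with Docker (image and container names hypothetical), that single update process looks something like:

        $ docker build -t myapp:2016-10-27 .        # rebuild on patched layers
        $ docker stop myapp && docker rm myapp      # delete the old container
        $ docker run -d --name myapp myapp:2016-10-27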


    That's why it's exciting. So why's it all so confusing?5

    Fundamentally the confusion is caused by there just being way too many tools. Why so many tools? Once you've accepted that your software should live in images, none of the old tools work any more. Almost every administrative, monitoring, or management tool for UNIX-like OSes depends intimately upon the ability to promiscuously share the entire filesystem with every other program running on it. Containers break these assumptions, and so new tools need to be built. Nobody really agrees on how those tools should work, and a wide variety of forces ranging from competitive pressure to personality conflicts make it difficult for the panoply of container vendors to collaborate perfectly4.

    Many companies whose core business has nothing to do with infrastructure have gone through this reasoning process:

    1. Containers are so much better than processes, we need to start using them right away, even if there's some tooling pain in adopting them.
    2. The old tools don't work.
    3. The new tools from the tool vendors aren't ready.
    4. The new tools from the community don't work for our use-case.
    5. Time to write our own tool, just for our use-case and nobody else's! (Which causes problem #3 for somebody else, of course...)

    A less fundamental reason is too much focus on scale. If you're running a small-scale web application which has a stable user-base that you don't expect a lot of growth in, there are many great reasons to adopt containers as opposed to automating your operations; and in fact, if you keep things simple, the very fact that your software runs in a container might obviate the need for a system-management solution like Chef, Ansible, Puppet, or Salt. You should totally adopt them and try to ignore the more complex and involved parts of running an orchestration system.

    However, containers are even more useful at significant scale, which means that companies which have significant scaling problems invest in containers heavily and write about them prolifically. Many guides and tutorials on containers assume that you expect to be running a multi-million-node cluster with fully automated continuous deployment, blue-green zero-downtime deploys, and a 1000-person operations team. It's great if you've got all that stuff, but building each of those components is a non-trivial investment.


    So, where does that leave you, my dear reader?

    You should absolutely be adopting "container technology", which is to say, you should probably at least be using Docker to build your software. But there are other, radically different container systems - like Sandstorm - which might make sense for you, depending on what kind of services you create. And of course there's a huge ecosystem of other tools you might want to use; too many to mention, although I will shout out to my own employer's docker-as-a-service Carina, which delivered this blog post, among other things, to you.

    You shouldn't feel as though you need to do containers absolutely "the right way", or that the value of containerization is derived from adopting every single tool that you can all at once. The value of containers comes from four very simple things:

    1. It reduces the overhead and increases the performance of co-locating multiple applications on the same hardware,
    2. It forces you to explicitly call out any shared state or required resources,
    3. It creates a complete build pipeline that results in a software artifact that can be run without special installation or set-up instructions (at least, on the "software installation" side; you still might require configuration, of course), and
    4. It gives you a way to test exactly what you're deploying.

    These benefits can combine and interact in surprising and interesting ways, and can be enhanced with a wide and growing variety of tools. But underneath all the hype and the buzz, the very real benefit of containerization is basically just that it is fixing a very old design flaw in UNIX.

    Containers let you share less state, and shared mutable state is the root of all evil.


    1. If you have a more sophisticated understanding of memory, disks, and networks, you'll notice that everything I'm saying here is patently false, and betrays an overly simplistic understanding of the development of UNIX and the complexities of physical hardware and driver software. Please believe that I know this; this is an alternate history of the version of UNIX that was developed on platonically ideal hardware. The messy co-evolution of UNIX, preemptive multitasking, hardware offload for networks, magnetic secondary storage, and so on, is far too large to fit into the margins of this post. 

    2. When programs break horribly like this, it's called "multithreading". I have written some software to help you avoid it. 

    3. One runs an "executable" to get a process; one runs an "image" to get a container. 

    4. Although the container ecosystem is famously acrimonious, companies in it do actually collaborate better than the tech press sometimes give them credit for; the Open Container Project is a significant extraction of common technology from multiple vendors, many of whom are also competitors, to facilitate a technical substrate that is best for the community. 

    5. If it doesn't seem confusing to you, consider this absolute gem from the hilarious folks over at CircleCI. 

    by Glyph at October 27, 2016 09:23 AM

    October 22, 2016

    Glyph Lefkowitz

    docker run glyph/rproxy

    Want to TLS-protect your co-located stack of vanity websites with Twisted and Let's Encrypt using HawkOwl's rproxy, but can't tolerate the bone-grinding tedium of a pip install? I built a docker image for you, so it's now as simple as:

    $ mkdir -p conf/certificates;
    $ cat > conf/rproxy.ini << EOF;
    > [rproxy]
    > certificates=certificates
    > http_ports=80
    > https_ports=443
    > [hosts]
    > mysite.com_host=<other container host>
    > mysite.com_port=8080
    > EOF
    $ docker run --restart=always -v "$(pwd)"/conf:/conf \
        -p 80:80 -p 443:443 \
        glyph/rproxy;
    

    There are no docs to speak of, so if you're interested in the details, see the tree on github I built it from.

    Modulo some handwaving about docker networking to get that <other container host> IP, that's pretty much it. Go forth and do likewise!

    by Glyph at October 22, 2016 08:12 PM

    October 19, 2016

    Itamar Turner-Trauring

    Why Pylint is both useful and unusable, and how you can actually use it

    This is a story about a tool that caught a production-impacting bug the day before we released the code. This is also the story of a tool no one uses, and for good reason. By the time you're done reading you'll see why this tool is useful, why it's unusable, and how you can actually use it with your Python project.

    (Not a Python programmer? The same problems and solutions likely apply to tools in your ecosystem as well.)

    Pylint saves the day

    If you're coding in Haskell the compiler's got your back. If you're coding in Java the compiler will usually lend a helping hand. But if you're coding in a dynamic language like Python or Ruby you're on your own: you don't have a compiler to catch bugs for you.

    The next best thing is a lint tool that uses heuristics to catch bugs in your code. One such tool is Pylint, and here's how I started using it.

    One day at work we realized our builds had been consistently failing for a few days, and it wasn't the usual intermittent failures. After a few days of investigating, my colleague Tom Prince discovered the problem. It was Python code that looked something like this:

    for volume in get_volumes():
        do_something(volume)
    
    for volme in get_other_volumes():
        do_something_else(volume)
    

    Notice the typo in the second for loop. Combined with the fact that Python leaks variables from blocks, the last value of volume from the first for loop was used for every iteration of the second loop.

    To see if we could prevent these problems in the future I tried Pylint, re-introduced the bug... and indeed it caught the problem. I then looked at the rest of the output to see what else it had found.

    What it had found was a serious bug. It was in code I had written a few days earlier, and the bug completely broke an important feature we were going to ship to users the very next day. Here's a heavily simplified minimal reproducer for the bug:

    list_of_printers = []
    for i in [1, 2, 3]:
        def printer():
            print(i)
        list_of_printers.append(printer)
    
    for func in list_of_printers:
        func()
    

    The intended result of this reproducer is to print:

    1
    2
    3
    

    But what will actually get printed with this code is:

    3
    3
    3
    

    When you define a nested function in Python that refers to a variable in the outside scope it binds not the value of a variable but the variable itself. In this case that means the i inside printer() ended up always getting the last value of the variable i in the for loop.
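
    One common fix, sketched here rather than taken from the original incident, is to bind the current value at definition time with a default argument:

        list_of_printers = []
        for i in [1, 2, 3]:
            def printer(i=i):  # the default argument captures i's value now
                print(i)
            list_of_printers.append(printer)

        for func in list_of_printers:
            func()  # prints 1, 2, 3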

    And luckily Pylint caught that bug before it shipped; pretty great, right?

    Why no one uses Pylint

    Pylint is useful, but many projects don't use it. For example, I went and checked just now, and neither Twisted nor Django nor Flask nor Sphinx seem to use Pylint. Why wouldn't these large, sophisticated Python projects use a tool that would automatically catch bugs for them?

    One problem is that it's slow, but that's not the real problem; you can always just run it on the CI system with the other slow tests. The real problem is the amount of output.

    Here's what I mean: I ran Pylint on a checkout of Twisted, and the result was 28,000 lines of output (at which point Pylint crashed, but I'll assume that's fixed in newer releases). Let me say that again: 28,000 errors or warnings.

    That's insane.

    To be fair, Twisted has a coding standard that doesn't match the Python mainstream, but massive amounts of noise have been my experience with other projects as well. Pylint has a lot of useful errors... but also a whole lot of utterly useless garbage assumptions about how your code should look. And fundamentally it treats them all the same: there is nominally a distinction between warnings and errors, but in practice both useful and useless checks end up in the warning category.

    For example:

    W:675, 0: Class has no __init__ method (no-init)

    That's not a useful warning. Now imagine a few thousand of those.

    How you should use Pylint

    So here we have a tool that is potentially useful, but unusable in practice. What to do? Luckily Pylint has some functionality that can help: you can configure it with a whitelist of lint checks.

    First, setup Pylint to do nothing:

    1. Make a list of all the features you plausibly want to enable from the Pylint docs and configure .pylintrc to whitelist them.
    2. Comment them all out.

    At this point Pylint will do no checks. Next:

    1. Uncomment a small batch of checks, and run pylint.
    2. If the resulting errors are real problems, fix them. If the errors are utter garbage, delete those checks from the configuration.

    At this point you have a small number of probably useful checks that are passing: you can run pylint and you only will be told about new problems. In other words, you have a useful tool.
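
    For example, a whitelist-style .pylintrc might end up looking something like this (a sketch, not either of the linked configurations; message names vary between Pylint versions, so check them against the docs):

        [MESSAGES CONTROL]
        # Start from nothing...
        disable=all
        # ...then enable only the checks that have proven their worth:
        enable=unused-variable,
               undefined-variable,
               cell-var-from-loop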

    Repeat this process a few times, or once a week, enabling a new batch of checks each time until you run out of patience or you run out of Pylint checks to enable.

    The end result will be something like this configuration or this configuration; both projects are open source under the Apache 2.0 license, so you can use those as a starting point.

    Go forth and lint

    Here's my challenge to you: if you're a Python programmer, go setup Pylint on a project today. It'll take an hour to get some minimal checks going, and one day it will save you from a production-impacting bug. If you're not a Python programmer you can probably find some equivalent tool for your language; go set that up.

    And if you're the author of a lint tool, please, try to come up with better defaults. It's better to catch 60% of bugs and have 10,000 software projects using your tool than to catch 70% of bugs and have almost no one use it.

    October 19, 2016 04:00 AM


    October 18, 2016

    Glyph Lefkowitz

    As some of you may have guessed from the unintentional recent flurry of activity on my Twitter account, Twitterfeed, the service I used to use to post blog links automatically, is getting end-of-lifed. I've switched to dlvr.it for the time being, unless they send another unsolicited tweetstorm out on my behalf...

    Sorry about the noise! In the interests of putting some actual content here, maybe you would be interested to know that I was recently interviewed for PyDev of the Week?

    by Glyph at October 18, 2016 08:37 PM

    October 15, 2016

    Jonathan Lange

    servant-template: production-ready Haskell web services in 5 minutes

    If you want to write a web API in Haskell, then you should start by using my new cookiecutter template at https://github.com/jml/servant-template. It’ll get you a production-ready web service in 5 minutes or less.

    Whenever you start any new web service and you actually care about getting it working and available to users, it’s very useful to have:

    • logging
    • monitoring
    • continuous integration
    • tests
    • deployment
    • command-line parsing

    These are largely boring, but nearly essential. Logs and monitoring give you visibility into the code’s behaviour in production, tests and continuous integration help you make sure you don’t break it, and, of course, you need some way of actually shipping code to users. As an engineer who cares deeply about running code in production, these are pretty much the bare minimum for me to be able to deploy something to my users.

    The cookiecutter template at gh:jml/servant-template creates a simple Haskell web API service that does all of these things.

    As the name suggests, all of this enables writing a servant server. Servant lets you declare web APIs at the type level and then use those API specifications to write servers. It's hard to overstate just how useful it is for writing RESTful APIs.

    Get started with:

    $ cookiecutter gh:jml/servant-template
    project_name [awesome-service]: awesome-service
    ...
    $ cd awesome-service
    $ stack test
    $ make image
    ...
    sha256:30e4c9a5f29a2c4caa44e226859dd094c6ac9d297de0d1d2024e8a981a7c8f86
    awesome-service:unversioned
    $ docker run awesome-service:latest --help
    awesome-service - TODO fill this in
    
    Usage: awesome-service --port PORT [--access-logs ARG] [--log-level ARG]
                           [--ghc-metrics]
      One line description of project
    
    Available options:
      -h,--help                Show this help text
      --port PORT              Port to listen on
      --access-logs ARG        How to log HTTP access
      --log-level ARG          Minimum severity for log messages
      --ghc-metrics            Export GHC metrics. Requires running with +RTS.
    $ docker run -p 8080:80 awesome-service --port 80
    [2016-10-16T20:50:07.983292987000] [Informational] Listening on :80
    

    For this to work, you’ll need to have Docker installed on your system. I’ve tested it on my Mac with Docker Machine, but haven’t yet with Linux.

    You might have to run stack docker pull before make image, if you haven’t already used stack to build things from within Docker.

    Once it’s up and running, you can browse to http://localhost:8080/ (or http://$(docker-machine ip):8080/) if you’re on a Mac, and you’ll see a simple HTML page describing the API and giving you a link to the /metrics page, which is where all the Prometheus metrics are exported.

    There you have it, a production-ready web service. At least for some values of “production-ready”.

    Of course, the API it offers is really simple. You can make it your own by editing the API definition and the server implementation. Note that these two are in separate libraries, to make it easier to generate client code.

    The template comes with a test suite that uses servant-quickcheck to guarantee that none of your endpoints return 500s, take longer than 100ms to serve, and that all the 201s include Location headers.

    If you’re so inclined, you could push the created Docker image to a repository somewhere—it’s around 25MB when built. Then, people could use it and no one would have to know that it’s Haskell, they’d just notice a fast web service that works.

    As the README says, I’ve made a few questionable decisions when building this. If you disagree, or think I could have done anything better I’d love to know. If you use this to build something cool, or even something silly, please let me know on Twitter.

    by jml at October 15, 2016 11:00 PM

    October 14, 2016

    Itamar Turner-Trauring

    How to find a programming job you won't hate

    Somewhere out there is a company that wants to hire you as a software engineer. Working for that company is a salesperson whose incentives were set by an incompetent yet highly compensated upper management. The salesperson has just made a sale, and in return for a large commission has promised the new customer twice the features in half the time.

    The team that wants to hire you will spend the next three months working evenings and weekends. And then, with a job badly done, they'll move on to the next doomed project.

    You don't want to work for this company, and you shouldn't waste your time applying there.

    When you're looking for a new programming job you want to find it quickly:

    • If your current job sucks you want to find a new place before you hit the unfortunate gotta-quit-today moment.
    • If you're not working you don't want your savings to run out. You have been saving money, right?
    • Either way, looking for a job is no fun.

    Assuming you can afford to be choosy, you'll want to speed up the process by filtering out as many companies as possible in advance. There are many useful ways to filter your list down: your technical interests, the kinds of company you want to work for, location.

    In this post, however, I'd like to talk about ways to filter out companies you'd hate. That is, companies with terrible work conditions.

    Talk to your friends

    Some companies have a bad reputation, some have a great reputation. But once a company is big enough, different teams can end up with very different work environments.

    Talking to someone who actually works at a company will give you much better insight into how things work locally. They can tell you which groups to avoid, and which groups have great leadership.

    For example, Amazon does not have a very good reputation as a workplace, but I know someone who enjoys his job there and works very reasonable hours.

    Glassdoor

    For companies where you don't have contacts Glassdoor can be a great resource. Glassdoor is a site that lets employees post anonymous salaries and reviews of their company.

    The information is anonymous, so you have to be a little skeptical, especially when there are only a few reviews. And you need to pay attention to the reviewer's role and location, and the year the review was posted. Once you take all that into account, the reviews can often be very informative.

    During my last job search I found one company in the healthcare area with many complaints of long working hours. One of Glassdoor's features is a way for a company to reply to reviews. In this case the CEO himself answered, explaining that they work hard because "sick patients can't wait."

    Personally I'd rather not work for someone who confuses working long hours with increased output or productivity.

    Read company materials

    After you've checked out Glassdoor the next thing to look at is the job posting itself, along with the company's website. These are often written by people other than the engineering team, but you can still learn a lot from them.

    Sometimes you'll get the sense the company is actually a great place to work for. For example, Memrise has this to say in their Software Engineering postings:

    If you aren’t completely confident that you fit our exact criteria, please get in touch immediately. Humility is a wonderful thing and we’re not interested in hiring ‘rockstars’ or ‘ninjas’.

    On the other hand, consider a job post I found for an Automation Test Engineer. First we learn:

    Must be able to execute scripts during off hours if required. ... This isn’t the job for someone looking for a traditional 8-5 position, but it’s a great role for someone who is hungry for a terrific opportunity in a fast-paced, state of the art environment.

    Long hours and being on call at any moment to run some tests. Doesn't sound very appealing, does it?

    Notice, by the way, that it's worth reading all of a company's job postings. Other job postings from the same company are less informative about working conditions than the one I just quoted.

    Interviews

    Finally, if a company has passed the previous filters and you've gotten an interview, make sure you ask about working conditions. Tactfully, of course, and once you've demonstrated your value, but if you don't ask you won't know until it's too late. Here are some sample questions to get you started:

    • What's your typical work day like?
    • How many hours do you end up working?
    • How do you manage project deadlines?

    Depending on the question you might want to ask individual contributors rather than managers. But I've had managers tell me outright they want employees to work really long hours.

    --

    There are many bad software jobs out there. But you don't need to work evenings or weekends to succeed as a programmer.

    If you want to find a programming job with a sane workweek, a job you'll actually enjoy, sign up for the free email course below for more tips and tricks.

    October 14, 2016 04:00 AM

    October 09, 2016

    Thomas Vander Stichele

    Puppet/puppetdb/storeconfigs validation issues

    Over the past year I’ve chipped away at setting up new servers for apestaart and managing the deployment in puppet, as opposed to the by now years-old manual single-server configuration that would be hard to replicate if the drives fail (one of which did recently, making this more urgent).

    It’s been a while since I felt like I was good enough at puppet to love and hate it in equal parts; at a previous job I mostly managed to control a deployment of around ten servers with it.

    Things were progressing an hour or two here and there at a time, and accelerated when a friend in our collective was launching a new business for which I wanted to make sure he had a decent redundancy setup.

    I was saving the hardest part for last – setting up Nagios monitoring with Matthias Saou’s puppet-nagios module, which needs External Resources and storeconfigs working.

    Even on the previous server setup based on CentOS 6, that was a pain to set up – needing MySQL and ruby’s ActiveRecord. But it sorta worked.

    It seems that for newer puppet setups, you’re now supposed to use something called PuppetDB, which is not in fact a database on its own as the name suggests, but requires another database. Of course, it chose to need a different one – Postgres. Oh, and PuppetDB itself is in Java – now you get the cost of two runtimes when you use puppet!

    So, to add useful Nagios monitoring to my puppet deploys, which without it are quite happy to be simple puppet apply runs from a local git checkout on each server, I now need storeconfigs, which needs puppetdb, which pulls in Java and Postgres. And that’s just so a system that handles distributed configuration can actually be told about the results of that distributed configuration, and create a useful feedback cycle allowing it to do useful things with the observed result.

    Since I test these deployments on local vagrant/VirtualBox machines, I had to double their RAM because of this – even just the puppetdb java server by default starts with 192MB reserved out of the box.

    But enough complaining about these expensive changes – at least there was a working puppetdb module that managed to set things up well enough.

    It was easy enough to get the first host monitored, and apart from some minor changes (like updating the default Nagios config template from 3.x to 4.x), I had a familiar Nagios view working showing results from the server running Nagios itself. Success!

    But all runs from the other vm’s did not trigger adding any exported resources, and I couldn’t find anything wrong in the logs. In fact, I could not find /var/log/puppetdb/puppetdb.log at all…

    fun with utf-8

    After a long night of experimenting and head scratching, I chased down a first clue in /var/log/messages saying puppet-master[17702]: Ignoring invalid UTF-8 byte sequences in data to be sent to PuppetDB

    I traced that down to puppetdb/char_encoding.rb, and with my limited ruby skills, I got a dump of the offending byte sequence by adding this code:

    Puppet.warning "Ignoring invalid UTF-8 byte sequences in data to be sent to PuppetDB"
    File.open('/tmp/ruby', 'w') { |file| file.write(str) }
    Puppet.warning "THOMAS: is here"

    (I tend to use my name in debugging to have something easy to grep for, and I wanted some verification that the File dump wasn’t triggering any errors)
    It took a little time at 3AM to remember where these /tmp files end up thanks to systemd (whose PrivateTmp feature gives services their own private /tmp, stored under /tmp/systemd-private-*), but once found, I saw it was a json blob with a command to “replace catalog”. That could explain why my puppetdb didn’t have any catalogs for other hosts. But file told me this was a plain ASCII file, so that didn’t help me narrow it down.

    I brute forced it by just checking my whole puppet tree:


    find . -type f -exec file {} \; > /tmp/puppetfile
    grep -v ASCII /tmp/puppetfile | grep -v git

    This turned up a few UTF-8 candidates. Googling around, I was reminded about how terrible utf-8 handling was in ruby 1.8, and saw information that puppet recommended using ASCII only in most of the manifests and files to avoid issues.
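
    (A quicker alternative, if you have GNU grep, is to search for the non-ASCII bytes directly:

        grep -rlP '[^\x00-\x7f]' . | grep -v git

    which lists every file containing a byte outside the ASCII range.)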

    It turned out to be a config from a webalizer module:

    webalizer/templates/webalizer.conf.erb: UTF-8 Unicode text

    While it was written by a Jesús with a unicode name, the file itself didn’t have his name in it, and I couldn’t obviously find where the UTF-8 chars were hiding. One StackOverflow post later, I had nailed it down – UTF-8 spaces!

    00004ba0 2e 0a 23 c2 a0 4e 6f 74 65 20 66 6f 72 20 74 68 |..#..Note for th|
    00004bb0 69 73 20 74 6f 20 77 6f 72 6b 20 79 6f 75 20 6e |is to work you n|

    The offending character is c2 a0, the UTF-8 encoding of the non-breaking space.

    I have no idea how that slipped into a comment in a config file, but I changed the spaces and got rid of the error.

    Puppet’s error was vague, did not provide any context whatsoever (Where do the bytes come from? Dump the part that is parseable? Dump the hex representation? Tell me the position in it where the problem is?), did not give any indication of the potential impact, and in a sea of spurious puppet warnings that you simply have to live with, is easy to miss. One down.

    However, still no catalogs on the server, so still only one host being monitored. What next?

    users, groups, and permissions

    Chasing my next lead turned out to be my own fault. After turning off SELinux temporarily, checking all permissions on all puppetdb files to make sure that they were group-owned by puppetdb and writable for puppet, I took the last step of switching to that user role and trying to write the log file myself. And it failed. Huh? And then id told me why – while /var/log/puppetdb/ was group-writeable and owned by puppetdb group, my puppetdb user was actually in the www-data group.

    It turns out that I had tried to move some uids and gids around after the automatic assignment puppet does gave different results on two hosts (a problem I still don’t have a satisfying answer for, as I don’t want to hard-code uids/gids for system accounts in other people’s modules), and clearly I did one of them wrong.

    I think a server that for whatever reason cannot log should simply not start, as this is a critical error if you want a defensive system.

    After fixing that properly, I now had a puppetdb log file.

    resource titles

    Now I was staring at an actual exception:

    2016-10-09 14:39:33,957 ERROR [c.p.p.command] [85bae55f-671c-43cf-9a54-c149cedec659] [replace catalog] Fatal error on attempt 0
    java.lang.IllegalArgumentException: Resource '{:type "File", :title "/var/lib/puppet/concat/thomas_vimrc/fragments/75_thomas_vimrc-\" allow adding additional config through .vimrc.local_if filereadable(glob(\"~_.vimrc.local\"))_\tsource ~_.vimrc.local_endif_"}' has an invalid tag 'thomas:vimrc-" allow adding additional config through .vimrc.local
    if filereadable(glob("~/.vimrc.local"))
    source ~/.vimrc.local
    endif
    '. Tags must match the pattern /\A[a-z0-9_][a-z0-9_:\-.]*\Z/.
        at com.puppetlabs.puppetdb.catalogs$validate_resources.invoke(catalogs.clj:331) ~[na:na]

    Given the name of the command (replace catalog), I felt certain this was going to be the problem standing between me and multiple hosts being monitored.

    The problem was a few levels deep, but essentially I had code creating fragments of vimrc files using the concat module, and was naming the resources with file content as part of the title. That’s not a great idea, admittedly, but no other part of puppet had ever complained about it before. Even the files on my file system that store the fragments, which get their filename from these titles, happily stored with a double quote in its name.

    So yet again, puppet’s lax approach to specifying types of variables at any of its layers (hiera, puppet code, ruby code, ruby templates, puppetdb) in any of its data formats (yaml, json, bytes for strings without encoding information) triggers errors somewhere in the stack without informing whatever triggered that error (ie, the agent run on the client didn’t complain or fail).

    Once again, puppet has given me plenty of reasons to hate it with a passion, tipping the balance.

    I couldn’t imagine doing server management without a tool like puppet. But you love it when you don’t have to tweak it much, and you hate it when you’re actually making extensive changes. Hopefully after today I can get back to the loving it part.


    by Thomas at October 09, 2016 08:31 PM

    October 07, 2016

    Itamar Turner-Trauring

    More learning, less time: how to quickly gather new tools and techniques

    Update: Added newsletters to the list.

    Have you ever worked hard to solve a problem, only to discover a few weeks later an existing design pattern that was even better than your solution? Or built an internal tool, only to discover an existing tool that already solved the problem?

    To be a good software engineer you need a good toolbox. That means software tools you can use when appropriate, design patterns so you don't have to reinvent the wheel, testing techniques... the list goes on. Learning all existing tools and techniques is impossible, and just keeping up with every newly announced library would be a full time job.

    How do you learn what you need to know to succeed at your work? And how can you do so without spending a huge amount of your free time reading and programming just to keep up?

    A broad toolbox, the easy way

    To understand how you can build your toolbox, consider the different levels of knowledge you can have. You can be an expert on a subject, or you can have some basic understanding, or you might just have a vague awareness that the subject exists.

    For our purposes building awareness is the most important of the three. You will never be an expert in everything, and even basic understanding takes some time. But broad awareness takes much less effort: you just need to remember small amounts of information about each tool or technique.

    You don't need to be an expert on a tool or technique, or even use it at all. As long as you know a tool exists you'll be able to learn more about it when you need to.

    For example, there is a tool named Logstash that moves server logs around. That's pretty much all you have to remember about it, and it takes just 3 seconds to read that previous sentence. Maybe you'll never use that information... or maybe one day you'll need to get logs from a cluster of machines to a centralized location. At that point you'll remember the name "Logstash", look it up, and have the motivation to actually go read the documentation and play around with it.

    Design patterns and other techniques take a bit more effort to gain useful awareness, but still, awareness is usually all you need. For example, property-based testing is hugely useful. But all it takes is a little reading to gain awareness, even if it will take more work to actually use it.

    The more tools and techniques you are aware of the more potential solutions you will have to the problems you encounter while programming. Being aware of a broad range of tools and techniques is hugely valuable and easy to achieve.

    Building your toolbox

    How do you build your toolbox? How do you find the tools and techniques you need to be aware of? Here are three ways to do so quickly and efficiently.

    Newsletters

    A great way to learn new tools and techniques is newsletters like Ruby Weekly. There are newsletters on many languages and topics, from DevOps to PostgreSQL.

    Newsletters typically include not just links but also short descriptions, so you can skim them and gain awareness even without reading all the articles. In contrast, sites like Reddit or Hacker News only include links, so you gain less information unless you spend more time reading.

    The downside of newsletters is that they focus on the new. You won't hear about a classic design pattern or a standard tool unless someone happens to write a new blog post about it. You should therefore rely on additional sources as well.

    Conference proceedings

    Another broader source of tools and techniques are conferences. Conference talks are chosen by a committee with some understanding of the conference subject. Often they can be quite competitive: I believe the main US Python conference accepts only a third of proposals. And good conferences will aim for a broad range of talks, within the limits of their target audience. As a result conferences are a great way to discover relevant, useful tools and techniques, both new and old.

    Of course, going to a conference can be expensive and time consuming. Luckily you don't have to go to the conference to benefit.

    Just follow this quick procedure:

    1. Find a conference relevant to your interests. E.g. if you're a Ruby developer find a conference like RubyConf.
    2. Skim the talk descriptions; they're pretty much always online.
    3. If something sounds really interesting, there's a decent chance you can find a recording of the talk, or at least the slides.
    4. Mostly however you just need to see what people are talking about and make a mental note of things that sound useful or interesting.

    For example, skimming the RubyConf 2016 program I see there's something called OpenStruct for dynamic data objects, FactoryGirl which is apparently a testing-related library, a library for writing video games, an explanation of hooks and so on. I'm not really a Ruby programmer, but if I ever want to write a video game in Ruby I'll go find that talk.

    Meetups and user groups

    Much like conferences, meetups are a great way to learn about a topic. And much like conferences, you don't actually have to go to the meetup to gain awareness.

    For example, the Boston Python Meetup has had talks in recent months about CPython internals, microservices, BeeKeeper which is something for REST APIs, the Plone content management system, etc..

    I've never heard of BeeKeeper before, but now I know its name and subject. That's very little information, gained very quickly... but next time I'm building a REST API with Python I can go look it up and see if it's useful.

    If you don't know what a "REST API" is, well, that's another opportunity for growing your awareness: do a Google search and read a paragraph or two. If it's relevant to your job, keep reading. Otherwise, make a mental note and move on.

    Book catalogs

    Since your goal is awareness, not in-depth knowledge, you don't need to read a book to gain something: the title and description may be enough. Technical book publishers are in the business of publishing relevant books, so browsing their catalog can be very educational.

    For example, the Packt book catalog will give you awareness of a long list of tools you might find useful one day. You can see that "Unity" is something you use for game development, "Spark" is something you use for data science, etc.. Spend 20 seconds reading the Spark book description and you'll learn Spark does "statistical data analysis, data visualization, predictive modeling" for "Big Data". If you ever need to do that you now have a starting point for further reading.

    Using your new toolbox

    There are only so many hours in the day, so many days in a year. That means you need to work efficiently, spending your limited time in ways that have the most impact.

    The techniques you've just read do exactly that: you can learn more in less time by spending the minimum necessary to gain awareness. You only need to spend the additional time to gain basic understanding or expertise for those tools and techniques you actually end up using. And having a broad range of tools and techniques means you can get more done at work, without reinventing the wheel every time.

    You don't need to work evenings or weekends to be a successful programmer! This post covers just some of the techniques you can use to be more productive within the limits of a normal working week. To help you get there I'm working on a book, The Programmer's Guide to a Sane Workweek.

    Sign up in the email subscription form below to learn more about the book, and to get notified as I post more tips and tricks on how you can become a better software engineer.

    October 07, 2016 04:00 AM

    September 24, 2016

    Hynek Schlawack

    Sharing Your Labor of Love: PyPI Quick and Dirty

    A completely incomplete guide to packaging a Python module and sharing it with the world on PyPI.

    by Hynek Schlawack (hs@ox.cx) at September 24, 2016 12:00 PM

    September 17, 2016

    Glyph Lefkowitz

    Hitting The Wall

    I’m an introvert.

    I say that with a full-on appreciation of just how awful thinkpieces on “introverts” are.

    However, I feel compelled to write about this today because of a certain type of social pressure that a certain type of introvert faces. Specifically, I am a high-energy introvert.

    Cementing this piece’s place in the hallowed halls of just awful thinkpieces, allow me to compare my mild cognitive fatigue with the plight of those suffering from chronic illness and disability1. There’s a social phenomenon associated with many chronic illnesses, “but you don’t LOOK sick”, where well-meaning people will look at someone who is suffering, with no obvious symptoms, and imply that they really ought to be able to “be normal”.

    As a high-energy introvert, I frequently participate in social events. I go to meet-ups and conferences and I engage in plenty of public speaking. I am, in a sense, comfortable extemporizing in front of large groups of strangers.

    This all sounds like extroverted behavior, I know. But there’s a key difference.

    Let me posit two axes for personality type: on the X axis, “introvert” to “extrovert”, and on the Y, “low energy” up to “high energy”.

    The X axis describes what kinds of activities give you energy, and the Y axis describes how large your energy reserves are for the other type.

    Notice that I didn’t say which type of activity you enjoy.

    Most people who would self-describe as “introverts” are in the low-energy/introvert quadrant. They have a small amount of energy available for social activities, which they need to frequently re-charge by doing solitary activities. Because they so frequently run out of energy for social activities, they don’t tend to enjoy them.

    Most people who would self-describe as “extroverts” are also on the “low-energy” end of the spectrum. They have low levels of patience for solitary activity, and need to re-charge by spending time with friends, going to parties, etc, in order to have the mental fortitude to sit still for a while and focus. Since they can endlessly get more energy from the company of others, they tend to enjoy social activities quite a bit.

    Therefore we have certain behaviors we expect to see from “introverts”. We expect them to be shy, and quiet, and withdrawn. When someone who behaves this way has to bail on a social engagement, this is expected. There’s a certain affordance for it. If you spend a few hours with them, they may be initially friendly but will visibly become uncomfortable and withdrawn.

    This “energy” model of personality is of course an oversimplification - it’s my personal belief that everyone needs some balance of privacy and socialization and solitude and eventually overdoing one or the other will be bad for anyone - but it’s a useful one.

    Because I am a high-energy introvert, my behavior often confuses people. I’ll show up at a week’s worth of professional events, be the life of the party, go out to dinner at all of them, and then disappear for a month. I’m not visibly shy - quite the opposite, I’m a gregarious raconteur. In fact, I quite visibly enjoy the company of friends. So, usually, when I try to explain that I am quite introverted, this claim is met with (quite understandable) skepticism.

    In fact, I am quite functionally what society expects of an “extrovert” - until I hit the wall.


    In endurance sports, one is said to “hit the wall” at the point where all the short-term energy reserves in one’s muscles are exhausted, and there is a sudden, dramatic loss of energy. Regardless, many people enjoy endurance sports; part of the challenge is properly managing your energy.

    This is true for me and social situations. I do enjoy social situations quite a bit! But they are nevertheless quite taxing for me, and without prolonged intermissions of solitude, eventually I get to the point where I can no longer behave as a normal social creature without an excruciating level of effort and anxiety.

    Several years ago, I attended a prolonged social event[2] where I hit the wall, hard. The event itself was several hours too long for me, involved meeting lots of strangers, and in the lead-up to it I hadn’t had a weekend to myself for a few weeks due to work commitments and family stuff. Towards the end I noticed I was developing a completely flat affect, and had to start very consciously performing even basic body language, like looking at someone while they were talking or smiling. I’d never been so exhausted and numb in my life; at the time I thought I was just stressed from work.

    Afterwards though, I started having a lot of weird nightmares, even during the daytime. This concerned me, since I’d never had such a severe reaction to a social situation, and I didn’t have good language to describe it. It was also a little perplexing that what was effectively a nice party, the first half of which had even been fun for me, would cause such a persistent negative reaction after the fact. After some research, I eventually discovered that such involuntary thoughts are a hallmark of PTSD.

    While I’ve managed to avoid this level of exhaustion before or since, this was a real learning experience for me that the consequences of incorrectly managing my level of social interaction can be quite severe.

    I’d rather not do that again.


    The reason I’m writing this, though[3], is not to avoid future anxiety. My social energy reserves are quite large enough, and I now have enough self-knowledge, that it is extremely unlikely I’d ever find myself in that situation again.

    The reason I’m writing is to help people understand that I’m not blowing them off because I don’t like them. Many times now, I’ve declined or bailed on an invitation from someone, and later heard that they felt hurt that I was passive-aggressively refusing to be friendly.

    I certainly understand this reaction. After all, if you see someone at a party and they’re clearly having a great time and chatting with everyone, but then when you invite them to do something, they say “sorry, too much social stuff”, that seems like a pretty passive-aggressive way to respond.

    You might even still be skeptical after reading this. “Glyph, if you were really an introvert, surely, I would have seen you looking a little shy and withdrawn. Surely I’d see some evidence of stage fright before your talks.”

    But that’s exactly the problem here: no, you wouldn’t.

    At a social event, since I have lots of energy to begin with, I’ll build up a head of steam on burning said energy that no low-energy introvert would ever risk. If I were to run out of social-interaction-juice, I’d be in the middle of a big crowd telling a long and elaborate story when I found myself exhausted. If I hit the wall in that situation, I can’t feel a little awkward and make excuses and leave; I’ll be stuck creepily faking a smile like a sociopath and frantically looking for a way out of the conversation for an hour, as the pressure from a large crowd of people rapidly builds up months’ worth of nightmare fuel from my spiraling energy deficit.

    Given that I know that’s what’s going to happen, you won’t see me when I’m close to that line. You won’t be at my desk when I silently sit and type for a whole day, or on my couch when I quietly read a book for ten hours at a time. My solitary side is, by definition, hidden.

    But, if I don’t show up to your party, I promise: it’s not you, it’s me.


    [1] In all seriousness: this is a comparison of kind and not of degree. I absolutely do not have any illusions that my minor mental issues are a serious disability. They are - by definition, since I do not have a diagnosis - subclinical. I am describing a minor annoyance and frequent miscommunication in this post, not a personal tragedy.

    [2] I’ll try to keep this anonymous, so hopefully you can’t guess - I don’t want to make anyone feel bad about this, since it was my poor time-management and not their (lovely!) event which caused the problem.

    [3] ... aside from the hope that maybe someone else has had trouble explaining the same thing, and this will be a useful resource for them ...

    by Glyph at September 17, 2016 09:18 PM