Planet Twisted

July 30, 2014

Duncan McGreggor

OSCON 2014 Theme Song - Andrew Sorensen's Live Coding Keynote

Andrew Sorensen live-coding at OSCON 2014
Keynote

Shortly after Andrew Sorensen began the performance segment of his keynote at OSCON 2014, the #oscon Twitter topic began erupting with posts about the live coding session. Comments, retweets, and additional links persisted for that day and the next. In short, Andrew was a hit :-)

My first encounter with Andrew's work was a few years ago, when I was getting back into Lisp. I was playing with generative music in Overtone (and then, a bit later, experimenting with SuperCollider, Hy, and Twisted) and came across his piece A Study in Keith. You might want to take a break from reading this post and watch that now ...

When Andrew started up his presentation, I didn't immediately recognize him. In fact, when the code was displayed on the big screens, I assumed it was Clojure until I looked closely and saw he was using (define ...) and not (defun ...). This seemed very familiar, and then I remembered Impromptu, which ultimately led to my discovery of Extempore (see more links below) and the realization that this is what Andrew was using to live code.

At the end of the performance a bunch of us jumped up and gave a standing ovation. (In fact, you can hear me yell out "YEAH" at the end of his presentation when he says "And there we go.") It was quite a show. It seemed OSCON 2014 had been given a theme song. The next step was getting the source code ...


Andrew's gist (Dark Github Theme)
Sharing the Code

Andrew gave a presentation on Extempore in the ballroom right after the keynote. This too was fantastic and resulted in much tweeting.

Afterwards a bunch of us went up front and chatted with him, enthusing about his work, the recent presentation, the keynote, and his previously published pieces.

I had Andrew's ear for a moment, and asked him if he was interested in sharing his keynote source -- there had been several requests for it on Twitter (that also got retweeted and/or favourited). Without hesitation, he gave an enthusiastic "yes" and we were off and running for the lounge where we could sit down to create a gist (and grab a cappuccino!). The availability of the source was announced immediately, to the delight of many.


Setting Up Extempore

Sublime Text 3 connected to Extempore
Later that night in my hotel room, I had time to download and run Extempore ... and discovered that I couldn't actually play the keynote code, since there was some implicit setup I was missing. However, after some digging around on the docs site and the mailing list, music was pouring forth from my laptop -- to my great joy :-D

To ensure anyone else who is not familiar with Extempore can also have this pleasure, I've put together all of the prerequisites and setup necessary in a forked gist, in multiple parts. I will go through those in this blog post. Also: all of my testing and live coding was done using Ben Swift's Extempore Sublime Text plugin.

The first step is getting all the dependencies. You'll want to start the downloads right away, since they are large (the sample files are compressed .wavs). While that's going on, you can install Extempore using Homebrew (this worked for me on Mac OS X with no additional tweaking/configuration necessary):

With Extempore running, let's do some setup. We're going to need to:

  • load some libraries (these take a while to compile),
  • define some samples, and then
  • define some musical note aliases for convenience (and visual clarity).
The easiest way to use the files below is to clone the gist repo and load them up in Sublime Text, executing blocks of text by highlighting them and then pressing ^x^x.

Here is the file for the first two bullets mentioned above:


You will need to edit this file to point to the locations where your samples were downloaded. Also, at the very end there are some lines of code you can execute to make sure that your samples are working.

Now let's define the note aliases. You can just highlight the entire contents of this file in Sublime Text and then ^x^x:
At this point, we're ready to play!


Playing the Music

To get started on the music, open up the fourth file from the clone of the gist and ^x^x the root, scale, and left-hand-notes-* constants.

Here is the evolution of the left hand part:
Go ahead and start that first one playing (^x^x the definition as well as the call). Wait for a bit, and then execute the next one, etc. Once you've started playing the final left hand form, you can switch to the wider range of notes defined/updated at the bottom.

Next, you'll want to bring in the right hand ... then bassline ... then the higher fmsynth sparkles for the right hand:

Then you'll increase the energy with the drum section:

Finally, you'll bring it to the climax, and then start the gentle fade out:

A slightly modified code listing for the final keynote form is here:


Variation on a Theme

I have recorded a variation of Andrew's keynote based on the code above, for your listening pleasure :-) You can listen to it in your browser or download it.

This version plays part of the left hand piano an octave lower. There's a tiny bit of clipping in places, and I accidentally jazzed it up (and for too long!) with a hi-hat change in the middle. There are also some awkward transitions and volume oddities. However, these should serve as inspiration for you to make your own variation of the OSCON 2014 Theme Song :-)

The "script" used for the recording can found here.


Links of Note

Some of these were mentioned above, some haven't been. All relate to Extempore :-)


by Duncan McGreggor (noreply@blogger.com) at July 30, 2014 06:14 AM

July 29, 2014

Duncan McGreggor

The Future of Programming - Adopting The Functional Paradigm?

Series Links

Survivors' Breakfast

The previous post covered some thoughts on the future-looking programming themes present at OSCON 2014.

Following that wonderful conference, long-time Open Source advocate, Pythonista, and instructor Steve Holden, was kind enough to host his third annual "OSCON Survivors' Breakfast" with tens of esteemed attendees, speakers, and organizers enjoying great company and conversation, relaxing together after the flurry of conference activity, planning a leisurely day in Portland, and -- most immediately -- having some much-needed breakfast.

The view from the 23rd floor was quite an eyeful, and the conversation ranged across equally panoramic topics. Sitting with Alex Martelli, Anna Ravenscroft, and Katie Miller, the conversation inevitably turned to thoughts programmatical. One thread of the discussion was so compelling that it helped crystallize this series of blog posts. That was kicked off with Katie's question:

Why [have some large companies] not embraced functional programming to the extent that other large ones have?

Multiple points of discussion spawned from this, some of which still continue. The rest of this post explores these. 


Large Companies?

What constitutes a large company? We settled on discussing Fortune 500 companies, which, by definition, are:
  • U.S. companies
  • ranked by gross revenue (after adjustments for excise taxes).

Afterwards, I looked up the 2013 top 25 tech companies in the Fortune 500. I've listed them below; in parentheses is the Fortune 500 ranking. After the dash are the functional programming languages used on various company projects -- these are listed only if I have talked to someone who has worked on a project (or interviewed for a job that used the language), or if I have read an article by an employee who has stated that they use the listed language(s) [1].
  1. Apple (6) - Swift, Clojure, Scala
  2. AT&T (11) - Haskell
  3. HP (15) - F#, Scala
  4. Verizon Communications (16) - Scala
  5. IBM (20) - Scala
  6. Microsoft (35) - F#, F*
  7. Comcast (46) - Scala
  8. Amazon (49) - Haskell, Scala, Erlang
  9. Dell (51) - Erlang, Scala
  10. Intel (54) - Haskell, SML, PLT Scheme
  11. Google (55) - Haskell [2]
  12. Cisco (60) - Scala
  13. Ingram Micro (76) - ?
  14. Oracle (80) - Scala
  15. Avnet (117) - ?
  16. Tech Data (119) - ?
  17. Emerson Electric (123) - ?
  18. Xerox (131) - Scala
  19. EMC (133) - Scala
  20. Arrow Electronics (141) - ?
  21. Century Link (150) - ?
  22. Computer Sciences Corp. (176) - ?
  23. eBay (196) - Scala 
  24. TI (218) - ?
  25. Western Digital (222) - ?

The companies that have committed to projects -- ones guessed to be of significant business value -- written in FP languages include Apple, HP, and eBay, and possibly also Oracle and Intel. So, a rough estimate is that between 3 and 5 of the top 25 U.S. tech companies have made a significant investment in FP.

Why not Google?

The next two sections offer summaries of some views on this.


Ideal Use Case?

Is an FP language suitable for large organisations? Are smaller companies better served by them? During breakfast, it was postulated that dealing with such things as immutable data, handling I/O in pure FP languages, and creating/using higher-order functions is easier for small startups due to the shorter amount of time required to hire or train a critical mass of skilled programmers.

It is certainly true that it will take larger organisations longer to train their personnel, simply due to sheer numbers and, even with enough trainers, logistics. But this argument can be made for any corporate level of instruction; in my book, this cancels out on both sides and is not an argument unique to hard topics, much less one specifically pertinent to FP adoption.


Brain Fit?

I've heard this one a bit: "Some people just don't think in FP terms." They need loops and iteration, not higher order functions and recursion. Joel Spolsky makes reference to this in his article The Guerrilla Guide to Interviewing. In particular, he says that "For some reason most people seem to be born without the part of the brain that understands pointers." This has been applied to topics in FP as well as C.

To be fair, Joel's comment was probably made with a bit of lightness and not meant to be a statement on the nature of mind or a theory of cognition. The context of the article is a very practical one: hiring. When trying to identify whether a programmer would be an asset for your team, you're not living in the space of cognitive theory, rather you inhabit the realm of quick approximations, gut instincts, and fast, economical decisions.

Regardless, I find this perspective -- Type Physicalism [3] -- fairly objectionable. This is because I see it as a kind of intellectual "racism." Early social sciences utilized this form of reasoning to justify all sorts of discriminatory thinking in the name of "science", reinforcing a rigid mentality of "us" vs. "them." In my own experience, I've seen this sort of approach used to shut down exploration, to enforce elitism, and to dismiss ideas that threaten the authority of the status quo.

Rather than seeing the problem of comprehending FP as a physical limitation of the individual, I see instructional failure as the obstacle to overcome. If we start with the proposition that certain brains are deficient, we are essentially abandoning education. It is the responsibility of the instructor to engage creatively with each student's learning style. When adhering to the idea that certain brains are limited, one discards creative engagement; one doesn't even consider working with the students and their learning styles. This is a view that, however implicitly, can be used to shun diversity and dismiss potential.

I believe the essence of what Joel was shooting for can be approached in a much kinder fashion (adapted for an FP discussion):

None of us was born knowing GOTO statements, global state, mutable data, or for loops. There are many programmers alive, though, whose first contact with programming involved one or more of these. That's their "home town", as it were; their programmatic birthplace. Having utilized -- as well as taught -- imperative, OOP, and functional styles of programming, I do not feel that one is intrinsically any harder than another. However, they are sometimes so vastly different from each other in style or syntax or semantics that once a student has solidified around the concepts of a particular paradigm, it can be a challenge to retrain to work easily in another.


Why the Objections?

If both "ideal use case" and "brain fit" are given as arguments against adopting FP (or any other new paradigm) in large organisations, and neither are considered logically or philosophically valid, what's at the root of the resistance?

It is not uncommon for changes in an industry or field of study to be met with resistance. The bigger or more different the change from the status quo, the greater the resistance very often is. I suspect that this is really what we're seeing when companies take a stance against FP. There are very often valid business concerns: "we've made an investment in OOP" or "it will cost too much to train/hire/migrate to FP."

I would remind those company leaders, though, that new sources of revenue, product innovation, and changes in market adoption do not often come from maintaining or enforcing the current state. Instead, that is an identifying characteristic of companies whose relevance is fading.

Even if your company has market dominance or is a monopoly, there is still a good incentive for exploring alternative paradigms. At the very least, one can uncover inefficiencies and apply new knowledge to remove duplication of efforts, increase margins, etc.


Careers

As a manager, I have found that about half of the senior engineers up for promotion have very little to no interest in taking on different (new to them) programmatic paradigms. They consider current burdens sufficient (or too much) and would rather spend what little free time they have available to them in improving existing systems.

Senior engineers who have a more academic or research bent (or are easily bored) are much more likely to embrace this sort of change. Interestingly, senior engineers who have little to no competitive drive will more readily pick up something new if the need arises. This may be due to such things as not perceiving accumulated knowledge as territory to defend, for example.

Younger engineers with less experience (and less of an investment made in a particular school of thought) are much more willing to take on new challenges. I believe there are many reasons for this, one of which may include an interest in becoming more professionally competitive with their peers.

Junior or senior, I have found that programmers who are currently looking to find new employment are nearly invariably not only willing to take on the challenge of learning different paradigms, but are usually going about that proactively and engaging in self-study.

I want to work with programmers who can take on any problem space in any paradigm and find creative solutions, contributing as valued members of a team. This is certainly an ideal set of characteristics, but one that I have seen in the wilds of the workplace on multiple occasions. It has nothing to do with FP or OOP paradigms, but rather with the people themselves.

Even if a company is locked into well-established processes and views on programming, they may find it in their best interests to provide a more open-minded approach with their employees who would enjoy that. Their retention rates could very well increase dramatically.


Do We Need To?

Philosophy and hiring strategies aside, do we -- as programmers, software projects, or organizations that support programming -- need to take on the burden of learning or adopting functional programming? Quite possibly not.

If Google's plans around Go involve building a new operating system (in the spirit of 1970s C and UNIX), the systems programmers may find pure functions too cumbersome to work with. FP may be too burdensome a fit for that type of work.

If one is not tied to a historical analogy with UNIX, as Mozilla is not with Rust, doing something like creating a new browser engine (or running a remote services company) may be a good fit for FP, especially if one has data showing reduced error counts when using type systems.

As we shall see illustrated in the next post, the usual advice continues to apply: the decision of which paradigm to employ for any given project should be dictated by the best fit and not ideological inflexibility. The bearing this has on programming is innovation: it is the early adopters who have the best chance of leading us into the future.

Up next: Retrospective on Programming Paradigms
Previously: Themes at OSCON 2014


Footnotes

[1] If anyone has additional information as to which FP languages are used by these top 25 companies, please let me know, and I will include that information. Bonus points for knowing of business-critical applications.

[2] Google Switzerland are using Haskell.

[3] Type Physicalism is a form of reductive materialism, also known as the Mind-Brain Identity Theory, that does not allow for mental states to be realized in organisms or computational systems that do not have a brain. See "Criticisms of Type Physicality" at http://en.wikipedia.org/wiki/Identity_theory_of_mind#Multiple_realizability.

by Duncan McGreggor (noreply@blogger.com) at July 29, 2014 03:53 AM

July 28, 2014

Duncan McGreggor

The Future of Programming - Themes at OSCON 2014

Series Links


A Qualitative OSCON Debrief

As you might have noticed from the OSCON Twitter-storm this year, the conference was a blast. Even if you weren't physically present, given the 17 tracks, you can imagine that the presentations -- and subsequent conversations -- were deeply varied.

This was the second OSCON I'd attended; the first was in 2008, as a guest of Michael Bernstein, a friend who was speaking there. OSCON 2008 was a zoo - I'm not sure of the actual body count, but I've heard that attendees + vendors + miscellaneous topped 12,000 people over the course of the week (I would love to hear if someone has hard data on that -- googling didn't reveal much). OSCON 2008 was dominated by Big Data, Hadoop, endless buzzword bingo, and business posturing by all sorts. The most interesting bits of that conference were the outlines that formed around the conversations people weren't having. In fact, over the following 6 months, that's what I spent my spare time pondering: what people didn't say at OSCON.

This year's conference seemed like a completely different animal. It felt like easily a third to half the number of attendees of 2008. Where that one had all the anonymizing feel of rush-hour in a major metropolitan hub, OSCON 2014 had a distinctly small-town vibe to it -- I was completely charmed. Conversations (overheard as well as participated in) were not littered with examples from the latest bizspeak, but rather focused on essence. The interactions were not continually distracted, but rather steadily focused, allowing people to form, express, and dispute complete thoughts with their peers.


Conversations

So what were people talking about this year? Here are some of the topics I heard covered during lunches, in hallways, and at podiums; at pubs, in restaurants and at parks [1]:
  • What communities are thriving?
  • Which [projects, organisations, companies, etc.] are treating their people right?
  • What successful processes are being followed at [project, organisation, etc.]?
  • Who is hiring and why should someone want to work there?
  • Where can I go to learn X? Who is teaching X? Who shares the most about X?
  • Which [projects, organisations] support X?
  • Why don't more [people, projects, organisations] care about [possible future X]?
  • Why don't more [people, projects, organisations] spend more time investigating the history of X for "lessons learned"?
  • There was so much more X in computing during the 60s and 70s -- what happened? [2]
  • Why are we reinventing X?
  • When is X going to be invented, and who's going to do it?
  • Everything is changing! I can't keep up anymore.
  • I want to keep up, but how?
  • Why can't we stop making so many X?
  • Nobody cares about Y anymore; we're all doing X now.
  • Full stack developers!
  • Haskell!
  • Fault-tolerant systems!

After lots of reflection, here's how I classified most of the conversations I heard:
  • Developing communities,
  • Developing careers and/or personal/professional qualities, and
  • Developing software, 

along lines such as:
  • Effective maintenance, maturity, and health,
  • Focusing on the "art",  eventual mastery, and investments of time,
  • Tempering bare pragmatism with something resembling science or academic excellence,
  • Learning the new to bolster the old,
  • Inspiring innovation from a place of contemplation and analysis,
  • Mining the past for great ideas, and
  • Figuring out how to better share and spread the adoption of good ideas.


Themes

Generalized to such a degree, this could have been pretty much any congregation of interested, engaged minds since the dawn of civilization. So what does it look like if we don't normalize quite so much? Weighing these with what may well be my own bias (and the bias of like-minded peers), I submit to your review these themes:

  • A very strong interest in programming (thinking and creating) vs. integration (assessing and consuming).
  • An express desire to become better at abstraction (higher-order functions, composition, and types) to better deal with growing systems complexities.
  • An interest in building even more complicated systems.
  • A fear of reimplementing past mistakes or of letting dust gather on past intellectual achievements.

As you might have guessed, these number very highly among the reasons why the conference was such an unexpected pleasure for me. But it should also not come as a surprise that these themes are present:

  • We have had several years of companies such as Google and Amazon (AWS) building and deploying some of the most sophisticated examples of logic-made-manifest in human history. This has created perceived value in our industry and many wish to emulate it. Similarly, we have single purpose distributed systems being purchased for nearly 20 billion USD -- a different kind of complexity, with a different kind of perceived reward.
  • In the 70s and 80s, OOP adoption brought with it the ability to create large software systems in ways that people had not dared dream of, or that had been impractical to realize. Today's growing adoption of the Functional paradigm is giving early signs of allowing us to better integrate complex systems with more predictability and fewer errors.
  • Case studies of improvements in productivity, or in the capacity to handle highly complex or previously intractable problems with better abstractions, have ignited the passions of many. Not wanting to limit their scope of knowledge or sources of inspiration, people are not simply limiting themselves to the exploration of such things as Category Theory -- they are opening the vaults of computer science with such projects as Papers We Love.

There's a brave new world in the making. It's a world for programmers and thinkers, for philosophers and makers. There's a lot to learn, but it's really not so different from older worlds: the same passions drive us, the same idealism burns brightly. And it's nice to see that these themes arise not only in small, highly specialized venues such as university doctoral programs and StrangeLoop (or LambdaJam), but also in larger intersections of the industry like OSCON (or more general-audience ones like Meetups).

Up next: Adopting the Functional Paradigm?
Previously: An Overview


Footnotes

[1] It goes without saying that any one attendee couldn't possibly be exposed to enough conversations to form a perfectly accurate sense of the total distribution of conversation topics. No claim to the contrary is being made here :-)

[2] I strongly adhere to the multifaceted hypothesis proposed by Bret Victor here in the section titled "Why did all these ideas happen during this particular time period?"


by Duncan McGreggor (noreply@blogger.com) at July 28, 2014 05:25 AM

The Future of Programming - An Overview

Art by Philip Straub
There's a new series of blog posts coming, inspired by on-going conversations with peers, continuous inspection of the development landscape, habitual navel-gazing, and participation at the catalytic OSCON 2014. As you might have inferred, these will be on the topic of "The Future of Programming."

Not to be confused with Bret Victor's excellent talk last year at DBX, these posts will be less about individual technologies or developer user experience, and more about historic trends and viewing the present (and near future) through such a lens.

In this mini-series, the aim is to present posts on the following topics:

I did a similar set of posts, conceived in late 2008 and published in 2009, on the future of cloud computing, entitled After the Cloud. It was a very successful series, and the cloud industry seems to be heading towards some of the predictions made in it -- ZeroVM and Docker are an incremental step towards the future of distributed processes/functions outlined in To Atomic Computation and Beyond.

In that post, though, are two quotes from industry greats. These provide an excellent context for this series as well, hinting at an overriding theme:
  • Alan Kay, 1998: A crucial key to growing large systems is effective communications between components.
  • Joe Armstrong, 2004: To effectively model and solve problems in a distributed manner, we need concurrency... this is made easier when we isolate processes and do not share data.

In the decade since these statements were made, we have seen individuals, projects, and companies take that vision to heart -- and succeed as a result. But as an industry, we continue to struggle with the definition of our art; we are still tormented by change -- both internal and external -- and do not seem to adapt to it well.

These posts will peer into such places ... in the hope that such inspection might guide us better through the tangled forest of our present into the unimagined forest of our future.

by Duncan McGreggor (noreply@blogger.com) at July 28, 2014 05:17 AM

July 18, 2014

Moshe Zadka

Clash of Clans and the Prisoners’ Dilemma

I have recently started playing the online game "Clash of Clans". Clash of Clans has the option of "donating" troops to other clan members. Donated troops stay in a separate area (so they do not take up space for one's own troops) and also participate in base defense (as opposed to one's own troops, which are offensive only). In short, being donated to makes both one's offense and one's defense stronger. However, donating troops means paying the cost of training them while reaping no benefit from it.

Thus we have the pay-off matrix below: assume paying the cost of training troops is -1, and the benefit of receiving them is +2 (the benefit has to be bigger than the cost, since players do build troops for themselves with even less benefit):

Donate/Donate: 1/1
Don’t donate/Don’t donate: 0/0
Donate/Don’t donate: -1/2
Don’t donate/Donate: 2/-1

What I love about this is that it is a massive experiment in the prisoners' dilemma on more-or-less average people. As expected, one sees both equilibria: clans where hardly anyone donates, and clans where everyone donates. In the clans where hardly anyone donates, there is still some reciprocity: (some) people will try to donate "back" to people who donated to them. Since in practice the cost-benefit analysis turns that way, it is sometimes worthwhile to donate even for a .2 chance of reciprocation, which allows the loop to be fed.
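To make the cost-benefit arithmetic concrete, here is a minimal Python sketch. The numbers are the illustrative ones from above, and the reciprocation probability is just a parameter -- the point being that, in practice, the benefit of a donated troop is evidently large enough relative to its cost to make donating worthwhile even at a .2 chance of reciprocation:

```python
def donation_payoff(cost=1.0, benefit=2.0, p_reciprocate=0.2):
    """Expected payoff of a single donation, assuming the only return
    is a possible reciprocal donation with probability p_reciprocate."""
    return -cost + p_reciprocate * benefit

# Donating pays off once benefit/cost exceeds 1/p_reciprocate.
print(donation_payoff())              # -0.6 with the toy numbers above
print(donation_payoff(benefit=10.0))  # +1.0: worthwhile even at p = 0.2
```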

There are reputation effects (the number of troops one has donated and received appears where everyone can see it), which would be expected to improve the donation rate. However, from my anecdotal experience of being in two clans, the most distinguishing feature is "do you know these people in real life?" Clans made up of people whose reputation effects extend beyond the game are in the good equilibrium. Since I've joined the Facebook-employee-only clan, my life has been much better.

I started thinking that Facebook is like that too. In situations where you can help a colleague, where the benefit to the colleague would be disproportionate to the cost to oneself, we encourage that kind of thinking. Facebook is in the good equilibrium of the dilemma!

[Plug: We're hiring engineers and eng. managers. Please contact me on FB with further questions or resumes.]

by moshez at July 18, 2014 04:10 PM

Duncan McGreggor

Uncovering Inherent Structures in Organizations

Vladimir Levenshtein
This post should have a subtitle: "Using Team Analysis and Levenshtein Distance to Reveal said Structure." It's the first part of that subtitle that is the secret, though: being able to correctly analyze and classify individual teams. Without that, using something clever like Levenshtein distance isn't going to be very much help.

But that's coming in towards the end of the story. Let's start at the beginning.

What You're Going to See

This post is a bit long. Here are the sections I've divided it into:

  • What You're Going to See
  • Premise
  • Introducing ACME
  • Categorizing Teams
  • Category Example
  • Calculating the Levenshtein Distance of Teams
  • Sorting and Interpretation
  • Conclusion

However, you don't need to read the whole thing to obtain the main benefits. You can get the Cliff Notes version by reading the Premise, Categorizing Teams, Interpretation, and the Conclusion.

Premise

Companies grow. Teams expand. If you're well-placed in your industry and providing in-demand services or products, this is happening to you. Individuals and small teams tend to deal with this sort of change pretty well. At an organizational level, however, this sort of change tends to have an impact that can bring a group down, or rocket it up to the next level.

Of the many issues faced by growing companies (or rapidly growing orgs within large companies), the structuring one can be most problematic: "Our old structures, though comfortable, won't scale well with all these new teams and all the new hires joining our existing teams. How do we reorganize? Where do we put folks? Are there natural lines along which we can provide better management (and vision!) structure?"

The answer, of course, is "yes" -- but! It requires careful analysis and a deep understanding every team in your org.

The remainder of this post will set up a scenario and then figure out how to do a re-org. I use a software engineering org as an example, but that's just because I have a long and intimate knowledge of them and understand the ways in which one can classify such teams. These same methods could be applied to a Sales group, Marketing groups, etc., as long as you know the characteristics that define the teams of which these orgs are comprised.



Introducing ACME

ACME Corporation is the leading producer of some of the most innovative products of the 20th century. The CTO had previously tasked you, the VP of Software Development, to bring this product line into the digital age -- and you did! Your great ideas for the updated suite are the new hotness that everyone is clamouring for. Subsequently, the growth of your teams has been fast, and dare we say, exponential.

More details on the scenario: your Software Development Group has several teams of engineers, all working on different products or services, each of which supports ACME Corporation in different ways. In the past 2 years, you've built up your org by an order of magnitude in size. You've started promoting and hiring more managers and directors to help organize these teams into sensible encapsulating structures. These larger groups, once identified, would comprise the whole Development Group.

Ideally, the new groups would represent some aspect of the company, software development, engineering, and product vision -- in other words, some sensible clustering of teams doing related work. How would you group the teams in the most natural way?

Simply dividing along language or platform lines may seem like the obvious answer, but is it the best choice? There are some questions that can help guide you in figuring this out:
  • How do these teams interact with other parts of the company? 
  • Who are the stakeholders in feature development? 
  • Which sorts of customers does each team primarily serve?
There are many more questions you could ask (some are implicit in the analysis data linked below), but this should give a taste.

ACME Software Development has grown the following teams, some of which focus on products, some on infrastructure, some on services, etc.:
  • Digital Anvil Product Team
  • Giant Rubber Band App Team
  • Digital Iron Carrot Team
  • Jet Propelled Unicycle Service Team
  • Jet Propelled Pogo Stick Service Team
  • Ultimatum Dispatcher API Team
  • Virtual Rocket Powered Roller Skates Team
  • Operations (release management, deployments, production maintenance)
  • QA (testing infrastructure, CI/CD)
  • Community Team (documentation, examples, community engagement, meetups, etc.)

Early SW Dev team hacking the ENIAC

Categorizing Teams

Each of those teams started with 2-4 devs hacking on small skunkworks projects. They've now blossomed to the extent that each team has significant sub-teams working on new features and prototyping for the product they support. These large teams now need to be characterized using a method that will allow them to be easily compared. We need the ability to see how closely related one team is to another, across many different variables. (In the scheme outlined below, we end up examining 50 bits of information for each team.)

Keep in mind that each category should be chosen such that it would make sense for teams categorized similarly to be grouped together. A counter example might be "Team Size"; you don't necessarily want all large teams together in one group, and all small teams in a different group. As such, "Team Size" is probably a poor category choice.

Here are the categories which we will use for the ACME Software Development Group:
  • Language
  • Syntax
  • Platform
  • Implementation Focus
  • Supported OS
  • Deployment Type
  • Product?
  • Service?
  • License Type
  • Industry Segment
  • Stakeholders
  • Customer Type
  • Corporate Priority
Each category may be either single-valued or multi-valued. For instance, the categories ending in question marks will be booleans. In contrast, multiple languages might be used by the same team, so the "Language" category will sometimes have several entries.
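Before any encoding happens, you can picture each team's categorization as a simple record. Here is a hypothetical sketch in Python -- the team and all of its values are invented for illustration, not taken from the spreadsheet linked below:

```python
# Hypothetical categorization for one team: booleans for the "?" categories,
# lists for the multi-valued ones.
example_team = {
    "Language": ["LFE", "Python"],
    "Syntax": ["Lisp"],
    "Platform": ["Erlang VM"],
    "Implementation Focus": ["API"],
    "Supported OS": ["Linux"],
    "Deployment Type": ["Cloud"],
    "Product?": False,
    "Service?": True,
    "License Type": ["Proprietary"],
    "Industry Segment": ["Finance"],
    "Stakeholders": ["Engineering"],
    "Customer Type": ["Internal"],
    "Corporate Priority": ["Efficiency"],
}
```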

Category Example

(Things are going to get a bit more technical at this point; for those who care more about the outcomes than the methods used, feel free to skip to the section at the end: Sorting and Interpretation.)

In all cases, we will encode these values as binary digits -- this allows us to very easily compare teams using Levenshtein distance, since the total of all characteristics we are filtering on can be represented as a string value. An example should illustrate this well.

(The Levenshtein distance between two words is the minimum number of single-character edits -- such as insertions, deletions or substitutions -- required to change one word into the other. It is named after Vladimir Levenshtein, who defined this "distance" in 1965 when exploring the possibility of correcting deletions, insertions, and reversals in binary codes.)
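For readers who want to follow along without LFE, here is a minimal Python sketch of that calculation (the post's own code uses the Levenshtein functions from the lfe-utils project mentioned below):

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or
    substitutions required to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            insert = current[j - 1] + 1
            delete = previous[j] + 1
            substitute = previous[j - 1] + (ca != cb)
            current.append(min(insert, delete, substitute))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))  # 3
```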

Let's say the Software Development Group supports the following languages, with each one assigned a binary value:
  • LFE - #b0000000001
  • Erlang - #b0000000010
  • Elixir - #b0000000100
  • Ruby - #b0000001000
  • Python - #b0000010000
  • Hy - #b0000100000
  • Clojure - #b0001000000
  • Java - #b0010000000
  • JavaScript - #b0100000000
  • CoffeeScript - #b1000000000
A team that used LFE, Hy, and Clojure would obtain its "Language" category value by XOR'ing the three supported languages, and would thus be #b0001100001. In LFE, that could be done by entering the following code at the REPL:


We could then compare this to a team that used just Hy and Clojure (#b0001100000), which has a Levenshtein distance of 1 with the previous language category value. A team that used Ruby and Elixir (#b0000001100) would have a Levenshtein distance of 5 with the LFE/Hy/Clojure team (which makes sense: a total of 5 languages between the two teams with no languages shared in common). 
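The REPL snippets here live in embedded gists; as a stand-in, here is a rough Python equivalent of the encoding step, using the binary values from the language table above. Feeding the resulting strings to the levenshtein sketch from earlier gives the distances just described:

```python
from functools import reduce
from operator import or_

# Per-language bits from the table above.
LFE, ELIXIR, RUBY, HY, CLOJURE = 0b1, 0b100, 0b1000, 0b100000, 0b1000000

def encode(*languages):
    """OR the per-language bits together and render them as a 10-bit string."""
    return format(reduce(or_, languages), "010b")

print(encode(LFE, HY, CLOJURE))  # '0001100001'
print(encode(HY, CLOJURE))       # '0001100000' -> distance 1 from the above
print(encode(RUBY, ELIXIR))      # '0000001100' -> distance 5 from the above
```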


Calculating the Levenshtein Distance of Teams

As a VP who is keen on deeply understanding your teams, you have put together a spreadsheet with a break-down of not only the languages used in each team, but lots of other categories, too. For easy reference, you've put a "legend" for the individual category binary values at the bottom of the linked spreadsheet.

In the third table on that sheet, all of the values for each column are combined into a single binary string. This (or a slight modification of this) is what will be the input to your calculations. Needless to say, as a complete fan of LFE, you will be writing some Lisp code :-)

Partial view of the spreadsheet's first page.
(If you would like to try the code out yourself while reading, and you have lfetool installed, simply create a new project and start up the REPL: $ lfetool new library ld; cd ld && make-shell
That will download and compile the dependencies for you. In particular, you will then have access to the lfe-utils project -- which contains the Levenshtein distance functions we'll be using. You should be able to copy-and-paste functions, vars, etc., into the REPL from the Github gists.)

Let's create a couple of data structures that will allow us to more easily work with the data you collected about your teams in the spreadsheet:


We can use a quick copy and paste into the LFE REPL for two of those numbers to do a sanity check on the distance between the Community Team and the Digital Iron Carrot Team:


That result doesn't seem unreasonable, given that at a quick glance we can see both of these strings have many differences in their respective character positions.

It looks like we're on solid ground, then, so let's define some utility functions to more easily work with our data structures:


Now we're ready to roll; let's try sorting the data based on a comparison with one of the teams:


It may not be obvious at first glance, but what the levenshtein-sort function did for us is compare our "control" string to every other string in our data set, providing both the distance and the string that the control was compared to. The first entry in the results is our control string, and we see what we would expect: the Levenshtein distance with itself is 0 :-)
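In lieu of the gist, here is a rough Python equivalent of that sort. The team names and bit strings below are placeholders rather than the spreadsheet data, and `levenshtein` is the function from the sketch earlier in this post:

```python
def levenshtein_sort(control, teams):
    """Compare the control string to every team's bit string, sorted by distance."""
    return sorted((levenshtein(control, bits), name) for name, bits in teams.items())

# Placeholder data standing in for the per-team binary strings.
teams = {
    "Digital Iron Carrot Team": "0001100001",
    "QA":                       "0001101001",
    "Community Team":           "0110011110",
}
for distance, name in levenshtein_sort(teams["Digital Iron Carrot Team"], teams):
    print(distance, name)
# The control team comes first with a distance of 0, as expected.
```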

The result above is not very easily read by most humans ... so let's define a custom sorter that will take human-readable text and then output the same, after doing a sort on the binary strings:


(If any of that doesn't make sense, please stop in and say "hello" on the LFE mail list -- ask us your questions! We're a friendly group that loves to chat about LFE and how to translate from Erlang, Common Lisp, or Clojure to LFE :-) )



Sorting and Interpretation

Before we try out our new function, we should ponder which team will be compared to all the others -- the sort results will change based on this choice. Looking at the spreadsheet, we see that the "Digital Iron Carrot Team" (DICT) has some interesting properties that make it a compelling choice:
  • it has stakeholders in Sales, Engineering, and Senior Leadership;
  • it has a "Corporate Priority" of "Business critical"; and
  • it has both internal and external customers.
Of all the products and services, it seems to be the biggest star. Let's try a sort now, using our new custom function -- inputting something that's human-readable: 


Here we're making the request "Show me the sorted results of each team's binary string compared to the binary string of the DICT." Here are the human-readable results:


For a better visual on this, take a look at the second tab of the shared spreadsheet. The results have been applied to the collected data there, and then colored by major groupings. The first group shares these things in common:
  • Lisp- and Python-heavy
  • Middleware running on BSD boxen
  • Mostly proprietary
  • Externally facing
  • Focus on apps and APIs
It would make sense to group these three together.

A sort (and thus grouping) by comparison to critical team.
Next on the list is Operations and QA -- often a natural pairing, and this process bears out such conventional wisdom. These two are good candidates for a second group.

Things get a little trickier at the end of the list. Depending upon the number of developers in the Java-heavy Giant Rubber Band App Team, they might make up their own group. However, both that one and the next team on the list have frontend components written in Angular.js. They both are used internally and have Engineering as a stakeholder in common, so let's go ahead and group them.

The next two are cloud-deployed Finance APIs running on the Erlang VM. These make a very natural pairing.

Which leaves us with the oddball: the Community Team. The Levenshtein distance for this team is the greatest of all the teams ... but don't be misled. Because it has something in common with all teams (the Community Team supports every product with docs, example code, Sales and TAM support, evangelism for open source projects, etc.), it will have many differing bits with each team. This really should be in a group all its own so that structure represents reality: all teams depend upon the Community Team. A good case could also probably be made for having the manager of this team report directly up to you.

The other groups should probably have directors that the team managers report to (keeping in mind that the teams have grown to anywhere from 20 to 40 per team). The director will be able to guide these teams according to your vision for the Software Group and the shared traits/common vision you have uncovered in the course of this analysis.

Let's go back to the Community Team. Perhaps in working with them, you have uncovered a hidden fact: the community interactions your devs have are seriously driving market adoption through some impressive and passionate service and open source docs+evangelism. You are curious how your teams might be grouped if sorted from the perspective of the Community Team.

Let's find out!


As one might expect, most of the teams remain grouped in the same way ... the notable exception being the split-up of the Anvil and Rubber Band teams. Mostly no surprises, though -- the same groupings persist in this model.

A sort (and thus grouping) by comparison to highly-connected team.
To be fair, if this is something you'd want to fully explore, you should bump the "Corporate Priority" for the Community Team much higher, recalculate its overall bits, regenerate your data structures, and then re-sort. It may not change too much in this case, but you'd be applying consistent methods, and that's definitely the right thing to do :-) You might even see the Anvil and Rubber Band teams get back together (left as an exercise for the reader).

As a last example, let's throw caution and good sense to the wind and get crazy. You know, like the many times you've seen bizarre, anti-intuitive re-orgs done: let's do a sort that compares a team of middling importance and a relatively low corporate impact with the rest of the teams. What do we see then?


This ruins everything. Well, almost everything: the only group that doesn't get split up is the middleware product line (Jet Propelled and Iron Carrot). Everything else suffers from a bad re-org.

A sort (and thus grouping) by comparison to a non-critical team.

If you were to do this because a genuine change in priority had occurred, where the Giant Rubber Band App Team was now the corporate leader/darling, then you'd need to recompute the bit values and do re-sorts. Failing that, you'd just be falling into a trap that has beguiled many before you.


Conclusion

If there's one thing that this exercise should show you, it's this: applying tools and analyses from one field to fresh data in another -- completely unrelated -- field can provide pretty amazing results that turn mystery and guesswork into science and planning.

If we can get two things from this, the other might be: knowing the parts of the system may not necessarily reveal the whole (c.f. Complex Systems), but it may provide you with the data that lets you better predict emergent behaviours and identify patterns and structure where you didn't see them (or even think to look!) before.


by Duncan McGreggor (noreply@blogger.com) at July 18, 2014 06:09 AM

Interview with Erlang Co-Creators

A few weeks back -- the week of the PyCon sprints, in fact -- was the San Francisco Erlang conference. This was a small conference (I haven't been to one so small since PyCon was at GW in the early 2000s), and absolutely charming as a result. There were some really nifty talks and a lot of fantastic hallway and ballroom conversations... not to mention Robert Virding's very sweet Raspberry Pi Erlang-powered wall-sensing Lego robot.

My first Erlang Factory, the event lasted for two fun-filled days and culminated with a stroll in the evening sun of San Francisco down to the Rackspace office, where we held a Meetup mini-conference (beer, food, and three more talks). Conversations lasted until well after 10pm, with the remaining die-hards making a trek through the nighttime streets of SOMA and the Financial District back to their respective abodes.

Before the close of the conference, however, we managed to sneak a ride (4 of us in a Mustang) to Scoble's studio and conduct an interview with Joe Armstrong and Robert Virding. We covered some of the basics in order to provide a gentle overview for folks who may not have been exposed to Erlang yet and are curious about what it has to offer our growing multi-core world. This ended up on the Rackspace blog as well as the Building 43 site (also on YouTube). We've got a couple of teams using Erlang at Rackspace; if you're interested, be sure to email Steve Pestorich and ask him what's available!


by Duncan McGreggor (noreply@blogger.com) at July 18, 2014 05:49 AM

July 16, 2014

Thomas Vander Stichele

morituri 0.2.3 ‘moved’ released!

It’s two weeks shy of a year since the last morituri release. It’s been a pretty crazy year for me, getting married and moving to New York, and I haven’t had much time throughout the year to do any morituri hacking at all. I miss it, and it was time to do something about it, especially since there’s been quite a bit of activity on github since I migrated the repository to it.

I wanted to get this release out, combining all of the bug fixes since the last release, before I tackle one of the most asked-for issues – not ripping the hidden track one audio if it's digital silence. There are patches floating around that hopefully will be good enough, so I can quickly do another release with that feature, and there are a lot of minor issues still floating around that should be easy to fix.

But the best way to get back into the spirit of hacking and to remove that feeling of it’s-been-so-long-since-a-release-so-now-it’s-even-harder-to-do-one is to just Get It Done.

I look forward to my next hacking stretch!

Happy ripping everybody.


by Thomas at July 16, 2014 04:01 AM

July 11, 2014

AmvTek Blog

Comparing Python performance of Protobuf/Thrift serialization…

When two software processes need to exchange data, some sort of protocol is necessary to define how to encode/decode the data to be transported. A large number of serialization formats are available (JSON, XML, ASN.1…) to tackle the cross-process, cross-programming-language data encoding problem, and Python provides a large number of libraries to leverage them…

What distinguishes the solutions provided by protocol buffers or Thrift is the need to describe the data to be exchanged in a central schema file written in an easy-to-read IDL. Such a schema file is then compiled so as to provide data representations for a given programming language. Not all problems will benefit from this approach, but when developing services that need to be accessed by clients written in different programming languages (e.g. Objective-C, Java…) we have found that relying on a well-defined schema file saves a lot of time.

The need for benchmarking

We write a large part of our server-side code in Python, and some of the projects we support are accessed mainly by native clients over TCP or UDP. Over the years we moved gradually from our home-grown custom serialization solution, built on top of the Python struct module, to protocol buffers…

The move to protocol buffers allowed us to cut down the development time required to support new types of client to a minimum. We also realized how valuable the reliance on a central schema was, in that everybody can visualize what the data are.

One thing we do regret from the previous home-grown solution, however, is the performance; on the server, performance matters tremendously, even more so if you are using an asynchronous networking framework like Twisted.

For a long while we reassured ourselves by observing that a Google-supported extension module was available, and that deploying it would allow us to accelerate serialization/deserialization by a factor of 10 at least. Deploying this extension module was delayed until we reached the staging phase of development, because it is quite cumbersome to do: you need to build things from source and manage some environment variables in your server processes to force the use of the implementation it provides.

Once we activated the protobuf extension module on our staging server, we started to observe random crashes of the server processes. It took us time to understand that those crashes were related to the use of the extension module. Well, we should not have underestimated the fact that Google was labelling this extension module as experimental, but here again we assumed that, Google playing in a different category than the rest of us, they were probably referring to some pretty advanced use cases :(

After all those hurdles, we realized that selecting the proper serialization technology for your projects is a decision that shall not be taken lightly. Thrift provides an obvious alternative to Google protocol buffers, but how does its Python implementation perform? Extensive performance benchmarks of Java serialization frameworks exist, but we found nothing similar for Python.

The benchmark

We have published on GitHub what we consider to be a good basis for comparing the various serialization frameworks one may want to leverage. The repository for the project can be reached here.

For now, the benchmark compares the performance of protobuf and Thrift serialization for messages defined in the StuffTotest schema. We welcome suggestions to extend this reference schema so as to explore performance variations in more detail, or external help to cover more serialization frameworks…
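To give a rough idea of the kind of measurement involved, here is a minimal Python timing sketch. The module name, message name, and fields are hypothetical stand-ins; the actual benchmark in the repository is more thorough and also covers Thrift:

```python
import timeit

# Hypothetical generated class; a real run would import the modules generated
# from the StuffTotest schema by protoc (and by the Thrift compiler).
from stufftotest_pb2 import SampleMessage

def bench(n=10000):
    msg = SampleMessage(name="benchmark", count=42)  # hypothetical fields
    payload = msg.SerializeToString()
    serialize = timeit.timeit(msg.SerializeToString, number=n) / n
    deserialize = timeit.timeit(
        lambda: SampleMessage().ParseFromString(payload), number=n) / n
    return serialize, deserialize

if __name__ == "__main__":
    ser, deser = bench()
    print("serialize:   %.1f us/msg" % (ser * 1e6))
    print("deserialize: %.1f us/msg" % (deser * 1e6))
```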

Preliminary results

We have published on GitHub a result run obtained on a low-end development machine. If we consider performance to be the average of serialization and deserialization time for a given message of the schema, Thrift:

  • outperforms protocol buffers in 75% of the cases.
  • is stable.
  • is much easier to deploy (pip install thrift and you are done…)

So there is currently a clear winner to this benchmark. We will be happy to rerun it so as to validate that things have changed.

by AmvTek developers at July 11, 2014 09:00 PM

July 10, 2014

Jonathan Lange

Cutting edge debugging

One of the great pleasures of advancing in one's craft is the thrill of solving newer, more difficult problems. In the course of my work, I frequently encounter many bugs which appear at first glance to be impossible. Not just impossible to solve, but intrinsically impossible. It's as if being at the very frontiers of technology exposes one to strange new logics that the human mind is not fit to grasp.

Each of these bugs is, of course, unique, and it would be folly to suggest that there could be any sort of systematic method that could be applied at the cutting edge of technology, as if one could push back the boundaries of ignorance in much the same way as one fills in a tax return. Nevertheless, the following principles have served me pretty well, and perhaps deserve a wider audience.

  • You're editing the wrong file
  • You're editing the right file, but on the wrong machine
  • You're editing the right file, but you forgot to save it
  • You've edited the right file, but you forgot to recompile
  • The thing you thought you turned on, you actually turned off
  • The thing you thought you turned off, you actually turned on
  • You're in a meeting, and you should be paying attention. They wouldn't have eighteen people talking about it unless it was really important, would they?
  • You're running the wrong version
  • You're running the right version, but on the wrong machine
  • You've fixed the problem, but forgot to push
  • You've fixed the problem and remembered to push, but you forgot to commit
  • You've fixed the problem, committed, and pushed. However, many users were relying on the previous, broken behaviour and so now you must roll back.

I offer up these principles humbly: no single approach can suffice at the very fringes of the known. I earnestly hope that by meditating on them we can achieve things as yet undreamt of.

by Jonathan M. Lange at July 10, 2014 11:00 PM

July 07, 2014

Jonathan Lange

conda and binstar: initial experiences

A while ago my friend teh recommended that I use conda for Python package management, because of its intrinsic superiority to pip & virtualenv. I'm no fan of either, and teh generally has pretty good taste, so I thought I'd check it out.

I haven't formed an overall opinion yet, but I can say without reservation that the overall experience for someone who has become used to pip & virtualenv is very frustrating.

What I found myself wanting was a one pager explaining conda & binstar to someone familiar with pip & virtualenv, which are unquestionably vastly more popular. Alas, there's none to be found.

There are some guidelines here and there on the net about how to install a package that's available on PyPI into a conda system, but they don't always work.

For example, I'd like to play around with pyrsistent. The recommended approach is to do the following:

$ conda skeleton pypi pyrsistent
$ conda build pyrsistent
$ binstar upload //anaconda/conda-bld/osx-64/pyrsistent-0.3.1-py27_0.tar.bz2
$ conda install pyrsistent

Everything works fine up until the build step, but the upload fails because someone has already uploaded the package. Fine. So conda install pyrsistent should work, right?

$ conda install pyrsistent
Fetching package metadata: ...
Error: No packages found matching: pyrsistent

Wut.

Searching binstar finds pyrsistent. Apparently I have to add some kind of channel (a concept that's not well explained, and that does not seem to appear at all in the binstar UI).

I try this way:

$ conda config --add channels https://binstar.org/auto

But apparently that's not a channel:

$ conda install pyrsistent
Fetching package metadata: ....Error: HTTPError: 404 Client Error: NOT FOUND: https://binstar.org/auto/osx-64/

So I try this way:

$ conda config --add channels https://binstar.org/auto

And that is a channel, but pyrsistent it doth not find:

$ conda install pyrsistent
Error: No packages found matching: pyrsistent

I take a look at the files page for pyrsistent and it looks like the only file is linux-64/pyrsistent-0.1.0-py27_0.tar.bz2. I'm running OS X (for my sins) and my guess is that the "No packages found matching" error refers to a lack of binary builds.

So, what am I supposed to do? At this point, the documentation and the error messages from the tools leave me with no obvious way forward. Perhaps there's some underlying concept that would illuminate all for me, if only I could grasp it. Perhaps it's simply a bug.

In any case, I'm not abandoning conda yet (in part because I have no idea how to extricate it from my system), but it has some way to go before it is properly better than pip & virtualenv.

by Jonathan M. Lange at July 07, 2014 11:00 PM

July 05, 2014

Jonathan Lange

Migrated to Pelican

A while ago, I got sick of Blogger, so I switched to Octopress.

I really could not be bothered learning how to manage Ruby virtual environments, and kept running into version issues, so I have followed glyph's example and migrated to Pelican.

No comments yet, and still quite a few things that ought to be done. If you notice something broken or missing, please file a bug.

I'm afraid I have neither the energy nor the inclination to do a full write-up of the migration process. However, I'll provide the following notes in case they help anyone who wants to do the same thing.

Patches required

Make the blog

$ pelican-quickstart
Welcome to pelican-quickstart v3.3.1.dev.

This script will help you create a new Pelican-based website.

Please answer the following questions so this script can generate the files
needed by Pelican.


> Where do you want to create your new web site? [.] code-blog
> What will be the title of this web site? Mere Code
> Who will be the author of this web site? Jonathan M. Lange
> What will be the default language of this web site? [en]
> Do you want to specify a URL prefix? e.g., http://example.com   (Y/n) Y
> What is your URL prefix? (see above example; no trailing slash) http://code.mumak.net
> Do you want to enable article pagination? (Y/n) Y
> How many articles per page do you want? [10] 20
> Do you want to generate a Fabfile/Makefile to automate generation and publishing? (Y/n) Y
> Do you want an auto-reload & simpleHTTP script to assist with theme and site development? (Y/n) Y
> Do you want to upload your website using FTP? (y/N) n
> Do you want to upload your website using SSH? (y/N)
> Do you want to upload your website using Dropbox? (y/N)
> Do you want to upload your website using S3? (y/N)
> Do you want to upload your website using Rackspace Cloud Files? (y/N)
> Do you want to upload your website using GitHub Pages? (y/N) Y
> Is this your personal page (username.github.io)? (y/N) N

Extract the things

$ pelican-import --blogger -m markdown \
  -o output-directory/ --wp-custpost --dir-page \
  blog-08-29-2013.xml

Transfer the things

$ cp output-directory/* blog-directory/content/
$ rm -rf blog-directory/content/settings blog-directory/content/templates/
$ mv blog-directory/content/comments blog-directory

Tweak the blog

Only basic tweaks. Note that the pelican_comment_system plugin is used in order to nicely render the imported blogger comments.
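For reference, wiring that plugin up comes down to a couple of lines in the Pelican settings file. The snippet below is a minimal sketch from memory, not this site's actual configuration: the plugins path and comments directory are assumptions, and the exact setting names should be checked against the plugin's README (older Pelican releases spell the plugin path setting PLUGIN_PATH rather than PLUGIN_PATHS).

# pelicanconf.py -- minimal sketch only, not this site's real configuration.
# Assumes a pelican-plugins checkout in ./plugins and the comments exported
# from Blogger sitting in ./comments (see the transfer step above).
PLUGIN_PATHS = ['plugins']            # PLUGIN_PATH (a single string) on older Pelican
PLUGINS = ['pelican_comment_system']

PELICAN_COMMENT_SYSTEM = True         # enable the plugin
PELICAN_COMMENT_SYSTEM_DIR = 'comments'   # where the imported comments live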

I had to manually tweak the theme to support the comment system.

by Jonathan M. Lange at July 05, 2014 02:21 PM

June 30, 2014

Moshe Zadka

Thoughts on Warrants

Warrants (and similar things, like indictments at preliminary hearings) rely on the concept of “probable cause”. The definition of probable cause varies, but in general it requires something to be more probable than not — or, in terms of Bayesian probability (which applies probability to states of mind, not just to repeatable experiments), to have a probability higher than 0.5. Do our warrants actually meet this standard? This is a fairly simple test — at least 50% of warrants handed out should, eventually, result in evidence leading to conviction[1]. Do they? Hell, in less than 50% of cases do they even lead to an indictment[2]!

Since warrant judges hand out a lot of warrants, we can decide to actually measure this and automatically disbar judges whose warrant conviction rates fall significantly below 50% (e.g., using a p value of 0.005 would correspond to having a pretty high Bayesian prior that judges are “good”). The criterion could simply be — “of all warrants handed out where the case has ended, how many convictions have there been” (so any cases still in trial would be eliminated from consideration). After doing this for a while, we can periodically update the Bayesian prior to reflect how many judges actually get disbarred by this.
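To make the proposed test concrete, here is a small back-of-the-envelope sketch (my own illustration, not from the post); the judge's numbers are made up, and the function just computes a one-sided binomial tail probability.

# Minimal sketch: how surprising is this judge's conviction count if warrants
# really did reflect a better-than-even probability of guilt?
def prob_at_most(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p): chance of k or fewer convictions."""
    total, coeff = 0.0, 1.0           # coeff tracks C(n, i)
    for i in range(n + 1):
        if i <= k:
            total += coeff * (p ** i) * ((1 - p) ** (n - i))
        coeff = coeff * (n - i) / (i + 1.0)
    return total

n_closed_cases, convictions = 200, 70     # hypothetical judge's closed warrant cases
p_value = prob_at_most(convictions, n_closed_cases)
print(p_value)   # roughly 1e-5, far below the 0.005 threshold suggested above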

In a perfect world, we would also apply this to convictions — however, since convictions are handled by one-offs (jurors), it is hard to place blame for overturning convictions. However, at least for warrants and non-grand-jury-indictments, this allows a process of automatic continuous improvement of our standards, avoiding the “judges and police co-operate to come up with lots of warrants” problem.

[1] Technically, since conviction is “beyond reasonable doubt”, this is a lighter standard than the true standard of actual guilt. However, based on the “better a hundred guilty people go free…” (Benjamin Franklin) standard, reasonable doubt being >=1% probability, the effect should be minor.

[2] http://www.insurancejournal.com/news/east/2014/06/27/333106.htm

by moshez at June 30, 2014 04:29 PM

Ashwini Oruganti

Heads Up, San Francisco!

I will be visiting San Francisco this August and staying until November, spending my days working on a TLS v1.2 implementation in Python, using existing crypto math, thanks to Stripe’s Open Source Retreat. Keep an eye out for a blog post with more details on that.

I just wrapped up work at HippyVM, and have gathered numerous stories looking at the PHP source code every day for the past six months. I’ve also just graduated, and it feels great to be able to spend the next few months working full-time on open source software.

What after, you ask? Some of you reading this right now have intimated that you’d like to offer me a job, if I were available. Well, now’s your chance. Get in touch, and let’s talk.

To my various acquaintances in the bay area, let’s get together and do something fun. To others, if grabbing a coffee with me is an interesting idea to you, please drop me a line. I would love to hear your story about how PHP ruined your summer, or how Twisted changed your life, or how you had to stay up all night to fix an OpenSSL vulnerability. Really, I love stories; if you’ve got an interesting one to share, I’d be happy to lend an ear.

June 30, 2014 12:53 PM

June 29, 2014

Thomas Vander Stichele

mach 1.0.3 ‘moved’ released

It’s been very long since I last posted something. Getting married, moving across the Atlantic, enjoying the city, it’s all taken its time. And the longer you don’t do something, the harder it is to get back into.

So I thought I’d start simple – I updated mach to support Fedora 19 and 20, and started rebuilding some packages.

Get the source, update from my repository, or wait until updates hit the Fedora repository.

Happy packaging!


by Thomas at June 29, 2014 09:09 PM

June 22, 2014

Duncan McGreggor

lfest: A Routing Party for LFE Web Apps

The Clojure community (in particular, James Reeves) has produced one of the best APIs I've seen for creating RESTful services, or any application using path dispatching (routing), really: Compojure.

I've been really missing this in LFE. Despite the fact that creating RESTful apps in LFE with YAWS is so simple it doesn't even need a framework, the routes can be a bit difficult to read, due to some of the destructuring that's done (e.g., URL paths).

Take the following route declaration from the yaws-rest-starter project:

Note that this example delegates additional routing to other functions, since all the routing in one function was pretty difficult to read. This, however, detracts from the overall readability in a different way: there's not one place to look to see what functions map to the complete URL-space for this service.

I wanted to be able to do something like what you can do in Clojure:

LFE is a Lisp with macros, so why not? Also, nox or mheise (I forget who) on the #erlang-lisp channel had previously noted all the lfe* projects (and these too), and suggested that someone create the inevitable "lfest" repo on Github.

Enter lfest. After a weekend of hacking and debugging some tiny macros, surprisingly little code now supports routes definitions like the following in LFE+YAWS web apps:

If you're curious to see what this actually expands to before getting compiled in a .beam file, you can view it here.

Bonus for LFE webdevs: this project also has a module of HTTP status codes, for easy re-use in other projects.

The recent work I've been doing with macros has really helped shape the section I've been planning to write in the LFE User Guide. I haven't wanted it to be yet another dry description of macros in a Lisp-2. Instead, my hope is to help jump-start people into how to think about macros, how to start writing them immediately (without years of study first), and how to debug them.

The trick, as with all complicated subjects, is how to remove the barrier to entry and give folks that necessary hands-on experience without dumbing down the material. It's almost ready...

Also, it's Bay to Breakers today, and Ocean Beach is bumpin'. The synchronicity is just eerie. May you party just as hard with routes in LFE.


by Duncan McGreggor (noreply@blogger.com) at June 22, 2014 03:16 AM

June 10, 2014

Jp Calderone

Oh, Imaginary!

Still here, still hacking away at this vision/clothing issue[1][2][3][4]. I hope this isn't getting tedious. If it is, hang in there! There's some light up ahead.

Over the last couple weeks a few interesting things have happened to Imaginary. For one, a number of the fixes that Glyph and I made along the way have actually made their way into master now. Most significantly (I think) the documentation I started and Glyph and I worked to finish is there now. Seven other fixes landed too, though. This is the most activity Imaginary has seen in a handful of years now. Crufty, dead code has been deleted. Several interfaces have been improved. Useful functionality has been factored out so it can actually be re-used. Good stuff. ashfall gets the credit for doing the legwork of splitting out these good pieces from the larger branch where most work has been going on.

Something else came about last week which is worthy of note: the code in that branch now passes the full test suite. That is pretty big news. I think maybe it means we actually figured out how to fix the bug (I'm totally hedging here; Imaginary's test suite is pretty good). This came about after a relatively big change and a relatively little change.

The bigger change was an expansion of the interface for exits. Wait, you're surely going to exclaim, exits? Recall that the general idea for the fix to the way clothing is rendered was to preserve information about the path from the observer to the clothing and then use that path information to inform the renderer. As you might guess, this involved some code changes to the renderer. In the process of doing this, we disturbed the way the renderer shows you exits from a place. The previous implementation was rather hacky and just included some extra code to find exits. It did so without properly traversing the simulation graph and so potentially produced incorrect results. In the new renderer, exit information comes along with all of its related path information - automatically, even! - in the same collection of paths-to-ideas that represents everything else you might be able to see.

This generalization resulted in a bit more information being rendered than we really wanted. Containers (boxes, chairs, rooms, submarines, etc) almost all have implicit "in" and "out" exits (okay, exit might not be the best name for how you go into something; it's just a name, okay?) and it's usually not very interesting to include this information along with other more interesting exits. Anyway, that's a decision we'd made previously in the renderer and we wanted to preserve it. So the exit interface got expanded to include a method which can be used to determine whether it makes sense to present that particular exit when rendering something near that exit. This removed a bunch of extra, not-very-interesting "into" and "out of" exits from the text which is generated to describe most things you might encounter in an Imaginary simulation.

The smaller change was a fix for a bug in how actions resolve targets. One of the fixes we needed to make to the garment system was to make it possible for the person wearing some clothing to always be able to refer to it. This was originally possible by virtue of the duplicate links to all clothing that I discussed in the first post I wrote about this bug. Once we got rid of those duplicate links, referring to clothing you were wearing became impossible if you couldn't see the clothing. At some level this makes sense, but even if you can't see your socks at the moment you still know they're there and you can manipulate them in various ways (pull them up, for example). The long term fix for this will most likely be to add an "awareness" component (to be specified) to action target resolution (the system that turns the "my socks" string into an Idea instance representing your socks). The short term fix was just to add an observer argument to an existing method, isOpaque, and hack the garment system to make clothing not opaque to the person wearing it. This lets you see all your own layers but still prevents other people from seeing anything but your top layer. It has problems - for example, if you look in a mirror, you'll see your socks, even if they're covered by your shoes and your pants - but it'll do for now (because we don't have mirrors yet, ha ha).
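To make the shape of that short term fix concrete, here is a hypothetical sketch (my own illustration, not Imaginary's actual classes or API): opacity now depends on who is asking.

# Illustration only: clothing is transparent to its own wearer.
class Garment(object):
    def __init__(self, wearer=None):
        self.wearer = wearer

    def isOpaque(self, observer):
        # The wearer can always "see through" to their own lower layers;
        # everyone else still only sees the top layer.
        return observer is not self.wearer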

And that fix we actually made a few weeks ago. What we did last week was remember to update the code that calls isOpaque to actually pass the correct observer object.

So, pending some refactoring, some documentation, probably some new automated tests, clothing and vision are now playing nicely together - in a way with fewer net hacks than we had before. Woot.

After working on that stuff with Glyph I also tried to get the documentation that has been merged into master actually published somewhere so it can be read. Unfortunately this proved to be something of a challenge. After fighting with distutils for a couple hours I still couldn't get readthedocs.org to build Imaginary's Sphinx-based documentation and had to give up (for my sanity as much as for anything). There is a solution in sight but it involves the long hard road of actually bringing both the packaging state of Imaginary and (here's the real bummer) all of Imaginary's dependencies up to contemporary standards. After cooling off for a couple days I did take the first steps towards this, doing some work on Nevow (yes, that's a github link). There's more work to do on that front but perhaps mostly just getting a new release out. So, watch this space (but please don't hold your breath).

by Jean-Paul Calderone (noreply@blogger.com) at June 10, 2014 12:24 AM

May 27, 2014

Jp Calderone

Imaginary Graph Traversal Woes

So we worked on Imaginary some more. Unsurprisingly, the part that was supposed to be easier was surprisingly hard.

As part of the change Glyph and I are working on, we needed to rewrite the code in the presentation layer that shows a player what exits exist from a location. The implementation strategy we chose to fix looking at stuff involved passing a lot more information to this presentation code and teaching the presentation code how to (drumroll) present it.

Unfortunately, we ran into the tiny little snag that we haven't actually been passing the necessary information to the presentation layer! Recall that Imaginary represents the simulation as a graph. The graph is directed and most certainly cyclic. To avoid spending all eternity walking loops in the graph, Imaginary imposes a constraint that when using obtain, no path through the graph will be returned as part of the result if the target (a node) of the last link (a directed edge in the graph path) is the source (a node) of any other link in the path.

If Alice is in room A looking through a door at room B then the basic assumption is that she should be able to see that there is a door in room B leading to room A (there are several different grounds on which this assumption can be challenged but that's a problem for another day). The difficulty we noticed is that in this scenario, obtain gets to the path Alice -> room A -> door -> room B -> door and then it doesn't take the next step (door -> room A) because the next step is the first path that qualifies as cyclic by Imaginary's definition: it includes room A in two places.
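Here is a rough sketch of that rule as stated (my own illustration, not Imaginary's code), applied to the step obtain refuses to take:

# Illustration only: a path is a list of (source, target) links. A candidate
# next link is rejected when its target is already the source of a link in
# the extended path.
def next_step_is_cyclic(path, next_link):
    new_target = next_link[1]
    return any(source == new_target for source, _ in path + [next_link])

path = [("Alice", "room A"), ("room A", "door"),
        ("door", "room B"), ("room B", "door")]
print(next_step_is_cyclic(path, ("door", "room A")))
# True: "room A" already appears as a source, so obtain stops one step short.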

Having given this a few more days thought, I'm curious to explore the idea that the definition of cyclic is flawed. Perhaps Alice -> room A -> door -> room B -> door -> room A should actually be considered a viable path to include in the result of obtain? It is cyclic but it represents a useful piece of information that can't easily be discovered otherwise. Going any further than this in the graph would be pointless because obtain is already going to make sure to give back paths that include all of the other edges away from room A - we don't need duplicates of those edges tacked on after the path has gone to room B and come back again.

Though there are other narrower solutions as well, such as just making the presentation layer smart enough to be able to represent Alice -> room A -> door -> room B -> door correctly even though it hasn't been directly told that the door from room B is leading back to room A. This would have a smaller impact on Imaginary's behavior overall. I'm not yet sure if avoiding the big impact here makes the most sense - if obtain is omitting useful information then maybe fixing that is the best course of action even if it is more disruptive in the short term.

by Jean-Paul Calderone (noreply@blogger.com) at May 27, 2014 01:13 PM

May 25, 2014

Duncan McGreggor

LFE Design Summit

Well, maybe not summit, but certainly a meeting of minds :-)

The LFE Community is pleased to announce that we will be holding our first ever community design meeting in Stockholm, Sweden this June during the Erlang User Conference! We have set aside 1 hour for a kick-off discussion that we can continue in breakout sessions, hallways, meadhalls, in #erlang-lisp on Freenode, and on the LFE mail list.

Our goals are not only to share plans, but to discover what you want in LFE and how you want to use it. This is an opportunity for us to work together, explore interesting problems to solve, build community awareness around language goals, and identify efforts around which each of us is interested in collaborating with others in the course of subsequent months.

We've opened up a Google Docs form so you can add your ideas for session topics. Here are some discussion topics that we've come up with so far:
  • LFE Standard Library 
  • Java Interop via Erjang 
  • State of LFE/LFE Roadmap 
  • Specs, Types, Debug Info and Dialyzer: Erlang Core AST in BEAM 
  • Refining LFE docs and guides, creating a CookBook, etc.
Add your favorites!

In the next couple weeks, we'll identify the top session topics folks have proposed and attempt to cover these during the Erlang User Conference. Robert was able to get us some meeting space & time for this; he's hoping to have something viewable on the EUC site soon.

This is a first step; let's see how this goes, and if people really enjoy it, we can have a real summit next time!

Update: Robert has announced the time and place.


by Duncan McGreggor (noreply@blogger.com) at May 25, 2014 10:47 PM

May 21, 2014

Jp Calderone

De-cruft (Divmod Imaginary Update)

Over the weekend Glyph and I had another brief hack session on Imaginary. We continued our efforts towards fixing the visibility problem I've described over the last couple of posts. This really was a brief session but we took two excellent steps forward.

First, we made a big chunk of progress towards fixing a weird hack in the garment system. If you recall, I previously mentioned that the way obtain works in Imaginary now, the system will find two links to a hat someone is wearing. In a previous session, Glyph and I removed the code responsible for creating the second link. This was a good thing because it was straight-up deletion of code. The first link still exists and that's enough to have a path between an observer and a hat being worn by someone nearby.

The downside of this change is that the weird garment system hack was getting in the way of that remaining link being useful. The purpose of the hack was to prevent hats from showing up twice in the old system - once for each link to them. Now, with only one link, the hack would sometimes prevent the hat from even showing up once. The fix was fairly straightforward. To explain it, I have to explain annotations first.

As I've mentioned before, Imaginary represents a simulation as a graph. One powerful tool that simulation developers have when using Imaginary is that edges in the graph can be given arbitrary annotations. The behavior of simulation code will be informed by the annotations on each link along the path through the graph the simulation code is concerned with. Clear as mud, right?

Consider this example. Alice and Bob are standing in a room together. Alice is wearing a pair of white tennis shoes. Alice and Bob are standing in two parts of the room which are divided from each other by a piece of red tinted glass. A realistic vision simulation would have Bob observing Alice's shoes as red. Alice, being able to look directly at those same shoes without an intervening piece of tinted glass, perceives them as white. In Imaginary, for Bob to see the tennis shoes, the vision system uses obtain to find the path from Bob to those shoes. The resulting path necessarily traverses the Idea representing the glass - the path looks very roughly like Bob to glass to Alice to shoes. The glass, being concerned with the implementation of tinting for the vision system, annotates the link from itself to Alice with an object that participates in the vision simulation. That object takes care of representing the spectrum which is filtered out of any light which has to traverse that link. When the vision system shows the shoes to Bob, the light spectrum he uses to see them has been altered so that he now perceives them as red.
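Here is a toy version of that example, just to make the mechanism concrete (my own sketch, not Imaginary's actual API): annotations ride on links, and a system walking a path folds them over whatever value it is computing.

# Illustration only, not Imaginary's classes.
class Link(object):
    def __init__(self, target, annotations=None):
        self.target = target
        self.annotations = annotations or []

class RedTint(object):
    """Vision annotation: anything seen through this link looks red."""
    def filter(self, colour):
        return "red"

def perceived_colour(path, base_colour):
    # Fold every vision annotation along the path over the base colour.
    colour = base_colour
    for link in path:
        for annotation in link.annotations:
            colour = annotation.filter(colour)
    return colour

# Bob -> glass -> Alice -> shoes: the glass annotates its link to Alice with the tint.
print(perceived_colour([Link("glass"), Link("Alice", [RedTint()]), Link("shoes")], "white"))  # red
# Alice -> shoes: no tinted glass in the way, so the shoes stay white.
print(perceived_colour([Link("shoes")], "white"))  # white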

Clearly I've hand-waved over a lot of details of how this works but I hope this at least conveys the very general idea of what's going on inside Imaginary when a simulation system is basing its behavior on the simulation graph. Incidentally, better documentation for how this stuff works is one of the things that's been added in the branch Glyph and I are working on.

Now, back to hats. The hack in the garment system annotates links between clothing and the person wearing that clothing. The annotation made the clothing invisible. This produced a good effect - when you look around you, you're probably not very interested in seeing descriptions of your own clothing. Unfortunately, this annotation is on what is now the only link to your clothing. Therefore, this annotation makes it impossible for you to ever see your own clothing. You could put it on because when you're merely holding it, it doesn't receive this annotation. However, as soon as you put it on it vanishes. This poses some clear problems: for example, you can never take anything off.

The fix for this was simple and satisfying. We changed the garment system so that instead of annotating clothing in a way that says "you can't see this" it annotates clothing in a way that merely says "you're wearing this". This is very satisfying because, as you'll note, the new annotation is a much more obviously correct piece of information. Part of implementing a good simulation system in Imaginary is having a good model for what's being simulated. "Worn clothing is worn clothing" sounds like a much better piece of information to have in the model than "Worn clothing can't be seen". There's still some polish to do in this area but we're clearly headed in the right direction.

By comparison, the second thing we did is ridiculously simple. Before Imaginary had obtain it had search. Sounds similar (well, maybe...) and they were similar (well, sort of...). search was a much simpler, less featureful graph traversal API from before Imaginary was as explicitly graph-oriented as it is now. Suffice it to say most of what's cool about Imaginary now wasn't possible in the days of search. When obtain was introduced, search was re-implemented as a simple wrapper on top of obtain. This was neat - both because it maintained API compatibility and because it demonstrated that obtain was strictly more expressive than search. However, that was a while ago and search is mostly just technical baggage at this point. We noticed there were only three users of search left (outside of unit tests) and took the opportunity to delete it and update the users to use obtain directly instead. Hooray, more code deleted.

Oh, and I think we're now far enough beneath the fold that I can mention that the project has now completed most of a migration from Launchpad to github (the remaining work is mostly taking things off of Launchpad). Thanks very much to ashfall for doing very nearly all the work on this.

by Jean-Paul Calderone (noreply@blogger.com) at May 21, 2014 12:25 AM

May 12, 2014

Jp Calderone

Seeing Things Right (Divmod Imaginary Update)

Earlier this month I wrote about how Glyph and I have been trying to fix a bug in Imaginary. Since then we've worked on the problem a little more and made some excellent progress.

If you recall, the problem involved being shown articles of clothing as though they were lying on the floor rather than being worn by nearby people. I mentioned that our strategy to fix this was to make the "look at something" action pay attention to more of the structure in the simulation graph.

That's just what Glyph and I did when we got together to work on this some more last week.

The version of the "look at something" action in trunk operates in two steps. First, it searches around the player in the simulation graph for an object satisfying a few simple criteria:

  • The object is something that can be seen at all - for example, a chair or a shirt, not the wind or your rising dread at the scritch scritch scritch noise that always seems to be coming from just out of your field of vision.
  • The object is something the player's senses actually allow them to see - for example, objects not draped in a cloak of invisibility, objects not sitting in a pitch black room.
  • The object answers to the name the player used in the action - "look at hat" will not consider the Sears tower or a passing dog.
  • The object is reasonably nearby (which I'll just hand-wave over for now).

Having found one and only one object satisfying all of these criteria (the cases where zero or more than one result is found lead to other outcomes), the action proceeds to the second portion of its implementation. It invokes a method that all objects capable of being seen are contractually obligated to provide, a method called visualize which is responsible for representing that thing to the player doing the looking. The most common implementation of that method is a stack of special-cases:

  • is the thing a location? if so, include information about its exits.
  • does the thing have a special description? if so, include that.
  • is the thing in good condition or bad condition? include details about that.
  • is the thing wearing clothing? if so, include details about those.
  • is the thing a container and open? if so, include details about all the things inside it.

Much of this logic is implemented using a plugin system so, while somewhat gross, it at least holds with some of Imaginary's goals of generality. However, it has some problems beyond just being gross. One of those problems is that the full path from the player doing the looking to all of the things that appear in the visualize method's result is not known. This is because the path is broken in half: that from the player to the direct target (whatever something names) and that from the direct target to each of the (for lack of a better word) indirect targets (if Bob looks at Alice then Alice is a target of the action but so is the axe Alice is holding behind her back, Alice's hat, etc). And in most cases the problem is even worse because the second part - the path from the direct target to each of the indirect targets - is ignored. Breaking up the path or ignoring part of it like this has problematic consequences.

If Alice is carrying that axe behind her back and Bob looks at her, unless you know the complete path through the simulation graph from Bob to the axe then you can't actually decide whether Bob can see the axe or not: is Bob standing in front of Alice or behind her?
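A tiny illustration of why the full path matters (again my own sketch, not Imaginary's code): with the whole path in hand, the decision can take into account anything annotated along the way, such as the axe being hidden from Bob in particular.

# Illustration only: one set of annotations per link, from observer to target.
def can_see(path_annotations):
    return not any("hidden-from-observer" in annotations
                   for annotations in path_annotations)

# Bob stands in front of Alice: the final link to the axe is marked hidden.
print(can_see([set(), {"hidden-from-observer"}]))  # False
# Bob stands behind Alice: nothing along the path hides the axe.
print(can_see([set(), set()]))                     # True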

The idea that Glyph and I are pursuing to replace the visualize method solves this problem, cuts down significantly on the grossness involved (that is, on the large amount of special-case code), and shifts what remaining grossness there is to a more suitable location - the presentation layer (where, so far as I can tell, you probably do want a lot of special-case code because deciding exactly how best to present information to people is really just a lot of special cases).

Perhaps by now you're wondering how this new idea works.

As I've mentioned already, the core of the idea is to consider the full path between the player taking the action and the objects (direct and indirect) that end up being targets of the action.

An important piece of the new implementation is that instead of just collecting the direct target and then acting on it, we now collect both the direct target and all of the indirect targets as well. We also save the path to all of these targets. So, whereas before resolve_target (the method responsible for turning u"something" into some kind of structured object, probably an object from the simulation graph) would just return the object representing Alice, it now returns the path from Bob to Alice, the path from Bob to Alice's hat, the path from Bob to Alice's axe, the path from Bob to the chair Alice is sitting in, and so forth.

With all this extra information in hand, the presentation layer can spit out something both nicely formatted and correct like:

Alice is here, wearing a fedora, holding one hand behind her back.
Or, should it be more appropriate to the state of the simulation:
Alice is here, her back turned to you, wearing a fedora, holding an axe.

Of course, we haven't implemented the "holding something behind your back" feature of the "holding stuff" system yet so Imaginary can't actually reproduce this example exactly. And we still have some issues left to fix in the branch where we're working on fixing this bug. Still, we've made some more good progress - on this specific issue, on documentation to explain how the core Imaginary simulation engine works, and on developing useful idioms for simulation code that has yet to be written.

by Jean-Paul Calderone (noreply@blogger.com) at May 12, 2014 11:39 PM

Twisted Matrix Laboratories

Twisted 14.0.0 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 14.0! It has been a long road to get here, but we’ve done it!

The highlights of this release are:
  • Twisted Positioning (`twisted.positioning`) makes its entry into Twisted! It comes ready to talk with common GPS devices, and will supersede `twisted.protocols.gps`.
  • A wealth of SSL/TLS improvements, including ECDHE support, TLS Service Identity (with service_identity on PyPI), a stronger default set of ciphers, and strengthening against attacks such as CRIME. A Twisted Web server with pyOpenSSL 0.14 is capable of getting an A in Qualys SSL Labs tests out of the box, and A+ with small application modifications. Twisted Agent can also now do HTTPS hostname verification.
  • Python 3 improvements, including the ability for `pip install` to install all ported modules.
  • Twisted Pair’s TUN/TAP support has been overhauled, with documentation and full test coverage.
  • Significant documentation improvements, including more API documentation for Twisted Mail & Twisted Names, narrative documentation for Twisted Names, and a migration to Sphinx for building Twisted narrative docs.
  • Support is dropped for pyOpenSSL older than 0.10 and Windows XP.
For more information, check the NEWS file.

You can find the downloads at https://pypi.python.org/pypi/Twisted (or alternatively http://twistedmatrix.com/trac/wiki/Downloads) .

Many thanks to everyone who had a part in this release - we’ve got some big things landed, and if it weren’t for the support of developers (both core and occasional), the Twisted Software Foundation, or people giving feedback and filing bugs, we’d have never got it done.

Twisted Regards,
HawkOwl

by HawkOwl (noreply@blogger.com) at May 12, 2014 11:05 AM

Glyph Lefkowitz

Harried

Some things, one writes because one has something one wants the world to know.

Some things, one writes because, despite one’s better judgement, one can’t help it.

This is one of the latter type.

I have a recurring dream. I mean this in the most literal sense possible. I do not mean that I have an aspiration, or a hope. This is literally a dream that I’ve had, while I was asleep, multiple times. I’m not really sure why, since the apparent stimulus that this dream is responding to is many, many years in my past. Nevertheless, I have had it 3 or 4 times now.

It’s a movie trailer.

There’s a man driving a pickup truck along a highway near open plains.

Intercut are images of him driving a tractor over an apparently endless field, shot from a fixed 3rd-person perspective behind the tractor, driving towards the horizon. Close up of his face. He looks bored, resigned.

Image Credit: Ben Smith

He sees a falling star streaking down from the sky. As he watches, there’s a roaring noise, and a boom, and it crashes down into an adjacent field.

Cut to him cautiously exploring the crash site. There’s an object in the middle of the crash site, still glowing red hot. It’s a strange shape; conical, about a meter long, but with an indentation on its side and various protrusions, one of which looks like a human-sized handle.

Another cut, now he’s inside the truck again, and we can see that the object is under a tarp in the back of the truck.

Now he’s in his garage. He’s wearing a red work shirt and blue jeans. The object is hoisted up on a lift, and he’s walking around it, examining it. A prolonged sequence of him trying to cut it open with a plasma arc cutter, then wiping off the smudge to reveal it’s still shiny.

Finally, he lowers the lift, but as it goes towards the ground, the object beeps and stops dropping, at about waist height. It slowly rotates, swiveling to point towards the open garage door.

He walks over to the object and stands next to it, his hip near the indentation in its side, and he reaches out to grab the handle. The machine makes a powering-on whine and more protrusions start folding out of its sides; he yanks his hand back like he’s been burned, and it folds back into itself. More cautiously this time, he reaches out to grab the handle.

The machine’s tip opens like a claw, revealing a central spike surrounded by a bunch of curved radiating antennae. Sparks start shooting between the antennae and the spike, charging it. A view screen unfolds, initially displaying a targeting reticle pointed out of his garage door, then switching to text in an alien language. A grainy voice comes out of speakers in the side of the machine, initially speaking the alien language. Then the language on the screen switches to Japanese, then to French, then to German, and finally to English. In large, flashing, block letters, it reads, and the grainy voice from its speakers says, simply:

GET READY

Before the man has a chance to react, the indentation in the side of the machine extrudes several parts of a metallic harness which surround his body. A thruster on the back part of the machine starts glowing a dull red color, and then the machine (with the man in tow) shoots out of his garage.

Initially, he is running furiously to keep up. He trips over a rock in the road, but the harness allows him to sort of gyroscopically spin while still attached to the machine, and then right himself again. Eventually he pulls back on the handle and he’s flying through the air, and then the protrusion on the nose of the machine shoots a beam in front of him and opens a portal, which he then flies through.

Space Harrier

The machine says “Welcome to the fantasy zone.” and then “Get ready!” again, and then we see a montage of him flying over alien landscapes, primarily plains checkered in garish primary colors, shooting giant plasma bullets out of the nose of the weapon at trees, tiny fighter jets, rocks, and basically anything that gets in his way. There are scenes of him talking to giant one-eyed mammoths, and a teaser-trailer shot of giant, horrible looking two-headed dragons.

COMING THIS SUMMER: SPACE HARRIER, THE MOTION PICTURE.

Then I wake up.

Sega: please, please get someone to make this movie. The dream is only ever the trailer, and I really want to see how it ends.

by Glyph at May 12, 2014 01:32 AM

May 11, 2014

AmvTek Blog

Making use of twisted coiterate

Twisted provides various ways to integrate CPU bound operations or blocking libraries to the reactor. It provides very clean integration path for threading or external processes.

In this post, we describe an under-documented alternative, in which the long running task is implemented using an iterator that is consumed directly in the reactor event loop after being passed to coiterate.

Where usable, coiteration allows you to avoid threading completely, bypassing the well-known Python GIL bottleneck…

Basic idea

Coiteration requires developers to use a divide-and-conquer strategy to plan their task execution. In Python, we will code the task using a generator function or, alternatively, a class implementing the iterator protocol.

The task will be executed step by step in the reactor event loop after passing the iterator that represents it to coiterate.

Summing integers

Let’s consider a Python function which sums the first N integers, N being arbitrarily large.

def sum_all_integers_until(N):

    s = 0
    for i in xrange(N):
        s += i
    return s

For very large values of N, calling such a function from the same thread as the one in which the reactor is running is not a good idea, as the event loop will be blocked for as long as this function takes to return…

Summing using coiteration

A first approach

from twisted.internet import reactor
from twisted.internet.task import coiterate

def make_iterator_to_sum_all_integers_until(N):

    s = 0
    for i in xrange(N):
        s += i
        print "Adding %i to result" % i
        yield None # event loop can look after other things...

    print "result is %i " % s

def sum_all_integers_being_nice_to_reactor(N):

    all_sum_steps = make_iterator_to_sum_all_integers_until(N)
    coiterate(all_sum_steps)

reactor.callLater(0, sum_all_integers_being_nice_to_reactor, 8)
reactor.run()

See gist coiterate01.py

At line 4 a generator function is defined that returns an iterator which will calculate the sum of all integers up to a certain value N. At line 17, this iterator is passed to coiterate, which results in the iterator being consumed in the reactor event loop in an optimal way.

Note that this does not make your iterator magically non-blocking; as everywhere else in Twisted, the developer must ensure that each iteration is non-blocking.

Obtaining the result

If you took the time to run the above sum_all_integer …, you have probably been delighted to see the result being printed in the console.

Retrieving that result in order to make use of it requires some additional effort, which will be detailed now.

Let’s first observe that coiterate is a well-behaved Twisted citizen. As it is starting an operation (consumption of the argument iterator…) that will take some time to complete, it returns a Deferred. As you may expect, this Deferred will fire when iteration is over.

If we attach a callback function to this Deferred we will not receive our result, but the same iterator that coiterate has consumed. Let’s see a possible solution to obtain a result from the iterator.

from twisted.internet import reactor
from twisted.internet.task import coiterate

def make_iterator_to_sum_all_integers_until(N, context):

    s = 0
    for i in xrange(N):
        s += i
        print "Adding %i to result" % i
        yield None # event loop can look after other things...

    context['result'] = s

def sum_all_integers_being_nice_to_reactor(N):
    "return Deferred firing calculated sum..."

    def extract_result_cb(ignored, context):
        "return context['result']"

        rv = context['result']
        print "Got result = %s" % rv
        return rv

    context = {}
    all_sum_steps = make_iterator_to_sum_all_integers_until(N, context)

    deferred = coiterate(all_sum_steps)
    deferred.addCallback(extract_result_cb, context)

    return deferred

reactor.callLater(0, sum_all_integers_being_nice_to_reactor, 8)
reactor.run()

See gist coiterate_02.py

Let’s summarize how this code proceeds :

At line 4, we define a generator function which returns an iterator that lets us execute our task step by step. As we also need a result from such an iterator, we pass it an additional context object which provides a way to "return" any result obtained during iteration.

At line 14, we construct a well-behaved Python function that returns a Deferred that will fire with the result we are awaiting. Internally this function takes care of all the gory details of constructing the iterator that will be passed to coiterate and extracting the result we need.

Waiting for Deferred…

While executing a long running task, it is quite common to have to wait some time until some external operations complete. Twisted lets our coiterable tasks indicate that they should be paused until a certain Deferred fires. To achieve this, the only thing to do is to yield the Deferred of interest from the task iterator.

Let’s see how we could have our sum_all_integer… wait 1 second in between each iteration step.

from twisted.internet import reactor
from twisted.internet.task import coiterate, deferLater

def make_iterator_to_sum_all_integers_until(N, context):

    def wait_some_time(t):
        "return Deferred firing after t seconds"

        return deferLater(reactor,t,lambda :"I was paused %.02f seconds"%t)

    def print_pause_cb(msg):
        "callback printing result message..."

        print msg

    s = 0
    for i in xrange(N):

        s += i
        print "Adding %i to result" % i

        d = wait_some_time(1.0)
        d.addCallback(print_pause_cb)
        yield d    # we will be paused until d fires...

    context['result'] = s

See gist coiterate_03.py

At line 4 is the modified generator function that will pause some time in between each step. The wait_some_time function at line 6 could be anything that returns a Deferred.

It is our experience that the yield-to-wait approach which coiterate allows greatly simplifies coding complex tasks with Twisted.

Cancelling coiteration

When clients are required to wait a long time for the result of a long running operation, we should expect situations where the client gives up. In such situations, we normally want to clean up as soon as possible any resources allocated to service that client.

Before showing how this can be achieved in the context of this example, let’s mention that if you need to control your task from the outside to pause it or stop it, you should consider using cooperate instead of coiterate. Like coiterate, cooperate is called with an iterator which will be consumed in the reactor event loop. Unlike coiterate, which returns a Deferred that fires when iteration is completed, cooperate returns a Task object that can be used to pause or stop the ongoing task…
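As an aside, here is a minimal sketch of that cooperate variant (the counting iterator is a throwaway example of ours, not code from this post): the returned Task handle is what gives outside code control over the ongoing iteration.

from twisted.internet import reactor
from twisted.internet.task import cooperate, deferLater

def count_slowly():
    i = 0
    while True:
        print "step %i" % i
        i += 1
        yield deferLater(reactor, 0.2, lambda: None)   # pause 0.2s between steps

task = cooperate(count_slowly())       # starts consuming the iterator
reactor.callLater(1.0, task.pause)     # pause the task after one second
reactor.callLater(2.0, task.resume)    # resume it a second later
reactor.callLater(3.0, task.stop)      # then stop it for good
reactor.callLater(3.5, reactor.stop)   # and shut the example down
reactor.run()

With that aside out of the way, back to cancelling a coiterate-driven task: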

from twisted.internet import reactor
from twisted.internet.defer import Deferred, CancelledError
from twisted.internet.task import coiterate, deferLater

def make_iterator_to_sum_all_integers_until(N, context):

    def wait_some_time(t):
        "return Deferred firing after t seconds"

        return deferLater(reactor,t,lambda :"I was paused %.02f seconds"%t)

    def print_pause_cb(msg):
        "callback printing result message..."

        print msg

    d = None
    s = 0

    try:

        for i in xrange(N):

            s += i
            print "Adding %i to result" % i

            d = wait_some_time(1.0)
            d.addCallback(print_pause_cb)
            yield d    # we will be paused until d fires...

        context['result'] = s

    except GeneratorExit:

        print "---"
        print "Early termination..."

        # cancel pending Deferred
        if d and not d.called:
            d.cancel()

def sum_all_integers_being_nice_to_reactor(N):
    "return Deferred firing calculated sum..."

    def extract_result_cb(ignored, context):
        "return context['result']"

        rv = context['result']
        return rv

    def suppress_cancel_log_eb(error):
        "trap CancelledError"

        # this suppress UnhandledError warning...
        error.trap(CancelledError)

    context = {}
    all_sum_steps = make_iterator_to_sum_all_integers_until(N, context)

    deferred = Deferred(lambda _:all_sum_steps.close())
    coiterate(all_sum_steps).chainDeferred(deferred)
    deferred.addCallback(extract_result_cb, context)
    deferred.addErrback(suppress_cancel_log_eb)
    return deferred

def main():
    "start summing integers and stop after 3 seconds..."

    def print_result_cb(res):
        "print result if any..."

        if res is not None:
            print "Got result = %s" % res

    # start sum calculation using coiteration...
    d = sum_all_integers_being_nice_to_reactor(8)
    d.addCallback(print_result_cb)

    # schedule cancellation after 3.00 seconds
    reactor.callLater(3.0, d.cancel)


reactor.callLater(0, main)
reactor.run()

See gist coiterate_04.py

At line 4 our generator function was again modified. At line 32, an inner handler block for the GeneratorExit exception was added. This block will be reached if close is called on the iterator returned by our generator function. In this block, we clean up any pending Deferred that the task may be waiting for.

One would expect that cancelling the Deferred returned by coiterate would automatically close the related iterator, but this is not the case. Let’s modify the sum_all_integer… function to make this happen. At line 60, we construct the Deferred that sum_all_integer… will return, providing it with a cancellation function. That function simply closes the iterator returned by the generator function. This helper Deferred is chained to the Deferred that coiterate returns, so that when no cancellation occurs, we get our result…

by AmvTek developers at May 11, 2014 09:00 PM

May 08, 2014

Glyph Lefkowitz

The Report Of Our Death

tulips

Lots of folks are very excited about the Tulip project, recently released as the asyncio standard library module in Python 3.4.

This module is potentially exciting for two reasons:

  1. It provides an abstract, language-blessed, standard interface for event-driven network I/O. This means that instead of every event-loop library out there having to implement everything in terms of every other event-loop library’s ideas of how things work, each library can simply implement adapters from and to those standard interfaces. These interfaces substantially resemble the abstract interfaces that Twisted has provided for a long time, so it will be relatively easy for us to adapt to them.
  2. It provides a new, high-level coroutine scheduler, providing a slightly cleaned-up syntax (return works!) and more efficient implementation (no need for a manual trampoline, it’s built straight into the language runtime via yield from) of something like inlineCallbacks.

However, in their understandable enthusiasm, some observers of Tulip’s progress – links withheld to protect the guilty – have been forecasting Twisted’s inevitable death, or at least its inevitable consignment to the dustbin of “legacy code”.

At first I thought that this was just sour grapes from people who disliked Twisted for some reason or other, but then I started hearing this as a concern from users who had enjoyed using Twisted.

So let me reassure you that the idea that Twisted is going away is totally wrong. I’ll explain how.

The logic that leads to this belief seems to go like this:

Twisted is an async I/O thing, asyncio is an async I/O thing. Therefore they are the same kind of thing. I only need one kind of thing in each category of thing. Therefore I only need one of them, and the “standard” one is probably the better one to depend on. So I guess nobody will need Twisted any more!

The problem with this reasoning is that “an async I/O thing” is about as specific as “software that runs on a computer”. After all, Firefox is also “an async I/O thing” but I don’t think that anyone is forecasting the death of web browsers with the release of Python 3.4.

Which Is Better: OpenOffice or Linux?

Let’s begin with the most enduring reason that Twisted is not going anywhere any time soon. asyncio is an implementation of a transport layer and an event-loop API; it can move bytes into and out of your application, it can schedule timed calls to happen at some point in the future, and it can start and stop. It’s also an implementation of a coroutine scheduler; it can interleave apparently sequential logic with explicit yield points. There are also some experimental third-party extension modules available, including an event-driven HTTP server and client, and the community keeps building more stuff.

In other words, asyncio is a kernel for event-driven programming, with some applications starting to be developed.

Twisted is also an implementation of a transport layer and an event-loop API. It’s also got a coroutine scheduler, in the form of inlineCallbacks.

Twisted is also a production-quality event-driven HTTP server and client including its own event-driven templating engine, with third-party HTTP addons including server microframeworks, high-level client tools, API construction kits, robust, two-way browser communication, and automation for usually complex advanced security features.

Twisted is also an SSH client, both an API and a command-line replacement for OpenSSH. It’s also an SSH server which I have heard some people think is OK to use in production. Again, the SSH server is both an API and again as a daemon replacement for OpenSSH. (I'd say "drop-in replacement" except that neither the client nor server can parse OpenSSH configuration files. Yet.)

Twisted also has a native, symmetric event-driven message-passing protocol designed to be easy to implement in other languages and environments, making it incredibly easy to develop a custom protocol to propagate real-time events through multiple components; clients, servers, embedded devices, even web browsers.

Twisted is also a chat server you can deploy with one shell command. Twisted is also a construction kit for IRC bots. Twisted is also an XMPP client and server library. Twisted is also a DNS server and event-driven DNS client. Twisted is also a multi-protocol integrated authentication API. Twisted is also a pluggable system for creating transports to allow for third-party transport components to do things like allow you to run as a TOR hidden service with a single command-line switch and no modifications to your code.

Twisted is also a system for processing streams of geolocation data including from real hardware devices, via serial port support.

Twisted also natively implements GUI integration support for the Mac, Windows, and Linux.

I could go on.

If I were to include what third-party modules are available as well, I could go on at some considerable length.

The point is, while Twisted also has an existing kernel – largely compatible, at a conceptual level at least, with the way asyncio was designed – it also has a huge suite of functionality, both libraries and applications. Twisted is OpenOffice to asyncio’s Linux.

Of course this metaphor isn’t perfect. Of course the nascent asyncio community will come to supplant some of these things with other third-party tools. Of course there will be some duplication and some competing projects. That’s normal, and even healthy. But this is not a winner-take-all existential Malthusian competition. Being able to leverage the existing functionality within the Twisted and Tornado ecosystems – and vice versa, allowing those ecosystems to leverage new things written with asyncio – was not an accident, it was an explicit, documented design goal of asyncio.

Now And Later

Python 3 is the future of Python.

While Twisted still has a ways to go to finish porting to Python 3, you will be able to pip install the portions that already work as of the upcoming 14.0 release (which should be out any day now). So contrary to some incorrect impressions I’ve heard, the Twisted team is working to support that future.

(One of the odder things that I’ve heard people say about asyncio is that now that Python 3 has asyncio, porting Twisted is now unnecessary. I'm not sure that Twisted is necessary per se – our planet has been orbiting that cursed orb for over 4 billion years without Twisted’s help – but if you want to create an XMPP to IMAP gateway in Python it's not clear to me how having just asyncio is going to help.)

However, while Python 3 may be the future of Python, right now it is sadly just the future, and not the present.

If you want to use asyncio today, that means foregoing the significant performance benefits of pypy. (Even the beta releases of pypy3, which still routinely segfault for me, only support the language version 3.2, so no “yield from” syntax.) It means cutting yourself off from a significant (albeit gradually diminishing) subset of available Python libraries.

You could use the Python 2 backport of Tulip, Trollius, but since idiomatic Tulip code relies heavily on the new yield from syntax, it’s possible to write code that works on both, but not idiomatically. Trollius actually works on Python 3 as well, but then you miss out on one of the real marquee features of Tulip.

Also, while Twisted has a strict compatibility policy, asyncio is still marked as having provisional status, meaning that unlike the rest of the standard library, its API may change incompatibly in the next version of Python. While it’s unlikely that this will mean major changes for asyncio, since Python 3.4 was just released, it will be in this status for at least the next 18 months, until Python 3.5 arrives.

As opposed to the huge laundry list of functionality above, all of these reasons will eventually be invalidated, hopefully sooner rather than later; if these were the only reasons that Twisted were going to stick around, I would definitely be worried. However, they’re still reasons why today, even if you only need the pieces of an asynchronous I/O system that the new asyncio module offers, you still might want to choose Twisted’s core event loop APIs. Keep in mind that using Twisted today doesn’t cut you off from using asyncio in the future: far from it, it makes it likely that you will be able to easily integrate whatever new asyncio code you write once you adopt it. Twisted’s goal, as Laurens van Houtven eloquently explained it this year in a PyCon talk, is to work with absolutely everything, and that very definitely includes asyncio and Python 3.

My Own Feelings

I feel like asyncio is a step forward for Python, and, despite the dire consequences some people seemed to expect, a tremendous potential benefit to Twisted.

For years, while we – the Twisted team – were trying to build the “engine of your Internet”, we were also having to make a constant, tedious sales pitch for the event-driven model of programming. Even today, we’re still stuck writing rambling, digressive rants explaining why you might not want three threads for every socket, giving conference talks where we try to trick the audience into writing a callback, and trying to explain basic stuff about networking protocols to an unreceptive, frustrated audience.

This audience was unreceptive because the broader python community has been less than excited about event-driven networking and cooperative task coordination in general. It’s a vicious cycle: programmers think events look “unpythonic”, so they write their code to block. Other programmers then just want to make use of the libraries suitable to their task, and they find ones which couple (blocking) I/O together with basic data-processing tasks like parsing.

Oddly enough, I noticed a drop in the frequency that I needed to have this sort of argument once node.js started to gain some traction. Once server-side Python programmers started to hear all the time about how writing callbacks wasn't a totally crazy thing to be doing on the server, there was a whole other community to answer their questions about why that was.

With the advent of asyncio, there is functionality available in the standard library to make event-driven implementations of things immediately useful. Perhaps even more important than this functionality, there is guidance on how to make your own event-driven stuff. Every module that is written using asyncio rather than io is a module that at least can be made to work natively within Twisted without rewriting or monkeypatching it.

In other words, this has shifted the burden of arguing that event-driven programming is a worthwhile thing to do at all from Twisted to a module in the language’s core.

While it’ll be quite a while before most Python programmers are able to use asyncio on a day-to-day basis, its mere existence justifies the conceptual basis of Twisted to our core constituency of Python programmers who want to put an object on a network. Which, in turn, means that we can dedicate more of our energy to doing cool stuff with Twisted, and we can dedicate more of the time we spend educating people to explaining how to do cool things with all the crazy features Twisted provides, rather than explaining why you would even want to write all these weird callbacks in the first place.

So Tulip is a good thing for Python, a good thing for Twisted, I’m glad it exists, and it doesn't make me worried at all about Twisted’s future.

Quite the opposite, in fact.

by Glyph at May 08, 2014 08:00 AM

May 05, 2014

Moshe Zadka

Minimum wage, efficient markets and BATNA

[I keep writing the same content over and over, and wanted a canonical place to put it.]

The market for low-skill labor is not a good market because of differing BATNAs (Best Alternatives To a Negotiated Agreement). The BATNA for the marginal employee at an employer is “starve to death”; the BATNA for a marginal employer is “be slightly less efficient”. Minimum wage is designed to work around the BATNA issue, and so there is an optimum level for the minimum wage. Lobbying will tend to keep the minimum wage below that level, and so in general, increasing the minimum wage by a little bit is usually a good idea because it brings us closer to the optimum.

This is a better formalization of the argument typically given as “give people money, they’ll spend it and more jobs are created”. The naive formalization of that argument ends up being a broken windows fallacy, with the usual counter-arguments available regarding opportunity cost; it also proves too much, since it would support raising the minimum wage arbitrarily high – why not raise the minimum wage to $1M/hr?

Phrasing it as regulation around a BATNA problem shows that there are two ways to solve it: we can keep regulating around the BATNA, with the usual limitations of regulations (e.g., the flat-until-minimum-wage structure means that workers who are “worth” minimum wage + $0.5/hr will, because the BATNA is still there, be making minimum wage). An alternative is to regulate the BATNA away by making sure that people have a different BATNA — in other words, a social safety net. The simplest and easiest social safety net is Basic Guaranteed Income. If we had BGI, we would not need minimum wage, and we would make it easy for businesses to pay employees exactly what they are marginally worth — no more, no less — with no further regulation.

Me, personally? I think instead of regulating the market to work around the BATNA problem, we should fix the goddam BATNA problem (I know, call me crazy). If the BATNA for the marginal employee is “have to subsist on Basic Guaranteed Income”, we won’t need minimum wage laws and all the complexity that comes with them (especially around tipping and enforcement).

If it were up to me, we would be teaching about BATNA in grade school, because it’s a concept that does a lot to fill the gaps that supply-and-demand leaves in understanding how economics works.

by moshez at May 05, 2014 05:35 PM

May 04, 2014

Jp Calderone

Update on Divmod Imaginary

Recently Glyph and I have been tinkering with Imaginary. For anyone reading who doesn't know, Imaginary is the current iteration of the project which is the reason the Twisted project started. It's a simulation kernel geared towards the construction of multiplayer adventure-style games - imagine something in the ballpark of Zork or CircleMUD or a piece of interactive fiction.

The last time I worked on Imaginary I was trying to understand a bug that I thought was in the new "obtain" system. I made some progress, mostly on documenting how the "obtain" system works, but I eventually got lost in the tall grass trying to understand how exactly the implementation led to the bug and what an appropriate fix might be. It was nice to get some of that documentation written but I eventually found myself making no progress and got frustrated and gave up.

So it is very nice to have Glyph participating again (he did write the "obtain" system in the first place, after all).

"But wait," I hear you ask, "what is this obtain thing?"

As I mentioned, Imaginary is a simulation kernel. One of its primary responsibilities is to provide an object model for the representation of the simulation scenario. Another is to offer tools for interacting with that object model. "obtain" is the name of one of the central APIs Imaginary offers. It lets an application developer using Imaginary find objects in the simulation. For example, when the weather system in your simulation decides it is time to rain on the plains in Spain it relies on "obtain" to find any actors (or inanimate objects) on those plains in order to drop some water on them.

So, the problem? Imagine that Alice and Bob are in a room together. Alice is wearing a fedora hat. When Bob looks around the room he sees Alice is there. He may also see that Alice is wearing a hat. The bug is that Bob will see the hat as if it were sitting on the floor instead of on Alice's head.

Put another way, the world would be described to Bob like this:

You are in a room. Alice is here. A fedora hat is here.

When a more correct description would be more like this:

You are in a room. Alice is here. She is wearing a fedora hat.

This had me quite stumped. I was chasing graph nodes and graph edges through various graph traversal functions. There's quite a bit of extra data associated with each node and edge, mostly for describing what useful simulation behavior each has (for example, defining what happens if the object ever gets wet because it has started raining), and having to sort through all of this didn't make the job any easier. I eventually concluded the graph traversal logic was just wrong in some terribly subtle way and gave up.

After a long break, Glyph and I took a look at this code together (initially at PyCon 2014 and a couple times since as well). We started out just by going over the basics of how the graph traversal code works and why it's useful for it to work this way. This turned out to be a really good idea as it pointed out some misunderstandings I had about the intent of the system. Glyph also had the great insight that we should actually be paying more attention to the application code using obtain rather than the implementation of obtain itself. This was great because it looks like the real culprit is mostly that application code. It's getting back lots of useful results and then misinterpreting them.

The misinterpretation goes something like this.

  • To describe Bob's surroundings to him, the "look" action does an "obtain" call on Bob's location. It specifies that only results for visible things should be returned from the call.
  • "obtain" walks around the simulation graph and discovers that Alice exists and from Alice discovers there's a fedora. Because of quirks in the way hats are implemented it actually discovers the fedora twice via different edges in the graph (this is perhaps also a bug but not one I'm going to try to discuss now).
  • The "look" action inspects the graph nodes that came back from the "obtain" call. When inspecting the fedora for the first time it notices it is being worn by Alice and doesn't try to present it as if it were simply present in the room. When inspecting the fedora for the second time, though, it forgets this. And this is where the bug arises.

My mistake was thinking that the fedora being present twice was a bug in "obtain". This is a necessary feature, though. While we'll probably go and fix the bug with hats that makes the fedora show up twice, the "look" action could have the same problem in other situations. For example, a mirror in the room might let Bob see both Alice and the hat in two places. It needs to be able to keep the graph paths to objects straight to present cases like this correctly.

So far we haven't implemented the solution to this problem completely. We have a pretty good start, though, involving making the "look" action more explicitly operate on paths instead of only the objects at the *end* of each path. And actually understanding all the pieces of the bug helps a lot.
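
To make that idea a bit more concrete, here is a purely hypothetical Python sketch (none of these names are Imaginary's real API) of what it means to describe results according to the path that reached them rather than just the object at the end of each path:

from collections import namedtuple

# A "path" records not just the target object but how we reached it.
Path = namedtuple("Path", ["target", "worn_by"])

def describe(paths):
    lines = []
    described = set()
    for path in paths:
        if path.target in described:
            continue                   # same object reached twice: describe it once,
        described.add(path.target)     # keeping the context of the first path we saw
        if path.worn_by is not None:
            lines.append("%s is wearing a %s." % (path.worn_by, path.target))
        else:
            lines.append("A %s is here." % path.target)
    return lines

# The fedora is discovered twice, once via Alice and once directly; keeping
# the path lets us present it as worn rather than as lying on the floor.
print(describe([Path("fedora hat", "Alice"), Path("fedora hat", None)]))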

by Jean-Paul Calderone (noreply@blogger.com) at May 04, 2014 03:01 PM

April 14, 2014

Future Foundries

Crochet 1.2.0, now with a better API!

Crochet is a library for using Twisted more easily from blocking programs and libraries. The latest version, released here at PyCon 2014, includes a much improved API for calling into Twisted from threads. In particular, a timeout is passed in - if it is hit the underlying operation is cancelled, and an exception is raised. Not all APIs in Twisted support cancellation, but for those that do (or APIs you implement) this is a really nice feature. You get high level timeouts (instead of blocking sockets' timeout-per-socket-operation) and automatic cleanup of resources if something takes too long.

#!/usr/bin/python
"""
Do a DNS lookup using Twisted's APIs.
"""
from __future__ import print_function

# The Twisted code we'll be using:
from twisted.names import client

from crochet import setup, wait_for
setup()


# Crochet layer, wrapping Twisted's DNS library in a blocking call.
@wait_for(timeout=5.0)
def gethostbyname(name):
"""Lookup the IP of a given hostname.

Unlike socket.gethostbyname() which can take an arbitrary amount
of time to finish, this function will raise crochet.TimeoutError
if more than 5 seconds elapse without an answer being received.
"""
d = client.lookupAddress(name)
d.addCallback(lambda result: result[0][0].payload.dottedQuad())
return d


if __name__ == '__main__':
    # Application code using the public API - notice it works in a normal
    # blocking manner, with no event loop visible:
    import sys
    name = sys.argv[1]
    ip = gethostbyname(name)
    print(name, "->", ip)

by Itamar Turner-Trauring (noreply@blogger.com) at April 14, 2014 02:52 PM

April 02, 2014

Duncan McGreggor

Hash Maps in LFE: Request for Comment

As you may have heard, hash maps are coming to Erlang in R17. We're all pretty excited about this. The LFE community (yes, we have one... hey, being headquartered on Gutland keeps us lean!) has been abuzz with excitement: do we get some new syntax for Erlang maps? Or just record-like macros?

That's still an open question. There's a good chance that if we find an elegant solution, we'll get some new syntax.

In an effort to (re)start this conversation and get us thinking about the possibilities, I've drawn together some examples from various Lisps. At the end of the post, we'll review some related data structures in LFE... as a point of contrast and possible guidance.

Note that I've tried to keep the code grouped in larger gists, not split up with prose wedged between them. This should make it easier to compare and contrast whole examples at a glance.

Before we dive into the Lisps, let's take a look at maps in Erlang:

Erlang Maps

Common Lisp Hash Tables

Racket Hash Tables

Clojure Hash Maps

Shen Property Lists

OpenLisp Hash Tables

LFE Property Lists

LFE orddicts

I summarized some very basic usability and aesthetic thoughts on the LFE mail list, but I'll restate them here:
  • Erlang syntax really is quite powerful; I continue to be impressed.
  • Clojure was by far the most enjoyable to work with... however, doing something similar in LFE would require quite a few additions to the language or macro infrastructure. My concern here is that we'd end up with a Clojure clone rather than something distinctly Erlang-Lispy.
  • Racket had the fullest and most useful set of hash functions (and best docs).
  • Chicken Scheme was probably second.
  • Common Lisp was probably (I hate to say it) the most awkward of the bunch. I'm hoping we can avoid pretty much everything the way it was done there :-/
One of the things that makes Clojure such a joy to work with is the unified aspect of core functions and how one uses these to manipulate data structures of different types. Most other implementations have functions/macros that are dedicated to working with just maps. While that's clean and definitely has a strong appeal, Clojure reflects a great deal of elegance.

That being said, I don't think today is the day to propose unifying features for LFE/Erlang data types ;-) (To be honest, though, it's certainly in the back of my mind... this is probably also true for many folks on the mail list.)

Given my positive experience with maps (hash tables) in Racket, and Robert's initially proposed functions like map-new and map-set, I'd encourage us to look to Racket for some inspiration:
Additional thoughts:
  • "map" has a specific meaning in FPs (: lists map), and there's a little bit of cognitive dissonance for me when I look at map-*
  • In my experience, applications generally don't have too many records; however, I've known apps with 100s and 1000s of instances of hash maps; as such, the idea of creating macros for each hash-map (e.g., my-map-get, my-map-set, ...) terrifies me a little. I don't believe this has been proposed, and I don't know enough about LFE's internals (much less, Erlang's) to be able to discuss this with any certainty.
  • The thought did occur that we could put all the map functions in a module e.g., (: maps new ... ), etc. I haven't actually looked at the Erlang source and don't know how maps are implemented in R17 yet (nor how that functionality is presented to the developer). Obviously, once I have, this point will be more clear for me.
With this done, I then did a thought experiment in potential syntax additions for LFE. Below are the series of gists that demonstrate this.

Looking at this Erlang syntax:

My fingers want to do something like this in LFE:

That feels pretty natural, from the LFE perspective. However, it looks like it might require hacking on the tuple-parsing logic (or splitting that into two code paths: one for regular tuple-parsing, and the other for maps...?).

The above syntax also lends itself nicely to these:

The question that arises for me is "how would we do this when calling functions?" Perhaps one of these:

Then, for Joe's other example:

We'd have this for LFE:

Before we pattern match on this, let's look at Erlang pattern matching for tuples:

Compare this with pattern matching elements of a tuple in LFE:

With that in our minds, we turn to Joe's matching example against a specific map element:

And we could do the same in LFE like this:

I'm really uncertain about add-pair and update-pair, both the need for them and the names. Interested to hear from others who know how map is implemented in Erlang and the best way to work with that in LFE...

by Duncan McGreggor (noreply@blogger.com) at April 02, 2014 05:33 PM

Moshe Zadka

GPS Navigators, morality and the human utility function

GPS navigators used to be simple. You plug in the desired address, they calculate the quickest route. At some point, many started getting a feed of traffic information, so that they can route against big jams. Still pretty simple.

Enter Waze. Waze hyper-micro-optimizes for traffic conditions, using other users as roaming sensors and, probably, complicated AI on the back-end to predict what route would take how long. This means Waze learns from experience how fast people drive where, under what conditions — and, as far as I can tell, takes all this information into account. This means that on a residential street that is constantly being sped through, Waze will learn that it is a fast path, and will start directing drivers through it: drivers who are using Waze, which means drivers who want to get to their destination as fast as possible.

There are a lot of valid reasons to want to get somewhere fast. Maybe you scheduled a meeting. Maybe someone is having an emergency. Maybe you just want to cuddle your kid before she goes to bed. Waze does not care. It tries to get users to their destination as fast as possible, no matter which residential streets they speed through. It has no concerns for safety, either the driver’s or pedestrians’, neatly absolving itself of all responsibility by “just suggesting”. What if a nice street starts being higher up on the Waze paths, and this lowers property values? Waze has no concerns for that either.

I’m not picking on Waze especially. Their only fault was that they took the optimization criteria that all GPS navigators use (find fastest path) and used superior sensors and superior algorithms to implement them better. Whoever wins the “navigator wars” would have had to do at least as well, so in that sense, there is nothing special about Waze except hiring smart people who made good decisions.

But this does show how AI moves forward: smart engineers optimize the hell out of whatever target function they are given. At some point, it starts having real costs that humans would understand, but an AI would not care about, because the AI cares about optimizing the target function. The only way to solve it is to figure out what humans care about in terms computers can understand, and make sure that any AI takes those into account — even if it is not smart enough to self-modify, it can do plenty of damage without it.

In other words, the Friendliness apocalypse, if we include not just existential risk but also a general raising of human risk factors, is not off in some nebulous future that begins once we figure out self-modification. We are in the middle of it, and we should make sure that we take it into account when building AI — even if that AI is limited to “just suggesting things”, because humans are suggestible, and have learned to trust computers’ suggestions.

by moshez at April 02, 2014 04:03 PM

March 27, 2014

Glyph Lefkowitz

Panopticon Rift

Greg Price expresses a common opinion here:

I myself had a similar reaction; despite not being particularly invested in VR specifically (I was not an Oculus backer) I felt pretty negatively about the fact that Facebook was the acquirer. I wasn't quite sure why, at first, and after some reflection I'd like to share my thoughts with you on both why this is a bad thing and also why gamers, in particular, were disturbed by it.

The Oculus Rift really captured the imagination of the gaming community's intelligentsia. John Carmack's imprimatur alone was enough to get people interested, but the real power of the Oculus was to finally deliver on the promise of all those unbelievably clunky virtual reality headsets that we've played around with at one time or another.

Virtual Boy

The promise of Virtual Reality is, of course, to transport us so completely to another place and time that we cease to even be aware of the real world. It is that experience, that complete and overwhelming sense of being in an imagined place, that many gamers are looking for when they sit down in front of an existing game. It is the aspiration to that experience that makes "immersive" such a high compliment in game criticism.

Personally, immersion in a virtual world was the beginning of my real interest in computers. I've never been the kind of gamer who felt the need to be intensely competitive. I didn't really care about rules and mechanics that much, at least not for their own sake. In fact, the term "gamer" is a bit of a misnomer - I'm more a connoisseur of interactive experiences. The very first "game" I remember really enjoying was Zork. Although Zork is a goal-directed game you can "win", that didn't interest me. Instead, I enjoyed wandering around the environments it provided, and experimenting with different actions to see what the computer would do.

Computer games are not really just one thing. The same term, "games", encompasses wildly divergent experiences including computerized Solitaire, Silent Hill, Dyad, and Myst. Nevertheless, I think that pursuit of immersion – of really, fully, presently paying attention to an interactive experience – is a primary reason that "gamers" feel the need to distinguish themselves (ourselves?) from the casual computing public. Obviously, the fans of Oculus are among those most focused on immersive experiences.

Gamers feel the need to set themselves apart because computing today is practically defined by a torrential cascade of things that we're only barely paying attention to. What makes computing "today" different than the computing of yesteryear is that computing used to be about thinking, and now it's about communication.

The advent of social media has left us not only focused on communication, but focused on constant, ephemeral, shallow communication. This is not an accident. In our present economy there is, after all, no such thing as a "social media business" (or, indeed, a "search engine business"); there are only ad agencies.

The purveyors of social media need you to be engaged enough with your friends that you won't leave their sites, so there is some level of entertainment or interest they must bring to their transactions with you, of course; but they don't want you to be so engaged that a distracting advertisement would be obviously crass and inappropriate. Lighthearted banter, pictures of shiba inus, and shallow gossip are fantastic fodder for this. Less suitable is long, soul-searching writing that explains and examines your feelings.

An ad for cat food might seem fine if you're chuckling at a picture of a dog in a hoodie saying "wow, very meme, such meta". It's less likely to drive you through to the terminus of that purchase conversion funnel if you're intently focused on supporting a friend who is explaining to you how a cancer scare drove home how they're doing nothing with their life, or how your friend's sibling has run away from home and your friend doesn't know if they're safe.

Even if you're a highly social extrovert, the dominant emotion that this torrent of ephemeral communication produces is, at best, sarcastic amusement. More likely, it produces constant anxiety. We do not experience our better selves when we're not really paying focused attention to anything. As Community Season 5 Episode 8 recently put it, somewhat more bluntly: "Mark Zuckerberg is Fidel Castro in flip-flops."

I think that despite all the other reasons we're all annoyed - the feelings of betrayal around the Kickstarter, protestations of Facebook's creepiness, and so on - the root of the anger around the Facebook acquisition is the sense that this technology with so much potential to reverse the Balkanization of our attention has now been put directly into the hands of those who created the problem in the first place.

So now, instead of looking forward to a technology that will allow us to visit a world of pure imagination, we now eagerly await something that will shove distracting notifications and annoying advertisements literally an inch away from our eyeballs. Probably after charging us several hundred dollars for the privilege.

It seems to me that's a pretty clear justification for a few hundred negative reddit comments.

by Glyph at March 27, 2014 09:25 AM

March 20, 2014

Duncan McGreggor

lfetool T-Shirts?!

So, yeah... waaaaay too early for a T-shirt; the code has just gone from 0.1.x to 0.2.x. But, how could we resist: the project logo is retro like woodblock prints!

Earlier today, lfetool converted to using plugins (easier for contributors to play!), and hopefully later tonight YAWS + Bootstrap support will land. Regardless, there's tons more work to do, and what's more motivating for new work than a T-shirt? As you can see, we had no choice.

So let's get some. T-shirts, that is.

The lfetool logo is on the front, and Robert Virding's example Fibonacci lfescript code is on the back (with the lfetool command that generates the script at the top). Here's the front of the T-shirt:




And here's the back:



We've got a CustomInk sign-up sheet that you can add your name to if you want a shirt. I'll coordinate with folks individually, once we meet our minimum number (we're going to need to use PayPal with payment upfront). Due to the colors of the source code on the back, the minimum order is 15. This will put the shirts at $22 apiece + $5 to send it to you. I've just ordered 2; 13 more to go!


by Duncan McGreggor (noreply@blogger.com) at March 20, 2014 03:58 AM

March 18, 2014

Duncan McGreggor

lfetool 0.2 Released

This release of lfetool has a bunch of new interesting features -- too much to cover in a tweet, so it's getting a little blog post ;-)

Here are the high-level bits that should make users' lives better:
  • New yaws project type, provided for building basic web sites; includes the exemplar project as a dependency for generating HTML with S-expressions.
  • Every project now gets TravisCI support by default.
  • Unit tests, integration tests, and systems tests now each have their own dir.
  • Tests now get run in parallel for faster run times.
  • Added support for running on Linux and *BSD systems.
  • Added unit tests for lfetool itself (shUnit) (for checking skeleton-generation and tool options).
  • lfetool has a new logo :-)
Note that the version jumped from 0.1.2 to 0.2.0 due to the strong improvements in quality and feature set of the tool. Forthcoming features include:
  • Support for an e2 service skeleton project.
  • Support for a YAWS-RESTful service skeleton project.
  • Support for YAWS embedded in a skeleton OTP application.
  • Support for YAWS + Exemplar + Bootstrap projects
  • Support for slide decks with Exemplar + Reveal.js
If you've got more ideas for the lfetool wishlist, post a comment here or send an email to the LFE mailing list.


by Duncan McGreggor (noreply@blogger.com) at March 18, 2014 11:29 PM

March 15, 2014

Future Foundries

Signal/GC-safe cross-thread queueing in Python

I've just released a new version of Crochet, and one of the bugs fixed involves an interesting problem - reentrancy. In this particular case I'm talking about garbage collection and signal reentrancy - any function your Python program is running may be interrupted at any time (on bytecode boundaries) to do garbage collection or handle a signal. A signal handler can run arbitrary Python code, as can GC via __del__ or weakref callbacks. Once that code finishes running, control is returned to the original location in the code.

Unfortunately, due to a bug in Python, Queue.put() can deadlock in the following situation:
  1. As part of calling Queue.put(), a thread acquires the Queue's lock. This lock does not support being acquired more than once by the same thread.
  2. GC or a signal handler interrupts the function call.
  3. If the GC or signal handler code then also does Queue.put(), it will try to acquire the lock again... and since it's already locked it blocks waiting for the lock to be released.
  4. Since the signal handler/GC code is now blocked, control is never returned to the original code, so the lock is never released there.
  5. The thread is now deadlocked and will never recover.
Unfortunately there was no way to prevent the Queue.put() in GC; the Queue accepts log messages, and this is a GC-caused logging message coming out of code that is not under Crochet's control.
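
For illustration, this is roughly the shape of the hazard (a sketch of the pattern, not a reliable reproduction of the race, and not Crochet's actual code):

import signal
try:
    import queue           # Python 3
except ImportError:
    import Queue as queue  # Python 2

log_queue = queue.Queue()

def emergency_log(signum, frame):
    # If this handler interrupts a log_queue.put() that already holds the
    # Queue's internal lock on this same thread, this put() blocks forever.
    log_queue.put("message from signal handler")

signal.signal(signal.SIGUSR1, emergency_log)  # SIGUSR1: Unix only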

The obvious short-term solution is to reimplement a simplified Queue using Python's RLock, which allows the same thread to acquire the lock multiple times. But... RLock isn't reentrancy-safe either, due to another bug in Python! I could wrap OS-specific reentrant lock implementations, but that's a bigger project than I want to start.

The solution I ended up with (suggested by Jean-Paul Calderone I believe) was giving up on using threading primitives to communicate across threads. Instead I used the self-pipe trick. That is, the thread uses select() (or poll() or epoll()) to wait on one end of the pipe; to wake the thread up and tell it to check for new messages to process we simply write a byte to the other end of the pipe. Since Crochet uses Twisted, I had a pre-written event loop that already implemented self-pipe waking, and now the logging thread runs another Twisted reactor in addition to the regular reactor thread.
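
Here is a minimal sketch of the self-pipe trick in plain Python (the names and structure here are mine, not Crochet's; Crochet gets this behavior for free from Twisted's reactor):

from __future__ import print_function
import os, select, threading, time
from collections import deque

read_fd, write_fd = os.pipe()
messages = deque()   # appends to a deque are atomic in CPython and take no
                     # Python-level lock, so a reentrant append cannot deadlock

def worker():
    while True:
        select.select([read_fd], [], [])   # sleep until someone wakes us
        os.read(read_fd, 4096)             # drain the wake-up bytes
        while messages:
            print("processing:", messages.popleft())

def send(message):
    # Safe to call from signal handlers or __del__: no locks are acquired.
    messages.append(message)
    os.write(write_fd, b"x")               # wake the worker thread

t = threading.Thread(target=worker)
t.daemon = True
t.start()
send("hello")
time.sleep(0.1)                            # give the worker a moment to run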

As far as I can tell this works, but it feels a bit like overkill. I'd welcome suggestions for other solutions.

    by Itamar Turner-Trauring (noreply@blogger.com) at March 15, 2014 12:45 PM

    March 14, 2014

    Duncan McGreggor

    The Future of LFE

    Erlang Factory

    First of all, Erlang Factory this year was just phenomenal: great talks, great energy, and none of the feared/anticipated "acquisition feeding frenzy." Instead, everyone was genuinely happy for WhatsApp and Cloudant, and with celebrations aside, they were ready to get on with the conference and dive into the tech :-)

    And gosh, there was a bunch of good tech. If you don't believe me, check out the schedule. Also on that page are the speaker pics. For talks that have video or slides pushed up, the speaker pic is visually annotated and linked.

    There's so much good stuff there -- I've definitely got my watching queue set up for the next few weeks ...

    LFE Presentation

    I gave a presentation on LFE which covered everything from motivational basics for using a Lisp in the 21st century, gave a taste of LFE in small chunks, and then took folks on a quick tour of creating projects in LFE. There was also some dessert of fun side/research projects that are currently in-progress. The slides for the presentation are here; also the slide source code is available (related demo project). I'd also like to give a shout out to the Hoplon crew for their help in making sure I could create this presentation in a Lisp (Clojure), and not HTML ;-) (It uses a Hoplon-based Reveal.js library.)

    The Good Stuff

    After the presentation, several of us chatted about Lisp and Erlang for a while. Robert and I later continued along these lines after heading over to the quiet of the ever-cool Marines Memorial 11th-floor library (complete with fireplace). Here we sketched out some of the interesting areas for future development in LFE. I'm not sure if I'm remembering everything (I've also added stuff that we didn't discuss in the library, like Sean Chalmers' recent experiments with types; blog and discussion):
    • getting the REPL to the point where full dev can happen (defining functions, macros, and records in the LFE shell)
    • adding macros (maybe just one) for easier use of Mnesia in LFE 
    • discussing the possibility of an LFE stdlib
    • gathering some of the best funcs and macros in the wild for inclusion in an LFE stdlib
    • possibly supporting both (: module func ...) and (module:func ...) syntax
    • possibly getting spec and type support in LFE
    • producing an LFE 1.0 release
    Additional efforts planned:
    • building out an LFE rebar plugin
    • examining erlang.mk as another possible option
    • starting work on an LFE Cookbook
    • creating demos of LFE on Erjang
    • creating demos of LFE-Clojure interop via JInterface
    • creating more involved YAWS/REST examples with LFE
    • explore the possibility of an SOA tutorial with LFE + YAWS 
    • async tasks in LFE with rpc lib 
    • monad tutorial for LFE (practical, not conceptual)
    • releasing a planner demo
    • finishing the genetic programming examples
    • LFE style guide
    • producing a stdlib contribution guideline
    • continued work on the LFE user guide
    I'll dump all these into github issues so they'll be easier to track.

    If this stuff is exciting to you, feel free to jump into the low-volume discussions we have on the mail list.

    More soon!


    by Duncan McGreggor (noreply@blogger.com) at March 14, 2014 08:11 PM

    March 09, 2014

    Future Foundries

    Twisted on Python 3 now pip installable

    The subset of Twisted that has been ported to Python 3 can now be pip installed. By either pointing at a version control URL or requiring Twisted 14.0 (once it's released), you can now have Twisted as a dependency for your Python 3 packages.

    Here's a slightly edited version of my Travis-CI config for Crochet, demonstrating how I run unit tests on both Python 2 and Python 3 versions of Twisted (slightly tricky because the trial test runner hasn't been ported yet):

    language: python

    env:
      - TWISTED=Twisted==13.0 RUNTESTS=trial
      - TWISTED=Twisted==13.1 RUNTESTS=trial
      - TWISTED=Twisted==13.2 RUNTESTS=trial

    python:
      - 2.6
      - 2.7
      - pypy

    matrix:
      include:
        - python: 3.3
          env: TWISTED=git+https://github.com/twisted/twisted.git RUNTESTS="python -m unittest discover"

    install:
      - pip install -q --no-use-wheel $TWISTED --use-mirrors
      - python setup.py -q install

    script: $RUNTESTS crochet.tests

    by Itamar Turner-Trauring (noreply@blogger.com) at March 09, 2014 06:27 PM

    March 08, 2014

    Ashwini Oruganti

    Kneel and Disconnect: Getting the fastest connection out of a hostname

    A Caterpillar crawled to me one day and said
    “Oh what the hell goes on inside your swollen head?
    I don’t believe that you can see as much as I
    Now close your eyes and tell me what do you spy?”

    A detailed account of the development of HostnameEndpoint follows, also leading to my talk at PyCon US 2014 in Montreal. During the course of its creation, many wise words were exchanged, thanks to exarkun and glyph (and itamar and tomprince!). I repeat them here, in hopes that it will help make you wiser too, should you aim for that goal. So well, indulge me whilst I rant on.

    Soon after I finished working on a name-resolving TCP IPv6 client endpoint, I heard of a certain crazy endpoint, described (not in as much detail as it is now, but to some extent) in a ticket. In those days, seeing as the description was not too elaborate, and the goals not too clear, any mention of working on it would lead to a discussion such as:

    <exarkun> I suspect it is obvious to anyone with more than ten years network programming experience.
    <glyph> exarkun: Completely. ashfall should hurry up and get ten years of that, then.

    This continued until one day Glyph decided to write a nice spec for it, and that’s where our story begins.

    Bornlivedie

    <kenaan> ashfall r35028 A(branches/gai-endpoint-4859/): Branching to ‘gai-endpoint-4859’ …
    <ashfall> Okay, this one is a little bit scary to start.

    The gist of this endpoint is that it takes a hostname, does a getaddrinfo lookup on it, and picks the first two addresses out of the returned list. Using some time-involving algorithm, it tries connecting to both of them.

    Doing things based on time in Twisted meant using reactor.callLater. That in turn meant using Clock as the reactor in the unit tests, since real time is slow and unpredictable, and tests are ideally fast and predictable.

    This was also the day when my appreciation of the reactor grew.

    My initial understanding of the task at hand, I remember, was that the endpoint would connect to the first address, record how long it took, then connect to the next address, record how long it took, compare the times taken, pick the one that took less time, and call connect one final time. After discussing it further with exarkun, I realized that this doesn’t actually need to record any durations, since we don’t care how long it took, as long as it was first.

    Now, the high-level objective was to establish the connection as quickly as possible, and 3 connects would take a lot of time. So we make the connection attempts in parallel, drop the one that doesn’t connect first and use the one that does connect first. So that there are two connection attempts, one (hopefully) connection success, and one discarded connection attempt. And no third connection attempt.

    And how do we do that?

    “Well, fortunately Twisted is good at doing network operations in parallel”, explained exarkun. For example:

    reactor.connectTCP('1.2.3.4', 456, f)
    reactor.connectTCP('1:2::3:4', 567, f)
    

    For my part I wondered, how is that parallel, with one reactor?

    I learnt the reactor can do many things at once, and that that’s its job, essentially. ”reactor.connectTCP returns immediately, after all. It doesn’t wait for the connection to be established first,” exarkun elaborated. “That’s why we need a factory with a buildProtocol method, so that the reactor can call it once the connection has succeeded. Because by the time it does succeed, the application will have moved on to doing something else.”

    “A secondary objective of the endpoint is to be conservative of network resources,” he added. “If the connection to 1.2.3.4 succeeds quickly enough, then it would be wasteful of network resources to try connecting to 1:2::3:4 as well. That’s why the relevant RFC talks about a 300 ms delay. If the first one is done really fast and you can avoid the second connection attempt, you should. So you won’t actually start both connection attempts at the same time. You’ll start the 2nd one a short while after the first one, if and only if the first one has not succeeded. reactor.callLater is how you would do that, and delayedCall.cancel() is how you would stop a delayed call from happening when the first one succeeds. someDeferred.cancel() is how you would make a connection attempt give up early, if the other connection attempt succeeds.”
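
    Put together, a simplified sketch of that plan might look something like this (hypothetical names, not the real HostnameEndpoint implementation; error handling is mostly elided):

    from twisted.internet import defer

    def connect_fastest(reactor, primary, fallback, factory):
        # Start `primary` right away; only start `fallback` if `primary` has
        # not succeeded within 0.3 seconds. The first success wins, and the
        # losing attempt (or the pending delayed call) is cancelled.
        result = defer.Deferred()
        attempts = []

        def succeeded(protocol, winner):
            if delayed.active():
                delayed.cancel()              # the fallback never needs to start
            for attempt in attempts:
                if attempt is not winner:
                    attempt.cancel()          # give up the losing attempt
            if not result.called:
                result.callback(protocol)
            return protocol

        def attempt(endpoint):
            d = endpoint.connect(factory)
            attempts.append(d)
            d.addCallback(succeeded, d)
            d.addErrback(lambda failure: None)  # swallow the loser's cancellation
            return d

        delayed = reactor.callLater(0.3, attempt, fallback)
        attempt(primary)
        return result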

    But now I wondered, what if 1.2.3.4 takes 254 ms and 1:2::3:4 would have taken, say, 230 ms, had we tried it (but we will never know?). And exarkun said, “[Then] Tough luck for us. We’re not going to try to solve that problem.” Turns out, it’s not as much of a problem as the problem this RFC is aimed at fixing, which is the problem where 1:2::3:4 takes 120 seconds and 1.2.3.4 takes 254 ms, and so a purely serial approach would take 120.254 seconds to succeed, whereas this one will only take 0.254 seconds to succeed. That’s a much more noticeable difference than 0.254 seconds vs 0.230 seconds.

    “I watched nine cats dance on the moon
    A flamingo stalked into my room
    It bowed its head to me and knelt
    To reveal the cards it had dealt

    Now, where does Clock come into the picture? The clock is actually a replacement for the reactor, at least in a very restricted scope. It has a callLater method, but whereas reactor.callLater(5, f) will only call f after roughly 5 seconds have passed, clock.callLater(5, f) will call f the instant you call clock.advance(5). (or any combination of clock.advance calls so that the sum of the arguments passed to all calls is 5 or more). So if you’re writing a unit test for code that does callLater(5, f), the test can complete in fewer than 5 seconds if you pass a Clock instance in for the reactor.

    Using Clock completes with more predictable results, since the reactor really does call f in roughly 5 seconds, and the variation implied by “roughly” can sometimes make a difference (in a bad way) to how the code behaves. The idea is that maybe 5 seconds is close to some other critical amount of time. Perhaps sometimes the reactor gets around to calling f very soon after 5 seconds, perhaps 5.001 seconds. That exercises one code path through the code under test. And perhaps sometimes, the reactor doesn’t get around to calling f until 5.1 seconds have elapsed, meanwhile, something else happened in those extra .099 seconds that didn’t have time to happen in the first case, and so now you’re exercising a second, different code path through the code under test. But this is a single test method, and it should only ever exercise a single code path. Since it is exercising two, now, it can have two different outcomes, and that makes it much less useful. Perhaps one of the outcomes is a success and the other is a failure, so when you run the tests, the results flip flop between good and bad, even though you’re not changing any code. Or perhaps both outcomes are success - for now. And someone breaks one of the code paths accidentally. But they run the test, and it accidentally goes through the other code path, the one they didn’t break, and so the test passes. And they think their code works.

    So, yes. Clock and writing two tests is the solution. Because with Clock, you can write a test where it appears that 5.001 seconds elapse before f is called, and a second test where it appears that 5.1 seconds elapse. (just by doing clock.advance(5.001) in one of them and clock.advance(5.1) in the other)
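
    As a tiny illustration of that (a sketch using twisted.internet.task.Clock directly, not an excerpt from the endpoint's real tests):

    from twisted.internet.task import Clock

    fired = []
    clock = Clock()
    clock.callLater(5, fired.append, "called")

    assert fired == []          # nothing happens until "time passes"
    clock.advance(5.001)        # one test advances just past the threshold...
    assert fired == ["called"]  # ...and a second test could advance(5.1) instead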

    Nomenclature

    An ace, three jacks, two queens, four kings
    Then turned them into burning rings
    The flames jumped out and chased four mice
    Caught by their tails they turned to ice

    The new super-smart TCP endpoint, what should it be called?

    The first name glyph suggested was HostnameEndpoint, because it connects you to a hostname, over whatever protocol is appropriate. Another suggestion was to call it TCPEndpoint, since we haven’t used that name yet. And then it was pointed out that hostnames aren’t part of TCP. Nor even IP. They’re part of DNS. 99% of this endpoint is DNS functionality. So… DNSEndpoint? But you can look up hostnames for use with SCTP or UDP via DNS as well! And this won’t let you do that. And, in fact, this endpoint does not necessarily use DNS: it could be doing name resolution via /etc/hosts, or, it could be doing name resolution using WINS… you don’t know.

    The important lesson to be learnt here is that naming things for how they work is generally worse than naming things for what they do. DNS and TCP are both implementation details (albeit kind of important details in this case); the important thing about this endpoint is that it takes a hostname and gives you a stream-oriented connection.

    So HostnameEndpoint seemed like the best and the least ambiguous choice.

    Deform to form a Star

    Reading the spec, the whole idea sounded (to me, at least) too convoluted and it wasn’t very obvious where to begin. On seeking help, exarkun suggested:

    • Make a list of features it should have.
    • Sort them from simplest to most complex. (the failure ones come before the success ones)
    • Now, write a unit test for the first thing on that list. (i.e., write the test before you have any implementation for the endpoint at all, or at least set up a TestCase subclass that mixes in that helper and implements the method for creating an endpoint)

    Part of the idea with proceeding from simplest to most complicated is that once you’ve done all the simple things, the complicated things generally end up pretty simple. They’re only complicated if you have to tackle all the complexity at once.

    We moved on to talk about what test cases to write first, because… TDD! Starting with the easy cases, I planned to begin by using the EndpointTestCaseMixin - Try making the test methods from that mixin that exercise failures pass. Then the success cases. Perhaps then try a test that exercises the case where getaddrinfo returns one ipv4 address. And then the case where it returns one ipv6 address. And when all that works, then maybe it’s time to start thinking about the interesting cases, where getaddrinfo returns an ipv4 and an ipv6 address, e.g. What if it returns ipv4 then ipv6? What if it returns ipv6 then ipv4? What if the first connection attempt succeeds in < 300ms? What if it succeeds in > 300ms, but before the second? What if the second succeeds before the first? What if they both fail? But start with the simpler cases, and work on them one at a time.

    Deferreds and Raising Exceptions

    A cloud appeared outside my door
    And through the window saw four more
    And on the back of each cloud sat
    Two rainbow smiles in wizard’s hats

    There was a part of the endpoint where I was using “except” to work with a Deferred firing with an exception. Needless to say, the unit test that was to check this was failing, and I wasn’t sure why. This led exarkun to elaborate on how Deferreds work, and where I was going wrong.

    We have Deferreds so that functions can return before they have finished the abstract task they are responsible for. They return a Deferred and the Deferred becomes responsible for indicating the result of the abstract task, once it is complete. This lets multiple tasks proceed concurrently. You can just call one function to start one task, then it will return (before the task is complete), and you can call another function to start another task (which will also return before that task is complete). Now you have two Deferreds for the two tasks, and you can use them to learn the results of those tasks.

    You learn the result by adding a callback to the Deferred. And when the result is available, the callback is called with it.

    Not all results are “successes” though. But for the same reason that these functions can’t return their result directly, neither can they raise an exception directly. By the time they return, they might not have gotten far enough in their abstract task to know that there is going to be an error. So they can still only return a Deferred. So you can also attach error callbacks, or errbacks, to Deferreds as well. An errback is just like a callback, except it is called when the Deferred gets its result and that result is not a success.

    So, to handle results from asynchronous, Deferred-returning functions, you take the returned Deferred and call addCallback(callbackFunction) on it. And to handle error results from asynchronous, Deferred-returning functions, you take the returned Deferred and call addErrback(errbackFunction) on it.

    The errback is ignored if the result is not a failure, just like an “except” statement is ignored if no exception is raised. Similarly, the callback is ignored if the result is a failure.

    Hence, “except” will not work for a Deferred that fires with exception, because the Deferred was returned. If anything was returned at all, an exception could not possibly have been raised. Because of how Python functions work - they can raise or they can return, they cannot do both.
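
    A tiny sketch of what that looks like in practice (the function and messages here are made up for illustration):

    from __future__ import print_function
    from twisted.internet import defer

    def task_that_fails():
        # Returns normally -- with a Deferred that has already fired with a
        # failure -- so try/except around the call would catch nothing.
        return defer.fail(ValueError("something went wrong"))

    d = task_that_fails()
    d.addCallback(lambda result: print("success:", result))
    d.addErrback(lambda failure: print("failure:", failure.getErrorMessage()))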

    Returning Deferreds From Tests

    At one point, I had written a test that was returning a Deferred, and then the whole test case was timing out, never reaching completion. I learnt that since it doesn’t require any external resources, there was no reason this test needed to spin a reactor.

    I asked, then, is returning Deferreds from tests a bad thing?

    “It’s not ‘good’ or ‘bad’ really,” Glyph explained, “but if you’re going to think of it as one of those, let’s think of it as ‘bad’ by default”.

    Returning a Deferred spins the global reactor. You return Deferreds from tests that need to interface with real, external resources or connect to real sockets. In this case you absolutely do not want to do that. You want to very carefully control everything that your HostnameEndpoint is talking to. You want to fire the connection attempt success and failure in a very specific order - in several different orders, as a matter of fact - and to verify that attempts are in progress at certain times and not in progress at other times. That means you want to use Clock and you want to use MemoryReactor (possibly smashed together on one object so that you only need to have one reactor).
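
    For example, a test can hand an endpoint a MemoryReactor and then simply inspect what was attempted (a generic sketch with TCP4ClientEndpoint, not the HostnameEndpoint tests themselves):

    from twisted.test.proto_helpers import MemoryReactor
    from twisted.internet.endpoints import TCP4ClientEndpoint
    from twisted.internet.protocol import Factory, Protocol

    reactor = MemoryReactor()
    endpoint = TCP4ClientEndpoint(reactor, "example.com", 80)
    endpoint.connect(Factory.forProtocol(Protocol))

    # The attempt is recorded in memory instead of touching the network, so
    # the test can assert on exactly what was attempted, and when.
    host, port, factory, timeout, bindAddress = reactor.tcpClients[0]
    assert (host, port) == ("example.com", 80)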

    Then, at precise points in your test, you want to say “this deferred should be fired by now” not “whenever this Deferred happens to fire…”. As you can see, if you do it by returning Deferreds, then your failure mode is “uh… the test didn’t finish”, and not “Oh, of course, I forgot to callback deferred X when event Y happened, and my test failed telling me that that Deferred wasn’t fired”.

    If you return an unfired Deferred from your test, then the test never ends. It just sits around waiting for that Deferred forever, where Trial defines “forever” to be 120 seconds, after which the test times out. There are times when it’s appropriate, but you won’t likely encounter them when testing Twisted itself, I was told.

    Basically if you’re originating the Deferreds that you’re testing, you should definitely not be yielding them, or returning them from your test. Tests which do return Deferreds depend on the things that return the Deferreds that they’re returning being fully tested and having no chance of timing out.

    So as a rule of thumb, only return Deferreds from a test if the test interacts with external stuff, e.g. writing tests that talk to, say, Postgres, and have to do multiple asynchronous operations. And, if you’re doing that, see if there’s a way you could reasonably interact with less external stuff first.

    This is just a special case of a general rule though. The general rule is, your tests should not depend on any more stuff than they need to. Especially, they should not depend on unpredictable external factors that may make your tests behave differently under different conditions. A real reactor is by definition an unpredictable external factor; the whole point is that it’s for talking to unpredictable external stuff.

    Glyph also shared an excellent analogy: “Think of it like an actual nuclear reactor. Before you hook your thermal regulator up to the coolant rods, you really want to have a high level of confidence that it is going to react in the appropriate way when the “uranium is too hot” wire gets some voltage. So you give that wire some voltage using a hand crank, not a critical mass.”

    0 seconds later is still “some amount of time”

    They threw five clocks down on my bed
    The chimes danced out on golden threads
    And turned to footprints on my wall
    Sequined tears began to fall”

    Now, I had a function iterateEndpoint, which iterates through a generator that yields different endpoints for different addresses returned by gai. Now, I needed to keep calling this function for each endpoint, and as per the algorithm we are implementing, we introduce a time lag of 0.3 seconds between two endpoint’s connection attempts. So, each call to iterateEndpoint needs to occur 0.3 seconds after the previous call. I needed a technique to introduce this delay in the calls. There’s reactor.callLater, which did this and I tried using it, and it didn’t work.

    By “didn’t work”, I mean that all my tests were failing, but not failing in a way that would help me identify what’s wrong. But mostly, MemoryReactor.tcpClients was empty, when it should hold some host address, i.e., (in terms of real endpoint) a connection wasn’t being established.

    I blamed callLater for this, because there was at least one connection attempt when I hadn’t used callLater and introduced the delay, whereas now it wasn’t even trying the first endpoint’s host address.

    Then I made the delay in the looped calls 0 seconds to check, and I learnt that I needed to “advance” the clock for this to work. Even when the delay was 0 seconds. And this just wasn’t making any sense.

    That was when glyph started quoting Morpheus and said, “Free your mind!”

    “Advancing the clock means calling Clock.advance()”, he explained further. “Specifically when I say ‘free your mind’ I mean ‘stop thinking about everything in terms of real stuff happening’. Everything here is just a method calling another method. Some of those methods represent things (like the passage of time on a clock) but in a test it’s all a simulation. I’d also say that ultimately when it’s not in the test it’s just a higher fidelity simulation but that’s a thorny philosophical issue we probably shouldn’t get into. At any rate it’s always just methods calling other methods. So your LoopingCall specifies a function to call to advance to the next connection attempt; it asks an IReactorTime to call that thing via callLater.

    But then, I wondered, if I set the time delay between two consecutive calls to the looped code as 0 seconds, why would I need to advance the clock now?

    “Probably the easiest way to understand the answer to that is to look at the implementation of Clock”, Glyph said patiently. “Particularly the implementation of Clock.callLater. When you call callLater, it just remembers the function you asked it to run, puts it in a list, and sorts the list. Sorting a list does not call your function. Similarly, a “real” reactor’s callLater will never run your function reentrantly. 0 seconds later is still ‘some amount of time’ later, even if it’s not a very big amount of time. the behavior of Clock is that it only invokes application code when ‘time passes’ which is to say when you call .advance(). Real reactors do the same thing and only invoke timed calls when you get back to the main loop.

    But… it’s zero. It’s supposed to mean “nothing”! I protested.

    “You’re thinking too hard about it”, said Glyph. “It’s a number. The behavior of Clock is such that it remembers a callable, sorts them by number, and then calls them. It stuffs it on the stack of callLaters and re-evaluates that when the reactor gets control again. The magnitude of the number isn’t important except in that negative numbers aren’t allowed because that’s non-sensical. In actual fact it is actually an orderable object, ‘number’ is a convention that we apply to it. callLater means ‘when time advances, no sooner than N seconds later, call this function’.”

    “So, when the reactor is told to perform a task 0 seconds later, does it not do that immediately?”, I asked, still not quite convinced.

    “Well of course it does it immediately. But not reentrantly. ‘As soon as possible but no sooner’.” was Glyph’s answer. “‘immediately’ is defined by some time-keeping thing, so in order to figure out when “immediately” occurs, it has to consult some time-keeping thing. And the point in the code - not in time - where it does that consultation is the reactor loop for reactors and the advance() method for Clock. Practically speaking, callLater is not a hard realtime API, because Python isn’t a hard realtime system, so callLater(n, f) cannot physically run f precisely n seconds after its invocation; it can’t even get a zero-error concept of when its own invocation was. clocks keep moving while computers are computing. so if you say callLater(1, f) really you’re likely for f to get run at 1.000000001 seconds later, or something close to that. so if you say callLater(0, f) you’re going to get 0.000000001 seconds later, if you’re comparing against real time. if you’re using a task.Clock(), time is frozen, so you don’t get to 0 seconds later until you allow time to advance.”

    “The most important thing though is that you understand the (very simple) implementation within Clock. there’s nothing complicated going on in there, no magic. it’s almost as simple as l = []; l.append(f); for g in l: g(). All the stuff about how time actually works is useful to understand but ultimately peripheral to the very simple rule that functions don’t get called until you call them.”

    “One last time”, I said, now almost convinced, “what exactly does ‘reentrant’ mean here?”

    “It means that def forever(): reactor.callLater(0, forever) will not cause a ‘maximum recursion depth exceeded’ error, because callLater does not call forever below its stack frame. forever() doesn’t re-enter itself upon the call to callLater, hence, callLater does not invoke its argument reentrantly. In other words, it doesn’t mutually recurse.”
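
    A small sketch tying that together: even a zero-second callLater on a Clock does nothing until the clock is advanced, and it is never run reentrantly inside callLater itself.

    from twisted.internet.task import Clock

    ran = []
    clock = Clock()
    clock.callLater(0, ran.append, "called")

    assert ran == []        # not called inside callLater -- no reentrancy
    clock.advance(0)        # time "passes", even if by zero seconds
    assert ran == ["called"]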

    The Calculus of Testing

    Now, in my tests, I was just popping the result I want from MemoryReactor.tcpClients, and matching it against a value. So, the tests ensured that the connection attempts were being made, but they weren't checking the algorithm.

    Turns out, you can’t check an algorithm with a unit test. “Some people think you can check an algorithm with formal logic, but they’re probably wrong in practice. An algorithm is too complicated to check in any general way.”, explained exarkun.

    So I can only hope that the implementation of the algorithm is correct?

    “No, you can test that the output of the algorithm is correct for some number of inputs”, said he. “And you can hope that your selection of inputs exercises all meaningful branches (and branch combinations) in the algorithm.”

    “The idea with unit tests is that you constrain the domain of possible incorrect outputs”, Glyph added. “You test what you believe to be a representative sample of each ‘interesting’ region of behavior for your algorithm, and by ‘interesting’ I mean each range of input that requires more code.”

    A general lesson I learnt was that I was trying really hard to understand how the whole thing fits together all at once, but it helps to compartmentalize. It helps to just figure out what functions call what other functions; later, once you know what’s going on, learn why they’re getting called in that way. You can do a whole lot without knowing why. And in some cases there isn’t really a “why”, it’s just an arbitrary decision because the reactor has a bunch of work to do and it has to choose some order, so it does things in whatever order we thought to implement first.

    And just like that, we had our endpoint written in a respectable, TDD way, with well-tested parts, etc. But it still wasn’t perfect. Later in the reviews, exarkun suggested that connect could be implemented more simply - probably as an independent object with instance state (and perhaps as an explicit state machine) rather than a collection of nested functions with shared, closed-over mutable objects. That led me to file another ticket and work on getting a finite state machine into Twisted. But that’s another story!

    The caterpillar gasped at me and said
    “My god if that’s what’s going on inside your head
    You can see so much more than I
    I think it’s time to turn into a butterfly.”

    March 08, 2014 02:24 PM

    February 24, 2014

    Glyph Lefkowitz

    Unyielding

    The Oak and the Reed by Achille Michallon

    … that which is hard and stiff
    is the follower of death
    that which is soft and yielding
    is the follower of life …

    – the Tao Te Ching, chapter 76

    Problem: Threads Are Bad

    As we know, threads are a bad idea (for most purposes). Threads make local reasoning difficult, and local reasoning is perhaps the most important thing in software development.

    With the word “threads”, I am referring to shared-state multithreading, despite the fact that there are languages, like Erlang and Haskell, which refer to concurrent processes – those which do not implicitly share state, and require explicit coordination – as “threads”.

    My experience is mainly (although not exclusively) with Python but the ideas presented here should generalize to most languages which have global shared mutable state by default, which is to say, quite a lot of them: C (including Original Recipe, Sharp, Extra Crispy, Objective, and Plus Plus), JavaScript, Java, Scheme, Ruby, and PHP, just to name a few.

    With the phrase “local reasoning”, I’m referring to the ability to understand the behavior (and thereby, the correctness) of a routine by examining the routine itself rather than examining the entire system.

    When you’re looking at a routine that manipulates some state, in a single-tasking, nonconcurrent system, you only have to imagine the state at the beginning of the routine, and the state at the end of the routine. To imagine the different states, you need only to read the routine and imagine executing its instructions in order from top to bottom. This means that the number of instructions you must consider is n, where n is the number of instructions in the routine. By contrast, in a system with arbitrary concurrent execution – one where multiple threads might concurrently execute this routine with the same state – you have to read the method in every possible order, making the complexity nⁿ.

    Therefore it is – literally – exponentially more difficult to reason about a routine that may be executed from an arbitrary number of threads concurrently. Instead, you need to consider every possible caller across your program, understanding what threads they might be invoked from, or what state they might share. If you’re writing a library designed to be thread-safe, then you must place some of the burden of this understanding on your caller.

    The importance of local reasoning really cannot be overstated. Computer programs are, at least for the time being, constructed by human beings who are thinking thoughts. Correct computer programs are constructed by human beings who can simultaneously think thoughts about all the interactions that the portion of the system they’re developing will have with other portions.

    A human being can only think about seven things at once, plus or minus two. Therefore, although we may develop software systems that contain thousands, millions, or billions of components over time, we must be able to make changes to that system while only holding in mind an average of seven things. Really bad systems will make us concentrate on nine things and we will only be able to correctly change them when we’re at our absolute best. Really good systems will require us to concentrate on only five, and we might be able to write correct code for them even when we’re tired.

    Aside: “Oh Come On They’re Not That Bad”

    Those of you who actually use threads to write real software are probably objecting at this point. “Nobody would actually try to write free-threading code like this,” I can hear you complain, “Of course we’d use a lock or a queue to introduce some critical sections if we’re manipulating state.”

    Mutexes can help mitigate this combinatorial explosion, but they can’t eliminate it, and they come with their own cost; you need to develop strategies to ensure consistent ordering of their acquisition. Mutexes should really be used to build queues, and to avoid deadlocks those queues should be non-blocking. But eventually a system which communicates exclusively through non-blocking queues effectively becomes a set of communicating event loops, and its problems revert to those of an event-driven system; it doesn’t look like regular programming with threads any more.

    But even if you build such a system, if you’re using a language like Python (or the ones detailed above) where modules, classes, and methods are all globally shared, mutable state, it’s always possible to make an error that will affect the behavior of your whole program without even realizing that you’re interacting with state at all. You have to have a level of vigilance bordering on paranoia just to make sure that your conventions around where state can be manipulated and by whom are honored, because when such an interaction causes a bug it’s nearly impossible to tell where it came from.

    Of course, threads are just one source of inscrutable, brain-bending bugs, and quite often you can make workable assumptions that preclude you from actually having to square the complexity of every single routine that you touch; for one thing, many computations don’t require manipulating state at all, and you can (and must) ignore lots of things that can happen on every line of code anyway. (If you think not, when was the last time you audited your code base for correct behavior in the face of memory allocation failures?) So, in a sense, it’s possible to write real systems with threads that perform more or less correctly for the same reasons it’s possible to write any software approximating correctness at all; we all need a little strength of will and faith in our holy cause sometimes.

    Nevertheless I still think it’s a bad idea to make things harder for ourselves if we can avoid it.

    Solution: Don’t Use Threads

    So now I’ve convinced you that if you’re programming in Python (or one of its moral equivalents with respect to concurrency and state) you shouldn’t use threads. Great. What are you going to do instead?

    There’s a lot of debate over the best way to do “asynchronous” programming - that is to say, “not threads” - and four options are often presented.

    1. Straight callbacks: Twisted’s IProtocol, JavaScript’s on<foo> idiom, where you give a callback to something which will call it later and then return control to something (usually a main loop) which will execute those callbacks,
    2. “Managed” callbacks, or Futures: Twisted’s Deferred, JavaScript’s Promises/A[+], E’s Promises, where you create a dedicated result-that-will-be-available-in-the-future object and return it for the caller to add callbacks to,
    3. Explicit coroutines: Twisted’s @inlineCallbacks, Tulip’s yield from coroutines, C#’s async/await, where you have a syntactic feature that explicitly suspends the current routine,
    4. and finally, implicit coroutines: Java’s “green threads”, Twisted’s Corotwine, eventlet, gevent, where any function may switch the entire stack of the current thread of control by calling a function which suspends it.

    One of these things is not like the others; one of these things just doesn’t belong.

    Don’t Use Those Threads Either

    Options 1-3 are all ways of representing the cooperative transfer of control within a stateful system. They are a semantic improvement over threads. Callbacks, Futures, and Yield-based coroutines all allow for local reasoning about concurrent operations.
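    For concreteness, here is a minimal sketch contrasting option 2 and option 3 in Twisted terms. The fetch function is hypothetical; assume it returns a Deferred that eventually fires with a response body.

    from twisted.internet.defer import inlineCallbacks, returnValue

    def body_length_with_callback(fetch, url):
        # Option 2: return the Deferred; the callback added here runs when
        # the main loop fires the result.
        d = fetch(url)
        d.addCallback(lambda body: len(body))
        return d

    @inlineCallbacks
    def body_length_with_coroutine(fetch, url):
        # Option 3: the `yield` marks exactly where this routine suspends.
        body = yield fetch(url)
        returnValue(len(body))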

    So why does option 4 even show up in this list?

    Unfortunately, “asynchronous” systems have often been evangelized by emphasizing a somewhat dubious optimization which allows for a higher level of I/O-bound concurrency than with preemptive threads, rather than the problems with threading as a programming model that I’ve explained above. By characterizing “asynchronousness” in this way, it makes sense to lump all 4 choices together.

    I’ve been guilty of this myself, especially in years past: saying that a system using Twisted is more efficient than one using an alternative approach using threads. In many cases that’s been true, but:

    1. the situation is almost always more complicated than that, when it comes to performance,
    2. “context switching” is rarely a bottleneck in real-world programs, and
    3. it’s a bit of a distraction from the much bigger advantage of event-driven programming, which is simply that it’s easier to write programs at scale, in both senses (that is, programs containing lots of code as well as programs which have many concurrent users).

    A system that presents “implicit coroutines” – those which may transfer control to another concurrent task at any layer of the stack without any syntactic indication that this may happen – is simply the dubious optimization by itself.

    Despite the fact that implicit coroutines masquerade under many different names, many of which don’t include the word “thread” – for example, “greenlets”, “coroutines”, “fibers”, “tasks” – green or lightweight threads are indeed threads, in that they present these same problems. In the long run, when you build a system that relies upon them, you eventually have all the pitfalls and dangers of full-blown preemptive threads. Which, as shown above, are bad.

    When you look at the implementation of a potentially concurrent routine written using callbacks or yielding coroutines, you can visually see exactly where it might yield control, either to other routines, or perhaps even re-enter the same routine concurrently. If you are using callbacks – managed or otherwise – you will see a return statement, or the termination of a routine, which allows execution of the main loop to potentially continue. If you’re using explicit coroutines, you’ll see a yield (or await) statement which suspends the coroutine. Because you can see these indications of potential concurrency, they’re outside of your mind, in your text editor, and you don’t need to actively remember them as you’re working on them.

    You can think of these explicit yield-points as places where your program may gracefully bend to the needs of concurrent inputs. Crumple zones, or relief valves, for your logic, if you will: a single point where you have to consider the implications of a transfer of control to other parts of your program, rather than a rigid routine which might transfer (break) at any point beyond your control.

    Like crumple zones, you shouldn’t have too many of them, or they lose their effectiveness. A long routine which has an explicit yield point before every single instruction requires just as much out-of-order reasoning, and is therefore just as error-prone, as one which has none but might context switch before any instruction anyway. The advantage of having to actually insert the yield point explicitly is that at least you can see when a routine has this problem, and start to clean up and consolidate the management of its concurrency.

    But this is all pretty abstract; let me give you a specific practical example, and a small theoretical demonstration.

    The Buggiest Bug

    Brass Cockroach - Image Credit GlamourGirlBeads http://www.etsy.com/listing/62042780/large-antiqued-brass-cockroach1-ants3074

    When we wrote the very first version of Twisted Reality in Python, the version we had previously written in Java was already using green threads; at the time, the JVM didn’t have any other kind of threads. The advantage to the new networking layer that we developed was not some massive leap forward in performance (the software in question was a multiplayer text adventure, which at the absolute height of its popularity might have been played by 30 people simultaneously) but rather the dramatic reduction in the number and severity of horrible, un-traceable concurrency bugs. One, in particular, involved a brass, mechanical cockroach which would crawl around on a timer, leaping out of a player’s hands if it was in their inventory, moving between rooms if not. In the multithreaded version, the cockroach would leap out of your hands but then also still stay in your hands. As the cockroach moved between rooms it would create shadow copies of itself, slowly but inexorably creating a cockroach apocalypse as tens of thousands of pointers to the cockroach, each somehow acquiring their own timer, scuttled their way into every player’s inventory dozens of times.

    Given that the feeling that this particular narrative feature was supposed to inspire was eccentric whimsy and not existential terror, the non-determinism introduced by threads was a serious problem. Our hope for the event-driven re-write was simply that we’d be able to diagnose the bug by single-stepping through a debugger; instead, the bug simply disappeared. (Echoes of this persist, in that you may rarely hear a particularly grizzled Twisted old-timer refer to a particularly intractable bug as a “brass cockroach”.)

    The original source of the bug was so completely intractable that the only workable solution was to re-write the entire system from scratch. Months of debugging and testing and experimenting could still reproduce it only intermittently, and several “fixes” (read: random, desperate changes to the code) never resulted in anything.

    I’d rather not do that ever again.

    Ca(sh|che Coherent) Money

    Despite the (I hope) entertaining nature of that anecdote, it still might be somewhat hard to visualize how concurrency results in a bug like that, and the code for that example is far too sprawling to be useful as an explanation. So here's a smaller in vitro example. Take my word for it that the source of the above bug was the result of many, many intersecting examples of the problem described below.

    As it happens, this is the same variety of example Guido van Rossum gives when he describes why he chose to use explicit coroutines instead of green threads for the upcoming standard library asyncio module, born out of the “tulip” project, so it's happened to more than one person in real life.

    Photo Credit: Ennor https://www.flickr.com/photos/ennor/441394582/sizes/l/

    Let’s say we have this program:

    def transfer(amount, payer, payee, server):
        if not payer.sufficient_funds_for_withdrawl(amount):
            raise InsufficientFunds()
        log("{payer} has sufficient funds.", payer=payer)
        payee.deposit(amount)
        log("{payee} received payment", payee=payee)
        payer.withdraw(amount)
        log("{payer} made payment", payer=payer)
        server.update_balances([payer, payee])
    

    (I realize that the ordering of operations is a bit odd in this example, but it makes the point easier to demonstrate, so please bear with me.)

    In a world without concurrency, this is of course correct. If you run transfer twice in a row, the balance of both accounts is always correct. But if we were to run transfer with the same two accounts in an arbitrary number of threads simultaneously, it is (obviously, I hope) wrong. One thread could update a payer’s balance below the funds-sufficient threshold after the check to see if they’re sufficient, but before issuing the withdrawal.

    So, let’s make it concurrent, in the PEP 3156 style. That update_balances routine looks like it probably has to do some network communication and block, so let’s consider that it is as follows:

    @coroutine
    def transfer(amount, payer, payee, server):
        if not payer.sufficient_funds_for_withdrawl(amount):
            raise InsufficientFunds()
        log("{payer} has sufficient funds.", payer=payer)
        payee.deposit(amount)
        log("{payee} received payment", payee=payee)
        payer.withdraw(amount)
        log("{payer} made payment", payer=payer)
        yield from server.update_balances([payer, payee])
    

    So now we have a trivially concurrent, correct version of this routine, although we did have to update it a little. Regardless of what sufficient_funds_for_withdrawl, deposit and withdraw do - even if they do network I/O - we know that we aren’t waiting for any of them to complete, so they can’t cause transfer to interfere with itself. For the sake of a brief example here, we’ll have to assume update_balances is a bit magical; for this to work, our reads of the payer’s and payee’s balances must be consistent.

    But if we were to use green threads as our “asynchronous” mechanism rather than coroutines and yields, we wouldn’t need to modify the program at all! Isn’t that better? And only update_balances blocks anyway, so isn’t it just as correct?

    Sure: for now.

    But now let’s make another, subtler code change: our hypothetical operations team has requested that we put all of our log messages into a networked log-gathering system for analysis. A reasonable request, so we alter the implementation of log to write to the network.

    Now, what will we have to do to modify the green-threaded version of this code? Nothing! This is usually the point where fans of various green-threading systems will point and jeer, since once the logging system is modified to do its network I/O, you don’t even have to touch the code for the payments system. Separation of concerns! Less pointless busy-work! Looks like the green-threaded system is winning.

    Oh well. Since I’m still a fan of explicit concurrency management, let’s do the clearly unnecessary busy-work of updating the ledger code.

    @coroutine
    def transfer(amount, payer, payee, server):
        if not payer.sufficient_funds_for_withdrawl(amount):
            raise InsufficientFunds()
        yield from log("{payer} has sufficient funds.", payer=payer)
        payee.deposit(amount)
        yield from log("{payee} received payment", payee=payee)
        payer.withdraw(amount)
        yield from log("{payer} made payment", payer=payer)
        yield from server.update_balances([payer, payee])
    

    Well okay, at least that wasn’t too hard, if somewhat tedious. Sigh. I guess we can go find all of the ledger’s callers now and update them too…

    …wait a second.

    In order to update this routine for a non-blocking version of log, we had to type a yield keyword between the sufficient_funds_for_withdrawl check and the withdraw call, between the deposit and the withdraw call, and between the withdraw and update_balances call. If we know a little about concurrency and a little about what this program is doing, we know that every one of those yield froms is a potential problem. If those log calls start to back up and block, a payer may have their account checked for sufficient funds, then funds could be deducted while a log message is going on, leaving them with a negative balance.

    If we were in the middle of updating lots of code, we might have blindly added these yield keywords without noticing that mistake. I've certainly done that in the past, too. But just the mechanical act of typing these out is an opportunity to notice that something’s wrong, both now and later. Even if we get all the way through making the changes without realizing the problem, when we notice that balances are off, we can look only (reasoning locally!) at the transfer routine and realize, when we look at it, based on the presence of the yield from keywords, that there is something wrong with the transfer routine itself, regardless of the behavior of any of the things it’s calling.

    In the process of making all these obviously broken modifications, another thought might occur to us: do we really need to wait for log messages to be transmitted to the logging system before moving on with our application logic? The answer would almost always be “no”. A smart implementation of log could simply queue some outbound messages to the logging system, then discard messages if too many are buffered, removing any need for its caller to honor backpressure or slow down if the logging system can’t keep up. Consider the way syslog says “and N more” instead of logging certain messages repeatedly. That feature allows it to avoid filling up logs with repeated messages, and decreases the amount of stuff that needs to be buffered if writing the logs to disk is slow.
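    Here is a minimal sketch of that idea (the names are invented for illustration, not a real API): log() appends to a bounded buffer and never blocks its caller, while a separate task drains the buffer to the network whenever it gets the chance.

    from collections import deque

    _LOG_BUFFER = deque(maxlen=1000)  # oldest messages are silently dropped

    def log(format, **fields):
        # Cheap and non-blocking: just remember the message for later delivery.
        _LOG_BUFFER.append((format, fields))

    def flush_some(send_to_network, limit=100):
        # Called periodically by the event loop; sends at most `limit` messages.
        for _ in range(min(limit, len(_LOG_BUFFER))):
            send_to_network(_LOG_BUFFER.popleft())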

    All the extra work you need to do when you update all the callers of log when you make it asynchronous is therefore a feature. Tedious as it may be, the asynchronousness of an individual function is, in fact, something that all of its callers must be aware of, just as they must be aware of its arguments and its return type.

    In fact you are changing its return type: in Twisted, that return type would be Deferred, and in Tulip, that return type is a new flavor of generator. This new return type represents the new semantics that happen when you make a function start having concurrency implications.

    Haskell does this as well, by embedding the IO monad in the return type of any function which needs to have side-effects. This is what certain people mean when they say Deferreds are a Monad.

    The main difference between lightweight and heavyweight threads is that, with rigorous application of strict principles like “never share any state unnecessarily” and “always write tests for every routine at every point where it might suspend”, lightweight threads make it at least possible to write a program that will behave deterministically and correctly, assuming you understand it in its entirety. When you find a surprising bug in production because a routine is now suspending in a place it wasn’t before, it’s possible with a lightweight threading system to write a deterministic test that will exercise that code path. With heavyweight threads, any line could be the position of a context switch at any time, so it’s just not tractable to write tests for every possible order of execution.

    However, with lightweight threads, you still can’t write a test to discover when a new yield point might be causing problems, so you're still always playing catch-up.

    Although it’s possible to do this, it remains very challenging. As I described above, in languages like Python, Ruby, JavaScript, and PHP, even the code itself is shared, mutable state. Classes, types, functions, and namespaces are all shared, and all mutable. Libraries like object relational mappers commonly store state on classes.

    No Shortcuts

    Despite the great deal of badmouthing of threads above, my main purpose in writing this was not to convince you that threads are, in fact, bad. (Hopefully, you were convinced before you started reading this.) What I hope I’ve demonstrated is that if you agree with me that threading has problematic semantics, and is difficult to reason about, then there’s no particular advantage to using microthreads, beyond potentially optimizing your multithreaded code for a very specific I/O bound workload.

    There are no shortcuts to making single-tasking code concurrent. It's just a hard problem, and some of that hard problem is reflected in the difficulty of typing a bunch of new concurrency-specific code.

    So don’t be fooled: a thread is a thread regardless of its color. If you want your program to be supple and resilient in the face of concurrency, when the storm of concurrency blows, allow it to change. Tell it to yield, just like the reed. Otherwise, just like the steadfast and unchanging oak tree in the storm, your steadfast and unchanging algorithms will break right in half.

    by Glyph at February 24, 2014 10:05 AM

    February 11, 2014

    The Moon Research Blog

    Twisted Names – now with a (private) EDNS(0) message parser

    This is the first in a series of articles describing my progress towards introducing EDNS(0) and DNSSEC support to Twisted Names.

    In this article, I'll describe two new private classes designed to parse EDNS(0) messages.

    But first a little background.

    The EDNS protocol extensions depend on the use of a special DNS record which crucially contains the advertised maximum datagram size, extended return codes, a DNSSECOK flag, and a payload which contains zero or more extensible headers.

    This special record is called an OPT record. It isn't a real record though; it doesn't convey any domain names or domain records. It's a hack to work around the lack of space in a standard 512 byte DNS message. All the header flags and fields that couldn't be shoehorned into the 96-bit header section of a DNS message are instead encoded into an OPT "pseudo" record.

    EDNS clients and servers append one OPT record to the "additional" records in a sent DNS message. This record is then extracted and interpreted by an EDNS aware receiver.
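    As a purely illustrative sketch (this is not the Twisted Names API), the information carried by an OPT pseudo-record can be pictured as a small record type:

    from collections import namedtuple

    OPTPseudoRecord = namedtuple("OPTPseudoRecord", [
        "udpPayloadSize",  # advertised maximum datagram size
        "extendedRCODE",   # extra return-code bits that don't fit in the header
        "version",         # EDNS version
        "dnssecOK",        # the DNSSEC OK ("DO") flag
        "options",         # zero or more extensible header options
    ])

    # An EDNS sender appends something like this to the "additional" section
    # of an outgoing message.
    opt = OPTPseudoRecord(udpPayloadSize=4096, extendedRCODE=0, version=0,
                          dnssecOK=False, options=[])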

    Twisted already has a class for parsing DNS records and it already has a class for parsing DNS messages. I've now created some wrappers which add the EDNS special behaviour to those existing APIs.

    The first wrapper is a class called twisted.names.dns._OPTHeader, which wraps an instance of twisted.names.dns.RRHeader. This allows us to re-use the existing record parsing code, and means that the code in _OPTHeader is purely concerned with the special behaviour of an OPT record. Which seems neat! It also made the unit tests much more focused.

    The second wrapper is a class called twisted.names.dns._EDNSMessage which is in trunk but not yet in a released version. This wraps an instance of twisted.names.dns.Message. And again, this allows us to re-use the existing parsing code while clearly separating the EDNS specific behaviour.

    So Twisted Names can now interpret OPT records and EDNS messages, but we've decided to keep these private for the time being, until we are satisfied with the APIs.

    This first version of _EDNSMessage has been written as a drop-in replacement for dns.Message. But as a result it has inherited some of the ugliness of that class: notably, its awkward constructor and its abbreviated attribute names.

    Tom has some ideas about introducing a new Interface for message constructors, and I've done some work on improving the abbreviated names, but I got stuck trying to figure out how to introduce the changes while maintaining backwards compatibility.

    I'd originally thought those things were blockers, because I intended that users would pass curried message constructors / factories to the various protocols and client APIs, as a way of configuring / overriding EDNS flags and fields in their requests. But that's not a good API (as Tom points out in another comment).

    So I've decided that those cleanups would be nice to have, but probably not necessary right now.

    A much simpler strategy – and my current priority – is to just switch all Twisted Names code (client, protocols, server) to use the new dns._EDNSMessage class in place of dns.Message. This shouldn't introduce any backwards incompatibilities, and I'll explicitly disable the use of EDNS during encoding (for now). That way Twisted Names clients and servers should continue to behave exactly as before.

    But I'll then be free to start work on a DNS client which can be configured to send EDNS(0) requests and which is capable of negotiating with EDNS(0) servers. That's the key requirement of the first project milestone.

    (twisted.names.server and the twistd dns plugin will remain non-EDNS for the time being.)

    In fact I've already started that work, but first I had to fix a bug in the way twisted.names.server generates response messages. My proposed fix includes new functions and methods for generating response messages from a request message. (Previously, the request message was being re-used as the basis for the response, and inevitably, client specific flags and fields were leaking into the response.)
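    As a hypothetical sketch of that idea (the helper name is invented and this is not the actual Twisted Names API): build a fresh dns.Message for the response, copying only what a reply should echo from the request, so that client-specific flags and fields can't leak through.

    from twisted.names import dns

    def responseFromRequest(request):
        response = dns.Message(
            id=request.id,          # a response must echo the request id
            answer=1,               # mark the message as a response
            opCode=request.opCode,
            recDes=request.recDes,  # echo "recursion desired" from the client
        )
        response.queries = request.queries[:]  # echo the question section only
        return response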

    Happily, these new APIs will also allow me to explicitly disable EDNS(0) in twisted.names.server, until such time as we implement server side EDNS(0) support.

    So that's all for now. In my next post I hope to be able to report that Twisted Names is using the new _EDNSMessage class internally and summarise some of the other work that I've done on Twisted Names in support of the EDNS(0) and DNSSEC project.

    Thanks once again to The NLnet Foundation for funding this project.

    by Richard Wall (noreply@blogger.com) at February 11, 2014 04:50 PM

    February 03, 2014

    The Moon Research Blog

    The Bristol Twisted Sprint – a belated thankyou

    I'd like to thank Luke, Rob and everyone at HybridCluster for organising the Bristol Twisted Sprint in December last year and for their hospitality on the day.

    I got quite a bit of work done and it was a rare opportunity to meet Tom and JP, who had flown in from the Americas, and Hynek who had flown in from Berlin.

    Earlier that week I'd been amazed to receive an email from Tom with code-reviews of three of my EDNS branches. He'd done them during his flight from Canada! Not sure I'd have had the stamina, given the lack of elbow room and the temptation of in-flight entertainment and complimentary booze. But Tom's a pro and it meant that we had lots to discuss on the day of the sprint. Tom had raised some concerns about my proposed mechanism for setting EDNS parameters from the high level twisted.names.client API – with which I agree. But I'm still struggling to implement the alternative mechanism that we discussed on the day. More on that in a later blog post.

    It was also nice to see Hynek, who showed me an interesting custom DNS server which he'd written despite the lack of documentation and despite having to subclass and override the private _lookup method of twisted.names.authority. He laid it on pretty thick, and guilt-tripped me into writing some decent documentation for twisted.names.server. Conscience is what motivates me, so I'm pleased to report that we now have a little more narrative documentation for Twisted Names. And it looks pretty nice too, thanks to khorn's and glyph's work on porting the Twisted documentation to ReST and Sphinx.

    JP adopted a "headmasterly" role; supervising things from a lectern at the front of the room (an improvised standing desk), and intimidating us (the pupils) with astonishing Typespeed scores between builds – the keyboard noise was deafening! But I'm exaggerating; JP was as helpful and as thoughtful as usual, switching context effortlessly to assist us with all areas of Twisted — from development environment tips, to Twisted OpenSSL integration, to EDNS, to PyPy tests and benchmarking, to Tun/Tap APIs. Very impressive!

    Let's hope there are more UK Twisted sprints in 2014.

    by Richard Wall (noreply@blogger.com) at February 03, 2014 12:32 PM