After seeing signs that some exploits might have been run on my site again, I upgraded to WordPress 2.8.
After reading up on hardening WordPress, I saw that the official site mentions AskApache, a plugin that helps with hardening. I’m not too sure about it yet, because it wants to write .htaccess files in my directories, and for that I have to open up more than I would want. But hey, let’s give it a go.
At some point it creates a username and password that you choose. I go on and configure stuff, not knowing very well which of its many modules I’m supposed to activate, or why.
I forget about it, and ten minutes later I check my mail. I have a mail from AskApache. With my login details. And the password in plaintext.
…
Is the WordPress security model just fundamentally broken ?
Yesterday was cause for celebration. We got together to celebrate five years of the Fluendo Group!
The picture quality is bad, and not everyone is in it, but I just took it on a whim after marveling at how many people were there. I didn’t even know all of them – yes, it’s gotten to that point. 67 months ago I arrived in Barcelona without the company even being created…
We celebrated with mountains of cheese and rivers of wine which in the first year would have lasted us a few weeks and now only lasted an hour.
As magical accidents sometimes happen, today is also the day Fluendo received the certification confirmation from Dolby for our DVD player. It didn’t take long to land in the webshop, so finally our DVD player is up for sale! So you know what to get us for our birthday – a shop checkout with the dvd player in your cart.
Good timing – that means that at this year’s GUADEC/Desktop Summit I know what the answer will be to one of the most asked questions I get.
This is the first GUADEC I’m going to with Kristien in tow, I hope she can manage. I’ll be there from Monday through Friday, because the week is bookended by two weddings. Looking forward to a GStreamer summit on Thursday discussing 1.0…
left work around 38 degrees C, got a haircut, went for some great tapas on my own reading Darkly Dreaming Dexter, went to a bar, met up with friends, an impromptu bbq plan was hatched, went to a lovely atico at Portal De L’Angel, barbecued in a soothing summer breeze, rode home on the back of a motorcycle hanging on for dear life. All in all a typical Barcelona summer Tuesday.

Jan and Arek entered this year’s ICFP programming contest. It’s a three day programming contest, so this morning they asked if they could swap their Friday project day for today to finish the contest. They seem to be in the top third at the moment.
Arek’s never been a fan of long meetings, but today’s standup meeting was particularly amusing with Arek urging everyone to keep focused and get out there quickly. They had less than two hours left on the clock.
Spot the seven differences
Amusingly, today they came to work with almost the same shirt on, by accident! I can only assume there is a big clothes factory in Poland where they have huge stock of the same fabric…
90 minutes left, knock them dead, guys!
mach allows you to set up clean roots from scratch for any distribution or distribution variation supported.
This release of mach contains fixes for Python 2.6, and adds Fedora 10 and 11, while fixing the archived Fedora locations.
Get it from the mach project page.
Next step on this weekend’s yakshave: a first implementation of moap vcs bisect!
The interface is lifted from git, obviously, since that’s where most people will know the feature from.
I implemented it first with CVS, so I could fix this pychecker bug which was blocking Fedora from bumping the pychecker version from 0.8.17 (3 years old) to 0.8.18. And sure enough, it picked out the commit I broke.
While implementing and while dealing with CVS’s idea of how it stores CVS revisions and dates and so on, I googled and was amused to find this first hit on google for the words cvs and bisect. Clever Andy! And he cleverly sidestepped the problem I wrestled with by making the user specify two dates at the start instead of trying to figure it out from the checkout. And all in lisp too!
Then, to test that my VCS interface was sane, I implemented it for Subversion as well. That took about 15 minutes, since Subversion is much more sane than CVS. I tried the following command on a flumotion checkout:
moap vcs bisect reset; moap vcs bisect start; svn up; moap vcs bisect good; svn up -r 3000; moap vcs bisect bad; MOAP_DEBUG=4 moap vcs bisect run ./test.sh
With test.sh containing
test -e flumotion/component/consumers/gdp/gdp.py
(In other words, look for the commit that added this file.)
Sure enough, it picked out this commit:
[moap-trunk] [thomas@ana flumotion]$ moap vcs bisect diff
Index: /home/thomas/tmp/flumotion/configure.ac
===================================================================
--- /home/thomas/tmp/flumotion/configure.ac (revision 6909)
+++ /home/thomas/tmp/flumotion/configure.ac (revision 6908)
@@ -212,7 +212,6 @@
flumotion/component/combiners/switch/Makefile
flumotion/component/consumers/Makefile
flumotion/component/consumers/disker/Makefile
-flumotion/component/consumers/gdp/Makefile
flumotion/component/consumers/httpstreamer/Makefile
flumotion/component/consumers/preview/Makefile
flumotion/component/consumers/shout2/Makefile
Index: /home/thomas/tmp/flumotion/flumotion/component/consumers/Makefile.am
===================================================================
--- /home/thomas/tmp/flumotion/flumotion/component/consumers/Makefile.am (revision 6909)
+++ /home/thomas/tmp/flumotion/flumotion/component/consumers/Makefile.am (revision 6908)
@@ -11,7 +11,6 @@
SUBDIRS = \
disker \
- gdp \
httpstreamer \
preview \
shout2
Index: /home/thomas/tmp/flumotion/ChangeLog
===================================================================
--- /home/thomas/tmp/flumotion/ChangeLog (revision 6909)
+++ /home/thomas/tmp/flumotion/ChangeLog (revision 6908)
@@ -1,16 +1,5 @@
2008-06-20 Thomas Vander Stichele <thomas at apestaart dot org>
- * configure.ac:
- * flumotion/component/consumers/Makefile.am:
- * flumotion/component/consumers/gdp (added):
- * flumotion/component/consumers/gdp/gdp.py (added):
- * flumotion/component/consumers/gdp/__init__.py (added):
- * flumotion/component/consumers/gdp/Makefile.am (added):
- * flumotion/component/consumers/gdp/gdp.xml (added):
- Add a GDP consumer.
-
-2008-06-20 Thomas Vander Stichele <thomas at apestaart dot org>
-
* flumotion/component/producers/gdp/gdp.py:
Add error for http://bugzilla.gnome.org/show_bug.cgi?id=532364
So, the feature is ready for testing. It could use some more documenting, and some additional goodies like accepting arguments to moap vcs bisect start for example.
Feedback appreciated!
Still on the yak shave expedition.
I’ve written some simple scripts and files to set up and build python 2.3, 2.4, and 2.5 in separate prefixes to be able to test my software against these versions.
If you’re interested, in theory it should be really simple:
As the README says, this should go on to build all versions of python, and install some scripts.
After that, you just run py-2.3 to go into a shell with Python 2.3 on your path.
Don’t say I never did anything for you.
As part of this weekend’s yakshave, I’m trying to implement a handler for STORE_MAP in pychecker. STORE_MAP is a new opcode in Python 2.6, which speeds up dict building.
So, for the first time I went under the hood of Python and figured out just enough to understand this problem. It was a lot less scary than I thought it was going to be!
It seems that using dis.dis(), one can easily disassemble any python function into its opcodes. This shows clearly where the behaviour is different between python 2.5 and python 2.6.
Given the following function:
f = lambda: {'a': 1, 'b': 2}
Python 2.6 gives:
1 0 BUILD_MAP 2
3 LOAD_CONST 0 (1)
6 LOAD_CONST 1 ('a')
9 STORE_MAP
10 LOAD_CONST 2 (2)
13 LOAD_CONST 3 ('b')
16 STORE_MAP
17 RETURN_VALUE
I couldn’t find a good description of the output of dis.dis, but in my naivety I am guessing the following:
I am assuming each opcode takes one address location, and each argument takes two; that maps with the address pointers in front of the opcodes.
The opcodes are all documented.
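That offset guess can be checked with a little arithmetic. The sketch below only illustrates the assumption stated above (one byte per opcode, two more when it carries an argument, as in Python 2.x bytecode). LISTING is hand-copied from the Python 2.6 output, and offsets is a hypothetical helper, not part of the dis module.

```python
# Hand-copied from the Python 2.6 listing above: (opcode name, has argument).
LISTING = [
    ("BUILD_MAP", True),
    ("LOAD_CONST", True),
    ("LOAD_CONST", True),
    ("STORE_MAP", False),
    ("LOAD_CONST", True),
    ("LOAD_CONST", True),
    ("STORE_MAP", False),
    ("RETURN_VALUE", False),
]


def offsets(listing):
    """Return the byte offset of each instruction under the 1 + 2 assumption."""
    result = []
    position = 0
    for _name, has_argument in listing:
        result.append(position)
        # One byte for the opcode itself, two more if it carries an argument.
        position += 3 if has_argument else 1
    return result


if __name__ == "__main__":
    for (name, _), offset in zip(LISTING, offsets(LISTING)):
        print("%3d %s" % (offset, name))
```

The computed offsets line up with the address column in the listing, which is at least consistent with the guess.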
So, in human terms:
Pretty simple, when you look at it twice.
For the same code, python 2.5 gives:
>>> dis.dis(f)
1 0 BUILD_MAP 0
3 DUP_TOP
4 LOAD_CONST 1 ('a')
7 LOAD_CONST 2 (1)
10 ROT_THREE
11 STORE_SUBSCR
12 DUP_TOP
13 LOAD_CONST 3 ('b')
16 LOAD_CONST 4 (2)
19 ROT_THREE
20 STORE_SUBSCR
21 RETURN_VALUE
This code is slightly longer and more complicated. Basically, LOAD_CONST, LOAD_CONST, STORE_MAP was implemented with DUP_TOP, LOAD_CONST, LOAD_CONST, ROT_THREE, STORE_SUBSCR
It looks like DUP_TOP was needed because STORE_SUBSCR consumes the dictionary off the stack, and ROT_THREE is needed because the arguments are pushed on the stack in the wrong order.
Seems like a nice and obvious improvement once you understand it. An exercise for the reader is to profile whether this change actually makes things faster in practice.
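To convince myself that both opcode sequences really build the same dictionary, here is a toy stack machine. It is a sketch, not how CPython works internally: run is a hypothetical function, the opcode semantics are my reading of the dis documentation for Python 2.x, and the two programs are hand-copied from the listings above with constants inlined instead of indexed.

```python
def run(instructions):
    """Execute a tiny subset of Python 2.5/2.6 opcodes on a list-based stack."""
    stack = []
    for opname, argument in instructions:
        if opname == "BUILD_MAP":
            stack.append({})          # the size hint is ignored in this toy
        elif opname == "LOAD_CONST":
            stack.append(argument)    # the constant itself, not an index
        elif opname == "DUP_TOP":
            stack.append(stack[-1])
        elif opname == "ROT_THREE":
            # Move the top item down two places: [a, b, c] -> [c, a, b].
            top = stack.pop()
            stack.insert(len(stack) - 2, top)
        elif opname == "STORE_SUBSCR":
            # TOS1[TOS] = TOS2, consuming all three items.
            key = stack.pop()
            container = stack.pop()
            value = stack.pop()
            container[key] = value
        elif opname == "STORE_MAP":
            # Pops the key (TOS) and value (TOS1); the dict stays on the stack.
            key = stack.pop()
            value = stack.pop()
            stack[-1][key] = value
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unsupported opcode: %s" % opname)


PY26 = [
    ("BUILD_MAP", 2),
    ("LOAD_CONST", 1), ("LOAD_CONST", "a"), ("STORE_MAP", None),
    ("LOAD_CONST", 2), ("LOAD_CONST", "b"), ("STORE_MAP", None),
    ("RETURN_VALUE", None),
]

PY25 = [
    ("BUILD_MAP", 0),
    ("DUP_TOP", None), ("LOAD_CONST", "a"), ("LOAD_CONST", 1),
    ("ROT_THREE", None), ("STORE_SUBSCR", None),
    ("DUP_TOP", None), ("LOAD_CONST", "b"), ("LOAD_CONST", 2),
    ("ROT_THREE", None), ("STORE_SUBSCR", None),
    ("RETURN_VALUE", None),
]

if __name__ == "__main__":
    print(run(PY26))
    print(run(PY25))
```

In the toy, STORE_MAP leaves the dictionary in place, which is why the 2.6 sequence needs neither DUP_TOP nor ROT_THREE.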
So, where does this leave me for pychecker ? It now looks deceptively simple. STORE_MAP simply pops off two items of the stack. There is nothing to check for, since we’re in a dictionary context. So all my implementation needs to do is to pop 2 items off the stack, and that’s it.
And thus it was committed to pychecker CVS. Popping one item off the yak stack!
The yak shave started yesterday evening. The yak stack is actually a forked one this time, both of the forks involving pychecker.
I might not remember everything in order, but in a nutshell the stack is something like this:
Fork point 1 continues here:
Fork point 2 continues here:
def func():
    d = { 'a': 1, 'b': 2}
    print d.keys()
which triggers, in python 2.6, the following warning:
Object (d) has no attribute (keys)
I’ll blog about the useful products of my yak shave separately, for those who don’t enjoy descriptions of yak shavings, only outcomes.
In general, I actually enjoy yak shaves. They’re massive treasure hunts, you learn a lot, and you end up fixing a nice bunch of things all over the stack if you persevere. But it’s probably more a mentality thing than anything else, and I really only indulge myself in these in my spare time.
moap is a swiss army knife for maintainers and developers.
This is MOAP 0.2.7, “MMM…”.
Coverage in 0.2.7: 1424 / 1899 (74 %), 109 python tests, 2 bash tests
Features added since 0.2.6:
- Added moap vcs backup, a command to backup a checkout to a tarball that
can be used later to reconstruct the checkout. Implemented for svn.
- Fixes for git-svn, git, svn and darcs.
- Fixes for Python 2.3 and Python 2.6
I’ve been fixing things left and right for python 2.6, and in the process I noticed that moap hasn’t had a release for over a year. This release contains mostly bug fixes collected over the year, and a new feature that isn’t implemented yet for all VCS’s. Basically it’s an automatic replacement for something I was doing manually every time I removed an old GNOME cvs/svn/git checkout: figure out what’s in that tree that’s not in the repository (diffs, unversioned files, …), so I can delete everything else and free some disk space.
The only problem with this release is that, after doing the release, I noticed that Freshmeat removed their XML-RPC interface. Apparently they have some new kind of interface they want people to use. Sigh. But that means 0.2.8 is right around the corner!
Last week, after upgrading my home desktop to F11, I had palimpsest tell me one of my disks was broken on the desktop machine. The desktop is running on two 250 GB drives in software raid. It was time to get new drives.
After a weekend of fiddling with new 1 TB disks for my home desktop, trying failure scenarios, making sure the system can boot from each of the two drives, and waiting for the 4 hour resync of the software RAID in between each step, I finally closed up the desktop machine and cleaned up under my desk again, thinking I was done with my halfyearly messing about with broken disks.
I guess I was tempting fate anyway. Doing a routine operation on my home server after all the configuration stuff I’d done to set up asterisk last week, suddenly an rsync aborted, a journal errored out, a partition changed to being mounted read-only, and the log was full of scary drive errors. Ouch.
Well, that’s why I keep around a big box of old drives - for when some drive fails and I want to tempt fate even more by reusing an old drive that’s probably going to fail real soon too. And anyway, I had just spent my hard drive piggybank on the new desktop drives.
Luckily, I seemed to have a 400 GB SATA drive lying around that used to belong to my media center. I don’t remember why I swapped it out, given that the media center has a 160GB drive for the OS (and two 1.5 TB raid drives for the data, of course), but this was a lucky break. I booted with a rescue cd, and tried copying the root filesystem of my CentOS 5.2 home server partition to this new drive. Which worked fine, except that /var was where I triggered an Input/Output error and some more drive errors in the kernel log.
So, powered off, took out the broken drive, and put it in a USB chassis. The advantage of a USB chassis is that you can easily just replug the drive to try again, instead of locking up your system terribly and having to reboot. Sadly, /var was broken beyond repair. I ran an e2fsck hoping to recover the contents, and that partly worked, but some of the important stuff is missing even from lost+found (apart from the annoying situation where you have to reconstruct file names, which I usually end up not bothering with).
But really, how important can /var be ? Turns out, rather important. As in, you need it to boot in the first place. And also, it holds your rpm database. Crap.
Some Googling gave me some posts on how to reconstruct your rpm database from log files (using --justdb --noscripts --notriggers). But to use those, you actually need those log files. Where are those ? On /var as well. Crap. And they’re not in lost+found either.
Ok, so time to get creative. Here’s what I ended up doing:
rpm -qf /etc/* | grep 'not owned' | cut -f2 -d' ' > /tmp/unowned
yum --enablerepo=c5-media --disablerepo=base --disablerepo=updates --disablerepo=addons --disablerepo=extras whatprovides `cat /tmp/unowned` | cut -f1 -d' ' | sort | uniq > /tmp/missing
yum --enablerepo=c5-media --disablerepo=base --disablerepo=updates --disablerepo=addons --disablerepo=extras install `cat /tmp/missing`
This works by first listing all files that are not owned by rpm (on the first run, that’s all of them), figure out what packages can provide these files, then installing those packages.
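The first step of that pipeline, picking the unowned paths out of the rpm -qf output, can be sketched in Python as well. This is only an illustration: unowned_paths is a hypothetical helper, and SAMPLE is made-up output in the shape rpm prints ("file X is not owned by any package"), not a capture from my server.

```python
def unowned_paths(rpm_output):
    """Extract the paths from 'file X is not owned by any package' lines."""
    paths = []
    for line in rpm_output.splitlines():
        if "not owned" not in line:
            continue
        fields = line.split(" ")
        # The same field the shell pipeline takes with cut -f2 -d' '.
        paths.append(fields[1])
    return paths


# Made-up sample: one owned file (rpm prints the package name) and two unowned.
SAMPLE = """\
some-package-1.0-1
file /etc/asterisk is not owned by any package
file /etc/motd.local is not owned by any package
"""

if __name__ == "__main__":
    for path in unowned_paths(SAMPLE):
        print(path)
```

Like the cut version, this breaks on paths containing spaces, which was not a problem for the files in /etc here.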
find / -name '*.rpmnew' | sed 's/\.rpmnew$//' > /tmp/rpmnew
for c in `cat /tmp/rpmnew`; do echo $c; diff $c $c.rpmnew && mv -f $c.rpmnew $c; done
find / -name '*.rpmorig' | sed 's/\.rpmorig$//' > /tmp/rpmorig
for c in `cat /tmp/rpmorig`; do echo $c; diff $c $c.rpmorig && mv -f $c.rpmorig $c; done
While it’s not an experience I hope to repeat any time soon, it worked out surprisingly well!
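For the record, the cleanup loops for the .rpmnew and .rpmorig files could also be written in Python. This is a sketch under assumptions, not what I ran: fold_identical is a hypothetical name, and it takes a root directory instead of scanning / so it can be tried safely on a scratch tree first.

```python
import filecmp
import os


def fold_identical(root, suffix):
    """Replace each file with its <suffix> copy when the two are identical."""
    replaced = []
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            if not name.endswith(suffix):
                continue
            candidate = os.path.join(directory, name)
            original = candidate[:-len(suffix)]
            # shallow=False compares file contents, like diff does.
            if os.path.isfile(original) and filecmp.cmp(
                    original, candidate, shallow=False):
                # Same effect as mv -f candidate original.
                os.replace(candidate, original)
                replaced.append(original)
    return sorted(replaced)
```

Calling fold_identical("/etc", ".rpmnew") would then leave only the .rpmnew files that genuinely differ, which are the ones worth looking at by hand.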
I managed to completely skip updating to F10. All my machines (work desktop, home desktop, laptop, media center) were running F9 without any real problems I worried about.
But of course I was curious. And, especially with the move to python 2.6, things I care about were bound to break.
So, last weekend I took the plunge, and after little over a week here are my first impressions:
I am not entirely sure what the security problems are with enabling the network after installation. The default firewall is pretty locked down, SELinux is enabled by default, and there’s no way I can install updates without the network anyway. But I’m sure that I could find huge bikeshedding threads on fedora-devel about this if I really cared why this was decided.
All in all, not a bad first week experience, and seems like a solid release. Now, off to rebuild bits and pieces, and clean up Python 2.6 deprecation warnings…
My main development machine is a custom PowerBook running Ubuntu natively. I use it when I'm sitting on the couch, my office comfy chair, the futon, floor, etc. Every once in a while, though, I want to work at a desk from my 24" iMac. Just to mix it up a little. However, that box is my gaming and web-browsing machine: it runs Mac OS X and that's the way I want to keep it. So, if I'm going to do work on the iMac, I need to ssh into the machines that have the environments set up for development.
open -n "/Applications/Utilities/Terminal.app"
vi ~/.bash_profile
if [ ! -z "$REMOTE_CONNECTION" ]; then
    ssh $REMOTE_CONNECTION
    REMOTE_CONNECTION=""
fi
REMOTE_CONNECTION=rhosgobel \
open -n "/Applications/Utilities/Terminal.app"
Now, I just click my "Shells" menu, choose the destination, and start working on that machine. A new window or new tab opened with that instance of Terminal.app will give me a new session to that server, without having to manually ssh into it -- this is even more convenient than having an icon to double-click!
by Duncan McGreggor (oubiwann@gmail.com) at June 21, 2009 04:09 AM
Unlike databases which manage data at rest, messaging is used to manage data in motion. Use messaging to communicate between and scale applications, within your enterprise, across the web, or in the cloud.
Paraphrasing Wikipedia's entry on AMQP:
The AMQ protocol is for managing the flow of messages across an enterprise's business systems. It is middleware to provide a point of rendezvous between backend systems, such as data stores and services, and front end systems such as end user applications.
sudo rabbitmq-server
BASE=/usr/lib/erlang/lib/rabbitmq-server-1.5.5/
BIN=$BASE/scripts/rabbitmq-server
RABBITMQ_MNESIA_BASE=$BASE/mnesia \
RABBITMQ_LOG_BASE=/var/log/rabbitmq \
RABBITMQ_NODE_PORT=5672 \
RABBITMQ_NODENAME=rabbit \
$BIN &
python2.5 consumer amqp0-8.xml
python2.5 producer amqp0-8.xml \
"producer-to-consumer test message 1"
def someFunc():
    d1 = someAsyncCall()
    d1.addCallback(_someCallback)
    d2 = anotherAsyncCall()
    d2.addCallback(_anotherCallback)

@inlineCallbacks
def someFunc():
    result1 = yield someAsyncCall()
    # work with result; no need for a callback
    result2 = yield anotherAsyncCall()
    # work with second result; no need for a callback
by Duncan McGreggor (oubiwann@gmail.com) at June 18, 2009 09:28 PM
Message-oriented middleware (MOM) is infrastructure focused on message sending that increases the interoperability, portability, and flexibility of an application by allowing the application to be distributed over multiple heterogeneous platforms. It reduces the complexity of developing applications that span multiple operating systems and network protocols by insulating the application developer from the details of the various operating system and network interfaces.
AMQP (Advanced Message Queuing Protocol) is one of these protocols.
Decentralized, Locally Governed Federated Mesh of AMQP Brokers with standardized Global Addressing. The killer application for AMQP is transacted secure business messages between corporations - e.g. send a banking confirmation message to confirms@bank.com [...]
I find this rather exciting due to my interest in ultra large-scale systems; scenarios like the one described above are the seeds for tomorrow's ULS systems :-)
by Duncan McGreggor (oubiwann@gmail.com) at June 18, 2009 05:59 PM

Unified orders: at the end of the sales process, there should be one abstraction of the "order", regardless if the source was the web store, a phone call, or the sales guy. The order abstraction will be a message (or series of messages, for orders with multiple items; we'll be addressing only the simple case of a single item).
Unified status: at the end of the manufacturing process, both the shipping guy and customers should be aware that the product has been completed and is ready to be sent: the shipping guy can connect to our messaging system (probably via a service) and the customer can be notified by email or by checking the order status in the web kilt store.
by Duncan McGreggor (oubiwann@gmail.com) at June 18, 2009 05:58 PM
Do you want to write a post that will appear in Hacker News?
Here are a few blog headers that are sure to win you votes. Just write some filler and publish:
Seriously, all of these subjects have been done to death. We get them, really we do. The pros and cons have been explained, and making a more extreme post with an even more black-and-white picture is not what we want. And honestly? Debunking those is just as bad (done to death, we know it). Feel free to write such a post if it adds to your ego to have something rise to the top of HN. I’ll be here, waiting for more interesting posts there…


Last year, I did “what’s in your bag?” on Twitter. To mix it up a bit, here’s a photo of the applications I’m running. It’s pretty boring:
What’s running on your desktop?
Introduction
When working on a Twisted-centric project, it can be hard to recruit programmers. While Python, especially in recent years, has achieved mainstream status, Twisted is not quite there yet. It seems to be the case that recruiting Python programmers is, today, a solved problem. In the case where programmers unfamiliar with Python need to be recruited, resources (free, cheap and expensive alike) abound for getting them up to par.
In the Twisted world, especially for large projects, programmers who have no previous experience with Twisted will need to be recruited. In all software projects, time is of the essence. An efficient, effective, way to get Python programmers into Twisted is desired. This is the case in VMware Israel (formerly B-hive), where the flagship product AppSpeed has multiple portions written in Twisted.
Teaching Resources
The broadest tutorial for Twisted is the “finger tutorial”. It leads the reader from writing the most basic finger server into a finger server supporting IRC and XML-RPC backends, going through a variety of Twisted programming techniques. Its broadness is also its chief downside, causing many newbies to learn many concepts which are irrelevant to their immediate needs, hiding the forest inside a density of trees.
The Twisted howto collection is another excellent resource. Two howtos are of primary importance. One is “Writing Servers”, which covers, in a focused, no-nonsense manner, the steps needed to write a server for a new protocol. Very little depth and “why” are covered, which is as often an upside as a downside. The other important howto is “Deferred Execution”, covering the basics of deferreds. Several other resources for covering deferreds in-depth are available.
Architecture
It is often the case that one, or more, of the people beginning a project have experience in Twisted. It is worthwhile to think how to attempt having a clean interface to those parts which need more such experience so that people writing other parts can hold their own with less such experience. Protocol details, for example, are often uninteresting, have a natural API and require non-trivial coding to deal with all the possibilities of packet portions arriving. The best wrapper for such details, if at all possible, is with an object whose methods return deferreds, and expose as few of the protocol details as possible.
An infrastructure containing such objects will help people learn Twisted in parts, while becoming productive programmers on the project early on.
Another useful part of the architecture is to compensate for beginners’ mistakes. The most popular such mistake is trying to do too much in a single callback, halting the reactor loop. Two solutions, often used in tandem, are useful for this problem. One is to limit the damage by trying to separate loosely-coupled tasks into separate processes, and using infrastructure such as AMP to communicate between those. In case one reactor is paused for too long, the others will still function correctly. The other solution is watchdogging: have a relatively-frequent “LoopingCall” start early in the infrastructure do some measurable action (sending a UDP packet, touching a file) and have a separate watchdog check for this action. If too long passes without the action being performed, the reactor is not functioning up to par — and the process should be rebooted. This, of course, necessitates correct recovery from rebooting: a rudimentary persistence framework, to say the least.
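The watchdog half of that scheme needs no Twisted at all. Below is a minimal sketch using a heartbeat file; the names touch and heartbeat_is_stale, the file-based approach and the thresholds are illustrative assumptions, and the reactor side (not shown) would simply schedule touch with something like task.LoopingCall(touch, path).start(interval).

```python
import os
import time


def touch(path):
    """The measurable action: what the LoopingCall in the reactor would run."""
    with open(path, "a"):
        os.utime(path, None)


def heartbeat_is_stale(path, max_age_seconds, now=None):
    """Return True if the heartbeat file is missing or was touched too long ago."""
    if now is None:
        now = time.time()
    try:
        last_beat = os.path.getmtime(path)
    except OSError:
        # No heartbeat file at all is treated like a stalled reactor.
        return True
    return (now - last_beat) > max_age_seconds
```

A separate watchdog process would call heartbeat_is_stale periodically and restart the monitored process when it returns True. Picking max_age_seconds as a few multiples of the LoopingCall interval avoids restarting on a single slow callback.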
In places where long-running algorithms are required, or are merely easier to write, “deferToThread” is a useful abstraction. It allows the Twisted-beginner to shed all thoughts about events, and to write regular Python code. Those deferred-returning APIs will come in handy, as they can be used from such threads through blockingCallFromThread (in twisted.internet.threads). It is useful to make sure blocking API meant to run in a different thread lives in separate modules — it makes code reviews easier when knowing in advance whether “time.sleep” or “reactor.callLater” is a bug.
Deferred-oriented infrastructure, a communication platform and a watchdog are three things most large Twisted projects will sport at one point or another. The sooner they are done, and the better, the better overall quality of the project will be — especially in a project which is introducing a large portion of its programmers to Twisted.
Summary
In most big Twisted-oriented projects, a large portion of the programmers, even if Python experts, will not know Twisted before starting on the project. It is useful to plan for this in advance, allowing an easy Twisted initiation for the beginner. This can be done in the three prongs of focusing on unavoidable Twisted issues first (deferreds and non-blocking rather than low-level APIs), compensating in the architecture for common mistakes and allowing a transfer of non-Twisted Python knowledge by allowing threaded abstractions.

Is python’s matplotlib and pylab just a twisty little maze of global variables and mass imports, making it impossible to learn your way around by introspection ? Or is it me ?
I am getting lost in the difference between Axes, figure()’s and plot()’s…
I’m sure this would all make more sense to me if I could recall my vague Matlab knowledge from university.
[DEBUG]: NoteManager created with note path "/home/glyph/.tomboy".
[INFO]: Initializing Mono.Addins
[DEBUG]: AddinManager.OnAddinLoaded: Tomboy.Tomboy
[DEBUG]: Name: Tomboy.Tomboy,0.10
[DEBUG]: Description:
[DEBUG]: Namespace: Tomboy
[DEBUG]: Enabled: True
[DEBUG]: File: /usr/lib/tomboy/Tomboy.exe
[DEBUG]: Updating note XML to newest format...
It goes on for several hundred more lines just at startup, and continues to produce messages as the program runs. These messages are diligently classified into categories: DEBUG and INFO. I'm sure they're useful to someone. But why am I seeing them? I just wanted to start a program to put some sticky notes on my desktop, and none of this information is useful to that task.
log("*****")
log("THIS SHOULD NEVER HAPPEN! HELP!!!")
log("*****")
Of course, this frantic wording doesn't help the output go anywhere but silently into a log file where it will be ignored. But, perhaps if this is some server software, an administrator will notice this message and set up an alert that makes their blackberry buzz when they notice those particular words show up in the log file so they can ssh in and look for problems.
log("Serious Error: phase inducers have been depolarized. Contact engineering immediately.")
Of course this breaks the administrators' alerts, so after much discussion between programmers and admins, log levels are added so that admins only get alerts when something "really bad" happens, where "really bad" is an agreed upon flag:
log2(SERIOUS_ERROR, "phase inducers have been depolarized. Contact engineering immediately.")
Okay. Now we've got a log level so admins can tell when their pagers should go off. Except, different developers have different ideas about what "serious" means.
log2(SERIOUS_ERROR, "OMG I lost my cat Mittens. Where is my cat?")
Clearly this is an abuse of the new "severity" flag that was added, but the cat-engineering team thinks that loss of a cat is pretty serious, so we add a new thing, a log "system".
log3(SYSTEM_CATS, SERIOUS_ERROR, "OMG I lost my cat Mittens. Where is my cat?")
Most logging systems stop in this general vicinity, but we still haven't solved the problem, which is that the log message has no structure and you can't tell what's going on without groveling around in a bunch of text files with regular expressions or manually reading each message. Which cat was lost? Which phase inducer was depolarized? How do we get from a log message or alert to this information? The 'log levels' solution to this problem is clearly untenable:
logRidiculous(SYSTEM_CATS, ALERT_IF_YOU_LIKE_CATS, O_RLY, YA_RLY, SERIOUS_BUSINESS, BUT_NOT_TOO_SERIOUS, CAT_LEVEL("Mittens"), "OMG I lost my cat Mittens. Where is my cat?")
More importantly, if you're writing a library, you have a bunch of other problems. This diagnostic information needs to be logged somewhere, but what if this library is being used on a user's desktop machine? Some of these messages are relevant to them as well. How do you tell the user who is using a GUI that a cat has been lost? How do you show them the picture of Mittens so they will recognize her if they see her?
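To make the "no structure" complaint concrete, here is one hedged illustration of what a structured event could look like, as opposed to piling up more level flags. It is a sketch using only the standard library; make_event, emit and the field names are hypothetical and not taken from any particular logging system.

```python
import json
import time


def make_event(system, severity, kind, **fields):
    """Build a log event as a plain dict instead of a formatted string."""
    event = {
        "system": system,
        "severity": severity,
        "kind": kind,
        "time": time.time(),
    }
    # Anything specific to this kind of event travels as named fields.
    event.update(fields)
    return event


def emit(event, stream):
    """Write one event per line as JSON so consumers can filter on fields."""
    stream.write(json.dumps(event, sort_keys=True) + "\n")


if __name__ == "__main__":
    import sys
    emit(make_event("cats", "serious", "cat-lost", cat_name="Mittens"),
         sys.stdout)
```

With events like this, "which cat was lost?" becomes a field lookup rather than a regular expression, and a GUI and a pager alert can each decide what to do with the same event.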

def foo(bar):
    baz = bar + 2
    return 12
and generate a warning like:
example.py:2: local variable 'baz' is assigned to but never used
I use pyflakes hooked up to flymake, so it's always running all the time on every Python file I'm working on. Relying on it has become as second-nature as relying on syntax highlighting. There's a whole class of mistakes I don't make any more, simply because it's on.
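The kind of check behind that warning can be approximated with the standard ast module. To be clear, this is not how pyflakes is implemented; unused_locals is a hypothetical toy that ignores nested functions, global declarations and many other cases a real checker handles.

```python
import ast


def unused_locals(source, filename="example.py"):
    """Report names assigned inside a function but never read in it."""
    warnings = []
    tree = ast.parse(source, filename)
    for function in ast.walk(tree):
        if not isinstance(function, ast.FunctionDef):
            continue
        assigned = {}   # name -> line number of its first assignment
        used = set()
        for node in ast.walk(function):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Store):
                    assigned.setdefault(node.id, node.lineno)
                elif isinstance(node.ctx, ast.Load):
                    used.add(node.id)
        for name, line in sorted(assigned.items(), key=lambda item: item[1]):
            if name not in used:
                warnings.append(
                    "%s:%d: local variable %r is assigned to but never used"
                    % (filename, line, name))
    return warnings


SOURCE = """\
def foo(bar):
    baz = bar + 2
    return 12
"""

if __name__ == "__main__":
    for warning in unused_locals(SOURCE):
        print(warning)
```

Because it only inspects the syntax tree and never imports or runs the module, a check like this is cheap enough to run on every keystroke.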
For every positive experience with git there’s still more than enough negative ones to balance out.
Today, I got into the situation where I was updating my gstreamer modules and one of them was apparently still in conflict. Unhelpfully, git just says:
You are in the middle of a conflicted merge.
without telling you what to do about it.
Googling revealed lots of people in the same situation, and git reset --hard would work. It looks like that would throw away my changes though. Of course that’s one way out of the conflict.
I want to know what the other way is - the one where you get a chance to merge conflicts.
Apparently the conflict was in a generated config file, so naively I deleted it because I wanted to check out that file again from the repository (probably an svn-ism that stuck with me). I know that in this case I don’t actually need to be able to merge my changes, but I would like to know what would have been the correct way to get out of this situation.
Here’s what I tried before I gave up and did a reset:
[gst-git] [thomas@level gst-plugins-ugly]$ git pull --rebase
You are in the middle of a conflicted merge.
[gst-git] [thomas@level gst-plugins-ugly]$ git diff
diff --cc win32/common/config.h
index 18dbcc4,abf941a..0000000
deleted file mode 100644,100644
--- a/win32/common/config.h
+++ /dev/null
[gst-git] [thomas@level gst-plugins-ugly]$ git checkout win32/common/config.h
error: path 'win32/common/config.h' is unmerged
[gst-git] [thomas@level gst-plugins-ugly]$ git reset -- win32/common/config.h
win32/common/config.h: locally modified
Today I finally had a reason to be happy about GStreamer having switched to GIT.
I added actual encoding support to my CD ripper this weekend. I’m only supporting lossless encoding for now. Sadly, in practice it seems the FFMpeg ALAC encoder crashes, and the wavpack one hangs for me, so I’m left with .wav (meaning no compression) or .flac
And then today I realized that some of my encoded .flac files had a strange bug in them. I wasn’t sure where it was coming from, but some of my encoded files didn’t play with mplayer, gave an error when using flac -d, but still worked completely fine with GStreamer and totem.
I first tried to find a different file from the GStreamer media testsuite to reproduce the symptom. south.mp3 and benow.mp3 worked fine, but sugar.ogg reproduced the problem.
I also tried it with my installed version (0.10.8, while git master is at 0.10.15.1) and that worked fine.
So that gave me two points to bisect between.
Then I read up on git bisect, and started playing with it. It isn’t particularly nice to do by hand; most checkouts change enough of the autotools files that the build has to rerun them most of the time. Then configure changes win32/common/config.h, which generates a local change. The common submodule also gets in the way. I got away with just compiling the flac plugin with make -C ext/flac before each test, so that sped things up. But there’s definitely potential for human error.
Basically, you start by doing git bisect start; git bisect bad; git checkout (known good commit); git bisect good
This will leave you somewhere in the middle between the bad and the good commit. Rebuild, do your test, then either type git bisect bad or git bisect good based on the test result. Repeat until you’re at the last commit.
That helped me find where the bug happened (see the bug report).
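The halving described above can be modelled without git at all. This is only a toy sketch: commits are plain numbers, the function name bisect_steps is made up, and the comparison against first_bad stands in for the rebuild-and-test round that a real bisection needs at each step.

```shell
#!/bin/bash
# Toy model of bisection: commits are numbered 0..n-1, commit 0 is known
# good, commit n-1 is known bad, and first_bad is where the bug appeared.
# Prints "<first bad commit> <number of test rounds>".
bisect_steps() {
  local n=$1 first_bad=$2
  local good=0 bad=$((n - 1)) steps=0 mid
  while (( bad - good > 1 )); do
    mid=$(( (good + bad) / 2 ))
    steps=$((steps + 1))
    # Stand-in for "rebuild and run the test" on commit $mid.
    if (( mid >= first_bad )); then
      bad=$mid     # corresponds to: git bisect bad
    else
      good=$mid    # corresponds to: git bisect good
    fi
  done
  echo "$bad $steps"
}

bisect_steps 1000 700   # prints: 700 10
```

In this toy run a range of a thousand commits needs ten test rounds, which is why doing it by hand is tedious but feasible.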
Of course I wondered immediately if I could automate this, since I wanted to make sure that the same commit broke my original files and I do not want to do that manual bisection again…
It turns out you can; that’s what git bisect run is for.
So, given the following two shell scripts:
test.sh
#!/bin/bash
git submodule update
make -C ext/flac
# rebuilds can touch these files leaving you with local changes
git checkout win32/common/config.h
gst-launch -v filesrc location=/home/thomas/gst/media/medium/sugar.ogg ! oggdemux ! vorbisdec ! audioconvert ! audio/x-raw-int,width=16,depth=16 ! flacenc ! filesink location=test.flac
flac -d test.flac -f
# flac -d exits with 1 when it fails
exit $?
and bisect.sh:
#!/bin/bash
git bisect start
git bisect bad
# 0.10.8 release
git checkout c186d67f40827be349f97d810a45243c874b73f7
git bisect good
git bisect run ./test.sh
I can now just run ./bisect.sh and it will do the whole process automatically.
If you’re curious, try it on a checkout of gst-plugins-good. Change test.sh to point to your sugar.ogg file (which you can get from GStreamer’s test repository).
At the end, the output should show something like:
df707c666433a78d3878af6f055698d5756226c4 is first bad commit
commit df707c666433a78d3878af6f055698d5756226c4
Author: (HIDDEN TO PROTECT THE GUILTY)
The ‘run’ command really is what makes the bisection useful for me. Now, back to bug fixing….
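git bisect run decides what to do from the exit status of the script: 0 marks the commit good, 125 skips it as untestable, any other status from 1 to 127 marks it bad, and anything above 127 aborts the bisection. The following is a sketch of how a test script could use that to skip commits that do not build instead of judging them. bisect_step is a hypothetical helper, not something git provides, and the true/false commands are stand-ins for the make and flac -d calls in test.sh.

```shell
#!/bin/bash
# bisect_step BUILD_CMD CHECK_CMD
# Hypothetical helper for a `git bisect run` script. Returns:
#   125 if the build fails (git bisect run skips the commit),
#   0   if the check passes (commit is good),
#   1   if the check fails (commit is bad).
bisect_step() {
  local build_cmd=$1 check_cmd=$2
  if ! bash -c "$build_cmd" >/dev/null 2>&1; then
    return 125
  fi
  if bash -c "$check_cmd" >/dev/null 2>&1; then
    return 0
  fi
  return 1
}

# Stand-in commands so the sketch runs anywhere.
bisect_step "true" "false"
echo "status for a commit that builds but fails the check: $?"
```

In a real test script the last step would be to exit with the status bisect_step returned. Once the bisection is done, git bisect reset returns the checkout to where it started.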
PubSubHubbub is a protocol and reference implementation for doing publish-subscribe using web hooks, polling in feeds triggered by a ping from the publisher, and POSTing Atom entries to notify subscribers. The notification part is similar to what I've been working on for the publish-subscribe stuff at Mediamatic Lab, where we spiced up Idavoll with an HTTP interface to bridge the gap between XMPP Publish-Subscribe and HTTP speaking entities.
Although I spend a lot of time working on XMPP based publish-subscribe, I understand the reasons for going for a full HTTP-based approach. XMPP can be intimidating for developers of web applications. While the differences between XMPP and HTTP are important (stateful connections, asynchronous processing, etc.), the fact that it is different is often reason enough. Hosting facilities don't always offer ways to do XMPP, and there is not nearly enough running code out there to make it easier for people to play with these technologies to spice up their web application with non-IM XMPP functionality. Having platforms like Google App Engine provide sending and handling raw XMPP stanzas as part of the API would surely help.
That said, PubSubHubbub has two separate sides to it, the publishing part and the notification part. There's nothing that prevents a hub from doing the publishing part using regular XMPP publish-subscribe. Instead of fetching the Atom feed over HTTP every time, it could use autodiscovery to find out the publish-subscribe node and upgrade by subscribing to it instead. Similarly, the notification part could send out XMPP notifications. Combined with an existing HTTP aggregator, that combination is very similar to how the aggregator for Mimír works.
I'm still not convinced that PubSubHubbub is the answer to the efficient exchange of updates on social objects, but I do think it is a good way to make smaller entities be part of a federation of social networking sites. Likely, we'll see a hybrid approach, to begin with.