I’ve been feeling overwhelmingly pleased with myself. I wrote code that extended the behaviour of some existing code, which is like…object-oriented developer 101. And it’s the first time I’ve ever done it in a real project! And it worked! It’s reassured me that I mostly know what I’m doing.
I’ve also talked folks through some of my work with the AWS CDK, and was reminded that I need to hurry up and finish the big AWS rewrite I started a little while ago. It’s almost finished, I think, but it’s not actually been deployed yet. And that’s inevitably where the problems will start.
Moving is inching ahead. We’ve got a couple of weeks left, and we’re at the awkward in-between stage where we have packed enough to feel confident, but the pace is wrong and our burndown charts are projecting past the end of our timebox…
Kanban is a superb way to organise your moving experience, by the way.
I was offered a position as a senior engineer. It’s hugely exciting for me because it means I’m definitely ready for that level of work. On the other hand, I really like my team and the work – and this week has cemented that feeling for me. So maybe I’ll leave it for a little while and come back to it in a year or two.
This week I’ve been fixing a problem of our own making: a batch API that we’ve been dreadfully abusing.
This batch API allows for a certain number of requests every twenty-four hours – about 10,000 or so. However, each request can itself contain up to 10,000 items, so if you plan it carefully you can get 100,000,000 items through in the space of a day. Our entire database is only about a quarter of a million items, so we could probably squeeze in a full synchronisation 400 times every day.
That's if everything were perfectly distributed, which of course it's not – some bits are 100,000 items and growing by a thousand a day, and others are 100,000 and likely to stay that way for weeks on end. But that's by the by.
400 synchronisations over the course of 24 hours is more or less one every three or four minutes. However: the folks who started this project felt that was too darn slow, and instead opted for instantaneous synchronisation.
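The back-of-the-envelope arithmetic looks like this, using the approximate figures above:

```python
# Rough capacity sums for the batch API described above.
# All figures are approximations from the post, not exact quotas.
requests_per_day = 10_000       # batch API allowance per 24 hours
items_per_request = 10_000      # maximum items in a single request
db_size = 250_000               # "about a quarter of a million items"

items_per_day = requests_per_day * items_per_request      # theoretical ceiling
requests_per_full_sync = db_size // items_per_request     # requests to sync everything
full_syncs_per_day = requests_per_day // requests_per_full_sync
minutes_between_syncs = 24 * 60 / full_syncs_per_day

print(items_per_day)            # 100,000,000 items a day at the ceiling
print(full_syncs_per_day)       # ~400 full synchronisations a day
print(minutes_between_syncs)    # one roughly every 3.6 minutes
```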
This was a good idea when we only had a few things in the database, but before long we grew past 10,000 changes per day. And then we ran into problems. Because now the queue of things we need to synchronise is longer than the number of requests we get per day, and the tasks we don't get to just get put back on the queue. Which means the queue naturally grows over time, and the gap between an item being put on the queue and actually being processed gets silly long, real quick.
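You can see the problem with a toy simulation. The 12,000-changes-a-day figure here is invented for illustration – any arrival rate above the daily budget produces the same shape:

```python
# A minimal sketch (not our production code) of why the queue grows without
# bound once daily changes exceed the daily request budget.
daily_budget = 10_000       # requests we can process per day
daily_changes = 12_000      # hypothetical: new items queued per day

backlog = 0
for day in range(1, 8):
    backlog += daily_changes              # new work arrives
    backlog -= min(backlog, daily_budget) # process what the budget allows
    print(f"day {day}: backlog = {backlog}")

# The backlog grows by 2,000 a day, forever – so the wait between an item
# joining the queue and being processed grows without limit too.
```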
They did the best with the information they had, but now I have a queue of unopened mail, some of it pertaining to circumstances that have already changed, and I need to know how to file it appropriately. Because, to stretch a metaphor: some of it may be a wedding invitation – even if I’ve missed it, it’s good to know it happened. Some of it may be last week’s news: important at the time, but today good only for wrapping fish. And some of it may be the news of a death in the family: some very important change that will need to be communicated onwards.
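Stretched back out of metaphor, that triage might sketch out like this. The category names, the `Change` shape, and its fields are all invented for illustration – they're not our actual data model:

```python
# A hedged sketch of the mail-triage metaphor above. Everything here
# (Change, the flags, the Action names) is hypothetical.
from dataclasses import dataclass
from enum import Enum, auto

class Action(Enum):
    RECORD = auto()   # the wedding invitation: missed, but worth knowing about
    DISCARD = auto()  # last week's news: superseded, safe to drop
    FORWARD = auto()  # the death in the family: must still be passed onwards

@dataclass
class Change:
    item_id: str
    superseded: bool      # a newer change to the same item is already queued
    still_matters: bool   # downstream systems still need to hear about it

def triage(change: Change) -> Action:
    if change.superseded:
        return Action.DISCARD
    if change.still_matters:
        return Action.FORWARD
    return Action.RECORD

print(triage(Change("a1", superseded=True, still_matters=True)))   # DISCARD
print(triage(Change("a2", superseded=False, still_matters=True)))  # FORWARD
```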
Figuring this out has been such an interesting challenge. I've been applying a more and more test-driven approach, complicated in some areas by some of the peculiar ways Django does things. For example, we make extensive use of Signals, which are notifications fired by parts of the system when something happens – a model being saved, say. Generally helpful, unless you're trying to test specific behaviour, in which case: total disaster. I spent fully half a day working out how to unplug those.
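The real thing uses Django's `Signal.disconnect()` and `Signal.connect()` (on signals like `django.db.models.signals.post_save`), but the unplug-then-replug pattern looks roughly like this dependency-free sketch, where `Signal` is a stand-in for `django.dispatch.Signal` so the example runs without Django installed:

```python
# Sketch of disconnecting a signal receiver for the duration of a test.
# `Signal` here is a toy stand-in for django.dispatch.Signal; the real
# API is signal.disconnect(receiver, ...) / signal.connect(receiver, ...).
from contextlib import contextmanager

class Signal:
    """Minimal stand-in for django.dispatch.Signal."""
    def __init__(self):
        self.receivers = []
    def connect(self, receiver):
        self.receivers.append(receiver)
    def disconnect(self, receiver):
        self.receivers.remove(receiver)
    def send(self, **kwargs):
        for receiver in self.receivers:
            receiver(**kwargs)

post_save = Signal()
calls = []

def sync_to_api(instance, **kwargs):
    calls.append(instance)  # the side effect we don't want firing in tests

post_save.connect(sync_to_api)

@contextmanager
def muted(signal, receiver):
    # Unplug the receiver for the duration of a test, then plug it back in.
    signal.disconnect(receiver)
    try:
        yield
    finally:
        signal.connect(receiver)

with muted(post_save, sync_to_api):
    post_save.send(instance="saved-in-test")   # receiver stays quiet
post_save.send(instance="saved-normally")      # receiver fires again

print(calls)  # ['saved-normally']
```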
In the end though, the change happened, and it was effective, and my colleagues now have access to timely data. And I really, really enjoyed it, even if now I feel like I need to take my brain out and soak it for a week. No rest for the wicked, especially not this week.
